minipreco.pt
robots.txt

Robots Exclusion Standard data for minipreco.pt

Resource Scan

Scan Details

Site Domain minipreco.pt
Base Domain minipreco.pt
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-06-24T03:34:47+00:00
Next Scan 2024-09-22T03:34:47+00:00

Last Successful Scan

Scanned 2023-11-03T06:18:38+00:00
URL https://minipreco.pt/robots.txt
Redirect https://www.minipreco.pt/robots.txt
Redirect Domain www.minipreco.pt
Redirect Base minipreco.pt
Domain IPs 23.220.203.11, 23.220.203.32, 2600:1417:3f::b81b:7b13, 2600:1417:3f::b81b:7b48
Redirect IPs 125.56.219.65, 2600:1413:b000:13::b857:c184, 2600:1413:b000:13::b857:c1a0, 72.247.127.217
Response IP 42.99.140.179
Found Yes
Hash 042c75e7cdd1c0fd569b8f805cdd8240540dd474014a762ace4be74493fbe3b0
SimHash 3c550f14cff1

Groups

*

Rule Path
Disallow /cart
Disallow /checkout
Disallow /my-account
Disallow /*/reviewhtml/
Disallow /ca/*
Disallow /en/*
Disallow /products/
Disallow /productes/
Disallow /prueba-compra-online
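
Several rules in the `*` group use the `*` wildcard (for example `/*/reviewhtml/` and `/ca/*`). The sketch below shows one way such patterns could be evaluated, assuming RFC 9309 semantics: a rule is a prefix match, `*` stands for any run of characters, and a trailing `$` anchors the end of the path. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and the sample paths in the usage note are invented for illustration; they are not taken from the scan.

```python
import re

# Disallow paths copied from the "*" group of the scan.
DISALLOWED = [
    "/cart",
    "/checkout",
    "/my-account",
    "/*/reviewhtml/",
    "/ca/*",
    "/en/*",
    "/products/",
    "/productes/",
    "/prueba-compra-online",
]


def rule_to_regex(rule_path: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape the literal pieces, then join them with ".*" for each "*".
    body = ".*".join(re.escape(part) for part in rule_path.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if any Disallow rule of the "*" group matches the path."""
    return any(rule_to_regex(rule).match(path) for rule in DISALLOWED)
```

Under these assumptions, `is_disallowed("/pt/reviewhtml/123")` and `is_disallowed("/ca/lojas")` are true, while `is_disallowed("/lojas")` is false. Because rules are prefix matches, `/cart` also covers longer paths such as `/cart/items`.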

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

Other Records

Field Value
sitemap /sitemap.xml
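
The groups and the sitemap record above can be written back out as a robots.txt file and queried. The sketch below is a reconstruction, not the original file: the group order, blank lines, and capitalization are assumptions, and the comment lines are omitted. It uses Python's standard `urllib.robotparser`, which matches rules as plain prefixes, so the checks are limited to non-wildcard paths. `ExampleBot` and `/lojas` are made-up names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan data; layout and ordering are assumed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart
Disallow: /checkout
Disallow: /my-account
Disallow: /*/reviewhtml/
Disallow: /ca/*
Disallow: /en/*
Disallow: /products/
Disallow: /productes/
Disallow: /prueba-compra-online

User-agent: cazoodlebot
Disallow: /

User-agent: mj12bot
Disallow: /

User-agent: dotbot/1.0
Disallow: /

User-agent: gigabot
Disallow: /

Sitemap: /sitemap.xml
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network request is made.
parser.parse(ROBOTS_TXT.splitlines())

# A crawler with its own group is blocked from the whole site.
mj12_can_fetch_root = parser.can_fetch("MJ12bot", "/")

# Any other crawler falls back to the "*" group.
other_can_fetch_cart = parser.can_fetch("ExampleBot", "/cart")
other_can_fetch_lojas = parser.can_fetch("ExampleBot", "/lojas")

# Sitemap records are exposed separately from the groups (Python 3.8+).
sitemaps = parser.site_maps()
```

Here `mj12_can_fetch_root` and `other_can_fetch_cart` come out false, and `other_can_fetch_lojas` comes out true. The sitemap value is returned as written in the record, `/sitemap.xml`, without being resolved against the site domain.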

Comments

  • For all robots
  • Block access to specific groups of pages
  • Allow search crawlers to discover the sitemap
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot