tuclothing.sainsburys.co.uk
robots.txt

Robots Exclusion Standard data for tuclothing.sainsburys.co.uk

Resource Scan

Scan Details

Site Domain tuclothing.sainsburys.co.uk
Base Domain sainsburys.co.uk
Scan Status Ok
Last Scan 2024-11-03T06:56:41+00:00
Next Scan 2024-12-03T06:56:41+00:00

Last Scan

Scanned 2024-11-03T06:56:41+00:00
URL https://tuclothing.sainsburys.co.uk/robots.txt
Domain IPs 23.32.29.16, 96.17.180.50
Response IP 96.17.180.181
Found Yes
Hash 6974861e5b614523b2354f15ef3123b7427afe87a781c82ec63d283942557be7
SimHash 3c54d716edf2

Groups

*

Rule Path
Disallow
Disallow /basket
Disallow /checkout
Disallow /list*

cazoodlebot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot/1.0

Rule Path
Disallow /

gigabot

Rule Path
Disallow /
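
The groups listed above can be checked against a path with a short script. The following Python sketch is illustrative only: the GROUPS dictionary is a hand transcription of the rules in this report, and the names select_group and is_allowed are hypothetical helpers, not part of the scanner or of any library. It assumes the caller passes the crawler's product token (for example "MJ12bot/v1.4.8") rather than a full browser-style User-Agent header, that group names are compared case-insensitively on the text before any "/", and that an empty Disallow path blocks nothing. Because every rule here is a Disallow, any matching pattern is enough to block the path.

```python
import re

# Rules transcribed by hand from the scan report above. The dict layout is
# an assumption made for this example, not the scanner's own data model.
GROUPS = {
    "*": ["", "/basket", "/checkout", "/list*"],
    "cazoodlebot": ["/"],
    "mj12bot": ["/"],
    "dotbot/1.0": ["/"],
    "gigabot": ["/"],
}


def _pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the end
    of the path; everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def select_group(user_agent):
    """Return the group name that applies to user_agent, or '*' as fallback."""
    # Assumption: compare only the product token before any "/version" part.
    token = user_agent.split("/")[0].strip().lower()
    for name in GROUPS:
        if name != "*" and name.split("/")[0].lower() == token:
            return name
    return "*"


def is_allowed(user_agent, path):
    """Return True if none of the group's Disallow patterns match path."""
    for pattern in GROUPS[select_group(user_agent)]:
        # An empty Disallow path places no restriction, so skip it.
        if pattern and _pattern_to_regex(pattern).match(path):
            return False
    return True


if __name__ == "__main__":
    for agent, path in [
        ("Googlebot", "/"),
        ("Googlebot", "/basket/items"),
        ("Googlebot", "/listing"),
        ("MJ12bot/v1.4.8", "/"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Parsers differ in how they treat an empty Disallow line and a group name that contains a version suffix such as "dotbot/1.0", so results from a production crawler library may not match this sketch.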

Other Records

Field Value
sitemap https://tuclothing.sainsburys.co.uk/sitemap.xml

Comments

  • For all robots
  • Block access to specific groups of pages
  • Allow search crawlers to discover the sitemap
  • Block CazoodleBot as it does not present correct accept content headers
  • Block MJ12bot as it is just noise
  • Block dotbot as it cannot parse base urls properly
  • Block Gigabot