you.38degrees.org.uk
robots.txt

Robots Exclusion Standard data for you.38degrees.org.uk

Resource Scan

Scan Details

Site Domain you.38degrees.org.uk
Base Domain 38degrees.org.uk
Scan Status Ok
Last Scan 2024-11-03T13:48:11+00:00
Next Scan 2024-12-03T13:48:11+00:00

Last Scan

Scanned 2024-11-03T13:48:11+00:00
URL https://you.38degrees.org.uk/robots.txt
Domain IPs 104.22.38.97, 104.22.39.97, 172.67.29.53, 2606:4700:10::6816:2661, 2606:4700:10::6816:2761, 2606:4700:10::ac43:1d35
Response IP 104.22.38.97
Found Yes
Hash 9d0a5cd0b831f719c766fef513a1436219781734e5b2405b6fdd72df655c9caf
SimHash 268d3d095560

Groups

yahoo! slurp

Rule Path
Disallow /petitions/*/comments
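
The rule above applies only to the `yahoo! slurp` group and uses a `*` wildcard in the path. As a rough illustration of how such a pattern might be evaluated, the sketch below translates it into a regular expression and checks a few paths. It assumes the common wildcard semantics (`*` matches any run of characters, a trailing `$` anchors the end, and the pattern otherwise matches as a prefix). The helper names `rule_to_regex` and `is_disallowed` and the example petition slug `save-our-park` are hypothetical and not taken from the scanned site. Python's standard `urllib.robotparser` is not used here because it does not expand wildcards.

```python
import re

def rule_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a start-anchored regex.

    Assumed semantics: '*' matches any run of characters, and a
    trailing '$' anchors the match at the end of the path.
    """
    end_anchored = path_pattern.endswith("$")
    if end_anchored:
        path_pattern = path_pattern[:-1]
    # Escape the literal pieces, then rejoin them with '.*' for each '*'.
    literal_parts = [re.escape(part) for part in path_pattern.split("*")]
    body = ".*".join(literal_parts)
    return re.compile("^" + body + ("$" if end_anchored else ""))

def is_disallowed(url_path: str, disallow_patterns: list[str]) -> bool:
    """Return True if any non-empty Disallow pattern matches the path."""
    return any(
        rule_to_regex(pattern).match(url_path) is not None
        for pattern in disallow_patterns
        if pattern  # an empty Disallow value blocks nothing
    )

# The single Disallow rule listed for the 'yahoo! slurp' group.
SLURP_DISALLOW = ["/petitions/*/comments"]

# Hypothetical paths, used only for illustration.
print(is_disallowed("/petitions/save-our-park/comments", SLURP_DISALLOW))
print(is_disallowed("/petitions/save-our-park/comments?page=2", SLURP_DISALLOW))
print(is_disallowed("/petitions/save-our-park", SLURP_DISALLOW))
```

Under these assumptions the first two paths are blocked for that crawler (the second because patterns match as prefixes) and the third is not. Crawlers in other groups are unaffected by this rule.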

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-Agent: *
  • Disallow: /
  • Tell Yahoo! Slurp to stop trying to call the AJAX endpoint for the next page of comments
  • Other crawlers seem to be smart enough to not need this.