gcustudents.co.uk
robots.txt

Robots Exclusion Standard data for gcustudents.co.uk

Resource Scan

Scan Details

Site Domain gcustudents.co.uk
Base Domain gcustudents.co.uk
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-07-02T11:29:40+00:00
Next Scan 2024-09-30T11:29:40+00:00

Last Successful Scan

Scanned 2023-02-16T09:34:07+00:00
URL https://www.gcustudents.co.uk/robots.txt
Domain IPs 2600:9000:203f:2600:4:f546:7880:93a1, 2600:9000:203f:5000:4:f546:7880:93a1, 2600:9000:203f:7000:4:f546:7880:93a1, 2600:9000:203f:ac00:4:f546:7880:93a1, 2600:9000:203f:be00:4:f546:7880:93a1, 2600:9000:203f:cc00:4:f546:7880:93a1, 2600:9000:203f:f000:4:f546:7880:93a1, 2600:9000:203f:f200:4:f546:7880:93a1, 54.192.150.106, 54.192.150.40, 54.192.150.7, 54.192.150.85
Response IP 54.192.150.106
Found Yes
Hash 03d5dcf5887db9fcf28baa5a126a2959b4dcc9d0c09a18bc6c6c0a8961289314
SimHash b2850d8d6370

Groups

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5
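
The records above (a single "*" group with no rules, plus a crawl-delay of 5) can be checked with Python's standard urllib.robotparser module. This is a minimal sketch, not the scanner's own logic: the ROBOTS_TXT text is a hypothetical reconstruction of the scanned file, and the "ExampleBot" user agent and the /some/page path are placeholders.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the scanned file: one "*" group with no
# Allow/Disallow rules and a crawl-delay of 5. The exact layout of the
# original robots.txt is an assumption.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 5
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# "ExampleBot" and the path are placeholders, not values from the scan.
allowed = rules.can_fetch("ExampleBot", "https://www.gcustudents.co.uk/some/page")
delay = rules.crawl_delay("ExampleBot")

print(allowed, delay)
```

With no Disallow rules in the group, every path is allowed, and the delay applies to any crawler that falls back to the "*" group.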

Comments

  • See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
  • To ban all spiders from the entire site uncomment the next two lines:
  • User-agent: *
  • Disallow: /
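
The effect of uncommenting those two lines can be sketched with Python's standard urllib.robotparser module. The helper name allows, the "ExampleBot" user agent, and the test URL are illustrative assumptions; only the two directive lines come from the comments above.

```python
from urllib.robotparser import RobotFileParser


def allows(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# The two lines as they appear while still commented out in the file.
COMMENTED = "# User-agent: *\n# Disallow: /\n"

# The same two lines after uncommenting, which bans all spiders.
UNCOMMENTED = "User-agent: *\nDisallow: /\n"

# Placeholder URL; the path is not taken from the scan.
URL = "https://www.gcustudents.co.uk/some/page"

commented_result = allows(COMMENTED, URL)
uncommented_result = allows(UNCOMMENTED, URL)

print(commented_result, uncommented_result)
```

While the lines stay commented out, the parser ignores them and the URL remains fetchable; once uncommented, the same URL is refused for every user agent.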