gcustudents.co.uk
robots.txt

Robots Exclusion Standard data for gcustudents.co.uk

Resource Scan

Scan Details

Site Domain	gcustudents.co.uk
Base Domain	gcustudents.co.uk
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Server returned a client error.
Last Scan	2024-07-02T11:29:40+00:00
Next Scan	2024-09-30T11:29:40+00:00

Last Successful Scan

Scanned	2023-02-16T09:34:07+00:00
URL	https://www.gcustudents.co.uk/robots.txt
Domain IPs	2600:9000:203f:2600:4:f546:7880:93a1, 2600:9000:203f:5000:4:f546:7880:93a1, 2600:9000:203f:7000:4:f546:7880:93a1, 2600:9000:203f:ac00:4:f546:7880:93a1, 2600:9000:203f:be00:4:f546:7880:93a1, 2600:9000:203f:cc00:4:f546:7880:93a1, 2600:9000:203f:f000:4:f546:7880:93a1, 2600:9000:203f:f200:4:f546:7880:93a1, 54.192.150.106, 54.192.150.40, 54.192.150.7, 54.192.150.85
Response IP	54.192.150.106
Found	Yes
Hash	03d5dcf5887db9fcf28baa5a126a2959b4dcc9d0c09a18bc6c6c0a8961289314
SimHash	b2850d8d6370

Groups

*

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	5
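
The scan results above (one `*` group with no rules, plus a crawl-delay of 5) can be checked with Python's standard-library robots.txt parser. This is a minimal sketch: the file text below is an assumed reconstruction from the Groups, Other Records, and Comments sections of this report, not the exact bytes that were scanned, and `ExampleBot` and the page path are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the scanned file; the report does not show
# the original bytes, so comment markers and line order are guesses.
ROBOTS_TXT = """\
# See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
#
# To ban all spiders from the entire site uncomment the next two lines:
# User-agent: *
# Disallow: /
User-agent: *
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a hypothetical crawler name; it falls back to the "*" group.
allowed = parser.can_fetch("ExampleBot", "https://www.gcustudents.co.uk/some/page")
delay = parser.crawl_delay("ExampleBot")

print(allowed, delay)
```

With no Allow or Disallow rules in the `*` group, every path is permitted, and the only constraint a polite crawler would take from this file is the delay between requests.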

Comments

See http://www.robotstxt.org/robotstxt.html for documentation on how to use the robots.txt file
To ban all spiders from the entire site uncomment the next two lines:
User-agent: *
Disallow: /
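
As a rough illustration of what the comment describes, the sketch below parses only the two lines as they would read once uncommented and confirms that every path becomes off-limits. The crawler name `ExampleBot` and the tested URLs are hypothetical; the scanned file itself keeps these lines commented out.

```python
from urllib.robotparser import RobotFileParser

# The two directives from the comment, shown as if they had been uncommented.
ban_all_lines = ["User-agent: *", "Disallow: /"]

ban_parser = RobotFileParser()
ban_parser.parse(ban_all_lines)

# "ExampleBot" is a hypothetical crawler name matched by the "*" group.
root_allowed = ban_parser.can_fetch("ExampleBot", "https://www.gcustudents.co.uk/")
page_allowed = ban_parser.can_fetch("ExampleBot", "https://www.gcustudents.co.uk/some/page")

print(root_allowed, page_allowed)
```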
