Robots Exclusion Standard data for studentspace.org.uk

Resource Scan

Scan Details

Site Domain studentspace.org.uk
Base Domain studentspace.org.uk
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-07-03T17:54:53+00:00
Next Scan 2025-08-02T17:54:53+00:00

Last Successful Scan

Scanned 2025-06-04T17:53:47+00:00
URL https://studentspace.org.uk/robots.txt
Domain IPs 185.194.90.26
Response IP 185.194.90.26
Found Yes
Hash ed08673726a9313caa86abb2979a79ebc0faf3193b084df7cf82cd5ef61390f8
SimHash 412019723590

Groups

*

Rule Path
Disallow /cpresources/
Disallow /vendor/
Disallow /.env
Disallow /cache/
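
As a rough illustration of how a crawler would interpret the group above, the following sketch rebuilds the four rules as robots.txt text and checks a few paths with Python's standard urllib.robotparser. The reconstructed file text, the "ExampleBot" user agent, the is_allowed helper, and the sample paths are all assumptions made for illustration; only the four Disallow paths and the "*" group come from the scan.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scanned group; the exact original file layout is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
Disallow: /cache/
"""

BASE_URL = "https://studentspace.org.uk"

# Parse the rules from text instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if the hypothetical agent may fetch the given path."""
    return parser.can_fetch(agent, BASE_URL + path)


if __name__ == "__main__":
    # Sample paths are hypothetical; they only exercise the prefix matching.
    for path in ("/", "/about", "/vendor/autoload.php", "/.env", "/cache/page.html"):
        print(path, "allowed" if is_allowed(path) else "disallowed")
```

Because the rules sit in the "*" group, they apply to any user agent that has no more specific group, and each Disallow value is matched as a path prefix.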

Other Records

Field Value
sitemap https://studentspace.org.uk/sitemaps-1-sitemap.xml
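
The sitemap record can be read back with the same standard-library parser. This is a minimal sketch, assuming Python 3.8 or later (where RobotFileParser.site_maps() is available) and assuming the record appears in the file as a conventional "Sitemap:" line; the listed_sitemaps helper is a hypothetical name.

```python
from urllib.robotparser import RobotFileParser

# Assumed textual form of the sitemap record shown in the scan.
SITEMAP_RECORD = "Sitemap: https://studentspace.org.uk/sitemaps-1-sitemap.xml"


def listed_sitemaps(robots_text):
    """Return the sitemap URLs declared in robots.txt text (empty list if none)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no sitemap record is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in listed_sitemaps(SITEMAP_RECORD):
        print(url)
```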

Comments

  • robots.txt for https://studentspace.org.uk/
  • live - don't allow web crawlers to index cpresources/ or vendor/