Robots Exclusion Standard data for strath.ac.uk

Resource Scan

Scan Details

Site Domain strath.ac.uk
Base Domain strath.ac.uk
Scan Status Ok
Last Scan 2025-06-29T02:46:07+00:00
Next Scan 2025-07-29T02:46:07+00:00

Last Scan

Scanned 2025-06-29T02:46:07+00:00
URL https://www.strath.ac.uk/robots.txt
Domain IPs 130.159.20.74
Response IP 130.159.20.74
Found Yes
Hash fa276694f72c4ef64b9cc34773ba683177f94a34b717d2d5452b0ebbb59d60f2
SimHash 05169d5b0775
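
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm, so the sketch below only assumes SHA-256 over the raw response body. The helper names body_digest and matches_report are hypothetical, and the sample body is a stand-in, not the real file.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "fa276694f72c4ef64b9cc34773ba683177f94a34b717d2d5452b0ebbb59d60f2"


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a response body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """Compare a fetched body against the reported hash."""
    return body_digest(body) == REPORTED_HASH


# Stand-in body for illustration only; the real file also contains comments,
# so this sample is not expected to reproduce the reported hash.
sample = b"User-agent: funnelback\nAllow: /\n"
print(len(body_digest(sample)) == len(REPORTED_HASH))
```

If a freshly fetched body no longer matches, that would suggest the file changed since the last scan, provided the SHA-256 assumption holds.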

Groups

funnelback

Rule Path
Allow /
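
The group above can be checked with Python's standard urllib.robotparser. This is a minimal sketch, assuming the active rules reduce to the single funnelback group reported here (the "User-agent: *" block appears only among the comments). The agent name ExampleBot and the /example/ path are hypothetical test values.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the only active group is the one listed in the report.
ROBOTS_LINES = [
    "User-agent: funnelback",
    "Allow: /",
]


def build_parser(lines):
    """Parse robots.txt lines without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(ROBOTS_LINES)

# The funnelback crawler matches its own group and is allowed everywhere.
funnelback_ok = parser.can_fetch("funnelback", "https://www.strath.ac.uk/")

# A hypothetical crawler that matches no group; this parser allows it by default.
other_ok = parser.can_fetch("ExampleBot", "https://www.strath.ac.uk/example/")

print(funnelback_ok, other_ok)
```

With this parser, an agent that matches no group and finds no "*" group is treated as allowed, so under the stated assumption the file restricts no crawler.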

Comments

  • This file is called robots.txt and must be logically stored at the root of
  • the web server directory tree.
  • We specify which directories cannot be indexed by particular robots
  • - assuming the robots acknowledge the Robot Exclusion protocol:
  • http://info.webcrawler.com/mak/projects/robots/exclusion.html
  • Disallow any client from indexing any part
  • User-agent: *
  • Disallow: *
  • Allow: /accessibilityproject/
  • Allow funnelback to crawl