strath.ac.uk
robots.txt

Robots Exclusion Standard data for strath.ac.uk

Scan Details

Site Domain	strath.ac.uk
Base Domain	strath.ac.uk
Scan Status	Ok
Last Scan	2025-06-29T02:46:07+00:00
Next Scan	2025-07-29T02:46:07+00:00

Last Scan

Scanned	2025-06-29T02:46:07+00:00
URL	https://www.strath.ac.uk/robots.txt
Domain IPs	130.159.20.74
Response IP	130.159.20.74
Found	Yes
Hash	fa276694f72c4ef64b9cc34773ba683177f94a34b717d2d5452b0ebbb59d60f2
SimHash	05169d5b0775
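
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan record does not state the algorithm or exactly which bytes were hashed. Assuming it is SHA-256 over the raw response body, a comparison against a freshly fetched copy could be sketched as follows (`sha256_hex` and `matches_recorded_hash` are hypothetical helper names, not part of the scan service):

```python
import hashlib

# Hash value copied from the scan record above.
RECORDED_HASH = "fa276694f72c4ef64b9cc34773ba683177f94a34b717d2d5452b0ebbb59d60f2"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Check whether a robots.txt body hashes to the recorded value.

    Assumption: the record is SHA-256 of the unmodified response body.
    """
    return sha256_hex(body) == recorded.lower()
```

A mismatch would mean either that the file changed since the scan or that the assumption about the hashing scheme is wrong.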

Groups

funnelback

Rule	Path
Allow	/
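
The effect of this group can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch: the two-line file text below is reconstructed from the parsed group shown above and is an assumption about how the live file is written, and `somebot` is a hypothetical user agent used only for comparison.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the only active group reported by the scan.
ROBOTS_LINES = [
    "User-agent: funnelback",
    "Allow: /",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# The funnelback group explicitly allows the whole site.
funnelback_allowed = parser.can_fetch("funnelback", "https://www.strath.ac.uk/")

# No group matches this hypothetical agent, so this parser falls back to "allowed".
other_allowed = parser.can_fetch("somebot", "https://www.strath.ac.uk/")

print(funnelback_allowed, other_allowed)
```

With only this group active, the parser reports the site as fetchable for both agents, because an agent with no matching group is allowed by default.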

Comments

This file is called robots.txt and must be logically stored at the root of
the web server directory tree.
We specify which directories cannot be indexed by particular robots
- assuming the robots acknowledge the Robot Exclusion protocol:
http://info.webcrawler.com/mak/projects/robots/exclusion.html
Disallow any client from indexing any part
User-agent: *
Disallow: *
Allow: /accessibilityproject/
Allow funnelback to crawl