thecrimson.com
robots.txt

Robots Exclusion Standard data for thecrimson.com

Resource Scan

Scan Details

Site Domain thecrimson.com
Base Domain thecrimson.com
Scan Status Ok
Last Scan 2024-09-24T05:29:27+00:00
Next Scan 2024-10-01T05:29:27+00:00

Last Scan

Scanned 2024-09-24T05:29:27+00:00
URL https://www.thecrimson.com/robots.txt
Domain IPs 54.192.18.116, 54.192.18.4, 54.192.18.60, 54.192.18.75
Response IP 3.165.102.129
Found Yes
Hash 86d88d40bc2c427d29729f0db339f3a974e7f57b3adf7398134ebdda34a27869
SimHash ec2a5335c393

Groups

hul-wax

Rule Path
Disallow

slurp

Rule Path
Disallow

Other Records

Field Value
crawl-delay 5

yahoo! slurp

Rule Path
Disallow

Other Records

Field Value
crawl-delay 5
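
To illustrate how a crawler might read the groups listed above, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is an assumed reconstruction from this scan listing (three groups, each with an empty `Disallow` path, two with `crawl-delay 5`); the actual file at the scanned URL may differ in ordering, comments, or extra records. The names `ROBOTS_TXT`, `load_rules`, `rules`, and `PAGE` are hypothetical and chosen for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the groups shown in the scan above.
# This is NOT a verbatim copy of the live robots.txt file.
ROBOTS_TXT = """\
User-agent: hul-wax
Disallow:

User-agent: slurp
Disallow:
Crawl-delay: 5

User-agent: yahoo! slurp
Disallow:
Crawl-delay: 5
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Any URL on the site works here; the root page is used as an example.
PAGE = "https://www.thecrimson.com/"

if __name__ == "__main__":
    for agent in ("hul-wax", "slurp", "yahoo! slurp"):
        # An empty Disallow path blocks nothing, so can_fetch is True.
        print(agent, rules.can_fetch(agent, PAGE), rules.crawl_delay(agent))
```

Under this reconstruction, an empty `Disallow` path places no restriction on the named agent, so the only practical constraint is the 5-second crawl delay for the two Slurp groups. Note that Python's parser matches user-agent tokens by substring, so a request made as `yahoo! slurp` is resolved by the earlier `slurp` group; other parsers may match differently.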