cce.caltech.edu
robots.txt
Robots Exclusion Standard data for cce.caltech.edu
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | cce.caltech.edu |
Base Domain | caltech.edu |
Scan Status | Ok |
Last Scan | 2025-02-15T00:33:31+00:00 |
Next Scan | 2025-03-17T00:33:31+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2025-02-15T00:33:31+00:00 |
URL | https://cce.caltech.edu/robots.txt |
Domain IPs | 104.18.40.96, 172.64.147.160, 2606:4700:4400::6812:2860, 2606:4700:4400::ac40:93a0 |
Response IP | 104.18.40.96 |
Found | Yes |
Hash | bd342b50932a3a070ecd63832e29aa705f869029c5e64433d66df42849696f3d |
SimHash | 4814d4724a91 |
Groups
*
Rule | Path |
---|---|
Disallow | /news-and-events/events/minicalendar/* |
Disallow | /map/landmark_ajax/* |
Disallow | /map/milestone/* |
Allow | * |
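
The rules in the `*` group can be checked against a request path with a small matcher. The following is a minimal sketch, not the scanner's or any crawler's actual implementation. It assumes the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and `Allow` wins a tie), and it assumes the bare `*` pattern is meant to match every path. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical.

```python
import re

# Rules copied from the "*" group in the table above.
RULES = [
    ("disallow", "/news-and-events/events/minicalendar/*"),
    ("disallow", "/map/landmark_ajax/*"),
    ("disallow", "/map/milestone/*"),
    ("allow", "*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex.

    "*" matches any run of characters; a trailing "$" anchors the
    pattern at the end of the path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched.

    Assumption: the longest matching pattern decides, and an allow
    rule wins when an allow and a disallow pattern have equal length.
    A path that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, like a prefix rule.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, kind == "allow"
    return allowed


if __name__ == "__main__":
    for candidate in ("/map/milestone/42", "/research", "/map"):
        print(candidate, is_allowed(candidate))
```

Under these assumptions, a path below `/map/milestone/` is blocked because the disallow pattern is longer than `*`, while any path outside the three listed prefixes falls through to the catch-all `Allow`.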
Other Records
Field | Value |
---|---|
crawl-delay | 10 |
Other Records
Field | Value |
---|---|
sitemap | https://cce.caltech.edu/sitemap.xml |
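
To illustrate what the `crawl-delay` value of 10 means in practice, here is a small sketch that computes a request schedule. It assumes the value is read as a minimum number of seconds between the starts of consecutive requests. `crawl-delay` is not part of RFC 9309 and crawlers differ in whether and how they honor it, so this interpretation is an assumption. The function names are hypothetical.

```python
CRAWL_DELAY_SECONDS = 10  # from the crawl-delay record above


def fetch_offsets(num_requests, delay=CRAWL_DELAY_SECONDS):
    """Return the earliest start offset, in seconds, for each request.

    Assumption: the delay is a minimum gap between request starts.
    """
    return [i * delay for i in range(num_requests)]


def minimum_crawl_seconds(num_requests, delay=CRAWL_DELAY_SECONDS):
    """Return the seconds between the first and the last request start."""
    return max(num_requests - 1, 0) * delay


if __name__ == "__main__":
    print(fetch_offsets(4))
    print(minimum_crawl_seconds(361))
```

With this reading, a crawler that respects the delay starts at most 361 requests within any one-hour span.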