cce.caltech.edu
robots.txt

Robots Exclusion Standard data for cce.caltech.edu

Resource Scan

Scan Details

Site Domain cce.caltech.edu
Base Domain caltech.edu
Scan Status Ok
Last Scan 2025-02-15T00:33:31+00:00
Next Scan 2025-03-17T00:33:31+00:00

Last Scan

Scanned 2025-02-15T00:33:31+00:00
URL https://cce.caltech.edu/robots.txt
Domain IPs 104.18.40.96, 172.64.147.160, 2606:4700:4400::6812:2860, 2606:4700:4400::ac40:93a0
Response IP 104.18.40.96
Found Yes
Hash bd342b50932a3a070ecd63832e29aa705f869029c5e64433d66df42849696f3d
SimHash 4814d4724a91

Groups

semrushbot

Rule Path
Disallow /

blp_bbot

Rule Path
Disallow /

*

Rule Path
Disallow /news-and-events/events/minicalendar/*
Disallow /map/landmark_ajax/*
Disallow /map/milestone/*
Allow *

Other Records

Field Value
crawl-delay 10

Other Records

Field Value
sitemap https://cce.caltech.edu/sitemap.xml
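
To illustrate how the groups listed above would apply to a given request, the following is a minimal Python sketch, not the scanner's own logic. It assumes the common matching convention (longest matching pattern wins, `*` matches any run of characters, allow wins a tie). The names `GROUPS`, `pattern_to_regex`, and `is_allowed` are hypothetical, and user-agent selection is simplified to an exact, case-insensitive lookup with a fallback to the `*` group. The crawl-delay and sitemap records are not modeled.

```python
import re

# Rules copied from the Groups section of the report, keyed by user-agent token.
GROUPS = {
    "semrushbot": [("disallow", "/")],
    "blp_bbot": [("disallow", "/")],
    "*": [
        ("disallow", "/news-and-events/events/minicalendar/*"),
        ("disallow", "/map/landmark_ajax/*"),
        ("disallow", "/map/milestone/*"),
        ("allow", "*"),
    ],
}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(user_agent, path):
    """Return True if `path` may be fetched by `user_agent` under GROUPS.

    Simplification (assumption): the agent name must equal a group key
    exactly, ignoring case; otherwise the '*' group applies.
    """
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            longer = len(pattern) > best_len
            tie_allow = len(pattern) == best_len and kind == "allow"
            if longer or tie_allow:
                best_len, allowed = len(pattern), kind == "allow"
    return allowed


if __name__ == "__main__":
    for agent, path in [
        ("examplebot", "/research"),
        ("examplebot", "/map/milestone/42"),
        ("SemrushBot", "/research"),
    ]:
        print(agent, path, is_allowed(agent, path))
```

Under these assumptions, a generic crawler may fetch most paths but not anything beneath the three disallowed prefixes, while semrushbot and blp_bbot are excluded from the whole site. Python's standard `urllib.robotparser` is not used here because it matches rule paths as literal prefixes and does not expand `*` wildcards.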