hope.edu
robots.txt

Robots Exclusion Standard data for hope.edu

Resource Scan

Scan Details

Site Domain hope.edu
Base Domain hope.edu
Scan Status Ok
Last Scan 2024-08-29T12:03:38+00:00
Next Scan 2024-09-28T12:03:38+00:00

Last Scan

Scanned 2024-08-29T12:03:38+00:00
URL https://hope.edu/robots.txt
Domain IPs 209.140.194.21
Response IP 209.140.194.21
Found Yes
Hash 54c04a43a39259735b562904b2c45b9960bdef3252c445fbb89cc5f5fb61b507
SimHash a88409283782
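
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm, so SHA-256 is an assumption here. Under that assumption, a minimal sketch of fingerprinting a fetched robots.txt body, so that two scans can be compared for changes, might look like this (the function name `fingerprint` is hypothetical):

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256, chosen only because the reported hash
    is 64 hex characters long. The scanner may use something else.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored hash against the hash of a freshly fetched body."""
    return fingerprint(body) != previous_hash
```

A digest like this changes on any byte difference, including whitespace, so it would not necessarily reproduce the value shown above if the scanner normalizes the file before hashing.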

Groups

*

Rule Path
Disallow /sitemap-generator.html
Disallow /_resources/
Disallow /_offices/
Disallow /_academics/
Disallow /_showcase/
Disallow /_training/
Disallow /_dev/
Disallow /_mh-dev/
Disallow /email/
Disallow /catalog/current/majors-minors/index.html
Disallow /catalog/working/
Disallow /*.xml$
Disallow /*.inc$
Disallow /*.php$
Disallow /*.txt$
Disallow /*_props.html$
Disallow /offices/computing-information-technology/wi-fi.html
Disallow /admissions/niche-direct-admissions.html
Allow /sitemap.xml
Allow /data/htdocs/sitemap.xml
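
Several rules in this group overlap: `/sitemap.xml` matches both `Disallow /*.xml$` and `Allow /sitemap.xml`. The sketch below shows one way such a conflict can be resolved, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). It is an illustration, not the logic of any particular crawler; the helper names and the abbreviated `RULES` list are hypothetical.

```python
import re

# A subset of the "*" group rules listed above.
RULES = [
    ("disallow", "/_resources/"),
    ("disallow", "/catalog/working/"),
    ("disallow", "/*.xml$"),
    ("disallow", "/*.php$"),
    ("allow", "/sitemap.xml"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the
    pattern to the end of the path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path may be crawled under the given rules."""
    best_len = -1
    verdict = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Longer pattern wins; on equal length, Allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                verdict = kind == "allow"
    return verdict


print(is_allowed("/sitemap.xml"))
print(is_allowed("/feed.xml"))
```

Under these assumptions, `/sitemap.xml` is allowed because `Allow /sitemap.xml` is a longer pattern than `Disallow /*.xml$`, while any other path ending in `.xml` stays blocked.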

claudebot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /
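
The groups above mean that a crawler named in its own group follows only that group, and every other crawler falls back to `*`. A minimal sketch of that selection step, assuming case-insensitive matching on the crawler's product token as in RFC 9309, is shown below. The `GROUPS` mapping is abbreviated and the function name is hypothetical; it expects a product token such as `GPTBot/1.0`, not a full browser-style User-Agent header.

```python
# Group name -> rules, abbreviated from the tables above.
GROUPS = {
    "*": [("disallow", "/_resources/"), ("allow", "/sitemap.xml")],
    "claudebot": [("disallow", "/")],
    "gptbot": [("disallow", "/")],
    "google-extended": [("disallow", "/")],
    "semrushbot": [("disallow", "/")],
}


def select_group(user_agent):
    """Return the name of the group that applies to a crawler.

    The version suffix after '/' is dropped and the token is
    lowercased before lookup; unknown crawlers fall back to '*'.
    """
    token = user_agent.split("/")[0].strip().lower()
    return token if token in GROUPS else "*"


print(select_group("GPTBot/1.0"))
print(select_group("Googlebot"))
```

With this data, the four named crawlers are blocked from the whole site, and any crawler not listed is governed by the `*` rules only.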

Other Records

Field Value
sitemap https://hope.edu/sitemap.xml