Robots Exclusion Standard data for usm.edu

Resource Scan

Scan Details

Site Domain usm.edu
Base Domain usm.edu
Scan Status Ok
Last Scan 2024-11-17T22:22:51+00:00
Next Scan 2024-12-17T22:22:51+00:00

Last Scan

Scanned 2024-11-17T22:22:51+00:00
URL https://usm.edu/robots.txt
Redirect https://www.usm.edu/robots.txt
Redirect Domain www.usm.edu
Redirect Base usm.edu
Domain IPs 131.95.7.15
Redirect IPs 209.41.65.188
Response IP 209.41.65.188
Found Yes
Hash f79e7c4138c2522e701ddc485eff825cc97097cc2615ffa6f08164bcbdaf8ca3
SimHash 9871440a4653

Groups

*

Rule Path
Disallow /_resources/
Disallow /_training/
Disallow /_training_part_deux/
Disallow /_oublogs_training/
Disallow /training/
Disallow /_training-2020/
Disallow /ou-alerts/
Disallow /_showcase/
Disallow /_usm-testing/
Disallow /_email/
Disallow /_archive/
Disallow /widgets/
Disallow /custom_gadgets/
Disallow /graduate-admissions/_archive/
Disallow /graduate-programs/_archive/
Disallow /undergraduate-admissions/_archived/
Disallow /undergraduate-admissions/archived/
Disallow /undergraduate-programs/_archive/
Disallow /*/_archive/
Disallow /*/_archived/
Disallow /mou/
Disallow /*.html
Disallow /*.inc
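
To illustrate how the wildcard rules in this group might be applied to a request path, here is a minimal Python sketch. It assumes the common convention (also described in RFC 9309) that `*` matches any run of characters, a trailing `$` anchors the end of the path, and every pattern is matched from the start of the path. The names `DISALLOW`, `pattern_to_regex`, and `is_disallowed` are hypothetical, and only a subset of the rules listed above is included. Python's standard `urllib.robotparser` is not used for this part because it does not expand `*` wildcards.

```python
import re

# A subset of the Disallow paths from the "*" group above (illustrative only).
DISALLOW = [
    "/_resources/",
    "/*/_archive/",
    "/*/_archived/",
    "/mou/",
    "/*.html",
    "/*.inc",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing '$'
    anchors the pattern to the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns=DISALLOW) -> bool:
    """Return True if any Disallow pattern matches the start of the path.

    The group above has no Allow rules, so a single matching Disallow
    is enough; no longest-match tie-breaking is needed in this sketch.
    """
    return any(pattern_to_regex(p).match(path) for p in patterns)


if __name__ == "__main__":
    for path in ["/admissions/index.html", "/admissions/", "/library/_archive/old.php"]:
        print(path, is_disallowed(path))
```

Under these assumptions, the `/*.html` rule blocks any path containing `.html`, while extensionless directory-style paths such as `/admissions/` remain crawlable unless another rule matches them.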

Other Records

Field Value
sitemap https://www.usm.edu/sitemap.xml
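
As a rough sketch of how the sitemap record above could be read programmatically, the following uses Python's standard `urllib.robotparser` (the `site_maps()` method requires Python 3.8 or later). The `ROBOTS_LINES` content is a hand-written reconstruction from this report, not the actual file served by www.usm.edu, and `list_sitemaps` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of a few lines of the scanned file (not the real body).
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /mou/",
    "Sitemap: https://www.usm.edu/sitemap.xml",
]


def list_sitemaps(lines):
    """Parse robots.txt lines and return the declared sitemap URLs."""
    parser = RobotFileParser()
    parser.parse(lines)
    # site_maps() returns None when no Sitemap record is present.
    return parser.site_maps() or []


if __name__ == "__main__":
    print(list_sitemaps(ROBOTS_LINES))
```

Sitemap records are independent of the user-agent groups, which is why the report lists this one under "Other Records" and not under the `*` group.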