american.edu
robots.txt

Robots Exclusion Standard data for american.edu

Resource Scan

Scan Details

Site Domain american.edu
Base Domain american.edu
Scan Status Ok
Last Scan 2024-11-07T03:07:09+00:00
Next Scan 2024-12-07T03:07:09+00:00

Last Scan

Scanned 2024-11-07T03:07:09+00:00
URL https://american.edu/robots.txt
Domain IPs 13.107.253.40
Response IP 13.107.246.40
Found Yes
Hash a8566ce905bee6040fea73af68659c28ac71e644ee536c94d8fb62db9c3469c6
SimHash 61011b454f96
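
The Hash value above is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. The report does not state which algorithm or which bytes were hashed, so the following is only a sketch under the assumption that the digest is SHA-256 over the raw response body. The function name `sha256_hex` and the sample bytes are hypothetical.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex.

    Assumption: the scanner hashes the raw robots.txt bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample input, not the real file contents.
sample_body = b"User-agent: *\nDisallow: /profiles/admin/\n"
digest = sha256_hex(sample_body)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

Comparing a freshly computed digest with the stored one is a cheap way to tell whether the file changed between scans, provided the same algorithm and input bytes are used.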

Groups

*

Rule Path
Disallow /customcf/email-a-friend.cfm
Disallow /profiles/admin/
Disallow /oit/network/Wired-7.cfm
Disallow /customcf/au-experts/details.cfm
Disallow /directory/?
Disallow /online/online-program/draft-online-program-page.cfm
Disallow /online/online-program/curriculum_draft.cfm

Other Records

Field Value
crawl-delay 10

Other Records

Field Value
sitemap https://www.american.edu/sitemap.cfm
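
The group and records listed above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds a robots.txt body from the scanned fields; the exact layout of the live file is an assumption, and `ExampleBot` and the `/admissions/` URL are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan results above; the live file's exact
# layout (ordering, blank lines, comments) is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /customcf/email-a-friend.cfm
Disallow: /profiles/admin/
Disallow: /oit/network/Wired-7.cfm
Disallow: /customcf/au-experts/details.cfm
Disallow: /directory/?
Disallow: /online/online-program/draft-online-program-page.cfm
Disallow: /online/online-program/curriculum_draft.cfm
Crawl-delay: 10

Sitemap: https://www.american.edu/sitemap.cfm
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a hypothetical user agent; it falls under the "*" group.
blocked = parser.can_fetch("ExampleBot", "https://www.american.edu/profiles/admin/")
allowed = parser.can_fetch("ExampleBot", "https://www.american.edu/admissions/")
delay = parser.crawl_delay("ExampleBot")
sitemaps = parser.site_maps()  # available in Python 3.8+

print(blocked)   # False: the path is under a Disallow prefix
print(allowed)   # True: no rule matches this path
print(delay)     # 10
print(sitemaps)  # ['https://www.american.edu/sitemap.cfm']
```

The parser matches rule paths as case-sensitive prefixes, so a crawler honoring this file would skip anything under `/profiles/admin/` and wait 10 seconds between requests.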

Comments

  • robots.txt for www.american.edu
  • Use this to block robots in an emergency
  • Disallow: /
  • Disallow: /spa/calendar/
  • Disallow: /alumni/events/
  • Disallow: /cas/calendar/
  • Disallow: /kogod/calendar/
  • Disallow: /sis/calendar/
  • Disallow: /soc/calendar/
  • Disallow: /spa/calendar/
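
The comments above describe an emergency switch: while `Disallow: /` stays commented out it has no effect, and uncommenting it would block every path. A minimal sketch of that difference, using Python's standard `urllib.robotparser`, is shown here. The helper `is_allowed`, the agent name `ExampleBot`, and the two-line file bodies are hypothetical and are not the live file.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Parse a robots.txt body and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical minimal bodies illustrating the commented-out emergency rule.
commented_out = "User-agent: *\n# Disallow: /\n"
emergency_on = "User-agent: *\nDisallow: /\n"

url = "https://www.american.edu/spa/calendar/"
normal_result = is_allowed(commented_out, url)
emergency_result = is_allowed(emergency_on, url)

print(normal_result)     # True: the commented line is ignored
print(emergency_result)  # False: "Disallow: /" blocks every path
```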