
Robots Exclusion Standard data for du.edu

Resource Scan

Scan Details

Site Domain du.edu
Base Domain du.edu
Scan Status Ok
Last Scan 2024-09-21T12:52:21+00:00
Next Scan 2024-10-21T12:52:21+00:00

Last Scan

Scanned 2024-09-21T12:52:21+00:00
URL https://du.edu/robots.txt
Redirect https://www.du.edu/robots.txt
Redirect Domain www.du.edu
Redirect Base du.edu
Domain IPs 130.253.2.250
Redirect IPs 23.185.0.1, 2620:12a:8000::1, 2620:12a:8001::1
Response IP 23.185.0.1
Found Yes
Hash 684f6ec254899ec062495c8334c923ab507d43e4cd76ecb2be38c58eed1b16a5
SimHash f5808903c487
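
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. As a minimal sketch under that assumption, a content fingerprint of a fetched robots.txt body could be computed as follows. The function name `sha256_hex` and the sample body are hypothetical, and the digest printed here will not match the one in the report.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex.

    Assumption: the scanner hashes the raw response bytes. The report
    does not state the algorithm or the exact input, so treat this as
    illustrative only.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the real du.edu file.
sample_body = b"User-agent: *\nDisallow: /portal\n"
digest = sha256_hex(sample_body)
print(len(digest), digest)
```

Comparing digests from two scans is a cheap way to detect that the file changed at all; a similarity hash such as the SimHash field is better suited to estimating how much it changed.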

Groups

*

Rule Path Comment
Disallow /_old-stuff -
Disallow /apply/mobile -
Disallow /asian -
Disallow /az -
Disallow /banner -
Disallow /campus-safety -
Disallow /car/ leave trailing slash so career gets indexed
Disallow /ccs/ leave trailing slash so ccst gets indexed
Disallow /ce/ leave trailing slash so other ce* directories get indexed
Disallow /city -
Disallow /crs -
Disallow /ctl -
Disallow /ctir -
Disallow /disability -
Disallow /discoveries -
Disallow /emad -
Disallow /foodservice -
Disallow /geography -
Disallow /gsis -
Disallow /gssw -
Disallow /homecoming -
Disallow /laptops -
Disallow /maps -
Disallow /newmancenter -
Disallow /portal -
Disallow /resources -
Disallow /sass -
Disallow /secs/ -
Disallow /studentlife/career -
Disallow /today -
Disallow /tom -
Disallow /ucomm -
Disallow /uts/multimedia -
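
The comments on /car/, /ccs/ and /ce/ rely on the fact that Disallow rules match by path prefix: without the trailing slash, /car would also block /career. The sketch below illustrates this with Python's standard `urllib.robotparser`, using a small subset of the rules above. The `allowed` helper and the test paths /car/index.html and /citywide are hypothetical, chosen only to show the matching behaviour.

```python
from urllib.robotparser import RobotFileParser

# A subset of the "*" group from the table above.
RULES = """\
User-agent: *
Disallow: /car/
Disallow: /ccs/
Disallow: /ce/
Disallow: /city
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(path: str) -> bool:
    """Hypothetical helper: may a generic crawler fetch this path?"""
    return parser.can_fetch("*", "https://www.du.edu" + path)


print(allowed("/career"))          # True: "/career" does not start with "/car/"
print(allowed("/car/index.html"))  # False: the path is inside /car/
print(allowed("/ccst"))            # True: "/ccst" does not start with "/ccs/"
print(allowed("/citywide"))        # False: "/city" has no trailing slash, so it matches any path starting with it
```

Rules without a trailing slash, such as /city or /az, therefore block every path that begins with those characters, not only the directory of that name.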

Other Records

Field Value
sitemap https://www.du.edu/sitemapindex.xml

Comments

  • robots.txt for https://www.du.edu
  • mobile
  • User-agent: Googlebot-Mobile
  • Allow: /apply/mobile
  • Disallow: /
  • User-agent: gsa-crawler-du
  • Allow: /apply/mobile
  • all other user-agents
  • Disallow: /admission
  • Disallow: /athletics
  • sitemap location