
Robots Exclusion Standard data for www.cs.rochester.edu

Resource Scan

Scan Details

Site Domain www.cs.rochester.edu
Base Domain rochester.edu
Scan Status Ok
Last Scan 2025-06-29T17:12:23+00:00
Next Scan 2025-07-29T17:12:23+00:00

Last Scan

Scanned 2025-06-29T17:12:23+00:00
URL https://www.cs.rochester.edu/robots.txt
Domain IPs 128.151.167.12
Response IP 128.151.167.12
Found Yes
Hash a387c6ccc4e6d7b2d77efd524e16a523ae3c926ae7ec2e568e851ccd3cebcbe7
SimHash e648b05fc691

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /msdnaa/
Disallow /dept/calendar/
Disallow /dept/phpicalendar/
Disallow /dept/contest/
Disallow /bin/attach
Disallow /bin/changes
Disallow /bin/edit
Disallow /bin/geturl
Disallow /bin/installpasswd
Disallow /bin/mailnotify
Disallow /bin/manage
Disallow /bin/oops
Disallow /bin/passwd
Disallow /bin/preview
Disallow /bin/rdiff
Disallow /bin/rdiffauth
Disallow /bin/register
Disallow /bin/rename
Disallow /bin/save
Disallow /bin/savemulti
Disallow /bin/search
Disallow /bin/setlib.cfg
Disallow /bin/statistics
Disallow /bin/testenv
Disallow /bin/configure
Disallow /bin/upload
Disallow /bin/viewauth
Disallow /bin/viewfile
Disallow /list/
Disallow /wcms/research/cisd
Disallow /research/cisd/projects/trips/lexicon/cgi/browseontology

*

Rule Path
Disallow /~gildea/cgi-bin/
Disallow /pipermail/
Disallow /dept/seminars/view/view/nlprg/2017/5
Disallow /dept/seminars/view/view/nlprg/2016/12
Disallow /dept/seminars/view/view/nlprg/2017/2
Disallow /dept/seminars/view/view/nlprg/2017/1
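
As a rough illustration of how a crawler might evaluate the Disallow rules listed above, the following sketch feeds a small subset of them to Python's standard urllib.robotparser. The robots.txt text here is reconstructed from this report, not copied from the live file; the two `*` groups are merged into one for simplicity, the placement of the Crawl-delay line inside that group is an assumption, and the user agent name "ExampleBot" and the test URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed subset of the rules in this report (assumption: the live
# file may order, group, or comment these lines differently).
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /cgi-bin/
Disallow: /bin/edit
Disallow: /list/
Disallow: /pipermail/
"""

BASE = "https://www.cs.rochester.edu"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a hypothetical user agent; it falls back to the "*" group.
edit_allowed = parser.can_fetch("ExampleBot", BASE + "/bin/edit/Main/WebHome")
seminars_allowed = parser.can_fetch("ExampleBot", BASE + "/dept/seminars/")
delay = parser.crawl_delay("ExampleBot")

print("edit page allowed:", edit_allowed)
print("seminars page allowed:", seminars_allowed)
print("crawl delay (seconds):", delay)
```

Rules are matched as path prefixes, so a rule such as /bin/edit also covers any longer path that begins with it.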

Other Records

Field Value
crawl-delay 10
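
A crawler that wants to honor the crawl-delay value above could space out its requests along these lines. This is a minimal sketch, not a description of how any particular crawler behaves: the PoliteThrottle class and its parameters are hypothetical names, and it assumes the value is interpreted as seconds between successive requests to the same host.

```python
import time

CRAWL_DELAY_SECONDS = 10  # value reported under "Other Records"


class PoliteThrottle:
    """Hypothetical helper: block until `delay` seconds have passed
    since the previous request."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self.clock = clock      # injectable so the logic can be tested
        self.sleep = sleep      # without actually waiting
        self.last = None        # time of the previous request, if any

    def wait(self):
        """Call immediately before each request to the host."""
        if self.last is not None:
            remaining = self.delay - (self.clock() - self.last)
            if remaining > 0:
                self.sleep(remaining)
        self.last = self.clock()
```

Calling `PoliteThrottle(CRAWL_DELAY_SECONDS).wait()` before each fetch would then pause only for whatever part of the 10 seconds has not already elapsed.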

Comments

  • twiki disallows
  • page does not exist
  • bingbot is hammering us. throttle their crawls
  • 1 slow, 5 very slow, 10 extremely slow
  • User-agent: msnbot
  • Crawl-delay: 10
  • User-agent: bingbot
  • Crawl-delay: 10
  • gildea wordnet indexing is confusing search results for BRAID and other site keyword searches
  • mailman archives
  • google search console reports 'coverage error' indexing non-existent urls
  • report claims linked at /dept/seminars/view/nlprg/ but that does not seem to be the case