lib.uchicago.edu
robots.txt

Robots Exclusion Standard data for lib.uchicago.edu

Resource Scan

Scan Details

Site Domain lib.uchicago.edu
Base Domain uchicago.edu
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Request timed out.
Last Scan 2025-11-08T14:28:11+00:00
Next Scan 2025-11-15T14:28:11+00:00

Last Successful Scan

Scanned 2025-11-05T00:37:14+00:00
URL https://lib.uchicago.edu/robots.txt
Redirect https://www.lib.uchicago.edu/robots.txt
Redirect Domain www.lib.uchicago.edu
Redirect Base uchicago.edu
Domain IPs 128.135.181.141
Redirect IPs 128.135.181.65
Response IP 128.135.181.65
Found Yes
Hash 286599b9d016d5c598d5306cbb5a0b5794d26ebc3cc3a0d87f669a0ccf50690f
SimHash 7a9fcb419ed7

Groups

archive.org_bot

Rule Path
Allow /

*

Rule Path
Disallow /search/
Disallow /TestInfo/
Disallow /Test/
Disallow /StaffInfo/
Disallow /staffweb/
Disallow /dldc/
Disallow /~chas/
Allow /cgi-bin/nand/search/stc
Allow /cgi-bin/nand/search/rosenthal
Allow /cgi-bin/nand/search/databasefinder
Disallow /cgi-bin/nand/search/databasefinder?
Disallow /cgi-bin/
Disallow /e/keith/
Disallow /archives/
Disallow /e/chas/
Disallow /bus/
Disallow /e/busecon/macroecon/
Disallow /phplib/
Disallow /automail/
Disallow /ui/

ai2bot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

cohere-training-data-crawler

Rule Path
Disallow /

cotoyogi

Rule Path
Disallow /

datenbank crawler

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

factset_spyderbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

icc-crawler

Rule Path
Disallow /

kangaroo bot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

netestate imprint crawler

Rule Path
Disallow /

omgili

Rule Path
Disallow /

pangubot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

semrushbot-ocob

Rule Path
Disallow /

timpibot

Rule Path
Disallow /

webzio-extended

Rule Path
Disallow /
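
The groups above can be checked programmatically. The following is a minimal sketch, assuming an abbreviated reconstruction of the file (only a few of the listed groups and paths are included) and using Python's standard `urllib.robotparser`. The agent name `ExampleBot` and the helper `allowed` are hypothetical and exist only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of the groups listed above (assumption: the
# real file contains more Disallow lines and more blocked agents).
ROBOTS_TXT = """\
User-agent: archive.org_bot
Allow: /

User-agent: *
Disallow: /search/
Disallow: /staffweb/
Disallow: /archives/

User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /
"""

BASE = "https://www.lib.uchicago.edu"

# Parse the text directly instead of fetching it over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


print(allowed("archive.org_bot", "/archives/"))  # True: its own group allows everything
print(allowed("GPTBot", "/"))                    # False: blocked by its own group
print(allowed("ExampleBot", "/archives/"))       # False: falls back to the * group
print(allowed("ExampleBot", "/about/"))          # True: no rule in the * group matches
```

An agent with its own group is judged only by that group; every other agent falls back to the `*` group.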

Comments

  • Allow commands must precede Disallows for that directory
  • Do not add a googlebot-specific user-agent without replicating what is in the universal one - google ignores User-agent:* if there is User-agent: Googlebot
  • Allow good bots
  • UChicago legacy
  • AI Data Scraper
  • https://darkvisitors.com/agents/ai2bot
  • AI Data Scraper
  • https://darkvisitors.com/agents/applebot-extended
  • AI Data Scraper
  • https://darkvisitors.com/agents/bytespider
  • AI Data Scraper
  • https://darkvisitors.com/agents/ccbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/claudebot
  • AI Data Scraper
  • https://darkvisitors.com/agents/cohere-training-data-crawler
  • AI Data Scraper
  • https://darkvisitors.com/agents/cotoyogi
  • AI Data Scraper
  • https://darkvisitors.com/agents/datenbank-crawler
  • AI Data Scraper
  • https://darkvisitors.com/agents/diffbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/facebookbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/factset-spyderbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/google-extended
  • AI Data Scraper
  • https://darkvisitors.com/agents/gptbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/icc-crawler
  • AI Data Scraper
  • https://darkvisitors.com/agents/kangaroo-bot
  • AI Data Scraper
  • https://darkvisitors.com/agents/meta-externalagent
  • AI Data Scraper
  • https://darkvisitors.com/agents/netestate-imprint-crawler
  • AI Data Scraper
  • https://darkvisitors.com/agents/omgili
  • AI Data Scraper
  • https://darkvisitors.com/agents/pangubot
  • AI Data Scraper
  • https://darkvisitors.com/agents/petalbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/semrushbot-ocob
  • AI Data Scraper
  • https://darkvisitors.com/agents/timpibot
  • AI Data Scraper
  • https://darkvisitors.com/agents/webzio-extended
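
The first comment, that Allow lines must precede Disallow lines for the same directory, matters for parsers that apply the first matching rule. The sketch below illustrates this with Python's standard `urllib.robotparser`, which uses first-match semantics. Parsers that follow RFC 9309 pick the longest matching path instead, so for them the order would not matter. The agent name `ExampleBot` and the helper `can_fetch` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

BASE = "https://www.lib.uchicago.edu"
TARGET = "/cgi-bin/nand/search/stc"


def can_fetch(rules, path, agent="ExampleBot"):
    """Parse `rules` as the body of a `User-agent: *` group and test `path`."""
    parser = RobotFileParser()
    parser.parse(["User-agent: *", *rules])
    return parser.can_fetch(agent, BASE + path)


# Same two rules, in the two possible orders.
ALLOW_FIRST = ["Allow: /cgi-bin/nand/search/stc", "Disallow: /cgi-bin/"]
DISALLOW_FIRST = list(reversed(ALLOW_FIRST))

allow_first_result = can_fetch(ALLOW_FIRST, TARGET)
disallow_first_result = can_fetch(DISALLOW_FIRST, TARGET)

print(allow_first_result)     # True: the Allow line is reached first
print(disallow_first_result)  # False: Disallow /cgi-bin/ matches first and wins
```

Listing the Allow lines first, as the file does, keeps the result the same under both first-match and longest-match parsers.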