torontopubliclibrary.ca
robots.txt

Robots Exclusion Standard data for torontopubliclibrary.ca

Resource Scan

Scan Details

Site Domain torontopubliclibrary.ca
Base Domain torontopubliclibrary.ca
Scan Status Ok
Last Scan 2024-09-18T14:12:03+00:00
Next Scan 2024-10-18T14:12:03+00:00

Last Scan

Scanned 2024-09-18T14:12:03+00:00
URL https://torontopubliclibrary.ca/robots.txt
Redirect https://www.torontopubliclibrary.ca/robots.txt
Redirect Domain www.torontopubliclibrary.ca
Redirect Base torontopubliclibrary.ca
Domain IPs 147.154.3.128
Redirect IPs 192.29.39.162, 192.29.39.51, 192.29.39.98
Response IP 147.154.1.1
Found Yes
Hash f0ddbfadd05d496bceaf984015177f2c39611a6f972a28742fe663a60345ea62
SimHash f4db5bc3c6f4

Groups

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 30

*

Rule Path
Disallow /branch-computer/
Disallow /components/
Disallow /config/
Disallow /eblast/
Disallow /it-essentials/
Disallow /kids-computer-rr/
Disallow /kids-computer/
Disallow /kidsstop/
Disallow /kiosk/
Disallow /placehold
Disallow /research-computer/
Disallow /rss.jsp
Disallow /search.jsp
Disallow /share-item-detail.jsp
Disallow /xml/
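The crawl-delay and path rules above can be exercised with Python's standard-library robots.txt parser. This is a minimal sketch: the robots.txt text below is reconstructed from the scan report (only two of the Disallow paths are included for brevity), not fetched from the live site.

```python
# Sketch: testing the reported wildcard-group rules with Python's
# stdlib parser. The robots.txt content is reconstructed from the
# scan above (abridged to two Disallow lines).
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Crawl-delay: 30
Disallow: /search.jsp
Disallow: /xml/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Disallowed path is blocked; an unlisted path is allowed.
print(rp.can_fetch("*", "https://www.torontopubliclibrary.ca/search.jsp"))  # False
print(rp.can_fetch("*", "https://www.torontopubliclibrary.ca/books"))       # True
print(rp.crawl_delay("*"))  # 30
```

Note that `Crawl-delay` is parsed by `urllib.robotparser` even though, as the file's own comments say, it is non-standard and ignored by some crawlers.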

amazonbot
applebot
applebot-extended
bytespider
ccbot
chatgpt-user
claude-web
claudebot
doc
diffbot
download ninja
facebookbot
fetch
friendlycrawler
gptbot
google-extended
googleother
googleother-image
googleother-video
httrack
icc-crawler
imagesiftbot
msiecrawler
mediapartners-google*
meta-externalagent
meta-externalfetcher
microsoft.url.control
npbot
oai-searchbot
offline explorer
perplexitybot
petalbot
scrapy
sitesnagger
teleport
teleportpro
timpibot
ubicrawler
velenpublicwebcrawler
webcopier
webreaper
webstripper
webzip
webzio-extended
xenu
youbot
zao
zealbot
zyborg
anthropic-ai
cohere-ai
facebookexternalhit
grub-client
img2dataset
larbin
libwww
linko
omgili
omgilibot
sitecheck.internetseer.com
wget

Rule Path
Disallow /
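The group above gives its long list of AI and scraper user-agents a blanket `Disallow: /`. A hedged sketch of how that evaluates, again using Python's stdlib parser with a reconstruction of just three of the listed agents:

```python
# Sketch: verifying the blanket block on the AI/scraper group,
# reconstructed from the scan (three representative agents only).
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Listed agents are denied everywhere; an unlisted agent is unaffected
# because this reconstruction has no wildcard group.
print(rp.can_fetch("GPTBot", "https://www.torontopubliclibrary.ca/"))       # False
print(rp.can_fetch("SomeOtherBot", "https://www.torontopubliclibrary.ca/")) # True
```

User-agent matching here is case-insensitive substring matching on the token before any `/`, which is why `GPTBot` matches the `gptbot` group.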

Comments

  • robots.txt for www.torontopubliclibrary.ca
  • Updated: 2024-09-15
  • "Allow" and "Crawl-delay" are "non-standard"
  • Not supported by all robots, but requests a 30s delay between page loads:
  • https://github.com/ai-robots-txt/ai.robots.txt/blob/main/robots.txt