torontopubliclibrary.ca
robots.txt

Robots Exclusion Standard data for torontopubliclibrary.ca

Resource Scan

Scan Details

Site Domain torontopubliclibrary.ca
Base Domain torontopubliclibrary.ca
Scan Status Ok
Last Scan 2024-09-18T14:12:03+00:00
Next Scan 2024-10-18T14:12:03+00:00

Last Scan

Scanned 2024-09-18T14:12:03+00:00
URL https://torontopubliclibrary.ca/robots.txt
Redirect https://www.torontopubliclibrary.ca/robots.txt
Redirect Domain www.torontopubliclibrary.ca
Redirect Base torontopubliclibrary.ca
Domain IPs 147.154.3.128
Redirect IPs 192.29.39.162, 192.29.39.51, 192.29.39.98
Response IP 147.154.1.1
Found Yes
Hash f0ddbfadd05d496bceaf984015177f2c39611a6f972a28742fe663a60345ea62
SimHash f4db5bc3c6f4

Groups

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 30

*

Rule Path
Disallow /branch-computer/
Disallow /components/
Disallow /config/
Disallow /eblast/
Disallow /it-essentials/
Disallow /kids-computer-rr/
Disallow /kids-computer/
Disallow /kidsstop/
Disallow /kiosk/
Disallow /placehold
Disallow /research-computer/
Disallow /rss.jsp
Disallow /search.jsp
Disallow /share-item-detail.jsp
Disallow /xml/
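The crawl-delay and path rules above can be exercised with Python's standard-library robots.txt parser. This is a minimal sketch: the robots.txt text below is reconstructed from the scan report (only two of the Disallow paths are included for brevity), not fetched from the live site.

```python
# Sketch: testing the reported wildcard-group rules with Python's
# stdlib parser. The robots.txt content is reconstructed from the
# scan above (abridged to two Disallow lines).
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Crawl-delay: 30
Disallow: /search.jsp
Disallow: /xml/
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Disallowed path is blocked; an unlisted path is allowed.
print(rp.can_fetch("*", "https://www.torontopubliclibrary.ca/search.jsp"))  # False
print(rp.can_fetch("*", "https://www.torontopubliclibrary.ca/books"))       # True
print(rp.crawl_delay("*"))  # 30
```

Note that `Crawl-delay` is parsed by `urllib.robotparser` even though, as the file's own comments say, it is non-standard and ignored by some crawlers.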

amazonbot
applebot
applebot-extended
bytespider
ccbot
chatgpt-user
claude-web
claudebot
doc
diffbot
download ninja
facebookbot
fetch
friendlycrawler
gptbot
google-extended
googleother
googleother-image
googleother-video
httrack
icc-crawler
imagesiftbot
msiecrawler
mediapartners-google*
meta-externalagent
meta-externalfetcher
microsoft.url.control
npbot
oai-searchbot
offline explorer
perplexitybot
petalbot
scrapy
sitesnagger
teleport
teleportpro
timpibot
ubicrawler
velenpublicwebcrawler
webcopier
webreaper
webstripper
webzip
webzio-extended
xenu
youbot
zao
zealbot
zyborg
anthropic-ai
cohere-ai
facebookexternalhit
grub-client
img2dataset
larbin
libwww
linko
omgili
omgilibot
sitecheck.internetseer.com
wget

Rule Path
Disallow /
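The group above gives its long list of AI and scraper user-agents a blanket `Disallow: /`. A hedged sketch of how that evaluates, again using Python's stdlib parser with a reconstruction of just three of the listed agents:

```python
# Sketch: verifying the blanket block on the AI/scraper group,
# reconstructed from the scan (three representative agents only).
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: gptbot
User-agent: claudebot
User-agent: ccbot
Disallow: /
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

# Listed agents are denied everywhere; an unlisted agent is unaffected
# because this reconstruction has no wildcard group.
print(rp.can_fetch("GPTBot", "https://www.torontopubliclibrary.ca/"))       # False
print(rp.can_fetch("SomeOtherBot", "https://www.torontopubliclibrary.ca/")) # True
```

User-agent matching here is case-insensitive substring matching on the token before any `/`, which is why `GPTBot` matches the `gptbot` group.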

Comments

  • robots.txt for www.torontopubliclibrary.ca
  • Updated: 2024-09-15
  • "Allow" and "Crawl-delay" are "non-standard"
  • Not supported by all robots, but requests a 30s delay between page loads:
  • https://github.com/ai-robots-txt/ai.robots.txt/blob/main/robots.txt