Robots Exclusion Standard data for hasbrouck.org

Resource Scan

Scan Details

Site Domain hasbrouck.org
Base Domain hasbrouck.org
Scan Status Ok
Last Scan 2024-11-16T18:32:32+00:00
Next Scan 2024-11-23T18:32:32+00:00

Last Scan

Scanned 2024-11-16T18:32:32+00:00
URL https://hasbrouck.org/robots.txt
Domain IPs 104.26.10.60, 104.26.11.60, 172.67.74.96, 2606:4700:20::681a:a3c, 2606:4700:20::681a:b3c, 2606:4700:20::ac43:4a60
Response IP 104.26.11.60
Found Yes
Hash c00fd9c70eec6a3a3765af1e4b237aa066f9ae2fd066c7a5ff5ae2138b20b09e
SimHash 3992d14aeea0
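
A content hash like the one above lets a scanner tell whether robots.txt changed between scans. The report does not name the algorithm; the 64 hexadecimal digits of the Hash value are consistent with SHA-256, which is assumed in this minimal sketch. The function names content_hash and has_changed are hypothetical, and SimHash is not covered here.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex.

    Assumption: the scanner hashes the raw bytes with SHA-256. The report
    only shows a 64-digit hex value and does not state the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """Compare a freshly fetched body against the hash stored from an earlier scan."""
    return content_hash(body) != previous_hash.strip().lower()
```

Comparing digests avoids storing every previous copy of the file; only a changed digest needs to trigger a re-parse of the rules.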

Groups

adsbot-google
adsbot-google-mobile
adsbot-google-mobile-apps
ahrefsbot
aiohttp
amazonbot
anthropic-ai
apis-google
archive.org_bot
awariorssbot
awariosmartbot
barkrowler
bl.uk_lddc_bot
blexbot
buck
bytespider
ccbot
chatgpt
chatgpt-user
claudebot
claude-web
cohere-ai
dataforseobot
diffbot
domainappender
dotbot
facebookbot
feed a fever
genai
go-neb
google-extended
googleother
gptbot
heritrix
httrack
ia_archiver
jamesbot
lcc
linkdexbot
linkedinbot
ltx71
magpie-crawler
meta-externalagent
mediapartners-google
mj12bot
obot
omgili
omgilibot
perplexitybot
rankactivelinkbot
rogerbot
semrushbot
semrushbot-ba
semrushbot-bm
semrushbot-ct
semrushbot-sa
semrushbot-seoab
semrushbot-si
semrushbot-swa
serpstatbot
spbot
special_archiver
xovibot
youbot
zoominfobot

Rule Path
Disallow /

*

Rule Path
Disallow /cgi-bin/
Disallow /mt/
Disallow /mt-static/
Disallow /ehasbrouck.asc

Other Records

Field Value
crawl-delay 1

Other Records

Field Value
sitemap https://hasbrouck.org/sitemap.xml
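
The groups and rules in this report can be checked programmatically. Below is a minimal sketch using Python's standard urllib.robotparser against a robots.txt reconstructed from the report. Assumptions: the reconstructed text lists only three of the blocked user agents for brevity, its layout and ordering may differ from the live file, and "examplebot" is a hypothetical crawler name standing in for any agent not named in the Groups list.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan report, NOT copied from the live file.
# Only three of the blocked agents are listed here to keep the sketch short.
ROBOTS_TXT = """\
User-agent: gptbot
User-agent: ccbot
User-agent: claudebot
Disallow: /

User-agent: *
Disallow: /cgi-bin/
Disallow: /mt/
Disallow: /mt-static/
Disallow: /ehasbrouck.asc
Crawl-delay: 1

Sitemap: https://hasbrouck.org/sitemap.xml
"""

BASE = "https://hasbrouck.org"


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = load_rules(ROBOTS_TXT)

# Agents named in the first group are denied everything.
blocked_root = parser.can_fetch("gptbot", BASE + "/")

# Any other agent falls through to the "*" group.
other_root = parser.can_fetch("examplebot", BASE + "/")
other_mt = parser.can_fetch("examplebot", BASE + "/mt/")

# The crawl-delay record belongs to the "*" group only.
other_delay = parser.crawl_delay("examplebot")

# Sitemap records are global (site_maps() requires Python 3.8+).
sitemaps = parser.site_maps()
```

The named agents receive a blanket Disallow, while every other crawler is only kept out of the four listed paths and asked to wait one second between requests.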