Robots Exclusion Standard data for miskatonic.org

Resource Scan

Scan Details

Site Domain miskatonic.org
Base Domain miskatonic.org
Scan Status Ok
Last Scan 2025-10-17T14:40:12+00:00
Next Scan 2025-11-16T14:40:12+00:00

Last Scan

Scanned 2025-10-17T14:40:12+00:00
URL https://miskatonic.org/robots.txt
Domain IPs 209.68.16.207
Response IP 209.68.16.207
Found Yes
Hash 3f0e7f172e2b95998354297077fec3697378d5c5bef044734716c4ffcc5a7ab0
SimHash 701c5b50c69f

Groups

*

Rule Path
Disallow (empty)

Other Records

Field Value
crawl-delay 2

dotbot

Rule Path
Disallow /

dotnetdotcom

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

spbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

betabot

Rule Path
Disallow /

trident

Rule Path
Disallow /

tagoobot

Rule Path
Disallow /

alexa

Rule Path
Disallow /

exabot

Rule Path
Disallow /

discoveryengine

Rule Path
Disallow /

scholaruniverse

Rule Path
Disallow /

myspace

Rule Path
Disallow /

discobot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

Other Records

Field Value
sitemap http://www.miskatonic.org/sitemap.xml
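
To illustrate how a crawler might interpret the groups listed above, here is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT string is a hand-written reconstruction of only a few of the groups (the original file's exact casing, ordering, and spacing are assumptions), and "ExampleBot" is a hypothetical user agent that matches none of the named groups.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (an assumption, not the verbatim file): the "*" group
# with an empty Disallow and a crawl delay, two of the blocked bots, and the
# sitemap record.
ROBOTS_TXT = """\
User-agent: *
Disallow:
Crawl-delay: 2

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

Sitemap: http://www.miskatonic.org/sitemap.xml
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network request is made here.
parser.parse(ROBOTS_TXT.splitlines())

# A named group with "Disallow: /" blocks the whole site for that agent.
print(parser.can_fetch("GPTBot", "https://miskatonic.org/"))

# An agent with no group of its own falls back to "*"; an empty Disallow
# path permits everything.
print(parser.can_fetch("ExampleBot", "https://miskatonic.org/any/page"))

# The crawl-delay record belongs to the "*" group only.
print(parser.crawl_delay("ExampleBot"))
print(parser.crawl_delay("GPTBot"))

# Sitemap records are file-level rather than tied to a group
# (site_maps() requires Python 3.8 or later).
print(parser.site_maps())
```

Agent names are matched case-insensitively by this parser, which is consistent with the lowercase group names shown in the report.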

Comments

  • All these scraping bots I lifted from https://ruk.ca/robots.txt