Robots Exclusion Standard data for mrdizzy.blog

Resource Scan

Scan Details

Site Domain mrdizzy.blog
Base Domain mrdizzy.blog
Scan Status Ok
Last Scan 2025-06-14T11:14:31+00:00
Next Scan 2025-06-21T11:14:31+00:00

Last Scan

Scanned 2025-06-14T11:14:31+00:00
URL https://mrdizzy.blog/robots.txt
Domain IPs 198.51.233.1, 2620:2:6000::bad:dab:cafe
Response IP 198.51.233.1
Found Yes
Hash eb3c083073c56639920c42074e3bbf02230de8adcaba91eacb3fc9e8df30c10a
SimHash 6518fb006583
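
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The report does not say which algorithm or which bytes were hashed, so the following is only a minimal sketch under the assumption that the value is SHA-256 over the raw response body of robots.txt. The names robots_hash and matches_scan are hypothetical helpers, not part of any scanner API.

```python
import hashlib

# Hash value copied from the scan report above.
SCAN_HASH = "eb3c083073c56639920c42074e3bbf02230de8adcaba91eacb3fc9e8df30c10a"


def robots_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def matches_scan(body: bytes, expected: str = SCAN_HASH) -> bool:
    """Compare a locally fetched body against the hash recorded by the scan."""
    return robots_hash(body) == expected.lower()
```

If the assumption holds, passing the exact bytes served at the URL above to matches_scan would return True for an unchanged file. A mismatch could also mean the scanner normalizes the content or uses a different algorithm.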

Groups

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

ariadne

Rule Path
Allow /

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

slurp

Rule Path
Allow /

duckduckbot

Rule Path
Allow /

applebot

Rule Path
Allow /

baiduspider

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

*

Rule Path
Allow /

Comments

  • Block AI Crawlers
  • Allow Search Engine Crawlers
  • Clew.se
  • Default rule
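
The groups listed above can be checked with Python's standard urllib.robotparser module. The sketch below rebuilds a robots.txt from the scanned groups and asks whether a given crawler may fetch the site root. The rebuilt file is an assumption: the scan reports lowercased user-agent tokens and one rule per group, so the original file's casing, comments, and grouping may differ. The names GROUPS, build_robots_txt, and make_parser are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# (user-agent token, rule) pairs in the order reported by the scan.
GROUPS = [
    ("gptbot", "Disallow"), ("chatgpt-user", "Disallow"), ("ccbot", "Disallow"),
    ("google-extended", "Disallow"), ("anthropic-ai", "Disallow"),
    ("claude-web", "Disallow"), ("claudebot", "Disallow"),
    ("cohere-ai", "Disallow"), ("perplexitybot", "Disallow"),
    ("facebookbot", "Disallow"), ("ariadne", "Allow"), ("googlebot", "Allow"),
    ("bingbot", "Allow"), ("slurp", "Allow"), ("duckduckbot", "Allow"),
    ("applebot", "Allow"), ("baiduspider", "Disallow"),
    ("yandexbot", "Disallow"), ("*", "Allow"),
]


def build_robots_txt() -> str:
    """Rebuild a robots.txt with one group per agent (assumed layout)."""
    blocks = [f"User-agent: {agent}\n{rule}: /" for agent, rule in GROUPS]
    return "\n\n".join(blocks) + "\n"


def make_parser() -> RobotFileParser:
    """Parse the rebuilt file locally, without any network request."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


if __name__ == "__main__":
    parser = make_parser()
    for agent in ("GPTBot", "Googlebot", "Baiduspider", "SomeOtherBot"):
        allowed = parser.can_fetch(agent, "https://mrdizzy.blog/")
        print(agent, "allowed" if allowed else "blocked")
```

Agents without a group of their own, such as the made-up SomeOtherBot, fall through to the default `*` group. Note that urllib.robotparser matches agent names case-insensitively by substring, which is one parser's interpretation and not necessarily how each crawler reads the file.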