
Robots Exclusion Standard data for blimamma.se

Resource Scan

Scan Details

Site Domain blimamma.se
Base Domain blimamma.se
Scan Status Ok
Last Scan 2025-10-11T19:35:05+00:00
Next Scan 2025-10-18T19:35:05+00:00

Last Scan

Scanned 2025-10-11T19:35:05+00:00
URL https://blimamma.se/robots.txt
Domain IPs 104.21.8.47, 172.67.156.215, 2606:4700:3031::ac43:9cd7, 2606:4700:3033::6815:82f
Response IP 104.21.8.47
Found Yes
Hash 9e6a34bd137a75fcb19522ea20f1123eb360c7f28cfef1f52e499adf43dcaa05
SimHash 783a9720cd82

Groups

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

openai-user

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

duckduckbot

Rule Path
Disallow /

facebookexternalhit

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

yeti

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

bytedancespider

Rule Path
Disallow /

Comments

  • Block OpenAI's GPTBot
  • Block Google's AI data scraper (Google-Extended)
  • Block Anthropic's ClaudeBot
  • Block Common Crawl (used for AI training datasets)
  • Block OpenAI's web crawler used for training
  • Block Perplexity AI (used for AI-generated responses)
  • Block You.com AI search engine crawler
  • Block DuckDuckGo's AI-enhanced search bot
  • Block Meta (Facebook) AI data gathering bot
  • Block Amazon's AI data collection bot
  • Block Naver's AI research bot (Korea)
  • Block Baidu AI and Search crawlers (China)
  • Block TikTok AI data crawler
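
The groups listed above can be checked programmatically. The following is a minimal sketch, assuming the file consists of one `User-agent` / `Disallow: /` pair per listed group. The text is rebuilt from this report rather than fetched from the site, so it omits the original comments and will not reproduce the recorded hash. The names `BLOCKED_AGENTS`, `build_robots_txt`, `make_parser`, and `SomeOtherBot` are illustrative and do not come from the scan.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens taken from the "Groups" section of this report.
BLOCKED_AGENTS = [
    "gptbot", "google-extended", "claudebot", "ccbot", "openai-user",
    "perplexitybot", "youbot", "duckduckbot", "facebookexternalhit",
    "ia_archiver", "yeti", "baiduspider", "bytedancespider",
]


def build_robots_txt(agents):
    """Rebuild an approximate robots.txt: one group per agent, each disallowing '/'."""
    groups = [f"User-agent: {agent}\nDisallow: /" for agent in agents]
    return "\n\n".join(groups) + "\n"


def make_parser(text):
    """Parse robots.txt text with the standard-library parser (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


ROBOTS_TXT = build_robots_txt(BLOCKED_AGENTS)
PARSER = make_parser(ROBOTS_TXT)

if __name__ == "__main__":
    # "SomeOtherBot" is a hypothetical agent that no group names.
    for agent in BLOCKED_AGENTS + ["SomeOtherBot"]:
        allowed = PARSER.can_fetch(agent, "https://blimamma.se/")
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Because the report lists no `*` group, an agent that is not named in any group is treated as allowed by this parser. Compliance with these rules is voluntary on the crawler's side.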