editeur.org
robots.txt

Robots Exclusion Standard data for editeur.org

Resource Scan

Scan Details

Site Domain editeur.org
Base Domain editeur.org
Scan Status Ok
Last Scan 2025-09-10T22:01:45+00:00
Next Scan 2025-10-10T22:01:45+00:00

Last Scan

Scanned 2025-09-10T22:01:45+00:00
URL https://editeur.org/robots.txt
Domain IPs 35.214.38.142
Response IP 35.214.38.142
Found Yes
Hash 6023fc7e3a1344be5f98623d05c000701fa5815fbce2c4968588ef01deae0a9d
SimHash 20963bd5a5e4

Groups

*

Rule Path
Disallow /

googlebot
bingbot
slurp
yandexbot
duckduckbot
baiduspider
yeti
ia_archiver

Rule Path
Allow /

amazonbot
anthropic-ai
claude-web
claudebot
applebot-extended
cohere-ai
ccbot
google-extended
facebookbot
gptbot
chatgpt-user
oai-searchbot
perplexitybot
bytedance
bytespider
omgili
omgilibot

Rule Path
Disallow /
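
To illustrate how the three groups above interact, the following is a minimal sketch that rebuilds an equivalent rule set and queries it with Python's standard `urllib.robotparser`. The rule text is reconstructed from the groups listed in this scan, not copied from the live file, which may also contain comments and sitemap lines. The helper names (`build_group`, `build_robots_lines`, `make_parser`, `can_crawl`) and the sample agent `examplecrawler` are hypothetical and exist only for this example.

```python
from urllib.robotparser import RobotFileParser

# Agent lists as reported in the Groups section of this scan.
ALLOWED = [
    "googlebot", "bingbot", "slurp", "yandexbot",
    "duckduckbot", "baiduspider", "yeti", "ia_archiver",
]
BLOCKED = [
    "amazonbot", "anthropic-ai", "claude-web", "claudebot",
    "applebot-extended", "cohere-ai", "ccbot", "google-extended",
    "facebookbot", "gptbot", "chatgpt-user", "oai-searchbot",
    "perplexitybot", "bytedance", "bytespider", "omgili", "omgilibot",
]


def build_group(agents, rule):
    """Return the lines of one robots.txt group: agents first, then one rule."""
    lines = [f"User-agent: {agent}" for agent in agents]
    lines.append(rule)
    return lines


def build_robots_lines():
    """Reconstruct the rule set (an assumption, not the live file's exact text)."""
    lines = build_group(["*"], "Disallow: /")
    lines.append("")  # a blank line separates groups
    lines += build_group(ALLOWED, "Allow: /")
    lines.append("")
    lines += build_group(BLOCKED, "Disallow: /")
    return lines


def make_parser():
    """Parse the reconstructed lines without any network access."""
    parser = RobotFileParser()
    parser.parse(build_robots_lines())
    return parser


def can_crawl(parser, agent, url="https://editeur.org/"):
    """Ask whether the given user agent may fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    parser = make_parser()
    for agent in ("googlebot", "gptbot", "examplecrawler"):
        print(agent, can_crawl(parser, agent))
```

In this sketch a named search crawler such as `googlebot` matches its own group and is allowed, a named AI crawler such as `gptbot` matches the explicit block list, and any unlisted agent falls through to the `*` group and is disallowed. Real crawlers may apply their own matching rules, so the result is indicative only.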

Comments

  • See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
  • 1. Ban MOST spiders from the entire site:
  • 2. Allow CERTAIN spiders access:
  • 3. Sitemaps
  • 4. Specifically disallow some bots
  • Amazon
  • Anthropic AI
  • Apple
  • Cohere
  • Common Crawl
  • Google Bard
  • Meta
  • OpenAI
  • Perplexity AI
  • Bytedance (won't work but shows our intent)
  • Webz.io
  • disallow the above AI bots
  • contact EDItEUR via info@editeur.org