dpe.org
robots.txt

Robots Exclusion Standard data for dpe.org

Resource Scan

Scan Details

Site Domain dpe.org
Base Domain dpe.org
Scan Status Ok
Last Scan 2025-12-16T08:11:37+00:00
Next Scan 2026-01-15T08:11:37+00:00

Last Scan

Scanned 2025-12-16T08:11:37+00:00
URL https://dpe.org/robots.txt
Domain IPs 104.21.28.32, 172.67.170.57, 2606:4700:3034::ac43:aa39, 2606:4700:3037::6815:1c20
Response IP 104.21.28.32
Found Yes
Hash d18857c71bf29d47c0c56f0207f5836ebf6dd75e4c2a181314071977fc1e4fb7
SimHash 6c5e9b0a45f7
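
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. A minimal sketch of how such a fingerprint could be used to detect changes between scans, assuming SHA-256 over the raw response body (the function names `sha256_hex` and `has_changed` are hypothetical, and the SimHash value is not covered here):

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a response body as lowercase hex.

    Assumption: the scan's "Hash" field is a SHA-256 digest of the raw
    bytes of robots.txt; the report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """Compare a freshly fetched body against the hash from an earlier scan."""
    return sha256_hex(body) != previous_hash.strip().lower()


if __name__ == "__main__":
    # Hash recorded in the scan above; the body here is a placeholder,
    # so this comparison is only illustrative.
    recorded = "d18857c71bf29d47c0c56f0207f5836ebf6dd75e4c2a181314071977fc1e4fb7"
    print(has_changed(b"User-agent: *\nAllow: /\n", recorded))
```

An exact hash only signals that some byte changed; a similarity hash such as the SimHash field is the kind of value that would indicate how much the content changed.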

Groups

*

Rule Path
Allow /

gptbot

Rule Path
Allow /

claudebot

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

google-extended

Rule Path
Allow /

bingai

Rule Path
Allow /

ccbot

Rule Path
Allow /

youbot

Rule Path
Allow /

neevaai

Rule Path
Allow /

stabilityai

Rule Path
Allow /

cohereai

Rule Path
Allow /

Other Records

Field Value
sitemap https://dpe.org/sitemap.xml
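
The groups and the sitemap record listed above can be checked programmatically. The following is a sketch that rebuilds an equivalent robots.txt from the scan data and evaluates it with Python's standard `urllib.robotparser`. The rebuilt text is an assumption: it reproduces the rules reported here, not the original file's exact layout or comments, and the helper names `build_robots_txt` and `is_allowed` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User-agent groups as reported by the scan; each one carries "Allow: /".
AGENTS = [
    "*", "gptbot", "claudebot", "perplexitybot", "google-extended",
    "bingai", "ccbot", "youbot", "neevaai", "stabilityai", "cohereai",
]
SITEMAP = "https://dpe.org/sitemap.xml"


def build_robots_txt(agents, sitemap):
    """Rebuild a robots.txt equivalent to the scanned groups (layout assumed)."""
    lines = []
    for agent in agents:
        lines.append(f"User-agent: {agent}")
        lines.append("Allow: /")
        lines.append("")  # a blank line separates groups
    lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines)


def is_allowed(robots_txt, agent, url):
    """Return True if the given agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    text = build_robots_txt(AGENTS, SITEMAP)
    for agent in ("gptbot", "ccbot", "some-unlisted-bot"):
        print(agent, is_allowed(text, agent, "https://dpe.org/any/page"))
```

Because every group, including the `*` wildcard, allows `/`, the per-agent groups do not change the outcome for any crawler: an unlisted agent falls back to `*` and is allowed as well.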

Comments

  • robots.txt for dpe.org
  • Rules for search engines and AI crawlers
  • --- Search Engine Crawlers ---
  • Point to the sitemap index (this file links to other sitemaps)
  • --- AI Crawlers (mirrors llms.txt) ---
  • OpenAI (ChatGPT / GPTBot)
  • Anthropic (Claude)
  • Perplexity AI
  • Google AI (Bard / Gemini crawlers)
  • Microsoft Bing AI
  • Commoncrawl (feeds many AI models)
  • You.com
  • Neeva AI (Snowflake)
  • Stability AI
  • Cohere