deepsiteai.com
robots.txt

Robots Exclusion Standard data for deepsiteai.com

Resource Scan

Scan Details

Site Domain deepsiteai.com
Base Domain deepsiteai.com
Scan Status Ok
Last Scan 2025-09-23T22:20:54+00:00
Next Scan 2025-09-30T22:20:54+00:00

Last Scan

Scanned 2025-09-23T22:20:54+00:00
URL https://deepsiteai.com/robots.txt
Domain IPs 104.21.69.46, 172.67.204.86, 2606:4700:3032::ac43:cc56, 2606:4700:3036::6815:452e
Response IP 172.67.204.86
Found Yes
Hash 0dd8f24c5a8b6780ce9155408e2010d0cceaa85d88355e3d114183032ccf7f54
SimHash 69152bb165c2

Groups

*

Rule Path
Allow /
Disallow /_next/
Disallow /api/
Disallow /checkout/confirmation
Disallow /admin/
Disallow /chat/
Disallow /s/
Disallow /en/s/
Disallow /ar/s/
Disallow /ja/s/
Disallow /ru/s/
Disallow /ko/s/
Disallow /de/s/
Disallow /fr/s/
Disallow /es/s/
Disallow /pt/s/
Disallow /tr/s/
Disallow /zh/s/
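
The group above combines a blanket Allow with narrower Disallow rules, so the result for a given path depends on rule precedence. Below is a minimal sketch of the longest-match precedence described in RFC 9309, assuming plain prefix matching (none of the listed rules use wildcards). The names RULES and is_allowed are hypothetical and not part of the scan, and individual crawlers may evaluate rules differently.

```python
# Rules copied from the "*" group in this report, as (kind, path prefix) pairs.
RULES = [
    ("allow", "/"),
    ("disallow", "/_next/"),
    ("disallow", "/api/"),
    ("disallow", "/checkout/confirmation"),
    ("disallow", "/admin/"),
    ("disallow", "/chat/"),
    ("disallow", "/s/"),
    ("disallow", "/en/s/"),
    ("disallow", "/ar/s/"),
    ("disallow", "/ja/s/"),
    ("disallow", "/ru/s/"),
    ("disallow", "/ko/s/"),
    ("disallow", "/de/s/"),
    ("disallow", "/fr/s/"),
    ("disallow", "/es/s/"),
    ("disallow", "/pt/s/"),
    ("disallow", "/tr/s/"),
    ("disallow", "/zh/s/"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence.

    Assumption: rules are plain prefixes (no '*' or '$' wildcards).
    """
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for kind, prefix in rules:
        if not path.startswith(prefix):
            continue
        # The longest matching prefix wins; on a tie, Allow is preferred.
        if len(prefix) > best_len or (len(prefix) == best_len and kind == "allow"):
            best_len = len(prefix)
            allowed = kind == "allow"
    return allowed
```

Under this sketch, /pricing is allowed because only "Allow /" matches it, while /api/users is disallowed because "Disallow /api/" is the longer match. A parser that applies the first matching rule instead would allow every path here, since "Allow /" is listed first.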

oai-searchbot
chatgpt-user
gptbot
claude-web
claudebot
anthropic-ai
perplexitybot
perplexity-user
googleother
bingbot
amazonbot
applebot
applebot-extended
facebookbot
meta-externalagent
linkedinbot
bytespider
duckassistbot
cohere-ai
ai2bot
ccbot
diffbot
omgili
timpibot
youbot

Rule Path
Allow /llms.txt
Allow /llms-full.txt
Allow /
Allow /pricing
Allow /r
Allow /m
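
Under RFC 9309, a crawler whose product token matches a named group follows that group and falls back to the "*" group only when no named group matches. The sketch below illustrates that selection for the two groups in this report. It assumes case-insensitive exact token matching, and the names AI_AGENTS, STAR_RULES, AI_RULES, rules_for, and has_disallow are hypothetical. The "*" rules are abbreviated to three entries for brevity.

```python
# User-agent tokens listed in the second group of this report.
AI_AGENTS = {
    "oai-searchbot", "chatgpt-user", "gptbot", "claude-web", "claudebot",
    "anthropic-ai", "perplexitybot", "perplexity-user", "googleother",
    "bingbot", "amazonbot", "applebot", "applebot-extended", "facebookbot",
    "meta-externalagent", "linkedinbot", "bytespider", "duckassistbot",
    "cohere-ai", "ai2bot", "ccbot", "diffbot", "omgili", "timpibot", "youbot",
}

# Abbreviated copy of the "*" group (the full list appears earlier in the report).
STAR_RULES = [("allow", "/"), ("disallow", "/api/"), ("disallow", "/admin/")]

# Rules of the named group, as listed in this report.
AI_RULES = [
    ("allow", "/llms.txt"),
    ("allow", "/llms-full.txt"),
    ("allow", "/"),
    ("allow", "/pricing"),
    ("allow", "/r"),
    ("allow", "/m"),
]


def rules_for(product_token):
    """Pick the rule group for a crawler; named groups take priority over '*'."""
    if product_token.lower() in AI_AGENTS:
        return AI_RULES
    return STAR_RULES


def has_disallow(rules):
    """Return True if the group contains at least one Disallow rule."""
    return any(kind == "disallow" for kind, _ in rules)
```

With this selection, rules_for("GPTBot") returns a group containing only Allow rules, so a crawler that follows RFC 9309 group selection would not inherit the /api/ or /admin/ restrictions from the "*" group. A token absent from the list, such as "Googlebot", falls back to the "*" rules.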

Other Records

Field Value
sitemap https://deepsiteai.com/sitemap.xml

Comments

  • Disallow language-specific /s directories
  • AI crawler-specific rules
  • ——— OPENAI ———
  • ——— ANTHROPIC (Claude) ———
  • ——— PERPLEXITY ———
  • ——— GOOGLE (Gemini) ———
  • ——— MICROSOFT (Bing / Copilot) ———
  • ——— AMAZON ———
  • ——— APPLE ———
  • ——— META ———
  • ——— LINKEDIN ———
  • ——— BYTEDANCE ———
  • ——— DUCKDUCKGO ———
  • ——— COHERE ———
  • ——— ALLEN INSTITUTE / COMMON CRAWL / OTHER RESEARCH ———
  • ——— EMERGING SEARCH START-UPS ———
  • Direct AI crawlers to llms.txt

Warnings

  • 1 invalid line.