fileanalysis.org
robots.txt

Robots Exclusion Standard data for fileanalysis.org

Resource Scan

Scan Details

Site Domain fileanalysis.org
Base Domain fileanalysis.org
Scan Status Ok
Last Scan 2026-02-19T10:07:55+00:00
Next Scan 2026-02-26T10:07:55+00:00

Last Scan

Scanned 2026-02-19T10:07:55+00:00
URL https://fileanalysis.org/robots.txt
Domain IPs 104.21.71.77, 172.67.143.232, 2606:4700:3031::ac43:8fe8, 2606:4700:3037::6815:474d
Response IP 104.21.71.77
Found Yes
Hash 644e600bb2720827b699d36825fe7adc4d522b03d4f5a3cbbd6b3d455bd293b3
SimHash 629c0b5284c6
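
The Hash field above is 64 hexadecimal digits, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say which bytes were hashed. As a minimal sketch under that assumption (the helper names `sha256_hex` and `matches_reported` are illustrative, not part of the scan tool), a fetched robots.txt body could be compared against the reported value like this:

```python
import hashlib

# Hash value as reported in the scan details above.
REPORTED_HASH = "644e600bb2720827b699d36825fe7adc4d522b03d4f5a3cbbd6b3d455bd293b3"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw response body as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes) -> bool:
    """True if the body hashes to the value in the report.

    Assumption: the scanner hashes the unmodified response bytes with
    SHA-256. If it normalizes the file first, this check will not match.
    """
    return sha256_hex(body) == REPORTED_HASH
```

A mismatch would not necessarily mean the file changed; it may only mean the assumption about the algorithm or input bytes is wrong.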

Groups

*

Rule Path
Allow /
Disallow /api-proxy.php

gptbot

Rule Path
Allow /

chatgpt-user

Rule Path
Allow /

claudebot

Rule Path
Allow /

anthropic-ai

Rule Path
Allow /

google-extended

Rule Path
Allow /

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

ccbot

Rule Path
Allow /

perplexitybot

Rule Path
Allow /

cohere-ai

Rule Path
Allow /

facebookbot

Rule Path
Allow /

meta-externalagent

Rule Path
Allow /

applebot-extended

Rule Path
Allow /
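
The groups above interact in a way that is easy to miss. Under the Robots Exclusion Protocol (RFC 9309), a crawler that finds a group for its own user-agent token obeys only that group and falls back to `*` otherwise, and within a group the longest matching path wins. The sketch below encodes the groups listed in this scan and evaluates them that way. It is a simplified illustration, not a full parser: the function names are hypothetical, user-agent matching is an exact case-insensitive token comparison, and wildcard (`*`) and end-of-path (`$`) patterns are not handled.

```python
# Named groups from the scan; each has the single rule "Allow /".
NAMED_AGENTS = [
    "gptbot", "chatgpt-user", "claudebot", "anthropic-ai", "google-extended",
    "googlebot", "bingbot", "ccbot", "perplexitybot", "cohere-ai",
    "facebookbot", "meta-externalagent", "applebot-extended",
]

# user-agent token -> list of (rule, path prefix) pairs
GROUPS = {
    "*": [("allow", "/"), ("disallow", "/api-proxy.php")],
    **{name: [("allow", "/")] for name in NAMED_AGENTS},
}


def select_group(user_agent: str):
    """Use the agent's own group if one exists, otherwise the '*' group."""
    return GROUPS.get(user_agent.lower(), GROUPS["*"])


def is_allowed(user_agent: str, path: str) -> bool:
    """Longest-prefix match; on a tie in length, allow wins."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for rule, prefix in select_group(user_agent):
        if not path.startswith(prefix):
            continue
        longer = len(prefix) > best_len
        tie_allow = len(prefix) == best_len and rule == "allow"
        if longer or tie_allow:
            best_len = len(prefix)
            allowed = rule == "allow"
    return allowed


print(is_allowed("examplebot", "/api-proxy.php"))  # falls back to '*'
print(is_allowed("gptbot", "/api-proxy.php"))      # uses its own group
```

With these rules, an unlisted crawler is denied `/api-proxy.php`, while every named crawler is allowed to fetch it, because the named groups contain no Disallow rule of their own. Note that Python's standard `urllib.robotparser` applies rules in file order rather than by longest match, so it may give a different answer for the `*` group.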

Other Records

Field Value
sitemap https://fileanalysis.org/sitemap.xml

Comments

  • FileAnalysis.org - Document Search & Analysis Platform
  • We welcome AI crawlers and tools
  • Block internal resources
  • AI Crawlers - Explicitly Allowed
  • We encourage AI systems to index and learn from our public document collections
  • AI Agent Discovery
  • See llms.txt for AI-specific site information
  • See /api/ endpoints for programmatic access