extensionfile.net
robots.txt

Robots Exclusion Standard data for extensionfile.net

Resource Scan

Scan Details

Site Domain extensionfile.net
Base Domain extensionfile.net
Scan Status Ok
Last Scan 2025-12-28T00:03:21+00:00
Next Scan 2026-01-04T00:03:21+00:00

Last Scan

Scanned 2025-12-28T00:03:21+00:00
URL https://extensionfile.net/robots.txt
Domain IPs 104.24.12.80, 104.24.13.80, 172.67.81.198, 2606:4700:20::6818:c50, 2606:4700:20::6818:d50, 2606:4700:20::ac43:51c6
Response IP 104.24.12.80
Found Yes
Hash 21b644292cd62d49385d47fbddec1881691e9bd3ea5f7f5b32bafa9511c396dc
SimHash 4876da70b454
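
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm, so SHA-256 is an assumption here, and the helper name `fingerprint` and the sample body are purely illustrative. A minimal sketch of how such a fingerprint could be used to detect that a robots.txt body changed between scans:

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of a robots.txt body.

    SHA-256 is assumed because the scanned Hash is 64 hex characters;
    the scan report itself does not state which algorithm it uses.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical bodies standing in for two successive fetches.
previous_body = b"User-agent: GPTBot\nDisallow: /\n"
current_body = b"User-agent: GPTBot\nDisallow: /private/\n"

# Any change to the body produces a different digest.
changed = fingerprint(previous_body) != fingerprint(current_body)
print("robots.txt changed:", changed)
```

Because an exact hash changes on any byte difference, a similarity hash such as the SimHash field is usually kept alongside it to judge how large a change was; that part is not sketched here.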

Groups

googlebot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

googlebot-image

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

googlebot-news

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

googlebot-video

Rule Path
Allow /

Other Records

Field Value
crawl-delay 1

adsbot-google

Rule Path
Allow /

mediapartners-google

Rule Path
Allow /

bingbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 2

msnbot

Rule Path
Allow /

Other Records

Field Value
crawl-delay 2

duckduckbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

ai2bot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /
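
The groups above can be exercised with Python's standard-library `urllib.robotparser`. The sketch below is an approximation under stated assumptions: the robots.txt text is reconstructed from four of the scanned groups only, the capitalisation of the user-agent names is assumed, and the original file (including its invalid lines) is not reproduced in this report.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed subset of the scanned groups (an assumption, not the
# original file, which this report does not reproduce verbatim).
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /
Crawl-delay: 1

User-agent: Bingbot
Allow: /
Crawl-delay: 2

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://extensionfile.net/"

# "SomeOtherBot" is a made-up agent with no group of its own.
for agent in ("Googlebot", "Bingbot", "GPTBot", "ClaudeBot", "SomeOtherBot"):
    allowed = parser.can_fetch(agent, url)
    delay = parser.crawl_delay(agent)
    print(f"{agent}: allowed={allowed}, crawl-delay={delay}")
```

In this sketch an agent with no matching group is allowed, because the reconstructed text contains no `*` group. `crawl_delay` returns `None` for groups that carry no crawl-delay record.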

Comments

  • Allow Google
  • Allow Bing / Microsoft
  • Block AI / LLM / Data Scrapers
  • OpenAI GPTBot
  • CCBot (Common Crawl AI training)
  • AmazonBot / AGI crawlers
  • Anthropic Claude crawler
  • Perplexity AI
  • You.com
  • AI research scrapers
  • Block Chinese/Russian large-scale scrapers
  • Block SEO tools & aggressive scrapers
  • Generic AI-crawler catch-all
  • But re-allow the ones we explicitly trust

Warnings

  • 7 invalid lines.