fair.org
robots.txt

Robots Exclusion Standard data for fair.org

Resource Scan

Scan Details

Site Domain fair.org
Base Domain fair.org
Scan Status Ok
Last Scan 5/9/2025, 1:55:14 AM
Next Scan 6/8/2025, 1:55:14 AM

Last Scan

Scanned 5/9/2025, 1:55:14 AM
URL https://fair.org/robots.txt
Domain IPs 173.249.144.83
Response IP 173.249.144.83
Found Yes
Hash 74f9f039dd47aaa68c95fd5ed486236a43ec8f718e1040d3baac16b028d33ae9
SimHash 5094d9c4a0a2
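
The Hash value is 64 hexadecimal characters long, which is the length of a SHA-256 digest, although the report does not name the algorithm. The following is a minimal sketch, assuming SHA-256 over the raw response body, of how a scanner might detect that a robots.txt file changed between scans. The function name `body_digest` and both sample bodies are hypothetical and are not taken from fair.org.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a raw robots.txt response body."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical bodies from two consecutive scans (not the real fair.org file).
previous = b"User-agent: *\nDisallow: /wp-admin/\n"
current = b"User-agent: *\nDisallow: /wp-admin/\nCrawl-delay: 10\n"

# Any byte-level difference produces a different digest.
changed = body_digest(previous) != body_digest(current)
print("changed since last scan:", changed)
```

An exact digest flags any byte-level change, including whitespace; a similarity hash such as the SimHash field is presumably there to tell small edits from large ones.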

Groups

*

Rule Path
Allow /wp-content/uploads/
Disallow /wp-content/plugins/
Disallow /wp-admin/
Disallow /wp-login.php

Other Records

Field Value
crawl-delay 10
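
To illustrate how the `*` group above applies to a crawler that has no group of its own, here is a small sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the table, so its exact casing and layout are assumptions, and `ExampleBot` is a hypothetical user agent.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's "*" group; the original formatting is assumed.
WILDCARD_GROUP = """\
User-agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /wp-login.php
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(WILDCARD_GROUP.splitlines())

AGENT = "ExampleBot"  # hypothetical crawler without a dedicated group
for path in ("/", "/wp-content/uploads/logo.png", "/wp-admin/", "/wp-login.php"):
    allowed = parser.can_fetch(AGENT, "https://fair.org" + path)
    print(f"{path}: {'allowed' if allowed else 'blocked'}")

# Crawl-delay is a non-standard field; this parser exposes it in seconds.
print("crawl delay:", parser.crawl_delay(AGENT))
```

Paths that match no rule are allowed by default, so only the three WordPress paths are blocked for such a crawler.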

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

google-inspectiontool

Rule Path
Allow /

google-image

Rule Path
Allow /

google-video

Rule Path
Allow /

googlebot

Rule Path
Allow /

Other Records

Field Value
sitemap https://fair.org/sitemap.xml
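
The named groups above override the `*` group for the crawlers they list: a crawler follows only the most specific group that matches it. The sketch below shows this with `urllib.robotparser` on a partial reconstruction of the file. Agent casing, ordering, and layout in the real file are assumptions, most blocked agents are omitted for brevity, and `ExampleBot` is a hypothetical agent.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction from the report; real casing and order are assumed.
ROBOTS_TXT = """\
User-agent: *
Allow: /wp-content/uploads/
Disallow: /wp-admin/
Crawl-delay: 10

User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /

User-agent: claudebot
Disallow: /

User-agent: googlebot
Allow: /

Sitemap: https://fair.org/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

HOME = "https://fair.org/"
ADMIN = "https://fair.org/wp-admin/"
for agent in ("GPTBot", "CCBot", "ClaudeBot", "Googlebot", "ExampleBot"):
    # Agent names are matched case-insensitively by this parser.
    print(agent, "home:", parser.can_fetch(agent, HOME),
          "admin:", parser.can_fetch(agent, ADMIN))

print("sitemaps:", parser.site_maps())
```

Because the `googlebot` group contains only `Allow /`, the `Disallow` rules and the crawl delay from the `*` group do not apply to it under this parser.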

Comments

  • Specific to Bots
  • Disallowing the OpenAI web crawler
  • Disallowing OpenAI plugins
  • Disallowing Common Crawl
  • Disallowing Google Bard and Vertex AI web crawlers
  • Disallowing various bots
  • Allow Google Search Console for sitemap crawling