conservationevidence.com
robots.txt

Robots Exclusion Standard data for conservationevidence.com

Resource Scan

Scan Details

Site Domain conservationevidence.com
Base Domain conservationevidence.com
Scan Status Ok
Last Scan 2026-02-19T23:31:53+00:00
Next Scan 2026-03-21T23:31:53+00:00

Last Scan

Scanned 2026-02-19T23:31:53+00:00
URL https://conservationevidence.com/robots.txt
Domain IPs 104.26.8.213, 104.26.9.213, 172.67.72.167, 2606:4700:20::681a:8d5, 2606:4700:20::681a:9d5, 2606:4700:20::ac43:48a7
Response IP 172.67.72.167
Found Yes
Hash 89208bcb1a73645cc46c9f62c7fd45e861e1b8695d496b89c56542f0625c51a0
SimHash c634c251c7d5
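
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. Assuming it is SHA-256 over the raw response body, a change check between scans could be sketched as follows. The function names `sha256_hex` and `has_changed` are hypothetical, and the SimHash field is not reproduced here because its construction is not described.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of a raw response body."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body.

    Assumption: the stored value is a SHA-256 hex digest of the exact bytes
    returned by the server (no normalisation of line endings or encoding).
    """
    return sha256_hex(body) != previous_hash.strip().lower()
```

If the scanner normalises the file before hashing, a digest computed this way would not match the reported value, so a mismatch alone does not prove the file changed.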

Groups

*

Rule Path
Allow /

amazonbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

*

Rule Path
Disallow

Other Records

Field Value
crawl-delay 30

googlebot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 30

bingbot

Rule Path
Disallow

Other Records

Field Value
crawl-delay 30

blexbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claude-user

Rule Path
Disallow /

claude-searchbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

perplexity-user

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /
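
To see how a client might evaluate groups like those listed above, here is a minimal sketch using Python's standard `urllib.robotparser`. The `SAMPLE` text is an abbreviated reconstruction from the group listing, not the verbatim file, and `build_parser` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Assumption: abbreviated reconstruction of a few groups from the report,
# not the verbatim robots.txt served by the site.
SAMPLE = """\
User-agent: *
Allow: /

User-agent: googlebot
Disallow:
Crawl-delay: 30

User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(SAMPLE)
url = "https://conservationevidence.com/"
for agent in ("googlebot", "gptbot", "claudebot", "someotherbot"):
    # can_fetch() applies the first matching group, falling back to "*".
    print(agent, parser.can_fetch(agent, url), parser.crawl_delay(agent))
```

An empty `Disallow` value is treated as "allow everything", so the googlebot group permits crawling while still carrying the 30-second crawl delay. Parsers can differ on repeated groups for the same user agent, as with `*` and several AI crawlers in this report. Python's parser keeps the first `*` group it encounters, so other clients may resolve the duplicates differently.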

Comments

  • As a condition of accessing this website, you agree to abide by the following
  • content signals:
  • (a) If a Content-Signal = yes, you may collect content for the corresponding
  • use.
  • (b) If a Content-Signal = no, you may not collect content for the
  • corresponding use.
  • (c) If the website operator does not include a Content-Signal for a
  • corresponding use, the website operator neither grants nor restricts
  • permission via Content-Signal with respect to the corresponding use.
  • The content signals and their meanings are:
  • search: building a search index and providing search results (e.g., returning
  • hyperlinks and short excerpts from your website's contents). Search does not
  • include providing AI-generated search summaries.
  • ai-input: inputting content into one or more AI models (e.g., retrieval
  • augmented generation, grounding, or other real-time taking of content for
  • generative AI search answers).
  • ai-train: training or fine-tuning AI models.
  • ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
  • RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
  • AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.
  • BEGIN Cloudflare Managed content
  • END Cloudflare Managed Content
  • .__________________________.
  • | .___________________. |==|
  • | | ................. | | |
  • | | ::[ Dear robot ]: | | |
  • | | ::::[ be nice ]:: | | |
  • | | ::::::::::::::::: | | |
  • | | ::::::::::::::::: | | |
  • | | ::::::::::::::::: | | |
  • | | ::::::::::::::::: | | ,|
  • | !___________________! |(c|
  • !_______________________!__!
  • / \
  • / [][][][][][][][][][][][][] \
  • / [][][][][][][][][][][][][][] \
  • ( [][][][][____________][][][][] )
  • \ ------------------------------ /
  • \______________________________/
  • Last updated: 2025-12-12 by Ibrahim Alhas.
  • --------------------------------------------------------------------
  • Cloudflare / Content Signals policy (human-readable explanation)
  • --------------------------------------------------------------------
  • As a condition of accessing this website, you agree to abide by the
  • following content signals:
  • (a) If a content-signal = yes, you may collect content for the
  • corresponding use.
  • (b) If a content-signal = no, you may not collect content for the
  • corresponding use.
  • (c) If the website operator does not include a content signal for a
  • corresponding use, the website operator neither grants nor
  • restricts permission via content signal with respect to that use.
  • The content signals and their meanings are:
  • search: building a search index and providing search results
  • (e.g., returning hyperlinks and short excerpts from the
  • website's contents). Search does not include providing
  • AI-generated search summaries.
  • ai-input: inputting content into one or more AI models (e.g.,
  • retrieval augmented generation, grounding, or other
  • real-time use of content for generative AI search answers).
  • ai-train: training or fine-tuning AI models.
  • ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS
  • RESERVATIONS OF RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION
  • DIRECTIVE 2019/790 ON COPYRIGHT AND RELATED RIGHTS IN THE DIGITAL
  • SINGLE MARKET.
  • --------------------------------------------------------------------
  • 1. Default rules – allow normal crawling & search indexing
  • --------------------------------------------------------------------
  • We allow standard search indexing, but do NOT permit use of our
  • content for AI input or AI training.
  • Explicitly restate for Google web search; AI training is controlled
  • separately via Google-Extended below.
  • Explicitly restate for Bing web search.
  • Aggressive generic crawler we wish to block entirely.
  • --------------------------------------------------------------------
  • 2. AI / LLM-specific crawlers – blocked
  • --------------------------------------------------------------------
  • These user-agents are commonly associated with AI training or AI
  • search services. We do not permit crawling or reuse of our content
  • by these bots.
  • OpenAI
  • Anthropic (Claude)
  • Perplexity
  • Google AI training (separate from standard search indexing)
  • CommonCrawl (widely used in AI training corpora)
  • Apple AI training
  • Meta / Facebook
  • ByteDance
  • Amazon
  • --------------------------------------------------------------------
  • 3. Notes
  • --------------------------------------------------------------------
  • - Standard search engines that respect robots.txt (Googlebot,
  • Bingbot, etc.) are allowed to crawl under the default rules.
  • - Content-Signal values indicate that traditional search indexing
  • is permitted, but AI input and AI training uses are not.
  • - robots.txt is an advisory mechanism: compliant crawlers will
  • respect it; hostile or disguised scrapers may ignore it and
  • must be handled via other measures (e.g. Cloudflare Bot
  • Management, WAF, rate limiting).
  • - Additional AI/LLM crawlers can be added to the blocked list
  • as the ecosystem evolves.

Warnings

  • `content-signal` is not a known field.
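
The warning reflects that `content-signal` is not among the fields this scanner recognises, so the signal values themselves are not shown in the report. A consumer that wants to read them could parse the line separately. The sketch below assumes a comma-separated `key=value` syntax and an example line inferred from the Notes above (search permitted, AI input and AI training not permitted). Both the exact line and the helper name `parse_content_signal` are assumptions, not taken from the file.

```python
from typing import Dict, Optional


def parse_content_signal(line: str) -> Optional[Dict[str, bool]]:
    """Parse one 'Content-Signal: key=yes, key=no' line into a dict.

    Returns None when the line is not a Content-Signal record. Items whose
    value is neither 'yes' nor 'no' are skipped rather than guessed.
    """
    name, sep, value = line.partition(":")
    if not sep or name.strip().lower() != "content-signal":
        return None
    signals: Dict[str, bool] = {}
    for item in value.split(","):
        key, eq, flag = item.partition("=")
        key, flag = key.strip().lower(), flag.strip().lower()
        if eq and key and flag in ("yes", "no"):
            signals[key] = flag == "yes"
    return signals


# Assumption: illustrative line consistent with the Notes in the comments.
EXAMPLE = "Content-Signal: search=yes, ai-input=no, ai-train=no"
print(parse_content_signal(EXAMPLE))
```

A signal that is absent from the resulting dict corresponds to case (c) in the policy text: permission is neither granted nor restricted for that use.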