Robots Exclusion Standard data for neil-clarke.com

Resource Scan

Scan Details

Site Domain neil-clarke.com
Base Domain neil-clarke.com
Scan Status Ok
Last Scan 2026-03-11T07:15:10+00:00
Next Scan 2026-04-10T07:15:10+00:00

Last Scan

Scanned 2026-03-11T07:15:10+00:00
URL https://neil-clarke.com/robots.txt
Domain IPs 104.21.22.240, 172.67.207.219, 2606:4700:3032::6815:16f0, 2606:4700:3037::ac43:cfdb
Response IP 172.67.207.219
Found Yes
Hash b5983976b3a00368671ba0b21c3ef2125858d2ab1cf915c0012ff9f144009cb6
SimHash 5ade1a5182b6

Groups

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-cloudvertexbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

duckassistbot

Rule Path
Disallow /

ai2bot

Rule Path
Disallow /

kangaroo bot

Rule Path
Disallow /

pangubot

Rule Path
Disallow /

cohere-training-data-crawler

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

webzio-extended

Rule Path
Disallow /

webzio

Rule Path
Disallow /

youbot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

owler@ows.eu/1

Rule Path
Disallow /

owler@ows.eu/x

Rule Path
Disallow /

owler@ows.eu/2

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

internet-measurement

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

python-requests

Rule Path
Disallow /

java

Rule Path
Disallow /

dataprovider

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

meta-externalfetcher

Rule Path
Disallow /

timpibot

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

sidetrade indexer bot

Rule Path
Disallow /

ai2bot-dolma

Rule Path
Disallow /

friendlycrawler

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

operator

Rule Path
Disallow /

onespot-scraperbot

Rule Path
Disallow /

instapaper

Rule Path
Disallow /

googleagent-mariner

Rule Path
Disallow /

cotoyogi

Rule Path
Disallow /

netestate imprint crawler

Rule Path
Disallow /

novaact

Rule Path
Disallow /

bigsur.ai

Rule Path
Disallow /

claude-user

Rule Path
Disallow /

duckassistbot

Rule Path
Disallow /

gemini-deep-research

Rule Path
Disallow /

mistralai-user

Rule Path
Disallow /

perplexity-user

Rule Path
Disallow /

qualifiedbot

Rule Path
Disallow /

turnitin

Rule Path
Disallow /

chatgpt agent

Rule Path
Disallow /

manus-user

Rule Path
Disallow /

novaact

Rule Path
Disallow /

twinagent

Rule Path
Disallow /

ai2bot-deepresearcheval

Rule Path
Disallow /

bigsur.ai

Rule Path
Disallow /

devin

Rule Path
Disallow /

google-notebooklm

Rule Path
Disallow /

klaviyoaibot

Rule Path
Disallow /

linerbot

Rule Path
Disallow /

phindbot

Rule Path
Disallow /

poggio-citations

Rule Path
Disallow /

qualifiedbot

Rule Path
Disallow /

tavilybot

Rule Path
Disallow /

chatglm-spider

Rule Path
Disallow /

datenbank crawler

Rule Path
Disallow /

googleother

Rule Path
Disallow /

icc-crawler

Rule Path
Disallow /

laion-huggingface-processor

Rule Path
Disallow /

lcc

Rule Path
Disallow /

netestate imprint crawler

Rule Path
Disallow /

sbintuitionsbot

Rule Path
Disallow /

spider

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

webzio-extended

Rule Path
Disallow /

archive-it

Rule Path
Disallow /

deepseekbot

Rule Path
Disallow /

wrtnbot

Rule Path
Disallow /
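
Checking a user agent against these groups

Each group above corresponds to a User-agent line followed by its rule in the robots.txt file. The sketch below shows one way to test a crawler name against such rules using Python's standard urllib.robotparser module. The ROBOTS_TXT string is a reconstructed fragment covering only three of the listed groups, not the file as served, and the helper name is_allowed is hypothetical. The result for an unlisted agent reflects only this fragment; the real file may contain rules not shown in this report.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a reconstructed fragment with three of the groups listed above.
# The live file at https://neil-clarke.com/robots.txt may differ.
ROBOTS_TXT = """\
User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /

User-agent: ccbot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if user_agent may fetch url under the given rules."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "ClaudeBot", "ExampleBrowser"):
        verdict = "allowed" if is_allowed(agent, "https://neil-clarke.com/") else "disallowed"
        print(f"{agent}: {verdict}")
```

The parser compares agent names case-insensitively, so the lowercase group names in the report match crawlers that identify themselves as GPTBot or ClaudeBot.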