winston.be
robots.txt

Robots Exclusion Standard data for winston.be

Resource Scan

Scan Details

Site Domain winston.be
Base Domain winston.be
Scan Status Ok
Last Scan 2025-09-11T15:37:56+00:00
Next Scan 2025-09-18T15:37:56+00:00

Last Scan

Scanned 2025-09-11T15:37:56+00:00
URL https://winston.be/robots.txt
Redirect https://www.winston.be/robots.txt
Redirect Domain www.winston.be
Redirect Base winston.be
Domain IPs 2a00:1c98:1000:10d3:0:2:5f82:d18f, 83.217.70.141
Redirect IPs 2a00:1c98:1000:10d3:0:2:5f82:d18f, 83.217.70.141
Response IP 83.217.70.141
Found Yes
Hash 839c9d0b8e5c48bcbab4454258e00c999443a7cd8b91df8ce6adaf1c4829d707
SimHash f3974250c5f2
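
A content hash like the one above is typically used to tell whether robots.txt changed between scans. The Hash value is 64 hex digits long, which matches the length of a SHA-256 digest, but the scanner's actual algorithm and any normalisation it applies to the file are not stated here. The sketch below therefore assumes plain SHA-256 over the raw response bytes; `fingerprint` and `has_changed` are hypothetical helper names.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256 over the unmodified response body. The scanner
    that produced the report may hash differently.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hex: str, body: bytes) -> bool:
    """Compare a freshly fetched body against a stored digest."""
    return fingerprint(body) != previous_hex.lower()


# Illustrative bodies only, not the real file content.
old_body = b"User-agent: *\nDisallow: /cache/\n"
new_body = b"User-agent: *\nDisallow: /cache/\nDisallow: /vendor/\n"

stored = fingerprint(old_body)
print(len(stored))                    # 64 hex digits
print(has_changed(stored, old_body))  # False
print(has_changed(stored, new_body))  # True
```

An exact hash flips on any byte difference, including whitespace. The separate SimHash field presumably serves for near-duplicate comparison, which this sketch does not attempt.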

Groups

oai-searchbot

Rule Path
Allow /

chatgpt-user
chatgpt-user/2.0

Rule Path
Allow /

gptbot

Rule Path Comment
Disallow /private/ example private folder
Allow / everything else

anthropic-ai

Product Comment
anthropic-ai bulk model training
Rule Path
Allow /

claudebot
claude-web

Product Comment
claudebot chat citation fetch
claude-web web-focused crawl
Rule Path
Allow /

perplexitybot

Product Comment
perplexitybot index builder
Rule Path
Allow /

perplexity-user

Product Comment
perplexity-user human-triggered visit
Rule Path
Allow /

google-extended

Rule Path
Allow /

bingbot

Rule Path
Allow /

amazonbot

Rule Path
Allow /

applebot
applebot-extended

Rule Path
Allow /

facebookbot
meta-externalagent

Rule Path
Allow /

linkedinbot

Rule Path
Allow /

bytespider

Rule Path
Allow /

duckassistbot

Rule Path
Allow /

cohere-ai

Rule Path
Allow /

ai2bot
ccbot
diffbot
omgili

Rule Path
Allow /

timpibot
youbot

Rule Path
Allow /

*

Rule Path
Disallow /cpresources/
Disallow /vendor/
Disallow /.env
Disallow /cache/
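
The groups above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds only the `gptbot` and `*` groups from the tables. The exact text of the original file (ordering, casing, comments) is an assumption, and `SomeOtherBot` is a hypothetical agent name used to exercise the catch-all group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan tables; the original file text is an assumption.
ROBOTS_TXT = """\
User-agent: gptbot
Disallow: /private/
Allow: /

User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
Disallow: /cache/
"""

BASE_URL = "https://www.winston.be"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Ask whether `agent` may fetch `path` on the site."""
    return parser.can_fetch(agent, BASE_URL + path)


parser = build_parser(ROBOTS_TXT)

# gptbot has its own group: /private/ is blocked, everything else is allowed.
print(is_allowed(parser, "gptbot", "/private/notes.html"))  # False
print(is_allowed(parser, "gptbot", "/blog/"))               # True

# An agent with no group of its own falls through to the "*" group.
print(is_allowed(parser, "SomeOtherBot", "/vendor/autoload.php"))  # False
print(is_allowed(parser, "SomeOtherBot", "/.env"))                 # False
print(is_allowed(parser, "SomeOtherBot", "/about"))                # True
```

Note that `urllib.robotparser` applies the first matching rule in a group, while some crawlers use longest-path matching. For the `gptbot` group both approaches give the same result, because `Disallow /private/` is listed first and is also the longer path.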

Other Records

Field Value
sitemap https://www.winston.be/sitemaps-1-sitemap.xml
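
The sitemap record can be read with the same standard-library parser. This is a minimal sketch assuming Python 3.8 or later, where `RobotFileParser.site_maps()` is available; the surrounding `User-agent` lines are a shortened stand-in for the real file.

```python
from urllib.robotparser import RobotFileParser

# Shortened stand-in for the real file; only the Sitemap line is taken
# from the scan report.
lines = [
    "User-agent: *",
    "Disallow: /cache/",
    "",
    "Sitemap: https://www.winston.be/sitemaps-1-sitemap.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(lines)

# site_maps() returns a list of URLs, or None when no Sitemap line exists.
sitemaps = sitemap_parser.site_maps() or []
print(sitemaps)  # ['https://www.winston.be/sitemaps-1-sitemap.xml']
```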

Comments

  • robots.txt for https://www.winston.be/
  • ——— OPENAI ———
  • Search (shows my webpages as links inside ChatGPT search). NOT used for model training.
  • User-driven browsing from ChatGPT and Custom GPTs. Acts after a human click.
  • Model-training crawler. Opt-out here if I don’t want content in GPT-4o or GPT-5.
  • ——— ANTHROPIC (Claude) ———
  • ——— PERPLEXITY ———
  • ——— GOOGLE (Gemini) ———
  • ——— MICROSOFT (Bing / Copilot) ———
  • ——— AMAZON ———
  • ——— APPLE ———
  • ——— META ———
  • ——— LINKEDIN ———
  • ——— BYTEDANCE ———
  • ——— DUCKDUCKGO ———
  • ——— COHERE ———
  • ——— ALLEN INSTITUTE / COMMON CRAWL / OTHER RESEARCH ———
  • ——— EMERGING SEARCH START-UPS ———
  • live - don't allow web crawlers to index cpresources/ or vendor/