til.ello.tech
robots.txt

Robots Exclusion Standard data for til.ello.tech

Resource Scan

Scan Details

Site Domain til.ello.tech
Base Domain ello.tech
Scan Status Ok
Last Scan 2025-11-14T11:39:10+00:00
Next Scan 2025-11-15T11:39:10+00:00

Last Scan

Scanned 2025-11-14T11:39:10+00:00
URL https://til.ello.tech/robots.txt
Domain IPs 65.108.241.142
Response IP 65.108.241.142
Found Yes
Hash 7bf080c83569321323082707ae947de9186e990a5c768565cb9aef0a5af8806f
SimHash 705690b08432
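
The Hash field is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. The report does not say which algorithm it uses, so the sketch below only assumes SHA-256 over the raw response body; the function name `robots_fingerprint` and the sample body are hypothetical, and the SimHash field is not covered.

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return a hex digest of a robots.txt body.

    Assumption: SHA-256 over the raw bytes. The scan report does not
    name its hash algorithm; only the digest length suggests this.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the actual file served by til.ello.tech.
sample = b"User-agent: *\nDisallow: /login\n"
digest = robots_fingerprint(sample)
print(len(digest))  # a SHA-256 hex digest always has 64 characters
```

Comparing the digest from one scan with the digest from the next is a cheap way to detect that the file changed, whatever algorithm the scanner actually uses.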

Groups

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

omgilibot
omgili
webzio-extended

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

*

Rule Path
Disallow /d/
Disallow /me/
Disallow /admin/
Disallow /login
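
The groups above can be checked with Python's standard `urllib.robotparser`. The sketch below is a partial reconstruction: the `ROBOTS_TXT` text is rebuilt from a few of the groups listed in this report, not copied from the served file, and the agent name `somebrowser` is a hypothetical stand-in for any crawler that matches only the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Assumption: rebuilt from the groups in this report; the exact text,
# ordering, and comments of the real robots.txt may differ.
ROBOTS_TXT = """\
User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /

User-agent: omgilibot
User-agent: omgili
User-agent: webzio-extended
Disallow: /

User-agent: *
Disallow: /d/
Disallow: /me/
Disallow: /admin/
Disallow: /login
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
base = "https://til.ello.tech"

# Named AI crawlers are blocked from the whole site.
print(parser.can_fetch("gptbot", base + "/"))
# Any other agent falls back to the * group: only four prefixes are blocked.
print(parser.can_fetch("somebrowser", base + "/"))
print(parser.can_fetch("somebrowser", base + "/admin/"))
```

Under this reconstruction the first and third calls report that fetching is not allowed, while the second is allowed, which mirrors the split between the per-crawler groups and the catch-all group.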

Comments

  • OpenAI’s web crawler: GPT3.5, GPT4, ChatGPT
  • https://platform.openai.com/docs/bots
  • ChatGPT plugins
  • https://platform.openai.com/docs/bots
  • OpenAI Search bot
  • https://platform.openai.com/docs/bots
  • Google's web crawler: Bard, VertexAI, Gemini
  • https://blog.google/technology/ai/an-update-on-web-publisher-controls/
  • Apple's web crawler, dedicated to GenAI projects
  • https://support.apple.com/en-us/119829
  • Claude
  • Claude Bot
  • Claude web
  • Cohere
  • Perplexity
  • Common Crawl
  • https://commoncrawl.org/ccbot
  • Omgilibot: webz.io
  • https://webz.io/blog/web-data/what-is-the-omgili-bot-and-why-is-it-crawling-your-website/
  • Facebook: Llama
  • https://developers.facebook.com/docs/sharing/bot/
  • ByteDance: Doubao