Robots Exclusion Standard data for kontserdimaja.ee

Resource Scan

Scan Details

Site Domain kontserdimaja.ee
Base Domain kontserdimaja.ee
Scan Status Ok
Last Scan 2026-01-19T08:02:20+00:00
Next Scan 2026-02-18T08:02:20+00:00
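
The two timestamps above are exactly 30 days apart. The report does not state its rescan schedule, so the fixed 30-day interval in the sketch below is an assumption inferred from these two values only.

```python
from datetime import datetime, timedelta

# Timestamps copied from the scan details above.
last_scan = datetime.fromisoformat("2026-01-19T08:02:20+00:00")
next_scan = datetime.fromisoformat("2026-02-18T08:02:20+00:00")

# Gap between the last scan and the next scheduled scan.
interval = next_scan - last_scan
print(interval.days)  # 30

# Assumption: the scanner adds a fixed 30-day interval to the last scan time.
assumed_interval = timedelta(days=30)
print(last_scan + assumed_interval == next_scan)  # True
```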

Last Scan

Scanned 2026-01-19T08:02:20+00:00
URL https://kontserdimaja.ee/robots.txt
Domain IPs 185.7.252.220
Response IP 185.7.252.220
Found Yes
Hash 691e329a5e00131aa0e01acfcec0263eab57d40cb165e7e90144d0db8f156407
SimHash 641295488534
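
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not name the algorithm or say exactly which bytes were hashed, so the sketch below assumes SHA-256 over the raw response body. The helper names `sha256_hex` and `matches_reported` are hypothetical.

```python
import hashlib

# Hash value copied from the scan details above.
REPORTED_HASH = "691e329a5e00131aa0e01acfcec0263eab57d40cb165e7e90144d0db8f156407"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a freshly computed digest with the value in the report.

    Assumption: the report hashes the unmodified response body with SHA-256.
    """
    return sha256_hex(body) == reported.lower()


# The reported value has the length of a SHA-256 hex digest.
print(len(REPORTED_HASH) == len(sha256_hex(b"")))  # True
```

To try it against the live file, pass the bytes returned by `urllib.request.urlopen("https://kontserdimaja.ee/robots.txt").read()` to `matches_reported`. A mismatch may only mean the file has changed since the scan, or that the assumption about the algorithm is wrong.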

Groups

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

youbot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

applebot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /
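
The groups above can be checked programmatically. The sketch below rebuilds a robots.txt from the listed user agents and queries it with Python's standard `urllib.robotparser`. The rebuilt file is an approximation: the report does not show the original file's layout, ordering, or whether several agents share one group. The helper name `build_robots_txt` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User agents listed in the Groups section, each with "Disallow: /".
AGENTS = [
    "ccbot", "chatgpt-user", "gptbot", "google-extended", "anthropic-ai",
    "claude-web", "cohere-ai", "omgilibot", "omgili", "perplexitybot",
    "youbot", "diffbot", "bytespider", "imagesiftbot", "amazonbot",
    "applebot", "facebookbot",
]


def build_robots_txt(agents):
    """Emit one group per agent, each disallowing the whole site."""
    lines = []
    for agent in agents:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    return "\n".join(lines)


parser = RobotFileParser()
parser.parse(build_robots_txt(AGENTS).splitlines())

site = "https://kontserdimaja.ee/"
print(parser.can_fetch("GPTBot", site))     # False: listed agent is blocked
print(parser.can_fetch("Googlebot", site))  # True: no group applies, no "*" group
```

Because the file has no `User-agent: *` group, any crawler that is not named stays allowed. Note that `urllib.robotparser` matches agent names by case-insensitive substring, which may differ in edge cases from how a given crawler interprets the same file.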

Comments

  • Common Crawl's bot - Common Crawl is one of the largest public datasets used for AI training, including for ChatGPT, Bard and other large language models.
  • ChatGPT Bot - bot used when a ChatGPT user instructs it to reference your website.
  • OpenAI API - bot that OpenAI specifically uses to collect bulk training data from your website for ChatGPT.
  • Google Bard and VertexAI. This will not have an impact on Google Search indexing. This will not affect GoogleBot crawling.
  • Anthropic AI Bot
  • Claude Bot run by Anthropic
  • Cohere AI Bot - unconfirmed bot believed to be associated with Cohere’s chatbot.
  • OMGilibot - They sell data for training LLMs (large language models)
  • Omgili (Oh My God I Love It)
  • Perplexity AI
  • KUKA's youBot
  • Diffbot - somewhat dishonest scraping bot used to collect data to train LLMs.
  • Bytespider is a web crawler operated by ByteDance, the Chinese owner of TikTok
  • ImagesiftBot is billed as a reverse image search tool, but it's associated with The Hive, a company that produces models for image generation.
  • Social Media Bots
  • Amazon Bot - enabling Alexa to answer even more questions for customers.
  • Apple Bot - collects website data for its Siri and Spotlight services.
  • Meta’s bot that crawls public web pages to improve language models for their speech recognition technology.