Robots Exclusion Standard data for boehs.org

Resource Scan

Scan Details

Site Domain boehs.org
Base Domain boehs.org
Scan Status Ok
Last Scan 2025-05-03T14:55:54+00:00
Next Scan 2025-06-02T14:55:54+00:00

Last Scan

Scanned 2025-05-03T14:55:54+00:00
URL https://boehs.org/robots.txt
Domain IPs 104.21.69.48, 172.67.204.89, 2606:4700:3037::6815:4530, 2606:4700:3037::ac43:cc59
Response IP 172.67.204.89
Found Yes
Hash 0ec27c1ebfd2a8db4f6e3f4337b3b3b083707db5d9e0f9434db3f8653107f346
SimHash ac1a7171cd7d

Groups

gptbot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

turnitin

Rule Path
Disallow /

friendlycrawler

Rule Path
Disallow /
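
The groups above each apply a single "Disallow /" rule to one user agent. As a rough sketch of how such rules behave, the snippet below rebuilds an equivalent robots.txt from the listed agent names and queries it with Python's standard urllib.robotparser. The rebuilt text is an assumption: the real file's ordering, casing, and comments may differ, so it will not reproduce the hash shown in the scan details. The helper names (build_robots_txt, make_parser) and the "ExampleBrowser" agent are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Agent names exactly as the scan report lists them (lowercased by the report).
BLOCKED_AGENTS = [
    "gptbot", "cohere-ai", "google-extended", "claudebot", "claude-web",
    "anthropic-ai", "bytespider", "ccbot", "turnitinbot", "turnitin",
    "friendlycrawler",
]


def build_robots_txt(agents):
    """Hypothetical reconstruction: one group per agent, each disallowing /."""
    lines = []
    for agent in agents:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    return "\n".join(lines)


def make_parser(text):
    """Parse robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = make_parser(build_robots_txt(BLOCKED_AGENTS))

# A listed agent is refused everywhere on the site.
print(parser.can_fetch("gptbot", "https://boehs.org/"))

# An agent with no matching group (and no "*" group) falls through to allowed.
print(parser.can_fetch("ExampleBrowser", "https://boehs.org/"))
```

Because the report shows no wildcard ("*") group, any crawler not named in a group is unrestricted by these rules. Note that robots.txt is advisory: it only affects crawlers that choose to honor it.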

Comments

  • These AI systems do not cite the content they ingest, and are hence exploitative at best
  • Boehs.org does not permit usage incl. but not limited to: for large language models (LLMs), machine learning and/or artificial intelligence-related purposes; and/or with any of the aforementioned technologies
  • Turnitin generates money by indexing stuff without permission and without benefit to me.
  • For a product against plagiarism, it sure likes profiting off IP.
  • https://boehs.org/node/is-(my)-rss-dead
  • Who knows if it will respect it