boehs.org
robots.txt

Robots Exclusion Standard data for boehs.org

Resource Scan

Scan Details

Site Domain	boehs.org
Base Domain	boehs.org
Scan Status	Ok
Last Scan	2025-05-03T14:55:54+00:00
Next Scan	2025-06-02T14:55:54+00:00

Last Scan

Scanned	2025-05-03T14:55:54+00:00
URL	https://boehs.org/robots.txt
Domain IPs	104.21.69.48, 172.67.204.89, 2606:4700:3037::6815:4530, 2606:4700:3037::ac43:cc59
Response IP	172.67.204.89
Found	Yes
Hash	0ec27c1ebfd2a8db4f6e3f4337b3b3b083707db5d9e0f9434db3f8653107f346
SimHash	ac1a7171cd7d
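
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. As a minimal sketch under that assumption, the following shows how a locally fetched copy of the file could be compared against the recorded value. The helper names `sha256_hex` and `matches_recorded_hash` are hypothetical, and it is also an assumption that the digest covers the raw response body.

```python
import hashlib

# Hash value as recorded in the scan above.
RECORDED_HASH = "0ec27c1ebfd2a8db4f6e3f4337b3b3b083707db5d9e0f9434db3f8653107f346"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Compare a body against a recorded digest.

    Assumption: the scanner hashed the raw robots.txt response body with
    SHA-256. If it used another algorithm or normalised the text first,
    this check will report a mismatch even for an unchanged file.
    """
    return sha256_hex(body) == recorded.lower()
```

A mismatch would mean either that the file changed since the scan or that the assumption about the algorithm is wrong.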

Groups

gptbot

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

turnitin

Rule	Path
Disallow	/

friendlycrawler

Rule	Path
Disallow	/
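
Every group listed above carries a single `Disallow: /` rule, so each named crawler is asked to stay out of the whole site while crawlers that are not named remain unrestricted. As a rough sketch of how those groups behave, the following rebuilds an equivalent robots.txt from the group names and queries it with Python's standard `urllib.robotparser`. The one-group-per-agent layout and the helper names `build_robots_txt` and `is_allowed` are assumptions for illustration; the original file may group or capitalise the user agents differently.

```python
from urllib.robotparser import RobotFileParser

# Group names exactly as the scan reports them (lowercased by the scanner).
BLOCKED_AGENTS = [
    "gptbot", "cohere-ai", "google-extended", "claudebot", "claude-web",
    "anthropic-ai", "bytespider", "ccbot", "turnitinbot", "turnitin",
    "friendlycrawler",
]


def build_robots_txt(agents):
    """Rebuild an equivalent robots.txt: one 'Disallow: /' group per agent."""
    lines = []
    for agent in agents:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    return "\n".join(lines)


def is_allowed(robots_txt, agent, url):
    """Return True if the given agent may fetch the URL under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    rules = build_robots_txt(BLOCKED_AGENTS)
    for agent in ["gptbot", "ccbot", "googlebot"]:
        print(agent, is_allowed(rules, agent, "https://boehs.org/"))
```

Here `googlebot` stands in for any crawler without a matching group. Because the file has no `*` group, such crawlers are allowed by default.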

Comments

These AI systems do not cite the content they ingest, and are hence exploitative at best
Boehs.org does not permit usage incl. but not limited to: for large language
models (LLMs), machine learning and/or artificial intelligence-related
purposes; and/or with any of the aforementioned technologies
Turnitin generates money by indexing stuff without permission and without benefit to me.
For a product against plagiarism, it sure likes profiting off IP.
https://boehs.org/node/is-(my)-rss-dead
Who knows if it will respect it
