/.well-known/

lobste.rs
robots.txt

Robots Exclusion Standard data for lobste.rs

Resource Scan

Scan Details

Site Domain	lobste.rs
Base Domain	lobste.rs
Scan Status	Ok
Last Scan	2024-11-13T15:12:58+00:00
Next Scan	2024-11-27T15:12:58+00:00

Last Scan

Scanned	2024-11-13T15:12:58+00:00
URL	https://lobste.rs/robots.txt
Domain IPs	2604:a880:400:d0::2082:1001, 67.205.189.7
Response IP	67.205.189.7
Found	Yes
Hash	c6666d5ba88da5c2af817ac740d9739bcd538f8b480282051eed42ca35bad5db
SimHash	38929141e242

Groups

amazonbot
applebot
applebot-extended
anthropic-ai
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
facebookbot
google-extended
gptbot
oai-searchbot
omgili
omgilibot
perplexitybot
timpibot
youbot

Rule	Path
Disallow	/

ahrefsbot
blexbot
clickagy
semrushbot
semrushbot-ba
semrushbot-coub
semrushbot-ct
semrushbot-si
semrushbot-swa
siteauditbot
splitsignalbot

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/search
Disallow	/page/
Disallow	/comments/page/

Other Records

Field	Value
crawl-delay	1
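
As a rough way to check how the groups and rules listed above apply to a given crawler, the sketch below feeds an abbreviated reconstruction of the file to Python's standard urllib.robotparser. The reconstruction is an assumption: it names only a few of the user agents from each group, and placing Crawl-delay inside the * group is a guess, since the scan reports it separately under Other Records. ExampleBrowser is a made-up user agent used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated, hand-written reconstruction of the scanned file (assumption:
# the real file lists every agent shown in the Groups section, and the
# position of Crawl-delay is not known from the scan).
RECONSTRUCTED = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /

User-agent: AhrefsBot
User-agent: SemrushBot
Disallow: /

User-agent: *
Disallow: /search
Disallow: /page/
Disallow: /comments/page/
Crawl-delay: 1
"""

parser = RobotFileParser()
parser.parse(RECONSTRUCTED.splitlines())

# "ExampleBrowser" is a hypothetical agent that matches only the * group.
checks = [
    ("GPTBot", "https://lobste.rs/"),
    ("ExampleBrowser", "https://lobste.rs/s/ybowdq/great_gpt_firewall"),
    ("ExampleBrowser", "https://lobste.rs/search?q=robots"),
    ("ExampleBrowser", "https://lobste.rs/page/2"),
]
for agent, url in checks:
    verdict = "allowed" if parser.can_fetch(agent, url) else "disallowed"
    print(f"{agent:15} {url} -> {verdict}")

print("crawl delay for ExampleBrowser:", parser.crawl_delay("ExampleBrowser"))
```

Under these assumptions, GPTBot is refused everywhere, while an unlisted agent is refused only under /search, /page/, and /comments/page/.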

Comments

https://lobste.rs/s/ybowdq/great_gpt_firewall
SEO/spam tools
Google refuses to support crawl-delay so when this was at the top they
combined it with the following (anti-LLM slop) rules and blocked the site
https://developers.google.com/search/blog/2019/07/a-note-on-unsupported-rules-in-robotstxt
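
The grouping problem described in the comment above can be illustrated with a simplified group parser. This is a sketch, not Google's actual parser: parse_groups is a hypothetical helper, and OLD_LAYOUT is a guess at what the earlier file looked like, inferred only from the comment. The idea is that consecutive User-agent lines share one group, so a parser that skips an unsupported Crawl-delay line sees the * line and the following bot line as adjacent.

```python
def parse_groups(text, supported=("allow", "disallow")):
    """Split robots.txt text into groups of user agents and rules.

    Fields not named in `supported` are skipped entirely, so they do not
    end a run of User-agent lines.
    """
    groups = []
    current = None
    expecting_agents = False  # True while reading consecutive User-agent lines
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if not expecting_agents:
                current = {"agents": [], "rules": []}
                groups.append(current)
                expecting_agents = True
            current["agents"].append(value.lower())
        elif field in supported and current is not None:
            current["rules"].append((field, value))
            expecting_agents = False
    return groups


# Hypothetical earlier layout, with the crawl delay at the top of the file.
OLD_LAYOUT = """\
User-agent: *
Crawl-delay: 1

User-agent: GPTBot
Disallow: /
"""

# A parser that ignores Crawl-delay merges "*" into the GPTBot group.
ignoring = parse_groups(OLD_LAYOUT)
# A parser that recognises Crawl-delay keeps the two groups apart.
honoring = parse_groups(OLD_LAYOUT, supported=("allow", "disallow", "crawl-delay"))

print(ignoring)
print(honoring)
```

In the first result the * agent ends up in the same group as GPTBot and inherits Disallow: /, which matches the blocking the comment describes.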
