Robots Exclusion Standard data for lobste.rs

Resource Scan

Scan Details

Site Domain lobste.rs
Base Domain lobste.rs
Scan Status Ok
Last Scan 2024-11-13T15:12:58+00:00
Next Scan 2024-11-27T15:12:58+00:00

Last Scan

Scanned 2024-11-13T15:12:58+00:00
URL https://lobste.rs/robots.txt
Domain IPs 2604:a880:400:d0::2082:1001, 67.205.189.7
Response IP 67.205.189.7
Found Yes
Hash c6666d5ba88da5c2af817ac740d9739bcd538f8b480282051eed42ca35bad5db
SimHash 38929141e242

Groups

amazonbot
applebot
applebot-extended
anthropic-ai
bytespider
ccbot
chatgpt-user
claudebot
claude-web
cohere-ai
facebookbot
google-extended
gptbot
oai-searchbot
omgili
omgilibot
perplexitybot
timpibot
youbot

Rule Path
Disallow /

ahrefsbot
blexbot
clickagy
semrushbot
semrushbot-ba
semrushbot-coub
semrushbot-ct
semrushbot-si
semrushbot-swa
siteauditbot
splitsignalbot

Rule Path
Disallow /

*

Rule Path
Disallow /search
Disallow /page/
Disallow /comments/page/
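
The three groups above can be exercised with a small sketch using Python's standard urllib.robotparser. The robots.txt text below is a reconstruction from the groups listed in this scan, not the verbatim file, and it lists only a few agents per group for brevity. "ExampleCrawler" is a hypothetical agent name chosen so that it falls through to the * group.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's groups (assumption: not the verbatim file).
# Only a few of the listed agents are included to keep the sketch short.
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /

User-agent: AhrefsBot
User-agent: SemrushBot
Disallow: /

User-agent: *
Disallow: /search
Disallow: /page/
Disallow: /comments/page/
"""

BASE = "https://lobste.rs"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# (agent, path) pairs; "ExampleCrawler" is hypothetical and matches only "*".
checks = [
    ("GPTBot", "/"),
    ("AhrefsBot", "/s/ybowdq/great_gpt_firewall"),
    ("ExampleCrawler", "/s/ybowdq/great_gpt_firewall"),
    ("ExampleCrawler", "/search"),
    ("ExampleCrawler", "/page/2"),
    ("ExampleCrawler", "/comments/page/2"),
]

for agent, path in checks:
    allowed = parser.can_fetch(agent, BASE + path)
    print(f"{agent:15} {path:35} {'allowed' if allowed else 'blocked'}")
```

Agents named in the first two groups are blocked from every path, while any other agent is blocked only from paths that begin with /search, /page/ or /comments/page/. Note that urllib.robotparser matches paths by simple prefix, so other parsers may differ in edge cases.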

Other Records

Field Value
crawl-delay 1
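
A crawler that chooses to honor the crawl-delay record could read it as in the sketch below. The scan lists the record separately and does not say which group it belongs to, so the snippet assumes, purely for illustration, that it sits inside the * group. "ExampleCrawler" and the helper delay_for are hypothetical names. Support for this field varies by crawler.

```python
from urllib.robotparser import RobotFileParser

# Assumption: Crawl-delay placed in the "*" group; the scan does not confirm this.
ROBOTS_WITH_DELAY = """\
User-agent: *
Disallow: /search
Crawl-delay: 1
"""

rp = RobotFileParser()
rp.parse(ROBOTS_WITH_DELAY.splitlines())


def delay_for(agent: str, default: float = 0.0) -> float:
    """Return the crawl delay in seconds for an agent, or a default if none is set."""
    delay = rp.crawl_delay(agent)
    return float(delay) if delay is not None else default


# "ExampleCrawler" is a hypothetical agent that matches only the "*" group.
print(delay_for("ExampleCrawler"))
```

A polite fetch loop would then sleep for delay_for(agent) seconds between requests, for example with time.sleep.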

Comments

  • https://lobste.rs/s/ybowdq/great_gpt_firewall
  • SEO/spam tools
  • Google refuses to support crawl-delay so when this was at the top they combined it with the following (anti-LLM slop) rules and blocked the site
  • https://developers.google.com/search/blog/2019/07/a-note-on-unsupported-rules-in-robotstxt