sr.ht
robots.txt

Robots Exclusion Standard data for sr.ht

Resource Scan

Scan Details

Site Domain sr.ht
Base Domain sr.ht
Scan Status Ok
Last Scan 2024-09-18T14:18:33+00:00
Next Scan 2024-10-18T14:18:33+00:00

Last Scan

Scanned 2024-09-18T14:18:33+00:00
URL https://sr.ht/robots.txt
Domain IPs 2a03:6000:1813:1337::159, 46.23.81.159
Response IP 46.23.81.159
Found Yes
Hash 24bc8d8ce4e0b36940736341c38d04b5de8c0726cb0781ef213800db60ae4534
SimHash 3619c651cfd9

Groups

*

Rule Path
Disallow /*?*
Disallow /*.tar.gz$
Disallow /metrics
Disallow /*/*/blame/*
Disallow /*/*/log/*
Disallow /*/*/tree/*
Disallow /*/*/item/*
Disallow /*/*/mbox
Disallow /*/*/*/raw
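
The wildcard rules above can be read as follows: `*` stands for any run of characters and a trailing `$` pins the pattern to the end of the URL path. Every other pattern is a prefix match. The sketch below is one possible way to evaluate them in Python; the helper names (`pattern_to_regex`, `is_disallowed`) and any sample paths passed to them are illustrative assumptions, not part of sr.ht or of the scan tool.

```python
import re

# Disallow patterns copied from the "*" group listed above.
DISALLOW = [
    "/*?*",
    "/*.tar.gz$",
    "/metrics",
    "/*/*/blame/*",
    "/*/*/log/*",
    "/*/*/tree/*",
    "/*/*/item/*",
    "/*/*/mbox",
    "/*/*/*/raw",
]


def pattern_to_regex(pattern):
    """Translate one robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any sequence of characters, a trailing
    '$' anchors the match at the end of the path, and anything else is a
    literal prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_disallowed(path):
    """Return True if the path (including any query string) matches a rule."""
    # re.match anchors at the start of the string, giving prefix semantics.
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)
```

Under these assumptions, a hypothetical path such as `/~user/repo/tree/master` would be disallowed by `/*/*/tree/*`, while a bare `/~user/repo` would remain crawlable. Production crawlers usually also handle `Allow` rules and longest-match precedence, which this group does not need.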

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

aspiegelbot

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

turnitin

Rule Path
Disallow /

seekport crawler

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /
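
The groups above mean that a crawler first looks for a group naming its own user agent and only falls back to the `*` group when none matches. A minimal Python sketch of that selection step follows, assuming case-insensitive exact matching on the crawler's product token. The names `BLOCKED_AGENTS`, `select_group`, and `is_fully_blocked` are hypothetical, and real crawlers may match their tokens more loosely.

```python
# User-agent groups from the listing above that carry a single "Disallow /".
BLOCKED_AGENTS = {
    "semrushbot", "semrushbot-sa", "ahrefsbot", "dotbot", "rogerbot",
    "blexbot", "aspiegelbot", "zoominfobot", "yandex", "mj12bot",
    "dataforseobot", "amazonbot", "turnitinbot", "turnitin",
    "seekport crawler", "gptbot", "claudebot", "google-extended",
    "serpstatbot", "barkrowler", "bytespider", "meta-externalagent",
}


def select_group(product_token):
    """Return the name of the group that applies to this crawler.

    Assumption: the token is compared case-insensitively and must match a
    group name exactly; otherwise the catch-all "*" group applies.
    """
    token = product_token.strip().lower()
    return token if token in BLOCKED_AGENTS else "*"


def is_fully_blocked(product_token):
    """True if the crawler's own group disallows the whole site."""
    return select_group(product_token) != "*"
```

With this sketch, a crawler identifying itself as `GPTBot` would select the `gptbot` group and be blocked site-wide, whereas a crawler with no named group would be governed only by the path rules in the `*` group.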

Comments

  • Our policy
  • Allowed:
  • - Search engine indexers
  • - Archival services (e.g. IA)
  • Disallowed:
  • - Marketing or SEO crawlers
  • - Bots which are too aggressive by default. This is subjective, if you annoy
  • our sysadmins you'll be blocked.
  • Reach out to sir@cmpwn.com if you have questions.
  • It doesn't make sense to index these and/or it's expensive:
  • Too aggressive, marketing/SEO
  • Too aggressive, marketing/SEO
  • Marketing/SEO
  • Marketing/SEO
  • Marketing/SEO
  • Huawei something or another, badly behaved
  • Marketing/SEO
  • YandexBot is a dickhead, too aggressive
  • Marketing/SEO
  • Marketing/SEO
  • Used for Alexa, I guess, who cares
  • No
  • Does not respect * directives
  • No thanks
  • Fairly certain that this is an LLM data vacuum
  • Same
  • Marketing
  • Marketing/SEO
  • Very aggressive, used for TikTok or something
  • Facebook