kytta.dev
robots.txt

Robots Exclusion Standard data for kytta.dev

Resource Scan

Scan Details

Site Domain kytta.dev
Base Domain kytta.dev
Scan Status Ok
Last Scan 2025-11-29T20:29:15+00:00
Next Scan 2025-12-06T20:29:15+00:00

Last Scan

Scanned 2025-11-29T20:29:15+00:00
URL https://kytta.dev/robots.txt
Redirect https://www.kytta.dev/robots.txt
Redirect Domain www.kytta.dev
Redirect Base kytta.dev
Domain IPs 2a01:4f9:c01f:8002::, 95.217.26.94
Redirect IPs 2a01:4f9:c01f:8002::, 95.217.26.94
Response IP 95.217.26.94
Found Yes
Hash 35ca64ee9f4a87f6b54ef6f3a0c2f9eac55c2b0352ccd3eb10a737ff553eac53
SimHash ba1ad800c142
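
The Hash value above is 64 hexadecimal digits long, which matches the length of a SHA-256 digest, but the report does not name the algorithm. The sketch below assumes SHA-256 over the raw response body; the function name `content_hash` is hypothetical, and the result is not guaranteed to reproduce the value shown here.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Two byte-identical bodies produce the same digest, so comparing
# digests between scans is a cheap way to detect that a file changed.
unchanged = content_hash(b"User-agent: *\n") == content_hash(b"User-agent: *\n")
changed = content_hash(b"User-agent: *\n") != content_hash(b"User-agent: *\nDisallow: /donate\n")
```

A SimHash, by contrast, is designed so that similar inputs give similar values; it is not sketched here because the report gives no details about how it is computed.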

Groups

*

Rule Path
Disallow /blog/why-is-algeria-dz-backstory
Disallow /donate

peer39_crawler/1.0

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

academicbotrtu

Rule Path
Disallow /

slysearch

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

checkmarknetwork/1.0 (+https://www.checkmarknetwork.com/spider.html)

Rule Path
Disallow /

brandverity/1.0

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

facebookbot
meta-externalagent

Rule Path
Disallow /

cotoyogi

Rule Path
Disallow /

webzio-extended

Rule Path
Disallow /

kangaroo bot

Rule Path
Disallow /

genai

Rule Path
Disallow /

semrushbot-ocob
semrushbot-ft

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.kytta.dev/sitemap.xml
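
To illustrate how the groups listed above are evaluated, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a hypothetical reconstruction of a few groups from this report (the original file's exact layout and casing are not shown), and the `allowed` helper and the `SomeBrowser` agent name are illustrative only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of four records from the report;
# the real file contains many more groups.
ROBOTS_TXT = """\
User-agent: *
Disallow: /blog/why-is-algeria-dz-backstory
Disallow: /donate

User-agent: MJ12bot
Crawl-delay: 10

User-agent: GPTBot
Disallow: /

Sitemap: https://www.kytta.dev/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Check whether `agent` may fetch `path` on the redirect domain."""
    return parser.can_fetch(agent, "https://www.kytta.dev" + path)


# An agent with no group of its own falls back to the "*" group.
generic_about = allowed("SomeBrowser", "/about")
generic_donate = allowed("SomeBrowser", "/donate")

# A named group replaces the "*" group for that agent entirely,
# so mj12bot (no rules, only a crawl-delay) may fetch every path.
mj12_donate = allowed("MJ12bot", "/donate")
mj12_delay = parser.crawl_delay("MJ12bot")

# gptbot is disallowed from the whole site.
gpt_root = allowed("GPTBot", "/")

# Sitemap records are global and not tied to any group (Python 3.8+).
sitemaps = parser.site_maps()
```

Note that `crawl-delay` is not part of RFC 9309, so crawlers are free to ignore it; `urllib.robotparser` merely exposes the value.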

Comments

  • Borrowed verbatim from Seirdy: <https://seirdy.one/robots.txt>
  • -----STARTING HERE-----
  • IP-violation scanners
  • Misc. icky stuff
  • Well-known overly-aggressive bot that claims to respect robots.txt: http://mj12bot.com/
  • Gen-AI data scrapers
  • -----ENDING HERE-----