kytta.dev
robots.txt

Robots Exclusion Standard data for kytta.dev

Resource Scan

Scan Details

Site Domain	kytta.dev
Base Domain	kytta.dev
Scan Status	Ok
Last Scan	2025-11-29T20:29:15+00:00
Next Scan	2025-12-06T20:29:15+00:00

Last Scan

Scanned	2025-11-29T20:29:15+00:00
URL	https://kytta.dev/robots.txt
Redirect	https://www.kytta.dev/robots.txt
Redirect Domain	www.kytta.dev
Redirect Base	kytta.dev
Domain IPs	2a01:4f9:c01f:8002::, 95.217.26.94
Redirect IPs	2a01:4f9:c01f:8002::, 95.217.26.94
Response IP	95.217.26.94
Found	Yes
Hash	35ca64ee9f4a87f6b54ef6f3a0c2f9eac55c2b0352ccd3eb10a737ff553eac53
SimHash	ba1ad800c142

Groups

*

Rule	Path
Disallow	/blog/why-is-algeria-dz-backstory
Disallow	/donate

peer39_crawler/1.0

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

academicbotrtu

Rule	Path
Disallow	/

slysearch

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

checkmarknetwork/1.0 (+https://www.checkmarknetwork.com/spider.html)

Rule	Path
Disallow	/

brandverity/1.0

Rule	Path
Disallow	/

piplbot

Rule	Path
Disallow	/

mj12bot

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	10

gptbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

applebot-extended

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

facebookbot
meta-externalagent

Rule	Path
Disallow	/

cotoyogi

Rule	Path
Disallow	/

webzio-extended

Rule	Path
Disallow	/

kangaroo bot

Rule	Path
Disallow	/

genai

Rule	Path
Disallow	/

semrushbot-ocob
semrushbot-ft

Rule	Path
Disallow	/

velenpublicwebcrawler

Rule	Path
Disallow	/

Other Records

Field	Value
sitemap	https://www.kytta.dev/sitemap.xml

Comments

Borrowed verbatim from Seirdy: <https://seirdy.one/robots.txt>
-----STARTING HERE-----
IP-violation scanners
Misc. icky stuff
Well-known overly-aggressive bot that claims to respect robots.txt: http://mj12bot.com/
Gen-AI data scrapers
-----ENDING HERE-----
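
As a rough illustration of how a crawler might evaluate the groups listed in this report, the sketch below feeds a hand-reconstructed fragment of the rules into Python's standard urllib.robotparser. The fragment is an assumption pieced together from the tables in this report (it is not the verbatim file and covers only three groups), the helper name `allowed` is hypothetical, and real crawlers may match user agents and paths differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a partial reconstruction from the report's tables,
# not the verbatim robots.txt served by the site.
ROBOTS_FRAGMENT = """\
User-agent: *
Disallow: /blog/why-is-algeria-dz-backstory
Disallow: /donate

User-agent: mj12bot
Crawl-delay: 10

User-agent: gptbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` fetch `path` under the fragment above?"""
    return parser.can_fetch(agent, "https://www.kytta.dev" + path)


if __name__ == "__main__":
    for agent, path in [
        ("examplebot", "/"),
        ("examplebot", "/donate"),
        ("gptbot", "/"),
        ("mj12bot", "/donate"),
    ]:
        print(agent, path, allowed(agent, path))
    print("mj12bot crawl-delay:", parser.crawl_delay("mj12bot"))
```

Under this parser, an agent that has its own group (such as mj12bot, which only carries a crawl-delay) is not subject to the `*` group, which is consistent with the report's note that all paths are allowed for it.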
