superkuh.com
robots.txt

Robots Exclusion Standard data for superkuh.com

Resource Scan

Scan Details

Site Domain superkuh.com
Base Domain superkuh.com
Scan Status Ok
Last Scan 2024-10-17T14:07:37+00:00
Next Scan 2024-11-16T14:07:37+00:00

Last Scan

Scanned 2024-10-17T14:07:37+00:00
URL http://superkuh.com/robots.txt
Domain IPs 73.5.160.29
Response IP 73.5.160.29
Found Yes
Hash 488db1c4b006504820d1150b1f3530a6b835d025b4c46b1764d7e9cb03c75f81
SimHash 480e89c753a4
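
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the scan data does not name the algorithm. A minimal sketch under that assumption (the function name `content_hash` and the sample body are hypothetical, not the real file):

```python
import hashlib

def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    Assumption: the scanner's "Hash" field is SHA-256; the report does not say so.
    """
    return hashlib.sha256(body).hexdigest()

# Hypothetical sample body, not the actual superkuh.com file.
sample = b"User-agent: *\nDisallow: /radio/\n"
print(len(content_hash(sample)))  # 64 hex characters
```

A digest like this changes whenever any byte of the file changes, so comparing it with the previously stored value is a cheap way to detect edits between scans.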

Groups

*

Rule Path
Allow /
Allow /spaceweather/

*

Rule Path
Disallow /radio/

mj12bot

Rule Path
Disallow /ajax/

cloudflare-amp

Rule Path
Disallow /

pinterest

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

idmarch

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

academicbotrtu

Rule Path
Disallow /

npbot

Rule Path
Disallow /

slysearch

Rule Path
Disallow /

ias_crawler

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

*

Rule Path
Disallow /library/

*

Rule Path
Disallow /library/*

*

Rule Path
Disallow /library/*.pdf$

*

Rule Path
Disallow /library/*.djvu$

*

Rule Path
Disallow /library/*.txt$

*

Rule Path
Disallow /library/*.epub$

*

Rule Path
Disallow /library/*.lit$

*

Rule Path
Disallow /library/*.chm$

zombies

Rule Path
Disallow /brains

killer ai

Rule Path
Disallow /~superkuh/

the person who is reacting to the lame jokes that is reading this robots.txt right now. yes, you.

Rule Path
Allow /hello/awarenessofthemeta/

unscrupulous algorithms

Rule Path
Disallow /visitingthisurlisasignalthatyouarealgorithmicseriouslypleasedontvisitthisturlactualpeople/
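
Several of the `*` groups above use the `*` wildcard and the `$` end anchor. The sketch below shows one way such rules could be evaluated, assuming the matching semantics of RFC 9309: all `*` groups are merged, the longest matching pattern wins, and Allow wins a tie. The names `STAR_RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, only a subset of the listed rules is included, and real crawlers may behave differently.

```python
import re

# Subset of the rules listed under the "*" groups above.
STAR_RULES = [
    ("allow", "/"),
    ("allow", "/spaceweather/"),
    ("disallow", "/radio/"),
    ("disallow", "/library/"),
    ("disallow", "/library/*"),
    ("disallow", "/library/*.pdf$"),
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_allowed(path, rules=STAR_RULES):
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

for path in ["/", "/spaceweather/index.html", "/radio/foo", "/library/book.pdf"]:
    print(path, is_allowed(path))
```

Under these assumptions the more specific `/library/` and `/radio/` rules override the blanket `Allow /`, while `/spaceweather/` stays reachable.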

Comments

  • Bots tend to get bogged down and try to download the 3 TB and tens of millions of png images
  • http headers for unicode art, dns TXT records for inspirational messages