crunchytech.net
robots.txt

Robots Exclusion Standard data for crunchytech.net

Resource Scan

Scan Details

Site Domain crunchytech.net
Base Domain crunchytech.net
Scan Status Ok
Last Scan 2025-10-15T23:23:15+00:00
Next Scan 2025-10-22T23:23:15+00:00

Last Scan

Scanned 2025-10-15T23:23:15+00:00
URL https://crunchytech.net/robots.txt
Domain IPs 104.21.11.53, 172.67.165.45, 2606:4700:3031::6815:b35, 2606:4700:3033::ac43:a52d
Response IP 172.67.165.45
Found Yes
Hash 9d7db3bbcec48b2a64f82d9659978f36e66dc9c9e1482aab43e8cde64af9c4b6
SimHash bb88ca00a1f2
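
The Hash field above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. Assuming SHA-256 over the raw response body, a change check between two scans could be sketched as follows. The function names `sha256_hex` and `has_changed` are hypothetical and not part of any scanner API.

```python
import hashlib


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of a robots.txt body as lowercase hex.

    Assumption: the report's Hash field is SHA-256 over the raw bytes.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return sha256_hex(body) != previous_hash.lower()
```

An exact hash like this flips on any byte-level change, which is presumably why the report also lists a SimHash for approximate comparison.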

Groups

*

Rule      Path                       Comment
Disallow  /wp-admin/                 -
Disallow  /wp-login.php              -
Disallow  /cgi-bin/                  -
Disallow  /?s=                       Blocks search results pages
Disallow  /search/                   Alternate search URL pattern
Disallow  /?attachment_id=           -
Disallow  /*?replytocom=             -
Disallow  /feed/                     -
Disallow  /comments/                 -
Allow     /wp-admin/admin-ajax.php   -
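
The rules in this group interact: the Allow for /wp-admin/admin-ajax.php only takes effect if the crawler prefers the most specific (longest) matching rule, and /*?replytocom= relies on wildcard support. The sketch below is a minimal illustration of longest-match evaluation in the style of RFC 9309, not the logic of any particular crawler. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical.

```python
import re

# Rules from the "*" group, in the order listed in the report.
RULES = [
    ("disallow", "/wp-admin/"),
    ("disallow", "/wp-login.php"),
    ("disallow", "/cgi-bin/"),
    ("disallow", "/?s="),
    ("disallow", "/search/"),
    ("disallow", "/?attachment_id="),
    ("disallow", "/*?replytocom="),
    ("disallow", "/feed/"),
    ("disallow", "/comments/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a prefix-matching regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Return True if the path (including any query string) may be crawled.

    The longest matching pattern wins; on a tie, Allow wins.
    A path that matches no rule is allowed.
    """
    best_len, best_allow = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow
```

Under these assumptions, `is_allowed("/wp-admin/admin-ajax.php")` is True because the Allow pattern is longer than /wp-admin/, while `is_allowed("/wp-admin/options.php")` and `is_allowed("/some-post/?replytocom=42")` are both False. Parsers that apply the first matching rule instead would block admin-ajax.php, since the Disallow for /wp-admin/ comes first.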

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /
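
Each of the named groups above carries a single Disallow / rule, so a crawler that identifies as one of those agents is excluded from the whole site, while any other agent falls back to the * group. The sketch below checks this with Python's standard `urllib.robotparser`. The robots.txt text is rebuilt from the report rather than copied from the live file, the * group is abbreviated to one rule, and `build_robots_txt` and `ExampleBot` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Agent tokens of the groups listed in the report.
AI_AGENTS = [
    "gptbot", "google-extended", "ccbot", "claudebot",
    "meta-externalagent", "applebot-extended", "amazonbot", "bytespider",
]


def build_robots_txt(agents):
    """Rebuild an abbreviated robots.txt: one '*' rule plus a full block per agent."""
    lines = ["User-agent: *", "Disallow: /wp-admin/", ""]
    for agent in agents:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    return "\n".join(lines)


parser = RobotFileParser()
parser.parse(build_robots_txt(AI_AGENTS).splitlines())

# Named agents are blocked everywhere; other agents use the '*' group.
blocked = {agent: not parser.can_fetch(agent, "/any-article/") for agent in AI_AGENTS}
generic_ok = parser.can_fetch("ExampleBot", "/any-article/")
```

Here every entry in `blocked` is True and `generic_ok` is True. The standard-library parser matches agent names case-insensitively, so a user agent string such as `GPTBot` selects the gptbot group.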

Other Records

Field Value
sitemap https://crunchytech.net/sitemap_index.xml

Comments

  • CrunchyTech.net robots.txt
  • Allow good crawlers (Google, Bing, etc.) but block AI training bots & junk pages
  • Block AI training crawlers
  • Sitemap location (Rank Math generates these automatically)