alazycowboy.com
robots.txt

Robots Exclusion Standard data for alazycowboy.com

Resource Scan

Scan Details

Site Domain alazycowboy.com
Base Domain alazycowboy.com
Scan Status Ok
Last Scan 2025-05-27T08:56:53+00:00
Next Scan 2025-06-26T08:56:53+00:00

Last Scan

Scanned 2025-05-27T08:56:53+00:00
URL https://alazycowboy.com/robots.txt
Domain IPs 104.21.42.148, 172.67.163.8, 2606:4700:3031::6815:2a94, 2606:4700:3036::ac43:a308
Response IP 172.67.163.8
Found Yes
Hash f3a7ebdca677abf8814c66fe8c642046b7fd8a8c68523df36d10de536a58450a
SimHash c23391f8c7d3

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /wp-content/plugins/

googlebot

Rule Path
Allow /wp-content/plugins/wptouch/themes/

googlebot

Rule Path
Allow /wp-content/plugins/wptouch/resources/

marcopolo
nutch
zao
semanticdiscovery
pubcrawl
turnitinbot
npbot
psbot
baiduspider
larbin
ia_archiver
nationaldirectory
lnspiderguy
teleport
miixpc
asterias
lwp-trivial
linkwalker
cosmos
msiecrawler
pompos
generic
websearchbench
almaden
k2spider
curl
wget
ahrefsbot

Rule Path
Disallow /
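
The groups above can be checked against sample requests with Python's standard urllib.robotparser module. This is a minimal sketch, not the scanner's own logic. The robots.txt text is rebuilt from the groups as listed here, so the line order and layout of the real file are assumptions. The agent name ExampleBot and the helper build_parser are hypothetical, introduced only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Agents that share the final "Disallow /" group in the scan above.
BLOCKED = [
    "marcopolo", "nutch", "zao", "semanticdiscovery", "pubcrawl",
    "turnitinbot", "npbot", "psbot", "baiduspider", "larbin",
    "ia_archiver", "nationaldirectory", "lnspiderguy", "teleport",
    "miixpc", "asterias", "lwp-trivial", "linkwalker", "cosmos",
    "msiecrawler", "pompos", "generic", "websearchbench", "almaden",
    "k2spider", "curl", "wget", "ahrefsbot",
]

# Each group is (user agents, rules), in the order the scan lists them.
GROUPS = [
    (["*"], [("Disallow", "/cgi-bin/"),
             ("Disallow", "/wp-content/plugins/")]),
    (["googlebot"], [("Allow", "/wp-content/plugins/wptouch/themes/")]),
    (["googlebot"], [("Allow", "/wp-content/plugins/wptouch/resources/")]),
    (BLOCKED, [("Disallow", "/")]),
]


def build_parser(groups):
    """Rebuild robots.txt lines from the groups and parse them (hypothetical helper)."""
    lines = []
    for agents, rules in groups:
        lines += [f"User-agent: {agent}" for agent in agents]
        lines += [f"{kind}: {path}" for kind, path in rules]
        lines.append("")  # a blank line closes the group
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(GROUPS)

# Print the decision for a few sample agent/path pairs.
for agent, path in [
    ("ExampleBot", "/cgi-bin/test"),
    ("ExampleBot", "/about/"),
    ("Googlebot", "/wp-content/plugins/wptouch/themes/x.css"),
    ("AhrefsBot", "/"),
    ("curl/8.5.0", "/about/"),
]:
    print(agent, path, parser.can_fetch(agent, path))
```

Note that urllib.robotparser uses the first group whose agent name matches and falls back to the * group only when no named group matches, so under this parser a crawler matching googlebot is not subject to the /cgi-bin/ and /wp-content/plugins/ rules. Real crawlers may merge or prioritize groups differently.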

Comments

  • http://www.webmasterworld.com/robots.txt has a long list of active robots you might want to block.
  • Some of these (and many others) ignore robots.txt, and are forcibly blocked in .htaccess.
  • User-agent: sitecheck.internetseer.com