
Robots Exclusion Standard data for theguardsman.com

Resource Scan

Scan Details

Site Domain theguardsman.com
Base Domain theguardsman.com
Scan Status Ok
Last Scan 2025-10-23T22:11:06+00:00
Next Scan 2025-11-06T22:11:06+00:00

Last Scan

Scanned 2025-10-23T22:11:06+00:00
URL https://theguardsman.com/robots.txt
Domain IPs 107.180.51.80
Response IP 107.180.51.80
Found Yes
Hash 2e29c279714ad5685cbf40e4e98423c9eeee5be700cf66675c4b0ac5484bbac9
SimHash 2b50d7b04711

Groups

*

Rule Path
Disallow /wp-includes
Disallow /wp-content/gallery
Disallow /wp-content/plugins
Disallow /wp-content/themes
Disallow /wp-content/wflogs
Disallow /wp-content/xhc-xmt
Disallow /wp-content/upgrade
Disallow /wp-content/aiowps_backups
Disallow /wp-content/aiowps_backups
Disallow /etc-magazine.com/wp-admin
Disallow /etc-magazine.com/wp-includes

discobot

Rule Path
Disallow /

slurp

Rule Path
Disallow /*blackhole
Disallow /?blackhole

Other Records

Field Value
crawl-delay 4
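
Example: checking URLs against the groups above

The groups listed in this report can be exercised with Python's standard urllib.robotparser module. The sketch below is only an illustration. The ROBOTS_TXT text is a reconstruction from the rule tables, not the file as served, and its layout is an assumption. The helper names (build_parser, allowed), the user agent "examplebot", and the sample paths are hypothetical. The crawl-delay record is left out because the report does not say which group it belongs to.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan report's rule tables. The exact layout of the
# real file (ordering, blank lines, comments) is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-includes
Disallow: /wp-content/gallery
Disallow: /wp-content/plugins
Disallow: /wp-content/themes
Disallow: /wp-content/wflogs
Disallow: /wp-content/xhc-xmt
Disallow: /wp-content/upgrade
Disallow: /wp-content/aiowps_backups
Disallow: /wp-content/aiowps_backups
Disallow: /etc-magazine.com/wp-admin
Disallow: /etc-magazine.com/wp-includes

User-agent: discobot
Disallow: /

User-agent: slurp
Disallow: /*blackhole
Disallow: /?blackhole
"""

BASE = "https://theguardsman.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text in memory, without any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


PARSER = build_parser(ROBOTS_TXT)


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the reconstructed rules."""
    return PARSER.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "examplebot" is a hypothetical crawler that falls under the "*" group.
    for agent, path in [
        ("examplebot", "/wp-includes/js/jquery.js"),
        ("examplebot", "/about"),
        ("discobot", "/about"),
    ]:
        print(agent, path, allowed(agent, path))
```

The standard-library parser matches rule paths by plain prefix and, as far as is known, does not interpret the * wildcard, so the slurp rules are included for completeness but are not a reliable test case here. A crawler that honours wildcards would need a different matcher.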