usgwarchives.net
robots.txt

Robots Exclusion Standard data for usgwarchives.net

Resource Scan

Scan Details

Site Domain usgwarchives.net
Base Domain usgwarchives.net
Scan Status Ok
Last Scan 2024-05-24T21:12:30+00:00
Next Scan 2024-06-23T21:12:30+00:00

Last Scan

Scanned 2024-05-24T21:12:30+00:00
URL http://usgwarchives.net/robots.txt
Domain IPs 192.175.112.4
Response IP 192.175.112.4
Found Yes
Hash 6339af0fa5c25d799ea8ca98b884af0e21f2702cbdfdec72cb07b7bd843256cb
SimHash 8814ded43192

Groups

jrank-bot

Rule Path
Disallow /

zoomspider

Rule Path
Disallow /

zoom*

Rule Path
Disallow /

zsebot

Rule Path
Disallow /

myfamilybot

Rule Path
Disallow /

*

Rule Path
Disallow /sekrit/

gigabot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

webalta crawler

Rule Path
Disallow /
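
As a rough illustration of how the groups above behave, the following sketch rebuilds a robots.txt from the listed rules and checks a few requests with Python's standard urllib.robotparser. The file text is a reconstruction from this scan, not the original file, so its exact wording, casing, and comments are assumptions. The agent name ExampleBot, the helper allowed, and the paths /index.html and /sekrit/page.html are hypothetical and used only for the demonstration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in the scan. The real file's exact
# wording, casing, and comments are not shown there and may differ.
ROBOTS_TXT = """\
User-agent: jrank-bot
Disallow: /

User-agent: zoomspider
Disallow: /

User-agent: zoom*
Disallow: /

User-agent: zsebot
Disallow: /

User-agent: myfamilybot
Disallow: /

User-agent: *
Disallow: /sekrit/

User-agent: gigabot
Disallow: /

User-agent: baiduspider
Disallow: /

User-agent: webalta crawler
Disallow: /
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network request is made.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` fetch `path` on this site?"""
    return parser.can_fetch(agent, "http://usgwarchives.net" + path)


# A crawler named in its own group is blocked from the whole site.
print(allowed("baiduspider", "/index.html"))       # False
# Any other crawler falls under the "*" group: only /sekrit/ is off limits.
print(allowed("ExampleBot", "/sekrit/page.html"))  # False
print(allowed("ExampleBot", "/index.html"))        # True
```

Matching details differ between parsers. For instance, this module compares agent names by substring and treats the group name zoom* as literal text rather than a wildcard, so other crawlers or libraries may interpret that group differently.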