water.usgs.gov
robots.txt
Robots Exclusion Standard data for water.usgs.gov
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | water.usgs.gov |
Base Domain | usgs.gov |
Scan Status | Ok |
Last Scan | 2024-06-01T15:42:10+00:00 |
Next Scan | 2024-07-01T15:42:10+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-06-01T15:42:10+00:00 |
URL | https://water.usgs.gov/robots.txt |
Domain IPs | 137.227.233.178, 2001:49c8:0:126c::76 |
Response IP | 137.227.233.178 |
Found | Yes |
Hash | cf37c5adee7f1698d6cce556f43117a9d345689365054862b17a9b5e18edf839 |
SimHash | 614069659f75 |
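The Hash value is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say exactly which bytes were hashed. Under that assumption, a minimal sketch for checking whether a locally saved copy of the file still matches the recorded value could look like this. The function names `sha256_hex` and `matches_recorded` are hypothetical, and fetching the file is left out.

```python
import hashlib

# Hash value copied from the scan record above.
RECORDED_HASH = "cf37c5adee7f1698d6cce556f43117a9d345689365054862b17a9b5e18edf839"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Compare a body's digest to the recorded hash.

    Assumption: the scanner hashed the unmodified response body with
    SHA-256. If it normalised the text first, this check will not match.
    """
    return sha256_hex(body) == recorded.lower()
```

A mismatch does not necessarily mean the file changed. It can also mean the assumption about the algorithm or the hashed bytes is wrong.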
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /camera/ |
Disallow | /cgi-bin/feedback_form |
Disallow | /cgi-bin/lookup |
Disallow | /icons/ |
Disallow | /images/ |
Disallow | /nawdex/ |
Disallow | /nawqa-only/sparrowweb/ |
Disallow | /nsip/nsipmaps/ |
Disallow | /outreach/images/ |
Disallow | /preview/ |
Disallow | /project_alert/ |
Disallow | /public/ |
Disallow | /usgs_access/ |
Disallow | /watuse/wuhuc/ |
Disallow | /usgs/ogw/software-archive |
Disallow | /usgs/owq/software-archive |
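To illustrate how a crawler might apply the rules above, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text is reconstructed from the table rather than copied from the live file, so its exact formatting is an assumption. The agent name `example-bot`, the helper `is_allowed`, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Disallow paths for the "*" group, taken from the table above.
DISALLOWED = [
    "/camera/",
    "/cgi-bin/feedback_form",
    "/cgi-bin/lookup",
    "/icons/",
    "/images/",
    "/nawdex/",
    "/nawqa-only/sparrowweb/",
    "/nsip/nsipmaps/",
    "/outreach/images/",
    "/preview/",
    "/project_alert/",
    "/public/",
    "/usgs_access/",
    "/watuse/wuhuc/",
    "/usgs/ogw/software-archive",
    "/usgs/owq/software-archive",
]

# Rebuild an equivalent robots.txt (assumed layout, not the original file).
lines = ["User-agent: *"] + [f"Disallow: {path}" for path in DISALLOWED]

parser = RobotFileParser()
parser.parse(lines)


def is_allowed(path: str, agent: str = "example-bot") -> bool:
    """Return True if the rules permit `agent` to fetch `path`."""
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for sample in ["/index.html", "/camera/live.jpg", "/camera"]:
        print(sample, is_allowed(sample))
```

The standard-library parser matches rules by simple path prefix, so `/camera/` blocks everything under that directory but not the bare path `/camera`, while a rule without a trailing slash such as `/usgs/ogw/software-archive` also blocks any path that merely starts with that string. Other crawlers may interpret the rules differently.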