in.gov
robots.txt
Robots Exclusion Standard data for in.gov
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | in.gov |
Base Domain | in.gov |
Scan Status | Ok |
Last Scan | 2024-11-04T15:48:29+00:00 |
Next Scan | 2024-12-04T15:48:29+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-04T15:48:29+00:00 |
URL | https://in.gov/robots.txt |
Redirect | https://www.in.gov/robots.txt |
Redirect Domain | www.in.gov |
Redirect Base | in.gov |
Domain IPs | 208.40.244.65 |
Redirect IPs | 208.40.244.65 |
Response IP | 208.40.244.65 |
Found | Yes |
Hash | 7b576f5e60a62d9d44e5fc532fb5c62ca1bfecd9c280b2f1391a8f74b3b22969 |
SimHash | 2a9e757f6e92 |
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /serv/ |
Disallow | /cgi-bin/ |
Disallow | /isdh/drafts_local/ |
Disallow | /demand |
Disallow | /search |
Disallow | /ai/errors/ |
Disallow | /dor/4572.htm |
Disallow | /dor/reference/legal/rulings/unused/ |
Disallow | /dwd/files/swic/ |
Disallow | /dwd/files/JWIB/ |
Disallow | /dwd/files/CM_Files/ |
Disallow | /dwd/files/policy/ |
Disallow | /dwd/test/ |
Disallow | /ActiveCalendar/mobile/mobilelist.aspx |
Disallow | *subscribetocalendar.aspx* |
Disallow | *RSSSyndicator.aspx* |
Disallow | *downloadtype.aspx* |
Disallow | /indot/3212.htm |
Disallow | /sos/online_corps/ |
Disallow | /sos/clerical/ |
Disallow | /sos/registration/ |
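
The rules above mix plain path prefixes (such as /search) with wildcard patterns (such as *RSSSyndicator.aspx*). As a rough illustration of how a crawler might evaluate them, the following Python sketch translates each Disallow path into a regular expression and tests a URL path against the list. The names `DISALLOW`, `pattern_to_regex`, and `is_disallowed` are hypothetical and chosen for this example. The sketch assumes the common convention that `*` matches any run of characters, that a trailing `$` anchors the end of the path, and that matching is case-sensitive. It skips percent-encoding normalization and Allow-rule precedence, since this group contains only Disallow rules.

```python
import re

# Disallow paths copied from the "*" group listed above.
DISALLOW = [
    "/serv/",
    "/cgi-bin/",
    "/isdh/drafts_local/",
    "/demand",
    "/search",
    "/ai/errors/",
    "/dor/4572.htm",
    "/dor/reference/legal/rulings/unused/",
    "/dwd/files/swic/",
    "/dwd/files/JWIB/",
    "/dwd/files/CM_Files/",
    "/dwd/files/policy/",
    "/dwd/test/",
    "/ActiveCalendar/mobile/mobilelist.aspx",
    "*subscribetocalendar.aspx*",
    "*RSSSyndicator.aspx*",
    "*downloadtype.aspx*",
    "/indot/3212.htm",
    "/sos/online_corps/",
    "/sos/clerical/",
    "/sos/registration/",
]


def pattern_to_regex(pattern):
    """Compile a robots.txt path pattern into a regex (hypothetical helper).

    '*' becomes '.*'; a trailing '$' anchors the end of the path.
    Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


RULES = [pattern_to_regex(p) for p in DISALLOW]


def is_disallowed(path):
    """Return True if the path (plus any query string) matches a Disallow rule.

    re.match anchors at the start of the string, which gives the
    prefix-matching behavior that robots.txt rules rely on.
    """
    return any(rule.match(path) for rule in RULES)


if __name__ == "__main__":
    for candidate in ["/search?q=tax", "/dor/", "/cal/RSSSyndicator.aspx?id=1"]:
        print(candidate, is_disallowed(candidate))
```

Because the rules are prefixes, /demand also blocks any longer path that begins with the same characters, and /dwd/files/JWIB/ does not block a lowercase /dwd/files/jwib/ under the case-sensitivity assumption.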
Comments