Robots Exclusion Standard data for catalog.data.gov
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | catalog.data.gov |
Base Domain | data.gov |
Scan Status | Ok |
Last Scan | 2024-11-03T09:52:24+00:00 |
Next Scan | 2024-12-03T09:52:24+00:00 |
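The two scan timestamps above are exactly 30 days apart. A minimal sketch of that arithmetic follows, assuming a fixed 30-day rescan interval; the interval is inferred from the two values and is not stated by the scanner, which could equally be using a calendar month. The names `SCAN_INTERVAL` and `next_scan` are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the rescan interval is a fixed 30 days, inferred from the
# "Last Scan" and "Next Scan" values in the report.
SCAN_INTERVAL = timedelta(days=30)


def next_scan(last_scan_iso: str, interval: timedelta = SCAN_INTERVAL) -> str:
    """Return the next scan time as an ISO 8601 string with its UTC offset."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + interval).isoformat()


print(next_scan("2024-11-03T09:52:24+00:00"))  # 2024-12-03T09:52:24+00:00
```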
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-03T09:52:24+00:00 |
URL | https://catalog.data.gov/robots.txt |
Domain IPs | 108.156.133.105, 108.156.133.115, 108.156.133.21, 108.156.133.42, 2600:9000:2755:2400:1:569b:bd00:93a1, 2600:9000:2755:4200:1:569b:bd00:93a1, 2600:9000:2755:4c00:1:569b:bd00:93a1, 2600:9000:2755:4e00:1:569b:bd00:93a1, 2600:9000:2755:5200:1:569b:bd00:93a1, 2600:9000:2755:600:1:569b:bd00:93a1, 2600:9000:2755:7e00:1:569b:bd00:93a1, 2600:9000:2755:b600:1:569b:bd00:93a1 |
Response IP | 108.156.133.115 |
Found | Yes |
Hash | 32866e626f9d21ae83b4efc8c34c522571e5ea0a8c37c5374c031da6033aee3a |
SimHash | 6d31dc648891 |
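The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. A minimal sketch of change detection against that value follows, assuming the scanner hashes the raw response body with SHA-256; the report does not state the algorithm or what exactly is hashed, so a mismatch may only mean the assumption is wrong. The names `body_digest` and `has_changed` are hypothetical.

```python
import hashlib

# Hash value copied from the scan report.
RECORDED_HASH = "32866e626f9d21ae83b4efc8c34c522571e5ea0a8c37c5374c031da6033aee3a"


def body_digest(body: bytes) -> str:
    """Hex SHA-256 digest of a response body (the algorithm is an assumption)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, recorded_hash: str = RECORDED_HASH) -> bool:
    """True when the body's digest differs from the recorded hash."""
    return body_digest(body) != recorded_hash.lower()
```

To use it, pass the bytes of a freshly fetched https://catalog.data.gov/robots.txt to `has_changed`.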
Groups
*
Rule | Path |
---|---|
Disallow | /dataset/rate/ |
Disallow | /revision/ |
Disallow | /dataset/*/history |
Disallow | /api/ |
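The rule /dataset/*/history uses a wildcard, so plain prefix matching is not enough to evaluate it. A minimal sketch of a matcher for the four rules above follows, assuming the common convention (also described in RFC 9309) that `*` matches any run of characters and a trailing `$` anchors the end of the path. The names `rule_to_regex` and `is_allowed` are hypothetical, and Allow rules are not handled because the group lists none.

```python
import re

# Disallow rules for the "*" group, copied from the table above.
DISALLOW = ["/dataset/rate/", "/revision/", "/dataset/*/history", "/api/"]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path rule into a compiled regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with ".*" for each "*".
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_allowed(path: str, disallow=DISALLOW) -> bool:
    """True when no Disallow rule matches from the start of the path."""
    return not any(rule_to_regex(rule).match(path) for rule in disallow)


print(is_allowed("/dataset/example"))          # True
print(is_allowed("/dataset/example/history"))  # False
print(is_allowed("/api/3/action/status"))      # False
```

The paths in the usage lines are made up for illustration only.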
Other Records
Field | Value |
---|---|
crawl-delay | 10 |
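A crawl-delay of 10 is conventionally read as a request to wait at least 10 seconds between successive requests, although the directive is not standardized and crawlers interpret it differently. A minimal sketch of a throttle under that reading follows; the class name `CrawlDelay` and its injectable `clock` and `sleep` parameters are hypothetical.

```python
import time


class CrawlDelay:
    """Enforce a minimum gap, in seconds, between successive requests."""

    def __init__(self, delay_seconds=10.0, clock=time.monotonic, sleep=time.sleep):
        self._delay = delay_seconds
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        """Block until the delay since the previous request has elapsed."""
        if self._last is not None:
            remaining = self._delay - (self._clock() - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()
```

Calling `wait()` immediately before each request keeps a single-threaded crawler at or below one request per 10 seconds.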
Other Records
Field | Value |
---|---|
sitemap | https://catalog.data.gov/sitemap.xml |
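The records above can also be read with Python's standard `urllib.robotparser`. The sketch below rebuilds a robots.txt from the tables in this report; the exact text, line order, and any comments in the real file are assumptions. Note that `urllib.robotparser` matches paths by prefix and does not expand the `*` in /dataset/*/history, so that rule needs separate handling.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the scan tables, not copied from the live file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /dataset/rate/
Disallow: /revision/
Disallow: /dataset/*/history
Disallow: /api/
Crawl-delay: 10

Sitemap: https://catalog.data.gov/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

print(parser.crawl_delay("*"))  # 10
print(parser.site_maps())       # ['https://catalog.data.gov/sitemap.xml']
print(parser.can_fetch("*", "https://catalog.data.gov/api/3/action/status"))  # False
```

`site_maps()` requires Python 3.8 or later.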