guides.loc.gov
robots.txt
Robots Exclusion Standard data for guides.loc.gov
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | guides.loc.gov |
Base Domain | loc.gov |
Scan Status | Ok |
Last Scan | 2024-11-03T08:45:44+00:00 |
Next Scan | 2024-12-03T08:45:44+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-11-03T08:45:44+00:00 |
URL | https://guides.loc.gov/robots.txt |
Domain IPs | 104.17.6.58, 104.18.64.82, 2606:4700::6811:63a, 2606:4700::6812:4052 |
Response IP | 104.18.64.82 |
Found | Yes |
Hash | 99d13d009d67b1ff254a03e3c23d25815ca57687656fe386375b00a27f278991 |
SimHash | 5824ccc1e583 |
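The scan does not name the algorithm behind the `Hash` value, but its 64 hexadecimal characters match the length of a SHA-256 digest. Assuming it is a SHA-256 of the raw response body, a fetched copy of the file could be compared against the scan as sketched below. The `matches_scan_hash` helper is hypothetical, and the algorithm choice is an assumption rather than something the report confirms.

```python
import hashlib

# Hash value copied from the scan report above.
SCAN_HASH = "99d13d009d67b1ff254a03e3c23d25815ca57687656fe386375b00a27f278991"


def matches_scan_hash(body: bytes, expected: str = SCAN_HASH) -> bool:
    """Return True if the SHA-256 digest of `body` equals `expected`.

    Assumption: the scanner hashed the raw response bytes with SHA-256.
    """
    return hashlib.sha256(body).hexdigest() == expected.lower()
```

A mismatch does not necessarily mean the file changed; it may only mean the scanner normalized the body or used a different algorithm.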
Groups
*
Rule | Path |
---|---|
Disallow | /er.php |
Disallow | /err.php |
Disallow | /go.php |
Disallow | /friendly.php |
Disallow | /ld.php |
Disallow | /srch.php |
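One way to check a URL against the rules in this group is to rebuild the group and pass it to Python's standard `urllib.robotparser`. The sketch below is a reconstruction from the table above, not the raw file, which may differ in ordering, comments, or whitespace. The `ROBOTS_LINES` list, the `is_allowed` helper, and the `example-bot` agent name are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group in the scan; not the raw robots.txt.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /er.php",
    "Disallow: /err.php",
    "Disallow: /go.php",
    "Disallow: /friendly.php",
    "Disallow: /ld.php",
    "Disallow: /srch.php",
]


def is_allowed(path: str, agent: str = "example-bot") -> bool:
    """Return True if `agent` may fetch `path` under the reconstructed rules."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)  # parse() accepts an iterable of lines
    return parser.can_fetch(agent, "https://guides.loc.gov" + path)
```

Because `urllib.robotparser` matches rules by path prefix, a call such as `is_allowed("/srch.php?q=maps")` is rejected by the `/srch.php` rule, while paths not listed in the group are allowed.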
Other Records
Field | Value |
---|---|
crawl-delay | 10 |
sitemap | https://guides.loc.gov/sitemap.xml |
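A crawler could read the `crawl-delay` and `sitemap` records with `urllib.robotparser` as well. The scan lists `crawl-delay` separately from the `*` group, so its placement in the raw file is unknown; the sketch below assumes it applies to all agents and puts it inside the `*` group, since that is the only position where Python's parser reads it. The `crawl_settings` helper and the agent name are hypothetical, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Assumed layout: the scan does not show where these records sit in the file.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /srch.php",
    "Crawl-delay: 10",
    "Sitemap: https://guides.loc.gov/sitemap.xml",
]


def crawl_settings(agent: str = "example-bot"):
    """Return (crawl delay in seconds, list of sitemap URLs) for `agent`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_LINES)
    delay = parser.crawl_delay(agent)      # None if no delay applies
    sitemaps = parser.site_maps() or []    # site_maps() returns None if absent
    return delay, sitemaps
```

A polite client would sleep for the returned delay between requests. `crawl-delay` is a non-standard extension that not every crawler honors.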