Robots Exclusion Standard data for maint.loc.gov
Resource Scan
Scan Details
| Field | Value |
|---|---|
| Site Domain | maint.loc.gov |
| Base Domain | loc.gov |
| Scan Status | Failed |
| Failure Stage | Fetching resource. |
| Failure Reason | Couldn't connect to server. |
| Last Scan | 2025-12-13T12:58:29+00:00 |
| Next Scan | 2026-03-13T12:58:29+00:00 |
Last Successful Scan
| Field | Value |
|---|---|
| Scanned | 2025-04-20T00:05:57+00:00 |
| URL | https://maint.loc.gov/robots.txt |
| Domain IPs | 104.17.6.58, 104.18.64.82, 2606:4700::6811:63a, 2606:4700::6812:4052 |
| Response IP | 104.18.64.82 |
| Found | Yes |
| Hash | 3f78887689175fb3668e34a93566949455578d114d955362f9127c7f6e3d18a3 |
| SimHash | 0d5bda70c2c2 |
Groups
User-agent: *
| Rule | Path |
|---|---|
| Disallow | /cgi-bin/ |
| Disallow | /web_arch/ |
| Disallow | /rr/mopic/staff |
| Disallow | /loc/volunteers |
| Disallow | /ficmanagers |
| Disallow | /preserv/extranet/ |
| Disallow | /myloc |
| Disallow | /nationalfilmregistry |
| Disallow | /fedsearch |
| Disallow | /search |
Other Records
| Field | Value |
|---|---|
| crawl-delay | 2 |
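
The rules and crawl-delay listed above can be checked against sample paths with Python's standard `urllib.robotparser`. This is a minimal sketch, not the scanner's own logic: the `ROBOTS_LINES` list is reconstructed from the tables on this page (the original file and its two invalid lines are not reproduced here), and the `ExampleBot` agent name and the `is_allowed` helper are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups and Other Records tables; an assumption,
# since the original robots.txt (including its invalid lines) is not shown.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /cgi-bin/",
    "Disallow: /web_arch/",
    "Disallow: /rr/mopic/staff",
    "Disallow: /loc/volunteers",
    "Disallow: /ficmanagers",
    "Disallow: /preserv/extranet/",
    "Disallow: /myloc",
    "Disallow: /nationalfilmregistry",
    "Disallow: /fedsearch",
    "Disallow: /search",
    "Crawl-delay: 2",
]

BASE_URL = "https://maint.loc.gov"

# Parse the reconstructed lines directly instead of fetching over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_LINES)


def is_allowed(path, agent="ExampleBot"):
    """Return True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, BASE_URL + path)


if __name__ == "__main__":
    for sample in ["/", "/search", "/cgi-bin", "/cgi-bin/foo", "/myloc/page"]:
        print(sample, is_allowed(sample))
    print("crawl-delay:", parser.crawl_delay("ExampleBot"))
```

Disallow paths are matched as prefixes, so `/search` also covers anything beginning with `/search`, while `/cgi-bin/` (with a trailing slash) does not cover the bare path `/cgi-bin`.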
Warnings
- 2 invalid lines.