maint.loc.gov
robots.txt

Robots Exclusion Standard data for maint.loc.gov

Resource Scan

Scan Details

Site Domain maint.loc.gov
Base Domain loc.gov
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2025-12-13T12:58:29+00:00
Next Scan 2026-03-13T12:58:29+00:00

Last Successful Scan

Scanned 2025-04-20T00:05:57+00:00
URL https://maint.loc.gov/robots.txt
Domain IPs 104.17.6.58, 104.18.64.82, 2606:4700::6811:63a, 2606:4700::6812:4052
Response IP 104.18.64.82
Found Yes
Hash 3f78887689175fb3668e34a93566949455578d114d955362f9127c7f6e3d18a3
SimHash 0d5bda70c2c2

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /web_arch/
Disallow /rr/mopic/staff
Disallow /loc/volunteers
Disallow /ficmanagers
Disallow /preserv/extranet/
Disallow /myloc
Disallow /nationalfilmregistry
Disallow /fedsearch
Disallow /search
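
As a rough illustration of how the rules above could be applied, the sketch below rebuilds a robots.txt body from the rule table and checks a few paths with Python's standard `urllib.robotparser`. The file text is a reconstruction, not the original file (the report does not show the raw content or its invalid lines), and the agent name `ExampleBot` and the helper `is_allowed` are hypothetical names chosen for the example.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the rule table in this report (assumption: the real
# file may differ in ordering, comments, and its invalid lines).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /web_arch/
Disallow: /rr/mopic/staff
Disallow: /loc/volunteers
Disallow: /ficmanagers
Disallow: /preserv/extranet/
Disallow: /myloc
Disallow: /nationalfilmregistry
Disallow: /fedsearch
Disallow: /search
"""


def is_allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under the rules above.

    `agent` is a hypothetical crawler name; any name falls back to the
    `*` group because no other group is listed.
    """
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ["/search", "/cgi-bin/test", "/cgi-bin", "/pictures/"]:
        print(path, is_allowed(path))
```

Note that `urllib.robotparser` matches rules by plain path prefix, so a rule without a trailing slash such as `/search` also blocks `/searching`, while `/cgi-bin/` does not block the bare path `/cgi-bin`.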

Other Records

Field Value
crawl-delay 2
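
The crawl-delay record is a non-standard extension that crawlers commonly read as a number of seconds to wait between requests. The sketch below, which assumes the record applies to the `*` group, reads the value with `urllib.robotparser` and spaces out a list of fetches accordingly. The helper names `crawl_delay_for` and `schedule`, the agent name `ExampleBot`, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: the crawl-delay record belongs to the `*` group. The report
# lists it under "Other Records" and does not show its exact position.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 2
"""


def crawl_delay_for(agent="ExampleBot"):
    """Return the crawl delay for `agent`, or None if none is set."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.crawl_delay(agent)


def schedule(paths, delay):
    """Pair each path with its start offset in seconds, `delay` apart."""
    return [(index * delay, path) for index, path in enumerate(paths)]


if __name__ == "__main__":
    delay = crawl_delay_for() or 0
    for offset, path in schedule(["/a", "/b", "/c"], delay):
        print(f"t+{offset}s fetch {path}")
```

A real crawler would sleep for the delay between requests (for example with `time.sleep`) rather than precompute offsets; the offsets here only make the pacing visible.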

Warnings

  • 2 invalid lines.