self.gutenberg.org
robots.txt

Robots Exclusion Standard data for self.gutenberg.org

Resource Scan

Scan Details

Site Domain self.gutenberg.org
Base Domain gutenberg.org
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Request timed out.
Last Scan 2025-12-29T09:52:09+00:00
Next Scan 2026-02-27T09:52:09+00:00

Last Successful Scan

Scanned 2025-10-05T00:50:45+00:00
URL http://self.gutenberg.org/robots.txt
Domain IPs 72.235.245.98
Response IP 72.235.245.98
Found Yes
Hash 54fb8ab0fece1e34ae942d0a50ba6ec8ed0bd620aded9d8e1b98646211b77ad2
SimHash 4340cc47c312

Groups

*

Rule      Path                        Comment
Allow     /*                          -
Disallow  /view/                      -
Disallow  /ebooks/                    -
Disallow  /Articles/                  -
Disallow  /results.aspx               -
Disallow  /Get956uFile.aspx           -
Disallow  /ebooks/Get956uFile.aspx    -
Disallow  /App_Themes/                -
Disallow  /img/                       private area
Disallow  /images/                    private area
Disallow  /js/                        private area
Disallow  /Members/                   private area
Disallow  /Members.2/                 private area
Disallow  /Members.3/                 private area
Disallow  /Members.4/                 private area
Disallow  /Members.5/                 private area
Disallow  /Members.6/                 private area
Disallow  /Members.7/                 private area
Disallow  /Members.8/                 private area
Disallow  /Members.9/                 private area
Disallow  /opac/                      private area
Disallow  /Report/                    private area
Disallow  /Services/                  -
Disallow  /styles/                    -
Disallow  /view/opac*                 -
Disallow  /XmlDb/                     -
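
Because the group opens with a blanket Allow /* and then lists narrower Disallow paths, the outcome for a given URL depends on how a crawler resolves overlapping rules. The sketch below is illustrative only: it assumes the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and Allow wins a tie), which this scan record does not itself state, and crawlers that use first-match ordering would behave differently. The names RULES, pattern_to_regex and is_allowed are hypothetical helpers, and RULES holds only a subset of the table above.

```python
import re

# Subset of the "*" group rules listed above, as (directive, pattern) pairs.
RULES = [
    ("allow", "/*"),
    ("disallow", "/view/"),
    ("disallow", "/ebooks/"),
    ("disallow", "/Members/"),
    ("disallow", "/view/opac*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence.

    Assumption: the longest matching pattern wins and Allow wins ties,
    as in RFC 9309. Matching is case-sensitive.
    """
    best_len, allowed = -1, True  # no matching rule means allowed
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


if __name__ == "__main__":
    for candidate in ["/about", "/view/123", "/ebooks/x", "/members/"]:
        print(candidate, is_allowed(candidate))
```

Under this assumed precedence, /view/ and /ebooks/ override the shorter Allow /*, while a path such as /members/ stays allowed because the Disallow entry is spelled /Members/ and path matching is case-sensitive.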

Other Records

Field Value
sitemap http://self.gutenberg.org/sitemap.xml

Comments

  • robots.txt