leanpub.com
robots.txt

Robots Exclusion Standard data for leanpub.com

Resource Scan

Scan Details

Site Domain leanpub.com
Base Domain leanpub.com
Scan Status Ok
Last Scan 2024-06-03T10:06:32+00:00
Next Scan 2024-07-03T10:06:32+00:00

Last Scan

Scanned 2024-06-03T10:06:32+00:00
URL https://leanpub.com/robots.txt
Domain IPs 99.84.66.21, 99.84.66.64, 99.84.66.79, 99.84.66.80
Response IP 18.165.171.95
Found Yes
Hash 1dfb06b806cf29aea7355ca01c9fc641dccd0a53fffbaf0d1db7beebba69ebbb
SimHash a83991774143
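
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm or say which bytes were hashed. Assuming it is SHA-256 over the raw response body, a fetched copy of the file could be compared against the report roughly as follows. The helper names `sha256_hex` and `matches_report` are illustrative only.

```python
import hashlib

# Hash value copied from the report above.
REPORTED_HASH = "1dfb06b806cf29aea7355ca01c9fc641dccd0a53fffbaf0d1db7beebba69ebbb"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_report(body: bytes) -> bool:
    """True if the body hashes to the value in the report.

    Assumption: the report hashes the raw, unmodified response body
    with SHA-256. The report itself does not confirm this.
    """
    return sha256_hex(body) == REPORTED_HASH
```

A mismatch would not necessarily mean the file changed; it could also mean the assumption about the algorithm or the hashed bytes is wrong.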

Groups

*

Rule Path
Disallow /s/
Disallow /author_app/
Disallow /course_admin/
Disallow /course_set_admin/
Disallow /user_dashboard/
Disallow /author_dashboard/
Disallow /library/

easouspider

Rule Path
Disallow /

Other Records

Field Value
sitemap https://leanpub.com/sitemap.xml
sitemap https://leanpub.com/ai/sitemap.xml
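
As a rough illustration of how a crawler would interpret the groups and sitemap records listed above, the sketch below rebuilds a robots.txt from them and queries it with Python's standard `urllib.robotparser`. The rebuilt text is an approximation of the scanned file (comments and original ordering are not reproduced), and the agent name `ExampleBot` and the tested paths are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Approximate reconstruction from the Groups and Other Records sections.
# The real file also contains comments, which are omitted here.
ROBOTS_TXT = """\
User-agent: *
Disallow: /s/
Disallow: /author_app/
Disallow: /course_admin/
Disallow: /course_set_admin/
Disallow: /user_dashboard/
Disallow: /author_dashboard/
Disallow: /library/

User-agent: easouspider
Disallow: /

Sitemap: https://leanpub.com/sitemap.xml
Sitemap: https://leanpub.com/ai/sitemap.xml
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network request is made.
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on leanpub.com."""
    return parser.can_fetch(agent, "https://leanpub.com" + path)


# Generic crawlers fall under the "*" group.
print(allowed("ExampleBot", "/library/"))   # blocked by Disallow /library/
print(allowed("ExampleBot", "/blog/"))      # no matching rule, so allowed

# easouspider has its own group that disallows everything.
print(allowed("easouspider", "/blog/"))

# site_maps() requires Python 3.8 or later.
print(parser.site_maps())
```

Note that `urllib.robotparser` uses simple prefix matching, so its answers may differ from crawlers that apply their own matching rules.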

Comments

  • Since these are sitemap indexes we cannot link to them in our own sitemap, search engines do not accept nested indexes
  • But multiple sitemap records here are allowed as per: https://www.sitemaps.org/protocol.html#submit_robots
  • Sitemap: https://leanpub.com/blog/sitemap.xml
  • Sitemap: https://leanpub.com/frontmatter/sitemap.xml
  • ShortURLS
  • Product management pages
  • User auth required