allmishnah.org
robots.txt

Robots Exclusion Standard data for allmishnah.org

Resource Scan

Scan Details

Site Domain allmishnah.org
Base Domain allmishnah.org
Scan Status Ok
Last Scan 2025-12-15T08:48:54+00:00
Next Scan 2025-12-29T08:48:54+00:00

Last Scan

Scanned 2025-12-15T08:48:54+00:00
URL https://allmishnah.org/robots.txt
Domain IPs 3.170.229.23, 3.170.229.29, 3.170.229.66, 3.170.229.71
Response IP 3.170.229.29
Found Yes
Hash a9572a6319e3dd433765ec8a179970b0f9bca33a11ac5442c6f6ea10bf65d02f
SimHash 000a5e1ae42d

Groups

*

Rule Path
Allow /
Allow /authors/
Allow /series/
Allow /p/
Allow /blogs/
Allow /mishnah
Allow /today-mishnah/
Allow /text/
Allow /about-us
Allow /contact-us
Allow /events/
Allow /podcast/
Allow /search/
Disallow /dashboard
Disallow /preferences/
Disallow /library/history
Disallow /library/downloads
Disallow /library/subscriptions
Disallow /library/playlists/
Disallow /library/learning
Disallow /library/podcast
Disallow /login/
Disallow /sign-out
Disallow /auth-callback
Disallow /api/
Disallow /trpc/
Disallow /donate
Disallow /support
Allow *.css
Allow *.js
Allow *.png
Allow *.jpg
Allow *.jpeg
Allow *.gif
Allow *.webp
Allow *.svg
Allow *.pdf
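
The group above mixes a blanket "Allow /" with narrower Disallow paths and wildcard Allow patterns, so the outcome for a given URL depends on how a crawler resolves overlaps. The sketch below assumes the longest-match precedence described in RFC 9309 (the most specific pattern wins, and Allow wins a tie). Whether any particular crawler visiting this site behaves that way is an assumption. RULES is a hand-copied subset of the table, and pattern_to_regex and is_allowed are hypothetical helper names, not part of any library.

```python
import re

# Hand-copied subset of the "*" group listed above (illustrative only).
RULES = [
    ("allow", "/"),
    ("allow", "/search/"),
    ("disallow", "/dashboard"),
    ("disallow", "/library/history"),
    ("disallow", "/api/"),
    ("allow", "*.css"),
    ("allow", "*.js"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        # re.match anchors at the start of the path, like a prefix match.
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            longer = len(pattern) > best_len
            tie_prefers_allow = len(pattern) == best_len and allow
            if longer or tie_prefers_allow:
                best_len = len(pattern)
                best_allow = allow
    return best_allow


if __name__ == "__main__":
    for path in ("/mishnah", "/dashboard", "/static/app.js", "/api/app.js"):
        print(path, is_allowed(path))
```

Under this reading, "Allow *.js" (4 characters) does not reopen a script under "/api/" (5 characters), because the longer Disallow pattern takes precedence. Python's standard urllib.robotparser would give different answers here: it applies the first rule whose path is a plain prefix of the URL and does not expand "*", so a leading "Allow /" would match everything.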

Other Records

Field Value
crawl-delay 1
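
The crawl-delay record asks for a pause between successive requests. It is a non-standard extension (not part of RFC 9309), and some crawlers ignore it. The sketch below reads the value with Python's urllib.robotparser and spaces out request start times accordingly. ROBOTS_LINES is a hand-written fragment based on the records above, not the site's actual file, and "ExampleBot", read_crawl_delay and earliest_fetch_offsets are hypothetical names used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hand-written fragment reflecting the records above (not the real file).
ROBOTS_LINES = [
    "User-agent: *",
    "Crawl-delay: 1",
    "Allow: /",
]


def read_crawl_delay(lines, agent="ExampleBot"):
    """Return the crawl delay in seconds for `agent`, or 0 if none is set."""
    parser = RobotFileParser()
    parser.parse(lines)
    delay = parser.crawl_delay(agent)
    return delay or 0


def earliest_fetch_offsets(n_requests, delay_seconds):
    """Seconds after the first request at which each request may start."""
    return [i * delay_seconds for i in range(n_requests)]


if __name__ == "__main__":
    delay = read_crawl_delay(ROBOTS_LINES)
    print(earliest_fetch_offsets(3, delay))
```

A real crawler would sleep until each offset before issuing the request; the schedule is computed here without sleeping so the logic is easy to inspect.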

Comments

  • Robots.txt for AllMishnah.org
  • Mishnah learning platform
  • Allow crawling of main content
  • Block user-specific and authentication areas
  • Block API endpoints
  • Block donation and support flows (user-specific)
  • Allow specific file types that are good for SEO
  • Common crawler optimizations
  • Sitemap location (when implemented)
  • Sitemap: https://allmishnah.org/sitemap.xml