Robots Exclusion Standard data for thebreakthrough.org

Resource Scan

Scan Details

Site Domain thebreakthrough.org
Base Domain thebreakthrough.org
Scan Status Ok
Last Scan 2025-11-24T19:33:36+00:00
Next Scan 2025-12-01T19:33:36+00:00

Last Scan

Scanned 2025-11-24T19:33:36+00:00
URL https://thebreakthrough.org/robots.txt
Domain IPs 94.247.142.1
Response IP 94.247.142.1
Found Yes
Hash 339ebc3592848f06ef3b1c52460e9a1153e756f8cc8b05421cf3fdb3037b0236
SimHash e2289d723790

Groups

*

Rule Path
Disallow /cpresources/
Disallow /vendor/
Disallow /.env
Disallow /cache/
Disallow /index.php/

Other Records

Field Value
sitemap https://thebreakthrough.org/sitemaps-1-sitemap.xml
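To see how a crawler might apply the group rules and sitemap record listed above, here is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT text is an assumed reconstruction from the parsed rules in this report, not the raw file (which may differ in ordering, comments, and whitespace). The helper names build_parser and is_allowed, the "ExampleBot" user agent, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the file from the rules listed in this report;
# the raw robots.txt is not reproduced here and may differ in detail.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
Disallow: /cache/
Disallow: /index.php/
Sitemap: https://thebreakthrough.org/sitemaps-1-sitemap.xml
"""

BASE_URL = "https://thebreakthrough.org"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text offline, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if the (hypothetical) agent may fetch the given path."""
    return parser.can_fetch(agent, BASE_URL + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Sample paths are illustrative only; they are not taken from the site.
    for path in ("/", "/articles", "/vendor/autoload.php", "/cpresources/app.js", "/.env"):
        status = "allowed" if is_allowed(parser, path) else "disallowed"
        print(f"{path}: {status}")
    # site_maps() requires Python 3.8 or later.
    print("sitemaps:", parser.site_maps())
```

Because the only group is the wildcard group (*), any user agent string gets the same answer here. Note that urllib.robotparser matches rules by simple path prefix, so /index.php/ blocks /index.php/foo but not /index.php itself.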

Comments

  • robots.txt for https://thebreakthrough.org/
  • live - don't allow web crawlers to index cpresources/ or vendor/