grubstreet.org
robots.txt

Robots Exclusion Standard data for grubstreet.org

Resource Scan

Scan Details

Site Domain grubstreet.org
Base Domain grubstreet.org
Scan Status Ok
Last Scan 2024-05-19T20:57:45+00:00
Next Scan 2024-06-18T20:57:45+00:00

Last Scan

Scanned 2024-05-19T20:57:45+00:00
URL https://grubstreet.org/robots.txt
Domain IPs 104.26.4.39, 104.26.5.39, 172.67.72.174, 2606:4700:20::681a:427, 2606:4700:20::681a:527, 2606:4700:20::ac43:48ae
Response IP 104.26.5.39
Found Yes
Hash d13287a3f61ef089673eb36ab6b7363ab07b29ab663ac0c72b9f734885ab4a7f
SimHash c338195237b3

Groups

*

Rule Path
Disallow /cpresources/
Disallow /vendor/
Disallow /.env
Disallow /cache/

Other Records

Field Value
sitemap https://grubstreet.org/sitemaps-1-sitemap.xml
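
As a rough illustration of how a crawler might interpret the group and sitemap record listed above, the sketch below rebuilds an equivalent robots.txt and evaluates a few paths with Python's standard urllib.robotparser. The file text is an assumption reconstructed from the table (the original file's exact wording and comments are not reproduced here), and the helper name is_allowed and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the scanned file, built from the
# "Groups" and "Other Records" tables; not the verbatim original.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
Disallow: /cache/

Sitemap: https://grubstreet.org/sitemaps-1-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="*"):
    """Hypothetical helper: True if `agent` may fetch `path` on the site."""
    return parser.can_fetch(agent, "https://grubstreet.org" + path)


if __name__ == "__main__":
    # Sample paths are illustrative only.
    for path in ("/", "/vendor/autoload.php", "/.env", "/cache/page.html"):
        print(path, is_allowed(path))
    # site_maps() requires Python 3.8 or newer.
    print(parser.site_maps())
```

Disallow rules match by path prefix, so every URL under /cpresources/, /vendor/ and /cache/ is excluded for all user agents, while paths not covered by a rule stay fetchable.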

Comments

  • robots.txt for
  • live - don't allow web crawlers to index cpresources/ or vendor/