habitat.ca
robots.txt

Robots Exclusion Standard data for habitat.ca

Resource Scan

Scan Details

Site Domain habitat.ca
Base Domain habitat.ca
Scan Status Ok
Last Scan 2024-11-03T19:28:42+00:00
Next Scan 2024-11-17T19:28:42+00:00

Last Scan

Scanned 2024-11-03T19:28:42+00:00
URL https://habitat.ca/robots.txt
Domain IPs 108.157.254.109, 108.157.254.112, 108.157.254.116, 108.157.254.58
Response IP 108.157.254.116
Found Yes
Hash 39abafc109720faa367303a9e8826e1c9fe17f6322994d3cff0a526d738eba76
SimHash 415215523fd7

Groups

*

Rule Path
Disallow /cpresources/
Disallow /vendor/
Disallow /.env

Other Records

Field Value
sitemap https://habitat.ca/en/sitemaps-1-sitemap.xml
sitemap https://habitat.ca/fr/sitemaps-1-sitemap.xml
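
As a rough illustration of how a crawler might interpret the group and sitemap records above, the sketch below rebuilds a robots.txt body from them and queries it with Python's standard urllib.robotparser. The reconstructed text is an assumption (the original file's exact layout and comment lines are not reproduced in this report), and the user agent name "ExampleBot" and the test paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: rebuilt from the scan records above; the real file's
# exact layout, ordering, and comment lines may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env

Sitemap: https://habitat.ca/en/sitemaps-1-sitemap.xml
Sitemap: https://habitat.ca/fr/sitemaps-1-sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content held in memory (no network request)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler name; it falls under the "*" group.
for path in ("/en/", "/cpresources/app.js", "/vendor/autoload.php", "/.env"):
    allowed = parser.can_fetch("ExampleBot", "https://habitat.ca" + path)
    print(path, "allowed" if allowed else "disallowed")

# site_maps() requires Python 3.8 or later.
print(parser.site_maps())
```

Note that urllib.robotparser matches rules by path prefix, so under this parser a rule such as /.env would also cover any path that merely begins with /.env. Other crawlers may apply different matching rules.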

Comments

  • robots.txt for https://habitat.ca/en/
  • live - don't allow web crawlers to index cpresources/ or vendor/