hgem.com
robots.txt

Robots Exclusion Standard data for hgem.com

Resource Scan

Scan Details

Site Domain hgem.com
Base Domain hgem.com
Scan Status Ok
Last Scan 2025-06-28T16:24:16+00:00
Next Scan 2025-07-05T16:24:16+00:00

Last Scan

Scanned 2025-06-28T16:24:16+00:00
URL https://hgem.com/robots.txt
Redirect https://www.hgem.com/robots.txt
Redirect Domain www.hgem.com
Redirect Base hgem.com
Domain IPs 94.247.142.1
Redirect IPs 94.247.142.1
Response IP 94.247.142.1
Found Yes
Hash 8f496b1330c2f4acb3d058c4e280821d1d8553d35263647fb89ab3e35dd7adeb
SimHash 8328dd422f33

Groups

*

Rule Path
Disallow /cpresources/
Disallow /vendor/
Disallow /.env
Disallow /cache/
Disallow /courses

Other Records

Field Value
sitemap https://www.hgem.com/sitemaps-1-sitemap.xml
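To illustrate how the group and sitemap record above would be read by a crawler, here is a minimal sketch using Python's standard urllib.robotparser. The robots.txt text below is reconstructed from the records listed in this scan, so it is an assumption and not the original file byte for byte (it would not reproduce the hash shown above). The user agent name "ExampleBot" and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan records above (assumption: the real file may
# differ in comments, ordering, and whitespace).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cpresources/
Disallow: /vendor/
Disallow: /.env
Disallow: /cache/
Disallow: /courses

Sitemap: https://www.hgem.com/sitemaps-1-sitemap.xml
"""

BASE = "https://www.hgem.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making any network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "ExampleBot" is a hypothetical crawler name; it falls under the "*" group.
for path in ("/", "/vendor/autoload.php", "/courses", "/courses/intro", "/.env"):
    allowed = parser.can_fetch("ExampleBot", BASE + path)
    print(f"{path}: {'allowed' if allowed else 'disallowed'}")

# Sitemap records are reported separately from the rule groups.
print(parser.site_maps())
```

urllib.robotparser matches rules by path prefix, so "Disallow /courses" also covers paths such as /courses/intro, while "Disallow /vendor/" only covers paths under that directory. Other crawlers may interpret edge cases differently.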

Comments

  • robots.txt for https://www.hgem.com/
  • default - don't allow web crawlers to index cpresources/ or vendor/