hg.org
robots.txt
Robots Exclusion Standard data for hg.org
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | hg.org |
Base Domain | hg.org |
Scan Status | Failed |
Failure Stage | Fetching resource. |
Failure Reason | Server returned a client error. |
Last Scan | 2024-07-10T10:04:45+00:00 |
Next Scan | 2024-10-08T10:04:45+00:00 |
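The two timestamps above imply a fixed rescan interval. A minimal sketch of that arithmetic follows. It assumes the scanner schedules the next scan by adding a constant offset to the last scan time, which the report does not state explicitly. The variable names are illustrative only.

```python
from datetime import datetime

# Timestamps copied from the Scan Details table (ISO 8601, UTC offset +00:00).
last_scan = datetime.fromisoformat("2024-07-10T10:04:45+00:00")
next_scan = datetime.fromisoformat("2024-10-08T10:04:45+00:00")

# Difference between the scheduled next scan and the last scan attempt.
interval = next_scan - last_scan

print(interval.days)     # whole days between the two scans
print(interval.seconds)  # leftover seconds beyond whole days
```

With these values the difference is a whole number of days (90), since the time of day is identical in both fields.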
Last Successful Scan
Field | Value |
---|---|
Scanned | 2024-03-13T08:51:42+00:00 |
URL | https://hg.org/robots.txt |
Redirect | https://www.hg.org/robots.txt |
Redirect Domain | www.hg.org |
Redirect Base | hg.org |
Domain IPs | 172.66.40.113, 172.66.43.143, 2606:4700:3108::ac42:2871, 2606:4700:3108::ac42:2b8f |
Redirect IPs | 172.66.40.113, 172.66.43.143, 2606:4700:3108::ac42:2871, 2606:4700:3108::ac42:2b8f |
Response IP | 172.66.40.113 |
Found | Yes |
Hash | 06c576e3fe70c1d6d67d6b4783b303546e89f0ce10e42a63502e2dcbe48a2335 |
SimHash | 28054742e4d3 |
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /cgi-bin/ |
Disallow | /_adm20hg_45/ |
Disallow | /cdn-cgi/ |
Disallow | /files/invoices/ |
Other Records
Field | Value |
---|---|
sitemap | https://www.hg.org/hgsitemap/sitemap.xml |
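The group and sitemap records above can be read back as a robots.txt file and checked with Python's standard `urllib.robotparser`. This is a minimal sketch, assuming the four rules sit in the listed order under a single `User-agent: *` group. The report does not show the original file's exact bytes, comments, or spacing, so `ROBOTS_TXT` is a reconstruction, and it should not be expected to reproduce the Hash value listed earlier. `ExampleBot` and the two test URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstruction of the scanned file from the Groups and Other Records tables.
# Assumption: the real file may differ in ordering, comments, or whitespace.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /_adm20hg_45/
Disallow: /cdn-cgi/
Disallow: /files/invoices/

Sitemap: https://www.hg.org/hgsitemap/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical URLs: one under a disallowed prefix, one outside all of them.
blocked = parser.can_fetch("ExampleBot", "https://www.hg.org/files/invoices/1.pdf")
allowed = parser.can_fetch("ExampleBot", "https://www.hg.org/about.html")

# site_maps() requires Python 3.8 or later.
sitemaps = parser.site_maps()

print(blocked, allowed, sitemaps)
```

Because every rule is a plain path prefix with no wildcards, the standard-library parser's prefix matching is enough here. Files that use `*` or `$` patterns need a parser that follows the fuller matching rules.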