mygetahead.org
robots.txt

Robots Exclusion Standard data for mygetahead.org

Resource Scan

Scan Details

Site Domain mygetahead.org
Base Domain mygetahead.org
Scan Status Ok
Last Scan 2025-10-18T02:37:07+00:00
Next Scan 2025-11-01T02:37:07+00:00

Last Scan

Scanned 2025-10-18T02:37:07+00:00
URL https://mygetahead.org/robots.txt
Domain IPs 109.228.40.216
Response IP 109.228.40.216
Found Yes
Hash e0236a9b181c3d4b8cb4a5ae93f83729caa43ef43259e11743b89d1f374880bc
SimHash 39155c11c701

Groups

*

Rule Path
Disallow /admin/
Disallow /bin/
Disallow /Connections/
Allow /i/
Disallow /inc/
Disallow /docs/
Disallow /*.pdf$
Disallow /*.doc$
Disallow /*.xls$
Disallow /*.docx$
Allow /inc/gallery/
Allow /i/photos/Gallery/
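
The group above mixes broad Disallow rules with narrower Allow rules (for example /inc/ versus /inc/gallery/), so the outcome for a given path depends on which rule takes precedence. The following is a minimal sketch, assuming the longest-match precedence described in RFC 9309 (the matching rule with the longest path pattern wins, and Allow wins a tie) together with the common `*` and `$` wildcard extensions. The names RULES, pattern_to_regex, and is_allowed are hypothetical helpers written for this illustration; individual crawlers may evaluate the rules differently.

```python
import re

# Rules copied from the "*" group in the scan above, in the order listed.
RULES = [
    ("disallow", "/admin/"),
    ("disallow", "/bin/"),
    ("disallow", "/Connections/"),
    ("allow", "/i/"),
    ("disallow", "/inc/"),
    ("disallow", "/docs/"),
    ("disallow", "/*.pdf$"),
    ("disallow", "/*.doc$"),
    ("disallow", "/*.xls$"),
    ("disallow", "/*.docx$"),
    ("allow", "/inc/gallery/"),
    ("allow", "/i/photos/Gallery/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the
    pattern to the end of the path. Everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence.

    Assumption: pattern length in characters stands in for specificity,
    and a path that matches no rule is allowed.
    """
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, as robots.txt requires.
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len, best_allow = len(pattern), allow
    return best_allow


if __name__ == "__main__":
    for p in ["/inc/header.php", "/inc/gallery/a.jpg",
              "/docs/report.pdf", "/i/photos/Gallery/flyer.pdf"]:
        print(p, "allowed" if is_allowed(p) else "disallowed")
```

Under these assumptions, /inc/gallery/ and /i/photos/Gallery/ stay crawlable even though /inc/ is disallowed and PDF files are blocked elsewhere, because the Allow patterns are longer than the competing Disallow patterns. Matching is case-sensitive, so a path beginning with /connections/ is not covered by the /Connections/ rule.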

Other Records

Field Value
sitemap https://getahead.greenhousecms.co.uk/sitemap.xml