gcu.edu
robots.txt
Robots Exclusion Standard data for gcu.edu
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | gcu.edu |
Base Domain | gcu.edu |
Scan Status | Ok |
Last Scan | 2025-09-25T09:05:03+00:00 |
Next Scan | 2025-10-25T09:05:03+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2025-09-25T09:05:03+00:00 |
URL | https://gcu.edu/robots.txt |
Redirect | https://www.gcu.edu/robots.txt |
Redirect Domain | www.gcu.edu |
Redirect Base | gcu.edu |
Domain IPs | 104.16.2.115, 104.16.3.115, 2606:4700::6810:273, 2606:4700::6810:373 |
Redirect IPs | 104.17.210.95, 104.17.211.95, 2606:4700::6811:d25f, 2606:4700::6811:d35f |
Response IP | 104.17.210.95 |
Found | Yes |
Hash | 84139d3c5dc59411c43a75120c047fc850f7e69b7419ddefbcc60b1a53c0b67d |
SimHash | 3816151a4564 |
Groups
User-agent: *
Rule | Path |
---|---|
Disallow | /core/ |
Disallow | /profiles/ |
Disallow | /modules/ |
Disallow | /web.config |
Disallow | /admin/ |
Disallow | /comment/ |
Disallow | /filter/ |
Disallow | /search* |
Disallow | /user/ |
Disallow | /node/ |
Disallow | /cdn-cgi/ |
Disallow | /blog/author/*?page= |
Disallow | /blog/tag/*?page= |
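The rules above are path patterns, and several use the `*` wildcard. As a rough illustration of how a crawler might test a path against them, here is a minimal Python sketch. It assumes the matching conventions described in RFC 9309 (a rule matches from the start of the path, `*` matches any run of characters, and a trailing `$` anchors the end). The names `DISALLOW`, `rule_to_regex`, and `is_disallowed` are hypothetical, and the sample paths in the usage note are made up for illustration; they are not taken from the scan. Because this group contains only Disallow rules, the sketch does not implement Allow/Disallow precedence.

```python
import re

# Disallow paths copied from the "*" group in the table above.
DISALLOW = [
    "/core/", "/profiles/", "/modules/", "/web.config", "/admin/",
    "/comment/", "/filter/", "/search*", "/user/", "/node/", "/cdn-cgi/",
    "/blog/author/*?page=", "/blog/tag/*?page=",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    Assumed conventions: '*' matches any character sequence and a
    trailing '$' anchors the end of the path; everything else is literal.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    # Escape the literal pieces and join them with '.*' for each wildcard.
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))


PATTERNS = [rule_to_regex(rule) for rule in DISALLOW]


def is_disallowed(path: str) -> bool:
    """Return True if the path (including any query string) matches a rule."""
    return any(p.match(path) for p in PATTERNS)
```

With these rules, hypothetical paths such as `/admin/settings`, `/search?q=nursing`, and `/blog/author/jane?page=2` would be treated as disallowed, while `/blog/author/jane` (no `?page=` query) and `/administrator` (no match for the `/admin/` prefix) would not.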
Other Records
Field | Value |
---|---|
sitemap | https://www.gcu.edu/sitemap.xml |