gcu.ac.uk
robots.txt

Robots Exclusion Standard data for gcu.ac.uk

Resource Scan

Scan Details

Site Domain gcu.ac.uk
Base Domain gcu.ac.uk
Scan Status Ok
Last Scan 2024-10-20T16:11:23+00:00
Next Scan 2024-11-19T16:11:23+00:00

Last Scan

Scanned 2024-10-20T16:11:23+00:00
URL https://gcu.ac.uk/robots.txt
Redirect https://www.gcu.ac.uk/robots.txt
Redirect Domain www.gcu.ac.uk
Redirect Base gcu.ac.uk
Domain IPs 43.245.41.60
Redirect IPs 104.18.34.156, 172.64.153.100
Response IP 104.18.34.156
Found Yes
Hash fea4ea48125acc8bb6721fc3fbe00a47acc294b3b587ba389cb2ee9fbf7be7fc
SimHash 25a5e00c8ed3
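
The Hash value above is 64 hexadecimal characters, which matches the length of a SHA-256 digest. The report does not name the algorithm, so the sketch below only assumes SHA-256 over the raw response bytes; the scanner may hash a normalized form of the file instead, in which case the values will not agree. The function names are illustrative and not part of any scanner API.

```python
import hashlib

# Hash value as reported in the scan details above.
REPORTED_HASH = "fea4ea48125acc8bb6721fc3fbe00a47acc294b3b587ba389cb2ee9fbf7be7fc"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def is_unchanged(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a freshly fetched robots.txt body against the reported hash.

    Assumption: the report hashes the raw body with SHA-256.
    """
    return sha256_hex(body) == reported.lower()
```

Passing in the bytes of a freshly downloaded https://www.gcu.ac.uk/robots.txt would show whether the file still matches this scan, provided the assumption about the algorithm holds.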

Groups

*

Rule Path
Disallow /_designs/
Disallow /*?sq_content_src=
Disallow /*_recache
Disallow /*_edit
Disallow /*_admin
Disallow /*_login
Disallow /*_performance
Disallow /*_design
Disallow /*_web_services
Disallow /*_feeds
Disallow /search
Disallow /migration/
Disallow /squiz-test/
Disallow /digitaldesign/
Disallow /digital-design/
Disallow /training/
Disallow /gcu-test/
Disallow /gcutest/
Disallow /testing/
Disallow /brandhub/
Disallow /brand-hub/
Disallow /drafts/
Disallow /external-links/
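
Several of the rules above use the `*` wildcard, which the widely used robots.txt matching convention treats as "any sequence of characters", with every rule matched as a prefix of the URL path. The sketch below is a minimal illustration of that convention applied to this group, not the matcher any particular crawler uses. The helper names are hypothetical, and Allow rules and longest-match precedence are left out because this group has none.

```python
import re

# Disallow paths from the "*" group listed above.
DISALLOW = [
    "/_designs/", "/*?sq_content_src=", "/*_recache", "/*_edit",
    "/*_admin", "/*_login", "/*_performance", "/*_design",
    "/*_web_services", "/*_feeds", "/search", "/migration/",
    "/squiz-test/", "/digitaldesign/", "/digital-design/", "/training/",
    "/gcu-test/", "/gcutest/", "/testing/", "/brandhub/",
    "/brand-hub/", "/drafts/", "/external-links/",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a compiled regex.

    '*' matches any sequence of characters; a trailing '$' anchors the
    rule to the end of the path. Everything else is matched literally.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


def is_disallowed(path: str, rules=DISALLOW) -> bool:
    """Return True if any rule matches the path from its first character."""
    # re.match anchors at the start of the string, giving prefix semantics.
    return any(rule_to_regex(rule).match(path) for rule in rules)
```

Under these assumptions, `is_disallowed("/about/_admin")` and `is_disallowed("/search/results")` are both true, while `is_disallowed("/research")` is false because `/search` must match from the start of the path.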

Other Records

Field Value
sitemap https://www.gcu.ac.uk/sitemap.xml

Comments

  • Disallow some matrix defaults
  • GCU added
  • Sitemap