gci.com
robots.txt

Robots Exclusion Standard data for gci.com

Resource Scan

Scan Details

Site Domain gci.com
Base Domain gci.com
Scan Status Ok
Last Scan 2024-11-17T02:44:05+00:00
Next Scan 2024-12-01T02:44:05+00:00

Last Scan

Scanned 2024-11-17T02:44:05+00:00
URL https://gci.com/robots.txt
Domain IPs 40.125.104.202
Response IP 40.125.104.202
Found Yes
Hash a5f9145609e69286a5131f04a36dc790b58c4bc9d123a5415c0b5f2177939ccf
SimHash ecec9a8c8d97

Groups

*

Rule Path
Disallow /business/industries/government/soa
Disallow /soawireless
Disallow /~/media/files/gci/business/soa
Disallow /sitecore
Disallow /Sitecore
Disallow /sitecore_files
Disallow /App_Browsers
Disallow /App_config
Disallow /App_Data
Disallow /temp
Disallow /upload
Disallow /xsl
Disallow /beta-testers
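
To see how the rules in the `*` group would apply to concrete paths, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is a reconstruction from the rules listed in this report, not the file as served (the invalid and unknown lines noted under Warnings are not reproduced). The helper names `build_parser` and `is_allowed` and the sample paths are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group in this report; NOT the verbatim file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /business/industries/government/soa
Disallow: /soawireless
Disallow: /~/media/files/gci/business/soa
Disallow: /sitecore
Disallow: /Sitecore
Disallow: /sitecore_files
Disallow: /App_Browsers
Disallow: /App_config
Disallow: /App_Data
Disallow: /temp
Disallow: /upload
Disallow: /xsl
Disallow: /beta-testers
Sitemap: https://gci.com/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch `path` on gci.com under these rules."""
    return parser.can_fetch(agent, "https://gci.com" + path)


if __name__ == "__main__":
    parser = build_parser(ROBOTS_TXT)
    # Hypothetical paths chosen only to exercise the rules.
    for path in ("/business", "/sitecore/login", "/Sitecore", "/SITECORE", "/temporary"):
        verdict = "allowed" if is_allowed(parser, path) else "disallowed"
        print(f"{path}: {verdict}")
```

Matching in this parser is by case-sensitive path prefix, which is presumably why both `/sitecore` and `/Sitecore` are listed, and why a rule such as `/temp` also covers paths like `/temporary`.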

Other Records

Field Value
sitemap https://gci.com/sitemap.xml

Warnings

  • 1 invalid line.
  • `noindex` is not a known field.
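
Warnings of this kind can be reproduced with a small line-by-line check. The sketch below is not the scanner's actual logic: the `KNOWN_FIELDS` set, the `lint_robots` helper, and the sample input (including the `Noindex` value) are assumptions made for illustration, since the report does not show the offending lines.

```python
# Field names treated as recognized; an assumed set, the scanner's may differ.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap"}


def lint_robots(text: str) -> list[tuple[int, str]]:
    """Return (line number, message) pairs for lines that look problematic."""
    warnings = []
    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop comments and surrounding whitespace; skip blank lines.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        field, separator, _value = line.partition(":")
        if not separator:
            # No "field: value" shape at all.
            warnings.append((number, "invalid line"))
            continue
        name = field.strip().lower()
        if name not in KNOWN_FIELDS:
            warnings.append((number, f"`{name}` is not a known field"))
    return warnings


if __name__ == "__main__":
    # Hypothetical input; the real file's flagged lines are not in the report.
    sample = (
        "User-agent: *\n"
        "Disallow: /temp\n"
        "Noindex: /example-path\n"
        "this line has no separator\n"
        "Sitemap: https://gci.com/sitemap.xml\n"
    )
    for number, message in lint_robots(sample):
        print(f"line {number}: {message}")
```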