chcp.edu
robots.txt

Robots Exclusion Standard data for chcp.edu

Resource Scan

Scan Details

Site Domain chcp.edu
Base Domain chcp.edu
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2025-09-05T20:58:44+00:00
Next Scan 2025-12-04T20:58:44+00:00

Last Successful Scan

Scanned 2025-04-16T20:57:21+00:00
URL https://chcp.edu/robots.txt
Redirect https://www.chcp.edu/robots.txt
Redirect Domain www.chcp.edu
Redirect Base chcp.edu
Domain IPs 104.19.232.38, 104.19.233.38
Redirect IPs 104.19.232.38, 104.19.233.38
Response IP 104.19.233.38
Found Yes
Hash d3bf623c6186a90d1ee7211cfaf1bc810080f61550d8557e9039fb2e0a9d7601
SimHash 23290840a9b9
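
The Hash value above is 64 hexadecimal digits, which is consistent with a SHA-256 digest, although the report does not name the algorithm. Under that assumption, a fetched copy of the file could be compared with the recorded value roughly as follows. The function names are hypothetical, and the report does not say whether the scanner hashes the raw bytes or a normalized form, so a mismatch would not by itself prove the file changed.

```python
import hashlib

# Value copied from the "Hash" field of the last successful scan.
RECORDED_HASH = "d3bf623c6186a90d1ee7211cfaf1bc810080f61550d8557e9039fb2e0a9d7601"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_recorded_hash(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """Compare a fetched robots.txt body against the recorded digest.

    Assumption: the scanner hashed the raw response bytes with SHA-256.
    """
    return sha256_hex(body) == recorded.lower()
```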

Groups

*

Rule Path
Allow *******************.js
Allow *******************.css
Disallow /manager/
Disallow /connectors/
Disallow /core/
Disallow /search-results/
Disallow /sites/default/files/content/documents/
Disallow /assets/pdf/
Disallow *?ref=*
Disallow /token/
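
As a rough illustration of how a crawler might evaluate the Disallow rules in this group, the sketch below treats `*` as a wildcard and picks the longest matching pattern, as described in RFC 9309. It is not the scanner's own logic. The two Allow patterns are left out because their exact form is unclear in this report, and the `/programs` paths in the usage note are hypothetical.

```python
import re

# Disallow rules copied from the "*" group above (Allow rules omitted).
RULES = [
    ("disallow", "/manager/"),
    ("disallow", "/connectors/"),
    ("disallow", "/core/"),
    ("disallow", "/search-results/"),
    ("disallow", "/sites/default/files/content/documents/"),
    ("disallow", "/assets/pdf/"),
    ("disallow", "*?ref=*"),
    ("disallow", "/token/"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")  # a trailing "$" pins the end of the path
    if anchored:
        pattern = pattern[:-1]
    pattern = re.sub(r"\*+", "*", pattern)  # collapse runs of "*"
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=RULES) -> bool:
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):  # match() anchors at the start
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed
```

With these rules, `is_allowed("/manager/index.php")` and `is_allowed("/programs?ref=home")` return `False`, while `is_allowed("/programs/")` returns `True`.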

mj12bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5

Other Records

Field Value
sitemap https://www.chcp.edu/sitemap.xml
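
The crawl-delay and sitemap records can be read with Python's standard `urllib.robotparser`. The robots.txt text below is a partial reconstruction from this report, not the original file: it assumes the crawl-delay belongs to the mj12bot group (as the layout above suggests) and keeps only two of the Disallow lines. Note that this parser matches plain path prefixes and does not interpret `*` wildcards.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction from the report above; NOT the original file.
RECONSTRUCTED = """\
User-agent: *
Disallow: /manager/
Disallow: /token/

User-agent: mj12bot
Crawl-delay: 5

Sitemap: https://www.chcp.edu/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RECONSTRUCTED.splitlines())

# Delay requested from mj12bot, in seconds (None if no delay is set).
mj12_delay = parser.crawl_delay("mj12bot")

# Sitemap URLs declared in the file (requires Python 3.8+).
sitemaps = parser.site_maps()

# A group with no Allow/Disallow rules leaves every path open to that bot.
mj12_can_fetch_manager = parser.can_fetch("mj12bot", "https://www.chcp.edu/manager/")

# Other bots fall back to the "*" group.
other_can_fetch_manager = parser.can_fetch("examplebot", "https://www.chcp.edu/manager/")
```

Here `examplebot` is a made-up user agent used only to show the fallback to the `*` group.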

Comments

  • For sitemap.xml autodiscovery.
  • Crawl Rates for Search/AI Bots (that respect it) added by Jay Gilmore (MODX)