

clearchain.com
robots.txt

Robots Exclusion Standard data for clearchain.com

Scan Details

Site Domain	clearchain.com
Base Domain	clearchain.com
Scan Status	Ok
Last Scan	2026-01-10T14:16:44+00:00
Next Scan	2026-01-17T14:16:44+00:00

Last Scan

Scanned	2026-01-10T14:16:44+00:00
URL	https://clearchain.com/robots.txt
Domain IPs	104.21.25.249, 172.67.134.242
Response IP	104.21.25.249
Found	Yes
Hash	76d4231390781332b21b80cc2b1d3d13473deb6e67df5c3c35e4dbc1c23341a3
SimHash	7a723704ec57

Groups

User-agent: *

Rule	Path
Disallow	/mailman/
Disallow	/pipermail/
Disallow	/~benjsc/temp

Other Records

Field	Value
crawl-delay	0.5

User-agent: wget

Rule	Path
Disallow	/

User-agent: *

Rule	Path
Disallow	/blog/wp-admin/
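
The groups listed above can be read as per-user-agent path-prefix rules. The following is a minimal sketch of that reading, not the scanner's or any crawler's actual implementation. It assumes the two `*` groups are merged into one, that user-agent tokens match case-insensitively as substrings, and that a path is blocked when it starts with a listed prefix. The names `GROUPS`, `select_group`, and `is_allowed` are hypothetical.

```python
# Disallow prefixes taken from the scan above.
# Assumption: the two "*" groups are merged into a single rule list.
GROUPS = {
    "*": ["/mailman/", "/pipermail/", "/~benjsc/temp", "/blog/wp-admin/"],
    "wget": ["/"],
}


def select_group(user_agent: str) -> str:
    """Return the most specific group token for a user agent, else "*"."""
    ua = user_agent.lower()
    for token in GROUPS:
        # Assumption: case-insensitive substring match on the product token.
        if token != "*" and token in ua:
            return token
    return "*"


def is_allowed(user_agent: str, path: str) -> bool:
    """True if no Disallow prefix in the selected group matches the path."""
    rules = GROUPS[select_group(user_agent)]
    return not any(path.startswith(prefix) for prefix in rules)
```

Under these assumptions, any client identifying as wget is blocked from every path, while other clients are blocked only from the four listed prefixes. Because matching is by prefix, `/~benjsc/temp` also covers paths such as `/~benjsc/temporary`.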


Comments

$ClearChain: www/data/robots.txt,v 1.2 2004/02/06 02:30:47 benjsc Exp $

This file aids in providing web crawling software with restrictions on
the content they should index

Sorry, wget in its recursive mode is a frequent problem.
Please read the man page and use it properly; there is a
--wait option you can use to set the delay between hits,
for instance.

Wiki Requests
Don't index non article wiki pages
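
The wget comment and the crawl-delay record both ask clients to space out their requests. A minimal sketch of that pacing follows, assuming the crawl-delay value of 0.5 is in seconds (the record carries no unit). The `paced` helper and the paths in the usage example are hypothetical, and no network requests are made.

```python
import time


def paced(items, delay_seconds=0.5, sleep=time.sleep):
    """Yield items one by one, pausing delay_seconds between consecutive items.

    The sleep function is injectable so the pacing can be checked without waiting.
    """
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay_seconds)
        yield item


if __name__ == "__main__":
    # Hypothetical paths; a real client would fetch each one here.
    for path in paced(["/", "/blog/", "/about/"]):
        print("would fetch", path)
```

No pause is taken before the first item, so a single request incurs no delay.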
