cls.cn
robots.txt

Robots Exclusion Standard data for cls.cn

Resource Scan

Scan Details

Site Domain	cls.cn
Base Domain	cls.cn
Scan Status	Ok
Last Scan	2024-10-29T01:25:34+00:00
Next Scan	2024-11-28T01:25:34+00:00

Last Scan

Scanned	2024-10-29T01:25:34+00:00
URL	https://www.cls.cn/robots.txt
Domain IPs	103.143.19.17, 240e:940:e009:1e0::18f
Response IP	103.143.19.17
Found	Yes
Hash	fb65f0409dcd8486d15dc87eb42220657fb6c102dd6ff40d12c913aaf543fceb
SimHash	a838d4c00990
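
A minimal sketch of how a content fingerprint like the Hash field could be computed for comparison between scans. The 64-hex-digit length is consistent with SHA-256, but the report does not state which algorithm the scanner uses, so that choice is an assumption; `body_digest` is a hypothetical helper name.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body.

    Assumption: SHA-256 over the raw bytes. The scanner's actual
    algorithm and any normalisation it applies are not documented here.
    """
    return hashlib.sha256(body).hexdigest()


# Two scans with identical bodies yield identical digests, so a changed
# digest signals that the file content changed between scans.
previous = body_digest(b"User-agent: *\nDisallow: /static/\n")
current = body_digest(b"User-agent: *\nDisallow: /static/\n")
unchanged = previous == current
```

Because the exact bytes hashed by the scanner are unknown, a digest computed this way is not guaranteed to reproduce the value listed above.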

Groups

*

Rule	Path
Disallow	/hwwebscan_verify.html
Disallow	/static/
Disallow	.jpg$
Disallow	.jpeg$
Disallow	.gif$
Disallow	.png$
Disallow	.bmp$
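
The rules above can be checked against sample paths with a small matcher. This is a sketch under stated assumptions, not the behaviour of any particular crawler: it anchors each pattern at the start of the URL path, treats `*` as a wildcard and a trailing `$` as an end anchor. The names `rule_matches` and `is_disallowed` and the sample paths are hypothetical. Under this strict reading, the extension rules such as `.jpg$` match no path, because every path begins with `/`; a form like `/*.jpg$` would match, though whether that was the intent is a guess.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/hwwebscan_verify.html",
    "/static/",
    ".jpg$",
    ".jpeg$",
    ".gif$",
    ".png$",
    ".bmp$",
]


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path.

    Assumed semantics: match from the start of the path, '*' matches
    any run of characters, a trailing '$' anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and join them with ".*" for each wildcard.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    regex = "^" + regex + ("$" if anchored else "")
    return re.match(regex, path) is not None


def is_disallowed(path: str) -> bool:
    """Return True if any listed Disallow pattern matches the path."""
    return any(rule_matches(p, path) for p in DISALLOW)


static_blocked = is_disallowed("/static/app.js")       # prefix rule applies
image_blocked = is_disallowed("/images/logo.jpg")      # ".jpg$" lacks a leading "/"
wildcard_hit = rule_matches("/*.jpg$", "/images/logo.jpg")
```

Real crawlers differ in how they treat patterns that lack a leading `/`, so the result for image URLs may vary between implementations.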

Other Records

Field	Value
sitemap	https://cls.cn/map.xml

Comments

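User-agent:* // Specifies which spider the rules apply to; '*' stands for all search engines
Disallow: / (forbids spiders from crawling every directory of the site; "/" denotes the root directory)
Allow: (defines the pages or subdirectories that spiders are allowed to crawl)
Sitemap: tells spiders where the XML sitemap is located.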
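
The directives described in these comments can be exercised with Python's standard `urllib.robotparser`. The sketch below feeds it a reconstructed file built from the records listed in this report (an assumption, since the original file text is not reproduced here) and leaves out the `$` patterns, which this parser treats as plain prefixes rather than end anchors. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups and Other Records sections of this report;
# the real file's exact text and line order are not known.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /hwwebscan_verify.html",
    "Disallow: /static/",
    "Sitemap: https://cls.cn/map.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# Prefix rules: anything under /static/ is blocked for every user agent.
static_allowed = parser.can_fetch("*", "https://www.cls.cn/static/app.js")
home_allowed = parser.can_fetch("*", "https://www.cls.cn/")

# Sitemap records are collected separately from the rule groups.
sitemaps = parser.site_maps()
```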

Warnings

1 invalid line.
