hsxcsl.com
robots.txt

Robots Exclusion Standard data for hsxcsl.com

Resource Scan

Scan Details

Site Domain	hsxcsl.com
Base Domain	hsxcsl.com
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Couldn't connect to server.
Last Scan	2026-02-08T19:00:50+00:00
Next Scan	2026-02-15T19:00:50+00:00

Last Successful Scan

Scanned	2026-01-08T18:11:03+00:00
URL	http://hsxcsl.com/robots.txt
Domain IPs	183.136.138.172
Response IP	183.136.138.172
Found	Yes
Hash	7a6ba79b4d94c10f7193832c5f7fe384a1fca15a210222bf4ef79450d366f245
SimHash	331e6e62d2a6
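
The `Hash` value is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. A minimal sketch, assuming SHA-256 over the raw response bytes; the function name `body_digest` and the sample body are hypothetical and are not the actual file served by hsxcsl.com:

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical body used only for illustration.
sample = b"User-agent: *\nAllow: /\n"
digest = body_digest(sample)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

Comparing such a digest between two scans is one way to detect that the file changed byte for byte; the shorter `SimHash` field presumably serves near-duplicate detection instead.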

Groups

*

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	2

gptbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

openai

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

google-ai

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

anthropic

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

meta-ai

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

ai21

Rule	Path
Disallow	/

amazonbot

Rule	Path
Disallow	/

applebot

Rule	Path
Disallow	/

petalbot

Rule	Path
Disallow	/

bytespider

Product	Comment
bytespider	TikTok

Rule	Path
Disallow	/
Disallow	/admin/
Disallow	/private/
Disallow	/api/
Disallow	/ajax/
Disallow	/user-data/

Other Records

Field	Value
sitemap	https://hsxcsl.com/sitemap.xml
sitemap	https://hsxcsl.com/news-sitemap.xml
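
To illustrate how a crawler might evaluate the groups and records listed above, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is a reconstructed excerpt based on this report, not the original file: its layout, casing, and the choice of which groups to include are assumptions, and `SomeCrawler` and the helper `allowed` are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (assumption): only a few of the reported groups,
# written in a conventional layout. The real file is not shown in the report.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Crawl-delay: 2

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Bytespider
Disallow: /
Disallow: /admin/
Disallow: /private/

Sitemap: https://hsxcsl.com/sitemap.xml
Sitemap: https://hsxcsl.com/news-sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, "https://hsxcsl.com" + path)


print(allowed("GPTBot", "/"))              # False: its group disallows everything
print(allowed("SomeCrawler", "/admin/"))   # True: falls back to the * group
print(parser.crawl_delay("SomeCrawler"))   # 2, taken from the * group
print(parser.site_maps())                  # the two sitemap URLs (Python 3.8+)
```

Under this reading, the directory rules in the bytespider group are redundant, since `Disallow: /` already covers every path for that agent, and agents without a group of their own remain free to fetch `/admin/` and the other listed directories.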

Comments

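Allow all search engine crawlers to access public content
======== Block AI training crawlers ========
OpenAI
Google AI
Anthropic (Claude)
Common Crawl
Facebook/Meta AI
Other AI/data-collection crawlers
======== Directory restrictions ========
Optional: restrict specific directories
Sitemaps
Additional directives
Advise crawlers not to cache pages
Restrict use for AI training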

Warnings

`cache-control` is not a known field.
`host` is not a known field.
`x-robots-tag` is not a known field.
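
The warnings above come from field names the scanner does not recognize. A rough sketch of such a check follows, assuming the recognized set is `user-agent`, `allow`, `disallow`, `crawl-delay`, and `sitemap`; the scanner's real list is not given in this report. The function name `unknown_fields` and the values in the sample lines are hypothetical, since the report shows only the field names.

```python
# Assumed set of recognized fields; the scanner's actual list is not documented here.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}


def unknown_fields(text: str) -> list[str]:
    """Return unrecognized field names in order of first appearance."""
    found: list[str] = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue  # blank line or not a "field: value" record
        field = line.split(":", 1)[0].strip().lower()
        if field and field not in KNOWN_FIELDS and field not in found:
            found.append(field)
    return found


# Hypothetical lines: the field names match the warnings, the values are invented.
sample = """\
User-agent: *
Allow: /
Cache-control: no-cache
Host: hsxcsl.com
X-Robots-Tag: noai
"""

print(unknown_fields(sample))  # ['cache-control', 'host', 'x-robots-tag']
```

`Cache-Control` and `X-Robots-Tag` are HTTP response headers rather than robots.txt fields, so crawlers that follow the Robots Exclusion Protocol are expected to ignore such lines.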
