hsxcsl.com
robots.txt

Robots Exclusion Standard data for hsxcsl.com

Resource Scan

Scan Details

Site Domain hsxcsl.com
Base Domain hsxcsl.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2026-02-08T19:00:50+00:00
Next Scan 2026-02-15T19:00:50+00:00

Last Successful Scan

Scanned 2026-01-08T18:11:03+00:00
URL http://hsxcsl.com/robots.txt
Domain IPs 183.136.138.172
Response IP 183.136.138.172
Found Yes
Hash 7a6ba79b4d94c10f7193832c5f7fe384a1fca15a210222bf4ef79450d366f245
SimHash 331e6e62d2a6

Groups

*

Rule Path
Allow /

Other Records

Field Value
crawl-delay 2

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

openai

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

google-ai

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

anthropic

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

meta-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

ai21

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

applebot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

bytespider

Product Comment
bytespider TikTok
Rule Path
Disallow /
Disallow /admin/
Disallow /private/
Disallow /api/
Disallow /ajax/
Disallow /user-data/

Other Records

Field Value
sitemap https://hsxcsl.com/sitemap.xml
sitemap https://hsxcsl.com/news-sitemap.xml
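To see how a crawler might interpret the groups listed above, the following is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is an assumed, partial reconstruction from this report (only a few groups are included, and the original line order and capitalization are not known), and `ExampleSearchBot` is a hypothetical user agent used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed, partial reconstruction of the scanned file; the real file's
# line order and the remaining groups are not reproduced here.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Crawl-delay: 2

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Bytespider
Disallow: /
Disallow: /admin/

Sitemap: https://hsxcsl.com/sitemap.xml
Sitemap: https://hsxcsl.com/news-sitemap.xml
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

home = "http://hsxcsl.com/"

# A crawler with no dedicated group falls back to the "*" group.
generic_allowed = rp.can_fetch("ExampleSearchBot", home)  # hypothetical agent
generic_delay = rp.crawl_delay("ExampleSearchBot")

# Crawlers with their own group are matched by name (case-insensitive).
gptbot_allowed = rp.can_fetch("GPTBot", home)
claudebot_allowed = rp.can_fetch("ClaudeBot", home)
bytespider_admin = rp.can_fetch("Bytespider", "http://hsxcsl.com/admin/")

# Sitemap records are global and not tied to any group.
sitemaps = rp.site_maps()

print(generic_allowed, generic_delay, gptbot_allowed, claudebot_allowed)
print(sitemaps)
```

Because the `bytespider` group already disallows `/`, its additional directory rules (`/admin/`, `/private/`, and so on) do not change the outcome for that crawler under this parser.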

Comments

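  • Allow all search engine crawlers to access public content
  • ======== Block AI training crawlers ========
  • OpenAI
  • Google AI
  • Anthropic (Claude)
  • Common Crawl
  • Facebook/Meta AI
  • Other AI/data-collection crawlers
  • ======== Directory restrictions ========
  • Optional: restrict specific directories
  • Sitemap
  • Additional directives
  • Advise crawlers not to cache pages
  • Restrict use for AI training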

Warnings

  • `cache-control` is not a known field.
  • `host` is not a known field.
  • `x-robots-tag` is not a known field.
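
Warnings of this kind typically arise when HTTP header names are written as robots.txt fields. The following is a rough sketch of how such a check might work, not the scanner's actual logic. The `KNOWN_FIELDS` set and the field values in `sample` are assumptions, since the report does not show the values used in the site's file.

```python
# Field names this sketch treats as recognized (an assumption, not the
# scanner's actual list).
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def unknown_fields(robots_txt: str) -> list[str]:
    """Return unrecognized field names, lowercased, in first-seen order."""
    seen = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue  # blank or malformed line
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.append(field)
    return seen


# Hypothetical lines; the actual values in the site's file are not shown
# in this report.
sample = """\
User-agent: *
Allow: /
Cache-Control: no-cache
Host: hsxcsl.com
X-Robots-Tag: noai
Sitemap: https://hsxcsl.com/sitemap.xml
"""

for name in unknown_fields(sample):
    print(f"`{name}` is not a known field.")
```

`Cache-Control` and `X-Robots-Tag` are HTTP response headers, so a site that wants them honored would normally send them in its server responses rather than list them in robots.txt.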