huaweicloud.cn
robots.txt

Robots Exclusion Standard data for huaweicloud.cn

Resource Scan

Scan Details

Site Domain huaweicloud.cn
Base Domain huaweicloud.cn
Scan Status Ok
Last Scan 2025-09-10T03:13:16+00:00
Next Scan 2025-10-10T03:13:16+00:00

Last Scan

Scanned 2025-09-10T03:13:16+00:00
URL https://www.huaweicloud.cn/robots.txt
Redirect https://www.huaweicloud.com/robots.txt
Redirect Domain www.huaweicloud.com
Redirect Base huaweicloud.com
Domain IPs 240d:c010:15c:1::158, 43.159.108.144
Redirect IPs 129.227.87.58, 129.227.87.61, 129.227.87.63, 23.251.120.94
Response IP 129.227.87.61
Found Yes
Hash ef988fc36f80c24d332d3bb3f7a02ce688073fcf118078ad66ea026644911493
SimHash ff194b1181c0
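
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. Under that assumption, a minimal Python sketch for detecting a change between scans might look like this (the function names `content_digest` and `has_changed` are hypothetical, not part of any scanner API):

```python
import hashlib

# Digest recorded in the scan report above.
# Assumption: it is the SHA-256 of the raw robots.txt response body.
RECORDED_HASH = "ef988fc36f80c24d332d3bb3f7a02ce688073fcf118078ad66ea026644911493"


def content_digest(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, recorded_hash: str = RECORDED_HASH) -> bool:
    """Report whether the body's digest differs from the recorded one."""
    return content_digest(body).lower() != recorded_hash.lower()
```

If the scanner hashes a normalized form of the file rather than the raw bytes, the digests will not match even for an unchanged file, so a mismatch alone does not prove the rules changed.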

Groups

*

Rule Path
Disallow /common/
Disallow /tips/
Disallow /test/
Disallow /activity/share/
Disallow /s/

youbot

Rule Path
Disallow /guide/

webzio-extended

Rule Path
Disallow /guide/

velenpublicwebcrawler

Rule Path
Disallow /guide/

turnitinbot

Rule Path
Disallow /guide/

timpibot

Rule Path
Disallow /guide/

taragroup intelligent bot

Rule Path
Disallow /guide/

sidetrade indexer bot

Rule Path
Disallow /guide/

semrushbot-swa

Rule Path
Disallow /guide/

semrushbot-ocob

Rule Path
Disallow /guide/

seekrbot

Rule Path
Disallow /guide/

scrapy

Rule Path
Disallow /guide/

quora-bot

Rule Path
Disallow /guide/

quillbot.com

Rule Path
Disallow /guide/

poseidon research crawler

Rule Path
Disallow /guide/

piplbot

Rule Path
Disallow /guide/

petalbot

Rule Path
Disallow /guide/

perplexitybot

Rule Path
Disallow /guide/

peer39_crawler/1.0

Rule Path
Disallow /guide/

peer39_crawler

Rule Path
Disallow /guide/

pangubot

Rule Path
Disallow /guide/

omgilibot

Rule Path
Disallow /guide/

omgili

Rule Path
Disallow /guide/

oai-searchbot

Rule Path
Disallow /guide/

news-please

Rule Path
Disallow /guide/

newsnow

Rule Path
Disallow /guide/

meta-externalfetcher

Rule Path
Disallow /guide/

meta-externalagent

Rule Path
Disallow /guide/

magpie-crawler

Rule Path
Disallow /guide/

kangaroo bot

Rule Path
Disallow /guide/

jetslide

Rule Path
Disallow /guide/

jamesbot

Rule Path
Disallow /guide/

isscyberriskcrawler

Rule Path
Disallow /guide/

img2dataset

Rule Path
Disallow /guide/

imagesiftbot

Rule Path
Disallow /guide/

icc-crawler

Rule Path
Disallow /guide/

iaskspider/2.0

Rule Path
Disallow /guide/

ia_archiver

Rule Path
Disallow /guide/

gptbot

Rule Path
Disallow /guide/

friendlycrawler

Rule Path
Disallow /guide/

facebookbot

Rule Path
Disallow /guide/

echoboxbot

Rule Path
Disallow /guide/

duckassistbot

Rule Path
Disallow /guide/

diffbot

Rule Path
Disallow /guide/

dataforseobot

Rule Path
Disallow /guide/

criteobot

Rule Path
Disallow /guide/

crawlspace

Rule Path
Disallow /guide/

cohere-training-data-crawler

Rule Path
Disallow /guide/

cohere-ai

Rule Path
Disallow /guide/

claude-web

Rule Path
Disallow /guide/

claudebot

Rule Path
Disallow /guide/

chatgpt-user

Rule Path
Disallow /guide/

ccbot

Rule Path
Disallow /guide/

bytespider

Rule Path
Disallow /guide/

brightbot 1.0

Rule Path
Disallow /guide/

bravebot

Rule Path
Disallow /guide/

awariosmartbot

Rule Path
Disallow /guide/

awariorssbot

Rule Path
Disallow /guide/

arquivo-web-crawler

Rule Path
Disallow /guide/

applebot-extended

Rule Path
Disallow /guide/

anthropic-ai

Rule Path
Disallow /guide/

amazonbot

Rule Path
Disallow /guide/

ai2bot-dolma

Rule Path
Disallow /guide/

ai2bot

Rule Path
Disallow /guide/

Comments

  • Search Engine Spider Rules
  • Disallow LLm Rules
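
The groups listed above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt text in `SAMPLE_ROBOTS` is a reconstructed subset of the groups in this report; the real file's exact casing, ordering, and comments are not shown here, so treat it as an assumption. `ExampleBot` is a hypothetical crawler name, and `build_parser` and `is_allowed` are illustrative helpers.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of part of the scanned file: the "*" group
# plus two of the named groups that disallow /guide/.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /common/
Disallow: /tips/
Disallow: /test/
Disallow: /activity/share/
Disallow: /s/

User-agent: gptbot
Disallow: /guide/

User-agent: claudebot
Disallow: /guide/
"""

# The scan reports that robots.txt is served from this host after the redirect.
BASE = "https://www.huaweicloud.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Check whether `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, BASE + path)


parser = build_parser(SAMPLE_ROBOTS)
```

With this parser, `is_allowed(parser, "ExampleBot", "/common/page.html")` is False and `is_allowed(parser, "GPTBot", "/guide/intro")` is False. A crawler that matches a named group is evaluated against that group only, so `is_allowed(parser, "GPTBot", "/common/page.html")` is True: the `*` rules apply only to crawlers without a group of their own.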