msn.cn
robots.txt

Robots Exclusion Standard data for msn.cn

Resource Scan

Scan Details

Site Domain msn.cn
Base Domain msn.cn
Scan Status Ok
Last Scan 2024-04-24T01:14:35+00:00
Next Scan 2024-05-01T01:14:35+00:00

Last Scan

Scanned 2024-04-24T01:14:35+00:00
URL https://www.msn.cn/robots.txt
Domain IPs 204.79.197.235
Response IP 204.79.197.235
Found Yes
Hash 0a645079a65728be0504383589a04f01eeea21e8b11f2c03111d1f822ddf0547
SimHash 4c94fbf1cf32

Groups

*

Rule Path
Disallow /*/health/search/filter
Disallow /spartan
Disallow /*preview%3D*
Disallow /*?item=*%3A
Disallow /*%26item%3D*%3A
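
The wildcard rules in the `*` group can be hard to read at a glance. The following is a minimal sketch of how such patterns are commonly interpreted, assuming `*` matches any run of characters, a trailing `$` anchors the end of the path, and everything else is a literal prefix match. The helper names (`pattern_to_regex`, `is_disallowed`) and the sample paths are hypothetical and are not taken from the scan. The sketch compares raw strings and does not normalize percent-encoding, which a real crawler may do.

```python
import re

# Disallow patterns copied from the "*" group above.
STAR_GROUP = [
    "/*/health/search/filter",
    "/spartan",
    "/*preview%3D*",
    "/*?item=*%3A",
    "/*%26item%3D*%3A",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path; all other characters are literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns: list[str]) -> bool:
    """Return True if any pattern matches the path from its start."""
    return any(pattern_to_regex(p).match(path) for p in patterns)


# Hypothetical sample paths, for illustration only.
samples = [
    "/en-us/health/search/filter",
    "/spartan/ntp",
    "/zh-cn/news?item=flights%3Aabc",
    "/zh-cn/news",
]
for path in samples:
    print(path, is_disallowed(path, STAR_GROUP))
```

Only Disallow rules are modeled here. A group that mixes Allow and Disallow, such as adsbot-google, additionally needs a precedence rule (typically the longest matching pattern wins), which this sketch leaves out.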

adsbot-google

Rule Path
Allow /
Disallow /*/health/search/filter

verity

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

ias_crawler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 0.5

yandex.com/bot

Rule Path
Disallow /*/weather/
Disallow /*/clima/
Disallow /*/cuaca/
Disallow /*/eltiempo/
Disallow /*/el-tiempo/
Disallow /*/havadurumu/
Disallow /*/idojaras/
Disallow /*/meteo/
Disallow /*/meteorologia/
Disallow /*/pocasi/
Disallow /*/pogoda/
Disallow /*/saa/
Disallow /*/vader/
Disallow /*/vejr/
Disallow /*/weer/
Disallow /*/wetter/

Other Records

Field Value
crawl-delay 1

amazonbot
anthropic-ai
applebot
ccbot
chatgpt-user
claude-web
cohere-ai
facebookbot
google-extended
gptbot
omgili
omgilibot
perplexitybot
twitterbot
youbot

Rule Path
Disallow /
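
As a rough check of what the last group means in practice, the sketch below rebuilds two of the groups as robots.txt text and queries them with Python's standard `urllib.robotparser`. The `RECONSTRUCTED` text is an assumption inferred from the tables above, not the literal file served at https://www.msn.cn/robots.txt; the real file may differ in casing, ordering, and comments.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of two groups from the scan tables above.
# This is NOT the verbatim robots.txt content.
RECONSTRUCTED = """\
User-agent: verity
Crawl-delay: 3

User-agent: amazonbot
User-agent: anthropic-ai
User-agent: applebot
User-agent: ccbot
User-agent: chatgpt-user
User-agent: claude-web
User-agent: cohere-ai
User-agent: facebookbot
User-agent: google-extended
User-agent: gptbot
User-agent: omgili
User-agent: omgilibot
User-agent: perplexitybot
User-agent: twitterbot
User-agent: youbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RECONSTRUCTED.splitlines())

# Every agent in the shared group is blocked from the whole site.
gptbot_allowed = parser.can_fetch("GPTBot", "https://www.msn.cn/")
ccbot_allowed = parser.can_fetch("CCBot", "https://www.msn.cn/zh-cn/news")

# verity has no path rules, only a crawl delay.
verity_allowed = parser.can_fetch("verity", "https://www.msn.cn/")
verity_delay = parser.crawl_delay("verity")

print(gptbot_allowed, ccbot_allowed, verity_allowed, verity_delay)
```

The standard-library parser matches rule paths as plain prefixes, so it suits the `Disallow /` group but would not interpret the wildcard patterns listed for the `*` and yandex.com/bot groups.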