Robots Exclusion Standard data for msn.com

Resource Scan

Scan Details

Site Domain msn.com
Base Domain msn.com
Scan Status Ok
Last Scan 2024-04-25T02:56:56+00:00
Next Scan 2024-05-02T02:56:56+00:00

Last Scan

Scanned 2024-04-25T02:56:56+00:00
URL https://msn.com/robots.txt
Redirect https://www.msn.com/robots.txt
Redirect Domain www.msn.com
Redirect Base msn.com
Domain IPs 204.79.197.219
Redirect IPs 204.79.197.203
Response IP 204.79.197.203
Found Yes
Hash 731b8ce01b568a7842e076cd7530b8c9586e1bf0122474b2a2790e3625f4ce89
SimHash 4c94db71c532
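
The Hash field above is 64 hexadecimal characters, which is consistent with a SHA-256 digest of the fetched robots.txt body. The report does not state the algorithm, so treat that as an assumption. Under that assumption, a scanner could detect changes between scans roughly like this (the function names are hypothetical):

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the report's Hash field is computed this way; the
    scan page itself does not say which algorithm it uses.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """True when the newly fetched body differs from the last scan."""
    return body_digest(body) != previous_digest


old = body_digest(b"User-agent: *\nDisallow: /spartan\n")
print(has_changed(old, b"User-agent: *\nDisallow: /spartan\n"))  # False
print(has_changed(old, b"User-agent: *\nDisallow: /\n"))         # True
```

An exact digest only signals that something changed; the shorter SimHash value is presumably there to measure how similar two versions are.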

Groups

*

Rule Path
Disallow /*/health/search/filter
Disallow /spartan
Disallow /pt-ao
Disallow /*preview%3D*
Disallow /*/autos/marketplace/product/*
Disallow /*/cars/marketplace/product/*
Disallow /*?item=*%3A
Disallow /*%26item%3D*%3A
Disallow /*/channel/source/
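
Several of the paths above use `*` as a wildcard. As a minimal sketch of how such patterns can be evaluated, the code below translates each pattern into a regular expression, treating `*` as "any run of characters" and a trailing `$` as an end anchor, in the manner described by RFC 9309. The helper names are hypothetical, percent-encoding normalization is skipped, and Allow rules are ignored because this group has none.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the
    end of the path. Everything else is matched literally as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_patterns: list[str]) -> bool:
    """True if any Disallow pattern matches the start of the path."""
    return any(robots_pattern_to_regex(p).match(path) for p in disallow_patterns)


# Disallow paths copied from the "*" group above.
STAR_GROUP = [
    "/*/health/search/filter",
    "/spartan",
    "/pt-ao",
    "/*preview%3D*",
    "/*/autos/marketplace/product/*",
    "/*/cars/marketplace/product/*",
    "/*?item=*%3A",
    "/*%26item%3D*%3A",
    "/*/channel/source/",
]

print(is_disallowed("/en-us/health/search/filter", STAR_GROUP))  # True
print(is_disallowed("/en-us/news", STAR_GROUP))                  # False
```

Because rules are prefix matches, `/spartan` also covers `/spartan/anything`, and `/pt-ao` blocks that whole locale subtree.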

adsbot-google

Rule Path
Allow /
Disallow /*/health/search/filter

verity

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 3

ias_crawler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 0.5

yandex.com/bot

Rule Path
Disallow /*/weather/
Disallow /*/clima/
Disallow /*/cuaca/
Disallow /*/eltiempo/
Disallow /*/el-tiempo/
Disallow /*/havadurumu/
Disallow /*/idojaras/
Disallow /*/meteo/
Disallow /*/meteorologia/
Disallow /*/pocasi/
Disallow /*/pogoda/
Disallow /*/saa/
Disallow /*/vader/
Disallow /*/vejr/
Disallow /*/weer/
Disallow /*/wetter/

Other Records

Field Value
crawl-delay 1
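
The crawl-delay values recorded so far (3 for verity, 0.5 for ias_crawler, 1 for yandex.com/bot) are commonly read as a minimum number of seconds between requests. That reading is an assumption: crawl-delay is a non-standard field and is not defined by RFC 9309, so crawlers interpret it differently or ignore it. A minimal sketch of a scheduler honoring it, with a hypothetical function name:

```python
# Crawl-delay values copied from the groups above, assumed to be seconds.
CRAWL_DELAYS = {
    "verity": 3.0,
    "ias_crawler": 0.5,
    "yandex.com/bot": 1.0,
}


def next_fetch_time(agent: str, last_fetch: float, delays=CRAWL_DELAYS) -> float:
    """Return the earliest timestamp at which `agent` may fetch again.

    Agents without a recorded crawl-delay get no enforced wait.
    """
    return last_fetch + delays.get(agent.lower(), 0.0)


print(next_fetch_time("ias_crawler", 100.0))  # 100.5
print(next_fetch_time("verity", 100.0))       # 103.0
```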

admantx-ussy04/3.2

Rule Path
Allow /

amazonbot
anthropic-ai
applebot
ccbot
chatgpt-user
claude-web
cohere-ai
facebookbot
google-extended
gptbot
omgili
omgilibot
perplexitybot
twitterbot
youbot

Rule Path
Disallow /
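
The group above applies one `Disallow /` rule to fifteen user agents at once. To illustrate how such a shared group behaves, the sketch below rebuilds it as robots.txt lines and checks it with Python's standard `urllib.robotparser`. The reconstructed layout is an assumption about the original file, only this one group is loaded, and `somecrawler` is a made-up agent name.

```python
from urllib.robotparser import RobotFileParser

# User agents copied from the shared group above.
BLOCKED_AGENTS = [
    "amazonbot", "anthropic-ai", "applebot", "ccbot", "chatgpt-user",
    "claude-web", "cohere-ai", "facebookbot", "google-extended", "gptbot",
    "omgili", "omgilibot", "perplexitybot", "twitterbot", "youbot",
]

# Consecutive User-agent lines followed by rules form a single group.
lines = [f"User-agent: {agent}" for agent in BLOCKED_AGENTS] + ["Disallow: /"]

parser = RobotFileParser()
parser.parse(lines)

url = "https://www.msn.com/en-us/news"
print(parser.can_fetch("gptbot", url))       # False
print(parser.can_fetch("somecrawler", url))  # True
```

In the full file an unlisted crawler would fall through to the `*` group rather than being allowed everywhere; the second result is `True` here only because that group was left out of the sketch.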

Other Records

Field Value
sitemap https://www.msn.com/sitemaps/health/health-sitemap-index.xml
sitemap https://www.msn.com/sitemaps/shopping/shopping-sitemap-index.xml
sitemap https://www.msn.com/en-us/autos/marketplace/sitemap.xml
sitemap https://www.msn.com/staticsb/statics/latest/0/casualgames/sitemaps/sitemap-index.xml