asahi.com
robots.txt

Robots Exclusion Standard data for asahi.com

Resource Scan

Scan Details

Site Domain asahi.com
Base Domain asahi.com
Scan Status Ok
Last Scan 2024-09-20T20:22:41+00:00
Next Scan 2024-09-27T20:22:41+00:00

Last Scan

Scanned 2024-09-20T20:22:41+00:00
URL https://asahi.com/robots.txt
Redirect https://www.asahi.com/robots.txt
Redirect Domain www.asahi.com
Redirect Base asahi.com
Domain IPs 118.215.85.237
Redirect IPs 118.215.85.237
Response IP 104.69.155.197
Found Yes
Hash fdc458eca67f1326aab7ced35e55fbd79c62a2cbbc09c3b2e04470e60b375a1f
SimHash 118cfb5581b1
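
The Hash value above is 64 hexadecimal characters, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. The sketch below assumes SHA-256 over the raw response body and shows how such a fingerprint could be used to detect that a robots.txt file has changed between scans. The function name `robots_fingerprint` and the sample body are hypothetical, not taken from the scan.

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return a hex digest of a robots.txt body (assumption: SHA-256)."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body; the real asahi.com file is not reproduced here.
sample = b"User-agent: *\nAllow: /\n"
digest = robots_fingerprint(sample)

# Any byte-level change, even a trailing newline, yields a different digest.
changed = robots_fingerprint(sample + b"\n")
print(len(digest), digest != changed)
```

A byte-exact digest like this only reports that something changed. A similarity hash such as the SimHash field would be the place to look for how much changed.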

Groups

*

Rule Path
Disallow /video/news/TKY200903050250.html
Disallow /kansai/news/OSK200903050055.html
Disallow /travel/event/search/
Disallow /science/index.html
Disallow /entertainment/index.html
Disallow /car/index.html
Disallow /housing/index.html
Disallow /showbiz/column/animagedon/index.html
Disallow /english/newsfeatures.html
Disallow /english/business.html
Disallow /english/cooljapan.html
Disallow /english/sports.html
Allow /
Allow /.well-known/assetlinks.json

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

icc-crawler

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

perplexity-ai

Rule Path
Disallow /
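
The groups above can be checked programmatically. The following is a minimal sketch, assuming the groups are re-serialized as robots.txt text (only a subset of the listed rules is reproduced) and evaluated with Python's standard `urllib.robotparser`. The `ExampleBot` agent name and the `/articles/example.html` path are hypothetical. Note that `urllib.robotparser` applies the first matching rule in file order, whereas RFC 9309 specifies the most specific (longest) match; for the paths checked here both approaches agree.

```python
from urllib.robotparser import RobotFileParser

# Subset of the groups listed in the scan, written back out as robots.txt.
ROBOTS_TXT = """\
User-agent: *
Disallow: /travel/event/search/
Disallow: /science/index.html
Allow: /
Allow: /.well-known/assetlinks.json

User-agent: gptbot
Disallow: /

User-agent: claudebot
Disallow: /
"""

BASE = "https://www.asahi.com"  # redirect target reported by the scan

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Report whether `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


# An agent without its own group falls back to the "*" group.
print(allowed("ExampleBot", "/articles/example.html"))
print(allowed("ExampleBot", "/science/index.html"))
# Agents with a dedicated group are blocked from the whole site.
print(allowed("gptbot", "/articles/example.html"))
```

The same pattern extends to the other named groups (ccbot, google-extended, perplexitybot and so on), each of which carries a single `Disallow /` rule.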

Other Records

Field Value
sitemap https://www.asahi.com/sitemap.xml