editorial.huffingtonpost.com
robots.txt

Robots Exclusion Standard data for editorial.huffingtonpost.com

Resource Scan

Scan Details

Site Domain editorial.huffingtonpost.com
Base Domain huffingtonpost.com
Scan Status Ok
Last Scan 2024-06-26T03:20:17+00:00
Next Scan 2024-07-03T03:20:17+00:00

Last Scan

Scanned 2024-06-26T03:20:17+00:00
URL https://editorial.huffingtonpost.com/robots.txt
Redirect https://www.huffpost.com/robots.txt
Redirect Domain www.huffpost.com
Redirect Base huffpost.com
Domain IPs 13.33.88.102, 13.33.88.20, 13.33.88.61, 13.33.88.85
Redirect IPs 151.101.130.114, 151.101.194.114, 151.101.2.114, 151.101.66.114
Response IP 199.232.46.114
Found Yes
Hash ba29a98dd9c5812c95e7cbbcd3ef5aab09dfc6df5d808a7bf2e2e55be122dae0
SimHash 4f3cdb618de0
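
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file. The scanner's actual hashing method is not stated, so the following is only a sketch under that assumption; `robots_fingerprint` and the sample body are hypothetical names and data, not part of the scan.

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return the hex SHA-256 digest of the raw robots.txt bytes.

    Assumption: the report's Hash field is a SHA-256 digest of the body.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the real file contents.
sample = b"User-agent: *\nDisallow: /member\n"
digest = robots_fingerprint(sample)

# Any change to the body, even one byte, yields a different digest,
# which is how a rescan could detect that the file was edited.
changed = robots_fingerprint(sample + b"\n")
print(len(digest), digest != changed)
```

Comparing the digest from one scan with the digest from the next is a cheap way to tell whether the rules need to be re-parsed.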

Groups

grapeshot

Rule Path
Disallow /member
Disallow /*?*err_code=404
Disallow /search
Disallow /search/?*

*

Rule Path
Disallow /*?*page=
Disallow /member
Disallow /*?*err_code=404
Disallow /search
Disallow /search/?*
Disallow /mapi/v4/*/user/*
Disallow /embed

Other Records

Field Value
crawl-delay 4
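
Several of the paths in the `*` group use the `*` wildcard. The sketch below shows one way such patterns can be evaluated, assuming the usual interpretation in which a rule matches from the start of the path-plus-query and `*` stands for any run of characters. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and the test paths are made up for illustration.

```python
import re

# Disallow paths copied from the "*" group in this report.
DISALLOW = [
    "/*?*page=",
    "/member",
    "/*?*err_code=404",
    "/search",
    "/search/?*",
    "/mapi/v4/*/user/*",
    "/embed",
]


def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' (not used in this file) anchors the end.
    Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path_and_query: str) -> bool:
    """True if any Disallow rule matches from the start of the target."""
    return any(rule_to_regex(rule).match(path_and_query) for rule in DISALLOW)


for target in ["/news?page=2", "/entry/some-story", "/mapi/v4/us/user/123"]:
    print(target, is_disallowed(target))
```

Note that `/member` is a prefix rule, so under this interpretation it also covers paths such as `/members-only`. The `crawl-delay 4` record is a separate request to wait between fetches and is not modeled here.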

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

googlebot

Rule Path
Allow /
Disallow /*?*err_code=404
Disallow /search
Disallow /search/?*
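
The googlebot group combines `Allow /` with more specific Disallow rules, so the outcome depends on rule precedence. The sketch below assumes the longest-match convention described in RFC 9309: the matching rule with the longest pattern wins, and Allow wins a tie. Whether a given crawler applies exactly this rule is an assumption, and `googlebot_may_fetch` is a hypothetical helper.

```python
import re

# Rules copied from the googlebot group in this report.
GOOGLEBOT_RULES = [
    ("allow", "/"),
    ("disallow", "/*?*err_code=404"),
    ("disallow", "/search"),
    ("disallow", "/search/?*"),
]


def _matches(pattern: str, target: str) -> bool:
    """Prefix match where '*' stands for any run of characters.

    '$' end anchors are not handled because no rule in this group uses one.
    """
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, target) is not None


def googlebot_may_fetch(target: str) -> bool:
    """Apply longest-match precedence; default to allowed if nothing matches."""
    best_len, allowed = -1, True
    for verdict, pattern in GOOGLEBOT_RULES:
        if not _matches(pattern, target):
            continue
        longer = len(pattern) > best_len
        tie_for_allow = len(pattern) == best_len and verdict == "allow"
        if longer or tie_for_allow:
            best_len, allowed = len(pattern), verdict == "allow"
    return allowed


for target in ["/entry/story", "/search?q=a", "/news?err_code=404"]:
    print(target, googlebot_may_fetch(target))
```

Under this reading, `Allow /` opens the whole site to googlebot, while the longer `/search` and `err_code=404` patterns still take precedence for the URLs they match.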

gptbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.huffpost.com/sitemaps/sitemap-v1.xml
sitemap https://www.huffpost.com/sitemaps/sitemap-google-news.xml
sitemap https://www.huffpost.com/sitemaps/sitemap-google-video.xml
sitemap https://www.huffpost.com/sitemaps/sections.xml
sitemap https://www.huffpost.com/sitemaps-huffingtonpost/sitemap.xml
sitemap https://www.huffpost.com/sitemaps-huffingtonpost/sections.xml
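
Sitemap records apply to the whole file rather than to a single user-agent group. As a minimal sketch, Python's standard `urllib.robotparser` can extract them (via `site_maps()`, available from Python 3.8) along with a crawl delay. The excerpt below is reconstructed from values in this report; the exact layout of the original file is an assumption.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt; field values come from this report,
# but the line layout of the real file is assumed.
EXCERPT = [
    "User-agent: *",
    "Crawl-delay: 4",
    "Disallow: /member",
    "",
    "Sitemap: https://www.huffpost.com/sitemaps/sitemap-v1.xml",
    "Sitemap: https://www.huffpost.com/sitemaps/sections.xml",
]

parser = RobotFileParser()
parser.parse(EXCERPT)  # parse in-memory lines; no network request is made

sitemaps = parser.site_maps()      # list of sitemap URLs, or None if absent
delay = parser.crawl_delay("*")    # crawl delay for the default group

print(sitemaps)
print(delay)
```

The standard-library parser does simple prefix matching and does not interpret `*` wildcards inside paths, so it is suitable for reading sitemap and crawl-delay records but not for evaluating the wildcard rules listed in the groups above.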

Comments

  • Cambria robots
  • archives
  • huffingtonpost.com archive sitemaps