huffpost.com.au
robots.txt

Robots Exclusion Standard data for huffpost.com.au

Resource Scan

Scan Details

Site Domain huffpost.com.au
Base Domain huffpost.com.au
Scan Status Ok
Last Scan 2024-11-02T03:56:10+00:00
Next Scan 2024-11-09T03:56:10+00:00

Last Scan

Scanned 2024-11-02T03:56:10+00:00
URL https://huffpost.com.au/robots.txt
Redirect https://www.huffpost.com/robots.txt
Redirect Domain www.huffpost.com
Redirect Base huffpost.com
Domain IPs 3.165.82.122, 3.165.82.57, 3.165.82.84, 3.165.82.98
Redirect IPs 151.101.130.114, 151.101.194.114, 151.101.2.114, 151.101.66.114
Response IP 199.232.46.114
Found Yes
Hash ba29a98dd9c5812c95e7cbbcd3ef5aab09dfc6df5d808a7bf2e2e55be122dae0
SimHash 4f3cdb618de0
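
The Hash value above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file. The scan does not state which algorithm it uses, so the following is only a sketch under that assumption; the function name `body_fingerprint` is hypothetical and the SimHash value is not reproduced here.

```python
import hashlib


def body_fingerprint(body: bytes) -> str:
    """Return a SHA-256 hex digest of a robots.txt body.

    Assumption: the scanner hashes the raw response bytes with SHA-256.
    The report only shows a 64-hex-digit value, not the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored fingerprint with a freshly fetched body."""
    return previous_hash != body_fingerprint(body)
```

Comparing the stored digest with the digest of a new fetch is one simple way to tell whether the file changed between the Last Scan and Next Scan times.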

Groups

grapeshot

Rule Path
Disallow /member
Disallow /*?*err_code=404
Disallow /search
Disallow /search/?*

*

Rule Path
Disallow /*?*page=
Disallow /member
Disallow /*?*err_code=404
Disallow /search
Disallow /search/?*
Disallow /mapi/v4/*/user/*
Disallow /embed

Other Records

Field Value
crawl-delay 4
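
Several paths in the `*` group use the `*` wildcard, which Python's standard `urllib.robotparser` does not interpret. As a minimal sketch of how these patterns could be evaluated, the code below translates each pattern into a regular expression and treats it as a prefix match. The helper names `pattern_to_regex` and `is_disallowed` are hypothetical, and the crawl-delay of 4 is not modelled.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/*?*page=",
    "/member",
    "/*?*err_code=404",
    "/search",
    "/search/?*",
    "/mapi/v4/*/user/*",
    "/embed",
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path; otherwise the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """True if any Disallow pattern matches the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)
```

Because the match is a prefix match, `/member` also covers paths such as `/membership`, and `/*?*page=` covers any URL whose query string contains `page=`.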

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

googlebot

Rule Path
Allow /
Disallow /*?*err_code=404
Disallow /search
Disallow /search/?*
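
The googlebot group mixes `Allow /` with more specific Disallow rules. Under RFC 9309 the most specific (longest) matching rule applies, and an Allow wins a tie. The sketch below illustrates that precedence for the four rules listed above; `matches` and `is_allowed` are hypothetical helper names, and this is not a description of how Googlebot itself is implemented.

```python
import re

# Rules copied from the googlebot group above, as (rule type, path pattern).
GOOGLEBOT_RULES = [
    ("allow", "/"),
    ("disallow", "/*?*err_code=404"),
    ("disallow", "/search"),
    ("disallow", "/search/?*"),
]


def matches(pattern: str, path: str) -> bool:
    """Prefix-match a path against a pattern where '*' is a wildcard."""
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, path) is not None


def is_allowed(path: str) -> bool:
    """Apply the longest matching rule; prefer Allow when lengths tie."""
    best = max(
        (rule for rule in GOOGLEBOT_RULES if matches(rule[1], path)),
        key=lambda rule: (len(rule[1]), rule[0] == "allow"),
        default=None,
    )
    # With no matching rule at all, crawling is allowed by default.
    return best is None or best[0] == "allow"
```

With these rules, an ordinary article path matches only `Allow /`, while `/search` and any URL carrying `err_code=404` in its query string are caught by the longer Disallow patterns.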

gptbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /
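
A crawler obeys the group whose user-agent line matches its product token, compared case-insensitively, and falls back to the `*` group otherwise. The following sketch shows that selection for the groups listed in this scan; `select_group` and `is_fully_blocked` are hypothetical names, and the sets are transcribed from the report rather than parsed from the live file.

```python
# Product tokens that this scan lists with a single "Disallow /" rule.
FULLY_BLOCKED = {
    "amazonbot", "anthropic-ai", "applebot-extended", "ccbot",
    "chatgpt-user", "claude-web", "claudebot", "cohere-ai", "diffbot",
    "facebookbot", "google-extended", "gptbot", "magpie-crawler",
    "perplexitybot", "turnitinbot",
}

# Every named group in the scan; any other crawler uses the "*" group.
NAMED_GROUPS = FULLY_BLOCKED | {"grapeshot", "googlebot"}


def select_group(product_token: str) -> str:
    """Return the group name a crawler with this token would follow."""
    token = product_token.lower()
    return token if token in NAMED_GROUPS else "*"


def is_fully_blocked(product_token: str) -> bool:
    """True if the crawler's group disallows the entire site."""
    return select_group(product_token) in FULLY_BLOCKED
```

For example, `select_group("GPTBot")` resolves to the `gptbot` group, which disallows everything, while an unlisted crawler such as `bingbot` resolves to `*`.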

Other Records

Field Value
sitemap https://www.huffpost.com/sitemaps/sitemap-v1.xml
sitemap https://www.huffpost.com/sitemaps/sitemap-google-news.xml
sitemap https://www.huffpost.com/sitemaps/sitemap-google-video.xml
sitemap https://www.huffpost.com/sitemaps/sections.xml
sitemap https://www.huffpost.com/sitemaps-huffingtonpost/sitemap.xml
sitemap https://www.huffpost.com/sitemaps-huffingtonpost/sections.xml

Comments

  • Cambria robots
  • archives
  • huffingtonpost.com archive sitemaps