thejournalnews.com
robots.txt

Robots Exclusion Standard data for thejournalnews.com

Resource Scan

Scan Details

Site Domain thejournalnews.com
Base Domain thejournalnews.com
Scan Status Failed
Failure Reason Scan timed out.
Last Scan 2024-09-13T17:35:45+00:00
Next Scan 2024-12-12T17:35:45+00:00
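
The Last Scan and Next Scan timestamps above can be compared directly. The sketch below only measures the gap between the two values as listed; the source does not state how the scanner schedules rescans, so treating that gap as a fixed interval would be an assumption.

```python
from datetime import datetime

# Timestamps copied from the scan details above.
last_scan = datetime.fromisoformat("2024-09-13T17:35:45+00:00")
next_scan = datetime.fromisoformat("2024-12-12T17:35:45+00:00")

# Observed gap between the failed scan and the next scheduled one.
interval = next_scan - last_scan
print(interval.days)
```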

Last Successful Scan

Scanned 2022-04-25T16:07:36+00:00
URL http://thejournalnews.com/robots.txt
Redirect https://www.lohud.com/robots.txt
Redirect Domain www.lohud.com
Redirect Base lohud.com
Response IP 199.232.46.62
Found Yes
Hash b7fc52e8df9dd9140b34b0aecfe4f4de45d9201052f6ad1b9d5c1d903ef1ea55
SimHash ab8e1fe7ddf3
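
A rough sketch of how the Redirect Domain, Redirect Base and Hash fields above could be derived from a fetched file. Everything here is an assumption for illustration: the helper names are hypothetical, the base domain is taken naively as the last two host labels (a real tool would consult the Public Suffix List), and SHA-256 is assumed only because the listed hash is 64 hex digits long. The source does not name the algorithm.

```python
import hashlib
from urllib.parse import urlsplit

# Redirect target copied from the scan record above.
redirect_url = "https://www.lohud.com/robots.txt"


def redirect_domain(url: str) -> str:
    """Return the host part of a URL (hypothetical helper)."""
    return urlsplit(url).hostname or ""


def naive_base_domain(host: str) -> str:
    """Keep the last two labels of a host name.

    Assumption: this is wrong for suffixes such as co.uk.
    """
    return ".".join(host.split(".")[-2:])


def body_hash(body: bytes) -> str:
    """Hex digest of the response body (assumes SHA-256)."""
    return hashlib.sha256(body).hexdigest()


host = redirect_domain(redirect_url)
print(host, naive_base_domain(host))
```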

Groups

googlebot-news

Rule Path
Disallow /story/sponsor-story/
Disallow /picture-gallery/sponsor-story/
Disallow /videos/sponsor-story/
Disallow /longform/sponsor-story/
Disallow /pages/interactives/sponsor-story/
Disallow /interactives/sponsor-story/
Disallow /videos/embed/
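
The googlebot-news group above can be checked with Python's standard urllib.robotparser. This is a minimal sketch: the robots.txt text is rebuilt from the rules listed in this report rather than fetched, and the example URLs and the "OtherBot" agent name are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Group rebuilt from the rules listed above (not fetched from the site).
ROBOTS_TXT = """\
User-agent: googlebot-news
Disallow: /story/sponsor-story/
Disallow: /picture-gallery/sponsor-story/
Disallow: /videos/sponsor-story/
Disallow: /longform/sponsor-story/
Disallow: /pages/interactives/sponsor-story/
Disallow: /interactives/sponsor-story/
Disallow: /videos/embed/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs used only to exercise the rules.
sponsored = "https://www.lohud.com/story/sponsor-story/example"
regular = "https://www.lohud.com/story/news/example"

print(parser.can_fetch("Googlebot-News", sponsored))
print(parser.can_fetch("Googlebot-News", regular))
print(parser.can_fetch("OtherBot", sponsored))
```

With only this group loaded, the rules bind Googlebot-News alone; other agents fall under the * group listed next.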

*

Rule Path
Disallow /errors
Disallow /interactive/
Disallow /userauth/
Disallow /ugc/
Disallow /feeds/
Disallow /services/
Disallow /facebook/
Disallow /version-info/
Disallow /longform/draft/
Disallow /story/draft/
Disallow /topic/*/smart/
Disallow /search
Disallow /module-showcase/
Disallow /newsletter/
Disallow /blended-newsletter/
Disallow /story/nletter/
Disallow /sports/services/photos/
Disallow /optimus
Disallow /ux-train
Disallow /story/advisory/
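
One rule in the * group, /topic/*/smart/, uses a wildcard. As far as I know, Python's urllib.robotparser compares paths by plain prefix and does not expand "*", so the sketch below shows one way to match such patterns by hand. The function names are hypothetical, only Disallow rules are handled (the group has no Allow rules), and the test paths are made up.

```python
import re


def rule_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern is a prefix match.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, patterns: list) -> bool:
    """True if any Disallow pattern matches the start of the path."""
    return any(rule_to_regex(p).match(path) for p in patterns)


# A few rules copied from the * group above.
rules = ["/topic/*/smart/", "/search", "/errors"]

print(is_disallowed("/topic/politics/smart/", rules))
print(is_disallowed("/topic/politics/", rules))
```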

Other Records

Field Value
sitemap https://www.lohud.com/news-sitemap.xml
sitemap https://www.lohud.com/web-sitemap-index.xml
sitemap https://www.lohud.com/video-sitemap-index.xml
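
The sitemap records above can be read back with the same standard-library parser. A small sketch, assuming Python 3.8 or later (where RobotFileParser.site_maps() is available) and rebuilding the lines from this report instead of fetching them:

```python
from urllib.robotparser import RobotFileParser

# Sitemap lines rebuilt from the records listed above.
SITEMAP_LINES = [
    "Sitemap: https://www.lohud.com/news-sitemap.xml",
    "Sitemap: https://www.lohud.com/web-sitemap-index.xml",
    "Sitemap: https://www.lohud.com/video-sitemap-index.xml",
]

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES)

# site_maps() returns a list of URLs, or None when no Sitemap lines exist.
sitemaps = sitemap_parser.site_maps() or []
for url in sitemaps:
    print(url)
```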

Comments

  • robots.txt file for https://www.lohud.com/