Robots Exclusion Standard data for lcnewschronicle.com

Resource Scan

Scan Details

Site Domain lcnewschronicle.com
Base Domain lcnewschronicle.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-09-21T22:03:22+00:00
Next Scan 2024-12-20T22:03:22+00:00

Last Successful Scan

Scanned 2024-05-25T22:02:05+00:00
URL https://lcnewschronicle.com/robots.txt
Redirect https://www.duluthnewstribune.com/robots.txt
Redirect Domain www.duluthnewstribune.com
Redirect Base duluthnewstribune.com
Domain IPs 108.156.133.28, 108.156.133.53, 108.156.133.72, 108.156.133.79
Redirect IPs 13.33.88.100, 13.33.88.30, 13.33.88.43, 13.33.88.97
Response IP 13.33.88.97
Found Yes
Hash 71997f7298e715490fbf2216291977a258d8c2f7cc45e0fae6aab74f07a0116f
SimHash 6a35d8686593

Groups

*

Rule Path
Disallow /search
Disallow /cms

Other Records

Field Value
crawl-delay 10

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.duluthnewstribune.com/sitemap.xml

Comments

  • Sitemap
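
The groups and records listed above can be checked with a short sketch using Python's standard urllib.robotparser module. The robots.txt text here is rebuilt from the report, so its directive casing, ordering, and blank lines are assumptions rather than the file as served, and "examplebot" is a hypothetical agent name used only to exercise the "*" group.

```python
from urllib.robotparser import RobotFileParser

# Agents the report lists with a blanket "Disallow /" rule.
BLOCKED_AGENTS = [
    "ccbot", "gptbot", "chatgpt-user", "anthropic-ai", "cohere-ai",
    "ia_archiver", "omgili", "omgilibot", "mj12bot", "piplbot",
    "google-extended",
]

BASE = "https://www.duluthnewstribune.com"


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt from the report (layout is assumed)."""
    lines = [
        "User-agent: *",
        "Disallow: /search",
        "Disallow: /cms",
        "Crawl-delay: 10",
        "",
    ]
    for agent in BLOCKED_AGENTS:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append(f"Sitemap: {BASE}/sitemap.xml")
    return "\n".join(lines)


parser = RobotFileParser()
parser.parse(build_robots_txt().splitlines())

# "examplebot" is hypothetical; it matches no named group, so "*" applies.
print(parser.can_fetch("examplebot", BASE + "/news/"))
print(parser.can_fetch("examplebot", BASE + "/search"))
print(parser.can_fetch("gptbot", BASE + "/news/"))
print(parser.crawl_delay("examplebot"))
print(parser.site_maps())
```

Under these assumptions, an unlisted crawler may fetch ordinary pages but not /search or /cms and is asked to wait 10 seconds between requests, while each named agent is refused every path. site_maps() requires Python 3.8 or later.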