lcnewschronicle.com
robots.txt

Robots Exclusion Standard data for lcnewschronicle.com

Resource Scan

Scan Details

Site Domain	lcnewschronicle.com
Base Domain	lcnewschronicle.com
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Couldn't connect to server.
Last Scan	2024-09-21T22:03:22+00:00
Next Scan	2024-12-20T22:03:22+00:00
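
The gap between the two timestamps above can be checked with a few lines of Python. This is only a sketch of the arithmetic: the report does not state how the scanner schedules retries, so the 90-day interval is an observation from these two values, not a documented policy.

```python
from datetime import datetime

# Timestamps copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2024-09-21T22:03:22+00:00")
next_scan = datetime.fromisoformat("2024-12-20T22:03:22+00:00")

# Difference between the scheduled next scan and the last (failed) scan.
interval = next_scan - last_scan
print(interval.days)  # 90
```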

Last Successful Scan

Scanned	2024-05-25T22:02:05+00:00
URL	https://lcnewschronicle.com/robots.txt
Redirect	https://www.duluthnewstribune.com/robots.txt
Redirect Domain	www.duluthnewstribune.com
Redirect Base	duluthnewstribune.com
Domain IPs	108.156.133.28, 108.156.133.53, 108.156.133.72, 108.156.133.79
Redirect IPs	13.33.88.100, 13.33.88.30, 13.33.88.43, 13.33.88.97
Response IP	13.33.88.97
Found	Yes
Hash	71997f7298e715490fbf2216291977a258d8c2f7cc45e0fae6aab74f07a0116f
SimHash	6a35d8686593

Groups

*

Rule	Path
Disallow	/search
Disallow	/cms

Other Records

Field	Value
crawl-delay	10
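
As an illustration of how a crawler might read the `*` group, the sketch below feeds a reconstructed version of it to Python's standard `urllib.robotparser`. The robots.txt text is an assumption rebuilt from the tables above (the original file is not reproduced in this report), and `ExampleBot` is a hypothetical user agent.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the "*" group from the report's tables;
# the real file's exact layout is not shown in the scan.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /cms
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# "ExampleBot" is a hypothetical crawler with no group of its own,
# so it falls back to the "*" group.
search_allowed = parser.can_fetch("ExampleBot", "https://www.duluthnewstribune.com/search?q=x")
news_allowed = parser.can_fetch("ExampleBot", "https://www.duluthnewstribune.com/news")
delay = parser.crawl_delay("ExampleBot")

print(search_allowed, news_allowed, delay)  # False True 10
```

Note that `Disallow` rules are prefix matches, so `/search` also covers paths such as `/search/results`.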

ccbot

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

piplbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/
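
The named groups above each carry a single `Disallow: /`, which blocks the whole site for that agent. A minimal sketch of that behaviour, again using `urllib.robotparser`: the file lines are rebuilt from the group names in this report (an assumption about the original layout), and `ExampleBot` is a hypothetical agent used to show the fallback to the `*` group.

```python
from urllib.robotparser import RobotFileParser

# User agents listed in the report with a site-wide Disallow.
BLOCKED_AGENTS = [
    "ccbot", "gptbot", "chatgpt-user", "anthropic-ai", "cohere-ai",
    "ia_archiver", "omgili", "omgilibot", "mj12bot", "piplbot",
    "google-extended",
]

# Assumed reconstruction: the "*" group followed by one group per agent.
lines = ["User-agent: *", "Disallow: /search", "Disallow: /cms", ""]
for agent in BLOCKED_AGENTS:
    lines += [f"User-agent: {agent}", "Disallow: /", ""]

parser = RobotFileParser()
parser.parse(lines)

# Every named agent is refused even an ordinary page.
results = {agent: parser.can_fetch(agent, "/news") for agent in BLOCKED_AGENTS}

# A hypothetical agent without its own group is only bound by the "*" rules.
generic_allowed = parser.can_fetch("ExampleBot", "/news")

print(any(results.values()), generic_allowed)  # False True
```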

Other Records

Field	Value
sitemap	https://www.duluthnewstribune.com/sitemap.xml

Comments

Sitemap
