ctwatch.dk
robots.txt

Robots Exclusion Standard data for ctwatch.dk

Resource Scan

Scan Details

Site Domain	ctwatch.dk
Base Domain	ctwatch.dk
Scan Status	Ok
Last Scan	2024-10-20T18:27:53+00:00
Next Scan	2024-11-03T18:27:53+00:00

Last Scan

Scanned	2024-10-20T18:27:53+00:00
URL	https://ctwatch.dk/robots.txt
Domain IPs	13.226.2.121, 13.226.2.38, 13.226.2.41, 13.226.2.56
Response IP	3.164.206.82
Found	Yes
Hash	9f33a7f4bc77f804a9cf0aa42153da7db2d101db49e026df43c38507733b915c
SimHash	38169a24a7e5
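
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The scan does not state which algorithm it uses, so the sketch below assumes SHA-256 over the raw response body. The helper name content_digest and the sample body are hypothetical; the sample is not the exact byte sequence the scanner fetched, so it will not reproduce the recorded hash.

```python
import hashlib


def content_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the scanner hashes the raw bytes with SHA-256.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, used only to show change detection.
sample = b"User-agent: gptbot\nDisallow: /\n"
digest = content_digest(sample)
changed = content_digest(sample + b"# edited\n")

print(len(digest))        # number of hex characters in the digest
print(digest != changed)  # any byte change yields a different digest
```

Comparing digests between two scans is a cheap way to tell whether the file changed at all. A similarity hash such as the SimHash field is the usual complement when the question is how much it changed.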

Groups

ccbot

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/archive/
Disallow	/auth
Disallow	/user/addTrial
Disallow	/metrics
Disallow	/health
Disallow	/cache
Disallow	/esi
Disallow	/mark-variant-won
Disallow	/article/5253094
Disallow	/Sygdom___Sundhed/article5253094.ece
Disallow	/service/cbp
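
The groups above can be checked with Python's standard urllib.robotparser. This is a minimal sketch, assuming the file looks like the ROBOTS_TXT text below, which is rebuilt from the scanned groups and the sitemap record rather than copied from the live file (comments and original letter case are not reproduced). The agent name ExampleBot is hypothetical and stands in for any crawler that falls under the * group. The site_maps() call needs Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan's groups; not the verbatim file.
ROBOTS_TXT = """\
User-agent: ccbot
Disallow: /

User-agent: gptbot
Disallow: /

User-agent: chatgpt-user
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: google-extended
Disallow: /

User-agent: *
Disallow: /archive/
Disallow: /auth
Disallow: /user/addTrial
Disallow: /metrics
Disallow: /health
Disallow: /cache
Disallow: /esi
Disallow: /mark-variant-won
Disallow: /article/5253094
Disallow: /Sygdom___Sundhed/article5253094.ece
Disallow: /service/cbp

Sitemap: https://ctwatch.dk/sitemapindex.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Named AI crawlers are blocked from the whole site.
gptbot_home = parser.can_fetch("GPTBot", "https://ctwatch.dk/")

# ExampleBot (hypothetical) matches only the * group.
other_home = parser.can_fetch("ExampleBot", "https://ctwatch.dk/")
other_archive = parser.can_fetch("ExampleBot", "https://ctwatch.dk/archive/2024")

sitemaps = parser.site_maps()

print(gptbot_home, other_home, other_archive, sitemaps)
```

Rules are matched as path prefixes, so in this parser a rule such as /auth also covers any longer path that begins with the same characters.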

Other Records

Field	Value
sitemap	https://ctwatch.dk/sitemapindex.xml

Comments

AI crawler reference
The link below explains what kind of content on this website may be used to train AI models
https://ctwatch.dk/ai.txt
Common crawl
OpenAI (ChatGPT)
OpenAI (ChatGPT realtime search)
Anthropic
Google (only AI crawler)
