thesundaytimes.co.uk
robots.txt

Robots Exclusion Standard data for thesundaytimes.co.uk

Resource Scan

Scan Details

Site Domain	thesundaytimes.co.uk
Base Domain	thesundaytimes.co.uk
Scan Status	Ok
Last Scan	2024-09-19T01:03:29+00:00
Next Scan	2024-09-26T01:03:29+00:00

Last Scan

Scanned	2024-09-19T01:03:29+00:00
URL	https://thesundaytimes.co.uk/robots.txt
Redirect	https://www.thetimes.com/robots.txt
Redirect Domain	www.thetimes.com
Redirect Base	thetimes.com
Domain IPs	34.240.28.43, 52.208.17.106, 54.76.240.177
Redirect IPs	13.33.88.20, 13.33.88.30, 13.33.88.5, 13.33.88.95, 2600:9000:223b:3600:a:1602:de80:93a1, 2600:9000:223b:3800:a:1602:de80:93a1, 2600:9000:223b:4800:a:1602:de80:93a1, 2600:9000:223b:8600:a:1602:de80:93a1, 2600:9000:223b:8c00:a:1602:de80:93a1, 2600:9000:223b:9400:a:1602:de80:93a1, 2600:9000:223b:9600:a:1602:de80:93a1, 2600:9000:223b:9e00:a:1602:de80:93a1
Response IP	13.33.88.95
Found	Yes
Hash	43261a821d0bea71a5c3b5fb2917438858745e4fef8916e261657feb98ebd8e5
SimHash	3d50194b4fc4

Groups

*

Rule	Path
Disallow	/login.thetimes.com/user/logout
Disallow	/feeds.thetimes.com/puzzles/
Disallow	/feeds.thetimes.com/timescrossword/
Disallow	/archive/page/*
Disallow	/archive/article/*
Disallow	/interactives/*
Disallow	/*?s=*
Disallow	/*%26s%3D*
Disallow	/*?p=*
Disallow	/*?filter=*
Allow	/past-six-days/$
Allow	/past-six-days$
Disallow	/past-six-days/*
Disallow	/topic/bbc
Disallow	/tto/*
Disallow	/player/brightcove/
Disallow	/my-articles
Disallow	/my-articles/
Disallow	/edition/null/
Disallow	/goto
Disallow	/?region=
Disallow	/?_ga
Disallow	/?CMP
Disallow	/?ExternalDataReference
Disallow	/article/category/
Disallow	/article/this-article-has-been-deleted*
Disallow	/article/this-article-has-been-removed*
Disallow	/article/this-article-is-no-longer-available*
Disallow	/search?*
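
The `*` group above mixes Allow and Disallow rules with `*` wildcards and `$` end anchors, so the outcome for a given path depends on which rule takes precedence. The following is a minimal sketch of one common interpretation (longest matching pattern wins, Allow wins a tie), not the matcher any particular crawler uses. The names `RULES`, `pattern_to_regex`, and `is_allowed` are hypothetical, only a handful of the rules are included, and percent-encoding normalisation is skipped.

```python
import re

# A few rules copied from the "*" group; the rest are omitted for brevity.
RULES = [
    ("allow", "/past-six-days/$"),
    ("allow", "/past-six-days$"),
    ("disallow", "/past-six-days/*"),
    ("disallow", "/*?s=*"),
    ("disallow", "/search?*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end
    of the path. Everything else is matched literally from the start.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + (r"\Z" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under `rules`.

    Assumption: the longest matching pattern wins and Allow wins a tie.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for p in ["/past-six-days", "/past-six-days/", "/past-six-days/2024-09-18",
              "/search?q=times", "/article/example"]:
        print(p, is_allowed(p))
```

Under these assumptions the two Allow rules keep the `/past-six-days` landing page crawlable while everything beneath it stays blocked.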

newsnow

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

webvac

Rule	Path
Disallow	/

webzip

Rule	Path
Disallow	/

psbot

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

meltwater

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

piplbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

anthropic-aibytespider

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

news-please

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

applebot-extended

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/
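
Every agent-specific group listed above carries the single rule `Disallow: /`, while other crawlers fall back to the `*` group. As a rough illustration, the sketch below feeds a hand-written excerpt to Python's standard-library `urllib.robotparser` and checks a few agents. The excerpt `ROBOTS_TXT`, the helper `build_parser`, and the agent name `ExampleBot` are assumptions made for this example. Only plain prefix rules are included, because this parser matches rule paths by prefix and does not expand `*` wildcards inside a path.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt modelled on the groups above (not the full file).
ROBOTS_TXT = """\
User-agent: *
Disallow: /goto

User-agent: ccbot
Disallow: /

User-agent: claudebot
Disallow: /
"""


def build_parser(text=ROBOTS_TXT):
    """Parse robots.txt text held in memory; no network request is made."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


if __name__ == "__main__":
    parser = build_parser()
    home = "https://www.thetimes.com/"
    # Agents with their own group are blocked from the whole site.
    print("CCBot", parser.can_fetch("CCBot", home))
    print("ClaudeBot", parser.can_fetch("ClaudeBot", home))
    # ExampleBot is a made-up agent, so it falls back to the "*" group.
    print("ExampleBot", parser.can_fetch("ExampleBot", home))
    print("ExampleBot /goto", parser.can_fetch("ExampleBot", home + "goto"))
```

Agent names are compared case-insensitively by this parser, which is why `CCBot` matches the `ccbot` group.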

Other Records

Field	Value
sitemap	https://www.thetimes.com/sitemaps/sitemap.xml

Comments

This is the robots.txt file for thetimes.com
The Times does not permit the unlicensed use of our content for large language models. Contact enquiries@newslicensing.com for assistance
Agent Specific Disallowed Sections
