static01.nyt.com
robots.txt

Robots Exclusion Standard data for static01.nyt.com

Resource Scan

Scan Details

Site Domain	static01.nyt.com
Base Domain	nyt.com
Scan Status	Ok
Last Scan	2024-06-17T02:41:33+00:00
Next Scan	2024-07-01T02:41:33+00:00

Last Scan

Scanned	2024-06-17T02:41:33+00:00
URL	https://static01.nyt.com/robots.txt
Domain IPs	151.101.1.164, 151.101.129.164, 151.101.193.164, 151.101.65.164
Response IP	199.232.45.164
Found	Yes
Hash	3c468117ad78918361661207f9242920cec7839841ebb9637df2dc2fe549db1b
SimHash	7d41111a2775
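
The 64-hex-character Hash value is consistent with a SHA-256 digest, though the report does not state which algorithm or input the scanner uses. As a minimal sketch, assuming the hash is SHA-256 over the raw response body, a fetched copy could be fingerprinted as follows. `content_hash` is a hypothetical helper name, and the sample bytes are illustrative rather than the real file.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Illustrative bytes only; the real file body would come from an HTTP fetch.
sample = b"User-agent: *\nDisallow: /library/\n"
digest = content_hash(sample)
print(len(digest))  # hex digest length
```

Comparing digests between scans is a cheap way to detect that the file changed at all; it says nothing about how much changed, which is presumably the role of the separate SimHash field.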

Groups

amazonbot

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

awariorssbot
awariosmartbot

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

dataforseobot

Rule	Path
Disallow	/

diffbot

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

newsnow

Rule	Path
Disallow	/

news-please

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

peer39_crawler
peer39_crawler/1.0

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/

scrapy

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/pages/college/
Disallow	/college/
Disallow	/library/
Disallow	/learning/
Disallow	/aponline/
Disallow	/reuters/
Disallow	/cnet/
Disallow	/partners/
Disallow	/archives/
Disallow	/indexes/
Disallow	/adx/bin/
Disallow	/thestreet/
Disallow	/nytimes-partners/
Disallow	/financialtimes/
Disallow	/email-content/
Allow	/pages/
Allow	/2003/
Allow	/2004/
Allow	/2005/
Allow	/top/
Allow	/ref/
Allow	/services/xml/
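
To illustrate how the groups above would be evaluated, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_TXT` string is an abbreviated reconstruction of a few listed rules, not the full file, and `ExampleBot` and `build_parser` are hypothetical names chosen for illustration. Python's parser applies rules in file order (first match wins), whereas RFC 9309 specifies the most specific match; for the paths checked here both approaches give the same answer.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of a few groups from the scan (assumption:
# the original file uses standard "User-agent"/"Disallow"/"Allow" lines).
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /pages/college/
Disallow: /library/
Allow: /pages/
Allow: /2004/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory, without any network fetch."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
BASE = "https://static01.nyt.com"

checks = [
    ("GPTBot", "/pages/"),                     # named group: everything blocked
    ("ExampleBot", "/pages/college/x.html"),   # falls under "*": disallowed
    ("ExampleBot", "/pages/index.html"),       # falls under "*": allowed
    ("ExampleBot", "/library/a"),              # falls under "*": disallowed
    ("ExampleBot", "/unlisted/a"),             # no matching rule: allowed
]
for agent, path in checks:
    print(agent, path, parser.can_fetch(agent, BASE + path))
```

A crawler that has its own named group is governed only by that group, so the `Allow` lines under `*` do not apply to the bots blocked with `Disallow /`.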

mediapartners-google*

Rule	Path
Disallow

Comments

New York Times content is made available for your personal, non-commercial
use subject to our Terms of Service here:
https://help.nytimes.com/hc/en-us/articles/115014893428-Terms-of-Service.
Use of any device, tool, or process designed to data mine or scrape the content
using automated means is prohibited without prior written permission from
The New York Times Company. Prohibited uses include but are not limited to:
(1) text and data mining activities under Art. 4 of the EU Directive on Copyright in
the Digital Single Market;
(2) the development of any software, machine learning, artificial intelligence (AI),
and/or large language models (LLMs);
(3) creating or providing archived or cached data sets containing our content to others; and/or
(4) any commercial purposes.
Contact https://nytlicensing.com/contact/ for assistance.