archive.nytimes.com
robots.txt

Robots Exclusion Standard data for archive.nytimes.com

Resource Scan

Scan Details

Site Domain	archive.nytimes.com
Base Domain	nytimes.com
Scan Status	Ok
Last Scan	2024-04-18T08:19:16+00:00
Next Scan	2024-05-18T08:19:16+00:00

Last Scan

Scanned	2024-04-18T08:19:16+00:00
URL	https://archive.nytimes.com/robots.txt
Domain IPs	151.101.1.164, 151.101.129.164, 151.101.193.164, 151.101.65.164
Response IP	199.232.45.164
Found	Yes
Hash	8501ff67d18311b331ae016c8f97374737a7ab714a7b09db6dcc6920d136d7dd
SimHash	fd51111b2771
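
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. A minimal sketch under that assumption, showing how such a digest could be used to detect a change between two scans; the function names `content_hash` and `has_changed` are hypothetical and not part of the scanning service:

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body.

    Assumption: the report's Hash field is computed this way;
    the source does not confirm the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_hash: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return content_hash(body) != previous_hash


if __name__ == "__main__":
    # Illustrative bodies only, not the real file contents.
    old_body = b"User-agent: gptbot\nDisallow: /\n"
    new_body = old_body + b"\nUser-agent: ccbot\nDisallow: /\n"
    stored = content_hash(old_body)
    print(has_changed(stored, old_body))
    print(has_changed(stored, new_body))
```

An exact digest only reports whether the bytes differ at all. The SimHash field presumably serves the complementary purpose of measuring how similar two versions are, which this sketch does not attempt.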

Groups

amazonbot

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

awariorssbot
awariosmartbot

Rule	Path
Disallow	/

bytespider

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

dataforseobot

Rule	Path
Disallow	/

diffbot

Rule	Path
Disallow	/

facebookbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

newsnow

Rule	Path
Disallow	/

news-please

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

omgilibot

Rule	Path
Disallow	/

peer39_crawler
peer39_crawler/1.0

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/

scrapy

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/
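
Every group listed above carries the single rule `Disallow: /`, so each named crawler is excluded from the whole site. The following sketch rebuilds a robots.txt body from those groups and checks it with Python's standard `urllib.robotparser`. The exact layout of the original file is an assumption, as are the helper names `build_robots_txt` and `is_allowed`; the user-agent tokens are copied from the scan in the lowercase form it reports.

```python
from urllib.robotparser import RobotFileParser

# Groups as reported by the scan; agents sharing a group are listed together.
GROUPS = [
    ["amazonbot"], ["anthropic-ai"], ["awariorssbot", "awariosmartbot"],
    ["bytespider"], ["ccbot"], ["chatgpt-user"], ["claudebot"],
    ["claude-web"], ["cohere-ai"], ["dataforseobot"], ["diffbot"],
    ["facebookbot"], ["google-extended"], ["gptbot"], ["magpie-crawler"],
    ["newsnow"], ["news-please"], ["omgili"], ["omgilibot"],
    ["peer39_crawler", "peer39_crawler/1.0"], ["perplexitybot"],
    ["scrapy"], ["turnitinbot"],
]


def build_robots_txt(groups: list[list[str]]) -> str:
    """Render one 'User-agent ... Disallow: /' block per group.

    Assumption: this mirrors the original file's rules, not its exact text.
    """
    blocks = []
    for agents in groups:
        lines = [f"User-agent: {agent}" for agent in agents]
        lines.append("Disallow: /")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks) + "\n"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given user agent may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    body = build_robots_txt(GROUPS)
    for agent in ("gptbot", "awariosmartbot", "examplebrowser"):
        print(agent, is_allowed(body, agent, "https://archive.nytimes.com/"))
```

The scan lists no catch-all `*` group, so under this reconstruction an agent that is not named (the hypothetical `examplebrowser` here) falls through to the parser's default of allowed. The file's rules are advisory; the Comments section below states the publisher's terms separately.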

Comments

New York Times content is made available for your personal, non-commercial
use subject to our Terms of Service here:
https://help.nytimes.com/hc/en-us/articles/115014893428-Terms-of-Service.
Use of any device, tool, or process designed to data mine or scrape the content
using automated means is prohibited without prior written permission from
The New York Times Company. Prohibited uses include but are not limited to:
(1) text and data mining activities under Art. 4 of the EU Directive on Copyright in
the Digital Single Market;
(2) the development of any software, machine learning, artificial intelligence (AI),
and/or large language models (LLMs);
(3) creating or providing archived or cached data sets containing our content to others; and/or
(4) any commercial purposes.
Contact https://nytlicensing.com/contact/ for assistance.
