Robots Exclusion Standard data for ww38.nytims.com

Resource Scan

Scan Details

Site Domain ww38.nytims.com
Base Domain nytims.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-04-12T15:07:39+00:00
Next Scan 2024-07-11T15:07:39+00:00
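
The gap between the last scan and the next scheduled scan can be checked directly from the two timestamps above. The snippet below is a minimal sketch that only does the date arithmetic; the variable names are illustrative and nothing here is taken from the scanner itself.

```python
from datetime import datetime

# Timestamps copied from the Scan Details listing.
last_scan = datetime.fromisoformat("2024-04-12T15:07:39+00:00")
next_scan = datetime.fromisoformat("2024-07-11T15:07:39+00:00")

# Interval between the failed scan and the next scheduled attempt.
rescan_interval = next_scan - last_scan
print(rescan_interval.days)  # 90
```

For this record the next scan is scheduled 90 days after the failed one. Whether that interval is fixed or depends on the failure is not stated in the report.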

Last Successful Scan

Scanned 2023-02-24T18:53:11+00:00
URL http://ww38.nytims.com/robots.txt
Domain IPs 185.53.178.51
Response IP 185.53.178.51
Found Yes
Hash 81150fed4cd6b900092954012f0a8181687ab60105f4ff82e33b6f19277123f6
SimHash 64a75840449a
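
The report does not name the algorithm behind the Hash value. Its length, 64 hexadecimal digits, is consistent with a 256-bit digest such as SHA-256, so the sketch below assumes SHA-256 purely for illustration. The function name and the sample body are hypothetical; the sample is not the file that was scanned and will not reproduce the hash above.

```python
import hashlib

def content_fingerprint(body: bytes) -> str:
    """Return a hex digest of the fetched bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(body).hexdigest()

# Hypothetical robots.txt body, used only to show the shape of the output.
hypothetical_body = b"User-agent: googlebot\nDisallow: /?*\n"
digest = content_fingerprint(hypothetical_body)

reported_hash = "81150fed4cd6b900092954012f0a8181687ab60105f4ff82e33b6f19277123f6"
print(len(digest), len(reported_hash))  # 64 64
```

A digest like this changes completely on any byte-level edit, which is presumably why the report also lists a SimHash, a fingerprint designed so that similar documents get similar values.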

Groups

googlebot

Rule Path
Disallow /?*

baiduspider

Rule Path
Disallow /?*

yandexbot

Rule Path
Disallow /?*

ichiro

Rule Path
Disallow /?*

sogou spider

Rule Path
Disallow /?*

sosospider

Rule Path
Disallow /?*

youdaobot

Rule Path
Disallow /?*

yetibot

Rule Path
Disallow /?*

bingbot

Rule Path
Disallow /?*

Other Records

Field Value
crawl-delay 2

yahoo! slurp

Rule Path
Disallow /?*

Other Records

Field Value
crawl-delay 2

rdfbot

Rule Path
Disallow /?*

seznambot

Rule Path
Disallow /?*

ia_archiver

Rule Path
Disallow

mediapartners-google

Rule Path
Disallow
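
To make the effect of these rules concrete, the sketch below tests request paths against a Disallow pattern, treating `*` as "any run of characters" and a trailing `$` as an end anchor, as the major crawlers do. It is a simplified illustration, not the scanner's logic: the function names are made up, and it ignores Allow rules and longest-match precedence.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored_end = pattern.endswith("$")
    if anchored_end:
        pattern = pattern[:-1]
    # Escape each literal piece, then rejoin the pieces with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored_end else ""))

def is_disallowed(path: str, disallow_patterns: list[str]) -> bool:
    """Return True if any non-empty Disallow pattern matches the path."""
    for pattern in disallow_patterns:
        if not pattern:
            # An empty Disallow value blocks nothing.
            continue
        if pattern_to_regex(pattern).match(path):
            return True
    return False

print(is_disallowed("/?q=shoes", ["/?*"]))     # True
print(is_disallowed("/", ["/?*"]))             # False
print(is_disallowed("/page?id=1", ["/?*"]))    # False
print(is_disallowed("/?q=shoes", [""]))        # False
```

Read this way, `Disallow /?*` only blocks the site root when it carries a query string, while the empty Disallow entries for ia_archiver and mediapartners-google leave everything open to those two crawlers.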

Warnings

  • `request-rate` is not a known field.
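
A warning like the one above can be produced by checking each field name against a list of recognized fields. The sketch below assumes a particular set of known fields and uses a hypothetical robots.txt sample. The original file is not reproduced in this report, so the `Request-rate` value shown is invented for illustration.

```python
# Assumed set of recognized fields; the scanner's actual list is not documented here.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}

def find_unknown_fields(robots_txt: str) -> list[str]:
    """Return one warning per unrecognized field name, in order of first appearance."""
    warnings = []
    seen = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field = line.split(":", 1)[0].strip().lower()
        if field not in KNOWN_FIELDS and field not in seen:
            seen.add(field)
            warnings.append(f"`{field}` is not a known field.")
    return warnings

# Hypothetical sample; the Request-rate value is made up.
sample = """\
User-agent: bingbot
Disallow: /?*
Crawl-delay: 2
Request-rate: 1/5
"""

print(find_unknown_fields(sample))
```

Field names are compared case-insensitively, and each unknown field is reported only once even if it appears in several groups.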