newsday.com
robots.txt

Robots Exclusion Standard data for newsday.com

Resource Scan

Scan Details

Site Domain newsday.com
Base Domain newsday.com
Scan Status Ok
Last Scan 2024-04-28T11:13:29+00:00
Next Scan 2024-05-05T11:13:29+00:00

Last Scan

Scanned 2024-04-28T11:13:29+00:00
URL https://newsday.com/robots.txt
Redirect https://www.newsday.com/robots.txt
Redirect Domain www.newsday.com
Redirect Base newsday.com
Domain IPs 54.161.209.79, 54.175.165.2
Redirect IPs 13.225.142.115, 13.225.142.2, 13.225.142.67, 13.225.142.9, 2600:9000:21eb:1600:3:cdf4:ba00:93a1, 2600:9000:21eb:1800:3:cdf4:ba00:93a1, 2600:9000:21eb:200:3:cdf4:ba00:93a1, 2600:9000:21eb:3000:3:cdf4:ba00:93a1, 2600:9000:21eb:7600:3:cdf4:ba00:93a1, 2600:9000:21eb:8600:3:cdf4:ba00:93a1, 2600:9000:21eb:e000:3:cdf4:ba00:93a1, 2600:9000:21eb:fe00:3:cdf4:ba00:93a1
Response IP 18.165.171.10
Found Yes
Hash 71f639018c027a55cc007b8f33f3101feaf119150bb6395ba4d8f8f5b8334066
SimHash f81c1537e79b

Groups

*

Rule Path
Disallow /beta/
Disallow /preview/
Disallow /iosfeeds/
Disallow /json/
Disallow /cmlink/
Disallow /mobile/
Disallow /5819/
Disallow *?lng
Disallow *?view

gptbot

Rule Path
Disallow /
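
As a rough illustration of how the two groups above might be evaluated, the following Python sketch checks a path against the listed Disallow rules. It is a minimal sketch under stated assumptions, not the scanner's or any crawler's actual implementation: the names `GROUPS`, `pattern_to_regex`, and `is_disallowed` are hypothetical, `*` is assumed to match any run of characters and a trailing `$` to anchor the end of the URL (the common wildcard interpretation), and a crawler is assumed to use only its own group when one exists, falling back to `*` otherwise. Since the record contains no Allow rules, no longest-match precedence is modelled.

```python
import re

# Disallow rules transcribed from the two groups listed above.
GROUPS = {
    "*": ["/beta/", "/preview/", "/iosfeeds/", "/json/", "/cmlink/",
          "/mobile/", "/5819/", "*?lng", "*?view"],
    "gptbot": ["/"],
}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex (assumed semantics).

    '*' matches any sequence of characters; a trailing '$' anchors the
    end of the URL. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(user_agent, path):
    """Return True if any Disallow rule in the agent's group matches path.

    The agent's own group is used if present; otherwise the '*' group.
    """
    rules = GROUPS.get(user_agent.lower(), GROUPS["*"])
    # re.match anchors at the start of the path, as prefix rules require.
    return any(pattern_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    for agent, path in [
        ("Googlebot", "/beta/page"),
        ("Googlebot", "/news/story?lng=es"),
        ("Googlebot", "/news/story"),
        ("GPTBot", "/news/story"),
    ]:
        print(agent, path, is_disallowed(agent, path))
```

Under these assumptions, a generic crawler is blocked from the listed directories and from any URL containing `?lng` or `?view`, while `gptbot` is blocked from the whole site. Python's standard `urllib.robotparser` does not interpret `*` wildcards inside paths, which is why the sketch does its own matching.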

Other Records

Field Value
sitemap https://www.newsday.com/sitemap/
sitemap https://www.newsday.com/sitemap/news
sitemap https://projects.newsday.com/sitemap_index.xml
sitemap https://projects.newsday.com/voters-guide/sitemap.xml
sitemap https://projects.newsday.com/schools/sitemap.xml
sitemap https://projects.newsday.com/payrolls/sitemap_index.xml