newsbreak.com
robots.txt

Robots Exclusion Standard data for newsbreak.com

Resource Scan

Scan Details

Site Domain newsbreak.com
Base Domain newsbreak.com
Scan Status Ok
Last Scan 2024-04-23T11:03:16+00:00
Next Scan 2024-04-30T11:03:16+00:00

Last Scan

Scanned 2024-04-23T11:03:16+00:00
URL https://newsbreak.com/robots.txt
Redirect https://www.newsbreak.com/robots.txt
Redirect Domain www.newsbreak.com
Redirect Base newsbreak.com
Domain IPs 35.85.88.137, 50.112.174.91
Redirect IPs 100.20.73.189, 35.167.251.113, 52.40.84.236
Response IP 100.20.73.189
Found Yes
Hash 7f7bcf0a94d586c94e3ce6f4f030654c3de39753b52cee3bd4662100629a2dc8
SimHash 0d0c5e1e4d32

Groups

*

Rule Path
Disallow /_api/
Disallow /api/
Disallow /channels/
Disallow /privacy
Disallow /terms
Disallow /t-*
Disallow /redirect-external
Disallow /me/
Disallow /_next/data/*.json
Disallow /following
Disallow /following/
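
The rules above can be checked against a URL path with a short sketch like the one below. It assumes the common robots.txt matching semantics (rules are prefix matches, `*` matches any run of characters, a trailing `$` anchors the end of the path) and ignores Allow rules, since none are listed for this group. The names `DISALLOW`, `rule_to_regex`, and `is_disallowed` are illustrative and not part of the scan data.

```python
import re

# Disallow paths copied from the "*" group in this report.
DISALLOW = [
    "/_api/", "/api/", "/channels/", "/privacy", "/terms", "/t-*",
    "/redirect-external", "/me/", "/_next/data/*.json",
    "/following", "/following/",
]

def rule_to_regex(rule):
    """Translate one rule path into a compiled regex (assumed semantics)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal segments and let each "*" match any run of characters.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_disallowed(path, rules=DISALLOW):
    """Return True if any rule matches the start of the given URL path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

if __name__ == "__main__":
    for path in ["/api/v1/feed", "/t-example", "/news/some-story", "/me"]:
        print(path, is_disallowed(path))
```

Because matching is by prefix, `/privacy` also covers paths such as `/privacy-policy`, while `/me/` does not cover `/me` without the trailing slash.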

Other Records

Field Value
sitemap https://www.newsbreak.com/sitemap.xml
sitemap https://www.newsbreak.com/sitemap-publisher.xml
sitemap https://www.newsbreak.com/sitemap-local-index.xml
sitemap https://www.newsbreak.com/sitemap-news-index.xml
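
As a rough illustration, the sitemap records above can be read back with Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text below is a reconstruction from this report, not the original file: the actual line order, comments, and any whitespace are unknown, and only a few of the Disallow rules are repeated for brevity. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan report; the real file's layout is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /_api/
Disallow: /api/
Disallow: /channels/
Sitemap: https://www.newsbreak.com/sitemap.xml
Sitemap: https://www.newsbreak.com/sitemap-publisher.xml
Sitemap: https://www.newsbreak.com/sitemap-local-index.xml
Sitemap: https://www.newsbreak.com/sitemap-news-index.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# site_maps() returns None when no Sitemap lines are present.
sitemaps = parser.site_maps() or []

if __name__ == "__main__":
    for url in sitemaps:
        print(url)
```

Note that the standard-library parser treats rule paths as literal prefixes, so it is not a reliable way to evaluate wildcard rules such as `/t-*`.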