northeast.newsnetmedia.com
robots.txt

Robots Exclusion Standard data for northeast.newsnetmedia.com

Resource Scan

Scan Details

Site Domain northeast.newsnetmedia.com
Base Domain newsnetmedia.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-07-02T04:13:25+00:00
Next Scan 2024-09-30T04:13:25+00:00
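
The Last Scan and Next Scan timestamps above can be compared directly. The sketch below only measures the gap between the two recorded values. Whether the scanner applies a fixed rescan interval is an assumption, not something the report states, and the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the Scan Details above (ISO 8601 with UTC offset).
last_scan = datetime.fromisoformat("2024-07-02T04:13:25+00:00")
next_scan = datetime.fromisoformat("2024-09-30T04:13:25+00:00")

# Gap between the failed scan and the scheduled retry.
interval = next_scan - last_scan
print(interval.days)
```

For these two values the gap is 90 days to the second.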

Last Successful Scan

Scanned 2023-03-10T20:53:19+00:00
URL https://northeast.newsnetmedia.com/robots.txt
Domain IPs 104.18.30.13, 104.18.31.13, 2606:4700::6812:1e0d, 2606:4700::6812:1f0d
Response IP 104.18.31.13
Found Yes
Hash aea8f7e47a0219524e5481f9e2e7595f40d4ea06d0d0b3563806f0ad308abff7
SimHash 45b5554c3ad3
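
The Hash value is 64 hexadecimal digits, which is consistent with a SHA-256 digest, though the report does not name the algorithm or say which bytes were hashed. Assuming it is SHA-256 over the raw response body, a change check against the recorded value might look like the sketch below. `content_digest` and `has_changed` are hypothetical helper names.

```python
import hashlib

# Hash recorded for the last successful scan (copied from the report).
RECORDED_HASH = "aea8f7e47a0219524e5481f9e2e7595f40d4ea06d0d0b3563806f0ad308abff7"

def content_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a fetched robots.txt body.

    Assumption: the report's Hash is SHA-256 over the raw response bytes.
    """
    return hashlib.sha256(body).hexdigest()

def has_changed(body: bytes, recorded: str = RECORDED_HASH) -> bool:
    """True when a freshly fetched body no longer matches the recorded hash."""
    return content_digest(body) != recorded
```

An exact hash flags any byte-level change. The SimHash field presumably serves fuzzy comparison, where near-identical files produce similar values, but its construction is not described in the report.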

Groups

*

Rule Path
Disallow /ads/
Disallow /global/tools/
Disallow /global/interfaces/
Disallow /global/images/
Disallow /global/include/
Disallow /global/applications/
Disallow /global/pm/
Disallow /global/utilities/
Disallow /global/reports/
Disallow /global/video/
Disallow /applications/
Disallow /cgi-bin/
Disallow /classifieds/
Disallow /default_files/
Disallow /images/
Disallow /include/
Disallow /incoming/
Disallow /reports/
Disallow /professionalservices/
Disallow /search
Disallow /temp/
Disallow /trafficcam/
Disallow /traffic/
Disallow /contentmgmt/
Disallow /link/
Disallow /register
Disallow /login
Disallow /forgot-password
Disallow /reset-password
Disallow /profile

Other Records

Field Value
crawl-delay 3

Other Records

Field Value
sitemap https://northeast.newsnetmedia.com/sitemap.xml.gz
sitemap https://northeast.newsnetmedia.com/sitemap-pages.xml.gz
sitemap https://northeast.newsnetmedia.com/newssitemap.xml.gz
sitemap https://northeast.newsnetmedia.com/videositemap.xml.gz
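
To see how a crawler would interpret the records listed above, the sketch below rebuilds a robots.txt from them and feeds it to Python's standard `urllib.robotparser`. The original file's exact layout and ordering are not shown in the report, so the reconstructed text is an assumption. The `ExampleBot` user agent, the `allowed` helper, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups and Other Records sections of the report.
# Assumption: one "*" group carrying the Disallow rules and the crawl-delay,
# followed by the four sitemap lines.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ads/
Disallow: /global/tools/
Disallow: /global/interfaces/
Disallow: /global/images/
Disallow: /global/include/
Disallow: /global/applications/
Disallow: /global/pm/
Disallow: /global/utilities/
Disallow: /global/reports/
Disallow: /global/video/
Disallow: /applications/
Disallow: /cgi-bin/
Disallow: /classifieds/
Disallow: /default_files/
Disallow: /images/
Disallow: /include/
Disallow: /incoming/
Disallow: /reports/
Disallow: /professionalservices/
Disallow: /search
Disallow: /temp/
Disallow: /trafficcam/
Disallow: /traffic/
Disallow: /contentmgmt/
Disallow: /link/
Disallow: /register
Disallow: /login
Disallow: /forgot-password
Disallow: /reset-password
Disallow: /profile
Crawl-delay: 3

Sitemap: https://northeast.newsnetmedia.com/sitemap.xml.gz
Sitemap: https://northeast.newsnetmedia.com/sitemap-pages.xml.gz
Sitemap: https://northeast.newsnetmedia.com/newssitemap.xml.gz
Sitemap: https://northeast.newsnetmedia.com/videositemap.xml.gz
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse from text, no network access

BASE = "https://northeast.newsnetmedia.com"

def allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Check a site-relative path against the parsed rules (hypothetical helper)."""
    return parser.can_fetch(agent, BASE + path)

print(allowed("/news/story.html"))       # hypothetical path, matches no rule
print(allowed("/ads/banner.gif"))        # under /ads/
print(allowed("/search?q=weather"))      # prefix match on /search
print(parser.crawl_delay("ExampleBot"))  # delay from the "*" group
print(parser.site_maps())                # requires Python 3.8+
```

Rules are matched as path prefixes, so `Disallow /search` also covers `/search?q=...`, while `Disallow /ads/` with its trailing slash does not cover the bare path `/ads`.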