southeast.newsnetmedia.com
robots.txt

Robots Exclusion Standard data for southeast.newsnetmedia.com

Resource Scan

Scan Details

Site Domain southeast.newsnetmedia.com
Base Domain newsnetmedia.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-04-03T14:07:52+00:00
Next Scan 2024-07-02T14:07:52+00:00

Last Successful Scan

Scanned 2023-03-11T11:03:09+00:00
URL https://southeast.newsnetmedia.com/robots.txt
Domain IPs 104.18.30.13, 104.18.31.13, 2606:4700::6812:1e0d, 2606:4700::6812:1f0d
Response IP 104.18.30.13
Found Yes
Hash 62065357c4ea429e3ce0c628c48321b168f70407b9b1b2079b45043f74a85836
SimHash 4cbd7f5c3a9b

Groups

*

Rule Path
Disallow /ads/
Disallow /global/tools/
Disallow /global/interfaces/
Disallow /global/images/
Disallow /global/include/
Disallow /global/applications/
Disallow /global/pm/
Disallow /global/utilities/
Disallow /global/reports/
Disallow /global/video/
Disallow /applications/
Disallow /cgi-bin/
Disallow /classifieds/
Disallow /default_files/
Disallow /images/
Disallow /include/
Disallow /incoming/
Disallow /reports/
Disallow /professionalservices/
Disallow /search
Disallow /temp/
Disallow /trafficcam/
Disallow /traffic/
Disallow /contentmgmt/
Disallow /link/
Disallow /register
Disallow /login
Disallow /forgot-password
Disallow /reset-password
Disallow /profile
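
The Disallow rules above are path prefixes that apply to every user agent (the `*` group). As a minimal sketch of how a crawler might evaluate them, the following uses Python's standard `urllib.robotparser` on a small subset of the listed rules. The agent name `ExampleBot`, the helper `is_allowed`, and the sample paths are assumptions made for illustration; they do not come from the scan.

```python
from urllib.robotparser import RobotFileParser

# A subset of the Disallow rules listed above, rewritten in robots.txt syntax.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /ads/",
    "Disallow: /global/tools/",
    "Disallow: /search",
    "Disallow: /login",
    "Disallow: /profile",
]

BASE = "https://southeast.newsnetmedia.com"

# Parse the rules once and reuse the parser for every lookup.
_parser = RobotFileParser()
_parser.parse(ROBOTS_LINES)


def is_allowed(path, agent="ExampleBot"):
    """Return True if the (hypothetical) agent may fetch BASE + path."""
    return _parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for sample in ["/news/local", "/ads/banner.js", "/ads", "/search?q=x", "/profile/settings"]:
        print(sample, is_allowed(sample))
```

Because matching is by prefix, a rule with a trailing slash such as `/ads/` blocks `/ads/banner.js` but not the bare path `/ads`, while a rule without one, such as `/search`, also blocks `/search?q=x`.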

Other Records

Field Value
crawl-delay 3

Other Records

Field Value
sitemap https://southeast.newsnetmedia.com/sitemap.xml.gz
sitemap https://southeast.newsnetmedia.com/sitemap-pages.xml.gz
sitemap https://southeast.newsnetmedia.com/newssitemap.xml.gz
sitemap https://southeast.newsnetmedia.com/videositemap.xml.gz
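
The crawl-delay and sitemap records can be read with the same standard-library parser. The sketch below rebuilds a robots.txt fragment from the values reported above; the ordering of directives in the real file is not known from this report, so the layout here is an assumption, as is the agent name `ExampleBot`. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment: directive order is assumed, values come from the report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /ads/
Crawl-delay: 3

Sitemap: https://southeast.newsnetmedia.com/sitemap.xml.gz
Sitemap: https://southeast.newsnetmedia.com/sitemap-pages.xml.gz
Sitemap: https://southeast.newsnetmedia.com/newssitemap.xml.gz
Sitemap: https://southeast.newsnetmedia.com/videositemap.xml.gz
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Seconds a polite crawler should wait between requests (falls back to the "*" group).
delay = parser.crawl_delay("ExampleBot")

# List of sitemap URLs declared in the file, in file order.
sitemaps = parser.site_maps()

if __name__ == "__main__":
    print("crawl delay:", delay)
    for url in sitemaps:
        print("sitemap:", url)
```

A crawler honoring these records would pause for the returned delay between requests and could use the sitemap URLs as its starting list of pages.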