newschannelnebraska.com
robots.txt

Robots Exclusion Standard data for newschannelnebraska.com

Resource Scan

Scan Details

Site Domain newschannelnebraska.com
Base Domain newschannelnebraska.com
Scan Status Ok
Last Scan 2024-05-06T03:45:22+00:00
Next Scan 2024-05-13T03:45:22+00:00

Last Scan

Scanned 2024-05-06T03:45:22+00:00
URL https://newschannelnebraska.com/robots.txt
Redirect https://www.newschannelnebraska.com/robots.txt
Redirect Domain www.newschannelnebraska.com
Redirect Base newschannelnebraska.com
Domain IPs 100.25.172.61, 54.145.205.131
Redirect IPs 104.18.30.13, 104.18.31.13, 2606:4700::6812:1e0d, 2606:4700::6812:1f0d
Response IP 104.18.31.13
Found Yes
Hash b5615e7efd06a936b2e074cd9eb1a436b59baa26f0504dcd7fce26d0d4894dd5
SimHash 4dbd5f4c1ad3

Groups

*

Rule Path
Disallow /ads/
Disallow /global/tools/
Disallow /global/interfaces/
Disallow /global/images/
Disallow /global/include/
Disallow /global/applications/
Disallow /global/pm/
Disallow /global/utilities/
Disallow /global/reports/
Disallow /global/video/
Disallow /applications/
Disallow /cgi-bin/
Disallow /classifieds/
Disallow /default_files/
Disallow /images/
Disallow /include/
Disallow /incoming/
Disallow /reports/
Disallow /professionalservices/
Disallow /search
Disallow /temp/
Disallow /trafficcam/
Disallow /traffic/
Disallow /contentmgmt/
Disallow /register
Disallow /login
Disallow /forgot-password
Disallow /reset-password
Disallow /profile

Other Records

Field Value
crawl-delay 3
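
As an illustration of how a crawler might apply the group above, here is a minimal sketch using Python's standard urllib.robotparser. It assumes the records map onto ordinary "User-agent", "Disallow" and "Crawl-delay" lines; only a subset of the listed paths is reproduced, and the agent name ExampleBot and the /news/ path are hypothetical, not taken from the scan.

```python
from urllib.robotparser import RobotFileParser

# Subset of the "*" group from the scan, rewritten in robots.txt syntax.
# The exact text of the live file is an assumption.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /ads/",
    "Disallow: /search",
    "Disallow: /login",
    "Crawl-delay: 3",
]


def build_parser(lines):
    """Return a RobotFileParser loaded from in-memory lines (no network)."""
    parser = RobotFileParser()
    parser.parse(lines)
    return parser


parser = build_parser(ROBOTS_LINES)
BASE = "https://www.newschannelnebraska.com"
AGENT = "ExampleBot"  # hypothetical user agent, falls back to the "*" group

# /news/ is a hypothetical path used only to show an allowed URL.
for path in ("/ads/banner.js", "/search", "/news/"):
    print(path, parser.can_fetch(AGENT, BASE + path))

# Seconds to wait between requests, as declared by the group.
print("crawl delay:", parser.crawl_delay(AGENT))
```

Disallow values are matched as path prefixes, so /search also covers longer paths that begin with /search. The parser only reports what the file says; honoring the delay is left to the crawler.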

Other Records

Field Value
sitemap https://www.newschannelnebraska.com/sitemap.xml.gz
sitemap https://www.newschannelnebraska.com/sitemap-pages.xml.gz
sitemap https://www.newschannelnebraska.com/newssitemap.xml.gz
sitemap https://www.newschannelnebraska.com/videositemap.xml.gz
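
The .gz suffix on the sitemap URLs above suggests gzip-compressed XML, though the scan does not confirm the encoding. Under that assumption, a rough sketch of extracting the listed locations from one such file could look like the following. The helper name sitemap_locations and the sample document are hypothetical; the sample is built in memory so nothing is fetched.

```python
import gzip
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_locations(gz_bytes):
    """Decompress a gzipped sitemap and return the text of every <loc>."""
    xml_bytes = gzip.decompress(gz_bytes)
    root = ET.fromstring(xml_bytes)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Hypothetical sample standing in for a downloaded sitemap file.
sample_xml = b"""<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://www.example.com/a</loc></url>
  <url><loc>https://www.example.com/b</loc></url>
</urlset>"""
sample_gz = gzip.compress(sample_xml)

print(sitemap_locations(sample_gz))
```

Because the function walks every loc element, it handles a sitemap index (loc inside sitemap entries) the same way as a plain URL set.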