newser.com
robots.txt

Robots Exclusion Standard data for newser.com

Resource Scan

Scan Details

Site Domain newser.com
Base Domain newser.com
Scan Status Ok
Last Scan 2024-05-04T20:21:12+00:00
Next Scan 2024-05-11T20:21:12+00:00

Last Scan

Scanned 2024-05-04T20:21:12+00:00
URL https://newser.com/robots.txt
Redirect https://www.newser.com/robots.txt
Redirect Domain www.newser.com
Redirect Base newser.com
Domain IPs 40.114.51.62
Redirect IPs 40.114.51.62
Response IP 40.114.51.62
Found Yes
Hash 3dd3ef000f3c2cd4843855c3f3bbc20fe85da98ea78c9d8c5e909fa2a564283f
SimHash 29101d404f81
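
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The report does not state which algorithm was used or which bytes were hashed, so the following is only a sketch under that assumption. The function name `fingerprint` and the sample body are hypothetical and are not taken from the scan.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Hypothetical body for illustration only; this is not the real newser.com file.
sample = b"User-agent: *\nDisallow: /search.aspx\n"
digest = fingerprint(sample)
print(len(digest))  # 64, the same length as the Hash field in the report
```

Comparing such a digest between two scans would show whether the file changed byte for byte, which is presumably the purpose of the Hash field.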

Groups

*

Rule Path
Disallow /*?*enddate=
Disallow /commentsajax.aspx
Disallow /contactajax.aspx
Disallow /controlpage.aspx
Disallow /emailchangeajax.aspx
Disallow /facebookajax.aspx
Disallow /getimage.aspx
Disallow /newsletter/
Disallow /newsletterpromoajax.aspx
Disallow /newslettersubscribe.aspx
Disallow /pwapushprocessajax.aspx
Disallow /recaptchaajax.aspx
Disallow /rss.aspx
Disallow /search.aspx
Disallow /story/comments/
Disallow /submitlink.aspx
Disallow /useremailajax.aspx
Disallow /usererrorreportajax.aspx
Disallow /utility.aspx
Disallow /widgets/
Disallow /widgetsite/
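
The rules above can be checked against a URL path with a small matcher. The sketch below assumes the wildcard semantics described in RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end of the path, and every other rule is a case-sensitive prefix match. Because this group contains only Disallow rules, no Allow/Disallow precedence logic is needed. The function names and the test paths are hypothetical. Python's standard `urllib.robotparser` is not used here because it treats `*` inside a path literally.

```python
import re

# Disallow paths copied from the "*" group in the report.
DISALLOW = [
    "/*?*enddate=",
    "/commentsajax.aspx",
    "/contactajax.aspx",
    "/controlpage.aspx",
    "/emailchangeajax.aspx",
    "/facebookajax.aspx",
    "/getimage.aspx",
    "/newsletter/",
    "/newsletterpromoajax.aspx",
    "/newslettersubscribe.aspx",
    "/pwapushprocessajax.aspx",
    "/recaptchaajax.aspx",
    "/rss.aspx",
    "/search.aspx",
    "/story/comments/",
    "/submitlink.aspx",
    "/useremailajax.aspx",
    "/usererrorreportajax.aspx",
    "/utility.aspx",
    "/widgets/",
    "/widgetsite/",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule into a regex anchored at the start."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal parts and let each "*" match any run of characters.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


PATTERNS = [rule_to_regex(rule) for rule in DISALLOW]


def is_disallowed(path: str) -> bool:
    """Return True if the path (including any query string) matches a rule."""
    return any(pattern.match(path) for pattern in PATTERNS)


# Hypothetical paths for illustration.
print(is_disallowed("/search.aspx?q=x"))                # True
print(is_disallowed("/archive?startdate=1&enddate=2"))  # True
print(is_disallowed("/story/123/headline.html"))        # False
```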

Other Records

Field Value
sitemap https://www.newser.com/ossitemap1.index

Comments

  • Newser site: robots.txt
  • This file is used to allow crawlers to crawl our site.
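
The Groups, Other Records, and Comments sections above correspond to the three kinds of lines found in a robots.txt file. As a rough sketch of how such a file could be split into those parts, the following parser is an assumption for illustration and is not the scanner's actual implementation. The function name `parse_robots` and the abbreviated sample text are hypothetical.

```python
def parse_robots(text: str):
    """Split robots.txt text into groups, other records, and comments."""
    groups = {}       # user-agent -> list of (rule, path) pairs
    others = []       # (field, value) pairs outside groups, e.g. sitemap
    comments = []     # text following "#"
    agents = []       # user-agents of the group currently being read
    in_rules = False  # True once the current group has at least one rule

    for raw in text.splitlines():
        line, _, comment = raw.partition("#")
        if comment.strip():
            comments.append(comment.strip())
        if ":" not in line:
            continue
        # Split on the first colon only, so URLs in values stay intact.
        field, _, value = line.partition(":")
        field, value = field.strip().lower(), value.strip()
        if field == "user-agent":
            if in_rules:
                # A user-agent line after rules starts a new group.
                agents, in_rules = [], False
            agents.append(value)
            groups.setdefault(value, [])
        elif field in ("allow", "disallow"):
            in_rules = True
            for agent in agents:
                groups[agent].append((field, value))
        else:
            others.append((field, value))
    return groups, others, comments


# Abbreviated, hypothetical file assembled from values shown in the report.
sample = """\
# Newser site: robots.txt
User-agent: *
Disallow: /search.aspx
Disallow: /widgets/
Sitemap: https://www.newser.com/ossitemap1.index
"""

groups, others, comments = parse_robots(sample)
print(groups)
print(others)
print(comments)
```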