www2.startribune.com
robots.txt

Robots Exclusion Standard data for www2.startribune.com

Resource Scan

Scan Details

Site Domain www2.startribune.com
Base Domain startribune.com
Scan Status Ok
Last Scan 2024-11-11T14:11:35+00:00
Next Scan 2024-11-18T14:11:35+00:00

Last Scan

Scanned 2024-11-11T14:11:35+00:00
URL https://www2.startribune.com/robots.txt
Domain IPs 104.18.114.50, 104.18.115.50
Response IP 104.18.114.50
Found Yes
Hash eb0a396e2db858798e8f79b8e6236f27d71e4566635014e7f2a103f38ae03507
SimHash 641455309bb3

Groups

*

Rule Path
Allow /

*

Rule Path
Disallow /login

*

Rule Path
Disallow /obituaries
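
The three groups above all target the same user agent (*). Under RFC 9309, groups for the same user agent are combined and the longest matching path wins, with Allow preferred on a tie. The following is a minimal Python sketch of that evaluation, not the scanner's own logic. The names RULES and is_allowed are hypothetical, the rule list is copied from the groups listed above, and wildcard characters (* and $ inside paths) are deliberately not handled because none appear here.

```python
# Rules taken from the three "*" groups in the listing above, merged into
# one list (assumption: the groups are combined as RFC 9309 describes).
RULES = [
    ("allow", "/"),
    ("disallow", "/login"),
    ("disallow", "/obituaries"),
]


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match semantics.

    Simplified: plain prefix matching only, no "*" or "$" pattern support.
    """
    best_len = -1
    best_allow = True  # a path matched by no rule is allowed
    for kind, prefix in rules:
        if path.startswith(prefix):
            allow = kind == "allow"
            # A longer prefix wins; on an equal-length tie, allow wins.
            if len(prefix) > best_len or (len(prefix) == best_len and allow):
                best_len = len(prefix)
                best_allow = allow
    return best_allow


if __name__ == "__main__":
    for p in ["/", "/sports", "/login", "/login/reset", "/obituaries/123"]:
        print(p, "allowed" if is_allowed(p) else "blocked")
```

With these rules, /login and /obituaries (and anything beneath them) are blocked while the rest of the site stays crawlable. A parser that keeps only the first * group, or that applies the first matching rule instead of the longest, could reach a different result because Allow / appears first.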

Other Records

Field Value
sitemap https://www.startribune.com/sitemap-fresh-news-index.xml/
sitemap https://www.startribune.com/sitemap-fresh-video-index.xml/
sitemap https://www.startribune.com/sitemap-full-index.xml/
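
Sitemap records are not tied to any user-agent group, so they can be collected independently of the rules. A minimal Python sketch of that extraction follows. The function name sitemap_urls and the SAMPLE text are hypothetical; SAMPLE is a shortened reconstruction for illustration, not the file's verbatim contents.

```python
def sitemap_urls(robots_txt):
    """Collect the values of all sitemap records in a robots.txt body."""
    urls = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        field, sep, value = line.partition(":")  # split at the first colon
        # Field names are case-insensitive.
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Illustrative input only (assumed layout, not the scanned file itself).
SAMPLE = """\
User-agent: *
Allow: /

Sitemap: https://www.startribune.com/sitemap-fresh-news-index.xml/
sitemap: https://www.startribune.com/sitemap-full-index.xml/
"""

if __name__ == "__main__":
    for url in sitemap_urls(SAMPLE):
        print(url)
```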

Comments

  • crawlers trigger our login attempt rate limiting
  • crawlers, stay away from obits