timesunion.com
robots.txt

Robots Exclusion Standard data for timesunion.com

Resource Scan

Scan Details

Site Domain	timesunion.com
Base Domain	timesunion.com
Scan Status	Ok
Last Scan	2024-10-28T08:09:53+00:00
Next Scan	2024-11-04T08:09:53+00:00

Last Scan

Scanned	2024-10-28T08:09:53+00:00
URL	https://timesunion.com/robots.txt
Redirect	https://www.timesunion.com/robots.txt
Redirect Domain	www.timesunion.com
Redirect Base	timesunion.com
Domain IPs	98.129.228.59
Redirect IPs	151.101.0.200, 151.101.128.200, 151.101.192.200, 151.101.64.200
Response IP	199.232.44.200
Found	Yes
Hash	15f947dd645110dc01c78bd8d0db55f42665c3f5ae453f7e8ddde57ce3b980fd
SimHash	c829096782d2

Groups

*

Rule	Path
Disallow	/style/beauty/hearstmagazines/
Disallow	/style/fashion/hearstmagazines/
Disallow	/living/relationships/hearstmagazines/
Disallow	/homeandgarden/home/hearstmagazines/
Disallow	/living/wellness/hearstmagazines/
Disallow	/sponsored
Disallow	/events/
Disallow	/search

googlebot-news

Rule	Path
Disallow	/business/press-releases
Disallow	/news/article/Your-horoscope

ccbot

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/413gkwMT/

applebot

Rule	Path
Disallow	/
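The groups listed above can be read as a lookup from user agent to disallowed path prefixes. The following is a minimal sketch of that lookup, not the scanner's or any crawler's actual logic. The names `GROUPS`, `rules_for`, and `is_allowed` are hypothetical. It assumes plain prefix matching (the scan lists only Disallow rules without wildcards), exact case-insensitive matching of the user-agent name, and that the two `*` groups are combined, as RFC 9309 describes for groups that match the same user agent.

```python
# Groups as listed in the scan above: (user agent, disallowed path prefixes).
GROUPS = [
    ("*", [
        "/style/beauty/hearstmagazines/",
        "/style/fashion/hearstmagazines/",
        "/living/relationships/hearstmagazines/",
        "/homeandgarden/home/hearstmagazines/",
        "/living/wellness/hearstmagazines/",
        "/sponsored",
        "/events/",
        "/search",
    ]),
    ("googlebot-news", [
        "/business/press-releases",
        "/news/article/Your-horoscope",
    ]),
    ("ccbot", ["/"]),
    ("*", ["/413gkwMT/"]),
    ("applebot", ["/"]),
]


def rules_for(agent):
    """Collect Disallow prefixes from every group naming `agent`.

    Falls back to the combined '*' groups when no group names the agent.
    Assumption: exact, case-insensitive name matching (a simplification).
    """
    agent = agent.lower()
    named = [p for ua, paths in GROUPS if ua.lower() == agent for p in paths]
    if named:
        return named
    return [p for ua, paths in GROUPS if ua == "*" for p in paths]


def is_allowed(agent, path):
    """Return True unless `path` starts with one of the agent's Disallow prefixes."""
    return not any(path.startswith(prefix) for prefix in rules_for(agent))


print(is_allowed("ccbot", "/news/"))            # blocked site-wide
print(is_allowed("googlebot-news", "/search"))  # its own group replaces '*'
print(is_allowed("somebot", "/search?q=x"))     # falls back to '*'
```

Note that an agent with its own group, such as googlebot-news, is checked only against that group in this sketch, so the `*` rules do not apply to it.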

Other Records

Field	Value
sitemap	https://www.timesunion.com/sitemap.xml
sitemap	https://www.timesunion.com/sitemap_news.xml
sitemap	https://www.timesunion.com/projects/sitemap_projects.xml
sitemap	https://www.timesunion.com/sitemap_devhub.xml
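The sitemap records above correspond to `Sitemap:` lines in the robots.txt file. As a rough illustration, Python's standard `urllib.robotparser` (3.8 or later) can read them back through `RobotFileParser.site_maps()`. The list `SITEMAP_LINES` is a hypothetical reconstruction of those lines from the table, not the fetched file itself, and no network request is made.

```python
from urllib.robotparser import RobotFileParser

# Sitemap records from the table above, written back as robots.txt lines
# (a reconstruction for illustration, not the original file).
SITEMAP_LINES = [
    "Sitemap: https://www.timesunion.com/sitemap.xml",
    "Sitemap: https://www.timesunion.com/sitemap_news.xml",
    "Sitemap: https://www.timesunion.com/projects/sitemap_projects.xml",
    "Sitemap: https://www.timesunion.com/sitemap_devhub.xml",
]

parser = RobotFileParser()
parser.parse(SITEMAP_LINES)

# site_maps() returns None when no Sitemap lines were found.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```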
