sfchronicle.com
robots.txt

Robots Exclusion Standard data for sfchronicle.com

Resource Scan

Scan Details

Site Domain	sfchronicle.com
Base Domain	sfchronicle.com
Scan Status	Ok
Last Scan	2024-11-02T04:09:34+00:00
Next Scan	2024-11-09T04:09:34+00:00

Last Scan

Scanned	2024-11-02T04:09:34+00:00
URL	https://sfchronicle.com/robots.txt
Redirect	https://www.sfchronicle.com/robots.txt
Redirect Domain	www.sfchronicle.com
Redirect Base	sfchronicle.com
Domain IPs	98.129.228.59
Redirect IPs	151.101.0.200, 151.101.128.200, 151.101.192.200, 151.101.64.200
Response IP	199.232.44.200
Found	Yes
Hash	a8906eef0c0e1fa7b12e259f578b46d01f6235c8fca583fafdfb23c08e45eb53
SimHash	892e4856a2f2

Groups

*

Rule	Path
Disallow	/style/beauty/hearstmagazines/
Disallow	/style/fashion/hearstmagazines/
Disallow	/living/relationships/hearstmagazines/
Disallow	/homeandgarden/home/hearstmagazines/
Disallow	/living/wellness/hearstmagazines/
Disallow	/sponsored
Disallow	/adtest
Disallow	/events/
Disallow	/search

googlebot-news

Rule	Path
Disallow	/business/press-releases/

ccbot

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/413gkwMT/

applebot-extended

Rule	Path
Disallow	/private/

Other Records

Field	Value
sitemap	https://www.sfchronicle.com/sitemap.xml
sitemap	https://www.sfchronicle.com/sitemap_news.xml
sitemap	https://www.sfchronicle.com/projects/sitemap_projects.xml
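
The groups and sitemap records above can be checked against sample paths with Python's standard `urllib.robotparser`. This is a minimal sketch, not the site's actual file: the `ROBOTS_TXT` text is reconstructed from the tables in this report (the original casing, comments, and line order are unknown), and the agent name `ExampleBot` and the sample paths are hypothetical. Parsers differ in how they treat a repeated `*` group, so the sketch does not test paths under `/413gkwMT/`.

```python
from urllib.robotparser import RobotFileParser

# Reconstruction from the report's Groups and Other Records tables.
# Assumption: the real file may differ in casing, comments, and spacing.
ROBOTS_TXT = """\
User-agent: *
Disallow: /style/beauty/hearstmagazines/
Disallow: /style/fashion/hearstmagazines/
Disallow: /living/relationships/hearstmagazines/
Disallow: /homeandgarden/home/hearstmagazines/
Disallow: /living/wellness/hearstmagazines/
Disallow: /sponsored
Disallow: /adtest
Disallow: /events/
Disallow: /search

User-agent: googlebot-news
Disallow: /business/press-releases/

User-agent: ccbot
Disallow: /

User-agent: *
Disallow: /413gkwMT/

User-agent: applebot-extended
Disallow: /private/

Sitemap: https://www.sfchronicle.com/sitemap.xml
Sitemap: https://www.sfchronicle.com/sitemap_news.xml
Sitemap: https://www.sfchronicle.com/projects/sitemap_projects.xml
"""

BASE = "https://www.sfchronicle.com"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the reconstructed rules."""
    return parser.can_fetch(agent, BASE + path)


# Hypothetical agents and paths, chosen to exercise each named group.
checks = [
    ("ExampleBot", "/search"),                         # generic agent, first * group
    ("ExampleBot", "/sports/"),                        # no matching Disallow
    ("CCBot", "/sports/"),                             # ccbot group disallows everything
    ("Googlebot-News", "/business/press-releases/x"),  # googlebot-news group
    ("Googlebot-News", "/search"),                     # own group applies, not the * group
    ("Applebot-Extended", "/private/x"),               # applebot-extended group
]

for agent, path in checks:
    print(f"{agent:18} {path:32} allowed={allowed(agent, path)}")

# site_maps() requires Python 3.8 or later.
print(parser.site_maps())
```

A crawler with its own named group, such as `googlebot-news`, is matched against that group only, which is why `/search` is not blocked for it in this sketch even though the `*` group disallows it.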
