thetelegraph.com
robots.txt

Robots Exclusion Standard data for thetelegraph.com

Resource Scan

Scan Details

Site Domain	thetelegraph.com
Base Domain	thetelegraph.com
Scan Status	Ok
Last Scan	2024-11-09T23:18:46+00:00
Next Scan	2024-11-16T23:18:46+00:00

Last Scan

Scanned	2024-11-09T23:18:46+00:00
URL	https://thetelegraph.com/robots.txt
Redirect	https://www.thetelegraph.com/robots.txt
Redirect Domain	www.thetelegraph.com
Redirect Base	thetelegraph.com
Domain IPs	98.129.228.59
Redirect IPs	151.101.0.200, 151.101.128.200, 151.101.192.200, 151.101.64.200
Response IP	199.232.44.200
Found	Yes
Hash	1ce311548bc9ad84f932531b600afe0eb870ce8349cae5291f8389869891d887
SimHash	88aa044682d2
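
The Hash value above is 64 hexadecimal characters, which is the length of a SHA-256 digest. The report does not name the algorithm or say exactly which bytes are hashed, so the following is only a sketch of how a change check between scans could work, assuming SHA-256 over the raw response body. The function names `body_digest` and `has_changed` are hypothetical.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return a hex digest of a robots.txt response body.

    Assumption: SHA-256 over the raw bytes. The scan report does not
    confirm which algorithm or normalization it uses.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a freshly fetched body against the digest from an earlier scan."""
    return body_digest(body) != previous_digest


if __name__ == "__main__":
    # Placeholder body for illustration; not the real file contents.
    sample = b"User-agent: *\nDisallow: /search\n"
    stored = body_digest(sample)
    print(len(stored), has_changed(stored, sample))
```

A digest like this only detects byte-for-byte changes; the separate SimHash field suggests the scanner also tracks approximate similarity, which this sketch does not attempt.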

Groups

*

Rule	Path
Disallow	/style/beauty/hearstmagazines/
Disallow	/style/fashion/hearstmagazines/
Disallow	/living/relationships/hearstmagazines/
Disallow	/homeandgarden/home/hearstmagazines/
Disallow	/living/wellness/hearstmagazines/
Disallow	/sponsored
Disallow	/events/
Disallow	/search

googlebot-news

Rule	Path
Disallow	/business/press-releases
Disallow	/news/article/Your-horoscope

ccbot

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/413gkwMT/

applebot-extended

Rule	Path
Disallow	/private/
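
To see how the groups above apply to a given crawler, the rules can be replayed through Python's standard-library `urllib.robotparser`. This is a minimal sketch: `ROBOTS_TXT` is reconstructed from the tables in this report (the real file's exact layout and any unlisted directives are not known), and `build_parser`, `allowed`, and the `ExampleBot` agent are hypothetical names used for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in this scan; not a copy of the real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /style/beauty/hearstmagazines/
Disallow: /style/fashion/hearstmagazines/
Disallow: /living/relationships/hearstmagazines/
Disallow: /homeandgarden/home/hearstmagazines/
Disallow: /living/wellness/hearstmagazines/
Disallow: /sponsored
Disallow: /events/
Disallow: /search

User-agent: googlebot-news
Disallow: /business/press-releases
Disallow: /news/article/Your-horoscope

User-agent: ccbot
Disallow: /

User-agent: *
Disallow: /413gkwMT/

User-agent: applebot-extended
Disallow: /private/
"""

BASE = "https://www.thetelegraph.com"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


PARSER = build_parser(ROBOTS_TXT)


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the reconstructed rules."""
    return PARSER.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for agent, path in [
        ("ExampleBot", "/news/"),
        ("ExampleBot", "/search"),
        ("CCBot", "/news/"),
        ("Googlebot-News", "/business/press-releases"),
        ("Applebot-Extended", "/private/"),
    ]:
        print(agent, path, allowed(agent, path))
```

One caveat: the file contains two `*` groups. Python's parser keeps only the first `*` group as its default, so it will not apply the `/413gkwMT/` rule, whereas a crawler that merges groups for the same user agent would. Treat results for that path as parser-dependent.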

Other Records

Field	Value
sitemap	https://www.thetelegraph.com/sitemap.xml
sitemap	https://www.thetelegraph.com/sitemap_news.xml
sitemap	https://www.thetelegraph.com/sitemap_devhub.xml
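
The sitemap records can be read with the same standard-library parser, whose `site_maps()` method (Python 3.8 and later) returns the `Sitemap` entries found in a robots.txt file. The sketch below feeds it only the three records listed above; the `SITEMAP_LINES` constant and `list_sitemaps` helper are hypothetical names.

```python
from urllib.robotparser import RobotFileParser

# Only the sitemap records from this report, written as robots.txt lines.
SITEMAP_LINES = [
    "Sitemap: https://www.thetelegraph.com/sitemap.xml",
    "Sitemap: https://www.thetelegraph.com/sitemap_news.xml",
    "Sitemap: https://www.thetelegraph.com/sitemap_devhub.xml",
]


def list_sitemaps(lines: list[str]) -> list[str]:
    """Return the sitemap URLs declared in the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    # site_maps() returns None when no Sitemap records are present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in list_sitemaps(SITEMAP_LINES):
        print(url)
```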
