trujillo.hoy.es
robots.txt

Robots Exclusion Standard data for trujillo.hoy.es

Resource Scan

Scan Details

Site Domain	trujillo.hoy.es
Base Domain	hoy.es
Scan Status	Ok
Last Scan	2025-10-13T08:41:30+00:00
Next Scan	2025-11-12T08:41:30+00:00

Last Scan

Scanned	2025-10-13T08:41:30+00:00
URL	https://trujillo.hoy.es/robots.txt
Domain IPs	151.101.131.42, 151.101.195.42, 151.101.3.42, 151.101.67.42
Response IP	146.75.47.42
Found	Yes
Hash	3811efecf060cfaf00261279014fb918dd00f1678178795166ede85f722af60c
SimHash	07149b444792

Groups

*

Rule	Path
Disallow	/guia-tv/
Disallow	/temas/
Disallow	/hemeroteca/*.html

mediapartners-google

Rule	Path
Allow	/

googlebot-image

Rule	Path
Allow	/

google-extended

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/
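The groups listed above can be checked against sample paths with Python's standard `urllib.robotparser`. This is a minimal sketch: the `ROBOTS_TXT` text is reconstructed from the tables in this report (the original file's exact layout and ordering are an assumption), and the helper name `allowed` and the sample paths are hypothetical. Note that `urllib.robotparser` does simple prefix matching and does not implement `*` wildcards inside paths, so the `/hemeroteca/*.html` rule is included for completeness but not exercised here.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the Groups tables in this report (assumption: the
# real file may order or format these groups differently).
ROBOTS_TXT = """\
User-agent: *
Disallow: /guia-tv/
Disallow: /temas/
Disallow: /hemeroteca/*.html

User-agent: mediapartners-google
Allow: /

User-agent: googlebot-image
Allow: /

User-agent: google-extended
Disallow: /

User-agent: perplexitybot
Disallow: /
"""

BASE = "https://trujillo.hoy.es"

# Parse the text directly instead of fetching it over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rules above."""
    return parser.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    # "SomeBot" is a hypothetical crawler that falls under the "*" group.
    for agent, path in [
        ("SomeBot", "/guia-tv/hoy"),
        ("SomeBot", "/noticias/"),
        ("Googlebot-Image", "/guia-tv/hoy"),
        ("Google-Extended", "/noticias/"),
        ("PerplexityBot", "/noticias/"),
    ]:
        print(agent, path, allowed(agent, path))
```

Under these rules a generic crawler is kept out of `/guia-tv/` and `/temas/` only, while `google-extended` and `perplexitybot` are blocked from the whole site.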

Other Records

Field	Value
sitemap	https://trujillo.hoy.es/sitemap.xml
sitemap	https://trujillo.hoy.es/sitemap.incremental.xml
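The sitemap records above can be read back out of a robots.txt body with `RobotFileParser.site_maps()` (available in Python 3.8 and later). This is a sketch under assumptions: the `SAMPLE` text is a shortened reconstruction that keeps one rule plus the two sitemap lines from this report, not the site's actual file.

```python
from urllib.robotparser import RobotFileParser

# Shortened reconstruction (assumption): one rule from the "*" group
# plus the two sitemap records listed in this report.
SAMPLE = """\
User-agent: *
Disallow: /guia-tv/

Sitemap: https://trujillo.hoy.es/sitemap.xml
Sitemap: https://trujillo.hoy.es/sitemap.incremental.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SAMPLE.splitlines())

# site_maps() returns None when no Sitemap lines are present,
# so fall back to an empty list.
sitemaps = sitemap_parser.site_maps() or []

if __name__ == "__main__":
    for url in sitemaps:
        print(url)
```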

Comments

Robots trujillo.hoy.es
Sitemaps
User Agents
