trujillo.hoy.es
robots.txt

Robots Exclusion Standard data for trujillo.hoy.es

Resource Scan

Scan Details

Site Domain trujillo.hoy.es
Base Domain hoy.es
Scan Status Ok
Last Scan 2025-10-13T08:41:30+00:00
Next Scan 2025-11-12T08:41:30+00:00

Last Scan

Scanned 2025-10-13T08:41:30+00:00
URL https://trujillo.hoy.es/robots.txt
Domain IPs 151.101.131.42, 151.101.195.42, 151.101.3.42, 151.101.67.42
Response IP 146.75.47.42
Found Yes
Hash 3811efecf060cfaf00261279014fb918dd00f1678178795166ede85f722af60c
SimHash 07149b444792

Groups

*

Rule Path
Disallow /guia-tv/
Disallow /temas/
Disallow /hemeroteca/*.html
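
The `/hemeroteca/*.html` rule uses a wildcard, so it is not a plain prefix like the two rules above it. The following is a minimal sketch of how such a pattern could be matched, assuming the common convention in which `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names `robots_pattern_to_regex` and `path_matches` and the sample paths are hypothetical and only illustrate the idea; they are not taken from the scan.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    Assumed convention: '*' matches any sequence of characters and a
    trailing '$' anchors the end of the path. Without '$' the pattern
    is treated as a prefix.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece and join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def path_matches(pattern: str, path: str) -> bool:
    """Return True if the URL path is covered by the rule pattern."""
    # re.match anchors at the start of the path, giving prefix semantics.
    return robots_pattern_to_regex(pattern).match(path) is not None


# Hypothetical paths, used only to exercise the three rules of the '*' group.
print(path_matches("/hemeroteca/*.html", "/hemeroteca/2020/01/nota.html"))
print(path_matches("/hemeroteca/*.html", "/hemeroteca/"))
print(path_matches("/guia-tv/", "/guia-tv/hoy"))
```

Because the pattern has no trailing `$`, a path such as `/hemeroteca/nota.html?page=2` would also be covered under this convention.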

mediapartners-google

Rule Path
Allow /

googlebot-image

Rule Path
Allow /

google-extended

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://trujillo.hoy.es/sitemap.xml
sitemap https://trujillo.hoy.es/sitemap.incremental.xml
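
The groups and sitemap records above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds a robots.txt from the fields listed in this report; it is a reconstruction, not the verbatim file (comments are omitted, so it would not reproduce the hash shown above), and the ordering of records is an assumption. `ExampleBot`, the `allowed` helper, and the sample paths are hypothetical. The standard-library parser matches paths by prefix and does not expand `*`, so the `/hemeroteca/*.html` rule is not exercised here.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the scan report; record order and spacing are assumed.
ROBOTS_TXT = """\
User-agent: *
Disallow: /guia-tv/
Disallow: /temas/
Disallow: /hemeroteca/*.html

User-agent: mediapartners-google
Allow: /

User-agent: googlebot-image
Allow: /

User-agent: google-extended
Disallow: /

User-agent: perplexitybot
Disallow: /

Sitemap: https://trujillo.hoy.es/sitemap.xml
Sitemap: https://trujillo.hoy.es/sitemap.incremental.xml
"""

BASE = "https://trujillo.hoy.es"

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the reconstructed rules."""
    return parser.can_fetch(agent, BASE + path)


# ExampleBot is a made-up crawler that falls through to the '*' group.
for agent, path in [
    ("ExampleBot", "/guia-tv/"),
    ("ExampleBot", "/noticia-de-ejemplo"),
    ("Googlebot-Image", "/temas/ejemplo"),
    ("Google-Extended", "/"),
    ("PerplexityBot", "/"),
]:
    print(f"{agent:16} {path:22} allowed={allowed(agent, path)}")

print(parser.site_maps())
```

Under these rules a crawler with its own group is governed by that group alone, which is why `googlebot-image` is allowed under `/temas/` while a generic crawler is not.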

Comments

  • Robots trujillo.hoy.es
  • Sitemaps
  • User Agents