degustalarioja.com
robots.txt

Robots Exclusion Standard data for degustalarioja.com

Resource Scan

Scan Details

Site Domain degustalarioja.com
Base Domain degustalarioja.com
Scan Status Ok
Last Scan 2024-11-12T07:09:57+00:00
Next Scan 2024-11-19T07:09:57+00:00

Last Scan

Scanned 2024-11-12T07:09:57+00:00
URL https://www.degustalarioja.com/robots.txt
Domain IPs 23.209.46.137, 23.209.46.149
Response IP 23.215.7.20
Found Yes
Hash 8df3b4aeb96f560377429b972c6f5b6c0a8fe50af06247afae6cbdab80fffd35
SimHash 2b24d05cc333

Groups

mediapartners-google

Rule Path
Allow /

googlebot-image

Rule Path
Allow /

twitterbot

Rule Path
Disallow *

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5

*

Rule Path
Disallow /modulos/
Disallow /includes/
Disallow /NFS/
Disallow /*?ns_
Disallow /4900/webm.LARIOJA/
Disallow /4900/vocento.larioja/
Disallow /*/guia-tv/
Disallow /guia-tv/
Disallow /*/servicios/
Disallow /eltiempo/
Disallow /temas/
Disallow /hemeroteca/

Other Records

Field Value
sitemap https://www.degustalarioja.com/sitemap.xml
sitemap https://www.degustalarioja.com/sitemap.incremental.xml
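
The Disallow rules of the "*" group can be checked against a path with a short script. This is a minimal sketch, assuming the wildcard and longest-match semantics described in RFC 9309 ("*" matches any run of characters, a trailing "$" anchors the end, the longest matching pattern wins). The report does not say how the scanner itself evaluates rules. The names DEFAULT_GROUP, pattern_to_regex and is_allowed are hypothetical, and the sample paths in the usage lines are invented for illustration.

```python
import re

# Rules of the "*" group, copied from the listing above.
DEFAULT_GROUP = [
    ("disallow", "/modulos/"),
    ("disallow", "/includes/"),
    ("disallow", "/NFS/"),
    ("disallow", "/*?ns_"),
    ("disallow", "/4900/webm.LARIOJA/"),
    ("disallow", "/4900/vocento.larioja/"),
    ("disallow", "/*/guia-tv/"),
    ("disallow", "/guia-tv/"),
    ("disallow", "/*/servicios/"),
    ("disallow", "/eltiempo/"),
    ("disallow", "/temas/"),
    ("disallow", "/hemeroteca/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex (assumed semantics).

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(path, rules=DEFAULT_GROUP):
    """Return True if `path` (path plus optional query string) may be crawled.

    The longest matching pattern wins, Allow wins ties, and a path that
    matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, like a prefix rule.
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


# Hypothetical sample paths, not taken from the scan.
print(is_allowed("/recetas/"))
print(is_allowed("/hemeroteca/2020/"))
print(is_allowed("/es/guia-tv/hoy"))
print(is_allowed("/recetas/?ns_campaign=x"))
```

Matching is case-sensitive, so "/NFS/" does not cover "/nfs/". The pattern "/*/guia-tv/" needs at least one path segment before "/guia-tv/", which is presumably why the file also lists "/guia-tv/" on its own.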

Comments

  • Robots www.degustalarioja.com
  • Sitemaps
  • User Agents