diariodenavarra.es
robots.txt

Robots Exclusion Standard data for diariodenavarra.es

Resource Scan

Scan Details

Site Domain diariodenavarra.es
Base Domain diariodenavarra.es
Scan Status Ok
Last Scan 2024-05-08T04:40:00+00:00
Next Scan 2024-05-15T04:40:00+00:00

Last Scan

Scanned 2024-05-08T04:40:00+00:00
URL https://diariodenavarra.es/robots.txt
Redirect https://www.diariodenavarra.es/robots.txt
Redirect Domain www.diariodenavarra.es
Redirect Base diariodenavarra.es
Domain IPs 108.138.246.34, 108.138.246.42, 108.138.246.71, 108.138.246.88
Redirect IPs 108.138.246.34, 108.138.246.42, 108.138.246.71, 108.138.246.88, 2600:9000:2548:4800:1d:70a8:e000:93a1, 2600:9000:2548:5400:1d:70a8:e000:93a1, 2600:9000:2548:5800:1d:70a8:e000:93a1, 2600:9000:2548:6a00:1d:70a8:e000:93a1, 2600:9000:2548:800:1d:70a8:e000:93a1, 2600:9000:2548:a200:1d:70a8:e000:93a1, 2600:9000:2548:ac00:1d:70a8:e000:93a1, 2600:9000:2548:f400:1d:70a8:e000:93a1
Response IP 18.165.171.35
Found Yes
Hash ce8bc08c852b0a1ed4b3bf1ff7a26278830e29fb80f514a8892f3461bee6961f
SimHash b33e7e4888f6

Groups

*

Rule Path
Disallow /cgi-bin/
Disallow /db/
Disallow /admin/
Disallow /cache/
Disallow /includes/
Disallow /templates/
Disallow /index.php/mod.global/mem.buscadorHemeroteca
Disallow /index.php/mod.global/mem.buscadorHemeroteca/
Disallow /index.php/id.
Disallow /index.php/option.
Disallow /index.php/com.
Disallow /index.php/mod.global/mem.enviarAmigo/
Disallow /index.php/mod.noticias/mem.enviarRedaccionFormulario/
Disallow /edicionimpresa*
Disallow /imagenesDDN*
Disallow /masactualidad*
Disallow /click
Disallow /graficosFlash*
Disallow /yonobajo*
Disallow /decimoaniversario*
Disallow /especiales*
Disallow /contador.php?var=*
Disallow /*/?texto-libre=*
Disallow /*/?palabra-clave-galleries=*
Disallow /*/?palabra-clave=*
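
Several of the paths in this group use the `*` wildcard. As a rough illustration of how such patterns are commonly interpreted (prefix match from the start of the path, `*` standing for any run of characters, an optional trailing `$` anchoring the end), here is a minimal Python sketch. The function name `rule_matches` and the sample request paths are hypothetical and not taken from the scan; individual crawlers may implement matching differently.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path (plus query).

    Assumed semantics: match from the start of the path, '*' matches any
    sequence of characters, a trailing '$' anchors the end of the path.
    """
    if not pattern:
        return False  # an empty pattern matches nothing
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then join the pieces with '.*' for each '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only anchors at the start, which gives prefix matching.
    return re.match(regex, path) is not None


# Patterns copied from the '*' group; the request paths are made-up examples.
samples = [
    ("/edicionimpresa*", "/edicionimpresa/2024"),
    ("/*/?texto-libre=*", "/buscador/?texto-libre=osasuna"),
    ("/click", "/clicks/x"),
    ("/cgi-bin/", "/cgi-bin"),
]
for pattern, path in samples:
    print(pattern, path, rule_matches(pattern, path))
```

Under these assumed semantics a trailing `*` is redundant: `/edicionimpresa*` and `/edicionimpresa` block the same set of paths, because rules are already prefix matches.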

deepcrawl

Rule Path
Disallow
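
The empty path here is significant: a `Disallow` rule with no value conventionally disallows nothing, so `deepcrawl` is allowed everywhere, unlike the agents listed after it. The sketch below checks this with Python's standard `urllib.robotparser`. The excerpt is a hand-written reconstruction from this report, not the original file (its exact casing, ordering, and comments are assumptions), and `examplebot` is a hypothetical agent with no group of its own.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (assumption): three groups taken from this report.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /admin/

User-agent: deepcrawl
Disallow:

User-agent: gptbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

BASE = "https://www.diariodenavarra.es"

# deepcrawl has its own group with an empty Disallow, so nothing is blocked.
print(parser.can_fetch("deepcrawl", BASE + "/admin/"))
# gptbot is blocked from the whole site.
print(parser.can_fetch("gptbot", BASE + "/"))
# An agent without its own group falls back to the '*' group.
print(parser.can_fetch("examplebot", BASE + "/admin/"))
print(parser.can_fetch("examplebot", BASE + "/"))
```

Note that `urllib.robotparser` performs plain prefix matching on paths, so it is suitable for checking simple rules like these but should not be relied on for the wildcard patterns in the `*` group.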

addthis.com

Rule Path
Disallow /

admantx

Rule Path
Disallow /

bdcbot

Rule Path
Disallow /

bender

Rule Path
Disallow /

bixocrawler

Rule Path
Disallow /

bl.uk_lddc_bot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

bubing

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

cncdialer

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

crystalsemanticsbot

Rule Path
Disallow /

cyberalert

Rule Path
Disallow /

digext

Rule Path
Disallow /

discobot

Rule Path
Disallow /

discoverybot

Rule Path
Disallow /

dloader

Rule Path
Disallow /

dloader(naverrobot)

Rule Path
Disallow /

doc

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

dts agent

Rule Path
Disallow /

exabot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

fairshare

Rule Path
Disallow /

fetch

Rule Path
Disallow /

flamingo_searchengine

Rule Path
Disallow /

genieo

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

heritrix/3.3.0

Rule Path
Disallow /

httrack

Rule Path
Disallow /

integromedb

Rule Path
Disallow /

istellabot

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

kimengi

Rule Path
Disallow /

kimengi/nineconnections.com

Rule Path
Disallow /

larbin

Rule Path
Disallow /

lexxebot/1.0

Rule Path
Disallow /

libwww

Rule Path
Disallow /

linko

Rule Path
Disallow /

livelapbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

maxthon

Rule Path
Disallow /

metauri

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

moreover

Rule Path
Disallow /

moreoverbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

nabot

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

nerdbynature.bot

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

netseer crawler

Rule Path
Disallow /

newscan

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

npbot

Rule Path
Disallow /

nutch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

orthogaffe

Rule Path
Disallow /

pixray-seeker

Rule Path
Disallow /

proximic

Rule Path
Disallow /

psbot

Rule Path
Disallow /

queryseekerspider

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

sitebot/0.1

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

slurp

Rule Path
Disallow /

sogou

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

spbot

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

trovitbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

umbot-ln

Rule Path
Disallow /

unisterbot

Rule Path
Disallow /

universalfeedparser

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webzip

Rule Path
Disallow /

wesee:search

Rule Path
Disallow /

wget

Rule Path
Disallow /

wotbot

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

xenu

Rule Path
Disallow /

yasni

Rule Path
Disallow /

zao

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.diariodenavarra.es/sitemap-index.xml
sitemap https://www.diariodenavarra.es/sitemap-category.xml
sitemap https://www.diariodenavarra.es/sitemap-tag.xml
sitemap https://www.diariodenavarra.es/sitemap-google-news.xml
sitemap https://www.diariodenavarra.es/uploads/sm/historico/idxsitemap.xml

Comments

  • Sitemaps
  • Disallow: /gigya/
  • Modulos
  • Old web
  • Disallow: /20* #BPV-681-32602
  • Disallow: /actualidad*
  • Permitir HSEARCH
  • Bloqueo de bots
  • INI - AI Related agents
  • END - AI Related agents

Warnings

  • 2 invalid lines.