librotea.eldiario.es
robots.txt

Robots Exclusion Standard data for librotea.eldiario.es

Resource Scan

Scan Details

Site Domain librotea.eldiario.es
Base Domain eldiario.es
Scan Status Ok
Last Scan 2025-02-23T09:27:14+00:00
Next Scan 2025-03-25T09:27:14+00:00

Last Scan

Scanned 2025-02-23T09:27:14+00:00
URL https://librotea.eldiario.es/robots.txt
Domain IPs 13.35.185.113, 13.35.185.119, 13.35.185.122, 13.35.185.31, 2600:9000:2079:2200:3:1906:7b00:93a1, 2600:9000:2079:3000:3:1906:7b00:93a1, 2600:9000:2079:3c00:3:1906:7b00:93a1, 2600:9000:2079:4c00:3:1906:7b00:93a1, 2600:9000:2079:6400:3:1906:7b00:93a1, 2600:9000:2079:7a00:3:1906:7b00:93a1, 2600:9000:2079:c800:3:1906:7b00:93a1, 2600:9000:2079:fe00:3:1906:7b00:93a1
Response IP 18.155.68.54
Found Yes
Hash eba96c39e4462837675d8e87ffefaf57ca672fe41856f02645c64ccee847d290
SimHash 38961f17e4e4

Groups

*

Rule Path
Allow /
Allow /wp-sitemap.xml
Allow /estanterias/*
Allow /inspiradores/*
Allow /articulos/*
Allow /libros/*
Allow /*.php$
Allow /*.js$
Allow /*.inc$
Allow /*.css$
Allow /*.xhtml$
Allow /*.gif$
Allow /*.png$
Allow /*.jpeg$
Allow /*.jpg$
Allow /*.ico$
Disallow /buscar
Disallow /restablecer-contrasena
Disallow /wp-content/
Disallow /bundles/
Disallow /css/
Disallow /fonts/
Disallow /sites/
Disallow /images/
Disallow /js/
Disallow /admin/
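
The "*" group mixes a blanket "Allow /" with wildcard Allow rules and narrower Disallow rules, so the outcome for a given path depends on how a crawler resolves conflicts. The sketch below assumes the longest-match convention described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The names RULES, pattern_to_regex and is_allowed are illustrative, the rule list is only a subset of the table above, and individual crawlers may resolve conflicts differently.

```python
import re

# Subset of the "*" group rules from the scan above (illustrative, not complete).
RULES = [
    ("allow", "/"),
    ("allow", "/articulos/*"),
    ("allow", "/*.css$"),
    ("allow", "/*.js$"),
    ("disallow", "/buscar"),
    ("disallow", "/css/"),
    ("disallow", "/admin/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start.

    "*" matches any run of characters; a trailing "$" anchors the end of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under the given rules.

    Assumption: longest matching pattern wins, Allow wins a tie,
    and a path that matches no rule is allowed.
    """
    best_len, verdict = -1, True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, verdict = length, kind == "allow"
    return verdict


if __name__ == "__main__":
    for path in ("/articulos/ejemplo", "/buscar?q=libro", "/css/site.css", "/css/font.woff"):
        print(path, is_allowed(path))
```

Under this convention "/css/site.css" stays crawlable because "/*.css$" is a longer pattern than "/css/", while "/css/font.woff" is blocked. A parser that instead applies the first matching rule would treat the leading "Allow /" as matching every path.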

genio

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

scooperbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

flamingo_searchengine

Rule Path
Disallow /

facebot

Rule Path
Disallow /

luminatebot

Rule Path
Disallow /

vagabondo

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

r6_commentreader

Rule Path
Disallow /

yeti

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

showyoubot

Rule Path
Disallow /

gozaikbot

Rule Path
Disallow /

python-requests

Rule Path
Disallow /

queryseekerspider

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

yandeximages

Rule Path
Disallow /

apache-httpclient

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

buck

Rule Path
Disallow /

wikido

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

sogou

Rule Path
Disallow /

zend_http_client

Rule Path
Disallow /

robots

Rule Path
Disallow /

arquivo-web-crawler

Rule Path
Disallow /

bidswitchbot

Rule Path
Disallow /

g-i-g-a-b-o-t

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

garlikcrawler

Rule Path
Disallow /

caam

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

clickagy intelligence bot

Rule Path
Disallow /

jersey

Rule Path
Disallow /

libwww-perl

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

omgili

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

python-urllib

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /
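
Every named group above carries a single "Disallow /", so which group a crawler falls into decides whether it may fetch anything at all. A minimal sketch of that selection follows, assuming case-insensitive matching of the crawler's product token and a fallback to the "*" group as described in RFC 9309. BLOCKED_AGENTS is only a subset of the groups listed above, and select_group is a hypothetical helper name.

```python
# Subset of the named groups from the scan, each of which has "Disallow /".
BLOCKED_AGENTS = {"mj12bot", "ahrefsbot", "ccbot", "python-requests", "scrapy"}


def select_group(product_token, named_groups=BLOCKED_AGENTS):
    """Return the group that applies to a crawler's product token.

    Assumption: tokens are compared case-insensitively, and a crawler
    with no group of its own falls back to the "*" group.
    """
    token = product_token.lower()
    return token if token in named_groups else "*"


if __name__ == "__main__":
    print(select_group("AhrefsBot"))   # falls into its own blocked group
    print(select_group("Googlebot"))   # no named group, so "*" applies
```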

Other Records

Field Value
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-articles-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-bks-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-bnw-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-edt-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-ent-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-google-news-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-images-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-ntv-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-nws-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-shv-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-tags-index.xml
sitemap https://librotea.eldiario.es/sitemap/esp-sitemap-videos-index.xml

Comments

  • modified at 21-02-2025
  • Name: robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
  • Do not include the search page.
  • Do not include the password reset page.
  • Exclude WordPress paths.
  • Exclude static resource folders.
  • Site maps
  • Block admin section
  • Other bots
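
The comments note that robots.txt is honored only at the root of the host. As a small illustration, the sketch below derives the robots.txt location for any page URL by keeping the scheme and host and discarding the path, query and fragment. The function name robots_url and the sample page paths are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the root-level robots.txt URL for the host serving `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host (with port, if any); replace the path, drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("https://librotea.eldiario.es/libros/ejemplo?page=2"))
    print(robots_url("http://example.com/site/index.html"))
```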