andalucia.digital
robots.txt

Robots Exclusion Standard data for andalucia.digital

Resource Scan

Scan Details

Site Domain andalucia.digital
Base Domain andalucia.digital
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2026-01-05T21:10:50+00:00
Next Scan 2026-03-06T21:10:50+00:00

Last Successful Scan

Scanned 2025-11-07T03:53:22+00:00
URL https://andalucia.digital/robots.txt
Domain IPs 213.158.84.12
Response IP 213.158.84.12
Found Yes
Hash af080bb9e329aab0dcf08598bdf7e852707dfd0706340d363a9bbf699d40c5e1
SimHash 6af4d98005fb

Groups

*

Rule Path
Disallow /wp-login.php
Disallow /?s=
Disallow /search/
Disallow /wp-content/plugins/
Disallow /readme.html
Disallow /refer/
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php

Other Records

Field Value
crawl-delay 120

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 120

discobot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 4

yacybot

Rule Path
Disallow /

*

Rule Path
Allow /wp-content/uploads/
Disallow /cgi-bin
Disallow /wp-content/plugins/
Disallow /wp-content/themes/
Disallow /wp-includes/
Disallow /wp-admin/
Disallow /wp-
Disallow /?s=
Disallow /search
Allow /feed/$
Disallow /feed
Disallow /comments/feed
Disallow /*/feed/$
Disallow /*/feed/rss/$
Disallow /*/trackback/$
Disallow /*/*/feed/$
Disallow /*/*/feed/rss/$
Disallow /*/*/trackback/$
Disallow /*/*/*/feed/$
Disallow /*/*/*/feed/rss/$
Disallow /*/*/*/trackback/$
Allow /*.js$
Allow /*.css$
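
How the Allow and Disallow rules in this group interact depends on the matching algorithm a crawler uses. The sketch below assumes the longest-match reading from RFC 9309: the matching rule with the longest pattern wins, Allow wins a tie, `*` matches any run of characters, and a trailing `$` anchors the end of the path. The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative, not part of any crawler's API, and only this second `*` group is evaluated for brevity. A conforming parser would combine both `*` groups first.

```python
import re

# Rules copied from the second "*" group of the scan, in listed order.
RULES = [
    ("allow", "/wp-content/uploads/"),
    ("disallow", "/cgi-bin"),
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/wp-content/themes/"),
    ("disallow", "/wp-includes/"),
    ("disallow", "/wp-admin/"),
    ("disallow", "/wp-"),
    ("disallow", "/?s="),
    ("disallow", "/search"),
    ("allow", "/feed/$"),
    ("disallow", "/feed"),
    ("disallow", "/comments/feed"),
    ("disallow", "/*/feed/$"),
    ("disallow", "/*/feed/rss/$"),
    ("disallow", "/*/trackback/$"),
    ("disallow", "/*/*/feed/$"),
    ("disallow", "/*/*/feed/rss/$"),
    ("disallow", "/*/*/trackback/$"),
    ("disallow", "/*/*/*/feed/$"),
    ("disallow", "/*/*/*/feed/rss/$"),
    ("disallow", "/*/*/*/trackback/$"),
    ("allow", "/*.js$"),
    ("allow", "/*.css$"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and turn each "*" into ".*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Longest matching pattern wins; Allow wins a tie; no match means allowed."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


for sample in ["/feed/", "/category/news/feed/",
               "/wp-content/uploads/2024/photo.jpg",
               "/wp-content/themes/site/style.css"]:
    print(sample, is_allowed(sample))
```

The sample paths are hypothetical. Under this reading the main feed and uploaded media are crawlable and nested feeds are not. A stylesheet under /wp-content/themes/ remains disallowed, because Disallow /wp-content/themes/ is a longer pattern than Allow /*.css$.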

googlebot-image

Rule Path
Allow /wp-content/uploads/

adsbot-google

Rule Path
Allow /

googlebot-mobile

Rule Path
Allow /

msiecrawler

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

libwww

Rule Path
Disallow /

noxtrumbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 50

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 30

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
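
The scan lists msnbot and slurp twice each, with different crawl-delay values. The following sketch flags such conflicts from the values reported above. Taking the largest delay is an assumption made here as a conservative choice, not a rule from any specification (crawl-delay is a non-standard field), and the function names are illustrative.

```python
from collections import defaultdict

# (user-agent, crawl-delay) pairs in the order the scan lists them.
CRAWL_DELAYS = [
    ("*", 120),
    ("msnbot", 120),
    ("slurp", 4),
    ("noxtrumbot", 50),
    ("msnbot", 30),
    ("slurp", 10),
]


def conflicting_delays(records):
    """Return agents that appear with more than one distinct crawl-delay."""
    seen = defaultdict(list)
    for agent, delay in records:
        seen[agent.lower()].append(delay)
    return {agent: delays for agent, delays in seen.items()
            if len(set(delays)) > 1}


def conservative_delay(records, agent):
    """Pick the largest delay listed for an agent (assumed policy), or None."""
    delays = [d for a, d in records if a.lower() == agent.lower()]
    return max(delays) if delays else None


print(conflicting_delays(CRAWL_DELAYS))
print(conservative_delay(CRAWL_DELAYS, "msnbot"))
```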

Other Records

Field Value
sitemap http://andalucia.digital/sitemap.xml

Comments

  • robots.txt for a WordPress blog.
  • Block or allow access to attachment content.
  • (If the installation is in /public_html).
  • Deindex folders that begin with wp-
  • Allow the sitemap but not searches.
  • Allow the general feed for Google Blogsearch.
  • Prevent /permalink/feed/ from being indexed, since the comments feed usually ranks ahead of the posts.
  • Block URLs ending in /trackback/, which serve as Trackback URIs (duplicate content).
  • Avoid blocking CSS and JS.
  • List of bots you should allow.
  • List of bots that generate abusive requests even though they follow the robots.txt guidelines.
  • Slurp (Yahoo!), Noxtrum, and the MSN bot, which tend to generate excessive requests.