plantasparacurar.com
robots.txt

Robots Exclusion Standard data for plantasparacurar.com

Resource Scan

Scan Details

Site Domain plantasparacurar.com
Base Domain plantasparacurar.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a server error.
Last Scan 2024-11-11T14:54:08+00:00
Next Scan 2024-11-25T14:54:08+00:00

Last Successful Scan

Scanned 2024-10-27T03:16:41+00:00
URL https://plantasparacurar.com/robots.txt
Redirect https://www.plantasparacurar.com/robots.txt
Redirect Domain www.plantasparacurar.com
Redirect Base plantasparacurar.com
Domain IPs 93.189.33.7
Redirect IPs 93.189.33.7
Response IP 93.189.33.7
Found Yes
Hash 7e3db5f820e18064b755bda8e5c073bdb38c6b21d8fd556340eda678af454b9d
SimHash 49f7df740804

Groups

mediapartners-google

Rule Path
Disallow /wp-content/plugins/
Disallow /wp-content/themes/
Disallow /wp-includes/
Disallow /wp-admin/
Disallow /wp-

*

Rule Path Comment
Allow /wp-content/uploads/ -
Disallow /iframes/ -
Disallow /wp-content/plugins/ -
Disallow /wp-includes/ -
Disallow /wp-admin/ -
Allow /wp-content/themes/ Disallow
Allow /wp-content/plugins/contact-form-7/ -
Allow /wp-content/plugins/bwp-minify/min/ -
Allow /wp-content/plugins/wp-polls/images/ -
Allow /wp-content/plugins/comment-rating/images/ -
Allow /wp-content/plugins/wp-postratings/images/ -
Allow /wp-content/plugins/comments-evolved/assets/ -
Allow /wp-content/plugins/asynchronous-javascript/ -
Allow /wp-includes/images/ -
Disallow /wp- -
Disallow /?s= -
Disallow /search -
Disallow /1014335/ -
Disallow /plugins/ -
Disallow /enlaces/ -
Disallow /images/ -
Disallow /rating_1_over -
Disallow /rating_2_over -
Allow /feed/$ -
Allow /comments/feed$ -
Allow /comments/feed/$ -
Disallow /feed -
Disallow /comments/feed -
Disallow /*/feed/$ -
Disallow /*/feed/rss/$ -
Disallow /*/trackback/$ -
Disallow /*/*/feed/$ -
Disallow /*/*/feed/rss/$ -
Disallow /*/*/trackback/$ -
Disallow /*/*/*/feed/$ -
Disallow /*/*/*/feed/rss/$ -
Disallow /*/*/*/trackback/$ -

Comments

  • robots.txt for your WordPress blog.
  • Use at your own risk; we know what you are like }:)
  • http://www.sigt.net/desarrollo-web/robotstxt-para-wordpress.html
  • To allow the AdSense robot to crawl the site.
  • First, the attached content.
  • We can also de-index everything that starts with wp-. It is the same as the Disallow rules above, but it also covers things like wp-rss.php.
  • No search results either.
  • Crawler problems.
  • We allow the main feed for Google Blogsearch.
  • We prevent permalink/feed/ from being indexed, because the comments feed tends to rank in place of the post itself and confuses users.
  • The same goes for URLs ending in /trackback/, which serve only as the Trackback URI (and are duplicate content).
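
The rules in the `*` group mix broad Disallow prefixes with narrower Allow exceptions, so the result for a given URL depends on how a crawler resolves overlapping matches. The following is a minimal sketch in Python, assuming the longest-match precedence described in RFC 9309 (with Allow winning a tie) and the common `*` and `$` wildcard extensions. The function names `pattern_to_regex` and `is_allowed` are hypothetical, `RULES` is only a subset of the paths listed above, and real crawlers may resolve conflicts differently.

```python
import re

# A subset of the "*" group from the scan above, as (rule, path) pairs.
RULES = [
    ("Allow", "/wp-content/uploads/"),
    ("Disallow", "/wp-content/plugins/"),
    ("Allow", "/wp-content/plugins/contact-form-7/"),
    ("Disallow", "/wp-"),
    ("Allow", "/feed/$"),
    ("Disallow", "/feed"),
    ("Disallow", "/*/feed/$"),
    ("Disallow", "/*/trackback/$"),
]


def pattern_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Without '$', the pattern is a plain prefix match.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, url_path):
    """Return True if url_path may be crawled under the given rules.

    Assumption: the longest matching pattern wins, and Allow wins a tie.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, path in rules:
        if pattern_to_regex(path).match(url_path):
            candidate = (len(path), kind == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for p in ["/feed/", "/mi-entrada/feed/", "/wp-rss.php",
              "/wp-content/plugins/contact-form-7/style.css"]:
        print(p, "allowed" if is_allowed(RULES, p) else "blocked")
```

Under these assumptions the main feed `/feed/` stays crawlable, while a per-post feed such as `/mi-entrada/feed/` and anything starting with `/wp-` (for example `/wp-rss.php`) is blocked, which matches the intent stated in the comments. Note that Python's standard `urllib.robotparser` does not interpret `*` or `$` inside paths, which is why the sketch uses its own matcher.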