comunidad.madrid
robots.txt

Robots Exclusion Standard data for comunidad.madrid

Resource Scan

Scan Details

Site Domain comunidad.madrid
Base Domain comunidad.madrid
Scan Status Ok
Last Scan 2024-05-28T14:53:30+00:00
Next Scan 2024-06-27T14:53:30+00:00

Last Scan

Scanned 2024-05-28T14:53:30+00:00
URL https://comunidad.madrid/robots.txt
Redirect https://www.comunidad.madrid/robots.txt
Redirect Domain www.comunidad.madrid
Redirect Base comunidad.madrid
Domain IPs 195.77.128.115
Redirect IPs 18.155.68.30, 18.155.68.36, 18.155.68.82, 18.155.68.95, 2600:9000:23d2:3400:f:9cf1:a00:93a1, 2600:9000:23d2:4800:f:9cf1:a00:93a1, 2600:9000:23d2:600:f:9cf1:a00:93a1, 2600:9000:23d2:9000:f:9cf1:a00:93a1, 2600:9000:23d2:ca00:f:9cf1:a00:93a1, 2600:9000:23d2:cc00:f:9cf1:a00:93a1, 2600:9000:23d2:d600:f:9cf1:a00:93a1, 2600:9000:23d2:da00:f:9cf1:a00:93a1
Response IP 18.155.68.95
Found Yes
Hash 511d1feb856473c105f174beeb5bd0cee9ea010ef622b42252a28503e30c2eef
SimHash bc94f5588270

Groups

*

Rule Path
Allow /misc/*.css$
Allow /misc/*.css?
Allow /misc/*.js$
Allow /misc/*.js?
Allow /misc/*.gif
Allow /misc/*.jpg
Allow /misc/*.jpeg
Allow /misc/*.png
Allow /modules/*.css$
Allow /modules/*.css?
Allow /modules/*.js$
Allow /modules/*.js?
Allow /modules/*.gif
Allow /modules/*.jpg
Allow /modules/*.jpeg
Allow /modules/*.png
Allow /profiles/*.css$
Allow /profiles/*.css?
Allow /profiles/*.js$
Allow /profiles/*.js?
Allow /profiles/*.gif
Allow /profiles/*.jpg
Allow /profiles/*.jpeg
Allow /profiles/*.png
Allow /themes/*.css$
Allow /themes/*.css?
Allow /themes/*.js$
Allow /themes/*.js?
Allow /themes/*.gif
Allow /themes/*.jpg
Allow /themes/*.jpeg
Allow /themes/*.png
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /INSTALL.sqlite.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /admin/
Disallow /comment/reply/
Disallow /filter/tips/
Disallow /node/add/
Disallow /search/
Disallow /user/register/
Disallow /user/password/
Disallow /user/login/
Disallow /user/logout/
Disallow /?q=admin%2F
Disallow /?q=comment%2Freply%2F
Disallow /?q=filter%2Ftips%2F
Disallow /?q=node%2Fadd%2F
Disallow /?q=search%2F
Disallow /?q=user%2Fpassword%2F
Disallow /?q=user%2Fregister%2F
Disallow /?q=user%2Flogin%2F
Disallow /?q=user%2Flogout%2F
Disallow /sites/default/files/doc/empleo/1916_excl_listas_informativas_aux_adm.pdf
Disallow /centros/Consejerias/
Disallow /centros/Temas/
Disallow /centros/tipos-centro/
Disallow /centros/municipio/
Disallow /centros/aforo_automatizado/
Disallow /actividades/tipo-actividad/
Disallow /actividades/tema/
Disallow /actividades/municipio/
Disallow /actividades/etiqueta/
Disallow /actividades/fecha/
Disallow /pcor
Disallow /info/calculadora-impuesto-sucesiones
Disallow /file/298546
Disallow /sites/default/files/doc/educacion/rh01/
Disallow /sites/default/files/doc/educacion/rh02/
Disallow /sites/default/files/doc/educacion/rh03/
Disallow /sites/default/files/doc/educacion/rh04/
Disallow /sites/default/files/doc/educacion/rh05/
Disallow /sites/default/files/doc/educacion/rh06/
Disallow /sites/default/files/doc/educacion/rh07/
Disallow /sites/default/files/doc/educacion/rh08/
Disallow /sites/default/files/doc/educacion/rh09/
Disallow /sites/default/files/doc/educacion/rh10/
Disallow /sites/default/files/doc/educacion/rh11/
Disallow /sites/default/files/doc/educacion/rh12/
Disallow /sites/default/files/doc/educacion/rh13/
Disallow /sites/default/files/doc/educacion/rh14/
Disallow /sites/default/files/doc/educacion/rh15/
Disallow /sites/default/files/doc/educacion/rh16/
Disallow /sites/default/files/doc/educacion/rh17/
Disallow /sites/default/files/doc/educacion/rh18/
Disallow /sites/default/files/doc/educacion/rh19/
Disallow /sites/default/files/doc/educacion/rh20/
Disallow /info/universidades/alojamiento-estudiantes-universitarios?*
Disallow /categoria-alerta
Disallow /comunidad-autonoma
Disallow /ano-alerta
Disallow /origen-alerta
Disallow /dcma
Disallow /servicios/juventud/convocatorias/Fichero
Disallow /servicios/juventud/convocatorias/Convocatoria
Disallow /info/productores?*
Disallow /info/servicios/empleo/cursos?*
Disallow /portal-lector/comunicacion/actividades?f%5B0%5D
Disallow /portal-lector/comunicacion/actividades?111
Disallow /centros/area-liquidacion-tributos
Disallow /portal-lector/localiza-tu-biblioteca?
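Several Allow patterns above overlap with broader Disallow prefixes (for example `/misc/*.css$` against `/misc/`). The sketch below shows one way such a rule set could be evaluated, assuming the longest-match precedence described in RFC 9309, where `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names `pattern_to_regex` and `is_allowed` are illustrative and not part of any library, `RULES` is only a small subset of the table, and individual crawlers may resolve conflicts differently.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Every other character, including '?', is treated literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Return True if `path` may be crawled under `rules`.

    Assumed precedence: the longest matching pattern wins, Allow wins
    a tie, and a path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = kind.lower() == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed

# A small subset of the rules listed above, for illustration only.
RULES = [
    ("Allow", "/misc/*.css$"),
    ("Allow", "/misc/*.js?"),
    ("Disallow", "/misc/"),
    ("Disallow", "/admin/"),
    ("Disallow", "/pcor"),
]

if __name__ == "__main__":
    for path in ["/misc/style.css", "/misc/readme.txt", "/admin/config", "/info/x"]:
        print(path, is_allowed(path, RULES))
```

Under these assumptions `/misc/style.css` is crawlable because the 12-character Allow pattern outranks the 6-character `/misc/` Disallow, while `/misc/readme.txt` matches only the Disallow prefix and is blocked.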

Other Records

Field Value
crawl-delay 10
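
The `crawl-delay` record is not part of RFC 9309, and support varies between crawlers. For a client that chooses to honor it, the following is a minimal sketch that interprets the value as a minimum number of seconds between the start of successive requests. The function `fetch_politely` and its `fetch`, `sleep`, and `clock` parameters are hypothetical names introduced here; the last two are injectable so the pacing can be checked without actually waiting.

```python
import time
from typing import Callable, Iterable, List

CRAWL_DELAY_SECONDS = 10  # value reported in the record above

def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], object],
    delay: float = CRAWL_DELAY_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[object]:
    """Call `fetch` on each URL, leaving at least `delay` seconds
    between the start of one request and the start of the next."""
    results = []
    last_start = None
    for url in urls:
        if last_start is not None:
            remaining = delay - (clock() - last_start)
            if remaining > 0:
                sleep(remaining)
        last_start = clock()
        results.append(fetch(url))
    return results
```

Measuring from the start of each request, rather than sleeping a fixed 10 seconds after each response, keeps slow responses from stretching the interval further than the site asked for.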

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/robotstxt.html
  • CSS, JS, Images
  • Directories
  • Files
  • Paths (clean URLs)
  • Paths (no clean URLs)
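
The comments above note that the file is only used when it sits at the root of the host. As a small illustration of that rule, the sketch below derives the robots.txt location for any page URL by keeping the scheme and host and discarding the rest; `robots_url` is a hypothetical helper name, not something taken from the scanned site.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host serving `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

if __name__ == "__main__":
    print(robots_url("http://example.com/site/page.html?x=1"))
```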