revistadecarabanchel.es
robots.txt

Robots Exclusion Standard data for revistadecarabanchel.es

Resource Scan

Scan Details

Site Domain	revistadecarabanchel.es
Base Domain	revistadecarabanchel.es
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Couldn't connect to server.
Last Scan	2024-10-05T17:52:12+00:00
Next Scan	2025-01-03T17:52:12+00:00

Last Successful Scan

Scanned	2022-04-01T20:10:18+00:00
URL	https://revistadecarabanchel.es/robots.txt
Response IP	51.38.169.102
Found	Yes
Hash	3656d9a01c4d971a8804bb4014019d48fd07a24e1fdce7cb82b83fa09248c195
SimHash	8814901edc33

Groups

mediapartners-google

Rule	Path
Allow	/

googlebot-image

Rule	Path
Allow	/

adsbot-google

Rule	Path
Allow	/

googlebot-mobile

Rule	Path
Allow	/
Disallow	/*?
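
The two rules in this group overlap: every path matches `/`, while `/*?` matches only paths containing a query string. The sketch below shows one way the conflict might be resolved. It assumes the longest-match precedence described in RFC 9309 (the most specific pattern wins, and Allow wins ties), plus the `*` wildcard and the `$` end anchor seen in patterns such as `/*.css$`. The function names and sample paths are hypothetical and were not part of the scan; real crawlers may behave differently.

```python
import re

def pattern_to_regex(path_pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = path_pattern.endswith("$")
    core = path_pattern[:-1] if anchored else path_pattern
    # Escape literal text, then let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in core.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))

def is_allowed(rules, url_path):
    """Assumed precedence: longest matching pattern wins, Allow wins ties.

    A path that matches no rule (or only empty patterns) is allowed.
    """
    best = None
    for verdict, pattern in rules:
        if pattern and pattern_to_regex(pattern).match(url_path):
            key = (len(pattern), verdict == "allow")
            if best is None or key > best:
                best = key
    return True if best is None else best[1]

# Rules of the googlebot-mobile group as listed in the table above.
GOOGLEBOT_MOBILE = [("allow", "/"), ("disallow", "/*?")]

# Hypothetical paths, chosen only for illustration.
print(is_allowed(GOOGLEBOT_MOBILE, "/articulo.html"))     # plain page
print(is_allowed(GOOGLEBOT_MOBILE, "/buscar?q=madrid"))   # dynamic URL
```

Under these assumptions the plain page is allowed, and the URL with a query string is blocked because `/*?` is the longer, more specific match.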

*

Rule	Path
Disallow

addthis.com disallow: /
admantx disallow: /
ahrefsbot disallow: /
bdcbot disallow: /
bender disallow: /
bixocrawler disallow: /
bl.uk_lddc_bot disallow: /
blexbot disallow: /
bubing disallow: /
cliqzbot disallow: /
cncdialer disallow: /
crawler4j disallow: /
crystalsemanticsbot disallow: /
cyberalert disallow: /
digext disallow: /
discobot disallow: /
discoverybot disallow: /
dloader disallow: /
dloader(naverrobot) disallow: /
doc disallow: /
dotbot disallow: /
download ninja disallow: /
dts agent disallow: /
exabot disallow: /
ezooms disallow: /
fairshare disallow: /
fetch disallow: /
flamingo_searchengine disallow: /
genieo disallow: /
gigabot disallow: /
grub-client disallow: /
heritrix disallow: /
heritrix/3.3.0 disallow: /
httrack disallow: /
ia_archiver disallow: /
integromedb disallow: /
istellabot disallow: /
jikespider disallow: /
jyxobot disallow: /
k2spider disallow: /
kimengi disallow: /
kimengi/nineconnections.com disallow: /
larbin disallow: /
lexxebot/1.0 disallow: /
libwww disallow: /
linko disallow: /
livelapbot disallow: /
magpie-crawler disallow: /
maxthon disallow: /
metauri disallow: /
microsoft.url.control disallow: /
mj12bot disallow: /
moreover disallow: /
moreoverbot disallow: /
msiecrawler disallow: /
nabot disallow: /
naverbot disallow: /
nerdbynature.bot disallow: /
netestate ne crawler disallow: /
netseer crawler disallow: /
newscan disallow: /
nextgensearchbot disallow: /
npbot disallow: /
nutch disallow: /
offline explorer disallow: /
omgilibot disallow: /
orthogaffe disallow: /
piplbot disallow: /
pixray-seeker disallow: /
proximic disallow: /
psbot disallow: /
queryseekerspider disallow: /
rogerbot disallow: /
seokicks disallow: /
seokicks-robot disallow: /
sitebot disallow: /
sitebot/0.1 disallow: /
sitecheck.internetseer.com disallow: /
sitesnagger disallow: /
slurp disallow: /
sogou disallow: /
sosospider disallow: /
spbot disallow: /
spinn3r disallow: /
teleport disallow: /
teleportpro disallow: /
trendictionbot disallow: /
trovitbot disallow: /
turnitinbot disallow: /
ubicrawler disallow: /
umbot-ln disallow: /
unisterbot disallow: /
universalfeedparser disallow: /
wbsearchbot disallow: /
webcopier disallow: /
webreaper disallow: /
webstripper disallow: /
webzip disallow: /
wesee:search disallow: /
wget disallow: /
wotbot disallow: /
wotbox disallow: /
xenu disallow: /
yasni disallow: /
zao disallow: /
zealbot disallow: /
zyborg disallow: /
googlebot

Rule	Path
Allow	/*.css$
Allow	/*.js$

Other Records

Field	Value
sitemap	http://revistadecarabanchel.es/sitemap.html
sitemap	http://revistadecarabanchel.es/sitemap.xml.gz

Comments

Advertising robot: avoid problems with advertising on paginated pages, searches, etc.
List of allowed bots.
Block dynamic URLs
Block pages
Block bots and crawlers of little use
Prevents blocked-resource problems in Google Webmaster Tools
Under normal conditions this is the sitemap

Warnings

1 invalid line.
