cafescandelas.com
robots.txt

Robots Exclusion Standard data for cafescandelas.com

Resource Scan

Scan Details

Site Domain cafescandelas.com
Base Domain cafescandelas.com
Scan Status Ok
Last Scan 2024-11-11T03:52:12+00:00
Next Scan 2024-12-11T03:52:12+00:00

Last Scan

Scanned 2024-11-11T03:52:12+00:00
URL https://cafescandelas.com/robots.txt
Redirect https://www.cafescandelas.com/robots.txt
Redirect Domain www.cafescandelas.com
Redirect Base cafescandelas.com
Domain IPs 82.98.136.17
Redirect IPs 82.98.136.17
Response IP 82.98.136.17
Found Yes
Hash 4475279ce7f607e9f2eead56b3713b4222f77b2276fd3b7ca5b905d1fe239cdd
SimHash f71e6187c136

Groups

*

Rule Path
Disallow /sibaristas/*
Disallow /engadir/*
Disallow /boletin-pagina/*
Disallow /cliente-login
Disallow /cliente-login/*
Disallow /engadir/*?*
Disallow /tienda/product.php
Disallow /tienda/product.php?*
Disallow /tienda/category.php
Disallow /tienda/category.php?*
Disallow /mostrarSeccion.php
Disallow /mantenimiento.html
Disallow /mostrarSeccion.php?*
Disallow /cesto/
Disallow /tienda/producto/ajax-mostrar-ficha
Disallow /tienda/producto/ajax-mostrar-ficha/*
Disallow /login_facebook_promocion
Disallow /login_facebook_promocion/*
Disallow /blog/category/*
Disallow /blog/2012/*
Disallow /blog/2013/*
Disallow /blog/2014/*
Disallow /blog/2015/*
Disallow /blog/2016/*
Disallow /blog/2017/*
Disallow /blog/2018/*
Disallow /blog/2019/*
Disallow /blog/2020/*
Disallow /blog/2021/*
Disallow /blog/2022/*
Disallow /blog/2023/*
Disallow /blog/2024/*

mj12bot

Rule Path
Disallow /

*

Rule Path
Disallow /analyze2/
Disallow /c/

abonti

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduspider-mobile

Rule Path
Disallow /

baiduspider-image

Rule Path
Disallow /

baiduspider-video

Rule Path
Disallow /

baiduspider-news

Rule Path
Disallow /

baiduspider-favo

Rule Path
Disallow /

baiduspider-cpro

Rule Path
Disallow /

baiduspider-ads

Rule Path
Disallow /

bpimagewalker

Rule Path
Disallow /

check_http

Rule Path
Disallow /

checks.panopta.com

Rule Path
Disallow /

curl

Rule Path
Disallow /

dle_spider.exe

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

email exractor

Rule Path
Disallow /

exabot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

gbplugin

Rule Path
Disallow /

gsa-crawler

Rule Path
Disallow /

holmesbot

Rule Path
Disallow /

ichiro

Rule Path
Disallow /

infoseek sidewinder

Rule Path
Disallow /

infoseek sidewinder/2.0b (linux 2.4 i686)

Rule Path
Disallow /

ip-web-crawler.com

Rule Path
Disallow /

java

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

libwww-perl

Rule Path
Disallow /

linkdex.com

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mlbot

Rule Path
Disallow /

moget

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

pear http_request class

Rule Path
Disallow /

php

Rule Path
Disallow /

pixray-seeker

Rule Path
Disallow /

pycurl

Rule Path
Disallow /

python-requests

Rule Path
Disallow /

python-urllib

Rule Path
Disallow /

ruby

Rule Path
Disallow /

seoengworldbot

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

sogou web

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

sosospider+

Rule Path
Disallow /

squider

Rule Path
Disallow /

ssearch_bot

Rule Path
Disallow /

suchmaschinenoptimierung.de

Rule Path
Disallow /

synapse

Rule Path
Disallow /

wget

Rule Path
Disallow /

woopingbot

Rule Path
Disallow /

ezooms.bot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

wordpress

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yeti

Rule Path
Disallow /

yodaobot

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 15

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 15

petalbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /
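The groups above can be exercised with Python's standard-library robots.txt parser. A minimal sketch, assuming a hand-reconstructed excerpt of the scanned file (note that `urllib.robotparser` follows the original prefix-matching rules, so `*` wildcards inside rule paths are treated literally rather than expanded):

```python
import urllib.robotparser

# Hand-reconstructed excerpt (assumption) of a few groups from the scan above.
robots_txt = """\
User-agent: *
Disallow: /cliente-login
Disallow: /cesto/

User-agent: mj12bot
Disallow: /

User-agent: bingbot
Crawl-delay: 15
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(robots_txt.splitlines())

# mj12bot is blocked from everything; agents with no group of their own
# fall back to the * group; bingbot has no Disallow rules, only a delay.
print(rp.can_fetch("mj12bot", "https://www.cafescandelas.com/"))        # False
print(rp.can_fetch("SomeBot", "https://www.cafescandelas.com/cesto/"))  # False
print(rp.can_fetch("bingbot", "https://www.cafescandelas.com/"))        # True
print(rp.crawl_delay("bingbot"))                                        # 15
```

This matches the report: bingbot's group defines no rules ("All paths allowed") but still carries the non-standard `crawl-delay 15` record.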

Comments

  • This robots.txt file ALLOWS ALL INDEXING. It is suitable for a live production site.
  • When your site is officially public, copy this file to robots.txt.
  • In dev, staging, and pre-launch client beta environments you should use robots.txt.block-all instead.
  • User-agent: *
  • Disallow:
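The block-all counterpart mentioned in these comments is conventionally just a wildcard group with a bare root Disallow. A sketch of what a file like robots.txt.block-all would contain (the actual file was not part of this scan):

```
User-agent: *
Disallow: /
```

An empty `Disallow:` value, as in the comments above, allows everything; `Disallow: /` blocks everything.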

Warnings

  • 2 invalid lines.