cool-web.de
robots.txt

Robots Exclusion Standard data for cool-web.de

Resource Scan

Scan Details

Site Domain	cool-web.de
Base Domain	cool-web.de
Scan Status	Ok
Last Scan	2024-10-24T01:57:12+00:00
Next Scan	2024-11-23T01:57:12+00:00

Last Scan

Scanned	2024-10-24T01:57:12+00:00
URL	https://cool-web.de/robots.txt
Domain IPs	89.58.38.147
Response IP	89.58.38.147
Found	Yes
Hash	f11a56a0aeb59ebc9e77264ca5053e3841a8cad61c1b66f681ef4ac38563af91
SimHash	3a563d364be6

Groups

*

Rule	Path
Disallow	/admin/
Disallow	/cgi-bin/
Disallow	/css/
Disallow	/domains/
Disallow	/exchange/
Disallow	/files/
Disallow	/forms/
Disallow	/fonts/
Disallow	/gc/
Disallow	/images/
Disallow	/img/
Disallow	/intern/
Disallow	/internal/
Disallow	/js/
Disallow	/mailsystem/
Disallow	/perl/
Disallow	/php/
Disallow	/privat/
Disallow	/private/
Disallow	/profil/
Disallow	/temp/
Disallow	/TEMP/
Disallow	/tmp/
Disallow	/test/
Disallow	/shops/
Disallow	/veraltet/
Disallow	/_veraltet/
Disallow	/_*/
Disallow	/tools/
Disallow	/uploads/
Disallow	/users/
Disallow	/webmail/
Disallow	/*.swf
Disallow	/*.log
Disallow	/*.bak
Disallow	/*.sid
Disallow	/*.mod
Disallow	/*.mid
Disallow	*BotDetectCaptcha.ashx*
Disallow	/WebResource*
Disallow	/%28*
Disallow	*WebResource.axd*
Disallow	*usersendpass.aspx*
Disallow	*/base.aspx
Disallow	*/sitemap.ashx
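
Several of the rules above rely on the `*` wildcard (for example `/_*/`, `/*.swf`, `*/base.aspx`). As a rough illustration of how such patterns are commonly interpreted, the sketch below translates a rule into a regular expression: `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a literal prefix match. The helper names `rule_to_regex` and `is_blocked`, the `SAMPLE_RULES` subset, and the test paths are hypothetical; the sketch ignores `Allow` rules and longest-match precedence, and individual crawlers may interpret wildcards differently.

```python
import re

def rule_to_regex(rule: str):
    """Translate one robots.txt path rule into a compiled regex.

    '*' matches any run of characters, a trailing '$' anchors the end,
    and all other characters are literal. Rules match as prefixes.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(path: str, disallow_rules) -> bool:
    """Return True if any Disallow rule matches the start of the path."""
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)

# A small subset of the rules from the '*' group above.
SAMPLE_RULES = ["/admin/", "/_*/", "/*.swf", "*/base.aspx"]

print(is_blocked("/admin/login.php", SAMPLE_RULES))      # True
print(is_blocked("/administrator", SAMPLE_RULES))        # False
print(is_blocked("/_archive/index.html", SAMPLE_RULES))  # True
print(is_blocked("/media/intro.swf", SAMPLE_RULES))      # True
print(is_blocked("/shop/base.aspx", SAMPLE_RULES))       # True
```

Note that `/admin/` does not block `/administrator`, because the rule's trailing slash is part of the prefix.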

googlebot

Rule	Path
Allow	/css/
Disallow	/js/
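
Because `googlebot` has its own group, a crawler that follows the usual convention applies only that group and does not inherit the rules listed under `*`. The sketch below checks this with Python's standard `urllib.robotparser`, using an abbreviated excerpt of the groups shown on this page. The file names (`site.css`, `app.js`, `index.html`) and the agent name `ExampleBot` are made up for illustration, and the excerpt uses only plain prefix rules because this parser does not interpret `*` wildcards in paths.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated excerpt of the groups listed on this page (not the full file).
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /admin/
Disallow: /css/
Disallow: /js/

User-agent: googlebot
Allow: /css/
Disallow: /js/

User-agent: turnitinbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

def allowed(agent: str, path: str) -> bool:
    """Ask the parser whether `agent` may fetch `path` on cool-web.de."""
    return parser.can_fetch(agent, "https://cool-web.de" + path)

print(allowed("Googlebot", "/css/site.css"))         # True: explicit Allow
print(allowed("Googlebot", "/js/app.js"))            # False: explicit Disallow
print(allowed("Googlebot", "/admin/"))               # True: '*' rules are not inherited
print(allowed("ExampleBot", "/css/site.css"))        # False: falls back to '*'
print(allowed("TurnitinBot/2.1", "/index.html"))     # False: Disallow /
```

The third result is the one to watch: paths such as `/admin/` are blocked for generic bots but not for a crawler that matches the `googlebot` group, unless the rules are repeated there.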

turnitinbot

Rule	Path
Disallow	/

slysearch

Rule	Path
Disallow	/

findlinks

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

pixray-seeker

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

ezooms

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

lb-spider

Rule	Path
Disallow	/

wbsearchbot

Rule	Path
Disallow	/

psbot

Rule	Path
Disallow	/

huaweisymantecspider

Rule	Path
Disallow	/

sistrix

Rule	Path
Disallow	/

ec2linkfinder

Rule	Path
Disallow	/

htdig

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

discobot

Rule	Path
Disallow	/

linkdex.com

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

edisterbot

Rule	Path
Disallow	/

swebot

Rule	Path
Disallow	/

picmole

Rule	Path
Disallow	/

yeti

Rule	Path
Disallow	/

yeti-mobile

Rule	Path
Disallow	/

pagepeeker

Rule	Path
Disallow	/

catchbot

Rule	Path
Disallow	/

yacybot

Rule	Path
Disallow	/

netestate ne crawler

Rule	Path
Disallow	/

surveybot

Rule	Path
Disallow	/

comodo ssl checker

Rule	Path
Disallow	/

comodo-certificates-spider

Rule	Path
Disallow	/

gonzo

Rule	Path
Disallow	/

schrein

Rule	Path
Disallow	/

backlinkcrawler

Rule	Path
Disallow	/

afilias web mining tool

Rule	Path
Disallow	/

seokicks

Rule	Path
Disallow	/

seokicks-robot

Rule	Path
Disallow	/

suggybot

Rule	Path
Disallow	/

bdbrandprotect

Rule	Path
Disallow	/

bpimagewalker

Rule	Path
Disallow	/

bpimagewalker*

Rule	Path
Disallow	/

updownerbot

Rule	Path
Disallow	/

lex

Rule	Path
Disallow	/

content crawler

Rule	Path
Disallow	/

dcpbot

Rule	Path
Disallow	/

kaloogabot

Rule	Path
Disallow	/

mlbot

Rule	Path
Disallow

icjobs

Rule	Path
Disallow

obot

Rule	Path
Disallow

webmastercoffee

Rule	Path
Disallow

qualidator*

Rule	Path
Disallow

webinator

Rule	Path
Disallow

scooter

Rule	Path
Disallow

larbin

Rule	Path
Disallow

opidoobot

Rule	Path
Disallow

ips-agent

Rule	Path
Disallow

unisterbot

Rule	Path
Disallow

unister*

Rule	Path
Disallow

reverseget

Rule	Path
Disallow

wget

Rule	Path
Disallow

libwww-perl

Rule	Path
Disallow

curl

Rule	Path
Disallow

java

Rule	Path
Disallow

Comments

Content that should never be indexed by any bot:
Trial exception for Google so that Webmaster Tools works properly
AFTER TESTING, COMMENT OUT JS AGAIN, OTHERWISE POSSIBLE NEGATIVE
EFFECTS ON RANKING (E.G. DUE TO SPEED)
(there is nothing in /js/ that would need to be indexed anyway)
Allow: /js/
Allow: /images/
Unwanted bots that do, however, request robots.txt
"TurnitinBot/2.1 (http://www.turnitin.com/robot/crawlerinfo.html)"
"findlinks/2.1.5 (+http://wortschatz.uni-leipzig.de/findlinks/)"
"magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
http://www.majestic12.co.uk/projects/dsearch/mj12bot.php
http://www.80legs.com/webcrawler.html - if 008 is crawling your website, it means that one or more 80legs users created a web crawl
"Mozilla/5.0 (compatible; AhrefsBot/2.0; +http://ahrefs.com/robot/)"
"lb-spider/Mozilla/5.0 Gecko/20100101 Firefox/10.0.2 (lb-spider; http://www.linkbutler.de/spider; spider@linkbutler.de)"
"Mozilla/5.0 (compatible; WBSearchBot/1.1; +http://www.warebay.com/bot.html)"
"psbot/0.1 (+http://www.picsearch.com/bot.html)"
"HuaweiSymantecSpider/1.0+DSE-support@huaweisymantec.com+(compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR ; http://www.huaweisymantec.com/en/IRL/spider)"
"Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)"
"EC2LinkFinder"
"http://SiteIntel.net Bot"
"htdig"
"SemrushBot/0.91" - http://de.semrush.com/? - Professional software for SEO & SEM professionals?
"Mozilla/5.0 (compatible; discobot/2.0; +http://discoveryengine.com/discobot.html)" - we sell no wine before its time != trustworthy
"linkdex.com/v2.0" - SEO
"SeznamBot/3.0 (+http://fulltext.sblog.cz/)" - sz-SEO
"EdisterBot (http://www.edister.com/bot.html)"
"Mozilla/5.0 (compatible; SWEBot/1.0; +http://swebot-crawler.net)" - tries to reply to postings in the forum
From here on, still to be added to htaccess
"Mozilla/5.0 (compatible;picmole/1.0 +http://www.picmole.com)"
"Yeti/1.0 (NHN Corp.; http://help.naver.com/robots/)"
"Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X) (compatible; Yeti-Mobile/0.1; +http://help.naver.com/robots/)"
"PagePeeker.com (info: http://pagepeeker.com/robots)"
"CatchBot/1.0; +http://www.catchbot.com"
"yacybot (freeworld/global; amd64 Linux 3.2.1-gentoo-r2; java 1.6.0_24; Europe/de) http://yacy.net/bot.html"
"netEstate NE Crawler (+http://www.sengine.info/)"
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en; rv:1.9.0.13) Gecko/2009073022 Firefox/3.5.2 (.NET CLR 3.5.30729) SurveyBot/2.3 (DomainTools)"
"COMODO SSL Checker"
"Comodo-Certificates-Spider"
"gonzo2[p] (+http://www.suchen.de/faq.html)" (business search)
"crawler schrein, crawler@schrein.nl id-4"
"BacklinkCrawler (http://www.backlinktest.com/crawler.html)"
"Mozilla/5.0 (compatible; Afilias Web Mining Tool 1.0; +http://www.afilias.info; awmt@afilias.info)"
"Mozilla/5.0 (compatible; SEOkicks-Robot +http://www.seokicks.de/robot.html)"
"Mozilla/5.0 (compatible; suggybot v0.01a, http://blog.suggy.com/was-ist-suggy/suggy-webcrawler/)"
"http://www.bdbrandprotect.com" "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1)"
"Updownerbot (+http://www.updowner.com/bot)"
"lex/1.0"
"Content Crawler"
"Mozilla/5.0 (compatible; DCPbot/1.1; +http://domains.checkparams.com/)"
"Mozilla/5.0 (compatible; KaloogaBot; http://kalooga.com/crawler)"
"MLBot (www.metadatalabs.com/mlbot)"
"Mozilla/5.0 (X11; U; Linux i686; de; rv:1.9.0.1; compatible; iCjobs Stellenangebote Jobs; http://www.icjobs.de) Gecko/20100401 iCjobs/3.2.3"
"Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/)"
"Mozilla/5.0 (compatible; WebmasterCoffee/0.7; +http://webmastercoffee.com/about)"
"Mozilla/5.0 (compatible; Qualidator.com Bot 1.0;)" (http://www.qualidator.com/Web/de/Support/robotstxt_Hinweise.htm)
"Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)" (http://www.thunderstone.com/site/gw25man/page_exclusion_and_robots_txt.html)
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) (larbin2.6.3@unspecified.mail)"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.24; ips-agent) Gecko/20111107 Ubuntu/10.04 (lucid) Firefox/3.6.24"
"Mozilla/5.0 (compatible; UnisterBot; crawler@unister.de)"
"Mozilla/5.0 (compatible; en-US; ReverseGet/1.0; http://reverseget.com/; robot@reverseget.com)"
Robots run via Linux tools that do not identify themselves properly and
that can be blocked by tool ID in case of excessive use
"Wget/1.9"
"libwww-perl/5.837"
"curl/7.21.3 (amd64-portbld-freebsd7.2) libcurl/7.21.3 OpenSSL/0.9.8e zlib/1.2.3"
"Java/1.6.0_29"
to_check:
"\"Mozilla/5.0"
"Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.04506.648; .NET CLR 3.5.21022; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2)"
"Mozilla/4.0 (compatible; MSIE 6.0; MSN 2.5; Windows 98; Win 9x 4.90; FDM)"
To watch:
"ssearch_bot (sSearch Crawler; http://www.semantissimo.de)"
"Mozilla/5.0 (compatible; Plukkie/1.4; http://www.botje.com/plukkie.htm)"
"Mozilla/5.0 (compatible; lemurwebcrawler admin@lemurproject.org; +http://boston.lti.cs.cmu.edu/crawler_12/)"
Unwanted bots that do NOT request robots.txt should be blocked via rewrite rules where necessary:
"Mozilla/5.0+(compatible;+PiplBot;++http://www.pipl.com/bot/)" IGNORES ROBOTS.TXT
"Mozilla/5.0 (compatible; TweetmemeBot/2.11; +http://tweetmeme.com/)" IGNORES ROBOTS.TXT
okay:
"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
"Googlebot-Image/1.0"
"Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)"
"Mozilla/5.0 (compatible; YandexImages/3.0; +http://yandex.com/bots)"
"Mozilla/5.0 (compatible; Yahoo! Slurp; http://help.yahoo.com/help/us/ysearch/slurp)"
"Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
"Mozilla/5.0 (compatible; archive.org_bot +http://www.archive.org/details/archive.org_bot)"
"ia_archiver (+http://www.alexa.com/site/help/webmasters; crawler@alexa.com)" (also connected with archive.org)
"Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)"
"msnbot-media/1.1 (+http://search.msn.com/msnbot.htm)"
"Mozilla/5.0 (compatible; OpenindexDeepSpider/Nutch-1.5-dev; +http://www.openindex.io/en/webmasters/spider.html)"
"CloudACL/Nutch-1.4"
"webcrawler (compatible; heritrix/1.14.4 ++http://www.onb.ac.at/about/webarchivierung.htm)"
"Mail.RU/2.0" (Russian search engine)
"Sosospider+(+http://help.soso.com/webspider.htm)" (Chinese search engine)
"Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)"
"Eurobot/1.1 (http://eurobot.ayell.eu)"
"Mozilla/5.0 (compatible; MSIE or Firefox mutant; not on Windows server; +http://ws.daum.net/aboutWebSearch.html) Daumoa/2.0" (Korean search engine)
"Acoon v4.10.3 (www.acoon.de)"
"DoCoMo/2.0 P900i(c100;TB;W24H11) (compatible; ichiro/mobile goo; +http://search.goo.ne.jp/option/use/sub4/sub4-1/)" (Japanese search engine)
"ichiro/3.0 (http://help.goo.ne.jp/help/article/1142)"
"frogl-bot (Version: 1.06, powered by www.frogl.de +http://www.frogl.de/pfadzurbotseite/bot.html)"
"Mozilla/5.0 (compatible; NerdByNature.Bot; http://www.nerdbynature.net/bot)"
"Agent-SharewarePlazaBot/3.0+(+http://www.SharewarePlaza.com)" IGNORES ROBOTS.TXT
"Wotbox/2.0 (bot@wotbox.com; http://www.wotbox.com)" IGNORES ROBOTS.TXT
"www.freefileszone.com PadPollbot/1.1b (+http://www.freefileszone.com/)" IGNORES ROBOTS.TXT
"Mozilla/5.0 (compatible; Sitedomain-Bot 1.0; Headers only; +http://www.sitedomain.de/sitedomain-bot/)" IGNORES ROBOTS.TXT - checks for deleted domains - only requests the home page
"emefgebot/beta (+http://emefge.de/bot.html)" IGNORES ROBOTS.TXT
"TinEye/1.1 (http://tineye.com/crawler.html)" #User-agent: TinEye

Warnings

4 invalid lines.
