rsm-news.com
robots.txt

Robots Exclusion Standard data for rsm-news.com

Resource Scan

Scan Details

Site Domain	rsm-news.com
Base Domain	rsm-news.com
Scan Status	Ok
Last Scan	2024-10-12T04:01:21+00:00
Next Scan	2024-10-19T04:01:21+00:00
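
The two timestamps in the table are ISO 8601 strings with an explicit UTC offset, and they sit exactly seven days apart. A minimal Python sketch of that arithmetic follows; the variable names are illustrative, and reading the gap as a weekly rescan schedule is an inference from these two values, not something the report states.

```python
from datetime import datetime, timedelta

# Values copied from the Scan Details table.
last_scan = datetime.fromisoformat("2024-10-12T04:01:21+00:00")
next_scan = datetime.fromisoformat("2024-10-19T04:01:21+00:00")

# Both datetimes are timezone-aware, so subtraction is well defined.
interval = next_scan - last_scan
print(interval == timedelta(days=7))
```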

Last Scan

Scanned	2024-10-12T04:01:21+00:00
URL	https://rsm-news.com/robots.txt
Domain IPs	85.13.128.175
Response IP	85.13.128.175
Found	Yes
Hash	bf123d21deb1b0a17b6ac49a19e931e7310938707c8021a8576ae76af2c17d2b
SimHash	b2742c728f6e
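
The Hash value is 64 hexadecimal digits long, which is the length of a SHA-256 digest. The report does not name the algorithm, so treating it as SHA-256 over the raw response body is an assumption. Under that assumption, a fetched copy could be compared against the recorded value roughly like this (`body_digest` and `matches_recorded` are hypothetical helper names):

```python
import hashlib

# Hash recorded in the Last Scan table.
RECORDED_HASH = "bf123d21deb1b0a17b6ac49a19e931e7310938707c8021a8576ae76af2c17d2b"

def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()

def matches_recorded(body: bytes) -> bool:
    """True if the body hashes to the value recorded by the scan."""
    return body_digest(body) == RECORDED_HASH

# A placeholder body is not expected to match; only the exact bytes served
# at scan time could, and only if the algorithm assumption holds.
print(len(RECORDED_HASH), matches_recorded(b"User-agent: *\nAllow: /\n"))
```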

Groups

*

Rule	Path
Allow	/

psbot
teoma
yandex
exabot
gigabot
baiduspider
nutch
cityreview
webreaper
webcopier
offline explorer
httrack
microsoft.url.control
emailcollector
penthesilea

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

slysearch

Rule	Path
Disallow	/

findlinks

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

pixray-seeker

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

ezooms

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

lb-spider

Rule	Path
Disallow	/

wbsearchbot

Rule	Path
Disallow	/

psbot

Rule	Path
Disallow	/

huaweisymantecspider

Rule	Path
Disallow	/

sistrix

Rule	Path
Disallow	/

ec2linkfinder

Rule	Path
Disallow	/

htdig

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

semrushbot-sa

Rule	Path
Disallow	/

discobot

Rule	Path
Disallow	/

linkdex.com

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

edisterbot

Rule	Path
Disallow	/

swebot

Rule	Path
Disallow	/

picmole

Rule	Path
Disallow	/

yeti

Rule	Path
Disallow	/

yeti-mobile

Rule	Path
Disallow	/

pagepeeker

Rule	Path
Disallow	/

catchbot

Rule	Path
Disallow	/

yacybot

Rule	Path
Disallow	/

netestate ne crawler

Rule	Path
Disallow	/

surveybot

Rule	Path
Disallow	/

comodo ssl checker

Rule	Path
Disallow	/

comodo-certificates-spider

Rule	Path
Disallow	/

gonzo

Rule	Path
Disallow	/

schrein

Rule	Path
Disallow	/

backlinkcrawler

Rule	Path
Disallow	/

afilias web mining tool

Rule	Path
Disallow	/

seokicks

Rule	Path
Disallow	/

seokicks-robot

Rule	Path
Disallow	/

suggybot

Rule	Path
Disallow	/

bdbrandprotect

Rule	Path
Disallow	/

bpimagewalker

Rule	Path
Disallow	/

bpimagewalker*

Rule	Path
Disallow	/

updownerbot

Rule	Path
Disallow	/

lex

Rule	Path
Disallow	/

content crawler

Rule	Path
Disallow	/

dcpbot

Rule	Path
Disallow	/

kaloogabot

Rule	Path
Disallow

mlbot

Rule	Path
Disallow

wget

Rule	Path
Disallow

libwww-perl

Rule	Path
Disallow

curl

Rule	Path
Disallow

java

Rule	Path
Disallow

icjobs

Rule	Path
Disallow

obot

Rule	Path
Disallow

webmastercoffee

Rule	Path
Disallow

qualidator*

Rule	Path
Disallow

webinator

Rule	Path
Disallow

scooter

Rule	Path
Disallow

larbin

Rule	Path
Disallow

opidoobot

Rule	Path
Disallow

ips-agent

Rule	Path
Disallow

tineye

Rule	Path
Disallow

unisterbot

Rule	Path
Disallow

unister*

Rule	Path
Disallow

reverseget

Rule	Path
Disallow
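
Each group above corresponds to one robots.txt record: one or more User-agent lines followed by the rules that apply to them. The sketch below rebuilds a small excerpt (the `*` group, three agents from the shared Disallow group, and `turnitinbot`) and queries it with Python's standard `urllib.robotparser`. The excerpt text is a reconstruction for illustration, not the site's actual file, and `Googlebot` is only an example of an agent that no group names.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt; the real file lists many more agents.
EXCERPT = """\
User-agent: *
Allow: /

User-agent: psbot
User-agent: teoma
User-agent: yandex
Disallow: /

User-agent: turnitinbot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(EXCERPT.splitlines())

url = "https://rsm-news.com/"
for agent in ("Googlebot", "psbot", "TurnitinBot"):
    # can_fetch() compares agent names case-insensitively.
    print(agent, parser.can_fetch(agent, url))
```

Agents without a group of their own fall back to the `*` group, which is why the listed crawlers are blocked while everything else stays allowed.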

Other Records

Field	Value
sitemap	https://rsm-news.com/sitemaps/sitemap.xml
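
The sitemap record is independent of the user-agent groups. As an illustration, Python's standard `urllib.robotparser` exposes such records through `site_maps()` (Python 3.8 or later). The single-line input below is a reconstruction of the record, not the site's full file.

```python
from urllib.robotparser import RobotFileParser

# One-line reconstruction of the sitemap record listed under Other Records.
lines = ["Sitemap: https://rsm-news.com/sitemaps/sitemap.xml"]

parser = RobotFileParser()
parser.parse(lines)

# site_maps() returns a list of URLs, or None when no record is present.
print(parser.site_maps())
```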

Comments

"TurnitinBot/2.1 (http://www.turnitin.com/robot/crawlerinfo.html)"
"findlinks/2.1.5 (+http://wortschatz.uni-leipzig.de/findlinks/)"
"magpie-crawler/1.1 (U; Linux amd64; en-GB; +http://www.brandwatch.net)"
http://www.majestic12.co.uk/projects/dsearch/mj12bot.php
http://www.80legs.com/webcrawler.html - if 008 is crawling your website, it means that one or more 80legs users created a web crawl
"Mozilla/5.0 (compatible; AhrefsBot/2.0; +http://ahrefs.com/robot/)"
"lb-spider/Mozilla/5.0 Gecko/20100101 Firefox/10.0.2 (lb-spider; http://www.linkbutler.de/spider; spider@linkbutler.de)"
"Mozilla/5.0 (compatible; WBSearchBot/1.1; +http://www.warebay.com/bot.html)"
"psbot/0.1 (+http://www.picsearch.com/bot.html)"
"HuaweiSymantecSpider/1.0+DSE-support@huaweisymantec.com+(compatible; MSIE 7.0; Windows NT 5.1; Trident/4.0; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR ; http://www.huaweisymantec.com/en/IRL/spider)"
"Mozilla/5.0 (compatible; SISTRIX Crawler; http://crawler.sistrix.net/)"
"EC2LinkFinder"
"http://SiteIntel.net Bot"
"htdig"
"SemrushBot/0.91" - http://de.semrush.com/? - professional software for SEO & SEM professionals?
"Mozilla/5.0 (compatible; discobot/2.0; +http://discoveryengine.com/discobot.html)" - we sell no wine before its time != trustworthy
"linkdex.com/v2.0" - SEO
"SeznamBot/3.0 (+http://fulltext.sblog.cz/)" - sz-SEO
"EdisterBot (http://www.edister.com/bot.html)"
"Mozilla/5.0 (compatible; SWEBot/1.0; +http://swebot-crawler.net)" - tries to reply to postings in the forum
from here on, still to be added to htaccess
"Mozilla/5.0 (compatible;picmole/1.0 +http://www.picmole.com)"
"Yeti/1.0 (NHN Corp.; http://help.naver.com/robots/)"
"Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X) (compatible; Yeti-Mobile/0.1; +http://help.naver.com/robots/)"
"PagePeeker.com (info: http://pagepeeker.com/robots)"
"CatchBot/1.0; +http://www.catchbot.com"
"yacybot (freeworld/global; amd64 Linux 3.2.1-gentoo-r2; java 1.6.0_24; Europe/de) http://yacy.net/bot.html"
"netEstate NE Crawler (+http://www.sengine.info/)"
"Mozilla/5.0 (Windows; U; Windows NT 5.1; en; rv:1.9.0.13) Gecko/2009073022 Firefox/3.5.2 (.NET CLR 3.5.30729) SurveyBot/2.3 (DomainTools)"
"COMODO SSL Checker"
"Comodo-Certificates-Spider"
"gonzo2[p] (+http://www.suchen.de/faq.html)" (business search)
"crawler schrein, crawler@schrein.nl id-4"
"BacklinkCrawler (http://www.backlinktest.com/crawler.html)"
"Mozilla/5.0 (compatible; Afilias Web Mining Tool 1.0; +http://www.afilias.info; awmt@afilias.info)"
"Mozilla/5.0 (compatible; SEOkicks-Robot +http://www.seokicks.de/robot.html)"
"Mozilla/5.0 (compatible; suggybot v0.01a, http://blog.suggy.com/was-ist-suggy/suggy-webcrawler/)"
"http://www.bdbrandprotect.com" "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1)"
"Updownerbot (+http://www.updowner.com/bot)"
"lex/1.0"
"Content Crawler"
"Mozilla/5.0 (compatible; DCPbot/1.1; +http://domains.checkparams.com/)"
"Mozilla/5.0 (compatible; KaloogaBot; http://kalooga.com/crawler)"
"MLBot (www.metadatalabs.com/mlbot)"
"Wget/1.9"
"libwww-perl/5.837"
"curl/7.21.3 (amd64-portbld-freebsd7.2) libcurl/7.21.3 OpenSSL/0.9.8e zlib/1.2.3"
"Java/1.6.0_29"
"Mozilla/5.0 (X11; U; Linux i686; de; rv:1.9.0.1; compatible; iCjobs Stellenangebote Jobs; http://www.icjobs.de) Gecko/20100401 iCjobs/3.2.3"
"Mozilla/5.0 (compatible; oBot/2.3.1; +http://filterdb.iss.net/crawler/)"
"Mozilla/5.0 (compatible; WebmasterCoffee/0.7; +http://webmastercoffee.com/about)"
"Mozilla/5.0 (compatible; Qualidator.com Bot 1.0;)" (http://www.qualidator.com/Web/de/Support/robotstxt_Hinweise.htm)
"Mozilla/4.0 (compatible; http://search.thunderstone.com/texis/websearch/about.html)" (http://www.thunderstone.com/site/gw25man/page_exclusion_and_robots_txt.html)
"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0) (larbin2.6.3@unspecified.mail)"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.24; ips-agent) Gecko/20111107 Ubuntu/10.04 (lucid) Firefox/3.6.24"
"TinEye/1.1 (http://tineye.com/crawler.html)"
"Mozilla/5.0 (compatible; UnisterBot; crawler@unister.de)"
"Mozilla/5.0 (compatible; en-US; ReverseGet/1.0; http://reverseget.com/; robot@reverseget.com)"
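
Most of the comments quote a full User-Agent header, while the groups are keyed by a short lowercase token. One plausible way to relate the two is a case-insensitive substring search, sketched below. `GROUP_TOKENS` holds a handful of tokens from the Groups section, and `matching_groups` is a hypothetical helper, not something the scan tool is known to use.

```python
# A few group tokens copied from the Groups section (not the full list).
GROUP_TOKENS = ["ahrefsbot", "psbot", "sistrix", "tineye", "wget"]

def matching_groups(user_agent: str, tokens=GROUP_TOKENS) -> list:
    """Return the tokens that occur in the header, ignoring case."""
    lowered = user_agent.lower()
    return [token for token in tokens if token in lowered]

samples = [
    "Mozilla/5.0 (compatible; AhrefsBot/2.0; +http://ahrefs.com/robot/)",
    "psbot/0.1 (+http://www.picsearch.com/bot.html)",
    "Wget/1.9",
]
for header in samples:
    print(matching_groups(header))
```

Plain substring matching can over-match short tokens: the yacybot header quoted above contains "java 1.6.0_24", so it would also hit the `java` group if that token were in the list.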

Warnings

5 invalid lines.
