breakfastspots.com
robots.txt

Robots Exclusion Standard data for breakfastspots.com

Resource Scan

Scan Details

Site Domain breakfastspots.com
Base Domain breakfastspots.com
Scan Status Ok
Last Scan 2024-11-12T18:27:48+00:00
Next Scan 2024-11-19T18:27:48+00:00

Last Scan

Scanned 2024-11-12T18:27:48+00:00
URL https://breakfastspots.com/robots.txt
Domain IPs 95.85.17.130
Response IP 95.85.17.130
Found Yes
Hash 07357e9ce539b86d9bc45877baa84e86b8a881751338abbc732350defe55a640
SimHash 54c2f187c801

Groups

*

Rule Path
Disallow /service
Disallow /admin
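
As a rough illustration of how the two `*` rules above behave, the sketch below feeds them to Python's standard `urllib.robotparser`. The robots.txt lines are reconstructed from the table, not copied from the site, and the crawler name `examplebot` and the helper `is_allowed` are hypothetical. Other parsers may differ in detail.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the "*" group in this report (an assumption:
# the exact text of the original file is not shown here).
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /service",
    "Disallow: /admin",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)


def is_allowed(path, agent="examplebot"):
    """Return True if `agent` (a hypothetical crawler) may fetch `path`."""
    return parser.can_fetch(agent, path)


for path in ["/", "/menu", "/service", "/services/list", "/admin/login"]:
    print(path, is_allowed(path))
```

Disallow paths are matched as prefixes, so `/service` also covers `/services/list`, and `/admin` covers everything beneath it.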

mj12bot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

semrushbot-sa

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

alexibot

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

xenu’s

Rule Path
Disallow /

xenu’s link sleuth 1.1c

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

sistrix crawler

Rule Path
Disallow /

uptimerobot/2.0

Rule Path
Disallow /

ezooms robot

Rule Path
Disallow /

perl lwp

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

netestate ne crawler (+http://www.website-datenbank.de/)

Rule Path
Disallow /

wiseguys robot

Rule Path
Disallow /

turnitin robot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

turnitin bot

Rule Path
Disallow /

turnitinbot/3.0 (http://www.turnitin.com/robot/crawlerinfo.html)

Rule Path
Disallow /

turnitinbot/3.0

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

pimonster

Rule Path
Disallow /

pimonster

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

eccp/1.0 (search@eniro.com)

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

gsa-crawler (enterprise; t4-knhh62cdkc2w3; gsa_manage@nikon-sys.co.jp)

Rule Path
Disallow /

megaindex.ru/2.0

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

megaindex.com

Rule Path
Disallow /

mozilla/5.0 (compatible; megaindex.ru/2.0; +http://megaindex.com/crawler)

Rule Path
Disallow /

garlikcrawler/1.2 (http://garlik.com/, crawler@garlik.com)

Rule Path
Disallow /

acrobat
aisearchbot
baidu*
baiduspider
baiduspider+(+http://www.baidu.com/search/spider.htm)
baiduspider+(+http://www.baidu.jp/spider/)
baiduspider-image
baiduspider-video
black hole
blackwidow
blackwidow 4.40
cfetch/1.0
cheesebot
cherrypicker
comodo+http(s)+crawler
converamultimediacrawler/0.1
dloader(naverrobot)
dloader(speedy spider)
dotbot
emailcollector
emailsiphon
emailwolf
everbeecrawler
extractorpro
flash processor
flash+processor
flatlandbot
flatlandbot/baypup
grub-client
hloader
home.thenewweb.com
htdig/3.1.5
httrack
ia_archiver
ichiro
ichiro
icollect
igetter
imagewalker
industry program
indy
indy library
innerprise
installshield
internetlinkagent/
intscanner
ipd
ipiumbot
iria
iupui research bot
java
java1
java1.3.0
java1.3.1
java2
jobo
joc web spider
johnhasbeenhere
jscript+processor
kapere
lachesis
larbin
larbin_2.6.1
larbin_2.6.1 larbin2.6.2@unspecified.mail
larbin_2.6.2
larbin_2.6.2 larbin2.6.2@unspecified.mail
leechget
libwww-perl
lightningdownload
linkalarm
linkchecker
linklint-checkonly
linkman
llupdatectrl
mac finder
mail sweeper
mass
mass downloader
mcbot
metaproducts
metaproducts download express
mfc_tear_sample
mfhttpscan
moget
moget
mozilla/4.0 (compatible; naverbot/1.0; http://help.naver.com/customer_webtxt_02.jsp)
mygetright
naver
naverbot
naverrobot
netpumper
newt
nextgensearchbot
nicerspro
nitro
nitro downloader
nudelsalat
nutch
obot
offline
offline explorer
page_prefetcher
pagmiedownload
pavuk
pixgrabber
plantynet_webrobot
plucker
pockey
popdexter
program
program shareware
progressive
progressive download
prowebwalker
proxytester
psbot
puf
puxarapido
python-urllib
python-webchecker
realdownload
repomonkey
repomonkey bait & tackle
robotmidareru
rpt-httpclient
scat
scoutabout
semanticdiscovery
siphon
sitesnagger
sitesnagger
sitewinder
slurp
slysearch
smartdownload
softwing_tear_agent
sogou head spider/3.0( http://www.sogou.com/docs/help/webmasters.htm
sogou orion spider/3.0( http://www.sogou.com/docs/help/webmasters.htm
sogou pic agent
sogou pic spider/3.0( http://www.sogou.com/docs/help/webmasters.htm
sogou spider
sogou spider
sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm
sogou-test-spider/4.0 (compatible; msie 5.5; windows 98)
sonic
sosospider
speeddownload
speedy
sprocket
sq
sq webscanner
stamina
star
star downloader
steeler
superhttp
surveybot
synobot
teleport
teleport
teleport pro
thunderstone
turnitinbot
turnitinbot
tweakmaster
twiceler
udmsearch
undisclosed
urlgetfile
utilmind
utilmind httpget
vcikjzddls
vobsub
voyager/2.0
voyager-hc/1.0
web downloader
webalta
webauto
webcapture
webclipping.com
webcollage
webcopier
webcopier
webinator
webleacher
webmole
webreaper
websauger
website extractor
webster
webstripper
webstripper
webzip
webzip
wep search
wep search 00
wget
whizbang
whsearch
wildsoft
wildsoft surfer
winhttp.winhttprequest
woriobot
www4mail
wwwoffle
xaldon
xaldon webspider
xedit
xenu
yandex
yandex*
yandexsomething/1.0
yanga worldsearch bot v1.1/beta
yanga worldsearch bot v1.1/beta (http
yeti
youdaobot
zao
zbot
zeus
zyborg
mauibot
crazywebcrawler-spider

Product Comment
sogou head spider/3.0( http://www.sogou.com/docs/help/webmasters.htm 07)
sogou orion spider/3.0( http://www.sogou.com/docs/help/webmasters.htm 07)
sogou pic spider/3.0( http://www.sogou.com/docs/help/webmasters.htm 07)
sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm 07)
Rule Path
Disallow /
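
The long group above lists many user agents that share a single `Disallow /` rule. A minimal sketch of that structure, again using Python's `urllib.robotparser`, is shown below. Only three agent names from the list are included, the file text is reconstructed rather than copied, and `examplebot` is a hypothetical crawler that is not named in any group.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction (an assumption): several User-agent
# lines in a row form one group that shares the rules after them.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /service",
    "Disallow: /admin",
    "",
    "User-agent: wget",
    "User-agent: httrack",
    "User-agent: mauibot",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# Agents named in the shared group are blocked from every path.
print(parser.can_fetch("Wget/1.21", "/menu"))
print(parser.can_fetch("MauiBot", "/"))

# An unlisted agent falls back to the "*" group.
print(parser.can_fetch("examplebot", "/menu"))
print(parser.can_fetch("examplebot", "/admin"))
```

This parser compares agent names case-insensitively and ignores anything after the first `/` in the requesting agent string, which is why `Wget/1.21` matches the `wget` entry.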

Comments

  • Block MJ12bot as it is just noise
  • Block Ahrefs
  • Block Sogou
  • Block SEOkicks
  • Block BlexBot
  • Block SISTRIX
  • Block Uptime robot
  • Block Ezooms Robot
  • Block Perl LWP
  • Block BlexBot
  • Block netEstate NE Crawler (+http://www.website-datenbank.de/)
  • Block WiseGuys Robot
  • Block Turnitin Robot
  • Block Heritrix
  • Block pricepi
  • Block Searchmetrics Bot
  • Block Eniro
  • Block SoGou
  • Block Youdao
  • Block Nikon JP Crawler
  • Block MegaIndex.ru

Warnings

  • 7 invalid lines.