discoverybit.com
robots.txt

Robots Exclusion Standard data for discoverybit.com

Resource Scan

Scan Details

Site Domain discoverybit.com
Base Domain discoverybit.com
Scan Status Ok
Last Scan 2024-09-29T17:29:10+00:00
Next Scan 2024-10-06T17:29:10+00:00

Last Scan

Scanned 2024-09-29T17:29:10+00:00
URL https://discoverybit.com/robots.txt
Redirect https://www.discoverybit.com/robots.txt
Redirect Domain www.discoverybit.com
Redirect Base discoverybit.com
Domain IPs 104.21.14.185, 172.67.160.37, 2606:4700:3031::ac43:a025, 2606:4700:3036::6815:eb9
Redirect IPs 104.21.14.185, 172.67.160.37, 2606:4700:3031::ac43:a025, 2606:4700:3036::6815:eb9
Response IP 172.67.160.37
Found Yes
Hash 23d60160fbc5bee1b0b66d2c46f7b3561bd1319976e43e84e603e22868162747
SimHash 539e7372e3c3
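
The Hash value above is 64 hexadecimal digits, which is consistent with a 256-bit digest such as SHA-256, although the report does not name the algorithm. The following is a minimal sketch of how a scanner might use such a digest to notice that a robots.txt file changed between scans. SHA-256 over the raw response bytes is an assumption, and `fingerprint` and `has_changed` are hypothetical helper names, not part of any scanner's API.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes (SHA-256 is assumed)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """True when the newly fetched body no longer matches the stored digest."""
    return fingerprint(body) != previous_digest


# Hypothetical file contents, used only to exercise the helpers.
old_body = b"User-agent: *\nDisallow: /wp-admin/\n"
new_body = b"User-agent: *\nDisallow: /wp-admin/\nDisallow: /tmp/\n"
stored = fingerprint(old_body)
```

An exact digest flips on any single-byte change. SimHash, by contrast, is generally used as a similarity-preserving fingerprint, so the two fields likely serve different purposes.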

Groups

*

Rule Path
Disallow

*

Rule Path
Disallow /wp-admin/
Allow /wp-admin/admin-ajax.php
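
The Disallow/Allow pair above only behaves as intended when the most specific rule wins. Under RFC 9309 the longest matching path takes precedence, and Allow is preferred on a tie. The sketch below shows that precedence using plain prefix matching. `is_allowed` is a hypothetical helper, and the rule list is transcribed from this report rather than read from the live file. Note that Python's `urllib.robotparser` applies rules in file order instead, so it can give a different answer for the same pair.

```python
def is_allowed(rules, path):
    """Apply longest-match precedence; Allow wins ties; no match means allowed.

    rules is a list of (directive, path_prefix) tuples.
    """
    best_length = -1
    allowed = True
    for directive, prefix in rules:
        if not prefix:
            continue  # an empty Disallow value matches nothing
        if path.startswith(prefix):
            is_allow = directive.lower() == "allow"
            longer = len(prefix) > best_length
            tie_for_allow = len(prefix) == best_length and is_allow
            if longer or tie_for_allow:
                best_length = len(prefix)
                allowed = is_allow
    return allowed


# Rules transcribed from the first two "*" groups in this report.
RULES = [
    ("Disallow", ""),
    ("Disallow", "/wp-admin/"),
    ("Allow", "/wp-admin/admin-ajax.php"),
]
```

With these rules, `/wp-admin/admin-ajax.php` is allowed because the Allow path is longer than the Disallow path, while other paths under `/wp-admin/` stay blocked.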

mauibot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 40

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 40

msnbot-media

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 40
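
The three groups above set a crawl-delay but define no path rules. Crawl-delay is a non-standard extension (it is not defined in RFC 9309), so support varies by crawler. As a rough sketch, Python's standard `urllib.robotparser` can read the value. The input lines below are a reconstruction from this report, not the actual file contents, and `examplebot` is a made-up user agent.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report: bingbot gets 40 seconds, "*" gets 10.
lines = [
    "User-agent: bingbot",
    "Crawl-delay: 40",
    "",
    "User-agent: *",
    "Crawl-delay: 10",
]

parser = RobotFileParser()
parser.parse(lines)

bing_delay = parser.crawl_delay("bingbot")        # delay for the named group
default_delay = parser.crawl_delay("examplebot")  # falls back to the "*" group
bing_can_fetch = parser.can_fetch("bingbot", "/any/path")  # no rules: allowed
```

A crawler that honors the field would wait at least that many seconds between successive requests to the host.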

a6-indexer

Rule Path
Disallow /

aboundex
asterias
backdoorbot/1.0
backlinkcrawler
black hole
blowfish/1.0
botalot
builtbottough
bullseye/1.0
bunnyslippers
ca-crawler
ccbot
ccbot/2.0
cegbfeieh
cheesebot
cherrypicker
cherrypickerelite/1.0
cherrypickerse/1.0
copyrightcheck
cosmos
crescent
crescent internet toolpak http ole control v.1.0
dbot
dittospyder
easouspider
eccp
emailcollector
emailsiphon
emailwolf
erocrawler
exabot/3.0
extractorpro
foobot
fr-crawler
friendly crawler
gigablastopensource
goodzer/2.0
grapeshotcrawler/2.0
harvest/1.5
heritrix/1.14.4
hloader
httplib
hubspot crawler 1.0
humanlinks
istellabot
infonavirobot
jennybot
kenjin spider
keyword density/0.9
konqueror/3.5
lexibot
libweb/clshttp
linkextractorpro
linkscan/8.1a unix
linkwalker
lipperhey seo service
lnspiderguy
lwp-trivial
lwp-trivial/1.34
mata hari
meanpathbot
microsoft url control - 5.01.4511
microsoft url control - 6.00.8169
miixpc
miixpc/4.2
mister pix
mixrankbot
moget
moget/2.1
mozilla/4
mozilla/4.0 (compatible; bullseye; windows 95)
mozilla/4.0 (compatible; msie 4.0; windows 95)
mozilla/4.0 (compatible; msie 4.0; windows 98)
mozilla/4.0 (compatible; msie 4.0; windows nt)
mozilla/4.0 (compatible; msie 4.0; windows xp)
mozilla/4.0 (compatible; msie 4.0; windows 2000)
mozilla/4.0 (compatible; msie 4.0; windows me)
mozilla/5
nbot/2.0
netants
netzcheckbot/1.0
nicerspro
obot/2.3.1
offline explorer
openfind
openfind data gathere
pagesinventory
panscient.com

Rule Path
Disallow /https%3A//www.siteground.com/kb/google_marked_my_website_as_harmful/

propowerbot/2.14
prowebwalker
queryn metasearch
repomonkey
repomonkey bait & tackle/v1.01
riddler
rma
ru_bot
screenerbot crawler beta 2.0
semrushbot/0.98~bl
seoengworldbot
seokicks-robot
seplinkbot
seplinkbot/1.0
seznambot
sistrix
sitesnagger
smtbot/1.0
spankbot
spbot
sogou web spider
spanner
suzuran
szukacz/1.4
teleport
teleportpro
telesoft
the intraformant
thenomad
tighttwatbot
titan
tocrawl/urldispatcher
true_robot
true_robot/1.0
turingos
urlappendbot
urly warning
vci
vci webviewer vci webviewer win32
voltron
wbsearchbot
web image collector
webauto
webbandit
webbandit/3.50
webcapture 2.0
webcheck 1.10.4
webcopierdanilo
c14542.sgvps.net webenhancer
webmasterworldforumbot
websauger
website quester
webster pro
webstripper
webzip
webzip/4.0
wesee
wget
wget/1.5.3
wget/1.6
www-collector-e
www.integromedb.org/crawler
xenu link sleuth
xenu's link sleuth 1.1c
zeus
zeus 32297 webster pro v2.9 win32
xovibot

Rule Path
Disallow /Danilo

zumbot

Rule Path
Disallow /cgi-bin/

ahrefsbot

Rule Path
Disallow /

sch-fast-se-crawl02.osl.basefarm.net
sch-fast-se-crawl04.osl.basefarm.net
ichiro
naverbot
yeti
baiduspider-video
baiduspider-image
sogou spider
youdaobot
mj12bot

Rule Path
Disallow /wp-content/uploads/

*

Rule Path
Disallow /calendar/action~posterboard/
Disallow /calendar/action~agenda/
Disallow /calendar/action~oneday/
Disallow /calendar/action~month/
Disallow /calendar/action~week/
Disallow /calendar/action~stream/
Disallow /calendar/action~undefined/
Disallow /calendar/action~http%3A/
Disallow /calendar/action~default/
Disallow /calendar/action~poster/
Disallow /calendar/action~*/
Disallow /*controller%3Dai1ec_exporter_controller*
Disallow /*/action~*/

Other Records

Field Value
crawl-delay 10
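
Several rules in the group above rely on the `*` wildcard. A minimal sketch of how such patterns can be matched is to translate each one into a regular expression anchored at the start of the path, with `*` matching any run of characters and a trailing `$` anchoring the end. `pattern_to_regex` and `matches` are hypothetical helpers, and the sketch compares raw strings without normalizing percent-encoding, which a full implementation would need to handle.

```python
import re


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" where "*" appeared.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def matches(pattern, path):
    """True when the path matches the pattern from its first character."""
    return pattern_to_regex(pattern).match(path) is not None


month_view = matches("/calendar/action~*/", "/calendar/action~month/")
nested_view = matches("/*/action~*/", "/events/action~week/page")
plain_calendar = matches("/calendar/action~*/", "/calendar/")
```

Under this reading, `/calendar/action~*/` already covers the individual `/calendar/action~…/` entries listed before it, so those lines are redundant but harmless.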

Other Records

Field Value
sitemap https://www.discoverybit.com/sitemap_index.xml

Comments

  • Begin Exclusion From Directories from robots.txt
  • disallow OSL.basefarm.net

Warnings

  • 2 invalid lines.
  • `user-agenc14542.sgvps.nett` is not a known field.
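
Warnings like these can be reproduced with a simple line check. The sketch below splits each non-comment line at the first colon and flags field names outside a small known set. `KNOWN_FIELDS` and `find_invalid_lines` are hypothetical, the set of accepted fields is an assumption, and the sample text is illustrative rather than the site's actual file.

```python
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}


def find_invalid_lines(text):
    """Return (line_number, message) for each line that fails the check."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        field, separator, _value = line.partition(":")
        name = field.strip().lower()
        if not separator:
            problems.append((number, "missing ':' separator"))
        elif name not in KNOWN_FIELDS:
            problems.append((number, f"`{name}` is not a known field"))
    return problems


# Illustrative input; only the unknown field name is taken from the warning.
sample = (
    "User-agent: *\n"
    "Disallow: /wp-admin/\n"
    "user-agenc14542.sgvps.nett: example\n"
    "just some words\n"
)
problems = find_invalid_lines(sample)
```

Parsers typically skip lines like these rather than rejecting the whole file, which is why the scan status can still be Ok.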