
ewhois.org
robots.txt

Robots Exclusion Standard data for ewhois.org

Resource Scan

Scan Details

Site Domain	ewhois.org
Base Domain	ewhois.org
Scan Status	Ok
Last Scan	2024-10-20T00:03:16+00:00
Next Scan	2024-11-19T00:03:16+00:00

Last Scan

Scanned	2024-10-20T00:03:16+00:00
URL	https://ewhois.org/robots.txt
Domain IPs	104.21.90.138, 172.67.156.196, 2606:4700:3031::ac43:9cc4, 2606:4700:3037::6815:5a8a
Response IP	172.67.156.196
Found	Yes
Hash	be4c06d36e89cb7fcbb2b8c1cb738005d6b19de7492a2f20285efd11bff7f432
SimHash	7f9453cbcdf4
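
The Hash value above is 64 hexadecimal characters long, which is the length of a SHA-256 digest. The scan page does not name the algorithm, so SHA-256 is an assumption in this minimal sketch of how such a content fingerprint could be computed. The function name content_fingerprint and the sample body are hypothetical.

```python
import hashlib


def content_fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256. The scan report does not say which hash it uses.
    """
    return hashlib.sha256(body).hexdigest()


# Hypothetical sample body, not the actual file served by ewhois.org.
sample_body = b"User-agent: wget\nDisallow: /\n"
digest = content_fingerprint(sample_body)
print(len(digest))  # a SHA-256 hex digest is always 64 characters
```

Comparing the digest from one scan with the next is a cheap way to tell whether the file changed at all.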

Groups

ahrefsbot
ahrefssiteaudit
adbeat_bot
alexibot
appengine
aqua_products
asterias
b2w/0.1
backdoorbot/1.0
becomebot
blekkobot
blexbot
blowfish/1.0
bookmark search tool
botalot
builtbottough
bullseye/1.0
bunnyslippers
ccbot
cheesebot
cherrypicker
cherrypickerelite/1.0
cherrypickerse/1.0
chroot
copernic
copyrightcheck
cosmos
crescent
crescent internet toolpak http ole control v.1.0
dittospyder
dotbot
dumbot
emailcollector
emailsiphon
emailwolf
enterprise_search
enterprise_search/1.0
erocrawler
es
exabot
extractorpro
fairad client
flaming attackbot
foobot
gaisbot
getright/4.2
gigabot
grub
grub-client
go-http-client
harvest/1.5
hatena antenna
hloader
http://www.searchengineworld.com bot
http://www.webmasterworld.com bot
httplib
humanlinks
infonavirobot
iron33/1.0.2
jamesbot
jennybot
jetbot
jetbot/1.0
jorgee
kenjin spider
keyword density/0.9
lexibot
libweb/clshttp
linkextractorpro
linkpadbot
linkscan/8.1a unix
linkwalker
lnspiderguy
looksmart
lwp-trivial
lwp-trivial/1.34
mata hari
megalodon
microsoft url control
microsoft url control - 5.01.4511
microsoft url control - 6.00.8169
miixpc
miixpc/4.2
mister pix
moget
moget/2.1
naver
nerdybot
netants
netmechanic
nicerspro
nutch
openbot
openfind
openfind data gathere
oracle ultra search
perman
propowerbot/2.14
prowebwalker
psbot
python-urllib
queryn metasearch
radiation retriever 1.1
repomonkey
repomonkey bait & tackle/v1.01
rma
rogerbot
scooter
searchpreview
semrushbot
semrushbot
semrushbot-sa
seokicks-robot
sootle
spankbot
spanner
spbot
stanford
stanford comp sci
stanford compclub
stanford compsciclub
stanford spiderboys
surveybot
surveybot_ignoreip
suzuran
szukacz/1.4
szukacz/1.4
telesoft
teoma
the intraformant
thenomad
tocrawl/urldispatcher
true_robot
true_robot/1.0
turingos
typhoeus
url control
url_spider_pro
urly warning
vci
vci webviewer vci webviewer win32
web image collector
webauto
webbandit
webbandit/3.50
webenhancer
webmasterworld extractor
webmasterworldforumbot
websauger
website quester
webster pro
webvac
www-collector-e
zeus
zeus 32297 webster pro v2.9 win32
zeus link scout

Rule	Path
Disallow	/

ubicrawler
doc
zao
gsa-crawler

Rule	Path
Disallow	/

sitecheck.internetseer.com
zealbot
msiecrawler
sitesnagger
webstripper
webcopier
fetch
offline explorer
teleport
teleportpro
webzip
webzip/4.0
linko
httrack
microsoft.url.control
xenu
xenu's
xenu's link sleuth 1.1c
larbin
libwww
zyborg
download ninja

Rule	Path
Disallow	/

wget
wget/1.11.4
wget/1.13.4
wget/1.12
wget/1.5.3
wget/1.6

Rule	Path
Disallow	/

webreaper
cncdialer
maxthon
mj12bot
slurp
screaming frog seo spider

Rule	Path
Disallow	/
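
Every group listed above carries the same single rule, a Disallow on the path /. As a rough illustration of how a well-behaved client might read such a group, the sketch below feeds a small hand-written fragment to Python's standard urllib.robotparser. The fragment is an assumed reconstruction using two agent names from the scan, not the file as served, and examplebot is a made-up agent that matches no group.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of one group; the real file lists many more agents.
SAMPLE_ROBOTS_TXT = """\
User-agent: wget
User-agent: httrack
Disallow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS_TXT.splitlines())

# An agent named in the group is barred from every path.
wget_allowed = parser.can_fetch("wget", "https://ewhois.org/")

# A hypothetical agent that no group names falls through to "allowed".
other_allowed = parser.can_fetch("examplebot", "https://ewhois.org/")

print(wget_allowed, other_allowed)
```

Note that robots.txt is advisory: the parser only reports what the file asks for, and it is up to the client to comply.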


Comments

robots.txt
Specific bots settings
Crawlers that are kind enough to obey, but which we'd rather not have
unless they're feeding search engines.
Some bots are known to be trouble, particularly those designed to copy entire sites.
wget in its recursive mode is a frequent problem.
There is a wait option (--wait), for instance, that you can use to set the delay between hits.
A capture bot that downloads gazillions of pages with no public benefit.
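
The comments ask recursive downloaders to pause between hits. The same idea can be sketched for a custom client: a minimal loop, assuming a caller-supplied fetch function, that sleeps between consecutive requests. The names fetch_politely, fetch, and wait_seconds are hypothetical, and the one-second default is an arbitrary illustration rather than a value taken from the file.

```python
import time
from typing import Callable, Iterable, List


def fetch_politely(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    wait_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call fetch(url) for each URL, pausing wait_seconds between calls."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one.
            sleep(wait_seconds)
        results.append(fetch(url))
    return results


# Usage with stand-ins so the sketch runs without touching the network.
recorded_pauses: List[float] = []
pages = fetch_politely(
    ["https://example.com/a", "https://example.com/b", "https://example.com/c"],
    fetch=lambda url: "body of " + url,
    wait_seconds=2.0,
    sleep=recorded_pauses.append,
)
print(len(pages), recorded_pauses)
```

Passing the sleep function in as a parameter keeps the delay easy to test and easy to swap for a rate limiter later.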
