alanwest.com
robots.txt

Robots Exclusion Standard data for alanwest.com

Resource Scan

Scan Details

Site Domain alanwest.com
Base Domain alanwest.com
Scan Status Ok
Last Scan 2024-11-10T22:43:11+00:00
Next Scan 2024-11-24T22:43:11+00:00

Last Scan

Scanned 2024-11-10T22:43:11+00:00
URL https://alanwest.com/robots.txt
Domain IPs 2001:8b0:dfaa:1af8::32, 81.187.2.252
Response IP 81.187.2.252
Found Yes
Hash 97fa294030b92b1bc3ba93f976e5e269339dc1946067b12f61e717de5e873061
SimHash c03dde9e73d3

Groups

mozilla/3.0 (compatible;miner;mailto:miner@miner.com.br)

Rule Path
Disallow

webferret

Rule Path
Disallow

due to a deficiency in java it's not currently possible to set the user-agent.

Rule Path
Disallow

no

Rule Path
Disallow

arachnophilia

Rule Path
Disallow

architextspider

Rule Path
Disallow

aspider/0.09

Rule Path
Disallow

auresys/1.0

Rule Path
Disallow

backrub/*.*

Rule Path
Disallow

big brother

Rule Path
Disallow

blackwidow

Rule Path
Disallow

bspider/1.0 libwww-perl/0.40

Rule Path
Disallow

cactvs chemistry spider

Rule Path
Disallow

digimarc cgireader/1.0

Rule Path
Disallow

checkbot/x.xx lwp/5.x

Rule Path
Disallow

cmc/0.01

Rule Path
Disallow

combine/0.0

Rule Path
Disallow

conceptbot/0.3

Rule Path
Disallow

crescent internet toolpak http ole control v.1.0

Rule Path
Disallow

root/0.1

Rule Path
Disallow

cs-hkust-indexserver/1.0

Rule Path
Disallow

cyberspyder/2.1

Rule Path
Disallow

deweb/1.01

Rule Path
Disallow

dragonbot/1.0 libwww/5.0

Rule Path
Disallow

eit-link-verifier-robot/0.2

Rule Path
Disallow

emacs-w3/v[0-9\.]+

Rule Path
Disallow

emailsiphon

Rule Path
Disallow

emc spider

Rule Path
Disallow

explorersearch

Rule Path
Disallow

explorer

Rule Path
Disallow

extractorpro

Rule Path
Disallow

felixide/1.0

Rule Path
Disallow

hazel's ferret web hopper,

Rule Path
Disallow

esirover v1.0

Rule Path
Disallow

fido/0.9 harvest/1.4.pl2

Rule Path
Disallow

hämähäkki/0.2

Rule Path
Disallow

kit-fireball/2.0 libwww/5.0a

Rule Path
Disallow

fish-search-robot

Rule Path
Disallow

mozilla/2.0 (compatible fouineur v2.0; fouineur.9bit.qc.ca)

Rule Path
Disallow

robot du crim 1.0a

Rule Path
Disallow

freecrawl

Rule Path
Disallow

funnelweb-1.0

Rule Path
Disallow

gcreep/1.0

Rule Path
Disallow

geturl.rexx v1.05

Rule Path
Disallow

golem/1.1

Rule Path
Disallow

gromit/1.0

Rule Path
Disallow

gulliver/1.1

Rule Path
Disallow

yes

Rule Path
Disallow

aitcsrobot/1.1

Rule Path
Disallow

wired-digital-newsbot/1.5

Rule Path
Disallow

htdig/3.0b3

Rule Path
Disallow

htmlgobble v2.2

Rule Path
Disallow

no

Rule Path
Disallow

ibm_planetwide,

Rule Path
Disallow

gestalticonoclast/1.0 libwww-fm/2.17

Rule Path
Disallow

ingrid/0.1

Rule Path
Disallow

incywincy/1.0b1

Rule Path
Disallow

informant

Rule Path
Disallow

infoseek robot 1.0

Rule Path
Disallow

infoseek sidewinder

Rule Path
Disallow

infospiders/0.1

Rule Path
Disallow

inspectorwww/1.0 http://www.greenpac.com/inspectorwww.html

Rule Path
Disallow

israelisearch/1.0

Rule Path
Disallow

jcrawler/0.2

Rule Path
Disallow

jeeves v0.05alpha (perl, lwp, lglb@doc.ic.ac.uk)

Rule Path
Disallow

jobot/0.1alpha libwww-perl/4.0

Rule Path
Disallow

joebot,

Rule Path
Disallow

jubiirobot

Rule Path
Disallow

jumpstation

Rule Path
Disallow

katipo/1.0

Rule Path
Disallow

kdd-explorer/0.1

Rule Path
Disallow

ko_yappo_robot/1.0.4(http://yappo.com/info/robot.html)

Rule Path
Disallow

labelgrab/1.1

Rule Path
Disallow

linkwalker

Rule Path
Disallow

logo.gif crawler

Rule Path
Disallow

lycos/x.x

Rule Path
Disallow

lycos_spider_(t-rex)

Rule Path
Disallow

magpie/1.0

Rule Path
Disallow

mediafox/x.y

Rule Path
Disallow

merzscope

Rule Path
Disallow

nec-meshexplorer

Rule Path
Disallow

momspider/1.00 libwww-perl/0.40

Rule Path
Disallow

monster/vx.x.x -$type ($ostype)

Rule Path
Disallow

motor/0.2

Rule Path
Disallow

muscatferret

Rule Path
Disallow

mwdsearch/0.1

Rule Path
Disallow

netcarta cyberpilot pro

Rule Path
Disallow

netmechanic

Rule Path
Disallow

netscoop/1.0 libwww/5.0a

Rule Path
Disallow

nhsewalker/3.0

Rule Path
Disallow

nomad-v2.x

Rule Path
Disallow

northstar

Rule Path
Disallow

occam/1.0

Rule Path
Disallow

hku www robot,

Rule Path
Disallow

orbsearch/1.0

Rule Path
Disallow

packrat/1.0

Rule Path
Disallow

patric/0.01a

Rule Path
Disallow

peregrinator-mathematics/0.7

Rule Path
Disallow

duppies

Rule Path
Disallow

pioneer

Rule Path
Disallow

pgp-ka/1.2

Rule Path
Disallow

resume robot

Rule Path
Disallow

road runner: imagescape robot (lim@cs.leidenuniv.nl)

Rule Path
Disallow

robbie/0.1

Rule Path
Disallow

computingsite robi/1.0 (robi@computingsite.com)

Rule Path
Disallow

roverbot

Rule Path
Disallow

safetynet robot 0.1,

Rule Path
Disallow

scooter/1.0

Rule Path
Disallow

not available

Rule Path
Disallow

senrigan/xxxxxx

Rule Path
Disallow

sg-scout

Rule Path
Disallow

shai'hulud

Rule Path
Disallow

simbot/1.0

Rule Path
Disallow

open text site crawler v1.0

Rule Path
Disallow

sitetech-rover

Rule Path
Disallow

slurp/2.0

Rule Path
Disallow

esismartspider/2.0

Rule Path
Disallow

snooper/b97_01

Rule Path
Disallow

solbot/1.0 lwp/5.07

Rule Path
Disallow

spanner/1.0 (linux 2.0.27 i586)

Rule Path
Disallow

no

Rule Path
Disallow

mozilla/3.0 (black widow v1.1.0; linux 2.0.27; dec 31 1997 12:25:00

Rule Path
Disallow

tarantula/1.0

Rule Path
Disallow

tarspider

Rule Path
Disallow

dlw3robot/x.y (in tclx by http://hplyot.obspm.fr/~dl/)

Rule Path
Disallow

templeton/

Rule Path
Disallow

titin/0.2

Rule Path
Disallow

titan/0.1

Rule Path
Disallow

ucsd-crawler

Rule Path
Disallow

urlck/1.2.3

Rule Path
Disallow

valkyrie/1.0 libwww-perl/0.40

Rule Path
Disallow

victoria/1.0

Rule Path
Disallow

vision-search/3.0'

Rule Path
Disallow

vwbot_k/4.2

Rule Path
Disallow

w3index

Rule Path
Disallow

w3m2/x.xxx

Rule Path
Disallow

wwwwanderer v3.0

Rule Path
Disallow

webcopy/

Rule Path
Disallow

webcrawler/3.0 robot libwww/5.0a

Rule Path
Disallow

webfetcher/0.8,

Rule Path
Disallow

weblayers/0.0

Rule Path
Disallow

weblinker/0.0 libwww-perl/0.1

Rule Path
Disallow

no

Rule Path
Disallow

webmoose/0.0.0000

Rule Path
Disallow

digimarc webreader/1.2

Rule Path
Disallow

webs@recruit.co.jp

Rule Path
Disallow

webvac/1.0

Rule Path
Disallow

webwalk

Rule Path
Disallow

webwalker/1.10

Rule Path
Disallow

webwatch

Rule Path
Disallow

wget/1.4.0

Rule Path
Disallow

w3mir

Rule Path
Disallow

no

Rule Path
Disallow

wwwc/0.25 (win95)

Rule Path
Disallow

none

Rule Path
Disallow

xget/0.7

Rule Path
Disallow

nederland.zoek

Rule Path
Disallow

bizbot04 kirk.overleaf.com

Rule Path
Disallow

happybot (gserver.kw.net)

Rule Path
Disallow

californiabrownspider

Rule Path
Disallow

ei*net/0.1 libwww/0.1

Rule Path
Disallow

ibot/1.0 libwww-perl/0.40

Rule Path
Disallow

merritt/1.0

Rule Path
Disallow

statfetcher/1.0

Rule Path
Disallow

teachersoft/1.0 libwww/2.17

Rule Path
Disallow

www collector

Rule Path
Disallow

processor/0.0alpha libwww-perl/0.20

Rule Path
Disallow

wobot/1.0 from 206.214.202.45

Rule Path
Disallow

libertech-rover www.libertech.com?

Rule Path
Disallow

whowhere robot

Rule Path
Disallow

iti spider

Rule Path
Disallow

w3index

Rule Path
Disallow

mycnnspider

Rule Path
Disallow

summycrawler

Rule Path
Disallow

ogspider

Rule Path
Disallow

linklooker

Rule Path
Disallow

cyberspyder (amant@www.cyberspyder.com)

Rule Path
Disallow

slowbot

Rule Path
Disallow

heraspider

Rule Path
Disallow

surfbot

Rule Path
Disallow

bizbot003

Rule Path
Disallow

webwalker

Rule Path
Disallow

sandbot

Rule Path
Disallow

enigmabot

Rule Path
Disallow

spyder3.microsys.com

Rule Path
Disallow

www.freeloader.com.

Rule Path
Disallow

googlebot

Rule Path
Disallow

metagopher

Rule Path
Disallow

*

Rule Path
Disallow /cgi-bin/
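As a rough illustration of how the groups above are likely to be interpreted, the sketch below feeds a two-group excerpt to Python's standard `urllib.robotparser`. The excerpt is an assumption reconstructed from the listing (a named group with an empty `Disallow`, plus the `*` group), not the full file; `unlistedbot` and the URL paths are hypothetical names used only for the demonstration.

```python
from urllib.robotparser import RobotFileParser

# Assumed excerpt in the shape of the groups listed above:
# a named robot with an empty Disallow, then the catch-all group.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow:

User-agent: *
Disallow: /cgi-bin/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# An empty Disallow places no restriction on the named robot.
listed_cgi = parser.can_fetch("googlebot", "https://alanwest.com/cgi-bin/form")

# "unlistedbot" is a hypothetical agent that matches only the "*" group.
unlisted_cgi = parser.can_fetch("unlistedbot", "https://alanwest.com/cgi-bin/form")
unlisted_page = parser.can_fetch("unlistedbot", "https://alanwest.com/index.html")

print(listed_cgi, unlisted_cgi, unlisted_page)
```

Under this parser, a robot named in its own group may fetch any path, while a robot that falls through to `*` is kept out of `/cgi-bin/` only. Other crawlers may match user-agent strings differently, so this is one reading rather than a guarantee of behaviour.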

Comments

  • Robots.txt file from http://www.searchengineworld.com
  • Built from text file
  • http://info.webcrawler.com/mak/projects/robots/active/all.txt
  • This restricts access to only known and registered robots.
  • /* send all your seo requests to abuse@google.com*/

Warnings

  • 6 invalid lines.