dmm.org.uk
robots.txt

Robots Exclusion Standard data for dmm.org.uk

Resource Scan

Scan Details

Site Domain dmm.org.uk
Base Domain dmm.org.uk
Scan Status Ok
Last Scan 2024-11-09T09:30:46+00:00
Next Scan 2024-11-16T09:30:46+00:00

Last Scan

Scanned 2024-11-09T09:30:46+00:00
URL http://dmm.org.uk/robots.txt
Domain IPs 176.32.230.13
Response IP 176.32.230.13
Found Yes
Hash 2c0fc163d79743edb51b6637d69e7b8d44d611b57be902b0cd10b80e9a25b707
SimHash 41a75fd18e32
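
The Hash field above is 64 hexadecimal digits long, which is consistent with a SHA-256 digest of the fetched file, though the report does not name the algorithm. As a minimal sketch under that assumption, a content hash of a robots.txt body could be computed as follows. The function name `content_hash` and the sample body are illustrative only and are not part of the scanner.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt body.

    Assumption: the report's Hash field is SHA-256 over the raw bytes.
    The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest()


# Illustrative body only; this is not the actual dmm.org.uk file.
sample = b"User-agent: *\nDisallow: /\n"
digest = content_hash(sample)
print(len(digest))  # number of hex digits in the digest
```

Comparing the digest from one scan with the digest from the next is a cheap way to detect that the file changed, without storing or diffing the full text.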

Groups

slurp
googlebot
msnbot
teoma
changedetection
changedetection

Rule Path
Disallow /abandon/i/
Disallow /abandon/images/
Disallow /adverts/i/
Disallow /adverts/t/
Disallow /archives/i/
Disallow /archives/images/
Disallow /articles/t/
Disallow /banners/
Disallow /books/i/
Disallow /books/images/
Disallow /books/t/
Disallow /borings/i/
Disallow /boringsi/
Disallow /cemetery/
Disallow /census/i/
Disallow /certs/images/
Disallow /cgi-bin/
Disallow /childemp/i/
Disallow /coalmag/i/
Disallow /coalmag/t/
Disallow /coalnews/i/
Disallow /coalnews/t/
Disallow /cokegas/i/
Disallow /cokegas/t/
Disallow /colleng/i/
Disallow /colleng/t/
Disallow /colliery/banners/
Disallow /colliery/i/
Disallow /colliery/images/
Disallow /collmap/i/
Disallow /collnear/i/
Disallow /company/i/
Disallow /company/images/
Disallow /css/
Disallow /daysout/i/
Disallow /daysout/images/
Disallow /daysout/t/
Disallow /digital/
Disallow /digital/heaviside/
Disallow /digital/i/
Disallow /digital/police/
Disallow /digital/punshon/
Disallow /displays/i/
Disallow /displays/images/
Disallow /dmaarchv/i/
Disallow /educate/i/
Disallow /educate/images/
Disallow /employee/i/
Disallow /errors/
Disallow /gala/i/
Disallow /gala/t/
Disallow /galantry/i/
Disallow /galantry/i/
Disallow /gallery/i/
Disallow /gallery/t/
Disallow /grid/i/
Disallow /history/i/
Disallow /history/images/
Disallow /history/t/
Disallow /images/
Disallow /incident/i/
Disallow /indivdu0/i/
Disallow /indivdu1/i/
Disallow /indivdu2/i/
Disallow /indivdu3/i/
Disallow /individ/
Disallow /individ/1883/
Disallow /individ/1884/
Disallow /individ/1888/
Disallow /individ/1889/
Disallow /individ/1890/
Disallow /individ/1891/
Disallow /individ/1892/
Disallow /individ/1893/
Disallow /individ/1894/
Disallow /individ/1895/
Disallow /individ/1896/
Disallow /individ/1897/
Disallow /individ/1898/
Disallow /individ/1899/
Disallow /individ/1900/
Disallow /individ/1901/
Disallow /individ/1902/
Disallow /individ/1903/
Disallow /individ/1904/
Disallow /individ/1905/
Disallow /individ/1906/
Disallow /individ/1907/
Disallow /individ/1908/
Disallow /individ/1909/
Disallow /individ/1910/
Disallow /individ/1911/
Disallow /individ/1912/
Disallow /individ/1913/
Disallow /individ/1914/
Disallow /individ/i/
Disallow /individ/t-00/
Disallow /individ/t-01/
Disallow /individ/t-02/
Disallow /individ/t-03/
Disallow /individ/t-04/
Disallow /individ/t-05/
Disallow /individ/t-06/
Disallow /individ/t-07/
Disallow /individ0/i/
Disallow /individ1/i/
Disallow /individ2/i/
Disallow /individn/i/
Disallow /individx/i/
Disallow /inquests/i/
Disallow /inquests/i/
Disallow /localrec/i/
Disallow /lowsrc/
Disallow /managers/i/
Disallow /maps/i/
Disallow /masterix/i/
Disallow /memorial/i/
Disallow /memorial/t/
Disallow /minequar/i/
Disallow /minequar/t/
Disallow /minerals/
Disallow /minerals/i/
Disallow /minerals/t/
Disallow /minjourn/i/
Disallow /minjourn/t/
Disallow /misc/i/
Disallow /names/i/
Disallow /names/images/
Disallow /ncb/i/
Disallow /ncbarchv/i/
Disallow /ncbarchv/t/
Disallow /pastpres/a/
Disallow /pastpres/i/
Disallow /pastpres/t/
Disallow /prosecut/i/
Disallow /railway/i/
Disallow /railway/t/
Disallow /reports/i/
Disallow /reports/images/
Disallow /seaham/i/
Disallow /shafts/i/
Disallow /shaftsf/i/
Disallow /shaftsm/i/
Disallow /sitemap/i/
Disallow /spam/
Disallow /stats/i/
Disallow /stats/images/
Disallow /temp/
Disallow /tramsime/t/
Disallow /transime/i/
Disallow /ukinqust/i/
Disallow /ukinqust/images/
Disallow /uknames/i/
Disallow /uknames/images/
Disallow /ukreport/i/
Disallow /ukreport/images/
Disallow /videos/i/
Disallow /videos/t/
Disallow /welfare/i/
Disallow /welfare/t/
Disallow /whoswho/i/
Disallow /whoswho/images/
Disallow /pitwork/1images/
Disallow /pitwork/2002/
Disallow /pitwork/2003/
Disallow /pitwork/2003a/
Disallow /pitwork/2003b/
Disallow /pitwork/2003c/
Disallow /pitwork/2003gala/
Disallow /pitwork/2004/
Disallow /pitwork/2004a/
Disallow /pitwork/2005/
Disallow /pitwork/2005a/
Disallow /pitwork/2006/
Disallow /pitwork/2007/
Disallow /pitwork/2008/
Disallow /pitwork/76banner/
Disallow /pitwork/83banner/
Disallow /pitwork/aberfan/
Disallow /pitwork/animates/
Disallow /pitwork/awallace/
Disallow /pitwork/banners/
Disallow /pitwork/bannersx/
Disallow /pitwork/bates/
Disallow /pitwork/beamish/
Disallow /pitwork/bsharp/
Disallow /pitwork/cgi-bin/
Disallow /pitwork/chimages/
Disallow /pitwork/colpics/
Disallow /pitwork/daz/
Disallow /pitwork/dedwards/
Disallow /pitwork/errors/
Disallow /pitwork/export/
Disallow /pitwork/firstaid/
Disallow /pitwork/hartley/
Disallow /pitwork/huskar/
Disallow /pitwork/i/
Disallow /pitwork/images/
Disallow /pitwork/jstocks/
Disallow /pitwork/jstocks2/
Disallow /pitwork/lamps/
Disallow /pitwork/midi/
Disallow /pitwork/notts/
Disallow /pitwork/pleasley/
Disallow /pitwork/roy/
Disallow /pitwork/roy2/
Disallow /pitwork/roy2002/
Disallow /pitwork/roy3/
Disallow /pitwork/roydec02/
Disallow /pitwork/slideshw/
Disallow /pitwork/spam/
Disallow /pitwork/temp/
Disallow /pitwork/wavs/
Disallow /pitwork/whaley/
Disallow /forum/attachments/
Disallow /forum/avatars/
Disallow /forum/Packages/
Disallow /forum/Smileys/
Disallow /forum/Themes/

aboundex
aboundexbot
aleadsoftbot
botw spider
backrub
baglbot
baiduspider
baiduspider
baiduspider+
becomebot
beijingcrawler
bilbo
bilgibot
botrighthere
bumblebee
buzzrankingbot
catchbot
cazoodlebot
charlotte
cherrypicker
clushbot
copyrightcheck
crawler
crescent
cydralspider
dynamic
datafountains
domaincrawler
diamondbot
dittospyder
dulance bot
earthcom.info
edi
edisterbot
easydl
emailcollector
emailmarketingrobot
emailwolf
emeraldshield.com webbot
exabot
exabot-images
exabot-test
exalead ng
fangcrawl
feed::find
gigabot
gigabot
gigabotsitesearch
grub.org
gurujibot
hagansreportcrawlbot
hailoobot
hatena antenna
hatena bookmark
hatena rss
hatenascreenshot
helix
hiddenmarket
huaweisymantecspider
hyperestraier
iiitbot
infociousbot
jetbot
keeprightbot
kretrieve
kolinka forum search
looq
letscrawl.com
lincoln state web browser
linkedinbot
links4us-crawler
linkwalker
lsearch/sondeur
mj12bot
mapoftheinternet.com
mirago
moreoverbot
np
npbot
nationaldirectory
nerdbynature.bot
netcarta_webmapper
newsgator
nextgensearchbot
nudelsalat
nutch
oozbot
omniexplorer_bot
openintelligencedata
oracle enterprise search
pmafind
pagepeeker
pajaczek
peerfactor 404 crawler
peerfactor crawler
plantynet
plantynet_webrobot
pogodak!
python-urllib
quickfinder crawler
radiation retriever
reaper
redcarpet
scorpionbot
scrubby
scumbot
search17
seeker.lookseek.com
seznambot
showxml
sistrix
skimbot
snappreviewbot
snapbot
socialradarbot
spankbot
sharegloo_bot
speedy
speedy spider
speedyspider
speedy_spider
squigglebotbot
steeler
sunrise
surveybot
synapticsearch
t-h-u-n-d-e-r-s-t-o-n-e
tmcrawler
talkro web-shot
tarantula
terrawizbot
theinformant
thriceler
tridentspider
tutorial crawler
tweetmemebot
tweetedtimes bot
twiceler
uri::fetch
vagabondo
vengabot
vonna.com b o t
vortex
votay bot
voyager
wire
wisebot
walhello appie
webalta crawler
webbandit
webcorp
webbot
webclipping.com
webinator
xspider
xaldon_webspider
xerka webbot
yahoo-mmcrawler
yodaobot
yoono
zyborg
abot
bot
botlist
bumblebee
ccbot
dealgates
dotbot
envolk
exabot
exabot-images
exabot-thumbnails
exactseek-pagereaper
flatlandbot
followsite
gigabot
googlebot-image
googlebot-mobile
ivia
ivia page fetcher
ia_archiver
iaskspider
ichiro
kalooga
larbin
magpie-crawler
msnbot-media
msnbot-newsblogs
msnbot-products
naverbot
ng
nicebot
panscient.com
plinki
psbot
searchbot
seekbot
snap.com
snap.com beta crawler
sohu
sosospider
sphsearch
spider
turnitinbot
twiceler
unwrapbot
voyager
vscooter
webcollage
yoono

Rule Path
Disallow /

bilbo

Rule Path
Disallow /

vagabondo

Rule Path
Disallow /

webdatacentrebot

Rule Path
Disallow /

botmobi

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

catchbot

Rule Path
Disallow /

firstsearchbot
firstsearchbot/1.0

Rule Path
Disallow /

gaisbot

Rule Path
Disallow /

keywenbot

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

sqwidgebot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

firstsearchbot/1.0
firstsearchbot

Rule Path
Disallow /

botw spider

Rule Path
Disallow /

datapatrol/nutch-1.0
datapatrol
garlikcrawler/1.1
garlikcrawler

Rule Path
Disallow /

gingercrawler/1.0
gingercrawler

Rule Path
Disallow /

smart.apnoti.com robot/v1.34 (http://smart.apnoti.com/bot)

Rule Path
Disallow /

webalta crawler/1.3.23
webalta

Rule Path
Disallow /

vdbot/1.0
vdbot

Rule Path
Disallow /

servage robot
servage

Rule Path
Disallow /

purebot
purebot/1.1

Rule Path
Disallow /

spbot
spbot/1.0
spbot/2.0

Rule Path
Disallow /

linguee

Rule Path
Disallow /

purebot
purebot/1.1

Rule Path
Disallow /

askpeter
askpeter_bot
askpeter_bot/5.1

Rule Path
Disallow /

methabot

Rule Path
Disallow /

discobot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

baiduimagespider

Rule Path
Disallow /

baidumobaider

Rule Path
Disallow /

mozilla/5.0 (compatible; baiduspider/2.0; +http://www.baidu.com/search/spider.html)
baiduspider/2.0
baiduspider

Rule Path
Disallow /

catchbot
catchbot/2.0
catchbot/2.0; +http://www.catchbot.com

Rule Path
Disallow /

mozilla/5.0 (compatible; yahoo! searchmonkey 1.0; http://developer.yahoo.com/searchmonkey/useragent)
yahoo! searchmonkey 1.0
searchmonkey 1.0

Rule Path
Disallow /

mozilla/5.0 (compatible; firstsearchbot/1.0; +http://www.first-search.com/dmm-pitwork.org.uk.htm)

Rule Path
Disallow /

mozilla/5.0 (compatible; purebot/1.1; +http://www.puritysearch.net/)

Rule Path
Disallow /

chen li/nutch-1.0

Rule Path
Disallow /

magus bot 1.0

Rule Path
Disallow /

mozilla/5.0 (compatible; search17bot/1.1; http://www.search17.com/bot.php)
search17bot/1.1

Rule Path
Disallow /

easydl/3.04 http://keywen.com/encyclopedia/bot
easydl/3.04

Rule Path
Disallow /

corensearchbot/1.4 en libwww-perl/5.808
corensearchbot/1.4

Rule Path
Disallow /

contextad bot 1.0
mozilla/4.0 (compatible; msie 6.0; windows nt 5.0;.net clr 1.0.3705; contextad bot 1.0)

Rule Path
Disallow /

domaincrawler 1.0

Rule Path
Disallow /

vwbot - corensearchbot/1.5 en derivative
vwbot

Rule Path
Disallow /

mozilla/5.0 (compatible; goguidesbot/1.3; http://www.goguides.org/spider.html)
goguidesbot/1.3
goguidesbot

Rule Path
Disallow /

mozilla/5.0 (snappreviewbot) gecko/20061206 firefox/1.5.0.9
snappreviewbot

Rule Path
Disallow /

mozilla/5.0 (compatible; emailmarketingrobot/2.1; +http://www.emailmarketingrobot.com/webmasters/)
emailmarketingrobot/2.1

Rule Path
Disallow /

keeprightbot/0.2 (keepright openstreetmap checker; http://keepright.ipax.at)
keeprightbot/0.2

Rule Path
Disallow /

magpie-crawler/1.1 (u; linux amd64; en-gb; +http://www.brandwatch.net)
magpie-crawler/1.1
magpie-crawler

Rule Path
Disallow /

mozilla/4.0 compatible zyborg/1.0 (wn-14.zyborg.net; http://www.wisenutbot.com)
zyborg/1.0

Rule Path
Disallow /

mozilla/5.0 (compatible; tweetmemebot/2.11; +http://tweetmeme.com/)
tweetmemebot
tweetmemebot/2.11

Rule Path
Disallow /

pagepeeker

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

mozilla/5.0 (compatible; megaindex.ru/2.0; +http://megaindex.com/crawler)
megaindex.ru
megaindex.ru/2.0

Rule Path
Disallow /

mozilla/5.0 (compatible) semanticscholarbot (+https://www.semanticscholar.org/crawler)
semanticscholarbot

Rule Path
Disallow /

bublupbot (+https://www.bublup.com/bublup-bot.html)
bublupbot

Rule Path
Disallow /

mozilla/5.0 (compatible; alphabot/3.2; +http://alphaseobot.com/bot.html)
alphaseobot
alphabot
alphabot/3.2

Rule Path
Disallow /

mozilla/5.0 (compatible; femtosearchbot/1.0; http://femtosearch.com)
femtosearchbot
femtosearchbot/1.0

Rule Path
Disallow /

mozilla/5.0 (compatible; nimbostratus-bot/v1.3.2; http://cloudsystemnetworks.com)
nimbostratus-bot
nimbostratus-bot/v1.3.2

Rule Path
Disallow /

mozilla/5.0 (compatible; semrushbot/2~bl; +http://www.semrush.com/bot.html)
semrushbot

Rule Path
Disallow /

mozilla/5.0 (compatible; yandexbot/3.0; +http://yandex.com/bots)
yandexbot
yandexbot/3.0
yandexmobilebot
yandexdirect
yandexdirectdyn
yandexmedia
yandeximages
yadirectfetcher
yandexblogs
yandexnews
yandexpagechecker
yandexmetrika
yandexmarket
yandexcalendar

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

mozilla/5.0 (compatible; archive.org_bot +http://archive.org/details/archive.org_bot)
archive.org_bot

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

mozilla/5.0 (compatible; neevabot/1.0; +https://neeva.com/neevabot)
neevabot
neevabot/1.0

Rule Path
Disallow /

accompanybot

Rule Path
Disallow /

newspaper
newspaper/0.2.8

Rule Path
Disallow /

*

Rule Path
Disallow /
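
The groups above can be read as follows: a crawler named in a group follows that group's rules, and any crawler not named anywhere falls back to the `*` group. The sketch below illustrates this with Python's standard `urllib.robotparser`, using a much reduced file reconstructed from this report. The exact syntax of the real file is not shown here, so the text in `SAMPLE_ROBOTS`, the test paths, and the agent name `someotherbot` are assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser

# Reduced reconstruction of three groups from the report (assumed syntax).
SAMPLE_ROBOTS = """\
User-agent: googlebot
User-agent: msnbot
Disallow: /images/
Disallow: /cgi-bin/

User-agent: ahrefsbot
Disallow: /

User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(SAMPLE_ROBOTS.splitlines())

BASE = "http://dmm.org.uk"

# Named in the first group: only the listed path prefixes are blocked.
print(parser.can_fetch("googlebot", BASE + "/images/photo.jpg"))
print(parser.can_fetch("googlebot", BASE + "/some-page.htm"))

# Named in its own group with "Disallow /": everything is blocked.
print(parser.can_fetch("ahrefsbot", BASE + "/some-page.htm"))

# Not named anywhere: the "*" group applies, which also blocks everything.
print(parser.can_fetch("someotherbot", BASE + "/some-page.htm"))
```

Note that `urllib.robotparser` matches agent names by case-insensitive substring and rule paths by prefix; other crawlers may resolve overlapping or duplicated groups differently.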

Comments

  • robots.txt for www.dmm.org.uk
  • www.dmm-pitwork.org.uk
  • www.dmm-forum.org.uk
  • Block crawler: Mozilla/4.0 (compatible; Vagabondo/4.0; webcrawler at wise-guys dot nl; http://webagent.wise-guys.nl/; http://www.wise-guys.nl/)
  • Block crawler: Mozilla/4.0 (compatible; Vagabondo/4.0; webcrawler at wise-guys dot nl; http://webagent.wise-guys.nl/; http://www.wise-guys.nl/)
  • Block crawler: Mozilla/5.0 (compatible; WebDataCentreBot/1.0; +http://WebDataCentre.com/)
  • Block crawler: Nokia6680/1.0 (4.04.07) SymbianOS/8.0 Series60/2.6 Profile/MIDP-2.0 Configuration/CLDC-1.1 (botmobi find.mobi/bot.html find@mtld.mobi)
  • Block crawler: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; MSIECrawler)
  • Block crawler: CatchBot/1.0; +http://www.catchbot.com
  • Block crawler: Mozilla/5.0 (compatible; FirstSearchBot/1.0; +http://www.first-search.com/dmm-pitwork.org.uk.htm)
  • Block crawler: Gaisbot/3.0+(robot06@gais.cs.ccu.edu.tw;+http://gais.cs.ccu.edu.tw/robot.php)
  • Block crawler: KeywenBot/4.1 http://www.keywen.com/Encyclopedia/Links
  • Block crawler: Sosospider+(+http://help.soso.com/webspider.htm)
  • Block crawler: compatible; Mozilla 4.0; MSIE 5.5; (SqwidgeBot b1 - http://www.sqwidge.com/bot/)
  • Block crawler: TurnitinBot/2.1 (http://www.turnitin.com/robot/crawlerinfo.html)
  • 06/08/09
  • Block crawler: Mozilla/5.0 (compatible; FirstSearchBot/1.0; +http://www.first-search.com/dmm-pitwork.org.uk.htm)
  • Block crawler: Mozilla/4.0 (compatible; BOTW Spider; +http://botw.org)
  • Block crawler: DataPatrol/Nutch-1.0 (DataPatrol indexer from Garlik; http://www.garlik.com/products.php; crawler at garlik dot com)
  • GarlikCrawler/1.1 (http://garlik.com/, crawler@garlik.com)
  • Block crawler: GingerCrawler/1.0 (Language Assistant for Dyslexics; www.gingersoftware.com/crawler_agent.htm; support at ginger software dot com)
  • Block crawler: smart.apnoti.com Robot/v1.34 (http://smart.apnoti.com/bot)
  • 13/08/09
  • Block crawler: WebAlta Crawler/1.3.23 (http://www.webalta.net/ru/about_webmaster.html) (Windows; U; Windows NT 5.1; ru-RU)
  • Block crawler: Mozilla/5.0 (compatible; VDBot/1.0; +http://www.bvdep.com/)
  • Block crawler: Servage Robot ( http://www.servage.net/page/robot/ )
  • Block crawler: Mozilla/5.0 (compatible; Purebot/1.1; +http://www.puritysearch.net/)
  • Block crawler: Mozilla/5.0 (compatible; spbot/1.0; +http://www.seoprofiler.com/bot/ )
  • Block crawler: Linguee Bot (http://www.linguee.com/bot; bot@linguee.com)
  • Block crawler: Mozilla/5.0 (compatible; Purebot/1.1; +http://www.puritysearch.net/)
  • Block crawler: Mozilla/5.0 (compatible; askpeter_bot/5.1; +http://www.askpeter.info)
  • Mozilla/5.0 (compatible; AhrefsBot/4.0; +http://ahrefs.com/robot/)