tomw.net.au
robots.txt

Robots Exclusion Standard data for tomw.net.au

Resource Scan

Scan Details

Site Domain tomw.net.au
Base Domain tomw.net.au
Scan Status Ok
Last Scan 2024-10-01T15:35:09+00:00
Next Scan 2024-10-08T15:35:09+00:00

Last Scan

Scanned 2024-10-01T15:35:09+00:00
URL https://tomw.net.au/robots.txt
Domain IPs 203.132.1.1
Response IP 203.132.1.1
Found Yes
Hash 97a03293d3cfb7af6532dc818838d044f34c96306e52e0699a143db540c69fe6
SimHash a21b7f9b8be2
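
The report does not say which algorithm produces the Hash value. Its 64 hexadecimal digits are consistent with a SHA-256 digest, so the sketch below assumes SHA-256 over the raw response body. The function names are hypothetical and are not part of the scanning service. It shows how a stored digest could be used to detect that a robots.txt file has changed between scans.

```python
import hashlib


def robots_fingerprint(body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes (SHA-256 is an assumption)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hex: str) -> bool:
    """Compare a freshly fetched body against the digest stored from an earlier scan."""
    return robots_fingerprint(body) != previous_hex.lower()
```

An exact digest like this flips on any one-byte edit, which is presumably why the report also carries a separate SimHash for near-duplicate comparison.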

Groups

ia_archiver

Rule Path
Disallow /research/

Other Records

Field Value
crawl-delay 0

sensis.au web crawler

Rule Path
Disallow /

sensis.au

Rule Path
Disallow /

sensis

Rule Path
Disallow /

shopwiki

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

wget

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

msnbot-media

Rule Path
Disallow /

yahoo-mmcrawler

Rule Path
Disallow /

psbot

Rule Path
Disallow /

asterias

Rule Path
Disallow /

msnbot-products

Rule Path
Disallow /

msnbot

Rule Path
Disallow /research/
Disallow /moodle/
Disallow /cgi-bin/
Disallow /images/
Disallow /blog/archive/
Disallow /anucecs/archive/
Disallow /screenact/archive/
Disallow /*.ppt$
Disallow /*.rtf$
Disallow /*.rm$
Disallow /*.ra$
Disallow /*.ram$
Disallow /*.dxf$
Disallow /*.au$
Disallow /*.jpg$
Disallow /*.jpeg$
Disallow /*.gif$
Disallow /*.png$

Other Records

Field Value
crawl-delay 300
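
The msnbot group mixes plain path prefixes such as /research/ with patterns such as /*.ppt$. The sketch below follows the common interpretation of these patterns: a rule matches from the start of the path, `*` stands for any run of characters, and a trailing `$` anchors the rule to the end of the path. The helper names and the shortened rule list are illustrative, and individual crawlers may interpret patterns differently.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape every literal piece, then let each '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    if anchored:
        body += r"\Z"  # '$' in the rule means "the path must end here"
    return re.compile(body)


def is_disallowed(path: str, patterns: list[str]) -> bool:
    """True if any Disallow pattern matches the path from its first character."""
    return any(pattern_to_regex(p).match(path) is not None for p in patterns)


# A few of the msnbot rules listed above (subset chosen for illustration).
RULES = ["/research/", "/images/", "/*.ppt$", "/*.png$"]

for candidate in ["/2000/bat.ppt", "/2000/bat.pptx", "/research/a.html", "/index.html"]:
    print(candidate, is_disallowed(candidate, RULES))
```

Under this reading, /*.ppt$ blocks any path ending in .ppt but not one ending in .pptx, because the `$` requires the path to stop right after the extension.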

yahoo! slurp

Rule Path
Disallow /research/
Disallow /images/
Disallow /cgi-bin/
Disallow /blog/archive/
Disallow /anucecs/archive/
Disallow /screenact/archive/
Disallow /moodle/
Disallow /*.ppt$
Disallow /*.rtf$
Disallow /*.rm$
Disallow /*.ra$
Disallow /*.ram$
Disallow /*.dxf$
Disallow /*.au$
Disallow /*.jpg$
Disallow /*.jpeg$
Disallow /*.gif$
Disallow /*.png$

Other Records

Field Value
crawl-delay 600

googlebot-image

Rule Path
Disallow /research/

Other Records

Field Value
crawl-delay 0

googlebot

Rule Path
Disallow /research/
Disallow /cgi-bin/
Disallow /blog/archive/
Disallow /anucecs/archive/
Disallow /screenact/archive/
Disallow /moodle/
Disallow /*.ppt$
Disallow /*.rtf$
Disallow /*.rm$
Disallow /*.ra$
Disallow /*.ram$
Disallow /*.dxf$
Disallow /*.au$
Disallow /*.pdf$

Other Records

Field Value
crawl-delay 0

adsbot-google

Rule Path
Disallow /research/
Disallow /moodle/

Other Records

Field Value
crawl-delay 0

*

Rule Path
Disallow /research/
Disallow /cgi-bin/
Disallow /images/
Disallow /blog/archive/
Disallow /anucecs/archive/
Disallow /screenact/archive/
Disallow /moodle/
Disallow /2000/bat.ppt
Disallow /2000/icarba.ppt
Disallow /2000/iim2000.ppt
Disallow /2000/iim2000.rtf
Disallow /2000/ipub.ppt
Disallow /2000/isup.ppt
Disallow /2000/pt.ppt
Disallow /2000/pt/pt.rm
Disallow /2000/scsp.ppt
Disallow /2000/yxml.ppt
Disallow /2000/yxml2.ppt
Disallow /2001/bat2001.ppt
Disallow /2001/eal/keynote.ppt
Disallow /2001/eal/notb2b.ppt
Disallow /2001/esb/invoice.rtf
Disallow /2001/itv.ppt
Disallow /2001/wf.ppt
Disallow /2001/wwgw/wireless.ppt
Disallow /2002/acra.ppt
Disallow /2002/atoxml.ppt
Disallow /2002/ebcwxml.ppt
Disallow /2002/edm.ppt
Disallow /2002/fit.ppt
Disallow /2002/maa.ppt
Disallow /2002/mka.ppt
Disallow /2002/nbt.ppt
Disallow /2003/bws.ppt
Disallow /2003/xmlstd.ppt
Disallow /2003/xmltech.ppt
Disallow /as/apan.ppt
Disallow /gsr/gsrit.ppt
Disallow /intrnt.rtf
Disallow /irc/irc1intr.au
Disallow /irc/irc1rm.au
Disallow /irc/irc5.au
Disallow /nt/nt.pdf
Disallow /nt/ntad.rtf
Disallow /nt/ntflyr.rtf
Disallow /nt/ntposter.doc
Disallow /nt/nts.ppt
Disallow /papers/ccit99.rtf
Disallow /papers/outint.rtf
Disallow /tomwlcme.au
Disallow /travel/ausalps1.au
Disallow /travel/balloon1.au
Disallow /uso/uso.pdf
Disallow /uso/usocvr.rtf
Disallow /uso/usoflyr.rtf
Disallow /uso/w3d.ppt
Disallow /uso/w3d.rtf

Other Records

Field Value
crawl-delay 10
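
As a rough illustration of how a client might consult rules like these, the sketch below feeds a small fragment to Python's standard `urllib.robotparser`. The fragment is reconstructed from two of the groups in this report and is not the actual file. The agent name `ExampleBot` is hypothetical. The standard-library parser matches rules as literal path prefixes, so only rules without `*` or `$` are used here.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment (an assumption, not the real file contents).
FRAGMENT = """\
User-agent: wget
Disallow: /

User-agent: *
Crawl-delay: 10
Disallow: /research/
Disallow: /2000/bat.ppt
"""

parser = RobotFileParser()
parser.parse(FRAGMENT.splitlines())

BASE = "https://tomw.net.au"

# wget has its own group that disallows the whole site.
print(parser.can_fetch("wget", BASE + "/"))
# Any agent without its own group falls back to the '*' group.
print(parser.can_fetch("ExampleBot", BASE + "/research/notes.html"))
print(parser.can_fetch("ExampleBot", BASE + "/index.html"))
print(parser.crawl_delay("ExampleBot"))
```

A crawler that honours the fallback group would skip /research/ and wait the reported number of seconds between requests.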

Comments

  • robots.txt for tomw.net.au
  • Let the Internet Archive have almost everything.
  • Stop some bots completely.
  • Ban some image crawlers.
  • Ban Microsoft product crawler and slow down text crawler.
  • Slow down Yahoo and ban non-HTML files and images.
  • Allow Google to crawl images but not non-HTML files.
  • Other crawlers ban non-HTML files and images. Slow down a little.
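
To put the "slow down" comments in concrete terms, the sketch below estimates the minimum time a compliant crawler would need for a given number of pages, using crawl-delay values listed in this report. It assumes the delay is a number of seconds to wait between consecutive requests, which is the usual reading of this non-standard field. The function name and page count are illustrative.

```python
# Crawl-delay values taken from the groups in this report (agent names lowercased).
CRAWL_DELAY_SECONDS = {
    "msnbot": 300,
    "yahoo! slurp": 600,
    "googlebot": 0,
    "*": 10,
}


def min_crawl_hours(agent: str, pages: int) -> float:
    """Lower bound on crawl time: one delay between each pair of consecutive requests."""
    delay = CRAWL_DELAY_SECONDS.get(agent.lower(), CRAWL_DELAY_SECONDS["*"])
    waits = max(pages - 1, 0)
    return delay * waits / 3600


for agent in ["msnbot", "yahoo! slurp", "googlebot", "SomeOtherBot"]:
    print(agent, round(min_crawl_hours(agent, 1000), 2), "hours")
```

The estimate ignores download time, so real crawls would take longer.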