Robots Exclusion Standard data for thevirtual.co.nz

Resource Scan

Scan Details

Site Domain thevirtual.co.nz
Base Domain thevirtual.co.nz
Scan Status Ok
Last Scan 2024-09-15T19:35:04+00:00
Next Scan 2024-10-15T19:35:04+00:00

Last Scan

Scanned 2024-09-15T19:35:04+00:00
URL https://www.thevirtual.co.nz/robots.txt
Domain IPs 202.6.117.70
Response IP 202.6.117.70
Found Yes
Hash a9a895f33fca75b5884fca0860fc277c2db102e333adceae705c5904cb01573d
SimHash a860da716f63

Groups

ahrefsbot

Rule Path
Disallow /

becomebot

Rule Path
Disallow /

blackboard safeassign/0.1

Rule Path
Disallow /

charlotte

Rule Path
Disallow /

converacrawler

Rule Path
Disallow /

converamultimediacrawler

Rule Path
Disallow /

crawl

Rule Path
Disallow /

domaincrawler/3.0

Rule Path
Disallow /

fast

Rule Path
Disallow /

fast enterprise crawler

Rule Path
Disallow /

fast enterprise crawler 6

Rule Path
Disallow /

fast metaweb crawler

Rule Path
Disallow /

favorstarbot/1.0

Rule Path
Disallow /

frontpage

Rule Path
Disallow /

funnelback

Rule Path
Disallow /

gaisbot

Rule Path
Disallow /

geona

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

httrack

Rule Path
Disallow /

hurisearchbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow

ichiro/2.0

Rule Path
Disallow /

irlbot

Rule Path
Disallow /

java

Rule Path
Disallow /

java/1.4.2_05

Rule Path
Disallow /

java/1.5.0_04

Rule Path
Disallow /

java/1.5.0_12

Rule Path
Disallow /

java/1.6.0-ea

Rule Path
Disallow /

linkwalker

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

megaindex.ru/2.0

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

nutch

Rule Path
Disallow /

omniexplorer_bot/1.09

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

picmole

Rule Path
Disallow /

polybot

Rule Path
Disallow /

pompos

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

psbot

Rule Path
Disallow /

scirus

Rule Path
Disallow /

qihoobot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

shim-crawler

Rule Path
Disallow /

snapbot/1.0

Rule Path
Disallow /

sogou

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teoma

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

twiceler0.9

Rule Path
Disallow /

twiceler-0.9

Rule Path
Disallow /

w3crobot

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webupd

Rule Path
Disallow /

www.webwombat.com.au

Rule Path
Disallow /

webwombat

Rule Path
Disallow /

webwombat

Rule Path
Disallow /

webzip

Rule Path
Disallow /

wisenutbot

Rule Path
Disallow /

yodaobot/1.0

Rule Path
Disallow /

yahoo-mmcrawler

Rule Path
Disallow /

zibber-v0.1(www.zibb.com/crawler/)

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

googlebot

Rule Path
Allow /*?version=None
Allow /*?domain=widgets&language=en_us
Disallow /*?
Disallow /*atct_album_view$
Disallow /*folder_factories$
Disallow /*folder_summary_view$
Disallow /*login_form$
Disallow /*mail_password_form$
Disallow /search$
Disallow /%40%40search$
Disallow /*/%40%40search$
Disallow /database
Disallow /applicants
Disallow /search_rss
Disallow /*search_rss$
Disallow /*sendto_form$
Disallow /*summary_view$
Disallow /*thumbnail_view$
Disallow /view$
Disallow /*/%40%40view$

Other Records

Field Value
crawl-delay 2

*

Rule Path
Allow /*?version=None
Allow /*?domain=widgets&language=en_us
Disallow /*?
Disallow /*atct_album_view$
Disallow /*folder_factories$
Disallow /*folder_summary_view$
Disallow /*login_form$
Disallow /*mail_password_form$
Disallow /search
Disallow /%40%40search
Disallow /*/%40%40search$
Disallow /search_rss
Disallow /database
Disallow /applicants
Disallow /*search_rss$
Disallow /*sendto_form$
Disallow /*summary_view$
Disallow /*thumbnail_view$
Disallow /view$
Disallow /*/%40%40view$

Other Records

Field Value
crawl-delay 2

Comments

  • User-agent: Baiduspider
  • Disallow: /
  • Charlotte/1.0b; 20060525 209.249.86.4
  • 20060525 210.173.180.16
  • 20070527 210.150.10.109
  • 20050525 209.167.50.22
  • Doesnt seem to be a pain anymore
  • User-Agent: MSIECrawler
  • Disallow:/
  • Too greedy at present, downloads same links upto 80 times
  • User-Agent: msnbot-media
  • Disallow:/
  • User-agent: Slurp
  • Disallow: /
  • Add Googlebot-specific syntax extension to exclude forms
  • that are repeated for each piece of content in the site
  • the wildcard is only supported by Googlebot
  • http://www.google.com/support/webmasters/bin/answer.py?answer=40367&ctx=sibling
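
The wildcard rules that these comments describe (`*` for any run of characters, a trailing `$` to anchor the end of the path) can be illustrated with a minimal Python sketch. It is not the scanner's or any crawler's actual implementation. The helper names `pattern_to_regex` and `is_allowed` are hypothetical, the rule list is a subset of the googlebot group above, and the precedence used (longest matching pattern wins, Allow wins a tie) is assumed from the longest-match convention in RFC 9309. Percent-encoded paths such as /%40%40search are not normalized here.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """Return True if `path` may be fetched under `rules`.

    Assumed precedence: the longest matching pattern wins, and Allow
    beats Disallow when two matching patterns have the same length.
    """
    best = None  # (pattern length, is_allow)
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty path matches nothing
        if pattern_to_regex(pattern).match(path):
            key = (len(pattern), directive == "allow")
            if best is None or key > best:
                best = key
    return True if best is None else best[1]

# Subset of the googlebot group listed in this report.
RULES = [
    ("allow", "/*?version=None"),
    ("allow", "/*?domain=widgets&language=en_us"),
    ("disallow", "/*?"),
    ("disallow", "/*login_form$"),
    ("disallow", "/search$"),
    ("disallow", "/database"),
]

for candidate in ["/news?version=None", "/news?page=2", "/about/login_form"]:
    print(candidate, is_allowed(RULES, candidate))
```

Under these assumptions, a query-string URL is blocked by `/*?` unless one of the longer Allow patterns also matches, and `/search$` blocks only the exact path /search, not paths beneath it.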

Warnings

  • 2 invalid lines.