horizonpoll.co.nz
robots.txt

Robots Exclusion Standard data for horizonpoll.co.nz

Resource Scan

Scan Details

Site Domain horizonpoll.co.nz
Base Domain horizonpoll.co.nz
Scan Status Ok
Last Scan 2024-09-04T16:48:55+00:00
Next Scan 2024-10-04T16:48:55+00:00

Last Scan

Scanned 2024-09-04T16:48:55+00:00
URL https://horizonpoll.co.nz/robots.txt
Domain IPs 120.138.16.117
Response IP 120.138.16.117
Found Yes
Hash 10e0191e8052edbd30d1d3c90614ffd9f605a7b170c8d663515248a28d5ab3a1
SimHash a610cd02cbc3

Groups

teoma

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

scrubby

Rule Path
Disallow /

nutch

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

yeti

Rule Path
Disallow /

psbot

Rule Path
Disallow /

asterias

Rule Path
Disallow /

yahoo-blogs

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

*

Rule Path
Disallow /Enquiry/
Disallow /Admin
Disallow /Areas
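
The groups listed above can be checked locally with Python's standard `urllib.robotparser`. The following is a minimal sketch, not the site's actual file: the robots.txt text is rebuilt from the scanned groups, so its layout is an assumption, and the helper names `build_robots_txt` and `is_allowed` as well as the agent name `ExampleBot` are hypothetical. The crawl-delay and sitemap records are left out because the scan does not show which group, if any, they belong to.

```python
from urllib.robotparser import RobotFileParser

# Agents the scan lists with a blanket "Disallow /" rule.
BLOCKED_AGENTS = [
    "teoma", "twiceler", "gigabot", "scrubby", "nutch", "baiduspider",
    "naverbot", "yeti", "psbot", "asterias", "yahoo-blogs", "yandexbot",
    "sosospider",
]
# Paths the scan lists under the "*" group.
DEFAULT_DISALLOWS = ["/Enquiry/", "/Admin", "/Areas"]


def build_robots_txt():
    """Rebuild robots.txt text from the scanned groups (layout is assumed)."""
    lines = []
    for agent in BLOCKED_AGENTS:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    lines.append("User-agent: *")
    lines += [f"Disallow: {path}" for path in DEFAULT_DISALLOWS]
    return "\n".join(lines)


def is_allowed(agent, path):
    """Return True if `agent` may fetch `path` under the rebuilt rules."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser.can_fetch(agent, "https://horizonpoll.co.nz" + path)


if __name__ == "__main__":
    # "ExampleBot" is a made-up agent that falls through to the "*" group.
    for agent, path in [("teoma", "/"), ("ExampleBot", "/"),
                        ("ExampleBot", "/Admin/login")]:
        print(agent, path, is_allowed(agent, path))
```

Note that `urllib.robotparser` matches rules by simple path prefix, so `/Admin` also covers `/Admin/login`, while `/Enquiry/` does not cover `/Enquiry` without the trailing slash. Other crawlers may interpret the same rules slightly differently.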

Other Records

Field Value
crawl-delay 2
sitemap /sitemap

Comments

  • robots.txt
  • beweb robotos
  • sometimes you want to disable msn because it hammers our servers a lot, but some people actually use it so maybe not
  • these are generally useless bots