sin01.search.spotxchange.com
robots.txt

Robots Exclusion Standard data for sin01.search.spotxchange.com

Resource Scan

Scan Details

Site Domain sin01.search.spotxchange.com
Base Domain spotxchange.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-02-21T00:13:41+00:00
Next Scan 2024-05-21T00:13:41+00:00

Last Successful Scan

Scanned 2023-07-03T19:51:41+00:00
URL http://sin01.search.spotxchange.com/robots.txt
Domain IPs 103.71.26.123, 103.71.26.124
Response IP 103.71.26.124
Found Yes
Hash 03c9452313fdf84526d447e259a7eb8c61bb068b3b1caa7b076e8f7ce28ac91b
SimHash e74377546d7e

Groups

abilon

Rule Path
Disallow /

abot

Rule Path
Disallow /

accoona-ai-agent

Rule Path
Disallow /

agentname

Rule Path
Disallow /

aipbot

Rule Path
Disallow /

aladdino

Rule Path
Disallow /

apachebench

Rule Path
Disallow /

aport

Rule Path
Disallow /

appie

Rule Path
Disallow /

applesyndication

Rule Path
Disallow /

arachnia

Rule Path
Disallow /

aranha

Rule Path
Disallow /

art-online.com

Rule Path
Disallow /

ask jeeves

Rule Path
Disallow /

ask+jeeves

Rule Path
Disallow /

asterias

Rule Path
Disallow /

atomz

Rule Path
Disallow /

avantgo

Rule Path
Disallow /

avsearch

Rule Path
Disallow /

b2w

Rule Path
Disallow /

backweb

Rule Path
Disallow /

baidu

Rule Path
Disallow /

becomebot

Rule Path
Disallow /

bigbrother

Rule Path
Disallow /

bimbo

Rule Path
Disallow /

blitzbot

Rule Path
Disallow /

bloglines

Rule Path
Disallow /

bordermanager

Rule Path
Disallow /

bumblebee

Rule Path
Disallow /

ce-preload

Rule Path
Disallow /

change detection

Rule Path
Disallow /

change+detection

Rule Path
Disallow /

changedetection

Rule Path
Disallow /

charlotte

Rule Path
Disallow /

check_http

Rule Path
Disallow /

checkurl

Rule Path
Disallow /

chkd

Rule Path
Disallow /

coast

Rule Path
Disallow /

combine

Rule Path
Disallow /

cometsearch

Rule Path
Disallow /

contype

Rule Path
Disallow /

convera

Rule Path
Disallow /

copernicenterprisesearch

Rule Path
Disallow /

copyrightcheck

Rule Path
Disallow /

cosmos

Rule Path
Disallow /

crawler

Rule Path
Disallow /

crescent

Rule Path
Disallow /

crucial inforation miner

Rule Path
Disallow /

crucial+inforation+miner

Rule Path
Disallow /

curl

Rule Path
Disallow /

dialer

Rule Path
Disallow /

diphonet

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

download+ninja

Rule Path
Disallow /

dtaagent

Rule Path
Disallow /

dts agent

Rule Path
Disallow /

dts+agent

Rule Path
Disallow /

earthcom.info

Rule Path
Disallow /

echo

Rule Path
Disallow /

emailsiphon

Rule Path
Disallow /

enews creator

Rule Path
Disallow /

enews+creator

Rule Path
Disallow /

enfish tracker

Rule Path
Disallow /

enfish+tracker

Rule Path
Disallow /

fast

Rule Path
Disallow /

favorg

Rule Path
Disallow /

feedonfeeds

Rule Path
Disallow /

fetch

Rule Path
Disallow /

filehound

Rule Path
Disallow /

firehunter

Rule Path
Disallow /

flashget

Rule Path
Disallow /

freefind

Rule Path
Disallow /

frontier

Rule Path
Disallow /

geniebot

Rule Path
Disallow /

getright

Rule Path
Disallow /

go!zilla

Rule Path
Disallow /

golem

Rule Path
Disallow /

gomezagent

Rule Path
Disallow /

googlebot

Rule Path
Disallow /

grabber

Rule Path
Disallow /

grub

Rule Path
Disallow /

gulliver

Rule Path
Disallow /

hapax

Rule Path
Disallow /

harvest

Rule Path
Disallow /

hit list

Rule Path
Disallow /

hit+list

Rule Path
Disallow /

hitlist

Rule Path
Disallow /

htdig

Rule Path
Disallow /

httrack

Rule Path
Disallow /

ia_archive

Rule Path
Disallow /

ibot

Rule Path
Disallow /

ichiro

Rule Path
Disallow /

ideare

Rule Path
Disallow /

ieautodiscovery

Rule Path
Disallow /

iltrovatore-setaccio

Rule Path
Disallow /

indy library

Rule Path
Disallow /

indy+library

Rule Path
Disallow /

infolink

Rule Path
Disallow /

infoseek

Rule Path
Disallow /

inktomi search

Rule Path
Disallow /

inktomi+search

Rule Path
Disallow /

internet ninja

Rule Path
Disallow /

internet+ninja

Rule Path
Disallow /

internetseer

Rule Path
Disallow /

inverse ip insight

Rule Path
Disallow /

inverse+ip+insight

Rule Path
Disallow /

ipsentry

Rule Path
Disallow /

irlbot

Rule Path
Disallow /

isilo

Rule Path
Disallow /

jakarta

Rule Path
Disallow /

janrain-lobster

Rule Path
Disallow /

jetbot

Rule Path
Disallow /

jobo

Rule Path
Disallow /

justview

Rule Path
Disallow /

keepalive

Rule Path
Disallow /

keynote

Rule Path
Disallow /

kilroy

Rule Path
Disallow /

kinja

Rule Path
Disallow /

kummhttp

Rule Path
Disallow /

lachesis

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww-perl

Rule Path
Disallow /

linkbot

Rule Path
Disallow /

linkchecker

Rule Path
Disallow /

linklint

Rule Path
Disallow /

linkscan

Rule Path
Disallow /

linksweeper

Rule Path
Disallow /

linkwalker

Rule Path
Disallow /

lisa

Rule Path
Disallow /

locust

Rule Path
Disallow /

lotusdiscovery

Rule Path
Disallow /

lwp

Rule Path
Disallow /

lydia

Rule Path
Disallow /

mac finder

Rule Path
Disallow /

mac+finder

Rule Path
Disallow /

macreport

Rule Path
Disallow /

magenta

Rule Path
Disallow /

magus bot

Rule Path
Disallow /

magus+bot

Rule Path
Disallow /

markwatch

Rule Path
Disallow /

mazingo

Rule Path
Disallow /

mazzilla

Rule Path
Disallow /

mediapartners-google

Rule Path
Disallow /

mercator

Rule Path
Disallow /

mfc_tear_sample

Rule Path
Disallow /

microsoft internet explorer/4.40.426 (windows 95)

Rule Path
Disallow /

microsoft scheduled cache content download service

Rule Path
Disallow /

microsoft url control

Rule Path
Disallow /

microsoft+internet+explorer/4.40.426+(windows+95)

Rule Path
Disallow /

microsoft+scheduled+cache+content+download+service

Rule Path
Disallow /

microsoft+url+control

Rule Path
Disallow /

minuteman

Rule Path
Disallow /

mirago

Rule Path
Disallow /

missigua

Rule Path
Disallow /

miva

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mobipocket webcompanion

Rule Path
Disallow /

mobipocket+webcompanion

Rule Path
Disallow /

moget

Rule Path
Disallow /

monitor

Rule Path
Disallow /

monkeycrawl

Rule Path
Disallow /

monster

Rule Path
Disallow /

mothra/126-paladium

Rule Path
Disallow /

motor

Rule Path
Disallow /

mozilla 2.0 (compatible; msie 3.02; update a; windows nt)

Rule Path
Disallow /

mozilla/5.0 (compatible; msie 5.0)

Rule Path
Disallow /

mozilla/5.0+(compatible;+msie+5.0)

Rule Path
Disallow /

mozilla+2.0+(compatible;+msie+3.02;+update+a;+windows+nt)

Rule Path
Disallow /

ms frontpage

Rule Path
Disallow /

ms search

Rule Path
Disallow /

ms+frontpage

Rule Path
Disallow /

ms+search

Rule Path
Disallow /

msnptc

Rule Path
Disallow /

nalanda

Rule Path
Disallow /

nbot

Rule Path
Disallow /

nessus

Rule Path
Disallow /

netmechanic

Rule Path
Disallow /

netnewswire

Rule Path
Disallow /

new/0.1libwww

Rule Path
Disallow /

newave-lisa

Rule Path
Disallow /

news search

Rule Path
Disallow /

news+search

Rule Path
Disallow /

newsapp

Rule Path
Disallow /

newsbot

Rule Path
Disallow /

newsfire

Rule Path
Disallow /

newsgator

Rule Path
Disallow /

newslookup

Rule Path
Disallow /

newsmachine

Rule Path
Disallow /

newsnow

Rule Path
Disallow /

newssearch

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

ng/2.0

Rule Path
Disallow /

nomad

Rule Path
Disallow /

npbot

Rule Path
Disallow /

nutch

Rule Path
Disallow /

nutscrape

Rule Path
Disallow /

obot

Rule Path
Disallow /

ocelli

Rule Path
Disallow /

omniexplorer

Rule Path
Disallow /

openfind

Rule Path
Disallow /

oracle ultra search

Rule Path
Disallow /

oracle+ultra+search

Rule Path
Disallow /

patric

Rule Path
Disallow /

perman surfer

Rule Path
Disallow /

perman+surfer

Rule Path
Disallow /

pioneer

Rule Path
Disallow /

pita

Rule Path
Disallow /

pluck

Rule Path
Disallow /

plumtree

Rule Path
Disallow /

polybot

Rule Path
Disallow /

pompos

Rule Path
Disallow /

port huron labs

Rule Path
Disallow /

port+huron+labs

Rule Path
Disallow /

powermarks

Rule Path
Disallow /

proxysg

Rule Path
Disallow /

psbot

Rule Path
Disallow /

pulpfiction

Rule Path
Disallow /

quepasacreep

Rule Path
Disallow /

rational sitecheck

Rule Path
Disallow /

rational+sitecheck

Rule Path
Disallow /

realnamesbot

Rule Path
Disallow /

robot

Rule Path
Disallow /

rpt-http

Rule Path
Disallow /

rss client

Rule Path
Disallow /

rss+client

Rule Path
Disallow /

rssmaker-ng

Rule Path
Disallow /

rssreader

Rule Path
Disallow /

rufusbot

Rule Path
Disallow /

sawaalrobo

Rule Path
Disallow /

schmozilla

Rule Path
Disallow /

scirus

Rule Path
Disallow /

scooter

Rule Path
Disallow /

scoutabout

Rule Path
Disallow /

search.ch

Rule Path
Disallow /

seekbot

Rule Path
Disallow /

seeker.lookseek.com

Rule Path
Disallow /

servers alive

Rule Path
Disallow /

servers+alive

Rule Path
Disallow /

sherlock

Rule Path
Disallow /

shopwiki

Rule Path
Disallow /

sitescooper

Rule Path
Disallow /

slurp

Rule Path
Disallow /

slysearch

Rule Path
Disallow /

snooper

Rule Path
Disallow /

sohu

Rule Path
Disallow /

spider

Rule Path
Disallow /

spike

Rule Path
Disallow /

spinne

Rule Path
Disallow /

spyder

Rule Path
Disallow /

squid cache

Rule Path
Disallow /

squid+cache

Rule Path
Disallow /

stackrambler

Rule Path
Disallow /

stuff

Rule Path
Disallow /

sucker

Rule Path
Disallow /

sundoh search

Rule Path
Disallow /

sundoh+search

Rule Path
Disallow /

szukacz

Rule Path
Disallow /

taz

Rule Path
Disallow /

teleport

Rule Path
Disallow /

templeton

Rule Path
Disallow /

teoma

Rule Path
Disallow /

thunderstone

Rule Path
Disallow /

t-h-u-n-d-e-r-s-t-o-n-e

Rule Path
Disallow /

topix

Rule Path
Disallow /

ukonline

Rule Path
Disallow /

ultraseek

Rule Path
Disallow /

urchin

Rule Path
Disallow /

urlcheck

Rule Path
Disallow /

vagabondo

Rule Path
Disallow /

versus

Rule Path
Disallow /

voyager

Rule Path
Disallow /

web downloader

Rule Path
Disallow /

web+downloader

Rule Path
Disallow /

webauto

Rule Path
Disallow /

webcapture

Rule Path
Disallow /

webcheck

Rule Path
Disallow /

webclipping.com

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

webcrawl

Rule Path
Disallow /

webdup

Rule Path
Disallow /

webextractor

Rule Path
Disallow /

webinator

Rule Path
Disallow /

website extractor

Rule Path
Disallow /

website+extractor

Rule Path
Disallow /

webtool

Rule Path
Disallow /

webtrends

Rule Path
Disallow /

webvac

Rule Path
Disallow /

webwasher

Rule Path
Disallow /

webzip

Rule Path
Disallow /

wfarc

Rule Path
Disallow /

wget

Rule Path
Disallow /

whatsup

Rule Path
Disallow /

whizbang

Rule Path
Disallow /

worm

Rule Path
Disallow /

xenu

Rule Path
Disallow /

yacy

Rule Path
Disallow /

yandex

Rule Path
Disallow /

ync

Rule Path
Disallow /

yotta

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

zeus

Rule Path
Disallow /

zibber

Rule Path
Disallow /

zipppbot

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

ez publish link validator

Rule Path
Disallow /

ez+publish+link+validator

Rule Path
Disallow /

whistleblower

Rule Path
Disallow /

terrawizbot

Rule Path
Disallow /

goldfire

Rule Path
Disallow /

sitevigil

Rule Path
Disallow /

emailsmartz

Rule Path
Disallow /

iopus-i-m

Rule Path
Disallow /

bits

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

c r a w l e r

Rule Path
Disallow /

c+r+a+w+l+e+r

Rule Path
Disallow /

freedom

Rule Path
Disallow /

yahoofeedseeker

Rule Path
Disallow /

internal zero-knowledge agent

Rule Path
Disallow /

internal+zero-knowledge+agent

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

surveybot/

Rule Path
Disallow /

liferea

Rule Path
Disallow /

netnewswire

Rule Path
Disallow /

tpsystem

Rule Path
Disallow /

yahooseeker

Rule Path
Disallow /

findlinks

Rule Path
Disallow /

psycheclone

Rule Path
Disallow /

oodlebot

Rule Path
Disallow /

mackster

Rule Path
Disallow /

adsbot-google

Rule Path
Disallow /

innovantagebot

Rule Path
Disallow /

nasa search

Rule Path
Disallow /

khte

Rule Path
Disallow /

ktxn

Rule Path
Disallow /

automapit

Rule Path
Disallow /

advanced email extractor

Rule Path
Disallow /

advanced+email+extractor

Rule Path
Disallow /

msrbot

Rule Path
Disallow /

moreoverbot

Rule Path
Disallow /
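
Every group above carries the same single rule, Disallow /, and no wildcard (*) group is listed. The sketch below shows how a standard parser might evaluate such a file. It uses Python's urllib.robotparser; the ROBOTS_TXT text is a hypothetical reconstruction of three of the groups (the live file could not be fetched at the last scan), and the helper name build_parser and the sample agent strings are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of three groups from the report above.
# The real file was not retrievable at the last scan, so this is an assumption.
ROBOTS_TXT = """\
User-agent: googlebot
Disallow: /

User-agent: wget
Disallow: /

User-agent: curl
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "http://sin01.search.spotxchange.com/any/path"

# Listed agents are blocked everywhere; an agent matching no group is allowed,
# because the reconstructed file has no "User-agent: *" fallback group.
for agent in ("Googlebot/2.1", "Wget/1.21", "ExampleBrowser/1.0"):
    print(agent, parser.can_fetch(agent, url))
```

Python's parser compares each group name, case-insensitively, as a substring of the agent's product token (the part before the first slash). Other crawlers may match differently, so treat the result as one reasonable interpretation rather than a guarantee.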

Comments

  • IAB_ABCe_International_Spiders_and_Robots_200612
  • December 20, 2006
  • **********COMMENTS SECTION***************************************************
  • This list has been reviewed by the IAB MTF Spider & Robot Policy Board.
  • This file contains a list of patterns that may be matched against HTTP User
  • Agent (UA) strings to determine whether that UA matches a known spider or
  • robot. This is one step of several required for compliance to IAB Advertising
  • Measurement Guidelines.
  • The list is valid for use when counting Client Side Counting (CSC)
  • transactions. See [http://www.iab.net/standards/pdf/2292%20IAB%20spreads.pdf]
  • for more info.
  • Rule: If any of these patterns are found to match any string within the HTTP
  • User-Agent, case insensitively, it is identified as a non-human interaction
  • and so should be filtered from counts.
  • It is strongly suggested that users analyze their own log data and sort this
  • list in order of frequency to allow their filter program to work as
  • efficiently as possible.
  • This list is provided in good faith but must be used at the user's own risk.
  • The IAB, ABC ELECTRONIC and ImServices accept no responsibility for any
  • legal, technical or commercial consequences arising from the use of this list.
  • Special characters in this file:
  • # - (only at the start of a line) this line is a comment
  • | - field separator
  • , - field separator (Used when multiple exceptions)
  • blank lines may be present. ignore them.
  • Fields - delimited by a pipe symbol [|]:
  • 1) pattern - case insensitive string to match anywhere in the string
  • reserved characters are URL-escaped if present (|=%7C #=%23)
  • 2) active flag
  • 1=pattern is active and should be matched
  • 0=pattern is inactive, and should be ignored
  • 3) [optional] comma-separated list of exception patterns
  • reserved characters are URL-escaped if present (|=%7C #=%23 ,=%2C)
  • 4) An additional flag was added to this list in November 2005 to identify
  • those user-agent strings on this list that would not pass the valid user-
  • agent test and therefore, are redundant if both lists are used.
  • 1=this entry is not needed for those who use a two-pass approach
  • 0=this entry is always needed for both one-pass and two-pass
  • approaches
  • 5) Another flag was added to this list when the IAB and ABCe merged their two
  • lists (01/06) to identify those strings that primarily impact page
  • impression measurement vs. those strings that primarily impact ad
  • impression measurement (or both). The flags are as follows:
  • 0=this entry primarily impacts page impression measurement
  • 1=this entry primarily impacts ad impression measurement
  • 2=this entry impacts both
  • NOTES:
  • The 3rd column supports an 'exception' feature, which lets the file specify
  • broadly matching patterns and then allow special cases. For instance, if a UA
  • advertises itself as a 'robot', it should be ignored for counting purposes
  • unless the string 'robotics' is present, which allows for the counting of US
  • Robotics cobranded browsers. There may be more than one exception for each
  • pattern separated by a comma. Please note that use of this field is optional.
  • The 5th column attempts to associate the robot with page impressions or ad
  • impressions (or both) but should be used only as a guide. Application of this
  • list should be based on an analysis of the activity itself before excluding
  • any entries.
  • UA strings are considered uncountable (per IAB Guidelines) if they contain
  • any of the following patterns (note: patterns are case insensitive, but left
  • in this file in mixed case for human legibility)
  • Contact ImServices Group in the U.S. (spiders.bots@imservicesgroup.com) or
  • ABC Electronic in the UK (spiders.bots@abce.org.uk) with any feedback
  • regarding this file.
  • ******************* END OF COMMENTS ******************************************
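
The field layout and matching rule described in the comments above can be sketched in Python as follows. This is a minimal illustration, not the IAB's reference implementation. The sample lines in SAMPLE_LIST are hypothetical and only follow the documented pipe-delimited layout, the names Pattern, parse_line and is_robot are invented for the example, and the code assumes that comment lines begin with #. Only the first three fields (pattern, active flag, exceptions) are used; the fourth and fifth flags are ignored here.

```python
from dataclasses import dataclass, field
from typing import List, Optional
from urllib.parse import unquote


@dataclass
class Pattern:
    text: str                      # lower-cased pattern to find anywhere in the UA
    active: bool                   # field 2: "1" means the pattern should be matched
    exceptions: List[str] = field(default_factory=list)  # field 3, optional


def parse_line(line: str) -> Optional[Pattern]:
    """Parse one list line; return None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):   # assumption: '#' marks a comment line
        return None
    fields = line.split("|")
    text = unquote(fields[0]).lower()      # reserved characters are URL-escaped
    active = len(fields) > 1 and fields[1].strip() == "1"
    exceptions: List[str] = []
    if len(fields) > 2 and fields[2].strip():
        exceptions = [unquote(e).lower() for e in fields[2].split(",")]
    return Pattern(text, active, exceptions)


def is_robot(user_agent: str, patterns: List[Pattern]) -> bool:
    """True if an active pattern matches and none of its exceptions do."""
    ua = user_agent.lower()
    for p in patterns:
        if not p.active or p.text not in ua:
            continue
        if any(e in ua for e in p.exceptions):
            continue               # e.g. 'robotics' rescues a 'robot' match
        return True
    return False


# Hypothetical sample lines in the documented layout (not taken from the real list).
SAMPLE_LIST = """\
# example comment line
robot|1|robotics|0|2
wget|1||1|0
oldbot|0||0|0
"""

patterns = [p for p in map(parse_line, SAMPLE_LIST.splitlines()) if p is not None]

for ua in ("MyRobot/1.0", "Mozilla/4.0 (US Robotics)", "Wget/1.12", "OldBot/1.0"):
    print(ua, is_robot(ua, patterns))
```

Because the list holds substring patterns meant for filtering measurement logs, serving it as robots.txt turns each pattern into a user-agent group name, which is what the Groups section of this report reflects.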

Warnings

  • 6 invalid lines.