ispltd.com
robots.txt

Robots Exclusion Standard data for ispltd.com

Resource Scan

Scan Details

Site Domain ispltd.com
Base Domain ispltd.com
Scan Status Failed
Failure Stage Fetching resource
Failure Reason Couldn't connect to server
Last Scan 2024-09-09T01:20:29+00:00
Next Scan 2024-12-08T01:20:29+00:00

Last Successful Scan

Scanned 2024-04-19T17:59:49+00:00
URL https://ispltd.com/robots.txt
Domain IPs 93.95.230.7
Response IP 93.95.230.7
Found Yes
Hash f68a29d07148624bb0532de2857babddbf97de523559ae61ceedf8709dde7528
SimHash b300a782cdc0

Groups

abacho
acontbot
ah-ha
ahrefsbot
aibot
aipbot
amfibibot
answerbus
appie
arachmo
arameda
archive.org_bot
argus
aspseek
asterias
autobot
baiduspider
barkrowler
becomebot
becomejpbot
bigcliquebot
blaiz-bee
boitho
boitho-robot
bruinbot
btbot
bumblebee
ccubee
cipinetbot
citenikbot
colly
converacrawler
converamultimediacrawler
cosmos
costacider
crawlconvera
crawlwave
cxl-fatassant
datacha0s
dataforseobot
datafountains
dataprovider.com
deepindex
df bot
diamondbot
dillodillo
dnsgroup
dotbot
dtaagent
eule-robot
euripbot
euripbot
eventax
exabot
exabot
fantomas
faxobot
fdse
firstgov.gov
fluffy
fyberspider
gaisbot
galaxy
galaxybot
gazz
genevabot
geniebot
geobot
girafabot
goforitbot
googlebot-image
groschobot
gsa-crawler
holmes
hoowwwer
hotzonu
htdig
ia_archiver
icab
iceape
ichiro
iconsurf
iltrovatore-setaccio
infociousbot
ingrid
innerprisebot
intravnews
ips-agent
jayde crawler
kaklebot
kretrieve
komodiabot
ksibot
kyluka
lanshanbot
lapozzbot
localcombot
lycos
mackster
matrix
metaspinner
mirago
mixrankbot
mj12bot
mnogosearch
mojeekbot
monkeycrawl
mozdex
mrgbot
msnbot-media
msnbot-news
msnbot-products
msnptc
msrbot
myfamilybot
naverbot
naverrobot
navissobot
netcraftsurveyagent
netmind-minder
networking4all
nextgensearchbot
ng
nicebot
nimblecrawler
nlcrawler
nusearch spider
nutch
nutchosu-vlib
ocelli
octopus
omnipelagos
openbot
openfind
orbiter
outfoxbot
pajaczek
panscient
panscient.com
patwebbot
peerbot
phpdig
pipeliner
poirot
polybot
pompos
popdexter
qweerybot
rampybot
reaper
rufusbot
sandcrawler
sansarn
sbider
schibstedsokbot
scooter
scrubby
search-10
search.ch
searchmee!
searchspider
seekbot
semrushbot
sensis web crawler
sensis.com.au web crawler
seokicks
shim+bot
shopwiki
shunixbot
sidewinder
silk
sitespider
sna-0.0.1
snap.com
sogou
speedyspider
spinne
squigglebotbot
stackrambler
superpagesbot
superpagesbot2.0
sureseeker
sygolbot
synobot
szukacz
t3versionsbot
thinkchaos
tkensaku
tridentspider
tutorgigbot
ultraseek
unchaos_crawler
updated
url spider pro
url spider sql
vagabondo
vuhuvbot
w3crobot
webcrawl.net
webfindbot
webindexer
whizbang
wisebot
wotbox
wwweasel
xirq
xunbot
yadowscrawler
yahoo mindset
yahoo-blogs
zaldamosearchbot
zao
zatka
zealbot
zeno
zgrab
zipppbot
zoominfobot
zyborg

Rule Path
Disallow /

*

Rule Path
Disallow /Graphics
Disallow /_themes
Disallow /_vti_conf
Disallow /about
Disallow /audio
Disallow /biblesearch
Disallow /biblestudy
Disallow /contact
Disallow /daily_photo
Disallow /getid3
Disallow /getsermon
Disallow /images/
Disallow /img/
Disallow /js/
Disallow /linuxnews
Disallow /listing
Disallow /login
Disallow /pdf/
Disallow /private/
Disallow /radio
Disallow /reminder
Disallow /send_email
Disallow /whoami
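The rules above for the `*` group can be checked mechanically. A minimal sketch, using Python's stdlib `urllib.robotparser` against a small excerpt of the rules reconstructed from this report (the bot name "MyBot" is an arbitrary placeholder):

```python
# Sketch: evaluating an excerpt of the "*" group's rules from this
# report with Python's stdlib robots.txt parser.
from urllib.robotparser import RobotFileParser

# Excerpt of the disallow rules listed above, reconstructed by hand.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /pdf/
Disallow: /login
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.can_fetch("MyBot", "https://ispltd.com/images/logo.png"))  # False
print(rp.can_fetch("MyBot", "https://ispltd.com/index.html"))       # True
```

Since the first group ends with `Disallow: /`, any bot whose user-agent matches one of its named entries is shut out entirely; all other bots fall through to the `*` group shown here.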

Comments

  • Specs for this file are at http://www.robotstxt.org/robotstxt.html
  • The Disallow value is a pattern-match without wildcard support. /img will
    block all files in /img/ as well as a file named /img.jpg, but /*jpg would
    only block a file literally named that. The disallow path MUST START with
    a slash. If the URL ends with a slash then that entire directory is
    disallowed. A URL /help would disallow /help.txt as well as everything in
    a directory named /help/.
  • User-agent is case-INsensitive.
  • Disallow URLs are case-sensitive, naturally.
  • User-agent: Googlebot
  • The disallow path MUST START with a slash. If the URL ends with a slash
    then that entire directory is blocked.
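The matching semantics described in these comments amount to a literal, case-sensitive prefix test. A minimal sketch of that rule (the function name `is_disallowed` is illustrative, not from the source):

```python
# Sketch of the prefix match the comments above describe: a Disallow
# value matches as a literal, case-sensitive path prefix (no wildcard
# support), so "/img" blocks "/img.jpg" and everything under "/img/".
def is_disallowed(path: str, disallow: str) -> bool:
    # An empty Disallow value means "allow everything" in the
    # original robotstxt.org standard.
    if not disallow:
        return False
    return path.startswith(disallow)

print(is_disallowed("/img/photo.png", "/img"))  # True
print(is_disallowed("/img.jpg", "/img"))        # True
print(is_disallowed("/IMG/photo.png", "/img"))  # False: paths are case-sensitive
```

Note that modern crawlers (Google, Bing) extend this with `*` and `$` wildcards, but the original standard this file cites does not.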