ispltd.com
robots.txt

Robots Exclusion Standard data for ispltd.com

Resource Scan

Scan Details

Site Domain ispltd.com
Base Domain ispltd.com
Scan Status Failed
Failure Stage Fetching resource
Failure Reason Couldn't connect to server
Last Scan 2024-09-09T01:20:29+00:00
Next Scan 2024-12-08T01:20:29+00:00

Last Successful Scan

Scanned 2024-04-19T17:59:49+00:00
URL https://ispltd.com/robots.txt
Domain IPs 93.95.230.7
Response IP 93.95.230.7
Found Yes
Hash f68a29d07148624bb0532de2857babddbf97de523559ae61ceedf8709dde7528
SimHash b300a782cdc0

Groups

abacho
acontbot
ah-ha
ahrefsbot
aibot
aipbot
amfibibot
answerbus
appie
arachmo
arameda
archive.org_bot
argus
aspseek
asterias
autobot
baiduspider
barkrowler
becomebot
becomejpbot
bigcliquebot
blaiz-bee
boitho
boitho-robot
bruinbot
btbot
bumblebee
ccubee
cipinetbot
citenikbot
colly
converacrawler
converamultimediacrawler
cosmos
costacider
crawlconvera
crawlwave
cxl-fatassant
datacha0s
dataforseobot
datafountains
dataprovider.com
deepindex
df bot
diamondbot
dillodillo
dnsgroup
dotbot
dtaagent
eule-robot
euripbot
euripbot
eventax
exabot
exabot
fantomas
faxobot
fdse
firstgov.gov
fluffy
fyberspider
gaisbot
galaxy
galaxybot
gazz
genevabot
geniebot
geobot
girafabot
goforitbot
googlebot-image
groschobot
gsa-crawler
holmes
hoowwwer
hotzonu
htdig
ia_archiver
icab
iceape
ichiro
iconsurf
iltrovatore-setaccio
infociousbot
ingrid
innerprisebot
intravnews
ips-agent
jayde crawler
kaklebot
kretrieve
komodiabot
ksibot
kyluka
lanshanbot
lapozzbot
localcombot
lycos
mackster
matrix
metaspinner
mirago
mixrankbot
mj12bot
mnogosearch
mojeekbot
monkeycrawl
mozdex
mrgbot
msnbot-media
msnbot-news
msnbot-products
msnptc
msrbot
myfamilybot
naverbot
naverrobot
navissobot
netcraftsurveyagent
netmind-minder
networking4all
nextgensearchbot
ng
nicebot
nimblecrawler
nlcrawler
nusearch spider
nutch
nutchosu-vlib
ocelli
octopus
omnipelagos
openbot
openfind
orbiter
outfoxbot
pajaczek
panscient
panscient.com
patwebbot
peerbot
phpdig
pipeliner
poirot
polybot
pompos
popdexter
qweerybot
rampybot
reaper
rufusbot
sandcrawler
sansarn
sbider
schibstedsokbot
scooter
scrubby
search-10
search.ch
searchmee!
searchspider
seekbot
semrushbot
sensis web crawler
sensis.com.au web crawler
seokicks
shim+bot
shopwiki
shunixbot
sidewinder
silk
sitespider
sna-0.0.1
snap.com
sogou
speedyspider
spinne
squigglebotbot
stackrambler
superpagesbot
superpagesbot2.0
sureseeker
sygolbot
synobot
szukacz
t3versionsbot
thinkchaos
tkensaku
tridentspider
tutorgigbot
ultraseek
unchaos_crawler
updated
url spider pro
url spider sql
vagabondo
vuhuvbot
w3crobot
webcrawl.net
webfindbot
webindexer
whizbang
wisebot
wotbox
wwweasel
xirq
xunbot
yadowscrawler
yahoo mindset
yahoo-blogs
zaldamosearchbot
zao
zatka
zealbot
zeno
zgrab
zipppbot
zoominfobot
zyborg

Rule Path
Disallow /

*

Rule Path
Disallow /Graphics
Disallow /_themes
Disallow /_vti_conf
Disallow /about
Disallow /audio
Disallow /biblesearch
Disallow /biblestudy
Disallow /contact
Disallow /daily_photo
Disallow /getid3
Disallow /getsermon
Disallow /images/
Disallow /img/
Disallow /js/
Disallow /linuxnews
Disallow /listing
Disallow /login
Disallow /pdf/
Disallow /private/
Disallow /radio
Disallow /reminder
Disallow /send_email
Disallow /whoami
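The rules above for the `*` group can be checked mechanically. A minimal sketch, using Python's stdlib `urllib.robotparser` against a small excerpt of the rules reconstructed from this report (the bot name "MyBot" is an arbitrary placeholder):

```python
# Sketch: evaluating an excerpt of the "*" group's rules from this
# report with Python's stdlib robots.txt parser.
from urllib.robotparser import RobotFileParser

# Excerpt of the disallow rules listed above, reconstructed by hand.
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/
Disallow: /pdf/
Disallow: /login
"""

rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())

print(rp.can_fetch("MyBot", "https://ispltd.com/images/logo.png"))  # False
print(rp.can_fetch("MyBot", "https://ispltd.com/index.html"))       # True
```

Since the first group ends with `Disallow: /`, any bot whose user-agent matches one of its named entries is shut out entirely; all other bots fall through to the `*` group shown here.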

Comments

  • Specs for this file are at http://www.robotstxt.org/robotstxt.html
  • The Disallow value is a pattern-match without wildcard support. /img will
    block all files in /img/ as well as a file named /img.jpg, but /*jpg would
    only block a file literally named that. The disallow path MUST START with
    a slash. If the URL ends with a slash then that entire directory is
    disallowed. A URL /help would disallow /help.txt as well as everything in
    a directory named /help/.
  • User-agent is case-INsensitive.
  • Disallow URLs are case-sensitive, naturally.
  • User-agent: Googlebot
  • The disallow path MUST START with a slash. If the URL ends with a slash
    then that entire directory is blocked.
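The matching semantics described in these comments amount to a literal, case-sensitive prefix test. A minimal sketch of that rule (the function name `is_disallowed` is illustrative, not from the source):

```python
# Sketch of the prefix match the comments above describe: a Disallow
# value matches as a literal, case-sensitive path prefix (no wildcard
# support), so "/img" blocks "/img.jpg" and everything under "/img/".
def is_disallowed(path: str, disallow: str) -> bool:
    # An empty Disallow value means "allow everything" in the
    # original robotstxt.org standard.
    if not disallow:
        return False
    return path.startswith(disallow)

print(is_disallowed("/img/photo.png", "/img"))  # True
print(is_disallowed("/img.jpg", "/img"))        # True
print(is_disallowed("/IMG/photo.png", "/img"))  # False: paths are case-sensitive
```

Note that modern crawlers (Google, Bing) extend this with `*` and `$` wildcards, but the original standard this file cites does not.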