limanisupply.com
robots.txt

Robots Exclusion Standard data for limanisupply.com

Resource Scan

Scan Details

Site Domain limanisupply.com
Base Domain limanisupply.com
Scan Status Ok
Last Scan 2024-10-08T17:11:05+00:00
Next Scan 2024-11-07T17:11:05+00:00

Last Scan

Scanned 2024-10-08T17:11:05+00:00
URL https://limanisupply.com/robots.txt
Domain IPs 35.214.138.201
Response IP 35.214.138.201
Found Yes
Hash 5a6a198f8d4f9601d1a951aed9b0113582b01d410cd653b613126457dd3a4e93
SimHash 7b75fbd146ef

Groups

googlebot-mobile
googlebot
googlebot-image
googlebot-news
googlebot-video
bingbot
slurp
duckduckbot
msnbot
bingpreview
msnbot-media
microsoftpreview
yahoo pipes 1.0
yahoo! slurp
baiduspider
baiduspider-news
baiduspider-image
yandex
yandexbot
yandeximages
yandexnews
yandexwebmaster
yandexpagechecker
facebot
ia_archiver
archive.org_bot
feedfetcher-google
linkedinbot
twitterbot

Rule Path
Allow /
Allow /wp-admin/admin-ajax.php
Allow /wp-content/uploads/
Disallow
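
The group above can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch: the `ROBOTS_LINES` text is an assumed reconstruction from the scan (only two of the listed agents are included, plus the `gptbot` group reported further down), and the helper name `allowed` is hypothetical. Note that this parser applies the first matching rule and treats an empty `Disallow` as "nothing is disallowed".

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction: the scan lists agents and rules, not the file's exact layout.
ROBOTS_LINES = """\
User-agent: googlebot
User-agent: bingbot
Allow: /
Allow: /wp-admin/admin-ajax.php
Allow: /wp-content/uploads/
Disallow:

User-agent: gptbot
Disallow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse in-memory lines instead of fetching over the network


def allowed(agent, path):
    """Return True if `agent` may fetch `path` under the rules parsed above."""
    return parser.can_fetch(agent, "https://limanisupply.com" + path)


if __name__ == "__main__":
    for agent, path in [
        ("googlebot", "/wp-content/uploads/logo.png"),
        ("bingbot", "/wp-admin/admin-ajax.php"),
        ("gptbot", "/"),
    ]:
        print(agent, path, allowed(agent, path))
```

The paths used in the loop are illustrative and do not come from the scan.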

*
ai2bot
ai2bot-dolma
friendlycrawler
googleother
icc-crawler
imagesiftbot
petalbot
scrapy
sidetrade indexer bot
velenpublicwebcrawler
cohere-ai
facebookexternalhit
iaskspider/2.0
img2dataset
omgilibot
python-urllib
python-requests
aiohttp
httpx
libwww-perl
httpunit
nutch
go-http-client
phpcrawl
jyxobot
biglotron
teoma
convera
seekbot
gigabot
gigablast
webmon
gingercrawler
httrack
grub\\.org
usinenouvellecrawler
antibot
netresearchserver
speedy
fluffy
findlink
msrbot
panscient
yacybot
aisearchbot
ips-agent
tagoobot
woriobot
yanga
buzzbot
mlbot
purebot
linguee bot
cyberpatrol
admantx
alphabot
awariorssbot
awariosmartbot
blexbot
buzzbot
coccocbot-image
dataforseobot
heritrix
magpie-crawler
maxpointcrawler
meltwater
peer39_crawler
piplbot
scoop.it
seekr
seoscanners
seznambot
zoominfobot
zumbot
citeseerxbot
spbot
twengabot
postrank
turnitin
scribdbot
page2rss
sitebot
linkdex
adidxbot
ezooms
dotbot
heritrix
findthatfile
europarchive\\.org
nerdbynature\\.bot
fuelbot
crunchbot
indeedbot
mappydata
woobot
zoominfobot
privacyawarebot
multiviewbot
swimgbot
grobbot
eright
apercite
semanticbot
aboundex
domaincrawler
wbsearchbot
summify
edisterbot
seznambot
ec2linkfinder
gslfbot
aihitbot
intelium_bot
yeti
retrevopageanalyzer
lb-spider
sogou
lssbot
careerbot
wotbox
wocbot
ichiro
lssrocketcrawler
drupact
webcompanycrawler
acoonbot
openindexspider
gnam gnam spider
coccoc
integromedb
content crawler spider
toplistbot
it2media-domain-crawler
ip-web-crawler\\.com
siteexplorer\\.info
elisabot
proximic
changedetection
arabot
wesee:search
niki-bot
crystalsemanticsbot
psbot
interfaxscanbot
cc metadata scaper
g00g1e\\.net
grapeshotcrawler
urlappendbot
brainobot
fr-crawler
binlar
simplecrawler
cxensebot
smtbot
bnf\\.fr_bot
a6-indexer
orangebot\\/
memorybot
advbot
megaindex
semanticscholarbot
ltx71
nerdybot
xovibot
bubing
qwantify
archive\\.org_bot
tweetmemebot
crawler4j
findxbot
yoozbot
lipperhey
y!j
domain re-animator bot
addthis
metauri
scrapy
livelap[bb]ot
openhosebot
capsulechecker
collection@infegy\\.com
istellabot
deusu\\/
betabot
cliqzbot\\/
mojeekbot\\/
netestate ne crawler
buzzsumo
serpstatbot
backlinkcrawler
gsitecrawler
ahrefsbot/4.0
spyfu
semrushbot-coub
splitsignalbot
semrushbot-swa
semrushbot-si
semrushbot-ba
woorank
rsiteauditor
dotbot
rogerbot
cognitiveseo
oncrawl
mj12bot
semrushbot
ahrefsbot
ahrefs(bot|siteaudit)
s[ee][mm]rushbot
screaming frog seo spider

Rule Path
Disallow /*.pdf$
Disallow /*.mp4$
Disallow /wp-signup.php
Disallow /wp-admin/
Disallow /wp-login.php
Disallow /mshots/v1/
Disallow /next/
Disallow /cgi-bin/
Disallow /tmp/
Disallow /trackback/
Disallow /wp-trackback/
Disallow /replytocom/
Disallow /search
Disallow /url
Disallow /index.html?
Disallow /s?
Disallow /readme.html
Disallow *utm*%3D
Disallow */xmlrpc.php
Disallow /cdn-cgi/
Disallow /
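
Several paths in this group use the `*` wildcard and the `$` end anchor, which `urllib.robotparser` does not interpret. The sketch below shows one way such patterns could be matched, assuming the usual convention that `*` matches any run of characters and a trailing `$` anchors the end of the URL path. The function names `rule_to_regex` and `is_disallowed` are hypothetical, and the sketch ignores `Allow` precedence because this group has no `Allow` rules.

```python
import re

# Disallow paths for this group, copied from the scan above.
GROUP_RULES = [
    "/*.pdf$", "/*.mp4$", "/wp-signup.php", "/wp-admin/", "/wp-login.php",
    "/mshots/v1/", "/next/", "/cgi-bin/", "/tmp/", "/trackback/",
    "/wp-trackback/", "/replytocom/", "/search", "/url", "/index.html?",
    "/s?", "/readme.html", "*utm*%3D", "*/xmlrpc.php", "/cdn-cgi/", "/",
]


def rule_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regular expression."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    # Escape literal text and turn each '*' into '.*'.
    body = ".*".join(re.escape(part) for part in core.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path, rules):
    """Return True if any non-empty Disallow pattern matches from the start of `path`."""
    return any(rule_to_regex(rule).match(path) for rule in rules if rule)


if __name__ == "__main__":
    print(is_disallowed("/docs/catalog.pdf", ["/*.pdf$"]))       # PDF path, end-anchored rule
    print(is_disallowed("/docs/catalog.pdf?v=2", ["/*.pdf$"]))   # query string defeats the '$' anchor
    print(is_disallowed("/any/page", GROUP_RULES))               # caught by the final "/" rule
```

Because the group ends with `Disallow /`, every path is blocked for these agents, so the narrower patterns before it have no additional effect as long as that last rule is present.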

amazonbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

applebot

Rule Path
Disallow /

applebot-extended

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

claudebot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

kangaroo bot

Rule Path
Disallow /

meta-externalagent

Rule Path
Disallow /

meta-externalfetcher

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

timpibot

Rule Path
Disallow /

webzio-extended

Rule Path
Disallow /

youbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.limanisupply.com/sitemap_index.xml

Comments

  • START DARK VISITORS BLOCK
  • ---------------------------
  • AI Search Crawler
  • https://darkvisitors.com/agents/amazonbot
  • Undocumented AI Agent
  • https://darkvisitors.com/agents/anthropic-ai
  • AI Search Crawler
  • https://darkvisitors.com/agents/applebot
  • AI Data Scraper
  • https://darkvisitors.com/agents/applebot-extended
  • AI Data Scraper
  • https://darkvisitors.com/agents/bytespider
  • AI Data Scraper
  • https://darkvisitors.com/agents/ccbot
  • AI Assistant
  • https://darkvisitors.com/agents/chatgpt-user
  • Undocumented AI Agent
  • https://darkvisitors.com/agents/claude-web
  • AI Data Scraper
  • https://darkvisitors.com/agents/claudebot
  • Undocumented AI Agent
  • https://darkvisitors.com/agents/cohere-ai
  • AI Data Scraper
  • https://darkvisitors.com/agents/diffbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/facebookbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/google-extended
  • AI Data Scraper
  • https://darkvisitors.com/agents/gptbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/kangaroo-bot
  • AI Data Scraper
  • https://darkvisitors.com/agents/meta-externalagent
  • AI Assistant
  • https://darkvisitors.com/agents/meta-externalfetcher
  • AI Search Crawler
  • https://darkvisitors.com/agents/oai-searchbot
  • AI Data Scraper
  • https://darkvisitors.com/agents/omgili
  • AI Search Crawler
  • https://darkvisitors.com/agents/perplexitybot
  • AI Data Scraper
  • https://darkvisitors.com/agents/timpibot
  • AI Data Scraper
  • https://darkvisitors.com/agents/webzio-extended
  • AI Search Crawler
  • https://darkvisitors.com/agents/youbot
  • ---------------------------
  • END DARK VISITORS BLOCK

Warnings

  • 3 invalid lines.
  • `user agent` is not a known field.
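
The scan does not show which three lines were rejected. One plausible cause of the second warning is a line written as `User agent:` rather than `User-agent:`. The sketch below is a hypothetical linter that reports such lines; the `KNOWN_FIELDS` set and the sample input are assumptions, not the scanner's actual rules or the site's actual file.

```python
# Assumed set of accepted field names; the scanner's real list is not documented here.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "sitemap", "crawl-delay"}


def find_invalid_lines(text):
    """Return (line_number, message) pairs for lines that are not valid records."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        field, sep, _value = line.partition(":")
        name = field.strip().lower()
        if not sep:
            problems.append((number, "missing ':' separator"))
        elif name not in KNOWN_FIELDS:
            problems.append((number, f"`{name}` is not a known field"))
    return problems


if __name__ == "__main__":
    # Hypothetical sample: the first line uses a space instead of a hyphen.
    sample = "User agent: ExampleBot\nDisallow: /\n\nUser-agent: gptbot\nDisallow: /\n"
    for number, message in find_invalid_lines(sample):
        print(f"line {number}: {message}")
```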