karpatinfo.net
robots.txt

Robots Exclusion Standard data for karpatinfo.net

Resource Scan

Scan Details

Site Domain	karpatinfo.net
Base Domain	karpatinfo.net
Scan Status	Ok
Last Scan	2024-11-13T20:51:14+00:00
Next Scan	2024-11-20T20:51:14+00:00

Last Scan

Scanned	2024-11-13T20:51:14+00:00
URL	https://karpatinfo.net/robots.txt
Domain IPs	185.51.188.57
Response IP	185.51.188.57
Found	Yes
Hash	c57c704f1df87dce396e3e6302a1fc40d3c428a01c88be9b49be301fde8ab7b4
SimHash	b3d4b5438566

Groups

*

Rule	Path
Allow	/core/*.css$
Allow	/core/*.css?
Allow	/core/*.js$
Allow	/core/*.js?
Allow	/core/*.gif
Allow	/core/*.jpg
Allow	/core/*.jpeg
Allow	/core/*.png
Allow	/core/*.svg
Allow	/profiles/*.css$
Allow	/profiles/*.css?
Allow	/profiles/*.js$
Allow	/profiles/*.js?
Allow	/profiles/*.gif
Allow	/profiles/*.jpg
Allow	/profiles/*.jpeg
Allow	/profiles/*.png
Allow	/profiles/*.svg
Disallow	/core/
Disallow	/profiles/
Disallow	/modules/
Disallow	/sites/default/files/styles/
Disallow	/README.md
Disallow	/composer/Metapackage/README.txt
Disallow	/composer/Plugin/ProjectMessage/README.md
Disallow	/composer/Plugin/Scaffold/README.md
Disallow	/composer/Plugin/VendorHardening/README.txt
Disallow	/composer/Template/README.txt
Disallow	/modules/README.txt
Disallow	/sites/README.txt
Disallow	/themes/README.txt
Disallow	/web.config
Disallow	/admin/
Disallow	/comment/reply/
Disallow	/filter/tips
Disallow	/node/add/
Disallow	/search/
Disallow	/user/register
Disallow	/user/password
Disallow	/user/login
Disallow	/user/logout
Disallow	/media/oembed
Disallow	/*/media/oembed
Disallow	/index.php/admin/
Disallow	/index.php/comment/reply/
Disallow	/index.php/filter/tips
Disallow	/index.php/node/add/
Disallow	/index.php/search/
Disallow	/index.php/user/password
Disallow	/index.php/user/register
Disallow	/index.php/user/login
Disallow	/index.php/user/logout
Disallow	/index.php/media/oembed
Disallow	/index.php/*/media/oembed
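
The Allow and Disallow rows above overlap (for example, /core/*.css$ and /core/), so which rule applies to a given path depends on precedence. The following is a minimal sketch, assuming the longest-match behaviour described in RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. The function names `pattern_to_regex` and `is_allowed` are hypothetical, and the rule list is only a small excerpt of the table; individual crawlers may evaluate rules differently.

```python
import re

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' is a wildcard and a trailing '$' anchors the end of
    the path; every other character (including '?') is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_allowed(rules, path):
    """Return True if `path` may be crawled under `rules`.

    The longest matching pattern wins; on equal length, Allow wins.
    A path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

# Excerpt of the "*" group from the table above.
RULES = [
    ("Allow", "/core/*.css$"),
    ("Allow", "/core/*.css?"),
    ("Disallow", "/core/"),
    ("Disallow", "/admin/"),
    ("Disallow", "/*/media/oembed"),
]

for path in ["/core/misc/print.css", "/core/lib/Drupal.php", "/hu/media/oembed"]:
    print(path, is_allowed(RULES, path))
```

Under these assumptions, a stylesheet below /core/ stays crawlable because the Allow pattern is longer than the Disallow prefix, while other files in /core/ remain blocked.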

libwww-perl
abachobot
admantx
admantx platform semantic analyzer - admantx inc. - www.admantx.com - support@admantx.com
adscanner
ahrefs
ahrefs
ahrefsbot
aihitbot
anarchie
antibot
appie
aspseek
asterias
attach
autoemailspider
b2w
backdoorbot
backlinkcrawler
backweb
baidu
bandit
batchftp
black\ hole
blackwidow
blexbot
blowfish
bot\ mailto
bot\ mailto:craftbot@yahoo.com
botalot
bubing
buddy
builtbottough
bullseye
bumblebee
bunnyslippers
careerbot
ccbot
cheesebot
cherrypicker
cherrypickerelite
cherrypickerse
chinaclaw
clariabot
clickagy intelligence bot
clickagy intelligence bot v2
cliqzbot
cloudservermarketspider
clshttp
coast\ webmaster
coccoc
coldfusion
collector
copier
copyrightcheck
cosmos
crawler4j
crescent
curl
custo
da
diamond
disco
disco\ pump
dittospyder
dloader
domain re-animator bot
domaincrawler
dotbot
dotbot
download\ demon
download\ wonder
downloader
drip
dts\ agent
easydl
ecatch
eirgrabber
emailcollector
emailsiphon
emailwolf
erocrawler
exabot
express\ webpictures
extractorpro
extreme\ picture\ finder
eyenetie
eyenetie
fast\ webcrawler
fetch\ api\ request
filehound
flashget
flickbot
fr-crawler
freefind.com
frontpage
garlikcrawler
generic
getright
getsmart
getweb!
gigabot
go!zilla
go-ahead-got-it
gosign-security-crawler
gotit
grabber
grabnet
grabnet
grafula
gulliver
haosouspider
harvest
heretrix
hitboxdoctor
hloader
hmview
httpapp
httpfetcher
httplib
httpscraper
httptrack
httpviewer
httrack
humanlinks
ia_archiver
iccrawler
image\ stripper
image\ sucker
implisensebot
indeedbot
indy\ library
infonavirobot
interget
internet\ ninja
internetseer.com
iria
irlbot
java
jennybot
jetcar
jobboersebot
jobo
jobs.de-robot
joc
joc\ web\ spider
jonzilla
justview
kenjin\ spider
keyword\ density
kraken
lachesis
larbin
leechftp
lexibot
lftp
libby_
libweb
libwww-perl
libwwwperl
likse
linguee
link
linkextractorpro
linkscan
linkstats
linkwalker
lipperhey-kaus-australis
ltx71
lwp-trivial
lwp\ request
mag-net
magnet
magpie-crawler
mass\ downloader
mata\ hari
maxpoint bot
maxpointcrawler
meanpathbot
megaindex.com
megaindex.ru
memo
mercator
metacarta
metajobbot
mewsoft\ search\ engine
mfc_tear_sample
mfc_tear_sample
microsoft\ url\ control
microsofturl
midown\ tool
miixpc
mindupbot
mirror
missigua
mister\ pix
mj12bot
moget
mozilla/4.0 (compatible; msie 4.01; windows nt; ms search 4.0 robot) microsoft
ms search 4.0 robot
msfrontpage
msiecrawler
nationaldirectory\ webspider
navroad
nearsite
net\ probe
net\ vampire
netants
netestate
netmechanic
netresearchserver
netspider
netzip
nexuscache
nicerspro
nikto
ninja
npbot
obot
octopus
offline\ explorer
offline\ navigator
omgili
omgili/0.5 +http://omgili.com
onestop
openfind
openfind\ data\ gatherer
openhosebot
orangebot
our\ agent
pagegrabber
papa\ foto
pavuk
pcbrowser
perl
php
php\ version
phpot
ping
pingalink\ monitoring\ services
plista
plukkie
pockey
pompos
propowerbot
prowebwalker
proximic
psbot
psycheclone
pump
python-urllib
python/3.5 aiohttp
python\ urllib
queryn
qwantify
r6_commentreader
realdownload
reaper
recorder
reget
repomonkey
rico
rma
robozilla
rogerbot
sabsimbot
safednsbot
scooter
scoutabout
screaming frog seo spider
searchmetricsbot
semrushbot
semrushbot-sa
sentibot
sentibot
seodiver
seokicks-robot
seoscanners.net
sg-orbiter
siphon
sistrix
sistrix
sistrix crawler
sitecheck.internetseer.com
siteliner
sitesnagger
slysearch
smartdownload
snake
snapbot
snoopy
spacebison
spankbot
spanner
spbot
spiderbot
spinne
sqworm
stealer
stripper
sucker
superbot
superhttp
surfbot
surveybot
suzuran
szukacz
takeout
teleport\ pro
telesoft
the\ intraformant
thenomad
thumbsniper
tighttwatbot
titan
tocrawl
toweya.com
trendictionbot
true_robot
turingos
turnitinbot
turnitinbot
twitterbot
um-ic
unisterbot
urldispatcher
urly\ warning
useragent:admantx platform semantic analyzer us - turn - admantx inc. - www.admantx.com - support@admantx.com
vacuum
vagabondo
vayala
vci
velenpublicwebcrawler
vintage
voideye
w3c_validator
wbsearchbot
web\ downloader
web\ image\ collector
web\ sucker
webauto
webcopier
webdownloader
webenhancer
webfetch
webgo\ is
webhook
webleacher
webminer
webmirror
webmole
websauger
website
website\ extractor
website\ quester
websites
webster
webster\ pro
webstripper
webviewer
webwhacker
webzip
wells
wget
whacker
widow
wildsoft\ surfer
winhttp
winhttprequest
wotbox
www-collector-e
wwwoffle
xaldon
xaldon\ webspider
xara
xenu
y!tunnelpro
yahooysmcm
yandex
zade
zbot
zeus

Rule	Path
Disallow	/
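
To illustrate how a crawler named in the list above ends up with Disallow / while other crawlers fall back to the * group, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt lines are a hypothetical two-group excerpt, not the real file, and `ExampleBot` is an invented name. Note that this standard-library parser matches user-agent names by case-insensitive substring and, as far as I know, does not interpret the `*` and `$` path wildcards, so it is only suitable here for showing group selection.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt: a default group plus a blocked-crawler group.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /admin/",
    "",
    "User-agent: ahrefsbot",
    "User-agent: semrushbot",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# A listed crawler is refused everywhere.
print(parser.can_fetch("AhrefsBot", "https://karpatinfo.net/node/1"))

# An unlisted crawler falls back to the "*" group.
print(parser.can_fetch("ExampleBot", "https://karpatinfo.net/node/1"))
print(parser.can_fetch("ExampleBot", "https://karpatinfo.net/admin/people"))
```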

Other Records

Field	Value
sitemap	https://karpatinfo.net/sitemap.xml

Comments

robots.txt
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
This file will be ignored unless it is at the root of your host:
Used: http://example.com/robots.txt
Ignored: http://example.com/site/robots.txt
For more information about the robots.txt standard, see:
http://www.robotstxt.org/robotstxt.html
CSS, JS, Images
Directories
Files
Paths (clean URLs)
Paths (no clean URLs)
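
The comment above notes that the file is only honoured at the root of the host. As a small sketch of what that means for a crawler, the hypothetical helper `robots_url` below derives the robots.txt location from any page URL by keeping the scheme and host and discarding the path, query and fragment.

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url):
    """Return the root-level robots.txt URL for the host serving `page_url`."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://example.com/site/page.html?x=1"))
```

A file at http://example.com/site/robots.txt is never consulted under this scheme, which matches the "Ignored" example in the comment.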

Warnings

1 invalid line.
