gssi.it
robots.txt

Robots Exclusion Standard data for gssi.it

Resource Scan

Scan Details

Site Domain	gssi.it
Base Domain	gssi.it
Scan Status	Failed
Failure Stage	Fetching resource.
Failure Reason	Couldn't establish SSL connection.
Last Scan	2024-10-13T18:33:29+00:00
Next Scan	2025-01-11T18:33:29+00:00
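
The Last Scan and Next Scan timestamps above are exactly 90 days apart. The report does not say how rescans are scheduled, so reading this as a 90-day retry cycle is an assumption. A minimal Python check of the arithmetic:

```python
from datetime import datetime

# Timestamps copied from the Scan Details table above.
last_scan = datetime.fromisoformat("2024-10-13T18:33:29+00:00")
next_scan = datetime.fromisoformat("2025-01-11T18:33:29+00:00")

# Gap between the last (failed) attempt and the scheduled next attempt.
interval = next_scan - last_scan
print(interval.days)  # 90
```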

Last Successful Scan

Scanned	2022-08-25T06:27:35+00:00
URL	https://gssi.it/robots.txt
Response IP	192.135.27.202
Found	Yes
Hash	0ed55dc0836a9ab5e5e04b2f3d707eadabf81deec0dabe657ad11ff807c48561
SimHash	a2562558cbfa

Groups

*

Rule	Path
Disallow	/administrator/
Disallow	/bin/
Disallow	/cache/
Disallow	/cli/
Disallow	/components/
Disallow	/includes/
Disallow	/installation/
Disallow	/language/
Disallow	/layouts/
Disallow	/libraries/
Disallow	/logs/
Disallow	/modules/
Disallow	/plugins/
Disallow	/tmp/
Disallow	/albo-ufficiale-online-gssi
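
As a rough illustration of how the `*` group above applies to a crawler, the sketch below rebuilds those rules as robots.txt text and queries them with Python's standard `urllib.robotparser`. The agent name `ExampleBot`, the helper `is_allowed`, and the path `/research` are hypothetical and do not come from the scan.

```python
from urllib.robotparser import RobotFileParser

# The "*" group from the table above, rebuilt as robots.txt text.
ROBOTS_TXT = """\
User-agent: *
Disallow: /administrator/
Disallow: /bin/
Disallow: /cache/
Disallow: /cli/
Disallow: /components/
Disallow: /includes/
Disallow: /installation/
Disallow: /language/
Disallow: /layouts/
Disallow: /libraries/
Disallow: /logs/
Disallow: /modules/
Disallow: /plugins/
Disallow: /tmp/
Disallow: /albo-ufficiale-online-gssi
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(path, agent="ExampleBot"):
    """Return True if `agent` (hypothetical name) may fetch `path`."""
    return parser.can_fetch(agent, path)


print(is_allowed("/administrator/index.php"))  # False
print(is_allowed("/research"))                 # True (hypothetical path)
```

Matching is by path prefix, so the last rule, which has no trailing slash, also covers any path that merely begins with `/albo-ufficiale-online-gssi`.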

sistrix

Rule	Path
Disallow	/

sistrix crawler

Rule	Path
Disallow	/

sistrix

Rule	Path
Disallow	/

seokicks-robot

Rule	Path
Disallow	/

jobs.de-robot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

unisterbot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

searchmetricsbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

surveybot

Rule	Path
Disallow	/

seodiver

Rule	Path
Disallow	/

spbot

Rule	Path
Disallow	/

wotbox

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

meanpathbot

Rule	Path
Disallow	/

backlinkcrawler

Rule	Path
Disallow	/

magpie-crawler

Rule	Path
Disallow	/

obot

Rule	Path
Disallow	/

fr-crawler

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

megaindex.ru

Rule	Path
Disallow	/

megaindex.com

Rule	Path
Disallow	/

cloudservermarketspider

Rule	Path
Disallow	/

trendictionbot

Rule	Path
Disallow	/

exabot

Rule	Path
Disallow	/

careerbot

Rule	Path
Disallow	/

lipperhey-kaus-australis

Rule	Path
Disallow	/

seoscanners.net

Rule	Path
Disallow	/

metajobbot

Rule	Path
Disallow	/

spiderbot

Rule	Path
Disallow	/

linkstats

Rule	Path
Disallow	/

jobboersebot

Rule	Path
Disallow	/

iccrawler

Rule	Path
Disallow	/

plista

Rule	Path
Disallow	/

domain re-animator bot

Rule	Path
Disallow	/

lipperhey-kaus-australis

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

coccoc

Rule	Path
Disallow	/

um-ic

Rule	Path
Disallow	/

mindupbot

Rule	Path
Disallow	/

sg-orbiter

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

qwantify

Rule	Path
Disallow	/

kraken

Rule	Path
Disallow	/

plukkie

Rule	Path
Disallow	/

safednsbot

Rule	Path
Disallow	/

haosouspider

Rule	Path
Disallow

rogerbot

Rule	Path
Disallow

openhosebot

Rule	Path
Disallow

screaming frog seo spider

Rule	Path
Disallow

thumbsniper

Rule	Path
Disallow

r6_commentreader

Rule	Path
Disallow

implisensebot

Rule	Path
Disallow

cliqzbot

Rule	Path
Disallow

aihitbot

Rule	Path
Disallow

trendictionbot

Rule	Path
Disallow

wbsearchbot

Rule	Path
Disallow

semrushbot

Rule	Path
Disallow

semrushbot-sa

Rule	Path
Disallow

aboutusbot

Rule	Path
Disallow

adnormcrawler

Rule	Path
Disallow

ahrefsbot

Rule	Path
Disallow

aliexpress

Rule	Path
Disallow

archive.org_bot

Rule	Path
Disallow

baidu

Rule	Path
Disallow

baiduspider

Rule	Path
Disallow

baiduspider-ads

Rule	Path
Disallow

baiduspider-cpro

Rule	Path
Disallow

baiduspider-favo

Rule	Path
Disallow

baiduspider-image

Rule	Path
Disallow

baiduspider-news

Rule	Path
Disallow

baiduspider-video

Rule	Path
Disallow

bizinformation

Rule	Path
Disallow

blackcatb

Rule	Path
Disallow

blexbot

Rule	Path
Disallow

bot-pge.chlooe.com

Rule	Path
Disallow

bubing

Rule	Path
Disallow

careerbot

Rule	Path
Disallow

catchbot

Rule	Path
Disallow

cliqzbot

Rule	Path
Disallow

cms crawler

Rule	Path
Disallow

coccoc

Rule	Path
Disallow

compspybot

Rule	Path
Disallow

crazywebcrawler-spider

Rule	Path
Disallow

cybeye

Rule	Path
Disallow

daumoa

Rule	Path
Disallow

discobot

Rule	Path
Disallow

domainappender

Rule	Path
Disallow

dotbot

Rule	Path
Disallow

easouspider

Rule	Path
Disallow

euripbot

Rule	Path
Disallow

exabot

Rule	Path
Disallow

ezooms

Rule	Path
Disallow

gonzo

Rule	Path
Disallow

grapeshot

Rule	Path
Disallow

haosouspider

Rule	Path
Disallow

hypercrawl

Rule	Path
Disallow

ia_archiver

Rule	Path
Disallow

idmarch

Rule	Path
Disallow

iopus-web-automation

Rule	Path
Disallow

implisensebot

Rule	Path
Disallow

isowq

Rule	Path
Disallow

jamesbot

Rule	Path
Disallow

java

Rule	Path
Disallow

jobboersebot

Rule	Path
Disallow

linkdex

Rule	Path
Disallow

linkdexbot

Rule	Path
Disallow

linkpadbot

Rule	Path
Disallow

lipperhey

Rule	Path
Disallow

lipperhey spider

Rule	Path
Disallow

mail.ru

Rule	Path
Disallow

mail.ru_bot

Rule	Path
Disallow

meanpathbot

Rule	Path
Disallow

megaindex.ru

Rule	Path
Disallow

nerdybot

Rule	Path
Disallow

metajobbot

Rule	Path
Disallow

netestate ne crawler

Rule	Path
Disallow

netcraftsurveyagent

Rule	Path
Disallow

netseer

Rule	Path
Disallow

mfibot

Rule	Path
Disallow

mindupbot

Rule	Path
Disallow

mj12bot

Rule	Path
Disallow

msiecrawler

Rule	Path
Disallow

ms search 4.0 robot

Rule	Path
Disallow

nutch

Rule	Path
Disallow

obot

Rule	Path
Disallow

onpagecheck.net

Rule	Path
Disallow

optimumform

Rule	Path
Disallow

pipl

Rule	Path
Disallow

plukkie

Rule	Path
Disallow

privacyawarebot

Rule	Path
Disallow

rankactivelinkbot

Rule	Path
Disallow

rbot

Rule	Path
Disallow

riddler

Rule	Path
Disallow

safednsbot

Rule	Path
Disallow

safetab

Rule	Path
Disallow

searchmetricsbot

Rule	Path
Disallow

seobilitybot

Rule	Path
Disallow

selfbot

Rule	Path
Disallow

semager

Rule	Path
Disallow

semrushbot

Rule	Path
Disallow

seoscanners.net

Rule	Path
Disallow

seznambot

Rule	Path
Disallow

shopwiki

Rule	Path
Disallow

sistrix

Rule	Path
Disallow

smtbot

Rule	Path
Disallow

sogou spider

Rule	Path
Disallow

spbot

Rule	Path
Disallow

spiderlytics

Rule	Path
Disallow

surveybot

Rule	Path
Disallow

thumbnailagent

Rule	Path
Disallow

turnitinbot

Rule	Path
Disallow

uptimebot

Rule	Path
Disallow

urlmetriken

Rule	Path
Disallow

urlpulse

Rule	Path
Disallow

urlspion

Rule	Path
Disallow

vebidoobot

Rule	Path
Disallow

vebidoobot-image

Rule	Path
Disallow

waybackarchive

Rule	Path
Disallow

webbericht.com

Rule	Path
Disallow

webinatorbot

Rule	Path
Disallow

webreaper

Rule	Path
Disallow

websitewiki

Rule	Path
Disallow

wevikabot

Rule	Path
Disallow

woobot

Rule	Path
Disallow

wotbox

Rule	Path
Disallow

yasni

Rule	Path
Disallow

yasnibot-image

Rule	Path
Disallow

zumbot

Rule	Path
Disallow

velenpublicwebcrawler

Rule	Path
Disallow

bingbot

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay

Comments

If the Joomla site is installed within a folder such as at
e.g. www.example.com/joomla/ the robots.txt file MUST be
moved to the site root at e.g. www.example.com/robots.txt
AND the joomla folder name MUST be prefixed to the disallowed
path, e.g. the Disallow rule for the /administrator/ folder
MUST be changed to read Disallow: /joomla/administrator/
For more information about the robots.txt standard, see:
http://www.robotstxt.org/orig.html
For syntax checking, see:
http://tool.motoricerca.info/robots-checker.phtml
Disallow: Sistrix
Disallow: Sistrix
Disallow: Sistrix
Disallow: SEOkicks-Robot
Disallow: jobs.de-Robot
Backlink Analysis
Bot of the Leipzig-based Unister Holding GmbH
http://moz.com/products
http://www.searchmetrics.com
http://www.majestic12.co.uk/projects/dsearch/mj12bot.php
http://www.domaintools.com/webmasters/surveybot.php
http://www.seodiver.com/bot
http://openlinkprofiler.org/bot
http://www.wotbox.com/bot/
http://www.opensiteexplorer.org/dotbot
http://moz.com/researchtools/ose/dotbot
http://www.meanpath.com/meanpathbot.html
http://www.backlinktest.com/crawler.html
http://www.brandwatch.com/magpie-crawler/
http://filterdb.iss.net/crawler/
http://webmeup-crawler.com
https://megaindex.com/crawler
http://www.cloudservermarket.com
http://www.trendiction.de/de/publisher/bot
http://www.exalead.com
http://www.career-x.de/bot.html
https://www.lipperhey.com/en/about/
https://www.lipperhey.com/en/about/
https://turnitin.com/robot/crawlerinfo.html
http://help.coccoc.com/
ubermetrics-technologies.com
datenbutler.de
http://searchgears.de/uber-uns/crawling-faq.html
http://commoncrawl.org/faq/
https://www.qwant.com/
http://linkfluence.net/
http://www.botje.com/plukkie.htm
https://www.safedns.com/searchbot
http://www.haosou.com/help/help_3_2.html
http://www.haosou.com/help/help_3_2.html
http://www.moz.com/dp/rogerbot
http://www.openhose.org/bot.html
http://www.screamingfrog.co.uk/seo-spider/
http://thumbsniper.com
http://www.radian6.com/crawler
http://cliqz.com/company/cliqzbot
https://www.aihitdata.com/about
http://www.trendiction.com/en/publisher/bot
http://warebay.com/bot.html
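
The Joomla note at the top of these comments says that when the site lives in a subfolder, the folder name must be prefixed to every disallowed path. A minimal sketch of that rewrite follows; the function name `prefix_disallow_paths` is hypothetical and the code is not part of Joomla or of the scanned file.

```python
def prefix_disallow_paths(robots_txt: str, folder: str) -> str:
    """Prefix each Disallow path in `robots_txt` with `/<folder>`.

    Hypothetical helper: only lines whose field name is exactly
    "Disallow" and whose value starts with "/" are rewritten;
    comments and all other lines pass through unchanged.
    """
    prefix = "/" + folder.strip("/")
    out = []
    for line in robots_txt.splitlines():
        field, sep, value = line.partition(":")
        path = value.strip()
        if sep and field.strip().lower() == "disallow" and path.startswith("/"):
            out.append(f"Disallow: {prefix}{path}")
        else:
            out.append(line)
    return "\n".join(out)


original = "User-agent: *\nDisallow: /administrator/\nDisallow: /tmp/"
print(prefix_disallow_paths(original, "joomla"))
```

With the folder name `joomla` from the note's own example, `Disallow: /administrator/` becomes `Disallow: /joomla/administrator/`. The rewritten file still has to be served from the site root, as the note requires.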

Warnings

8 invalid lines.
