dspace.cuni.cz
robots.txt

Robots Exclusion Standard data for dspace.cuni.cz

Archived Snapshots

Resource Scan

Scan Details

Site Domain	dspace.cuni.cz
Base Domain	cuni.cz
Scan Status	Ok
Last Scan	2024-11-03T16:30:58+00:00
Next Scan	2024-12-03T16:30:58+00:00

Last Scan

Scanned	2024-11-03T16:30:58+00:00
URL	https://dspace.cuni.cz/robots.txt
Domain IPs	195.113.89.91, 2001:718:1e03:652::55
Response IP	195.113.89.91
Found	Yes
Hash	920cfc1c37f9d369a8f0d10aa043dcc7ada6aafbfb8535c28c9cb059e5a0a2b2
SimHash	ae1ccd51c5b4

Groups

*

Rule	Path
Disallow	/discover
Disallow	/search-filter
Disallow	/browse
Disallow	/handle/20.500.11956/*/browse
Disallow	/statistics
Disallow	/contact
Disallow	/feedback
Disallow	/forgot
Disallow	/login
Disallow	/register

Rule

Path

Disallow

/discover

Disallow

/search-filter

Disallow

/browse

Disallow

/handle/20.500.11956/*/browse

Disallow

/statistics

Disallow

/contact

Disallow

/feedback

Disallow

/forgot

Disallow

/login

Disallow

/register

mediapartners-google*

Rule	Path
Disallow	/

Rule

Path

Disallow

ubicrawler

Rule	Path
Disallow	/

Rule

Path

Disallow

doc

Rule	Path
Disallow	/

Rule

Path

Disallow

zao

Rule	Path
Disallow	/

Rule

Path

Disallow

sitecheck.internetseer.com

Rule	Path
Disallow	/

Rule

Path

Disallow

zealbot

Rule	Path
Disallow	/

Rule

Path

Disallow

msiecrawler

Rule	Path
Disallow	/

Rule

Path

Disallow

sitesnagger

Rule	Path
Disallow	/

Rule

Path

Disallow

webstripper

Rule	Path
Disallow	/

Rule

Path

Disallow

webcopier

Rule	Path
Disallow	/

Rule

Path

Disallow

fetch

Rule	Path
Disallow	/

Rule

Path

Disallow

offline explorer

Rule	Path
Disallow	/

Rule

Path

Disallow

teleport

Rule	Path
Disallow	/

Rule

Path

Disallow

teleportpro

Rule	Path
Disallow	/

Rule

Path

Disallow

webzip

Rule	Path
Disallow	/

Rule

Path

Disallow

linko

Rule	Path
Disallow	/

Rule

Path

Disallow

httrack

Rule	Path
Disallow	/

Rule

Path

Disallow

microsoft.url.control

Rule	Path
Disallow	/

Rule

Path

Disallow

xenu

Rule	Path
Disallow	/

Rule

Path

Disallow

larbin

Rule	Path
Disallow	/

Rule

Path

Disallow

libwww

Rule	Path
Disallow	/

Rule

Path

Disallow

zyborg

Rule	Path
Disallow	/

Rule

Path

Disallow

download ninja

Rule	Path
Disallow	/

Rule

Path

Disallow

fast

Rule	Path
Disallow	/

Rule

Path

Disallow

wget

Rule	Path
Disallow	/

Rule

Path

Disallow

grub-client

Rule	Path
Disallow	/

Rule

Path

Disallow

k2spider

Rule	Path
Disallow	/

Rule

Path

Disallow

npbot

Rule	Path
Disallow	/

Rule

Path

Disallow

webreaper

Rule	Path
Disallow	/

Rule

Path

Disallow

ahrefsbot

Rule	Path
Disallow	/

Rule

Path

Disallow

bytespider

Rule	Path
Disallow	/

Rule

Path

Disallow

dotbot

Rule	Path
Disallow	/

Rule

Path

Disallow

siteauditbot

Rule	Path
Disallow	/

Rule

Path

Disallow

semrushbot-ba

Rule	Path
Disallow	/

Rule

Path

Disallow

semrushbot-si

Rule	Path
Disallow	/

Rule

Path

Disallow

semrushbot-swa

Rule	Path
Disallow	/

Rule

Path

Disallow

semrushbot-ct

Rule	Path
Disallow	/

Rule

Path

Disallow

splitsignalbot

Rule	Path
Disallow	/

Rule

Path

Disallow

semrushbot-coub

Rule	Path
Disallow	/

Rule

Path

Disallow

semrushbot

Rule	Path
Disallow	/

Rule

Path

Disallow

semrushbot/7~bl

Rule	Path
Disallow	/

Rule

Path

Disallow

digitalshadowsbot

Rule	Path
Disallow	/

Rule

Path

Disallow

awariorssbot
awariosmartbot
awariobot/1.0
awariobot

Rule	Path
Disallow	/

Rule

Path

Disallow

velenpublicwebcrawler

Rule	Path
Disallow	/

Rule

Path

Disallow

petalbot

Rule	Path
Disallow	/

Rule

Path

Disallow

blexbot

Rule	Path
Disallow	/

Rule

Path

Disallow

yandexadnet
yandexaccessibilitybot
yandexblogs
yandexbot
yandexcalendar
yandexfordomain
yandeximages
yandeximageresizer
yandexmarket
yandexvideo
yandexmedia
yandexnews
yandexontodb
yandexpagechecker
yandexsitelinks
yandexspravbot
yandexturbo
yandexvertis
yandexverticals
yandexwebmaster

Rule	Path
Disallow	/

Rule

Path

Disallow