Robots Exclusion Standard data for shodhganga.inflibnet.ac.in

Resource Scan

Scan Details

Site Domain shodhganga.inflibnet.ac.in
Base Domain inflibnet.ac.in
Scan Status Ok
Last Scan 2024-11-03T07:00:49+00:00
Next Scan 2024-12-03T07:00:49+00:00

Last Scan

Scanned 2024-11-03T07:00:49+00:00
URL https://shodhganga.inflibnet.ac.in/robots.txt
Domain IPs 45.124.184.20
Response IP 45.124.184.20
Found Yes
Hash 9b9f8c451445ec9b60ebf3ab54151d885c9bdb9a333c1bb57e72e48dd80c231d
SimHash e306774165ad

Groups

*

Rule Path
Disallow /discover
Disallow /simple-search
Disallow /browse
Disallow /handle
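
A crawler can test a path against the four rules above with a standard parser. The sketch below is a minimal illustration using Python's `urllib.robotparser`. The robots.txt text in it is reconstructed from the rule table above, since this report does not show the raw file, and `examplebot` and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the rule table above, not the raw file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /discover
Disallow: /simple-search
Disallow: /browse
Disallow: /handle
"""

BASE = "https://shodhganga.inflibnet.ac.in"


def is_allowed(user_agent: str, path: str) -> bool:
    """Return True if user_agent may fetch path under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, BASE + path)


# "examplebot" is a hypothetical crawler name that only matches the * group.
print(is_allowed("examplebot", "/browse"))  # disallowed by a prefix rule
print(is_allowed("examplebot", "/about"))   # no rule matches this path
```

This covers only the single group shown above. The file lists further `*` groups later on, and parsers differ in how they treat repeated `*` groups, so the result for the full file may differ.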

*

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 600
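
`crawl-delay` is a non-standard directive that crawlers commonly read as a number of seconds to wait between requests, so 600 would mean ten minutes. As a rough sketch, Python's `urllib.robotparser` can read the value back. The robots.txt text here is reconstructed from the record above rather than copied from the raw file, and `examplebot` is a hypothetical crawler name.

```python
from typing import Optional
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the record above; the raw file is not shown.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 600
"""


def delay_seconds(user_agent: str) -> Optional[int]:
    """Return the Crawl-delay that applies to user_agent, or None if unset."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.crawl_delay(user_agent)


# "examplebot" is hypothetical; it falls through to the * group.
print(delay_seconds("examplebot"))
```

A polite crawler would sleep for the returned number of seconds between successive requests to the site.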

*

Rule Path
Disallow /

mediapartners-google*

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

doc

Rule Path
Disallow /

zao

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

facebot

Rule Path
Disallow /

facebookexternalhit

Rule Path
Disallow /

bingbot

Rule Path
Disallow /

slurp

Rule Path
Disallow /

duckduckbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

msnbot

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

fetch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webzip

Rule Path
Disallow /

linko

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

xenu

Rule Path
Disallow /

larbin

Rule Path
Disallow /

libwww

Rule Path
Disallow /

pdfdrivecrawler

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

aspiegelbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

cyotek webcopy

Rule Path
Disallow /

octoparse

Rule Path
Disallow /

sitechecker

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /

dyno mapper

Rule Path
Disallow /

zyte (scrapinghub)

Rule Path
Disallow /

webharvy

Rule Path
Disallow /

nokogiri

Rule Path
Disallow /

dexi.io

Rule Path
Disallow /

nokogiri

Rule Path
Disallow /

uipath

Rule Path
Disallow /

webz.io

Rule Path
Disallow /

getlef

Rule Path
Disallow /

parsehub

Rule Path
Disallow /

deepcrawl

Rule Path
Disallow /

oncrawl

Rule Path
Disallow /

import.io

Rule Path
Disallow /

open search server

Rule Path
Disallow /

apify

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /

mediapartners-google

Rule Path
Disallow /

uptimerobot/2.0

Rule Path
Disallow /

uptimerobot

Rule Path
Disallow /

yandexcalendar

Rule Path
Disallow /

yandexmobilebot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

admantx

Rule Path
Disallow /

aibot

Rule Path
Disallow /

alittle client

Rule Path
Disallow /

aspseek

Rule Path
Disallow /

abonti

Rule Path
Disallow /

aboundex

Rule Path
Disallow /

aboundexbot

Rule Path
Disallow /

acunetix

Rule Path
Disallow /

afd-verbotsverfahren

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

aihitbot

Rule Path
Disallow /

aipbot

Rule Path
Disallow /

alexibot

Rule Path
Disallow /

allsubmitter

Rule Path
Disallow /

alligator

Rule Path
Disallow /

alphabot

Rule Path
Disallow /

anarchie

Rule Path
Disallow /

anarchy

Rule Path
Disallow /

anarchy99

Rule Path
Disallow /

ankit

Rule Path
Disallow /

anthill

Rule Path
Disallow /

apexoo

Rule Path
Disallow /

aspiegel

Rule Path
Disallow /

asterias

Rule Path
Disallow /

atomseobot

Rule Path
Disallow /

attach

Rule Path
Disallow /

awariorssbot

Rule Path
Disallow /

awariosmartbot

Rule Path
Disallow /

bbbike

Rule Path
Disallow /

bdcbot

Rule Path
Disallow /

bdfetch

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

backdoorbot

Rule Path
Disallow /

backstreet

Rule Path
Disallow /

backweb

Rule Path
Disallow /

backlink-ceck

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

badass

Rule Path
Disallow /

bandit

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

batchftp

Rule Path
Disallow /

battleztar bazinga

Rule Path
Disallow /

betabot

Rule Path
Disallow /

bigfoot

Rule Path
Disallow /

bitacle

Rule Path
Disallow /

blackwidow

Rule Path
Disallow /

black hole

Rule Path
Disallow /

blackboard

Rule Path
Disallow /

blow

Rule Path
Disallow /

blowfish

Rule Path
Disallow /

boardreader

Rule Path
Disallow /

bolt

Rule Path
Disallow /

botalot

Rule Path
Disallow /

brandprotect

Rule Path
Disallow /

brandwatch

Rule Path
Disallow /

buck

Rule Path
Disallow /

buddy

Rule Path
Disallow /

builtbottough

Rule Path
Disallow /

builtwith

Rule Path
Disallow /

bullseye

Rule Path
Disallow /

bunnyslippers

Rule Path
Disallow /

buzzsumo

Rule Path
Disallow /

catexplorador

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

code87

Rule Path
Disallow /

cshttp

Rule Path
Disallow /

calculon

Rule Path
Disallow /

cazoodlebot

Rule Path
Disallow /

cegbfeieh

Rule Path
Disallow /

censysinspect

Rule Path
Disallow /

cheteam

Rule Path
Disallow /

cheesebot

Rule Path
Disallow /

cherrypicker

Rule Path
Disallow /

chinaclaw

Rule Path
Disallow /

chlooe

Rule Path
Disallow /

citoid

Rule Path
Disallow /

claritybot

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

cloud mapping

Rule Path
Disallow /

cocolyzebot

Rule Path
Disallow /

cogentbot

Rule Path
Disallow /

collector

Rule Path
Disallow /

copier

Rule Path
Disallow /

copyrightcheck

Rule Path
Disallow /

copyscape

Rule Path
Disallow /

cosmos

Rule Path
Disallow /

craftbot

Rule Path
Disallow /

crawling at home project

Rule Path
Disallow /

crazywebcrawler

Rule Path
Disallow /

crescent

Rule Path
Disallow /

crunchbot

Rule Path
Disallow /

curious

Rule Path
Disallow /

custo

Rule Path
Disallow /

cyotekwebcopy

Rule Path
Disallow /

dblbot

Rule Path
Disallow /

diibot

Rule Path
Disallow /

dsearch

Rule Path
Disallow /

dts agent

Rule Path
Disallow /

datacha0s

Rule Path
Disallow /

databasedrivermysqli

Rule Path
Disallow /

demon

Rule Path
Disallow /

deusu

Rule Path
Disallow /

devil

Rule Path
Disallow /

digincore

Rule Path
Disallow /

digitalpebble

Rule Path
Disallow /

dirbuster

Rule Path
Disallow /

disco

Rule Path
Disallow /

discobot

Rule Path
Disallow /

discoverybot

Rule Path
Disallow /

dispatch

Rule Path
Disallow /

dittospyder

Rule Path
Disallow /

dnbcrawler-analytics

Rule Path
Disallow /

dnyzbot

Rule Path
Disallow /

domcopbot

Rule Path
Disallow /

domainappender

Rule Path
Disallow /

domaincrawler

Rule Path
Disallow /

domainsigmacrawler

Rule Path
Disallow /

domainstatsbot

Rule Path
Disallow /

domains project

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

download wonder

Rule Path
Disallow /

dragonfly

Rule Path
Disallow /

drip

Rule Path
Disallow /

eccp/1.0

Rule Path
Disallow /

email siphon

Rule Path
Disallow /

email wolf

Rule Path
Disallow /

easydl

Rule Path
Disallow /

ebingbong

Rule Path
Disallow /

ecxi

Rule Path
Disallow /

eirgrabber

Rule Path
Disallow /

erocrawler

Rule Path
Disallow /

evil

Rule Path
Disallow /

exabot

Rule Path
Disallow /

express webpictures

Rule Path
Disallow /

extlinksbot

Rule Path
Disallow /

extractor

Rule Path
Disallow /

extractorpro

Rule Path
Disallow /

extreme picture finder

Rule Path
Disallow /

eyenetie

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

fdm

Rule Path
Disallow /

fhscan

Rule Path
Disallow /

femtosearchbot

Rule Path
Disallow /

fimap

Rule Path
Disallow /

firefox/7.0

Rule Path
Disallow /

flashget

Rule Path
Disallow /

flunky

Rule Path
Disallow /

foobot

Rule Path
Disallow /

freeuploader

Rule Path
Disallow /

frontpage

Rule Path
Disallow /

fuzz

Rule Path
Disallow /

fyberspider

Rule Path
Disallow /

fyrebot

Rule Path
Disallow /

g-i-g-a-b-o-t

Rule Path
Disallow /

gt::www

Rule Path
Disallow /

galaxybot

Rule Path
Disallow /

genieo

Rule Path
Disallow /

germcrawler

Rule Path
Disallow /

getright

Rule Path
Disallow /

getweb

Rule Path
Disallow /

getintent

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

go!zilla

Rule Path
Disallow /

go-ahead-got-it

Rule Path
Disallow /

gozilla

Rule Path
Disallow /

gotit

Rule Path
Disallow /

grabnet

Rule Path
Disallow /

grabber

Rule Path
Disallow /

grafula

Rule Path
Disallow /

grapefx

Rule Path
Disallow /

grapeshotcrawler

Rule Path
Disallow /

gridbot

Rule Path
Disallow /

headmasterseo

Rule Path
Disallow /

hmview

Rule Path
Disallow /

htmlparser

Rule Path
Disallow /

http::lite

Rule Path
Disallow /

httrack

Rule Path
Disallow /

haansoft

Rule Path
Disallow /

haosouspider

Rule Path
Disallow /

harvest

Rule Path
Disallow /

havij

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

hloader

Rule Path
Disallow /

honolulubot

Rule Path
Disallow /

humanlinks

Rule Path
Disallow /

hybridbot

Rule Path
Disallow /

idbte4m

Rule Path
Disallow /

idbot

Rule Path
Disallow /

irlbot

Rule Path
Disallow /

iblog

Rule Path
Disallow /

id-search

Rule Path
Disallow /

ilsebot

Rule Path
Disallow /

image fetch

Rule Path
Disallow /

image sucker

Rule Path
Disallow /

indeedbot

Rule Path
Disallow /

indy library

Rule Path
Disallow /

infonavirobot

Rule Path
Disallow /

infotekies

Rule Path
Disallow /

intelliseek

Rule Path
Disallow /

interget

Rule Path
Disallow /

internetseer

Rule Path
Disallow /

internet ninja

Rule Path
Disallow /

iria

Rule Path
Disallow /

iskanie

Rule Path
Disallow /

istellabot

Rule Path
Disallow /

joc web spider

Rule Path
Disallow /

jamesbot

Rule Path
Disallow /

jbrofuzz

Rule Path
Disallow /

jennybot

Rule Path
Disallow /

jetcar

Rule Path
Disallow /

jetty

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

joomla

Rule Path
Disallow /

jorgee

Rule Path
Disallow /

justview

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

kenjin spider

Rule Path
Disallow /

keybot translation-search-machine

Rule Path
Disallow /

keyword density

Rule Path
Disallow /

kinza

Rule Path
Disallow /

kozmosbot

Rule Path
Disallow /

lnspiderguy

Rule Path
Disallow /

lwp::simple

Rule Path
Disallow /

lanshanbot

Rule Path
Disallow /

larbin

Rule Path
Disallow /

leap

Rule Path
Disallow /

leechftp

Rule Path
Disallow /

leechget

Rule Path
Disallow /

lexibot

Rule Path
Disallow /

lftp

Rule Path
Disallow /

libweb

Rule Path
Disallow /

libwhisker

Rule Path
Disallow /

liebaofast

Rule Path
Disallow /

lightspeedsystems

Rule Path
Disallow /

likse

Rule Path
Disallow /

linkscan

Rule Path
Disallow /

linkwalker

Rule Path
Disallow /

linkbot

Rule Path
Disallow /

linkextractorpro

Rule Path
Disallow /

linkpadbot

Rule Path
Disallow /

linksmanager

Rule Path
Disallow /

linqiametadatadownloaderbot

Rule Path
Disallow /

linqiarssbot

Rule Path
Disallow /

linqiascrapebot

Rule Path
Disallow /

lipperhey

Rule Path
Disallow /

lipperhey spider

Rule Path
Disallow /

litemage_walker

Rule Path
Disallow /

lmspider

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

mfc_tear_sample

Rule Path
Disallow /

midown tool

Rule Path
Disallow /

miixpc

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

mqqbrowser

Rule Path
Disallow /

msfrontpage

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

mtrobot

Rule Path
Disallow /

mag-net

Rule Path
Disallow /

magnet

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

majestic-seo

Rule Path
Disallow /

majestic12

Rule Path
Disallow /

majestic seo

Rule Path
Disallow /

markmonitor

Rule Path
Disallow /

markwatch

Rule Path
Disallow /

mass downloader

Rule Path
Disallow /

masscan

Rule Path
Disallow /

mata hari

Rule Path
Disallow /

mauibot

Rule Path
Disallow /

mb2345browser

Rule Path
Disallow /

meanpath bot

Rule Path
Disallow /

meanpathbot

Rule Path
Disallow /

mediatoolkitbot

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

metauri

Rule Path
Disallow /

micromessenger

Rule Path
Disallow /

microsoft data access

Rule Path
Disallow /

microsoft url control

Rule Path
Disallow /

minefield

Rule Path
Disallow /

mister pix

Rule Path
Disallow /

moblie safari

Rule Path
Disallow /

mojeek

Rule Path
Disallow /

mojolicious

Rule Path
Disallow /

molokaibot

Rule Path
Disallow /

morfeus fucking scanner

Rule Path
Disallow /

mozlila

Rule Path
Disallow /

mr.4x3

Rule Path
Disallow /

msrabot

Rule Path
Disallow /

musobot

Rule Path
Disallow /

nicerspro

Rule Path
Disallow /

npbot

Rule Path
Disallow /

name intelligence

Rule Path
Disallow /

nameprotect

Rule Path
Disallow /

navroad

Rule Path
Disallow /

nearsite

Rule Path
Disallow /

needle

Rule Path
Disallow /

nessus

Rule Path
Disallow /

netants

Rule Path
Disallow /

netlyzer

Rule Path
Disallow /

netmechanic

Rule Path
Disallow /

netspider

Rule Path
Disallow /

netzip

Rule Path
Disallow /

net vampire

Rule Path
Disallow /

netcraft

Rule Path
Disallow /

nettrack

Rule Path
Disallow /

netvibes

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

nibbler

Rule Path
Disallow /

niki-bot

Rule Path
Disallow /

nikto

Rule Path
Disallow /

nimblecrawler

Rule Path
Disallow /

nimbostratus

Rule Path
Disallow /

ninja

Rule Path
Disallow /

nmap

Rule Path
Disallow /

not

Rule Path
Disallow /

nuclei

Rule Path
Disallow /

nutch

Rule Path
Disallow /

octopus

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

offline navigator

Rule Path
Disallow /

oncrawl

Rule Path
Disallow /

openlinkprofiler

Rule Path
Disallow /

openvas

Rule Path
Disallow /

openfind

Rule Path
Disallow /

openvas

Rule Path
Disallow /

orangebot

Rule Path
Disallow /

orangespider

Rule Path
Disallow /

outclicksbot

Rule Path
Disallow /

outfoxbot

Rule Path
Disallow /

pecl::http

Rule Path
Disallow /

phpcrawl

Rule Path
Disallow /

poe-component-client-http

Rule Path
Disallow /

pageanalyzer

Rule Path
Disallow /

pagegrabber

Rule Path
Disallow /

pagescorer

Rule Path
Disallow /

pagething.com

Rule Path
Disallow /

page analyzer

Rule Path
Disallow /

pandalytics

Rule Path
Disallow /

panscient

Rule Path
Disallow /

papa foto

Rule Path
Disallow /

pavuk

Rule Path
Disallow /

peoplepal

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

pi-monster

Rule Path
Disallow /

picscout

Rule Path
Disallow /

picsearch

Rule Path
Disallow /

picturefinder

Rule Path
Disallow /

piepmatz

Rule Path
Disallow /

pimonster

Rule Path
Disallow /

pixray

Rule Path
Disallow /

pleasecrawl

Rule Path
Disallow /

pockey

Rule Path
Disallow /

propowerbot

Rule Path
Disallow /

prowebwalker

Rule Path
Disallow /

probethenet

Rule Path
Disallow /

psbot

Rule Path
Disallow /

pu_in

Rule Path
Disallow /

pump

Rule Path
Disallow /

pxbroker

Rule Path
Disallow /

pycurl

Rule Path
Disallow /

queryn metasearch

Rule Path
Disallow /

quick-crawler

Rule Path
Disallow /

rssingbot

Rule Path
Disallow /

rankactive

Rule Path
Disallow /

rankactivelinkbot

Rule Path
Disallow /

rankflex

Rule Path
Disallow /

rankingbot

Rule Path
Disallow /

rankingbot2

Rule Path
Disallow /

rankivabot

Rule Path
Disallow /

rankurbot

Rule Path
Disallow /

re-re

Rule Path
Disallow /

reget

Rule Path
Disallow /

realdownload

Rule Path
Disallow /

reaper

Rule Path
Disallow /

rebelmouse

Rule Path
Disallow /

recorder

Rule Path
Disallow /

redesscrapy

Rule Path
Disallow /

repomonkey

Rule Path
Disallow /

ripper

Rule Path
Disallow /

rocketcrawler

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

sbider

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

seolyticscrawler

Rule Path
Disallow /

seoprofiler

Rule Path
Disallow /

seostats

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

smtbot

Rule Path
Disallow /

salesintelligent

Rule Path
Disallow /

scanalert

Rule Path
Disallow /

scanbot

Rule Path
Disallow /

scoutjet

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

screaming

Rule Path
Disallow /

screenerbot

Rule Path
Disallow /

screpybot

Rule Path
Disallow /

searchestate

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

seekport

Rule Path
Disallow /

semanticjuice

Rule Path
Disallow /

semrush

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

sentibot

Rule Path
Disallow /

seositecheckup

Rule Path
Disallow /

seobilitybot

Rule Path
Disallow /

seomoz

Rule Path
Disallow /

shodan

Rule Path
Disallow /

siphon

Rule Path
Disallow /

sitecheckerbotcrawler

Rule Path
Disallow /

siteexplorer

Rule Path
Disallow /

sitelockspider

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

sitesucker

Rule Path
Disallow /

site sucker

Rule Path
Disallow /

sitebeam

Rule Path
Disallow /

siteimprove

Rule Path
Disallow /

sitevigil

Rule Path
Disallow /

slysearch

Rule Path
Disallow /

smartdownload

Rule Path
Disallow /

snake

Rule Path
Disallow /

snapbot

Rule Path
Disallow /

snoopy

Rule Path
Disallow /

socialrankiobot

Rule Path
Disallow /

sociscraper

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

sottopop

Rule Path
Disallow /

spacebison

Rule Path
Disallow /

spammen

Rule Path
Disallow /

spankbot

Rule Path
Disallow /

spanner

Rule Path
Disallow /

spbot

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

sputnikbot

Rule Path
Disallow /

sqlmap

Rule Path
Disallow /

sqlworm

Rule Path
Disallow /

sqworm

Rule Path
Disallow /

steeler

Rule Path
Disallow /

stripper

Rule Path
Disallow /

sucker

Rule Path
Disallow /

sucuri

Rule Path
Disallow /

superbot

Rule Path
Disallow /

superhttp

Rule Path
Disallow /

surfbot

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

suzuran

Rule Path
Disallow /

swiftbot

Rule Path
Disallow /

szukacz

Rule Path
Disallow /

t0phackteam

Rule Path
Disallow /

t8abot

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

telesoft

Rule Path
Disallow /

telesphoreo

Rule Path
Disallow /

telesphorep

Rule Path
Disallow /

thenomad

Rule Path
Disallow /

the intraformant

Rule Path
Disallow /

thumbor

Rule Path
Disallow /

tighttwatbot

Rule Path
Disallow /

titan

Rule Path
Disallow /

toata

Rule Path
Disallow /

toweyabot

Rule Path
Disallow /

tracemyfile

Rule Path
Disallow /

trendiction

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

true_robot

Rule Path
Disallow /

turingos

Rule Path
Disallow /

turnitin

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

twengabot

Rule Path
Disallow /

twice

Rule Path
Disallow /

typhoeus

Rule Path
Disallow /

urly.warning

Rule Path
Disallow /

urly warning

Rule Path
Disallow /

unisterbot

Rule Path
Disallow /

upflow

Rule Path
Disallow /

v-bot

Rule Path
Disallow /

vb project

Rule Path
Disallow /

vci

Rule Path
Disallow /

vacuum

Rule Path
Disallow /

vagabondo

Rule Path
Disallow /

velenpublicwebcrawler

Rule Path
Disallow /

vericitecrawler

Rule Path
Disallow /

vidiblescraper

Rule Path
Disallow /

virusdie

Rule Path
Disallow /

voideye

Rule Path
Disallow /

voil

Rule Path
Disallow /

voltron

Rule Path
Disallow /

wasalive-bot

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

webdav

Rule Path
Disallow /

wisenutbot

Rule Path
Disallow /

wpscan

Rule Path
Disallow /

www-collector-e

Rule Path
Disallow /

www-mechanize

Rule Path
Disallow /

www::mechanize

Rule Path
Disallow /

wwwoffle

Rule Path
Disallow /

wallpapers

Rule Path
Disallow /

wallpapers/3.0

Rule Path
Disallow /

wallpapershd

Rule Path
Disallow /

wesee

Rule Path
Disallow /

webauto

Rule Path
Disallow /

webbandit

Rule Path
Disallow /

webcollage

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

webenhancer

Rule Path
Disallow /

webfetch

Rule Path
Disallow /

webfuck

Rule Path
Disallow /

webgo is

Rule Path
Disallow /

webimagecollector

Rule Path
Disallow /

webleacher

Rule Path
Disallow /

webpix

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

websauger

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

websucker

Rule Path
Disallow /

webwhacker

Rule Path
Disallow /

webzip

Rule Path
Disallow /

web auto

Rule Path
Disallow /

web collage

Rule Path
Disallow /

web enhancer

Rule Path
Disallow /

web fetch

Rule Path
Disallow /

web fuck

Rule Path
Disallow /

web pix

Rule Path
Disallow /

web sauger

Rule Path
Disallow /

web sucker

Rule Path
Disallow /

webalta

Rule Path
Disallow /

webmasterworldforumbot

Rule Path
Disallow /

webshag

Rule Path
Disallow /

websiteextractor

Rule Path
Disallow /

websitequester

Rule Path
Disallow /

website quester

Rule Path
Disallow /

webster

Rule Path
Disallow /

whack

Rule Path
Disallow /

whacker

Rule Path
Disallow /

whatweb

Rule Path
Disallow /

who.is bot

Rule Path
Disallow /

widow

Rule Path
Disallow /

winhttrack

Rule Path
Disallow /

wiseguys robot

Rule Path
Disallow /

wonderbot

Rule Path
Disallow /

woobot

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

wprecon

Rule Path
Disallow /

xaldon webspider

Rule Path
Disallow /

xaldon_webspider

Rule Path
Disallow /

xenu

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

zade

Rule Path
Disallow /

zauba

Rule Path
Disallow /

zermelo

Rule Path
Disallow /

zeus

Rule Path
Disallow /

zitebot

Rule Path
Disallow /

zmeu

Rule Path
Disallow /

zoombot

Rule Path
Disallow /

zoominfobot

Rule Path
Disallow /

zumbot

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

adscanner

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

arquivo-web-crawler

Rule Path
Disallow /

arquivo.pt

Rule Path
Disallow /

autoemailspider

Rule Path
Disallow /

backlink-check

Rule Path
Disallow /

cah.io.community

Rule Path
Disallow /

check1.exe

Rule Path
Disallow /

clark-crawler

Rule Path
Disallow /

coccocbot

Rule Path
Disallow /

cognitiveseo

Rule Path
Disallow /

com.plumanalytics

Rule Path
Disallow /

crawl.sogou.com

Rule Path
Disallow /

crawler.feedback

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

dataforseo.com

Rule Path
Disallow /

demandbase-bot

Rule Path
Disallow /

domainsproject.org

Rule Path
Disallow /

ecatch

Rule Path
Disallow /

evc-batch

Rule Path
Disallow /

facebookscraper

Rule Path
Disallow /

gopher

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

instabid

Rule Path
Disallow /

internetvista monitor

Rule Path
Disallow /

ips-agent

Rule Path
Disallow /

isitwp.com

Rule Path
Disallow /

iubenda-radar

Rule Path
Disallow /

linkdexbot

Rule Path
Disallow /

lwp-request

Rule Path
Disallow /

lwp-trivial

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

meanpathbot

Rule Path
Disallow /

mediawords

Rule Path
Disallow /

muhstik-scan

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

obot

Rule Path
Disallow /

page scorer

Rule Path
Disallow /

pcbrowser

Rule Path
Disallow /

plumanalytics

Rule Path
Disallow /

polaris version

Rule Path
Disallow /

probe-image-size

Rule Path
Disallow /

ripz

Rule Path
Disallow /

s1z.ru

Rule Path
Disallow /

satoristudio.net

Rule Path
Disallow /

scalaj-http

Rule Path
Disallow /

scan.lol

Rule Path
Disallow /

seobility

Rule Path
Disallow /

seocompany.store

Rule Path
Disallow /

seoscanners

Rule Path
Disallow /

seostar

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

sexsearcher

Rule Path
Disallow /

sitechecker.pro

Rule Path
Disallow /

siteripz

Rule Path
Disallow /

sogouspider

Rule Path
Disallow /

sp_auditbot

Rule Path
Disallow /

spyfu

Rule Path
Disallow /

sysscan

Rule Path
Disallow /

takeout

Rule Path
Disallow /

trendiction.com

Rule Path
Disallow /

trendiction.de

Rule Path
Disallow /

ubermetrics-technologies.com

Rule Path
Disallow /

voyagerx.com

Rule Path
Disallow /

webgains-bot

Rule Path
Disallow /

webmeup-crawler

Rule Path
Disallow /

webpros.com

Rule Path
Disallow /

webprosbot

Rule Path
Disallow /

x09mozilla

Rule Path
Disallow /

x22mozilla

Rule Path
Disallow /

xpymep1.exe

Rule Path
Disallow /

zauba.io

Rule Path
Disallow /

zgrab

Rule Path
Disallow /

pinterestbot/1.0

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

siteauditbot/0.97

Rule Path
Disallow /

siteauditbot

Rule Path
Disallow /

digitalshadowsbot/1.0

Rule Path
Disallow /

repolookoutbot

Rule Path
Disallow /

duckduckgo

Rule Path
Disallow /

zumbot

Rule Path
Disallow /

zumbot/1.0

Rule Path
Disallow /

aspiegelbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

bubing

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

sogou

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

youdaobot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

seekport

Rule Path
Disallow /

spbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

slurp

Rule Path
Disallow /

msnbot

Rule Path
Disallow /

teoma

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

lipperhey

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

aboundexbot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

psbot

Rule Path
Disallow /

pagepeeker

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

censysinspect

Rule Path
Disallow /

compspybot

Rule Path
Disallow /

lssrocketcrawler

Rule Path
Disallow /

pingdom.com_bot

Rule Path
Disallow /

indeed

Rule Path
Disallow /

neustar wpm

Rule Path
Disallow /

python-requests

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

nekstbot

Rule Path
Disallow /

proximic

Rule Path
Disallow /

snbot

Rule Path
Disallow /

smtbot

Rule Path
Disallow /

barkrowler

Rule Path
Disallow /

trendmicro

Rule Path
Disallow /

a6-indexer

Rule Path
Disallow /

alphabot

Rule Path
Disallow /

alphaseobot

Rule Path
Disallow /

blekkobot

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

findxbot

Rule Path
Disallow /

hzgrnbot

Rule Path
Disallow /

mindupbot

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

openwebindex

Rule Path
Disallow /

seoscanners.net

Rule Path
Disallow /

smtbot

Rule Path
Disallow /

trustwave spider

Rule Path
Disallow /

obot

Rule Path
Disallow /

ips-agent

Rule Path
Disallow /

zmeu

Rule Path
Disallow /

xenu

Rule Path
Disallow /

netcraftsurveyagent

Rule Path
Disallow /

robozilla

Rule Path
Disallow /

panscient.com

Rule Path
Disallow /

queryseekerspider

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

snitch

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

spbot

Rule Path
Disallow /

sputnikbot

Rule Path
Disallow /

teleport

Rule Path
Disallow /

trend micro

Rule Path
Disallow /

urlgrabber

Rule Path
Disallow /

xovibot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

site24x7

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

startmebot

Rule Path
Disallow /

fast

Rule Path
Disallow /

wget

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

npbot

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

acapbot

Rule Path
Disallow /

acoonbot

Rule Path
Disallow /

ahrefs

Rule Path
Disallow /

alexibot

Rule Path
Disallow /

asterias

Rule Path
Disallow /

attackbot

Rule Path
Disallow /

backdorbot

Rule Path
Disallow /

becomebot

Rule Path
Disallow /

binlar

Rule Path
Disallow /

blackwidow

Rule Path
Disallow /

blekkobot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

blowfish

Rule Path
Disallow /

bullseye

Rule Path
Disallow /

bunnys

Rule Path
Disallow /

butterfly

Rule Path
Disallow /

careerbot

Rule Path
Disallow /

casper

Rule Path
Disallow /

checkpriv

Rule Path
Disallow /

cheesebot

Rule Path
Disallow /

cherrypick

Rule Path
Disallow /

chinaclaw

Rule Path
Disallow /

choppy

Rule Path
Disallow /

clshttp

Rule Path
Disallow /

cmsworld

Rule Path
Disallow /

copernic

Rule Path
Disallow /

copyrightcheck

Rule Path
Disallow /

cosmos

Rule Path
Disallow /

crescent

Rule Path
Disallow /

cy_cho

Rule Path
Disallow /

datacha

Rule Path
Disallow /

demon

Rule Path
Disallow /

diavol

Rule Path
Disallow /

discobot

Rule Path
Disallow /

dittospyder

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

dotnetdotcom

Rule Path
Disallow /

dumbot

Rule Path
Disallow /

emailcollector

Rule Path
Disallow /

emailsiphon

Rule Path
Disallow /

emailwolf

Rule Path
Disallow /

exabot

Rule Path
Disallow /

extract

Rule Path
Disallow /

eyenetie

Rule Path
Disallow /

feedfinder

Rule Path
Disallow /

flaming

Rule Path
Disallow /

flashget

Rule Path
Disallow /

flicky

Rule Path
Disallow /

foobot

Rule Path
Disallow /

g00g1e

Rule Path
Disallow /

getright

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

go-ahead-got

Rule Path
Disallow /

gozilla

Rule Path
Disallow /

grabnet

Rule Path
Disallow /

grafula

Rule Path
Disallow /

harvest

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

httrack

Rule Path
Disallow /

icarus6j

Rule Path
Disallow /

jetbot

Rule Path
Disallow /

jetcar

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

kmccrew

Rule Path
Disallow /

leechftp

Rule Path
Disallow /

libweb

Rule Path
Disallow /

linkextractor

Rule Path
Disallow /

linkscan

Rule Path
Disallow /

linkwalker

Rule Path
Disallow /

loader

Rule Path
Disallow /

masscan

Rule Path
Disallow /

miner

Rule Path
Disallow /

majestic

Rule Path
Disallow /

mechanize

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

morfeus

Rule Path
Disallow /

moveoverbot

Rule Path
Disallow /

netmechanic

Rule Path
Disallow /

netspider

Rule Path
Disallow /

nicerspro

Rule Path
Disallow /

nikto

Rule Path
Disallow /

ninja

Rule Path
Disallow /

nutch

Rule Path
Disallow /

octopus

Rule Path
Disallow /

pagegrabber

Rule Path
Disallow /

planetwork

Rule Path
Disallow /

postrank

Rule Path
Disallow /

proximic

Rule Path
Disallow /

purebot

Rule Path
Disallow /

pycurl

Rule Path
Disallow /

python

Rule Path
Disallow /

queryn

Rule Path
Disallow /

queryseeker

Rule Path
Disallow /

radian6

Rule Path
Disallow /

radiation

Rule Path
Disallow /

realdownload

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

scooter

Rule Path
Disallow /

seekerspider

Rule Path
Disallow /

semalt

Rule Path
Disallow /

siclab

Rule Path
Disallow /

sindice

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

siteexplorer

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

skygrid

Rule Path
Disallow /

smartdownload

Rule Path
Disallow /

snoopy

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

spankbot

Rule Path
Disallow /

spbot

Rule Path
Disallow /

sqlmap

Rule Path
Disallow /

stackrambler

Rule Path
Disallow /

stripper

Rule Path
Disallow /

sucker

Rule Path
Disallow /

surftbot

Rule Path
Disallow /

sux0r

Rule Path
Disallow /

suzukacz

Rule Path
Disallow /

suzuran

Rule Path
Disallow /

takeout

Rule Path
Disallow /

teleport

Rule Path
Disallow /

telesoft

Rule Path
Disallow /

true_robots

Rule Path
Disallow /

turingos

Rule Path
Disallow /

turnit

Rule Path
Disallow /

vampire

Rule Path
Disallow /

vikspider

Rule Path
Disallow /

voideye

Rule Path
Disallow /

webleacher

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webvac

Rule Path
Disallow /

webviewer

Rule Path
Disallow /

webwhacker

Rule Path
Disallow /

winhttp

Rule Path
Disallow /

wwoofle

Rule Path
Disallow /

woxbot

Rule Path
Disallow /

xaldon

Rule Path
Disallow /

xxxyy

Rule Path
Disallow /

yamanalab

Rule Path
Disallow /

yioopbot

Rule Path
Disallow /

youda

Rule Path
Disallow /

zeus

Rule Path
Disallow /

zmeu

Rule Path
Disallow /

zune

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

spbot

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

yandeximages

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

yeti

Rule Path
Disallow /

yandexvideo

Rule Path
Disallow /

mail.ru_bot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

feedfetcher-google

Rule Path
Disallow /

alphabot

Rule Path
Disallow /

proximic

Rule Path
Disallow /

pagesinventory

Rule Path
Disallow /

grapeshot

Rule Path
Disallow /

teoma

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /

sentibot

Rule Path
Disallow /

linkpadbot

Rule Path
Disallow /

acoon-robot

Rule Path
Disallow /

meanpathbot

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

slurp

Rule Path
Disallow /

duckduckbot

Rule Path
Disallow /

grokkit-crawler

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

grokkit

Rule Path
Disallow /

scoutjet

Rule Path
Disallow /

grouphigh

Rule Path
Disallow /

safednsbot

Rule Path
Disallow /

riddler

Rule Path
Disallow /

memorybot

Rule Path
Disallow /

twingly recon

Rule Path
Disallow /

ips-agent

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

seoscanners.net

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /

linkdex

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

neustar-wombot

Rule Path
Disallow /

findxbot

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

lipperhey

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

nutch

Rule Path
Disallow /

faviconizer

Rule Path
Disallow /

alphaseobot

Rule Path
Disallow /

urlappendbot

Rule Path
Disallow /

boardreader

Rule Path
Disallow /

nerdybot

Rule Path
Disallow /

souppotbot

Rule Path
Disallow /

com.plumanalytics

Rule Path
Disallow /

twiceler

Rule Path
Disallow /

linkdexbot

Rule Path
Disallow /

sogou spider

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

datagnionbot

Rule Path
Disallow /

urlappendbot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

obot

Rule Path
Disallow /

twitterbot

Rule Path
Disallow /

compspybot

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /

compspybot

Rule Path
Disallow /

qwantify

Rule Path
Disallow /

smtbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

stackrambler

Rule Path
Disallow /

lcc

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

lcc

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

adidxbot

Rule Path
Disallow /

gsa

Rule Path
Disallow /

scooter

Rule Path
Disallow /

moget

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

seznambot

Rule Path
Disallow /

ravenbot

Rule Path
Disallow /

serpstatbot

Rule Path
Disallow /

megaindex.ru

Rule Path
Disallow /

coccoc

Rule Path
Disallow /

screaming frog seo spider

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

sogou web spider

Rule Path
Disallow /

proximic

Rule Path
Disallow /

qwantify

Rule Path
Disallow /

archivebot

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

memorybot

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

unwindfetchor

Rule Path
Disallow /

ahoy! the homepage abstraction thing

Rule Path
Disallow /

wbsrch

Rule Path
Disallow /

seoprofiler

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

meanpathbot

Rule Path
Disallow /

leikibot

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

ltx71

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

searchmetricsbot

Rule Path
Disallow /

nutch

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

siteexplorer

Rule Path
Disallow /

surveybot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

grouphigh

Rule Path
Disallow /

yacybot

Rule Path
Disallow /

facebot

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

mbcrawler

Rule Path
Disallow /

yandex

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

yandeximages

Rule Path
Disallow /

sistrix

Rule Path
Disallow /

tubervidbot

Rule Path
Disallow /

googlebot

Rule Path
Disallow /

googlebot/2.1

Rule Path
Disallow /

datadog-agent

Rule Path
Disallow /

simplec

Rule Path
Disallow /

feedfetcher-google

Rule Path
Disallow /

facebookexternalhit

Rule Path
Disallow /

googlebot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

bingbot

Rule Path
Disallow /

Other Records

Field Value
sitemap http://localhost:8080/jspui/sitemap
sitemap http://localhost:8080/jspui/htmlmap
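
Both sitemap records still point at http://localhost:8080/jspui, which suggests the placeholder was never replaced with the public base URL. A minimal Python sketch of how such entries could be flagged follows. The function name `local_sitemaps`, the `LOCAL_HOSTS` set, and the `SAMPLE` text are illustrative assumptions, with `SAMPLE` reconstructed from the two records above rather than copied from the live file.

```python
from urllib.parse import urlsplit

# Hosts treated as "not publicly reachable" (an assumption for this sketch).
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def local_sitemaps(robots_text):
    """Return Sitemap URLs in robots_text whose host is a loopback name."""
    found = []
    for raw in robots_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if not sep or field.strip().lower() != "sitemap":
            continue
        url = value.strip()
        if urlsplit(url).hostname in LOCAL_HOSTS:
            found.append(url)
    return found


# Hypothetical excerpt reconstructed from the records listed above.
SAMPLE = """\
Sitemap: http://localhost:8080/jspui/sitemap
Sitemap: http://localhost:8080/jspui/htmlmap
"""

if __name__ == "__main__":
    for url in local_sitemaps(SAMPLE):
        print("unreachable sitemap:", url)
```

A crawler outside the server cannot resolve these URLs, so the sitemap records are effectively ignored until the base URL is corrected.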

Comments

  • The FULL URL to the DSpace sitemaps
  • The http://localhost:8080/jspui will be auto-filled with the value in dspace.cfg
  • XML sitemap is listed first as it is preferred by most search engines
  • Default Access Group
  • (NOTE: blank lines are not allowable in a group record)
  • Disable access to Discovery search and filters
  • Optionally uncomment the following line ONLY if sitemaps are working and you have verified that your site is being indexed correctly.
  • Uncomment on 28-11-2022
  • If you have configured DSpace (Solr-based) Statistics to be publicly accessible, then you may not want this content to be indexed
  • Disallow: /statistics
  • You also may wish to disallow access to the following paths, in order to stop web spiders from accessing user-based content
  • Disallow: /contact
  • Disallow: /feedback
  • Disallow: /forgot
  • Disallow: /login
  • Disallow: /register
  • Section for misbehaving bots
  • The following directives to block specific robots were borrowed from Wikipedia's robots.txt
  • advertising-related bots:
  • Crawlers that are kind enough to obey, but which we'd rather not have unless they're feeding search engines.
  • Some bots are known to be trouble, particularly those designed to copy entire sites. Please obey robots.txt.
  • ADDED ON 20-12-2022
  • User-agent: Googlebot
  • Disallow: /
  • ADDED BY SP
  • Misbehaving: requests much too fast:
  • If your DSpace is going down because of someone using recursive wget, you can activate the following rule.
  • If your own faculty is bringing down your DSpace with recursive wget, you can advise them to use the --wait option to set the delay between hits.
  • The 'grub' distributed client has been *very* poorly behaved.
  • Doesn't follow robots.txt anyway, but...
  • Hits many times per second, not acceptable
  • http://www.nameprotect.com/botinfo.html
  • A capture bot, downloads gazillions of pages with no public benefit
  • http://www.webreaper.net/
  • 08-04-2024
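
To see how rules like these are interpreted by a client, here is a minimal sketch using Python's standard `urllib.robotparser`. The `EXCERPT` text is a hypothetical reconstruction of a few groups from this report, not the live file; in particular, placing `Crawl-delay: 600` inside the default group and the agent name `examplebot` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt; the real file is far longer and may be grouped differently.
EXCERPT = """\
User-agent: *
Crawl-delay: 600
Disallow: /discover
Disallow: /simple-search
Disallow: /browse
Disallow: /handle

User-agent: webreaper
Disallow: /
"""

BASE = "https://shodhganga.inflibnet.ac.in"


def build_parser(text):
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rp = build_parser(EXCERPT)

# An agent with its own group is blocked everywhere.
webreaper_home = rp.can_fetch("webreaper", BASE + "/")

# An unlisted agent falls back to the default (*) group.
generic_discover = rp.can_fetch("examplebot", BASE + "/discover")
generic_other = rp.can_fetch("examplebot", BASE + "/about")

# Delay, in seconds, that the default group asks unlisted agents to observe.
generic_delay = rp.crawl_delay("examplebot")

if __name__ == "__main__":
    print(webreaper_home, generic_discover, generic_other, generic_delay)
```

A well-behaved client would check `can_fetch` before each request and sleep for the reported delay between hits, which is the same courtesy the comments ask of recursive wget users via `--wait`.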

Warnings

  • 14 invalid lines.