humanitarianlibrary.org
robots.txt

Robots Exclusion Standard data for humanitarianlibrary.org

Resource Scan

Scan Details

Site Domain	humanitarianlibrary.org
Base Domain	humanitarianlibrary.org
Scan Status	Ok
Last Scan	2024-09-08T16:07:27+00:00
Next Scan	2024-10-08T16:07:27+00:00

Last Scan

Scanned	2024-09-08T16:07:27+00:00
URL	https://humanitarianlibrary.org/robots.txt
Domain IPs	104.21.94.152, 172.67.137.104, 2606:4700:3034::6815:5e98, 2606:4700:3037::ac43:8968
Response IP	172.67.137.104
Found	Yes
Hash	9901d6f8f7796823a3e267e25e8790b7ed323a18edd3dd51020a74d0389ee479
SimHash	38961d1af678

Groups

*

Rule	Path
Allow	/misc/*.css$
Allow	/misc/*.css?
Allow	/misc/*.js$
Allow	/misc/*.js?
Allow	/misc/*.gif
Allow	/misc/*.jpg
Allow	/misc/*.jpeg
Allow	/misc/*.png
Allow	/modules/*.css$
Allow	/modules/*.css?
Allow	/modules/*.js$
Allow	/modules/*.js?
Allow	/modules/*.gif
Allow	/modules/*.jpg
Allow	/modules/*.jpeg
Allow	/modules/*.png
Allow	/profiles/*.css$
Allow	/profiles/*.css?
Allow	/profiles/*.js$
Allow	/profiles/*.js?
Allow	/profiles/*.gif
Allow	/profiles/*.jpg
Allow	/profiles/*.jpeg
Allow	/profiles/*.png
Allow	/themes/*.css$
Allow	/themes/*.css?
Allow	/themes/*.js$
Allow	/themes/*.js?
Allow	/themes/*.gif
Allow	/themes/*.jpg
Allow	/themes/*.jpeg
Allow	/themes/*.png
Disallow	/includes/
Disallow	/misc/
Disallow	/modules/
Disallow	/profiles/
Disallow	/scripts/
Disallow	/themes/
Disallow	/CHANGELOG.txt
Disallow	/cron.php
Disallow	/INSTALL.mysql.txt
Disallow	/INSTALL.pgsql.txt
Disallow	/INSTALL.sqlite.txt
Disallow	/install.php
Disallow	/INSTALL.txt
Disallow	/LICENSE.txt
Disallow	/MAINTAINERS.txt
Disallow	/update.php
Disallow	/UPGRADE.txt
Disallow	/xmlrpc.php
Disallow	/admin/
Disallow	/comment/reply/
Disallow	/filter/tips/
Disallow	/node/add/
Disallow	/search/
Disallow	/user/register/
Disallow	/user/password/
Disallow	/user/login/
Disallow	/user/logout/
Disallow	/?q=admin%2F
Disallow	/?q=comment%2Freply%2F
Disallow	/?q=filter%2Ftips%2F
Disallow	/?q=node%2Fadd%2F
Disallow	/?q=search%2F
Disallow	/?q=user%2Fpassword%2F
Disallow	/?q=user%2Fregister%2F
Disallow	/?q=user%2Flogin%2F
Disallow	/?q=user%2Flogout%2F
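
The table above pairs narrow `Allow` patterns (for example `/misc/*.css$`) with broad `Disallow` prefixes (for example `/misc/`). The sketch below shows one way a crawler might resolve such overlaps. It assumes the "most specific match wins, `Allow` wins ties" convention, treats `*` as "any run of characters" and a trailing `$` as "end of path". The names `RULES`, `pattern_to_regex` and `is_allowed` are hypothetical, `RULES` is only a subset of the group, and percent-encoding normalization is left out.

```python
import re

# A hand-picked subset of the "*" group listed above (not the full rule set).
RULES = [
    ("Allow", "/misc/*.css$"),
    ("Allow", "/misc/*.css?"),
    ("Allow", "/misc/*.png"),
    ("Disallow", "/misc/"),
    ("Disallow", "/admin/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the path's first character.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules):
    """Return True if `path` may be fetched under `rules`.

    Assumption: the longest matching pattern wins, and Allow beats
    Disallow when two matching patterns have the same length.
    """
    best = None  # (pattern length, is_allow) of the best match so far
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "Allow")
            if best is None or candidate > best:
                best = candidate
    # No matching rule means the path is not restricted.
    return True if best is None else best[1]
```

Under these assumptions, `/misc/drupal.css` and `/misc/drupal.css?v=1` are allowed because the longer `Allow` patterns outrank `Disallow /misc/`, while `/misc/README.txt` is blocked because only the `Disallow` prefix matches.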

Other Records

Field	Value
crawl-delay	10

User-agent	Rule	Path
seznambot	Disallow	/
viglink	Disallow	/
dotbot	Disallow	/
mj12bot	Disallow	/
sputnikbot	Disallow	/
sputnik	Disallow	/
sputnikbot/2.3	Disallow	/
curious george	Disallow	/
curious george - www.analyticsseo.com	Disallow	/
curious george - www.analyticsseo.com/crawler	Disallow	/
http://site.ru	Disallow	/
site.ru	Disallow	/
ahrefsbot	Disallow	/
mauibot	Disallow	/
seokicks	Disallow	/
blexbot	Disallow	/
bubing	Disallow	/
scoutjet	Disallow	/
grapeshot	Disallow	/
grapeshotcrawler	Disallow	/
seekport crawler	Disallow	/
zoominfobot	Disallow	/
barkrowler	Disallow	/
datanyze	Disallow	/
yisouspider	Disallow	/
adscanner	Disallow	/
genieo	Disallow	/
genieo/1.0	Disallow	/
the knowledge ai	Disallow	/
zgrab	Disallow	/
zoominfobot	Disallow	/
sogou spider	Disallow	/
blekkobot	Disallow	/
blexbot	Disallow	/
seobility	Disallow	/
baiduspider	Disallow	/
baiduspider/2.0	Disallow	/
yandex	Disallow	/
yandexbot/3.0	Disallow	/
istellabot	Disallow	/
istellabot/1.01.18	Disallow	/
istellabot/1.01.18 +http://www.tiscali.it/	Disallow	/
istellabot/1.10.2 +http://www.tiscali.it/	Disallow	/
mozilla/5.0 (compatible; istellabot/1.01.18 +http://www.tiscali.it/)	Disallow	/
cliqzbot	Disallow	/
sogou web spider/4.0	Disallow	/

User-agent	Rule	Path
grouphigh	Disallow	
grouphigh/1.0	Disallow	
coccocbot-web	Disallow	
ccbot	Disallow	
mail.ru	Disallow	
grapeshotcrawler	Disallow	
ia_archiver	Disallow	
james bot	Disallow	
leikibot	Disallow	
libcurl	Disallow	
linkdexbot	Disallow	
lipperhey	Disallow	
livelap	Disallow	
lssrocket	Disallow	
magpie	Disallow	
uptimebot	Disallow	
gluten free crawler/1.0	Disallow	
serpstatbot/1.0	Disallow	
domaincrawler	Disallow	
steeler	Disallow	
steeler/3.5	Disallow	
daum	Disallow	
arquivo-web-crawler	Disallow	
awariosmartbot	Disallow	
mojeekbot	Disallow	
yeti	Disallow	
yeti-mobile	Disallow	
mr.4x3 powered	Disallow	
sjuupbot	Disallow	
viglink	Disallow	
pi-monster	Disallow	
tracemyfile/1.0	Disallow	
xenu's link sleuth 1.1c	Disallow	
obot/2.3.1	Disallow	
cowbot/1.0	Disallow	
deskyobot	Disallow	
deskyobot/1.0	Disallow	
ltx71+-+(http://ltx71.com/)	Disallow	
pandalytics/1.0, ccbot/2.0, surdotlybot, cincraw/1.0, twingly recon-klondike/1.0, yak/1.0, df bot 1.0, crawlson/1.0, ioncrawl, ltx71, woorankreview/2.0, dataforseobot/1.0	Disallow	
arquivo-web-crawler	Disallow	
petalbot	Disallow	
filibot/1.0	Disallow	
panscient.com	Disallow	
mediatoolkitbot	Disallow	
obot/2.3.1	Disallow	
barkrowler/0.9	Disallow	
mauibot (crawler.feedback+wc@gmail.com)	Disallow	
Comments

robots.txt
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
This file will be ignored unless it is at the root of your host:
Used: http://example.com/robots.txt
Ignored: http://example.com/site/robots.txt
For more information about the robots.txt standard, see:
http://www.robotstxt.org/robotstxt.html
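
The comment above states that robots.txt is only honored at the root of the host. A minimal sketch of deriving that location from any page URL, using Python's standard `urllib.parse`, might look like this. The function name `robots_url` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the root-level robots.txt URL for the host serving `page_url`.

    The scheme and host (including any port) are kept; the path is
    replaced with /robots.txt, and the query and fragment are dropped.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))
```

For the two URLs in the comment, both `http://example.com/robots.txt` and `http://example.com/site/robots.txt` map to `http://example.com/robots.txt`, which is the only copy a crawler would consult.
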
CSS, JS, Images
Directories
Files
Paths (clean URLs)
Paths (no clean URLs)
User-Agent: aranhabot
Crawl-delay: 10
Blekkobot
Block BlexBot
Baiduspider
Yandex
"Yeti/1.0 (NHN Corp.; http://help.naver.com/robots/)"
"Mozilla/5.0 (iPhone; CPU iPhone OS 5_0_1 like Mac OS X) (compatible;Yeti-Mobile/0.1; +http://help.naver.com/robots/)"
