haiti.loopnews.com
robots.txt

Robots Exclusion Standard data for haiti.loopnews.com

Resource Scan

Scan Details

Site Domain	haiti.loopnews.com
Base Domain	loopnews.com
Scan Status	Ok
Last Scan	2024-11-01T14:05:37+00:00
Next Scan	2024-11-08T14:05:37+00:00

Last Scan

Scanned	2024-11-01T14:05:37+00:00
URL	https://haiti.loopnews.com/robots.txt
Domain IPs	52.188.134.210
Response IP	52.188.134.210
Found	Yes
Hash	65beb437f5e621e4f0d12ff2349aa0b828c2a57bb1b6a046b3e75eab2bed7ad5
SimHash	7c969d4bc260
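
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest of the fetched file, although the report does not name the algorithm. A minimal sketch of how such a digest could be computed, assuming SHA-256 over the raw response bytes; the helper name `robots_digest` and the sample bytes are illustrative and not taken from the scanner:

```python
import hashlib


def robots_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


# Illustrative bytes only; the real file would first be fetched from
# https://haiti.loopnews.com/robots.txt and hashed as raw bytes.
sample = b"User-agent: *\nDisallow: /\n"
digest = robots_digest(sample)
print(len(digest))  # a SHA-256 hex digest always has 64 characters
```

Comparing the digest between scans is one simple way to tell whether the file changed; any single-byte difference produces a different value.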

Groups

*

Rule	Path
Disallow	/

*

Rule	Path
Disallow	/content/40000-bail-man-fraudulent-conversion-charge

googlebot

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	30

googlebot-image

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	30

googlebot-video

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	30

googlebot-news

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	30

googlebot-mobile

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	30

mediapartners-google

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	30

adsbot-google

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	30

twitterbot

Rule	Path
Disallow

Other Records

Field	Value
crawl-delay	30

facebookexternalhit

Rule	Path
Disallow

Other Records

Field	Value
crawl-delay	30

facebookexternalhit/1.1

Rule	Path
Disallow

Other Records

Field	Value
crawl-delay	30

trendictionbot0.5.0

Rule	Path
Disallow
Allow	/core/*.css$
Allow	/core/*.css?
Allow	/core/*.js$
Allow	/core/*.js?
Allow	/core/*.gif
Allow	/core/*.jpg
Allow	/core/*.jpeg
Allow	/core/*.png
Allow	/core/*.svg
Allow	/profiles/*.css$
Allow	/profiles/*.css?
Allow	/profiles/*.js$
Allow	/profiles/*.js?
Allow	/profiles/*.gif
Allow	/profiles/*.jpg
Allow	/profiles/*.jpeg
Allow	/profiles/*.png
Allow	/profiles/*.svg
Disallow	/core/
Disallow	/profiles/
Disallow	/README.txt
Disallow	/web.config
Disallow	/admin/
Disallow	/comment/reply/
Disallow	/filter/tips/
Disallow	/node/add/
Disallow	/search/
Disallow	/user/register/
Disallow	/user/password/
Disallow	/user/login/
Disallow	/user/logout/
Disallow	/index.php/admin/
Disallow	/index.php/comment/reply/
Disallow	/index.php/filter/tips/
Disallow	/index.php/node/add/
Disallow	/index.php/search/
Disallow	/index.php/user/password/
Disallow	/index.php/user/register/
Disallow	/index.php/user/login/
Disallow	/index.php/user/logout/
Disallow	/content/hotelier-freed-fraud-charges
Disallow	/content/spelling-mix-former-hotel-manager-innocent-says-attorney
Disallow	/content/prominent-greek-national-freed-financial-crime-charge-locally
Disallow	/nl
Disallow	/fr
Disallow	/index.php/nl
Disallow	/index.php/fr

Other Records

Field	Value
crawl-delay	30
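
The trendictionbot0.5.0 group above mixes Allow and Disallow rules for the same directories and uses the `*` and `$` wildcards. Under RFC 9309, `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow is preferred on a tie. The sketch below illustrates that evaluation on a few rules copied from the table. The function names are hypothetical, and percent-encoding normalisation is deliberately left out, so this is an approximation rather than a complete matcher:

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Apply longest-match precedence; Allow wins ties; no match means allowed."""
    best_len, allowed = -1, True
    for kind, pattern in rules:
        if not pattern:  # an empty path matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


# A subset of the rules listed for this group.
RULES = [
    ("disallow", ""),
    ("allow", "/core/*.css$"),
    ("allow", "/core/*.css?"),
    ("disallow", "/core/"),
    ("disallow", "/admin/"),
]

for path in ("/core/misc/style.css", "/core/install.php", "/admin/people", "/content/story"):
    print(path, is_allowed(RULES, path))
```

With these rules, stylesheets under `/core/` remain fetchable while the rest of `/core/` and `/admin/` are blocked, which appears to be the intent of the Allow lines: crawlers can render pages without gaining access to application internals.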

scooperbot

Rule	Path
Disallow	/

special_archiver

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

eyeotabot

Rule	Path
Disallow	/

eyeotabot

Rule	Path
Disallow	/

bingbot

Rule	Path
Disallow	/

mediapartners-googlebot

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	30

googlebot

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	30

yandexbot

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	30

kauaibot

Rule	Path
Allow	/

Other Records

Field	Value
crawl-delay	30

mediatoolkitbot

Rule	Path
Disallow	/

criteobot

Rule	Path
Disallow	/

blp_bbot

Rule	Path
Disallow	/

petalbot

Rule	Path
Disallow	/

archive.org_bot

Rule	Path
Disallow	/

nimbostratus-bot

Rule	Path
Disallow	/

tweetmemebot

Rule	Path
Disallow	/

barkrowler

Rule	Path
Disallow	/

semrushbot-bm

Rule	Path
Disallow	/

yandexbot

Rule

Path

Disallow

mauibot

Rule

Path

Disallow

crsspxlbot

Rule

Path

Disallow

aasa-bot

Rule

Path

Disallow

demandbasepublisheranalyzer

Rule

Path

Disallow

mojeekbot

Rule

Path

Disallow

bomborabot

Rule

Path

Disallow

seznambot

Rule

Path

Disallow

blp_bbot/0.1

Rule

Path

Disallow

ahrefsbot

Rule

Path

Disallow

moatbot

Rule

Path

Disallow

aaabot

Rule

Path

Disallow

anderspinkbot

Rule

Path

Disallow

seekport

Rule

Path

Disallow

voluumdsp-content-bot

Rule

Path

Disallow

special_archiver

Rule

Path

Disallow

megaindex.ru

Rule

Path

Disallow

ias-va

Rule

Path

Disallow

applebot

Rule

Path

Disallow

admantx-ussy04

Rule

Path

Disallow

python-requests

Rule

Path

Disallow

semanticbot

Rule

Path

Disallow

feedio.co feed crawler

Rule

Path

Disallow

gumgum

Rule

Path

Disallow

ias-ir

Rule

Path

Disallow

ias-or

Rule

Path

Disallow

ias-sg

Rule

Path

Disallow
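
To see how a crawler might interpret groups like those listed above, the sketch below feeds a small excerpt to Python's standard `urllib.robotparser`. The excerpt is reconstructed from four of the groups in this report (the catch-all `*`, googlebot, twitterbot and semrushbot), not copied from the live file, and the test URL and the `SomeOtherBot` agent are hypothetical:

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt (an assumption based on the tables above,
# not the verbatim contents of the live robots.txt).
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /

User-agent: Googlebot
Allow: /
Crawl-delay: 30

User-agent: Twitterbot
Disallow:
Crawl-delay: 30

User-agent: SemrushBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

url = "https://haiti.loopnews.com/content/example-story"  # hypothetical URL
for agent in ("Googlebot", "Twitterbot", "SemrushBot", "SomeOtherBot"):
    # can_fetch() reports whether the agent may request the URL;
    # crawl_delay() returns the group's Crawl-delay or None.
    print(agent, parser.can_fetch(agent, url), parser.crawl_delay(agent))
```

Two details are worth noting. A `Disallow` line with an empty path, as shown for twitterbot and facebookexternalhit, blocks nothing, so those agents are effectively allowed everywhere. Agents without their own group fall back to the `*` group, which here disallows the whole site. `urllib.robotparser` matches paths by prefix only and does not interpret the `*` and `$` wildcards, so it is not a suitable check for the trendictionbot0.5.0 rules.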

Comments

robots.txt
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
This file will be ignored unless it is at the root of your host:
Used: http://example.com/robots.txt
Ignored: http://example.com/site/robots.txt
For more information about the robots.txt standard, see:
http://www.robotstxt.org/robotstxt.html
Crawl-delay: 10
CSS, JS, Images
Directories
Files
Paths (clean URLs)
Paths (no clean URLs)
GSC
carma.com Bot
Archive bot
SemrushBot bot
MJ12bot bot
BLEXBot bot
DotBot bot
Eyeotabot bot
Eyeotabot bot
bingbot bot
Crawl-delay: 30
mediapartners-googlebot
Googlebot bot
YandexBot bot
KauaiBot bot
Mediatoolkitbot bot
CriteoBot bot
BLP_bbot bot
PetalBot bot
archive.org_bot bot
Nimbostratus-Bot bot
TweetmemeBot bot
TweetmemeBot bot
SemrushBot-BM bot
YandexBot bot
MauiBot bot
CrsspxlBot bot
AASA-Bot bot
DemandbasePublisherAnalyzer bot
MojeekBot bot
BomboraBot bot
SeznamBot bot
BLP_bbot/0.1 bot
AhrefsBot
moatbot
