elnoticieroenlinea.com
robots.txt

Robots Exclusion Standard data for elnoticieroenlinea.com

Resource Scan

Scan Details

Site Domain	elnoticieroenlinea.com
Base Domain	elnoticieroenlinea.com
Scan Status	Failed
Failure Reason	Scan timed out.
Last Scan	2024-09-17T21:20:16+00:00
Next Scan	2024-10-01T21:20:16+00:00

Last Successful Scan

Scanned	2024-08-10T21:19:26+00:00
URL	https://elnoticieroenlinea.com/robots.txt
Domain IPs	198.175.150.30
Response IP	198.175.150.30
Found	Yes
Hash	53435ddda9831f66590c278b3ce111a2fa975c99faea591ec57434a553c7bf00
SimHash	b8963f02b924

Groups

*

Rule	Path
Disallow	/node/
Disallow	/cdb/
Disallow	/wp-content/
Disallow	/sites/
Disallow	/Topicos/
Disallow	/7198/
Disallow	/noticias/
Disallow	/bbtstats/
Disallow	/bbtfile/
Disallow	/feed/
Disallow	/rss7/
Disallow	/rss10/
Disallow	/MediaCenter/
Disallow	/portal/
Disallow	/infografia/*
Disallow	/5644/*
Disallow	/wp-admin
Allow	/wp-admin/admin-ajax.php
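
How the `Allow` line interacts with the broader `Disallow /wp-admin` rule depends on the crawler. The sketch below assumes the longest-match precedence described in RFC 9309 (the most specific matching pattern wins, and `Allow` wins a tie) together with `*` wildcard support. The names `RULES`, `_to_regex`, and `is_allowed` are hypothetical and are not part of the scan report, and individual crawlers may evaluate these rules differently.

```python
import re

# Rules copied from the "*" group above, in file order.
RULES = [
    ("disallow", "/node/"),
    ("disallow", "/cdb/"),
    ("disallow", "/wp-content/"),
    ("disallow", "/sites/"),
    ("disallow", "/Topicos/"),
    ("disallow", "/7198/"),
    ("disallow", "/noticias/"),
    ("disallow", "/bbtstats/"),
    ("disallow", "/bbtfile/"),
    ("disallow", "/feed/"),
    ("disallow", "/rss7/"),
    ("disallow", "/rss10/"),
    ("disallow", "/MediaCenter/"),
    ("disallow", "/portal/"),
    ("disallow", "/infografia/*"),
    ("disallow", "/5644/*"),
    ("disallow", "/wp-admin"),
    ("allow", "/wp-admin/admin-ajax.php"),
]


def _to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under longest-match precedence."""
    best_len = -1
    allowed = True  # a path that matches no rule is allowed
    for kind, pattern in rules:
        if _to_regex(pattern).match(path):  # match() anchors at the path start
            length = len(pattern)
            # The longer pattern wins; on equal length, Allow wins.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len = length
                allowed = kind == "allow"
    return allowed


print(is_allowed("/wp-admin/admin-ajax.php"))
print(is_allowed("/wp-admin/options.php"))
```

Path matching is case-sensitive, so under this sketch `/Topicos/` is blocked while a lower-case `/topicos/` is not.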

genio

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

scooperbot

Rule	Path
Disallow	/

seekportbot

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

flamingo_searchengine

Rule	Path
Disallow	/

facebot

Rule	Path
Disallow	/

luminatebot

Rule	Path
Disallow	/

vagabondo

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

r6_commentreader

Rule	Path
Disallow	/

yeti

Rule	Path
Disallow	/

heritrix

Rule	Path
Disallow	/

baiduspider

Rule	Path
Disallow	/

showyoubot

Rule	Path
Disallow	/

gozaikbot

Rule	Path
Disallow	/

python-requests

Rule	Path
Disallow	/

queryseekerspider

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

yandeximages

Rule	Path
Disallow	/

apache-httpclient

Rule	Path
Disallow	/

piplbot

Rule	Path
Disallow	/

scrapy

Rule	Path
Disallow	/

buck

Rule	Path
Disallow	/

wikido

Rule	Path
Disallow	/

zoominfobot

Rule	Path
Disallow	/

sogou

Rule	Path
Disallow	/

zend_http_client

Rule	Path
Disallow	/

robots

Rule	Path
Disallow	/

arquivo-web-crawler

Rule	Path
Disallow	/

bidswitchbot

Rule	Path
Disallow	/

g-i-g-a-b-o-t

Rule	Path
Disallow	/

gigabot

Rule	Path
Disallow	/

garlikcrawler

Rule	Path
Disallow	/

caam

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

clickagy intelligence bot

Rule	Path
Disallow	/

jersey

Rule	Path
Disallow	/

libwww-perl

Rule	Path
Disallow	/

ltx71

Rule	Path
Disallow	/

omgili

Rule	Path
Disallow	/

piplbot

Rule	Path
Disallow	/

python-urllib

Rule	Path
Disallow	/

zoominfobot

Rule	Path
Disallow	/

siteauditbot

Rule	Path
Disallow	/

semrushbot-ba

Rule	Path
Disallow	/

semrushbot-si

Rule	Path
Disallow	/

semrushbot-swa

Rule	Path
Disallow	/

semrushbot-ct

Rule	Path
Disallow	/

semrushbot-bm

Rule	Path
Disallow	/

splitsignalbot

Rule	Path
Disallow	/

semrushbot-coub

Rule	Path
Disallow	/lp
Disallow	/de-de/lp
Disallow	/en-au/lp
Disallow	/en-ca/lp
Disallow	/en-gb/lp
Disallow	/en-in/lp
Disallow	/es-es/lp
Disallow	/es-la/lp
Disallow	/fr-fr/lp
Disallow	/it-it/lp
Disallow	/ja-jp/lp
Disallow	/ko-kr/lp
Disallow	/pt-br/lp
Disallow	/zh-cn/lp
Disallow	/zh-tw/lp
Disallow	/feedback
Disallow	/de-de/feedback
Disallow	/en-au/feedback
Disallow	/en-ca/feedback
Disallow	/en-gb/feedback
Disallow	/en-in/feedback
Disallow	/es-es/feedback
Disallow	/es-la/feedback
Disallow	/fr-fr/feedback
Disallow	/it-it/feedback
Disallow	/ja-jp/feedback
Disallow	/ko-kr/feedback
Disallow	/pt-br/feedback
Disallow	/zh-cn/feedback
Disallow	/zh-tw/feedback

Other Records

Field	Value
sitemap	https://www.elnoticieroenlinea.com/sitemap/sitemap-articles-index.xml
sitemap	https://www.elnoticieroenlinea.com/sitemap/sitemap-google-news-index.xml
sitemap	https://www.elnoticieroenlinea.com/sitemap/sitemap-tags-index.xml
sitemap	https://www.elnoticieroenlinea.com/sitemap/sitemap-images-index.xml
sitemap	https://www.elnoticieroenlinea.com/sitemap/sitemap-videos-index.xml
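
These `sitemap` records sit outside any user-agent group, so they can be collected with a single pass over the file. The following is a minimal sketch, assuming the usual `Field: value` line syntax with `#` comments; the function name `sitemap_urls` and the `SAMPLE` text are hypothetical, and the sample reuses two of the URLs listed above rather than the full live file.

```python
def sitemap_urls(robots_txt):
    """Collect the values of all Sitemap records in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop comments. This assumes sitemap URLs carry no '#' fragment.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are case-insensitive; the URL itself is kept as written.
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


SAMPLE = """\
# illustrative excerpt only
User-agent: *
Disallow: /feed/
Sitemap: https://www.elnoticieroenlinea.com/sitemap/sitemap-articles-index.xml
sitemap: https://www.elnoticieroenlinea.com/sitemap/sitemap-videos-index.xml
"""

for url in sitemap_urls(SAMPLE):
    print(url)
```

Only the first colon splits the field from the value, which keeps the `https:` in each URL intact.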

Comments

robots.txt
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
This file will be ignored unless it is at the root of your host:
Used: http://example.com/robots.txt
Ignored: http://example.com/site/robots.txt
For more information about the robots.txt standard, see:
http://www.robotstxt.org/wc/robots.html
For syntax checking, see:
http://www.sxw.org.uk/computing/robots/check.html
lp
feedback
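
The comment above notes that the file only takes effect at the root of the host. A minimal sketch of that rule, using Python's standard `urllib.parse`; the helper name `robots_url` is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url):
    """Return the root-level robots.txt URL that governs `page_url`."""
    parts = urlsplit(page_url)
    # Keep scheme and host; replace the path and drop query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://example.com/site/robots.txt"))
print(robots_url("https://elnoticieroenlinea.com/noticias/"))
```

Both calls resolve to the `/robots.txt` at the host root, which is the only location a compliant crawler consults.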
