compostela24horas.com
robots.txt

Robots Exclusion Standard data for compostela24horas.com

Resource Scan

Scan Details

Site Domain compostela24horas.com
Base Domain compostela24horas.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Couldn't connect to server.
Last Scan 2024-10-29T16:44:30+00:00
Next Scan 2024-11-28T16:44:30+00:00

Last Successful Scan

Scanned 2024-09-30T15:31:13+00:00
URL https://www.compostela24horas.com/robots.txt
Domain IPs 79.143.93.75
Response IP 27.0.175.51
Found Yes
Hash cda473df0d5f426eaeeb6bd0b28745264597f6e34e7cc47ebf61fcbe5369c6e3
SimHash 5b967040cde6

Groups

*

Rule Path
Disallow /apd/
Disallow /search
Disallow /texto-diario/print
Disallow /file/download/
Allow /
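
The effect of the `*` group can be checked with a short sketch using Python's standard-library urllib.robotparser. The robots.txt lines are reconstructed from the table above, which is an assumption because the report does not show the raw file; the agent name ExampleBot and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's "*" group; the raw file is not shown,
# so the exact spelling and order of these lines is an assumption.
STAR_GROUP = [
    "User-agent: *",
    "Disallow: /apd/",
    "Disallow: /search",
    "Disallow: /texto-diario/print",
    "Disallow: /file/download/",
    "Allow: /",
]

BASE = "https://www.compostela24horas.com"


def check_paths(lines, agent, paths):
    """Return {path: allowed} for `agent` under the given robots.txt lines."""
    parser = RobotFileParser()
    parser.parse(lines)
    return {path: parser.can_fetch(agent, BASE + path) for path in paths}


# "ExampleBot" and the sample paths are hypothetical.
results = check_paths(
    STAR_GROUP,
    "ExampleBot",
    ["/", "/search", "/apd/item", "/file/download/1", "/noticias"],
)
for path, allowed in results.items():
    print(path, "allowed" if allowed else "blocked")
```

Rules are prefix matches, so `Disallow /search` also covers longer paths that begin with /search.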

archive.org_bot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

ia_archiver

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

arquivo-web-crawler

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

rogerbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

semrushbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

gofeed

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

dotbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

ccbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10

bingbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
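
A crawler that honors crawl-delay could read the value as in the sketch below, again with urllib.robotparser (crawl_delay is available from Python 3.6). The fragment is hypothetical: it pairs the bingbot group from this report with a single rule from the `*` group, to show that an agent with its own group no longer inherits the `*` rules. ExampleBot is a made-up agent name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical fragment, not the site's raw file: one crawl-delay group
# from the report plus one rule from the "*" group.
FRAGMENT = [
    "User-agent: *",
    "Disallow: /search",
    "",
    "User-agent: bingbot",
    "Crawl-delay: 10",
]

SEARCH_URL = "https://www.compostela24horas.com/search"

parser = RobotFileParser()
parser.parse(FRAGMENT)

# bingbot has its own group: a delay, and no path rules at all.
bing_delay = parser.crawl_delay("bingbot")
bing_search = parser.can_fetch("bingbot", SEARCH_URL)

# An agent without its own group falls back to the "*" group.
other_delay = parser.crawl_delay("ExampleBot")
other_search = parser.can_fetch("ExampleBot", SEARCH_URL)

print("bingbot:", bing_delay, bing_search)
print("ExampleBot:", other_delay, other_search)
```

Crawl-delay is not part of the core Robots Exclusion Protocol, so whether and how a given crawler applies the 10-second value is up to that crawler.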

yahoo pipes 1.0

Rule Path
Disallow /

voltron

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

addthis.com

Rule Path
Disallow /

admantx

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /
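
ahrefsbot is listed twice in this report: once with only crawl-delay 10 and once with Disallow /. Parsers can disagree on such input. The sketch below compares Python's urllib.robotparser, which answers from the first group that names the agent, with a simplified helper that combines the rules of every matching group, which is the approach RFC 9309 describes. The fragment's order is assumed from the order of this report, and collect_rules and merged_allows are hypothetical helpers, not a complete parser.

```python
from urllib.robotparser import RobotFileParser

# Assumed order: the crawl-delay group first, as listed in the report.
FRAGMENT = """\
User-agent: ahrefsbot
Crawl-delay: 10

User-agent: ahrefsbot
Disallow: /
"""


def collect_rules(text, agent):
    """Gather (field, path) pairs from every group that names `agent`."""
    rules, agents, in_rules = [], [], False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a user-agent line after other records starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        else:
            in_rules = True
            if field in ("allow", "disallow") and agent.lower() in agents:
                rules.append((field, value))
    return rules


def merged_allows(text, agent, path):
    """Longest matching prefix wins, Allow wins ties, no match means allowed."""
    best_field, best_path = "allow", ""
    for field, rule_path in collect_rules(text, agent):
        if not rule_path or not path.startswith(rule_path):
            continue
        longer = len(rule_path) > len(best_path)
        tie = len(rule_path) == len(best_path) and field == "allow"
        if longer or tie:
            best_field, best_path = field, rule_path
    return best_field == "allow"


parser = RobotFileParser()
parser.parse(FRAGMENT.splitlines())
first_match = parser.can_fetch("ahrefsbot", "https://www.compostela24horas.com/noticias")
merged = merged_allows(FRAGMENT, "ahrefsbot", "/noticias")
print("first matching group:", first_match)
print("merged groups:", merged)
```

The two answers differ here, so the effective policy for ahrefsbot depends on which behavior the crawler implements.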

bdcbot

Rule Path
Disallow /

bender

Rule Path
Disallow /

bixocrawler

Rule Path
Disallow /

bl.uk_lddc_bot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

bubing

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

cncdialer

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

crystalsemanticsbot

Rule Path
Disallow /

cyberalert

Rule Path
Disallow /

digext

Rule Path
Disallow /

discobot

Rule Path
Disallow /

discoverybot

Rule Path
Disallow /

dloader

Rule Path
Disallow /

dloader(naverrobot)

Rule Path
Disallow /

doc

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

dts agent

Rule Path
Disallow /

exabot

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

fairshare

Rule Path
Disallow /

fetch

Rule Path
Disallow /

flamingo_searchengine

Rule Path
Disallow /

genieo

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

heritrix

Rule Path
Disallow /

heritrix/3.3.0

Rule Path
Disallow /

httrack

Rule Path
Disallow /

integromedb

Rule Path
Disallow /

istellabot

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

kimengi

Rule Path
Disallow /

kimengi/nineconnections.com

Rule Path
Disallow /

larbin

Rule Path
Disallow /

lexxebot/1.0

Rule Path
Disallow /

libwww

Rule Path
Disallow /

linko

Rule Path
Disallow /

livelapbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

maxthon

Rule Path
Disallow /

metauri

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

moreover

Rule Path
Disallow /

moreoverbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

nabot

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

nerdbynature.bot

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

netseer crawler

Rule Path
Disallow /

newscan

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

npbot

Rule Path
Disallow /

nutch

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

orthogaffe

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

pixray-seeker

Rule Path
Disallow /

proximic

Rule Path
Disallow /

psbot

Rule Path
Disallow /

queryseekerspider

Rule Path
Disallow /

seokicks

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

sitebot/0.1

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

slurp

Rule Path
Disallow /

sogou

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

spbot

Rule Path
Disallow /

spinn3r

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

trovitbot

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

umbot-ln

Rule Path
Disallow /

unisterbot

Rule Path
Disallow /

universalfeedparser

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

webzip

Rule Path
Disallow /

wesee:search

Rule Path
Disallow /

wget

Rule Path
Disallow /

wotbot

Rule Path
Disallow /

wotbox

Rule Path
Disallow /

xenu

Rule Path
Disallow /

yasni

Rule Path
Disallow /

zao

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

zyborg

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.compostela24horas.com/sitemap/news
sitemap https://www.compostela24horas.com/sitemap/lastarticles
sitemap https://www.compostela24horas.com/sitemap.xml
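
The sitemap records can be read programmatically as well. A minimal sketch, assuming the raw file spells them as `Sitemap:` lines (the report shows only the field name and value) and using RobotFileParser.site_maps, which requires Python 3.8 or later:

```python
from urllib.robotparser import RobotFileParser

# Assumed raw form of the three sitemap records listed in the report.
SITEMAP_LINES = [
    "Sitemap: https://www.compostela24horas.com/sitemap/news",
    "Sitemap: https://www.compostela24horas.com/sitemap/lastarticles",
    "Sitemap: https://www.compostela24horas.com/sitemap.xml",
]

parser = RobotFileParser()
parser.parse(SITEMAP_LINES)

# site_maps() returns None when no sitemap lines were found.
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```

Sitemap records are independent of user-agent groups, so they apply to every crawler that reads the file.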

Warnings

  • 2 invalid lines.
  • `host` is not a known field.