
Robots Exclusion Standard data for inchieste.repubblica.it

Resource Scan

Scan Details

Site Domain inchieste.repubblica.it
Base Domain repubblica.it
Scan Status Ok
Last Scan 2025-03-08T13:28:46+00:00
Next Scan 2025-03-15T13:28:46+00:00

Last Scan

Scanned 2025-03-08T13:28:46+00:00
URL https://inchieste.repubblica.it/robots.txt
Domain IPs 34.160.171.237
Response IP 34.160.171.237
Found Yes
Hash 977908268c87dfa60b585f70421817e9c2e8ead9eb785bed0ac6ef3666620d98
SimHash 3a0d41108986

Groups

*

Rule Path
Disallow /utility/
Disallow /it/repubblica/repit/2012/12/11/news/l_ombra_della_camorra_sulle_autostrade_le_indagini_a_tutto_campo_sui_rei_del_ferro-48527172/
Disallow /it/repubblica/rep.it/2012/12/11/news/l_ombra_della_camorra_sulle_autostrade_le_indagini_a_tutto_campo_sui_rei_del_ferro-48527172/

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

petalbot

Rule Path
Disallow /

perplexitybot

Rule Path
Disallow /

yandex

Rule Path
Disallow /

netestate ne crawler

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

download ninja

Rule Path
Disallow /

nabot

Rule Path
Disallow /

trendictionbot

Rule Path
Disallow /

discoverybot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

sosospider

Rule Path
Disallow /

dataforseobot

Rule Path
Disallow /

livelapbot

Rule Path
Disallow /

sitebot

Rule Path
Disallow /

converacrawler

Rule Path
Disallow /

lexxebot/1.0

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /

blexbot

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

rogerbot

Rule Path
Disallow /

awariosmartbot

Rule Path
Disallow /

jetbot

Rule Path
Disallow /

psbot

Rule Path
Disallow /

archivebot

Rule Path
Disallow /

ia_archiver-web.archive.org

Rule Path
Disallow /

piplbot

Rule Path
Disallow /

ahrefsbot

Rule Path
Disallow /

gigabot

Rule Path
Disallow /

pangubot

Rule Path
Disallow /

wotbot

Rule Path
Disallow /

fetch

Rule Path
Disallow /

nutch

Rule Path
Disallow /

wbsearchbot

Rule Path
Disallow /

europarchive.org

Rule Path
Disallow /

nextgensearchbot

Rule Path
Disallow /

umbot-ln

Rule Path
Disallow /

nerdbynature.bot

Rule Path
Disallow /

true_robot

Rule Path
Disallow /

dotbot

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

timpibot

Rule Path
Disallow /

discobot

Rule Path
Disallow /

meta-externalfetcher

Rule Path
Disallow /

slurp

Rule Path
Disallow /

crystalsemanticsbot

Rule Path
Disallow /

linkextractorpro

Rule Path
Disallow /

seokicks-robot

Rule Path
Disallow /

cliqzbot

Rule Path
Disallow /

kbcrawl

Rule Path
Disallow /

searchpreview

Rule Path
Disallow /

bixocrawler

Rule Path
Disallow /

jyxobot

Rule Path
Disallow /

quora-bot

Rule Path
Disallow /

awariorssbot

Rule Path
Disallow /

istellabot

Rule Path
Disallow /

primalbot

Rule Path
Disallow /

archive.org_bot

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

peer39_crawler/1.0

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

friendlycrawler

Rule Path
Disallow /

openbot

Rule Path
Disallow /

wesee:search

Rule Path
Disallow /

extractorpro

Rule Path
Disallow /

npbot

Rule Path
Disallow /

verticalsearch

Rule Path
Disallow /

nnetseer crawler

Rule Path
Disallow /

ubicrawler

Rule Path
Disallow /

duckassistbot

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

trovitbot

Rule Path
Disallow /

dloader(naverrobot)

Rule Path
Disallow /

moreoverbot

Rule Path
Disallow /

spbot

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

magpie-crawler

Rule Path
Disallow /

sitebot/0.1

Rule Path
Disallow /

crawler4j

Rule Path
Disallow /

linkarchiver

Rule Path
Disallow /

seoengbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

kangaroo bot

Rule Path
Disallow /

scrapy

Rule Path
Disallow /

backlinkcrawler

Rule Path
Disallow /

jikespider

Rule Path
Disallow /

queryseekerspider

Rule Path
Disallow /

arquivo-web-crawler

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

pixray-seeker

Rule Path
Disallow /

amazonbot

Rule Path
Disallow /

gumgum bot

Rule Path
Disallow /

peer39_crawler

Rule Path
Disallow /

youbot

Rule Path
Disallow /

flamingo_searchengine

Rule Path
Disallow /

oai-searchbot

Rule Path
Disallow /

web-archive-net.com.bot

Rule Path
Disallow /

exabot

Rule Path
Disallow /

nicecrawler

Rule Path
Disallow /

url_spider_pro

Rule Path
Disallow /
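
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT string is a hand-written reconstruction of only three of the listed groups (*, gptbot, ccbot), so its order and spacing are assumptions and may not match the real file. The helper names build_parser and is_allowed, and the agent name ExampleBot, are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a partial reconstruction of the scanned file, covering only
# three of the groups listed above. The real robots.txt may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /utility/

User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /
"""

BASE_URL = "https://inchieste.repubblica.it"


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text from a string instead of fetching it."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the parsed rules."""
    return parser.can_fetch(agent, BASE_URL + path)


parser = build_parser(ROBOTS_TXT)

# A named group with "Disallow /" blocks the whole site for that agent.
print(is_allowed(parser, "GPTBot", "/"))

# An agent without its own group falls back to the "*" group.
print(is_allowed(parser, "ExampleBot", "/utility/page"))
print(is_allowed(parser, "ExampleBot", "/"))
```

Under this reconstruction, an agent with its own group (such as gptbot) is blocked everywhere, while an agent that matches no named group is only blocked from the paths in the * group. Note that urllib.robotparser matches agent names case-insensitively and by substring, which may be looser than the matching a given crawler actually applies.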