milano.repubblica.it
robots.txt

Robots Exclusion Standard data for milano.repubblica.it

Resource Scan

Scan Details

Site Domain milano.repubblica.it
Base Domain repubblica.it
Scan Status Ok
Last Scan 2024-05-11T15:56:04+00:00
Next Scan 2024-05-18T15:56:04+00:00

Last Scan

Scanned 2024-05-11T15:56:04+00:00
URL https://milano.repubblica.it/robots.txt
Domain IPs 52.84.229.20, 52.84.229.5, 52.84.229.58, 52.84.229.71
Response IP 52.84.229.5
Found Yes
Hash d936f80ea93152cbf045276f538dc00eb03aec513f826223c0adfe89fa586203
SimHash 628459210092

Groups

*

Rule Path
Disallow /ristoranti/
Disallow /multimedia/
Disallow /dettaglio/
Disallow /dettaglio-news/
Disallow /cronaca/2023/02/27/news/avvocato_inquinamento_smog_causa_comune_milano_regione_lombardia-389862565/
Disallow /cronaca/2023/02/08/news/archiviazione_inchiesta_eni_congo_procura_milano--387043601
Disallow /blaize/datalayer

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://milano.repubblica.it/sitemap-n.xml
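
The groups and the sitemap record listed above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser module. It assumes the robots.txt text can be rebuilt from the scanned groups; the actual file's wording, ordering, and letter case may differ. The names AI_AGENTS, BASE, and allowed, and the agent string "ExampleBot", are hypothetical and chosen only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Agents that the scan reports as fully disallowed ("Disallow /").
AI_AGENTS = ["gptbot", "ccbot", "google-extended", "anthropic-ai",
             "cohere-ai", "chatgpt-user", "facebookbot", "omgilibot"]

# Reconstruction of the rules from the scan report (an assumption:
# the real file may be formatted differently).
lines = [
    "User-agent: *",
    "Disallow: /ristoranti/",
    "Disallow: /multimedia/",
    "Disallow: /dettaglio/",
    "Disallow: /dettaglio-news/",
    "Disallow: /cronaca/2023/02/27/news/avvocato_inquinamento_smog_causa_comune_milano_regione_lombardia-389862565/",
    "Disallow: /cronaca/2023/02/08/news/archiviazione_inchiesta_eni_congo_procura_milano--387043601",
    "Disallow: /blaize/datalayer",
    "",
]
for agent in AI_AGENTS:
    lines += [f"User-agent: {agent}", "Disallow: /", ""]
lines.append("Sitemap: https://milano.repubblica.it/sitemap-n.xml")

parser = RobotFileParser()
parser.parse(lines)  # parse in memory; no network request is made

BASE = "https://milano.repubblica.it"

def allowed(agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` under the rebuilt rules."""
    return parser.can_fetch(agent, BASE + path)

if __name__ == "__main__":
    # "ExampleBot" is a made-up agent that falls under the "*" group.
    print(allowed("ExampleBot", "/cronaca/"))
    print(allowed("ExampleBot", "/ristoranti/pizzeria"))
    print(allowed("GPTBot", "/cronaca/"))
    print(parser.site_maps())  # available in Python 3.8 and later
```

Agents without a group of their own fall back to the "*" group, so only the listed paths are closed to them, while the eight named agents are refused everywhere on the site.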