corrieredeltrentino.corriere.it
robots.txt

Robots Exclusion Standard data for corrieredeltrentino.corriere.it

Resource Scan

Scan Details

Site Domain corrieredeltrentino.corriere.it
Base Domain corriere.it
Scan Status Ok
Last Scan 2025-01-06T15:51:07+00:00
Next Scan 2025-01-13T15:51:07+00:00

Last Scan

Scanned 2025-01-06T15:51:07+00:00
URL https://corrieredeltrentino.corriere.it/robots.txt
Domain IPs 199.232.193.50, 199.232.197.50
Response IP 151.101.197.50
Found Yes
Hash c95b460a2c788c7f469ab9ed4fb4ce7a726d278bf518bbc1c25512872a1e124a
SimHash ba8f49d08455

Groups

*

Rule Path
Disallow /*app_v2
Disallow /*app_v1
Disallow */localwebapp
Disallow /*/localwebapp/proRecensione.do*
Disallow /cronistipercaso/loadArg.do*
Disallow /ricerca/
Disallow /*_print.html$
Disallow /ultima_ora/
Disallow /notizie-ultima-ora/
Disallow /communityLocal/
Disallow /_template/
Disallow /apw.js
Disallow /notizie/cronaca/22_marzo_29/tangenti-scarcerazioni-de-benedictis-chiariello-condannati-9-anni-9-mesi-145d4670-af58-11ec-9372-638361423a51.shtml
Disallow /notizie/cronaca/21_dicembre_15/no-vax-leader-forza-nuova-indagati-terrorismo-internazionale-bari-ecbe718e-5dcc-11ec-89b9-8ec9a49d70ec.shtml
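
Several of the paths above use the `*` wildcard and the `$` end anchor. As a rough illustration of how such patterns are commonly interpreted (following the conventions in RFC 9309), the Python sketch below translates a rule path into a regular expression and tests a URL path against it. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, the rule list is a subset of the table above, and Allow rules and longest-match precedence are deliberately left out.

```python
import re

def rule_to_regex(path):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumes '*' matches any run of characters and a trailing '$'
    anchors the pattern to the end of the URL path.
    """
    anchored = path.endswith("$")
    if anchored:
        path = path[:-1]
    # Escape the literal pieces, then join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in path.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))

def is_disallowed(url_path, rules):
    """Return True if any Disallow pattern matches from the start of url_path."""
    return any(rule_to_regex(rule).match(url_path) for rule in rules)

# A subset of the '*' group rules listed above.
RULES = ["/*app_v2", "*/localwebapp", "/ricerca/", "/*_print.html$", "/apw.js"]

if __name__ == "__main__":
    for candidate in ["/ricerca/?q=trento", "/notizie/foo_print.html", "/notizie/foo.shtml"]:
        print(candidate, is_disallowed(candidate, RULES))
```

Because of the `$` anchor, `/notizie/foo_print.html` matches `/*_print.html$` but the same path with a query string appended does not.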

petalbot

Rule Path
Disallow /

yandex

Rule Path Comment
Disallow / prohibits crawling for the entire site

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

google-extended

Rule Path
Disallow /
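
The named groups above each block the whole site, while any other crawler falls back to the `*` group. A minimal Python sketch of that selection logic follows, assuming case-insensitive matching on the crawler's product token. The function names are hypothetical, and the `*` rule list is abbreviated from the table earlier in this report.

```python
# Group names and rules copied from the scan; the "*" list is abbreviated.
GROUPS = {
    "*": ["/ricerca/", "/ultima_ora/", "/_template/"],
    "petalbot": ["/"],
    "yandex": ["/"],
    "gptbot": ["/"],
    "ccbot": ["/"],
    "anthropic-ai": ["/"],
    "google-extended": ["/"],
}

def rules_for(product_token):
    """Pick the group for a crawler token, falling back to the '*' group."""
    return GROUPS.get(product_token.lower(), GROUPS["*"])

def fully_blocked(product_token):
    """True when the selected group disallows the site root."""
    return "/" in rules_for(product_token)

if __name__ == "__main__":
    for token in ["GPTBot", "CCBot", "Googlebot"]:
        print(token, fully_blocked(token))
```

A crawler with its own group does not inherit the `*` rules, which is why the lookup returns a single list and does not merge the two.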

Other Records

Field Value
sitemap https://www.corriere.it/dynamic-sitemap/sitemap-last-100/Trentino.xml

Warnings

  • `acap-disallow-crawl` is not a known field.