ilgiornale.it
robots.txt

Robots Exclusion Standard data for ilgiornale.it

Resource Scan

Scan Details

Site Domain ilgiornale.it
Base Domain ilgiornale.it
Scan Status Ok
Last Scan 2024-05-15T01:09:46+00:00
Next Scan 2024-05-22T01:09:46+00:00

Last Scan

Scanned 2024-05-15T01:09:46+00:00
URL https://ilgiornale.it/robots.txt
Redirect https://www.ilgiornale.it:443/robots.txt
Redirect Domain www.ilgiornale.it
Redirect Base ilgiornale.it
Domain IPs 63.32.39.77, 99.80.209.23
Redirect IPs 23.44.4.211, 23.44.4.243, 23.44.5.208, 23.44.5.210, 23.44.5.219, 23.44.5.225, 23.44.5.226, 23.44.5.241
Response IP 42.99.140.201
Found Yes
Hash 428449cfc133942926f79e280d69b61be5f0208c51c11c2b969ea39343fd41d7
SimHash 1d492d5aa419

Groups

*

Rule Path
Disallow /redirect/
Disallow /akamai/
Disallow /edicolaredirect.php
Disallow /pag_pdf.php
Disallow /relateds/
Disallow /account/status/
Disallow /commenti/
Disallow /content/*/autori_slim.html
Disallow /cerca.html?q=*
Disallow /utente/profilo/login.html?ReturnUrl=*
Disallow /utente/registrazione/start.html?ReturnUrl=*
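
The Disallow paths above mix plain prefixes with `*` wildcards. As a rough illustration of how a crawler might test a URL path against them, here is a minimal Python sketch. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and the matching follows the common convention that `*` matches any run of characters and a trailing `$` anchors the end of the path. Because this group has no Allow rules, longest-match precedence is not modelled.

```python
import re

# Disallow paths copied from the "*" group listed above.
DISALLOW = [
    "/redirect/",
    "/akamai/",
    "/edicolaredirect.php",
    "/pag_pdf.php",
    "/relateds/",
    "/account/status/",
    "/commenti/",
    "/content/*/autori_slim.html",
    "/cerca.html?q=*",
    "/utente/profilo/login.html?ReturnUrl=*",
    "/utente/registrazione/start.html?ReturnUrl=*",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any sequence of characters and a trailing
    '$' anchors the end of the path; everything else is literal.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each wildcard.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str) -> bool:
    """Return True if the path (including any query string) matches a rule."""
    # re.match anchors at the start, which gives prefix semantics.
    return any(rule_to_regex(rule).match(path) for rule in DISALLOW)


if __name__ == "__main__":
    for candidate in [
        "/commenti/123",
        "/cerca.html?q=politica",
        "/content/2024/autori_slim.html",
        "/cerca.html",
        "/politica/articolo.html",
    ]:
        print(candidate, is_disallowed(candidate))
```

Note that Python's standard `urllib.robotparser` compares rule paths by plain prefix and does not expand `*`, which is why this sketch builds its own patterns for the wildcard rules.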

Other Records

Field Value
sitemap https://www.ilgiornale.it/sitemap/google-news.xml
sitemap https://www.ilgiornale.it/sitemap/indice.xml
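
The sitemap records above can be read back with Python's standard `urllib.robotparser`. The sketch below is illustrative only: `ROBOTS_LINES` is a partial reconstruction from this listing, not the verbatim file served by the site, and it includes just one simple prefix rule alongside the two sitemap records.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a partial robots.txt rebuilt from the records listed on this
# page; the real file may contain more lines, comments, or different ordering.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /commenti/",
    "Sitemap: https://www.ilgiornale.it/sitemap/google-news.xml",
    "Sitemap: https://www.ilgiornale.it/sitemap/indice.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# site_maps() (Python 3.8+) returns the Sitemap URLs in file order,
# or None when the file declares no sitemaps.
sitemaps = parser.site_maps()

# can_fetch() applies prefix matching for the given user agent.
comments_allowed = parser.can_fetch("*", "https://www.ilgiornale.it/commenti/1")

if __name__ == "__main__":
    print(sitemaps)
    print(comments_allowed)
```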

Comments

  • RULES