ilgiornale.it
robots.txt
Robots Exclusion Standard data for ilgiornale.it
Resource Scan
Scan Details
Field | Value |
---|---|
Site Domain | ilgiornale.it |
Base Domain | ilgiornale.it |
Scan Status | Ok |
Last Scan | 2024-05-15T01:09:46+00:00 |
Next Scan | 2024-05-22T01:09:46+00:00 |
Last Scan
Field | Value |
---|---|
Scanned | 2024-05-15T01:09:46+00:00 |
URL | https://ilgiornale.it/robots.txt |
Redirect | https://www.ilgiornale.it:443/robots.txt |
Redirect Domain | www.ilgiornale.it |
Redirect Base | ilgiornale.it |
Domain IPs | 63.32.39.77, 99.80.209.23 |
Redirect IPs | 23.44.4.211, 23.44.4.243, 23.44.5.208, 23.44.5.210, 23.44.5.219, 23.44.5.225, 23.44.5.226, 23.44.5.241 |
Response IP | 42.99.140.201 |
Found | Yes |
Hash | 428449cfc133942926f79e280d69b61be5f0208c51c11c2b969ea39343fd41d7 |
SimHash | 1d492d5aa419 |
Groups
*
Rule | Path |
---|---|
Disallow | /redirect/ |
Disallow | /akamai/ |
Disallow | /edicolaredirect.php |
Disallow | /pag_pdf.php |
Disallow | /relateds/ |
Disallow | /account/status/ |
Disallow | /commenti/ |
Disallow | /content/*/autori_slim.html |
Disallow | /cerca.html?q=* |
Disallow | /utente/profilo/login.html?ReturnUrl=* |
Disallow | /utente/registrazione/start.html?ReturnUrl=* |
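The rules above can be checked against a candidate path with a small matcher. The following is a minimal sketch, not the scanner's own logic: it assumes the common robots.txt convention that `*` matches any run of characters and that a rule otherwise matches as a prefix. The helper names `rule_to_regex` and `is_disallowed`, and the example paths, are hypothetical.

```python
import re

# Disallow paths copied from the "*" group in the table above.
DISALLOW = [
    "/redirect/",
    "/akamai/",
    "/edicolaredirect.php",
    "/pag_pdf.php",
    "/relateds/",
    "/account/status/",
    "/commenti/",
    "/content/*/autori_slim.html",
    "/cerca.html?q=*",
    "/utente/profilo/login.html?ReturnUrl=*",
    "/utente/registrazione/start.html?ReturnUrl=*",
]


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate one robots.txt path pattern into a regex (hypothetical helper).

    "*" becomes ".*"; a trailing "$" anchors the end of the path.
    Everything else is matched literally, as a prefix.
    """
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(pattern + ("$" if anchored else ""))


PATTERNS = [rule_to_regex(rule) for rule in DISALLOW]


def is_disallowed(path_and_query: str) -> bool:
    """Return True if any Disallow rule matches from the start of the path."""
    # re.match anchors at the beginning, which gives prefix semantics.
    return any(p.match(path_and_query) for p in PATTERNS)


# Example paths are illustrative only, not taken from the scan.
for candidate in ["/redirect/abc", "/cerca.html?q=roma", "/cerca.html", "/cronache/"]:
    print(candidate, is_disallowed(candidate))
```

Because the group contains only Disallow rules, the sketch skips the Allow/Disallow precedence step that a full parser would need.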
Other Records
Field | Value |
---|---|
sitemap | https://www.ilgiornale.it/sitemap/google-news.xml |
sitemap | https://www.ilgiornale.it/sitemap/indice.xml |
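As an illustration of how a consumer might read these sitemap records, the sketch below feeds a reconstructed fragment of the file to Python's standard `urllib.robotparser`. The `ROBOTS_LINES` list is an assumption: it is rebuilt from the tables in this report (with a single Disallow rule for brevity), not copied from the live robots.txt. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the report's tables; an assumption, not the raw file.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /redirect/",
    "Sitemap: https://www.ilgiornale.it/sitemap/google-news.xml",
    "Sitemap: https://www.ilgiornale.it/sitemap/indice.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)

# site_maps() returns None when no Sitemap records exist, so fall back to [].
sitemaps = parser.site_maps() or []
for url in sitemaps:
    print(url)
```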