generacodice.com
robots.txt

Robots Exclusion Standard data for generacodice.com

Resource Scan

Scan Details

Site Domain generacodice.com
Base Domain generacodice.com
Scan Status Ok
Last Scan 2024-09-30T04:13:13+00:00
Next Scan 2024-10-07T04:13:13+00:00

Last Scan

Scanned 2024-09-30T04:13:13+00:00
URL https://generacodice.com/robots.txt
Domain IPs 74.208.234.56
Response IP 74.208.234.56
Found Yes
Hash 3366840f4326d81b75b2c6ca12a3aebf609c09ea0d98efc59c50aea42ebd5b73
SimHash ca0b1614a32d
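
The 64-hex-digit Hash value is consistent with a SHA-256 digest, although the report does not name the algorithm or say exactly which bytes are hashed. Assuming SHA-256 over the raw response body, a change check between scans might look like the sketch below; `content_hash` and `has_changed` are hypothetical helper names, not part of any scanner API.

```python
import hashlib


def content_hash(body: bytes) -> str:
    """Return the hex SHA-256 digest of a robots.txt response body.

    Assumption: the report's Hash field is SHA-256 of the raw bytes.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str) -> bool:
    """Compare a freshly fetched body against the hash stored from the last scan."""
    return content_hash(body) != previous_hash.lower()
```

A scanner could store the digest after each scan and re-parse the rules only when `has_changed` returns `True`.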

Groups

yandex

Rule Path
Disallow (empty path)
Disallow /sitemap/202310*
Disallow /pubblica/*
Disallow /site/captcha
Disallow /autenticazione
Disallow /site/reportbug

Other Records

Field Value
crawl-delay 5
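
As a rough illustration of how a crawler might use these groups, the sketch below picks the group for a user agent and reads its crawl delay. The `GROUPS` table mirrors this report (a `crawl-delay` record of 5 for `yandex`, none recorded for `*`); the function names and the substring match are simplifying assumptions. `crawl-delay` is not part of RFC 9309, and crawler support for it varies.

```python
# Groups as listed in this report; crawl_delay is None where no record is shown.
GROUPS = {
    "yandex": {"crawl_delay": 5.0},
    "*": {"crawl_delay": None},
}


def select_group(user_agent, groups=GROUPS):
    """Return the most specific group token found in the user agent, else '*'.

    Simplification: case-insensitive substring match on the group token.
    """
    ua = user_agent.lower()
    matches = [name for name in groups if name != "*" and name in ua]
    if matches:
        return max(matches, key=len)
    return "*"


def crawl_delay(user_agent, default=0.0, groups=GROUPS):
    """Return the crawl delay in seconds for the selected group, or a default."""
    delay = groups[select_group(user_agent, groups)]["crawl_delay"]
    return default if delay is None else delay
```

For example, `crawl_delay("Mozilla/5.0 (compatible; YandexBot/3.0)")` selects the `yandex` group, while an agent with no matching token falls back to `*` and the caller's default.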

*

Rule Path
Allow /
Disallow /articolo*?a=
Disallow /articolo*?titolo=
Disallow /en/articolo*?titolo=
Disallow /jp/articolo*?titolo=
Disallow /en/tag*?per-page=
Disallow /ru/tag*?per-page=
Disallow /enticolo*
Disallow /ptticolo*
Disallow /utente/*
Disallow /pubblica/*
Disallow /site/captcha
Disallow /autenticazione
Disallow /site/reportbug
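
Several of these paths use the `*` wildcard, which Python's standard `urllib.robotparser` does not interpret. The sketch below is one way to evaluate the `*` group by hand, assuming RFC 9309 semantics: `*` matches any run of characters, a trailing `$` anchors the end, the longest matching pattern wins, and Allow wins a tie. `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative names, and the input is expected to be the path plus query string.

```python
import re

# Rules copied from the "*" group above, as (directive, path pattern) pairs.
RULES = [
    ("allow", "/"),
    ("disallow", "/articolo*?a="),
    ("disallow", "/articolo*?titolo="),
    ("disallow", "/en/articolo*?titolo="),
    ("disallow", "/jp/articolo*?titolo="),
    ("disallow", "/en/tag*?per-page="),
    ("disallow", "/ru/tag*?per-page="),
    ("disallow", "/enticolo*"),
    ("disallow", "/ptticolo*"),
    ("disallow", "/utente/*"),
    ("disallow", "/pubblica/*"),
    ("disallow", "/site/captcha"),
    ("disallow", "/autenticazione"),
    ("disallow", "/site/reportbug"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex matched from the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" in place of each "*".
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if the path (including any query string) may be fetched."""
    best_len = -1
    allowed = True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            # Longest pattern wins; on equal length, Allow takes precedence.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed
```

Under these assumptions, `is_allowed("/articolo/123")` is allowed by the `/` rule alone, while `is_allowed("/articolo/123?a=1")` is blocked because the longer `/articolo*?a=` pattern also matches.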

Comments

  • Crawl-delay: 2

Warnings

  • `clean-param` is not a known field.