iltirreno.gelocal.it
robots.txt

Robots Exclusion Standard data for iltirreno.gelocal.it

Resource Scan

Scan Details

Site Domain iltirreno.gelocal.it
Base Domain gelocal.it
Scan Status Ok
Last Scan 2024-06-20T20:21:32+00:00
Next Scan 2024-06-27T20:21:32+00:00

Last Scan

Scanned 2024-06-20T20:21:32+00:00
URL https://iltirreno.gelocal.it/robots.txt
Domain IPs 13.226.210.117, 13.226.210.12, 13.226.210.40, 13.226.210.72
Response IP 18.165.171.20
Found Yes
Hash 65bc14e4cc3ac51f79421dc03cc3e79c6131b7ccc31c1041111b773f47b44565
SimHash 7a7c4860a133

Groups

*

Rule Path
Disallow /stampa-articolo/
Disallow /montecatini/cronaca/2010/10/08/news/dopo-sette-anni-di-nuovo-nei-guai-riccardo-pieraccini-1.2104233
Disallow /dettaglio/*?edizione=
Disallow /dettaglio-news/*?edizione=
Disallow /cecina/cronaca/2022/04/09/news/non-paga-l-animatore-agenzia-condannata-ha-violato-la-norma-sugli-assembramenti-1.41362487
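Two of the rules in the `*` group use the `*` wildcard, which stands for any run of characters in the path or query string. A minimal sketch of how such a pattern could be tested against a URL, assuming the wildcard and `$` end-anchor semantics described in RFC 9309. The function name `rule_matches` and the sample paths are hypothetical and are not taken from the scan.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches a URL path plus query.

    Assumed semantics: '*' matches any sequence of characters, a trailing
    '$' anchors the end of the URL, and everything else is a literal
    prefix match from the start of the path.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then rejoin the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match only matches at the start of the string, which gives prefix behavior.
    return re.match(regex, path) is not None


# Hypothetical paths, for illustration only.
print(rule_matches("/dettaglio/*?edizione=", "/dettaglio/some-article?edizione=x"))
print(rule_matches("/dettaglio/*?edizione=", "/dettaglio/some-article"))
print(rule_matches("/stampa-articolo/", "/stampa-articolo/12345"))
```

Under these assumptions the first and third calls match and the second does not, because the second path has no `?edizione=` query.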

gptbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /
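The groups above can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch, not the site's actual file: the text is rebuilt from the scan listing, the user-agent tokens are written in lowercase as the scan shows them (the live file may use different casing), only one literal rule from the `*` group is included, and the `/livorno/` path and the `ExampleBot` name are hypothetical. The wildcard rules are left out because the standard-library parser performs plain prefix matching.

```python
from urllib.robotparser import RobotFileParser

# User-agent groups that the scan lists with "Disallow /".
BLOCKED_AGENTS = [
    "gptbot", "anthropic-ai", "ccbot", "google-extended",
    "omgilibot", "cohere-ai", "chatgpt-user", "facebookbot",
]


def build_robots_lines() -> list[str]:
    """Rebuild a simplified robots.txt from the scan listing (an approximation)."""
    lines = ["User-agent: *", "Disallow: /stampa-articolo/", ""]
    for agent in BLOCKED_AGENTS:
        lines += [f"User-agent: {agent}", "Disallow: /", ""]
    return lines


parser = RobotFileParser()
parser.parse(build_robots_lines())

BASE = "https://iltirreno.gelocal.it"

# A crawler named in its own group is blocked from the whole site.
print(parser.can_fetch("GPTBot", BASE + "/"))
# Any other crawler falls back to the "*" group.
print(parser.can_fetch("ExampleBot", BASE + "/stampa-articolo/123"))
print(parser.can_fetch("ExampleBot", BASE + "/livorno/"))
```

A crawler that matches one of the named groups is governed by that group alone, so the blanket `Disallow /` applies to it and the narrower `*` rules are not consulted.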