gazzettadireggio.gelocal.it
robots.txt

Robots Exclusion Standard data for gazzettadireggio.gelocal.it

Resource Scan

Scan Details

Site Domain gazzettadireggio.gelocal.it
Base Domain gelocal.it
Scan Status Ok
Last Scan 2024-06-16T18:44:47+00:00
Next Scan 2024-06-23T18:44:47+00:00

Last Scan

Scanned 2024-06-16T18:44:47+00:00
URL https://gazzettadireggio.gelocal.it/robots.txt
Domain IPs 13.226.210.117, 13.226.210.12, 13.226.210.40, 13.226.210.72
Response IP 18.165.171.20
Found Yes
Hash c648823acfde3e24bac99204e2947a188068198d6a2f9355ab2b1f5a1480455c
SimHash 703c4961a533

Groups

*

Rule Path
Disallow /stampa-articolo/
Disallow /reggio/cronaca/2014/12/30/news/corruzione-il-gip-archivia-la-posizione-del-legale-miraglia-1.10586021
Disallow /dettaglio/*?edizione=
Disallow /dettaglio-news/*?edizione=
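
The last two paths in this group contain a `*` wildcard, which under the Robots Exclusion Protocol (RFC 9309) matches any sequence of characters. The following is a minimal sketch of how a crawler might test a URL path against such a pattern. The helper name `rule_matches` and the sample paths are hypothetical and are not taken from the scan.

```python
import re


def rule_matches(pattern: str, path: str) -> bool:
    """Return True if a robots.txt path pattern matches the start of `path`.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        regex += "$"
    # re.match anchors at the start of the path, giving prefix semantics.
    return re.match(regex, path) is not None


# Hypothetical paths, for illustration only.
print(rule_matches("/dettaglio/*?edizione=", "/dettaglio/some-story?edizione=reggio"))
print(rule_matches("/dettaglio/*?edizione=", "/dettaglio/some-story"))
print(rule_matches("/stampa-articolo/", "/stampa-articolo/123"))
```

Under these assumptions, only paths that carry the `?edizione=` query after a `/dettaglio/` prefix are blocked by the wildcard rule, while the same article path without the query stays crawlable.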

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /
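
The groups above can be checked programmatically. The sketch below rebuilds a robots.txt body from the listed groups and queries it with Python's standard-library `urllib.robotparser`. The rebuilt text is an assumption, since the scan shows parsed rules rather than the raw file. Only the literal `/stampa-articolo/` rule is included for the `*` group, because the wildcard paths rely on pattern matching that the standard-library parser is not documented to perform. The agent name `SomeOtherBot` is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# User agents listed in the scan with a blanket "Disallow /".
BLOCKED_AGENTS = [
    "gptbot", "google-extended", "anthropic-ai", "ccbot",
    "chatgpt-user", "facebookbot", "omgilibot", "cohere-ai",
]

# Assumed reconstruction of the file; the raw robots.txt is not in the scan.
lines = ["User-agent: *", "Disallow: /stampa-articolo/", ""]
for agent in BLOCKED_AGENTS:
    lines += [f"User-agent: {agent}", "Disallow: /", ""]

parser = RobotFileParser()
parser.parse(lines)

base = "https://gazzettadireggio.gelocal.it"

# A blocked agent may not fetch anything, including the home page.
print(parser.can_fetch("gptbot", base + "/"))

# Any other agent falls back to the "*" group.
print(parser.can_fetch("SomeOtherBot", base + "/"))
print(parser.can_fetch("SomeOtherBot", base + "/stampa-articolo/123"))
```

In this reconstruction the eight named agents are denied every path, while unlisted agents are only kept out of the paths in the `*` group.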