lescienze.espresso.repubblica.it
robots.txt

Robots Exclusion Standard data for lescienze.espresso.repubblica.it

Resource Scan

Scan Details

Site Domain lescienze.espresso.repubblica.it
Base Domain repubblica.it
Scan Status Ok
Last Scan 2024-05-21T14:00:21+00:00
Next Scan 2024-06-20T14:00:21+00:00

Last Scan

Scanned 2024-05-21T14:00:21+00:00
URL http://lescienze.espresso.repubblica.it/robots.txt
Redirect https://www.lescienze.it/robots.txt
Redirect Domain www.lescienze.it
Redirect Base lescienze.it
Domain IPs 213.92.16.101
Redirect IPs 13.33.30.108, 13.33.30.124, 13.33.30.33, 13.33.30.96
Response IP 13.33.30.108
Found Yes
Hash 2ae62bbcd1e53a3ab71d102f44fbb0b0c30f444655d0a2b616c3afb92a2df4cf
SimHash 50004b602137

Groups

*

Rule Path
Allow /
Disallow /ricerca?query

gptbot

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

facebookbot

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /
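
The groups above can be checked programmatically. The following is a minimal sketch, assuming the original file resembles the reconstruction in ROBOTS_TXT; the exact layout, rule ordering, and user-agent casing of the real file are assumptions, since only the parsed groups are listed here. It uses Python's standard urllib.robotparser, and the agent name "ExampleBot" is hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in this report. The real file's
# ordering, casing, and blank lines are assumptions.
ROBOTS_TXT = """\
User-agent: *
Allow: /
Disallow: /ricerca?query

User-agent: gptbot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: ccbot
Disallow: /

User-agent: google-extended
Disallow: /

User-agent: facebookbot
Disallow: /

User-agent: omgilibot
Disallow: /

User-agent: cohere-ai
Disallow: /

User-agent: chatgpt-user
Disallow: /
"""

# Agents that the report lists with a blanket "Disallow /" rule.
BLOCKED_AGENTS = [
    "gptbot", "anthropic-ai", "ccbot", "google-extended",
    "facebookbot", "omgilibot", "cohere-ai", "chatgpt-user",
]


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content from a string, without any network access."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
home = "https://www.lescienze.it/"

# Each named agent should be refused everywhere on the site.
for agent in BLOCKED_AGENTS:
    print(agent, parser.can_fetch(agent, home))

# A crawler with no dedicated group (hypothetical name) falls back to "*".
print("ExampleBot", parser.can_fetch("ExampleBot", home))
```

Note that urllib.robotparser evaluates the rules of a group in file order and stops at the first match, so with "Allow /" listed first this parser would likely not enforce "Disallow /ricerca?query". Crawlers that follow the longest-match rule of RFC 9309 would treat the more specific Disallow as taking precedence.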