informazioneambiente.it
robots.txt

Robots Exclusion Standard data for informazioneambiente.it

Resource Scan

Scan Details

Site Domain informazioneambiente.it
Base Domain informazioneambiente.it
Scan Status Ok
Last Scan 2024-11-15T23:07:15+00:00
Next Scan 2024-11-22T23:07:15+00:00

Last Scan

Scanned 2024-11-15T23:07:15+00:00
URL https://informazioneambiente.it/robots.txt
Redirect https://www.informazioneambiente.it/robots.txt
Redirect Domain www.informazioneambiente.it
Redirect Base informazioneambiente.it
Domain IPs 104.21.235.47, 104.21.235.48, 2606:4700:3038::6815:eb2f, 2606:4700:3038::6815:eb30
Redirect IPs 104.21.235.47, 104.21.235.48, 2606:4700:3038::6815:eb2f, 2606:4700:3038::6815:eb30
Response IP 104.21.235.47
Found Yes
Hash 226c0f5e54bc352c0dc58eaf2bbcbd576221383f605beb7766c9c1ae0e2e2bd7
SimHash d8144944a017

Groups

*

Rule Path
Disallow /?s=
Disallow /wp-admin/
Disallow /wp-login.php

claudebot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

omgili

Rule Path
Disallow /

diffbot

Rule Path
Disallow /

bytespider

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

cohere-ai

Rule Path
Disallow /

googlebot-image

Rule Path
Disallow /wp-content/uploads/2022/04/Schermata-2022-04-05-alle-12.48.44.jpg

Other Records

Field Value
sitemap https://www.informazioneambiente.it/sitemap.xml
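The groups and records listed above can be checked programmatically. The following is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT text is a partial reconstruction from this report, not the verbatim file: the capitalization of the user-agent names, the order of the groups, and the omission of the remaining "Disallow /" groups are assumptions. The helper is_allowed and the agent name ExampleBrowser are hypothetical and exist only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of the scanned rules (assumption: the real file may
# differ in capitalization, ordering, and grouping of user agents).
ROBOTS_TXT = """\
User-agent: *
Disallow: /?s=
Disallow: /wp-admin/
Disallow: /wp-login.php

User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: Googlebot-Image
Disallow: /wp-content/uploads/2022/04/Schermata-2022-04-05-alle-12.48.44.jpg

Sitemap: https://www.informazioneambiente.it/sitemap.xml
"""

BASE_URL = "https://www.informazioneambiente.it"

# Parse the rules from memory instead of fetching them over the network.
parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_allowed(user_agent: str, path: str) -> bool:
    """Hypothetical helper: True if user_agent may fetch BASE_URL + path."""
    return parser.can_fetch(user_agent, BASE_URL + path)


if __name__ == "__main__":
    for agent, path in [
        ("GPTBot", "/"),
        ("ExampleBrowser", "/"),
        ("ExampleBrowser", "/wp-admin/"),
        ("Googlebot-Image", "/"),
    ]:
        print(agent, path, is_allowed(agent, path))
    # site_maps() is available in Python 3.8 and later.
    print(parser.site_maps())
```

A crawler that matches a named group is evaluated against that group only, so in this sketch Googlebot-Image is blocked from the single listed image but not from the paths disallowed under the "*" group.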