guidaoggiintv.it
robots.txt

Robots Exclusion Standard data for guidaoggiintv.it

Resource Scan

Scan Details

Site Domain guidaoggiintv.it
Base Domain guidaoggiintv.it
Scan Status Ok
Last Scan 2024-06-26T12:50:55+00:00
Next Scan 2024-07-03T12:50:55+00:00
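
The Last Scan and Next Scan timestamps are exactly seven days apart, which suggests a weekly rescan. A minimal Python sketch of that arithmetic follows. The fixed 7-day interval and the helper name `next_scan` are assumptions inferred from the two dates; the report does not state a schedule.

```python
from datetime import datetime, timedelta

# Assumption: rescans run at a fixed 7-day interval, inferred from the gap
# between the "Last Scan" and "Next Scan" values in the report.
SCAN_INTERVAL = timedelta(days=7)


def next_scan(last_scan_iso: str) -> str:
    """Return the ISO 8601 timestamp one scan interval after last_scan_iso."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + SCAN_INTERVAL).isoformat()


print(next_scan("2024-06-26T12:50:55+00:00"))  # 2024-07-03T12:50:55+00:00
```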

Last Scan

Scanned 2024-06-26T12:50:55+00:00
URL https://guidaoggiintv.it/robots.txt
Domain IPs 13.226.120.121, 13.226.120.18, 13.226.120.32, 13.226.120.92, 2600:9000:2755:3400:6:853b:5500:93a1, 2600:9000:2755:5200:6:853b:5500:93a1, 2600:9000:2755:5e00:6:853b:5500:93a1, 2600:9000:2755:7c00:6:853b:5500:93a1, 2600:9000:2755:9a00:6:853b:5500:93a1, 2600:9000:2755:a600:6:853b:5500:93a1, 2600:9000:2755:c800:6:853b:5500:93a1, 2600:9000:2755:ce00:6:853b:5500:93a1
Response IP 108.157.254.74
Found Yes
Hash 36a66854f27c0a864462e02c061ab6ba432718ee93fabae86f1b4365135d044a
SimHash fa2cd870a2b3
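
The Hash value is 64 hexadecimal characters long, which matches the length of a SHA-256 digest. The sketch below shows how such a fingerprint could be used to detect changes between scans. It assumes the hash is SHA-256 over the raw response body, which the report does not confirm. `content_hash` and `has_changed` are hypothetical helper names.

```python
import hashlib

# Hash value recorded in the scan report.
REPORTED_HASH = "36a66854f27c0a864462e02c061ab6ba432718ee93fabae86f1b4365135d044a"


def content_hash(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()


def has_changed(body: bytes, previous_hash: str = REPORTED_HASH) -> bool:
    """Report whether a freshly fetched body differs from the stored hash."""
    return content_hash(body) != previous_hash
```

An exact hash like this changes on any byte-level edit, whereas a SimHash is designed so that similar documents produce similar values.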

Groups

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

awariorssbot

Rule Path
Disallow /

awariosmartbot

Rule Path
Disallow /

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

*

Rule Path
Disallow /disclaimer

Other Records

Field Value
sitemap https://guidaoggiintv.it/sitemap.xml
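
The groups and records listed above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds an equivalent robots.txt from the report and queries it. The rebuilt text is a reconstruction, so the live file's casing, ordering, and comments may differ. `GROUPS`, `build_robots_txt`, and `load_parser` are illustrative names, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Groups as listed in the report, in report order: (user-agent, directives).
GROUPS = [
    ("claudebot", ["Disallow: /"]),
    ("claude-web", ["Disallow: /"]),
    ("anthropic-ai", ["Disallow: /"]),
    ("awariorssbot", ["Disallow: /"]),
    ("awariosmartbot", ["Disallow: /"]),
    ("ahrefsbot", ["Crawl-delay: 5"]),  # no path rules, only a crawl delay
    ("omgili", ["Disallow: /"]),
    ("omgilibot", ["Disallow: /"]),
    ("gptbot", ["Disallow: /"]),
    ("chatgpt-user", ["Disallow: /"]),
    ("ccbot", ["Disallow: /"]),
    ("imagesiftbot", ["Disallow: /"]),
    ("*", ["Disallow: /disclaimer"]),
]
SITEMAP = "https://guidaoggiintv.it/sitemap.xml"
SITE = "https://guidaoggiintv.it"


def build_robots_txt() -> str:
    """Rebuild a robots.txt equivalent to the report (a reconstruction)."""
    blocks = ["\n".join([f"User-agent: {agent}", *rules]) for agent, rules in GROUPS]
    blocks.append(f"Sitemap: {SITEMAP}")
    return "\n\n".join(blocks) + "\n"


def load_parser() -> RobotFileParser:
    """Parse the reconstructed file without any network access."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


parser = load_parser()
print(parser.can_fetch("GPTBot", SITE + "/"))               # False
print(parser.can_fetch("AhrefsBot", SITE + "/"))            # True
print(parser.crawl_delay("AhrefsBot"))                      # 5
print(parser.can_fetch("Googlebot", SITE + "/disclaimer"))  # False
print(parser.can_fetch("Googlebot", SITE + "/"))            # True
print(parser.site_maps())
```

Python's parser matches user agents by case-insensitive substring and applies the first matching rule, so real crawlers may interpret edge cases differently.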