guidaoggiintv.it
robots.txt

Robots Exclusion Standard data for guidaoggiintv.it

Resource Scan

Scan Details

Site Domain guidaoggiintv.it
Base Domain guidaoggiintv.it
Scan Status Ok
Last Scan 2024-10-31T01:13:49+00:00
Next Scan 2024-11-07T01:13:49+00:00

Last Scan

Scanned 2024-10-31T01:13:49+00:00
URL https://guidaoggiintv.it/robots.txt
Domain IPs 108.157.254.120, 108.157.254.30, 108.157.254.47, 108.157.254.74, 2600:9000:2755:1c00:6:853b:5500:93a1, 2600:9000:2755:1e00:6:853b:5500:93a1, 2600:9000:2755:2600:6:853b:5500:93a1, 2600:9000:2755:2800:6:853b:5500:93a1, 2600:9000:2755:4400:6:853b:5500:93a1, 2600:9000:2755:8a00:6:853b:5500:93a1, 2600:9000:2755:a200:6:853b:5500:93a1, 2600:9000:2755:c000:6:853b:5500:93a1
Response IP 108.157.254.47
Found Yes
Hash 36a66854f27c0a864462e02c061ab6ba432718ee93fabae86f1b4365135d044a
SimHash fa2cd870a2b3
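
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the scan does not name the algorithm. The sketch below assumes SHA-256 over the raw response body and shows how a freshly fetched robots.txt could be compared against the recorded value. The function names are hypothetical, and a mismatch may simply mean the file changed or that the scanner hashes a normalized form.

```python
import hashlib

# Hash value copied from the scan record above.
REPORTED_HASH = "36a66854f27c0a864462e02c061ab6ba432718ee93fabae86f1b4365135d044a"


def sha256_hex(body: bytes) -> str:
    """Return the SHA-256 digest of the raw bytes as lowercase hex."""
    return hashlib.sha256(body).hexdigest()


def matches_reported_hash(body: bytes, reported: str = REPORTED_HASH) -> bool:
    """Compare a fetched body against the recorded hash (assumes SHA-256)."""
    return sha256_hex(body) == reported.lower()
```

To use it, pass the exact bytes returned for https://guidaoggiintv.it/robots.txt to `matches_reported_hash`.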

Groups

claudebot

Rule Path
Disallow /

claude-web

Rule Path
Disallow /

anthropic-ai

Rule Path
Disallow /

awariorssbot

Rule Path
Disallow /

awariosmartbot

Rule Path
Disallow /

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 5

omgili

Rule Path
Disallow /

omgilibot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

imagesiftbot

Rule Path
Disallow /

*

Rule Path
Disallow /disclaimer

Other Records

Field Value
sitemap https://guidaoggiintv.it/sitemap.xml
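
The groups listed above can be checked with Python's standard `urllib.robotparser`. The sketch below rebuilds an approximate robots.txt from the scan data; the exact bytes, letter casing, and comments of the original file are not shown in the scan, so the reconstructed text is an assumption, as is the placeholder agent name `examplebot`.

```python
from urllib.robotparser import RobotFileParser

# Groups in the order the scan lists them: (user agent, directive lines).
GROUPS = [
    ("claudebot", ["Disallow: /"]),
    ("claude-web", ["Disallow: /"]),
    ("anthropic-ai", ["Disallow: /"]),
    ("awariorssbot", ["Disallow: /"]),
    ("awariosmartbot", ["Disallow: /"]),
    ("ahrefsbot", ["Crawl-delay: 5"]),  # no path rules, only a crawl delay
    ("omgili", ["Disallow: /"]),
    ("omgilibot", ["Disallow: /"]),
    ("gptbot", ["Disallow: /"]),
    ("chatgpt-user", ["Disallow: /"]),
    ("ccbot", ["Disallow: /"]),
    ("imagesiftbot", ["Disallow: /"]),
    ("*", ["Disallow: /disclaimer"]),
]
SITEMAP = "https://guidaoggiintv.it/sitemap.xml"


def build_robots_txt() -> str:
    """Rebuild an approximate robots.txt (assumed layout, not the original bytes)."""
    blocks = ["\n".join([f"User-agent: {agent}", *lines]) for agent, lines in GROUPS]
    blocks.append(f"Sitemap: {SITEMAP}")
    return "\n\n".join(blocks) + "\n"


def load_parser() -> RobotFileParser:
    """Parse the reconstructed text without making a network request."""
    parser = RobotFileParser()
    parser.parse(build_robots_txt().splitlines())
    return parser


if __name__ == "__main__":
    parser = load_parser()
    base = "https://guidaoggiintv.it"
    # "examplebot" is a hypothetical agent that falls through to the "*" group.
    for agent in ("gptbot", "ahrefsbot", "examplebot"):
        for path in ("/", "/disclaimer"):
            print(agent, path, parser.can_fetch(agent, base + path))
    print("ahrefsbot crawl delay:", parser.crawl_delay("ahrefsbot"))
    print("sitemaps:", parser.site_maps())
```

Under these assumptions, the listed AI and monitoring crawlers are refused every path, ahrefsbot may fetch any path subject to its crawl delay, and any other agent is refused only paths beginning with /disclaimer.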