ilcorsaroblu.org
robots.txt

Robots Exclusion Standard data for ilcorsaroblu.org

Resource Scan

Scan Details

Site Domain ilcorsaroblu.org
Base Domain ilcorsaroblu.org
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-09-19T07:16:08+00:00
Next Scan 2024-09-26T07:16:08+00:00

Last Successful Scan

Scanned 2024-01-23T04:32:19+00:00
URL https://ilcorsaroblu.org/robots.txt
Domain IPs 104.21.4.14, 172.67.223.242, 2606:4700:3032::ac43:dff2, 2606:4700:3033::6815:40e
Response IP 172.67.223.242
Found Yes
Hash fc8607b394b0be4f9da3247871b5e6b6ff8aa36224763b61ae3440b67e3cce05
SimHash 875f38466f13

Groups

*

Rule Path
Allow /
Disallow /system

googlebot

Rule Path
Allow /

bingbot

Rule Path
Allow /

baiduspider

Rule Path
Allow /

applebot

Rule Path
Allow /

ahrefsbot

Rule Path
Disallow /

baiduspider

Rule Path
Disallow /

easouspider

Rule Path
Disallow /

ezooms

Rule Path
Disallow /

yandexbot

Rule Path
Disallow /

mj12bot

Rule Path
Disallow /

sitesucker

Rule Path
Disallow /

httrack

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

emailcollector

Rule Path
Disallow /

emailsiphon

Rule Path
Disallow /

webbandit

Rule Path
Disallow /

webzip

Rule Path
Disallow /

webreaper

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

web downloader

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

offline explorer pro

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

httrack website copier

Rule Path
Disallow /

offline commander

Rule Path
Disallow /

leech

Rule Path
Disallow /

websnake

Rule Path
Disallow /

blackwidow

Rule Path
Disallow /

http weazel

Rule Path
Disallow /
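
How the groups above resolve for a given path can be sketched with a small longest-match evaluator in the spirit of RFC 9309. This is an illustration under stated assumptions, not the scanner's or any crawler's actual logic: the helper name is_allowed is hypothetical, only plain path prefixes are handled (no "*" or "$" wildcards), and the two baiduspider groups are assumed to be merged into one, with Allow winning a tie between equally long rules.

```python
# Hypothetical helper: longest-match rule evaluation in the spirit of RFC 9309.
# Assumption: rules are plain path prefixes (no "*" or "$" wildcards).

def is_allowed(rules, path):
    """Return True if `path` may be fetched under one group's rules.

    `rules` is a list of (directive, path_prefix) pairs, e.g. ("Disallow", "/system").
    """
    best_len = -1
    allowed = True  # a path matched by no rule is allowed
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        is_allow = directive.lower() == "allow"
        # The longest matching prefix wins; on a tie, Allow wins.
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# Rules copied from the groups listed above.
star_group = [("Allow", "/"), ("Disallow", "/system")]
ahrefsbot_group = [("Disallow", "/")]
# baiduspider appears twice in the listing; the two groups are merged here (assumption).
baiduspider_merged = [("Allow", "/"), ("Disallow", "/")]

print(is_allowed(star_group, "/index.html"))          # True
print(is_allowed(star_group, "/system/config"))       # False
print(is_allowed(ahrefsbot_group, "/index.html"))     # False
print(is_allowed(baiduspider_merged, "/index.html"))  # True
```

Parsers that apply rules in file order (first match) rather than by longest match may reach a different result for /system under the * group, since Allow / is listed first.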

Other Records

Field Value
sitemap https://www.ilcorsaroblu.org/sitemap-files.xml