carrefour.it
robots.txt

Robots Exclusion Standard data for carrefour.it

Resource Scan

Scan Details

Site Domain carrefour.it
Base Domain carrefour.it
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2024-03-19T08:37:39+00:00
Next Scan 2024-06-17T08:37:39+00:00

Last Successful Scan

Scanned 2023-05-16T14:11:19+00:00
URL https://www.carrefour.it/robots.txt
Domain IPs 104.21.28.98, 172.67.145.209
Response IP 172.67.145.209
Found Yes
Hash 470a9f404c498332c4186cf28f932b537db856c603885d1f0faaec0d62884a50
SimHash fc57ce22c783

Groups

dotbot
atomic_email_hunter/4.0
atspider/1.0
autoemailspider
bwh3_user_agent
china local browse 2.6
contactbot/0.2
contentsmartz
datacha0s/2.0
datacha0s/2.0
dbrowse 1.4b
dbrowse 1.4d
demo bot dot 16b
demo bot z 16b
dsurf15a 01
dsurf15a 71
dsurf15a 81
dsurf15a va
ebrowse 1.4b
educate search vxb
emailsiphon
emailspider
emailwolf 1.00
esurf15a 15
extractorpro
franklin locator 1.8
fsurf15a 01
full web bot 0416b
full web bot 0516b
full web bot 2816b

Rule Path
Disallow /
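
The group above applies a blanket `Disallow /` to every listed user agent, while all other crawlers fall through to the `*` group. A minimal sketch of that selection step follows. It assumes a simple case-insensitive substring match on the User-Agent header; real crawlers may match product tokens more strictly. `BLOCKED_AGENTS` and `select_group` are hypothetical names, and the list holds only a few of the entries shown above.

```python
# A few names copied from the first group above; the full list is longer.
BLOCKED_AGENTS = ["dotbot", "emailsiphon", "emailspider", "extractorpro"]

def select_group(user_agent):
    """Return 'blocked' if a listed name occurs in the User-Agent
    (case-insensitive substring match, an assumption of this sketch),
    otherwise '*' for the catch-all group."""
    ua = user_agent.lower()
    if any(name in ua for name in BLOCKED_AGENTS):
        return "blocked"
    return "*"
```

A crawler that lands in the blocked group may not fetch any path; one that lands in `*` is subject only to the path rules of that group.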

*

Rule Path
Disallow /search
Disallow /multisearch
Disallow /cart
Disallow /login?rurl%3F
Disallow */?srule
Disallow */?pmax
Disallow /*.pdf$
Disallow *?prefn1
Disallow *?period
Disallow *?label
Disallow *prefv1
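
Several of the rules above rely on the `*` wildcard and the `$` end anchor. The sketch below shows one way such patterns could be evaluated against a URL path plus query string. It assumes the common convention that `*` matches any run of characters, a trailing `$` anchors the end, and every pattern is otherwise a prefix match. It ignores percent-encoding normalization and `Allow` precedence, since this group has no `Allow` rules. `pattern_to_regex` and `is_disallowed` are hypothetical helper names.

```python
import re

# Disallow patterns copied from the "*" group above.
DISALLOW = [
    "/search", "/multisearch", "/cart", "/login?rurl%3F",
    "*/?srule", "*/?pmax", "/*.pdf$", "*?prefn1", "*?period",
    "*?label", "*prefv1",
]

def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.
    '*' becomes '.*'; a trailing '$' anchors the end of the path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with the wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path):
    """True if any Disallow pattern matches from the start of the path."""
    return any(pattern_to_regex(p).match(path) for p in DISALLOW)
```

Under these assumptions, `is_disallowed("/cart")` and `is_disallowed("/p/latte?prefn1=brand")` are true, while `is_disallowed("/")` and `is_disallowed("/p/latte")` are false.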

Other Records

Field Value
sitemap https://www.carrefour.it/sitemap_index.xml

Warnings

  • 1 invalid line.