marieclaire.fr
robots.txt

Robots Exclusion Standard data for marieclaire.fr

Resource Scan

Scan Details

Site Domain marieclaire.fr
Base Domain marieclaire.fr
Scan Status Ok
Last Scan 2024-05-04T17:33:09+00:00
Next Scan 2024-05-11T17:33:09+00:00

Last Scan

Scanned 2024-05-04T17:33:09+00:00
URL https://marieclaire.fr/robots.txt
Redirect https://www.marieclaire.fr/robots.txt
Redirect Domain www.marieclaire.fr
Redirect Base marieclaire.fr
Domain IPs 195.200.101.75
Redirect IPs 195.200.101.76
Response IP 195.200.101.76
Found Yes
Hash a19ffd6ee54e9c3ac5903f1c7c30eae4c2043d7319da4c5d87ecae41ed7c58cf
SimHash 0075b874e253

Groups

*

Rule Path
Disallow /*?firstId=
Disallow /ope/
Disallow /adsite-under/
Disallow /recherche
Disallow /archives/*/*/*
Disallow /direct/
Disallow /photo/*/*/*
Disallow /blogs/
Disallow /idees/blogs/
Disallow /idees/fiches/objet%3D
Disallow /idees/fiches/theme%3D
Disallow /idees/fiches/technique%3D
Disallow /idees/fiches/magazine%3D
Disallow /idees/archives/*
Disallow /maison/blogs/
Disallow /cuisine/recettes/*
Disallow /print/article/*
Disallow /webview/
Disallow /codes-promo/codes-promo/visit/*
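
Several of the paths above contain `*` wildcards. As a rough illustration of how such rules are commonly evaluated, the sketch below follows the matching convention described in RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match. The helper names `rule_to_regex` and `is_disallowed` and the sample paths are hypothetical, and only a subset of the rules is shown. Individual crawlers may interpret rules differently.

```python
import re

# A few of the "*" group rules listed above (subset, for illustration only).
RULES = ["/*?firstId=", "/archives/*/*/*", "/recherche", "/photo/*/*/*"]

def rule_to_regex(rule):
    """Translate one robots.txt path rule into a compiled regex.

    Assumes RFC 9309-style semantics: '*' is a wildcard, a trailing '$'
    anchors the end of the path, and the rule is otherwise a prefix.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path, rules):
    """Return True if any rule matches the path from its start."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

# Hypothetical paths, not taken from the scanned site.
for sample in ["/archives/2020/05/article", "/archives/2020",
               "/mode/article?firstId=3", "/mode/article"]:
    print(sample, is_disallowed(sample, RULES))
```

Under these assumptions, `/archives/*/*/*` only matches paths with at least three further segments below `/archives/`, so a shallower path such as `/archives/2020` is not covered by that rule.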

ccbot

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

chatgpt-user

Rule Path
Disallow /
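
The per-agent groups above can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch, not the scanner's own logic: the robots.txt text is reconstructed from the tables in this report rather than fetched, `ExampleBot` and the `/mode/` path are hypothetical, and only plain prefix rules from the `*` group are included because the standard-library parser does not implement `*` wildcard matching.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed from the groups listed in this report (assumption: the
# live file may differ in order, casing, or additional directives).
ROBOTS_TXT = """\
User-agent: *
Disallow: /ope/
Disallow: /recherche
Disallow: /blogs/
Disallow: /webview/

User-agent: ccbot
Disallow: /

User-agent: gptbot
Disallow: /

User-agent: chatgpt-user
Disallow: /
"""

def build_parser(text):
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

# Agent names are compared case-insensitively by the parser.
for agent, path in [("GPTBot", "/"), ("CCBot", "/mode/"),
                    ("ChatGPT-User", "/mode/"), ("ExampleBot", "/mode/"),
                    ("ExampleBot", "/recherche")]:
    print(agent, path, parser.can_fetch(agent, path))
```

The three named agents are blocked from the whole site, while any other agent falls back to the `*` group and is only blocked from the listed paths.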