croire.la-croix.com
robots.txt

Robots Exclusion Standard data for croire.la-croix.com

Resource Scan

Scan Details

Site Domain croire.la-croix.com
Base Domain la-croix.com
Scan Status Ok
Last Scan 2024-05-07T22:22:22+00:00
Next Scan 2024-06-06T22:22:22+00:00

Last Scan

Scanned 2024-05-07T22:22:22+00:00
URL https://croire.la-croix.com/robots.txt
Domain IPs 18.65.3.58, 18.65.3.59, 18.65.3.74, 18.65.3.81
Response IP 3.160.246.68
Found Yes
Hash 321eee8744872aadd40dfd987de2a43015a2f34216a714b116f418c5a9dd23c0
SimHash 6305d8132d5f

Groups

*

Rule Path
Disallow *%28offset%29*
Disallow *%28selection%29*
Disallow *%28cart%29*
Disallow *%28part%29*
Disallow *?id_folder=*
Disallow *?from_univers=lacroix
Disallow /inscription-newsletter
Disallow /Recherche/
Disallow /print/
Disallow /api/logged/
Disallow /api/comments/
Disallow /api/navis/
Disallow /api/paywall/
Disallow /api/propositions/
Disallow /api/cache/
Disallow /api/suggestions/
Disallow /api/articles/
Disallow /api/checkemails/
Disallow /api/user/
Disallow /api/zones/
Disallow /pdf/
Disallow /build/
Disallow /var/
Disallow /France/
Disallow /Monde/
Disallow /Religion/
Disallow /Economie/
Disallow /Culture/
Disallow /environnement/
Disallow /Debats/
Disallow /Sciences-et-ethique/
Disallow /art-de-vivre/
Disallow /Sport/
Disallow /Famille/
Disallow /Urbi-et-Orbi/
Disallow /login*
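
Several of the rules above use `*` wildcards. A minimal sketch of how a crawler might evaluate them follows, assuming the common interpretation where `*` matches any run of characters, a trailing `$` anchors the end of the path, and every rule is otherwise a prefix match. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, `RULES` is only a subset of the list above, and the sketch does not normalize percent-encoding the way a full parser would.

```python
import re
from typing import Iterable


def rule_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a regex (illustrative only)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" for each "*" wildcard.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_disallowed(path: str, disallow_rules: Iterable[str]) -> bool:
    """Return True if any Disallow pattern matches from the start of the path."""
    # re.match anchors at the beginning, which gives prefix-match semantics.
    return any(rule_to_regex(rule).match(path) for rule in disallow_rules)


# A subset of the Disallow rules listed for the "*" group.
RULES = ["*%28offset%29*", "*?id_folder=*", "/print/", "/api/user/", "/login*"]

if __name__ == "__main__":
    for candidate in ("/print/article-123", "/page?id_folder=12", "/Saints/Saint-Pierre"):
        print(candidate, is_disallowed(candidate, RULES))
```

Because this group contains only Disallow rules, the sketch skips the Allow/Disallow precedence step that a complete matcher would need.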

Other Records

Field Value
sitemap https://croire.la-croix.com/sitemap_dossiers.xml
sitemap https://croire.la-croix.com/RSS/Saint
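
The sitemap records can be read with Python's standard `urllib.robotparser` module. The sketch below is an illustration under assumptions: `SAMPLE` is a hand-written reconstruction from the fields in this report, not the actual file served by the site, and `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed fragment based on this report; the real file may differ.
SAMPLE = """\
User-agent: *
Disallow: /print/
Sitemap: https://croire.la-croix.com/sitemap_dossiers.xml
Sitemap: https://croire.la-croix.com/RSS/Saint
"""

parser = RobotFileParser()
# parse() accepts an iterable of lines, so no network access is needed.
parser.parse(SAMPLE.splitlines())

# site_maps() returns the listed sitemap URLs, or None if there are none.
sitemaps = parser.site_maps()

if __name__ == "__main__":
    for url in sitemaps or []:
        print(url)
```

Note that `urllib.robotparser` treats rule paths as plain prefixes, so it is not a reliable way to evaluate the wildcard rules in this file.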

Comments

  • www.robotstxt.org/
  • www.google.com/support/webmasters/bin/answer.py?hl=en&answer=156449
  • Avoid execution of ajax actions (navi, comments) and print pages