lartisien.com
robots.txt

Robots Exclusion Standard data for lartisien.com

Resource Scan

Scan Details

Site Domain lartisien.com
Base Domain lartisien.com
Scan Status Ok
Last Scan 2026-01-12T12:22:08+00:00
Next Scan 2026-02-11T12:22:08+00:00

Last Scan

Scanned 2026-01-12T12:22:08+00:00
URL https://lartisien.com/robots.txt
Redirect https://www.lartisien.com/robots.txt
Redirect Domain www.lartisien.com
Redirect Base lartisien.com
Domain IPs 104.26.14.7, 104.26.15.7, 172.67.68.75, 2606:4700:20::681a:e07, 2606:4700:20::681a:f07, 2606:4700:20::ac43:444b
Redirect IPs 104.26.14.7, 104.26.15.7, 172.67.68.75, 2606:4700:20::681a:e07, 2606:4700:20::681a:f07, 2606:4700:20::ac43:444b
Response IP 104.26.14.7
Found Yes
Hash 426eeb5700cd9f88d5c66a96ff9f34476dbacc487512c817b8abf0faa3178b19
SimHash 7f738cf4f391

Groups

*

Rule Path
Disallow /*.pdf
Disallow /*.swf
Disallow /*.doc
Disallow /login
Disallow /signup
Disallow /settings
Disallow /*?*cur=
Disallow /*?*in=
Disallow /browse/
Disallow /v19next/*
Disallow */check/
Disallow /partner-hotel/*
Disallow /membership
Allow /*css?*
Allow /*js?*
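
The rules in the "*" group rely on wildcard patterns, so a plain prefix comparison is not enough to evaluate them. The following is a minimal sketch, assuming the matching semantics described in RFC 9309 ("*" matches any run of characters, a trailing "$" anchors the end, the longest matching pattern wins, and Allow wins a tie). The names RULES, pattern_to_regex and is_allowed are hypothetical helpers introduced for illustration; they are not part of the scan tool or of the site.

```python
import re

# Rules copied from the "*" group above, as (directive, path pattern) pairs.
RULES = [
    ("disallow", "/*.pdf"),
    ("disallow", "/*.swf"),
    ("disallow", "/*.doc"),
    ("disallow", "/login"),
    ("disallow", "/signup"),
    ("disallow", "/settings"),
    ("disallow", "/*?*cur="),
    ("disallow", "/*?*in="),
    ("disallow", "/browse/"),
    ("disallow", "/v19next/*"),
    ("disallow", "*/check/"),
    ("disallow", "/partner-hotel/*"),
    ("disallow", "/membership"),
    ("allow", "/*css?*"),
    ("allow", "/*js?*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` (path plus query string) may be fetched.

    The longest matching pattern decides; on equal length Allow wins.
    A path that matches no rule is allowed.
    """
    best_len = -1
    allowed = True
    for directive, pattern in rules:
        # re.match anchors at the start of the path, like a prefix match.
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            if length > best_len or (length == best_len and directive == "allow"):
                best_len = length
                allowed = directive == "allow"
    return allowed
```

For example, is_allowed("/files/brochure.pdf") and is_allowed("/hotels?cur=EUR") both evaluate to False under these assumptions, while is_allowed("/assets/app.js?v=3") evaluates to True. Real crawlers may differ in how they resolve overlapping rules, so this is a guide to reading the table rather than a statement of how any particular crawler behaves.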

emailcollector

Rule Path
Disallow /

emailsiphon

Rule Path
Disallow /

linkextractorpro

Rule Path
Disallow /

propowerbot/2.14

Rule Path
Disallow /

backdoorbot/1.0

Rule Path
Disallow /
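
The named groups above block specific crawlers from the whole site, while every other crawler falls back to the "*" group. As a rough sketch, the effect can be checked with Python's standard urllib.robotparser module. The robots.txt text below is a partial reconstruction from the tables in this report (the raw file layout is not shown here, so the exact formatting is an assumption), and only literal rules are included because the standard-library parser does not implement "*" wildcards in paths. The crawler name "SomeOtherBot" is a made-up example.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of the file from the groups listed above.
# Assumption: one "User-agent" line per group, rules in the order shown.
ROBOTS_TXT = """\
User-agent: *
Disallow: /login
Disallow: /signup
Disallow: /settings
Disallow: /membership

User-agent: emailcollector
Disallow: /

User-agent: emailsiphon
Disallow: /

User-agent: linkextractorpro
Disallow: /
"""

parser = RobotFileParser()
# parse() takes an iterable of lines, so no network request is made.
parser.parse(ROBOTS_TXT.splitlines())

HOME = "https://www.lartisien.com/"

# A crawler named in its own group is blocked from the whole site.
collector_ok = parser.can_fetch("EmailCollector", HOME)

# A crawler without its own group is governed by the "*" group.
other_home_ok = parser.can_fetch("SomeOtherBot", HOME)
other_login_ok = parser.can_fetch("SomeOtherBot", HOME + "login")

print(collector_ok, other_home_ok, other_login_ok)
```

User-agent matching is case-insensitive in this parser, which is why "EmailCollector" falls into the "emailcollector" group.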

Other Records

Field Value
sitemap http://www.lartisien.com/sitemap.xml

Comments

  • Rename the file to "robots.txt" and place it at the root of the subdomain
  • List the URLs of the various sitemaps here (for the current subdomain, here www)