selogerneuf.com
robots.txt

Robots Exclusion Standard data for selogerneuf.com

Resource Scan

Scan Details

Site Domain	selogerneuf.com
Base Domain	selogerneuf.com
Scan Status	Ok
Last Scan	2026-01-21T02:01:40+00:00
Next Scan	2026-02-20T02:01:40+00:00

Last Scan

Scanned	2026-01-21T02:01:40+00:00
URL	https://selogerneuf.com/robots.txt
Domain IPs	52.84.45.116, 52.84.45.31, 52.84.45.7, 52.84.45.8
Response IP	3.169.71.95
Found	Yes
Hash	9bafdcdd62f7d63dcad49027a919bd62310f2bba7daac1415cc82482712e4618
SimHash	be3c718bcec4

Groups

*

Rule	Path
Disallow	/z/
Allow	/z/produits/assets/css/
Allow	/z/produits/assets/js/
Allow	/z/*.jpg
Allow	/z/*.gif
Allow	/z/*.png
Disallow	/noindex/
Disallow	/recherche%2Calerte%2Ccreation.htm
Disallow	/cgi/
Disallow	/prerecherche.htm
Disallow	/cartographie.htm
Disallow	/r%2Cgo
Disallow	/carto%2Ccarte.htm
Disallow	/cartepop.htm
Disallow	/prj%2Caddalerte.htm
Disallow	/residence_print.htm
Disallow	/creation.htm
Disallow	/alerte_email.htm
Disallow	/form_nous_contacter.htm
Disallow	/affiliation%2Ccollecte_newsletter.htm
Disallow	/affiliation%2Ctemplate_affiliation.htm
Disallow	/detail%2Cincl_coord_annonceur.htm
Disallow	/recherche%2Cframe_300_250.htm
Disallow	/recherche%2Cframe_300_600.htm
Disallow	/recherche%2Cframe_300_600.htm
Disallow	/recherche%2Cframe_300_600_2.htm
Disallow	/recherche%2Cframe_300_encart.htm
Disallow	/recherche%2Cframe_728_90.htm
Disallow	/*/detail%2Cincl_coord_annonceur.htm
Disallow	/*/residence_print.htm
Disallow	/*/documentation_programme_is.htm
Disallow	/rss%2Crecherche.xml
Disallow	/recherche%2Cframe_300
Disallow	/listing%2Cadvanced_search.htm
Disallow	/recherche-avancee.htm
Disallow	/*/new_detail%2Cajax%2Call_lots.htm
Disallow	/*/new_detail%2Cajax%2Cpoi_data.htm
Disallow	/*/interceptor%2Cpages.json
Disallow	/*/dem_doc.htm
Disallow	*tri%3D*
Disallow	/*/annonces?*
Disallow	/recherche*
Disallow	/annuaire/recherche*
Disallow	*?bp=*
Disallow	*?ann_neufpg=*
Disallow	*?ci=*
Disallow	*?bd=*
Disallow	*?cp=*
Disallow	*?cmp=*
Disallow	*?vedette=*
Disallow	*?ali=*
Disallow	*?idannonce=*
Disallow	*?div=*
Disallow	*?idpays=*
Disallow	*?p=*
Disallow	*?annuaireLabel=*
Disallow	*clickserve.dartsearch.net*
Disallow	/gtmlocal
Disallow	/lib/seo/reportLinks.js
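
The rules in this group mix plain prefixes with `*` wildcards, and several Allow lines carve exceptions out of the `/z/` Disallow. The sketch below shows one way such a group could be evaluated, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The function names and the short `RULES` sample are illustrative and not part of the scan; paths are assumed to be compared in the same percent-encoded form as the patterns.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; every other character is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """rules: list of (directive, pattern) pairs taken from one group."""
    best_len, allowed = -1, True  # no matching rule means the path is allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # an empty Disallow matches nothing
        if pattern_to_regex(pattern).match(path):
            is_allow = directive.lower() == "allow"
            # Longest pattern wins; on equal length, Allow wins.
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed

# A small sample of the '*' group (assumed subset, for illustration only).
RULES = [
    ("Disallow", "/z/"),
    ("Allow", "/z/produits/assets/css/"),
    ("Allow", "/z/*.jpg"),
    ("Disallow", "/recherche*"),
    ("Disallow", "*?bp=*"),
]

print(is_allowed(RULES, "/z/page.htm"))      # blocked by /z/
print(is_allowed(RULES, "/z/photos/a.jpg"))  # rescued by the longer /z/*.jpg
```

Real crawlers may differ in details such as percent-encoding normalization, so this is a reading aid rather than a reference implementation.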

mj12bot

Rule	Path
Disallow

ubicrawler

Rule	Path
Disallow	/

doc

Rule	Path
Disallow	/

zao

Rule	Path
Disallow	/

sitecheck.internetseer.com

Rule	Path
Disallow	/

zealbot

Rule	Path
Disallow	/

msiecrawler

Rule	Path
Disallow	/

sitesnagger

Rule	Path
Disallow	/

webstripper

Rule	Path
Disallow	/

webcopier

Rule	Path
Disallow	/

fetch

Rule	Path
Disallow	/

offline explorer

Rule	Path
Disallow	/

teleport

Rule	Path
Disallow	/

teleportpro

Rule	Path
Disallow	/

webzip

Rule	Path
Disallow	/

linko

Rule	Path
Disallow	/

httrack

Rule	Path
Disallow	/

microsoft.url.control

Rule	Path
Disallow	/

xenu

Rule	Path
Disallow	/

larbin

Rule	Path
Disallow	/

libwww

Rule	Path
Disallow	/

zyborg

Rule	Path
Disallow	/

download ninja

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/

grub-client

Rule	Path
Disallow	/

k2spider

Rule	Path
Disallow	/

npbot

Rule	Path
Disallow	/

webreaper

Rule	Path
Disallow	/
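
The per-crawler groups above only take effect once a crawler decides which group applies to it. A minimal sketch of that selection follows, assuming a simple case-insensitive substring match of the group name against the User-Agent string, with the `*` group as fallback. The `select_group` helper and the abbreviated `GROUPS` mapping are hypothetical; RFC 9309 matches on the product token, so short names such as `doc` or `fetch` would match far more broadly under this naive approach.

```python
def select_group(groups, user_agent):
    """Return the rules of the group whose name appears in the user-agent
    string (case-insensitive); fall back to the '*' group otherwise."""
    ua = user_agent.lower()
    matches = [name for name in groups if name != "*" and name.lower() in ua]
    if matches:
        # Prefer the longest (most specific) matching name.
        return groups[max(matches, key=len)]
    return groups.get("*", [])

# Abbreviated copy of the groups listed above (assumed subset).
GROUPS = {
    "*": [("Disallow", "/z/"), ("Disallow", "/noindex/")],
    "mj12bot": [("Disallow", "")],   # empty path: nothing is blocked
    "wget": [("Disallow", "/")],     # whole site blocked
    "teleport": [("Disallow", "/")],
    "teleportpro": [("Disallow", "/")],
}

print(select_group(GROUPS, "Wget/1.21.4"))
print(select_group(GROUPS, "Mozilla/5.0 (compatible; MJ12bot/v1.4.8)"))
```

Under this reading, mj12bot is the only named crawler with unrestricted access, since its Disallow value is empty, while every other named group is shut out entirely.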

Other Records

Field	Value
sitemap	https://www.selogerneuf.com/sitemaps/index.xml

Comments

Sorry, wget in its recursive mode is a frequent problem.
Please read the man page and use it properly; there is a
--wait option you can use to set the delay between hits,
for instance.
The 'grub' distributed client has been *very* poorly behaved.
Doesn't follow robots.txt anyway, but...
Hits many times per second, not acceptable
http://www.nameprotect.com/botinfo.html
A capture bot, downloads gazillions of pages with no public benefit
http://www.webreaper.net/
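
The wget comment asks for a delay between hits. For a client written in Python, the same courtesy could be sketched as below; `fetch_politely`, its parameters, and the injected `fetch` callable are hypothetical names chosen for illustration, and the one-second default is an assumption, not a value taken from this robots.txt.

```python
import time

def fetch_politely(urls, fetch, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL, pausing between consecutive requests."""
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request after the first
        results.append(fetch(url))
    return results
```

Passing `sleep` as a parameter keeps the pause testable without real waiting; in normal use the default `time.sleep` applies.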
