infodujour.fr
robots.txt

Robots Exclusion Standard data for infodujour.fr

Resource Scan

Scan Details

Site Domain	infodujour.fr
Base Domain	infodujour.fr
Scan Status	Ok
Last Scan	2024-11-09T09:28:41+00:00
Next Scan	2024-11-16T09:28:41+00:00

Last Scan

Scanned	2024-11-09T09:28:41+00:00
URL	https://infodujour.fr/robots.txt
Domain IPs	2001:41d0:8:88df::, 5.39.70.223
Response IP	5.39.70.223
Found	Yes
Hash	d76455059753a3456785b539d9c66941b3284c3a68f1d182db08a4d905cb67be
SimHash	e25a52204f87

Groups

ia_archiver

Rule	Path
Disallow	/

googlebot-news

Rule	Path
Disallow	/mot-cle/
Disallow	/auteur/

*

Rule	Path
Disallow	/wp-login.php
Disallow	/wp-admin
Disallow	/mot-cle
Disallow	/auteur/
Disallow	/page/

adequat
adequat-systems
ahrefsbot
alexibot
alphaseobot
alphaseobot-sa
alvinetspider
amisoftware
antenne hatena
apocalxexplorerbot
ask n read
asknread.com
asterias
augure
auramundi
backdoorbot/1.0
bizinformation
black hole
blexbot
blowfish/1.0
botalot
builtbottough
bullseye/1.0
bunnyslippers
cegbfeieh
cheesebot
cherrypicker
cherrypickerelite/1.0
cherrypickerse/1.0
cision
coexel
converacrawler
copyrightcheck
corporama
cosmos
crescent
crescent internet toolpak http ole control v.1.0
digimind
disco pump 3.1
dittospyder
dotbot
ellisphere
emailcollector
emailsiphon
emailwolf
erocrawler
exabot
extractorpro
fetch
flamingo_searchengine
flipboard
foobot
grapeshot
grub-client
harvest/1.5
hloader
httplib
httrack
httrack 3.0
humanlinks
igentia
infonavirobot
infoseek
jennybot
jetbot
jikespider
k2spider
kbcrawl
kenjin spider
knowings
leadbox
lexibot
libweb/clshttp
libwww
linkextractorpro
linkfluence
linkscan/8.1a unix
linkwalker
lwp-trivial
lwp-trivial/1.34
mata hari
meltawer
mention
microsoft url control - 5.01.4511
microsoft url control - 6.00.8169
miixpc
miixpc/4.2
mister pix
mj12bot
mlbot
moget
moget/2.1
moreover
ms search 4.0 robot
ms search 5.0 robot
msiecrawler
mytwip
naverbot
netants
netattache
netmechanic
newsnow
newzbin
nicerspro
offline explorer
omgili
omgilibot
openfind
openindexspider
opinion-tracker
propowerbot/2.14
prowebwalker
proxem
psbot
quepasacreep
queryn metasearch
qwam content intelligence
repomonkey
rma
scoop.it
score3
semrushbot
sightupbot
sindup
sitebot
sitecheck.internetseer.com
sitesnagger
sitesucker
sogou web spider
sosospider
spankbot
spanner
speedy
spotter
suggybot
superbot
superbot/2.6
suzuran
synthesio
szukacz/1.4
talkwater
teleport
teleportpro
telesoft
the intraformant
thenomad
tighttwatbot
titan
tocrawl/urldispatcher
toscrawler
trendeo
trendictionbot
trendybuzz
true_robot
true_robot/1.0
turingos
turnitinbot
urlpouls
urly warning
vci
vecteurplus
verticalsearch
vsw
web image collector
webauto
webbandit
webbandit/3.50
webcopier
webcopy
webenhancer
webmasterworldforumbot
webmirror
webreaper
websauger
website extractor
website quester
webster pro
webstripper
webstripper/2.02
webzip
wget
wikiofeedbot
winello
winhttrack
www-collector-e
xenu link sleuth/1.3.8
yacy
youmag
yrspider
zealbot
zeus
zite
zookabot

Rule	Path
Disallow	/
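
The groups above can be checked programmatically. The following is a minimal sketch using Python's standard `urllib.robotparser`; it assumes a partial reconstruction of the file (only three of the blocked user agents are included, and the exact layout of the original robots.txt is not known from this scan). The helper names `build_parser` and `is_allowed` and the test agent `examplebrowser` are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of the scanned groups. Assumption: the real
# file lists many more blocked agents; only three are shown here.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: googlebot-news
Disallow: /mot-cle/
Disallow: /auteur/

User-agent: *
Disallow: /wp-login.php
Disallow: /wp-admin
Disallow: /mot-cle
Disallow: /auteur/
Disallow: /page/

User-agent: ahrefsbot
User-agent: semrushbot
User-agent: wget
Disallow: /

Sitemap: https://infodujour.fr/sitemap.xml
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content supplied as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, agent: str, path: str) -> bool:
    """Return True if `agent` may fetch `path` on infodujour.fr."""
    return parser.can_fetch(agent, "https://infodujour.fr" + path)


if __name__ == "__main__":
    rp = build_parser(ROBOTS_TXT)
    checks = [
        ("ia_archiver", "/"),
        ("googlebot-news", "/mot-cle/politique/"),
        ("googlebot-news", "/page/2/"),
        ("AhrefsBot/7.0", "/"),
        ("examplebrowser", "/wp-admin/options.php"),
        ("examplebrowser", "/2024/11/article/"),
    ]
    for agent, path in checks:
        print(f"{agent:16} {path:28} allowed={is_allowed(rp, agent, path)}")
```

A crawler that matches a named group follows that group rather than `*`, so in this sketch `googlebot-news` is not bound by the `/page/` rule. Note that `urllib.robotparser` matches agent names by case-insensitive substring and applies the first matching group, which may differ in edge cases from how a given crawler interprets the file.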

Other Records

Field	Value
sitemap	https://infodujour.fr/sitemap.xml

Comments

URLs I do not want indexed: Login, Trackbacks, Comments
Robots excluded from all indexing

Warnings

3 invalid lines.
