frenchdailynews.com
robots.txt

Robots Exclusion Standard data for frenchdailynews.com

Resource Scan

Scan Details

Site Domain frenchdailynews.com
Base Domain frenchdailynews.com
Scan Status Ok
Last Scan 2024-09-22T15:39:20+00:00
Next Scan 2024-09-29T15:39:20+00:00

Last Scan

Scanned 2024-09-22T15:39:20+00:00
URL https://frenchdailynews.com/robots.txt
Domain IPs 2001:41d0:8:88df::, 5.39.70.223
Response IP 5.39.70.223
Found Yes
Hash d76455059753a3456785b539d9c66941b3284c3a68f1d182db08a4d905cb67be
SimHash e25a52204f87

Groups

ia_archiver

Rule Path
Disallow /

googlebot-news

Rule Path
Disallow /mot-cle/
Disallow /auteur/

*

Rule Path
Disallow /wp-login.php
Disallow /wp-admin
Disallow /mot-cle
Disallow /auteur/
Disallow /page/

adequat
adequat-systems
ahrefsbot
alexibot
alphaseobot
alphaseobot-sa
alvinetspider
amisoftware
antenne hatena
apocalxexplorerbot
ask n read
asknread.com
asterias
augure
auramundi
backdoorbot/1.0
bizinformation
black hole
blexbot
blowfish/1.0
botalot
builtbottough
bullseye/1.0
bunnyslippers
cegbfeieh
cheesebot
cherrypicker
cherrypickerelite/1.0
cherrypickerse/1.0
cision
coexel
converacrawler
copyrightcheck
corporama
cosmos
crescent
crescent internet toolpak http ole control v.1.0
digimind
disco pump 3.1
dittospyder
dotbot
ellisphere
emailcollector
emailsiphon
emailwolf
erocrawler
exabot
extractorpro
fetch
flamingo_searchengine
flipboard
foobot
grapeshot
grub-client
harvest/1.5
hloader
httplib
httrack
httrack 3.0
humanlinks
igentia
infonavirobot
infoseek
jennybot
jetbot
jikespider
k2spider
kbcrawl
kenjin spider
knowings
leadbox
lexibot
libweb/clshttp
libwww
linkextractorpro
linkfluence
linkscan/8.1a unix
linkwalker
lwp-trivial
lwp-trivial/1.34
mata hari
meltawer
mention
microsoft url control - 5.01.4511
microsoft url control - 6.00.8169
miixpc
miixpc/4.2
mister pix
mj12bot
mlbot
moget
moget/2.1
moreover
ms search 4.0 robot
ms search 5.0 robot
msiecrawler
mytwip
naverbot
netants
netattache
netmechanic
newsnow
newzbin
nicerspro
offline explorer
omgili
omgilibot
openfind
openindexspider
opinion-tracker
propowerbot/2.14
prowebwalker
proxem
psbot
quepasacreep
queryn metasearch
qwam content intelligence
repomonkey
rma
scoop.it
score3
semrushbot
sightupbot
sindup
sitebot
sitecheck.internetseer.com
sitesnagger
sitesucker
sogou web spider
sosospider
spankbot
spanner
speedy
spotter
suggybot
superbot
superbot/2.6
suzuran
synthesio
szukacz/1.4
talkwater
teleport
teleportpro
telesoft
the intraformant
thenomad
tighttwatbot
titan
tocrawl/urldispatcher
toscrawler
trendeo
trendictionbot
trendybuzz
true_robot
true_robot/1.0
turingos
turnitinbot
urlpouls
urly warning
vci
vecteurplus
verticalsearch
vsw
web image collector
webauto
webbandit
webbandit/3.50
webcopier
webcopy
webenhancer
webmasterworldforumbot
webmirror
webreaper
websauger
website extractor
website quester
webster pro
webstripper
webstripper/2.02
webzip
wget
wikiofeedbot
winello
winhttrack
www-collector-e
xenu link sleuth/1.3.8
yacy
youmag
yrspider
zealbot
zeus
zite
zookabot

Rule Path
Disallow /
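The groups above can be checked with a short script. This is a minimal sketch using Python's standard urllib.robotparser. The ROBOTS_TXT text is a reconstruction from this report, not the original file, and it includes only two of the blocked agents (ahrefsbot and wget) for brevity. The helper allowed() and the agent name "examplebot" are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumption: reconstructed from the groups listed in this report,
# not copied from the live robots.txt. Only two agents of the long
# blocked list are included to keep the sketch short.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: googlebot-news
Disallow: /mot-cle/
Disallow: /auteur/

User-agent: *
Disallow: /wp-login.php
Disallow: /wp-admin
Disallow: /mot-cle
Disallow: /auteur/
Disallow: /page/

User-agent: ahrefsbot
User-agent: wget
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Hypothetical helper: may `agent` fetch `path` on the scanned site?"""
    return parser.can_fetch(agent, "https://frenchdailynews.com" + path)


if __name__ == "__main__":
    for agent, path in [
        ("ia_archiver", "/"),
        ("googlebot-news", "/mot-cle/politique"),
        ("googlebot-news", "/page/2"),
        ("examplebot", "/wp-admin/options.php"),
        ("examplebot", "/un-article"),
        ("wget", "/un-article"),
    ]:
        print(agent, path, allowed(agent, path))
```

Under this parser, an agent with its own group is matched against that group only, so googlebot-news is not subject to the /page/ rule from the * group. Other crawlers may resolve group precedence and path matching differently.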

Other Records

Field Value
sitemap https://infodujour.fr/sitemap.xml

Comments

  • URLs I do not want indexed: Login, Trackbacks, Comments
  • Robots excluded from all indexing

Warnings

  • 3 invalid lines.