generationsnouvelles.net
robots.txt

Robots Exclusion Standard data for generationsnouvelles.net

Resource Scan

Scan Details

Site Domain	generationsnouvelles.net
Base Domain	generationsnouvelles.net
Scan Status	Ok
Last Scan	2024-09-15T20:42:26+00:00
Next Scan	2024-10-15T20:42:26+00:00

Last Scan

Scanned	2024-09-15T20:42:26+00:00
URL	https://generationsnouvelles.net/robots.txt
Domain IPs	66.29.135.86
Response IP	66.29.135.86
Found	Yes
Hash	348e1e0ca1f5760a6b2fc73313a9eb79ebcdc382dfd1972e6ba285000c1877e4
SimHash	cb125148e908
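
The scan does not state which hash function produced the Hash value. Its 64 hexadecimal digits are consistent with SHA-256, so the sketch below assumes SHA-256 over the raw response body; the function name `content_fingerprint` and the sample body are hypothetical and only illustrate how such a fingerprint could be used to detect changes between scans.

```python
import hashlib

def content_fingerprint(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt body (assumed algorithm)."""
    return hashlib.sha256(body).hexdigest()

# Hypothetical sample body, not the real file contents.
sample = b"User-agent: *\nDisallow: /wp-admin/\n"
digest = content_fingerprint(sample)
print(len(digest))  # 64 hex characters, the same length as the Hash field
```

Comparing the digest of a freshly fetched body with the stored value is a cheap way to tell whether the file changed since the last scan.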

Groups

*

Rule	Path
Disallow	/cgi-bin/
Disallow	/wp-admin/
Disallow	/linkout/
Disallow	/recommended/
Disallow	/comments/feed/
Disallow	/trackback/
Disallow	/index.php
Disallow	/xmlrpc.php
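
As a rough illustration of what the `*` group means for a crawler with no group of its own, the sketch below feeds the same eight rules to Python's standard `urllib.robotparser`. The user-agent name `examplebot` and the two test paths are hypothetical, and other parsers may resolve edge cases differently.

```python
from urllib.robotparser import RobotFileParser

# Rules copied from the "*" group above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
Disallow: /wp-admin/
Disallow: /linkout/
Disallow: /recommended/
Disallow: /comments/feed/
Disallow: /trackback/
Disallow: /index.php
Disallow: /xmlrpc.php
"""

def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser

default_parser = build_parser(ROBOTS_TXT)
base = "https://generationsnouvelles.net"

# "examplebot" is a hypothetical crawler that falls under the "*" group.
blocked = default_parser.can_fetch("examplebot", base + "/wp-admin/options.php")
allowed = default_parser.can_fetch("examplebot", base + "/some-article/")
print(blocked, allowed)  # False True
```

The rules are prefix matches, so any URL whose path starts with one of the listed paths is refused and everything else is allowed.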

ninjabot

Rule	Path
Allow	/

mediapartners-google*

Rule	Path
Allow	/

googlebot-image

Rule	Path
Allow	/wp-content/uploads/

adsbot-google

Rule	Path
Allow	/

googlebot-mobile

Rule	Path
Allow	/

amazonbot

Rule	Path
Disallow	/
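
A named group replaces the `*` group for the crawler it matches; it does not add to it. The sketch below shows how Python's `urllib.robotparser` resolves three of the groups listed here. The abbreviated rule set, the agent name `examplebot`, and the test paths are assumptions made for illustration, and real crawlers may apply their own matching logic.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated, hypothetical excerpt: one rule from "*" plus two named groups.
GROUPED_RULES = """\
User-agent: *
Disallow: /wp-admin/

User-agent: googlebot-image
Allow: /wp-content/uploads/

User-agent: amazonbot
Disallow: /
"""

group_parser = RobotFileParser()
group_parser.parse(GROUPED_RULES.splitlines())

site = "https://generationsnouvelles.net"
results = {
    # amazonbot has its own group that disallows the whole site.
    "amazonbot": group_parser.can_fetch("amazonbot", site + "/some-article/"),
    # googlebot-image matches its own group, so the "*" rules are not consulted.
    "googlebot-image": group_parser.can_fetch("googlebot-image", site + "/wp-admin/"),
    # A crawler with no group of its own falls back to "*".
    "examplebot": group_parser.can_fetch("examplebot", site + "/wp-admin/"),
}
print(results)
```

Under this parser, a group that only contains `Allow` lines leaves every other path allowed as well, which may not be what the site owner intended for `/wp-admin/`.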

applebot

Rule	Path
Disallow	/

mediatoolkitbot

Rule	Path
Disallow	/

neevabot

Rule	Path
Disallow	/

serpstatbot

Rule	Path
Disallow	/

twitterbot

Rule	Path
Disallow	/

seznambot

Rule	Path
Disallow	/

dataforseobot

Rule	Path
Disallow	/

criteobot

Rule	Path
Disallow	/

velenpublicwebcrawler

Rule	Path
Disallow	/

trendictionbot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

petalbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

semrushbot

Rule	Path
Disallow	/

semrushbot-sa

Rule	Path
Disallow	/

semrushbot-ba

Rule	Path
Disallow	/

semrushbot-si

Rule	Path
Disallow	/

semrushbot-swa

Rule	Path
Disallow	/

semrushbot-ct

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

alexibot

Rule	Path
Disallow	/

surveybot

Rule	Path
Disallow	/

xenu’s

Rule	Path
Disallow	/

xenu’s link sleuth 1.1c

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

nextgensearchbot

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

archive.org_bot

Rule	Path
Disallow	/

archive.org bot

Rule	Path
Disallow	/

linkwalker

Rule	Path
Disallow	/

gigablast spider

Rule	Path
Disallow	/

ia_archiver-web.archive.org

Rule	Path
Disallow	/

picscout

Rule	Path
Disallow	/

blexbot crawler

Rule	Path
Disallow	/

tineye

Rule	Path
Disallow	/

seokicks-robot

Rule	Path
Disallow	/

blexbot

Rule	Path
Disallow	/

sistrix crawler

Rule	Path
Disallow	/

uptimerobot/2.0

Rule	Path
Disallow	/

ezooms robot

Rule	Path
Disallow	/

netestate ne crawler (+http://www.website-datenbank.de/)

Rule

Path

Disallow

wiseguys robot

Rule

Path

Disallow

turnitin robot

Rule

Path

Disallow

heritrix

Rule

Path

Disallow

pimonster

Rule

Path

Disallow

pimonster

Rule

Path

Disallow

pi-monster

Rule

Path

Disallow

eccp/1.0 (search@eniro.com)

Rule

Path

Disallow

psbot

Rule

Path

Disallow

youdaobot

Rule

Path

Disallow

blexbot

Rule

Path

Disallow

naverbot
yeti

Rule

Path

Disallow

zbot

Rule

Path

Disallow

vagabondo

Rule

Path

Disallow

linkwalker

Rule

Path

Disallow

simplepie

Rule

Path

Disallow

wget

Rule

Path

Disallow

pixray-seeker

Rule

Path

Disallow

boardreader

Rule

Path

Disallow

quantify

Rule

Path

Disallow

plukkie

Rule

Path

Disallow

cuam

Rule

Path

Disallow

megaindex.ru

Rule

Path

Disallow

megaindex.com

Rule

Path

Disallow

megaindex.ru/2.0

Rule

Path

Disallow

megaindex.ru

Rule

Path

Disallow

seekportbot

Rule

Path

Disallow

yandexvideoparser

Rule

Path

Disallow

yandeximages

Rule

Path

Disallow

blexbot

Rule

Path

Disallow

semrushbot

Rule

Path

Disallow

semrushbot-sa

Rule

Path

Disallow

semrushbot-ba

Rule

Path

Disallow

semrushbot-si

Rule

Path

Disallow

semrushbot-swa

Rule

Path

Disallow

semrushbot-ct

Rule

Path

Disallow

semrushbot-bm

Rule

Path

Disallow

splitsignalbot

Rule

Path

Disallow

dotbot

Rule

Path

Disallow

mj12bot

Rule

Path

Disallow

linkpadbot

Rule

Path

Disallow

sogou blog

Rule

Path

Disallow

sogou inst spider

Rule

Path

Disallow

sogou news spider

Rule

Path

Disallow

sogou orion spider

Rule

Path

Disallow

sogou spider2

Rule

Path

Disallow

sogou web spider

Rule

Path

Disallow

baiduspider

Rule

Path

Disallow

yisouspider

Rule

Path

Disallow

bytespider

Rule

Path

Disallow

velenpublicwebcrawler

Rule

Path

Disallow

seokicks

Rule

Path

Disallow

serpstatbot

Rule

Path

Disallow

turnitinbot

Rule

Path

Disallow

cliqzbot

Rule

Path

Disallow

ccbot

Rule

Path

Disallow

ltx71

Rule

Path

Disallow

Rule

Path

Disallow

obot

Rule

Path

Disallow

checkmarknetwork/1.0

Rule

Path

Disallow

builtwith

Rule

Path

Disallow

grapeshot

Rule

Path

Disallow

riddler

Rule

Path

Disallow

mj12bot

Rule

Path

Disallow

a.pr-cy.ru

Rule

Path

Disallow

petalbot

Rule

Path

Disallow

zombiebot

Rule

Path

Disallow

mauibot

Rule

Path

Disallow

smtbot

Rule

Path

Disallow

facebookexternalhit

Rule

Path

Disallow

twitterbot

Rule

Path

Disallow

barkrowler

Rule

Path

Disallow

safednsbot

Rule

Path

Disallow

mtrobot

Rule

Path

Disallow

mbcrawler/1.0

Rule

Path

Disallow

netpeakcheckerbot/3.4

Rule

Path

Disallow

lcc

Rule

Path

Disallow

adsbot

Rule

Path

Disallow

xovibot

Rule

Path

Disallow

ahrefsbot

Rule

Path

Disallow

adbeat_bot

Rule

Path

Disallow

amazonbot

Rule

Path

Disallow

geedobot

Rule

Path

Disallow

dataforseobot

Rule

Path

Disallow

megaindex.ru

Rule

Path

Disallow

megaindex.com

Rule

Path

Disallow

builtwith

Rule

Path

Disallow

awariorssbot
awariosmartbot

Rule

Path

Disallow

ia_archiver

Rule

Path

Disallow

archive.org_bot

Rule

Path

Disallow

ia_archiver-web.archive.org

Rule

Path

Disallow

cloudservermarketspider

Rule

Path

Disallow

scrapybot

Rule

Path

Disallow

coccocbot-web

Rule

Path

Disallow

uribot

Rule

Path

Disallow

zoominfobot (zoominfobot at zoominfo dot com)

Rule

Path

Disallow

wellknownbot

Rule

Path

Disallow

arquivo-web-crawler

Rule

Path

Disallow

screaming frog seo spider

Rule

Path

Disallow

gptbot

Rule

Path

Disallow

oai-searchbot

Rule

Path

Disallow

claude

Rule

Path

Disallow

claudebot

Rule

Path

Disallow

claude-web

Rule

Path

Disallow

imagesiftbot

Rule

Path

Disallow

seobility

Rule

Path

Disallow

facebookexternalhit/1.1

Rule

Path

Disallow

perplexitybot

Rule

Path

Disallow

Other Records

Field	Value
sitemap	https://generationsnouvelles.net/sitemap_index.xml
sitemap	https://generationsnouvelles.net/sitemap-news.xml
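
The sitemap records are independent of any user-agent group. As a minimal sketch, assuming Python 3.8 or later, `RobotFileParser.site_maps()` returns them as a list; the two `Sitemap:` lines below are reconstructed from the records above rather than copied from the raw file.

```python
from urllib.robotparser import RobotFileParser

# Sitemap lines reconstructed from the "Other Records" table.
SITEMAP_LINES = """\
Sitemap: https://generationsnouvelles.net/sitemap_index.xml
Sitemap: https://generationsnouvelles.net/sitemap-news.xml
"""

sitemap_parser = RobotFileParser()
sitemap_parser.parse(SITEMAP_LINES.splitlines())

# site_maps() returns None when no Sitemap lines were found.
sitemaps = sitemap_parser.site_maps() or []
for url in sitemaps:
    print(url)
```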

Comments

Block NextGenSearchBot
Block ia-archiver from crawling site
Block archive.org_bot from crawling site
Block Archive.org Bot from crawling site
Block LinkWalker from crawling site
Block GigaBlast Spider from crawling site
Block ia_archiver-web.archive.org_bot from crawling site
Block PicScout Crawler from crawling site
Block BLEXBot Crawler from crawling site
Block TinEye from crawling site
Block SEOkicks
Block BlexBot
Block SISTRIX
Block Uptime robot
Block Ezooms Robot
Block netEstate NE Crawler (+http://www.website-datenbank.de/)
Block WiseGuys Robot
Block Turnitin Robot
Block Heritrix
Block pricepi
Block Eniro
Block Psbot
Block Youdao
BLEXBot
Block NaverBot
Block ZBot
Block Vagabondo
Block LinkWalker
Block SimplePie
Block Wget
Block Pixray-Seeker
Block BoardReader
Block Quantify
Block Plukkie
Block Cuam
https://megaindex.com/crawler
sogou.com chinese search engine
https://serpstatbot.com/
http://ltx71.com/
http://www.pinterest.com/bot.html
http://www.xforce-security.com/crawler/
https://www.checkmarknetwork.com/spider.html
https://builtwith.com/biup
http://www.grapeshot.co.uk/crawler.php
https://aspiegel.com/petalbot
http://www.zombiedomain.net/robot/
http://www.similartech.com/smtbot
https://www.safedns.com/searchbot/
https://metrics-tools.de/robot.html
MBCrawler/1.0 (https://monitorbacklinks.com/robot)
https://corpora.uni-leipzig.de/crawler_faq.html
https://seostar.co/robot/
https://www.xovibot.net/
https://www.adbeat.com/operation_policy
https://geedo.com/bot/
https://dataforseo.com/dataforseo-bot
https://megaindex.com/crawler
https://awario.com/bots.html
http://cloudservermarket.com/spider.html
https://scrapy.org/
https://help.coccoc.com/en/search-engine
https://urirank.com/robot
https://well-known.dev/about/
https://sobre.arquivo.pt/en/help/crawling-and-archiving-web-content/
https://openai.com/searchbot
https://www.seobility.net/en/bot/
https://faviconkit.com/
https://docs.perplexity.ai/docs/perplexity-bot

Warnings

4 invalid lines.
