20minutes.fr
robots.txt

Robots Exclusion Standard data for 20minutes.fr

Resource Scan

Scan Details

Site Domain	20minutes.fr
Base Domain	20minutes.fr
Scan Status	Ok
Last Scan	2024-11-14T15:14:56+00:00
Next Scan	2024-11-21T15:14:56+00:00

Last Scan

Scanned	2024-11-14T15:14:56+00:00
URL	https://20minutes.fr/robots.txt
Redirect	https://www.20minutes.fr/robots.txt
Redirect Domain	www.20minutes.fr
Redirect Base	20minutes.fr
Domain IPs	13.227.254.20, 13.227.254.32, 13.227.254.49, 13.227.254.94
Redirect IPs	152.195.37.212
Response IP	152.195.37.212
Found	Yes
Hash	e73adc0d874593eed5d2ba11f3b589fbb9c7322f69aaa627f1a4d28dbf56a760
SimHash	601b5050ee01

Groups

*

Rule	Path
Disallow	/article/*/commentaires*
Disallow	/resultats-examen/recherche/
Disallow	/resultats-examen/candidat/
Disallow	/embed/elections/resultats/
Disallow	/v-ajax
Disallow	/v-esi
Disallow	/search
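
The first rule in the * group uses wildcards, so a plain prefix comparison is not enough to evaluate it. The following is a minimal sketch of how such patterns could be matched, assuming the common convention in which `*` matches any run of characters and a trailing `$` anchors the end of the path. The helper names `rule_to_regex` and `is_disallowed` are hypothetical, and the sketch ignores Allow rules and longest-match precedence.

```python
import re

def rule_to_regex(path_pattern: str) -> re.Pattern:
    """Compile one robots.txt path pattern into a regular expression."""
    anchored = path_pattern.endswith("$")
    if anchored:
        path_pattern = path_pattern[:-1]
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    body = ".*".join(re.escape(part) for part in path_pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

# Disallow paths listed for the * group in this report.
DISALLOWED = [
    "/article/*/commentaires*",
    "/resultats-examen/recherche/",
    "/resultats-examen/candidat/",
    "/embed/elections/resultats/",
    "/v-ajax",
    "/v-esi",
    "/search",
]

def is_disallowed(path: str) -> bool:
    # re.match anchors at the start of the path, which gives prefix semantics.
    return any(rule_to_regex(rule).match(path) for rule in DISALLOWED)

# Illustrative paths only; they are not taken from the site.
print(is_disallowed("/article/123/commentaires"))
print(is_disallowed("/article/123"))
```

Python's standard `urllib.robotparser` treats paths as plain prefixes, which is why the sketch compiles the pattern by hand.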

grapeshot

No rules defined. All paths allowed.

Other Records

Field	Value
crawl-delay	2
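
A crawler could read this crawl-delay record with Python's standard `urllib.robotparser`. The sketch below is illustrative only: the `ROBOTS_FRAGMENT` text is a reconstruction of how the grapeshot and firstrain groups might look in robots.txt syntax, not the verbatim file.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of two groups from this report, not the verbatim file.
ROBOTS_FRAGMENT = """\
User-agent: grapeshot
Crawl-delay: 2

User-agent: firstrain
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_FRAGMENT.splitlines())

# Seconds a polite crawler should wait between requests, or None if unset.
delay = parser.crawl_delay("grapeshot")
print(delay)

# grapeshot has no Disallow rules, firstrain is blocked from every path.
print(parser.can_fetch("grapeshot", "https://www.20minutes.fr/"))
print(parser.can_fetch("firstrain", "https://www.20minutes.fr/"))
```

Crawl-delay is not part of the core Robots Exclusion Protocol, so individual crawlers may ignore it.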

firstrain

Rule	Path
Disallow	/

rogerbot

Rule	Path
Disallow	/

ahrefsbot

Rule	Path
Disallow	/

searchmetricsbot

Rule	Path
Disallow	/

mj12bot

Rule	Path
Disallow	/

dotbot

Rule	Path
Disallow	/

nutch

Rule	Path
Disallow	/

trendictionbot

Rule	Path
Disallow	/

xovibot

Rule	Path
Disallow	/

cliqzbot

Rule	Path
Disallow	/

gnowitnewsbot

Rule	Path
Disallow	/

maxpointcrawler

Rule	Path
Disallow	/

meltawer

Rule	Path
Disallow	/

digimind

Rule	Path
Disallow	/

knowings

Rule	Path
Disallow	/

sindup

Rule	Path
Disallow	/

cision

Rule	Path
Disallow	/

talkwater

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

converacrawler

Rule	Path
Disallow	/

quepasacreep

Rule	Path
Disallow	/

jetbot

Rule	Path
Disallow	/

newsnow

Rule	Path
Disallow	/

kbcrawl

Rule	Path
Disallow	/

amisoftware

Rule	Path
Disallow	/

newzbin

Rule	Path
Disallow	/

ask n read

Rule	Path
Disallow	/

qwam content intelligence

Rule	Path
Disallow	/

zite

Rule	Path
Disallow	/

youmag

Rule	Path
Disallow	/

synthesio

Rule	Path
Disallow	/

trendybuzz

Rule	Path
Disallow	/

scoop.it

Rule	Path
Disallow	/

linkfluence

Rule	Path
Disallow	/

augure

Rule	Path
Disallow	/

corporama

Rule	Path
Disallow	/

readability.com

Rule	Path
Disallow	/

grub-client

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Disallow	/

ia_archiver-web.archive.org

Rule	Path
Disallow	/

k2spider

Rule	Path
Disallow	/

libwww

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/

adequat

Rule	Path
Disallow	/

adequat-systems

Rule	Path
Disallow	/

auramundi

Rule	Path
Disallow	/

coexel

Rule	Path
Disallow	/

leadbox

Rule	Path
Disallow	/

mention

Rule	Path
Disallow	/

moreover

Rule	Path
Disallow	/

mytwip

Rule	Path
Disallow	/

newsnow

Rule	Path
Disallow	/

newzbin

Rule	Path
Disallow	/

opinion-tracker

Rule	Path
Disallow	/

proxem

Rule	Path
Disallow	/

score3

Rule	Path
Disallow	/

trendeo

Rule	Path
Disallow	/

vecteurplus

Rule	Path
Disallow	/

verticalsearch

Rule	Path
Disallow	/

vsw

Rule	Path
Disallow	/

winello

Rule	Path
Disallow	/

fetch

Rule	Path
Disallow	/

infoseek

Rule	Path
Disallow	/

msiecrawler

Rule	Path
Disallow	/

offline explorer

Rule	Path
Disallow	/

sitecheck.internetseer.com

Rule	Path
Disallow	/

sitesnagger

Rule	Path
Disallow	/

teleport

Rule	Path
Disallow	/

teleportpro

Rule	Path
Disallow	/

webcopier

Rule	Path
Disallow	/

webstripper

Rule	Path
Disallow	/

zealbot

Rule	Path
Disallow	/

asknread.com

Rule	Path
Disallow	/

ellisphere

Rule	Path
Disallow	/

spotter

Rule	Path
Disallow	/

riddler

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

chatgpt-user

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/

perplexitybot

Rule	Path
Disallow	/

anthropic-ai

Rule	Path
Disallow	/

claude-web

Rule	Path
Disallow	/

claudebot

Rule	Path
Disallow	/

cohere-ai

Rule	Path
Disallow	/

applebot-extended

Rule	Path
Disallow	/

Other Records

Field	Value
sitemap	https://www.20minutes.fr/sitemap-arbo.xml

Warnings

4 invalid lines.
