evenements.courrierinternational.com
robots.txt

Robots Exclusion Standard data for evenements.courrierinternational.com

Resource Scan

Scan Details

Site Domain evenements.courrierinternational.com
Base Domain courrierinternational.com
Scan Status Ok
Last Scan 2024-06-24T16:44:57+00:00
Next Scan 2024-07-24T16:44:57+00:00

Last Scan

Scanned 2024-06-24T16:44:57+00:00
URL https://evenements.courrierinternational.com/robots.txt
Domain IPs 35.195.102.228
Response IP 35.195.102.228
Found Yes
Hash eabfb2ada96196a034e61487aa317b97514dd5ad4c09722aa736609473a63058
SimHash ac5ad90a4564

Groups

*

Rule Path
Disallow /tmp
Disallow /cache

meltawer

Rule Path
Disallow /

digimind

Rule Path
Disallow /

knowings

Rule Path
Disallow /

sindup

Rule Path
Disallow /

cision

Rule Path
Disallow /

talkwater

Rule Path
Disallow /

turnitinbot

Rule Path
Disallow /

converacrawler

Rule Path
Disallow /

quepasacreep

Rule Path
Disallow /

jetbot

Rule Path
Disallow /

newsnow

Rule Path
Disallow /

kbcrawl

Rule Path
Disallow /

amisoftware

Rule Path
Disallow /

newzbin

Rule Path
Disallow /

ask n read

Rule Path
Disallow /

qwam content intelligence

Rule Path
Disallow /

zite

Rule Path
Disallow /

flipboard

Rule Path
Disallow /

flipboardproxy

Rule Path
Disallow /

youmag

Rule Path
Disallow /

synthesio

Rule Path
Disallow /

trendybuzz

Rule Path
Disallow /

spotter

Rule Path
Disallow /

scoop.it

Rule Path
Disallow /

linkfluence

Rule Path
Disallow /

augure

Rule Path
Disallow /

corporama

Rule Path
Disallow /

readability.com

Rule Path
Disallow /

grub-client

Rule Path
Disallow /

ia_archiver

Rule Path
Disallow /

ia_archiver-web.archive.org

Rule Path
Disallow /

k2spider

Rule Path
Disallow /

libwww

Rule Path
Disallow /

wget

Rule Path
Disallow /

adequat

Rule Path
Disallow /

adequat-systems

Rule Path
Disallow /

auramundi

Rule Path
Disallow /

coexel

Rule Path
Disallow /

ellisphere

Rule Path
Disallow /

leadbox

Rule Path
Disallow /

mention

Rule Path
Disallow /

moreover

Rule Path
Disallow /

mytwip

Rule Path
Disallow /

newsnow

Rule Path
Disallow /

newzbin

Rule Path
Disallow /

opinion-tracker

Rule Path
Disallow /

proxem

Rule Path
Disallow /

score3

Rule Path
Disallow /

trendeo

Rule Path
Disallow /

vecteurplus

Rule Path
Disallow /

verticalsearch

Rule Path
Disallow /

vsw

Rule Path
Disallow /

winello

Rule Path
Disallow /

fetch

Rule Path
Disallow /

infoseek

Rule Path
Disallow /

msiecrawler

Rule Path
Disallow /

offline explorer

Rule Path
Disallow /

sitecheck.internetseer.com

Rule Path
Disallow /

sitesnagger

Rule Path
Disallow /

teleport

Rule Path
Disallow /

teleportpro

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

webstripper

Rule Path
Disallow /

zealbot

Rule Path
Disallow /

asknread.com

Rule Path
Disallow /

ellisphere

Rule Path
Disallow /

spotter

Rule Path
Disallow /
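
The behaviour encoded by these groups can be checked mechanically. Below is a minimal sketch using Python's standard urllib.robotparser, assuming the groups above are reconstructed into plain robots.txt directives; only a representative excerpt of the scanned file is reproduced, and the crawler name "SomeCrawler" is a placeholder rather than one of the listed agents.

    from urllib import robotparser

    # Representative excerpt reconstructed from the groups listed above;
    # the two named groups stand in for the dozens of blocked agents.
    rules = [
        "User-agent: *",
        "Disallow: /tmp",
        "Disallow: /cache",
        "",
        "User-agent: flipboard",
        "Disallow: /",
        "",
        "User-agent: ia_archiver",
        "Disallow: /",
    ]

    rp = robotparser.RobotFileParser()
    rp.parse(rules)

    base = "https://evenements.courrierinternational.com"
    print(rp.can_fetch("SomeCrawler", base + "/"))       # True: the wildcard group only blocks /tmp and /cache
    print(rp.can_fetch("SomeCrawler", base + "/tmp/x"))  # False: /tmp is disallowed for every agent
    print(rp.can_fetch("flipboard", base + "/"))         # False: the named groups are barred from the whole site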

Comments

  • robots.txt
  • This file is to prevent the crawling and indexing of certain parts
  • of your site by web crawlers and spiders run by sites like Yahoo!
  • and Google. By telling these "robots" where not to go on your site,
  • you save bandwidth and server resources.
  • This file will be ignored unless it is at the root of your host:
  • Used: http://example.com/robots.txt
  • Ignored: http://example.com/site/robots.txt
  • For more information about the robots.txt standard, see:
  • http://www.robotstxt.org/wc/robots.html
  • For syntax checking, see:
  • http://www.sxw.org.uk/computing/robots/check.html
  • Robots exclus de toute indexation. ("Robots excluded from any indexing.")
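
The "Used" / "Ignored" pair in the comments comes down to the host root: a crawler derives the robots.txt location from the host of the page it wants to fetch, never from the page's own path. A minimal sketch of that derivation in Python (the helper name robots_txt_url and the /some/event path are illustrative only):

    from urllib.parse import urlsplit, urlunsplit

    def robots_txt_url(page_url: str) -> str:
        # Only the copy served from the root of the host is honored,
        # exactly as the "Used" / "Ignored" comment above describes.
        parts = urlsplit(page_url)
        return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

    print(robots_txt_url("http://example.com/site/page.html"))
    # http://example.com/robots.txt (a copy at /site/robots.txt is ignored)
    print(robots_txt_url("https://evenements.courrierinternational.com/some/event"))
    # https://evenements.courrierinternational.com/robots.txt (the URL scanned above)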

Warnings

  • 4 invalid lines.