www.parismusees.paris.fr
robots.txt

Robots Exclusion Standard data for www.parismusees.paris.fr

Resource Scan

Scan Details

Site Domain	www.parismusees.paris.fr
Base Domain	paris.fr
Scan Status	Ok
Last Scan	2024-11-03T12:33:56+00:00
Next Scan	2024-12-03T12:33:56+00:00
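
The two timestamps above are exactly 30 days apart, which suggests a fixed rescan interval. A minimal sketch of that arithmetic, assuming a 30-day interval (inferred from the dates, not stated by the scanner); `RESCAN_INTERVAL` and `next_scan` are illustrative names:

```python
from datetime import datetime, timedelta

# Assumption: the interval is inferred from the Last Scan / Next Scan values.
RESCAN_INTERVAL = timedelta(days=30)

def next_scan(last_scan_iso: str) -> str:
    """Return the next scan time as an ISO 8601 string."""
    last_scan = datetime.fromisoformat(last_scan_iso)
    return (last_scan + RESCAN_INTERVAL).isoformat()

print(next_scan("2024-11-03T12:33:56+00:00"))
```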

Last Scan

Scanned	2024-11-03T12:33:56+00:00
URL	https://www.parismusees.paris.fr/robots.txt
Domain IPs	185.145.32.78
Response IP	185.145.32.78
Found	Yes
Hash	cbff472f2752ba01ff97e32c99bb341d2309f08ba62cca311e1d635d67a1792a
SimHash	30969d4bc744
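
A content hash like the one above lets a scanner tell whether robots.txt changed between two scans without comparing the files line by line. The sketch below assumes the Hash field is a SHA-256 digest of the raw response body; the 64 hexadecimal digits are consistent with that, but the report does not name the algorithm. `content_hash` and `has_changed` are hypothetical helpers.

```python
import hashlib

def content_hash(body: bytes) -> str:
    """Hex SHA-256 digest of a response body (algorithm is an assumption)."""
    return hashlib.sha256(body).hexdigest()

def has_changed(previous_hash: str, body: bytes) -> bool:
    """True when the freshly fetched body no longer matches the stored hash."""
    return content_hash(body) != previous_hash

stored = content_hash(b"User-agent: *\nDisallow: /admin/\n")
print(has_changed(stored, b"User-agent: *\nDisallow: /admin/\n"))
print(has_changed(stored, b"User-agent: *\nDisallow: /\n"))
```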

Groups

*

Rule	Path
Allow	/core/*.css$
Allow	/core/*.css?
Allow	/core/*.js$
Allow	/core/*.js?
Allow	/core/*.gif
Allow	/core/*.jpg
Allow	/core/*.jpeg
Allow	/core/*.png
Allow	/core/*.svg
Allow	/profiles/*.css$
Allow	/profiles/*.css?
Allow	/profiles/*.js$
Allow	/profiles/*.js?
Allow	/profiles/*.gif
Allow	/profiles/*.jpg
Allow	/profiles/*.jpeg
Allow	/profiles/*.png
Allow	/profiles/*.svg
Disallow	/core/
Disallow	/profiles/
Disallow	/README.txt
Disallow	/web.config
Disallow	/admin/
Disallow	/comment/reply/
Disallow	/filter/tips/
Disallow	/node/add/
Disallow	/search/
Disallow	/user/register/
Disallow	/user/password/
Disallow	/user/login/
Disallow	/user/logout/
Disallow	/index.php/admin/
Disallow	/index.php/comment/reply/
Disallow	/index.php/filter/tips/
Disallow	/index.php/node/add/
Disallow	/index.php/search/
Disallow	/index.php/user/password/
Disallow	/index.php/user/register/
Disallow	/index.php/user/login/
Disallow	/index.php/user/logout/
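
The paths in the * group use two wildcards: `*` stands for any run of characters and a trailing `$` anchors the end of the URL. The sketch below shows one way a crawler might resolve such rules, assuming the longest-pattern-wins convention described in RFC 9309, with Allow winning ties. The function names and the shortened rule list are illustrative and are not part of the scan data.

```python
import re

# A short excerpt of the * group rules listed above.
RULES = [
    ("Allow", "/core/*.css$"),
    ("Allow", "/core/*.css?"),
    ("Allow", "/core/*.png"),
    ("Disallow", "/core/"),
    ("Disallow", "/admin/"),
]

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal text; each '*' becomes "any run of characters".
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))

def is_allowed(path: str, rules=RULES) -> bool:
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

for path in ["/core/themes/style.css", "/core/install.php", "/admin/config"]:
    print(path, is_allowed(path))
```

Under these assumptions, stylesheets and images below /core/ stay crawlable while the rest of /core/ is blocked, because the Allow patterns are longer than the bare "/core/" Disallow.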

meltwater

Rule	Path
Disallow	/

digimind

Rule	Path
Disallow	/

knowings

Rule	Path
Disallow	/

sindup

Rule	Path
Disallow	/

cision

Rule	Path
Disallow	/

talkwater

Rule	Path
Disallow	/

turnitinbot

Rule	Path
Disallow	/

converacrawler

Rule	Path
Disallow	/

jetbot

Rule	Path
Disallow	/

newsnow

Rule	Path
Disallow	/

kbcrawl

Rule	Path
Disallow	/

amisoftware

Rule	Path
Disallow	/

newzbin

Rule	Path
Disallow	/

ask n read

Rule	Path
Disallow	/

qwam content intelligence

Rule	Path
Disallow	/

zite

Rule	Path
Disallow	/

flipboard

Rule	Path
Disallow	/

youmag

Rule	Path
Disallow	/

synthesio

Rule	Path
Disallow	/

trendybuzz

Rule	Path
Disallow	/

spotter

Rule	Path
Disallow	/

scoop.it

Rule	Path
Disallow	/

linkfluence

Rule	Path
Disallow	/

augure

Rule	Path
Disallow	/

corporama

Rule	Path
Disallow	/

grub-client

Rule	Path
Disallow	/

ia_archiver

Rule	Path
Allow	/$
Disallow	/*

ia_archiver-web.archive.org

Rule	Path
Allow	/$
Disallow	/*

k2spider

Rule	Path
Disallow	/

libwww

Rule	Path
Disallow	/

wget

Rule	Path
Disallow	/

adequat

Rule	Path
Disallow	/

adequat-systems

Rule	Path
Disallow	/

auramundi

Rule	Path
Disallow	/

coexel

Rule	Path
Disallow	/

ellisphere

Rule	Path
Disallow	/

leadbox

Rule	Path
Disallow	/

mention

Rule	Path
Disallow	/

moreover

Rule	Path
Disallow	/

mytwip

Rule	Path
Disallow	/

newsnow

Rule	Path
Disallow	/

newzbin

Rule	Path
Disallow	/

opinion-tracker

Rule	Path
Disallow	/

proxem

Rule	Path
Disallow	/

score3

Rule	Path
Disallow	/

trendeo

Rule	Path
Disallow	/

vecteurplus

Rule	Path
Disallow	/

verticalsearch

Rule	Path
Disallow

vsw

Rule	Path
Disallow

winello

Rule	Path
Disallow

fetch

Rule	Path
Disallow

infoseek

Rule	Path
Disallow

msiecrawler

Rule	Path
Disallow

offline explorer

Rule	Path
Disallow

sitecheck.internetseer.com

Rule	Path
Disallow

teleport

Rule	Path
Disallow

teleportpro

Rule	Path
Disallow

webcopier

Rule	Path
Disallow

webstripper

Rule	Path
Disallow

zealbot

Rule	Path
Disallow

asknread.com

Rule	Path
Disallow

ellisphere

Rule	Path
Disallow

spotter

Rule	Path
Disallow

omgilibot

Rule	Path
Disallow

omgili

Rule	Path
Disallow

ccbot

Rule	Path
Disallow

google-extended

Rule	Path
Disallow

perplexitybot

Rule	Path
Disallow

bytespider

Rule	Path
Disallow

diffbot

Rule	Path
Disallow

facebookbot

Rule	Path
Disallow

youbot

Rule	Path
Disallow

anthropic-ai

Rule	Path
Disallow

claude-web

Rule	Path
Disallow

claudebot

Rule	Path
Disallow

cohere-ai

Rule	Path
Disallow
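
Each group above applies to one user agent, and the * group is the fallback for crawlers that are not named. A minimal sketch of that selection step, assuming case-insensitive matching on the crawler's product token as described in RFC 9309; `GROUPS` holds only a small excerpt of the groups in this report, and `select_group` is a hypothetical helper:

```python
# Excerpt of the groups listed in this report (paths copied from the tables).
GROUPS = {
    "*": [("Disallow", "/admin/")],
    "wget": [("Disallow", "/")],
    "meltwater": [("Disallow", "/")],
    "ia_archiver": [("Allow", "/$"), ("Disallow", "/*")],
}

def select_group(product_token: str, groups=GROUPS) -> str:
    """Return the name of the group that governs this crawler."""
    token = product_token.lower()  # user-agent matching is case-insensitive
    return token if token in groups else "*"

for agent in ["Wget", "ia_archiver", "Googlebot"]:
    name = select_group(agent)
    print(agent, "->", name, GROUPS[name])
```

A crawler with its own group follows only that group, so under this reading the rules in the * group do not apply to it.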

Comments

robots.txt
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
This file will be ignored unless it is at the root of your host:
Used: http://example.com/robots.txt
Ignored: http://example.com/site/robots.txt
For more information about the robots.txt standard, see:
http://www.robotstxt.org/robotstxt.html
CSS, JS, Images
Directories
Files
Paths (clean URLs)
Paths (no clean URLs)
Robots excluded from all indexing.
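
The comments above note that robots.txt is honoured only at the root of the host. A short sketch of how a client might derive that location from any page URL, using only the standard library; `robots_url` is an illustrative name:

```python
from urllib.parse import urlsplit, urlunsplit

def robots_url(page_url: str) -> str:
    """Build the root-level robots.txt URL for the host serving page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; drop the original path, query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))

print(robots_url("http://example.com/site/page.html"))
```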

Warnings

4 invalid lines.
