mnhn.fr
robots.txt

Robots Exclusion Standard data for mnhn.fr

Resource Scan

Scan Details

Site Domain	mnhn.fr
Base Domain	mnhn.fr
Scan Status	Ok
Last Scan	2024-10-21T03:39:35+00:00
Next Scan	2024-11-20T03:39:35+00:00

Last Scan

Scanned	2024-10-21T03:39:35+00:00
URL	https://mnhn.fr/robots.txt
Redirect	https://www.mnhn.fr/robots.txt
Redirect Domain	www.mnhn.fr
Redirect Base	mnhn.fr
Domain IPs	46.183.50.41
Redirect IPs	46.183.50.41
Response IP	46.183.50.41
Found	Yes
Hash	c4a66fa51f96a11077343ce47763cfb9bdd48009f01d4a8c93de99bc28c6cd89
SimHash	3996bd59c768

Groups

*

Rule	Path
Allow	/core/*.css$
Allow	/core/*.css?
Allow	/core/*.js$
Allow	/core/*.js?
Allow	/core/*.gif
Allow	/core/*.jpg
Allow	/core/*.jpeg
Allow	/core/*.png
Allow	/core/*.svg
Allow	/profiles/*.css$
Allow	/profiles/*.css?
Allow	/profiles/*.js$
Allow	/profiles/*.js?
Allow	/profiles/*.gif
Allow	/profiles/*.jpg
Allow	/profiles/*.jpeg
Allow	/profiles/*.png
Allow	/profiles/*.svg
Allow	/system/files/*.webp
Allow	/system/files/*.svg
Disallow	/core/
Disallow	/profiles/
Disallow	/README.txt
Disallow	/web.config
Disallow	/admin/
Disallow	/comment/reply/
Disallow	/filter/tips
Disallow	/node/add/
Disallow	/search/
Disallow	/buffon/register/
Disallow	/buffon/password/
Disallow	/buffon/login/
Disallow	/buffon/logout/
Disallow	/fr/buffon/login/
Disallow	/index.php/admin/
Disallow	/index.php/comment/reply/
Disallow	/index.php/filter/tips
Disallow	/index.php/node/add/
Disallow	/index.php/search/
Disallow	/index.php/buffon/password/
Disallow	/index.php/buffon/register/
Disallow	/index.php/buffon/login/
Disallow	/index.php/buffon/logout/
Disallow	/*?*f%5B0%5D=
Disallow	/*?*nid=
Disallow	/*?mtm
Disallow	/*?*&mtm
Disallow	/*?utm
Disallow	/*?*&utm
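
The Allow rules in this group are longer than the Disallow rules for the same directories, which matters under the longest-match precedence described in RFC 9309. The following is a minimal sketch of that interpretation, not the logic of any particular crawler: `pattern_to_regex` and `is_allowed` are hypothetical helper names, `RULES` is only a subset of the table above, and percent-encoding normalization is skipped.

```python
import re

# Subset of the rules from the "*" group above.
RULES = [
    ("Allow", "/core/*.css$"),
    ("Disallow", "/core/"),
    ("Disallow", "/admin/"),
    ("Disallow", "/*?*nid="),
    ("Disallow", "/*?utm"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a compiled regex.

    "*" matches any run of characters; a trailing "$" anchors the end.
    Everything else (including "?") is treated literally.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile(regex + ("$" if anchored else ""))


def is_allowed(rules, path: str) -> bool:
    """Apply the longest matching rule; Allow wins a tie; no match means allowed."""
    best = None  # (pattern length, is_allow)
    for verb, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), verb == "Allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


print(is_allowed(RULES, "/core/misc/drupal.css"))   # Allow rule is longer than "/core/"
print(is_allowed(RULES, "/core/install.php"))       # only "/core/" matches
print(is_allowed(RULES, "/fr/visit?page=2&nid=5"))  # matches "/*?*nid="
```

Under this reading, stylesheets below `/core/` stay crawlable even though the directory itself is disallowed.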

Other Records

Field	Value
crawl-delay	1

Other Records

Field	Value
sitemap	https://www.mnhn.fr/sitemap.xml
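
The crawl-delay and sitemap records can be read programmatically. The sketch below uses Python's standard `urllib.robotparser` on a shortened, hand-written robots.txt text (`ROBOTS_TXT` is an illustrative excerpt, not the full file); `site_maps()` needs Python 3.8 or later. Note that this parser does plain prefix matching, so it is used here only for the non-rule records, not for the wildcard paths.

```python
from urllib.robotparser import RobotFileParser

# Illustrative excerpt, assumed for this example; the real file has many more rules.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 1

Sitemap: https://www.mnhn.fr/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

delay = parser.crawl_delay("*")  # seconds to wait between requests
sitemaps = parser.site_maps()    # list of sitemap URLs, or None if absent

print(delay)
print(sitemaps)
```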

Comments

robots.txt
This file is to prevent the crawling and indexing of certain parts
of your site by web crawlers and spiders run by sites like Yahoo!
and Google. By telling these "robots" where not to go on your site,
you save bandwidth and server resources.
This file will be ignored unless it is at the root of your host:
Used: http://example.com/robots.txt
Ignored: http://example.com/site/robots.txt
For more information about the robots.txt standard, see:
http://www.robotstxt.org/robotstxt.html
On Page SEO Checker does not accept more than 1s
https://fr.semrush.com/kb/375-why-did-i-get-a-page-is-not-accessible-note-for-some-of-my-pages
CSS, JS, Images
Uploaded images
Directories
Files
Paths (clean URLs)
Paths (no clean URLs)
Facets: too many possible combinations and of little relevance for
SEO
URLs crawled when they should not be
XML sitemaps
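
The comments above note that a robots.txt file is only honored at the root of the host. As a small illustration, the location a crawler would check for any page can be derived as follows; `robots_url` is a hypothetical helper name introduced for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the host serving page_url."""
    parts = urlsplit(page_url)
    # Keep scheme and host; drop the page's path, query, and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://example.com/site/page.html?x=1"))
```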
