planete-eleve.com
robots.txt

Robots Exclusion Standard data for planete-eleve.com

Resource Scan

Scan Details

Site Domain planete-eleve.com
Base Domain planete-eleve.com
Scan Status Ok
Last Scan 2024-10-26T08:12:47+00:00
Next Scan 2024-11-02T08:12:47+00:00

Last Scan

Scanned 2024-10-26T08:12:47+00:00
URL https://planete-eleve.com/robots.txt
Domain IPs 2a02:4780:39:8421:6257:314f:135e:2f52, 91.108.100.177
Response IP 77.37.115.114
Found Yes
Hash 1130db28f2af8447d19c306872e76e5e22af54d6c06f1b63fed0fe08d1effca0
SimHash 6940b8c23274

Groups

*

Rule Path
Disallow /wp-login.php
Disallow /wp-admin
Disallow /wp-includes
Disallow /wp-content/plugins
Disallow /wp-content/cache
Disallow /trackback
Disallow /feed
Disallow /comments
Disallow /category/*/*
Disallow /category/*
Disallow /author/*
Disallow /page/*
Disallow */trackback
Disallow */feed
Disallow */comments
Disallow */tag/*
Disallow /*?*
Disallow /*?
Disallow /?*

googlebot

Rule Path
Disallow /*.php$
Disallow /*.inc$
Disallow /*.gz$
Allow /wp-content/uploads
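
The googlebot rules above combine the `*` wildcard, the `$` end anchor, and an Allow that overlaps the Disallow patterns. Below is a minimal sketch of how such rules could be evaluated, assuming the longest-match precedence described in RFC 9309 (the longest matching pattern wins, and Allow wins a tie). The helper names `pattern_to_regex` and `is_allowed` are hypothetical and are not taken from the scanned site or from any crawler.

```python
import re

def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape each literal piece, then join the pieces with '.*' for every '*'.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """rules is a list of (kind, pattern) pairs; kind is 'allow' or 'disallow'."""
    best_len, allowed = -1, True  # a path matched by no rule is allowed
    for kind, pattern in rules:
        if not pattern:  # an empty pattern matches nothing
            continue
        if pattern_to_regex(pattern).match(path):
            length = len(pattern)
            # Assumed precedence: longest pattern wins, Allow wins a tie.
            if length > best_len or (length == best_len and kind == "allow"):
                best_len, allowed = length, kind == "allow"
    return allowed

# Rules copied from the googlebot group in this scan.
GOOGLEBOT_RULES = [
    ("disallow", "/*.php$"),
    ("disallow", "/*.inc$"),
    ("disallow", "/*.gz$"),
    ("allow", "/wp-content/uploads"),
]

for path in ("/wp-login.php", "/wp-login.php?x=1", "/wp-content/uploads/a.php"):
    print(path, is_allowed(GOOGLEBOT_RULES, path))
```

Under these assumptions, `/wp-login.php` is blocked, `/wp-login.php?x=1` is not (the `$` anchor requires the path to end in `.php`), and a `.php` file under `/wp-content/uploads` is allowed because the Allow pattern is longer than `/*.php$`.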

googlebot-image

Rule Path
Disallow
Allow /*

mediapartners-google*

Rule Path
Disallow
Allow /*
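
Because the file defines four groups, a crawler first has to decide which group applies to it. Under RFC 9309, a crawler obeys the group that names its product token and falls back to `*` only when no named group matches, so a crawler matching the googlebot group would not apply the `*` rules. The sketch below illustrates that selection; `select_group` is a hypothetical helper, and ignoring the trailing `*` in `mediapartners-google*` is an assumption rather than documented behaviour.

```python
# Group names exactly as listed in this scan.
GROUP_NAMES = ["*", "googlebot", "googlebot-image", "mediapartners-google*"]

def select_group(group_names, product_token: str):
    """Return the group name a crawler with this product token would obey."""
    token = product_token.lower()  # user-agent matching is case-insensitive
    for name in group_names:
        # Assumption: a trailing '*' on a group name is ignored when comparing.
        if name != "*" and name.lower().rstrip("*") == token:
            return name
    # No named group matched: fall back to the catch-all group if present.
    return "*" if "*" in group_names else None

for token in ("Googlebot", "Googlebot-Image", "Mediapartners-Google", "bingbot"):
    print(token, "->", select_group(GROUP_NAMES, token))
```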

Other Records

Field Value
sitemap https://planete-eleve.com/sitemap_index.xml
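
As an illustration of how a client might read the sitemap field, the sketch below feeds a short hand-written excerpt to Python's standard `urllib.robotparser` (the `site_maps()` method needs Python 3.8 or later). The excerpt is an assumption rebuilt from the fields in this report, not the raw file. The standard-library parser does not implement `*` or `$` wildcards, so it is used here only for the sitemap line and a plain prefix rule.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt based on this report; the real file may differ in layout.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /wp-admin",
    "Sitemap: https://planete-eleve.com/sitemap_index.xml",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse in-memory lines, no network access

sitemaps = parser.site_maps()  # list of sitemap URLs, or None if absent
admin_ok = parser.can_fetch("*", "https://planete-eleve.com/wp-admin/")

print(sitemaps)
print(admin_ok)
```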

Comments

  • No indexing (sensitive directories)
  • De-index URLs with parameters (duplicate content)
  • Prevent indexing of sensitive files
  • Allow indexing of images
  • Allow Google Image
  • Allow Google AdSense
  • Point the spider to our sitemap