captrain.it
robots.txt

Robots Exclusion Standard data for captrain.it

Resource Scan

Scan Details

Site Domain captrain.it
Base Domain captrain.it
Scan Status Ok
Last Scan 2025-09-28T01:14:08+00:00
Next Scan 2025-10-28T01:14:08+00:00

Last Scan

Scanned 2025-09-28T01:14:08+00:00
URL https://captrain.it/robots.txt
Domain IPs 51.91.236.193
Response IP 51.91.236.193
Found Yes
Hash e61ac0f0d896145dbf745c2f663518ea94a6da110c6b6b991c75c36e0b504690
SimHash 694098cbf245
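
The report does not name the algorithm behind the Hash field. Its 64 hexadecimal digits are consistent with a SHA-256 digest, so the sketch below assumes SHA-256 over the raw response body. The function name digest_matches is hypothetical and only illustrates how a later fetch could be compared with the recorded value.

```python
import hashlib

# Hash value copied from the scan report above.
REPORTED_HASH = "e61ac0f0d896145dbf745c2f663518ea94a6da110c6b6b991c75c36e0b504690"


def digest_matches(body: bytes, expected_hex: str = REPORTED_HASH) -> bool:
    """Return True if the SHA-256 digest of body equals expected_hex.

    Assumption: the report hashes the unmodified bytes of robots.txt
    with SHA-256. The report itself does not confirm this.
    """
    return hashlib.sha256(body).hexdigest() == expected_hex.lower()
```

A mismatch on a fresh download would suggest that the file changed since the scan, or that the assumption about the algorithm or the hashed bytes is wrong.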

Groups

*

Rule Path
Disallow /wp-admin
Disallow /wp-includes
Disallow /wp-content/plugins
Disallow /wp-content/cache
Disallow /trackback
Disallow /feed
Disallow /comments
Disallow /category/*/*
Disallow */trackback
Disallow */feed
Disallow */comments
Disallow /*.pdf$
Disallow /*?*
Disallow /*?
Disallow /wp-login.php
Allow /wp-content/uploads
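
To make the effect of these rules concrete, here is a minimal sketch of how a crawler following RFC 9309 might evaluate a path against the "*" group: "*" matches any run of characters, a trailing "$" anchors the end of the path, the longest matching pattern wins, and Allow wins a tie. The names STAR_GROUP, pattern_to_regex and is_allowed are hypothetical, and real crawlers may differ in details such as percent-encoding.

```python
import re

# Rules copied from the "*" group above, in the order listed.
STAR_GROUP = [
    ("disallow", "/wp-admin"),
    ("disallow", "/wp-includes"),
    ("disallow", "/wp-content/plugins"),
    ("disallow", "/wp-content/cache"),
    ("disallow", "/trackback"),
    ("disallow", "/feed"),
    ("disallow", "/comments"),
    ("disallow", "/category/*/*"),
    ("disallow", "*/trackback"),
    ("disallow", "*/feed"),
    ("disallow", "*/comments"),
    ("disallow", "/*.pdf$"),
    ("disallow", "/*?*"),
    ("disallow", "/*?"),
    ("disallow", "/wp-login.php"),
    ("allow", "/wp-content/uploads"),
]


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex matched from the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with ".*" for each "*" wildcard.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path: str, rules=STAR_GROUP) -> bool:
    """Longest matching pattern wins; Allow wins ties; no match means allowed."""
    best_len = -1
    best_allow = True
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty path matches nothing
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                best_len = len(pattern)
                best_allow = allow
    return best_allow
```

Under this reading, a PDF stored below /wp-content/uploads stays crawlable, because the Allow pattern is longer than /*.pdf$, while any URL with a query string is blocked by /*?*.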

googlebot

Rule Path
Disallow /*.php$
Disallow /*.inc$
Disallow /*.gz$
Disallow /*.pdf$

googlebot-image

Rule Path
Disallow
Allow /*

mediapartners-google*

Rule Path
Disallow
Allow /*

ahrefssiteaudit

Rule Path
Allow /*

ahrefsbot

Rule Path
Allow /*
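
Under RFC 9309, a crawler obeys the group whose user-agent line matches its product token and falls back to the "*" group only when no named group matches. The groups are not merged, so a crawler identifying as googlebot would follow only the four rules in its own group. The sketch below illustrates that selection step. GROUPS and select_group are hypothetical names, and treating the trailing "*" in mediapartners-google* as decoration rather than a wildcard is an assumption.

```python
# Group names as listed in the scan above.
GROUPS = [
    "*",
    "googlebot",
    "googlebot-image",
    "mediapartners-google*",
    "ahrefssiteaudit",
    "ahrefsbot",
]


def select_group(product_token: str, groups=GROUPS) -> str:
    """Return the name of the group a crawler with this product token would obey."""
    token = product_token.lower()
    for name in groups:
        # Assumption: a trailing "*" on a named group is ignored when comparing.
        if name != "*" and name.rstrip("*").lower() == token:
            return name
    return "*"  # no named group matched, use the catch-all group
```

For example, select_group("Googlebot") returns "googlebot", while an unlisted crawler such as "bingbot" falls back to "*".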

Other Records

Field Value
sitemap https://captrain.it/sitemap_index.xml

Comments

  • We prevent indexing of sensitive directories
  • We deindex all URLs with parameters (duplicate content)
  • We deindex the login page (useless content)
  • We allow indexing of images
  • We prevent indexing of sensitive files
  • Allow Google Image
  • Allow Google AdSense
  • Allow Ahrefs
  • We give the spider the link to our sitemap