cosmeticlatam.com
robots.txt

Robots Exclusion Standard data for cosmeticlatam.com

Resource Scan

Scan Details

Site Domain cosmeticlatam.com
Base Domain cosmeticlatam.com
Scan Status Ok
Last Scan 6/15/2025, 10:10:15 PM
Next Scan 6/22/2025, 10:10:15 PM

Last Scan

Scanned 6/15/2025, 10:10:15 PM
URL https://cosmeticlatam.com/robots.txt
Domain IPs 104.21.93.17, 172.67.202.180, 2606:4700:3032::ac43:cab4, 2606:4700:3036::6815:5d11
Response IP 172.67.202.180
Found Yes
Hash c54ecacaf3a63e60adef5bbaebf26200f15dd4370e56f1dbc8c2deb58390fdcc
SimHash 6ac4dc800852

Groups

*

Rule Path
Allow /wp-content/uploads/
Disallow /cgi-bin
Disallow /wp-content/plugins/
Disallow /wp-content/themes/
Disallow /wp-includes/
Disallow /wp-admin/
Allow /ads/preferences/
Allow /gpt/
Allow /pagead/show_ads.js
Allow /pagead/js/adsbygoogle.js
Allow /pagead/js/*/show_ads_impl.js
Allow /static/glade.js
Allow /static/glade/
Disallow /wp-
Disallow /?s=
Disallow /search
Allow /feed/$
Disallow /feed
Disallow /comments/feed
Disallow /*/feed/$
Disallow /*/feed/rss/$
Disallow /*/trackback/$
Disallow /*/*/feed/$
Disallow /*/*/feed/rss/$
Disallow /*/*/trackback/$
Disallow /*/*/*/feed/$
Disallow /*/*/*/feed/rss/$
Disallow /*/*/*/trackback/$
Allow /*.js$
Allow /*.css$
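
The rules above mix plain prefixes with `*` and `$` patterns, so which rule applies to a given URL is not obvious at a glance. The following is a minimal sketch of one common reading (longest matching pattern wins, Allow wins ties, as described in RFC 9309), applied to a subset of the rules in this group. The names `RULES`, `pattern_to_regex`, and `is_allowed` are illustrative assumptions, not part of the scanned site or of any library, and individual crawlers may evaluate rules differently.

```python
import re

# Subset of the "*" group rules listed above, as (directive, path) pairs.
RULES = [
    ("allow", "/wp-content/uploads/"),
    ("disallow", "/wp-content/plugins/"),
    ("disallow", "/wp-admin/"),
    ("disallow", "/wp-"),
    ("allow", "/feed/$"),
    ("disallow", "/feed"),
    ("disallow", "/*/feed/$"),
    ("allow", "/*.js$"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters and a trailing '$' anchors the
    pattern to the end of the path; everything else is literal.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    The longest matching pattern wins, Allow wins a tie, and a path
    that matches no rule is allowed.
    """
    best_len, allowed = -1, True
    for directive, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            is_allow = directive == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len, allowed = len(pattern), is_allow
    return allowed


if __name__ == "__main__":
    for p in ["/wp-content/uploads/logo.png", "/feed/", "/feed/atom/",
              "/blog/post/feed/", "/wp-content/plugins/x/script.js"]:
        print(p, is_allowed(p))
```

Under this reading, a script inside /wp-content/plugins/ remains disallowed even though `Allow /*.js$` is present, because the longer `Disallow /wp-content/plugins/` pattern outranks it.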

Other Records

Field Value
crawl-delay 10

googlebot

Rule Path
Allow /

googlebot-image

Rule Path
Allow /wp-content/uploads/

adsbot-google

Rule Path
Allow /

googlebot-mobile

Rule Path
Allow /

msiecrawler

Rule Path
Disallow /

webcopier

Rule Path
Disallow /

httrack

Rule Path
Disallow /

microsoft.url.control

Rule Path
Disallow /

libwww

Rule Path
Disallow /
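
The groups above either open the whole site to a crawler (`Allow /`) or close it (`Disallow /`). As a rough sketch, the Python standard library's `urllib.robotparser` can check such groups. The robots.txt text below is a hypothetical reconstruction of three of the groups; the raw file is not reproduced in this report, so its exact layout and the casing of the agent names are assumptions, as is the helper name `may_fetch`.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of three groups from the report above.
# The raw robots.txt is not shown, so this layout is an assumption.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: HTTrack
Disallow: /

User-agent: libwww
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(agent, url="https://cosmeticlatam.com/"):
    """Return True if `agent` may fetch `url` under the groups above."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for agent in ["Googlebot", "HTTrack", "libwww-perl/6.0"]:
        print(agent, may_fetch(agent))
```

`urllib.robotparser` matches agent names case-insensitively and by substring, which is why a client identifying itself as `libwww-perl/6.0` falls under the `libwww` group here. It does not implement `*` or `$` wildcards, so it is only suitable for plain-prefix groups like these.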

noxtrumbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 50

msnbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 30

slurp

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 10
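
The crawl-delay records above differ per bot (50 for noxtrumbot, 30 for msnbot, 10 for slurp and for the `*` group). A minimal sketch of how a polite client might look up its delay with the standard library's `urllib.robotparser` follows. The robots.txt text is a hypothetical reconstruction from this report, not the raw file, and `delay_for` is an illustrative helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical reconstruction of the crawl-delay groups from the report.
# The raw robots.txt is not shown, so this layout is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Crawl-delay: 10

User-agent: noxtrumbot
Crawl-delay: 50

User-agent: msnbot
Crawl-delay: 30

User-agent: Slurp
Crawl-delay: 10
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def delay_for(agent):
    """Return the crawl delay in seconds for `agent`.

    Falls back to the "*" group when no named group applies.
    """
    return parser.crawl_delay(agent)


if __name__ == "__main__":
    for agent in ["noxtrumbot", "msnbot", "Slurp", "ExampleBot"]:
        print(agent, delay_for(agent))
```

A client would typically sleep for the returned number of seconds between requests. Crawl-delay is a non-standard field, and some major crawlers ignore it.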

Other Records

Field Value
sitemap https://cosmeticlatam.com/sitemap_index.xml

Comments

  • Disallow: /wp-admin/
  • Allow: /wp-admin/admin-ajax.php
  • Deindex folders whose names begin with wp-
  • Allow the sitemap but not searches.
  • Allow the general feed for Google Blogsearch.
  • Prevent /permalink/feed/ from being indexed, since the comments feed often ranks ahead of the posts.
  • Block URLs ending in /trackback/, which serve as Trackback URIs (duplicate content).
  • Avoid blocking CSS and JS.
  • List of bots you should allow.
  • User-agent: Googlebot
  • Disallow: /nogooglebot/
  • List of bots that generate abusive queries even though they follow the robots.txt guidelines
  • Slurp (Yahoo!), Noxtrum, and the MSN bot, which tend to generate excessive queries.