thegreatcosmos.com
robots.txt

Robots Exclusion Standard data for thegreatcosmos.com

Resource Scan

Scan Details

Site Domain thegreatcosmos.com
Base Domain thegreatcosmos.com
Scan Status Failed
Failure Stage Fetching resource.
Failure Reason Server returned a client error.
Last Scan 2026-03-22T08:43:38+00:00
Next Scan 2026-04-05T08:43:38+00:00

Last Successful Scan

Scanned 2026-02-28T06:02:38+00:00
URL https://thegreatcosmos.com/robots.txt
Domain IPs 104.26.0.124, 104.26.1.124, 172.67.71.100, 2606:4700:20::681a:17c, 2606:4700:20::681a:7c, 2606:4700:20::ac43:4764
Response IP 104.26.0.124
Found Yes
Hash 93a783dceda7f2c9d360a070b87453e5864c81f05b139154d08ae35f80c0a7a1
SimHash b8dcc80a0213

Groups

*

Rule Path
Disallow /cart/
Disallow /checkout/
Disallow /my-account/
Disallow /customer-login/
Disallow /wishlist/
Disallow /thank-you/
Disallow /pedido-recibido/
Disallow /order-received/
Allow /*?*
Disallow /feed/
Disallow /comments/feed/
Disallow /*/feed/
Disallow /*/embed/
Allow /*?gclid*
Allow /*?utm_*
Allow /*?fbclid*
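
How these rules combine for a given URL depends on the crawler. The sketch below assumes the longest-match precedence described in RFC 9309: the matching rule with the longest pattern wins, and Allow wins a tie. The helper names pattern_to_regex and is_allowed are hypothetical and are not part of the scan data; individual crawlers may resolve conflicts differently.

```python
import re

# Rules copied from the "*" group above, in the listed order.
RULES = [
    ("disallow", "/cart/"),
    ("disallow", "/checkout/"),
    ("disallow", "/my-account/"),
    ("disallow", "/customer-login/"),
    ("disallow", "/wishlist/"),
    ("disallow", "/thank-you/"),
    ("disallow", "/pedido-recibido/"),
    ("disallow", "/order-received/"),
    ("allow", "/*?*"),
    ("disallow", "/feed/"),
    ("disallow", "/comments/feed/"),
    ("disallow", "/*/feed/"),
    ("disallow", "/*/embed/"),
    ("allow", "/*?gclid*"),
    ("allow", "/*?utm_*"),
    ("allow", "/*?fbclid*"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    Everything else is matched literally, starting at the path's first character.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` (including any query string) may be crawled.

    Assumption: longest matching pattern wins, Allow wins ties,
    and a path that matches no rule is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


if __name__ == "__main__":
    for p in ["/cart/", "/blog/feed/", "/product/x/?attribute_color=red",
              "/cart/?utm_source=x"]:
        print(p, is_allowed(p))
```

Under this assumed precedence, /cart/?utm_source=x comes out as allowed, because the pattern /*?utm_* (8 characters) is longer than /cart/ (6 characters). Whether that outcome is intended is not stated in the scan data.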

Other Records

Field Value
sitemap https://thegreatcosmos.com/sitemap_index.xml

Comments

  • --- Block internal pages with no SEO value ---
  • --- Allow all parameters for Google Merchant ---
  • (IMPORTANT: DO NOT BLOCK /?s= because it also blocks variations)
  • Removed: Disallow: /?s=
  • Removed: Disallow: /page/?s=
  • --- Allow filter, variation, and Merchant parameters ---
  • --- Block feeds and unnecessary formats ---
  • --- Allow Ads and Analytics parameters ---
  • --- Sitemap ---