trosell.net
robots.txt

Robots Exclusion Standard data for trosell.net

Resource Scan

Scan Details

Site Domain trosell.net
Base Domain trosell.net
Scan Status Ok
Last Scan 2026-01-22T04:08:48+00:00
Next Scan 2026-01-29T04:08:48+00:00

Last Scan

Scanned 2026-01-22T04:08:48+00:00
URL https://trosell.net/robots.txt
Domain IPs 104.21.20.116, 172.67.192.154, 2606:4700:3034::6815:1474, 2606:4700:3035::ac43:c09a
Response IP 172.67.192.154
Found Yes
Hash 886b68569cb1707c41869120aaeb3aa09b12ad43c7e884d99bc9d3d6d24cac7a
SimHash 903330000ff4

Groups

*

Rule Path
Disallow /search/
Disallow /search
Disallow /cdn/assets/
Disallow /confirm
Disallow /index/1
Disallow /index/3
Disallow /index/5
Disallow /index/7
Disallow /index/8
Disallow /index/9
Disallow /index/sub/
Disallow /panel/
Disallow /register
Disallow /register2
Disallow /cdn-cgi/zaraz/
Disallow /verify
Disallow /stat/
Disallow /admin/
Disallow /informer/
Disallow /secure/
Disallow /poll/
Disallow /abnl/
Disallow /*_escaped_fragment_%3D
Disallow /*-*-*-*-987$
Disallow /shop/order
Disallow /shop/printorder
Disallow /shop/checkout
Disallow /shop/user
Disallow /go?*
Disallow /*.php
Disallow /shop/search
Disallow /bini/
Allow /
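
To illustrate how the `*` group's rules might be evaluated, here is a minimal sketch of the longest-match convention described in RFC 9309: `*` matches any run of characters, a trailing `$` anchors the end of the path, the longest matching pattern wins, and Allow wins ties. The function names and the shortened `RULES` list are hypothetical, chosen for illustration. Percent-encoding normalization is omitted, and individual crawlers may behave differently.

```python
import re

def pattern_to_regex(path_pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start.

    `*` matches any run of characters; a trailing `$` anchors the end.
    """
    anchored = path_pattern.endswith("$")
    body = path_pattern[:-1] if anchored else path_pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + regex + ("$" if anchored else ""))

def is_allowed(rules, path):
    """Return True if `path` may be fetched under `rules`.

    `rules` is a list of (kind, pattern) pairs. The longest matching
    pattern wins; on equal length, Allow beats Disallow. No match
    means the path is allowed.
    """
    best = None  # (pattern length, is_allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

# A shortened subset of the `*` group listed above (illustrative only).
RULES = [
    ("disallow", "/search/"),
    ("disallow", "/search"),
    ("disallow", "/*-*-*-*-987$"),
    ("disallow", "/shop/checkout"),
    ("disallow", "/go?*"),
    ("disallow", "/*.php"),
    ("allow", "/"),
]

for path in ["/", "/search/foo", "/index.php", "/a-b-c-d-987", "/a-b-c-d-9870"]:
    print(path, is_allowed(RULES, path))
```

Under this convention the final `Allow /` acts only as a fallback: any more specific Disallow pattern that matches is longer and therefore takes precedence.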

gptbot
google-extended
anthropic-ai
claudebot
ccbot
bytespider
facebookbot
diffbot
applebot-extended
amazonbot
perplexity-user
mistralai-user
google-cloudvertexbot

Rule Path
Disallow /

ahrefsbot
semrushbot
mj12bot
rogerbot
dotbot
exabot
criteobot/0.1
criteobot
barkrowler
screaming frog seo spider

Rule Path
Disallow /

scrapy
wget
python-requests
python-urllib
aiohttp

Rule Path
Disallow /
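
As a rough check of how these groups select a crawler, the sketch below feeds an abbreviated reconstruction of the file to Python's standard `urllib.robotparser`. The `ROBOTS_TXT` text is an assumption assembled from the tables above, not the literal file, and the user-agent strings passed to `can_fetch` are examples. Note that `urllib.robotparser` uses first-match prefix rules and does not interpret the `*` or `$` wildcards, so it is only suitable here for the plain-prefix rules.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated reconstruction of the groups shown above (assumption:
# the real file contains more rules and more user agents per group).
ROBOTS_TXT = """\
User-agent: *
Disallow: /search/
Disallow: /admin/
Allow: /

User-agent: GPTBot
User-agent: ClaudeBot
User-agent: CCBot
Disallow: /

User-agent: AhrefsBot
User-agent: SemrushBot
Disallow: /

User-agent: Scrapy
User-agent: python-requests
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "https://trosell.net"

# Agents named in a blocking group are refused everywhere.
print(parser.can_fetch("GPTBot", BASE + "/"))
print(parser.can_fetch("AhrefsBot/7.0", BASE + "/"))
print(parser.can_fetch("python-requests/2.31", BASE + "/"))

# Agents not named in any group fall back to the `*` group.
print(parser.can_fetch("ExampleBrowser", BASE + "/"))
print(parser.can_fetch("ExampleBrowser", BASE + "/admin/users"))
```

A crawler named in one of the three blocking groups never falls through to the `*` group, which is why the `Allow /` rule there does not apply to it.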

Other Records

Field Value
sitemap https://trosell.net/sitemap.xml
sitemap https://trosell.net/sitemap-planteles.xml

Comments

  • -------------------------------------------------------------------
  • CONTENT SIGNALS COMPLIANCE (https://contentsignals.org/)
  • -------------------------------------------------------------------
  • (a) ai-train=no -> Do not use my data to train models.
  • (b) search=yes -> Index for search (Google, Bing, etc.).
  • (c) ai-input=yes -> Allow live reading (RAG/grounding).
  • -------------------------------------------------------------------
  • --- GENERAL RULES AND CONTENT SIGNALS ---
  • Security and site-structure blocks (your original rules)
  • --- ANTI-TRAINING SECURITY BLOCK (AI TRAINING) ---
  • Here we explicitly block the bots that only want to steal data
  • to train models (including Gemini Training and GPT Training).
  • --- SEO SPIDERS BLOCK (COMMERCIAL SEO TOOLS) ---
  • Blocks Ahrefs, Semrush, etc. to save bandwidth.
  • --- GENERIC SCRAPERS BLOCK ---
  • --- SITEMAPS AND HOST ---

Warnings

  • `content-signal` is not a known field.
  • `host` is not a known field.
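
The scanner flags `content-signal` as unknown, but a consumer that does want to read it could parse the value as a comma-separated list of `key=yes|no` pairs. The sketch below assumes that format; the example value is reconstructed from the comments above (ai-train=no, search=yes, ai-input=yes) rather than copied from the file, and the function name is hypothetical.

```python
def parse_content_signal(value: str) -> dict[str, bool]:
    """Parse a Content-Signal field value into a {signal: allowed} mapping.

    Items that lack `=` or whose value is not `yes`/`no` are skipped.
    """
    signals: dict[str, bool] = {}
    for item in value.split(","):
        key, sep, val = item.strip().partition("=")
        if not sep:
            continue  # no `=` present, not a key/value pair
        val = val.strip().lower()
        if val in ("yes", "no"):
            signals[key.strip().lower()] = (val == "yes")
    return signals

# Assumed value, reconstructed from the comments in this report.
EXAMPLE_VALUE = "ai-train=no, search=yes, ai-input=yes"

print(parse_content_signal(EXAMPLE_VALUE))
```

A parser that does not recognise the field simply ignores the line, so the signals are advisory and sit alongside the explicit Disallow groups rather than replacing them.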