d.jornaldocomercio.com
robots.txt

Robots Exclusion Standard data for d.jornaldocomercio.com

Resource Scan

Scan Details

Site Domain d.jornaldocomercio.com
Base Domain jornaldocomercio.com
Scan Status Ok
Last Scan 2026-01-07T19:29:24+00:00
Next Scan 2026-02-06T19:29:24+00:00

Last Scan

Scanned 2026-01-07T19:29:24+00:00
URL https://d.jornaldocomercio.com/robots.txt
Domain IPs 104.26.8.35, 104.26.9.35, 172.67.72.107, 2606:4700:20::681a:823, 2606:4700:20::681a:923, 2606:4700:20::ac43:486b
Response IP 104.26.9.35
Found Yes
Hash df4a827a481447c74edc023803d063fa0303b7382ab0cf4657a70502afdb7791
SimHash f3293472c95e

Groups

*

Rule Path
Disallow
Disallow /_conteudo/*/index.html
Disallow /_conteudo/ge/newsletter_assine/
Disallow /_conteudo/loja/home/
Disallow /_conteudo/x_agenda_cultural_ignorar/
Disallow /_conteudo/2015/08/
Disallow /_conteudo/2015/07/
Disallow /_conteudo/2015/06/
Disallow /_conteudo/2015/05/
Disallow /_conteudo/2015/04/
Disallow /_conteudo/2015/03/
Disallow /_conteudo/2015/02/
Disallow /_conteudo/2015/01/
Disallow /_conteudo/2018/06/jornal_cidades/
Disallow /_conteudo/2018/05/jornal_cidades/
Disallow /_conteudo/2018/04/jornal_cidades/
Disallow /_conteudo/2018/03/jornal_cidades/
Disallow /_conteudo/2018/02/jornal_cidades/
Disallow /_conteudo/2018/01/jornal_cidades/
Disallow /_conteudo/2017/12/jornal_cidades/
Disallow /_conteudo/2017/11/jornal_cidades/
Disallow /_conteudo/2017/10/jornal_cidades/
Disallow /_conteudo/2017/09/jornal_cidades/
Disallow /_conteudo/2017/08/jornal_cidades/
Disallow /_conteudo/2017/07/jornal_cidades/
Disallow /_conteudo/2017/06/jornal_cidades/
Disallow /_conteudo/2017/05/jornal_cidades/
Disallow /_conteudo/2017/04/jornal_cidades/
Disallow /_conteudo/2017/03/jornal_cidades/
Disallow /_conteudo/2017/02/jornal_cidades/
Disallow /_conteudo/2017/01/jornal_cidades/
Disallow /_conteudo/antiga_p_agina_inicial/
Disallow /_conteudo/2_caderno/
Disallow /_conteudo/contra/
Disallow /_conteudo/ge/newsletter_assine/
Disallow /_conteudo/ge/noticias/
Disallow /_conteudo/google_publisher/
Allow /_conteudo/ge/noticias/*/
Disallow /flip/edicao/
Disallow /_midias/pdf/
Disallow /mobile/
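
The Allow rule for /_conteudo/ge/noticias/*/ sits next to a Disallow for /_conteudo/ge/noticias/, so the outcome depends on how a crawler resolves overlapping rules. The sketch below assumes the longest-match precedence described in RFC 9309 (most specific pattern wins, Allow wins ties); individual crawlers may behave differently. It uses only a subset of the rules listed above, and the function names and test paths are hypothetical, chosen for illustration.

```python
import re

# Subset of the "*" group rules from the scan above.
RULES = [
    ("disallow", "/_conteudo/*/index.html"),
    ("disallow", "/_conteudo/ge/noticias/"),
    ("allow", "/_conteudo/ge/noticias/*/"),
    ("disallow", "/flip/edicao/"),
    ("disallow", "/mobile/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern ('*' wildcard, optional '$' anchor)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be fetched under longest-match precedence."""
    best = None  # (pattern length, is_allow); tuple order makes Allow win ties
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty rule path matches nothing
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


# Hypothetical paths, not taken from the site.
print(is_allowed("/_conteudo/ge/noticias/"))
print(is_allowed("/_conteudo/ge/noticias/2024/story.html"))
print(is_allowed("/mobile/home"))
```

Under these assumptions the section index itself stays blocked, while anything one directory deeper is permitted because the Allow pattern is the longer match.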

heritrix

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 25

ahrefsbot

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 20
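
The heritrix and ahrefsbot groups define no path rules, only a crawl-delay value. As a rough illustration of how a client might read those values, the sketch below uses Python's standard urllib.robotparser. The robots.txt lines are a reconstruction from the parsed groups above, not the raw file, and the helper name delay_for and the agent string "examplebot" are hypothetical. Note that crawl-delay is a nonstandard extension (it is not part of RFC 9309), so not every crawler honors it.

```python
from urllib.robotparser import RobotFileParser

# Assumed reconstruction of the two agent-specific groups from the scan.
ROBOTS_LINES = [
    "User-agent: heritrix",
    "Crawl-delay: 25",
    "",
    "User-agent: ahrefsbot",
    "Crawl-delay: 20",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)


def delay_for(user_agent):
    """Return the crawl delay in seconds for an agent, or None if unset."""
    return parser.crawl_delay(user_agent)


print(delay_for("heritrix"))
print(delay_for("AhrefsBot/7.0"))
print(delay_for("examplebot"))
```

A polite client would typically sleep for the returned number of seconds between requests, and fall back to its own default when the result is None.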