cariri24horas.com.br
robots.txt

Robots Exclusion Standard data for cariri24horas.com.br

Resource Scan

Scan Details

Site Domain cariri24horas.com.br
Base Domain cariri24horas.com.br
Scan Status Ok
Last Scan 2024-09-26T16:27:33+00:00
Next Scan 2024-10-03T16:27:33+00:00

Last Scan

Scanned 2024-09-26T16:27:33+00:00
URL https://cariri24horas.com.br/robots.txt
Redirect https://www.cariri24horas.com.br/robots.txt
Redirect Domain www.cariri24horas.com.br
Redirect Base cariri24horas.com.br
Domain IPs 216.239.32.21, 216.239.34.21, 216.239.36.21, 216.239.38.21
Redirect IPs 2404:6800:4003:c00::79, 74.125.68.121
Response IP 74.125.68.121
Found Yes
Hash 43ac9f3d5f8b38c07bbb7eb897d96053e0d1def620a95b363c9dbbf309f770f3
SimHash 8c5a945ece37

Groups

*

Rule Path
Disallow /search
Disallow /*_archive.html$
Disallow /feeds/*

mediapartners-google

Rule Path
Disallow
Allow /
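
As a rough illustration of how the two groups above might be evaluated, here is a minimal Python sketch. It assumes the wildcard semantics commonly used by major crawlers and described in RFC 9309 ("*" matches any run of characters, a trailing "$" anchors the end of the path, and the longest matching pattern wins, with Allow winning ties). The names GROUPS, pattern_to_regex and is_allowed are hypothetical helpers, not part of any library, and the user-agent lookup is simplified to an exact, case-insensitive match.

```python
import re

# Rule groups as listed in the scan: (directive, path pattern) pairs.
GROUPS = {
    "*": [
        ("disallow", "/search"),
        ("disallow", "/*_archive.html$"),
        ("disallow", "/feeds/*"),
    ],
    "mediapartners-google": [
        ("disallow", ""),  # an empty Disallow blocks nothing
        ("allow", "/"),
    ],
}


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and join them with '.*' where '*' appeared.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def is_allowed(path, agent="*"):
    """Return True if `path` may be crawled by `agent` (assumed semantics)."""
    # Simplification: exact lowercase lookup, falling back to the '*' group.
    rules = GROUPS.get(agent.lower(), GROUPS["*"])
    verdict, best_len = True, -1  # no matching rule means allowed
    for directive, pattern in rules:
        if not pattern:
            continue  # empty pattern matches nothing
        if pattern_to_regex(pattern).match(path):
            allow = directive == "allow"
            # Longest pattern wins; Allow wins a tie.
            if len(pattern) > best_len or (len(pattern) == best_len and allow):
                verdict, best_len = allow, len(pattern)
    return verdict


if __name__ == "__main__":
    for p in ["/", "/search/label/x", "/2024_09_archive.html", "/feeds/posts/default"]:
        print(p, is_allowed(p), is_allowed(p, "Mediapartners-Google"))
```

Under these assumptions, generic crawlers are kept out of search results, monthly archive pages and feeds, while the AdSense crawler (mediapartners-google) may fetch every path.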

Other Records

Field Value
sitemap https://www.cariri24horas.com.br/sitemap.xml
sitemap https://www.cariri24horas.com.br/sitemap-pages.xml

Comments

  • Open to all robots
  • Blocks
  • Google AdSense
  • Index the home page
  • XML sitemap for up to 1000 post entries
  • Index pages (I want this)
  • If you ever exceed 1000 posts, just remove the "#" in front of Sitemap
  • Sitemap: https://www.cariri24horas.com.br/atom.xml?redirect=false&start-index=1001&max-results=1500
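
The last two comments describe a commented-out Sitemap line that pages through the Atom feed. A minimal sketch of how such a URL could be assembled is shown below; the helper name feed_sitemap_url is hypothetical, and the parameter values are simply those that appear in the comment above.

```python
from urllib.parse import urlencode


def feed_sitemap_url(base, start_index, max_results):
    """Build an Atom-feed URL of the form quoted in the robots.txt comment."""
    query = urlencode({
        "redirect": "false",
        "start-index": start_index,
        "max-results": max_results,
    })
    return f"{base}/atom.xml?{query}"


if __name__ == "__main__":
    # Values taken from the commented-out Sitemap line.
    print(feed_sitemap_url("https://www.cariri24horas.com.br", 1001, 1500))
```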