caracol.com.co
robots.txt

Robots Exclusion Standard data for caracol.com.co

Resource Scan

Scan Details

Site Domain caracol.com.co
Base Domain caracol.com.co
Scan Status Ok
Last Scan 2024-11-08T10:57:41+00:00
Next Scan 2024-11-15T10:57:41+00:00

Last Scan

Scanned 2024-11-08T10:57:41+00:00
URL https://caracol.com.co/robots.txt
Domain IPs 23.209.46.73, 23.209.46.81, 2600:1413:b000:13::b857:c18a, 2600:1413:b000:13::b857:c192
Response IP 23.45.207.178
Found Yes
Hash f192c8cfc39483389b98000c0453eea5f499e9fe3f37ca9820808da8821193c1
SimHash a80715513611

Groups

*

Rule Path
Disallow /pxlctl.gif
Disallow /pxlctl2.gif
Disallow /*.swf$
Disallow /pruebas/
Disallow /includes/
Disallow /images/
Disallow /amp/nota.aspx
Disallow /feed.aspx
Disallow /dmz/
Disallow /i/
Disallow /pf/
Disallow /mnt/
Disallow /embed/
Disallow /ThreadeskupSimple
Disallow /tag//*
Disallow /m/
Disallow /datosdeportes/
Disallow /buscar/
Disallow /buscador/
Disallow /estaticos/
Disallow /URL_DE_DESTINO/
Disallow /error/404/
Disallow /mnt/
Allow /pf/dist/components/combinations/
Allow /pf/dist/engine/
Allow /pf/resources/caracol-colombia/img/
Allow /pf/resources/caracol-colombia/favicon
Allow /pf/resources/dist/css/caracol-colombia/
Disallow /search-audios/
Disallow /embed/
Disallow /audio/
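
The group above mixes a broad Disallow /pf/ with narrower Allow rules under the same prefix. As a rough illustration of how a crawler might resolve that, the sketch below applies the longest-match convention from the Robots Exclusion Protocol (the most specific matching rule wins, and Allow wins a tie), with `*` treated as a wildcard and a trailing `$` as an end anchor. The helper names `pattern_to_regex` and `is_allowed` and the sample paths are hypothetical, and only a subset of the listed rules is included; real crawlers may differ in details.

```python
import re

# A subset of the rules listed above for the "*" group.
RULES = [
    ("disallow", "/pf/"),
    ("allow", "/pf/dist/engine/"),
    ("disallow", "/*.swf$"),
    ("disallow", "/buscar/"),
    ("disallow", "/audio/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # "*" matches any run of characters; everything else is literal.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under the longest-match convention."""
    best = None  # (pattern length, rule is an Allow)
    for kind, pattern in rules:
        if pattern_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            # Longer pattern wins; on equal length, Allow (True) beats Disallow.
            if best is None or candidate > best:
                best = candidate
    # With no matching rule, crawling is allowed by default.
    return True if best is None else best[1]


if __name__ == "__main__":
    for sample in ("/pf/dist/engine/react.js", "/pf/api/v3/content", "/media/player.swf"):
        print(sample, is_allowed(sample))
```

Under these assumptions, a path below /pf/dist/engine/ is crawlable even though /pf/ as a whole is blocked, because the Allow pattern is longer than the Disallow pattern that also matches it.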

Other Records

Field Value
sitemap https://caracol.com.co/arc/outboundfeeds/googlenewssitemap/latest/?outputType=xml
sitemap https://caracol.com.co/arc/outboundfeeds/google-news-feed/?outputType=xml
sitemap https://caracol.com.co/arc/outboundfeeds/news-sitemap/2022-09-27/?outputType=xml
sitemap https://caracol.com.co/arc/outboundfeeds/news-sitemap/latest/?outputType=xml
sitemap https://caracol.com.co/arc/outboundfeeds/news-sitemap-index/?outputType=xml
sitemap https://caracol.com.co/arc/outboundfeeds/sitemap.xml?outputType=xml
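
For readers who want to pull the sitemap records out of a robots.txt file themselves, here is a minimal sketch using Python's standard `urllib.robotparser` (its `site_maps()` method requires Python 3.8 or later). The `ROBOTS_EXCERPT` text is a hand-written excerpt built from two of the records above, not the full file, and `list_sitemaps` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Hand-written excerpt for illustration; the real file contains more lines.
ROBOTS_EXCERPT = """\
User-agent: *
Disallow: /buscar/
Sitemap: https://caracol.com.co/arc/outboundfeeds/news-sitemap/latest/?outputType=xml
Sitemap: https://caracol.com.co/arc/outboundfeeds/sitemap.xml?outputType=xml
"""


def list_sitemaps(robots_text):
    """Return the Sitemap URLs declared in a robots.txt body, in file order."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    # site_maps() returns None when no Sitemap lines are present.
    return parser.site_maps() or []


if __name__ == "__main__":
    for url in list_sitemaps(ROBOTS_EXCERPT):
        print(url)
```

Parsing from a string keeps the example offline; fetching the live file would instead use `set_url()` followed by `read()`.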

Comments

  • Update 14-06-2023
  • Update: 16052023
  • Resources blocked from PEP
  • Search
  • Static files
  • Specific
  • Allow
  • Update 14-06-2023