tvnovedadestv.com.co
robots.txt

Robots Exclusion Standard data for tvnovedadestv.com.co

Resource Scan

Scan Details

Site Domain tvnovedadestv.com.co
Base Domain tvnovedadestv.com.co
Scan Status Ok
Last Scan 2024-07-02T04:24:55+00:00
Next Scan 2024-08-01T04:24:55+00:00

Last Scan

Scanned 2024-07-02T04:24:55+00:00
URL https://www.tvnovedadestv.com.co/robots.txt
Domain IPs 2600:9000:23d2:1400:13:772d:5cc0:93a1, 2600:9000:23d2:3200:13:772d:5cc0:93a1, 2600:9000:23d2:4800:13:772d:5cc0:93a1, 2600:9000:23d2:5a00:13:772d:5cc0:93a1, 2600:9000:23d2:9800:13:772d:5cc0:93a1, 2600:9000:23d2:9c00:13:772d:5cc0:93a1, 2600:9000:23d2:d400:13:772d:5cc0:93a1, 2600:9000:23d2:f200:13:772d:5cc0:93a1, 54.192.18.122, 54.192.18.40, 54.192.18.83, 54.192.18.96
Response IP 18.155.68.21
Found Yes
Hash 8c2909781e9ac9c9ac0bf44357d96a651890f766263dafdd2d374ebc07e603b6
SimHash 63d6793bcea1
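
A content hash like the one above lets two scans be compared without storing the whole file. The sketch below is only an illustration: the 64-hex-digit value is consistent with SHA-256, but the scanner's actual algorithm and input normalization are not stated on this page, so both are assumptions. The function name `fingerprint` is hypothetical.

```python
import hashlib


def fingerprint(robots_body: bytes) -> str:
    """Return a hex digest of the raw robots.txt bytes.

    Assumption: SHA-256 over the unmodified response body. The report
    does not say which algorithm or normalization the scanner uses.
    """
    return hashlib.sha256(robots_body).hexdigest()


# Two scans with identical bodies give the same fingerprint;
# any change to the body gives a different one.
previous = fingerprint(b"User-agent: *\nDisallow: /login\n")
current = fingerprint(b"User-agent: *\nDisallow: /login\nDisallow: /admin\n")
changed = previous != current
```

An exact hash only says whether the file changed at all; a similarity hash such as the SimHash field is the kind of value used to estimate how much it changed.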

Groups

*

Rule Path
Allow /*.css$
Allow /*.jpeg$
Allow /*.js$
Allow /*.png$
Allow /*.webp
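
The Allow paths above use the two special characters defined for robots.txt paths: `*` matches any sequence of characters and a trailing `$` anchors the pattern to the end of the URL path. A minimal sketch of that matching follows, assuming those standard semantics; the helper names and the sample paths are hypothetical and are not taken from the site.

```python
import re


def robots_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a robots.txt path pattern into a compiled regex.

    '*' becomes '.*'; a trailing '$' anchors the end of the path.
    Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + (r"\Z" if anchored else ""))


def path_matches(pattern: str, path: str) -> bool:
    # re.match anchors at the start, which mirrors prefix matching.
    return robots_pattern_to_regex(pattern).match(path) is not None


# Hypothetical sample paths for illustration only.
css_plain = path_matches("/*.css$", "/static/site.css")       # ends in .css
css_query = path_matches("/*.css$", "/static/site.css?v=2")   # '$' rejects the suffix
webp_query = path_matches("/*.webp", "/img/a.webp?w=200")     # no '$', so a suffix is fine
```

Note the difference between `/*.css$` and `/*.webp` in the group above: without the trailing `$`, the `.webp` rule also covers URLs that continue after the extension, such as ones carrying a query string.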

*

Rule Path Comment
Disallow /_secure -
Disallow /account -
Disallow /admin -
Disallow /busca -
Disallow /buscapagina -
Disallow /buscavazia -
Disallow /checkout/ -
Disallow /checkout/ /cart
Disallow /checkout/cart/add -
Disallow /checkout /
Disallow /coleccion/ -
Disallow /control/ -
Disallow /espiar/ -
Disallow /files/ -
Disallow /img/ -
Disallow /lista-de-deseos -
Disallow /login -
Disallow /quick-view/ -
Disallow /Sistema/ -
Disallow /Sistema/404 -
Disallow /Sistema/buscavazia -
Disallow /wishlist -
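
Disallow rules without wildcards are plain path prefixes, so they can be checked with Python's standard `urllib.robotparser`. The sketch below feeds the parser a small subset of the rules listed above, written back out in robots.txt syntax; the `is_allowed` helper and the `/televisores` path are hypothetical and used only for illustration.

```python
from urllib.robotparser import RobotFileParser

# A subset of the group above, reconstructed as robots.txt lines.
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow: /checkout/",
    "Disallow: /login",
    "Disallow: /wishlist",
]
BASE = "https://www.tvnovedadestv.com.co"

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)


def is_allowed(path: str, agent: str = "*") -> bool:
    """Return True if `agent` may fetch BASE + path under the subset above."""
    return parser.can_fetch(agent, BASE + path)


login_ok = is_allowed("/login")                 # blocked by 'Disallow: /login'
cart_ok = is_allowed("/checkout/cart/add")      # blocked by the '/checkout/' prefix
catalog_ok = is_allowed("/televisores")         # hypothetical path, no rule applies
```

The standard-library parser compares paths by prefix and does not interpret `*` or `$`, so it is only suitable for the wildcard-free rules in this group, not for the Allow patterns or the `/*.aspx` rule.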

ubicrawler
doc
zao
twiceler
sitecheck.internetseer.com
zealbot
msiecrawler
sitesnagger
webstripper
webcopier
fetch
offline explorer
teleport
teleportpro
webzip
linko
httrack
microsoft.url.control
xenu
larbin
libwww
zyborg
download ninja
nutch
spock
omniexplorer_bot
turnitinbot
becomebot
geniebot
dotbot
mlbot
linguee bot
aihitbot
exabot
sbider/nutch
jyxobot
magent
mj12bot
speedy spider
shopwiki
huasai
datacha0s
baiduspider
atomic_email_hunter
mp3bot
winhttp
betabot
core-project
panscient.com
java
libwww-perl
wget
webreaper
grub-client
k2spider
npbot
adsbot-google

Rule Path
Disallow

googlebot

Rule Path
Disallow

googlebot-image

Rule Path
Disallow
Disallow /

*

Rule Path
Disallow /*.aspx

Other Records

Field Value
sitemap https://www.tvnovedadestv.com.co/sitemap.xml

Comments

  • ALLOW RULES THAT MUST ALWAYS BE PRESENT
  • URLS THAT MUST NOT APPEAR
  • MALICIOUS BOTS
  • AVOID 404 ERRORS
  • SITEMAPS

Warnings

  • 1 invalid line.