jn.pt
robots.txt

Robots Exclusion Standard data for jn.pt

Resource Scan

Scan Details

Site Domain jn.pt
Base Domain jn.pt
Scan Status Ok
Last Scan 2024-06-25T20:28:50+00:00
Next Scan 2024-07-02T20:28:50+00:00

Last Scan

Scanned 2024-06-25T20:28:50+00:00
URL https://jn.pt/robots.txt
Redirect https://www.jn.pt/robots.txt
Redirect Domain www.jn.pt
Redirect Base jn.pt
Domain IPs 104.26.2.211, 104.26.3.211, 172.67.71.226, 2606:4700:20::681a:2d3, 2606:4700:20::681a:3d3, 2606:4700:20::ac43:47e2
Redirect IPs 104.26.2.211, 104.26.3.211, 172.67.71.226, 2606:4700:20::681a:2d3, 2606:4700:20::681a:3d3, 2606:4700:20::ac43:47e2
Response IP 104.26.2.211
Found Yes
Hash 6f23915a0f1574010fe65790dc74372278626f7d70fe672c3552ebc38187f93a
SimHash acd908b2c6a1
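
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest. The scanner's actual algorithm and input are not stated in this report, so the following is only a minimal sketch, assuming the fingerprint is a SHA-256 digest of the raw robots.txt response body. The function name `fingerprint` is hypothetical.

```python
import hashlib


def fingerprint(body: bytes) -> str:
    """Return a hex SHA-256 digest of a robots.txt body.

    Assumption: the report's "Hash" field is computed this way.
    The report itself does not confirm the algorithm.
    """
    return hashlib.sha256(body).hexdigest()


# Illustrative body only, not the real jn.pt file.
sample = b"User-agent: gptbot\nDisallow: /\n"
print(fingerprint(sample))
```

Comparing digests between scans is a cheap way to detect that a file changed at all; a similarity hash such as the SimHash field is typically used to judge how much it changed.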

Groups

*

Rule Path
Disallow (empty path: nothing is disallowed)

googlebot
googlebot-video
bingbot
baiduspider
baiduspider-mobile
baiduspider-video
baiduspider-image
naverbot
yeti
yandex
yandexbot
yandexmobilebot
yandexvideo
yandexwebmaster
yandexsitelinks
seznambot

Rule Path
Allow /

adsbot-google
twitterbot
adidxbot

Rule Path
Allow /

yahoo pipes 1.0
facebot
externalfacebookhit
semrushbot
semrushbot-sa
mj12bot
ahrefsbot

Rule Path
Disallow /
Disallow /*?*
Disallow /newsgen/*
Disallow /page/*

ia_archiver

Rule Path
Allow /$
Disallow /*

ia_archiver-web.archive.org

Rule Path
Allow /$
Disallow /*
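
The two ia_archiver groups above pair `Allow /$` with `Disallow /*`, which permits only the home page. A minimal sketch of how such rules can be evaluated, assuming the longest-match semantics of RFC 9309 (`*` matches any run of characters, a trailing `$` anchors the end of the path, and Allow wins a tie). The helper names `pattern_to_regex` and `is_allowed` are hypothetical and do not describe how any particular crawler implements this.

```python
import re


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile("^" + body + (r"\Z" if anchored else ""))


def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """Apply (kind, pattern) rules; the longest matching pattern wins."""
    best_len = -1
    best_allow = True  # no matching rule means the path is allowed
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty path matches nothing
        if pattern_to_regex(pattern).match(path):
            allow = kind == "allow"
            longer = len(pattern) > best_len
            tie_for_allow = len(pattern) == best_len and allow and not best_allow
            if longer or tie_for_allow:
                best_len, best_allow = len(pattern), allow
    return best_allow


# Rules of the ia_archiver groups as listed in this report.
ia_archiver_rules = [("allow", "/$"), ("disallow", "/*")]
print(is_allowed("/", ia_archiver_rules))
print(is_allowed("/nacional/", ia_archiver_rules))
```

For the path `/` both patterns match with equal length, so the Allow rule is chosen; any deeper path matches only `/*` and is disallowed.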

meltawer
digimind
knowings
sindup
talkwater
turnitinbot
converacrawler
jetbot
newsnow
kbcrawl
amisoftware
newzbin
ask n read
qwam content intelligence
zite
flipboard
youmag
synthesio
trendybuzz
spotter
scoop.it
linkfluence
augure
corporama
grub-client
k2spider
libwww
wget
adequat
adequat-systems
auramundi
coexel
ellisphere
leadbox
mention
moreover
mytwip
newsnow
newzbin
opinion-tracker
proxem
score3
trendeo
vecteurplus
verticalsearch
vsw
winello
fetch
infoseek
msiecrawler
offline explorer
sitecheck.internetseer.com
teleport
teleportpro
webcopier
webstripper
zealbot
asknread.com
ellisphere
spotter
omgilibot
omgili

Rule Path
Disallow /

gptbot

Rule Path
Disallow /

ccbot

Rule Path
Disallow /

google-extended

Rule Path
Disallow /
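
The gptbot, ccbot and google-extended groups above each disallow the whole site, while the `*` group disallows nothing. A minimal sketch of checking this with Python's standard `urllib.robotparser`, assuming the groups are written out as in the reconstructed excerpt below (it is not the verbatim jn.pt file, and `ExampleBot` is a made-up agent name). Note that `urllib.robotparser` does plain prefix matching and does not interpret `*` or `$` inside paths, so it is only suitable here because these groups use `Disallow: /`.

```python
from urllib.robotparser import RobotFileParser

# Reconstructed excerpt of the groups listed in this report (assumption:
# the original file spells them roughly like this).
ROBOTS_LINES = [
    "User-agent: *",
    "Disallow:",
    "",
    "User-agent: gptbot",
    "Disallow: /",
    "",
    "User-agent: ccbot",
    "Disallow: /",
    "",
    "User-agent: google-extended",
    "Disallow: /",
]

parser = RobotFileParser()
parser.parse(ROBOTS_LINES)  # parse in-memory lines instead of fetching a URL

for agent in ("GPTBot", "CCBot", "Google-Extended", "ExampleBot"):
    allowed = parser.can_fetch(agent, "https://www.jn.pt/")
    print(f"{agent}: {'allowed' if allowed else 'disallowed'}")
```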

Comments

  • The use of web crawlers or other automated browsing methods on this site is prohibited.
  • We prohibit crawling our site with an agent that does not match your identity, in accordance with paragraph 2, subparagraph w), of Article 75 of Decree-Law No. 63/85 of 14 March.
  • We invite you to contact us to subscribe to a user license. Only our partners have the right to use our content for purposes that are not strictly individual.
  • Excluded robots.
  • Disable ChatGPT crawler
  • Disable CommonCrawl
  • Disable Bard and Vertex AI crawler

Warnings

  • 2 invalid lines.