jn.pt
robots.txt

Robots Exclusion Standard data for jn.pt

Resource Scan

Scan Details

Site Domain	jn.pt
Base Domain	jn.pt
Scan Status	Ok
Last Scan	2024-11-13T18:22:25+00:00
Next Scan	2024-11-20T18:22:25+00:00
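
The two timestamps above are exactly seven days apart, which suggests a weekly rescan, although the report does not state the schedule. A minimal sketch of that arithmetic, using only the values shown (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Scan Details rows above.
last_scan = datetime.fromisoformat("2024-11-13T18:22:25+00:00")
next_scan = datetime.fromisoformat("2024-11-20T18:22:25+00:00")

# Gap between the two scans; a weekly schedule is an inference, not a documented setting.
interval = next_scan - last_scan
print(interval.days)  # 7
```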

Last Scan

Scanned	2024-11-13T18:22:25+00:00
URL	https://jn.pt/robots.txt
Redirect	https://www.jn.pt/robots.txt
Redirect Domain	www.jn.pt
Redirect Base	jn.pt
Domain IPs	104.26.2.211, 104.26.3.211, 172.67.71.226, 2606:4700:20::681a:2d3, 2606:4700:20::681a:3d3, 2606:4700:20::ac43:47e2
Redirect IPs	104.26.2.211, 104.26.3.211, 172.67.71.226, 2606:4700:20::681a:2d3, 2606:4700:20::681a:3d3, 2606:4700:20::ac43:47e2
Response IP	104.26.3.211
Found	Yes
Hash	6f23915a0f1574010fe65790dc74372278626f7d70fe672c3552ebc38187f93a
SimHash	acd908b2c6a1
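
The Hash value is 64 hexadecimal characters long, which is the length of a SHA-256 digest. Assuming the scanner hashes the raw response body with SHA-256 (the report does not say so), a value of that form could be computed as sketched below. The function name `body_digest` and the sample body are hypothetical; the sample is not the real file, so its digest will not match the value above.

```python
import hashlib

def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body."""
    return hashlib.sha256(body).hexdigest()

# Hypothetical sample body, not the actual contents of https://www.jn.pt/robots.txt.
sample = b"User-agent: gptbot\nDisallow: /\n"
digest = body_digest(sample)
print(len(digest))  # 64, the same length as the Hash field
```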

Groups

*

Rule	Path
Disallow

googlebot
googlebot-video
bingbot
baiduspider
baiduspider-mobile
baiduspider-video
baiduspider-image
naverbot
yeti
yandex
yandexbot
yandexmobilebot
yandexvideo
yandexwebmaster
yandexsitelinks
seznambot

Rule	Path
Allow	/

adsbot-google
twitterbot
adidxbot

Rule	Path
Allow	/

yahoo pipes 1.0
facebot
externalfacebookhit
semrushbot
semrushbot-sa
mj12bot
ahrefsbot

Rule	Path
Disallow	/*?*
Disallow	/newsgen/*
Disallow	/page/*

ia_archiver

Rule	Path
Allow	/$
Disallow	/*
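
The pair `Allow /$` and `Disallow /*` relies on the `$` end anchor and the `*` wildcard. Under the longest-match convention described in RFC 9309, with ties going to Allow, this permits only the home page. The sketch below illustrates that convention; it is not the scanner's code, whether ia_archiver evaluates rules this way is not stated in the report, and the helper names `rule_to_regex` and `is_allowed` are hypothetical.

```python
import re

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into a regex anchored at the start."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # Escape literal parts, then let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_allowed(rules, path: str) -> bool:
    """rules is a list of (kind, pattern) pairs, kind being 'allow' or 'disallow'."""
    best = None  # (pattern length, is_allow) of the most specific match so far
    for kind, pattern in rules:
        if not pattern:
            continue  # an empty pattern matches nothing
        if rule_to_regex(pattern).match(path):
            candidate = (len(pattern), kind == "allow")
            # Longer patterns win; on equal length, allow (True) beats disallow.
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]

# Rules copied from the ia_archiver group above.
ia_rules = [("allow", "/$"), ("disallow", "/*")]
print(is_allowed(ia_rules, "/"))        # True
print(is_allowed(ia_rules, "/page/2"))  # False
```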

ia_archiver-web.archive.org

Rule	Path
Allow	/$
Disallow	/*

meltawer
digimind
knowings
sindup
talkwater
turnitinbot
converacrawler
jetbot
newsnow
kbcrawl
amisoftware
newzbin
ask n read
qwam content intelligence
zite
flipboard
youmag
synthesio
trendybuzz
spotter
scoop.it
linkfluence
augure
corporama
grub-client
k2spider
libwww
wget
adequat
adequat-systems
auramundi
coexel
ellisphere
leadbox
mention
moreover
mytwip
newsnow
newzbin
opinion-tracker
proxem
score3
trendeo
vecteurplus
verticalsearch
vsw
winello
fetch
infoseek
msiecrawler
offline explorer
sitecheck.internetseer.com
teleport
teleportpro
webcopier
webstripper
zealbot
asknread.com
ellisphere
spotter
omgilibot
omgili

Rule	Path
Disallow	/

gptbot

Rule	Path
Disallow	/

ccbot

Rule	Path
Disallow	/

google-extended

Rule	Path
Disallow	/
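
The effect of these groups can be checked with Python's standard `urllib.robotparser`. The text below is a partial reconstruction from the groups listed above, not the verbatim file, and it leaves out the groups that use `*` or `$` in paths because this parser does not interpret those wildcards. The agent name `some-other-bot` is hypothetical and stands for any crawler that falls through to the `*` group.

```python
from urllib.robotparser import RobotFileParser

# Partial reconstruction of selected groups from the report; not the verbatim file.
RECONSTRUCTED = """\
User-agent: *
Disallow:

User-agent: googlebot
User-agent: bingbot
Allow: /

User-agent: gptbot
Disallow: /

User-agent: ccbot
Disallow: /

User-agent: google-extended
Disallow: /
"""

parser = RobotFileParser()
parser.parse(RECONSTRUCTED.splitlines())

# Print whether each agent may fetch the home page under the reconstructed rules.
for agent in ("googlebot", "gptbot", "ccbot", "google-extended", "some-other-bot"):
    print(agent, parser.can_fetch(agent, "https://www.jn.pt/"))
```

Under these rules the three AI-related agents are refused, while the search crawlers and any unlisted agent are permitted.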

Comments

The use of web crawlers or other automated browsing methods on this site is prohibited.
We prohibit crawling of our site using an agent that does not match its identity, pursuant to Article 75(2)(w) of Decree-Law No. 63/85 of 14 March.
We invite you to contact us to subscribe to a user licence. Only our partners have the right to use our content for a purpose that is not strictly individual.
Excluded robots.
Disable ChatGPT crawler
Disable CommonCrawl
Disable BARD and Vortex AI crawler

Warnings

2 invalid lines.
