embrapa.br
robots.txt

Robots Exclusion Standard data for embrapa.br

Resource Scan

Scan Details

Site Domain embrapa.br
Base Domain embrapa.br
Scan Status Ok
Last Scan 2024-05-02T03:00:25+00:00
Next Scan 2024-06-01T03:00:25+00:00

Last Scan

Scanned 2024-05-02T03:00:25+00:00
URL https://embrapa.br/robots.txt
Redirect https://www.embrapa.br/robots.txt
Redirect Domain www.embrapa.br
Redirect Base embrapa.br
Domain IPs 200.202.168.147
Redirect IPs 200.202.168.147
Response IP 200.202.168.147
Found Yes
Hash 5aa4f880afdd532e00ad112a946da9a0ba238def0ef46cb8503dd814ae58adf7
SimHash 7d04e6794697

Groups

adsbot-google
adsbot-google-mobile
amazonbot
amazon cloudfront
apis-google
appengine-google
applebot
bingbot
bingpreview
duckduckbot
facebookexternalhit
feedfetcher-google
google-adwords-instant
googlebot
googlebot-image
googlebot-mobile
googlebot-news
googlebot-video
google favicon
google-inspectiontool
google-physicalweb
google-read-aloud
google-site-verification
google-structured-data-testing-tool
google web preview
google-xrawler
mediapartners-google
msnbot
slurp
w3c-checklink
w3c_css_validator
w3c_i18n-checker
w3c-mobileok
w3c_unicorn
w3c_validator

Rule Path
Disallow /produtos-e-mercado
Disallow /informacao-tecnologica
Disallow /quarentena-vegetal
Disallow /gestao-territorial
Disallow /noticias-rss
Disallow /ativos-para-parcerias
Disallow /web/rede-ilpf
Disallow /*p_p_state%3Dmaximized
Disallow /*p_p_state%3Dpop_up
Disallow /noticias/-/asset_publisher

Other Records

Field Value
crawl-delay 300
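
To illustrate how a crawler might read the group above, here is a minimal sketch using Python's standard `urllib.robotparser`. The `ROBOTS_EXCERPT` text is a hypothetical reconstruction from this report, not the original file: it keeps only three of the listed agents and three plain-prefix rules. The two `/*p_p_state...` wildcard rules are left out because, as far as I know, the standard-library parser does simple prefix matching and does not expand `*`. The path `/sobre` is an assumed example of a URL that no rule covers.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt rebuilt from the report: a subset of agents and
# plain-prefix rules only (wildcard rules omitted on purpose).
ROBOTS_EXCERPT = """\
User-agent: googlebot
User-agent: bingbot
User-agent: duckduckbot
Disallow: /produtos-e-mercado
Disallow: /noticias-rss
Disallow: /web/rede-ilpf
Crawl-delay: 300
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

BASE = "https://www.embrapa.br"

# A path listed under Disallow for this group.
blocked = parser.can_fetch("googlebot", BASE + "/noticias-rss")

# An assumed path that none of the rules in the excerpt match.
allowed = parser.can_fetch("googlebot", BASE + "/sobre")

# Crawl-delay applies to every agent in the group.
delay = parser.crawl_delay("bingbot")

print(blocked, allowed, delay)
```

A delay of 300 seconds between requests is large; a polite crawler honoring it would fetch at most 12 pages per hour from this host.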

a6-indexer
aboundex
acapbot
acoonbot
adbeat_bot
addsearchbot
addthis
adidxbot
admantx
adscanner
adstxtcrawler
advbot
ahc
ahrefs
ahrefsbot
aihitbot
aisearchbot
alphabot
anderspinkbot
antibot
anyevent
apercite
appinsights
arabot
archivebot
archive.org_bot
aspiegelbot
atom feed robot
awariobot
awariorssbot
awariosmartbot
awesomecrawler
axios
b2b bot
backlinkcrawler
baiduspider
baidu-yunguance
barkrowler
bark[rr]owler
bazqux
bdcbot
behloolbot
betabot
bidswitchbot
biglotron
binlar
bitbot
bitlybot
blackboard
blexbot
blogmurabot
blogtrafficd.d+ feed-fetcher
blp_bbot
bnf.fr_bot
bomborabot
bot.araturka.com
botify
bot-pge.chlooe.com
boxcarbot
brainobot
brandonbot
brandverity
btwebclient
bubing
bublupbot
buck
buzzbot
bytespider
caliperbot
capsulechecker
careerbot
ccbot
cc metadata scaper
cfnetwork
changedetection
check_http
checkmarknetwork
chrome-lighthouse
cincraw
citeseerxbot
clickagy
cliqzbot
cloudflare-alwaysonline
coccoc
collection@infegy.com
companybook-crawler
content crawler spider
contextad bot
contxbot
convera
crawler4j
crunchbot
crystalsemanticsbot
curebot
cutbot
cxensebot
cyberpatrol
dareboost
datafeedwatch
dataforseobot
datagnionbot
datanyze
dataprovider.com
daum
dcrawl
deadlinkchecker
deusu
diffbot
digg deeper
digincore bot
discobot
discordbot
disqus
dnyzbot
domaincrawler
domain re-animator bot
domains project
domainstatsbot
dotbot
dragonbot
drupact
dubbotbot
duckduckgo-favicons-bot
ec2linkfinder
edisterbot
electricmonk
elisabot
embedly
epicbot
eright
europarchive.org
everyonesocialbot
exabot
experibot
extlinksbot
eyeotabot
ezid
ezooms
facebot
fast enterprise crawler
fast-webcrawler
fedoraplanet
feedbot
feedly
feedspot
feedvalidator
femtosearchbot
fetch
fever
filterdb.iss.netcrawler
finditanswersbot
findlink
findthatfile
findxbot
flamingo_searchengine
flipboardproxy
fluffy
fr-crawler
freewebmonitoring sitechecker
freshrss
friendica
fuelbot
fyrebot
g00g1e.net
g2reader-bot
g2 web services
garlikcrawler
genieo
ggpht.com
gigablast
gigabot
gingercrawler
gluten free crawler
gnam gnam spider
gnowitnewsbot
google-extended
go-http-client
gowikibot
gptbot
grapeshotcrawler
grobbot
grouphigh
grub.org
gslfbot
gwene
hatena
headlesschrome
heritrix
http_get
httpunit
httpurlconnection
httrack
hubspot
ia_archiver
ias crawler
icbot
icc-crawler
ichiro
imrbot
indeedbot
infoobot
integromedb
intelium_bot
interfaxscanbot
ips-agent
ip-web-crawler.com
iskanie
istellabot
it2media-domain-crawler
james bot
jamie's spider
jetslide
jetty
jobboersebot
jooblebot
jpg-newsbot
jugendschutzprogramm-crawler
jyxobot
k7mlwcbot
kemvibot
kosmiobot
landau-media-spider
laserlikebot
lb-spider
leikibot
libwww-perl
linguee bot
linkapediabot
linkarchiver
linkdex
linkedinbot
linkisbot
lipperhey
livelap[bb]ot
lssbot
lssrocketcrawler
ltx71
luminator-robots
lycos
magpie-crawler
mail.ru_bot
mappydata
mastodon
mauibot
mbcrawler
mediapartners
mediatoolkitbot
megaindex
meltwaternews
memorybot
metajobbot
metauri
mindupbot
miniflux
mixnodecache
mj12bot
mlbot
moatbot
mojeekbot
moodlebot
moreover
msrbot
muckrack
multiviewbot
naver blog rssbot
nerdbynature.bot
nerdybot
netcraftsurveyagent
netestate ne crawler
neticle crawler
netresearchserver
netvibes
newsharecounts
newspaper
nextcloud
niki-bot
nimbostratus-bot
ning
ninja bot
nixstatsbot
nmap scripting engine
ntentbot
nutch
nuzzel
ocarinabot
officestorebot
okhttp
omgili
online-webceo-bot
openhosebot
openindexspider
orangebot
outbrain
outclicksbot
page2rss
pagepeeker
pandalytics
panscient
paperlibot
pcore-http
phantomjs
phpcrawl
pingdom
pinterest.com.bot
piplbot
pocketparser
postrank
pr-cy.ru
presto
primalbot
privacyawarebot
proximic
psbot
pulsepoint
purebot
python-requests
python-urllib
qwantify
rankactivelinkbot
redditbot
refindbot
regionstuttgartbot
retrevopageanalyzer
ridderbot
rivva
rogerbot
rssbot
rssingbot
rytebot
safednsbot
safesearch microdata crawler
sbl-bot
scoutjet
scrapy
screaming frog seo spider
scribdbot
searchatlas
searchmetricsbot
seekbot
seekport crawler
seewithkids
semanticbot
semanticscholarbot
semrushbot
semrushbot
sentibot
seobilitybot
seokicks
seoscanners
serendeputybot
serpstatbot
seznambot
simplecrawler
simplepie
simplescraper
sistrix crawler
sitebot
siteexplorer.info
siteimprove.com
skypeuripreview
slackbot
slack-imgproxy
smtbot
snacktory
socialrankiobot
sogou
sonic
spbot
speedy
startmebot
storygizebot
streamline3bot
summify
superfeedr
surdotlybot
surveybot
swimgbot
sysomos
taboolabot
tagoobot
tangibleebot
telegrambot
teoma
theoldreader.com
thinklab
tigerbot
tineye
tiny tiny rss
toplistbot
toutiaospider
traackr.com
tracemyfile
trendictionbot
trendsmapresolver
trove
turnitin
turnitinbot
tweetedtimes
tweetmemebot
twengabot
twingly
twitterbot
twurly
um-ln
upflow
uptimebot.org
uptimerobot
urlappendbot
usinenouvellecrawler
ut-dorkbot
validator.nu
vebidoobot
velenpublicwebcrawler
veoozbot
vigil
vkrobot
vkshare
voilabot
voluumdsp-content-bot
wbsearchbot
web-archive-net.com.bot
webcompanycrawler
webdatastats
webmon
wesee:search
whatsapp
wocbot
woobot
wordupinfosearch
woriobot
wotbox
www.uptime.com
xenu link sleuth
xovi
xovibot
yacybot
yahoo link preview
yak
yandexaccessibilitybot
yandexbot
yandeximageresizer
yandeximages
yandexmetrika
yandexmobilebot
yandexturbo
yandexvideoparser
yanga
yeti
yisouspider
y!j
yoozbot
zabbix
zenback bot
zgrab
zoombot
zoominfobot
zumbot
zuperlistbot

Rule Path
Disallow /

Other Records

Field Value
sitemap https://www.embrapa.br/sitemap.xml
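
The second group blocks the whole site for every agent it lists. The sketch below shows that effect, again with `urllib.robotparser` and a hypothetical excerpt that keeps only three of the listed agents. `examplebot` is an invented agent name standing in for a crawler that appears in neither group. The report shows no `*` group, so the sketch assumes none exists. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt: three agents from the second group plus the
# sitemap record shown in the report.
ROBOTS_EXCERPT = """\
User-agent: gptbot
User-agent: ccbot
User-agent: semrushbot
Disallow: /

Sitemap: https://www.embrapa.br/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_EXCERPT.splitlines())

HOME = "https://www.embrapa.br/"

# Listed agent: "Disallow: /" matches every path.
listed_ok = parser.can_fetch("gptbot", HOME)

# Invented agent that is in no group; with no "*" group assumed,
# the parser falls back to allowing the request.
unlisted_ok = parser.can_fetch("examplebot", HOME)

# Sitemap records are global and not tied to any group.
sitemaps = parser.site_maps()

print(listed_ok, unlisted_ok, sitemaps)
```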

Warnings

  • 10 invalid lines.
  • `noindex` is not a known field.
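
Warnings like these could come from a simple line-by-line check. The sketch below is one way such a check might work, not the scanner's actual logic. `KNOWN_FIELDS`, `lint_robots`, and the `sample` text are all assumptions made for illustration; the report does not show the ten invalid lines, so the sample lines are invented.

```python
# Assumed set of recognized fields; the scanner's real list is not shown.
KNOWN_FIELDS = {"user-agent", "allow", "disallow", "crawl-delay", "sitemap"}


def lint_robots(text):
    """Return (line_number, message) pairs for lines that look invalid."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        field, sep, _value = line.partition(":")
        if not sep:
            problems.append((number, "missing ':' separator"))
            continue
        name = field.strip().lower()
        if name not in KNOWN_FIELDS:
            problems.append((number, f"`{name}` is not a known field"))
    return problems


# Invented sample: one unknown field and one line without a separator.
sample = """\
User-agent: examplebot
Disallow: /private
Noindex: /drafts
this line has no separator
"""

for number, message in lint_robots(sample):
    print(number, message)
```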