ebi.ac.uk
robots.txt

Robots Exclusion Standard data for ebi.ac.uk

Resource Scan

Scan Details

Site Domain ebi.ac.uk
Base Domain ebi.ac.uk
Scan Status Ok
Last Scan 2024-10-25T21:34:33+00:00
Next Scan 2024-11-24T21:34:33+00:00

Last Scan

Scanned 2024-10-25T21:34:33+00:00
URL https://ebi.ac.uk/robots.txt
Redirect https://www.ebi.ac.uk/robots.txt
Redirect Domain www.ebi.ac.uk
Redirect Base ebi.ac.uk
Domain IPs 193.62.193.80
Redirect IPs 193.62.193.80
Response IP 193.62.193.80
Found Yes
Hash c28297438aa3f3d6a180b9e2fc394fbe98bb5f78bb703cc2a235b89eb8d523b8
SimHash 38b432186ef0
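
The Hash value above is 64 hexadecimal characters long, which is consistent with a SHA-256 digest, although the report does not name the algorithm. Assuming it is SHA-256 over the raw response body, a change check between two scans might look like the sketch below. The function names body_digest and has_changed are hypothetical and are not part of the scanner.

```python
import hashlib


def body_digest(body: bytes) -> str:
    """Return the SHA-256 hex digest of a robots.txt response body.

    Assumption: the report's Hash field is SHA-256 of the raw bytes.
    """
    return hashlib.sha256(body).hexdigest()


def has_changed(previous_digest: str, body: bytes) -> bool:
    """Compare a stored digest with the digest of a freshly fetched body."""
    return body_digest(body) != previous_digest.lower()
```

If the assumption holds, a scanner could store only the digest from the last scan and skip re-parsing whenever has_changed returns False.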

Groups

sogou spider

Rule Path
Disallow /Tools/services

baiduspider

Rule Path
Disallow /chebi

mj12bot

Rule Path
Disallow /

naverbot

Rule Path
Disallow /

semrushbot

Rule Path
Disallow /biomodels

semrushbot-sa

Rule Path
Disallow /biomodels

twiceler

Rule Path
Disallow /

discobot

Rule Path
Disallow /

ebisearch

No rules defined. All paths allowed.

Other Records

Field Value
crawl-delay 0

*

Rule Path
Disallow /*private*
Disallow /~
Disallow /3D
Disallow /abc
Disallow /awstats
Disallow /beta/
Disallow /bioinformatics
Disallow /biosamples/beta
Disallow /biomodels/papers
Disallow /cgi-bin/printable
Disallow /cgi-bin/webservices
Disallow /cgi-bin/*?
Disallow /cgi_files
Disallow /citexplore
Disallow /Compass
Disallow /contrib
Disallow /cpg
Disallow /Databases/Genome_MOT/
Disallow /dbases
Disallow /dbest
Disallow /disease
Disallow /downloads
Disallow /dx.doi.org
Disallow /ebibbs
Disallow /ebi_docs
Disallow /ega/dev
Disallow /embnet.news
Disallow /ena/data/warehouse
Disallow /ena/data/view
Disallow /ensembl/biomart
Disallow /EnsMart
Disallow /epo
Disallow /epo-dev
Disallow /errors
Disallow /es/doku
Disallow /euro2004
Disallow /eurocarb
Disallow /euprojects
Disallow /fiv97
Disallow /flybase
Disallow /fssp
Disallow /Genome_MOT
Disallow /genomes/mot/
Disallow /Groups/images
Disallow /Groups/include
Disallow /hinxton
Disallow /images
Disallow /include
Disallow /industry/private
Disallow /industry_SME
Disallow /info
Disallow /Information/images
Disallow /Information/include
Disallow /Information/local
Disallow /Information/Seminars/EU1998/
Disallow /intact/graph2MIF
Disallow /intact/hierarchView
Disallow /intact/mine/
Disallow /intact/search
Disallow /intact/statisticView
Disallow /integr8
Disallow /interpro/internal-tools
Disallow /interpro/internal-tools/
Disallow /interpro/ISpy
Disallow /interpro/RI*
Disallow /ismb97
Disallow /local
Disallow /login
Disallow /metabolights/MTBLS*/files/*
Disallow /metagenomics/beta
Disallow /miamexpress/devhelp
Disallow /microarray/ArrayExpress
Disallow /microarray-as/atlas/qr
Disallow /microarray/ExpressionProfiler
Disallow /microarray/General/Internal
Disallow /microarray/MIAMExpress
Disallow /microarray/interfaces
Disallow /_mm
Disallow /msd.old/
Disallow /msd-srv/ssm/cgi-bin/
Disallow /newt
Disallow /oib97
Disallow /oib98
Disallow /parasites
Disallow /pdbe-srv/view/search
Disallow /powerpoint
Disallow /profiles
Disallow /proteincol
Disallow /protein_profiles
Disallow /proteomeApplications
Disallow /PSD
Disallow /pub
Disallow /research/cgg
Disallow /RHdb
Disallow /scripts
Disallow /searches/netsearch.html
Disallow /services.html
Disallow /services/images
Disallow /services/include
Disallow /sift
Disallow /site_indexer
Disallow /sites/ebi.ac.uk/files/person/image/Emanuele_Alpi.jpeg
Disallow /sites/ebi.ac.uk/files/styles/medium/public/person/image/Emanuele_Alpi.jpeg
Disallow /soaplab
Disallow /software
Disallow /SSC
Disallow /subs
Disallow /sw
Disallow /Systems
Disallow /tc-test
Disallow /Templates
Disallow /test
Disallow /text
Disallow /textonly
Disallow /thornton-srv/databases/enzymes
Disallow /thornton-srv/databases/practicals
Disallow /thornton-srv/databases/targets
Disallow /tops
Disallow /wc2002
Disallow /webservices/documentation/
Disallow /xs
Disallow /includes/
Disallow /misc/
Disallow /modules/
Disallow /profiles/
Disallow /scripts/
Disallow /themes/
Disallow /CHANGELOG.txt
Disallow /cron.php
Disallow /INSTALL.mysql.txt
Disallow /INSTALL.pgsql.txt
Disallow /INSTALL.sqlite.txt
Disallow /install.php
Disallow /INSTALL.txt
Disallow /LICENSE.txt
Disallow /MAINTAINERS.txt
Disallow /update.php
Disallow /UPGRADE.txt
Disallow /xmlrpc.php
Disallow /admin
Disallow /comment/reply
Disallow /filter/tips
Disallow /node/add
Disallow /search
Disallow /user/register
Disallow /user/password
Disallow /user/login
Disallow /user/logout
Disallow /ega/user/register
Disallow /ega/user/password
Disallow /ega/user/login
Disallow /ega/user/logout
Disallow /taxonomy
Disallow /training/Studentships
Disallow /training/studentships/howtoapply
Disallow /training/studentships/phdprojects
Disallow /training/online/print
Disallow /training/online/printpdf
Disallow /rdf/services/*/sparql
Disallow /interpro/*export%3Dfasta
Disallow /interpro/*export%3Dids
Disallow /interpro/*export%3Dtsv
Disallow /interpro/condensed
Disallow /interpro/*/proteins-matched
Disallow /panda/jira/
Disallow /about/events/seminars/
Disallow /about/events/events/internal-seminar/
Disallow /about/events/internal-events/
Disallow /about/events/events/internal-event/
Disallow /emdb/search

Other Records

Field Value
crawl-delay 10

googlebot

Rule Path
Disallow /ebisearch/summary/api/identificationsearchboxs4/*
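
The groups above can be queried programmatically. The sketch below uses Python's standard urllib.robotparser on a small hand-copied excerpt of the rules rather than the full file. It assumes the crawl-delay 0 record belongs to the ebisearch group, as the report appears to show. The excerpt uses only non-wildcard paths, because urllib.robotparser matches rules by plain prefix. The names EXCERPT, load_rules, allowed, and the agent SomeBot are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hand-copied excerpt of the groups listed above (not the full file).
# Assumption: "crawl-delay 0" belongs to the ebisearch group.
EXCERPT = """\
User-agent: mj12bot
Disallow: /

User-agent: ebisearch
Crawl-delay: 0

User-agent: *
Disallow: /beta/
Disallow: /search
Crawl-delay: 10
"""

# Redirect target reported in the scan.
BASE = "https://www.ebi.ac.uk"


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(EXCERPT)


def allowed(agent: str, path: str) -> bool:
    """Return True if the agent may fetch the path under the excerpt."""
    return rules.can_fetch(agent, BASE + path)


if __name__ == "__main__":
    for agent, path in [("mj12bot", "/"), ("ebisearch", "/search"),
                        ("SomeBot", "/search"), ("SomeBot", "/about")]:
        print(agent, path, allowed(agent, path), rules.crawl_delay(agent))
```

An agent with its own group, such as ebisearch, is evaluated against that group only, so the Disallow rules and the crawl-delay of 10 in the * group do not apply to it.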

Comments

  • Robots exclusion file for http://www.ebi.ac.uk/
  • See https://gitlab.ebi.ac.uk/ebiwd/ebi.ac.uk-root-assets
  • Any questions/comments to www-dev@ebi.ac.uk
  • User-agent: msnbot
  • Disallow: /
  • Drupal site
  • Directories
  • Files
  • Paths (clean URLs)
  • Taxonomies
  • Nodes
  • Block robots for RDF services, RT#19945
  • NB only Google and Yahoo understand wildcards
  • Block robots for EBI internal seminars & internal events
  • INC0060214 - Block EMDB search
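
The comment about wildcards matters for rules such as /rdf/services/*/sparql and /cgi-bin/*? in the * group: a crawler that only does literal prefix matching will not apply them as intended. RFC 9309 defines * (any run of characters) and a trailing $ (end of path) as special characters. The sketch below is one way to translate such a rule into a regular expression and contrast it with literal prefix matching. It is not how any particular crawler implements matching, and the function names are hypothetical.

```python
import re


def rule_to_regex(rule_path: str) -> "re.Pattern[str]":
    """Translate a robots.txt path rule with * and trailing $ to a regex."""
    anchored = rule_path.endswith("$")
    if anchored:
        rule_path = rule_path[:-1]
    # Escape every literal piece (so "?" and "." stay literal) and
    # join the pieces with ".*" where the rule had "*".
    body = ".*".join(re.escape(piece) for piece in rule_path.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def wildcard_match(rule_path: str, url_path: str) -> bool:
    """Match from the start of the path, honouring * and trailing $."""
    return rule_to_regex(rule_path).match(url_path) is not None


def prefix_match(rule_path: str, url_path: str) -> bool:
    """Literal prefix matching, as a wildcard-unaware parser would do."""
    return url_path.startswith(rule_path)


if __name__ == "__main__":
    rule = "/rdf/services/*/sparql"
    path = "/rdf/services/chembl/sparql"  # hypothetical example path
    print(wildcard_match(rule, path), prefix_match(rule, path))
```

Under literal prefix matching the * is treated as an ordinary character, so the example path is not blocked, whereas the wildcard-aware matcher blocks it.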