Proteome-Scale Detection of Differential Conservation Patterns at Protein and Subprotein Levels with BLUR

Genome Biol Evol. 2021 Jan 7;13(1):evaa248. doi: 10.1093/gbe/evaa248.

Abstract

In the multiomics era, comparative genomics studies based on gene repertoire comparison are increasingly used to investigate the evolutionary histories of species, to study genotype-phenotype relations and species adaptation to various environments, or to predict gene function using phylogenetic profiling. However, comparisons of orthologs have highlighted the prevalence of sequence plasticity among species, which shows the benefit of combining protein and subprotein levels of analysis for a more comprehensive study of genotype-phenotype correlations. In this article, we introduce a new approach called BLUR (BLAST Unexpected Ranking), which detects genotype divergence or specialization between two related clades at different levels: the gain or loss of whole proteins and also of subprotein regions. These regions can correspond to known domains, uncharacterized regions, or even small motifs. Our method was designed to support two research strategies: 1) the comparison of two groups of species with no prior knowledge, with the aim of predicting phenotype differences or specializations between close species, and 2) the study of a specific phenotype by comparing species that present the phenotype of interest with species that do not. We designed a website to facilitate the use of BLUR, with in-depth analysis of the results available through tools such as functional enrichment, protein-protein interaction networks, and multiple sequence alignments. We applied our method to the study of two different biological pathways and to the comparison of several groups of close species, with promising results in each case. BLUR is freely available at http://lbgi.fr/blur/.
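
The abstract does not describe BLUR's ranking procedure, so the following is only a toy illustration of the general idea of differential conservation at the protein and subprotein levels, not the published method. It assumes that alignment hits for one query protein are already available as query coordinate intervals per species; the function names, the `min_len` threshold, and the example species labels are all hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # 1-based, inclusive query coordinates of one hit


def covered(length: int, hits: List[Interval]) -> List[bool]:
    """Mark which query positions are covered by at least one hit."""
    mask = [False] * length
    for start, end in hits:
        for pos in range(max(start, 1), min(end, length) + 1):
            mask[pos - 1] = True
    return mask


def differential_regions(
    length: int,
    hits_by_species: Dict[str, List[Interval]],
    group_a: List[str],
    group_b: List[str],
    min_len: int = 10,  # hypothetical minimum region size, not from the paper
) -> List[Interval]:
    """Query regions aligned in every group_a species and in no group_b species.

    A region spanning the whole query suggests a protein-level difference;
    a shorter region suggests a subprotein-level difference.
    """
    masks = {
        sp: covered(length, hits_by_species.get(sp, []))
        for sp in group_a + group_b
    }
    regions: List[Interval] = []
    start = None
    for i in range(length + 1):  # one extra step closes a region at the end
        diff = (
            i < length
            and all(masks[sp][i] for sp in group_a)
            and not any(masks[sp][i] for sp in group_b)
        )
        if diff and start is None:
            start = i
        elif not diff and start is not None:
            if i - start >= min_len:
                regions.append((start + 1, i))
            start = None
    return regions


# Made-up example: species b1 lacks the C-terminal part of the query.
example_hits = {"a1": [(1, 100)], "a2": [(1, 100)], "b1": [(1, 60)]}
print(differential_regions(100, example_hits, ["a1", "a2"], ["b1"]))
```

In this sketch, a species with no hits at all yields a region covering the full query, which corresponds to the protein-level gain/loss case, while a partial hit yields only the missing segment.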

Keywords: comparative genomics; evolution; genotype/phenotype relations; sequence analysis.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Armadillo Domain Proteins
  • Bacteria
  • Conserved Sequence / genetics
  • Evolution, Molecular*
  • Fungi
  • Genomics / methods*
  • Genotype
  • Humans
  • Phenotype
  • Phylogeny
  • Proteins / genetics*
  • Proteome / genetics*
  • Proteome / metabolism*
  • Sequence Alignment
  • Sequence Analysis
  • Software

Substances

  • Armadillo Domain Proteins
  • ODAD2 protein, human
  • Proteins
  • Proteome