Emergence of de novo proteins from 'dark genomic matter' by 'grow slow and moult'

Biochem Soc Trans. 2015 Oct;43(5):867-73. doi: 10.1042/BST20150089.

Abstract

Proteins are the workhorses of the cell and, over billions of years, they have evolved an amazing plethora of extremely diverse and versatile structures with equally diverse functions. The evolutionary emergence of new proteins and transitions between existing ones are believed to be rare or even impossible. However, recent advances in comparative genomics have repeatedly identified some 10%-30% of all genes as lacking any detectable similarity to existing proteins. Even after careful scrutiny, some of those orphan genes contain protein-coding reading frames with detectable transcription and translation. Thus some proteins seem to have emerged from previously non-coding 'dark genomic matter'. These 'de novo' proteins tend to be disordered, fast evolving and weakly expressed, yet they also rapidly assume novel and physiologically important functions. Here we review the mechanisms by which 'de novo' proteins might be created, the circumstances under which they may become fixed and why they are elusive. We propose a 'grow slow and moult' model in which a reading frame is first extended, coding for an initially disordered and non-globular appendage which, over time, becomes more structured and may also become associated with other proteins.

Keywords: domain rearrangements; orphan genes; protein disorder; protein evolution.

Publication types

  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Arthropods / physiology*
  • Daphnia / physiology
  • Databases, Genetic
  • Databases, Protein
  • Evolution, Molecular*
  • Gene Expression Regulation, Developmental*
  • Genome*
  • Insect Proteins / genetics
  • Insect Proteins / metabolism
  • Insecta / physiology
  • Ixodes / physiology
  • Models, Genetic*
  • Mutation
  • Protein Structure, Tertiary
  • Proteome / genetics
  • Proteome / metabolism*
  • Reading Frames
  • Structural Homology, Protein

Substances

  • Insect Proteins
  • Proteome