Genomic and structural aspects of protein evolution

Cyrus Chothia; Julian Gough

doi:10.1042/BJ20090122

Biochem J. 2009 Apr 1;419(1):15-28. doi: 10.1042/BJ20090122.

Authors

Cyrus Chothia¹, Julian Gough

Affiliation

¹ MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, UK.

PMID: 19272021

Abstract

It has been known for more than 35 years that, during evolution, new proteins are formed by gene duplications, sequence and structural divergence and, in many cases, gene combinations. The genome projects have produced complete, or almost complete, descriptions of the protein repertoires of over 600 distinct organisms. Analyses of these data have dramatically increased our understanding of the formation of new proteins. At present, we can accurately trace the evolutionary relationships of about half the proteins found in most genomes, and it is these proteins that we discuss in the present review. Usually, the units of evolution are protein domains that are duplicated, diverge and form combinations. Small proteins contain one domain, and large proteins contain combinations of two or more domains. Domains descended from a common ancestor are clustered into superfamilies. In most genomes, the net growth of superfamily members means that more than 90% of domains are duplicates. In a section on domain duplications, we discuss the number of currently known superfamilies, their size and distribution, and superfamily expansions related to biological complexity and to specific lineages. In a section on divergence, we describe how sequences and structures diverge, the changes in stability produced by acceptable mutations, and the nature of functional divergence and selection. In a section on domain combinations, we discuss their general nature, the sequential order of domains, how combinations modify function, and the extraordinary variety of the domain combinations found in different genomes. We conclude with a brief note on other forms of protein evolution and speculations on the origins of the duplication, divergence and combination processes.
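The abstract does not say how the "more than 90% of domains are duplicates" figure is calculated. One plausible reading, offered here only as a sketch, assumes that each superfamily present in a genome contributes one founding domain and that every further member of that superfamily arose by duplication. The function name `duplicate_fraction` and the superfamily counts below are hypothetical and illustrative; they are not data from the review.

```python
def duplicate_fraction(superfamily_sizes):
    """Estimate the fraction of domains in a genome that are duplicates.

    Assumption (not stated in the abstract): one domain per superfamily
    is treated as the founder, and all other members as duplicates.

    superfamily_sizes maps a superfamily label to the number of domains
    from that superfamily found in the genome.
    """
    total_domains = sum(superfamily_sizes.values())
    if total_domains == 0:
        raise ValueError("genome has no assigned domains")
    # Only superfamilies that actually occur contribute a founder.
    founders = sum(1 for size in superfamily_sizes.values() if size > 0)
    return (total_domains - founders) / total_domains


# Hypothetical genome: 100 domains spread over 4 superfamilies.
example_genome = {"sf_a": 50, "sf_b": 30, "sf_c": 15, "sf_d": 5}
fraction = duplicate_fraction(example_genome)
print(f"{fraction:.0%} of domains are duplicates")
```

Under this assumption the fraction is simply 1 minus the ratio of superfamilies to domains, so any genome with many more domains than superfamilies will show a high share of duplicates.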

Publication types

Review

MeSH terms

Animals
Evolution, Molecular*
Genome / genetics*
Humans
Mutation
Proteins / chemistry
Proteins / genetics*

Substances

Proteins

Grants and funding

MC_U105184318/MRC_/Medical Research Council/United Kingdom