The ortholog conjecture is untestable by the current gene ontology but is supported by RNA sequencing data

PLoS Comput Biol. 2012;8(11):e1002784. doi: 10.1371/journal.pcbi.1002784. Epub 2012 Nov 29.

Abstract

The ortholog conjecture posits that orthologous genes are functionally more similar than paralogous genes. This conjecture is a cornerstone of phylogenomics and is used daily by both computational and experimental biologists in predicting, interpreting, and understanding gene functions. A recent study, however, challenged the ortholog conjecture on the basis of experimentally derived Gene Ontology (GO) annotations and microarray gene expression data in human and mouse. It instead proposed that the functional similarity of homologous genes is primarily determined by the cellular context in which the genes act, explaining why a greater functional similarity of (within-species) paralogs than (between-species) orthologs was observed. Here we show that GO-based functional similarity between human and mouse orthologs, relative to that between paralogs, has been increasing in the last five years. Further, compared with paralogs, orthologs are less likely to be included in the same study, causing an underestimation of their functional similarity. A close examination of functional studies of homologs with identical protein sequences reveals experimental biases, annotation errors, and homology-based functional inferences that are labeled in GO as experimental. These problems and the temporary nature of the GO-based finding make the current GO inappropriate for testing the ortholog conjecture. RNA sequencing (RNA-Seq) is known to be superior to microarrays for comparing the expression levels of different genes or of genes in different species. Our analysis of a large RNA-Seq dataset of multiple tissues from eight mammals and the chicken shows that the expression similarity between orthologs is significantly higher than that between within-species paralogs, supporting the ortholog conjecture and refuting the cellular context hypothesis for gene expression. We conclude that the ortholog conjecture remains largely valid to the extent that it has been tested, but further scrutiny using more and better functional data is needed.
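
The comparison of expression similarity between orthologs and within-species paralogs can be illustrated with a minimal sketch. The abstract does not state which similarity measure was used, so the Pearson correlation of expression levels across tissues is an assumption here, and the gene names, tissue values, and helper functions are hypothetical toy data rather than anything taken from the study.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical expression levels in four tissues (same tissue order for every gene).
expression = {
    ("human", "GENE_A"): [10.0, 2.0, 5.0, 1.0],
    ("mouse", "Gene_a"): [9.0, 2.5, 4.5, 1.2],   # assumed ortholog of human GENE_A
    ("human", "GENE_A2"): [1.0, 8.0, 2.0, 6.0],  # assumed paralog of human GENE_A
}

# Hypothetical pair lists: between-species orthologs and within-species paralogs.
ortholog_pairs = [(("human", "GENE_A"), ("mouse", "Gene_a"))]
paralog_pairs = [(("human", "GENE_A"), ("human", "GENE_A2"))]


def mean_similarity(pairs, expr):
    """Average expression similarity over a list of gene pairs."""
    return mean(pearson(expr[a], expr[b]) for a, b in pairs)


ortholog_similarity = mean_similarity(ortholog_pairs, expression)
paralog_similarity = mean_similarity(paralog_pairs, expression)

print(f"orthologs: {ortholog_similarity:.3f}")
print(f"paralogs:  {paralog_similarity:.3f}")
```

With real data, the two pair lists would contain many gene pairs, and the difference between the two distributions of similarity values would be assessed with a statistical test rather than by comparing two means.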

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Animals
  • Female
  • Gene Expression Profiling
  • Genes*
  • Genomics / methods*
  • Humans
  • Liver / chemistry
  • Liver / metabolism
  • Male
  • Mice
  • Molecular Sequence Annotation / methods*
  • Oligonucleotide Array Sequence Analysis
  • Proteins / classification
  • Proteins / genetics
  • Proteins / physiology
  • Sequence Analysis, RNA / methods*

Substances

  • Proteins