A Comprehensive Evaluation of Disease Phenotype Networks for Gene Prioritization

PLoS One. 2016 Jul 14;11(7):e0159457. doi: 10.1371/journal.pone.0159457. eCollection 2016.

Abstract

Identification of disease-causing genes is a fundamental challenge for human health studies. Phenotypic similarity among diseases may reflect interactions at the molecular level, so phenotype comparison can be used to predict candidate disease genes. Online Mendelian Inheritance in Man (OMIM) is a database of human genetic diseases and related genes that has become an authoritative source of disease phenotypes. However, OMIM describes disease phenotypes in free text, so phenotypic descriptions must be standardized before diseases can be compared. Several disease phenotype networks have been established from OMIM using different standardization methods. Two of these networks are important for phenotypic similarity analysis: the first and most commonly used network (mimMiner) is standardized with Medical Subject Headings (MeSH), and the other network (resnikHPO) is the first to be standardized with the Human Phenotype Ontology (HPO). This paper presents the first comprehensive evaluation of the accuracy of these two networks in gene prioritization based on protein-protein interactions, using large-scale leave-one-out cross-validation experiments. The results show that both networks can effectively prioritize disease-causing genes, and that relating two diseases through a logistic function improves prioritization performance. Tanimoto, one of four methods for normalizing resnikHPO, generates a symmetric network that performs similarly to mimMiner. Furthermore, an integration of the two networks outperforms either network alone in gene prioritization, indicating that the two disease networks are complementary.
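The abstract does not give formulas for the Tanimoto normalization or the logistic function, so the following is only an illustrative sketch under stated assumptions. It assumes the common Tanimoto form sim(a,b) / (sim(a,a) + sim(b,b) - sim(a,b)) applied to a raw similarity matrix, and a logistic transform of the form 1 / (1 + exp(c*s + d)). The function names and the default values of c and d are hypothetical choices for illustration, not values taken from the paper.

```python
import math


def tanimoto_normalize(raw):
    """Normalize a square matrix of raw similarity scores.

    Assumed form (not confirmed by the abstract):
        norm[i][j] = raw[i][j] / (raw[i][i] + raw[j][j] - raw[i][j])
    If raw is symmetric, the result is symmetric with 1.0 on the diagonal.
    """
    n = len(raw)
    norm = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            denom = raw[i][i] + raw[j][j] - raw[i][j]
            # Guard against a zero denominator (e.g. terms with no information).
            norm[i][j] = raw[i][j] / denom if denom > 0 else 0.0
    return norm


def logistic_transform(s, c=-15.0, d=math.log(9999)):
    """Map a similarity score s in [0, 1] through a logistic curve.

    c and d are hypothetical parameters: with these defaults a score of 0
    maps to 1e-4 and high scores approach 1, which suppresses weak
    similarities and emphasizes strong ones.
    """
    return 1.0 / (1.0 + math.exp(c * s + d))


# Toy raw similarities for two diseases (made-up numbers, for illustration only).
raw_scores = [[4.0, 2.0],
              [2.0, 3.0]]
normalized = tanimoto_normalize(raw_scores)
weighted = [[logistic_transform(s) for s in row] for row in normalized]
```

With the assumed form, the normalized matrix inherits the symmetry of the raw scores, which is the property the abstract attributes to the Tanimoto variant of resnikHPO.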

MeSH terms

  • Algorithms
  • Gene Regulatory Networks / genetics*
  • Genes / genetics
  • Genetic Diseases, Inborn / genetics*
  • Health Priorities
  • Humans
  • Models, Biological
  • Phenotype

Grants and funding

This work was funded by the National Natural Science Foundation of China (61302013 and 61372014; http://www.nsfc.gov.cn) to YYT and YK, Science and Technology Plan of Liaoning Province of China (2014305001; http://www.lninfo.gov.cn/) to YYT and Fundamental Research Funds for the Central Universities of China (N141008001 and N130219001; http://www.moe.gov.cn) to YYT and SLQ. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.