Prioritization of differentially expressed genes through integrating public expression data

Anim Genet. 2019 Dec;50(6):726-732. doi: 10.1111/age.12855. Epub 2019 Sep 12.

Abstract

Differentially expressed gene (DEG) analysis is a major approach for interpreting phenotype differences, but it produces a large number of candidate genes. Because validating many genes through benchwork is burdensome, there is an urgent need for DEG prioritization. Here, a novel method is proposed for prioritizing bona fide DEGs by constructing the normal range of gene expression through the integration of public expression data. Prioritization was performed by ranking genes according to the difference in cumulative probability between the case and control groups. DEGs from a study on pig muscle tissue were used to evaluate the prioritization accuracy. The results showed that the method reached an area under the receiver operating characteristic curve of 96.42% and can effectively shorten the list of candidate genes from a differential expression experiment to find novel causal genes. Our method can be easily extended to other tissues or species to promote functional research in broad applications.
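
The ranking step described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the normal range of each gene is represented by the empirical cumulative distribution of its public expression values, and that each group is summarized by its mean expression; the abstract does not specify either choice. The function names, gene labels, and toy values are all hypothetical.

```python
from bisect import bisect_right
from statistics import mean


def cumulative_probability(reference, value):
    """Empirical CDF: fraction of reference values that are <= value."""
    ordered = sorted(reference)
    return bisect_right(ordered, value) / len(ordered)


def prioritize(public, case, control):
    """Rank genes by |F(case mean) - F(control mean)|, largest first.

    F is the empirical CDF built from public expression values
    (an assumed stand-in for the "normal range" in the abstract).
    """
    scores = {}
    for gene, reference in public.items():
        p_case = cumulative_probability(reference, mean(case[gene]))
        p_control = cumulative_probability(reference, mean(control[gene]))
        scores[gene] = abs(p_case - p_control)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy, made-up expression values for two hypothetical genes.
public = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
    "GENE_B": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0],
}
case = {"GENE_A": [9.0, 10.0], "GENE_B": [5.0, 6.0]}
control = {"GENE_A": [1.0, 2.0], "GENE_B": [4.0, 5.0]}

ranking = prioritize(public, case, control)
for gene, score in ranking:
    print(gene, round(score, 3))
```

In this toy example, GENE_A moves from the low end to the high end of its reference distribution and is therefore ranked above GENE_B, whose case and control values both sit near the middle of the range.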

Keywords: candidate gene; gene expression; normal range.

MeSH terms

  • Access to Information
  • Animals
  • Databases, Genetic
  • Gene Expression*
  • Information Storage and Retrieval*
  • Muscle, Skeletal / metabolism
  • Sus scrofa / genetics*