FitSNPs: highly differentially expressed genes are more likely to have variants associated with disease

Rong Chen; Alex A Morgan; Joel Dudley; Tarangini Deshpande; Li Li; Keiichi Kodama; Annie P Chiang; Atul J Butte

doi:10.1186/gb-2008-9-12-r170

Genome Biol. 2008;9(12):R170. doi: 10.1186/gb-2008-9-12-r170. Epub 2008 Dec 5.

Authors

Rong Chen¹, Alex A Morgan, Joel Dudley, Tarangini Deshpande, Li Li, Keiichi Kodama, Annie P Chiang, Atul J Butte

Affiliation

¹ Stanford Center for Biomedical Informatics Research, 251 Campus Drive, Stanford, CA 94305, USA.

Abstract

Background: Candidate single nucleotide polymorphisms (SNPs) from genome-wide association studies (GWASs) were often selected for validation based on their functional annotation, which was inadequate and biased. We propose to use the more than 200,000 microarray studies in the Gene Expression Omnibus to systematically prioritize candidate SNPs from GWASs.

Results: We analyzed all human microarray studies from the Gene Expression Omnibus, and calculated the observed frequency of differential expression, which we called differential expression ratio, for every human gene. Analysis of a comprehensive list of curated disease genes revealed a positive association between differential expression ratio values and the likelihood of harboring disease-associated variants. By considering highly differentially expressed genes, we were able to rediscover disease genes with 79% specificity and 37% sensitivity. We successfully distinguished true disease genes from false positives in multiple GWASs for multiple diseases. We then derived a list of functionally interpolating SNPs (fitSNPs) to analyze the top seven loci of Wellcome Trust Case Control Consortium type 1 diabetes mellitus GWASs, rediscovered all type 1 diabetes mellitus genes, and predicted a novel gene (KIAA1109) for an unexplained locus 4q27. We suggest that fitSNPs would work equally well for both Mendelian and complex diseases (being more effective for cancer) and propose candidate genes to sequence for their association with 597 syndromes with unknown molecular basis.
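As an illustration only, the differential expression ratio described above can be sketched as the fraction of studies measuring a gene in which that gene was called differentially expressed, followed by a threshold-based check against a curated disease-gene list. The abstract does not specify how differential expression was called or which cutoff was used, so the function names, the toy gene labels, the 0.5 threshold, and the input layout below are all assumptions, not the authors' implementation.

```python
from collections import defaultdict


def differential_expression_ratio(observations):
    """Return {gene: fraction of studies in which the gene was differentially expressed}.

    `observations` is an iterable of (gene, is_differentially_expressed) pairs,
    one pair per gene per study. This input layout is an assumption.
    """
    measured = defaultdict(int)      # number of studies that measured each gene
    differential = defaultdict(int)  # number of those studies with a differential call
    for gene, is_de in observations:
        measured[gene] += 1
        if is_de:
            differential[gene] += 1
    return {gene: differential[gene] / measured[gene] for gene in measured}


def sensitivity_specificity(ratios, disease_genes, threshold):
    """Score the rule 'ratio >= threshold predicts a disease gene' against a curated set."""
    tp = fp = tn = fn = 0
    for gene, ratio in ratios.items():
        predicted = ratio >= threshold
        actual = gene in disease_genes
        if predicted and actual:
            tp += 1
        elif predicted and not actual:
            fp += 1
        elif not predicted and actual:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data with hypothetical gene labels; not taken from the study.
toy_observations = [
    ("GENE_A", True), ("GENE_A", True), ("GENE_A", False), ("GENE_A", True),
    ("GENE_B", False), ("GENE_B", False), ("GENE_B", True), ("GENE_B", False),
    ("GENE_C", True), ("GENE_C", False),
    ("GENE_D", False), ("GENE_D", False),
]
toy_disease_genes = {"GENE_A", "GENE_B"}  # stand-in for a curated disease-gene list

toy_ratios = differential_expression_ratio(toy_observations)
toy_sensitivity, toy_specificity = sensitivity_specificity(
    toy_ratios, toy_disease_genes, threshold=0.5  # cutoff chosen arbitrarily
)
```

Raising the threshold in such a scheme would typically trade sensitivity for specificity, which is consistent with the high-specificity, lower-sensitivity operating point reported above.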

Conclusions: Our study demonstrates that highly differentially expressed genes are more likely to harbor disease-associated DNA variants. FitSNPs can serve as an effective tool to systematically prioritize candidate SNPs from GWASs.

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Diabetes Mellitus / genetics
Disease / genetics*
Gene Expression*
Genome-Wide Association Study
Humans
Polymorphism, Single Nucleotide*
