STAMS: STRING-assisted module search for genome wide association studies and application to autism

Sara Hillenmeyer; Lea K Davis; Eric R Gamazon; Edwin H Cook; Nancy J Cox; Russ B Altman

doi:10.1093/bioinformatics/btw530

Bioinformatics. 2016 Dec 15;32(24):3815-3822. doi: 10.1093/bioinformatics/btw530. Epub 2016 Aug 19.

Authors

Sara Hillenmeyer¹, Lea K Davis²˒³, Eric R Gamazon³˒⁴, Edwin H Cook⁵, Nancy J Cox²˒³, Russ B Altman⁶

Affiliations

¹ Biomedical Informatics Training Program, Stanford University, Stanford, CA, USA.
² Vanderbilt Genetics Institute.
³ Division of Genetic Medicine, Department of Medicine, Vanderbilt University, Nashville, TN, USA.
⁴ Academic Medical Center, University of Amsterdam, Amsterdam, The Netherlands.
⁵ Institute for Juvenile Research, Department of Psychiatry, University of Illinois at Chicago, Chicago, IL, USA.
⁶ Departments of Bioengineering and Genetics, Stanford University, Stanford, CA, USA.

Abstract

Motivation: Analyzing genome wide association data in the context of biological pathways helps us understand how genetic variation influences phenotype and increases power to find associations. However, the utility of pathway-based analysis tools is hampered by undercuration and reliance on a distribution of signal across all of the genes in a pathway. Methods that combine genome wide association results with genetic networks to infer the key phenotype-modulating subnetworks combat these issues, but have primarily been limited to network definitions with yes/no labels for gene-gene interactions. A recent method (EW_dmGWAS) incorporates a biological network with weighted edge probability by requiring a secondary phenotype-specific expression dataset. In this article, we combine an algorithm for weighted-edge module searching and a probabilistic interaction network in order to develop a method, STAMS, for recovering modules of genes with strong associations to the phenotype and probable biologic coherence. Our method builds on EW_dmGWAS but does not require a secondary expression dataset and performs better in six test cases.

Results: We show that our algorithm improves over EW_dmGWAS and standard gene-based analysis by measuring precision and recall of each method on separately identified associations. In the Wellcome Trust Rheumatoid Arthritis study, STAMS-identified modules were more enriched for separately identified associations than EW_dmGWAS-identified modules (STAMS P-value = 3.0 × 10^-4; EW_dmGWAS P-value = 0.8). On the Wellcome Trust Type 1 Diabetes data, we demonstrate that the area under the Precision-Recall curve is 5.9 times higher with STAMS than with EW_dmGWAS.
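As an illustration of the evaluation described in the Results, the sketch below computes precision, recall, and a step-wise area under the Precision-Recall curve for a ranked gene list scored against a set of separately identified associations. It is not taken from the STAMS package: the function names, the gene labels, and the choice of average precision as the area estimate are all assumptions made for this example, and the abstract does not state which estimator the authors used.

```python
def precision_recall_points(ranked_genes, known_genes):
    """Return (recall, precision) after each position in a ranked gene list.

    ranked_genes: genes ordered from strongest to weakest prediction.
    known_genes: separately identified associations (hypothetical reference set).
    """
    known = set(known_genes)
    if not known:
        raise ValueError("known_genes must not be empty")
    points = []
    hits = 0
    for k, gene in enumerate(ranked_genes, start=1):
        if gene in known:
            hits += 1
        points.append((hits / len(known), hits / k))
    return points


def average_precision(ranked_genes, known_genes):
    """Average precision: one common step-wise estimate of the area under
    the Precision-Recall curve (an assumption, not the paper's stated method)."""
    known = set(known_genes)
    if not known:
        raise ValueError("known_genes must not be empty")
    hits = 0
    total = 0.0
    for k, gene in enumerate(ranked_genes, start=1):
        if gene in known:
            hits += 1
            total += hits / k  # precision at each rank where a known gene appears
    return total / len(known)


# Hypothetical gene labels; two methods ranking the same four genes.
known = {"G1", "G3"}
method_a = ["G1", "G2", "G3", "G4"]
method_b = ["G2", "G4", "G1", "G3"]

ratio = average_precision(method_a, known) / average_precision(method_b, known)
```

Here `ratio` plays the role of a fold difference in area between two methods, analogous in form to the 5.9-fold figure reported above, although the toy numbers carry no biological meaning.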

Availability and implementation: STAMS is implemented as an R package and is freely available at https://simtk.org/projects/stams

Contact: [email protected]

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

Algorithms*
Autistic Disorder / genetics*
Computational Biology / methods
Gene Regulatory Networks*
Genome-Wide Association Study*
Humans
Phenotype
