Biofilter: a knowledge-integration system for the multi-locus analysis of genome-wide association studies

Pac Symp Biocomput. 2009:368-79.

Abstract

Genome-wide association studies provide an unprecedented opportunity to identify combinations of genetic variants that contribute to disease susceptibility. Jointly analyzing the millions of genetic variations accessible by high-throughput genotyping technologies is a difficult combinatorial problem. One approach to reducing the search space of this variable selection problem is to assess specific combinations of genetic variations based on prior statistical and biological knowledge. In this work, we provide a systematic approach to integrate multiple public databases of gene groupings and sets of disease-related genes to produce multi-SNP models that have an established biological foundation. This approach yields a collection of models that can be tested statistically in genome-wide data, along with an ordinal quantity describing the number of data sources that support any given model. This knowledge-driven approach reduces the computational and statistical burden of large-scale interaction analysis while simultaneously providing a biological foundation for the relevance of any significant statistical result that is found.
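The idea of deriving multi-SNP models from gene groupings, each tagged with a count of supporting data sources, can be illustrated with a minimal Python sketch. This is not the Biofilter implementation: the function names (gene_pair_support, snp_models), the source names, the gene and SNP identifiers, and the restriction to two-SNP models are all assumptions made for illustration. The sketch assumes each source is a mapping from group name to a list of genes, and that a separate mapping from gene to SNPs is available.

```python
from collections import defaultdict
from itertools import combinations, product


def gene_pair_support(sources):
    """For each unordered gene pair, count the sources that place
    both genes in at least one common group.

    `sources` is a hypothetical structure: {source_name: {group_name: [genes]}}.
    """
    support = defaultdict(set)
    for source_name, groups in sources.items():
        for genes in groups.values():
            # sorted() gives each pair a canonical order (a < b)
            for a, b in combinations(sorted(set(genes)), 2):
                support[(a, b)].add(source_name)
    # The ordinal quantity: number of distinct sources per pair
    return {pair: len(names) for pair, names in support.items()}


def snp_models(pair_counts, gene_to_snps, min_support=1):
    """Expand supported gene pairs into two-SNP models (snp_a, snp_b, count)."""
    models = []
    for (a, b), count in pair_counts.items():
        if count < min_support:
            continue
        for s1, s2 in product(gene_to_snps.get(a, []), gene_to_snps.get(b, [])):
            if s1 != s2:  # skip a SNP paired with itself (overlapping genes)
                models.append((s1, s2, count))
    # Best-supported models first, then alphabetical for stable output
    models.sort(key=lambda m: (-m[2], m[0], m[1]))
    return models


# Made-up example data; none of these identifiers come from the paper.
sources = {
    "source_A": {"group1": ["GENE1", "GENE2", "GENE3"]},
    "source_B": {"group2": ["GENE1", "GENE2"]},
}
gene_to_snps = {
    "GENE1": ["snp_1"],
    "GENE2": ["snp_2", "snp_3"],
    "GENE3": ["snp_4"],
}

counts = gene_pair_support(sources)
models = snp_models(counts, gene_to_snps)
for model in models:
    print(model)
```

In this toy example only the GENE1/GENE2 pair appears in both sources, so its SNP pairs carry a support count of 2 and sort ahead of the rest. Raising min_support to 2 would restrict testing to those models, which is one way the search space could be reduced under these assumptions.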

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Biometry
  • Computer Simulation
  • Databases, Genetic
  • Epistasis, Genetic
  • Genetic Predisposition to Disease
  • Genome-Wide Association Study / statistics & numerical data*
  • Humans
  • Knowledge Bases*
  • Models, Genetic
  • Polymorphism, Single Nucleotide