A multi-dimensional integrative scoring framework for predicting functional variants in the human genome

Am J Hum Genet. 2022 Mar 3;109(3):446-456. doi: 10.1016/j.ajhg.2022.01.017. Epub 2022 Feb 24.

Abstract

Attempts to identify and prioritize functional DNA elements in coding and non-coding regions, particularly through use of in silico functional annotation data, continue to increase in popularity. However, specific functional roles can vary widely from one variant to another, making it challenging to summarize different aspects of variant function with a one-dimensional rating. Here we propose multi-dimensional annotation-class integrative estimation (MACIE), an unsupervised multivariate mixed-model framework capable of integrating annotations of diverse origin to assess multi-dimensional functional roles for both coding and non-coding variants. Unlike existing one-dimensional scoring methods, MACIE views variant functionality as a composite attribute encompassing multiple characteristics and estimates the joint posterior functional probabilities of each genomic position. This estimate offers more comprehensive and interpretable information in the presence of multiple aspects of functionality. Applied to a variety of independent coding and non-coding datasets, MACIE demonstrates powerful and robust performance in discriminating between functional and non-functional variants. We also show an application of MACIE to fine-mapping and heritability enrichment analysis by using the lipids GWAS summary statistics data from the European Network for Genetic and Genomic Epidemiology Consortium.
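To make the idea of a joint posterior over several functional classes concrete, the following is a minimal sketch and not the authors' implementation. It assumes two hypothetical binary classes per genomic position (say, class 1 and class 2), one annotation score per class, Gaussian emissions that are independent given the latent state, and fixed parameters. The names `PRIOR`, `EMISSION`, `joint_posterior`, and `marginal`, and all numeric values, are illustrative assumptions. MACIE itself fits a generalized linear mixed model by an EM algorithm, which this sketch does not attempt; it only applies Bayes' rule over the four joint states.

```python
import math
from itertools import product


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


# Hypothetical prior over the four joint latent states (c1, c2),
# where c_k = 1 means "functional with respect to class k".
PRIOR = {(0, 0): 0.90, (1, 0): 0.04, (0, 1): 0.04, (1, 1): 0.02}

# Hypothetical (mean, sd) of each class's annotation score when that
# class is off (0) or on (1). Real parameters would be estimated, e.g. by EM.
EMISSION = [
    {0: (0.0, 1.0), 1: (2.0, 1.0)},  # class 1
    {0: (0.0, 1.0), 1: (3.0, 1.0)},  # class 2
]


def joint_posterior(scores):
    """Posterior probability of every joint state given one score per class."""
    weights = {}
    for state in product((0, 1), repeat=len(scores)):
        likelihood = 1.0
        for k, (x, c) in enumerate(zip(scores, state)):
            mean, sd = EMISSION[k][c]
            likelihood *= normal_pdf(x, mean, sd)
        weights[state] = PRIOR[state] * likelihood
    total = sum(weights.values())
    return {state: w / total for state, w in weights.items()}


def marginal(posterior, k):
    """Marginal posterior probability that class k is 'on'."""
    return sum(p for state, p in posterior.items() if state[k] == 1)


# A position with a high class-1 score and a near-baseline class-2 score.
post = joint_posterior([4.0, 0.1])
```

Here `post` assigns a probability to each of the four joint states rather than collapsing them into a single score, and `marginal(post, k)` recovers a per-class probability when one is needed. Keeping the joint distribution is what lets a position be called functional in one class but not the other.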

Keywords: EM algorithm; functional annotations; generalized linear mixed model; multi-dimensional integrated scores; prediction of functional effect.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Genome, Human* / genetics
  • Genome-Wide Association Study* / methods
  • Genomics
  • Humans
  • Molecular Sequence Annotation
  • Polymorphism, Single Nucleotide / genetics
  • Probability