IMHOTEP-a composite score integrating popular tools for predicting the functional consequences of non-synonymous sequence variants

Nucleic Acids Res. 2017 Feb 17;45(3):e13. doi: 10.1093/nar/gkw886.

Abstract

The in silico prediction of the functional consequences of mutations is an important goal of human pathogenetics. However, bioinformatic tools that classify mutations according to their functionality employ different algorithms so that predictions may vary markedly between tools. We therefore integrated nine popular prediction tools (PolyPhen-2, SNPs&GO, MutPred, SIFT, MutationTaster2, Mutation Assessor and FATHMM as well as conservation-based Grantham Score and PhyloP) into a single predictor. The optimal combination of these tools was selected by means of a wide range of statistical modeling techniques, drawing upon 10 029 disease-causing single nucleotide variants (SNVs) from Human Gene Mutation Database and 10 002 putatively ‘benign’ non-synonymous SNVs from UCSC. Predictive performance was found to be markedly improved by model-based integration, whilst maximum predictive capability was obtained with either random forest, decision tree or logistic regression analysis. A combination of PolyPhen-2, SNPs&GO, MutPred, MutationTaster2 and FATHMM was found to perform as well as all tools combined. Comparison of our approach with other integrative approaches such as Condel, CoVEC, CAROL, CADD, MetaSVM and MetaLR using an independent validation dataset, revealed the superiority of our newly proposed integrative approach. An online implementation of this approach, IMHOTEP (‘Integrating Molecular Heuristics and Other Tools for Effect Prediction’), is provided at http://www.uni-kiel.de/medinfo/cgi-bin/predictor/.

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms
  • Computational Biology / methods
  • Computer Simulation
  • Genetic Variation*
  • Humans
  • Mutation
  • Polymorphism, Single Nucleotide
  • Software*