Deep structured learning for variant prioritization in Mendelian diseases

Nat Commun. 2023 Jul 13;14(1):4167. doi: 10.1038/s41467-023-39306-7.

Abstract

Effective computer-aided or automated variant evaluations for monogenic diseases will expedite clinical diagnostic and research efforts of known and novel disease-causing genes. Here we introduce MAVERICK: a Mendelian Approach to Variant Effect pRedICtion built in Keras. MAVERICK is an ensemble of transformer-based neural networks that can classify a wide range of protein-altering single nucleotide variants (SNVs) and indels and assesses whether a variant would be pathogenic in the context of dominant or recessive inheritance. We demonstrate that MAVERICK outperforms all other major programs that assess pathogenicity in a Mendelian context. In a cohort of 644 previously solved patients with Mendelian diseases, MAVERICK ranks the causative pathogenic variant within the top five variants in over 95% of cases. Seventy-six percent of cases were solved by the top-ranked variant. MAVERICK ranks the causative pathogenic variant in hitherto novel disease genes within the first five candidate variants in 70% of cases. MAVERICK has already facilitated the identification of a novel disease gene causing a degenerative motor neuron disease. These results represent a significant step towards automated identification of causal variants in patients with Mendelian diseases.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Humans
  • INDEL Mutation*
  • Proteins*

Substances

  • Proteins