Establishing a Variant Allele Frequency Cutoff for Manual Curation of Medical Exome Sequencing Data

J Mol Diagn. 2025 Jan;27(1):36-41. doi: 10.1016/j.jmoldx.2024.09.006. Epub 2024 Oct 18.

Abstract

Medical exome sequencing pipelines apply several preprocessing steps to prioritize credible causal variants before a pathologist or variant curation scientist manually interprets potential findings, which are then reported to patients. The variant allele frequency (VAF), defined as the fraction of sequencing reads supporting a variant call, can be used to screen for technical artifacts, but no specific filtering threshold has been established. A total of 13,122 manually curated variants, sequenced from 289 patients using the Agilent SureSelect Focused Exome enrichment kit at the University of Kentucky Clinical Genomics laboratory from October 2019 to May 2023, were evaluated. In total, 278 single-nucleotide polymorphisms (SNPs) were clinically reported and 3340 SNPs were classified as technical artifacts. All reported variants had a VAF between 0.33 and 0.63, and 82% (2725/3340) of sequencing artifacts had a VAF of <0.33. It is proposed that removing SNPs with a VAF below approximately 0.30 reduces manual curation time by approximately 20% while retaining all medically relevant variants in medical exome sequencing data sets.
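
The proposed filter can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `VariantCall` record, its field names, and the helper functions are hypothetical, and the 0.30 cutoff is simply the approximate value stated in the abstract. VAF is computed as defined above, as the fraction of reads supporting the variant call, and only SNPs are filtered.

```python
from dataclasses import dataclass

# Approximate cutoff taken from the abstract; treat it as a tunable assumption.
VAF_CUTOFF = 0.30


@dataclass
class VariantCall:
    """Hypothetical record for one variant call (not a real pipeline schema)."""
    variant_id: str
    alt_reads: int      # reads supporting the variant allele
    total_reads: int    # all reads covering the position
    is_snp: bool = True

    @property
    def vaf(self) -> float:
        """Variant allele frequency: supporting reads / total reads."""
        if self.total_reads <= 0:
            raise ValueError(f"{self.variant_id}: total_reads must be positive")
        return self.alt_reads / self.total_reads


def needs_manual_curation(call: VariantCall, cutoff: float = VAF_CUTOFF) -> bool:
    """Keep non-SNP calls untouched; keep SNPs only at or above the cutoff."""
    return (not call.is_snp) or call.vaf >= cutoff


def split_for_curation(calls, cutoff: float = VAF_CUTOFF):
    """Return (kept, dropped) lists, preserving the input order."""
    kept = [c for c in calls if needs_manual_curation(c, cutoff)]
    dropped = [c for c in calls if not needs_manual_curation(c, cutoff)]
    return kept, dropped


if __name__ == "__main__":
    example_calls = [
        VariantCall("snp_high", alt_reads=45, total_reads=100),
        VariantCall("snp_low", alt_reads=12, total_reads=100),
        VariantCall("indel_low", alt_reads=10, total_reads=100, is_snp=False),
    ]
    kept, dropped = split_for_curation(example_calls)
    print("curate:", [c.variant_id for c in kept])
    print("filtered out:", [c.variant_id for c in dropped])
```

As a rough check on the reported figures, 2725 of 3340 artifacts is about 82%, and 2725 of the 13,122 curated variants is about 21%, in line with the stated curation saving of approximately 20%. Those counts refer to artifacts with a VAF below 0.33, so the number removed at a 0.30 cutoff may be somewhat smaller.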

MeSH terms

  • Data Curation / methods
  • Exome Sequencing* / methods
  • Exome* / genetics
  • Gene Frequency*
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Polymorphism, Single Nucleotide*