Mendelian Inconsistent Signatures from 1314 Ancestrally Diverse Family Trios Distinguish Biological Variation from Sequencing Error

J Comput Biol. 2019 May;26(5):405-419. doi: 10.1089/cmb.2018.0253. Epub 2019 Apr 3.

Abstract

Next-generation sequencing enables advances in the clinical application of genomics by providing high-throughput detection of genomic variation. However, next-generation sequencing technologies, especially whole-genome sequencing (WGS), are often associated with a high false-positive rate. Trio-based WGS can contribute significantly towards improved quality control methods. Mendelian-inconsistent calls (MIC) in parent-child trios are commonly attributed to erroneous sequencing calls, as the true de novo mutation rate is extremely low compared with MIC incidence. Here, we analyzed WGS data from 1314 mother, father, and child trios across ethnically diverse populations with the goal of characterizing MIC. Genotype calls in a trio can be used to assign different signatures to MIC. MIC occur more frequently within repeats but show varying distribution and error mechanisms across repeat types. MIC are enriched within poly-A/T runs in short interspersed nuclear elements. Alignability scores, allele balance, and relative parental read depth vary among MIC signatures and these differences should be considered when designing filters for MIC reduction. MIC cluster in germline deletions and these MIC also segregate with population. Our results provide a basis for making decisions on how each MIC type should be evaluated before discarding them as errors or including them in alternative applications. With the reduction of sequencing cost, family trio whole genome and exome analysis are being performed more routinely in clinical practice. We provide a reference that can be used for annotating MIC with their frequencies in a larger population to aid in the filtering of candidate de novo mutations.
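The abstract notes that the genotype calls in a trio can be used to assign a signature to each MIC. The Python sketch below illustrates that idea for diploid autosomal sites only. It is not the authors' pipeline: the function names and the `father x mother > child` signature string are hypothetical choices made here for illustration, and the code assumes all three genotypes are fully called VCF-style strings such as `0/1`.

```python
from itertools import product


def parse_gt(gt):
    """Parse a VCF-style diploid genotype ('0/1' or '1|0') into a sorted tuple of ints.

    Assumption: the genotype is fully called; a missing call such as './.' raises ValueError.
    """
    alleles = gt.replace("|", "/").split("/")
    return tuple(sorted(int(a) for a in alleles))


def is_mendelian_consistent(father, mother, child):
    """Return True if the child genotype can be formed from one paternal and one maternal allele."""
    f, m, c = parse_gt(father), parse_gt(mother), parse_gt(child)
    # Every unordered pair made of one allele from each parent is a possible child genotype.
    possible = {tuple(sorted(pair)) for pair in product(f, m)}
    return c in possible


def mic_signature(father, mother, child):
    """Return a hypothetical signature label for a Mendelian-inconsistent call, or None if consistent."""
    if is_mendelian_consistent(father, mother, child):
        return None

    def norm(gt):
        return "/".join(str(a) for a in parse_gt(gt))

    return f"{norm(father)}x{norm(mother)}>{norm(child)}"


if __name__ == "__main__":
    # Both parents homozygous reference, child heterozygous: a candidate de novo pattern.
    print(mic_signature("0/0", "0/0", "0/1"))
    # Heterozygous parents with a homozygous-alternate child are consistent, so no signature.
    print(mic_signature("0/1", "1|0", "1/1"))
```

Grouping calls by such a label is one way to compare properties like allele balance or parental read depth across signatures, as the abstract describes. Sex chromosomes and missing genotypes would need separate handling.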

Keywords: Mendelian-inconsistent calls (MIC); de novo mutations; inherited deletions; long interspersed nuclear elements (LINE); population-specific deletions; quality control; repeats; short interspersed nuclear elements (SINE); trio sequencing; whole-genome sequencing.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Exome / genetics
  • Female
  • Genome, Human / genetics
  • Genomics / methods
  • Genotype
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Male
  • Mutation / genetics*
  • Whole Genome Sequencing / methods