A synthetic-diploid benchmark for accurate variant-calling evaluation

Heng Li; Jonathan M Bloom; Yossi Farjoun; Mark Fleharty; Laura Gauthier; Benjamin Neale; Daniel MacArthur

doi:10.1038/s41592-018-0054-7

Nat Methods. 2018 Aug;15(8):595-597. doi: 10.1038/s41592-018-0054-7. Epub 2018 Jul 16.

Authors

Heng Li¹, Jonathan M Bloom², Yossi Farjoun², Mark Fleharty², Laura Gauthier², Benjamin Neale³ ⁴, Daniel MacArthur⁵ ⁶

Affiliations

¹ Broad Institute of Harvard and MIT, Cambridge, MA, USA.
² Broad Institute of Harvard and MIT, Cambridge, MA, USA.
³ Broad Institute of Harvard and MIT, Cambridge, MA, USA.
⁴ Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
⁵ Broad Institute of Harvard and MIT, Cambridge, MA, USA.
⁶ Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.

Abstract

Existing benchmark datasets for use in evaluating variant-calling accuracy are constructed from a consensus of known short-variant callers, and they are thus biased toward easy regions that are accessible by these algorithms. We derived a new benchmark dataset from the de novo PacBio assemblies of two fully homozygous human cell lines, which provides a relatively more accurate and less biased estimate of small-variant-calling error rates in a realistic context.

Publication types

Evaluation Study
Research Support, N.I.H., Extramural

MeSH terms

Algorithms
Benchmarking
Cell Line, Tumor
Databases, Genetic / standards
Databases, Genetic / statistics & numerical data*
Diploidy
Female
Genetic Variation*
Genome, Human
Homozygote
Humans
Hydatidiform Mole / genetics
Pregnancy
Synthetic Biology
Uterine Neoplasms / genetics
Whole Genome Sequencing / statistics & numerical data
