Avant-garde: an automated data-driven DIA data curation tool

Nat Methods. 2020 Dec;17(12):1237-1244. doi: 10.1038/s41592-020-00986-4. Epub 2020 Nov 16.

Abstract

Several challenges remain in data-independent acquisition (DIA) data analysis, such as confidently identifying peptides, defining integration boundaries, removing interferences, and controlling false discovery rates. In practice, visual inspection of the signals is still required, which is impractical for large datasets. We present Avant-garde as a tool to refine DIA (and parallel reaction monitoring) data. Avant-garde uses a novel data-driven scoring strategy: signals are refined by learning from the dataset itself, using all measurements in all samples to achieve the best optimization. We evaluate the performance of Avant-garde using benchmark DIA datasets and show that it can determine the quantitative suitability of a peptide peak and reach the same levels of selectivity, accuracy, and reproducibility as manual validation. Avant-garde is complementary to existing DIA analysis engines and aims to establish a strong foundation for subsequent analysis of quantitative mass spectrometry data.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Cell Line
  • Data Analysis*
  • Data Curation / methods*
  • Data Science / methods*
  • HEK293 Cells
  • Humans
  • Mass Spectrometry / methods
  • Peptides / analysis
  • Proteome / analysis*
  • Proteomics / methods*
  • Reproducibility of Results
  • Software

Substances

  • Peptides
  • Proteome