STAVER: a standardized benchmark dataset-based algorithm for effective variation reduction in large-scale DIA-MS data

Brief Bioinform. 2024 Sep 23;25(6):bbae553. doi: 10.1093/bib/bbae553.

Abstract

Mass spectrometry (MS)-based proteomics has become instrumental in the comprehensive investigation of complex biological systems. Data-independent acquisition (DIA)-MS, combined with hybrid spectral library search strategies, allows thousands of proteins to be quantified simultaneously and shows promise for improving both protein identification and quantification precision. However, low-quality profiles can considerably undermine that precision and lead to inaccurate protein quantification. To tackle this challenge, we introduced STAVER, a novel algorithm that leverages standardized benchmark datasets to reduce non-biological variation in large-scale DIA-MS analyses. By eliminating unwanted noise in MS signals, STAVER significantly improved protein quantification precision, especially in hybrid spectral library searches. We further validated STAVER's robustness and applicability across multiple large-scale DIA datasets, where it significantly enhanced the precision and reproducibility of protein quantification. STAVER thus offers an effective approach for improving the quality of large-scale DIA proteomic data and facilitates cross-platform and cross-laboratory comparative analyses, which in turn improves the consistency and reliability of findings in clinical research. The complete package is available at https://github.com/Ran485/STAVER.
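
To make the notion of "quantification precision" concrete, the sketch below computes a coefficient of variation (CV) across replicate runs of a reference sample and discards signals whose CV is too high. This is only an illustration of a common precision metric, not the STAVER algorithm itself, whose details are not given in the abstract. The function names, the peptide labels, the intensities, and the 0.3 threshold are all hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return sample standard deviation divided by the mean."""
    if len(values) < 2:
        raise ValueError("need at least two replicate measurements")
    m = mean(values)
    if m == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return stdev(values) / m


def filter_by_cv(replicates, max_cv=0.3):
    """Keep entries whose CV across replicates is at most max_cv.

    `replicates` maps a peptide label to its list of intensities.
    The 0.3 default is an assumed cutoff chosen for illustration.
    """
    return {
        peptide: values
        for peptide, values in replicates.items()
        if coefficient_of_variation(values) <= max_cv
    }


# Hypothetical intensities from three replicate runs of one reference sample.
reference_runs = {
    "PEPTIDE_A": [100.0, 102.0, 98.0],   # stable signal, low CV
    "PEPTIDE_B": [100.0, 10.0, 250.0],   # erratic signal, high CV
}

kept = filter_by_cv(reference_runs)
print(sorted(kept))
```

In this toy input only the stable signal survives the filter. Because the reference sample is the same in every run, any spread in its measured intensities cannot be biological, which is why replicate measurements of a standardized sample are a natural place to estimate non-biological variation.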

Keywords: STAVER algorithm; bioinformatics; data-independent acquisition; non-biological noise; proteomics analysis; quantitative proteomics.

MeSH terms

  • Algorithms*
  • Benchmarking* / methods
  • Databases, Protein
  • Humans
  • Mass Spectrometry / methods
  • Mass Spectrometry / standards
  • Proteomics* / methods
  • Proteomics* / standards
  • Reproducibility of Results
  • Software