STAVER: a standardized benchmark dataset-based algorithm for effective variation reduction in large-scale DIA-MS data

Peng Ran; Yunzhi Wang; Kai Li; Shiman He; Subei Tan; Jiacheng Lv; Jiajun Zhu; Shaoshuai Tang; Jinwen Feng; Zhaoyu Qin; Yan Li; Lin Huang; Yanan Yin; Lingli Zhu; Wenjun Yang; Chen Ding

doi:10.1093/bib/bbae553

Brief Bioinform. 2024 Sep 23;25(6):bbae553. doi: 10.1093/bib/bbae553.

Authors

Peng Ran¹, Yunzhi Wang¹, Kai Li¹, Shiman He¹, Subei Tan¹, Jiacheng Lv¹, Jiajun Zhu¹, Shaoshuai Tang¹, Jinwen Feng¹, Zhaoyu Qin¹, Yan Li¹, Lin Huang¹, Yanan Yin¹, Lingli Zhu¹, Wenjun Yang², Chen Ding¹,³

Affiliations

¹ Center for Cell and Gene Therapy, Clinical Research Center for Cell-based Immunotherapy, Shanghai Pudong Hospital, State Key Laboratory of Genetic Engineering, School of Life Sciences, Human Phenome Institute, Fudan University, E301, School of Life Sciences, No. 2005, Songhu Road, Yangpu District, Shanghai 200438, P.R. China.
² Department of Pediatric Orthopedics, Xinhua Hospital affiliated to Shanghai Jiao Tong University School of Medicine, No. 1665, Kongjiang Road, Yangpu District, Shanghai 200092, China.
³ Departments of Cancer Research Institute, Affiliated Cancer Hospital of Xinjiang Medical University Xinjiang Key Laboratory of Translational Biomedical Engineering, Urumqi 830000, P. R. China.

PMID: 39504480
DOI: 10.1093/bib/bbae553

Abstract

Mass spectrometry (MS)-based proteomics has become instrumental in comprehensively investigating complex biological systems. Data-independent acquisition (DIA)-MS, utilizing hybrid spectral library search strategies, allows for the simultaneous quantification of thousands of proteins, showing promise in enhancing protein identification and quantification precision. However, low-quality profiles can considerably undermine quantitative precision, resulting in inaccurate protein quantification. To tackle this challenge, we introduced STAVER, a novel algorithm that leverages standardized benchmark datasets to reduce non-biological variation in large-scale DIA-MS analyses. By eliminating unwanted noise in MS signals, STAVER significantly improved protein quantification precision, especially in hybrid spectral library searches. Moreover, we validated STAVER's robustness and applicability across multiple large-scale DIA datasets, demonstrating significantly enhanced precision and reproducibility of protein quantification. STAVER offers an innovative and effective approach for enhancing the quality of large-scale DIA proteomic data, facilitating cross-platform and cross-laboratory comparative analyses. This advancement significantly enhances the consistency and reliability of findings in clinical research. The complete package is available at https://github.com/Ran485/STAVER.
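As a rough illustration of what "quantification precision" means in this setting, the sketch below computes the coefficient of variation (CV) of peptide intensities across replicate runs of a benchmark sample and keeps only the low-variation peptides. It is not the STAVER algorithm or its API: the function names, the peptide labels, the intensity values, and the 0.3 CV cutoff are all hypothetical and chosen only for the example.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample CV (standard deviation / mean) of replicate intensities."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean intensity is zero; CV is undefined")
    return stdev(values) / m


def filter_by_cv(replicates, max_cv=0.3):
    """Keep peptides whose CV across replicate runs is at or below max_cv.

    `replicates` maps a peptide label to its intensities in each run.
    The default cutoff of 0.3 is an assumption for illustration only.
    """
    kept = {}
    for peptide, intensities in replicates.items():
        if len(intensities) < 2:
            continue  # a CV needs at least two measurements
        if coefficient_of_variation(intensities) <= max_cv:
            kept[peptide] = intensities
    return kept


# Hypothetical intensities from three replicate runs of one benchmark sample.
benchmark_runs = {
    "PEPTIDE_A": [100.0, 102.0, 98.0],   # tight replicates, low CV
    "PEPTIDE_B": [100.0, 10.0, 250.0],   # noisy replicates, high CV
}

stable = filter_by_cv(benchmark_runs)
print(sorted(stable))
```

A lower CV across replicates of the same sample indicates less non-biological variation, which is the kind of improvement the abstract reports; how STAVER itself identifies and removes noisy signals is described in the full paper and the linked repository.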

Keywords: STAVER algorithm; bioinformatics; data-independent acquisition; non-biological noise; proteomics analysis; quantitative proteomics.

MeSH terms

Algorithms*
Benchmarking* / methods
Databases, Protein
Humans
Mass Spectrometry / methods
Mass Spectrometry / standards
Proteomics* / methods
Proteomics* / standards
Reproducibility of Results
Software
