Generating Proteomic Big Data for Precision Medicine

Proteomics. 2020 Nov;20(21-22):e1900358. doi: 10.1002/pmic.201900358. Epub 2020 Aug 26.

Abstract

Here, the authors reason that the complexity of medical problems and proteome science might be tackled effectively with deep learning (DL) technology. However, deployment of DL for proteomics data requires the acquisition of data sets from a large number of samples. Based on the success of DL in medical imaging classification, proteome data from thousands of samples are arguably the minimal input for DL. Contemporary proteomics is becoming high-throughput thanks to rapid progress in sample preparation and liquid chromatography-mass spectrometry methods. In particular, data-independent acquisition now enables the generation of hundreds to thousands of quantitative proteome maps from clinical specimens with only limited sample amounts in clinical cohorts. Upheavals in the design of large-scale clinical proteomics studies might be required to generate proteomic big data and deploy DL to tackle complex medical problems.

Keywords: clinical cohort; data-independent acquisition; deep learning; high-throughput proteomics; precision medicine; proteomic big data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Big Data
  • Mass Spectrometry
  • Precision Medicine*
  • Proteome
  • Proteomics*

Substances

  • Proteome