Machine Learning Approach to Raman Spectrum Analysis of MIA PaCa-2 Pancreatic Cancer Tumor Repopulating Cells for Classification and Feature Analysis

Life (Basel). 2020 Sep 5;10(9):181. doi: 10.3390/life10090181.

Abstract

A machine learning approach is applied to Raman spectra of cells from the MIA PaCa-2 human pancreatic cancer cell line to distinguish between tumor repopulating cells (TRCs) and parental control cells, and to aid in the identification of molecular signatures. Fifty-one Raman spectra from the two cell types are analyzed to determine the combination of data type, dimensionality, and classification technique that best differentiates them. An accuracy of 0.98 is obtained with support vector machine (SVM) and k-nearest neighbor (kNN) classifiers combined with various dimension reduction and feature selection tools. We also identify candidate biomolecules that may be responsible for the spectral peaks that led to the best results.
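
To make the classification step concrete, the following is a minimal sketch of a kNN classifier scored by leave-one-out cross-validation. It is not the authors' pipeline. Everything specific in it is an assumption made for illustration: the spectra are synthetic, the 26/25 class split, the 50 intensity bins, the extra peak at bins 20 to 24, the choice of k = 3, and the leave-one-out scheme are all hypothetical, and no dimension reduction or feature selection is applied. The function names are invented for this example.

```python
import math
import random


def make_synthetic_spectra(n_trc=26, n_control=25, n_bins=50, seed=0):
    """Return (spectra, labels). Synthetic stand-ins, NOT the paper's data."""
    rng = random.Random(seed)
    spectra, labels = [], []
    for label, count in ((1, n_trc), (0, n_control)):
        for _ in range(count):
            # Gaussian noise in every intensity bin.
            spectrum = [rng.gauss(0.0, 0.1) for _ in range(n_bins)]
            if label == 1:
                # Hypothetical extra peak for the "TRC" class (assumption).
                for b in range(20, 25):
                    spectrum[b] += 1.0
            spectra.append(spectrum)
            labels.append(label)
    return spectra, labels


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training spectra closest to the query."""
    # Euclidean distance from the query to every training spectrum.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = [y for _, y in dists[:k]]
    return max(set(votes), key=votes.count)


def leave_one_out_accuracy(spectra, labels, k=3):
    """Hold out each spectrum once, train on the rest, report mean accuracy."""
    correct = 0
    for i in range(len(spectra)):
        train_x = spectra[:i] + spectra[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, spectra[i], k) == labels[i]:
            correct += 1
    return correct / len(spectra)


spectra, labels = make_synthetic_spectra()
accuracy = leave_one_out_accuracy(spectra, labels)
print(f"{len(spectra)} spectra, leave-one-out kNN accuracy: {accuracy:.2f}")
```

Leave-one-out is used here only because it is a common choice for data sets of a few dozen samples. The accuracy printed for this toy data reflects how well separated the synthetic classes are and says nothing about the 0.98 reported above.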

Keywords: Raman spectroscopy; k-nearest neighbor (kNN); pancreatic cancer; principal component analysis (PCA); support vector machine (SVM); tumor repopulating cell (TRC).