A user-driven machine learning approach for RNA-based sample discrimination and hierarchical classification

STAR Protoc. 2023 Oct 27;4(4):102661. doi: 10.1016/j.xpro.2023.102661. Online ahead of print.

Abstract

RNA-based sample discrimination and classification can be used to provide biological insights and/or distinguish between clinical groups. However, finding informative differences between sample groups can be challenging due to the multidimensional and noisy nature of sequencing data. Here, we apply a machine learning approach for hierarchical discrimination and classification of samples with high-dimensional miRNA expression data. Our protocol comprises data preprocessing, unsupervised learning, feature selection, and machine-learning-based hierarchical classification, alongside open-source MATLAB code.
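
To make the protocol stages concrete, the following is a minimal sketch in Python of how preprocessing, feature selection, and hierarchical classification might fit together. It is not the authors' MATLAB code. Everything in it is an assumption made for illustration: the synthetic count data, the class names (control, subtype_A, subtype_B), the log2 transform, the standardized-mean-difference feature score, and the nearest-centroid classifier at each node. The unsupervised learning stage is omitted for brevity.

```python
import math
import random
from statistics import mean, pstdev


def log_transform(counts):
    # Preprocessing (assumed): log2(count + 1) to compress the dynamic range.
    return [[math.log2(c + 1.0) for c in row] for row in counts]


def select_features(X, y, k):
    # Feature selection (assumed): rank features by the absolute difference of
    # class means divided by the summed class standard deviations; keep the top k.
    first, second = sorted(set(y))  # expects exactly two classes
    a = [row for row, label in zip(X, y) if label == first]
    b = [row for row, label in zip(X, y) if label == second]
    scores = []
    for j in range(len(X[0])):
        col_a = [row[j] for row in a]
        col_b = [row[j] for row in b]
        spread = pstdev(col_a) + pstdev(col_b) + 1e-9
        scores.append((abs(mean(col_a) - mean(col_b)) / spread, j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


class CentroidNode:
    """One node of the hierarchy: a binary nearest-centroid classifier."""

    def fit(self, X, y, k):
        self.features = select_features(X, y, k)
        self.centroids = {}
        for label in set(y):
            rows = [row for row, l in zip(X, y) if l == label]
            self.centroids[label] = [mean(r[j] for r in rows) for j in self.features]
        return self

    def predict(self, row):
        v = [row[j] for j in self.features]
        return min(self.centroids, key=lambda l: math.dist(v, self.centroids[l]))


def fit_hierarchy(X, labels, k=2):
    # Level 1: control vs. disease (both subtypes merged into one class).
    top_y = ["control" if l == "control" else "disease" for l in labels]
    top = CentroidNode().fit(X, top_y, k)
    # Level 2: subtype_A vs. subtype_B, trained on disease samples only.
    disease = [(row, l) for row, l in zip(X, labels) if l != "control"]
    sub = CentroidNode().fit([r for r, _ in disease], [l for _, l in disease], k)
    return top, sub


def predict_hierarchy(model, row):
    top, sub = model
    if top.predict(row) == "control":
        return "control"
    return sub.predict(row)


def toy_counts(rng, label):
    # Synthetic data, for illustration only: feature 0 separates control from
    # disease, feature 1 separates the two subtypes, features 2-5 are noise.
    base = [100.0] * 6
    base[0] = 50.0 if label == "control" else 800.0
    base[1] = {"control": 200.0, "subtype_A": 50.0, "subtype_B": 800.0}[label]
    return [b * rng.uniform(0.8, 1.25) for b in base]


rng = random.Random(0)
labels = ["control"] * 8 + ["subtype_A"] * 8 + ["subtype_B"] * 8
X = log_transform([toy_counts(rng, l) for l in labels])
model = fit_hierarchy(X, labels)
predictions = [predict_hierarchy(model, row) for row in X]
```

Selecting features separately at each node lets each level of the hierarchy rely on the markers most informative for its own split, which is the main motivation for a hierarchical design over a single flat multi-class model. The predictions here are made on the training samples themselves, so this sketch says nothing about generalization; a real analysis would evaluate on held-out samples.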

Keywords: Bioinformatics; Gene Expression; RNAseq; Sequence Analysis; Sequencing.