Bayesian unsupervised clustering identifies clinically relevant osteosarcoma subtypes

Sergio Llaneza-Lago; William D Fraser; Darrell Green

doi:10.1093/bib/bbae665

Brief Bioinform. 2024 Nov 22;26(1):bbae665. doi: 10.1093/bib/bbae665.

Authors

Sergio Llaneza-Lago¹, William D Fraser², Darrell Green¹

Affiliations

¹ Biomedical Research Centre, Norwich Medical School, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, United Kingdom.
² Bioanalytical Facility, Norwich Medical School, University of East Anglia, Norwich Research Park, Norwich NR4 7UQ, United Kingdom.

Abstract

Identification of cancer subtypes is a critical step for developing precision medicine. Most cancer subtyping is based on the analysis of RNA sequencing (RNA-seq) data from patient cohorts using unsupervised machine learning methods such as hierarchical cluster analysis, but these computational approaches disregard the heterogeneous composition of individual cancer samples. Here, we used a more sophisticated unsupervised Bayesian model termed latent process decomposition (LPD), which handles individual cancer sample heterogeneity and deconvolutes the structure of transcriptome data to provide clinically relevant information. The work was performed on the pediatric tumor osteosarcoma, which is a prototypical model for a rare and heterogeneous cancer. The LPD model detected three osteosarcoma subtypes. The subtype with the poorest prognosis was validated using independent patient datasets. This new stratification framework will be important for more accurate diagnostic labeling, expediting precision medicine, and improving clinical trial success. Our results emphasize the importance of using more sophisticated machine learning approaches (and for teaching deep learning and artificial intelligence) for RNA-seq data analysis, which may assist drug targeting and clinical management.

Keywords: RNA-seq; heterogeneity; latent process decomposition; osteosarcoma; precision medicine.

MeSH terms

Bayes Theorem*
Bone Neoplasms / classification
Bone Neoplasms / genetics
Bone Neoplasms / pathology
Cluster Analysis
Gene Expression Profiling / methods
Humans
Osteosarcoma* / classification
Osteosarcoma* / genetics
Osteosarcoma* / pathology
Prognosis
Transcriptome
Unsupervised Machine Learning*

Grants and funding

21-343 / Children with Cancer UK