Machine Learning-Supported Diagnosis of Small Blue Round Cell Sarcomas Using Targeted RNA Sequencing

J Mol Diagn. 2024 May;26(5):387-398. doi: 10.1016/j.jmoldx.2024.02.002. Epub 2024 Feb 21.

Abstract

Small blue round cell sarcomas (SBRCSs) are a heterogeneous group of tumors with overlapping morphologic features but markedly varying prognosis. They are characterized by distinct chromosomal alterations, particularly rearrangements leading to gene fusions, whose detection currently represents the most reliable diagnostic marker. Ewing sarcomas are the most common SBRCSs, defined by gene fusions involving EWSR1 and transcription factors of the ETS family, and the most frequent non-EWSR1-rearranged SBRCSs harbor a CIC rearrangement. Unfortunately, identifying CIC::DUX4 translocation events, the most common CIC rearrangement, remains challenging. Here, we present a machine-learning approach to support SBRCS diagnosis that relies on gene expression profiles measured via targeted sequencing. Analyses of a curated cohort of 69 soft-tissue tumors showed markedly distinct expression patterns for SBRCS subgroups. A random forest classifier trained on Ewing sarcoma and CIC-rearranged cases predicted probabilities of being CIC-rearranged >0.9 for CIC-rearranged-like sarcomas and <0.6 for other SBRCSs. Testing on a retrospective cohort of 1335 routine diagnostic cases identified 15 candidate CIC-rearranged tumors with a probability >0.75, all of which were supported by expert histopathologic reassessment. Furthermore, the multigene random forest classifier appeared advantageous over using high ETV4 expression alone, previously proposed as a surrogate to identify CIC rearrangement. Taken together, the expression-based classifier can offer valuable support for SBRCS pathologic diagnosis.
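
The candidate-flagging step described in the abstract can be illustrated with a minimal sketch. It assumes, as one common convention and not something the abstract states, that a forest's probability for a case is the fraction of trees voting "CIC-rearranged". The function names, case identifiers, and vote vectors are hypothetical; only the 0.75 threshold comes from the text.

```python
# Threshold reported in the abstract for flagging candidates in the
# retrospective cohort.
CANDIDATE_THRESHOLD = 0.75


def forest_probability(tree_votes):
    """Fraction of trees voting 'CIC-rearranged' (1) rather than 'Ewing' (0).

    Assumption: probability = share of positive tree votes. The study's
    actual implementation may compute it differently.
    """
    if not tree_votes:
        raise ValueError("at least one tree vote is required")
    return sum(tree_votes) / len(tree_votes)


def flag_candidates(case_votes, threshold=CANDIDATE_THRESHOLD):
    """Return (case_id, probability) pairs strictly above the threshold."""
    flagged = []
    for case_id, votes in case_votes.items():
        probability = forest_probability(votes)
        if probability > threshold:
            flagged.append((case_id, probability))
    # Highest-probability cases first, so they are reviewed first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical per-tree votes for three made-up cases (10 trees each).
example_votes = {
    "case_A": [1, 1, 1, 1, 1, 1, 1, 1, 1, 0],
    "case_B": [1, 1, 1, 0, 0, 1, 0, 1, 0, 0],
    "case_C": [1, 1, 1, 1, 1, 1, 1, 1, 0, 0],
}
candidates = flag_candidates(example_votes)
```

In this toy input, case_A and case_C exceed the threshold and case_B does not. In the study, flagged cases were then passed to expert histopathologic reassessment, so a flag marks a case for review and is not a diagnosis in itself.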

MeSH terms

  • Biomarkers, Tumor / analysis
  • Biomarkers, Tumor / genetics
  • Humans
  • Oncogene Proteins, Fusion / genetics
  • Retrospective Studies
  • Sarcoma* / genetics
  • Sarcoma, Small Cell* / diagnosis
  • Sarcoma, Small Cell* / genetics
  • Sarcoma, Small Cell* / pathology
  • Sequence Analysis, RNA
  • Soft Tissue Neoplasms* / genetics
  • Transcription Factors / genetics

Substances

  • Transcription Factors
  • Oncogene Proteins, Fusion
  • Biomarkers, Tumor