Signature Genes Selection and Functional Analysis of Astrocytoma Phenotypes: A Comparative Study

Cancers (Basel). 2024 Sep 25;16(19):3263. doi: 10.3390/cancers16193263.

Abstract

Novel cancer biomarkers discoveries are driven by the application of omics technologies. The vast quantity of highly dimensional data necessitates the implementation of feature selection. The mathematical basis of different selection methods varies considerably, which may influence subsequent inferences. In the study, feature selection and classification methods were employed to identify six signature gene sets of grade 2 and 3 astrocytoma samples from the Rembrandt repository. Subsequently, the impact of these variables on classification and further discovery of biological patterns was analysed. Principal component analysis (PCA), uniform manifold approximation and projection (UMAP), and hierarchical clustering revealed that the data set (10,096 genes) exhibited a high degree of noise, feature redundancy, and lack of distinct patterns. The application of feature selection methods resulted in a reduction in the number of genes to between 28 and 128. Notably, no single gene was selected by all of the methods tested. Selection led to an increase in classification accuracy and noise reduction. Significant differences in the Gene Ontology terms were discovered, with only 13 terms overlapping. One selection method did not result in any enriched terms. KEGG pathway analysis revealed only one pathway in common (cell cycle), while the two methods did not yield any enriched pathways. The results demonstrated a significant difference in outcomes when classification-type algorithms were utilised in comparison to mixed types (selection and classification). This may result in the inadvertent omission of biological phenomena, while simultaneously achieving enhanced classification outcomes.

Keywords: artificial intelligence; astrocytoma; brain cancer; classification; feature selection; machine learning; microarrays; patient stratification; precision medicine; signature genes.