Multiple Machine Learning Identifies Key Gene PHLDA1 Suppressing NAFLD Progression

Inflammation. 2024 Nov 4. doi: 10.1007/s10753-024-02164-6. Online ahead of print.

Abstract

Non-alcoholic fatty liver disease (NAFLD) poses a serious global health threat, and the mechanisms driving its progression are not yet fully understood. While several molecular markers for NAFLD have been developed in recent years, a lack of robust evidence hampers their clinical application. Therefore, identifying novel and potent biomarkers would directly aid in the prediction, prevention, and personalized treatment of NAFLD. We downloaded NAFLD-related datasets from the Gene Expression Omnibus (GEO). Differential expression analysis and functional analysis were initially conducted. Subsequently, Weighted Gene Co-expression Network Analysis (WGCNA) and multiple machine learning strategies were employed to screen and identify key genes, and the diagnostic value was assessed using Receiver Operating Characteristic (ROC) analysis. We then explored the relationship between genes and immune cells using transcriptome data and single-cell RNA sequencing (scRNA-seq) data. Finally, we validated our findings in cell and mouse NAFLD models. We obtained 23 overlapping differentially expressed genes (DEGs) across three NAFLD datasets. Enrichment analysis revealed that the DEGs were associated with apoptosis; parathyroid hormone synthesis, secretion and action; colorectal cancer; the p53 signaling pathway; and biosynthesis of unsaturated fatty acids. After employing machine learning strategies, we identified one gene, pleckstrin homology-like domain family A member 1 (PHLDA1), that was downregulated in NAFLD and showed high diagnostic accuracy. CIBERSORT analysis revealed significant associations of PHLDA1 with various immune cells. Single-cell data analysis demonstrated downregulation of PHLDA1 in NAFLD, with PHLDA1 exhibiting a significant negative correlation with macrophages.
Furthermore, we found PHLDA1 to be downregulated in an in vitro hepatic steatosis cell model, and overexpression of PHLDA1 significantly reduced lipid accumulation, as well as the expression of key molecules involved in hepatic lipogenesis and fatty acid uptake, such as FASN, SCD-1, and CD36. Additionally, gene set enrichment analysis (GSEA) suggested that PHLDA1 may influence NAFLD progression through pathways such as cytokine-cytokine receptor interaction, ECM-receptor interaction, Parkinson's disease, and ribosome. Our conclusions were further validated in a mouse model of NAFLD. Our study reveals that PHLDA1 inhibits the progression of NAFLD, as overexpression of PHLDA1 significantly reduces lipid accumulation in cells and markedly decreases the expression of key molecules involved in hepatic lipogenesis and fatty acid uptake. Therefore, PHLDA1 may emerge as a novel potential target for future prediction, diagnosis, and targeted prevention of NAFLD.
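
Two of the steps summarized above, intersecting DEG lists across datasets and scoring a single gene by ROC analysis, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names are hypothetical, the gene sets use placeholder names alongside PHLDA1, and the expression values are made up purely for illustration. Because PHLDA1 is reported as downregulated in NAFLD, the sketch assumes the negated expression value serves as the diagnostic score, so that lower expression points toward disease.

```python
from itertools import product


def overlapping_degs(*deg_sets):
    """Return the genes called differentially expressed in every dataset."""
    return set.intersection(*(set(s) for s in deg_sets))


def roc_auc(scores_pos, scores_neg):
    """Area under the ROC curve, computed as the probability that a
    positive (disease) sample outscores a negative (control) sample.
    Ties count as half a win."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(scores_pos, scores_neg)
    )
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical DEG lists from three datasets (placeholder gene names).
degs_a = {"PHLDA1", "GENE_A", "GENE_B", "GENE_C"}
degs_b = {"PHLDA1", "GENE_B", "GENE_D"}
degs_c = {"PHLDA1", "GENE_B", "GENE_E"}
shared = overlapping_degs(degs_a, degs_b, degs_c)

# Made-up expression values for one gene; NOT data from the study.
nafld_expr = [2.1, 2.4, 1.9, 2.8]
control_expr = [3.5, 3.1, 2.6, 3.9]

# Negate expression so that lower expression yields a higher disease score.
auc = roc_auc([-x for x in nafld_expr], [-x for x in control_expr])

print(sorted(shared))
print(round(auc, 4))
```

The pairwise formulation of the AUC is equivalent to the Mann-Whitney U statistic divided by the number of case-control pairs. It is quadratic in sample size, which is acceptable for a sketch but would normally be replaced by a rank-based routine from a statistics library on real cohorts.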

Keywords: Biomarker; Machine Learning strategy; NAFLD; PHLDA1.