Machine learning-based analysis of Ebola virus' impact on gene expression in nonhuman primates

Front Artif Intell. 2024 Aug 30:7:1405332. doi: 10.3389/frai.2024.1405332. eCollection 2024.

Abstract

Introduction: This study introduces the Supervised Magnitude-Altitude Scoring (SMAS) methodology, a novel machine learning-based approach for analyzing gene expression data from non-human primates (NHPs) infected with Ebola virus (EBOV). By focusing on host-pathogen interactions, this research aims to enhance the understanding and identification of critical biomarkers for Ebola infection.

Methods: We used a comprehensive dataset of NanoString gene expression profiles from Ebola-infected NHPs. SMAS selects genes by combining two criteria: statistical significance and the magnitude of expression change. Using linear classifiers such as logistic regression, the method then distinguishes RT-qPCR-positive from RT-qPCR-negative NHP samples.
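
The two steps described in the Methods (scoring genes by significance and expression change, then fitting a linear classifier on the selected genes) can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the abstract does not give the scoring formula, so the product form |log2 fold change|^M × |log10 p|^A, the exponents `m_exp` and `a_exp`, the normal-approximation p-value, the plain gradient-descent logistic regression, the gene names, and the toy expression values are all assumptions made for the example.

```python
import math
from statistics import NormalDist, mean, variance


def magnitude_altitude_scores(pos, neg, genes, m_exp=1.0, a_exp=1.0):
    """Assumed score per gene: |log2 fold change|**m_exp * |log10 p|**a_exp.

    pos/neg are lists of samples; each sample is a list of log2 expression
    values in the same order as `genes`.
    """
    scores = {}
    for j, gene in enumerate(genes):
        x = [sample[j] for sample in pos]
        y = [sample[j] for sample in neg]
        log2_fc = mean(x) - mean(y)  # difference of log2 means
        se = math.sqrt(variance(x) / len(x) + variance(y) / len(y))
        z = log2_fc / max(se, 1e-12)
        # Two-sided p-value from a normal approximation (an assumption here);
        # clamped so log10 stays finite, which caps the score for huge |z|.
        p = max(2.0 * (1.0 - NormalDist().cdf(abs(z))), 1e-300)
        scores[gene] = abs(log2_fc) ** m_exp * abs(math.log10(p)) ** a_exp
    return scores


def select_top(scores, k):
    """Return the k gene names with the highest scores."""
    return sorted(scores, key=scores.get, reverse=True)[:k]


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def predict_proba(w, b, xi):
    """Probability of the positive class for one feature vector."""
    return sigmoid(sum(wk * xk for wk, xk in zip(w, xi)) + b)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for k, xk in enumerate(xi):
                grad_w[k] += err * xk
            grad_b += err
        w = [wk - lr * g / n for wk, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


# Toy log2 expression values for three hypothetical genes (not real data).
genes = ["GENE_A", "GENE_B", "GENE_C"]
pos = [[9.1, 5.0, 7.2], [9.4, 5.2, 7.0], [8.9, 4.9, 7.3], [9.3, 5.1, 7.1]]
neg = [[5.0, 5.1, 7.1], [5.2, 4.9, 7.2], [4.9, 5.0, 7.0], [5.1, 5.2, 7.3]]

scores = magnitude_altitude_scores(pos, neg, genes)
top = select_top(scores, k=1)

# Build a centered feature matrix from the selected genes only.
cols = [genes.index(g) for g in top]
samples = pos + neg
labels = [1] * len(pos) + [0] * len(neg)
centers = [mean(s[c] for s in samples) for c in cols]
X = [[s[c] - mu for c, mu in zip(cols, centers)] for s in samples]

w, b = fit_logistic(X, labels)
preds = [1 if predict_proba(w, b, xi) >= 0.5 else 0 for xi in X]
train_accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
print(top, train_accuracy)
```

In this toy setup only the hypothetical `GENE_A` differs between groups, so it receives the highest score and a one-feature classifier separates the samples. A real analysis would evaluate on held-out samples rather than on the training data.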

Results: SMAS identified IFI6 and IFI27 as key biomarkers; these achieved 100% accuracy and optimal Area Under the Curve (AUC) values in classifying various stages of Ebola infection. Additionally, MX1, OAS1, and ISG15 were significantly upregulated, underscoring their vital roles in the immune response to EBOV.
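
For readers unfamiliar with the two metrics named above, the sketch below shows how accuracy and AUC are commonly computed from classifier scores. It is a generic illustration, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the example labels and scores are assumptions. AUC is computed here as the fraction of positive-negative pairs in which the positive sample receives the higher score, with ties counted as one half.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = RT-qPCR positive) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
separable = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]   # every positive outranks every negative
overlapping = [0.9, 0.4, 0.7, 0.6, 0.2, 0.1]  # one positive ranks below one negative

print(accuracy(labels, separable), roc_auc(labels, separable))
print(accuracy(labels, overlapping), roc_auc(labels, overlapping))
```

An AUC of 1 means every positive sample is ranked above every negative one, which is what perfectly separable scores produce; any overlap between the two groups lowers it.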

Discussion: Gene Ontology (GO) analysis further elucidated the involvement of these genes in critical biological processes and immune response pathways, reinforcing their significance in Ebola pathogenesis. Our findings highlight the efficacy of the SMAS methodology in revealing complex genetic interactions and response mechanisms, which are essential for advancing the development of diagnostic tools and therapeutic strategies.

Conclusion: This study provides valuable insights into EBOV pathogenesis, demonstrating the potential of SMAS to enhance the precision of diagnostics and interventions for Ebola and other viral infections.

Keywords: Ebola virus infection; biomarker discovery; gene expression profiling; machine learning in virology; transcriptomic analysis.

Grants and funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. Effort sponsored by the U.S. Government under HDTRA 12310003, “Host signaling mechanisms contributing to endothelial damage in hemorrhagic fever virus infection,” PI: Narayanan. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes, notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of the U.S. Government.