An intrusion detection model to detect zero-day attacks in unseen data using machine learning

PLoS One. 2024 Sep 11;19(9):e0308469. doi: 10.1371/journal.pone.0308469. eCollection 2024.

Abstract

In an era of pervasive digital connectivity, cybersecurity concerns have escalated. The rapid evolution of technology has produced a spectrum of cyber threats, including sophisticated zero-day attacks. This research addresses the difficulty that existing intrusion detection systems have in identifying zero-day attacks, using the CIC-MalMem-2022 dataset and autoencoders for anomaly detection. The trained autoencoder is integrated with XGBoost and Random Forest, yielding two models, XGBoost-AE and Random Forest-AE. The study demonstrates that incorporating an anomaly detector into traditional models significantly enhances performance. The Random Forest-AE model achieved 100% accuracy, precision, recall, F1 score, and Matthews Correlation Coefficient (MCC), outperforming the methods proposed by Balasubramanian et al., Khan, Mezina et al., Smith et al., and Dener et al. When tested on unseen data, the Random Forest-AE model achieved an accuracy of 99.9892%, precision of 100%, recall of 99.9803%, F1 score of 99.9901%, and MCC of 99.8313%. These results indicate that the proposed model maintains high accuracy even on previously unseen data.
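
The five metrics reported above can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper. Values are returned as fractions; the abstract reports them as percentages, including MCC, which otherwise ranges from -1 to 1.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion counts.

    tp/fp/tn/fn are the true-positive, false-positive, true-negative and
    false-negative counts. Undefined ratios (zero denominator) fall back
    to 0.0, which is a convention chosen for this sketch.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # Matthews Correlation Coefficient uses all four cells of the matrix.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


# Hypothetical counts (NOT the paper's data): no false positives,
# a few missed attacks.
example = binary_metrics(tp=90, fp=0, tn=100, fn=10)
for name, value in example.items():
    print(f"{name}: {value * 100:.4f}%")
```

With these illustrative counts, precision is exactly 100% while recall is lower. The unseen-data results above show the same pattern: a precision of 100% alongside a recall below 100% means no benign sample was flagged as malicious, while a small number of malicious samples were missed.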

MeSH terms

  • Algorithms
  • Computer Security*
  • Humans
  • Machine Learning*
  • Models, Theoretical

Grants and funding

This work was supported by the National Science and Technology Council in Taiwan under grant numbers NSTC-112-2221-E-027-088-MY2 and NSTC-111-2622-8-027-009; by the Ministry of Education of Taiwan under Official Document No. 1122302319, entitled "The study of artificial intelligence and advanced semiconductor manufacturing for female STEM talent education and industry-university value-added cooperation promotion"; and by the UTAR Financial Support for Journal Paper Publication Scheme through Universiti Tunku Abdul Rahman (UTAR), Malaysia. The funders had no role in study design, data collection and analysis, the decision to publish, or the preparation of the manuscript.