A novel model for malaria prediction based on ensemble algorithms

PLoS One. 2019 Dec 26;14(12):e0226910. doi: 10.1371/journal.pone.0226910. eCollection 2019.

Abstract

Background and objective: Most previous studies adopted a single traditional time series model to predict malaria incidence. A single model cannot effectively capture all the properties of the data structure, whereas a stacking architecture can address this limitation by combining distinct algorithms and models. This study compares the performance of traditional time series models and deep learning algorithms in malaria case prediction and explores the application value of stacking methods in the field of infectious disease prediction.

Methods: The ARIMA, STL+ARIMA, BP-ANN and LSTM network models were separately applied in simulations using malaria data and meteorological data from Yunnan Province from 2011 to 2017. We compared the predictive performance of each model using three evaluation measures: RMSE, MASE and MAD. In addition, gradient-boosting regression trees (GBRTs) were used to combine the above four models, and we assessed whether the stacking structure improved prediction performance.
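The stacking step described in Methods can be illustrated with a minimal sketch. It is not the authors' implementation, which the abstract does not show. It assumes the meta-learner receives one forecast per sub-model as its input features and uses a deliberately tiny gradient-boosting routine built from one-split regression trees (stumps) with squared-error loss. All function names, hyperparameters and data values below are hypothetical.

```python
# Minimal stacking sketch: gradient-boosted regression stumps as the meta-learner.
# Hypothetical data and names; not the configuration used in the study.

def fit_stump(X, residuals):
    """Return the (feature, threshold, left mean, right mean) split with lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, lm, rm)
    return None if best is None else best[1:]

def stump_predict(stump, row):
    j, t, left_value, right_value = stump
    return left_value if row[j] <= t else right_value

def fit_gbrt(X, y, n_rounds=50, learning_rate=0.1):
    """Fit boosted stumps: each round fits a stump to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        if stump is None:  # no feature can be split any further
            break
        stumps.append(stump)
        pred = [p + learning_rate * stump_predict(stump, row) for p, row in zip(pred, X)]
    return base, learning_rate, stumps

def gbrt_predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)

# Hypothetical monthly forecasts; columns: ARIMA, STL+ARIMA, BP-ANN, LSTM.
sub_forecasts = [
    [22.0, 20.0, 18.0, 17.0],
    [35.0, 33.0, 30.0, 29.0],
    [50.0, 52.0, 46.0, 44.0],
    [41.0, 43.0, 38.0, 37.0],
    [28.0, 27.0, 24.0, 25.0],
    [15.0, 16.0, 13.0, 12.0],
    [12.0, 10.0, 9.0, 10.0],
    [19.0, 21.0, 17.0, 16.0],
]
observed = [16.0, 28.0, 45.0, 36.0, 23.0, 12.0, 9.0, 15.0]  # hypothetical case counts

meta_model = fit_gbrt(sub_forecasts, observed)
stacked = [gbrt_predict(meta_model, row) for row in sub_forecasts]
```

For brevity the sketch fits and predicts on the same rows. In practice a meta-learner is usually trained on out-of-sample sub-model forecasts so that it does not simply reward the sub-model that overfits most.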

Results: The root mean square errors (RMSEs) of the four sub-models were 13.176, 14.543, 9.571 and 7.208; the mean absolute scaled errors (MASEs) were 0.469, 0.472, 0.296 and 0.266; and the mean absolute deviations (MADs) were 6.403, 7.658, 5.871 and 5.691. After the stacking architecture was used to combine the four models, the RMSE, MASE and MAD values of the ensemble model decreased to 6.810, 0.224 and 4.625, respectively.
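To make the three evaluation measures concrete, the sketch below gives one common definition of each and then computes how far the ensemble improves on the best sub-model using the figures reported above. It rests on several assumptions: MAD is taken as the mean absolute forecast error, MASE is scaled by the in-sample error of a one-step naive forecast, and the fourth value in each list is assumed to belong to the LSTM (the order in which the models are named in Methods). The abstract does not state the exact formulas used.

```python
import math

def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mad(actual, forecast):
    """Mean absolute deviation, assumed here to mean mean absolute forecast error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mase(actual, forecast, training):
    """Mean absolute scaled error: forecast MAE divided by the in-sample
    MAE of a one-step naive forecast on the training series (assumed scaling)."""
    naive_mae = sum(abs(b - a) for a, b in zip(training, training[1:])) / (len(training) - 1)
    return mad(actual, forecast) / naive_mae

def relative_reduction(sub_model_error, ensemble_error):
    """Fraction by which the ensemble error is lower than the sub-model error."""
    return (sub_model_error - ensemble_error) / sub_model_error

# Reported values: (best sub-model, assumed to be the LSTM; stacked ensemble).
reported = {"RMSE": (7.208, 6.810), "MASE": (0.266, 0.224), "MAD": (5.691, 4.625)}
for name, (best_sub, ensemble) in reported.items():
    print(f"{name}: {100 * relative_reduction(best_sub, ensemble):.1f}% lower than the best sub-model")
```

On the reported figures this works out to reductions of roughly 5.5% in RMSE, 15.8% in MASE and 18.7% in MAD relative to the best-performing sub-model.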

Conclusions: A novel ensemble model based on the robustness of structured prediction and model combination through stacking was developed. The findings suggest that the predictive performance of the final model is superior to that of the other four sub-models, indicating that stacking architecture may have significant implications in infectious disease prediction.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • China / epidemiology
  • Communicable Diseases / epidemiology*
  • Deep Learning*
  • Humans
  • Incidence
  • Malaria / epidemiology*

Grants and funding

This work was supported by the Ministry of Education of the Humanities and Social Science project [grant no. 17YJAZH048] and the National Natural Science Foundation of China [grant no. 81803333]. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.