Background: Accurate prediction of the incidence of sexually transmitted diseases (STDs) is critical for early prevention and for government strategic planning. This paper presents and compares four forecasting models for predicting the incidence of AIDS, gonorrhea, and syphilis.
Methods: The annual percentage changes in the incidence of AIDS, gonorrhea, and syphilis were estimated using joinpoint regression. The performance of four models, namely the autoregressive integrated moving average (ARIMA) model, the Elman neural network (ERNN) model, the ARIMA-ERNN hybrid model, and the long short-term memory (LSTM) model, was assessed and compared. For 1-year prediction, data collected from 2011 to 2020 were used to fit the models and predict the incidence in 2021. For 5-year prediction, data collected from 2011 to 2016 were used to fit the models and predict the incidence from 2017 to 2021. Performance was evaluated using four indices: mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE).
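As a rough illustration of the evaluation step, the error indices can be computed from paired observed and predicted incidence values as sketched below. This is a minimal sketch, not the authors' code: the function name `forecast_errors` and the sample numbers are hypothetical, and RMSE is included simply as the square root of MSE.

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAE and MAPE (in percent) for two equal-length sequences.

    Hypothetical helper for illustration only. MAPE divides by the observed
    values, so an observed value of zero raises ZeroDivisionError.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("actual and predicted must be non-empty and equally long")

    errors = [a - p for a, p in zip(actual, predicted)]
    n = len(errors)

    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n

    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MAPE": mape}


# Made-up observed and predicted values, not data from the study.
observed = [4.0, 5.0, 10.0]
forecast = [5.0, 4.0, 8.0]
print(forecast_errors(observed, forecast))
```

MSE and RMSE penalize large misses more heavily than MAE, while MAPE is scale-free, which is what allows it to be compared across diseases with different incidence levels.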
Results: The morbidities of AIDS and syphilis are on the rise, and the morbidity of gonorrhea has declined in recent years. The optimal ARIMA models were ARIMA(2,1,2)(0,1,1)₁₂, ARIMA(1,1,2)(0,1,2)₁₂, and ARIMA(3,1,2)(1,1,2)₁₂ for 1-year prediction of AIDS, gonorrhea, and syphilis, respectively, and ARIMA(2,1,2)(0,1,1)₁₂, ARIMA(1,1,2)(0,1,2)₁₂, and ARIMA(2,1,1)(0,1,0)₁₂ for 5-year prediction of AIDS, gonorrhea, and syphilis, respectively. For 1-year prediction, the MAPEs of ARIMA, ERNN, ARIMA-ERNN, and LSTM for AIDS are 23.26, 20.24, 18.34, and 18.63, respectively; for gonorrhea, the MAPEs are 19.44, 18.03, 17.77, and 5.09, respectively; for syphilis, the MAPEs are 9.80, 9.55, 8.67, and 5.79, respectively. For 5-year prediction, the MAPEs of ARIMA, ERNN, ARIMA-ERNN, and LSTM for AIDS are 12.86, 23.54, 14.74, and 25.43, respectively; for gonorrhea, the MAPEs are 17.07, 17.95, 16.46, and 15.13, respectively; for syphilis, the MAPEs are 21.88, 24.00, 20.18, and 11.20, respectively. In general, the performance ranking of the four models from high to low is LSTM, ARIMA-ERNN, ERNN, and ARIMA.
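The MAPE figures above are easier to compare when tabulated. The sketch below records the reported values and picks the lowest-MAPE model for each disease and horizon. It uses MAPE only, so it is a partial view and not a reconstruction of the overall ranking; the variable names are hypothetical.

```python
from collections import Counter

# MAPE values as reported in the Results, keyed by (disease, horizon).
mape = {
    ("AIDS", "1-year"): {"ARIMA": 23.26, "ERNN": 20.24, "ARIMA-ERNN": 18.34, "LSTM": 18.63},
    ("gonorrhea", "1-year"): {"ARIMA": 19.44, "ERNN": 18.03, "ARIMA-ERNN": 17.77, "LSTM": 5.09},
    ("syphilis", "1-year"): {"ARIMA": 9.80, "ERNN": 9.55, "ARIMA-ERNN": 8.67, "LSTM": 5.79},
    ("AIDS", "5-year"): {"ARIMA": 12.86, "ERNN": 23.54, "ARIMA-ERNN": 14.74, "LSTM": 25.43},
    ("gonorrhea", "5-year"): {"ARIMA": 17.07, "ERNN": 17.95, "ARIMA-ERNN": 16.46, "LSTM": 15.13},
    ("syphilis", "5-year"): {"ARIMA": 21.88, "ERNN": 24.00, "ARIMA-ERNN": 20.18, "LSTM": 11.20},
}

# Lowest MAPE wins within each (disease, horizon) setting.
best = {setting: min(scores, key=scores.get) for setting, scores in mape.items()}

# Count how often each model comes out best across the six settings.
wins = Counter(best.values())

for (disease, horizon), model in best.items():
    print(f"{disease:<10} {horizon:<7} best by MAPE: {model}")
print(dict(wins))
```

On MAPE alone, LSTM is best in four of the six settings; the ARIMA-ERNN hybrid is best for 1-year AIDS prediction and ARIMA for 5-year AIDS prediction.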
Conclusion: The time series predictive models show strong performance in forecasting STD incidence and can be applied by relevant authorities in the prevention and control of STDs.
Keywords: ARIMA; ARIMA-ERNN; ERNN; LSTM; sexually transmitted diseases; time series predictive models.
Copyright © 2022 Zhu, Zhu, Zhan, Gu, Chen and Li.