Pine wilt disease has caused significant damage to China's ecological and financial resources, and proactive control measures are needed to prevent its further spread across the country. Because traditional models predict its spread with low accuracy, we used an enhanced LightGBM model to predict the development trend of pine wilt disease in China. We considered anthropogenic factors, such as the volume of pine wood imports from 2017 to 2022, the density of graded roads, the number of adjacent counties, and the presence of wood processing factories, together with natural factors such as temperature, humidity, and wind speed. From these candidates, Pearson correlation and LightGBM feature-importance analysis were used to select the 17 most significant influencing factors. Spatial analysis of the epidemic sub-compartments (a divisional unit smaller than a township) of pine wilt disease for 2022 and 2023 revealed the distribution patterns of epidemic sub-compartments within 2 km of roads and the spatial relationships between new and old epidemic sub-compartments. We improved the LightGBM model using a Bayesian algorithm, SSA, and HPO. In comparison with the other models, the enhanced model performed better in terms of accuracy, precision, recall, sensitivity, and specificity. Based on the results of the correlation analysis and the spatial analysis, the enhanced model was then used to predict the future emergence of pine wilt disease in new counties and districts. At present, pine wilt disease is concentrated primarily in the central-southern and northeastern provinces of China. The predictions indicate that the disease will spread further into the northeastern and southern regions of the country.
Keywords: Biological Control; Climate Change; Data Science; Disease Control and Pest Management.
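As an illustration of the evaluation criteria named in the abstract, the following is a minimal sketch of how accuracy, precision, recall (sensitivity), and specificity can be computed from binary predictions. It is not the authors' code: the function name `binary_metrics`, the `BinaryMetrics` container, and the example labels are hypothetical, with 1 assumed to mean a county with a new outbreak and 0 a county without one. Under the standard definitions, recall and sensitivity are the same quantity (the true positive rate), so the sketch reports them as a single value.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float       # same quantity as sensitivity (true positive rate)
    specificity: float  # true negative rate


def _ratio(numerator, denominator):
    # Return 0.0 instead of raising when a class is absent from the data.
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Compute confusion-matrix metrics for 0/1 labels (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # outbreak, predicted outbreak
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # no outbreak, predicted none
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarm
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed outbreak

    return BinaryMetrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        precision=_ratio(tp, tp + fp),
        recall=_ratio(tp, tp + fn),
        specificity=_ratio(tn, tn + fp),
    )


# Made-up labels for ten counties, for illustration only.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
m = binary_metrics(y_true, y_pred)
print(m)
```

For outbreak prediction, a missed outbreak (false negative) and a false alarm (false positive) carry different costs, which is one reason to report recall and specificity alongside overall accuracy.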