Integrating environmental and satellite data to estimate county-level cotton yield in Xinjiang Province

Front Plant Sci. 2023 Jan 18;13:1048479. doi: 10.3389/fpls.2022.1048479. eCollection 2022.

Abstract

Accurate and timely estimation of cotton yield over large areas is essential for precision agriculture, facilitating the operation of commodity markets and guiding agronomic management practices. Remote sensing (RS) and crop models are effective means of predicting cotton yield in the field. Satellite vegetation indices (VIs) can describe crop yield variations over large areas but cannot account for the exact environmental impact. Climate variables (CVs), which reflect the spatial heterogeneity of large regions, can provide environmental information for better estimation of cotton yield. In this study, we screened the most important VIs and CVs for estimating county-level cotton yield across Xinjiang Province. We found that VIs related to canopy structure and chlorophyll content, and CVs related to moisture, were the most significant factors for cotton growth. For yield estimation, we used four approaches: least absolute shrinkage and selection operator (LASSO) regression, support vector regression (SVR), random forest regression (RFR) and long short-term memory (LSTM). Owing to its ability to capture long-term temporal features, LSTM performed best, with an R² of 0.76, a root mean square error (RMSE) of 150 kg/ha and a relative RMSE (rRMSE) of 8.67%; moreover, adding CVs to the VIs explained an additional 10% of the variance. For within-season yield estimation using LSTM, predictions made 2 months before harvest were the most accurate (R² = 0.65, RMSE = 220 kg/ha, rRMSE = 15.97%). Our study demonstrates the feasibility of yield estimation and early prediction at the county level over large cotton cultivation areas by integrating satellite and environmental data.
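
The three accuracy measures reported above (R², RMSE and rRMSE) can be illustrated with a minimal Python sketch. The abstract does not state the exact formulas used, so the sketch assumes the common definitions: R² as the coefficient of determination, and rRMSE as RMSE divided by the mean observed yield, expressed as a percentage. The function name `yield_metrics` and the sample yields are hypothetical and are not taken from the study.

```python
import math


def yield_metrics(observed, predicted):
    """Return R2, RMSE and rRMSE (%) for paired yield values in kg/ha.

    Assumed definitions (not confirmed by the abstract):
      R2    = 1 - SS_res / SS_tot   (coefficient of determination)
      RMSE  = sqrt(mean squared error)
      rRMSE = 100 * RMSE / mean(observed)
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")

    mean_obs = sum(observed) / n
    # Residual sum of squares: squared gaps between observations and predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when all observed values are identical")

    rmse = math.sqrt(ss_res / n)
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": rmse,
        "rrmse": 100.0 * rmse / mean_obs,
    }


# Hypothetical county-level yields in kg/ha, for illustration only.
observed = [1500.0, 1700.0, 1900.0, 2100.0]
predicted = [1600.0, 1650.0, 1850.0, 2000.0]
metrics = yield_metrics(observed, predicted)
print(metrics)
```

Under these assumed definitions, rRMSE scales the error by the average yield, which is why the same absolute RMSE corresponds to a larger rRMSE when mean yields are lower.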

Keywords: GEE; climate variables; cotton; deep learning; remote sensing; yield estimation.

Grants and funding

This research was funded by the National Natural Science Foundation of China (grant numbers 41971321 and 41830108), the Key Research Program of Frontier Sciences, CAS (grant number ZDBS-LY-DQC012), and the Open Fund of Key Laboratory of Oasis Eco-agriculture, XPCC (grant numbers 201801 and 202003). CH was supported by the Youth Innovation Promotion Association, CAS (grant number Y2021047).