Rapid global population growth and climate change pose major challenges to the sustainable production of food to meet consumer demands. Process-based models (PBMs) have long been used in agricultural crop production to predict yield and to understand the environmental regulation of plant physiological processes and its consequences for crop growth and development. In recent years, with the increasing use of sensor and communication technologies for data acquisition in agriculture, machine learning (ML) has become a popular tool for yield prediction (especially at large scales) and phenotyping. Both PBMs and ML are frequently used in studies on major challenges in crop production, and each has its own advantages and drawbacks. Given their intrinsic complementarity, we propose combining PBMs and ML to develop knowledge- and data-driven modelling (KDDM) that offers both high prediction accuracy and good interpretability. Parallel, serial and modular structures are three main modes that can be adopted to develop KDDM for agricultural applications. The KDDM approach helps to simplify model parameterization by making use of sensor data and improves the accuracy of yield prediction. Furthermore, it has great potential to expand the boundaries of current crop models, allowing upscaling to the farm, regional or global level and downscaling to the gene-to-cell level. While the mechanisms of many genetic and physiological processes are still under investigation, the KDDM approach is a promising way of combining simulation models in agriculture with the fast developments in data science, especially at the nexus of increasing food production, mitigating climate change and achieving sustainability.
Keywords: Knowledge- and data-driven modelling; Machine learning; Process-based models; yield prediction.
© The Author(s) 2022. Published by Oxford University Press on behalf of the Annals of Botany Company.