Over the last decade, Gaussian process regression (GPR) has proven to be a competitive machine learning regression algorithm for Earth observation applications, with attractive properties such as band relevance ranking and uncertainty estimates. More recently, GPR has also proven to be a proficient time series processor for filling gaps in optical imagery, which are typically due to cloud cover. This makes GPR well suited for large-scale spatiotemporal processing of satellite imagery into cloud-free products of biophysical variables. With the advent of the Google Earth Engine (GEE) cloud platform, new opportunities have emerged to process local-to-planetary scale satellite data using advanced machine learning techniques and convert them into gap-filled vegetation property products. However, GPR is not yet part of the GEE ecosystem. To circumvent this limitation, this work proposes a general adaptation of the GPR formulation to a parallel processing framework and its integration into GEE. To demonstrate the functioning and utility of the developed workflow, a GPR model predicting green leaf area index (LAI_G) from Sentinel-2 imagery was imported. Although running this GPR model in GEE allows any corner of the world to be mapped into LAI_G at a resolution of 20 m, here we show demonstration cases over western Europe with zoom-ins over Spain. Thanks to the computational power of GEE, the mapping takes place on-the-fly. Additionally, a GPR-based gap-filling strategy based on pre-optimized kernel hyperparameters is put forward for the generation of multi-orbit cloud-free LAI_G maps with an unprecedented level of detail, and for the extraction of regularly sampled LAI_G time series at the pixel level. The ability to plug a locally trained GPR model into the GEE framework and process it instantly opens up a new paradigm of remote sensing image processing.
Keywords: Gaussian process regression (GPR); Google Earth Engine (GEE); Sentinel-2; gap filling; leaf area index (LAI); machine learning.