Search | arXiv e-print repository

DiffDA: a Diffusion Model for Weather-scale Data Assimilation

Authors: Langwen Huang, Lukas Gianinazzi, Yuejiang Yu, Peter D. Dueben, Torsten Hoefler

Abstract: The generation of initial conditions via accurate data assimilation is crucial for weather forecasting and climate modeling. We propose DiffDA as a denoising diffusion model capable of assimilating atmospheric variables using predicted states and sparse observations. Acknowledging the similarity between a weather forecast model and a denoising diffusion model dedicated to weather applications, we adapt the pretrained GraphCast neural network as the backbone of the diffusion model. Through experiments based on simulated observations from the ERA5 reanalysis dataset, our method can produce assimilated global atmospheric data consistent with observations at 0.25 deg (~30km) resolution globally. This marks the highest resolution achieved by ML data assimilation models. The experiments also show that the initial conditions assimilated from sparse observations (less than 0.96% of gridded data) and 48-hour forecast can be used for forecast models with a loss of lead time of at most 24 hours compared to initial conditions from state-of-the-art data assimilation in ERA5. This enables the application of the method to real-world applications, such as creating reanalysis datasets with autoregressive data assimilation.

Submitted 10 June, 2024; v1 submitted 11 January, 2024; originally announced January 2024.

arXiv:2311.07222 [pdf, other]

doi 10.1038/s41586-024-07744-y

Neural General Circulation Models for Weather and Climate

Authors: Dmitrii Kochkov, Janni Yuval, Ian Langmore, Peter Norgaard, Jamie Smith, Griffin Mooers, Milan Klöwer, James Lottes, Stephan Rasp, Peter Düben, Sam Hatfield, Peter Battaglia, Alvaro Sanchez-Gonzalez, Matthew Willson, Michael P. Brenner, Stephan Hoyer

Abstract: General circulation models (GCMs) are the foundation of weather and climate prediction. GCMs are physics-based simulators which combine a numerical solver for large-scale dynamics with tuned representations for small-scale processes such as cloud formation. Recently, machine learning (ML) models trained on reanalysis data achieved comparable or better skill than GCMs for deterministic weather forecasting. However, these models have not demonstrated improved ensemble forecasts, or shown sufficient stability for long-term weather and climate simulations. Here we present the first GCM that combines a differentiable solver for atmospheric dynamics with ML components, and show that it can generate forecasts of deterministic weather, ensemble weather and climate on par with the best ML and physics-based methods. NeuralGCM is competitive with ML models for 1-10 day forecasts, and with the European Centre for Medium-Range Weather Forecasts ensemble prediction for 1-15 day forecasts. With prescribed sea surface temperature, NeuralGCM can accurately track climate metrics such as global mean temperature for multiple decades, and climate forecasts with 140 km resolution exhibit emergent phenomena such as realistic frequency and trajectories of tropical cyclones. For both weather and climate, our approach offers orders of magnitude computational savings over conventional GCMs. Our results show that end-to-end deep learning is compatible with tasks performed by conventional GCMs, and can enhance the large-scale physical simulations that are essential for understanding and predicting the Earth system.

Submitted 7 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: 92 pages, 54 figures. Nature (2024)

arXiv:2308.15560 [pdf, other]

WeatherBench 2: A benchmark for the next generation of data-driven global weather models

Authors: Stephan Rasp, Stephan Hoyer, Alexander Merose, Ian Langmore, Peter Battaglia, Tyler Russel, Alvaro Sanchez-Gonzalez, Vivian Yang, Rob Carver, Shreya Agrawal, Matthew Chantry, Zied Ben Bouallegue, Peter Dueben, Carla Bromberg, Jared Sisk, Luke Barrington, Aaron Bell, Fei Sha

Abstract: WeatherBench 2 is an update to the global, medium-range (1-14 day) weather forecasting benchmark proposed by Rasp et al. (2020), designed with the aim to accelerate progress in data-driven weather modeling. WeatherBench 2 consists of an open-source evaluation framework, publicly available training, ground truth and baseline data as well as a continuously updated website with the latest metrics and state-of-the-art models: https://sites.research.google/weatherbench. This paper describes the design principles of the evaluation framework and presents results for current state-of-the-art physical and data-driven weather models. The metrics are based on established practices for evaluating weather forecasts at leading operational weather centers. We define a set of headline scores to provide an overview of model performance. In addition, we also discuss caveats in the current evaluation setup and challenges for the future of data-driven weather forecasting.

Submitted 26 January, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

arXiv:2206.14786 [pdf, other]

ENS-10: A Dataset For Post-Processing Ensemble Weather Forecasts

Authors: Saleh Ashkboos, Langwen Huang, Nikoli Dryden, Tal Ben-Nun, Peter Dueben, Lukas Gianinazzi, Luca Kummer, Torsten Hoefler

Abstract: Post-processing ensemble prediction systems can improve the reliability of weather forecasting, especially for extreme event prediction. In recent years, different machine learning models have been developed to improve the quality of weather post-processing. However, these models require a comprehensive dataset of weather simulations to produce high-accuracy results, which comes at a high computational cost to generate. This paper introduces the ENS-10 dataset, consisting of ten ensemble members spanning 20 years (1998-2017). The ensemble members are generated by perturbing numerical weather simulations to capture the chaotic behavior of the Earth. To represent the three-dimensional state of the atmosphere, ENS-10 provides the most relevant atmospheric variables at 11 distinct pressure levels and the surface at 0.5-degree resolution for forecast lead times T=0, 24, and 48 hours (two data points per week). We propose the ENS-10 prediction correction task for improving the forecast quality at a 48-hour lead time through ensemble post-processing. We provide a set of baselines and compare their skill at correcting the predictions of three important atmospheric variables. Moreover, we measure the baselines' skill at improving predictions of extreme weather events using our dataset. The ENS-10 dataset is available under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Submitted 7 November, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

Comments: Accepted version of the paper

arXiv:2204.02028 [pdf, other]

doi 10.1029/2022MS003120

A Generative Deep Learning Approach to Stochastic Downscaling of Precipitation Forecasts

Authors: Lucy Harris, Andrew T. T. McRae, Matthew Chantry, Peter D. Dueben, Tim N. Palmer

Abstract: Despite continuous improvements, precipitation forecasts are still not as accurate and reliable as those of other meteorological variables. A major contributing factor to this is that several key processes affecting precipitation distribution and intensity occur below the resolved scale of global weather models. Generative adversarial networks (GANs) have been demonstrated by the computer vision community to be successful at super-resolution problems, i.e., learning to add fine-scale structure to coarse images. Leinonen et al. (2020) previously applied a GAN to produce ensembles of reconstructed high-resolution atmospheric fields, given coarsened input data. In this paper, we demonstrate this approach can be extended to the more challenging problem of increasing the accuracy and resolution of comparatively low-resolution input from a weather forecasting model, using high-resolution radar measurements as a "ground truth". The neural network must learn to add resolution and structure whilst accounting for non-negligible forecast error. We show that GANs and VAE-GANs can match the statistical properties of state-of-the-art pointwise post-processing methods whilst creating high-resolution, spatially coherent precipitation maps. Our model compares favourably to the best existing downscaling methods in both pixel-wise and pooled CRPS scores, power spectrum information and rank histograms (used to assess calibration). We test our models and show that they perform in a range of scenarios, including heavy rainfall.

Submitted 28 July, 2022; v1 submitted 5 April, 2022; originally announced April 2022.

Comments: Revised version 28/7/22

arXiv:2112.11429 [pdf]

doi 10.1029/2021MS002744

Machine Learning Emulation of Urban Land Surface Processes

Authors: David Meyer, Sue Grimmond, Peter Dueben, Robin Hogan, Maarten van Reeuwijk

Abstract: Can we improve the modeling of urban land surface processes with machine learning (ML)? A prior comparison of urban land surface models (ULSMs) found that no single model is 'best' at predicting all common surface fluxes. Here, we develop an urban neural network (UNN) trained on the mean predicted fluxes from 22 ULSMs at one site. The UNN emulates the mean output of ULSMs accurately. When compared to a reference ULSM (Town Energy Balance; TEB), the UNN has greater accuracy relative to flux observations, less computational cost, and requires fewer input parameters. When coupled to the Weather Research Forecasting (WRF) model using TensorFlow bindings, WRF-UNN is stable and more accurate than the reference WRF-TEB. Although the application is currently constrained by the training data (1 site), we show a novel approach to improve the modeling of surface fluxes by combining the strengths of several ULSMs into one using ML.

Submitted 15 March, 2022; v1 submitted 21 December, 2021; originally announced December 2021.

Comments: Published version

Journal ref: Meyer, D., Grimmond, S., Dueben, P., Hogan, R., & van Reeuwijk, M. (2022). Machine Learning Emulation of Urban Land Surface Processes. Journal of Advances in Modeling Earth Systems, 14(3)

arXiv:2112.08217 [pdf, other]

Probabilistic Forecasting with Generative Networks via Scoring Rule Minimization

Authors: Lorenzo Pacchiardi, Rilwan Adewoyin, Peter Dueben, Ritabrata Dutta

Abstract: Probabilistic forecasting relies on past observations to provide a probability distribution for a future outcome, which is often evaluated against the realization using a scoring rule. Here, we perform probabilistic forecasting with generative neural networks, which parametrize distributions on high-dimensional spaces by transforming draws from a latent variable. Generative networks are typically trained in an adversarial framework. In contrast, we propose to train generative networks to minimize a predictive-sequential (or prequential) scoring rule on a recorded temporal sequence of the phenomenon of interest, which is appealing as it corresponds to the way forecasting systems are routinely evaluated. Adversarial-free minimization is possible for some scoring rules; hence, our framework avoids the cumbersome hyperparameter tuning and uncertainty underestimation due to unstable adversarial training, thus unlocking reliable use of generative networks in probabilistic forecasting. Further, we prove consistency of the minimizer of our objective with dependent data, while adversarial training assumes independence. We perform simulation studies on two chaotic dynamical models and a benchmark data set of global weather observations; for this last example, we define scoring rules for spatial data by drawing from the relevant literature. Our method outperforms state-of-the-art adversarial approaches, especially in probabilistic calibration, while requiring less hyperparameter tuning.

Submitted 13 February, 2024; v1 submitted 15 December, 2021; originally announced December 2021.

Journal ref: Journal of Machine Learning Research, 25(45), 2024

arXiv:2103.16120 [pdf, other]

doi 10.1029/2022MS003148

Mixed-precision for Linear Solvers in Global Geophysical Flows

Authors: Jan Ackmann, Peter D. Düben, Tim N. Palmer, Piotr K. Smolarkiewicz

Abstract: Semi-implicit time-stepping schemes for atmosphere and ocean models require elliptic solvers that work efficiently on modern supercomputers. This paper reports our study of the potential computational savings when using mixed precision arithmetic in the elliptic solvers. The essential components of a representative elliptic solver are run at precision levels as low as half (16 bits), and accompanied with a detailed evaluation of the impact of reduced precision on the solver convergence and the solution quality. A detailed inquiry into reduced precision requires a model configuration that is meaningful but cheaper to run and easier to evaluate than full atmosphere/ocean models. This study is therefore conducted in the context of a novel semi-implicit shallow-water model on the sphere, purposely designed to mimic numerical intricacies of modern all-scale weather and climate (W&C) models with the numerical stability independent on celerity of all wave motions. The governing algorithm of the shallow-water model is based on the non-oscillatory MPDATA methods for geophysical flows, whereas the resulting elliptic problem employs a strongly preconditioned non-symmetric Krylov-subspace solver GCR, proven in advanced atmospheric applications. The classical longitude/latitude grid is deliberately chosen to retain the stiffness of global W&C models posed in thin spherical shells as well as to better understand the performance of reduced-precision arithmetic in the vicinity of grid singularities. Precision reduction is done on a software level, using an emulator. The reduced-precision experiments are conducted for established dynamical-core test-cases, like the Rossby-Haurwitz wave number 4 and a zonal orographic flow. The study shows that selected key components of the elliptic solver, most prominently the preconditioning, can be performed at the level of half precision.

Submitted 30 March, 2021; originally announced March 2021.

Comments: 38 pages, 12 figures; to be submitted to the Journal of Computational Physics (JCP)

arXiv:2103.11919 [pdf]

doi 10.1029/2021MS002550

Machine Learning Emulation of 3D Cloud Radiative Effects

Authors: David Meyer, Robin J. Hogan, Peter D. Dueben, Shannon L. Mason

Abstract: The treatment of cloud structure in numerical weather and climate models is often greatly simplified to make them computationally affordable. Here we propose to correct the European Centre for Medium-Range Weather Forecasts 1D radiation scheme ecRad for 3D cloud effects using computationally cheap neural networks. 3D cloud effects are learned as the difference between ecRad's fast 1D Tripleclouds solver that neglects them and its 3D SPARTACUS (SPeedy Algorithm for Radiative TrAnsfer through CloUd Sides) solver that includes them but is about five times more computationally expensive. With typical errors between 20 % and 30 % of the 3D signal, neural networks improve Tripleclouds' accuracy for about 1 % increase in runtime. Thus, rather than emulating the whole of SPARTACUS, we keep Tripleclouds unchanged for cloud-free parts of the atmosphere and 3D-correct it elsewhere. The focus on the comparably small 3D correction instead of the entire signal allows us to improve predictions significantly if we assume a similar signal-to-noise ratio for both.

Submitted 15 March, 2022; v1 submitted 22 March, 2021; originally announced March 2021.

Comments: Published version

Journal ref: Meyer, D., Hogan, R. J., Dueben, P. D., & Mason, S. L. (2022). Machine Learning Emulation of 3D Cloud Radiative Effects. Journal of Advances in Modeling Earth Systems, 14(3)

arXiv:2010.02866 [pdf, other]

Machine-Learned Preconditioners for Linear Solvers in Geophysical Fluid Flows

Authors: Jan Ackmann, Peter D. Düben, Tim N. Palmer, Piotr K. Smolarkiewicz

Abstract: It is tested whether machine learning methods can be used for preconditioning to increase the performance of the linear solver -- the backbone of the semi-implicit, grid-point model approach for weather and climate models. Embedding the machine-learning method within the framework of a linear solver circumvents potential robustness issues that machine learning approaches are often criticized for, as the linear solver ensures that a sufficient, pre-set level of accuracy is reached. The approach does not require prior availability of a conventional preconditioner and is highly flexible regarding complexity and machine learning design choices. Several machine learning methods are used to learn the optimal preconditioner for a shallow-water model with semi-implicit timestepping that is conceptually similar to more complex atmosphere models. The machine-learning preconditioner is competitive with a conventional preconditioner and provides good results even if it is used outside of the dynamical range of the training dataset.

Submitted 6 October, 2020; originally announced October 2020.

Comments: To be submitted to GRL, 15 pages, 3 figures

arXiv:2008.09090 [pdf, other]

TRU-NET: A Deep Learning Approach to High Resolution Prediction of Rainfall

Authors: Rilwan Adewoyin, Peter Dueben, Peter Watson, Yulan He, Ritabrata Dutta

Abstract: Climate models (CM) are used to evaluate the impact of climate change on the risk of floods and strong precipitation events. However, these numerical simulators have difficulties representing precipitation events accurately, mainly due to limited spatial resolution when simulating multi-scale dynamics in the atmosphere. To improve the prediction of high resolution precipitation we apply a Deep Learning (DL) approach using an input of CM simulations of the model fields (weather variables) that are more predictable than local precipitation. To this end, we present TRU-NET (Temporal Recurrent U-Net), an encoder-decoder model featuring a novel 2D cross attention mechanism between contiguous convolutional-recurrent layers to effectively model multi-scale spatio-temporal weather processes. We use a conditional-continuous loss function to capture the zero-skewed %extreme event patterns of rainfall. Experiments show that our model consistently attains lower RMSE and MAE scores than a DL model prevalent in short term precipitation prediction and improves upon the rainfall predictions of a state-of-the-art dynamical weather model. Moreover, by evaluating the performance of our model under various, training and testing, data formulation strategies, we show that there is enough data for our deep learning approach to output robust, high-quality results across seasons and varying regions.

Submitted 12 February, 2021; v1 submitted 20 August, 2020; originally announced August 2020.

arXiv:2005.08748 [pdf, other]

doi 10.1098/rsta.2020.0092

Deep Learning for Post-Processing Ensemble Weather Forecasts

Authors: Peter Grönquist, Chengyuan Yao, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Shigang Li, Torsten Hoefler

Abstract: Quantifying uncertainty in weather forecasts is critical, especially for predicting extreme weather events. This is typically accomplished with ensemble prediction systems, which consist of many perturbed numerical weather simulations, or trajectories, run in parallel. These systems are associated with a high computational cost and often involve statistical post-processing steps to inexpensively improve their raw prediction qualities. We propose a mixed model that uses only a subset of the original weather trajectories combined with a post-processing step using deep neural networks. These enable the model to account for non-linear relationships that are not captured by current numerical models or post-processing methods. Applied to global data, our mixed models achieve a relative improvement in ensemble forecast skill (CRPS) of over 14%. Furthermore, we demonstrate that the improvement is larger for extreme weather events on select case studies. We also show that our post-processing can use fewer trajectories to achieve comparable results to the full ensemble. By using fewer trajectories, the computational costs of an ensemble prediction system can be reduced, allowing it to run at higher resolution and produce more accurate forecasts.

Submitted 21 September, 2020; v1 submitted 18 May, 2020; originally announced May 2020.

arXiv:1911.00630 [pdf, other]

Predicting Weather Uncertainty with Deep Convnets

Authors: Peter Grönquist, Tal Ben-Nun, Nikoli Dryden, Peter Dueben, Luca Lavarini, Shigang Li, Torsten Hoefler

Abstract: Modern weather forecast models perform uncertainty quantification using ensemble prediction systems, which collect nonparametric statistics based on multiple perturbed simulations. To provide accurate estimation, dozens of such computationally intensive simulations must be run. We show that deep neural networks can be used on a small set of numerical weather simulations to estimate the spread of a weather forecast, significantly reducing computational cost. To train the system, we both modify the 3D U-Net architecture and explore models that incorporate temporal data. Our models serve as a starting point to improve uncertainty quantification in current real-time weather forecasting systems, which is vital for predicting extreme events.

Submitted 4 December, 2019; v1 submitted 1 November, 2019; originally announced November 2019.

Comments: Poster presentation at NeurIPS2019 "Machine Learning and the Physical Sciences" Workshop

MSC Class: I.2.10; I.2.1 ACM Class: I.2.10; I.2.1

Showing 1–13 of 13 results for author: Dueben, P