-
Improving Atmospheric Processes in Earth System Models with Deep Learning Ensembles and Stochastic Parameterizations
Authors:
Gunnar Behrens,
Tom Beucler,
Fernando Iglesias-Suarez,
Sungduk Yu,
Pierre Gentine,
Michael Pritchard,
Mierk Schwabe,
Veronika Eyring
Abstract:
Deep learning has proven to be a valuable tool to represent subgrid processes in climate models, but most application cases have so far used idealized settings and deterministic approaches. Here, we develop ensemble and stochastic parameterizations with calibrated uncertainty quantification to learn subgrid convective and turbulent processes and surface radiative fluxes of a superparameterization embedded in an Earth System Model (ESM). We explore three methods to construct stochastic parameterizations: 1) a single Deep Neural Network (DNN) with Monte Carlo Dropout; 2) a multi-network ensemble; and 3) a Variational Encoder Decoder with latent space perturbation. We show that the multi-network ensembles improve the representation of convective processes in the planetary boundary layer compared to individual DNNs. The respective uncertainty quantification illustrates that the two latter methods are advantageous compared to a dropout-based DNN ensemble regarding the spread of convective processes. We develop a novel partial coupling strategy to sidestep issues in condensate emulation to evaluate the multi-network parameterizations in online runs coupled to the ESM. We can conduct Earth-like stable runs over more than 5 months with the ensemble approach, while such simulations using individual DNNs fail within days. Moreover, we show that our novel ensemble parameterizations improve the representation of extreme precipitation and the underlying diurnal cycle compared to a traditional parameterization, although faithfully representing the mean precipitation pattern remains challenging. Our results pave the way towards a new generation of parameterizations using machine learning with realistic uncertainty quantification that significantly improve the representation of subgrid effects.
Submitted 5 February, 2024;
originally announced February 2024.
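The multi-network ensemble idea in the abstract above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the "members" are random linear maps standing in for trained DNNs, the names `make_member`, `ensemble_predict`, and `stochastic_draw` are hypothetical, and drawing one random member per call is only one possible way to turn an ensemble into a stochastic parameterization, not necessarily the authors' method.

```python
import random
import statistics

def make_member(seed, n_inputs):
    """Hypothetical stand-in for one trained DNN: a random linear map."""
    rng = random.Random(seed)
    weights = [rng.gauss(1.0, 0.1) for _ in range(n_inputs)]

    def predict(x):
        return sum(w * xi for w, xi in zip(weights, x))

    return predict

def ensemble_predict(members, x):
    """Return the ensemble mean and spread (sample std) for one input."""
    preds = [m(x) for m in members]
    return statistics.mean(preds), statistics.stdev(preds)

def stochastic_draw(members, x, rng):
    """One stochastic realization: the prediction of a randomly chosen member
    (an assumed sampling scheme, used here only for illustration)."""
    return rng.choice(members)(x)

# Toy usage: seven members, one three-variable input column.
members = [make_member(seed, n_inputs=3) for seed in range(7)]
x = [0.5, -1.0, 2.0]
mean, spread = ensemble_predict(members, x)
draw = stochastic_draw(members, x, random.Random(0))
```

The ensemble mean plays the role of the deterministic prediction, while the spread across members gives a simple uncertainty estimate for the same input.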
-
ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation
Authors:
Sungduk Yu,
Walter Hannah,
Liran Peng,
Jerry Lin,
Mohamed Aziz Bhouri,
Ritwik Gupta,
Björn Lütjens,
Justus Christopher Will,
Gunnar Behrens,
Julius Busecke,
Nora Loose,
Charles I Stern,
Tom Beucler,
Bryce Harrop,
Benjamin R Hillman,
Andrea Jenney,
Savannah Ferretti,
Nana Liu,
Anima Anandkumar,
Noah D Brenowitz,
Veronika Eyring,
Nicholas Geneva,
Pierre Gentine,
Stephan Mandt,
Jaideep Pathak,
et al. (31 additional authors not shown)
Abstract:
Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state.
The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res) and code (https://leap-stc.github.io/ClimSim) are released openly to support the development of hybrid ML-physics and high-fidelity climate simulations for the benefit of science and society.
Submitted 6 February, 2024; v1 submitted 14 June, 2023;
originally announced June 2023.
-
Non-Linear Dimensionality Reduction with a Variational Encoder Decoder to Understand Convective Processes in Climate Models
Authors:
Gunnar Behrens,
Tom Beucler,
Pierre Gentine,
Fernando Iglesias-Suarez,
Michael Pritchard,
Veronika Eyring
Abstract:
Deep learning can accurately represent sub-grid-scale convective processes in climate models, learning from high-resolution simulations. However, deep learning methods usually lack interpretability due to large internal dimensionality, resulting in reduced trustworthiness in these methods. Here, we use Variational Encoder Decoder structures (VED), a non-linear dimensionality reduction technique, to learn and understand convective processes in an aquaplanet superparameterized climate model simulation, where deep convective processes are simulated explicitly. We show that, similar to previous deep learning studies based on feed-forward neural nets, the VED is capable of learning and accurately reproducing convective processes. In contrast to past work, we show this can be achieved by compressing the original information into only five latent nodes. As a result, the VED can be used to understand convective processes and delineate modes of convection through the exploration of its latent dimensions. A close investigation of the latent space enables the identification of different convective regimes: a) stable conditions are clearly distinguished from deep convection with low outgoing longwave radiation and strong precipitation; b) high optically thin cirrus-like clouds are separated from low optically thick cumulus clouds; and c) shallow convective processes are associated with large-scale moisture content and surface diabatic heating. Our results demonstrate that VEDs can accurately represent convective processes in climate models, while enabling interpretability and better understanding of sub-grid-scale physical processes, paving the way to increasingly interpretable machine learning parameterizations with promising generative properties.
Submitted 26 July, 2022; v1 submitted 19 April, 2022;
originally announced April 2022.
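The compression into five latent nodes described in the abstract above can be sketched with a toy encoder-decoder. This is a minimal illustration under stated assumptions, not the authors' architecture: `encode`, `sample_latent`, and `decode` are hypothetical placeholders for trained networks (here, chunk averages and a fixed linear read-out), and the fixed log-variance is invented for the example. Only the latent size of five and the idea of sampling or perturbing the latent vector come from the abstracts.

```python
import math
import random

LATENT_DIM = 5  # the abstract reports compression into five latent nodes

def encode(x):
    """Hypothetical encoder: chunk averages as latent means, fixed log-variance."""
    chunk = len(x) // LATENT_DIM
    mu = [sum(x[i * chunk:(i + 1) * chunk]) / chunk for i in range(LATENT_DIM)]
    log_var = [-2.0] * LATENT_DIM  # assumed constant, for illustration only
    return mu, log_var

def sample_latent(mu, log_var, rng, scale=1.0):
    """Reparameterization: z = mu + scale * sigma * eps, with eps ~ N(0, 1)."""
    return [m + scale * math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def decode(z, n_outputs):
    """Hypothetical decoder: maps the latent vector to a different output space."""
    return [sum(z) * (k + 1) / n_outputs for k in range(n_outputs)]

# Toy usage: 20 input variables, 4 output variables.
x = [float(i) for i in range(20)]
mu, log_var = encode(x)
rng = random.Random(42)
z = sample_latent(mu, log_var, rng)                 # one perturbed latent sample
y = decode(z, n_outputs=4)                          # one stochastic prediction
z_det = sample_latent(mu, log_var, rng, scale=0.0)  # no perturbation: z equals mu
```

Setting `scale=0.0` recovers the deterministic prediction, while repeated draws with `scale > 0` yield a set of predictions whose spread reflects the latent perturbation.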