-
Precipitation Downscaling with Spatiotemporal Video Diffusion
Authors:
Prakhar Srivastava,
Ruihan Yang,
Gavin Kerrigan,
Gideon Dresdner,
Jeremy McGibbon,
Christopher Bretherton,
Stephan Mandt
Abstract:
In climate science and meteorology, high-resolution local precipitation (rain and snowfall) predictions are limited by the computational costs of simulation-based methods. Statistical downscaling, or super-resolution, is a common workaround where a low-resolution prediction is improved using statistical approaches. Unlike traditional computer vision tasks, weather and climate applications require capturing the accurate conditional distribution of high-resolution given low-resolution patterns to assure reliable ensemble averages and unbiased estimates of extreme events, such as heavy rain. This work extends recent video diffusion models to precipitation super-resolution, employing a deterministic downscaler followed by a temporally-conditioned diffusion model to capture noise characteristics and high-frequency patterns. We test our approach on FV3GFS output, an established large-scale global atmosphere model, and compare it against six state-of-the-art baselines. Our analysis, capturing CRPS, MSE, precipitation distributions, and qualitative aspects using California and the Himalayas as examples, establishes our method as a new standard for data-driven precipitation downscaling.
Submitted 20 June, 2024; v1 submitted 10 December, 2023;
originally announced December 2023.
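The abstract above lists CRPS (continuous ranked probability score) among its evaluation metrics. As a rough illustration of what that score measures, here is a minimal sketch of the standard empirical CRPS estimator for an ensemble forecast against a single scalar observation. The function name `crps_ensemble` and its arguments are hypothetical and chosen for illustration; this is not the paper's evaluation code, which presumably operates on full precipitation fields.

```python
def crps_ensemble(members, observation):
    """Empirical CRPS of an ensemble forecast against one scalar observation.

    Uses the standard estimator E|X - y| - 0.5 * E|X - X'|, where both
    expectations are taken over the ensemble members.
    """
    m = len(members)
    if m == 0:
        raise ValueError("need at least one ensemble member")
    # Mean absolute error of the members against the observation.
    accuracy = sum(abs(x - observation) for x in members) / m
    # Mean absolute difference between all pairs of members (spread).
    spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return accuracy - 0.5 * spread
```

Lower values are better. With a single ensemble member the spread term vanishes and the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of MAE.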
-
Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds
Authors:
Justus C. Will,
Andrea M. Jenney,
Kara D. Lamb,
Michael S. Pritchard,
Colleen Kaul,
Po-Lun Ma,
Kyle Pressel,
Jacob Shpund,
Marcus van Lier-Walqui,
Stephan Mandt
Abstract:
Thorough analysis of local droplet-level interactions is crucial to better understand the microphysical processes in clouds and their effect on the global climate. High-accuracy simulations of relevant droplet size distributions from Large Eddy Simulations (LES) of bin microphysics challenge current analysis techniques due to their high dimensionality involving three spatial dimensions, time, and a continuous range of droplet sizes. Utilizing the compact latent representations from Variational Autoencoders (VAEs), we produce novel and intuitive visualizations for the organization of droplet sizes and their evolution over time beyond what is possible with clustering techniques. This greatly improves interpretation and allows us to examine aerosol-cloud interactions by contrasting simulations with different aerosol concentrations. We find that the evolution of the droplet spectrum is similar across aerosol levels but occurs at different paces. This similarity suggests that precipitation initiation processes are alike despite variations in onset times.
Submitted 31 October, 2023;
originally announced October 2023.
-
ClimSim: A large multi-scale dataset for hybrid physics-ML climate emulation
Authors:
Sungduk Yu,
Walter Hannah,
Liran Peng,
Jerry Lin,
Mohamed Aziz Bhouri,
Ritwik Gupta,
Björn Lütjens,
Justus Christopher Will,
Gunnar Behrens,
Julius Busecke,
Nora Loose,
Charles I Stern,
Tom Beucler,
Bryce Harrop,
Benjamin R Hillman,
Andrea Jenney,
Savannah Ferretti,
Nana Liu,
Anima Anandkumar,
Noah D Brenowitz,
Veronika Eyring,
Nicholas Geneva,
Pierre Gentine,
Stephan Mandt,
Jaideep Pathak, et al. (31 additional authors not shown)
Abstract:
Modern climate projections lack adequate spatial and temporal resolution due to computational constraints. A consequence is inaccurate and imprecise predictions of critical processes such as storms. Hybrid methods that combine physics with machine learning (ML) have introduced a new generation of higher fidelity climate simulators that can sidestep Moore's Law by outsourcing compute-hungry, short, high-resolution simulations to ML emulators. However, this hybrid ML-physics simulation approach requires domain-specific treatment and has been inaccessible to ML experts because of lack of training data and relevant, easy-to-use workflows. We present ClimSim, the largest-ever dataset designed for hybrid ML-physics research. It comprises multi-scale climate simulations, developed by a consortium of climate scientists and ML researchers. It consists of 5.7 billion pairs of multivariate input and output vectors that isolate the influence of locally-nested, high-resolution, high-fidelity physics on a host climate simulator's macro-scale physical state.
The dataset is global in coverage, spans multiple years at high sampling frequency, and is designed such that resulting emulators are compatible with downstream coupling into operational climate simulators. We implement a range of deterministic and stochastic regression baselines to highlight the ML challenges and their scoring. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res) and code (https://leap-stc.github.io/ClimSim) are released openly to support the development of hybrid ML-physics and high-fidelity climate simulations for the benefit of science and society.
Submitted 6 February, 2024; v1 submitted 14 June, 2023;
originally announced June 2023.
-
Understanding Extreme Precipitation Changes through Unsupervised Machine Learning
Authors:
Griffin Mooers,
Tom Beucler,
Mike Pritchard,
Stephan Mandt
Abstract:
Despite the importance of quantifying how the spatial patterns of extreme precipitation will change with warming, we lack tools to objectively analyze the storm-scale outputs of modern climate models. To address this gap, we develop an unsupervised machine learning framework to quantify how storm dynamics affect changes in precipitation extremes, without sacrificing spatial information. For the upper precipitation quantiles (above the 80th percentile), we find that the spatial patterns of extreme precipitation changes are dominated by spatial shifts in storm dynamical regimes rather than changes in how these storm regimes produce precipitation. Our study shows how unsupervised machine learning, paired with domain knowledge, may allow us to better understand the physics of the atmosphere and anticipate the changes associated with a warming world.
Submitted 1 December, 2023; v1 submitted 3 November, 2022;
originally announced November 2022.
-
Making Thermodynamic Models of Mixtures Predictive by Machine Learning: Matrix Completion of Pair Interactions
Authors:
Fabian Jirasek,
Robert Bamler,
Sophie Fellenz,
Michael Bortz,
Marius Kloft,
Stephan Mandt,
Hans Hasse
Abstract:
Predictive models of thermodynamic properties of mixtures are paramount in chemical engineering and chemistry. Classical thermodynamic models are successful in generalizing over (continuous) conditions like temperature and concentration. On the other hand, matrix completion methods (MCMs) from machine learning successfully generalize over (discrete) binary systems; these MCMs can make predictions without any data for a given binary system by implicitly learning commonalities across systems. In the present work, we combine the strengths of both worlds in a hybrid approach. The underlying idea is to predict the pair-interaction energies, as they are used in basically all physical models of liquid mixtures, by an MCM. As an example, we embed an MCM into UNIQUAC, a widely-used physical model for the Gibbs excess energy. We train the resulting hybrid model in a Bayesian machine-learning framework on experimental data for activity coefficients in binary systems of 1146 components from the Dortmund Data Bank. We thereby obtain, for the first time, a complete set of UNIQUAC parameters for all binary systems of these components, which allows us to predict, in principle, activity coefficients at arbitrary temperature and composition for any combination of these components, not only for binary but also for multicomponent systems. The hybrid model even outperforms the best available physical model for predicting activity coefficients, the modified UNIFAC (Dortmund) model.
Submitted 1 September, 2022;
originally announced September 2022.
-
Comparing Storm Resolving Models and Climates via Unsupervised Machine Learning
Authors:
Griffin Mooers,
Mike Pritchard,
Tom Beucler,
Prakhar Srivastava,
Harshini Mangipudi,
Liran Peng,
Pierre Gentine,
Stephan Mandt
Abstract:
Global Storm-Resolving Models (GSRMs) have gained widespread interest because of the unprecedented detail with which they resolve the global climate. However, it remains difficult to quantify objective differences in how GSRMs resolve complex atmospheric formations. This lack of comprehensive tools for comparing model similarities is a problem in many disparate fields that involve simulation tools for complex data. To address this challenge we develop methods to estimate distributional distances based on both nonlinear dimensionality reduction and vector quantization. Our approach automatically learns physically meaningful notions of similarity from low-dimensional latent data representations that the different models produce. This enables an intercomparison of nine GSRMs based on their high-dimensional simulation data (2D vertical velocity snapshots) and reveals that only six are similar in their representation of atmospheric dynamics. Furthermore, we uncover signatures of the convective response to global warming in a fully unsupervised way. Our study provides a path toward evaluating future high-resolution simulation data more objectively.
Submitted 2 December, 2023; v1 submitted 24 August, 2022;
originally announced August 2022.
-
Hybridizing Physical and Data-driven Prediction Methods for Physicochemical Properties
Authors:
Fabian Jirasek,
Robert Bamler,
Stephan Mandt
Abstract:
We present a generic way to hybridize physical and data-driven methods for predicting physicochemical properties. The approach 'distills' the physical method's predictions into a prior model and combines it with sparse experimental data using Bayesian inference. We apply the new approach to predict activity coefficients at infinite dilution and obtain significant improvements compared to the data-driven and physical baselines and established ensemble methods from the machine learning literature.
Submitted 17 February, 2022;
originally announced February 2022.
-
Analyzing High-Resolution Clouds and Convection using Multi-Channel VAEs
Authors:
Harshini Mangipudi,
Griffin Mooers,
Mike Pritchard,
Tom Beucler,
Stephan Mandt
Abstract:
Understanding the details of small-scale convection and storm formation is crucial to accurately represent the larger-scale planetary dynamics. Presently, atmospheric scientists run high-resolution, storm-resolving simulations to capture these kilometer-scale weather details. However, because they contain abundant information, these simulations can be overwhelming to analyze using conventional approaches. This paper takes a data-driven approach and jointly embeds spatial arrays of vertical wind velocities, temperatures, and water vapor information as three "channels" of a VAE architecture. Our "multi-channel VAE" results in more interpretable and robust latent structures than earlier work analyzing vertical velocities in isolation. Analyzing and clustering the VAE's latent space identifies weather patterns and their geographical manifestations in a fully unsupervised fashion. Our approach shows that VAEs can play essential roles in analyzing high-dimensional simulation data and extracting critical weather and climate characteristics.
Submitted 1 December, 2021;
originally announced December 2021.
-
Generative Modeling for Atmospheric Convection
Authors:
Griffin Mooers,
Jens Tuyls,
Stephan Mandt,
Michael Pritchard,
Tom Beucler
Abstract:
While cloud-resolving models can explicitly simulate the details of small-scale storm formation and morphology, these details are often ignored by climate models for lack of computational resources. Here, we explore the potential of generative modeling to cheaply recreate small-scale storms by designing and implementing a Variational Autoencoder (VAE) that performs structural replication, dimensionality reduction, and clustering of high-resolution vertical velocity fields. Trained on ~6*10^6 samples spanning the globe, the VAE successfully reconstructs the spatial structure of convection, performs unsupervised clustering of convective organization regimes, and identifies anomalous storm activity, confirming the potential of generative modeling to power stochastic parameterizations of convection in climate models.
Submitted 24 October, 2020; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Machine Learning in Thermodynamics: Prediction of Activity Coefficients by Matrix Completion
Authors:
Fabian Jirasek,
Rodrigo A. S. Alves,
Julie Damay,
Robert A. Vandermeulen,
Robert Bamler,
Michael Bortz,
Stephan Mandt,
Marius Kloft,
Hans Hasse
Abstract:
Activity coefficients, which are a measure of the non-ideality of liquid mixtures, are a key property in chemical engineering with relevance to modeling chemical and phase equilibria as well as transport processes. Although experimental data on thousands of binary mixtures are available, prediction methods are needed to calculate the activity coefficients in many relevant mixtures that have not been explored to-date. In this report, we propose a probabilistic matrix factorization model for predicting the activity coefficients in arbitrary binary mixtures. Although no physical descriptors for the considered components were used, our method outperforms the state-of-the-art method that has been refined over three decades while requiring much less training effort. This opens perspectives to novel methods for predicting physico-chemical properties of binary mixtures with the potential to revolutionize modeling and simulation in chemical engineering.
Submitted 28 January, 2020;
originally announced January 2020.
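The entry above describes predicting missing entries of a component-by-component matrix through matrix factorization. The general idea can be sketched as follows, under loud assumptions: this is a plain low-rank factorization fitted by stochastic gradient descent with L2 regularization (which corresponds to a MAP estimate under Gaussian priors), not the Bayesian model of the paper. All names (`factorize`, `predict`, `mse`) and hyperparameters are hypothetical and chosen only for illustration.

```python
import random


def predict(U, V, i, j):
    """Predicted matrix entry (i, j): dot product of row and column factors."""
    return sum(u * v for u, v in zip(U[i], V[j]))


def mse(U, V, observed):
    """Mean squared error over the observed entries only."""
    total = sum((predict(U, V, i, j) - y) ** 2 for (i, j), y in observed.items())
    return total / len(observed)


def factorize(observed, n_rows, n_cols, rank=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    """Fit low-rank factors U, V to the observed entries by SGD.

    observed maps (row, col) -> value for the known entries; unknown
    entries are simply absent and never enter the loss.
    """
    rng = random.Random(seed)
    # Small random initialization (assumed; any small non-zero start works).
    U = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.gauss(0.0, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    entries = sorted(observed.items())
    for _ in range(epochs):
        rng.shuffle(entries)
        for (i, j), y in entries:
            err = predict(U, V, i, j) - y
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on squared error plus L2 penalty.
                U[i][k] -= lr * (err * v + reg * u)
                V[j][k] -= lr * (err * u + reg * v)
    return U, V
```

After fitting, `predict(U, V, i, j)` can be called for a pair `(i, j)` that was never observed; the prediction relies only on what the factors of row `i` and column `j` learned from their other entries, which is the sense in which such methods generalize across binary systems.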