Stable Machine-Learning Parameterization of Subgrid Processes with Real Geography and Full-physics Emulation

Authors: Zeyuan Hu, Akshay Subramaniam, Zhiming Kuang, Jerry Lin, Sungduk Yu, Walter M. Hannah, Noah D. Brenowitz, Josh Romero, Michael S. Pritchard

Abstract: Modern climate projections often suffer from inadequate spatial and temporal resolution due to computational limitations, resulting in inaccurate representations of sub-resolution processes. A promising technique to address this is the Multiscale Modeling Framework (MMF), which embeds a small-domain, kilometer-resolution cloud-resolving model within each atmospheric column of a host climate model to replace traditional convection and cloud parameterizations. Machine learning (ML) offers a unique opportunity to make MMF more accessible by emulating the embedded cloud-resolving model and thereby reducing its substantial computational cost. Although many studies have demonstrated proof-of-concept success of emulating the MMF model with stable hybrid simulations, it remains a challenge to achieve operational-level success with real geography and comprehensive variable emulation, such as explicit cloud condensate coupling. In this study, we present a stable hybrid model capable of integrating for at least 5 years with near operational-level complexity, including real geography and explicit predictions of cloud condensate and wind tendencies. Our model demonstrates state-of-the-art online performance, for example in 5-year zonal mean biases, when compared to previous MMF emulation studies. Key factors contributing to this online performance include the use of an expressive U-Net architecture, input features that include large-scale forcings and convective memory, and microphysics constraints. The microphysics constraints mitigate unrealistic cloud formations such as liquid clouds at freezing temperatures or excessive ice clouds in the stratosphere, which would occur in online simulations with an unconstrained ML model.

Submitted 27 June, 2024; originally announced July 2024.
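
The microphysics constraint mentioned in the abstract above can be illustrated with a temperature-based split of total cloud condensate into liquid and ice. The following is a minimal sketch only: the function name `partition_condensate`, the linear ramp, and the 253.16 K and 273.16 K thresholds are assumptions chosen for illustration and are not taken from the abstract.

```python
def partition_condensate(q_total, temperature, t_ice=253.16, t_liquid=273.16):
    """Split total cloud condensate (kg/kg) into (liquid, ice) by temperature (K).

    Hypothetical helper: thresholds and the linear ramp are illustrative assumptions.
    """
    # Liquid fraction ramps linearly from 0 at t_ice to 1 at t_liquid.
    frac = (temperature - t_ice) / (t_liquid - t_ice)
    frac = min(1.0, max(0.0, frac))
    # Condensate cannot be negative, so clip before partitioning.
    q_total = max(0.0, q_total)
    q_liquid = frac * q_total
    return q_liquid, q_total - q_liquid
```

With a partition like this, a model that predicts only total condensate cannot place liquid cloud below `t_ice`, which is the kind of unphysical state the abstract says an unconstrained model produces.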

arXiv:2406.16947 [pdf, other]

Generative Data Assimilation of Sparse Weather Station Observations at Kilometer Scales

Authors: Peter Manshausen, Yair Cohen, Jaideep Pathak, Mike Pritchard, Piyush Garg, Morteza Mardani, Karthik Kashinath, Simon Byrne, Noah Brenowitz

Abstract: Data assimilation of observational data into full atmospheric states is essential for weather forecast model initialization. Recently, methods for deep generative data assimilation have been proposed which allow for using new input data without retraining the model. They could also dramatically accelerate the costly data assimilation process used in operational regional weather models. Here, in a central US testbed, we demonstrate the viability of score-based data assimilation in the context of realistically complex km-scale weather. We train an unconditional diffusion model to generate snapshots of a state-of-the-art km-scale analysis product, the High Resolution Rapid Refresh. Then, using score-based data assimilation to incorporate sparse weather station data, the model produces maps of precipitation and surface winds. The generated fields display physically plausible structures, such as gust fronts, and sensitivity tests confirm learnt physics through multivariate relationships. Preliminary skill analysis shows the approach already outperforms a naive baseline of the High-Resolution Rapid Refresh system itself. By incorporating observations from 40 weather stations, 10% lower RMSEs on left-out stations are attained. Despite some lingering imperfections such as insufficiently disperse ensemble DA estimates, we find the results overall an encouraging proof of concept, and the first at km-scale. It is a ripe time to explore extensions that combine increasingly ambitious regional state generators with an increasing set of in situ, ground-based, and satellite remote sensing data streams.

Submitted 19 June, 2024; originally announced June 2024.

Comments: 18 pages, 7 figures

ACM Class: J.2

arXiv:2406.08632 [pdf, other]

Coupled Ocean-Atmosphere Dynamics in a Machine Learning Earth System Model

Authors: Chenggong Wang, Michael S. Pritchard, Noah Brenowitz, Yair Cohen, Boris Bonev, Thorsten Kurth, Dale Durran, Jaideep Pathak

Abstract: Seasonal climate forecasts are socioeconomically important for managing the impacts of extreme weather events and for planning in sectors like agriculture and energy. Climate predictability on seasonal timescales is tied to boundary effects of the ocean on the atmosphere and coupled interactions in the ocean-atmosphere system. We present the Ocean-linked-atmosphere (Ola) model, a high-resolution (0.25°) Artificial Intelligence/Machine Learning (AI/ML) coupled earth-system model which separately models the ocean and atmosphere dynamics using an autoregressive Spherical Fourier Neural Operator architecture, with a view towards enabling fast, accurate, large ensemble forecasts on the seasonal timescale. We find that Ola exhibits learned characteristics of ocean-atmosphere coupled dynamics including tropical oceanic waves with appropriate phase speeds, and an internally generated El Niño/Southern Oscillation (ENSO) having realistic amplitude, geographic structure, and vertical structure within the ocean mixed layer. We present initial evidence of skill in forecasting the ENSO which compares favorably to the SPEAR model of the Geophysical Fluid Dynamics Laboratory.

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2404.06517 [pdf, other]

DiffObs: Generative Diffusion for Global Forecasting of Satellite Observations

Authors: Jason Stock, Jaideep Pathak, Yair Cohen, Mike Pritchard, Piyush Garg, Dale Durran, Morteza Mardani, Noah Brenowitz

Abstract: This work presents an autoregressive generative diffusion model (DiffObs) to predict the global evolution of daily precipitation, trained on a satellite observational product, and assessed with domain-specific diagnostics. The model is trained to probabilistically forecast day-ahead precipitation. Nonetheless, it is stable for multi-month rollouts, which reveal a qualitatively realistic superposition of convectively coupled wave modes in the tropics. Cross-spectral analysis confirms successful generation of low frequency variations associated with the Madden-Julian oscillation, which regulates most subseasonal to seasonal predictability in the observed atmosphere, and convectively coupled moist Kelvin waves with approximately correct dispersion relationships. Despite secondary issues and biases, the results affirm the potential for a next generation of global diffusion models trained on increasingly sparse, and increasingly direct and differentiated observations of the world, for practical applications in subseasonal and climate prediction.

Submitted 4 April, 2024; originally announced April 2024.

Comments: Published as a workshop paper at "Tackling Climate Change with Machine Learning", ICLR 2024

arXiv:2402.03079 [pdf, other]

Improving Atmospheric Processes in Earth System Models with Deep Learning Ensembles and Stochastic Parameterizations

Authors: Gunnar Behrens, Tom Beucler, Fernando Iglesias-Suarez, Sungduk Yu, Pierre Gentine, Michael Pritchard, Mierk Schwabe, Veronika Eyring

Abstract: Deep learning has proven to be a valuable tool to represent subgrid processes in climate models, but most application cases have so far used idealized settings and deterministic approaches. Here, we develop ensemble and stochastic parameterizations with calibrated uncertainty quantification to learn subgrid convective and turbulent processes and surface radiative fluxes of a superparameterization embedded in an Earth System Model (ESM). We explore three methods to construct stochastic parameterizations: 1) a single Deep Neural Network (DNN) with Monte Carlo Dropout; 2) a multi-network ensemble; and 3) a Variational Encoder Decoder with latent space perturbation. We show that the multi-network ensembles improve the representation of convective processes in the planetary boundary layer compared to individual DNNs. The respective uncertainty quantification illustrates that the two latter methods are advantageous compared to a dropout-based DNN ensemble regarding the spread of convective processes. We develop a novel partial coupling strategy to sidestep issues in condensate emulation to evaluate the multi-network parameterizations in online runs coupled to the ESM. We can conduct Earth-like stable runs over more than 5 months with the ensemble approach, while such simulations using individual DNNs fail within days. Moreover, we show that our novel ensemble parameterizations improve the representation of extreme precipitation and the underlying diurnal cycle compared to a traditional parameterization, although faithfully representing the mean precipitation pattern remains challenging. Our results pave the way towards a new generation of parameterizations using machine learning with realistic uncertainty quantification that significantly improve the representation of subgrid effects.

Submitted 5 February, 2024; originally announced February 2024.

Comments: main: 34 pages, 8 figures, 1 table; supporting information: 39 pages, 23 figures, 4 tables; submitted to Journal of Advances in Modeling Earth Systems (JAMES)

arXiv:2401.15305 [pdf, other]

A Practical Probabilistic Benchmark for AI Weather Models

Authors: Noah D. Brenowitz, Yair Cohen, Jaideep Pathak, Ankur Mahesh, Boris Bonev, Thorsten Kurth, Dale R. Durran, Peter Harrington, Michael S. Pritchard

Abstract: Since the weather is chaotic, forecasts aim to predict the distribution of future states rather than make a single prediction. Recently, multiple data-driven weather models have emerged claiming breakthroughs in skill. However, these have mostly been benchmarked using deterministic skill scores, and little is known about their probabilistic skill. Unfortunately, it is hard to fairly compare AI weather models in a probabilistic sense, since variations in choice of ensemble initialization, definition of state, and noise injection methodology become confounding. Moreover, even obtaining ensemble forecast baselines is a substantial engineering challenge given the data volumes involved. We sidestep both problems by applying a decades-old idea, lagged ensembles, whereby an ensemble can be constructed from a moderately-sized library of deterministic forecasts. This allows the first parameter-free intercomparison of leading AI weather models' probabilistic skill against an operational baseline. The results reveal that two leading AI weather models, i.e. GraphCast and Pangu, are tied on the probabilistic CRPS metric even though the former outperforms the latter in deterministic scoring. We also reveal how multiple time-step loss functions, which many data-driven weather models have employed, are counter-productive: they improve deterministic metrics at the cost of increased dissipation, deteriorating probabilistic skill. This is confirmed through ablations applied to a spherical Fourier Neural Operator (SFNO) approach to AI weather forecasting. Separate SFNO ablations modulating effective resolution reveal it has a useful effect on ensemble dispersion relevant to achieving good ensemble calibration. We hope these and forthcoming insights from lagged ensembles can help guide the development of AI weather forecasts and have thus shared the diagnostic code.

Submitted 27 January, 2024; originally announced January 2024.

Comments: 15 pages, 5 figures
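
The lagged-ensemble idea in the abstract above can be sketched in a few lines: deterministic forecasts started at different times but valid at the same time are pooled into an ensemble, which is then scored with CRPS. This is a rough illustration, not the authors' shared diagnostic code. The function names, the dictionary layout keyed by `(init_time, lead_time)` in hours, and the symmetric choice of lags are assumptions; the CRPS expression is the standard ensemble estimator.

```python
def lagged_ensemble(forecasts, valid_time, lead, lags):
    """Collect deterministic forecasts that verify at valid_time.

    forecasts maps (init_time, lead_time) -> forecast value, times in hours.
    Each lag shifts the lead time, so the initialization time shifts the other way.
    """
    members = []
    for lag in lags:
        member_lead = lead + lag
        key = (valid_time - member_lead, member_lead)
        if member_lead >= 0 and key in forecasts:
            members.append(forecasts[key])
    return members


def ensemble_crps(members, observation):
    """Ensemble CRPS estimate: mean |x_i - y| - 0.5 * mean |x_i - x_j|."""
    n = len(members)
    skill = sum(abs(x - observation) for x in members) / n
    spread = sum(abs(a - b) for a in members for b in members) / (n * n)
    return skill - 0.5 * spread


# Toy library of three deterministic forecasts, all valid at hour 48.
library = {(0, 48): 1.0, (12, 36): 2.0, (24, 24): 3.0}
members = lagged_ensemble(library, valid_time=48, lead=36, lags=(-12, 0, 12))
score = ensemble_crps(members, observation=2.0)
```

For a single member the estimator reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of MAE.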

arXiv:2401.02098 [pdf, other]

Stress-testing the coupled behavior of hybrid physics-machine learning climate simulations on an unseen, warmer climate

Authors: Jerry Lin, Mohamed Aziz Bhouri, Tom Beucler, Sungduk Yu, Michael Pritchard

Abstract: Accurate and computationally-viable representations of clouds and turbulence are a long-standing challenge for climate model development. Traditional parameterizations that crudely but efficiently approximate these processes are a leading source of uncertainty in long-term projected warming and precipitation patterns. Machine Learning (ML)-based parameterizations have long been hailed as a promising alternative with the potential to yield higher accuracy at a fraction of the cost of more explicit simulations. However, these ML variants are often unpredictably unstable and inaccurate in coupled testing (i.e. in a downstream hybrid simulation task where they are dynamically interacting with the large-scale climate model). These issues are exacerbated in out-of-distribution climates. Certain design decisions such as "climate-invariant" feature transformation for moisture inputs, input vector expansion, and temporal history incorporation have been shown to improve coupled performance, but they may be insufficient for coupled out-of-distribution generalization. If feature selection and transformations can inoculate hybrid physics-ML climate models from non-physical, out-of-distribution extrapolation in a changing climate, there is far greater potential in extrapolating from observational data. Otherwise, training on multiple simulated climates becomes an inevitable necessity. While our results show generalization benefits from these design decisions, the obtained improvement does not sufficiently preclude the necessity of using multi-climate simulated training data.

Submitted 4 January, 2024; originally announced January 2024.

Comments: 4 Pages, 3 Figures. Accepted to NeurIPS 2023 Climate Change AI Workshop. See https://www.climatechange.ai/papers/neurips2023/62

arXiv:2310.20168 [pdf, other]

Understanding and Visualizing Droplet Distributions in Simulations of Shallow Clouds

Authors: Justus C. Will, Andrea M. Jenney, Kara D. Lamb, Michael S. Pritchard, Colleen Kaul, Po-Lun Ma, Kyle Pressel, Jacob Shpund, Marcus van Lier-Walqui, Stephan Mandt

Abstract: Thorough analysis of local droplet-level interactions is crucial to better understand the microphysical processes in clouds and their effect on the global climate. High-accuracy simulations of relevant droplet size distributions from Large Eddy Simulations (LES) of bin microphysics challenge current analysis techniques due to their high dimensionality involving three spatial dimensions, time, and a continuous range of droplet sizes. Utilizing the compact latent representations from Variational Autoencoders (VAEs), we produce novel and intuitive visualizations for the organization of droplet sizes and their evolution over time beyond what is possible with clustering techniques. This greatly improves interpretation and allows us to examine aerosol-cloud interactions by contrasting simulations with different aerosol concentrations. We find that the evolution of the droplet spectrum is similar across aerosol levels but occurs at different paces. This similarity suggests that precipitation initiation processes are alike despite variations in onset times.

Submitted 31 October, 2023; originally announced October 2023.

Comments: 4 pages, 3 figures, accepted at NeurIPS 2023 (Machine Learning and the Physical Sciences Workshop)

arXiv:2310.02074 [pdf, other]

ACE: A fast, skillful learned global atmospheric model for climate prediction

Authors: Oliver Watt-Meyer, Gideon Dresdner, Jeremy McGibbon, Spencer K. Clark, Brian Henn, James Duncan, Noah D. Brenowitz, Karthik Kashinath, Michael S. Pritchard, Boris Bonev, Matthew E. Peters, Christopher S. Bretherton

Abstract: Existing ML-based atmospheric models are not suitable for climate prediction, which requires long-term stability and physical consistency. We present ACE (AI2 Climate Emulator), a 200M-parameter, autoregressive machine learning emulator of an existing comprehensive 100-km resolution global atmospheric model. The formulation of ACE allows evaluation of physical laws such as the conservation of mass and moisture. The emulator is stable for 100 years, nearly conserves column moisture without explicit constraints and faithfully reproduces the reference model's climate, outperforming a challenging baseline on over 90% of tracked variables. ACE requires nearly 100x less wall clock time and is 100x more energy efficient than the reference model using typically available resources. Without fine-tuning, ACE can stably generalize to a previously unseen historical sea surface temperature dataset.

Submitted 6 December, 2023; v1 submitted 3 October, 2023; originally announced October 2023.

Comments: Accepted at Tackling Climate Change with Machine Learning: workshop at NeurIPS 2023

arXiv:2310.01488 [pdf, other]

Sandwiched planet formation: restricting the mass of a middle planet

Authors: Matthew Pritchard, Farzana Meru, Sahl Rowther, David Armstrong, Kaleb Randall

Abstract: We conduct gas and dust hydrodynamical simulations of protoplanetary discs with one and two embedded planets to determine the impact that a second planet located further out in the disc has on the potential for subsequent planet formation in the region locally exterior to the inner planet. We show how the presence of a second planet has a strong influence on the collection of solid material near the inner planet, particularly when the outer planet is massive enough to generate a maximum in the disc's pressure profile. This effect in general acts to reduce the amount of material that can collect in a pressure bump generated by the inner planet. When viewing the inner pressure bump as a location for potential subsequent planet formation of a third planet, we therefore expect that the mass of such a planet will be smaller than it would be in the case without the outer planet, resulting in a small planet being sandwiched between its neighbours; this is in contrast to the expected trend of increasing planet mass with radial distance from the host star. We show that several planetary systems have been observed that do not show this trend but instead have a smaller planet sandwiched in between two more massive planets. We present the idea that such an architecture could be the result of the subsequent formation of a middle planet after its two neighbours formed at some earlier stage.

Submitted 9 October, 2023; v1 submitted 2 October, 2023; originally announced October 2023.

Comments: 13 pages, 10 figures, accepted by MNRAS (in press)

arXiv:2309.16177 [pdf, other]

Sampling Hybrid Climate Simulation at Scale to Reliably Improve Machine Learning Parameterization

Authors: Jerry Lin, Sungduk Yu, Liran Peng, Tom Beucler, Eliot Wong-Toi, Zeyuan Hu, Pierre Gentine, Margarita Geleta, Mike Pritchard

Abstract: Machine-learning (ML) parameterizations of subgrid processes (here of turbulence, convection, and radiation) may one day replace conventional parameterizations by emulating high-resolution physics without the cost of explicit simulation. However, their development has been stymied by uncertainty surrounding whether or not improved offline performance translates to improved online performance (i.e., when coupled to a large-scale general circulation model (GCM)). A key barrier has been the limited sampling of the online effects of the ML design decisions and tuning due to the complexity of performing large ensembles of hybrid physics-ML climate simulations. Our work examines the coupled behavior of full-physics ML parameterizations using large ensembles of hybrid simulations, totalling 2,970 in our case. With extensive sampling, we statistically confirm that lowering offline error lowers online error (given certain constraints). However, we also reveal that decisions decreasing online error, like removing dropout, can trade off against hybrid model stability and vice versa. Nevertheless, we are able to identify design decisions that yield unambiguous improvements to offline and online performance, namely incorporating memory and training on multiple climates. We also find that converting moisture input from specific to relative humidity enhances online stability and that using a Mean Absolute Error (MAE) loss breaks the aforementioned offline/online error relationship. By enabling rapid online experimentation at scale, we empirically answer previously unresolved questions regarding subgrid ML parameterization design.

Submitted 4 July, 2024; v1 submitted 28 September, 2023; originally announced September 2023.

Comments: 16 pages, 4 figures
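
The specific-to-relative-humidity input conversion mentioned in the abstract above can be sketched as follows. This is an illustration under stated assumptions: saturation is taken over liquid water using the Bolton (1980) fit, and the function names are hypothetical. The transform actually used in the paper may differ, for example by treating saturation over ice at cold temperatures.

```python
import math

EPSILON = 0.622  # ratio of the gas constants of dry air and water vapor


def saturation_vapor_pressure(temperature):
    """Saturation vapor pressure over liquid water in Pa; temperature in K.

    Uses the Bolton (1980) empirical fit, an assumption made for this sketch.
    """
    t_c = temperature - 273.15
    return 611.2 * math.exp(17.67 * t_c / (t_c + 243.5))


def relative_humidity(q, temperature, pressure):
    """Convert specific humidity q (kg/kg) to relative humidity (0 to 1).

    pressure is the total air pressure in Pa.
    """
    # Invert q = EPSILON * e / (p - (1 - EPSILON) * e) for the vapor pressure e.
    vapor_pressure = q * pressure / (EPSILON + (1.0 - EPSILON) * q)
    return vapor_pressure / saturation_vapor_pressure(temperature)
```

The appeal of this input is that relative humidity stays in a similar range as the climate warms, whereas specific humidity grows with temperature and can leave the training distribution.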

arXiv:2309.15214 [pdf, other]

Residual Diffusion Modeling for Km-scale Atmospheric Downscaling

Authors: Morteza Mardani, Noah Brenowitz, Yair Cohen, Jaideep Pathak, Chieh-Yu Chen, Cheng-Chin Liu, Arash Vahdat, Karthik Kashinath, Jan Kautz, Mike Pritchard

Abstract: Predictions of weather hazard require expensive km-scale simulations driven by coarser global inputs. Here, a cost-effective stochastic downscaling model is trained from a high-resolution 2-km weather model over Taiwan conditioned on 25-km ERA5 reanalysis. To address the multi-scale machine learning challenges of weather data, we employ a two-step approach, Corrector Diffusion (CorrDiff), where a UNet prediction of the mean is corrected by a diffusion step. Akin to Reynolds decomposition in fluid dynamics, this isolates generative learning to the stochastic scales. CorrDiff exhibits skillful RMSE and CRPS and faithfully recovers spectra and distributions even for extremes. Case studies of coherent weather phenomena reveal appropriate multivariate relationships reminiscent of learnt physics: the collocation of intense rainfall and sharp gradients in fronts and extreme winds and rainfall bands near the eyewall of typhoons. Downscaling global forecasts successfully retains many of these benefits, foreshadowing the potential of end-to-end, global-to-km-scales machine learning weather predictions.

Submitted 9 December, 2023; v1 submitted 24 September, 2023; originally announced September 2023.
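
The two-step structure described in the abstract above (a deterministic prediction of the mean, plus a generative correction learned only on what the mean misses) can be shown with a deliberately tiny stand-in. In this sketch a least-squares line replaces the UNet and bootstrap resampling of training residuals replaces the diffusion model. Both substitutions, and all function names, are assumptions made for illustration and are not the CorrDiff implementation.

```python
import random


def fit_mean(xs, ys):
    """Step 1 stand-in: least-squares line as the deterministic mean predictor."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope /= sum((x - mean_x) ** 2 for x in xs)
    return slope, mean_y - slope * mean_x


def train(xs, ys):
    """Fit the mean, then keep the residuals as the generative step's training set."""
    slope, intercept = fit_mean(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    return slope, intercept, residuals


def sample(model, x, rng):
    """Step 2 stand-in: mean prediction plus one resampled residual."""
    slope, intercept, residuals = model
    return slope * x + intercept + rng.choice(residuals)


model = train([0.0, 1.0, 2.0, 3.0], [0.1, 0.9, 2.1, 2.9])
draw = sample(model, 1.5, random.Random(0))
```

Because the stochastic step only ever sees residuals, it models the spread around the mean rather than the full signal, which is the sense of the Reynolds-decomposition analogy in the abstract.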

arXiv:2309.10231 [pdf, other]

Multi-fidelity climate model parameterization for better generalization and extrapolation

Authors: Mohamed Aziz Bhouri, Liran Peng, Michael S. Pritchard, Pierre Gentine

Abstract: Machine-learning-based parameterizations (i.e. representation of sub-grid processes) of global climate models or turbulent simulations have recently been proposed as a powerful alternative to physical, but empirical, representations, offering a lower computational cost and higher accuracy. Yet, those approaches still suffer from a lack of generalization and extrapolation beyond the training data, which is however critical to projecting climate change or unobserved regimes of turbulence. Here we show that a multi-fidelity approach, which integrates datasets of different accuracy and abundance, can provide the best of both worlds: the capacity to extrapolate leveraging the physically-based parameterization and a higher accuracy using the machine-learning-based parameterizations. In an application to climate modeling, the multi-fidelity framework yields more accurate climate projections without requiring major increase in computational resources. Our multi-fidelity randomized prior networks (MF-RPNs) combine physical parameterization data as low-fidelity and storm-resolving historical run's data as high-fidelity. To extrapolate beyond the training data, the MF-RPNs are tested on high-fidelity data from a +4 K warming scenario. We show the MF-RPN's capacity to return much more skillful predictions compared to either low- or high-fidelity (historical data) simulations trained only on one regime while providing trustworthy uncertainty quantification across a wide range of scenarios. Our approach paves the way for the use of machine-learning based methods that can optimally leverage historical observations or high-fidelity simulations and extrapolate to unseen regimes such as climate change.

Submitted 18 September, 2023; originally announced September 2023.

Comments: 27 pages, 16 figures

arXiv:2306.08754 [pdf, other]

ClimSim-Online: A Large Multi-scale Dataset and Framework for Hybrid ML-physics Climate Emulation

Authors: Sungduk Yu, Zeyuan Hu, Akshay Subramaniam, Walter Hannah, Liran Peng, Jerry Lin, Mohamed Aziz Bhouri, Ritwik Gupta, Björn Lütjens, Justus C. Will, Gunnar Behrens, Julius J. M. Busecke, Nora Loose, Charles I. Stern, Tom Beucler, Bryce Harrop, Helge Heuer, Benjamin R. Hillman, Andrea Jenney, Nana Liu, Alistair White, Tian Zheng, Zhiming Kuang, Fiaz Ahmed, Elizabeth Barnes, et al. (22 additional authors not shown)

Abstract: Modern climate projections lack adequate spatial and temporal resolution due to computational constraints, leading to inaccuracies in representing critical processes like thunderstorms that occur on the sub-resolution scale. Hybrid methods combining physics with machine learning (ML) offer faster, higher fidelity climate simulations by outsourcing compute-hungry, high-resolution simulations to ML emulators. However, these hybrid ML-physics simulations require domain-specific data and workflows that have been inaccessible to many ML experts. As an extension of the ClimSim dataset (Yu et al., 2024), we present ClimSim-Online, which also includes an end-to-end workflow for developing hybrid ML-physics simulators. The ClimSim dataset includes 5.7 billion pairs of multivariate input/output vectors, capturing the influence of high-resolution, high-fidelity physics on a host climate simulator's macro-scale state. The dataset is global and spans ten years at a high sampling frequency. We provide a cross-platform, containerized pipeline to integrate ML models into operational climate simulators for hybrid testing. We also implement various ML baselines, alongside a hybrid baseline simulator, to highlight the ML challenges of building stable, skillful emulators. The data (https://huggingface.co/datasets/LEAP/ClimSim_high-res) and code (https://leap-stc.github.io/ClimSim and https://github.com/leap-stc/climsim-online) are publicly released to support the development of hybrid ML-physics and high-fidelity climate simulations.

Submitted 8 July, 2024; v1 submitted 14 June, 2023; originally announced June 2023.

Comments: This manuscript is an expanded version of our paper that received the Outstanding Paper Award at the NeurIPS 2023 conference

arXiv:2304.12952 [pdf]

doi 10.1029/2023JD039202

Causally-informed deep learning to improve climate models and projections

Authors: Fernando Iglesias-Suarez, Pierre Gentine, Breixo Solino-Fernandez, Tom Beucler, Michael Pritchard, Jakob Runge, Veronika Eyring

Abstract: Climate models are essential to understand and project climate change, yet long-standing biases and uncertainties in their projections remain. This is largely associated with the representation of subgrid-scale processes, particularly clouds and convection. Deep learning can learn these subgrid-scale processes from computationally expensive storm-resolving models while retaining many features at a fraction of computational cost. Yet, climate simulations with embedded neural network parameterizations are still challenging and highly depend on the deep learning solution. This is likely associated with spurious non-physical correlations learned by the neural networks due to the complexity of the physical dynamical system. Here, we show that the combination of causality with deep learning helps removing spurious correlations and optimizing the neural network algorithm. To resolve this, we apply a causal discovery method to unveil causal drivers in the set of input predictors of atmospheric subgrid-scale processes of a superparameterized climate model in which deep convection is explicitly resolved. The resulting causally-informed neural networks are coupled to the climate model, hence, replacing the superparameterization and radiation scheme. We show that the climate simulations with causally-informed neural network parameterizations retain many convection-related properties and accurately generate the climate of the original high-resolution climate model, while retaining similar generalization capabilities to unseen climates compared to the non-causal approach. The combination of causal discovery and deep learning is a new and promising approach that leads to stable and more trustworthy climate simulations and paves the way towards more physically-based causal deep learning approaches also in other scientific disciplines.

Submitted 20 March, 2024; v1 submitted 25 April, 2023; originally announced April 2023.

Journal ref: Journal of Geophysical Research: Atmospheres, 129, e2023JD039202

arXiv:2303.17064 [pdf, other]

Improving stratocumulus cloud amounts in a 200-m resolution multi-scale modeling framework through tuning of its interior physics

Authors: Liran Peng, Peter N. Blossey, Walter M. Hannah, Christopher S. Bretherton, Christopher R. Terai, Andrea M. Jenney, Michael Pritchard

Abstract: High-Resolution Multi-scale Modeling Frameworks (HR) -- global climate models that embed separate, convection-resolving models with high enough resolution to resolve boundary layer eddies -- have exciting potential for investigating low cloud feedback dynamics due to reduced parameterization and ability for multidecadal throughput on modern computing hardware. However low clouds in past HR have suffered a stubborn problem of over-entrainment due to an uncontrolled source of mixing across the marine subtropical inversion manifesting as stratocumulus dim biases in present-day climate, limiting their scientific utility. We report new results showing that this over-entrainment can be partly offset by using hyperviscosity and cloud droplet sedimentation. Hyperviscosity damps small-scale momentum fluctuations associated with the formulation of the momentum solver of the embedded LES. By considering the sedimentation process adjacent to default one-moment microphysics in HR, condensed phase particles can be removed from the entrainment zone, which further reduces entrainment efficiency. The result is an HR that is able to produce more low clouds with a higher liquid water path and a reduced stratocumulus dim bias. Associated improvements in the explicitly simulated sub-cloud eddy spectrum are observed. We report these sensitivities in multi-week tests and then explore their operational potential alongside microphysical retuning in decadal simulations at operational 1.5 degree exterior resolution. The result is a new HR having desired improvements in the baseline present-day low cloud climatology, and a reduced global mean bias and root mean squared error of absorbed shortwave radiation. We suggest it should be promising for examining low cloud feedbacks with minimal approximation.

Submitted 16 October, 2023; v1 submitted 29 March, 2023; originally announced March 2023.

arXiv:2302.03845 [pdf, other]

doi 10.1175/AIES-D-23-0013.1

Two-step hyperparameter optimization method: Accelerating hyperparameter search by using a fraction of a training dataset

Authors: Sungduk Yu, Mike Pritchard, Po-Lun Ma, Balwinder Singh, Sam Silva

Abstract: Hyperparameter optimization (HPO) is an important step in machine learning (ML) model development, but common practices are archaic -- primarily relying on manual or grid searches. This is partly because adopting advanced HPO algorithms introduces added complexity to the workflow, leading to longer computation times. This poses a notable challenge to ML applications, as suboptimal hyperparameter selections curtail the potential of ML model performance, ultimately obstructing the full exploitation of ML techniques. In this article, we present a two-step HPO method as a strategic solution to curbing computational demands and wait times, gleaned from practical experiences in applied ML parameterization work. The initial phase involves a preliminary evaluation of hyperparameters on a small subset of the training dataset, followed by a re-evaluation of the top-performing candidate models post-retraining with the entire training dataset. This two-step HPO method is universally applicable across HPO search algorithms, and we argue it has attractive efficiency gains. As a case study, we present our recent application of the two-step HPO method to the development of neural network emulators for aerosol activation. Although our primary use case is a data-rich limit with many millions of samples, we also find that using up to 0.0025% of the data (a few thousand samples) in the initial step is sufficient to find optimal hyperparameter configurations from much more extensive sampling, achieving up to 135-times speedup. The benefits of this method materialize through an assessment of hyperparameters and model performance, revealing the minimal model complexity required to achieve the best performance. The assortment of top-performing models harvested from the HPO process allows us to choose a high-performing model with a low inference cost for efficient use in global climate models (GCMs).

Submitted 7 September, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

Journal ref: Artificial Intelligence for the Earth Systems, 3(1), 2024, e230013

arXiv:2301.00496 [pdf, other]

Using Neural Networks to Learn the Jet Stream Forced Response from Natural Variability

Authors: Charlotte Connolly, Elizabeth A. Barnes, Pedram Hassanzadeh, Mike Pritchard

Abstract: Two distinct features of anthropogenic climate change, warming in the tropical upper troposphere and warming at the Arctic surface, have competing effects on the mid-latitude jet stream's latitudinal position, often referred to as a "tug-of-war". Studies that investigate the jet's response to these thermal forcings show that it is sensitive to model type, season, initial atmospheric conditions, and the shape and magnitude of the forcing. Much of this past work focuses on studying a simulation's response to external manipulation. In contrast, we explore the potential to train a convolutional neural network (CNN) on internal variability alone and then use it to examine possible nonlinear responses of the jet to tropospheric thermal forcing that more closely resemble anthropogenic climate change. Our approach leverages the idea behind the fluctuation-dissipation theorem, which relates the internal variability of a system to its forced response but so far has been only used to quantify linear responses. We train a CNN on data from a long control run of the CESM dry dynamical core and show that it is able to skillfully predict the nonlinear response of the jet to sustained external forcing. The trained CNN provides a quick method for exploring the jet stream sensitivity to a wide range of tropospheric temperature tendencies and, considering that this method can likely be applied to any model with a long control run, could lend itself useful for early stage experiment design.

Submitted 1 January, 2023; originally announced January 2023.

Comments: 24 pages, 9 figures, submitted for consideration for publication in Artificial Intelligence for the Earth Systems

arXiv:2211.01613 [pdf, other]

Understanding Extreme Precipitation Changes through Unsupervised Machine Learning

Authors: Griffin Mooers, Tom Beucler, Mike Pritchard, Stephan Mandt

Abstract: Despite the importance of quantifying how the spatial patterns of extreme precipitation will change with warming, we lack tools to objectively analyze the storm-scale outputs of modern climate models. To address this gap, we develop an unsupervised machine learning framework to quantify how storm dynamics affect changes in precipitation extremes, without sacrificing spatial information. For the upper precipitation quantiles (above the 80th percentile), we find that the spatial patterns of extreme precipitation changes are dominated by spatial shifts in storm dynamical regimes rather than changes in how these storm regimes produce precipitation. Our study shows how unsupervised machine learning, paired with domain knowledge, may allow us to better understand the physics of the atmosphere and anticipate the changes associated with a warming world.

Submitted 1 December, 2023; v1 submitted 3 November, 2022; originally announced November 2022.

Comments: 16 Pages, 9 Figures, Under Revisions at the Journal of Environmental Data Sciences

arXiv:2208.11843 [pdf, other]

Comparing Storm Resolving Models and Climates via Unsupervised Machine Learning

Authors: Griffin Mooers, Mike Pritchard, Tom Beucler, Prakhar Srivastava, Harshini Mangipudi, Liran Peng, Pierre Gentine, Stephan Mandt

Abstract: Global Storm-Resolving Models (GSRMs) have gained widespread interest because of the unprecedented detail with which they resolve the global climate. However, it remains difficult to quantify objective differences in how GSRMs resolve complex atmospheric formations. This lack of comprehensive tools for comparing model similarities is a problem in many disparate fields that involve simulation tools for complex data. To address this challenge we develop methods to estimate distributional distances based on both nonlinear dimensionality reduction and vector quantization. Our approach automatically learns physically meaningful notions of similarity from low-dimensional latent data representations that the different models produce. This enables an intercomparison of nine GSRMs based on their high-dimensional simulation data (2D vertical velocity snapshots) and reveals that only six are similar in their representation of atmospheric dynamics. Furthermore, we uncover signatures of the convective response to global warming in a fully unsupervised way. Our study provides a path toward evaluating future high-resolution simulation data more objectively.

Submitted 2 December, 2023; v1 submitted 24 August, 2022; originally announced August 2022.

Comments: 26 pages, 21 figures. In revision at Scientific Reports

arXiv:2204.08708 [pdf, other]

doi 10.1029/2022MS003130

Non-Linear Dimensionality Reduction with a Variational Encoder Decoder to Understand Convective Processes in Climate Models

Authors: Gunnar Behrens, Tom Beucler, Pierre Gentine, Fernando Iglesias-Suarez, Michael Pritchard, Veronika Eyring

Abstract: Deep learning can accurately represent sub-grid-scale convective processes in climate models, learning from high resolution simulations. However, deep learning methods usually lack interpretability due to large internal dimensionality, resulting in reduced trustworthiness in these methods. Here, we use Variational Encoder Decoder structures (VED), a non-linear dimensionality reduction technique, to learn and understand convective processes in an aquaplanet superparameterized climate model simulation, where deep convective processes are simulated explicitly. We show that similar to previous deep learning studies based on feed-forward neural nets, the VED is capable of learning and accurately reproducing convective processes. In contrast to past work, we show this can be achieved by compressing the original information into only five latent nodes. As a result, the VED can be used to understand convective processes and delineate modes of convection through the exploration of its latent dimensions. A close investigation of the latent space enables the identification of different convective regimes: a) stable conditions are clearly distinguished from deep convection with low outgoing longwave radiation and strong precipitation; b) high optically thin cirrus-like clouds are separated from low optically thick cumulus clouds; and c) shallow convective processes are associated with large-scale moisture content and surface diabatic heating. Our results demonstrate that VEDs can accurately represent convective processes in climate models, while enabling interpretability and better understanding of sub-grid-scale physical processes, paving the way to increasingly interpretable machine learning parameterizations with promising generative properties.

Submitted 26 July, 2022; v1 submitted 19 April, 2022; originally announced April 2022.

Comments: main paper: 30 pages, 11 figures; supporting informations: 37 pages, 19 figures, 11 tables; Submitted to 'Journal of Advances in Modeling Earth Systems' (JAMES)

arXiv:2112.08440 [pdf, other]

Climate-Invariant Machine Learning

Authors: Tom Beucler, Pierre Gentine, Janni Yuval, Ankitesh Gupta, Liran Peng, Jerry Lin, Sungduk Yu, Stephan Rasp, Fiaz Ahmed, Paul A. O'Gorman, J. David Neelin, Nicholas J. Lutsko, Michael Pritchard

Abstract: Projecting climate change is a generalization problem: we extrapolate the recent past using physical models across past, present, and future climates. Current climate models require representations of processes that occur at scales smaller than model grid size, which have been the main source of model projection uncertainty. Recent machine learning (ML) algorithms hold promise to improve such process representations, but tend to extrapolate poorly to climate regimes they were not trained on. To get the best of the physical and statistical worlds, we propose a new framework - termed "climate-invariant" ML - incorporating knowledge of climate processes into ML algorithms, and show that it can maintain high offline accuracy across a wide range of climate conditions and configurations in three distinct atmospheric models. Our results suggest that explicitly incorporating physical knowledge into data-driven models of Earth system processes can improve their consistency, data efficiency, and generalizability across climate regimes.

Submitted 17 January, 2024; v1 submitted 14 December, 2021; originally announced December 2021.

Comments: 26+28 pages, 9+15 figures, 0+3 tables in the main text + supplementary materials. Accepted for publication in Science Advances on Jan 5, 2024

arXiv:2112.01221 [pdf, other]

Analyzing High-Resolution Clouds and Convection using Multi-Channel VAEs

Authors: Harshini Mangipudi, Griffin Mooers, Mike Pritchard, Tom Beucler, Stephan Mandt

Abstract: Understanding the details of small-scale convection and storm formation is crucial to accurately represent the larger-scale planetary dynamics. Presently, atmospheric scientists run high-resolution, storm-resolving simulations to capture these kilometer-scale weather details. However, because they contain abundant information, these simulations can be overwhelming to analyze using conventional approaches. This paper takes a data-driven approach and jointly embeds spatial arrays of vertical wind velocities, temperatures, and water vapor information as three "channels" of a VAE architecture. Our "multi-channel VAE" results in more interpretable and robust latent structures than earlier work analyzing vertical velocities in isolation. Analyzing and clustering the VAE's latent space identifies weather patterns and their geographical manifestations in a fully unsupervised fashion. Our approach shows that VAEs can play essential roles in analyzing high-dimensional simulation data and extracting critical weather and climate characteristics.

Submitted 1 December, 2021; originally announced December 2021.

Comments: 4 Pages, 3 Figures. Accepted to NeurIPS 2021 (Machine Learning and Physical Sciences)

arXiv:2010.12996 [pdf, other]

doi 10.1029/2020MS002385

Assessing the Potential of Deep Learning for Emulating Cloud Superparameterization in Climate Models with Real-Geography Boundary Conditions

Authors: Griffin Mooers, Mike Pritchard, Tom Beucler, Jordan Ott, Galen Yacalis, Pierre Baldi, Pierre Gentine

Abstract: We explore the potential of feed-forward deep neural networks (DNNs) for emulating cloud superparameterization in realistic geography, using offline fits to data from the Super Parameterized Community Atmospheric Model. To identify the network architecture of greatest skill, we formally optimize hyperparameters using ~250 trials. Our DNN explains over 70 percent of the temporal variance at the 15-minute sampling scale throughout the mid-to-upper troposphere. Autocorrelation timescale analysis compared against DNN skill suggests the less good fit in the tropical, marine boundary layer is driven by neural network difficulty emulating fast, stochastic signals in convection. However, spectral analysis in the temporal domain indicates skillful emulation of signals on diurnal to synoptic scales. A close look at the diurnal cycle reveals correct emulation of land-sea contrasts and vertical structure in the heating and moistening fields, but some distortion of precipitation. Sensitivity tests targeting precipitation skill reveal complementary effects of adding positive constraints vs. hyperparameter tuning, motivating the use of both in the future. A first attempt to force an offline land model with DNN emulated atmospheric fields produces reassuring results further supporting neural network emulation viability in real-geography settings. Overall, the fit skill is competitive with recent attempts by sophisticated Residual and Convolutional Neural Network architectures trained on added information, including memory of past states. Our results confirm the parameterizability of superparameterized convection with continents through machine learning and we highlight advantages of casting this problem locally in space and time for accurate emulation and hopefully quick implementation of hybrid climate models.

Submitted 20 April, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

Comments: 32 Pages, 13 Figures, Revised Version Submitted to Journal of Advances in Modeling Earth Systems April 2021

arXiv:2007.01444 [pdf, other]

Generative Modeling for Atmospheric Convection

Authors: Griffin Mooers, Jens Tuyls, Stephan Mandt, Michael Pritchard, Tom Beucler

Abstract: While cloud-resolving models can explicitly simulate the details of small-scale storm formation and morphology, these details are often ignored by climate models for lack of computational resources. Here, we explore the potential of generative modeling to cheaply recreate small-scale storms by designing and implementing a Variational Autoencoder (VAE) that performs structural replication, dimensionality reduction, and clustering of high-resolution vertical velocity fields. Trained on ~6*10^6 samples spanning the globe, the VAE successfully reconstructs the spatial structure of convection, performs unsupervised clustering of convective organization regimes, and identifies anomalous storm activity, confirming the potential of generative modeling to power stochastic parameterizations of convection in climate models.

Submitted 24 October, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

Comments: 8 Pages, 6 Figures. Accepted into ACM International Conference Proceedings Series

arXiv:2007.00239 [pdf]

doi 10.1038/s41558-020-00963-x

Zonally opposing shifts of the intertropical convergence zone in response to climate change

Authors: Antonios Mamalakis, James T. Randerson, Jin-Yi Yu, Michael S. Pritchard, Gudrun Magnusdottir, Padhraic Smyth, Paul A. Levine, Sungduk Yu, Efi Foufoula-Georgiou

Abstract: Future changes in the location of the intertropical convergence zone (ITCZ) due to climate change are of high interest since they could substantially alter precipitation patterns in the tropics and subtropics. Although models predict a future narrowing of the ITCZ during the 21st century in response to climate warming, uncertainties remain large regarding its future position, with most past work focusing on the zonal-mean ITCZ shifts. Here we use projections from 27 state-of-the-art climate models (CMIP6) to investigate future changes in ITCZ location as a function of longitude and season, in response to climate warming. We document a robust zonally opposing response of the ITCZ, with a northward shift over eastern Africa and the Indian Ocean, and a southward shift in the eastern Pacific and Atlantic Ocean by 2100, for the SSP3-7.0 scenario. Using a two-dimensional energetics framework, we find that the revealed ITCZ response is consistent with future changes in the divergent atmospheric energy transport over the tropics, and sector-mean shifts of the energy flux equator (EFE). The changes in the EFE appear to be the result of zonally opposing imbalances in the hemispheric atmospheric heating over the two sectors, consisting of increases in atmospheric heating over Eurasia and cooling over the Southern Ocean, which contrast with atmospheric cooling over the North Atlantic Ocean due to a model-projected weakening of the Atlantic meridional overturning circulation.

Submitted 1 July, 2020; originally announced July 2020.

Journal ref: Nature Climate Change 2021

arXiv:2004.10652 [pdf, other]

A Fortran-Keras Deep Learning Bridge for Scientific Computing

Authors: Jordan Ott, Mike Pritchard, Natalie Best, Erik Linstead, Milan Curcic, Pierre Baldi

Abstract: Implementing artificial neural networks is commonly achieved via high-level programming languages like Python and easy-to-use deep learning libraries like Keras. These software libraries come pre-loaded with a variety of network architectures, provide autodifferentiation, and support GPUs for fast and efficient computation. As a result, a deep learning practitioner will favor training a neural network model in Python, where these tools are readily available. However, many large-scale scientific computation projects are written in Fortran, making it difficult to integrate with modern deep learning methods. To alleviate this problem, we introduce a software library, the Fortran-Keras Bridge (FKB). This two-way bridge connects environments where deep learning resources are plentiful, with those where they are scarce. The paper describes several unique features offered by FKB, such as customizable layers, loss functions, and network ensembles. The paper concludes with a case study that applies FKB to address open questions about the robustness of an experimental approach to global climate simulation, in which subgrid physics are outsourced to deep neural network emulators. In this context, FKB enables a hyperparameter search of one hundred plus candidate models of subgrid cloud and radiation physics, initially implemented in Keras, to be transferred and used in Fortran. Such a process allows the model's emergent behavior to be assessed, i.e. when fit imperfections are coupled to explicit planetary-scale fluid dynamics. The results reveal a previously unrecognized strong relationship between offline validation error and online performance, in which the choice of optimizer proves unexpectedly critical. This reveals many neural network architectures that produce considerable improvements in stability including some with reduced error, for an especially challenging training dataset.

Submitted 3 August, 2020; v1 submitted 14 April, 2020; originally announced April 2020.

arXiv:2003.06549 [pdf, other]

doi 10.1175/JAS-D-20-0082.1

Interpreting and Stabilizing Machine-learning Parametrizations of Convection

Authors: Noah D. Brenowitz, Tom Beucler, Michael Pritchard, Christopher S. Bretherton

Abstract: Neural networks are a promising technique for parameterizing sub-grid-scale physics (e.g. moist atmospheric convection) in coarse-resolution climate models, but their lack of interpretability and reliability prevents widespread adoption. For instance, it is not fully understood why neural network parameterizations often cause dramatic instability when coupled to atmospheric fluid dynamics. This paper introduces tools for interpreting their behavior that are customized to the parameterization task. First, we assess the nonlinear sensitivity of a neural network to lower-tropospheric stability and the mid-tropospheric moisture, two widely-studied controls of moist convection. Second, we couple the linearized response functions of these neural networks to simplified gravity-wave dynamics, and analytically diagnose the corresponding phase speeds, growth rates, wavelengths, and spatial structures. To demonstrate their versatility, these techniques are tested on two sets of neural networks, one trained with a super-parametrized version of the Community Atmosphere Model (SPCAM) and the second with a near-global cloud-resolving model (GCRM). Even though the SPCAM simulation has a warmer climate than the cloud-resolving model, both neural networks predict stronger heating/drying in moist and unstable environments, which is consistent with observations. Moreover, the spectral analysis can predict that instability occurs when GCMs are coupled to networks that support gravity waves that are unstable and have phase speeds larger than 5 m/s. In contrast, standing unstable modes do not cause catastrophic instability. Using these tools, differences between the SPCAM- vs. GCRM- trained neural networks are analyzed, and strategies to incrementally improve both of their coupled online performance unveiled.

Submitted 14 March, 2020; originally announced March 2020.

arXiv:2002.08525 [pdf, other]

Towards Physically-consistent, Data-driven Models of Convection

Authors: Tom Beucler, Michael Pritchard, Pierre Gentine, Stephan Rasp

Abstract: Data-driven algorithms, in particular neural networks, can emulate the effect of sub-grid scale processes in coarse-resolution climate models if trained on high-resolution climate simulations. However, they may violate key physical constraints and lack the ability to generalize outside of their training set. Here, we show that physical constraints can be enforced in neural networks, either approximately by adapting the loss function or to within machine precision by adapting the architecture. As these physical constraints are insufficient to guarantee generalizability, we additionally propose to physically rescale the training and validation data to improve the ability of neural networks to generalize to unseen climates.

Submitted 17 April, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

Comments: Accepted for oral presentation at the 2020 IEEE International Geoscience and Remote Sensing Symposium (IGARSS) 5 pages, 5 figures, 1 table

arXiv:1909.00912 [pdf, other]

doi 10.1103/PhysRevLett.126.098302

Enforcing Analytic Constraints in Neural-Networks Emulating Physical Systems

Authors: Tom Beucler, Michael Pritchard, Stephan Rasp, Jordan Ott, Pierre Baldi, Pierre Gentine

Abstract: Neural networks can emulate nonlinear physical systems with high accuracy, yet they may produce physically-inconsistent results when violating fundamental constraints. Here, we introduce a systematic way of enforcing nonlinear analytic constraints in neural networks via constraints in the architecture or the loss function. Applied to convective processes for climate modeling, architectural constraints enforce conservation laws to within machine precision without degrading performance. Enforcing constraints also reduces errors in the subsets of the outputs most impacted by the constraints.

Submitted 27 January, 2021; v1 submitted 2 September, 2019; originally announced September 2019.

Comments: 21 pages, 11 figures, 9 tables. Submitted to Physical Review Letters

Journal ref: Phys. Rev. Lett. 126, 098302 (2021)

arXiv:1908.03764 [pdf, other]

doi 10.1029/2019GL084130

Comparing Convective Self-Aggregation in Idealized Models to Observed Moist Static Energy Variability near the Equator

Authors: Tom Beucler, Tristan Abbott, Timothy Cronin, Michael Pritchard

Abstract: Idealized convection-permitting simulations of radiative-convective equilibrium (RCE) have become a popular tool for understanding the physical processes leading to horizontal variability of tropical water vapor and rainfall. However, the applicability of idealized simulations to nature is still unclear given that important processes are typically neglected, such as lateral vapor advection by extratropical intrusions, or interactive ocean coupling. Here, we exploit spectral analysis to compactly summarize the multi-scale processes supporting convective aggregation. By applying this framework to high-resolution reanalysis data and satellite observations in addition to idealized simulations, we compare convective-aggregation processes across horizontal scales and data sets. The results affirm the validity of the RCE simulations as an analogy to the real world. Column moist static energy tendencies share similar signs and scale-selectivity in convection-permitting models and observations: Radiation increases variance at wavelengths above 1,000km, while advection damps variance across wavelengths, and surface fluxes mostly reduce variance between 1,000km and 10,000km.

Submitted 10 August, 2019; originally announced August 2019.

Comments: 15 pages, 3 figures, Submitted to "Geophysical Research Letters"

Journal ref: Geophysical Research Letters, 46 (2019)

arXiv:1906.06622 [pdf, other]

Achieving Conservation of Energy in Neural Network Emulators for Climate Modeling

Authors: Tom Beucler, Stephan Rasp, Michael Pritchard, Pierre Gentine

Abstract: Artificial neural-networks have the potential to emulate cloud processes with higher accuracy than the semi-empirical emulators currently used in climate models. However, neural-network models do not intrinsically conserve energy and mass, which is an obstacle to using them for long-term climate predictions. Here, we propose two methods to enforce linear conservation laws in neural-network emulators of physical models: Constraining (1) the loss function or (2) the architecture of the network itself. Applied to the emulation of explicitly-resolved cloud processes in a prototype multi-scale climate model, we show that architecture constraints can enforce conservation laws to satisfactory numerical precision, while all constraints help the neural-network better generalize to conditions outside of its training set, such as global warming.

Submitted 15 June, 2019; originally announced June 2019.

Comments: ICML 2019 Workshop. Climate Change: How Can AI Help? 3 pages, 3 figures, 1 table

arXiv:1811.06162 [pdf, other]

Plan Interdiction Games

Authors: Yevgeniy Vorobeychik, Michael Pritchard

Abstract: We propose a framework for cyber risk assessment and mitigation which models attackers as formal planners and defenders as interdicting such plans. We illustrate the value of plan interdiction problems by first modeling network cyber risk through the use of formal planning, and subsequently formalizing an important question of prioritizing vulnerabilities for patching in the plan interdiction framework. In particular, we show that selectively patching relatively few vulnerabilities allows a network administrator to significantly reduce exposure to cyber risk. More broadly, we have developed a number of scalable approaches for plan interdiction problems, making especially significant advances when attack plans involve uncertainty about system dynamics. However, important open problems remain, including how to effectively capture information asymmetry between the attacker and defender, how to best model dynamics in the attacker-defender interaction, and how to develop scalable algorithms for solving associated plan interdiction games.

Submitted 14 November, 2018; originally announced November 2018.

arXiv:1806.04731 [pdf, other]

doi 10.1073/pnas.1810286115

Deep learning to represent sub-grid processes in climate models

Authors: Stephan Rasp, Michael S. Pritchard, Pierre Gentine

Abstract: The representation of nonlinear sub-grid processes, especially clouds, has been a major source of uncertainty in climate models for decades. Cloud-resolving models better represent many of these processes and can now be run globally but only for short-term simulations of at most a few years because of computational limitations. Here we demonstrate that deep learning can be used to capture many advantages of cloud-resolving modeling at a fraction of the computational cost. We train a deep neural network to represent all atmospheric sub-grid processes in a climate model by learning from a multi-scale model in which convection is treated explicitly. The trained neural network then replaces the traditional sub-grid parameterizations in a global general circulation model in which it freely interacts with the resolved dynamics and the surface-flux scheme. The prognostic multi-year simulations are stable and closely reproduce not only the mean climate of the cloud-resolving simulation but also key aspects of variability, including precipitation extremes and the equatorial wave spectrum. Furthermore, the neural network approximately conserves energy despite not being explicitly instructed to. Finally, we show that the neural network parameterization generalizes to new surface forcing patterns but struggles to cope with temperatures far outside its training manifold. Our results show the feasibility of using deep learning for climate model parameterization. In a broader context, we anticipate that data-driven Earth System Model development could play a key role in reducing climate prediction uncertainty in the coming decade.

Submitted 7 September, 2018; v1 submitted 12 June, 2018; originally announced June 2018.

Comments: View official PNAS version at https://doi.org/10.1073/pnas.1810286115

Journal ref: Proceedings of the National Academy of Sciences Sep 2018, 201810286; DOI: 10.1073/pnas.1810286115

arXiv:1505.01765 [pdf, other]

Development of a Burst Buffer System for Data-Intensive Applications

Authors: Teng Wang, Sarp Oral, Michael Pritchard, Kevin Vasko, Weikuan Yu

Abstract: Modern parallel filesystems such as Lustre are designed to provide high, scalable I/O bandwidth in response to growing I/O requirements; however, the bursty I/O characteristics of many data-intensive scientific applications make it difficult for back-end parallel filesystems to efficiently handle I/O requests. A burst buffer system, through which data can be temporarily buffered via high-performance storage mediums, allows for gradual flushing of data to back-end filesystems. In this paper, we explore issues surrounding the development of a burst buffer system for data-intensive scientific applications. Our initial results demonstrate that utilizing a burst buffer system on top of the Lustre filesystem shows promise for dealing with the intense I/O traffic generated by application checkpointing.

Submitted 7 May, 2015; originally announced May 2015.

Comments: International Workshop on the Lustre Ecosystem: Challenges and Opportunities, March 2015, Annapolis MD

arXiv:1504.07995 [pdf, ps, other]

doi 10.1093/mnras/stv2367

An Empirically Derived Three-Dimensional Laplace Resonance in the Gliese 876 Planetary System

Authors: Benjamin E. Nelson, Paul Robertson, Matthew J. Payne, Seth M. Pritchard, Katherine M. Deck, Eric B. Ford, Jason T. Wright, Howard Isaacson

Abstract: We report constraints on the three-dimensional orbital architecture for all four planets known to orbit the nearby M dwarf Gliese 876 based solely on Doppler measurements and demanding long-term orbital stability. Our dataset incorporates publicly available radial velocities taken with the ELODIE and CORALIE spectrographs, HARPS, and Keck HIRES as well as previously unpublished HIRES velocities. We first quantitatively assess the validity of the planets thought to orbit GJ 876 by computing the Bayes factors for a variety of different coplanar models using an importance sampling algorithm. We find that a four-planet model is preferred over a three-planet model. Next, we apply a Newtonian MCMC algorithm to perform a Bayesian analysis of the planet masses and orbits using an n-body model in three-dimensional space. Based on the radial velocities alone, we find that a 99% credible interval provides upper limits on the mutual inclinations for the three resonant planets ($Φ_{cb}<6.20^\circ$ for the "c" and "b" pair and $Φ_{be}<28.5^\circ$ for the "b" and "e" pair). Subsequent dynamical integrations of our posterior sample find that the GJ 876 planets must be roughly coplanar ($Φ_{cb}<2.60^\circ$ and $Φ_{be}<7.87^\circ$), suggesting the amount of planet-planet scattering in the system has been low. We investigate the distribution of the respective resonant arguments of each planet pair and find that at least one argument for each planet pair and the Laplace argument librate. The libration amplitudes in our three-dimensional orbital model support the idea of the outer-three planets having undergone significant past disk migration.

Submitted 12 October, 2015; v1 submitted 29 April, 2015; originally announced April 2015.

Comments: 19 pages, 11 figures, 8 tables. Accepted to MNRAS. Posterior samples available at https://github.com/benelson/GJ876

arXiv:1204.3553 [pdf, other]

The JASMIN super-data-cluster

Authors: B. N. Lawrence, V. Bennett, J. Churchill, M. Juckes, P. Kershaw, P. Oliver, M. Pritchard, A. Stephens

Abstract: The JASMIN super-data-cluster is being deployed to support the data analysis requirements of the UK and European climate and earth system modelling community. Physical colocation of the core JASMIN resource with significant components of the facility for Climate and Environmental Monitoring from Space (CEMS) provides additional support for the earth observation community, as well as facilitating further comparison and evaluation of models with data. JASMIN and CEMS together centrally deploy 9.3 PB of storage - 4.6 PB of Panasas fast disk storage alongside the STFC Atlas Tape Store. Over 370 computing cores provide local computation. Remote JASMIN resources at Bristol, Leeds and Reading provide additional distributed storage and compute configured to support local workflow as a stepping stone to using the central JASMIN system. Fast network links from JASMIN provide reliable communication between the UK supercomputers MONSooN (at the Met Office) and HECToR (at the University of Edinburgh). JASMIN also supports European users via a light path to KNMI in the Netherlands. The functional components of the JASMIN infrastructure have been designed to support and integrate workflows for three main goals: (1) the efficient operation of data curation and facilitation at the STFC Centre for Environmental Data Archival; (2) efficient data analysis by the UK and European climate and earth system science communities, and; (3) flexible access for the climate impacts and earth observation communities to complex data and concomitant services.

Submitted 16 April, 2012; originally announced April 2012.

Comments: Submitted to Supercomputing 2012

arXiv:physics/0701272 [pdf, ps, other]

doi 10.1088/0953-4075/41/12/125302

Experimental single-impulse magnetic focusing of launched cold atoms

Authors: David A. Smith, Aidan S. Arnold, Matthew J. Pritchard, Ifan G. Hughes

Abstract: Single-impulse three-dimensional magnetic focusing of vertically launched cold atoms has been observed. Four different configurations of the lens were used to vary the relative radial and axial focusing properties. Compact focused clouds of 85Rb were seen for all four configurations. It is shown that an atom-optical ray matrix approach for describing the lensing action is insufficient. Numerical simulation using a full approximation to the lens's magnetic field shows very good agreement with the radial focusing properties of the lens. However, the axial (vertical direction) focusing properties are less well described and the reasons for this are discussed.

Submitted 10 June, 2008; v1 submitted 23 January, 2007; originally announced January 2007.

Comments: Revised to incorporate referee suggestions, final version

Journal ref: J. Phys. B 41, 125302 (2008)

arXiv:physics/0610027 [pdf, ps, other]

doi 10.1088/1367-2630/8/12/309

Transport of launched cold atoms with a laser guide and pulsed magnetic fields

Authors: Matthew J Pritchard, Aidan S Arnold, Simon L Cornish, David W Hallwood, Chris V S Pleasant, Ifan G Hughes

Abstract: We propose the novel combination of a laser guide and magnetic lens to transport a cold atomic cloud. We have modelled the loading and guiding of a launched cloud of cold atoms with the optical dipole force. We discuss the optimum strategy for loading typically 30% of the atoms from a MOT and guiding them vertically through 22cm. However, although the atoms are tightly confined transversely, thermal expansion in the propagation direction still results in a density loss of two orders of magnitude. By combining the laser guide with a single impulse from a magnetic lens we show one can actually increase the density of the guided atoms by a factor of 10.

Submitted 9 November, 2006; v1 submitted 4 October, 2006; originally announced October 2006.

Comments: 18 pages 11 figures. To appear in New J. Phys. Original version modified to include referee's improvements

Journal ref: New J. Phys. 8, 309 (2006).

arXiv:physics/0512151 [pdf, ps, other]

doi 10.1088/1367-2630/8/4/053

Double-impulse magnetic focusing of launched cold atoms

Authors: Aidan S Arnold, Matthew J Pritchard, David A Smith, Ifan G Hughes

Abstract: We have theoretically investigated 3D focusing of a launched cloud of cold atoms using a pair of magnetic lens pulses (the alternate-gradient method). Individual lenses focus radially and defocus axially or vice-versa. The performance of the two possible pulse sequences are compared and found to be ideal for loading both 'pancake' and 'sausage' shaped magnetic/optical microtraps. It is shown that focusing aberrations are considerably smaller for double-impulse magnetic lenses compared to single-impulse magnetic lenses. An analysis of the clouds focused by double-impulse technique is presented.

Submitted 22 March, 2006; v1 submitted 16 December, 2005; originally announced December 2005.

Comments: 14 pages, 6 figures. Accepted for publication in New Journal of Physics

Journal ref: New J. Phys. 8 53 (2006)

arXiv:physics/0512141 [pdf, ps, other]

doi 10.1088/0953-4075/37/22/004

Single-impulse magnetic focusing of launched cold atoms

Authors: Matthew J Pritchard, Aidan S Arnold, David A Smith, Ifan G Hughes

Abstract: We have theoretically investigated the focusing of a launched cloud of cold atoms. Time-dependent spatially-varying magnetic fields are used to impart impulses leading to a three-dimensional focus of the launched cloud. We discuss possible coil arrangements for a new focusing regime: isotropic 3D focusing of atoms with a single-impulse magnetic lens. We investigate focusing aberrations and find that, for typical experimental parameters, the widely used assumption of a purely harmonic lens is often inaccurate. The baseball lens offers the best possibility for isotropically focusing a cloud of weak-field-seeking atoms in 3D.

Submitted 15 December, 2005; originally announced December 2005.

Comments: 16 pages, 7 figures

Journal ref: J. Phys. B, 37 4435 (2004)

Showing 1–41 of 41 results for author: Pritchard, M