GenCast: Diffusion-based ensemble forecasting for medium-range weather

Authors: Ilan Price, Alvaro Sanchez-Gonzalez, Ferran Alet, Tom R. Andersson, Andrew El-Kadi, Dominic Masters, Timo Ewalds, Jacklynn Stott, Shakir Mohamed, Peter Battaglia, Remi Lam, Matthew Willson

Abstract: Weather forecasts are fundamentally uncertain, so predicting the range of probable weather scenarios is crucial for important decisions, from warning the public about hazardous weather, to planning renewable energy use. Here, we introduce GenCast, a probabilistic weather model with greater skill and speed than the top operational medium-range weather forecast in the world, the European Centre for Medium-Range Weather Forecasts (ECMWF)'s ensemble forecast, ENS. Unlike traditional approaches, which are based on numerical weather prediction (NWP), GenCast is a machine learning weather prediction (MLWP) method, trained on decades of reanalysis data. GenCast generates an ensemble of stochastic 15-day global forecasts, at 12-hour steps and 0.25 degree latitude-longitude resolution, for over 80 surface and atmospheric variables, in 8 minutes. It has greater skill than ENS on 97.4% of 1320 targets we evaluated, and better predicts extreme weather, tropical cyclones, and wind power production. This work helps open the next chapter in operational weather forecasting, where critical weather-dependent decisions are made with greater accuracy and efficiency.

Submitted 1 May, 2024; v1 submitted 25 December, 2023; originally announced December 2023.

Comments: Main text 11 pages, Appendices 76 pages

arXiv:2311.07222

Neural General Circulation Models for Weather and Climate

Authors: Dmitrii Kochkov, Janni Yuval, Ian Langmore, Peter Norgaard, Jamie Smith, Griffin Mooers, Milan Klöwer, James Lottes, Stephan Rasp, Peter Düben, Sam Hatfield, Peter Battaglia, Alvaro Sanchez-Gonzalez, Matthew Willson, Michael P. Brenner, Stephan Hoyer

Abstract: General circulation models (GCMs) are the foundation of weather and climate prediction. GCMs are physics-based simulators which combine a numerical solver for large-scale dynamics with tuned representations for small-scale processes such as cloud formation. Recently, machine learning (ML) models trained on reanalysis data achieved comparable or better skill than GCMs for deterministic weather forecasting. However, these models have not demonstrated improved ensemble forecasts, or shown sufficient stability for long-term weather and climate simulations. Here we present the first GCM that combines a differentiable solver for atmospheric dynamics with ML components, and show that it can generate forecasts of deterministic weather, ensemble weather and climate on par with the best ML and physics-based methods. NeuralGCM is competitive with ML models for 1-10 day forecasts, and with the European Centre for Medium-Range Weather Forecasts ensemble prediction for 1-15 day forecasts. With prescribed sea surface temperature, NeuralGCM can accurately track climate metrics such as global mean temperature for multiple decades, and climate forecasts with 140 km resolution exhibit emergent phenomena such as realistic frequency and trajectories of tropical cyclones. For both weather and climate, our approach offers orders of magnitude computational savings over conventional GCMs. Our results show that end-to-end deep learning is compatible with tasks performed by conventional GCMs, and can enhance the large-scale physical simulations that are essential for understanding and predicting the Earth system.

Submitted 7 March, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

Comments: 92 pages, 54 figures

arXiv:2308.15560

WeatherBench 2: A benchmark for the next generation of data-driven global weather models

Authors: Stephan Rasp, Stephan Hoyer, Alexander Merose, Ian Langmore, Peter Battaglia, Tyler Russel, Alvaro Sanchez-Gonzalez, Vivian Yang, Rob Carver, Shreya Agrawal, Matthew Chantry, Zied Ben Bouallegue, Peter Dueben, Carla Bromberg, Jared Sisk, Luke Barrington, Aaron Bell, Fei Sha

Abstract: WeatherBench 2 is an update to the global, medium-range (1-14 day) weather forecasting benchmark proposed by Rasp et al. (2020), designed with the aim to accelerate progress in data-driven weather modeling. WeatherBench 2 consists of an open-source evaluation framework, publicly available training, ground truth and baseline data as well as a continuously updated website with the latest metrics and state-of-the-art models: https://sites.research.google/weatherbench. This paper describes the design principles of the evaluation framework and presents results for current state-of-the-art physical and data-driven weather models. The metrics are based on established practices for evaluating weather forecasts at leading operational weather centers. We define a set of headline scores to provide an overview of model performance. In addition, we also discuss caveats in the current evaluation setup and challenges for the future of data-driven weather forecasting.

Submitted 26 January, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

arXiv:2212.12794

GraphCast: Learning skillful medium-range global weather forecasting

Authors: Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson, Peter Wirnsberger, Meire Fortunato, Ferran Alet, Suman Ravuri, Timo Ewalds, Zach Eaton-Rosen, Weihua Hu, Alexander Merose, Stephan Hoyer, George Holland, Oriol Vinyals, Jacklynn Stott, Alexander Pritzel, Shakir Mohamed, Peter Battaglia

Abstract: Global medium-range weather forecasting is critical to decision-making across many social and economic domains. Traditional numerical weather prediction uses increased compute resources to improve forecast accuracy, but cannot directly use historical weather data to improve the underlying model. We introduce a machine learning-based method called "GraphCast", which can be trained directly from reanalysis data. It predicts hundreds of weather variables, over 10 days at 0.25 degree resolution globally, in under one minute. We show that GraphCast significantly outperforms the most accurate operational deterministic systems on 90% of 1380 verification targets, and its forecasts support better severe event prediction, including tropical cyclones, atmospheric rivers, and extreme temperatures. GraphCast is a key advance in accurate and efficient weather forecasting, and helps realize the promise of machine learning for modeling complex dynamical systems.

Submitted 4 August, 2023; v1 submitted 24 December, 2022; originally announced December 2022.

Comments: GraphCast code and trained weights are available at: https://github.com/deepmind/graphcast

arXiv:2207.03522

TF-GNN: Graph Neural Networks in TensorFlow

Authors: Oleksandr Ferludin, Arno Eigenwillig, Martin Blais, Dustin Zelle, Jan Pfeifer, Alvaro Sanchez-Gonzalez, Wai Lok Sibon Li, Sami Abu-El-Haija, Peter Battaglia, Neslihan Bulut, Jonathan Halcrow, Filipe Miguel Gonçalves de Almeida, Pedro Gonnet, Liangze Jiang, Parth Kothari, Silvio Lattanzi, André Linhares, Brandon Mayer, Vahab Mirrokni, John Palowitch, Mihir Paradkar, Jennifer She, Anton Tsitsulin, Kevin Villela, Lisa Wang, et al. (2 additional authors not shown)

Abstract: TensorFlow-GNN (TF-GNN) is a scalable library for Graph Neural Networks in TensorFlow. It is designed from the bottom up to support the kinds of rich heterogeneous graph data that occur in today's information ecosystems. In addition to enabling machine learning researchers and advanced developers, TF-GNN offers low-code solutions to empower the broader developer community in graph learning. Many production models at Google use TF-GNN, and it has been recently released as an open source project. In this paper we describe the TF-GNN data model, its Keras message passing API, and relevant capabilities such as graph sampling and distributed training.

Submitted 23 July, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

arXiv:2112.15275

Learned Coarse Models for Efficient Turbulence Simulation

Authors: Kimberly Stachenfeld, Drummond B. Fielding, Dmitrii Kochkov, Miles Cranmer, Tobias Pfaff, Jonathan Godwin, Can Cui, Shirley Ho, Peter Battaglia, Alvaro Sanchez-Gonzalez

Abstract: Turbulence simulation with classical numerical solvers requires high-resolution grids to accurately resolve dynamics. Here we train learned simulators at low spatial and temporal resolutions to capture turbulent dynamics generated at high resolution. We show that our proposed model can simulate turbulent dynamics more accurately than classical numerical solvers at the comparably low resolutions across various scientifically relevant metrics. Our model is trained end-to-end from data and is capable of learning a range of challenging chaotic and turbulent dynamics at low resolution, including trajectories generated by the state-of-the-art Athena++ engine. We show that our simpler, general-purpose architecture outperforms various more specialized, turbulence-specific architectures from the learned turbulence simulation literature. In general, we see that learned simulators yield unstable trajectories; however, we show that tuning training noise and temporal downsampling solves this problem. We also find that while generalization beyond the training distribution is a challenge for learned models, training noise, added loss constraints, and dataset augmentation can help. Broadly, we conclude that our learned simulator outperforms traditional solvers run on coarser grids, and emphasize that simple design choices can offer stability and robust generalization.

Submitted 22 April, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

Journal ref: (2022) International Conference on Learning Representations

arXiv:2105.06507

doi 10.1103/PhysRevX.11.031048

Correlation Driven Transient Hole Dynamics Resolved in Space and Time in the Isopropanol Molecule

Authors: T. Barillot, O. Alexander, B. Cooper, T. Driver, D. Garratt, S. Li, A. Al Haddad, A. Sanchez-Gonzalez, M. Agåker, C. Arrell, M. Bearpark, N. Berrah, C. Bostedt, J. Bozek, C. Brahms, P. H. Bucksbaum, A. Clark, G. Doumy, R. Feifel, L. J. Frasinski, S. Jarosch, A. S. Johnson, L. Kjellsson, P. Kolorenč, Y. Kumagai, et al. (24 additional authors not shown)

Abstract: The possibility of suddenly ionized molecules undergoing extremely fast electron hole dynamics prior to significant structural change was first recognized more than 20 years ago and termed charge migration. The accurate probing of ultrafast electron hole dynamics requires measurements that have both sufficient temporal resolution and can detect the localization of a specific hole within the molecule. We report an investigation of the dynamics of inner valence hole states in isopropanol where we use an x-ray pump/x-ray probe experiment, with site and state-specific probing of a transient hole state localized near the oxygen atom in the molecule, together with an ab initio theoretical treatment. We record the signature of transient hole dynamics and make the first observation of dynamics driven by frustrated Auger-Meitner transitions. We verify that the hole lifetime is consistent with our theoretical prediction. This state-specific measurement paves the way to widespread application for observations of transient hole dynamics localized in space and time in molecules and thus to charge transfer phenomena that are fundamental in chemical and material physics.

Submitted 13 May, 2021; originally announced May 2021.

Journal ref: Phys. Rev. X 11, 031048 (2021)

arXiv:2006.11287

Discovering Symbolic Models from Deep Learning with Inductive Biases

Authors: Miles Cranmer, Alvaro Sanchez-Gonzalez, Peter Battaglia, Rui Xu, Kyle Cranmer, David Spergel, Shirley Ho

Abstract: We develop a general approach to distill symbolic representations of a learned deep model by introducing strong inductive biases. We focus on Graph Neural Networks (GNNs). The technique works as follows: we first encourage sparse latent representations when we train a GNN in a supervised setting, then we apply symbolic regression to components of the learned model to extract explicit physical relations. We find the correct known equations, including force laws and Hamiltonians, can be extracted from the neural network. We then apply our method to a non-trivial cosmology example (a detailed dark matter simulation) and discover a new analytic formula which can predict the concentration of dark matter from the mass distribution of nearby cosmic structures. The symbolic expressions extracted from the GNN using our technique also generalized to out-of-distribution data better than the GNN itself. Our approach offers alternative directions for interpreting neural networks and discovering novel physical principles from the representations they learn.

Submitted 17 November, 2020; v1 submitted 19 June, 2020; originally announced June 2020.

Comments: Accepted to NeurIPS 2020. 9 pages content + 16 pages appendix/references. Supporting code found at https://github.com/MilesCranmer/symbolic_deep_learning

arXiv:2002.09405

Learning to Simulate Complex Physics with Graph Networks

Authors: Alvaro Sanchez-Gonzalez, Jonathan Godwin, Tobias Pfaff, Rex Ying, Jure Leskovec, Peter W. Battaglia

Abstract: Here we present a machine learning framework and model implementation that can learn to simulate a wide variety of challenging physical domains, involving fluids, rigid solids, and deformable materials interacting with one another. Our framework, which we term "Graph Network-based Simulators" (GNS), represents the state of a physical system with particles, expressed as nodes in a graph, and computes dynamics via learned message-passing. Our results show that our model can generalize from single-timestep predictions with thousands of particles during training, to different initial conditions, thousands of timesteps, and at least an order of magnitude more particles at test time. Our model was robust to hyperparameter choices across various evaluation metrics: the main determinants of long-term performance were the number of message-passing steps, and mitigating the accumulation of error by corrupting the training data with noise. Our GNS framework advances the state-of-the-art in learned physical simulation, and holds promise for solving a wide range of complex forward and inverse problems.

Submitted 14 September, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

Comments: Accepted at ICML 2020

arXiv:1909.12790

Hamiltonian Graph Networks with ODE Integrators

Authors: Alvaro Sanchez-Gonzalez, Victor Bapst, Kyle Cranmer, Peter Battaglia

Abstract: We introduce an approach for imposing physically informed inductive biases in learned simulation models. We combine graph networks with a differentiable ordinary differential equation integrator as a mechanism for predicting future states, and a Hamiltonian as an internal representation. We find that our approach outperforms baselines without these biases in terms of predictive accuracy, energy accuracy, and zero-shot generalization to time-step sizes and integrator orders not experienced during training. This advances the state-of-the-art of learned simulation, and in principle is applicable beyond physical domains.

Submitted 27 September, 2019; originally announced September 2019.

arXiv:1810.11097

doi 10.1103/PhysRevLett.123.023201

Extreme Ultraviolet Superfluorescence in Xenon and Krypton

Authors: L. Mercadier, A. Benediktovitch, C. Weninger, M. A. Blessenohl, S. Bernitt, H. Bekker, S. Dobrodey, A. Sánchez-González, B. Erk, C. Bomme, R. Boll, Z. Yin, V. P. Majety, R. Steinbrügge, M. A. Khalal, F. Penent, J. Palaudoux, P. Lablanquie, A. Rudenko, D. Rolles, J. R. Crespo López-Urrutia, N. Rohringer

Abstract: We present a comprehensive experimental and theoretical study on superfluorescence in the extreme ultraviolet wavelength regime. Focusing a high-intensity free-electron laser pulse in a cell filled with Xe or Kr gas, the medium is quasi instantaneously population-inverted by inner-shell ionization on the giant resonance followed by Auger decay. On the timescale of 100 ps a macroscopic polarization builds up in the medium, resulting in superfluorescent emission of several Xe and Kr lines in the forward direction. As the number of emitters in the system is increased by either raising the pressure or the pump-pulse energy, the emission shows an exponential growth of over 4 orders of magnitude and reaches saturation. With increasing yield, we observe line broadening, a manifestation of superfluorescence in the spectral domain. Our novel theoretical approach, based on a full quantum treatment of the atomic system and the irradiated field, shows quantitative agreement with the experiment and supports our interpretation.

Submitted 25 October, 2018; originally announced October 2018.

Journal ref: Phys. Rev. Lett. 123, 023201 (2019)

arXiv:1610.03378

doi 10.1038/ncomms15461

Machine learning applied to single-shot x-ray diagnostics in an XFEL

Authors: A. Sanchez-Gonzalez, P. Micaelli, C. Olivier, T. R. Barillot, M. Ilchen, A. A. Lutman, A. Marinelli, T. Maxwell, A. Achner, M. Agåker, N. Berrah, C. Bostedt, J. Buck, P. H. Bucksbaum, S. Carron Montero, B. Cooper, J. P. Cryan, M. Dong, R. Feifel, L. J. Frasinski, H. Fukuzawa, A. Galler, G. Hartmann, N. Hartmann, W. Helml, et al. (17 additional authors not shown)

Abstract: X-ray free-electron lasers (XFELs) are the only sources currently able to produce bright few-fs pulses with tunable photon energies from 100 eV to more than 10 keV. Due to the stochastic SASE operating principles and other technical issues the output pulses are subject to large fluctuations, making it necessary to characterize the x-ray pulses on every shot for data sorting purposes. We present a technique that applies machine learning tools to predict x-ray pulse properties using simple electron beam and x-ray parameters as input. Using this technique at the Linac Coherent Light Source (LCLS), we report mean errors below 0.3 eV for the prediction of the photon energy at 530 eV and below 1.6 fs for the prediction of the delay between two x-ray pulses. We also demonstrate spectral shape prediction with a mean agreement of 97%. This approach could potentially be used at the next generation of high-repetition-rate XFELs to provide accurate knowledge of complex x-ray pulses at the full repetition rate.

Submitted 11 October, 2016; originally announced October 2016.

Comments: 12 pages, 8 figures

Journal ref: Nature Communications 8, 15461 (2017)

arXiv:1512.07652

The response of a neutral atom to a strong laser field probed by transient absorption near the ionisation threshold

Authors: E. R. Simpson, A. Sanchez-Gonzalez, D. R. Austin, Z. Diveki, S. E. E. Hutchinson, T. Siegel, M. Ruberti, V. Averbukh, L. Miseikis, C. Strüber, L. Chipperfield, J. P. Marangos

Abstract: We present transient absorption spectra of an extreme ultraviolet attosecond pulse train in helium dressed by an 800 nm laser field with intensity ranging from $2\times10^{12}$ W/cm$^2$ to $2\times10^{14}$ W/cm$^2$. The energy range probed spans 16-42 eV, straddling the first ionisation energy of helium (24.59 eV). By changing the relative polarisation of the dressing field with respect to the attosecond pulse train polarisation we observe a large change in the modulation of the absorption reflecting the vectorial response to the dressing field. With parallel polarized dressing and probing fields, we observe significant modulations with periods of one half and one quarter of the dressing field period. With perpendicularly polarized dressing and probing fields, the modulations of the harmonics above the ionisation threshold are significantly suppressed. A full-dimensionality solution of the single-atom time-dependent Schrödinger equation obtained using the recently developed ab-initio time-dependent B-spline ADC method reproduces some of our observations.

Submitted 23 December, 2015; originally announced December 2015.

Comments: 9 pages, 5 figures

Showing 1–13 of 13 results for author: Sanchez-Gonzalez, A