Modeling sea ice in the marginal ice zone as a dense granular flow with rheology inferred from a discrete element model

Authors: Gonzalo G. de Diego, Mukund Gupta, Skylar A. Gering, Rohaiz Haris, Georg Stadler

Abstract: The marginal ice zone (MIZ) represents the periphery of the sea ice cover. Here, the macroscale behavior of the sea ice results from collisions and enduring contact between ice floes. This configuration closely resembles that of dense granular flows, which have been modeled successfully with the $\mu(I)$ rheology. We present a continuous model based on the $\mu(I)$ rheology which treats sea ice as a compressible fluid, with the local sea ice concentration given by a dilatancy function $\Phi(I)$. We infer expressions for $\mu(I)$ and $\Phi(I)$ from a discrete element method (DEM) which considers polygonal-shaped ice floes. We do this by driving the sea ice with a one-dimensional shearing ocean current. The resulting continuous model is a nonlinear system of equations with the sea ice velocity, local concentration, and pressure as unknowns. The rheology is given by the sum of a plastic and a viscous term. In the context of a periodic patch of ocean, which is effectively a one-dimensional problem, and under steady conditions, we prove this system to be well-posed, present a numerical algorithm for solving it, and compare its solutions to those of the DEM. These comparisons demonstrate the continuous model's ability to capture most of the DEM's results accurately. The continuous model is particularly accurate for ocean currents faster than 0.25 m/s; however, for low concentrations and slow ocean currents, the continuous model is less effective in capturing the DEM results. In the latter case, the lack of accuracy of the continuous model is found to be accompanied by the breakdown of a balance between the average shear stress and the integrated ocean drag extracted from the DEM.

Submitted 13 May, 2024; originally announced May 2024.

arXiv:2310.16906 [pdf, other]

doi 10.1615/Int.J.UncertaintyQuantification.2024051416

Sensitivity Analysis of the Information Gain in Infinite-Dimensional Bayesian Linear Inverse Problems

Authors: Abhijit Chowdhary, Shanyin Tong, Georg Stadler, Alen Alexanderian

Abstract: We study the sensitivity of infinite-dimensional Bayesian linear inverse problems governed by partial differential equations (PDEs) with respect to modeling uncertainties. In particular, we consider derivative-based sensitivity analysis of the information gain, as measured by the Kullback-Leibler divergence from the posterior to the prior distribution. To facilitate this, we develop a fast and accurate method for computing derivatives of the information gain with respect to auxiliary model parameters. Our approach combines low-rank approximations, adjoint-based eigenvalue sensitivity analysis, and post-optimal sensitivity analysis. The proposed approach also paves the way for global sensitivity analysis by computing derivative-based global sensitivity measures. We illustrate different aspects of the proposed approach using an inverse problem governed by a scalar linear elliptic PDE, and an inverse problem governed by the three-dimensional equations of linear elasticity, which is motivated by the inversion of the fault-slip field after an earthquake.

Submitted 16 May, 2024; v1 submitted 25 October, 2023; originally announced October 2023.

Comments: 20 pages, 7 figures

MSC Class: 65C60; 90C31; 62F15; 35R30; 65F55
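
The information gain in the abstract above has a closed form in the finite-dimensional Gaussian linear setting, which the following minimal sketch evaluates. It assumes the eigenvalues of the prior-preconditioned data-misfit Hessian and the prior-weighted squared distance between posterior and prior means are already available; the function name `information_gain` and its arguments are hypothetical and are not taken from the paper.

```python
import math


def information_gain(eigenvalues, mean_shift_sq):
    """KL divergence from a Gaussian posterior to its Gaussian prior.

    eigenvalues   -- eigenvalues (>= 0) of the prior-preconditioned
                     data-misfit Hessian (assumed given, e.g. from a
                     low-rank approximation)
    mean_shift_sq -- squared distance between posterior and prior mean,
                     measured in the prior-precision-weighted norm
    """
    # Covariance part: sum of log(1 + lam) - lam / (1 + lam)
    cov_term = sum(math.log1p(lam) - lam / (1.0 + lam) for lam in eigenvalues)
    return 0.5 * (cov_term + mean_shift_sq)
```

Only eigenvalues that are not close to zero contribute noticeably to the sum, which is why a truncated spectrum can be sufficient in this formula.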

arXiv:2303.11919 [pdf, other]

doi 10.1007/s11222-023-10307-2

Scalable Methods for Computing Sharp Extreme Event Probabilities in Infinite-Dimensional Stochastic Systems

Authors: Timo Schorlepp, Shanyin Tong, Tobias Grafke, Georg Stadler

Abstract: We introduce and compare computational techniques for sharp extreme event probability estimates in stochastic differential equations with small additive Gaussian noise. In particular, we focus on strategies that are scalable, i.e. their efficiency does not degrade upon temporal and possibly spatial refinement. For that purpose, we extend algorithms based on the Laplace method for estimating the probability of an extreme event to infinite-dimensional path space. The method estimates the limiting exponential scaling using a single realization of the random variable, the large deviation minimizer. Finding this minimizer amounts to solving an optimization problem governed by a differential equation. The probability estimate becomes sharp when it additionally includes prefactor information, which necessitates computing the determinant of a second derivative operator to evaluate a Gaussian integral around the minimizer. We present an approach in infinite dimensions based on Fredholm determinants, and develop numerical algorithms to compute these determinants efficiently for the high-dimensional systems that arise upon discretization. We also give an interpretation of this approach using Gaussian process covariances and transition tubes. An example model problem, for which we provide an open-source Python implementation, is used throughout the paper to illustrate all methods discussed. To study the performance of the methods, we consider examples of stochastic differential and stochastic partial differential equations, including the randomly forced incompressible three-dimensional Navier-Stokes equations.

Submitted 22 November, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

Journal ref: Stat Comput 33, 137 (2023)

arXiv:2301.03644 [pdf, other]

doi 10.1088/1361-6420/acd719

Hierarchical off-diagonal low-rank approximation of Hessians in inverse problems, with application to ice sheet model initialization

Authors: Tucker Hartland, Georg Stadler, Mauro Perego, Kim Liegeois, Noemi Petra

Abstract: Obtaining lightweight and accurate approximations of Hessian applies in inverse problems governed by partial differential equations (PDEs) is an essential task to make both deterministic and Bayesian statistical large-scale inverse problems computationally tractable. The $\mathcal{O}(N^{3})$ computational complexity of dense linear algebraic routines, such as those needed for sampling from Gaussian proposal distributions and for Newton solves by direct linear methods, can be reduced to log-linear complexity by utilizing hierarchical off-diagonal low-rank (HODLR) matrix approximations. In this work, we show that a class of Hessians that arise from inverse problems governed by PDEs are well approximated by the HODLR matrix format. In particular, we study inverse problems governed by PDEs that model the instantaneous viscous flow of ice sheets. In these problems, we seek a spatially distributed basal sliding parameter field such that the flow predicted by the ice sheet model is consistent with ice sheet surface velocity observations. We demonstrate the use of HODLR approximation by efficiently generating Hessian approximations that allow fast generation of samples from a Gaussianized posterior proposal distribution. Computational studies are performed which illustrate ice sheet problem regimes for which the Gauss-Newton data-misfit Hessian is more efficiently approximated by the HODLR matrix format than the low-rank (LR) format. We then demonstrate that HODLR approximations can be favorable, when compared to global low-rank approximations, for large-scale problems by studying the data-misfit Hessian associated to inverse problems governed by the Stokes flow model on the Humboldt glacier and Greenland ice sheets.

Submitted 9 January, 2023; originally announced January 2023.

Comments: 28 pages, 12 figures, submitted to Inverse Problems

arXiv:2210.03248 [pdf, other]

doi 10.1063/5.0129716

Direct stellarator coil optimization for nested magnetic surfaces with precise quasi-symmetry

Authors: Andrew Giuliani, Florian Wechsung, Antoine Cerfon, Matt Landreman, Georg Stadler

Abstract: We present a robust optimization algorithm for the design of electromagnetic coils that generate vacuum magnetic fields with nested flux surfaces and precise quasi-symmetry. The method is based on a bilevel optimization problem, where the outer coil optimization is constrained by a set of inner least-squares optimization problems whose solutions describe magnetic surfaces. The outer optimization objective targets coils that generate a field with nested magnetic surfaces and good quasi-symmetry. The inner optimization problems identify magnetic surfaces when they exist, and approximate surfaces in the presence of magnetic islands or chaos. We show that this formulation can be used to heal islands and chaos, thus producing coils that result in magnetic fields with precise quasi-symmetry. We show that the method can be initialized with coils from the traditional two-stage coil design process, as well as coils from a near-axis expansion optimization. We present a numerical example where island chains are healed and quasi-symmetry is optimized up to surfaces with aspect ratio 6. Another numerical example illustrates that the aspect ratio of nested flux surfaces with optimized quasi-symmetry can be decreased from 6 to approximately 4. The last example shows that our approach is robust and can be cold-started using coils from a near-axis expansion optimization.

Submitted 13 March, 2023; v1 submitted 6 October, 2022; originally announced October 2022.

arXiv:2209.06278 [pdf, other]

Large deviation theory-based adaptive importance sampling for rare events in high dimensions

Authors: Shanyin Tong, Georg Stadler

Abstract: We propose a method for the accurate estimation of rare event or failure probabilities for expensive-to-evaluate numerical models in high dimensions. The proposed approach combines ideas from large deviation theory and adaptive importance sampling. The importance sampler uses a cross-entropy method to find an optimal Gaussian biasing distribution, and reuses all samples made throughout the process both for the target probability estimation and for updating the biasing distributions. Large deviation theory is used to find a good initial biasing distribution through the solution of an optimization problem. Additionally, it is used to identify a low-dimensional subspace that is most informative of the rare event probability. This subspace is used for the cross-entropy method, which is known to lose efficiency in higher dimensions. The proposed method does not require smoothing of indicator functions, nor does it involve numerical tuning parameters. We compare the method with a state-of-the-art cross-entropy-based importance sampling scheme using three examples: a high-dimensional failure probability estimation benchmark, a problem governed by a diffusion equation, and a tsunami problem governed by the time-dependent shallow water system in one spatial dimension.

Submitted 25 March, 2023; v1 submitted 13 September, 2022; originally announced September 2022.

MSC Class: 65C05; 60F10; 62L12; 65F15; 65K10
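
The role of large deviation theory in choosing an initial biasing distribution, as described in the abstract above, can be illustrated on a deliberately simple toy problem. The sketch below assumes a scalar standard normal input and the event `x > z`, for which the large deviation minimizer is `x = z`; it uses a fixed Gaussian biasing distribution centered there and omits the adaptive cross-entropy updates. The function names are hypothetical and this is not the authors' implementation.

```python
import math
import random


def rare_event_probability(z, n_samples=20000, seed=0):
    """Estimate P(X > z) for X ~ N(0, 1) by importance sampling.

    Samples are drawn from the biasing distribution N(z, 1), i.e. the
    nominal distribution shifted to the large deviation minimizer x = z.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(z, 1.0)
        if x > z:
            # Likelihood ratio p(x) / q(x) of N(0, 1) over N(z, 1)
            total += math.exp(-z * x + 0.5 * z * z)
    return total / n_samples


def exact_tail(z):
    """Exact Gaussian tail probability, used only as a reference."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

With the shift in place, roughly half of the samples land in the rare event set, whereas plain Monte Carlo with the same budget would typically see no events at all for `z = 4`.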

arXiv:2208.01096 [pdf, other]

doi 10.1088/1741-4326/aca10d

Stellarator coil optimization supporting multiple magnetic configurations

Authors: Brandon F Lee, Elizabeth J Paul, Georg Stadler, Matt Landreman

Abstract: We present a technique that can be used to design stellarators with a high degree of experimental flexibility. For our purposes, flexibility is defined by the range of values the rotational transform can take on the magnetic axis of the vacuum field while maintaining satisfactory quasisymmetry. We show that accounting for configuration flexibility during the modular coil design improves flexibility beyond that attained by previous methods. Careful placement of planar control coils and the incorporation of an integrability objective enhance the quasisymmetry and nested flux surface volume of each configuration. We show that it is possible to achieve flexibility, quasisymmetry, and nested flux surface volume to reasonable degrees with a relatively simple coil set through an NCSX-like example. This example coil design is optimized to achieve three rotational transform targets and nested flux surface volumes in each magnetic configuration larger than the NCSX design plasma volume. Our work suggests that there is a tradeoff between flexibility, quasisymmetry, and volume of nested flux surfaces.

Submitted 4 November, 2022; v1 submitted 1 August, 2022; originally announced August 2022.

Comments: Minor phrasing change for clarity

arXiv:2204.10822 [pdf, other]

doi 10.1016/j.jcp.2022.111802

Robust and efficient primal-dual Newton-Krylov solvers for viscous-plastic sea-ice models

Authors: Yu-hsuan Shih, Carolin Mehlmann, Martin Losch, Georg Stadler

Abstract: We present a Newton-Krylov solver for a viscous-plastic sea-ice model. This constitutive relation is commonly used in climate models to describe the material properties of sea ice. Due to the strong nonlinearity introduced by the material law in the momentum equation, the development of fast, robust and scalable solvers is still a substantial challenge. In this paper, we propose a novel primal-dual Newton linearization for the implicitly-in-time discretized momentum equation. Compared to existing methods, it converges faster and more robustly with respect to mesh refinement, and thus enables numerically converged sea-ice simulations at high resolutions. Combined with an algebraic multigrid-preconditioned Krylov method for the linearized systems, which contain strongly varying coefficients, the resulting solver scales well and can be used in parallel. We present experiments for two challenging test problems and study solver performance for problems with up to 8.4 million spatial unknowns.

Submitted 15 September, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

Comments: 18 pages, 7 figures

arXiv:2203.10164 [pdf, other]

doi 10.1088/1361-6587/ac89ee

Stochastic and a posteriori optimization to mitigate coil manufacturing errors in stellarator design

Authors: Florian Wechsung, Andrew Giuliani, Matt Landreman, Antoine Cerfon, Georg Stadler

Abstract: It was recently shown in [Wechsung et al., Proc. Natl. Acad. Sci. USA, 2022] that there exist electromagnetic coils that generate magnetic fields which are excellent approximations to quasi-symmetric fields and have very good particle confinement properties. Using a Gaussian process based model for coil perturbations, we investigate the impact of manufacturing errors on the performance of these coils. We show that even fairly small errors result in noticeable performance degradation. While stochastic optimization yields minor improvements, it is not able to mitigate these errors significantly. As an alternative to stochastic optimization, we then formulate a new optimization problem for computing optimal adjustments of the coil positions and currents without changing the shapes of the coils. These a-posteriori adjustments are able to reduce the impact of coil errors by an order of magnitude, providing a new perspective for dealing with manufacturing tolerances in stellarator design.

Submitted 30 July, 2022; v1 submitted 18 March, 2022; originally announced March 2022.

arXiv:2203.03753 [pdf, other]

doi 10.1017/S0022377822000563

Direct computation of magnetic surfaces in Boozer coordinates and coil optimization for quasi-symmetry

Authors: Andrew Giuliani, Florian Wechsung, Matt Landreman, Georg Stadler, Antoine Cerfon

Abstract: We propose a new method to compute magnetic surfaces that are parametrized in Boozer coordinates for vacuum magnetic fields. We also propose a measure for quasi-symmetry on the computed surfaces and use it to design coils that generate a magnetic field that is quasi-symmetric on those surfaces. The rotational transform of the field and complexity measures for the coils are also controlled in the design problem. Using an adjoint approach, we are able to obtain analytic derivatives for this optimization problem, yielding an efficient gradient-based algorithm. Starting from an initial coil set that presents nested magnetic surfaces for a large fraction of the volume, our method converges rapidly to coil systems generating fields with excellent quasi-symmetry and low particle losses. In particular, for low-complexity coils, we are able to significantly improve the performance compared to coils obtained from the standard two-stage approach, e.g., reduce losses of fusion-produced alpha particles born at half-radius from $17.7\%$ to $6.6\%$. We also demonstrate 16-coil configurations with alpha loss less than $1\%$ and neoclassical transport magnitude $\epsilon_{\mathrm{eff}}^{3/2}$ less than approximately $5\times 10^{-9}$.

Submitted 29 April, 2022; v1 submitted 7 March, 2022; originally announced March 2022.

arXiv:2202.11088 [pdf, other]

A gradient-free subspace-adjusting ensemble sampler for infinite-dimensional Bayesian inverse problems

Authors: Matthew M. Dunlop, Georg Stadler

Abstract: Sampling of sharp posteriors in high dimensions is a challenging problem, especially when gradients of the likelihood are unavailable. In low to moderate dimensions, affine-invariant methods, a class of ensemble-based gradient-free methods, have found success in sampling concentrated posteriors. However, the number of ensemble members must exceed the dimension of the unknown state in order for the correct distribution to be targeted. Conversely, the preconditioned Crank-Nicolson (pCN) algorithm succeeds at sampling in high dimensions, but samples become highly correlated when the posterior differs significantly from the prior. In this article we combine the above methods in two different ways as an attempt to find a compromise. The first method involves inflating the proposal covariance in pCN with that of the current ensemble, whilst the second performs approximately affine-invariant steps on a continually adapting low-dimensional subspace, while using pCN on its orthogonal complement.

Submitted 22 February, 2022; originally announced February 2022.

Comments: 24 pages, 10 figures

MSC Class: 65N21; 62F15; 65C05; 65N75; 90C56
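
For readers unfamiliar with the preconditioned Crank-Nicolson (pCN) algorithm named in the abstract above, the following is a minimal sketch of the plain pCN sampler only, not of the ensemble or subspace-adjusting variants the article proposes. It assumes a standard Gaussian prior N(0, I) on a finite-dimensional state; `pcn_sample`, `toy_misfit`, and the step size `beta` are hypothetical names and values chosen for illustration.

```python
import math
import random


def pcn_sample(neg_log_likelihood, dim, n_steps, beta=0.3, seed=0):
    """Plain pCN sampler for a posterior with prior N(0, I).

    neg_log_likelihood -- function mapping a state (list of floats) to
                          the data misfit (negative log-likelihood)
    beta               -- step size in (0, 1]; larger means bolder moves
    Returns the chain of states and the acceptance rate.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # start from a prior draw
    phi_u = neg_log_likelihood(u)
    shrink = math.sqrt(1.0 - beta * beta)
    samples, accepted = [], 0
    for _ in range(n_steps):
        # Proposal preserves the prior: v = sqrt(1 - beta^2) u + beta * xi
        v = [shrink * ui + beta * rng.gauss(0.0, 1.0) for ui in u]
        phi_v = neg_log_likelihood(v)
        # Acceptance probability depends on the misfit only, not the prior
        if rng.random() < math.exp(min(0.0, phi_u - phi_v)):
            u, phi_u = v, phi_v
            accepted += 1
        samples.append(list(u))
    return samples, accepted / n_steps


def toy_misfit(u, y=1.0, sigma=0.5):
    """Misfit for observing only the first component with Gaussian noise."""
    return (u[0] - y) ** 2 / (2.0 * sigma ** 2)
```

A call such as `pcn_sample(toy_misfit, dim=5, n_steps=20000)` returns the chain and its acceptance rate. Because the proposal leaves the prior invariant, the acceptance rule does not involve the state dimension, which is the property that makes pCN suitable for high dimensions.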

arXiv:2112.10663 [pdf, other]

Bayesian neural network priors for edge-preserving inversion

Authors: Chen Li, Matthew Dunlop, Georg Stadler

Abstract: We consider Bayesian inverse problems wherein the unknown state is assumed to be a function with discontinuous structure a priori. A class of prior distributions based on the output of neural networks with heavy-tailed weights is introduced, motivated by existing results concerning the infinite-width limit of such networks. We show theoretically that samples from such priors have desirable discontinuous-like properties even when the network width is finite, making them appropriate for edge-preserving inversion. Numerically we consider deconvolution problems defined on one- and two-dimensional spatial domains to illustrate the effectiveness of these priors; MAP estimation, dimension-robust MCMC sampling and ensemble-based approximations are utilized to probe the posterior distribution. The accuracy of point estimates is shown to exceed that obtained from non-heavy-tailed priors, and uncertainty estimates are shown to provide more useful qualitative information.

Submitted 20 December, 2021; originally announced December 2021.

arXiv:2111.14325 [pdf, ps, other]

doi 10.1007/s00024-023-03281-3

Estimating earthquake-induced tsunami height probabilities without sampling

Authors: Shanyin Tong, Eric Vanden-Eijnden, Georg Stadler

Abstract: Given a distribution of earthquake-induced seafloor elevations, we present a method to compute the probability of the resulting tsunamis reaching a certain size on shore. Instead of sampling, the proposed method relies on optimization to compute the most likely fault slips that result in a seafloor deformation inducing a large tsunami wave. We model tsunamis induced by bathymetry change using the shallow water equations on an idealized slice through the sea. The earthquake slip model is based on a sum of multivariate log-normal distributions, and follows the Gutenberg-Richter law for moment magnitudes 7 to 9. For a model problem inspired by the Tohoku-Oki 2011 earthquake and tsunami, we quantify annual probabilities of differently sized tsunami waves. Our method also identifies the most effective tsunami mechanisms. These mechanisms have smoothly varying fault slip patches that lead to an expansive but moderately large bathymetry change. The resulting tsunami waves are compressed as they approach shore and reach a close-to-vertical leading wave edge close to shore.

Submitted 7 April, 2023; v1 submitted 28 November, 2021; originally announced November 2021.

arXiv:2107.00820 [pdf, other]

Robust multigrid techniques for augmented Lagrangian preconditioning of incompressible Stokes equations with extreme viscosity variations

Authors: Yu-hsuan Shih, Georg Stadler, Florian Wechsung

Abstract: We present augmented Lagrangian Schur complement preconditioners and robust multigrid methods for incompressible Stokes problems with extreme viscosity variations. Such Stokes systems arise, for instance, upon linearization of nonlinear viscous flow problems, and they can have severely inhomogeneous and anisotropic coefficients. Using an augmented Lagrangian formulation for the incompressibility constraint makes the Schur complement easier to approximate, but results in a nearly singular (1,1)-block in the Stokes system. We present eigenvalue estimates for the quality of the Schur complement approximation. To cope with the near-singularity of the (1,1)-block, we extend a multigrid scheme with a discretization-dependent smoother and transfer operators from triangular/tetrahedral to the quadrilateral/hexahedral finite element discretizations $[\mathbb{Q}_k]^d\times \mathbb{P}_{k-1}^{\text{disc}}$, $k\geq 2$, $d=2,3$. Using numerical examples with scalar and with anisotropic fourth-order tensor viscosity arising from linearization of a viscoplastic constitutive relation, we confirm the robustness of the multigrid scheme and the overall efficiency of the solver. We present scalability results using up to 28,672 parallel tasks for problems with up to 1.6 billion unknowns and a viscosity contrast up to ten orders of magnitude.

Submitted 2 November, 2021; v1 submitted 2 July, 2021; originally announced July 2021.

Comments: 27 pages, 6 figures

MSC Class: 65F08; 65F10; 65N55; 65Y05; 76D07

arXiv:2106.12137 [pdf, other]

doi 10.1088/1741-4326/ac45f3

Single-stage gradient-based stellarator coil design: stochastic optimization

Authors: Florian Wechsung, Andrew Giuliani, Matt Landreman, Antoine Cerfon, Georg Stadler

Abstract: We extend the single-stage stellarator coil design approach for quasi-symmetry on axis from [Giuliani et al., 2020] to additionally take into account coil manufacturing errors. By modeling coil errors independently from the coil discretization, we have the flexibility to consider realistic forms of coil errors. The corresponding stochastic optimization problems are formulated using a risk-neutral approach and risk-averse approaches. We present an efficient, gradient-based descent algorithm which relies on analytical derivatives to solve these problems. In a comprehensive numerical study, we compare the coil designs resulting from deterministic and risk-neutral stochastic optimization and find that the risk-neutral formulation results in more robust configurations and reduces the number of local minima of the optimization problem. We also compare deterministic and risk-neutral approaches in terms of quasi-symmetry on and away from the magnetic axis, and in terms of the confinement of particles released close to the axis. Finally, we show that for the optimization problems we consider, a risk-averse objective using the Conditional Value-at-Risk leads to results which are similar to the risk-neutral objective.

Submitted 22 June, 2021; originally announced June 2021.

arXiv:2010.02033 [pdf, other]

Single-stage gradient-based stellarator coil design: Optimization for near-axis quasi-symmetry

Authors: Andrew Giuliani, Florian Wechsung, Antoine Cerfon, Georg Stadler, Matt Landreman

Abstract: We present a new coil design paradigm for magnetic confinement in stellarators. Our approach directly optimizes coil shapes and coil currents to produce a vacuum quasi-symmetric magnetic field with a target rotational transform on the magnetic axis. This approach differs from the traditional two-stage approach in which first a magnetic configuration with desirable physics properties is found, and then coils to approximately realize this magnetic configuration are designed. The proposed single-stage approach allows us to find a compromise between confinement and engineering requirements, i.e., find easy-to-build coils with good confinement properties. Using forward and adjoint sensitivities, we derive derivatives of the physical quantities in the objective, which is constrained by a nonlinear periodic differential equation. In two numerical examples, we compare different gradient-based descent algorithms and find that incorporating approximate second-order derivative information through a quasi-Newton method is crucial for convergence. We also explore the optimization landscape in the neighborhood of a minimizer and find many directions in which the objective is mostly flat, indicating ample freedom to find simple and thus easy-to-build coils.

Submitted 15 March, 2022; v1 submitted 1 October, 2020; originally announced October 2020.

Comments: 28 pages

arXiv:2007.13930 [pdf, other]

doi 10.2140/camcos.2021.16.181

Extreme event probability estimation using PDE-constrained optimization and large deviation theory, with application to tsunamis

Authors: Shanyin Tong, Eric Vanden-Eijnden, Georg Stadler

Abstract: We propose and compare methods for the analysis of extreme events in complex systems governed by PDEs that involve random parameters, in situations where we are interested in quantifying the probability that a scalar function of the system's solution is above a threshold. If the threshold is large, this probability is small and its accurate estimation is challenging. To tackle this difficulty, we blend theoretical results from large deviation theory (LDT) with numerical tools from PDE-constrained optimization. Our methods first compute parameters that minimize the LDT-rate function over the set of parameters leading to extreme events, using adjoint methods to compute the gradient of this rate function. The minimizers give information about the mechanism of the extreme events as well as estimates of their probability. We then propose a series of methods to refine these estimates, either via importance sampling or geometric approximation of the extreme event sets. Results are formulated for general parameter distributions and detailed expressions are provided for Gaussian distributions. We give theoretical and numerical arguments showing that the performance of our methods is insensitive to the extremeness of the events we are interested in. We illustrate the application of our approach to quantify the probability of extreme tsunami events on shore. Tsunamis are typically caused by a sudden, unpredictable change of the ocean floor elevation during an earthquake. We model this change as a random process, which takes into account the underlying physics. We use the one-dimensional shallow water equation to model tsunamis numerically. In the context of this example, we present a comparison of our methods for extreme event probability estimation, and find which type of ocean floor elevation change leads to the largest tsunamis on shore.

Submitted 22 November, 2023; v1 submitted 27 July, 2020; originally announced July 2020.

MSC Class: 65K10; 35Q93; 76B15; 60F10; 60H35

Journal ref: Commun. Appl. Math. Comput. Sci. 16 (2021) 181-225

arXiv:2006.11939 [pdf, other]

Optimal design of large-scale Bayesian linear inverse problems under reducible model uncertainty: good to know what you don't know

Authors: Alen Alexanderian, Noemi Petra, Georg Stadler, Isaac Sunseri

Abstract: We consider optimal design of infinite-dimensional Bayesian linear inverse problems governed by partial differential equations that contain secondary reducible model uncertainties, in addition to the uncertainty in the inversion parameters. By reducible uncertainties we refer to parametric uncertainties that can be reduced through parameter inference. We seek experimental designs that minimize the posterior uncertainty in the primary parameters, while accounting for the uncertainty in secondary parameters. We accomplish this by deriving a marginalized A-optimality criterion and developing an efficient computational approach for its optimization. We illustrate our approach for estimating an uncertain time-dependent source in a contaminant transport model with an uncertain initial state as secondary uncertainty. Our results indicate that accounting for additional model uncertainty in the experimental design process is crucial.

Submitted 21 June, 2020; originally announced June 2020.

Comments: 22 pages

MSC Class: 65C60; 62K05; 62F15; 35R30

arXiv:2003.11115 [pdf, other]

doi 10.1029/2020GC009059

Advanced Newton Methods for Geodynamical Models of Stokes Flow with Viscoplastic Rheologies

Authors: Johann Rudi, Yu-hsuan Shih, Georg Stadler

Abstract: Strain localization and resulting plasticity and failure play an important role in the evolution of the lithosphere. These phenomena are commonly modeled by Stokes flows with viscoplastic rheologies. The nonlinearities of these rheologies make the numerical solution of the resulting systems challenging, and iterative methods often converge slowly or not at all. Yet accurate solutions are critical for representing the physics. Moreover, for some rheology laws, aspects of solvability are still unknown. We study a basic but representative viscoplastic rheology law. The law involves a yield stress that is independent of the dynamic pressure, referred to as von Mises yield criterion. Two commonly used variants, perfect/ideal and composite viscoplasticity, are compared. We derive both variants from energy minimization principles, and we use this perspective to argue when solutions are unique. We propose a new stress-velocity Newton solution algorithm that treats the stress as an independent variable during the Newton linearization but requires solution only of Stokes systems that are of the usual velocity-pressure form. To study different solution algorithms, we implement 2D and 3D finite element discretizations, and we generate Stokes problems with up to 7 orders of magnitude viscosity contrasts, in which compression or tension results in significant nonlinear localization effects. Comparing the performance of the proposed Newton method with the standard Newton method and the Picard fixed-point method, we observe a significant reduction in the number of iterations and improved stability with respect to problem nonlinearity, mesh refinement, and the polynomial order of the discretization.

Submitted 22 July, 2020; v1 submitted 24 March, 2020; originally announced March 2020.

Comments: To appear in Geochemistry, Geophysics, Geosystems

arXiv:2003.10173 [pdf, other]

Hierarchical Matrix Approximations of Hessians Arising in Inverse Problems Governed by PDEs

Authors: Ilona Ambartsumyan, Wajih Boukaram, Tan Bui-Thanh, Omar Ghattas, David Keyes, Georg Stadler, George Turkiyyah, Stefano Zampini

Abstract: Hessian operators arising in inverse problems governed by partial differential equations (PDEs) play a critical role in delivering efficient, dimension-independent convergence for both Newton solution of deterministic inverse problems, as well as Markov chain Monte Carlo sampling of posteriors in the Bayesian setting. These methods require the ability to repeatedly perform such operations on the Hessian as multiplication with arbitrary vectors, solving linear systems, inversion, and (inverse) square root. Unfortunately, the Hessian is a (formally) dense, implicitly-defined operator that is intractable to form explicitly for practical inverse problems, requiring as many PDE solves as inversion parameters. Low rank approximations are effective when the data contain limited information about the parameters, but become prohibitive as the data become more informative. However, the Hessians for many inverse problems arising in practical applications can be well approximated by matrices that have hierarchically low rank structure. Hierarchical matrix representations promise to overcome the high complexity of dense representations and provide effective data structures and matrix operations that have only log-linear complexity. In this work, we describe algorithms for constructing and updating hierarchical matrix approximations of Hessians, and illustrate them on a number of representative inverse problems involving time-dependent diffusion, advection-dominated transport, frequency domain acoustic wave propagation, and low frequency Maxwell equations, demonstrating up to an order of magnitude speedup compared to globally low rank approximations.

Submitted 23 March, 2020; originally announced March 2020.

arXiv:2003.09318 [pdf, other]

doi 10.1088/1361-6420/abaa30

Bayesian approach to inverse scattering with topological priors

Authors: Ana Carpio, Sergei Iakunin, Georg Stadler

Abstract: We propose a Bayesian inference framework to estimate uncertainties in inverse scattering problems. Given the observed data, the forward model and their uncertainties, we find the posterior distribution over a finite parameter field representing the objects. To construct the prior distribution we use a topological sensitivity analysis. We demonstrate the approach on the Bayesian solution of 2D inverse problems in light and acoustic holography with synthetic data. Statistical information on objects such as their center location, diameter size, orientation, as well as material properties, is extracted by sampling the posterior distribution. Assuming the number of objects is known, comparison of the results obtained by Markov Chain Monte Carlo sampling and by sampling a Gaussian distribution found by linearization about the maximum a posteriori estimate shows reasonable agreement. The latter procedure has low computational cost, which makes it an interesting tool for uncertainty studies in 3D. However, MCMC sampling provides a more complete picture of the posterior distribution and yields multi-modal posterior distributions for problems with larger measurement noise. When the number of objects is unknown, we devise a stochastic model selection framework.

Submitted 16 November, 2020; v1 submitted 20 March, 2020; originally announced March 2020.

Journal ref: Inverse Problems 36, 105001, 2020

arXiv:1912.08915 [pdf, other]

doi 10.1088/1361-6420/ab89c5

Optimal experimental design under irreducible uncertainty for linear inverse problems governed by PDEs

Authors: Karina Koval, Alen Alexanderian, Georg Stadler

Abstract: We present a method for computing A-optimal sensor placements for infinite-dimensional Bayesian linear inverse problems governed by PDEs with irreducible model uncertainties. Here, irreducible uncertainties refer to uncertainties in the model that exist in addition to the parameters in the inverse problem, and that cannot be reduced through observations. Specifically, given a statistical distribution for the model uncertainties, we compute the optimal design that minimizes the expected value of the posterior covariance trace. The expected value is discretized using Monte Carlo leading to an objective function consisting of a sum of trace operators and a binary-inducing penalty. Minimization of this objective requires a large number of PDE solves in each step. To make this problem computationally tractable, we construct a composite low-rank basis using a randomized range finder algorithm to eliminate forward and adjoint PDE solves. We also present a novel formulation of the A-optimal design objective that requires the trace of an operator in the observation rather than the parameter space. The binary structure is enforced using a weighted regularized $\ell_0$-sparsification approach. We present numerical results for inference of the initial condition in a subsurface flow problem with inherent uncertainty in the flow fields and in the initial times.

Submitted 29 March, 2020; v1 submitted 18 December, 2019; originally announced December 2019.

arXiv:1909.11085 [pdf, other]

doi 10.1145/3295500.3356203

Scalable Simulation of Realistic Volume Fraction Red Blood Cell Flows through Vascular Networks

Authors: Libin Lu, Matthew J. Morse, Abtin Rahimian, Georg Stadler, Denis Zorin

Abstract: High-resolution blood flow simulations have potential for developing a better understanding of biophysical phenomena at the microscale, such as vasodilation, vasoconstriction and overall vascular resistance. To this end, we present a scalable platform for the simulation of red blood cell (RBC) flows through complex capillaries by modeling the physical system as a viscous fluid with immersed deformable particles. We describe a parallel boundary integral equation solver for general elliptic partial differential equations, which we apply to Stokes flow through blood vessels. We also detail a parallel collision avoiding algorithm to ensure RBCs and the blood vessel remain contact-free. We have scaled our code on Stampede2 at the Texas Advanced Computing Center up to 34,816 cores. Our largest simulation enforces a contact-free state between four billion surface elements and solves for three billion degrees of freedom on one million RBCs and a blood vessel composed from two million patches.

Submitted 23 September, 2019; originally announced September 2019.

arXiv:1808.05441 [pdf, other]

doi 10.1088/1361-6420/aaf129

A comparative study of structural similarity and regularization for joint inverse problems governed by PDEs

Authors: Benjamin Crestel, Georg Stadler, Omar Ghattas

Abstract: Joint inversion refers to the simultaneous inference of multiple parameter fields from observations of systems governed by single or multiple forward models. In many cases these parameter fields reflect different attributes of a single medium and are thus spatially correlated or structurally similar. By imposing prior information on their spatial correlations via a joint regularization term, we seek to improve the reconstruction of the parameter fields relative to inversion for each field independently. One of the main challenges is to devise a joint regularization functional that conveys the spatial correlations or structural similarity between the fields while at the same time permitting scalable and efficient solvers for the joint inverse problem. We describe several joint regularizations that are motivated by these goals: a cross-gradient and a normalized cross-gradient structural similarity term, the vectorial total variation, and a joint regularization based on the nuclear norm of the gradients. Based on numerical results from three classes of inverse problems with piecewise-homogeneous parameter fields, we conclude that the vectorial total variation functional is preferable to the other methods considered. Besides resulting in good reconstructions in all experiments, it allows for scalable, efficient solvers for joint inverse problems governed by PDE forward models.

Submitted 16 August, 2018; originally announced August 2018.

arXiv:1804.05678 [pdf, other]

Sparse solutions in optimal control of PDEs with uncertain parameters: the linear case

Authors: Chen Li, Georg Stadler

Abstract: We study sparse solutions of optimal control problems governed by PDEs with uncertain coefficients. We propose two formulations, one where the solution is a deterministic control optimizing the mean objective, and a formulation aiming at stochastic controls that share the same sparsity structure. In both formulations, regions where the controls do not vanish can be interpreted as optimal locations for placing control devices. In this paper, we focus on linear PDEs with linearly entering uncertain parameters. Under these assumptions, the deterministic formulation reduces to a problem with known structure, and thus we mainly focus on the stochastic control formulation. Here, shared sparsity is achieved by incorporating the $L^1$-norm of the mean of the pointwise squared controls in the objective. We reformulate the problem using a norm reweighting function that is defined over physical space only and thus helps to avoid approximation of the random space using samples or quadrature. We show that a fixed point algorithm applied to the norm reweighting formulation leads to a variant of the well-studied iterative reweighted least squares (IRLS) algorithm, and we propose a novel preconditioned Newton-conjugate gradient method to speed up the IRLS algorithm. We combine our algorithms with low-rank operator approximations, for which we provide estimates of the truncation error. We carefully examine the computational complexity of the resulting algorithms. The sparsity structure of the optimal controls and the performance of the solution algorithms are studied numerically using control problems governed by the Laplace and Helmholtz equations. In these experiments the Newton variant clearly outperforms the IRLS method.

Submitted 19 November, 2018; v1 submitted 16 April, 2018; originally announced April 2018.

Comments: 25 pages

arXiv:1704.02170 [pdf, ps, other]

A Feynman-Kac Formula Approach for Computing Expectations and Threshold Crossing Probabilities of Non-smooth Stochastic Dynamical Systems

Authors: Laurent Mertz, Georg Stadler, Jonathan Wylie

Abstract: We present a computational alternative to probabilistic simulations for non-smooth stochastic dynamical systems that are prevalent in engineering mechanics. As examples, we target (1) stochastic elasto-plastic problems, which involve transitions between elastic and plastic states, and (2) obstacle problems with noise, which involve discrete impulses due to collisions with an obstacle. We formally introduce a class of partial differential equations related to the Feynman-Kac formula, where the underlying stochastic processes satisfy variational inequalities modelling elasto-plastic and obstacle oscillators. We then focus on solving them numerically. The main challenge in solving these equations is the non-standard boundary conditions which describe the behavior of the underlying process on the boundary. We illustrate how to use our approach to compute expectations and other statistical quantities, such as the asymptotic growth rate of variance in asymptotic formulae for threshold crossing probabilities.

Submitted 22 May, 2019; v1 submitted 7 April, 2017; originally announced April 2017.

Comments: 27 pages, 6 figures, 3 tables

arXiv:1612.02358 [pdf, other]

doi 10.1088/1361-6420/aa6d8e

A-optimal encoding weights for nonlinear inverse problems, with applications to the Helmholtz inverse problem

Authors: Benjamin Crestel, Alen Alexanderian, Georg Stadler, Omar Ghattas

Abstract: The computational cost of solving an inverse problem governed by PDEs, using multiple experiments, increases linearly with the number of experiments. A recently proposed method to decrease this cost uses only a small number of random linear combinations of all experiments for solving the inverse problem. This approach applies to inverse problems where the PDE solution depends linearly on the right-hand side function that models the experiment. As this method is stochastic in essence, the quality of the obtained reconstructions can vary, in particular when only a small number of combinations are used. We develop a Bayesian formulation for the definition and computation of encoding weights that lead to a parameter reconstruction with the least uncertainty. We call these weights A-optimal encoding weights. Our framework applies to inverse problems where the governing PDE is nonlinear with respect to the inversion parameter field. We formulate the problem in infinite dimensions and follow the optimize-then-discretize approach, devoting special attention to the discretization and the choice of numerical methods in order to achieve a computational cost that is independent of the parameter discretization. We elaborate our method for a Helmholtz inverse problem, and derive the adjoint-based expressions for the gradient of the objective function of the optimization problem for finding the A-optimal encoding weights. The proposed method is potentially attractive for real-time monitoring applications, where one can invest the effort to compute optimal weights offline, to later solve an inverse problem repeatedly, over time, at a fraction of the initial cost.

Submitted 27 February, 2017; v1 submitted 7 December, 2016; originally announced December 2016.

arXiv:1610.05280 [pdf, other]

Mitigating the Influence of the Boundary on PDE-based Covariance Operators

Authors: Yair Daon, Georg Stadler

Abstract: Gaussian random fields over infinite-dimensional Hilbert spaces require the definition of appropriate covariance operators. The use of elliptic PDE operators to construct covariance operators allows one to build on fast PDE solvers for manipulations with the resulting covariance and precision operators. However, PDE operators require a choice of boundary conditions, and this choice can have a strong and usually undesired influence on the Gaussian random field. We propose two techniques that ameliorate these boundary effects for large-scale problems. The first approach combines the elliptic PDE operator with a Robin boundary condition, where a varying Robin coefficient is computed from an optimization problem. The second approach normalizes the pointwise variance by rescaling the covariance operator. These approaches can be used individually or can be combined. We study properties of these approaches, and discuss their computational complexity. The performance of our approaches is studied for random fields defined over simple and complex two- and three-dimensional domains.

Submitted 11 December, 2017; v1 submitted 17 October, 2016; originally announced October 2016.

Comments: 19 pages

MSC Class: 62F15; 35R30; 65C50 (Primary) 28C20 (Secondary)

arXiv:1607.03936 [pdf, other]

doi 10.1137/16M108450X

Weighted BFBT Preconditioner for Stokes Flow Problems with Highly Heterogeneous Viscosity

Authors: Johann Rudi, Georg Stadler, Omar Ghattas

Abstract: We present a weighted BFBT approximation (w-BFBT) to the inverse Schur complement of a Stokes system with highly heterogeneous viscosity. When used as part of a Schur complement-based Stokes preconditioner, we observe robust fast convergence for Stokes problems with smooth but highly varying (up to 10 orders of magnitude) viscosities, optimal algorithmic scalability with respect to mesh refinement, and only a mild dependence on the polynomial order of high-order finite element discretizations ($Q_k \times P_{k-1}^{disc}$, order $k \ge 2$). For certain difficult problems, we demonstrate numerically that w-BFBT significantly improves Stokes solver convergence over the widely used inverse viscosity-weighted pressure mass matrix approximation of the Schur complement. In addition, we derive theoretical eigenvalue bounds to prove spectral equivalence of w-BFBT. Using detailed numerical experiments, we discuss modifications to w-BFBT at Dirichlet boundaries that decrease the number of iterations. The overall algorithmic performance of the Stokes solver is governed by the efficacy of w-BFBT as a Schur complement approximation and, in addition, by our parallel hybrid spectral-geometric-algebraic multigrid (HMG) method, which we use to approximate the inverses of the viscous block and variable-coefficient pressure Poisson operators within w-BFBT. Building on the scalability of HMG, our Stokes solver achieves a parallel efficiency of 90% while weak scaling over a more than 600-fold increase from 48 to all 30,000 cores of TACC's Lonestar 5 supercomputer.

Submitted 29 January, 2017; v1 submitted 13 July, 2016; originally announced July 2016.

Comments: To appear in SIAM Journal on Scientific Computing

MSC Class: 65F08; 65F10; 65N55; 65Y05 (Primary); 76D07; 76A05; 65N30 (Secondary)

Journal ref: SIAM Journal on Scientific Computing, 39(5), S272-S297, 2017

arXiv:1602.07592 [pdf, other]

Mean-variance risk-averse optimal control of systems governed by PDEs with random parameter fields using quadratic approximations

Authors: Alen Alexanderian, Noemi Petra, Georg Stadler, Omar Ghattas

Abstract: We present a method for optimal control of systems governed by partial differential equations (PDEs) with uncertain parameter fields. We consider an objective function that involves the mean and variance of the control objective, leading to a risk-averse optimal control problem. To make the problem tractable, we invoke a quadratic Taylor series approximation of the control objective with respect to the uncertain parameter. This enables deriving explicit expressions for the mean and variance of the control objective in terms of its gradients and Hessians with respect to the uncertain parameter. The risk-averse optimal control problem is then formulated as a PDE-constrained optimization problem with constraints given by the forward and adjoint PDEs defining these gradients and Hessians. The expressions for the mean and variance of the control objective under the quadratic approximation involve the trace of the (preconditioned) Hessian and are thus prohibitive to evaluate. To address this, we employ trace estimators that only require a modest number of Hessian-vector products. We illustrate our approach with two problems: the control of a semilinear elliptic PDE with an uncertain boundary source term, and the control of a linear elliptic PDE with an uncertain coefficient field. For the latter problem, we derive adjoint-based expressions for efficient computation of the gradient of the risk-averse objective with respect to the controls. Our method ensures that the cost of computing the risk-averse objective and its gradient with respect to the control, measured in the number of PDE solves, is independent of the (discretized) parameter and control dimensions, and depends only on the number of random vectors employed in the trace estimation. Finally, we present a comprehensive numerical study of an optimal control problem for fluid flow in a porous medium with uncertain permeability field.

Submitted 22 November, 2017; v1 submitted 24 February, 2016; originally announced February 2016.

Comments: 27 pages. Minor revisions. Accepted for publication in SIAM/ASA Journal on Uncertainty Quantification

MSC Class: 60H15; 60H35; 35Q93; 35R60; 65K10

arXiv:1410.5899 [pdf, other]

A Fast and Scalable Method for A-Optimal Design of Experiments for Infinite-dimensional Bayesian Nonlinear Inverse Problems

Authors: Alen Alexanderian, Noemi Petra, Georg Stadler, Omar Ghattas

Abstract: We address the problem of optimal experimental design (OED) for Bayesian nonlinear inverse problems governed by PDEs. The goal is to find a placement of sensors, at which experimental data are collected, so as to minimize the uncertainty in the inferred parameter field. We formulate the OED objective function by generalizing the classical A-optimal experimental design criterion using the expected value of the trace of the posterior covariance. We seek a method that solves the OED problem at a cost (measured in the number of forward PDE solves) that is independent of both the parameter and sensor dimensions. To facilitate this, we construct a Gaussian approximation to the posterior at the maximum a posteriori probability (MAP) point, and use the resulting covariance operator to define the OED objective function. We use randomized trace estimation to compute the trace of this (implicitly defined) covariance operator. The resulting OED problem includes as constraints the PDEs characterizing the MAP point, and the PDEs describing the action of the covariance operator to vectors. The sparsity of the sensor configurations is controlled using sparsifying penalty functions. We elaborate our OED method for the problem of determining the sensor placement to best infer the coefficient of an elliptic PDE. Adjoint methods are used to compute the gradient of the PDE-constrained OED objective function.
We provide numerical results for inference of the permeability field in a porous medium flow problem, and demonstrate that the number of PDE solves required for the evaluation of the OED objective function and its gradient is essentially independent of both the parameter and sensor dimensions. The number of quasi-Newton iterations for computing an OED also exhibits the same dimension invariance properties.
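
Illustrative note (not part of the listing): the abstract above mentions randomized trace estimation for an implicitly defined operator. A minimal sketch of one common variant, a Hutchinson-type estimator with Rademacher probe vectors, might look as follows. The abstract does not say which estimator the authors use, so this choice is an assumption, and the names `hutchinson_trace` and `make_matvec` are hypothetical. The operator is accessed only through its action on vectors, which here stands in for the PDE solves described in the abstract.

```python
import random


def hutchinson_trace(apply_op, dim, num_probes=64, seed=0):
    """Estimate trace(A) using only products A @ z (A is never formed).

    Assumption: Rademacher (+1/-1) probes; the paper's estimator may differ.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_probes):
        # Draw a random probe vector with independent +1/-1 entries.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        az = apply_op(z)
        # z^T A z is an unbiased estimate of trace(A).
        total += sum(zi * azi for zi, azi in zip(z, az))
    return total / num_probes


def make_matvec(matrix):
    """Wrap a small dense matrix as a matrix-free operator (for the demo only)."""
    def matvec(v):
        return [sum(a * x for a, x in zip(row, v)) for row in matrix]
    return matvec


if __name__ == "__main__":
    op = make_matvec([[2.0, 1.0], [1.0, 3.0]])  # exact trace is 5
    print(hutchinson_trace(op, dim=2, num_probes=4000, seed=1))
```

The cost is one operator application per probe vector, independent of how the operator is defined, which is why estimators of this kind suit settings where each application is expensive.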

Submitted 3 November, 2015; v1 submitted 21 October, 2014; originally announced October 2014.

Comments: 30 pages; minor revisions; accepted for publication in SIAM Journal on Scientific Computing

MSC Class: 62K05; 35Q62; 62F15; 35R30; 35Q93; 65C60

arXiv:1410.1221 [pdf, other]

doi 10.1016/j.jcp.2015.04.047

Scalable and efficient algorithms for the propagation of uncertainty from data through inference to prediction for large-scale problems, with application to flow of the Antarctic ice sheet

Authors: Tobin Isaac, Noemi Petra, Georg Stadler, Omar Ghattas

Abstract: The majority of research on efficient and scalable algorithms in computational science and engineering has focused on the forward problem: given parameter inputs, solve the governing equations to determine output quantities of interest. In contrast, here we consider the broader question: given a (large-scale) model containing uncertain parameters, (possibly) noisy observational data, and a prediction quantity of interest, how do we construct efficient and scalable algorithms to (1) infer the model parameters from the data (the deterministic inverse problem), (2) quantify the uncertainty in the inferred parameters (the Bayesian inference problem), and (3) propagate the resulting uncertain parameters through the model to issue predictions with quantified uncertainties (the forward uncertainty propagation problem)? We present efficient and scalable algorithms for this end-to-end, data-to-prediction process under the Gaussian approximation and in the context of modeling the flow of the Antarctic ice sheet and its effect on sea level. The ice is modeled as a viscous, incompressible, creeping, shear-thinning fluid. The observational data come from InSAR satellite measurements of surface ice flow velocity, and the uncertain parameter field to be inferred is the basal sliding parameter. The prediction quantity of interest is the present-day ice mass flux from the Antarctic continent to the ocean.
We show that the work required for executing this data-to-prediction process is independent of the state dimension, parameter dimension, data dimension, and number of processor cores. The key to achieving this dimension independence is to exploit the fact that the observational data typically provide only sparse information on model parameters. This property can be exploited to construct a low rank approximation of the linearized parameter-to-observable map.

Submitted 1 September, 2015; v1 submitted 5 October, 2014; originally announced October 2014.

MSC Class: 35Q62; 62F15; 35R30; 35Q93; 65C60; 49M15; 86A40

arXiv:1406.6573 [pdf, other]

doi 10.1137/140974407

Solution of nonlinear Stokes equations discretized by high-order finite elements on nonconforming and anisotropic meshes, with application to ice sheet dynamics

Authors: Tobin Isaac, Georg Stadler, Omar Ghattas

Abstract: Motivated by the need for efficient and accurate simulation of the dynamics of the polar ice sheets, we design high-order finite element discretizations and scalable solvers for the solution of nonlinear incompressible Stokes equations. We focus on power-law, shear thinning rheologies used in modeling ice dynamics and other geophysical flows. We use nonconforming hexahedral meshes and the conforming inf-sup stable finite element velocity-pressure pairings $\mathbb{Q}_k\times \mathbb{Q}^\text{disc}_{k-2}$ or $\mathbb{Q}_k \times \mathbb{P}^\text{disc}_{k-1}$. To solve the nonlinear equations, we propose a Newton-Krylov method with a block upper triangular preconditioner for the linearized Stokes systems. The diagonal blocks of this preconditioner are sparse approximations of the (1,1)-block and of its Schur complement. The (1,1)-block is approximated using linear finite elements based on the nodes of the high-order discretization, and the application of its inverse is approximated using algebraic multigrid with an incomplete factorization smoother. This preconditioner is designed to be efficient on anisotropic meshes, which are necessary to match the high aspect ratio domains typical for ice sheets. We develop and make available extensions to two libraries---a hybrid meshing scheme for the p4est parallel AMR library, and a modified smoothed aggregation scheme for PETSc---to improve their support for solving PDEs in high aspect ratio domains.
In a numerical study, we find that our solver yields fast convergence that is independent of the element aspect ratio, the occurrence of nonconforming interfaces, and of mesh refinement, and that depends only weakly on the polynomial finite element order. We simulate the ice flow in a realistic description of the Antarctic ice sheet derived from field data, and study the parallel scalability of our solver for problems with up to 383M unknowns.

Submitted 9 July, 2015; v1 submitted 25 June, 2014; originally announced June 2014.

Comments: 31 pages

arXiv:1402.5938 [pdf, other]

Comparison of Multigrid Algorithms for High-order Continuous Finite Element Discretizations

Authors: Hari Sundar, Georg Stadler, George Biros

Abstract: We present a comparison of different multigrid approaches for the solution of systems arising from high-order continuous finite element discretizations of elliptic partial differential equations on complex geometries. We consider the pointwise Jacobi, the Chebyshev-accelerated Jacobi and the symmetric successive over-relaxation (SSOR) smoothers, as well as elementwise block Jacobi smoothing. Three approaches for the multigrid hierarchy are compared: 1) high-order $h$-multigrid, which uses high-order interpolation and restriction between geometrically coarsened meshes; 2) $p$-multigrid, in which the polynomial order is reduced while the mesh remains unchanged, and the interpolation and restriction incorporate the different-order basis functions; and 3), a first-order approximation multigrid preconditioner constructed using the nodes of the high-order discretization. This latter approach is often combined with algebraic multigrid for the low-order operator and is attractive for high-order discretizations on unstructured meshes, where geometric coarsening is difficult. Based on a simple performance model, we compare the computational cost of the different approaches. Using scalar test problems in two and three dimensions with constant and varying coefficients, we compare the performance of the different multigrid approaches for polynomial orders up to 16. Overall, both $h$- and $p$-multigrid work well; the first-order approximation is less efficient. For constant coefficients, all smoothers work well.
For variable coefficients, Chebyshev and SSOR smoothing outperforms Jacobi smoothing. While all of the tested methods converge in a mesh-independent number of iterations, none of them behaves completely independent of the polynomial order. When multigrid is used as a preconditioner in a Krylov method, the iteration number decreases significantly compared to using multigrid as a solver.
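
Illustrative note (not part of the listing): the pointwise Jacobi smoother named in the abstract above can be sketched on a toy problem. The example below applies damped Jacobi to the standard 1D finite-difference Laplacian with stencil (-1, 2, -1) and homogeneous Dirichlet boundaries. This low-order model problem, the damping factor 2/3, and all function names are assumptions made for illustration; the paper itself treats high-order finite element discretizations.

```python
def apply_laplacian_1d(u):
    """Apply the (-1, 2, -1) stencil with zero Dirichlet boundary values."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * u[i] - left - right)
    return out


def residual_norm(u, f):
    """Euclidean norm of the residual f - A u."""
    au = apply_laplacian_1d(u)
    return sum((fi - ai) ** 2 for fi, ai in zip(f, au)) ** 0.5


def damped_jacobi(u, f, sweeps=1, omega=2.0 / 3.0):
    """Run damped Jacobi sweeps: u <- u + omega * D^{-1} (f - A u), D = 2 I."""
    for _ in range(sweeps):
        au = apply_laplacian_1d(u)
        u = [ui + omega * (fi - ai) / 2.0 for ui, fi, ai in zip(u, f, au)]
    return u


if __name__ == "__main__":
    n = 15
    f = [0.0] * n
    # Highly oscillatory initial error: the kind of component a smoother targets.
    u0 = [(-1.0) ** i for i in range(n)]
    u1 = damped_jacobi(u0, f, sweeps=1)
    print(residual_norm(u0, f), residual_norm(u1, f))
```

A single sweep sharply reduces the oscillatory residual, while smooth error components decay slowly; in a multigrid method those are handled by the coarser levels.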

Submitted 6 March, 2015; v1 submitted 24 February, 2014; originally announced February 2014.

arXiv:1311.6900 [pdf, ps, other]

Discretely exact derivatives for hyperbolic PDE-constrained optimization problems discretized by the discontinuous Galerkin method

Authors: Lucas C. Wilcox, Georg Stadler, Tan Bui-Thanh, Omar Ghattas

Abstract: This paper discusses the computation of derivatives for optimization problems governed by linear hyperbolic systems of partial differential equations (PDEs) that are discretized by the discontinuous Galerkin (dG) method. An efficient and accurate computation of these derivatives is important, for instance, in inverse problems and optimal control problems. This computation is usually based on an adjoint PDE system, and the question addressed in this paper is how the discretization of this adjoint system should relate to the dG discretization of the hyperbolic state equation. Adjoint-based derivatives can either be computed before or after discretization; these two options are often referred to as the optimize-then-discretize and discretize-then-optimize approaches. We discuss the relation between these two options for dG discretizations in space and Runge-Kutta time integration. Discretely exact discretizations for several hyperbolic optimization problems are derived, including the advection equation, Maxwell's equations and the coupled elastic-acoustic wave equation. We find that the discrete adjoint equation inherits a natural dG discretization from the discretization of the state equation and that the expressions for the discretely exact gradient often have to take into account contributions from element faces. For the coupled elastic-acoustic wave equation, the correctness and accuracy of our derivative expressions are illustrated by comparisons with finite difference gradients.
The results show that a straightforward discretization of the continuous gradient differs from the discretely exact gradient, and thus is not consistent with the discretized objective. This inconsistency may cause difficulties in the convergence of gradient based algorithms for solving optimization problems.
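
Illustrative note (not part of the listing): the abstract above validates derivative expressions by comparison with finite difference gradients. A generic version of such a check is sketched below. The objective is a hypothetical quadratic chosen so that its gradient is known in closed form; it is not the wave-equation objective of the paper, and all function names are assumptions.

```python
def objective(x):
    """Hypothetical smooth objective: 0.5 * |x|^2 + x[0] * x[1]."""
    return 0.5 * sum(xi * xi for xi in x) + x[0] * x[1]


def analytic_gradient(x):
    """Closed-form gradient of the objective above."""
    grad = list(x)
    grad[0] += x[1]
    grad[1] += x[0]
    return grad


def finite_difference_gradient(f, x, h=1e-6):
    """Central finite-difference approximation of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad


def max_gradient_error(f, grad_f, x, h=1e-6):
    """Largest componentwise gap between a candidate gradient and the FD one."""
    fd = finite_difference_gradient(f, x, h)
    return max(abs(a - b) for a, b in zip(grad_f(x), fd))


if __name__ == "__main__":
    x = [1.0, -2.0, 0.5]
    print(max_gradient_error(objective, analytic_gradient, x))
```

A gradient that is consistent with the discretized objective passes this check to roughly the finite-difference truncation and roundoff error; a gradient derived from a different discretization generally leaves a gap that does not shrink as `h` is reduced.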

Submitted 27 November, 2013; originally announced November 2013.

MSC Class: 65M60; 65M70; 65M32

arXiv:1308.6221 [pdf, other]

A computational framework for infinite-dimensional Bayesian inverse problems: Part II. Stochastic Newton MCMC with application to ice sheet flow inverse problems

Authors: Noemi Petra, James Martin, Georg Stadler, Omar Ghattas

Abstract: We address the numerical solution of infinite-dimensional inverse problems in the framework of Bayesian inference. In the Part I companion to this paper (arXiv.org:1308.1313), we considered the linearized infinite-dimensional inverse problem. Here in Part II, we relax the linearization assumption and consider the fully nonlinear infinite-dimensional inverse problem using a Markov chain Monte Carlo (MCMC) sampling method. To address the challenges of sampling high-dimensional pdfs arising from Bayesian inverse problems governed by PDEs, we build on the stochastic Newton MCMC method. This method exploits problem structure by taking as a proposal density a local Gaussian approximation of the posterior pdf, whose construction is made tractable by invoking a low-rank approximation of its data misfit component of the Hessian. Here we introduce an approximation of the stochastic Newton proposal in which we compute the low-rank-based Hessian at just the MAP point, and then reuse this Hessian at each MCMC step. We compare the performance of the proposed method to the original stochastic Newton MCMC method and to an independence sampler. The comparison of the three methods is conducted on a synthetic ice sheet inverse problem. For this problem, the stochastic Newton MCMC method with a MAP-based Hessian converges at least as rapidly as the original stochastic Newton MCMC method, but is far cheaper since it avoids recomputing the Hessian at each step.
On the other hand, it is more expensive per sample than the independence sampler; however, its convergence is significantly more rapid, and thus overall it is much cheaper. Finally, we present extensive analysis and interpretation of the posterior distribution, and classify directions in parameter space based on the extent to which they are informed by the prior or the observations.

Submitted 11 April, 2014; v1 submitted 28 August, 2013; originally announced August 2013.

Comments: 31 pages

MSC Class: 35Q62; 62F15; 35R30; 35Q93; 65C40; 65C60; 49M15; 86A40

arXiv:1308.4084 [pdf, other]

A-optimal design of experiments for infinite-dimensional Bayesian linear inverse problems with regularized $\ell_0$-sparsification

Authors: Alen Alexanderian, Noemi Petra, Georg Stadler, Omar Ghattas

Abstract: We present an efficient method for computing A-optimal experimental designs for infinite-dimensional Bayesian linear inverse problems governed by partial differential equations (PDEs). Specifically, we address the problem of optimizing the location of sensors (at which observational data are collected) to minimize the uncertainty in the parameters estimated by solving the inverse problem, where the uncertainty is expressed by the trace of the posterior covariance. Computing optimal experimental designs (OEDs) is particularly challenging for inverse problems governed by computationally expensive PDE models with infinite-dimensional (or, after discretization, high-dimensional) parameters. To alleviate the computational cost, we exploit the problem structure and build a low-rank approximation of the parameter-to-observable map, preconditioned with the square root of the prior covariance operator. This relieves our method from expensive PDE solves when evaluating the optimal experimental design objective function and its derivatives. Moreover, we employ a randomized trace estimator for efficient evaluation of the OED objective function. We control the sparsity of the sensor configuration by employing a sequence of penalty functions that successively approximate the $\ell_0$-"norm"; this results in binary designs that characterize optimal sensor locations. We present numerical results for inference of the initial condition from spatio-temporal observations in a time-dependent advection-diffusion problem in two and three space dimensions.
We find that an optimal design can be computed at a cost, measured in number of forward PDE solves, that is independent of the parameter and sensor dimensions. We demonstrate numerically that $\ell_0$-sparsified experimental designs obtained via a continuation method outperform $\ell_1$-sparsified designs.

Submitted 27 May, 2014; v1 submitted 19 August, 2013; originally announced August 2013.

Comments: 27 pages, accepted for publication in SIAM Journal on Scientific Computing

MSC Class: 62K05; 35Q62; 62F15; 35R30; 35Q93; 65C60

arXiv:1308.1313 [pdf, other]

A computational framework for infinite-dimensional Bayesian inverse problems. Part I: The linearized case, with application to global seismic inversion

Authors: Tan Bui-Thanh, Omar Ghattas, James Martin, Georg Stadler

Abstract: We present a computational framework for estimating the uncertainty in the numerical solution of linearized infinite-dimensional statistical inverse problems. We adopt the Bayesian inference formulation: given observational data and their uncertainty, the governing forward problem and its uncertainty, and a prior probability distribution describing uncertainty in the parameter field, find the posterior probability distribution over the parameter field. The prior must be chosen appropriately in order to guarantee well-posedness of the infinite-dimensional inverse problem and facilitate computation of the posterior. Furthermore, straightforward discretizations may not lead to convergent approximations of the infinite-dimensional problem. And finally, solution of the discretized inverse problem via explicit construction of the covariance matrix is prohibitive due to the need to solve the forward problem as many times as there are parameters. Our computational framework builds on the infinite-dimensional formulation proposed by Stuart (A. M. Stuart, Inverse problems: A Bayesian perspective, Acta Numerica, 19 (2010), pp. 451-559), and incorporates a number of components aimed at ensuring a convergent discretization of the underlying infinite-dimensional inverse problem.
The framework additionally incorporates algorithms for manipulating the prior, constructing a low rank approximation of the data-informed component of the posterior covariance operator, and exploring the posterior that together ensure scalability of the entire framework to very high parameter dimensions. We demonstrate this computational framework on the Bayesian solution of an inverse problem in 3D global seismic wave propagation with hundreds of thousands of parameters.

Submitted 6 August, 2013; originally announced August 2013.

Comments: 30 pages; to appear in SIAM Journal on Scientific Computing

MSC Class: 35Q62; 62F15; 35R30; 35Q93; 65C60; 35L05

arXiv:cond-mat/0403497 [pdf, ps, other]

doi 10.1103/PhysRevLett.92.196801

Quantum control of electron--phonon scatterings in artificial atoms

Authors: Ulrich Hohenester, Georg Stadler

Abstract: The phonon-induced dephasing dynamics in optically excited semiconductor quantum dots is studied within the frameworks of the independent Boson model and optimal control. We show that appropriate tailoring of laser pulses allows a complete control of the optical excitation despite the phonon dephasing, a finding in marked contrast to other environment couplings.

Submitted 19 March, 2004; originally announced March 2004.

Comments: to appear in Phys. Rev. Lett

arXiv:cond-mat/0209513 [pdf, ps, other]

doi 10.1103/PhysRevA.66.053811

Optimal quantum control in nanostructures: Theory and application to generic three-level system

Authors: Alfio Borzi, Georg Stadler, Ulrich Hohenester

Abstract: Coherent carrier control in quantum nanostructures is studied within the framework of Optimal Control. We develop a general solution scheme for the optimization of an external control (e.g., lasers pulses), which allows to channel the system's wavefunction between two given states in its most efficient way; physically motivated constraints, such as limited laser resources or population suppression of certain states, can be accounted for through a general cost functional. Using a generic three-level scheme for the quantum system, we demonstrate the applicability of our approach and identify the pertinent calculation and convergence parameters.

Submitted 23 September, 2002; originally announced September 2002.

Comments: 7 pages; to appear in Phys. Rev. A

Journal ref: Phys. Rev. A 66, 053811 (2002).

Showing 1–40 of 40 results for author: Stadler, G