-
Modeling sea ice in the marginal ice zone as a dense granular flow with rheology inferred from a discrete element model
Authors:
Gonzalo G. de Diego,
Mukund Gupta,
Skylar A. Gering,
Rohaiz Haris,
Georg Stadler
Abstract:
The marginal ice zone (MIZ) represents the periphery of the sea ice cover. Here, the macroscale behavior of the sea ice results from collisions and enduring contact between ice floes. This configuration closely resembles that of dense granular flows, which have been modeled successfully with the $μ(I)$ rheology. Here, we present a continuous model based on the $μ(I)$ rheology which treats sea ice as a compressible fluid, with the local sea ice concentration given by a dilatancy function $Φ(I)$. We infer expressions for $μ(I)$ and $Φ(I)$ from a discrete element method (DEM) which considers polygonal-shaped ice floes. We do this by driving the sea ice with a one-dimensional shearing ocean current. The resulting continuous model is a nonlinear system of equations with the sea ice velocity, local concentration, and pressure as unknowns. The rheology is given by the sum of a plastic and a viscous term. In the context of a periodic patch of ocean, which is effectively a one dimensional problem, and under steady conditions, we prove this system to be well-posed, present a numerical algorithm for solving it, and compare its solutions to those of the DEM. These comparisons demonstrate the continuous model's ability to capture most of the DEM's results accurately. The continuous model is particularly accurate for ocean currents faster than 0.25 m/s; however, for low concentrations and slow ocean currents, the continuous model is less effective in capturing the DEM results. In the latter case, the lack of accuracy of the continuous model is found to be accompanied by the breakdown of a balance between the average shear stress and the integrated ocean drag extracted from the DEM.
Submitted 13 May, 2024;
originally announced May 2024.
-
Sensitivity Analysis of the Information Gain in Infinite-Dimensional Bayesian Linear Inverse Problems
Authors:
Abhijit Chowdhary,
Shanyin Tong,
Georg Stadler,
Alen Alexanderian
Abstract:
We study the sensitivity of infinite-dimensional Bayesian linear inverse problems governed by partial differential equations (PDEs) with respect to modeling uncertainties. In particular, we consider derivative-based sensitivity analysis of the information gain, as measured by the Kullback-Leibler divergence from the posterior to the prior distribution. To facilitate this, we develop a fast and accurate method for computing derivatives of the information gain with respect to auxiliary model parameters. Our approach combines low-rank approximations, adjoint-based eigenvalue sensitivity analysis, and post-optimal sensitivity analysis. The proposed approach also paves the way for global sensitivity analysis by computing derivative-based global sensitivity measures. We illustrate different aspects of the proposed approach using an inverse problem governed by a scalar linear elliptic PDE, and an inverse problem governed by the three-dimensional equations of linear elasticity, which is motivated by the inversion of the fault-slip field after an earthquake.
Submitted 16 May, 2024; v1 submitted 25 October, 2023;
originally announced October 2023.
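For readers unfamiliar with the information gain mentioned in the abstract above, the following is a minimal sketch for a scalar linear Gaussian toy problem. It is not the paper's method. It only checks that the Kullback-Leibler divergence from posterior to prior, computed directly, agrees with the standard spectral expression in terms of the eigenvalues of the prior-preconditioned data-misfit Hessian. All names (`kl_gaussians_1d`, `information_gain`, the toy parameters) are hypothetical and chosen for illustration.

```python
import math


def kl_gaussians_1d(m1, v1, m0, v0):
    """KL( N(m1, v1) || N(m0, v0) ) for scalar Gaussians with variances v1, v0."""
    return 0.5 * (v1 / v0 - 1.0 + (m1 - m0) ** 2 / v0 + math.log(v0 / v1))


def information_gain(eigs, mean_shift_sq):
    """Spectral form of the posterior-to-prior KL divergence.

    eigs: eigenvalues of the prior-preconditioned data-misfit Hessian.
    mean_shift_sq: squared distance between posterior and prior mean,
                   measured in the prior-precision-weighted norm.
    """
    spectral_part = sum(math.log1p(lam) - lam / (1.0 + lam) for lam in eigs)
    return 0.5 * (spectral_part + mean_shift_sq)


# Toy problem (assumed for illustration): y = g * u + noise,
# prior u ~ N(0, prior_var), noise ~ N(0, noise_var).
g, prior_var, noise_var, y = 2.0, 1.5, 0.1, 0.7

lam = g * g * prior_var / noise_var          # single Hessian eigenvalue
post_var = prior_var / (1.0 + lam)           # posterior variance
post_mean = post_var * g * y / noise_var     # posterior mean

direct = kl_gaussians_1d(post_mean, post_var, 0.0, prior_var)
spectral = information_gain([lam], post_mean ** 2 / prior_var)
print(direct, spectral)
```

In higher dimensions the same spectral expression is what makes low-rank approximations attractive, since only the dominant eigenvalues contribute noticeably to the sum.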
-
Scalable Methods for Computing Sharp Extreme Event Probabilities in Infinite-Dimensional Stochastic Systems
Authors:
Timo Schorlepp,
Shanyin Tong,
Tobias Grafke,
Georg Stadler
Abstract:
We introduce and compare computational techniques for sharp extreme event probability estimates in stochastic differential equations with small additive Gaussian noise. In particular, we focus on strategies that are scalable, i.e. their efficiency does not degrade upon temporal and possibly spatial refinement. For that purpose, we extend algorithms based on the Laplace method for estimating the probability of an extreme event to infinite-dimensional path space. The method estimates the limiting exponential scaling using a single realization of the random variable, the large deviation minimizer. Finding this minimizer amounts to solving an optimization problem governed by a differential equation. The probability estimate becomes sharp when it additionally includes prefactor information, which necessitates computing the determinant of a second derivative operator to evaluate a Gaussian integral around the minimizer. We present an approach in infinite dimensions based on Fredholm determinants, and develop numerical algorithms to compute these determinants efficiently for the high-dimensional systems that arise upon discretization. We also give an interpretation of this approach using Gaussian process covariances and transition tubes. An example model problem, for which we provide an open-source Python implementation, is used throughout the paper to illustrate all methods discussed. To study the performance of the methods, we consider examples of stochastic differential and stochastic partial differential equations, including the randomly forced incompressible three-dimensional Navier-Stokes equations.
Submitted 22 November, 2023; v1 submitted 21 March, 2023;
originally announced March 2023.
-
Hierarchical off-diagonal low-rank approximation of Hessians in inverse problems, with application to ice sheet model initialization
Authors:
Tucker Hartland,
Georg Stadler,
Mauro Perego,
Kim Liegeois,
Noemi Petra
Abstract:
Obtaining lightweight and accurate approximations of Hessian applies in inverse problems governed by partial differential equations (PDEs) is an essential task to make both deterministic and Bayesian statistical large-scale inverse problems computationally tractable. The $\mathcal{O}(N^{3})$ computational complexity of dense linear algebraic routines, such as those needed for sampling from Gaussian proposal distributions and for Newton solves by direct linear methods, can be reduced to log-linear complexity by utilizing hierarchical off-diagonal low-rank (HODLR) matrix approximations. In this work, we show that a class of Hessians that arise from inverse problems governed by PDEs are well approximated by the HODLR matrix format. In particular, we study inverse problems governed by PDEs that model the instantaneous viscous flow of ice sheets. In these problems, we seek a spatially distributed basal sliding parameter field such that the flow predicted by the ice sheet model is consistent with ice sheet surface velocity observations. We demonstrate the use of HODLR approximation by efficiently generating Hessian approximations that allow fast generation of samples from a Gaussianized posterior proposal distribution. Computational studies are performed which illustrate ice sheet problem regimes for which the Gauss-Newton data-misfit Hessian is more efficiently approximated by the HODLR matrix format than the low-rank (LR) format. We then demonstrate that HODLR approximations can be favorable, when compared to global low-rank approximations, for large-scale problems by studying the data-misfit Hessian associated with inverse problems governed by the Stokes flow model on the Humboldt glacier and Greenland ice sheets.
Submitted 9 January, 2023;
originally announced January 2023.
-
Direct stellarator coil optimization for nested magnetic surfaces with precise quasi-symmetry
Authors:
Andrew Giuliani,
Florian Wechsung,
Antoine Cerfon,
Matt Landreman,
Georg Stadler
Abstract:
We present a robust optimization algorithm for the design of electromagnetic coils that generate vacuum magnetic fields with nested flux surfaces and precise quasi-symmetry. The method is based on a bilevel optimization problem, where the outer coil optimization is constrained by a set of inner least-squares optimization problems whose solutions describe magnetic surfaces. The outer optimization objective targets coils that generate a field with nested magnetic surfaces and good quasi-symmetry. The inner optimization problems identify magnetic surfaces when they exist, and approximate surfaces in the presence of magnetic islands or chaos. We show that this formulation can be used to heal islands and chaos, thus producing coils that result in magnetic fields with precise quasi-symmetry. We show that the method can be initialized with coils from the traditional two-stage coil design process, as well as coils from a near-axis expansion optimization. We present a numerical example where island chains are healed and quasi-symmetry is optimized up to surfaces with aspect ratio 6. Another numerical example illustrates that the aspect ratio of nested flux surfaces with optimized quasi-symmetry can be decreased from 6 to approximately 4. The last example shows that our approach is robust and can be cold-started using coils from a near-axis expansion optimization.
Submitted 13 March, 2023; v1 submitted 6 October, 2022;
originally announced October 2022.
-
Large deviation theory-based adaptive importance sampling for rare events in high dimensions
Authors:
Shanyin Tong,
Georg Stadler
Abstract:
We propose a method for the accurate estimation of rare event or failure probabilities for expensive-to-evaluate numerical models in high dimensions. The proposed approach combines ideas from large deviation theory and adaptive importance sampling. The importance sampler uses a cross-entropy method to find an optimal Gaussian biasing distribution, and reuses all samples made throughout the process for both the target probability estimation and the update of the biasing distributions. Large deviation theory is used to find a good initial biasing distribution through the solution of an optimization problem. Additionally, it is used to identify a low-dimensional subspace that is most informative of the rare event probability. This subspace is used for the cross-entropy method, which is known to lose efficiency in higher dimensions. The proposed method does not require smoothing of indicator functions nor does it involve numerical tuning parameters. We compare the method with a state-of-the-art cross-entropy-based importance sampling scheme using three examples: a high-dimensional failure probability estimation benchmark, a problem governed by a diffusion equation, and a tsunami problem governed by the time-dependent shallow water system in one spatial dimension.
Submitted 25 March, 2023; v1 submitted 13 September, 2022;
originally announced September 2022.
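As a rough illustration of the idea in the abstract above of using large deviation theory to pick a biasing distribution, here is a minimal one-dimensional sketch. It is plain importance sampling with a fixed mean shift, not the adaptive cross-entropy scheme of the paper. For $X \sim N(0,1)$ and the event $\{X > a\}$, the rate function $x^2/2$ is minimized over the event at $x = a$, so samples are drawn from $N(a, 1)$ and reweighted. The function name and parameters are hypothetical.

```python
import math
import random


def shifted_importance_sampling(threshold, n_samples, rng):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling from N(threshold, 1)."""
    # Minimizer of the rate function x^2 / 2 over the event {x >= threshold}.
    shift = threshold
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(shift, 1.0)
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(shift, 1), evaluated at x.
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n_samples


rng = random.Random(0)
estimate = shifted_importance_sampling(4.0, 20000, rng)
exact = 0.5 * math.erfc(4.0 / math.sqrt(2.0))  # closed-form Gaussian tail
print(estimate, exact)
```

With direct Monte Carlo, 20000 samples would most likely contain no event at all at this threshold, which is why the shift toward the minimizer matters.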
-
Stellarator coil optimization supporting multiple magnetic configurations
Authors:
Brandon F Lee,
Elizabeth J Paul,
Georg Stadler,
Matt Landreman
Abstract:
We present a technique that can be used to design stellarators with a high degree of experimental flexibility. For our purposes, flexibility is defined by the range of values the rotational transform can take on the magnetic axis of the vacuum field while maintaining satisfactory quasisymmetry. We show that accounting for configuration flexibility during the modular coil design improves flexibility beyond that attained by previous methods. Careful placement of planar control coils and the incorporation of an integrability objective enhance the quasisymmetry and nested flux surface volume of each configuration. We show that it is possible to achieve flexibility, quasisymmetry, and nested flux surface volume to reasonable degrees with a relatively simple coil set through an NCSX-like example. This example coil design is optimized to achieve three rotational transform targets and nested flux surface volumes in each magnetic configuration larger than the NCSX design plasma volume. Our work suggests that there is a tradeoff between flexibility, quasisymmetry, and volume of nested flux surfaces.
Submitted 4 November, 2022; v1 submitted 1 August, 2022;
originally announced August 2022.
-
Robust and efficient primal-dual Newton-Krylov solvers for viscous-plastic sea-ice models
Authors:
Yu-hsuan Shih,
Carolin Mehlmann,
Martin Losch,
Georg Stadler
Abstract:
We present a Newton-Krylov solver for a viscous-plastic sea-ice model. This constitutive relation is commonly used in climate models to describe the material properties of sea ice. Due to the strong nonlinearity introduced by the material law in the momentum equation, the development of fast, robust and scalable solvers is still a substantial challenge. In this paper, we propose a novel primal-dual Newton linearization for the implicitly-in-time discretized momentum equation. Compared to existing methods, it converges faster and more robustly with respect to mesh refinement, and thus enables numerically converged sea-ice simulations at high resolutions. Combined with an algebraic multigrid-preconditioned Krylov method for the linearized systems, which contain strongly varying coefficients, the resulting solver scales well and can be used in parallel. We present experiments for two challenging test problems and study solver performance for problems with up to 8.4 million spatial unknowns.
Submitted 15 September, 2022; v1 submitted 22 April, 2022;
originally announced April 2022.
-
Stochastic and a posteriori optimization to mitigate coil manufacturing errors in stellarator design
Authors:
Florian Wechsung,
Andrew Giuliani,
Matt Landreman,
Antoine Cerfon,
Georg Stadler
Abstract:
It was recently shown in [Wechsung et al., Proc. Natl. Acad. Sci. USA, 2022] that there exist electromagnetic coils that generate magnetic fields which are excellent approximations to quasi-symmetric fields and have very good particle confinement properties. Using a Gaussian process-based model for coil perturbations, we investigate the impact of manufacturing errors on the performance of these coils. We show that even fairly small errors result in noticeable performance degradation. While stochastic optimization yields minor improvements, it is not able to mitigate these errors significantly. As an alternative to stochastic optimization, we then formulate a new optimization problem for computing optimal adjustments of the coil positions and currents without changing the shapes of the coils. These a posteriori adjustments are able to reduce the impact of coil errors by an order of magnitude, providing a new perspective for dealing with manufacturing tolerances in stellarator design.
Submitted 30 July, 2022; v1 submitted 18 March, 2022;
originally announced March 2022.
-
Direct computation of magnetic surfaces in Boozer coordinates and coil optimization for quasi-symmetry
Authors:
Andrew Giuliani,
Florian Wechsung,
Matt Landreman,
Georg Stadler,
Antoine Cerfon
Abstract:
We propose a new method to compute magnetic surfaces that are parametrized in Boozer coordinates for vacuum magnetic fields. We also propose a measure for quasi-symmetry on the computed surfaces and use it to design coils that generate a magnetic field that is quasi-symmetric on those surfaces. The rotational transform of the field and complexity measures for the coils are also controlled in the design problem.
Using an adjoint approach, we are able to obtain analytic derivatives for this optimization problem, yielding an efficient gradient-based algorithm. Starting from an initial coil set that presents nested magnetic surfaces for a large fraction of the volume, our method converges rapidly to coil systems generating fields with excellent quasi-symmetry and low particle losses. In particular, for low-complexity coils, we are able to significantly improve the performance compared to coils obtained from the standard two-stage approach, e.g., reduce losses of fusion-produced alpha particles born at half-radius from $17.7\%$ to $6.6\%$. We also demonstrate 16-coil configurations with alpha loss below $1\%$ and neoclassical transport magnitude $ε_{\mathrm{eff}}^{3/2}$ less than approximately $5\times 10^{-9}$.
Submitted 29 April, 2022; v1 submitted 7 March, 2022;
originally announced March 2022.
-
A gradient-free subspace-adjusting ensemble sampler for infinite-dimensional Bayesian inverse problems
Authors:
Matthew M. Dunlop,
Georg Stadler
Abstract:
Sampling of sharp posteriors in high dimensions is a challenging problem, especially when gradients of the likelihood are unavailable. In low to moderate dimensions, affine-invariant methods, a class of ensemble-based gradient-free methods, have found success in sampling concentrated posteriors. However, the number of ensemble members must exceed the dimension of the unknown state in order for the correct distribution to be targeted. Conversely, the preconditioned Crank-Nicolson (pCN) algorithm succeeds at sampling in high dimensions, but samples become highly correlated when the posterior differs significantly from the prior. In this article we combine the above methods in two different ways as an attempt to find a compromise. The first method involves inflating the proposal covariance in pCN with that of the current ensemble, whilst the second performs approximately affine-invariant steps on a continually adapting low-dimensional subspace, while using pCN on its orthogonal complement.
Submitted 22 February, 2022;
originally announced February 2022.
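The preconditioned Crank-Nicolson (pCN) algorithm named in the abstract above can be sketched in a few lines. This is the standard textbook pCN sampler for a centered Gaussian prior, not either of the combined ensemble methods the paper proposes. The function name, the toy likelihood, and the step size `beta` are assumptions made for illustration.

```python
import math
import random


def pcn_chain(neg_log_like, sample_prior, u0, beta, n_steps, rng):
    """Standard pCN sampler for a posterior with a centered Gaussian prior.

    neg_log_like: maps a state (list of floats) to the negative log-likelihood.
    sample_prior: draws one sample from the centered Gaussian prior using rng.
    beta: step size in (0, 1]; small beta gives highly correlated samples.
    """
    u = list(u0)
    phi_u = neg_log_like(u)
    scale = math.sqrt(1.0 - beta * beta)
    chain, accepted = [], 0
    for _ in range(n_steps):
        xi = sample_prior(rng)
        # The proposal preserves the prior, so the prior cancels in the ratio.
        v = [scale * ui + beta * xii for ui, xii in zip(u, xi)]
        phi_v = neg_log_like(v)
        # Accept with probability min(1, exp(phi_u - phi_v)).
        if math.log(1.0 - rng.random()) < phi_u - phi_v:
            u, phi_u = v, phi_v
            accepted += 1
        chain.append(list(u))
    return chain, accepted / n_steps


# Toy example: prior N(0, 1), one observation y = 1 with unit noise variance,
# so the exact posterior is N(0.5, 0.5).
rng = random.Random(0)
chain, rate = pcn_chain(
    lambda u: 0.5 * (u[0] - 1.0) ** 2,
    lambda r: [r.gauss(0.0, 1.0)],
    [0.0], 0.5, 20000, rng,
)
mean = sum(state[0] for state in chain) / len(chain)
print(mean, rate)
```

Because the acceptance ratio involves only the likelihood, the scheme remains well defined as the discretization is refined, which is the dimension robustness the abstract refers to.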
-
Bayesian neural network priors for edge-preserving inversion
Authors:
Chen Li,
Matthew Dunlop,
Georg Stadler
Abstract:
We consider Bayesian inverse problems wherein the unknown state is assumed to be a function with discontinuous structure a priori. A class of prior distributions based on the output of neural networks with heavy-tailed weights is introduced, motivated by existing results concerning the infinite-width limit of such networks. We show theoretically that samples from such priors have desirable discontinuous-like properties even when the network width is finite, making them appropriate for edge-preserving inversion. Numerically we consider deconvolution problems defined on one- and two-dimensional spatial domains to illustrate the effectiveness of these priors; MAP estimation, dimension-robust MCMC sampling and ensemble-based approximations are utilized to probe the posterior distribution. The accuracy of point estimates is shown to exceed that obtained from non-heavy-tailed priors, and uncertainty estimates are shown to provide more useful qualitative information.
Submitted 20 December, 2021;
originally announced December 2021.
-
Estimating earthquake-induced tsunami height probabilities without sampling
Authors:
Shanyin Tong,
Eric Vanden-Eijnden,
Georg Stadler
Abstract:
Given a distribution of earthquake-induced seafloor elevations, we present a method to compute the probability of the resulting tsunamis reaching a certain size on shore. Instead of sampling, the proposed method relies on optimization to compute the most likely fault slips that result in a seafloor deformation inducing a large tsunami wave. We model tsunamis induced by bathymetry change using the shallow water equations on an idealized slice through the sea. The earthquake slip model is based on a sum of multivariate log-normal distributions, and follows the Gutenberg-Richter law for moment magnitudes 7 to 9. For a model problem inspired by the Tohoku-Oki 2011 earthquake and tsunami, we quantify annual probabilities of differently sized tsunami waves. Our method also identifies the most effective tsunami mechanisms. These mechanisms have smoothly varying fault slip patches that lead to an expansive but moderately large bathymetry change. The resulting tsunami waves are compressed as they approach shore and reach a close-to-vertical leading wave edge close to shore.
Submitted 7 April, 2023; v1 submitted 28 November, 2021;
originally announced November 2021.
-
Robust multigrid techniques for augmented Lagrangian preconditioning of incompressible Stokes equations with extreme viscosity variations
Authors:
Yu-hsuan Shih,
Georg Stadler,
Florian Wechsung
Abstract:
We present augmented Lagrangian Schur complement preconditioners and robust multigrid methods for incompressible Stokes problems with extreme viscosity variations. Such Stokes systems arise, for instance, upon linearization of nonlinear viscous flow problems, and they can have severely inhomogeneous and anisotropic coefficients. Using an augmented Lagrangian formulation for the incompressibility constraint makes the Schur complement easier to approximate, but results in a nearly singular (1,1)-block in the Stokes system. We present eigenvalue estimates for the quality of the Schur complement approximation. To cope with the near-singularity of the (1,1)-block, we extend a multigrid scheme with a discretization-dependent smoother and transfer operators from triangular/tetrahedral to the quadrilateral/hexahedral finite element discretizations $[\mathbb{Q}_k]^d\times \mathbb{P}_{k-1}^{\text{disc}}$, $k\geq 2$, $d=2,3$. Using numerical examples with scalar and with anisotropic fourth-order tensor viscosity arising from linearization of a viscoplastic constitutive relation, we confirm the robustness of the multigrid scheme and the overall efficiency of the solver. We present scalability results using up to 28,672 parallel tasks for problems with up to 1.6 billion unknowns and a viscosity contrast up to ten orders of magnitude.
Submitted 2 November, 2021; v1 submitted 2 July, 2021;
originally announced July 2021.
-
Single-stage gradient-based stellarator coil design: stochastic optimization
Authors:
Florian Wechsung,
Andrew Giuliani,
Matt Landreman,
Antoine Cerfon,
Georg Stadler
Abstract:
We extend the single-stage stellarator coil design approach for quasi-symmetry on axis from [Giuliani et al., 2020] to additionally take into account coil manufacturing errors. By modeling coil errors independently from the coil discretization, we have the flexibility to consider realistic forms of coil errors. The corresponding stochastic optimization problems are formulated using a risk-neutral approach and risk-averse approaches. We present an efficient, gradient-based descent algorithm which relies on analytical derivatives to solve these problems. In a comprehensive numerical study, we compare the coil designs resulting from deterministic and risk-neutral stochastic optimization and find that the risk-neutral formulation results in more robust configurations and reduces the number of local minima of the optimization problem. We also compare deterministic and risk-neutral approaches in terms of quasi-symmetry on and away from the magnetic axis, and in terms of the confinement of particles released close to the axis. Finally, we show that for the optimization problems we consider, a risk-averse objective using the Conditional Value-at-Risk leads to results which are similar to the risk-neutral objective.
Submitted 22 June, 2021;
originally announced June 2021.
-
Single-stage gradient-based stellarator coil design: Optimization for near-axis quasi-symmetry
Authors:
Andrew Giuliani,
Florian Wechsung,
Antoine Cerfon,
Georg Stadler,
Matt Landreman
Abstract:
We present a new coil design paradigm for magnetic confinement in stellarators. Our approach directly optimizes coil shapes and coil currents to produce a vacuum quasi-symmetric magnetic field with a target rotational transform on the magnetic axis. This approach differs from the traditional two-stage approach in which first a magnetic configuration with desirable physics properties is found, and then coils to approximately realize this magnetic configuration are designed. The proposed single-stage approach allows us to find a compromise between confinement and engineering requirements, i.e., find easy-to-build coils with good confinement properties. Using forward and adjoint sensitivities, we derive derivatives of the physical quantities in the objective, which is constrained by a nonlinear periodic differential equation. In two numerical examples, we compare different gradient-based descent algorithms and find that incorporating approximate second-order derivative information through a quasi-Newton method is crucial for convergence. We also explore the optimization landscape in the neighborhood of a minimizer and find many directions in which the objective is mostly flat, indicating ample freedom to find simple and thus easy-to-build coils.
Submitted 15 March, 2022; v1 submitted 1 October, 2020;
originally announced October 2020.
-
Extreme event probability estimation using PDE-constrained optimization and large deviation theory, with application to tsunamis
Authors:
Shanyin Tong,
Eric Vanden-Eijnden,
Georg Stadler
Abstract:
We propose and compare methods for the analysis of extreme events in complex systems governed by PDEs that involve random parameters, in situations where we are interested in quantifying the probability that a scalar function of the system's solution is above a threshold. If the threshold is large, this probability is small and its accurate estimation is challenging. To tackle this difficulty, we blend theoretical results from large deviation theory (LDT) with numerical tools from PDE-constrained optimization. Our methods first compute parameters that minimize the LDT-rate function over the set of parameters leading to extreme events, using adjoint methods to compute the gradient of this rate function. The minimizers give information about the mechanism of the extreme events as well as estimates of their probability. We then propose a series of methods to refine these estimates, either via importance sampling or geometric approximation of the extreme event sets. Results are formulated for general parameter distributions and detailed expressions are provided for Gaussian distributions. We give theoretical and numerical arguments showing that the performance of our methods is insensitive to the extremeness of the events we are interested in. We illustrate the application of our approach to quantify the probability of extreme tsunami events on shore. Tsunamis are typically caused by a sudden, unpredictable change of the ocean floor elevation during an earthquake. We model this change as a random process, which takes into account the underlying physics. We use the one-dimensional shallow water equation to model tsunamis numerically. In the context of this example, we present a comparison of our methods for extreme event probability estimation, and find which type of ocean floor elevation change leads to the largest tsunamis on shore.
Submitted 22 November, 2023; v1 submitted 27 July, 2020;
originally announced July 2020.
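To illustrate the idea of refining a rare-event estimate by importance sampling around the rate-function minimizer, here is a minimal scalar sketch. It is not the method of the paper: it assumes a single standard Gaussian parameter and the event X > z, for which the minimizer of the rate function x^2/2 over the event set is x* = z. The function names are hypothetical.

```python
import math
import random


def tail_prob_importance_sampling(z, n_samples, seed=0):
    """Estimate P(X > z) for X ~ N(0, 1) by sampling from N(z, 1).

    Toy assumption: the proposal is the nominal Gaussian shifted to
    x* = z, the minimizer of the rate function x^2/2 over {x >= z}.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        y = rng.gauss(z, 1.0)  # draw from the shifted proposal
        if y > z:
            # Likelihood ratio p(y)/q(y) = exp(-y^2/2) / exp(-(y - z)^2/2)
            total += math.exp(-z * y + 0.5 * z * z)
    return total / n_samples


def tail_prob_exact(z):
    """Exact Gaussian tail probability, used only for comparison."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    z = 4.0
    print("importance sampling:", tail_prob_importance_sampling(z, 20000))
    print("exact:              ", tail_prob_exact(z))
```

A plain Monte Carlo estimate with the same number of samples would rarely see a single exceedance at this threshold, which is why the shifted proposal is used.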
-
Optimal design of large-scale Bayesian linear inverse problems under reducible model uncertainty: good to know what you don't know
Authors:
Alen Alexanderian,
Noemi Petra,
Georg Stadler,
Isaac Sunseri
Abstract:
We consider optimal design of infinite-dimensional Bayesian linear inverse problems governed by partial differential equations that contain secondary reducible model uncertainties, in addition to the uncertainty in the inversion parameters. By reducible uncertainties we refer to parametric uncertainties that can be reduced through parameter inference. We seek experimental designs that minimize the posterior uncertainty in the primary parameters, while accounting for the uncertainty in secondary parameters. We accomplish this by deriving a marginalized A-optimality criterion and developing an efficient computational approach for its optimization. We illustrate our approach for estimating an uncertain time-dependent source in a contaminant transport model with an uncertain initial state as secondary uncertainty. Our results indicate that accounting for additional model uncertainty in the experimental design process is crucial.
Submitted 21 June, 2020;
originally announced June 2020.
-
Advanced Newton Methods for Geodynamical Models of Stokes Flow with Viscoplastic Rheologies
Authors:
Johann Rudi,
Yu-hsuan Shih,
Georg Stadler
Abstract:
Strain localization and resulting plasticity and failure play an important role in the evolution of the lithosphere. These phenomena are commonly modeled by Stokes flows with viscoplastic rheologies. The nonlinearities of these rheologies make the numerical solution of the resulting systems challenging, and iterative methods often converge slowly or not at all. Yet accurate solutions are critical for representing the physics. Moreover, for some rheology laws, aspects of solvability are still unknown. We study a basic but representative viscoplastic rheology law. The law involves a yield stress that is independent of the dynamic pressure, referred to as von Mises yield criterion. Two commonly used variants, perfect/ideal and composite viscoplasticity, are compared. We derive both variants from energy minimization principles, and we use this perspective to argue when solutions are unique. We propose a new stress-velocity Newton solution algorithm that treats the stress as an independent variable during the Newton linearization but requires solution only of Stokes systems that are of the usual velocity-pressure form. To study different solution algorithms, we implement 2D and 3D finite element discretizations, and we generate Stokes problems with up to 7 orders of magnitude viscosity contrasts, in which compression or tension results in significant nonlinear localization effects. Comparing the performance of the proposed Newton method with the standard Newton method and the Picard fixed-point method, we observe a significant reduction in the number of iterations and improved stability with respect to problem nonlinearity, mesh refinement, and the polynomial order of the discretization.
Submitted 22 July, 2020; v1 submitted 24 March, 2020;
originally announced March 2020.
-
Hierarchical Matrix Approximations of Hessians Arising in Inverse Problems Governed by PDEs
Authors:
Ilona Ambartsumyan,
Wajih Boukaram,
Tan Bui-Thanh,
Omar Ghattas,
David Keyes,
Georg Stadler,
George Turkiyyah,
Stefano Zampini
Abstract:
Hessian operators arising in inverse problems governed by partial differential equations (PDEs) play a critical role in delivering efficient, dimension-independent convergence for both Newton solution of deterministic inverse problems, as well as Markov chain Monte Carlo sampling of posteriors in the Bayesian setting. These methods require the ability to repeatedly perform such operations on the Hessian as multiplication with arbitrary vectors, solving linear systems, inversion, and (inverse) square root. Unfortunately, the Hessian is a (formally) dense, implicitly-defined operator that is intractable to form explicitly for practical inverse problems, requiring as many PDE solves as inversion parameters. Low rank approximations are effective when the data contain limited information about the parameters, but become prohibitive as the data become more informative. However, the Hessians for many inverse problems arising in practical applications can be well approximated by matrices that have hierarchically low rank structure. Hierarchical matrix representations promise to overcome the high complexity of dense representations and provide effective data structures and matrix operations that have only log-linear complexity. In this work, we describe algorithms for constructing and updating hierarchical matrix approximations of Hessians, and illustrate them on a number of representative inverse problems involving time-dependent diffusion, advection-dominated transport, frequency domain acoustic wave propagation, and low frequency Maxwell equations, demonstrating up to an order of magnitude speedup compared to globally low rank approximations.
Submitted 23 March, 2020;
originally announced March 2020.
-
Bayesian approach to inverse scattering with topological priors
Authors:
Ana Carpio,
Sergei Iakunin,
Georg Stadler
Abstract:
We propose a Bayesian inference framework to estimate uncertainties in inverse scattering problems. Given the observed data, the forward model and their uncertainties, we find the posterior distribution over a finite parameter field representing the objects. To construct the prior distribution we use a topological sensitivity analysis. We demonstrate the approach on the Bayesian solution of 2D inverse problems in light and acoustic holography with synthetic data. Statistical information on objects such as their center location, diameter size, orientation, as well as material properties, is extracted by sampling the posterior distribution. Assuming the number of objects is known, comparison of the results obtained by Markov Chain Monte Carlo sampling and by sampling a Gaussian distribution found by linearization about the maximum a posteriori estimate shows reasonable agreement. The latter procedure has low computational cost, which makes it an interesting tool for uncertainty studies in 3D. However, MCMC sampling provides a more complete picture of the posterior distribution and yields multi-modal posterior distributions for problems with larger measurement noise. When the number of objects is unknown, we devise a stochastic model selection framework.
Submitted 16 November, 2020; v1 submitted 20 March, 2020;
originally announced March 2020.
-
Optimal experimental design under irreducible uncertainty for linear inverse problems governed by PDEs
Authors:
Karina Koval,
Alen Alexanderian,
Georg Stadler
Abstract:
We present a method for computing A-optimal sensor placements for infinite-dimensional Bayesian linear inverse problems governed by PDEs with irreducible model uncertainties. Here, irreducible uncertainties refer to uncertainties in the model that exist in addition to the parameters in the inverse problem, and that cannot be reduced through observations. Specifically, given a statistical distribution for the model uncertainties, we compute the optimal design that minimizes the expected value of the posterior covariance trace. The expected value is discretized using Monte Carlo, leading to an objective function consisting of a sum of trace operators and a binary-inducing penalty. Minimization of this objective requires a large number of PDE solves in each step. To make this problem computationally tractable, we construct a composite low-rank basis using a randomized range finder algorithm to eliminate forward and adjoint PDE solves. We also present a novel formulation of the A-optimal design objective that requires the trace of an operator in the observation rather than the parameter space. The binary structure is enforced using a weighted regularized $\ell_0$-sparsification approach. We present numerical results for inference of the initial condition in a subsurface flow problem with inherent uncertainty in the flow fields and in the initial times.
Submitted 29 March, 2020; v1 submitted 18 December, 2019;
originally announced December 2019.
-
Scalable Simulation of Realistic Volume Fraction Red Blood Cell Flows through Vascular Networks
Authors:
Libin Lu,
Matthew J. Morse,
Abtin Rahimian,
Georg Stadler,
Denis Zorin
Abstract:
High-resolution blood flow simulations have potential for developing a better understanding of biophysical phenomena at the microscale, such as vasodilation, vasoconstriction and overall vascular resistance. To this end, we present a scalable platform for the simulation of red blood cell (RBC) flows through complex capillaries by modeling the physical system as a viscous fluid with immersed deformable particles. We describe a parallel boundary integral equation solver for general elliptic partial differential equations, which we apply to Stokes flow through blood vessels. We also detail a parallel collision-avoiding algorithm to ensure RBCs and the blood vessel remain contact-free. We have scaled our code on Stampede2 at the Texas Advanced Computing Center up to 34,816 cores. Our largest simulation enforces a contact-free state between four billion surface elements and solves for three billion degrees of freedom on one million RBCs and a blood vessel composed of two million patches.
Submitted 23 September, 2019;
originally announced September 2019.
-
A comparative study of structural similarity and regularization for joint inverse problems governed by PDEs
Authors:
Benjamin Crestel,
Georg Stadler,
Omar Ghattas
Abstract:
Joint inversion refers to the simultaneous inference of multiple parameter fields from observations of systems governed by single or multiple forward models. In many cases these parameter fields reflect different attributes of a single medium and are thus spatially correlated or structurally similar. By imposing prior information on their spatial correlations via a joint regularization term, we seek to improve the reconstruction of the parameter fields relative to inversion for each field independently. One of the main challenges is to devise a joint regularization functional that conveys the spatial correlations or structural similarity between the fields while at the same time permitting scalable and efficient solvers for the joint inverse problem. We describe several joint regularizations that are motivated by these goals: a cross-gradient and a normalized cross-gradient structural similarity term, the vectorial total variation, and a joint regularization based on the nuclear norm of the gradients. Based on numerical results from three classes of inverse problems with piecewise-homogeneous parameter fields, we conclude that the vectorial total variation functional is preferable to the other methods considered. Besides resulting in good reconstructions in all experiments, it allows for scalable, efficient solvers for joint inverse problems governed by PDE forward models.
Submitted 16 August, 2018;
originally announced August 2018.
-
Sparse solutions in optimal control of PDEs with uncertain parameters: the linear case
Authors:
Chen Li,
Georg Stadler
Abstract:
We study sparse solutions of optimal control problems governed by PDEs with uncertain coefficients. We propose two formulations, one where the solution is a deterministic control optimizing the mean objective, and a formulation aiming at stochastic controls that share the same sparsity structure. In both formulations, regions where the controls do not vanish can be interpreted as optimal locations for placing control devices. In this paper, we focus on linear PDEs with linearly entering uncertain parameters. Under these assumptions, the deterministic formulation reduces to a problem with known structure, and thus we mainly focus on the stochastic control formulation. Here, shared sparsity is achieved by incorporating the $L^1$-norm of the mean of the pointwise squared controls in the objective. We reformulate the problem using a norm reweighting function that is defined over physical space only and thus helps to avoid approximation of the random space using samples or quadrature. We show that a fixed point algorithm applied to the norm reweighting formulation leads to a variant of the well-studied iterative reweighted least squares (IRLS) algorithm, and we propose a novel preconditioned Newton-conjugate gradient method to speed up the IRLS algorithm. We combine our algorithms with low-rank operator approximations, for which we provide estimates of the truncation error. We carefully examine the computational complexity of the resulting algorithms. The sparsity structure of the optimal controls and the performance of the solution algorithms are studied numerically using control problems governed by the Laplace and Helmholtz equations. In these experiments the Newton variant clearly outperforms the IRLS method.
Submitted 19 November, 2018; v1 submitted 16 April, 2018;
originally announced April 2018.
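For readers unfamiliar with iteratively reweighted least squares (IRLS), the following toy sketch shows the basic reweighting loop on a one-dimensional $\ell_1$ fitting problem. It is only an illustration of the generic IRLS idea and not the control formulation or the preconditioned Newton-CG variant of the paper; the function name, the clamping parameter `eps`, and the test data are assumptions made for this example.

```python
def irls_l1_location(b, n_iter=100, eps=1e-8):
    """Minimize sum_i |b_i - x| over a scalar x by IRLS.

    Each step solves the weighted least-squares problem
    min_x sum_i w_i (b_i - x)^2 with w_i = 1 / max(|b_i - x_old|, eps),
    whose solution is the weighted mean of the data.
    """
    x = sum(b) / len(b)  # start from the ordinary least-squares solution
    for _ in range(n_iter):
        # Reweight: small residuals get large weights (eps avoids division by zero)
        w = [1.0 / max(abs(bi - x), eps) for bi in b]
        # Weighted least-squares update
        x = sum(wi * bi for wi, bi in zip(w, b)) / sum(w)
    return x


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 100.0]
    # The l1 minimizer is the median of the data, which is insensitive to the outlier.
    print(irls_l1_location(data))
```

The fixed-point structure (compute weights from the current iterate, then solve a quadratic problem) is what makes Newton-type accelerations of IRLS natural to consider.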
-
A Feynman-Kac Formula Approach for Computing Expectations and Threshold Crossing Probabilities of Non-smooth Stochastic Dynamical Systems
Authors:
Laurent Mertz,
Georg Stadler,
Jonathan Wylie
Abstract:
We present a computational alternative to probabilistic simulations for non-smooth stochastic dynamical systems that are prevalent in engineering mechanics. As examples, we target (1) stochastic elasto-plastic problems, which involve transitions between elastic and plastic states, and (2) obstacle problems with noise, which involve discrete impulses due to collisions with an obstacle. We formally introduce a class of partial differential equations related to the Feynman-Kac formula, where the underlying stochastic processes satisfy variational inequalities modelling elasto-plastic and obstacle oscillators. We then focus on solving them numerically. The main challenge in solving these equations is the non-standard boundary conditions which describe the behavior of the underlying process on the boundary. We illustrate how to use our approach to compute expectations and other statistical quantities, such as the asymptotic growth rate of variance in asymptotic formulae for threshold crossing probabilities.
Submitted 22 May, 2019; v1 submitted 7 April, 2017;
originally announced April 2017.
-
A-optimal encoding weights for nonlinear inverse problems, with applications to the Helmholtz inverse problem
Authors:
Benjamin Crestel,
Alen Alexanderian,
Georg Stadler,
Omar Ghattas
Abstract:
The computational cost of solving an inverse problem governed by PDEs, using multiple experiments, increases linearly with the number of experiments. A recently proposed method to decrease this cost uses only a small number of random linear combinations of all experiments for solving the inverse problem. This approach applies to inverse problems where the PDE solution depends linearly on the right-hand side function that models the experiment. As this method is stochastic in essence, the quality of the obtained reconstructions can vary, in particular when only a small number of combinations are used. We develop a Bayesian formulation for the definition and computation of encoding weights that lead to a parameter reconstruction with the least uncertainty. We call these weights A-optimal encoding weights. Our framework applies to inverse problems where the governing PDE is nonlinear with respect to the inversion parameter field. We formulate the problem in infinite dimensions and follow the optimize-then-discretize approach, devoting special attention to the discretization and the choice of numerical methods in order to achieve a computational cost that is independent of the parameter discretization. We elaborate our method for a Helmholtz inverse problem, and derive the adjoint-based expressions for the gradient of the objective function of the optimization problem for finding the A-optimal encoding weights. The proposed method is potentially attractive for real-time monitoring applications, where one can invest the effort to compute optimal weights offline, to later solve an inverse problem repeatedly, over time, at a fraction of the initial cost.
Submitted 27 February, 2017; v1 submitted 7 December, 2016;
originally announced December 2016.
-
Mitigating the Influence of the Boundary on PDE-based Covariance Operators
Authors:
Yair Daon,
Georg Stadler
Abstract:
Gaussian random fields over infinite-dimensional Hilbert spaces require the definition of appropriate covariance operators. Using elliptic PDE operators to construct covariance operators makes it possible to build on fast PDE solvers for manipulations with the resulting covariance and precision operators. However, PDE operators require a choice of boundary conditions, and this choice can have a strong and usually undesired influence on the Gaussian random field. We propose two techniques that ameliorate these boundary effects for large-scale problems. The first approach combines the elliptic PDE operator with a Robin boundary condition, where a varying Robin coefficient is computed from an optimization problem. The second approach normalizes the pointwise variance by rescaling the covariance operator. These approaches can be used individually or can be combined. We study properties of these approaches, and discuss their computational complexity. The performance of our approaches is studied for random fields defined over simple and complex two- and three-dimensional domains.
Submitted 11 December, 2017; v1 submitted 17 October, 2016;
originally announced October 2016.
-
Weighted BFBT Preconditioner for Stokes Flow Problems with Highly Heterogeneous Viscosity
Authors:
Johann Rudi,
Georg Stadler,
Omar Ghattas
Abstract:
We present a weighted BFBT approximation (w-BFBT) to the inverse Schur complement of a Stokes system with highly heterogeneous viscosity. When used as part of a Schur complement-based Stokes preconditioner, we observe robust fast convergence for Stokes problems with smooth but highly varying (up to 10 orders of magnitude) viscosities, optimal algorithmic scalability with respect to mesh refinement, and only a mild dependence on the polynomial order of high-order finite element discretizations ($Q_k \times P_{k-1}^{disc}$, order $k \ge 2$). For certain difficult problems, we demonstrate numerically that w-BFBT significantly improves Stokes solver convergence over the widely used inverse viscosity-weighted pressure mass matrix approximation of the Schur complement. In addition, we derive theoretical eigenvalue bounds to prove spectral equivalence of w-BFBT. Using detailed numerical experiments, we discuss modifications to w-BFBT at Dirichlet boundaries that decrease the number of iterations. The overall algorithmic performance of the Stokes solver is governed by the efficacy of w-BFBT as a Schur complement approximation and, in addition, by our parallel hybrid spectral-geometric-algebraic multigrid (HMG) method, which we use to approximate the inverses of the viscous block and variable-coefficient pressure Poisson operators within w-BFBT. Building on the scalability of HMG, our Stokes solver achieves a parallel efficiency of 90% while weak scaling over a more than 600-fold increase from 48 to all 30,000 cores of TACC's Lonestar 5 supercomputer.
Submitted 29 January, 2017; v1 submitted 13 July, 2016;
originally announced July 2016.
-
Mean-variance risk-averse optimal control of systems governed by PDEs with random parameter fields using quadratic approximations
Authors:
Alen Alexanderian,
Noemi Petra,
Georg Stadler,
Omar Ghattas
Abstract:
We present a method for optimal control of systems governed by partial differential equations (PDEs) with uncertain parameter fields. We consider an objective function that involves the mean and variance of the control objective, leading to a risk-averse optimal control problem. To make the problem tractable, we invoke a quadratic Taylor series approximation of the control objective with respect to the uncertain parameter. This enables deriving explicit expressions for the mean and variance of the control objective in terms of its gradients and Hessians with respect to the uncertain parameter. The risk-averse optimal control problem is then formulated as a PDE-constrained optimization problem with constraints given by the forward and adjoint PDEs defining these gradients and Hessians. The expressions for the mean and variance of the control objective under the quadratic approximation involve the trace of the (preconditioned) Hessian and are thus prohibitive to evaluate. To address this, we employ trace estimators that only require a modest number of Hessian-vector products. We illustrate our approach with two problems: the control of a semilinear elliptic PDE with an uncertain boundary source term, and the control of a linear elliptic PDE with an uncertain coefficient field. For the latter problem, we derive adjoint-based expressions for efficient computation of the gradient of the risk-averse objective with respect to the controls. Our method ensures that the cost of computing the risk-averse objective and its gradient with respect to the control, measured in the number of PDE solves, is independent of the (discretized) parameter and control dimensions, and depends only on the number of random vectors employed in the trace estimation. Finally, we present a comprehensive numerical study of an optimal control problem for fluid flow in a porous medium with uncertain permeability field.
Submitted 22 November, 2017; v1 submitted 24 February, 2016;
originally announced February 2016.
-
A Fast and Scalable Method for A-Optimal Design of Experiments for Infinite-dimensional Bayesian Nonlinear Inverse Problems
Authors:
Alen Alexanderian,
Noemi Petra,
Georg Stadler,
Omar Ghattas
Abstract:
We address the problem of optimal experimental design (OED) for Bayesian nonlinear inverse problems governed by PDEs. The goal is to find a placement of sensors, at which experimental data are collected, so as to minimize the uncertainty in the inferred parameter field. We formulate the OED objective function by generalizing the classical A-optimal experimental design criterion using the expected value of the trace of the posterior covariance. We seek a method that solves the OED problem at a cost (measured in the number of forward PDE solves) that is independent of both the parameter and sensor dimensions. To facilitate this, we construct a Gaussian approximation to the posterior at the maximum a posteriori probability (MAP) point, and use the resulting covariance operator to define the OED objective function. We use randomized trace estimation to compute the trace of this (implicitly defined) covariance operator. The resulting OED problem includes as constraints the PDEs characterizing the MAP point, and the PDEs describing the action of the covariance operator to vectors. The sparsity of the sensor configurations is controlled using sparsifying penalty functions. We elaborate our OED method for the problem of determining the sensor placement to best infer the coefficient of an elliptic PDE. Adjoint methods are used to compute the gradient of the PDE-constrained OED objective function. We provide numerical results for inference of the permeability field in a porous medium flow problem, and demonstrate that the number of PDE solves required for the evaluation of the OED objective function and its gradient is essentially independent of both the parameter and sensor dimensions. The number of quasi-Newton iterations for computing an OED also exhibits the same dimension invariance properties.
Submitted 3 November, 2015; v1 submitted 21 October, 2014;
originally announced October 2014.
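The randomized trace estimation mentioned in this abstract can be sketched with a Hutchinson-type estimator, which only needs the action of an operator on vectors. This is a generic illustration, assuming Rademacher probe vectors and a small matrix-free operator supplied as a Python function; the names below are hypothetical and the paper's operators are PDE-based covariance operators, not small matrices.

```python
import random


def hutchinson_trace(matvec, dim, n_probes, seed=0):
    """Estimate trace(A) as the average of z^T (A z) over random probes z.

    matvec: function returning A z for a list z of length dim
            (A is never formed explicitly).
    Probes have independent +/-1 entries, so E[z z^T] is the identity
    and E[z^T A z] = trace(A).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        az = matvec(z)  # one operator application per probe
        total += sum(zi * ai for zi, ai in zip(z, az))
    return total / n_probes


if __name__ == "__main__":
    # Toy symmetric 2x2 operator [[2, 1], [1, 3]] with trace 5.
    def apply_a(z):
        return [2.0 * z[0] + 1.0 * z[1], 1.0 * z[0] + 3.0 * z[1]]

    print(hutchinson_trace(apply_a, dim=2, n_probes=200))
```

The cost is one operator application per probe, independent of the dimension of the operator, which is the property that makes such estimators attractive when each application requires PDE solves.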
-
Scalable and efficient algorithms for the propagation of uncertainty from data through inference to prediction for large-scale problems, with application to flow of the Antarctic ice sheet
Authors:
Tobin Isaac,
Noemi Petra,
Georg Stadler,
Omar Ghattas
Abstract:
The majority of research on efficient and scalable algorithms in computational science and engineering has focused on the forward problem: given parameter inputs, solve the governing equations to determine output quantities of interest. In contrast, here we consider the broader question: given a (large-scale) model containing uncertain parameters, (possibly) noisy observational data, and a prediction quantity of interest, how do we construct efficient and scalable algorithms to (1) infer the model parameters from the data (the deterministic inverse problem), (2) quantify the uncertainty in the inferred parameters (the Bayesian inference problem), and (3) propagate the resulting uncertain parameters through the model to issue predictions with quantified uncertainties (the forward uncertainty propagation problem)? We present efficient and scalable algorithms for this end-to-end, data-to-prediction process under the Gaussian approximation and in the context of modeling the flow of the Antarctic ice sheet and its effect on sea level. The ice is modeled as a viscous, incompressible, creeping, shear-thinning fluid. The observational data come from InSAR satellite measurements of surface ice flow velocity, and the uncertain parameter field to be inferred is the basal sliding parameter. The prediction quantity of interest is the present-day ice mass flux from the Antarctic continent to the ocean. We show that the work required for executing this data-to-prediction process is independent of the state dimension, parameter dimension, data dimension, and number of processor cores. The key to achieving this dimension independence is to exploit the fact that the observational data typically provide only sparse information on model parameters. This property can be exploited to construct a low rank approximation of the linearized parameter-to-observable map.
Submitted 1 September, 2015; v1 submitted 5 October, 2014;
originally announced October 2014.
-
Solution of nonlinear Stokes equations discretized by high-order finite elements on nonconforming and anisotropic meshes, with application to ice sheet dynamics
Authors:
Tobin Isaac,
Georg Stadler,
Omar Ghattas
Abstract:
Motivated by the need for efficient and accurate simulation of the dynamics of the polar ice sheets, we design high-order finite element discretizations and scalable solvers for the solution of nonlinear incompressible Stokes equations. We focus on power-law, shear thinning rheologies used in modeling ice dynamics and other geophysical flows. We use nonconforming hexahedral meshes and the conforming inf-sup stable finite element velocity-pressure pairings $\mathbb{Q}_k\times \mathbb{Q}^\text{disc}_{k-2}$ or $\mathbb{Q}_k \times \mathbb{P}^\text{disc}_{k-1}$. To solve the nonlinear equations, we propose a Newton-Krylov method with a block upper triangular preconditioner for the linearized Stokes systems. The diagonal blocks of this preconditioner are sparse approximations of the (1,1)-block and of its Schur complement. The (1,1)-block is approximated using linear finite elements based on the nodes of the high-order discretization, and the application of its inverse is approximated using algebraic multigrid with an incomplete factorization smoother. This preconditioner is designed to be efficient on anisotropic meshes, which are necessary to match the high aspect ratio domains typical for ice sheets. We develop and make available extensions to two libraries---a hybrid meshing scheme for the p4est parallel AMR library, and a modified smoothed aggregation scheme for PETSc---to improve their support for solving PDEs in high aspect ratio domains. In a numerical study, we find that our solver yields fast convergence that is independent of the element aspect ratio, the occurrence of nonconforming interfaces, and of mesh refinement, and that depends only weakly on the polynomial finite element order. We simulate the ice flow in a realistic description of the Antarctic ice sheet derived from field data, and study the parallel scalability of our solver for problems with up to 383M unknowns.
Submitted 9 July, 2015; v1 submitted 25 June, 2014;
originally announced June 2014.
-
Comparison of Multigrid Algorithms for High-order Continuous Finite Element Discretizations
Authors:
Hari Sundar,
Georg Stadler,
George Biros
Abstract:
We present a comparison of different multigrid approaches for the solution of systems arising from high-order continuous finite element discretizations of elliptic partial differential equations on complex geometries. We consider the pointwise Jacobi, the Chebyshev-accelerated Jacobi and the symmetric successive over-relaxation (SSOR) smoothers, as well as elementwise block Jacobi smoothing. Three approaches for the multigrid hierarchy are compared: 1) high-order $h$-multigrid, which uses high-order interpolation and restriction between geometrically coarsened meshes; 2) $p$-multigrid, in which the polynomial order is reduced while the mesh remains unchanged, and the interpolation and restriction incorporate the different-order basis functions; and 3) a first-order approximation multigrid preconditioner constructed using the nodes of the high-order discretization. This latter approach is often combined with algebraic multigrid for the low-order operator and is attractive for high-order discretizations on unstructured meshes, where geometric coarsening is difficult. Based on a simple performance model, we compare the computational cost of the different approaches. Using scalar test problems in two and three dimensions with constant and varying coefficients, we compare the performance of the different multigrid approaches for polynomial orders up to 16. Overall, both $h$- and $p$-multigrid work well; the first-order approximation is less efficient. For constant coefficients, all smoothers work well. For variable coefficients, Chebyshev and SSOR smoothing outperform Jacobi smoothing. While all of the tested methods converge in a mesh-independent number of iterations, none of them behaves completely independently of the polynomial order. When multigrid is used as a preconditioner in a Krylov method, the iteration number decreases significantly compared to using multigrid as a solver.
Submitted 6 March, 2015; v1 submitted 24 February, 2014;
originally announced February 2014.
-
Discretely exact derivatives for hyperbolic PDE-constrained optimization problems discretized by the discontinuous Galerkin method
Authors:
Lucas C. Wilcox,
Georg Stadler,
Tan Bui-Thanh,
Omar Ghattas
Abstract:
This paper discusses the computation of derivatives for optimization problems governed by linear hyperbolic systems of partial differential equations (PDEs) that are discretized by the discontinuous Galerkin (dG) method. An efficient and accurate computation of these derivatives is important, for instance, in inverse problems and optimal control problems. This computation is usually based on an adjoint PDE system, and the question addressed in this paper is how the discretization of this adjoint system should relate to the dG discretization of the hyperbolic state equation. Adjoint-based derivatives can either be computed before or after discretization; these two options are often referred to as the optimize-then-discretize and discretize-then-optimize approaches. We discuss the relation between these two options for dG discretizations in space and Runge-Kutta time integration. Discretely exact discretizations for several hyperbolic optimization problems are derived, including the advection equation, Maxwell's equations and the coupled elastic-acoustic wave equation. We find that the discrete adjoint equation inherits a natural dG discretization from the discretization of the state equation and that the expressions for the discretely exact gradient often have to take into account contributions from element faces. For the coupled elastic-acoustic wave equation, the correctness and accuracy of our derivative expressions are illustrated by comparisons with finite difference gradients. The results show that a straightforward discretization of the continuous gradient differs from the discretely exact gradient, and thus is not consistent with the discretized objective. This inconsistency may cause difficulties in the convergence of gradient based algorithms for solving optimization problems.
Submitted 27 November, 2013;
originally announced November 2013.
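The abstract above validates its derivative expressions by comparing them with finite difference gradients. A minimal sketch of such a check follows, assuming a small hypothetical quadratic objective in place of the discretized wave-equation objective; the names `objective`, `analytic_gradient`, and `finite_difference_gradient` are illustrative and do not come from the paper.

```python
def objective(x):
    # Hypothetical stand-in objective: 0.5 * |x|^2 + x[0] * x[1].
    return 0.5 * sum(xi * xi for xi in x) + x[0] * x[1]


def analytic_gradient(x):
    # Hand-derived gradient of the stand-in objective above.
    g = list(x)
    g[0] += x[1]
    g[1] += x[0]
    return g


def finite_difference_gradient(f, x, h=1e-6):
    # Central differences: perturb one coordinate at a time by +/- h.
    grad = []
    for i in range(len(x)):
        xp = list(x)
        xm = list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((f(xp) - f(xm)) / (2.0 * h))
    return grad


def max_gradient_error(x):
    # Largest componentwise gap between the two gradients at the point x.
    fd = finite_difference_gradient(objective, x)
    an = analytic_gradient(x)
    return max(abs(a - b) for a, b in zip(fd, an))
```

A small value of `max_gradient_error` at a few test points suggests that the analytic gradient is consistent with the objective as implemented. This is the kind of agreement the abstract reports for discretely exact gradients.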
-
A computational framework for infinite-dimensional Bayesian inverse problems: Part II. Stochastic Newton MCMC with application to ice sheet flow inverse problems
Authors:
Noemi Petra,
James Martin,
Georg Stadler,
Omar Ghattas
Abstract:
We address the numerical solution of infinite-dimensional inverse problems in the framework of Bayesian inference. In the Part I companion to this paper (arXiv.org:1308.1313), we considered the linearized infinite-dimensional inverse problem. Here in Part II, we relax the linearization assumption and consider the fully nonlinear infinite-dimensional inverse problem using a Markov chain Monte Carlo (MCMC) sampling method. To address the challenges of sampling high-dimensional pdfs arising from Bayesian inverse problems governed by PDEs, we build on the stochastic Newton MCMC method. This method exploits problem structure by taking as a proposal density a local Gaussian approximation of the posterior pdf, whose construction is made tractable by invoking a low-rank approximation of its data misfit component of the Hessian. Here we introduce an approximation of the stochastic Newton proposal in which we compute the low-rank-based Hessian at just the MAP point, and then reuse this Hessian at each MCMC step. We compare the performance of the proposed method to the original stochastic Newton MCMC method and to an independence sampler. The comparison of the three methods is conducted on a synthetic ice sheet inverse problem. For this problem, the stochastic Newton MCMC method with a MAP-based Hessian converges at least as rapidly as the original stochastic Newton MCMC method, but is far cheaper since it avoids recomputing the Hessian at each step. On the other hand, it is more expensive per sample than the independence sampler; however, its convergence is significantly more rapid, and thus overall it is much cheaper. Finally, we present extensive analysis and interpretation of the posterior distribution, and classify directions in parameter space based on the extent to which they are informed by the prior or the observations.
Submitted 11 April, 2014; v1 submitted 28 August, 2013;
originally announced August 2013.
-
A-optimal design of experiments for infinite-dimensional Bayesian linear inverse problems with regularized $\ell_0$-sparsification
Authors:
Alen Alexanderian,
Noemi Petra,
Georg Stadler,
Omar Ghattas
Abstract:
We present an efficient method for computing A-optimal experimental designs for infinite-dimensional Bayesian linear inverse problems governed by partial differential equations (PDEs). Specifically, we address the problem of optimizing the location of sensors (at which observational data are collected) to minimize the uncertainty in the parameters estimated by solving the inverse problem, where the uncertainty is expressed by the trace of the posterior covariance. Computing optimal experimental designs (OEDs) is particularly challenging for inverse problems governed by computationally expensive PDE models with infinite-dimensional (or, after discretization, high-dimensional) parameters. To alleviate the computational cost, we exploit the problem structure and build a low-rank approximation of the parameter-to-observable map, preconditioned with the square root of the prior covariance operator. This relieves our method from expensive PDE solves when evaluating the optimal experimental design objective function and its derivatives. Moreover, we employ a randomized trace estimator for efficient evaluation of the OED objective function. We control the sparsity of the sensor configuration by employing a sequence of penalty functions that successively approximate the $\ell_0$-"norm"; this results in binary designs that characterize optimal sensor locations. We present numerical results for inference of the initial condition from spatio-temporal observations in a time-dependent advection-diffusion problem in two and three space dimensions. We find that an optimal design can be computed at a cost, measured in number of forward PDE solves, that is independent of the parameter and sensor dimensions. We demonstrate numerically that $\ell_0$-sparsified experimental designs obtained via a continuation method outperform $\ell_1$-sparsified designs.
Submitted 27 May, 2014; v1 submitted 19 August, 2013;
originally announced August 2013.
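The abstract above mentions a randomized trace estimator for evaluating the OED objective, which is the trace of the posterior covariance. The paper's exact estimator is not specified here; one common choice is the Hutchinson estimator with Rademacher probe vectors, sketched below under that assumption. The function name `hutchinson_trace` and the `matvec` callback are hypothetical, and a small explicit matrix stands in for the covariance operator.

```python
import random


def hutchinson_trace(matvec, n, num_samples=64, seed=0):
    # Estimate trace(A) as the sample mean of z^T A z, where each z has
    # independent +/-1 entries. Only products A @ z are needed, so A
    # never has to be formed explicitly.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        az = matvec(z)
        total += sum(zi * azi for zi, azi in zip(z, az))
    return total / num_samples


def dense_matvec(a):
    # Wrap a small dense matrix (list of rows) as a matrix-vector product.
    def apply(z):
        return [sum(aij * zj for aij, zj in zip(row, z)) for row in a]
    return apply


# Small symmetric test matrix with trace 9.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
estimate = hutchinson_trace(dense_matvec(A), n=3, num_samples=2000)
```

The estimate is unbiased, and its variance depends only on the off-diagonal entries of the matrix, so for a diagonal matrix every sample already equals the trace.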
-
A computational framework for infinite-dimensional Bayesian inverse problems. Part I: The linearized case, with application to global seismic inversion
Authors:
Tan Bui-Thanh,
Omar Ghattas,
James Martin,
Georg Stadler
Abstract:
We present a computational framework for estimating the uncertainty in the numerical solution of linearized infinite-dimensional statistical inverse problems. We adopt the Bayesian inference formulation: given observational data and their uncertainty, the governing forward problem and its uncertainty, and a prior probability distribution describing uncertainty in the parameter field, find the posterior probability distribution over the parameter field. The prior must be chosen appropriately in order to guarantee well-posedness of the infinite-dimensional inverse problem and facilitate computation of the posterior. Furthermore, straightforward discretizations may not lead to convergent approximations of the infinite-dimensional problem. And finally, solution of the discretized inverse problem via explicit construction of the covariance matrix is prohibitive due to the need to solve the forward problem as many times as there are parameters. Our computational framework builds on the infinite-dimensional formulation proposed by Stuart (A. M. Stuart, Inverse problems: A Bayesian perspective, Acta Numerica, 19 (2010), pp. 451-559), and incorporates a number of components aimed at ensuring a convergent discretization of the underlying infinite-dimensional inverse problem. The framework additionally incorporates algorithms for manipulating the prior, constructing a low rank approximation of the data-informed component of the posterior covariance operator, and exploring the posterior that together ensure scalability of the entire framework to very high parameter dimensions. We demonstrate this computational framework on the Bayesian solution of an inverse problem in 3D global seismic wave propagation with hundreds of thousands of parameters.
Submitted 6 August, 2013;
originally announced August 2013.
-
Quantum control of electron-phonon scatterings in artificial atoms
Authors:
Ulrich Hohenester,
Georg Stadler
Abstract:
The phonon-induced dephasing dynamics in optically excited semiconductor quantum dots is studied within the frameworks of the independent Boson model and optimal control. We show that appropriate tailoring of laser pulses allows a complete control of the optical excitation despite the phonon dephasing, a finding in marked contrast to other environment couplings.
Submitted 19 March, 2004;
originally announced March 2004.
-
Optimal quantum control in nanostructures: Theory and application to generic three-level system
Authors:
Alfio Borzi,
Georg Stadler,
Ulrich Hohenester
Abstract:
Coherent carrier control in quantum nanostructures is studied within the framework of Optimal Control. We develop a general solution scheme for the optimization of an external control (e.g., laser pulses), which allows the system's wavefunction to be channeled between two given states in the most efficient way; physically motivated constraints, such as limited laser resources or population suppression of certain states, can be accounted for through a general cost functional. Using a generic three-level scheme for the quantum system, we demonstrate the applicability of our approach and identify the pertinent calculation and convergence parameters.
Submitted 23 September, 2002;
originally announced September 2002.