-
Valid Cross-Covariance Models via Multivariate Mixtures with an Application to the Confluent Hypergeometric Class
Authors:
Drew Yarger,
Anindya Bhadra
Abstract:
Modeling of multivariate random fields through Gaussian processes calls for the construction of valid cross-covariance functions describing the dependence between any two component processes at different spatial locations. The required validity conditions often present challenges that lead to complicated restrictions on the parameter space. The purpose of this work is to present techniques using multivariate mixtures for establishing validity that are simultaneously simplified and comprehensive. This is accomplished using results on conditionally negative semidefinite matrices and the Schur product theorem. For illustration, we use the recently introduced Confluent Hypergeometric (CH) class of covariance functions. In addition, we establish the spectral density of the Confluent Hypergeometric covariance and use this to construct valid multivariate models as well as propose new cross-covariances. Our approach leads to valid multivariate cross-covariance models that inherit the desired marginal properties of the Confluent Hypergeometric model and outperform the multivariate Matérn model in out-of-sample prediction under slowly decaying correlation of the underlying multivariate random field. We also establish properties of the new models, including results on equivalence of Gaussian measures. We demonstrate the new model's use on a multivariate oceanography dataset consisting of temperature, salinity, and oxygen, as measured by autonomous floats in the Southern Ocean.
Submitted 17 June, 2024; v1 submitted 9 December, 2023;
originally announced December 2023.
-
Multivariate Matérn Models -- A Spectral Approach
Authors:
Drew Yarger,
Stilian Stoev,
Tailen Hsing
Abstract:
The classical Matérn model has been a staple in spatial statistics. Novel data-rich applications in environmental and physical sciences, however, call for new, flexible vector-valued spatial and space-time models. Therefore, the extension of the classical Matérn model has been a problem of active theoretical and methodological interest. In this paper, we offer a new perspective on extending the Matérn covariance model to the vector-valued setting. We adopt a spectral, stochastic integral approach, which allows us to address challenging issues on the validity of the covariance structure and at the same time to obtain new, flexible, and interpretable models. In particular, our multivariate extensions of the Matérn model allow for asymmetric covariance structures. Moreover, the spectral approach provides an essentially complete flexibility in modeling the local structure of the process. We establish closed-form representations of the cross-covariances when available, compare them with existing models, simulate Gaussian instances of these new processes, and demonstrate estimation of the model's parameters through maximum likelihood. An application of the new class of multivariate Matérn models to environmental data indicates their success in capturing inherent covariance-asymmetry phenomena.
Submitted 2 June, 2024; v1 submitted 5 September, 2023;
originally announced September 2023.
-
Autocalibration of the E3SM version 2 atmosphere model using a PCA-based surrogate for spatial fields
Authors:
Drew Yarger,
Benjamin Wagman,
Lyndsay Shand,
Kenny Chowdhary
Abstract:
Global Climate Model (GCM) tuning (calibration) is a tedious and time-consuming process, with high-dimensional input and output fields. Experts typically tune by iteratively running climate simulations with hand-picked values of tuning parameters. Many, in both the statistical and climate literature, have proposed alternative calibration methods, but most are impractical or difficult to implement. We present a practical, robust, and rigorous calibration approach on the atmosphere-only model of the Department of Energy's Energy Exascale Earth System Model (E3SM) version 2. Our approach has two main parts: (1) the training of a surrogate that predicts E3SM output in a fraction of the time compared to running E3SM, and (2) gradient-based parameter optimization. To train the surrogate, we generate a set of designed ensemble runs that span our input parameter space and use polynomial chaos expansions on a reduced output space to fit the E3SM output. We use this surrogate in an optimization scheme to identify values of the input parameters for which our model best matches gridded spatial fields of climate observations. To validate our choice of parameters, we run E3SMv2 with the optimal parameter values and compare prediction results to expert-tuned simulations across 45 different output fields. This flexible, robust, and automated approach is straightforward to implement, and we demonstrate that the resulting model output matches present-day climate observations as well as or better than the corresponding output from expert-tuned parameter values, while considering high-dimensional output and operating in a fraction of the time.
Submitted 13 August, 2023;
originally announced August 2023.
-
Detecting changepoints in globally-indexed functional time series
Authors:
Drew Yarger,
J. Derek Tucker
Abstract:
In environmental and climate data, there is often an interest in determining if and when changes occur in a system. Such changes may result from localized sources in space and time like a volcanic eruption or climate geoengineering events. Detecting such events and their subsequent influence on climate has important policy implications. However, the climate system is complex, and such changes can be challenging to detect. One statistical perspective for changepoint detection is functional time series, where one observes an entire function at each time point. We will consider the context where each time point is a year, and we observe a function of temperature indexed by day of the year. Furthermore, such data is measured at many spatial locations on Earth, which motivates accommodating sets of functional time series that are spatially-indexed on a sphere. Simultaneously inferring changes that can occur at different times for different locations is challenging. We propose test statistics for detecting these changepoints, and we evaluate performance using varying levels of data complexity, including a simulation study, simplified climate model simulations, and climate reanalysis data. We evaluate changes in stratospheric temperature globally over 1984-1998. Such changes may be associated with the eruption of Mt. Pinatubo in 1991.
Submitted 10 August, 2023;
originally announced August 2023.
-
Elastic Functional Changepoint Detection of Climate Impacts from Localized Sources
Authors:
J. Derek Tucker,
Drew Yarger
Abstract:
Detecting changepoints in functional data has become an important problem as interest in monitoring of climate phenomena has increased, where the data is functional in nature. The observed data often contains both amplitude ($y$-axis) and phase ($x$-axis) variability. If not accounted for properly, true changepoints may be undetected, and the estimated underlying mean change functions will be incorrect. In this paper, an elastic functional changepoint method is developed which properly accounts for these types of variability. The method can detect both amplitude and phase changepoints, which current methods in the literature do not, as they focus solely on the amplitude changepoint. This method can easily be implemented using the functions directly or can be computed via functional principal component analysis to ease the computational burden. We apply the method and its non-elastic competitors to both simulated data and observed data to show its efficiency in handling data with phase variation with both amplitude and phase changepoints. We use the method to evaluate potential changes in stratospheric temperature due to the eruption of Mt. Pinatubo in the Philippines in June 1991. Using an epidemic changepoint model, we find evidence of an increase in stratospheric temperature during a period that contains the immediate aftermath of Mt. Pinatubo, with most detected changepoints occurring in the tropics as expected.
Submitted 10 August, 2023; v1 submitted 22 November, 2022;
originally announced November 2022.
-
A multivariate functional-data mixture model for spatio-temporal data: inference and cokriging
Authors:
Moritz Korte-Stapff,
Drew Yarger,
Stilian Stoev,
Tailen Hsing
Abstract:
In this paper, we introduce a model for multivariate, spatio-temporal functional data. Specifically, this work proposes a mixture model that is used to perform spatio-temporal prediction (cokriging) when both the response and the additional covariates are functional data. The estimation of such models in the context of expansive data poses many methodological and computational challenges. We propose a new Monte Carlo Expectation Maximization algorithm based on importance sampling to estimate model parameters. We validate our methodology using simulation studies and provide a comparison to previously proposed functional clustering methodologies. To tackle computational challenges, we describe a variety of advances that enable application on large spatio-temporal datasets. The methodology is applied on Argo oceanographic data in the Southern Ocean to predict oxygen concentration, which is critical to ocean biodiversity and reflects fundamental aspects of the carbon cycle. Our model and implementation comprehensively provide oxygen predictions and their estimated uncertainty as well as recover established oceanographic fronts.
Submitted 8 November, 2022;
originally announced November 2022.
-
A probabilistic model of ocean floats under ice
Authors:
Derek Hansen,
Drew Yarger
Abstract:
The Argo project deploys thousands of floats throughout the world's oceans. Carried only by the current, these floats take measurements such as temperature and salinity at depths of up to two kilometers. These measurements are critical for scientific tasks such as modeling climate change, estimating temperature and salinity fields, and tracking the global hydrological cycle. In the Southern Ocean, Argo floats frequently drift under ice cover which prevents tracking via GPS. Managing this missing location data is an important scientific challenge for the Argo project. To predict the floats' trajectories under ice and quantify their uncertainty, we introduce a probabilistic state-space model (SSM) called ArgoSSM. ArgoSSM infers the posterior distribution of a float's position and velocity at each time based on all available data, which includes GPS measurements, ice cover, and potential vorticity. This inference is achieved via an efficient particle filtering scheme, which is effective despite the high signal-to-noise ratio in the GPS data. Compared to existing interpolation approaches in oceanography, ArgoSSM more accurately predicts held-out GPS measurements. Moreover, because uncertainty estimates are well-calibrated in the posterior distribution, ArgoSSM enables more robust and accurate temperature, salinity, and circulation estimates.
Submitted 30 September, 2022;
originally announced October 2022.
-
A functional-data approach to the Argo data
Authors:
Drew Yarger,
Stilian Stoev,
Tailen Hsing
Abstract:
The Argo data is a modern oceanography dataset that provides unprecedented global coverage of temperature and salinity measurements in the upper 2,000 meters of depth of the ocean. We study the Argo data from the perspective of functional data analysis (FDA). We develop spatio-temporal functional kriging methodology for mean and covariance estimation to predict temperature and salinity at a fixed location as a smooth function of depth. By combining tools from FDA and spatial statistics, including smoothing splines, local regression, and multivariate spatial modeling and prediction, our approach provides advantages over current methodology that considers pointwise estimation at fixed depths. Our approach naturally leverages the irregularly sampled data in space, time, and depth to fit a space-time functional model for temperature and salinity. The developed framework provides new tools to address fundamental scientific problems involving the entire upper water column of the oceans such as the estimation of ocean heat content, stratification, and thermohaline oscillation. For example, we show that our functional approach yields more accurate ocean heat content estimates than ones based on discrete integral approximations in pressure. Further, using the derivative function estimates, we obtain a new product of a global map of the mixed layer depth, a key component in the study of heat absorption and nutrient circulation in the oceans. The derivative estimates also reveal evidence for density inversions in areas distinguished by mixing of particularly different water masses.
Submitted 9 May, 2021; v1 submitted 8 June, 2020;
originally announced June 2020.