-
A Criterion for Multivariate Regionalization of Spatial Data
Authors:
Ranadeep Daw,
Christopher K. Wikle,
Jonathan R. Bradley,
Scott H. Holan
Abstract:
The modifiable areal unit problem in geography, or the change-of-support (COS) problem in statistics, demonstrates that the interpretation of spatial (or spatio-temporal) data analysis is affected by the choice of resolutions or geographical units used in the study. The ecological fallacy is one famous example of this phenomenon. Here we investigate the ecological fallacy associated with the COS problem for multivariate spatial data with the goal of providing a data-driven discretization criterion for the domain of interest that minimizes aggregation errors. The discretization is based on a novel multiscale metric, called the Multivariate Criterion for Aggregation Error (MVCAGE). Such multiscale representations of an underlying multivariate process are often formulated in terms of basis expansions. We show that a particularly useful basis expansion in this context is the multivariate Karhunen-Loève expansion (MKLE). We use the MKLE to build the MVCAGE loss function and use it within the framework of spatial clustering algorithms to perform optimal spatial aggregation. We demonstrate the effectiveness of our approach through simulation and through regionalization of county-level income and hospital quality data over the United States and prediction of ocean color in the coastal Gulf of Alaska.
Submitted 21 December, 2023; v1 submitted 19 December, 2023;
originally announced December 2023.
-
REDS: Random Ensemble Deep Spatial prediction
Authors:
Ranadeep Daw,
Christopher K. Wikle
Abstract:
There has been a great deal of recent interest in the development of spatial prediction algorithms for very large datasets and/or prediction domains. These methods have primarily been developed in the spatial statistics community, but there has been growing interest in the machine learning community for such methods, primarily driven by the success of deep Gaussian process regression approaches and deep convolutional neural networks. These methods are often computationally expensive to train and implement, and consequently there has been a resurgence of interest in random projections and deep learning models based on random weights, the so-called reservoir computing methods. Here, we combine several of these ideas to develop the Random Ensemble Deep Spatial (REDS) approach to predict spatial data. The procedure uses random Fourier features as inputs to an extreme learning machine (a deep neural model with random weights), and with calibrated ensembles of outputs from this model based on different random weights, it provides a simple uncertainty quantification. The REDS method is demonstrated on simulated data and on a classic large satellite data set.
Submitted 8 November, 2022;
originally announced November 2022.
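The pipeline described in the REDS abstract (random Fourier features fed into a random-weight network, repeated over several random draws) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single hidden layer, the ridge penalty, the length scale, and the toy data are all assumptions made for illustration, and the ensemble calibration step mentioned in the abstract is omitted.

```python
import numpy as np

def fit_member(X, y, rng, n_rff=100, n_hidden=200, lengthscale=0.2, ridge=1e-3):
    """Fit one ensemble member; only the output weights are estimated."""
    d = X.shape[1]
    # Random Fourier feature parameters (assumed Gaussian-kernel style draw).
    W = rng.normal(scale=1.0 / lengthscale, size=(d, n_rff))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_rff)
    # Random hidden-layer weights, drawn once and never trained.
    A = rng.normal(size=(n_rff, n_hidden))
    c = rng.normal(size=n_hidden)

    def hidden(Z):
        phi = np.sqrt(2.0 / n_rff) * np.cos(Z @ W + b)
        return np.tanh(phi @ A + c)

    # Ridge regression for the output layer (the only fitted parameters).
    H = hidden(X)
    beta = np.linalg.solve(H.T @ H + ridge * np.eye(n_hidden), H.T @ y)
    return lambda Z: hidden(Z) @ beta

def ensemble_predict(X, y, X_new, n_members=20, seed=0):
    """Average members built from different random weights; report their spread."""
    rng = np.random.default_rng(seed)
    preds = np.stack([fit_member(X, y, rng)(X_new) for _ in range(n_members)])
    return preds.mean(axis=0), preds.std(axis=0)

# Toy spatial data on the unit square (hypothetical, for illustration only).
rng = np.random.default_rng(1)
X = rng.uniform(size=(300, 2))
y = np.sin(4 * X[:, 0]) + np.cos(3 * X[:, 1]) + 0.1 * rng.normal(size=300)
mean, spread = ensemble_predict(X, y, X[:5])
```

The spread across members is only a raw measure of disagreement between random draws; the abstract's calibrated uncertainty would require an additional step not shown here.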
-
Correcting spatial Gaussian process parameter and prediction variance estimation under informative sampling
Authors:
Erin M. Schliep,
Christopher K. Wikle,
Ranadeep Daw
Abstract:
Informative sampling designs can impact spatial prediction, or kriging, in two important ways. First, the sampling design can bias spatial covariance parameter estimation, which in turn can bias spatial kriging estimates. Second, even with unbiased estimates of the spatial covariance parameters, since the kriging variance is a function of the observation locations, these estimates will vary based on the sample and overestimate the population-based estimates. In this work, we develop a weighted composite likelihood approach to improve spatial covariance parameter estimation under informative sampling designs. Then, given these parameter estimates, we propose three approaches to quantify the effects of the sampling design on the variance estimates in spatial prediction. These results can be used to make informed decisions for population-based inference. We illustrate our approaches using a comprehensive simulation study. Then, we apply our methods to perform spatial prediction on nitrate concentration in wells located throughout central California.
Submitted 27 August, 2021;
originally announced August 2021.
-
Deep Neural Network in Cusp Catastrophe Model
Authors:
Ranadeep Daw,
Zhuoqiong He
Abstract:
Catastrophe theory was originally proposed to study dynamical systems that exhibit sudden shifts in behavior arising from small changes in input. These models can provide reasonable explanations for abrupt jumps in nonlinear dynamic models. Among the different catastrophe models, the cusp catastrophe model has attracted the most attention due to its relatively simple dynamics and rich domain of application. Due to the complex behavior of the response, the parameter space becomes highly non-convex, and hence it becomes very hard to optimize over it to recover the generating parameters. Instead of solving for these generating parameters, we demonstrate how a machine learning model can be trained to learn the dynamics of cusp catastrophe models without ever solving for the generating model parameters. Simulation studies and applications to a few famous datasets are used to validate our approach. To our knowledge, this is the first paper of its kind in which a neural network based approach has been applied to the cusp catastrophe model.
Submitted 21 April, 2020; v1 submitted 5 April, 2020;
originally announced April 2020.
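The "sudden shifts in behavior arising from small changes in input" mentioned in the abstract above can be made concrete with a small sketch. It assumes the standard textbook form of the cusp model, dy/dt = alpha + beta*y - y^3, which is not stated in the abstract; the function names are hypothetical, and the code only counts equilibria rather than reproducing the paper's neural network approach.

```python
import numpy as np

def cusp_equilibria(alpha, beta):
    """Real roots of alpha + beta*y - y**3 = 0 (assumed canonical cusp form)."""
    roots = np.roots([-1.0, 0.0, beta, alpha])
    # Keep only the real roots, sorted from smallest to largest.
    return np.sort(roots[np.abs(roots.imag) < 1e-9].real)

def has_three_equilibria(alpha, beta):
    """Three equilibria exist when 27*alpha^2 - 4*beta^3 is negative."""
    return 27.0 * alpha**2 - 4.0 * beta**3 < 0.0

# Inside the cusp region there are three equilibria (two stable, one unstable).
inside = cusp_equilibria(0.0, 1.0)
# Outside the cusp region only one equilibrium remains.
outside = cusp_equilibria(1.0, -1.0)
```

When the parameters cross the boundary 27*alpha^2 = 4*beta^3, one stable equilibrium disappears and the state jumps to the other, which is the abrupt change the abstract refers to.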