-
Why not a thin plate spline for spatial models? A comparative study using Bayesian inference
Authors:
Joaquin Cavieres,
Paula Moraga,
Cole C. Monnahan
Abstract:
Spatial modelling often uses Gaussian random fields to capture the stochastic nature of studied phenomena. However, this approach incurs a significant computational burden (O(n^3)), primarily due to covariance matrix computations. In this study, we propose to use a low-rank approximation of a thin plate spline as a spatial random effect in Bayesian spatial models. We compare its statistical performance and computational efficiency with the approximated Gaussian random field (by the SPDE method). In this case, the dense matrix of the thin plate spline is approximated using a truncated spectral decomposition, resulting in computational complexity of O(kn^2) operations, where k is the number of knots. Bayesian inference is conducted via the Hamiltonian Monte Carlo algorithm of the probabilistic software Stan, which allows us to evaluate performance and diagnostics for the proposed models. A simulation study reveals that both models accurately recover the parameters used to simulate data. However, models using a thin plate spline need less execution time to reach chain convergence than models using an approximated Gaussian random field. Furthermore, thin plate spline models exhibited better computational efficiency for simulated data coming from different spatial locations. In a real application, models using a thin plate spline as spatial random effect produced similar results in estimating a relative index of abundance for a benthic marine species when compared to models incorporating an approximated Gaussian random field. Although they were not the most computationally efficient models, their simplicity in parametrization, execution time and predictive performance make them a valid alternative for spatial modelling under Bayesian inference.
Submitted 19 April, 2024;
originally announced April 2024.
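The truncated spectral decomposition mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard 2-D thin plate spline radial basis r^2 log(r) and keeps the k eigenpairs of largest absolute eigenvalue. The function names (`tps_kernel_matrix`, `truncate_spectrum`) and the simulated coordinates are hypothetical, not taken from the paper, and the dense `numpy.linalg.eigh` call used here costs O(n^3); reaching the O(kn^2) cost quoted in the abstract would require an iterative (Lanczos-type) eigensolver instead. This is not the authors' Stan implementation.

```python
import numpy as np

def tps_kernel_matrix(coords):
    """Dense 2-D thin plate spline matrix with E[i, j] = r^2 * log(r)."""
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.sqrt((diff ** 2).sum(axis=-1))
    E = np.zeros_like(r)
    mask = r > 0                      # r^2 * log(r) -> 0 as r -> 0
    E[mask] = r[mask] ** 2 * np.log(r[mask])
    return E

def truncate_spectrum(E, k):
    """Keep the k eigenpairs of symmetric E with largest absolute eigenvalue."""
    vals, vecs = np.linalg.eigh(E)    # full decomposition (assumption: small n)
    idx = np.argsort(np.abs(vals))[::-1][:k]
    return vecs[:, idx], vals[idx]

# Hypothetical example: 50 random locations in the unit square, rank 10.
rng = np.random.default_rng(0)
coords = rng.uniform(size=(50, 2))
E = tps_kernel_matrix(coords)
U_k, d_k = truncate_spectrum(E, k=10)
E_k = (U_k * d_k) @ U_k.T             # rank-k approximation of E
```

The columns of `U_k` then play the role of a reduced basis with k coefficients rather than n, which is what makes the spline usable as a low-rank spatial random effect.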
-
Spatial Latent Gaussian Modelling with Change of Support
Authors:
Erick A. Chacón-Montalván,
Peter M. Atkinson,
Christopher Nemeth,
Benjamin M. Taylor,
Paula Moraga
Abstract:
Spatial data are often derived from multiple sources (e.g. satellites, in-situ sensors, survey samples) with different supports, but associated with the same properties of a spatial phenomenon of interest. It is common for predictors to also be measured on different spatial supports than the response variables. Although there is no standard way to work with spatial data with different supports, a prevalent approach used by practitioners has been to use downscaling or interpolation to project all the variables of analysis towards a common support, and then to apply standard spatial models. The main disadvantage with this approach is that simple interpolation can introduce biases and, more importantly, the uncertainty associated with the change of support is not taken into account in parameter estimation. In this article, we propose a Bayesian spatial latent Gaussian model that can handle data with different rectilinear supports in both the response variable and predictors. Our approach handles changes of support more naturally, according to the properties of the spatial stochastic process being used, and takes into account the uncertainty from the change of support in parameter estimation and prediction. We use spatial stochastic processes as linear combinations of basis functions where Gaussian Markov random fields define the weights. Our hierarchical modelling approach can be described by the following steps: (i) define a latent model where response variables and predictors are considered as latent stochastic processes with continuous support, (ii) link the continuous-index-set stochastic processes with their projections onto the support of the observed data, (iii) link the projected process with the observed data. We show the applicability of our approach by simulation studies and by modelling land suitability for improved grassland in Rhondda Cynon Taf, a county borough in Wales.
Submitted 13 March, 2024;
originally announced March 2024.
-
On adaptive kernel intensity estimation on linear networks
Authors:
Jonatan A. González,
Paula Moraga
Abstract:
In the analysis of spatial point patterns on linear networks, a critical statistical objective is estimating the first-order intensity function, representing the expected number of points within specific subsets of the network. Typically, non-parametric approaches employing heat kernels are used for this estimation. However, a significant challenge arises in selecting appropriate bandwidths before conducting the estimation. We study an intensity estimation mechanism that overcomes this limitation using adaptive estimators, where bandwidths adapt to the data points in the pattern. While adaptive estimators have been explored in other contexts, their application in linear networks remains underexplored. We investigate the adaptive intensity estimator within the linear network context and extend a partitioning technique based on bandwidth quantiles to expedite the estimation process significantly. Through simulations, we demonstrate the efficacy of this technique, showing that the partition estimator closely approximates the direct estimator while drastically reducing computation time. As a practical application, we employ our method to estimate the intensity of traffic accidents in a neighbourhood in Medellin, Colombia, showcasing its real-world relevance and efficiency.
Submitted 17 September, 2023;
originally announced September 2023.
-
Spatial data fusion adjusting for preferential sampling using INLA and SPDE
Authors:
Ruiman Zhong,
André Victor Ribeiro Amaral,
Paula Moraga
Abstract:
Spatially misaligned data can be fused by using a Bayesian melding model that assumes that underlying all observations there is a spatially continuous Gaussian random field process. This model can be used, for example, to predict air pollution levels by combining point data from monitoring stations and areal data from satellite imagery.
However, if the data presents preferential sampling, that is, if the observed point locations are not independent of the underlying spatial process, the inference obtained from models that ignore such a dependence structure might not be valid.
In this paper, we present a Bayesian spatial model for the fusion of point and areal data that takes into account preferential sampling. The model combines the Bayesian melding specification and a model for the stochastically dependent sampling and underlying spatial processes.
Fast Bayesian inference is performed using the integrated nested Laplace approximation (INLA) and the stochastic partial differential equation (SPDE) approaches. The performance of the model is assessed using simulated data in a range of scenarios and sampling strategies that can appear in real settings. The model is also applied to predict air pollution in the USA.
Submitted 5 June, 2024; v1 submitted 6 September, 2023;
originally announced September 2023.
-
A multitype Fiksel interaction model for tumour immune microenvironments
Authors:
Jonatan A. González,
Paula Moraga
Abstract:
The tumour microenvironment plays a fundamental role in understanding the development and progression of cancer. This paper proposes a novel spatial point process model that accounts for inhomogeneity and interaction to flexibly model a complex database of cells in the tumour immune microenvironments of a cohort of patients with non-small-cell lung cancer whose samples have been processed using digital pathology techniques. Specifically, an inhomogeneous multitype Gibbs point process model with an associated Fiksel-type interaction function is proposed. Estimation and inference procedures are conducted through maximum pseudolikelihood, considering replicated multitype point patterns.
Submitted 9 July, 2023;
originally announced July 2023.
-
Analysing spatial point patterns in digital pathology: immune cells in high-grade serous ovarian carcinomas
Authors:
Jonatan A. González,
Julia Wrobel,
Simon Vandekar,
Paula Moraga
Abstract:
Multiplex immunofluorescence (mIF) imaging technology facilitates the study of the tumour microenvironment in cancer patients. Due to the capabilities of this emerging bioimaging technique, it is possible to statistically analyse, for example, the co-varying location and functions of multiple different types of immune cells. Complex spatial relationships between different immune cells have been shown to correlate with patient outcomes and may reveal new pathways for targeted immunotherapy treatments.
This tutorial reviews methods and procedures relating to spatial point patterns for complex data analysis. We consider tissue cells as a realisation of a spatial point process for each patient. We focus on proper functional descriptors for each observation and techniques that allow us to obtain information about inter-patient variation.
Ovarian cancer is the deadliest gynaecological malignancy and can resist chemotherapy treatment that is effective in other cancers. We use a dataset of high-grade serous ovarian cancer samples from 51 patients. We examine the immune cell composition (T cells, B cells, macrophages) within tumours and additional information such as cell classification (tumour or stroma) and other patient clinical characteristics. Our analyses, supported by reproducible software, apply to other digital pathology datasets.
Submitted 6 July, 2023;
originally announced July 2023.
-
Extended Excess Hazard Models for Spatially Dependent Survival Data
Authors:
André Victor Ribeiro Amaral,
Francisco Javier Rubio,
Manuela Quaresma,
Francisco J. Rodríguez-Cortés,
Paula Moraga
Abstract:
Relative survival represents the preferred framework for the analysis of population cancer survival data. The aim is to model the survival probability associated with cancer in the absence of information about the cause of death. Recent data linkage developments have allowed for incorporating the place of residence into population cancer databases; however, modeling this spatial information has received little attention in the relative survival setting. We propose a flexible parametric class of spatial excess hazard models (along with inference tools), named "Relative Survival Spatial General Hazard" (RS-SGH), that allows for the inclusion of fixed and spatial effects in both time-level and hazard-level components. We illustrate the performance of the proposed model using an extensive simulation study, and provide guidelines about the interplay of sample size, censoring, and model misspecification. We present a case study using real data from colon cancer patients in England. This case study illustrates how a spatial model can be used to identify geographical areas with low cancer survival, as well as how to summarize such a model through marginal survival quantities and spatial effects.
Submitted 6 December, 2023; v1 submitted 18 February, 2023;
originally announced February 2023.
-
An adaptive kernel estimator for the intensity function of spatio-temporal point processes
Authors:
Jonatan A. González,
Paula Moraga
Abstract:
In spatio-temporal point pattern analysis, one of the main statistical objectives is to estimate the first-order intensity function, i.e., the expected number of points per unit area and unit time. This estimation is usually carried out non-parametrically through kernel functions, where one of the most frequent handicaps is the selection of kernel bandwidths prior to estimation. This work presents an intensity estimation mechanism in which the spatial and temporal bandwidths change at each data point in a spatio-temporal point pattern. This class of estimators is called adaptive estimators, and although they have been studied in spatial settings, little has been said about them in the spatio-temporal context. We define the adaptive intensity estimator in the spatio-temporal context and extend a partitioning technique based on the bandwidth quantiles to perform a fast estimation. We demonstrate through simulation that this technique works well in practice, with the partition estimator approximating the direct estimator at a much lower computation time. Finally, we apply our method to estimate the spatio-temporal intensity of fires in the Amazonia basin.
Submitted 25 August, 2022;
originally announced August 2022.
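The contrast between the direct adaptive estimator and the partition estimator described in the abstract above can be sketched in a simplified setting. The sketch below is purely spatial (no temporal bandwidth), uses an isotropic Gaussian kernel, applies no edge correction, and assumes Abramson-type bandwidths that scale with the inverse square root of a pilot density. All function names, the midpoint rule for the representative bandwidth of each bin, and the simulated data are assumptions for illustration, not the authors' code.

```python
import numpy as np

def gauss2d(u, x, h):
    """Isotropic 2-D Gaussian kernel; h is a scalar or one bandwidth per point in x."""
    d2 = ((u[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2 * h ** 2)) / (2 * np.pi * h ** 2)

def abramson_bandwidths(x, h0, pilot_h):
    """Bandwidths h_i = h0 * (pilot(x_i) / gamma)^(-1/2), gamma = geometric mean."""
    pilot = gauss2d(x, x, pilot_h).sum(axis=1)   # fixed-bandwidth pilot estimate
    gamma = np.exp(np.mean(np.log(pilot)))
    return h0 * (pilot / gamma) ** -0.5

def direct_estimate(u, x, h):
    """Direct adaptive estimator: every data point uses its own bandwidth."""
    return gauss2d(u, x, h).sum(axis=1)

def partition_estimate(u, x, h, n_bins):
    """Bin bandwidths by quantiles; each bin uses one representative bandwidth."""
    edges = np.quantile(h, np.linspace(0, 1, n_bins + 1))
    bins = np.clip(np.searchsorted(edges, h, side="right") - 1, 0, n_bins - 1)
    out = np.zeros(len(u))
    for b in range(n_bins):
        members = bins == b
        if members.any():
            h_b = 0.5 * (edges[b] + edges[b + 1])   # bin midpoint (assumption)
            out += gauss2d(u, x[members], h_b).sum(axis=1)
    return out

# Hypothetical example: 200 uniform points, intensity evaluated on a 15 x 15 grid.
rng = np.random.default_rng(1)
x = rng.uniform(size=(200, 2))
g = np.linspace(0, 1, 15)
u = np.column_stack([a.ravel() for a in np.meshgrid(g, g)])
h = abramson_bandwidths(x, h0=0.1, pilot_h=0.15)
lam_direct = direct_estimate(u, x, h)
lam_part = partition_estimate(u, x, h, n_bins=25)
```

In this sketch each bin is still evaluated by a direct sum, so it shows only the structure of the partition estimator. The speed-up would come from replacing each per-bin sum with a fast fixed-bandwidth estimator, since all points in a bin share one bandwidth.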
-
Panchromatic HST/WFC3 Imaging Studies of Young, Rapidly Evolving Planetary Nebulae. I. NGC 6302
Authors:
Joel H. Kastner,
Paula Moraga,
Bruce Balick,
Jesse Bublitz,
Rodolfo Montez Jr.,
Adam Frank,
Eric Blackman
Abstract:
We present the results of a comprehensive, near-UV-to-near-IR Hubble Space Telescope WFC3 imaging study of the young planetary nebula (PN) NGC 6302, the archetype of the class of extreme bi-lobed, pinched-waist PNe that are rich in dust and molecular gas. The new WFC3 emission-line image suite clearly defines the dusty toroidal equatorial structure that bisects NGC 6302's polar lobes, and the fine structures (clumps, knots, and filaments) within the lobes. The most striking aspect of the new WFC3 image suite is the bright, S-shaped 1.64 micron [Fe II] emission that traces the southern interior of the east lobe rim and the northern interior of the west lobe rim, in point-symmetric fashion. We interpret this [Fe II] emitting region as a zone of shocks caused by ongoing, fast (~100 km/s), collimated, off-axis winds from NGC 6302's central star(s). The [Fe II] emission and a zone of dusty, N- and S-rich clumps near the nebular symmetry axis form wedge-shaped structures on opposite sides of the core, with boundaries marked by sharp azimuthal ionization gradients. Comparison of our new images with earlier HST/WFC3 imaging reveals that the object previously identified as NGC 6302's central star is a foreground field star. Shell-like inner lobe features may instead pinpoint the obscured central star's actual position within the nebula's dusty central torus. The juxtaposition of structures revealed in this HST/WFC3 imaging study of NGC 6302 presents a daunting challenge for models of the origin and evolution of bipolar PNe.
Submitted 20 December, 2021; v1 submitted 28 May, 2021;
originally announced May 2021.
-
Spatial and Spatio-Temporal Log-Gaussian Cox Processes: Extending the Geostatistical Paradigm
Authors:
Peter J. Diggle,
Paula Moraga,
Barry Rowlingson,
Benjamin M. Taylor
Abstract:
In this paper we first describe the class of log-Gaussian Cox processes (LGCPs) as models for spatial and spatio-temporal point process data. We discuss inference, with a particular focus on the computational challenges of likelihood-based inference. We then demonstrate the usefulness of the LGCP by describing four applications: estimating the intensity surface of a spatial point process; investigating spatial segregation in a multi-type process; constructing spatially continuous maps of disease risk from spatially discrete data; and real-time health surveillance. We argue that problems of this kind fit naturally into the realm of geostatistics, which traditionally is defined as the study of spatially continuous processes using spatially discrete observations at a finite number of locations. We suggest that a more useful definition of geostatistics is by the class of scientific problems that it addresses, rather than by particular models or data formats.
Submitted 23 December, 2013;
originally announced December 2013.