-
sfislands: An R Package for Accommodating Islands and Disjoint Zones in Areal Spatial Modelling
Authors:
Kevin Horan,
Katarina Domijan,
Chris Brunsdon
Abstract:
Fitting areal models which use a spatial weights matrix to represent relationships between geographical units can be a cumbersome task, particularly when these units are not well-behaved. The two chief aims of sfislands are to simplify the process of creating an appropriate neighbourhood matrix, and to quickly visualise the predictions of subsequent models. The package uses visual aids in the form of easily-generated maps to help this process. This paper demonstrates how sfislands could be useful to researchers. It begins by describing the package's functions in the context of a proposed workflow. It then presents three worked examples showing a selection of potential use-cases. These range from earthquakes in Indonesia, to river crossings in London, and hierarchical models of output areas in Liverpool. We aim to show how the sfislands package streamlines much of the human workflow involved in creating and examining such models.
Submitted 15 April, 2024;
originally announced April 2024.
-
gwverse: a template for a new generic Geographically Weighted R package
Authors:
Alexis Comber,
Chris Brunsdon,
Martin Callaghan,
Paul Harris,
Binbin Lu,
Nick Malleson
Abstract:
GWR is a popular approach for investigating the spatial variation in relationships between response and predictor variables, and critically for investigating and understanding process spatial heterogeneity. The geographically weighted (GW) framework is increasingly used to accommodate different types of models and analyses, reflecting a wider desire to explore spatial variation in model parameters or components. However, the growth in the use of GWR and different GW models has only been partially supported by package development in both R and Python, the major coding environments for spatial analysis. The result is that refinements have been inconsistently included (if at all) within GWR and GW functions in any given package. This paper outlines the structure of a new `gwverse` package that will, over time, replace `GWmodel` and that takes advantage of recent developments in the composition of complex, integrated packages. It conceptualises `gwverse` as having a modular structure that separates core GW functionality from applications such as GWR. It adopts a function factory approach, in which bespoke functions are created and returned to the user based on user-defined parameters. The paper introduces two demonstrator modules that can be used to undertake GWR and identifies a number of key considerations and next steps.
Submitted 29 September, 2021;
originally announced September 2021.
-
Opening practice: supporting Reproducibility and Critical spatial data science
Authors:
Chris Brunsdon,
Alexis Comber
Abstract:
This paper reflects on a number of trends towards a more open and reproducible approach to geographic and spatial data science over recent years. In particular it considers trends towards Big Data, and the impacts this is having on spatial data analysis and modelling. It identifies a turn in academia towards coding as a core analytic tool, and away from proprietary software tools offering 'black boxes' where the internal workings of the analysis are not revealed. It argues that this closed form software is problematic, and considers a number of ways in which issues identified in spatial data analysis (such as the MAUP) could be overlooked when working with closed tools, leading to problems of interpretation and possibly inappropriate actions and policies based on these. In addition, this paper considers the role that reproducible and open spatial science may play in such an approach, taking into account the issues raised. It highlights the dangers of failing to account for the geographical properties of data, now that all data are spatial (they are collected somewhere), and the problems of a desire for n=all observations in data science, and it identifies the need for a critical approach. This is one in which openness, transparency, sharing and reproducibility provide a mantra for defensible and robust spatial data science.
Submitted 20 July, 2020;
originally announced August 2020.
-
Big Issues for Big Data: challenges for critical spatial data analytics
Authors:
Chris Brunsdon,
Alexis Comber
Abstract:
In this paper we consider some of the issues of working with big data and big spatial data and highlight the need for an open and critical framework. We focus on a set of challenges underlying the collection and analysis of big data. In particular, we consider 1) the issues related to inference when working with usually biased big data, challenging the assumed inferential superiority of data with observations, n, approaching N, the population (n->N), and the need for data science analyses that answer questions of practical significance or place greater emphasis on the size of the effect, rather than the truth or falsehood of a statistical statement; 2) the need to accept messiness in your data and to document all operations undertaken on the data because of this, in support of openness and reproducibility paradigms; and 3) the need to explicitly seek to understand the causes of bias, messiness etc. in the data and the inferential consequences of using such data in analyses, by adopting critical approaches to spatial data science. In particular we consider the need to place individual data science studies in wider social and economic contexts, along with the role of inferential theory in the presence of big data, and issues relating to messiness and complexity in big data.
Submitted 11 August, 2020; v1 submitted 22 July, 2020;
originally announced July 2020.
-
The GWR route map: a guide to the informed application of Geographically Weighted Regression
Authors:
Alexis Comber,
Chris Brunsdon,
Martin Charlton,
Guanpeng Dong,
Rich Harris,
Binbin Lu,
Yihe Lü,
Daisuke Murakami,
Tomoki Nakaya,
Yunqiang Wang,
Paul Harris
Abstract:
Geographically Weighted Regression (GWR) is increasingly used in spatial analyses of social and environmental data. It allows spatial heterogeneities in processes and relationships to be investigated through a series of local regression models rather than a global one. Standard GWR assumes that the relationships between the response and predictor variables operate at the same spatial scale, which is frequently not the case. To address this, several GWR variants have been proposed. This paper describes a route map to inform the choice of whether to use a GWR model or not, and if so which of three core variants to apply: a standard GWR, a mixed GWR or a multiscale GWR (MS-GWR). The route map comprises three primary steps: a basic linear regression, an MS-GWR, and investigations of the results of these. The paper provides guidance for deciding whether to use a GWR approach, and if so for determining the appropriate GWR variant. It describes the importance of investigating a number of secondary issues at global and local scales including collinearity, the influence of outliers, and dependent error terms. Code and data for the case study used to illustrate the route map are provided, and further considerations are described in an extensive Appendix.
Submitted 14 April, 2020; v1 submitted 13 April, 2020;
originally announced April 2020.
-
The importance of scale in spatially varying coefficient modeling
Authors:
Daisuke Murakami,
Binbin Lu,
Paul Harris,
Chris Brunsdon,
Martin Charlton,
Tomoki Nakaya,
Daniel A. Griffith
Abstract:
While spatially varying coefficient (SVC) models have attracted considerable attention in applied science, they have been criticized as being unstable. The objective of this study is to show that capturing the "spatial scale" of each data relationship is crucially important to make SVC modeling more stable, and in doing so, adds flexibility. Here, the analytical properties of six SVC models are summarized in terms of their characterization of scale. Models are examined through a series of Monte Carlo simulation experiments to assess the extent to which spatial scale influences model stability and the accuracy of their SVC estimates. The following models are studied: (i) geographically weighted regression (GWR) with a fixed distance or (ii) an adaptive distance bandwidth (GWRa), (iii) flexible bandwidth GWR (FB-GWR) with fixed distance or (iv) adaptive distance bandwidths (FB-GWRa), (v) eigenvector spatial filtering (ESF), and (vi) random effects ESF (RE-ESF). Results reveal that the SVC models designed to capture scale dependencies in local relationships (FB-GWR, FB-GWRa and RE-ESF) most accurately estimate the simulated SVCs, where RE-ESF is the most computationally efficient. Conversely GWR and ESF, where SVC estimates are naively assumed to operate at the same spatial scale for each relationship, perform poorly. Results also confirm that the adaptive bandwidth GWR models (GWRa and FB-GWRa) are superior to their fixed bandwidth counterparts (GWR and FB-GWR).
Submitted 25 September, 2017;
originally announced September 2017.
-
The GWmodel R package: Further Topics for Exploring Spatial Heterogeneity using Geographically Weighted Models
Authors:
Binbin Lu,
Paul Harris,
Martin Charlton,
Chris Brunsdon
Abstract:
In this study, we present a collection of local models, termed geographically weighted (GW) models, that can be found within the GWmodel R package. A GW model suits situations when spatial data are poorly described by the global form, and for some regions the localised fit provides a better description. The approach uses a moving window weighting technique, where a collection of local models are estimated at target locations. Commonly, model parameters or outputs are mapped so that the nature of spatial heterogeneity can be explored and assessed. In particular, we present case studies using: (i) GW summary statistics and a GW principal components analysis; (ii) advanced GW regression fits and diagnostics; (iii) associated Monte Carlo significance tests for non-stationarity; (iv) a GW discriminant analysis; and (v) enhanced kernel bandwidth selection procedures. General Election data sets from the Republic of Ireland and US are used for demonstration. This study is designed to complement a companion GWmodel study, which focuses on basic and robust GW models.
Submitted 10 December, 2013;
originally announced December 2013.
-
GWmodel: an R Package for Exploring Spatial Heterogeneity using Geographically Weighted Models
Authors:
Isabella Gollini,
Binbin Lu,
Martin Charlton,
Christopher Brunsdon,
Paul Harris
Abstract:
Spatial statistics is a growing discipline providing important analytical techniques in a wide range of disciplines in the natural and social sciences. In the R package GWmodel, we introduce techniques from a particular branch of spatial statistics, termed geographically weighted (GW) models. GW models suit situations when data are not described well by some global model, but where there are spatial regions where a suitably localised calibration provides a better description. The approach uses a moving window weighting technique, where localised models are found at target locations. Outputs are mapped to provide a useful exploratory tool into the nature of the data spatial heterogeneity. GWmodel includes: GW summary statistics, GW principal components analysis, GW regression, GW regression with a local ridge compensation, and GW regression for prediction; some of which are provided in basic and robust forms.
Submitted 17 March, 2014; v1 submitted 3 June, 2013;
originally announced June 2013.