-
Reducing Investigator Bias in Sampling-Based Land Cover Classification by Integrating Multiple Investigators' Maps Using a Multiple Classifier System
Authors:
Narumasa Tsutsumida,
Akira Kato
Abstract:
Land cover classification plays a pivotal role in describing Earth's surface characteristics. However, these thematic classifications can be affected by uncertainties introduced by an investigator's bias. While land cover classification mapping is becoming easier for us due to the emergence of cloud geospatial platforms such as Google Earth Engine, such uncertainty is often overlooked. Thus, this study aimed to create a robust land cover classification map by reducing investigator-induced uncertainty from independent investigators' maps using a multiple classifier system. In Saitama City, Japan, as a case study, 44 investigators used a point-based visual interpretation method via Google Earth Engine to collect stratified reference samples across four different land cover classes: forest, agriculture, urban, and water. These samples were then used to train a random forest classifier, ultimately resulting in the creation of individual classification maps. We quantified pixel-level discrepancies in these maps, which came from inherent investigator-induced variability. To tackle these uncertainties, we developed a multiple classifier system incorporating K-Medoids to group the most reliable maps and minimize discrepancies. We further applied Bayesian analysis to these grouped maps to produce a unified, accurate classification map. This yielded an overall accuracy of 92.5% for 400 independent validation samples. We discuss how our approach can also reduce salt-and-pepper noise, which is often found in individual classification maps. This research underscores the intrinsic uncertainties present in land cover classification maps attributable to investigator variations and introduces a potential solution to attenuate these variations.
Submitted 23 March, 2024;
originally announced March 2024.
-
A linearization for stable and fast geographically weighted Poisson regression
Authors:
Daisuke Murakami,
Narumasa Tsutsumida,
Takahiro Yoshida,
Tomoki Nakaya,
Binbin Lu,
Paul Harris
Abstract:
Although geographically weighted Poisson regression (GWPR) is a popular regression for spatially indexed count data, its development is relatively limited compared to that found for linear geographically weighted regression (GWR), where many extensions (e.g., multiscale GWR, scalable GWR) have been proposed. The weak development of GWPR can be attributed to the computational cost and identification problem in the underpinning Poisson regression model. This study proposes linearized GWPR (L-GWPR) by introducing a log-linear approximation into the GWPR model to overcome these bottlenecks. Because the L-GWPR model is identical to the Gaussian GWR model, it is free from the identification problem, easily implemented, computationally efficient, and offers similar potential for extension. Specifically, L-GWPR does not require a double-loop algorithm, which makes GWPR slow for large samples. Furthermore, we extended L-GWPR by introducing ridge regularization to enhance its stability (regularized L-GWPR). The results of the Monte Carlo experiments confirmed that regularized L-GWPR estimates local coefficients accurately and computationally efficiently. Finally, we compared GWPR and regularized L-GWPR through a crime analysis in Tokyo.
Submitted 15 May, 2023;
originally announced May 2023.
-
gwpcorMapper: an interactive mapping tool for exploring geographically weighted correlation and partial correlation in high-dimensional geospatial datasets
Authors:
Joseph Emile Honour Percival,
Narumasa Tsutsumida,
Daisuke Murakami,
Takahiro Yoshida,
Tomoki Nakaya
Abstract:
Exploratory spatial data analysis (ESDA) plays a key role in research that includes geographic data. In ESDA, analysts often want to be able to visualize observations and local relationships on a map. However, software dedicated to visualizing local spatial relations between multiple variables in high dimensional datasets remains undeveloped. This paper introduces gwpcorMapper, a newly developed software application for mapping geographically weighted correlation and partial correlation in large multivariate datasets. gwpcorMapper facilitates ESDA by giving researchers the ability to interact with map components that describe local correlative relationships. We built gwpcorMapper using the R Shiny framework. The software inherits its core algorithm from GWpcor, an R library for calculating the geographically weighted correlation and partial correlation statistics. We demonstrate the application of gwpcorMapper by using it to explore census data in order to find meaningful relationships that describe the work-life environment in the 23 special wards of Tokyo, Japan. We show that gwpcorMapper is useful in both variable selection and parameter tuning for geographically weighted statistics. gwpcorMapper highlights that there are strong statistically clear local variations in the relationship between the number of commuters and the total number of hours worked when considering the total population in each district across the 23 special wards of Tokyo. Our application demonstrates that the ESDA process with high-dimensional geospatial data using gwpcorMapper has applications across multiple fields.
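As a rough illustration of the statistic this abstract refers to, a geographically weighted correlation at a single location can be sketched as a kernel-weighted Pearson correlation. This is not the GWpcor or gwpcorMapper code: the function name `gw_correlation`, the Gaussian kernel, and the fixed `bandwidth` parameter are assumptions made for the example, since the abstract does not specify them.

```python
import numpy as np

def gw_correlation(coords, x, y, centre, bandwidth):
    """Kernel-weighted Pearson correlation of x and y around `centre`.

    Hypothetical helper for illustration; a Gaussian kernel with a fixed
    bandwidth is assumed here.
    """
    coords = np.asarray(coords, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Distance from every observation to the location being evaluated.
    dist = np.linalg.norm(coords - np.asarray(centre, dtype=float), axis=1)

    # Gaussian kernel weights, normalised to sum to one.
    w = np.exp(-0.5 * (dist / bandwidth) ** 2)
    w = w / w.sum()

    # Weighted means, covariance and variances.
    mx = np.sum(w * x)
    my = np.sum(w * y)
    cov = np.sum(w * (x - mx) * (y - my))
    var_x = np.sum(w * (x - mx) ** 2)
    var_y = np.sum(w * (y - my) ** 2)
    return float(cov / np.sqrt(var_x * var_y))
```

Evaluating the function at every location of interest and mapping the results would give a surface of local correlations; choosing the bandwidth is the parameter-tuning step the abstract mentions.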
Submitted 8 May, 2022; v1 submitted 10 January, 2021;
originally announced January 2021.
-
Scalable GWR: A linear-time algorithm for large-scale geographically weighted regression with polynomial kernels
Authors:
Daisuke Murakami,
Narumasa Tsutsumida,
Takahiro Yoshida,
Tomoki Nakaya,
Binbin Lu
Abstract:
Although a number of studies have developed fast geographically weighted regression (GWR) algorithms for large samples, none of them has achieved linear-time estimation, which is considered a requisite for big data analysis in machine learning, geostatistics, and related domains. Against this backdrop, this study proposes a scalable GWR (ScaGWR) for large datasets. The key improvement is the calibration of the model through a pre-compression of the matrices and vectors whose size depends on the sample size, prior to the leave-one-out cross-validation, which is the heaviest computational step in conventional GWR. This pre-compression allows us to run the proposed GWR extension so that its computation time increases linearly with the sample size. With this improvement, the ScaGWR can be calibrated with one million observations without parallelization. Moreover, the ScaGWR estimator can be regarded as an empirical Bayesian estimator that is more stable than the conventional GWR estimator. We compare the ScaGWR with the conventional GWR in terms of estimation accuracy and computational efficiency using a Monte Carlo simulation. Then, we apply these methods to a US income analysis. The code for ScaGWR is available in the R package scgwr. The code is embedded into C++ code and implemented in another R package, GWmodel.
Submitted 23 April, 2020; v1 submitted 1 May, 2019;
originally announced May 2019.
-
Investigating Spatial Error Structures in Continuous Raster Data
Authors:
Narumasa Tsutsumida,
Pedro Rodríguez-Veiga,
Paul Harris,
Heiko Balzter,
Alexis Comber
Abstract:
The objective of this study is to investigate spatial structures of error in the assessment of continuous raster data. The use of conventional diagnostics of error often overlooks the possible spatial variation in error because such diagnostics report only average error or deviation between predicted and reference values. In this respect, this work uses a moving window (kernel) approach to generate geographically weighted (GW) versions of the mean signed deviation, the mean absolute error and the root mean squared error and to quantify their spatial variations. Such an approach computes local error diagnostics from data weighted by their distance to the centre of a moving kernel and allows spatial surfaces of each type of error to be mapped. In addition, a GW correlation analysis between predicted and reference values provides an alternative view of local error. Full abstract can be found in the pdf.
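The moving-kernel idea described above can be sketched for a single kernel centre as follows. This is a minimal example under stated assumptions, not the authors' implementation: the function name `gw_error_diagnostics`, the Gaussian kernel, and the fixed `bandwidth` are hypothetical choices, as the abstract does not name the kernel used.

```python
import numpy as np

def gw_error_diagnostics(coords, predicted, reference, centre, bandwidth):
    """Geographically weighted MSD, MAE and RMSE at one kernel centre.

    Hypothetical helper for illustration; a Gaussian kernel with a fixed
    bandwidth is assumed here.
    """
    coords = np.asarray(coords, dtype=float)
    # Signed error between predicted and reference values.
    err = np.asarray(predicted, dtype=float) - np.asarray(reference, dtype=float)

    # Weight each observation by its distance to the kernel centre.
    dist = np.linalg.norm(coords - np.asarray(centre, dtype=float), axis=1)
    w = np.exp(-0.5 * (dist / bandwidth) ** 2)
    w = w / w.sum()

    return {
        "msd": float(np.sum(w * err)),                 # mean signed deviation
        "mae": float(np.sum(w * np.abs(err))),         # mean absolute error
        "rmse": float(np.sqrt(np.sum(w * err ** 2))),  # root mean squared error
    }
```

Calling the function with the kernel centred on each raster cell in turn would produce one surface per diagnostic, which is the kind of map of local error the abstract describes.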
Submitted 30 September, 2018;
originally announced October 2018.