Showing 1–27 of 27 results for author: Furrer, R

  1. arXiv:2405.14492

    stat.ME cs.LG stat.ML

    Iterative Methods for Full-Scale Gaussian Process Approximations for Large Spatial Data

    Authors: Tim Gyger, Reinhard Furrer, Fabio Sigrist

    Abstract: Gaussian processes are flexible probabilistic regression models which are widely used in statistics and machine learning. However, a drawback is their limited scalability to large data sets. To alleviate this, we consider full-scale approximations (FSAs) that combine predictive process methods and covariance tapering, thus approximating both global and local structures. We show how iterative metho…

    Submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2203.03322

    stat.ME stat.AP

    Dominant-feature identification in data from Gaussian processes applied to Finnish forest inventory records

    Authors: Roman Flury, Tuomas Aakala, Leena Ruha, Timo Kuuluvainen, Reinhard Furrer

    Abstract: In spatial data, location-dependent variation leads to connected structures known as features. Variations occur at different spatial scales and possibly originate from distinct underlying processes. Each of these scales is characterized by its own dominant features. Here we introduce a statistical method for identifying these scales and their dominant features in data from Gaussian processes. This…

    Submitted 7 March, 2022; originally announced March 2022.

    Comments: 28 pages, 9 figures

  3. Discussion on Competition for Spatial Statistics for Large Datasets

    Authors: Roman Flury, Reinhard Furrer

    Abstract: We discuss the experiences and results of the AppStatUZH team's participation in the comprehensive and unbiased comparison of different spatial approximations conducted in the Competition for Spatial Statistics for Large Datasets. In each of the different sub-competitions, we estimated parameters of the covariance model based on a likelihood function and predicted missing observations with simple…

    Submitted 19 June, 2021; originally announced June 2021.

    Comments: 5 pages, 1 figure

  4. arXiv:2106.02364

    stat.CO stat.ME

    varycoef: An R Package for Gaussian Process-based Spatially Varying Coefficient Models

    Authors: Jakob A. Dambon, Fabio Sigrist, Reinhard Furrer

    Abstract: Gaussian processes (GPs) are well-known tools for modeling dependent data with applications in spatial statistics, time series analysis, or econometrics. In this article, we present the R package varycoef that implements estimation, prediction, and variable selection of linear models with spatially varying coefficients (SVC) defined by GPs, so-called GP-based SVC models. Such models offer a high d…

    Submitted 4 June, 2021; originally announced June 2021.

  5. arXiv:2101.01932

    stat.ME

    Joint Variable Selection of both Fixed and Random Effects for Gaussian Process-based Spatially Varying Coefficient Models

    Authors: Jakob A. Dambon, Fabio Sigrist, Reinhard Furrer

    Abstract: Spatially varying coefficient (SVC) models are a type of regression model for spatial data where covariate effects vary over space. If there are several covariates, a natural question is which covariates have a spatially varying effect and which do not. We present a new variable selection approach for Gaussian process-based SVC models. It relies on a penalized maximum likelihood estimation (PMLE) and…

    Submitted 11 February, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

    Comments: 26 pages including appendix. Containing 6 figures and 6 tables. Updated Declarations

  6. arXiv:2010.00534

    stat.AP

    Bayesian spatial modelling of terrestrial radiation in Switzerland

    Authors: Christophe L. Folly, Garyfallos Konstantinoudis, Antonella Mazzei-Abba, Christian Kreis, Benno Bucher, Reinhard Furrer, Ben D. Spycher

    Abstract: The geographic variation of terrestrial radiation can be exploited in epidemiological studies of the health effects of protracted low-dose exposure. Various methods have been applied to derive maps of this variation. We aimed to construct a map of terrestrial radiation for Switzerland. We used airborne γ-spectrometry measurements to model the ambient dose rates from terrestrial radiation through…

    Submitted 1 October, 2020; originally announced October 2020.

    Comments: 27 pages, 10 figures

  7. Identification of Dominant Features in Spatial Data

    Authors: Roman Flury, Florian Gerber, Bernhard Schmid, Reinhard Furrer

    Abstract: Dominant features of spatial data are connected structures or patterns that emerge from location-based variation and manifest at specific scales or resolutions. To identify dominant features, we propose a sequential application of multiresolution decomposition and variogram function estimation. Multiresolution decomposition separates data into additive components, and in this way enables the recog…

    Submitted 18 November, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

    Comments: 25 pages, 14 figures

  8. Multiresolution Decomposition of Areal Count Data

    Authors: Roman Flury, Reinhard Furrer

    Abstract: Multiresolution decomposition is commonly understood as a procedure to capture scale-dependent features in random signals. Such methods were first established for image processing and typically rely on raster or regularly gridded data. In this article, we extend a particular multiresolution decomposition procedure to areal count data, i.e. discrete irregularly gridded data. More specifically, we i…

    Submitted 29 May, 2020; originally announced May 2020.

    Comments: 4 pages, 3 figures, GRASPA 2019 conference proceeding

    Journal ref: Proceedings of the GRASPA 2019 Conference, Pescara, 15-16 July 2019

  9. Maximum Likelihood Estimation of Spatially Varying Coefficient Models for Large Data with an Application to Real Estate Price Prediction

    Authors: Jakob A. Dambon, Fabio Sigrist, Reinhard Furrer

    Abstract: In regression models for spatial data, it is often assumed that the marginal effects of covariates on the response are constant over space. In practice, this assumption might often be questionable. In this article, we show how a Gaussian process-based spatially varying coefficient (SVC) model can be estimated using maximum likelihood estimation (MLE). In addition, we present an approach that scale…

    Submitted 12 November, 2020; v1 submitted 22 January, 2020; originally announced January 2020.

    Comments: revision: 35 pages, 14 figures, typo in likelihood corrected, DOI added

  10. arXiv:1911.09006

    stat.ML cs.LG stat.ME

    Additive Bayesian Network Modelling with the R Package abn

    Authors: Gilles Kratzer, Fraser Iain Lewis, Arianna Comin, Marta Pittavino, Reinhard Furrer

    Abstract: The R package abn is designed to fit additive Bayesian models to observational datasets. It contains routines to score Bayesian networks based on Bayesian or information theoretic formulations of generalized linear models. It is equipped with exact search and greedy search algorithms to select the best network. It supports a possible blend of continuous, discrete and count data and input of prior…

    Submitted 20 November, 2019; originally announced November 2019.

    Comments: 37 pages, 14 figures and 2 tables

  11. arXiv:1906.00364

    stat.ME

    Combining Heterogeneous Spatial Datasets with Process-based Spatial Fusion Models: A Unifying Framework

    Authors: Craig Wang, Reinhard Furrer

    Abstract: In modern spatial statistics, the structure of data that is collected has become more heterogeneous. Depending on the type of spatial data, different modeling strategies for spatial data are used. For example, a kriging approach for geostatistical data; a Gaussian Markov random field model for lattice data; or a log Gaussian Cox process for point-pattern data. Despite these different modeling choi…

    Submitted 2 June, 2019; originally announced June 2019.

    Comments: 33 pages, 5 figures

  12. arXiv:1902.06641

    stat.CO stat.ML

    Is a single unique Bayesian network enough to accurately represent your data?

    Authors: Gilles Kratzer, Reinhard Furrer

    Abstract: Bayesian network (BN) modelling is extensively used in systems epidemiology. Usually it consists in selecting and reporting the best-fitting structure conditional on the data. A major practical concern is avoiding overfitting, on account of its extreme flexibility and its modelling richness. Many approaches have been proposed to control for overfitting. Unfortunately, they essentially all rely on…

    Submitted 18 February, 2019; originally announced February 2019.

    Comments: 2 pages, 3 figures

  13. arXiv:1809.06636

    stat.ME stat.AP stat.ML

    Comparison between Suitable Priors for Additive Bayesian Networks

    Authors: Gilles Kratzer, Reinhard Furrer, Marta Pittavino

    Abstract: Additive Bayesian networks are types of graphical models that extend the usual Bayesian generalized linear model to multiple dependent variables through the factorisation of the joint probability distribution of the underlying variables. When fitting an ABN model, the choice of the prior of the parameters is of crucial importance. If an inadequate prior - like a too weakly informative one - is use…

    Submitted 18 September, 2018; originally announced September 2018.

    Comments: 8 pages, 4 figures

  14. arXiv:1808.01126

    stat.ML cs.LG stat.CO

    Information-Theoretic Scoring Rules to Learn Additive Bayesian Network Applied to Epidemiology

    Authors: Gilles Kratzer, Reinhard Furrer

    Abstract: Bayesian network modelling is a well-adapted approach to study messy and highly correlated datasets which are very common in, e.g., systems epidemiology. A popular approach to learn a Bayesian network from an observational dataset is to identify the maximum a posteriori network in a search-and-score approach. Many scores have been proposed, both Bayesian and frequentist based. In an applied perspec…

    Submitted 3 August, 2018; originally announced August 2018.

    Comments: 16 pages, 3 figures

  15. arXiv:1804.11224

    stat.CO stat.AP

    EggCounts: a Bayesian hierarchical toolkit to model faecal egg count reductions

    Authors: Craig Wang, Reinhard Furrer

    Abstract: This is a vignette for the R package eggCounts version 2.0. The package implements a suite of Bayesian hierarchical models dealing with faecal egg count reductions. The models are designed for a variety of practical situations, including individual treatment efficacy, zero inflation, small sample size (less than 10) and potential outliers. The functions are intuitive to use and their output are ea…

    Submitted 3 February, 2022; v1 submitted 30 April, 2018; originally announced April 2018.

    Comments: 13 pages, 3 figures

  16. arXiv:1804.11058

    stat.CO

    optimParallel: an R Package Providing Parallel Versions of the Gradient-Based Optimization Methods of optim()

    Authors: Florian Gerber, Reinhard Furrer

    Abstract: The R package optimParallel provides a parallel version of the gradient-based optimization methods of optim(). The main function of the package is optimParallel(), which has the same usage and output as optim(). Using optimParallel() can significantly reduce optimization times. We introduce the R package and illustrate its implementation, which takes advantage of the lexical scoping mechanism of R…

    Submitted 30 April, 2018; originally announced April 2018.

  17. arXiv:1804.07134

    stat.ML cs.LG

    varrank: an R package for variable ranking based on mutual information with applications to observed systemic datasets

    Authors: Gilles Kratzer, Reinhard Furrer

    Abstract: This article describes the R package varrank. It has a flexible implementation of heuristic approaches which perform variable ranking based on mutual information. The package is particularly suitable for exploring multivariate datasets requiring a holistic analysis. The core functionality is a general implementation of the minimum redundancy maximum relevance (mRMRe) model. This approach is based…

    Submitted 19 April, 2018; originally announced April 2018.

    Comments: 18 pages, 4 figures

  18. arXiv:1710.05013

    stat.ME

    A Case Study Competition Among Methods for Analyzing Large Spatial Data

    Authors: Matthew J. Heaton, Abhirup Datta, Andrew Finley, Reinhard Furrer, Rajarshi Guhaniyogi, Florian Gerber, Robert B. Gramacy, Dorit Hammerling, Matthias Katzfuss, Finn Lindgren, Douglas W. Nychka, Furong Sun, Andrew Zammit-Mangion

    Abstract: The Gaussian process is an indispensable tool for spatial data analysts. The onset of the "big data" era, however, has led to the traditional Gaussian process being computationally infeasible for modern spatial data. As such, various alternatives to the full Gaussian process that are more amenable to handling big spatial data have been proposed. These modern methods often exploit low rank structu…

    Submitted 25 April, 2018; v1 submitted 13 October, 2017; originally announced October 2017.

  19. dotCall64: An Efficient Interface to Compiled C/C++ and Fortran Code Supporting Long Vectors

    Authors: Florian Gerber, Kaspar Mösinger, Reinhard Furrer

    Abstract: The R functions .C() and .Fortran() can be used to call compiled C/C++ and Fortran code from R. This so-called foreign function interface is convenient, since it does not require any interactions with the C API of R. However, it does not support long vectors (i.e., vectors of more than 2^31 elements). To overcome this limitation, the R package dotCall64 provides .C64(), which can be used to call c…

    Submitted 27 February, 2017; originally announced February 2017.

    Comments: 17 pages

    Journal ref: SoftwareX, 7, 217-221, 2018

  20. Predicting missing values in spatio-temporal satellite data

    Authors: Florian Gerber, Reinhard Furrer, Gabriela Schaepman-Strub, Rogier de Jong, Michael E. Schaepman

    Abstract: Remotely sensed data are sparse, which means that data have missing values, for instance due to cloud cover. This is problematic for applications and signal processing algorithms that require complete data sets. To address the sparse data issue, we present a new gap-fill algorithm. The proposed method predicts each missing value separately based on data points in a spatio-temporal neighborhood aro…

    Submitted 3 May, 2016; originally announced May 2016.

    Comments: 35 pages

    Journal ref: IEEE Transactions on Geoscience and Remote Sensing, Volume 55, Issue 5, 2841-2853, 2018

  21. arXiv:1604.05478

    stat.CO stat.ME

    Valid parameter space of a bivariate Gaussian Markov random field with a generalized block-Toeplitz precision matrix

    Authors: Mattia Molinaro, Reinhard Furrer

    Abstract: Gaussian Markov random fields (GMRFs) are extensively used in statistics to model area-based data and usually depend on several parameters in order to capture complex spatial correlations. In this context, it is important to determine the valid parameter space, namely the domain ensuring (semi) positive-definiteness of the precision matrix. Depending on the structure of the latter, this task can b…

    Submitted 19 April, 2016; originally announced April 2016.

    MSC Class: 15A18; 62M30

  22. arXiv:1401.2642

    stat.AP

    Hierarchical modelling of faecal egg counts to assess anthelmintic efficacy

    Authors: Michaela Paul, Paul R. Torgerson, Johan Höglund, Reinhard Furrer

    Abstract: Counting the number of parasite eggs in faecal samples is a widely used diagnostic method to evaluate parasite burden. Typically a sub-sample of the diluted faeces is examined for eggs. The resulting egg counts are multiplied by a specific correction factor to estimate the mean parasite burden. To detect anthelmintic resistance, the mean parasite burden from treated and untreated animals are compa…

    Submitted 12 January, 2014; originally announced January 2014.

    Comments: 14 pages, 7 figures, 1 table

  23. arXiv:1303.3390

    stat.ME

    Conjugate distributions in hierarchical Bayesian ANOVA for computational efficiency and assessments of both practical and statistical significance

    Authors: Steven Geinitz, Reinhard Furrer

    Abstract: Assessing variability according to distinct factors in data is a fundamental technique of statistics. The method commonly referred to as analysis of variance (ANOVA) is, however, typically confined to the case where all levels of a factor are present in the data (i.e. the population of factor levels has been exhausted). Random and mixed effects models are used for more elaborate cases, but require…

    Submitted 14 March, 2013; originally announced March 2013.

    Comments: 24 pages

  24. arXiv:1302.4659

    stat.AP

    Spatial Backfitting of Roller Measurement Values from a Florida Test Bed

    Authors: Daniel K. Heersink, Reinhard Furrer, Mike A. Mooney

    Abstract: Modern earthwork compaction rollers collect location and compaction information as they traverse a compaction site. These data are indirectly observed through non-linear measurement operators, inherently multivariate with complex correlation structures, and collected in huge quantities. The nature of such data was investigated at a large, atypically compacted test bed in Florida, USA. Exploratory…

    Submitted 20 February, 2013; v1 submitted 19 February, 2013; originally announced February 2013.

    Comments: 14 pages, 6 figures

  25. arXiv:1302.4631

    stat.AP

    Intelligent Compaction and Quality Assurance of Roller Measurement Values utilizing Backfitting and Multiresolution Scale Space Analysis

    Authors: Daniel K. Heersink, Reinhard Furrer, Mike A. Mooney

    Abstract: Modern earthwork compaction rollers collect location and compaction information as they traverse a compaction site. These roller measurement values present a challenging spatio-temporal statistical problem that requires careful implementation of a proper stochastic model and estimation procedure. Heersink and Furrer (2013) proposed a sequential, spatial mixed-effects model and a sequential, spatia…

    Submitted 20 March, 2013; v1 submitted 19 February, 2013; originally announced February 2013.

    Comments: 11 pages, 4 figures

  26. arXiv:1207.2338

    stat.ME

    MMANOVA: A general multilevel framework for multivariate analysis of variance

    Authors: Steven Geinitz, Reinhard Furrer, Stephan R. Sain

    Abstract: Classical analysis of variance requires that model terms be labeled as fixed or random and typically culminates by comparing variability from each batch (factor) to variability from errors; without a standard methodology to assess the magnitude of a batch's variability, to compare variability between batches, nor to consider the uncertainty in this assessment. In this paper we support recent work,…

    Submitted 15 July, 2012; v1 submitted 10 July, 2012; originally announced July 2012.

  27. A spatial analysis of multivariate output from regional climate models

    Authors: Stephan R. Sain, Reinhard Furrer, Noel Cressie

    Abstract: Climate models have become an important tool in the study of climate and climate change, and ensemble experiments consisting of multiple climate-model runs are used in studying and quantifying the uncertainty in climate-model output. However, there are often only a limited number of model runs available for a particular experiment, and one of the statistical challenges is to characterize the distr…

    Submitted 14 April, 2011; originally announced April 2011.

    Comments: Published at http://dx.doi.org/10.1214/10-AOAS369 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    Report number: IMS-AOAS-AOAS369

    Journal ref: Annals of Applied Statistics 2011, Vol. 5, No. 1, 150-175