-
A joint estimation approach for monotonic regression functions in general dimensions
Authors:
Christian Rohrbeck,
Deborah A Costain
Abstract:
Regression analysis under the assumption of monotonicity is a well-studied statistical problem and has been used in a wide range of applications. However, there remains a lack of a broadly applicable methodology that permits information borrowing, for efficiency gains, when jointly estimating multiple monotonic regression functions. We introduce such a methodology by extending the isotonic regression problem presented in the article "The isotonic regression problem and its dual" (Barlow and Brunk, 1972). The presented approach can be applied to both fixed and random designs and any number of explanatory variables (regressors). Our framework penalizes pairwise differences in the values (levels) of the monotonic function estimates, with the weight of penalty being determined based on a statistical test, which results in information being shared across data sets if similarities in the regression functions exist. Function estimates are subsequently derived using an iterative optimization routine that uses existing solution algorithms for the isotonic regression problem. Simulation studies for normally and binomially distributed response data illustrate that function estimates are consistently improved if similarities between functions exist, and are not oversmoothed otherwise. We further apply our methodology to analyse two public health data sets: neonatal mortality data for Porto Alegre, Brazil, and stroke patient data for North West England.
Submitted 28 May, 2023;
originally announced May 2023.
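The abstract above builds on the classical isotonic regression problem. As a rough illustration of that underlying single-function problem only (not the joint, penalized estimator the paper proposes), the pool-adjacent-violators idea can be sketched as follows. The function name `pava` and the example data are assumptions made for this sketch.

```python
def pava(y, w=None):
    """Least-squares non-decreasing fit to y (pool-adjacent-violators).

    Illustrative sketch of single-function isotonic regression; the
    joint estimation method in the paper is not implemented here.
    """
    n = len(y)
    if w is None:
        w = [1.0] * n
    # Each block stores [weighted mean, total weight, number of points].
    blocks = []
    for yi, wi in zip(y, w):
        blocks.append([float(yi), float(wi), 1])
        # Merge backwards while adjacent block means violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            wt = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / wt, wt, c1 + c2])
    # Expand block means back to one fitted value per observation.
    fit = []
    for mean, _, count in blocks:
        fit.extend([mean] * count)
    return fit


# Hypothetical responses, ordered by a single regressor.
fitted = pava([1.0, 3.0, 2.0, 4.0, 3.5])
print(fitted)
```

Adjacent observations that violate the ordering are replaced by their weighted mean, so the fit is a non-decreasing step function.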
-
Simulating flood event sets using extremal principal components
Authors:
Christian Rohrbeck,
Daniel Cooley
Abstract:
Hazard event sets, a collection of synthetic extreme events over a given period, are important for catastrophe modelling. This paper addresses the issue of generating event sets of extreme river flow for northern England and southern Scotland, a region which has been particularly affected by severe flooding over the past 20 years. We start by analysing historical extreme river flow across 45 gauges, located within the study region, using methods from extreme value analysis, including the concept of extremal principal components. Our analysis reveals interesting connections between the extremal dependence structure and the region's topography/climate. We then introduce a framework which is based on modelling the distribution of the extremal principal components in order to generate synthetic events of extreme river flow. The generative framework is dimension-reducing in that it distinctly handles the principal components based on their contribution to describing the nature of extreme river flow across the study region. We also detail a data-driven approach to select the optimal dimension. Synthetic flood events are subsequently generated efficiently by sampling from the fitted distribution. Our approach for generating hazard event sets can be easily implemented by practitioners and our results indicate good agreement between the observed and simulated extreme river flow dynamics. For the considered application, we also find that our approach outperforms existing statistical approaches for generating hazard event sets.
Submitted 16 March, 2022; v1 submitted 1 June, 2021;
originally announced June 2021.
-
Bayesian non-parametric ordinal regression under a monotonicity constraint
Authors:
Olli Saarela,
Christian Rohrbeck,
Elja Arjas
Abstract:
Compared to the nominal scale, the ordinal scale for a categorical outcome variable has the property of making a monotonicity assumption for the covariate effects meaningful. This assumption is encoded in the commonly used proportional odds model, but there it is combined with other parametric assumptions such as linearity and additivity. Herein, the considered models are non-parametric and the only condition imposed is that the effects of the covariates on the outcome categories are stochastically monotone according to the ordinal scale. We are not aware of the existence of other comparable multivariable models that would be suitable for inference purposes. We generalize our previously proposed Bayesian monotonic multivariable regression model to ordinal outcomes, and propose an estimation procedure based on reversible jump Markov chain Monte Carlo. The model is based on a marked point process construction, which allows it to approximate arbitrary monotonic regression function shapes, and has a built-in covariate selection property. We study the performance of the proposed approach through extensive simulation studies, and demonstrate its practical application in two real data examples.
Submitted 11 February, 2022; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Bayesian spatial clustering of extremal behaviour for hydrological variables
Authors:
Christian Rohrbeck,
Jonathan A Tawn
Abstract:
To address the need for efficient inference for a range of hydrological extreme value problems, spatial pooling of information is the standard approach for marginal tail estimation. We propose the first extreme value spatial clustering methods which account for both the similarity of the marginal tails and the spatial dependence structure of the data to determine the appropriate level of pooling. Spatial dependence is incorporated in two ways: to determine the cluster selection and to account for dependence of the data over sites within a cluster when making the marginal inference. We introduce a statistical model for the pairwise extremal dependence which incorporates distance between sites, and accommodates our belief that sites within the same cluster tend to exhibit a higher degree of dependence than sites in different clusters. We use a Bayesian framework which learns about both the number of clusters and their spatial structure, and that enables the inference of site-specific marginal distributions of extremes to incorporate uncertainty in the clustering allocation. The approach is illustrated using simulations, the analysis of daily precipitation levels in Norway and daily river flow levels in the UK.
Submitted 20 June, 2019;
originally announced June 2019.
-
Bayesian Spatial Monotonic Multiple Regression
Authors:
Christian Rohrbeck,
Deborah Costain,
Arnoldo Frigessi
Abstract:
We consider monotonic, multiple regression for a set of contiguous regions (lattice data). The regression functions are permitted to vary between regions and exhibit geographical structure. We develop a new Bayesian non-parametric methodology which allows for both continuous and discontinuous functional shapes, with estimation based on marked point processes and reversible jump Markov chain Monte Carlo techniques. Geographical dependency is incorporated by a flexible prior distribution; the parametrisation allows the dependency to vary with functional level. The approach is tuned using Bayesian global optimization and cross-validation. Estimates enable variable selection, threshold detection and prediction as well as the extrapolation of the regression function. The performance and flexibility of our approach are illustrated by simulation studies and an application to a Norwegian insurance data set.
Submitted 19 May, 2016;
originally announced May 2016.