-
A copula-based boosting model for time-to-event prediction with dependent censoring
Authors:
Alise Danielle Midtfjord,
Riccardo De Bin,
Arne Bang Huseby
Abstract:
A characteristic feature of time-to-event data analysis is possible censoring of the event time. Most statistical learning methods for handling censored data are limited by the assumption of independent censoring, even though this can lead to biased predictions when the assumption does not hold. This paper introduces Clayton-boost, a boosting approach built upon the accelerated failure time model, which uses a Clayton copula to handle the dependency between the event and censoring distributions. By taking advantage of a copula, the independent censoring assumption is no longer needed. In comparisons with commonly used methods, Clayton-boost shows a strong ability to remove prediction bias in the presence of dependent censoring and outperforms the competing methods when either the dependency strength or the percentage of censoring is considerable. The encouraging performance of Clayton-boost shows that there are indeed reasons to be critical of the independent censoring assumption, and that real-world data could benefit greatly from modelling the potential dependency.
Submitted 26 October, 2022; v1 submitted 10 October, 2022;
originally announced October 2022.
-
A Decision Support System for Safer Airplane Landings: Predicting Runway Conditions Using XGBoost and Explainable AI
Authors:
Alise Danielle Midtfjord,
Riccardo De Bin,
Arne Bang Huseby
Abstract:
The presence of snow and ice on runway surfaces reduces the available tire-pavement friction needed for retardation and directional control and causes potential economic and safety threats for the aviation industry during the winter seasons. To activate appropriate safety procedures, pilots need accurate and timely information on the actual runway surface conditions. In this study, XGBoost is used to create a combined runway assessment system, which includes a classification model to identify slippery conditions and a regression model to predict the level of slipperiness. The models are trained on weather data and runway reports. The runway surface conditions are represented by the tire-pavement friction coefficient, which is estimated from flight sensor data from landing aircraft. The XGBoost models are combined with SHAP approximations to provide a reliable decision support system for airport operators, which can contribute to safer and more economical operation of airport runways. To evaluate the performance of the prediction models, they are compared to several state-of-the-art runway assessment methods. The XGBoost models identify slippery runway conditions with a ROC AUC of 0.95, predict the friction coefficient with a MAE of 0.0254, and outperform all the previous methods. The results show the strong ability of machine learning methods to model complex physical phenomena with good accuracy. Published version: https://doi.org/10.1016/j.coldregions.2022.103556.
Submitted 29 September, 2022; v1 submitted 1 July, 2021;
originally announced July 2021.
-
Modelling extreme claims via composite models and threshold selection methods
Authors:
Yinzhi Wang,
Ingrid Hobæk Haff,
Arne Huseby
Abstract:
The existence of large and extreme claims in a non-life insurance portfolio influences the ability of (re)insurers to estimate the reserve. The excess-over-threshold method provides a way to capture and model the typical behaviour of insurance claim data. This paper discusses several composite models with commonly used bulk distributions, combined with a 2-parameter Pareto distribution above the threshold. We have explored how several threshold selection methods perform when estimating the reserve, as well as the effect of the choice of bulk distribution, with varying sample size and tail properties. To investigate this, a simulation study has been performed. Our study shows that when data are sufficient, the square root rule has the overall best performance in terms of the quality of the reserve estimate. The second best is the exponentiality test, especially when the right tail of the data is extreme. As the sample size becomes small, simultaneous estimation has the best performance. Further, the influence of the choice of bulk distribution seems to be rather large, especially when the distribution is heavy-tailed. Moreover, the study shows that the empirical estimate of $p_{\leq b}$, the probability that a claim is below the threshold, is more robust than the theoretical one.
Submitted 6 November, 2019;
originally announced November 2019.
-
Buffered environmental contours
Authors:
Kristina Rognlien Dahl,
Arne Bang Huseby
Abstract:
The main idea of this paper is to use the notion of buffered failure probability from probabilistic structural design to introduce buffered environmental contours. Classical environmental contours are used in structural design in order to obtain upper bounds on the failure probabilities of a large class of designs. The purpose of buffered failure probabilities is the same. However, in contrast to classical environmental contours, this new concept does not just take into account failure vs. functioning, but also to what extent the system is failing. For example, this is relevant when considering the risk of flooding: we are not just interested in knowing whether a river has flooded. The damage caused by the flooding depends greatly on how much the water has risen above the standard level.
Submitted 28 February, 2019;
originally announced March 2019.
-
On environmental contours for marine and coastal design
Authors:
Emma Ross,
Ole Christian Astrup,
Elzbieta Bitner-Gregersen,
Nigel Bunn,
Graham Feld,
Ben Gouldby,
Arne Huseby,
Ye Liu,
David Randell,
Erik Vanem,
Philip Jonathan
Abstract:
Environmental contours are used in structural reliability analysis of marine and coastal structures as an approximate means to locate the boundary of the distribution of environmental variables, and hence sets of environmental conditions giving rise to extreme structural loads and responses. Outline guidance concerning the application of environmental contour methods is given in recent design guidelines from many organisations. However, there is a lack of clarity concerning the differences between approaches to environmental contour estimation reported in the literature, and regarding the relationship between the environmental contour, corresponding to some return period, and the extreme structural response for the same period. Hence there is uncertainty about precisely when environmental contours should be used, and how they should be used well. This article seeks to provide some assistance in understanding the fundamental issues regarding environmental contours and their use in structural reliability analysis. Approaches to estimating the joint distribution of environmental variables, and to estimating environmental contours based on that distribution, are described. Simple software for estimation of the joint distribution, and hence environmental contours, is illustrated (and is freely available from the authors). Extra assumptions required to relate the characteristics of an environmental contour to structural failure are outlined. Alternative response-based methods not requiring environmental contours are summarised. The results of an informal survey of the metocean user community regarding environmental contours are presented. Finally, recommendations about when and how environmental contour methods should be used are made.
Submitted 19 December, 2018;
originally announced December 2018.