Parametric estimation of conditional Archimedean copula generators for censored data

Authors: Marie Michaelides, Hélène Cossette, Mathieu Pigeon

Abstract: In this paper, we propose a novel approach for estimating Archimedean copula generators in a conditional setting, incorporating endogenous variables. Our method allows for the evaluation of the impact of the different levels of covariates on both the strength and shape of dependence by directly estimating the generator function rather than the copula itself. As such, we contribute to relaxing the simplifying assumption inherent in traditional copula modeling. We demonstrate the effectiveness of our methodology through applications in two diverse settings: a diabetic retinopathy study and a claims reserving analysis. In both cases, we show how considering the influence of covariates enables a more accurate capture of the underlying dependence structure in the data, thus enhancing the applicability of copula models, particularly in actuarial contexts.

Submitted 10 April, 2024; originally announced April 2024.

arXiv:2401.07724 [pdf, other]

A non-parametric estimator for Archimedean copulas under flexible censoring scenarios and an application to claims reserving

Authors: Marie Michaelides, Hélène Cossette, Mathieu Pigeon

Abstract: With insurers benefiting from ever-larger amounts of data of increasing complexity, we explore a data-driven method to model dependence within multilevel claims in this paper. More specifically, we start from a non-parametric estimator for Archimedean copula generators introduced by Genest and Rivest (1993), and we extend it to diverse flexible censoring scenarios using techniques derived from survival analysis. We implement a graphical selection procedure for copulas that we validate using goodness-of-fit methods applied to complete, single-censored, and double-censored bivariate data. We illustrate the performance of our model with multiple simulation studies. We then apply our methodology to a recent Canadian automobile insurance dataset where we seek to model the dependence between the activation delays of correlated coverages. We show that our model performs quite well in selecting the best-fitted copula for the data at hand, especially when the dataset is large, and that the results can then be used as part of a larger claims reserving methodology.

Submitted 15 January, 2024; originally announced January 2024.

arXiv:2308.01729 [pdf, other]

Telematics Combined Actuarial Neural Networks for Cross-Sectional and Longitudinal Claim Count Data

Authors: Francis Duval, Jean-Philippe Boucher, Mathieu Pigeon

Abstract: We present novel cross-sectional and longitudinal claim count models for vehicle insurance built upon the Combined Actuarial Neural Network (CANN) framework proposed by Mario Wüthrich and Michael Merz. The CANN approach combines a classical actuarial model, such as a generalized linear model, with a neural network. This blending of models results in a two-component model comprising a classical regression model and a neural network part. The CANN model leverages the strengths of both components, providing a solid foundation and interpretability from the classical model while harnessing the flexibility and capacity to capture intricate relationships and interactions offered by the neural network. In our proposed models, we use well-known log-linear claim count regression models for the classical regression part and a multilayer perceptron (MLP) for the neural network part. The MLP part is used to process telematics car driving data given as a vector characterizing the driving behavior of each insured driver. In addition to the Poisson and negative binomial distributions for cross-sectional data, we propose a procedure for training our CANN model with a multivariate negative binomial (MVNB) specification. By doing so, we introduce a longitudinal model that accounts for the dependence between contracts from the same insured. Our results reveal that the CANN models exhibit superior performance compared to log-linear models that rely on manually engineered telematics features.

Submitted 3 December, 2023; v1 submitted 3 August, 2023; originally announced August 2023.

Comments: 30 pages, 10 tables, 6 figures

arXiv:2209.11763 [pdf, other]

Enhancing Claim Classification with Feature Extraction from Anomaly-Detection-Derived Routine and Peculiarity Profiles

Authors: Francis Duval, Jean-Philippe Boucher, Mathieu Pigeon

Abstract: Usage-based insurance is becoming the new standard in vehicle insurance; it is therefore relevant to find efficient ways of using insureds' driving data. Applying anomaly detection to vehicles' trip summaries, we develop a method that derives a "routine" and a "peculiarity" anomaly profile for each vehicle. To this end, anomaly detection algorithms are used to compute a routine and a peculiarity anomaly score for each trip a vehicle makes. The former measures the anomaly degree of the trip compared to the other trips made by the concerned vehicle, while the latter measures its anomaly degree compared to trips made by any vehicle. The resulting anomaly score vectors are used as routine and peculiarity profiles. Features are then extracted from these profiles, for which we investigate the predictive power in the claim classification framework. Using real data, we find that features extracted from the vehicles' peculiarity profile improve classification.

Submitted 26 September, 2022; originally announced September 2022.

Comments: 26 pages, 10 figures, 9 tables

arXiv:2208.08430 [pdf, other]

doi 10.1007/s13385-023-00355-3

Individual Claims Reserving using Activation Patterns

Authors: Marie Michaelides, Mathieu Pigeon, Hélène Cossette

Abstract: The occurrence of a claim often impacts not one but multiple insurance coverages provided in the contract. To account for this multivariate feature, we propose a new individual claims reserving model built around the activation of the different coverages to predict the reserve amounts. Using the framework of multinomial logistic regression, we model the activation of the different insurance coverages for each claim and their development in the following years, i.e. the activation of other coverages in the later years and all the possible payments that might result from them. As such, the model allows us to complete the individual development of the open claims in the portfolio. Using a recent automobile dataset from a major Canadian insurance company, we demonstrate that this approach generates accurate predictions of the total reserves as well as of the reserves per insurance coverage. This allows the insurer to get better insights into the dynamics of its claims reserves.

Submitted 15 August, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

Comments: European Actuarial Journal (2023)

arXiv:2105.14055 [pdf, other]

How much telematics information do insurers need for claim classification?

Authors: Francis Duval, Jean-Philippe Boucher, Mathieu Pigeon

Abstract: It has been shown several times in the literature that telematics data collected in motor insurance help to better understand an insured's driving risk. Insurers that use this data reap several benefits, such as a better estimate of the pure premium, more segmented pricing and less adverse selection. The flip side of the coin is that collected telematics information is often sensitive and can therefore compromise policyholders' privacy. Moreover, due to its large volume, this type of data is costly to store and hard to manipulate. These factors, combined with the fact that insurance regulators tend to issue more and more recommendations regarding the collection and use of telematics data, make it important for an insurer to determine the right amount of telematics information to collect. In addition to traditional contract information such as the age and gender of the insured, we have access to a telematics dataset where information is summarized by trip. We first derive several features of interest from these trip summaries before building a claim classification model using both traditional and telematics features. By comparing a few classification algorithms, we find that logistic regression with lasso penalty is the most suitable for our problem. Using this model, we develop a method to determine how much information about policyholders' driving should be kept by an insurer. Using real data from a North American insurance company, we find that telematics data become redundant after about 3 months or 4,000 kilometers of observation, at least from a claim classification perspective.

Submitted 25 October, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

Comments: 21 pages, 8 figures, 6 tables

arXiv:1812.06157 [pdf, other]

A Claim Score for Dynamic Claim Counts Modeling

Authors: Jean-Philippe Boucher, Mathieu Pigeon

Abstract: We develop a claim score based on the Bonus-Malus approach proposed by [7]. We compare the fit and predictive ability of this new model with various models for panel count data. In particular, we study in more detail a new dynamic model based on the Harvey-Fernandès (HF) approach, which gives different weight to the claims according to their date of occurrence. We show that the HF model has serious shortcomings that limit its use in practice. In contrast, the Bonus-Malus model does not have these defects. Instead, it has several interesting properties: interpretability, computational advantages and ease of use in practice. We believe that the flexibility of this new model means that it could be used in many other actuarial contexts. Based on a real database, we show that the proposed model generates the best fit and one of the best predictive capabilities among the other models tested.

Submitted 14 December, 2018; originally announced December 2018.

arXiv:1602.08773 [pdf, other]

Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective)

Authors: Arthur Charpentier, Mathieu Pigeon

Abstract: Traditionally, actuaries have used run-off triangles to estimate reserves ("macro" models, on aggregated data). But it is possible to model payments related to individual claims. If those models provide similar estimations, we investigate uncertainty related to reserves, with "macro" and "micro" models. We study theoretical properties of econometric models (Gaussian, Poisson and quasi-Poisson) on individual data, and clustered data. Finally, applications to claims reserving are considered.

Submitted 28 February, 2016; originally announced February 2016.

Showing 1–8 of 8 results for author: Pigeon, M