Showing 1–38 of 38 results for author: Charpentier, A

  1. arXiv:2403.15790  [pdf, other]

    cs.LG stat.ML

    Boarding for ISS: Imbalanced Self-Supervised: Discovery of a Scaled Autoencoder for Mixed Tabular Datasets

    Authors: Samuel Stocksieker, Denys Pommeret, Arthur Charpentier

    Abstract: The field of imbalanced self-supervised learning, especially in the context of tabular data, has not been extensively studied. Existing research has predominantly focused on image datasets. This paper aims to fill this gap by examining the specific challenges posed by data imbalance in self-supervised learning in the domain of tabular data, with a primary focus on autoencoders. Autoencoders are wi…

    Submitted 23 March, 2024; originally announced March 2024.

  2. arXiv:2402.07790  [pdf, other]

    cs.LG

    From Uncertainty to Precision: Enhancing Binary Classifier Performance through Calibration

    Authors: Agathe Fernandes Machado, Arthur Charpentier, Emmanuel Flachaire, Ewen Gallic, François Hu

    Abstract: The assessment of binary classifier performance traditionally centers on discriminative ability using metrics, such as accuracy. However, these metrics often disregard the model's inherent uncertainty, especially when dealing with sensitive decision-making domains, such as finance or healthcare. Given that model-predicted scores are commonly seen as event probabilities, calibration is crucial for…

    Submitted 12 February, 2024; originally announced February 2024.

  3. arXiv:2401.16197  [pdf, other]

    cs.LG cs.CY

    Geospatial Disparities: A Case Study on Real Estate Prices in Paris

    Authors: Agathe Fernandes Machado, François Hu, Philipp Ratz, Ewen Gallic, Arthur Charpentier

    Abstract: Driven by an increasing prevalence of trackers, ever more IoT sensors, and the declining cost of computing power, geospatial information has come to play a pivotal role in contemporary predictive models. While enhancing prognostic performance, geospatial data also has the potential to perpetuate many historical socio-economic patterns, raising concerns about a resurgence of biases and exclusionary…

    Submitted 29 January, 2024; originally announced January 2024.

  4. arXiv:2311.11900  [pdf, other]

    stat.ML cs.CY cs.LG

    Measuring and Mitigating Biases in Motor Insurance Pricing

    Authors: Mulah Moriah, Franck Vermet, Arthur Charpentier

    Abstract: The non-life insurance sector operates within a highly competitive and tightly regulated framework, confronting a pivotal juncture in the formulation of pricing strategies. Insurers are compelled to harness a range of statistical methodologies and available data to construct optimal pricing structures that align with the overarching corporate strategy while accommodating the dynamics of market com…

    Submitted 20 June, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

  5. arXiv:2310.20508  [pdf, other]

    stat.ML cs.CY cs.LG

    Parametric Fairness with Statistical Guarantees

    Authors: François Hu, Philipp Ratz, Arthur Charpentier

    Abstract: Algorithmic fairness has gained prominence due to societal and regulatory concerns about biases in Machine Learning models. Common group fairness metrics like Equalized Odds for classification or Demographic Parity for both classification and regression are widely used and a host of computationally advantageous post-processing methods have been developed around them. However, these metrics often l…

    Submitted 31 October, 2023; originally announced October 2023.

  6. arXiv:2309.06627  [pdf, other]

    stat.ML cs.CY cs.LG

    A Sequentially Fair Mechanism for Multiple Sensitive Attributes

    Authors: François Hu, Philipp Ratz, Arthur Charpentier

    Abstract: In the standard use case of Algorithmic Fairness, the goal is to eliminate the relationship between a sensitive variable and a corresponding score. Throughout recent years, the scientific community has developed a host of definitions and tools to solve this task, which work well in many practical applications. However, the applicability and effectivity of these tools and definitions become less s…

    Submitted 14 January, 2024; v1 submitted 12 September, 2023; originally announced September 2023.

  7. arXiv:2308.11090  [pdf, other]

    cs.CV cs.LG stat.AP

    Fairness Explainability using Optimal Transport with Applications in Image Classification

    Authors: Philipp Ratz, François Hu, Arthur Charpentier

    Abstract: Ensuring trust and accountability in Artificial Intelligence systems demands explainability of its outcomes. Despite significant progress in Explainable AI, human biases still taint a substantial portion of its training data, raising concerns about unfairness or discriminatory tendencies. Current approaches in the field of Algorithmic Fairness focus on mitigating such biases in the outcomes of a m…

    Submitted 31 October, 2023; v1 submitted 21 August, 2023; originally announced August 2023.

  8. arXiv:2308.02966  [pdf, other]

    stat.ML cs.LG

    Generalized Oversampling for Learning from Imbalanced datasets and Associated Theory

    Authors: Samuel Stocksieker, Denys Pommeret, Arthur Charpentier

    Abstract: In supervised learning, it is quite frequent to be confronted with real imbalanced datasets. This situation leads to a learning difficulty for standard algorithms. Research and solutions in imbalanced learning have mainly focused on classification tasks. Despite its importance, very few solutions exist for imbalanced regression. In this paper, we propose a data augmentation procedure, the GOLIATH…

    Submitted 5 August, 2023; originally announced August 2023.

    Comments: This paper focuses specifically on the Imbalanced Regression issues but could be used for Imbalanced classification tasks

  9. arXiv:2306.13633  [pdf, other]

    q-bio.PE math.PR

    Optimal Vaccination Policy to Prevent Endemicity: A Stochastic Model

    Authors: Félix Foutel-Rodier, Arthur Charpentier, Hélène Guérin

    Abstract: We examine here the effects of recurrent vaccination and waning immunity on the establishment of an endemic equilibrium in a population. An individual-based model that incorporates memory effects for transmission rate during infection and subsequent immunity is introduced, considering stochasticity at the individual level. By letting the population size go to infinity, we derive a set of equati…

    Submitted 5 April, 2024; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: 51 pages, 7 figures

  10. arXiv:2306.12912  [pdf, other]

    stat.ML cs.CY cs.LG

    Mitigating Discrimination in Insurance with Wasserstein Barycenters

    Authors: Arthur Charpentier, François Hu, Philipp Ratz

    Abstract: The insurance industry is heavily reliant on predictions of risks based on characteristics of potential customers. Although the use of said models is common, researchers have long pointed out that such practices perpetuate discrimination based on sensitive features such as gender or race. Given that such discrimination can often be attributed to historical data biases, an elimination or at least m…

    Submitted 22 June, 2023; originally announced June 2023.

  11. Fairness in Multi-Task Learning via Wasserstein Barycenters

    Authors: François Hu, Philipp Ratz, Arthur Charpentier

    Abstract: Algorithmic Fairness is an established field in machine learning that aims to reduce biases in data. Recent advances have proposed various methods to ensure fairness in a univariate environment, where the goal is to de-bias a single task. However, extending fairness to a multi-task setting, where more than one objective is optimised using a shared representation, remains underexplored. To bridge t…

    Submitted 6 July, 2023; v1 submitted 16 June, 2023; originally announced June 2023.

  12. arXiv:2302.09288  [pdf, other]

    stat.ML cs.LG stat.ME

    Data Augmentation for Imbalanced Regression

    Authors: Samuel Stocksieker, Denys Pommeret, Arthur Charpentier

    Abstract: In this work, we consider the problem of imbalanced data in a regression framework when the imbalanced phenomenon concerns continuous or discrete covariates. Such a situation can lead to biases in the estimates. In this case, we propose a data augmentation algorithm that combines a weighted resampling (WR) and a data augmentation (DA) procedure. In a first step, the DA procedure permits exploring…

    Submitted 18 February, 2023; originally announced February 2023.

    Comments: paper accepted at the AISTATS 2023 conference, to be published in PMLR (Proceedings of Machine Learning Research)

  13. arXiv:2301.07755  [pdf, other]

    econ.EM

    Optimal Transport for Counterfactual Estimation: A Method for Causal Inference

    Authors: Arthur Charpentier, Emmanuel Flachaire, Ewen Gallic

    Abstract: Many problems ask a question that can be formulated as a causal question: "what would have happened if...?" For example, "would the person have had surgery if he or she had been Black?" To address this kind of question, calculating an average treatment effect (ATE) is often uninformative, because one would like to know how much impact a variable (such as skin color) has on a specific individual,…

    Submitted 18 January, 2023; originally announced January 2023.

  14. arXiv:2212.09868  [pdf, other]

    econ.EM

    Quantifying fairness and discrimination in predictive models

    Authors: Arthur Charpentier

    Abstract: The analysis of discrimination has long interested economists and lawyers. In recent years, the literature in computer science and machine learning has become interested in the subject, offering an interesting re-reading of the topic. These questions are the consequences of numerous criticisms of algorithms used to translate texts or to identify people in images. With the arrival of massive data,…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: Classifier; Demographic Parity; Discrimination; Equal Opportunity; Fairness; Penalized regression; Proxy; Statistical Discrimination

  15. arXiv:2212.09192  [pdf, other]

    math.OC q-fin.CP

    Multiarmed Bandits Problem Under the Mean-Variance Setting

    Authors: Hongda Hu, Arthur Charpentier, Mario Ghossoub, Alexander Schied

    Abstract: The classical multi-armed bandit (MAB) problem involves a learner and a collection of K independent arms, each with its own ex ante unknown independent reward distribution. At each one of a finite number of rounds, the learner selects one arm and receives new information. The learner often faces an exploration-exploitation dilemma: exploiting the current information by playing the arm with the hig…

    Submitted 3 May, 2024; v1 submitted 18 December, 2022; originally announced December 2022.

  16. arXiv:2207.01010  [pdf, other]

    cs.MA cs.LG econ.GN

    Government Intervention in Catastrophe Insurance Markets: A Reinforcement Learning Approach

    Authors: Menna Hassan, Nourhan Sakr, Arthur Charpentier

    Abstract: This paper designs a sequential repeated game of a micro-founded society with three types of agents: individuals, insurers, and a government. Nascent to economics literature, we use Reinforcement Learning (RL), closely related to multi-armed bandit problems, to learn the welfare impact of a set of proposed policy interventions per $1 spent on them. The paper rigorously discusses the desirability o…

    Submitted 3 July, 2022; originally announced July 2022.

  17. arXiv:2205.08112  [pdf, ps, other]

    econ.GN cs.CY

    The Fairness of Machine Learning in Insurance: New Rags for an Old Man?

    Authors: Laurence Barry, Arthur Charpentier

    Abstract: Since the beginning of their history, insurers have been known to use data to classify and price risks. As such, they were confronted early on with the problem of fairness and discrimination associated with data. This issue is becoming increasingly important with access to more granular and behavioural data, and is evolving to reflect current technologies and societal concerns. By looking into ear…

    Submitted 17 May, 2022; originally announced May 2022.

  18. arXiv:2202.12008  [pdf, other]

    stat.ML cs.AI cs.CY cs.LG stat.AP

    A Fair Pricing Model via Adversarial Learning

    Authors: Vincent Grari, Arthur Charpentier, Marcin Detyniecki

    Abstract: At the core of insurance business lies classification between risky and non-risky insureds, actuarial fairness meaning that risky insureds should contribute more and pay a higher premium than non-risky or less-risky ones. Actuaries, therefore, use econometric or machine learning techniques to classify, but the distinction between a fair actuarial classification and "discrimination" is subtle. For…

    Submitted 26 December, 2022; v1 submitted 24 February, 2022; originally announced February 2022.

    Comments: 20 pages, 12 figures

  19. arXiv:2108.04737  [pdf, other]

    econ.EM

    Weighted asymmetric least squares regression with fixed-effects

    Authors: Amadou Barry, Karim Oualkacha, Arthur Charpentier

    Abstract: The fixed-effects model estimates the regressor effects on the mean of the response, which is inadequate to summarize the variable relationships in the presence of heteroscedasticity. In this paper, we adapt the asymmetric least squares (expectile) regression to the fixed-effects model and propose a new model: expectile regression with fixed-effects (ERFE). The ERFE model applies the within…

    Submitted 10 August, 2021; originally announced August 2021.

    MSC Class: 62Jxx; 62J05 ACM Class: G.3.2

  20. Predicting Drought and Subsidence Risks in France

    Authors: Arthur Charpentier, Molly James, Hani Ali

    Abstract: The economic consequences of drought episodes are increasingly important, although they are often difficult to apprehend in part because of the complexity of the underlying mechanisms. In this article, we will study one of the consequences of drought, namely the risk of subsidence (or more specifically clay shrinkage induced subsidence), for which insurance has been mandatory in France for several…

    Submitted 15 July, 2021; originally announced July 2021.

  21. arXiv:2107.02764  [pdf, other]

    q-fin.RM cs.SI econ.GN q-fin.CP

    Collaborative Insurance Sustainability and Network Structure

    Authors: Arthur Charpentier, Lariosse Kouakou, Matthias Löwe, Philipp Ratz, Franck Vermet

    Abstract: The peer-to-peer (P2P) economy has been growing with the advent of the Internet, with well known brands such as Uber or Airbnb being examples thereof. In the insurance sector the approach is still in its infancy, but some companies have started to explore P2P-based collaborative insurance products (e.g. Lemonade in the U.S. or Inspeer in France). The actuarial literature only recently started to co…

    Submitted 12 September, 2022; v1 submitted 5 July, 2021; originally announced July 2021.

  22. arXiv:2103.03635  [pdf, other]

    stat.ML cs.LG econ.EM

    Autocalibration and Tweedie-dominance for Insurance Pricing with Machine Learning

    Authors: Michel Denuit, Arthur Charpentier, Julien Trufin

    Abstract: Boosting techniques and neural networks are particularly effective machine learning methods for insurance pricing. Often in practice, there are nevertheless endless debates about the choice of the right loss function to be used to train the machine learning model, as well as about the appropriate metric to assess the performances of competing models. Also, the sum of fitted values can depart from…

    Submitted 9 July, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

  23. Local Utility and Multivariate Risk Aversion

    Authors: Arthur Charpentier, Alfred Galichon, Marc Henry

    Abstract: We revisit Machina's local utility as a tool to analyze attitudes to multivariate risks. We show that for non-expected utility maximizers choosing between multivariate prospects, aversion to multivariate mean preserving increases in risk is equivalent to the concavity of the local utility functions, thereby generalizing Machina's result in Machina (1982). To analyze comparative risk attitudes with…

    Submitted 22 February, 2021; v1 submitted 8 February, 2021; originally announced February 2021.

    Comments: 18 pages

    Journal ref: Mathematics of Operations Research 41-2 (2016) pp. 377-744

  24. arXiv:2006.08446  [pdf, other]

    stat.AP econ.GN

    Modeling Joint Lives within Families

    Authors: Olivier Cabrignac, Arthur Charpentier, Ewen Gallic

    Abstract: Family history is usually seen as a significant factor insurance companies look at when applying for a life insurance policy. Where it is used, family history of cardiovascular diseases, death by cancer, or family history of high blood pressure and diabetes could result in higher premiums or no coverage at all. In this article, we use massive (historical) data to study dependencies between life le…

    Submitted 15 June, 2020; originally announced June 2020.

  25. arXiv:2005.06526  [pdf, other]

    q-bio.PE eess.SY physics.soc-ph

    COVID-19 pandemic control: balancing detection policy and lockdown intervention under ICU sustainability

    Authors: Arthur Charpentier, Romuald Elie, Mathieu Laurière, Viet Chi Tran

    Abstract: We consider here an extended SIR model, including several features of the recent COVID-19 outbreak: in particular the infected and recovered individuals can either be detected (+) or undetected (-) and we also integrate an intensive care unit (ICU) capacity. Our model enables a tractable quantitative analysis of the optimal policy for the control of the epidemic dynamics using both lockdown and de…

    Submitted 21 May, 2020; v1 submitted 13 May, 2020; originally announced May 2020.

    MSC Class: 49N90; 92D30; 34H05

  26. arXiv:2003.10014  [pdf, other]

    econ.TH cs.LG q-fin.CP

    Reinforcement Learning in Economics and Finance

    Authors: Arthur Charpentier, Romuald Elie, Carl Remlinger

    Abstract: Reinforcement learning algorithms describe how an agent can learn an optimal action policy in a sequential decision process, through repeated experience. In a given environment, the agent policy provides him some running and terminal rewards. As in online learning, the agent learns sequentially. As in multi-armed bandit problems, when an agent picks an action, he cannot infer ex-post the rewards…

    Submitted 22 March, 2020; originally announced March 2020.

  27. arXiv:1912.11736  [pdf, other]

    econ.EM stat.ME

    Pareto models for risk management

    Authors: Arthur Charpentier, Emmanuel Flachaire

    Abstract: The Pareto model is very popular in risk management, since simple analytical formulas can be derived for financial downside risk measures (Value-at-Risk, Expected Shortfall) or reinsurance premiums and related quantities (Large Claim Index, Return Period). Nevertheless, in practice, distributions are (strictly) Pareto only in the tails, above a (possibly very) large threshold. Therefore, it could be…

    Submitted 25 December, 2019; originally announced December 2019.

  28. arXiv:1907.02320  [pdf, other]

    econ.GN cs.DS econ.EM

    Optimal transport on large networks, a practitioner's guide

    Authors: Arthur Charpentier, Alfred Galichon, Lucas Vernet

    Abstract: This article presents a set of tools for the modeling of a spatial allocation problem in a large geographic market and gives examples of applications. In our settings, the market is described by a network that maps the cost of travel between each pair of adjacent locations. Two types of agents are located at the nodes of this network. The buyers choose the most competitive sellers depending on the…

    Submitted 22 August, 2019; v1 submitted 4 July, 2019; originally announced July 2019.

  29. arXiv:1905.10267  [pdf, other]

    cs.SI physics.soc-ph stat.ME

    Extended Scale-Free Networks

    Authors: Arthur Charpentier, Emmanuel Flachaire

    Abstract: Recently, Broido & Clauset (2019) mentioned that (strict) Scale-Free networks were rare, in real life. This might be related to the statement of Stumpf, Wiuf & May (2005), that sub-networks of scale-free networks are not scale-free. In the latter, those sub-networks are asymptotically scale-free, but one should not forget about second-order deviation (possibly also third order actually). In this ar…

    Submitted 28 May, 2019; v1 submitted 24 May, 2019; originally announced May 2019.

  30. arXiv:1810.09214  [pdf, other]

    stat.ME

    A new GEE method to account for heteroscedasticity, using asymmetric least-square regressions

    Authors: Amadou Barry, Karim Oualkacha, Arthur Charpentier

    Abstract: Generalized estimating equations (GEE) are widely used to analyze longitudinal data; however, they are not appropriate for heteroscedastic data, because they only estimate regressor effects on the mean response, and therefore do not account for data heterogeneity. Here, we combine the GEE with the asymmetric least squares (expectile) regression to derive a new class of estimators, which…

    Submitted 24 December, 2020; v1 submitted 22 October, 2018; originally announced October 2018.

    Comments: 40 pages, 14 figures and all sections modified

  31. arXiv:1807.08991  [pdf, other]

    physics.soc-ph

    Internal Migrations in France in the Nineteenth Century

    Authors: Arthur Charpentier, Ewen Gallic

    Abstract: The digital age allows data collection to be done on a large scale and at low cost. This is the case of genealogy trees, which flourish on numerous digital platforms thanks to the collaboration of a mass of individuals wishing to trace their origins and share them with other users. The family trees constituted in this way contain information on the links between individuals and their ancestors, wh…

    Submitted 24 July, 2018; originally announced July 2018.

  32. arXiv:1708.06992  [pdf, other]

    stat.OT econ.EM

    Econométrie et Machine Learning

    Authors: Arthur Charpentier, Emmanuel Flachaire, Antoine Ly

    Abstract: Econometrics and machine learning seem to have one common goal: to construct a predictive model, for a variable of interest, using explanatory variables (or features). However, these two fields developed in parallel, thus creating two different cultures, to paraphrase Breiman (2001). The first was to build probabilistic models to describe economic phenomena. The second uses algorithms that will le…

    Submitted 19 March, 2018; v1 submitted 26 July, 2017; originally announced August 2017.

    Comments: in French

  33. arXiv:1707.07607  [pdf, other]

    stat.OT

    We are not alone ! (at least, most of us). Homonymy in large scale social groups

    Authors: Arthur Charpentier, Baptiste Coulmont

    Abstract: This article brings forward an estimation of the proportion of homonyms in large scale groups based on the distribution of first names and last names in a subset of these groups. The estimation is based on the generalization of the "birthday paradox problem". The main result is that, in societies such as France or the United States, identity collisions (based on first + last names) are frequent.…

    Submitted 24 July, 2017; originally announced July 2017.

  34. arXiv:1602.08773  [pdf, other]

    stat.AP

    Macro vs. Micro Methods in Non-Life Claims Reserving (an Econometric Perspective)

    Authors: Arthur Charpentier, Mathieu Pigeon

    Abstract: Traditionally, actuaries have used run-off triangles to estimate reserve ("macro" models, on aggregated data). But it is possible to model payments related to individual claims. If those models provide similar estimations, we investigate uncertainty related to reserves, with "macro" and "micro" models. We study theoretical properties of econometric models (Gaussian, Poisson and quasi-Poisson) on in…

    Submitted 28 February, 2016; originally announced February 2016.

  35. arXiv:1404.4414  [pdf, other]

    stat.ME math.ST

    Probit transformation for nonparametric kernel estimation of the copula density

    Authors: Gery Geenens, Arthur Charpentier, Davy Paindaveine

    Abstract: Copula modelling has become ubiquitous in modern statistics. Here, the problem of nonparametrically estimating a copula density is addressed. Arguably the most popular nonparametric density estimator, the kernel estimator is not suitable for the unit-square-supported copula densities, mainly because it is heavily affected by boundary bias issues. In addition, most common copulas admit unbounded de…

    Submitted 16 April, 2014; originally announced April 2014.

  36. arXiv:1112.0929  [pdf, other]

    stat.AP stat.ME

    Multivariate integer-valued autoregressive models applied to earthquake counts

    Authors: Mathieu Boudreault, Arthur Charpentier

    Abstract: In various situations in the insurance industry, in finance, in epidemiology, etc., one needs to represent the joint evolution of the number of occurrences of an event. In this paper, we present a multivariate integer-valued autoregressive (MINAR) model, derive its properties and apply the model to earthquake occurrences across various pairs of tectonic plates. The model is an extension of Pedelis…

    Submitted 5 December, 2011; originally announced December 2011.

  37. arXiv:1010.2621  [pdf, ps, other]

    cs.CR

    An Asymmetric Fingerprinting Scheme based on Tardos Codes

    Authors: Ana Charpentier, Caroline Fontaine, Teddy Furon, Ingemar Cox

    Abstract: Tardos codes are currently the state-of-the-art in the design of practical collusion-resistant fingerprinting codes. Tardos codes rely on a secret vector drawn from a publicly known probability distribution in order to generate each Buyer's fingerprint. For security purposes, this secret vector must not be revealed to the Buyers. To prevent an untrustworthy Provider forging a copy of a Work with a…

    Submitted 13 October, 2010; originally announced October 2010.

    Comments: 6 pages, 2 figures

  38. arXiv:0901.1521  [pdf, ps, other]

    math.PR

    Tails of multivariate Archimedean copulas

    Authors: Arthur Charpentier, Johan Segers

    Abstract: A complete and user-friendly directory of tails of Archimedean copulas is presented which can be used in the selection and construction of appropriate models with desired properties. The results are synthesized in the form of a decision tree: Given the values of some readily computable characteristics of the Archimedean generator, the upper and lower tails of the copula are classified into one o…

    Submitted 12 January, 2009; originally announced January 2009.

    Comments: to appear in the Journal of Multivariate Analysis

    Report number: Univ catholique de Louvain, Institut de statistique DP0808 MSC Class: 60G70; 62E20