A Unified Approach to Extract Interpretable Rules from Tree Ensembles via Integer Programming

Authors: Lorenzo Bonasera, Emilio Carrizosa

Abstract: Tree ensemble methods represent a popular machine learning model, known for their effectiveness in supervised classification and regression tasks. Their performance derives from aggregating predictions of multiple decision trees, which are renowned for their interpretability properties. However, tree ensemble methods do not reliably exhibit interpretable output. Our work aims to extract an optimized list of rules from a trained tree ensemble, providing the user with a condensed, interpretable model that retains most of the predictive power of the full model. Our approach consists of solving a clean and neat set partitioning problem formulated through Integer Programming. The proposed method works with either tabular or time series data, for both classification and regression tasks, and does not require parameter tuning under the most common setting. Through rigorous computational experiments, we offer statistically significant evidence that our method is competitive with other rule extraction methods and effectively handles time series.

Submitted 30 June, 2024; originally announced July 2024.

arXiv:2110.11952 [pdf, other]

doi 10.1016/j.cor.2021.105281

Optimal randomized classification trees

Authors: Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales

Abstract: Classification and Regression Trees (CARTs) are off-the-shelf techniques in modern Statistics and Machine Learning. CARTs are traditionally built by means of a greedy procedure, sequentially deciding the splitting predictor variable(s) and the associated threshold. This greedy approach trains trees very fast, but, by its nature, their classification accuracy may not be competitive against other state-of-the-art procedures. Moreover, controlling critical issues, such as the misclassification rates in each of the classes, is difficult. To address these shortcomings, optimal decision trees have been recently proposed in the literature, which use discrete decision variables to model the path each observation will follow in the tree. Instead, we propose a new approach based on continuous optimization. Our classifier can be seen as a randomized tree, since at each node of the decision tree a random decision is made. The computational experience reported demonstrates the good performance of our procedure.

Submitted 19 October, 2021; originally announced October 2021.

Comments: This research has been financed in part by research projects EC H2020 MSCA RISE NeEDS (Grant agreement ID: 822214), FQM-329 and P18-FR-2369 (Junta de Andalucía), and PID2019-110886RB-I00 (Ministerio de Ciencia, Innovación y Universidades, Spain). This support is gratefully acknowledged.

Journal ref: Computers & Operations Research, 2021

arXiv:2002.09191 [pdf, other]

doi 10.1016/j.ejor.2019.12.002

Sparsity in Optimal Randomized Classification Trees

Authors: Rafael Blanquero, Emilio Carrizosa, Cristina Molero-Río, Dolores Romero Morales

Abstract: Decision trees are popular Classification and Regression tools and, when small-sized, easy to interpret. Traditionally, a greedy approach has been used to build the trees, yielding a very fast training process; however, controlling sparsity (a proxy for interpretability) is challenging. In recent studies, optimal decision trees, where all decisions are optimized simultaneously, have shown a better learning performance, especially when oblique cuts are implemented. In this paper, we propose a continuous optimization approach to build sparse optimal classification trees, based on oblique cuts, with the aim of using fewer predictor variables in the cuts as well as along the whole tree. Both types of sparsity, namely local and global, are modeled by means of regularizations with polyhedral norms. The computational experience reported supports the usefulness of our methodology. In all our data sets, local and global sparsity can be improved without harming classification accuracy. Unlike greedy approaches, our ability to easily trade in some of our classification accuracy for a gain in global sparsity is shown.

Submitted 21 February, 2020; originally announced February 2020.

Comments: This research has been financed in part by research projects EC H2020 Marie Skłodowska-Curie Actions, Research and Innovation Staff Exchange Network of European Data Scientists, NeEDS, Grant agreement ID 822214, COSECLA - Fundación BBVA, MTM2015-65915R, Spain, P11-FQM-7603 and FQM-329, Junta de Andalucía. This support is gratefully acknowledged. Available online 16 December 2019.

Journal ref: European Journal of Operational Research, 2019

arXiv:1604.01542 [pdf, other]

A biobjective approach to robustness based on location planning

Authors: Emilio Carrizosa, Marc Goerigk, Anita Schöbel

Abstract: Finding robust solutions of an optimization problem is an important issue in practice, and various concepts on how to define the robustness of a solution have been suggested. The idea of recoverable robustness requires that a solution can be recovered to a feasible one as soon as the realized scenario becomes known. The usual approach in the literature is to minimize the objective function value of the recovered solution in the nominal or in the worst case. As the recovery itself is also costly, there is a trade-off between the recovery costs and the solution value obtained; we study both, the recovery costs and the solution value in the worst case in a biobjective setting. To this end, we assume that the recovery costs can be described by a metric. We demonstrate that this leads to a location planning problem, bringing together two fields of research which have been considered separate so far. We show how weakly Pareto efficient solutions to this biobjective problem can be computed by minimizing the recovery costs for a fixed worst-case objective function value and present approaches for the case of linear and quasiconvex problems for finite uncertainty sets. We furthermore derive cases in which the size of the uncertainty set can be reduced without changing the set of Pareto efficient solutions.

Submitted 6 April, 2016; originally announced April 2016.

Showing 1–4 of 4 results for author: Carrizosa, E