A One-Class Classification Decision Tree Based on Kernel Density Estimation
Authors:
Sarah Itani,
Fabian Lecron,
Philippe Fortemps
Abstract:
One-class classification (OCC) is an area of machine learning that addresses prediction from unbalanced datasets. OCC algorithms are trained on a sample of a single class, possibly supplemented with some counter-examples. Current OCC models perform satisfactorily, but there is a growing need for interpretable models. In the present work, we propose a one-class model that addresses both performance and interpretability. Our hybrid OCC method relies on density estimation as part of a tree-based learning algorithm called the One-Class decision Tree (OC-Tree). Within a greedy, recursive approach, our proposal uses kernel density estimation to split a data subset on the basis of one or several intervals of interest. The OC-Tree thus encloses data within hyper-rectangles of interest, which can be described by a set of rules. Compared with state-of-the-art methods such as Cluster Support Vector Data Description (ClusterSVDD), One-Class Support Vector Machine (OCSVM) and isolation Forest (iForest), the OC-Tree performs favorably on a range of benchmark datasets. Furthermore, we present a real medical application in which the OC-Tree has demonstrated its effectiveness as an interpretable diagnosis aid based on unbalanced datasets.
Submitted 20 March, 2020; v1 submitted 14 May, 2018;
originally announced May 2018.
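The abstract describes splitting a data subset along one attribute into "intervals of interest" obtained from a kernel density estimate. The following is only a minimal sketch of that idea for a single attribute, not the authors' algorithm: the Gaussian kernel, the fixed bandwidth, the relative density cutoff `rel_threshold`, the evaluation grid, and the function names are all assumptions introduced for illustration.

```python
import math


def gaussian_kde_1d(samples, bandwidth):
    """Return a function that evaluates a Gaussian KDE built on `samples`."""
    n = len(samples)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def intervals_of_interest(samples, bandwidth, rel_threshold=0.1, grid_size=200):
    """Find intervals where the estimated density stays above a cutoff.

    Assumption: an interval is "of interest" when the KDE is at least
    `rel_threshold` times its maximum over the evaluation grid.
    """
    density = gaussian_kde_1d(samples, bandwidth)
    lo = min(samples) - 3.0 * bandwidth
    hi = max(samples) + 3.0 * bandwidth
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    values = [density(x) for x in grid]
    cutoff = rel_threshold * max(values)

    intervals = []
    start = None  # index where the current high-density run began
    for i, v in enumerate(values):
        if v >= cutoff and start is None:
            start = i
        elif v < cutoff and start is not None:
            intervals.append((grid[start], grid[i - 1]))
            start = None
    if start is not None:
        intervals.append((grid[start], grid[-1]))
    return intervals


def satisfies_rule(x, intervals):
    """Rule for one attribute: x lies in at least one interval of interest."""
    return any(a <= x <= b for a, b in intervals)


if __name__ == "__main__":
    # Hypothetical one-dimensional sample with two separated groups.
    data = [-0.2, -0.1, 0.0, 0.1, 0.2, 9.8, 9.9, 10.0, 10.1, 10.2]
    found = intervals_of_interest(data, bandwidth=0.3)
    print(found)
    print(satisfies_rule(5.0, found))
```

Applying such a split to each attribute in turn, and recursing on the data that fall inside each interval, would yield axis-aligned boxes comparable to the hyper-rectangles mentioned in the abstract. How the OC-Tree actually selects attributes, bandwidths, and stopping conditions is not stated here.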