-
Kernel KMeans clustering splits for end-to-end unsupervised decision trees
Authors:
Louis Ohl,
Pierre-Alexandre Mattei,
Mickaël Leclercq,
Arnaud Droit,
Frédéric Precioso
Abstract:
Trees are convenient models for obtaining explainable predictions on relatively small datasets. Although there are many proposals for the end-to-end construction of such trees in supervised learning, learning a tree end-to-end for clustering without labels remains an open challenge. As most works focus on interpreting with trees the result of another clustering algorithm, we present here a novel end-to-end trained unsupervised binary tree for clustering: Kauri. This method performs a greedy maximisation of the kernel KMeans objective without requiring the definition of centroids. We compare this model on multiple datasets with recent unsupervised trees and show that Kauri performs identically when using a linear kernel. For other kernels, Kauri often outperforms the concatenation of kernel KMeans and a CART decision tree.
Submitted 19 February, 2024;
originally announced February 2024.
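The abstract above mentions maximising the kernel KMeans objective without defining centroids. A minimal sketch of the standard centroid-free form of that objective follows, assuming a precomputed kernel (Gram) matrix `K` and hard cluster labels. The function name `kernel_kmeans_objective` is hypothetical, and the abstract does not describe Kauri's exact objective or its greedy split procedure, so this only illustrates the general identity.

```python
import numpy as np


def kernel_kmeans_objective(K, labels):
    """Centroid-free kernel KMeans objective (hypothetical helper).

    For each cluster C, adds (1 / |C|) * sum_{i, j in C} K[i, j].
    Maximising this sum is equivalent to minimising the usual kernel
    KMeans distortion, because distortion = trace(K) - objective.
    """
    K = np.asarray(K, dtype=float)
    labels = np.asarray(labels)
    total = 0.0
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        # Sum of all pairwise kernel values inside the cluster,
        # normalised by the cluster size.
        total += K[np.ix_(idx, idx)].sum() / idx.size
    return total


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(12, 3))
    labels = np.array([0] * 5 + [1] * 7)
    K = X @ X.T  # linear kernel
    print(kernel_kmeans_objective(K, labels))
```

With a linear kernel, `trace(K) - kernel_kmeans_objective(K, labels)` equals the ordinary KMeans within-cluster sum of squares, which is why no explicit centroid is needed.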
-
Generalised Mutual Information: a Framework for Discriminative Clustering
Authors:
Louis Ohl,
Pierre-Alexandre Mattei,
Charles Bouveyron,
Warith Harchaoui,
Mickaël Leclercq,
Arnaud Droit,
Frédéric Precioso
Abstract:
In the last decade, recent successes in deep clustering majorly involved the Mutual Information (MI) as an unsupervised objective for training neural networks with increasing regularisations. While the quality of the regularisations has been largely discussed for improvements, little attention has been dedicated to the relevance of MI as a clustering objective. In this paper, we first highlight how the maximisation of MI does not lead to satisfying clusters. We identified the Kullback-Leibler divergence as the main reason for this behaviour. Hence, we generalise the mutual information by changing its core distance, introducing the Generalised Mutual Information (GEMINI): a set of metrics for unsupervised neural network training. Unlike MI, some GEMINIs do not require regularisations when training as they are geometry-aware thanks to distances or kernels in the data space. Finally, we highlight that GEMINIs can automatically select a relevant number of clusters, a property that has been little studied in the deep discriminative clustering context, where the number of clusters is a priori unknown.
Submitted 6 September, 2023;
originally announced September 2023.
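To make the role of the Kullback-Leibler divergence in the abstract above concrete, here is a minimal sketch of the mutual information between data and cluster assignments, written as the average KL divergence between each sample's predicted cluster distribution and the marginal cluster distribution. The function name `clustering_mutual_information` and the `eps` smoothing term are assumptions for illustration. This is the plain MI estimate, not the GEMINI objectives themselves, which the abstract does not specify.

```python
import numpy as np


def clustering_mutual_information(probs, eps=1e-12):
    """Empirical MI between samples and clusters (hypothetical helper).

    probs: array of shape (n_samples, n_clusters), each row a predicted
    distribution p(y | x_i). Returns the mean over samples of
    KL(p(y | x_i) || p(y)), where p(y) is the average of the rows.
    """
    probs = np.asarray(probs, dtype=float)
    marginal = probs.mean(axis=0)
    # eps avoids log(0); terms with zero probability contribute zero.
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return kl.mean()


if __name__ == "__main__":
    hard = np.array([[1.0, 0.0], [0.0, 1.0]])
    soft = np.full((4, 2), 0.5)
    print(clustering_mutual_information(hard))
    print(clustering_mutual_information(soft))
```

Note that the value depends only on the predicted distributions, not on where the samples lie in the data space: any balanced, confident assignment reaches the maximum, which is one way to read the abstract's remark that MI on its own is not geometry-aware.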
-
3DTeethSeg'22: 3D Teeth Scan Segmentation and Labeling Challenge
Authors:
Achraf Ben-Hamadou,
Oussama Smaoui,
Ahmed Rekik,
Sergi Pujades,
Edmond Boyer,
Hoyeon Lim,
Minchang Kim,
Minkyung Lee,
Minyoung Chung,
Yeong-Gil Shin,
Mathieu Leclercq,
Lucia Cevidanes,
Juan Carlos Prieto,
Shaojie Zhuang,
Guangshun Wei,
Zhiming Cui,
Yuanfeng Zhou,
Tudor Dascalu,
Bulat Ibragimov,
Tae-Hoon Yong,
Hong-Gi Ahn,
Wan Kim,
Jae-Hwan Han,
Byungsun Choi,
Niels van Nistelrooij,
et al. (7 additional authors not shown)
Abstract:
Teeth localization, segmentation, and labeling from intra-oral 3D scans are essential tasks in modern dentistry to enhance dental diagnostics, treatment planning, and population-based studies on oral health. However, developing automated algorithms for teeth analysis presents significant challenges due to variations in dental anatomy, imaging protocols, and limited availability of publicly accessible data. To address these challenges, the 3DTeethSeg'22 challenge was organized in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2022, with a call for algorithms tackling teeth localization, segmentation, and labeling from intraoral 3D scans. A dataset comprising a total of 1800 scans from 900 patients was prepared, and each tooth was individually annotated by a human-machine hybrid algorithm. A total of 6 algorithms were evaluated on this dataset. In this study, we present the evaluation results of the 3DTeethSeg'22 challenge. The 3DTeethSeg'22 challenge code can be accessed at: https://github.com/abenhamadou/3DTeethSeg22_challenge
Submitted 29 May, 2023;
originally announced May 2023.
-
Sparse GEMINI for Joint Discriminative Clustering and Feature Selection
Authors:
Louis Ohl,
Pierre-Alexandre Mattei,
Charles Bouveyron,
Mickaël Leclercq,
Arnaud Droit,
Frédéric Precioso
Abstract:
Feature selection in clustering is a hard task which involves simultaneously the discovery of relevant clusters as well as relevant variables with respect to these clusters. While feature selection algorithms are often model-based through optimised model selection or strong assumptions on $p(\pmb{x})$, we introduce a discriminative clustering model trying to maximise a geometry-aware generalisation of the mutual information called GEMINI with a simple $\ell_1$ penalty: the Sparse GEMINI. This algorithm avoids the burden of combinatorial feature subset exploration and is easily scalable to high-dimensional data and large amounts of samples while only designing a clustering model $p_\theta(y|\pmb{x})$. We demonstrate the performance of Sparse GEMINI on synthetic datasets as well as large-scale datasets. Our results show that Sparse GEMINI is a competitive algorithm and has the ability to select relevant subsets of variables with respect to the clustering without using relevance criteria or prior hypotheses.
Submitted 7 February, 2023;
originally announced February 2023.
-
Generalised Mutual Information for Discriminative Clustering
Authors:
Louis Ohl,
Pierre-Alexandre Mattei,
Charles Bouveyron,
Warith Harchaoui,
Mickaël Leclercq,
Arnaud Droit,
Frédéric Precioso
Abstract:
In the last decade, recent successes in deep clustering majorly involved the mutual information (MI) as an unsupervised objective for training neural networks with increasing regularisations. While the quality of the regularisations has been largely discussed for improvements, little attention has been dedicated to the relevance of MI as a clustering objective. In this paper, we first highlight how the maximisation of MI does not lead to satisfying clusters. We identified the Kullback-Leibler divergence as the main reason for this behaviour. Hence, we generalise the mutual information by changing its core distance, introducing the generalised mutual information (GEMINI): a set of metrics for unsupervised neural network training. Unlike MI, some GEMINIs do not require regularisations when training. Some of these metrics are geometry-aware thanks to distances or kernels in the data space. Finally, we highlight that GEMINIs can automatically select a relevant number of clusters, a property that has been little studied in the deep clustering context, where the number of clusters is a priori unknown.
Submitted 14 October, 2022; v1 submitted 12 October, 2022;
originally announced October 2022.
-
Multi-view Attention for gestational age at birth prediction
Authors:
Mathieu Leclercq,
Martin Styner,
Juan Carlos Prieto
Abstract:
We present our method for gestational age at birth prediction for the SLCN (surface learning for clinical neuroimaging) challenge. Our method is based on a multi-view shape analysis technique that captures 2D renderings of a 3D object from different viewpoints. We render the brain features on the surface of the sphere, and the 2D images are then analyzed via 2D CNNs and an attention layer for the regression task. The regression task achieves an MAE of 1.637 ± 1.3 on the native space and an MAE of 1.38 ± 1.14 on the template space. The source code for this project is available in our GitHub repository: https://github.com/MathieuLeclercq/SLCN_challenge_UNC
Submitted 8 July, 2022;
originally announced July 2022.
-
Support pour la reconfiguration d'implantation dans les applications a composants Java (Support for implementation reconfiguration in Java component-based applications)
Authors:
Jakub Kornas,
Matthieu Leclercq,
Vivien Quema,
Jean-Bernard Stefani
Abstract:
Nowadays, numerous component models are used for various purposes: to build applications, middleware or even operating systems. Those models commonly support structure reconfiguration, that is, modification of the application's architecture at runtime. On the other hand, very few allow implementation reconfiguration, that is, runtime modification of the code of the components that make up the application. In this article we present the work we performed on JULIA, a Java-based implementation of the FRACTAL component model, in order for it to support implementation reconfigurations. We show how we overcame the limitations of the Java class loading mechanism to allow runtime modifications of components' implementation and interfaces. We also describe the integration of our solution with the JULIA ADL.
Submitted 24 November, 2004;
originally announced November 2004.