Search | arXiv e-print repository

Parametric Information Maximization for Generalized Category Discovery

Authors: Florent Chiaroni, Jose Dolz, Ziko Imtiaz Masud, Amar Mitiche, Ismail Ben Ayed

Abstract: We introduce a Parametric Information Maximization (PIM) model for the Generalized Category Discovery (GCD) problem. Specifically, we propose a bi-level optimization formulation, which explores a parameterized family of objective functions, each evaluating a weighted mutual information between the features and the latent labels, subject to supervision constraints from the labeled samples. Our formulation mitigates the class-balance bias encoded in standard information maximization approaches, thereby handling effectively both short-tailed and long-tailed data sets. We report extensive experiments and comparisons demonstrating that our PIM model consistently sets new state-of-the-art performances in GCD across six different datasets, more so when dealing with challenging fine-grained problems.
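As a rough illustration of the "weighted mutual information" mentioned in this abstract, the sketch below estimates a mutual-information-style score from soft label predictions. It is not the authors' implementation. The function name, the `lam` parameter, and the choice to place the weight on the marginal-entropy term are assumptions made for illustration; the abstract does not state the parameterization or the supervision constraints.

```python
import numpy as np

def weighted_mutual_information(probs, lam=1.0, eps=1e-12):
    """Return lam * H(Y) - H(Y|X) estimated from soft predictions.

    probs: array of shape (n_samples, n_classes), rows sum to 1.
    lam:   hypothetical weight on the marginal entropy (an assumption);
           lam = 1 gives the standard mutual information estimate.
    """
    probs = np.asarray(probs, dtype=float)
    # Marginal label distribution: average prediction over the samples.
    marginal = probs.mean(axis=0)
    marginal_entropy = -np.sum(marginal * np.log(marginal + eps))
    # Conditional entropy: average entropy of the per-sample predictions.
    conditional_entropy = -np.mean(np.sum(probs * np.log(probs + eps), axis=1))
    return lam * marginal_entropy - conditional_entropy

# Toy inputs (hypothetical): confident and balanced vs. fully uncertain.
confident_balanced = np.array([[1.0, 0.0], [0.0, 1.0]])
uniform = np.array([[0.5, 0.5], [0.5, 0.5]])

mi_confident = weighted_mutual_information(confident_balanced)
mi_uniform = weighted_mutual_information(uniform)
mi_half_weight = weighted_mutual_information(confident_balanced, lam=0.5)
```

With `lam = 1`, the marginal-entropy term rewards balanced cluster sizes, which is the class-balance bias the abstract refers to. Lowering `lam` in this sketch reduces that reward.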

Submitted 14 July, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

arXiv:2208.00287 [pdf, other]

Simplex Clustering via sBeta with Applications to Online Adjustment of Black-Box Predictions

Authors: Florent Chiaroni, Malik Boudiaf, Amar Mitiche, Ismail Ben Ayed

Abstract: We explore clustering the softmax predictions of deep neural networks and introduce a novel probabilistic clustering method, referred to as k-sBetas. In the general context of clustering discrete distributions, the existing methods focused on exploring distortion measures tailored to simplex data, such as the KL divergence, as alternatives to the standard Euclidean distance. We provide a general maximum a posteriori (MAP) perspective of clustering distributions, emphasizing that the statistical models underlying the existing distortion-based methods may not be descriptive enough. Instead, we optimize a mixed-variable objective measuring data conformity within each cluster to the introduced sBeta density function, whose parameters are constrained and estimated jointly with binary assignment variables. Our versatile formulation approximates various parametric densities for modeling simplex data and enables the control of the cluster-balance bias. This yields highly competitive performances for the unsupervised adjustment of black-box model predictions in various scenarios. Our code and comparisons with the existing simplex-clustering approaches and our introduced softmax-prediction benchmarks are publicly available: https://github.com/fchiaroni/Clustering_Softmax_Predictions.

Submitted 30 June, 2024; v1 submitted 30 July, 2022; originally announced August 2022.

arXiv:2103.01859 [pdf, other]

Physical Activity Recognition Based on a Parallel Approach for an Ensemble of Machine Learning and Deep Learning Classifiers

Authors: M. Abid, A. Khabou, Y. Ouakrim, H. Watel, S. Chemkhi, A. Mitiche, A. Benazza-Benyahia, N. Mezghani

Abstract: Human activity recognition (HAR) by wearable sensor devices embedded in the Internet of things (IOT) can play a significant role in remote health monitoring and emergency notification, to provide healthcare of higher standards. The purpose of this study is to investigate a human activity recognition method of accrued decision accuracy and speed of execution to be applicable in healthcare. This method classifies wearable sensor acceleration time series data of human movement using efficient classifier combination of feature engineering-based and feature learning-based data representation. Leave-one-subject-out cross-validation of the method with data acquired from 44 subjects wearing a single waist-worn accelerometer on a smart textile, and engaged in a variety of 10 activities, yields an average recognition rate of 90%, performing significantly better than individual classifiers. The method easily accommodates functional and computational parallelization to bring execution time significantly down.
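The leave-one-subject-out protocol named in this abstract can be sketched as follows. This is a generic illustration of the splitting scheme, not the authors' code. The function name and the toy subject identifiers are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices), one fold per subject.

    subject_ids: list giving, for each sample, the subject it was recorded from.
    """
    subjects = sorted(set(subject_ids))
    for held_out in subjects:
        # Train on every sample from the other subjects.
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        # Test only on samples from the held-out subject.
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test

# Hypothetical toy data: six sensor windows recorded from three subjects.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(subject_ids))
```

Because no subject contributes to both the training and the test set of a fold, the per-fold accuracy reflects performance on a person the classifier has not seen. With 44 subjects, as in the abstract, this scheme would produce 44 folds.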

Submitted 2 March, 2021; originally announced March 2021.

Report number: https://www.mdpi.com/1424-8220/21/14/4713

arXiv:1912.03354 [pdf, ps, other]

Bilinear Models for Machine Learning

Authors: Tayssir Doghri, Leszek Szczecinski, Jacob Benesty, Amar Mitiche

Abstract: In this work we define and analyze the bilinear models which replace the conventional linear operation used in many building blocks of machine learning (ML). The main idea is to devise the ML algorithms which are adapted to the objects they treat. In the case of monochromatic images, we show that the bilinear operation exploits better the structure of the image than the conventional linear operation which ignores the spatial relationship between the pixels. This translates into significantly smaller number of parameters required to yield the same performance. We show numerical examples of classification in the MNIST data set.
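To make the parameter-count claim in this abstract concrete, the sketch below compares a conventional linear score on a flattened 28x28 image with a rank-one bilinear score of the form u^T X v. The rank-one form, the function names, and the random inputs are assumptions for illustration; the abstract does not specify the exact bilinear operation the authors use.

```python
import numpy as np

def linear_score(image, w):
    """Conventional linear operation: flatten the image, take a dot product."""
    return float(w @ image.reshape(-1))

def bilinear_score(image, u, v):
    """Assumed rank-one bilinear operation u^T X v on the 2-D image."""
    return float(u @ image @ v)

rows, cols = 28, 28              # MNIST image size
linear_params = rows * cols      # one weight per pixel
bilinear_params = rows + cols    # one weight per row plus one per column

rng = np.random.default_rng(0)
image = rng.random((rows, cols))  # stand-in for a grayscale image
u = rng.standard_normal(rows)
v = rng.standard_normal(cols)

# The bilinear score equals a linear score whose weights are constrained
# to the outer product of u and v.
w_equivalent = np.outer(u, v).reshape(-1)
```

In this sketch the bilinear form uses 56 parameters per output instead of 784. It corresponds to a linear map whose weight matrix is constrained to rank one, which is how the row and column structure of the image enters the model.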

Submitted 6 December, 2019; originally announced December 2019.

arXiv:1810.04246 [pdf, ps, other]

Deep clustering: On the link between discriminative models and K-means

Authors: Mohammed Jabi, Marco Pedersoli, Amar Mitiche, Ismail Ben Ayed

Abstract: In the context of recent deep clustering studies, discriminative models dominate the literature and report the most competitive performances. These models learn a deep discriminative neural network classifier in which the labels are latent. Typically, they use multinomial logistic regression posteriors and parameter regularization, as is very common in supervised learning. It is generally acknowledged that discriminative objective functions (e.g., those based on the mutual information or the KL divergence) are more flexible than generative approaches (e.g., K-means) in the sense that they make fewer assumptions about the data distributions and, typically, yield much better unsupervised deep learning results. On the surface, several recent discriminative models may seem unrelated to K-means. This study shows that these models are, in fact, equivalent to K-means under mild conditions and common posterior models and parameter regularization. We prove that, for the commonly used logistic regression posteriors, maximizing the $L_2$ regularized mutual information via an approximate alternating direction method (ADM) is equivalent to a soft and regularized K-means loss. Our theoretical analysis not only connects directly several recent state-of-the-art discriminative models to K-means, but also leads to a new soft and regularized deep K-means algorithm, which yields competitive performance on several image clustering benchmarks.
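For readers unfamiliar with soft K-means, the sketch below shows one iteration of a generic soft K-means update: softmax assignments over negative squared distances, followed by a weighted mean for each center. It is a textbook-style illustration, not the regularized deep K-means algorithm of the paper. The function name, the `temperature` parameter, and the toy data are assumptions.

```python
import numpy as np

def soft_kmeans_step(points, centers, temperature=1.0):
    """One generic soft K-means iteration.

    Returns (responsibilities, new_centers). `temperature` is a hypothetical
    smoothing parameter; small values approach hard K-means assignments.
    """
    # Squared Euclidean distance from every point to every center: (n, k).
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Softmax over negative distances, shifted for numerical stability.
    logits = -d2 / temperature
    logits -= logits.max(axis=1, keepdims=True)
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)
    # Each new center is the responsibility-weighted mean of the points.
    new_centers = (resp.T @ points) / resp.sum(axis=0)[:, None]
    return resp, new_centers

# Hypothetical toy data: two well-separated groups of 2-D points.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
resp, new_centers = soft_kmeans_step(points, centers)
```

The softmax assignment in this update has the same functional form as a multinomial logistic regression posterior, which gives some intuition for the kind of link the abstract describes.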

Submitted 15 December, 2019; v1 submitted 9 October, 2018; originally announced October 2018.

Showing 1–5 of 5 results for author: Mitiche, A