doi 10.1016/j.neucom.2023.126634

Meta-survey on outlier and anomaly detection

Authors: Madalina Olteanu, Fabrice Rossi, Florian Yger

Abstract: The impact of outliers and anomalies on model estimation and data processing is of paramount importance, as evidenced by the extensive body of research spanning various fields over several decades: thousands of research papers have been published on the subject. As a consequence, numerous reviews, surveys, and textbooks have sought to summarize the existing literature, encompassing a wide range of methods from both the statistical and data mining communities. While these endeavors to organize and summarize the research are invaluable, they face inherent challenges due to the pervasive nature of outliers and anomalies in all data-intensive applications, irrespective of the specific application field or scientific discipline. As a result, the resulting collection of papers remains voluminous and somewhat heterogeneous. To address the need for knowledge organization in this domain, this paper implements the first systematic meta-survey of general surveys and reviews on outlier and anomaly detection. Employing a classical systematic survey approach, the study collects nearly 500 papers using two specialized scientific search engines. From this comprehensive collection, a subset of 56 papers that claim to be general surveys on outlier detection is selected using a snowball search technique to enhance field coverage. A meticulous quality assessment phase further refines the selection to a subset of 25 high-quality general surveys.
Using this curated collection, the paper investigates the evolution of the outlier detection field over a 20-year period, revealing emerging themes and methods. Furthermore, an analysis of the surveys sheds light on the survey writing practices adopted by scholars from different communities who have contributed to this field. Finally, the paper delves into several topics where consensus has emerged from the literature. These include taxonomies of outlier types, challenges posed by high-dimensional data, the importance of anomaly scores, the impact of learning conditions, difficulties in benchmarking, and the significance of neural networks. Non-consensual aspects are also discussed, particularly the distinction between local and global outliers and the challenges in organizing detection methods into meaningful taxonomies.

Submitted 12 December, 2023; originally announced December 2023.

Journal ref: Neurocomputing, 2023, 555, pp.126634

arXiv:2309.07579 [pdf, other]

Structure-Preserving Transformers for Sequences of SPD Matrices

Authors: Mathieu Seraphim, Alexis Lechervy, Florian Yger, Luc Brun, Olivier Etard

Abstract: In recent years, Transformer-based auto-attention mechanisms have been successfully applied to the analysis of a variety of context-reliant data types, from texts to images and beyond, including data from non-Euclidean geometries. In this paper, we present such a mechanism, designed to classify sequences of Symmetric Positive Definite matrices while preserving their Riemannian geometry throughout the analysis. We apply our method to automatic sleep staging on timeseries of EEG-derived covariance matrices from a standard dataset, obtaining high levels of stage-wise performance.

Submitted 28 May, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

Comments: New year, new version! (updated template, minimal additions - including two new references)

arXiv:2212.13520 [pdf, other]

Challenges in anomaly and change point detection

Authors: Madalina Olteanu, Fabrice Rossi, Florian Yger

Abstract: This paper presents an introduction to the state-of-the-art in anomaly and change-point detection. On the one hand, the main concepts needed to understand the vast scientific literature on those subjects are introduced. On the other, a selection of important surveys and books, as well as two selected active research topics in the field, are presented.

Submitted 27 December, 2022; originally announced December 2022.

Journal ref: 30th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2022), Oct 2022, Bruges, Belgium

arXiv:2207.02574 [pdf, other]

Is the U-Net Directional-Relationship Aware?

Authors: Mateus Riva, Pietro Gori, Florian Yger, Isabelle Bloch

Abstract: CNNs are often assumed to be capable of using contextual information about distinct objects (such as their directional relations) inside their receptive field. However, the nature and limits of this capacity have never been explored in full. We explore a specific type of relationship (directional) using a standard U-Net trained to optimize a cross-entropy loss function for segmentation. We train this network on a pretext segmentation task requiring directional relation reasoning for success and state that, with enough data and a sufficiently large receptive field, it succeeds in learning the proposed task. We further explore what the network has learned by analysing scenarios where the directional relationships are perturbed, and show that the network has learned to reason using these relationships.

Submitted 6 July, 2022; originally announced July 2022.

Comments: Accepted at ICIP 2022

arXiv:2201.06655 [pdf, other]

Multi-winner Approval Voting Goes Epistemic

Authors: Tahar Allouche, Jérôme Lang, Florian Yger

Abstract: Epistemic voting interprets votes as noisy signals about a ground truth. We consider contexts where the truth consists of a set of objective winners, knowing a lower and upper bound on its cardinality. A prototypical problem for this setting is the aggregation of multi-label annotations with prior knowledge on the size of the ground truth. We posit noise models, for which we define rules that output an optimal set of winners. We report on experiments on multi-label annotations (which we collected).

Submitted 17 January, 2022; originally announced January 2022.

arXiv:2112.04387 [pdf, other]

Truth-tracking via Approval Voting: Size Matters

Authors: Tahar Allouche, Jérôme Lang, Florian Yger

Abstract: Epistemic social choice aims at unveiling a hidden ground truth given votes, which are interpreted as noisy signals about it. We consider here a simple setting where votes consist of approval ballots: each voter approves a set of alternatives which they believe can possibly be the ground truth. Based on the intuitive idea that more reliable votes contain fewer alternatives, we define several noise models that are approval voting variants of the Mallows model. The likelihood-maximizing alternative is then characterized as the winner of a weighted approval rule, where the weight of a ballot decreases with its cardinality. Experiments on three image annotation datasets show that rules based on our noise model outperform standard approval voting; the best performance is obtained by a variant of the Condorcet noise model.

Submitted 7 December, 2021; originally announced December 2021.

Comments: Accepted in the 36th AAAI Conference on Artificial Intelligence (AAAI 2022)
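
The weighted approval rule described in the abstract of "Truth-tracking via Approval Voting: Size Matters" can be illustrated with a minimal sketch. The abstract only states that a ballot's weight decreases with its cardinality; the specific weight 1/|ballot| and the function name `weighted_approval_winner` below are assumptions made for illustration, not the weights derived in the paper.

```python
from collections import defaultdict


def weighted_approval_winner(ballots, weight=lambda size: 1.0 / size):
    """Return the alternative with the highest total weighted approval score.

    ballots: iterable of sets of approved alternatives.
    weight:  hypothetical weight as a function of ballot size; it should be
             decreasing so that smaller ballots count more.
    """
    scores = defaultdict(float)
    for ballot in ballots:
        if not ballot:
            continue  # an empty ballot approves nothing
        w = weight(len(ballot))
        for alternative in ballot:
            scores[alternative] += w
    # Ties are broken by the sorted order of the alternatives.
    return max(sorted(scores), key=lambda a: scores[a])


ballots = [{"a"}, {"a", "b"}, {"b", "c"}, {"b", "c", "d"}]
print(weighted_approval_winner(ballots))                         # size-weighted rule
print(weighted_approval_winner(ballots, weight=lambda s: 1.0))   # standard approval
```

With these toy ballots, standard approval voting selects "b" (three approvals), while the size-weighted rule selects "a", because the singleton ballot counts more than the larger ones.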

arXiv:2112.04288 [pdf, other]

Non parametric estimation of causal populations in a counterfactual scenario

Authors: Celine Beji, Florian Yger, Jamal Atif

Abstract: In causality, estimating the effect of a treatment without confounding inference remains a major issue because it requires assessing the outcome in both cases, with and without treatment. Since both cannot be observed simultaneously, the estimation of potential outcomes remains a challenging task. We propose an innovative approach where the problem is reformulated as a missing data model. The aim is to estimate the hidden distribution of causal populations, defined as a function of treatment and outcome. A Causal Auto-Encoder (CAE), enhanced by a prior dependent on treatment and outcome information, assimilates the latent space to the probability distribution of the target populations. The features are reconstructed after being reduced to a latent space and constrained by a mask introduced in the intermediate layer of the network, containing treatment and outcome information.

Submitted 8 December, 2021; originally announced December 2021.

arXiv:2111.14565 [pdf, other]

A new Sinkhorn algorithm with Deletion and Insertion operations

Authors: Luc Brun, Benoit Gaüzère, Sébastien Bougleux, Florian Yger

Abstract: This technical report is devoted to the continuous estimation of an epsilon-assignment. Roughly speaking, an epsilon-assignment between two sets V1 and V2 may be understood as a bijective mapping between a sub-part of V1 and a sub-part of V2. The remaining elements of V1 (not included in this mapping) are mapped onto an epsilon pseudo-element of V2. We say that such elements are deleted. Conversely, the remaining elements of V2 correspond to the image of the epsilon pseudo-element of V1. We say that these elements are inserted. As a result, our method provides an output similar to that of the Sinkhorn algorithm, with the additional ability to reject some elements, which are either inserted or deleted. It thus naturally handles sets V1 and V2 of different sizes and decides mappings/insertions/deletions in a unified way. Our algorithms are iterative and differentiable and may thus be easily inserted within a backpropagation-based learning framework such as artificial neural networks.

Submitted 18 January, 2022; v1 submitted 29 November, 2021; originally announced November 2021.

Comments: 20 pages

arXiv:2111.03122 [pdf, other]

Functional connectivity ensemble method to enhance BCI performance (FUCONE)

Authors: Marie-Constance Corsi, Sylvain Chevallier, Fabrizio De Vico Fallani, Florian Yger

Abstract: Functional connectivity is a key approach to investigate oscillatory activities of the brain that provides important insights on the underlying dynamics of neuronal interactions and that is mostly applied for brain activity analysis. Building on the advances in information geometry for brain-computer interface, we propose a novel framework that combines functional connectivity estimators and covariance-based pipelines to classify mental states, such as motor imagery. A Riemannian classifier is trained for each estimator and an ensemble classifier combines the decisions in each feature space. A thorough assessment of the functional connectivity estimators is provided and the best performing pipeline, called FUCONE, is evaluated on different conditions and datasets. Using a meta-analysis to aggregate results across datasets, FUCONE performed significantly better than all state-of-the-art methods. The performance gain is mostly attributable to the improved diversity of the feature spaces, increasing the robustness of the ensemble classifier with respect to the inter- and intra-subject variability.

Submitted 16 February, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

arXiv:2107.14525 [pdf, ps, other]

The Minimum Edit Arborescence Problem and Its Use in Compressing Graph Collections [Extended Version]

Authors: Lucas Gnecco, Nicolas Boria, Sébastien Bougleux, Florian Yger, David B. Blumenthal

Abstract: The inference of minimum spanning arborescences within a set of objects is a general problem which translates into numerous application-specific unsupervised learning tasks. We introduce a unified and generic structure called edit arborescence that relies on edit paths between data in a collection, as well as the Min Edit Arborescence Problem, which asks for an edit arborescence that minimizes the sum of costs of its inner edit paths. Through the use of suitable cost functions, this generic framework can model a variety of problems. In particular, we show that by introducing encoding size preserving edit costs, it can be used as an efficient method for compressing collections of labeled graphs. Experiments on various graph datasets, with comparisons to standard compression tools, show the potential of our method.

Submitted 30 July, 2021; originally announced July 2021.

arXiv:2107.08135 [pdf, other]

Mediated Uncoupled Learning: Learning Functions without Direct Input-output Correspondences

Authors: Ikko Yamane, Junya Honda, Florian Yger, Masashi Sugiyama

Abstract: Ordinary supervised learning is useful when we have paired training data of input $X$ and output $Y$. However, such paired data can be difficult to collect in practice. In this paper, we consider the task of predicting $Y$ from $X$ when we have no paired data of them, but we have two separate, independent datasets of $X$ and $Y$ each observed with some mediating variable $U$, that is, we have two datasets $S_X = \{(X_i, U_i)\}$ and $S_Y = \{(U'_j, Y'_j)\}$. A naive approach is to predict $U$ from $X$ using $S_X$ and then $Y$ from $U$ using $S_Y$, but we show that this is not statistically consistent. Moreover, predicting $U$ can be more difficult than predicting $Y$ in practice, e.g., when $U$ has higher dimensionality. To circumvent the difficulty, we propose a new method that avoids predicting $U$ but directly learns $Y = f(X)$ by training $f(X)$ with $S_{X}$ to predict $h(U)$ which is trained with $S_{Y}$ to approximate $Y$. We prove statistical consistency and error bounds of our method and experimentally confirm its practical usefulness.

Submitted 17 July, 2022; v1 submitted 16 July, 2021; originally announced July 2021.

Comments: ICML 2021 version with correction to Figure 1 and the appendices

arXiv:2107.01994 [pdf, ps, other]

Template-Based Graph Clustering

Authors: Mateus Riva, Florian Yger, Pietro Gori, Roberto M. Cesar Jr., Isabelle Bloch

Abstract: We propose a novel graph clustering method guided by additional information on the underlying structure of the clusters (or communities). The problem is formulated as the matching of a graph to a template with smaller dimension, hence matching $n$ vertices of the observed graph (to be clustered) to the $k$ vertices of a template graph, using its edges as support information, and relaxed on the set of orthonormal matrices in order to find a $k$ dimensional embedding. With relevant priors that encode the density of the clusters and their relationships, our method outperforms classical methods, especially for challenging cases.

Submitted 5 July, 2021; originally announced July 2021.

Comments: ECML-PKDD, Workshop on Graph Embedding and Mining (GEM) 2020

Journal ref: ECML-PKDD, Workshop on Graph Embedding and Mining (GEM) 2020

arXiv:2104.04040 [pdf, ps, other]

Scaling up graph homomorphism for classification via sampling

Authors: Paul Beaujean, Florian Sikora, Florian Yger

Abstract: Feature generation is an open topic of investigation in graph machine learning. In this paper, we study the use of graph homomorphism density features as a scalable alternative to homomorphism numbers which retain similar theoretical properties and ability to take into account inductive bias. For this, we propose a high-performance implementation of a simple sampling algorithm which computes additive approximations of homomorphism densities. In the context of graph machine learning, we demonstrate in experiments that simple linear models trained on sample homomorphism densities can achieve performance comparable to graph neural networks on standard graph classification datasets. Finally, we show in experiments on synthetic data that this algorithm scales to very large graphs when implemented with Bloom filters.

Submitted 8 April, 2021; originally announced April 2021.

Comments: 17 pages, 1 figure

ACM Class: I.5.1; I.5.2
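
The sampling idea in the abstract of "Scaling up graph homomorphism for classification via sampling" can be sketched as a plain Monte Carlo estimator: draw uniformly random vertex maps from a pattern graph F to a target graph G and count how often every pattern edge lands on an edge of G. This is an assumed, simplified reading of the abstract; the function name `sample_hom_density` and its parameters are hypothetical, and the Bloom-filter implementation mentioned by the authors is not reproduced here.

```python
import random


def sample_hom_density(pattern_edges, k, target_adj, n_samples=1000, seed=0):
    """Estimate the homomorphism density t(F, G) by uniform sampling.

    pattern_edges: edges (a, b) of the pattern F over vertices 0..k-1.
    target_adj:    dict mapping each vertex of G to the set of its neighbours.
    """
    rng = random.Random(seed)
    vertices = list(target_adj)
    hits = 0
    for _ in range(n_samples):
        # Draw a uniformly random map from V(F) to V(G).
        phi = [rng.choice(vertices) for _ in range(k)]
        # It is a homomorphism if every pattern edge is sent to an edge of G.
        if all(phi[b] in target_adj[phi[a]] for a, b in pattern_edges):
            hits += 1
    return hits / n_samples


# Complete graph on 4 vertices, no self-loops.
k4 = {v: {w for w in range(4) if w != v} for v in range(4)}
single_edge = [(0, 1)]
print(sample_hom_density(single_edge, 2, k4, n_samples=2000))
```

For a single edge mapped into the complete graph on 4 vertices, the exact density is 12/16 = 0.75, so the printed estimate should be close to that value. The error of such an estimator is additive and shrinks with the number of samples, independently of the size of G.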

arXiv:2102.10923 [pdf, other]

Approximation of dilation-based spatial relations to add structural constraints in neural networks

Authors: Mateus Riva, Pietro Gori, Florian Yger, Roberto Cesar, Isabelle Bloch

Abstract: Spatial relations between objects in an image have proved useful for structural object recognition. Structural constraints can act as regularization in neural network training, improving generalization capability with small datasets. Several relations can be modeled as a morphological dilation of a reference object with a structuring element representing the semantics of the relation, from which the degree of satisfaction of the relation between another object and the reference object can be derived. However, dilation is not differentiable, requiring an approximation to be used in the context of gradient-descent training of a network. We propose to approximate dilations using convolutions based on a kernel equal to the structuring element. We show that the proposed approximation, even if slightly less accurate than previous approximations, is definitely faster to compute and therefore more suitable for computationally intensive neural network applications.

Submitted 22 February, 2021; originally announced February 2021.
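
The idea in the abstract of "Approximation of dilation-based spatial relations to add structural constraints in neural networks" can be illustrated in one dimension. The sketch below compares an exact flat dilation (a windowed maximum) with a convolution surrogate (a windowed sum with an all-ones kernel). The flat, symmetric structuring element and the clipping to [0, 1] are assumptions made for this illustration and may differ from the approximation used in the paper.

```python
def dilate(signal, radius):
    """Exact flat dilation: maximum over a window of half-width `radius`."""
    n = len(signal)
    return [max(signal[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


def conv_dilate(signal, radius):
    """Convolution surrogate: windowed sum (all-ones kernel), clipped to 1."""
    n = len(signal)
    return [min(1.0, sum(signal[max(0, i - radius):min(n, i + radius + 1)]))
            for i in range(n)]


binary = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
fuzzy = [0.0, 0.3, 0.4, 0.0, 0.0, 0.2]
print(dilate(binary, 1), conv_dilate(binary, 1))
print(dilate(fuzzy, 1), conv_dilate(fuzzy, 1))
```

On binary inputs the two operators agree. On graded (fuzzy) inputs the sum can only overestimate the maximum, which gives a feel for the kind of accuracy that is traded for a cheaper operation built from sums and products.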

arXiv:2102.10875 [pdf, other]

On the robustness of randomized classifiers to adversarial examples

Authors: Rafael Pinot, Laurent Meunier, Florian Yger, Cédric Gouy-Pailler, Yann Chevaleyre, Jamal Atif

Abstract: This paper investigates the theory of robustness against adversarial attacks. We focus on randomized classifiers (i.e., classifiers that output random variables) and provide a thorough analysis of their behavior through the lens of statistical learning theory and information theory. To this aim, we introduce a new notion of robustness for randomized classifiers, enforcing local Lipschitzness using probability metrics. Equipped with this definition, we make two new contributions. The first one consists in devising a new upper bound on the adversarial generalization gap of randomized classifiers. More precisely, we devise bounds on the generalization gap and the adversarial gap (i.e., the gap between the risk and the worst-case risk under attack) of randomized classifiers. The second contribution presents a simple yet efficient noise injection method to design robust randomized classifiers. We show that our results are applicable to a wide range of machine learning models under mild hypotheses. We further corroborate our findings with experimental results using deep neural networks on standard image datasets, namely CIFAR-10 and CIFAR-100. All robust models we trained can simultaneously achieve state-of-the-art accuracy (over $0.82$ clean accuracy on CIFAR-10) and enjoy guaranteed robust accuracy bounds ($0.45$ against $\ell_2$ adversaries with magnitude $0.5$ on CIFAR-10).

Submitted 22 February, 2021; originally announced February 2021.

arXiv:2102.06015 [pdf, other]

RIGOLETTO -- RIemannian GeOmetry LEarning: applicaTion To cOnnectivity. A contribution to the Clinical BCI Challenge -- WCCI2020

Authors: Marie-Constance Corsi, Florian Yger, Sylvain Chevallier, Camille Noûs

Abstract: This short technical report describes the approach submitted to the Clinical BCI Challenge-WCCI2020. This submission aims to classify motor imagery tasks from EEG signals and relies on Riemannian Geometry, with a twist. Instead of using the classical covariance matrices, we also rely on measures of functional connectivity. Our approach ranked 1st on task 1 of the competition.

Submitted 11 March, 2021; v1 submitted 9 February, 2021; originally announced February 2021.

Comments: 7 pages, 7 figures, submitted to ICASSP conference

arXiv:2004.05013 [pdf, ps, other]

Estimating Individual Treatment Effects through Causal Populations Identification

Authors: Céline Beji, Michaël Bon, Florian Yger, Jamal Atif

Abstract: Estimating the Individual Treatment Effect from observational data, defined as the difference between outcomes with and without treatment or intervention, while observing only one of the two, is a challenging problem in causal learning. In this paper, we formulate this problem as an inference from hidden variables and enforce causal constraints based on a model of four exclusive causal populations. We propose a new version of the EM algorithm, coined as Expected-Causality-Maximization (ECM) algorithm and provide hints on its convergence under mild conditions. We compare our algorithm to baseline methods on synthetic and real-world data and discuss its performances.

Submitted 6 May, 2020; v1 submitted 10 April, 2020; originally announced April 2020.

Comments: Accepted (to appear) in ESANN 2020 proceedings, European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Bruges (Belgium), 2-4 October 2020

arXiv:1906.07982 [pdf, ps, other]

A unified view on differential privacy and robustness to adversarial examples

Authors: Rafael Pinot, Florian Yger, Cédric Gouy-Pailler, Jamal Atif

Abstract: This short note highlights some links between two lines of research within the emerging topic of trustworthy machine learning: differential privacy and robustness to adversarial examples. By abstracting the definitions of both notions, we show that they build upon the same theoretical ground and hence results obtained so far in one domain can be transferred to the other. More precisely, our analysis is based on two key elements: probabilistic mappings (also called randomized algorithms in the differential privacy community), and the Renyi divergence which subsumes a large family of divergences. We first generalize the definition of robustness against adversarial examples to encompass probabilistic mappings. Then we observe that Renyi-differential privacy (a generalization of differential privacy recently proposed by Mironov, 2017) and our definition of robustness share several similarities. We finally discuss how both communities can benefit from this connection to transfer technical tools from one research field to the other.

Submitted 19 June, 2019; originally announced June 2019.

arXiv:1902.01148 [pdf, other]

Theoretical evidence for adversarial robustness through randomization

Authors: Rafael Pinot, Laurent Meunier, Alexandre Araujo, Hisashi Kashima, Florian Yger, Cédric Gouy-Pailler, Jamal Atif

Abstract: This paper investigates the theory of robustness against adversarial attacks. It focuses on the family of randomization techniques that consist in injecting noise in the network at inference time. These techniques have proven effective in many contexts, but lack theoretical arguments. We close this gap by presenting a theoretical analysis of these approaches, hence explaining why they perform well in practice. More precisely, we make two new contributions. The first one relates the randomization rate to robustness to adversarial attacks. This result applies for the general family of exponential distributions, and thus extends and unifies the previous approaches. The second contribution consists in devising a new upper bound on the adversarial generalization gap of randomized neural networks. We support our theoretical claims with a set of experiments.

Submitted 11 June, 2019; v1 submitted 4 February, 2019; originally announced February 2019.

arXiv:1803.05112 [pdf, other]

Uplift Modeling from Separate Labels

Authors: Ikko Yamane, Florian Yger, Jamal Atif, Masashi Sugiyama

Abstract: Uplift modeling is aimed at estimating the incremental impact of an action on an individual's behavior, which is useful in various application domains such as targeted marketing (advertisement campaigns) and personalized medicine (medical treatments). Conventional methods of uplift modeling require every instance to be jointly equipped with two types of labels: the taken action and its outcome. However, obtaining two labels for each instance at the same time is difficult or expensive in many real-world problems. In this paper, we propose a novel method of uplift modeling that is applicable to a more practical setting where only one type of labels is available for each instance. We show a mean squared error bound for the proposed estimator and demonstrate its effectiveness through experiments.

Submitted 20 November, 2018; v1 submitted 13 March, 2018; originally announced March 2018.

Comments: 17 pages, 7 figures, to appear in NeurIPS 2018

arXiv:1803.03831 [pdf, other]

Graph-based Clustering under Differential Privacy

Authors: Rafael Pinot, Anne Morvan, Florian Yger, Cédric Gouy-Pailler, Jamal Atif

Abstract: In this paper, we present the first differentially private clustering method for arbitrary-shaped node clusters in a graph. This algorithm takes as input only an approximate Minimum Spanning Tree (MST) $\mathcal{T}$ released under weight differential privacy constraints from the graph. Then, the underlying nonconvex clustering partition is successfully recovered from cutting optimal cuts on $\mathcal{T}$. As opposed to existing methods, our algorithm is theoretically well-motivated. Experiments support our theoretical findings.

Submitted 10 March, 2018; originally announced March 2018.

arXiv:1709.01584 [pdf, other]

Using Posters to Recommend Anime and Mangas in a Cold-Start Scenario

Authors: Jill-Jênn Vie, Florian Yger, Ryan Lahfa, Basile Clement, Kévin Cocchi, Thomas Chalumeau, Hisashi Kashima

Abstract: Item cold-start is a classical issue in recommender systems that affects anime and manga recommendations as well. This problem can be framed as follows: how to predict whether a user will like a manga that received few ratings from the community? Content-based techniques can alleviate this issue but require extra information, that is usually expensive to gather. In this paper, we use a deep learning technique, Illustration2Vec, to easily extract tag information from the manga and anime posters (e.g., sword, or ponytail). We propose BALSE (Blended Alternate Least Squares with Explanation), a new model for collaborative filtering, that benefits from this extra information to recommend mangas. We show, using real data from an online manga recommender system called Mangaki, that our model improves substantially the quality of recommendations, especially for less-known manga, and is able to provide an interpretation of the taste of the users.

Submitted 7 September, 2017; v1 submitted 3 September, 2017; originally announced September 2017.

Comments: 6 pages, 3 figures, 1 table, accepted at the MANPU 2017 workshop, co-located with ICDAR 2017 in Kyoto on November 10, 2017

arXiv:1605.07785 [pdf, other]

Geometry-aware stationary subspace analysis

Authors: Inbal Horev, Florian Yger, Masashi Sugiyama

Abstract: In many real-world applications data exhibits non-stationarity, i.e., its distribution changes over time. One approach to handling non-stationarity is to remove or minimize it before attempting to analyze the data. In the context of brain computer interface (BCI) data analysis this may be done by means of stationary subspace analysis (SSA). The classic SSA method finds a matrix that projects the data onto a stationary subspace by optimizing a cost function based on a matrix divergence. In this work we present an alternative method for SSA based on a symmetrized version of this matrix divergence. We show that this frames the problem in terms of distances between symmetric positive definite (SPD) matrices, suggesting a geometric interpretation of the problem. Stemming from this geometric viewpoint, we introduce and analyze a method which utilizes the geometry of the SPD matrix manifold and the invariance properties of its metrics. Most notably we show that these invariances alleviate the need to whiten the input matrices, a common step in many SSA methods which often introduces errors. We demonstrate the usefulness of our technique in experiments on both synthesized and real-world data.

Submitted 25 May, 2016; originally announced May 2016.

arXiv:1502.03505 [pdf, other]

Supervised LogEuclidean Metric Learning for Symmetric Positive Definite Matrices

Authors: Florian Yger, Masashi Sugiyama

Abstract: Metric learning has been shown to be highly effective to improve the performance of nearest neighbor classification. In this paper, we address the problem of metric learning for Symmetric Positive Definite (SPD) matrices such as covariance matrices, which arise in many real-world applications. Naively using standard Mahalanobis metric learning methods under the Euclidean geometry for SPD matrices is not appropriate, because the difference of SPD matrices can be a non-SPD matrix and thus the obtained solution can be uninterpretable. To cope with this problem, we propose to use a properly parameterized LogEuclidean distance and optimize the metric with respect to kernel-target alignment, which is a supervised criterion for kernel learning. Then the resulting non-trivial optimization problem is solved by utilizing the Riemannian geometry. Finally, we experimentally demonstrate the usefulness of our LogEuclidean metric learning algorithm on real-world classification tasks for EEG signals and texture patches.

Submitted 11 February, 2015; originally announced February 2015.

Comments: 19 pages, 6 figures, 3 tables

arXiv:1410.2663 [pdf, other]

Challenge IEEE-ISBI/TCB : Application of Covariance matrices and wavelet marginals

Authors: Florian Yger

Abstract: This short memo aims at explaining our approach for the challenge IEEE-ISBI on Bone Texture Characterization. In this work, we focus on the use of covariance matrices and wavelet marginals in an SVM classifier.

Submitted 9 October, 2014; originally announced October 2014.

Comments: 9 pages, 4 Figures, 2 Tables, Challenge IEEE-ISBI : Bone Texture Characterization

arXiv:1206.6453 [pdf]

Adaptive Canonical Correlation Analysis Based On Matrix Manifolds

Authors: Florian Yger, Maxime Berar, Gilles Gasso, Alain Rakotomamonjy

Abstract: In this paper, we formulate the Canonical Correlation Analysis (CCA) problem on matrix manifolds. This framework provides a natural way for dealing with matrix constraints and tools for building efficient algorithms even in an adaptive setting. Finally, an adaptive CCA algorithm is proposed and applied to a change detection problem in EEG signals.

Submitted 27 June, 2012; originally announced June 2012.

Comments: Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

Showing 1–26 of 26 results for author: Yger, F