-
Towards Efficient and Optimal Covariance-Adaptive Algorithms for Combinatorial Semi-Bandits
Authors:
Julien Zhou,
Pierre Gaillard,
Thibaud Rahier,
Houssam Zenati,
Julyan Arbel
Abstract:
We address the problem of stochastic combinatorial semi-bandits, where a player selects among $P$ actions from the power set of a set containing $d$ base items. Adaptivity to the problem's structure is essential in order to obtain optimal regret upper bounds. As estimating the coefficients of a covariance matrix can be manageable in practice, leveraging them should improve the regret. We design ``optimistic'' covariance-adaptive algorithms relying on online estimations of the covariance structure, called OLSUCBC and COSV (only the variances for the latter). They both yield improved gap-free regret. Although COSV can be slightly suboptimal, it improves on computational complexity by taking inspiration from Thompson Sampling approaches. It is the first sampling-based algorithm satisfying a $\sqrt{T}$ gap-free regret (up to poly-logs). We also show that in some cases, our approach efficiently leverages the semi-bandit feedback and outperforms bandit feedback approaches, not only in exponential regimes where $P\gg d$ but also when $P\leq d$, which is not covered by existing analyses.
Submitted 3 July, 2024; v1 submitted 23 February, 2024;
originally announced February 2024.
-
Sequential Counterfactual Risk Minimization
Authors:
Houssam Zenati,
Eustache Diemert,
Matthieu Martin,
Julien Mairal,
Pierre Gaillard
Abstract:
Counterfactual Risk Minimization (CRM) is a framework for dealing with the logged bandit feedback problem, where the goal is to improve a logging policy using offline data. In this paper, we explore the case where it is possible to deploy learned policies multiple times and acquire new data. We extend the CRM principle and its theory to this scenario, which we call "Sequential Counterfactual Risk Minimization (SCRM)." We introduce a novel counterfactual estimator and identify conditions that can improve the performance of CRM in terms of excess risk and regret rates, by using an analysis similar to restart strategies in accelerated optimization methods. We also provide an empirical evaluation of our method in both discrete and continuous action settings, and demonstrate the benefits of multiple deployments of CRM.
Submitted 25 May, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Nested bandits
Authors:
Matthieu Martin,
Panayotis Mertikopoulos,
Thibaud Rahier,
Houssam Zenati
Abstract:
In many online decision processes, the optimizing agent is called to choose between large numbers of alternatives with many inherent similarities; in turn, these similarities imply closely correlated losses that may confound standard discrete choice models and bandit algorithms. We study this question in the context of nested bandits, a class of adversarial multi-armed bandit problems where the learner seeks to minimize their regret in the presence of a large number of distinct alternatives with a hierarchy of embedded (non-combinatorial) similarities. In this setting, optimal algorithms based on the exponential weights blueprint (like Hedge, EXP3, and their variants) may incur significant regret because they tend to spend excessive amounts of time exploring irrelevant alternatives with similar, suboptimal costs. To account for this, we propose a nested exponential weights (NEW) algorithm that performs a layered exploration of the learner's set of alternatives based on a nested, step-by-step selection method. In so doing, we obtain a series of tight bounds for the learner's regret showing that online learning problems with a high degree of similarity between alternatives can be resolved efficiently, without a red bus / blue bus paradox occurring.
Submitted 19 June, 2022;
originally announced June 2022.
-
Efficient Kernel UCB for Contextual Bandits
Authors:
Houssam Zenati,
Alberto Bietti,
Eustache Diemert,
Julien Mairal,
Matthieu Martin,
Pierre Gaillard
Abstract:
In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require O(CT^3) complexity, where T is the horizon and the constant C is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nystrom approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of O(CTm^2), where m is the number of Nystrom points. To recover the same regret as the standard kernelized UCB algorithm, m needs to be of the order of the effective dimension of the problem, which is at most O(\sqrt{T}) and nearly constant in some cases.
Submitted 11 February, 2022;
originally announced February 2022.
-
Counterfactual Learning of Stochastic Policies with Continuous Actions: from Models to Offline Evaluation
Authors:
Houssam Zenati,
Alberto Bietti,
Matthieu Martin,
Eustache Diemert,
Pierre Gaillard,
Julien Mairal
Abstract:
Counterfactual reasoning from logged data has become increasingly important for many applications such as web advertising or healthcare. In this paper, we address the problem of learning stochastic policies with continuous actions from the viewpoint of counterfactual risk minimization (CRM). While the CRM framework is appealing and well studied for discrete actions, the continuous action case raises new challenges about modelization, optimization, and offline model selection with real data which turns out to be particularly challenging. Our paper contributes to these three aspects of the CRM estimation pipeline. First, we introduce a modelling strategy based on a joint kernel embedding of contexts and actions, which overcomes the shortcomings of previous discretization approaches. Second, we empirically show that the optimization aspect of counterfactual learning is important, and we demonstrate the benefits of proximal point algorithms and differentiable estimators. Finally, we propose an evaluation protocol for offline policies in real-world logged systems, which is challenging since policies cannot be replayed on test data, and we release a new large-scale dataset along with multiple synthetic, yet realistic, evaluation setups.
Submitted 14 December, 2022; v1 submitted 22 April, 2020;
originally announced April 2020.
-
RGB-Topography and X-rays Image Registration for Idiopathic Scoliosis Children Patient Follow-up
Authors:
Insaf Setitra,
Noureddine Aouaa,
Abdelkrim Meziane,
Afef Benrabia,
Houria Kaced,
Hanene Belabassi,
Sara Ait Ziane,
Nadia Henda Zenati,
Oualid Djekkoune
Abstract:
Children diagnosed with scoliosis are exposed to ionizing radiation at each X-ray examination during their follow-up. This exposure can have negative effects on the patient's health and cause diseases in adulthood. In order to reduce X-ray scanning, recent systems provide diagnosis of scoliosis patients using solely RGB images. The output of such systems is a set of augmented images and scoliosis-related angles. These angles, however, confuse physicians due to their large number. Moreover, the lack of X-ray scans makes it impossible for the physician to compare RGB and X-ray images and decide whether or not to reduce X-ray exposure. In this work, we exploit both RGB images of scoliosis captured during clinical diagnosis and X-ray hard copies provided by patients in order to register both images and give a rich comparison of diagnoses. The work consists of, first, establishing the monomodal (RGB topography of the back) and multimodal (RGB and X-ray) image database, then registering images based on patient landmarks, and finally blending registered images for visual analysis and follow-up by the physician. The proposed registration is based on a rigid transformation that preserves the topology of the patient's back. Parameters of the rigid transformation are estimated using a proposed angle minimization of the Cervical vertebra 7 and Posterior Superior Iliac Spine landmarks of a source and a target diagnosis. Experiments conducted on the constructed database show better monomodal and multimodal registration using our proposed method compared to an Equation System Solving based registration.
Submitted 20 March, 2020;
originally announced March 2020.
-
Semi-Supervised Deep Learning for Abnormality Classification in Retinal Images
Authors:
Bruno Lecouat,
Ken Chang,
Chuan-Sheng Foo,
Balagopal Unnikrishnan,
James M. Brown,
Houssam Zenati,
Andrew Beers,
Vijay Chandrasekhar,
Jayashree Kalpathy-Cramer,
Pavitra Krishnaswamy
Abstract:
Supervised deep learning algorithms have enabled significant performance gains in medical image classification tasks. But these methods rely on large labeled datasets that require resource-intensive expert annotation. Semi-supervised generative adversarial network (GAN) approaches offer a means to learn from limited labeled data alongside larger unlabeled datasets, but have not been applied to discern fine-scale, sparse or localized features that define medical abnormalities. To overcome these limitations, we propose a patch-based semi-supervised learning approach and evaluate performance on classification of diabetic retinopathy from funduscopic images. Our semi-supervised approach achieves high AUC with just 10-20 labeled training images, and outperforms the supervised baselines by up to 15% when less than 30% of the training dataset is labeled. Further, our method implicitly enables interpretation of the SSL predictions. As this approach enables good accuracy, resolution and interpretability with lower annotation burden, it sets the pathway for scalable applications of deep learning in clinical imaging.
Submitted 19 December, 2018;
originally announced December 2018.
-
Adversarially Learned Anomaly Detection
Authors:
Houssam Zenati,
Manon Romain,
Chuan Sheng Foo,
Bruno Lecouat,
Vijay Ramaseshan Chandrasekhar
Abstract:
Anomaly detection is a significant and hence well-studied problem. However, developing effective anomaly detection methods for complex and high-dimensional data remains a challenge. As Generative Adversarial Networks (GANs) are able to model the complex high-dimensional distributions of real-world data, they offer a promising approach to address this challenge. In this work, we propose an anomaly detection method, Adversarially Learned Anomaly Detection (ALAD) based on bi-directional GANs, that derives adversarially learned features for the anomaly detection task. ALAD then uses reconstruction errors based on these adversarially learned features to determine if a data sample is anomalous. ALAD builds on recent advances to ensure data-space and latent-space cycle-consistencies and stabilize GAN training, which results in significantly improved anomaly detection performance. ALAD achieves state-of-the-art performance on a range of image and tabular datasets while being several hundred-fold faster at test time than the only published GAN-based method.
Submitted 5 December, 2018;
originally announced December 2018.
-
Manifold regularization with GANs for semi-supervised learning
Authors:
Bruno Lecouat,
Chuan-Sheng Foo,
Houssam Zenati,
Vijay Chandrasekhar
Abstract:
Generative Adversarial Networks are powerful generative models that are able to model the manifold of natural images. We leverage this property to perform manifold regularization by approximating a variant of the Laplacian norm using a Monte Carlo approximation that is easily computed with the GAN. When incorporated into the semi-supervised feature-matching GAN we achieve state-of-the-art results for GAN-based semi-supervised learning on CIFAR-10 and SVHN benchmarks, with a method that is significantly easier to implement than competing methods. We also find that manifold regularization improves the quality of generated images, and is affected by the quality of the GAN used to approximate the regularizer.
Submitted 11 July, 2018;
originally announced July 2018.
-
Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile
Authors:
Panayotis Mertikopoulos,
Bruno Lecouat,
Houssam Zenati,
Chuan-Sheng Foo,
Vijay Chandrasekhar,
Georgios Piliouras
Abstract:
Owing to their connection with generative adversarial networks (GANs), saddle-point problems have recently attracted considerable interest in machine learning and beyond. By necessity, most theoretical guarantees revolve around convex-concave (or even linear) problems; however, making theoretical inroads towards efficient GAN training depends crucially on moving beyond this classic framework. To make piecemeal progress along these lines, we analyze the behavior of mirror descent (MD) in a class of non-monotone problems whose solutions coincide with those of a naturally associated variational inequality - a property which we call coherence. We first show that ordinary, "vanilla" MD converges under a strict version of this condition, but not otherwise; in particular, it may fail to converge even in bilinear models with a unique solution. We then show that this deficiency is mitigated by optimism: by taking an "extra-gradient" step, optimistic mirror descent (OMD) converges in all coherent problems. Our analysis generalizes and extends the results of Daskalakis et al. (2018) for optimistic gradient descent (OGD) in bilinear problems, and makes concrete headway for establishing convergence beyond convex-concave games. We also provide stochastic analogues of these results, and we validate our analysis by numerical experiments in a wide array of GAN models (including Gaussian mixture models, as well as the CelebA and CIFAR-10 datasets).
Submitted 1 October, 2018; v1 submitted 7 July, 2018;
originally announced July 2018.
-
Semi-Supervised Learning with GANs: Revisiting Manifold Regularization
Authors:
Bruno Lecouat,
Chuan-Sheng Foo,
Houssam Zenati,
Vijay R. Chandrasekhar
Abstract:
GANs are powerful generative models that are able to model the manifold of natural images. We leverage this property to perform manifold regularization by approximating the Laplacian norm using a Monte Carlo approximation that is easily computed with the GAN. When incorporated into the feature-matching GAN of Improved GAN, we achieve state-of-the-art results for GAN-based semi-supervised learning on the CIFAR-10 dataset, with a method that is significantly easier to implement than competing methods.
Submitted 23 May, 2018;
originally announced May 2018.
-
Efficient GAN-Based Anomaly Detection
Authors:
Houssam Zenati,
Chuan Sheng Foo,
Bruno Lecouat,
Gaurav Manek,
Vijay Ramaseshan Chandrasekhar
Abstract:
Generative adversarial networks (GANs) are able to model the complex high-dimensional distributions of real-world data, which suggests they could be effective for anomaly detection. However, few works have explored the use of GANs for the anomaly detection task. We leverage recently developed GAN models for anomaly detection, and achieve state-of-the-art performance on image and network intrusion datasets, while being several hundred-fold faster at test time than the only published GAN-based method.
Submitted 1 May, 2019; v1 submitted 17 February, 2018;
originally announced February 2018.