Showing 1–12 of 12 results for author: Colin, I

Searching in archive cs.
  1. arXiv:2402.05525  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Differentially Private Model-Based Offline Reinforcement Learning

    Authors: Alexandre Rio, Merwan Barlier, Igor Colin, Albert Thomas

    Abstract: We address offline reinforcement learning with privacy guarantees, where the goal is to train a policy that is differentially private with respect to individual trajectories in the dataset. To achieve this, we introduce DP-MORL, an MBRL algorithm coming with differential privacy guarantees. A private model of the environment is first learned from offline data using DP-FedAvg, a training method for…

    Submitted 8 February, 2024; originally announced February 2024.

  2. arXiv:2309.08710  [pdf, other]

    cs.LG stat.ML

    Clustered Multi-Agent Linear Bandits

    Authors: Hamza Cherkaoui, Merwan Barlier, Igor Colin

    Abstract: We address in this paper a particular instance of the multi-agent linear stochastic bandit problem, called clustered multi-agent linear bandits. In this setting, we propose a novel algorithm leveraging an efficient collaboration between the agents in order to accelerate the overall optimization problem. In this contribution, a network controller is responsible for estimating the underlying cluster…

    Submitted 30 October, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

    Comments: 20 pages, 10 figures

  3. arXiv:2309.08709  [pdf, other]

    stat.ML cs.LG

    Price of Safety in Linear Best Arm Identification

    Authors: Xuedong Shang, Igor Colin, Merwan Barlier, Hamza Cherkaoui

    Abstract: We introduce the safe best-arm identification framework with linear feedback, where the agent is subject to some stage-wise safety constraint that linearly depends on an unknown parameter vector. The agent must take actions in a conservative way so as to ensure that the safety constraint is not violated with high probability at each round. Ways of leveraging the linear structure for ensuring safet…

    Submitted 15 September, 2023; originally announced September 2023.

    Comments: 20 pages, 1 figure

  4. arXiv:2206.00466  [pdf, other]

    cs.LG stat.ML

    An $α$-No-Regret Algorithm For Graphical Bilinear Bandits

    Authors: Geovani Rizk, Igor Colin, Albert Thomas, Rida Laraki, Yann Chevaleyre

    Abstract: We propose the first regret-based approach to the Graphical Bilinear Bandits problem, where $n$ agents in a graph play a stochastic bilinear bandit game with each of their neighbors. This setting reveals a combinatorial NP-hard problem that prevents the use of any existing regret-based algorithm in the (bi-)linear bandit literature. In this paper, we fill this gap and present the first regret-base…

    Submitted 12 October, 2022; v1 submitted 1 June, 2022; originally announced June 2022.

  5. arXiv:2012.15726  [pdf, other]

    math.ST cs.LG stat.ME

    Refined bounds for randomized experimental design

    Authors: Geovani Rizk, Igor Colin, Albert Thomas, Moez Draief

    Abstract: Experimental design is an approach for selecting samples among a given set so as to obtain the best estimator for a given criterion. In the context of linear regression, several optimal designs have been derived, each associated with a different criterion: mean square error, robustness, \emph{etc}. Computing such designs is generally an NP-hard problem and one can instead rely on a convex relaxati…

    Submitted 22 December, 2020; originally announced December 2020.

  6. arXiv:2012.07641  [pdf, other]

    cs.LG

    Best Arm Identification in Graphical Bilinear Bandits

    Authors: Geovani Rizk, Albert Thomas, Igor Colin, Rida Laraki, Yann Chevaleyre

    Abstract: We introduce a new graphical bilinear bandit problem where a learner (or a \emph{central entity}) allocates arms to the nodes of a graph and observes for each edge a noisy bilinear reward representing the interaction between the two end nodes. We study the best arm identification problem in which the learner wants to find the graph allocation maximizing the sum of the bilinear rewards. By efficien…

    Submitted 10 June, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

  7. arXiv:1910.05104  [pdf, other]

    stat.ML cs.DC cs.LG

    Theoretical Limits of Pipeline Parallel Optimization and Application to Distributed Deep Learning

    Authors: Igor Colin, Ludovic Dos Santos, Kevin Scaman

    Abstract: We investigate the theoretical limits of pipeline parallel learning of deep learning architectures, a distributed setup in which the computation is distributed per layer instead of per example. For smooth convex and non-convex objective functions, we provide matching lower and upper complexity bounds and show that a naive pipeline parallelization of Nesterov's accelerated gradient descent is optim…

    Submitted 11 October, 2019; originally announced October 2019.

  8. arXiv:1902.01931  [pdf, other]

    cs.NI cs.LG stat.ML

    Parallel Contextual Bandits in Wireless Handover Optimization

    Authors: Igor Colin, Albert Thomas, Moez Draief

    Abstract: As cellular networks become denser, a scalable and dynamic tuning of wireless base station parameters can only be achieved through automated optimization. Although the contextual bandit framework arises as a natural candidate for such a task, its extension to a parallel setting is not straightforward: one needs to carefully adapt existing methods to fully leverage the multi-agent structure of this…

    Submitted 21 January, 2019; originally announced February 2019.

  9. arXiv:1610.01417  [pdf, other]

    stat.ML cs.LG

    Decentralized Topic Modelling with Latent Dirichlet Allocation

    Authors: Igor Colin, Christophe Dupuy

    Abstract: Privacy preserving networks can be modelled as decentralized networks (e.g., sensors, connected objects, smartphones), where communication between nodes of the network is not controlled by an all-knowing, central node. For this type of networks, the main issue is to gather/learn global information on the network (e.g., by optimizing a global cost function) while keeping the (sensitive) information…

    Submitted 5 October, 2016; originally announced October 2016.

  10. arXiv:1606.02421  [pdf, other]

    stat.ML cs.AI cs.DC cs.LG eess.SY

    Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions

    Authors: Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon

    Abstract: In decentralized networks (of sensors, connected objects, etc.), there is an important need for efficient algorithms to optimize a global cost function, for instance to learn a global model from the local data collected by each computing unit. In this paper, we address the problem of decentralized minimization of pairwise functions of the data points, where these points are distributed over the no…

    Submitted 8 June, 2016; originally announced June 2016.

  11. arXiv:1511.05464  [pdf, other]

    stat.ML cs.DC cs.LG eess.SY stat.CO

    Extending Gossip Algorithms to Distributed Estimation of U-Statistics

    Authors: Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon

    Abstract: Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems. Whereas distributed estimation of sample mean statistics has been the subject of a good deal of attention, computation of $U$-statistics, relying on more expensive averaging over pairs of observations, is a less investigated area. Yet, such data functionals are essential to describe…

    Submitted 17 November, 2015; originally announced November 2015.

    Comments: to be presented at NIPS 2015

    MSC Class: 68Uxx; 62J15; 68Q32; 62-04;

  12. arXiv:1501.02629  [pdf, other]

    stat.ML cs.AI cs.LG

    Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics

    Authors: Stéphan Clémençon, Aurélien Bellet, Igor Colin

    Abstract: In a wide range of statistical learning problems such as ranking, clustering or metric learning among others, the risk is accurately estimated by $U$-statistics of degree $d\geq 1$, i.e. functionals of the training data with low variance that take the form of averages over $k$-tuples. From a computational perspective, the calculation of such statistics is highly expensive even for a moderate sampl…

    Submitted 19 April, 2016; v1 submitted 12 January, 2015; originally announced January 2015.

    Comments: To appear in Journal of Machine Learning Research. 34 pages. v2: minor correction to Theorem 4 and its proof, added 1 reference. v3: typo corrected in Proposition 3. v4: improved presentation, added experiments on model selection for clustering, fixed minor typos

    Journal ref: Journal of Machine Learning Research 17(76):1-36, 2016