Showing 1–17 of 17 results for author: Proutiere, A

  1. arXiv:2208.08480  [pdf, other]

    cs.LG math.ST stat.ML

    Nearly Optimal Latent State Decoding in Block MDPs

    Authors: Yassir Jedra, Junghyun Lee, Alexandre Proutière, Se-Young Yun

    Abstract: We investigate the problems of model estimation and reward-free learning in episodic Block MDPs. In these MDPs, the decision maker has access to rich observations or contexts generated from a small number of latent states. We are first interested in estimating the latent state decoding function (the mapping from the observations to latent states) based on data generated under a fixed behavior poli…

    Submitted 24 February, 2023; v1 submitted 17 August, 2022; originally announced August 2022.

    Comments: Y. Jedra and J. Lee contributed equally; 100 pages, 3 figures; Accepted to the 26th International Conference on Artificial Intelligence and Statistics (AISTATS 2023)

  2. arXiv:2109.14429  [pdf, ps, other]

    cs.LG eess.SY math.OC stat.ML

    Minimal Expected Regret in Linear Quadratic Control

    Authors: Yassir Jedra, Alexandre Proutiere

    Abstract: We consider the problem of online learning in Linear Quadratic Control systems whose state transition and state-action transition matrices $A$ and $B$ may be initially unknown. We devise an online learning algorithm and provide guarantees on its expected regret. This regret at time $T$ is upper bounded (i) by $\widetilde{O}((d_u+d_x)\sqrt{d_xT})$ when $A$ and $B$ are unknown, (ii) by…

    Submitted 29 September, 2021; originally announced September 2021.

  3. arXiv:2003.07937  [pdf, ps, other]

    math.ST cs.LG eess.SY stat.ML

    Finite-time Identification of Stable Linear Systems: Optimality of the Least-Squares Estimator

    Authors: Yassir Jedra, Alexandre Proutiere

    Abstract: We present a new finite-time analysis of the estimation error of the Ordinary Least Squares (OLS) estimator for stable linear time-invariant systems. We characterize the number of observed samples (the length of the observed trajectory) sufficient for the OLS estimator to be $(\varepsilon,\delta)$-PAC, i.e., to yield an estimation error less than $\varepsilon$ with probability at least $1-\delta$. We show t…

    Submitted 26 March, 2020; v1 submitted 17 March, 2020; originally announced March 2020.

  4. arXiv:2002.08064  [pdf, other]

    math.OC

    Distributed Algorithms that Solve Boolean Equations with Local and Differential Privacies

    Authors: Hongsheng Qi, Bo Li, Rui-Juan Jing, Lei Wang, Alexandre Proutiere, Guodong Shi

    Abstract: In this paper, we propose distributed algorithms that solve a system of Boolean equations over a network, where each node in the network possesses only one Boolean equation from the system. The Boolean equation assigned at any particular node is a {\em private} equation known to this node only, and the nodes aim to compute the exact set of solutions to the system without exchanging their local equ…

    Submitted 3 March, 2021; v1 submitted 19 February, 2020; originally announced February 2020.

    Comments: 34 pages, 5 figures

  5. arXiv:1912.09705  [pdf, other]

    cs.LG math.OC stat.ML

    Distributed Online Optimization with Long-Term Constraints

    Authors: Deming Yuan, Alexandre Proutiere, Guodong Shi

    Abstract: We consider distributed online convex optimization problems, where the distributed system consists of various computing units connected through a time-varying communication graph. In each time step, each computing unit selects a constrained vector, experiences a loss equal to an arbitrary convex function evaluated at this vector, and may communicate to its neighbors in the graph. The objective is…

    Submitted 20 December, 2019; originally announced December 2019.

  6. arXiv:1906.11392  [pdf, other]

    math.OC cs.LG stat.ML

    From self-tuning regulators to reinforcement learning and back again

    Authors: Nikolai Matni, Alexandre Proutiere, Anders Rantzer, Stephen Tu

    Abstract: Machine and reinforcement learning (RL) are increasingly being applied to plan and control the behavior of autonomous systems interacting with the physical world. Examples include self-driving vehicles, distributed sensor networks, and agile robots. However, when machine learning is to be applied in these new settings, the algorithms had better come with the same type of reliability, robustness, a…

    Submitted 22 September, 2019; v1 submitted 26 June, 2019; originally announced June 2019.

    Comments: Tutorial paper, 2019 IEEE Conference on Decision and Control, to appear

  7. arXiv:1902.04774  [pdf, ps, other]

    cs.LG cs.DC math.OC stat.ML

    Distributed Online Linear Regression

    Authors: Deming Yuan, Alexandre Proutiere, Guodong Shi

    Abstract: We study online linear regression problems in a distributed setting, where the data is spread over a network. In each round, each network node proposes a linear predictor, with the objective of fitting the \emph{network-wide} data. It then updates its predictor for the next round according to the received local feedback and information received from neighboring nodes. The predictions made at a giv…

    Submitted 13 February, 2019; originally announced February 2019.

  8. arXiv:1712.09232  [pdf, other]

    math.PR cs.IT math.ST

    Clustering in Block Markov Chains

    Authors: Jaron Sanders, Alexandre Proutière, Se-Young Yun

    Abstract: This paper considers cluster detection in Block Markov Chains (BMCs). These Markov chains are characterized by a block structure in their transition matrix. More precisely, the $n$ possible states are divided into a finite number of $K$ groups or clusters, such that states in the same cluster exhibit the same transition rates to other states. One observes a trajectory of the Markov chain, and the…

    Submitted 29 July, 2019; v1 submitted 26 December, 2017; originally announced December 2017.

    Comments: 73 pages, 18 plots, second revision

  9. arXiv:1711.00400  [pdf, other]

    stat.ML cs.AI cs.LG math.OC

    Minimal Exploration in Structured Stochastic Bandits

    Authors: Richard Combes, Stefan Magureanu, Alexandre Proutiere

    Abstract: This paper introduces and addresses a wide class of stochastic bandit problems where the function mapping the arm to the corresponding reward exhibits some known structural properties. Most existing structures (e.g. linear, Lipschitz, unimodal, combinatorial, dueling, ...) are covered by our framework. We derive an asymptotic instance-specific regret lower bound for these problems, and develop OSS…

    Submitted 1 November, 2017; originally announced November 2017.

    Comments: 13 pages, NIPS 2017

  10. arXiv:1510.05956  [pdf, ps, other]

    math.PR cs.LG cs.SI stat.ML

    Optimal Cluster Recovery in the Labeled Stochastic Block Model

    Authors: Se-Young Yun, Alexandre Proutiere

    Abstract: We consider the problem of community detection or clustering in the labeled Stochastic Block Model (LSBM) with a finite number $K$ of clusters of sizes linearly growing with the global population of items $n$. Every pair of items is labeled independently at random, and label $\ell$ appears with probability $p(i,j,\ell)$ between two items in clusters indexed by $i$ and $j$, respectively. The object…

    Submitted 21 May, 2016; v1 submitted 20 October, 2015; originally announced October 2015.

    Comments: arXiv admin note: text overlap with arXiv:1412.7335

  11. arXiv:1504.03156  [pdf, ps, other]

    math.SP stat.ML

    Streaming, Memory Limited Matrix Completion with Noise

    Authors: Se-Young Yun, Marc Lelarge, Alexandre Proutiere

    Abstract: In this paper, we consider the streaming memory-limited matrix completion problem when the observed entries are noisy versions of a small random fraction of the original entries. We are interested in scenarios where the matrix size is very large so the matrix is very hard to store and manipulate. Here, columns of the observed matrix are presented sequentially and the goal is to complete the missin…

    Submitted 13 April, 2015; originally announced April 2015.

    Comments: 21 pages

  12. arXiv:1502.03475  [pdf, other]

    cs.LG math.OC stat.ML

    Combinatorial Bandits Revisited

    Authors: Richard Combes, M. Sadegh Talebi, Alexandre Proutiere, Marc Lelarge

    Abstract: This paper investigates stochastic and adversarial combinatorial multi-armed bandit problems. In the stochastic setting under semi-bandit feedback, we derive a problem-specific regret lower bound, and discuss its scaling with the dimension of the decision space. We propose ESCB, an algorithm that efficiently exploits the structure of the problem and provide a finite-time analysis of its regret. ES…

    Submitted 5 November, 2015; v1 submitted 11 February, 2015; originally announced February 2015.

    Comments: 30 pages, Advances in Neural Information Processing Systems 28 (NIPS 2015)

  13. arXiv:1309.7367  [pdf, other]

    cs.NI cs.LG math.OC

    Stochastic Online Shortest Path Routing: The Value of Feedback

    Authors: M. Sadegh Talebi, Zhenhua Zou, Richard Combes, Alexandre Proutiere, Mikael Johansson

    Abstract: This paper studies online shortest path routing over multi-hop networks. Link costs or delays are time-varying and modeled by independent and identically distributed random processes, whose parameters are initially unknown. The parameters, and hence the optimal path, can only be estimated by routing packets through the network and observing the realized delays. Our aim is to find a routing policy…

    Submitted 18 January, 2017; v1 submitted 27 September, 2013; originally announced September 2013.

    Comments: 18 pages

  14. arXiv:1309.2574  [pdf, ps, other]

    eess.SY cs.MA math.OC

    Randomized Consensus with Attractive and Repulsive Links

    Authors: Guodong Shi, Alexandre Proutiere, Mikael Johansson, Karl H. Johansson

    Abstract: We study convergence properties of a randomized consensus algorithm over a graph with both attractive and repulsive links. At each time instant, a node is randomly selected to interact with a random neighbor. Depending on if the link between the two nodes belongs to a given subgraph of attractive or repulsive links, the node update follows a standard attractive weighted average or a repulsive weig…

    Submitted 9 September, 2013; originally announced September 2013.

  15. arXiv:1302.6974  [pdf, ps, other]

    cs.LG cs.NI math.OC

    Spectrum Bandit Optimization

    Authors: Marc Lelarge, Alexandre Proutiere, M. Sadegh Talebi

    Abstract: We consider the problem of allocating radio channels to links in a wireless network. Links interact through interference, modelled as a conflict graph (i.e., two interfering links cannot be simultaneously active on the same channel). We aim at identifying the channel allocation maximizing the total network throughput over a finite time horizon. Should we know the average radio conditions on each c…

    Submitted 17 February, 2015; v1 submitted 27 February, 2013; originally announced February 2013.

    Comments: 21 pages

  16. arXiv:1210.6685  [pdf, ps, other]

    eess.SY cs.DC math.OC

    Distributed Optimization: Convergence Conditions from a Dynamical System Perspective

    Authors: Guodong Shi, Alexandre Proutiere, Karl Henrik Johansson

    Abstract: This paper explores the fundamental properties of distributed minimization of a sum of functions with each function only known to one node, and a pre-specified level of node knowledge and computational capacity. We define the optimization information each node receives from its objective function, the neighboring information each node receives from its neighbors, and the computational capacity eac…

    Submitted 24 October, 2012; originally announced October 2012.

  17. arXiv:math/0701363  [pdf, ps, other]

    math.PR

    A particle system in interaction with a rapidly varying environment: Mean field limits and applications

    Authors: Charles Bordenave, David McDonald, Alexandre Proutiere

    Abstract: We study an interacting particle system whose dynamics depends on an interacting random environment. As the number of particles grows large, the transition rate of the particles slows down (perhaps because they share a common resource of fixed capacity). The transition rate of a particle is determined by its state, by the empirical distribution of all the particles and by a rapidly varying envir…

    Submitted 16 February, 2009; v1 submitted 12 January, 2007; originally announced January 2007.

    Comments: 31 pages, 2 figures

    MSC Class: primary 60K35; secondary 60K37; 90B18