Showing 1–10 of 10 results for author: Szorenyi, B

  1. arXiv:1911.09157  [pdf, ps, other]

    cs.LG math.PR

    A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound

    Authors: Gal Dalal, Balazs Szorenyi, Gugan Thoppe

    Abstract: Policy evaluation in reinforcement learning is often conducted using two-timescale stochastic approximation, which results in various gradient temporal difference methods such as GTD(0), GTD2, and TDC. Here, we provide convergence rate bounds for this suite of algorithms. Algorithms such as these have two iterates, $θ_n$ and $w_n$, which are updated using two distinct stepsize sequences, $α_n$ and…

    Submitted 4 December, 2019; v1 submitted 20 November, 2019; originally announced November 2019.

  2. arXiv:1906.01009  [pdf, ps, other]

    cs.LG stat.ML

    Optimal Learning of Mallows Block Model

    Authors: Róbert Busa-Fekete, Dimitris Fotakis, Balázs Szörényi, Manolis Zampetakis

    Abstract: The Mallows model, introduced in the seminal paper of Mallows 1957, is one of the most fundamental ranking distributions over the symmetric group $S_m$. To analyze more complex ranking data, several studies considered the Generalized Mallows model defined by Fligner and Verducci 1986. Despite the significant research interest in ranking distributions, the exact sample complexity of estimating the p…

    Submitted 3 June, 2019; originally announced June 2019.

  3. arXiv:1905.12781  [pdf, other]

    cs.LG stat.ML

    Learning to Crawl

    Authors: Utkarsh Upadhyay, Robert Busa-Fekete, Wojciech Kotlowski, David Pal, Balazs Szorenyi

    Abstract: Web crawling is the problem of keeping a cache of webpages fresh, i.e., having the most recent copy available when a page is requested. This problem is usually coupled with the natural restriction that the bandwidth available to the web crawler is limited. The corresponding optimization problem was solved optimally by Azar et al. [2018] under the assumption that, for each webpage, both the elapsed…

    Submitted 22 November, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: Published at AAAI 2020

  4. arXiv:1902.02244  [pdf, other]

    cs.LG stat.ML

    Bandit Multiclass Linear Classification: Efficient Algorithms for the Separable Case

    Authors: Alina Beygelzimer, Dávid Pál, Balázs Szörényi, Devanathan Thiruvenkatachari, Chen-Yu Wei, Chicheng Zhang

    Abstract: We study the problem of efficient online multiclass linear classification with bandit feedback, where all examples belong to one of $K$ classes and lie in the $d$-dimensional Euclidean space. Previous works have left open the challenge of designing efficient algorithms with finite mistake bounds when the data is linearly separable by a margin $γ$. In this work, we take a first step towards this pr…

    Submitted 18 June, 2019; v1 submitted 6 February, 2019; originally announced February 2019.

    Comments: 41 pages, 8 figures

  5. arXiv:1901.05515  [pdf, other]

    cs.LG stat.ML

    The information-theoretic value of unlabeled data in semi-supervised learning

    Authors: Alexander Golovnev, Dávid Pál, Balázs Szörényi

    Abstract: We quantify the separation between the numbers of labeled examples required to learn in two settings: with and without knowledge of the distribution of the unlabeled data. More specifically, we prove a separation by a $Θ(\log n)$ multiplicative factor for the class of projections over the Boolean hypercube of dimension $n$. We prove that there is no separation for the class of all funct…

    Submitted 13 May, 2019; v1 submitted 16 January, 2019; originally announced January 2019.

  6. arXiv:1706.04933  [pdf, other]

    cs.LG

    Multi-objective Bandits: Optimizing the Generalized Gini Index

    Authors: Robert Busa-Fekete, Balazs Szorenyi, Paul Weng, Shie Mannor

    Abstract: We study the multi-armed bandit (MAB) problem where the agent receives vectorial feedback that encodes many possibly competing objectives to be optimized. The goal of the agent is to find a policy that can optimize these objectives simultaneously in a fair way. This multi-objective online optimization problem is formalized by using the Generalized Gini Index (GGI) aggregation function. We prop…

    Submitted 15 June, 2017; originally announced June 2017.

    Comments: 13 pages, 3 figures, draft version of ICML'17 paper

  7. arXiv:1704.01161  [pdf, other]

    cs.AI

    Finite Sample Analyses for TD(0) with Function Approximation

    Authors: Gal Dalal, Balázs Szörényi, Gugan Thoppe, Shie Mannor

    Abstract: TD(0) is one of the most commonly used algorithms in reinforcement learning. Despite this, there is no existing finite sample analysis for TD(0) with function approximation, even for the linear case. Our work is the first to provide such results. Existing convergence rates for Temporal Difference (TD) methods apply only to somewhat modified versions, e.g., projected variants or ones where stepsize…

    Submitted 11 December, 2017; v1 submitted 4 April, 2017; originally announced April 2017.

  8. arXiv:1703.05376  [pdf, other]

    cs.AI

    Finite Sample Analysis of Two-Timescale Stochastic Approximation with Applications to Reinforcement Learning

    Authors: Gal Dalal, Balazs Szorenyi, Gugan Thoppe, Shie Mannor

    Abstract: Two-timescale Stochastic Approximation (SA) algorithms are widely used in Reinforcement Learning (RL). Their iterates have two parts that are updated using distinct stepsizes. In this work, we develop a novel recipe for their finite sample analysis. Using this, we provide a concentration bound, which is the first such result for a two-timescale SA. The type of bound we obtain is known as `lock-in…

    Submitted 4 June, 2018; v1 submitted 15 March, 2017; originally announced March 2017.

  9. arXiv:1604.07706  [pdf, other]

    cs.LG cs.AI stat.ML

    Distributed Clustering of Linear Bandits in Peer to Peer Networks

    Authors: Nathan Korda, Balazs Szorenyi, Shuai Li

    Abstract: We provide two distributed confidence ball algorithms for solving linear bandit problems in peer to peer networks with limited communication capabilities. For the first, we assume that all the peers are solving the same linear bandit problem, and prove that our algorithm achieves the optimal asymptotic regret rate of any centralised algorithm that can instantly communicate information between the…

    Submitted 7 June, 2016; v1 submitted 26 April, 2016; originally announced April 2016.

    Comments: The 33rd ICML, 2016

  10. arXiv:1406.0017  [pdf, ps, other]

    cs.FL cs.DM

    Biclique coverings, rectifier networks and the cost of $\varepsilon$-removal

    Authors: Szabolcs Iván, Ádám Dániel Lelkes, Judit Nagy-György, Balázs Szörényi, György Turán

    Abstract: We relate two complexity notions of bipartite graphs: the minimal weight biclique covering number $\mathrm{Cov}(G)$ and the minimal rectifier network size $\mathrm{Rect}(G)$ of a bipartite graph $G$. We show that there exist graphs with $\mathrm{Cov}(G)\geq \mathrm{Rect}(G)^{3/2-ε}$. As a corollary, we establish that there exist nondeterministic finite automata (NFAs) with $\varepsilon$-transition…

    Submitted 30 May, 2014; originally announced June 2014.

    Comments: 12 pages, to appear in proceedings of DCFS 2014: 16th International Conference on Descriptional Complexity of Finite-State Systems

    MSC Class: 68R10

    ACM Class: G.2.2; F.1.1