
Showing 1–17 of 17 results for author: Golowich, N

Searching in archive math.
  1. arXiv:2406.07920

    cs.LG cs.AI cs.CC math.ST stat.ML

    Near-Optimal Learning and Planning in Separated Latent MDPs

    Authors: Fan Chen, Constantinos Daskalakis, Noah Golowich, Alexander Rakhlin

    Abstract: We study computational and statistical aspects of learning Latent Markov Decision Processes (LMDPs). In this model, the learner interacts with an MDP drawn at the beginning of each epoch from an unknown mixture of MDPs. To sidestep known impossibility results, we consider several notions of separation of the constituent MDPs. The main thrust of this paper is in establishing a nearly-sharp *statist…

    Submitted 12 June, 2024; originally announced June 2024.

    Comments: COLT 2024

  2. arXiv:2406.01799

    cs.LG math.OC stat.ML

    Online Control in Population Dynamics

    Authors: Noah Golowich, Elad Hazan, Zhou Lu, Dhruv Rohatgi, Y. Jennifer Sun

    Abstract: The study of population dynamics originated with early sociological works but has since extended into many fields, including biology, epidemiology, evolutionary game theory, and economics. Most studies on population dynamics focus on the problem of prediction rather than control. Existing mathematical models for control in population dynamics are often restricted to specific, noise-free dynamics,…

    Submitted 6 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

  3. arXiv:2309.09457

    cs.LG cs.AI cs.DS math.OC stat.ML

    Exploring and Learning in Sparse Linear MDPs without Computationally Intractable Oracles

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: The key assumption underlying linear Markov Decision Processes (MDPs) is that the learner has access to a known feature map $φ(x, a)$ that maps state-action pairs to $d$-dimensional vectors, and that the rewards and transitions are linear functions in this representation. But where do these features come from? In the absence of expert domain knowledge, a tempting strategy is to use the ``kitchen s…

    Submitted 18 September, 2023; v1 submitted 17 September, 2023; originally announced September 2023.

  4. arXiv:2301.08215

    cs.LG math.OC math.ST stat.ML

    Tight Guarantees for Interactive Decision Making with the Decision-Estimation Coefficient

    Authors: Dylan J. Foster, Noah Golowich, Yanjun Han

    Abstract: A foundational problem in reinforcement learning and interactive decision making is to understand what modeling assumptions lead to sample-efficient learning guarantees, and what algorithm design principles achieve optimal sample complexity. Recently, Foster et al. (2021) introduced the Decision-Estimation Coefficient (DEC), a measure of statistical complexity which leads to upper and lower bounds…

    Submitted 19 January, 2023; originally announced January 2023.

  5. arXiv:2211.14250

    cs.LG math.OC math.ST stat.ML

    Model-Free Reinforcement Learning with the Decision-Estimation Coefficient

    Authors: Dylan J. Foster, Noah Golowich, Jian Qian, Alexander Rakhlin, Ayush Sekhari

    Abstract: We consider the problem of interactive decision making, encompassing structured bandits and reinforcement learning with general function approximation. Recently, Foster et al. (2021) introduced the Decision-Estimation Coefficient, a measure of statistical complexity that lower bounds the optimal regret for interactive decision making, as well as a meta-algorithm, Estimation-to-Decisions, which ach…

    Submitted 12 August, 2023; v1 submitted 25 November, 2022; originally announced November 2022.

    Comments: V2 changes: Improved writing and added more examples

  6. arXiv:2210.09769

    cs.LG cs.GT math.OC

    STay-ON-the-Ridge: Guaranteed Convergence to Local Minimax Equilibrium in Nonconvex-Nonconcave Games

    Authors: Constantinos Daskalakis, Noah Golowich, Stratis Skoulakis, Manolis Zampetakis

    Abstract: Min-max optimization problems involving nonconvex-nonconcave objectives have found important applications in adversarial training and other multi-agent learning settings. Yet, no known gradient descent-based method is guaranteed to converge to (even local notions of) min-max equilibrium in the nonconvex-nonconcave setting. For all known methods, there exist relatively simple objectives for which t…

    Submitted 18 October, 2022; originally announced October 2022.

  7. arXiv:2206.03446

    cs.LG cs.AI cs.DS math.OC stat.ML

    Learning in Observable POMDPs, without Computationally Intractable Oracles

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: Much of reinforcement learning theory is built on top of oracles that are computationally hard to implement. Specifically for learning near-optimal policies in Partially Observable Markov Decision Processes (POMDPs), existing algorithms either need to make strong assumptions about the model dynamics (e.g. deterministic transitions) or assume access to an oracle for solving a hard optimistic planni…

    Submitted 7 June, 2022; originally announced June 2022.

  8. arXiv:2201.04735

    cs.LG cs.DS math.OC stat.ML

    Planning in Observable POMDPs in Quasipolynomial Time

    Authors: Noah Golowich, Ankur Moitra, Dhruv Rohatgi

    Abstract: Partially Observable Markov Decision Processes (POMDPs) are a natural and general model in reinforcement learning that take into account the agent's uncertainty about its current state. In the literature on POMDPs, it is customary to assume access to a planning oracle that computes an optimal policy when the parameters are known, even though the problem is known to be computationally hard. Almost…

    Submitted 23 March, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

    Comments: 52 pages

  9. arXiv:2110.13052

    cs.LG cs.AI cs.DS math.OC stat.ML

    Can Q-Learning be Improved with Advice?

    Authors: Noah Golowich, Ankur Moitra

    Abstract: Despite rapid progress in theoretical reinforcement learning (RL) over the last few years, most of the known guarantees are worst-case in nature, failing to take advantage of structure that may be known a priori about a given RL problem at hand. In this paper we address the question of whether worst-case lower bounds for regret in online learning of Markov decision processes (MDPs) can be circumve…

    Submitted 25 October, 2021; originally announced October 2021.

  10. arXiv:2010.13724

    cs.LG math.OC

    Tight last-iterate convergence rates for no-regret learning in multi-player games

    Authors: Noah Golowich, Sarath Pattathil, Constantinos Daskalakis

    Abstract: We study the question of obtaining last-iterate convergence rates for no-regret learning algorithms in multi-player games. We show that the optimistic gradient (OG) algorithm with a constant step-size, which is no-regret, achieves a last-iterate rate of $O(1/\sqrt{T})$ with respect to the gap function in smooth monotone games. This result addresses a question of Mertikopoulos & Zhou (2018), who as…

    Submitted 26 October, 2020; originally announced October 2020.

    Comments: To appear at NeurIPS 2020. 41 pages

  11. arXiv:2007.03668

    cs.LG math.CO stat.ML

    Near-tight closure bounds for Littlestone and threshold dimensions

    Authors: Badih Ghazi, Noah Golowich, Ravi Kumar, Pasin Manurangsi

    Abstract: We study closure properties for the Littlestone and threshold dimensions of binary hypothesis classes. Given classes $\mathcal{H}_1, \ldots, \mathcal{H}_k$ of Boolean functions with bounded Littlestone (respectively, threshold) dimension, we establish an upper bound on the Littlestone (respectively, threshold) dimension of the class defined by applying an arbitrary binary aggregation rule to…

    Submitted 7 July, 2020; originally announced July 2020.

    Comments: 7 pages

  12. arXiv:2002.00057

    cs.LG math.OC stat.ML

    Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems

    Authors: Noah Golowich, Sarath Pattathil, Constantinos Daskalakis, Asuman Ozdaglar

    Abstract: In this paper we study the smooth convex-concave saddle point problem. Specifically, we analyze the last iterate convergence properties of the Extragradient (EG) algorithm. It is well known that the ergodic (averaged) iterates of EG converge at a rate of $O(1/T)$ (Nemirovski, 2004). In this paper, we show that the last iterate of EG converges at a rate of $O(1/\sqrt{T})$. To the best of our knowle…

    Submitted 6 July, 2020; v1 submitted 31 January, 2020; originally announced February 2020.

    Comments: 27 pages

  13. arXiv:1707.03132

    math.CO

    Coloring Chains for Compression with Uncertain Priors

    Authors: Noah Golowich

    Abstract: Haramaty and Sudan considered the problem of transmitting a message between two people, Alice and Bob, when Alice's and Bob's priors on the message are allowed to differ by at most a given factor. To find a deterministic compression scheme for this problem, they showed that it is sufficient to obtain an upper bound on the chromatic number of a graph, denoted $U(N,s,k)$ for parameters $N,s,k$, whos…

    Submitted 21 October, 2018; v1 submitted 11 July, 2017; originally announced July 2017.

    Comments: 20 pages; added Table 1 and some minor clarifications

  14. arXiv:1409.7535

    math.CO

    The m-Degenerate Chromatic Number of a Digraph

    Authors: Noah Golowich

    Abstract: The digraph chromatic number of a directed graph $D$, denoted $χ_A(D)$, is the minimum positive integer $k$ such that there exists a partition of the vertices of $D$ into $k$ disjoint sets, each of which induces an acyclic subgraph. For any $m \geq 1$, a digraph is weakly $m$-degenerate if each of its induced subgraphs has a vertex of in-degree or out-degree less than $m$. We introduce a generaliz…

    Submitted 3 December, 2018; v1 submitted 26 September, 2014; originally announced September 2014.

    Comments: 16 pages, 1 figure

    MSC Class: 05C15; 05C20

    Journal ref: Noah Golowich. The m-degenerate chromatic number of a digraph, Discrete Mathematics, 339(6): 1734-1743, 2016

  15. arXiv:1407.8045

    math.CO

    Acyclic Subgraphs of Planar Digraphs

    Authors: Noah Golowich, David Rolnick

    Abstract: An acyclic set in a digraph is a set of vertices that induces an acyclic subgraph. In 2011, Harutyunyan conjectured that every planar digraph on $n$ vertices without directed 2-cycles possesses an acyclic set of size at least $3n/5$. We prove this conjecture for digraphs where every directed cycle has length at least 8. More generally, if $g$ is the length of the shortest directed cycle, we show t…

    Submitted 30 July, 2014; originally announced July 2014.

    Comments: 9 pages

  16. arXiv:1404.3384

    math.CO

    Resolving a Conjecture on Degree of Regularity of Linear Homogeneous Equations

    Authors: Noah Golowich

    Abstract: A linear equation is $r$-regular, if, for every $r$-coloring of the positive integers, there exist positive integers of the same color which satisfy the equation. In 2005, Fox and Radoicic conjectured that the equation $x_1 + 2x_2 + \cdots + 2^{n-2}x_{n-1} - 2^{n-1}x_n = 0$, for any $n \geq 2$, has a degree of regularity of $n-1$, which would verify a conjecture of Rado from 1933. Rado's conjectur…

    Submitted 13 April, 2014; originally announced April 2014.

    Comments: 8 pages

    MSC Class: 05D10

  17. Degree of Regularity of Linear Homogeneous Equations

    Authors: Kavish Gandhi, Noah Golowich, László Miklós Lovász

    Abstract: We define a linear homogeneous equation to be strongly r-regular if, when a finite number of inequalities is added to the equation, the system of the equation and inequalities is still r-regular. In this paper, we show that, if a linear homogeneous equation is r-regular, then it is strongly r-regular. In 2009, Alexeev and Tsimerman introduced a family of equations, each of which is (n-1)-regular b…

    Submitted 26 January, 2014; v1 submitted 27 September, 2013; originally announced September 2013.

    Comments: 7 pages. This is an update of our previous version that proves a stronger result

    MSC Class: 05D10; 11B30