
Showing 1–27 of 27 results for author: Sellke, M

  1. arXiv:2402.01089  [pdf, other]

    stat.ML cs.LG

    No Free Prune: Information-Theoretic Barriers to Pruning at Initialization

    Authors: Tanishq Kumar, Kevin Luo, Mark Sellke

    Abstract: The existence of "lottery tickets" arXiv:1803.03635 at or near initialization raises the tantalizing question of whether large models are necessary in deep learning, or whether sparse networks can be quickly identified and trained without ever training the dense models that contain them. However, efforts to find these sparse subnetworks without training the dense model ("pruning at initialization"…

    Submitted 1 February, 2024; originally announced February 2024.

  2. arXiv:2306.01995  [pdf, ps, other]

    cs.LG stat.ML

    Asymptotically Optimal Pure Exploration for Infinite-Armed Bandits

    Authors: Xiao-Yue Gong, Mark Sellke

    Abstract: We study pure exploration with infinitely many bandit arms generated i.i.d. from an unknown distribution. Our goal is to efficiently select a single high quality arm whose average reward is, with probability $1-δ$, within $\varepsilon$ of being among the top $η$-fraction of arms; this is a natural adaptation of the classical PAC guarantee for infinite action sets. We consider both the fixed confid…

    Submitted 3 June, 2023; originally announced June 2023.

  3. arXiv:2306.01992  [pdf, other]

    cs.LG stat.ML

    On Size-Independent Sample Complexity of ReLU Networks

    Authors: Mark Sellke

    Abstract: We study the sample complexity of learning ReLU neural networks from the point of view of generalization. Given norm constraints on the weight matrices, a common approach is to estimate the Rademacher complexity of the associated function class. Previously Golowich-Rakhlin-Shamir (2020) obtained a bound independent of the network size (scaling with a product of Frobenius norms) except for a factor…

    Submitted 4 February, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: 4 pages

  4. arXiv:2306.01990  [pdf, other]

    cs.GT cs.LG

    Incentivizing Exploration with Linear Contexts and Combinatorial Actions

    Authors: Mark Sellke

    Abstract: We advance the study of incentivized bandit exploration, in which arm choices are viewed as recommendations and are required to be Bayesian incentive compatible. Recent work has shown under certain independence assumptions that after collecting enough initial samples, the popular Thompson sampling algorithm becomes incentive compatible. We give an analog of this result for linear bandits, where th…

    Submitted 19 February, 2024; v1 submitted 2 June, 2023; originally announced June 2023.

    Comments: International Conference on Machine Learning (ICML) 2023

  5. arXiv:2304.01438  [pdf, ps, other]

    cs.DS cs.CC

    Tight Space Lower Bound for Pseudo-Deterministic Approximate Counting

    Authors: Ofer Grossman, Meghal Gupta, Mark Sellke

    Abstract: We investigate one of the most basic problems in streaming algorithms: approximating the number of elements in the stream. In 1978, Morris famously gave a randomized algorithm achieving a constant-factor approximation error for streams of length at most N in space $O(\log \log N)$. We investigate the pseudo-deterministic complexity of the problem and prove a tight $Ω(\log N)$ lower bound, thus res…

    Submitted 5 July, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: 18 pages

  6. arXiv:2303.12172  [pdf, other]

    math.PR cond-mat.dis-nn cs.CC math-ph

    Algorithmic Threshold for Multi-Species Spherical Spin Glasses

    Authors: Brice Huang, Mark Sellke

    Abstract: We study efficient optimization of the Hamiltonians of multi-species spherical spin glasses. Our results characterize the maximum value attained by algorithms that are suitably Lipschitz with respect to the disorder through a variational principle that we study in detail. We rely on the branching overlap gap property introduced in our previous work and develop a new method to establish it that doe…

    Submitted 13 September, 2023; v1 submitted 21 March, 2023; originally announced March 2023.

    Comments: updated references

  7. arXiv:2206.05265  [pdf, ps, other]

    quant-ph cs.CC cs.IT cs.LG

    When Does Adaptivity Help for Quantum State Learning?

    Authors: Sitan Chen, Brice Huang, Jerry Li, Allen Liu, Mark Sellke

    Abstract: We consider the classic question of state tomography: given copies of an unknown quantum state $ρ\in\mathbb{C}^{d\times d}$, output $\widehatρ$ which is close to $ρ$ in some sense, e.g. trace distance or fidelity. When one is allowed to make coherent measurements entangled across all copies, $Θ(d^2/ε^2)$ copies are necessary and sufficient to get trace distance $ε$. Unfortunately, the protocols ac…

    Submitted 30 May, 2023; v1 submitted 10 June, 2022; originally announced June 2022.

    Comments: 22 pages

  8. arXiv:2203.05093  [pdf, ps, other]

    math.PR cond-mat.dis-nn cs.DS

    Sampling from the Sherrington-Kirkpatrick Gibbs measure via algorithmic stochastic localization

    Authors: Ahmed El Alaoui, Andrea Montanari, Mark Sellke

    Abstract: We consider the Sherrington-Kirkpatrick model of spin glasses at high-temperature and no external field, and study the problem of sampling from the Gibbs distribution $μ$ in polynomial time. We prove that, for any inverse temperature $β<1/2$, there exists an algorithm with complexity $O(n^2)$ that samples from a distribution $μ^{alg}$ which is close in normalized Wasserstein distance to $μ$. Namel…

    Submitted 15 February, 2024; v1 submitted 9 March, 2022; originally announced March 2022.

  9. arXiv:2202.09653  [pdf, other]

    cs.LG cs.MA stat.ML

    The Pareto Frontier of Instance-Dependent Guarantees in Multi-Player Multi-Armed Bandits with no Communication

    Authors: Allen Liu, Mark Sellke

    Abstract: We study the stochastic multi-player multi-armed bandit problem. In this problem, $m$ players cooperate to maximize their total reward from $K > m$ arms. However the players cannot communicate and are penalized (e.g. receive no reward) if they pull the same arm at the same time. We ask whether it is possible to obtain optimal instance-dependent regret $\tilde{O}(1/Δ)$ where $Δ$ is the gap between…

    Submitted 6 June, 2022; v1 submitted 19 February, 2022; originally announced February 2022.

    Comments: Accepted for presentation at Conference on Learning Theory (COLT) 2022

  10. arXiv:2111.06813  [pdf, ps, other]

    math.PR cs.DM math-ph math.CO

    Local algorithms for Maximum Cut and Minimum Bisection on locally treelike regular graphs of large degree

    Authors: Ahmed El Alaoui, Andrea Montanari, Mark Sellke

    Abstract: Given a graph $G$ of degree $k$ over $n$ vertices, we consider the problem of computing a near maximum cut or a near minimum bisection in polynomial time. For graphs of girth $2L$, we develop a local message passing algorithm whose complexity is $O(nkL)$, and that achieves near optimal cut values among all $L$-local algorithms. Focusing on max-cut, the algorithm constructs a cut of value…

    Submitted 3 February, 2023; v1 submitted 12 November, 2021; originally announced November 2021.

    Comments: Improved presentation. To appear in Random Structures and Algorithms

  11. arXiv:2110.07847  [pdf, other]

    math.PR cond-mat.dis-nn cs.CC math-ph math.OC

    Tight Lipschitz Hardness for Optimizing Mean Field Spin Glasses

    Authors: Brice Huang, Mark Sellke

    Abstract: We study the problem of algorithmically optimizing the Hamiltonian $H_N$ of a spherical or Ising mixed $p$-spin glass. The maximum asymptotic value $\mathsf{OPT}$ of $H_N/N$ is characterized by a variational principle known as the Parisi formula, proved first by Talagrand and in more generality by Panchenko. Recently developed approximate message passing algorithms efficiently optimize $H_N/N$ up…

    Submitted 11 September, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

    Comments: 84 pages, 2 figures, updated introduction

  12. arXiv:2106.09913  [pdf, other]

    cs.LG stat.ML

    Iterative Feature Matching: Toward Provable Domain Generalization with Logarithmic Environments

    Authors: Yining Chen, Elan Rosenfeld, Mark Sellke, Tengyu Ma, Andrej Risteski

    Abstract: Domain generalization aims at performing well on unseen test environments with data from a limited number of training environments. Despite a proliferation of proposal algorithms for this task, assessing their performance both theoretically and empirically is still very challenging. Distributional matching algorithms such as (Conditional) Domain Adversarial Networks [Ganin et al., 2016, Long et al…

    Submitted 22 November, 2021; v1 submitted 18 June, 2021; originally announced June 2021.

    Comments: We acknowledge that the previous version of this paper (v1) contained an error - Theorem 3.2 was incorrect. We removed this theorem and updated the rest of the paper in v2

  13. arXiv:2105.12806  [pdf, ps, other]

    cs.LG stat.ML

    A Universal Law of Robustness via Isoperimetry

    Authors: Sébastien Bubeck, Mark Sellke

    Abstract: Classically, data interpolation with a parametrized model class is possible as long as the number of parameters is larger than the number of equations to be satisfied. A puzzling phenomenon in deep learning is that models are trained with many more parameters than what this classical theory would suggest. We propose a partial theoretical explanation for this phenomenon. We prove that for a broad c…

    Submitted 23 December, 2022; v1 submitted 26 May, 2021; originally announced May 2021.

  14. arXiv:2011.11503  [pdf, ps, other]

    cs.CG cs.LG math.MG

    Metric Transforms and Low Rank Matrices via Representation Theory of the Real Hyperrectangle

    Authors: Josh Alman, Timothy Chu, Gary Miller, Shyam Narayanan, Mark Sellke, Zhao Song

    Abstract: In this paper, we develop a new technique which we call representation theory of the real hyperrectangle, which describes how to compute the eigenvectors and eigenvalues of certain matrices arising from hyperrectangles. We show that these matrices arise naturally when analyzing a number of different algorithmic tasks such as kernel methods, neural network training, natural language processing, and…

    Submitted 4 August, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

  15. arXiv:2011.03896  [pdf, other]

    cs.LG cs.MA stat.ML

    Cooperative and Stochastic Multi-Player Multi-Armed Bandit: Optimal Regret With Neither Communication Nor Collisions

    Authors: Sébastien Bubeck, Thomas Budzinski, Mark Sellke

    Abstract: We consider the cooperative multi-player version of the stochastic multi-armed bandit problem. We study the regime where the players cannot communicate but have access to shared randomness. In prior work by the first two authors, a strategy for this regime was constructed for two players and three arms, with regret $\tilde{O}(\sqrt{T})$, and with no collisions at all between the players (with very…

    Submitted 7 November, 2020; originally announced November 2020.

  16. arXiv:2010.15811  [pdf, ps, other]

    math.PR cs.DS math-ph

    Algorithmic pure states for the negative spherical perceptron

    Authors: Ahmed El Alaoui, Mark Sellke

    Abstract: We consider the spherical perceptron with Gaussian disorder. This is the set $S$ of points $σ\in \mathbb{R}^N$ on the sphere of radius $\sqrt{N}$ satisfying $\langle g_a , σ\rangle \ge κ\sqrt{N}\,$ for all $1 \le a \le M$, where $(g_a)_{a=1}^M$ are independent standard gaussian vectors and $κ\in \mathbb{R}$ is fixed. Various characteristics of $S$ such as its surface measure and the largest $M$ fo…

    Submitted 29 October, 2020; originally announced October 2020.

    Comments: 34 pages

  17. arXiv:2009.08266  [pdf, other]

    cs.DS cs.DM math.MG

    Metrical Service Systems with Transformations

    Authors: Sébastien Bubeck, Niv Buchbinder, Christian Coester, Mark Sellke

    Abstract: We consider a generalization of the fundamental online metrical service systems (MSS) problem where the feasible region can be transformed between requests. In this problem, which we call T-MSS, an algorithm maintains a point in a metric space and has to serve a sequence of requests. Each request is a map (transformation) $f_t\colon A_t\to B_t$ between subsets $A_t$ and $B_t$ of the metric space.…

    Submitted 17 September, 2020; originally announced September 2020.

  18. arXiv:2007.07862  [pdf, ps, other]

    cs.DS

    Vertex Sparsification for Edge Connectivity

    Authors: Parinya Chalermsook, Syamantak Das, Bundit Laekhanukit, Yunbum Kook, Yang P. Liu, Richard Peng, Mark Sellke, Daniel Vaz

    Abstract: Graph compression or sparsification is a basic information-theoretic and computational question. A major open problem in this research area is whether $(1+ε)$-approximate cut-preserving vertex sparsifiers with size close to the number of terminals exist. As a step towards this goal, we study a thresholded version of the problem: for a given parameter $c$, find a smaller graph, which we call connec…

    Submitted 15 July, 2020; originally announced July 2020.

    Comments: Merged version of arXiv:1910.10359 and arXiv:1910.10665 with improved bounds, 55 pages

  19. arXiv:2004.07346  [pdf, other]

    cs.DS cs.LG

    Online Multiserver Convex Chasing and Optimization

    Authors: Sébastien Bubeck, Yuval Rabani, Mark Sellke

    Abstract: We introduce the problem of $k$-chasing of convex functions, a simultaneous generalization of both the famous k-server problem in $R^d$, and of the problem of chasing convex bodies and functions. Aside from fundamental interest in this general form, it has natural applications to online $k$-clustering problems with objectives such as $k$-median or $k$-means. We show that this problem exhibits a ri…

    Submitted 15 April, 2020; originally announced April 2020.

  20. arXiv:2002.00558  [pdf, ps, other]

    cs.GT cs.DS cs.LG

    The Price of Incentivizing Exploration: A Characterization via Thompson Sampling and Sample Complexity

    Authors: Mark Sellke, Aleksandrs Slivkins

    Abstract: We consider incentivized exploration: a version of multi-armed bandits where the choice of arms is controlled by self-interested agents, and the algorithm can only issue recommendations. The algorithm controls the flow of information, and the information asymmetry can incentivize the agents to explore. Prior work achieves optimal regret rates up to multiplicative factors that become arbitrarily la…

    Submitted 12 June, 2022; v1 submitted 2 February, 2020; originally announced February 2020.

  21. arXiv:1910.10359  [pdf, ps, other]

    cs.DS

    Vertex Sparsifiers for c-Edge Connectivity

    Authors: Yang P. Liu, Richard Peng, Mark Sellke

    Abstract: We show the existence of O(f(c)k) sized vertex sparsifiers that preserve all edge-connectivity values up to c between a set of k terminal vertices, where f(c) is a function that only depends on c, the edge-connectivity value. This construction is algorithmic: we also provide an algorithm whose running time depends linearly on k, but exponentially in c. It implies that for constant values of c, an…

    Submitted 23 October, 2019; originally announced October 2019.

  22. arXiv:1905.11968  [pdf, ps, other]

    cs.DS math.MG

    Chasing Convex Bodies Optimally

    Authors: Mark Sellke

    Abstract: In the chasing convex bodies problem, an online player receives a request sequence of $N$ convex sets $K_1,\dots, K_N$ contained in a normed space $\mathbb R^d$. The player starts at $x_0\in \mathbb R^d$, and after observing each $K_n$ picks a new point $x_n\in K_n$. At each step the player pays a movement cost of $||x_n-x_{n-1}||$. The player aims to maintain a constant competitive ratio against…

    Submitted 23 November, 2021; v1 submitted 28 May, 2019; originally announced May 2019.

  23. arXiv:1904.12233  [pdf, ps, other]

    cs.LG cs.MA stat.ML

    Non-Stochastic Multi-Player Multi-Armed Bandits: Optimal Rate With Collision Information, Sublinear Without

    Authors: Sébastien Bubeck, Yuanzhi Li, Yuval Peres, Mark Sellke

    Abstract: We consider the non-stochastic version of the (cooperative) multi-player multi-armed bandit problem. The model assumes no communication at all between the players, and furthermore when two (or more) players select the same action this results in a maximal loss. We prove the first $\sqrt{T}$-type regret guarantee for this problem, under the feedback model where collisions are announced to the colli…

    Submitted 1 May, 2019; v1 submitted 27 April, 2019; originally announced April 2019.

    Comments: 27 pages, v2 adds a pseudorandom generator construction to remove the shared randomness assumption in the $\sqrt{T}$-regret result (Section 3.9)

  24. arXiv:1902.00681  [pdf, ps, other]

    cs.LG stat.ML

    First-Order Bayesian Regret Analysis of Thompson Sampling

    Authors: Sébastien Bubeck, Mark Sellke

    Abstract: We address online combinatorial optimization when the player has a prior over the adversary's sequence of losses. In this framework, Russo and Van Roy proposed an information-theoretic analysis of Thompson Sampling based on the information ratio, resulting in optimal worst-case regret bounds. In this paper we introduce three novel ideas to this line of work. First we propose a new quantity, the sc…

    Submitted 3 April, 2022; v1 submitted 2 February, 2019; originally announced February 2019.

    Comments: 58 pages

  25. arXiv:1811.00999  [pdf, ps, other]

    cs.DS math.MG

    Chasing Nested Convex Bodies Nearly Optimally

    Authors: Sébastien Bubeck, Bo'az Klartag, Yin Tat Lee, Yuanzhi Li, Mark Sellke

    Abstract: The convex body chasing problem, introduced by Friedman and Linial, is a competitive analysis problem on any normed vector space. In convex body chasing, for each timestep $t\in\mathbb N$, a convex body $K_t\subseteq \mathbb R^d$ is given as a request, and the player picks a point $x_t\in K_t$. The player aims to ensure that the total distance $\sum_{t=0}^{T-1}||x_t-x_{t+1}||$ is within a bounded…

    Submitted 12 August, 2021; v1 submitted 2 November, 2018; originally announced November 2018.

  26. arXiv:1811.00887  [pdf, ps, other]

    cs.DS math.MG

    Competitively Chasing Convex Bodies

    Authors: Sébastien Bubeck, Yin Tat Lee, Yuanzhi Li, Mark Sellke

    Abstract: Let $\mathcal{F}$ be a family of sets in some metric space. In the $\mathcal{F}$-chasing problem, an online algorithm observes a request sequence of sets in $\mathcal{F}$ and responds (online) by giving a sequence of points in these sets. The movement cost is the distance between consecutive such points. The competitive ratio is the worst case ratio (over request sequences) between the total movem…

    Submitted 2 November, 2018; originally announced November 2018.

    Comments: 14 pages

  27. arXiv:1710.11278  [pdf, other]

    stat.ML cs.CC cs.LG math.CO math.ST

    Approximating Continuous Functions by ReLU Nets of Minimal Width

    Authors: Boris Hanin, Mark Sellke

    Abstract: This article concerns the expressive power of depth in deep feed-forward neural nets with ReLU activations. Specifically, we answer the following question: for a fixed $d_{in}\geq 1,$ what is the minimal width $w$ so that neural nets with ReLU activations, input dimension $d_{in}$, hidden layer widths at most $w,$ and arbitrary depth can approximate any continuous, real-valued function of…

    Submitted 10 March, 2018; v1 submitted 30 October, 2017; originally announced October 2017.

    Comments: v2. 13p. Extended main result to higher dimensional output. Comments welcome