Showing 1–50 of 103 results for author: Wainwright, M J

Searching in archive stat.
  1. arXiv:2401.13665  [pdf, other]

    math.ST econ.EM stat.ME stat.ML

    Entrywise Inference for Missing Panel Data: A Simple and Instance-Optimal Approach

    Authors: Yuling Yan, Martin J. Wainwright

    Abstract: Longitudinal or panel data can be represented as a matrix with rows indexed by units and columns indexed by time. We consider inferential questions associated with the missing data version of panel data induced by staggered adoption. We propose a computationally efficient procedure for estimation, involving only simple matrix algebra and singular value decomposition, and prove non-asymptotic and h…

    Submitted 1 July, 2024; v1 submitted 24 January, 2024; originally announced January 2024.

  2. arXiv:2401.05233  [pdf, other]

    cs.LG cs.IT eess.SY math.OC stat.ML

    Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces

    Authors: Yaqi Duan, Martin J. Wainwright

    Abstract: We introduce a novel framework for analyzing reinforcement learning (RL) in continuous state-action spaces, and use it to prove fast rates of convergence in both off-line and on-line settings. Our analysis highlights two key stability properties, relating to how changes in value functions and/or policies affect the Bellman operator and occupation measures. We argue that these properties are satisf…

    Submitted 10 January, 2024; originally announced January 2024.

  3. arXiv:2311.10076  [pdf, other]

    stat.ME math.ST

    A decorrelation method for general regression adjustment in randomized experiments

    Authors: Fangzhou Su, Wenlong Mou, Peng Ding, Martin J. Wainwright

    Abstract: We study regression adjustment with general function class approximations for estimating the average treatment effect in the design-based setting. Standard regression adjustment involves bias due to sample re-use, and this bias leads to behavior that is sub-optimal in the sample size, and/or imposes restrictive assumptions. Our main contribution is to introduce a novel decorrelation-based approach…

    Submitted 16 November, 2023; originally announced November 2023.

    Comments: Fangzhou Su and Wenlong Mou contributed equally to this work

  4. arXiv:2309.08634  [pdf, other]

    stat.ML cs.AI cs.LG stat.AP stat.ME

    Doubly High-Dimensional Contextual Bandits: An Interpretable Model for Joint Assortment-Pricing

    Authors: Junhui Cai, Ran Chen, Martin J. Wainwright, Linda Zhao

    Abstract: Key challenges in running a retail business include how to select products to present to consumers (the assortment problem), and how to price products (the pricing problem) to maximize revenue or profit. Instead of considering these problems in isolation, we propose a joint approach to assortment-pricing based on contextual bandits. Our model is doubly high-dimensional, in that both context vector…

    Submitted 13 September, 2023; originally announced September 2023.

  5. arXiv:2309.01362  [pdf, other]

    math.ST stat.ME

    Challenges of the inconsistency regime: Novel debiasing methods for missing data models

    Authors: Michael Celentano, Martin J. Wainwright

    Abstract: We study semi-parametric estimation of the population mean when data is observed missing at random (MAR) in the $n < p$ "inconsistency regime", in which neither the outcome model nor the propensity/missingness model can be estimated consistently. Consider a high-dimensional linear-GLM specification in which the number of confounders is proportional to the sample size. In the case $n > p$, past wor…

    Submitted 4 September, 2023; originally announced September 2023.

    Comments: 89 pages, 6 figures

    MSC Class: 62J05; 62J12; 62F10

  6. arXiv:2303.17102  [pdf, other]

    stat.ME

    When is the estimated propensity score better? High-dimensional analysis and bias correction

    Authors: Fangzhou Su, Wenlong Mou, Peng Ding, Martin J. Wainwright

    Abstract: Anecdotally, using an estimated propensity score is superior to the true propensity score in estimating the average treatment effect based on observational data. However, this claim comes with several qualifications: it holds only if propensity score model is correctly specified and the number of covariates $d$ is small relative to the sample size $n$. We revisit this phenomenon by studying the in…

    Submitted 29 March, 2023; originally announced March 2023.

    Comments: Fangzhou Su and Wenlong Mou contributed equally to this work

  7. arXiv:2303.02534  [pdf, other]

    math.ST cs.LG stat.ME stat.ML

    Semi-parametric inference based on adaptively collected data

    Authors: Licong Lin, Koulik Khamaru, Martin J. Wainwright

    Abstract: Many standard estimators, when applied to adaptively collected data, fail to be asymptotically normal, thereby complicating the construction of confidence intervals. We address this challenge in a semi-parametric context: estimating the parameter vector of a generalized linear regression model contaminated by a non-parametric nuisance component. We construct suitably weighted estimating equations…

    Submitted 4 March, 2023; originally announced March 2023.

  8. arXiv:2301.06240  [pdf, other]

    math.ST stat.ME stat.ML

    Kernel-based off-policy estimation without overlap: Instance optimality beyond semiparametric efficiency

    Authors: Wenlong Mou, Peng Ding, Martin J. Wainwright, Peter L. Bartlett

    Abstract: We study optimal procedures for estimating a linear functional based on observational data. In many problems of this kind, a widely used assumption is strict overlap, i.e., uniform boundedness of the importance ratio, which measures how well the observational data covers the directions of interest. When it is violated, the classical semi-parametric efficiency bound can easily become infinite, so t…

    Submitted 15 January, 2023; originally announced January 2023.

  9. arXiv:2211.03899  [pdf, other]

    stat.ML cs.LG math.ST

    Policy evaluation from a single path: Multi-step methods, mixing and mis-specification

    Authors: Yaqi Duan, Martin J. Wainwright

    Abstract: We study non-parametric estimation of the value function of an infinite-horizon $γ$-discounted Markov reward process (MRP) using observations from a single trajectory. We provide non-asymptotic guarantees for a general family of kernel-based multi-step temporal difference (TD) estimates, including canonical $K$-step look-ahead TD for $K = 1, 2, \ldots$ and the TD$(λ)$ family for $λ\in [0,1)$ as sp…

    Submitted 7 November, 2022; originally announced November 2022.

  10. arXiv:2210.11377  [pdf, other]

    stat.ML cs.LG math.OC math.ST

    Krylov-Bellman boosting: Super-linear policy evaluation in general state spaces

    Authors: Eric Xia, Martin J. Wainwright

    Abstract: We present and analyze the Krylov-Bellman Boosting (KBB) algorithm for policy evaluation in general state spaces. It alternates between fitting the Bellman residual using non-parametric regression (as in boosting), and estimating the value function via the least-squares temporal difference (LSTD) procedure applied with a feature set that grows adaptively over time. By exploiting the connection to…

    Submitted 20 October, 2022; originally announced October 2022.

    Comments: 40 pages, 7 figures

  11. arXiv:2210.04334  [pdf, other]

    stat.ME cs.LG eess.SP

    QuTE: decentralized multiple testing on sensor networks with false discovery rate control

    Authors: Aaditya Ramdas, Jianbo Chen, Martin J. Wainwright, Michael I. Jordan

    Abstract: This paper designs methods for decentralized multiple hypothesis testing on graphs that are equipped with provable guarantees on the false discovery rate (FDR). We consider the setting where distinct agents reside on the nodes of an undirected graph, and each agent possesses p-values corresponding to one or more hypotheses local to its node. Each agent must individually decide whether to reject on…

    Submitted 9 October, 2022; originally announced October 2022.

    Comments: This paper appeared in the IEEE CDC'17 conference proceedings. The last two sections were then developed in 2018, and it is now being put on arXiv simply for easier access

  12. arXiv:2209.13075  [pdf, other]

    math.ST cs.IT stat.ML

    Off-policy estimation of linear functionals: Non-asymptotic theory for semi-parametric efficiency

    Authors: Wenlong Mou, Martin J. Wainwright, Peter L. Bartlett

    Abstract: The problem of estimating a linear functional based on observational data is canonical in both the causal inference and bandit literatures. We analyze a broad class of two-stage procedures that first estimate the treatment effect function, and then use this quantity to estimate the linear functional. We prove non-asymptotic upper bounds on the mean-squared error of such procedures: these bounds re…

    Submitted 26 September, 2022; originally announced September 2022.

    Comments: 56 pages, 6 figures

  13. arXiv:2205.02986  [pdf, other]

    math.ST cs.LG stat.ML

    Optimally tackling covariate shift in RKHS-based nonparametric regression

    Authors: Cong Ma, Reese Pathak, Martin J. Wainwright

    Abstract: We study the covariate shift problem in the context of nonparametric regression over a reproducing kernel Hilbert space (RKHS). We focus on two natural families of covariate shift problems defined using the likelihood ratios between the source and target distributions. When the likelihood ratios are uniformly bounded, we prove that the kernel ridge regression (KRR) estimator with a carefully chose…

    Submitted 6 June, 2023; v1 submitted 5 May, 2022; originally announced May 2022.

    Comments: to appear in the Annals of Statistics

  14. arXiv:2202.02837  [pdf, other]

    math.ST cs.LG stat.ML

    A new similarity measure for covariate shift with applications to nonparametric regression

    Authors: Reese Pathak, Cong Ma, Martin J. Wainwright

    Abstract: We study covariate shift in the context of nonparametric regression. We introduce a new measure of distribution mismatch between the source and target distributions that is based on the integrated ratio of probabilities of balls at a given radius. We use the scaling of this measure with respect to the radius to characterize the minimax rate of estimation over a family of Hölder continuous function…

    Submitted 6 February, 2022; originally announced February 2022.

    Comments: 22 pages, 2 figures, 1 table

  15. arXiv:2201.08536  [pdf, other]

    stat.ML cs.LG

    Instance-Dependent Confidence and Early Stopping for Reinforcement Learning

    Authors: Koulik Khamaru, Eric Xia, Martin J. Wainwright, Michael I. Jordan

    Abstract: Various algorithms for reinforcement learning (RL) exhibit dramatic variation in their convergence rates as a function of problem structure. Such problem-dependent behavior is not captured by worst-case analyses and has accordingly inspired a growing effort in obtaining instance-dependent guarantees and deriving instance-optimal algorithms for RL problems. This research has been carried out, howev…

    Submitted 20 January, 2022; originally announced January 2022.

  16. arXiv:2201.08518  [pdf, ps, other]

    math.ST cs.LG math.OC stat.ML

    Optimal variance-reduced stochastic approximation in Banach spaces

    Authors: Wenlong Mou, Koulik Khamaru, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan

    Abstract: We study the problem of estimating the fixed point of a contractive operator defined on a separable Banach space. Focusing on a stochastic query model that provides noisy evaluations of the operator, we analyze a variance-reduced stochastic approximation scheme, and establish non-asymptotic bounds for both the operator defect and the estimation error, measured in an arbitrary semi-norm. In contras…

    Submitted 29 November, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

  17. arXiv:2112.12770  [pdf, ps, other]

    math.OC cs.LG math.PR math.ST stat.ML

    Optimal and instance-dependent guarantees for Markovian linear stochastic approximation

    Authors: Wenlong Mou, Ashwin Pananjady, Martin J. Wainwright, Peter L. Bartlett

    Abstract: We study stochastic approximation procedures for approximately solving a $d$-dimensional linear fixed point equation based on observing a trajectory of length $n$ from an ergodic Markov chain. We first exhibit a non-asymptotic bound of the order $t_{\mathrm{mix}} \tfrac{d}{n}$ on the squared error of the last iterate of a standard scheme, where $t_{\mathrm{mix}}$ is a mixing time. We then prove a…

    Submitted 11 May, 2024; v1 submitted 23 December, 2021; originally announced December 2021.

    Comments: Published at Mathematical Statistics and Learning

  18. arXiv:2109.12002  [pdf, other]

    stat.ML cs.LG math.ST

    Optimal policy evaluation using kernel-based temporal difference methods

    Authors: Yaqi Duan, Mengdi Wang, Martin J. Wainwright

    Abstract: We study methods based on reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process (MRP). We study a regularized form of the kernel least-squares temporal difference (LSTD) estimate; in the population limit of infinite data, it corresponds to the fixed point of a projected Bellman operator defined by the associated reproducing kern…

    Submitted 24 September, 2021; originally announced September 2021.

  19. arXiv:2107.02266  [pdf, other]

    math.ST cs.LG stat.ML

    Near-optimal inference in adaptive linear regression

    Authors: Koulik Khamaru, Yash Deshpande, Tor Lattimore, Lester Mackey, Martin J. Wainwright

    Abstract: When data is collected in an adaptive manner, even simple methods like ordinary least squares can exhibit non-normal asymptotic behavior. As an undesirable consequence, hypothesis tests and confidence intervals based on asymptotic normality can lead to erroneous results. We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation. Our pr…

    Submitted 21 March, 2023; v1 submitted 5 July, 2021; originally announced July 2021.

    Comments: 51 pages, 7 figures

  20. arXiv:2106.14352  [pdf, other]

    stat.ML cs.LG

    Instance-optimality in optimal value estimation: Adaptivity via variance-reduced Q-learning

    Authors: Koulik Khamaru, Eric Xia, Martin J. Wainwright, Michael I. Jordan

    Abstract: Various algorithms in reinforcement learning exhibit dramatic variability in their convergence rates and ultimate accuracy as a function of the problem structure. Such instance-specific behavior is not captured by existing global minimax bounds, which are worst-case in nature. We analyze the problem of estimating optimal $Q$-value functions for a discounted Markov decision process with discrete st…

    Submitted 27 June, 2021; originally announced June 2021.

  21. arXiv:2105.01850  [pdf, other]

    cs.LG stat.ML

    Preference learning along multiple criteria: A game-theoretic perspective

    Authors: Kush Bhatia, Ashwin Pananjady, Peter L. Bartlett, Anca D. Dragan, Martin J. Wainwright

    Abstract: The literature on ranking from ordinal data is vast, and there are several ways to aggregate overall preferences from pairwise comparisons between objects. In particular, it is well known that any Nash equilibrium of the zero sum game induced by the preference matrix defines a natural solution concept (winning distribution over objects) known as a von Neumann winner. Many real-world problems, howe…

    Submitted 4 May, 2021; originally announced May 2021.

    Comments: 47 pages; published as a conference paper at NeurIPS 2020

  22. arXiv:2101.07781  [pdf, other]

    stat.ML cs.LG math.ST

    Minimax Off-Policy Evaluation for Multi-Armed Bandits

    Authors: Cong Ma, Banghua Zhu, Jiantao Jiao, Martin J. Wainwright

    Abstract: We study the problem of off-policy evaluation in the multi-armed bandit model with bounded rewards, and develop minimax rate-optimal procedures under three settings. First, when the behavior policy is known, we show that the Switch estimator, a method that alternates between the plug-in and importance sampling estimators, is minimax rate-optimal for all sample sizes. Second, when the behavior poli…

    Submitted 19 January, 2021; originally announced January 2021.

  23. arXiv:2012.05299  [pdf, other]

    cs.LG math.OC math.ST stat.ML

    Optimal oracle inequalities for solving projected fixed-point equations

    Authors: Wenlong Mou, Ashwin Pananjady, Martin J. Wainwright

    Abstract: Linear fixed point equations in Hilbert spaces arise in a variety of settings, including reinforcement learning, and computational methods for solving differential and integral equations. We study methods that use a collection of random observations to compute approximate solutions by searching over a known low-dimensional subspace of the Hilbert space. First, we prove an instance-dependent upper…

    Submitted 9 December, 2020; originally announced December 2020.

  24. arXiv:2006.10189  [pdf, other]

    cs.LG cs.IT math.ST stat.ML

    Revisiting minimum description length complexity in overparameterized models

    Authors: Raaz Dwivedi, Chandan Singh, Bin Yu, Martin J. Wainwright

    Abstract: Complexity is a fundamental concept underlying statistical learning theory that aims to inform generalization performance. Parameter count, while successful in low-dimensional settings, is not well-justified for overparameterized settings when the number of parameters is more than the number of training samples. We revisit complexity measures based on Rissanen's principle of minimum description le…

    Submitted 12 October, 2023; v1 submitted 17 June, 2020; originally announced June 2020.

    Comments: First two authors contributed equally

  25. arXiv:2005.11411  [pdf, other]

    cs.LG math.ST stat.ML

    Instability, Computational Efficiency and Statistical Accuracy

    Authors: Nhat Ho, Koulik Khamaru, Raaz Dwivedi, Martin J. Wainwright, Michael I. Jordan, Bin Yu

    Abstract: Many statistical estimators are defined as the fixed point of a data-dependent operator, with estimators based on minimizing a cost function being an important special case. The limiting performance of such estimators depends on the properties of the population-level operator in the idealized limit of infinitely many samples. We develop a general framework that yields bounds on statistical accurac…

    Submitted 20 March, 2022; v1 submitted 22 May, 2020; originally announced May 2020.

    Comments: 68 pages, 6 Figures, 2 Tables. First three authors contributed equally

  26. arXiv:2005.05238  [pdf, other]

    cs.LG math.OC stat.ML

    FedSplit: An algorithmic framework for fast federated optimization

    Authors: Reese Pathak, Martin J. Wainwright

    Abstract: Motivated by federated learning, we consider the hub-and-spoke model of distributed optimization in which a central authority coordinates the computation of a solution among many agents while limiting communication. We first study some past procedures for federated optimization, and show that their fixed points need not correspond to stationary points of the original optimization problem, even in…

    Submitted 11 May, 2020; originally announced May 2020.

    Comments: 27 pages, 4 figures

  27. arXiv:2005.03725  [pdf, other]

    math.ST cs.LG stat.ML

    Lower bounds in multiple testing: A framework based on derandomized proxies

    Authors: Max Rabinovich, Michael I. Jordan, Martin J. Wainwright

    Abstract: The large bulk of work in multiple testing has focused on specifying procedures that control the false discovery rate (FDR), with relatively less attention being paid to the corresponding Type II error known as the false non-discovery rate (FNR). A line of more recent work in multiple testing has begun to investigate the tradeoffs between the FDR and FNR and to provide lower bounds on the performa…

    Submitted 7 May, 2020; originally announced May 2020.

  28. arXiv:2004.04719  [pdf, ps, other]

    stat.ML cs.LG math.OC math.ST

    On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration

    Authors: Wenlong Mou, Chris Junchi Li, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan

    Abstract: We undertake a precise study of the asymptotic and non-asymptotic properties of stochastic approximation procedures with Polyak-Ruppert averaging for solving a linear system $\bar{A} θ= \bar{b}$. When the matrix $\bar{A}$ is Hurwitz, we prove a central limit theorem (CLT) for the averaged iterates with fixed step size and number of iterations going to infinity. The CLT characterizes the exact asym…

    Submitted 9 April, 2020; originally announced April 2020.

  29. arXiv:2003.07337  [pdf, other]

    stat.ML cs.LG math.OC

    Is Temporal Difference Learning Optimal? An Instance-Dependent Analysis

    Authors: Koulik Khamaru, Ashwin Pananjady, Feng Ruan, Martin J. Wainwright, Michael I. Jordan

    Abstract: We address the problem of policy evaluation in discounted Markov decision processes, and provide instance-dependent guarantees on the $\ell_\infty$-error under a generative model. We establish both asymptotic and non-asymptotic versions of local minimax lower bounds for policy evaluation, thereby providing an instance-dependent baseline by which to compare algorithms. Theory-inspired simulations s…

    Submitted 16 March, 2020; originally announced March 2020.

    Comments: 38 pages, 3 figures

  30. arXiv:1912.05153  [pdf, other]

    stat.ML cs.DS cs.LG math.PR stat.CO

    Sampling for Bayesian Mixture Models: MCMC with Polynomial-Time Mixing

    Authors: Wenlong Mou, Nhat Ho, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan

    Abstract: We study the problem of sampling from the power posterior distribution in Bayesian Gaussian mixture models, a robust version of the classical posterior. This power posterior is known to be non-log-concave and multi-modal, which leads to exponential mixing times for some standard MCMC algorithms. We introduce and study the Reflected Metropolis-Hastings Random Walk (RMRW) algorithm for sampling. For…

    Submitted 11 December, 2019; originally announced December 2019.

  31. arXiv:1910.00551  [pdf, ps, other]

    stat.ML cs.DS cs.LG stat.CO

    An Efficient Sampling Algorithm for Non-smooth Composite Potentials

    Authors: Wenlong Mou, Nicolas Flammarion, Martin J. Wainwright, Peter L. Bartlett

    Abstract: We consider the problem of sampling from a density of the form $p(x) \propto \exp(-f(x)- g(x))$, where $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a smooth and strongly convex function and $g: \mathbb{R}^d \rightarrow \mathbb{R}$ is a convex and Lipschitz function. We propose a new algorithm based on the Metropolis-Hastings framework, and prove that it mixes to within TV distance $\varepsilon$ of…

    Submitted 1 October, 2019; originally announced October 2019.

  32. arXiv:1909.08749  [pdf, other]

    stat.ML cs.LG math.OC math.PR math.ST

    Instance-dependent $\ell_\infty$-bounds for policy evaluation in tabular reinforcement learning

    Authors: Ashwin Pananjady, Martin J. Wainwright

    Abstract: Markov reward processes (MRPs) are used to model stochastic phenomena arising in operations research, control engineering, robotics, and artificial intelligence, as well as communication and transportation networks. In many of these cases, such as in the policy evaluation problem encountered in reinforcement learning, the goal is to estimate the long-term value function of such a process without a…

    Submitted 15 September, 2020; v1 submitted 18 September, 2019; originally announced September 2019.

    Comments: Version v2 is consistent with manuscript to appear in IEEE Transactions on Information Theory

  33. arXiv:1908.10859  [pdf, ps, other]

    stat.ML cs.DS cs.LG math.OC stat.CO

    High-Order Langevin Diffusion Yields an Accelerated MCMC Algorithm

    Authors: Wenlong Mou, Yi-An Ma, Martin J. Wainwright, Peter L. Bartlett, Michael I. Jordan

    Abstract: We propose a Markov chain Monte Carlo (MCMC) algorithm based on third-order Langevin dynamics for sampling from distributions with log-concave and smooth densities. The higher-order dynamics allow for more flexible discretization schemes, and we develop a specific method that combines splitting with more accurate integration. For a broad class of $d$-dimensional distributions arising from generali…

    Submitted 26 May, 2020; v1 submitted 28 August, 2019; originally announced August 2019.

    Comments: Changes from v1: improved algorithm with $O (d^{1/4} / \varepsilon^{1/2})$ mixing time

  34. arXiv:1907.11331  [pdf, other]

    math.PR math.ST stat.CO stat.ML

    Improved Bounds for Discretization of Langevin Diffusions: Near-Optimal Rates without Convexity

    Authors: Wenlong Mou, Nicolas Flammarion, Martin J. Wainwright, Peter L. Bartlett

    Abstract: We present an improved analysis of the Euler-Maruyama discretization of the Langevin diffusion. Our analysis does not require global contractivity, and yields polynomial dependence on the time horizon. Compared to existing approaches, we make an additional smoothness assumption, and improve the existing rate from $O(η)$ to $O(η^2)$ in terms of the KL divergence. This result matches the correct ord…

    Submitted 4 November, 2019; v1 submitted 25 July, 2019; originally announced July 2019.

    Comments: Changes from v1: corrections in the proof of Lemma 6 and Lemma 10; fixed some minor typos

  35. arXiv:1906.04697  [pdf, other]

    cs.LG math.OC stat.ML

    Variance-reduced $Q$-learning is minimax optimal

    Authors: Martin J. Wainwright

    Abstract: We introduce and analyze a form of variance-reduced $Q$-learning. For $γ$-discounted MDPs with finite state space $\mathcal{X}$ and action space $\mathcal{U}$, we prove that it yields an $ε$-accurate estimate of the optimal $Q$-function in the $\ell_\infty$-norm using $\mathcal{O} \left(\left(\frac{D}{ ε^2 (1-γ)^3} \right) \; \log \left( \frac{D}{(1-γ)} \right) \right)$ samples, where…

    Submitted 8 August, 2019; v1 submitted 11 June, 2019; originally announced June 2019.

    Comments: Update from v1: new Proposition 1 on minimax optimality; updated referencing and discussion of related work

  36. arXiv:1905.12247  [pdf, other]

    stat.ML cs.LG stat.CO

    Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients

    Authors: Yuansi Chen, Raaz Dwivedi, Martin J. Wainwright, Bin Yu

    Abstract: Hamiltonian Monte Carlo (HMC) is a state-of-the-art Markov chain Monte Carlo sampling algorithm for drawing samples from smooth probability densities over continuous spaces. We study the variant most widely used in practice, Metropolized HMC with the Störmer-Verlet or leapfrog integrator, and make two primary contributions. First, we provide a non-asymptotic upper bound on the mixing time of the M…

    Submitted 11 January, 2021; v1 submitted 29 May, 2019; originally announced May 2019.

    Comments: 73 pages, 2 figures, fixed a mistake in the proof of Lemma 11, accepted in JMLR

  37. arXiv:1905.06265  [pdf, other]

    cs.LG math.OC stat.ML

    Stochastic approximation with cone-contractive operators: Sharp $\ell_\infty$-bounds for $Q$-learning

    Authors: Martin J. Wainwright

    Abstract: Motivated by the study of $Q$-learning algorithms in reinforcement learning, we study a class of stochastic approximation procedures based on operators that satisfy monotonicity and quasi-contractivity conditions with respect to an underlying cone. We prove a general sandwich relation on the iterate error at each time, and use it to derive non-asymptotic bounds on the error in terms of a cone-indu…

    Submitted 24 June, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

    Comments: Changes from v1: -- Part of Lemma 1 was incorrect; corrected -- proof of Lemma 2: fixed minor typo in equation (36)

  38. arXiv:1904.02144  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    HopSkipJumpAttack: A Query-Efficient Decision-Based Attack

    Authors: Jianbo Chen, Michael I. Jordan, Martin J. Wainwright

    Abstract: The goal of a decision-based adversarial attack on a trained model is to generate adversarial examples based solely on observing output labels returned by the targeted model. We develop HopSkipJumpAttack, a family of algorithms based on a novel estimate of the gradient direction using binary information at the decision boundary. The proposed family includes both untargeted and targeted attacks opt…

    Submitted 27 April, 2020; v1 submitted 3 April, 2019; originally announced April 2019.

  39. arXiv:1902.00194  [pdf, other]

    math.ST cs.LG stat.ML

    Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models

    Authors: Raaz Dwivedi, Nhat Ho, Koulik Khamaru, Martin J. Wainwright, Michael I. Jordan, Bin Yu

    Abstract: We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates based on $n$ i.i.d. samples are known to have lower accuracy than the classical $n^{- \frac{1}{2}}$ error. We investigate whether the Expectation-Maximization (EM) algorithm also converges slowly for these models. We provide a rigorous characterization of EM for fitting a weakly identif…

    Submitted 15 November, 2021; v1 submitted 1 February, 2019; originally announced February 2019.

    Comments: 30 pages, 4 figures. The first three authors contributed equally to this work. To appear in AISTATS 2020

  40. arXiv:1812.08305  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Derivative-Free Methods for Policy Optimization: Guarantees for Linear Quadratic Systems

    Authors: Dhruv Malik, Ashwin Pananjady, Kush Bhatia, Koulik Khamaru, Peter L. Bartlett, Martin J. Wainwright

    Abstract: We study derivative-free methods for policy optimization over the class of linear policies. We focus on characterizing the convergence rate of these methods when applied to linear-quadratic systems, and study various settings of driving noise and reward feedback. We show that these methods provably converge to within any pre-specified tolerance of the optimal policy with a number of zero-order eva…

    Submitted 18 May, 2020; v1 submitted 19 December, 2018; originally announced December 2018.

    Comments: Version v3 consistent with paper appearing in JMLR

  41. arXiv:1810.00828  [pdf, other]

    math.ST stat.ML

    Singularity, Misspecification, and the Convergence Rate of EM

    Authors: Raaz Dwivedi, Nhat Ho, Koulik Khamaru, Michael I. Jordan, Martin J. Wainwright, Bin Yu

    Abstract: A line of recent work has analyzed the behavior of the Expectation-Maximization (EM) algorithm in the well-specified setting, in which the population likelihood is locally strongly concave around its maximizing argument. Examples include suitably separated Gaussian mixture models and mixtures of linear regressions. We consider over-specified settings in which the number of fitted components is lar…

    Submitted 28 April, 2020; v1 submitted 1 October, 2018; originally announced October 2018.

    Comments: 63 pages, 12 figures. The first three authors contributed equally to this work. To appear in Annals of Statistics

    MSC Class: Primary 62F15; 62G05; secondary 62G20

  42. arXiv:1808.02610  [pdf, other]

    cs.LG stat.ML

    L-Shapley and C-Shapley: Efficient Model Interpretation for Structured Data

    Authors: Jianbo Chen, Le Song, Martin J. Wainwright, Michael I. Jordan

    Abstract: We study instancewise feature importance scoring as a method for model interpretation. Any such method yields, for each predicted instance, a vector of importance scores associated with the feature vector. Methods based on the Shapley score have been proposed as a fair way of computing feature attributions of this kind, but incur an exponential complexity in the number of features. This combinator…

    Submitted 7 August, 2018; originally announced August 2018.

  43. arXiv:1806.09544  [pdf, other]

    stat.ML cs.IT cs.LG math.ST

    Towards Optimal Estimation of Bivariate Isotonic Matrices with Unknown Permutations

    Authors: Cheng Mao, Ashwin Pananjady, Martin J. Wainwright

    Abstract: Many applications, including rank aggregation, crowd-labeling, and graphon estimation, can be modeled in terms of a bivariate isotonic matrix with unknown permutations acting on its rows and/or columns. We consider the problem of estimating an unknown matrix in this class, based on noisy observations of (possibly, a subset of) its entries. We design and analyze polynomial-time algorithms that impr…

    Submitted 26 October, 2019; v1 submitted 25 June, 2018; originally announced June 2018.

    Comments: 60 pages, 1 figure. This paper is a longer version of the paper arXiv:1802.09963 v3, which appeared in part as a 4-page extended abstract at Conference on Learning Theory (COLT) 2018. This paper studies the problem in more general settings and in another error metric. This version corrects a statement in Theorem 2 of v1

  44. arXiv:1804.09629  [pdf, other]

    stat.ML cs.LG math.OC

    Convergence guarantees for a class of non-convex and non-smooth optimization problems

    Authors: Koulik Khamaru, Martin J. Wainwright

    Abstract: We consider the problem of finding critical points of functions that are non-convex and non-smooth. Studying a fairly broad class of such problems, we analyze the behavior of three gradient-based methods (gradient descent, proximal update, and Frank-Wolfe update). For each of these methods, we establish rates of convergence for general problems, and also prove faster rates for continuous sub-analy…

    Submitted 25 April, 2018; originally announced April 2018.

    Comments: 50 pages, 2 figures

  45. arXiv:1802.09963  [pdf, other]

    stat.ML cs.IT cs.LG math.ST

    Breaking the $1/\sqrt{n}$ Barrier: Faster Rates for Permutation-based Models in Polynomial Time

    Authors: Cheng Mao, Ashwin Pananjady, Martin J. Wainwright

    Abstract: Many applications, including rank aggregation and crowd-labeling, can be modeled in terms of a bivariate isotonic matrix with unknown permutations acting on its rows and columns. We consider the problem of estimating such a matrix based on noisy observations of a subset of its entries, and design and analyze a polynomial-time algorithm that improves upon the state of the art. In particular, our re…

    Submitted 5 June, 2018; v1 submitted 27 February, 2018; originally announced February 2018.

    Comments: 30 pages, 1 figure. Accepted for presentation at Conference on Learning Theory (COLT) 2018

  46. arXiv:1802.07814  [pdf, other]

    cs.LG cs.AI stat.ML

    Learning to Explain: An Information-Theoretic Perspective on Model Interpretation

    Authors: Jianbo Chen, Le Song, Martin J. Wainwright, Michael I. Jordan

    Abstract: We introduce instancewise feature selection as a methodology for model interpretation. Our method is based on learning a function to extract a subset of features that are most informative for each given example. This feature selector is trained to maximize the mutual information between selected features and the response variable, where the conditional distribution of the response variable given t…

    Submitted 13 June, 2018; v1 submitted 21 February, 2018; originally announced February 2018.

    Comments: Accepted to ICML 2018 as a long oral

  47. arXiv:1801.02309  [pdf, other]

    stat.ML stat.CO

    Log-concave sampling: Metropolis-Hastings algorithms are fast

    Authors: Raaz Dwivedi, Yuansi Chen, Martin J. Wainwright, Bin Yu

    Abstract: We consider the problem of sampling from a strongly log-concave density in $\mathbb{R}^d$, and prove a non-asymptotic upper bound on the mixing time of the Metropolis-adjusted Langevin algorithm (MALA). The method draws samples by simulating a Markov chain obtained from the discretization of an appropriate Langevin diffusion, combined with an accept-reject step. Relative to known guarantees for th…

    Submitted 10 December, 2019; v1 submitted 8 January, 2018; originally announced January 2018.

    Comments: 42 pages, 11 Figures; The first two authors contributed equally; A subset of results were presented in an extended abstract at COLT 2018

    Journal ref: Journal of Machine Learning Research, 2019

  48. arXiv:1801.01253  [pdf, other]

    cs.LG cs.AI cs.IT stat.ML

    Approximate Ranking from Pairwise Comparisons

    Authors: Reinhard Heckel, Max Simchowitz, Kannan Ramchandran, Martin J. Wainwright

    Abstract: A common problem in machine learning is to rank a set of n items based on pairwise comparisons. Here ranking refers to partitioning the items into sets of pre-specified sizes according to their scores, which includes identification of the top-k items as the most prominent special case. The score of a given item is defined as the probability that it beats a randomly chosen other item. Finding an ex…

    Submitted 4 January, 2018; originally announced January 2018.

    Comments: AISTATS 2017

  49. arXiv:1710.08165  [pdf, other]

    stat.ML

    Fast MCMC sampling algorithms on polytopes

    Authors: Yuansi Chen, Raaz Dwivedi, Martin J. Wainwright, Bin Yu

    Abstract: We propose and analyze two new MCMC sampling algorithms, the Vaidya walk and the John walk, for generating samples from the uniform distribution over a polytope. Both random walks are sampling algorithms derived from interior point methods. The former is based on volumetric-logarithmic barrier introduced by Vaidya whereas the latter uses John's ellipsoids. We show that the Vaidya walk mixes in sig…

    Submitted 6 March, 2019; v1 submitted 23 October, 2017; originally announced October 2017.

    Comments: 86 pages, 9 figures, First two authors contributed equally

    Journal ref: The Journal of Machine Learning Research, 19(1), 2146-2231 (2019)

  50. arXiv:1710.00499  [pdf, other]

    stat.ME cs.LG math.ST stat.ML

    Online control of the false discovery rate with decaying memory

    Authors: Aaditya Ramdas, Fanny Yang, Martin J. Wainwright, Michael I. Jordan

    Abstract: In the online multiple testing problem, p-values corresponding to different null hypotheses are observed one by one, and the decision of whether or not to reject the current hypothesis must be made immediately, after which the next p-value is observed. Alpha-investing algorithms to control the false discovery rate (FDR), formulated by Foster and Stine, have been generalized and applied to many set…

    Submitted 2 October, 2017; originally announced October 2017.

    Comments: 20 pages, 4 figures. Published in the proceedings of NIPS 2017