Showing 1–23 of 23 results for author: Ye, M

Searching in archive stat. Search in all archives.
  1. arXiv:2302.00248  [pdf, ps, other]

    cs.DS cs.LG stat.ML

    A Nearly-Optimal Bound for Fast Regression with $\ell_\infty$ Guarantee

    Authors: Zhao Song, Mingquan Ye, Junze Yin, Lichen Zhang

    Abstract: Given a matrix $A\in \mathbb{R}^{n\times d}$ and a vector $b\in \mathbb{R}^n$, we consider the regression problem with $\ell_\infty$ guarantees: finding a vector $x'\in \mathbb{R}^d$ such that $\|x'-x^*\|_\infty \leq \frac{\epsilon}{\sqrt{d}}\cdot \|Ax^*-b\|_2\cdot \|A^\dagger\|$ where $x^*=\arg\min_{x\in \mathbb{R}^d}\|Ax-b\|_2$. One popular approach for solving such $\ell_2$ regression problem is via sk…

    Submitted 1 February, 2023; originally announced February 2023.

    Comments: Abstract shortened to meet arxiv requirement

  2. arXiv:2110.08720  [pdf, other]

    cs.LG stat.ML

    Centroid Approximation for Bootstrap: Improving Particle Quality at Inference

    Authors: Mao Ye, Qiang Liu

    Abstract: Bootstrap is a principled and powerful frequentist statistical tool for uncertainty quantification. Unfortunately, standard bootstrap methods are computationally intensive due to the need of drawing a large i.i.d. bootstrap sample to approximate the ideal bootstrap distribution; this largely hinders their application in large-scale machine learning, especially deep learning problems. In this work,…

    Submitted 31 August, 2022; v1 submitted 17 October, 2021; originally announced October 2021.

  3. arXiv:2103.07861  [pdf, other]

    cs.LG stat.ML

    VCNet and Functional Targeted Regularization For Learning Causal Effects of Continuous Treatments

    Authors: Lizhen Nie, Mao Ye, Qiang Liu, Dan Nicolae

    Abstract: Motivated by the rising abundance of observational data with continuous treatments, we investigate the problem of estimating the average dose-response curve (ADRF). Available parametric methods are limited in their model space, and previous attempts in leveraging neural network to enhance model expressiveness relied on partitioning continuous treatment into blocks and using separate heads for each…

    Submitted 14 March, 2021; originally announced March 2021.

  4. arXiv:2010.15969  [pdf, other]

    cs.LG math.OC stat.ML

    Greedy Optimization Provably Wins the Lottery: Logarithmic Number of Winning Tickets is Enough

    Authors: Mao Ye, Lemeng Wu, Qiang Liu

    Abstract: Despite the great success of deep learning, recent works show that large deep neural networks are often highly redundant and can be significantly reduced in size. However, the theoretical question of how much we can prune a neural network given a specified tolerance of accuracy drop is still open. This paper provides one answer to this question by proposing a greedy optimization based pruning meth…

    Submitted 29 October, 2020; originally announced October 2020.

    Journal ref: NeurIPS 2020

  5. arXiv:2007.00811  [pdf, other]

    cs.LG stat.ML

    Go Wide, Then Narrow: Efficient Training of Deep Thin Networks

    Authors: Denny Zhou, Mao Ye, Chen Chen, Tianjian Meng, Mingxing Tan, Xiaodan Song, Quoc Le, Qiang Liu, Dale Schuurmans

    Abstract: For deploying a deep learning model into production, it needs to be both accurate and compact to meet the latency and memory constraints. This usually results in a network that is deep (to ensure performance) and yet thin (to improve computational efficiency). In this paper, we propose an efficient method to train a deep thin network with a theoretic guarantee. Our method is motivated by model com…

    Submitted 17 August, 2020; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: ICML 2020

  6. arXiv:2005.14424  [pdf, other]

    cs.LG cs.CL cs.CR stat.ML

    SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions

    Authors: Mao Ye, Chengyue Gong, Qiang Liu

    Abstract: State-of-the-art NLP models can often be fooled by human-unaware transformations such as synonymous word substitution. For security reasons, it is of critical importance to develop models with certified robustness that can provably guarantee that the prediction cannot be altered by any possible synonymous word substitution. In this work, we propose a certified robust method based on a new rand…

    Submitted 29 May, 2020; originally announced May 2020.

    Comments: ACL 2020

  7. arXiv:2005.14359  [pdf, other]

    cs.LG stat.ML

    Unsupervised Feature Selection via Multi-step Markov Transition Probability

    Authors: Yan Min, Mao Ye, Liang Tian, Yulin Jian, Ce Zhu, Shangming Yang

    Abstract: Feature selection is a widely used dimension reduction technique to select feature subsets because of its interpretability. Many methods have been proposed and achieved good results, in which the relationships between adjacent data points are mainly concerned. But the possible associations between data pairs that may not be adjacent are always neglected. Different from previous methods, we propos…

    Submitted 28 May, 2020; originally announced May 2020.

  8. arXiv:2004.05944  [pdf, ps, other]

    math.PR cs.IT stat.ML

    Exact recovery and sharp thresholds of Stochastic Ising Block Model

    Authors: Min Ye

    Abstract: The stochastic block model (SBM) is a random graph model in which the edges are generated according to the underlying cluster structure on the vertices. The (ferromagnetic) Ising model, on the other hand, assigns $\pm 1$ labels to vertices according to an underlying graph structure in a way that if two vertices are connected in the graph then they are more likely to be assigned the same label. In…

    Submitted 14 October, 2020; v1 submitted 13 April, 2020; originally announced April 2020.

    Comments: Fixed a gap in the original proof of Theorem 5. The new proof of Theorem 5 relies on Lemma 5, which is the main new element in this version

  9. arXiv:2003.10392  [pdf, other]

    cs.LG stat.ML

    Steepest Descent Neural Architecture Optimization: Escaping Local Optimum with Signed Neural Splitting

    Authors: Lemeng Wu, Mao Ye, Qi Lei, Jason D. Lee, Qiang Liu

    Abstract: Developing efficient and principled neural architecture optimization methods is a critical challenge of modern deep learning. Recently, Liu et al. [19] proposed a splitting steepest descent (S2D) method that jointly optimizes the neural parameters and architectures based on progressively growing network structures by splitting neurons into multiple copies in a steepest descent fashion. However, S2D…

    Submitted 20 June, 2021; v1 submitted 23 March, 2020; originally announced March 2020.

  10. arXiv:2003.01794  [pdf, other]

    cs.LG stat.ML

    Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection

    Authors: Mao Ye, Chengyue Gong, Lizhen Nie, Denny Zhou, Adam Klivans, Qiang Liu

    Abstract: Recent empirical works show that large deep neural networks are often highly redundant and one can find much smaller subnetworks without a significant drop of accuracy. However, most existing methods of network pruning are empirical and heuristic, leaving it open whether good subnetworks provably exist, how to find them efficiently, and if network pruning can be provably better than direct trainin…

    Submitted 19 October, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

    Comments: ICML 2020

  11. arXiv:2002.09169  [pdf, other]

    cs.LG cs.CR math.OC stat.ML

    Black-Box Certification with Randomized Smoothing: A Functional Optimization Based Framework

    Authors: Dinghuai Zhang, Mao Ye, Chengyue Gong, Zhanxing Zhu, Qiang Liu

    Abstract: Randomized classifiers have been shown to provide a promising approach for achieving certified robustness against adversarial attacks in deep learning. However, most existing methods only leverage Gaussian smoothing noise and only work for $\ell_2$ perturbation. We propose a general framework of adversarial certification with non-Gaussian noise and for more general types of attacks, from a unified…

    Submitted 20 October, 2020; v1 submitted 21 February, 2020; originally announced February 2020.

    Comments: Accepted by NeurIPS 2020

  12. arXiv:2002.09070  [pdf, other]

    cs.LG stat.ML

    Stein Self-Repulsive Dynamics: Benefits From Past Samples

    Authors: Mao Ye, Tongzheng Ren, Qiang Liu

    Abstract: We propose a new Stein self-repulsive dynamics for obtaining diversified samples from intractable un-normalized distributions. Our idea is to introduce Stein variational gradient as a repulsive force to push the samples of Langevin dynamics away from the past trajectories. This simple idea allows us to significantly decrease the auto-correlation in Langevin dynamics and hence increase the effectiv…

    Submitted 15 December, 2020; v1 submitted 20 February, 2020; originally announced February 2020.

    Journal ref: NeurIPS 2020

  13. arXiv:2002.09049  [pdf, other]

    cs.LG cs.CV stat.ML

    Post-training Quantization with Multiple Points: Mixed Precision without Mixed Precision

    Authors: Xingchao Liu, Mao Ye, Dengyong Zhou, Qiang Liu

    Abstract: We consider the post-training quantization problem, which discretizes the weights of pre-trained deep neural networks without re-training the model. We propose multipoint quantization, a quantization method that approximates a full-precision weight vector using a linear combination of multiple vectors of low-bit numbers; this is in contrast to typical quantization methods that approximate each wei…

    Submitted 14 January, 2021; v1 submitted 20 February, 2020; originally announced February 2020.

    Comments: Accepted by AAAI 2021

  14. arXiv:2002.09024  [pdf, ps, other]

    cs.LG stat.ML

    MaxUp: A Simple Way to Improve Generalization of Neural Network Training

    Authors: Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu

    Abstract: We propose \emph{MaxUp}, an embarrassingly simple, highly effective technique for improving the generalization performance of machine learning models, especially deep neural networks. The idea is to generate a set of augmented data with some random perturbations or transforms and minimize the maximum, or worst case loss over the augmented data. By doing so, we implicitly introduce a smoothness or…

    Submitted 20 February, 2020; originally announced February 2020.

  15. arXiv:2002.02919  [pdf, ps, other]

    stat.CO cs.LG stat.ML

    Extended Stochastic Gradient MCMC for Large-Scale Bayesian Variable Selection

    Authors: Qifan Song, Yan Sun, Mao Ye, Faming Liang

    Abstract: Stochastic gradient Markov chain Monte Carlo (MCMC) algorithms have received much attention in Bayesian computing for big data problems, but they are only applicable to a small class of problems for which the parameter space has a fixed dimension and the log-posterior density is differentiable with respect to the parameters. This paper proposes an extended stochastic gradient MCMC algorithm which,…

    Submitted 7 February, 2020; originally announced February 2020.

  16. arXiv:1912.03321  [pdf, other]

    cs.LG stat.ML

    Robust Deep Graph Based Learning for Binary Classification

    Authors: Minxiang Ye, Vladimir Stankovic, Lina Stankovic, Gene Cheung

    Abstract: Convolutional neural network (CNN)-based feature learning has become state of the art, since given sufficient training data, CNN can significantly outperform traditional methods for various classification tasks. However, feature learning becomes more difficult if some training labels are noisy. With traditional regularization techniques, CNN often overfits to the noisy training labels, resulting i…

    Submitted 6 December, 2019; originally announced December 2019.

  17. arXiv:1811.07455  [pdf, other]

    cs.LG cs.CG stat.ML

    On Geometric Alignment in Low Doubling Dimension

    Authors: Hu Ding, Mingquan Ye

    Abstract: In the real world, many problems can be formulated as the alignment between two geometric patterns. Previously, a great amount of research has focused on the alignment of 2D or 3D patterns, especially in the field of computer vision. Recently, the alignment of geometric patterns in high dimension has found several novel applications and has attracted more and more attention. However, the research is still rat…

    Submitted 18 November, 2018; originally announced November 2018.

  18. arXiv:1810.03545  [pdf, other]

    stat.ML cs.LG

    Stein Neural Sampler

    Authors: Tianyang Hu, Zixiang Chen, Hanxi Sun, Jincheng Bai, Mao Ye, Guang Cheng

    Abstract: We propose two novel samplers to generate high-quality samples from a given (un-normalized) probability density. Motivated by the success of generative adversarial networks, we construct our samplers using deep neural networks that transform a reference distribution to the target distribution. Training schemes are developed to minimize two variations of the Stein discrepancy, which is designed to…

    Submitted 8 February, 2021; v1 submitted 8 October, 2018; originally announced October 2018.

  19. arXiv:1808.02474  [pdf, other]

    cs.CV cs.LG stat.ML

    Multi-Label Zero-Shot Learning with Transfer-Aware Label Embedding Projection

    Authors: Meng Ye, Yuhong Guo

    Abstract: Zero-shot learning transfers knowledge from seen classes to novel unseen classes to reduce human labor of labelling data for building new classifiers. Much effort on zero-shot learning, however, has focused on the standard multi-class setting; the more challenging multi-label zero-shot problem has received limited attention. In this paper we propose a transfer-aware embedding projection approach to…

    Submitted 7 August, 2018; originally announced August 2018.

  20. arXiv:1805.07473  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Progressive Ensemble Networks for Zero-Shot Recognition

    Authors: Meng Ye, Yuhong Guo

    Abstract: Despite the advancement of supervised image recognition algorithms, their dependence on the availability of labeled data and the rapid expansion of image categories raise the significant challenge of zero-shot learning. Zero-shot learning (ZSL) aims to transfer knowledge from labeled classes into unlabeled classes to reduce human labeling effort. In this paper, we propose a novel progressive ensem…

    Submitted 6 April, 2019; v1 submitted 18 May, 2018; originally announced May 2018.

    Comments: CVPR19

  21. arXiv:1804.07275  [pdf, other]

    cs.LG stat.ML

    Deep Triplet Ranking Networks for One-Shot Recognition

    Authors: Meng Ye, Yuhong Guo

    Abstract: Despite the breakthroughs achieved by deep learning models in conventional supervised learning scenarios, their dependence on sufficient labeled training data in each class prevents effective applications of these deep models in situations where labeled training instances for a subset of novel classes are very sparse -- in the extreme case only one instance is available for each class. To tackle t…

    Submitted 19 April, 2018; originally announced April 2018.

  22. arXiv:1802.05251  [pdf, other]

    cs.LG cs.CR stat.ML

    Differentially Private Empirical Risk Minimization Revisited: Faster and More General

    Authors: Di Wang, Minwei Ye, Jinhui Xu

    Abstract: In this paper we study the differentially private Empirical Risk Minimization (ERM) problem in different settings. For smooth (strongly) convex loss function with or without (non)-smooth regularization, we give algorithms that achieve either optimal or near optimal utility bounds with less gradient complexity compared with previous work. For ERM with smooth convex loss function in high-dimensional…

    Submitted 14 February, 2018; originally announced February 2018.

    Comments: Thirty-first Annual Conference on Neural Information Processing Systems (NIPS-2017)

  23. arXiv:1802.03475  [pdf, other]

    stat.ML cs.DC cs.IT cs.LG

    Communication-Computation Efficient Gradient Coding

    Authors: Min Ye, Emmanuel Abbe

    Abstract: This paper develops coding techniques to reduce the running time of distributed learning tasks. It characterizes the fundamental tradeoff to compute gradients (and more generally vector summations) in terms of three parameters: computation load, straggler tolerance and communication cost. It further gives an explicit coding scheme that achieves the optimal tradeoff based on recursive polynomial co…

    Submitted 9 February, 2018; originally announced February 2018.