Showing 1–14 of 14 results for author: Zhai, X

Searching in archive stat.
  1. arXiv:2011.03395  [pdf, other]

    cs.LG stat.ML

    Underspecification Presents Challenges for Credibility in Modern Machine Learning

    Authors: Alexander D'Amour, Katherine Heller, Dan Moldovan, Ben Adlam, Babak Alipanahi, Alex Beutel, Christina Chen, Jonathan Deaton, Jacob Eisenstein, Matthew D. Hoffman, Farhad Hormozdiari, Neil Houlsby, Shaobo Hou, Ghassen Jerfel, Alan Karthikesalingam, Mario Lucic, Yian Ma, Cory McLean, Diana Mincu, Akinori Mitani, Andrea Montanari, Zachary Nado, Vivek Natarajan, Christopher Nielson, Thomas F. Osborne, et al. (15 additional authors not shown)

    Abstract: ML models often exhibit unexpectedly poor behavior when they are deployed in real-world domains. We identify underspecification as a key reason for these failures. An ML pipeline is underspecified when it can return many predictors with equivalently strong held-out performance in the training domain. Underspecification is common in modern ML pipelines, such as those based on deep learning. Predict…

    Submitted 24 November, 2020; v1 submitted 6 November, 2020; originally announced November 2020.

    Comments: Updates: Updated statistical analysis in Section 6; Additional citations

  2. arXiv:2004.04894  [pdf]

    cs.LG eess.SP stat.ML

    Fully Automatic Electrocardiogram Classification System based on Generative Adversarial Network with Auxiliary Classifier

    Authors: Zhanhong Zhou, Xiaolong Zhai, Chung Tin

    Abstract: A generative adversarial network (GAN) based fully automatic electrocardiogram (ECG) arrhythmia classification system with high performance is presented in this paper. The generator (G) in our GAN is designed to generate various coupling matrix inputs conditioned on different arrhythmia classes for data augmentation. Our designed discriminator (D) is trained on both real and generated ECG coupling…

    Submitted 4 March, 2021; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: Accepted for publication in Expert Systems with Applications

    Journal ref: Expert Systems with Applications, Volume 174, 2021, 114809, ISSN 0957-4174

  3. arXiv:1910.04867  [pdf, other]

    cs.CV cs.LG stat.ML

    A Large-scale Study of Representation Learning with the Visual Task Adaptation Benchmark

    Authors: Xiaohua Zhai, Joan Puigcerver, Alexander Kolesnikov, Pierre Ruyssen, Carlos Riquelme, Mario Lucic, Josip Djolonga, Andre Susano Pinto, Maxim Neumann, Alexey Dosovitskiy, Lucas Beyer, Olivier Bachem, Michael Tschannen, Marcin Michalski, Olivier Bousquet, Sylvain Gelly, Neil Houlsby

    Abstract: Representation learning promises to unlock deep learning for the long tail of vision tasks without expensive labelled datasets. Yet, the absence of a unified evaluation for general visual representations hinders progress. Popular protocols are often too constrained (linear classification), limited in diversity (ImageNet, CIFAR, Pascal-VOC), or only weakly related to representation quality (ELBO, r…

    Submitted 21 February, 2020; v1 submitted 1 October, 2019; originally announced October 2019.

  4. arXiv:1908.10292  [pdf, other]

    math.ST cs.LG stat.ML

    On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels

    Authors: Tengyuan Liang, Alexander Rakhlin, Xiyu Zhai

    Abstract: We study the risk of minimum-norm interpolants of data in Reproducing Kernel Hilbert Spaces. Our upper bounds on the risk are of a multiple-descent shape for the various scalings of $d = n^{\alpha}$, $\alpha \in (0,1)$, for the input dimension $d$ and sample size $n$. Empirical evidence supports our finding that minimum-norm interpolants in RKHS can exhibit this unusual non-monotonicity in sample size; furtherm…

    Submitted 3 February, 2020; v1 submitted 27 August, 2019; originally announced August 2019.

    Journal ref: Proceedings of the 33rd Conference on Learning Theory 125 (2020) 2683-2711

  5. arXiv:1906.11289   

    cs.LG stat.ML

    Near Optimal Stratified Sampling

    Authors: Tiancheng Yu, Xiyu Zhai, Suvrit Sra

    Abstract: The performance of a machine learning system is usually evaluated by using i.i.d. observations with true labels. However, acquiring ground truth labels is expensive, while obtaining unlabeled samples may be cheaper. Stratified sampling can be beneficial in such settings and can reduce the number of true labels required without compromising the evaluation accuracy. Stratified sampling exploits sta…

    Submitted 26 July, 2019; v1 submitted 26 June, 2019; originally announced June 2019.

    Comments: We have discovered a mistake in the main result. The quantity on the RHS of (3) is not equal to the variance of estimator (2) when the sampling rule is designed adaptively as we do. There will be further cross-product terms which are now dominant terms. Therefore, although our bound is correct for (3), it no longer implies bound of the variance of (2)

  6. arXiv:1903.02271  [pdf, other]

    cs.LG cs.CV stat.ML

    High-Fidelity Image Generation With Fewer Labels

    Authors: Mario Lucic, Michael Tschannen, Marvin Ritter, Xiaohua Zhai, Olivier Bachem, Sylvain Gelly

    Abstract: Deep generative models are becoming a cornerstone of modern machine learning. Recent work on conditional generative adversarial networks has shown that learning complex, high-dimensional distributions over natural images is within reach. While the latest models are able to generate high-fidelity, diverse natural images at high resolution, they rely on a vast quantity of labeled data. In this work…

    Submitted 14 May, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: Mario Lucic, Michael Tschannen, and Marvin Ritter contributed equally to this work. ICML 2019 camera-ready version. Code available at https://github.com/google/compare_gan

  7. arXiv:1812.11167  [pdf, ps, other]

    stat.ML cs.LG math.ST

    Consistency of Interpolation with Laplace Kernels is a High-Dimensional Phenomenon

    Authors: Alexander Rakhlin, Xiyu Zhai

    Abstract: We show that minimum-norm interpolation in the Reproducing Kernel Hilbert Space corresponding to the Laplace kernel is not consistent if input dimension is constant. The lower bound holds for any choice of kernel bandwidth, even if selected based on data. The result supports the empirical observation that minimum-norm interpolation (that is, exact fit to training data) in RKHS generalizes well for…

    Submitted 28 December, 2018; originally announced December 2018.

  8. arXiv:1811.11212  [pdf, other]

    cs.LG cs.CV stat.ML

    Self-Supervised GANs via Auxiliary Rotation Loss

    Authors: Ting Chen, Xiaohua Zhai, Marvin Ritter, Mario Lucic, Neil Houlsby

    Abstract: Conditional GANs are at the forefront of natural image synthesis. The main drawback of such models is the necessity for labeled data. In this work we exploit two popular unsupervised learning techniques, adversarial training and self-supervision, and take a step towards bridging the gap between conditional and unconditional GANs. In particular, we allow the networks to collaborate on the task of r…

    Submitted 9 April, 2019; v1 submitted 27 November, 2018; originally announced November 2018.

  9. arXiv:1811.03804  [pdf, ps, other]

    cs.LG cs.AI cs.CV math.OC stat.ML

    Gradient Descent Finds Global Minima of Deep Neural Networks

    Authors: Simon S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Xiyu Zhai

    Abstract: Gradient descent finds a global minimum in training deep neural networks despite the objective function being non-convex. The current paper proves gradient descent achieves zero training loss in polynomial time for a deep over-parameterized neural network with residual connections (ResNet). Our analysis relies on the particular structure of the Gram matrix induced by the neural network architectur…

    Submitted 28 May, 2019; v1 submitted 9 November, 2018; originally announced November 2018.

    Comments: ICML 2019

  10. arXiv:1810.11598  [pdf, other]

    cs.LG cs.CV stat.ML

    Self-Supervised GAN to Counter Forgetting

    Authors: Ting Chen, Xiaohua Zhai, Neil Houlsby

    Abstract: GANs involve training two networks in an adversarial game, where each network's task depends on its adversary. Recently, several works have framed GAN training as an online or continual learning problem. We focus on the discriminator, which must perform classification under an (adversarially) shifting data distribution. When trained on sequential tasks, neural networks exhibit forgetting. F…

    Submitted 29 November, 2018; v1 submitted 27 October, 2018; originally announced October 2018.

    Comments: NeurIPS'18 Continual Learning workshop

  11. arXiv:1810.02054  [pdf, other]

    cs.LG math.OC stat.ML

    Gradient Descent Provably Optimizes Over-parameterized Neural Networks

    Authors: Simon S. Du, Xiyu Zhai, Barnabas Poczos, Aarti Singh

    Abstract: One of the mysteries in the success of neural networks is randomly initialized first order methods like gradient descent can achieve zero training loss even though the objective function is non-convex and non-smooth. This paper demystifies this surprising phenomenon for two-layer fully connected ReLU activated neural networks. For an $m$ hidden node shallow neural network with ReLU activation and…

    Submitted 4 February, 2019; v1 submitted 4 October, 2018; originally announced October 2018.

    Comments: ICLR 2019

  12. arXiv:1807.04720  [pdf, other]

    cs.LG stat.ML

    A Large-Scale Study on Regularization and Normalization in GANs

    Authors: Karol Kurach, Mario Lucic, Xiaohua Zhai, Marcin Michalski, Sylvain Gelly

    Abstract: Generative adversarial networks (GANs) are a class of deep generative models which aim to learn a target distribution in an unsupervised fashion. While they were successfully applied to many problems, training a GAN is a notoriously challenging task and requires a significant number of hyperparameter tuning, neural architecture engineering, and a non-trivial amount of "tricks". The success in many…

    Submitted 14 May, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: Revision accepted to ICML'19: More focus on regularization and normalization aspects. Added recent references and promising future directions

  13. arXiv:1805.07883  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG

    How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?

    Authors: Simon S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh

    Abstract: It is widely believed that the practical success of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) owes to the fact that CNNs and RNNs use a more compact parametric representation than their Fully-Connected Neural Network (FNN) counterparts, and consequently require fewer training examples to accurately estimate their parameters. We initiate the study of rigorously chara…

    Submitted 29 June, 2019; v1 submitted 20 May, 2018; originally announced May 2018.

    Comments: Revised version, with new results on recurrent neural networks. Preliminary version in NeurIPS 2018

  14. arXiv:1707.05947  [pdf, other]

    cs.LG math.OC stat.ML

    Generalization Bounds of SGLD for Non-convex Learning: Two Theoretical Viewpoints

    Authors: Wenlong Mou, Liwei Wang, Xiyu Zhai, Kai Zheng

    Abstract: Algorithm-dependent generalization error bounds are central to statistical learning theory. A learning algorithm may use a large hypothesis space, but the limited number of iterations controls its model capacity and generalization error. The impacts of stochastic gradient methods on generalization error for non-convex learning problems not only have important theoretical consequences, but are also…

    Submitted 19 July, 2017; originally announced July 2017.