Showing 1–50 of 89 results for author: Póczós, B

  1. arXiv:2007.12948  [pdf, ps, other]

    eess.AS cs.LG cs.SD stat.ML

    Nonlinear ISA with Auxiliary Variables for Learning Speech Representations

    Authors: Amrith Setlur, Barnabas Poczos, Alan W Black

    Abstract: This paper extends recent work on nonlinear Independent Component Analysis (ICA) by introducing a theoretical framework for nonlinear Independent Subspace Analysis (ISA) in the presence of auxiliary variables. Observed high dimensional acoustic features like log Mel spectrograms can be considered as surface level manifestations of nonlinear transformations over individual multivariate sources of i…

    Submitted 25 July, 2020; originally announced July 2020.

    Comments: To be presented at Interspeech 2020

  2. arXiv:2007.02523  [pdf, other]

    cs.LG stat.ML

    Covariate Distribution Aware Meta-learning

    Authors: Amrith Setlur, Saket Dingliwal, Barnabas Poczos

    Abstract: Meta-learning has proven to be successful for few-shot learning across the regression, classification, and reinforcement learning paradigms. Recent approaches have adopted Bayesian interpretations to improve gradient-based meta-learners by quantifying the uncertainty of the post-adaptation estimates. Most of these works almost completely ignore the latent relationship between the covariate distrib…

    Submitted 27 November, 2020; v1 submitted 6 July, 2020; originally announced July 2020.

    Journal ref: ICML 2020 Lifelong Learning Workshop

  3. arXiv:2004.08597  [pdf, other]

    math.ST cs.LG stat.ML

    Robust Density Estimation under Besov IPM Losses

    Authors: Ananya Uppal, Shashank Singh, Barnabas Poczos

    Abstract: We study minimax convergence rates of nonparametric density estimation in the Huber contamination model, in which a proportion of the data comes from an unknown outlier distribution. We provide the first results for this problem under a large family of losses, called Besov integral probability metrics (IPMs), that includes $\mathcal{L}^p$, Wasserstein, Kolmogorov-Smirnov, and other common distance…

    Submitted 6 September, 2021; v1 submitted 18 April, 2020; originally announced April 2020.

  4. arXiv:2004.05665  [pdf, other]

    cs.LG stat.ML

    Minimizing FLOPs to Learn Efficient Sparse Representations

    Authors: Biswajit Paria, Chih-Kuan Yeh, Ian E. H. Yen, Ning Xu, Pradeep Ravikumar, Barnabás Póczos

    Abstract: Deep representation learning has become one of the most widely adopted approaches for visual search, recommendation, and identification. Retrieval of such representations from a large database is however computationally challenging. Approximate methods based on learning compact representations, have been widely explored for this problem, such as locality sensitive hashing, product quantization, an…

    Submitted 12 April, 2020; originally announced April 2020.

    Comments: Published at ICLR 2020

  5. arXiv:2002.08528  [pdf, other]

    cs.LG math.OC stat.ML

    Adaptive Sampling Distributed Stochastic Variance Reduced Gradient for Heterogeneous Distributed Datasets

    Authors: Ilqar Ramazanli, Han Nguyen, Hai Pham, Sashank J. Reddi, Barnabas Poczos

    Abstract: We study distributed optimization algorithms for minimizing the average of \emph{heterogeneous} functions distributed across several machines with a focus on communication efficiency. In such settings, naively using the classical stochastic gradient descent (SGD) or its variants (e.g., SVRG) with a uniform sampling of machines typically yields poor performance. It often leads to the dependence of…

    Submitted 17 November, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

  6. arXiv:2002.02431  [pdf, other]

    cs.LG stat.ML

    Optimal Exact Matrix Completion Under new Parametrization

    Authors: Ilqar Ramazanli, Barnabas Poczos

    Abstract: We study the problem of exact completion for $m \times n$ sized matrix of rank $r$ with the adaptive sampling method. We introduce a relation of the exact completion problem with the sparsest vector of column and row spaces (which we call \textit{sparsity-number} here). Using this relation, we propose matrix completion algorithms that exactly recovers the target matrix. These algorithms are supe…

    Submitted 4 March, 2022; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: It has been decided different sections of this work to be part of different projects

  7. arXiv:2001.10119  [pdf, other]

    cs.LG stat.ML

    Unsupervised Program Synthesis for Images By Sampling Without Replacement

    Authors: Chenghui Zhou, Chun-Liang Li, Barnabas Poczos

    Abstract: Program synthesis has emerged as a successful approach to the image parsing task. Most prior works rely on a two-step scheme involving supervised pretraining of a Seq2Seq model with synthetic programs followed by reinforcement learning (RL) for fine-tuning with real reference images. Fully unsupervised approaches promise to train the model directly on the target images without requiring curated pr…

    Submitted 14 June, 2021; v1 submitted 27 January, 2020; originally announced January 2020.

    Comments: Accepted to UAI 2021

    Journal ref: UAI 2021

  8. arXiv:1912.10787  [pdf, other]

    cs.GR cs.LG stat.ML

    Learned Interpolation for 3D Generation

    Authors: Austin Dill, Songwei Ge, Eunsu Kang, Chun-Liang Li, Barnabas Poczos

    Abstract: In order to generate novel 3D shapes with machine learning, one must allow for interpolation. The typical approach for incorporating this creative process is to interpolate in a learned latent space so as to avoid the problem of generating unrealistic instances by exploiting the model's learned structure. The process of the interpolation is supposed to form a semantically smooth morphing. While th…

    Submitted 24 January, 2020; v1 submitted 8 December, 2019; originally announced December 2019.

    Comments: Creativity and Design Workshop at NeurIPS 2019

  9. arXiv:1911.07427  [pdf, other]

    cs.LG stat.ML

    RotationOut as a Regularization Method for Neural Network

    Authors: Kai Hu, Barnabas Poczos

    Abstract: In this paper, we propose a novel regularization method, RotationOut, for neural networks. Different from Dropout that handles each neuron/channel independently, RotationOut regards its input layer as an entire vector and introduces regularization by randomly rotating the vector. RotationOut can also be used in convolutional layers and recurrent layers with small modifications. We further use a no…

    Submitted 17 November, 2019; originally announced November 2019.

    Comments: 20 pages, 8 figures

  10. arXiv:1910.10211  [pdf, other]

    cs.LG stat.ML

    Better Approximate Inference for Partial Likelihood Models with a Latent Structure

    Authors: Amrith Setlur, Barnabás Póczós

    Abstract: Temporal Point Processes (TPP) with partial likelihoods involving a latent structure often entail an intractable marginalization, thus making inference hard. We propose a novel approach to Maximum Likelihood Estimation (MLE) involving approximate inference over the latent variables by minimizing a tight upper bound on the approximation gap. Given a discrete latent variable $Z$, the proposed approx…

    Submitted 19 December, 2019; v1 submitted 22 October, 2019; originally announced October 2019.

    Journal ref: NeurIPS 2019 Workshop on Learning with Temporal Point Processes

  11. arXiv:1908.07587  [pdf, other]

    cs.LG cs.AI cs.GR stat.ML

    Developing Creative AI to Generate Sculptural Objects

    Authors: Songwei Ge, Austin Dill, Eunsu Kang, Chun-Liang Li, Lingyao Zhang, Manzil Zaheer, Barnabas Poczos

    Abstract: We explore the intersection of human and machine creativity by generating sculptural objects through machine learning. This research raises questions about both the technical details of automatic art generation and the interaction between AI and people, as both artists and the audience of art. We introduce two algorithms for generating 3D point clouds and then discuss their actualization as sculpt…

    Submitted 20 August, 2019; originally announced August 2019.

    Comments: In the Proceedings of International Symposium on Electronic Art (ISEA 2019)

  12. arXiv:1908.01425  [pdf, other]

    cs.LG physics.chem-ph stat.ML

    ChemBO: Bayesian Optimization of Small Organic Molecules with Synthesizable Recommendations

    Authors: Ksenia Korovina, Sailun Xu, Kirthevasan Kandasamy, Willie Neiswanger, Barnabas Poczos, Jeff Schneider, Eric P. Xing

    Abstract: In applications such as molecule design or drug discovery, it is desirable to have an algorithm which recommends new candidate molecules based on the results of past tests. These molecules first need to be synthesized and then tested for objective properties. We describe ChemBO, a Bayesian optimization framework for generating and optimizing organic molecules for desired molecular properties. Whil…

    Submitted 21 October, 2019; v1 submitted 4 August, 2019; originally announced August 2019.

  13. arXiv:1906.08809  [pdf, other]

    cs.LG cs.AI stat.ML

    A Deep Reinforcement Learning Approach for Global Routing

    Authors: Haiguang Liao, Wentai Zhang, Xuliang Dong, Barnabas Poczos, Kenji Shimada, Levent Burak Kara

    Abstract: Global routing has been a historically challenging problem in electronic circuit design, where the challenge is to connect a large and arbitrary number of circuit components with wires without violating the design rules for the printed circuit boards or integrated circuits. Similar routing problems also exist in the design of complex hydraulic systems, pipe systems and logistic networks. Existing…

    Submitted 20 June, 2019; originally announced June 2019.

    Comments: Preprint submitted to ASME JMD

  14. arXiv:1905.13192  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels

    Authors: Simon S. Du, Kangcheng Hou, Barnabás Póczos, Ruslan Salakhutdinov, Ruosong Wang, Keyulu Xu

    Abstract: While graph kernels (GKs) are easy to train and enjoy provable theoretical guarantees, their practical performances are limited by their expressive power, as the kernel function often depends on hand-crafted combinatorial features of graphs. Compared to graph kernels, graph neural networks (GNNs) usually achieve better practical performance, as GNNs use multi-layer architectures and non-linear act…

    Submitted 4 November, 2019; v1 submitted 30 May, 2019; originally announced May 2019.

    Comments: In NeurIPS 2019. Code available: https://github.com/KangchengHou/gntk

  15. arXiv:1903.09848  [pdf, other]

    cs.CL cs.LG stat.ML

    Competence-based Curriculum Learning for Neural Machine Translation

    Authors: Emmanouil Antonios Platanios, Otilia Stretcu, Graham Neubig, Barnabas Poczos, Tom M. Mitchell

    Abstract: Current state-of-the-art NMT systems use large neural networks that are not only slow to train, but also often require many heuristics and optimization tricks, such as specialized learning rate schedules and large batch sizes. This is undesirable as it requires extensive hyperparameter tuning. In this paper, we propose a curriculum learning framework for NMT that reduces training time, reduces the…

    Submitted 26 March, 2019; v1 submitted 23 March, 2019; originally announced March 2019.

    Journal ref: NAACL 2019

  16. arXiv:1903.06694  [pdf, other]

    stat.ML cs.AI cs.LG

    Tuning Hyperparameters without Grad Students: Scalable and Robust Bayesian Optimisation with Dragonfly

    Authors: Kirthevasan Kandasamy, Karun Raju Vysyaraju, Willie Neiswanger, Biswajit Paria, Christopher R. Collins, Jeff Schneider, Barnabas Poczos, Eric P. Xing

    Abstract: Bayesian Optimisation (BO) refers to a suite of techniques for global optimisation of expensive black box functions, which use introspective Bayesian models of the function to efficiently search for the optimum. While BO has been applied successfully in many applications, modern optimisation tasks usher in new challenges where conventional methods fail spectacularly. In this work, we present Drago…

    Submitted 19 April, 2020; v1 submitted 15 March, 2019; originally announced March 2019.

    Comments: Journal of Machine Learning Research 2020, Special Issue on Bayesian Optimization

  17. arXiv:1902.10214  [pdf, other]

    stat.ML cs.AI cs.LG

    Implicit Kernel Learning

    Authors: Chun-Liang Li, Wei-Cheng Chang, Youssef Mroueh, Yiming Yang, Barnabás Póczos

    Abstract: Kernels are powerful and versatile tools in machine learning and statistics. Although the notion of universal kernels and characteristic kernels has been studied, kernel selection still greatly influences the empirical performance. While learning the kernel in a data driven way has been investigated, in this paper we explore learning the spectral distribution of kernel via implicit generative mode…

    Submitted 26 February, 2019; originally announced February 2019.

    Comments: In the Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019)

  18. arXiv:1902.03511  [pdf, other]

    math.ST cs.IT cs.LG stat.ML

    Nonparametric Density Estimation & Convergence Rates for GANs under Besov IPM Losses

    Authors: Ananya Uppal, Shashank Singh, Barnabás Póczos

    Abstract: We study the problem of estimating a nonparametric probability density under a large family of losses called Besov IPMs, which include, for example, $\mathcal{L}^p$ distances, total variation distance, and generalizations of both Wasserstein and Kolmogorov-Smirnov distances. For a wide variety of settings, we provide both lower and upper bounds, identifying precisely how the choice of loss functio…

    Submitted 13 January, 2020; v1 submitted 9 February, 2019; originally announced February 2019.

    Comments: Advances in Neural Information Processing Systems. 2019

  19. arXiv:1901.11515  [pdf, other]

    cs.LG cs.AI stat.ML

    ProBO: Versatile Bayesian Optimization Using Any Probabilistic Programming Language

    Authors: Willie Neiswanger, Kirthevasan Kandasamy, Barnabas Poczos, Jeff Schneider, Eric Xing

    Abstract: Optimizing an expensive-to-query function is a common task in science and engineering, where it is beneficial to keep the number of queries to a minimum. A popular strategy is Bayesian optimization (BO), which leverages probabilistic models for this task. Most BO today uses Gaussian processes (GPs), or a few other surrogate models. However, there is a broad set of Bayesian modeling techniques that…

    Submitted 4 July, 2019; v1 submitted 31 January, 2019; originally announced January 2019.

  20. arXiv:1901.06077  [pdf, other]

    stat.ML cs.LG

    Kernel Change-point Detection with Auxiliary Deep Generative Models

    Authors: Wei-Cheng Chang, Chun-Liang Li, Yiming Yang, Barnabás Póczos

    Abstract: Detecting the emergence of abrupt property changes in time series is a challenging problem. Kernel two-sample test has been studied for this task which makes fewer assumptions on the distributions than traditional parametric approaches. However, selecting kernels is non-trivial in practice. Although kernel selection for two-sample test has been studied, the insufficient samples in change point det…

    Submitted 17 January, 2019; originally announced January 2019.

    Comments: To appear in ICLR 2019

  21. arXiv:1812.07809  [pdf, other]

    cs.LG cs.CL cs.CV cs.HC stat.ML

    Found in Translation: Learning Robust Joint Representations by Cyclic Translations Between Modalities

    Authors: Hai Pham, Paul Pu Liang, Thomas Manzini, Louis-Philippe Morency, Barnabas Poczos

    Abstract: Multimodal sentiment analysis is a core research area that studies speaker sentiment expressed from the language, visual, and acoustic modalities. The central challenge in multimodal learning involves inferring joint representations that can process and relate information from these modalities. However, existing work learns joint representations by requiring all modalities as input and as a result…

    Submitted 28 February, 2020; v1 submitted 19 December, 2018; originally announced December 2018.

    Comments: AAAI 2019, code available at https://github.com/hainow/MCTN

  22. arXiv:1811.09751  [pdf, other]

    cs.LG stat.ML

    Characterizing and Avoiding Negative Transfer

    Authors: Zirui Wang, Zihang Dai, Barnabás Póczos, Jaime Carbonell

    Abstract: When labeled data is scarce for a specific target task, transfer learning often offers an effective solution by utilizing data from a related source task. However, when transferring knowledge from a less related source, it may inversely hurt the target performance, a phenomenon known as negative transfer. Despite its pervasiveness, negative transfer is usually described in an informal manner, lack…

    Submitted 4 October, 2019; v1 submitted 23 November, 2018; originally announced November 2018.

    Comments: Published at CVPR 2019

  23. arXiv:1810.05795  [pdf, other]

    cs.LG stat.ML

    Point Cloud GAN

    Authors: Chun-Liang Li, Manzil Zaheer, Yang Zhang, Barnabas Poczos, Ruslan Salakhutdinov

    Abstract: Generative Adversarial Networks (GAN) can achieve promising performance on learning complex data distributions on different types of data. In this paper, we first show a straightforward extension of existing GAN algorithm is not applicable to point clouds, because the constraint required for discriminators is undefined for set data. We propose a two fold modification to GAN algorithm for learning…

    Submitted 13 October, 2018; originally announced October 2018.

  24. arXiv:1810.02054  [pdf, other]

    cs.LG math.OC stat.ML

    Gradient Descent Provably Optimizes Over-parameterized Neural Networks

    Authors: Simon S. Du, Xiyu Zhai, Barnabas Poczos, Aarti Singh

    Abstract: One of the mysteries in the success of neural networks is randomly initialized first order methods like gradient descent can achieve zero training loss even though the objective function is non-convex and non-smooth. This paper demystifies this surprising phenomenon for two-layer fully connected ReLU activated neural networks. For an $m$ hidden node shallow neural network with ReLU activation and…

    Submitted 4 February, 2019; v1 submitted 4 October, 2018; originally announced October 2018.

    Comments: ICLR 2019

  25. arXiv:1807.03915  [pdf, other]

    cs.CL cs.LG stat.ML

    Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis

    Authors: Hai Pham, Thomas Manzini, Paul Pu Liang, Barnabas Poczos

    Abstract: Multimodal machine learning is a core research area spanning the language, visual and acoustic modalities. The central challenge in multimodal learning involves learning representations that can process and relate information from multiple modalities. In this paper, we propose two methods for unsupervised learning of joint multimodal representations using sequence to sequence (Seq2Seq) methods: a…

    Submitted 6 August, 2018; v1 submitted 10 July, 2018; originally announced July 2018.

    Comments: 8 pages of content, 11 pages total, 2 figures. Published as a workshop paper at ACL 2018, Proceedings of Grand Challenge and Workshop on Human Multimodal Language (Challenge-HML). 2018

  26. arXiv:1805.12168  [pdf, other]

    cs.LG stat.ML

    A Flexible Framework for Multi-Objective Bayesian Optimization using Random Scalarizations

    Authors: Biswajit Paria, Kirthevasan Kandasamy, Barnabás Póczos

    Abstract: Many real world applications can be framed as multi-objective optimization problems, where we wish to simultaneously optimize for multiple criteria. Bayesian optimization techniques for the multi-objective setting are pertinent when the evaluation of the functions in question are expensive. Traditional methods for multi-objective optimization, both Bayesian and otherwise, are aimed at recovering t…

    Submitted 20 June, 2019; v1 submitted 30 May, 2018; originally announced May 2018.

    Comments: Accepted to UAI 2019

  27. arXiv:1805.09964  [pdf, ps, other]

    stat.ML cs.AI cs.LG

    Myopic Bayesian Design of Experiments via Posterior Sampling and Probabilistic Programming

    Authors: Kirthevasan Kandasamy, Willie Neiswanger, Reed Zhang, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos

    Abstract: We design a new myopic strategy for a wide class of sequential design of experiment (DOE) problems, where the goal is to collect data in order to fulfil a certain problem specific goal. Our approach, Myopic Posterior Sampling (MPS), is inspired by the classical posterior (Thompson) sampling algorithm for multi-armed bandits and leverages the flexibility of probabilistic programming and approxim…

    Submitted 24 May, 2018; originally announced May 2018.

  28. arXiv:1805.09460  [pdf, other]

    stat.ML cs.AI cs.CV cs.LG stat.ME

    Cautious Deep Learning

    Authors: Yotam Hechtlinger, Barnabás Póczos, Larry Wasserman

    Abstract: Most classifiers operate by selecting the maximum of an estimate of the conditional distribution $p(y|x)$ where $x$ stands for the features of the instance to be classified and $y$ denotes its label. This often results in a {\em hubristic bias}: overconfidence in the assignment of a definite label. Usually, the observations are concentrated on a small volume but the classifier provides definite pr…

    Submitted 27 February, 2019; v1 submitted 23 May, 2018; originally announced May 2018.

  29. arXiv:1805.08836  [pdf, other]

    math.ST cs.IT stat.ML

    Nonparametric Density Estimation under Adversarial Losses

    Authors: Shashank Singh, Ananya Uppal, Boyue Li, Chun-Liang Li, Manzil Zaheer, Barnabás Póczos

    Abstract: We study minimax convergence rates of nonparametric density estimation under a large class of loss functions called "adversarial losses", which, besides classical $\mathcal{L}^p$ losses, includes maximum mean discrepancy (MMD), Wasserstein distance, and total variation distance. These losses are closely related to the losses encoded by discriminator networks in generative adversarial networks (GAN…

    Submitted 28 October, 2018; v1 submitted 22 May, 2018; originally announced May 2018.

  30. arXiv:1803.11451  [pdf, ps, other]

    math.ST cs.IT stat.ML

    Minimax Estimation of Quadratic Fourier Functionals

    Authors: Shashank Singh, Bharath K. Sriperumbudur, Barnabás Póczos

    Abstract: We study estimation of (semi-)inner products between two nonparametric probability distributions, given IID samples from each distribution. These products include relatively well-studied classical $\mathcal{L}^2$ and Sobolev inner products, as well as those induced by translation-invariant reproducing kernels, for which we believe our results are the first. We first propose estimators for these qu…

    Submitted 1 September, 2018; v1 submitted 30 March, 2018; originally announced March 2018.

  31. arXiv:1802.08855  [pdf, ps, other]

    math.ST cs.IT cs.LG stat.ML

    Minimax Distribution Estimation in Wasserstein Distance

    Authors: Shashank Singh, Barnabás Póczos

    Abstract: The Wasserstein metric is an important measure of distance between probability distributions, with applications in machine learning, statistics, probability theory, and data analysis. This paper provides upper and lower bounds on statistical minimax rates for the problem of estimating a probability distribution under Wasserstein loss, using only metric properties, such as covering and packing numb…

    Submitted 6 November, 2019; v1 submitted 24 February, 2018; originally announced February 2018.

  32. arXiv:1802.07191  [pdf, other]

    cs.LG stat.ML

    Neural Architecture Search with Bayesian Optimisation and Optimal Transport

    Authors: Kirthevasan Kandasamy, Willie Neiswanger, Jeff Schneider, Barnabas Poczos, Eric Xing

    Abstract: Bayesian Optimisation (BO) refers to a class of methods for global optimisation of a function $f$ which is only accessible via point evaluations. It is typically used in settings where $f$ is expensive to evaluate. A common use case for BO in machine learning is model selection, where it is not possible to analytically model the generalisation performance of a statistical model, and we resort to n…

    Submitted 15 March, 2019; v1 submitted 11 February, 2018; originally announced February 2018.

    Journal ref: Neural Information Processing Systems (NeurIPS) 2018

  33. arXiv:1802.04420  [pdf, other]

    cs.LG stat.ML

    Towards Understanding the Generalization Bias of Two Layer Convolutional Linear Classifiers with Gradient Descent

    Authors: Yifan Wu, Barnabas Poczos, Aarti Singh

    Abstract: A major challenge in understanding the generalization of deep learning is to explain why (stochastic) gradient descent can exploit the network architecture to find solutions that have good generalization performance when using high capacity models. We find simple but realistic examples showing that this phenomenon exists even when learning linear classifiers --- between two linear networks with th…

    Submitted 9 February, 2019; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: The 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019)

  34. arXiv:1801.09819  [pdf, other]

    stat.ML

    Transformation Autoregressive Networks

    Authors: Junier B. Oliva, Avinava Dubey, Manzil Zaheer, Barnabás Póczos, Ruslan Salakhutdinov, Eric P. Xing, Jeff Schneider

    Abstract: The fundamental task of general density estimation $p(x)$ has been of keen interest to machine learning. In this work, we attempt to systematically characterize methods for density estimation. Broadly speaking, most of the existing methods can be categorized into either using: \textit{a}) autoregressive models to estimate the conditional factors of the chain rule, $p(x_{i}\, |\, x_{i-1}, \ldots)$;…

    Submitted 23 October, 2018; v1 submitted 29 January, 2018; originally announced January 2018.

    Journal ref: ICML 2018

  35. arXiv:1712.00779  [pdf, other]

    cs.LG cs.AI cs.CV math.OC stat.ML

    Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima

    Authors: Simon S. Du, Jason D. Lee, Yuandong Tian, Barnabas Poczos, Aarti Singh

    Abstract: We consider the problem of learning a one-hidden-layer neural network with non-overlapping convolutional layer and ReLU activation, i.e., $f(\mathbf{Z}, \mathbf{w}, \mathbf{a}) = \sum_j a_j\sigma(\mathbf{w}^T\mathbf{Z}_j)$, in which both the convolutional weights $\mathbf{w}$ and the output weights $\mathbf{a}$ are parameters to be learned. When the labels are the outputs from a teacher network of the…

    Submitted 14 June, 2018; v1 submitted 3 December, 2017; originally announced December 2017.

    Comments: Accepted by ICML 2018

  36. arXiv:1711.02033  [pdf, other]

    astro-ph.CO cs.LG stat.ML

    Estimating Cosmological Parameters from the Dark Matter Distribution

    Authors: Siamak Ravanbakhsh, Junier Oliva, Sebastien Fromenteau, Layne C. Price, Shirley Ho, Jeff Schneider, Barnabas Poczos

    Abstract: A grand challenge of the 21st century cosmology is to accurately estimate the cosmological parameters of our Universe. A major approach to estimating the cosmological parameters is to use the large-scale matter distribution of the Universe. Galaxy surveys provide the means to map out cosmic large-scale structure in three dimensions. Information about galaxy locations is typically summarized in a "…

    Submitted 6 November, 2017; originally announced November 2017.

    Comments: ICML 2016

  37. arXiv:1708.08587  [pdf, other]

    math.ST cs.IT cs.LG stat.ML

    On the Reconstruction Risk of Convolutional Sparse Dictionary Learning

    Authors: Shashank Singh, Barnabás Póczos, Jian Ma

    Abstract: Sparse dictionary learning (SDL) has become a popular method for adaptively identifying parsimonious representations of a dataset, a fundamental problem in machine learning and signal processing. While most work on SDL assumes a training dataset of independent and identically distributed samples, a variant known as convolutional sparse dictionary learning (CSDL) relaxes this assumption, allowing m…

    Submitted 24 February, 2018; v1 submitted 29 August, 2017; originally announced August 2017.

  38. arXiv:1705.10750  [pdf, other]

    cs.LG stat.ML

    Recurrent Estimation of Distributions

    Authors: Junier B. Oliva, Kumar Avinava Dubey, Barnabas Poczos, Eric Xing, Jeff Schneider

    Abstract: This paper presents the recurrent estimation of distributions (RED) for modeling real-valued data in a semiparametric fashion. RED models make two novel uses of recurrent neural networks (RNNs) for density estimation of general real-valued data. First, RNNs are used to transform input covariates into a latent space to better capture conditional dependencies in inputs. After, an RNN is used to comp…

    Submitted 30 May, 2017; originally announced May 2017.

  39. arXiv:1705.10412  [pdf, other]

    math.OC cs.LG stat.ML

    Gradient Descent Can Take Exponential Time to Escape Saddle Points

    Authors: Simon S. Du, Chi Jin, Jason D. Lee, Michael I. Jordan, Barnabas Poczos, Aarti Singh

    Abstract: Although gradient descent (GD) almost always escapes saddle points asymptotically [Lee et al., 2016], this paper shows that even with fairly natural random initialization schemes and non-pathological functions, GD can be significantly slowed down by saddle points, taking exponential time to escape. On the other hand, gradient descent with perturbations [Ge et al., 2015, Jin et al., 2017] is not sl…

    Submitted 5 November, 2017; v1 submitted 29 May, 2017; originally announced May 2017.

    Comments: Accepted by NIPS 2017

  40. arXiv:1705.09236  [pdf, other]

    stat.ML cs.LG

    Asynchronous Parallel Bayesian Optimisation via Thompson Sampling

    Authors: Kirthevasan Kandasamy, Akshay Krishnamurthy, Jeff Schneider, Barnabas Poczos

    Abstract: We design and analyse variations of the classical Thompson sampling (TS) procedure for Bayesian optimisation (BO) in settings where function evaluations are expensive, but can be performed in parallel. Our theoretical analysis shows that a direct application of the sequential Thompson sampling algorithm in either synchronous or asynchronous parallel settings yields a surprisingly powerful result:…

    Submitted 25 May, 2017; originally announced May 2017.

  41. arXiv:1705.08584  [pdf, other]

    cs.LG cs.AI stat.ML

    MMD GAN: Towards Deeper Understanding of Moment Matching Network

    Authors: Chun-Liang Li, Wei-Cheng Chang, Yu Cheng, Yiming Yang, Barnabás Póczos

    Abstract: Generative moment matching network (GMMN) is a deep generative model that differs from Generative Adversarial Network (GAN) by replacing the discriminator in GAN with a two-sample test based on kernel maximum mean discrepancy (MMD). Although some theoretical guarantees of MMD have been studied, the empirical performance of GMMN is still not as competitive as that of GAN on challenging and large be…

    Submitted 27 November, 2017; v1 submitted 23 May, 2017; originally announced May 2017.

    Comments: In the Proceedings of Thirty-first Annual Conference on Neural Information Processing Systems (NIPS 2017)

  42. arXiv:1705.08525  [pdf, other]

    cs.LG stat.ML

    Data-driven Random Fourier Features using Stein Effect

    Authors: Wei-Cheng Chang, Chun-Liang Li, Yiming Yang, Barnabas Poczos

    Abstract: Large-scale kernel approximation is an important problem in machine learning research. Approaches using random Fourier features have become increasingly popular [Rahimi and Recht, 2007], where kernel approximation is treated as empirical mean estimation via Monte Carlo (MC) or Quasi-Monte Carlo (QMC) integration [Yang et al., 2014]. A limitation of the current approaches is that all the features r…

    Submitted 23 May, 2017; originally announced May 2017.

    Comments: To appear in International Joint Conference on Artificial Intelligence (IJCAI), 2017

  43. arXiv:1703.06240  [pdf, ps, other

    stat.ML

    Multi-fidelity Bayesian Optimisation with Continuous Approximations

    Authors: Kirthevasan Kandasamy, Gautam Dasarathy, Jeff Schneider, Barnabas Poczos

    Abstract: Bandit methods for black-box optimisation, such as Bayesian optimisation, are used in a variety of applications including hyper-parameter tuning and experiment design. Recently, \emph{multi-fidelity} methods have garnered considerable attention since function evaluations have become increasingly expensive in such applications. Multi-fidelity methods use cheap approximations to the function of inte…

    Submitted 17 March, 2017; originally announced March 2017.

  44. arXiv:1703.06114  [pdf, other

    cs.LG stat.ML

    Deep Sets

    Authors: Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan Salakhutdinov, Alexander Smola

    Abstract: We study the problem of designing models for machine learning tasks defined on \emph{sets}. In contrast to the traditional approach of operating on fixed dimensional vectors, we consider objective functions defined on sets that are invariant to permutations. Such problems are widespread, ranging from estimation of population statistics \cite{poczos13aistats}, to anomaly detection in piezometer data of…

    Submitted 14 April, 2018; v1 submitted 10 March, 2017; originally announced March 2017.

    Comments: NIPS 2017

  45. arXiv:1703.00381  [pdf, other

    cs.LG cs.AI stat.ML

    The Statistical Recurrent Unit

    Authors: Junier B. Oliva, Barnabas Poczos, Jeff Schneider

    Abstract: Sophisticated gated recurrent neural network architectures like LSTMs and GRUs have been shown to be highly effective in a myriad of applications. We develop an un-gated unit, the statistical recurrent unit (SRU), that is able to learn long term dependencies in data by only keeping moving averages of statistics. The SRU's architecture is simple, un-gated, and contains a comparable number of parame…

    Submitted 1 March, 2017; originally announced March 2017.

  46. arXiv:1702.08389  [pdf, other

    stat.ML cs.NE

    Equivariance Through Parameter-Sharing

    Authors: Siamak Ravanbakhsh, Jeff Schneider, Barnabas Poczos

    Abstract: We propose to study equivariance in deep neural networks through parameter symmetries. In particular, given a group $\mathcal{G}$ that acts discretely on the input and output of a standard neural network layer $φ_{W}: \Re^{M} \to \Re^{N}$, we show that $φ_{W}$ is equivariant with respect to $\mathcal{G}$-action iff $\mathcal{G}$ explains the symmetries of the network parameters $W$. Inspired by th…

    Submitted 13 June, 2017; v1 submitted 27 February, 2017; originally announced February 2017.

    Comments: icml'17

  47. arXiv:1702.07803  [pdf, ps, other

    math.ST cs.IT stat.ML

    Nonparanormal Information Estimation

    Authors: Shashank Singh, Barnabás Póczos

    Abstract: We study the problem of using i.i.d. samples from an unknown multivariate probability distribution $p$ to estimate the mutual information of $p$. This problem has recently received attention in two settings: (1) where $p$ is assumed to be Gaussian and (2) where $p$ is assumed only to lie in a large nonparametric smoothness class. Estimators proposed for the Gaussian case converge in high dimension…

    Submitted 24 February, 2017; originally announced February 2017.

  48. Query Efficient Posterior Estimation in Scientific Experiments via Bayesian Active Learning

    Authors: Kirthevasan Kandasamy, Jeff Schneider, Barnabás Póczos

    Abstract: A common problem in disciplines of applied Statistics research such as Astrostatistics is of estimating the posterior distribution of relevant parameters. Typically, the likelihoods for such models are computed via expensive experiments such as cosmological simulations of the universe. An urgent challenge in these research domains is to develop methods that can estimate the posterior with few like…

    Submitted 3 February, 2017; originally announced February 2017.

    Comments: Published in the Artificial Intelligence Journal (AIJ), Feb 2017 and International Joint Conference on Artificial Intelligence (IJCAI) 2015

  49. arXiv:1612.01020  [pdf, other

    stat.ML cs.LG

    Hypothesis Transfer Learning via Transformation Functions

    Authors: Simon Shaolei Du, Jayanth Koushik, Aarti Singh, Barnabas Poczos

    Abstract: We consider the Hypothesis Transfer Learning (HTL) problem where one incorporates a hypothesis trained on the source domain into the learning procedure of the target domain. Existing theoretical analysis either only studies specific algorithms or only presents upper bounds on the generalization error but not on the excess risk. In this paper, we propose a unified algorithm-dependent framework for…

    Submitted 5 November, 2017; v1 submitted 3 December, 2016; originally announced December 2016.

    Comments: Accepted by NIPS 2017

  50. arXiv:1611.04500  [pdf, other

    stat.ML cs.LG cs.NE

    Deep Learning with Sets and Point Clouds

    Authors: Siamak Ravanbakhsh, Jeff Schneider, Barnabas Poczos

    Abstract: We introduce a simple permutation equivariant layer for deep learning with set structure. This type of layer, obtained by parameter-sharing, has a simple implementation and linear-time complexity in the size of each set. We use deep permutation-invariant networks to perform point-cloud classification and MNIST-digit summation, where in both cases the output is invariant to permutations of the input…

    Submitted 23 February, 2017; v1 submitted 14 November, 2016; originally announced November 2016.