Showing 1–34 of 34 results for author: Snoek, J

Searching in archive stat.
  1. arXiv:2404.11599  [pdf, other]

    cs.LG cs.CV stat.ML

    Variational Bayesian Last Layers

    Authors: James Harrison, John Willes, Jasper Snoek

    Abstract: We introduce a deterministic variational formulation for training Bayesian last layer neural networks. This yields a sampling-free, single-pass model and loss that effectively improves uncertainty estimation. Our variational Bayesian last layer (VBLL) can be trained and evaluated with only quadratic complexity in last layer width, and is thus (nearly) computationally free to add to standard archit…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: International Conference on Learning Representations (ICLR) 2024

  2. arXiv:2303.05420  [pdf, other]

    stat.ML cs.CV cs.LG

    Kernel Regression with Infinite-Width Neural Networks on Millions of Examples

    Authors: Ben Adlam, Jaehoon Lee, Shreyas Padhy, Zachary Nado, Jasper Snoek

    Abstract: Neural kernels have drastically increased performance on diverse and nonstandard data modalities but require significantly more compute, which previously limited their application to smaller datasets. In this work, we address this by massively parallelizing their computation across many GPUs. We combine this with a distributed, preconditioned conjugate gradients algorithm to enable kernel regressi…

    Submitted 9 March, 2023; originally announced March 2023.

  3. arXiv:2207.07411  [pdf, other]

    cs.LG stat.ML

    Plex: Towards Reliability using Pretrained Large Model Extensions

    Authors: Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, et al. (1 additional author not shown)

    Abstract: A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures. Probing these models' abilities in diverse ways is therefore critical to the field. In this paper, we explore the reliability of models, where we define a reliable model as one that not only achieves strong predictive per…

    Submitted 15 July, 2022; originally announced July 2022.

    Comments: Code available at https://goo.gle/plex-code

  4. arXiv:2207.03084  [pdf, other]

    cs.LG cs.AI stat.ML

    Pre-training helps Bayesian optimization too

    Authors: Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zelda Mariet, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani

    Abstract: Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions to deploy BO successfully. Such domain knowledge often manifests in Gaussian process priors that specify initial beliefs o…

    Submitted 7 July, 2022; originally announced July 2022.

    Comments: ICML2022 Workshop on Adaptive Experimental Design and Active Learning in the Real World. arXiv admin note: substantial text overlap with arXiv:2109.08215

  5. arXiv:2205.00403  [pdf, other]

    cs.LG stat.ML

    A Simple Approach to Improve Single-Model Deep Uncertainty via Distance-Awareness

    Authors: Jeremiah Zhe Liu, Shreyas Padhy, Jie Ren, Zi Lin, Yeming Wen, Ghassen Jerfel, Zack Nado, Jasper Snoek, Dustin Tran, Balaji Lakshminarayanan

    Abstract: Accurate uncertainty quantification is a major challenge in deep learning, as neural networks can make overconfident errors and assign high confidence predictions to out-of-distribution (OOD) inputs. The most popular approaches to estimate predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ens…

    Submitted 30 December, 2022; v1 submitted 1 May, 2022; originally announced May 2022.

    Comments: arXiv admin note: text overlap with arXiv:2006.10108

  6. arXiv:2110.03360  [pdf, other]

    cs.LG cs.CV stat.ML

    Sparse MoEs meet Efficient Ensembles

    Authors: James Urquhart Allingham, Florian Wenzel, Zelda E Mariet, Basil Mustafa, Joan Puigcerver, Neil Houlsby, Ghassen Jerfel, Vincent Fortuin, Balaji Lakshminarayanan, Jasper Snoek, Dustin Tran, Carlos Riquelme Ruiz, Rodolphe Jenatton

    Abstract: Machine learning models based on the aggregated outputs of submodels, either at the activation or prediction levels, often exhibit strong performance compared to individual models. We study the interplay of two popular classes of such models: ensembles of neural networks and sparse mixture of experts (sparse MoEs). First, we show that the two approaches have complementary features whose combinatio…

    Submitted 9 July, 2023; v1 submitted 7 October, 2021; originally announced October 2021.

    Comments: 59 pages, 26 figures, 36 tables. Accepted at TMLR

  7. arXiv:2109.08215  [pdf, other]

    cs.LG stat.ML

    Pre-trained Gaussian processes for Bayesian optimization

    Authors: Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zelda Mariet, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani

    Abstract: Bayesian optimization (BO) has become a popular strategy for global optimization of many expensive real-world functions. Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions to deploy BO successfully. Such domain knowledge often manifests in Gaussian process priors that specify initial beliefs o…

    Submitted 6 July, 2022; v1 submitted 16 September, 2021; originally announced September 2021.

  8. arXiv:2010.09875  [pdf, other]

    cs.LG stat.ML

    Combining Ensembles and Data Augmentation can Harm your Calibration

    Authors: Yeming Wen, Ghassen Jerfel, Rafael Muller, Michael W. Dusenberry, Jasper Snoek, Balaji Lakshminarayanan, Dustin Tran

    Abstract: Ensemble methods which average over multiple neural network predictions are a simple approach to improve a model's calibration and robustness. Similarly, data augmentation techniques, which encode prior information in the form of invariant feature transformations, are effective for improving calibration and robustness. In this paper, we show a surprising pathology: combining ensembles and data aug…

    Submitted 22 March, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

  9. arXiv:2010.07355  [pdf, other]

    stat.ML cs.LG

    Exploring the Uncertainty Properties of Neural Networks' Implicit Priors in the Infinite-Width Limit

    Authors: Ben Adlam, Jaehoon Lee, Lechao Xiao, Jeffrey Pennington, Jasper Snoek

    Abstract: Modern deep learning models have achieved great success in predictive accuracy for many data modalities. However, their application to many real-world tasks is restricted by poor uncertainty estimates, such as overconfidence on out-of-distribution (OOD) data and ungraceful failing under distributional shift. Previous benchmarks have found that ensembles of neural networks (NNs) are typically the b…

    Submitted 14 October, 2020; originally announced October 2020.

    Comments: 23 pages, 11 figures

  10. arXiv:2010.06610  [pdf, other]

    cs.LG cs.CV stat.ML

    Training independent subnetworks for robust prediction

    Authors: Marton Havasi, Rodolphe Jenatton, Stanislav Fort, Jeremiah Zhe Liu, Jasper Snoek, Balaji Lakshminarayanan, Andrew M. Dai, Dustin Tran

    Abstract: Recent approaches to efficiently ensemble neural networks have shown that strong robustness and uncertainty performance can be achieved with a negligible gain in parameters over the original network. However, these methods still require multiple forward passes for prediction, leading to a significant computational cost. In this work, we show a surprising result: the benefits of using multiple pred…

    Submitted 4 August, 2021; v1 submitted 13 October, 2020; originally announced October 2020.

    Comments: Updated to the ICLR camera ready version, added reference to Soflaei et al. 2020

  11. arXiv:2008.01160  [pdf, other]

    eess.AS cs.LG cs.SD stat.ML

    A Spectral Energy Distance for Parallel Speech Synthesis

    Authors: Alexey A. Gritsenko, Tim Salimans, Rianne van den Berg, Jasper Snoek, Nal Kalchbrenner

    Abstract: Speech synthesis is an important practical generative modeling problem that has seen great progress over the last few years, with likelihood-based autoregressive neural models now outperforming traditional concatenative systems. A downside of such autoregressive models is that they require executing tens of thousands of sequential operations per second of generated audio, making them ill-suited fo…

    Submitted 23 October, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

  12. arXiv:2008.00029  [pdf, other]

    stat.ML cs.LG

    Cold Posteriors and Aleatoric Uncertainty

    Authors: Ben Adlam, Jasper Snoek, Samuel L. Smith

    Abstract: Recent work has observed that one can outperform exact inference in Bayesian neural networks by tuning the "temperature" of the posterior on a validation set (the "cold posterior" effect). To help interpret this phenomenon, we argue that commonly used priors in Bayesian neural networks can significantly overestimate the aleatoric uncertainty in the labels on many classification datasets. This prob…

    Submitted 31 July, 2020; originally announced August 2020.

    Comments: 5 pages, 3 figures

    Journal ref: ICML workshop on Uncertainty and Robustness in Deep Learning (2020)

  13. arXiv:2007.05134  [pdf, other]

    cs.LG stat.ML

    Revisiting One-vs-All Classifiers for Predictive Uncertainty and Out-of-Distribution Detection in Neural Networks

    Authors: Shreyas Padhy, Zachary Nado, Jie Ren, Jeremiah Liu, Jasper Snoek, Balaji Lakshminarayanan

    Abstract: Accurate estimation of predictive uncertainty in modern neural networks is critical to achieve well calibrated predictions and detect out-of-distribution (OOD) inputs. The most promising approaches have been predominantly focused on improving model uncertainty (e.g. deep ensembles and Bayesian neural networks) and post-processing techniques for OOD detection (e.g. ODIN and Mahalanobis distance). H…

    Submitted 9 July, 2020; originally announced July 2020.

  14. arXiv:2006.13570  [pdf, other]

    cs.LG stat.ML

    Hyperparameter Ensembles for Robustness and Uncertainty Quantification

    Authors: Florian Wenzel, Jasper Snoek, Dustin Tran, Rodolphe Jenatton

    Abstract: Ensembles over neural network weights trained from different random initialization, known as deep ensembles, achieve state-of-the-art accuracy and calibration. The recently introduced batch ensembles provide a drop-in replacement that is more parameter efficient. In this paper, we design ensembles not only over weights, but over hyperparameters to improve the state of the art in both settings. For…

    Submitted 8 January, 2021; v1 submitted 24 June, 2020; originally announced June 2020.

    Comments: Accepted at NeurIPS 2020

  15. arXiv:2006.10963  [pdf, other]

    cs.LG stat.ML

    Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift

    Authors: Zachary Nado, Shreyas Padhy, D. Sculley, Alexander D'Amour, Balaji Lakshminarayanan, Jasper Snoek

    Abstract: Covariate shift has been shown to sharply degrade both predictive accuracy and the calibration of uncertainty estimates for deep learning models. This is worrying, because covariate shift is prevalent in a wide range of real world deployment settings. However, in this paper, we note that frequently there exists the potential to access small unlabeled batches of the shifted data just before predict…

    Submitted 14 January, 2021; v1 submitted 19 June, 2020; originally announced June 2020.

  16. arXiv:2005.07186  [pdf, other]

    cs.LG stat.ML

    Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors

    Authors: Michael W. Dusenberry, Ghassen Jerfel, Yeming Wen, Yi-An Ma, Jasper Snoek, Katherine Heller, Balaji Lakshminarayanan, Dustin Tran

    Abstract: Bayesian neural networks (BNNs) demonstrate promising success in improving the robustness and uncertainty quantification of modern deep learning. However, they generally struggle with underfitting at scale and parameter efficiency. On the other hand, deep ensembles have emerged as alternatives for uncertainty quantification that, while outperforming BNNs on certain problems, also suffer from effic…

    Submitted 14 August, 2020; v1 submitted 14 May, 2020; originally announced May 2020.

    Comments: Published in the International Conference on Machine Learning (ICML) 2020. Code available at https://github.com/google/edward2

  17. arXiv:2002.09927  [pdf, other]

    cs.LG stat.ML

    Weighting Is Worth the Wait: Bayesian Optimization with Importance Sampling

    Authors: Setareh Ariafar, Zelda Mariet, Ehsan Elhamifar, Dana Brooks, Jennifer Dy, Jasper Snoek

    Abstract: Many contemporary machine learning models require extensive tuning of hyperparameters to perform well. A variety of methods, such as Bayesian optimization, have been developed to automate and expedite this process. However, tuning remains extremely costly as it typically requires repeatedly fully training models. We propose to accelerate the Bayesian optimization approach to hyperparameter tuning…

    Submitted 23 February, 2020; originally announced February 2020.

  18. arXiv:2002.02655  [pdf, other]

    cs.LG stat.ML

    The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks

    Authors: Jakub Swiatkowski, Kevin Roth, Bastiaan S. Veeling, Linh Tran, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

    Abstract: Variational Bayesian Inference is a popular methodology for approximating posterior distributions over Bayesian neural network weights. Recent work developing this class of methods has explored ever richer parameterizations of the approximate posterior in the hope of improving performance. In contrast, here we share a curious experimental finding that suggests instead restricting the variational d…

    Submitted 5 July, 2020; v1 submitted 7 February, 2020; originally announced February 2020.

  19. arXiv:2002.02405  [pdf, other]

    stat.ML cs.LG stat.CO

    How Good is the Bayes Posterior in Deep Neural Networks Really?

    Authors: Florian Wenzel, Kevin Roth, Bastiaan S. Veeling, Jakub Świątkowski, Linh Tran, Stephan Mandt, Jasper Snoek, Tim Salimans, Rodolphe Jenatton, Sebastian Nowozin

    Abstract: During the past five years the Bayesian deep learning community has developed increasingly accurate and efficient approximate inference procedures that allow for Bayesian inference in deep neural networks. However, despite this algorithmic progress and the promise of improved uncertainty quantification and sample efficiency there are---as of early 2020---no publicized deployments of Bayesian neura…

    Submitted 2 July, 2020; v1 submitted 6 February, 2020; originally announced February 2020.

    Comments: Full version (main paper and appendix) of the ICML 2020 publication

  20. arXiv:2001.04694  [pdf, other]

    cs.LG stat.ML

    Hydra: Preserving Ensemble Diversity for Model Distillation

    Authors: Linh Tran, Bastiaan S. Veeling, Kevin Roth, Jakub Swiatkowski, Joshua V. Dillon, Jasper Snoek, Stephan Mandt, Tim Salimans, Sebastian Nowozin, Rodolphe Jenatton

    Abstract: Ensembles of models have been empirically shown to improve predictive performance and to yield robust measures of uncertainty. However, they are expensive in computation and memory. Therefore, recent research has focused on distilling ensembles into a single compact model, reducing the computational and memory burden of the ensemble while trying to preserve its predictive behavior. Most existing d…

    Submitted 19 March, 2021; v1 submitted 14 January, 2020; originally announced January 2020.

    Comments: Accepted to ICML 2020 Workshop on Uncertainty and Robustness in Deep Learning

  21. arXiv:1906.02845  [pdf, other]

    stat.ML cs.LG

    Likelihood Ratios for Out-of-Distribution Detection

    Authors: Jie Ren, Peter J. Liu, Emily Fertig, Jasper Snoek, Ryan Poplin, Mark A. DePristo, Joshua V. Dillon, Balaji Lakshminarayanan

    Abstract: Discriminative neural networks offer little or no performance guarantees when deployed on data not generated by the same process as the training distribution. On such out-of-distribution (OOD) inputs, the prediction may not only be erroneous, but confidently so, limiting the safe deployment of classifiers in real-world applications. One such challenging application is bacteria identification based…

    Submitted 5 December, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: Accepted to NeurIPS 2019

  22. arXiv:1906.02530  [pdf, other]

    stat.ML cs.LG

    Can You Trust Your Model's Uncertainty? Evaluating Predictive Uncertainty Under Dataset Shift

    Authors: Yaniv Ovadia, Emily Fertig, Jie Ren, Zachary Nado, D Sculley, Sebastian Nowozin, Joshua V. Dillon, Balaji Lakshminarayanan, Jasper Snoek

    Abstract: Modern machine learning methods including deep learning have achieved great success in predictive accuracy for supervised learning tasks, but may still fall short in giving useful estimates of their predictive uncertainty. Quantifying uncertainty is especially critical in real-world settings, which often involve input distributions that are shifted from the training distribution due to a var…

    Submitted 17 December, 2019; v1 submitted 6 June, 2019; originally announced June 2019.

    Comments: Advances in Neural Information Processing Systems, 2019

  23. arXiv:1901.06246  [pdf, other]

    cs.CY cs.DL cs.LG stat.ML

    Avoiding a Tragedy of the Commons in the Peer Review Process

    Authors: D Sculley, Jasper Snoek, Alex Wiltschko

    Abstract: Peer review is the foundation of scientific publication, and the task of reviewing has long been seen as a cornerstone of professional service. However, the massive growth in the field of machine learning has put this community benefit under stress, threatening both the sustainability of an effective review process and the overall progress of the field. In this position paper, we argue that a trag…

    Submitted 18 December, 2018; originally announced January 2019.

    Comments: Appeared in the 2018 Advances in Neural Information Processing Systems Workshop on Critiquing and Correcting Trends in Machine Learning

  24. arXiv:1901.02051  [pdf, other]

    stat.ML cs.LG

    DPPNet: Approximating Determinantal Point Processes with Deep Networks

    Authors: Zelda Mariet, Yaniv Ovadia, Jasper Snoek

    Abstract: Determinantal Point Processes (DPPs) provide an elegant and versatile way to sample sets of items that balance the point-wise quality with the set-wise diversity of selected items. For this reason, they have gained prominence in many machine learning applications that rely on subset selection. However, sampling from a DPP over a ground set of size $N$ is a costly operation, requiring in general an…

    Submitted 7 January, 2019; originally announced January 2019.

  25. arXiv:1802.09127  [pdf, other]

    stat.ML cs.LG

    Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling

    Authors: Carlos Riquelme, George Tucker, Jasper Snoek

    Abstract: Recent advances in deep reinforcement learning have made significant strides in performance on applications such as Go and Atari games. However, developing practical methods to balance exploration and exploitation in complex domains remains largely unsolved. Thompson Sampling and its extension to reinforcement learning provide an elegant approach to exploration that only requires access to posteri…

    Submitted 25 February, 2018; originally announced February 2018.

    Comments: Sixth International Conference on Learning Representations, ICLR 2018

  26. arXiv:1802.08665  [pdf, other]

    stat.ML cs.LG

    Learning Latent Permutations with Gumbel-Sinkhorn Networks

    Authors: Gonzalo Mena, David Belanger, Scott Linderman, Jasper Snoek

    Abstract: Permutations and matchings are core building blocks in a variety of latent variable models, as they allow us to align, canonicalize, and sort data. Learning in such models is difficult, however, because exact marginalization over these combinatorial objects is intractable. In response, this paper introduces a collection of new methods for end-to-end learning in such models that approximate discret…

    Submitted 23 February, 2018; originally announced February 2018.

    Journal ref: ICLR 2018

  27. arXiv:1506.03767  [pdf, other]

    stat.ML cs.LG

    Spectral Representations for Convolutional Neural Networks

    Authors: Oren Rippel, Jasper Snoek, Ryan P. Adams

    Abstract: Discrete Fourier transforms provide a significant speedup in the computation of convolutions in deep learning. In this work, we demonstrate that, beyond its advantages for efficient computation, the spectral domain also provides a powerful representation in which to model and train convolutional neural networks (CNNs). We employ spectral representations to introduce a number of innovations to CN…

    Submitted 11 June, 2015; originally announced June 2015.

  28. arXiv:1502.05700  [pdf, other]

    stat.ML

    Scalable Bayesian Optimization Using Deep Neural Networks

    Authors: Jasper Snoek, Oren Rippel, Kevin Swersky, Ryan Kiros, Nadathur Satish, Narayanan Sundaram, Md. Mostofa Ali Patwary, Prabhat, Ryan P. Adams

    Abstract: Bayesian optimization is an effective methodology for the global optimization of functions with expensive evaluations. It relies on querying a distribution over functions defined by a relatively cheap surrogate model. An accurate model for this distribution over functions is critical to the effectiveness of the approach, and is typically fit using Gaussian processes (GPs). However, since GPs scale…

    Submitted 13 July, 2015; v1 submitted 19 February, 2015; originally announced February 2015.

  29. arXiv:1409.4011  [pdf, ps, other]

    stat.ML

    Raiders of the Lost Architecture: Kernels for Bayesian Optimization in Conditional Parameter Spaces

    Authors: Kevin Swersky, David Duvenaud, Jasper Snoek, Frank Hutter, Michael A. Osborne

    Abstract: In practical Bayesian optimization, we must often search over structures with differing numbers of parameters. For instance, we may wish to search over neural network architectures with an unknown number of layers. To relate performance data gathered for different architectures, we define a new kernel for conditional parameter spaces that explicitly includes information about which parameters are…

    Submitted 14 September, 2014; originally announced September 2014.

    Comments: 6 pages, 3 figures. Appeared in the NIPS 2013 workshop on Bayesian optimization

  30. arXiv:1406.3896  [pdf, other]

    stat.ML cs.LG

    Freeze-Thaw Bayesian Optimization

    Authors: Kevin Swersky, Jasper Snoek, Ryan Prescott Adams

    Abstract: In this paper we develop a dynamic form of Bayesian optimization for machine learning models with the goal of rapidly finding good hyperparameter settings. Our method uses the partial information gained during the training of a machine learning model in order to decide whether to pause training and start a new model, or resume the training of a previously-considered model. We specifically tailor o…

    Submitted 15 June, 2014; originally announced June 2014.

  31. arXiv:1403.5607  [pdf, other]

    stat.ML cs.LG

    Bayesian Optimization with Unknown Constraints

    Authors: Michael A. Gelbart, Jasper Snoek, Ryan P. Adams

    Abstract: Recent work on Bayesian optimization has shown its effectiveness in global optimization of difficult black-box objective functions. Many real-world optimization problems of interest also have constraints which are unknown a priori. In this paper, we study Bayesian optimization for constrained problems in the general case that noise may be present in the constraint functions, and the objective and…

    Submitted 21 March, 2014; originally announced March 2014.

    Comments: 14 pages, 3 figures

  32. arXiv:1402.0929  [pdf, other]

    stat.ML cs.LG

    Input Warping for Bayesian Optimization of Non-stationary Functions

    Authors: Jasper Snoek, Kevin Swersky, Richard S. Zemel, Ryan P. Adams

    Abstract: Bayesian optimization has proven to be a highly effective methodology for the global optimization of unknown, expensive and multimodal functions. The ability to accurately model distributions over functions is critical to the effectiveness of Bayesian optimization. Although Gaussian processes provide a flexible prior over functions which can be queried efficiently, there are various classes of fun…

    Submitted 11 June, 2014; v1 submitted 4 February, 2014; originally announced February 2014.

  33. arXiv:1206.2944  [pdf, other]

    stat.ML cs.LG

    Practical Bayesian Optimization of Machine Learning Algorithms

    Authors: Jasper Snoek, Hugo Larochelle, Ryan P. Adams

    Abstract: Machine learning algorithms frequently require careful tuning of model hyperparameters, regularization terms, and optimization parameters. Unfortunately, this tuning is often a "black art" that requires expert experience, unwritten rules of thumb, or sometimes brute-force search. Much more appealing is the idea of developing automatic approaches which can optimize the performance of a given learni…

    Submitted 29 August, 2012; v1 submitted 13 June, 2012; originally announced June 2012.

  34. arXiv:1102.1492  [pdf, other]

    stat.ML

    On Nonparametric Guidance for Learning Autoencoder Representations

    Authors: Jasper Snoek, Ryan Prescott Adams, Hugo Larochelle

    Abstract: Unsupervised discovery of latent representations, in addition to being useful for density modeling, visualisation and exploratory data analysis, is also increasingly important for learning features relevant to discriminative tasks. Autoencoders, in particular, have proven to be an effective way to learn latent codes that reflect meaningful variations in data. A continuing challenge, however, is gu…

    Submitted 25 October, 2011; v1 submitted 7 February, 2011; originally announced February 2011.

    Comments: 9 pages, 12 figures