Showing 1–33 of 33 results for author: Cunningham, J P

  1. arXiv:2406.07457  [pdf, other]

    cs.LG stat.ML

    Estimating the Hallucination Rate of Generative AI

    Authors: Andrew Jesson, Nicolas Beltran-Velez, Quentin Chu, Sweta Karlekar, Jannik Kossen, Yarin Gal, John P. Cunningham, David Blei

    Abstract: This work is about estimating the hallucination rate for in-context learning (ICL) with Generative AI. In ICL, a conditional generative model (CGM) is prompted with a dataset and asked to make a prediction based on that dataset. The Bayesian interpretation of ICL assumes that the CGM is calculating a posterior predictive distribution over an unknown Bayesian model of a latent parameter and data. W…

    Submitted 11 June, 2024; originally announced June 2024.

  2. arXiv:2406.04308  [pdf, other]

    cs.LG stat.ML

    Approximation-Aware Bayesian Optimization

    Authors: Natalie Maus, Kyurae Kim, Geoff Pleiss, David Eriksson, John P. Cunningham, Jacob R. Gardner

    Abstract: High-dimensional Bayesian optimization (BO) tasks such as molecular design often require 10,000 function evaluations before obtaining meaningful results. While methods like sparse variational Gaussian processes (SVGPs) reduce computational requirements in these settings, the underlying approximations result in suboptimal data acquisitions that slow the progress of optimization. In this paper we mo…

    Submitted 6 June, 2024; originally announced June 2024.

  3. arXiv:2405.09673  [pdf, other]

    cs.LG cs.AI cs.CL

    LoRA Learns Less and Forgets Less

    Authors: Dan Biderman, Jose Gonzalez Ortiz, Jacob Portes, Mansheej Paul, Philip Greengard, Connor Jennings, Daniel King, Sam Havens, Vitaliy Chiley, Jonathan Frankle, Cody Blakeney, John P. Cunningham

    Abstract: Low-Rank Adaptation (LoRA) is a widely-used parameter-efficient finetuning method for large language models. LoRA saves memory by training only low rank perturbations to selected weight matrices. In this work, we compare the performance of LoRA and full finetuning on two target domains, programming and mathematics. We consider both the instruction finetuning ($\approx$100K prompt-response pairs) a…

    Submitted 15 May, 2024; originally announced May 2024.

  4. arXiv:2306.17775  [pdf, other]

    stat.ML cs.LG q-bio.BM

    Practical and Asymptotically Exact Conditional Sampling in Diffusion Models

    Authors: Luhuan Wu, Brian L. Trippe, Christian A. Naesseth, David M. Blei, John P. Cunningham

    Abstract: Diffusion models have been successful on a range of conditional generation tasks including molecular design and text-to-image generation. However, these achievements have primarily depended on task-specific conditional training or error-prone heuristic approximations. Ideally, a conditional generation method should provide exact samples for a broad range of conditional distributions without requir…

    Submitted 30 June, 2023; originally announced June 2023.

    Comments: Code: https://github.com/blt2114/twisted_diffusion_sampler

  5. arXiv:2302.00704  [pdf, other]

    cs.LG stat.ML

    Pathologies of Predictive Diversity in Deep Ensembles

    Authors: Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, John P. Cunningham

    Abstract: Classic results establish that encouraging predictive diversity improves performance in ensembles of low-capacity models, e.g. through bagging or boosting. Here we demonstrate that these intuitions do not apply to high-capacity neural network ensembles (deep ensembles), and in fact the opposite is often true. In a large scale study of nearly 600 neural network classification ensembles, we examine…

    Submitted 9 January, 2024; v1 submitted 1 February, 2023; originally announced February 2023.

    Comments: now published in Transactions on Machine Learning Research

  6. arXiv:2301.00537  [pdf, other]

    stat.ML cs.LG

    Posterior Collapse and Latent Variable Non-identifiability

    Authors: Yixin Wang, David M. Blei, John P. Cunningham

    Abstract: Variational autoencoders model high-dimensional data by positing low-dimensional latent variables that are mapped through a flexible distribution parametrized by a neural network. Unfortunately, variational autoencoders often suffer from posterior collapse: the posterior of the latent variables is equal to its prior, rendering the variational autoencoder useless as a means to produce meaningful re…

    Submitted 2 January, 2023; originally announced January 2023.

    Comments: 19 pages, 4 figures; NeurIPS 2021

  7. arXiv:2212.01265  [pdf, other]

    cs.LG cs.AI

    Denoising Deep Generative Models

    Authors: Gabriel Loaiza-Ganem, Brendan Leigh Ross, Luhuan Wu, John P. Cunningham, Jesse C. Cresswell, Anthony L. Caterini

    Abstract: Likelihood-based deep generative models have recently been shown to exhibit pathological behaviour under the manifold hypothesis as a consequence of using high-dimensional densities to model data with low-dimensional structure. In this paper we propose two methodologies aimed at addressing this problem. Both are based on adding Gaussian noise to the data to remove the dimensionality mismatch durin…

    Submitted 4 January, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

    Comments: NeurIPS 2022 ICBINB workshop (spotlight)

  8. arXiv:2205.15449  [pdf, other]

    cs.LG math.NA stat.ML

    Posterior and Computational Uncertainty in Gaussian Processes

    Authors: Jonathan Wenger, Geoff Pleiss, Marvin Pförtner, Philipp Hennig, John P. Cunningham

    Abstract: Gaussian processes scale prohibitively with the size of the dataset. In response, many approximation methods have been developed, which inevitably introduce approximation error. This additional source of uncertainty, due to limited computation, is entirely ignored when using the approximate posterior. Therefore in practice, GP models are often as much about the approximation method as they are abo…

    Submitted 9 October, 2023; v1 submitted 30 May, 2022; originally announced May 2022.

    Comments: Advances in Neural Information Processing Systems (NeurIPS 2022)

  9. arXiv:2205.09906  [pdf, other]

    stat.ML cs.LG

    Data Augmentation for Compositional Data: Advancing Predictive Models of the Microbiome

    Authors: Elliott Gordon-Rodriguez, Thomas P. Quinn, John P. Cunningham

    Abstract: Data augmentation plays a key role in modern machine learning pipelines. While numerous augmentation strategies have been studied in the context of computer vision and natural language processing, less is known for other data modalities. Our work extends the success of data augmentation to compositional data, i.e., simplex-valued data, which is of particular interest in the context of the human mi…

    Submitted 19 May, 2022; originally announced May 2022.

  10. arXiv:2204.13290  [pdf, other]

    stat.ML cs.LG

    On the Normalizing Constant of the Continuous Categorical Distribution

    Authors: Elliott Gordon-Rodriguez, Gabriel Loaiza-Ganem, Andres Potapczynski, John P. Cunningham

    Abstract: Probability distributions supported on the simplex enjoy a wide range of applications across statistics and machine learning. Recently, a novel family of such distributions has been discovered: the continuous categorical. This family enjoys remarkable mathematical simplicity; its density function resembles that of the Dirichlet distribution, but with a normalizing constant that can be written in c…

    Submitted 28 April, 2022; originally announced April 2022.

  11. arXiv:2202.06985  [pdf, other]

    cs.LG stat.ML

    Deep Ensembles Work, But Are They Necessary?

    Authors: Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, Richard Zemel, John P. Cunningham

    Abstract: Ensembling neural networks is an effective way to increase accuracy, and can often match the performance of individual larger models. This observation poses a natural question: given the choice between a deep ensemble and a single neural network with similar accuracy, is one preferable over the other? Recent work suggests that deep ensembles may offer distinct benefits beyond predictive power: nam…

    Submitted 13 October, 2022; v1 submitted 14 February, 2022; originally announced February 2022.

  12. arXiv:2112.03638  [pdf, other]

    cs.LG cs.CL cs.DS stat.AP stat.ML

    Scaling Structured Inference with Randomization

    Authors: Yao Fu, John P. Cunningham, Mirella Lapata

    Abstract: Deep discrete structured models have seen considerable progress recently, but traditional inference using dynamic programming (DP) typically works with a small number of states (less than hundreds), which severely limits model capacity. At the same time, across machine learning, there is a recent trend of using randomized truncation techniques to accelerate computations involving large sums. Here,…

    Submitted 24 July, 2022; v1 submitted 7 December, 2021; originally announced December 2021.

    Comments: ICML 2022 camera ready

  13. arXiv:2107.00243  [pdf, other]

    cs.LG math.NA

    Preconditioning for Scalable Gaussian Process Hyperparameter Optimization

    Authors: Jonathan Wenger, Geoff Pleiss, Philipp Hennig, John P. Cunningham, Jacob R. Gardner

    Abstract: Gaussian process hyperparameter optimization requires linear solves with, and log-determinants of, large kernel matrices. Iterative numerical techniques are becoming popular to scale to larger datasets, relying on the conjugate gradient method (CG) for the linear solves and stochastic trace estimation for the log-determinant. This work introduces new algorithmic and theoretical insights for precon…

    Submitted 18 June, 2022; v1 submitted 1 July, 2021; originally announced July 2021.

    Comments: International Conference on Machine Learning (ICML)

  14. arXiv:2106.06529  [pdf, other]

    cs.LG stat.ML

    The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective

    Authors: Geoff Pleiss, John P. Cunningham

    Abstract: Large width limits have been a recent focus of deep learning research: modulo computational practicalities, do wider networks outperform narrower ones? Answering this question has been challenging, as conventional networks gain representational power with width, potentially masking any negative effects. Our analysis in this paper decouples capacity and width via the generalization of neural networ…

    Submitted 8 November, 2021; v1 submitted 11 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021

  15. arXiv:2106.01413  [pdf, other]

    stat.ML cs.LG

    Rectangular Flows for Manifold Learning

    Authors: Anthony L. Caterini, Gabriel Loaiza-Ganem, Geoff Pleiss, John P. Cunningham

    Abstract: Normalizing flows are invertible neural networks with tractable change-of-volume terms, which allow optimization of their parameters to be efficiently performed via maximum likelihood. However, data of interest are typically assumed to live in some (often unknown) low-dimensional manifold embedded in a high-dimensional ambient space. The result is a modelling mismatch since -- by construction -- t…

    Submitted 2 November, 2021; v1 submitted 2 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 Camera Ready. Code available at https://github.com/layer6ai-labs/rectangular-flows

  16. arXiv:2103.02583  [pdf]

    cs.CV

    Simulating time to event prediction with spatiotemporal echocardiography deep learning

    Authors: Rohan Shad, Nicolas Quach, Robyn Fong, Patpilai Kasinpila, Cayley Bowles, Kate M. Callon, Michelle C. Li, Jeffrey Teuteberg, John P. Cunningham, Curtis P. Langlotz, William Hiesinger

    Abstract: Integrating methods for time-to-event prediction with diagnostic imaging modalities is of considerable interest, as accurate estimates of survival require accounting for censoring of individuals within the observation period. New methods for time-to-event prediction have been developed by extending the Cox proportional hazards model with neural networks. In this paper, to explore the feasibility…

    Submitted 3 March, 2021; originally announced March 2021.

    Comments: 9 pages, 5 figures

  17. arXiv:2103.01938  [pdf]

    eess.IV cs.CV cs.LG

    Medical Imaging and Machine Learning

    Authors: Rohan Shad, John P. Cunningham, Euan A. Ashley, Curtis P. Langlotz, William Hiesinger

    Abstract: Advances in computing power, deep learning architectures, and expert labelled datasets have spurred the development of medical imaging artificial intelligence systems that rival clinical experts in a variety of scenarios. The National Institutes of Health in 2018 identified key focus areas for the future of artificial intelligence in medical imaging, creating a foundational roadmap for research in…

    Submitted 2 March, 2021; originally announced March 2021.

    Comments: 9 pages, 4 figures

    Journal ref: Nat Mach Intell 3, 929 - 935 (2021)

  18. Predicting post-operative right ventricular failure using video-based deep learning

    Authors: Rohan Shad, Nicolas Quach, Robyn Fong, Patpilai Kasinpila, Cayley Bowles, Miguel Castro, Ashrith Guha, Eddie Suarez, Stefan Jovinge, Sangjin Lee, Theodore Boeve, Myriam Amsallem, Xiu Tang, Francois Haddad, Yasuhiro Shudo, Y. Joseph Woo, Jeffrey Teuteberg, John P. Cunningham, Curt P. Langlotz, William Hiesinger

    Abstract: Non-invasive and cost-effective in nature, the echocardiogram allows for a comprehensive assessment of the cardiac musculature and valves. Despite progressive improvements over the decades, the rich temporally resolved data in echocardiography videos remain underutilized. Human reads of echocardiograms reduce the complex patterns of cardiac wall motion to a small list of measurements of heart fun…

    Submitted 27 February, 2021; originally announced March 2021.

    Comments: 12 pages, 3 figures

    Journal ref: Nat Commun 12, 5192 (2021)

  19. arXiv:2102.06695  [pdf, other]

    cs.LG stat.ML

    Bias-Free Scalable Gaussian Processes via Randomized Truncations

    Authors: Andres Potapczynski, Luhuan Wu, Dan Biderman, Geoff Pleiss, John P. Cunningham

    Abstract: Scalable Gaussian Process methods are computationally attractive, yet introduce modeling biases that require rigorous study. This paper analyzes two common techniques: early truncated conjugate gradients (CG) and random Fourier features (RFF). We find that both methods introduce a systematic bias on the learned hyperparameters: CG tends to underfit while RFF tends to overfit. We address these issu…

    Submitted 28 June, 2021; v1 submitted 12 February, 2021; originally announced February 2021.

    Journal ref: 38th International Conference on Machine Learning (ICML 2021)

  20. arXiv:2011.05231  [pdf, other]

    stat.ML cs.LG

    Uses and Abuses of the Cross-Entropy Loss: Case Studies in Modern Deep Learning

    Authors: Elliott Gordon-Rodriguez, Gabriel Loaiza-Ganem, Geoff Pleiss, John P. Cunningham

    Abstract: Modern deep learning is primarily an experimental science, in which empirical advances occasionally come at the expense of probabilistic rigor. Here we focus on one such example; namely the use of the categorical cross-entropy loss to model data that is not strictly categorical, but rather takes values on the simplex. This practice is standard in neural network architectures with label smoothing a…

    Submitted 10 November, 2020; originally announced November 2020.

  21. arXiv:2003.05554  [pdf, other]

    stat.ML cs.LG

    Linear-time inference for Gaussian Processes on one dimension

    Authors: Jackson Loper, David Blei, John P. Cunningham, Liam Paninski

    Abstract: Gaussian Processes (GPs) provide powerful probabilistic frameworks for interpolation, forecasting, and smoothing, but have been hampered by computational scaling issues. Here we investigate data sampled on one dimension (e.g., a scalar or vector time series sampled at arbitrarily-spaced intervals), for which state-space models are popular due to their linearly-scaling computational costs. It has l…

    Submitted 12 October, 2021; v1 submitted 11 March, 2020; originally announced March 2020.

    Comments: Accepted to JMLR

    MSC Class: 60G15 (Primary) 68W10; 47B34 (Secondary)

    Journal ref: The Journal of Machine Learning Research, 2021

  22. arXiv:2002.08563  [pdf, other]

    stat.ML cs.LG

    The continuous categorical: a novel simplex-valued exponential family

    Authors: Elliott Gordon-Rodriguez, Gabriel Loaiza-Ganem, John P. Cunningham

    Abstract: Simplex-valued data appear throughout statistics and machine learning, for example in the context of transfer learning and compression of deep networks. Existing models for this class of data rely on the Dirichlet distribution or other related loss functions; here we show these standard choices suffer systematically from a number of limitations, including bias and numerical issues that frustrate t…

    Submitted 8 June, 2020; v1 submitted 19 February, 2020; originally announced February 2020.

  23. arXiv:2001.01941  [pdf, other]

    cs.CL cs.LG

    Paraphrase Generation with Latent Bag of Words

    Authors: Yao Fu, Yansong Feng, John P. Cunningham

    Abstract: Paraphrase generation is a longstanding important problem in natural language processing. In addition, recent progress in deep generative models has shown promising results on discrete latent variables for text generation. Inspired by variational autoencoders with discrete latent structures, in this work, we propose a latent bag of words (BOW) model for paraphrase generation. We ground the s…

    Submitted 7 January, 2020; originally announced January 2020.

    Comments: NeurIPS 19 camera ready

  24. arXiv:1912.09588  [pdf, other]

    stat.ML cs.LG

    Invertible Gaussian Reparameterization: Revisiting the Gumbel-Softmax

    Authors: Andres Potapczynski, Gabriel Loaiza-Ganem, John P. Cunningham

    Abstract: The Gumbel-Softmax is a continuous distribution over the simplex that is often used as a relaxation of discrete distributions. Because it can be readily interpreted and easily reparameterized, it enjoys widespread use. We propose a modular and more flexible family of reparameterizable distributions where Gaussian noise is transformed into a one-hot approximation through an invertible function. Thi…

    Submitted 29 August, 2022; v1 submitted 19 December, 2019; originally announced December 2019.

    Comments: Accepted at NeurIPS 2020

    Journal ref: Published: NeurIPS 2020

  25. arXiv:1907.06845  [pdf, other]

    stat.ML cs.LG

    The continuous Bernoulli: fixing a pervasive error in variational autoencoders

    Authors: Gabriel Loaiza-Ganem, John P. Cunningham

    Abstract: Variational autoencoders (VAE) have quickly become a central tool in machine learning, applicable to a broad range of data types and latent variable models. By far the most common first step, taken by seminal papers and by core software libraries alike, is to model MNIST data using a deep network parameterizing a Bernoulli likelihood. This practice contains what appears to be and what is often set…

    Submitted 29 December, 2019; v1 submitted 16 July, 2019; originally announced July 2019.

    Comments: Accepted at NeurIPS 2019

  26. arXiv:1903.07515  [pdf, other]

    stat.ML cs.LG

    Approximating exponential family models (not single distributions) with a two-network architecture

    Authors: Sean R. Bittner, John P. Cunningham

    Abstract: Recently much attention has been paid to deep generative models, since they have been used to great success for variational inference, generation of complex data types, and more. In almost all of these settings, the goal has been to find a particular member of that model family: optimized parameters index a distribution that is close (via a divergence or classification metric) to a target distributi…

    Submitted 18 March, 2019; originally announced March 2019.

  27. arXiv:1903.02610  [pdf, other]

    stat.ML cs.AI cs.LG

    Deep Random Splines for Point Process Intensity Estimation of Neural Population Data

    Authors: Gabriel Loaiza-Ganem, Sean M. Perkins, Karen E. Schroeder, Mark M. Churchland, John P. Cunningham

    Abstract: Gaussian processes are the leading class of distributions on random functions, but they suffer from well known issues including difficulty scaling and inflexibility with respect to certain shape constraints (such as nonnegativity). Here we propose Deep Random Splines, a flexible class of random functions obtained by transforming Gaussian noise through a deep neural network whose output are the par…

    Submitted 29 December, 2019; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: Accepted at NeurIPS 2019

  28. arXiv:1812.00209  [pdf, other]

    stat.ML cs.LG q-bio.QM

    A Probabilistic Model of Cardiac Physiology and Electrocardiograms

    Authors: Andrew C. Miller, Ziad Obermeyer, David M. Blei, John P. Cunningham, Sendhil Mullainathan

    Abstract: An electrocardiogram (EKG) is a common, non-invasive test that measures the electrical activity of a patient's heart. EKGs contain useful diagnostic information about patient health that may be absent from other electronic health record (EHR) data. As multi-dimensional waveforms, they could be modeled using generic machine learning tools, such as a linear factor model or a variational autoencoder.…

    Submitted 1 December, 2018; originally announced December 2018.

    Comments: Machine Learning for Health (ML4H) Workshop at NeurIPS 2018 arXiv:cs/0101200

    Report number: ML4H/2018/97

  29. arXiv:1805.10522  [pdf, other]

    stat.ML cs.LG

    Calibrating Deep Convolutional Gaussian Processes

    Authors: Gia-Lac Tran, Edwin V. Bonilla, John P. Cunningham, Pietro Michiardi, Maurizio Filippone

    Abstract: The wide adoption of Convolutional Neural Networks (CNNs) in applications where decision-making under uncertainty is fundamental has brought a great deal of attention to the ability of these models to accurately quantify the uncertainty in their predictions. Previous work on combining CNNs with Gaussian processes (GPs) has been developed under the assumption that the predictive probabilities of t…

    Submitted 26 May, 2018; originally announced May 2018.

    Comments: 12 pages

  30. arXiv:1805.10050  [pdf, other]

    stat.ML cs.LG

    Bayesian estimation for large scale multivariate Ornstein-Uhlenbeck model of brain connectivity

    Authors: Andrea Insabato, John P. Cunningham, Matthieu Gilson

    Abstract: Estimation of reliable whole-brain connectivity is a crucial step towards the use of connectivity information in quantitative approaches to the study of neuropsychiatric disorders. When estimating brain connectivity a challenge is imposed by the paucity of time samples and the large dimensionality of the measurements. Bayesian estimation methods for network models offer a number of advantages in t…

    Submitted 25 May, 2018; originally announced May 2018.

  31. arXiv:1511.04156  [pdf, ps, other]

    stat.ML cs.LG q-bio.NC

    Neuroprosthetic decoder training as imitation learning

    Authors: Josh Merel, David Carlson, Liam Paninski, John P. Cunningham

    Abstract: Neuroprosthetic brain-computer interfaces function via an algorithm which decodes neural activity of the user into movements of an end effector, such as a cursor or robotic arm. In practice, the decoder is often learned by updating its parameters while the user performs a task. When the user's intention is not directly observable, recent methods have demonstrated value in training the decoder agai…

    Submitted 14 March, 2016; v1 submitted 12 November, 2015; originally announced November 2015.

  32. Sparse Probit Linear Mixed Model

    Authors: Stephan Mandt, Florian Wenzel, Shinichi Nakajima, John P. Cunningham, Christoph Lippert, Marius Kloft

    Abstract: Linear Mixed Models (LMMs) are important tools in statistical genetics. When used for feature selection, they allow one to find a sparse set of genetic traits that best predict a continuous phenotype of interest, while simultaneously correcting for various confounding factors such as age, ethnicity and population structure. Formulated as models for linear regression, LMMs have been restricted to conti…

    Submitted 17 July, 2017; v1 submitted 16 July, 2015; originally announced July 2015.

    Comments: Published version, 21 pages, 6 figures

    Journal ref: Machine Learning, 106(9), 1621-1642 (2017)

  33. arXiv:1310.5288  [pdf, other]

    stat.ML cs.AI cs.LG stat.ME

    GPatt: Fast Multidimensional Pattern Extrapolation with Gaussian Processes

    Authors: Andrew Gordon Wilson, Elad Gilboa, Arye Nehorai, John P. Cunningham

    Abstract: Gaussian processes are typically used for smoothing and interpolation on small datasets. We introduce a new Bayesian nonparametric framework -- GPatt -- enabling automatic pattern extrapolation with Gaussian processes on large multidimensional datasets. GPatt unifies and extends highly expressive kernels and fast exact inference techniques. Without human intervention -- no hand crafting of kernel…

    Submitted 31 December, 2013; v1 submitted 19 October, 2013; originally announced October 2013.

    Comments: 13 Pages, 9 Figures, 1 Table. Submitted for publication