Showing 1–15 of 15 results for author: Dieng, A B

Searching in archive stat.
  1. arXiv:2405.11848

    stat.ML cs.AI cs.LG cs.NE physics.ao-ph q-bio.NC

    Alternators For Sequence Modeling

    Authors: Mohammad Reza Rezaei, Adji Bousso Dieng

    Abstract: This paper introduces alternators, a novel family of non-Markovian dynamical models for sequences. An alternator features two neural networks: the observation trajectory network (OTN) and the feature trajectory network (FTN). The OTN and the FTN work in conjunction, alternating between outputting samples in the observation space and some feature space, respectively, over a cycle. The parameters of…

    Submitted 20 May, 2024; originally announced May 2024.

    Comments: A new versatile family of sequence models that can be used for both generative modeling and supervised learning. The codebase will be made available upon publication. This paper is dedicated to Thomas Sankara

  2. arXiv:2405.02449

    stat.ML cond-mat.mtrl-sci cs.LG q-bio.BM

    Quality-Weighted Vendi Scores And Their Application To Diverse Experimental Design

    Authors: Quan Nguyen, Adji Bousso Dieng

    Abstract: Experimental design techniques such as active search and Bayesian optimization are widely used in the natural sciences for data collection and discovery. However, existing techniques tend to favor exploitation over exploration of the search space, which causes them to get stuck in local optima. This "collapse" problem prevents experimental design algorithms from yielding diverse high-quality data…

    Submitted 3 May, 2024; originally announced May 2024.

    Comments: Published in International Conference on Machine Learning, ICML 2024. Code can be found in the Vertaix GitHub: https://github.com/vertaix/Quality-Weighted-Vendi-Score. Paper dedicated to Kwame Nkrumah

  3. arXiv:2210.02410

    cs.LG cond-mat.mtrl-sci stat.ML

    The Vendi Score: A Diversity Evaluation Metric for Machine Learning

    Authors: Dan Friedman, Adji Bousso Dieng

    Abstract: Diversity is an important criterion for many areas of machine learning (ML), including generative modeling and dataset curation. However, existing metrics for measuring diversity are often domain-specific and limited in flexibility. In this paper, we address the diversity evaluation problem by proposing the Vendi Score, which connects and extends ideas from ecology and quantum statistical mechanic…

    Submitted 2 July, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: The Vendi Score is available as a pip package at https://github.com/vertaix/Vendi-Score

  4. arXiv:2206.06295

    cs.LG cs.AI stat.ML

    Markov Chain Score Ascent: A Unifying Framework of Variational Inference with Markovian Gradients

    Authors: Kyurae Kim, Jisu Oh, Jacob R. Gardner, Adji Bousso Dieng, Hongseok Kim

    Abstract: Minimizing the inclusive Kullback-Leibler (KL) divergence with stochastic gradient descent (SGD) is challenging since its gradient is defined as an integral over the posterior. Recently, multiple methods have been proposed to run SGD with biased gradient estimates obtained from a Markov chain. This paper provides the first non-asymptotic convergence analysis of these methods by establishing their…

    Submitted 13 October, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

    Comments: Accepted to NeurIPS 2022

  5. arXiv:2104.12053

    stat.ML cs.LG

    Deep Probabilistic Graphical Modeling

    Authors: Adji B. Dieng

    Abstract: Probabilistic graphical modeling (PGM) provides a framework for formulating an interpretable generative process of data and expressing uncertainty about unknowns, but it lacks flexibility. Deep learning (DL) is an alternative framework for learning from data that has achieved great empirical success in recent years. DL offers great flexibility, but it lacks the interpretability and calibration of…

    Submitted 24 April, 2021; originally announced April 2021.

    Comments: This thesis was defended in April 2020 and accepted without revision. The author received her PhD in Statistics from Columbia University on May 20, 2020

  6. arXiv:1910.04302

    stat.ML cs.LG stat.ME

    Prescribed Generative Adversarial Networks

    Authors: Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei, Michalis K. Titsias

    Abstract: Generative adversarial networks (GANs) are a powerful approach to unsupervised learning. They have achieved state-of-the-art performance in the image domain. However, GANs are limited in two ways. They often learn distributions with low support, a phenomenon known as mode collapse, and they do not guarantee the existence of a probability density, which makes evaluating generalization using predi…

    Submitted 9 October, 2019; originally announced October 2019.

    Comments: Code for this paper can be found at https://github.com/adjidieng/PresGANs

  7. arXiv:1907.05545

    cs.CL stat.ML

    The Dynamic Embedded Topic Model

    Authors: Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei

    Abstract: Topic modeling analyzes documents to learn meaningful patterns of words. For documents collected in sequence, dynamic topic models capture how these patterns vary over time. We develop the dynamic embedded topic model (D-ETM), a generative model of documents that combines dynamic latent Dirichlet allocation (D-LDA) and word embeddings. The D-ETM models each word with a categorical distribution par…

    Submitted 10 October, 2019; v1 submitted 11 July, 2019; originally announced July 2019.

  8. arXiv:1907.04907

    cs.IR cs.CL cs.LG stat.ML

    Topic Modeling in Embedding Spaces

    Authors: Adji B. Dieng, Francisco J. R. Ruiz, David M. Blei

    Abstract: Topic modeling analyzes documents to learn meaningful patterns of words. However, existing topic models fail to learn interpretable topics when working with large and heavy-tailed vocabularies. To this end, we develop the Embedded Topic Model (ETM), a generative model of documents that marries traditional topic models with word embeddings. In particular, it models each word with a categorical dist…

    Submitted 7 July, 2019; originally announced July 2019.

    Comments: Code can be found at https://github.com/adjidieng/ETM

  9. arXiv:1906.05850

    stat.ML cs.LG stat.ME

    Reweighted Expectation Maximization

    Authors: Adji B. Dieng, John Paisley

    Abstract: Training deep generative models with maximum likelihood remains a challenge. The typical workaround is to use variational inference (VI) and maximize a lower bound to the log marginal likelihood of the data. Variational auto-encoders (VAEs) adopt this approach. They further amortize the cost of inference by using a recognition network to parameterize the variational family. Amortized VI scales app…

    Submitted 10 August, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

    Comments: Code can be found at https://github.com/adjidieng/REM

  10. arXiv:1807.04863

    stat.ML cs.CL cs.LG

    Avoiding Latent Variable Collapse With Generative Skip Models

    Authors: Adji B. Dieng, Yoon Kim, Alexander M. Rush, David M. Blei

    Abstract: Variational autoencoders (VAEs) learn distributions of high-dimensional data. They model data with a deep latent-variable model and then fit the model by maximizing a lower bound of the log marginal likelihood. VAEs can capture complex distributions, but they can also suffer from an issue known as "latent variable collapse," especially if the likelihood model is powerful. Specifically, the lower bound in…

    Submitted 30 January, 2019; v1 submitted 12 July, 2018; originally announced July 2018.

    Comments: In the Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), Naha, Okinawa, Japan. PMLR: Volume 89. An earlier version of this paper was presented at the Workshop on Theoretical Foundations and Applications of Deep Generative Models, ICML, 2018

  11. arXiv:1805.01500

    stat.ML cs.LG stat.ME

    Noisin: Unbiased Regularization for Recurrent Neural Networks

    Authors: Adji B. Dieng, Rajesh Ranganath, Jaan Altosaar, David M. Blei

    Abstract: Recurrent neural networks (RNNs) are powerful models of sequential data. They have been successfully used in domains such as text and speech. However, RNNs are susceptible to overfitting; regularization is important. In this paper we develop Noisin, a new method for regularizing RNNs. Noisin injects random noise into the hidden states of the RNN and then maximizes the corresponding marginal likeli…

    Submitted 12 July, 2018; v1 submitted 3 May, 2018; originally announced May 2018.

    Comments: In Proceedings of the International Conference on Machine Learning, 2018

  12. arXiv:1802.04220

    stat.ML cs.LG

    Augment and Reduce: Stochastic Inference for Large Categorical Distributions

    Authors: Francisco J. R. Ruiz, Michalis K. Titsias, Adji B. Dieng, David M. Blei

    Abstract: Categorical distributions are ubiquitous in machine learning, e.g., in classification, language models, and recommendation systems. However, when the number of possible outcomes is very large, using categorical distributions becomes computationally expensive, as the complexity scales linearly with the number of outcomes. To address this problem, we propose augment and reduce (A&R), a method to all…

    Submitted 7 June, 2018; v1 submitted 12 February, 2018; originally announced February 2018.

    Comments: 11 pages, 2 figures

    Journal ref: Francisco J. R. Ruiz, Michalis K. Titsias, Adji B. Dieng, and David M. Blei. Augment and Reduce: Stochastic Inference for Large Categorical Distributions. International Conference on Machine Learning. Stockholm (Sweden), July 2018

  13. arXiv:1611.01702

    cs.CL cs.AI cs.LG stat.ML

    TopicRNN: A Recurrent Neural Network with Long-Range Semantic Dependency

    Authors: Adji B. Dieng, Chong Wang, Jianfeng Gao, John Paisley

    Abstract: In this paper, we propose TopicRNN, a recurrent neural network (RNN)-based language model designed to directly capture the global semantic meaning relating words in a document via latent topics. Because of their sequential nature, RNNs are good at capturing the local structure of a word sequence - both semantic and syntactic - but might face difficulty remembering long-range dependencies. Intuitiv…

    Submitted 26 February, 2017; v1 submitted 5 November, 2016; originally announced November 2016.

    Comments: International Conference on Learning Representations

  14. arXiv:1611.00328

    stat.ML cs.LG stat.CO stat.ME

    Variational Inference via $χ$-Upper Bound Minimization

    Authors: Adji B. Dieng, Dustin Tran, Rajesh Ranganath, John Paisley, David M. Blei

    Abstract: Variational inference (VI) is widely used as an efficient alternative to Markov chain Monte Carlo. It posits a family of approximating distributions $q$ and finds the closest member to the exact posterior $p$. Closeness is usually measured via a divergence $D(q || p)$ from $q$ to $p$. While successful, this approach also has problems. Notably, it typically leads to underestimation of the posterior…

    Submitted 12 November, 2017; v1 submitted 1 November, 2016; originally announced November 2016.

    Comments: Neural Information Processing Systems, 2017

  15. arXiv:1610.09787

    stat.CO cs.AI cs.PL stat.AP stat.ML

    Edward: A library for probabilistic modeling, inference, and criticism

    Authors: Dustin Tran, Alp Kucukelbir, Adji B. Dieng, Maja Rudolph, Dawen Liang, David M. Blei

    Abstract: Probabilistic modeling is a powerful approach for analyzing empirical information. We describe Edward, a library for probabilistic modeling. Edward's design reflects an iterative process pioneered by George Box: build a model of a phenomenon, make inferences about the model given data, and criticize the model's fit to the data. Edward supports a broad class of probabilistic models, efficient algor…

    Submitted 31 January, 2017; v1 submitted 31 October, 2016; originally announced October 2016.