Showing 1–50 of 81 results for author: Baraniuk, R

Searching in archive stat.
  1. arXiv:2406.13781  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV stat.ML

    A Primal-Dual Framework for Transformers and Neural Networks

    Authors: Tan M. Nguyen, Tam Nguyen, Nhat Ho, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher

    Abstract: Self-attention is key to the remarkable success of transformers in sequence modeling tasks including many applications in natural language processing and computer vision. Like neural network layers, these attention mechanisms are often developed by heuristics and experience. To provide a principled framework for constructing attention layers in transformers, we show that the self-attention corresp…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Accepted to ICLR 2023, 26 pages, 4 figures, 14 tables

  2. arXiv:2406.09657  [pdf, other]

    cs.LG stat.ML

    ScaLES: Scalable Latent Exploration Score for Pre-Trained Generative Networks

    Authors: Omer Ronen, Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk, Bin Yu

    Abstract: We develop Scalable Latent Exploration Score (ScaLES) to mitigate over-exploration in Latent Space Optimization (LSO), a popular method for solving black-box discrete optimization problems. LSO utilizes continuous optimization within the latent space of a Variational Autoencoder (VAE) and is known to be susceptible to over-exploration, which manifests in unrealistic solutions that reduce its pract…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2405.13977  [pdf, other]

    cs.LG stat.ML

    Removing Bias from Maximum Likelihood Estimation with Model Autophagy

    Authors: Paul Mayer, Lorenzo Luzi, Ali Siahkoohi, Don H. Johnson, Richard G. Baraniuk

    Abstract: We propose autophagy penalized likelihood estimation (PLE), an unbiased alternative to maximum likelihood estimation (MLE) which is more fair and less susceptible to model autophagy disorder (madness). Model autophagy refers to models trained on their own output; PLE ensures the statistics of these outputs coincide with the data statistics. This enables PLE to be statistically unbiased in certain…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 9 Pages, submission for NeurIPS 2024

    MSC Class: 68T07

  4. arXiv:2401.14429  [pdf, ps, other]

    cs.LG cs.RO eess.SP stat.ML

    [Re] The Discriminative Kalman Filter for Bayesian Filtering with Nonlinear and Non-Gaussian Observation Models

    Authors: Josue Casco-Rodriguez, Caleb Kemere, Richard G. Baraniuk

    Abstract: Kalman filters provide a straightforward and interpretable means to estimate hidden or latent variables, and have found numerous applications in control, robotics, signal processing, and machine learning. One such application is neural decoding for neuroprostheses. In 2020, Burkhart et al. thoroughly evaluated their new version of the Kalman filter that leverages Bayes' theorem to improve filter p…

    Submitted 24 January, 2024; originally announced January 2024.

  5. arXiv:2302.01174  [pdf, other]

    eess.SP stat.ML

    Unsupervised Learning of Sampling Distributions for Particle Filters

    Authors: Fernando Gama, Nicolas Zilberstein, Martin Sevilla, Richard Baraniuk, Santiago Segarra

    Abstract: Accurate estimation of the states of a nonlinear dynamical system is crucial for their design, synthesis, and analysis. Particle filters are estimators constructed by simulating trajectories from a sampling distribution and averaging them based on their importance weight. For particle filters to be computationally tractable, it must be feasible to simulate the trajectories by drawing from the samp…

    Submitted 2 February, 2023; originally announced February 2023.

  6. arXiv:2210.12100  [pdf, other]

    cs.CV cs.LG stat.ML

    Boomerang: Local sampling on image manifolds using diffusion models

    Authors: Lorenzo Luzi, Paul M Mayer, Josue Casco-Rodriguez, Ali Siahkoohi, Richard G. Baraniuk

    Abstract: The inference stage of diffusion models can be seen as running a reverse-time diffusion stochastic differential equation, where samples from a Gaussian latent distribution are transformed into samples from a target distribution that usually reside on a low-dimensional manifold, e.g., an image manifold. The intermediate values between the initial latent space and the image manifold can be interpret…

    Submitted 17 April, 2024; v1 submitted 21 October, 2022; originally announced October 2022.

    Comments: Published in Transactions on Machine Learning Research

  7. arXiv:2209.14778  [pdf, other]

    cs.LG cs.AI cs.CG cs.CV stat.ML

    Batch Normalization Explained

    Authors: Randall Balestriero, Richard G. Baraniuk

    Abstract: A critically important, ubiquitous, and yet poorly understood ingredient in modern deep networks (DNs) is batch normalization (BN), which centers and normalizes the feature maps. To date, only limited progress has been made understanding why BN boosts DN learning and inference performance; work has focused exclusively on showing that BN smooths a DN's loss landscape. In this paper, we study BN the…

    Submitted 29 September, 2022; originally announced September 2022.

  8. arXiv:2205.14055  [pdf, other]

    cs.LG stat.ML

    A Blessing of Dimensionality in Membership Inference through Regularization

    Authors: Jasper Tan, Daniel LeJeune, Blake Mason, Hamid Javadi, Richard G. Baraniuk

    Abstract: Is overparameterization a privacy liability? In this work, we study the effect that the number of parameters has on a classifier's vulnerability to membership inference attacks. We first demonstrate how the number of parameters of a model can induce a privacy–utility trade-off: increasing the number of parameters generally improves generalization performance at the expense of lower privacy. Howev…

    Submitted 13 April, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: 26 pages, 14 figures

  9. arXiv:2204.03145  [pdf, other]

    stat.AP cs.LG stat.ML

    DeepTensor: Low-Rank Tensor Decomposition with Deep Network Priors

    Authors: Vishwanath Saragadam, Randall Balestriero, Ashok Veeraraghavan, Richard G. Baraniuk

    Abstract: DeepTensor is a computationally efficient framework for low-rank decomposition of matrices and tensors using deep generative networks. We decompose a tensor as the product of low-rank tensor factors (e.g., a matrix as the outer product of two vectors), where each low-rank tensor is generated by a deep network (DN) that is trained in a self-supervised manner to minimize the mean-squared approximati…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: 14 pages

  10. arXiv:2202.01243  [pdf, other]

    stat.ML cs.LG

    Parameters or Privacy: A Provable Tradeoff Between Overparameterization and Membership Inference

    Authors: Jasper Tan, Blake Mason, Hamid Javadi, Richard G. Baraniuk

    Abstract: A surprising phenomenon in modern machine learning is the ability of a highly overparameterized model to generalize well (small error on the test data) even when it is trained to memorize the training data (zero error on the training data). This has led to an arms race towards increasingly overparameterized models (c.f., deep learning). In this paper, we study an underexplored hidden cost of overp…

    Submitted 30 November, 2022; v1 submitted 2 February, 2022; originally announced February 2022.

    Comments: 25 pages, 8 figures

  11. arXiv:2110.13262  [pdf, other]

    econ.EM math.ST stat.ME

    Covariate Balancing Methods for Randomized Controlled Trials Are Not Adversarially Robust

    Authors: Hossein Babaei, Sina Alemohammad, Richard Baraniuk

    Abstract: The first step towards investigating the effectiveness of a treatment via a randomized trial is to split the population into control and treatment groups then compare the average response of the treatment group receiving the treatment to the control group receiving the placebo. In order to ensure that the difference between the two groups is caused only by the treatment, it is crucial that the c…

    Submitted 27 August, 2022; v1 submitted 25 October, 2021; originally announced October 2021.

    Comments: 12 pages, double column, 4 figures

  12. arXiv:2110.08678  [pdf, other]

    cs.LG cs.CL stat.ML

    Improving Transformers with Probabilistic Attention Keys

    Authors: Tam Nguyen, Tan M. Nguyen, Dung D. Le, Duy Khuong Nguyen, Viet-Anh Tran, Richard G. Baraniuk, Nhat Ho, Stanley J. Osher

    Abstract: Multi-head attention is a driving force behind state-of-the-art transformers, which achieve remarkable performance across a variety of natural language processing (NLP) and computer vision tasks. It has been observed that for many applications, those attention heads learn redundant embedding, and most of them can be removed without degrading the performance of the model. Inspired by this observati…

    Submitted 12 June, 2022; v1 submitted 16 October, 2021; originally announced October 2021.

    Comments: 27 pages, 16 figures, 10 tables

    Journal ref: Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022

  13. arXiv:2110.02915  [pdf, other]

    cs.LG eess.SP stat.CO

    Unrolling Particles: Unsupervised Learning of Sampling Distributions

    Authors: Fernando Gama, Nicolas Zilberstein, Richard G. Baraniuk, Santiago Segarra

    Abstract: Particle filtering is used to compute good nonlinear estimates of complex systems. It samples trajectories from a chosen distribution and computes the estimate as a weighted average. Easy-to-sample distributions often lead to degenerate samples where only one trajectory carries all the weight, negatively affecting the resulting performance of the estimate. While much research has been done on the…

    Submitted 6 October, 2021; originally announced October 2021.

  14. arXiv:2109.02355  [pdf, other]

    stat.ML cs.LG

    A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning

    Authors: Yehuda Dar, Vidya Muthukumar, Richard G. Baraniuk

    Abstract: The rapid recent progress in machine learning (ML) has raised a number of scientific questions that challenge the longstanding dogma of the field. One of the most important riddles is the good empirical generalization of overparameterized models. Overparameterized models are excessively complex with respect to the size of the training dataset, which results in them perfectly fitting (i.e., interpo…

    Submitted 6 September, 2021; originally announced September 2021.

  15. arXiv:2106.07769  [pdf, other]

    cs.LG stat.ML

    The Flip Side of the Reweighted Coin: Duality of Adaptive Dropout and Regularization

    Authors: Daniel LeJeune, Hamid Javadi, Richard G. Baraniuk

    Abstract: Among the most successful methods for sparsifying deep (neural) networks are those that adaptively mask the network weights throughout training. By examining this masking, or dropout, in the linear case, we uncover a duality between such adaptive methods and regularization through the so-called "$η$-trick" that casts both as iteratively reweighted optimizations. We show that any dropout strategy t…

    Submitted 3 January, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: 19 pages, 2 figures. Appeared in NeurIPS 2021. Small typographical correction

  16. arXiv:2104.07824  [pdf, ps, other]

    cs.LG stat.ML

    NePTuNe: Neural Powered Tucker Network for Knowledge Graph Completion

    Authors: Shashank Sonkar, Arzoo Katiyar, Richard G. Baraniuk

    Abstract: Knowledge graphs link entities through relations to provide a structured representation of real world facts. However, they are often incomplete, because they are based on only a small fraction of all plausible facts. The task of knowledge graph completion via link prediction aims to overcome this challenge by inferring missing facts represented as links between entities. Current approaches to link…

    Submitted 15 April, 2021; originally announced April 2021.

  17. arXiv:2012.04859  [pdf, other]

    cs.LG stat.ML

    Enhanced Recurrent Neural Tangent Kernels for Non-Time-Series Data

    Authors: Sina Alemohammad, Randall Balestriero, Zichao Wang, Richard Baraniuk

    Abstract: Kernels derived from deep neural networks (DNNs) in the infinite-width regime provide not only high performance in a range of machine learning tasks but also new theoretical insights into DNN training dynamics and generalization. In this paper, we extend the family of kernels associated with recurrent neural networks (RNNs), which were previously derived only for simple RNNs, to more complex archi…

    Submitted 19 October, 2021; v1 submitted 8 December, 2020; originally announced December 2020.

  18. arXiv:2009.09525  [pdf, other]

    cs.LG math.GR stat.ML

    Deep Autoencoders: From Understanding to Generalization Guarantees

    Authors: Romain Cosentino, Randall Balestriero, Richard Baraniuk, Behnaam Aazhang

    Abstract: A big mystery in deep learning continues to be the ability of methods to generalize when the number of model parameters is larger than the number of training examples. In this work, we take a step towards a better understanding of the underlying phenomena of Deep Autoencoders (AEs), a mainstream deep learning solution for learning compressed, interpretable, and structured data representations. In…

    Submitted 24 November, 2021; v1 submitted 20 September, 2020; originally announced September 2020.

    Journal ref: R. Cosentino, R. Balestriero, R. Baraniuk, B. Aazhang, 2nd Annual Conference on Mathematical and Scientific Machine Learning (2021)

  19. arXiv:2006.14600  [pdf, ps, other]

    cs.LG stat.ML

    Ensembles of Generative Adversarial Networks for Disconnected Data

    Authors: Lorenzo Luzi, Randall Balestriero, Richard G. Baraniuk

    Abstract: Most current computer vision datasets are composed of disconnected sets, such as images from different classes. We prove that distributions of this type of data cannot be represented with a continuous generative network without error. They can be represented in two ways: With an ensemble of networks or with a single network with truncated latent space. We show that ensembles are more desirable tha…

    Submitted 25 June, 2020; originally announced June 2020.

  20. arXiv:2006.10246  [pdf, other]

    cs.LG stat.ML

    The Recurrent Neural Tangent Kernel

    Authors: Sina Alemohammad, Zichao Wang, Randall Balestriero, Richard Baraniuk

    Abstract: The study of deep neural networks (DNNs) in the infinite-width limit, via the so-called neural tangent kernel (NTK) approach, has provided new insights into the dynamics of learning, generalization, and the impact of initialization. One key DNN architecture remains to be kernelized, namely, the recurrent neural network (RNN). In this paper we introduce and study the Recurrent Neural Tangent Kernel…

    Submitted 14 June, 2021; v1 submitted 17 June, 2020; originally announced June 2020.

  21. arXiv:2006.10023  [pdf, other]

    cs.LG stat.ML

    Analytical Probability Distributions and EM-Learning for Deep Generative Networks

    Authors: Randall Balestriero, Sebastien Paris, Richard G. Baraniuk

    Abstract: Deep Generative Networks (DGNs) with probabilistic modeling of their output and latent space are currently trained via Variational Autoencoders (VAEs). In the absence of a known analytical form for the posterior and likelihood expectation, VAEs resort to approximations, including (Amortized) Variational Inference (AVI) and Monte-Carlo (MC) sampling. We exploit the Continuous Piecewise Affine (CPA)…

    Submitted 17 June, 2020; originally announced June 2020.

  22. arXiv:2006.07460  [pdf, other]

    cs.LG stat.ML

    An Improved Semi-Supervised VAE for Learning Disentangled Representations

    Authors: Weili Nie, Zichao Wang, Ankit B. Patel, Richard G. Baraniuk

    Abstract: Learning interpretable and disentangled representations is a crucial yet challenging task in representation learning. In this work, we focus on semi-supervised disentanglement learning and extend work by Locatello et al. (2019) by introducing another source of supervision that we denote as label replacement. Specifically, during training, we replace the inferred representation associated with a da…

    Submitted 22 June, 2020; v1 submitted 12 June, 2020; originally announced June 2020.

  23. arXiv:2006.07002  [pdf, other]

    cs.LG stat.ML

    Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks

    Authors: Yehuda Dar, Richard G. Baraniuk

    Abstract: We study the transfer learning process between two linear regression problems. An important and timely special case is when the regressors are overparameterized and perfectly interpolate their training data. We examine a parameter transfer mechanism whereby a subset of the parameters of the target task solution are constrained to the values learned for a related source task. We analytically charac…

    Submitted 28 September, 2022; v1 submitted 12 June, 2020; originally announced June 2020.

  24. arXiv:2006.06919  [pdf, other]

    cs.LG math.DS stat.ML

    MomentumRNN: Integrating Momentum into Recurrent Neural Networks

    Authors: Tan M. Nguyen, Richard G. Baraniuk, Andrea L. Bertozzi, Stanley J. Osher, Bao Wang

    Abstract: Designing deep neural networks is an art that often involves an expensive search over candidate architectures. To overcome this for recurrent neural nets (RNNs), we establish a connection between the hidden state dynamics in an RNN and gradient descent (GD). We then integrate momentum into this framework and propose a new family of RNNs, called MomentumRNNs. We theoretically prove and numeri…

    Submitted 11 October, 2020; v1 submitted 11 June, 2020; originally announced June 2020.

    Comments: 21 pages, 11 figures, Accepted for publication at Advances in Neural Information Processing Systems (NeurIPS) 2020

    MSC Class: 68T07 ACM Class: I.2

    Journal ref: Advances in Neural Information Processing Systems (NeurIPS) 2020

  25. arXiv:2005.13107  [pdf, other]

    stat.ML cs.CY cs.LG stat.AP

    VarFA: A Variational Factor Analysis Framework For Efficient Bayesian Learning Analytics

    Authors: Zichao Wang, Yi Gu, Andrew Lan, Richard Baraniuk

    Abstract: We propose VarFA, a variational inference factor analysis framework that extends existing factor analysis models for educational data mining to efficiently output uncertainty estimation in the model's estimated factors. Such uncertainty information is useful, for example, for an adaptive testing scenario, where additional tests can be administered if the model is not quite certain about a students…

    Submitted 14 August, 2020; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: edm 2020

  26. arXiv:2005.12442  [pdf, other]

    cs.LG cs.AI stat.ML

    qDKT: Question-centric Deep Knowledge Tracing

    Authors: Shashank Sonkar, Andrew E. Waters, Andrew S. Lan, Phillip J. Grimaldi, Richard G. Baraniuk

    Abstract: Knowledge tracing (KT) models, e.g., the deep knowledge tracing (DKT) model, track an individual learner's acquisition of skills over time by examining the learner's performance on questions related to those skills. A practical limitation in most existing KT models is that all questions nested under a particular skill are treated as equivalent observations of a learner's ability, which is an inacc…

    Submitted 25 May, 2020; originally announced May 2020.

  27. arXiv:2005.06001  [pdf, other]

    eess.IV cs.LG stat.ML

    Deep Learning Techniques for Inverse Problems in Imaging

    Authors: Gregory Ongie, Ajil Jalal, Christopher A. Metzler, Richard G. Baraniuk, Alexandros G. Dimakis, Rebecca Willett

    Abstract: Recent work in machine learning shows that deep neural networks can be used to solve a wide variety of inverse problems arising in computational imaging. We explore the central prevailing themes of this emerging area and present a taxonomy that can be used to categorize different problems and reconstruction methods. Our taxonomy is organized along two central axes: (1) whether or not a forward mod…

    Submitted 12 May, 2020; originally announced May 2020.

  28. arXiv:2003.05980  [pdf, other]

    cs.CY cs.LG stat.AP

    Educational Question Mining At Scale: Prediction, Analysis and Personalization

    Authors: Zichao Wang, Sebastian Tschiatschek, Simon Woodhead, Jose Miguel Hernandez-Lobato, Simon Peyton Jones, Richard G. Baraniuk, Cheng Zhang

    Abstract: Online education platforms enable teachers to share a large number of educational resources such as questions to form exercises and quizzes for students. With large volumes of available questions, it is important to have an automated way to quantify their properties and intelligently select them for students, enabling effective and personalized learning experiences. In this work, we propose a fram…

    Submitted 28 February, 2021; v1 submitted 12 March, 2020; originally announced March 2020.

    Comments: Accepted at AAAI-EAAI 2021

  29. arXiv:2002.11912  [pdf, other]

    stat.ML cs.CG cs.CV cs.LG

    Max-Affine Spline Insights into Deep Generative Networks

    Authors: Randall Balestriero, Sebastien Paris, Richard Baraniuk

    Abstract: We connect a large class of Generative Deep Networks (GDNs) with spline operators in order to derive their properties, limitations, and new opportunities. By characterizing the latent space partition, dimension and angularity of the generated manifold, we relate the manifold dimension and approximation error to the sample size. The manifold-per-region affine subspace defines a local coordinate bas…

    Submitted 25 February, 2020; originally announced February 2020.

  30. arXiv:2002.10614  [pdf, other]

    cs.LG stat.ML

    Subspace Fitting Meets Regression: The Effects of Supervision and Orthonormality Constraints on Double Descent of Generalization Errors

    Authors: Yehuda Dar, Paul Mayer, Lorenzo Luzi, Richard G. Baraniuk

    Abstract: We study the linear subspace fitting problem in the overparameterized setting, where the estimated subspace can perfectly interpolate the training examples. Our scope includes the least-squares solutions to subspace fitting tasks with varying levels of supervision in the training data (i.e., the proportion of input-output examples of the desired low-dimensional mapping) and orthonormality of the v…

    Submitted 20 August, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

  31. arXiv:2002.10583  [pdf, other]

    cs.LG cs.NE stat.ML

    Scheduled Restart Momentum for Accelerated Stochastic Gradient Descent

    Authors: Bao Wang, Tan M. Nguyen, Andrea L. Bertozzi, Richard G. Baraniuk, Stanley J. Osher

    Abstract: Stochastic gradient descent (SGD) with constant momentum and its variants such as Adam are the optimization algorithms of choice for training deep neural networks (DNNs). Since DNN training is incredibly computationally expensive, there is great interest in speeding up the convergence. Nesterov accelerated gradient (NAG) improves the convergence rate of gradient descent (GD) for convex optimizatio…

    Submitted 26 April, 2020; v1 submitted 24 February, 2020; originally announced February 2020.

    Comments: 35 pages, 16 figures, 18 tables

  32. arXiv:1912.03978  [pdf, other]

    cs.LG cs.CV stat.ML

    InfoCNF: An Efficient Conditional Continuous Normalizing Flow with Adaptive Solvers

    Authors: Tan M. Nguyen, Animesh Garg, Richard G. Baraniuk, Anima Anandkumar

    Abstract: Continuous Normalizing Flows (CNFs) have emerged as promising deep generative models for a wide range of tasks thanks to their invertibility and exact likelihood estimation. However, conditioning CNFs on signals of interest for conditional image generation and downstream predictive tasks is inefficient due to the high-dimensional latent code generated by the model, which needs to be of the same si…

    Submitted 9 December, 2019; originally announced December 2019.

    Comments: 17 pages, 14 figures, 2 tables

  33. arXiv:1910.04743  [pdf, other]

    stat.ML cs.LG

    The Implicit Regularization of Ordinary Least Squares Ensembles

    Authors: Daniel LeJeune, Hamid Javadi, Richard G. Baraniuk

    Abstract: Ensemble methods that average over a collection of independent predictors that are each limited to a subsampling of both the examples and features of the training data command a significant presence in machine learning, such as the ever-popular random forest, yet the nature of the subsampling effect, particularly of the features, is not well understood. We study the case of an ensemble of linear p…

    Submitted 24 March, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: 18 pages, 4 figures. To appear in AISTATS 2020

  34. arXiv:1909.11957  [pdf, other]

    cs.LG stat.ML

    Drawing Early-Bird Tickets: Towards More Efficient Training of Deep Networks

    Authors: Haoran You, Chaojian Li, Pengfei Xu, Yonggan Fu, Yue Wang, Xiaohan Chen, Richard G. Baraniuk, Zhangyang Wang, Yingyan Lin

    Abstract: (Frankle & Carbin, 2019) shows that there exist winning tickets (small but critical subnetworks) for dense, randomly initialized networks, that can be trained alone to achieve comparable accuracies to the latter in a similar number of iterations. However, the identification of these winning tickets still requires the costly train-prune-retrain process, limiting their practical benefits. In this pa…

    Submitted 16 February, 2022; v1 submitted 26 September, 2019; originally announced September 2019.

    Comments: Accepted as ICLR2020 Spotlight

  35. arXiv:1907.04572  [pdf, other]

    cs.LG cs.CV stat.ML

    Out-of-Distribution Detection Using Neural Rendering Generative Models

    Authors: Yujia Huang, Sihui Dai, Tan Nguyen, Richard G. Baraniuk, Anima Anandkumar

    Abstract: Out-of-distribution (OoD) detection is a natural downstream task for deep generative models, due to their ability to learn the input probability distribution. There are mainly two classes of approaches for OoD detection using deep generative models, viz., based on likelihood measure and the reconstruction loss. However, both approaches are unable to carry out OoD detection effectively, especially…

    Submitted 10 July, 2019; originally announced July 2019.

  36. arXiv:1905.11639  [pdf, other]

    cs.LG stat.ML

    Implicit Rugosity Regularization via Data Augmentation

    Authors: Daniel LeJeune, Randall Balestriero, Hamid Javadi, Richard G. Baraniuk

    Abstract: Deep (neural) networks have been applied productively in a wide range of supervised and unsupervised learning tasks. Unlike classical machine learning algorithms, deep networks typically operate in the overparameterized regime, where the number of parameters is larger than the number of training data points. Consequently, understanding the generalization properties and the role of (explicit…

    Submitted 10 October, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: 15 pages, 12 figures

  37. arXiv:1905.09190  [pdf, other]

    cs.LG stat.ML

    Thresholding Graph Bandits with GrAPL

    Authors: Daniel LeJeune, Gautam Dasarathy, Richard G. Baraniuk

    Abstract: In this paper, we introduce a new online decision making paradigm that we call Thresholding Graph Bandits. The main goal is to efficiently identify a subset of arms in a multi-armed bandit problem whose means are above a specified threshold. While traditionally in such problems, the arms are assumed to be independent, in our paradigm we further suppose that we have access to the similarity between…

    Submitted 24 March, 2020; v1 submitted 22 May, 2019; originally announced May 2019.

    Comments: 14 pages, 3 figures. To appear in AISTATS 2020

  38. arXiv:1905.08831  [pdf, other]

    cs.SI cs.LG eess.SP stat.ML

    IdeoTrace: A Framework for Ideology Tracing with a Case Study on the 2016 U.S. Presidential Election

    Authors: Indu Manickam, Andrew S. Lan, Gautam Dasarathy, Richard G. Baraniuk

    Abstract: The 2016 United States presidential election has been characterized as a period of extreme divisiveness that was exacerbated on social media by the influence of fake news, trolls, and social bots. However, the extent to which the public became more polarized in response to these influences over the course of the election is not well understood. In this paper we propose IdeoTrace, a framework for (…

    Submitted 30 May, 2019; v1 submitted 21 May, 2019; originally announced May 2019.

    Comments: 9 pages, 4 figures, submitted to ASONAM 2019

  39. arXiv:1905.08443  [pdf, other]

    cs.LG stat.ML

    The Geometry of Deep Networks: Power Diagram Subdivision

    Authors: Randall Balestriero, Romain Cosentino, Behnaam Aazhang, Richard Baraniuk

    Abstract: We study the geometry of deep (neural) networks (DNs) with piecewise affine and convex nonlinearities. The layers of such DNs have been shown to be max-affine spline operators (MASOs) that partition their input space and apply a region-dependent affine mapping to their input to produce their output. We demonstrate that each MASO layer's input space partitioning corresponds to a power di…

    Submitted 21 May, 2019; originally announced May 2019.

  40. arXiv:1902.09465  [pdf, other]

    cs.DS cs.LG stat.ML

    Adaptive Estimation for Approximate k-Nearest-Neighbor Computations

    Authors: Daniel LeJeune, Richard G. Baraniuk, Reinhard Heckel

    Abstract: Algorithms often carry out equally many computations for "easy" and "hard" problem instances. In particular, algorithms for finding nearest neighbors typically have the same running time regardless of the particular problem instance. In this paper, we consider the approximate k-nearest-neighbor problem, which is the problem of finding a subset of O(k) points in a given set of points that contains…

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: 11 pages, 2 figures. To appear in AISTATS 2019

    Journal ref: Proceedings of Machine Learning Research 89 (2019):3099-3107

  41. arXiv:1902.06687  [pdf, other]

    cs.DS cs.CG cs.LG eess.SP stat.ML

    Sub-linear Memory Sketches for Near Neighbor Search on Streaming Data

    Authors: Benjamin Coleman, Richard G. Baraniuk, Anshumali Shrivastava

    Abstract: We present the first sublinear memory sketch that can be queried to find the nearest neighbors in a dataset. Our online sketching algorithm compresses an N element dataset to a sketch of size $O(N^b \log^3 N)$ in $O(N^{(b+1)} \log^3 N)$ time, where $b < 1$. This sketch can correctly report the nearest neighbors of any query that satisfies a stability condition parameterized by $b$. We achieve subl…

    Submitted 14 September, 2020; v1 submitted 18 February, 2019; originally announced February 2019.

    Comments: Published in ICML2020

  42. arXiv:1811.02657  [pdf, other]

    cs.CV cs.AI cs.LG cs.NE stat.ML

    A Bayesian Perspective of Convolutional Neural Networks through a Deconvolutional Generative Model

    Authors: Tan Nguyen, Nhat Ho, Ankit Patel, Anima Anandkumar, Michael I. Jordan, Richard G. Baraniuk

    Abstract: Inspired by the success of Convolutional Neural Networks (CNNs) for supervised prediction in images, we design the Deconvolutional Generative Model (DGM), a new probabilistic generative model whose inference calculations correspond to those in a given CNN architecture. The DGM uses a CNN to design the prior distribution in the probabilistic model. Furthermore, the DGM generates images from coarse…

    Submitted 9 December, 2019; v1 submitted 31 October, 2018; originally announced November 2018.

    Comments: Keywords: neural nets, generative models, semi-supervised learning, cross-entropy, statistical guarantees. 80 pages, 7 figures, 8 tables

  43. arXiv:1810.09274  [pdf, other]

    cs.LG stat.ML

    From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference

    Authors: Randall Balestriero, Richard G. Baraniuk

    Abstract: Nonlinearity is crucial to the performance of a deep (neural) network (DN). To date there has been little progress understanding the menagerie of available nonlinearities, but recently progress has been made on understanding the rôle played by piecewise affine and convex nonlinearities like the ReLU and absolute value activation functions and max-pooling. In particular, DN layers constructed from…

    Submitted 22 October, 2018; originally announced October 2018.

  44. arXiv:1806.04310  [pdf, other]

    cs.DS cs.LG stat.ML

    MISSION: Ultra Large-Scale Feature Selection using Count-Sketches

    Authors: Amirali Aghazadeh, Ryan Spring, Daniel LeJeune, Gautam Dasarathy, Anshumali Shrivastava, Richard G. Baraniuk

    Abstract: Feature selection is an important challenge in machine learning. It plays a crucial role in the explainability of machine-driven decisions that are rapidly permeating throughout modern society. Unfortunately, the explosion in the size and dimensionality of real-world datasets poses a severe challenge to standard feature selection algorithms. Today, it is not uncommon for datasets to have billions…

    Submitted 11 June, 2018; originally announced June 2018.

  45. arXiv:1805.10531  [pdf, other]

    stat.ML cs.CV cs.LG

    Unsupervised Learning with Stein's Unbiased Risk Estimator

    Authors: Christopher A. Metzler, Ali Mousavi, Reinhard Heckel, Richard G. Baraniuk

    Abstract: Learning from unlabeled and noisy data is one of the grand challenges of machine learning. As such, it has seen a flurry of research with new ideas proposed continuously. In this work, we revisit a classical idea: Stein's Unbiased Risk Estimator (SURE). We show that, in the context of image recovery, SURE and its generalizations can be used to train convolutional neural networks (CNNs) for a range…

    Submitted 22 July, 2020; v1 submitted 26 May, 2018; originally announced May 2018.

  46. arXiv:1805.06576  [pdf, other]

    stat.ML cs.LG

    Mad Max: Affine Spline Insights into Deep Learning

    Authors: Randall Balestriero, Richard Baraniuk

    Abstract: We build a rigorous bridge between deep networks (DNs) and approximation theory via spline functions and operators. Our key result is that a large class of DNs can be written as a composition of max-affine spline operators (MASOs), which provide a powerful portal through which to view and analyze their inner workings. For instance, conditioned on the input signal, the output of a MASO DN can be wr…

    Submitted 11 November, 2018; v1 submitted 16 May, 2018; originally announced May 2018.

  47. arXiv:1803.00212  [pdf, other]

    stat.ML cs.LG

    prDeep: Robust Phase Retrieval with a Flexible Deep Network

    Authors: Christopher A. Metzler, Philip Schniter, Ashok Veeraraghavan, Richard G. Baraniuk

    Abstract: Phase retrieval algorithms have become an important component in many modern computational imaging systems. For instance, in the context of ptychography and speckle correlation imaging, they enable imaging past the diffraction limit and through scattering media, respectively. Unfortunately, traditional phase retrieval algorithms struggle in the presence of noise. Progress has been made recently on…

    Submitted 29 June, 2018; v1 submitted 28 February, 2018; originally announced March 2018.

  48. arXiv:1802.10172  [pdf, other]

    cs.LG stat.ML

    Semi-Supervised Learning Enabled by Multiscale Deep Neural Network Inversion

    Authors: Randall Balestriero, Herve Glotin, Richard Baraniuk

    Abstract: Deep Neural Networks (DNNs) provide state-of-the-art solutions in several difficult machine perceptual tasks. However, their performance relies on the availability of a large set of labeled training data, which limits the breadth of their applicability. Hence, there is a need for new semi-supervised learning methods for DNNs that can leverage both (a small amount of) labeled and unlabeled tr…

    Submitted 27 February, 2018; originally announced February 2018.

  49. arXiv:1712.09117  [pdf, other]

    eess.AS cs.SD stat.ML

    Overcomplete Frame Thresholding for Acoustic Scene Analysis

    Authors: Romain Cosentino, Randall Balestriero, Richard Baraniuk, Ankit Patel

    Abstract: In this work, we derive a generic overcomplete frame thresholding scheme based on risk minimization. Overcomplete frames being favored for analysis tasks such as classification, regression or anomaly detection, we provide a way to leverage those optimal representations in real-world applications through the use of thresholding. We validate the method on a large scale bird activity detection task v…

    Submitted 25 December, 2017; originally announced December 2017.

  50. arXiv:1711.04313  [pdf, other]

    stat.ML cs.LG

    Semi-Supervised Learning via New Deep Network Inversion

    Authors: Randall Balestriero, Vincent Roger, Herve G. Glotin, Richard G. Baraniuk

    Abstract: We exploit a recently derived inversion scheme for arbitrary deep neural networks to develop a new semi-supervised learning framework that applies to a wide range of systems and problems. The approach outperforms current state-of-the-art methods on MNIST reaching $99.14\%$ of test set accuracy while using $5$ labeled examples per class. Experiments with one-dimensional signals highlight the genera…

    Submitted 12 November, 2017; originally announced November 2017.

    Comments: arXiv admin note: substantial text overlap with arXiv:1710.09302