Showing 1–16 of 16 results for author: LeJeune, D

Searching in archive cs.
  1. arXiv:2405.05818  [pdf, ps, other]

    cs.DS cs.LG math.NA math.OC

    Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems

    Authors: Michał Dereziński, Daniel LeJeune, Deanna Needell, Elizaveta Rebrova

    Abstract: While effective in practice, iterative methods for solving large systems of linear equations can be significantly affected by problem-dependent condition number quantities. This makes characterizing their time complexity challenging, particularly when we wish to make comparisons between deterministic and stochastic methods, that may or may not rely on preconditioning and/or fast matrix multiplicat…

    Submitted 9 May, 2024; originally announced May 2024.

    Comments: 32 pages

  2. arXiv:2308.15478  [pdf, other]

    cs.LG cs.CV

    An Adaptive Tangent Feature Perspective of Neural Networks

    Authors: Daniel LeJeune, Sina Alemohammad

    Abstract: In order to better understand feature learning in neural networks, we propose a framework for understanding linear models in tangent feature space where the features are allowed to be transformed during training. We consider linear transformations of features, resulting in a joint optimization over parameters and transformations with a bilinear interpolation constraint. We show that this optimizat…

    Submitted 20 February, 2024; v1 submitted 29 August, 2023; originally announced August 2023.

    Comments: 14 pages, 3 figures. Appeared at the First Conference on Parsimony and Learning (CPAL 2024)

  3. arXiv:2307.01850  [pdf, other]

    cs.LG cs.AI cs.CV

    Self-Consuming Generative Models Go MAD

    Authors: Sina Alemohammad, Josue Casco-Rodriguez, Lorenzo Luzi, Ahmed Imtiaz Humayun, Hossein Babaei, Daniel LeJeune, Ali Siahkoohi, Richard G. Baraniuk

    Abstract: Seismic advances in generative AI algorithms for imagery, text, and other data types has led to the temptation to use synthetic data to train next-generation models. Repeating this process creates an autophagous (self-consuming) loop whose properties are poorly understood. We conduct a thorough analytical and empirical analysis using state-of-the-art generative image models of three families of au…

    Submitted 4 July, 2023; originally announced July 2023.

    Comments: 31 pages, 31 figures, pre-print

  4. arXiv:2301.05187  [pdf, other]

    cs.CV cs.GR eess.IV

    WIRE: Wavelet Implicit Neural Representations

    Authors: Vishwanath Saragadam, Daniel LeJeune, Jasper Tan, Guha Balakrishnan, Ashok Veeraraghavan, Richard G. Baraniuk

    Abstract: Implicit neural representations (INRs) have recently advanced numerous vision-related areas. INR performance depends strongly on the choice of the nonlinear activation function employed in its multilayer perceptron (MLP) network. A wide range of nonlinearities have been explored, but, unfortunately, current INRs designed to have high accuracy also suffer from poor robustness (to signal noise, para…

    Submitted 5 January, 2023; originally announced January 2023.

  5. arXiv:2211.03751  [pdf, other]

    math.NA cs.DS math.ST

    Asymptotics of the Sketched Pseudoinverse

    Authors: Daniel LeJeune, Pratik Patil, Hamid Javadi, Richard G. Baraniuk, Ryan J. Tibshirani

    Abstract: We take a random matrix theory approach to random sketching and show an asymptotic first-order equivalence of the regularized sketched pseudoinverse of a positive semidefinite matrix to a certain evaluation of the resolvent of the same matrix. We focus on real-valued regularization and extend previous results on an asymptotic equivalence of random matrices to the real setting, providing a precise…

    Submitted 6 October, 2023; v1 submitted 7 November, 2022; originally announced November 2022.

    Comments: 45 pages, 9 figures

    MSC Class: 15B52; 46L54; 62J07

  6. arXiv:2210.11589  [pdf, other]

    cs.LG stat.ML

    Monotonic Risk Relationships under Distribution Shifts for Regularized Risk Minimization

    Authors: Daniel LeJeune, Jiayu Liu, Reinhard Heckel

    Abstract: Machine learning systems are often applied to data that is drawn from a different distribution than the training distribution. Recent work has shown that for a variety of classification and signal reconstruction problems, the out-of-distribution performance is strongly linearly correlated with the in-distribution performance. If this relationship or more generally a monotonic one holds, it has imp…

    Submitted 20 July, 2023; v1 submitted 20 October, 2022; originally announced October 2022.

    Comments: 34 pages, 7 figures

  7. arXiv:2205.14055  [pdf, other]

    cs.LG stat.ML

    A Blessing of Dimensionality in Membership Inference through Regularization

    Authors: Jasper Tan, Daniel LeJeune, Blake Mason, Hamid Javadi, Richard G. Baraniuk

    Abstract: Is overparameterization a privacy liability? In this work, we study the effect that the number of parameters has on a classifier's vulnerability to membership inference attacks. We first demonstrate how the number of parameters of a model can induce a privacy–utility trade-off: increasing the number of parameters generally improves generalization performance at the expense of lower privacy. Howev…

    Submitted 13 April, 2023; v1 submitted 27 May, 2022; originally announced May 2022.

    Comments: 26 pages, 14 figures

  8. arXiv:2106.07769  [pdf, other]

    cs.LG stat.ML

    The Flip Side of the Reweighted Coin: Duality of Adaptive Dropout and Regularization

    Authors: Daniel LeJeune, Hamid Javadi, Richard G. Baraniuk

    Abstract: Among the most successful methods for sparsifying deep (neural) networks are those that adaptively mask the network weights throughout training. By examining this masking, or dropout, in the linear case, we uncover a duality between such adaptive methods and regularization through the so-called "$η$-trick" that casts both as iteratively reweighted optimizations. We show that any dropout strategy t…

    Submitted 3 January, 2022; v1 submitted 14 June, 2021; originally announced June 2021.

    Comments: 19 pages, 2 figures. Appeared in NeurIPS 2021. Small typographical correction

  9. arXiv:2103.05621  [pdf, other]

    cs.LG

    The Common Intuition to Transfer Learning Can Win or Lose: Case Studies for Linear Regression

    Authors: Yehuda Dar, Daniel LeJeune, Richard G. Baraniuk

    Abstract: We study a fundamental transfer learning process from source to target linear regression tasks, including overparameterized settings where there are more learned parameters than data samples. The target task learning is addressed by using its training data together with the parameters previously computed for the source task. We define a transfer learning approach to the target task as a linear reg…

    Submitted 31 May, 2024; v1 submitted 9 March, 2021; originally announced March 2021.

  10. arXiv:2010.13975  [pdf, other]

    eess.SP cs.LG

    Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels

    Authors: Sina Alemohammad, Hossein Babaei, Randall Balestriero, Matt Y. Cheung, Ahmed Imtiaz Humayun, Daniel LeJeune, Naiming Liu, Lorenzo Luzi, Jasper Tan, Zichao Wang, Richard G. Baraniuk

    Abstract: High dimensionality poses many challenges to the use of data, from visualization and interpretation, to prediction and storage for historical preservation. Techniques abound to reduce the dimensionality of fixed-length sequences, yet these methods rarely generalize to variable-length sequences. To address this gap, we extend existing methods that rely on the use of kernels to variable-length seque…

    Submitted 17 April, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

  11. arXiv:1910.04743  [pdf, other]

    stat.ML cs.LG

    The Implicit Regularization of Ordinary Least Squares Ensembles

    Authors: Daniel LeJeune, Hamid Javadi, Richard G. Baraniuk

    Abstract: Ensemble methods that average over a collection of independent predictors that are each limited to a subsampling of both the examples and features of the training data command a significant presence in machine learning, such as the ever-popular random forest, yet the nature of the subsampling effect, particularly of the features, is not well understood. We study the case of an ensemble of linear p…

    Submitted 24 March, 2020; v1 submitted 10 October, 2019; originally announced October 2019.

    Comments: 18 pages, 4 figures. To appear in AISTATS 2020

  12. arXiv:1905.11639  [pdf, other]

    cs.LG stat.ML

    Implicit Rugosity Regularization via Data Augmentation

    Authors: Daniel LeJeune, Randall Balestriero, Hamid Javadi, Richard G. Baraniuk

    Abstract: Deep (neural) networks have been applied productively in a wide range of supervised and unsupervised learning tasks. Unlike classical machine learning algorithms, deep networks typically operate in the overparameterized regime, where the number of parameters is larger than the number of training data points. Consequently, understanding the generalization properties and the role of (explicit…

    Submitted 10 October, 2019; v1 submitted 28 May, 2019; originally announced May 2019.

    Comments: 15 pages, 12 figures

  13. arXiv:1905.09190  [pdf, other]

    cs.LG stat.ML

    Thresholding Graph Bandits with GrAPL

    Authors: Daniel LeJeune, Gautam Dasarathy, Richard G. Baraniuk

    Abstract: In this paper, we introduce a new online decision making paradigm that we call Thresholding Graph Bandits. The main goal is to efficiently identify a subset of arms in a multi-armed bandit problem whose means are above a specified threshold. While traditionally in such problems, the arms are assumed to be independent, in our paradigm we further suppose that we have access to the similarity between…

    Submitted 24 March, 2020; v1 submitted 22 May, 2019; originally announced May 2019.

    Comments: 14 pages, 3 figures. To appear in AISTATS 2020

  14. arXiv:1902.09465  [pdf, other]

    cs.DS cs.LG stat.ML

    Adaptive Estimation for Approximate k-Nearest-Neighbor Computations

    Authors: Daniel LeJeune, Richard G. Baraniuk, Reinhard Heckel

    Abstract: Algorithms often carry out equally many computations for "easy" and "hard" problem instances. In particular, algorithms for finding nearest neighbors typically have the same running time regardless of the particular problem instance. In this paper, we consider the approximate k-nearest-neighbor problem, which is the problem of finding a subset of O(k) points in a given set of points that contains…

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: 11 pages, 2 figures. To appear in AISTATS 2019

    Journal ref: Proceedings of Machine Learning Research 89 (2019):3099-3107

  15. arXiv:1806.04310  [pdf, other]

    cs.DS cs.LG stat.ML

    MISSION: Ultra Large-Scale Feature Selection using Count-Sketches

    Authors: Amirali Aghazadeh, Ryan Spring, Daniel LeJeune, Gautam Dasarathy, Anshumali Shrivastava, Richard G. Baraniuk

    Abstract: Feature selection is an important challenge in machine learning. It plays a crucial role in the explainability of machine-driven decisions that are rapidly permeating throughout modern society. Unfortunately, the explosion in the size and dimensionality of real-world datasets poses a severe challenge to standard feature selection algorithms. Today, it is not uncommon for datasets to have billions…

    Submitted 11 June, 2018; originally announced June 2018.

  16. arXiv:1303.0866  [pdf]

    cs.DB

    Adaptive Partitioning and its Applicability to a Highly Scalable and Available Geo-Spatial Indexing Solution

    Authors: David W. LeJeune Jr

    Abstract: Satellite Tracking of People (STOP) tracks thousands of GPS-enabled devices 24 hours a day and 365 days a year. With locations captured for each device every minute, STOP servers receive tens of millions of points each day. In addition to cataloging these points in real-time, STOP must also respond to questions from customers such as, "What devices of mine were at this location two months ago?" Th…

    Submitted 4 March, 2013; originally announced March 2013.