Showing 1–12 of 12 results for author: Bartoldson, B

  1. arXiv:2406.04273

    cs.CV cs.AI

    ELFS: Enhancing Label-Free Coreset Selection via Clustering-based Pseudo-Labeling

    Authors: Haizhong Zheng, Elisa Tsai, Yifu Lu, Jiachen Sun, Brian R. Bartoldson, Bhavya Kailkhura, Atul Prakash

    Abstract: High-quality human-annotated data is crucial for modern deep learning pipelines, yet the human annotation process is both costly and time-consuming. Given a constrained human labeling budget, selecting an informative and representative data subset for labeling can significantly reduce human annotation effort. Well-performing state-of-the-art (SOTA) coreset selection methods require ground-truth la…

    Submitted 6 June, 2024; originally announced June 2024.

  2. arXiv:2405.17399

    cs.LG cs.AI

    Transformers Can Do Arithmetic with the Right Embeddings

    Authors: Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Geiping, Avi Schwarzschild, Tom Goldstein

    Abstract: The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits. We mend this problem by adding an embedding to each digit that encodes its position relative to the start of the number. In addition to the boost these embeddings provide on their own, we show that this fix ena…

    Submitted 27 May, 2024; originally announced May 2024.

  3. arXiv:2404.09349

    cs.LG cs.CR cs.CV

    Adversarial Robustness Limits via Scaling-Law and Human-Alignment Studies

    Authors: Brian R. Bartoldson, James Diffenderfer, Konstantinos Parasyris, Bhavya Kailkhura

    Abstract: This paper revisits the simple, long-studied, yet still unsolved problem of making image classifiers robust to imperceptible perturbations. Taking CIFAR10 as an example, SOTA clean accuracy is about $100$%, but SOTA robustness to $\ell_{\infty}$-norm bounded perturbations barely exceeds $70$%. To understand this gap, we analyze how model size, dataset size, and synthetic data quality affect robust…

    Submitted 14 April, 2024; originally announced April 2024.

  4. arXiv:2403.15447

    cs.CL cs.AI

    Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression

    Authors: Junyuan Hong, Jinhao Duan, Chenhui Zhang, Zhangheng Li, Chulin Xie, Kelsey Lieberman, James Diffenderfer, Brian Bartoldson, Ajay Jaiswal, Kaidi Xu, Bhavya Kailkhura, Dan Hendrycks, Dawn Song, Zhangyang Wang, Bo Li

    Abstract: Compressing high-capability Large Language Models (LLMs) has emerged as a favored strategy for resource-efficient inferences. While state-of-the-art (SoTA) compression methods boast impressive advancements in preserving benign task performance, the potential risks of compression in terms of safety and trustworthiness have been largely neglected. This study conducts the first, thorough evaluation o…

    Submitted 4 June, 2024; v1 submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted to ICML'24

  5. arXiv:2310.05914

    cs.CL cs.LG

    NEFTune: Noisy Embeddings Improve Instruction Finetuning

    Authors: Neel Jain, Ping-yeh Chiang, Yuxin Wen, John Kirchenbauer, Hong-Min Chu, Gowthami Somepalli, Brian R. Bartoldson, Bhavya Kailkhura, Avi Schwarzschild, Aniruddha Saha, Micah Goldblum, Jonas Geiping, Tom Goldstein

    Abstract: We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on AlpacaEval, which rises to 64.69% using noisy embeddings. NEFTune also improves over strong baselines on modern instruction datasets. Models trained with Evol-Instru…

    Submitted 10 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 25 pages, Code is available on Github: https://github.com/neelsjain/NEFTune

  6. arXiv:2304.00338

    cs.LG math.NA

    Scientific Computing Algorithms to Learn Enhanced Scalable Surrogates for Mesh Physics

    Authors: Brian R. Bartoldson, Yeping Hu, Amar Saini, Jose Cadena, Yucheng Fu, Jie Bao, Zhijie Xu, Brenda Ng, Phan Nguyen

    Abstract: Data-driven modeling approaches can produce fast surrogates to study large-scale physics problems. Among them, graph neural networks (GNNs) that operate on mesh-based data are desirable because they possess inductive biases that promote physical faithfulness, but hardware limitations have precluded their application to large computational domains. We show that it is \textit{possible} to train a cl…

    Submitted 1 April, 2023; originally announced April 2023.

    Comments: ICLR 2023 Workshop on Physics for Machine Learning

  7. arXiv:2210.06640

    cs.LG

    Compute-Efficient Deep Learning: Algorithmic Trends and Opportunities

    Authors: Brian R. Bartoldson, Bhavya Kailkhura, Davis Blalock

    Abstract: Although deep learning has made great progress in recent years, the exploding economic and environmental costs of training neural networks are becoming unsustainable. To address this problem, there has been a great deal of research on *algorithmically-efficient deep learning*, which seeks to reduce training costs not at the hardware or implementation level, but through changes in the semantics of…

    Submitted 21 March, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

    Comments: 77 pages

    Journal ref: Journal of Machine Learning Research (2023)

  8. arXiv:2207.04075

    cs.LG

    Models Out of Line: A Fourier Lens on Distribution Shift Robustness

    Authors: Sara Fridovich-Keil, Brian R. Bartoldson, James Diffenderfer, Bhavya Kailkhura, Peer-Timo Bremer

    Abstract: Improving the accuracy of deep neural networks (DNNs) on out-of-distribution (OOD) data is critical to an acceptance of deep learning (DL) in real world applications. It has been observed that accuracies on in-distribution (ID) versus OOD data follow a linear trend and models that outperform this baseline are exceptionally rare (and referred to as "effectively robust"). Recently, some promising ap…

    Submitted 8 July, 2022; originally announced July 2022.

  9. arXiv:2112.11656

    cs.LG cs.CE

    Latent Space Simulation for Carbon Capture Design Optimization

    Authors: Brian Bartoldson, Rui Wang, Yucheng Fu, David Widemann, Sam Nguyen, Jie Bao, Zhijie Xu, Brenda Ng

    Abstract: The CO2 capture efficiency in solvent-based carbon capture systems (CCSs) critically depends on the gas-solvent interfacial area (IA), making maximization of IA a foundational challenge in CCS design. While the IA associated with a particular CCS design can be estimated via a computational fluid dynamics (CFD) simulation, using CFD to derive the IAs associated with numerous CCS designs is prohibit…

    Submitted 21 December, 2021; originally announced December 2021.

    Comments: Extended version of a paper appearing in the Proceedings of the 34th Annual Conference on Innovative Applications of Artificial Intelligence (IAAI-22)

  10. arXiv:2106.09129

    cs.LG

    A Winning Hand: Compressing Deep Networks Can Improve Out-Of-Distribution Robustness

    Authors: James Diffenderfer, Brian R. Bartoldson, Shreya Chaganti, Jize Zhang, Bhavya Kailkhura

    Abstract: Successful adoption of deep learning (DL) in the wild requires models to be: (1) compact, (2) accurate, and (3) robust to distributional shifts. Unfortunately, efforts towards simultaneously meeting these requirements have mostly been unsuccessful. This raises an important question: Is the inability to create Compact, Accurate, and Robust Deep neural networks (CARDs) fundamental? To answer this qu…

    Submitted 5 November, 2021; v1 submitted 16 June, 2021; originally announced June 2021.

  11. arXiv:1906.03728

    cs.LG stat.ML

    The Generalization-Stability Tradeoff In Neural Network Pruning

    Authors: Brian R. Bartoldson, Ari S. Morcos, Adrian Barbu, Gordon Erlebacher

    Abstract: Pruning neural network parameters is often viewed as a means to compress models, but pruning has also been motivated by the desire to prevent overfitting. This motivation is particularly relevant given the perhaps surprising observation that a wide variety of pruning approaches increase test accuracy despite sometimes massive reductions in parameter counts. To better understand this phenomenon, we…

    Submitted 22 October, 2020; v1 submitted 9 June, 2019; originally announced June 2019.

    Comments: NeurIPS 2020 conference paper

  12. arXiv:1805.01930

    stat.ML cs.LG

    Enhancing the Regularization Effect of Weight Pruning in Artificial Neural Networks

    Authors: Brian Bartoldson, Adrian Barbu, Gordon Erlebacher

    Abstract: Artificial neural networks (ANNs) may not be worth their computational/memory costs when used in mobile phones or embedded devices. Parameter-pruning algorithms combat these costs, with some algorithms capable of removing over 90% of an ANN's weights without harming the ANN's performance. Removing weights from an ANN is a form of regularization, but existing pruning algorithms do not significantly…

    Submitted 4 May, 2018; originally announced May 2018.