Skip to main content

Showing 1–50 of 215 results for author: Goldstein, T

.
  1. arXiv:2406.19314  [pdf, other

    cs.CL cs.AI cs.LG

    LiveBench: A Challenging, Contamination-Free LLM Benchmark

    Authors: Colin White, Samuel Dooley, Manley Roberts, Arka Pal, Ben Feuer, Siddhartha Jain, Ravid Shwartz-Ziv, Neel Jain, Khalid Saifullah, Siddartha Naidu, Chinmay Hegde, Yann LeCun, Tom Goldstein, Willie Neiswanger, Micah Goldblum

    Abstract: Test set contamination, wherein test data from a benchmark ends up in a newer model's training set, is a well-documented obstacle for fair LLM evaluation and can quickly render benchmarks obsolete. To mitigate this, many recent benchmarks crowdsource new prompts and evaluations from human or LLM judges; however, these can introduce significant biases, and break down when scoring hard questions. In… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

  2. arXiv:2406.10328  [pdf, other

    cs.CV cs.CL cs.LG

    From Pixels to Prose: A Large Dataset of Dense Image Captions

    Authors: Vasu Singla, Kaiyu Yue, Sukriti Paul, Reza Shirkavand, Mayuka Jayawardhana, Alireza Ganjdanesh, Heng Huang, Abhinav Bhatele, Gowthami Somepalli, Tom Goldstein

    Abstract: Training large vision-language models requires extensive, high-quality image-text pairs. Existing web-scraped datasets, however, are noisy and lack detailed image descriptions. To bridge this gap, we introduce PixelProse, a comprehensive dataset of over 16M (million) synthetically generated captions, leveraging cutting-edge vision-language models for detailed and accurate descriptions. To ensure d… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: pixelprose 16M dataset

  3. arXiv:2406.10323  [pdf, other

    cs.CL

    GenQA: Generating Millions of Instructions from a Handful of Prompts

    Authors: Jiuhai Chen, Rifaa Qadri, Yuxin Wen, Neel Jain, John Kirchenbauer, Tianyi Zhou, Tom Goldstein

    Abstract: Most public instruction finetuning datasets are relatively small compared to the closed source datasets used to train industry models. To study questions about finetuning at scale, such as curricula and learning rate cooldown schedules, there is a need for industrial-scale datasets. However, this scale necessitates a data generation process that is almost entirely automated. In this work, we study… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9.5 pages, 6 Figures, and 3 tables in the main body. Dataset available at https://huggingface.co/datasets/tomg-group-umd/GenQA

  4. arXiv:2406.10219  [pdf, other

    cs.CV cs.GR

    PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting

    Authors: Alex Hanson, Allen Tu, Vasu Singla, Mayuka Jayawardhana, Matthias Zwicker, Tom Goldstein

    Abstract: Recent advancements in novel view synthesis have enabled real-time rendering speeds and high reconstruction accuracy. 3D Gaussian Splatting (3D-GS), a foundational point-based parametric 3D scene representation, models scenes as large sets of 3D Gaussians. Complex scenes can comprise of millions of Gaussians, amounting to large storage and memory requirements that limit the viability of 3D-GS on d… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

  5. arXiv:2406.10209  [pdf, other

    cs.CL

    Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs

    Authors: Abhimanyu Hans, Yuxin Wen, Neel Jain, John Kirchenbauer, Hamid Kazemi, Prajwal Singhania, Siddharth Singh, Gowthami Somepalli, Jonas Gei**, Abhinav Bhatele, Tom Goldstein

    Abstract: Large language models can memorize and repeat their training data, causing privacy and copyright risks. To mitigate memorization, we introduce a subtle modification to the next-token training objective that we call the goldfish loss. During training, a randomly sampled subset of tokens are excluded from the loss computation. These dropped tokens are not memorized by the model, which prevents verba… ▽ More

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 9.5 pages, 8 figures, and 1 table in the main body. Code available at https://github.com/ahans30/goldfish-loss

  6. arXiv:2406.07657  [pdf, other

    cs.LG cs.CL

    OPTune: Efficient Online Preference Tuning

    Authors: Lichang Chen, Jiuhai Chen, Chenxi Liu, John Kirchenbauer, Davit Soselia, Chen Zhu, Tom Goldstein, Tianyi Zhou, Heng Huang

    Abstract: Reinforcement learning with human feedback~(RLHF) is critical for aligning Large Language Models (LLMs) with human preference. Compared to the widely studied offline version of RLHF, \emph{e.g.} direct preference optimization (DPO), recent works have shown that the online variants achieve even better alignment. However, online alignment requires on-the-fly generation of new training data, which is… ▽ More

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: 16 pages, 7 figures

  7. arXiv:2406.04229  [pdf, other

    cs.LG cs.AI cs.CL cs.DS stat.ML

    The CLRS-Text Algorithmic Reasoning Language Benchmark

    Authors: Larisa Markeeva, Sean McLeish, Borja Ibarz, Wilfried Bounsi, Olga Kozlova, Alex Vitvitskyi, Charles Blundell, Tom Goldstein, Avi Schwarzschild, Petar Veličković

    Abstract: Eliciting reasoning capabilities from language models (LMs) is a critical direction on the path towards building intelligent systems. Most recent studies dedicated to reasoning focus on out-of-distribution performance on procedurally-generated synthetic benchmarks, bespoke-built to evaluate specific skills only. This trend makes results hard to transfer across publications, slowing down progress.… ▽ More

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: Preprint, under review. Comments welcome

  8. arXiv:2405.17399  [pdf, other

    cs.LG cs.AI

    Transformers Can Do Arithmetic with the Right Embeddings

    Authors: Sean McLeish, Arpit Bansal, Alex Stein, Neel Jain, John Kirchenbauer, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Jonas Gei**, Avi Schwarzschild, Tom Goldstein

    Abstract: The poor performance of transformers on arithmetic tasks seems to stem in large part from their inability to keep track of the exact position of each digit inside of a large span of digits. We mend this problem by adding an embedding to each digit that encodes its position relative to the start of the number. In addition to the boost these embeddings provide on their own, we show that this fix ena… ▽ More

    Submitted 27 May, 2024; originally announced May 2024.

  9. arXiv:2405.15973  [pdf, other

    cs.CV cs.AI cs.CL cs.LG

    Enhancing Visual-Language Modality Alignment in Large Vision Language Models via Self-Improvement

    Authors: Xiyao Wang, Jiuhai Chen, Zhaoyang Wang, Yuhang Zhou, Yiyang Zhou, Huaxiu Yao, Tianyi Zhou, Tom Goldstein, Parminder Bhatia, Furong Huang, Cao Xiao

    Abstract: Large vision-language models (LVLMs) have achieved impressive results in various visual question-answering and reasoning tasks through vision instruction tuning on specific datasets. However, there is still significant room for improvement in the alignment between visual and language modalities. Previous methods to enhance this alignment typically require external models or data, heavily depending… ▽ More

    Submitted 7 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 15 pages, 8 figures

  10. arXiv:2405.08813  [pdf, other

    cs.CV cs.LG cs.MM

    CinePile: A Long Video Question Answering Dataset and Benchmark

    Authors: Ruchit Rawal, Khalid Saifullah, Ronen Basri, David Jacobs, Gowthami Somepalli, Tom Goldstein

    Abstract: Current datasets for long-form video understanding often fall short of providing genuine long-form comprehension challenges, as many tasks derived from these datasets can be successfully tackled by analyzing just one or a few random frames from a video. To address this issue, we present a novel dataset and benchmark, CinePile, specifically designed for authentic long-form video understanding. This… ▽ More

    Submitted 14 June, 2024; v1 submitted 14 May, 2024; originally announced May 2024.

    Comments: Project page with all the artifacts - https://ruchitrawal.github.io/cinepile/. Updated version with results on Gemini Flash model and additional related work

  11. arXiv:2405.06331  [pdf, other

    cs.LG cs.CL

    LMD3: Language Model Data Density Dependence

    Authors: John Kirchenbauer, Garrett Honke, Gowthami Somepalli, Jonas Gei**, Daphne Ippolito, Katherine Lee, Tom Goldstein, David Andre

    Abstract: We develop a methodology for analyzing language model task performance at the individual example level based on training data density estimation. Experiments with paraphrasing as a controlled intervention on finetuning data demonstrate that increasing the support in the training distribution for specific test queries results in a measurable increase in density, which is also a significant predicto… ▽ More

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: 10 pages in the main body

  12. arXiv:2404.03441  [pdf, other

    cs.AI cs.CL cs.LG

    Benchmarking ChatGPT on Algorithmic Reasoning

    Authors: Sean McLeish, Avi Schwarzschild, Tom Goldstein

    Abstract: We evaluate ChatGPT's ability to solve algorithm problems from the CLRS benchmark suite that is designed for GNNs. The benchmark requires the use of a specified classical algorithm to solve a given problem. We find that ChatGPT outperforms specialist GNN models, using Python to successfully solve these problems. This raises new points in the discussion about learning algorithms with neural network… ▽ More

    Submitted 16 April, 2024; v1 submitted 4 April, 2024; originally announced April 2024.

  13. arXiv:2404.01292  [pdf, other

    cs.CV cs.LG

    Measuring Style Similarity in Diffusion Models

    Authors: Gowthami Somepalli, Anubhav Gupta, Kamal Gupta, Shramay Palta, Micah Goldblum, Jonas Gei**, Abhinav Shrivastava, Tom Goldstein

    Abstract: Generative models are now widely used by graphic designers and artists. Prior works have shown that these models remember and often replicate content from their training data during generation. Hence as their proliferation increases, it has become important to perform a database search to determine whether the properties of the image are attributable to specific training data, every time before a… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  14. arXiv:2404.01231  [pdf, other

    cs.CR cs.LG

    Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models

    Authors: Yuxin Wen, Leo Marchyok, Sanghyun Hong, Jonas Gei**, Tom Goldstein, Nicholas Carlini

    Abstract: It is commonplace to produce application-specific models by fine-tuning large pre-trained models using a small bespoke dataset. The widespread availability of foundation model checkpoints on the web poses considerable risks, including the vulnerability to backdoor attacks. In this paper, we unveil a new vulnerability: the privacy backdoor attack. This black-box privacy attack aims to amplify the p… ▽ More

    Submitted 1 April, 2024; originally announced April 2024.

  15. arXiv:2403.16365  [pdf, other

    cs.LG cs.CR cs.CV

    Generating Potent Poisons and Backdoors from Scratch with Guided Diffusion

    Authors: Hossein Souri, Arpit Bansal, Hamid Kazemi, Liam Fowl, Aniruddha Saha, Jonas Gei**, Andrew Gordon Wilson, Rama Chellappa, Tom Goldstein, Micah Goldblum

    Abstract: Modern neural networks are often trained on massive datasets that are web scraped with minimal human inspection. As a result of this insecure curation pipeline, an adversary can poison or backdoor the resulting model by uploading malicious data to the internet and waiting for a victim to scrape and train on it. Existing approaches for creating poisons and backdoors start with randomly sampled clea… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

  16. arXiv:2403.02580  [pdf, other

    cs.CV cs.LG

    What do we learn from inverting CLIP models?

    Authors: Hamid Kazemi, Atoosa Chegini, Jonas Gei**, Soheil Feizi, Tom Goldstein

    Abstract: We employ an inversion-based approach to examine CLIP models. Our examination reveals that inverting CLIP models results in the generation of images that exhibit semantic alignment with the specified target prompts. We leverage these inverted images to gain insights into various aspects of CLIP models, such as their ability to blend concepts and inclusion of gender biases. We notably observe insta… ▽ More

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: Warning: This paper contains sexually explicit images and language, offensive visuals and terminology, discussions on pornography, gender bias, and other potentially unsettling, distressing, and/or offensive content for certain readers

  17. arXiv:2402.14020  [pdf, other

    cs.LG cs.CL cs.CR

    Coercing LLMs to do and reveal (almost) anything

    Authors: Jonas Gei**, Alex Stein, Manli Shu, Khalid Saifullah, Yuxin Wen, Tom Goldstein

    Abstract: It has recently been shown that adversarial attacks on large language models (LLMs) can "jailbreak" the model into making harmful statements. In this work, we argue that the spectrum of adversarial attacks on LLMs is much larger than merely jailbreaking. We provide a broad overview of possible attack surfaces and attack goals. Based on a series of concrete examples, we discuss, categorize and syst… ▽ More

    Submitted 21 February, 2024; originally announced February 2024.

    Comments: 32 pages. Implementation available at https://github.com/JonasGei**/carving

  18. arXiv:2402.07319  [pdf, other

    cs.LG cs.AI cs.CL

    ODIN: Disentangled Reward Mitigates Hacking in RLHF

    Authors: Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro

    Abstract: In this work, we study the issue of reward hacking on the response length, a challenge emerging in Reinforcement Learning from Human Feedback (RLHF) on LLMs. A well-formatted, verbose but less helpful response from the LLMs can often deceive LLMs or even human evaluators to achieve high scores. The same issue also holds for some reward models in RL. To address the challenges in both training and e… ▽ More

    Submitted 11 February, 2024; originally announced February 2024.

  19. arXiv:2402.06659  [pdf, other

    cs.CR cs.AI cs.LG

    Shadowcast: Stealthy Data Poisoning Attacks Against Vision-Language Models

    Authors: Yuancheng Xu, Jiarui Yao, Manli Shu, Yanchao Sun, Zichu Wu, Ning Yu, Tom Goldstein, Furong Huang

    Abstract: Vision-Language Models (VLMs) excel in generating textual responses from visual inputs, yet their versatility raises significant security concerns. This study takes the first step in exposing VLMs' susceptibility to data poisoning attacks that can manipulate responses to innocuous, everyday prompts. We introduce Shadowcast, a stealthy data poisoning attack method where poison samples are visually… ▽ More

    Submitted 5 February, 2024; originally announced February 2024.

  20. arXiv:2401.12070  [pdf, other

    cs.CL cs.AI cs.LG

    Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

    Authors: Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova, Hamid Kazemi, Aniruddha Saha, Micah Goldblum, Jonas Gei**, Tom Goldstein

    Abstract: Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors. However, we find that a score based on contrasting two closely related language models is highly accurate at separating human-generated and machine-generated text. Based on this mechanism, we propose a novel LLM detector that only requires simple ca… ▽ More

    Submitted 1 July, 2024; v1 submitted 22 January, 2024; originally announced January 2024.

    Comments: 20 pages, code available at https://github.com/ahans30/Binoculars

  21. arXiv:2401.08573  [pdf, other

    cs.CV cs.CR cs.LG

    WAVES: Benchmarking the Robustness of Image Watermarks

    Authors: Bang An, Mucong Ding, Tahseen Rabbani, Aakriti Agrawal, Yuancheng Xu, Chenghao Deng, Sicheng Zhu, Abdirisak Mohamed, Yuxin Wen, Tom Goldstein, Furong Huang

    Abstract: In the burgeoning age of generative AI, watermarks act as identifiers of provenance and artificial content. We present WAVES (Watermark Analysis Via Enhanced Stress-testing), a benchmark for assessing image watermark robustness, overcoming the limitations of current evaluation methods. WAVES integrates detection and identification tasks and establishes a standardized evaluation protocol comprised… ▽ More

    Submitted 6 June, 2024; v1 submitted 16 January, 2024; originally announced January 2024.

    Comments: Accepted by ICML 2024

  22. arXiv:2312.16339  [pdf, other

    cs.CV cs.LG

    Universal Pyramid Adversarial Training for Improved ViT Performance

    Authors: **-yeh Chiang, Yipin Zhou, Omid Poursaeed, Satya Narayan Shukla, Ashish Shah, Tom Goldstein, Ser-Nam Lim

    Abstract: Recently, Pyramid Adversarial training (Herrmann et al., 2022) has been shown to be very effective for improving clean accuracy and distribution-shift robustness of vision transformers. However, due to the iterative nature of adversarial training, the technique is up to 7 times more expensive than standard training. To make the method more efficient, we propose Universal Pyramid Adversarial traini… ▽ More

    Submitted 26 December, 2023; originally announced December 2023.

  23. arXiv:2312.09323  [pdf, other

    cs.AI cs.LG

    Perspectives on the State and Future of Deep Learning - 2023

    Authors: Micah Goldblum, Anima Anandkumar, Richard Baraniuk, Tom Goldstein, Kyunghyun Cho, Zachary C Lipton, Melanie Mitchell, Preetum Nakkiran, Max Welling, Andrew Gordon Wilson

    Abstract: The goal of this series is to chronicle opinions and issues in the field of machine learning as they stand today and as they change over time. The plan is to host this survey periodically until the AI singularity paperclip-frenzy-driven doomsday, kee** an updated list of topical questions and interviewing new community members for each edition. In this issue, we probed people's opinions on inter… ▽ More

    Submitted 18 December, 2023; v1 submitted 7 December, 2023; originally announced December 2023.

  24. arXiv:2312.02142  [pdf, other

    cs.CV

    Object Recognition as Next Token Prediction

    Authors: Kaiyu Yue, Bor-Chun Chen, Jonas Gei**, Hengduo Li, Tom Goldstein, Ser-Nam Lim

    Abstract: We present an approach to pose object recognition as next token prediction. The idea is to apply a language decoder that auto-regressively predicts the text tokens from image embeddings to form labels. To ground this prediction process in auto-regression, we customize a non-causal attention mask for the decoder, incorporating two key features: modeling tokens from different labels to be independen… ▽ More

    Submitted 31 March, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: CVPR 2024

  25. arXiv:2311.05877  [pdf, other

    cs.LG cs.AI

    A Performance-Driven Benchmark for Feature Selection in Tabular Deep Learning

    Authors: Valeriia Cherepanova, Roman Levin, Gowthami Somepalli, Jonas Gei**, C. Bayan Bruss, Andrew Gordon Wilson, Tom Goldstein, Micah Goldblum

    Abstract: Academic tabular benchmarks often contain small sets of curated features. In contrast, data scientists typically collect as many features as possible into their datasets, and even engineer new features from existing ones. To prevent overfitting in subsequent downstream modeling, practitioners commonly use automated feature selection methods that identify a reduced subset of informative features. E… ▽ More

    Submitted 10 November, 2023; originally announced November 2023.

    Journal ref: Conference on Neural Information Processing Systems 2023

  26. arXiv:2311.03386  [pdf, other

    cs.CV cs.LG

    A Simple and Efficient Baseline for Data Attribution on Images

    Authors: Vasu Singla, Pedro Sandoval-Segura, Micah Goldblum, Jonas Gei**, Tom Goldstein

    Abstract: Data attribution methods play a crucial role in understanding machine learning models, providing insight into which training data points are most responsible for model outputs during deployment. However, current state-of-the-art approaches require a large ensemble of as many as 300,000 models to accurately attribute model predictions. These approaches therefore come at a high computational cost, a… ▽ More

    Submitted 3 November, 2023; originally announced November 2023.

    Comments: Code available at https://github.com/vasusingla/simple-data-attribution

  27. arXiv:2310.19909  [pdf, other

    cs.CV cs.LG

    Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks

    Authors: Micah Goldblum, Hossein Souri, Renkun Ni, Manli Shu, Viraj Prabhu, Gowthami Somepalli, Prithvijit Chattopadhyay, Mark Ibrahim, Adrien Bardes, Judy Hoffman, Rama Chellappa, Andrew Gordon Wilson, Tom Goldstein

    Abstract: Neural network based computer vision systems are typically built on a backbone, a pretrained or randomly initialized feature extractor. Several years ago, the default option was an ImageNet-trained convolutional neural network. However, the recent past has seen the emergence of countless backbones pretrained using various algorithms and datasets. While this abundance of choice has led to performan… ▽ More

    Submitted 19 November, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted to NeurIPS 2023

  28. arXiv:2310.05914  [pdf, other

    cs.CL cs.LG

    NEFTune: Noisy Embeddings Improve Instruction Finetuning

    Authors: Neel Jain, **-yeh Chiang, Yuxin Wen, John Kirchenbauer, Hong-Min Chu, Gowthami Somepalli, Brian R. Bartoldson, Bhavya Kailkhura, Avi Schwarzschild, Aniruddha Saha, Micah Goldblum, Jonas Gei**, Tom Goldstein

    Abstract: We show that language model finetuning can be improved, sometimes dramatically, with a simple augmentation. NEFTune adds noise to the embedding vectors during training. Standard finetuning of LLaMA-2-7B using Alpaca achieves 29.79% on AlpacaEval, which rises to 64.69% using noisy embeddings. NEFTune also improves over strong baselines on modern instruction datasets. Models trained with Evol-Instru… ▽ More

    Submitted 10 October, 2023; v1 submitted 9 October, 2023; originally announced October 2023.

    Comments: 25 pages, Code is available on Github: https://github.com/neelsjain/NEFTune

  29. arXiv:2309.00614  [pdf, other

    cs.LG cs.CL cs.CR

    Baseline Defenses for Adversarial Attacks Against Aligned Language Models

    Authors: Neel Jain, Avi Schwarzschild, Yuxin Wen, Gowthami Somepalli, John Kirchenbauer, **-yeh Chiang, Micah Goldblum, Aniruddha Saha, Jonas Gei**, Tom Goldstein

    Abstract: As Large Language Models quickly become ubiquitous, it becomes critical to understand their security vulnerabilities. Recent work shows that text optimizers can produce jailbreaking prompts that bypass moderation and alignment. Drawing from the rich body of work on adversarial machine learning, we approach these attacks with three questions: What threat models are practically useful in this domain… ▽ More

    Submitted 4 September, 2023; v1 submitted 1 September, 2023; originally announced September 2023.

    Comments: 12 pages

  30. arXiv:2307.00028  [pdf, other

    cs.CV cs.AI cs.CL cs.LG

    Seeing in Words: Learning to Classify through Language Bottlenecks

    Authors: Khalid Saifullah, Yuxin Wen, Jonas Gei**, Micah Goldblum, Tom Goldstein

    Abstract: Neural networks for computer vision extract uninterpretable features despite achieving high accuracy on benchmarks. In contrast, humans can explain their predictions using succinct and intuitive descriptions. To incorporate explainability into neural networks, we train a vision model whose feature representations are text. We show that such a model can effectively classify ImageNet images, and we… ▽ More

    Submitted 28 June, 2023; originally announced July 2023.

    Comments: 5 pages, 2 figures, Published as a Tiny Paper at ICLR 2023

  31. arXiv:2306.17194  [pdf, other

    cs.CR cs.CL cs.LG

    On the Exploitability of Instruction Tuning

    Authors: Manli Shu, Jiongxiao Wang, Chen Zhu, Jonas Gei**, Chaowei Xiao, Tom Goldstein

    Abstract: Instruction tuning is an effective technique to align large language models (LLMs) with human intents. In this work, we investigate how an adversary can exploit instruction tuning by injecting specific instruction-following examples into the training data that intentionally changes the model's behavior. For example, an adversary can achieve content injection by injecting training examples that men… ▽ More

    Submitted 28 October, 2023; v1 submitted 28 June, 2023; originally announced June 2023.

    Comments: NeurIPS 2023 camera-ready (21 pages, 10 figures)

  32. arXiv:2306.13651  [pdf, other

    cs.CL cs.LG

    Bring Your Own Data! Self-Supervised Evaluation for Large Language Models

    Authors: Neel Jain, Khalid Saifullah, Yuxin Wen, John Kirchenbauer, Manli Shu, Aniruddha Saha, Micah Goldblum, Jonas Gei**, Tom Goldstein

    Abstract: With the rise of Large Language Models (LLMs) and their ubiquitous deployment in diverse domains, measuring language model behavior on realistic data is imperative. For example, a company deploying a client-facing chatbot must ensure that the model will not respond to client requests with profanity. Current evaluations approach this problem using small, domain-specific datasets with human-curated… ▽ More

    Submitted 29 June, 2023; v1 submitted 23 June, 2023; originally announced June 2023.

    Comments: Code is available at https://github.com/neelsjain/BYOD. First two authors contributed equally. 21 pages, 22 figures

  33. arXiv:2306.04634  [pdf, other

    cs.LG cs.CL cs.CR

    On the Reliability of Watermarks for Large Language Models

    Authors: John Kirchenbauer, Jonas Gei**, Yuxin Wen, Manli Shu, Khalid Saifullah, Kezhi Kong, Kasun Fernando, Aniruddha Saha, Micah Goldblum, Tom Goldstein

    Abstract: As LLMs become commonplace, machine-generated text has the potential to flood the internet with spam, social media bots, and valueless content. Watermarking is a simple and effective strategy for mitigating such harms by enabling the detection and documentation of LLM-generated text. Yet a crucial question remains: How reliable is watermarking in realistic settings in the wild? There, watermarked… ▽ More

    Submitted 1 May, 2024; v1 submitted 7 June, 2023; originally announced June 2023.

    Comments: 9 pages in the main body. Published at ICLR 2024. Code is available at https://github.com/jwkirchenbauer/lm-watermarking

  34. arXiv:2306.03082  [pdf, other

    cs.AI

    InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models

    Authors: Lichang Chen, Jiuhai Chen, Tom Goldstein, Heng Huang, Tianyi Zhou

    Abstract: Large language models~(LLMs) are instruction followers, but it can be challenging to find the best instruction for different situations, especially for black-box LLMs on which backpropagation is forbidden. Instead of directly optimizing the discrete instruction, we optimize a low-dimensional soft prompt applied to an open-source LLM to generate the instruction for the black-box LLM. On each iterat… ▽ More

    Submitted 8 August, 2023; v1 submitted 5 June, 2023; originally announced June 2023.

    Comments: 15 pages; 9 figures; Our code is available at https://lichang-chen.github.io/InstructZero/

  35. arXiv:2305.20086  [pdf, other

    cs.LG cs.CR cs.CV

    Understanding and Mitigating Copying in Diffusion Models

    Authors: Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Gei**, Tom Goldstein

    Abstract: Images generated by diffusion models like Stable Diffusion are increasingly widespread. Recent works and even lawsuits have shown that these models are prone to replicating their training data, unbeknownst to the user. In this paper, we first analyze this memorization problem in text-to-image diffusion models. While it is widely believed that duplicated images in the training set are responsible f… ▽ More

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: 17 pages, preprint. Code is available at https://github.com/somepago/DCR

  36. arXiv:2305.20030  [pdf, other

    cs.LG cs.CR cs.CV

    Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust

    Authors: Yuxin Wen, John Kirchenbauer, Jonas Gei**, Tom Goldstein

    Abstract: Watermarking the outputs of generative models is a crucial technique for tracing copyright and preventing potential harm from AI-generated content. In this paper, we introduce a novel technique called Tree-Ring Watermarking that robustly fingerprints diffusion model outputs. Unlike existing methods that perform post-hoc modifications to images after sampling, Tree-Ring Watermarking subtly influenc… ▽ More

    Submitted 3 July, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: 16 pages, 8 figures, code is available at https://github.com/YuxinWenRick/tree-ring-watermark, fixed the repo link

  37. arXiv:2305.19254  [pdf, other

    cs.LG cs.CR

    What Can We Learn from Unlearnable Datasets?

    Authors: Pedro Sandoval-Segura, Vasu Singla, Jonas Gei**, Micah Goldblum, Tom Goldstein

    Abstract: In an era of widespread web scra**, unlearnable dataset methods have the potential to protect data privacy by preventing deep neural networks from generalizing. But in addition to a number of practical limitations that make their use unlikely, we make a number of findings that call into question their ability to safeguard data. First, it is widely believed that neural networks trained on unlearn… ▽ More

    Submitted 7 November, 2023; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: Accepted to NeurIPS 2023. Code available at https://github.com/psandovalsegura/learn-from-unlearnable

  38. arXiv:2304.12210  [pdf, other

    cs.LG cs.CV

    A Cookbook of Self-Supervised Learning

    Authors: Randall Balestriero, Mark Ibrahim, Vlad Sobal, Ari Morcos, Shashank Shekhar, Tom Goldstein, Florian Bordes, Adrien Bardes, Gregoire Mialon, Yuandong Tian, Avi Schwarzschild, Andrew Gordon Wilson, Jonas Gei**, Quentin Garrido, Pierre Fernandez, Amir Bar, Hamed Pirsiavash, Yann LeCun, Micah Goldblum

    Abstract: Self-supervised learning, dubbed the dark matter of intelligence, is a promising path to advance machine learning. Yet, much like cooking, training SSL methods is a delicate art with a high barrier to entry. While many components are familiar, successfully training a SSL method involves a dizzying set of choices from the pretext tasks to training hyper-parameters. Our goal is to lower the barrier… ▽ More

    Submitted 28 June, 2023; v1 submitted 24 April, 2023; originally announced April 2023.

  39. arXiv:2304.02234  [pdf, other

    cs.LG cs.CR cs.CV

    JPEG Compressed Images Can Bypass Protections Against AI Editing

    Authors: Pedro Sandoval-Segura, Jonas Gei**, Tom Goldstein

    Abstract: Recently developed text-to-image diffusion models make it easy to edit or create high-quality images. Their ease of use has raised concerns about the potential for malicious editing or deepfake creation. Imperceptible perturbations have been proposed as a means of protecting images from malicious editing by preventing diffusion models from generating realistic images. However, we find that the afo… ▽ More

    Submitted 7 April, 2023; v1 submitted 5 April, 2023; originally announced April 2023.

    Comments: 8 pages, 8 figures

  40. arXiv:2303.00116  [pdf, other

    cs.LG cs.CR cs.GT

    Neural Auctions Compromise Bidder Information

    Authors: Alex Stein, Avi Schwarzschild, Michael Curry, Tom Goldstein, John Dickerson

    Abstract: Single-shot auctions are commonly used as a means to sell goods, for example when selling ad space or allocating radio frequencies, however devising mechanisms for auctions with multiple bidders and multiple items can be complicated. It has been shown that neural networks can be used to approximate optimal mechanisms while satisfying the constraints that an auction be strategyproof and individuall… ▽ More

    Submitted 28 February, 2023; originally announced March 2023.

  41. arXiv:2302.07121  [pdf, other

    cs.CV cs.LG

    Universal Guidance for Diffusion Models

    Authors: Arpit Bansal, Hong-Min Chu, Avi Schwarzschild, Soumyadip Sengupta, Micah Goldblum, Jonas Gei**, Tom Goldstein

    Abstract: Typical diffusion models are trained to accept a particular form of conditioning, most commonly text, and cannot be conditioned on other modalities without retraining. In this work, we propose a universal guidance algorithm that enables diffusion models to be controlled by arbitrary guidance modalities without the need to retrain any use-specific components. We show that our algorithm successfully… ▽ More

    Submitted 14 February, 2023; originally announced February 2023.

  42. arXiv:2302.03668  [pdf, other

    cs.LG cs.CL

    Hard Prompts Made Easy: Gradient-Based Discrete Optimization for Prompt Tuning and Discovery

    Authors: Yuxin Wen, Neel Jain, John Kirchenbauer, Micah Goldblum, Jonas Gei**, Tom Goldstein

    Abstract: The strength of modern generative models lies in their ability to be controlled through text-based prompts. Typical "hard" prompts are made from interpretable words and tokens, and must be hand-crafted by humans. There are also "soft" prompts, which consist of continuous feature vectors. These can be discovered using powerful optimization methods, but they cannot be easily interpreted, re-used acr… ▽ More

    Submitted 1 June, 2023; v1 submitted 7 February, 2023; originally announced February 2023.

    Comments: 15 pages, 12 figures, Code is available at https://github.com/YuxinWenRick/hard-prompts-made-easy

  43. arXiv:2302.03015  [pdf, other

    cs.LG

    Exploring and Exploiting Decision Boundary Dynamics for Adversarial Robustness

    Authors: Yuancheng Xu, Yanchao Sun, Micah Goldblum, Tom Goldstein, Furong Huang

    Abstract: The robustness of a deep classifier can be characterized by its margins: the decision boundary's distances to natural data points. However, it is unclear whether existing robust training methods effectively increase the margin for each vulnerable point during training. To understand this, we propose a continuous-time framework for quantifying the relative speed of the decision boundary with respec… ▽ More

    Submitted 15 April, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

    Comments: Published at International Conference on Learning Representations (ICLR) 2023

  44. arXiv:2301.10226  [pdf, other

    cs.LG cs.CL cs.CR

    A Watermark for Large Language Models

    Authors: John Kirchenbauer, Jonas Gei**, Yuxin Wen, Jonathan Katz, Ian Miers, Tom Goldstein

    Abstract: Potential harms of large language models can be mitigated by watermarking model output, i.e., embedding signals into generated text that are invisible to humans but algorithmically detectable from a short span of tokens. We propose a watermarking framework for proprietary language models. The watermark can be embedded with negligible impact on text quality, and can be detected using an efficient o… ▽ More

    Submitted 1 May, 2024; v1 submitted 24 January, 2023; originally announced January 2023.

    Comments: 13 pages in the main body. Published at ICML 2023. Code is available at github.com/jwkirchenbauer/lm-watermarking

  45. arXiv:2301.02650  [pdf, other

    cs.CV

    Hierarchical Point Attention for Indoor 3D Object Detection

    Authors: Manli Shu, Le Xue, Ning Yu, Roberto Martín-Martín, Caiming Xiong, Tom Goldstein, Juan Carlos Niebles, Ran Xu

    Abstract: 3D object detection is an essential vision technique for various robotic systems, such as augmented reality and domestic robots. Transformers as versatile network architectures have recently seen great success in 3D point cloud object detection. However, the lack of hierarchy in a plain transformer restrains its ability to learn features at different scales. Such limitation makes transformer detec… ▽ More

    Submitted 8 May, 2024; v1 submitted 6 January, 2023; originally announced January 2023.

    Comments: ICRA 2024 camera-ready (7 pages, 5 figures)

  46. arXiv:2212.14034  [pdf, other

    cs.CL cs.LG

    Cramming: Training a Language Model on a Single GPU in One Day

    Authors: Jonas Gei**, Tom Goldstein

    Abstract: Recent trends in language modeling have focused on increasing performance through scaling, and have resulted in an environment where training language models is out of reach for most researchers and practitioners. While most in the community are asking how to push the limits of extreme computation, we ask the opposite question: How far can we get with a single GPU in just one day? We investigate… ▽ More

    Submitted 28 December, 2022; originally announced December 2022.

    Comments: 22 pages, we provide code at https://github.com/JonasGei**/cramming

  47. arXiv:2212.06727  [pdf, other

    cs.CV

    What do Vision Transformers Learn? A Visual Exploration

    Authors: Amin Ghiasi, Hamid Kazemi, Eitan Borgnia, Steven Reich, Manli Shu, Micah Goldblum, Andrew Gordon Wilson, Tom Goldstein

    Abstract: Vision transformers (ViTs) are quickly becoming the de-facto architecture for computer vision, yet we understand very little about why they work and what they learn. While existing studies visually analyze the mechanisms of convolutional neural networks, an analogous exploration of ViTs remains challenging. In this paper, we first address the obstacles to performing visualizations on ViTs. Assiste… ▽ More

    Submitted 13 December, 2022; originally announced December 2022.

  48. arXiv:2212.03860  [pdf, other

    cs.LG cs.CV cs.CY

    Diffusion Art or Digital Forgery? Investigating Data Replication in Diffusion Models

    Authors: Gowthami Somepalli, Vasu Singla, Micah Goldblum, Jonas Gei**, Tom Goldstein

    Abstract: Cutting-edge diffusion models produce images with high quality and customizability, enabling them to be used for commercial art and graphic design purposes. But do diffusion models create unique works of art, or are they replicating content directly from their training sets? In this work, we study image retrieval frameworks that enable us to compare generated images with training samples and detec… ▽ More

    Submitted 12 December, 2022; v1 submitted 7 December, 2022; originally announced December 2022.

    Comments: Updated draft with the following changes (1) Clarified the LAION Aesthetics versions everywhere (2) Correction on which LAION Aesthetics version SD - 1.4 is finetuned on and updated figure 12 based on this (3) A section on possible causes of replication

  49. arXiv:2211.15937  [pdf, other

    cs.CY cs.AI cs.CV cs.LG

    Robustness Disparities in Face Detection

    Authors: Samuel Dooley, George Z. Wei, Tom Goldstein, John P. Dickerson

    Abstract: Facial analysis systems have been deployed by large companies and critiqued by scholars and activists for the past decade. Many existing algorithmic audits examine the performance of these systems on later stage elements of facial analysis systems like facial recognition and age, emotion, or perceived gender prediction; however, a core component to these systems has been vastly understudied from a… ▽ More

    Submitted 29 November, 2022; originally announced November 2022.

    Comments: NeurIPS Datasets & Benchmarks Track 2022

  50. arXiv:2210.12864  [pdf, other

    cs.LG cs.CV

    K-SAM: Sharpness-Aware Minimization at the Speed of SGD

    Authors: Renkun Ni, **-yeh Chiang, Jonas Gei**, Micah Goldblum, Andrew Gordon Wilson, Tom Goldstein

    Abstract: Sharpness-Aware Minimization (SAM) has recently emerged as a robust technique for improving the accuracy of deep neural networks. However, SAM incurs a high computational cost in practice, requiring up to twice as much computation as vanilla SGD. The computational challenge posed by SAM arises because each iteration requires both ascent and descent steps and thus double the gradient computations.… ▽ More

    Submitted 23 October, 2022; originally announced October 2022.

    Comments: 13 pages, 2 figures