Showing 1–11 of 11 results for author: Edelman, B L

  1. arXiv:2406.11741  [pdf, other]

    cs.LG cs.AI

    Transcendence: Generative Models Can Outperform The Experts That Train Them

    Authors: Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham M. Kakade, Eran Malach

    Abstract: Generative models are trained with the simple objective of imitating the conditional probability distribution induced by the data they are trained on. Therefore, when trained on data generated by humans, we may not expect the artificial model to outperform the humans on their original objectives. In this work, we study the phenomenon of transcendence: when a generative model achieves capabilities…

    Submitted 28 June, 2024; v1 submitted 17 June, 2024; originally announced June 2024.

    Comments: Code, models, and data at https://transcendence.eddie.win

  2. arXiv:2404.09932  [pdf, other]

    cs.LG cs.AI cs.CL cs.CY

    Foundational Challenges in Assuring Alignment and Safety of Large Language Models

    Authors: Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, et al. (13 additional authors not shown)

    Abstract: This work identifies 18 foundational challenges in assuring the alignment and safety of large language models (LLMs). These challenges are organized into three different categories: scientific understanding of LLMs, development and deployment methods, and sociotechnical challenges. Based on the identified challenges, we pose $200+$ concrete research questions.

    Submitted 15 April, 2024; originally announced April 2024.

  3. arXiv:2402.11004  [pdf, other]

    cs.LG

    The Evolution of Statistical Induction Heads: In-Context Learning Markov Chains

    Authors: Benjamin L. Edelman, Ezra Edelman, Surbhi Goel, Eran Malach, Nikolaos Tsilivis

    Abstract: Large language models have the ability to generate text that mimics patterns in their inputs. We introduce a simple Markov Chain sequence modeling task in order to study how this in-context learning (ICL) capability emerges. In our setting, each example is sampled from a Markov chain drawn from a prior distribution over Markov chains. Transformers trained on this task form \emph{statistical induct…

    Submitted 16 February, 2024; originally announced February 2024.

  4. arXiv:2402.03563  [pdf, other]

    cs.LG cs.AI cs.CL

    Distinguishing the Knowable from the Unknowable with Language Models

    Authors: Gustaf Ahdritz, Tian Qin, Nikhil Vyas, Boaz Barak, Benjamin L. Edelman

    Abstract: We study the feasibility of identifying epistemic uncertainty (reflecting a lack of knowledge), as opposed to aleatoric uncertainty (reflecting entropy in the underlying distribution), in the outputs of large language models (LLMs) over free-form text. In the absence of ground-truth probabilities, we explore a setting where, in order to (approximately) disentangle a given LLM's uncertainty, a sign…

    Submitted 27 February, 2024; v1 submitted 5 February, 2024; originally announced February 2024.

  5. arXiv:2311.07568  [pdf, other]

    cs.LG

    Feature emergence via margin maximization: case studies in algebraic tasks

    Authors: Depen Morwani, Benjamin L. Edelman, Costin-Andrei Oncescu, Rosie Zhao, Sham Kakade

    Abstract: Understanding the internal representations learned by neural networks is a cornerstone challenge in the science of machine learning. While there have been significant recent strides in some cases towards understanding how neural networks implement specific target functions, this paper explores a complementary question -- why do networks arrive at particular computational strategies? Our inquiry fo…

    Submitted 19 February, 2024; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: Accepted as Spotlight at ICLR 2024

    ACM Class: I.5.1; I.2.6

  6. arXiv:2311.04378  [pdf, other]

    cs.LG cs.CL cs.CR

    Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models

    Authors: Hanlin Zhang, Benjamin L. Edelman, Danilo Francati, Daniele Venturi, Giuseppe Ateniese, Boaz Barak

    Abstract: Watermarking generative models consists of planting a statistical signal (watermark) in a model's output so that it can be later verified that the output was generated by the given model. A strong watermarking scheme satisfies the property that a computationally bounded attacker cannot erase the watermark without causing significant quality degradation. In this paper, we study the (im)possibility…

    Submitted 14 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

    Comments: Blog post: https://www.harvard.edu/kempner-institute/2023/11/09/watermarking-in-the-sand/

  7. arXiv:2309.03800  [pdf, other]

    cs.LG cs.AI stat.ML

    Pareto Frontiers in Neural Feature Learning: Data, Compute, Width, and Luck

    Authors: Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

    Abstract: In modern deep learning, algorithmic choices (such as width, depth, and learning rate) are known to modulate nuanced resource tradeoffs. This work investigates how these complexities necessarily arise for feature learning in the presence of computational-statistical gaps. We begin by considering offline sparse parity learning, a supervised classification problem which admits a statistical query lo…

    Submitted 30 October, 2023; v1 submitted 7 September, 2023; originally announced September 2023.

    Comments: v2: NeurIPS 2023 camera-ready updates

  8. arXiv:2207.08799  [pdf, other]

    cs.LG cs.NE math.OC stat.ML

    Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit

    Authors: Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang

    Abstract: There is mounting evidence of emergent phenomena in the capabilities of deep learning methods as we scale up datasets, model sizes, and training times. While there are some accounts of how these resources modulate statistical capacity, far less is known about their effect on the computational problem of model training. This work conducts such an exploration through the lens of learning a $k$-spars…

    Submitted 15 January, 2023; v1 submitted 18 July, 2022; originally announced July 2022.

    Comments: v3: final camera-ready revisions for NeurIPS 2022

  9. arXiv:2110.10090  [pdf, other]

    cs.LG stat.ML

    Inductive Biases and Variable Creation in Self-Attention Mechanisms

    Authors: Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Cyril Zhang

    Abstract: Self-attention, an architectural motif designed to model long-range interactions in sequential data, has driven numerous recent breakthroughs in natural language processing and beyond. This work provides a theoretical analysis of the inductive biases of self-attention modules. Our focus is to rigorously establish which functions and long-range dependencies self-attention blocks prefer to represent…

    Submitted 23 June, 2022; v1 submitted 19 October, 2021; originally announced October 2021.

    Comments: v2: camera-ready revisions for ICML 2022

  10. arXiv:2002.05240  [pdf, other]

    cs.GT econ.TH

    The Multiplayer Colonel Blotto Game

    Authors: Enric Boix-Adserà, Benjamin L. Edelman, Siddhartha Jayanti

    Abstract: We initiate the study of the natural multiplayer generalization of the classic continuous Colonel Blotto game. The two-player Blotto game, introduced by Borel as a model of resource competition across $n$ simultaneous fronts, has been studied extensively for a century and seen numerous applications throughout the social sciences. Our work defines the multiplayer Colonel Blotto game and derives Nas…

    Submitted 21 May, 2021; v1 submitted 12 February, 2020; originally announced February 2020.

    Comments: 24 pages; minor additions to introduction

  11. arXiv:1905.11604  [pdf, other]

    cs.LG cs.NE stat.ML

    SGD on Neural Networks Learns Functions of Increasing Complexity

    Authors: Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak

    Abstract: We perform an experimental study of the dynamics of Stochastic Gradient Descent (SGD) in learning deep neural networks for several real and synthetic classification tasks. We show that in the initial epochs, almost all of the performance improvement of the classifier obtained by SGD can be explained by a linear classifier. More generally, we give evidence for the hypothesis that, as iterations pro…

    Submitted 28 May, 2019; originally announced May 2019.

    Comments: Submitted to NeurIPS 2019