
Showing 1–14 of 14 results for author: Hayase, J

  1. arXiv:2407.02447  [pdf, other]

    cs.LG

    PLeaS -- Merging Models with Permutations and Least Squares

    Authors: Anshul Nasery, Jonathan Hayase, Pang Wei Koh, Sewoong Oh

    Abstract: The democratization of machine learning systems has made the process of fine-tuning accessible to a large number of practitioners, leading to a wide range of open-source models fine-tuned on specialized tasks and datasets. Recent work has proposed to merge such models to combine their functionalities. However, prior approaches are restricted to models that are fine-tuned from the same base model.…

    Submitted 2 July, 2024; originally announced July 2024.

  2. arXiv:2404.15409  [pdf, ps, other]

    cs.LG cs.CR stat.ML

    Insufficient Statistics Perturbation: Stable Estimators for Private Least Squares

    Authors: Gavin Brown, Jonathan Hayase, Samuel Hopkins, Weihao Kong, Xiyang Liu, Sewoong Oh, Juan C. Perdomo, Adam Smith

    Abstract: We present a sample- and time-efficient differentially private algorithm for ordinary least squares, with error that depends linearly on the dimension and is independent of the condition number of $X^\top X$, where $X$ is the design matrix. All prior private algorithms for this task require either $d^{3/2}$ examples, error growing polynomially with the condition number, or exponential time. Our ne…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: 42 pages, 3 figures

  3. arXiv:2403.06634  [pdf, other]

    cs.CR

    Stealing Part of a Production Language Model

    Authors: Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A. Feder Cooper, Katherine Lee, Matthew Jagielski, Milad Nasr, Arthur Conmy, Itay Yona, Eric Wallace, David Rolnick, Florian Tramèr

    Abstract: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI's ChatGPT or Google's PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access. For under…

    Submitted 9 July, 2024; v1 submitted 11 March, 2024; originally announced March 2024.

  4. arXiv:2402.12329  [pdf, other]

    cs.CL cs.AI cs.CR cs.LG

    Query-Based Adversarial Prompt Generation

    Authors: Jonathan Hayase, Ema Borevkovic, Nicholas Carlini, Florian Tramèr, Milad Nasr

    Abstract: Recent work has shown it is possible to construct adversarial examples that cause an aligned language model to emit harmful strings or perform harmful behavior. Existing attacks work either in the white-box setting (with full access to the model weights), or through transferability: the phenomenon that adversarial examples crafted on one model often remain effective on other models. We improve on…

    Submitted 19 February, 2024; originally announced February 2024.

  5. arXiv:2311.17035  [pdf, other]

    cs.LG cs.CL cs.CR

    Scalable Extraction of Training Data from (Production) Language Models

    Authors: Milad Nasr, Nicholas Carlini, Jonathan Hayase, Matthew Jagielski, A. Feder Cooper, Daphne Ippolito, Christopher A. Choquette-Choo, Eric Wallace, Florian Tramèr, Katherine Lee

    Abstract: This paper studies extractable memorization: training data that an adversary can efficiently extract by querying a machine learning model without prior knowledge of the training dataset. We show an adversary can extract gigabytes of training data from open-source language models like Pythia or GPT-Neo, semi-open models like LLaMA or Falcon, and closed models like ChatGPT. Existing techniques from…

    Submitted 28 November, 2023; originally announced November 2023.

  6. arXiv:2310.18933  [pdf, other]

    cs.LG cs.CR cs.CV

    Label Poisoning is All You Need

    Authors: Rishi D. Jha, Jonathan Hayase, Sewoong Oh

    Abstract: In a backdoor attack, an adversary injects corrupted data into a model's training dataset in order to gain control over its predictions on images with a specific attacker-defined trigger. A typical corrupted training example requires altering both the image, by applying the trigger, and the label. Models trained on clean images, therefore, were considered safe from backdoor attacks. However, in so…

    Submitted 29 October, 2023; originally announced October 2023.

  7. arXiv:2304.14108  [pdf, other]

    cs.CV cs.CL cs.LG

    DataComp: In search of the next generation of multimodal datasets

    Authors: Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei Koh, Olga Saukh, Alexander Ratner, Shuran Song, et al. (9 additional authors not shown)

    Abstract: Multimodal datasets are a critical component in recent breakthroughs such as Stable Diffusion and GPT-4, yet their design does not receive the same research attention as model architectures or training algorithms. To address this shortcoming in the ML ecosystem, we introduce DataComp, a testbed for dataset experiments centered around a new candidate pool of 12.8 billion image-text pairs from Commo…

    Submitted 20 October, 2023; v1 submitted 27 April, 2023; originally announced April 2023.

    Comments: NeurIPS 2023 Datasets and Benchmarks Track

  8. arXiv:2210.08069  [pdf, ps, other]

    cs.LG stat.ML

    Zonotope Domains for Lagrangian Neural Network Verification

    Authors: Matt Jordan, Jonathan Hayase, Alexandros G. Dimakis, Sewoong Oh

    Abstract: Neural network verification aims to provide provable bounds for the output of a neural network for a given input range. Notable prior works in this domain have either generated bounds using abstract domains, which preserve some dependency between intermediate neurons in the network; or framed verification as an optimization problem and solved a relaxation using Lagrangian methods. A key drawback o…

    Submitted 14 October, 2022; originally announced October 2022.

    Comments: Accepted into NeurIPS 2022. Code: https://github.com/revbucket/dual-verification

  9. arXiv:2210.05929  [pdf, other]

    cs.LG cs.CR

    Few-shot Backdoor Attacks via Neural Tangent Kernels

    Authors: Jonathan Hayase, Sewoong Oh

    Abstract: In a backdoor attack, an attacker injects corrupted examples into the training set. The goal of the attacker is to cause the final trained model to predict the attacker's desired target label when a predefined trigger is added to test inputs. Central to these attacks is the trade-off between the success rate of the attack and the number of corrupted training examples injected. We pose this attack…

    Submitted 12 October, 2022; originally announced October 2022.

    Comments: 20 pages, 13 figures

  10. arXiv:2209.04836  [pdf, other]

    cs.LG cs.AI

    Git Re-Basin: Merging Models modulo Permutation Symmetries

    Authors: Samuel K. Ainsworth, Jonathan Hayase, Siddhartha Srinivasa

    Abstract: The success of deep learning is due in large part to our ability to solve certain massive non-convex optimization problems with relative ease. Though non-convex optimization is NP-hard, simple algorithms -- often variants of stochastic gradient descent -- exhibit surprising effectiveness in fitting large neural networks in practice. We argue that neural network loss landscapes often contain (nearl…

    Submitted 1 March, 2023; v1 submitted 11 September, 2022; originally announced September 2022.

  11. arXiv:2205.11736  [pdf, other]

    cs.LG cs.AI cs.CR

    Towards a Defense Against Federated Backdoor Attacks Under Continuous Training

    Authors: Shuaiqi Wang, Jonathan Hayase, Giulia Fanti, Sewoong Oh

    Abstract: Backdoor attacks are dangerous and difficult to prevent in federated learning (FL), where training data is sourced from untrusted clients over long periods of time. These difficulties arise because: (a) defenders in FL do not have access to raw training data, and (b) a new phenomenon we identify called backdoor leakage causes models trained continuously to eventually suffer from backdoors due to c…

    Submitted 30 January, 2023; v1 submitted 23 May, 2022; originally announced May 2022.

  12. arXiv:2104.11315  [pdf, other]

    cs.LG cs.AI stat.ML

    SPECTRE: Defending Against Backdoor Attacks Using Robust Statistics

    Authors: Jonathan Hayase, Weihao Kong, Raghav Somani, Sewoong Oh

    Abstract: Modern machine learning increasingly requires training on a large collection of data from multiple sources, not all of which can be trusted. A particularly concerning scenario is when a small fraction of poisoned data changes the behavior of the trained model when triggered by an attacker-specified watermark. Such a compromised model will be deployed unnoticed as the model is accurate otherwise. T…

    Submitted 22 April, 2021; originally announced April 2021.

    Comments: 29 pages, 19 figures

  13. arXiv:1907.06010  [pdf, ps, other]

    cs.LG stat.ML

    The Futility of Bias-Free Learning and Search

    Authors: George D. Montanez, Jonathan Hayase, Julius Lauw, Dominique Macias, Akshay Trikha, Julia Vendemiatti

    Abstract: Building on the view of machine learning as search, we demonstrate the necessity of bias in learning, quantifying the role of bias (measured relative to a collection of possible datasets, or more generally, information resources) in increasing the probability of success. For a given degree of bias towards a fixed target, we show that the proportion of favorable information resources is strictly bo…

    Submitted 13 July, 2019; originally announced July 2019.

  14. Theory of multiwave mixing and decoherence control in qubit array system

    Authors: M. Sasaki, A. Hasegawa, J. I. Hayase, Y. Mitsumori, F. Minami

    Abstract: We develop a theory to analyze the decoherence effect in a charged qubit array system with photon echo signals in the multiwave mixing configuration. We present how the decoherence suppression effect by the bang-bang control with the $π$ pulses can be demonstrated in laboratory by using a bulk ensemble of exciton qubits and optical pulses whose pulse area is even smaller than $π$. Analysis…

    Submitted 2 November, 2004; originally announced November 2004.

    Comments: 19 pages, 11 figures