Showing 1–7 of 7 results for author: Shilov, I

  1. arXiv:2405.15523  [pdf, other]

    cs.CL cs.LG

    Mosaic Memory: Fuzzy Duplication in Copyright Traps for Large Language Models

    Authors: Igor Shilov, Matthieu Meeus, Yves-Alexandre de Montjoye

    Abstract: The immense datasets used to develop Large Language Models (LLMs) often include copyright-protected content, typically without the content creator's consent. Copyright traps have been proposed to be injected into the original content, improving content detectability in newly released LLMs. Traps, however, rely on the exact duplication of a unique text sequence, leaving them vulnerable to commonly…

    Submitted 24 May, 2024; originally announced May 2024.

  2. arXiv:2402.09363  [pdf, other]

    cs.CL cs.CR

    Copyright Traps for Large Language Models

    Authors: Matthieu Meeus, Igor Shilov, Manuel Faysse, Yves-Alexandre de Montjoye

    Abstract: Questions of fair use of copyright-protected content to train Large Language Models (LLMs) are being actively debated. Document-level inference has been proposed as a new task: inferring from black-box access to the trained model whether a piece of content has been seen during training. SOTA methods however rely on naturally occurring memorization of (part of) the content. While very effective aga…

    Submitted 4 June, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: 41st International Conference on Machine Learning (ICML 2024)

  3. arXiv:2202.07623  [pdf, other]

    cs.LG cs.AI cs.CR stat.ML

    Defending against Reconstruction Attacks with Rényi Differential Privacy

    Authors: Pierre Stock, Igor Shilov, Ilya Mironov, Alexandre Sablayrolles

    Abstract: Reconstruction attacks allow an adversary to regenerate data samples of the training set using access to only a trained model. It has been recently shown that simple heuristics can reconstruct data samples from language models, making this threat scenario an important aspect of model release. Differential privacy is a known solution to such attacks, but is often used with a relatively large privac…

    Submitted 15 February, 2022; originally announced February 2022.

  4. arXiv:2109.12298  [pdf, other]

    cs.LG cs.CR

    Opacus: User-Friendly Differential Privacy Library in PyTorch

    Authors: Ashkan Yousefpour, Igor Shilov, Alexandre Sablayrolles, Davide Testuggine, Karthik Prasad, Mani Malek, John Nguyen, Sayan Ghosh, Akash Bharadwaj, Jessica Zhao, Graham Cormode, Ilya Mironov

    Abstract: We introduce Opacus, a free, open-source PyTorch library for training deep learning models with differential privacy (hosted at opacus.ai). Opacus is designed for simplicity, flexibility, and speed. It provides a simple and user-friendly API, and enables machine learning practitioners to make a training pipeline private by adding as little as two lines to their code. It supports a wide variety of…

    Submitted 22 August, 2022; v1 submitted 25 September, 2021; originally announced September 2021.

    Comments: Privacy in Machine Learning (PriML) workshop, NeurIPS 2021

  5. arXiv:2106.03408  [pdf, other]

    cs.LG cs.CR

    Antipodes of Label Differential Privacy: PATE and ALIBI

    Authors: Mani Malek, Ilya Mironov, Karthik Prasad, Igor Shilov, Florian Tramèr

    Abstract: We consider the privacy-preserving machine learning (ML) setting where the trained model must satisfy differential privacy (DP) with respect to the labels of the training examples. We propose two novel approaches based on, respectively, the Laplace mechanism and the PATE framework, and demonstrate their effectiveness on standard benchmarks. While recent work by Ghazi et al. proposed Label DP sch…

    Submitted 29 October, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

    Comments: 2021 Conference on Neural Information Processing Systems (NeurIPS)

  6. arXiv:2101.06922  [pdf, other]

    cs.GT math.OC

    Privacy Impact on Generalized Nash Equilibrium in Peer-to-Peer Electricity Market

    Authors: Ilia Shilov, Hélène Le Cadre, Ana Bušić

    Abstract: We consider a peer-to-peer electricity market, where agents hold private information that they might not want to share. The problem is modeled as a noncooperative communication game, which takes the form of a Generalized Nash Equilibrium Problem, where the agents determine their randomized reports to share with the other market players, while anticipating the form of the peer-to-peer market equili…

    Submitted 18 January, 2021; originally announced January 2021.

  7. arXiv:2004.02470  [pdf, other]

    cs.GT math.OC

    Risk-Averse Equilibrium Analysis and Computation

    Authors: Ilia Shilov, Hélène Le Cadre, Ana Busic

    Abstract: We consider two market designs for a network of prosumers, trading energy: (i) a centralized design which acts as a benchmark, and (ii) a peer-to-peer market design. High renewable energy penetration requires that the energy market design properly handles uncertainty. To that purpose, we consider risk neutral models for market designs (i), (ii), and their risk-averse interpretations in which prosu…

    Submitted 6 April, 2020; originally announced April 2020.