Showing 1–50 of 60 results for author: Gidel, G

  1. arXiv:2405.18540  [pdf, other]

    cs.CL cs.CR cs.LG

    Learning diverse attacks on large language models for robust red-teaming and safety tuning

    Authors: Seanie Lee, Minsu Kim, Lynn Cherif, David Dobre, Juho Lee, Sung Ju Hwang, Kenji Kawaguchi, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Moksh Jain

    Abstract: Red-teaming, or identifying prompts that elicit harmful responses, is a critical step in ensuring the safe and responsible deployment of large language models (LLMs). Developing effective protection against many modes of attack prompts requires discovering diverse attacks. Automated red-teaming typically uses reinforcement learning to fine-tune an attacker language model to generate prompts that e…

    Submitted 28 May, 2024; originally announced May 2024.

  2. arXiv:2405.15589  [pdf, other]

    cs.LG cs.CR

    Efficient Adversarial Training in LLMs with Continuous Attacks

    Authors: Sophie Xhonneux, Alessandro Sordoni, Stephan Günnemann, Gauthier Gidel, Leo Schwinn

    Abstract: Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails. In many domains, adversarial training has proven to be one of the most promising methods to reliably improve robustness against such attacks. Yet, in the context of LLMs, current methods for adversarial training are hindered by the high computational costs required to perform discrete advers…

    Submitted 21 June, 2024; v1 submitted 24 May, 2024; originally announced May 2024.

    Comments: 19 pages, 4 figures

  3. arXiv:2402.09063  [pdf, other]

    cs.LG

    Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space

    Authors: Leo Schwinn, David Dobre, Sophie Xhonneux, Gauthier Gidel, Stephan Günnemann

    Abstract: Current research in adversarial robustness of LLMs focuses on discrete input manipulations in the natural language space, which can be directly transferred to closed-source models. However, this approach neglects the steady progression of open-source models. As open-source models advance in capability, ensuring their safety also becomes increasingly imperative. Yet, attacks tailored to open-source…

    Submitted 14 February, 2024; originally announced February 2024.

    Comments: Trigger Warning: the appendix contains LLM-generated text with violence and harassment

  4. arXiv:2402.06121  [pdf, other]

    cs.LG stat.ML

    Iterated Denoising Energy Matching for Sampling from Boltzmann Densities

    Authors: Tara Akhound-Sadegh, Jarrid Rector-Brooks, Avishek Joey Bose, Sarthak Mittal, Pablo Lemos, Cheng-Hao Liu, Marcin Sendera, Siamak Ravanbakhsh, Gauthier Gidel, Yoshua Bengio, Nikolay Malkin, Alexander Tong

    Abstract: Efficiently generating statistically independent samples from an unnormalized probability distribution, such as equilibrium samples of many-body systems, is a foundational problem in science. In this paper, we propose Iterated Denoising Energy Matching (iDEM), an iterative algorithm that uses a novel stochastic score matching objective leveraging solely the energy function and its gradient -- and…

    Submitted 26 June, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: Published at ICML 2024. Code for iDEM is available at https://github.com/jarridrb/dem

  5. arXiv:2402.05723  [pdf, other]

    cs.LG cs.CR

    In-Context Learning Can Re-learn Forbidden Tasks

    Authors: Sophie Xhonneux, David Dobre, Jian Tang, Gauthier Gidel, Dhanya Sridhar

    Abstract: Despite significant investment into safety training, large language models (LLMs) deployed in the real world still suffer from numerous vulnerabilities. One perspective on LLM safety training is that it algorithmically forbids the model from answering toxic or harmful queries. To assess the effectiveness of safety training, in this work, we study forbidden tasks, i.e., tasks the model is designed…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 19 pages, 7 figures

  6. arXiv:2312.08484  [pdf, other]

    cs.GT

    Q-learners Can Provably Collude in the Iterated Prisoner's Dilemma

    Authors: Quentin Bertrand, Juan Duque, Emilio Calvano, Gauthier Gidel

    Abstract: The deployment of machine learning systems in the market economy has triggered academic and institutional fears over potential tacit collusion between fully automated agents. Multiple recent economics studies have empirically shown the emergence of collusive strategies from agents guided by machine learning algorithms. In this work, we prove that multi-agent Q-learners playing the iterated prisone…

    Submitted 13 December, 2023; originally announced December 2023.

  7. arXiv:2310.19737  [pdf, other]

    cs.AI

    Adversarial Attacks and Defenses in Large Language Models: Old and New Threats

    Authors: Leo Schwinn, David Dobre, Stephan Günnemann, Gauthier Gidel

    Abstract: Over the past decade, there has been extensive research aimed at enhancing the robustness of neural networks, yet this problem remains vastly unsolved. Here, one major impediment has been the overestimation of the robustness of new defense approaches due to faulty defense evaluations. Flawed robustness evaluations necessitate rectifications in subsequent works, dangerously slowing down the researc…

    Submitted 30 October, 2023; originally announced October 2023.

  8. arXiv:2310.19103  [pdf, other]

    cs.LG

    Proving Linear Mode Connectivity of Neural Networks via Optimal Transport

    Authors: Damien Ferbach, Baptiste Goujaud, Gauthier Gidel, Aymeric Dieuleveut

    Abstract: The energy landscape of high-dimensional non-convex optimization problems is crucial to understanding the effectiveness of modern deep neural network architectures. Recent works have experimentally shown that two different solutions found after two runs of a stochastic training are often connected by very simple continuous paths (e.g., linear) modulo a permutation of the weights. In this paper, we…

    Submitted 1 March, 2024; v1 submitted 29 October, 2023; originally announced October 2023.

    Comments: Accepted as a conference paper at AISTATS 2024

  9. arXiv:2310.12065  [pdf, other]

    cs.GT

    A Persuasive Approach to Combating Misinformation

    Authors: Safwan Hossain, Andjela Mladenovic, Yiling Chen, Gauthier Gidel

    Abstract: Bayesian Persuasion is proposed as a tool for social media platforms to combat the spread of misinformation. Since platforms can use machine learning to predict the popularity and misinformation features of to-be-shared posts, and users are largely motivated to share popular content, platforms can strategically signal this informational advantage to change user beliefs and persuade them not to sha…

    Submitted 13 February, 2024; v1 submitted 18 October, 2023; originally announced October 2023.

  10. arXiv:2310.02779  [pdf, other]

    cs.LG cs.GT

    Expected flow networks in stochastic environments and two-player zero-sum games

    Authors: Marco Jiralerspong, Bilun Sun, Danilo Vucetic, Tianyu Zhang, Yoshua Bengio, Gauthier Gidel, Nikolay Malkin

    Abstract: Generative flow networks (GFlowNets) are sequential sampling models trained to match a given distribution. GFlowNets have been successfully applied to various structured object generation tasks, sampling a diverse set of high-reward objects quickly. We propose expected flow networks (EFlowNets), which extend GFlowNets to stochastic environments. We show that EFlowNets outperform other GFlowNet for…

    Submitted 13 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: ICLR 2024; code: https://github.com/GFNOrg/AdversarialFlowNetworks

  11. arXiv:2310.01860  [pdf, ps, other]

    math.OC cs.LG

    High-Probability Convergence for Composite and Distributed Stochastic Minimization and Variational Inequalities with Heavy-Tailed Noise

    Authors: Eduard Gorbunov, Abdurakhmon Sadiev, Marina Danilova, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik

    Abstract: High-probability analysis of stochastic first-order optimization methods under mild assumptions on the noise has been gaining a lot of attention in recent years. Typically, gradient clipping is one of the key algorithmic ingredients to derive good high-probability guarantees when the noise is heavy-tailed. However, if implemented naïvely, clipping can spoil the convergence of the popular methods f…

    Submitted 3 October, 2023; originally announced October 2023.

    Comments: 143 pages

  12. arXiv:2310.00429  [pdf, other]

    cs.LG stat.ML

    On the Stability of Iterative Retraining of Generative Models on their own Data

    Authors: Quentin Bertrand, Avishek Joey Bose, Alexandre Duplessis, Marco Jiralerspong, Gauthier Gidel

    Abstract: Deep generative models have made tremendous progress in modeling complex data, often exhibiting generation quality that surpasses a typical human's ability to discern the authenticity of samples. Undeniably, a key driver of this success is enabled by the massive amounts of web-scale data consumed by these models. Due to these models' striking performance and ease of availability, the web will inev…

    Submitted 2 April, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

  13. arXiv:2308.05260  [pdf, other]

    cs.AI cs.CY

    AI4GCC -- Track 3: Consumption and the Challenges of Multi-Agent RL

    Authors: Marco Jiralerspong, Gauthier Gidel

    Abstract: The AI4GCC competition presents a bold step forward in the direction of integrating machine learning with traditional economic policy analysis. Below, we highlight two potential areas for improvement that could enhance the competition's ability to identify and evaluate proposed negotiation protocols. Firstly, we suggest the inclusion of an additional index that accounts for consumption/utility as…

    Submitted 9 August, 2023; originally announced August 2023.

    Comments: Presented at AI For Global Climate Cooperation Competition, 2023 (arXiv:cs/2307.06951)

    Report number: AI4GCC/2023/track3/4

  14. arXiv:2306.07905  [pdf, other]

    cs.LG math.OC stat.ML

    Omega: Optimistic EMA Gradients

    Authors: Juan Ramirez, Rohan Sukumaran, Quentin Bertrand, Gauthier Gidel

    Abstract: Stochastic min-max optimization has gained interest in the machine learning community with the advancements in GANs and adversarial training. Although game optimization is fairly well understood in the deterministic setting, some issues persist in the stochastic regime. Recent work has shown that stochastic gradient descent-ascent methods such as the optimistic gradient are highly sensitive to noi…

    Submitted 25 March, 2024; v1 submitted 13 June, 2023; originally announced June 2023.

    Comments: Oral at the LatinX in AI workshop @ ICML 2023

  15. arXiv:2305.19394  [pdf, other]

    q-bio.NC cs.LG cs.NE

    Synaptic Weight Distributions Depend on the Geometry of Plasticity

    Authors: Roman Pogodin, Jonathan Cornford, Arna Ghosh, Gauthier Gidel, Guillaume Lajoie, Blake Richards

    Abstract: A growing literature in computational neuroscience leverages gradient descent and learning algorithms that approximate it to study synaptic plasticity in the brain. However, the vast majority of this work ignores a critical underlying assumption: the choice of distance for synaptic changes - i.e. the geometry of synaptic plasticity. Gradient descent assumes that the distance is Euclidean, but many…

    Submitted 4 March, 2024; v1 submitted 30 May, 2023; originally announced May 2023.

    Comments: ICLR 2024

    Journal ref: The Twelfth International Conference on Learning Representations, 2024

  16. arXiv:2305.10388  [pdf, other]

    cs.LG cs.CR cs.CV

    Raising the Bar for Certified Adversarial Robustness with Diffusion Models

    Authors: Thomas Altstidl, David Dobre, Björn Eskofier, Gauthier Gidel, Leo Schwinn

    Abstract: Certified defenses against adversarial attacks offer formal guarantees on the robustness of a model, making them more reliable than empirical methods such as adversarial training, whose effectiveness is often later reduced by unseen attacks. Still, the limited certified robustness that is currently achievable has been a bottleneck for their practical adoption. Gowal et al. and Wang et al. have sho…

    Submitted 17 May, 2023; originally announced May 2023.

  17. arXiv:2304.11737  [pdf, other]

    math.OC cs.LG stat.ML

    Sarah Frank-Wolfe: Methods for Constrained Optimization with Best Rates and Practical Features

    Authors: Aleksandr Beznosikov, David Dobre, Gauthier Gidel

    Abstract: The Frank-Wolfe (FW) method is a popular approach for solving optimization problems with structured constraints that arise in machine learning applications. In recent years, stochastic versions of FW have gained popularity, motivated by large datasets for which the computation of the full gradient is prohibitively expensive. In this paper, we present two new variants of the FW algorithms for stoch…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: 28 pages, 2 algorithms, 2 figures, 2 tables

  18. arXiv:2304.06879  [pdf, other]

    cs.LG cs.GT

    Performative Prediction with Neural Networks

    Authors: Mehrnaz Mofakhami, Ioannis Mitliagkas, Gauthier Gidel

    Abstract: Performative prediction is a framework for learning models that influence the data they intend to predict. We focus on finding classifiers that are performatively stable, i.e. optimal for the data distribution they induce. Standard convergence results for finding a performatively stable classifier with the method of repeated risk minimization assume that the data distribution is Lipschitz continuo…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: Published at AISTATS 2023

  19. arXiv:2302.04440  [pdf, other]

    cs.LG cs.CV

    Feature Likelihood Divergence: Evaluating the Generalization of Generative Models Using Samples

    Authors: Marco Jiralerspong, Avishek Joey Bose, Ian Gemp, Chongli Qin, Yoram Bachrach, Gauthier Gidel

    Abstract: The past few years have seen impressive progress in the development of deep generative models capable of producing high-dimensional, complex, and photo-realistic data. However, current methods for evaluating such models remain incomplete: standard likelihood-based metrics do not always apply and rarely correlate with perceptual fidelity, while sample-based metrics, such as FID, are insensitive to…

    Submitted 12 March, 2024; v1 submitted 8 February, 2023; originally announced February 2023.

    Comments: FLD code: https://github.com/marcojira/fld

  20. arXiv:2302.00999  [pdf, ps, other]

    math.OC cs.LG

    High-Probability Bounds for Stochastic Optimization and Variational Inequalities: the Case of Unbounded Variance

    Authors: Abdurakhmon Sadiev, Marina Danilova, Eduard Gorbunov, Samuel Horváth, Gauthier Gidel, Pavel Dvurechensky, Alexander Gasnikov, Peter Richtárik

    Abstract: During recent years the interest of optimization and machine learning communities in high-probability convergence of stochastic optimization methods has been growing. One of the main reasons for this is that high-probability complexity bounds are more accurate and less studied than in-expectation ones. However, SOTA high-probability non-asymptotic convergence results are derived under strong assum…

    Submitted 18 July, 2023; v1 submitted 2 February, 2023; originally announced February 2023.

    Comments: ICML 2023. 86 pages. Changes in v2: ICML formatting was applied along with minor edits of the text

  21. arXiv:2211.04659  [pdf, other]

    cs.LG math.OC stat.ML

    When is Momentum Extragradient Optimal? A Polynomial-Based Analysis

    Authors: Junhyung Lyle Kim, Gauthier Gidel, Anastasios Kyrillidis, Fabian Pedregosa

    Abstract: The extragradient method has gained popularity due to its robust convergence properties for differentiable games. Unlike single-objective optimization, game dynamics involve complex interactions reflected by the eigenvalues of the game vector field's Jacobian scattered across the complex plane. This complexity can cause the simple gradient method to diverge, even for bilinear games, while the extr…

    Submitted 10 February, 2024; v1 submitted 8 November, 2022; originally announced November 2022.

  22. arXiv:2210.17550  [pdf, other]

    math.OC cs.GT cs.LG stat.ML

    Nesterov Meets Optimism: Rate-Optimal Separable Minimax Optimization

    Authors: Chris Junchi Li, Angela Yuan, Gauthier Gidel, Quanquan Gu, Michael I. Jordan

    Abstract: We propose a new first-order optimization algorithm -- AcceleratedGradient-OptimisticGradient (AG-OG) Descent Ascent -- for separable convex-concave minimax optimization. The main idea of our algorithm is to carefully leverage the structure of the minimax problem, performing Nesterov acceleration on the individual component and optimistic gradient on the coupling component. Equipped with proper re…

    Submitted 14 August, 2023; v1 submitted 31 October, 2022; originally announced October 2022.

    Comments: 44 pages. This version matches the camera-ready that appeared at ICML 2023 under the same title

  23. arXiv:2210.04319  [pdf, other]

    cs.LG

    Dissecting adaptive methods in GANs

    Authors: Samy Jelassi, David Dobre, Arthur Mensch, Yuanzhi Li, Gauthier Gidel

    Abstract: Adaptive methods are a crucial component widely used for training generative adversarial networks (GANs). While there has been some work to pinpoint the "marginal value of adaptive methods" in standard tasks, it remains unclear why they are still critical for GAN training. In this paper, we formally study how adaptive methods help train GANs; inspired by the grafting method proposed in arXiv:2002.…

    Submitted 9 October, 2022; originally announced October 2022.

  24. arXiv:2207.06958   

    cs.SD cs.LG eess.AS

    Proceedings of the ICML 2022 Expressive Vocalizations Workshop and Competition: Recognizing, Generating, and Personalizing Vocal Bursts

    Authors: Alice Baird, Panagiotis Tzirakis, Gauthier Gidel, Marco Jiralerspong, Eilif B. Muller, Kory Mathewson, Björn Schuller, Erik Cambria, Dacher Keltner, Alan Cowen

    Abstract: This is the Proceedings of the ICML Expressive Vocalization (ExVo) Competition. The ExVo competition focuses on understanding and generating vocal bursts: laughs, gasps, cries, and other non-verbal vocalizations that are central to emotional expression and communication. ExVo 2022 included three competition tracks using a large-scale dataset of 59,201 vocalizations from 1,702 speakers. The first,…

    Submitted 16 August, 2022; v1 submitted 14 July, 2022; originally announced July 2022.

  25. arXiv:2206.12563  [pdf, other]

    cs.SD cs.LG eess.AS

    Generating Diverse Vocal Bursts with StyleGAN2 and MEL-Spectrograms

    Authors: Marco Jiralerspong, Gauthier Gidel

    Abstract: We describe our approach for the generative emotional vocal burst task (ExVo Generate) of the ICML Expressive Vocalizations Competition. We train a conditional StyleGAN2 architecture on mel-spectrograms of preprocessed versions of the audio samples. The mel-spectrograms generated by the model are then inverted back to the audio domain. As a result, our generated samples substantially improve upon…

    Submitted 25 June, 2022; originally announced June 2022.

    Comments: To be published at the ICML Expressive Vocalizations Workshop and Competition (ExVo Generate) held in conjunction with the 39th International Conference on Machine Learning

  26. arXiv:2206.12301  [pdf, other]

    cs.GT cs.LG stat.ML

    On the Limitations of Elo: Real-World Games are Transitive, not Additive

    Authors: Quentin Bertrand, Wojciech Marian Czarnecki, Gauthier Gidel

    Abstract: Real-world competitive games, such as chess, go, or StarCraft II, rely on Elo models to measure the strength of their players. Since these games are not fully transitive, using Elo implicitly assumes they have a strong transitive component that can correctly be identified and extracted. In this study, we investigate the challenge of identifying the strength of the transitive component in games. Fi…

    Submitted 6 March, 2023; v1 submitted 21 June, 2022; originally announced June 2022.

  27. arXiv:2206.09901  [pdf, other]

    math.OC cs.LG

    Only Tails Matter: Average-Case Universality and Robustness in the Convex Regime

    Authors: Leonardo Cunha, Gauthier Gidel, Fabian Pedregosa, Damien Scieur, Courtney Paquette

    Abstract: The recently developed average-case analysis of optimization methods allows a more fine-grained and representative convergence analysis than usual worst-case results. In exchange, this analysis requires a more precise hypothesis over the data generating process, namely assuming knowledge of the expected spectral distribution (ESD) of the random matrix associated with the problem. This work shows t…

    Submitted 22 June, 2022; v1 submitted 20 June, 2022; originally announced June 2022.

    Comments: To be published in ICML 2022

  28. arXiv:2206.08573  [pdf, ps, other]

    math.OC cs.CC cs.GT cs.LG

    Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization

    Authors: Simon S. Du, Gauthier Gidel, Michael I. Jordan, Chris Junchi Li

    Abstract: We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $\min_{\mathbf{x}}\max_{\mathbf{y}}~F(\mathbf{x}) + H(\mathbf{x},\mathbf{y}) - G(\mathbf{y})$, where one has access to stochastic first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$. Building upon standard stochastic extragradient analysis for variational inequalities, we present a stochastic…

    Submitted 11 August, 2022; v1 submitted 17 June, 2022; originally announced June 2022.

    Comments: More polishing and clarifications; 36 pages

  29. arXiv:2206.04270  [pdf, other]

    cs.LG

    A General Framework For Proving The Equivariant Strong Lottery Ticket Hypothesis

    Authors: Damien Ferbach, Christos Tsirigotis, Gauthier Gidel, Avishek Joey Bose

    Abstract: The Strong Lottery Ticket Hypothesis (SLTH) stipulates the existence of a subnetwork within a sufficiently overparameterized (dense) neural network that -- when initialized randomly and without any training -- achieves the accuracy of a fully trained target network. Recent works by Da Cunha et al. 2022; Burkholz 2022 demonstrate that the SLTH can be extended to translation equivariant networks --…

    Submitted 16 February, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: ICLR 2023

  30. arXiv:2206.01095  [pdf, other]

    math.OC cs.LG

    Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise

    Authors: Eduard Gorbunov, Marina Danilova, David Dobre, Pavel Dvurechensky, Alexander Gasnikov, Gauthier Gidel

    Abstract: Stochastic first-order methods such as Stochastic Extragradient (SEG) or Stochastic Gradient Descent-Ascent (SGDA) for solving smooth minimax problems and, more generally, variational inequality problems (VIP) have been gaining a lot of attention in recent years due to the growing popularity of adversarial formulations in machine learning. However, while high-probability convergence bounds are kno…

    Submitted 1 November, 2022; v1 submitted 2 June, 2022; originally announced June 2022.

    Comments: NeurIPS 2022. 74 pages, 18 figures. Changes in v2: few typos were fixed, new experiments with clipped-SEG were added. Code: https://github.com/busycalibrating/clipped-stochastic-methods

  31. arXiv:2206.00529  [pdf, other]

    cs.LG cs.DC math.OC

    Variance Reduction is an Antidote to Byzantines: Better Rates, Weaker Assumptions and Communication Compression as a Cherry on the Top

    Authors: Eduard Gorbunov, Samuel Horváth, Peter Richtárik, Gauthier Gidel

    Abstract: Byzantine-robustness has been gaining a lot of attention due to the growth of the interest in collaborative and federated learning. However, many fruitful directions, such as the usage of variance reduction for achieving robustness and communication compression for reducing communication costs, remain weakly explored in the field. This work addresses this gap and proposes Byz-VR-MARINA - a new Byz…

    Submitted 8 March, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

    Comments: ICLR 2023. 42 pages, 8 figures. Changes in v2: few typos and inaccuracies were fixed, more clarifications were added. Changes in v3: ICLR formatting was applied, additional experiments were added (Appendix B.4-B.5) and extra discussion of the results was added to Appendix E.5. Code: https://github.com/SamuelHorvath/VR_Byzantine

  32. arXiv:2205.01780  [pdf, other]

    eess.AS cs.LG cs.SD

    The ICML 2022 Expressive Vocalizations Workshop and Competition: Recognizing, Generating, and Personalizing Vocal Bursts

    Authors: Alice Baird, Panagiotis Tzirakis, Gauthier Gidel, Marco Jiralerspong, Eilif B. Muller, Kory Mathewson, Björn Schuller, Erik Cambria, Dacher Keltner, Alan Cowen

    Abstract: The ICML Expressive Vocalization (ExVo) Competition is focused on understanding and generating vocal bursts: laughs, gasps, cries, and other non-verbal vocalizations that are central to emotional expression and communication. ExVo 2022 includes three competition tracks using a large-scale dataset of 59,201 vocalizations from 1,702 speakers. The first, ExVo-MultiTask, requires participants to trai…

    Submitted 12 July, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

  33. arXiv:2204.07826  [pdf, other]

    stat.ML cs.LG

    Beyond L1: Faster and Better Sparse Models with skglm

    Authors: Quentin Bertrand, Quentin Klopfenstein, Pierre-Antoine Bannier, Gauthier Gidel, Mathurin Massias

    Abstract: We propose a new fast algorithm to estimate any sparse generalized linear model with convex or non-convex separable penalties. Our algorithm is able to solve problems with millions of samples and features in seconds, by relying on coordinate descent, working sets and Anderson acceleration. It handles previously unaddressed models, and is extensively shown to improve state-of-the-art algorithms. We pro…

    Submitted 6 March, 2023; v1 submitted 16 April, 2022; originally announced April 2022.

  34. arXiv:2111.08611  [pdf, other]

    math.OC cs.LG

    Stochastic Extragradient: General Analysis and Improved Rates

    Authors: Eduard Gorbunov, Hugo Berard, Gauthier Gidel, Nicolas Loizou

    Abstract: The Stochastic Extragradient (SEG) method is one of the most popular algorithms for solving min-max optimization and variational inequalities problems (VIP) appearing in various machine learning tasks. However, several important questions regarding the convergence properties of SEG are still open, including the sampling of stochastic gradients, mini-batching, convergence guarantees for the monoton…

    Submitted 22 February, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

    Comments: AISTATS 2022. 37 pages, 3 figures, 2 tables. Changes in v2: some minor typos were fixed, several places were clarified. Changes in v3: few typos were fixed, inaccuracies in Appendix B were corrected. Code: https://github.com/hugobb/Stochastic-Extragradient

  35. arXiv:2111.03146  [pdf, other]

    cs.LG cs.SD eess.AS

    Generating Diverse Realistic Laughter for Interactive Art

    Authors: M. Mehdi Afsar, Eric Park, Étienne Paquette, Gauthier Gidel, Kory W. Mathewson, Eilif Muller

    Abstract: We propose an interactive art project to make those rendered invisible by the COVID-19 crisis and its concomitant solitude reappear through the welcome melody of laughter, and connections created and explored through advanced laughter synthesis approaches. However, the unconditional generation of the diversity of human emotional responses in high-quality auditory synthesis remains an open problem,…

    Submitted 29 July, 2022; v1 submitted 4 November, 2021; originally announced November 2021.

    Comments: Presented at Machine Learning for Creativity and Design workshop at NeurIPS 2021, 6 pages

  36. arXiv:2110.10815  [pdf, other]

    cs.LG math.OC stat.ML

    Convergence Analysis and Implicit Regularization of Feedback Alignment for Deep Linear Networks

    Authors: Manuela Girotti, Ioannis Mitliagkas, Gauthier Gidel

    Abstract: We theoretically analyze the Feedback Alignment (FA) algorithm, an efficient alternative to backpropagation for training neural networks. We provide convergence guarantees with rates for deep linear networks for both continuous and discrete dynamics. Additionally, we study incremental learning phenomena for shallow linear networks. Interestingly, certain specific initializations imply that negligi…

    Submitted 20 October, 2021; originally announced October 2021.

    Comments: 10 pages (Main) + 19 pages (Appendix), 6 figures

  37. arXiv:2110.04261  [pdf, other]

    math.OC cs.LG

    Extragradient Method: $O(1/K)$ Last-Iterate Convergence for Monotone Variational Inequalities and Connections With Cocoercivity

    Authors: Eduard Gorbunov, Nicolas Loizou, Gauthier Gidel

    Abstract: Extragradient method (EG) (Korpelevich, 1976) is one of the most popular methods for solving saddle point and variational inequalities problems (VIP). Despite its long history and significant attention in the optimization community, there remain important open questions about convergence of EG. In this paper, we resolve one of such questions and derive the first last-iterate $O(1/K)$ convergence r…

    Submitted 22 February, 2022; v1 submitted 8 October, 2021; originally announced October 2021.

    Comments: AISTATS 2022; 37 pages, 4 figures. Changes in v2: structure was changed, minor typos are fixed, several additional clarifications were added. Code: https://github.com/eduardgorbunov/extragradient_last_iterate_AISTATS_2022

  38. arXiv:2110.04041  [pdf, other]

    cs.AI

    Pick Your Battles: Interaction Graphs as Population-Level Objectives for Strategic Diversity

    Authors: Marta Garnelo, Wojciech Marian Czarnecki, Siqi Liu, Dhruva Tirumala, Junhyuk Oh, Gauthier Gidel, Hado van Hasselt, David Balduzzi

    Abstract: Strategic diversity is often essential in games: in multi-player games, for example, evaluating a player against a diverse set of strategies will yield a more accurate estimate of its performance. Furthermore, in games with non-transitivities diversity allows a player to cover several winning strategies. However, despite the significance of strategic diversity, training agents that exhibit diverse…

    Submitted 8 October, 2021; originally announced October 2021.

  39. arXiv:2107.00464  [pdf, other]

    math.OC cs.GT cs.LG stat.ML

    On the Convergence of Stochastic Extragradient for Bilinear Games using Restarted Iteration Averaging

    Authors: Chris Junchi Li, Yaodong Yu, Nicolas Loizou, Gauthier Gidel, Yi Ma, Nicolas Le Roux, Michael I. Jordan

    Abstract: We study the stochastic bilinear minimax optimization problem, presenting an analysis of the same-sample Stochastic ExtraGradient (SEG) method with constant step size, and presenting variations of the method that yield favorable convergence. In sharp contrast with the basic SEG method whose last iterate only contracts to a fixed neighborhood of the Nash equilibrium, SEG augmented with iteration a…

    Submitted 8 April, 2022; v1 submitted 30 June, 2021; originally announced July 2021.

    Comments: Camera-ready version appeared at AISTATS 2022; short version appeared at NeurIPS OPT 2021 Workshop

  40. arXiv:2107.00052  [pdf, other]

    cs.LG cs.GT math.OC stat.ML

    Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity

    Authors: Nicolas Loizou, Hugo Berard, Gauthier Gidel, Ioannis Mitliagkas, Simon Lacoste-Julien

    Abstract: Two of the most prominent algorithms for solving unconstrained smooth games are the classical stochastic gradient descent-ascent (SGDA) and the recently introduced stochastic consensus optimization (SCO) [Mescheder et al., 2017]. SGDA is known to converge to a stationary point for specific classes of games, but current convergence analyses require a bounded variance assumption. SCO is used success…

    Submitted 4 November, 2021; v1 submitted 30 June, 2021; originally announced July 2021.

    Comments: 35th Conference on Neural Information Processing Systems (NeurIPS 2021)

  41. arXiv:2104.03863  [pdf, other]

    cs.LG cs.CR stat.ML

    A single gradient step finds adversarial examples on random two-layers neural networks

    Authors: Sébastien Bubeck, Yeshwanth Cherapanamjeri, Gauthier Gidel, Rémi Tachet des Combes

    Abstract: Daniely and Schacham recently showed that gradient descent finds adversarial examples on random undercomplete two-layers ReLU neural networks. The term "undercomplete" refers to the fact that their proof only holds when the number of neurons is a vanishing fraction of the ambient dimension. We extend their result to the overcomplete case, where the number of neurons is larger than the dimension (y…

    Submitted 9 April, 2021; v1 submitted 8 April, 2021; originally announced April 2021.

    Comments: Added a comment about universal adversarial perturbations. 18 pages, 7 figures

  42. arXiv:2103.02014  [pdf, other]

    cs.LG cs.CR cs.DS

    Online Adversarial Attacks

    Authors: Andjela Mladenovic, Avishek Joey Bose, Hugo Berard, William L. Hamilton, Simon Lacoste-Julien, Pascal Vincent, Gauthier Gidel

    Abstract: Adversarial attacks expose important vulnerabilities of deep learning models, yet little attention has been paid to settings where data arrives as a stream. In this paper, we formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases: attackers must operate under partial knowledge of the target model, and the decisions made by the attacker are irrev…

    Submitted 22 March, 2022; v1 submitted 2 March, 2021; originally announced March 2021.

    Comments: ICLR 2022

  43. arXiv:2007.00720  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Adversarial Example Games

    Authors: Avishek Joey Bose, Gauthier Gidel, Hugo Berard, Andre Cianflone, Pascal Vincent, Simon Lacoste-Julien, William L. Hamilton

    Abstract: The existence of adversarial examples capable of fooling trained neural network classifiers calls for a much better understanding of possible attacks to guide the development of safeguards against them. This includes attack methods in the challenging non-interactive blackbox setting, where adversarial attacks are generated without any access, including queries, to the target model. Prior attacks i…

    Submitted 8 January, 2021; v1 submitted 1 July, 2020; originally announced July 2020.

    Comments: Appears in: Advances in Neural Information Processing Systems 33 (NeurIPS 2020)

  44. arXiv:2004.09468  [pdf, other]

    cs.LG stat.ML

    Real World Games Look Like Spinning Tops

    Authors: Wojciech Marian Czarnecki, Gauthier Gidel, Brendan Tracey, Karl Tuyls, Shayegan Omidshafiei, David Balduzzi, Max Jaderberg

    Abstract: This paper investigates the geometrical properties of real world games (e.g. Tic-Tac-Toe, Go, StarCraft II). We hypothesise that their geometrical structure resembles a spinning top, with the upright axis representing transitive strength, and the radial axis, which corresponds to the number of cycles that exist at a particular transitive strength, representing the non-transitive dimension. We prove…

    Submitted 17 June, 2020; v1 submitted 20 April, 2020; originally announced April 2020.

  45. arXiv:2002.05820  [pdf, other]

    stat.ML cs.GT cs.LG

    A Limited-Capacity Minimax Theorem for Non-Convex Games or: How I Learned to Stop Worrying about Mixed-Nash and Love Neural Nets

    Authors: Gauthier Gidel, David Balduzzi, Wojciech Marian Czarnecki, Marta Garnelo, Yoram Bachrach

    Abstract: Adversarial training, a special case of multi-objective optimization, is an increasingly prevalent machine learning technique: some of its most notable applications include GAN-based generative modeling and self-play techniques in reinforcement learning which have been applied to complex games such as Go or Poker. In practice, a \emph{single} pair of networks is typically trained in order to find…

    Submitted 15 March, 2021; v1 submitted 13 February, 2020; originally announced February 2020.

    Comments: Appears in: Proceedings of the 24th International Conference on Artificial Intelligence and Statistics (AISTATS 2021). 19 pages

  46. arXiv:2001.00602  [pdf, other]

    cs.LG math.OC stat.ML

    Accelerating Smooth Games by Manipulating Spectral Shapes

    Authors: Waïss Azizian, Damien Scieur, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel

    Abstract: We use matrix iteration theory to characterize acceleration in smooth games. We define the spectral shape of a family of games as the set containing all eigenvalues of the Jacobians of standard gradient dynamics in the family. Shapes restricted to the real line represent well-understood classes of problems, like minimization. Shapes spanning the complex plane capture the added numerical challenges…

    Submitted 9 March, 2020; v1 submitted 2 January, 2020; originally announced January 2020.

    Comments: Appears in: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). 34 pages

    MSC Class: G.1.6; I.2.6 ACM Class: G.1.6; I.2.6

  47. arXiv:1907.04392  [pdf, other]

    cs.GT math.DS math.OC

    Finite Regret and Cycles with Fixed Step-Size via Alternating Gradient Descent-Ascent

    Authors: James P. Bailey, Gauthier Gidel, Georgios Piliouras

    Abstract: Gradient descent is arguably one of the most popular online optimization methods with a wide array of applications. However, the standard implementation where agents simultaneously update their strategies yields several undesirable properties; strategies diverge away from equilibrium and regret grows over time. In this paper, we eliminate these negative properties by introducing a different implem…

    Submitted 9 July, 2019; originally announced July 2019.

    Comments: 15 pages

  48. arXiv:1906.07300  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Linear Lower Bounds and Conditioning of Differentiable Games

    Authors: Adam Ibrahim, Waïss Azizian, Gauthier Gidel, Ioannis Mitliagkas

    Abstract: Recent successes of game-theoretic formulations in ML have caused a resurgence of research interest in differentiable games. Overwhelmingly, that research focuses on methods and upper bounds on their speed of convergence. In this work, we approach the question of fundamental iteration complexity by providing lower bounds to complement the linear (i.e. geometric) upper bounds observed in the litera…

    Submitted 15 September, 2020; v1 submitted 17 June, 2019; originally announced June 2019.

    Comments: ICML 2020 final version

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, Vienna, Austria, PMLR 119, 2020

  49. arXiv:1906.05945  [pdf, other]

    cs.LG math.OC stat.ML

    A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Games

    Authors: Waïss Azizian, Ioannis Mitliagkas, Simon Lacoste-Julien, Gauthier Gidel

    Abstract: We consider differentiable games where the goal is to find a Nash equilibrium. The machine learning community has recently started using variants of the gradient method (GD). Prime examples are extragradient (EG), the optimistic gradient method (OG) and consensus optimization (CO), which enjoy linear convergence in cases like bilinear games, where the standard GD fails. The full benefits of these…

    Submitted 7 July, 2020; v1 submitted 13 June, 2019; originally announced June 2019.

    Comments: Appears in: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020). 39 pages. Minor modification regarding prior work in comparison to the AISTATS Proceedings

    ACM Class: G.1.6; I.2.6

  50. arXiv:1906.04848  [pdf, other]

    cs.LG stat.ML

    A Closer Look at the Optimization Landscapes of Generative Adversarial Networks

    Authors: Hugo Berard, Gauthier Gidel, Amjad Almahairi, Pascal Vincent, Simon Lacoste-Julien

    Abstract: Generative adversarial networks have been very successful in generative modeling; however, they remain relatively challenging to train compared to standard deep neural networks. In this paper, we propose new visualization techniques for the optimization landscapes of GANs that enable us to study the game vector field resulting from the concatenation of the gradient of both players. Using these visu…

    Submitted 27 April, 2020; v1 submitted 11 June, 2019; originally announced June 2019.