Showing 1–9 of 9 results for author: Héliou, A

Searching in archive cs.
  1. arXiv:2403.08295  [pdf, other]

    cs.CL cs.AI

    Gemma: Open Models Based on Gemini Research and Technology

    Authors: Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari, et al. (83 additional authors not shown)

    Abstract: This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Ge…

    Submitted 16 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  2. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  3. arXiv:2109.05829  [pdf, other]

    cs.LG math.OC

    Zeroth-order non-convex learning via hierarchical dual averaging

    Authors: Amélie Héliou, Matthieu Martin, Panayotis Mertikopoulos, Thibaud Rahier

    Abstract: We propose a hierarchical version of dual averaging for zeroth-order online non-convex optimization - i.e., learning processes where, at each stage, the optimizer is facing an unknown non-convex loss function and only receives the incurred loss as feedback. The proposed class of policies relies on the construction of an online model that aggregates loss information as it arrives, and it consists o…

    Submitted 13 September, 2021; originally announced September 2021.

    Comments: 40 pages, 14 figures

    MSC Class: Primary 68Q32; 90C56; secondary 90C15; 90C26

  4. arXiv:2010.08496  [pdf, other]

    cs.LG math.OC

    Online non-convex optimization with imperfect feedback

    Authors: Amélie Héliou, Matthieu Martin, Panayotis Mertikopoulos, Thibaud Rahier

    Abstract: We consider the problem of online learning with non-convex losses. In terms of feedback, we assume that the learner observes - or otherwise constructs - an inexact model for the loss function encountered at each stage, and we propose a mixed-strategy learning policy based on dual averaging. In this general context, we derive a series of tight regret minimization guarantees, both for the learner's…

    Submitted 16 October, 2020; originally announced October 2020.

    Comments: 30 pages, 2 figures, 1 table

    MSC Class: Primary 68Q32; secondary 90C26; 91A26

  5. arXiv:2008.03235  [pdf, other]

    stat.ML cs.LG stat.ME

    Individual Treatment Prescription Effect Estimation in a Low Compliance Setting

    Authors: Thibaud Rahier, Amélie Héliou, Matthieu Martin, Christophe Renaudin, Eustache Diemert

    Abstract: Individual Treatment Effect (ITE) estimation is an extensively researched problem, with applications in various domains. We model the case where there exists heterogeneous non-compliance to a randomly assigned treatment, a typical situation in health (because of non-compliance to prescription) or digital advertising (because of competition and ad blockers for instance). The lower the compliance, t…

    Submitted 23 October, 2020; v1 submitted 7 August, 2020; originally announced August 2020.

    Comments: 28 pages, 10 figures

  6. arXiv:2006.10911  [pdf, ps, other]

    cs.GT math.OC

    Gradient-free Online Learning in Games with Delayed Rewards

    Authors: Amélie Héliou, Panayotis Mertikopoulos, Zhengyuan Zhou

    Abstract: Motivated by applications to online advertising and recommender systems, we consider a game-theoretic model with delayed rewards and asynchronous, payoff-based feedback. In contrast to previous work on delayed multi-armed bandits, we focus on multi-player games with continuous action spaces, and we examine the long-run behavior of strategic agents that follow a no-regret learning policy (but are o…

    Submitted 18 June, 2020; originally announced June 2020.

    Comments: 26 pages, 4 figures; to appear in ICML 2020

    MSC Class: Primary 91A10; 91A68; 68Q32; secondary 91A20; 91A26; 68T05

  7. arXiv:1902.04785  [pdf, other]

    cs.DS

    Constructing Antidictionaries in Output-Sensitive Space

    Authors: Lorraine A. K. Ayad, Golnaz Badkobeh, Gabriele Fici, Alice Héliou, Solon P. Pissis

    Abstract: A word $x$ that is absent from a word $y$ is called minimal if all its proper factors occur in $y$. Given a collection of $k$ words $y_1,y_2,\ldots,y_k$ over an alphabet $Σ$, we are asked to compute the set $\mathrm{M}^{\ell}_{y_{1}\#\ldots\#y_{k}}$ of minimal absent words of length at most $\ell$ of word $y=y_1\#y_2\#\ldots\#y_k$, $\#\notin Σ$. In data compression, this corresponds to computing th…

    Submitted 13 February, 2019; originally announced February 2019.

    Comments: Version accepted to DCC 2019

  8. arXiv:1607.08863  [pdf, ps, other]

    cs.GT cs.LG math.OC

    Exponentially fast convergence to (strict) equilibrium via hedging

    Authors: Johanne Cohen, Amélie Héliou, Panayotis Mertikopoulos

    Abstract: Motivated by applications to data networks where fast convergence is essential, we analyze the problem of learning in generic N-person games that admit a Nash equilibrium in pure strategies. Specifically, we consider a scenario where players interact repeatedly and try to learn from past experience by small adjustments based on local - and possibly imperfect - payoff information. For concreteness,…

    Submitted 29 July, 2016; originally announced July 2016.

    Comments: 14 pages

  9. arXiv:1406.6341  [pdf, ps, other]

    cs.DS

    Linear-time Computation of Minimal Absent Words Using Suffix Array

    Authors: Carl Barton, Alice Heliou, Laurent Mouchard, Solon P. Pissis

    Abstract: An absent word of a word y of length n is a word that does not occur in y. It is a minimal absent word if all its proper factors occur in y. Minimal absent words have been computed in genomes of organisms from all domains of life; their computation provides a fast alternative for measuring approximation in sequence comparison. There exists an O(n)-time and O(n)-space algorithm for computing all mi…

    Submitted 28 June, 2014; v1 submitted 24 June, 2014; originally announced June 2014.