Skip to main content

Showing 1–8 of 8 results for author: Keller, O

Searching in archive cs. Search in all archives.
.
  1. arXiv:2405.14655  [pdf, other

    cs.LG

    Multi-turn Reinforcement Learning from Preference Human Feedback

    Authors: Lior Shani, Aviv Rosenberg, Asaf Cassel, Oran Lang, Daniele Calandriello, Avital Zipori, Hila Noga, Orgad Keller, Bilal Piot, Idan Szpektor, Avinatan Hassidim, Yossi Matias, Rémi Munos

    Abstract: Reinforcement Learning from Human Feedback (RLHF) has become the standard approach for aligning Large Language Models (LLMs) with human preferences, allowing LLMs to demonstrate remarkable abilities in various tasks. Existing methods work by emulating the preferences at the single decision (turn) level, limiting their capabilities in settings that require planning or multi-turn interactions to ach… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  2. arXiv:2312.11805  [pdf, other

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee , et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  3. arXiv:2306.00186  [pdf, other

    cs.CL

    Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback

    Authors: Paul Roit, Johan Ferret, Lior Shani, Roee Aharoni, Geoffrey Cideron, Robert Dadashi, Matthieu Geist, Sertan Girgin, Léonard Hussenot, Orgad Keller, Nikola Momchev, Sabela Ramos, Piotr Stanczyk, Nino Vieillard, Olivier Bachem, Gal Elidan, Avinatan Hassidim, Olivier Pietquin, Idan Szpektor

    Abstract: Despite the seeming success of contemporary grounded text generation systems, they often tend to generate factually inconsistent text with respect to their input. This phenomenon is emphasized in tasks like summarization, in which the generated summaries should be corroborated by their source article. In this work, we leverage recent progress on textual entailment models to directly address this p… ▽ More

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: ACL 2023

  4. arXiv:2208.02294  [pdf, other

    cs.CL cs.LG

    Dynamic Planning in Open-Ended Dialogue using Reinforcement Learning

    Authors: Deborah Cohen, Moonkyung Ryu, Yinlam Chow, Orgad Keller, Ido Greenberg, Avinatan Hassidim, Michael Fink, Yossi Matias, Idan Szpektor, Craig Boutilier, Gal Elidan

    Abstract: Despite recent advances in natural language understanding and generation, and decades of research on the development of conversational bots, building automated agents that can carry on rich open-ended conversations with humans "in the wild" remains a formidable challenge. In this work we develop a real-time, open-ended dialogue system that uses reinforcement learning (RL) to power a bot's conversa… ▽ More

    Submitted 25 July, 2022; originally announced August 2022.

  5. arXiv:2206.14796  [pdf, other

    cs.CL cs.AI cs.LG

    On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-based Method

    Authors: Zorik Gekhman, Nadav Oved, Orgad Keller, Idan Szpektor, Roi Reichart

    Abstract: Most works on modeling the conversation history in Conversational Question Answering (CQA) report a single main result on a common CQA benchmark. While existing models show impressive results on CQA leaderboards, it remains unclear whether they are robust to shifts in setting (sometimes to more realistic ones), training data size (e.g. from large to small sets) and domain. In this work, we design… ▽ More

    Submitted 28 December, 2022; v1 submitted 29 June, 2022; originally announced June 2022.

    Comments: Accepted for publication at TACL in December 2022. First two authors contributed equally to this work. Our code and data will be released at: https://github.com/zorikg/MarCQAp

  6. arXiv:2010.02592  [pdf, other

    cs.CL cs.AI cs.LG

    Semantically Driven Sentence Fusion: Modeling and Evaluation

    Authors: Eyal Ben-David, Orgad Keller, Eric Malmi, Idan Szpektor, Roi Reichart

    Abstract: Sentence fusion is the task of joining related sentences into coherent text. Current training and evaluation schemes for this task are based on single reference ground-truths and do not account for valid fusion variants. We show that this hinders models from robustly capturing the semantic relationship between input sentences. To alleviate this, we present an approach in which ground-truth solutio… ▽ More

    Submitted 6 October, 2020; originally announced October 2020.

    Comments: This paper was accepted to Findings of EMNLP 2020

  7. arXiv:1708.04862  [pdf, other

    cs.DS

    New Approximations for Coalitional Manipulation in General Scoring Rules

    Authors: Orgad Keller, Avinatan Hassidim, Noam Hazon

    Abstract: We study the problem of coalitional manipulation---where $k$ manipulators try to manipulate an election on $m$ candidates---under general scoring rules, with a focus on the Borda protocol. We do so both in the weighted and unweighted settings. We focus on minimizing the maximum score obtainable by a non-preferred candidate. In the strongest, most general setting, we provide an algorithm for any… ▽ More

    Submitted 16 August, 2017; originally announced August 2017.

  8. arXiv:1307.2415  [pdf, ps, other

    cs.DS

    Finding the Minimum-Weight k-Path

    Authors: Avinatan Hassidim, Orgad Keller, Moshe Lewenstein, Liam Roditty

    Abstract: Given a weighted $n$-vertex graph $G$ with integer edge-weights taken from a range $[-M,M]$, we show that the minimum-weight simple path visiting $k$ vertices can be found in time $\tilde{O}(2^k \poly(k) M n^ω) = O^*(2^k M)$. If the weights are reals in $[1,M]$, we provide a $(1+\varepsilon)$-approximation which has a running time of $\tilde{O}(2^k \poly(k) n^ω(\log\log M + 1/\varepsilon))$. For t… ▽ More

    Submitted 9 July, 2013; originally announced July 2013.

    Comments: To appear at WADS 2013