Showing 1–13 of 13 results for author: Bulian, J

  1. arXiv:2407.04622  [pdf, other]

    cs.LG

    On scalable oversight with weak LLMs judging strong LLMs

    Authors: Zachary Kenton, Noah Y. Siegel, János Kramár, Jonah Brown-Cohen, Samuel Albanie, Jannis Bulian, Rishabh Agarwal, David Lindner, Yunhao Tang, Noah D. Goodman, Rohin Shah

    Abstract: Scalable oversight protocols aim to enable humans to accurately supervise superhuman AI. In this paper we study debate, where two AI's compete to convince a judge; consultancy, where a single AI tries to convince a judge that asks questions; and compare to a baseline of direct question-answering, where the judge just answers outright without the AI. We use large language models (LLMs) as both AI a…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 15 pages (53 including appendices)

  2. arXiv:2310.02932  [pdf, other]

    cs.CL cs.AI cs.CY cs.LG

    Assessing Large Language Models on Climate Information

    Authors: Jannis Bulian, Mike S. Schäfer, Afra Amini, Heidi Lam, Massimiliano Ciaramita, Ben Gaiarin, Michelle Chen Hübscher, Christian Buck, Niels G. Mede, Markus Leippold, Nadine Strauß

    Abstract: As Large Language Models (LLMs) rise in popularity, it is necessary to assess their capability in critically relevant domains. We present a comprehensive evaluation framework, grounded in science communication research, to assess LLM responses to questions about climate change. Our framework emphasizes both presentational and epistemological adequacy, offering a fine-grained analysis of LLM genera…

    Submitted 28 May, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Journal ref: Proceedings of the 41st International Conference on Machine Learning (ICML), 2024

  3. arXiv:2203.17189  [pdf, other]

    cs.LG cs.CL

    Scaling Up Models and Data with $\texttt{t5x}$ and $\texttt{seqio}$

    Authors: Adam Roberts, Hyung Won Chung, Anselm Levskaya, Gaurav Mishra, James Bradbury, Daniel Andor, Sharan Narang, Brian Lester, Colin Gaffney, Afroz Mohiuddin, Curtis Hawthorne, Aitor Lewkowycz, Alex Salcianu, Marc van Zee, Jacob Austin, Sebastian Goodman, Livio Baldini Soares, Haitang Hu, Sasha Tsvyashchenko, Aakanksha Chowdhery, Jasmijn Bastings, Jannis Bulian, Xavier Garcia, Jianmo Ni, Andrew Chen, et al. (18 additional authors not shown)

    Abstract: Recent neural network-based language models have benefited greatly from scaling up the size of training datasets and the number of parameters in the models themselves. Scaling can be complicated due to various factors including the need to distribute computation on supercomputer clusters (e.g., TPUs), prevent bottlenecks when infeeding data, and ensure reproducible results. In this work, we presen…

    Submitted 31 March, 2022; originally announced March 2022.

  4. arXiv:2202.07654  [pdf, other]

    cs.CL cs.LG

    Tomayto, Tomahto. Beyond Token-level Answer Equivalence for Question Answering Evaluation

    Authors: Jannis Bulian, Christian Buck, Wojciech Gajewski, Benjamin Boerschinger, Tal Schuster

    Abstract: The predictions of question answering (QA) systems are typically evaluated against manually annotated finite sets of one or more answers. This leads to a coverage limitation that results in underestimating the true performance of systems, and is typically addressed by extending over exact match (EM) with pre-defined rules or with the token-level F1 measure. In this paper, we present the first syste…

    Submitted 26 October, 2022; v1 submitted 15 February, 2022; originally announced February 2022.

  5. arXiv:2104.04725  [pdf, other]

    cs.CL

    Fool Me Twice: Entailment from Wikipedia Gamification

    Authors: Julian Martin Eisenschlos, Bhuwan Dhingra, Jannis Bulian, Benjamin Börschinger, Jordan Boyd-Graber

    Abstract: We release FoolMeTwice (FM2 for short), a large dataset of challenging entailment pairs collected through a fun multi-player game. Gamification encourages adversarial examples, drastically lowering the number of examples that can be solved using "shortcuts" compared to other popular entailment datasets. Players are presented with two tasks. The first task asks the player to write a plausible claim…

    Submitted 10 April, 2021; originally announced April 2021.

    Comments: Published in NAACL 2021

  6. arXiv:2012.00614  [pdf, other]

    cs.CL cs.AI

    CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims

    Authors: Thomas Diggelmann, Jordan Boyd-Graber, Jannis Bulian, Massimiliano Ciaramita, Markus Leippold

    Abstract: We introduce CLIMATE-FEVER, a new publicly available dataset for verification of climate change-related claims. By providing a dataset for the research community, we aim to facilitate and encourage work on improving algorithms for retrieving evidential support for climate-specific claims, addressing the underlying language understanding challenges, and ultimately help alleviate the impact of misin…

    Submitted 2 January, 2021; v1 submitted 1 December, 2020; originally announced December 2020.

    Comments: Accepted for the Tackling Climate Change with Machine Learning Workshop at NeurIPS 2020

  7. arXiv:1911.04156  [pdf, other]

    cs.CL cs.AI

    Meta Answering for Machine Reading

    Authors: Benjamin Borschinger, Jordan Boyd-Graber, Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Michelle Chen Huebscher, Wojciech Gajewski, Yannic Kilcher, Rodrigo Nogueira, Lierni Sestorain Saralegu

    Abstract: We investigate a framework for machine reading, inspired by real world information-seeking problems, where a meta question answering system interacts with a black box environment. The environment encapsulates a competitive machine reader based on BERT, providing candidate answers to questions, and possibly some context. To validate the realism of our formulation, we ask humans to play the role of…

    Submitted 30 April, 2020; v1 submitted 11 November, 2019; originally announced November 2019.

  8. arXiv:1809.10658  [pdf, other]

    cs.LG stat.ML

    Learning to Coordinate Multiple Reinforcement Learning Agents for Diverse Query Reformulation

    Authors: Rodrigo Nogueira, Jannis Bulian, Massimiliano Ciaramita

    Abstract: We propose a method to efficiently learn diverse strategies in reinforcement learning for query reformulation in the tasks of document retrieval and question answering. In the proposed framework an agent consists of multiple specialized sub-agents and a meta-agent that learns to aggregate the answers from sub-agents to produce a final answer. Sub-agents are trained on disjoint partitions of the tr…

    Submitted 25 December, 2018; v1 submitted 27 September, 2018; originally announced September 2018.

  9. arXiv:1801.07537  [pdf, other]

    cs.CL cs.AI

    Analyzing Language Learned by an Active Question Answering Agent

    Authors: Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang

    Abstract: We analyze the language learned by an agent trained with reinforcement learning as a component of the ActiveQA system [Buck et al., 2017]. In ActiveQA, question answering is framed as a reinforcement learning task in which an agent sits between the user and a black box question-answering system. The agent learns to reformulate the user's questions to elicit the optimal answers. It probes the syste…

    Submitted 23 January, 2018; originally announced January 2018.

    Comments: Emergent Communication Workshop, NIPS 2017

  10. arXiv:1705.07830  [pdf, other]

    cs.CL cs.AI

    Ask the Right Questions: Active Question Reformulation with Reinforcement Learning

    Authors: Christian Buck, Jannis Bulian, Massimiliano Ciaramita, Wojciech Gajewski, Andrea Gesmundo, Neil Houlsby, Wei Wang

    Abstract: We frame Question Answering (QA) as a Reinforcement Learning task, an approach that we call Active Question Answering. We propose an agent that sits between the user and a black box QA system and learns to reformulate questions to elicit the best possible answers. The agent probes the system with, potentially many, natural language reformulations of an initial question and aggregates the returned…

    Submitted 2 March, 2018; v1 submitted 22 May, 2017; originally announced May 2017.

    Journal ref: Sixth International Conference on Learning Representations (ICLR), 2018

  11. arXiv:1502.05910  [pdf, ps, other]

    cs.DS cs.CC cs.DM cs.LO

    Fixed-parameter Tractable Distances to Sparse Graph Classes

    Authors: Jannis Bulian, Anuj Dawar

    Abstract: We show that for various classes C of sparse graphs, and several measures of distance to such classes (such as edit distance and elimination distance), the problem of determining the distance of a given graph G to C is fixed-parameter tractable. The results are based on two general techniques. The first of these, building on recent work of Grohe et al. establishes that any class of graphs that is…

    Submitted 20 February, 2015; originally announced February 2015.

  12. arXiv:1406.4718  [pdf, ps, other]

    cs.DS cs.CC cs.DM

    Graph Isomorphism Parameterized by Elimination Distance to Bounded Degree

    Authors: Jannis Bulian, Anuj Dawar

    Abstract: A commonly studied means of parameterizing graph problems is the deletion distance from triviality (Guo et al. 2004), which counts vertices that need to be deleted from a graph to place it in some class for which efficient algorithms are known. In the context of graph isomorphism, we define triviality to mean a graph with maximum degree bounded by a constant, as such graph classes admit polynomial…

    Submitted 16 October, 2014; v1 submitted 18 June, 2014; originally announced June 2014.

    Comments: 19 pages

    ACM Class: F.2.2; G.2.2

  13. Bare canonicity of representable cylindric and polyadic algebras

    Authors: Jannis Bulian, Ian Hodkinson

    Abstract: We show that for finite n at least 3, every first-order axiomatisation of the varieties of representable n-dimensional cylindric algebras, diagonal-free cylindric algebras, polyadic algebras, and polyadic equality algebras contains an infinite number of non-canonical formulas. We also show that the class of structures for each of these varieties is non-elementary. The proofs employ algebras derive…

    Submitted 27 February, 2012; originally announced February 2012.

    MSC Class: 03G15 (Primary) 03C05; 06B15; 06E15; 06E25 (Secondary)

    Journal ref: Annals of Pure and Applied Logic 164 (2013), pp. 884-906