Showing 1–8 of 8 results for author: Puerto, H

  1. arXiv:2407.03181

    cs.CL

    Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models

    Authors: Haritz Puerto, Tilek Chubakov, Xiaodan Zhu, Harish Tayyar Madabushi, Iryna Gurevych

    Abstract: Requiring a Large Language Model to generate intermediary reasoning steps has been shown to be an effective way of boosting performance. In fact, it has been found that instruction tuning on these intermediary reasoning steps improves model performance. In this work, we present a novel method of further improving performance by requiring models to compare multiple reasoning chains before generatin…

    Submitted 3 July, 2024; originally announced July 2024.

  2. arXiv:2401.10065

    cs.CL

    Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs

    Authors: Haritz Puerto, Martin Tutek, Somak Aditya, Xiaodan Zhu, Iryna Gurevych

    Abstract: Reasoning is a fundamental component of language understanding. Recent prompting techniques, such as chain of thought, have consistently improved LLMs' performance on various reasoning tasks. Nevertheless, there is still little understanding of what triggers reasoning abilities in LLMs in the inference stage. In this paper, we introduce code prompting, a chain of prompts that transforms a natural…

    Submitted 25 February, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: Code, prompt templates, prompts, and outputs are publicly available at https://github.com/UKPLab/arxiv2024-conditional-reasoning-llms

  3. arXiv:2306.16900

    cs.CL

    Surveying (Dis)Parities and Concerns of Compute Hungry NLP Research

    Authors: Ji-Ung Lee, Haritz Puerto, Betty van Aken, Yuki Arase, Jessica Zosa Forde, Leon Derczynski, Andreas Rücklé, Iryna Gurevych, Roy Schwartz, Emma Strubell, Jesse Dodge

    Abstract: Many recent improvements in NLP stem from the development and use of large pre-trained language models (PLMs) with billions of parameters. Large model sizes make computational cost one of the main limiting factors for training and evaluating such models, and have raised severe concerns about the sustainability, reproducibility, and inclusiveness of research on PLMs. These concerns are often based…

    Submitted 9 November, 2023; v1 submitted 29 June, 2023; originally announced June 2023.

  4. arXiv:2305.19748

    cs.CL

    UKP-SQuARE: An Interactive Tool for Teaching Question Answering

    Authors: Haishuo Fang, Haritz Puerto, Iryna Gurevych

    Abstract: The exponential growth of question answering (QA) has made it an indispensable topic in any Natural Language Processing (NLP) course. Additionally, the breadth of QA derived from this exponential growth makes it an ideal scenario for teaching related NLP topics such as information retrieval, explainability, and adversarial attacks among others. In this paper, we introduce UKP-SQuARE as a platform…

    Submitted 2 June, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: Accepted by BEA workshop, ACL 2023

  5. arXiv:2303.18120

    cs.CL

    UKP-SQuARE v3: A Platform for Multi-Agent QA Research

    Authors: Haritz Puerto, Tim Baumgärtner, Rachneet Sachdeva, Haishuo Fang, Hao Zhang, Sewin Tariverdian, Kexin Wang, Iryna Gurevych

    Abstract: The continuous development of Question Answering (QA) datasets has drawn the research community's attention toward multi-domain models. A popular approach is to use multi-dataset models, which are models trained on multiple datasets to learn their regularities and prevent overfitting to a single dataset. However, with the proliferation of QA models in online repositories such as GitHub or Hugging…

    Submitted 17 May, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

    Comments: ACL 2023 Demo Paper

  6. arXiv:2208.09316

    cs.CL

    UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA

    Authors: Rachneet Sachdeva, Haritz Puerto, Tim Baumgärtner, Sewin Tariverdian, Hao Zhang, Kexin Wang, Hossain Shaikh Saadi, Leonardo F. R. Ribeiro, Iryna Gurevych

    Abstract: Question Answering (QA) systems are increasingly deployed in applications where they support real-world decisions. However, state-of-the-art models rely on deep neural networks, which are difficult to interpret by humans. Inherently interpretable models or post hoc explainability methods can help users to comprehend how a model arrives at its prediction and, if successful, increase their trust in…

    Submitted 20 October, 2022; v1 submitted 19 August, 2022; originally announced August 2022.

    Comments: Accepted at AACL 2022 as Demo Paper

  7. arXiv:2203.13693

    cs.CL cs.IR

    UKP-SQUARE: An Online Platform for Question Answering Research

    Authors: Tim Baumgärtner, Kexin Wang, Rachneet Sachdeva, Max Eichler, Gregor Geigle, Clifton Poth, Hannah Sterz, Haritz Puerto, Leonardo F. R. Ribeiro, Jonas Pfeiffer, Nils Reimers, Gözde Gül Şahin, Iryna Gurevych

    Abstract: Recent advances in NLP and information retrieval have given rise to a diverse set of question answering tasks that are of different formats (e.g., extractive, abstractive), require different model architectures (e.g., generative, discriminative), and setups (e.g., with or without retrieval). Despite having a large number of powerful, specialized QA pipelines (which we refer to as Skills) that cons…

    Submitted 28 March, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

    Comments: Accepted at ACL 2022 Demo Track

  8. arXiv:2112.01922

    cs.CL cs.LG

    MetaQA: Combining Expert Agents for Multi-Skill Question Answering

    Authors: Haritz Puerto, Gözde Gül Şahin, Iryna Gurevych

    Abstract: The recent explosion of question answering (QA) datasets and models has increased the interest in the generalization of models across multiple domains and formats by either training on multiple datasets or by combining multiple models. Despite the promising results of multi-dataset models, some domains or QA formats may require specific architectures, and thus the adaptability of these models migh…

    Submitted 6 February, 2023; v1 submitted 3 December, 2021; originally announced December 2021.

    Comments: Accepted at EACL 2023