
Showing 1–7 of 7 results for author: Tutek, M

  1. arXiv:2406.09325

    cs.CL

    REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space

    Authors: Tomer Ashuach, Martin Tutek, Yonatan Belinkov

    Abstract: Large language models (LLMs) risk inadvertently memorizing and divulging sensitive or personally identifiable information (PII) seen in training data, causing privacy concerns. Current approaches to address this issue involve costly dataset scrubbing, or model filtering through unlearning and model editing, which can be bypassed through extraction attacks. We propose REVS, a novel model editing me…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: 18 pages, 3 figures

    ACM Class: I.2.7

  2. arXiv:2401.10065

    cs.CL

    Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs

    Authors: Haritz Puerto, Martin Tutek, Somak Aditya, Xiaodan Zhu, Iryna Gurevych

    Abstract: Reasoning is a fundamental component of language understanding. Recent prompting techniques, such as chain of thought, have consistently improved LLMs' performance on various reasoning tasks. Nevertheless, there is still little understanding of what triggers reasoning abilities in LLMs in the inference stage. In this paper, we introduce code prompting, a chain of prompts that transforms a natural…

    Submitted 25 February, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: Code, prompt templates, prompts, and outputs are publicly available at https://github.com/UKPLab/arxiv2024-conditional-reasoning-llms

  3. arXiv:2310.02832

    cs.LG cs.CL

    Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness

    Authors: Fran Jelenić, Josip Jukić, Martin Tutek, Mate Puljiz, Jan Šnajder

    Abstract: Effective out-of-distribution (OOD) detection is crucial for reliable machine learning models, yet most current methods are limited in practical use due to requirements like access to training data or intervention in training. We present a novel method for detecting OOD data in Transformers based on transformation smoothness between intermediate layers of a network (BLOOD), which is applicable to…

    Submitted 11 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: International Conference on Learning Representations: ICLR 2024

  4. arXiv:2309.07822

    cs.CL

    CATfOOD: Counterfactual Augmented Training for Improving Out-of-Domain Performance and Calibration

    Authors: Rachneet Sachdeva, Martin Tutek, Iryna Gurevych

    Abstract: In recent years, large language models (LLMs) have shown remarkable capabilities at scale, particularly at generating text conditioned on a prompt. In our work, we investigate the use of LLMs to augment training data of small language models (SLMs) with automatically generated counterfactual (CF) instances -- i.e. minimally altered inputs -- in order to improve out-of-domain (OOD) performance of S…

    Submitted 13 February, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted to EACL 2024 main conference

  5. arXiv:2211.08369

    cs.CL

    Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods

    Authors: Josip Jukić, Martin Tutek, Jan Šnajder

    Abstract: A popular approach to unveiling the black box of neural NLP models is to leverage saliency methods, which assign scalar importance scores to each input component. A common practice for evaluating whether an interpretability method is faithful has been to use evaluation-by-agreement -- if multiple methods agree on an explanation, its credibility increases. However, recent work has found that salien…

    Submitted 11 May, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: Accepted to Findings of ACL 2023

  6. arXiv:2005.09379

    cs.CL

    Staying True to Your Word: (How) Can Attention Become Explanation?

    Authors: Martin Tutek, Jan Šnajder

    Abstract: The attention mechanism has quickly become ubiquitous in NLP. In addition to improving performance of models, attention has been widely used as a glimpse into the inner workings of NLP models. The latter aspect has in the recent years become a common topic of discussion, most notably in work of Jain and Wallace, 2019; Wiegreffe and Pinter, 2019. With the shortcomings of using attention weights as…

    Submitted 19 May, 2020; originally announced May 2020.

  7. arXiv:1808.10503

    cs.CL

    Iterative Recursive Attention Model for Interpretable Sequence Classification

    Authors: Martin Tutek, Jan Šnajder

    Abstract: Natural language processing has greatly benefited from the introduction of the attention mechanism. However, standard attention models are of limited interpretability for tasks that involve a series of inference steps. We describe an iterative recursive attention model, which constructs incremental representations of input data through reusing results of previously computed queries. We train our m…

    Submitted 30 August, 2018; originally announced August 2018.

    Comments: 7 pages, 5 figures, Analyzing and interpreting neural networks for NLP Workshop at EMNLP 2018