Showing 1–22 of 22 results for author: Šnajder, J

  1. arXiv:2404.12174  [pdf, other]

    cs.CL

    Claim Check-Worthiness Detection: How Well do LLMs Grasp Annotation Guidelines?

    Authors: Laura Majer, Jan Šnajder

    Abstract: The increasing threat of disinformation calls for automating parts of the fact-checking pipeline. Identifying text segments requiring fact-checking is known as claim detection (CD) and claim check-worthiness detection (CW), the latter incorporating complex domain-specific criteria of worthiness and often framed as a ranking task. Zero- and few-shot LLM prompting is an attractive option for both ta…

    Submitted 18 April, 2024; originally announced April 2024.

  2. arXiv:2404.00758  [pdf, other]

    cs.CL

    From Robustness to Improved Generalization and Calibration in Pre-trained Language Models

    Authors: Josip Jukić, Jan Šnajder

    Abstract: Enhancing generalization and uncertainty quantification in pre-trained language models (PLMs) is crucial for their effectiveness and reliability. Building on machine learning research that established the importance of robustness for improving generalization, we investigate the role of representation smoothness, achieved via Jacobian and Hessian regularization, in enhancing PLM performance. Althou…

    Submitted 31 March, 2024; originally announced April 2024.

  3. arXiv:2403.00418  [pdf, other]

    cs.CL

    LLMs for Targeted Sentiment in News Headlines: Exploring the Descriptive-Prescriptive Dilemma

    Authors: Jana Juroš, Laura Majer, Jan Šnajder

    Abstract: News headlines often evoke sentiment by intentionally portraying entities in particular ways, making targeted sentiment analysis (TSA) of headlines a worthwhile but difficult task. Due to its subjectivity, creating TSA datasets can involve various annotation paradigms, from descriptive to prescriptive, either encouraging or limiting subjectivity. LLMs are a good fit for TSA due to their broad ling…

    Submitted 28 May, 2024; v1 submitted 1 March, 2024; originally announced March 2024.

  4. arXiv:2402.13130  [pdf, other]

    cs.CL

    Are ELECTRA's Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity

    Authors: Ivan Rep, David Dukić, Jan Šnajder

    Abstract: While BERT produces high-quality sentence embeddings, its pre-training computational cost is a significant drawback. In contrast, ELECTRA delivers a cost-effective pre-training objective and downstream task performance improvements, but not as performant sentence embeddings. The community tacitly stopped utilizing ELECTRA's sentence embeddings for semantic textual similarity (STS). We notice a sig…

    Submitted 20 February, 2024; originally announced February 2024.

    Comments: 7 pages, 9 figures, 2 tables

  5. arXiv:2401.14556  [pdf, other]

    cs.CL

    Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling

    Authors: David Dukić, Jan Šnajder

    Abstract: Pre-trained language models based on masked language modeling (MLM) excel in natural language understanding (NLU) tasks. While fine-tuned MLM-based encoders consistently outperform causal language modeling decoders of comparable size, recent decoder-only large language models (LLMs) perform on par with smaller MLM-based encoders. Although their performance improves with scale, LLMs fall short of a…

    Submitted 6 June, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Accepted at ACL 2024 Findings

  6. arXiv:2310.02832  [pdf, other]

    cs.LG cs.CL

    Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness

    Authors: Fran Jelenić, Josip Jukić, Martin Tutek, Mate Puljiz, Jan Šnajder

    Abstract: Effective out-of-distribution (OOD) detection is crucial for reliable machine learning models, yet most current methods are limited in practical use due to requirements like access to training data or intervention in training. We present a novel method for detecting OOD data in Transformers based on transformation smoothness between intermediate layers of a network (BLOOD), which is applicable to…

    Submitted 11 March, 2024; v1 submitted 4 October, 2023; originally announced October 2023.

    Comments: International Conference on Learning Representations: ICLR 2024

  7. arXiv:2305.14576  [pdf, other]

    cs.CL

    Parameter-Efficient Language Model Tuning with Active Learning in Low-Resource Settings

    Authors: Josip Jukić, Jan Šnajder

    Abstract: Pre-trained language models (PLMs) have ignited a surge in demand for effective fine-tuning techniques, particularly in low-resource domains and languages. Active learning (AL), a set of algorithms designed to decrease labeling costs by minimizing label complexity, has shown promise in confronting the labeling bottleneck. In parallel, adapter modules designed for parameter-efficient fine-tuning (P…

    Submitted 23 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at EMNLP 2023

  8. arXiv:2305.14163  [pdf, other]

    cs.CL cs.LG

    Leveraging Open Information Extraction for More Robust Domain Transfer of Event Trigger Detection

    Authors: David Dukić, Kiril Gashteovski, Goran Glavaš, Jan Šnajder

    Abstract: Event detection is a crucial information extraction task in many domains, such as Wikipedia or news. The task typically relies on trigger detection (TD) -- identifying token spans in the text that evoke specific events. While the notion of triggers should ideally be universal across domains, domain transfer for TD from high- to low-resource domains results in significant performance drops. We addr…

    Submitted 1 February, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at EACL 2024 Findings

  9. arXiv:2305.12190  [pdf, other]

    cs.IR cs.CL

    Paragraph-level Citation Recommendation based on Topic Sentences as Queries

    Authors: Zoran Medić, Jan Šnajder

    Abstract: Citation recommendation (CR) models may help authors find relevant articles at various stages of the paper writing process. Most research has dealt with either global CR, which produces general recommendations suitable for the initial writing stage, or local CR, which produces specific recommendations more fitting for the final writing stages. We propose the task of paragraph-level CR as a middle…

    Submitted 20 May, 2023; originally announced May 2023.

  10. arXiv:2305.09807  [pdf, other]

    cs.LG cs.CL

    On Dataset Transferability in Active Learning for Transformers

    Authors: Fran Jelenić, Josip Jukić, Nina Drobac, Jan Šnajder

    Abstract: Active learning (AL) aims to reduce labeling costs by querying the examples most beneficial for model learning. While the effectiveness of AL for fine-tuning transformer-based pre-trained language models (PLMs) has been demonstrated, it is less clear to what extent the AL gains obtained with one model transfer to others. We consider the problem of transferability of actively acquired datasets in t…

    Submitted 29 September, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

    Comments: Findings of the Association for Computational Linguistics: ACL 2023

  11. arXiv:2302.11412  [pdf, ps, other]

    cs.CL cs.LG

    Data Augmentation for Neural NLP

    Authors: Domagoj Pluščec, Jan Šnajder

    Abstract: Data scarcity is a problem that occurs in languages and tasks where we do not have large amounts of labeled data but want to use state-of-the-art models. Such models are often deep learning models that require a significant amount of data to train. Acquiring data for various machine learning problems is accompanied by high labeling costs. Data augmentation is a low-cost approach for tackling data…

    Submitted 22 February, 2023; originally announced February 2023.

  12. arXiv:2302.00493  [pdf, other]

    cs.CL

    You Are What You Talk About: Inducing Evaluative Topics for Personality Analysis

    Authors: Josip Jukić, Iva Vukojević, Jan Šnajder

    Abstract: Expressing attitude or stance toward entities and concepts is an integral part of human behavior and personality. Recently, evaluative language data has become more accessible with social media's rapid growth, enabling large-scale opinion analysis. However, surprisingly little research examines the relationship between personality and evaluative language. To bridge this gap, we introduce the notio…

    Submitted 1 February, 2023; originally announced February 2023.

    Comments: Accepted at EMNLP 2022 (Findings), NLP+CSS

  13. arXiv:2212.11680  [pdf, other]

    cs.LG cs.CL

    Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis

    Authors: Josip Jukić, Jan Šnajder

    Abstract: Developed to alleviate prohibitive labeling costs, active learning (AL) methods aim to reduce label complexity in supervised learning. While recent work has demonstrated the benefit of using AL in combination with large pre-trained language models (PLMs), it has often overlooked the practical challenges that hinder the effectiveness of AL. We address these challenges by leveraging representation s…

    Submitted 23 October, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: Accepted at Learning with Small Data 2023, Association for Computational Linguistics

  14. arXiv:2211.08369  [pdf, other]

    cs.CL

    Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods

    Authors: Josip Jukić, Martin Tutek, Jan Šnajder

    Abstract: A popular approach to unveiling the black box of neural NLP models is to leverage saliency methods, which assign scalar importance scores to each input component. A common practice for evaluating whether an interpretability method is faithful has been to use evaluation-by-agreement -- if multiple methods agree on an explanation, its credibility increases. However, recent work has found that salien…

    Submitted 11 May, 2023; v1 submitted 15 November, 2022; originally announced November 2022.

    Comments: Accepted to findings of ACL 2023

  15. arXiv:2211.06224  [pdf, other]

    cs.LG

    ALANNO: An Active Learning Annotation System for Mortals

    Authors: Josip Jukić, Fran Jelenić, Miroslav Bićanić, Jan Šnajder

    Abstract: Supervised machine learning has become the cornerstone of today's data-driven society, increasing the need for labeled data. However, the process of acquiring labels is often expensive and tedious. One possible remedy is to use active learning (AL) -- a special family of machine learning algorithms designed to reduce labeling costs. Although AL has been successful in practice, a number of practica…

    Submitted 21 February, 2023; v1 submitted 11 November, 2022; originally announced November 2022.

    Comments: Accepted at EACL 2023

  16. arXiv:2209.05452  [pdf, other]

    cs.IR cs.CL

    Large-scale Evaluation of Transformer-based Article Encoders on the Task of Citation Recommendation

    Authors: Zoran Medić, Jan Šnajder

    Abstract: Recently introduced transformer-based article encoders (TAEs) designed to produce similar vector representations for mutually related scientific articles have demonstrated strong performance on benchmark datasets for scientific article recommendation. However, the existing benchmark datasets are predominantly focused on single domains and, in some cases, contain easy negatives in small candidate p…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: Accepted to the Third Workshop on Scholarly Document Processing @ COLING 2022

  17. A Topic Coverage Approach to Evaluation of Topic Models

    Authors: Damir Korenčić, Strahil Ristov, Jelena Repar, Jan Šnajder

    Abstract: Topic models are widely used unsupervised models capable of learning topics - weighted lists of words and documents - from large collections of text documents. When topic models are used for discovery of topics in text collections, a question that arises naturally is how well the model-induced topics correspond to topics of interest to the analyst. In this paper we revisit and extend a so far negl…

    Submitted 2 September, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

    Comments: Final version accepted for publication in IEEE Access (https://doi.org/10.1109/ACCESS.2021.3109425); Contributions unchanged, results augmented with the analysis of the models' precision and recall and the analysis of the measures' running time; Improved description of the contributions; Improved future work

    ACM Class: H.3.3; I.5.4; I.2.7

  18. arXiv:2005.09379  [pdf, other]

    cs.CL

    Staying True to Your Word: (How) Can Attention Become Explanation?

    Authors: Martin Tutek, Jan Šnajder

    Abstract: The attention mechanism has quickly become ubiquitous in NLP. In addition to improving performance of models, attention has been widely used as a glimpse into the inner workings of NLP models. The latter aspect has in the recent years become a common topic of discussion, most notably in work of Jain and Wallace, 2019; Wiegreffe and Pinter, 2019. With the shortcomings of using attention weights as…

    Submitted 19 May, 2020; originally announced May 2020.

  19. arXiv:2004.04460  [pdf, other]

    cs.CL cs.CY cs.SI

    PANDORA Talks: Personality and Demographics on Reddit

    Authors: Matej Gjurković, Mladen Karan, Iva Vukojević, Mihaela Bošnjak, Jan Šnajder

    Abstract: Personality and demographics are important variables in social sciences, while in NLP they can aid in interpretability and removal of societal biases. However, datasets with both personality and demographic labels are scarce. To address this, we present PANDORA, the first large-scale dataset of Reddit comments labeled with three personality models (including the well-established Big 5 model) and d…

    Submitted 8 June, 2021; v1 submitted 9 April, 2020; originally announced April 2020.

    Comments: Proceedings of the Ninth International Workshop on Natural Language Processing for Social Media, NAACL 2021, https://www.aclweb.org/anthology/2021.socialnlp-1.12

  20. arXiv:1811.04655  [pdf, ps, other]

    cs.CL cs.SI

    Not Just Depressed: Bipolar Disorder Prediction on Reddit

    Authors: Ivan Sekulić, Matej Gjurković, Jan Šnajder

    Abstract: Bipolar disorder, an illness characterized by manic and depressive episodes, affects more than 60 million people worldwide. We present a preliminary study on bipolar disorder prediction from user-generated text on Reddit, which relies on users' self-reported labels. Our benchmark classifiers for bipolar disorder prediction outperform the baselines and reach accuracy and F1-scores of above 86%. Fea…

    Submitted 27 March, 2019; v1 submitted 12 November, 2018; originally announced November 2018.

    Comments: WASSA at EMNLP 2018

    Journal ref: WASSA@EMNLP 2018: 72-78

  21. arXiv:1808.10503  [pdf, other]

    cs.CL

    Iterative Recursive Attention Model for Interpretable Sequence Classification

    Authors: Martin Tutek, Jan Šnajder

    Abstract: Natural language processing has greatly benefited from the introduction of the attention mechanism. However, standard attention models are of limited interpretability for tasks that involve a series of inference steps. We describe an iterative recursive attention model, which constructs incremental representations of input data through reusing results of previously computed queries. We train our m…

    Submitted 30 August, 2018; originally announced August 2018.

    Comments: 7 pages, 5 figures, Analyzing and interpreting neural networks for NLP Workshop at EMNLP 2018

  22. arXiv:1701.00168  [pdf, ps, other]

    cs.CL

    Social Media Argumentation Mining: The Quest for Deliberateness in Raucousness

    Authors: Jan Šnajder

    Abstract: Argumentation mining from social media content has attracted increasing attention. The task is both challenging and rewarding. The informal nature of user-generated content makes the task dauntingly difficult. On the other hand, the insights that could be gained by a large-scale analysis of social media argumentation make it a very worthwhile task. In this position paper I discuss the motivation f…

    Submitted 31 December, 2016; originally announced January 2017.