Showing 1–50 of 64 results for author: Aletras, N

Searching in archive cs.
  1. arXiv:2406.11477  [pdf]

    cs.CL cs.AI

    Vocabulary Expansion for Low-resource Cross-lingual Transfer

    Authors: Atsuki Yamaguchi, Aline Villavicencio, Nikolaos Aletras

    Abstract: Large language models (LLMs) have shown remarkable capabilities in many languages beyond English. Yet, LLMs require more inference steps when generating non-English text due to their reliance on English-centric tokenizers, vocabulary, and pre-training data, resulting in higher usage costs to non-English speakers. Vocabulary expansion with target language tokens is a widely used cross-lingual vocab…

    Submitted 17 June, 2024; originally announced June 2024.

  2. arXiv:2403.16668  [pdf, other]

    cs.CL cs.SI

    Who is bragging more online? A large scale analysis of bragging in social media

    Authors: Mali Jin, Daniel Preoţiuc-Pietro, A. Seza Doğruöz, Nikolaos Aletras

    Abstract: Bragging is the act of uttering statements that are likely to be positively viewed by others and it is extensively employed in human communication with the aim to build a positive self-image of oneself. Social media is a natural platform for users to employ bragging in order to gain admiration, respect, attention and followers from their audiences. Yet, little is known about the scale of bragging…

    Submitted 25 March, 2024; originally announced March 2024.

    Comments: Accepted at LREC-COLING 2024

  3. arXiv:2403.12809  [pdf, other]

    cs.CL cs.AI

    Comparing Explanation Faithfulness between Multilingual and Monolingual Fine-tuned Language Models

    Authors: Zhixue Zhao, Nikolaos Aletras

    Abstract: In many real natural language processing application scenarios, practitioners not only aim to maximize predictive performance but also seek faithful explanations for the model predictions. Rationales and importance distribution given by feature attribution methods (FAs) provide insights into how different parts of the input contribute to a prediction. Previous studies have explored how different f…

    Submitted 19 March, 2024; originally announced March 2024.

    Comments: Accepted at NAACL 2024 Main Conference

  4. arXiv:2402.10712  [pdf, other]

    cs.CL cs.AI

    An Empirical Study on Cross-lingual Vocabulary Adaptation for Efficient Language Model Inference

    Authors: Atsuki Yamaguchi, Aline Villavicencio, Nikolaos Aletras

    Abstract: The development of state-of-the-art generative large language models (LLMs) disproportionately relies on English-centric tokenizers, vocabulary and pre-training data. Despite the fact that some LLMs have multilingual capabilities, recent studies have shown that their inference efficiency deteriorates when generating text in languages other than English. This results in increased inference time and…

    Submitted 17 June, 2024; v1 submitted 16 February, 2024; originally announced February 2024.

  5. arXiv:2401.03831  [pdf, other]

    cs.CL cs.LG

    We Need to Talk About Classification Evaluation Metrics in NLP

    Authors: Peter Vickers, Loïc Barrault, Emilio Monti, Nikolaos Aletras

    Abstract: In Natural Language Processing (NLP) classification tasks such as topic categorisation and sentiment analysis, model generalizability is generally measured with standard metrics such as Accuracy, F-Measure, or AUC-ROC. The diversity of metrics, and the arbitrariness of their application suggest that there is no agreement within NLP on a single best metric to use. This lack suggests there has not b…

    Submitted 8 January, 2024; originally announced January 2024.

    Comments: Appeared in AACL 2023

  6. arXiv:2311.09755  [pdf, other]

    cs.CL

    How Does Calibration Data Affect the Post-training Pruning and Quantization of Large Language Models?

    Authors: Miles Williams, Nikolaos Aletras

    Abstract: Pruning and quantization form the foundation of model compression for neural networks, enabling efficient inference for large language models (LLMs). Recently, various quantization and pruning techniques have demonstrated state-of-the-art performance in a post-training setting. They rely upon calibration data, a small set of unlabeled examples, to generate layer activations. However, no prior work…

    Submitted 16 November, 2023; originally announced November 2023.

  7. arXiv:2311.09335  [pdf, other]

    cs.CL cs.AI

    Investigating Hallucinations in Pruned Large Language Models for Abstractive Summarization

    Authors: George Chrysostomou, Zhixue Zhao, Miles Williams, Nikolaos Aletras

    Abstract: Despite the remarkable performance of generative large language models (LLMs) on abstractive summarization, they face two significant challenges: their considerable size and tendency to hallucinate. Hallucinations are concerning because they erode reliability and raise safety issues. Pruning is a technique that reduces model size by removing redundant weights, enabling more efficient sparse infere…

    Submitted 29 January, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

  8. arXiv:2310.17271  [pdf, other]

    cs.CL

    Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?

    Authors: Ahmed Alajrami, Katerina Margatina, Nikolaos Aletras

    Abstract: Understanding how and what pre-trained language models (PLMs) learn about language is an open challenge in natural language processing. Previous work has focused on identifying whether they capture semantic and syntactic information, and how the data or the pre-training objective affects their performance. However, to the best of our knowledge, no previous work has specifically examined how inform…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: To appear at EMNLP 2023

  9. arXiv:2310.07911  [pdf, other]

    cs.CL

    Pit One Against Many: Leveraging Attention-head Embeddings for Parameter-efficient Multi-head Attention

    Authors: Huiyin Xue, Nikolaos Aletras

    Abstract: Scaling pre-trained language models has resulted in large performance gains in various natural language processing tasks but comes with a large cost in memory requirements. Inspired by the position embeddings in transformers, we aim to simplify and reduce the memory footprint of the multi-head attention (MHA) mechanism. We propose an alternative module that uses only a single shared projection mat…

    Submitted 11 October, 2023; originally announced October 2023.

    Comments: Accepted at EMNLP 2023 Findings

  10. arXiv:2310.05553  [pdf, other]

    cs.CL

    Regulation and NLP (RegNLP): Taming Large Language Models

    Authors: Catalina Goanta, Nikolaos Aletras, Ilias Chalkidis, Sofia Ranchordas, Gerasimos Spanakis

    Abstract: The scientific innovation in Natural Language Processing (NLP) and more broadly in artificial intelligence (AI) is at its fastest pace to date. As large language models (LLMs) unleash a new era of automation, important debates emerge regarding the benefits and risks of their development, deployment and use. Currently, these debates have been dominated by often polarized narratives mainly led by th…

    Submitted 9 October, 2023; originally announced October 2023.

    Comments: 9 pages, long paper at EMNLP 2023 proceedings

  11. arXiv:2309.11576  [pdf, other]

    cs.CL

    Examining the Limitations of Computational Rumor Detection Models Trained on Static Datasets

    Authors: Yida Mu, Xingyi Song, Kalina Bontcheva, Nikolaos Aletras

    Abstract: A crucial aspect of a rumor detection model is its ability to generalize, particularly its ability to detect emerging, previously unknown rumors. Past research has indicated that content-based (i.e., using solely source posts as input) rumor detection models tend to perform less effectively on unseen rumors. At the same time, the potential of context-based models remains largely untapped. The main…

    Submitted 23 March, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

    Comments: Accepted at LREC-COLING 2024

  12. arXiv:2309.08708  [pdf, other]

    cs.CL

    Frustratingly Simple Memory Efficiency for Pre-trained Language Models via Dynamic Embedding Pruning

    Authors: Miles Williams, Nikolaos Aletras

    Abstract: The extensive memory footprint of pre-trained language models (PLMs) can hinder deployment in memory-constrained settings, such as cloud environments or on-device. PLMs use embedding matrices to represent extensive vocabularies, forming a large proportion of the model parameters. While previous work towards parameter-efficient PLM development has considered pruning parameters within the transforme…

    Submitted 15 September, 2023; originally announced September 2023.

  13. arXiv:2309.07794  [pdf, other]

    cs.CL cs.LG cs.SI

    Improving Multimodal Classification of Social Media Posts by Leveraging Image-Text Auxiliary Tasks

    Authors: Danae Sánchez Villegas, Daniel Preoţiuc-Pietro, Nikolaos Aletras

    Abstract: Effectively leveraging multimodal information from social media posts is essential to various downstream tasks such as sentiment analysis, sarcasm detection or hate speech classification. Jointly modeling text and images is challenging because cross-modal semantics might be hidden or the relation between image and text is weak. However, prior work on multimodal classification of social media posts…

    Submitted 3 February, 2024; v1 submitted 14 September, 2023; originally announced September 2023.

    Comments: Accepted at EACL 2024 Findings

  14. arXiv:2309.03064  [pdf, other]

    cs.CL cs.AI cs.CV

    A Multimodal Analysis of Influencer Content on Twitter

    Authors: Danae Sánchez Villegas, Catalina Goanta, Nikolaos Aletras

    Abstract: Influencer marketing involves a wide range of strategies in which brands collaborate with popular content creators (i.e., influencers) to leverage their reach, trust, and impact on their audience to promote and endorse products or services. Because followers of influencers are more likely to buy a product after receiving an authentic product endorsement rather than an explicit direct product promo…

    Submitted 6 September, 2023; originally announced September 2023.

    Comments: Accepted at AACL 2023

  15. arXiv:2305.16798  [pdf, other]

    cs.CL cs.AI

    Schema-Guided User Satisfaction Modeling for Task-Oriented Dialogues

    Authors: Yue Feng, Yunlong Jiao, Animesh Prasad, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai

    Abstract: User Satisfaction Modeling (USM) is one of the popular choices for task-oriented dialogue systems evaluation, where user satisfaction typically depends on whether the user's task goals were fulfilled by the system. Task-oriented dialogue systems use task schema, which is a set of task attributes, to encode the user's task goals. Existing studies on USM neglect explicitly modeling the user's task g…

    Submitted 26 May, 2023; originally announced May 2023.

  16. arXiv:2305.14310  [pdf, ps, other]

    cs.CL

    Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science

    Authors: Yida Mu, Ben P. Wu, William Thorne, Ambrose Robinson, Nikolaos Aletras, Carolina Scarton, Kalina Bontcheva, Xingyi Song

    Abstract: Instruction-tuned Large Language Models (LLMs) have exhibited impressive language understanding and the capacity to generate responses that follow specific prompts. However, due to the computational demands associated with training these models, their applications often adopt a zero-shot setting. In this paper, we evaluate the zero-shot performance of two publicly accessible LLMs, ChatGPT and Open…

    Submitted 24 March, 2024; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: Accepted at LREC-COLING 2024

  17. arXiv:2305.14264  [pdf, other]

    cs.CL cs.AI

    Active Learning Principles for In-Context Learning with Large Language Models

    Authors: Katerina Margatina, Timo Schick, Nikolaos Aletras, Jane Dwivedi-Yu

    Abstract: The remarkable advancements in large language models (LLMs) have significantly enhanced the performance in few-shot learning settings. By using only a small number of labeled examples, referred to as demonstrations, LLMs can effectively grasp the task at hand through in-context learning. However, the process of selecting appropriate demonstrations has received limited attention in prior work. This…

    Submitted 22 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: To appear at Findings of EMNLP (Camera Ready version)

  18. arXiv:2305.13342  [pdf, other]

    cs.LG cs.CL

    On the Limitations of Simulating Active Learning

    Authors: Katerina Margatina, Nikolaos Aletras

    Abstract: Active learning (AL) is a human-and-model-in-the-loop paradigm that iteratively selects informative unlabeled data for human annotation, aiming to improve over random sampling. However, performing AL experiments with human annotations on-the-fly is a laborious and expensive process, thus unrealistic for academic research. An easy fix to this impediment is to simulate AL, by treating an already lab…

    Submitted 21 May, 2023; originally announced May 2023.

    Comments: To appear at Findings of ACL 2023

  19. arXiv:2305.13002  [pdf, other]

    cs.CL cs.AI cs.LG

    Rethinking Semi-supervised Learning with Language Models

    Authors: Zhengxiang Shi, Francesco Tonolini, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai, Yunlong Jiao

    Abstract: Semi-supervised learning (SSL) is a popular setting aiming to effectively utilize unlabelled data to improve model performance in downstream natural language processing (NLP) tasks. Currently, there are two popular approaches to make use of unlabelled data: Self-training (ST) and Task-adaptive pre-training (TAPT). ST uses a teacher model to assign pseudo-labels to the unlabelled data, while TAPT c…

    Submitted 22 May, 2023; originally announced May 2023.

    Comments: Findings of ACL 2023. Code is available at https://github.com/amzn/pretraining-or-self-training

  20. arXiv:2305.11034  [pdf, other]

    cs.CL

    Trading Syntax Trees for Wordpieces: Target-oriented Opinion Words Extraction with Wordpieces and Aspect Enhancement

    Authors: Samuel Mensah, Kai Sun, Nikolaos Aletras

    Abstract: State-of-the-art target-oriented opinion word extraction (TOWE) models typically use BERT-based text encoders that operate on the word level, along with graph convolutional networks (GCNs) that incorporate syntactic information extracted from syntax trees. These methods achieve limited gains with GCNs and have difficulty using BERT wordpieces. Meanwhile, BERT wordpieces are known to be effective a…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  21. arXiv:2305.10496  [pdf, other]

    cs.CL cs.AI cs.LG

    Incorporating Attribution Importance for Improving Faithfulness Metrics

    Authors: Zhixue Zhao, Nikolaos Aletras

    Abstract: Feature attribution methods (FAs) are popular approaches for providing insights into the model reasoning process of making predictions. The more faithful a FA is, the more accurately it reflects which parts of the input are more important for the prediction. Widely used faithfulness metrics, such as sufficiency and comprehensiveness use a hard erasure criterion, i.e. entirely removing or retaining…

    Submitted 17 May, 2023; originally announced May 2023.

    Comments: Accepted at ACL 2023

  22. arXiv:2302.14719  [pdf, other]

    cs.CL cs.LG

    Self-training through Classifier Disagreement for Cross-Domain Opinion Target Extraction

    Authors: Kai Sun, Richong Zhang, Samuel Mensah, Nikolaos Aletras, Yongyi Mao, Xudong Liu

    Abstract: Opinion target extraction (OTE) or aspect extraction (AE) is a fundamental task in opinion mining that aims to extract the targets (or aspects) on which opinions have been expressed. Recent work focuses on cross-domain OTE, which is typically encountered in real-world scenarios, where the testing and training distributions differ. Most methods use domain adversarial neural networks that aim to reduc…

    Submitted 28 February, 2023; originally announced February 2023.

    Comments: Accepted at TheWebConf 2023

  23. arXiv:2302.03147  [pdf, other]

    cs.CL

    It's about Time: Rethinking Evaluation on Rumor Detection Benchmarks using Chronological Splits

    Authors: Yida Mu, Kalina Bontcheva, Nikolaos Aletras

    Abstract: New events emerge over time influencing the topics of rumors in social media. Current rumor detection benchmarks use random splits as training, development and test sets which typically results in topical overlaps. Consequently, models trained on random splits may not perform well on rumor classification on previously unseen topics due to the temporal concept drift. In this paper, we provide a re-…

    Submitted 6 February, 2023; originally announced February 2023.

    Comments: Accepted at EACL 2023 Findings

  24. arXiv:2210.09197  [pdf, other]

    cs.CL cs.AI cs.LG

    On the Impact of Temporal Concept Drift on Model Explanations

    Authors: Zhixue Zhao, George Chrysostomou, Kalina Bontcheva, Nikolaos Aletras

    Abstract: Explanation faithfulness of model predictions in natural language processing is typically evaluated on held-out data from the same temporal distribution as the training data (i.e. synchronous settings). While model performance often deteriorates due to temporal variation (i.e. temporal concept drift), it is currently unknown how explanation faithfulness is impacted when the time span of the target…

    Submitted 17 October, 2022; originally announced October 2022.

    Comments: Accepted at EMNLP Findings 2022

  25. arXiv:2210.07904  [pdf, other]

    cs.CL

    HashFormers: Towards Vocabulary-independent Pre-trained Transformers

    Authors: Huiyin Xue, Nikolaos Aletras

    Abstract: Transformer-based pre-trained language models are vocabulary-dependent, mapping by default each token to its corresponding embedding. This one-to-one mapping results in embedding matrices that occupy a lot of memory (i.e. millions of parameters) and grow linearly with the size of the vocabulary. Previous work on on-device transformers dynamically generate token embeddings on-the-fly without embe…

    Submitted 29 October, 2022; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: Accepted at EMNLP 2022

  26. arXiv:2210.05999  [pdf, other]

    cs.CL

    Improving Graph-Based Text Representations with Character and Word Level N-grams

    Authors: Wenzhe Li, Nikolaos Aletras

    Abstract: Graph-based text representation focuses on how text documents are represented as graphs for exploiting dependency information between tokens and documents within a corpus. Despite the increasing interest in graph representation learning, there is limited research in exploring new ways for graph-based text representation, which is important in downstream natural language processing tasks. In this p…

    Submitted 12 October, 2022; originally announced October 2022.

  27. arXiv:2209.08681  [pdf, other]

    cs.CL

    Domain Classification-based Source-specific Term Penalization for Domain Adaptation in Hate-speech Detection

    Authors: Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr

    Abstract: State-of-the-art approaches for hate-speech detection usually exhibit poor performance in out-of-domain settings. This occurs, typically, due to classifiers overemphasizing source-specific information that negatively impacts its domain invariance. Prior work has attempted to penalize terms related to hate-speech from manually curated lists using feature attribution methods, which quantify the impo…

    Submitted 18 September, 2022; originally announced September 2022.

    Comments: COLING 2022 pre-print

  28. arXiv:2205.03313  [pdf, other]

    cs.CL

    Combining Humor and Sarcasm for Improving Political Parody Detection

    Authors: Xiao Ao, Danae Sánchez Villegas, Daniel Preoţiuc-Pietro, Nikolaos Aletras

    Abstract: Parody is a figurative device used for mimicking entities for comedic or critical purposes. Parody is intentionally humorous and often involves sarcasm. This paper explores jointly modelling these figurative tropes with the goal of improving performance of political parody detection in tweets. To this end, we present a multi-encoder model that combines three parallel encoders to enrich parody-spec…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: Accepted at NAACL 2022

  29. arXiv:2204.10293  [pdf, other]

    cs.CL cs.AI

    A Hierarchical N-Gram Framework for Zero-Shot Link Prediction

    Authors: Mingchen Li, Junfan Chen, Samuel Mensah, Nikolaos Aletras, Xiulong Yang, Yang Ye

    Abstract: Due to the incompleteness of knowledge graphs (KGs), zero-shot link prediction (ZSLP) which aims to predict unobserved relations in KGs has attracted recent interest from researchers. A common solution is to use textual features of relations (e.g., surface name or textual descriptions) as auxiliary information to bridge the gap between seen and unseen relations. Current approaches learn an embeddi…

    Submitted 6 January, 2023; v1 submitted 16 April, 2022; originally announced April 2022.

    Comments: Published as a conference paper at EMNLP Findings 2022

  30. arXiv:2204.10080  [pdf, other]

    cs.CL cs.SI

    Identifying and Characterizing Active Citizens who Refute Misinformation in Social Media

    Authors: Yida Mu, Pu Niu, Nikolaos Aletras

    Abstract: The phenomenon of misinformation spreading in social media has developed a new form of active citizens who focus on tackling the problem by refuting posts that might contain misinformation. Automatically identifying and characterizing the behavior of such active citizens in social media is an important task in computational social science for complementing studies in misinformation analysis. In th…

    Submitted 21 April, 2022; originally announced April 2022.

    Comments: Accepted at ACM WebSci 2022

  31. arXiv:2203.12536  [pdf, other]

    cs.CL

    Dynamically Refined Regularization for Improving Cross-corpora Hate Speech Detection

    Authors: Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr

    Abstract: Hate speech classifiers exhibit substantial performance degradation when evaluated on datasets different from the source. This is due to learning spurious correlations between words that are not necessarily relevant to hateful language, and hate speech labels from the training corpus. Previous work has attempted to mitigate this problem by regularizing specific terms from pre-defined static dictio…

    Submitted 23 March, 2022; originally announced March 2022.

    Comments: Findings of ACL 2022 preprint

  32. arXiv:2203.10415  [pdf, other]

    cs.CL

    How does the pre-training objective affect what large language models learn about linguistic properties?

    Authors: Ahmed Alajrami, Nikolaos Aletras

    Abstract: Several pre-training objectives, such as masked language modeling (MLM), have been proposed to pre-train language models (e.g. BERT) with the aim of learning better language representations. However, to the best of our knowledge, no previous work so far has investigated how different pre-training objectives affect what BERT learns about linguistic properties. We hypothesize that linguistically mo…

    Submitted 19 March, 2022; originally announced March 2022.

    Comments: Accepted at ACL 2022

  33. arXiv:2203.05840  [pdf, other]

    cs.CL

    Automatic Identification and Classification of Bragging in Social Media

    Authors: Mali Jin, Daniel Preoţiuc-Pietro, A. Seza Doğruöz, Nikolaos Aletras

    Abstract: Bragging is a speech act employed with the goal of constructing a favorable self-image through positive statements about oneself. It is widespread in daily communication and especially popular in social media, where users aim to build a positive image of their persona directly or indirectly. In this paper, we present the first large scale study of bragging in computational linguistics, building on…

    Submitted 11 March, 2022; originally announced March 2022.

    Comments: Accepted at ACL 2022

  34. arXiv:2203.00056  [pdf, other]

    cs.CL

    An Empirical Study on Explanations in Out-of-Domain Settings

    Authors: George Chrysostomou, Nikolaos Aletras

    Abstract: Recent work in Natural Language Processing has focused on developing approaches that extract faithful explanations, either via identifying the most important tokens in the input (i.e. post-hoc explanations) or by designing inherently faithful models that first select the most important tokens and then use them to predict the correct label (i.e. select-then-predict models). Currently, these approac…

    Submitted 28 February, 2022; originally announced March 2022.

    Comments: ACL2022 Pre-print

  35. arXiv:2110.00976  [pdf, other]

    cs.CL

    LexGLUE: A Benchmark Dataset for Legal Language Understanding in English

    Authors: Ilias Chalkidis, Abhik Jana, Dirk Hartung, Michael Bommarito, Ion Androutsopoulos, Daniel Martin Katz, Nikolaos Aletras

    Abstract: Laws and their interpretations, legal arguments and agreements are typically expressed in writing, leading to the production of vast corpora of legal text. Their analysis, which is at the center of legal practice, becomes increasingly elaborate as these collections grow in size. Natural language understanding (NLU) technologies can be a valuable tool to support legal practitioners in these endeav…

    Submitted 8 November, 2022; v1 submitted 3 October, 2021; originally announced October 2021.

    Comments: 9 pages, long paper at ACL 2022 proceedings. LexGLUE benchmark is available at: https://huggingface.co/datasets/lex_glue. Code is available at: https://github.com/coastalcph/lex-glue. Update TFIDF-SVM scores in the last version

  36. arXiv:2109.03764  [pdf, other]

    cs.CL cs.AI cs.LG

    Active Learning by Acquiring Contrastive Examples

    Authors: Katerina Margatina, Giorgos Vernikos, Loïc Barrault, Nikolaos Aletras

    Abstract: Common acquisition functions for active learning use either uncertainty or diversity sampling, aiming to select difficult and diverse data points from the pool of unlabeled data, respectively. In this work, leveraging the best of both worlds, we propose an acquisition function that opts for selecting contrastive examples, i.e. data points that are similar in the model feature space and ye…

    Submitted 8 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021

  37. arXiv:2109.01819  [pdf, other]

    cs.CL cs.AI cs.LG

    Frustratingly Simple Pretraining Alternatives to Masked Language Modeling

    Authors: Atsuki Yamaguchi, George Chrysostomou, Katerina Margatina, Nikolaos Aletras

    Abstract: Masked language modeling (MLM), a self-supervised pretraining objective, is widely used in natural language processing for learning text representations. MLM trains a model to predict a random sample of input tokens that have been replaced by a [MASK] placeholder in a multi-class setting over the entire vocabulary. When pretraining, it is common to use alongside MLM other auxiliary objectives on t…

    Submitted 4 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021

  38. arXiv:2109.01238  [pdf, other]

    cs.CL cs.AI

    An Empirical Study on Leveraging Position Embeddings for Target-oriented Opinion Words Extraction

    Authors: Samuel Mensah, Kai Sun, Nikolaos Aletras

    Abstract: Target-oriented opinion words extraction (TOWE) (Fan et al., 2019b) is a new subtask of target-oriented sentiment analysis that aims to extract opinion words for a given aspect in text. Current state-of-the-art methods leverage position embeddings to capture the relative position of a word to the target. However, the performance of these methods depends on the ability to incorporate this informati…

    Submitted 2 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021

    ACM Class: I.2.7

  39. arXiv:2109.00602  [pdf, other]

    cs.CL

    Point-of-Interest Type Prediction using Text and Images

    Authors: Danae Sánchez Villegas, Nikolaos Aletras

    Abstract: Point-of-interest (POI) type prediction is the task of inferring the type of a place from where a social media post was shared. Inferring a POI's type is useful for studies in computational social science including sociolinguistics, geosemiotics, and cultural geography, and has applications in geosocial networking technologies such as recommendation and visualization systems. Prior efforts in POI…

    Submitted 1 September, 2021; originally announced September 2021.

    Comments: Accepted at EMNLP 2021

  40. arXiv:2108.13759  [pdf, other]

    cs.CL

    Enjoy the Salience: Towards Better Transformer-based Faithful Explanations with Word Salience

    Authors: George Chrysostomou, Nikolaos Aletras

    Abstract: Pretrained transformer-based models such as BERT have demonstrated state-of-the-art predictive performance when adapted into a range of natural language processing tasks. An open problem is how to improve the faithfulness of explanations (rationales) for the predictions of these models. In this paper, we hypothesize that salient information extracted a priori from the training data can complement…

    Submitted 31 August, 2021; originally announced August 2021.

    Comments: EMNLP 2021 Pre-print

  41. arXiv:2108.12197  [pdf, other]

    cs.CL

    Translation Error Detection as Rationale Extraction

    Authors: Marina Fomicheva, Lucia Specia, Nikolaos Aletras

    Abstract: Recent Quality Estimation (QE) models based on multilingual pre-trained representations have achieved very competitive results when predicting the overall quality of translated sentences. Predicting translation errors, i.e. detecting specifically which words are incorrect, is a more challenging task, especially with limited amounts of training data. We hypothesize that, not unlike humans, successf…

    Submitted 27 August, 2021; originally announced August 2021.

  42. arXiv:2107.00411  [pdf, other]

    cs.CL

    Knowledge Distillation for Quality Estimation

    Authors: Amit Gajbhiye, Marina Fomicheva, Fernando Alva-Manchego, Frédéric Blain, Abiola Obamuyide, Nikolaos Aletras, Lucia Specia

    Abstract: Quality Estimation (QE) is the task of automatically predicting Machine Translation quality in the absence of reference translations, making it applicable in real-time settings, such as translating online social media conversations. Recent success in QE stems from the use of multilingual pre-trained representations, where very large models lead to impressive results. However, the inference time, d…

    Submitted 1 July, 2021; originally announced July 2021.

    Comments: ACL Findings 2021

  43. arXiv:2105.04047  [pdf, other]

    cs.CL

    Analyzing Online Political Advertisements

    Authors: Danae Sánchez Villegas, Saeid Mokaram, Nikolaos Aletras

    Abstract: Online political advertising is a central aspect of modern election campaigning for influencing public opinion. Computational analysis of political ads is of utmost importance in political science to understand the characteristics of digital campaigning. It is also important in computational linguistics to study features of political discourse and communication on a large scale. In this work, we p…

    Submitted 26 May, 2021; v1 submitted 9 May, 2021; originally announced May 2021.

    Comments: Accepted at ACL Findings 2021

  44. arXiv:2105.02751  [pdf, other]

    cs.CL cs.AI

    On the Ethical Limits of Natural Language Processing on Legal Text

    Authors: Dimitrios Tsarapatsanis, Nikolaos Aletras

    Abstract: Natural language processing (NLP) methods for analyzing legal text offer legal scholars and practitioners a range of tools allowing to empirically analyze law on a large scale. However, researchers seem to struggle when it comes to identifying ethical limits to using NLP systems for acquiring genuine insights both about the law and the systems' predictive capacity. In this paper we set out a numbe…

    Submitted 25 May, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

    Comments: Accepted at ACL Findings 2021

  45. arXiv:2105.02657  [pdf, other]

    cs.CL

    Improving the Faithfulness of Attention-based Explanations with Task-specific Information for Text Classification

    Authors: George Chrysostomou, Nikolaos Aletras

    Abstract: Neural network architectures in natural language processing often use attention mechanisms to produce probability distributions over input token representations. Attention has empirically been demonstrated to improve performance in various tasks, while its weights have been extensively used as explanations for model predictions. Recent studies (Jain and Wallace, 2019; Serrano and Smith, 2019; Wieg…

    Submitted 7 May, 2021; v1 submitted 6 May, 2021; originally announced May 2021.

    Comments: NLP Interpretability; Accepted at ACL 2021

  46. arXiv:2104.08320  [pdf, other]

    cs.CL

    On the Importance of Effectively Adapting Pretrained Language Models for Active Learning

    Authors: Katerina Margatina, Loïc Barrault, Nikolaos Aletras

    Abstract: Recent Active Learning (AL) approaches in Natural Language Processing (NLP) proposed using off-the-shelf pretrained language models (LMs). In this paper, we argue that these LMs are not adapted effectively to the downstream task during AL and we explore ways to address this issue. We suggest to first adapt the pretrained LM to the target task by continuing training with all the available unlabeled…

    Submitted 2 March, 2022; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: To appear at ACL 2022

  47. arXiv:2104.08219  [pdf, other]

    cs.CL cs.AI

    Flexible Instance-Specific Rationalization of NLP Models

    Authors: George Chrysostomou, Nikolaos Aletras

    Abstract: Recent research on model interpretability in natural language processing extensively uses feature scoring methods for identifying which parts of the input are the most important for a model to make a prediction (i.e. explanation or rationale). However, previous research has shown that there is no clear best scoring method across various text classification tasks while practitioners typically have…

    Submitted 6 December, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

    Comments: NLP Interpretability; Accepted at AAAI 2022

  48. arXiv:2103.13084  [pdf, other]

    cs.CL

    Paragraph-level Rationale Extraction through Regularization: A case study on European Court of Human Rights Cases

    Authors: Ilias Chalkidis, Manos Fergadiotis, Dimitrios Tsarapatsanis, Nikolaos Aletras, Ion Androutsopoulos, Prodromos Malakasiotis

    Abstract: Interpretability or explainability is an emerging research field in NLP. From a user-centric point of view, the goal is to build models that provide proper justification for their decisions, similar to those of humans, by requiring the models to satisfy additional constraints. To this end, we introduce a new application on legal text where, contrary to mainstream literature targeting word-level ra…

    Submitted 24 March, 2021; originally announced March 2021.

    Comments: 9 pages, long paper at NAACL 2021 proceedings

  49. arXiv:2103.12428  [pdf, other]

    cs.CL

    Modeling the Severity of Complaints in Social Media

    Authors: Mali Jin, Nikolaos Aletras

    Abstract: The speech act of complaining is used by humans to communicate a negative mismatch between reality and expectations as a reaction to an unfavorable situation. Linguistic theory of pragmatics categorizes complaints into various severity levels based on the face-threat that the complainer is willing to undertake. This is particularly useful for understanding the intent of complainers and how humans…

    Submitted 23 March, 2021; originally announced March 2021.

    Comments: Accepted at NAACL 2021

  50. arXiv:2010.10910  [pdf, ps, other]

    cs.CL

    Complaint Identification in Social Media with Transformer Networks

    Authors: Mali Jin, Nikolaos Aletras

    Abstract: Complaining is a speech act extensively used by humans to communicate a negative inconsistency between reality and expectations. Previous work on automatically identifying complaints in social media has focused on using feature-based and task-specific neural network models. Adapting state-of-the-art pre-trained neural language models and their combinations with other linguistic information from to…

    Submitted 21 October, 2020; originally announced October 2020.

    Comments: Accepted at COLING 2020