Showing 1–27 of 27 results for author: Yaghoobzadeh, Y

  1. arXiv:2404.02474

    cs.CL cs.AI cs.IR cs.LG

    uTeBC-NLP at SemEval-2024 Task 9: Can LLMs be Lateral Thinkers?

    Authors: Pouya Sadeghi, Amirhossein Abaskohi, Yadollah Yaghoobzadeh

    Abstract: Inspired by human cognition, Jiang et al. (2023c) create a benchmark for assessing LLMs' lateral thinking (thinking outside the box). Building upon this benchmark, we investigate how different prompting methods enhance LLMs' performance on this task to reveal their inherent power for outside-the-box thinking ability. Through participating in SemEval-2024, task 9, Sentence Puzzle sub-task, we explore…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 12 pages, 5 figures, 6 tables, Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024) @ NAACL 2024

  2. arXiv:2404.02403

    cs.CL cs.LG

    Benchmarking Large Language Models for Persian: A Preliminary Study Focusing on ChatGPT

    Authors: Amirhossein Abaskohi, Sara Baruni, Mostafa Masoudi, Nesa Abbasi, Mohammad Hadi Babalou, Ali Edalat, Sepehr Kamahi, Samin Mahdizadeh Sani, Nikoo Naghavian, Danial Namazifard, Pouya Sadeghi, Yadollah Yaghoobzadeh

    Abstract: This paper explores the efficacy of large language models (LLMs) for Persian. While ChatGPT and consequent LLMs have shown remarkable performance in English, their efficiency for more low-resource languages remains an open question. We present the first comprehensive benchmarking study of LLMs across diverse Persian language tasks. Our primary focus is on GPT-3.5-turbo, but we also include GPT-4 a…

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: 14 pages, 1 figure, 6 tables, Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING)

  3. arXiv:2310.12118

    cs.CL

    Harnessing Dataset Cartography for Improved Compositional Generalization in Transformers

    Authors: Osman Batur İnce, Tanin Zeraati, Semih Yagcioglu, Yadollah Yaghoobzadeh, Erkut Erdem, Aykut Erdem

    Abstract: Neural networks have revolutionized language modeling and excelled in various downstream tasks. However, the extent to which these models achieve compositional generalization comparable to human cognitive abilities remains a topic of debate. While existing approaches in the field have mainly focused on novel architectures and alternative learning paradigms, we introduce a pioneering method harness…

    Submitted 18 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of EMNLP 2023

  4. arXiv:2306.02873

    cs.CL

    DecompX: Explaining Transformers Decisions by Propagating Token Decomposition

    Authors: Ali Modarressi, Mohsen Fayyaz, Ehsan Aghazadeh, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar

    Abstract: An emerging solution for explaining Transformer-based models is to use vector-based analysis on how the representations are formed. However, providing a faithful vector-based explanation for a multi-layer model could be challenging in three aspects: (1) Incorporating all components into the analysis, (2) Aggregating the layer dynamics to determine the information flow and mixture throughout the en…

    Submitted 5 June, 2023; originally announced June 2023.

    Comments: Accepted to ACL 2023 (main conference)

  5. LM-CPPF: Paraphrasing-Guided Data Augmentation for Contrastive Prompt-Based Few-Shot Fine-Tuning

    Authors: Amirhossein Abaskohi, Sascha Rothe, Yadollah Yaghoobzadeh

    Abstract: In recent years, there has been significant progress in developing pre-trained language models for NLP. However, these models often struggle when fine-tuned on small datasets. To address this issue, researchers have proposed various adaptation approaches. Prompt-based tuning is arguably the most common way, especially for larger models. Previous research shows that adding contrastive learning to p…

    Submitted 5 July, 2023; v1 submitted 29 May, 2023; originally announced May 2023.

    Comments: 10 pages, 1 figure, 8 tables, 1 algorithm Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics

    Journal ref: https://aclanthology.org/2023.acl-short

  6. arXiv:2304.01282

    cs.CL

    PEACH: Pre-Training Sequence-to-Sequence Multilingual Models for Translation with Semi-Supervised Pseudo-Parallel Document Generation

    Authors: Alireza Salemi, Amirhossein Abaskohi, Sara Tavakoli, Yadollah Yaghoobzadeh, Azadeh Shakery

    Abstract: Multilingual pre-training significantly improves many multilingual NLP tasks, including machine translation. Most existing methods are based on some variants of masked language modeling and text-denoising objectives on monolingual data. Multilingual pre-training on monolingual data ignores the availability of parallel data in many language pairs. Also, some other works integrate the available huma…

    Submitted 14 April, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

    Comments: 15 pages, 5 figures, 16 tables, 1 algorithm, LoResMT@EACL 2023

    Journal ref: https://aclanthology.org/2023.loresmt-1.3

  7. arXiv:2211.05610

    cs.CL

    BERT on a Data Diet: Finding Important Examples by Gradient-Based Pruning

    Authors: Mohsen Fayyaz, Ehsan Aghazadeh, Ali Modarressi, Mohammad Taher Pilehvar, Yadollah Yaghoobzadeh, Samira Ebrahimi Kahou

    Abstract: Current pre-trained language models rely on large datasets for achieving state-of-the-art performance. However, past research has shown that not all examples in a dataset are equally important during training. In fact, it is sometimes possible to prune a considerable fraction of the training set while maintaining the test performance. Established on standard vision benchmarks, two gradient-based s…

    Submitted 28 November, 2022; v1 submitted 10 November, 2022; originally announced November 2022.

    Comments: ENLSP @ NeurIPS2022

  8. arXiv:2211.03862

    cs.CL

    Looking at the Overlooked: An Analysis on the Word-Overlap Bias in Natural Language Inference

    Authors: Sara Rajaee, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar

    Abstract: It has been shown that NLI models are usually biased with respect to the word-overlap between premise and hypothesis; they take this feature as a primary cue for predicting the entailment label. In this paper, we focus on an overlooked aspect of the overlap bias in NLI models: the reverse word-overlap bias. Our experimental results demonstrate that current NLI models are highly biased towards the…

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: Accepted at EMNLP 2022

  9. arXiv:2206.04615

    cs.CL cs.AI cs.CY cs.LG stat.ML

    Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

    Authors: Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza , et al. (426 additional authors not shown)

    Abstract: Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their potentially transformative impact, these new capabilities are as yet poorly characterized. In order to inform future research, prepare for disruptive new model capabilities, and ameliorate socially harmful effects, it is vital that we understand the present and near-futur…

    Submitted 12 June, 2023; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 27 pages, 17 figures + references and appendices, repo: https://github.com/google/BIG-bench

    Journal ref: Transactions on Machine Learning Research, May/2022, https://openreview.net/forum?id=uyTL5Bvosj

  10. arXiv:2205.03286

    cs.CL

    GlobEnc: Quantifying Global Token Attribution by Incorporating the Whole Encoder Layer in Transformers

    Authors: Ali Modarressi, Mohsen Fayyaz, Yadollah Yaghoobzadeh, Mohammad Taher Pilehvar

    Abstract: There has been a growing interest in interpreting the underlying dynamics of Transformers. While self-attention patterns were initially deemed as the primary option, recent studies have shown that integrating other components can yield more accurate explanations. This paper introduces a novel token attribution analysis method that incorporates all the components in the encoder block and aggregates…

    Submitted 6 May, 2022; originally announced May 2022.

    Comments: Accepted to NAACL 2022 (main conference)

  11. arXiv:2203.14139

    cs.CL

    Metaphors in Pre-Trained Language Models: Probing and Generalization Across Datasets and Languages

    Authors: Ehsan Aghazadeh, Mohsen Fayyaz, Yadollah Yaghoobzadeh

    Abstract: Human languages are full of metaphorical expressions. Metaphors help people understand the world by connecting new concepts and domains to more familiar ones. Large pre-trained language models (PLMs) are therefore assumed to encode metaphorical knowledge useful for NLP systems. In this paper, we investigate this hypothesis for PLMs, by probing metaphoricity information in their encodings, and by m…

    Submitted 26 March, 2022; originally announced March 2022.

    Comments: Accepted to ACL 2022 (main conference)

  12. arXiv:2112.13238

    cs.CL

    PerCQA: Persian Community Question Answering Dataset

    Authors: Naghme Jamali, Yadollah Yaghoobzadeh, Hesham Faili

    Abstract: Community Question Answering (CQA) forums provide answers for many real-life questions. Thanks to the large size, these forums are very popular among machine learning researchers. Automatic answer selection, answer ranking, question retrieval, expert finding, and fact-checking are example learning tasks performed using CQA data. In this paper, we present PerCQA, the first Persian dataset for CQA.…

    Submitted 25 December, 2021; originally announced December 2021.

  13. arXiv:2012.06154

    cs.CL cs.AI

    ParsiNLU: A Suite of Language Understanding Challenges for Persian

    Authors: Daniel Khashabi, Arman Cohan, Siamak Shakeri, Pedram Hosseini, Pouya Pezeshkpour, Malihe Alikhani, Moin Aminnaseri, Marzieh Bitaab, Faeze Brahman, Sarik Ghazarian, Mozhdeh Gheini, Arman Kabiri, Rabeeh Karimi Mahabadi, Omid Memarrast, Ahmadreza Mosallanezhad, Erfan Noury, Shahab Raji, Mohammad Sadegh Rasooli, Sepideh Sadeghi, Erfan Sadeqi Azer, Niloofar Safi Samghabadi, Mahsa Shafaei, Saber Sheybani, Ali Tazarv, Yadollah Yaghoobzadeh

    Abstract: Despite the progress made in recent years in addressing natural language understanding (NLU) challenges, the majority of this progress remains to be concentrated on resource-rich languages like English. This work focuses on Persian language, one of the widely spoken languages in the world, and yet there are few NLU datasets available for this rich language. The availability of high-quality evaluat…

    Submitted 13 July, 2021; v1 submitted 11 December, 2020; originally announced December 2020.

    Comments: To appear on Transactions of the Association for Computational Linguistics (TACL), 2021

  14. arXiv:2011.11090

    cs.CL

    Cross-Domain Generalization Through Memorization: A Study of Nearest Neighbors in Neural Duplicate Question Detection

    Authors: Yadollah Yaghoobzadeh, Alexandre Rochette, Timothy J. Hazen

    Abstract: Duplicate question detection (DQD) is important to increase efficiency of community and automatic question answering systems. Unfortunately, gathering supervised data in a domain is time-consuming and expensive, and our ability to leverage annotations across domains is minimal. In this work, we leverage neural representations and study nearest neighbors for cross-domain generalization in DQD. We f…

    Submitted 22 November, 2020; originally announced November 2020.

    Comments: 7 pages, initial results

  15. arXiv:2004.12198

    cs.CL

    Quantifying the Contextualization of Word Representations with Semantic Class Probing

    Authors: Mengjie Zhao, Philipp Dufter, Yadollah Yaghoobzadeh, Hinrich Schütze

    Abstract: Pretrained language models have achieved a new state of the art on many NLP tasks, but there are still many open questions about how and why they work so well. We investigate the contextualization of words in BERT. We quantify the amount of contextualization, i.e., how well words are interpreted in context, by studying the extent to which semantic classes of a word can be inferred from its context…

    Submitted 11 October, 2020; v1 submitted 25 April, 2020; originally announced April 2020.

    Comments: EMNLP Findings 2020

  16. arXiv:1911.03861

    cs.CL cs.LG

    Increasing Robustness to Spurious Correlations using Forgettable Examples

    Authors: Yadollah Yaghoobzadeh, Soroush Mehri, Remi Tachet, T. J. Hazen, Alessandro Sordoni

    Abstract: Neural NLP models tend to rely on spurious correlations between labels and input features to perform their tasks. Minority examples, i.e., examples that contradict the spurious correlations present in the majority of data points, have been shown to increase the out-of-distribution generalization of pre-trained language models. In this paper, we first propose using example forgetting to find minori…

    Submitted 1 February, 2021; v1 submitted 10 November, 2019; originally announced November 2019.

    Comments: 14 pages, Accepted at EACL2021

  17. arXiv:1911.02645

    cs.CL cs.LG

    Unsupervised Domain Adaptation of Contextual Embeddings for Low-Resource Duplicate Question Detection

    Authors: Alexandre Rochette, Yadollah Yaghoobzadeh, Timothy J. Hazen

    Abstract: Answering questions is a primary goal of many conversational systems or search products. While most current systems have focused on answering questions against structured databases or curated knowledge graphs, on-line community forums or frequently asked questions (FAQ) lists offer an alternative source of information for question answering systems. Automatic duplicate question detection (DQD) is…

    Submitted 6 November, 2019; originally announced November 2019.

  18. arXiv:1909.00519

    cs.AI cs.CL

    Toward Understanding The Effect Of Loss function On Then Performance Of Knowledge Graph Embedding

    Authors: Mojtaba Nayyeri, Chengjin Xu, Yadollah Yaghoobzadeh, Hamed Shariat Yazdi, Jens Lehmann

    Abstract: Knowledge graphs (KGs) represent world's facts in structured forms. KG completion exploits the existing facts in a KG to discover new ones. Translation-based embedding model (TransE) is a prominent formulation to do KG completion. Despite the efficiency of TransE in memory and time, it suffers from several limitations in encoding relation patterns such as symmetric, reflexive etc. To resolve this…

    Submitted 10 October, 2019; v1 submitted 1 September, 2019; originally announced September 2019.

  19. arXiv:1906.03608

    cs.CL cs.LG

    Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings

    Authors: Yadollah Yaghoobzadeh, Katharina Kann, Timothy J. Hazen, Eneko Agirre, Hinrich Schütze

    Abstract: Word embeddings typically represent different meanings of a word in a single conflated vector. Empirical analysis of embeddings of ambiguous words is currently limited by the small size of manually annotated resources and by the fact that word senses are treated as unrelated individual concepts. We present a large dataset based on manual Wikipedia annotations and word senses, where word senses fro…

    Submitted 9 June, 2019; originally announced June 2019.

    Comments: 14 pages, Accepted at ACL 2019

  20. arXiv:1810.10499

    cs.CL cs.AI

    Multi-Multi-View Learning: Multilingual and Multi-Representation Entity Typing

    Authors: Yadollah Yaghoobzadeh, Hinrich Schütze

    Abstract: Knowledge bases (KBs) are paramount in NLP. We employ multiview learning for increasing accuracy and coverage of entity type information in KBs. We rely on two metaviews: language and representation. For language, we consider high-resource and low-resource languages from Wikipedia. For representation, we consider representations based on the context distribution of the entity (i.e., on its embeddi…

    Submitted 24 October, 2018; originally announced October 2018.

    Comments: 7 pages, Accepted at EMNLP 2018

  21. arXiv:1807.07186

    cs.CL cs.AI

    Evaluating Word Embeddings in Multi-label Classification Using Fine-grained Name Typing

    Authors: Yadollah Yaghoobzadeh, Katharina Kann, Hinrich Schütze

    Abstract: Embedding models typically associate each word with a single real-valued vector, representing its different properties. Evaluation methods, therefore, need to analyze the accuracy and completeness of these properties in embeddings. This requires fine-grained analysis of embedding subspaces. Multi-label classification is an appropriate way to do so. We propose a new evaluation method for word embed…

    Submitted 18 July, 2018; originally announced July 2018.

    Comments: 6 pages, The 3rd Workshop on Representation Learning for NLP (RepL4NLP @ ACL2018)

  22. arXiv:1806.04523

    cs.CL

    Recurrent One-Hop Predictions for Reasoning over Knowledge Graphs

    Authors: Wenpeng Yin, Yadollah Yaghoobzadeh, Hinrich Schütze

    Abstract: Large scale knowledge graphs (KGs) such as Freebase are generally incomplete. Reasoning over multi-hop (mh) KG paths is thus an important capability that is needed for question answering or other NLP tasks that require knowledge about the world. mh-KG reasoning includes diverse scenarios, e.g., given a head entity and a relation path, predict the tail entity; or given two entities connected by som…

    Submitted 12 June, 2018; originally announced June 2018.

    Comments: COLING'2018 camera-ready

  23. arXiv:1708.02275

    cs.CL

    Corpus-level Fine-grained Entity Typing

    Authors: Yadollah Yaghoobzadeh, Heike Adel, Hinrich Schütze

    Abstract: This paper addresses the problem of corpus-level entity typing, i.e., inferring from a large corpus that an entity is a member of a class such as "food" or "artist". The application of entity typing we are interested in is knowledge base completion, specifically, to learn which classes an entity is a member of. We propose FIGMENT to tackle this problem. FIGMENT is embedding-based and combines (i)…

    Submitted 6 June, 2018; v1 submitted 7 August, 2017; originally announced August 2017.

    Comments: 24 pages. arXiv admin note: text overlap with arXiv:1701.02025, arXiv:1606.07901

    Journal ref: JAIR, Vol 61 (2018)

  24. arXiv:1701.02025

    cs.CL cs.AI

    Multi-level Representations for Fine-Grained Typing of Knowledge Base Entities

    Authors: Yadollah Yaghoobzadeh, Hinrich Schütze

    Abstract: Entities are essential elements of natural language. In this paper, we present methods for learning multi-level representations of entities on three complementary levels: character (character patterns in entity names extracted, e.g., by neural networks), word (embeddings of words in entity names) and entity (entity embeddings). We investigate state-of-the-art learning methods on each level and fin…

    Submitted 16 January, 2017; v1 submitted 8 January, 2017; originally announced January 2017.

    Comments: 13 pages, in EACL 2017

  25. arXiv:1612.07495

    cs.CL

    Noise Mitigation for Neural Entity Typing and Relation Extraction

    Authors: Yadollah Yaghoobzadeh, Heike Adel, Hinrich Schütze

    Abstract: In this paper, we address two different types of noise in information extraction models: noise from distant supervision and noise from pipeline input features. Our target tasks are entity typing and relation extraction. For the first noise type, we introduce multi-instance multi-label learning algorithms using neural network models, and apply them to fine-grained entity typing for the first time.…

    Submitted 10 January, 2017; v1 submitted 22 December, 2016; originally announced December 2016.

    Comments: EACL 2017; the first two authors contributed equally to this work

  26. arXiv:1606.07902

    cs.CL

    Intrinsic Subspace Evaluation of Word Embedding Representations

    Authors: Yadollah Yaghoobzadeh, Hinrich Schütze

    Abstract: We introduce a new methodology for intrinsic evaluation of word representations. Specifically, we identify four fundamental criteria based on the characteristics of natural language that pose difficulties to NLP systems; and develop tests that directly show whether or not representations contain the subspaces necessary to satisfy these criteria. Current intrinsic evaluations are mostly based on th…

    Submitted 25 June, 2016; originally announced June 2016.

    Comments: Long paper accepted in ACL2016

    MSC Class: 68T50

  27. arXiv:1606.07901

    cs.CL

    Corpus-level Fine-grained Entity Typing Using Contextual Information

    Authors: Yadollah Yaghoobzadeh, Hinrich Schütze

    Abstract: This paper addresses the problem of corpus-level entity typing, i.e., inferring from a large corpus that an entity is a member of a class such as "food" or "artist". The application of entity typing we are interested in is knowledge base completion, specifically, to learn which classes an entity is a member of. We propose FIGMENT to tackle this problem. FIGMENT is embedding-based and combines (i)…

    Submitted 25 June, 2016; originally announced June 2016.

    Comments: Accepted at EMNLP2015, Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing