Showing 1–43 of 43 results for author: Hershcovich, D

  1. arXiv:2406.11030  [pdf, other]

    cs.CL

    FoodieQA: A Multimodal Dataset for Fine-Grained Understanding of Chinese Food Culture

    Authors: Wenyan Li, Xinyu Zhang, Jiaang Li, Qiwei Peng, Raphael Tang, Li Zhou, Weijia Zhang, Guimin Hu, Yifei Yuan, Anders Søgaard, Daniel Hershcovich, Desmond Elliott

    Abstract: Food is a rich and varied dimension of cultural heritage, crucial to both individuals and social groups. To bridge the gap in the literature on the often-overlooked regional diversity in this domain, we introduce FoodieQA, a manually curated, fine-grained image-text dataset capturing the intricate features of food cultures across various regions in China. We evaluate vision-language Models (VLMs)…

    Submitted 16 June, 2024; originally announced June 2024.

  2. arXiv:2404.06833  [pdf, other]

    cs.CL

    Does Mapo Tofu Contain Coffee? Probing LLMs for Food-related Cultural Knowledge

    Authors: Li Zhou, Taelin Karidi, Nicolas Garneau, Yong Cao, Wanlong Liu, Wenyu Chen, Daniel Hershcovich

    Abstract: Recent studies have highlighted the presence of cultural biases in Large Language Models (LLMs), yet often lack a robust methodology to dissect these phenomena comprehensively. Our work aims to bridge this gap by delving into the Food domain, a universally relevant yet culturally diverse aspect of human life. We introduce FmLAMA, a multilingual dataset centered on food-related cultural facts and v…

    Submitted 10 April, 2024; originally announced April 2024.

    Comments: 20 pages, 8 figures

  3. arXiv:2404.00403  [pdf, other]

    cs.CL

    UniMEEC: Towards Unified Multimodal Emotion Recognition and Emotion Cause

    Authors: Guimin Hu, Zhihong Zhu, Daniel Hershcovich, Hasti Seifi, Jiayuan Xie

    Abstract: Multimodal emotion recognition in conversation (MERC) and multimodal emotion-cause pair extraction (MECPE) has recently garnered significant attention. Emotions are the expression of affect or feelings; responses to specific events, thoughts, or situations are known as emotion causes. Both are like two sides of a coin, collectively describing human behaviors and intents. However, most existing wor…

    Submitted 30 March, 2024; originally announced April 2024.

  4. arXiv:2402.06015  [pdf, other]

    cs.CL cs.CV

    Exploring Visual Culture Awareness in GPT-4V: A Comprehensive Probing

    Authors: Yong Cao, Wenyan Li, Jiaang Li, Yifei Yuan, Antonia Karamolegkou, Daniel Hershcovich

    Abstract: Pretrained large Vision-Language models have drawn considerable interest in recent years due to their remarkable performance. Despite considerable efforts to assess these models from diverse perspectives, the extent of visual cultural awareness in the state-of-the-art GPT-4V model remains unexplored. To tackle this gap, we extensively probed GPT-4V using the MaRVL benchmark dataset, aiming to inve…

    Submitted 15 February, 2024; v1 submitted 8 February, 2024; originally announced February 2024.

    Comments: work in process

  5. arXiv:2401.10352  [pdf, other]

    cs.CL

    Bridging Cultural Nuances in Dialogue Agents through Cultural Value Surveys

    Authors: Yong Cao, Min Chen, Daniel Hershcovich

    Abstract: The cultural landscape of interactions with dialogue agents is a compelling yet relatively unexplored territory. It's clear that various sociocultural aspects -- from communication styles and beliefs to shared metaphors and knowledge -- profoundly impact these interactions. To delve deeper into this dynamic, we introduce cuDialog, a first-of-its-kind benchmark for dialogue generation with a cultur…

    Submitted 2 February, 2024; v1 submitted 18 January, 2024; originally announced January 2024.

    Comments: 17 pages, 7 figures, EACL 2024 findings

  6. arXiv:2310.19567  [pdf, other]

    cs.CL cs.AI

    CreoleVal: Multilingual Multitask Benchmarks for Creoles

    Authors: Heather Lent, Kushal Tatariya, Raj Dabre, Yiyi Chen, Marcell Fekete, Esther Ploeger, Li Zhou, Ruth-Ann Armstrong, Abee Eijansantos, Catriona Malau, Hans Erik Heje, Ernests Lavrinovics, Diptesh Kanojia, Paul Belony, Marcel Bollmann, Loïc Grobol, Miryam de Lhoneux, Daniel Hershcovich, Michel DeGraff, Anders Søgaard, Johannes Bjerva

    Abstract: Creoles represent an under-explored and marginalized group of languages, with few available resources for NLP research. While the genealogical ties between Creoles and a number of highly-resourced languages imply a significant potential for transfer learning, this potential is hampered due to this lack of annotated data. In this work we present CreoleVal, a collection of benchmark datasets spanning…

    Submitted 6 May, 2024; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted to TACL

  7. arXiv:2310.17353  [pdf, other]

    cs.CL cs.AI

    Cultural Adaptation of Recipes

    Authors: Yong Cao, Yova Kementchedjhieva, Ruixiang Cui, Antonia Karamolegkou, Li Zhou, Megan Dare, Lucia Donatelli, Daniel Hershcovich

    Abstract: Building upon the considerable advances in Large Language Models (LLMs), we are now equipped to address more sophisticated tasks demanding a nuanced understanding of cross-cultural contexts. A key example is recipe adaptation, which goes beyond simple translation to include a grasp of ingredients, culinary techniques, and dietary preferences specific to a given culture. We introduce a new task inv…

    Submitted 26 October, 2023; originally announced October 2023.

    Comments: Accepted to TACL

  8. arXiv:2310.09772  [pdf, other]

    cs.CL

    Rethinking Relation Classification with Graph Meaning Representations

    Authors: Li Zhou, Wenyu Chen, Dingyi Zeng, Malu Zhang, Daniel Hershcovich

    Abstract: In the field of natural language understanding, the intersection of neural models and graph meaning representations (GMRs) remains a compelling area of research. Despite the growing interest, a critical gap persists in understanding the exact influence of GMRs, particularly concerning relation extraction tasks. Addressing this, we introduce DAGNN-plus, a simple and parameter-efficient neural archi…

    Submitted 27 December, 2023; v1 submitted 15 October, 2023; originally announced October 2023.

    Comments: 10 pages

  9. arXiv:2310.06458  [pdf, other]

    cs.CL

    Cultural Compass: Predicting Transfer Learning Success in Offensive Language Detection with Cultural Features

    Authors: Li Zhou, Antonia Karamolegkou, Wenyu Chen, Daniel Hershcovich

    Abstract: The increasing ubiquity of language technology necessitates a shift towards considering cultural diversity in the machine learning realm, particularly for subjective tasks that rely heavily on cultural nuances, such as Offensive Language Detection (OLD). Current understanding underscores that these tasks are substantially influenced by cultural values, however, a notable gap exists in determining…

    Submitted 10 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023

  10. arXiv:2309.01606  [pdf, other]

    cs.CL

    Geo-Encoder: A Chunk-Argument Bi-Encoder Framework for Chinese Geographic Re-Ranking

    Authors: Yong Cao, Ruixue Ding, Boli Chen, Xianzhi Li, Min Chen, Daniel Hershcovich, Pengjun Xie, Fei Huang

    Abstract: Chinese geographic re-ranking task aims to find the most relevant addresses among retrieved candidates, which is crucial for location-related services such as navigation maps. Unlike the general sentences, geographic contexts are closely intertwined with geographical concepts, from general spans (e.g., province) to specific spans (e.g., road). Given this feature, we propose an innovative framework…

    Submitted 2 February, 2024; v1 submitted 4 September, 2023; originally announced September 2023.

    Comments: 15 pages, 5 figures, EACL 2024 main

  11. arXiv:2306.11420  [pdf, other]

    cs.CL

    On Evaluating Multilingual Compositional Generalization with Translated Datasets

    Authors: Zi Wang, Daniel Hershcovich

    Abstract: Compositional generalization allows efficient learning and human-like inductive biases. Since most research investigating compositional generalization in NLP is done on English, important questions remain underexplored. Do the necessary compositional generalization abilities differ across languages? Can models compositionally generalize cross-lingually? As a first step to answering these questions…

    Submitted 20 June, 2023; originally announced June 2023.

    Comments: ACL 2023 long paper

  12. arXiv:2305.19597  [pdf, other]

    cs.CL cs.AI

    What does the Failure to Reason with "Respectively" in Zero/Few-Shot Settings Tell Us about Language Models?

    Authors: Ruixiang Cui, Seolhwa Lee, Daniel Hershcovich, Anders Søgaard

    Abstract: Humans can effortlessly understand the coordinate structure of sentences such as "Niels Bohr and Kurt Cobain were born in Copenhagen and Seattle, respectively". In the context of natural language inference (NLI), we examine how language models (LMs) reason with respective readings (Gawron and Kehler, 2004) from two perspectives: syntactic-semantic and commonsense-world knowledge. We propose a cont…

    Submitted 31 May, 2023; originally announced May 2023.

    Comments: To appear at ACL 2023

  13. arXiv:2305.08414  [pdf, other]

    cs.CL cs.AI

    What's the Meaning of Superhuman Performance in Today's NLU?

    Authors: Simone Tedeschi, Johan Bos, Thierry Declerck, Jan Hajic, Daniel Hershcovich, Eduard H. Hovy, Alexander Koller, Simon Krek, Steven Schockaert, Rico Sennrich, Ekaterina Shutova, Roberto Navigli

    Abstract: In the last five years, there has been a significant focus in Natural Language Processing (NLP) on developing larger Pretrained Language Models (PLMs) and introducing benchmarks such as SuperGLUE and SQuAD to measure their abilities in language understanding, reasoning, and reading comprehension. These PLMs have achieved impressive results on these benchmarks, even surpassing human performance in…

    Submitted 15 May, 2023; originally announced May 2023.

    Comments: 9 pages, long paper at ACL 2023 proceedings

  14. arXiv:2305.02118  [pdf, other]

    cs.CL

    Pay More Attention to Relation Exploration for Knowledge Base Question Answering

    Authors: Yong Cao, Xianzhi Li, Huiwen Liu, Wen Dai, Shuai Chen, Bin Wang, Min Chen, Daniel Hershcovich

    Abstract: Knowledge base question answering (KBQA) is a challenging task that aims to retrieve correct answers from large-scale knowledge bases. Existing attempts primarily focus on entity representation and final answer reasoning, which results in limited supervision for this task. Moreover, the relations, which empirically determine the reasoning path selection, are not fully considered in recent advancem…

    Submitted 25 May, 2023; v1 submitted 3 May, 2023; originally announced May 2023.

    Comments: ACL 2023 Findings

  15. arXiv:2303.17927  [pdf, other]

    cs.CL

    Cross-Cultural Transfer Learning for Chinese Offensive Language Detection

    Authors: Li Zhou, Laura Cabello, Yong Cao, Daniel Hershcovich

    Abstract: Detecting offensive language is a challenging task. Generalizing across different cultures and languages becomes even more challenging: besides lexical, syntactic and semantic differences, pragmatic aspects such as cultural norms and sensitivities, which are particularly relevant in this context, vary greatly. In this paper, we target Chinese offensive language detection and aim to investigate the…

    Submitted 31 March, 2023; originally announced March 2023.

    Comments: C3NLP@EACL

  16. arXiv:2303.17466  [pdf, other]

    cs.CL

    Assessing Cross-Cultural Alignment between ChatGPT and Human Societies: An Empirical Study

    Authors: Yong Cao, Li Zhou, Seolhwa Lee, Laura Cabello, Min Chen, Daniel Hershcovich

    Abstract: The recent release of ChatGPT has garnered widespread recognition for its exceptional ability to generate human-like responses in dialogue. Given its usage by users from various nations and its training on a vast multilingual corpus that incorporates diverse cultural and societal norms, it is crucial to evaluate its effectiveness in cultural adaptation. In this paper, we investigate the underlying…

    Submitted 31 March, 2023; v1 submitted 30 March, 2023; originally announced March 2023.

    Comments: C3NLP@EACL 2023

  17. arXiv:2302.10086  [pdf, other]

    cs.CL

    A Two-Sided Discussion of Preregistration of NLP Research

    Authors: Anders Søgaard, Daniel Hershcovich, Miryam de Lhoneux

    Abstract: Van Miltenburg et al. (2021) suggest NLP research should adopt preregistration to prevent fishing expeditions and to promote publication of negative results. At face value, this is a very reasonable suggestion, seemingly solving many methodological problems with NLP research. We discuss pros and cons -- some old, some new: a) Preregistration is challenged by the practice of retrieving hypotheses a…

    Submitted 20 February, 2023; originally announced February 2023.

    Comments: EACL 2023

  18. arXiv:2206.02661  [pdf, other]

    cs.CL

    Evaluating Deep Taylor Decomposition for Reliability Assessment in the Wild

    Authors: Stephanie Brandl, Daniel Hershcovich, Anders Søgaard

    Abstract: We argue that we need to evaluate model interpretability methods 'in the wild', i.e., in situations where professionals make critical decisions, and models can potentially assist them. We present an in-the-wild evaluation of token attribution based on Deep Taylor Decomposition, with professional journalists performing reliability assessments. We find that using this method in conjunction with RoBE…

    Submitted 3 May, 2022; originally announced June 2022.

    Comments: ICWSM 2022

  19. arXiv:2205.05071  [pdf, other]

    cs.CL cs.CY

    Towards Climate Awareness in NLP Research

    Authors: Daniel Hershcovich, Nicolas Webersinke, Mathias Kraus, Julia Anna Bingler, Markus Leippold

    Abstract: The climate impact of AI, and NLP research in particular, has become a serious issue given the enormous amount of energy that is increasingly being used for training and running computational models. Consequently, increasing focus is placed on efficient NLP. However, this important initiative lacks simple guidelines that would allow for systematic climate reporting of NLP research. We argue that t…

    Submitted 18 October, 2022; v1 submitted 10 May, 2022; originally announced May 2022.

    Comments: Accepted to EMNLP 2022

  20. arXiv:2204.10615  [pdf, other]

    cs.CL cs.LO

    Generalized Quantifiers as a Source of Error in Multilingual NLU Benchmarks

    Authors: Ruixiang Cui, Daniel Hershcovich, Anders Søgaard

    Abstract: Logical approaches to representing language have developed and evaluated computational models of quantifier words since the 19th century, but today's NLU models still struggle to capture their semantics. We rely on Generalized Quantifier Theory for language-independent representations of the semantics of quantifier words, to quantify their contribution to the errors of NLU models. We find that qua…

    Submitted 20 May, 2022; v1 submitted 22 April, 2022; originally announced April 2022.

    Comments: To appear at NAACL 2022

  21. arXiv:2203.10020  [pdf, other]

    cs.CL

    Challenges and Strategies in Cross-Cultural NLP

    Authors: Daniel Hershcovich, Stella Frank, Heather Lent, Miryam de Lhoneux, Mostafa Abdou, Stephanie Brandl, Emanuele Bugliarello, Laura Cabello Piqueras, Ilias Chalkidis, Ruixiang Cui, Constanza Fierro, Katerina Margatina, Phillip Rust, Anders Søgaard

    Abstract: Various efforts in the Natural Language Processing (NLP) community have been made to accommodate linguistic diversity and serve speakers of many different languages. However, it is important to acknowledge that speakers and the content they produce and require, vary not just by language, but also by culture. Although language and culture are tightly linked, there are important differences. Analogo…

    Submitted 18 March, 2022; originally announced March 2022.

    Comments: ACL 2022 - Theme track

  22. arXiv:2109.06129  [pdf, other]

    cs.CV cs.CL

    Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color

    Authors: Mostafa Abdou, Artur Kulmizev, Daniel Hershcovich, Stella Frank, Ellie Pavlick, Anders Søgaard

    Abstract: Pretrained language models have been shown to encode relational information, such as the relations between entities or concepts in knowledge-bases -- (Paris, Capital, France). However, simple relations of this type can often be recovered heuristically and the extent to which models implicitly reflect topological structure that is grounded in world, such as perceptual structure, is unknown. To expl…

    Submitted 14 September, 2021; v1 submitted 13 September, 2021; originally announced September 2021.

    Comments: CoNLL 2021

  23. arXiv:2108.03509  [pdf]

    cs.CL

    Compositional Generalization in Multilingual Semantic Parsing over Wikidata

    Authors: Ruixiang Cui, Rahul Aralikatte, Heather Lent, Daniel Hershcovich

    Abstract: Semantic parsing (SP) allows humans to leverage vast knowledge resources through natural interaction. However, parsers are mostly designed for and evaluated on English resources, such as CFQ (Keysers et al., 2020), the current standard benchmark based on English data generated from grammar rules and oriented towards Freebase, an outdated knowledge base. We propose a method for creating a multiling…

    Submitted 31 May, 2022; v1 submitted 7 August, 2021; originally announced August 2021.

    Comments: Accepted to TACL; Authors' final version, pre-MIT Press publication; Previous title: Multilingual Compositional Wikidata Questions

  24. arXiv:2106.07364  [pdf, ps, other]

    cs.CL

    Meaning Representation of Numeric Fused-Heads in UCCA

    Authors: Ruixiang Cui, Daniel Hershcovich

    Abstract: We exhibit that the implicit UCCA parser does not address numeric fused-heads (NFHs) consistently, which could result either from inconsistent annotation, insufficient training data or a modelling limitation. and show which factors are involved. We consider this phenomenon important, as it is pervasive in text and critical for correct inference. Careful design and fine-grained annotation of NFHs i…

    Submitted 4 June, 2021; originally announced June 2021.

    Comments: UnImplicit Workshop at ACL 2021 (abstract)

  25. arXiv:2106.02561  [pdf, ps, other]

    cs.CL

    Great Service! Fine-grained Parsing of Implicit Arguments

    Authors: Ruixiang Cui, Daniel Hershcovich

    Abstract: Broad-coverage meaning representations in NLP mostly focus on explicitly expressed content. More importantly, the scarcity of datasets annotating diverse implicit roles limits empirical studies into their linguistic nuances. For example, in the web review "Great service!", the provider and consumer are implicit arguments of different types. We examine an annotated corpus of fine-grained implicit a…

    Submitted 23 June, 2021; v1 submitted 4 June, 2021; originally announced June 2021.

    Comments: Accepted to IWPT 2021

  26. arXiv:2102.09761  [pdf, other]

    cs.HC cs.AI cs.CL

    Scaling Creative Inspiration with Fine-Grained Functional Aspects of Ideas

    Authors: Tom Hope, Ronen Tamari, Hyeonsu Kang, Daniel Hershcovich, Joel Chan, Aniket Kittur, Dafna Shahaf

    Abstract: Large repositories of products, patents and scientific papers offer an opportunity for building systems that scour millions of ideas and help users discover inspirations. However, idea descriptions are typically in the form of unstructured text, lacking key structure that is required for supporting creative innovation interactions. Prior work has explored idea representations that were either limi…

    Submitted 17 February, 2022; v1 submitted 19 February, 2021; originally announced February 2021.

    Comments: To appear in CHI 2022

    Journal ref: CHI 2022

  27. arXiv:2101.12608  [pdf, other]

    cs.CL cs.AI

    Does injecting linguistic structure into language models lead to better alignment with brain recordings?

    Authors: Mostafa Abdou, Ana Valeria Gonzalez, Mariya Toneva, Daniel Hershcovich, Anders Søgaard

    Abstract: Neuroscientists evaluate deep neural networks for natural language processing as possible candidate models for how language is processed in the brain. These models are often trained without explicit linguistic supervision, but have been shown to learn some linguistic structure in the absence of such supervision (Manning et al., 2020), potentially questioning the relevance of symbolic linguistic th…

    Submitted 29 January, 2021; originally announced January 2021.

  28. arXiv:2011.00834  [pdf, other]

    cs.CL

    Comparison by Conversion: Reverse-Engineering UCCA from Syntax and Lexical Semantics

    Authors: Daniel Hershcovich, Nathan Schneider, Dotan Dvir, Jakob Prange, Miryam de Lhoneux, Omri Abend

    Abstract: Building robust natural language understanding systems will require a clear characterization of whether and how various linguistic meaning representations complement each other. To perform a systematic comparative analysis, we evaluate the mapping between meaning representations from different frameworks using two complementary methods: (i) a rule-based converter, and (ii) a supervised delexicaliz…

    Submitted 2 November, 2020; originally announced November 2020.

    Comments: COLING 2020 camera ready

  29. arXiv:2010.05710  [pdf, other]

    cs.CL cs.LG

    HUJI-KU at MRP 2020: Two Transition-based Neural Parsers

    Authors: Ofir Arviv, Ruixiang Cui, Daniel Hershcovich

    Abstract: This paper describes the HUJI-KU system submission to the shared task on Cross-Framework Meaning Representation Parsing (MRP) at the 2020 Conference for Computational Language Learning (CoNLL), employing TUPA and the HIT-SCIR parser, which were, respectively, the baseline system and winning system in the 2019 MRP shared task. Both are transition-based parsers using BERT contextualized embeddings.…

    Submitted 12 October, 2020; originally announced October 2020.

  30. arXiv:2010.05567  [pdf, other]

    cs.CL

    Joint Semantic Analysis with Document-Level Cross-Task Coherence Rewards

    Authors: Rahul Aralikatte, Mostafa Abdou, Heather Lent, Daniel Hershcovich, Anders Søgaard

    Abstract: Coreference resolution and semantic role labeling are NLP tasks that capture different aspects of semantics, indicating respectively, which expressions refer to the same entity, and what semantic roles expressions serve in the sentence. However, they are often closely interdependent, and both generally necessitate natural language understanding. Do they form a coherent abstract representation of d…

    Submitted 12 October, 2020; originally announced October 2020.

  31. arXiv:2005.12889  [pdf, other]

    cs.CL

    Refining Implicit Argument Annotation for UCCA

    Authors: Ruixiang Cui, Daniel Hershcovich

    Abstract: Predicate-argument structure analysis is a central component in meaning representations of text. The fact that some arguments are not explicitly mentioned in a sentence gives rise to ambiguity in language understanding, and renders it difficult for machines to interpret text correctly. However, only few resources represent implicit roles for NLU, and existing studies in NLP only make coarse distin…

    Submitted 8 April, 2021; v1 submitted 26 May, 2020; originally announced May 2020.

    Comments: DMR 2020

  32. arXiv:2005.12094  [pdf, other]

    cs.CL

    Køpsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding

    Authors: Daniel Hershcovich, Miryam de Lhoneux, Artur Kulmizev, Elham Pejhan, Joakim Nivre

    Abstract: We present Køpsala, the Copenhagen-Uppsala system for the Enhanced Universal Dependencies Shared Task at IWPT 2020. Our system is a pipeline consisting of off-the-shelf models for everything but enhanced graph parsing, and for the latter, a transition-based graph parser adapted from Che et al. (2019). We train a single enhanced parser model per language, using gold sentence splitting and tokenizat…

    Submitted 2 June, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

    Comments: IWPT shared task 2020

  33. arXiv:2004.15008  [pdf, other]

    cs.CL

    Lexical Semantic Recognition

    Authors: Nelson F. Liu, Daniel Hershcovich, Michael Kranzlein, Nathan Schneider

    Abstract: In lexical semantics, full-sentence segmentation and segment labeling of various phenomena are generally treated separately, despite their interdependence. We hypothesize that a unified lexical semantic recognition task is an effective way to encapsulate previously disparate styles of annotation, including multiword expression identification / classification and supersense tagging. Using the STREU…

    Submitted 7 June, 2021; v1 submitted 30 April, 2020; originally announced April 2020.

    Comments: 11 pages, 3 figures; to appear at MWE 2021

  34. arXiv:1909.02392  [pdf, other]

    cs.CL

    Rewarding Coreference Resolvers for Being Consistent with World Knowledge

    Authors: Rahul Aralikatte, Heather Lent, Ana Valeria Gonzalez, Daniel Hershcovich, Chen Qiu, Anders Sandholm, Michael Ringaard, Anders Søgaard

    Abstract: Unresolved coreference is a bottleneck for relation extraction, and high-quality coreference resolvers may produce an output that makes it a lot easier to extract knowledge triples. We show how to improve coreference resolvers by forwarding their input to a relation extraction system and reward the resolvers for producing triples that are found in knowledge bases. Since relation extraction systems…

    Submitted 11 November, 2019; v1 submitted 5 September, 2019; originally announced September 2019.

    Comments: To appear in EMNLP 2019 (with corrected Fig. 2)

  35. arXiv:1908.08336  [pdf, other]

    cs.CL

    Argument Invention from First Principles

    Authors: Yonatan Bilu, Ariel Gera, Daniel Hershcovich, Benjamin Sznajder, Dan Lahav, Guy Moshkowich, Anael Malet, Assaf Gavron, Noam Slonim

    Abstract: Competitive debaters often find themselves facing a challenging task -- how to debate a topic they know very little about, with only minutes to prepare, and without access to books or the Internet? What they often do is rely on "first principles", commonplace arguments which are relevant to many topics, and which they have refined in past debates. In this work we aim to explicitly define a taxon…

    Submitted 22 August, 2019; originally announced August 2019.

    Comments: Presented at ACL 2019

  36. arXiv:1905.05543  [pdf, other]

    cs.CL

    The Language of Legal and Illegal Activity on the Darknet

    Authors: Leshem Choshen, Dan Eldad, Daniel Hershcovich, Elior Sulem, Omri Abend

    Abstract: The non-indexed parts of the Internet (the Darknet) have become a haven for both legal and illegal anonymous activity. Given the magnitude of these networks, scalably monitoring their activity necessarily relies on automated tools, and notably on NLP tools. However, little is known about what characteristics texts communicated through the Darknet have, and how well off-the-shelf NLP tools do on th…

    Submitted 4 June, 2019; v1 submitted 14 May, 2019; originally announced May 2019.

    Comments: ACL 2019 camera ready; code in https://github.com/huji-nlp/cyber

  37. arXiv:1904.00669  [pdf, other]

    cs.CL

    Syntactic Interchangeability in Word Embedding Models

    Authors: Daniel Hershcovich, Assaf Toledo, Alon Halfon, Noam Slonim

    Abstract: Nearest neighbors in word embedding models are commonly observed to be semantically similar, but the relations between them can vary greatly. We investigate the extent to which word embedding models preserve syntactic interchangeability, as reflected by distances between word vectors, and the effect of hyper-parameters---context window size in particular. We use part of speech (POS) as a proxy for…

    Submitted 12 April, 2019; v1 submitted 1 April, 2019; originally announced April 2019.

    Comments: Accepted to RepEval 2019

  38. arXiv:1903.06494  [pdf, other]

    cs.CL

    Content Differences in Syntactic and Semantic Representations

    Authors: Daniel Hershcovich, Omri Abend, Ari Rappoport

    Abstract: Syntactic analysis plays an important role in semantic parsing, but the nature of this role remains a topic of ongoing debate. The debate has been constrained by the scarcity of empirical comparative studies between syntactic and semantic schemes, which hinders the development of parsing methods informed by the details of target schemes and constructions. We target this gap, and take Universal Dep…

    Submitted 1 May, 2019; v1 submitted 15 March, 2019; originally announced March 2019.

    Comments: NAACL-HLT 2019 camera ready

  39. arXiv:1903.02953  [pdf, other]

    cs.CL

    SemEval-2019 Task 1: Cross-lingual Semantic Parsing with UCCA

    Authors: Daniel Hershcovich, Zohar Aizenbud, Leshem Choshen, Elior Sulem, Ari Rappoport, Omri Abend

    Abstract: We present the SemEval 2019 shared task on UCCA parsing in English, German and French, and discuss the participating systems and results. UCCA is a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation. UCCA poses a challenge for existing parsing techniques, as it exhibits reentrancy (resulting in DAG structu…

    Submitted 11 June, 2020; v1 submitted 6 March, 2019; originally announced March 2019.

    Comments: SemEval 2019 Shared task. arXiv admin note: substantial text overlap with arXiv:1805.12386

  40. arXiv:1808.09354  [pdf, other]

    cs.CL

    Universal Dependency Parsing with a General Transition-Based DAG Parser

    Authors: Daniel Hershcovich, Omri Abend, Ari Rappoport

    Abstract: This paper presents our experiments with applying TUPA to the CoNLL 2018 UD shared task. TUPA is a general neural transition-based DAG parser, which we use to present the first experiments on recovering enhanced dependencies as part of the general parsing task. TUPA was designed for parsing UCCA, a cross-linguistic semantic annotation scheme, exhibiting reentrancy, discontinuity and non-terminal n…

    Submitted 28 August, 2018; originally announced August 2018.

    Comments: CoNLL 2018 UD Shared Task

  41. arXiv:1805.12386   

    cs.CL

    SemEval 2019 Shared Task: Cross-lingual Semantic Parsing with UCCA - Call for Participation

    Authors: Daniel Hershcovich, Leshem Choshen, Elior Sulem, Zohar Aizenbud, Ari Rappoport, Omri Abend

    Abstract: We announce a shared task on UCCA parsing in English, German and French, and call for participants to submit their systems. UCCA is a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation. UCCA poses a challenge for existing parsing techniques, as it exhibits reentrancy (resulting in DAG structures), disconti…

    Submitted 3 February, 2021; v1 submitted 31 May, 2018; originally announced May 2018.

    Comments: Not an actual paper. The shared task summary is at arXiv:1903.02953

  42. arXiv:1805.00287  [pdf, ps, other]

    cs.CL

    Multitask Parsing Across Semantic Representations

    Authors: Daniel Hershcovich, Omri Abend, Ari Rappoport

    Abstract: The ability to consolidate information of different types is at the core of intelligence, and has tremendous practical value in allowing learning for one task to benefit from generalizations learned for others. In this paper we tackle the challenging task of improving semantic parsing performance, taking UCCA parsing as a test case, and AMR, SDP and Universal Dependencies (UD) parsing as auxiliary…

    Submitted 1 May, 2018; originally announced May 2018.

    Comments: Accepted to ACL 2018

  43. A Transition-Based Directed Acyclic Graph Parser for UCCA

    Authors: Daniel Hershcovich, Omri Abend, Ari Rappoport

    Abstract: We present the first parser for UCCA, a cross-linguistically applicable framework for semantic representation, which builds on extensive typological work and supports rapid annotation. UCCA poses a challenge for existing parsing techniques, as it exhibits reentrancy (resulting in DAG structures), discontinuous structures and non-terminal nodes corresponding to complex semantic units. To our knowle…

    Submitted 4 April, 2017; v1 submitted 3 April, 2017; originally announced April 2017.

    Comments: 16 pages; Accepted as long paper at ACL2017