
Showing 1–26 of 26 results for author: Nivre, J

  1. arXiv:2403.17748  [pdf, other]

    cs.CL

    UCxn: Typologically Informed Annotation of Constructions Atop Universal Dependencies

    Authors: Leonie Weissweiler, Nina Böbel, Kirian Guiller, Santiago Herrera, Wesley Scivetti, Arthur Lorenzi, Nurit Melnik, Archna Bhatia, Hinrich Schütze, Lori Levin, Amir Zeldes, Joakim Nivre, William Croft, Nathan Schneider

    Abstract: The Universal Dependencies (UD) project has created an invaluable collection of treebanks with contributions in over 140 languages. However, the UD annotations do not tell the full story. Grammatical constructions that convey meaning through a particular combination of several morphosyntactic elements -- for example, interrogative sentences with special markers and/or word orders -- are not labele…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: LREC-COLING 2024

  2. arXiv:2311.01200  [pdf, other]

    cs.CL cs.LG

    Continual Learning Under Language Shift

    Authors: Evangelia Gogoulou, Timothée Lesort, Magnus Boman, Joakim Nivre

    Abstract: The recent increase in data and model scale for language model pre-training has led to huge training costs. In scenarios where new data become available over time, updating a model instead of fully retraining it would therefore provide significant gains. We study the pros and cons of updating a language model when new data comes from new languages -- the case of continual learning under language s…

    Submitted 27 June, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

    Comments: Accepted to TSD 2024

  3. arXiv:2110.08887  [pdf, ps, other]

    cs.CL

    Schrödinger's Tree -- On Syntax and Neural Language Models

    Authors: Artur Kulmizev, Joakim Nivre

    Abstract: In the last half-decade, the field of natural language processing (NLP) has undergone two major transitions: the switch to neural networks as the primary modeling paradigm and the homogenization of the training regime (pre-train, then fine-tune). Amidst this process, language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities and proving to be an indispe…

    Submitted 17 October, 2021; originally announced October 2021.

    Comments: preprint, submitted to Frontiers in Artificial Intelligence: Perspectives for Natural Language Processing between AI, Linguistics and Cognitive Science

  4. arXiv:2107.12203  [pdf, other]

    cs.CL

    Revisiting Negation in Neural Machine Translation

    Authors: Gongbo Tang, Philipp Rönchen, Rico Sennrich, Joakim Nivre

    Abstract: In this paper, we evaluate the translation of negation both automatically and manually, in English--German (EN--DE) and English--Chinese (EN--ZH). We show that the ability of neural machine translation (NMT) models to translate negation has improved with deeper and more advanced networks, although the performance varies between language pairs and translation directions. The accuracy of manual eval…

    Submitted 26 July, 2021; originally announced July 2021.

    Comments: To appear at TACL and to be presented at ACL 2021. Authors' final version

  5. arXiv:2101.11959  [pdf, ps, other]

    cs.CL

    Syntactic Nuclei in Dependency Parsing -- A Multilingual Exploration

    Authors: Ali Basirat, Joakim Nivre

    Abstract: Standard models for syntactic dependency parsing take words to be the elementary units that enter into dependency relations. In this paper, we investigate whether there are any benefits from enriching these models with the more abstract notion of nucleus proposed by Tesnière. We do this by showing how the concept of nucleus can be defined in the framework of Universal Dependencies and how we can u…

    Submitted 29 January, 2021; v1 submitted 28 January, 2021; originally announced January 2021.

    Comments: Accepted at EACL-2021

  6. arXiv:2101.10927  [pdf, other]

    cs.CL

    Attention Can Reflect Syntactic Structure (If You Let It)

    Authors: Vinit Ravishankar, Artur Kulmizev, Mostafa Abdou, Anders Søgaard, Joakim Nivre

    Abstract: Since the popularization of the Transformer as a general-purpose feature encoder for NLP, many studies have attempted to decode linguistic structure from its novel multi-head attention mechanism. However, much of such work focused almost exclusively on English -- a language with rigid word order and a lack of inflectional morphology. In this study, we present decoding experiments for multilingual…

    Submitted 26 January, 2021; originally announced January 2021.

    Journal ref: EACL 2021

  7. arXiv:2011.03469  [pdf, other]

    cs.CL

    Understanding Pure Character-Based Neural Machine Translation: The Case of Translating Finnish into English

    Authors: Gongbo Tang, Rico Sennrich, Joakim Nivre

    Abstract: Recent work has shown that deeper character-based neural machine translation (NMT) models can outperform subword-based models. However, it is still unclear what makes deeper character-based models successful. In this paper, we conduct an investigation into pure character-based models in the case of translating Finnish into English, including exploring the ability to learn word senses and morpholog…

    Submitted 6 November, 2020; originally announced November 2020.

    Comments: accepted by COLING 2020, camera-ready version

  8. arXiv:2007.04686  [pdf, ps, other]

    cs.CL

    Greedy Transition-Based Dependency Parsing with Discrete and Continuous Supertag Features

    Authors: Ali Basirat, Joakim Nivre

    Abstract: We study the effect of rich supertag features in greedy transition-based dependency parsing. While previous studies have shown that sparse boolean features representing the 1-best supertag of a word can improve parsing accuracy, we show that we can get further improvements by adding a continuous vector representation of the entire supertag distribution for a word. In this way, we achieve the best…

    Submitted 9 July, 2020; originally announced July 2020.

    Comments: This paper was originally submitted to EMNLP 2015 and has not been previously published

  9. arXiv:2007.04629  [pdf, ps, other]

    cs.CL

    Principal Word Vectors

    Authors: Ali Basirat, Christian Hardmeier, Joakim Nivre

    Abstract: We generalize principal component analysis for embedding words into a vector space. The generalization is made in two major levels. The first is to generalize the concept of the corpus as a counting process which is defined by three key elements vocabulary set, feature (annotation) set, and context. This generalization enables the principal word embedding method to generate word vectors with regar…

    Submitted 9 July, 2020; originally announced July 2020.

  10. arXiv:2005.12094  [pdf, other]

    cs.CL

    Køpsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding

    Authors: Daniel Hershcovich, Miryam de Lhoneux, Artur Kulmizev, Elham Pejhan, Joakim Nivre

    Abstract: We present Køpsala, the Copenhagen-Uppsala system for the Enhanced Universal Dependencies Shared Task at IWPT 2020. Our system is a pipeline consisting of off-the-shelf models for everything but enhanced graph parsing, and for the latter, a transition-based graph parser adapted from Che et al. (2019). We train a single enhanced parser model per language, using gold sentence splitting and tokenizat…

    Submitted 2 June, 2020; v1 submitted 25 May, 2020; originally announced May 2020.

    Comments: IWPT shared task 2020

  11. arXiv:2004.14096  [pdf, other]

    cs.CL cs.LG

    Do Neural Language Models Show Preferences for Syntactic Formalisms?

    Authors: Artur Kulmizev, Vinit Ravishankar, Mostafa Abdou, Joakim Nivre

    Abstract: Recent work on the interpretability of deep neural language models has concluded that many properties of natural language syntax are encoded in their representational spaces. However, such studies often suffer from limited scope by focusing on a single language and a single linguistic formalism. In this study, we aim to investigate the extent to which the semblance of syntactic structure captured…

    Submitted 29 April, 2020; originally announced April 2020.

    Comments: ACL 2020

  12. arXiv:2004.10643  [pdf, other]

    cs.CL

    Universal Dependencies v2: An Evergrowing Multilingual Treebank Collection

    Authors: Joakim Nivre, Marie-Catherine de Marneffe, Filip Ginter, Jan Hajič, Christopher D. Manning, Sampo Pyysalo, Sebastian Schuster, Francis Tyers, Daniel Zeman

    Abstract: Universal Dependencies is an open community effort to create cross-linguistically consistent treebank annotation for many languages within a dependency-based lexicalist framework. The annotation consists in a linguistically motivated word segmentation; a morphological layer comprising lemmas, universal part-of-speech tags, and standardized morphological features; and a syntactic layer focusing on…

    Submitted 22 April, 2020; originally announced April 2020.

    Comments: LREC 2020

  13. Encoders Help You Disambiguate Word Senses in Neural Machine Translation

    Authors: Gongbo Tang, Rico Sennrich, Joakim Nivre

    Abstract: Neural machine translation (NMT) has achieved new state-of-the-art performance in translating ambiguous words. However, it is still unclear which component dominates the process of disambiguation. In this paper, we explore the ability of NMT encoders and decoders to disambiguate word senses by evaluating hidden states and investigating the distributions of self-attention. We train a classifier to…

    Submitted 6 May, 2020; v1 submitted 30 August, 2019; originally announced August 2019.

    Comments: Update with corrections. Here is the link to the erratum: https://www.aclweb.org/anthology/D19-1149e1.pdf

  14. arXiv:1908.07397  [pdf, other]

    cs.CL

    Deep Contextualized Word Embeddings in Transition-Based and Graph-Based Dependency Parsing -- A Tale of Two Parsers Revisited

    Authors: Artur Kulmizev, Miryam de Lhoneux, Johannes Gontrum, Elena Fano, Joakim Nivre

    Abstract: Transition-based and graph-based dependency parsers have previously been shown to have complementary strengths and weaknesses: transition-based parsers exploit rich structural features but suffer from error propagation, while graph-based parsers benefit from global optimization but have restricted feature scope. In this paper, we show that, even though some details of the picture have changed afte…

    Submitted 27 August, 2019; v1 submitted 20 August, 2019; originally announced August 2019.

    Comments: Accepted at EMNLP 2019

  15. arXiv:1907.08158  [pdf, other]

    cs.CL

    Understanding Neural Machine Translation by Simplification: The Case of Encoder-free Models

    Authors: Gongbo Tang, Rico Sennrich, Joakim Nivre

    Abstract: In this paper, we try to understand neural machine translation (NMT) via simplifying NMT architectures and training encoder-free NMT models. In an encoder-free model, the sums of word embeddings and positional embeddings represent the source. The decoder is a standard Transformer or recurrent neural network that directly attends to embeddings via attention mechanisms. Experimental results show (1)…

    Submitted 18 July, 2019; originally announced July 2019.

    Comments: Accepted by RANLP 2019, camera ready version

  16. arXiv:1907.07950  [pdf, other]

    cs.CL

    What Should/Do/Can LSTMs Learn When Parsing Auxiliary Verb Constructions?

    Authors: Miryam de Lhoneux, Sara Stymne, Joakim Nivre

    Abstract: There is a growing interest in investigating what neural NLP models learn about language. A prominent open question is the question of whether or not it is necessary to model hierarchical structure. We present a linguistic investigation of a neural parser adding insights to this question. We look at transitivity and agreement information of auxiliary verb constructions (AVCs) in comparison to fini…

    Submitted 12 October, 2020; v1 submitted 18 July, 2019; originally announced July 2019.

    Comments: Accepted by the Computational Linguistics journal

  17. arXiv:1902.09781  [pdf, other]

    cs.CL

    Recursive Subtree Composition in LSTM-Based Dependency Parsing

    Authors: Miryam de Lhoneux, Miguel Ballesteros, Joakim Nivre

    Abstract: The need for tree structure modelling on top of sequence modelling is an open issue in neural dependency parsing. We investigate the impact of adding a tree layer on top of a sequential model by recursively composing subtree representations (composition) in a transition-based parser that uses features extracted by a BiLSTM. Composition seems superfluous with such a model, suggesting that BiLSTMs c…

    Submitted 26 February, 2019; originally announced February 2019.

    Comments: Accepted at NAACL 2019

  18. arXiv:1810.07595  [pdf, other]

    cs.CL

    An Analysis of Attention Mechanisms: The Case of Word Sense Disambiguation in Neural Machine Translation

    Authors: Gongbo Tang, Rico Sennrich, Joakim Nivre

    Abstract: Recent work has shown that the encoder-decoder attention mechanisms in neural machine translation (NMT) are different from the word alignment in statistical machine translation. In this paper, we focus on analyzing encoder-decoder attention mechanisms, in the case of word sense disambiguation (WSD) in NMT models. We hypothesize that attention mechanisms pay more attention to context tokens when tr…

    Submitted 17 October, 2018; originally announced October 2018.

    Comments: 10 pages, accepted by WMT 2018

  19. arXiv:1809.02237  [pdf, ps, other]

    cs.CL

    82 Treebanks, 34 Models: Universal Dependency Parsing with Multi-Treebank Models

    Authors: Aaron Smith, Bernd Bohnet, Miryam de Lhoneux, Joakim Nivre, Yan Shao, Sara Stymne

    Abstract: We present the Uppsala system for the CoNLL 2018 Shared Task on universal dependency parsing. Our system is a pipeline consisting of three components: the first performs joint word and sentence segmentation; the second predicts part-of-speech tags and morphological features; the third predicts dependency trees from words and tags. Instead of training a single parsing model for each treebank, we t…

    Submitted 6 September, 2018; originally announced September 2018.

    Comments: Proceedings of the CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies

  20. arXiv:1808.09060  [pdf, other]

    cs.CL

    An Investigation of the Interactions Between Pre-Trained Word Embeddings, Character Models and POS Tags in Dependency Parsing

    Authors: Aaron Smith, Miryam de Lhoneux, Sara Stymne, Joakim Nivre

    Abstract: We provide a comprehensive analysis of the interactions between pre-trained word embeddings, character models and POS tags in a transition-based dependency parser. While previous studies have shown POS information to be less important in the presence of character models, we show that in fact there are complex interactions between all three techniques. In isolation each produces large improvements…

    Submitted 27 August, 2018; originally announced August 2018.

    Comments: EMNLP 2018

  21. arXiv:1807.02974  [pdf, ps, other]

    cs.CL

    Universal Word Segmentation: Implementation and Interpretation

    Authors: Yan Shao, Christian Hardmeier, Joakim Nivre

    Abstract: Word segmentation is a low-level NLP task that is non-trivial for a considerable number of languages. In this paper, we present a sequence tagging framework and apply it to word segmentation for a wide range of languages with different writing systems and typological characteristics. Additionally, we investigate the correlations between various typological factors and word segmentation accuracy. T…

    Submitted 9 July, 2018; originally announced July 2018.

    Journal ref: Transactions of the Association for Computational Linguistics, vol. 6, pp. 421--435, 2018

  22. arXiv:1806.05210  [pdf, ps, other]

    cs.CL

    An Evaluation of Neural Machine Translation Models on Historical Spelling Normalization

    Authors: Gongbo Tang, Fabienne Cap, Eva Pettersson, Joakim Nivre

    Abstract: In this paper, we apply different NMT models to the problem of historical spelling normalization for five languages: English, German, Hungarian, Icelandic, and Swedish. The NMT models are at different levels, have different attention mechanisms, and different neural network architectures. Our results show that NMT models are much better than SMT models in terms of character error rate. The vanilla…

    Submitted 4 August, 2018; v1 submitted 13 June, 2018; originally announced June 2018.

    Comments: 12 pages, accepted by COLING 2018, added subword-level Transformer models

  23. arXiv:1805.05089  [pdf, other]

    cs.CL

    Parser Training with Heterogeneous Treebanks

    Authors: Sara Stymne, Miryam de Lhoneux, Aaron Smith, Joakim Nivre

    Abstract: How to make the most of multiple heterogeneous treebanks when training a monolingual dependency parser is an open question. We start by investigating previously suggested, but little evaluated, strategies for exploiting multiple treebanks based on concatenating training sets, with or without fine-tuning. We go on to propose a new method based on treebank embeddings. We perform experiments for seve…

    Submitted 14 May, 2018; originally announced May 2018.

    Comments: 7 pages. Accepted to ACL 2018, short papers

  24. arXiv:1804.06922  [pdf, other]

    cs.CL

    Sentences with Gapping: Parsing and Reconstructing Elided Predicates

    Authors: Sebastian Schuster, Joakim Nivre, Christopher D. Manning

    Abstract: Sentences with gapping, such as Paul likes coffee and Mary tea, lack an overt predicate to indicate the relation between two or more arguments. Surface syntax representations of such sentences are often produced poorly by parsers, and even if correct, not well suited to downstream natural language understanding tasks such as relation extraction that are typically designed to extract information fr…

    Submitted 18 April, 2018; originally announced April 2018.

    Comments: To be presented at NAACL 2018

    Journal ref: Proceedings of the 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL 2018)

  25. arXiv:1704.01314  [pdf, ps, other]

    cs.CL

    Character-based Joint Segmentation and POS Tagging for Chinese using Bidirectional RNN-CRF

    Authors: Yan Shao, Christian Hardmeier, Jörg Tiedemann, Joakim Nivre

    Abstract: We present a character-based model for joint segmentation and POS tagging for Chinese. The bidirectional RNN-CRF architecture for general sequence tagging is adapted and applied with novel vector representations of Chinese characters that capture rich contextual information and lower-than-character level features. The proposed model is extensively evaluated and compared with a state-of-the-art tag…

    Submitted 12 September, 2017; v1 submitted 5 April, 2017; originally announced April 2017.

    Comments: 10 pages plus 1 page appendix, 3 figures, IJCNLP 2017

  26. arXiv:1603.06503  [pdf, ps, other]

    cs.CL

    Static and Dynamic Feature Selection in Morphosyntactic Analyzers

    Authors: Bernd Bohnet, Miguel Ballesteros, Ryan McDonald, Joakim Nivre

    Abstract: We study the use of greedy feature selection methods for morphosyntactic tagging under a number of different conditions. We compare a static ordering of features to a dynamic ordering based on mutual information statistics, and we apply the techniques to standalone taggers as well as joint systems for tagging and parsing. Experiments on five languages show that feature selection can result in more…

    Submitted 21 March, 2016; originally announced March 2016.