Showing 1–9 of 9 results for author: Constant, M

  1. arXiv:2309.08698

    cs.AI cs.LG

    Modelling Irregularly Sampled Time Series Without Imputation

    Authors: Rohit Agarwal, Aman Sinha, Dilip K. Prasad, Marianne Clausel, Alexander Horsch, Mathieu Constant, Xavier Coubez

    Abstract: Modelling irregularly-sampled time series (ISTS) is challenging because of missing values. Most existing methods focus on handling ISTS by converting irregularly sampled data into regularly sampled data via imputation. These models assume an underlying missing mechanism leading to unwanted bias and sub-optimal performance. We present SLAN (Switch LSTM Aggregate Network), which utilizes a pack of L…

    Submitted 15 September, 2023; originally announced September 2023.

  2. arXiv:2206.03529

    cs.CL

    How to Dissect a Muppet: The Structure of Transformer Embedding Spaces

    Authors: Timothee Mickus, Denis Paperno, Mathieu Constant

    Abstract: Pretrained embeddings based on the Transformer architecture have taken the NLP community by storm. We show that they can mathematically be reframed as a sum of vector factors and showcase how to use this reframing to study the impact of each component. We provide evidence that multi-head attentions and feed-forwards are not equally useful in all downstream applications, as well as a quantitative o…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: Accepted at TACL (pre-MIT Press publication version)

  3. arXiv:2205.13858

    cs.CL

    Semeval-2022 Task 1: CODWOE -- Comparing Dictionaries and Word Embeddings

    Authors: Timothee Mickus, Kees van Deemter, Mathieu Constant, Denis Paperno

    Abstract: Word embeddings have advanced the state of the art in NLP across numerous tasks. Understanding the contents of dense neural representations is of utmost interest to the computational semantics community. We propose to focus on relating these opaque word vectors with human-readable definitions, as found in dictionaries. This problem naturally divides into two subtasks: converting definitions into e…

    Submitted 27 May, 2022; originally announced May 2022.

  4. arXiv:2108.07708

    cs.CL

    A Game Interface to Study Semantic Grounding in Text-Based Models

    Authors: Timothee Mickus, Mathieu Constant, Denis Paperno

    Abstract: Can language models learn grounded representations from text distribution alone? This question is both central and recurrent in natural language processing; authors generally agree that grounding requires more than textual distribution. We propose to experimentally test this claim: if any two words have different meanings and yet cannot be distinguished from distribution alone, then grounding is o…

    Submitted 17 August, 2021; originally announced August 2021.

  5. What do you mean, BERT? Assessing BERT as a Distributional Semantics Model

    Authors: Timothee Mickus, Denis Paperno, Mathieu Constant, Kees van Deemter

    Abstract: Contextualized word embeddings, i.e. vector representations for words in context, are naturally seen as an extension of previous noncontextual distributional semantic models. In this work, we focus on BERT, a deep neural network that produces contextualized embeddings and has set the state-of-the-art in several semantic tasks, and study the semantic coherence of its embedding space. While showing…

    Submitted 8 May, 2020; v1 submitted 13 November, 2019; originally announced November 2019.

    Journal ref: Proceedings of the Society for Computation in Linguistics: Vol. 3 (2020), Article 34

  6. arXiv:1911.05715

    cs.CL

    Mark my Word: A Sequence-to-Sequence Approach to Definition Modeling

    Authors: Timothee Mickus, Denis Paperno, Mathieu Constant

    Abstract: Defining words in a textual context is a useful task both for practical purposes and for gaining insight into distributed word representations. Building on the distributional hypothesis, we argue here that the most natural formalization of definition modeling is to treat it as a sequence-to-sequence task, rather than a word-to-sequence task: given an input sequence with a highlighted word, generat…

    Submitted 13 November, 2019; originally announced November 2019.

    Journal ref: Proceedings of the First NLPL Workshop on Deep Learning for Natural Language Processing, 30 September, 2019, University of Turku, Turku, Finland

  7. arXiv:1404.1872

    cs.CL

    Intégration des données d'un lexique syntaxique dans un analyseur syntaxique probabiliste

    Authors: Anthony Sigogne, Matthieu Constant, Eric Laporte

    Abstract: This article reports the evaluation of the integration of data from a syntactic-semantic lexicon, the Lexicon-Grammar of French, into a syntactic parser. We show that by changing the set of labels for verbs and predicational nouns, we can improve the performance on French of a non-lexicalized probabilistic parser.

    Submitted 7 April, 2014; originally announced April 2014.

    Comments: in French

    Journal ref: Penser le Lexique-Grammaire. Perspectives actuelles, Fryni Kakoyianni-Doa (Ed.) (2014) 505-516

  8. arXiv:1005.5596

    cs.CL

    A generic tool to generate a lexicon for NLP from Lexicon-Grammar tables

    Authors: Matthieu Constant, Elsa Tolone

    Abstract: Lexicon-Grammar tables constitute a large-coverage syntactic lexicon but they cannot be directly used in Natural Language Processing (NLP) applications because they sometimes rely on implicit information. In this paper, we introduce LGExtract, a generic tool for generating a syntactic lexicon for NLP from the Lexicon-Grammar tables. It is based on a global table that contains undefined information…

    Submitted 31 May, 2010; originally announced May 2010.

    Journal ref: Actes du 27e Colloque international sur le lexique et la grammaire (L'Aquila, 10-13 septembre 2008). Seconde partie, Michele De Gioia (Ed.) (2010) pages 79-93

  9. arXiv:0711.3691

    cs.CL

    Outilex, plate-forme logicielle de traitement de textes écrits

    Authors: Olivier Blanc, Matthieu Constant, Eric Laporte

    Abstract: The Outilex software platform, which will be made available to research, development and industry, comprises software components implementing all the fundamental operations of written text processing: processing without lexicons, exploitation of lexicons and grammars, language resource management. All data are structured in XML formats, and also in more compact formats, either readable or binary…

    Submitted 27 November, 2007; v1 submitted 23 November, 2007; originally announced November 2007.

    Journal ref: In Verbum ex machina. Proceedings of TALN - Outilex, plate-forme logicielle de traitement de textes écrits, Louvain, Belgium (2006)