Showing 1–9 of 9 results for author: González-Gallardo, C

Searching in archive cs.
  1. arXiv:2406.09128  [pdf, other]

    cs.CL

    CoastTerm: a Corpus for Multidisciplinary Term Extraction in Coastal Scientific Literature

    Authors: Julien Delaunay, Hanh Thi Hong Tran, Carlos-Emiliano González-Gallardo, Georgeta Bordea, Mathilde Ducos, Nicolas Sidere, Antoine Doucet, Senja Pollak, Olivier De Viron

    Abstract: The growing impact of climate change on coastal areas, particularly active but fragile regions, necessitates collaboration among diverse stakeholders and disciplines to formulate effective environmental protection policies. We introduce a novel specialized corpus comprising 2,491 sentences from 410 scientific abstracts concerning coastal areas, for the Automatic Term Extraction (ATE) and Classific…

    Submitted 13 June, 2024; originally announced June 2024.

  2. arXiv:2309.16396  [pdf, other]

    cs.CL

    A Comprehensive Survey of Document-level Relation Extraction (2016-2023)

    Authors: Julien Delaunay, Hanh Thi Hong Tran, Carlos-Emiliano González-Gallardo, Georgeta Bordea, Nicolas Sidere, Antoine Doucet

    Abstract: Document-level relation extraction (DocRE) is an active area of research in natural language processing (NLP) concerned with identifying and extracting relationships between entities beyond sentence boundaries. Compared to the more traditional sentence-level relation extraction, DocRE provides a broader context for analysis and is more challenging because it involves identifying relationships that…

    Submitted 12 October, 2023; v1 submitted 28 September, 2023; originally announced September 2023.

    ACM Class: A.1

  3. arXiv:2303.17322  [pdf, other]

    cs.DL cs.CL cs.IR

    Yes but.. Can ChatGPT Identify Entities in Historical Documents?

    Authors: Carlos-Emiliano González-Gallardo, Emanuela Boros, Nancy Girdhar, Ahmed Hamdi, Jose G. Moreno, Antoine Doucet

    Abstract: Large language models (LLMs) have been leveraged for several years now, obtaining state-of-the-art performance in recognizing entities from modern documents. For the last few months, the conversational agent ChatGPT has "prompted" a lot of interest in the scientific community and public due to its capacity of generating plausible-sounding answers. In this paper, we explore this ability by probing…

    Submitted 30 March, 2023; originally announced March 2023.

    Comments: 5 pages, accepted to JCDL2023

  4. arXiv:2004.06747  [pdf, other]

    cs.IR cs.CL

    Extending Text Informativeness Measures to Passage Interestingness Evaluation (Language Model vs. Word Embedding)

    Authors: Carlos-Emiliano González-Gallardo, Eric SanJuan, Juan-Manuel Torres-Moreno

    Abstract: Standard informativeness measures used to evaluate Automatic Text Summarization mostly rely on n-gram overlap between the automatic summary and the reference summaries. These measures differ from the metric they use (cosine, ROUGE, Kullback-Leibler, Logarithm Similarity, etc.) and the bag of terms they consider (single words, word n-grams, entities, nuggets, etc.). Recent word embedding approa…

    Submitted 14 April, 2020; originally announced April 2020.

  5. arXiv:2001.07098  [pdf, other]

    cs.CL cs.IR

    Audio Summarization with Audio Features and Probability Distribution Divergence

    Authors: Carlos-Emiliano González-Gallardo, Romain Deveaud, Eric SanJuan, Juan-Manuel Torres-Moreno

    Abstract: The automatic summarization of multimedia sources is an important task that facilitates the understanding of an individual by condensing the source while maintaining relevant information. In this paper we focus on audio summarization based on audio features and the probability of distribution divergence. Our method, based on an extractive summarization approach, aims to select the most relevant se…

    Submitted 2 April, 2020; v1 submitted 20 January, 2020; originally announced January 2020.

    Comments: 20th International Conference on Computational Linguistics and Intelligent Text Processing

  6. arXiv:1809.00994  [pdf, other]

    cs.CL

    Étude de l'informativité des transcriptions : une approche basée sur le résumé automatique

    Authors: Carlos-Emiliano González-Gallardo, Malek Hajjem, Eric SanJuan, Juan-Manuel Torres-Moreno

    Abstract: In this paper we propose a new approach to evaluate the informativeness of transcriptions coming from Automatic Speech Recognition systems. This approach, based on the notion of informativeness, is focused on the framework of Automatic Text Summarization performed over these transcriptions. At first glance we estimate the informative content of the various automatic transcriptions, then we explo…

    Submitted 4 September, 2018; originally announced September 2018.

    Comments: in French, 15e Conférence en Recherche d'Information et Applications (CORIA)

  7. arXiv:1808.08850  [pdf, ps, other]

    cs.CL

    WiSeBE: Window-based Sentence Boundary Evaluation

    Authors: Carlos-Emiliano González-Gallardo, Juan-Manuel Torres-Moreno

    Abstract: Sentence Boundary Detection (SBD) has been a major research topic since Automatic Speech Recognition transcripts have been used for further Natural Language Processing tasks like Part of Speech Tagging, Question Answering or Automatic Summarization. But what about evaluation? Do standard evaluation metrics like precision, recall, F-score or classification error; and more important, evaluating an a…

    Submitted 27 August, 2018; originally announced August 2018.

    Comments: In proceedings of the 17th Mexican International Conference on Artificial Intelligence (MICAI), 2018

  8. arXiv:1802.04559  [pdf, other]

    cs.CL

    Sentence Boundary Detection for French with Subword-Level Information Vectors and Convolutional Neural Networks

    Authors: Carlos-Emiliano González-Gallardo, Juan-Manuel Torres-Moreno

    Abstract: In this work we tackle the problem of sentence boundary detection applied to French as a binary classification task ("sentence boundary" or "not sentence boundary"). We combine convolutional neural networks with subword-level information vectors, which are word embedding representations learned from Wikipedia that take advantage of the words' morphology; so each word is represented as a bag of thei…

    Submitted 13 February, 2018; originally announced February 2018.

    Comments: In proceedings of the International Conference on Natural Language, Signal and Speech Processing (ICNLSSP) 2017

  9. arXiv:1702.06467  [pdf, other]

    cs.IR cs.CL cs.SI

    Efficient Social Network Multilingual Classification using Character, POS n-grams and Dynamic Normalization

    Authors: Carlos-Emiliano González-Gallardo, Juan-Manuel Torres-Moreno, Azucena Montes Rendón, Gerardo Sierra

    Abstract: In this paper we describe a dynamic normalization process applied to social network multilingual documents (Facebook and Twitter) to improve the performance of the Author profiling task for short texts. After the normalization process, n-grams of characters and n-grams of POS tags are obtained to extract all the possible stylistic information encoded in the documents (emoticons, character floodi…

    Submitted 21 February, 2017; originally announced February 2017.

    Comments: 8 pages, 6 figures, Conference paper

    Journal ref: Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management, Vol 1: KDIR, 307-314, 2016, Porto, Portugal