CoastTerm: a Corpus for Multidisciplinary Term Extraction in Coastal Scientific Literature
Authors:
Julien Delaunay,
Hanh Thi Hong Tran,
Carlos-Emiliano González-Gallardo,
Georgeta Bordea,
Mathilde Ducos,
Nicolas Sidere,
Antoine Doucet,
Senja Pollak,
Olivier De Viron
Abstract:
The growing impact of climate change on coastal areas, particularly active but fragile regions, necessitates collaboration among diverse stakeholders and disciplines to formulate effective environmental protection policies. We introduce a novel specialized corpus comprising 2,491 sentences from 410 scientific abstracts concerning coastal areas, for the Automatic Term Extraction (ATE) and Classification (ATC) tasks. Inspired by the ARDI framework, focused on the identification of Actors, Resources, Dynamics and Interactions, we automatically extract domain terms and their distinct roles in the functioning of coastal systems by leveraging monolingual and multilingual transformer models. The evaluation demonstrates consistent results, achieving an F1 score of approximately 80% for automated term extraction and F1 of 70% for extracting terms and their labels. These findings are promising and signify an initial step towards the development of a specialized Knowledge Base dedicated to coastal areas.
Submitted 13 June, 2024;
originally announced June 2024.
A Comprehensive Survey of Document-level Relation Extraction (2016-2023)
Authors:
Julien Delaunay,
Hanh Thi Hong Tran,
Carlos-Emiliano González-Gallardo,
Georgeta Bordea,
Nicolas Sidere,
Antoine Doucet
Abstract:
Document-level relation extraction (DocRE) is an active area of research in natural language processing (NLP) concerned with identifying and extracting relationships between entities beyond sentence boundaries. Compared to the more traditional sentence-level relation extraction, DocRE provides a broader context for analysis and is more challenging because it involves identifying relationships that may span multiple sentences or paragraphs. This task has gained increased interest as a viable solution to build and populate knowledge bases automatically from unstructured large-scale documents (e.g., scientific papers, legal contracts, or news articles), in order to have a better understanding of relationships between entities. This paper aims to provide a comprehensive overview of recent advances in this field, highlighting its different applications in comparison to sentence-level relation extraction.
Submitted 12 October, 2023; v1 submitted 28 September, 2023;
originally announced September 2023.