Showing 1–8 of 8 results for author: Nédellec, C

Searching in archive cs.
  1. arXiv:2401.07447  [pdf]

    cs.CL cs.AI

    Taec: a Manually annotated text dataset for trait and phenotype extraction and entity linking in wheat breeding literature

    Authors: Claire Nédellec, Clara Sauvion, Robert Bossy, Mariya Borovikova, Louise Deléger

    Abstract: Wheat varieties show a large diversity of traits and phenotypes. Linking them to genetic variability is essential for shorter and more efficient wheat breeding programs. Newly desirable wheat variety traits include disease resistance to reduce pesticide use, adaptation to climate change, resistance to heat and drought stresses, or low gluten content of grains. Wheat breeding experiments are docume…

    Submitted 14 January, 2024; originally announced January 2024.

    Comments: 17 pages

  2. arXiv:2112.02955  [pdf]

    cs.CL cs.AI cs.LG q-bio.QM

    Does constituency analysis enhance domain-specific pre-trained BERT models for relation extraction?

    Authors: Anfu Tang, Louise Deléger, Robert Bossy, Pierre Zweigenbaum, Claire Nédellec

    Abstract: Recently many studies have been conducted on the topic of relation extraction. The DrugProt track at BioCreative VII provides a manually-annotated corpus for the purpose of the development and evaluation of relation extraction systems, in which interactions between chemicals and genes are studied. We describe the ensemble system that we used for our submission, which combines predictions of fine-t…

    Submitted 25 November, 2021; originally announced December 2021.

    Journal ref: BioCreative VII Challenge Evaluation Workshop, Nov 2021, on-line, Spain

  3. arXiv:2112.02097  [pdf]

    q-bio.OT cs.CL cs.LG q-bio.QM

    Global alignment for relation extraction in Microbiology

    Authors: Anfu Tang, Claire Nédellec, Pierre Zweigenbaum, Louise Deléger, Robert Bossy

    Abstract: We investigate a method to extract relations from texts based on global alignment and syntactic information. Combined with SVM, this method is shown to perform comparably to, or even better than, LSTM on two RE tasks.

    Submitted 25 November, 2021; originally announced December 2021.

    Journal ref: Junior Conference on Data Science and Engineering, Feb 2021, Orsay, France

  4. Building Large Lexicalized Ontologies from Text: a Use Case in Automatic Indexing of Biotechnology Patents

    Authors: Claire Nédellec, Wiktoria Golik, Sophie Aubin, Robert Bossy

    Abstract: This paper presents a tool, TyDI, and methods experimented in the building of a termino-ontology, i.e. a lexicalized ontology aimed at fine-grained indexation for semantic search applications. TyDI provides facilities for knowledge engineers and domain experts to efficiently collaborate to validate, organize and conceptualize corpus extracted terms. A use case on biotechnology patent search demons…

    Submitted 2 October, 2020; originally announced October 2020.

    Journal ref: International Conference on Knowledge Engineering and Knowledge Management. EKAW 2010. Lecture Notes in Computer Science, vol 6317. (pp. 514-523) Springer, Berlin, Heidelberg

  5. arXiv:1805.04107  [pdf]

    q-bio.QM cs.AI

    Text-mining and ontologies: new approaches to knowledge discovery of microbial diversity

    Authors: Claire Nédellec, Robert Bossy, Estelle Chaix, Louise Deléger

    Abstract: Microbiology research has access to a very large amount of public information on the habitats of microorganisms. Many areas of microbiology research use this information, primarily in biodiversity studies. However, the habitat information is expressed in unstructured natural language form, which hinders its exploitation at large scale. It is very common for similar habitats to be described by diff…

    Submitted 31 October, 2018; v1 submitted 10 May, 2018; originally announced May 2018.

    Comments: 5 pages

    Journal ref: Proceedings of the 4th International Microbial Diversity Conference. pp. 221-227, ed. Marco Gobetti. Pub. Simtra. ISBN 978-88-943010-0-7, Bari, October 2017

  6. arXiv:cs/0609137  [pdf]

    cs.AI cs.IR

    Ontologies and Information Extraction

    Authors: Claire Nédellec, Adeline Nazarenko

    Abstract: This report argues that, even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of texts are interpreted with respect to a predefined partial domain model. This report shows that depending on the nature and the depth of the interpretation to be done for extracting the informa…

    Submitted 24 September, 2006; originally announced September 2006.

    ACM Class: H.3.1

    Journal ref: LIPN Internal Report (2005)

  7. arXiv:cs/0609135  [pdf]

    cs.AI cs.IR

    Event-based Information Extraction for the biomedical domain: the Caderige project

    Authors: Erick Alphonse, Sophie Aubin, Philippe Bessières, Gilles Bisson, Thierry Hamon, Sandrine Lagarrigue, Adeline Nazarenko, Alain-Pierre Manine, Claire Nédellec, Mohamed Ould Abdel Vetah, Thierry Poibeau, Davy Weissenbacher

    Abstract: This paper gives an overview of the Caderige project. This project involves teams from different areas (biology, machine learning, natural language processing) in order to develop high-level analysis tools for extracting structured information from biological bibliographical databases, especially Medline. The paper gives an overview of the approach and compares it to the state of the art.

    Submitted 24 September, 2006; originally announced September 2006.

    ACM Class: H.3.1

    Journal ref: Proceedings of the International Joint Workshop on Natural Language Processing in Biomedicine and Its Applications (COLING'04), Suisse (2004) 43-39

  8. arXiv:cs/0606118  [pdf, ps, other]

    cs.CL cs.IR

    Adapting a general parser to a sublanguage

    Authors: Sophie Aubin, Adeline Nazarenko, Claire Nédellec

    Abstract: In this paper, we propose a method to adapt a general parser (Link Parser) to sublanguages, focusing on the parsing of texts in biology. Our main proposal is the use of terminology (identification and analysis of terms) in order to reduce the complexity of the text to be parsed. Several other strategies are explored and finally combined among which text normalization, lexicon and morpho-guessing m…

    Submitted 28 June, 2006; originally announced June 2006.

    ACM Class: H.4

    Journal ref: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP'05) (2005) 89-93