Showing 1–2 of 2 results for author: Herve, N

  1. arXiv:2207.01893  [pdf, other]

    cs.CL

    ASR-Generated Text for Language Model Pre-training Applied to Speech Tasks

    Authors: Valentin Pelloin, Franck Dary, Nicolas Herve, Benoit Favre, Nathalie Camelin, Antoine Laurent, Laurent Besacier

    Abstract: We aim at improving spoken language modeling (LM) using a very large amount of automatically transcribed speech. We leverage the INA (French National Audiovisual Institute) collection and obtain 19GB of text after applying ASR on 350,000 hours of diverse TV shows. From this, spoken language models are trained either by fine-tuning an existing LM (FlauBERT) or through training an LM from scratch. New…

    Submitted 5 July, 2022; originally announced July 2022.

    Comments: Interspeech 2022 (Camera Ready)

  2. arXiv:2001.04139  [pdf, ps, other]

    cs.IR cs.SI

    Représentations lexicales pour la détection non supervisée d'événements dans un flux de tweets : étude sur des corpus français et anglais

    Authors: Béatrice Mazoyer, Nicolas Hervé, Céline Hudelot, Julia Cage

    Abstract: In this work, we evaluate the performance of recent text embeddings for the automatic detection of events in a stream of tweets. We model this task as a dynamic clustering problem. Our experiments are conducted on a publicly available corpus of tweets in English and on a similar dataset in French annotated by our team. We show that recent techniques based on deep neural networks (ELMo, Universal Se…

    Submitted 13 January, 2020; originally announced January 2020.

    Comments: In French. Extraction et Gestion des Connaissances, EGC 2020, Jan 2020, Brussels, Belgium