Showing 1–5 of 5 results for author: Le-Duc, K

Searching in archive cs.
  1. arXiv:2406.15888  [pdf, other]

    cs.CL cs.AI cs.LG cs.SD eess.AS

    Real-time Speech Summarization for Medical Conversations

    Authors: Khai Le-Duc, Khai-Nguyen Nguyen, Long Vo-Dang, Truong-Son Hy

    Abstract: In doctor-patient conversations, identifying medically relevant information is crucial, posing the need for conversation summarization. In this work, we propose the first deployable real-time speech summarization system for real-world applications in industry, which generates a local summary after every N speech utterances within a conversation and a global summary after the end of a conversation.…

    Submitted 22 June, 2024; originally announced June 2024.

    Comments: Interspeech 2024

  2. arXiv:2406.13337  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD

    Medical Spoken Named Entity Recognition

    Authors: Khai Le-Duc

    Abstract: Spoken Named Entity Recognition (NER) aims to extract named entities from speech and categorize them into types like person, location, organization, etc. In this work, we present VietMed-NER - the first spoken NER dataset in the medical domain. To the best of our knowledge, our real-world dataset is the largest spoken NER dataset in the world in terms of the number of entity types, featuring 18 dist…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: Preprint, 40 pages

  3. arXiv:2404.05659  [pdf]

    cs.CL cs.AI eess.AS

    VietMed: A Dataset and Benchmark for Automatic Speech Recognition of Vietnamese in the Medical Domain

    Authors: Khai Le-Duc

    Abstract: Due to privacy restrictions, there's a shortage of publicly available speech recognition datasets in the medical domain. In this work, we present VietMed - a Vietnamese speech recognition dataset in the medical domain comprising 16h of labeled medical speech, 1000h of unlabeled medical speech and 1200h of unlabeled general-domain speech. To the best of our knowledge, VietMed is by far the world's largest…

    Submitted 28 May, 2024; v1 submitted 8 April, 2024; originally announced April 2024.

    Comments: LREC-COLING 2024, 27 pages

  4. arXiv:2309.15869  [pdf, other]

    cs.CL cs.SD eess.AS

    Unsupervised Pre-Training for Vietnamese Automatic Speech Recognition in the HYKIST Project

    Authors: Khai Le-Duc

    Abstract: In today's interconnected world, moving abroad is increasingly common, whether for employment, refugee resettlement, or other reasons. Language difficulties between natives and immigrants are a common daily issue, especially in the medical domain. This can make it difficult for patients and doctors to communicate during anamnesis or in the emergency room, which compromises patie…

    Submitted 26 September, 2023; originally announced September 2023.

    Comments: Bachelor Thesis

    Journal ref: FH Aachen University of Applied Sciences (2023)

  5. arXiv:2210.13397  [pdf, ps, other]

    cs.CL cs.SD eess.AS

    Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech

    Authors: Christoph Lüscher, Mohammad Zeineldeen, Zijian Yang, Tina Raissi, Peter Vieting, Khai Le-Duc, Weiyue Wang, Ralf Schlüter, Hermann Ney

    Abstract: Language barriers present a great challenge in our increasingly connected and global world. Especially within the medical domain, e.g. hospital or emergency room, communication difficulties and delays may lead to malpractice and non-optimal patient care. In the HYKIST project, we consider patient-physician communication, more specifically between a German-speaking physician and an Arabic- or Vietn…

    Submitted 22 September, 2023; v1 submitted 24 October, 2022; originally announced October 2022.

    Comments: ASR System Paper for HYKIST project