
Showing 1–4 of 4 results for author: Demir, K C

  1. arXiv:2406.14576  [pdf, other]

    eess.AS

    Towards Intelligent Speech Assistants in Operating Rooms: A Multimodal Model for Surgical Workflow Analysis

    Authors: Kubilay Can Demir, Belen Lojo Rodriguez, Tobias Weise, Andreas Maier, Seung Hee Yang

    Abstract: To develop intelligent speech assistants and integrate them seamlessly with intra-operative decision-support frameworks, accurate and efficient surgical phase recognition is a prerequisite. In this study, we propose a multimodal framework based on Gated Multimodal Units (GMU) and Multi-Stage Temporal Convolutional Networks (MS-TCN) to recognize surgical phases of port-catheter placement operations…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: 5 Pages, Interspeech 2024

    MSC Class: 00b20

  2. arXiv:2206.12320  [pdf, other]

    cs.SD cs.AI eess.AS

    PoCaP Corpus: A Multimodal Dataset for Smart Operating Room Speech Assistant using Interventional Radiology Workflow Analysis

    Authors: Kubilay Can Demir, Matthias May, Axel Schmid, Michael Uder, Katharina Breininger, Tobias Weise, Andreas Maier, Seung Hee Yang

    Abstract: This paper presents a new multimodal interventional radiology dataset, called PoCaP (Port Catheter Placement) Corpus. This corpus consists of speech and audio signals in German, X-ray images, and system commands collected from 31 PoCaP interventions by six surgeons with average duration of 81.4 $\pm$ 41.0 minutes. The corpus aims to provide a resource for developing a smart speech assistant in ope…

    Submitted 24 June, 2022; originally announced June 2022.

    Comments: 8 pages, 4 figures, Text, Speech and Dialogue 2022 Conference

    MSC Class: 00b20

  3. arXiv:2204.04016  [pdf, other]

    eess.AS cs.CL cs.LG cs.SD q-bio.QM

    Disentangled Latent Speech Representation for Automatic Pathological Intelligibility Assessment

    Authors: Tobias Weise, Philipp Klumpp, Kubilay Can Demir, Andreas Maier, Elmar Noeth, Bjoern Heismann, Maria Schuster, Seung Hee Yang

    Abstract: Speech intelligibility assessment plays an important role in the therapy of patients suffering from pathological speech disorders. Automatic and objective measures are desirable to assist therapists in their traditionally subjective and labor-intensive assessments. In this work, we investigate a novel approach for obtaining such a measure using the divergence in disentangled latent speech represen…

    Submitted 27 June, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

    Comments: Submitted and Accepted at INTERSPEECH2022

  4. arXiv:2102.01746  [pdf, other]

    eess.AS eess.SP

    Inference of the Selective Auditory Attention using Sequential LMMSE Estimation

    Authors: Ivine Kuruvila, Kubilay Can Demir, Eghart Fischer, Ulrich Hoppe

    Abstract: Attentive listening in a multispeaker environment such as a cocktail party requires suppression of the interfering speakers and the noise around. People with normal hearing perform remarkably well in such situations. Analysis of the cortical signals using electroencephalography (EEG) has revealed that the EEG signals track the envelope of the attended speech stronger than that of the interfering s…

    Submitted 2 February, 2021; originally announced February 2021.

    Comments: 12 pages, 13 figures