Showing 1–4 of 4 results for author: Min, K

Searching in archive eess.

  1. arXiv:2311.04468  [pdf]

    eess.IV q-bio.NC

    A human brain atlas of chi-separation for normative iron and myelin distributions

    Authors: Kyeongseon Min, Beomseok Sohn, Woo Jung Kim, Chae Jung Park, Soohwa Song, Dong Hoon Shin, Kyung Won Chang, Na-Young Shin, Minjun Kim, Hyeong-Geol Shin, Phil Hyu Lee, Jongho Lee

    Abstract: Iron and myelin are primary susceptibility sources in the human brain. These substances are essential for healthy brain, and their abnormalities are often related to various neurological disorders. Recently, an advanced susceptibility mapping technique, which is referred to as chi-separation, has been proposed, successfully disentangling paramagnetic iron from diamagnetic myelin. This method opene…

    Submitted 2 April, 2024; v1 submitted 8 November, 2023; originally announced November 2023.

    Comments: 19 pages, 9 figures

  2. arXiv:2306.10608  [pdf, other]

    cs.CV cs.SD eess.AS

    STHG: Spatial-Temporal Heterogeneous Graph Learning for Advanced Audio-Visual Diarization

    Authors: Kyle Min

    Abstract: This report introduces our novel method named STHG for the Audio-Visual Diarization task of the Ego4D Challenge 2023. Our key innovation is that we model all the speakers in a video using a single, unified heterogeneous graph learning framework. Unlike previous approaches that require a separate component solely for the camera wearer, STHG can jointly detect the speech activities of all people inc…

    Submitted 31 October, 2023; v1 submitted 18 June, 2023; originally announced June 2023.

    Comments: Validation report for the Ego4D challenge at CVPR 2023

  3. arXiv:2210.07764  [pdf, other]

    cs.CV cs.SD eess.AS

    Intel Labs at Ego4D Challenge 2022: A Better Baseline for Audio-Visual Diarization

    Authors: Kyle Min

    Abstract: This report describes our approach for the Audio-Visual Diarization (AVD) task of the Ego4D Challenge 2022. Specifically, we present multiple technical improvements over the official baselines. First, we improve the detection performance of the camera wearer's voice activity by modifying the training scheme of its model. Second, we discover that an off-the-shelf voice activity detection model can…

    Submitted 29 October, 2023; v1 submitted 14 October, 2022; originally announced October 2022.

    Comments: Validation report for the Ego4D challenge at ECCV 2022

  4. arXiv:2008.04574  [pdf, other]

    eess.AS cs.LG cs.SD

    Bunched LPCNet : Vocoder for Low-cost Neural Text-To-Speech Systems

    Authors: Ravichander Vipperla, Sangjun Park, Kihyun Choo, Samin Ishtiaq, Kyoungbo Min, Sourav Bhattacharya, Abhinav Mehrotra, Alberto Gil C. P. Ramos, Nicholas D. Lane

    Abstract: LPCNet is an efficient vocoder that combines linear prediction and deep neural network modules to keep the computational complexity low. In this work, we present two techniques to further reduce its complexity, aiming for a low-cost LPCNet vocoder-based neural Text-to-Speech (TTS) System. These techniques are: 1) Sample-bunching, which allows LPCNet to generate more than one audio sample per infe…

    Submitted 11 August, 2020; originally announced August 2020.

    Comments: Interspeech 2020