Showing 1–14 of 14 results for author: Kim, S H

  1. arXiv:2307.16398  [pdf, other]

    eess.AS

    Robust Self Supervised Speech Embeddings for Child-Adult Classification in Interactions involving Children with Autism

    Authors: Rimita Lahiri, Tiantian Feng, Rajat Hebbar, Catherine Lord, So Hyun Kim, Shrikanth Narayanan

    Abstract: We address the problem of detecting who spoke when in child-inclusive spoken interactions i.e., automatic child-adult speaker classification. Interactions involving children are richly heterogeneous due to developmental differences. The presence of neurodiversity e.g., due to Autism, contributes additional variability. We investigate the impact of additional pre-training with more unlabelled child…

    Submitted 31 July, 2023; originally announced July 2023.

  2. arXiv:2304.01576  [pdf, other]

    eess.IV cs.CV cs.LG

    MESAHA-Net: Multi-Encoders based Self-Adaptive Hard Attention Network with Maximum Intensity Projections for Lung Nodule Segmentation in CT Scan

    Authors: Muhammad Usman, Azka Rehman, Abdullah Shahid, Siddique Latif, Shi Sub Byon, Sung Hyun Kim, Tariq Mahmood Khan, Yeong Gil Shin

    Abstract: Accurate lung nodule segmentation is crucial for early-stage lung cancer diagnosis, as it can substantially enhance patient survival rates. Computed tomography (CT) images are widely employed for early diagnosis in lung nodule analysis. However, the heterogeneity of lung nodules, size diversity, and the complexity of the surrounding environment pose challenges for developing robust nodule segmenta…

    Submitted 4 April, 2023; originally announced April 2023.

  3. arXiv:2302.05811  [pdf, other]

    cs.RO eess.SY

    Hierarchical control and learning of a foraging CyberOctopus

    Authors: Chia-Hsien Shih, Noel Naughton, Udit Halder, Heng-Sheng Chang, Seung Hyun Kim, Rhanor Gillette, Prashant G. Mehta, Mattia Gazzola

    Abstract: Inspired by the unique neurophysiology of the octopus, we propose a hierarchical framework that simplifies the coordination of multiple soft arms by decomposing control into high-level decision making, low-level motor activation, and local reflexive behaviors via sensory feedback. When evaluated in the illustrative problem of a model octopus foraging for food, this hierarchical decomposition resul…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: 16 pages, 7 figures

  4. arXiv:2212.07050  [pdf, other]

    cs.LG cs.CV eess.IV

    Significantly Improving Zero-Shot X-ray Pathology Classification via Fine-tuning Pre-trained Image-Text Encoders

    Authors: Jongseong Jang, Daeun Kyung, Seung Hwan Kim, Honglak Lee, Kyunghoon Bae, Edward Choi

    Abstract: Deep neural networks have been successfully adopted to diverse domains including pathology classification based on medical images. However, large-scale and high-quality data to train powerful neural networks are rare in the medical domain as the labeling must be done by qualified experts. Researchers recently tackled this problem with some success by taking advantage of models pre-trained on large…

    Submitted 16 March, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

  5. arXiv:2211.03279  [pdf, other]

    eess.AS cs.SD

    A Context-Aware Computational Approach for Measuring Vocal Entrainment in Dyadic Conversations

    Authors: Rimita Lahiri, Md Nasir, Catherine Lord, So Hyun Kim, Shrikanth Narayanan

    Abstract: Vocal entrainment is a social adaptation mechanism in human interaction, knowledge of which can offer useful insights to an individual's cognitive-behavioral characteristics. We propose a context-aware approach for measuring vocal entrainment in dyadic conversations. We use conformers (a combination of convolutional network and transformer) for capturing both short-term and long-term conversational…

    Submitted 6 November, 2022; originally announced November 2022.

  6. arXiv:2211.00003  [pdf, other]

    eess.IV cs.CV

    MEDS-Net: Self-Distilled Multi-Encoders Network with Bi-Direction Maximum Intensity projections for Lung Nodule Detection

    Authors: Muhammad Usman, Azka Rehman, Abdullah Shahid, Siddique Latif, Shi Sub Byon, Byoung Dai Lee, Sung Hyun Kim, Byung il Lee, Yeong Gil Shin

    Abstract: In this study, we propose a lung nodule detection scheme which fully incorporates the clinic workflow of radiologists. Particularly, we exploit Bi-Directional Maximum intensity projection (MIP) images of various thicknesses (i.e., 3, 5 and 10mm) along with a 3D patch of CT scan, consisting of 10 adjacent slices to feed into self-distillation-based Multi-Encoders Network (MEDS-Net). The proposed ar…

    Submitted 26 December, 2022; v1 submitted 30 October, 2022; originally announced November 2022.

  7. arXiv:2210.03739  [pdf, other]

    eess.IV cs.AI cs.CV

    Dual-Stage Deeply Supervised Attention-based Convolutional Neural Networks for Mandibular Canal Segmentation in CBCT Scans

    Authors: Azka Rehman, Muhammad Usman, Rabeea Jawaid, Amal Muhammad Saleem, Shi Sub Byon, Sung Hyun Kim, Byoung Dai Lee, Byung il Lee, Yeong Gil Shin

    Abstract: Accurate segmentation of mandibular canals in lower jaws is important in dental implantology. Medical experts determine the implant position and dimensions manually from 3D CT images to avoid damaging the mandibular nerve inside the canal. In this paper, we propose a novel dual-stage deep learning-based scheme for the automatic segmentation of the mandibular canal. Particularly, we first enhance t…

    Submitted 2 November, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: 7 Pages

  8. arXiv:2111.09522  [pdf, other]

    math.OC eess.SY

    Park-and-Ride Facility Location Selection under Nested Logit Demand Function

    Authors: Sang Hyun Kim, Sangho Shim

    Abstract: Park-and-ride facilities are car parks where users can transfer to public transportation. Commuters can use P&R facilities or choose to travel by car to their destinations, and individual choice behavior is assumed to follow a logit model. The P&R facility location problem identifies locations for a fixed number of P&R facilities from among potential locations such that the number of users of the…

    Submitted 21 December, 2022; v1 submitted 17 November, 2021; originally announced November 2021.

    Comments: Submitted to a journal for review

  9. arXiv:2007.09635  [pdf, other]

    eess.AS cs.SD

    Meta-learning with Latent Space Clustering in Generative Adversarial Network for Speaker Diarization

    Authors: Monisankha Pal, Manoj Kumar, Raghuveer Peri, Tae Jin Park, So Hyun Kim, Catherine Lord, Somer Bishop, Shrikanth Narayanan

    Abstract: The performance of most speaker diarization systems with x-vector embeddings is both vulnerable to noisy environments and lacks domain robustness. Earlier work on speaker diarization using generative adversarial network (GAN) with an encoder network (ClusterGAN) to project input x-vectors into a latent space has shown promising performance on meeting data. In this paper, we extend the ClusterGAN n…

    Submitted 19 July, 2020; originally announced July 2020.

    Comments: Submitted to IEEE/ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

  10. arXiv:1912.13335  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Volumetric Lung Nodule Segmentation using Adaptive ROI with Multi-View Residual Learning

    Authors: Muhammad Usman, Byoung-Dai Lee, Shi Sub Byon, Sung Hyun Kim, Byung-il Lee

    Abstract: Accurate quantification of pulmonary nodules can greatly assist the early diagnosis of lung cancer, which can enhance patient survival possibilities. A number of nodule segmentation techniques have been proposed, however, all of the existing techniques rely on radiologist 3-D volume of interest (VOI) input or use the constant region of interest (ROI) and only investigate the presence of nodule vox…

    Submitted 3 February, 2020; v1 submitted 31 December, 2019; originally announced December 2019.

    Comments: The manuscript is currently under review and copyright shall be transferred to the publisher upon acceptance

  11. arXiv:1910.11400  [pdf, other]

    eess.AS cs.SD

    Meta-learning for robust child-adult classification from speech

    Authors: Nithin Rao Koluguri, Manoj Kumar, So Hyun Kim, Catherine Lord, Shrikanth Narayanan

    Abstract: Computational modeling of naturalistic conversations in clinical applications has seen growing interest in the past decade. An important use-case involves child-adult interactions within the autism diagnosis and intervention domain. In this paper, we address a specific sub-problem of speaker diarization, namely child-adult speaker classification in such dyadic conversations with specified roles. T…

    Submitted 28 October, 2019; v1 submitted 24 October, 2019; originally announced October 2019.

  12. arXiv:1910.11398  [pdf, ps, other]

    eess.AS cs.SD

    Speaker diarization using latent space clustering in generative adversarial network

    Authors: Monisankha Pal, Manoj Kumar, Raghuveer Peri, Tae Jin Park, So Hyun Kim, Catherine Lord, Somer Bishop, Shrikanth Narayanan

    Abstract: In this work, we propose deep latent space clustering for speaker diarization using generative adversarial network (GAN) backprojection with the help of an encoder network. The proposed diarization system is trained jointly with GAN loss, latent variable recovery loss, and a clustering-specific loss. It uses x-vector speaker embeddings at the input, while the latent variables are sampled from a co…

    Submitted 24 October, 2019; originally announced October 2019.

    Comments: Submitted to ICASSP 2020

  13. Robust Translational Force Control of Multi-Rotor UAV for Precise Acceleration Tracking

    Authors: Seung Jae Lee, Seung Hyun Kim, H. Jin Kim

    Abstract: In this paper, we introduce a translational force control method with disturbance observer (DOB)-based force disturbance cancellation for precise three-dimensional acceleration control of a multi-rotor UAV. The acceleration control of the multi-rotor requires conversion of the desired acceleration signal to the desired roll, pitch, and total thrust. But because the attitude dynamics and the thrust…

    Submitted 14 August, 2019; originally announced August 2019.

    Comments: 11 pages, 14 figures, Accepted in the T-ASE Journal on Aug. 10th, 2019

  14. arXiv:1301.3535  [pdf, other]

    eess.SY cs.AI

    Airport Gate Scheduling for Passengers, Aircraft, and Operation

    Authors: Sang Hyun Kim, Eric Feron, John-Paul Clarke, Aude Marzuoli, Daniel Delahaye

    Abstract: Passengers' experience is becoming a key metric to evaluate the air transportation system's performance. Efficient and robust tools to handle airport operations are needed along with a better understanding of passengers' interests and concerns. Among various airport operations, this paper studies airport gate scheduling for improved passengers' experience. Three objectives accounting for passenger…

    Submitted 15 January, 2013; originally announced January 2013.

    Comments: This paper is submitted to the tenth USA/Europe ATM 2013 seminar