Showing 1–43 of 43 results for author: Kim, S H

Searching in archive cs.
  1. arXiv:2406.10996  [pdf, other]

    cs.CL

    THEANINE: Revisiting Memory Management in Long-term Conversations with Timeline-augmented Response Generation

    Authors: Seo Hyun Kim, Kai Tzu-iunn Ong, Taeyoon Kwon, Namyoung Kim, Keummin Ka, SeongHyeon Bae, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo

    Abstract: Large language models (LLMs) are capable of processing lengthy dialogue histories during prolonged interaction with users without additional memory modules; however, their responses tend to overlook or incorrectly recall information from the past. In this paper, we revisit memory-augmented response generation in the era of LLMs. While prior work focuses on getting rid of outdated memories, we argu…

    Submitted 16 June, 2024; originally announced June 2024.

    Comments: Under Review

  2. arXiv:2403.04787  [pdf, other]

    cs.CL cs.AI

    Ever-Evolving Memory by Blending and Refining the Past

    Authors: Seo Hyun Kim, Keummin Ka, Yohan Jo, Seung-won Hwang, Dongha Lee, Jinyoung Yeo

    Abstract: For a human-like chatbot, constructing a long-term memory is crucial. However, current large language models often lack this capability, leading to instances of missing important user information or redundantly asking for the same information, thereby diminishing conversation quality. To effectively construct memory, it is crucial to seamlessly connect past and present information, while also poss…

    Submitted 7 April, 2024; v1 submitted 3 March, 2024; originally announced March 2024.

    Comments: 17 pages, 4 figures, 7 tables

  3. arXiv:2312.13822  [pdf, other]

    cs.CV

    Universal Noise Annotation: Unveiling the Impact of Noisy annotation on Object Detection

    Authors: Kwangrok Ryoo, Yeonsik Jo, Seungjun Lee, Mira Kim, Ahra Jo, Seung Hwan Kim, Seungryong Kim, Soonyoung Lee

    Abstract: For object detection task with noisy labels, it is important to consider not only categorization noise, as in image classification, but also localization noise, missing annotations, and bogus bounding boxes. However, previous studies have only addressed certain types of noise (e.g., localization or categorization). In this paper, we propose Universal-Noise Annotation (UNA), a more practical settin…

    Submitted 21 December, 2023; originally announced December 2023.

    Comments: appendix and code : https://github.com/Ryoo72/UNA

  4. arXiv:2312.12661  [pdf, other]

    cs.CV

    Misalign, Contrast then Distill: Rethinking Misalignments in Language-Image Pretraining

    Authors: Bumsoo Kim, Yeonsik Jo, Jinhyung Kim, Seung Hwan Kim

    Abstract: Contrastive Language-Image Pretraining has emerged as a prominent approach for training vision and text encoders with uncurated image-text pairs from the web. To enhance data-efficiency, recent efforts have introduced additional supervision terms that involve random-augmented views of the image. However, since the image augmentation process is unaware of its text counterpart, this procedure could…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: ICCV 2023

  5. arXiv:2312.12659  [pdf, other]

    cs.CV

    Expediting Contrastive Language-Image Pretraining via Self-distilled Encoders

    Authors: Bumsoo Kim, Jinhyung Kim, Yeonsik Jo, Seung Hwan Kim

    Abstract: Recent advances in vision language pretraining (VLP) have been largely attributed to the large-scale data collected from the web. However, uncurated dataset contains weakly correlated image-text pairs, causing data inefficiency. To address the issue, knowledge distillation have been explored at the expense of extra image and text momentum encoders to generate teaching signals for misaligned image-…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: AAAI 2024

  6. arXiv:2310.15263  [pdf, other]

    q-bio.NC cs.LG

    One-hot Generalized Linear Model for Switching Brain State Discovery

    Authors: Chengrui Li, Soon Ho Kim, Chris Rodgers, Hannah Choi, Anqi Wu

    Abstract: Exposing meaningful and interpretable neural interactions is critical to understanding neural circuits. Inferred neural interactions from neural signals primarily reflect functional interactions. In a long experiment, subject animals may experience different stages defined by the experiment, stimuli, or behavioral states, and hence functional interactions can change over time. To model dynamically…

    Submitted 23 October, 2023; originally announced October 2023.

  7. arXiv:2310.08221  [pdf, other]

    cs.CL cs.AI cs.LG

    SimCKP: Simple Contrastive Learning of Keyphrase Representations

    Authors: Minseok Choi, Chaeheon Gwak, Seho Kim, Si Hyeong Kim, Jaegul Choo

    Abstract: Keyphrase generation (KG) aims to generate a set of summarizing words or phrases given a source document, while keyphrase extraction (KE) aims to identify them from the text. Because the search space is much smaller in KE, it is often combined with KG to predict keyphrases that may or may not exist in the corresponding document. However, current unified approaches adopt sequence labeling and maxim…

    Submitted 12 October, 2023; originally announced October 2023.

    Comments: Accepted to Findings of EMNLP 2023

  8. arXiv:2309.01961  [pdf, other]

    cs.CV

    NICE: CVPR 2023 Challenge on Zero-shot Image Captioning

    Authors: Taehoon Kim, Pyunghwan Ahn, Sangyun Kim, Sihaeng Lee, Mark Marsden, Alessandra Sala, Seung Hwan Kim, Bohyung Han, Kyoung Mu Lee, Honglak Lee, Kyounghoon Bae, Xiangyu Wu, Yi Gao, Hailiang Zhang, Yang Yang, Weili Guo, Jianfeng Lu, Youngtaek Oh, Jae Won Cho, Dong-Jin Kim, In So Kweon, Junmo Kim, Wooyoung Kang, Won Young Jhoo, Byungseok Roh, et al. (17 additional authors not shown)

    Abstract: In this report, we introduce NICE (New frontiers for zero-shot Image Captioning Evaluation) project and share the results and outcomes of 2023 challenge. This project is designed to challenge the computer vision community to develop robust image captioning models that advance the state-of-the-art both in terms of accuracy and fairness. Through the challenge, the image captioning models were tested…

    Submitted 10 September, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

    Comments: Tech report, project page https://nice.lgresearch.ai/

  9. arXiv:2308.07575  [pdf, other]

    cs.CV cs.AI cs.LG

    Story Visualization by Online Text Augmentation with Context Memory

    Authors: Daechul Ahn, Daneul Kim, Gwangmo Song, Seung Hwan Kim, Honglak Lee, Dongyeop Kang, Jonghyun Choi

    Abstract: Story visualization (SV) is a challenging text-to-image generation task for the difficulty of not only rendering visual details from the text descriptions but also encoding a long-term context across multiple sentences. While prior efforts mostly focus on generating a semantically relevant image for each sentence, encoding a context spread across the given paragraph to generate contextually convin…

    Submitted 19 August, 2023; v1 submitted 15 August, 2023; originally announced August 2023.

    Comments: ICCV 2023, Project page: https://dcahn12.github.io/projects/CMOTA/

  10. arXiv:2305.16713  [pdf, other]

    cs.CV

    ReConPatch : Contrastive Patch Representation Learning for Industrial Anomaly Detection

    Authors: Jeeho Hyun, Sangyun Kim, Giyoung Jeon, Seung Hwan Kim, Kyunghoon Bae, Byung Jun Kang

    Abstract: Anomaly detection is crucial to the advanced identification of product defects such as incorrect parts, misaligned components, and damages in industrial manufacturing. Due to the rare observations and unknown types of defects, anomaly detection is considered to be challenging in machine learning. To overcome this difficulty, recent approaches utilize the common visual representations pre-trained f…

    Submitted 10 January, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: Accepted on WACV 2024

  11. arXiv:2304.01576  [pdf, other]

    eess.IV cs.CV cs.LG

    MESAHA-Net: Multi-Encoders based Self-Adaptive Hard Attention Network with Maximum Intensity Projections for Lung Nodule Segmentation in CT Scan

    Authors: Muhammad Usman, Azka Rehman, Abdullah Shahid, Siddique Latif, Shi Sub Byon, Sung Hyun Kim, Tariq Mahmood Khan, Yeong Gil Shin

    Abstract: Accurate lung nodule segmentation is crucial for early-stage lung cancer diagnosis, as it can substantially enhance patient survival rates. Computed tomography (CT) images are widely employed for early diagnosis in lung nodule analysis. However, the heterogeneity of lung nodules, size diversity, and the complexity of the surrounding environment pose challenges for developing robust nodule segmenta…

    Submitted 4 April, 2023; originally announced April 2023.

  12. arXiv:2303.09917  [pdf, other]

    cs.CV

    Vision Transformer for Action Units Detection

    Authors: Tu Vu, Van Thong Huynh, Soo Hyung Kim

    Abstract: Facial Action Units detection (FAUs) represents a fine-grained classification problem that involves identifying different units on the human face, as defined by the Facial Action Coding System. In this paper, we present a simple yet efficient Vision Transformer-based approach for addressing the task of Action Units (AU) detection in the context of Affective Behavior Analysis in-the-wild (ABAW) com…

    Submitted 20 March, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

    Comments: Will be updated

  13. arXiv:2302.05811  [pdf, other]

    cs.RO eess.SY

    Hierarchical control and learning of a foraging CyberOctopus

    Authors: Chia-Hsien Shih, Noel Naughton, Udit Halder, Heng-Sheng Chang, Seung Hyun Kim, Rhanor Gillette, Prashant G. Mehta, Mattia Gazzola

    Abstract: Inspired by the unique neurophysiology of the octopus, we propose a hierarchical framework that simplifies the coordination of multiple soft arms by decomposing control into high-level decision making, low-level motor activation, and local reflexive behaviors via sensory feedback. When evaluated in the illustrative problem of a model octopus foraging for food, this hierarchical decomposition resul…

    Submitted 11 February, 2023; originally announced February 2023.

    Comments: 16 pages, 7 figures

  14. arXiv:2302.02506  [pdf]

    cs.LG cs.AI

    Generating Dispatching Rules for the Interrupting Swap-Allowed Blocking Job Shop Problem Using Graph Neural Network and Reinforcement Learning

    Authors: Vivian W. H. Wong, Sang Hun Kim, Junyoung Park, Jinkyoo Park, Kincho H. Law

    Abstract: The interrupting swap-allowed blocking job shop problem (ISBJSSP) is a complex scheduling problem that is able to model many manufacturing planning and logistics applications realistically by addressing both the lack of storage capacity and unforeseen production interruptions. Subjected to random disruptions due to machine malfunction or maintenance, industry production settings often choose to ad…

    Submitted 28 September, 2023; v1 submitted 5 February, 2023; originally announced February 2023.

    Comments: 14 pages, 10 figures. Supplementary Material not included

  15. arXiv:2212.07050  [pdf, other]

    cs.LG cs.CV eess.IV

    Significantly Improving Zero-Shot X-ray Pathology Classification via Fine-tuning Pre-trained Image-Text Encoders

    Authors: Jongseong Jang, Daeun Kyung, Seung Hwan Kim, Honglak Lee, Kyunghoon Bae, Edward Choi

    Abstract: Deep neural networks have been successfully adopted to diverse domains including pathology classification based on medical images. However, large-scale and high-quality data to train powerful neural networks are rare in the medical domain as the labeling must be done by qualified experts. Researchers recently tackled this problem with some success by taking advantage of models pre-trained on large…

    Submitted 16 March, 2023; v1 submitted 14 December, 2022; originally announced December 2022.

  16. arXiv:2211.06774  [pdf, other]

    cs.CV cs.CL

    Large-Scale Bidirectional Training for Zero-Shot Image Captioning

    Authors: Taehoon Kim, Mark Marsden, Pyunghwan Ahn, Sangyun Kim, Sihaeng Lee, Alessandra Sala, Seung Hwan Kim

    Abstract: When trained on large-scale datasets, image captioning models can understand the content of images from a general domain but often fail to generate accurate, detailed captions. To improve performance, pretraining-and-finetuning has been a key strategy for image captioning. However, we find that large-scale bidirectional training between image and text enables zero-shot image captioning. In this pa…

    Submitted 1 October, 2023; v1 submitted 12 November, 2022; originally announced November 2022.

    Comments: Arxiv Preprint. Work in progress

  17. arXiv:2211.03279  [pdf, other]

    eess.AS cs.SD

    A Context-Aware Computational Approach for Measuring Vocal Entrainment in Dyadic Conversations

    Authors: Rimita Lahiri, Md Nasir, Catherine Lord, So Hyun Kim, Shrikanth Narayanan

    Abstract: Vocal entrainment is a social adaptation mechanism in human interaction, knowledge of which can offer useful insights to an individual's cognitive-behavioral characteristics. We propose a context-aware approach for measuring vocal entrainment in dyadic conversations. We use conformers (a combination of convolutional network and transformer) for capturing both short-term and long-term conversational…

    Submitted 6 November, 2022; originally announced November 2022.

  18. arXiv:2211.00003  [pdf, other]

    eess.IV cs.CV

    MEDS-Net: Self-Distilled Multi-Encoders Network with Bi-Direction Maximum Intensity projections for Lung Nodule Detection

    Authors: Muhammad Usman, Azka Rehman, Abdullah Shahid, Siddique Latif, Shi Sub Byon, Byoung Dai Lee, Sung Hyun Kim, Byung il Lee, Yeong Gil Shin

    Abstract: In this study, we propose a lung nodule detection scheme which fully incorporates the clinic workflow of radiologists. Particularly, we exploit Bi-Directional Maximum intensity projection (MIP) images of various thicknesses (i.e., 3, 5 and 10mm) along with a 3D patch of CT scan, consisting of 10 adjacent slices to feed into self-distillation-based Multi-Encoders Network (MEDS-Net). The proposed ar…

    Submitted 26 December, 2022; v1 submitted 30 October, 2022; originally announced November 2022.

  19. arXiv:2210.03739  [pdf, other]

    eess.IV cs.AI cs.CV

    Dual-Stage Deeply Supervised Attention-based Convolutional Neural Networks for Mandibular Canal Segmentation in CBCT Scans

    Authors: Azka Rehman, Muhammad Usman, Rabeea Jawaid, Amal Muhammad Saleem, Shi Sub Byon, Sung Hyun Kim, Byoung Dai Lee, Byung il Lee, Yeong Gil Shin

    Abstract: Accurate segmentation of mandibular canals in lower jaws is important in dental implantology. Medical experts determine the implant position and dimensions manually from 3D CT images to avoid damaging the mandibular nerve inside the canal. In this paper, we propose a novel dual-stage deep learning-based scheme for the automatic segmentation of the mandibular canal. Particularly, we first enhance t…

    Submitted 2 November, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: 7 Pages

  20. arXiv:2209.13430  [pdf, other]

    cs.CV cs.LG

    UniCLIP: Unified Framework for Contrastive Language-Image Pre-training

    Authors: Janghyeon Lee, Jongsuk Kim, Hyounguk Shon, Bumsoo Kim, Seung Hwan Kim, Honglak Lee, Junmo Kim

    Abstract: Pre-training vision-language models with contrastive objectives has shown promising results that are both scalable to large uncurated datasets and transferable to many downstream applications. Some following works have targeted to improve data efficiency by adding self-supervision terms, but inter-domain (image-text) contrastive loss and intra-domain (image-image) contrastive loss are defined on i…

    Submitted 31 October, 2022; v1 submitted 27 September, 2022; originally announced September 2022.

    Comments: Neural Information Processing Systems (NeurIPS) 2022

  21. arXiv:2208.08112  [pdf, other]

    cs.LG cs.AI cs.CV

    DLCFT: Deep Linear Continual Fine-Tuning for General Incremental Learning

    Authors: Hyounguk Shon, Janghyeon Lee, Seung Hwan Kim, Junmo Kim

    Abstract: Pre-trained representation is one of the key elements in the success of modern deep learning. However, existing works on continual learning methods have mostly focused on learning models incrementally from scratch. In this paper, we explore an alternative framework to incremental learning where we continually fine-tune the model from a pre-trained representation. Our method takes advantage of line…

    Submitted 17 August, 2022; originally announced August 2022.

    Comments: European Conference on Computer Vision (ECCV) 2022

  22. arXiv:2203.07682  [pdf, other]

    cs.CV

    Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution

    Authors: Jinsu Yoo, Taehoon Kim, Sihaeng Lee, Seung Hwan Kim, Honglak Lee, Tae Hyun Kim

    Abstract: Recent transformer-based super-resolution (SR) methods have achieved promising results against conventional CNN-based methods. However, these approaches suffer from essential shortsightedness created by only utilizing the standard self-attention-based reasoning. In this paper, we introduce an effective hybrid SR network to aggregate enriched features, including local features from CNNs and long-ra…

    Submitted 20 October, 2022; v1 submitted 15 March, 2022; originally announced March 2022.

    Comments: WACV 2023

  23. arXiv:2202.07741  [pdf, other]

    cs.MA

    Disentangling Successor Features for Coordination in Multi-agent Reinforcement Learning

    Authors: Seung Hyun Kim, Neale Van Stralen, Girish Chowdhary, Huy T. Tran

    Abstract: Multi-agent reinforcement learning (MARL) is a promising framework for solving complex tasks with many agents. However, a key challenge in MARL is defining private utility functions that ensure coordination when training decentralized agents. This challenge is especially prevalent in unstructured tasks with sparse rewards and many agents. We show that successor features can help address this chall…

    Submitted 15 February, 2022; originally announced February 2022.

    Comments: The paper is accepted in AAMAS 2022 (International Conference on Autonomous Agents and Multiagent Systems)

  24. arXiv:2112.00343  [pdf, other]

    cs.CV

    Camera Motion Agnostic 3D Human Pose Estimation

    Authors: Seong Hyun Kim, Sunwon Jeong, Sungbum Park, Ju Yong Chang

    Abstract: Although the performance of 3D human pose and shape estimation methods has improved significantly in recent years, existing approaches typically generate 3D poses defined in camera or human-centered coordinate system. This makes it difficult to estimate a person's pure pose and motion in world coordinate system for a video captured using a moving camera. To address this issue, this paper presents…

    Submitted 1 December, 2021; originally announced December 2021.

  25. arXiv:2111.11133  [pdf, other]

    cs.CV cs.CL cs.LG

    L-Verse: Bidirectional Generation Between Image and Text

    Authors: Taehoon Kim, Gwangmo Song, Sihaeng Lee, Sangyun Kim, Yewon Seo, Soonyoung Lee, Seung Hwan Kim, Honglak Lee, Kyunghoon Bae

    Abstract: Far beyond learning long-range interactions of natural language, transformers are becoming the de-facto standard for many vision tasks with their power and scalability. Especially with cross-modal tasks between image and text, vector quantized variational autoencoders (VQ-VAEs) are widely used to make a raw RGB image into a sequence of feature vectors. To better leverage the correlation between im…

    Submitted 6 April, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

    Comments: Accepted to CVPR 2022 as Oral Presentation (18 pages, 14 figures, 4 tables)

  26. arXiv:2110.14874  [pdf, other]

    cs.LG stat.ML

    Sayer: Using Implicit Feedback to Optimize System Policies

    Authors: Mathias Lécuyer, Sang Hoon Kim, Mihir Nanavati, Junchen Jiang, Siddhartha Sen, Amit Sharma, Aleksandrs Slivkins

    Abstract: We observe that many system policies that make threshold decisions involving a resource (e.g., time, memory, cores) naturally reveal additional, or implicit feedback. For example, if a system waits X min for an event to occur, then it automatically learns what would have happened if it waited <X min, because time has a cumulative property. This feedback tells us about alternative decisions, and ca…

    Submitted 28 October, 2021; originally announced October 2021.

  27. arXiv:2109.08372  [pdf, other]

    cs.RO

    A physics-informed, vision-based method to reconstruct all deformation modes in slender bodies

    Authors: Seung Hyun Kim, Heng-Sheng Chang, Chia-Hsien Shih, Naveen Kumar Uppalapati, Udit Halder, Girish Krishnan, Prashant G. Mehta, Mattia Gazzola

    Abstract: This paper is concerned with the problem of estimating (interpolating and smoothing) the shape (pose and the six modes of deformation) of a slender flexible body from multiple camera measurements. This problem is important in both biology, where slender, soft, and elastic structures are ubiquitously encountered across species, and in engineering, particularly in the area of soft robotics. The prop…

    Submitted 17 September, 2021; originally announced September 2021.

    Comments: This work has been submitted to the IEEE RA-L with ICRA 2022 for possible publication. Copyright may be transferred without notice. For associated data and code, see https://github.com/GazzolaLab/BR2-vision-based-smoothing

  28. Learning to schedule job-shop problems: Representation and policy learning using graph neural network and reinforcement learning

    Authors: Junyoung Park, Jaehyeong Chun, Sang Hun Kim, Youngkook Kim, Jinkyoo Park

    Abstract: We propose a framework to learn to schedule a job-shop problem (JSSP) using a graph neural network (GNN) and reinforcement learning (RL). We formulate the scheduling process of JSSP as a sequential decision-making problem with graph representation of the state to consider the structure of JSSP. In solving the formulated problem, the proposed framework employs a GNN to learn that node features that…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: 16 pages, 8 figures

    Journal ref: International Journal of Production Research, Volume 59, 2021 - Issue 11, Pages 3360-3377

  29. arXiv:2008.02043  [pdf, other]

    cs.LG cs.CV stat.ML

    Learning Boost by Exploiting the Auxiliary Task in Multi-task Domain

    Authors: Jonghwa Yim, Sang Hwan Kim

    Abstract: Learning two tasks in a single shared function has some benefits. Firstly by acquiring information from the second task, the shared function leverages useful information that could have been neglected or underestimated in the first task. Secondly, it helps to generalize the function that can be learned using generally applicable information for both tasks. To fully enjoy these benefits, Multi-task…

    Submitted 5 August, 2020; originally announced August 2020.

  30. arXiv:2007.09635  [pdf, other]

    eess.AS cs.SD

    Meta-learning with Latent Space Clustering in Generative Adversarial Network for Speaker Diarization

    Authors: Monisankha Pal, Manoj Kumar, Raghuveer Peri, Tae Jin Park, So Hyun Kim, Catherine Lord, Somer Bishop, Shrikanth Narayanan

    Abstract: The performance of most speaker diarization systems with x-vector embeddings is both vulnerable to noisy environments and lacks domain robustness. Earlier work on speaker diarization using generative adversarial network (GAN) with an encoder network (ClusterGAN) to project input x-vectors into a latent space has shown promising performance on meeting data. In this paper, we extend the ClusterGAN n…

    Submitted 19 July, 2020; originally announced July 2020.

    Comments: Submitted to IEEE/ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

  31. Patterns of population displacement during mega-fires in California detected using Facebook Disaster Maps

    Authors: Shenyue Jia, Seung Hee Kim, Son V. Nghiem, Paul Doherty, Menas Kafatos

    Abstract: Facebook Disaster Maps (FBDM) is the first platform providing analysis-ready population change products derived from crowdsourced data targeting disaster relief practices. We evaluate the representativeness of FBDM data using the Mann-Kendall test and emerging hot and cold spots in an anomaly analysis to reveal the trend, magnitude, and agglomeration of population displacement during the Mendocin…

    Submitted 2 April, 2020; originally announced April 2020.

    Comments: 16 pages with supplemental information

  32. arXiv:1912.13335  [pdf, other]

    eess.IV cs.CV cs.LG stat.ML

    Volumetric Lung Nodule Segmentation using Adaptive ROI with Multi-View Residual Learning

    Authors: Muhammad Usman, Byoung-Dai Lee, Shi Sub Byon, Sung Hyun Kim, Byung-il Lee

    Abstract: Accurate quantification of pulmonary nodules can greatly assist the early diagnosis of lung cancer, which can enhance patient survival possibilities. A number of nodule segmentation techniques have been proposed, however, all of the existing techniques rely on radiologist 3-D volume of interest (VOI) input or use the constant region of interest (ROI) and only investigate the presence of nodule vox…

    Submitted 3 February, 2020; v1 submitted 31 December, 2019; originally announced December 2019.

    Comments: The manuscript is currently under review and copyright shall be transferred to the publisher upon acceptance

  33. arXiv:1910.11400  [pdf, other]

    eess.AS cs.SD

    Meta-learning for robust child-adult classification from speech

    Authors: Nithin Rao Koluguri, Manoj Kumar, So Hyun Kim, Catherine Lord, Shrikanth Narayanan

    Abstract: Computational modeling of naturalistic conversations in clinical applications has seen growing interest in the past decade. An important use-case involves child-adult interactions within the autism diagnosis and intervention domain. In this paper, we address a specific sub-problem of speaker diarization, namely child-adult speaker classification in such dyadic conversations with specified roles. T…

    Submitted 28 October, 2019; v1 submitted 24 October, 2019; originally announced October 2019.

  34. arXiv:1910.11398  [pdf, ps, other]

    eess.AS cs.SD

    Speaker diarization using latent space clustering in generative adversarial network

    Authors: Monisankha Pal, Manoj Kumar, Raghuveer Peri, Tae Jin Park, So Hyun Kim, Catherine Lord, Somer Bishop, Shrikanth Narayanan

    Abstract: In this work, we propose deep latent space clustering for speaker diarization using generative adversarial network (GAN) backprojection with the help of an encoder network. The proposed diarization system is trained jointly with GAN loss, latent variable recovery loss, and a clustering-specific loss. It uses x-vector speaker embeddings at the input, while the latent variables are sampled from a co…

    Submitted 24 October, 2019; originally announced October 2019.

    Comments: Submitted to ICASSP 2020

  35. Robust Translational Force Control of Multi-Rotor UAV for Precise Acceleration Tracking

    Authors: Seung Jae Lee, Seung Hyun Kim, H. Jin Kim

    Abstract: In this paper, we introduce a translational force control method with disturbance observer (DOB)-based force disturbance cancellation for precise three-dimensional acceleration control of a multi-rotor UAV. The acceleration control of the multi-rotor requires conversion of the desired acceleration signal to the desired roll, pitch, and total thrust. But because the attitude dynamics and the thrust…

    Submitted 14 August, 2019; originally announced August 2019.

    Comments: 11 pages, 14 figures, Accepted in the T-ASE Journal on Aug. 10th, 2019

  36. arXiv:1807.08903  [pdf, ps, other]

    cs.IT

    Traffic-Aware Backscatter Communications in Wireless-Powered Heterogeneous Networks

    Authors: Sung Hoon Kim, Dong In Kim

    Abstract: With the emerging Internet-of-Things services, massive machine-to-machine (M2M) communication will be deployed on top of human-to-human (H2H) communication in the near future. Due to the coexistence of M2M and H2H communications, the performance of M2M (i.e., secondary) network depends largely on the H2H (i.e., primary) network. In this paper, we propose ambient backscatter communication for the M…

    Submitted 24 July, 2018; originally announced July 2018.

    Comments: 14 pages, 10 figures

  37. arXiv:1710.03299  [pdf]

    cs.HC cs.CY

    A Review on the Applications of Crowdsourcing in Human Pathology

    Authors: Roshanak Alialy, Sasan Tavakkol, Elham Tavakkol, Amir Ghorbani-Aghbologhi, Alireza Ghaffarieh, Seon Ho Kim, Cyrus Shahabi

    Abstract: The advent of the digital pathology has introduced new avenues of diagnostic medicine. Among them, crowdsourcing has attracted researchers' attention in the recent years, allowing them to engage thousands of untrained individuals in research and diagnosis. While there exist several articles in this regard, prior works have not collectively documented them. We, therefore, aim to review the applicat…

    Submitted 20 November, 2017; v1 submitted 9 October, 2017; originally announced October 2017.

  38. arXiv:1705.02009  [pdf, ps, other]

    cs.IR cs.LG

    On Identifying Disaster-Related Tweets: Matching-based or Learning-based?

    Authors: Hien To, Sumeet Agrawal, Seon Ho Kim, Cyrus Shahabi

    Abstract: Social media such as tweets are emerging as platforms contributing to situational awareness during disasters. Information shared on Twitter by both affected population (e.g., requesting assistance, warning) and those outside the impact zone (e.g., providing assistance) would help first responders, decision makers, and the public to understand the situation first-hand. Effective use of such informa…

    Submitted 4 May, 2017; originally announced May 2017.

  39. Variable-Length Feedback Codes under a Strict Delay Constraint

    Authors: Seong Hwan Kim, Dan Keun Sung, Tho Le-Ngoc

    Abstract: We study variable-length feedback (VLF) codes under a strict delay constraint to maximize their average transmission rate (ATR) in a discrete memoryless channel (DMC) while considering periodic decoding attempts. We first derive a lower bound on the maximum achievable ATR, and confirm that the VLF code can outperform non-feedback codes with a larger delay constraint. We show that for a given decod…

    Submitted 23 February, 2015; originally announced February 2015.

    Comments: 5 pages, 1 figure, Accepted for publication in IEEE Communications Letters

  40. Numerical Analysis of Gate Conflict Duration and Passenger Transit Time in Airport

    Authors: Sang Hyun Kim, Eric Feron

    Abstract: Robustness is as important as efficiency in air transportation. All components in the air traffic system are connected to form an interactive network. So, a disturbance that occurs in one component, for example, a severe delay at an airport, can influence the entire network. Delays are easily propagated between flights through gates, but the propagation can be reduced if gate assignments are robus…

    Submitted 28 August, 2013; originally announced August 2013.

    Comments: Submitted to Transportation Research Part B, and presented at AIAA Guidance, Navigation, and Control Conference in 2011 in part

  41. Impact of Gate Assignment on Gate-Holding Departure Control Strategies

    Authors: Sang Hyun Kim, Eric Feron

    Abstract: Gate holding reduces congestion by reducing the number of aircraft present on the airport surface at any time, while not starving the runway. Because some departing flights are held at gates, there is a possibility that arriving flights cannot access the gates and have to wait until the gates are cleared. This is called a gate conflict. Robust gate assignment is an assignment that minimizes gate c…

    Submitted 14 June, 2013; originally announced June 2013.

    Comments: Submitted to IEEE Transactions on Intelligent Transportation Systems

  42. Valuating Surface Surveillance Technology for Collaborative Multiple-Spot Control of Airport Departure Operations

    Authors: Pierrick Burgain, Sang Hyun Kim, Eric Feron

    Abstract: Airport departure operations are a source of airline delays and passenger frustration. Excessive surface traffic is a cause of increased controller and pilot workload. It is also a source of increased emissions and delays, and does not yield improved runway throughput. Leveraging the extensive past research on airport departure management, this paper explores the environmental and safety benefits…

    Submitted 14 June, 2013; originally announced June 2013.

    Comments: Submitted to IEEE Transactions on Intelligent Transportation Systems. arXiv admin note: substantial text overlap with arXiv:1102.2673

  43. arXiv:1301.3535  [pdf, other]

    eess.SY cs.AI

    Airport Gate Scheduling for Passengers, Aircraft, and Operation

    Authors: Sang Hyun Kim, Eric Feron, John-Paul Clarke, Aude Marzuoli, Daniel Delahaye

    Abstract: Passengers' experience is becoming a key metric to evaluate the air transportation system's performance. Efficient and robust tools to handle airport operations are needed along with a better understanding of passengers' interests and concerns. Among various airport operations, this paper studies airport gate scheduling for improved passengers' experience. Three objectives accounting for passenger…

    Submitted 15 January, 2013; originally announced January 2013.

    Comments: This paper is submitted to the tenth USA/Europe ATM 2013 seminar