Search | arXiv e-print repository

arXiv:2404.07021 [pdf, other]

A 4x32Gb/s 1.8pJ/bit Collaborative Baud-Rate CDR with Background Eye-Climbing Algorithm and Low-Power Global Clock Distribution

Authors: Jihee Kim, Jia Park, Jiwon Shin, Hanseok Kim, Kahyun Kim, Haengbeom Shin, Ha-Jung Park, Woo-Seok Choi

Abstract: This paper presents design techniques for an energy-efficient multi-lane receiver (RX) with baud-rate clock and data recovery (CDR), which is essential for high-throughput low-latency communication in high-performance computing systems. The proposed low-power global clock distribution not only significantly reduces power consumption across multi-lane RXs but is capable of compensating for the frequency offset without any phase interpolators. To this end, a fractional divider controlled by CDR is placed close to the global phase locked loop. Moreover, in order to address the sub-optimal lock point of conventional baud-rate phase detectors, the proposed CDR employs a background eye-climbing algorithm, which optimizes the sampling phase and maximizes the vertical eye margin (VEM). Fabricated in a 28nm CMOS process, the proposed 4x32Gb/s RX shows a low integrated fractional spur of -40.4dBc at a 2500ppm frequency offset. Furthermore, it improves bit-error-rate performance by increasing the VEM by 17%. The entire RX achieves the energy efficiency of 1.8pJ/bit with the aggregate data rate of 128Gb/s.

Submitted 22 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.
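
The headline figures in the abstract above can be cross-checked with simple arithmetic. The sketch below is only a unit-conversion aid: the function names are hypothetical, and the resulting power figure is derived from the quoted 1.8pJ/bit and 128Gb/s rather than stated in the abstract.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# Function names are illustrative; only the three constants come from the abstract.
LANES = 4
RATE_PER_LANE_GBPS = 32.0
ENERGY_PJ_PER_BIT = 1.8


def aggregate_rate_gbps(lanes, rate_per_lane_gbps):
    """Aggregate data rate of a multi-lane link in Gb/s."""
    return lanes * rate_per_lane_gbps


def power_mw(energy_pj_per_bit, rate_gbps):
    """Power implied by an energy-per-bit figure.

    pJ/bit * Gb/s = 1e-12 J/bit * 1e9 bit/s = 1e-3 W, i.e. the product is in mW.
    """
    return energy_pj_per_bit * rate_gbps


total_rate = aggregate_rate_gbps(LANES, RATE_PER_LANE_GBPS)
print(f"aggregate rate: {total_rate} Gb/s")
print(f"implied RX power: {power_mw(ENERGY_PJ_PER_BIT, total_rate):.1f} mW")
```

Under these numbers the four lanes give 128 Gb/s, matching the abstract, and the energy figure corresponds to roughly 230 mW for the whole receiver.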

arXiv:2404.05119 [pdf, other]

A 0.65-pJ/bit 3.6-TB/s/mm I/O Interface with XTalk Minimizing Affine Signaling for Next-Generation HBM with High Interconnect Density

Authors: Hyunjun Park, Jiwon Shin, Hanseok Kim, Jihee Kim, Haengbeom Shin, Taehoon Kim, Jung-Hun Park, Woo-Seok Choi

Abstract: This paper presents an I/O interface with Xtalk Minimizing Affine Signaling (XMAS), which is designed to support high-speed data transmission in die-to-die communication over silicon interposers or similar high-density interconnects susceptible to crosstalk. The operating principles of XMAS are elucidated through rigorous analyses, and its advantages over existing signaling are validated through numerical experiments. XMAS not only demonstrates exceptional crosstalk removing capabilities but also exhibits robustness against noise, especially simultaneous switching noise. Fabricated in a 28-nm CMOS process, the prototype XMAS transceiver achieves an edge density of 3.6TB/s/mm and an energy efficiency of 0.65pJ/b. Compared to the single-ended signaling, the crosstalk-induced peak-to-peak jitter of the received eye with XMAS is reduced by 75% at 10GS/s/pin data rate, and the horizontal eye opening extends to 0.2UI at a bit error rate < 10$^{-12}$.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2404.02135 [pdf]

Enhancing Ship Classification in Optical Satellite Imagery: Integrating Convolutional Block Attention Module with ResNet for Improved Performance

Authors: Ryan Donghan Kwon, Gangjoo Robin Nam, Jisoo Tak, Junseob Shin, Hyerin Cha, Yeom Hyeok, Seung Won Lee

Abstract: This study presents an advanced Convolutional Neural Network (CNN) architecture for ship classification from optical satellite imagery, significantly enhancing performance through the integration of the Convolutional Block Attention Module (CBAM) and additional architectural innovations. Building upon the foundational ResNet50 model, we first incorporated a standard CBAM to direct the model's focus towards more informative features, achieving an accuracy of 87% compared to the baseline ResNet50's 85%. Further augmentations involved multi-scale feature integration, depthwise separable convolutions, and dilated convolutions, culminating in the Enhanced ResNet Model with Improved CBAM. This model demonstrated a remarkable accuracy of 95%, with precision, recall, and f1-scores all witnessing substantial improvements across various ship classes. The bulk carrier and oil tanker classes, in particular, showcased nearly perfect precision and recall rates, underscoring the model's enhanced capability in accurately identifying and classifying ships. Attention heatmap analyses further validated the improved model's efficacy, revealing a more focused attention on relevant ship features, regardless of background complexities. These findings underscore the potential of integrating attention mechanisms and architectural innovations in CNNs for high-resolution satellite imagery classification.
The study navigates through the challenges of class imbalance and computational costs, proposing future directions towards scalability and adaptability in new or rare ship type recognition. This research lays a groundwork for the application of advanced deep learning techniques in the domain of remote sensing, offering insights into scalable and efficient satellite image classification.

Submitted 8 April, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

arXiv:2403.07355 [pdf, ps, other]

Vector Quantization for Deep-Learning-Based CSI Feedback in Massive MIMO Systems

Authors: Junyong Shin, Yu** Kang, Yo-Seb Jeon

Abstract: This paper presents a finite-rate deep-learning (DL)-based channel state information (CSI) feedback method for massive multiple-input multiple-output (MIMO) systems. The presented method provides a finite-bit representation of the latent vector based on a vector-quantized variational autoencoder (VQ-VAE) framework while reducing its computational complexity based on shape-gain vector quantization. In this method, the magnitude of the latent vector is quantized using a non-uniform scalar codebook with a proper transformation function, while the direction of the latent vector is quantized using a trainable Grassmannian codebook. A multi-rate codebook design strategy is also developed by introducing a codeword selection rule for a nested codebook along with the design of a loss function. Simulation results demonstrate that the proposed method reduces the computational complexity associated with VQ-VAE while improving CSI reconstruction performance under a given feedback overhead.

Submitted 12 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.
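
The shape-gain idea mentioned in the abstract above (quantize the magnitude and the direction of a vector separately) can be illustrated with a minimal sketch. This is a generic textbook-style version under stated assumptions, not the authors' method: the codebooks below are fixed, hand-picked values rather than trained ones, the direction is matched by plain inner product, and the paper's transformation function, Grassmannian training, and nested multi-rate design are not reproduced. All names are hypothetical.

```python
import math


def shape_gain_quantize(z, gain_codebook, shape_codebook):
    """Quantize vector z as (gain index, shape index).

    gain_codebook: list of non-negative scalars (assumed non-uniform levels).
    shape_codebook: list of unit-norm vectors with the same length as z.
    """
    gain = math.sqrt(sum(v * v for v in z))
    # Magnitude: nearest scalar level.
    gain_idx = min(range(len(gain_codebook)),
                   key=lambda i: abs(gain_codebook[i] - gain))
    # Direction: unit-norm codeword with the largest inner product.
    # Using z instead of z / gain gives the same argmax when gain > 0.
    shape_idx = max(range(len(shape_codebook)),
                    key=lambda j: sum(a * b for a, b in zip(z, shape_codebook[j])))
    return gain_idx, shape_idx


def shape_gain_reconstruct(gain_idx, shape_idx, gain_codebook, shape_codebook):
    """Rebuild the vector from the two transmitted indices."""
    g = gain_codebook[gain_idx]
    return [g * c for c in shape_codebook[shape_idx]]


# Illustrative codebooks (assumptions, not from the paper).
GAINS = [1.0, 2.0, 4.0, 8.0]
SHAPES = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [-1.0, 0.0]]

indices = shape_gain_quantize([3.0, 4.0], GAINS, SHAPES)
z_hat = shape_gain_reconstruct(*indices, GAINS, SHAPES)
feedback_bits = math.log2(len(GAINS)) + math.log2(len(SHAPES))
print(indices, z_hat, feedback_bits)
```

The point of the split is cost: searching a 4-level scalar codebook plus a 4-entry direction codebook takes 8 comparisons, whereas a joint codebook with the same 4-bit resolution would need 16 full vector comparisons.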

arXiv:2402.00671 [pdf, other]

Uncertainty-Aware Guidance for Target Tracking subject to Intermittent Measurements using Motion Model Learning

Authors: Andres Pulido, Kyle Volle, Kristy Waters, Zachary I. Bell, Prashant Ganesh, Jane Shin

Abstract: This letter presents a novel guidance law for target tracking applications where the target motion model is unknown and sensor measurements are intermittent due to unknown environmental conditions and low measurement update rate. In this work, the target motion model is represented by a transformer-based neural network and trained by previous target position measurements. This neural network (NN)-based motion model serves as the prediction step in a particle filter for target state estimation and uncertainty quantification. Then this estimation uncertainty is utilized in the information-driven guidance law to compute a path for the mobile agent to travel to a position with maximum expected entropy reduction (EER). The computation of EER is performed in real-time by approximating the probability distribution of the state using the particle representation from particle filter. Simulation and hardware experiments are performed with a quadcopter agent and TurtleBot target to demonstrate that the presented guidance law outperforms two other baseline guidance methods.

Submitted 1 February, 2024; originally announced February 2024.

arXiv:2312.05465 [pdf, other]

On Task-Relevant Loss Functions in Meta-Reinforcement Learning and Online LQR

Authors: Jaeuk Shin, Giho Kim, Howon Lee, Joonho Han, Insoon Yang

Abstract: Designing a competent meta-reinforcement learning (meta-RL) algorithm in terms of data usage remains a central challenge to be tackled for its successful real-world applications. In this paper, we propose a sample-efficient meta-RL algorithm that learns a model of the system or environment at hand in a task-directed manner. As opposed to the standard model-based approaches to meta-RL, our method exploits the value information in order to rapidly capture the decision-critical part of the environment. The key component of our method is the loss function for learning the task inference module and the system model that systematically couples the model discrepancy and the value estimate, thereby facilitating the learning of the policy and the task inference module with a significantly smaller amount of data compared to the existing meta-RL algorithms. The idea is also extended to a non-meta-RL setting, namely an online linear quadratic regulator (LQR) problem, where our method can be simplified to reveal the essence of the strategy. The proposed method is evaluated in high-dimensional robotic control and online LQR problems, empirically verifying its effectiveness in extracting information indispensable for solving the tasks from observations in a sample efficient manner.

Submitted 8 December, 2023; originally announced December 2023.

arXiv:2311.10306 [pdf, other]

MPSeg : Multi-Phase strategy for coronary artery Segmentation

Authors: Jonghoe Ku, Yong-Hee Lee, Junsup Shin, In Kyu Lee, Hyun-Woo Kim

Abstract: Accurate segmentation of coronary arteries is a pivotal process in assessing cardiovascular diseases. However, the intricate structure of the cardiovascular system presents significant challenges for automatic segmentation, especially when utilizing methodologies like the SYNTAX Score, which relies extensively on detailed structural information for precise risk stratification. To address these difficulties and cater to this need, we present MPSeg, an innovative multi-phase strategy designed for coronary artery segmentation. Our approach specifically accommodates these structural complexities and adheres to the principles of the SYNTAX Score. Initially, our method segregates vessels into two categories based on their unique morphological characteristics: Left Coronary Artery (LCA) and Right Coronary Artery (RCA). Specialized ensemble models are then deployed for each category to execute the challenging segmentation task. Due to LCA's higher complexity over RCA, a refinement model is utilized to scrutinize and correct initial class predictions on segmented areas. Notably, our approach demonstrated exceptional effectiveness when evaluated in the Automatic Region-based Coronary Artery Disease diagnostics using x-ray angiography imagEs (ARCADE) Segmentation Detection Algorithm challenge at MICCAI 2023.

Submitted 16 November, 2023; originally announced November 2023.

Comments: MICCAI 2023 Conference ARCADE Challenge

arXiv:2303.13110 [pdf, other]

OCELOT: Overlapped Cell on Tissue Dataset for Histopathology

Authors: Jeongun Ryu, Aaron Valero Puche, JaeWoong Shin, Seonwook Park, Biagio Brattoli, **hee Lee, Wonkyung Jung, Soo Ick Cho, Kyunghyun Paeng, Chan-Young Ock, Donggeun Yoo, Sérgio Pereira

Abstract: Cell detection is a fundamental task in computational pathology that can be used for extracting high-level medical information from whole-slide images. For accurate cell detection, pathologists often zoom out to understand the tissue-level structures and zoom in to classify cells based on their morphology and the surrounding context. However, there is a lack of efforts to reflect such behaviors by pathologists in the cell detection models, mainly due to the lack of datasets containing both cell and tissue annotations with overlapping regions. To overcome this limitation, we propose and publicly release OCELOT, a dataset purposely dedicated to the study of cell-tissue relationships for cell detection in histopathology. OCELOT provides overlapping cell and tissue annotations on images acquired from multiple organs. Within this setting, we also propose multi-task learning approaches that benefit from learning both cell and tissue tasks simultaneously. When compared against a model trained only for the cell detection task, our proposed approaches improve cell detection performance on 3 datasets: proposed OCELOT, public TIGER, and internal CARP datasets. On the OCELOT test set in particular, we show up to 6.79 improvement in F1-score. We believe the contributions of this paper, including the release of the OCELOT dataset at https://lunit-io.github.io/research/publications/ocelot, are a crucial starting point toward the important research direction of incorporating cell-tissue relationships in computational pathology.

Submitted 23 March, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

Comments: Accepted for publication at CVPR'23

arXiv:2302.05290 [pdf, other]

Removing Structured Noise with Diffusion Models

Authors: Tristan S. W. Stevens, Hans van Gorp, Faik C. Meral, Junseob Shin, Jason Yu, Jean-Luc Robert, Ruud J. G. van Sloun

Abstract: Solving ill-posed inverse problems requires careful formulation of prior beliefs over the signals of interest and an accurate description of their manifestation into noisy measurements. Handcrafted signal priors based on e.g. sparsity are increasingly replaced by data-driven deep generative models, and several groups have recently shown that state-of-the-art score-based diffusion models yield particularly strong performance and flexibility. In this paper, we show that the powerful paradigm of posterior sampling with diffusion models can be extended to include rich, structured, noise models. To that end, we propose a joint conditional reverse diffusion process with learned scores for the noise and signal-generating distribution. We demonstrate strong performance gains across various inverse problems with structured noise, outperforming competitive baselines that use normalizing flows and adversarial networks. This opens up new opportunities and relevant practical applications of diffusion modeling for inverse problems in the context of non-Gaussian measurement models.

Submitted 17 October, 2023; v1 submitted 20 January, 2023; originally announced February 2023.

Comments: 11 pages, 7 figures, preprint

arXiv:2211.15950 [pdf, other]

Enhanced artificial intelligence-based diagnosis using CBCT with internal denoising: Clinical validation for discrimination of fungal ball, sinusitis, and normal cases in the maxillary sinus

Authors: Kyungsu Kim, Chae Yeon Lim, Joong Bo Shin, Myung ** Chung, Yong Gi Jung

Abstract: The cone-beam computed tomography (CBCT) provides 3D volumetric imaging of a target with low radiation dose and cost compared with conventional computed tomography, and it is widely used in the detection of paranasal sinus disease. However, it lacks the sensitivity to detect soft tissue lesions owing to reconstruction constraints. Consequently, only physicians with expertise in CBCT reading can distinguish between inherent artifacts or noise and diseases, restricting the use of this imaging modality. The development of artificial intelligence (AI)-based computer-aided diagnosis methods for CBCT to overcome the shortage of experienced physicians has attracted substantial attention. However, advanced AI-based diagnosis addressing intrinsic noise in CBCT has not been devised, discouraging the practical use of AI solutions for CBCT. To address this issue, we propose an AI-based computer-aided diagnosis method using CBCT with a denoising module. This module is implemented before diagnosis to reconstruct the internal ground-truth full-dose scan corresponding to an input CBCT image and thereby improve the diagnostic performance.
The external validation results for the unified diagnosis of sinus fungal ball, chronic rhinosinusitis, and normal cases show that the proposed method improves the micro-, macro-average AUC, and accuracy by 7.4, 5.6, and 9.6% (from 86.2, 87.0, and 73.4 to 93.6, 92.6, and 83.0%), respectively, compared with a baseline while improving human diagnosis accuracy by 11% (from 71.7 to 83.0%), demonstrating technical differentiation and clinical effectiveness. This pioneering study on AI-based diagnosis using CBCT indicates denoising can improve diagnostic performance and reader interpretability in images from the sinonasal area, thereby providing a new approach and direction to radiographic image reconstruction regarding the development of AI-based diagnostic solutions.

Submitted 29 November, 2022; originally announced November 2022.

arXiv:2211.14998 [pdf, ps, other]

Anderson Acceleration for Partially Observable Markov Decision Processes: A Maximum Entropy Approach

Authors: Mingyu Park, Jaeuk Shin, Insoon Yang

Abstract: Partially observable Markov decision processes (POMDPs) are a rich mathematical framework that embraces a large class of complex sequential decision-making problems under uncertainty with limited observations. However, the complexity of POMDPs poses various computational challenges, motivating the need for an efficient algorithm that rapidly finds a good enough suboptimal solution. In this paper, we propose a novel accelerated offline POMDP algorithm exploiting Anderson acceleration (AA) that is capable of efficiently solving fixed-point problems using previous solution estimates. Our algorithm is based on the Q-function approximation (QMDP) method to alleviate the scalability issue inherent in POMDPs. Inspired by the quasi-Newton interpretation of AA, we propose a maximum entropy variant of QMDP, which we call soft QMDP, to fully benefit from AA. We prove that the overall algorithm converges to the suboptimal solution obtained by soft QMDP. Our algorithm can also be implemented in a model-free manner using simulation data. Provable error bounds on the residual and the solution are provided to examine how the simulation errors are propagated through the proposed algorithm. Finally, the performance of our algorithm is tested on several benchmark problems. According to the results of our experiments, the proposed algorithm converges significantly faster without degrading the solution quality compared to its standard counterparts.

Submitted 27 November, 2022; originally announced November 2022.
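
To make the abstract above more concrete, here is a minimal sketch of Anderson acceleration with memory depth 1 applied to a maximum-entropy (log-sum-exp) Bellman fixed point. It is not the authors' algorithm: the toy 2-state, 2-action MDP, the temperature, the memory depth, and the residual-based safeguard are all assumptions made for illustration, and the POMDP belief-space part of QMDP is omitted entirely.

```python
import math

GAMMA = 0.9   # discount factor (assumed)
TAU = 0.5     # entropy temperature (assumed)
R = [[1.0, 0.0], [0.0, 2.0]]                       # R[s][a], toy rewards
P = [[[0.8, 0.2], [0.1, 0.9]],                     # P[s][a][s'], toy transitions
     [[0.5, 0.5], [0.3, 0.7]]]


def soft_value(q_row):
    """Numerically stable tau * log(sum(exp(q / tau))) over actions."""
    m = max(q_row)
    return m + TAU * math.log(sum(math.exp((q - m) / TAU) for q in q_row))


def bellman(q):
    """Soft Bellman operator on a flat Q vector indexed by 2 * s + a."""
    v = [soft_value(q[2 * s:2 * s + 2]) for s in range(2)]
    return [R[s][a] + GAMMA * sum(P[s][a][t] * v[t] for t in range(2))
            for s in range(2) for a in range(2)]


def sup_residual(q):
    """Sup-norm fixed-point residual ||T(q) - q||."""
    return max(abs(g - x) for g, x in zip(bellman(q), q))


def solve(accelerate, tol=1e-10, max_iter=10000):
    """Return (Q, iterations). With accelerate=False this is plain iteration."""
    x, x_prev, g_prev = [0.0] * 4, None, None
    for k in range(max_iter):
        g = bellman(x)
        f = [gi - xi for gi, xi in zip(g, x)]
        if max(abs(v) for v in f) < tol:
            return x, k
        nxt = g
        if accelerate and x_prev is not None:
            f_prev = [gi - xi for gi, xi in zip(g_prev, x_prev)]
            df = [a - b for a, b in zip(f, f_prev)]
            denom = sum(d * d for d in df)
            if denom > 0.0:
                # alpha minimizes ||(1 - alpha) * f + alpha * f_prev||_2.
                alpha = sum(a * d for a, d in zip(f, df)) / denom
                cand = [(1 - alpha) * gi + alpha * gp for gi, gp in zip(g, g_prev)]
                # Safeguard: keep the accelerated point only if it is at least
                # as good as the plain step (costs extra operator evaluations
                # that are not counted in the returned iteration number).
                if sup_residual(cand) <= sup_residual(g):
                    nxt = cand
        x_prev, g_prev, x = x, g, nxt
    raise RuntimeError("did not converge")


q_plain, n_plain = solve(accelerate=False)
q_aa, n_aa = solve(accelerate=True)
print("plain iterations:", n_plain, "accelerated iterations:", n_aa)
```

Because the soft Bellman operator is a sup-norm contraction with factor GAMMA, the safeguard keeps the residual shrinking by at least that factor per iteration, so both variants reach the same fixed point; how much the accelerated run saves depends on the problem.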

arXiv:2211.09988 [pdf, ps, other]

Exploring WavLM on Speech Enhancement

Authors: Hyungchan Song, Sanyuan Chen, Zhuo Chen, Yu Wu, Takuya Yoshioka, Min Tang, Jong Won Shin, Shujie Liu

Abstract: There has been a surge of interest in self-supervised learning approaches for end-to-end speech encoding in recent years, as they have achieved great success. In particular, WavLM showed state-of-the-art performance on various speech processing tasks. To better understand the efficacy of self-supervised learning models for speech enhancement, in this work, we design and conduct a series of experiments with three resource conditions by combining WavLM and two high-quality speech enhancement systems. Also, we propose a regression-based WavLM training objective and a noise-mixing data configuration to further boost the downstream enhancement performance. The experiments on the DNS challenge dataset and a simulation dataset show that the WavLM benefits the speech enhancement task in terms of both speech quality and speech recognition accuracy, especially for low fine-tuning resources. For the high fine-tuning resource condition, only the word error rate is substantially improved.

Submitted 17 November, 2022; originally announced November 2022.

Comments: Accepted by IEEE SLT 2022

arXiv:2210.10267 [pdf, other]

Synthetic Sonar Image Simulation with Various Seabed Conditions for Automatic Target Recognition

Authors: Jaejeong Shin, Shi Chang, Matthew Bays, Joshua Weaver, Tom Wettergren, Silvia Ferrari

Abstract: We propose a novel method to generate underwater object imagery that is acoustically compliant with that generated by side-scan sonar using the Unreal Engine. We describe the process to develop, tune, and generate imagery to provide representative images for use in training automated target recognition (ATR) and machine learning algorithms. The methods provide visual approximations for acoustic effects such as back-scatter noise and acoustic shadow, while allowing fast rendering with C++ actor in UE for maximizing the size of potential ATR training datasets. Additionally, we provide analysis of its utility as a replacement for actual sonar imagery or physics-based sonar data.

Submitted 18 October, 2022; originally announced October 2022.

Comments: Submitted to OCEANS 2022

arXiv:2210.10263 [pdf, other]

Time and Cost-Efficient Bathymetric Mapping System using Sparse Point Cloud Generation and Automatic Object Detection

Authors: Andres Pulido, Ruoyao Qin, Antonio Diaz, Andrew Ortega, Peter Ifju, Jaejeong Shin

Abstract: Generating 3D point cloud (PC) data from noisy sonar measurements is a problem that has potential applications for bathymetry mapping, artificial object inspection, mapping of aquatic plants and fauna as well as underwater navigation and localization of vehicles such as submarines. Side-scan sonar sensors are available in inexpensive cost ranges, especially in fish-finders, where the transducers are usually mounted to the bottom of a boat and can approach shallower depths than the ones attached to an Uncrewed Underwater Vehicle (UUV) can. However, extracting 3D information from side-scan sonar imagery is a difficult task because of its low signal-to-noise ratio and missing angle and depth information in the imagery. Since most algorithms that generate a 3D point cloud from side-scan sonar imagery use Shape from Shading (SFS) techniques, extracting 3D information is especially difficult when the seafloor is smooth, is slowly changing in depth, or does not have identifiable objects that make acoustic shadows. This paper introduces an efficient algorithm that generates a sparse 3D point cloud from side-scan sonar images. This computation is done in a computationally efficient manner by leveraging the geometry of the first sonar return combined with known positions provided by GPS and down-scan sonar depth measurement at each data point.
Additionally, this paper implements another algorithm that uses a Convolutional Neural Network (CNN) using transfer learning to perform object detection on side-scan sonar images collected in real life and generated with a simulation. The algorithm was tested on both real and synthetic images to show reasonably accurate anomaly detection and classification.

Submitted 18 October, 2022; originally announced October 2022.

Comments: Submitted to OCEANS 2022

arXiv:2209.13646 [pdf]

Development of AI-cloud based high-sensitivity wireless smart sensor for port structure monitoring

Authors: Junsik Shin, Junyoung Park, Jongwoong Park

Abstract: Regular structural monitoring of port structures is crucial to cope with rapid degeneration owing to their exposure to a saline and collisional environment. However, most inspections are done visually by humans on an irregular basis. To overcome this difficulty, many vibration-based monitoring systems with sensors have been devised. Nonetheless, it has been difficult to measure ambient vibration because of the port's diminutive amplitude, and to specify the exact timing of berthing, which is the major excitation source. This study developed a novel cloud-AI based wireless sensor system with the high-sensitivity accelerometer M-A352, which has a noise density of 0.2uG/sqrt(Hz), an ultra-low noise of 0.003mg, and a sampling frequency of 1000Hz. The sensor is triggered by either a predefined schedule or a long rangefinder. After that, ship detection is performed by the AI object detection technique Faster R-CNN, with a ResNet backbone network for the convolution part. The coordinates and size of the detected anchor box are further processed to certify the berthing ship. Collected data are automatically sent to the cloud server through an LTE CAT 1 modem within 10Mbps. The system was installed in an actual port in Korea for a few days as a preliminary investigation of the proposed system. Additionally, acceleration, slope, and temperature data are analyzed to suggest the possibility of vibration-based port condition assessment.

Submitted 24 September, 2022; originally announced September 2022.

arXiv:2208.00988 [pdf, other]

Information-Aware Guidance for Magnetic Anomaly based Navigation

Authors: J. Humberto Ramos, Jaejeong Shin, Kyle Volle, Paul Buzaud, Kevin Brink, Prashant Ganesh

Abstract: In the absence of an absolute positioning system, such as GPS, autonomous vehicles are subject to accumulation of positional error which can interfere with reliable performance. Improved navigational accuracy without GPS enables vehicles to achieve a higher degree of autonomy and reliability, both in terms of decision making and safety. This paper details the use of two navigation systems for autonomous agents using magnetic field anomalies to localize themselves within a map; both techniques use the information content in the environment in distinct ways and are aimed at reducing the localization uncertainty. The first method is based on a nonlinear observability metric of the vehicle model, while the second is an information theory based technique which minimizes the expected entropy of the system. These conditions are used to design guidance laws that minimize the localization uncertainty. The guidance laws are verified in simulation, and hardware experiments are presented for the observability approach.

Submitted 1 August, 2022; originally announced August 2022.

Comments: 2022 International Conference on Intelligent Robots and Systems October 23 to 27, 2022 Kyoto, Japan

arXiv:2206.09479 [pdf, other]

StudioGAN: A Taxonomy and Benchmark of GANs for Image Synthesis

Authors: Minguk Kang, Joonghyuk Shin, Jaesik Park

Abstract: Generative Adversarial Network (GAN) is one of the state-of-the-art generative models for realistic image synthesis. While training and evaluating GAN becomes increasingly important, the current GAN research ecosystem does not provide reliable benchmarks for which the evaluation is conducted consistently and fairly. Furthermore, because there are few validated GAN implementations, researchers devote considerable time to reproducing baselines. We study the taxonomy of GAN approaches and present a new open-source library named StudioGAN. StudioGAN supports 7 GAN architectures, 9 conditioning methods, 4 adversarial losses, 12 regularization modules, 3 differentiable augmentations, 7 evaluation metrics, and 5 evaluation backbones. With our training and evaluation protocol, we present a large-scale benchmark using various datasets (CIFAR10, ImageNet, AFHQv2, FFHQ, and Baby/Papa/Granpa-ImageNet) and 3 different evaluation backbones (InceptionV3, SwAV, and Swin Transformer). Unlike other benchmarks used in the GAN community, we train representative GANs, including BigGAN and StyleGAN series in a unified training pipeline and quantify generation performance with 7 evaluation metrics. The benchmark evaluates other cutting-edge generative models (e.g., StyleGAN-XL, ADM, MaskGIT, and RQ-Transformer). StudioGAN provides GAN implementations, training, and evaluation scripts with the pre-trained weights. StudioGAN is available at https://github.com/POSTECH-CVLab/PyTorch-StudioGAN.

Submitted 18 August, 2023; v1 submitted 19 June, 2022; originally announced June 2022.

Comments: 32 pages, IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, 2023)

arXiv:2204.02405 [pdf, other]

Zero-shot Blind Image Denoising via Implicit Neural Representations

Authors: Chaewon Kim, Jaeho Lee, Jinwoo Shin

Abstract: Recent denoising algorithms based on the "blind-spot" strategy show impressive blind image denoising performances, without utilizing any external dataset. While the methods excel in recovering highly contaminated images, we observe that such algorithms are often less effective under a low-noise or real noise regime. To address this gap, we propose an alternative denoising strategy that leverages the architectural inductive bias of implicit neural representations (INRs), based on our two findings: (1) INR tends to fit the low-frequency clean image signal faster than the high-frequency noise, and (2) INR layers that are closer to the output play more critical roles in fitting higher-frequency parts. Building on these observations, we propose a denoising algorithm that maximizes the innate denoising capability of INRs by penalizing the growth of deeper layer weights. We show that our method outperforms existing zero-shot denoising methods under an extensive set of low-noise or real-noise scenarios.

Submitted 5 April, 2022; originally announced April 2022.

Comments: 8 pages, 3 figures

arXiv:2202.02015 [pdf, other]

Energy-Efficient High-Accuracy Spiking Neural Network Inference Using Time-Domain Neurons

Authors: Joonghyun Song, Jiwon Shin, Hanseok Kim, Woo-Seok Choi

Abstract: Due to the limitations of realizing artificial neural networks on prevalent von Neumann architectures, recent studies have presented neuromorphic systems based on spiking neural networks (SNNs) to reduce power and computational cost. However, conventional analog voltage-domain integrate-and-fire (I&F) neuron circuits, based on either current mirrors or op-amps, pose serious issues such as nonlinearity or high power consumption, thereby degrading either inference accuracy or energy efficiency of the SNN. To achieve excellent energy efficiency and high accuracy simultaneously, this paper presents a low-power highly linear time-domain I&F neuron circuit. Designed and simulated in a 28nm CMOS process, the proposed neuron leads to more than 4.3x lower error rate on the MNIST inference over the conventional current-mirror-based neurons. In addition, the power consumed by the proposed neuron circuit is simulated to be 0.230uW per neuron, which is orders of magnitude lower than the existing voltage-domain neurons.

Submitted 9 April, 2022; v1 submitted 4 February, 2022; originally announced February 2022.

Comments: Accepted in AICAS 2022
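
As an illustration of the integrate-and-fire (I&F) behaviour this abstract refers to, the following is a minimal discrete-time sketch. It is a generic textbook I&F model, not the paper's time-domain circuit; the function name, the threshold value, and the reset-by-subtraction rule are assumptions chosen for illustration.

```python
def integrate_and_fire(inputs, threshold=1.0):
    """Simulate an ideal (non-leaky) integrate-and-fire neuron.

    inputs: per-time-step input values (hypothetical weighted spike sums).
    Returns a list of 0/1 output spikes, one per time step.
    """
    membrane = 0.0
    spikes = []
    for x in inputs:
        membrane += x                # integrate the input
        if membrane >= threshold:    # fire once the threshold is reached
            spikes.append(1)
            membrane -= threshold    # reset by subtraction keeps the residue
        else:
            spikes.append(0)
    return spikes
```

In this idealised model the number of output spikes grows linearly with the accumulated input, which is the kind of linearity the abstract says conventional voltage-domain circuits struggle to preserve.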

arXiv:2109.07120 [pdf, other]

doi 10.1109/LRA.2022.3191234

Infusing model predictive control into meta-reinforcement learning for mobile robots in dynamic environments

Authors: Jaeuk Shin, Astghik Hakobyan, Mingyu Park, Yeoneung Kim, Gihun Kim, Insoon Yang

Abstract: The successful operation of mobile robots requires them to adapt rapidly to environmental changes. To develop an adaptive decision-making tool for mobile robots, we propose a novel algorithm that combines meta-reinforcement learning (meta-RL) with model predictive control (MPC). Our method employs an off-policy meta-RL algorithm as a baseline to train a policy using transition samples generated by MPC when the robot detects certain events that can be effectively handled by MPC, with its explicit use of robot dynamics. The key idea of our method is to switch between the meta-learned policy and the MPC controller in a randomized and event-triggered fashion to make up for suboptimal MPC actions caused by the limited prediction horizon. During meta-testing, the MPC module is deactivated to significantly reduce computation time in motion control. We further propose an online adaptation scheme that enables the robot to infer and adapt to a new task within a single trajectory. The performance of our method has been demonstrated through simulations using a nonlinear car-like vehicle model with (i) synthetic movements of obstacles, and (ii) real-world pedestrian motion data. The simulation results indicate that our method outperforms other algorithms in terms of learning efficiency and navigation quality.

Submitted 7 July, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

Comments: Accepted for publication in the IEEE Robotics and Automation Letters

Journal ref: IEEE Robotics and Automation Letters, 2022
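
The "randomized and event-triggered" switching described in this abstract can be pictured with a short sketch. The abstract does not give the actual switching rule, so everything here is an assumption: the function name, the `event_detected` predicate, and the fixed switching probability `mpc_prob` are hypothetical stand-ins.

```python
import random

def select_action(state, policy, mpc, event_detected, mpc_prob=0.5, rng=random):
    """Pick an action from the MPC controller or the learned policy.

    MPC is used only when an event is detected AND a biased coin flip
    succeeds; otherwise the learned policy acts. All arguments other than
    `state` are hypothetical callables or parameters.
    """
    if event_detected(state) and rng.random() < mpc_prob:
        return mpc(state), "mpc"
    return policy(state), "policy"
```

Setting `mpc_prob` to 0 corresponds to never calling MPC, which matches the abstract's statement that the MPC module is deactivated during meta-testing.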

arXiv:2010.14087 [pdf, other]

Hamilton-Jacobi Deep Q-Learning for Deterministic Continuous-Time Systems with Lipschitz Continuous Controls

Authors: Jeongho Kim, Jaeuk Shin, Insoon Yang

Abstract: In this paper, we propose Q-learning algorithms for continuous-time deterministic optimal control problems with Lipschitz continuous controls. Our method is based on a new class of Hamilton-Jacobi-Bellman (HJB) equations derived from applying the dynamic programming principle to continuous-time Q-functions. A novel semi-discrete version of the HJB equation is proposed to design a Q-learning algorithm that uses data collected in discrete time without discretizing or approximating the system dynamics. We identify the condition under which the Q-function estimated by this algorithm converges to the optimal Q-function. For practical implementation, we propose the Hamilton-Jacobi DQN, which extends the idea of deep Q-networks (DQN) to our continuous control setting. This approach does not require actor networks or numerical solutions to optimization problems for greedy actions since the HJB equation provides a simple characterization of optimal controls via ordinary differential equations. We empirically demonstrate the performance of our method through benchmark tasks and high-dimensional linear-quadratic problems.

Submitted 27 October, 2020; originally announced October 2020.

arXiv:2007.02096 [pdf]

doi 10.1109/TMI.2021.3055428

Multi-Site Infant Brain Segmentation Algorithms: The iSeg-2019 Challenge

Authors: Yue Sun, Kun Gao, Zhengwang Wu, Zhihao Lei, Ying Wei, Jun Ma, ** Yang, Xue Feng, Li Zhao, Trung Le Phan, Jitae Shin, Tao Zhong, Yu Zhang, Lequan Yu, Caizi Li, Ramesh Basnet, M. Omair Ahmad, M. N. S. Swamy, Wenao Ma, Qi Dou, Toan Duc Bui, Camilo Bermudez Noguera, Bennett Landman, Ian H. Gotlib, Kathryn L. Humphreys, et al. (8 additional authors not shown)

Abstract: To better understand early brain growth patterns in health and disorder, it is critical to accurately segment infant brain magnetic resonance (MR) images into white matter (WM), gray matter (GM), and cerebrospinal fluid (CSF). Deep learning-based methods have achieved state-of-the-art performance; however, one of major limitations is that the learning-based methods may suffer from the multi-site issue, that is, the models trained on a dataset from one site may not be applicable to the datasets acquired from other sites with different imaging protocols/scanners. To promote methodological development in the community, iSeg-2019 challenge (http://iseg2019.web.unc.edu) provides a set of 6-month infant subjects from multiple sites with different protocols/scanners for the participating methods. Training/validation subjects are from UNC (MAP) and testing subjects are from UNC/UMN (BCP), Stanford University, and Emory University. By the time of writing, there are 30 automatic segmentation methods participating in iSeg-2019. We review the 8 top-ranked teams by detailing their pipelines/implementations, presenting experimental results and evaluating performance in terms of the whole brain, regions of interest, and gyral landmark curves. We also discuss their limitations and possible future directions for the multi-site issue. We hope that the multi-site dataset in iSeg-2019 and this review article will attract more researchers on the multi-site issue.

Submitted 11 July, 2020; v1 submitted 4 July, 2020; originally announced July 2020.

Journal ref: IEEE Transactions on Medical Imaging, 40(5), 1363-1376, 2021

arXiv:2004.14146 [pdf, other]

White Paper on Critical and Massive Machine Type Communication Towards 6G

Authors: Nurul Huda Mahmood, Stefan Böcker, Andrea Munari, Federico Clazzer, Ingrid Moerman, Konstantin Mikhaylov, Onel Lopez, Ok-Sun Park, Eric Mercier, Hannes Bartz, Riku Jäntti, Ravikumar Pragada, Yihua Ma, Elina Annanperä, Christian Wietfeld, Martin Andraud, Gianluigi Liva, Yan Chen, Eduardo Garro, Frank Burkhardt, Hirley Alves, Chen-Feng Liu, Yalcin Sadi, Jean-Baptiste Dore, Eunah Kim , et al. (6 additional authors not shown)

Abstract: The society as a whole, and many vertical sectors in particular, is becoming increasingly digitalized. Machine Type Communication (MTC), encompassing its massive and critical aspects, and ubiquitous wireless connectivity are among the main enablers of such digitization at large. The recently introduced 5G New Radio is natively designed to support both aspects of MTC to promote the digital transformation of the society. However, it is evident that some of the more demanding requirements cannot be fully supported by 5G networks. Alongside, further development of the society towards 2030 will give rise to new and more stringent requirements on wireless connectivity in general, and MTC in particular. Driven by the societal trends towards 2030, the next generation (6G) will be an agile and efficient convergent network serving a set of diverse service classes and a wide range of key performance indicators (KPI). This white paper explores the main drivers and requirements of an MTC-optimized 6G network, and discusses the following six key research questions:
- Will the main KPIs of 5G continue to be the dominant KPIs in 6G; or will there emerge new key metrics?
- How to deliver different E2E service mandates with different KPI requirements considering joint-optimization at the physical up to the application layer?
- What are the key enablers towards designing ultra-low power receivers and highly efficient sleep modes?
- How to tackle a disruptive rather than incremental joint design of a massively scalable waveform and medium access policy for global MTC connectivity?
- How to support new service classes characterizing mission-critical and dependable MTC in 6G?
- What are the potential enablers of long term, lightweight and flexible privacy and security schemes considering MTC device requirements?

Submitted 4 May, 2020; v1 submitted 29 April, 2020; originally announced April 2020.

Comments: White paper by http://www.6GFlagship.com

arXiv:1905.06655 [pdf, other]

Effective Sentence Scoring Method using Bidirectional Language Model for Speech Recognition

Authors: Joongbo Shin, Yoonhyung Lee, Kyomin Jung

Abstract: In automatic speech recognition, many studies have shown performance improvements using language models (LMs). Recent studies have tried to use bidirectional LMs (biLMs) instead of conventional unidirectional LMs (uniLMs) for rescoring the $N$-best list decoded from the acoustic model. In spite of their theoretical benefits, the biLMs have not given notable improvements compared to the uniLMs in their experiments. This is because their biLMs do not consider the interaction between the two directions. In this paper, we propose a novel sentence scoring method considering the interaction between the past and the future words on the biLM. Our experimental results on the LibriSpeech corpus show that the biLM with the proposed sentence scoring outperforms the uniLM for the $N$-best list rescoring, consistently and significantly in all experimental conditions. The analysis of WERs by word position demonstrates that the biLM is more robust than the uniLM especially when a recognized sentence is short or a misrecognized word is at the beginning of the sentence.

Submitted 16 May, 2019; originally announced May 2019.

Comments: submitted to INTERSPEECH 2019, 5 pages
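
To make the $N$-best rescoring setting in this abstract concrete, here is a minimal sketch. The abstract does not spell out the proposed scoring formula, so the rule used below (summing per-word log-probabilities, each conditioned on the words on both sides, and interpolating with the acoustic score) is an assumption; `rescore_nbest`, `word_log_prob`, and `lm_weight` are hypothetical names.

```python
def rescore_nbest(hypotheses, word_log_prob, lm_weight=0.5):
    """Return the hypotheses' token lists sorted by combined score, best first.

    hypotheses: list of (tokens, acoustic_log_score) pairs.
    word_log_prob: hypothetical callable (tokens, i) -> log P(tokens[i] | other tokens),
        standing in for a bidirectional language model.
    """
    scored = []
    for tokens, acoustic in hypotheses:
        # Sentence score: sum of per-word log-probabilities under the LM.
        lm_score = sum(word_log_prob(tokens, i) for i in range(len(tokens)))
        scored.append((acoustic + lm_weight * lm_score, tokens))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tokens for _, tokens in scored]
```

With a toy `word_log_prob` that penalises out-of-vocabulary words, a hypothesis with a slightly worse acoustic score can overtake one containing a misrecognized word, which is the effect rescoring is meant to achieve.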

arXiv:1809.04972 [pdf, other]

Simulation-based Distributed Coordination Maximization over Networks

Authors: Hyeryung Jang, Jinwoo Shin, Yung Yi

Abstract: In various online/offline multi-agent networked environments, it is very popular that the system can benefit from coordinating actions of two interacting agents at some cost of coordination. In this paper, we first formulate an optimization problem that captures the amount of coordination gain at the cost of node activation over networks. This problem is challenging to solve in a distributed manner, since the target gain is a function of the long-term time portion of the inter-coupled activations of two adjacent nodes, and thus a standard Lagrange duality theory is hard to apply to obtain a distributed decomposition as in the standard Network Utility Maximization. In this paper, we propose three simulation-based distributed algorithms, each having different update rules, all of which require only one-hop message passing and locally-observed information. The key idea for being distributedness is due to a stochastic approximation method that runs a Markov chain simulation incompletely over time, but provably guarantees its convergence to the optimal solution. Next, we provide a game-theoretic framework to interpret our proposed algorithms from a different perspective. We artificially select the payoff function, where the game's Nash equilibrium is asymptotically equal to the socially optimal point, i.e., no Price-of-Anarchy. We show that two stochastically-approximated variants of standard game-learning dynamics overlap with two algorithms developed from the optimization perspective. Finally, we demonstrate our theoretical findings on convergence, optimality, and further features such as a trade-off between efficiency and convergence speed through extensive simulations.

Submitted 13 September, 2018; originally announced September 2018.

Comments: 34 pages, 4 figures. A shorter version of this paper appeared in Proceedings of ACM Mobile Ad Hoc Networking and Computing (MOBIHOC), 2016. To appear at IEEE Transactions on Control of Network Systems, 2018

arXiv:1807.10752 [pdf, other]

Dictionary Learning in Fourier Transform Scanning Tunneling Spectroscopy

Authors: Sky C. Cheung, John Y. Shin, Yenson Lau, Zhengyu Chen, Ju Sun, Yuqian Zhang, John N. Wright, Abhay N. Pasupathy

Abstract: Modern high-resolution microscopes, such as the scanning tunneling microscope, are commonly used to study specimens that have dense and aperiodic spatial structure. Extracting meaningful information from images obtained from such microscopes remains a formidable challenge. Fourier analysis is commonly used to analyze the underlying structure of fundamental motifs present in an image. However, the Fourier transform fundamentally suffers from severe phase noise when applied to aperiodic images. Here, we report the development of a new algorithm based on nonconvex optimization, applicable to any microscopy modality, that directly uncovers the fundamental motifs present in a real-space image. Apart from being quantitatively superior to traditional Fourier analysis, we show that this novel algorithm also uncovers phase sensitive information about the underlying motif structure. We demonstrate its usefulness by studying scanning tunneling microscopy images of a Co-doped iron arsenide superconductor and prove that the application of the algorithm allows for the complete recovery of quasiparticle interference in this material. Our phase sensitive quasiparticle interference imaging results indicate that the pairing symmetry in optimally doped NaFeAs is consistent with a sign-changing s+- order parameter.

Submitted 19 July, 2018; originally announced July 2018.

arXiv:1801.04081 [pdf, other]

Separation of Instrument Sounds using Non-negative Matrix Factorization with Spectral Envelope Constraints

Authors: Jeongsoo Park, Jaeyoung Shin, Kyogu Lee

Abstract: Spectral envelope is one of the most important features that characterize the timbre of an instrument sound. However, it is difficult to use spectral information in the framework of conventional spectrogram decomposition methods. We overcome this problem by suggesting a simple way to provide a constraint on the spectral envelope calculated by linear prediction. In the first part of this study, we use a pre-trained spectral envelope of known instruments as the constraint. Then we apply the same idea to a blind scenario in which the instruments are unknown. The experimental results reveal that the proposed method outperforms the conventional methods.

Submitted 12 January, 2018; originally announced January 2018.

Showing 1–27 of 27 results for author: Shin, J