Showing 1–50 of 97 results for author: Lee, B

Searching in archive eess.
  1. arXiv:2406.18951  [pdf, ps, other]

    eess.SP

    Constant Modulus Waveform Design with Interference Exploitation for DFRC Systems: A Block-Level Optimization Approach

    Authors: Byunghyun Lee, Anindya Bijoy Das, David Love, Christopher Brinton, James Krogmeier

    Abstract: Dual-function radar-communication (DFRC) is a key enabler of location-based services for next-generation communication systems. In this paper, we investigate the problem of designing constant modulus waveforms for DFRC systems. For high-precision radar sensing, we consider joint optimization of the correlation properties and spatial beam pattern. For communication, we employ constructive interfere… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2310.10804

  2. arXiv:2406.02562  [pdf, other]

    eess.AS cs.AI cs.CL

    Gated Low-rank Adaptation for personalized Code-Switching Automatic Speech Recognition on the low-spec devices

    Authors: Gwantae Kim, Bokyeung Lee, Donghyeon Kim, Hanseok Ko

    Abstract: In recent times, there has been a growing interest in utilizing personalized large models on low-spec devices, such as mobile and CPU-only devices. However, utilizing a personalized large model in the on-device is inefficient, and sometimes limited due to computational cost. To tackle the problem, this paper presents the weights separation method to minimize on-device model weights using parameter… ▽ More

    Submitted 23 April, 2024; originally announced June 2024.

    Comments: Table 2 is revised

    Journal ref: ICASSP 2024 Workshop(HSCMA 2024) paper

  3. arXiv:2406.01570  [pdf, ps, other]

    cs.LG eess.SY stat.ML

    Single Trajectory Conformal Prediction

    Authors: Brian Lee, Nikolai Matni

    Abstract: We study the performance of risk-controlling prediction sets (RCPS), an empirical risk minimization-based formulation of conformal prediction, with a single trajectory of temporally correlated data from an unknown stochastic dynamical system. First, we use the blocking technique to show that RCPS attains performance guarantees similar to those enjoyed in the iid setting whenever data is generated… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 16 pages

  4. arXiv:2404.09030  [pdf, other]

    eess.SY cs.LG

    Active Learning for Control-Oriented Identification of Nonlinear Systems

    Authors: Bruce D. Lee, Ingvar Ziemann, George J. Pappas, Nikolai Matni

    Abstract: Model-based reinforcement learning is an effective approach for controlling an unknown system. It is based on a longstanding pipeline familiar to the control community in which one performs experiments on the environment to collect a dataset, uses the resulting dataset to identify a model of the system, and finally performs control synthesis using the identified model. As interacting with the syst… ▽ More

    Submitted 13 April, 2024; originally announced April 2024.

  5. arXiv:2404.01636  [pdf, other]

    cs.CV cs.AI cs.LG cs.RO eess.SY

    Learning to Control Camera Exposure via Reinforcement Learning

    Authors: Kyunghyun Lee, Ukcheol Shin, Byeong-Uk Lee

    Abstract: Adjusting camera exposure in arbitrary lighting conditions is the first step to ensure the functionality of computer vision applications. Poorly adjusted camera exposure often leads to critical failure and performance degradation. Traditional camera exposure control methods require multiple convergence steps and time-consuming processes, making them unsuitable for dynamic lighting conditions. In t… ▽ More

    Submitted 2 April, 2024; originally announced April 2024.

    Comments: Accepted at CVPR 2024, *First two authors contributed equally to this work. Project page link: https://sites.google.com/view/drl-ae

  6. arXiv:2403.18707  [pdf, other]

    math.OC eess.SY

    Connections between Reachability and Time Optimality

    Authors: Juho Bae, Ji Hoon Bai, Byung-Yoon Lee, Jun-Yong Lee, Chang-Hun Lee

    Abstract: This paper presents the concept of an equivalence relation between the set of optimal control problems. By leveraging this concept, we show that the boundary of the reachability set can be constructed by the solutions of time optimal problems. Alongside, a more generalized equivalence theorem is presented together. The findings facilitate the use of solution structures from a certain class of opti… ▽ More

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: Submitted to Automatica

  7. arXiv:2403.15692  [pdf, other]

    cs.IT eess.SP

    Block Orthogonal Sparse Superposition Codes for $ \sf{L}^3 $ Communications: Low Error Rate, Low Latency, and Low Power Consumption

    Authors: Donghwa Han, Bowhyung Lee, Min Jang, Donghun Lee, Seho Myung, Namyoon Lee

    Abstract: Block orthogonal sparse superposition (BOSS) code is a class of joint coded modulation methods, which can closely achieve the finite-blocklength capacity with a low-complexity decoder at a few coding rates under Gaussian channels. However, for fading channels, the code performance degrades considerably because coded symbols experience different channel fading effects. In this paper, we put forth n… ▽ More

    Submitted 22 March, 2024; originally announced March 2024.

  8. arXiv:2402.01969  [pdf, other]

    cs.LG eess.SP

    Simulation-Enhanced Data Augmentation for Machine Learning Pathloss Prediction

    Authors: Ahmed P. Mohamed, Byunghyun Lee, Yaguang Zhang, Max Hollingsworth, C. Robert Anderson, James V. Krogmeier, David J. Love

    Abstract: Machine learning (ML) offers a promising solution to pathloss prediction. However, its effectiveness can be degraded by the limited availability of data. To alleviate these challenges, this paper introduces a novel simulation-enhanced data augmentation method for ML pathloss prediction. Our method integrates synthetic data generated from a cellular coverage simulator and independently collected re… ▽ More

    Submitted 5 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

    Comments: 6 pages, 5 figures, Accepted at ICC 2024

  9. arXiv:2401.14304  [pdf, other]

    eess.SY

    Constraint-Aware Mesh Refinement Method by Reachability Set Envelope of Curvature Bounded Paths

    Authors: Juho Bae, Ji Hoon Bai, Byung-Yoon Lee, Jun-Yong Lee

    Abstract: This paper presents an enhanced direct-method-based approach for the real-time solution of optimal control problems to handle path constraints, such as obstacles. The principal contributions of this work are twofold: first, the existing methods for constructing reachability sets in the literature are extended to derive the envelope of these sets, which determines the region swept by all feasible t… ▽ More

    Submitted 4 March, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

    Comments: Preprint submitted to Automatica

  10. arXiv:2401.00825  [pdf, other]

    cs.CV cs.GR eess.IV

    Sharp-NeRF: Grid-based Fast Deblurring Neural Radiance Fields Using Sharpness Prior

    Authors: Byeonghyeon Lee, Howoong Lee, Usman Ali, Eunbyung Park

    Abstract: Neural Radiance Fields (NeRF) have shown remarkable performance in neural rendering-based novel view synthesis. However, NeRF suffers from severe visual quality degradation when the input images have been captured under imperfect conditions, such as poor illumination, defocus blurring, and lens aberrations. Especially, defocus blur is quite common in the images when they are normally captured usin… ▽ More

    Submitted 1 January, 2024; originally announced January 2024.

    Comments: Accepted to WACV 2024

  11. arXiv:2401.00073  [pdf, other]

    eess.SY cs.LG

    Nonasymptotic Regret Analysis of Adaptive Linear Quadratic Control with Model Misspecification

    Authors: Bruce D. Lee, Anders Rantzer, Nikolai Matni

    Abstract: The strategy of pre-training a large model on a diverse dataset, then fine-tuning for a particular application has yielded impressive results in computer vision, natural language processing, and robotic control. This strategy has vast potential in adaptive control, where it is necessary to rapidly adapt to changing conditions with limited data. Toward concretely understanding the benefit of pre-tr… ▽ More

    Submitted 21 May, 2024; v1 submitted 29 December, 2023; originally announced January 2024.

  12. arXiv:2312.09603  [pdf, other]

    cs.SD cs.LG eess.AS

    Stethoscope-guided Supervised Contrastive Learning for Cross-domain Adaptation on Respiratory Sound Classification

    Authors: June-Woo Kim, Sangmin Bae, Won-Yang Cho, Byungjo Lee, Ho-Young Jung

    Abstract: Despite the remarkable advances in deep learning technology, achieving satisfactory performance in lung sound classification remains a challenge due to the scarcity of available data. Moreover, the respiratory sound samples are collected from a variety of electronic stethoscopes, which could potentially introduce biases into the trained models. When a significant distribution shift occurs within t… ▽ More

    Submitted 15 December, 2023; originally announced December 2023.

    Comments: accepted to ICASSP 2024

  13. arXiv:2311.10224  [pdf, other]

    eess.IV cs.CV cs.LG

    CV-Attention UNet: Attention-based UNet for 3D Cerebrovascular Segmentation of Enhanced TOF-MRA Images

    Authors: Syed Farhan Abbas, Nguyen Thanh Duc, Yoonguu Song, Kyungwon Kim, Ekta Srivastava, Boreom Lee

    Abstract: Due to the lack of automated methods, to diagnose cerebrovascular disease, time-of-flight magnetic resonance angiography (TOF-MRA) is assessed visually, making it time-consuming. The commonly used encoder-decoder architectures for cerebrovascular segmentation utilize redundant features, eventually leading to the extraction of low-level features multiple times. Additionally, convolutional neural ne… ▽ More

    Submitted 19 June, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

  14. arXiv:2311.07079  [pdf, other]

    cs.LG cs.AI eess.SP

    Sample Dominance Aware Framework via Non-Parametric Estimation for Spontaneous Brain-Computer Interface

    Authors: Byeong-Hoo Lee, Byoung-Hee Kwon, Seong-Whan Lee

    Abstract: Deep learning has shown promise in decoding brain signals, such as electroencephalogram (EEG), in the field of brain-computer interfaces (BCIs). However, the non-stationary characteristics of EEG signals pose challenges for training neural networks to acquire appropriate knowledge. Inconsistent EEG signals resulting from these non-stationary characteristics can lead to poor performance. Therefore,… ▽ More

    Submitted 14 November, 2023; v1 submitted 13 November, 2023; originally announced November 2023.

    Comments: 5 pages, 2 figures

  15. arXiv:2310.10804  [pdf, other]

    eess.SP

    Constant Modulus Waveform Design with Block-Level Interference Exploitation for DFRC Systems

    Authors: Byunghyun Lee, Anindya Bijoy Das, David J. Love, Christopher G. Brinton, James V. Krogmeier

    Abstract: Dual-functional radar-communication (DFRC) is a promising technology where radar and communication functions operate on the same spectrum and hardware. In this paper, we propose an algorithm for designing constant modulus waveforms for DFRC systems. Particularly, we jointly optimize the correlation properties and the spatial beam pattern. For communication, we employ constructive interference-base… ▽ More

    Submitted 6 April, 2024; v1 submitted 16 October, 2023; originally announced October 2023.

    Comments: Accepted to IEEE International Conference on Communications (ICC) 2024

  16. arXiv:2309.14741  [pdf, other]

    eess.AS cs.SD

    Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification

    Authors: Hee-Soo Heo, KiHyun Nam, Bong-Jin Lee, Youngki Kwon, Minjae Lee, You Jin Kim, Joon Son Chung

    Abstract: In the field of speaker verification, session or channel variability poses a significant challenge. While many contemporary methods aim to disentangle session information from speaker embeddings, we introduce a novel approach using an additional embedding to represent the session information. This is achieved by training an auxiliary network appended to the speaker embedding extractor which remain… ▽ More

    Submitted 26 September, 2023; originally announced September 2023.

  17. arXiv:2309.04916  [pdf, ps, other]

    eess.SP

    Fusing Channel and Sensor Measurements for Enhancing Predictive Beamforming in UAV-Assisted Massive MIMO Communications

    Authors: Byunghyun Lee, Andrew Marcum, David Love, James Krogmeier

    Abstract: Cellular-connected unmanned aerial vehicles (UAVs) represent a promising technology for extending the coverage of 5G and 6G networks in a cost-effective manner. Additionally, Massive multiple-input multiple-output (MIMO) serves as an effective solution to interference mitigation in cellular-connected UAV communications. In this letter, we propose a fusion of wireless and sensor data to enhance bea… ▽ More

    Submitted 28 December, 2023; v1 submitted 9 September, 2023; originally announced September 2023.

  18. arXiv:2309.03873  [pdf, ps, other]

    eess.SY cs.LG stat.ML

    A Tutorial on the Non-Asymptotic Theory of System Identification

    Authors: Ingvar Ziemann, Anastasios Tsiamis, Bruce Lee, Yassir Jedra, Nikolai Matni, George J. Pappas

    Abstract: This tutorial serves as an introduction to recently developed non-asymptotic methods in the theory of -- mainly linear -- system identification. We emphasize tools we deem particularly useful for a range of problems in this domain, such as the covering technique, the Hanson-Wright Inequality and the method of self-normalized martingales. We then employ these tools to give streamlined proofs of the… ▽ More

    Submitted 16 June, 2024; v1 submitted 7 September, 2023; originally announced September 2023.

  19. arXiv:2308.12492  [pdf, other]

    cs.LG eess.SP

    Optimizing Neural Network Scale for ECG Classification

    Authors: Byeong Tak Lee, Yong-Yeon Jo, Joon-Myoung Kwon

    Abstract: We study scaling convolutional neural networks (CNNs), specifically targeting Residual neural networks (ResNet), for analyzing electrocardiograms (ECGs). Although ECG signals are time-series data, CNN-based models have been shown to outperform other neural networks with different architectures in ECG analysis. However, most previous studies in ECG analysis have overlooked the importance of network… ▽ More

    Submitted 23 August, 2023; originally announced August 2023.

    Comments: 30pages

  20. arXiv:2306.12978  [pdf, other]

    cs.IT eess.SP

    Rate-Splitting Multiple Access for 6G Networks: Ten Promising Scenarios and Applications

    Authors: Jeonghun Park, Byungju Lee, Jinseok Choi, Hoon Lee, Namyoon Lee, Seok-Hwan Park, Kyoung-Jae Lee, Junil Choi, Sung Ho Chae, Sang-Woon Jeon, Kyung Sup Kwak, Bruno Clerckx, Wonjae Shin

    Abstract: In the upcoming 6G era, multiple access (MA) will play an essential role in achieving high throughput performances required in a wide range of wireless applications. Since MA and interference management are closely related issues, the conventional MA techniques are limited in that they cannot provide near-optimal performance in universal interference regimes. Recently, rate-splitting multiple acce… ▽ More

    Submitted 22 June, 2023; originally announced June 2023.

    Comments: 17 pages, 6 figures, submitted to IEEE Network Magazine

  21. arXiv:2306.00680  [pdf, other]

    cs.SD cs.AI eess.AS

    Encoder-decoder multimodal speaker change detection

    Authors: Jee-weon Jung, Soonshin Seo, Hee-Soo Heo, Geonmin Kim, You Jin Kim, Young-ki Kwon, Minjae Lee, Bong-Jin Lee

    Abstract: The task of speaker change detection (SCD), which detects points where speakers change in an input, is essential for several applications. Several studies solved the SCD task using audio inputs only and have shown limited performance. Recently, multimodal SCD (MMSCD) models, which utilise text modality in addition to audio, have shown improved performance. In this study, the proposed model are bui… ▽ More

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 5 pages, accepted for presentation at INTERSPEECH 2023

  22. arXiv:2305.16415  [pdf, other]

    eess.SY

    Performance-Robustness Tradeoffs in Adversarially Robust Control and Estimation

    Authors: Bruce D. Lee, Thomas T. C. K. Zhang, Hamed Hassani, Nikolai Matni

    Abstract: While $\mathcal{H}_\infty$ methods can introduce robustness against worst-case perturbations, their nominal performance under conventional stochastic disturbances is often drastically reduced. Though this fundamental tradeoff between nominal performance and robustness is known to exist, it is not well-characterized in quantitative terms. Toward addressing this issue, we borrow the increasingly ubi… ▽ More

    Submitted 25 May, 2023; originally announced May 2023.

    Comments: arXiv admin note: substantial text overlap with arXiv:2203.10763

  23. Patch-Mix Contrastive Learning with Audio Spectrogram Transformer on Respiratory Sound Classification

    Authors: Sangmin Bae, June-Woo Kim, Won-Yang Cho, Hyerim Baek, Soyoun Son, Byungjo Lee, Changwan Ha, Kyongpil Tae, Sungnyun Kim, Se-Young Yun

    Abstract: Respiratory sound contains crucial information for the early diagnosis of fatal lung diseases. Since the COVID-19 pandemic, there has been a growing interest in contact-free medical care based on electronic stethoscopes. To this end, cutting-edge deep learning models have been developed to diagnose lung diseases; however, it is still challenging due to the scarcity of medical data. In this study,… ▽ More

    Submitted 22 November, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: INTERSPEECH 2023, Code URL: https://github.com/raymin0223/patch-mix_contrastive_learning

  24. arXiv:2303.15637  [pdf, other]

    eess.SY

    The Fundamental Limitations of Learning Linear-Quadratic Regulators

    Authors: Bruce D. Lee, Ingvar Ziemann, Anastasios Tsiamis, Henrik Sandberg, Nikolai Matni

    Abstract: We present a local minimax lower bound on the excess cost of designing a linear-quadratic controller from offline data. The bound is valid for any offline exploration policy that consists of a stabilizing controller and an energy bounded exploratory input. The derivation leverages a relaxation of the minimax estimation problem to Bayesian estimation, and an application of Van Trees' inequality. We… ▽ More

    Submitted 27 March, 2023; originally announced March 2023.

  25. arXiv:2303.00795  [pdf, other]

    eess.IV cs.CV

    Improved Segmentation of Deep Sulci in Cortical Gray Matter Using a Deep Learning Framework Incorporating Laplace's Equation

    Authors: Sadhana Ravikumar, Ranjit Ittyerah, Sydney Lim, Long Xie, Sandhitsu Das, Pulkit Khandelwal, Laura E. M. Wisse, Madigan L. Bedard, John L. Robinson, Terry Schuck, Murray Grossman, John Q. Trojanowski, Edward B. Lee, M. Dylan Tisdall, Karthik Prabhakaran, John A. Detre, David J. Irwin, Winifred Trotman, Gabor Mizsei, Emilio Artacho-Pérula, Maria Mercedes Iñiguez de Onzono Martin, Maria del Mar Arroyo Jiménez, Monica Muñoz, Francisco Javier Molina Romero, Maria del Pilar Marcos Rabal , et al. (7 additional authors not shown)

    Abstract: When developing tools for automated cortical segmentation, the ability to produce topologically correct segmentations is important in order to compute geometrically valid morphometry measures. In practice, accurate cortical segmentation is challenged by image artifacts and the highly convoluted anatomy of the cortex itself. To address this, we propose a novel deep learning-based cortical segmentat… ▽ More

    Submitted 3 March, 2023; v1 submitted 1 March, 2023; originally announced March 2023.

    Comments: Accepted at the 28th biennial international conference on Information Processing in Medical Imaging (IPMI 2023)

  26. arXiv:2212.08122  [pdf, other]

    cs.HC cs.AI cs.RO eess.SY

    Hybrid Paradigm-based Brain-Computer Interface for Robotic Arm Control

    Authors: Byeong-Hoo Lee, Jeong-Hyun Cho, Byung-Hee Kwon

    Abstract: Brain-computer interface (BCI) uses brain signals to communicate with external devices without actual control. Particularly, BCI is one of the interfaces for controlling the robotic arm. In this study, we propose a knowledge distillation-based framework to manipulate robotic arm through hybrid paradigm induced EEG signals for practical use. The teacher model is designed to decode input data hierar… ▽ More

    Submitted 14 December, 2022; originally announced December 2022.

  27. arXiv:2212.06368  [pdf, other]

    cs.CV eess.IV

    Single Cell Training on Architecture Search for Image Denoising

    Authors: Bokyeung Lee, Kyungdeuk Ko, Jonghwan Hong, Hanseok Ko

    Abstract: Neural Architecture Search (NAS) for automatically finding the optimal network architecture has shown some success with competitive performances in various computer vision tasks. However, NAS in general requires a tremendous amount of computations. Thus reducing computational cost has emerged as an important issue. Most of the attempts so far has been based on manual approaches, and often the arch… ▽ More

    Submitted 12 December, 2022; originally announced December 2022.

  28. arXiv:2212.05662  [pdf, other]

    cs.LG eess.SY

    Optimal Planning of Hybrid Energy Storage Systems using Curtailed Renewable Energy through Deep Reinforcement Learning

    Authors: Dongju Kang, Doeun Kang, Sumin Hwangbo, Haider Niaz, Won Bo Lee, J. Jay Liu, Jonggeol Na

    Abstract: Energy management systems (EMS) are becoming increasingly important in order to utilize the continuously growing curtailed renewable energy. Promising energy storage systems (ESS), such as batteries and green hydrogen should be employed to maximize the efficiency of energy stakeholders. However, optimal decision-making, i.e., planning the leveraging between different strategies, is confronted with… ▽ More

    Submitted 11 December, 2022; originally announced December 2022.

    Comments: 30 pages, 8 figures

  29. arXiv:2212.00723  [pdf, other]

    eess.SP cs.LG

    Target-centered Subject Transfer Framework for EEG Data Augmentation

    Authors: Kang Yin, Byeong-Hoo Lee, Byoung-Hee Kwon, Jeong-Hyun Cho

    Abstract: Data augmentation approaches are widely explored for the enhancement of decoding electroencephalogram signals. In subject-independent brain-computer interface system, domain adaption and generalization are utilized to shift source subjects' data distribution to match the target subject as an augmentation. However, previous works either introduce noises (e.g., by noise addition or generation with r… ▽ More

    Submitted 23 November, 2022; originally announced December 2022.

  30. arXiv:2212.00186  [pdf, other]

    cs.LG eess.SY

    Multi-Task Imitation Learning for Linear Dynamical Systems

    Authors: Thomas T. Zhang, Katie Kang, Bruce D. Lee, Claire Tomlin, Sergey Levine, Stephen Tu, Nikolai Matni

    Abstract: We study representation learning for efficient imitation learning over linear systems. In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class. We find that… ▽ More

    Submitted 9 November, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

    Comments: Appeared in L4DC 2023. V3: corrected typo in assumptions

  31. arXiv:2211.06769  [pdf, other]

    eess.IV cs.CV

    Realistic Bokeh Effect Rendering on Mobile GPUs, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, ** Zhang, Feng Zhang, Gaocheng Yu, Zhe Ma, Hongbin Wang, Minsu Kwon, Haotian Qian, Wentao Tong, Pan Mu, Zi** Wang, Guang**g Yan, Brian Lee, Lei Fei, Huai** Chen, Hyebin Cho, Byeongjun Kwon, Munchurl Kim, Mingyang Qian, Huixin Ma, Yanan Li, Xiaotao Wang, Lei Lei

    Abstract: As mobile cameras with compact optics are unable to produce a strong bokeh effect, lots of interest is now devoted to deep learning-based solutions for this task. In this Mobile AI challenge, the target was to develop an efficient end-to-end AI-based bokeh effect rendering approach that can run on modern smartphone GPUs using TensorFlow Lite. The participants were provided with a large-scale EBB!… ▽ More

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2211.03885; text overlap with arXiv:2105.07809, arXiv:2211.04470, arXiv:2211.05256, arXiv:2211.05910

  32. arXiv:2211.05910  [pdf, other]

    eess.IV cs.CV

    Efficient and Accurate Quantized Image Super-Resolution on Mobile NPUs, Mobile AI & AIM 2022 challenge: Report

    Authors: Andrey Ignatov, Radu Timofte, Maurizio Denna, Abdel Younes, Ganzorig Gankhuyag, Jingang Huh, Myeong Kyun Kim, Kihwan Yoon, Hyeon-Cheol Moon, Seungho Lee, Yoonsik Choe, Jinwoo Jeong, Sungjei Kim, Maciej Smyl, Tomasz Latkowski, Pawel Kubik, Michal Sokolski, Yujie Ma, Jiahao Chao, Zhou Zhou, Hongfan Gao, Zhengfeng Yang, Zhenbing Zeng, Zhengyang Zhuge, Chenghua Li , et al. (71 additional authors not shown)

    Abstract: Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs having many computational and memory constraints. In this Mobile AI challenge, we address this problem and propose… ▽ More

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2105.07825, arXiv:2105.08826, arXiv:2211.04470, arXiv:2211.03885, arXiv:2211.05256

  33. arXiv:2211.04768  [pdf, other]

    eess.AS cs.SD

    Absolute decision corrupts absolutely: conservative online speaker diarisation

    Authors: Youngki Kwon, Hee-Soo Heo, Bong-Jin Lee, You Jin Kim, Jee-weon Jung

    Abstract: Our focus lies in developing an online speaker diarisation framework which demonstrates robust performance across diverse domains. In online speaker diarisation, outputs generated in real-time are irreversible, and a few misjudgements in the early phase of an input session can lead to catastrophic results. We hypothesise that cautiously increasing the number of estimated speakers is of paramount i… ▽ More

    Submitted 9 November, 2022; originally announced November 2022.

    Comments: 5pages, 2 figure, 4 tables, submitted to ICASSP

  34. arXiv:2211.04470  [pdf, other]

    cs.CV eess.IV

    Efficient Single-Image Depth Estimation on Mobile Devices, Mobile AI & AIM 2022 Challenge: Report

    Authors: Andrey Ignatov, Grigory Malivenko, Radu Timofte, Lukasz Treszczotko, Xin Chang, Piotr Ksiazek, Michal Lopuszynski, Maciej Pioro, Rafal Rudnicki, Maciej Smyl, Yujie Ma, Zhenyu Li, Zehui Chen, Jialei Xu, Xianming Liu, Junjun Jiang, XueChao Shi, Difan Xu, Yanan Li, Xiaotao Wang, Lei Lei, Ziyu Zhang, Yicheng Wang, Zilong Huang, Guozhong Luo , et al. (14 additional authors not shown)

    Abstract: Various depth estimation models are now widely used on many mobile and IoT devices for image segmentation, bokeh effect rendering, object tracking and many other mobile tasks. Thus, it is very crucial to have efficient and accurate depth estimation models that can run fast on low-power mobile chipsets. In this Mobile AI challenge, the target was to develop deep learning-based single image depth es… ▽ More

    Submitted 7 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: substantial text overlap with arXiv:2105.08630, arXiv:2211.03885; text overlap with arXiv:2105.08819, arXiv:2105.08826, arXiv:2105.08629, arXiv:2105.07809, arXiv:2105.07825

  35. arXiv:2211.04060  [pdf, other]

    cs.SD cs.CL eess.AS

    High-resolution embedding extractor for speaker diarisation

    Authors: Hee-Soo Heo, Youngki Kwon, Bong-Jin Lee, You Jin Kim, Jee-weon Jung

    Abstract: Speaker embedding extractors significantly influence the performance of clustering-based speaker diarisation systems. Conventionally, only one embedding is extracted from each speech segment. However, because of the sliding window approach, a segment easily includes two or more speakers owing to speaker change points. This study proposes a novel embedding extractor architecture, referred to as a h… ▽ More

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: 5pages, 2 figure, 3 tables, submitted to ICASSP

  36. arXiv:2211.00003  [pdf, other]

    eess.IV cs.CV

    MEDS-Net: Self-Distilled Multi-Encoders Network with Bi-Direction Maximum Intensity projections for Lung Nodule Detection

    Authors: Muhammad Usman, Azka Rehman, Abdullah Shahid, Siddique Latif, Shi Sub Byon, Byoung Dai Lee, Sung Hyun Kim, Byung il Lee, Yeong Gil Shin

    Abstract: In this study, we propose a lung nodule detection scheme which fully incorporates the clinic workflow of radiologists. Particularly, we exploit Bi-Directional Maximum intensity projection (MIP) images of various thicknesses (i.e., 3, 5 and 10mm) along with a 3D patch of CT scan, consisting of 10 adjacent slices to feed into self-distillation-based Multi-Encoders Network (MEDS-Net). The proposed ar… ▽ More

    Submitted 26 December, 2022; v1 submitted 30 October, 2022; originally announced November 2022.

  37. arXiv:2210.14682  [pdf, other]

    cs.SD cs.AI eess.AS

    In search of strong embedding extractors for speaker diarisation

    Authors: Jee-weon Jung, Hee-Soo Heo, Bong-Jin Lee, Jaesung Huh, Andrew Brown, Youngki Kwon, Shinji Watanabe, Joon Son Chung

    Abstract: Speaker embedding extractors (EEs), which map input audio to a speaker discriminant latent space, are of paramount importance in speaker diarisation. However, there are several challenges when adopting EEs for diarisation, from which we tackle two key problems. First, the evaluation is not straightforward because the features required for better performance differ between speaker verification and… ▽ More

    Submitted 26 October, 2022; originally announced October 2022.

    Comments: 5pages, 1 figure, 2 tables, submitted to ICASSP

  38. arXiv:2210.10985  [pdf, ps, other]

    cs.SD cs.AI eess.AS

    Large-scale learning of generalised representations for speaker recognition

    Authors: Jee-weon Jung, Hee-Soo Heo, Bong-Jin Lee, Jaesong Lee, Hye-jin Shim, Youngki Kwon, Joon Son Chung, Shinji Watanabe

    Abstract: The objective of this work is to develop a speaker recognition model to be used in diverse scenarios. We hypothesise that two components should be adequately configured to build such a model. First, adequate architecture would be required. We explore several recent state-of-the-art models, including ECAPA-TDNN and MFA-Conformer, as well as other baselines. Second, a massive amount of data would be… ▽ More

    Submitted 27 October, 2022; v1 submitted 19 October, 2022; originally announced October 2022.

    Comments: 5pages, 5 tables, submitted to ICASSP

  39. arXiv:2210.03739  [pdf, other]

    eess.IV cs.AI cs.CV

    Dual-Stage Deeply Supervised Attention-based Convolutional Neural Networks for Mandibular Canal Segmentation in CBCT Scans

    Authors: Azka Rehman, Muhammad Usman, Rabeea Jawaid, Amal Muhammad Saleem, Shi Sub Byon, Sung Hyun Kim, Byoung Dai Lee, Byung il Lee, Yeong Gil Shin

    Abstract: Accurate segmentation of mandibular canals in lower jaws is important in dental implantology. Medical experts determine the implant position and dimensions manually from 3D CT images to avoid damaging the mandibular nerve inside the canal. In this paper, we propose a novel dual-stage deep learning-based scheme for the automatic segmentation of the mandibular canal. Particularly, we first enhance t… ▽ More

    Submitted 2 November, 2022; v1 submitted 6 October, 2022; originally announced October 2022.

    Comments: 7 Pages

  40. arXiv:2208.12343  [pdf, other]

    cs.CV eess.IV

    Bokeh-Loss GAN: Multi-Stage Adversarial Training for Realistic Edge-Aware Bokeh

    Authors: Brian Lee, Fei Lei, Huai** Chen, Alexis Baudron

    Abstract: In this paper, we tackle the problem of monocular bokeh synthesis, where we attempt to render a shallow depth of field image from a single all-in-focus image. Unlike in DSLR cameras, this effect can not be captured directly in mobile cameras due to the physical constraints of the mobile aperture. We thus propose a network-based approach that is capable of rendering realistic monocular bokeh from s…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: ECCV Workshop 2022

  41. Multi-View Attention Transfer for Efficient Speech Enhancement

    Authors: Wooseok Shin, Hyun Joon Park, ** Sob Kim, Byung Hoon Lee, Sung Won Han

    Abstract: Recent deep learning models have achieved high performance in speech enhancement; however, it is still challenging to obtain a fast and low-complexity model without significant performance degradation. Previous knowledge distillation studies on speech enhancement could not solve this problem because their output distillation methods do not fit the speech enhancement task in some aspects. In this s…

    Submitted 30 October, 2022; v1 submitted 22 August, 2022; originally announced August 2022.

    Comments: Proceedings of Interspeech 2022

  42. arXiv:2206.08494  [pdf, other]

    cs.AI eess.SP

    Factorization Approach for Sparse Spatio-Temporal Brain-Computer Interface

    Authors: Byeong-Hoo Lee, Jeong-Hyun Cho, Byoung-Hee Kwon, Seong-Whan Lee

    Abstract: Recently, advanced technologies have unlimited potential in solving various problems with a large amount of data. However, these technologies have yet to show competitive performance in brain-computer interfaces (BCIs) which deal with brain signals. Basically, brain signals are difficult to collect in large quantities, in particular, the amount of information would be sparse in spontaneous BCIs. I…

    Submitted 16 June, 2022; originally announced June 2022.

    Comments: 8 pages

  43. arXiv:2205.01304  [pdf, other]

    eess.AS cs.SD

    Efficient dynamic filter for robust and low computational feature extraction

    Authors: Donghyeon Kim, Gwantae Kim, Bokyeung Lee, Jeong-gi Kwak, David K. Han, Hanseok Ko

    Abstract: Unseen noise signal which is not considered in a model training process is difficult to anticipate and would lead to performance degradation. Various methods have been investigated to mitigate unseen noise. In our previous work, an Instance-level Dynamic Filter (IDF) and a Pixel Dynamic Filter (PDF) were proposed to extract noise-robust features. However, the performance of the dynamic filter migh…

    Submitted 20 October, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted to SLT 2022

  44. arXiv:2204.09976  [pdf, other]

    cs.SD eess.AS

    Baseline Systems for the First Spoofing-Aware Speaker Verification Challenge: Score and Embedding Fusion

    Authors: Hye-** Shim, Hemlata Tak, Xuechen Liu, Hee-Soo Heo, Jee-weon Jung, Joon Son Chung, Soo-Whan Chung, Ha-** Yu, Bong-** Lee, Massimiliano Todisco, Héctor Delgado, Kong Aik Lee, Md Sahidullah, Tomi Kinnunen, Nicholas Evans

    Abstract: Deep learning has brought impressive progress in the study of both automatic speaker verification (ASV) and spoofing countermeasures (CM). Although solutions are mutually dependent, they have typically evolved as standalone sub-systems whereby CM solutions are usually designed for a fixed ASV system. The work reported in this paper aims to gauge the improvements in reliability that can be gained f…

    Submitted 21 April, 2022; originally announced April 2022.

    Comments: 8 pages, accepted by Odyssey 2022

  45. arXiv:2204.09190  [pdf, other]

    eess.SP

    Phase-Shift Design and Channel Modeling for Focused Beams in IRS-Assisted FSO Systems

    Authors: Junghoon Noh, Byungju Lee

    Abstract: Interest in free-space optics (FSO) is rapidly growing as a potential solution for the backhaul of next-generation mobile or low-orbit satellite communications. Various techniques have been suggested for employing an intelligent reflecting surface (IRS) in FSO systems, such as anomalous reflection, power amplification, and beam splitting. It is possible to deliver more power to the receiver (Rx) b…

    Submitted 19 April, 2022; originally announced April 2022.

    Comments: 5 pages

  46. arXiv:2203.14732  [pdf, other]

    eess.AS

    SASV 2022: The First Spoofing-Aware Speaker Verification Challenge

    Authors: Jee-weon Jung, Hemlata Tak, Hye-** Shim, Hee-Soo Heo, Bong-** Lee, Soo-Whan Chung, Ha-** Yu, Nicholas Evans, Tomi Kinnunen

    Abstract: The first spoofing-aware speaker verification (SASV) challenge aims to integrate research efforts in speaker verification and anti-spoofing. We extend the speaker verification scenario by introducing spoofed trials to the usual set of target and impostor trials. In contrast to the established ASVspoof challenge where the focus is upon separate, independently optimised spoofing detection and speake…

    Submitted 28 March, 2022; originally announced March 2022.

    Comments: 5 pages, 2 figures, 2 tables, submitted to Interspeech 2022 as a conference paper

  47. arXiv:2203.14525  [pdf, other]

    eess.AS

    Curriculum learning for self-supervised speaker verification

    Authors: Hee-Soo Heo, Jee-weon Jung, **gu Kang, Youngki Kwon, You ** Kim, Bong-** Lee, Joon Son Chung

    Abstract: The goal of this paper is to train effective self-supervised speaker representations without identity labels. We propose two curriculum learning strategies within a self-supervised learning framework. The first strategy aims to gradually increase the number of speakers in the training phase by enlarging the used portion of the train dataset. The second strategy applies various data augmentations t…

    Submitted 13 February, 2024; v1 submitted 28 March, 2022; originally announced March 2022.

    Comments: INTERSPEECH 2023. 5 pages, 3 figures, 4 tables

  48. arXiv:2203.10763  [pdf, other]

    eess.SY

    Performance-Robustness Tradeoffs in Adversarially Robust Linear-Quadratic Control

    Authors: Bruce D. Lee, Thomas T. C. K. Zhang, Hamed Hassani, Nikolai Matni

    Abstract: While $\mathcal{H}_\infty$ methods can introduce robustness against worst-case perturbations, their nominal performance under conventional stochastic disturbances is often drastically reduced. Though this fundamental tradeoff between nominal performance and robustness is known to exist, it is not well-characterized in quantitative terms. Toward addressing this issue, we borrow from the increasingl…

    Submitted 21 March, 2022; originally announced March 2022.

  49. arXiv:2203.10091  [pdf, other]

    eess.IV cs.CV

    Label conditioned segmentation

    Authors: Tianyu Ma, Benjamin C. Lee, Mert R. Sabuncu

    Abstract: Semantic segmentation is an important task in computer vision that is often tackled with convolutional neural networks (CNNs). A CNN learns to produce pixel-level predictions through training on pairs of images and their corresponding ground-truth segmentation labels. For segmentation tasks with multiple classes, the standard approach is to use a network that computes a multi-channel probabilistic…

    Submitted 17 March, 2022; originally announced March 2022.

    Comments: MIDL 2022

  50. arXiv:2203.08488  [pdf, other]

    eess.AS cs.AI

    Pushing the limits of raw waveform speaker recognition

    Authors: Jee-weon Jung, You ** Kim, Hee-Soo Heo, Bong-** Lee, Youngki Kwon, Joon Son Chung

    Abstract: In recent years, speaker recognition systems based on raw waveform inputs have received increasing attention. However, the performance of such systems is typically inferior to the state-of-the-art handcrafted feature-based counterparts, which demonstrate equal error rates under 1% on the popular VoxCeleb1 test set. This paper proposes a novel speaker recognition model based on raw waveform inputs…

    Submitted 28 March, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

    Comments: Submitted to INTERSPEECH 2022 as a conference paper. 5 pages, 2 figures, 5 tables