Search | arXiv e-print repository

CLIPtone: Unsupervised Learning for Text-based Image Tone Adjustment

Authors: Hyeongmin Lee, Kyoungkook Kang, Jungseul Ok, Sunghyun Cho

Abstract: Recent image tone adjustment (or enhancement) approaches have predominantly adopted supervised learning for learning human-centric perceptual assessment. However, these approaches are constrained by intrinsic challenges of supervised learning. Primarily, the requirement for expertly-curated or retouched images escalates the data acquisition expenses. Moreover, their coverage of target style is con… ▽ More Recent image tone adjustment (or enhancement) approaches have predominantly adopted supervised learning for learning human-centric perceptual assessment. However, these approaches are constrained by intrinsic challenges of supervised learning. Primarily, the requirement for expertly-curated or retouched images escalates the data acquisition expenses. Moreover, their coverage of target style is confined to stylistic variants inferred from the training data. To surmount the above challenges, we propose an unsupervised learning-based approach for text-based image tone adjustment method, CLIPtone, that extends an existing image enhancement method to accommodate natural language descriptions. Specifically, we design a hyper-network to adaptively modulate the pretrained parameters of the backbone model based on text description. To assess whether the adjusted image aligns with the text description without ground truth image, we utilize CLIP, which is trained on a vast set of language-image pairs and thus encompasses knowledge of human perception. The major advantages of our approach are three fold: (i) minimal data collection expenses, (ii) support for a range of adjustments, and (iii) the ability to handle novel text descriptions unseen in training. Our approach's efficacy is demonstrated through comprehensive experiments, including a user study. △ Less

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2401.11090 [pdf, other]

Sharing Energy in Wide Area: A Two-Layer Energy Sharing Scheme for Massive Prosumers

Authors: Yifan Su, Peng Yang, Kai Kang, Zhaojian Wang, Ning Qi, Tonghua Liu, Feng Liu

Abstract: The popularization of distributed energy resources transforms end-users from consumers into prosumers. Inspired by the sharing economy principle, energy sharing markets for prosumers are proposed to facilitate the utilization of renewable energy. This paper proposes a novel two-layer energy sharing market for massive prosumers, which can promote social efficiency by wider-area sharing. In this mar… ▽ More The popularization of distributed energy resources transforms end-users from consumers into prosumers. Inspired by the sharing economy principle, energy sharing markets for prosumers are proposed to facilitate the utilization of renewable energy. This paper proposes a novel two-layer energy sharing market for massive prosumers, which can promote social efficiency by wider-area sharing. In this market, there is an upper-level wide-area market (WAM) in the distribution system and numerous lower-level local-area markets (LAMs) in communities. Prosumers in the same community share energy with each other in the LAM, which can be uncleared. The energy surplus and shortage of LAMs are cleared in the WAM. Thanks to the wide-area two-layer structure, the market outcome is near-social-optimal in large-scale systems. However, the proposed market forms a complex mathematical program with equilibrium constraints (MPEC). To solve the problem, we propose an efficient and hierarchically distributed bidding algorithm. The proposed two-layer market and bidding algorithm are verified on the IEEE 123-bus system with 11250 prosumers, which demonstrates the practicality and efficiency for large-scale markets. △ Less

Submitted 19 January, 2024; originally announced January 2024.

arXiv:2401.00370 [pdf, other]

UGPNet: Universal Generative Prior for Image Restoration

Authors: Hwayoon Lee, Kyoungkook Kang, Hyeongmin Lee, Seung-Hwan Baek, Sunghyun Cho

Abstract: Recent image restoration methods can be broadly categorized into two classes: (1) regression methods that recover the rough structure of the original image without synthesizing high-frequency details and (2) generative methods that synthesize perceptually-realistic high-frequency details even though the resulting image deviates from the original structure of the input. While both directions have b… ▽ More Recent image restoration methods can be broadly categorized into two classes: (1) regression methods that recover the rough structure of the original image without synthesizing high-frequency details and (2) generative methods that synthesize perceptually-realistic high-frequency details even though the resulting image deviates from the original structure of the input. While both directions have been extensively studied in isolation, merging their benefits with a single framework has been rarely studied. In this paper, we propose UGPNet, a universal image restoration framework that can effectively achieve the benefits of both approaches by simply adopting a pair of an existing regression model and a generative model. UGPNet first restores the image structure of a degraded input using a regression model and synthesizes a perceptually-realistic image with a generative model on top of the regressed output. UGPNet then combines the regressed output and the synthesized output, resulting in a final result that faithfully reconstructs the structure of the original image in addition to perceptually-realistic textures. Our extensive experiments on deblurring, denoising, and super-resolution demonstrate that UGPNet can successfully exploit both regression and generative methods for high-fidelity image restoration. △ Less

Submitted 30 December, 2023; originally announced January 2024.

Comments: Accepted to WACV 2024

arXiv:2306.13020 [pdf]

Toward Automated Detection of Microbleeds with Anatomical Scale Localization: A Complete Clinical Diagnosis Support Using Deep Learning

Authors: Jun-Ho Kim, Young Noh, Haejoon Lee, Seul Lee, Woo-Ram Kim, Koung Mi Kang, Eung Yeop Kim, Mohammed A. Al-masni, Dong-Hyun Kim

Abstract: Cerebral Microbleeds (CMBs) are chronic deposits of small blood products in the brain tissues, which have explicit relation to various cerebrovascular diseases depending on their anatomical location, including cognitive decline, intracerebral hemorrhage, and cerebral infarction. However, manual detection of CMBs is a time-consuming and error-prone process because of their sparse and tiny structura… ▽ More Cerebral Microbleeds (CMBs) are chronic deposits of small blood products in the brain tissues, which have explicit relation to various cerebrovascular diseases depending on their anatomical location, including cognitive decline, intracerebral hemorrhage, and cerebral infarction. However, manual detection of CMBs is a time-consuming and error-prone process because of their sparse and tiny structural properties. The detection of CMBs is commonly affected by the presence of many CMB mimics that cause a high false-positive rate (FPR), such as calcification and pial vessels. This paper proposes a novel 3D deep learning framework that does not only detect CMBs but also inform their anatomical location in the brain (i.e., lobar, deep, and infratentorial regions). For the CMB detection task, we propose a single end-to-end model by leveraging the U-Net as a backbone with Region Proposal Network (RPN). To significantly reduce the FPs within the same single model, we develop a new scheme, containing Feature Fusion Module (FFM) that detects small candidates utilizing contextual information and Hard Sample Prototype Learning (HSPL) that mines CMB mimics and generates additional loss term called concentration loss using Convolutional Prototype Learning (CPL). The anatomical localization task does not only tell to which region the CMBs belong but also eliminate some FPs from the detection task by utilizing anatomical information. The results show that the proposed RPN that utilizes the FFM and HSPL outperforms the vanilla RPN and achieves a sensitivity of 94.66% vs. 93.33% and an average number of false positives per subject (FPavg) of 0.86 vs. 14.73. Also, the anatomical localization task further improves the detection performance by reducing the FPavg to 0.56 while maintaining the sensitivity of 94.66%. △ Less

Submitted 22 June, 2023; originally announced June 2023.

Comments: 16 pages, 10 figures,3 tables

arXiv:2306.05759 [pdf, other]

One-shot Learning for Channel Estimation in Massive MIMO Systems

Authors: Kai Kang, Qiyu Hu, Yunlong Cai, Yonina C. Eldar

Abstract: In conventional supervised deep learning based channel estimation algorithms, a large number of training samples are required for offline training. However, in practical communication systems, it is difficult to obtain channel samples for every signal-to-noise ratio (SNR). Furthermore, the generalization ability of these deep neural networks (DNN) is typically poor. In this work, we propose a one-… ▽ More In conventional supervised deep learning based channel estimation algorithms, a large number of training samples are required for offline training. However, in practical communication systems, it is difficult to obtain channel samples for every signal-to-noise ratio (SNR). Furthermore, the generalization ability of these deep neural networks (DNN) is typically poor. In this work, we propose a one-shot self-supervised learning framework for channel estimation in multi-input multi-output (MIMO) systems. The required number of samples for offline training is small and our approach can be directly deployed to adapt to variable channels. Our framework consists of a traditional channel estimation module and a denoising module. The denoising module is designed based on the one-shot learning method Self2Self and employs Bernoulli sampling to generate training labels. Besides,we further utilize a blind spot strategy and dropout technique to avoid overfitting. Simulation results show that the performance of the proposed one-shot self-supervised learning method is very close to the supervised learning approach while obtaining improved generalization ability for different channel environments. △ Less

Submitted 9 June, 2023; originally announced June 2023.

arXiv:2212.00186 [pdf, other]

Multi-Task Imitation Learning for Linear Dynamical Systems

Authors: Thomas T. Zhang, Katie Kang, Bruce D. Lee, Claire Tomlin, Sergey Levine, Stephen Tu, Nikolai Matni

Abstract: We study representation learning for efficient imitation learning over linear systems. In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class. We find that… ▽ More We study representation learning for efficient imitation learning over linear systems. In particular, we consider a setting where learning is split into two phases: (a) a pre-training step where a shared $k$-dimensional representation is learned from $H$ source policies, and (b) a target policy fine-tuning step where the learned representation is used to parameterize the policy class. We find that the imitation gap over trajectories generated by the learned target policy is bounded by $\tilde{O}\left( \frac{k n_x}{HN_{\mathrm{shared}}} + \frac{k n_u}{N_{\mathrm{target}}}\right)$, where $n_x > k$ is the state dimension, $n_u$ is the input dimension, $N_{\mathrm{shared}}$ denotes the total amount of data collected for each policy during representation learning, and $N_{\mathrm{target}}$ is the amount of target task data. This result formalizes the intuition that aggregating data across related tasks to learn a representation can significantly improve the sample efficiency of learning a target task. The trends suggested by this bound are corroborated in simulation. △ Less

Submitted 9 November, 2023; v1 submitted 30 November, 2022; originally announced December 2022.

Comments: Appeared in L4DC 2023. V3: corrected typo in assumptions

arXiv:2206.10524 [pdf, other]

Lyapunov Density Models: Constraining Distribution Shift in Learning-Based Control

Authors: Katie Kang, Paula Gradu, Jason Choi, Michael Janner, Claire Tomlin, Sergey Levine

Abstract: Learned models and policies can generalize effectively when evaluated within the distribution of the training data, but can produce unpredictable and erroneous outputs on out-of-distribution inputs. In order to avoid distribution shift when deploying learning-based control algorithms, we seek a mechanism to constrain the agent to states and actions that resemble those that it was trained on. In co… ▽ More Learned models and policies can generalize effectively when evaluated within the distribution of the training data, but can produce unpredictable and erroneous outputs on out-of-distribution inputs. In order to avoid distribution shift when deploying learning-based control algorithms, we seek a mechanism to constrain the agent to states and actions that resemble those that it was trained on. In control theory, Lyapunov stability and control-invariant sets allow us to make guarantees about controllers that stabilize the system around specific states, while in machine learning, density models allow us to estimate the training data distribution. Can we combine these two concepts, producing learning-based control algorithms that constrain the system to in-distribution states using only in-distribution actions? In this work, we propose to do this by combining concepts from Lyapunov stability and density estimation, introducing Lyapunov density models: a generalization of control Lyapunov functions and density models that provides guarantees on an agent's ability to stay in-distribution over its entire trajectory. △ Less

Submitted 21 June, 2022; originally announced June 2022.

arXiv:2206.03755 [pdf, other]

Mixed-Timescale Deep-Unfolding for Joint Channel Estimation and Hybrid Beamforming

Authors: Kai Kang, Qiyu Hu, Yunlong Cai, Guanding Yu, Jakob Hoydis, Yonina C. Eldar

Abstract: In massive multiple-input multiple-output (MIMO) systems, hybrid analog-digital beamforming is an essential technique for exploiting the potential array gain without using a dedicated radio frequency chain for each antenna. However, due to the large number of antennas, the conventional channel estimation and hybrid beamforming algorithms generally require high computational complexity and signalin… ▽ More In massive multiple-input multiple-output (MIMO) systems, hybrid analog-digital beamforming is an essential technique for exploiting the potential array gain without using a dedicated radio frequency chain for each antenna. However, due to the large number of antennas, the conventional channel estimation and hybrid beamforming algorithms generally require high computational complexity and signaling overhead. In this work, we propose an end-to-end deep-unfolding neural network (NN) joint channel estimation and hybrid beamforming (JCEHB) algorithm to maximize the system sum rate in time-division duplex (TDD) massive MIMO. Specifically, the recursive least-squares (RLS) algorithm and stochastic successive convex approximation (SSCA) algorithm are unfolded for channel estimation and hybrid beamforming, respectively. In order to reduce the signaling overhead, we consider a mixed-timescale hybrid beamforming scheme, where the analog beamforming matrices are optimized based on the channel state information (CSI) statistics offline, while the digital beamforming matrices are designed at each time slot based on the estimated low-dimensional equivalent CSI matrices. We jointly train the analog beamformers together with the trainable parameters of the RLS and SSCA induced deep-unfolding NNs based on the CSI statistics offline. During data transmission, we estimate the low-dimensional equivalent CSI by the RLS induced deep-unfolding NN and update the digital beamformers. In addition, we propose a mixed-timescale deep-unfolding NN where the analog beamformers are optimized online, and extend the framework to frequency-division duplex (FDD) systems where channel feedback is considered. Simulation results show that the proposed algorithm can significantly outperform conventional algorithms with reduced computational complexity and signaling overhead. △ Less

Submitted 8 June, 2022; originally announced June 2022.

arXiv:2110.12059 [pdf, other]

Two-Timescale End-to-End Learning for Channel Acquisition and Hybrid Precoding

Authors: Qiyu Hu, Yunlong Cai, Kai Kang, Guanding Yu, Jakob Hoydis, Yonina C. Eldar

Abstract: In this paper, we propose an end-to-end deep learning-based joint transceiver design algorithm for millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems, which consists of deep neural network (DNN)-aided pilot training, channel feedback, and hybrid analog-digital (HAD) precoding. Specifically, we develop a DNN architecture that maps the received pilots into feedback bits a… ▽ More In this paper, we propose an end-to-end deep learning-based joint transceiver design algorithm for millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) systems, which consists of deep neural network (DNN)-aided pilot training, channel feedback, and hybrid analog-digital (HAD) precoding. Specifically, we develop a DNN architecture that maps the received pilots into feedback bits at the receiver, and then further maps the feedback bits into the hybrid precoder at the transmitter. To reduce the signaling overhead and channel state information (CSI) mismatch caused by the transmission delay, a two-timescale DNN composed of a long-term DNN and a short-term DNN is developed. The analog precoders are designed by the long-term DNN based on the CSI statistics and updated once in a frame consisting of a number of time slots. In contrast, the digital precoders are optimized by the short-term DNN at each time slot based on the estimated low-dimensional equivalent CSI matrices. A two-timescale training method is also developed for the proposed DNN with a binary layer. We then analyze the generalization ability and signaling overhead for the proposed DNN based algorithm. Simulation results show that our proposed technique significantly outperforms conventional schemes in terms of bit-error rate performance with reduced signaling overhead and shorter pilot sequences. △ Less

Submitted 26 October, 2021; v1 submitted 22 October, 2021; originally announced October 2021.

Comments: 18 pages, 26 figures

arXiv:2108.08016 [pdf, ps, other]

Low-Complexity Algorithm for Outage Optimal Resource Allocation in Energy Harvesting-Based UAV Identification Networks

Authors: Jae Cheol Park, Kyu-Min Kang, Junil Choi

Abstract: We study an unmanned aerial vehicle (UAV) identification network equipped with an energy harvesting (EH) technique. In the network, the UAVs harvest energy through radio frequency (RF) signals transmitted from ground control stations (GCSs) and then transmit their identification information to the ground receiver station (GRS). Specifically, we first derive a closed-form expression of the outage p… ▽ More We study an unmanned aerial vehicle (UAV) identification network equipped with an energy harvesting (EH) technique. In the network, the UAVs harvest energy through radio frequency (RF) signals transmitted from ground control stations (GCSs) and then transmit their identification information to the ground receiver station (GRS). Specifically, we first derive a closed-form expression of the outage probability to evaluate the network performance. Then we obtain the closed-form expression of the optimal time allocation when the bandwidth is equally allocated to the UAVs. We also propose a fast-converging algorithm for time and the bandwidth allocation, which is necessary for the UAV environment with high mobility, to optimize the outage performance of EH-based UAV identification network. Simulation results show that the proposed algorithm outperforms the conventional bisection algorithm and achieves near-optimal performance. △ Less

Submitted 21 August, 2021; v1 submitted 18 August, 2021; originally announced August 2021.

Comments: 5 pages, 4 figures, accepted to IEEE Communications Letters, Aug. 2021

arXiv:2012.03663 [pdf]

doi 10.1016/j.media.2021.101993

Deep Metric Learning-based Image Retrieval System for Chest Radiograph and its Clinical Applications in COVID-19

Authors: Aoxiao Zhong, Xiang Li, Dufan Wu, Hui Ren, Kyungsang Kim, Younggon Kim, Varun Buch, Nir Neumark, Bernardo Bizzo, Won Young Tak, Soo Young Park, Yu Rim Lee, Min Kyu Kang, Jung Gil Park, Byung Seok Kim, Woo ** Chung, Ning Guo, Ittai Dayan, Mannudeep K. Kalra, Quanzheng Li

Abstract: In recent years, deep learning-based image analysis methods have been widely applied in computer-aided detection, diagnosis and prognosis, and has shown its value during the public health crisis of the novel coronavirus disease 2019 (COVID-19) pandemic. Chest radiograph (CXR) has been playing a crucial role in COVID-19 patient triaging, diagnosing and monitoring, particularly in the United States.… ▽ More In recent years, deep learning-based image analysis methods have been widely applied in computer-aided detection, diagnosis and prognosis, and has shown its value during the public health crisis of the novel coronavirus disease 2019 (COVID-19) pandemic. Chest radiograph (CXR) has been playing a crucial role in COVID-19 patient triaging, diagnosing and monitoring, particularly in the United States. Considering the mixed and unspecific signals in CXR, an image retrieval model of CXR that provides both similar images and associated clinical information can be more clinically meaningful than a direct image diagnostic model. In this work we develop a novel CXR image retrieval model based on deep metric learning. Unlike traditional diagnostic models which aims at learning the direct map** from images to labels, the proposed model aims at learning the optimized embedding space of images, where images with the same labels and similar contents are pulled together. It utilizes multi-similarity loss with hard-mining sampling strategy and attention mechanism to learn the optimized embedding space, and provides similar images to the query image. The model is trained and validated on an international multi-site COVID-19 dataset collected from 3 different sources. Experimental results of COVID-19 image retrieval and diagnosis tasks show that the proposed model can serve as a robust solution for CXR analysis and patient management for COVID-19. The model is also tested on its transferability on a different clinical decision support task, where the pre-trained model is applied to extract image features from a new dataset without any further training. These results demonstrate our deep metric learning based image retrieval model is highly efficient in the CXR retrieval, diagnosis and prognosis, and thus has great clinical value for the treatment and management of COVID-19 patients. △ Less

Submitted 25 November, 2020; originally announced December 2020.

Comments: Aoxiao Zhong and Xiang Li contribute equally to this work

Journal ref: Medical Image Analysis. 70 (2021) 101993

arXiv:2010.01810 [pdf, other]

Painting Outside as Inside: Edge Guided Image Outpainting via Bidirectional Rearrangement with Progressive Step Learning

Authors: Kyunghun Kim, Yeohun Yun, Keon-Woo Kang, Kyeongbo Kong, Siyeong Lee, Suk-Ju Kang

Abstract: Image outpainting is a very intriguing problem as the outside of a given image can be continuously filled by considering as the context of the image. This task has two main challenges. The first is to maintain the spatial consistency in contents of generated regions and the original input. The second is to generate a high-quality large image with a small amount of adjacent information. Conventiona… ▽ More Image outpainting is a very intriguing problem as the outside of a given image can be continuously filled by considering as the context of the image. This task has two main challenges. The first is to maintain the spatial consistency in contents of generated regions and the original input. The second is to generate a high-quality large image with a small amount of adjacent information. Conventional image outpainting methods generate inconsistent, blurry, and repeated pixels. To alleviate the difficulty of an outpainting problem, we propose a novel image outpainting method using bidirectional boundary region rearrangement. We rearrange the image to benefit from the image inpainting task by reflecting more directional information. The bidirectional boundary region rearrangement enables the generation of the missing region using bidirectional information similar to that of the image inpainting task, thereby generating the higher quality than the conventional methods using unidirectional information. Moreover, we use the edge map generator that considers images as original input with structural information and hallucinates the edges of unknown regions to generate the image. Our proposed method is compared with other state-of-the-art outpainting and inpainting methods both qualitatively and quantitatively. We further compared and evaluated them using BRISQUE, one of the No-Reference image quality assessment (IQA) metrics, to evaluate the naturalness of the output. The experimental results demonstrate that our method outperforms other methods and generates new images with 360°panoramic characteristics. △ Less

Submitted 9 November, 2020; v1 submitted 5 October, 2020; originally announced October 2020.

Comments: Paper accepted in WACV 2021

arXiv:2009.12610 [pdf]

Deep Learning-based Four-region Lung Segmentation in Chest Radiography for COVID-19 Diagnosis

Authors: Young-Gon Kim, Kyungsang Kim, Dufan Wu, Hui Ren, Won Young Tak, Soo Young Park, Yu Rim Lee, Min Kyu Kang, Jung Gil Park, Byung Seok Kim, Woo ** Chung, Mannudeep K. Kalra, Quanzheng Li

Abstract: Purpose. Imaging plays an important role in assessing severity of COVID 19 pneumonia. However, semantic interpretation of chest radiography (CXR) findings does not include quantitative description of radiographic opacities. Most current AI assisted CXR image analysis framework do not quantify for regional variations of disease. To address these, we proposed a four region lung segmentation method t… ▽ More Purpose. Imaging plays an important role in assessing severity of COVID 19 pneumonia. However, semantic interpretation of chest radiography (CXR) findings does not include quantitative description of radiographic opacities. Most current AI assisted CXR image analysis framework do not quantify for regional variations of disease. To address these, we proposed a four region lung segmentation method to assist accurate quantification of COVID 19 pneumonia. Methods. A segmentation model to separate left and right lung is firstly applied, and then a carina and left hilum detection network is used, which are the clinical landmarks to separate the upper and lower lungs. To improve the segmentation performance of COVID 19 images, ensemble strategy incorporating five models is exploited. Using each region, we evaluated the clinical relevance of the proposed method with the Radiographic Assessment of the Quality of Lung Edema (RALE). Results. The proposed ensemble strategy showed dice score of 0.900, which is significantly higher than conventional methods (0.854 0.889). Mean intensities of segmented four regions indicate positive correlation to the extent and density scores of pulmonary opacities under the RALE framework. Conclusion. A deep learning based model in CXR can accurately segment and quantify regional distribution of pulmonary opacities in patients with COVID 19 pneumonia. △ Less

Submitted 26 September, 2020; originally announced September 2020.

arXiv:1110.3531 [pdf, other]

Switching Strategies for Linear Feedback Stabilization with Sparsified State Measurements

Authors: Kang Kang, Sourabh Bhattacharya, Tamer Basar

Abstract: In this paper, we address the problem of stabilization in continuous time linear dynamical systems using state feedback when compressive sampling techniques are used for state measurement and reconstruction. In [5], we had introduced the concept of using l1 reconstruction technique, commonly used in sparse data reconstruction, for state measurement and estimation in a discrete time linear system.… ▽ More In this paper, we address the problem of stabilization in continuous time linear dynamical systems using state feedback when compressive sampling techniques are used for state measurement and reconstruction. In [5], we had introduced the concept of using l1 reconstruction technique, commonly used in sparse data reconstruction, for state measurement and estimation in a discrete time linear system. In this work, we extend the previous scenario to analyse continuous time linear systems. We investigate the effect of switching within a set of sparsifiers, introduced in [5], on the stability of a linear plant in continuous time settings. Initially, we analyze the problem of stabilization in low dimensional systems, following which we generalize the results to address the problem of stabilization in systems of arbitrary dimensions. △ Less

Submitted 16 October, 2011; originally announced October 2011.

Comments: American Control Conference 2012

Showing 1–14 of 14 results for author: Kang, K