Compressed Sensor Caching and Collaborative Sparse Data Recovery with Anchor Alignment

Authors: Yi-Jen Yang, Ming-Hsun Yang, Jwo-Yuh Wu, Y. -W. Peter Hong

Abstract: This work examines the compressed sensor caching problem in wireless sensor networks and devises efficient distributed sparse data recovery algorithms to enable collaboration among multiple caches. In this problem, each cache is only allowed to access measurements from a small subset of sensors within its vicinity to reduce both cache size and data acquisition overhead. To enable reliable data recovery with limited access to measurements, we propose a distributed sparse data recovery method, called the collaborative sparse recovery by anchor alignment (CoSR-AA) algorithm, where collaboration among caches is enabled by aligning their locally recovered data at a few anchor nodes. The proposed algorithm is based on the consensus alternating direction method of multipliers (ADMM) algorithm but with message exchange that is reduced by considering the proposed anchor alignment strategy. Then, by the deep unfolding of the ADMM iterations, we further propose the Deep CoSR-AA algorithm that can be used to significantly reduce the number of iterations. We obtain a graph neural network architecture where message exchange is done more efficiently by an embedded autoencoder. Simulations are provided to demonstrate the effectiveness of the proposed collaborative recovery algorithms in terms of the improved reconstruction quality and the reduced communication overhead due to anchor alignment.

Submitted 14 June, 2024; originally announced June 2024.

Comments: v1 was submitted to IEEE Transactions on Signal Processing on Sept. 18, 2023

arXiv:2404.10343 [pdf, other]

The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang , et al. (109 additional authors not shown)

Abstract: This paper provides a comprehensive review of the NTIRE 2024 challenge, focusing on efficient single-image super-resolution (ESR) solutions and their outcomes. The task of this challenge is to super-resolve an input image with a magnification factor of x4 based on pairs of low and corresponding high-resolution images. The primary objective is to develop networks that optimize various aspects such as runtime, parameters, and FLOPs, while still maintaining a peak signal-to-noise ratio (PSNR) of approximately 26.90 dB on the DIV2K_LSDIR_valid dataset and 26.99 dB on the DIV2K_LSDIR_test dataset. In addition, this challenge has 4 tracks including the main track (overall performance), sub-track 1 (runtime), sub-track 2 (FLOPs), and sub-track 3 (parameters). In the main track, all three metrics (i.e., runtime, FLOPs, and parameter count) were considered. The ranking of the main track is calculated based on a weighted sum of the scores of all other sub-tracks. In sub-track 1, the practical runtime performance of the submissions was evaluated, and the corresponding score was used to determine the ranking. In sub-track 2, the number of FLOPs was considered. The score calculated based on the corresponding FLOPs was used to determine the ranking. In sub-track 3, the number of parameters was considered. The score calculated based on the corresponding parameters was used to determine the ranking. RLFN is set as the baseline for efficiency measurement. The challenge had 262 registered participants, and 34 teams made valid submissions. They gauge the state-of-the-art in efficient single-image super-resolution. To facilitate the reproducibility of the challenge and enable other researchers to build upon these findings, the code and the pre-trained model of validated solutions are made publicly available at https://github.com/Amazingren/NTIRE2024_ESR/.
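
The PSNR thresholds quoted in the abstract above follow the usual definition, PSNR = 10 log10(peak^2 / MSE). The following is a minimal sketch of that formula only; the function name `psnr`, the flat-list image representation, and the 8-bit peak value of 255 are illustrative assumptions, and the challenge's actual evaluation protocol (color space, border cropping, and so on) is not given in the abstract.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences.

    `reference` and `estimate` are flat sequences of pixel intensities
    (a hypothetical representation chosen to keep the sketch dependency-free).
    """
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference; identical inputs give an infinite PSNR under this definition.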

Submitted 25 June, 2024; v1 submitted 16 April, 2024; originally announced April 2024.

Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

arXiv:2404.08333 [pdf, other]

doi 10.1109/TWC.2024.3386160

OTFS Channel Estimation and Detection for Channels with Very Large Delay Spread

Authors: Preety Priya, Yi Hong, Emanuele Viterbo

Abstract: In low latency applications and in general, for overspread channels, channel delay spread is a large percentage of the transmission frame duration. In this paper, we consider OTFS in an overspread channel exhibiting a delay spread that exceeds the block duration in a frame, where traditional channel estimation (CE) fails. We propose a two-stage CE method based on a delay-Doppler (DD) training frame, consisting of a dual chirp converted from time domain and a higher power pilot. The first stage employs a DD domain embedded pilot CE to estimate the aliased delays (due to modulo operation) and Doppler shifts, followed by identifying all the underspread paths not coinciding with any overspread path. The second stage utilizes time domain dual chirp correlation to estimate the actual delays and Doppler shifts of the remaining paths. This stage also resolves ambiguity in estimating delays and Doppler shifts for paths sharing the same aliased delay. Furthermore, we present a modified low-complexity maximum ratio combining (MRC) detection algorithm for OTFS in overspread channels. Finally, we evaluate the performance of OTFS using the proposed CE and the modified MRC detection in terms of normalized mean square error (NMSE) and bit error rate (BER).

Submitted 12 April, 2024; originally announced April 2024.

arXiv:2403.08187 [pdf, other]

Automatic Speech Recognition (ASR) for the Diagnosis of pronunciation of Speech Sound Disorders in Korean children

Authors: Taekyung Ahn, Yeonjung Hong, Younggon Im, Do Hyung Kim, Dayoung Kang, Joo Won Jeong, Jae Won Kim, Min Jung Kim, Ah-ra Cho, Dae-Hyun Jang, Hosung Nam

Abstract: This study presents a model of automatic speech recognition (ASR) designed to diagnose pronunciation issues in children with speech sound disorders (SSDs) to replace manual transcriptions in clinical procedures. Since ASR models trained for general purposes primarily predict input speech into real words, employing a well-known high-performance ASR model for evaluating pronunciation in children with SSDs is impractical. We fine-tuned the wav2vec 2.0 XLS-R model to recognize speech as pronounced rather than as existing words. The model was fine-tuned with a speech dataset from 137 children with inadequate speech production pronouncing 73 Korean words selected for actual clinical diagnosis. The model's predictions of the pronunciations of the words matched the human annotations with about 90% accuracy. While the model still requires improvement in recognizing unclear pronunciation, this study demonstrates that ASR models can streamline complex pronunciation error diagnostic procedures in clinical fields.

Submitted 12 March, 2024; originally announced March 2024.

Comments: 12 pages, 2 figures

ACM Class: I.2.7

arXiv:2311.08661 [pdf, other]

Deep Neural Network Identification of Limnonectes Species and New Class Detection Using Image Data

Authors: Li Xu, Yili Hong, Eric P. Smith, David S. McLeod, Xinwei Deng, Laura J. Freeman

Abstract: As is true of many complex tasks, the work of discovering, describing, and understanding the diversity of life on Earth (viz., biological systematics and taxonomy) requires many tools. Some of this work can be accomplished as it has been done in the past, but some aspects present us with challenges which traditional knowledge and tools cannot adequately resolve. One such challenge is presented by species complexes in which the morphological similarities among the group members make it difficult to reliably identify known species and detect new ones. We address this challenge by developing new tools using the principles of machine learning to resolve two specific questions related to species complexes. The first question is formulated as a classification problem in statistics and machine learning and the second question is an out-of-distribution (OOD) detection problem. We apply these tools to a species complex comprising Southeast Asian stream frogs (Limnonectes kuhlii complex) and employ a morphological character (hind limb skin texture) traditionally treated qualitatively in a quantitative and objective manner. We demonstrate that deep neural networks can successfully automate the classification of an image into a known species group for which it has been trained. We further demonstrate that the algorithm can successfully classify an image into a new class if the image does not belong to the existing classes. Additionally, we use the larger MNIST dataset to test the performance of our OOD detection algorithm. We finish our paper with some concluding remarks regarding the application of these methods to species complexes and our efforts to document true biodiversity. This paper has online supplementary materials.

Submitted 14 November, 2023; originally announced November 2023.

Comments: 26 pages, 11 Figures

arXiv:2311.08439 [pdf, other]

A Unified Approach for Comprehensive Analysis of Various Spectral and Tissue Doppler Echocardiography

Authors: Jaeik Jeon, Jiyeon Kim, Yeonggul Jang, Yeonyee E. Yoon, Dawun Jeong, Youngtaek Hong, Seung-Ah Lee, Hyuk-Jae Chang

Abstract: Doppler echocardiography offers critical insights into cardiac function and phases by quantifying blood flow velocities and evaluating myocardial motion. However, previous methods for automating Doppler analysis, ranging from initial signal processing techniques to advanced deep learning approaches, have been constrained by their reliance on electrocardiogram (ECG) data and their inability to process Doppler views collectively. We introduce a novel unified framework using a convolutional neural network for comprehensive analysis of spectral and tissue Doppler echocardiography images that combines automatic measurements and end-diastole (ED) detection into a singular method. The network automatically recognizes key features across various Doppler views, with novel Doppler shape embedding and anti-aliasing modules enhancing interpretation and ensuring consistent analysis. Empirical results indicate a consistent outperformance in performance metrics, including dice similarity coefficients (DSC) and intersection over union (IoU). The proposed framework demonstrates strong agreement with clinicians in Doppler automatic measurements and competitive performance in ED detection.
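
The two overlap metrics named in the abstract above have standard definitions: DSC = 2|A ∩ B| / (|A| + |B|) and IoU = |A ∩ B| / |A ∪ B|. The sketch below illustrates them on flat binary masks; the function name `dice_and_iou` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper.

```python
def dice_and_iou(pred, target):
    """Return (Dice, IoU) for two equally sized flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same length")
    # Count foreground pixels in the overlap and in each mask.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    pred_count = sum(1 for p in pred if p)
    target_count = sum(1 for t in target if t)
    union = pred_count + target_count - intersection
    if union == 0:
        # Both masks are empty; treated here as perfect agreement (an assumption).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_count + target_count)
    iou = intersection / union
    return dice, iou
```

For any pair of masks the two scores are related by Dice = 2 IoU / (1 + IoU), so they rank a single prediction identically but are not numerically interchangeable.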

Submitted 14 November, 2023; originally announced November 2023.

arXiv:2310.09511 [pdf, other]

Online Parameter Identification of Generalized Non-cooperative Game

Authors: Jianguo Chen, Jinlong Lei, Hongsheng Qi, Yiguang Hong

Abstract: This work studies the parameter identification problem of a generalized non-cooperative game, where each player's cost function is influenced by an observable signal and some unknown parameters. We consider the scenario where equilibrium of the game at some observable signals can be observed with noises, whereas our goal is to identify the unknown parameters with the observed data. Assuming that the observable signals and the corresponding noise-corrupted equilibriums are acquired sequentially, we construct this parameter identification problem as online optimization and introduce a novel online parameter identification algorithm. To be specific, we construct a regularized loss function that balances conservativeness and correctiveness, where the conservativeness term ensures that the new estimates do not deviate significantly from the current estimates, while the correctiveness term is captured by the Karush-Kuhn-Tucker conditions. We then prove that when the players' cost functions are linear with respect to the unknown parameters and the learning rate of the online parameter identification algorithm satisfies $\mu_k \propto 1/\sqrt{k}$, along with other assumptions, the regret bound of the proposed algorithm is $O(\sqrt{K})$. Finally, we conduct numerical simulations on a Nash-Cournot problem to demonstrate that the performance of the online identification algorithm is comparable to that of the offline setting.
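
As a rough illustration of why a learning rate proportional to 1/\sqrt{k} pairs naturally with a regret bound of order \sqrt{K}: in many online-learning analyses the regret is controlled by a sum of the step sizes, and that sum grows like \sqrt{K}. The inequality below is only this generic summation step, offered as a sketch; it is not the paper's proof, whose assumptions and details are not given in the abstract.

```latex
\[
  \sum_{k=1}^{K} \frac{1}{\sqrt{k}}
  \;\le\; 1 + \int_{1}^{K} \frac{dx}{\sqrt{x}}
  \;=\; 2\sqrt{K} - 1
  \;=\; O\!\left(\sqrt{K}\right).
\]
```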

Submitted 14 October, 2023; originally announced October 2023.

Comments: 10 pages, 5 figures

arXiv:2310.08897 [pdf, other]

Self supervised convolutional kernel based handcrafted feature harmonization: Enhanced left ventricle hypertension disease phenotyping on echocardiography

Authors: Jina Lee, Youngtaek Hong, Dawun Jeong, Yeonggul Jang, Jaeik Jeon, Sihyeon Jeong, Taekgeun Jung, Yeonyee E. Yoon, Inki Moon, Seung-Ah Lee, Hyuk-Jae Chang

Abstract: Radiomics, a medical imaging technique, extracts quantitative handcrafted features from images to predict diseases. Harmonization in those features ensures consistent feature extraction across various imaging devices and protocols. Methods for harmonization include standardized imaging protocols, statistical adjustments, and evaluating feature robustness. Myocardial diseases such as Left Ventricular Hypertrophy (LVH) and Hypertensive Heart Disease (HHD) are diagnosed via echocardiography, but variable imaging settings pose challenges. Harmonization techniques are crucial for applying handcrafted features in disease diagnosis in such scenarios. Self-supervised learning (SSL) enhances data understanding within limited datasets and adapts to diverse data settings. ConvNeXt-V2 integrates convolutional layers into SSL, displaying superior performance in various tasks. This study focuses on convolutional filters within SSL, using them as preprocessing to convert images into feature maps for handcrafted feature harmonization. Our proposed method excelled in harmonization evaluation and exhibited superior LVH classification performance compared to existing methods.

Submitted 22 November, 2023; v1 submitted 13 October, 2023; originally announced October 2023.

Comments: 11 pages, 7 figures

arXiv:2308.16483 [pdf, other]

Improving Out-of-Distribution Detection in Echocardiographic View Classification through Enhancing Semantic Features

Authors: Jaeik Jeon, Seongmin Ha, Yeonggul Jang, Yeonyee E. Yoon, Jiyeon Kim, Hyunseok Jeong, Dawun Jeong, Youngtaek Hong, Seung-Ah Lee, Hyuk-Jae Chang

Abstract: In echocardiographic view classification, accurately detecting out-of-distribution (OOD) data is essential but challenging, especially given the subtle differences between in-distribution and OOD data. While conventional OOD detection methods, such as Mahalanobis distance (MD), are effective in far-OOD scenarios with clear distinctions between distributions, they struggle to discern the less obvious variations characteristic of echocardiographic data. In this study, we introduce a novel use of label smoothing to enhance semantic feature representation in echocardiographic images, demonstrating that these enriched semantic features are key for significantly improving near-OOD instance detection. By combining label smoothing with MD-based OOD detection, we establish a new benchmark for accuracy in echocardiographic OOD detection.
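
Label smoothing, as mentioned in the abstract above, is commonly defined as replacing a one-hot training target with (1 - epsilon) times the one-hot vector plus epsilon / C spread over all C classes. The sketch below shows only that standard construction; the function name `smooth_labels` and the default epsilon of 0.1 are illustrative assumptions, and the abstract does not state which variant or smoothing value the authors use.

```python
def smooth_labels(class_index, num_classes, epsilon=0.1):
    """Return a label-smoothed target distribution for one training example.

    Every class receives epsilon / num_classes, and the true class
    additionally receives 1 - epsilon, so the entries sum to 1.
    """
    if not 0 <= class_index < num_classes:
        raise ValueError("class_index out of range")
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    off_value = epsilon / num_classes
    target = [off_value] * num_classes
    target[class_index] += 1.0 - epsilon
    return target
```

With epsilon set to 0 the function returns the ordinary one-hot target, which makes it straightforward to compare training with and without smoothing.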

Submitted 23 November, 2023; v1 submitted 31 August, 2023; originally announced August 2023.

arXiv:2303.09088 [pdf, other]

MetaRegNet: Metamorphic Image Registration Using Flow-Driven Residual Networks

Authors: Ankita Joshi, Yi Hong

Abstract: Deep learning based methods provide efficient solutions to medical image registration, including the challenging problem of diffeomorphic image registration. However, most methods register normal image pairs, facing difficulty handling those with missing correspondences, e.g., in the presence of pathology like tumors. We desire an efficient solution to jointly account for spatial deformations and appearance changes in the pathological regions where the correspondences are missing, i.e., finding a solution to metamorphic image registration. Some approaches are proposed to tackle this problem, but they cannot properly handle large pathological regions and deformations around pathologies. In this paper, we propose a deep metamorphic image registration network (MetaRegNet), which adopts time-varying flows to drive spatial diffeomorphic deformations and generate intensity variations. We evaluate MetaRegNet on two datasets, i.e., BraTS 2021 with brain tumors and 3D-IRCADb-01 with liver tumors, showing promising results in registering a healthy and tumor image pair. The source code is available online.

Submitted 16 March, 2023; originally announced March 2023.

Comments: 11 pages, 3 figures

arXiv:2212.07939 [pdf, other]

RWEN-TTS: Relation-aware Word Encoding Network for Natural Text-to-Speech Synthesis

Authors: Shinhyeok Oh, HyeongRae Noh, Yoonseok Hong, Insoo Oh

Abstract: With the advent of deep learning, a huge number of text-to-speech (TTS) models which produce human-like speech have emerged. Recently, by introducing syntactic and semantic information w.r.t. the input text, various approaches have been proposed to enrich the naturalness and expressiveness of TTS models. Although these strategies showed impressive results, they still have some limitations in utilizing language information. First, most approaches only use graph networks to utilize syntactic and semantic information without considering linguistic features. Second, most previous works do not explicitly consider adjacent words when encoding syntactic and semantic information, even though it is obvious that adjacent words are usually meaningful when encoding the current word. To address these issues, we propose Relation-aware Word Encoding Network (RWEN), which effectively allows syntactic and semantic information based on two modules (i.e., Semantic-level Relation Encoding and Adjacent Word Relation Encoding). Experimental results show substantial improvements compared to previous works.

Submitted 15 December, 2022; originally announced December 2022.

Comments: Accepted to AAAI 2023

arXiv:2211.07951 [pdf, other]

Show Me the Instruments: Musical Instrument Retrieval from Mixture Audio

Authors: Kyungsu Kim, Minju Park, Haesun Joung, Yunkee Chae, Yeongbeom Hong, Seonghyeon Go, Kyogu Lee

Abstract: As digital music production has become mainstream, the selection of appropriate virtual instruments plays a crucial role in determining the quality of music. To search the musical instrument samples or virtual instruments that make one's desired sound, music producers use their ears to listen and compare each instrument sample in their collection, which is time-consuming and inefficient. In this paper, we call this task Musical Instrument Retrieval and propose a method for retrieving desired musical instruments using reference music mixture as a query. The proposed model consists of the Single-Instrument Encoder and the Multi-Instrument Encoder, both based on convolutional neural networks. The Single-Instrument Encoder is trained to classify the instruments used in single-track audio, and we take its penultimate layer's activation as the instrument embedding. The Multi-Instrument Encoder is trained to estimate multiple instrument embeddings using the instrument embeddings computed by the Single-Instrument Encoder as a set of target embeddings. For more generalized training and realistic evaluation, we also propose a new dataset called Nlakh. Experimental results showed that the Single-Instrument Encoder was able to learn the mapping from the audio signal of unseen instruments to the instrument embedding space and the Multi-Instrument Encoder was able to extract multiple embeddings from the mixture of music and retrieve the desired instruments successfully. The code used for the experiment and audio samples are available at: https://github.com/minju0821/musical_instrument_retrieval

Submitted 15 November, 2022; originally announced November 2022.

Comments: 5 pages, 4 figures, submitted to ICASSP 2023

arXiv:2207.01868 [pdf, other]

Bayesian approaches for Quantifying Clinicians' Variability in Medical Image Quantification

Authors: Jaeik Jeon, Yeonggul Jang, Youngtaek Hong, Hackjoon Shim, Sekeun Kim

Abstract: Medical imaging, including MRI, CT, and Ultrasound, plays a vital role in clinical decisions. Accurate segmentation is essential to measure the structure of interest from the image. However, manual segmentation is highly operator-dependent, which leads to high inter and intra-variability of quantitative measurements. In this paper, we explore the feasibility that Bayesian predictive distribution parameterized by deep neural networks can capture the clinicians' inter-intra variability. By exploring and analyzing recently emerged approximate inference schemes, we evaluate whether approximate Bayesian deep learning with the posterior over segmentations can learn inter-intra rater variability both in segmentation and clinical measurements. The experiments are performed with two different imaging modalities: MRI and ultrasound. We empirically demonstrated that Bayesian predictive distribution parameterized by deep neural networks could approximate the clinicians' inter-intra variability. We show a new perspective in analyzing medical images quantitatively by providing clinical measurement uncertainty.

Submitted 6 July, 2022; v1 submitted 5 July, 2022; originally announced July 2022.

Comments: Interpretable Machine Learning in Healthcare

arXiv:2207.01078 [pdf, other]

doi 10.1109/TAFFC.2023.3247914

ARAUS: A Large-Scale Dataset and Baseline Models of Affective Responses to Augmented Urban Soundscapes

Authors: Kenneth Ooi, Zhen-Ting Ong, Karn N. Watcharasupat, Bhan Lam, Joo Young Hong, Woon-Seng Gan

Abstract: Choosing optimal maskers for existing soundscapes to effect a desired perceptual change via soundscape augmentation is non-trivial due to extensive varieties of maskers and a dearth of benchmark datasets with which to compare and develop soundscape augmentation models. To address this problem, we make publicly available the ARAUS (Affective Responses to Augmented Urban Soundscapes) dataset, which comprises a five-fold cross-validation set and independent test set totaling 25,440 unique subjective perceptual responses to augmented soundscapes presented as audio-visual stimuli. Each augmented soundscape is made by digitally adding "maskers" (bird, water, wind, traffic, construction, or silence) to urban soundscape recordings at fixed soundscape-to-masker ratios. Responses were then collected by asking participants to rate how pleasant, annoying, eventful, uneventful, vibrant, monotonous, chaotic, calm, and appropriate each augmented soundscape was, in accordance with ISO 12913-2:2018. Participants also provided relevant demographic information and completed standard psychological questionnaires. We perform exploratory and statistical analysis of the responses obtained to verify internal consistency and agreement with known results in the literature. Finally, we demonstrate the benchmarking capability of the dataset by training and comparing four baseline models for urban soundscape pleasantness: a low-parameter regression model, a high-parameter convolutional neural network, and two attention-based networks in the literature.

Submitted 5 March, 2023; v1 submitted 3 July, 2022; originally announced July 2022.

Comments: [v1, v2] 25 pages, 11 figures. [v3] 33 pages, 18 figures. v3 updated with changes made after peer review. in IEEE Transactions on Affective Computing, 2023

Journal ref: IEEE Trans. Affect. Comput., pp. 1-17, 2023

arXiv:2206.03112 [pdf]

doi 10.3390/su14127485

Singapore Soundscape Site Selection Survey (S5): Identification of Characteristic Soundscapes of Singapore via Weighted k-means Clustering

Authors: Kenneth Ooi, Bhan Lam, Joo Young Hong, Karn N. Watcharasupat, Zhen-Ting Ong, Woon-Seng Gan

Abstract: The ecological validity of soundscape studies usually rests on a choice of soundscapes that are representative of the perceptual space under investigation. For example, a soundscape pleasantness study might investigate locations with soundscapes ranging from "pleasant" to "annoying". The choice of soundscapes is typically researcher-led, but a participant-led process can reduce selection bias and improve result reliability. Hence, we propose a robust participant-led method to pinpoint characteristic soundscapes possessing arbitrary perceptual attributes. We validate our method by identifying Singaporean soundscapes spanning the perceptual quadrants generated from the "Pleasantness" and "Eventfulness" axes of the ISO 12913-2 circumplex model of soundscape perception, as perceived by local experts. From memory and experience, 67 participants first selected locations corresponding to each perceptual quadrant in each major planning region of Singapore. We then performed weighted k-means clustering on the selected locations, with weights for each location derived from previous frequencies and durations spent in each location by each participant. Weights hence acted as proxies for participant confidence. In total, 62 locations were thereby identified as suitable locations with characteristic soundscapes for further research utilizing the ISO 12913-2 perceptual quadrants. Audio-visual recordings and acoustic characterization of the soundscapes will be made in a future study.
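
Weighted k-means, the clustering step named in the abstract above, differs from ordinary k-means only in the update step: each centroid becomes the weighted mean of its assigned points. The following is a minimal sketch under stated assumptions: points are planar (x, y) pairs, initial centroids are supplied by the caller, and the function name `weighted_kmeans` and its parameters are hypothetical. How the authors derive weights from visit frequency and duration, and how they handle geographic coordinates, is not specified in the abstract.

```python
def weighted_kmeans(points, weights, centroids, iterations=20):
    """Cluster 2-D points with per-point non-negative weights.

    Returns (centroids, labels), where labels[i] is the index of the
    centroid that points[i] was assigned to in the last assignment step.
    """
    if len(points) != len(weights):
        raise ValueError("points and weights must have the same length")
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(len(centroids)),
                key=lambda j: (p[0] - centroids[j][0]) ** 2
                + (p[1] - centroids[j][1]) ** 2,
            )
            for p in points
        ]
        # Update step: weighted mean of the points assigned to each centroid.
        new_centroids = []
        for j, old in enumerate(centroids):
            total = sum(w for w, l in zip(weights, labels) if l == j)
            if total == 0:
                new_centroids.append(old)  # keep an empty cluster where it was
                continue
            cx = sum(w * p[0] for p, w, l in zip(points, weights, labels) if l == j) / total
            cy = sum(w * p[1] for p, w, l in zip(points, weights, labels) if l == j) / total
            new_centroids.append((cx, cy))
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels
```

A point with a larger weight pulls its centroid closer, which is how a higher-confidence location selection would influence the resulting cluster centers in a scheme of this kind.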

Submitted 7 June, 2022; originally announced June 2022.

Comments: 23 pages, 8 figures. Submitted to Sustainability

Journal ref: MDPI Sustainability. 2022; 14(12):7485

arXiv:2205.11224 [pdf]

Around View Monitoring System for Hydraulic Excavators

Authors: Dong Jun Yeom, Yu Na Hong, Yoojun Kim, Hyun Seok Yoo, Youngsuk Kim

Abstract: This paper describes the Around View Monitoring (AVM) system for hydraulic excavators that prevents the safety accidents caused by blind spots and increases the operational efficiency. To verify the developed system, experiments were conducted with its prototype. The experimental results demonstrate its applicability in the field with the following values: 7m of a visual range, 15fps of image refresh rate, 300ms of working information data reception rate, and 300ms of surface condition data reception rate.

Submitted 4 April, 2022; originally announced May 2022.

Comments: 9 pages, 11 figures

Journal ref: The 7th International Conference on Construction Engineering and Project Management (ICCEPM 2017), Oct. 27-30, 2017, Chengdu, China

arXiv:2204.04376 [pdf, ps, other]

Small-Gain Theorem for Safety Verification under High-Relative-Degree Constraints

Authors: Ziliang Lyu, Xiangru Xu, Yiguang Hong

Abstract: This paper develops a small-gain technique for the safety analysis and verification of interconnected systems with high-relative-degree safety constraints. In this technique, input-to-state safety (ISSf) is used to characterize how the safety of a subsystem is influenced by the external input, and ISSf-barrier functions (ISSf-BFs) with high relative degree are employed to capture the safety of subsystems. With a coordination transform, the relationship between ISSf-BFs and the existing high-relative-degree (or high-order) barrier functions is established in order to simplify the ISSf analysis. With the help of high-relative-degree ISSf-BFs, a small-gain theorem is proposed for safety verification. It is shown that, under the small-gain condition, i) the interconnection of ISSf subsystems is still ISSf; and ii) the overall interconnected system is input-to-state stable (ISS) with respect to the compositional safe set. The effectiveness of the proposed small-gain theorem is illustrated on the output-constrained decentralized control of two inverted pendulums connected by a spring mounted on two carts.

Submitted 8 April, 2022; originally announced April 2022.

arXiv:2203.04215 [pdf, other]

doi 10.1109/TAC.2022.3225472

Multi-agent consensus over time-invariant and time-varying signed digraphs via eventual positivity

Authors: Angela Fontan, Lingfei Wang, Yiguang Hong, Guodong Shi, Claudio Altafini

Abstract: Laplacian dynamics on signed digraphs have a richer behavior than those on nonnegative digraphs. In particular, for the so-called "repelling" signed Laplacians, the marginal stability property (needed to achieve consensus) is not guaranteed a priori and, even when it holds, it does not automatically lead to consensus, as these signed Laplacians may lose rank even in strongly connected digraphs. Furthermore, in the time-varying case, instability can occur even when switching in a family of systems each of which corresponds to a marginally stable signed Laplacian with the correct corank. In this paper we present conditions guaranteeing consensus of these signed Laplacians based on the property of eventual positivity, a Perron-Frobenius type of property for signed matrices. The conditions cover both time-invariant and time-varying cases. A particularly simple sufficient condition valid in both cases is that the Laplacians are normal matrices. Such condition can be relaxed in several ways. For instance in the time-invariant case it is enough that the Laplacian has this Perron-Frobenius property on the right but not on the left side (i.e., on the transpose). For the time-varying case, convergence to consensus can be guaranteed by the existence of a common Lyapunov function for all the signed Laplacians. All conditions can be easily extended to bipartite consensus.

Submitted 8 March, 2022; originally announced March 2022.

Comments: 16 pages, 1 figure

Journal ref: IEEE Transactions on Automatic Control, 2023

arXiv:2202.05463 [pdf]

Infrastructure-enabled GPS Spoofing Detection and Correction

Authors: Feilong Wang, Yuan Hong, Jeff Ban

Abstract: Accurate and robust localization is crucial for supporting high-level driving automation and safety. Modern localization solutions rely on various sensors, among which GPS has been and will continue to be essential. However, GPS can be vulnerable to malicious attacks and GPS spoofing has been identified as a high threat. With transportation infrastructure becoming increasingly important in supporting emerging vehicle technologies and systems, this study explores the potential of applying infrastructure data for defending against GPS spoofing. We propose an infrastructure-enabled framework using roadside units as an independent, secured data source. A real-time detector, based on the Isolation Forest, is constructed to detect GPS spoofing. Once spoofing is detected, GPS measurements are isolated, and the potentially compromised location estimator is corrected using secure infrastructure data. We test the proposed method using both simulation and real-world data and show its effectiveness in defending against various GPS spoofing attacks, including stealthy attacks that are proposed to fail the production-grade autonomous driving systems.

Submitted 14 March, 2023; v1 submitted 11 February, 2022; originally announced February 2022.

Comments: Add more results in experiment study, etc. IEEE T-ITS submission

arXiv:2201.12702 [pdf, ps, other]

Robotic Wireless Energy Transfer in Dynamic Environments: System Design and Experimental Validation

Authors: Shuai Wang, Ruihua Han, Yuncong Hong, Qi Hao, Miaowen Wen, Leila Musavian, Shahid Mumtaz, Derrick Wing Kwan Ng

Abstract: Wireless energy transfer (WET) is a ground-breaking technology for cutting the last wire between mobile sensors and power grids in smart cities. Yet, WET only offers effective transmission of energy over a short distance. Robotic WET is an emerging paradigm that mounts the energy transmitter on a mobile robot and navigates the robot through different regions in a large area to charge remote energy harvesters. However, it is challenging to determine the robotic charging strategy in an unknown and dynamic environment due to the uncertainty of obstacles. This paper proposes a hardware-in-the-loop joint optimization framework that offers three distinctive features: 1) efficient model updates and re-optimization based on the last-round experimental data; 2) iterative refinement of the anchor list for adaptation to different environments; 3) verification of algorithms in a high-fidelity Gazebo simulator and a multi-robot testbed. Experimental results show that the proposed framework significantly saves the WET mission completion time while satisfying collision avoidance and energy harvesting constraints.

Submitted 10 February, 2022; v1 submitted 29 January, 2022; originally announced January 2022.

Comments: single column, 18 pages, 6 figures, to appear in IEEE Communications Magazine

Journal ref: IEEE Communications Magazine, Mar. 2022

arXiv:2112.09135 [pdf, other]

ASC-Net: Unsupervised Medical Anomaly Segmentation Using an Adversarial-based Selective Cutting Network

Authors: Raunak Dey, Wenbo Sun, Haibo Xu, Yi Hong

Abstract: In this paper we consider the problem of unsupervised anomaly segmentation in medical images, which has attracted increasing attention in recent years due to the expensive pixel-level annotations from experts and the existence of a large amount of unannotated normal and abnormal image scans. We introduce a segmentation network that utilizes adversarial learning to partition an image into two cuts, with one of them falling into a reference distribution provided by the user. This Adversarial-based Selective Cutting network (ASC-Net) bridges the two domains of cluster-based deep segmentation and adversarial-based anomaly/novelty detection algorithms. Our ASC-Net learns from normal and abnormal medical scans to segment anomalies in medical scans without any masks for supervision. We evaluate this unsupervised anomaly segmentation model on three public datasets, i.e., BraTS 2019 for brain tumor segmentation, LiTS for liver lesion segmentation, and MS-SEG 2015 for brain lesion segmentation, and also on a private dataset for brain tumor segmentation. Compared to existing methods, our model demonstrates tremendous performance gains in unsupervised anomaly segmentation tasks. Although there is still room to further improve performance compared to supervised learning algorithms, the promising experimental results and interesting observations shed light on building an unsupervised learning algorithm for medical anomaly identification using user-defined knowledge.

Submitted 16 December, 2021; originally announced December 2021.

Comments: Currently in submission to Medical Image Analysis Journal. Extension of DOI - 10.1007/978-3-030-87240-3_23 with more details, experiments, and in-depth analysis. arXiv admin note: substantial text overlap with arXiv:2103.03664

arXiv:2111.11650 [pdf, ps, other]

doi 10.1109/JIOT.2022.3163396

Aerial Intelligent Reflecting Surface Enabled Terahertz Covert Communications in Beyond-5G Internet of Things

Authors: Milad Tatar Mamaghani, Yi Hong

Abstract: Unmanned aerial vehicles (UAVs) are envisioned to be extensively employed for assisting wireless communications in Internet of Things (IoT) applications. On the other hand, terahertz (THz) enabled intelligent reflecting surface (IRS) is expected to be one of the core enabling technologies for forthcoming beyond-5G wireless communications that promise a broad range of data-demand applications. In this paper, we propose a UAV-mounted IRS (UIRS) communication system over THz bands for confidential data dissemination from an access point (AP) towards multiple ground user equipments (UEs) in IoT networks. Specifically, the AP intends to send data to the scheduled UE, while unscheduled UEs may pose potential adversaries. To protect information messages and the privacy of the scheduled UE, we aim to devise an energy-efficient multi-UAV covert communication scheme, where the UIRS is for reliable data transmissions, and an extra UAV is utilized as a cooperative jammer generating artificial noise (AN) to degrade unscheduled UEs detection. We then formulate a novel minimum average energy efficiency (mAEE) optimization problem, targeting to improve the covert throughput and reduce UAVs' propulsion energy consumption subject to the covertness requirement, which is determined analytically. Since the optimization problem is non-convex, we tackle it via the block successive convex approximation (BSCA) approach to iteratively solve a sequence of approximated convex sub-problems, designing the binary user scheduling, AP's power allocation, maximum AN jamming power, IRS beamforming, and both UAVs' trajectory planning. Finally, we present a low-complex overall algorithm for system performance enhancement with complexity and convergence analysis. Numerical results are provided to verify our analysis and demonstrate significant outperformance of our design over other existing benchmark schemes.

Submitted 28 January, 2022; v1 submitted 22 November, 2021; originally announced November 2021.

Comments: 23 pages, 14 figures, submitted for possible journal publication

arXiv:2111.04738 [pdf]

doi 10.3390/jimaging8080213

HEROHE Challenge: assessing HER2 status in breast cancer without immunohistochemistry or in situ hybridization

Authors: Eduardo Conde-Sousa, João Vale, Ming Feng, Kele Xu, Yin Wang, Vincenzo Della Mea, David La Barbera, Ehsan Montahaei, Mahdieh Soleymani Baghshah, Andreas Turzynski, Jacob Gildenblat, Eldad Klaiman, Yiyu Hong, Guilherme Aresta, Teresa Araújo, Paulo Aguiar, Catarina Eloy, António Polónia

Abstract: Breast cancer is the most common malignancy in women, being responsible for more than half a million deaths every year. As such, early and accurate diagnosis is of paramount importance. Human expertise is required to diagnose and correctly classify breast cancer and define appropriate therapy, which depends on the evaluation of the expression of different biomarkers such as the transmembrane protein receptor HER2. This evaluation requires several steps, including special techniques such as immunohistochemistry or in situ hybridization to assess HER2 status. With the goal of reducing the number of steps and human bias in diagnosis, the HEROHE Challenge was organized, as a parallel event of the 16th European Congress on Digital Pathology, aiming to automate the assessment of the HER2 status based only on hematoxylin and eosin stained tissue sample of invasive breast cancer. Methods to assess HER2 status were presented by 21 teams worldwide and the results achieved by some of the proposed methods open potential perspectives to advance the state-of-the-art.

Submitted 8 November, 2021; originally announced November 2021.

arXiv:2110.06648 [pdf, other]

Robotic Autonomous Trolley Collection with Progressive Perception and Nonlinear Model Predictive Control

Authors: Anxing Xiao, Hao Luan, Ziqi Zhao, Yue Hong, Jieting Zhao, Weinan Chen, Jiankun Wang, Max Q. -H. Meng

Abstract: Autonomous mobile manipulation robots that can collect trolleys are widely used to liberate human resources and fight epidemics. Most prior robotic trolley collection solutions only detect trolleys with 2D poses or are merely based on specific marks and lack the formal design of planning algorithms. In this paper, we present a novel mobile manipulation system with applications in luggage trolley collection. The proposed system integrates a compact hardware design and a progressive perception and planning framework, enabling the system to efficiently and robustly collect trolleys in dynamic and complex environments. For the perception, we first develop a 3D trolley detection method that combines object detection and keypoint estimation. Then, a docking process in a short distance is achieved with an accurate point cloud plane detection method and a novel manipulator design. On the planning side, we formulate the robot's motion planning under a nonlinear model predictive control framework with control barrier functions to improve obstacle avoidance capabilities while maintaining the target in the sensors' field of view at close distances. We demonstrate our design and framework by deploying the system on actual trolley collection tasks, and their effectiveness and robustness are experimentally validated.

Submitted 1 March, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

Comments: Accepted to the 2022 International Conference on Robotics and Automation (ICRA 2022)

arXiv:2107.09896 [pdf, ps, other]

doi 10.1109/TVT.2022.3150011

Terahertz Meets Untrusted UAV-Relaying: Minimum Secrecy Energy Efficiency Maximization via Trajectory and Communication Co-design

Authors: Milad Tatar Mamaghani, Yi Hong

Abstract: Unmanned aerial vehicles (UAVs) and Terahertz (THz) technology are envisioned to play paramount roles in next-generation wireless communications. In this paper, we present a novel secure UAV-assisted mobile relaying system operating at THz bands for data acquisition from multiple ground user equipments (UEs) towards a destination. We assume that the UAV-mounted relay may act, besides providing relaying services, as a potential eavesdropper called the untrusted UAV-relay (UUR). To safeguard end-to-end communications, we present a secure two-phase transmission strategy with cooperative jamming. Then, we devise an optimization framework in terms of a new measure $-$ secrecy energy efficiency (SEE), defined as the ratio of achievable average secrecy rate to average system power consumption, which enables us to obtain the best possible security level while taking UUR's inherent flight power limitation into account. For the sake of quality of service fairness amongst all the UEs, we aim to maximize the minimum SEE (MSEE) performance via the joint design of key system parameters, including UUR's trajectory and velocity, communication scheduling, and network power allocation. Since the formulated problem is a mixed-integer nonconvex optimization and computationally intractable, we decouple it into four subproblems and propose alternative algorithms to solve it efficiently via greedy/sequential block successive convex approximation and non-linear fractional programming techniques. Numerical results demonstrate significant MSEE performance improvement of our designs compared to other known benchmarks.

Submitted 7 February, 2022; v1 submitted 21 July, 2021; originally announced July 2021.

Comments: 16 pages, 10 figures, Accepted by (to appear in) the IEEE Transactions on Vehicular Technology

arXiv:2106.02810 [pdf, other]

doi 10.21437/Interspeech.2021-1341

An Attribute-Aligned Strategy for Learning Speech Representation

Authors: Yu-Lin Huang, Bo-Hao Su, Y. -W. Peter Hong, Chi-Chun Lee

Abstract: Advancement in speech technology has brought convenience to our life. However, the concern is on the rise as speech signal contains multiple personal attributes, which would lead to either sensitive information leakage or bias toward decision. In this work, we propose an attribute-aligned learning strategy to derive speech representation that can flexibly address these issues by attribute-selection mechanism. Specifically, we propose a layered-representation variational autoencoder (LR-VAE), which factorizes speech representation into attribute-sensitive nodes, to derive an identity-free representation for speech emotion recognition (SER), and an emotionless representation for speaker verification (SV). Our proposed method achieves competitive performances on identity-free SER and a better performance on emotionless SV, comparing to the current state-of-the-art method of using adversarial learning applied on a large emotion corpora, the MSP-Podcast. Also, our proposed learning strategy reduces the model and training process needed to achieve multiple privacy-preserving tasks.

Submitted 8 September, 2021; v1 submitted 5 June, 2021; originally announced June 2021.

Comments: 5 pages, 2 figures; Accepted in Interspeech 2021

Journal ref: Proceedings of INTERSPEECH 2021

arXiv:2106.02514 [pdf, other]

The Image Local Autoregressive Transformer

Authors: Chenjie Cao, Yuxin Hong, Xiang Li, Chengrong Wang, Chengming Xu, XiangYang Xue, Yanwei Fu

Abstract: Recently, AutoRegressive (AR) models for the whole image generation empowered by transformers have achieved comparable or even better performance to Generative Adversarial Networks (GANs). Unfortunately, directly applying such AR models to edit/change local image regions, may suffer from the problems of missing global information, slow inference speed, and information leakage of local guidance. To address these limitations, we propose a novel model -- image Local Autoregressive Transformer (iLAT), to better facilitate the locally guided image synthesis. Our iLAT learns the novel local discrete representations, by the newly proposed local autoregressive (LA) transformer of the attention mask and convolution mechanism. Thus iLAT can efficiently synthesize the local image regions by key guidance information. Our iLAT is evaluated on various locally guided image syntheses, such as pose-guided person image synthesis and face editing. Both the quantitative and qualitative results show the efficacy of our model.

Submitted 18 October, 2021; v1 submitted 4 June, 2021; originally announced June 2021.

Comments: Accepted by NeurIPS2021

arXiv:2105.11361 [pdf, other]

DDR-Net: Dividing and Downsampling Mixed Network for Diffeomorphic Image Registration

Authors: Ankita Joshi, Yi Hong

Abstract: Deep diffeomorphic registration faces significant challenges for high-dimensional images, especially in terms of memory limits. Existing approaches either downsample original images, or approximate underlying transformations, or reduce model size. The information loss during the approximation or insufficient model capacity is a hindrance to the registration accuracy for high-dimensional images, e.g., 3D medical volumes. In this paper, we propose a Dividing and Downsampling mixed Registration network (DDR-Net), a general architecture that preserves most of the image information at multiple scales. DDR-Net leverages the global context via downsampling the input and utilizes the local details from divided chunks of the input images. This design reduces the network input size and its memory cost; meanwhile, by fusing global and local information, DDR-Net obtains both coarse-level and fine-level alignments in the final deformation fields. We evaluate DDR-Net on three public datasets, i.e., OASIS, IBSR18, and 3DIRCADB-01, and the experimental results demonstrate our approach outperforms existing approaches.

Submitted 24 May, 2021; originally announced May 2021.

arXiv:2105.04508 [pdf, other]

MDA-Net: Multi-Dimensional Attention-Based Neural Network for 3D Image Segmentation

Authors: Rutu Gandhi, Yi Hong

Abstract: Segmenting an entire 3D image often has high computational complexity and requires large memory consumption; by contrast, performing volumetric segmentation in a slice-by-slice manner is efficient but does not fully leverage the 3D data. To address this challenge, we propose a multi-dimensional attention network (MDA-Net) to efficiently integrate slice-wise, spatial, and channel-wise attention into a U-Net based network, which results in high segmentation accuracy with a low computational cost. We evaluate our model on the MICCAI iSeg and IBSR datasets, and the experimental results demonstrate consistent improvements over existing methods.

Submitted 10 May, 2021; originally announced May 2021.

arXiv:2104.10338 [pdf, other]

Shadow Generation for Composite Image in Real-world Scenes

Authors: Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang

Abstract: Image composition targets at inserting a foreground object into a background image. Most previous image composition methods focus on adjusting the foreground to make it compatible with background while ignoring the shadow effect of foreground on the background. In this work, we focus on generating plausible shadow for the foreground object in the composite image. First, we contribute a real-world shadow generation dataset DESOBA by generating synthetic composite images based on paired real images and deshadowed images. Then, we propose a novel shadow generation network SGRNet, which consists of a shadow mask prediction stage and a shadow filling stage. In the shadow mask prediction stage, foreground and background information are thoroughly interacted to generate foreground shadow mask. In the shadow filling stage, shadow parameters are predicted to fill the shadow area. Extensive experiments on our DESOBA dataset and real composite images demonstrate the effectiveness of our proposed method. Our dataset and code are available at https://github.com/bcmi/Object-Shadow-Generation-Dataset-DESOBA.

Submitted 11 May, 2022; v1 submitted 20 April, 2021; originally announced April 2021.

Comments: This paper is accepted by AAAI 2022

arXiv:2104.05939 [pdf, ps, other]

doi 10.1109/TWC.2021.3088479

Orthogonal Time Sequency Multiplexing Modulation: Analysis and Low-Complexity Receiver Design

Authors: Tharaj Thaj, Emanuele Viterbo, Yi Hong

Abstract: This paper proposes orthogonal time sequency multiplexing (OTSM), a novel single carrier modulation scheme that places information symbols in the delay-sequency domain followed by a cascade of time-division multiplexing (TDM) and Walsh-Hadamard sequence multiplexing. Thanks to the Walsh Hadamard transform (WHT), the modulation and demodulation do not require complex domain multiplications. For the proposed OTSM, we first derive the input-output relation in the delay-sequency domain and present a low complexity detection method taking advantage of zero-padding. We demonstrate via simulations that OTSM offers high performance gains over orthogonal frequency division multiplexing (OFDM) and similar performance to orthogonal time frequency space (OTFS), but at lower complexity owing to WHT. Then we propose a low complexity time-domain channel estimation method. Finally, we show how to include an outer error control code and a turbo decoder to improve error performance of the coded system.

Submitted 13 April, 2021; originally announced April 2021.

arXiv:2103.05860 [pdf, other]

doi 10.1364/OPTICA.408657

Single-photon imaging over 200 km

Authors: Zheng-Ping Li, Jun-Tian Ye, Xin Huang, Peng-Yu Jiang, Yuan Cao, Yu Hong, Chao Yu, Jun Zhang, Qiang Zhang, Cheng-Zhi Peng, Feihu Xu, Jian-Wei Pan

Abstract: Long-range active imaging has widespread applications in remote sensing and target recognition. Single-photon light detection and ranging (lidar) has been shown to have high sensitivity and temporal resolution. On the application front, however, the operating range of practical single-photon lidar systems is limited to about tens of kilometers over the Earth's atmosphere, mainly due to the weak echo signal mixed with high background noise. Here, we present a compact coaxial single-photon lidar system capable of realizing 3D imaging at up to 201.5 km. It is achieved by using high-efficiency optical devices for collection and detection, and what we believe is a new noise-suppression technique that is efficient for long-range applications. We show that photon-efficient computational algorithms enable accurate 3D imaging over hundreds of kilometers with as few as 0.44 signal photons per pixel. The results represent a significant step toward practical, low-power lidar over extra-long ranges.

Submitted 9 March, 2021; originally announced March 2021.

Comments: 6 pages, 6 figures

Journal ref: Having been published in Optica 8, 344-349 (2021)

arXiv:2103.03664 [pdf, other]

doi 10.1007/978-3-030-87240-3_23

ASC-Net : Adversarial-based Selective Network for Unsupervised Anomaly Segmentation

Authors: Raunak Dey, Yi Hong

Abstract: We introduce a neural network framework, utilizing adversarial learning to partition an image into two cuts, with one cut falling into a reference distribution provided by the user. This concept tackles the task of unsupervised anomaly segmentation, which has attracted increasing attention in recent years due to its broad applications in tasks with unlabelled data. This Adversarial-based Selective Cutting network (ASC-Net) bridges the two domains of cluster-based deep learning methods and adversarial-based anomaly/novelty detection algorithms. We evaluate this unsupervised learning model on BraTS brain tumor segmentation, LiTS liver lesion segmentation, and MS-SEG2015 segmentation tasks. Compared to existing methods like the AnoGAN family, our model demonstrates tremendous performance gains in unsupervised anomaly segmentation tasks. Although there is still room to further improve performance compared to supervised learning algorithms, the promising experimental results shed light on building an unsupervised learning algorithm using user-defined knowledge.

Submitted 9 July, 2021; v1 submitted 5 March, 2021; originally announced March 2021.

Comments: Accepted for MICCAI 2021

arXiv:2101.09433 [pdf]

A Pressure Ulcer Care System For Remote Medical Assistance: Residual U-Net with an Attention Model Based for Wound Area Segmentation

Authors: Jinyeong Chae, Ki Yong Hong, Jihie Kim

Abstract: Increasing numbers of patients with disabilities or elderly people with mobility issues often suffer from a pressure ulcer. The affected areas need regular checks, but these patients have difficulty accessing a hospital. Some remote diagnosis systems are being used for them, but there are limitations in checking a patient's status regularly. In this paper, we present a remote medical assistant that can help pressure ulcer management with image processing techniques. The proposed system includes a mobile application with a deep learning model for wound segmentation and analysis. As there are not enough data to train the deep learning model, we make use of a pretrained model from a relevant domain and data augmentation that is appropriate for this task. First, an image preprocessing method using bilinear interpolation is used to resize and normalize the images. Second, for data augmentation, we use rotation, reflection, and a watershed algorithm. Third, we use a pretrained deep learning model generated from skin wound images similar to pressure ulcer images. Finally, we add an attention module that can provide hints on the pressure ulcer image features. The resulting model provides an accuracy of 99.0%, an intersection over union (IoU) of 99.99%, and a dice similarity coefficient (DSC) of 93.4% for pressure ulcer segmentation, which is better than existing results.

Submitted 15 April, 2021; v1 submitted 23 January, 2021; originally announced January 2021.

Comments: Accepted by AAAI 2021 Workshop

arXiv:2011.09173 [pdf, ps, other]

doi 10.1016/j.automatica.2022.110178

Small-Gain Theorem for Safety Verification of Interconnected Systems

Authors: Ziliang Lyu, Xiangru Xu, Yiguang Hong

Abstract: A small-gain theorem in the formulation of barrier function is developed in this work for safety verification of interconnected systems. This result is helpful to verify input-to-state safety (ISSf) of the overall system from the safety information encoded in the subsystem's ISSf-barrier function. Also, it can be used to obtain a safety set in a higher dimensional space from the safety sets in two lower dimensional spaces.

Submitted 18 November, 2020; originally announced November 2020.

arXiv:2008.09352 [pdf, other]

Deep Learning Methods for Lung Cancer Segmentation in Whole-slide Histopathology Images -- the ACDC@LungHP Challenge 2019

Authors: Zhang Li, Jiehua Zhang, Tao Tan, Xichao Teng, Xiaoliang Sun, Yang Li, Lihong Liu, Yang Xiao, Byungjae Lee, Yilong Li, Qianni Zhang, Shujiao Sun, Yushan Zheng, Junyu Yan, Ni Li, Yiyu Hong, Junsu Ko, Hyun Jung, Yanling Liu, Yu-cheng Chen, Ching-wei Wang, Vladimir Yurovskiy, Pavel Maevskikh, Vahid Khanagha, Yi Jiang , et al. (8 additional authors not shown)

Abstract: Accurate segmentation of lung cancer in pathology slides is a critical step in improving patient care. We proposed the ACDC@LungHP (Automatic Cancer Detection and Classification in Whole-slide Lung Histopathology) challenge for evaluating different computer-aided diagnosis (CAD) methods on the automatic diagnosis of lung cancer. The ACDC@LungHP 2019 focused on segmentation (pixel-wise detection) of cancer tissue in whole slide imaging (WSI), using an annotated dataset of 150 training images and 50 test images from 200 patients. This paper reviews this challenge and summarizes the top 10 submitted methods for lung cancer segmentation. All methods were evaluated using the false positive rate, false negative rate, and DICE coefficient (DC). The DC ranged from 0.7354$\pm$0.1149 to 0.8372$\pm$0.0858. The DC of the best method was close to the inter-observer agreement (0.8398$\pm$0.0890). All methods were based on deep learning and categorized into two groups: multi-model methods and single-model methods. In general, multi-model methods were significantly better ($\textit{p}$<$0.01$) than single-model methods, with mean DC of 0.7966 and 0.7544, respectively. Deep learning based methods could potentially help pathologists find suspicious regions for further analysis of lung cancer in WSI.

Submitted 21 August, 2020; originally announced August 2020.
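
For readers unfamiliar with the DICE coefficient used as the evaluation metric above, the following is a minimal sketch of how it can be computed for a pair of binary masks, alongside intersection over union. The function name `dice_and_iou` and the flat 0/1 input format are assumptions for illustration; the challenge's actual evaluation code is not described in the abstract.

```python
def dice_and_iou(pred, truth):
    """Return (Dice coefficient, IoU) for two equal-length flat binary masks.

    Illustrative sketch only: masks are sequences of 0/1 labels, one per pixel.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_area = sum(1 for p in pred if p)
    truth_area = sum(1 for t in truth if t)
    union = pred_area + truth_area - intersection
    if union == 0:
        # Both masks are empty; treat this as perfect agreement (a convention).
        return 1.0, 1.0
    dice = 2.0 * intersection / (pred_area + truth_area)
    iou = intersection / union
    return dice, iou


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    annotated = [1, 0, 1, 0]
    print(dice_and_iou(predicted, annotated))
```

The two metrics are monotonically related for a single mask pair (Dice = 2·IoU / (1 + IoU)), so they rank individual predictions identically even though their numeric values differ.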

arXiv:2008.06208 [pdf]

Adaptable Multi-Domain Language Model for Transformer ASR

Authors: Taewoo Lee, Min-Joong Lee, Tae Gyoon Kang, Seokyeoung Jung, Minseok Kwon, Yeona Hong, Jungin Lee, Kyoung-Gu Woo, Ho-Gyeong Kim, Jiseung Jeong, Jihyun Lee, Hosik Lee, Young Sang Choi

Abstract: We propose an adapter-based multi-domain Transformer language model (LM) for Transformer ASR. The model consists of a large common LM and small adapters. The model can perform multi-domain adaptation with only the small adapters and their related layers. The proposed model can reuse the fully fine-tuned LM, which is fine-tuned using all layers of an original model. The proposed LM can be expanded to new domains by adding about 2% of the parameters for the first domain and 13% of the parameters from the second domain onward. The proposed model is also effective in reducing the model maintenance cost because it is possible to omit the costly and time-consuming common LM pre-training process. Using the proposed adapter-based approach, we observed that a general LM with an adapter can outperform a dedicated music domain LM in terms of word error rate (WER).

Submitted 10 February, 2021; v1 submitted 14 August, 2020; originally announced August 2020.

Comments: This paper is accepted for presentation at IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP), 2021

arXiv:2007.16039 [pdf]

Data-driven Inverter-based Volt/VAr Control for Partially Observable Distribution Networks

Authors: Tong Xu, Wenchuan Wu, Yiwen Hong, Junjie Yu, Fazhong Zhang

Abstract: For active distribution networks (ADNs) integrated with massive inverter-based energy resources, it is impractical to maintain an accurate model and deploy measurements at all nodes due to the large scale of ADNs. Thus, current models of ADNs usually involve significant errors or are even unknown. Moreover, ADNs are usually partially observable since only a few measurements are available at pilot nodes or nodes with significant users. To provide a practical Volt/Var control (VVC) strategy for such networks, a data-driven VVC method is proposed in this paper. Firstly, the system response policy, approximating the relationship between the control variables and states of monitoring nodes, is estimated by a recursive regression closed-form solution. Then, based on real-time measurements and the newly updated system response policy, a VVC strategy with convergence guarantee is realized. Since the recursive regression solution is embedded in the control stage, a data-driven closed-loop VVC framework is established. The effectiveness of the proposed method is validated in an unbalanced distribution system considering nonlinear loads, where not only rapid and self-adaptive voltage regulation is realized but also system-wide optimization is achieved.

Submitted 17 June, 2021; v1 submitted 31 July, 2020; originally announced July 2020.

Comments: accepted by CSEE Journal of Power and Energy Systems and published in June 2021

arXiv:2004.12592 [pdf, other]

Robust Screening of COVID-19 from Chest X-ray via Discriminative Cost-Sensitive Learning

Authors: Tianyang Li, Zhongyi Han, Benzheng Wei, Yuanjie Zheng, Yanfei Hong, Jinyu Cong

Abstract: This paper addresses the new problem of automated screening of coronavirus disease 2019 (COVID-19) based on chest X-rays, which is urgently demanded toward fast stopping the pandemic. However, robust and accurate screening of COVID-19 from chest X-rays is still a globally recognized challenge because of two bottlenecks: 1) imaging features of COVID-19 share some similarities with other pneumonia on chest X-rays, and 2) the misdiagnosis rate of COVID-19 is very high, and the misdiagnosis cost is expensive. While a few pioneering works have made much progress, they underestimate both crucial bottlenecks. In this paper, we report our solution, discriminative cost-sensitive learning (DCSL), which should be the choice if the clinic needs assisted screening of COVID-19 from chest X-rays. DCSL combines both advantages from fine-grained classification and cost-sensitive learning. Firstly, DCSL develops a conditional center loss that learns deep discriminative representation. Secondly, DCSL establishes score-level cost-sensitive learning that can adaptively enlarge the cost of misclassifying COVID-19 examples into other classes. DCSL is so flexible that it can apply in any deep neural network. We collected a large-scale multi-class dataset comprised of 2,239 chest X-ray examples: 239 examples from confirmed COVID-19 cases, 1,000 examples with confirmed bacterial or viral pneumonia cases, and 1,000 examples of healthy people. Extensive experiments on the three-class classification show that our algorithm remarkably outperforms state-of-the-art algorithms. It achieves an accuracy of 97.01%, a precision of 97%, a sensitivity of 97.09%, and an F1-score of 96.98%. These results endow our algorithm as an efficient tool for the fast large-scale screening of COVID-19.

Submitted 21 May, 2020; v1 submitted 27 April, 2020; originally announced April 2020.

Comments: Under review

arXiv:2003.03497 [pdf, other]

MatchingGAN: Matching-based Few-shot Image Generation

Authors: Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang

Abstract: To generate new images for a given category, most deep generative models require abundant training images from this category, which are often too expensive to acquire. To achieve the goal of generation based on only a few images, we propose a matching-based Generative Adversarial Network (GAN) for few-shot generation, which includes a matching generator and a matching discriminator. The matching generator can match random vectors with a few conditional images from the same category and generate new images for this category based on the fused features. The matching discriminator extends the conventional GAN discriminator by matching the feature of the generated image with the fused feature of the conditional images. Extensive experiments on three datasets demonstrate the effectiveness of our proposed method.

Submitted 14 March, 2020; v1 submitted 6 March, 2020; originally announced March 2020.

Comments: This paper is accepted for oral presentation at ICME 2020(http://www.2020.ieeeicme.org/)

arXiv:2002.10908 [pdf, other]

Multifold Acceleration of Diffusion MRI via Slice-Interleaved Diffusion Encoding (SIDE)

Authors: Yoonmi Hong, Wei-Tang Chang, Geng Chen, Ye Wu, Weili Lin, Dinggang Shen, Pew-Thian Yap

Abstract: Diffusion MRI (dMRI) is a unique imaging technique for in vivo characterization of tissue microstructure and white matter pathways. However, its relatively long acquisition time implies greater motion artifacts when imaging, for example, infants and Parkinson's disease patients. To accelerate dMRI acquisition, we propose in this paper (i) a diffusion encoding scheme, called Slice-Interleaved Diffusion Encoding (SIDE), that interleaves each diffusion-weighted (DW) image volume with slices that are encoded with different diffusion gradients, essentially allowing the slice-undersampling of the image volume associated with each diffusion gradient to significantly reduce acquisition time, and (ii) a method based on deep learning for effective reconstruction of DW images from the highly slice-undersampled data. Evaluation based on the Human Connectome Project (HCP) dataset indicates that our method can achieve a high acceleration factor of up to 6 with minimal information loss. Evaluation using dMRI data acquired with SIDE acquisition demonstrates that it is possible to accelerate the acquisition by as much as 50-fold when combined with multi-band imaging.

Submitted 25 February, 2020; originally announced February 2020.

arXiv:2001.11450 [pdf, other]

doi 10.1364/OE.383456

Super-resolution single-photon imaging at 8.2 kilometers

Authors: Zheng-Ping Li, Xin Huang, Peng-Yu Jiang, Yu Hong, Chao Yu, Yuan Cao, Jun Zhang, Feihu Xu, Jian-Wei Pan

Abstract: Single-photon light detection and ranging (LiDAR), offering single-photon sensitivity and picosecond time resolution, has been widely adopted for active imaging applications. Long-range active imaging is a great challenge, because the spatial resolution degrades significantly with the imaging range due to the diffraction limit of the optics, and only weak echo signal photons return, mixed with strong background noise. Here we propose and demonstrate a photon-efficient LiDAR approach that can achieve sub-Rayleigh resolution imaging over long ranges. This approach exploits fine sub-pixel scanning and a deconvolution algorithm tailored to this long-range application. Using this approach, we experimentally demonstrated active three-dimensional (3D) single-photon imaging by recognizing different postures of a mannequin model at a stand-off distance of 8.2 km in both daylight and night. The observed spatial (transversal) resolution is about 5.5 cm at 8.2 km, which is about twice of the system's resolution. This also beats the optical system's Rayleigh criterion. The results are valuable for geosciences and target recognition over long ranges.

Submitted 30 January, 2020; originally announced January 2020.

Journal ref: Having been published in Opt. Express 28, 4076-4087 (2020)

arXiv:1912.10753 [pdf, other]

Heterogeneous Hegselmann-Krause Dynamics with Environment and Communication Noise

Authors: Ge Chen, Wei Su, Songyuan Ding, Yiguang Hong

Abstract: The Hegselmann-Krause (HK) model is a well-known opinion dynamics model, attracting a significant amount of interest from a number of fields. However, the heterogeneous HK model is difficult to analyze - even the most basic property of convergence is still open to prove. For the first time, this paper takes into consideration heterogeneous HK models with environment or communication noise. Under environment noise, it has been revealed that the heterogeneous HK model with or without global information has a phase transition for the upper limit of the maximum opinion difference, and has a critical noise amplitude depending on the minimal confidence threshold for quasi-synchronization. In addition, the convergence time to the quasi-synchronization is bounded by a negative exponential distribution. The heterogeneous HK model with global information and communication noise is also analyzed. Finally, for the basic HK model with communication noise, we show that the heterogeneous case exhibits a different behavior regarding quasi-synchronization from the homogeneous case. Interestingly, raising the confidence thresholds of constituent agents may break quasi-synchronization. Our results reveal that the heterogeneity of individuals is harmful to synchronization, which may be the reason why the synchronization of opinions is hard to reach in reality, even within a small group.

Submitted 23 December, 2019; originally announced December 2019.
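
The heterogeneous HK update with environment noise mentioned above can be sketched as a single synchronous step. This is a minimal sketch under stated assumptions, not the paper's exact model: the function name `hk_step`, uniform noise on `[-noise_amp, noise_amp]`, and clipping opinions to `[0, 1]` are all illustrative choices.

```python
import random


def hk_step(opinions, radii, noise_amp=0.0, rng=None):
    """One synchronous heterogeneous Hegselmann-Krause step (illustrative).

    Agent i averages every opinion within its own confidence radius radii[i]
    (itself included), adds uniform noise (an assumption), and the result is
    clipped to [0, 1] (also an assumption).
    """
    if len(opinions) != len(radii):
        raise ValueError("one confidence radius is needed per agent")
    rng = rng or random.Random()
    updated = []
    for x_i, r_i in zip(opinions, radii):
        # Neighbors are agents whose opinion lies within agent i's threshold.
        neighbors = [x_j for x_j in opinions if abs(x_j - x_i) <= r_i]
        average = sum(neighbors) / len(neighbors)
        noise = rng.uniform(-noise_amp, noise_amp) if noise_amp > 0 else 0.0
        updated.append(min(1.0, max(0.0, average + noise)))
    return updated


if __name__ == "__main__":
    state = [0.0, 0.1, 1.0]
    # Heterogeneous thresholds: the middle agent listens to everyone.
    print(hk_step(state, [0.2, 1.0, 0.2]))
    print(hk_step(state, [0.2, 1.0, 0.2], noise_amp=0.05, rng=random.Random(0)))
```

Setting all entries of `radii` equal recovers the homogeneous case, which makes it easy to compare the two regimes the abstract contrasts.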

arXiv:1911.06516 [pdf, ps, other]

doi 10.1109/TVT.2020.2998060

Improving PHY-Security of UAV-Enabled Transmission with Wireless Energy Harvesting: Robust Trajectory Design and Communications Resource Allocation

Authors: Milad Tatar Mamaghani, Yi Hong

Abstract: In this paper, we consider an unmanned aerial vehicle (UAV) assisted communications system, including two cooperative UAVs, a wireless-powered ground destination node leveraging the simultaneous wireless information and power transfer (SWIPT) technique, and a terrestrial passive eavesdropper. One UAV delivers confidential information to the destination and the other sends jamming signals to counter eavesdropping and assist the destination with energy harvesting. Assuming the UAVs have partial information about the eavesdropper's location, we propose two transmission schemes: friendly UAV jamming (FUJ) and Gaussian jamming transmission (GJT) for the cases when jamming signals are known and unknown a priori at the destination, respectively. Then, we formulate an average secrecy rate maximization problem to jointly optimize the transmission power and trajectory of the UAVs, and the power splitting ratio of the destination. Since the formulated problem is non-convex and hence difficult to solve, we propose a computationally efficient iterative algorithm based on block coordinate descent and successive convex approximation to obtain a suboptimal solution. Finally, numerical results are provided to substantiate the effectiveness of our proposed multiple-UAV schemes, compared to other existing benchmarks. Specifically, we find that the FUJ demonstrates significant secrecy performance improvement in terms of the optimal instantaneous and average secrecy rate compared to the GJT and the conventional single-UAV counterpart.

Submitted 12 April, 2020; v1 submitted 15 November, 2019; originally announced November 2019.

Comments: This paper has been accepted by IEEE Transactions on Vehicular Technology

arXiv:1909.04797 [pdf, other]

doi 10.1109/ISBI45749.2020.9098656

Hybrid Cascaded Neural Network for Liver Lesion Segmentation

Authors: Raunak Dey, Yi Hong

Abstract: Automatic liver lesion segmentation is a challenging task while having a significant impact on assisting medical professionals in the designing of effective treatment and planning proper care. In this paper we propose a cascaded system that combines both 2D and 3D convolutional neural networks to effectively segment hepatic lesions. Our 2D network operates on a slice by slice basis to segment the liver and larger tumors, while we use a 3D network to detect small lesions that are often missed in a 2D segmentation design. We employ this algorithm on the LiTS challenge obtaining a Dice score per case of 68.1%, which performs the best among all non pre-trained models and the second best among published methods. We also perform two-fold cross-validation to reveal the over- and under-segmentation issues in the LiTS annotations.

Submitted 8 October, 2019; v1 submitted 10 September, 2019; originally announced September 2019.

arXiv:1906.06867 [pdf, ps, other]

doi 10.1109/ACCESS.2019.2948384

On the Performance of Low-Altitude UAV-Enabled Secure AF Relaying with Cooperative Jamming and SWIPT

Authors: Milad Tatar Mamaghani, Yi Hong

Abstract: This paper proposes a novel cooperative secure unmanned aerial vehicle (UAV) aided transmission protocol, where a source (Alice) sends confidential information to a destination (Bob) via an energy-constrained UAV-mounted amplify-and-forward (AF) relay in the presence of a ground eavesdropper (Eve). We adopt destination-assisted cooperative jamming (CJ) as well as simultaneous wireless information and power transfer (SWIPT) at the UAV-mounted relay to enhance physical-layer security (PLS) and transmission reliability. Assuming a low altitude UAV, we derive connection probability (CP), secrecy outage probability (SOP), instantaneous secrecy rate, and average secrecy rate (ASR) of the proposed protocol over Air-Ground (AG) channels, which are modeled as Rician fading with elevation-angle dependent parameters. By simulations, we verify our theoretical results and demonstrate significant performance improvement of our protocol, when compared to a conventional transmission protocol with ground relaying and a UAV-based transmission protocol without destination-assisted jamming. Finally, we evaluate the impacts of different system parameters and different UAV locations on the proposed protocol in terms of ASR.

Submitted 17 June, 2019; originally announced June 2019.

Comments: 10 pages, 9 figures, Submitted for possible journal publication

arXiv:1906.03975 [pdf, other]

Predicting Global Variations in Outdoor PM2.5 Concentrations using Satellite Images and Deep Convolutional Neural Networks

Authors: Kris Y. Hong, Pedro O. Pinheiro, Scott Weichenthal

Abstract: Here we present a new method of estimating global variations in outdoor PM$_{2.5}$ concentrations using satellite images combined with ground-level measurements and deep convolutional neural networks. Specifically, new deep learning models were trained over the global PM$_{2.5}$ concentration range ($<$1-436 $μ$g/m$^3$) using a large database of satellite images paired with ground level PM$_{2.5}$ measurements available from the World Health Organization. Final model selection was based on a systematic evaluation of well-known architectures for the convolutional base including InceptionV3, Xception, and VGG16. The Xception architecture performed best and the final global model had a root mean square error (RMSE) value of 13.01 $μ$g/m$^3$ (R$^2$=0.75) in the disjoint test set. The predictive performance of our new global model (called IMAGE-PM$_{2.5}$) is similar to the current state-of-the-art model used in the Global Burden of Disease study but relies only on satellite images as input. As a result, the IMAGE-PM$_{2.5}$ model offers a fast, cost-effective means of estimating global variations in long-term average PM$_{2.5}$ concentrations and may be particularly useful for regions without ground monitoring data or detailed emissions inventories. The IMAGE-PM$_{2.5}$ model can be used as a stand-alone method of global exposure estimation or incorporated into more complex hierarchical model structures.

Submitted 1 June, 2019; originally announced June 2019.

Comments: 8 pages, 6 figures, Submitted to Scientific Reports

arXiv:1906.03051 [pdf, other]

doi 10.1007/978-3-030-35817-4_11

DeepBundle: Fiber Bundle Parcellation with Graph Convolution Neural Networks

Authors: Feihong Liu, Jun Feng, Geng Chen, Ye Wu, Yoonmi Hong, Pew-Thian Yap, Dinggang Shen

Abstract: Parcellation of whole-brain tractography streamlines is an important step for tract-based analysis of brain white matter microstructure. Existing fiber parcellation approaches rely on accurate registration between an atlas and the tractograms of an individual; however, due to large individual differences, accurate registration is hard to guarantee in practice. To resolve this issue, we propose a novel deep learning method, called DeepBundle, for registration-free fiber parcellation. Our method utilizes graph convolution neural networks (GCNNs) to predict the parcellation label of each fiber tract. GCNNs are capable of extracting the geometric features of each fiber tract and harnessing the resulting features for accurate fiber parcellation, ultimately avoiding the use of atlases and any registration method. We evaluate DeepBundle using data from the Human Connectome Project. Experimental results demonstrate the advantages of DeepBundle and suggest that the geometric features extracted from each fiber tract can be used to effectively parcellate the fiber tracts.

Submitted 23 December, 2019; v1 submitted 7 June, 2019; originally announced June 2019.

Comments: 8 pages

arXiv:1902.03916 [pdf, ps, other]

Homogeneous and Mixed Energy Communities Discovery with Spatial-Temporal Net Energy

Authors: Shangyu Xie, Han Wang, Shengbin Wang, Haibing Lu, Yuan Hong, Dong Jin, Qi Liu

Abstract: Smart grid has integrated an increasing number of distributed energy resources to improve the efficiency and flexibility of power generation and consumption as well as the resilience of the power grid. The energy consumers on the power grid (e.g., households) equipped with the distributed energy resources can be considered as "microgrids" that both generate and consume electricity. In this paper, we study the energy community discovery problems which identify multiple kinds of energy communities for the microgrids to facilitate energy management (e.g., power supply adjustment, load balancing, energy sharing) on the grid, such as homogeneous energy communities (HECs), mixed energy communities (MECs), and self-sufficient energy communities (SECs). Specifically, we present efficient algorithms to discover such communities of microgrids by taking into account not only their geo-locations but also their net energy over any period. Finally, we experimentally validate the performance of the algorithms using both synthetic and real datasets.

Submitted 27 March, 2019; v1 submitted 8 February, 2019; originally announced February 2019.

Comments: Full Version of the Energy Community Discovery Article

arXiv:1902.03594 [pdf, ps, other]

Max-Min Fair Sensor Scheduling: Game-theoretic Perspective and Algorithmic Solution

Authors: Shuang Wu, Xiaoqiang Ren, Yiguang Hong, Ling Shi

Abstract: We consider the design of a fair sensor schedule for a number of sensors monitoring different linear time-invariant processes. The largest average remote estimation error among all processes is to be minimized. We first consider a general setup for the max-min fair allocation problem. By reformulating the problem as its equivalent form, we transform the fair resource allocation problem into a zero-sum game between a "judge" and a resource allocator. We propose an equilibrium seeking procedure and show that there exists a unique Nash equilibrium in pure strategy for this game. We then apply the result to the sensor scheduling problem and show that the max-min fair sensor scheduling policy can be achieved.

Submitted 18 October, 2019; v1 submitted 10 February, 2019; originally announced February 2019.

Showing 1–50 of 63 results for author: Hong, Y