Search | arXiv e-print repository

Multi-scale Restoration of Missing Data in Optical Time-series Images with Masked Spatial-Temporal Attention Network

Authors: Zaiyan Zhang, Jining Yan, Yuanqi Liang, Jiaxin Feng, Haixu He, Wei Han

Abstract: Due to factors such as thick cloud cover and sensor limitations, remote sensing images often suffer from significant missing data, resulting in incomplete time-series information. Existing methods for imputing missing values in remote sensing images do not fully exploit spatio-temporal auxiliary information, leading to limited accuracy in restoration. Therefore, this paper proposes a novel deep learning-based approach called MS2TAN (Multi-scale Masked Spatial-Temporal Attention Network) for reconstructing time-series remote sensing images. Firstly, we introduce an efficient spatio-temporal feature extractor based on Masked Spatial-Temporal Attention (MSTA) to obtain high-quality representations of the spatio-temporal neighborhood features in the missing regions. Secondly, a Multi-scale Restoration Network consisting of the MSTA-based Feature Extractors is employed to progressively refine the missing values by exploring spatio-temporal neighborhood features at different scales. Thirdly, we propose a "Pixel-Structure-Perception" Multi-Objective Joint Optimization method to enhance the visual effects of the reconstruction results from multiple perspectives and preserve more texture structures. Furthermore, the proposed method reconstructs missing values in all input temporal phases in parallel (i.e., Multi-In Multi-Out), achieving higher processing efficiency. Finally, experimental evaluations on two typical missing data restoration tasks across multiple research areas demonstrate that the proposed method outperforms state-of-the-art methods with an improvement of 0.40 dB/1.17 dB in mean peak signal-to-noise ratio (mPSNR) and 3.77/9.41 thousandths in mean structural similarity (mSSIM), while exhibiting stronger texture and structural consistency.

Submitted 19 June, 2024; originally announced June 2024.
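
The mPSNR figure quoted in the abstract above is commonly computed as the per-image peak signal-to-noise ratio averaged over a test set. A minimal sketch of that standard definition follows, assuming images scaled to [0, 1]. The function names `psnr` and `mean_psnr` and the `data_range` parameter are illustrative and are not taken from the paper, whose exact evaluation protocol is not stated in this listing.

```python
import numpy as np

def psnr(reference, restored, data_range=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    reference = np.asarray(reference, dtype=np.float64)
    restored = np.asarray(restored, dtype=np.float64)
    mse = np.mean((reference - restored) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

def mean_psnr(references, restorations, data_range=1.0):
    """Average PSNR over paired images (assumed meaning of 'mPSNR')."""
    values = [psnr(r, x, data_range) for r, x in zip(references, restorations)]
    return float(np.mean(values))
```

Under this definition, a uniform error of 0.1 on a [0, 1] image gives an MSE of 0.01 and therefore a PSNR of 20 dB, which puts the reported sub-dB and 1 dB gains in context.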

arXiv:2406.08806 [pdf, ps, other]

Adaptive Cooperative Streaming of Holographic Video Over Wireless Networks: A Proximal Policy Optimization Solution

Authors: Wanli Wen, Ji** Yan, Yulu Zhang, Zhen Huang, Liang Liang, Yunjian Jia

Abstract: Adapting holographic video streaming to fluctuating wireless channels is essential to maintain consistent and satisfactory Quality of Experience (QoE) for users, which, however, is a challenging task due to the dynamic and uncertain characteristics of wireless networks. To address this issue, we propose a holographic video cooperative streaming framework designed for a generic wireless network in which multiple access points can cooperatively transmit video with different bitrates to multiple users. Additionally, we model a novel QoE metric tailored specifically for holographic video streaming, which can effectively encapsulate the nuances of holographic video quality, quality fluctuations, and rebuffering occurrences simultaneously. Furthermore, we formulate a formidable QoE maximization problem, which is a non-convex mixed integer nonlinear programming problem. Using proximal policy optimization (PPO), a new class of reinforcement learning algorithms, we devise a joint beamforming and bitrate control scheme, which can be wisely adapted to fluctuations in the wireless channel. The numerical results demonstrate the superiority of the proposed scheme over representative baselines.

Submitted 13 June, 2024; originally announced June 2024.

Comments: This paper has been accepted for publication in IEEE Wireless Communications Letters

arXiv:2404.16727 [pdf, other]

Learning-Based Efficient Approximation of Data-enabled Predictive Control

Authors: Yihan Zhou, Yiwen Lu, Zishuo Li, Jiaqi Yan, Yilin Mo

Abstract: Data-Enabled Predictive Control (DeePC) bypasses the need for system identification by directly leveraging raw data to formulate optimal control policies. However, the size of the optimization problem in DeePC grows linearly with respect to the data size, which prohibits its application due to high computational costs. In this paper, we propose an efficient approximation of DeePC, whose size is invariant with respect to the amount of data collected, via differentiable convex programming. Specifically, the optimization problem in DeePC is decomposed into two parts: a control objective and a scoring function that evaluates the likelihood of a guessed I/O sequence, the latter of which is approximated with a size-invariant learned optimization problem. The proposed method is validated through numerical simulations on a quadruple tank system, illustrating that the learned controller can reduce the computational time of DeePC by 5x while maintaining its control performance.

Submitted 25 April, 2024; originally announced April 2024.
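
The linear growth mentioned in the abstract above can be seen from the Hankel matrix that standard DeePC builds out of recorded data: the decision vector has one entry per column, and the column count is the data length minus the window depth plus one. The sketch below illustrates only that standard construction, not the paper's learned approximation. The name `block_hankel` and its arguments are illustrative assumptions.

```python
import numpy as np

def block_hankel(signal, depth):
    """Stack every length-`depth` window of `signal` as one column.

    `signal` has shape (T,) or (T, m); the result has shape
    (depth * m, T - depth + 1), so the column count grows linearly with T.
    """
    signal = np.asarray(signal, dtype=float)
    if signal.ndim == 1:
        signal = signal[:, None]
    num_samples, width = signal.shape
    num_cols = num_samples - depth + 1
    if num_cols < 1:
        raise ValueError("need at least `depth` samples")
    hankel = np.empty((depth * width, num_cols))
    for j in range(num_cols):
        # Column j is the window starting at sample j, flattened in time order.
        hankel[:, j] = signal[j:j + depth].reshape(-1)
    return hankel
```

For a fixed window depth, doubling the number of recorded samples roughly doubles the number of columns, and with it the number of decision variables in the online optimization.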

arXiv:2404.12097 [pdf, other]

MPC of Uncertain Nonlinear Systems with Meta-Learning for Fast Adaptation of Neural Predictive Models

Authors: Jiaqi Yan, Ankush Chakrabarty, Alisa Rupenyan, John Lygeros

Abstract: In this paper, we consider the problem of reference tracking in uncertain nonlinear systems. A neural State-Space Model (NSSM) is used to approximate the nonlinear system, where a deep encoder network learns the nonlinearity from data, and a state-space component captures the temporal relationship. This transforms the nonlinear system into a linear system in a latent space, enabling the application of model predictive control (MPC) to determine effective control actions. Our objective is to design the optimal controller using limited data from the target system (the system of interest). To this end, we employ an implicit model-agnostic meta-learning (iMAML) framework that leverages information from source systems (systems that share similarities with the target system) to expedite training in the target system and enhance its control performance. The framework consists of two phases: the (offline) meta-training phase learns an aggregated NSSM using data from source systems, and the (online) meta-inference phase quickly adapts this aggregated model to the target system using only a few data points and few online training iterations, based on local loss function gradients. The iMAML algorithm exploits the implicit function theorem to exactly compute the gradient during training, without relying on the entire optimization path. By focusing solely on the optimal solution, rather than the path, we can meta-train with less storage complexity and fewer approximations than other contemporary meta-learning algorithms. We demonstrate through numerical examples that our proposed method can yield accurate predictive models by adaptation, resulting in a downstream MPC that outperforms several baselines.

Submitted 18 April, 2024; originally announced April 2024.

arXiv:2404.11171 [pdf, other]

Personalized Heart Disease Detection via ECG Digital Twin Generation

Authors: Yaojun Hu, Jintai Chen, Lianting Hu, Dantong Li, Jiahuan Yan, Haochao Ying, Huiying Liang, Jian Wu

Abstract: Heart diseases rank among the leading causes of global mortality, demonstrating a crucial need for early diagnosis and intervention. Most traditional electrocardiogram (ECG) based automated diagnosis methods are trained at population level, neglecting the customization of personalized ECGs to enhance individual healthcare management. A potential solution to address this limitation is to employ digital twins to simulate symptoms of diseases in real patients. In this paper, we present an innovative prospective learning approach for personalized heart disease detection, which generates digital twins of healthy individuals' anomalous ECGs and enhances the model sensitivity to the personalized symptoms. In our approach, a vector quantized feature separator is proposed to locate and isolate the disease symptom and normal segments in ECG signals with ECG report guidance. Thus, the ECG digital twins can simulate specific heart diseases used to train a personalized heart disease detection model. Experiments demonstrate that our approach not only excels in generating high-fidelity ECG signals but also improves personalized heart disease detection. Moreover, our approach ensures robust privacy protection, safeguarding patient data in model development.

Submitted 11 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

arXiv:2403.12218 [pdf, other]

Secure Synchronization of Heterogeneous Pulse-Coupled Oscillators

Authors: Jiaqi Yan, Hideaki Ishii

Abstract: In this paper, we consider the synchronization of heterogeneous pulse-coupled oscillators (PCOs), where some of the oscillators might be faulty or malicious. The oscillators interact through identical pulses at discrete instants and evolve continuously with different frequencies otherwise. Despite the presence of misbehaviors, benign oscillators aim to reach synchronization. To achieve this objective, two resilient synchronization protocols are developed in this paper by adapting the real-valued mean-subsequence reduced (MSR) algorithm to pulse-based interactions. The first protocol relies on packet-based communication to transmit absolute frequencies, while the second protocol operates purely with pulses to calculate relative frequencies. In both protocols, each normal oscillator periodically counts the received pulses to detect possible malicious behaviors. By disregarding suspicious pulses from its neighbors, the oscillator updates both its phases and frequencies. The paper establishes sufficient conditions on the initial states and graph structure under which resilient synchronization is achieved in the PCO network. Specifically, the normal oscillators can either detect the presence of malicious nodes or synchronize in both phases and frequencies. Additionally, a comparison between the two algorithms reveals a trade-off between relaxed initial conditions and reduced communication burden.

Submitted 20 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.
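
The real-valued MSR algorithm that the abstract above takes as its starting point discards the most extreme neighbor values before averaging, so that a bounded number of faulty neighbors cannot drag a normal node outside the range of normal values. The sketch below shows one such update with equal weights. It is a generic illustration of the real-valued rule, not the pulse-based protocols of the paper, and the name `msr_update` and the parameter `max_faulty` are assumptions.

```python
def msr_update(own, neighbor_values, max_faulty):
    """One equal-weight MSR step for a single normal node.

    Among neighbor values strictly larger than `own`, the `max_faulty`
    largest are dropped (all of them if there are fewer); the same is done
    for values strictly smaller than `own`. The node then averages its own
    value with the values that remain.
    """
    larger = sorted(v for v in neighbor_values if v > own)
    smaller = sorted(v for v in neighbor_values if v < own)
    equal = [v for v in neighbor_values if v == own]
    kept_larger = larger[:max(len(larger) - max_faulty, 0)]
    kept_smaller = smaller[max_faulty:]
    kept = [own] + equal + kept_smaller + kept_larger
    return sum(kept) / len(kept)
```

For example, with `own = 5`, neighbors `[1, 4, 6, 100]` and `max_faulty = 1`, the values 1 and 100 are discarded and the node averages 5, 4 and 6.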

arXiv:2403.06087 [pdf, other]

Learning the irreversible progression trajectory of Alzheimer's disease

Authors: Yipei Wang, Bing He, Shannon Risacher, Andrew Saykin, Jingwen Yan, Xiaoqian Wang

Abstract: Alzheimer's disease (AD) is a progressive and irreversible brain disorder that unfolds over the course of 30 years. Therefore, it is critical to capture the disease progression in an early stage such that intervention can be applied before the onset of symptoms. Machine learning (ML) models have been shown to be effective in predicting the onset of AD. Yet for subjects with follow-up visits, existing techniques for AD classification only aim for accurate group assignment, where the monotonically increasing risk across follow-up visits is usually ignored. The resulting fluctuating risk scores across visits violate the irreversibility of AD, hampering the trustworthiness of models and also providing little value to understanding the disease progression. To address this issue, we propose a novel regularization approach to predict AD longitudinally. Our technique aims to maintain the expected monotonicity of increasing disease risk during progression while preserving expressiveness. Specifically, we introduce a monotonicity constraint that encourages the model to predict disease risk in a consistent and ordered manner across follow-up visits. We evaluate our method using the longitudinal structural MRI and amyloid-PET imaging data from the Alzheimer's Disease Neuroimaging Initiative (ADNI). Our model outperforms existing techniques in capturing the progressiveness of disease risk, and at the same time preserves prediction accuracy.

Submitted 9 March, 2024; originally announced March 2024.

Comments: accepted by ISBI 2024
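
One simple way to encourage the non-decreasing risk described in the abstract above is a hinge-style penalty on every drop in predicted risk between consecutive visits, added to the usual classification loss. The sketch below is an assumption about what such a regularizer could look like, not the paper's formulation, and the name `monotonicity_penalty` is hypothetical.

```python
import numpy as np

def monotonicity_penalty(risk_scores):
    """Mean over subjects of the total decrease in risk between visits.

    `risk_scores` has shape (visits,) or (subjects, visits), ordered by
    visit time. A non-decreasing sequence incurs zero penalty.
    """
    risk = np.asarray(risk_scores, dtype=float)
    # Positive entries mark visits where risk fell relative to the previous one.
    drops = np.maximum(risk[..., :-1] - risk[..., 1:], 0.0)
    return float(np.mean(np.sum(drops, axis=-1)))
```

A sequence such as 0.2, 0.5, 0.4, 0.9 is penalized only for the single drop from 0.5 to 0.4, while any non-decreasing sequence passes without penalty.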

arXiv:2312.16419 [pdf]

Radar detection of wake vortex behind the aircraft: the detection range problem

Authors: Jiangkun Gong, Jun Yan, Deyong Kong, Deren Li

Abstract: In this study, we showcased the detection of the wake vortex produced by a medium aircraft at distances exceeding 10 km using an X-band pulse-Doppler radar. We analyzed radar signals within the range profiles behind a Boeing 737 aircraft on February 7, 2021, within the airspace of the Runway Protection Zone (RPZ) at Tianhe Airport, Wuhan, China. The findings revealed that the wake vortex extended up to 6 km from the aircraft, which is 10 km from the radar, displaying distinct stages characterized by scattering patterns and Doppler signatures. Despite the wake vortex exhibiting a scattering power approximately 10 dB lower than that of the aircraft, its Doppler Signal-to-Clutter Ratio (DSCR) values were only 5 dB lower, indicating a notably strong scattering power within a single radar bin. Additionally, certain radar parameters proved inconsistent in the stable detection and tracking of wake vortex, aligning with our earlier concept of cognitive micro-Doppler radar.

Submitted 27 December, 2023; originally announced December 2023.

arXiv:2312.11921 [pdf, other]

Optimal BER Minimum Precoder Design for OTFS-Based ISAC Systems

Authors: Jun Wu, Weijie Yuan, Zhiqiang Wei, **** Yan, Derrick Wing Kwan Ng

Abstract: This paper investigates the bit error rate (BER) minimum pre-coder design for an orthogonal time frequency space (OTFS)-based integrated sensing and communications (ISAC) system, which is considered as a promising technique for enabling future wireless networks. In particular, the BER minimum problem takes into account the maximized available transmission power and the required sensing performance. We devise the precoder from the perspective of delay-Doppler (DD) domain by exploiting the equivalent DD channel. To address the non-convex design problem, we resort to minimizing the lower bound of the derived average BER. Afterwards, we propose a computationally iterative method to solve the dual problem at low cost. Simulation results verify the effectiveness of our proposed precoder and reveal the interplay between sensing and communication for dual-functional precoder design.

Submitted 19 December, 2023; originally announced December 2023.

arXiv:2312.10483 [pdf, other]

doi 10.1109/BioCAS54905.2022.9948588

All Attention U-NET for Semantic Segmentation of Intracranial Hemorrhages In Head CT Images

Authors: Chia Shuo Chang, Tian Sheuan Chang, Jiun Lin Yan, Li Ko

Abstract: Intracranial hemorrhages in head CT scans serve as a first line tool to help specialists diagnose different types. However, their types have diverse shapes in the same type but similar confusing shape, size and location between types. To solve this problem, this paper proposes an all attention U-Net. It uses channel attentions in the U-Net encoder side to enhance class specific feature extraction, and space and channel attentions in the U-Net decoder side for more accurate shape extraction and type classification. The simulation results show up to a 31.8% improvement compared to baseline, ResNet50 + U-Net, and better performance than in cases with limited attention.

Submitted 16 December, 2023; originally announced December 2023.

Comments: 2022 IEEE Biomedical Circuits and Systems Conference (BioCAS)

arXiv:2311.08823 [pdf, other]

Ultrafast 3-D Super Resolution Ultrasound using Row-Column Array specific Coherence-based Beamforming and Rolling Acoustic Sub-aperture Processing: In Vitro, In Vivo and Clinical Study

Authors: Joseph Hansen-Shearer, Jipeng Yan, Marcelo Lerendegui, Biao Huang, Matthieu Toulemonde, Kai Riemer, Qingyuan Tan, Johanna Tonko, Peter D. Weinberg, Chris Dunsby, Meng-Xing Tang

Abstract: The row-column addressed array is an emerging probe for ultrafast 3-D ultrasound imaging. It achieves this with far fewer independent electronic channels and a wider field of view than traditional 2-D matrix arrays, of the same channel count, making it a good candidate for clinical translation. However, the image quality of row-column arrays is generally poor, particularly when investigating tissue. Ultrasound localisation microscopy allows for the production of super-resolution images even when the initial image resolution is not high. Unfortunately, the row-column probe can suffer from imaging artefacts that can degrade the quality of super-resolution images as 'secondary' lobes from bright microbubbles can be mistaken as microbubble events, particularly when operated using plane wave imaging. These false events move through the image in a physiologically realistic way so can be challenging to remove via tracking, leading to the production of 'false vessels'. Here, a new type of rolling window image reconstruction procedure was developed, which integrated a row-column array-specific coherence-based beamforming technique with acoustic sub-aperture processing for the purposes of reducing 'secondary' lobe artefacts, noise and increasing the effective frame rate. Using an in vitro cross tube, it was found that the procedure reduced the percentage of 'false' locations from ~26% to ~15% compared to traditional orthogonal plane wave compounding. Additionally, it was found that the noise could be reduced by ~7 dB and that the effective frame rate could be increased to over 4000 fps. Subsequently, in vivo ultrasound localisation microscopy was used to produce images non-invasively of a rabbit kidney and a human thyroid.

Submitted 15 November, 2023; originally announced November 2023.

arXiv:2311.07062 [pdf, other]

doi 10.1109/TASLP.2023.3332542

Decoupling and Interacting Multi-Task Learning Network for Joint Speech and Accent Recognition

Authors: Qijie Shao, Pengcheng Guo, Jinghao Yan, Pengfei Hu, Lei Xie

Abstract: Accents, as variations from standard pronunciation, pose significant challenges for speech recognition systems. Although joint automatic speech recognition (ASR) and accent recognition (AR) training has been proven effective in handling multi-accent scenarios, current multi-task ASR-AR approaches overlook the granularity differences between tasks. Fine-grained units capture pronunciation-related accent characteristics, while coarse-grained units are better for learning linguistic information. Moreover, an explicit interaction of two tasks can also provide complementary information and improve the performance of each other, but it is rarely used by existing approaches. In this paper, we propose a novel Decoupling and Interacting Multi-task Network (DIMNet) for joint speech and accent recognition, which is comprised of a connectionist temporal classification (CTC) branch, an AR branch, an ASR branch, and a bottom feature encoder. Specifically, AR and ASR are first decoupled by separated branches and two-granular modeling units to learn task-specific representations. The AR branch is from our previously proposed linguistic-acoustic bimodal AR model and the ASR branch is an encoder-decoder based Conformer model. Then, for the task interaction, the CTC branch provides aligned text for the AR task, while accent embeddings extracted from our AR model are incorporated into the ASR branch's encoder and decoder. Finally, during ASR inference, a cross-granular rescoring method is introduced to fuse the complementary information from the CTC and attention decoder after the decoupling. Our experiments on English and Chinese datasets demonstrate the effectiveness of the proposed model, which achieves 21.45%/28.53% AR accuracy relative improvement and 32.33%/14.55% ASR error rate relative reduction over a published standard baseline, respectively.

Submitted 17 November, 2023; v1 submitted 12 November, 2023; originally announced November 2023.

Comments: Accepted by IEEE Transactions on Audio, Speech and Language Processing (TASLP)

arXiv:2311.03911 [pdf, other]

Distributed Parameter Estimation with Gaussian Observation Noises in Time-varying Digraphs

Authors: Jiaqi Yan, Hideaki Ishii

Abstract: In this paper, we consider the problem of distributed parameter estimation in sensor networks. Each sensor makes successive observations of an unknown $d$-dimensional parameter, which might be subject to Gaussian random noises. The sensors aim to infer the true value of the unknown parameter by cooperating with each other. To this end, we first generalize the so-called dynamic regressor extension and mixing (DREM) algorithm to stochastic systems, with which the problem of estimating a $d$-dimensional vector parameter is transformed to that of $d$ scalar ones: one for each of the unknown parameters. For each of the scalar problems, both combine-then-adapt (CTA) and adapt-then-combine (ATC) diffusion-based estimation algorithms are given, where each sensor performs a combination step to fuse the local estimates in its in-neighborhood, alongside an adaptation step to process its streaming observations. Under weak conditions on network topology and excitation of regressors, we show that the proposed estimators guarantee that each sensor infers the true parameter, even if any individual of them cannot by itself. Specifically, it is required that the union of topologies over an interval with fixed length is strongly connected. Moreover, the sensors must collectively satisfy a cooperative persistent excitation (PE) condition, which relaxes the traditional PE condition. Numerical examples are finally provided to illustrate the established results.

Submitted 8 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.
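
The adapt-then-combine (ATC) idea in the abstract above can be sketched for a single scalar parameter: each sensor first corrects its own estimate using its latest observation, then averages the corrected estimates of its in-neighbors. The code below is a generic constant-step-size diffusion sketch under these assumptions, not the paper's estimator after DREM. The name `atc_step` and its arguments are illustrative.

```python
import numpy as np

def atc_step(estimates, regressors, observations, weights, step_size):
    """One adapt-then-combine update for a scalar unknown parameter.

    estimates, regressors, observations: arrays of shape (n_sensors,).
    weights: row-stochastic (n_sensors, n_sensors) matrix; weights[i, j] is
    nonzero only if sensor j is an in-neighbor of sensor i (or j == i).
    """
    estimates = np.asarray(estimates, dtype=float)
    regressors = np.asarray(regressors, dtype=float)
    observations = np.asarray(observations, dtype=float)
    # Adapt: local gradient step on the squared prediction error.
    residual = observations - regressors * estimates
    adapted = estimates + step_size * regressors * residual
    # Combine: fuse the adapted estimates over each in-neighborhood.
    return np.asarray(weights, dtype=float) @ adapted
```

In a noise-free run where only one sensor has a nonzero regressor, the other sensors still reach the true value through the combination step, provided the graph is strongly connected. This mirrors the abstract's point that sensors can succeed collectively even when none of them could alone.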

arXiv:2311.01003 [pdf, other]

Minimum Snap Trajectory Generation and Control for an Under-actuated Flapping Wing Aerial Vehicle

Authors: Chen Qian, Rui Chen, Peiyao Shen, Yongchun Fang, Jifu Yan, Tiefeng Li

Abstract: This paper presents both the trajectory generation and tracking control strategies for an underactuated flapping wing aerial vehicle (FWAV). First, the FWAV dynamics is analyzed from a practical perspective. Then, based on these analyses, we demonstrate the differential flatness of the FWAV system, and develop a general-purpose trajectory generation strategy. Subsequently, the trajectory tracking controller is developed with the help of robust control and switch control techniques. After that, the overall system asymptotic stability is guaranteed by Lyapunov stability analysis. To make the controller applicable in real flight, we also provide several instructions. Finally, a series of experiment results manifest the successful implementation of the proposed trajectory generation strategy and tracking control strategy. This work is the first to achieve the closed-loop integration of trajectory generation and control for real 3-dimensional flight of an underactuated FWAV to a practical level.

Submitted 2 November, 2023; originally announced November 2023.

arXiv:2310.14769 [pdf]

An introduction to radar Automatic Target Recognition (ATR) technology in ground-based radar systems

Authors: Jiangkun Gong, Jun Yan, Deyong Kong, Deren Li

Abstract: This paper presents a brief examination of Automatic Target Recognition (ATR) technology within ground-based radar systems. It offers a lucid comprehension of the ATR concept, delves into its historical milestones, and categorizes ATR methods according to different scattering regions. By incorporating ATR solutions into radar systems, this study demonstrates the expansion of radar detection ranges and the enhancement of tracking capabilities, leading to superior situational awareness. Drawing insights from the Russo-Ukrainian War, the paper highlights three pressing radar applications that urgently necessitate ATR technology: detecting stealth aircraft, countering small drones, and implementing anti-jamming measures. Anticipating the next wave of radar ATR research, the study predicts a surge in cognitive radar and machine learning (ML)-driven algorithms. These emerging methodologies aspire to confront challenges associated with system adaptation, real-time recognition, and environmental adaptability. Ultimately, ATR stands poised to revolutionize conventional radar systems, ushering in an era of 4D sensing capabilities.

Submitted 23 October, 2023; originally announced October 2023.

arXiv:2310.10039 [pdf, other]

TpopT: Efficient Trainable Template Optimization on Low-Dimensional Manifolds

Authors: Jingkai Yan, Shiyu Wang, Xinyu Rain Wei, Jimmy Wang, Zsuzsanna Márka, Szabolcs Márka, John Wright

Abstract: In scientific and engineering scenarios, a recurring task is the detection of low-dimensional families of signals or patterns. A classic family of approaches, exemplified by template matching, aims to cover the search space with a dense template bank. While simple and highly interpretable, it suffers from poor computational efficiency due to unfavorable scaling in the signal space dimensionality. In this work, we study TpopT (TemPlate OPTimization) as an alternative scalable framework for detecting low-dimensional families of signals which maintains high interpretability. We provide a theoretical analysis of the convergence of Riemannian gradient descent for TpopT, and prove that it has a superior dimension scaling to covering. We also propose a practical TpopT framework for nonparametric signal sets, which incorporates techniques of embedding and kernel interpolation, and is further configurable into a trainable network architecture by unrolled optimization. The proposed trainable TpopT exhibits significantly improved efficiency-accuracy tradeoffs for gravitational wave detection, where matched filtering is currently a method of choice. We further illustrate the general applicability of this approach with experiments on handwritten digit data.

Submitted 15 October, 2023; originally announced October 2023.

arXiv:2310.05999 [pdf]

Two stage Robust Nash Bargaining based Energy Trading between Hydrogen-enriched Gas and Active Distribution Networks

Authors: Wenwen Zhang, Gao Qiu, Hongjun Gao, Tingjian Liu, Junyong Liu, Ya** Li, Shengchun Yang, Jiahao Yan, Wenbo Mao

Abstract: Integration of emerging hydrogen-enriched compressed natural gas (HCNG) distribution network with active distribution network (ADN) provides huge latent flexibility on consuming renewable energies. However, paucity of energy trading mechanism risks the stable earnings of the flexibility for both entities, especially when rising highly-efficient solid oxide fuel cells (SOFCs) are pioneered to interface gas and electricity. To fill the gap, a two-stage robust Nash bargaining strategy is proposed. In the first stage, a privacy-preserved Nash Bargaining based on the ADMM is applied to clear energy trading between the two autonomous entities, i.e., ADN and gas distribution network (GDN). Via robust dispatch of configured energy storage in ADN, the next stage de-risks ADN profit collapse from transaction biases, caused by forecasting errors of distributed energy resources. C&CG is finally utilized to loop the two stages. The convergence of the entire energy trading strategy is theoretically proved. As such, sustainable returns from the integration of ADN and GDN bridged by SOFC and HCNG are facilitated. Numerical studies indicate that, the proposed cooperative strategy reaps a stable social welfare of nearly 1.6% to total cost, and benefit-steady situations for both ADN and GDN, even in the worst case.

Submitted 22 May, 2024; v1 submitted 9 October, 2023; originally announced October 2023.

arXiv:2309.15415 [pdf]

Formation Wing-Beat Modulation (FWM): A Tool for Quantifying Bird Flocks Using Radar Micro-Doppler Signals

Authors: Jiangkun Gong, Jun Yan, Deyong Kong, Ruizhi Chen, Deren Li

Abstract: Radar echoes from bird flocks contain modulation signals, which we find are produced by the flapping gaits of birds in the flock, resulting in a group of spectral peaks with similar amplitudes spaced at a specific interval. We call this the formation wing-beat modulation (FWM) effect. FWM signals are micro-Doppler modulated by flapping wings and are related to the bird number, wing-beat frequency, and flight phasing strategy. Our X-band radar data show that FWM signals exist in radar signals of a seagull flock, providing tools for quantifying the bird number and estimating the mean wingbeat rate of birds. This new finding could aid in research on the quantification of bird migration numbers and estimation of bird flight behavior in radar ornithology and aero-ecology.

Submitted 27 September, 2023; originally announced September 2023.

arXiv:2309.08276

A New Adaptive Phase-locked Loop for Synchronization of a Grid-Connected Voltage Source Converter: Simulation and Experimental Results

Authors: Wei He, Jiachen Yan, Romeo Ortega, Daniele Zonetti, Wang** Zhou

Abstract: In [1] a new adaptive phase-locked loop scheme for synchronization of a grid connected voltage source converter with guaranteed (almost) global stability properties was reported. To guarantee a suitable synchronization with the angle of the three-phase grid voltage we design an adaptive observer for such a signal requiring measurements only at the point of common coupling. An interesting feature of this scheme is the ability to synchronize in the challenging condition of connection with a grid with reduced short-circuit ratio. In this paper we present some simulation and experimental illustration of the excellent performance of the proposed solution.

Submitted 30 October, 2023; v1 submitted 15 September, 2023; originally announced September 2023.

Comments: Something needs to be modified so that this paper is more clear

arXiv:2308.03806 [pdf, other]

SoK: Acoustic Side Channels

Authors: ** Wang, Shishir Nagaraja, Aurélien Bourquard, Haichang Gao, Jeff Yan

Abstract: We provide a state-of-the-art analysis of acoustic side channels, cover all the significant academic research in the area, discuss their security implications and countermeasures, and identify areas for future research. We also make an attempt to bridge side channels and inverse problems, two fields that appear to be completely isolated from each other but have deep connections.

Submitted 6 August, 2023; originally announced August 2023.

Comments: 16 pages

arXiv:2308.03076 [pdf]

Study for Performance of MobileNetV1 and MobileNetV2 Based on Breast Cancer

Authors: Jiuqi Yan

Abstract: Artificial intelligence is constantly evolving and can provide effective help in all aspects of people's lives. The experiment is mainly to study the use of artificial intelligence in the field of medicine. The purpose of this experiment was to compare which of MobileNetV1 and MobileNetV2 models was better at detecting histopathological images of the breast downloaded at Kaggle. When the doctor looks at the pathological image, there may be errors that lead to errors in judgment, and the observation speed is slow. Rational use of artificial intelligence can effectively reduce the error of doctor diagnosis in breast cancer judgment and speed up doctor diagnosis. The dataset was downloaded from Kaggle and then normalized. The basic principle of the experiment is to let the neural network model learn the downloaded data set. Then find the pattern and be able to judge on your own whether breast tissue is cancer. In the dataset, benign tumor pictures and malignant tumor pictures have been classified, of which 198,738 are benign tumor pictures and 78,786 are malignant tumor pictures. After calling MobileNetV1 and MobileNetV2, the dataset is trained separately, the training accuracy and validation accuracy rate are obtained, and the image is drawn. It can be observed that MobileNetV1 has better validation accuracy and overfit during MobileNetV2 training. From the experimental results, it can be seen that in the case of processing this dataset, MobileNetV1 is much better than MobileNetV2.

Submitted 6 August, 2023; originally announced August 2023.

Comments: 5 pages,3 figures,CMLAI 2023

Report number: CMLAI-101

arXiv:2307.15101 [pdf]

Detection of Children Abuse by Voice and Audio Classification by Short-Time Fourier Transform Machine Learning implemented on Nvidia Edge GPU device

Authors: Jiuqi Yan, Yingxian Chen, W. W. T. Fok

Abstract: The safety of children in children home has become an increasing social concern, and the purpose of this experiment is to use machine learning applied to detect the scenarios of child abuse to increase the safety of children. This experiment uses machine learning to classify and recognize a child's voice and predict whether the current sound made by the child is crying, screaming or laughing. If a child is found to be crying or screaming, an alert is immediately sent to the relevant personnel so that they can perceive what the child may be experiencing in a surveillance blind spot and respond in a timely manner. Together with a hybrid use of video image classification, the accuracy of child abuse detection can be significantly increased. This greatly reduces the likelihood that a child will receive violent abuse in the nursery and allows personnel to stop an imminent or incipient child abuse incident in time. The datasets collected from this experiment is entirely from sounds recorded on site at the children home, including crying, laughing, screaming sound and background noises. These sound files are transformed into spectrograms using Short-Time Fourier Transform, and then these image data are imported into a CNN neural network for classification, and the final trained model can achieve an accuracy of about 92% for sound detection.

Submitted 27 July, 2023; originally announced July 2023.

Comments: 5 pages, 7 figures, PRAI 2023

Report number: E15251 MSC Class: First level: 68
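
The abstract above describes turning sound files into spectrograms with a Short-Time Fourier Transform before classification. The following is a minimal sketch of that preprocessing step only, assuming a Hann window, a 256-sample frame, a 128-sample hop, and a synthetic 1 kHz tone in place of real recordings. The function name `stft_spectrogram` and all parameter values are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def stft_spectrogram(signal, frame_len=256, hop=128):
    """Return a magnitude spectrogram of shape (freq_bins, frames).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping frames and apply the window to each.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # One real FFT per frame; transpose so rows are frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T

# Synthetic stand-in for a recording: one second of a 1 kHz tone at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)
spec = stft_spectrogram(tone)
peak_bin = int(spec.mean(axis=1).argmax())
```

Each column of `spec` is the spectrum of one short frame, so the array can be treated as a single-channel image and passed to an image classifier such as a CNN.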

arXiv:2307.10326 [pdf]

Introduction to Drone Detection Radar with Emphasis on Automatic Target Recognition (ATR) technology

Authors: Jiangkun Gong, Jun Yan, Deyong Kong, Deren Li

Abstract: This paper discusses the challenges of detecting and categorizing small drones with radar automatic target recognition (ATR) technology. The authors suggest integrating ATR capabilities into drone detection radar systems to improve performance and manage emerging threats. The study focuses primarily on drones in Group 1 and 2. The paper highlights the need to consider kinetic features and signal signatures, such as micro-Doppler, in ATR techniques to efficiently recognize small drones. The authors also present a comprehensive drone detection radar system design that balances detection and tracking requirements, incorporating parameter adjustment based on scattering region theory. They offer an example of a performance improvement achieved using feedback and situational awareness mechanisms with the integrated ATR capabilities. Furthermore, the paper examines challenges related to one-way attack drones and explores the potential of cognitive radar as a solution. The integration of ATR capabilities transforms a 3D radar system into a 4D radar system, resulting in improved drone detection performance. These advancements are useful in military, civilian, and commercial applications, and ongoing research and development efforts are essential to keep radar systems effective and ready to detect, track, and respond to emerging threats.

Submitted 19 July, 2023; originally announced July 2023.

Comments: 17 pages, 14 figures, submitted to a journal and being under review

arXiv:2307.07829 [pdf, other]

HQG-Net: Unpaired Medical Image Enhancement with High-Quality Guidance

Authors: Chunming He, Kai Li, Guoxia Xu, Jiangpeng Yan, Longxiang Tang, Yulun Zhang, Xiu Li, Yaowei Wang

Abstract: Unpaired Medical Image Enhancement (UMIE) aims to transform a low-quality (LQ) medical image into a high-quality (HQ) one without relying on paired images for training. While most existing approaches are based on Pix2Pix/CycleGAN and are effective to some extent, they fail to explicitly use HQ information to guide the enhancement process, which can lead to undesired artifacts and structural distortions. In this paper, we propose a novel UMIE approach that avoids the above limitation of existing methods by directly encoding HQ cues into the LQ enhancement process in a variational fashion and thus model the UMIE task under the joint distribution between the LQ and HQ domains. Specifically, we extract features from an HQ image and explicitly insert the features, which are expected to encode HQ cues, into the enhancement network to guide the LQ enhancement with the variational normalization module. We train the enhancement network adversarially with a discriminator to ensure the generated HQ image falls into the HQ domain. We further propose a content-aware loss to guide the enhancement process with wavelet-based pixel-level and multi-encoder-based feature-level constraints. Additionally, as a key motivation for performing image enhancement is to make the enhanced images serve better for downstream tasks, we propose a bi-level learning scheme to optimize the UMIE task and downstream tasks cooperatively, helping generate HQ images both visually appealing and favorable for downstream tasks. Experiments on three medical datasets, including two newly collected datasets, verify that the proposed method outperforms existing techniques in terms of both enhancement quality and downstream task performance. We will make the code and the newly collected datasets publicly available for community study.

Submitted 15 July, 2023; originally announced July 2023.

Comments: 14 pages, 10 figures

arXiv:2307.04101 [pdf, other]

Enhancing Building Semantic Segmentation Accuracy with Super Resolution and Deep Learning: Investigating the Impact of Spatial Resolution on Various Datasets

Authors: Zhiling Guo, Xiaodan Shi, Haoran Zhang, Dou Huang, Xiaoya Song, **yue Yan, Ryosuke Shibasaki

Abstract: The development of remote sensing and deep learning techniques has enabled building semantic segmentation with high accuracy and efficiency. Despite their success in different tasks, the discussions on the impact of spatial resolution on deep learning based building semantic segmentation are quite inadequate, which makes choosing a higher cost-effective data source a big challenge. To address the issue mentioned above, in this study, we create remote sensing images among three study areas into multiple spatial resolutions by super-resolution and down-sampling. After that, two representative deep learning architectures: UNet and FPN, are selected for model training and testing. The experimental results obtained from three cities with two deep learning models indicate that the spatial resolution greatly influences building segmentation results, and with a better cost-effectiveness around 0.3m, which we believe will be an important insight for data selection and preparation.

Submitted 9 July, 2023; originally announced July 2023.

arXiv:2306.13875 [pdf, other]

Real-World Video for Zoom Enhancement based on Spatio-Temporal Coupling

Authors: Zhiling Guo, Yinqiang Zheng, Haoran Zhang, Xiaodan Shi, Zekun Cai, Ryosuke Shibasaki, **yue Yan

Abstract: In recent years, single-frame image super-resolution (SR) has become more realistic by considering the zooming effect and using real-world short- and long-focus image pairs. In this paper, we further investigate the feasibility of applying realistic multi-frame clips to enhance zoom quality via spatio-temporal information coupling. Specifically, we first built a real-world video benchmark, VideoRAW, by a synchronized co-axis optical system. The dataset contains paired short-focus raw and long-focus sRGB videos of different dynamic scenes. Based on VideoRAW, we then presented a Spatio-Temporal Coupling Loss, termed as STCL. The proposed STCL is intended for better utilization of information from paired and adjacent frames to align and fuse features both temporally and spatially at the feature level. The outperformed experimental results obtained in different zoom scenarios demonstrate the superiority of integrating real-world video dataset and STCL into existing SR models for zoom quality enhancement, and reveal that the proposed method can serve as an advanced and viable tool for video zoom.

Submitted 24 June, 2023; originally announced June 2023.

Comments: 11 pages

arXiv:2306.10982 [pdf, other]

Differentially Private Over-the-Air Federated Learning Over MIMO Fading Channels

Authors: Hang Liu, Jia Yan, Ying-Jun Angela Zhang

Abstract: Federated learning (FL) enables edge devices to collaboratively train machine learning models, with model communication replacing direct data uploading. While over-the-air model aggregation improves communication efficiency, uploading models to an edge server over wireless networks can pose privacy risks. Differential privacy (DP) is a widely used quantitative technique to measure statistical data privacy in FL. Previous research has focused on over-the-air FL with a single-antenna server, leveraging communication noise to enhance user-level DP. This approach achieves the so-called "free DP" by controlling transmit power rather than introducing additional DP-preserving mechanisms at devices, such as adding artificial noise. In this paper, we study differentially private over-the-air FL over a multiple-input multiple-output (MIMO) fading channel. We show that FL model communication with a multiple-antenna server amplifies privacy leakage as the multiple-antenna server employs separate receive combining for model aggregation and information inference. Consequently, relying solely on communication noise, as done in the multiple-input single-output system, cannot meet high privacy requirements, and a device-side privacy-preserving mechanism is necessary for optimal DP design. We analyze the learning convergence and privacy loss of the studied FL system and propose a transceiver design algorithm based on alternating optimization. Numerical results demonstrate that the proposed method achieves a better privacy-learning trade-off compared to prior work.

Submitted 25 December, 2023; v1 submitted 19 June, 2023; originally announced June 2023.

Comments: This work has been accepted by the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

arXiv:2306.02017 [pdf, other]

Resilient Distributed Parameter Estimation in Sensor Networks

Authors: Jiaqi Yan, Kuo Li, Hideaki Ishii

Abstract: In this paper, we study the problem of parameter estimation in a sensor network, where the measurements and updates of some sensors might be arbitrarily manipulated by adversaries. Despite the presence of such misbehaviors, normally behaving sensors make successive observations of an unknown $d$-dimensional vector parameter and aim to infer its true value by cooperating with their neighbors over a directed communication graph. To this end, by leveraging the so-called dynamic regressor extension and mixing procedure, we transform the problem of estimating the vector parameter to that of estimating $d$ scalar ones. For each of the scalar problems, we propose a resilient combine-then-adapt diffusion algorithm, where each normal sensor performs a resilient combination to discard the suspicious estimates in its neighborhood and to fuse the remaining values, alongside an adaptation step to process its streaming observations. With a low computational cost, this estimator guarantees that each normal sensor exponentially infers the true parameter even if some of them are not sufficiently excited.

Submitted 3 June, 2023; originally announced June 2023.
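
The abstract above mentions a resilient combination step in which a sensor discards suspicious neighbor estimates and fuses the rest. One common way to realize this idea is a trimmed average, sketched below for a single scalar estimate. This is an illustrative assumption rather than the paper's actual rule: the function name `resilient_combine`, the bound `f` on the number of adversarial neighbors, and the uniform averaging are all hypothetical.

```python
def resilient_combine(own, neighbor_estimates, f):
    """Fuse scalar estimates after trimming the f largest and f smallest.

    Hypothetical sketch: f is an assumed upper bound on the number of
    adversarial neighbors, and uniform averaging is an assumed fusion rule.
    """
    if len(neighbor_estimates) <= 2 * f:
        raise ValueError("need more than 2*f neighbor estimates")
    ordered = sorted(neighbor_estimates)
    # Drop the f smallest and f largest values, which may be manipulated.
    kept = ordered[f : len(ordered) - f]
    values = [own] + kept
    return sum(values) / len(values)

# Two outliers (100.0 and -50.0) stand in for manipulated estimates.
fused = resilient_combine(1.0, [0.9, 1.1, 100.0, -50.0, 1.0], f=1)
```

With `f=1` the values 100.0 and -50.0 are discarded, so the fused estimate stays near the honest values regardless of how extreme the outliers are.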

arXiv:2304.00819 [pdf, other]

Acceleration-Based Kalman Tracking for Super-Resolution Ultrasound Imaging in vivo

Authors: Biao Huang, Jipeng Yan, Megan Morris, Victoria Sinnett, Navita Somaiah, Meng-Xing Tang

Abstract: Super-resolution ultrasound can image microvascular structure and flow at sub-wave-diffraction resolution based on localising and tracking microbubbles. Currently, tracking microbubbles accurately under limited imaging frame rates and high microbubble concentrations remains a challenge, especially under the effect of cardiac pulsatility and in highly curved vessels. In this study, an acceleration-incorporated microbubble motion model is introduced into a Kalman tracking framework. The tracking performance was evaluated using simulated microvasculature with different microbubble motion parameters and acquisition frame rates, and in vivo human breast tumour ultrasound datasets. The simulation results show that the acceleration-based method outperformed the non-acceleration-based method at different levels of acceleration and acquisition frame rates and achieved significant improvement in true positive rate (up to 10.03%), false negative rate (up to 28.61%) and correctly pairing fraction (up to 170.14%). The proposed method can also reduce errors in vasculature reconstruction via the acceleration-based nonlinear interpolation, compared with linear interpolation (up to 19 um). The tracking results from temporally downsampled low frame rate in vivo datasets from human breast tumours show that the proposed method has better microbubble tracking performance than the baseline method, if using results from the initial high frame data as reference. Finally, the acceleration estimated from tracking results also provides a spatial speed gradient map that may contain extra valuable diagnostic information.

Submitted 3 April, 2023; originally announced April 2023.

Comments: 15 pages, 10 figures
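
The abstract above refers to a Kalman tracking framework whose motion model includes acceleration. As a rough illustration of what such a model looks like, the sketch below implements a generic one-dimensional constant-acceleration Kalman filter with state (position, velocity, acceleration) and position-only measurements. It is not the authors' implementation: the time step, noise covariances, and noise-free test trajectory are all assumed values chosen for the example.

```python
import numpy as np

dt = 0.1  # assumed frame interval
# Constant-acceleration transition for state [position, velocity, acceleration].
F = np.array([[1.0, dt, 0.5 * dt**2],
              [0.0, 1.0, dt],
              [0.0, 0.0, 1.0]])
H = np.array([[1.0, 0.0, 0.0]])  # only position is measured
Q = 1e-6 * np.eye(3)             # assumed process noise covariance
R = np.array([[1e-2]])           # assumed measurement noise covariance

def kalman_step(x, P, z):
    """One predict/update cycle given a scalar position measurement z."""
    # Predict the state and covariance forward by one frame.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the measurement residual.
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (np.array([z]) - H @ x)
    P = (np.eye(3) - K @ H) @ P
    return x, P

# Noise-free trajectory with true acceleration 2.0: position = t**2.
x, P = np.zeros(3), 1e3 * np.eye(3)
for k in range(1, 51):
    x, P = kalman_step(x, P, (k * dt) ** 2)
```

After the loop, `x[2]` holds the filter's acceleration estimate. A constant-velocity filter would drop the third state component and would therefore lag behind on a trajectory like this one.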

arXiv:2303.14003 [pdf]

Transthoracic super-resolution ultrasound localisation microscopy of myocardial vasculature in patients

Authors: Jipeng Yan, Biao Huang, Johanna Tonko, Matthieu Toulemonde, Joseph Hansen-Shearer, Qingyuan Tan, Kai Riemer, Konstantinos Ntagiantas, Rasheda A Chowdhury, Pier Lambiase, Roxy Senior, Meng-Xing Tang

Abstract: Micro-vascular flow in the myocardium is of significant importance clinically but remains poorly understood. Up to 25% of patients with symptoms of coronary heart diseases have no obstructive coronary arteries and have suspected microvascular diseases. However, such microvasculature is difficult to image in vivo with existing modalities due to the lack of resolution and sensitivity. Here, we demonstrate the feasibility of transthoracic super-resolution ultrasound localisation microscopy (SRUS/ULM) of myocardial microvasculature and hemodynamics in a large animal model and in patients, using a cardiac phased array probe with a customised data acquisition and processing pipeline. A multi-level motion correction strategy was proposed. A tracking framework incorporating multiple features and automatic parameter initialisations was developed to reconstruct microcirculation. In two patients with impaired myocardial function, we have generated SRUS images of myocardial vascular structure and flow with a resolution that is beyond the wave-diffraction limit (half a wavelength), using data acquired within a breath hold. Myocardial SRUS/ULM has potential to improve the understanding of myocardial microcirculation and the management of patients with cardiac microvascular diseases.

Submitted 28 March, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

Comments: 22 pages, 10 figures

arXiv:2303.11775 [pdf, other]

Distributed Parameter Estimation under Gaussian Observation Noises

Authors: Jiaqi Yan, Hideaki Ishii

Abstract: In this paper, we consider the problem of distributed parameter estimation in sensor networks. Each sensor makes successive observations of an unknown $d$-dimensional parameter, which might be subject to Gaussian random noises. They aim to infer the true value of the unknown parameter by cooperating with each other. To this end, we first generalize the so-called dynamic regressor extension and mixing (DREM) algorithm to stochastic systems, with which the problem of estimating a $d$-dimensional vector parameter is transformed to that of $d$ scalar ones: one for each of the unknown parameters. For each of the scalar problems, an estimation scheme is given, where each sensor fuses the regressors and measurements in its in-neighborhood and updates its local estimate by using least-mean squares. Particularly, a counter is also introduced for each sensor, which prevents any (noisy) measurement from being repeatedly used such that the estimation performance will not be greatly affected by certain extreme values. A novel excitation condition termed as \textit{local persistent excitation} (Local-PE) condition is also proposed, which relaxes the traditional persistent excitation (PE) condition and only requires that the collective signals in each sensor's in-neighborhood are sufficiently excited. With the Local-PE condition and proper step sizes, we show that the proposed estimator guarantees that each sensor infers the true parameter in mean square, even if any individual of them cannot. Numerical examples are finally provided to illustrate the established results.

Submitted 21 March, 2023; originally announced March 2023.

arXiv:2302.05736 [pdf]

Locating the Sources of Sub-synchronous Oscillations Induced by the Control of Voltage Source Converters Based on Energy Structure and Nonlinearity Detection

Authors: Zetian Zheng, Shaowei Huang, Jun Yan, Qiangsheng Bu, Chen Shen, Mingzhong Zheng, Ye Liu

Abstract: The oscillation phenomena associated with the control of voltage source converters (VSCs) are widely concerning, and locating the source of these oscillations is crucial to suppressing them; therefore, this paper presents a locating scheme, based on the energy structure and nonlinearity detection. On the one hand, the energy structure, which conforms with the principle of the energy-based method and dissipativity theory, is constructed to describe the transient energy flow for VSCs, and on this basis, a defined characteristic quantity is implemented to narrow the scope of oscillation source location; on the other hand, according to the self-sustained oscillation characteristics of VSCs, an index for nonlinearity detection is applied to locate the VSCs which produce the oscillation energy. The combination of the energy structure and nonlinearity detection could distinguish the contributions of different VSCs to the oscillation. The results of a case study implemented by the PSCAD/EMTDC simulation validate the proposed scheme.

Submitted 17 February, 2023; v1 submitted 11 February, 2023; originally announced February 2023.

arXiv:2302.05297 [pdf]

Objective Evaluation-based High-efficiency Learning Framework for Hyperspectral Image Classification

Authors: Xuming Zhang, Jian Yan, Jia Tian, Wei Li, Xingfa Gu, Qingjiu Tian

Abstract: Deep learning methods have been successfully applied to hyperspectral image (HSI) classification with remarkable performance. Because of limited labelled HSI data, earlier studies primarily adopted a patch-based classification framework, which divides images into overlapping patches for training and testing. However, this approach results in redundant computations and possible information leakage. In this study, we propose an objective evaluation-based high-efficiency learning framework for tiny HSI classification. This framework comprises two main parts: (i) a leakage-free balanced sampling strategy, and (ii) a modified end-to-end fully convolutional network (FCN) architecture that optimizes the trade-off between accuracy and efficiency. The leakage-free balanced sampling strategy generates balanced and non-overlapping training and testing data by partitioning an HSI and the ground truth image into small windows, each of which corresponds to one training or testing sample. The proposed high-efficiency FCN exhibits a pixel-to-pixel architecture with modifications aimed at faster inference speed and improved parameter efficiency. Experiments conducted on four representative datasets demonstrated that the proposed sampling strategy can provide objective performance evaluation and that the proposed network outperformed many state-of-the-art approaches with respect to the speed/accuracy tradeoff. Our source code is available at https://github.com/xmzhang2018.

Submitted 10 January, 2023; originally announced February 2023.

arXiv:2211.12845 [pdf, other]

doi 10.1007/s41095-023-0387-8

Super-resolution Reconstruction of Single Image for Latent features

Authors: Xin Wang, Jing-Ke Yan, Jing-Ye Cai, Jian-Hua Deng, Qin Qin, Yao Cheng

Abstract: Single-image super-resolution (SISR) typically focuses on restoring various degraded low-resolution (LR) images to a single high-resolution (HR) image. However, during SISR tasks, it is often challenging for models to simultaneously maintain high quality and rapid sampling while preserving diversity in details and texture features. This challenge can lead to issues such as model collapse, lack of rich details and texture features in the reconstructed HR images, and excessive time consumption for model sampling. To address these problems, this paper proposes a Latent Feature-oriented Diffusion Probability Model (LDDPM). First, we designed a conditional encoder capable of effectively encoding LR images, reducing the solution space for model image reconstruction and thereby improving the quality of the reconstructed images. We then employed a normalized flow and multimodal adversarial training, learning from complex multimodal distributions, to model the denoising distribution. Doing so boosts the generative modeling capabilities within a minimal number of sampling steps. Experimental comparisons of our proposed model with existing SISR methods on mainstream datasets demonstrate that our model reconstructs more realistic HR images and achieves better performance on multiple evaluation metrics, providing a fresh perspective for tackling SISR tasks.

Submitted 9 November, 2023; v1 submitted 16 November, 2022; originally announced November 2022.

Journal ref: Computational Visual Media,2023

arXiv:2208.14812 [pdf, other]

Domain Shift-oriented Machine Anomalous Sound Detection Model Based on Self-Supervised Learning

Authors: Jing-ke Yan, Xin Wang, Qin Wang, Qin Qin, Huang-he Li, Peng-fei Ye, Yue-** He, Jing Zeng

Abstract: Thanks to the development of deep learning, research on machine anomalous sound detection based on self-supervised learning has made remarkable achievements. However, there are differences in the acoustic characteristics of the test set and the training set under different operating conditions of the same machine (domain shifts). It is challenging for the existing detection methods to learn domain-shift features stably with low computation overhead. To address these problems, we propose a domain shift-oriented machine anomalous sound detection model based on self-supervised learning (TranSelf-DyGCN) in this paper. Firstly, we design a time-frequency domain feature modeling network to capture global and local spatial and time-domain features, thus improving the stability of machine anomalous sound detection under domain shifts. Then, we adopt a Dynamic Graph Convolutional Network (DyGCN) to model the inter-dependence relationship between domain-shift features, enabling the model to perceive domain-shift features efficiently. Finally, we use a Domain Adaptive Network (DAN) to compensate for the performance decrease caused by domain shifts, making the model adapt to anomalous sound better in the self-supervised environment. The performance of the suggested model is validated on DCASE 2020 task 2 and DCASE 2022 task 2.

Submitted 7 September, 2022; v1 submitted 31 August, 2022; originally announced August 2022.

arXiv:2208.12176 [pdf, other]

doi 10.1109/TBME.2023.3263369

3D Super-Resolution Ultrasound with Adaptive Weight-Based Beamforming

Authors: Jipeng Yan, Bingxue Wang, Kai Riemer, Joseph Hansen-Shearer, Marcelo Lerendegui, Matthieu Toulemonde, Christopher J Rowlands, Peter D. Weinberg, Meng-Xing Tang

Abstract: Super-resolution ultrasound (SRUS) imaging through localising and tracking sparse microbubbles has been shown to reveal microvascular structure and flow beyond the wave diffraction limit. Most SRUS studies use standard delay and sum (DAS) beamforming, where large main lobe and significant side lobes make separation and localisation of densely distributed bubbles challenging, particularly in 3D due to the typically small aperture of matrix array probes. This study aims to improve 3D SRUS by implementing a low-cost 3D coherence beamformer based on channel signal variance, as well as two other adaptive weight-based coherence beamformers: nonlinear beamforming with p-th root compression and coherence factor. The 3D coherence beamformers, together with DAS, are compared in computer simulation, on a microflow phantom, and in vivo. Simulation results demonstrate that the adaptive weight-based beamformers can significantly narrow the main lobe and suppress the side lobes for modest computational cost. Significantly improved 3D SR images of microflow phantom and a rabbit kidney are obtained through the adaptive weight-based beamformers. The proposed variance-based beamformer performs best in simulations and experiments.

Submitted 25 August, 2022; originally announced August 2022.

Comments: Ultrasound localisation microscopy (ULM), super-resolution, contrast-enhanced ultrasound, 3D beamforming

arXiv:2207.07303 [pdf, other]

doi 10.1007/978-3-030-92273-3_45

Towards Better Dermoscopic Image Feature Representation Learning for Melanoma Classification

Authors: ChengHui Yu, MingKang Tang, ShengGe Yang, MingQing Wang, Zhe Xu, JiangPeng Yan, HanMo Chen, Yu Yang, Xiao-Jun Zeng, Xiu Li

Abstract: Deep learning-based melanoma classification with dermoscopic images has recently shown great potential in automatic early-stage melanoma diagnosis. However, limited by the significant data imbalance and obvious extraneous artifacts, i.e., the hair and ruler markings, discriminative feature extraction from dermoscopic images is very challenging. In this study, we seek to resolve these problems respectively towards better representation learning for lesion features. Specifically, a GAN-based data augmentation (GDA) strategy is adapted to generate synthetic melanoma-positive images, in conjunction with the proposed implicit hair denoising (IHD) strategy, wherein the hair-related representations are implicitly disentangled via an auxiliary classifier network and reversely sent to the melanoma-feature extraction backbone for better melanoma-specific representation learning. Furthermore, to train the IHD module, the hair noises are additionally labeled on the ISIC2020 dataset, making it the first large-scale dermoscopic dataset with annotation of hair-like artifacts. Extensive experiments demonstrate the superiority of the proposed framework as well as the effectiveness of each component. The improved dataset is publicly available at https://github.com/kirtsy/DermoscopicDataset.

Submitted 15 July, 2022; originally announced July 2022.

Comments: ICONIP 2021 conference

arXiv:2206.07364 [pdf, other]

Seeking Common Ground While Reserving Differences: Multiple Anatomy Collaborative Framework for Undersampled MRI Reconstruction

Authors: Jiangpeng Yan, Chenghui Yu, Hanbo Chen, Zhe Xu, Junzhou Huang, Xiu Li, Jianhua Yao

Abstract: Recently, deep neural networks have greatly advanced undersampled Magnetic Resonance Image (MRI) reconstruction, wherein most studies follow the one-anatomy-one-network fashion, i.e., each expert network is trained and evaluated for a specific anatomy. Apart from inefficiency in training multiple independent models, such convention ignores the shared de-aliasing knowledge across various anatomies which can benefit each other. To explore the shared knowledge, one naive way is to combine all the data from various anatomies to train an all-round network. Unfortunately, despite the existence of the shared de-aliasing knowledge, we reveal that the exclusive knowledge across different anatomies can deteriorate specific reconstruction targets, yielding overall performance degradation. Observing this, in this study, we present a novel deep MRI reconstruction framework with both anatomy-shared and anatomy-specific parameterized learners, aiming to "seek common ground while reserving differences" across different anatomies. Particularly, the primary anatomy-shared learners are exposed to different anatomies to model flourishing shared knowledge, while the efficient anatomy-specific learners are trained with their target anatomy for exclusive knowledge. Four different implementations of anatomy-specific learners are presented and explored on the top of our framework in two MRI reconstruction networks. Comprehensive experiments on brain, knee and cardiac MRI datasets demonstrate that three of these learners are able to enhance reconstruction performance via multiple anatomy collaborative learning.

Submitted 15 June, 2022; v1 submitted 15 June, 2022; originally announced June 2022.

Comments: submitted to an IEEE journal

arXiv:2206.03912 [pdf, other]

Volumetric Image Projection Super-Resolution Ultrasound (VIP-SR) with a 1D Unfocused Linear Array

Authors: B. Wang, K. Riemer, M. Toulemonde, J. Yan, X. Zhou, M. Tang

Abstract: Super-Resolution Ultrasound (SRUS) through localizing spatially isolated microbubbles has been demonstrated to overcome the wave diffraction limit and reveal the microvascular structure and flow information at the microscopic scale. However, 3D SRUS imaging remains a challenge due to the fabrication and computational complexity of 2D matrix array probes and connections. Inspired by X-ray radiography, which can present volumetric information in a single projection image with much simpler hardware than X-ray CT, this study investigates the feasibility of volumetric image projection super-resolution (VIP-SR) ultrasound using a 1D unfocused linear array. Both simulation and experiments were conducted on 3D microvessel phantoms using a 1D linear array with or without an elevational focus, and a 2D matrix array as the reference. Results show that VIP-SR using an unfocused 1D array probe can capture significantly more volumetric information than the conventional 1D elevational focused probe. Compared with the 2D projection image of the full 3D SRUS results using the 2D array probe with the same aperture size, VIP-SR has similar volumetric coverage using 32-fold fewer independent elements. The impact of bubble concentration and vascular density on the VIP-SR US was also investigated. This study demonstrates the ability of high-resolution volumetric imaging of microvascular structures at significantly reduced costs with VIP-SR.

Submitted 8 June, 2022; originally announced June 2022.

Comments: 19 pages, 9 figures

arXiv:2205.06612 [pdf, ps, other]

Event-Based Control for Synchronization of Stochastic Linear Systems with Application to Distributed Estimation

Authors: Jiaqi Yan, Yilin Mo, Hideaki Ishii

Abstract: This paper studies the synchronization of stochastic linear systems which are subject to a general class of noises, in the sense that the noises are bounded in covariance but might be correlated with the states of agents and among each other. We propose an event-based control protocol for achieving the synchronization among agents in the mean square sense and theoretically analyze the performance of it by using a stochastic Lyapunov function, where the stability of $c$-martingales is particularly developed to handle the challenges brought by the general model of noises and the event-triggering mechanism. The proposed event-based synchronization algorithm is then applied to solve the problem of distributed estimation in sensor network. Specifically, by losslessly decomposing the optimal Kalman filter, it is shown that the problem of distributed estimation can be resolved by using the algorithms designed for achieving the synchronization of stochastic linear systems. As such, an event-based distributed estimation algorithm is developed, where each sensor performs local filtering solely using its own measurement, together with the proposed event-based synchronization algorithm to fuse the local estimates of neighboring nodes. With the reduced communication frequency, the designed estimator is proved to be stable under the minimal requirements of network connectivity and collective system observability.

Submitted 13 May, 2022; originally announced May 2022.

Comments: arXiv admin note: text overlap with arXiv:2204.03364

arXiv:2204.03398 [pdf, other]

Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition

Authors: Qijie Shao, Jinghao Yan, Jian Kang, Pengcheng Guo, Xian Shi, Pengfei Hu, Lei Xie

Abstract: General accent recognition (AR) models tend to directly extract low-level information from spectrums, which always significantly overfit on speakers or channels. Considering accent can be regarded as a series of shifts relative to native pronunciation, distinguishing accents will be an easier task with accent shift as input. But due to the lack of native utterance as an anchor, estimating the accent shift is difficult. In this paper, we propose linguistic-acoustic similarity based accent shift (LASAS) for AR tasks. For an accent speech utterance, after mapping the corresponding text vector to multiple accent-associated spaces as anchors, its accent shift could be estimated by the similarities between the acoustic embedding and those anchors. Then, we concatenate the accent shift with a dimension-reduced text vector to obtain a linguistic-acoustic bimodal representation. Compared with pure acoustic embedding, the bimodal representation is richer and more clear by taking full advantage of both linguistic and acoustic information, which can effectively improve AR performance. Experiments on Accented English Speech Recognition Challenge (AESRC) dataset show that our method achieves 77.42% accuracy on Test set, obtaining a 6.94% relative improvement over a competitive system in the challenge.

Submitted 1 July, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

Comments: Accepted by Interspeech 2022

arXiv:2204.03364 [pdf, other]

A Framework for Distributed Estimation with Reduced Communication via Event-Based Strategies

Authors: Jiaqi Yan, Yilin Mo, Hideaki Ishii

Abstract: This paper considers the problem of distributed estimation in a sensor network, where multiple sensors are deployed to infer the state of a linear time-invariant (LTI) Gaussian system. By proposing a lossless decomposition of Kalman filter, a framework of event-based distributed estimation is developed, where each sensor node runs a local filter using solely its own measurement, alongside with an event-based synchronization algorithm to fuse the neighboring information. One novelty of the proposed framework is that it decouples the local filter from synchronization process. By doing so, we prove that a general class of triggering strategies can be applied in our framework, which yields stable distributed estimators under the minimal requirements of network connectivity and collective system observability. As compared with existing works, the proposed algorithm enjoys lower data size for each transmission. Moreover, the developed results can be generalized to achieve a distributed implementation of any Luenberger observer. By solving a semi-definite programming (SDP), we further present a low-rank estimator design to obtain the optimal gain of Luenberger observer such that the distributed estimation is realized under the constraint of message complexity. Numerical examples are finally provided to demonstrate the proposed methods.

Submitted 17 April, 2022; v1 submitted 7 April, 2022; originally announced April 2022.

arXiv:2203.04263 [pdf, other]

doi 10.1109/TMI.2022.3223554

Fast and selective super-resolution ultrasound in vivo with sono-switchable nanodroplets

Authors: Kai Riemer, Matthieu Toulemonde, Jipeng Yan, Marcelo Lerendegui, Eleanor Stride, Peter D. Weinberg, Christopher Dunsby, Meng-Xing Tang

Abstract: Perfusion by the microcirculation is key to the development, maintenance and pathology of tissue. Its measurement with high spatiotemporal resolution is consequently valuable but remains a challenge in deep tissue. Ultrasound Localization Microscopy (ULM) provides very high spatiotemporal resolution but the use of microbubbles requires low contrast agent concentrations, a long acquisition time, and gives little control over the spatial and temporal distribution of the bubbles. The present study is the first to demonstrate Acoustic Wave Sparsely-Activated Localization Microscopy (AWSALM) and fast-AWSALM for in vivo super-resolution ultrasound imaging, offering contrast on demand and vascular selectivity. Three different formulations of sono-switchable contrast agents were tested. We demonstrate their use with ultrasound mechanical indices well within recommended safety limits to enable fast on-demand sparse switching at very high agent concentrations. We produce super-localization maps of the rabbit renal vasculature with acquisition times between 5.5 s and 0.25 s, and a 4-fold improvement in spatial resolution. We present the unique selectivity of AWSALM in visualizing specific vascular branches and downstream microvasculature, and we show super-localized kidney structures in systole and diastole with fast-AWSALM. In conclusion we demonstrate the feasibility of fast and selective measurement of microvascular dynamics in vivo with subwavelength resolution using ultrasound and sono-switchable nanodroplets.

Submitted 8 March, 2022; originally announced March 2022.

Comments: phase-change contrast agent, low-boiling point nanodroplet, acoustic vaporization, droplet activation, microcirculation, contrast enhanced ultrasound, plane wave

arXiv:2202.09059 [pdf, other]

Towards better understanding and better generalization of few-shot classification in histology images with contrastive learning

Authors: Jiawei Yang, Hanbo Chen, Jiangpeng Yan, Xiaoyu Chen, Jianhua Yao

Abstract: Few-shot learning has been an established topic in natural images for years, but little work has been devoted to histology images, where it is of high clinical value since well-labeled datasets and rare abnormal samples are expensive to collect. Here, we facilitate the study of few-shot learning in histology images by setting up three cross-domain tasks that simulate real clinical problems. To enable label-efficient learning and better generalizability, we propose to incorporate contrastive learning (CL) with latent augmentation (LA) to build a few-shot system. CL learns useful representations without manual labels, while LA transfers semantic variations of the base dataset in an unsupervised way. These two components fully exploit unlabeled training data and can scale gracefully to other label-hungry problems. In experiments, we find i) models learned by CL generalize better than supervised learning for histology images in unseen classes, and ii) LA brings consistent gains over baselines. Prior studies of self-supervised learning mainly focus on ImageNet-like images, which only present a dominant object in their centers. Recent attention has been paid to images with multi-objects and multi-textures. Histology images are a natural choice for such a study. We show the superiority of CL over supervised learning in terms of generalization for such data and provide our empirical understanding for this observation. The findings in this work could contribute to understanding how the model generalizes in the context of both representation learning and histological image analysis. Code is available.

Submitted 18 February, 2022; originally announced February 2022.

arXiv:2202.07125 [pdf, other]

Transformers in Time Series: A Survey

Authors: Qingsong Wen, Tian Zhou, Chaoli Zhang, Weiqi Chen, Ziqing Ma, Junchi Yan, Liang Sun

Abstract: Transformers have achieved superior performances in many tasks in natural language processing and computer vision, which also triggered great interest in the time series community. Among multiple advantages of Transformers, the ability to capture long-range dependencies and interactions is especially attractive for time series modeling, leading to exciting progress in various time series applications. In this paper, we systematically review Transformer schemes for time series modeling by highlighting their strengths as well as limitations. In particular, we examine the development of time series Transformers in two perspectives. From the perspective of network structure, we summarize the adaptations and modifications that have been made to Transformers in order to accommodate the challenges in time series analysis. From the perspective of applications, we categorize time series Transformers based on common tasks including forecasting, anomaly detection, and classification. Empirically, we perform robust analysis, model size analysis, and seasonal-trend decomposition analysis to study how Transformers perform in time series. Finally, we discuss and suggest future directions to provide useful research guidance. To the best of our knowledge, this paper is the first work to comprehensively and systematically summarize the recent advances of Transformers for modeling time series data. We hope this survey will ignite further research interests in time series Transformers.

Submitted 11 May, 2023; v1 submitted 14 February, 2022; originally announced February 2022.

Comments: Accepted by 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023). 9 pages. The first work to comprehensively and systematically summarize time series Transformers. The GitHub repository is https://github.com/qingsongedu/time-series-transformers-review

Journal ref: In the 32nd International Joint Conference on Artificial Intelligence (IJCAI 2023)

arXiv:2111.00104 [pdf, other]

Principal Component Pursuit for Pattern Identification in Environmental Mixtures

Authors: Elizabeth A. Gibson, Junhui Zhang, Jingkai Yan, Lawrence Chillrud, Jaime Benavides, Yanelli Nunez, Julie B. Herbstman, Jeff Goldsmith, John Wright, Marianthi-Anna Kioumourtzoglou

Abstract: Environmental health researchers often aim to identify sources/behaviors that give rise to potentially harmful exposures. We adapted principal component pursuit (PCP), a robust technique for dimensionality reduction in computer vision and signal processing, to identify patterns in environmental mixtures. PCP decomposes the exposure mixture into a low-rank matrix containing consistent exposure patterns across pollutants and a sparse matrix isolating unique exposure events. We adapted PCP to accommodate non-negative and missing data, and values below a given limit of detection (LOD). We simulated data to represent environmental mixtures of two sizes with increasing proportions <LOD and three noise structures. We compared PCP-LOD to principal component analysis (PCA) to evaluate performance. We next applied PCP-LOD to a mixture of 21 persistent organic pollutants (POPs) measured in 1,000 U.S. adults from the 2001-2002 National Health and Nutrition Examination Survey. We applied singular value decomposition to the estimated low-rank matrix to characterize the patterns. PCP-LOD recovered the true number of patterns through cross-validation for all simulations; based on an a priori specified criterion, PCA recovered the true number of patterns in 32% of simulations. PCP-LOD achieved lower relative predictive error than PCA for all simulated datasets with up to 50% of the data <LOD. When 75% of values were <LOD, PCP-LOD outperformed PCA only when noise was low. In the POP mixture, PCP-LOD identified a rank-three underlying structure and separated 6% of values as unique events. One pattern represented comprehensive exposure to all POPs. The other patterns grouped chemicals based on known structure and toxicity. PCP-LOD serves as a useful tool to express multi-dimensional exposures as consistent patterns that, if found to be related to adverse health, are amenable to targeted interventions.

Submitted 29 October, 2021; originally announced November 2021.

Comments: 32 pages, 11 figures, 4 tables

arXiv:2109.13930 [pdf, other]

All-Around Real Label Supervision: Cyclic Prototype Consistency Learning for Semi-supervised Medical Image Segmentation

Authors: Zhe Xu, Yixin Wang, Donghuan Lu, Lequan Yu, Jiangpeng Yan, Jie Luo, Kai Ma, Yefeng Zheng, Raymond Kai-yu Tong

Abstract: Semi-supervised learning has substantially advanced medical image segmentation since it alleviates the heavy burden of acquiring the costly expert-examined annotations. Especially, the consistency-based approaches have attracted more attention for their superior performance, wherein the real labels are only utilized to supervise their paired images via supervised loss while the unlabeled images are exploited by enforcing the perturbation-based "unsupervised" consistency without explicit guidance from those real labels. However, intuitively, the expert-examined real labels contain more reliable supervision signals. Observing this, we ask an unexplored but interesting question: can we exploit the unlabeled data via explicit real label supervision for semi-supervised training? To this end, we discard the previous perturbation-based consistency but absorb the essence of non-parametric prototype learning. Based on the prototypical network, we then propose a novel cyclic prototype consistency learning (CPCL) framework, which is constructed by a labeled-to-unlabeled (L2U) prototypical forward process and an unlabeled-to-labeled (U2L) backward process. These two processes synergistically enhance the segmentation network by encouraging more discriminative and compact features. In this way, our framework turns previous "unsupervised" consistency into new "supervised" consistency, obtaining the "all-around real label supervision" property of our method. Extensive experiments on brain tumor segmentation from MRI and kidney segmentation from CT images show that our CPCL can effectively exploit the unlabeled data and outperform other state-of-the-art semi-supervised medical image segmentation methods.

Submitted 15 March, 2022; v1 submitted 28 September, 2021; originally announced September 2021.

Comments: 11 pages

arXiv:2107.09889 [pdf, other]

Fine-Grained Music Plagiarism Detection: Revealing Plagiarists through Bipartite Graph Matching and a Comprehensive Large-Scale Dataset

Authors: Wenxuan Liu, Tianyao He, Chen Gong, Ning Zhang, Hua Yang, Junchi Yan

Abstract: Music plagiarism detection is gaining more and more attention due to the popularity of music production and society's emphasis on intellectual property. We aim to find fine-grained plagiarism in music pairs, since conventional methods are coarse-grained and cannot match real-life scenarios. Considering that there is no sizeable dataset designed for the music plagiarism task, we establish a large-scale simulated dataset, named Music Plagiarism Detection Dataset (MPD-Set), under the guidance and expertise of renowned researchers from national-level professional institutions in the field of music. MPD-Set considers diverse music plagiarism cases found in real life at the melodic, rhythmic, and tonal levels. Further, we establish a Real-life Dataset for evaluation, where all plagiarism pairs are real cases. To detect the fine-grained plagiarism pairs effectively, we propose a graph-based method called Bipartite Melody Matching Detector (BMM-Det), which formulates the problem as a maximum matching problem in a bipartite graph. Experimental results on both the simulated and Real-life Datasets demonstrate that BMM-Det outperforms the existing plagiarism detection methods, and is robust to common plagiarism cases like transpositions, pitch shifts, duration variance, and melody change. Datasets and source code are open-sourced at https://github.com/xuan301/BMMDet_MPDSet.

Submitted 2 July, 2023; v1 submitted 21 July, 2021; originally announced July 2021.

arXiv:2107.09877 [pdf, other]

Melody Structure Transfer Network: Generating Music with Separable Self-Attention

Authors: Ning Zhang, Junchi Yan

Abstract: Symbolic music generation has attracted increasing attention, but most methods focus on generating short pieces (mostly fewer than 8 bars, and up to 32 bars). Generating long music calls for effective expression of a coherent musical structure. Despite their success on long sequences, self-attention architectures still have difficulty with long-term music, as it requires additional care for the subtle music structure. In this paper, we propose to transfer the structure of training samples to new music generation, and develop a novel separable self-attention based model that enables learning and transferring the structure embedding. We show that our transfer model can generate music sequences (up to 100 bars) with interpretable structures, which bear similar structures and composition techniques to the template music from the training set. Extensive experiments show its ability to generate music with the target structure and good diversity. The 3,000 generated sets of music are uploaded as supplemental material.

Submitted 21 July, 2021; originally announced July 2021.

arXiv:2107.02433 [pdf, other]

Double-Uncertainty Guided Spatial and Temporal Consistency Regularization Weighting for Learning-based Abdominal Registration

Authors: Zhe Xu, Jie Luo, Donghuan Lu, Jiangpeng Yan, Sarah Frisken, Jayender Jagadeesan, William Wells III, Xiu Li, Yefeng Zheng, Raymond Tong

Abstract: To tackle the difficulty associated with the ill-posed nature of the image registration problem, regularization is often used to constrain the solution space. For most learning-based registration approaches, the regularization usually has a fixed weight and only constrains the spatial transformation. This convention has two limitations: (i) besides the laborious grid search for the optimal fixed weight, the regularization strength for a specific image pair should depend on the content of the images, so the "one value fits all" training scheme is not ideal; (ii) regularizing the transformation only spatially may neglect some informative clues related to the ill-posedness. In this study, we propose a mean-teacher based registration framework, which incorporates an additional temporal consistency regularization term by encouraging the teacher model's prediction to be consistent with that of the student model. More importantly, instead of searching for a fixed weight, the teacher automatically adjusts the weights of the spatial regularization and the temporal consistency regularization by taking advantage of the transformation uncertainty and appearance uncertainty. Extensive experiments on the challenging abdominal CT-MRI registration show that our training strategy can advance the original learning-based method in terms of efficient hyperparameter tuning and a better tradeoff between accuracy and smoothness.

Submitted 2 March, 2022; v1 submitted 6 July, 2021; originally announced July 2021.

Comments: 11 pages

Showing 1–50 of 80 results for author: Yan, J