A New Solution for MU-MISO Symbol-Level Precoding: Extrapolation and Deep Unfolding

Authors: Mu Liang, Ang Li, Xiaoyan Hu, Christos Masouros

Abstract: Constructive interference (CI) precoding, which converts the harmful multi-user interference into beneficial signals, is a promising and efficient interference management scheme in multi-antenna communication systems. However, CI-based symbol-level precoding (SLP) experiences high computational complexity as the number of symbol slots increases within a transmission block, rendering it unaffordable in practical communication systems. In this paper, we propose a symbol-level extrapolation (SLE) strategy to extrapolate the precoding matrix by leveraging the relationship between different symbol slots within a transmission block, during which the channel state information (CSI) remains constant. Based on SLE, we design a closed-form iterative algorithm for both PSK and QAM modulation. In order to further reduce the computational complexity, a sub-optimal closed-form solution based on SLE is also developed for PSK and QAM, respectively. Moreover, we design an unsupervised SLE-based neural network (SLE-Net) to unfold the proposed iterative algorithm, which helps enhance the interpretability of the neural network. By carefully designing the loss function of the SLE-Net, the time complexity of the network can be reduced effectively. Extensive simulation results illustrate that the proposed algorithms can dramatically reduce the computational complexity and time complexity with only marginal performance loss, compared with the conventional SLP design methods.

Submitted 26 May, 2024; originally announced May 2024.

arXiv:2405.14802 [pdf, other]

Fast-DDPM: Fast Denoising Diffusion Probabilistic Models for Medical Image-to-Image Generation

Authors: Hongxu Jiang, Muhammad Imran, Linhai Ma, Teng Zhang, Yuyin Zhou, Muxuan Liang, Kuang Gong, Wei Shao

Abstract: Denoising diffusion probabilistic models (DDPMs) have achieved unprecedented success in computer vision. However, they remain underutilized in medical imaging, a field crucial for disease diagnosis and treatment planning. This is primarily due to the high computational cost associated with (1) the use of a large number of time steps (e.g., 1,000) in diffusion processes and (2) the increased dimensionality of medical images, which are often 3D or 4D. Training a diffusion model on medical images typically takes days to weeks, while sampling each image volume takes minutes to hours. To address this challenge, we introduce Fast-DDPM, a simple yet effective approach capable of improving training speed, sampling speed, and generation quality simultaneously. Unlike DDPM, which trains the image denoiser across 1,000 time steps, Fast-DDPM trains and samples using only 10 time steps. The key to our method lies in aligning the training and sampling procedures to optimize time-step utilization. Specifically, we introduced two efficient noise schedulers with 10 time steps: one with uniform time step sampling and another with non-uniform sampling. We evaluated Fast-DDPM across three medical image-to-image generation tasks: multi-image super-resolution, image denoising, and image-to-image translation. Fast-DDPM outperformed DDPM and current state-of-the-art methods based on convolutional networks and generative adversarial networks in all tasks. Additionally, Fast-DDPM reduced the training time to 0.2x and the sampling time to 0.01x compared to DDPM. Our code is publicly available at: https://github.com/mirthAI/Fast-DDPM.

Submitted 23 May, 2024; v1 submitted 23 May, 2024; originally announced May 2024.

arXiv:2404.09841 [pdf, other]

Anatomy of Industrial Scale Multilingual ASR

Authors: Francis McCann Ramirez, Luka Chkhetiani, Andrew Ehrenberg, Robert McHardy, Rami Botros, Yash Khare, Andrea Vanzo, Taufiquzzaman Peyash, Gabriel Oexle, Michael Liang, Ilya Sklyar, Enver Fakhan, Ahmed Etefy, Daniel McCrystal, Sam Flamini, Domenic Donato, Takuya Yoshioka

Abstract: This paper describes AssemblyAI's industrial-scale automatic speech recognition (ASR) system, designed to meet the requirements of large-scale, multilingual ASR serving various application needs. Our system leverages a diverse training dataset comprising unsupervised (12.5M hours), supervised (188k hours), and pseudo-labeled (1.6M hours) data across four languages. We provide a detailed description of our model architecture, consisting of a full-context 600M-parameter Conformer encoder pre-trained with BEST-RQ and an RNN-T decoder fine-tuned jointly with the encoder. Our extensive evaluation demonstrates competitive word error rates (WERs) against larger and more computationally expensive models, such as Whisper large and Canary-1B. Furthermore, our architectural choices yield several key advantages, including an improved code-switching capability, a 5x inference speedup compared to an optimized Whisper baseline, a 30% reduction in hallucination rate on speech data, and a 90% reduction in ambient noise compared to Whisper, along with significantly improved time-stamp accuracy. Throughout this work, we adopt a system-centric approach to analyzing various aspects of fully-fledged ASR models to gain practically relevant insights useful for real-world services operating at scale.

Submitted 16 April, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.07341 [pdf, other]

Conformer-1: Robust ASR via Large-Scale Semisupervised Bootstrapping

Authors: Kevin Zhang, Luka Chkhetiani, Francis McCann Ramirez, Yash Khare, Andrea Vanzo, Michael Liang, Sergio Ramirez Martin, Gabriel Oexle, Ruben Bousbib, Taufiquzzaman Peyash, Michael Nguyen, Dillon Pulliam, Domenic Donato

Abstract: This paper presents Conformer-1, an end-to-end Automatic Speech Recognition (ASR) model trained on an extensive dataset of 570k hours of speech audio data, 91% of which was acquired from publicly available sources. To achieve this, we perform Noisy Student Training after generating pseudo-labels for the unlabeled public data using a strong Conformer RNN-T baseline model. The addition of these pseudo-labeled data results in remarkable improvements in relative Word Error Rate (WER) by 11.5% and 24.3% for our asynchronous and realtime models, respectively. Additionally, the model is more robust to background noise owing to the addition of these data. The results obtained in this study demonstrate that the incorporation of pseudo-labeled publicly available data is a highly effective strategy for improving ASR accuracy and noise robustness.

Submitted 12 April, 2024; v1 submitted 10 April, 2024; originally announced April 2024.

arXiv:2402.11954 [pdf, other]

Multimodal Emotion Recognition from Raw Audio with Sinc-convolution

Authors: Xiaohui Zhang, Wenjie Fu, Mangui Liang

Abstract: Speech Emotion Recognition (SER) is still a complex task for computers, with average recall rates usually about 70% on the most realistic datasets. Most SER systems use hand-crafted features extracted from the audio signal, such as energy, zero crossing rate, spectral information, prosodic features, mel frequency cepstral coefficients (MFCC), and so on. More recently, training neural networks on the raw waveform has become an emerging trend. This approach is advantageous as it eliminates the feature extraction pipeline. Learning from the time-domain signal has shown good results for tasks such as speech recognition and speaker verification. In this paper, we utilize a Sinc-convolution layer, which is an efficient architecture for preprocessing raw speech waveforms for emotion recognition, to extract acoustic features from raw audio signals, followed by a long short-term memory (LSTM) network. We also incorporate linguistic features and append a dialogical emotion decoding (DED) strategy. Our approach achieves a weighted accuracy of 85.1% in four-class emotion recognition on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) dataset.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2402.11931 [pdf, other]

Soft-Weighted CrossEntropy Loss for Continuous Alzheimer's Disease Detection

Authors: Xiaohui Zhang, Wenjie Fu, Mangui Liang

Abstract: Alzheimer's disease is a common cognitive disorder in the elderly. Early and accurate diagnosis of Alzheimer's disease (AD) has a major impact on the progress of research on dementia. At present, researchers have used machine learning methods to detect Alzheimer's disease from the speech of participants. However, the recognition accuracy of current methods is unsatisfactory, and most of them focus on using low-dimensional handcrafted features to extract relevant information from audio. This paper proposes an Alzheimer's disease detection system based on the pre-trained framework Wav2vec 2.0 (Wav2vec2). In addition, by replacing the loss function with the Soft-Weighted CrossEntropy loss function, we achieved 85.45% recognition accuracy on the same test dataset.

Submitted 19 February, 2024; originally announced February 2024.

arXiv:2312.15564 [pdf, ps, other]

A Belief Propagation Approach for Direct Multipath-Based SLAM

Authors: Mingchao Liang, Erik Leitinger, Florian Meyer

Abstract: In this work, we develop a multipath-based simultaneous localization and mapping (SLAM) method that can directly be applied to received radio signals. In existing multipath-based SLAM approaches, a channel estimator is used as a preprocessing stage that reduces data flow and computational complexity by extracting features related to multipath components (MPCs). We aim to avoid any preprocessing stage that may lead to a loss of relevant information. The presented method relies on a new statistical model for the data generation process of the received radio signal that can be represented by a factor graph. This factor graph is the starting point for the development of an efficient belief propagation (BP) method for multipath-based SLAM that directly uses received radio signals as measurements. Simulation results in a realistic scenario with a single-input single-output (SISO) channel demonstrate that the proposed direct method for radio-based SLAM outperforms state-of-the-art methods that rely on a channel estimator.

Submitted 24 December, 2023; originally announced December 2023.

arXiv:2310.18529 [pdf, other]

FPM-INR: Fourier ptychographic microscopy image stack reconstruction using implicit neural representations

Authors: Haowen Zhou, Brandon Y. Feng, Haiyun Guo, Siyu Lin, Mingshu Liang, Christopher A. Metzler, Changhuei Yang

Abstract: Image stacks provide invaluable 3D information in various biological and pathological imaging applications. Fourier ptychographic microscopy (FPM) enables reconstructing high-resolution, wide field-of-view image stacks without z-stack scanning, thus significantly accelerating image acquisition. However, existing FPM methods take tens of minutes to reconstruct and gigabytes of memory to store a high-resolution volumetric scene, impeding fast gigapixel-scale remote digital pathology. While deep learning approaches have been explored to address this challenge, existing methods poorly generalize to novel datasets and can produce unreliable hallucinations. This work presents FPM-INR, a compact and efficient framework that integrates physics-based optical models with implicit neural representations (INR) to represent and reconstruct FPM image stacks. FPM-INR is agnostic to system design or sample types and does not require external training data. In our demonstrated experiments, FPM-INR substantially outperforms traditional FPM algorithms with up to a 25-fold increase in speed and an 80-fold reduction in memory usage for continuous image stack representations.

Submitted 31 October, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

Comments: Project Page: https://hwzhou2020.github.io/FPM-INR-Web/

arXiv:2307.08323 [pdf, other]

TST: Time-Sparse Transducer for Automatic Speech Recognition

Authors: Xiaohui Zhang, Mangui Liang, Zhengkun Tian, Jiangyan Yi, Jianhua Tao

Abstract: End-to-end models, especially the Recurrent Neural Network Transducer (RNN-T), have achieved great success in speech recognition. However, the transducer requires a large memory footprint and long computing time when processing a long decoding sequence. To solve this problem, we propose a model named time-sparse transducer, which introduces a time-sparse mechanism into the transducer. In this mechanism, we obtain the intermediate representations by reducing the time resolution of the hidden states. Then a weighted average algorithm is used to combine these representations into sparse hidden states, which are passed to the decoder. All the experiments are conducted on the Mandarin dataset AISHELL-1. The character error rate of the time-sparse transducer is close to that of RNN-T, and the real-time factor is 50.00% of the original. By adjusting the time resolution, the time-sparse transducer can also reduce the real-time factor to 16.54% of the original at the expense of a 4.94% loss of precision.

Submitted 17 July, 2023; originally announced July 2023.

Comments: 10 pages

Journal ref: International Conference on Artificial Intelligence (CICAI 2023)

arXiv:2307.00765 [pdf, ps, other]

A BP Method for Track-Before-Detect

Authors: Mingchao Liang, Thomas Kropfreiter, Florian Meyer

Abstract: Tracking an unknown number of low-observable objects is notoriously challenging. This letter proposes a sequential Bayesian estimation method based on the track-before-detect (TBD) approach. In TBD, raw sensor measurements are directly used by the tracking algorithm without any preprocessing. Our proposed method is based on a new statistical model that introduces a new object hypothesis for each data cell of the raw sensor measurements. It allows objects to interact and contribute to more than one data cell. Based on the factor graph representing our statistical model, we derive the message passing equations of the proposed belief propagation (BP) method for TBD. Approximations are applied to certain BP messages to reduce computational complexity and improve scalability. In a simulation experiment, our proposed BP-based TBD method outperforms two other state-of-the-art TBD methods.

Submitted 3 July, 2023; originally announced July 2023.

arXiv:2305.19956 [pdf, other]

doi 10.1016/j.compmedimag.2024.102326

MicroSegNet: A Deep Learning Approach for Prostate Segmentation on Micro-Ultrasound Images

Authors: Hongxu Jiang, Muhammad Imran, Preethika Muralidharan, Anjali Patel, Jake Pensa, Muxuan Liang, Tarik Benidir, Joseph R. Grajo, Jason P. Joseph, Russell Terry, John Michael DiBianco, Li-Ming Su, Yuyin Zhou, Wayne G. Brisbane, Wei Shao

Abstract: Micro-ultrasound (micro-US) is a novel 29-MHz ultrasound technique that provides 3-4 times higher resolution than traditional ultrasound, potentially enabling low-cost, accurate diagnosis of prostate cancer. Accurate prostate segmentation is crucial for prostate volume measurement, cancer diagnosis, prostate biopsy, and treatment planning. However, prostate segmentation on micro-US is challenging due to artifacts and indistinct borders between the prostate, bladder, and urethra in the midline. This paper presents MicroSegNet, a multi-scale annotation-guided transformer UNet model designed specifically to tackle these challenges. During the training process, MicroSegNet focuses more on regions that are hard to segment (hard regions), characterized by discrepancies between expert and non-expert annotations. We achieve this by proposing an annotation-guided binary cross entropy (AG-BCE) loss that assigns a larger weight to prediction errors in hard regions and a lower weight to prediction errors in easy regions. The AG-BCE loss was seamlessly integrated into the training process through the utilization of multi-scale deep supervision, enabling MicroSegNet to capture global contextual dependencies and local information at various scales. We trained our model using micro-US images from 55 patients, followed by evaluation on 20 patients. Our MicroSegNet model achieved a Dice coefficient of 0.939 and a Hausdorff distance of 2.02 mm, outperforming several state-of-the-art segmentation methods, as well as three human annotators with different experience levels. Our code is publicly available at https://github.com/mirthAI/MicroSegNet and our dataset is publicly available at https://zenodo.org/records/10475293.

Submitted 25 January, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

Journal ref: Computerized Medical Imaging and Graphics (2024): 102326

arXiv:2305.19939 [pdf, other]

Image Registration of In Vivo Micro-Ultrasound and Ex Vivo Pseudo-Whole Mount Histopathology Images of the Prostate: A Proof-of-Concept Study

Authors: Muhammad Imran, Brianna Nguyen, Jake Pensa, Sara M. Falzarano, Anthony E. Sisk, Muxuan Liang, John Michael DiBianco, Li-Ming Su, Yuyin Zhou, Wayne G. Brisbane, Wei Shao

Abstract: Early diagnosis of prostate cancer significantly improves a patient's 5-year survival rate. Biopsy of small prostate cancers is improved with image-guided biopsy. MRI-ultrasound fusion-guided biopsy is sensitive to smaller tumors but is underutilized due to the high cost of MRI and fusion equipment. Micro-ultrasound (micro-US), a novel high-resolution ultrasound technology, provides a cost-effective alternative to MRI while delivering comparable diagnostic accuracy. However, the interpretation of micro-US is challenging due to subtle gray scale changes indicating cancer vs normal tissue. This challenge can be addressed by training urologists with a large dataset of micro-US images containing the ground truth cancer outlines. Such a dataset can be mapped from surgical specimens (histopathology) onto micro-US images via image registration. In this paper, we present a semi-automated pipeline for registering in vivo micro-US images with ex vivo whole-mount histopathology images. Our pipeline begins with the reconstruction of pseudo-whole-mount histopathology images and a 3-dimensional (3D) micro-US volume. Each pseudo-whole-mount histopathology image is then registered with the corresponding axial micro-US slice using a two-stage approach that estimates an affine transformation followed by a deformable transformation. We evaluated our registration pipeline using micro-US and histopathology images from 18 patients who underwent radical prostatectomy. The results showed a Dice coefficient of 0.94 and a landmark error of 2.7 mm, indicating the accuracy of our registration pipeline. This proof-of-concept study demonstrates the feasibility of accurately aligning micro-US and histopathology images. To promote transparency and collaboration in research, we will make our code and dataset publicly available.

Submitted 16 June, 2023; v1 submitted 31 May, 2023; originally announced May 2023.

arXiv:2303.04432 [pdf, ps, other]

Deep Learning-Based Channel Extrapolation for Pattern Reconfigurable Massive MIMO

Authors: Mu Liang, Ang Li

Abstract: Reconfigurable antennas that can dynamically change their operation state exhibit excellent adaptivity and flexibility over traditional antennas, and MIMO arrays that consist of multifunctional and reconfigurable antennas (MRAs) are foreseen as one promising solution towards future Holographic MIMO. Specifically, in pattern reconfigurable MIMO (PR-MIMO) communication systems, accurate acquisition of channel state information (CSI) of all the radiation modes is a challenging task, because using conventional pilot-based channel estimation techniques in PR-MIMO systems incurs overwhelming pilot overheads. In this letter, we leverage deep learning methods to design a PR neural network, which can use the estimated CSI for one radiation mode to infer CSIs for the other radiation modes. In order to reduce the pilot overheads, we propose a new channel estimation method specifically for PR-MIMO systems, which divides the transmit antennas of PR-MIMO into groups, with antennas in different groups employing different radiation modes. Compared with conventional fully-connected real-valued deep neural networks (DNN), the PR neural network, which uses complex-valued coefficients, can work directly in the complex domain. Experimental results show that the proposed channel extrapolation method offers significant performance gains in terms of extrapolation accuracy over benchmark schemes.

Submitted 6 April, 2023; v1 submitted 8 March, 2023; originally announced March 2023.

arXiv:2212.08340 [pdf, ps, other]

Neural Enhanced Belief Propagation for Multiobject Tracking

Authors: Mingchao Liang, Florian Meyer

Abstract: Algorithmic solutions for multi-object tracking (MOT) are a key enabler for applications in autonomous navigation and applied ocean sciences. State-of-the-art MOT methods fully rely on a statistical model and typically use preprocessed sensor data as measurements. In particular, measurements are produced by a detector that extracts potential object locations from the raw sensor data collected for a discrete time step. This preparatory processing step reduces data flow and computational complexity but may result in a loss of information. State-of-the-art Bayesian MOT methods that are based on belief propagation (BP) systematically exploit graph structures of the statistical model to reduce computational complexity and improve scalability. However, as a fully model-based approach, BP can only provide suboptimal estimates when there is a mismatch between the statistical model and the true data-generating process. Existing BP-based MOT methods can further only make use of preprocessed measurements. In this paper, we introduce a variant of BP that combines model-based with data-driven MOT. The proposed neural enhanced belief propagation (NEBP) method complements the statistical model of BP by information learned from raw sensor data. This approach conjectures that the learned information can reduce model mismatch and thus improve data association and false alarm rejection. Our NEBP method improves tracking performance compared to model-based methods. At the same time, it inherits the advantages of BP-based MOT, i.e., it scales only quadratically in the number of objects, and it can thus generate and maintain a large number of object tracks. We evaluate the performance of our NEBP approach for MOT on the nuScenes autonomous driving dataset and demonstrate that it has state-of-the-art performance.

Submitted 16 December, 2022; originally announced December 2022.

arXiv:2212.06414 [pdf, other]

Even Order Explicit Symplectic Geometric Algorithms for Quaternion Kinematical Differential Equation in Guidance Navigation and Control via Diagonal Padé Approximation and Cayley Transform

Authors: Hong-Yan Zhang, Fei Liu, Yu Zhou, Man Liang

Abstract: The quaternion kinematical differential equation (QKDE) plays a key role in navigation, control and guidance systems. Although explicit symplectic geometric algorithms (ESGA) for this problem are available, there is no unified way of constructing high-order symplectic difference schemes with a configurable order parameter. We present even order explicit symplectic geometric algorithms to solve the QKDE with the diagonal Padé approximation and the Cayley transform. The maximum absolute error for solving the QKDE is $\mathcal{O}(\tau^{2\ell})$, where $\tau$ is the time step and $\ell$ is the order parameter. The linear time complexity and constant space complexity of computation, as well as the simple algorithmic structure, show that our algorithms are appropriate for realtime applications in aeronautics, astronautics, robotics, visual-inertial odometry and so on. The performance of the proposed algorithms is verified and validated by mathematical analysis and numerical simulation.

Submitted 12 January, 2023; v1 submitted 13 December, 2022; originally announced December 2022.

arXiv:2206.09746 [pdf, other]

Data Fusion for Radio Frequency SLAM with Robust Sampling

Authors: Erik Leitinger, Bryan Teague, Wenyu Zhang, Mingchao Liang, Florian Meyer

Abstract: Precise indoor localization remains a challenging problem for a variety of essential applications. A promising approach to address this problem is to exchange radio signals between mobile agents and static physical anchors (PAs) that bounce off flat surfaces in the indoor environment. Radio frequency simultaneous localization and mapping (RF-SLAM) methods can be used to jointly estimate the time-varying location of agents as well as the static locations of the flat surfaces. Recent work on RF-SLAM methods has shown that each surface can be efficiently represented by a single master virtual anchor (MVA). The measurement model related to this MVA-based RF-SLAM method is highly nonlinear. Thus, Bayesian estimation relies on sampling-based techniques. The original MVA-based RF-SLAM method employs conventional "bootstrap" sampling. In challenging scenarios it was observed that the original method might converge to incorrect MVA positions corresponding to local maxima. In this paper, we introduce MVA-based RF-SLAM with an improved sampling technique that succeeds in the aforementioned challenging scenarios. Our simulation results demonstrate significant performance advantages.

Submitted 20 June, 2022; originally announced June 2022.

Comments: published at FUSION 2022

arXiv:2203.09948 [pdf, ps, other]

Neural Enhanced Belief Propagation for Data Association in Multiobject Tracking

Authors: Mingchao Liang, Florian Meyer

Abstract: Situation-aware technologies enabled by multiobject tracking (MOT) methods will create new services and applications in fields such as autonomous navigation and applied ocean sciences. Belief propagation (BP) is a state-of-the-art method for Bayesian MOT but fully relies on a statistical model and preprocessed sensor measurements. In this paper, we establish a hybrid method for model-based and data-driven MOT. The proposed neural enhanced belief propagation (NEBP) approach complements BP by information learned from raw sensor data with the goal to improve data association and to reject false alarm measurements. We evaluate the performance of our NEBP approach for MOT on the nuScenes autonomous driving dataset and demonstrate that it can outperform state-of-the-art reference methods.

Submitted 15 June, 2022; v1 submitted 16 March, 2022; originally announced March 2022.

arXiv:2105.12903 [pdf, ps, other]

Neural Enhanced Belief Propagation for Cooperative Localization

Authors: Mingchao Liang, Florian Meyer

Abstract: Location-aware networks will introduce innovative services and applications for modern convenience, applied ocean sciences, and public safety. In this paper, we establish a hybrid method for model-based and data-driven inference. We consider a cooperative localization (CL) scenario where the mobile agents in a wireless network aim to localize themselves by performing pairwise observations with other agents and by exchanging location information. A traditional method for distributed CL in large agent networks is belief propagation (BP) which is completely model-based and is known to suffer from providing inconsistent (overconfident) estimates. The proposed approach addresses these limitations by complementing BP with learned information provided by a graph neural network (GNN). We demonstrate numerically that our method can improve estimation accuracy and avoid overconfident beliefs, while its computational complexity remains comparable to BP. Notably, more consistent beliefs are obtained by not explicitly addressing overconfidence in the loss function used for training of the GNN.

Submitted 26 May, 2021; originally announced May 2021.

arXiv:2102.10260 [pdf, other]

Wireless sensor network for in situ soil moisture monitoring

Authors: Jianing Fang, Chuheng Hu, Nour Smaoui, Doug Carlson, Jayant Gupchup, Razvan Musaloiu-E., Chieh-Jan Mike Liang, Marcus Chang, Omprakash Gnawali, Tamas Budavari, Andreas Terzis, Katalin Szlavecz, Alexander S. Szalay

Abstract: We discuss the history and lessons learned from a series of deployments of environmental sensors measuring soil parameters and CO2 fluxes over the last fifteen years, in an outdoor environment. We present the hardware and software architecture of our current Gen-3 system, and then discuss how we are simplifying the user-facing part of the software, to make it easier and friendlier for the environmental scientist to be in full control of the system. Finally, we describe the current effort to build a large-scale Gen-4 sensing platform consisting of hundreds of nodes to track the environmental parameters for urban green spaces in Baltimore, Maryland.

Submitted 20 February, 2021; originally announced February 2021.

Comments: 12 pages, 16 figures, Sensornets 2021 Conference

arXiv:2005.05288 [pdf, other]

doi 10.1364/PRJ.419886

Non-iterative complex wave-field reconstruction based on Kramers-Kronig relations

Authors: Cheng Shen, An Pan, Mingshu Liang, Changhuei Yang

Abstract: A new computational imaging method to reconstruct the complex wave-field is reported. Due to the existence of the zero-frequency component, the signal measured under amplitude modulation of the pupil has a spectrum similar to that of an off-axis hologram. The mathematical analogy between them is established in this paper. Based on this observation and the analyticity of band-limited signals under any diffraction-limited system, an algorithm based on Kramers-Kronig (KK) relations is utilized to recover the phase information from the intensity patterns alone. From the sensing side, as few as two measurements are required. From the reconstruction algorithm side, our method is iteration-free and parameter-free, and makes no assumption about sample characteristics. It has several advantages over existing phase imaging methods and could provide a unique perspective for understanding current computational imaging methods.

Submitted 11 May, 2020; originally announced May 2020.

arXiv:2004.01407 [pdf]

doi 10.1109/TSG.2020.3025259

FeederGAN: Synthetic Feeder Generation via Deep Graph Adversarial Nets

Authors: Ming Liang, Yao Meng, Jiyu Wang, David Lubkeman, Ning Lu

Abstract: This paper presents a novel, automated, generative adversarial networks (GAN) based synthetic feeder generation mechanism, abbreviated as FeederGAN. FeederGAN digests real feeder models represented by directed graphs via a deep learning framework powered by GAN and graph convolutional networks (GCN). Information of a distribution feeder circuit is extracted from its model input files so that the device connectivity is mapped onto the adjacency matrix and the device characteristics, such as circuit types (i.e., 3-phase, 2-phase, and 1-phase) and component attributes (e.g., length and current ratings), are mapped onto the attribute matrix. Then, Wasserstein distance is used to optimize the GAN and GCN is used to discriminate the generated graphs from the actual ones. A greedy method based on graph theory is developed to reconstruct the feeder using the generated adjacency and attribute matrices. Our results show that the GAN generated feeders resemble the actual feeder in both topology and attributes verified by visual inspection and by empirical statistics obtained from actual distribution feeders.

Submitted 16 September, 2020; v1 submitted 3 April, 2020; originally announced April 2020.

Comments: Accepted by IEEE Trans. on Smart Grid

Showing 1–21 of 21 results for author: Liang, M