
A point cloud processing method of mmWave radar over automotive scenario

Authors: Qingmian Wan, Hongli Peng, Xing Liao, Kuayue Liu

Abstract: This paper introduces in detail the effective method of comprehensive target judgment by using radar RA map and point cloud map. Different output of radar can effectively judge the road boundary of target and the relative coordinates of target, avoid the error of output caused by excessive processing information, and greatly improve the processing efficiency of DBSCAN of the measured target.

Submitted 23 March, 2024; originally announced April 2024.

arXiv:2402.09747 [pdf, other]

Less is more: Ensemble Learning for Retinal Disease Recognition Under Limited Resources

Authors: Jiahao Wang, Hong Peng, Shengchao Chen, Sufen Ren

Abstract: Retinal optical coherence tomography (OCT) images provide crucial insights into the health of the posterior ocular segment. Therefore, the advancement of automated image analysis methods is imperative to equip clinicians and researchers with quantitative data, thereby facilitating informed decision-making. The application of deep learning (DL)-based approaches has gained extensive traction for executing these analysis tasks, demonstrating remarkable performance compared to labor-intensive manual analyses. However, the acquisition of Retinal OCT images often presents challenges stemming from privacy concerns and the resource-intensive labeling procedures, which contradicts the prevailing notion that DL models necessitate substantial data volumes for achieving superior performance. Moreover, limitations in available computational resources constrain the progress of high-performance medical artificial intelligence, particularly in less developed regions and countries. This paper introduces a novel ensemble learning mechanism designed for recognizing retinal diseases under limited resources (e.g., data, computation). The mechanism leverages insights from multiple pre-trained models, facilitating the transfer and adaptation of their knowledge to Retinal OCT images. This approach establishes a robust model even when confronted with limited labeled data, eliminating the need for an extensive array of parameters, as required in learning from scratch. Comprehensive experimentation on real-world datasets demonstrates that the proposed approach can achieve superior performance in recognizing Retinal OCT images, even when dealing with exceedingly restricted labeled datasets. Furthermore, this method obviates the necessity of learning extensive-scale parameters, making it well-suited for deployment in low-resource scenarios.

Submitted 15 February, 2024; originally announced February 2024.

Comments: Ongoing work

arXiv:2402.00320 [pdf]

DARCS: Memory-Efficient Deep Compressed Sensing Reconstruction for Acceleration of 3D Whole-Heart Coronary MR Angiography

Authors: Zhihao Xue, Fan Yang, Juan Gao, Zhuo Chen, Hao Peng, Chao Zou, Hang **, Chenxi Hu

Abstract: Three-dimensional coronary magnetic resonance angiography (CMRA) demands reconstruction algorithms that can significantly suppress the artifacts from a heavily undersampled acquisition. While unrolling-based deep reconstruction methods have achieved state-of-the-art performance on 2D image reconstruction, their application to 3D reconstruction is hindered by the large amount of memory needed to train an unrolled network. In this study, we propose a memory-efficient deep compressed sensing method by employing a sparsifying transform based on a pre-trained artifact estimation network. The motivation is that the artifact image estimated by a well-trained network is sparse when the input image is artifact-free, and less sparse when the input image is artifact-affected. Thus, the artifact-estimation network can be used as an inherent sparsifying transform. The proposed method, named De-Aliasing Regularization based Compressed Sensing (DARCS), was compared with a traditional compressed sensing method, de-aliasing generative adversarial network (DAGAN), model-based deep learning (MoDL), and plug-and-play for accelerations of 3D CMRA. The results demonstrate that the proposed method improved the reconstruction quality relative to the compared methods by a large margin. Furthermore, the proposed method generalized well to different undersampling rates and noise levels. The memory usage of the proposed method was only 63% of that needed by MoDL. In conclusion, the proposed method achieves improved reconstruction quality for 3D CMRA with reduced memory burden.

Submitted 2 February, 2024; v1 submitted 31 January, 2024; originally announced February 2024.

Comments: 10 pages, 8 figures

arXiv:2401.16714 [pdf]

A Point Cloud Enhancement Method for 4D mmWave Radar Imagery

Authors: Qingmian Wan, Hongli Peng, Xing Liao, Kuayue Liu, Junfa Mao

Abstract: A point cloud enhancement method for 4D mmWave radar imagery is proposed in this paper. Based on the patch antenna and MIMO array theories, the MIMO array with small redundancy and high SNR is designed to provide the probability of high angular resolution and detection rate. The antenna array is deployed using a ladder shape in vertical direction to decrease the redundancy and improve the resolution in horizontal direction under the constraints of physical factors. Considering the complicated environment of the real world with non-uniform distributed clutters, the dynamic detection method is used to solve the weak target sensing problem. The window size of CFAR detector is assumed variant to be determined using optimization method, making it adaptive to different environments especially when weak targets exist. The angular resolution increase using FT-based DOA method and the designed antenna array is described, which provides the basis of accurate detection and dense point cloud. To verify the performance of the proposed method, experiments of simulations and practical measurements are carried out, whose results show that the accuracy and the point cloud density are improved in comparison with the original manufacturer mmWave radar of TI AWR2243.

Submitted 29 January, 2024; originally announced January 2024.
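Note on the CFAR step mentioned in the abstract above: the abstract says the CFAR window size is treated as a variable chosen by optimization, but gives no details. The following is only a minimal sketch of a generic one-dimensional cell-averaging CFAR (CA-CFAR) detector with the window size exposed as a parameter. It is not the authors' adaptive method; the function name `ca_cfar` and the parameters `num_train`, `num_guard`, and `scale` are hypothetical names chosen for illustration.

```python
def ca_cfar(power, num_train=8, num_guard=2, scale=4.0):
    """Generic 1D cell-averaging CFAR (illustrative; not the paper's method).

    power     : list of non-negative power values, one per range cell
    num_train : total number of training cells (half on each side)
    num_guard : guard cells on each side of the cell under test
    scale     : threshold multiplier applied to the estimated noise level
    Returns the indices of cells declared as detections.
    """
    half_train = num_train // 2
    detections = []
    for i in range(len(power)):
        # Training cells sit outside the guard cells on both sides.
        left = power[max(0, i - num_guard - half_train):max(0, i - num_guard)]
        right = power[i + num_guard + 1:i + num_guard + 1 + half_train]
        train = left + right
        if not train:
            continue  # no training cells available for this cell
        noise = sum(train) / len(train)  # local noise estimate
        if power[i] > scale * noise:
            detections.append(i)
    return detections


# Flat noise floor with one strong target at cell 20.
profile = [1.0] * 40
profile[20] = 30.0
print(ca_cfar(profile))
```

A larger `num_train` smooths the noise estimate but reacts more slowly to non-uniform clutter, which is presumably why a variable window is attractive in the setting the abstract describes.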

arXiv:2401.14949 [pdf]

Renewable energy exporting consumption-oriented transfer limit switching control: A unsupervised learning-based method

Authors: Gao Qiu, Hao** Peng, Youbo Liu, Tingjian Liu, Junyong Liu

Abstract: A method for generating unsupervised conditional mapping rules for multi-inter-corridor transfer limits and their integration into unit commitment through banding-switching is proposed in this paper. The method starts by using Ant colony clustering (ACC) to identify different operating modes with renewable energy penetration. For each sub-pattern, coupling inter-corridors are determined using correlation coefficients. An algorithm for constructing coupled inter-corridors' limits boundaries, employing grid partitioning, is proposed to establish conditional mappings from sub-patterns to multi-inter-corridor limits. Additionally, a banding matching model is proposed, incorporating distance criteria and the Big-M method. It also includes a limit-switching method based on Lagrange multipliers. Case studies on the IEEE 39-node system illustrate the effectiveness of this method in increasing consumption of renewable energy and reducing operational costs while adhering to stability verification requirements.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2312.15946 [pdf, other]

EnchantDance: Unveiling the Potential of Music-Driven Dance Movement

Authors: Bo Han, Yi Ren, Hao Peng, Teng Zhang, Zeyu Ling, Xiang Yin, Feilin Han

Abstract: The task of music-driven dance generation involves creating coherent dance movements that correspond to the given music. While existing methods can produce physically plausible dances, they often struggle to generalize to out-of-set data. The challenge arises from three aspects: 1) the high diversity of dance movements and significant differences in the distribution of music modalities, which make it difficult to generate music-aligned dance movements. 2) the lack of a large-scale music-dance dataset, which hinders the generation of generalized dance movements from music. 3) The protracted nature of dance movements poses a challenge to the maintenance of a consistent dance style. In this work, we introduce the EnchantDance framework, a state-of-the-art method for dance generation. Due to the redundancy of the original dance sequence along the time axis, EnchantDance first constructs a strong dance latent space and then trains a dance diffusion model on the dance latent space. To address the data gap, we construct a large-scale music-dance dataset, ChoreoSpectrum3D Dataset, which includes four dance genres and has a total duration of 70.32 hours, making it the largest reported music-dance dataset to date. To enhance consistency between music genre and dance style, we pre-train a music genre prediction network using transfer learning and incorporate music genre as extra conditional information in the training of the dance diffusion model. Extensive experiments demonstrate that our proposed framework achieves state-of-the-art performance on dance quality, diversity, and consistency.

Submitted 26 December, 2023; originally announced December 2023.

arXiv:2312.12789 [pdf, other]

SLP-Net:An efficient lightweight network for segmentation of skin lesions

Authors: Bo Yang, Hong Peng, Chenggang Guo, Xiaohui Luo, Jun Wang, Xianzhong Long

Abstract: Prompt treatment for melanoma is crucial. To assist physicians in identifying lesion areas precisely in a quick manner, we propose a novel skin lesion segmentation technique namely SLP-Net, an ultra-lightweight segmentation network based on the spiking neural P (SNP) systems type mechanism. Most existing convolutional neural networks achieve high segmentation accuracy while neglecting the high hardware cost. SLP-Net, on the contrary, has a very small number of parameters and a high computation speed. We design a lightweight multi-scale feature extractor without the usual encoder-decoder structure. Rather than a decoder, a feature adaptation module is designed to replace it and implement multi-scale information decoding. Experiments at the ISIC2018 challenge demonstrate that the proposed model has the highest Acc and DSC among the state-of-the-art methods, while experiments on the PH2 dataset also demonstrate a favorable generalization ability. Finally, we compare the computational complexity as well as the computational speed of the models in experiments, where SLP-Net has the highest overall superiority.

Submitted 4 January, 2024; v1 submitted 20 December, 2023; originally announced December 2023.

arXiv:2303.17964 [pdf]

doi 10.1364/OPTICA.484200

Slice-Less Optical Arbitrary Waveform Measurement (OAWM) in a Bandwidth of More than 600 GHz Using Soliton Microcombs

Authors: Daniel Drayss, Dengyang Fang, Christoph Füllner, Grigory Lihachev, Thomas Henauer, Yung Chen, Huanfa Peng, Pablo Marin-Palomo, Thomas Zwick, Wolfgang Freude, Tobias J. Kippenberg, Sebastian Randel, Christian Koos

Abstract: We propose and demonstrate a novel scheme for optical arbitrary waveform measurement (OAWM) that exploits chip-scale Kerr soliton combs as highly scalable multiwavelength local oscillators (LO) for ultra-broadband full-field waveform acquisition. In contrast to earlier concepts, our approach does not require any optical slicing filters and thus lends itself to efficient implementation on state-of-the-art high-index-contrast integration platforms such as silicon photonics. The scheme allows to measure truly arbitrary waveforms with high accuracy, based on a dedicated system model which is calibrated by means of a femtosecond laser with known pulse shape. We demonstrated the viability of the approach in a proof-of-concept experiment by capturing an optical waveform that contains multiple 16 QAM and 64 QAM wavelength-division multiplexed (WDM) data signals with symbol rates of up to 80 GBd, reaching overall line rates of up to 1.92 Tbit/s within an optical acquisition bandwidth of 610 GHz. To the best of our knowledge, this is the highest bandwidth that has so far been demonstrated in an OAWM experiment.

Submitted 31 March, 2023; originally announced March 2023.

arXiv:2301.07269 [pdf, other]

Parallel Multi-Extended State Observers based ADRC with Application to High-Speed Precision Motion Stage

Authors: Guojie Tang, Wenchao Xue, Hao Peng, Yanlong Zhao, Zhijun Yang

Abstract: In this paper, the parallel multi-extended state observers (ESOs) based active disturbance rejection control approach is proposed to achieve desired tracking performance by automatically selecting the estimation values leading to the least tracking error. First, the relationship between the estimation error of ESO and the tracking error of output is quantitatively studied for single ESO with general order. In particular, the algorithm for calculating the tracking error caused by single ESO's estimation error is constructed. Moreover, by timely evaluating the least tracking error caused by different ESOs, a novel switching ADRC approach with parallel multi-ESOs is proposed. In addition, the stability of the algorithm is rigorously proved. Furthermore, the proposed ADRC is applied to the high-speed precision motion stage which has large nonlinear uncertainties and elastic deformation disturbances near the dead zone of friction. The experimental results show that the parallel multi-ESOs based ADRC has higher tracking performance than the traditional single ESO based ADRC.

Submitted 17 January, 2023; originally announced January 2023.

Comments: 10 pages, 9 figures

arXiv:2212.08653 [pdf, other]

Attentive Mask CLIP

Authors: Yifan Yang, Weiquan Huang, Yixuan Wei, Houwen Peng, Xinyang Jiang, Huiqiang Jiang, Fangyun Wei, Yin Wang, Han Hu, Lili Qiu, Yuqing Yang

Abstract: Image token removal is an efficient augmentation strategy for reducing the cost of computing image features. However, this efficient augmentation strategy has been found to adversely affect the accuracy of CLIP-based training. We hypothesize that removing a large portion of image tokens may improperly discard the semantic content associated with a given text description, thus constituting an incorrect pairing target in CLIP training. To address this issue, we propose an attentive token removal approach for CLIP training, which retains tokens with a high semantic correlation to the text description. The correlation scores are computed in an online fashion using the EMA version of the visual encoder. Our experiments show that the proposed attentive masking approach performs better than the previous method of random token removal for CLIP training. The approach also makes it efficient to apply multiple augmentation views to the image, as well as introducing instance contrastive learning tasks between these views into the CLIP framework. Compared to other CLIP improvements that combine different pre-training targets such as SLIP and MaskCLIP, our method is not only more effective, but also much more efficient. Specifically, using ViT-B and YFCC-15M dataset, our approach achieves $43.9\%$ top-1 accuracy on ImageNet-1K zero-shot classification, as well as $62.7/42.1$ and $38.0/23.2$ I2T/T2I retrieval accuracy on Flickr30K and MS COCO, which are $+1.1\%$, $+5.5/+0.9$, and $+4.4/+1.3$ higher than the SLIP method, while being $2.30\times$ faster. An efficient version of our approach running $1.16\times$ faster than the plain CLIP model achieves significant gains of $+5.3\%$, $+11.3/+8.0$, and $+9.5/+4.9$ on these benchmarks.

Submitted 9 October, 2023; v1 submitted 16 December, 2022; originally announced December 2022.

Journal ref: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), 2023, pp. 2771-2781
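Note on the attentive token removal described in the abstract above: the abstract states that tokens with high correlation to the text are kept and that scores come from an EMA copy of the visual encoder, without giving implementation details. The sketch below only illustrates those two generic ingredients on plain Python lists: an exponential-moving-average weight update and a top-scoring token selection. The function names `ema_update` and `keep_top_tokens`, the `decay` and `keep_ratio` parameters, and the example scores are all assumptions for illustration, not the paper's code.

```python
def ema_update(ema_weights, weights, decay=0.99):
    """Blend current weights into their exponential moving average."""
    return [decay * e + (1.0 - decay) * w for e, w in zip(ema_weights, weights)]


def keep_top_tokens(scores, keep_ratio=0.5):
    """Return the indices of the highest-scoring tokens, in original order.

    scores     : one relevance score per image token (higher = more relevant)
    keep_ratio : fraction of tokens to retain (at least one token is kept)
    """
    num_keep = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:num_keep])


# Hypothetical per-token relevance scores for a four-token image.
token_scores = [0.1, 0.9, 0.3, 0.7]
print(keep_top_tokens(token_scores, keep_ratio=0.5))
```

Random token removal would instead sample the kept indices uniformly, which is the baseline the abstract compares against.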

arXiv:2208.07181 [pdf]

One-shot Generative Prior in Hankel-k-space for Parallel Imaging Reconstruction

Authors: Hong Peng, Chen Jiang, **g Cheng, Minghui Zhang, Shanshan Wang, Dong Liang, Qiegen Liu

Abstract: Magnetic resonance imaging serves as an essential tool for clinical diagnosis. However, it suffers from a long acquisition time. The utilization of deep learning, especially the deep generative models, offers aggressive acceleration and better reconstruction in magnetic resonance imaging. Nevertheless, learning the data distribution as prior knowledge and reconstructing the image from limited data remains challenging. In this work, we propose a novel Hankel-k-space generative model (HKGM), which can generate samples from a training set of as little as one k-space data. At the prior learning stage, we first construct a large Hankel matrix from k-space data, then extract multiple structured k-space patches from the large Hankel matrix to capture the internal distribution among different patches. Extracting patches from a Hankel matrix enables the generative model to be learned from redundant and low-rank data space. At the iterative reconstruction stage, it is observed that the desired solution obeys the learned prior knowledge. The intermediate reconstruction solution is updated by taking it as the input of the generative model. The updated result is then alternatively operated by imposing low-rank penalty on its Hankel matrix and data consistency constraint on the measurement data. Experimental results confirmed that the internal statistics of patches within a single k-space data carry enough information for learning a powerful generative model and provide state-of-the-art reconstruction.

Submitted 7 December, 2022; v1 submitted 15 August, 2022; originally announced August 2022.

Comments: 10 pages, 10 figures, 7 tables
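Note on the Hankel construction described in the abstract above: the prior-learning stage builds a large Hankel matrix from k-space data and extracts patches from it. The abstract does not specify the construction, so the sketch below is a simplified one-dimensional illustration only: each row of the Hankel matrix is a sliding window over the signal, so entry (i, j) equals sample i + j, and patches are blocks of consecutive rows. Real k-space data is multi-dimensional and complex-valued; the function names `hankel_matrix` and `extract_patches` and their parameters are hypothetical.

```python
def hankel_matrix(signal, window):
    """Build a Hankel matrix whose entry (i, j) equals signal[i + j]."""
    rows = len(signal) - window + 1
    return [list(signal[r:r + window]) for r in range(rows)]


def extract_patches(matrix, patch_rows, stride=1):
    """Cut the matrix into blocks of `patch_rows` consecutive rows."""
    last_start = len(matrix) - patch_rows
    return [matrix[r:r + patch_rows] for r in range(0, last_start + 1, stride)]


# Toy 1D "signal" standing in for a line of k-space samples.
samples = [1, 2, 3, 4, 5, 6]
hankel = hankel_matrix(samples, window=3)
for row in hankel:
    print(row)
print(len(extract_patches(hankel, patch_rows=2, stride=2)), "patches")
```

Because neighbouring rows share most of their entries, the matrix is highly redundant, which matches the abstract's remark that patches are drawn from a redundant, low-rank data space.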

arXiv:2111.10803 [pdf, other]

Structure-Preserving Graph Kernel for Brain Network Classification

Authors: Jun Yu, Zhaoming Kong, Aditya Kendre, Hao Peng, Carl Yang, Lichao Sun, Alex Leow, Lifang He

Abstract: This paper presents a novel graph-based kernel learning approach for connectome analysis. Specifically, we demonstrate how to leverage the naturally available structure within the graph representation to encode prior knowledge in the kernel. We first proposed a matrix factorization to directly extract structural features from natural symmetric graph representations of connectome data. We then used them to derive a structure-preserving graph kernel to be fed into the support vector machine. The proposed approach has the advantage of being clinically interpretable. Quantitative evaluations on challenging HIV disease classification (DTI- and fMRI-derived connectome data) and emotion recognition (EEG-derived connectome data) tasks demonstrate the superior performance of our proposed methods against the state-of-the-art. Results showed that relevant EEG-connectome information is primarily encoded in the alpha band during the emotion regulation task.

Submitted 21 February, 2022; v1 submitted 21 November, 2021; originally announced November 2021.

arXiv:2110.12833 [pdf, other]

doi 10.1038/s41928-021-00661-2

Silicon photonic-electronic neural network for fibre nonlinearity compensation

Authors: Chaoran Huang, Shinsuke Fujisawa, Thomas Ferreira de Lima, Alexander N. Tait, Eric C. Blow, Yue Tian, Simon Bilodeau, Aashu Jha, Fatih Yaman, Hsuan-Tung Peng, Hussam G. Batshon, Bhavin J. Shastri, Yoshihisa Inada, Ting Wang, Paul R. Prucnal

Abstract: In optical communication systems, fibre nonlinearity is the major obstacle in increasing the transmission capacity. Typically, digital signal processing techniques and hardware are used to deal with optical communication signals, but increasing speed and computational complexity create challenges for such approaches. Highly parallel, ultrafast neural networks using photonic devices have the potential to ease the requirements placed on the digital signal processing circuits by processing the optical signals in the analogue domain. Here we report a silicon photonic-electronic neural network for solving fibre nonlinearity compensation of submarine optical fibre transmission systems. Our approach uses a photonic neural network based on wavelength-division multiplexing built on a CMOS-compatible silicon photonic platform. We show that the platform can be used to compensate optical fibre nonlinearities and improve the signal quality (Q)-factor in a 10,080 km submarine fibre communication system. The Q-factor improvement is comparable to that of a software-based neural network implemented on a 32-bit graphic processing unit-assisted workstation. Our reconfigurable photonic-electronic integrated neural network promises to address pressing challenges in high-speed intelligent signal processing.

Submitted 11 October, 2021; originally announced October 2021.

arXiv:2110.04044 [pdf, other]

Subspace Change-Point Detection via Low-Rank Matrix Factorisation

Authors: Euan Thomas McGonigle, Hankui Peng

Abstract: Multivariate time series can often have a large number of dimensions, whether it is due to the vast amount of collected features or due to how the data sources are processed. Frequently, the main structure of the high-dimensional time series can be well represented by a lower dimensional subspace. As vast quantities of data are being collected over long periods of time, it is reasonable to assume that the underlying subspace structure would change over time. In this work, we propose a change-point detection method based on low-rank matrix factorisation that can detect multiple changes in the underlying subspace of a multivariate time series. Experimental results on both synthetic and real data sets demonstrate the effectiveness of our approach and its advantages against various state-of-the-art methods.

Submitted 8 October, 2021; originally announced October 2021.

arXiv:2106.13865 [pdf, other]

A Photonic-Circuits-Inspired Compact Network: Toward Real-Time Wireless Signal Classification at the Edge

Authors: Hsuan-Tung Peng, Joshua Lederman, Lei Xu, Thomas Ferreira de Lima, Chaoran Huang, Bhavin Shastri, David Rosenbluth, Paul Prucnal

Abstract: Machine learning (ML) methods are ubiquitous in wireless communication systems and have proven powerful for applications including radio-frequency (RF) fingerprinting, automatic modulation classification, and cognitive radio. However, the large size of ML models can make them difficult to implement on edge devices for latency-sensitive downstream tasks. In wireless communication systems, ML data processing at a sub-millisecond scale will enable real-time network monitoring to improve security and prevent infiltration. In addition, compact and integratable hardware platforms which can implement ML models at the chip scale will find much broader application to wireless communication networks. Toward real-time wireless signal classification at the edge, we propose a novel compact deep network that consists of a photonic-hardware-inspired recurrent neural network model in combination with a simplified convolutional classifier, and we demonstrate its application to the identification of RF emitters by their random transmissions. With the proposed model, we achieve 96.32% classification accuracy over a set of 30 identical ZigBee devices when using 50 times fewer training parameters than an existing state-of-the-art CNN classifier. Thanks to the large reduction in network size, we demonstrate real-time RF fingerprinting with 0.219 ms latency using a small-scale FPGA board, the PYNQ-Z1.

Submitted 25 June, 2021; originally announced June 2021.

Comments: 17 pages, 14 figures

ACM Class: I.2.1; C.3

arXiv:2104.08876 [pdf, other]

Quick Learner Automated Vehicle Adapting its Roadmanship to Varying Traffic Cultures with Meta Reinforcement Learning

Authors: Songan Zhang, Lu Wen, Huei Peng, H. Eric Tseng

Abstract: It is essential for an automated vehicle in the field to perform discretionary lane changes with appropriate roadmanship - driving safely and efficiently without annoying or endangering other road users - under a wide range of traffic cultures and driving conditions. While deep reinforcement learning methods have excelled in recent years and been applied to automated vehicle driving policy, there are concerns about their capability to quickly adapt to unseen traffic with new environment dynamics. We formulate this challenge as a multi-Markov Decision Processes (MDPs) adaptation problem and developed Meta Reinforcement Learning (MRL) driving policies to showcase their quick learning capability. Two types of distribution variation in environments were designed and simulated to validate the fast adaptation capability of resulting MRL driving policies which significantly outperform a baseline RL.

Submitted 18 April, 2021; originally announced April 2021.

arXiv:2104.01164 [pdf, other]

doi 10.1364/OPTICA.446100

Silicon microring synapses enable photonic deep learning beyond 9-bit precision

Authors: Weipeng Zhang, Chaoran Huang, Hsuan-Tung Peng, Simon Bilodeau, Aashu Jha, Eric Blow, Thomas Ferreira De Lima, Bhavin J. Shastri, Paul Prucnal

Abstract: Deep neural networks (DNN) consist of layers of neurons interconnected by synaptic weights. A high bit-precision in weights is generally required to guarantee high accuracy in many applications. Minimizing error accumulation between layers is also essential when building large-scale networks. Recent demonstrations of photonic neural networks are limited in bit-precision due to crosstalk and the high sensitivity of optical components (e.g., resonators). Here, we experimentally demonstrate a record-high precision of 9 bits with a dithering control scheme for photonic synapses. We then numerically simulated the impact with increased synaptic precision on a wireless signal classification application. This work could help realize the potential of photonic neural networks for many practical, real-world tasks.

Submitted 15 April, 2022; v1 submitted 14 March, 2021; originally announced April 2021.

Comments: 7 pages

arXiv:2102.12817 [pdf, ps, other]

Fronthaul Compression and Passive Beamforming Design for Intelligent Reflecting Surface-aided Cloud Radio Access Networks

Authors: Yu Zhang, Xuelu Wu, Hong Peng, Caijun Zhong, Xiaoming Chen

Abstract: This letter studies a cloud radio access network (C-RAN) with multiple intelligent reflecting surfaces (IRS) deployed between users and remote radio heads (RRH). Specifically, we consider the uplink transmission where each RRH quantizes the received signals from the users by either point-to-point compression or Wyner-Ziv compression and then transmits the quantization bits to the BBU pool through capacity-limited fronthaul links. To maximize the uplink sum rate, we jointly optimize the passive beamformers of IRSs and the quantization noise covariance matrices of fronthaul compression. A joint fronthaul compression and passive beamforming design is proposed by exploiting the Arimoto-Blahut algorithm and semidefinite relaxation (SDR). Numerical results show the performance gain achieved by the proposed algorithm.

Submitted 8 April, 2021; v1 submitted 25 February, 2021; originally announced February 2021.

arXiv:2102.11462 [pdf, other]

An Interaction-aware Evaluation Method for Highly Automated Vehicles

Authors: Xinpeng Wang, Songan Zhang, Kuan-Hui Lee, Huei Peng

Abstract: It is important to build a rigorous verification and validation (V&V) process to evaluate the safety of highly automated vehicles (HAVs) before their wide deployment on public roads. In this paper, we propose an interaction-aware framework for HAV safety evaluation which is suitable for some highly-interactive driving scenarios including highway merging, roundabout entering, etc. Contrary to existing approaches where the primary other vehicle (POV) takes predetermined maneuvers, we model the POV as a game-theoretic agent. To capture a wide variety of interactions between the POV and the vehicle under test (VUT), we characterize the interactive behavior using level-k game theory and social value orientation and train a diverse set of POVs using reinforcement learning. Moreover, we propose an adaptive test case sampling scheme based on the Gaussian process regression technique to generate customized and diverse challenging cases. Highway merging is used as the example scenario. We find that the proposed method is able to capture a wide range of POV behaviors and achieve better coverage of the failure modes of the VUT compared with other evaluation approaches.

Submitted 22 February, 2021; originally announced February 2021.

Comments: 8 pages, 9 figures

arXiv:2101.08910 [pdf, other]

Single Neuron Segmentation using Graph-based Global Reasoning with Auxiliary Skeleton Loss from 3D Optical Microscope Images

Authors: Heng Wang, Yang Song, Chaoyi Zhang, Jianhui Yu, Siqi Liu, Hanchuan Peng, Weidong Cai

Abstract: One of the critical steps in improving accurate single neuron reconstruction from three-dimensional (3D) optical microscope images is the neuronal structure segmentation. However, these structures are always hard to segment because of poor image quality. Despite a series of attempts to apply convolutional neural networks (CNNs) to this task, noise and disconnected gaps remain challenging to alleviate when the non-local features of graph-like tubular neural structures are neglected. Hence, we present an end-to-end segmentation network by jointly considering the local appearance and the global geometry traits through graph reasoning and a skeleton-based auxiliary loss. The evaluation results on the Janelia dataset from the BigNeuron project demonstrate that our proposed method exceeds the counterpart algorithms in performance.

Submitted 21 January, 2021; originally announced January 2021.

Comments: 5 pages, 3 figures, 2 tables, ISBI2021

arXiv:2012.08516 [pdf, other]

A Laser Spiking Neuron in a Photonic Integrated Circuit

Authors: Mitchell A. Nahmias, Hsuan-Tung Peng, Thomas Ferreira de Lima, Chaoran Huang, Alexander N. Tait, Bhavin J. Shastri, Paul R. Prucnal

Abstract: There has been a recent surge of interest in the implementation of linear operations such as matrix multiplications using photonic integrated circuit technology. However, these approaches require an efficient and flexible way to perform nonlinear operations in the photonic domain. We have fabricated an optoelectronic nonlinear device--a laser neuron--that uses excitable laser dynamics to achieve biologically-inspired spiking behavior. We demonstrate functionality with simultaneous excitation, inhibition, and summation across multiple wavelengths. We also demonstrate cascadability and compatibility with a wavelength multiplexing protocol, both essential for larger scale system integration. Laser neurons represent an important class of optoelectronic nonlinear processors that can complement both the enormous bandwidth density and energy efficiency of photonic computing operations.

Submitted 15 December, 2020; originally announced December 2020.

Comments: 11 pages, 7 figures. Previously submitted to Nature Photonics. Currently received reviewer feedback

arXiv:2012.02626 [pdf, other]

GraphPB: Graphical Representations of Prosody Boundary in Speech Synthesis

Authors: Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Lingwei Kong, Jing Xiao

Abstract: This paper introduces a graphical representation approach of prosody boundary (GraphPB) in the task of Chinese speech synthesis, intending to parse the semantic and syntactic relationship of input sequences in a graphical domain for improving the prosody performance. The nodes of the graph embedding are formed by prosodic words, and the edges are formed by the other prosodic boundaries, namely prosodic phrase boundary (PPH) and intonation phrase boundary (IPH). Different Graph Neural Networks (GNN) like Gated Graph Neural Network (GGNN) and Graph Long Short-term Memory (G-LSTM) are utilised as graph encoders to exploit the graphical prosody boundary information. A graph-to-sequence model is proposed, formed by a graph encoder and an attentional decoder. Two techniques are proposed to embed sequential information into the graph-to-sequence text-to-speech model. The experimental results show that this proposed approach can encode the phonetic and prosody rhythm of an utterance. The mean opinion score (MOS) of these GNN models shows comparable results with the state-of-the-art sequence-to-sequence models, with better performance in the aspect of prosody. This provides an alternative approach for prosody modelling in end-to-end speech synthesis.

Submitted 2 December, 2020; originally announced December 2020.

Comments: Accepted to SLT 2021

arXiv:2011.08965 [pdf]

doi 10.1038/s41746-021-00427-2

Interpretable Survival Prediction for Colorectal Cancer using Deep Learning

Authors: Ellery Wulczyn, David F. Steiner, Melissa Moran, Markus Plass, Robert Reihs, Fraser Tan, Isabelle Flament-Auvigne, Trissia Brown, Peter Regitnig, Po-Hsuan Cameron Chen, Narayan Hegde, Apaar Sadhwani, Robert MacDonald, Benny Ayalew, Greg S. Corrado, Lily H. Peng, Daniel Tse, Heimo Müller, Zhaoyang Xu, Yun Liu, Martin C. Stumpe, Kurt Zatloukal, Craig H. Mermel

Abstract: Deriving interpretable prognostic features from deep-learning-based prognostic histopathology models remains a challenge. In this study, we developed a deep learning system (DLS) for predicting disease specific survival for stage II and III colorectal cancer using 3,652 cases (27,300 slides). When evaluated on two validation datasets containing 1,239 cases (9,340 slides) and 738 cases (7,140 slides) respectively, the DLS achieved a 5-year disease-specific survival AUC of 0.70 (95%CI 0.66-0.73) and 0.69 (95%CI 0.64-0.72), and added significant predictive value to a set of 9 clinicopathologic features. To interpret the DLS, we explored the ability of different human-interpretable features to explain the variance in DLS scores. We observed that clinicopathologic features such as T-category, N-category, and grade explained a small fraction of the variance in DLS scores (R2=18% in both validation sets). Next, we generated human-interpretable histologic features by clustering embeddings from a deep-learning based image-similarity model and showed that they explain the majority of the variance (R2 of 73% to 80%). Furthermore, the clustering-derived feature most strongly associated with high DLS scores was also highly prognostic in isolation. With a distinct visual appearance (poorly differentiated tumor cell clusters adjacent to adipose tissue), this feature was identified by annotators with 87.0-95.5% accuracy.
Our approach can be used to explain predictions from a prognostic deep learning model and uncover potentially-novel prognostic features that can be reliably identified by people for future validation studies.

Submitted 17 November, 2020; originally announced November 2020.

Journal ref: Nature Partner Journal Digital Medicine (2021)

arXiv:2005.07842 [pdf]

doi 10.1063/5.0181791

DEFM: Delay Embedding based Forecast Machine for Time Series Forecasting by Spatiotemporal Information Transformation

Authors: Hao Peng, Wei Wang, Pei Chen, Rui Liu

Abstract: Making accurate forecasts for a complex system is a challenge in various practical applications. The major difficulty in solving such a problem concerns nonlinear spatiotemporal dynamics with time-varying characteristics. Takens' delay embedding theory provides a way to transform high-dimensional spatial information into temporal information. In this work, by combining delay embedding theory and deep learning techniques, we propose a novel framework, Delay-Embedding-based Forecast Machine (DEFM), to predict the future values of a target variable in a self-supervised and multistep-ahead manner based on high-dimensional observations. With a three-module spatiotemporal architecture, the DEFM leverages deep neural networks to effectively extract both the spatially and temporally associated information from the observed time series even with time-varying parameters or additive noise. The DEFM can accurately predict future information by transforming spatiotemporal information to the delay embeddings of a target variable. The efficacy and precision of the DEFM are substantiated through applications in three spatiotemporally chaotic systems: a 90-dimensional (90D) coupled Lorenz system, the Lorenz 96 system, and the Kuramoto-Sivashinsky (KS) equation with inhomogeneity. Additionally, the performance of the DEFM is evaluated on six real-world datasets spanning various fields.
Comparative experiments with five prediction methods illustrate the superiority and robustness of the DEFM and show the great potential of the DEFM in temporal information mining and forecasting.

Submitted 6 April, 2024; v1 submitted 15 May, 2020; originally announced May 2020.

Comments: 28 pages, 5 figures

Journal ref: Chaos 1 April 2024; 34 (4): 043112

arXiv:2004.05161 [pdf, other]

Combined Eco-Routing and Power-Train Control of Plug-In Hybrid Electric Vehicles in Transportation Networks

Authors: Arian Houshmand, Christos G. Cassandras, Nan Zhou, Nasser Hashemi, Boqi Li, Huei Peng

Abstract: We study the problem of eco-routing for Plug-In Hybrid Electric Vehicles (PHEVs) to minimize the overall energy consumption cost. We propose an algorithm which can simultaneously calculate an energy-optimal route (eco-route) for a PHEV and an optimal power-train control strategy over this route. In order to show the effectiveness of our method in practice, we use a HERE Maps API to apply our algorithms based on traffic data in the city of Boston with more than 110,000 links. Moreover, we validate the performance of our eco-routing algorithm using speed profiles collected from a traffic simulator (SUMO) as input to a high-fidelity energy model to calculate energy consumption costs. Our results show significant energy savings (around 12%) for PHEVs with a near real-time execution time for the algorithm.

Submitted 9 April, 2020; originally announced April 2020.

Comments: arXiv admin note: text overlap with arXiv:1810.01443

arXiv:2004.02805 [pdf]

Application of Structural Similarity Analysis of Visually Salient Areas and Hierarchical Clustering in the Screening of Similar Wireless Capsule Endoscopic Images

Authors: Rui Nie, Huan Yang, Hejuan Peng, Wenbin Luo, Weiya Fan, Jie Zhang, Jing Liao, Fang Huang, Yufeng Xiao

Abstract: Small intestinal capsule endoscopy is the mainstream method for inspecting small intestinal lesions, but a single small intestinal capsule endoscopy will produce 60,000 - 120,000 images, the majority of which are similar and have no diagnostic value. It takes 2 - 3 hours for doctors to identify lesions from these images. This is time-consuming and increases the probability of misdiagnosis and missed diagnosis, since doctors are likely to experience visual fatigue while focusing on a large number of similar images for an extended period of time. In order to solve these problems, we proposed a similar wireless capsule endoscope (WCE) image screening method based on structural similarity analysis and the hierarchical clustering of visually salient sub-image blocks. The similarity clustering of images was automatically identified by hierarchical clustering based on the hue, saturation, value (HSV) spatial color characteristics of the images, and the keyframe images were extracted based on the structural similarity of the visually salient sub-image blocks, in order to accurately identify and screen out similar small intestinal capsule endoscopic images. Subsequently, the proposed method was applied to the capsule endoscope imaging workstation. After screening out similar images in the complete data gathered by the Type I OMOM Small Intestinal Capsule Endoscope from 52 cases covering 17 common types of small intestinal lesions, we obtained a lesion recall of 100% and an average similar image reduction ratio of 76%.
With similar images screened out, the average play time of the OMOM image workstation was 18 minutes, which greatly reduced the time spent by doctors viewing the images.

Submitted 1 April, 2020; originally announced April 2020.

arXiv:2003.08034 [pdf, other]

Generating Socially Acceptable Perturbations for Efficient Evaluation of Autonomous Vehicles

Authors: Songan Zhang, Huei Peng, Subramanya Nageshrao, H. Eric Tseng

Abstract: Deep reinforcement learning methods have been widely used in recent years for autonomous vehicle's decision-making. A key issue is that deep neural networks can be fragile to adversarial attacks or other unseen inputs. In this paper, we address the latter issue: we focus on generating socially acceptable perturbations (SAP), so that the autonomous vehicle (AV agent), instead of the challenging vehicle (attacker), is primarily responsible for the crash. In our process, one attacker is added to the environment and trained by deep reinforcement learning to generate the desired perturbation. The reward is designed so that the attacker aims to fail the AV agent in a socially acceptable way. After training the attacker, the agent policy is evaluated in both the original naturalistic environment and the environment with one attacker. The results show that the agent policy which is safe in the naturalistic environment has many crashes in the perturbed environment.

Submitted 18 March, 2020; originally announced March 2020.

arXiv:2003.01924 [pdf, other]

GraphTTS: graph-to-sequence modelling in neural text-to-speech

Authors: Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Jing Xiao

Abstract: This paper leverages the graph-to-sequence method in neural text-to-speech (GraphTTS), which maps the graph embedding of the input sequence to spectrograms. The graphical inputs consist of node and edge representations constructed from input texts. The encoding of these graphical inputs incorporates syntax information by a GNN encoder module. Besides, applying the encoder of GraphTTS as a graph auxiliary encoder (GAE) can analyse prosody information from the semantic structure of texts. This removes the manual selection of reference audios and makes prosody modelling an end-to-end procedure. Experimental analysis shows that GraphTTS outperforms the state-of-the-art sequence-to-sequence models by 0.24 in Mean Opinion Score (MOS). GAE can adjust the pause, ventilation and tones of synthesised audios automatically. This experimental conclusion may give some inspiration to researchers working on improving speech synthesis prosody.

Submitted 4 March, 2020; originally announced March 2020.

Comments: Accepted to ICASSP 2020

arXiv:2001.10908 [pdf]

doi 10.1063/1.5132586

Super Resolution Convolutional Neural Network for Feature Extraction in Spectroscopic Data

Authors: Han Peng, Xiang Gao, Yu He, Yiwei Li, Yuchen Ji, Chuhang Liu, Sandy A. Ekahana, Ding Pei, Zhongkai Liu, Zhixun Shen, Yulin Chen

Abstract: Two-dimensional (2D) peak finding is a common practice in data analysis for physics experiments, which is typically achieved by computing the local derivatives. However, this method is inherently unstable when the local landscape is complicated, or the signal-to-noise ratio of the data is low. In this work, we propose a new method in which the peak tracking task is formalized as an inverse problem and thus can be solved with a convolutional neural network (CNN). In addition, we show that the underlying physics principle of the experiments can be used to generate the training data. By generalizing the trained neural network on real experimental data, we show that the CNN method can achieve comparable or better results than traditional derivative based methods. This approach can be further generalized in different physics experiments when the physical process is known.

Submitted 29 January, 2020; originally announced January 2020.

Comments: 13 pages, 6 figures

arXiv:1912.06258 [pdf, other]

Mcity Data Collection for Automated Vehicles Study

Authors: Yiqun Dong, Yuanxin Zhong, Wenbo Yu, Minghan Zhu, Pingping Lu, Yeyang Fang, Jiajun Hong, Huei Peng

Abstract: The main goal of this paper is to introduce the data collection effort at Mcity targeting automated vehicle development. We captured a comprehensive set of data from a set of perception sensors (Lidars, Radars, Cameras) as well as vehicle steering/brake/throttle inputs and an RTK unit. Two in-cabin cameras record the human driver's behaviors for possible future use. The naturalistic driving on selected open roads is recorded at different times of day and in different weather conditions. We also perform designed choreography data collection inside the Mcity test facility focusing on vehicle-to-vehicle and vehicle-to-vulnerable-road-user interactions, which is quite unique among existing open-source datasets. The vehicle platform, data content, tags/labels, and selected analysis results are shown in this paper.

Submitted 12 December, 2019; originally announced December 2019.

arXiv:1909.11598 [pdf, ps, other]

doi 10.1109/ICCChina.2019.8855868

A Predictive On-Demand Placement of UAV Base Stations Using Echo State Network

Authors: Haoran Peng, Chao Chen, Chuan-Chi Lai, Li-Chun Wang, Zhu Han

Abstract: Unmanned aerial vehicle base stations (UAV-BSs) have great potential for wide use in many dynamic application scenarios. In those scenarios, the movements of served user equipments (UEs) are inevitable, so the UAV-BSs need to be re-positioned dynamically to provide seamless services. In this paper, we propose a system framework consisting of UE clustering, UAV-BS placement, UE trajectory prediction, and a UAV-BS reposition matching scheme, to serve the UEs seamlessly as well as minimize the energy cost of UAV-BSs' reposition trajectories. An Echo State Network (ESN) based algorithm for predicting the future trajectories of UEs and a Kuhn-Munkres-based algorithm for finding the energy-efficient reposition trajectories of UAV-BSs are designed, respectively. We conduct a simulation using a real open dataset for performance validation. The simulation results indicate that the proposed framework achieves high prediction accuracy and provides an energy-efficient matching scheme.

Submitted 6 October, 2019; v1 submitted 25 September, 2019; originally announced September 2019.

Comments: 6 pages, 8 figures, accepted by 2019 IEEE/CIC International Conference on Communications in China (ICCC)

arXiv:1909.05382 [pdf]

doi 10.1038/s41591-020-0842-3

A deep learning system for differential diagnosis of skin diseases

Authors: Yuan Liu, Ayush Jain, Clara Eng, David H. Way, Kang Lee, Peggy Bui, Kimberly Kanada, Guilherme de Oliveira Marinho, Jessica Gallegos, Sara Gabriele, Vishakha Gupta, Nalini Singh, Vivek Natarajan, Rainer Hofmann-Wellenhof, Greg S. Corrado, Lily H. Peng, Dale R. Webster, Dennis Ai, Susan Huang, Yun Liu, R. Carter Dunn, David Coz

Abstract: Skin conditions affect an estimated 1.9 billion people worldwide. A shortage of dermatologists causes long wait times and leads patients to seek dermatologic care from general practitioners. However, the diagnostic accuracy of general practitioners has been reported to be only 0.24-0.70 (compared to 0.77-0.96 for dermatologists), resulting in referral errors, delays in care, and errors in diagnosis and treatment. In this paper, we developed a deep learning system (DLS) to provide a differential diagnosis of skin conditions for clinical cases (skin photographs and associated medical histories). The DLS distinguishes between 26 skin conditions that represent roughly 80% of the volume of skin conditions seen in primary care. The DLS was developed and validated using de-identified cases from a teledermatology practice serving 17 clinical sites via a temporal split: the first 14,021 cases for development and the last 3,756 cases for validation. On the validation set, where a panel of three board-certified dermatologists defined the reference standard for every case, the DLS achieved 0.71 and 0.93 top-1 and top-3 accuracies respectively. For a random subset of the validation set (n=963 cases), 18 clinicians reviewed the cases for comparison. On this subset, the DLS achieved a 0.67 top-1 accuracy, non-inferior to board-certified dermatologists (0.63, p<0.001), and higher than primary care physicians (PCPs, 0.45) and nurse practitioners (NPs, 0.41). The top-3 accuracy showed a similar trend: 0.90 DLS, 0.75 dermatologists, 0.60 PCPs, and 0.55 NPs.
These results highlight the potential of the DLS to augment general practitioners to accurately diagnose skin conditions by suggesting differential diagnoses that may not have been considered. Future work will be needed to prospectively assess the clinical impact of using this tool in actual clinical workflows.

Submitted 11 September, 2019; originally announced September 2019.

Journal ref: Nature Medicine (2020)

arXiv:1908.09828 [pdf]

doi 10.1109/TITS.2020.3032473

Eco-Mobility-on-Demand Fleet Control with Ride-Sharing

Authors: Xianan Huang, Boqi Li, Huei Peng, Joshua A. Auld, Vadim O. Sokolov

Abstract: Shared Mobility-on-Demand using automated vehicles can reduce energy consumption and cost for future mobility. However, its full potential in energy saving has not been fully explored. An algorithm to minimize fleet fuel consumption while satisfying customers' travel time constraints is developed in this paper. Numerical simulations with realistic travel demand and route choice are performed, showing that if fuel consumption is not considered, the MOD service can increase fleet fuel consumption due to increased empty vehicle mileage. With fuel consumption as part of the cost function, we can reduce total fuel consumption by 7 percent while maintaining a high level of mobility service.

Submitted 17 October, 2020; v1 submitted 23 August, 2019; originally announced August 2019.

Comments: arXiv admin note: text overlap with arXiv:1801.08602

arXiv:1907.07325 [pdf, other]

doi 10.1109/JSTQE.2019.2931252

Noise Analysis of Photonic Modulator Neurons

Authors: Thomas Ferreira de Lima, Alexander N. Tait, Hooman Saeidi, Mitchell A. Nahmias, Hsuan-Tung Peng, Siamak Abbaslou, Bhavin J. Shastri, Paul R. Prucnal

Abstract: Neuromorphic photonics relies on efficiently emulating analog neural networks at high speeds. Prior work showed that transducing signals from the optical to the electrical domain and back with transimpedance gain was an efficient approach to implementing analog photonic neurons and scalable networks. Here, we examine modulator-based photonic neuron circuits with passive and active transimpedance gains, with special attention to the sources of noise propagation. We find that a modulator nonlinear transfer function can suppress noise, which is necessary to avoid noise propagation in hardware neural networks. In addition, while efficient modulators can reduce power for an individual neuron, signal-to-noise ratios must be traded off with power consumption at a system level. Active transimpedance amplifiers may help relax this tradeoff for conventional p-n junction silicon photonic modulators, but a passive transimpedance circuit is sufficient when very efficient modulators (i.e. low C and low V-pi) are employed.

Submitted 17 July, 2019; originally announced July 2019.

Comments: 8 pages, 7 figures, 1 table

arXiv:1907.01525 [pdf, other]

doi 10.1109/JSTQE.2019.2945540

Digital Electronics and Analog Photonics for Convolutional Neural Networks (DEAP-CNNs)

Authors: Viraj Bangari, Bicky A. Marquez, Heidi B. Miller, Alexander N. Tait, Mitchell A. Nahmias, Thomas Ferreira de Lima, Hsuan-Tung Peng, Paul R. Prucnal, Bhavin J. Shastri

Abstract: Convolutional Neural Networks (CNNs) are powerful and highly ubiquitous tools for extracting features from large datasets for applications such as computer vision and natural language processing. However, a convolution is a computationally expensive operation in digital electronics. In contrast, neuromorphic photonic systems, which have experienced a recent surge of interest over the last few years, propose higher bandwidth and energy efficiencies for neural network training and inference. Neuromorphic photonics exploits the advantages of optical electronics, including the ease of analog processing, and busing multiple signals on a single waveguide at the speed of light. Here, we propose a Digital Electronic and Analog Photonic (DEAP) CNN hardware architecture that has potential to be 2.8 to 14 times faster while maintaining the same power usage of current state-of-the-art GPUs.

Submitted 22 April, 2019; originally announced July 2019.

Comments: 12 pages, 9 figures, 3 tables

arXiv:1808.00869 [pdf]

Developing Robot Driver Etiquette Based on Naturalistic Human Driving Behavior

Authors: Xianan Huang, Songan Zhang, Huei Peng

Abstract: Automated vehicles can change the society by improved safety, mobility and fuel efficiency. However, due to the higher cost and change in business model, over the coming decades, the highly automated vehicles likely will continue to interact with many human-driven vehicles. In the past, the control/design of the highly automated (robotic) vehicles mainly considers safety and efficiency but failed to address the "driving culture" of surrounding human-driven vehicles. Thus, the robotic vehicles may demonstrate behaviors very different from other vehicles. We study this "driving etiquette" problem in this paper. As the first step, we report the key behavior parameters of human driven vehicles derived from a large naturalistic driving database. The results can be used to guide future algorithm design of highly automated vehicles or to develop realistic human-driven vehicle behavior model in simulations.

Submitted 1 August, 2018; originally announced August 2018.

arXiv:1808.00058 [pdf, other]

A Unified Framework for Joint Mobility Prediction and Object Profiling of Drones in UAV Networks

Authors: Han Peng, Abolfazl Razi, Fatemeh Afghah, Jonathan Ashdown

Abstract: In recent years, using a network of autonomous and cooperative unmanned aerial vehicles (UAVs) without command and communication from the ground station has become more imperative, in particular in search-and-rescue operations, disaster management, and other applications where human intervention is limited. In such scenarios, UAVs can make more efficient decisions if they acquire more information about the mobility, sensing and actuation capabilities of their neighbor nodes. In this paper, we develop an unsupervised online learning algorithm for joint mobility prediction and object profiling of UAVs to facilitate control and communication protocols. The proposed method not only predicts the future locations of the surrounding flying objects, but also classifies them into different groups with similar levels of maneuverability (e.g. rotatory, and fixed-wing UAVs) without prior knowledge about these classes. This method is flexible in admitting new object types with unknown mobility profiles, thereby applicable to emerging flying Ad-hoc networks with heterogeneous nodes.

Submitted 31 July, 2018; originally announced August 2018.

Comments: 8 pages, 11 figures

arXiv:1712.05506 [pdf]

Enhancing the performance of a safe controller via supervised learning for truck lateral control

Authors: Yuxiao Chen, Ayonga Hereid, Huei Peng, Jessy Grizzle

Abstract: Correct-by-construction techniques, such as control barrier functions (CBFs), can be used to guarantee closed-loop safety by acting as a supervisor of an existing or legacy controller. However, supervisory-control intervention typically compromises the performance of the closed-loop system. On the other hand, machine learning has been used to synthesize controllers that inherit good properties from a training dataset, though safety is typically not guaranteed due to the difficulty of analyzing the associated neural network. In this paper, supervised learning is combined with CBFs to synthesize controllers that enjoy good performance with provable safety. A training set is generated by trajectory optimization that incorporates the CBF constraint for an interesting range of initial conditions of the truck model. A control policy is obtained via supervised learning that maps a feature representing the initial conditions to a parameterized desired trajectory. The learning-based controller is used as the performance controller and a CBF-based supervisory controller guarantees safety. A case study of lane keeping for articulated trucks shows that the controller trained by supervised learning inherits the good performance of the training set and rarely requires intervention by the CBF supervisor.

Submitted 2 May, 2018; v1 submitted 14 December, 2017; originally announced December 2017.

Comments: submitted to IEEE Transaction of Control System Technology

arXiv:1708.00151 [pdf]

doi 10.1007/s12239-016-0030-0

Optimal design of three-planetary-gear power-split hybrid powertrains

Authors: Weichao Zhuang, Xiaowu Zhang, Ding Zhao, Huei Peng, Lianmou Wang

Abstract: Many of today's power-split hybrid electric vehicles (HEVs) utilize planetary gears (PGs) to connect the powertrain elements together. Recent power-split HEVs tend to use two PGs and some of them have multiple modes to achieve better fuel economy and driving performance. Looking to the future, hybrid powertrain technologies must be enhanced to design hybrid light trucks. For light trucks, the need for multi-mode and more PGs is stronger, to achieve the required performance. To systematically explore all the possible designs of multi-mode HEVs with three PGs, an efficient searching and optimization methodology is proposed. All possible clutch topologies and modes for one existing configuration that uses three PGs were exhaustively searched. The launching performance is first used to screen out designs that fail to satisfy the required launching performance. A near-optimal and computationally efficient energy management strategy was then employed to identify designs that achieve good fuel economy. The proposed design process successfully identifies 8 designs that achieve better launching performance and better fuel economy, while using fewer clutches than the benchmark and a patented design.

Submitted 31 July, 2017; originally announced August 2017.

Journal ref: International Journal of Automotive Technology, April 2016, Volume 17, Issue 2, pp 299-309

arXiv:1707.09415 [pdf]

doi 10.1109/TITS.2015.2482821

Gap Acceptance During Lane Changes by Large-Truck Drivers-An Image-Based Analysis

Authors: Kazutoshi Nobukawa, Shan Bao, David J. LeBlanc, Ding Zhao, Huei Peng, Christopher S. Pan

Abstract: This paper presents an analysis of rearward gap acceptance characteristics of drivers of large trucks in highway lane change scenarios. The range between the vehicles was inferred from camera images using the estimated lane width obtained from the lane tracking camera as the reference. Six-hundred lane change events were acquired from a large-scale naturalistic driving data set. The kinematic variables from the image-based gap analysis were filtered by the weighted linear least squares in order to extrapolate them at the lane change time. In addition, the time-to-collision and required deceleration were computed, and potential safety threshold values are provided. The resulting range and range rate distributions showed directional discrepancies, i.e., in left lane changes, large trucks are often slower than other vehicles in the target lane, whereas they are usually faster in right lane changes. Video observations have confirmed that major motivations for changing lanes are different depending on the direction of move, i.e., moving to the left (faster) lane occurs due to a slower vehicle ahead or a merging vehicle on the right-hand side, whereas right lane changes are frequently made to return to the original lane after passing.

Submitted 28 July, 2017; originally announced July 2017.

Journal ref: IEEE Transactions on Intelligent Transportation Systems ( Volume: 17, Issue: 3, March 2016 )

arXiv:1707.09411 [pdf]

Analysis of mandatory and discretionary lane change behaviors for heavy trucks

Authors: Ding Zhao, Huei Peng, Kazutoshi Nobukawa, Shan Bao, David J LeBlanc, Christopher S Pan

Abstract: The behaviors of heavy vehicles drivers in mandatory and discretionary lane changes are analyzed in this paper. 640 mandatory and 2,035 discretionary lane change events were extracted from a naturalistic driving database. Variations in gap acceptance and lane change duration were investigated. Statistical analysis showed that mandatory lane changes are more aggressive in gap acceptance and lane change execution than discretionary lane changes. The results can be used for microscopic simulations, and design and evaluation of driver-assistant systems.

Submitted 28 July, 2017; originally announced July 2017.

Comments: Published in the 12th International Symposium on Advanced Vehicle Control, AVEC'14

arXiv:1702.05792 [pdf, other]

Improving Localization Accuracy in Connected Vehicle Networks Using Rao-Blackwellized Particle Filters: Theory, Simulations, and Experiments

Authors: Macheng Shen, Ding Zhao, Jing Sun, Huei Peng

Abstract: A crucial function for automated vehicle technologies is accurate localization. Lane-level accuracy is not readily available from low-cost Global Navigation Satellite System (GNSS) receivers because of factors such as multipath error and atmospheric bias. Approaches such as Differential GNSS can improve localization accuracy, but usually require investment in expensive base stations. Connected vehicle technologies provide an alternative approach to improving the localization accuracy. It will be shown in this paper that localization accuracy can be enhanced using crude GNSS measurements from a group of connected vehicles, by matching their locations to a digital map. A Rao-Blackwellized particle filter (RBPF) is used to jointly estimate the common biases of the pseudo-ranges and the vehicle positions. Multipath biases, which introduce receiver-specific (non-common) error, are mitigated by a multi-hypothesis detection-rejection approach. The temporal correlation of the estimations is exploited through the prediction-update process. The proposed approach is compared to existing methods using both simulations and experimental results. It was found that the proposed algorithm can eliminate the common biases and reduce the localization error to below 1 meter under open sky conditions.

Submitted 26 March, 2017; v1 submitted 19 February, 2017; originally announced February 2017.

Comments: 11 pages, 14 figures. arXiv admin note: text overlap with arXiv:1606.03736

arXiv:1702.00785 [pdf, other]

Evaluation of Automated Vehicles Encountering Pedestrians at Unsignalized Crossings

Authors: Baiming Chen, Ding Zhao, Huei Peng

Abstract: Interactions between vehicles and pedestrians have always been a major problem in traffic safety. Experienced human drivers are able to analyze the environment and choose driving strategies that will help them avoid crashes. What is not yet clear, however, is how automated vehicles will interact with pedestrians. This paper proposes a new method for evaluating the safety and feasibility of the driving strategy of automated vehicles when encountering unsignalized crossings. MobilEye sensors installed on buses in Ann Arbor, Michigan, collected data on 2,973 valid crossing events. A stochastic interaction model was then created using a multivariate Gaussian mixture model. This model allowed us to simulate the movements of pedestrians reacting to an oncoming vehicle when approaching unsignalized crossings, and to evaluate the passing strategies of automated vehicles. A simulation was then conducted to demonstrate the evaluation procedure.

Submitted 27 March, 2017; v1 submitted 1 February, 2017; originally announced February 2017.

arXiv:1702.00135 [pdf, other]

Analysis of Unprotected Intersection Left-Turn Conflicts based on Naturalistic Driving Data

Authors: Xinpeng Wang, Ding Zhao, Huei Peng, David J. LeBlanc

Abstract: Analyzing and reconstructing driving scenarios is crucial for testing and evaluating automated vehicles. This research analyzed left turn / straight-driving conflicts at unprotected intersections by extracting actual vehicle motion data from a naturalistic driving database collected by the University of Michigan. Nearly 7,000 Left turn across path opposite direction (LTAP/OD) events involving heavy trucks and light vehicles were extracted and used to build a stochastic model of such LTAP/OD scenarios. Statistical analysis showed that vehicle type is a significant factor, whereas the change of season seems to have limited influence on the statistical nature of the conflict. The results can be used to build HAV testing environments to simulate the LTAP/OD crash cases in a stochastic manner, which is among the top NHTSA identified priority light-vehicle pre-crash scenarios.

Submitted 3 April, 2017; v1 submitted 1 February, 2017; originally announced February 2017.

arXiv:1610.09450 [pdf, other]

Evaluation of Automated Vehicles in the Frontal Cut-in Scenario - an Enhanced Approach using Piecewise Mixture Models

Authors: Zhiyuan Huang, Ding Zhao, Henry Lam, David J. LeBlanc, Huei Peng

Abstract: Evaluation and testing are critical for the development of Automated Vehicles (AVs). Currently, companies test AVs on public roads, which is very time-consuming and inefficient. We proposed the Accelerated Evaluation concept which uses a modified statistics of the surrounding vehicles and the Importance Sampling theory to reduce the evaluation time by several orders of magnitude, while ensuring the final evaluation results are accurate. In this paper, we further extend this idea by using Piecewise Mixture Distribution models instead of Single Distribution models. We demonstrate this idea to evaluate vehicle safety in lane change scenarios. The behavior of the cut-in vehicles was modeled based on more than 400,000 naturalistic driving lane changes collected by the University of Michigan Safety Pilot Model Deployment Program. Simulation results confirm that the accuracy and efficiency of the Piecewise Mixture Distribution method are better than the single distribution.

Submitted 30 January, 2017; v1 submitted 28 October, 2016; originally announced October 2016.

Comments: 6 pages, 8 figures

MSC Class: accepted by ICRA

arXiv:1606.08365 [pdf]

doi 10.1109/TITS.2017.2649538

Empirical Study of DSRC Performance Based on Safety Pilot Model Deployment Data

Authors: Xianan Huang, Ding Zhao, Huei Peng

Abstract: Dedicated Short Range Communication (DSRC) was designed to provide reliable wireless communication for intelligent transportation system applications. Sharing information among cars and between cars and the infrastructure, pedestrians, or "the cloud" has great potential to improve safety, mobility and fuel economy. DSRC is being considered by the US Department of Transportation to be required for ground vehicles. In the past, its performance has been assessed thoroughly in the labs and limited field testing, but not on a large fleet. In this paper, we present the analysis of DSRC performance using data from the world's largest connected vehicle test program - Safety Pilot Model Deployment led by the University of Michigan. We first investigate its maximum and effective range, and then study the effect of environmental factors, such as trees/foliage, weather, buildings, vehicle travel direction, and road elevation. The results can be used to guide future DSRC equipment placement and installation, and can be used to develop DSRC communication models for numerical simulations.

Submitted 16 June, 2016; originally announced June 2016.

Showing 1–46 of 46 results for author: Peng, H