Search | arXiv e-print repository

arXiv:2404.11938 [pdf, other]

HyDiscGAN: A Hybrid Distributed cGAN for Audio-Visual Privacy Preservation in Multimodal Sentiment Analysis

Authors: Zhuojia Wu, Qi Zhang, Duoqian Miao, Kun Yi, Wei Fan, Liang Hu

Abstract: Multimodal Sentiment Analysis (MSA) aims to identify speakers' sentiment tendencies in multimodal video content, raising serious concerns about privacy risks associated with multimodal data, such as voiceprints and facial images. Recent distributed collaborative learning has been verified as an effective paradigm for privacy preservation in multimodal tasks. However, such methods often overlook the privacy distinctions among different modalities and struggle to strike a balance between performance and privacy preservation. Consequently, this poses an intriguing question of how to maximize multimodal utilization to improve performance while simultaneously protecting necessary modalities. This paper forms the first attempt at modality-specified (i.e., audio and visual) privacy preservation in MSA tasks. We propose a novel Hybrid Distributed cross-modality cGAN framework (HyDiscGAN), which learns multimodality alignment to generate fake audio and visual features conditioned on shareable de-identified textual data. The objective is to leverage the fake features to approximate real audio and visual content to guarantee privacy preservation while effectively enhancing performance. Extensive experiments show that compared with the state-of-the-art MSA model, HyDiscGAN can achieve superior or competitive performance while preserving privacy.

Submitted 18 April, 2024; originally announced April 2024.

Comments: 13 pages, IJCAI-2024

arXiv:2404.11171 [pdf, other]

Personalized Heart Disease Detection via ECG Digital Twin Generation

Authors: Yaojun Hu, Jintai Chen, Lianting Hu, Dantong Li, Jiahuan Yan, Haochao Ying, Huiying Liang, Jian Wu

Abstract: Heart diseases rank among the leading causes of global mortality, demonstrating a crucial need for early diagnosis and intervention. Most traditional electrocardiogram (ECG) based automated diagnosis methods are trained at population level, neglecting the customization of personalized ECGs to enhance individual healthcare management. A potential solution to address this limitation is to employ digital twins to simulate symptoms of diseases in real patients. In this paper, we present an innovative prospective learning approach for personalized heart disease detection, which generates digital twins of healthy individuals' anomalous ECGs and enhances the model sensitivity to the personalized symptoms. In our approach, a vector quantized feature separator is proposed to locate and isolate the disease symptom and normal segments in ECG signals with ECG report guidance. Thus, the ECG digital twins can simulate specific heart diseases used to train a personalized heart disease detection model. Experiments demonstrate that our approach not only excels in generating high-fidelity ECG signals but also improves personalized heart disease detection. Moreover, our approach ensures robust privacy protection, safeguarding patient data in model development.

Submitted 11 May, 2024; v1 submitted 17 April, 2024; originally announced April 2024.

arXiv:2403.06463 [pdf, other]

A prediction-based forward-looking vehicle dispatching strategy for dynamic ride-pooling

Authors: Xiaolei Wang, Chen Yang, Yuzhen Feng, Luohan Hu, Zhengbing He

Abstract: For on-demand dynamic ride-pooling services, e.g., Uber Pool and Didi Pinche, a well-designed vehicle dispatching strategy is crucial for platform profitability and passenger experience. Most existing dispatching strategies overlook incoming pairing opportunities and therefore suffer from short-sighted limitations. In this paper, we propose a forward-looking vehicle dispatching strategy, which first predicts the expected distance saving that could be brought about by future orders and then solves a bipartite matching problem based on the prediction to match passengers with partially occupied or vacant vehicles or keep passengers waiting for the next rounds of matching. To demonstrate the performance of the proposed strategy, a number of simulation experiments and comparisons are conducted based on the real-world road network and historical trip data from Haikou, China. Results show that the proposed strategy outperforms the baseline strategies by generating approximately 31% more distance saving and 18% less average passenger detour distance. This indicates the significant benefits of considering future pairing opportunities in dispatching, and highlights the effectiveness of our innovative forward-looking vehicle dispatching strategy in improving system efficiency and user experience for dynamic ride-pooling services.

Submitted 11 March, 2024; originally announced March 2024.

arXiv:2402.09457 [pdf]

Self-Healing Effects in OAM Beams Observed on a 28 GHz Experimental Link

Authors: Marek Klemes, Lan Hu, Greg Bowles, Mohammad Akbari, Soulideth Thirakoune, Michael Schwartzman, Kevin Zhang, Tan Huy Ho, David Wessel, Wen Tong

Abstract: In this paper we document for the first time some of the effects of self-healing, a property of orbital-angular-momentum (OAM) or vortex beams, as observed on a millimeter-wave experimental communications link in an outdoor line-of-sight (LOS) scenario. The OAM beams have a helical phase and polarization structure and have a conical amplitude shape in the far field. The Poynting vectors of the OAM beams also possess helical structures, orthogonal to the corresponding helical phase-fronts. Due to such non-planar structure in the direction orthogonal to the beam axis, OAM beams are a subset of structured light beams. Such structured beams are known to possess self-healing properties when partially obstructed along their propagation axis, especially in their near fields, resulting in partial reconstruction of their structures at larger distances along their beam axis. Various theoretical rationales have been proposed to explain, model and experimentally verify the self-healing physical effects in structured optical beams, using various types of obstructions and experimental techniques. Based on these models, we hypothesize that any self-healing observed will be greater as the OAM order increases. Here we observe the self-healing effects for the first time in structured OAM radio beams, in terms of communication signals and channel parameters rather than beam structures. We capture the effects of partial near-field obstructions of OAM beams of different orders on the communications signals and provide a physical rationale to substantiate that the self-healing effect was observed to increase with the order of OAM, agreeing with our hypothesis.

Submitted 7 February, 2024; originally announced February 2024.

Comments: 9 pages, 10 figures, pending submission to IEEE Access journal

arXiv:2401.10242 [pdf, other]

DanceMeld: Unraveling Dance Phrases with Hierarchical Latent Codes for Music-to-Dance Synthesis

Authors: Xin Gao, Li Hu, Peng Zhang, Bang Zhang, Liefeng Bo

Abstract: In the realm of 3D digital human applications, music-to-dance presents a challenging task. Given the one-to-many relationship between music and dance, previous methods have been limited in their approach, relying solely on matching and generating corresponding dance movements based on music rhythm. In the professional field of choreography, a dance phrase consists of several dance poses and dance movements. Dance poses are composed of a series of basic meaningful body postures, while dance movements can reflect dynamic changes such as the rhythm, melody, and style of dance. Taking inspiration from these concepts, we introduce an innovative dance generation pipeline called DanceMeld, which comprises two stages, i.e., the dance decouple stage and the dance generation stage. In the decouple stage, a hierarchical VQ-VAE is used to disentangle dance poses and dance movements in different feature space levels, where the bottom code represents dance poses, and the top code represents dance movements. In the generation stage, we utilize a diffusion model as a prior to model the distribution and generate latent codes conditioned on music features. We have experimentally demonstrated the representational capabilities of top code and bottom code, enabling the explicit decoupling expression of dance poses and dance movements. This disentanglement not only provides control over motion details, styles, and rhythm but also facilitates applications such as dance style transfer and dance unit editing. Our approach has undergone qualitative and quantitative experiments on the AIST++ dataset, demonstrating its superiority over other methods.

Submitted 30 November, 2023; originally announced January 2024.

Comments: 10 pages, 8 figures

arXiv:2401.05000 [pdf]

Mapping Information in Feature Extraction Transformation for Chirp Signal

Authors: Shuyi Gu, Zhenghua Luo, Lin Hu, Yilin Zhang, Junxiong Guo

Abstract: Chirp signals have found diverse applications owing to their capability of producing time-dependent linear frequencies. Most feature extraction transformation methods for chirp signals focus on enhancing the performance of transform methods but neglect the information derived from the transformation process. Consequently, they may fail to fully exploit the information from observations, resulting in decreased performance under conditions of low signal-to-noise ratio and limited observations. In this work, we develop a novel post-processing method called the mapping information model to address this challenge. The model establishes a link between the observation space and feature space in the feature extraction transform, enabling interference suppression and yielding more accurate information by iteratively resampling and assigning weights in both spaces. Analysis of the iteration process reveals a continual increase in the weight of signal samples and a gradual stabilization in the weight of noise samples. The demonstration of noise suppression in the iteration process and feature enhancement supports the effectiveness of the mapping information model. Furthermore, numerical simulations also affirm the high efficiency of the proposed model by showcasing enhanced signal detection and estimation performance without requiring additional observations. This model improves performance within feature extraction transformation for chirp signal processing under low SNR and limited observation conditions, opening up new opportunities for areas such as communication, biomedicine, and remote sensing.

Submitted 10 January, 2024; originally announced January 2024.

Comments: 14 pages, 10 figures

arXiv:2312.17266 [pdf]

Automatic laminectomy cutting plane planning based on artificial intelligence in robot assisted laminectomy surgery

Authors: Zhuofu Li, Yonghong Zhang, Chengxia Wang, Shanshan Liu, Xiongkang Song, Xuquan Ji, Shuai Jiang, Woquan Zhong, Lei Hu, Weishi Li

Abstract: Objective: This study aims to use artificial intelligence to realize the automatic planning of laminectomy, and verify the method. Methods: We propose a two-stage approach for automatic laminectomy cutting plane planning. The first stage was the identification of key points. 7 key points were manually marked on each CT image. The Spatial Pyramid Upsampling Network (SPU-Net) algorithm developed by us was used to accurately locate the 7 key points. In the second stage, based on the identification of key points, a personalized coordinate system was generated for each vertebra. Finally, the transverse and longitudinal cutting planes of laminectomy were generated under the coordinate system. The overall effect of planning was evaluated. Results: In the first stage, the average localization error of the SPU-Net algorithm for the seven key points was 0.65mm. In the second stage, a total of 320 transverse cutting planes and 640 longitudinal cutting planes were planned by the algorithm. Among them, the number of horizontal plane planning effects of grade A, B, and C were 318(99.38%), 1(0.31%), and 1(0.31%), respectively. The longitudinal planning effects of grade A, B, and C were 622(97.18%), 1(0.16%), and 17(2.66%), respectively. Conclusions: In this study, we propose a method for automatic surgical path planning of laminectomy based on the localization of key points in CT images. The results showed that the method achieved satisfactory results. More studies are needed to confirm the reliability of this approach in the future.

Submitted 25 December, 2023; originally announced December 2023.

arXiv:2308.05757 [pdf, other]

OrcoDCS: An IoT-Edge Orchestrated Online Deep Compressed Sensing Framework

Authors: Cheng-Wei Ching, Chirag Gupta, Zi Huang, Liting Hu

Abstract: Compressed data aggregation (CDA) over wireless sensor networks (WSNs) is task-specific and subject to environmental changes. However, the existing CDA frameworks (e.g., compressed sensing-based data aggregation, deep learning (DL)-based data aggregation) do not possess the flexibility and adaptivity required to handle distinct sensing tasks and environmental changes. Additionally, they do not consider the performance of follow-up IoT data-driven DL-based applications. To address these shortcomings, we propose OrcoDCS, an IoT-Edge orchestrated online deep compressed sensing framework that offers high flexibility and adaptability to distinct IoT device groups and their sensing tasks, as well as high performance for follow-up applications. The novelty of our work is the design and deployment of an IoT-Edge orchestrated online training framework over WSNs by leveraging a specially designed asymmetric autoencoder, which can largely reduce the encoding overhead and improve the reconstruction performance and robustness. We show analytically and empirically that OrcoDCS outperforms the state-of-the-art DCDA on training time, significantly improves flexibility and adaptability when distinct reconstruction tasks are given, and achieves higher performance for follow-up applications.

Submitted 5 August, 2023; originally announced August 2023.

Comments: 6 pages, 8 figures, to appear in 2023 IEEE International Conference on Distributed Computing Systems Workshop on ECAI

arXiv:2304.01466 [pdf]

OTFDM: A Novel 2D Modulation Waveform Modeling Dot-product Doubly-selective Channel

Authors: Yihua Ma, Zhifeng Yuan, Yu Xin, Jiang Hua, Guanghui Yu, Jin Xu, Liujun Hu

Abstract: Recently, the two-dimensional (2D) modulation waveform of orthogonal time-frequency-space (OTFS) has been a popular 6G candidate to replace existing orthogonal frequency division multiplexing (OFDM). Extensive OTFS research has helped to make both the advantages and limitations of OTFS increasingly clear. The limitations are not easy to overcome, as they come from the on-grid 2D convolution channel model of OTFS. Instead of solving the inborn challenges of OTFS, this paper proposes a novel 2D modulation waveform named orthogonal time-frequency division multiplexing (OTFDM). OTFDM uses a 2D dot-product channel model to cope with double selectivity. Compared with OTFS, OTFDM supports grid-free channel delay and Doppler and gains a simple and efficient 2D equalization. The concise dot-division equalization can be easily combined with MIMO. The simulation results show that OTFDM is able to bear high mobility and greatly outperforms OFDM in doubly-selective channels.

Submitted 4 July, 2023; v1 submitted 3 April, 2023; originally announced April 2023.

Comments: Accepted by IEEE PIMRC 2023

arXiv:2303.09071 [pdf, other]

doi 10.1109/WACVW54805.2022.00080

Joint Multi-Scale Tone Mapping and Denoising for HDR Image Enhancement

Authors: Litao Hu, Huaijin Chen, Jan P. Allebach

Abstract: An image processing unit (IPU), or image signal processor (ISP) for high dynamic range (HDR) imaging usually consists of demosaicing, white balancing, lens shading correction, color correction, denoising, and tone-mapping. Besides noise from the imaging sensors, almost every step in the ISP introduces or amplifies noise in different ways, and denoising operators are designed to reduce the noise from these sources. Designed for dynamic range compressing, tone-mapping operators in an ISP can significantly amplify the noise level, especially for images captured in low-light conditions, making denoising very difficult. Therefore, we propose a joint multi-scale denoising and tone-mapping framework that is designed with both operations in mind for HDR images. Our joint network is trained in an end-to-end format that optimizes both operators together, to prevent the tone-mapping operator from overwhelming the denoising operator. Our model outperforms existing HDR denoising and tone-mapping operators both quantitatively and qualitatively on most of our benchmarking datasets.

Submitted 23 March, 2023; v1 submitted 16 March, 2023; originally announced March 2023.

Comments: 10 pages, 4 figures, WACVW2022. Codes available at https://github.com/hulitaotom/Joint-Multi-Scale-Tone-Map**-and-Denoising-for-HDR-Image-Enhancement

arXiv:2209.13786 [pdf, other]

A Parameter-free Nonconvex Low-rank Tensor Completion Model for Spatiotemporal Traffic Data Recovery

Authors: Yang He, Yuheng Jia, Liyang Hu, Chengchuan An, Zhenbo Lu, Jingxin Xia

Abstract: Traffic data chronically suffer from missing and corrupted values, leading to accuracy and utility reduction in subsequent Intelligent Transportation System (ITS) applications. Noticing the inherent low-rank property of traffic data, numerous studies formulated missing traffic data recovery as a low-rank tensor completion (LRTC) problem. Due to the non-convexity and discreteness of the rank minimization in LRTC, existing methods either replaced rank with convex surrogates that are quite far away from the rank function or approximated rank with nonconvex surrogates involving many parameters. In this study, we proposed a Parameter-Free Non-Convex Tensor Completion model (TC-PFNC) for traffic data recovery, in which a log-based relaxation term was designed to approximate tensor algebraic rank. Moreover, previous studies usually assumed the observations are reliable without any outliers. Therefore, we extended the TC-PFNC to a robust version (RTC-PFNC) by modeling potential traffic data outliers, which can recover the missing values from partial and corrupted observations and remove the anomalies in observations. The numerical solutions of TC-PFNC and RTC-PFNC were elaborated based on the alternating direction method of multipliers (ADMM). The extensive experimental results conducted on four real-world traffic data sets demonstrated that the proposed methods outperform other state-of-the-art methods in both missing and corrupted data recovery. The code used in this paper is available at: https://github.com/YoungHe49/T-ITSPFNC.

Submitted 27 September, 2022; originally announced September 2022.

Comments: 10 pages, 7 figures

arXiv:2207.03241 [pdf]

doi 10.1109/JIOT.2023.3274120

Highly Efficient Waveform Design and Hybrid Duplex for Joint Communication and Sensing

Authors: Yihua Ma, Zhifeng Yuan, Shuqiang Xia, Guanghui Yu, Liujun Hu

Abstract: Joint communication and sensing (JCAS) is a very promising 6G technology, which attracts more and more research attention. Compared with communication, radar has many unique features in terms of waveform design criteria, self-interference cancellation (SIC), aperture-dependent resolution, and virtual aperture. This paper proposes a novel waveform design named max-aperture radar slicing (MaRS) to gain a large time-frequency aperture, which is generated by orthogonal frequency division multiplexing (OFDM) and occupies only a tiny fraction of OFDM resources. The proposed MaRS keeps the radar advantages of constant modulus, zero auto-correlation sequence, and simple SIC. As MaRS consumes far fewer resources, conventional processing methods fail, and novel angle-Doppler map based methods are proposed to obtain the range-velocity-angle information from MaRS echoes and strong clutter. To avoid complex full-duplex communication, this paper proposes a hybrid-duplex JCAS scheme composed of half-duplex communication and full-duplex radar. The half-duplex communication antenna array is reused, and a small sensing-dedicated antenna array is added. Using these two arrays, a large space-domain sensing aperture is virtually formed to greatly improve the angle resolution. The numerical results show that the proposed MaRS and hybrid duplex can achieve a high sensing resolution with only 0.4% OFDM resources, which reduces the overheads of conventional methods to less than one tenth.

Submitted 4 July, 2023; v1 submitted 7 July, 2022; originally announced July 2022.

Comments: in IEEE Internet of Things Journal

arXiv:2205.03225 [pdf, other]

doi 10.1364/OE.460704

Multiple-access relay stations for long-haul fiber-optic radio frequency transfer

Authors: Qi Li, Liang Hu, Jinbo Zhang, Jianping Chen, Guiling Wu

Abstract: We report on the realization of a long-haul radio frequency (RF) transfer scheme by using multiple-access relay stations (MARSs). The proposed scheme with independent link noise compensation for each fiber sub-link effectively solves the limitation of compensation bandwidth for long-haul transfer. The MARS can share the same modulated optical signal for the front and rear fiber sub-links, simplifying the configuration at the repeater station and giving the transfer system multiple-access capability. At the same time, we theoretically model, for the first time, the effect of the MARS position on the fractional frequency instability of the fiber-optic RF transfer, demonstrating that the MARS position has little effect on the system's performance when the ratio of the front and rear fiber sub-links is around $1:1$. We experimentally demonstrate a 1 GHz signal transfer by using one MARS connecting 260 and 280 km fiber links with the fractional frequency instabilities of less than $5.9\times10^{-14}$ at 1 s and $8.5\times10^{-17}$ at 10,000 s at the remote site and of $5.6\times10^{-14}$ and $6.6\times10^{-17}$ at the integration times of 1 s and 10,000 s at the MARS. The proposed scalable technique can arbitrarily add the same MARSs in the fiber link, which has great potential in realizing ultra-long-haul RF transfer.

Submitted 4 May, 2022; originally announced May 2022.

Comments: Accepted for publication in Optics Express

arXiv:2201.00941 [pdf]

Waveform Design Using Half-duplex Devices for 6G Joint Communications and Sensing

Authors: Yihua Ma, Zhifeng Yuan, Guanghui Yu, Shuqiang Xia, Liujun Hu

Abstract: Joint communications and sensing is a promising 6G technology, and the challenge is how to integrate them efficiently. Existing frequency-division and time-division coexistence can hardly bring a gain of integration. Directly using orthogonal frequency-division multiplexing (OFDM) to sense requires complex in-band full-duplex to cancel the self-interference (SI). To solve these problems, this paper proposes novel coexistence schemes to gain super sensing range (SSR) and simple SI cancellation. SSR enables JCS to gain a sensing range of a sensing-only scheme and shares the resources with communications. Random time-division is proposed to gain a super Doppler range. Flexible sensing implanted OFDM (FSI-OFDM) is also proposed. FSI-OFDM uses random sensing occasions to gain super Doppler range, and utilizes the fixed tail sensing occasions to achieve super distance range. The simulation results show that the proposed schemes can gain SSR with limited resources.

Submitted 3 January, 2022; originally announced January 2022.

arXiv:2111.15200 [pdf, other]

Contrastive Learning for Local and Global Learning MRI Reconstruction

Authors: Qiaosi Yi, Jinhao Liu, Le Hu, Faming Fang, Guixu Zhang

Abstract: Magnetic Resonance Imaging (MRI) is an important medical imaging modality, but it requires a long acquisition time. To reduce the acquisition time, various methods have been proposed. However, these methods failed to reconstruct images with a clear structure for two main reasons. Firstly, similar patches widely exist in MR images, while most previous deep learning-based methods ignore this property and only adopt CNNs to learn local information. Secondly, the existing methods only use clear images to constrain the upper bound of the solution space, while the lower bound is not constrained, so that a better parameter of the network cannot be obtained. To address these problems, we propose a Contrastive Learning for Local and Global Learning MRI Reconstruction Network (CLGNet). Specifically, according to the Fourier theory, each value in the Fourier domain is calculated from all the values in the Spatial domain. Therefore, we propose a Spatial and Fourier Layer (SFL) to simultaneously learn the local and global information in Spatial and Fourier domains. Moreover, compared with self-attention and transformer, the SFL has a stronger learning ability and can achieve better performance in less time. Based on the SFL, we design a Spatial and Fourier Residual block as the main component of our model. Meanwhile, to constrain the lower bound and upper bound of the solution space, we introduce contrastive learning, which can pull the result closer to the clear image and push the result further away from the undersampled image. Extensive experimental results on different datasets and acceleration rates demonstrate that the proposed CLGNet achieves new state-of-the-art results.

Submitted 30 November, 2021; originally announced November 2021.

arXiv:2111.07552 [pdf, other]

Dynamic Placement of Rapidly Deployable Mobile Sensor Robots Using Machine Learning and Expected Value of Information

Authors: Alice Agogino, Hae Young Jang, Vivek Rao, Ritik Batra, Felicity Liao, Rohan Sood, Irving Fang, R. Lily Hu, Emerson Shoichet-Bartus, John Matranga

Abstract: Although the Industrial Internet of Things has increased the number of sensors permanently installed in industrial plants, there will be gaps in coverage due to broken sensors or sparse density in very large plants, such as in the petrochemical industry. Modern emergency response operations are beginning to use Small Unmanned Aerial Systems (sUAS) that have the ability to drop sensor robots to precise locations. sUAS can provide longer-term persistent monitoring that aerial drones are unable to provide. Despite the relatively low cost of these assets, the choice of which robotic sensing systems to deploy to which part of an industrial process in a complex plant environment during emergency response remains challenging. This paper describes a framework for optimizing the deployment of emergency sensors as a preliminary step towards realizing the responsiveness of robots in disaster circumstances. AI techniques (Long short-term memory, 1-dimensional convolutional neural network, logistic regression, and random forest) identify regions where sensors would be most valued without requiring humans to enter the potentially dangerous area. In the case study described, the cost function for optimization considers costs of false-positive and false-negative errors. Decisions on mitigation include implementing repairs or shutting down the plant. The Expected Value of Information (EVI) is used to identify the most valuable type and location of physical sensors to be deployed to increase the decision-analytic value of a sensor network. This method is applied to a case study using the Tennessee Eastman process data set of a chemical plant, and we discuss implications of our findings for operation, distribution, and decision-making of sensors in plant emergency and resilience scenarios.

Submitted 15 November, 2021; originally announced November 2021.

Comments: 14 pages, 11 figures, IMECE2021

arXiv:2109.08417 [pdf, other]

Transformer-Unet: Raw Image Processing with Unet

Authors: Youyang Sha, Yonghong Zhang, Xuquan Ji, Lei Hu

Abstract: Medical image segmentation has drawn massive attention as it is important in biomedical image analysis. Good segmentation results can assist doctors with their judgement and further improve patients' experience. Among many available pipelines in medical image analysis, Unet is one of the most popular neural networks as it keeps raw features by adding concatenation between encoder and decoder, which makes it still widely used in industry. Meanwhile, as a popular model which dominates natural language processing tasks, the transformer is now being introduced to computer vision tasks and has seen promising results in object detection, image classification and semantic segmentation tasks. Therefore, the combination of transformer and Unet is supposed to be more efficient than both methods working individually. In this article, we propose Transformer-Unet by adding transformer modules in raw images instead of feature maps in Unet and test our network on the CT82 dataset for pancreas segmentation. We form an end-to-end network and gain segmentation results better than many previous Unet based algorithms in our experiment. We demonstrate our network and show our experimental results in this paper.

Submitted 17 September, 2021; originally announced September 2021.

arXiv:2109.03389 [pdf, other]

An Optimal Resource Allocator of Elastic Training for Deep Learning Jobs on Cloud

Authors: Liang Hu, Jiangcheng Zhu, Zirui Zhou, Ruiqing Cheng, Xiaolong Bai, Yong Zhang

Abstract: Cloud training platforms, such as Amazon Web Services and Huawei Cloud, provide users with computational resources to train their deep learning jobs. Elastic training is a service embedded in cloud training platforms that dynamically scales up or down the resources allocated to a job. The core technique of an elastic training system is to best allocate limited resources among heterogeneous jobs in terms of shorter queueing delay and higher training efficiency. This paper presents an optimal resource allocator for an elastic training system that leverages a mixed-integer programming (MIP) model to maximize the training progress of deep learning jobs. We take advantage of the real-world job data obtained from ModelArts, the deep learning training platform of Huawei Cloud, and conduct simulation experiments to compare the optimal resource allocator with a greedy one as benchmark. Numerical results show that the proposed allocator can reduce queuing time by up to 32% and accelerate training efficiency by up to 24% relative to the greedy resource allocator, thereby greatly improving user experience with Huawei ModelArts and potentially enabling the realization of higher profits for the product. Also, the optimal resource allocator is fast in decision-making, taking merely 0.4 seconds on average.

Submitted 7 September, 2021; originally announced September 2021.

arXiv:2108.08494 [pdf, ps, other]

Towards a Multispectral RGB-IR-UV-D Vision System -- Seeing the Invisible in 3D

Authors: Tanhao Zhang, Luyin Hu, Lu Li, David Navarro-Alarcon

Abstract: In this paper, we present the development of a sensing system with the capability to compute multispectral point clouds in real-time. The proposed multi-eye sensor system effectively registers information from the visible, (long-wave) infrared, and ultraviolet spectrum to its depth sensing frame, thus enabling the measurement of a wider range of surface features that are otherwise hidden to the naked eye. For that, we designed a new cross-calibration apparatus that produces consistent features which can be sensed by each of the cameras, therefore acting as a multispectral "chessboard". The performance of the sensor is evaluated with two different case studies, where we show that the proposed system can detect "hidden" features of a 3D environment.

Submitted 19 August, 2021; originally announced August 2021.

arXiv:2012.13147 [pdf, other]

On Radiation-Based Thermal Servoing: New Models, Controls and Experiments

Authors: Luyin Hu, David Navarro-Alarcon, Andrea Cherubini, Mengying Li

Abstract: In this paper, we introduce a new sensor-based control method that regulates (by means of robot motions) the heat transfer between a radiative source and an object of interest. This valuable sensorimotor capability is needed in many industrial, dermatology and field robot applications, and it is an essential component for creating machines with advanced thermo-motor intelligence. To this end, we derive a geometric-thermal-motor model which describes the relationship between the robot's active configuration and the produced dynamic thermal response. We then use the model to guide the design of two new thermal servoing controllers (one model-based and one adaptive), and analyze their stability with Lyapunov theory. To validate our method, we report a detailed experimental study with a robotic manipulator conducting autonomous thermal servoing tasks. To the best of the authors' knowledge, this is the first time that temperature regulation has been formulated as a motion control problem for robots.

Submitted 24 December, 2020; originally announced December 2020.

Comments: 15 pages, 22 figures

arXiv:2010.13529 [pdf, other]

Lyapunov-Based Reinforcement Learning State Estimator

Authors: Liang Hu, Chengwei Wu, Wei Pan

Abstract: In this paper, we consider the state estimation problem for nonlinear stochastic discrete-time systems. We combine Lyapunov's method in control theory and deep reinforcement learning to design the state estimator. We theoretically prove the convergence of the bounded estimate error solely using the data simulated from the model. An actor-critic reinforcement learning algorithm is proposed to learn the state estimator approximated by a deep neural network. The convergence of the algorithm is analysed. The proposed Lyapunov-based reinforcement learning state estimator is compared with a number of existing nonlinear filtering methods through Monte Carlo simulations, showing its advantage in terms of estimate convergence even under some system uncertainties such as covariance shift in system noise and randomly missing measurements. To the best of our knowledge, this is the first reinforcement learning based nonlinear state estimator with bounded estimate error performance guarantee.

Submitted 7 January, 2021; v1 submitted 26 October, 2020; originally announced October 2020.

arXiv:2009.02285 [pdf, other]

Flow Field Reconstructions with GANs based on Radial Basis Functions

Authors: Liwei Hu, Wenyong Wang, Yu Xiang, Jun Zhang

Abstract: Nonlinear sparse data regression and generation have been a long-term challenge, with flow field reconstruction as a typical example. The huge computational cost of computational fluid dynamics (CFD) makes large-scale CFD data production very expensive, which is why cheaper alternatives are needed. Traditional reduced order models (ROMs) were promising, but they could not generate a large number of full domain flow field data (FFD) to realize high-precision flow field reconstructions. Motivated by the problems of existing approaches and inspired by the success of the generative adversarial networks (GANs) in the field of computer vision, we prove an optimal discriminator theorem that the optimal discriminator of a GAN is a radial basis function neural network (RBFNN) while dealing with nonlinear sparse FFD regression and generation. Based on this theorem, two radial basis function-based GANs (RBF-GAN and RBFC-GAN), for regression and generation purposes, are proposed. Three different datasets are applied to verify the feasibility of our models. The results show that the performance of the RBF-GAN and the RBFC-GAN is better than that of GANs/cGANs in terms of both the mean square error (MSE) and the mean square percentage error (MSPE). Besides, compared with GANs/cGANs, the stability of the RBF-GAN and the RBFC-GAN improves by 34.62% and 72.31%, respectively. Consequently, our proposed models can be used to generate full domain FFD from limited and sparse datasets, to meet the requirement of high-precision flow field reconstructions.

Submitted 11 August, 2020; originally announced September 2020.

arXiv:2006.02627 [pdf]

Robust Automatic Whole Brain Extraction on Magnetic Resonance Imaging of Brain Tumor Patients using Dense-Vnet

Authors: Sara Ranjbar, Kyle W. Singleton, Lee Curtin, Cassandra R. Rickertsen, Lisa E. Paulson, Leland S. Hu, J. Ross Mitchell, Kristin R. Swanson

Abstract: Whole brain extraction, also known as skull stripping, is a process in neuroimaging in which non-brain tissue such as skull, eyeballs, skin, etc. is removed from neuroimages. Skull stripping is a preliminary step in presurgical planning, cortical reconstruction, and automatic tumor segmentation. Despite a plethora of skull stripping approaches in the literature, few are sufficiently accurate for processing pathology-presenting MRIs, especially MRIs with brain tumors. In this work we propose a deep learning approach for skull stripping common MRI sequences in oncology such as T1-weighted with gadolinium contrast (T1Gd) and T2-weighted fluid attenuated inversion recovery (FLAIR) in patients with brain tumors. We automatically created gray matter, white matter, and CSF probability masks using SPM12 software and merged the masks into one for a final whole-brain mask for model training. Dice agreement, sensitivity, and specificity of the model (referred to herein as DeepBrain) were tested against manual brain masks. To assess data efficiency, we retrained our models using progressively fewer training data examples and calculated average dice scores on the test set for the models trained in each round. Further, we tested our model against MRI of healthy brains from the LBP40A dataset. Overall, DeepBrain yielded an average dice score of 94.5%, sensitivity of 96.4%, and specificity of 98.5% on brain tumor data. For healthy brains, model performance improved to a dice score of 96.2%, sensitivity of 96.6% and specificity of 99.2%. The data efficiency experiment showed that, for this specific task, comparable levels of accuracy could have been achieved with as few as 50 training samples. In conclusion, this study demonstrated that a deep learning model trained on minimally processed automatically-generated labels can generate more accurate brain masks on MRI of brain tumor patients within seconds.

Submitted 3 June, 2020; originally announced June 2020.

arXiv:2005.10462 [pdf, other]

Robotics Meets Cosmetic Dermatology: Development of a Novel Vision-Guided System for Skin Photo-Rejuvenation

Authors: Muhammad Muddassir, Domingo Gomez, Shujian Chen, Luyin Hu, David Navarro-Alarcon

Abstract: In this paper, we present a novel robotic system for skin photo-rejuvenation procedures, which can uniformly deliver the laser's energy over the skin of the face. The robotised procedure is performed by a manipulator whose end-effector is instrumented with a depth sensor, a thermal camera, and a cosmetic laser generator. To plan the heat stimulating trajectories for the laser, the system computes the surface model of the face and segments it into seven regions that are automatically filled with laser shots. We report experimental results with human subjects to validate the performance of the system. To the best of the authors' knowledge, this is the first time that facial skin rejuvenation has been automated by robot manipulators.

Submitted 5 November, 2020; v1 submitted 21 May, 2020; originally announced May 2020.

Comments: 11 pages, 16 figures

arXiv:2005.03976 [pdf]

Dual connectivity and standalone modes for LTE-U

Authors: Yajun Zhao, Yu-Ngok Ruyue Li, Liujun Hu, Chenchen Zhang

Abstract: Long-Term Evolution in unlicensed bands (LTE-U) has been considered an effective way of offloading traffic from licensed bands. This paper discusses the scenarios, requirements and different operation modes of LTE-U. Motivations and benefits of supporting two of the operation modes, namely Dual Connectivity (DC) and standalone, are discussed. Further, evaluation results of some typical LTE-U scenarios are provided to show the benefits of supporting dual connectivity and standalone modes for LTE-U.

Submitted 16 April, 2020; originally announced May 2020.

Comments: 7 pages, 6 figures and 7 tables. For ICC 2015 Workshop-LTE in Unlicensed Bands

arXiv:2003.10647 [pdf, other]

Robust and On-the-fly Dataset Denoising for Image Classification

Authors: Jiaming Song, Lunjia Hu, Michael Auli, Yann Dauphin, Tengyu Ma

Abstract: Memorization in over-parameterized neural networks could severely hurt generalization in the presence of mislabeled examples. However, mislabeled examples are hard to avoid in extremely large datasets collected with weak supervision. We address this problem by reasoning counterfactually about the loss distribution of examples with uniform random labels had they been trained with the real examples, and use this information to remove noisy examples from the training set. First, we observe that examples with uniform random labels have higher losses when trained with stochastic gradient descent under large learning rates. Then, we propose to model the loss distribution of the counterfactual examples using only the network parameters, which is able to model such examples with remarkable success. Finally, we propose to remove examples whose loss exceeds a certain quantile of the modeled loss distribution. This leads to On-the-fly Data Denoising (ODD), a simple yet effective algorithm that is robust to mislabeled examples, while introducing almost zero computational overhead compared to standard training. ODD is able to achieve state-of-the-art results on a wide range of datasets including real-world ones such as WebVision and Clothing1M.

Submitted 9 April, 2020; v1 submitted 23 March, 2020; originally announced March 2020.

arXiv:1911.13238 [pdf, other]

Machine Learning-based Signal Detection for PMH Signals in Load-modulated MIMO System

Authors: Jinle Zhu, Qiang Li, Li Hu, Hongyang Chen, Nirwan Ansari

Abstract: Phase Modulation on the Hypersphere (PMH) is a power efficient modulation scheme for load-modulated multiple-input multiple-output (MIMO) transmitters with central power amplifiers (CPA). However, it is difficult to obtain the precise channel state information (CSI), and the traditional optimal maximum likelihood (ML) detection scheme incurs high complexity which increases exponentially with the number of antennas and the number of bits carried per antenna in the PMH modulation. To detect the PMH signals without knowing the prior CSI, we first propose a signal detection scheme, termed the hypersphere clustering scheme based on the expectation maximization (EM) algorithm with maximum likelihood detection (HEM-ML). By leveraging machine learning, the proposed detection scheme can accurately obtain information of the channel from a few of the received symbols with little resource cost and achieve detection results comparable to those of the optimal ML detector. To further reduce the computational complexity of the ML detection in HEM-ML, we also propose a second signal detection scheme, termed the hypersphere clustering scheme based on the EM algorithm with KD-tree detection (HEM-KD). The CSI obtained from the EM algorithm is used to build a spatial KD-tree receiver codebook and the signal detection problem can be transformed into a nearest neighbor search (NNS) problem. The detection complexity of HEM-KD is significantly reduced without any detection performance loss as compared to HEM-ML. Extensive simulation results verify the effectiveness of our proposed detection schemes.

Submitted 24 November, 2019; originally announced November 2019.

Comments: with example

arXiv:1908.01901 [pdf, other]

Fully-automated patient-level malaria assessment on field-prepared thin blood film microscopy images, including Supplementary Information

Authors: Charles B. Delahunt, Mayoore S. Jaiswal, Matthew P. Horning, Samantha Janko, Clay M. Thompson, Sourabh Kulhare, Liming Hu, Travis Ostbye, Grace Yun, Roman Gebrehiwot, Benjamin K. Wilson, Earl Long, Stephane Proux, Dionicia Gamboa, Peter Chiodini, Jane Carter, Mehul Dhorda, David Isaboke, Bernhards Ogutu, Wellington Oyibo, Elizabeth Villasis, Kyaw Myo Tun, Christine Bachman, David Bell, Courosh Mehanian

Abstract: Malaria is a life-threatening disease affecting millions. Microscopy-based assessment of thin blood films is a standard method to (i) determine malaria species and (ii) quantitate high-parasitemia infections. Full automation of malaria microscopy by machine learning (ML) is a challenging task because field-prepared slides vary widely in quality and presentation, and artifacts often heavily outnumber relatively rare parasites. In this work, we describe a complete, fully-automated framework for thin film malaria analysis that applies ML methods, including convolutional neural nets (CNNs), trained on a large and diverse dataset of field-prepared thin blood films. Quantitation and species identification results are close to sufficiently accurate for the concrete needs of drug resistance monitoring and clinical use-cases on field-prepared samples. We focus our methods and our performance metrics on the field use-case requirements. We discuss key issues and important metrics for the application of ML methods to malaria microscopy.

Submitted 11 September, 2022; v1 submitted 5 August, 2019; originally announced August 2019.

Comments: 16 pages, 13 figures

MSC Class: 68T10 ACM Class: I.5.0

arXiv:1907.09691 [pdf, other]

Deep-SLAM++: Object-level RGBD SLAM based on class-specific deep shape priors

Authors: Lan Hu, Wanting Xu, Kun Huang, Laurent Kneip

Abstract: In an effort to increase the capabilities of SLAM systems and produce object-level representations, the community increasingly investigates the imposition of higher-level priors into the estimation process. One such example is given by employing object detectors to load and register full CAD models. Our work extends this idea to environments with unknown objects and imposes object priors by employing modern class-specific neural networks to generate complete model geometry proposals. The difficulty of using such predictions in a real SLAM scenario is that the prediction performance depends on the view-point and measurement quality, with even small changes of the input data sometimes leading to a large variability in the network output. We propose a discrete selection strategy that finds the best among multiple proposals from different registered views by re-enforcing the agreement with the online depth measurements. The result is an effective object-level RGBD SLAM system that produces compact, high-fidelity, and dense 3D maps with semantic annotations. It outperforms traditional fusion strategies in terms of map completeness and resilience against degrading measurement quality.

Submitted 9 December, 2019; v1 submitted 23 July, 2019; originally announced July 2019.

arXiv:1906.01895 [pdf, ps, other]

AI-Skin : Skin Disease Recognition based on Self-learning and Wide Data Collection through a Closed Loop Framework

Authors: Min Chen, Ping Zhou, Di Wu, Long Hu, Mohammad Mehedi Hassan, Atif Alamri

Abstract: Changes in human skin condition carry many hidden dangers, such as sunburn caused by long-term exposure to ultraviolet radiation, which not only has an aesthetic impact that can cause psychological depression and a lack of self-confidence, but may even be life-threatening if it leads to cancerous changes in the skin. Current skin disease research adopts auto-classification systems to improve the accuracy of skin disease classification. However, because of their excessive dependence on the image sample database, such systems cannot provide individualized diagnosis services for different population groups. To overcome this problem, a medical AI framework based on data width evolution and self-learning is put forward in this paper to provide skin disease medical services that meet the requirements of real-time operation, extensibility and individualization. First, the wide collection of data in the closed-loop information flow between the user and the remote medical data center is discussed. Next, a data set filter algorithm based on information entropy is given, to lighten the load on the edge node while improving the learning ability of the remote cloud analysis model. In addition, the framework provides an external algorithm load module, which can be compatible with the application requirements according to the model selected. Three kinds of deep learning models, i.e. LeNet-5, AlexNet and VGG16, are loaded and compared, which verifies the universality of the algorithm load module. The experiment platform for the proposed real-time, individualized and extensible skin disease recognition system is built, and the system's computation and communication delays under the interaction scenario between tester and remote data center are analyzed. It is demonstrated that the system we put forward is reliable and effective.

Submitted 5 June, 2019; originally announced June 2019.

arXiv:1906.01810 [pdf, other]

A Sustainable Multi-modal Multi-layer Emotion-aware Service at the Edge

Authors: Long Hu, Wei Li, Jun Yang, Giancarlo Fortino, Min Chen

Abstract: Limited by the computational capabilities and battery energy of terminal devices and by network bandwidth, emotion recognition tasks fail to achieve a good interactive experience for users. The intolerable latency for users also seriously restricts the popularization of emotion recognition applications in edge environments, such as fatigue detection in auto-driving. The development of edge computing provides a more sustainable solution for this problem. Based on edge computing, this article proposes a multi-modal multi-layer emotion-aware service (MULTI-EASE) architecture that considers the user's facial expression and voice as a multi-modal data source for emotion recognition, and employs the intelligent terminal, edge server and cloud as a multi-layer execution environment. By analyzing the average delay of each task and the average energy consumption at the mobile device, we formulate a delay-constrained energy minimization problem and perform a task scheduling policy between multiple layers to reduce the end-to-end delay and energy consumption by using an edge-based approach, further improving the users' emotion interactive experience and achieving energy saving in edge computing. Finally, a prototype system is also implemented to validate the architecture of MULTI-EASE. The experimental results show that MULTI-EASE is a sustainable and efficient platform for emotion analysis applications, and also provide a valuable reference for dynamic task scheduling under the MULTI-EASE architecture.

Submitted 4 June, 2019; originally announced June 2019.

Showing 1–31 of 31 results for author: Hu, L