Showing 1–50 of 58 results for author: Jiang, L

Searching in archive eess.
  1. arXiv:2405.13762  [pdf, other]

    cs.CV cs.LG cs.MM cs.SD eess.AS

    A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

    Authors: Gwanghyun Kim, Alonso Martinez, Yu-Chuan Su, Brendan Jou, José Lezama, Agrim Gupta, Lijun Yu, Lu Jiang, Aren Jansen, Jacob Walker, Krishna Somandepalli

    Abstract: Training diffusion models for audiovisual sequences allows for a range of generation tasks by learning conditional distributions of various input-output combinations of the two modalities. Nevertheless, this strategy often requires training a separate model for each task which is expensive. Here, we propose a novel training approach to effectively learn arbitrary conditional distributions in the a…

    Submitted 22 May, 2024; originally announced May 2024.

  2. arXiv:2405.10825  [pdf, other]

    eess.SY cs.LG

    Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities

    Authors: Hao Zhou, Chengming Hu, Ye Yuan, Yufei Cui, Yili Jin, Can Chen, Haolun Wu, Dun Yuan, Li Jiang, Di Wu, Xue Liu, Charlie Zhang, Xianbin Wang, Jiangchuan Liu

    Abstract: Large language models (LLMs) have received considerable attention recently due to their outstanding comprehension and reasoning capabilities, leading to great progress in many fields. The advancement of LLM techniques also offers promising opportunities to automate many tasks in the telecommunication (telecom) field. After pre-training and fine-tuning, LLMs can perform diverse downstream tasks bas…

    Submitted 17 May, 2024; originally announced May 2024.

  3. arXiv:2404.11168  [pdf]

    physics.optics eess.SP

    Microwave photonic short-time Fourier transform based on stabilized period-one nonlinear laser dynamics and stimulated Brillouin scattering

    Authors: Sunan Zhang, Taixia Shi, Lizhong Jiang, Yang Chen

    Abstract: A microwave photonic short-time Fourier transform (STFT) system based on stabilized period-one (P1) nonlinear laser dynamics and stimulated Brillouin scattering (SBS) is proposed. By using an optoelectronic feedback loop, the frequency-sweep optical signal generated by the P1 nonlinear laser dynamics is stabilized, which is further used in conjunction with an optical bandpass filter implemented by…

    Submitted 17 April, 2024; originally announced April 2024.

    Comments: 9 pages, 6 figures

  4. arXiv:2404.07577  [pdf, other]

    cs.LG eess.SP

    Generating Comprehensive Lithium Battery Charging Data with Generative AI

    Authors: Lidang Jiang, Changyan Hu, Sibei Ji, Hang Zhao, Junxiong Chen, Ge He

    Abstract: In optimizing performance and extending the lifespan of lithium batteries, accurate state prediction is pivotal. Traditional regression and classification methods have achieved some success in battery state prediction. However, the efficacy of these data-driven approaches heavily relies on the availability and quality of public datasets. Additionally, generating electrochemical data predominantly…

    Submitted 11 April, 2024; originally announced April 2024.

  5. arXiv:2403.09062  [pdf]

    eess.IV cs.CV

    TBI Image/Text (TBI-IT): Comprehensive Text and Image Datasets for Traumatic Brain Injury Research

    Authors: Jie Li, Jiaying Wen, Tongxin Yang, Fenglin Cai, Miao Wei, Zhiwei Zhang, Li Jiang

    Abstract: In this paper, we introduce a new dataset in the medical field of Traumatic Brain Injury (TBI), called TBI-IT, which includes both electronic medical records (EMRs) and head CT images. This dataset is designed to enhance the accuracy of artificial intelligence in the diagnosis and treatment of TBI. This dataset, built upon the foundation of standard text and image data, incorporates specific annot…

    Submitted 13 March, 2024; originally announced March 2024.

    Comments: arXiv admin note: substantial text overlap with arXiv:2401.15934

  6. arXiv:2402.18070  [pdf, other]

    cs.AR eess.SP

    A Hierarchical Dataflow-Driven Heterogeneous Architecture for Wireless Baseband Processing

    Authors: Limin Jiang, Yi Shi, Haiqin Hu, Qingyu Deng, Siyi Xu, Yintao Liu, Feng Yuan, Si Wang, Yihao Shen, Fangfang Ye, Shan Cao, Zhiyuan Jiang

    Abstract: Wireless baseband processing (WBP) is a key element of wireless communications, with a series of signal processing modules to improve data throughput and counter channel fading. Conventional hardware solutions, such as digital signal processors (DSPs) and more recently, graphic processing units (GPUs), provide various degrees of parallelism, yet they both fail to take into account the cyclical and…

    Submitted 28 February, 2024; originally announced February 2024.

    Comments: 7 pages, 7 figures, conference

  7. arXiv:2311.11969  [pdf, other]

    eess.IV cs.CV

    SA-Med2D-20M Dataset: Segment Anything in 2D Medical Imaging with 20 Million masks

    Authors: Jin Ye, Junlong Cheng, Jianpin Chen, Zhongying Deng, Tianbin Li, Haoyu Wang, Yanzhou Su, Ziyan Huang, Jilong Chen, Lei Jiang, Hui Sun, Min Zhu, Shaoting Zhang, Junjun He, Yu Qiao

    Abstract: Segment Anything Model (SAM) has achieved impressive results for natural image segmentation with input prompts such as points and bounding boxes. Its success largely owes to massive labeled training data. However, directly applying SAM to medical image segmentation cannot perform well because SAM lacks medical knowledge -- it does not use medical images for training. To incorporate medical knowled…

    Submitted 20 November, 2023; originally announced November 2023.

  8. arXiv:2311.04049  [pdf, other]

    eess.IV cs.CV

    3D EAGAN: 3D edge-aware attention generative adversarial network for prostate segmentation in transrectal ultrasound images

    Authors: Mengqing Liu, Xiao Shao, Liping Jiang, Kaizhi Wu

    Abstract: Automatic prostate segmentation in TRUS images has always been a challenging problem, since prostates in TRUS images have ambiguous boundaries and inhomogeneous intensity distribution. Although many prostate segmentation methods have been proposed, they still need to be improved due to the lack of sensibility to edge information. Consequently, the objective of this study is to devise a highly effe…

    Submitted 7 November, 2023; originally announced November 2023.

  9. arXiv:2310.13541  [pdf, ps, other]

    eess.SY

    Distributed Adaptive Time-Varying Convex Optimization for Multi-agent Systems

    Authors: Liangze Jiang, Zhengguang Wu, Lei Wang

    Abstract: This paper focuses on the time-varying convex optimization problems with uncertain parameters. A new class of adaptive algorithms is proposed to solve time-varying convex optimization problems. Under the mild assumption of Hessian and partial derivative of the gradient with respect to time, the dependence on them is reduced through appropriate adaptive law design. By integrating the new adaptive op…

    Submitted 20 October, 2023; originally announced October 2023.

    Comments: 12 pages, 2 figures

  10. arXiv:2310.00854  [pdf, other]

    eess.SY

    Regulating CPU Temperature With Thermal-Aware Scheduling Using a Reduced Order Learning Thermal Model

    Authors: Anthony Dowling, Lin Jiang, Ming-Cheng Cheng, Yu Liu

    Abstract: Modern real-time systems utilize considerable amounts of power while executing computation-intensive tasks. The execution of these tasks leads to significant power dissipation and heating of the device. It therefore results in severe thermal issues like temperature escalation, high thermal gradients, and excessive hot spot formation, which may result in degrading chip performance, accelerating dev…

    Submitted 6 February, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

    Comments: This version includes revisions to the previous version to improve the clarity and presentation of the work

  11. arXiv:2308.06891  [pdf]

    cs.RO eess.SY

    Viia-hand: a Reach-and-grasp Restoration System Integrating Voice interaction, Computer vision and Auditory feedback for Blind Amputees

    Authors: Chunhao Peng, Dapeng Yang, Ming Cheng, Jinghui Dai, Deyu Zhao, Li Jiang

    Abstract: Visual feedback plays a crucial role in the process of amputation patients completing grasping in the field of prosthesis control. However, for blind and visually impaired (BVI) amputees, the loss of both visual and grasping abilities makes the "easy" reach-and-grasp task a feasible challenge. In this paper, we propose a novel multi-sensory prosthesis system helping BVI amputees with sensing, navi…

    Submitted 13 August, 2023; originally announced August 2023.

  12. arXiv:2306.17717  [pdf, other]

    cs.GR eess.IV

    Content-Preserving Diffusion Model for Unsupervised AS-OCT image Despeckling

    Authors: Li Sanqian, Higashita Risa, Fu Huazhu, Li Heng, Niu Jingxuan, Liu Jiang

    Abstract: Anterior segment optical coherence tomography (AS-OCT) is a non-invasive imaging technique that is highly valuable for ophthalmic diagnosis. However, speckles in AS-OCT images can often degrade the image quality and affect clinical analysis. As a result, removing speckles in AS-OCT images can greatly benefit automatic ophthalmology analysis. Unfortunately, challenges still exist in deploying effec…

    Submitted 30 June, 2023; originally announced June 2023.

  13. arXiv:2303.14081  [pdf, other]

    eess.IV cs.CV

    CoLa-Diff: Conditional Latent Diffusion Model for Multi-Modal MRI Synthesis

    Authors: Lan Jiang, Ye Mao, Xi Chen, Xiangfeng Wang, Chao Li

    Abstract: MRI synthesis promises to mitigate the challenge of missing MRI modality in clinical practice. Diffusion model has emerged as an effective technique for image synthesis by modelling complex and variable data distributions. However, most diffusion-based MRI synthesis models are using a single modality. As they operate in the original image domain, they are memory-intensive and less feasible for mul…

    Submitted 24 March, 2023; originally announced March 2023.

    Comments: 8 pages

    ACM Class: I.3.3; I.4.10

  14. arXiv:2303.13933  [pdf, other]

    eess.IV cs.CV

    DisC-Diff: Disentangled Conditional Diffusion Model for Multi-Contrast MRI Super-Resolution

    Authors: Ye Mao, Lan Jiang, Xi Chen, Chao Li

    Abstract: Multi-contrast magnetic resonance imaging (MRI) is the most common management tool used to characterize neurological disorders based on brain tissue contrasts. However, acquiring high-resolution MRI scans is time-consuming and infeasible under specific conditions. Hence, multi-contrast super-resolution methods have been developed to improve the quality of low-resolution contrasts by leveraging com…

    Submitted 6 June, 2023; v1 submitted 24 March, 2023; originally announced March 2023.

    Comments: Early Accepted by MICCAI 2023

  15. arXiv:2303.12249  [pdf, other]

    cs.CV cs.CR eess.IV

    State-of-the-art optical-based physical adversarial attacks for deep learning computer vision systems

    Authors: Junbin Fang, You Jiang, Canjian Jiang, Zoe L. Jiang, Siu-Ming Yiu, Chuanyi Liu

    Abstract: Adversarial attacks can mislead deep learning models to make false predictions by implanting small perturbations to the original input that are imperceptible to the human eye, which poses a huge security threat to the computer vision systems based on deep learning. Physical adversarial attacks, which are more realistic, as the perturbation is introduced to the input before it is being captured and…

    Submitted 21 March, 2023; originally announced March 2023.

    Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

  16. arXiv:2301.04889  [pdf]

    eess.IV

    Artificial intelligence for diagnosing and predicting survival of patients with renal cell carcinoma: Retrospective multi-center study

    Authors: Siteng Chen, Xiyue Wang, Jun Zhang, Liren Jiang, Ning Zhang, Feng Gao, Wei Yang, Jinxi Xiang, Sen Yang, Junhua Zheng, Xiao Han

    Abstract: Background: Clear cell renal cell carcinoma (ccRCC) is the most common renal-related tumor with high heterogeneity. There is still an urgent need for novel diagnostic and prognostic biomarkers for ccRCC. Methods: We proposed a weakly-supervised deep learning strategy using conventional histology of 1752 whole slide images from multiple centers. Our study was demonstrated through internal cross-val…

    Submitted 12 January, 2023; originally announced January 2023.

  17. arXiv:2210.13761  [pdf, other]

    eess.AS cs.SD

    Streaming Parrotron for on-device speech-to-speech conversion

    Authors: Oleg Rybakov, Fadi Biadsy, Xia Zhang, Liyang Jiang, Phoenix Meadowlark, Shivani Agrawal

    Abstract: We present a fully on-device streaming Speech2Speech conversion model that normalizes a given input speech directly to synthesized output speech. Deploying such a model on mobile devices poses significant challenges in terms of memory footprint and computation requirements. We present a streaming-based approach to produce an acceptable delay, with minimal loss in speech conversion quality, when com…

    Submitted 24 May, 2023; v1 submitted 25 October, 2022; originally announced October 2022.

  18. arXiv:2209.10675  [pdf, other]

    math.OC cs.LG eess.IV stat.ML

    A Validation Approach to Over-parameterized Matrix and Image Recovery

    Authors: Lijun Ding, Zhen Qin, Liwei Jiang, Jinxin Zhou, Zhihui Zhu

    Abstract: In this paper, we study the problem of recovering a low-rank matrix from a number of noisy random linear measurements. We consider the setting where the rank of the ground-truth matrix is unknown a priori and use an overspecified factored representation of the matrix variable, where the global optimal solutions overfit and do not correspond to the underlying ground-truth. We then solve the associat…

    Submitted 21 September, 2022; originally announced September 2022.

    Comments: 29 pages and 9 figures

  19. arXiv:2208.05122  [pdf, other]

    eess.AS

    Improving Hypernasality Estimation with Automatic Speech Recognition in Cleft Palate Speech

    Authors: Kaitao Song, Teng Wan, Bixia Wang, Huiqiang Jiang, Luna Qiu, Jiahang Xu, Liping Jiang, Qun Lou, Yuqing Yang, Dongsheng Li, Xudong Wang, Lili Qiu

    Abstract: Hypernasality is an abnormal resonance in human speech production, especially in patients with craniofacial anomalies such as cleft palate. In clinical application, hypernasality estimation is crucial in cleft palate diagnosis, as its results determine the subsequent surgery and additional speech therapy. Therefore, designing an automatic hypernasality assessment method will facilitate speech-lang…

    Submitted 9 August, 2022; originally announced August 2022.

    Comments: Accepted by InterSpeech 2022

  20. arXiv:2207.13882  [pdf, other]

    eess.IV cs.CV

    SuperVessel: Segmenting High-resolution Vessel from Low-resolution Retinal Image

    Authors: Yan Hu, Zhongxi Qiu, Dan Zeng, Li Jiang, Chen Lin, Jiang Liu

    Abstract: Vascular segmentation extracts blood vessels from images and serves as the basis for diagnosing various diseases, like ophthalmic diseases. Ophthalmologists often require high-resolution segmentation results for analysis, which leads to super-computational load by most existing methods. If based on low-resolution input, they easily ignore tiny vessels or cause discontinuity of segmented vessels. T…

    Submitted 28 July, 2022; originally announced July 2022.

    Comments: Accepted by PRCV2022

  21. arXiv:2206.04948  [pdf, other]

    eess.SY

    A Holistic Robust Motion Controller Framework for Autonomous Platooning

    Authors: Hong Wang, Li-Ming Peng, Zi-Chun Wei, Kai Yang, Xian-Xu Bai, Luo Jiang, Ehsan Hashemi

    Abstract: Safety is the foremost concern for autonomous platooning. The vehicle-to-vehicle (V2V) communication delay and the sudden appearance of obstacles will trigger the safety of the intended functionality (SOTIF) issues for autonomous platooning. This research proposes a holistic robust motion controller framework (MCF) for an intelligent and connected vehicle platoon system. The MCF utilizes a hierarc…

    Submitted 10 June, 2022; originally announced June 2022.

    Comments: 13 pages, 20 figures

  22. arXiv:2205.14523  [pdf, other]

    eess.SY cs.FL cs.LO

    Risk of Stochastic Systems for Temporal Logic Specifications

    Authors: Lars Lindemann, Lejun Jiang, Nikolai Matni, George J. Pappas

    Abstract: The wide availability of data coupled with the computational advances in artificial intelligence and machine learning promise to enable many future technologies such as autonomous driving. While there has been a variety of successful demonstrations of these technologies, critical system failures have repeatedly been reported. Even if rare, such system failures pose a serious barrier to adoption wi…

    Submitted 8 October, 2022; v1 submitted 28 May, 2022; originally announced May 2022.

  23. arXiv:2205.01550  [pdf, other]

    cs.CV eess.IV

    Point Cloud Semantic Segmentation using Multi Scale Sparse Convolution Neural Network

    Authors: Yunzheng Su, Lei Jiang, Jie Cao

    Abstract: In recent years, with the development of computing resources and LiDAR, point cloud semantic segmentation has attracted many researchers. For the sparsity of point clouds, although there is already a way to deal with sparse convolution, multi-scale features are not considered. In this letter, we propose a feature extraction module based on multi-scale sparse convolution and a feature selection mod…

    Submitted 29 June, 2022; v1 submitted 3 May, 2022; originally announced May 2022.

  24. arXiv:2203.00756  [pdf, other]

    eess.AS cs.SD

    Real time spectrogram inversion on mobile phone

    Authors: Oleg Rybakov, Marco Tagliasacchi, Yunpeng Li, Liyang Jiang, Xia Zhang, Fadi Biadsy

    Abstract: We present two methods of real time magnitude spectrogram inversion: streaming Griffin Lim (GL) and streaming MelGAN. We demonstrate the impact of looking ahead on perceptual quality of MelGAN. As little as one hop size (12.5 ms) of lookahead is able to significantly improve perceptual quality in comparison to its causal version. We compare streaming GL with the streaming MelGAN and show different t…

    Submitted 24 May, 2023; v1 submitted 1 March, 2022; originally announced March 2022.

  25. arXiv:2111.09971  [pdf, other]

    eess.SY cs.LG

    Learning Robust Output Control Barrier Functions from Safe Expert Demonstrations

    Authors: Lars Lindemann, Alexander Robey, Lejun Jiang, Satyajeet Das, Stephen Tu, Nikolai Matni

    Abstract: This paper addresses learning safe output feedback control laws from partial observations of expert demonstrations. We assume that a model of the system dynamics and a state estimator are available along with corresponding error bounds, e.g., estimated from data in practice. We first propose robust output control barrier functions (ROCBFs) as a means to guarantee safety, as defined through control…

    Submitted 2 April, 2024; v1 submitted 18 November, 2021; originally announced November 2021.

    Comments: Journal paper

  26. arXiv:2110.12857  [pdf]

    physics.app-ph eess.SP physics.optics

    Photonics-assisted microwave pulse detection and frequency measurement based on pulse replication and frequency-to-time mapping

    Authors: Pengcheng Zuo, Dong Ma, Qingbo Liu, Lizhong Jiang, Yang Chen

    Abstract: A photonics-assisted microwave pulse detection and frequency measurement scheme is proposed. The unknown microwave pulse is converted to the optical domain and then injected into a fiber loop for pulse replication, which makes it easier to identify the microwave pulse with large pulse repetition interval (PRI), whereas stimulated Brillouin scattering-based frequency-to-time mapping (FTTM) is utili…

    Submitted 25 September, 2021; originally announced October 2021.

    Comments: 13 pages, 8 figures

  27. arXiv:2109.13322  [pdf, other]

    physics.optics eess.SY physics.app-ph quant-ph

    Induced transparency: interference or polarization?

    Authors: Changqing Wang, Xuefeng Jiang, William R. Sweeney, Chia Wei Hsu, Yiming Liu, Guangming Zhao, Bo Peng, Mengzhen Zhang, Liang Jiang, A. Douglas Stone, Lan Yang

    Abstract: The polarization of optical fields is a crucial degree of freedom in the all-optical analogue of electromagnetically induced transparency (EIT). However, the physical origins of EIT and polarization induced phenomena have not been well distinguished, which can lead to confusion in associated applications such as slow light and optical/quantum storage. Here we study the polarization effects in vari…

    Submitted 27 September, 2021; originally announced September 2021.

    Comments: 8 pages, 4 figures, 57 references. The published version can be found via URL: https://www.pnas.org/content/118/3/e2012982118

    Journal ref: Proceedings of the National Academy of Sciences Vol. 118 No. 3 e2012982118 (19 Jan 2021)

  28. arXiv:2107.04589  [pdf, other]

    cs.CV cs.LG eess.IV

    ViTGAN: Training GANs with Vision Transformers

    Authors: Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, Ce Liu

    Abstract: Recently, Vision Transformers (ViTs) have shown competitive performance on image recognition while requiring less vision-specific inductive biases. In this paper, we investigate if such performance can be extended to image generation. To this end, we integrate the ViT architecture into generative adversarial networks (GANs). For ViT discriminators, we observe that existing regularization methods f…

    Submitted 29 May, 2024; v1 submitted 9 July, 2021; originally announced July 2021.

    Comments: Accepted to ICLR 2022 (Spotlight)

  29. arXiv:2106.11172  [pdf]

    eess.SP physics.app-ph

    Multi-functional microwave photonic radar system for simultaneous distance and velocity measurement and high-resolution microwave imaging

    Authors: Dingding Liang, Lizhong Jiang, Yang Chen

    Abstract: A photonic-assisted multi-functional radar system for simultaneous distance and velocity measurement and high-resolution microwave imaging is proposed and experimentally demonstrated by using a composite transmitted microwave signal of a single-chirped linearly frequency-modulated (LFM) signal and a single-tone microwave signal. In the system, the transmitted signal is generated via photonic frequ…

    Submitted 27 May, 2021; originally announced June 2021.

    Comments: 16 pages, 9 figures

  30. arXiv:2105.05701  [pdf, other]

    eess.SY

    Deep Multi-agent Reinforcement Learning for Highway On-Ramp Merging in Mixed Traffic

    Authors: Dong Chen, Mohammad Hajidavalloo, Zhaojian Li, Kaian Chen, Yongqiang Wang, Longsheng Jiang, Yue Wang

    Abstract: On-ramp merging is a challenging task for autonomous vehicles (AVs), especially in mixed traffic where AVs coexist with human-driven vehicles (HDVs). In this paper, we formulate the mixed-traffic highway on-ramp merging problem as a multi-agent reinforcement learning (MARL) problem, where the AVs (on both merge lane and through lane) collaboratively learn a policy to adapt to HDVs to maximize the…

    Submitted 5 November, 2022; v1 submitted 12 May, 2021; originally announced May 2021.

    Comments: 15 figures

  31. arXiv:2104.10781  [pdf, other]

    eess.IV cs.CV

    NTIRE 2021 Challenge on Quality Enhancement of Compressed Video: Methods and Results

    Authors: Ren Yang, Radu Timofte, Jing Liu, Yi Xu, Xinjian Zhang, Minyi Zhao, Shuigeng Zhou, Kelvin C. K. Chan, Shangchen Zhou, Xiangyu Xu, Chen Change Loy, Xin Li, Fanglong Liu, He Zheng, Lielin Jiang, Qi Zhang, Dongliang He, Fu Li, Qingqing Dang, Yibin Huang, Matteo Maggioni, Zhongqian Fu, Shuai Xiao, Cheng li, Thomas Tanay, et al. (47 additional authors not shown)

    Abstract: This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results. In this challenge, the new Large-scale Diverse Video (LDV) dataset is employed. The challenge has three tracks. Tracks 1 and 2 aim at enhancing the videos compressed by HEVC at a fixed QP, while Track 3 is designed for enhancing the videos compressed by x265 at…

    Submitted 31 August, 2022; v1 submitted 21 April, 2021; originally announced April 2021.

    Comments: Corrected the MOS values in Table 2, and corrected some minor typos

  32. arXiv:2103.14236  [pdf, other]

    cs.SD eess.AS

    Subspace-based compressive sensing algorithm for raypath separation in a shallow-water waveguide

    Authors: Longyu Jiang, Zhe Zhang, Rui **, Xiao Zhou, Philippe Roux

    Abstract: Compressive sensing (CS) has been applied to estimate the direction of arrival (DOA) in underwater acoustics. However, the key problem needed to be resolved in a multipath propagation environment is to suppress the interferences between the raypaths. Thus, in this paper, a subspace-based compressive sensing algorithm that formulates the statistic information of the signal subspace in a CS frame…

    Submitted 25 March, 2021; originally announced March 2021.

  33. arXiv:2012.12821  [pdf, other]

    cs.CV cs.LG eess.IV

    Focal Frequency Loss for Image Reconstruction and Synthesis

    Authors: Liming Jiang, Bo Dai, Wayne Wu, Chen Change Loy

    Abstract: Image reconstruction and synthesis have witnessed remarkable progress thanks to the development of generative models. Nonetheless, gaps could still exist between the real and generated images, especially in the frequency domain. In this study, we show that narrowing gaps in the frequency domain can ameliorate image reconstruction and synthesis quality further. We propose a novel focal frequency lo…

    Submitted 23 August, 2021; v1 submitted 23 December, 2020; originally announced December 2020.

    Comments: ICCV 2021. GitHub: https://github.com/EndlessSora/focal-frequency-loss Project page: https://www.mmlab-ntu.com/project/ffl/index.html

  34. arXiv:2012.08698  [pdf, other]

    cs.LG eess.SP

    Edge Entropy as an Indicator of the Effectiveness of GNNs over CNNs for Node Classification

    Authors: Lavender Yao Jiang, John Shi, Mark Cheung, Oren Wright, José M. F. Moura

    Abstract: Graph neural networks (GNNs) extend convolutional neural networks (CNNs) to graph-based data. A question that arises is how much performance improvement does the underlying graph structure in the GNN provide over the CNN (that ignores this graph structure). To address this question, we introduce edge entropy and evaluate how good an indicator it is for possible performance improvement of GNNs over…

    Submitted 15 December, 2020; originally announced December 2020.

  35. arXiv:2012.06091  [pdf, other]

    physics.optics eess.IV

    Single-pixel Tracking and Imaging under Weak Illumination

    Authors: Shuai Sun, Hong-Kang Hu, Yao-Kun Xu, Hui-Zu Lin, Er-Feng Zhang, Liang Jiang, Wei-Tao Liu

    Abstract: Under weak illumination, tracking and imaging moving objects turns out to be hard. By spatially collecting the signal, single pixel imaging schemes promise the capability of image reconstruction from low photon flux. However, due to the requirement on large number of samplings, how to clearly image moving objects is an essential problem for such schemes. Here we present a principle of single pixel…

    Submitted 10 December, 2020; originally announced December 2020.

  36. Deep image prior for undersampling high-speed photoacoustic microscopy

    Authors: Tri Vu, Anthony DiSpirito III, Daiwei Li, Zixuan Zhang, Xiaoyi Zhu, Maomao Chen, Laiming Jiang, Dong Zhang, Jianwen Luo, Yu Shrike Zhang, Qifa Zhou, Roarke Horstmeyer, Junjie Yao

    Abstract: Photoacoustic microscopy (PAM) is an emerging imaging method combining light and sound. However, limited by the laser's repetition rate, state-of-the-art high-speed PAM technology often sacrifices spatial sampling density (i.e., undersampling) for increased imaging speed over a large field-of-view. Deep learning (DL) methods have recently been used to improve sparsely sampled PAM images; however,…

    Submitted 7 April, 2021; v1 submitted 15 October, 2020; originally announced October 2020.

  37. Graph Signal Processing and Deep Learning: Convolution, Pooling, and Topology

    Authors: Mark Cheung, John Shi, Oren Wright, Lavender Y. Jiang, Xujin Liu, José M. F. Moura

    Abstract: Deep learning, particularly convolutional neural networks (CNNs), has yielded rapid, significant improvements in computer vision and related domains. But conventional deep learning architectures perform poorly when data have an underlying graph structure, as in social, biological, and many other domains. This paper explores 1) how graph signal processing (GSP) can be used to extend CNN components…

    Submitted 3 August, 2020; originally announced August 2020.

    Comments: To be published on IEEE Signal Processing Magazine

  38. arXiv:2007.12072  [pdf, other]

    cs.CV cs.LG eess.IV

    TSIT: A Simple and Versatile Framework for Image-to-Image Translation

    Authors: Liming Jiang, Changxu Zhang, Mingyang Huang, Chunxiao Liu, Jianping Shi, Chen Change Loy

    Abstract: We introduce a simple and versatile framework for image-to-image translation. We unearth the importance of normalization layers, and provide a carefully designed two-stream generative model with newly proposed feature transformations in a coarse-to-fine fashion. This allows multi-scale semantic structure information and style representation to be effectively captured and fused by the network, perm…

    Submitted 25 July, 2020; v1 submitted 23 July, 2020; originally announced July 2020.

    Comments: ECCV 2020 (Spotlight). Table 2 is updated. GitHub: https://github.com/EndlessSora/TSIT

  39. arXiv:2007.00322  [pdf, other]

    eess.SP

    Kernel Learning for High-Resolution Time-Frequency Distribution

    Authors: Lei Jiang, Haijian Zhang, Lei Yu, Guang Hua

    Abstract: The design of high-resolution and cross-term (CT) free time-frequency distributions (TFDs) has been an open problem. Classical kernel based methods are limited by the trade-off between TFD resolution and CT suppression, even under optimally derived parameters. To break the current limitation, we propose a data-driven kernel learning model directly based on Wigner-Ville distribution (WVD). The prop…

    Submitted 16 July, 2021; v1 submitted 1 July, 2020; originally announced July 2020.

  40. arXiv:2005.08245  [pdf]

    eess.SP cs.AI cs.LG

    Dampen the Stop-and-Go Traffic with Connected and Automated Vehicles -- A Deep Reinforcement Learning Approach

    Authors: Liming Jiang, Yuanchang Xie, Danjue Chen, Tienan Li, Nicholas G. Evans

    Abstract: Stop-and-go traffic poses many challenges to transportation systems, but its formation and mechanism are still under exploration. However, it has been proved that by introducing Connected Automated Vehicles (CAVs) with carefully designed controllers one could dampen the stop-and-go waves in the vehicle fleet. Instead of using an analytical model, this study adopts reinforcement learning to control the be…

    Submitted 17 May, 2020; originally announced May 2020.

  41. arXiv:2004.14820  [pdf, other]

    eess.SP

    Robust Time-Frequency Reconstruction by Learning Structured Sparsity

    Authors: Lei Jiang, Haijian Zhang, Lei Yu

    Abstract: Time-frequency distributions (TFDs) play a vital role in providing descriptive analysis of non-stationary signals involved in realistic scenarios. It is well known that low time-frequency (TF) resolution and the emergence of cross-terms (CTs) are two main issues, which make it difficult to analyze and interpret practical signals using TFDs. In order to address these issues, we propose the U-Net ai…

    Submitted 30 April, 2020; originally announced April 2020.

  42. arXiv:2004.03519  [pdf, other]

    eess.SP cs.LG

    Pooling in Graph Convolutional Neural Networks

    Authors: Mark Cheung, John Shi, Lavender Yao Jiang, Oren Wright, José M. F. Moura

    Abstract: Graph convolutional neural networks (GCNNs) are a powerful extension of deep learning techniques to graph-structured data problems. We empirically evaluate several pooling methods for GCNNs, and combinations of those graph pooling methods with three different architectures: GCN, TAGCN, and GraphSAGE. We confirm that graph pooling, especially DiffPool, improves classification accuracy on popular gr…

    Submitted 7 April, 2020; originally announced April 2020.

    Comments: 5 pages, 2 figures, 2019 Asilomar Conference paper

  43. arXiv:2002.08587  [pdf, other]

    eess.IV cs.CV

    Cross-stained Segmentation from Renal Biopsy Images Using Multi-level Adversarial Learning

    Authors: Ke Mei, Chuang Zhu, Lei Jiang, Jun Liu, Yuanyuan Qiao

    Abstract: Segmentation from renal pathological images is a key step in automatically analyzing renal histological characteristics. However, the performance of models varies significantly in different types of stained datasets due to the appearance variations. In this paper, we design a robust and flexible model for cross-stained segmentation. It is a novel multi-level deep adversarial network architecture t…

    Submitted 20 February, 2020; originally announced February 2020.

    Comments: Accepted by ICASSP 2020

  44. arXiv:2002.00179  [pdf]

    cs.CV cs.CR eess.IV

    AdvJND: Generating Adversarial Examples with Just Noticeable Difference

    Authors: Zifei Zhang, Kai Qiao, Lingyun Jiang, Linyuan Wang, Bin Yan

    Abstract: Compared with traditional machine learning models, deep neural networks perform better, especially in image classification tasks. However, they are vulnerable to adversarial examples. Adding small perturbations on examples causes a good-performance model to misclassify the crafted examples, without category differences in the human eyes, and fools deep models successfully. There are two requiremen…

    Submitted 23 June, 2020; v1 submitted 1 February, 2020; originally announced February 2020.

  45. arXiv:2001.11954  [pdf, other]

    eess.SP

    MindReading: An Ultra-Low-Power Photonic Accelerator for EEG-based Human Intention Recognition

    Authors: Qian Lou, Wenyang Liu, Weichen Liu, Feng Guo, Lei Jiang

    Abstract: A scalp-recording electroencephalography (EEG)-based brain-computer interface (BCI) system can greatly improve the quality of life for people who suffer from motor disabilities. Deep neural networks consisting of multiple convolutional, LSTM and fully-connected layers are created to decode EEG signals to maximize the human intention recognition accuracy. However, prior FPGA, ASIC, ReRAM and photon…

    Submitted 30 January, 2020; originally announced January 2020.

    Comments: 6 pages, 8 figures

  46. arXiv:2001.08581  [pdf]

    eess.SP cs.LG

    Cooperative Highway Work Zone Merge Control based on Reinforcement Learning in A Connected and Automated Environment

    Authors: Tianzhu Ren, Yuanchang Xie, Liming Jiang

    Abstract: Given the aging infrastructure and the anticipated growing number of highway work zones in the United States, it is important to investigate work zone merge control, which is critical for improving work zone safety and capacity. This paper proposes and evaluates a novel highway work zone merge control strategy based on cooperative driving behavior enabled by artificial intelligence. The proposed m…

    Submitted 21 January, 2020; originally announced January 2020.

    Comments: 17 pages, 6 figures, TRB 2020

  47. arXiv:2001.03257  [pdf]

    cs.CV eess.IV

    A Deep Neural Networks Approach for Pixel-Level Runway Pavement Crack Segmentation Using Drone-Captured Images

    Authors: Liming Jiang, Yuanchang Xie, Tianzhu Ren

    Abstract: Pavement conditions are a critical aspect of asset management and directly affect safety. This study introduces a deep neural network method called U-Net for pavement crack segmentation based on drone-captured images to reduce the cost and time needed for airport runway inspection. The proposed approach can also be used for highway pavement conditions assessment during off-peak periods when there…

    Submitted 9 January, 2020; originally announced January 2020.

    Comments: 13 pages, 5 figures

  48. arXiv:1912.13192  [pdf, other]

    cs.CV cs.LG eess.IV

    PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection

    Authors: Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li

    Abstract: We present a novel and high-performance 3D object detection framework, named PointVoxel-RCNN (PV-RCNN), for accurate 3D object detection from point clouds. Our proposed method deeply integrates both 3D voxel Convolutional Neural Network (CNN) and PointNet-based set abstraction to learn more discriminative point cloud features. It takes advantage of efficient learning and high-quality proposals of…

    Submitted 9 April, 2021; v1 submitted 31 December, 2019; originally announced December 2019.

    Comments: Accepted by CVPR 2020. arXiv admin note: substantial text overlap with arXiv:2102.00463

  49. arXiv:1912.09859  [pdf, ps, other]

    cs.LG cs.NI eess.SP stat.ML

    Lightweight and Unobtrusive Data Obfuscation at IoT Edge for Remote Inference

    Authors: Dixing Xu, Mengyao Zheng, Linshan Jiang, Chaojie Gu, Rui Tan, Peng Cheng

    Abstract: Executing deep neural networks for inference on the server-class or cloud backend based on data generated at the edge of Internet of Things is desirable due primarily to the limited compute power of edge devices and the need to protect the confidentiality of the inference neural networks. However, such a remote inference scheme incurs concerns regarding the privacy of the inference data transmitte…

    Submitted 25 March, 2020; v1 submitted 20 December, 2019; originally announced December 2019.

    Comments: This paper has been accepted by IEEE Internet of Things Journal, Special Issue on Artificial Intelligence Powered Edge Computing for Internet of Things

  50. arXiv:1912.04979  [pdf, other]

    eess.AS cs.CL cs.CV cs.SD eess.IV

    Advances in Online Audio-Visual Meeting Transcription

    Authors: Takuya Yoshioka, Igor Abramovski, Cem Aksoylar, Zhuo Chen, Moshe David, Dimitrios Dimitriadis, Yifan Gong, Ilya Gurvich, Xuedong Huang, Yan Huang, Aviv Hurvitz, Li Jiang, Sharon Koubi, Eyal Krupka, Ido Leichter, Changliang Liu, Partha Parthasarathy, Alon Vinnikov, Lingfeng Wu, Xiong Xiao, Wayne Xiong, Huaming Wang, Zhenghao Wang, Jun Zhang, Yong Zhao, et al. (1 additional author not shown)

    Abstract: This paper describes a system that generates speaker-annotated transcripts of meetings by using a microphone array and a 360-degree camera. The hallmark of the system is its ability to handle overlapped speech, which has been an unsolved problem in realistic settings for over a decade. We show that this problem can be addressed by using a continuous speech separation approach. In addition, we desc…

    Submitted 10 December, 2019; originally announced December 2019.

    Comments: To appear in Proc. IEEE ASRU Workshop 2019