Search | arXiv e-print repository

UniCompress: Enhancing Multi-Data Medical Image Compression with Knowledge Distillation

Authors: Runzhao Yang, Yinda Chen, Zhihong Zhang, Xiaoyu Liu, Zongren Li, Kunlun He, Zhiwei Xiong, Jinli Suo, Qionghai Dai

Abstract: In the field of medical image compression, Implicit Neural Representation (INR) networks have shown remarkable versatility due to their flexible compression ratios, yet they are constrained by a one-to-one fitting approach that results in lengthy encoding times. Our novel method, "UniCompress", extends the compression capabilities of INR by being the first to compress multiple medical data blocks using a single INR network. By employing wavelet transforms and quantization, we introduce a codebook containing frequency domain information as a prior input to the INR network. This enhances the representational power of INR and provides distinctive conditioning for different image blocks. Furthermore, our research introduces a new technique for the knowledge distillation of implicit representations, simplifying complex model knowledge into more manageable formats to improve compression ratios. Extensive testing on CT and electron microscopy (EM) datasets has demonstrated that UniCompress outperforms traditional INR methods and commercial compression solutions like HEVC, especially in complex and high compression scenarios. Notably, compared to existing INR techniques, UniCompress achieves a 4 to 5 times increase in compression speed, marking a significant advancement in the field of medical image compression. Codes will be publicly available.

Submitted 27 May, 2024; originally announced May 2024.

arXiv:2404.07551

Event-Enhanced Snapshot Compressive Videography at 10K FPS

Authors: Bo Zhang, Jinli Suo, Qionghai Dai

Abstract: Video snapshot compressive imaging (SCI) encodes the target dynamic scene compactly into a snapshot and reconstructs its high-speed frame sequence afterward, greatly reducing the required data footprint and transmission bandwidth as well as enabling high-speed imaging with a low frame rate intensity camera. In implementation, high-speed dynamics are encoded via temporally varying patterns, and only frames at corresponding temporal intervals can be reconstructed, while the dynamics occurring between consecutive frames are lost. To unlock the potential of conventional snapshot compressive videography, we propose a novel hybrid "intensity+event" imaging scheme by incorporating an event camera into a video SCI setup. Our proposed system consists of a dual-path optical setup to record the coded intensity measurement and intermediate event signals simultaneously, which is compact and photon-efficient because it collects the half of the photons discarded in conventional video SCI. Correspondingly, we developed a dual-branch Transformer utilizing the reciprocal relationship between the two data modes to decode dense video frames. Extensive experiments on both simulated and real-captured data demonstrate our superiority to state-of-the-art video SCI and video frame interpolation (VFI) methods. Benefiting from the new hybrid design leveraging both intrinsic redundancy in videos and the unique feature of event cameras, we achieve high-quality videography at 0.1 ms time intervals with a low-cost CMOS image sensor working at 24 FPS.

Submitted 11 April, 2024; originally announced April 2024.

arXiv:2401.15219

Harnessing Deep Learning of Point Clouds for Inverse Control of 3D Shape Morphing

Authors: Jue Wang, Dhirodaatto Sarkar, Jiaqi Suo, Alex Chortos

Abstract: Shape-morphing devices, a crucial branch in soft robotics, hold significant application value in areas like human-machine interfaces, biomimetic robotics, and tools for interacting with biological systems. To achieve three-dimensional (3D) programmable shape morphing (PSM), the deployment of array-based actuators is essential. However, a critical knowledge gap impeding the development of 3D PSM is the challenge of controlling the complex systems formed by these soft actuator arrays. This study introduces a novel approach that, for the first time, represents the configuration of shape morphing devices using point cloud data and employs deep learning to map these configurations to control inputs. We propose Shape Morphing Net (SMNet), a method that realizes the regression from point cloud data to high-dimensional continuous vectors. Applied to previous 2D PSM actuator arrays, SMNet significantly enhances control precision from 82.23% to 97.68%. Further, we extend its application to 3D PSM devices with three different actuator mechanisms, demonstrating the universal applicability of SMNet to the control of 3D shape morphing technologies. In our demonstrations, we confirm the efficacy of inverse control, where 3D PSM devices successfully replicate target shapes. These shapes are obtained either through 3D scanning of physical objects or via 3D modeling software. The results show that within the deformable range of 3D PSM devices, accurate reproduction of the desired shapes is achievable. The findings of this research represent a substantial advancement in soft robotics, particularly for applications demanding intricate 3D shape transformations, and establish a foundational framework for future developments in the field.

Submitted 26 January, 2024; originally announced January 2024.

arXiv:2312.00082

A Compact Implicit Neural Representation for Efficient Storage of Massive 4D Functional Magnetic Resonance Imaging

Authors: Ruoran Li, Runzhao Yang, Wenxin Xiang, Yuxiao Cheng, Tingxiong Xiao, Jinli Suo

Abstract: Functional Magnetic Resonance Imaging (fMRI) data is a widely used kind of four-dimensional biomedical data, which requires effective compression. However, fMRI compression poses unique challenges due to its intricate temporal dynamics, low signal-to-noise ratio, and complicated underlying redundancies. This paper reports a novel compression paradigm specifically tailored for fMRI data based on Implicit Neural Representation (INR). The proposed approach focuses on removing the various redundancies among the time series by employing several methods, including (i) conducting spatial correlation modeling for intra-region dynamics, (ii) decomposing reusable neuronal activation patterns, and (iii) using proper initialization together with nonlinear fusion to describe the inter-region similarity. This scheme appropriately incorporates the unique features of fMRI data, and experimental results on publicly available datasets demonstrate the effectiveness of the proposed method, surpassing state-of-the-art algorithms in both conventional image quality evaluation metrics and fMRI downstream tasks. This work paves the way for sharing massive fMRI data at low bandwidth and high fidelity.

Submitted 29 February, 2024; v1 submitted 30 November, 2023; originally announced December 2023.

arXiv:2311.13134

Lightweight High-Speed Photography Built on Coded Exposure and Implicit Neural Representation of Videos

Authors: Zhihong Zhang, Runzhao Yang, Jinli Suo, Yuxiao Cheng, Qionghai Dai

Abstract: Compact cameras that record high-speed scenes at high resolution are in high demand, but the required high bandwidth often leads to bulky, heavy systems, which limits their application on low-capacity platforms. Adopting a coded exposure setup to encode a frame sequence into a blurry snapshot and retrieve the latent sharp video afterward can serve as a lightweight solution. However, restoring motion from blur is quite challenging due to the high ill-posedness of motion blur decomposition, intrinsic ambiguity in motion direction, and diverse motions in natural videos. In this work, by leveraging the classical coded exposure imaging technique and the emerging implicit neural representation for videos, we tactfully embed the motion direction cues into the blurry image during the imaging process and develop a novel self-recursive neural network to sequentially retrieve the latent video sequence from the blurry image utilizing the embedded motion direction cues. To validate the effectiveness and efficiency of the proposed framework, we conduct extensive experiments on benchmark datasets and real-captured blurry images. The results demonstrate that our proposed framework significantly outperforms existing methods in quality and flexibility. The code for our work is available at https://github.com/zhihongz/BDINR

Submitted 21 November, 2023; originally announced November 2023.

Comments: 19 pages, 10 figures

arXiv:2308.04774

doi 10.1109/JIOT.2023.3301623

E$^3$-UAV: An Edge-based Energy-Efficient Object Detection System for Unmanned Aerial Vehicles

Authors: Jiashun Suo, Xingzhou Zhang, Weisong Shi, Wei Zhou

Abstract: Motivated by the advances in deep learning techniques, the application of Unmanned Aerial Vehicle (UAV)-based object detection has proliferated across a range of fields, including vehicle counting, fire detection, and city monitoring. While most existing research addresses only a subset of the challenges inherent to UAV-based object detection, few studies balance the various aspects to design a practical system for energy consumption reduction. In response, we present the E$^3$-UAV, an edge-based energy-efficient object detection system for UAVs. The system is designed to dynamically support various UAV devices, edge devices, and detection algorithms, with the aim of minimizing energy consumption by deciding the most energy-efficient flight parameters (including flight altitude, flight speed, detection algorithm, and sampling rate) required to fulfill the detection requirements of the task. We first present an effective evaluation metric for actual tasks and construct a transparent energy consumption model based on hundreds of actual flight data to formalize the relationship between energy consumption and flight parameters. Then we present a lightweight energy-efficient priority decision algorithm based on a large quantity of actual flight data to assist the system in deciding flight parameters. Finally, we evaluate the performance of the system, and our experimental results demonstrate that it can significantly decrease energy consumption in real-world scenarios. Additionally, we provide four insights that can assist researchers and engineers in their efforts to study UAV-based object detection further.

Submitted 2 December, 2023; v1 submitted 9 August, 2023; originally announced August 2023.

Comments: 16 pages, 8 figures

Journal ref: IEEE Internet of Things Journal, Early Access 1-1 (2023)

arXiv:2209.15180

SCI: A Spectrum Concentrated Implicit Neural Compression for Biomedical Data

Authors: Runzhao Yang, Tingxiong Xiao, Yuxiao Cheng, Qianni Cao, Jinyuan Qu, Jinli Suo, Qionghai Dai

Abstract: The massive collection and explosive growth of biomedical data demand effective compression for efficient storage, transmission and sharing. Readily available visual data compression techniques have been studied extensively but are tailored for natural images/videos, and thus show limited performance on biomedical data, which have different features and larger diversity. The emerging implicit neural representation (INR) is gaining momentum and demonstrates high promise for fitting diverse visual data in a target-data-specific manner, but a general compression scheme covering diverse biomedical data is so far absent. To address this issue, we first derive a mathematical explanation for INR's spectrum concentration property and an analytical insight on the design of INR based compressors. Further, we propose a Spectrum Concentrated Implicit neural compression (SCI) which adaptively partitions the complex biomedical data into blocks matching INR's concentrated spectrum envelope, and design a funnel shaped neural network capable of representing each block with a small number of parameters. Based on this design, we conduct compression via optimization under a given budget and allocate the available parameters with high representation accuracy. The experiments show SCI's superior performance to state-of-the-art methods including commercial compressors, data-driven ones, and INR based counterparts on diverse biomedical data. The source code can be found at https://github.com/RichealYoung/ImplicitNeuralCompression.git.

Submitted 23 November, 2022; v1 submitted 29 September, 2022; originally announced September 2022.

Comments: accepted to AAAI2023

ACM Class: I.4.2; I.2.10

arXiv:2207.08201

doi 10.1109/TIP.2023.3244417

INFWIDE: Image and Feature Space Wiener Deconvolution Network for Non-blind Image Deblurring in Low-Light Conditions

Authors: Zhihong Zhang, Yuxiao Cheng, Jinli Suo, Liheng Bian, Qionghai Dai

Abstract: In low-light environments, handheld photography suffers from severe camera shake under long exposure settings. Although existing deblurring algorithms have shown promising performance on well-exposed blurry images, they still cannot cope with low-light snapshots. Sophisticated noise and saturation regions are two dominating challenges in practical low-light deblurring. In this work, we propose a novel non-blind deblurring method dubbed image and feature space Wiener deconvolution network (INFWIDE) to tackle these problems systematically. In terms of algorithm design, INFWIDE proposes a two-branch architecture, which explicitly removes noise and hallucinates saturated regions in the image space and suppresses ringing artifacts in the feature space, and integrates the two complementary outputs with a subtle multi-scale fusion network for high quality night photograph deblurring. For effective network training, we design a set of loss functions integrating a forward imaging model and backward reconstruction to form a closed-loop regularization to secure good convergence of the deep neural network. Further, to optimize INFWIDE's applicability in real low-light conditions, a physical-process-based low-light noise model is employed to synthesize realistic noisy night photographs for model training. Taking advantage of the traditional Wiener deconvolution algorithm's physically driven characteristics and the representation ability of emerging deep neural networks, INFWIDE can recover fine details while suppressing the unpleasant artifacts during deblurring. Extensive experiments on synthetic data and real data demonstrate the superior performance of the proposed approach.

Submitted 17 February, 2023; v1 submitted 17 July, 2022; originally announced July 2022.

Comments: Accepted by IEEE Trans. Image Process, early access version available at https://ieeexplore.ieee.org/document/10047966
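
The INFWIDE abstract above builds on the traditional Wiener deconvolution algorithm. As a rough illustration of that classical step only (not the INFWIDE network, its feature-space branch, or its noise model), the following NumPy sketch applies a frequency-domain Wiener filter to a synthetically blurred image. The function names, the 3x3 box kernel, the circular-convolution blur, and the constant noise-to-signal ratio `nsr` are all assumptions made for this example.

```python
import numpy as np

def kernel_to_otf(kernel, shape):
    """Zero-pad a small blur kernel to `shape`, centre it at the origin, and FFT it."""
    padded = np.zeros(shape)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    # Shift so the kernel centre sits at index (0, 0); this avoids a spatial offset.
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)

def blur(image, kernel):
    """Circular convolution of the image with the kernel (hypothetical test blur)."""
    otf = kernel_to_otf(kernel, image.shape)
    return np.real(np.fft.ifft2(np.fft.fft2(image) * otf))

def wiener_deconvolve(blurred, kernel, nsr=1e-3):
    """Classical Wiener filter: conj(H) / (|H|^2 + nsr), applied in the Fourier domain."""
    otf = kernel_to_otf(kernel, blurred.shape)
    wiener_filter = np.conj(otf) / (np.abs(otf) ** 2 + nsr)
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) * wiener_filter))

rng = np.random.default_rng(0)
sharp = rng.random((32, 32))            # synthetic stand-in for a sharp image
kernel = np.full((3, 3), 1.0 / 9.0)     # assumed known (non-blind) blur kernel
blurred = blur(sharp, kernel)
restored = wiener_deconvolve(blurred, kernel)

print("MSE blurred :", np.mean((blurred - sharp) ** 2))
print("MSE restored:", np.mean((restored - sharp) ** 2))
```

The constant `nsr` trades ringing against sharpness: larger values suppress frequencies where the kernel response is weak, which is the kind of artifact the abstract says the learned branches are meant to handle.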

arXiv:2205.03238

Ultra-sensitive Flexible Sponge-Sensor Array for Muscle Activities Detection and Human Limb Motion Recognition

Authors: Jiao Suo, Yifan Liu, Clio Cheng, Keer Wang, Meng Chen, Ho-yin Chan, Roy Vellaisamy, Ning Xi, Vivian W. Q. Lou, Wen Jung Li

Abstract: Human limb motion tracking and recognition plays an important role in medical rehabilitation training, lower limb assistance, prosthetics design for amputees, feedback control for assistive robots, etc. Lightweight wearable sensors, including inertial sensors, surface electromyography sensors, and flexible strain/pressure sensors, are promising to become the next-generation human motion capture devices. Herein, we present a wireless wearable device consisting of a sixteen-channel flexible sponge-based pressure sensor array to recognize various human lower limb motions by detecting contours on the human skin caused by calf gastrocnemius muscle actions. Each sensing element is a round porous structure of thin carbon nanotube/polydimethylsiloxane nanocomposites with a diameter of 4 mm and thickness of about 400 μm. Ten human subjects were recruited to perform ten different lower limb motions while wearing the developed device. The motion classification result with the support vector machine method shows a macro-recall of about 97.3% for all ten motions tested. This work demonstrates a portable wearable muscle activity detection device with a lower limb motion recognition application, which can be potentially used in assistive robot control, healthcare, sports monitoring, etc.

Submitted 29 June, 2022; v1 submitted 30 April, 2022; originally announced May 2022.

Comments: 17 pages, 6 figures

arXiv:2204.04987

doi 10.1016/j.inffus.2023.01.013

A Dual Sensor Computational Camera for High Quality Dark Videography

Authors: Yuxiao Cheng, Runzhao Yang, Zhihong Zhang, Jinli Suo, Qionghai Dai

Abstract: Videos captured under low light conditions suffer from severe noise. A variety of efforts have been devoted to image/video noise suppression and have made great progress. However, in extremely dark scenarios, extensive photon starvation would hamper precise noise modeling. Instead, developing an imaging system collecting more photons is a more effective way for high-quality video capture under low illumination. In this paper, we propose to build a dual-sensor camera to additionally collect the photons in the near-infrared (NIR) wavelength, and make use of the correlation between the RGB and NIR spectra to perform high-quality reconstruction from noisy dark video pairs. In hardware, we build a compact dual-sensor camera capturing RGB and NIR videos simultaneously. Computationally, we propose a dual-channel multi-frame attention network (DCMAN) utilizing spatial-temporal-spectral priors to reconstruct the low-light RGB and NIR videos. In addition, we build a high-quality paired RGB and NIR video dataset, based on which the approach can be applied to different sensors easily by training the DCMAN model with simulated noisy input following a physical-process-based CMOS noise model. Experiments on both synthetic and real videos validate the performance of this compact dual-sensor camera design and the corresponding reconstruction algorithm in dark videography.

Submitted 11 April, 2022; originally announced April 2022.

Journal ref: Information Fusion Volume 93, May 2023, Pages 429-440

arXiv:2109.08880

doi 10.1109/JPROC.2023.3338272

Computational Imaging and Artificial Intelligence: The Next Revolution of Mobile Vision

Authors: Jinli Suo, Weihang Zhang, Jin Gong, Xin Yuan, David J. Brady, Qionghai Dai

Abstract: Signal capture stands at the forefront of perceiving and understanding the environment, and thus imaging plays a pivotal role in mobile vision. Recent explosive progress in Artificial Intelligence (AI) has shown great potential to develop advanced mobile platforms with new imaging devices. Traditional imaging systems based on the "capturing images first and processing afterwards" mechanism cannot meet this unprecedented demand. In contrast, Computational Imaging (CI) systems are designed to capture high-dimensional data in an encoded manner to provide more information for mobile vision systems. Thanks to AI, CI can now be used in real systems by integrating deep learning algorithms into the mobile vision platform to achieve the closed loop of intelligent acquisition, processing and decision making, thus leading to the next revolution of mobile vision. Starting from the history of mobile vision using digital cameras, this work first introduces the advances of CI in diverse applications and then conducts a comprehensive review of current research topics combining CI and AI. Motivated by the fact that most existing studies only loosely connect CI and AI (usually using AI to improve the performance of CI, with only limited works connecting them deeply), in this work, we propose a framework to deeply integrate CI and AI by using the example of self-driving vehicles with high-speed communication, edge computing and traffic planning. Finally, we look ahead to the future of CI plus AI by investigating new materials, brain science and new computing techniques to shed light on new directions of mobile vision systems.

Submitted 18 September, 2021; originally announced September 2021.

arXiv:2106.15765

doi 10.1364/PRJ.435256

10-mega pixel snapshot compressive imaging with a hybrid coded aperture

Authors: Zhihong Zhang, Chao Deng, Yang Liu, Xin Yuan, Jinli Suo, Qionghai Dai

Abstract: High resolution images are widely used in our daily life, whereas high-speed video capture is challenging due to the low frame rate of cameras working at the high resolution mode. Digging deeper, the main bottleneck lies in the low throughput of existing imaging systems. Towards this end, snapshot compressive imaging (SCI) was proposed as a promising solution to improve the throughput of imaging systems by compressive sampling and computational reconstruction. During acquisition, multiple high-speed images are encoded and collapsed to a single measurement. After this, algorithms are employed to retrieve the video frames from the coded snapshot. Recently developed Plug-and-Play (PnP) algorithms make it possible for SCI reconstruction in large-scale problems. However, the lack of high-resolution encoding systems still precludes SCI's wide application. In this paper, we build a novel hybrid coded aperture snapshot compressive imaging (HCA-SCI) system by incorporating a dynamic liquid crystal on silicon and a high-resolution lithography mask. We further implement a PnP reconstruction algorithm with cascaded denoisers for high quality reconstruction. Based on the proposed HCA-SCI system and algorithm, we achieve a 10-mega pixel SCI system to capture high-speed scenes, leading to a high throughput of 4.6G voxels per second. Both simulation and real data experiments verify the feasibility and performance of our proposed HCA-SCI scheme.

Submitted 15 August, 2021; v1 submitted 29 June, 2021; originally announced June 2021.

Comments: 11 pages, 8 figures, accepted by Photonics Research

arXiv:2104.03078

Universal and Flexible Optical Aberration Correction Using Deep-Prior Based Deconvolution

Authors: Xiu Li, Jinli Suo, Weihang Zhang, Xin Yuan, Qionghai Dai

Abstract: High quality imaging usually requires bulky and expensive lenses to compensate for geometric and chromatic aberrations. This poses high constraints on the optical hash or low cost applications. Although one can utilize algorithmic reconstruction to remove the artifacts of low-end lenses, the degeneration from optical aberrations is spatially varying and the computation has to trade off efficiency for performance. For example, we need to conduct patch-wise optimization or train a large set of local deep neural networks to achieve high reconstruction performance across the whole image. In this paper, we propose a PSF aware plug-and-play deep network, which takes the aberrant image and PSF map as input and produces the latent high quality version via incorporating lens-specific deep priors, thus leading to a universal and flexible optical aberration correction method. Specifically, we pre-train a base model from a set of diverse lenses and then adapt it to a given lens by quickly refining the parameters, which largely alleviates the time and memory consumption of model learning. The approach is of high efficiency in both training and testing stages. Extensive results verify the promising applications of our proposed approach for compact low-end cameras.

Submitted 18 August, 2021; v1 submitted 7 April, 2021; originally announced April 2021.

Comments: ICCV2021

arXiv:2101.04822

Plug-and-Play Algorithms for Video Snapshot Compressive Imaging

Authors: Xin Yuan, Yang Liu, Jinli Suo, Frédo Durand, Qionghai Dai

Abstract: We consider the reconstruction problem of video snapshot compressive imaging (SCI), which captures high-speed videos using a low-speed 2D sensor (detector). The underlying principle of SCI is to modulate sequential high-speed frames with different masks; these encoded frames are then integrated into a snapshot on the sensor, so the sensor can be low-speed. On one hand, video SCI enjoys the advantages of low-bandwidth, low-power and low-cost. On the other hand, applying SCI to large-scale problems (HD or UHD videos) in our daily life is still challenging and one of the bottlenecks lies in the reconstruction algorithm. Existing algorithms are either too slow (iterative optimization algorithms) or not flexible to the encoding process (deep learning based end-to-end networks). In this paper, we develop fast and flexible algorithms for SCI based on the plug-and-play (PnP) framework. In addition to the PnP-ADMM method, we further propose the PnP-GAP (generalized alternating projection) algorithm with a lower computational workload. We first employ the image deep denoising priors to show that PnP can recover a UHD color video with 30 frames from a snapshot measurement. Since videos have strong temporal correlation, by employing the video deep denoising priors, we achieve a significant improvement in the results. Furthermore, we extend the proposed PnP algorithms to the color SCI system using mosaic sensors, where each pixel only captures the red, green or blue channels. A joint reconstruction and demosaicing paradigm is developed for flexible and high quality reconstruction of color video SCI systems. Extensive results on both simulation and real datasets verify the superiority of our proposed algorithm.

Submitted 12 January, 2021; originally announced January 2021.

Comments: 18 pages, 12 figures and 4 tables. Journal extension of arXiv:2003.13654. Code available at https://github.com/liuyang12/PnP-SCI_python
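
The abstract above describes SCI as modulating high-speed frames with masks, integrating them into one snapshot, and reconstructing with a plug-and-play GAP iteration. The NumPy sketch below illustrates that loop under stated assumptions: random binary masks, synthetic frames, and a 3x3 mean filter standing in for the deep denoiser used in the paper. All function names are hypothetical and this is not the authors' released code.

```python
import numpy as np

def sci_forward(frames, masks):
    """Snapshot measurement: y = sum over t of masks[t] * frames[t]."""
    return np.sum(masks * frames, axis=0)

def sci_transpose(y, masks):
    """Adjoint of the forward model: spread the snapshot back over all frames."""
    return masks * y[None, :, :]

def box_denoise(frames):
    """Stand-in denoiser (3x3 mean filter per frame); a PnP method plugs in a learned one."""
    padded = np.pad(frames, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros_like(frames)
    for dy in range(3):
        for dx in range(3):
            out += padded[:, dy:dy + frames.shape[1], dx:dx + frames.shape[2]]
    return out / 9.0

def gap_projection(x, y, masks, eps=1e-8):
    """Project x onto {x : forward(x) = y}; Phi Phi^T is diagonal for SCI masks."""
    residual = y - sci_forward(x, masks)
    mask_energy = np.sum(masks ** 2, axis=0)  # diagonal of Phi Phi^T
    return x + sci_transpose(residual / (mask_energy + eps), masks)

def pnp_gap(y, masks, iterations=20):
    """Alternate the data-consistency projection with the plugged-in denoiser."""
    x = sci_transpose(y, masks)
    for _ in range(iterations):
        x = gap_projection(x, y, masks)
        x = box_denoise(x)
    return x

rng = np.random.default_rng(0)
frames = rng.random((8, 16, 16))                        # synthetic 8-frame clip
masks = (rng.random((8, 16, 16)) > 0.5).astype(float)   # assumed random binary masks
snapshot = sci_forward(frames, masks)
estimate = pnp_gap(snapshot, masks)
print(snapshot.shape, estimate.shape)
```

Because each mask acts pixel-wise, the matrix inverse in the projection step reduces to a per-pixel division by the summed mask energy. This keeps each iteration cheap, which is consistent with the abstract's claim of a lower computational workload.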

arXiv:2009.04185

Small-floating Target Detection in Sea Clutter via Visual Feature Classifying in the Time-Doppler Spectra

Authors: Yi Zhou, Yin Cui, Xiaoke Xu, Jidong Suo, Xiaoming Liu

Abstract: It is challenging for a surface radar to detect small floating objects in sea clutter. In this paper, we observe that the backscatter from the target breaks the continuity of the underlying motion of the sea surface in the time-Doppler spectra (TDS) images. Following this visual clue, we exploit the local binary pattern (LBP) to measure the variations of texture in the TDS images. It is shown that the radar returns containing a target and those having only clutter are separable in the feature space of LBP. An unsupervised one-class support vector machine (SVM) is then utilized to detect the deviation of the LBP histogram of the clutter. The outlier of the detector is classified as the target. On the real-life IPIX radar data sets, our visual feature based detector shows a favorable detection rate compared to three other existing approaches.

Submitted 9 September, 2020; originally announced September 2020.

arXiv:2003.13654 [pdf, other]

Plug-and-Play Algorithms for Large-scale Snapshot Compressive Imaging

Authors: Xin Yuan, Yang Liu, Jinli Suo, Qionghai Dai

Abstract: Snapshot compressive imaging (SCI) aims to capture high-dimensional (usually 3D) images using a 2D sensor (detector) in a single snapshot. Though enjoying the advantages of low bandwidth, low power and low cost, applying SCI to large-scale problems (HD or UHD videos) in our daily life is still challenging. The bottleneck lies in the reconstruction algorithms; they are either too slow (iterative optimization algorithms) or not flexible to the encoding process (deep learning based end-to-end networks). In this paper, we develop fast and flexible algorithms for SCI based on the plug-and-play (PnP) framework. In addition to the widely used PnP-ADMM method, we further propose the PnP-GAP (generalized alternating projection) algorithm with a lower computational workload and prove the convergence of PnP-GAP under the SCI hardware constraints. By employing deep denoising priors, we show for the first time that PnP can recover a UHD color video ($3840\times 1644\times 48$ with PSNR above 30dB) from a snapshot 2D measurement. Extensive results on both simulation and real datasets verify the superiority of our proposed algorithm. The code is available at https://github.com/liuyang12/PnP-SCI.

Submitted 17 July, 2020; v1 submitted 30 March, 2020; originally announced March 2020.

Comments: CVPR 2020. Corrected a proof of convergence in previous version

arXiv:1811.03455 [pdf, other]

High fidelity single-pixel imaging

Authors: Chao Deng, Xuemei Hu, Xiaoxu Li, Jinli Suo, Zhili Zhang, Qionghai Dai

Abstract: Single-pixel imaging (SPI) is an emerging technique which has attracted wide attention in various research fields. However, restricted by low reconstruction quality and the large number of measurements required, its practical application is still in its infancy. Inspired by the fact that natural scenes exhibit unique degenerate structures in a low dimensional subspace, we propose to take advantage of the local prior in convolutional sparse coding to implement high fidelity single-pixel imaging. Specifically, using a statistical learning strategy, the target scene can be sparsely represented on an overcomplete dictionary. The dictionary is composed of various bases learned from a natural image database. We introduce this local prior into the conventional SPI framework to improve the final reconstruction quality. Experiments on both synthetic data and real captured data demonstrate that our method achieves better reconstruction from the same measurements, and consequently reduces the number of measurements required for the same reconstruction quality.

Submitted 7 November, 2018; originally announced November 2018.

Comments: 5 pages, 6 figures

arXiv:1802.08805 [pdf, other]

Multispectral Focal Stack Acquisition Using A Chromatic Aberration Enlarged Camera

Authors: Qian Huang, Yunqian Li, Linsen Chen, Xiaoming Zhong, Jinli Suo, Zhan Ma, Tao Yue, Xun Cao

Abstract: Capturing more information, e.g. geometry and material, using optical cameras can greatly help the perception and understanding of complex scenes. This paper proposes a novel method to capture the spectral and light field information simultaneously. By using a delicately designed chromatic aberration enlarged camera, the spectral-varying slices at different depths of the scene can be easily captured. Afterwards, the multispectral focal stack, which is composed of a stack of multispectral slice images focusing on different depths, can be recovered from the spectral-varying slices by using a Local Linear Transformation (LLT) based algorithm. The experiments verify the effectiveness of the proposed method.

Submitted 24 February, 2018; originally announced February 2018.

Comments: Proceedings of IEEE international conference on image processing (ICIP)

Showing 1–18 of 18 results for author: Suo, J