Showing 1–50 of 53 results for author: Zhu, F

Searching in archive eess.
  1. arXiv:2406.10361  [pdf, other]

    eess.IV

    On Efficient Neural Network Architectures for Image Compression

    Authors: Yichi Zhang, Zhihao Duan, Fengqing Zhu

    Abstract: Recent advances in learning-based image compression typically come at the cost of high complexity. Designing computationally efficient architectures remains an open challenge. In this paper, we empirically investigate the impact of different network designs in terms of rate-distortion performance and computational complexity. Our experiments involve testing various transforms, including convolutio…

    Submitted 14 June, 2024; originally announced June 2024.

    Comments: 2024 IEEE International Conference on Image Processing (ICIP2024)

  2. arXiv:2405.07291  [pdf, other]

    cs.IT eess.SP

    Robust Beamforming with Gradient-based Liquid Neural Network

    Authors: Xinquan Wang, Fenghao Zhu, Chongwen Huang, Ahmed Alhammadi, Faouzi Bader, Zhaoyang Zhang, Chau Yuen, Merouane Debbah

    Abstract: Millimeter-wave (mmWave) multiple-input multiple-output (MIMO) communication with the advanced beamforming technologies is a key enabler to meet the growing demands of future mobile communication. However, the dynamic nature of cellular channels in large-scale urban mmWave MIMO communication scenarios brings substantial challenges, particularly in terms of complexity and robustness. To address the…

    Submitted 17 May, 2024; v1 submitted 12 May, 2024; originally announced May 2024.

  3. arXiv:2405.00391  [pdf, ps, other]

    cs.IT eess.SP

    Beamforming Inferring by Conditional WGAN-GP for Holographic Antenna Arrays

    Authors: Fenghao Zhu, Xinquan Wang, Chongwen Huang, Ahmed Alhammadi, Hui Chen, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: The beamforming technology with large holographic antenna arrays is one of the key enablers for the next generation of wireless systems, which can significantly improve the spectral efficiency. However, the deployment of large antenna arrays implies high algorithm complexity and resource overhead at both receiver and transmitter ends. To address this issue, advanced technologies such as artificial…

    Submitted 15 May, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  4. arXiv:2405.00365  [pdf, other]

    cs.IT eess.SP

    Robust Continuous-Time Beam Tracking with Liquid Neural Network

    Authors: Fenghao Zhu, Xinquan Wang, Chongwen Huang, Richeng **, Qianqian Yang, Ahmed Alhammadi, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: Millimeter-wave (mmWave) technology is increasingly recognized as a pivotal technology of the sixth-generation communication networks due to the large amounts of available spectrum at high frequencies. However, the huge overhead associated with beam training imposes a significant challenge in mmWave communications, particularly in urban environments with high background noise. To reduce this high…

    Submitted 1 May, 2024; originally announced May 2024.

  5. arXiv:2404.15575  [pdf, other]

    astro-ph.IM eess.SY

    Jitter Characterization of the HyTI Satellite

    Authors: Chase Urasaki, Frances Zhu, Michael Bottom, Miguel Nunes, Aidan Walk

    Abstract: The Hyperspectral Thermal Imager (HyTI) is a technology demonstration mission that will obtain high spatial, spectral, and temporal resolution long-wave infrared images of Earth's surface from a 6U cubesat. HyTI science requires that the pointing accuracy of the optical axis shall not exceed 2.89 arcsec over the 0.5 ms integration time due to microvibration effects (known as jitter). Two sources o…

    Submitted 23 April, 2024; originally announced April 2024.

    Comments: Accepted for the 2024 IEEE Aerospace Conference Proceedings

  6. arXiv:2404.12257  [pdf, other]

    cs.CV cs.AI cs.LG cs.MM eess.IV

    Food Portion Estimation via 3D Object Scaling

    Authors: Gautham Vinod, Jiangpeng He, Zeman Shao, Fengqing Zhu

    Abstract: Image-based methods to analyze food images have alleviated the user burden and biases associated with traditional methods. However, accurate portion estimation remains a major challenge due to the loss of 3D information in the 2D representation of foods captured by smartphone cameras or wearable devices. In this paper, we propose a new framework to estimate both food volume and energy from 2D imag…

    Submitted 18 April, 2024; originally announced April 2024.

  7. arXiv:2404.07507  [pdf, other]

    eess.IV cs.CV

    Learning to Classify New Foods Incrementally Via Compressed Exemplars

    Authors: Justin Yang, Zhihao Duan, Jiangpeng He, Fengqing Zhu

    Abstract: Food image classification systems play a crucial role in health monitoring and diet tracking through image-based dietary assessment techniques. However, existing food recognition systems rely on static datasets characterized by a pre-defined fixed number of food classes. This contrasts drastically with the reality of food consumption, which features constantly changing data. Therefore, food image…

    Submitted 11 April, 2024; originally announced April 2024.

  8. Flexible Variable-Rate Image Feature Compression for Edge-Cloud Systems

    Authors: Md Adnan Faisal Hossain, Zhihao Duan, Yuning Huang, Fengqing Zhu

    Abstract: Feature compression is a promising direction for coding for machines. Existing methods have made substantial progress, but they require designing and training separate neural network models to meet different specifications of compression rate, performance accuracy and computational complexity. In this paper, a flexible variable-rate feature compression method is presented that can operate on a ran…

    Submitted 30 March, 2024; originally announced April 2024.

    Comments: 6 pages, 7 figures, 1 table, International Conference on Multimedia and Expo Workshops 2023

  9. arXiv:2403.18535  [pdf, other]

    eess.IV cs.LG

    Theoretical Bound-Guided Hierarchical VAE for Neural Image Codecs

    Authors: Yichi Zhang, Zhihao Duan, Yuning Huang, Fengqing Zhu

    Abstract: Recent studies reveal a significant theoretical link between variational autoencoders (VAEs) and rate-distortion theory, notably in utilizing VAEs to estimate the theoretical upper bound of the information rate-distortion function of images. Such estimated theoretical bounds substantially exceed the performance of existing neural image codecs (NICs). To narrow this gap, we propose a theoretical bo…

    Submitted 27 March, 2024; originally announced March 2024.

    Comments: 2024 IEEE International Conference on Multimedia and Expo (ICME2024)

  10. arXiv:2402.18862  [pdf, other]

    eess.IV

    Towards Backward-Compatible Continual Learning of Image Compression

    Authors: Zhihao Duan, Ming Lu, Justin Yang, Jiangpeng He, Zhan Ma, Fengqing Zhu

    Abstract: This paper explores the possibility of extending the capability of pre-trained neural image compressors (e.g., adapting to new data or target bitrates) without breaking backward compatibility, the ability to decode bitstreams encoded by the original model. We refer to this problem as continual learning of image compression. Our initial findings show that baseline solutions, such as end-to-end fine…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: Accepted to CVPR 2024

  11. arXiv:2402.10626  [pdf, other]

    cs.IT eess.SP

    Robust Beamforming for RIS-aided Communications: Gradient-based Manifold Meta Learning

    Authors: Fenghao Zhu, Xinquan Wang, Chongwen Huang, Zhaohui Yang, Xiaoming Chen, Ahmed Alhammadi, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: Reconfigurable intelligent surface (RIS) has become a promising technology to realize the programmable wireless environment via steering the incident signal in fully customizable ways. However, a major challenge in RIS-aided communication systems is the simultaneous design of the precoding matrix at the base station (BS) and the phase shifting matrix of the RIS elements. This is mainly attributed…

    Submitted 16 February, 2024; originally announced February 2024.

    Comments: journal

  12. arXiv:2402.02349  [pdf]

    eess.IV cs.CV

    Vision Transformer-based Multimodal Feature Fusion Network for Lymphoma Segmentation on PET/CT Images

    Authors: Huan Huang, Liheng Qiu, Shenmiao Yang, Longxi Li, Jiaofen Nan, Yanting Li, Chuang Han, Fubao Zhu, Chen Zhao, Weihua Zhou

    Abstract: Background: Diffuse large B-cell lymphoma (DLBCL) segmentation is a challenge in medical image analysis. Traditional segmentation methods for lymphoma struggle with the complex patterns and the presence of DLBCL lesions. Objective: We aim to develop an accurate method for lymphoma segmentation with 18F-Fluorodeoxyglucose positron emission tomography (PET) and computed tomography (CT) images. Metho…

    Submitted 4 February, 2024; originally announced February 2024.

    Comments: 14 pages, 6 figures; reference added

  13. arXiv:2401.11615  [pdf, other]

    eess.IV

    Another Way to the Top: Exploit Contextual Clustering in Learned Image Coding

    Authors: Yichi Zhang, Zhihao Duan, Ming Lu, Dandan Ding, Fengqing Zhu, Zhan Ma

    Abstract: While convolution and self-attention are extensively used in learned image compression (LIC) for transform coding, this paper proposes an alternative called Contextual Clustering based LIC (CLIC) which primarily relies on clustering operations and local attention for correlation characterization and compact representation of an image. As seen, CLIC expands the receptive field into the entire image…

    Submitted 21 January, 2024; originally announced January 2024.

    Comments: The 38th Annual AAAI Conference on Artificial Intelligence (AAAI 2024)

  14. arXiv:2312.07126  [pdf, other]

    eess.IV

    Deep Hierarchical Video Compression

    Authors: Ming Lu, Zhihao Duan, Fengqing Zhu, Zhan Ma

    Abstract: Recently, probabilistic predictive coding that directly models the conditional distribution of latent features across successive frames for temporal redundancy removal has yielded promising results. Existing methods using a single-scale Variational AutoEncoder (VAE) must devise complex networks for conditional probability estimation in latent space, neglecting multiscale characteristics of video f…

    Submitted 12 December, 2023; originally announced December 2023.

  15. arXiv:2311.06861  [pdf, other]

    cs.IT eess.SP

    Energy-efficient Beamforming for RISs-aided Communications: Gradient Based Meta Learning

    Authors: Xinquan Wang, Fenghao Zhu, Qianyun Zhou, Qihao Yu, Chongwen Huang, Ahmed Alhammadi, Zhaoyang Zhang, Chau Yuen, Mérouane Debbah

    Abstract: Reconfigurable intelligent surfaces (RISs) have become a promising technology to meet the requirements of energy efficiency and scalability in future sixth-generation (6G) communications. However, a significant challenge in RISs-aided communications is the joint optimization of active and passive beamforming at base stations (BSs) and RISs respectively. Specifically, the main difficulty is attribute…

    Submitted 16 February, 2024; v1 submitted 12 November, 2023; originally announced November 2023.

    Comments: 5 pages, 8 figures. Accepted in IEEE ICC 2024 (GCSN symposium)

  16. arXiv:2311.00567  [pdf]

    eess.IV cs.CV cs.LG physics.med-ph q-bio.QM

    A Robust Deep Learning Method with Uncertainty Estimation for the Pathological Classification of Renal Cell Carcinoma based on CT Images

    Authors: Ni Yao, Hang Hu, Kaicong Chen, Chen Zhao, Yuan Guo, Boya Li, Jiaofen Nan, Yanting Li, Chuang Han, Fubao Zhu, Weihua Zhou, Li Tian

    Abstract: Objectives: To develop and validate a deep learning-based diagnostic model incorporating uncertainty estimation so as to assist radiologists in the preoperative differentiation of the pathological subtypes of renal cell carcinoma (RCC) based on CT images. Methods: Data from 668 consecutive patients with pathologically proven RCC were retrospectively collected from Center 1. By using five-fold cross…

    Submitted 12 November, 2023; v1 submitted 1 November, 2023; originally announced November 2023.

    Comments: 16 pages, 6 figures

  17. arXiv:2309.05423  [pdf, other]

    eess.AS cs.AI cs.CL cs.SD

    Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP

    Authors: **zuomu Zhong, Yang Li, Hui Huang, Korin Richmond, Jie Liu, Zhiba Su, **g Guo, Benlai Tang, Fengjie Zhu

    Abstract: In expressive and controllable Text-to-Speech (TTS), explicit prosodic features significantly improve the naturalness and controllability of synthesised speech. However, manual prosody annotation is labor-intensive and inconsistent. To address this issue, a novel two-stage automatic annotation pipeline is proposed in this paper. In the first stage, we use contrastive pretraining of Speech-Silenc…

    Submitted 11 June, 2024; v1 submitted 11 September, 2023; originally announced September 2023.

  18. arXiv:2309.02574  [pdf, other]

    eess.IV

    An Improved Upper Bound on the Rate-Distortion Function of Images

    Authors: Zhihao Duan, Jack Ma, Jiangpeng He, Fengqing Zhu

    Abstract: Recent work has shown that Variational Autoencoders (VAEs) can be used to upper-bound the information rate-distortion (R-D) function of images, i.e., the fundamental limit of lossy image compression. In this paper, we report an improved upper bound on the R-D function of images implemented by (1) introducing a new VAE model architecture, (2) applying variable-rate compression techniques, and (3) p…

    Submitted 5 September, 2023; originally announced September 2023.

    Comments: Conference paper at ICIP 2023. The first two authors share equal contributions

  19. arXiv:2307.13241  [pdf, other]

    eess.IV

    A Visual Quality Assessment Method for Raster Images in Scanned Document

    Authors: Justin Yang, Peter Bauer, Todd Harris, Changhyung Lee, Hyeon Seok Seo, Jan P Allebach, Fengqing Zhu

    Abstract: Image quality assessment (IQA) is an active research area in the field of image processing. Most prior works focus on visual quality of natural images captured by cameras. In this paper, we explore visual quality of scanned documents, focusing on raster image areas. Different from many existing works which aim to estimate a visual quality score, we propose a machine learning based classification m…

    Submitted 25 July, 2023; originally announced July 2023.

  20. arXiv:2307.12263  [pdf, other]

    eess.SP

    Efficient Gaussian Process Classification-based Physical-Layer Authentication with Configurable Fingerprints for 6G-Enabled IoT

    Authors: Rui Meng, Fangzhou Zhu, Xiaodong Xu, Liang **, Bizhu Wang, Bingxuan Xu, Han Meng, ** Zhang

    Abstract: Physical-Layer Authentication (PLA) has recently been regarded as an endogenous-secure and energy-efficient technique to recognize IoT terminals. However, the major challenge of applying the state-of-the-art PLA schemes directly to 6G-enabled IoT is the inaccurate channel fingerprint estimation in low Signal-to-Noise Ratio (SNR) environments, which will greatly influence the reliability and robustnes…

    Submitted 23 July, 2023; originally announced July 2023.

    Comments: 12 pages, 9 figures

  21. arXiv:2306.17008  [pdf]

    eess.IV cs.CV

    MLA-BIN: Model-level Attention and Batch-instance Style Normalization for Domain Generalization of Federated Learning on Medical Image Segmentation

    Authors: Fubao Zhu, Yanhui Tian, Chuang Han, Yanting Li, Jiaofen Nan, Ni Yao, Weihua Zhou

    Abstract: The privacy protection mechanism of federated learning (FL) offers an effective solution for cross-center medical collaboration and data sharing. In multi-site medical image segmentation, each medical site serves as a client of FL, and its data naturally forms a domain. FL makes it possible to improve model performance on seen domains. However, there is a problem of domain generalizatio…

    Submitted 29 June, 2023; originally announced June 2023.

    Comments: 9 pages, 8 figures, 2 tables

  22. arXiv:2306.15212  [pdf, other]

    cs.SD cs.LG eess.AS

    TranssionADD: A multi-frame reinforcement based sequence tagging model for audio deepfake detection

    Authors: Jie Liu, Zhiba Su, Hui Huang, Caiyan Wan, Quanxiu Wang, Jiangli Hong, Benlai Tang, Fengjie Zhu

    Abstract: Thanks to recent advancements in end-to-end speech modeling technology, it has become increasingly feasible to imitate and clone a user's voice. This leads to a significant challenge in differentiating between authentic and fabricated audio segments. To address the issue of user voice abuse and misuse, the second Audio Deepfake Detection Challenge (ADD 2023) aims to detect and analyze deepfake spe…

    Submitted 27 June, 2023; originally announced June 2023.

  23. arXiv:2303.09046  [pdf, other]

    cs.CV eess.IV

    Self-Supervised Visual Representation Learning on Food Images

    Authors: Andrew Peng, Jiangpeng He, Fengqing Zhu

    Abstract: Food image analysis is the groundwork for image-based dietary assessment, which is the process of monitoring what kinds of food and how much energy is consumed using captured food or eating scene images. Existing deep learning-based methods learn the visual representation for downstream tasks based on human annotation of each food image. However, most food images in real life are obtained without…

    Submitted 15 March, 2023; originally announced March 2023.

    Comments: Presented and published in EI 2023 Conference Proceedings

  24. arXiv:2303.08156  [pdf, other]

    cs.CV eess.IV

    Nonlinear Hyperspectral Unmixing based on Multilinear Mixing Model using Convolutional Autoencoders

    Authors: Tingting Fang, Fei Zhu, Jie Chen

    Abstract: Unsupervised spectral unmixing consists of representing each observed pixel as a combination of several pure materials called endmembers with their corresponding abundance fractions. Beyond the linear assumption, various nonlinear unmixing models have been proposed, with the associated optimization problems solved either by traditional optimization algorithms or deep learning techniques. Current d…

    Submitted 14 March, 2023; originally announced March 2023.

  25. QARV: Quantization-Aware ResNet VAE for Lossy Image Compression

    Authors: Zhihao Duan, Ming Lu, Jack Ma, Yuning Huang, Zhan Ma, Fengqing Zhu

    Abstract: This paper addresses the problem of lossy image compression, a fundamental problem in image processing and information theory that is involved in many real-world applications. We start by reviewing the framework of variational autoencoders (VAEs), a powerful class of generative probabilistic models that has a deep connection to lossy compression. Based on VAEs, we develop a novel scheme for lossy…

    Submitted 1 December, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

    Comments: Full version (19 pages, includes appendix) of the paper accepted by IEEE TPAMI

  26. arXiv:2301.12340  [pdf]

    eess.IV cs.CV

    Incremental Value and Interpretability of Radiomics Features of Both Lung and Epicardial Adipose Tissue for Detecting the Severity of COVID-19 Infection

    Authors: Ni Yao, Yanhui Tian, Daniel Gama das Neves, Chen Zhao, Claudio Tinoco Mesquita, Wolney de Andrade Martins, Alair Augusto Sarmet Moreira Damas dos Santos, Yanting Li, Chuang Han, Fubao Zhu, Neng Dai, Weihua Zhou

    Abstract: Epicardial adipose tissue (EAT) is known for its pro-inflammatory properties and association with Coronavirus Disease 2019 (COVID-19) severity. However, current EAT segmentation methods do not consider positional information. Additionally, the detection of COVID-19 severity lacks consideration for EAT radiomics features, which limits interpretability. This study investigates the use of radiomics f…

    Submitted 6 December, 2023; v1 submitted 28 January, 2023; originally announced January 2023.

    Comments: 20 pages, 7 figures

  27. arXiv:2211.09897  [pdf, other]

    eess.IV

    Efficient Feature Compression for Edge-Cloud Systems

    Authors: Zhihao Duan, Fengqing Zhu

    Abstract: Optimizing computation in an edge-cloud system is an important yet challenging problem. In this paper, we consider a three-way trade-off between bit rate, classification accuracy, and encoding complexity in an edge-cloud image classification system. Our method includes a new training strategy and an efficient encoder architecture to improve the rate-accuracy performance. Our design can also be eas…

    Submitted 17 November, 2022; originally announced November 2022.

    Comments: Picture Coding Symposium (PCS) 2022

  28. arXiv:2210.05644  [pdf, other]

    eess.IV cs.CV physics.optics

    Simulating single-photon detector array sensors for depth imaging

    Authors: Stirling Scholes, Germán Mora-Martín, Feng Zhu, Istvan Gyongy, Phil Soan, Jonathan Leach

    Abstract: Single-Photon Avalanche Detector (SPAD) arrays are a rapidly emerging technology. These multi-pixel sensors have single-photon sensitivities and picosecond temporal resolutions; thus they can rapidly generate depth images with millimeter precision. Such sensors are a key enabling technology for future autonomous systems as they provide guidance and situational awareness. However, to fully exploit…

    Submitted 7 October, 2022; originally announced October 2022.

  29. Lossy Image Compression with Quantized Hierarchical VAEs

    Authors: Zhihao Duan, Ming Lu, Zhan Ma, Fengqing Zhu

    Abstract: Recent research has shown a strong theoretical connection between variational autoencoders (VAEs) and the rate-distortion theory. Motivated by this, we consider the problem of lossy image compression from the perspective of generative modeling. Starting with ResNet VAEs, which are originally designed for data (image) distribution modeling, we redesign their latent variable model using a quantizati…

    Submitted 25 March, 2023; v1 submitted 27 August, 2022; originally announced August 2022.

    Comments: WACV 2023 Best Algorithms Paper Award, revised version

  30. Automatic reorientation by deep learning to generate short axis SPECT myocardial perfusion images

    Authors: Fubao Zhu, Guojie Wang, Chen Zhao, Saurabh Malhotra, Min Zhao, Zhuo He, Jianzhou Shi, Zhixin Jiang, Weihua Zhou

    Abstract: Single photon emission computed tomography (SPECT) myocardial perfusion images (MPI) can be displayed both in traditional short-axis (SA) cardiac planes and polar maps for interpretation and quantification. It is essential to reorient the reconstructed transaxial SPECT MPI into standard SA slices. This study aims to develop a deep-learning-based approach for automatic reorientation of MPI. Met…

    Submitted 7 August, 2022; originally announced August 2022.

    Comments: 27 pages, 7 figures

  31. arXiv:2207.07195  [pdf]

    cs.LG cs.MA eess.SY

    COOR-PLT: A hierarchical control model for coordinating adaptive platoons of connected and autonomous vehicles at signal-free intersections based on deep reinforcement learning

    Authors: Duowei Li, Jian** Wu, Feng Zhu, Tianyi Chen, Yiik Diew Wong

    Abstract: Platooning and coordination are two implementation strategies that are frequently proposed for traffic control of connected and autonomous vehicles (CAVs) at signal-free intersections instead of using conventional traffic signals. However, few studies have attempted to integrate both strategies to better facilitate the CAV control at signal-free intersections. To this end, this study proposes a hi…

    Submitted 30 June, 2022; originally announced July 2022.

    Comments: This paper has been submitted to Transportation Research Part C: Emerging Technologies and is currently under review

    Journal ref: Transportation Research Part C: Emerging Technologies 146 (2023): 103933

  32. A new method incorporating deep learning with shape priors for left ventricular segmentation in myocardial perfusion SPECT images

    Authors: Fubao Zhu, **yu Zhao, Chen Zhao, Shaojie Tang, Jiaofen Nan, Yanting Li, Zhongqiang Zhao, Jianzhou Shi, Zenghong Chen, Zhixin Jiang, Weihua Zhou

    Abstract: Background: The assessment of left ventricular (LV) function by myocardial perfusion SPECT (MPS) relies on accurate myocardial segmentation. The purpose of this paper is to develop and validate a new method incorporating deep learning with shape priors to accurately extract the LV myocardium for automatic measurement of LV functional parameters. Methods: A segmentation architecture that integrates…

    Submitted 7 June, 2022; originally announced June 2022.

    Comments: 21 pages, 14 figures

  33. arXiv:2205.01805  [pdf, other]

    cs.CV cs.LG eess.IV

    Splicing Detection and Localization In Satellite Imagery Using Conditional GANs

    Authors: Emily R. Bartusiak, Sri Kalyan Yarlagadda, David Güera, Paolo Bestagini, Stefano Tubaro, Fengqing M. Zhu, Edward J. Delp

    Abstract: The widespread availability of image editing tools and improvements in image processing techniques allow image manipulation to be very easy. Oftentimes, easy-to-use yet sophisticated image manipulation tools yield distortions/changes imperceptible to the human observer. Distribution of forged images can have drastic ramifications, especially when coupled with the speed and vastness of the Interne…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted to the 2019 IEEE Conference on Multimedia Information Processing and Retrieval (MIPR)

    Journal ref: IEEE Conference on Multimedia Information Processing and Retrieval, pp. 91-96, March 2019, San Jose, CA

  34. arXiv:2202.13209  [pdf, other]

    eess.IV

    Opening the Black Box of Learned Image Coders

    Authors: Zhihao Duan, Ming Lu, Zhan Ma, Fengqing Zhu

    Abstract: End-to-end learned lossy image coders (LICs), as opposed to hand-crafted image codecs, have shown increasing superiority in terms of the rate-distortion performance. However, they are mainly treated as black-box systems and their interpretability is not well studied. In this paper, we show that LICs learn a set of basis functions to transform the input image for its compact representation in the laten…

    Submitted 14 October, 2022; v1 submitted 26 February, 2022; originally announced February 2022.

  35. arXiv:2110.06439  [pdf, other]

    cs.IT eess.SP

    Statistical CSI-Based Transmission Design for Reconfigurable Intelligent Surface-aided Massive MIMO Systems with Hardware Impairments

    Authors: Jianxin Dai, Feng Zhu, Cunhua Pan, Hong Ren, Kezhi Wang

    Abstract: We consider a reconfigurable intelligent surface (RIS)-aided massive multi-user multiple-input multiple-output (MIMO) communication system with transceiver hardware impairments (HWIs) and RIS phase noise. Different from the existing contributions, the phase shifts of the RIS are designed based on the long-term angle information. Firstly, an approximate analytical expression of the uplink achievab…

    Submitted 12 October, 2021; originally announced October 2021.

    Comments: Accepted by IEEE Wireless Communications Letters. Keywords: Reconfigurable Intelligent Surface, Intelligent Reflecting Surface, Massive MIMO, Channel estimation, etc

  36. arXiv:2109.02755  [pdf, other]

    eess.SP cs.LG

    Motion Artifact Reduction In Photoplethysmography For Reliable Signal Selection

    Authors: Runyu Mao, Mackenzie Tweardy, Stephan W. Wegerich, Craig J. Goergen, George R. Wodicka, Fengqing Zhu

    Abstract: Photoplethysmography (PPG) is a non-invasive and economical technique to extract vital signs of the human body. Although it has been widely used in consumer and research grade wrist devices to track a user's physiology, the PPG signal is very sensitive to motion which can corrupt the signal's quality. Existing Motion Artifact (MA) reduction techniques have been developed and evaluated using either…

    Submitted 6 September, 2021; originally announced September 2021.

  37. arXiv:2107.06185  [pdf]

    eess.SY

    A new method for vehicle system safety design based on data mining with uncertainty modeling

    Authors: ** Du, Binhui Jiang, Feng Zhu

    Abstract: In this research, a new data mining-based design approach has been developed for designing complex mechanical systems such as a crashworthy passenger car with uncertainty modeling. The method allows exploring the big crash simulation dataset to design the vehicle at multiple levels in a top-down manner (main energy absorbing system, components, and geometric features) and derive design rules based on…

    Submitted 12 July, 2021; originally announced July 2021.

    Comments: 38 pages, 21 figures, 6 tables

  38. arXiv:2105.08819  [pdf, other]

    eess.IV cs.CV cs.LG

    Fast and Accurate Quantized Camera Scene Detection on Smartphones, Mobile AI 2021 Challenge: Report

    Authors: Andrey Ignatov, Grigory Malivenko, Radu Timofte, Sheng Chen, Xin Xia, Zhaoyan Liu, Yuwei Zhang, Feng Zhu, Jiashi Li, Xuefeng Xiao, Yuan Tian, Xinglong Wu, Christos Kyrkou, Yixin Chen, Zexin Zhang, Yunbo Peng, Yue Lin, Saikat Dutta, Sourya Dipta Das, Nisarg A. Shah, Himanshu Kumar, Chao Ge, Pei-Lin Wu, **-Hua Du, Andrew Batutin, et al. (6 additional authors not shown)

    Abstract: Camera scene detection is among the most popular computer vision problems on smartphones. While many custom solutions were developed for this task by phone vendors, none of the designed models were available publicly up until now. To address this problem, we introduce the first Mobile AI challenge, where the target is to develop quantized deep learning-based camera scene classification solutions th…

    Submitted 17 May, 2021; originally announced May 2021.

    Comments: Mobile AI 2021 Workshop and Challenges: https://ai-benchmark.com/workshops/mai/2021/. arXiv admin note: substantial text overlap with arXiv:2105.08630; text overlap with arXiv:2105.07825, arXiv:2105.07809, arXiv:2105.08629

  39. arXiv:2102.05024  [pdf, other]

    eess.IV

    Turkey Behavior Identification System with a GUI Using Deep Learning and Video Analytics

    Authors: Shengtai Ju, Sneha Mahapatra, Marisa A. Erasmus, Amy R. Reibman, Fengqing Zhu

    Abstract: In this paper, we propose a video analytics system to identify the behavior of turkeys. Turkey behavior provides evidence to assess turkey welfare, which can be negatively impacted by uncomfortable ambient temperature and various diseases. In particular, healthy and sick turkeys behave differently in terms of the duration and frequency of activities such as eating, drinking, preening, and aggressi…

    Submitted 9 February, 2021; originally announced February 2021.

  40. arXiv:2101.06341  [pdf, other]

    eess.IV

    Advances In Video Compression System Using Deep Neural Network: A Review And Case Studies

    Authors: Dandan Ding, Zhan Ma, Di Chen, Qingshuang Chen, Zoe Liu, Fengqing Zhu

    Abstract: Significant advances in video compression systems have been made in the past several decades to satisfy the nearly exponential growth of Internet-scale video traffic. From the application perspective, we have identified three major functional blocks including pre-processing, coding, and post-processing, that have been continuously investigated to maximize the end-user quality of experience (QoE) un…

    Submitted 15 January, 2021; originally announced January 2021.

  41. arXiv:2012.00650  [pdf, other]

    cs.CV eess.IV eess.SP

    Decomposition, Compression, and Synthesis (DCS)-based Video Coding: A Neural Exploration via Resolution-Adaptive Learning

    Authors: Ming Lu, Tong Chen, Dandan Ding, Fengqing Zhu, Zhan Ma

    Abstract: Inspired by the facts that retinal cells actually segregate the visual scene into different attributes (e.g., spatial details, temporal motion) for respective neuronal processing, we propose to first decompose the input video into respective spatial texture frames (STF) at its native spatial resolution that preserve the rich spatial details, and the other temporal motion frames (TMF) at a lower sp…

    Submitted 15 January, 2024; v1 submitted 1 December, 2020; originally announced December 2020.

  42. arXiv:2008.05765  [pdf, other]

    eess.IV cs.CV

    Revisiting Temporal Modeling for Video Super-resolution

    Authors: Takashi Isobe, Fang Zhu, Xu Jia, Shengjin Wang

    Abstract: Video super-resolution plays an important role in surveillance video analysis and ultra-high-definition video display, which has drawn much attention in both the research and industrial communities. Although many deep learning-based VSR methods have been proposed, it is hard to directly compare these methods since the different loss functions and training datasets have a significant impact on the…

    Submitted 19 August, 2020; v1 submitted 13 August, 2020; originally announced August 2020.

    Comments: BMVC 2020

  43. arXiv:2007.02091  [pdf]

    physics.med-ph eess.IV

    Semantic Segmentation Using Deep Learning to Extract Total Extraocular Muscles and Optic Nerve from Orbital Computed Tomography Images

    Authors: Fubao Zhu, Zhengyuan Gao, Chen Zhao, Zelin Zhu, Yanyun Liu, Shaojie Tang, Chengzhi Jiang, Xinhui Li, Min Zhao, Weihua Zhou

    Abstract: Objectives: Precise segmentation of total extraocular muscles (EOM) and optic nerve (ON) is essential to assess anatomical development and progression of thyroid-associated ophthalmopathy (TAO). We aim to develop a semantic segmentation method based on deep learning to extract the total EOM and ON from orbital CT images in patients with suspected TAO. Materials and Methods: A total of 7,879 images…

    Submitted 4 July, 2020; originally announced July 2020.

    Comments: 17 pages, 8 figures

  44. arXiv:2006.11538  [pdf, other]

    cs.CV cs.LG eess.IV

    Pyramidal Convolution: Rethinking Convolutional Neural Networks for Visual Recognition

    Authors: Ionut Cosmin Duta, Li Liu, Fan Zhu, Ling Shao

    Abstract: This work introduces pyramidal convolution (PyConv), which is capable of processing the input at multiple filter scales. PyConv contains a pyramid of kernels, where each level involves different types of filters with varying size and depth, which are able to capture different levels of details in the scene. On top of these improved recognition capabilities, PyConv is also efficient and, with our f…

    Submitted 20 June, 2020; originally announced June 2020.

  45. arXiv:2004.12027  [pdf, other]

    cs.CV eess.IV

    Deepfakes Detection with Automatic Face Weighting

    Authors: Daniel Mas Montserrat, Hanxiang Hao, S. K. Yarlagadda, Sriram Baireddy, Ruiting Shao, János Horváth, Emily Bartusiak, Justin Yang, David Güera, Fengqing Zhu, Edward J. Delp

    Abstract: Altered and manipulated multimedia is increasingly present and widely distributed via social media platforms. Advanced video manipulation tools enable the generation of highly realistic-looking altered multimedia. While many methods have been presented to detect manipulations, most of them fail when evaluated with data outside of the datasets used in research environments. In order to address this…

    Submitted 4 May, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

  46. arXiv:2004.00583  [pdf, other]

    cs.CV cs.LG eess.IV

    Improving Deep Hyperspectral Image Classification Performance with Spectral Unmixing

    Authors: Alan J. X. Guo, Fei Zhu

    Abstract: Recent advances in neural networks have made great progress in the hyperspectral image (HSI) classification. However, the overfitting effect, which is mainly caused by complicated model structure and small training set, remains a major concern. Reducing the complexity of the neural networks could prevent overfitting to some extent, but also declines the networks' ability to express more abstract f…

    Submitted 21 December, 2020; v1 submitted 1 April, 2020; originally announced April 2020.

  47. arXiv:1908.02875  [pdf, ps, other]

    eess.IV

    Convolutional Neural Networks Based Texture Modeling For AV1

    Authors: Di Chen, Chichen Fu, Zoe Liu, Fengqing Zhu

    Abstract: Modern video codecs including the newly developed AOMedia Video 1 (AV1) utilize hybrid coding techniques to remove spatial and temporal redundancy. However, efficient exploitation of statistical dependencies measured by a mean squared error (MSE) does not always produce the best psychovisual result. One interesting approach is to only encode visually relevant information and use a different coding…

    Submitted 7 August, 2019; originally announced August 2019.

    Comments: 22 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:1804.09291

  48. Noisy-As-Clean: Learning Self-supervised Denoising from the Corrupted Image

    Authors: Jun Xu, Yuan Huang, Ming-Ming Cheng, Li Liu, Fan Zhu, Zhou Xu, Ling Shao

    Abstract: Supervised deep networks have achieved promising performance on image denoising, by learning image priors and noise statistics on plenty pairs of noisy and clean images. Unsupervised denoising networks are trained with only noisy images. However, for an unseen corrupted image, both supervised and unsupervised networks ignore either its particular image prior, the noise statistics, or both. That is, t…

    Submitted 9 May, 2020; v1 submitted 17 June, 2019; originally announced June 2019.

    Comments: 12 pages, 9 figures, 6 tables, the first two authors contribute equally

  49. arXiv:1812.04943  [pdf, other]

    eess.IV physics.optics

    Long-range depth imaging using a single-photon detector array and non-local data fusion

    Authors: Susan Chan, Abderrahim Halimi, Feng Zhu, Istvan Gyongy, Robert K. Henderson, Richard Bowman, Steve McLaughlin, Gerald S. Buller, Jonathan Leach

    Abstract: The ability to measure and record high-resolution depth images at long stand-off distances is important for a wide range of applications, including connected and automotive vehicles, defense and security, and agriculture and mining. In LIDAR (light detection and ranging) applications, single-photon sensitive detection is an emerging approach, offering high sensitivity to light and picosecond tempo…

    Submitted 11 December, 2018; originally announced December 2018.

  50. arXiv:1807.08048  [pdf, other]

    cs.RO cs.AI cs.LG eess.SY

    Baidu Apollo EM Motion Planner

    Authors: Haoyang Fan, Fan Zhu, Changchun Liu, Liangliang Zhang, Li Zhuang, Dong Li, Weicheng Zhu, Jiangtao Hu, Hongye Li, Qi Kong

    Abstract: In this manuscript, we introduce a real-time motion planning system based on the Baidu Apollo (open source) autonomous driving platform. The developed system aims to address the industrial level-4 motion planning problem while considering safety, comfort and scalability. The system covers multilane and single-lane autonomous driving in a hierarchical manner: (1) The top layer of the system is a mu…

    Submitted 20 July, 2018; originally announced July 2018.