Showing 1–11 of 11 results for author: Bao, W

Searching in archive eess.
  1. arXiv:2404.18058  [pdf, other]

    eess.IV cs.CV

    Joint Reference Frame Synthesis and Post Filter Enhancement for Versatile Video Coding

    Authors: Weijie Bao, Yuantong Zhang, Jianghao Jia, Zhenzhong Chen, Shan Liu

    Abstract: This paper presents the joint reference frame synthesis (RFS) and post-processing filter enhancement (PFE) for Versatile Video Coding (VVC), aiming to explore the combination of different neural network-based video coding (NNVC) tools to better utilize the hierarchical bi-directional coding structure of VVC. Both RFS and PFE utilize the Space-Time Enhancement Network (STENet), which receives two i…

    Submitted 27 April, 2024; originally announced April 2024.

  2. arXiv:2404.13748  [pdf, other]

    eess.SY math.NA

    Application of Kalman Filter in Stochastic Differential Equations

    Authors: Wencheng Bao, Shi Feng, Kaiwen Zhang

    Abstract: In areas such as finance, engineering, and science, we often face situations that change quickly and unpredictably. These situations are tough to handle and require special tools and methods capable of understanding and predicting what might happen next. Stochastic Differential Equations (SDEs) are renowned for modeling and analyzing real-world dynamical systems. However, obtaining the parameters,…

    Submitted 21 April, 2024; originally announced April 2024.

    Comments: 18 pages, 14 figures

  3. arXiv:2305.11094  [pdf, other]

    cs.HC cs.CV cs.MM cs.SD eess.AS

    QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation

    Authors: Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Haolin Zhuang

    Abstract: Speech-driven gesture generation is highly challenging due to the random jitters of human motion. In addition, there is an inherent asynchronous relationship between human speech and gestures. To tackle these challenges, we introduce a novel quantization-based and phase-guided motion-matching framework. Specifically, we first present a gesture VQ-VAE module to learn a codebook to summarize meaning…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: 15 pages, 12 figures, CVPR 2023 Highlight

  4. arXiv:2208.12133  [pdf, other]

    cs.HC cs.AI cs.MM cs.SD eess.AS

    The ReprGesture entry to the GENEA Challenge 2022

    Authors: Sicheng Yang, Zhiyong Wu, Minglei Li, Mengchen Zhao, Jiuxin Lin, Liyang Chen, Weihong Bao

    Abstract: This paper describes the ReprGesture entry to the Generation and Evaluation of Non-verbal Behaviour for Embodied Agents (GENEA) challenge 2022. The GENEA challenge provides the processed datasets and performs crowdsourced evaluations to compare the performance of different gesture generation systems. In this paper, we explore an automatic gesture generation system based on multimodal representatio…

    Submitted 25 August, 2022; originally announced August 2022.

    Comments: 8 pages, 4 figures, ICMI 2022

  5. arXiv:2206.04922  [pdf, other]

    cs.CL cs.SD eess.AS

    A Novel Chinese Dialect TTS Frontend with Non-Autoregressive Neural Machine Translation

    Authors: Junhui Zhang, Wudi Bao, Junjie Pan, Xiang Yin, Zejun Ma

    Abstract: Chinese dialects are different variations of Chinese and can be considered as different languages in the same language family with Mandarin. Though they all use Chinese characters, the pronunciations, grammar and idioms can vary significantly, and even local speakers may find it hard to input correct written forms of dialect. Besides, using Mandarin text as text-to-speech inputs would generate spe…

    Submitted 12 December, 2022; v1 submitted 10 June, 2022; originally announced June 2022.

    Comments: 4 pages, 5 figures

  6. arXiv:2103.09455  [pdf, other]

    cs.CV eess.IV

    Prediction-assistant Frame Super-Resolution for Video Streaming

    Authors: Wang Shen, Wenbo Bao, Guangtao Zhai, Charlie L Wang, Jerry W Hu, Zhiyong Gao

    Abstract: Video frame transmission delay is critical in real-time applications such as online video gaming, live show, etc. The receiving deadline of a new frame must catch up with the frame rendering time. Otherwise, the system will buffer a while, and the user will encounter a frozen screen, resulting in unsatisfactory user experiences. An effective approach is to transmit frames in lower-quality under po…

    Submitted 17 March, 2021; originally announced March 2021.

  7. arXiv:2103.08259  [pdf, other]

    eess.IV cs.CV cs.LG

    The QXS-SAROPT Dataset for Deep Learning in SAR-Optical Data Fusion

    Authors: Meiyu Huang, Yao Xu, Lixin Qian, Weili Shi, Yaqin Zhang, Wei Bao, Nan Wang, Xuejiao Liu, Xueshuang Xiang

    Abstract: Deep learning techniques have made an increasing impact on the field of remote sensing. However, deep neural networks based fusion of multimodal data from different remote sensors with heterogenous characteristics has not been fully explored, due to the lack of availability of big amounts of perfectly aligned multi-sensor image data with diverse scenes of high resolutions, especially for synthetic…

    Submitted 25 April, 2021; v1 submitted 15 March, 2021; originally announced March 2021.

  8. arXiv:1910.04572  [pdf]

    cs.RO eess.SY

    Design, Modelling and Validation of a Novel Extra Slender Continuum Robot for In-situ Inspection and Repair in Aeroengine

    Authors: Mingfeng Wang, Xin Dong, Weiming Ba, Abdelkhalick Mohammad, Dragos Axinte, Andy Norton

    Abstract: In-situ aeroengine maintenance works are highly beneficial as it can significantly reduce the current maintenance cycle which is extensive and costly due to the disassembly requirement of engines from aircrafts. However, navigating in/out via inspection ports and performing multi-axis movements with end-effectors in constrained environments (e.g. combustion chamber) are fairly challenging. A novel…

    Submitted 9 October, 2019; originally announced October 2019.

    Comments: 11 pages, 12 figures, journal

  9. arXiv:1909.13051  [pdf, other]

    eess.IV cs.CV

    A Dual Camera System for High Spatiotemporal Resolution Video Acquisition

    Authors: Ming Cheng, Zhan Ma, M. Salman Asif, Yiling Xu, Haojie Liu, Wenbo Bao, Jun Sun

    Abstract: This paper presents a dual camera system for high spatiotemporal resolution (HSTR) video acquisition, where one camera shoots a video with high spatial resolution and low frame rate (HSR-LFR) and another one captures a low spatial resolution and high frame rate (LSR-HFR) video. Our main goal is to combine videos from LSR-HFR and HSR-LFR cameras to create an HSTR video. We propose an end-to-end lea…

    Submitted 24 March, 2020; v1 submitted 28 September, 2019; originally announced September 2019.

    Comments: To appear in IEEE Transactions on Pattern Analysis and Machine Intelligence

  10. arXiv:1909.00548  [pdf, other]

    eess.IV cs.CV cs.LG

    Resource Optimized Neural Architecture Search for 3D Medical Image Segmentation

    Authors: Woong Bae, Seungho Lee, Yeha Lee, Beomhee Park, Minki Chung, Kyu-Hwan Jung

    Abstract: Neural Architecture Search (NAS), a framework which automates the task of designing neural networks, has recently been actively studied in the field of deep learning. However, there are only a few NAS methods suitable for 3D medical image segmentation. Medical 3D images are generally very large; thus it is difficult to apply previous NAS methods due to their GPU computational burden and long train…

    Submitted 2 September, 2019; originally announced September 2019.

    Comments: Accepted at MICCAI (International Conference on Medical Image Computing and Computer Assisted Intervention) 2019

  11. arXiv:1905.06103  [pdf]

    eess.SY

    Closed Loop Load Model Identification Using Small Disturbance Data

    Authors: Shangyuan Li, Li Feng, Deqiang Gan, Zhen Wang, Wei Bao, Hao Xu

    Abstract: Load model identification using small disturbance data is studied. It is proved that the individual load to be identified and the rest of the system forms a closed-loop system. Then, the impacts of disturbances entering the feedforward channel (internal disturbance) and feedback channel (external disturbance) on relationship between load inputs and outputs are examined analytically. It is found ou…

    Submitted 15 May, 2019; originally announced May 2019.

    Comments: 6 pages, 5 figures