
Showing 1–14 of 14 results for author: Zhuang, H

Searching in archive eess.
  1. arXiv:2403.05834  [pdf, other]

    cs.MM cs.SD eess.AS

    Enhancing Expressiveness in Dance Generation via Integrating Frequency and Music Style Information

    Authors: Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen Meng

    Abstract: Dance generation, as a branch of human motion generation, has attracted increasing attention. Recently, a few works attempt to enhance dance expressiveness, which includes genre matching, beat alignment, and dance dynamics, from certain aspects. However, the enhancement is quite limited as they lack comprehensive consideration of the aforementioned three factors. In this paper, we propose Expressi…

    Submitted 9 March, 2024; originally announced March 2024.

  2. arXiv:2311.04769  [pdf]

    eess.IV cs.CV

    An attention-based deep learning network for predicting Platinum resistance in ovarian cancer

    Authors: Haoming Zhuang, Beibei Li, Jingtong Ma, Patrice Monkam, Shouliang Qi, Wei Qian, Dianning He

    Abstract: Background: Ovarian cancer is among the three most frequent gynecologic cancers globally. High-grade serous ovarian cancer (HGSOC) is the most common and aggressive histological type. Guided treatment for HGSOC typically involves platinum-based combination chemotherapy, necessitating an assessment of whether the patient is platinum-resistant. The purpose of this study is to propose a deep learning…

    Submitted 8 November, 2023; originally announced November 2023.

  3. arXiv:2305.11094  [pdf, other]

    cs.HC cs.CV cs.MM cs.SD eess.AS

    QPGesture: Quantization-Based and Phase-Guided Motion Matching for Natural Speech-Driven Gesture Generation

    Authors: Sicheng Yang, Zhiyong Wu, Minglei Li, Zhensong Zhang, Lei Hao, Weihong Bao, Haolin Zhuang

    Abstract: Speech-driven gesture generation is highly challenging due to the random jitters of human motion. In addition, there is an inherent asynchronous relationship between human speech and gestures. To tackle these challenges, we introduce a novel quantization-based and phase-guided motion-matching framework. Specifically, we first present a gesture VQ-VAE module to learn a codebook to summarize meaning…

    Submitted 18 May, 2023; originally announced May 2023.

    Comments: 15 pages, 12 figures, CVPR 2023 Highlight

  4. arXiv:2304.12704  [pdf, other]

    cs.SD cs.MM eess.AS

    GTN-Bailando: Genre Consistent Long-Term 3D Dance Generation based on Pre-trained Genre Token Network

    Authors: Haolin Zhuang, Shun Lei, Long Xiao, Weiqin Li, Liyang Chen, Sicheng Yang, Zhiyong Wu, Shiyin Kang, Helen Meng

    Abstract: Music-driven 3D dance generation has become an intensive research topic in recent years with great potential for real-world applications. Most existing methods lack the consideration of genre, which results in genre inconsistency in the generated dance movements. In addition, the correlation between the dance genre and the music has not been investigated. To address these issues, we propose a genr…

    Submitted 25 April, 2023; originally announced April 2023.

    Comments: Accepted by ICASSP 2023. Demo page: https://im1eon.github.io/ICASSP23-GTNB-DG/

  5. arXiv:2304.08990  [pdf, other]

    eess.IV cs.CV

    A Comparison of Image Denoising Methods

    Authors: Zhaoming Kong, Fangxi Deng, Haomin Zhuang, Jun Yu, Lifang He, Xiaowei Yang

    Abstract: The advancement of imaging devices and countless images generated every day pose an increasingly high demand on image denoising, which still remains a challenging task in terms of both effectiveness and efficiency. To improve denoising quality, numerous denoising techniques and approaches have been proposed in the past decades, including different transforms, regularization terms, algebraic represe…

    Submitted 9 May, 2023; v1 submitted 18 April, 2023; originally announced April 2023.

    Comments: In this paper, we intend to collect and compare various denoising methods to investigate their effectiveness, efficiency, applicability and generalization ability with both synthetic and real-world experiments. arXiv admin note: substantial text overlap with arXiv:2011.03462

  6. AccEar: Accelerometer Acoustic Eavesdropping with Unconstrained Vocabulary

    Authors: Pengfei Hu, Hui Zhuang, Panneer Selvam Santhalingam, Riccardo Spolaor, Parth Pathak, Guoming Zhang, Xiuzhen Cheng

    Abstract: With the increasing popularity of voice-based applications, acoustic eavesdropping has become a serious threat to users' privacy. While on smartphones the access to microphones needs an explicit user permission, acoustic eavesdropping attacks can rely on motion sensors (such as accelerometer and gyroscope), which access is unrestricted. However, previous instances of such attacks can only recogniz…

    Submitted 2 December, 2022; originally announced December 2022.

    Comments: 2022 IEEE Symposium on Security and Privacy (SP)

    Journal ref: 2022 IEEE Symposium on Security and Privacy (SP)

  7. arXiv:2208.08757  [pdf, other]

    eess.AS cs.LG cs.SD

    Speech Representation Disentanglement with Adversarial Mutual Information Learning for One-shot Voice Conversion

    Authors: SiCheng Yang, Methawee Tantrawenith, Haolin Zhuang, Zhiyong Wu, Aolan Sun, Jianzong Wang, Ning Cheng, Huaizhen Tang, Xintao Zhao, Jie Wang, Helen Meng

    Abstract: One-shot voice conversion (VC) with only a single target speaker's speech for reference has become a hot research topic. Existing works generally disentangle timbre, while information about pitch, rhythm and content is still mixed together. To perform one-shot VC effectively with further disentangling these speech components, we employ random resampling for pitch and content encoder and use the va…

    Submitted 18 August, 2022; originally announced August 2022.

    Comments: 5 pages, 5 figures, INTERSPEECH 2022

  8. arXiv:2206.01599  [pdf, other]

    eess.SP

    A Deep-Learning Usability Expansion Model of Ocean Observations

    Authors: Ali Muhamed Ali, Hanqi Zhuang, Yu Huang, Ali K. Ibrahim, Ali Salem Altaher, Laurent Chérubin

    Abstract: Today's ocean numerical prediction skills depend on the availability of in-situ and remote ocean observations at the time of the predictions only. Because observations are scarce and discontinuous in time and space, numerical models are often unable to accurately model and predict real ocean dynamics, leading to a lack of fulfillment of a range of services that require reliable predictions at vari…

    Submitted 3 June, 2022; originally announced June 2022.

    Comments: 34 pages, 14 figures, one table

  9. arXiv:2008.01798  [pdf, other]

    cs.LG eess.IV eess.SP stat.ML

    Physics-informed Tensor-train ConvLSTM for Volumetric Velocity Forecasting of Loop Current

    Authors: Yu Huang, Yufei Tang, Hanqi Zhuang, James VanZwieten, Laurent Cherubin

    Abstract: According to the National Academies, a weekly forecast of velocity, vertical structure, and duration of the Loop Current (LC) and its eddies is critical for understanding the oceanography and ecosystem, and for mitigating outcomes of anthropogenic and natural disasters in the Gulf of Mexico (GoM). However, this forecast is a challenging problem since the LC behaviour is dominated by long-range spa…

    Submitted 18 December, 2021; v1 submitted 4 August, 2020; originally announced August 2020.

    Comments: 10 pages, 5 figures

    Journal ref: Front. Artif. Intell. 4(2021)197

  10. arXiv:2007.09478  [pdf]

    cs.CV cs.LG eess.IV

    Classification of Diabetic Retinopathy via Fundus Photography: Utilization of Deep Learning Approaches to Speed up Disease Detection

    Authors: Hangwei Zhuang, Nabil Ettehadi

    Abstract: In this paper, we propose two distinct solutions to the problem of Diabetic Retinopathy (DR) classification. In the first approach, we introduce a shallow neural network architecture. This model performs well on classification of the most frequent classes but fails at classifying the less frequent ones. In the second approach, we use transfer learning to re-train the last modified layer of a ver…

    Submitted 18 July, 2020; originally announced July 2020.

    Comments: 6 pages, 9 figures

    ACM Class: I.4.6; I.4.9

  11. arXiv:2006.10159  [pdf, other]

    physics.ins-det cs.LG eess.IV eess.SP hep-ex

    Automatic heterogeneous quantization of deep neural networks for low-latency inference on the edge for particle detectors

    Authors: Claudionor N. Coelho Jr., Aki Kuusela, Shan Li, Hao Zhuang, Thea Aarrestad, Vladimir Loncar, Jennifer Ngadiuba, Maurizio Pierini, Adrian Alan Pol, Sioni Summers

    Abstract: Although the quest for more accurate solutions is pushing deep learning research towards larger and more complex algorithms, edge devices demand efficient inference and therefore reduction in model size, latency and energy consumption. One technique to limit model size is quantization, which implies using fewer bits to represent weights and biases. Such an approach usually results in a decline in…

    Submitted 21 June, 2021; v1 submitted 15 June, 2020; originally announced June 2020.

    Journal ref: Nature Machine Intelligence, Volume 3 (2021)

  12. arXiv:2005.08356  [pdf]

    eess.AS cs.SD

    North Atlantic Right Whales Up-call Detection Using Multimodel Deep Learning

    Authors: Ali K Ibrahim, Hanqi Zhuang, Laurent M. Chérubin, Nurgun Erdol, Gregory O Corry-Crowe, Ali Muhamed Ali

    Abstract: A new method for North Atlantic Right Whales (NARW) up-call detection using Multimodel Deep Learning (MMDL) is presented in this paper. In this approach, signals from passive acoustic sensors are first converted to spectrogram and scalogram images, which are time-frequency representations of the signals. These images are in turn used to train an MMDL detector, consisting of Convolutional Neural N…

    Submitted 17 May, 2020; originally announced May 2020.

  13. arXiv:2001.00170  [pdf]

    eess.IV cs.CV

    Residual Block-based Multi-Label Classification and Localization Network with Integral Regression for Vertebrae Labeling

    Authors: Chunli Qin, Demin Yao, Han Zhuang, Hui Wang, Yonghong Shi, Zhijian Song

    Abstract: Accurate identification and localization of the vertebrae in CT scans is a critical and standard preprocessing step for clinical spinal diagnosis and treatment. Existing methods are mainly based on the integration of multiple neural networks, and most of them use the Gaussian heat map to locate the vertebrae's centroid. However, the process of obtaining the vertebrae's centroid coordinates using h…

    Submitted 1 January, 2020; originally announced January 2020.

    Comments: 10 pages with 9 figures

  14. arXiv:0908.1273  [pdf, other]

    math.OC cs.NI eess.SY

    A General Class of Throughput Optimal Routing Policies in Multi-hop Wireless Networks

    Authors: Mohammad Naghshvar, Hairuo Zhuang, Tara Javidi

    Abstract: This paper considers the problem of throughput optimal routing/scheduling in a multi-hop constrained queueing network with random connectivity whose special case includes opportunistic multi-hop wireless networks and input-queued switch fabrics. The main challenge in the design of throughput optimal routing policies is closely related to identifying appropriate and universal Lyapunov functions wit…

    Submitted 10 March, 2011; v1 submitted 10 August, 2009; originally announced August 2009.

    Comments: 31 pages (one column), 8 figures, (revision submitted to IEEE Transactions on Information Theory)

    MSC Class: 34D20