Showing 1–8 of 8 results for author: Gan, Z

Searching in archive eess.
  1. arXiv:2406.17225

    eess.IV cs.CV

    Multimodal Cross-Task Interaction for Survival Analysis in Whole Slide Pathological Images

    Authors: Songhan Jiang, Zhengyu Gan, Linghan Cai, Yifeng Wang, Yongbing Zhang

    Abstract: Survival prediction, utilizing pathological images and genomic profiles, is increasingly important in cancer analysis and prognosis. Despite significant progress, precise survival analysis still faces two main challenges: (1) The massive pixels contained in whole slide images (WSIs) complicate the process of pathological images, making it difficult to generate an effective representation of the tu…

    Submitted 24 June, 2024; originally announced June 2024.

  2. arXiv:2403.10723

    eess.SY

    Leveraging Symmetries in Gaits for Reinforcement Learning: A Case Study on Quadrupedal Gaits

    Authors: Jiayu Ding, Xulin Chen, Garret E. Katz, Zhenyu Gan

    Abstract: In this research, we address the complex task of developing versatile and agile quadrupedal gaits for robotic platforms, a domain predominantly governed by model-based trajectory optimization methods. We propose an innovative, reference-free reinforcement learning framework that exploits the intrinsic symmetries of dynamic systems to synthesize a broad array of naturalistic quadrupedal locomotion…

    Submitted 14 June, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

  3. arXiv:2306.04579

    eess.IV cs.CV

    A Dataset for Deep Learning-based Bone Structure Analyses in Total Hip Arthroplasty

    Authors: Kaidong Zhang, Ziyang Gan, Dong Liu, Xifu Shang

    Abstract: Total hip arthroplasty (THA) is a widely used surgical procedure in orthopedics. For THA, it is of clinical significance to analyze the bone structure from the CT images, especially to observe the structure of the acetabulum and femoral head, before the surgical procedure. For such bone structure analyses, deep learning technologies are promising but require high-quality labeled data for the learn…

    Submitted 7 June, 2023; originally announced June 2023.

    Comments: 16 pages, 17 figures

  4. arXiv:2305.07223

    cs.SD cs.CV cs.MM eess.AS

    Transavs: End-To-End Audio-Visual Segmentation With Transformer

    Authors: Yuhang Ling, Yuxi Li, Zhenye Gan, Jiangning Zhang, Mingmin Chi, Yabiao Wang

    Abstract: Audio-Visual Segmentation (AVS) is a challenging task, which aims to segment sounding objects in video frames by exploring audio signals. Generally AVS faces two key challenges: (1) Audio signals inherently exhibit a high degree of information density, as sounds produced by multiple objects are entangled within the same audio stream; (2) Objects of the same category tend to produce similar audio s…

    Submitted 26 December, 2023; v1 submitted 11 May, 2023; originally announced May 2023.

    Comments: 4 pages, 3 figures

  5. Breaking Symmetries Leads to Diverse Quadrupedal Gaits

    Authors: Jiayu Ding, Zhenyu Gan

    Abstract: Symmetry manifests itself in legged locomotion in a variety of ways. No matter where a legged system begins to move periodically, the torso and limbs coordinate with each other's movements in a similar manner. Also, in many gaits observed in nature, the legs on both sides of the torso move in exactly the same way, sometimes they are just half a period out of phase. Furthermore, when some animals m…

    Submitted 8 April, 2024; v1 submitted 8 March, 2023; originally announced March 2023.

    Comments: Please refer to the published version to cite this paper

    Journal ref: IEEE Robotics and Automation Letters, Institute of Electrical and Electronics Engineers (IEEE), 2024

  6. arXiv:2103.09420

    eess.AS cs.SD

    Improving Zero-shot Voice Style Transfer via Disentangled Representation Learning

    Authors: Siyang Yuan, Pengyu Cheng, Ruiyi Zhang, Weituo Hao, Zhe Gan, Lawrence Carin

    Abstract: Voice style transfer, also called voice conversion, seeks to modify one speaker's voice to generate speech as if it came from another (target) speaker. Previous works have made progress on voice conversion with parallel training data and pre-known speakers. However, zero-shot voice style transfer, which learns from non-parallel data and generates voices for previously unseen speakers, remains a ch…

    Submitted 16 March, 2021; originally announced March 2021.

    Comments: To appear in ICLR 2021

  7. arXiv:2006.03315

    cs.CV cs.LG eess.IV

    Multi-modal Feature Fusion with Feature Attention for VATEX Captioning Challenge 2020

    Authors: Ke Lin, Zhuoxin Gan, Liwei Wang

    Abstract: This report describes our model for VATEX Captioning Challenge 2020. First, to gather information from multiple domains, we extract motion, appearance, semantic and audio features. Then we design a feature attention module to attend on different feature when decoding. We apply two types of decoders, top-down and X-LAN and ensemble these models to get the final result. The proposed method outperfor…

    Submitted 5 June, 2020; originally announced June 2020.

  8. Synthesis of model predictive control based on data-driven learning

    Authors: Yuanqiang Zhou, Dewei Li, Yugeng Xi, Zhongxue Gan

    Abstract: For the application of MPC design in on-line regulation or tracking control problems, several studies have attempted to develop an accurate model, and realize adequate uncertainty description of linear or non-linear plants of the processes. In this study, we employ the data-driven learning technique to iteratively approximate the dynamical parameters, without requiring a priori knowledge of system…

    Submitted 29 March, 2019; originally announced April 2019.

    Comments: 4 pages

    Journal ref: SCIENCE CHINA Information Sciences, 2019