Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment

Authors: Li Siyao, Tianpei Gu, Zhitao Yang, Zhengyu Lin, Ziwei Liu, Henghui Ding, Lei Yang, Chen Change Loy

Abstract: We introduce a novel task within the field of 3D dance generation, termed dance accompaniment, which necessitates the generation of responsive movements from a dance partner, the "follower", synchronized with the lead dancer's movements and the underlying musical rhythm. Unlike existing solo or group dance generation tasks, a duet dance scenario entails a heightened degree of interaction between the two participants, requiring delicate coordination in both pose and position. To support this task, we first build a large-scale and diverse duet interactive dance dataset, DD100, by recording about 117 minutes of professional dancers' performances. To address the challenges inherent in this task, we propose a GPT-based model, Duolando, which autoregressively predicts the subsequent tokenized motion conditioned on the coordinated information of the music, the leader's and the follower's movements. To further enhance the GPT's capabilities of generating stable results on unseen conditions (music and leader motions), we devise an off-policy reinforcement learning strategy that allows the model to explore viable trajectories from out-of-distribution samplings, guided by human-defined rewards. Based on the collected dataset and proposed method, we establish a benchmark with several carefully designed metrics.

Submitted 27 March, 2024; originally announced March 2024.

Comments: ICLR 2024

arXiv:2403.05793 [pdf, ps, other]

Performance Bounds for Passive Sensing in Asynchronous ISAC Systems -- Appendices

Authors: **gbo Zhao, Zhaoming Lu, J. Andrew Zhang, Weicai Li, Yifeng Xiong, Zijun Han, Xiangming Wen, Tao Gu

Abstract: This document contains the appendices for our paper titled "Performance Bounds for Passive Sensing in Asynchronous ISAC Systems." The appendices include rigorous derivations of key formulas, detailed proofs of the theorems and propositions introduced in the paper, and details of the algorithm tested in the numerical simulation for validation. These appendices aim to support and elaborate on the findings and methodologies presented in the main text. All external references to equations, theorems, and so forth, are directed towards the corresponding elements within the main paper.

Submitted 29 March, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

Comments: 5 pages

arXiv:2402.09217 [pdf, other]

Inferentialist Resource Semantics

Authors: Alexander V. Gheorghiu, Tao Gu, David J. Pym

Abstract: In systems modelling, a system typically comprises located resources relative to which processes execute. One important use of logic in informatics is in modelling such systems for the purpose of reasoning (perhaps automated) about their behaviour and properties. To this end, one requires an interpretation of logical formulae in terms of the resources and states of the system; such an interpretation is called a resource semantics of the logic. This paper shows how inferentialism -- the view that meaning is given in terms of inferential behaviour -- enables a versatile and expressive framework for resource semantics. Specifically, it shows how inferentialism seamlessly incorporates the assertion-based approach of the logic of Bunched Implications, foundational in program verification (e.g., as the basis of Separation Logic), and the renowned number-of-uses reading of Linear Logic. This integration enables reasoning about shared and separated resources in intuitive and familiar ways, as well as about the composition and interfacing of system components.

Submitted 12 April, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

arXiv:2203.13055 [pdf, other]

Bailando: 3D Dance Generation by Actor-Critic GPT with Choreographic Memory

Authors: Li Siyao, Weijiang Yu, Tianpei Gu, Chunze Lin, Quan Wang, Chen Qian, Chen Change Loy, Ziwei Liu

Abstract: Driving 3D characters to dance following a piece of music is highly challenging due to the spatial constraints applied to poses by choreography norms. In addition, the generated dance sequence also needs to maintain temporal coherency with different music genres. To tackle these challenges, we propose a novel music-to-dance framework, Bailando, with two powerful components: 1) a choreographic memory that learns to summarize meaningful dancing units from 3D pose sequence to a quantized codebook, 2) an actor-critic Generative Pre-trained Transformer (GPT) that composes these units to a fluent dance coherent to the music. With the learned choreographic memory, dance generation is realized on the quantized units that meet high choreography standards, such that the generated dancing sequences are confined within the spatial constraints. To achieve synchronized alignment between diverse motion tempos and music beats, we introduce an actor-critic-based reinforcement learning scheme to the GPT with a newly-designed beat-align reward function. Extensive experiments on the standard benchmark demonstrate that our proposed framework achieves state-of-the-art performance both qualitatively and quantitatively. Notably, the learned choreographic memory is shown to discover human-interpretable dancing-style poses in an unsupervised manner.

Submitted 24 March, 2022; v1 submitted 24 March, 2022; originally announced March 2022.

Comments: Accepted by CVPR 2022. Code and video link: https://github.com/lisiyao21/Bailando/

arXiv:2003.07719 [pdf]

Toward a Wearable RFID System for Real-Time Activity Recognition Using Radio Patterns

Authors: Liang Wang, Tao Gu, ** Tao, Jian Lu

Abstract: Elderly care is one of the many applications supported by real-time activity recognition systems. Traditional approaches use cameras, body sensor networks, or radio patterns from various sources for activity recognition. However, these approaches are limited by ease-of-use, coverage, or privacy-preservation issues. In this paper, we present a novel wearable Radio Frequency Identification (RFID) system that aims to provide an easy-to-use solution with high detection coverage. Our system uses passive tags, which are maintenance-free and can be embedded into clothes to reduce the wearing and maintenance effort. A small RFID reader is also worn on the user's body to extend the detection coverage as the user moves. We exploit RFID radio patterns and extract both spatial and temporal features to characterize various activities. We also address the issues of false negatives in tag readings and tag/antenna calibration, and design a fast online recognition system. Antenna and tag selection is done automatically to explore the minimum number of devices required to achieve target accuracy. We develop a prototype system, which consists of a wearable RFID system and a smartphone, to demonstrate the working principles, and conduct experimental studies with four subjects over two weeks. The results show that our system achieves a high recognition accuracy of 93.6 percent with a latency of 5 seconds. Additionally, we show that the system only requires two antennas and four tagged body parts to achieve a high recognition accuracy of 85 percent.

Submitted 8 March, 2020; originally announced March 2020.

arXiv:1802.09705 [pdf, other]

doi 10.1109/TMC.2018.2880442

Enabling Multiple Access for Non-Line-of-Sight Light-to-Camera Communications

Authors: Fan Yang, Shining Li, Zhe Yang, Tao Gu, Cheng Qian

Abstract: Light-to-Camera Communications (LCC) have emerged as a new wireless communication technology with great potential to benefit a broad range of applications. However, existing LCC systems either require cameras to directly face the lights or can only communicate over a single link, resulting in low throughput and fragility to ambient illuminant interference. We present HYCACO, a novel LCC system, which enables multiple light emitting diodes (LEDs) to communicate with an unaltered camera via non-line-of-sight (NLoS) links. Different from other NLoS LCC systems, the proposed scheme is resilient to the complex indoor luminous environment. HYCACO can decode the messages by exploring the mixed reflected optical signals transmitted from multiple LEDs. By further exploiting the rolling shutter mechanism, we present the optimal optical frequencies and camera exposure duration selection strategy to achieve the best performance. We built a hardware prototype to demonstrate the efficiency of the proposed scheme under different application scenarios. The experimental results show that the system throughput reaches 4.5 kbps on iPhone 6s with three transmitters. With its robustness, improved system throughput, and ease of use, HYCACO has great potential to be used in a wide range of applications such as advertising, tagging objects, and device certification.

Submitted 1 August, 2018; v1 submitted 26 February, 2018; originally announced February 2018.

Comments: 12 pages, 13 figures

Showing 1–6 of 6 results for author: Gu, T