-
Toward One-Second Latency: Evolution of Live Media Streaming
Authors:
Abdelhak Bentaleb,
May Lim,
Mehmet N. Akcay,
Ali C. Begen,
Sarra Hammoudi,
Roger Zimmermann
Abstract:
This survey presents the evolution of live media streaming and the technological developments behind today's IP-based low-latency live streaming systems. Live streaming primarily involves capturing, encoding, packaging and delivering real-time events such as live sports, live news, personal broadcasts and surveillance videos. Live streaming also involves concurrent streaming of linear TV programming off the satellite, cable, over-the-air or IPTV broadcast, where the programming is not necessarily a real-time event. The survey starts with a discussion of latency and the latency continuum in streaming applications. Then, it lays out the existing live streaming workflows and protocols, followed by an in-depth analysis of the latency sources in these workflows and protocols. The survey continues with the technology enablers, low-latency extensions for the popular HTTP adaptive streaming methods and enhancements for robust low-latency playback. An entire section is dedicated to a detailed summary of Twitch's grand challenge on low-latency live streaming and its findings. The survey concludes with a discussion of ongoing research problems in this space.
Submitted 4 October, 2023;
originally announced October 2023.
-
3D Segmentation of Humans in Point Clouds with Synthetic Data
Authors:
Ayça Takmaz,
Jonas Schult,
Irem Kaftan,
Mertcan Akçay,
Bastian Leibe,
Robert Sumner,
Francis Engelmann,
Siyu Tang
Abstract:
Segmenting humans in 3D indoor scenes has become increasingly important with the rise of human-centered robotics and AR/VR applications. To this end, we propose the task of joint 3D human semantic segmentation, instance segmentation and multi-human body-part segmentation. Few works have attempted to directly segment humans in cluttered 3D scenes, which is largely due to the lack of annotated training data of humans interacting with 3D scenes. We address this challenge and propose a framework for generating training data of synthetic humans interacting with real 3D scenes. Furthermore, we propose a novel transformer-based model, Human3D, which is the first end-to-end model for segmenting multiple human instances and their body-parts in a unified manner. The key advantage of our synthetic data generation framework is its ability to generate diverse and realistic human-scene interactions, with highly accurate ground truth. Our experiments show that pre-training on synthetic data improves performance on a wide variety of 3D human segmentation tasks. Finally, we demonstrate that Human3D outperforms even task-specific state-of-the-art 3D segmentation methods.
Submitted 18 August, 2023; v1 submitted 1 December, 2022;
originally announced December 2022.
-
Visual Modulation of Human Responses to Support Surface Translation
Authors:
Mustafa Emre Akçay,
Vittorio Lippi,
Thomas Mergner
Abstract:
Vision is known to improve human postural responses to external perturbations. This study investigates the role of vision in the responses to continuous pseudorandom support surface translations in the body sagittal plane in three visual conditions: with the eyes closed (EC), in stroboscopic illumination (EO/SI; only visual position information) and with eyes open in continuous illumination (EO/CI; position and velocity information), with the room as a static visual scene (or the interior of a moving cabin, in some of the trials). In the frequency spectrum of the translation stimulus, we distinguished, on the basis of the response patterns, between a low-frequency, mid-frequency, and high-frequency range (LFR: 0.0165-0.14 Hz; MFR: 0.15-0.57 Hz; HFR: 0.58-2.46 Hz). With EC, subjects' mean sway response gain was very low in the LFR. On average it increased with EO/SI (although not to a significant degree, p = 0.078) and more so with EO/CI (p < 10^-6). In contrast, the average gain in the MFR decreased from EC to EO/SI (although not to a significant degree, p = 0.548) and further to EO/CI (p = 0.0002). In the HFR, all three visual conditions produced similarly high gain levels. A single inverted pendulum (SIP) model controlling center of mass (COM) balancing about the ankle joints formally described the EC response as being strongly shaped by a resonance phenomenon arising primarily from the control's proprioceptive feedback loop. In these simulations, adding visual information reduced the resonance, as in the experiments. Extending the model to a double inverted pendulum (DIP) additionally suggested that trunk sway in the hip joints exerts a biomechanical damping effect on the resonance.
Submitted 5 March, 2021;
originally announced March 2021.