-
MPEG Media Enablers For Richer XR Experiences
Authors:
Emmanuel Thomas,
Emmanouil Potetsianakis,
Thomas Stockhammer,
Imed Bouazizi,
Mary-Luc Champel
Abstract:
With the advent of immersive media applications, the requirements for representing and consuming such content have increased dramatically. The ever-increasing size of media assets, combined with the stringent motion-to-photon latency requirement, makes a high quality of experience for XR streaming services difficult to achieve. The MPEG-I standards aim at facilitating the wide deployment of immersive applications. This paper describes two parts of MPEG-I: part 13, Video Decoding Interface, and part 14, Scene Description for MPEG Media, which address decoder management and virtual scene composition, respectively. These new parts are intended to hide complex media rendering operations and hardware resource management from the application, thereby lowering the barrier for XR applications to become mainstream and accessible to XR experience developers and designers. Both parts are expected to be published by ISO at the end of 2021.
Submitted 9 October, 2020;
originally announced October 2020.
-
A Holistic Survey of Wireless Multipath Video Streaming
Authors:
Samira Afzal,
Vanessa Testoni,
Christian Esteve Rothenberg,
Prakash Kolan,
Imed Bouazizi
Abstract:
Most of today's mobile devices are equipped with multiple network interfaces, and one of the main bandwidth-hungry applications that would benefit from multipath communications is wireless video streaming. However, most current transport protocols do not match the requirements of video streaming applications or are not designed to address relevant issues such as delay constraints, network heterogeneity, and head-of-line blocking. This survey provides a holistic literature review of multipath wireless video streaming, shedding light on the different alternatives from an end-to-end layered stack perspective, unveiling the trade-offs of each approach, and presenting a suitable taxonomy to classify the state of the art. Finally, we discuss open issues and avenues for future work.
Submitted 21 September, 2021; v1 submitted 14 June, 2019;
originally announced June 2019.
-
Ad-Net: Audio-Visual Convolutional Neural Network for Advertisement Detection In Videos
Authors:
Shervin Minaee,
Imed Bouazizi,
Prakash Kolan,
Hossein Najafzadeh
Abstract:
Personalized advertising is a crucial task for many online businesses and video broadcasters. Many of today's broadcasters show the same commercial to all customers, but different viewers have different interests, so it seems reasonable to show customized commercials to different groups of people, chosen based on their demographic features and history. In this project, we propose a framework that takes broadcast videos, analyzes them, detects the commercial, and replaces it with a more suitable one. We propose a two-stream audio-visual convolutional neural network in which one branch analyzes the visual information and the other analyzes the audio information; the audio and visual embeddings are then fused and used for commercial detection and content categorization. We show that using both the visual and audio content of the videos significantly improves model performance for video analysis. The network is trained on a dataset of more than 50k regular video and commercial shots, and achieves much better performance than models based on hand-crafted features.
Submitted 22 June, 2018;
originally announced June 2018.