
arXiv:2010.04645 [pdf]

MPEG Media Enablers For Richer XR Experiences

Authors: Emmanuel Thomas, Emmanouil Potetsianakis, Thomas Stockhammer, Imed Bouazizi, Mary-Luc Champel

Abstract: With the advent of immersive media applications, the requirements for the representation and the consumption of such content has dramatically increased. The ever-increasing size of the media asset combined with the stringent motion-to-photon latency requirement makes the equation of a high quality of experience for XR streaming services difficult to solve. The MPEG-I standards aim at facilitating the wide deployment of immersive applications. This paper describes part 13, Video Decoding Interface, and part 14, Scene Description for MPEG Media of MPEG-I which address decoder management and the virtual scene composition, respectively. These new parts intend to make complex media rendering operations and hardware resources management hidden from the application, hence lowering the barrier for XR application to become mainstream and accessible to XR experience developers and designers. Both parts are expected to be published by ISO at the end of 2021.

Submitted 9 October, 2020; originally announced October 2020.

ACM Class: I.3.2; H.5.1

Journal ref: IBC (2020)

arXiv:1906.06184 [pdf, other]

A Holistic Survey of Wireless Multipath Video Streaming

Authors: Samira Afzal, Vanessa Testoni, Christian Esteve Rothenberg, Prakash Kolan, Imed Bouazizi

Abstract: Most of today's mobile devices are equipped with multiple network interfaces and one of the main bandwidth-hungry applications that would benefit from multipath communications is wireless video streaming. However, most of the current transport protocols do not match the requirements of video streaming applications or are not designed to address relevant issues, such as delay constraints, networks heterogeneity, and head-of-line blocking issues. This survey provides a holistic literature review of multipath wireless video streaming, shedding light on the different alternatives from an end-to-end layered stack perspective, unveiling trade-offs of each approach, and presenting a suitable taxonomy to classify the state-of-the-art. Finally, we discuss open issues and avenues for future work.

Submitted 21 September, 2021; v1 submitted 14 June, 2019; originally announced June 2019.

Comments: 44 pages. 11 figures. 9 Tables. 228 References. Preprint article version under submission to Journal of Network and Computer Applications (JNCA) 2021

arXiv:1806.08612 [pdf, other]

Ad-Net: Audio-Visual Convolutional Neural Network for Advertisement Detection In Videos

Authors: Shervin Minaee, Imed Bouazizi, Prakash Kolan, Hossein Najafzadeh

Abstract: Personalized advertisement is a crucial task for many of the online businesses and video broadcasters. Many of today's broadcasters use the same commercial for all customers, but as one can imagine different viewers have different interests and it seems reasonable to have customized commercial for different group of people, chosen based on their demographic features, and history. In this project, we propose a framework, which gets the broadcast videos, analyzes them, detects the commercial and replaces it with a more suitable commercial. We propose a two-stream audio-visual convolutional neural network, that one branch analyzes the visual information and the other one analyzes the audio information, and then the audio and visual embedding are fused together, and are used for commercial detection, and content categorization. We show that using both the visual and audio content of the videos significantly improves the model performance for video analysis. This network is trained on a dataset of more than 50k regular video and commercial shots, and achieved much better performance compared to the models based on hand-crafted features.

Submitted 22 June, 2018; originally announced June 2018.

Showing 1–3 of 3 results for author: Bouazizi, I