
Fast multi-encoding to reduce the cost of video streaming

Authors: Hadi Amirpour, Vignesh V Menon, Ekrem Çetinkaya, Adithyan Ilangovan, Christian Feldmann, Martin Smole, Christian Timmerer

Abstract: The growth in video Internet traffic and advancements in video attributes such as framerate, resolution, and bit-depth boost the demand to devise a large-scale, highly efficient video encoding environment. This is even more essential for Dynamic Adaptive Streaming over HTTP (DASH)-based content provisioning as it requires encoding numerous representations of the same video content. High Efficiency Video Coding (HEVC) is one standard video codec that significantly improves encoding efficiency over its predecessor Advanced Video Coding (AVC). This improvement is achieved at the expense of significantly increased time complexity, which is a challenge for content and service providers. As various representations are the same video content encoded at different bitrates or resolutions, the encoding analysis information from the already encoded representations can be shared to accelerate the encoding of other representations. Several state-of-the-art schemes first encode a single representation, called a reference representation. During this encoding, the encoder creates analysis metadata with information such as the slice-type decisions, the CU, PU, and TU partitioning, and the HEVC bitstream itself. The remaining representations, called dependent representations, analyze the above metadata and then reuse it to skip searching some partitionings, thus reducing the computational complexity.
With the emergence of cloud-based encoding services, video encoding is accelerated by utilizing an increased number of resources; i.e., with multi-core CPUs, multiple representations can be encoded in parallel. This paper presents an overview of a wide range of multi-encoding schemes with and without the support of machine learning approaches integrated into the HEVC Test Model (HM) and x265, respectively.

Submitted 25 October, 2022; originally announced October 2022.

Comments: Accepted in IBC2022

arXiv:2201.04488 [pdf, other]

doi 10.1007/978-3-030-98355-0_33

ECAS-ML: Edge Computing Assisted Adaptation Scheme with Machine Learning for HTTP Adaptive Streaming

Authors: Jesús Aguilar-Armijo, Ekrem Çetinkaya, Christian Timmerer, Hermann Hellwagner

Abstract: As the video streaming traffic in mobile networks is increasing, improving the content delivery process becomes crucial, e.g., by utilizing edge computing support. At an edge node, we can deploy adaptive bitrate (ABR) algorithms with a better understanding of network behavior and access to radio and player metrics. In this work, we present ECAS-ML, Edge Assisted Adaptation Scheme for HTTP Adaptive Streaming with Machine Learning. ECAS-ML focuses on managing the tradeoff among bitrate, segment switches, and stalls to achieve a higher quality of experience (QoE). For that purpose, we use machine learning techniques to analyze radio throughput traces and predict the best parameters of our algorithm to achieve better performance. The results show that ECAS-ML outperforms other client-based and edge-based ABR algorithms.

Submitted 12 January, 2022; originally announced January 2022.

Comments: 12 pages, 4 figures

ACM Class: H.5.1; C.2.1

Journal ref: MMM 2022: MultiMedia Modeling pp 394-406

arXiv:2201.04402 [pdf, other]

doi 10.1007/978-3-030-98355-0_40

MoViDNN: A Mobile Platform for Evaluating Video Quality Enhancement with Deep Neural Networks

Authors: Ekrem Çetinkaya, Minh Nguyen, Christian Timmerer

Abstract: Deep neural network (DNN)-based approaches have been intensively studied to improve video quality thanks to their fast advancement in recent years. These approaches are designed mainly for desktop devices due to their high computational cost. However, with the increasing performance of mobile devices in recent years, it has become possible to execute DNN-based approaches on mobile devices. Despite having the required computational power, utilizing DNNs to improve the video quality for mobile devices is still an active research area. In this paper, we propose an open-source mobile platform, namely MoViDNN, to evaluate DNN-based video quality enhancement methods, such as super-resolution, denoising, and deblocking. Our proposed platform can be used to evaluate the DNN-based approaches both objectively and subjectively. For objective evaluation, we report common metrics such as execution time, PSNR, and SSIM. For subjective evaluation, Mean Opinion Score (MOS) is reported. The proposed platform is available publicly at https://github.com/cd-athena/MoViDNN

Submitted 12 January, 2022; originally announced January 2022.

Comments: 8 pages, 3 figures

ACM Class: H.5.1; I.4.9

Journal ref: MMM 2022: MultiMedia Modeling pp 465-472

arXiv:2112.06194 [pdf]

doi 10.1109/ISCTURKEY53027.2021.9654356

Improving Performance of Federated Learning based Medical Image Analysis in Non-IID Settings using Image Augmentation

Authors: Alper Emin Cetinkaya, Murat Akin, Seref Sagiroglu

Abstract: Federated Learning (FL) is a suitable solution for making use of sensitive data belonging to patients, people, companies, or industries that are obliged to work under rigid privacy constraints. FL mainly or partially addresses data privacy and security issues and provides an alternative way to model problems, enabling multiple edge devices or organizations to contribute to the training of a global model using a number of local datasets without having direct access to them. The non-IID data of FL, caused by its distributed nature, leads to significant performance degradation and stability problems. This paper introduces a novel method that dynamically balances the data distributions of clients by augmenting images to address the non-IID data problem of FL. The introduced method remarkably stabilizes the model training and improves the model's test accuracy from 83.22% to 89.43% for the detection of multiple chest diseases in chest X-ray images in a highly non-IID FL setting. The results of federated training in the IID setting, the non-IID setting, and the non-IID setting with the proposed method demonstrate that the proposed method might encourage organizations or researchers to develop better systems that extract value from data while respecting data privacy, not only in healthcare but also in other fields.

Submitted 14 December, 2021; v1 submitted 12 December, 2021; originally announced December 2021.

Journal ref: IEEE 14th International Conference on Information Security and Cryptology, 2021, pp. 69-74

arXiv:2104.08328 [pdf, other]

doi 10.1016/j.image.2021.116442

CTU Depth Decision Algorithms for HEVC: A Survey

Authors: Ekrem Cetinkaya, Hadi Amirpour, Mohammad Ghanbari, Christian Timmerer

Abstract: High-Efficiency Video Coding (HEVC) surpasses its predecessors in encoding efficiency by introducing new coding tools at the cost of an increased encoding time-complexity. The Coding Tree Unit (CTU) is the main building block used in HEVC. In the HEVC standard, frames are divided into CTUs with the predetermined size of up to 64x64 pixels. Each CTU is then divided recursively into a number of equally sized square areas, known as Coding Units (CUs). Although this diversity of frame partitioning increases encoding efficiency, it also causes an increase in the time complexity due to the increased number of ways to find the optimal partitioning. To address this complexity, numerous algorithms have been proposed to eliminate unnecessary searches while partitioning CTUs by exploiting the correlation in the video. In this paper, existing CTU depth decision algorithms for HEVC are surveyed. These algorithms are categorized into two groups, namely statistics and machine learning approaches. Statistics approaches are further subdivided into neighboring and inherent approaches. Neighboring approaches exploit the similarity between adjacent CTUs to limit the depth range of the current CTU, while inherent approaches use only the available information within the current CTU. Machine learning approaches try to extract and exploit similarities implicitly. Traditional methods like support vector machines or random forests use manually selected features, while recently proposed deep learning methods extract features during training.
Finally, this paper discusses extending these methods to more recent video coding formats such as Versatile Video Coding (VVC) and AOMedia Video 1 (AV1).

Submitted 24 June, 2021; v1 submitted 16 April, 2021; originally announced April 2021.

Comments: 27 pages, 12 figures, 5 tables

ACM Class: A.1; E.4

Journal ref: Signal Processing: Image Communication 99 (2021) 116442

Showing 1–5 of 5 results for author: Çetinkaya, E