Search | arXiv e-print repository

Power Reduction Opportunities on End-User Devices in Quality-Steady Video Streaming

Authors: Christian Herglotz, Werner Robitza, Alexander Raake, Tobias Hossfeld, André Kaup

Abstract: This paper uses a crowdsourced dataset of online video streaming sessions to investigate opportunities to reduce the power consumption while considering QoE. For this, we base our work on prior studies which model both the end-user's QoE and the end-user device's power consumption with the help of high-level video features such as the bitrate, the frame rate, and the resolution. On top of existing research, which focused on reducing the power consumption at the same QoE by optimizing video parameters, we investigate potential power savings by other means such as using a different playback device, a different codec, or a predefined maximum quality level. We find that based on the power consumption of the streaming sessions from the crowdsourcing dataset, devices could save more than 55% of power if all participants adhere to low-power settings.

Submitted 24 May, 2023; originally announced May 2023.

Comments: 4 pages, 3 figures

arXiv:2204.07131 [pdf, other]

Systematic Analysis of Experiment Precision Measures and Methods for Experiments Comparison

Authors: Jakub Nawała, Tobias Hoßfeld, Lucjan Janowski, Michael Seufert

Abstract: The notion of experiment precision quantifies the variance of user ratings in a subjective experiment. Although there exist measures that assess subjective experiment precision, there are no systematic analyses of these measures available in the literature. To the best of our knowledge, there is also no systematic framework in the Multimedia Quality Assessment field for comparing subjective experiments in terms of their precision. Therefore, the main idea of this paper is to propose a framework for comparing subjective experiments in the field of MQA based on appropriate experiment precision measures. We present three experiment precision measures and three related experiment precision comparison methods. We systematically analyse the performance of the measures and methods proposed. We do so both through a simulation study (varying user rating variance and bias) and by using data from four real-world Quality of Experience (QoE) subjective experiments. In the simulation study we focus on crowdsourcing QoE experiments, since they are known to generate ratings with higher variance and bias, when compared to traditional subjective experiment methodologies. We conclude that our proposed measures and related comparison methods properly capture experiment precision (both when tested on simulated and real-world data). One of the measures also proves capable of dealing with even significantly biased responses. We believe our experiment precision assessment framework will help compare different subjective experiment methodologies. For example, it may help decide which methodology results in more precise user ratings. This may potentially inform future standardisation activities.

Submitted 4 August, 2022; v1 submitted 14 April, 2022; originally announced April 2022.

Comments: 18 pages, 9 figures. Under review in IEEE Transactions on Multimedia. More results and references added. Improved style. Discussion section and appendices extended

arXiv:2006.16896 [pdf]

doi 10.25972/OPUS-20232

White Paper on Crowdsourced Network and QoE Measurements -- Definitions, Use Cases and Challenges

Authors: Tobias Hoßfeld, Stefan Wunderer, André Beyer, Andrew Hall, Anika Schwind, Christian Gassner, Fabrice Guillemin, Florian Wamser, Krzysztof Wascinski, Matthias Hirth, Michael Seufert, Pedro Casas, Phuoc Tran-Gia, Werner Robitza, Wojciech Wascinski, Zied Ben Houidi

Abstract: This white paper is the outcome of the Würzburg seminar on "Crowdsourced Network and QoE Measurements" which took place from 25-26 September 2019 in Würzburg, Germany. International experts were invited from industry and academia. They are well known in their communities, having different backgrounds in crowdsourcing, mobile networks, network measurements, network performance, Quality of Service (QoS), and Quality of Experience (QoE). The discussions in the seminar focused on how crowdsourcing will support vendors, operators, and regulators to determine the Quality of Experience in new 5G networks that enable various new applications and network architectures. As a result of the discussions, the need for a white paper manifested, with the goal of providing a scientific discussion of the terms "crowdsourced network measurements" and "crowdsourced QoE measurements", describing relevant use cases for such crowdsourced data, and its underlying challenges. During the seminar, those main topics were identified, intensively discussed in break-out groups, and brought back into the plenum several times. The outcome of the seminar is this white paper at hand which is - to our knowledge - the first one covering the topic of crowdsourced network and QoE measurements.

Submitted 25 May, 2020; originally announced June 2020.

arXiv:2003.12742 [pdf, other]

From QoS Distributions to QoE Distributions: a System's Perspective

Authors: Tobias Hossfeld, Poul E. Heegaard, Martin Varela, Lea Skorin-Kapov, Markus Fiedler

Abstract: In the context of QoE management, network and service providers commonly rely on models that map system QoS conditions (e.g., system response time, packet loss, etc.) to estimated end user QoE values. Observable QoS conditions in the system may be assumed to follow a certain distribution, meaning that different end users will experience different conditions. On the other hand, drawing from the results of subjective user studies, we know that user diversity leads to distributions of user scores for any given test conditions (in this case referring to the QoS parameters of interest). Our previous studies have shown that to correctly derive various QoE metrics (e.g., Mean Opinion Score (MOS), quantiles, probability of users rating "good or better", etc.) in a system under given conditions, there is a need to consider rating distributions obtained from user studies, which are oftentimes not available. In this paper we extend these findings to show how to approximate user rating distributions given a QoS-to-MOS mapping function and second order statistics. Such a user rating distribution may then be combined with a QoS distribution observed in a system to finally derive corresponding distributions of QoE scores. We provide two examples to illustrate this process: 1) analytical results using a Web QoE model relating waiting times to QoE, and 2) numerical results using measurements relating packet losses to video stall pattern, which are in turn mapped to QoE estimates. The results in this paper provide a solution to the problem of understanding the QoE distribution in a system, in cases where the necessary data is not directly available in the form of models going beyond the MOS, or where the full details of subjective experiments are not available.

Submitted 28 March, 2020; originally announced March 2020.

Comments: 4th International Workshop on Quality of Experience Management (QoE Management 2020), featured by IEEE Conference on Network Softwarization (IEEE NetSoft 2020), Ghent, Belgium

arXiv:2003.11903 [pdf, other]

Crowdsourced Network Measurements in Germany: Mobile Internet Experience from End User Perspective

Authors: Anika Schwind, Florian Wamser, Tobias Hoßfeld, Stefan Wunderer, Erik Tarnvik, Andy Hall

Abstract: Collecting and analyzing meaningful data in mobile networks is the key to assessing network performance. Crowdsourced Network Measurements (CNMs) provide insights beyond the network layer and offer performance and other measurements at the application and user-level towards Quality of Experience (QoE). In this paper, the mobile Internet experience for Germany is evaluated with the help of crowdsourcing from the perspective of an end user. We statistically analyze a dataset with throughput measurements on the end device from Tutela Ltd., which covers more than 2.5 million throughput tests across Germany from January to July 2019. We give insights into this emerging methodology and highlight the benefits of this method. The paper contains statistics and conclusions for several large cities as well as regions in Germany compared to general statements for Germany, since individual measurements and averages often only imprecisely reflect the situation. The goal is to give a holistic view of the performance of the current mobile network in Germany. Reading this paper, it becomes evident that reliable statements about the quality of the mobile network for Germany depend on a large number of peculiarities in different regions with their own performance characteristics due to different network deployments and population numbers.

Submitted 26 March, 2020; originally announced March 2020.

arXiv:2003.11300 [pdf, other]

doi 10.1109/QoMEX48832.2020.9123115

Impact of the Number of Votes on the Reliability and Validity of Subjective Speech Quality Assessment in the Crowdsourcing Approach

Authors: Babak Naderi, Tobias Hossfeld, Matthias Hirth, Florian Metzger, Sebastian Möller, Rafael Zequeira Jiménez

Abstract: The subjective quality of transmitted speech is traditionally assessed in a controlled laboratory environment according to ITU-T Rec. P.800. In turn, with crowdsourcing, crowdworkers participate in a subjective online experiment using their own listening device, and in their own working environment. Despite such less controllable conditions, the increased use of crowdsourcing micro-task platforms for quality assessment tasks has pushed a high demand for standardized methods, resulting in ITU-T Rec. P.808. This work investigates the impact of the number of judgments on the reliability and the validity of quality ratings collected through crowdsourcing-based speech quality assessments, as an input to ITU-T Rec. P.808. Three crowdsourcing experiments on different platforms were conducted to evaluate the overall quality of three different speech datasets, using the Absolute Category Rating procedure. For each dataset, the Mean Opinion Scores (MOS) are calculated using differing numbers of crowdsourcing judgments. Then the results are compared to MOS values collected in a standard laboratory experiment, to assess the validity of the crowdsourcing approach as a function of the number of votes. In addition, the reliability of the average scores is analyzed by checking inter-rater reliability, gain in certainty, and the confidence of the MOS. The results provide a suggestion on the required number of votes per condition, and allow modeling its impact on validity and reliability.

Submitted 25 March, 2020; originally announced March 2020.

Comments: This paper has been accepted for publication in the 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX)

arXiv:1909.07617 [pdf, other]

Mobile Internet Experience: Urban vs. Rural -- Saturation vs. Starving?

Authors: Anika Schwind, Florian Wamser, Stefan Wunderer, Christian Gassner, Tobias Hoßfeld

Abstract: Mobile Internet experience has been of increasing interest. Services accessed via smartphone applications shall provide satisfying Quality of Experience (QoE), irrespective of end user location, time of the day, and other circumstances. Unfortunately, current LTE networks often don't provide constant user throughput, one of the major system influence factors to mobile Internet QoE. In this paper, we conducted an exemplary measurement study in LTE networks, comparing the QoE of mobile networks in an urban and a rural region. Our results show that there are significant differences concerning the network speed which can result in unsatisfactory service quality depending on the application to be used. When evaluating the QoE for multiple users who are using the same base station in a specific area, user satisfaction decreases drastically, especially in rural areas. Our work encourages future work to focus on this gap between the QoE in urban and rural areas.

Submitted 17 September, 2019; originally announced September 2019.

arXiv:1909.02772 [pdf, other]

Cumulative Quality Modeling for HTTP Adaptive Streaming

Authors: Huyen T. T. Tran, Nam Pham Ngoc, Tobias Hoßfeld, Michael Seufert, Truong Cong Thang

Abstract: Thanks to the abundance of Web platforms and broadband connections, HTTP Adaptive Streaming has become the de facto choice for multimedia delivery nowadays. However, the visual quality of adaptive video streaming may fluctuate strongly during a session due to bandwidth fluctuations. So, it is important to evaluate the quality of a streaming session over time. In this paper, we propose a model to estimate the cumulative quality for HTTP Adaptive Streaming. In the model, a sliding window of video segments is employed as the basic building block. Through statistical analysis using a subjective dataset, we identify three important components of the cumulative quality model, namely the minimum window quality, the last window quality, and the average window quality. Experiment results show that the proposed model achieves high prediction performance and outperforms related quality models. In addition, another advantage of the proposed model is its simplicity and effectiveness for deployment in real-time estimation. The source code of the proposed model has been made available to the public at https://github.com/TranHuyen1191/CQM.

Submitted 25 February, 2020; v1 submitted 6 September, 2019; originally announced September 2019.
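As a rough illustration of the three components named in the abstract above (minimum, last, and average window quality), the following sketch computes them from a list of per-segment quality values. It is not the authors' model: the function names, the use of a plain mean as the window quality, and the stride of one segment are all assumptions made here for illustration, and the way the published model combines the three components is not stated in the abstract, so it is left out.

```python
def window_qualities(segment_qualities, window_size):
    """Mean quality of each sliding window (stride 1).

    Assumption: the quality of a window is the plain mean of its
    segment qualities; the abstract does not specify this.
    """
    n = len(segment_qualities)
    if window_size <= 0 or window_size > n:
        raise ValueError("window_size must be between 1 and the number of segments")
    return [
        sum(segment_qualities[i:i + window_size]) / window_size
        for i in range(n - window_size + 1)
    ]


def cumulative_components(segment_qualities, window_size):
    """Return the three window statistics mentioned in the abstract."""
    windows = window_qualities(segment_qualities, window_size)
    return {
        "min_window": min(windows),                  # worst window in the session
        "last_window": windows[-1],                  # most recent window
        "avg_window": sum(windows) / len(windows),   # average over all windows
    }


# Hypothetical per-segment quality values on a 1-5 scale.
components = cumulative_components([4, 4, 2, 2, 5, 5], window_size=2)
```

The authors' actual implementation is the one linked in the abstract; this sketch only makes the sliding-window idea concrete.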

arXiv:1808.08065 [pdf, other]

Towards Machine Learning-Based Optimal HAS

Authors: Christian Sieber, Korbinian Hagn, Christian Moldovan, Tobias Hoßfeld, Wolfgang Kellerer

Abstract: Mobile video consumption is increasing and sophisticated video quality adaptation strategies are required to deal with mobile throughput fluctuations. These adaptation strategies have to keep the switching frequency low, the average quality high and prevent stalling occurrences to ensure customer satisfaction. This paper proposes a novel methodology for the design of machine learning-based adaptation logics named HASBRAIN. Furthermore, the performance of a trained neural network against two algorithms from the literature is evaluated. We first use a modified existing optimization formulation to calculate optimal adaptation paths with a minimum number of quality switches for a wide range of videos and for challenging mobile throughput patterns. Afterwards we use the resulting optimal adaptation paths to train and compare different machine learning models. The evaluation shows that an artificial neural network-based model can reach a high average quality with a low number of switches in the mobile scenario. The proposed methodology is general enough to be extended for further designs of machine learning-based algorithms and the provided model can be deployed in on-demand streaming scenarios or be further refined using reward-based mechanisms such as reinforcement learning. All tools, models and datasets created during the work are provided as open-source software.

Submitted 24 August, 2018; originally announced August 2018.

Comments: 9 pages

arXiv:1806.01126 [pdf, ps, other]

Confidence Interval Estimators for MOS Values

Authors: Tobias Hossfeld, Poul E. Heegaard, Martin Varela, Lea Skorin-Kapov

Abstract: For the quantification of QoE, subjects often provide individual rating scores on certain rating scales which are then aggregated into Mean Opinion Scores (MOS). From the observed sample data, the expected value is to be estimated. While the sample average only provides a point estimator, confidence intervals (CI) are an interval estimate which contains the desired expected value with a given confidence level. In subjective studies, the number of subjects performing the test is typically small, especially in lab environments. The used rating scales are bounded and often discrete like the 5-point ACR rating scale. Therefore, we review statistical approaches in the literature for their applicability in the QoE domain for MOS interval estimation (instead of having only a point estimator, which is the MOS). We provide a conservative estimator based on the SOS hypothesis and binomial distributions and compare its performance (CI width, outlier ratio of CI violating the rating scale bounds) and coverage probability with well known CI estimators. We show that the provided CI estimator works very well in practice for MOS interval estimators, while the commonly used studentized CIs suffer from a positive outlier ratio, i.e., CIs beyond the bounds of the rating scale. As an alternative, bootstrapping, i.e., random sampling of the subjective ratings with replacement, is an efficient CI estimator leading to typically smaller CIs, but lower coverage than the proposed estimator.

Submitted 4 June, 2018; originally announced June 2018.
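The bootstrapping alternative mentioned in the abstract above (random sampling of the subjective ratings with replacement) can be sketched in a few lines. This is a minimal illustration, not the paper's procedure: the percentile method for reading off the interval, the function name `bootstrap_mos_ci`, the number of resamples, and the example ratings are all assumptions introduced here.

```python
import random
import statistics


def bootstrap_mos_ci(ratings, confidence=0.95, n_resamples=2000, seed=0):
    """Percentile bootstrap confidence interval for the MOS.

    Assumption: the simple percentile method is used to read off the
    interval bounds; the abstract only states that ratings are resampled
    with replacement.
    """
    rng = random.Random(seed)
    n = len(ratings)
    # Resample the ratings with replacement and record each resample's mean.
    means = sorted(
        statistics.fmean(rng.choices(ratings, k=n)) for _ in range(n_resamples)
    )
    alpha = 1.0 - confidence
    lower_idx = int((alpha / 2) * n_resamples)
    upper_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples))
    return statistics.fmean(ratings), means[lower_idx], means[upper_idx]


# Hypothetical ratings on a 5-point ACR scale.
mos, ci_low, ci_high = bootstrap_mos_ci([5, 4, 4, 3, 5, 4, 2, 4, 3, 5])
```

Because every resampled mean lies between the smallest and largest observed rating, an interval built this way cannot leave the bounds of the rating scale, which is the outlier problem the abstract attributes to studentized CIs.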

arXiv:1607.00321 [pdf, ps, other]

doi 10.1007/s41233-016-0002-1

Formal Definition of QoE Metrics

Authors: Tobias Hossfeld, Poul E. Heegaard, Martin Varela, Sebastian Möller

Abstract: This technical report formally defines the QoE metrics which are introduced and discussed in the article "QoE Beyond the MOS: An In-Depth Look at QoE via Better Metrics and their Relation to MOS" by Tobias Hoßfeld, Poul E. Heegaard, Martin Varela, Sebastian Möller, accepted for publication in the Springer journal "Quality and User Experience". Matlab scripts for computing the QoE metrics for given data sets are available in GitHub.

Submitted 1 July, 2016; originally announced July 2016.

Journal ref: Quality and User Experience (2016) 1: 2

Showing 1–11 of 11 results for author: Hossfeld, T