Search | arXiv e-print repository

LLM-ABR: Designing Adaptive Bitrate Algorithms via Large Language Models

Authors: Zhiyuan He, Aashish Gottipati, Lili Qiu, Francis Y. Yan, Xufang Luo, Kenuo Xu, Yuqing Yang

Abstract: We present LLM-ABR, the first system that utilizes the generative capabilities of large language models (LLMs) to autonomously design adaptive bitrate (ABR) algorithms tailored for diverse network characteristics. Operating within a reinforcement learning framework, LLM-ABR empowers LLMs to design key components such as states and neural network architectures. We evaluate LLM-ABR across diverse ne… ▽ More We present LLM-ABR, the first system that utilizes the generative capabilities of large language models (LLMs) to autonomously design adaptive bitrate (ABR) algorithms tailored for diverse network characteristics. Operating within a reinforcement learning framework, LLM-ABR empowers LLMs to design key components such as states and neural network architectures. We evaluate LLM-ABR across diverse network settings, including broadband, satellite, 4G, and 5G. LLM-ABR consistently outperforms default ABR algorithms. △ Less

Submitted 1 April, 2024; originally announced April 2024.

arXiv:2403.06324 [pdf, other]

ACM MMSys 2024 Bandwidth Estimation in Real Time Communications Challenge

Authors: Sami Khairy, Gabriel Mittag, Vishak Gopal, Francis Y. Yan, Zhixiong Niu, Ezra Ameri, Scott Inglis, Mehrsa Golestaneh, Ross Cutler

Abstract: The quality of experience (QoE) delivered by video conferencing systems to end users depends in part on correctly estimating the capacity of the bottleneck link between the sender and the receiver over time. Bandwidth estimation for real-time communications (RTC) remains a significant challenge, primarily due to the continuously evolving heterogeneous network architectures and technologies. From t… ▽ More The quality of experience (QoE) delivered by video conferencing systems to end users depends in part on correctly estimating the capacity of the bottleneck link between the sender and the receiver over time. Bandwidth estimation for real-time communications (RTC) remains a significant challenge, primarily due to the continuously evolving heterogeneous network architectures and technologies. From the first bandwidth estimation challenge which was hosted at ACM MMSys 2021, we learned that bandwidth estimation models trained with reinforcement learning (RL) in simulations to maximize network-based reward functions may not be optimal in reality due to the sim-to-real gap and the difficulty of aligning network-based rewards with user-perceived QoE. This grand challenge aims to advance bandwidth estimation model design by aligning reward maximization with user-perceived QoE optimization using offline RL and a real-world dataset with objective rewards which have high correlations with subjective audio/video quality in Microsoft Teams. All models submitted to the grand challenge underwent initial evaluation on our emulation platform. For a comprehensive evaluation under diverse network conditions with temporal fluctuations, top models were further evaluated on our geographically distributed testbed by using each model to conduct 600 calls within a 12-day period. The winning model is shown to deliver comparable performance to the top behavior policy in the released dataset. By leveraging real-world data and integrating objective audio/video quality scores as rewards, offline RL can therefore facilitate the development of competitive bandwidth estimators for RTC. △ Less

Submitted 15 March, 2024; v1 submitted 10 March, 2024; originally announced March 2024.

arXiv:2305.12333 [pdf, other]

GRACE: Loss-Resilient Real-Time Video through Neural Codecs

Authors: Yihua Cheng, Ziyi Zhang, Hanchen Li, Anton Arapin, Yue Zhang, Qizheng Zhang, Yuhan Liu, Xu Zhang, Francis Y. Yan, Amrita Mazumdar, Nick Feamster, Junchen Jiang

Abstract: In real-time video communication, retransmitting lost packets over high-latency networks is not viable due to strict latency requirements. To counter packet losses without retransmission, two primary strategies are employed -- encoder-based forward error correction (FEC) and decoder-based error concealment. The former encodes data with redundancy before transmission, yet determining the optimal re… ▽ More In real-time video communication, retransmitting lost packets over high-latency networks is not viable due to strict latency requirements. To counter packet losses without retransmission, two primary strategies are employed -- encoder-based forward error correction (FEC) and decoder-based error concealment. The former encodes data with redundancy before transmission, yet determining the optimal redundancy level in advance proves challenging. The latter reconstructs video from partially received frames, but dividing a frame into independently coded partitions inherently compromises compression efficiency, and the lost information cannot be effectively recovered by the decoder without adapting the encoder. We present a loss-resilient real-time video system called GRACE, which preserves the user's quality of experience (QoE) across a wide range of packet losses through a new neural video codec. Central to GRACE's enhanced loss resilience is its joint training of the neural encoder and decoder under a spectrum of simulated packet losses. In lossless scenarios, GRACE achieves video quality on par with conventional codecs (e.g., H.265). As the loss rate escalates, GRACE exhibits a more graceful, less pronounced decline in quality, consistently outperforming other loss-resilient schemes. Through extensive evaluation on various videos and real network traces, we demonstrate that GRACE reduces undecodable frames by 95% and stall duration by 90% compared with FEC, while markedly boosting video quality over error concealment methods. In a user study with 240 crowdsourced participants and 960 subjective ratings, GRACE registers a 38% higher mean opinion score (MOS) than other baselines. △ Less

Submitted 12 March, 2024; v1 submitted 20 May, 2023; originally announced May 2023.

arXiv:2212.12180 [pdf, other]

Autothrottle: A Practical Bi-Level Approach to Resource Management for SLO-Targeted Microservices

Authors: Zibo Wang, **he Li, Chieh-Jan Mike Liang, Feng Wu, Francis Y. Yan

Abstract: Achieving resource efficiency while preserving end-user experience is non-trivial for cloud application operators. As cloud applications progressively adopt microservices, resource managers are faced with two distinct levels of system behavior: end-to-end application latency and per-service resource usage. Translating between the two levels, however, is challenging because user requests traverse h… ▽ More Achieving resource efficiency while preserving end-user experience is non-trivial for cloud application operators. As cloud applications progressively adopt microservices, resource managers are faced with two distinct levels of system behavior: end-to-end application latency and per-service resource usage. Translating between the two levels, however, is challenging because user requests traverse heterogeneous services that collectively (but unevenly) contribute to the end-to-end latency. We present Autothrottle, a bi-level resource management framework for microservices with latency SLOs (service-level objectives). It architecturally decouples application SLO feedback from service resource control, and bridges them through the notion of performance targets. Specifically, an application-wide learning-based controller is employed to periodically set performance targets -- expressed as CPU throttle ratios -- for per-service heuristic controllers to attain. We evaluate Autothrottle on three microservice applications, with workload traces from production scenarios. Results show superior CPU savings, up to 26.21% over the best-performing baseline and up to 93.84% over all baselines. △ Less

Submitted 14 April, 2024; v1 submitted 23 December, 2022; originally announced December 2022.

Comments: Accepted by USENIX NSDI '24

arXiv:2210.13763 [pdf, other]

Teal: Learning-Accelerated Optimization of WAN Traffic Engineering

Authors: Zhiying Xu, Francis Y. Yan, Rachee Singh, Justin T. Chiu, Alexander M. Rush, Minlan Yu

Abstract: The rapid expansion of global cloud wide-area networks (WANs) has posed a challenge for commercial optimization engines to efficiently solve network traffic engineering (TE) problems at scale. Existing acceleration strategies decompose TE optimization into concurrent subproblems but realize limited parallelism due to an inherent tradeoff between run time and allocation performance. We present Te… ▽ More The rapid expansion of global cloud wide-area networks (WANs) has posed a challenge for commercial optimization engines to efficiently solve network traffic engineering (TE) problems at scale. Existing acceleration strategies decompose TE optimization into concurrent subproblems but realize limited parallelism due to an inherent tradeoff between run time and allocation performance. We present Teal, a learning-based TE algorithm that leverages the parallel processing power of GPUs to accelerate TE control. First, Teal designs a flow-centric graph neural network (GNN) to capture WAN connectivity and network flows, learning flow features as inputs to downstream allocation. Second, to reduce the problem scale and make learning tractable, Teal employs a multi-agent reinforcement learning (RL) algorithm to independently allocate each traffic demand while optimizing a central TE objective. Finally, Teal fine-tunes allocations with ADMM (Alternating Direction Method of Multipliers), a highly parallelizable optimization algorithm for reducing constraint violations such as overutilized links. We evaluate Teal using traffic matrices from Microsoft's WAN. On a large WAN topology with >1,700 nodes, Teal generates near-optimal flow allocations while running several orders of magnitude faster than the production optimization engine. Compared with other TE acceleration schemes, Teal satisfies 6--32% more traffic demand and yields 197--625x speedups. △ Less

Submitted 19 May, 2024; v1 submitted 25 October, 2022; originally announced October 2022.

arXiv:2202.05940 [pdf, other]

Automatic Curriculum Generation for Learning Adaptation in Networking

Authors: Zhengxu Xia, Yajie Zhou, Francis Y. Yan, Junchen Jiang

Abstract: As deep reinforcement learning (RL) showcases its strengths in networking and systems, its pitfalls also come to the public's attention--when trained to handle a wide range of network workloads and previously unseen deployment environments, RL policies often manifest suboptimal performance and poor generalizability. To tackle these problems, we present Genet, a new training framework for learnin… ▽ More As deep reinforcement learning (RL) showcases its strengths in networking and systems, its pitfalls also come to the public's attention--when trained to handle a wide range of network workloads and previously unseen deployment environments, RL policies often manifest suboptimal performance and poor generalizability. To tackle these problems, we present Genet, a new training framework for learning better RL-based network adaptation algorithms. Genet is built on the concept of curriculum learning, which has proved effective against similar issues in other domains where RL is extensively employed. At a high level, curriculum learning gradually presents more difficult environments to the training, rather than choosing them randomly, so that the current RL model can make meaningful progress in training. However, applying curriculum learning in networking is challenging because it remains unknown how to measure the "difficulty" of a network environment. Instead of relying on handcrafted heuristics to determine the environment's difficulty level, our insight is to utilize traditional rule-based (non-RL) baselines: If the current RL model performs significantly worse in a network environment than the baselines, then the model's potential to improve when further trained in this environment is substantial. Therefore, Genet automatically searches for the environments where the current model falls significantly behind a traditional baseline scheme and iteratively promotes these environments as the training progresses. Through evaluating Genet on three use cases--adaptive video streaming, congestion control, and load balancing, we show that Genet produces RL policies which outperform both regularly trained RL policies and traditional baselines in each context, not only under synthetic workloads but also in real environments. △ Less

Submitted 8 September, 2022; v1 submitted 11 February, 2022; originally announced February 2022.

Comments: Accepted by SIGCOMM'22

arXiv:2011.09611 [pdf, other]

Implementing BOLA-BASIC on Puffer: Lessons for the use of SSIM in ABR logic

Authors: Emily Marx, Francis Y. Yan, Keith Winstein

Abstract: One ABR algorithm implemented on Puffer is BOLA-BASIC, the simplest variant of BOLA. BOLA finds wide use in industry, notably in the MPEG-DASH reference player used as the basis for video players at Akamai, BBC, Orange, and CBS. The overall goal of BOLA is to maximize each encoded chunk's video quality while minimizing rebuffering. To measure video quality, Puffer uses the structural similarity me… ▽ More One ABR algorithm implemented on Puffer is BOLA-BASIC, the simplest variant of BOLA. BOLA finds wide use in industry, notably in the MPEG-DASH reference player used as the basis for video players at Akamai, BBC, Orange, and CBS. The overall goal of BOLA is to maximize each encoded chunk's video quality while minimizing rebuffering. To measure video quality, Puffer uses the structural similarity metric SSIM, whereas BOLA and other ABR algorithms like BBA, MPC, and Pensieve are more commonly implemented using bitrate (or a variant of bitrate). While bitrate is frequently used, BOLA allows the video provider to define its own proxy of video quality as the algorithm's "utility" function. However, using SSIM as utility proved surprisingly complex for BOLA-BASIC, despite the algorithm's simplicity. Given the rising popularity of SSIM and related quality metrics, we anticipate that a growing number of Puffer-like systems will face similar challenges. We hope developers of such systems find our experiences informative as they implement algorithms designed with bitrate-based utility in mind. △ Less

Submitted 18 November, 2020; originally announced November 2020.

arXiv:1906.01113 [pdf, other]

Learning in situ: a randomized experiment in video streaming

Authors: Francis Y. Yan, Hudson Ayers, Chenzhi Zhu, Sadjad Fouladi, James Hong, Keyi Zhang, Philip Levis, Keith Winstein

Abstract: We describe the results of a randomized controlled trial of video-streaming algorithms for bitrate selection and network prediction. Over the last eight months, we have streamed 14.2 years of video to 56,000 users across the Internet. Sessions are randomized in blinded fashion among algorithms, and client telemetry is recorded for analysis. We found that in this real-world setting, it is difficu… ▽ More We describe the results of a randomized controlled trial of video-streaming algorithms for bitrate selection and network prediction. Over the last eight months, we have streamed 14.2 years of video to 56,000 users across the Internet. Sessions are randomized in blinded fashion among algorithms, and client telemetry is recorded for analysis. We found that in this real-world setting, it is difficult for sophisticated or machine-learned control schemes to outperform a "simple" scheme (buffer-based control), notwithstanding good performance in network emulators or simulators. We performed a statistical analysis and found that the variability and heavy-tailed nature of network and algorithm behavior create hurdles for robust learned algorithms in this area. We developed an ABR algorithm that robustly outperforms other schemes in practice, by combining classical control with a learned network predictor, trained with supervised learning in situ on data from the real deployment environment. To support further investigation, we are publishing an archive of traces and results each day, and will open our ongoing study to the community. We welcome other researchers to use this platform to develop and validate new algorithms for bitrate selection, network prediction, and congestion control. △ Less

Submitted 23 September, 2019; v1 submitted 3 June, 2019; originally announced June 2019.

Journal ref: USENIX NSDI (2020) 495-511

Showing 1–8 of 8 results for author: Yan, F Y