doi 10.1109/TITS.2024.3378007

Multi-Objective Optimization Using Adaptive Distributed Reinforcement Learning

Authors: Jing Tan, Ramin Khalili, Holger Karl

Abstract: The Intelligent Transportation System (ITS) environment is known to be dynamic and distributed, where participants (vehicle users, operators, etc.) have multiple, changing and possibly conflicting objectives. Although Reinforcement Learning (RL) algorithms are commonly applied to optimize ITS applications such as resource management and offloading, most RL algorithms focus on single objectives. In many situations, converting a multi-objective problem into a single-objective one is impossible, intractable or insufficient, making such RL algorithms inapplicable. We propose a multi-objective, multi-agent reinforcement learning (MARL) algorithm with high learning efficiency and low computational requirements, which automatically triggers adaptive few-shot learning in a dynamic, distributed and noisy environment with sparse and delayed reward. We test our algorithm in an ITS environment with edge cloud computing. Empirical results show that the algorithm is quick to adapt to new environments and performs better in all individual and system metrics compared to the state-of-the-art benchmark. Our algorithm also addresses various practical concerns with its modularized and asynchronous online training method. In addition to the cloud simulation, we test our algorithm on a single-board computer and show that it can make inference in 6 milliseconds.

Submitted 13 March, 2024; originally announced March 2024.

arXiv:2401.10158 [pdf, other]

DISTINQT: A Distributed Privacy Aware Learning Framework for QoS Prediction for Future Mobile and Wireless Networks

Authors: Nikolaos Koursioumpas, Lina Magoula, Ioannis Stavrakakis, Nancy Alonistioti, M. A. Gutierrez-Estevez, Ramin Khalili

Abstract: Beyond 5G and 6G networks are expected to support new and challenging use cases and applications that depend on a certain level of Quality of Service (QoS) to operate smoothly. Predicting the QoS in a timely manner is of high importance, especially for safety-critical applications as in the case of vehicular communications. Although until recent years the QoS prediction has been carried out by centralized Artificial Intelligence (AI) solutions, a number of privacy, computational, and operational concerns have emerged. Alternative solutions have surfaced (e.g. Split Learning, Federated Learning), distributing AI tasks of reduced complexity across nodes, while preserving the privacy of the data. However, new challenges rise when it comes to scalable distributed learning approaches, taking into account the heterogeneous nature of future wireless networks. The current work proposes DISTINQT, a novel multi-headed input privacy-aware distributed learning framework for QoS prediction. Our framework supports multiple heterogeneous nodes, in terms of data types and model architectures, by sharing computations across them. This enables the incorporation of diverse knowledge into a sole learning process that will enhance the robustness and generalization capabilities of the final QoS prediction model. DISTINQT also contributes to data privacy preservation by encoding any raw input data into highly complex, compressed, and irreversible latent representations before any transmission. Evaluation results showcase that DISTINQT achieves a statistically identical performance compared to its centralized version, while also proving the validity of the privacy preserving claims. DISTINQT manages to achieve a reduction in prediction error of up to 65% on average against six state-of-the-art centralized baseline solutions presented in the Tele-Operated Driving use case.

Submitted 12 July, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

Comments: 12 Pages Double Column, 10 Figures, (Revised Version) Submitted for possible publication in the IEEE Transactions on Vehicular Technology (IEEE TVT)

arXiv:2308.10664 [pdf, other]

doi 10.1109/TGCN.2024.3372695

A Safe Deep Reinforcement Learning Approach for Energy Efficient Federated Learning in Wireless Communication Networks

Authors: Nikolaos Koursioumpas, Lina Magoula, Nikolaos Petropouleas, Alexandros-Ioannis Thanopoulos, Theodora Panagea, Nancy Alonistioti, M. A. Gutierrez-Estevez, Ramin Khalili

Abstract: Progressing towards a new era of Artificial Intelligence (AI) - enabled wireless networks, concerns regarding the environmental impact of AI have been raised both in industry and academia. Federated Learning (FL) has emerged as a key privacy preserving decentralized AI technique. Despite efforts currently being made in FL, its environmental impact is still an open problem. Targeting the minimization of the overall energy consumption of an FL process, we propose the orchestration of computational and communication resources of the involved devices to minimize the total energy required, while guaranteeing a certain performance of the model. To this end, we propose a Soft Actor Critic Deep Reinforcement Learning (DRL) solution, where a penalty function is introduced during training, penalizing the strategies that violate the constraints of the environment, and contributing towards a safe RL process. A device level synchronization method, along with a computationally cost effective FL environment are proposed, with the goal of further reducing the energy consumption and communication overhead. Evaluation results show the effectiveness and robustness of the proposed scheme compared to four state-of-the-art baseline solutions on different network environments and FL architectures, achieving a decrease of up to 94% in the total energy consumption.

Submitted 5 March, 2024; v1 submitted 21 August, 2023; originally announced August 2023.

Comments: 12 Pages Double Column, 6 Figures, Accepted for publication in the IEEE Transactions on Green Communications and Networking (TGCN). arXiv admin note: text overlap with arXiv:2306.14237

arXiv:2308.07441 [pdf]

Physics-Informed Deep Learning to Reduce the Bias in Joint Prediction of Nitrogen Oxides

Authors: Lianfa Li, Roxana Khalili, Frederick Lurmann, Nathan Pavlovic, Jun Wu, Yan Xu, Yisi Liu, Karl O'Sharkey, Beate Ritz, Luke Oman, Meredith Franklin, Theresa Bastain, Shohreh F. Farzan, Carrie Breton, Rima Habre

Abstract: Atmospheric nitrogen oxides (NOx) primarily from fuel combustion have recognized acute and chronic health and environmental effects. Machine learning (ML) methods have significantly enhanced our capacity to predict NOx concentrations at ground-level with high spatiotemporal resolution but may suffer from high estimation bias since they lack physical and chemical knowledge about air pollution dynamics. Chemical transport models (CTMs) leverage this knowledge; however, accurate predictions of ground-level concentrations typically necessitate extensive post-calibration. Here, we present a physics-informed deep learning framework that encodes advection-diffusion mechanisms and fluid dynamics constraints to jointly predict NO2 and NOx and reduce ML model bias by 21-42%. Our approach captures fine-scale transport of NO2 and NOx, generates robust spatial extrapolation, and provides explicit uncertainty estimation. The framework fuses knowledge-driven physicochemical principles of CTMs with the predictive power of ML for air quality exposure, health, and policy applications. Our approach offers significant improvements over purely data-driven ML methods and has unprecedented bias reduction in joint NO2 and NOx prediction.

Submitted 14 August, 2023; originally announced August 2023.

arXiv:2307.09182 [pdf, other]

doi 10.1145/3596907

Federated Learning for Computationally-Constrained Heterogeneous Devices: A Survey

Authors: Kilian Pfeiffer, Martin Rapp, Ramin Khalili, Jörg Henkel

Abstract: With an increasing number of smart devices like internet of things (IoT) devices deployed in the field, offloading training of neural networks (NNs) to a central server becomes more and more infeasible. Recent efforts to improve users' privacy have led to on-device learning emerging as an alternative. However, a model trained only on a single device, using only local data, is unlikely to reach a high accuracy. Federated learning (FL) has been introduced as a solution, offering a privacy-preserving trade-off between communication overhead and model accuracy by sharing knowledge between devices without disclosing the devices' private data. The applicability and the benefit of applying baseline FL are, however, limited in many relevant use cases due to the heterogeneity present in such environments. In this survey, we outline the heterogeneity challenges FL has to overcome to be widely applicable in real-world applications. We especially focus on the aspect of computation heterogeneity among the participating devices and provide a comprehensive overview of recent works on heterogeneity-aware FL. We discuss two groups: works that adapt the NN architecture and works that approach heterogeneity on a system level, covering Federated Averaging (FedAvg), distillation, and split learning-based approaches, as well as synchronous and asynchronous aggregation schemes.

Submitted 18 July, 2023; originally announced July 2023.

Journal ref: ACM Comput. Surv. 55, 14s, Article 334, 2023

arXiv:2306.14237 [pdf, other]

doi 10.1109/PIMRC56721.2023.10293863

A Safe Genetic Algorithm Approach for Energy Efficient Federated Learning in Wireless Communication Networks

Authors: Lina Magoula, Nikolaos Koursioumpas, Alexandros-Ioannis Thanopoulos, Theodora Panagea, Nikolaos Petropouleas, M. A. Gutierrez-Estevez, Ramin Khalili

Abstract: Federated Learning (FL) has emerged as a decentralized technique, where contrary to traditional centralized approaches, devices perform a model training in a collaborative manner, while preserving data privacy. Despite the existing efforts made in FL, its environmental impact is still under investigation, since several critical challenges regarding its applicability to wireless networks have been identified. Towards mitigating the carbon footprint of FL, the current work proposes a Genetic Algorithm (GA) approach, targeting the minimization of both the overall energy consumption of an FL process and any unnecessary resource utilization, by orchestrating the computational and communication resources of the involved devices, while guaranteeing a certain FL model performance target. A penalty function is introduced in the offline phase of the GA that penalizes the strategies that violate the constraints of the environment, ensuring a safe GA process. Evaluation results show the effectiveness of the proposed scheme compared to two state-of-the-art baseline solutions, achieving a decrease of up to 83% in the total energy consumption.

Submitted 5 July, 2023; v1 submitted 25 June, 2023; originally announced June 2023.

Comments: 6 pages, 6 figures, Accepted in IEEE PIMRC 2023 Conference, Latest revision with small corrections (typos etc.)

arXiv:2305.17005 [pdf, other]

Aggregating Capacity in FL through Successive Layer Training for Computationally-Constrained Devices

Authors: Kilian Pfeiffer, Ramin Khalili, Jörg Henkel

Abstract: Federated learning (FL) is usually performed on resource-constrained edge devices, e.g., with limited memory for the computation. If the required memory to train a model exceeds this limit, the device will be excluded from the training. This can lead to a lower accuracy as valuable data and computation resources are excluded from training, also causing bias and unfairness. The FL training process should be adjusted to such constraints. The state-of-the-art techniques propose training subsets of the FL model at constrained devices, reducing their resource requirements for training. But these techniques largely limit the co-adaptation among parameters of the model and are highly inefficient, as we show: it is actually better to train a smaller (less accurate) model by the system where all the devices can train the model end-to-end, than applying such techniques. We propose a new method that enables successive freezing and training of the parameters of the FL model at devices, reducing the training's resource requirements at the devices, while still allowing enough co-adaptation between parameters. We show through extensive experimental evaluation that our technique greatly improves the accuracy of the trained model (by 52.4 p.p.) compared with the state of the art, efficiently aggregating the computation capacity available on distributed devices.

Submitted 27 November, 2023; v1 submitted 26 May, 2023; originally announced May 2023.

Comments: accepted at NeurIPS'23

arXiv:2305.15092 [pdf, other]

doi 10.1145/3632775.3639589

FedZero: Leveraging Renewable Excess Energy in Federated Learning

Authors: Philipp Wiesner, Ramin Khalili, Dennis Grinwald, Pratik Agrawal, Lauritz Thamsen, Odej Kao

Abstract: Federated Learning (FL) is an emerging machine learning technique that enables distributed model training across data silos or edge devices without data sharing. Yet, FL inevitably introduces inefficiencies compared to centralized model training, which will further increase the already high energy usage and associated carbon emissions of machine learning in the future. One idea to reduce FL's carbon footprint is to schedule training jobs based on the availability of renewable excess energy that can occur at certain times and places in the grid. However, in the presence of such volatile and unreliable resources, existing FL schedulers cannot always ensure fast, efficient, and fair training. We propose FedZero, an FL system that operates exclusively on renewable excess energy and spare capacity of compute infrastructure to effectively reduce a training's operational carbon emissions to zero. Using energy and load forecasts, FedZero leverages the spatio-temporal availability of excess resources by selecting clients for fast convergence and fair participation. Our evaluation, based on real solar and load traces, shows that FedZero converges significantly faster than existing approaches under the mentioned constraints while consuming less energy. Furthermore, it is robust to forecasting errors and scalable to tens of thousands of clients.

Submitted 10 January, 2024; v1 submitted 24 May, 2023; originally announced May 2023.

Comments: Accepted for publication at ACM e-Energy '24

arXiv:2208.04237 [pdf, other]

doi 10.1016/j.comcom.2022.07.047

Multi-Agent Reinforcement Learning for Long-Term Network Resource Allocation through Auction: a V2X Application

Authors: Jing Tan, Ramin Khalili, Holger Karl, Artur Hecker

Abstract: We formulate offloading of computational tasks from a dynamic group of mobile agents (e.g., cars) as decentralized decision making among autonomous agents. We design an interaction mechanism that incentivizes such agents to align private and system goals by balancing between competition and cooperation. In the static case, the mechanism provably has Nash equilibria with optimal resource allocation. In a dynamic environment, this mechanism's requirement of complete information is impossible to achieve. For such environments, we propose a novel multi-agent online learning algorithm that learns with partial, delayed and noisy state information, thus greatly reducing information need. Our algorithm is also capable of learning from long-term and sparse reward signals with varying delay. Empirical results from the simulation of a V2X application confirm that through learning, agents with the learning algorithm significantly improve both system and individual performance, reducing up to 30% of offloading failure rate, communication overhead and load variation, increasing computation resource utilization and fairness. Results also confirm the algorithm's good convergence and generalization property in different environments.

Submitted 29 July, 2022; originally announced August 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:2204.02267

arXiv:2207.06537 [pdf, ps, other]

doi 10.1109/TVT.2022.3186910

Scheduling Out-of-Coverage Vehicular Communications Using Reinforcement Learning

Authors: Taylan Şahin, Ramin Khalili, Mate Boban, Adam Wolisz

Abstract: Performance of vehicle-to-vehicle (V2V) communications depends highly on the employed scheduling approach. While centralized network schedulers offer high V2V communication reliability, their operation is conventionally restricted to areas with full cellular network coverage. In contrast, in out-of-cellular-coverage areas, comparatively inefficient distributed radio resource management is used. To exploit the benefits of the centralized approach for enhancing the reliability of V2V communications on roads lacking cellular coverage, we propose VRLS (Vehicular Reinforcement Learning Scheduler), a centralized scheduler that proactively assigns resources for out-of-coverage V2V communications before vehicles leave the cellular network coverage. By training in simulated vehicular environments, VRLS can learn a scheduling policy that is robust and adaptable to environmental changes, thus eliminating the need for targeted (re-)training in complex real-life environments. We evaluate the performance of VRLS under varying mobility, network load, wireless channel, and resource configurations. VRLS outperforms the state-of-the-art distributed scheduling algorithm in zones without cellular network coverage by reducing the packet error rate by half in highly loaded conditions and achieving near-maximum reliability in low-load scenarios.

Submitted 13 July, 2022; originally announced July 2022.

Comments: This article has been accepted for publication in IEEE Transactions on Vehicular Technology. This is the author's version

arXiv:2204.02268 [pdf, other]

Learning to Bid Long-Term: Multi-Agent Reinforcement Learning with Long-Term and Sparse Reward in Repeated Auction Games

Authors: Jing Tan, Ramin Khalili, Holger Karl

Abstract: We propose a multi-agent distributed reinforcement learning algorithm that balances between potentially conflicting short-term reward and sparse, delayed long-term reward, and learns with partial information in a dynamic environment. We compare different long-term rewards to incentivize the algorithm to maximize individual payoff and overall social welfare. We test the algorithm in two simulated auction games, and demonstrate that 1) our algorithm outperforms two benchmark algorithms in a direct competition, with cost to social welfare, and 2) our algorithm's aggressive competitive behavior can be guided with the long-term reward signal to maximize both individual payoff and overall social welfare.

Submitted 5 April, 2022; originally announced April 2022.

arXiv:2204.02267 [pdf, other]

doi 10.1109/INFOCOM48880.2022.9796717

Multi-Agent Distributed Reinforcement Learning for Making Decentralized Offloading Decisions

Authors: Jing Tan, Ramin Khalili, Holger Karl, Artur Hecker

Abstract: We formulate computation offloading as a decentralized decision-making problem with autonomous agents. We design an interaction mechanism that incentivizes agents to align private and system goals by balancing between competition and cooperation. The mechanism provably has Nash equilibria with optimal resource allocation in the static case. For a dynamic environment, we propose a novel multi-agent online learning algorithm that learns with partial, delayed and noisy state information, and a reward signal that reduces information need to a great extent. Empirical results confirm that through learning, agents significantly improve both system and individual performance, e.g., 40% offloading failure rate reduction, 32% communication overhead reduction, up to 38% computation resource savings in low contention, 18% utilization increase with reduced load variation in high contention, and improvement in fairness. Results also confirm the algorithm's good convergence and generalization property in significantly different environments.

Submitted 5 April, 2022; originally announced April 2022.

arXiv:2203.05468 [pdf, other]

CoCoFL: Communication- and Computation-Aware Federated Learning via Partial NN Freezing and Quantization

Authors: Kilian Pfeiffer, Martin Rapp, Ramin Khalili, Jörg Henkel

Abstract: Devices participating in federated learning (FL) typically have heterogeneous communication, computation, and memory resources. However, in synchronous FL, all devices need to finish training by the same deadline dictated by the server. Our results show that training a smaller subset of the neural network (NN) at constrained devices, i.e., dropping neurons/filters as proposed by state of the art, is inefficient, preventing these devices to make an effective contribution to the model. This causes unfairness w.r.t the achievable accuracies of constrained devices, especially in cases with a skewed distribution of class labels across devices. We present a novel FL technique, CoCoFL, which maintains the full NN structure on all devices. To adapt to the devices' heterogeneous resources, CoCoFL freezes and quantizes selected layers, reducing communication, computation, and memory requirements, whereas other layers are still trained in full precision, enabling to reach a high accuracy. Thereby, CoCoFL efficiently utilizes the available resources on devices and allows constrained devices to make a significant contribution to the FL system, increasing fairness among participants (accuracy parity) and significantly improving the final accuracy of the model.

Submitted 28 June, 2023; v1 submitted 10 March, 2022; originally announced March 2022.

Comments: Published at TMLR

Journal ref: Transactions on Machine Learning Research, 06/2023

arXiv:2112.08761 [pdf, other]

DISTREAL: Distributed Resource-Aware Learning in Heterogeneous Systems

Authors: Martin Rapp, Ramin Khalili, Kilian Pfeiffer, Jörg Henkel

Abstract: We study the problem of distributed training of neural networks (NNs) on devices with heterogeneous, limited, and time-varying availability of computational resources. We present an adaptive, resource-aware, on-device learning mechanism, DISTREAL, which is able to fully and efficiently utilize the available resources on devices in a distributed manner, increasing the convergence speed. This is achieved with a dropout mechanism that dynamically adjusts the computational complexity of training an NN by randomly dropping filters of convolutional layers of the model. Our main contribution is the introduction of a design space exploration (DSE) technique, which finds Pareto-optimal per-layer dropout vectors with respect to resource requirements and convergence speed of the training. Applying this technique, each device is able to dynamically select the dropout vector that fits its available resource without requiring any assistance from the server. We implement our solution in a federated learning (FL) system, where the availability of computational resources varies both between devices and over time, and show through extensive evaluation that we are able to significantly increase the convergence speed over the state of the art without compromising on the final accuracy.

Submitted 4 April, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

Comments: published in AAAI Conference on Artificial Intelligence (AAAI'22)

arXiv:2006.05403 [pdf, other]

Distributed Learning on Heterogeneous Resource-Constrained Devices

Authors: Martin Rapp, Ramin Khalili, Jörg Henkel

Abstract: We consider a distributed system, consisting of a heterogeneous set of devices, ranging from low-end to high-end. These devices have different profiles, e.g., different energy budgets, or different hardware specifications, determining their capabilities on performing certain learning tasks. We propose the first approach that enables distributed learning in such a heterogeneous system. Applying our approach, each device employs a neural network (NN) with a topology that fits its capabilities; however, part of these NNs share the same topology, so that their parameters can be jointly learned. This differs from current approaches, such as federated learning, which require all devices to employ the same NN, enforcing a trade-off between achievable accuracy and computational overhead of training. We evaluate heterogeneous distributed learning for reinforcement learning (RL) and observe that it greatly improves the achievable reward on more powerful devices, compared to current approaches, while still maintaining a high reward on the weaker devices. We also explore supervised learning, observing similar gains.

Submitted 9 June, 2020; originally announced June 2020.

arXiv:1907.09319 [pdf, other]

VRLS: A Unified Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications

Authors: Taylan Şahin, Ramin Khalili, Mate Boban, Adam Wolisz

Abstract: Vehicle-to-vehicle (V2V) communications have distinct challenges that need to be taken into account when scheduling the radio resources. Although centralized schedulers (e.g., located on base stations) could be utilized to deliver high scheduling performance, they cannot be employed in case of coverage gaps. To address the issue of reliable scheduling of V2V transmissions out of coverage, we propose Vehicular Reinforcement Learning Scheduler (VRLS), a centralized scheduler that predictively assigns the resources for V2V communication while the vehicle is still in cellular network coverage. VRLS is a unified reinforcement learning (RL) solution, wherein the learning agent, the state representation, and the reward provided to the agent are applicable to different vehicular environments of interest (in terms of vehicular density, resource configuration, and wireless channel conditions). Such a unified solution eliminates the necessity of redesigning the RL components for a different environment, and facilitates transfer learning from one to another similar environment. We evaluate the performance of VRLS and show its ability to avoid collisions and half-duplex errors, and to reuse the resources better than the state of the art scheduling algorithms. We also show that pre-trained VRLS agent can adapt to different V2V environments with limited retraining, thus enabling real-world deployment in different scenarios.

Submitted 22 July, 2019; originally announced July 2019.

Comments: Article accepted to IEEE CAVS 2019

arXiv:1904.12653 [pdf, other]

doi 10.1109/VNC.2018.8628366

Reinforcement Learning Scheduler for Vehicle-to-Vehicle Communications Outside Coverage

Authors: Taylan Şahin, Ramin Khalili, Mate Boban, Adam Wolisz

Abstract: Radio resources in vehicle-to-vehicle (V2V) communication can be scheduled either by a centralized scheduler residing in the network (e.g., a base station in case of cellular systems) or a distributed scheduler, where the resources are autonomously selected by the vehicles. The former approach yields a considerably higher resource utilization in case the network coverage is uninterrupted. However, in case of intermittent coverage or out-of-coverage operation, vehicles lack input from the centralized scheduler and need to revert to distributed scheduling. Motivated by recent advances in reinforcement learning (RL), we investigate whether a centralized learning scheduler can be taught to efficiently pre-assign the resources to vehicles for out-of-coverage V2V communication. Specifically, we use the actor-critic RL algorithm to train the centralized scheduler to provide non-interfering resources to vehicles before they enter the out-of-coverage area. Our initial results show that an RL-based scheduler can achieve performance as good as or better than the state-of-the-art distributed scheduler, often outperforming it. Furthermore, the learning process completes within a reasonable time (ranging from a few hundred to a few thousand epochs), thus making the RL-based scheduler a promising solution for V2V communications with intermittent network coverage.

Submitted 29 April, 2019; originally announced April 2019.

Comments: Article published in IEEE VNC 2018

arXiv:1406.6772 [pdf, other]

doi 10.1145/2674005.2675007

MSPlayer: Multi-Source and multi-Path LeverAged YoutubER

Authors: Yung-Chih Chen, Don Towsley, Ramin Khalili

Abstract: Online video streaming through mobile devices has become extremely popular nowadays. YouTube, for example, reported that the percentage of its traffic streaming to mobile devices has soared from 6% to more than 40% over the past two years. Moreover, people are constantly seeking to stream high quality video for better experience while often suffering from limited bandwidth. Thanks to the rapid deployment of content delivery networks (CDNs), popular videos are now replicated at different sites, and users can stream videos from close-by locations with low latencies. As mobile devices nowadays are equipped with multiple wireless interfaces (e.g., WiFi and 3G/4G), aggregating bandwidth for high definition video streaming has become possible. We propose a client-based video streaming solution, MSPlayer, that takes advantage of multiple video sources as well as multiple network paths through different interfaces. MSPlayer reduces start-up latency and provides high quality video streaming and robust data transport in mobile scenarios. We experimentally demonstrate our solution on a testbed and through the YouTube video service.

Submitted 9 November, 2014; v1 submitted 26 June, 2014; originally announced June 2014.

Comments: accepted to ACM CoNEXT'14

ACM Class: C.2; C.2.1; C.4

arXiv:1401.0207 [pdf, other]

Urban Mobility Scaling: Lessons from `Little Data'

Authors: Galen Wilkerson, Ramin Khalili, Stefan Schmid

Abstract: Recent mobility scaling research, using new data sources, often relies on aggregated data alone. Hence, these studies face difficulties characterizing the influence of factors such as transportation mode on mobility patterns. This paper attempts to complement this research by looking at a category-rich mobility data set. In order to shed light on the impact of categories, as a case study, we use conventionally collected German mobility data. In contrast to `check-in'-based data, our results are not biased by Euclidean distance approximations. In our analysis, we show that aggregation can hide crucial differences between trip length distributions, when subdivided by categories. For example, we see that on an urban scale (0 to ~15 km), walking, versus driving, exhibits a highly different scaling exponent, thus universality class. Moreover, mode share and trip length are responsive to day-of-week and time-of-day. For example, in Germany, although driving is relatively less frequent on Sundays than on Wednesdays, trips seem to be longer. In addition, our work may shed new light on the debate between distance-based and intervening-opportunity mechanisms affecting mobility patterns, since mode may be chosen both according to trip length and urban form.

Submitted 6 January, 2014; v1 submitted 31 December, 2013; originally announced January 2014.

Comments: 6 pages, 9 figures

arXiv:1310.2748 [pdf, other]

Multi-Source Multi-Path HTTP (mHTTP): A Proposal

Authors: Juhoon Kim, Ramin Khalili, Anja Feldmann, Yung-Chih Chen, Don Towsley

Abstract: Today, most devices have multiple network interfaces. Coupled with wide-spread replication of popular content at multiple locations, this provides substantial path diversity in the Internet. We propose Multi-source Multipath HTTP, mHTTP, which takes advantage of all existing types of path diversity in the Internet. mHTTP needs only client-side but not server-side or network modifications as it is a receiver-oriented mechanism. Moreover, the modifications are restricted to the socket interface. Thus, no changes are needed to the applications or to the kernel. As mHTTP relies on HTTP range requests, it is specific to HTTP which accounts for more than 60% of the Internet traffic. We implement mHTTP and study its performance by conducting measurements over a testbed and in the wild. Our results show that mHTTP indeed takes advantage of all types of path diversity in the Internet, and that it is a viable alternative to Multipath TCP for HTTP traffic. mHTTP decreases download times for large objects up to 50%, whereas it does no harm to small object downloads.

Submitted 10 December, 2013; v1 submitted 10 October, 2013; originally announced October 2013.

Comments: 12 pages

ACM Class: C.2.0; C.4.0

arXiv:1307.7584 [pdf, ps, other]

Towards a System Theoretic Approach to Wireless Network Capacity in Finite Time and Space

Authors: Florin Ciucu, Ramin Khalili, Yuming Jiang, Liu Yang, Yong Cui

Abstract: In asymptotic regimes, both in time and space (network size), the derivation of network capacity results is grossly simplified by brushing aside queueing behavior in non-Jackson networks. This simplifying double-limit model, however, lends itself to conservative numerical results in finite regimes. To properly account for queueing behavior beyond a simple calculus based on average rates, we advocate a system theoretic methodology for the capacity problem in finite time and space regimes. This methodology also accounts for spatial correlations arising in networks with CSMA/CA scheduling and it delivers rigorous closed-form capacity results in terms of probability distributions. Unlike numerous existing asymptotic results, subject to anecdotal practical concerns, our transient one can be used in practical settings: for example, to compute the time scales at which multi-hop routing is more advantageous than single-hop routing.

Submitted 1 August, 2013; v1 submitted 29 July, 2013; originally announced July 2013.

Showing 1–21 of 21 results for author: Khalili, R