A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC Orchestration

Authors: Fahri Wisnu Murti, Samad Ali, Matti Latva-aho

Abstract: Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) over commodity platforms to offer low-cost deployment and bring the services closer to end-users. In this paper, a joint O-RAN/MEC orchestration using a Bayesian deep reinforcement learning (RL)-based framework is proposed that jointly controls the O-RAN functional splits, the allocated resources and hosting locations of the O-RAN/MEC services across geo-distributed platforms, and the routing for each O-RAN/MEC data flow. The goal is to minimize the long-term overall network operation cost and maximize the MEC performance criterion while adapting to possibly time-varying O-RAN/MEC demands and resource availability. This orchestration problem is formulated as a Markov decision process (MDP). However, the system consists of multiple BSs that share the same resources and serve heterogeneous demands, where their parameters have non-trivial relations. Consequently, finding the exact model of the underlying system is impractical, and the formulated MDP results in a large state space with a multi-dimensional discrete action space. To address such modeling and dimensionality issues, a novel model-free RL agent is proposed for our solution framework. The agent is built from a Double Deep Q-network (DDQN) that tackles the large state space and is then combined with action branching, an action decomposition method that effectively addresses the multi-dimensional discrete action space with a linear increase in complexity. Further, an efficient exploration-exploitation strategy under a Bayesian framework using Thompson sampling is proposed to improve the learning performance and expedite its convergence. Trace-driven simulations are performed using an O-RAN-compliant model. The results show that our approach is data-efficient (i.e., converges faster) and increases the returned reward by 32% compared with its non-Bayesian version.
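The "linear increase in complexity" attributed to action branching can be illustrated with a small count. The sketch below is not taken from the paper: the function names and the branch sizes (six decision dimensions with four options each) are hypothetical, chosen only to contrast a single Q-head over every joint action with one head per action dimension.

```python
from math import prod


def joint_action_count(branch_sizes):
    """Outputs needed by one Q-head that enumerates every joint action."""
    return prod(branch_sizes)


def branched_output_count(branch_sizes):
    """Outputs needed when each action dimension has its own branch."""
    return sum(branch_sizes)


# Hypothetical setting (an assumption, not from the paper):
# 6 decision dimensions, 4 discrete options per dimension.
sizes = [4] * 6
print(joint_action_count(sizes))     # 4096 joint actions
print(branched_output_count(sizes))  # 24 branch outputs
```

Adding a seventh dimension multiplies the joint count by four but adds only four branch outputs, which is the sense in which the growth is linear.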

Submitted 26 December, 2023; originally announced December 2023.

Comments: This article is submitted to IEEE

arXiv:2208.05282 [pdf, other]

doi 10.1109/TNSM.2023.3292713

Deep Reinforcement Learning for Orchestrating Cost-Aware Reconfigurations of vRANs

Authors: Fahri Wisnu Murti, Samad Ali, George Iosifidis, Matti Latva-aho

Abstract: Virtualized Radio Access Networks (vRANs) are fully configurable and can be implemented at a low cost over commodity platforms to enable network management flexibility. In this paper, a novel vRAN reconfiguration problem is formulated to jointly reconfigure the functional splits of the base stations (BSs), locations of the virtualized central units (vCUs) and distributed units (vDUs), their resources, and the routing for each BS data flow. The objective is to minimize the long-term total network operation cost while adapting to the varying traffic demands and resource availability. Testbed measurements are performed to study the relationship between the traffic demands and computing resources; this relationship exhibits high variance and depends on the platform and its load. Consequently, finding the perfect model of the underlying system is non-trivial. Therefore, to solve the proposed problem, a deep reinforcement learning (RL)-based framework is proposed and developed using model-free RL approaches. Moreover, the problem consists of multiple BSs sharing the same resources, which results in a multi-dimensional discrete action space and leads to a combinatorial number of possible actions. To overcome this curse of dimensionality, an action branching architecture, which is an action decomposition method with a shared decision module followed by a neural network, is combined with the Dueling Double Deep Q-network (D3QN) algorithm. Simulations are carried out using an O-RAN compliant model and real traces of the testbed. Our numerical results show that the proposed framework successfully learns the optimal policy that adaptively selects the vRAN configurations, where its learning convergence can be further expedited through transfer learning even in different vRAN systems. It offers significant cost savings of up to 59% relative to a static benchmark, 35% relative to DDPG with discretization, and 76% relative to non-branching D3QN.

Submitted 3 July, 2023; v1 submitted 10 August, 2022; originally announced August 2022.

Comments: This article has been accepted for publication in IEEE Transactions on Network and Service Management

arXiv:2205.07518 [pdf, other]

doi 10.1109/EuCNC/6GSummit54941.2022.9815815

Learning-Based Orchestration for Dynamic Functional Split and Resource Allocation in vRANs

Authors: Fahri Wisnu Murti, Samad Ali, George Iosifidis, Matti Latva-aho

Abstract: One of the key benefits of virtualized radio access networks (vRANs) is network management flexibility. However, this versatility raises previously-unseen network management challenges. In this paper, a learning-based zero-touch vRAN orchestration framework (LOFV) is proposed to jointly select the functional splits and allocate the virtualized resources to minimize the long-term management cost. First, testbed measurements of the behaviour between the users' demand and the virtualized resource utilization are collected using a centralized RAN system. The collected data reveal that there are non-linear and non-monotonic relationships between demand and resource utilization. Then, a comprehensive cost model is proposed that takes resource overprovisioning, declined demand, instantiation and reconfiguration into account. Moreover, the proposed cost model also captures different routing and computing costs for each split. Motivated by our measurement insights and cost model, LOFV is developed using a model-free reinforcement learning paradigm. The proposed solution is constructed from a combination of deep Q-learning and a regression-based neural network that maps the network state and users' demand into split and resource control decisions. Our numerical evaluations show that LOFV can offer cost savings of up to 69% relative to the optimal static policy and 45% relative to the optimal fully dynamic policy.

Submitted 16 May, 2022; originally announced May 2022.

Comments: This paper has been accepted in Proc. of The 2022 Joint European Conference on Networks and Communications (EuCNC) & 6G Summit

arXiv:2106.00011 [pdf, other]

doi 10.1109/TWC.2022.3179811

Constrained Deep Reinforcement Based Functional Split Optimization in Virtualized RANs

Authors: Fahri Wisnu Murti, Samad Ali, Matti Latva-aho

Abstract: In a virtualized radio access network (vRAN), the base station (BS) functions are decomposed into virtualized components that can be hosted at the centralized unit or distributed units through functional splits. Such flexibility has many benefits; however, it also requires solving the problem of finding the optimal splits of functions of the BSs in such a way that minimizes the total network cost. The underlying vRAN system is complex and precise modelling of it is not trivial. Formulating the functional split problem to minimize the cost results in a combinatorial problem that is provably NP-hard, and solving it is computationally expensive. In this paper, a constrained deep reinforcement learning (RL) approach is proposed to solve the problem with minimal assumptions about the underlying system. Since in deep RL the action selection is the outcome of inference of a neural network, it can be done in real time, while training to update the neural networks can be done in the background. However, since the problem is combinatorial, even for a small number of functions, the action space of the RL problem becomes large. Therefore, to deal with such a large action space, a chain rule-based stochastic policy is exploited in which a long short-term memory (LSTM) network-based sequence-to-sequence model is applied to estimate the policy that selects the functional split actions. However, the utilized policy is still limited to an unconstrained problem, and each split decision is bounded by the vRAN's constraint requirements. Hence, a constrained policy gradient method is leveraged to train and guide the policy toward constraint satisfaction. Further, a search strategy by greedy decoding or temperature sampling is utilized to improve the optimality performance at test time. Simulations are performed to evaluate the performance of the proposed solution using synthetic and real network datasets.

Submitted 3 June, 2022; v1 submitted 31 May, 2021; originally announced June 2021.

Comments: This article has been accepted for publication in IEEE Transactions on Wireless Communications

arXiv:2105.14731 [pdf, other]

doi 10.1109/ICCWorkshops50388.2021.9473703

Deep Reinforcement Based Optimization of Function Splitting in Virtualized Radio Access Networks

Authors: Fahri Wisnu Murti, Samad Ali, Matti Latva-aho

Abstract: Virtualized Radio Access Network (vRAN) is one of the key enablers of future wireless networks as it brings agility to the radio access network (RAN) architecture and offers degrees of design freedom. Yet, it also creates a challenging problem of how to design the functional split configuration. In this paper, a deep reinforcement learning approach is proposed to optimize function splitting in vRAN. A learning paradigm is developed that optimizes the location of functions in the RAN. These functions can be placed either at a central/cloud unit (CU) or a distributed unit (DU). This problem is formulated as constrained neural combinatorial reinforcement learning to minimize the total network cost. In this solution, a policy gradient method with Lagrangian relaxation is applied that uses a stacked long short-term memory (LSTM) neural network architecture to approximate the policy. Then, a sampling technique with a temperature hyperparameter is applied for the inference process. The results show that our proposed solution can learn the optimal function split decision and solve the problem with a 0.4% optimality gap. Moreover, our method can reduce the cost by up to 320% compared to a distributed-RAN (D-RAN). We also conclude that altering the traffic load and routing cost does not significantly degrade the optimality performance.
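Sampling with a temperature hyperparameter, as mentioned in the abstract above, is commonly done by dividing the policy's logits by the temperature before the softmax. The following is a minimal generic sketch of that idea, not the authors' implementation; the function names, the example logits, and the two-option CU/DU reading of the output are assumptions made for illustration.

```python
import math
import random


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, temperature, rng):
    """Draw one action index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Hypothetical logits for one placement decision (index 0 = CU, 1 = DU).
logits = [1.2, 0.3]
rng = random.Random(0)
print(softmax_with_temperature(logits, 1.0))
print(softmax_with_temperature(logits, 0.1))
print(sample_action(logits, 1.0, rng))
```

As the temperature approaches zero the distribution concentrates on the highest logit, so sampling behaves like greedy decoding; larger temperatures spread probability across more candidate decisions.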

Submitted 31 May, 2021; originally announced May 2021.

Comments: This paper has been accepted in IEEE International Conference on Communications Workshops (ICC Workshops) 2021

arXiv:2002.10681 [pdf, other]

doi 10.1109/ICC40277.2020.9149318

On the Optimization of Multi-Cloud Virtualized Radio Access Networks

Authors: Fahri Wisnu Murti, Andres Garcia-Saavedra, Xavier Costa-Perez, George Iosifidis

Abstract: We study the important and challenging problem of virtualized radio access network (vRAN) design in its most general form. We develop an optimization framework that decides the number and deployment locations of central/cloud units (CUs); which distributed units (DUs) each of them will serve; the functional split that each BS will implement; and the network paths for routing the traffic to CUs and the network core. Our design criterion is to minimize the operator's expenditures while serving the expected traffic. To this end, we combine a linearization technique with a cutting-planes method in order to expedite the exact solution of the formulated problem. We evaluate our framework using real operational networks and system measurements, and conduct an exhaustive parameter-sensitivity analysis. We find that the benefits when departing from single-CU deployments can be as high as 30% for our networks, but these gains diminish with the further addition of CUs. Our work sheds light on the vRAN design from a new angle, highlights the importance of deploying multiple CUs, and offers a rigorous framework for optimizing the costs of multi-CU vRANs.

Submitted 26 February, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

Comments: This preprint is to be published in Proc. of IEEE International Conference on Communications (ICC) 2020

Report number: June 2020, pp. 1-7

Journal ref: 2020 IEEE International Conference on Communications (ICC)

arXiv:1812.10266 [pdf, ps, other]

doi 10.1002/dac.4533

Exploiting non-orthogonal multiple access in downlink coordinated multipoint transmission with the presence of imperfect channel state information

Authors: Fahri Wisnu Murti, Rahmat Faddli Siregar, Muhammad Royyan, Soo Young Shin

Abstract: In this paper, the impact of imperfect channel state information (CSI) on a downlink coordinated multipoint (CoMP) transmission system with non-orthogonal multiple access (NOMA) is investigated, since perfect knowledge of a channel cannot be guaranteed in practice. Furthermore, the channel estimation error is applied to estimate the channel information, wherein its variance is assumed to be known a priori. The impact of the number of coordinated base stations (BSs) on downlink CoMP NOMA is investigated. Users are classified into one of two groups according to their position within the cell, namely cell-center user (CCU) and cell-edge user (CEU). In this paper, ergodic capacity and sum capacity for both CCU and CEU are derived in closed form. In addition, various experiments are conducted with different parameters such as SNR, error variance, and power allocation to show their impact on the CoMP method. The results show that CoMP NOMA outperforms CoMP orthogonal multiple access (OMA), wherein the condition of the channel impacts the performance of CoMP NOMA less. It is worth noting that a higher number of coordinated BSs enhances the total capacity of CoMP NOMA. Finally, the performance analysis is validated by the close agreement between the analytical and simulation results.

Submitted 26 February, 2020; v1 submitted 26 December, 2018; originally announced December 2018.

Comments: Minor Revision Wiley International Journal of Communication Systems

Journal ref: International Journal of Communication Systems. 2020

Showing 1–7 of 7 results for author: Murti, F W