Meta Reinforcement Learning for Resource Allocation in Multi-Antenna UAV Network with Rate Splitting Multiple Access
Authors:
Hosein Zarini,
Maryam Farajzadeh Dehkordi,
Armin Farhadi,
Mohammad Robat Mili,
Ali Movaghar,
Mehdi Rasti,
Yonghui Li,
Kai-Kit Wong
Abstract:
Unmanned aerial vehicles (UAVs) with multiple antennas have recently been explored to improve capacity in wireless networks. However, the strict energy constraint of UAVs, given their simultaneous flying and communication tasks, renders the exploration of energy-efficient multi-antenna techniques indispensable for UAVs. Meanwhile, the lens antenna subarray (LAS) has emerged as a promising energy-efficient solution that has not previously been harnessed for this purpose. In this paper, we propose a LAS-aided multi-antenna UAV to serve ground users in the downlink transmission of the terahertz (THz) band, utilizing rate splitting multiple access (RSMA) for effective beam division multiplexing. We formulate an optimization problem of maximizing the total system spectral efficiency (SE). This involves optimizing the UAV's transmit beamforming and the common rate of RSMA. By recasting the optimization problem into a Markov decision process (MDP), we propose a deep deterministic policy gradient (DDPG)-based resource allocation mechanism tailored to capture problem dynamics and optimize its variables. Moreover, given the UAV's frequent mobility and consequential system reconfigurations, we fortify the trained DDPG model with a meta-learning strategy, enhancing its adaptability to system variations. Numerically, more than 20% energy efficiency gain is achieved by our proposed LAS-aided multi-antenna UAV equipped with 4 lenses, compared to a single-lens UAV. Simulations also demonstrate that at a signal-to-noise ratio (SNR) of 10 dB, the incorporation of RSMA results in a 22% SE enhancement over conventional orthogonal beam division multiple access. Furthermore, the overall system SE improves by 27% when meta-learning is employed for fine-tuning the conventional DDPG method in the literature.
Submitted 18 May, 2024;
originally announced May 2024.
Meta Reinforcement Learning for Resource Allocation in Unmanned Aerial Vehicles with MIMO Visible Light Communication
Authors:
Hosein Zarini,
Amir Mohammadi,
Maryam Farajzadeh Dehkordi,
Mohammad Robat Mili,
Bardia Safaei,
Ali Movaghar,
Merouane Debbah
Abstract:
This paper centers on a multiple-input multiple-output (MIMO) visible light communication (VLC) system, where an unmanned aerial vehicle (UAV) uses a light emitting diode (LED) array to serve photo-diode (PD)-equipped users for illumination and communication simultaneously. Given the battery limitation of the UAV and the considerable energy consumption of the LED array, a hybrid dimming control scheme is devised at the UAV that effectively controls the number of glared LEDs and thereby mitigates the overall energy consumption. To assess the performance of this system, a radio resource allocation problem is formulated for jointly optimizing the motion trajectory, transmit beamforming and LED selection at the UAV, assuming that channel state information (CSI) is partially available. By reformulating the optimization problem in Markov decision process (MDP) form, we propose a soft actor-critic (SAC) mechanism that captures the dynamics of the problem and optimizes its parameters. Additionally, given the frequent mobility of the UAV and the resulting substantial rearrangement of the system, we enhance the trained SAC model by integrating a meta-learning strategy that enables better adaptation to system variations. According to simulations, upgrading a single-LED UAV to an array of 10 LEDs yields 47% and 34% improvements in data rate and energy efficiency, respectively, albeit at the expense of 8% more power consumption.
Submitted 17 May, 2024;
originally announced May 2024.