License: arXiv.org perpetual non-exclusive license
arXiv:2401.10253v2 [cs.NI] 18 Mar 2024

Hybrid-Task Meta-Learning: A Graph Neural Network Approach for Scalable and Transferable Bandwidth Allocation

Xin Hao, Changyang She, Phee Lep Yeoh, Yuhong Liu, Branka Vucetic, and Yonghui Li

Part of this work was presented at the 2023 IEEE International Conference on Communications Workshops (ICC Workshops) [1]. (Corresponding author: Changyang She.)

Abstract

In this paper, we develop a deep learning-based bandwidth allocation policy that is: 1) scalable with the number of users and 2) transferable to different communication scenarios, such as non-stationary wireless channels, different quality-of-service (QoS) requirements, and dynamically available resources. To support scalability, the bandwidth allocation policy is represented by a graph neural network (GNN), with which the number of training parameters does not change with the number of users. To enable the generalization of the GNN, we develop a hybrid-task meta-learning (HML) algorithm that trains the initial parameters of the GNN with different communication scenarios during meta-training. Next, during meta-testing, a few samples are used to fine-tune the GNN with unseen communication scenarios. Simulation results demonstrate that our HML approach can improve the initial performance by 8.79%, and sampling efficiency by 73%, compared with existing benchmarks. After fine-tuning, our near-optimal GNN-based policy can achieve close to the same reward with much lower inference complexity compared to the optimal policy obtained using iterative optimization.

Index Terms:
Bandwidth allocation, graph neural network, meta-learning, quality-of-service.

I Introduction

Throughout the rapid evolution of wireless communication systems, the spectral efficiency, which is the amount of information that can be transmitted over a given bandwidth while maintaining a certain quality of service (QoS) level, remains one of the most critical performance metrics for future sixth-generation (6G) wireless communications [1, 2]. To maximize spectral efficiency, low-complexity bandwidth allocation solutions are critical for real-time decision-making within each transmission time interval (TTI), which can be shorter than one millisecond in current fifth-generation (5G) wireless communications. Furthermore, the number of users requesting bandwidth in each TTI is stochastic [3, 4], each user may have different QoS requirements [6, 5, 7], and wireless channels are non-stationary [8, 9], making it difficult to develop a low-complexity bandwidth allocation policy that is scalable with the number of users and can satisfy a diverse range of communication scenarios.

Existing iterative optimization algorithms can obtain optimal bandwidth allocation policies, but their computational complexity is generally too high to be implemented in real time [10, 11, 12]. To reduce the computational complexity, deep learning is a promising approach for 6G communications [14, 13]. The idea is to train a deep neural network that maps the network status to the optimal decision. After training, the deep neural network can be used in communication systems for real-time decision-making, referred to as inference [15]. Although deep learning has much lower inference complexity compared with iterative optimization algorithms, existing deep learning solutions using fully connected neural networks (FNNs) are not scalable to different numbers of users in wireless networks [16]. This is because the number of training parameters of an FNN depends on the dimensions of the input and output, which change with the number of users. Thus, a well-trained FNN is not applicable in wireless networks with stochastic user requests. In contrast to FNNs, graph neural networks (GNNs) have a number of training parameters that does not depend on the number of users [17], making them highly suitable for developing scalable deep learning-based resource allocation solutions for wireless networks [19, 18]. However, improving the generalization ability of GNNs in wireless networks with diverse QoS requirements remains an open problem.
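The scalability argument above can be made concrete with a small parameter count. The following is a minimal sketch, not the architecture used in this paper: it assumes a one-hidden-layer FNN whose input stacks the features of all users, and compares it with a hypothetical per-node update function whose weights are shared across users. The function names and layer sizes are illustrative assumptions.

```python
def fnn_param_count(num_users: int, features_per_user: int, hidden: int) -> int:
    """Weights and biases of a one-hidden-layer FNN that maps the stacked
    features of all users to one bandwidth value per user."""
    in_dim = num_users * features_per_user
    out_dim = num_users
    return in_dim * hidden + hidden + hidden * out_dim + out_dim


def shared_node_param_count(features_per_user: int, hidden: int) -> int:
    """Weights and biases of a hypothetical per-node update function that takes
    a node's own feature plus one aggregated neighbour feature and outputs a
    scalar. The same weights are reused for every user."""
    in_dim = 2 * features_per_user
    return in_dim * hidden + hidden + hidden * 1 + 1


for users in (10, 20, 40):
    print(users, fnn_param_count(users, 1, 32), shared_node_param_count(1, 32))
```

Under these assumptions, the FNN count grows linearly with the number of users, so a network trained for one user count cannot be reused for another, whereas the shared-weight count stays fixed.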

A key 5G application that requires flexible resource allocation solutions is network slicing, where resources from a shared physical infrastructure are partitioned into distinct network slices supporting diverse QoS requirements, such as data rate [21, 20], latency [22, 23], and security [25, 26, 24, 27], in both long and short coding blocklength regimes [30, 28, 29]. To reserve resources for a single slice, the authors of [31] proposed to compute the weights of different slices based on the corresponding QoS requirements and the number of service requests. With this approach, the amount of reserved resources for each slice is stochastic. Meanwhile, since the wireless channels are non-stationary, the reserved resources and the wireless channels in the training stage could be different from the actual required resources in the testing stage [32, 33]. As such, the mismatch between training data samples and testing data samples remains a crucial bottleneck for implementing efficient learning-based policies in practical wireless networks.

Recent works have proposed to reduce the online training time by transfer learning, which involves offline pre-training and online fine-tuning [10]. This method effectively reuses previously well-trained neural network features and significantly improves the sample efficiency. To further improve the online training efficiency for unseen tasks, meta-learning has been proposed [34, 37, 35, 36]. One of the meta-learning algorithms, model-agnostic meta-learning (MAML), has been applied to solve policy mismatch issues caused by varying user requests and non-stationary wireless channels [39, 38, 8, 9]. While these aforementioned works have highlighted the generalization ability of meta-learning for non-stationary wireless resource allocation, no works have addressed the impact of diverse QoS requirements in different communication scenarios.

In this paper, we put forth a low-complexity bandwidth allocation framework by designing a GNN that is scalable with the number of users and applying meta-learning to generalize the GNN to different communication scenarios. The main contributions are summarized as follows:

  • Our proposed GNN is designed to handle six diverse QoS requirements of data rate, latency, and security in each of the long and short coding blocklength regimes. This generalization is achieved by using feature engineering to translate the channel state information (CSI) and customized QoS requirement of individual users into the minimum required bandwidth.

  • Based on the extracted feature of minimum required bandwidth, we design a GNN-based bandwidth allocation policy that is scalable to the number of users. To train the GNN, we apply an unsupervised learning method to maximize the sum reward of the users with different QoS requirements in a network-slicing architecture.

  • The optimal bandwidth allocation policies are obtained based on an iterative optimization algorithm to obtain the performance limit of the GNN-based policy in terms of the sum reward. By analyzing the computational complexity, we show that the GNN has a much lower inference complexity compared with the iterative optimization algorithm that is optimal.

  • Finally, we develop our generalized hybrid-task meta-learning (HML) algorithm that is transferable to different communication scenarios by using meta-training to train the initial parameters of the GNN. We note that only a few samples are required to fine-tune the parameters of the GNN in meta-testing, which validates that our GNN-based policy initialized by HML can be efficiently transferred to previously unseen communication scenarios. Simulation results show that our GNN-based policy achieves near-optimal performance and HML significantly outperforms the three considered benchmarks of MAML, MTL transfer (multi-task learning-based transfer learning), and random initialization.

In our simulations, the gap between the sum reward achieved by the GNN-based policy and that of the optimal bandwidth allocation policy obtained from the iterative optimization algorithm is less than 6%. HML also improves the initial performance by up to 8.79% and sample efficiency by up to 73% compared with the MAML benchmark. We also show that the performance gains of HML are even higher when compared to the other two benchmarks.

II Related Works

II-A Deep Learning for Resource Allocation in Wireless Communications

Applying deep learning for resource allocation in wireless networks has been widely studied in the existing literature [15, 16]. In [15], the authors showed that learning-based algorithms could obtain near-optimal solutions with low computational complexity in inference. In [16], the authors proposed an FNN-based unsupervised learning algorithm to optimize the bandwidth allocation policy. More recently, because FNNs are not scalable to the number of users, GNNs have been applied to wireless network optimization [19, 18]. In [18], the authors designed a GNN, which is scalable to the number of users in a wireless network, to minimize the sum of the queuing delay violation probability and the packet loss probability. In [19], the authors developed scalable GNN-based methods to solve radio resource management problems.

TABLE I: Considered QoS Requirements in Related Works
Refs; QoS columns: Data rate (Long, Short), Latency (Long, Short), Security (Long, Short)
[21, 20]: ✓
[22, 23]: ✓
[24, 25, 26]: ✓
[27]: ✓
[28]: ✓ ✓
[30]: ✓ ✓
[10]: ✓ ✓ ✓

II-B Generalization of Deep Learning Policies in Non-Stationary Wireless Networks

In wireless networks, the user requests, wireless channels, and available resources for each type of service can be non-stationary. Table I summarizes some QoS requirements considered in the related works. For example, data rate, latency, and security have been investigated in [20, 21, 22, 23, 24, 25, 26]. These papers mainly focus on scenarios with long channel coding blocklengths, where the achievable rate of a wireless link can be approximated by the Shannon capacity. In 5G, the coding blocklength can be short, and Shannon capacity is not applicable. As such, the authors of [27, 28] established how to optimize wireless communication systems using the achievable rate in the short blocklength regime [29]. Meanwhile, different services may co-exist in one network, and the authors of [30, 10] considered different QoS requirements in both long and short blocklength regimes. To support diverse QoS requirements in network slicing, the authors of [31] proposed to reserve bandwidth for different slices based on the number of users and the required QoS.

Further considering that the number of requests, the reserved resources, and the wireless channels are dynamic, improving the generalization ability of deep learning policies has attracted significant research interest in recent years. One approach to address this challenge is to carefully initialize the neural network and fine-tune it online. The authors of [10] applied transfer learning to fine-tune the parameters of deep neural networks that are trained offline in dynamic wireless networks. To further improve the sample efficiency in an unseen communication scenario, meta-learning has been adopted in [8, 9, 38, 39], where the hyper-parameters of a deep neural network, such as the initial parameters, are updated according to a set of communication scenarios in meta-training. In [38], meta-learning was applied to optimize computing resource allocation policies in mobile edge computing networks to fit both time-varying wireless channels and different requests of computing tasks. In [39], meta-learning was applied in virtual reality to quickly adapt to user movement patterns that change over time. To improve the training efficiency in non-stationary vehicle networks, the authors in [8] proposed optimizing the beamforming using meta-learning. In [9], the authors combined meta-learning and support vector regression to extract the features for beamforming optimization, further improving training efficiency over non-stationary channels.

III System Model and Problem Formulation

We consider an uplink orthogonal-frequency-division-multiple-access communication system with network slicing where $U$ users are requesting different types of services from one base station (BS). The BS first reserves bandwidth for each type of service according to the QoS requirement and the number of users. Then, it allocates bandwidth to different users within each slice. The resource reservation for different slices has been extensively studied in the existing literature, so we will focus on developing bandwidth allocation policies for individual slices with different numbers of users, non-stationary wireless channels, and dynamic available bandwidth.

III-A Different QoS in Infinite and Short Blocklength Regimes

To investigate the generalization ability of our proposed bandwidth allocation policy, we consider both long and short blocklength regimes with three types of QoS requirements, i.e., data rate, queuing delay, and security. Thus, there are six scenarios in total. We denote the reward of the $u$-th user by

$$r_u^{\Phi,\xi}, \quad \Phi \in \{D, E, S\} \text{ and } \xi \in \{\mathcal{I}, \mathcal{F}\}, \tag{1}$$

where the superscripts $D, E, S$ represent data rate, effective capacity with queuing delay constraint, and secrecy rate, respectively, whilst the superscripts $\mathcal{I}, \mathcal{F}$ represent the scenarios in the infinite (long) and finite (short) blocklength regimes, respectively.

III-A1 Data Rate Requirement

When the blocklength is long, the data rate reward of the $u$-th user can be expressed as

$$r_u^{D,\mathcal{I}} = \frac{w_u}{\ln 2}\ln\left(1+\frac{P_u h_u}{w_u N_0}\right), \tag{2}$$

where $w_u$ is the bandwidth allocated to the $u$-th user, $P_u$ is the transmit power of the $u$-th user, $N_0$ is the single-sided noise spectral density, and $h_u = \alpha_u g_u$ is the channel gain, where $\alpha_u$ and $g_u$ represent the large-scale and small-scale channel gains between the $u$-th user and the BS, respectively.

When the blocklength is short, decoding errors cannot be neglected. As such, the data rate reward of the $u$-th user can be approximated by [29]

$$r_u^{D,\mathcal{F}} \approx r_u^{D,\mathcal{I}} - \sqrt{\frac{V_u}{L_u}}\,\frac{f_Q^{-1}(\epsilon_u)}{\ln 2 / w_u}, \tag{3}$$

where $V_u = 1 - \left(1+\frac{P_u h_u}{w_u N_0}\right)^{-2}$ is the channel dispersion that measures the stochastic variability of the channel relative to a deterministic channel with the same capacity, $L_u = T_{\mathrm{s}} w_u$ is the blocklength, and $T_{\mathrm{s}}$ is the transmission duration of each coding block. The function $f_Q^{-1}(x)$ is the inverse of the Gaussian Q-function, and $\epsilon_u$ is the decoding error probability.
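As a numerical illustration of (2) and (3), the following minimal Python sketch evaluates both data rates for a single user. The function names and the parameter values (100 kHz bandwidth, 0.2 W transmit power, a channel gain of $10^{-12}$, a 0.5 ms coding block) are assumptions chosen only for illustration and are not taken from the simulation settings of this paper. The inverse Q-function is computed from the standard normal quantile function.

```python
import math
from statistics import NormalDist


def q_inv(x: float) -> float:
    """Inverse Gaussian Q-function: Q^{-1}(x) = Phi^{-1}(1 - x)."""
    return NormalDist().inv_cdf(1.0 - x)


def rate_long(w: float, p: float, h: float, n0: float) -> float:
    """Eq. (2): data rate in the long blocklength regime (bit/s)."""
    snr = p * h / (w * n0)
    return w / math.log(2) * math.log(1.0 + snr)


def rate_short(w: float, p: float, h: float, n0: float,
               t_s: float, eps: float) -> float:
    """Eq. (3): normal approximation in the short blocklength regime (bit/s)."""
    snr = p * h / (w * n0)
    dispersion = 1.0 - (1.0 + snr) ** -2       # V_u
    blocklength = t_s * w                      # L_u = T_s * w_u
    penalty = math.sqrt(dispersion / blocklength) * q_inv(eps) / (math.log(2) / w)
    return rate_long(w, p, h, n0) - penalty


# Illustrative (assumed) values, not the paper's settings.
w, p, h, n0 = 1e5, 0.2, 1e-12, 4e-21
print(f"long:  {rate_long(w, p, h, n0):.0f} bit/s")
print(f"short: {rate_short(w, p, h, n0, t_s=5e-4, eps=1e-5):.0f} bit/s")
```

For any decoding error probability below 0.5 the penalty term is positive, so the short-blocklength rate is strictly below the Shannon rate at the same bandwidth.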

III-A2 Latency Requirement

When considering latency constraints due to queuing delays, the effective capacity is applied to characterize the statistical QoS requirement in wireless communications, and is expressed as [28]

$$r_u^{E,\xi} = -\frac{1}{\vartheta_u T_{\mathrm{c}}}\ln\left(\mathbb{E}_{g_u}\left[\exp\left(-\vartheta_u T_{\mathrm{c}}\, r_u^{D,\xi}\right)\right]\right), \quad \xi \in \{\mathcal{I}, \mathcal{F}\}, \tag{4}$$

where $T_{\mathrm{c}}$ is the channel coherence time, $\vartheta_u$ is the QoS exponent for queuing delay, $\mathbb{E}[\cdot]$ denotes the expectation, and $r_u^{D,\xi}$ is the data rate in (2) or (3). We note that $\vartheta_u = \frac{\ln(1/\varepsilon_u)}{a_u \tau_{\max}}$ is determined by the maximum tolerable delay bound violation probability, $\varepsilon_u$, the packet arrival rate, $a_u$, and the threshold of queuing delay, $\tau_{\max}$.
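The expectation in (4) generally has no closed form, so one way to evaluate it is by Monte Carlo sampling of the small-scale channel gain. The sketch below is an illustration under stated assumptions: it assumes Rayleigh fading (so that $g_u$ is exponentially distributed with unit mean), uses the long-blocklength rate in (2), and picks arbitrary parameter values that are not the paper's settings. The helper names are hypothetical. A log-sum-exp step is used so that the exponentials do not underflow.

```python
import math
import random


def qos_exponent(eps_delay: float, arrival_rate: float, tau_max: float) -> float:
    """QoS exponent: ln(1 / eps) / (a * tau_max)."""
    return math.log(1.0 / eps_delay) / (arrival_rate * tau_max)


def effective_capacity(rates, theta: float, t_c: float) -> float:
    """Sample estimate of Eq. (4) from a list of per-realization data rates."""
    exponents = [-theta * t_c * r for r in rates]
    peak = max(exponents)  # log-sum-exp shift for numerical stability
    log_mean = peak + math.log(
        sum(math.exp(e - peak) for e in exponents) / len(exponents))
    return -log_mean / (theta * t_c)


def rate_long(w: float, p: float, h: float, n0: float) -> float:
    """Eq. (2): data rate in the long blocklength regime (bit/s)."""
    return w / math.log(2) * math.log(1.0 + p * h / (w * n0))


# Assumed setup: Rayleigh fading, large-scale gain alpha, arbitrary QoS values.
rng = random.Random(0)
alpha = 1e-12
gains = [alpha * rng.expovariate(1.0) for _ in range(10_000)]
rates = [rate_long(1e5, 0.2, h, 4e-21) for h in gains]
theta = qos_exponent(eps_delay=1e-3, arrival_rate=1e3, tau_max=1e-2)
ec = effective_capacity(rates, theta, t_c=1e-3)
print(f"mean rate: {sum(rates) / len(rates):.0f} bit/s, "
      f"effective capacity: {ec:.0f} bit/s")
```

By Jensen's inequality the estimate never exceeds the sample mean of the rates, and it approaches the worst sampled rate as the QoS exponent grows, which matches the interpretation of a stricter delay requirement.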

III-A3 Security Requirement

To formulate the wireless security requirement, we consider that there is an eavesdropper that attempts to wiretap the information transmitted by each user.

In the long blocklength regime, the secrecy rate of the $u$-th user can be expressed as [24]

$$r_u^{S,\mathcal{I}} = \left[r_u^{D,\mathcal{I}} - r_u^{e,\mathcal{I}}\right]^{+}, \tag{5}$$

where $[x]^{+} = \max\{0, x\}$, and $r_u^{e,\mathcal{I}} = \frac{w_u}{\ln 2}\ln\left(1+\frac{P_u h_u^e}{w_u N_0}\right)$ is the data rate of the wiretapped channel from the $u$-th user to the eavesdropper. The channel gain of the wiretapped channel is denoted by $h_u^e = \alpha_u^e g_u^e$, where $\alpha_u^e$ and $g_u^e$ represent the large-scale and small-scale channel gains between the $u$-th user and the eavesdropper, respectively.

In the short blocklength regime, the achievable secrecy rate of the $u$-th user can be approximated as [27]

$$r_u^{S,\mathcal{F}} = \begin{cases} r_u^{S,\mathcal{I}} - \sqrt{\dfrac{V_u}{L_u}}\,\dfrac{f_Q^{-1}(\epsilon_u)}{\ln 2 / w_u} - \sqrt{\dfrac{V_u^e}{L_u}}\,\dfrac{f_Q^{-1}(\delta_u)}{\ln 2 / w_u}, & h_u > h_u^e, \\ 0, & h_u \leq h_u^e, \end{cases} \tag{6}$$

where $V_u^e = 1 - \left(1+\frac{P_u h_u^e}{w_u N_0}\right)^{-2}$, and $\delta_u$ represents the information leakage, which describes the statistical independence between the transmitted confidential message and the eavesdropper's observation, and is measured by the total variation distance [27].
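The two secrecy-rate expressions in (5) and (6) can be evaluated with a few lines of Python. This is a minimal sketch with hypothetical function names and assumed parameter values. It implements (6) exactly as written, so the value in the first branch is not clipped at zero.

```python
import math
from statistics import NormalDist


def q_inv(x: float) -> float:
    """Inverse Gaussian Q-function."""
    return NormalDist().inv_cdf(1.0 - x)


def capacity(w: float, p: float, h: float, n0: float) -> float:
    """Shannon rate (bit/s) of a link with channel gain h."""
    return w / math.log(2) * math.log(1.0 + p * h / (w * n0))


def dispersion(w: float, p: float, h: float, n0: float) -> float:
    """Channel dispersion V = 1 - (1 + SNR)^(-2)."""
    return 1.0 - (1.0 + p * h / (w * n0)) ** -2


def secrecy_long(w, p, h, h_e, n0) -> float:
    """Eq. (5): [r^{D,I} - r^{e,I}]^+ ."""
    return max(0.0, capacity(w, p, h, n0) - capacity(w, p, h_e, n0))


def secrecy_short(w, p, h, h_e, n0, t_s, eps, delta) -> float:
    """Eq. (6): short-blocklength secrecy rate, zero when h <= h_e."""
    if h <= h_e:
        return 0.0
    blocklength = t_s * w                      # L_u
    scale = w / math.log(2)                    # equals 1 / (ln 2 / w_u)
    pen_main = math.sqrt(dispersion(w, p, h, n0) / blocklength) * q_inv(eps) * scale
    pen_eve = math.sqrt(dispersion(w, p, h_e, n0) / blocklength) * q_inv(delta) * scale
    return secrecy_long(w, p, h, h_e, n0) - pen_main - pen_eve


# Assumed values: the eavesdropper's channel is 10 dB weaker than the main link.
args = dict(w=1e5, p=0.2, h=1e-12, h_e=1e-13, n0=4e-21)
print(f"long:  {secrecy_long(**args):.0f} bit/s")
print(f"short: {secrecy_short(**args, t_s=5e-4, eps=1e-5, delta=1e-3):.0f} bit/s")
```

The second penalty term shows how a tighter information-leakage target (smaller $\delta_u$) reduces the achievable secrecy rate in the same way that a smaller decoding error probability does.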

III-B Bandwidth Reservation for Different Slices

We assume that there can be multiple bandwidth reservation policies for different slices in network slicing. Given the total bandwidth of the BS, $W_{\max}$, the bandwidth reserved for the $\tau$-th slice is given by

$$W_\tau^{\Phi,\xi} = f_\tau^{\mathrm{NS}}\left(U_\tau^{\Phi,\xi}, I_{u_\tau}^{\Phi,\xi}\right) \cdot W_{\max}, \tag{7}$$

where $U_\tau^{\Phi,\xi}$ is the number of users in the $\tau$-th slice, $I_{u_\tau}^{\Phi,\xi}$ is the QoS class identifier (QCI) of the $u_\tau$-th user in the $\tau$-th slice, and $f_\tau^{\mathrm{NS}}(\cdot,\cdot)$ is the network function for bandwidth reservation in network slicing. Since the sum of the bandwidth reserved for all the slices equals the total bandwidth of the BS, we have

$$\sum_{\tau=1}^{T_{\mathrm{NS}}} f_\tau^{\mathrm{NS}}\left(U_\tau^{\Phi,\xi}, I_{u_\tau}^{\Phi,\xi}\right) = 1, \tag{8}$$

where $T_{\mathrm{NS}}$ is the number of slices. Inspired by [31], the bandwidth reserved for each slice depends on the number of users in this slice and the QCI of these users, e.g.,

$$f_\tau^{\mathrm{NS}}(\cdot) = \frac{\sum_{u \in \mathcal{U}_\tau^{\Phi,\xi}} I_{u_\tau}^{\Phi,\xi}}{\sum_{\tau=1}^{T_{\mathrm{NS}}} \sum_{u \in \mathcal{U}_\tau^{\Phi,\xi}} I_{u_\tau}^{\Phi,\xi}}. \tag{9}$$
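The reservation rule in (7) and (9) amounts to a weighted split of the total bandwidth. The sketch below illustrates it under the assumption that each user's QCI is treated as a positive numeric weight. The slice names, weights, and the 20 MHz total bandwidth are hypothetical.

```python
from typing import Dict, List


def reserve_bandwidth(qci_per_slice: Dict[str, List[float]],
                      w_max: float) -> Dict[str, float]:
    """Split w_max across slices in proportion to the sum of the QCI weights
    of the users in each slice, following Eqs. (7) and (9)."""
    slice_weight = {name: sum(qcis) for name, qcis in qci_per_slice.items()}
    total = sum(slice_weight.values())
    if total <= 0:
        raise ValueError("at least one user with a positive QCI weight is required")
    return {name: w_max * weight / total for name, weight in slice_weight.items()}


# Hypothetical example: three slices with 3, 2, and 1 users.
slices = {"data_rate": [1, 1, 1], "latency": [2, 2], "security": [3]}
reserved = reserve_bandwidth(slices, w_max=20e6)
for name, bandwidth in reserved.items():
    print(f"{name}: {bandwidth / 1e6:.1f} MHz")
```

Because the fractions are normalized by the same denominator, they sum to one, so the normalization in (8) holds by construction. The reserved bandwidth of a slice changes whenever users join or leave any slice, which is why the available bandwidth seen by the allocation policy is dynamic.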

III-C Problem Formulation

To maximize the sum reward subject to the QoS requirements in each slice, we formulate the bandwidth allocation problem as follows,

$$\begin{aligned}
\max_{\bm{w}_{\tau}^{\Phi,\xi}} \quad & \sum_{u\in\mathcal{U}_{\tau}^{\Phi,\xi}} r_{u}^{\Phi,\xi}, && (10)\\
\mathrm{s.t.} \quad & \sum_{u\in\mathcal{U}_{\tau}^{\Phi,\xi}} w_{u}^{\Phi,\xi} \leq W_{\tau}^{\Phi,\xi}, && (10\mathrm{a})\\
& w_{u}^{\Phi,\xi} \geq 0, && (10\mathrm{b})\\
& r_{u}^{\Phi,\xi} \geq r_{\tau}^{\Phi,\xi}, && (10\mathrm{c})
\end{aligned}$$

where $\bm{w}_{\tau}^{\Phi,\xi}=[w_{1}^{\Phi,\xi},w_{2}^{\Phi,\xi},\cdots,w_{U_{\tau}}^{\Phi,\xi}]^{\mathrm{T}}$ is the bandwidth allocated to the users, and $r_{\tau}^{\Phi,\xi}$ is the minimum threshold of the QoS required by the users. Thus, constraint (10c) guarantees the QoS of all the users.

III-D Analysis of Problem Feasibility

Given the available bandwidth constraint in (10a) and the QoS constraint in (10c), problem (10) will be infeasible when some of the users in this slice have weak channels. We denote the minimum bandwidth required to meet constraint (10c) by $\bm{w}_{\min}^{\Phi,\xi}=\left[w_{1,\min}^{\Phi,\xi},\cdots,w_{U_{\tau},\min}^{\Phi,\xi}\right]^{\mathrm{T}}$. If some users experience deep fading, leading to $\sum_{u\in\mathcal{U}_{\tau}^{\Phi,\xi}} w_{u,\min}^{\Phi,\xi} > W_{\tau}^{\Phi,\xi}$, then problem (10) is infeasible. In this case, the BS will only schedule the users with sufficiently strong channels. Alternatively, to maximize the number of scheduled users in problem (10), we consider that the BS schedules the $K$ users with the smallest bandwidth requirement.
Denote the set of scheduled users by $\mathcal{K}_{\tau}^{\Phi,\xi}$. Then, for any $k\in\mathcal{K}_{\tau}^{\Phi,\xi}$ and $u\notin\mathcal{K}_{\tau}^{\Phi,\xi}$, we have $w_{k,\min}^{\Phi,\xi}\leq w_{u,\min}^{\Phi,\xi}$. After user scheduling, problem (10) can be reformulated as follows,

$$\begin{aligned}
\max_{\bm{w}_{\tau}^{\Phi,\xi}} \quad & \sum_{k\in\mathcal{K}_{\tau}^{\Phi,\xi}} r_{k}^{\Phi,\xi}, && (11)\\
\mathrm{s.t.} \quad & \sum_{k\in\mathcal{K}_{\tau}^{\Phi,\xi}} w_{k}^{\Phi,\xi} \leq W_{\tau}^{\Phi,\xi}, && (11\mathrm{a})\\
& w_{k}^{\Phi,\xi} \geq w_{k,\min}^{\Phi,\xi}. && (11\mathrm{b})
\end{aligned}$$

In the following, we investigate how to find the optimal solution to problem (11).
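The scheduling rule of Section III-D can be sketched as follows. This is an illustrative reading rather than the authors' implementation: it assumes that $K$ is the largest number of users whose minimum bandwidth requirements jointly fit into the slice, and the function name `schedule_users` is hypothetical.

```python
from typing import List, Sequence


def schedule_users(w_min: Sequence[float], W_slice: float) -> List[int]:
    """Return the indices of the scheduled users.

    Users are admitted in increasing order of their minimum bandwidth
    requirement until the next user would exceed the slice bandwidth.
    A user in deep fading can be modelled with w_min = float("inf").
    """
    order = sorted(range(len(w_min)), key=lambda u: w_min[u])
    scheduled: List[int] = []
    used = 0.0
    for u in order:
        if used + w_min[u] > W_slice:
            break  # every remaining user needs at least as much bandwidth
        scheduled.append(u)
        used += w_min[u]
    return scheduled


# Hypothetical example: four users, slice bandwidth of 6.5 units.
print(schedule_users([3.0, 1.0, 5.0, 2.0], 6.5))
```

Admitting users in increasing order of requirement guarantees that every scheduled user needs no more bandwidth than any unscheduled one, which is the ordering property stated before problem (11).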

IV Hybrid-Task Meta-Learning for GNN-based Scalable Bandwidth Allocation

In this section, we first illustrate how to obtain the optimal bandwidth allocation by using an iterative optimization algorithm. Next, we utilize feature engineering techniques to reformulate the problem, and represent the bandwidth allocation policy by a GNN. To help the GNN generalize, its input is the required minimum bandwidth, a feature that can represent different QoS requirements. Then, we develop a meta-learning approach to train the GNN. The goal is to obtain a policy that is scalable with the number of users and can generalize well in diverse communication scenarios with different channel distributions, QoS requirements, and available bandwidth.

Algorithm 1: Iterative Bandwidth Allocation Algorithm
Initialize: Bandwidth of a resource block: $\Delta w$.
Use the user scheduling algorithm to get the minimum required bandwidth for each scheduled user: $w_{k}^{\Phi,\xi}=w_{k,\min}^{\Phi,\xi},\ \forall k\in\mathcal{K}_{\tau}^{\Phi,\xi}$.
while $W_{\tau}^{\Phi,\xi}-\sum_{k\in\mathcal{K}_{\tau}^{\Phi,\xi}} w_{k}^{\Phi,\xi}\geq\Delta w$ do
    for $k\in\mathcal{K}_{\tau}^{\Phi,\xi}$ do
        $\Delta r_{k}^{\Phi,\xi}(w_{k}^{\Phi,\xi})=r_{k}^{\Phi,\xi}(w_{k}^{\Phi,\xi}+\Delta w)-r_{k}^{\Phi,\xi}(w_{k}^{\Phi,\xi})$.
    end for
    Identify the user with the highest $\Delta r_{k}^{\Phi,\xi}(w_{k}^{\Phi,\xi})$ in $\mathcal{K}_{\tau}^{\Phi,\xi}$: $k_{\mathrm{allo}}=\arg\max_{k}\Delta r_{k}^{\Phi,\xi}(w_{k}^{\Phi,\xi})$.
    Allocate an extra $\Delta w$ of bandwidth to the $k_{\mathrm{allo}}$-th user: $w_{k_{\mathrm{allo}}}^{\Phi,\xi}=w_{k_{\mathrm{allo}}}^{\Phi,\xi}+\Delta w$.
end while
Output: Optimal bandwidth allocation policy: $\bm{w}^{\Phi,\xi,\mathrm{opt}}=\bm{w}^{\Phi,\xi}$.

IV-A Optimal Bandwidth Allocation by Iterative Algorithm

Inspired by the optimization algorithm for resource allocation in [10], we propose an iterative optimization algorithm for solving our problems. We denote the bandwidth of each resource block by $\Delta w$. At the beginning of the iteration, the bandwidth allocated to each user is $w_{k,\min}^{\Phi,\xi}$. In each iteration, we calculate the incremental reward of each user when an extra resource block is allocated to it, denoted by $\Delta r_{k}^{\Phi,\xi}(w_{k}^{\Phi,\xi})=r_{k}^{\Phi,\xi}(w_{k}^{\Phi,\xi}+\Delta w)-r_{k}^{\Phi,\xi}(w_{k}^{\Phi,\xi})$. The resource block is then allocated to the user with the highest $\Delta r_{k}^{\Phi,\xi}$. The details of the algorithm can be found in Algorithm 1. The optimality of the algorithm depends on the properties of the problems.
If problem (11) is convex, then Algorithm 1 can obtain the optimal solution [10]. Since constraints (11a) and (11b) are linear, validating the convexity of problem (11) only requires validating whether $r_{u}^{\Phi,\xi}$ is concave in bandwidth. In the long blocklength regime, we can prove that the secrecy rate is concave in bandwidth (see the proof in Appendix A). Since Shannon's capacity is a special case of the secrecy rate when the wiretapped channel is in deep fading, Shannon's capacity is also concave in bandwidth. In addition, the authors of [41] proved that the effective capacity is concave in bandwidth. Therefore, Algorithm 1 can obtain the optimal solution in the long blocklength regime. In the short blocklength regime, $r_{u}^{\Phi,\xi}$ is not concave when $w_{k}^{\Phi,\xi}\in(0,\infty)$. Nevertheless, based on the results in [42], the optimal bandwidth can be obtained in a region $[0,w_{\mathrm{th}}]\subset(0,\infty)$, where $r_{u}^{\Phi,\xi}$ is concave in bandwidth. By searching for the optimal bandwidth in $[0,w_{\mathrm{th}}]$, Algorithm 1 can obtain the optimal solution in the short blocklength regime.
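The steps of Algorithm 1 can be sketched in Python as below. This is a minimal illustration, not the authors' code. The reward function `shannon_rate`, with $r(w)=w\log_2(1+c/w)$ and a hypothetical channel-quality constant `c`, is an assumed stand-in for the rewards defined earlier in the paper; it is chosen only because it is concave in $w$. All numerical values are illustrative.

```python
import math
from typing import Callable, List, Sequence


def shannon_rate(c: float) -> Callable[[float], float]:
    """Illustrative concave reward r(w) = w * log2(1 + c / w).

    c is a hypothetical constant that lumps transmit power, channel gain,
    and noise density together; it is not a quantity from the paper.
    """
    return lambda w: w * math.log2(1.0 + c / w)


def iterative_allocation(
    w_min: Sequence[float],
    W_slice: float,
    delta_w: float,
    reward_fns: Sequence[Callable[[float], float]],
) -> List[float]:
    """Greedy resource-block allocation following the steps of Algorithm 1."""
    w = list(w_min)  # start from the minimum required bandwidth
    if not w:
        return w
    if sum(w) > W_slice:
        raise ValueError("Minimum requirements exceed the slice bandwidth.")
    while W_slice - sum(w) >= delta_w:
        # Incremental reward of each user for one extra resource block.
        gains = [r(w_k + delta_w) - r(w_k) for r, w_k in zip(reward_fns, w)]
        k_allo = max(range(len(w)), key=lambda k: gains[k])
        w[k_allo] += delta_w
    return w


rewards = [shannon_rate(10.0), shannon_rate(2.0)]
allocation = iterative_allocation([1.0, 1.0], 6.0, 1.0, rewards)
print(allocation)
```

The loop condition is evaluated in floating point, so the example uses values that are exactly representable. A practical implementation might instead count the spare resource blocks as an integer.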

IV-B Feature Engineering and Problem Reformulation

To obtain a policy that can generalize well in different scenarios, we propose to use feature engineering techniques to represent the channels and QoS requirements with more general features. Specifically, we first normalize the bandwidth allocation policy by the bandwidth reserved for this slice. The normalized bandwidth allocated to the $k$-th user, $k\in\mathcal{K}_{\tau}^{\Phi,\xi}$, is given by $\tilde{w}_{k}^{\Phi,\xi}\triangleq w_{k}^{\Phi,\xi}/W_{\tau}^{\Phi,\xi}$.
Then, the normalized minimum bandwidth required by the scheduled users is denoted by $\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi}=[\tilde{w}_{1,\min}^{\Phi,\xi},\tilde{w}_{2,\min}^{\Phi,\xi},\ldots,\tilde{w}_{K_{\tau},\min}^{\Phi,\xi}]^{\mathrm{T}}$.
We define the surplus bandwidth as $w_{\mathrm{S}}^{\Phi,\xi}=W_{\tau}^{\Phi,\xi}-\sum_{k\in\mathcal{K}_{\tau}^{\Phi,\xi}}w_{k}$, and further denote the normalized surplus bandwidth by $\tilde{w}_{\mathrm{S}}^{\Phi,\xi}\triangleq w_{\mathrm{S}}^{\Phi,\xi}/W_{\tau}^{\Phi,\xi}$.
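The normalization above can be sketched as follows. The sketch assumes that the surplus is measured after every scheduled user has received its minimum bandwidth, i.e., $w_{k}=w_{k,\min}^{\Phi,\xi}$ in the definition of $w_{\mathrm{S}}^{\Phi,\xi}$; this is one reading of the definition, and the function name `normalize_features` is hypothetical.

```python
from typing import List, Sequence, Tuple


def normalize_features(w_min: Sequence[float], W_slice: float) -> Tuple[List[float], float]:
    """Return the normalized minimum bandwidths and the normalized surplus.

    Assumption: the surplus is the slice bandwidth left after each
    scheduled user has been given its minimum required bandwidth.
    """
    if W_slice <= 0:
        raise ValueError("The slice bandwidth must be positive.")
    w_min_norm = [w / W_slice for w in w_min]
    surplus_norm = (W_slice - sum(w_min)) / W_slice
    return w_min_norm, surplus_norm


# Hypothetical example: two scheduled users in a slice of 10 units.
w_min_norm, surplus_norm = normalize_features([2.0, 3.0], 10.0)
print(w_min_norm, surplus_norm)
```

Because both features are ratios to the slice bandwidth, they stay in $[0,1]$ regardless of the absolute bandwidth, which is what makes them comparable across scenarios.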

We note that the bandwidth allocation policy maps channels and constraints to the bandwidth allocated to each user. After scheduling and normalization, the features of the channel state information and constraints (11a) and (11b) can be represented by $\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi}$. Therefore, the bandwidth allocation policy can be reformulated as the mapping from $\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi}$ and $\tilde{w}_{\mathrm{S}}^{\Phi,\xi}$ to $\tilde{\bm{w}}_{\tau}^{\Phi,\xi}$. We denote this function by

$$\tilde{\bm{w}}_{\tau}^{\Phi,\xi}=\bm{f}^{\mathrm{W}}\left(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi}\right), \tag{12}$$

where $\bm{f}^{\mathrm{W}}(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi})=\left[f_{1}^{\mathrm{W}}(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi}),\cdots,f_{K}^{\mathrm{W}}(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi})\right]^{\mathrm{T}}$ and $\tilde{w}_{k}^{\Phi,\xi}=f_{k}^{\mathrm{W}}(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi})$. Given the bandwidth reserved for this slice, the achievable rates of the scheduled users can be expressed as

$$\begin{aligned}
\bm{r}_{\tau}^{\Phi,\xi} &= \bm{f}^{\Phi,\xi}\left(\tilde{\bm{w}}_{\tau}^{\Phi,\xi}\cdot W_{\tau}^{\Phi,\xi}\right)\\
&= \bm{f}^{\Phi,\xi}\left(\bm{f}^{\mathrm{W}}\left(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi}\right)\cdot W_{\tau}^{\Phi,\xi}\right),
\end{aligned} \tag{13}$$

where $\bm{r}_{\tau}^{\Phi,\xi}=\left[r_{1}^{\Phi,\xi},\cdots,r_{K_{\tau}}^{\Phi,\xi}\right]^{\mathrm{T}}$, $\bm{f}^{\Phi,\xi}(\tilde{\bm{w}}_{\tau}^{\Phi,\xi}\cdot W_{\tau}^{\Phi,\xi})=\left[f_{1}^{\Phi,\xi}(\tilde{w}_{1}^{\Phi,\xi}\cdot W_{\tau}^{\Phi,\xi}),\cdots,f_{K}^{\Phi,\xi}(\tilde{w}_{K_{\tau}}^{\Phi,\xi}\cdot W_{\tau}^{\Phi,\xi})\right]^{\mathrm{T}}$, and $r_{k}^{\Phi,\xi}=f_{k}^{\Phi,\xi}\left(f_{k}^{\mathrm{W}}(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi})\cdot W_{\tau}^{\Phi,\xi}\right)$. Then, we can reformulate problem (11) as a functional optimization problem,

$$\begin{aligned}
\max_{\bm{f}^{\mathrm{W}}(\cdot)} \quad & \sum_{k\in\mathcal{K}_{\tau}^{\Phi,\xi}} f_{k}^{\Phi,\xi}\left(f_{k}^{\mathrm{W}}\left(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi}\right)\cdot W_{\tau}^{\Phi,\xi}\right), && (14)\\
\mathrm{s.t.} \quad & 1-\sum_{k\in\mathcal{K}_{\tau}^{\Phi,\xi}} f_{k}^{\mathrm{W}}\left(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi}\right)\geq 0, && (14\mathrm{a})\\
& f_{k}^{\mathrm{W}}\left(\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi}\right)\geq\tilde{w}_{k,\min}^{\Phi,\xi}. && (14\mathrm{b})
\end{aligned}$$

In the rest of this section, we find the optimal solution to problem (14).

IV-C Proposed GNN

In this subsection, we propose a GNN-based unsupervised learning algorithm to obtain a scalable bandwidth allocation policy.

Figure 1: GNN-based scalable bandwidth allocation.

Initialize the batch size, $J$, the number of training epochs, $N$, and the learning rate, $\beta_{\theta}$.
Randomly initialize $\theta^{0}$.
for $n=0,1,\cdots,N-1$ do
    Message passing: $x_{k}^{n}=f_{\mathrm{FNN}}\left(\tilde{w}_{k,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi}\,\big|\,\theta^{n}\right),\ \forall k\in\mathcal{K}_{\tau}^{\Phi,\xi}$.
    Aggregation: $\bm{x}^{n}=f_{\mathtt{Concat}}\left(x_{1}^{n},\cdots,x_{K}^{n}\right)=\left[x_{1}^{n},\cdots,x_{K}^{n}\right]^{\mathrm{T}}$ and $\bm{y}^{n}=f_{\mathtt{Softmax}}(\bm{x}^{n})$.
    Readout: $\tilde{\bm{w}}^{n}=f_{\mathtt{Readout}}(\bm{y}^{n})=\bm{y}^{n}\cdot\tilde{w}_{\mathrm{S}}^{\Phi,\xi}+\tilde{\bm{w}}_{\min}^{\Phi,\xi}$.
    Update the loss function by eq. (15), denoted by $f^{\mathrm{L}}(\theta^{n})$.
    Update the parameters of the GNN by SGA: $\theta^{n+1}=\theta^{n}+\beta_{\theta}\nabla_{\theta}f^{\mathrm{L}}(\theta^{n})$.
end for
Return the parameters of the GNN as $\theta^{\mathrm{opt}}$.
Algorithm 2: GNN for Scalable Bandwidth Allocation.

IV-C1 Structure of GNN

As shown in Fig. 1, the proposed GNN-based bandwidth allocation algorithm comprises three key steps: message passing, aggregation, and readout.

Message passing

Each scheduled user is a vertex in the GNN. We use a fully connected neural network (FNN) to obtain the embedding of each vertex, denoted by $x_{k},\ \forall k\in\mathcal{K}_{\tau}^{\Phi,\xi}$. The inputs of each FNN include $\tilde{w}_{k,\min}^{\Phi,\xi}$ and $\tilde{w}_{\mathrm{S}}^{\Phi,\xi}=1-\sum_{k\in\mathcal{K}_{\tau}^{\Phi,\xi}}\tilde{w}_{k,\min}^{\Phi,\xi}$. We use $\theta$ to denote the training parameters of the FNN. In the $n$-th epoch, the message passing function is given by $x_{k}^{n}=f_{\mathrm{FNN}}\left(\tilde{w}_{k,\min}^{\Phi,\xi},\tilde{w}_{\mathrm{S}}^{\Phi,\xi}\,\big|\,\theta^{n}\right)$. Since the vertices are homogeneous, the training parameters of all the FNNs are the same.

Aggregation

In the aggregation step, we first aggregate the embeddings of all the scheduled users by using a concatenation function, $f_{\mathtt{Concat}}(\cdot)$, followed by a Softmax function, $f_{\mathtt{Softmax}}(\cdot)$, which serves as the activation function in the aggregation. The output after aggregation is denoted by $\bm{y}$.

Readout

The GNN's output of each vertex is updated by a readout function given by $f_{\mathtt{Readout}}(\bm{y})=\bm{y}\cdot\tilde{w}_{\mathrm{S}}^{\Phi,\xi}+\tilde{\bm{w}}_{\tau,\min}^{\Phi,\xi}$. Since $\bm{y}$ is obtained from the $\mathtt{Softmax}$ function, its elements are positive and sum to one. Hence, the readout function allocates all the surplus bandwidth to the users, and constraints (14a) and (14b) are satisfied.
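
The three steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the FNN is assumed to have a single tanh hidden layer with eight units and a scalar output, and the names `fnn`, `gnn_forward`, and `theta` are hypothetical. It illustrates that the same shared parameters serve any number of users and that the readout respects the bandwidth constraints.

```python
import math
import random


def fnn(w_min_k, w_s, theta):
    """Shared per-vertex FNN (assumed: one tanh hidden layer, scalar output)."""
    hidden = [math.tanh(a * w_min_k + b * w_s + c) for a, b, c, _ in theta]
    return sum(v * h for (_, _, _, v), h in zip(theta, hidden))


def softmax(xs):
    """Softmax over the concatenated embeddings."""
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gnn_forward(w_min, theta):
    """Message passing -> aggregation -> readout for one sample."""
    w_s = 1.0 - sum(w_min)                    # normalized surplus bandwidth
    x = [fnn(w, w_s, theta) for w in w_min]   # message passing with shared theta
    y = softmax(x)                            # aggregation: concatenation + softmax
    return [y_k * w_s + w for y_k, w in zip(y, w_min)]  # readout


random.seed(0)
# Hypothetical parameters: 8 hidden units, each a tuple (a, b, c, v).
theta = [tuple(random.uniform(-1.0, 1.0) for _ in range(4)) for _ in range(8)]

w_min_3 = [0.10, 0.20, 0.15]
w_min_5 = [0.05, 0.10, 0.10, 0.20, 0.15]
alloc_3 = gnn_forward(w_min_3, theta)   # 3 users
alloc_5 = gnn_forward(w_min_5, theta)   # 5 users, same theta
```

Because the Softmax outputs sum to one, each allocation vector sums to the surplus plus the minimum requirements, i.e., to the full normalized bandwidth, and every user receives at least its minimum share.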

IV-C2 Unsupervised Learning

The learning algorithm is detailed in Algorithm 2. Specifically, in the $n$-th epoch, we use our GNN to obtain the bandwidth allocation and estimate the expectation of the objective function by using the batch samples according to

$$f^{\mathrm{L}}(\theta)=\frac{1}{J}\sum_{j=1}^{J}\sum_{k\in\mathcal{K}_{\tau}^{\Phi,\xi}}f_{k}^{\Phi,\xi}\left(\tilde{w}_{j,k}^{\Phi,\xi}\cdot W_{j,\tau}^{\mathrm{NS}}\right), \tag{15}$$

where $J$ is the batch size. Then, we use stochastic gradient ascent (SGA) to maximize the estimated expectation of the objective function in (14). As shown in [16], maximizing the expectation of the objective function, where the expectation is taken over channels, is equivalent to maximizing the objective function with given channels. Thus, from Algorithm 2, we can find the bandwidth allocation policy that maximizes the objective function in (14).
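
The training loop of Algorithm 2 can be illustrated with a deliberately tiny stand-in for the GNN. Everything specific here is an assumption made for the sketch: the reward $f_k(w)=\log(1+g_k w)$, the channel model in `draw_sample`, the two-parameter linear scorer that replaces the FNN, and the use of finite differences in place of backpropagation. The point is only the structure: estimate the objective on a batch as in eq. (15), then move the parameters along the gradient with a plus sign.

```python
import math
import random


def allocate(w_min, theta):
    """GNN stand-in: score each user, apply softmax, then read out."""
    w_s = 1.0 - sum(w_min)
    x = [theta[0] * w + theta[1] * w_s for w in w_min]  # shared per-user scorer
    m = max(x)
    e = [math.exp(v - m) for v in x]
    z = sum(e)
    return [e_k / z * w_s + w for e_k, w in zip(e, w_min)]


def batch_objective(theta, batch):
    """Batch estimate as in eq. (15), with an ASSUMED reward log(1 + g_k * w_k)."""
    total = 0.0
    for gains, w_min in batch:
        w = allocate(w_min, theta)
        total += sum(math.log(1.0 + g * w_k) for g, w_k in zip(gains, w))
    return total / len(batch)


def numerical_grad(theta, batch, eps=1e-5):
    """Central finite differences, standing in for backpropagation."""
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[i] += eps
        dn[i] -= eps
        grad.append((batch_objective(up, batch) - batch_objective(dn, batch)) / (2 * eps))
    return grad


def draw_sample(rng, num_users=4):
    """Hypothetical sample: better channels need less minimum bandwidth."""
    gains = [rng.uniform(1.0, 10.0) for _ in range(num_users)]
    return gains, [0.1 / g for g in gains]


def train_sga(num_epochs=200, batch_size=16, lr=5.0, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(num_epochs):
        batch = [draw_sample(rng) for _ in range(batch_size)]
        grad = numerical_grad(theta, batch)
        # Ascent step: the gradient is ADDED because the objective is maximized.
        theta = [t + lr * g for t, g in zip(theta, grad)]
    return theta


eval_rng = random.Random(123)
eval_batch = [draw_sample(eval_rng) for _ in range(200)]
before = batch_objective([0.0, 0.0], eval_batch)
theta_opt = train_sga()
after = batch_objective(theta_opt, eval_batch)
```

Comparing `before` and `after` on a held-out batch is a quick sanity check that the sign of the update is right. The learning rate is large only because the inputs of this toy scorer are small fractions of the bandwidth, so its gradients are small.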

IV-C3 Computational Complexity

We compare the computational complexity of our GNN with the iterative algorithm introduced in Section IV-A. In cellular systems, both algorithms will be implemented in each transmission time interval with a duration of less than 1 ms. Thus, we are interested in the inference complexity of our GNN, i.e., the number of operations to be executed to obtain the bandwidth allocation in each transmission time interval.

Inference complexity of our GNN

To compute the embedding of each vertex, we need to compute the output of the FNN in Fig. 1. We denote the number of layers of the FNN by $L_{\mathrm{FNN}}$ and the number of neurons in the $\ell$-th layer by $m_{\mathrm{FNN}}^{\ell}$. Then, the number of multiplications required to compute the output of the $\ell$-th layer is $m_{\mathrm{FNN}}^{\ell}\cdot m_{\mathrm{FNN}}^{\ell+1}$ and the total number of multiplications for computing the embedding is $M_{\mathrm{FNN}}=\sum\nolimits_{\ell=1}^{L_{\mathrm{FNN}}}m_{\mathrm{FNN}}^{\ell}\cdot m_{\mathrm{FNN}}^{\ell+1}$ [10]. After obtaining the embeddings of $K$ users, the number of multiplications required by $f_{\mathtt{Softmax}}(\bm{x})$ and $f_{\mathtt{Readout}}(\bm{y})$ is $2K$. Therefore, the inference complexity of the GNN-based bandwidth allocation policy is

$$O_{\mathrm{GNN}}=O\left(K\cdot(M_{\mathrm{FNN}}+2)\right). \tag{16}$$

Complexity of the iterative algorithm

In each iteration of the optimization algorithm, we assign a small portion of the normalized surplus bandwidth, denoted by $\Delta\tilde{w}$, to the user that maximizes the objective function. The algorithm needs to compute the objective function $K$ times and find the best user. Denoting the complexity of computing the objective function by $\Omega$, the complexity of the iterative algorithm is given by

$$O_{\mathrm{iter}}=O\left(K\cdot\frac{w_{\mathrm{S}}}{\Delta w}\cdot\Omega\right), \tag{17}$$

where $w_{\mathrm{S}}/\Delta w$ represents the number of iterations used in the iterative algorithm.
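
The per-iteration step described above can be sketched as a greedy loop. This is an illustration under assumptions, not the algorithm of Section IV-A itself: the reward $\log(1+g_k w_k)$, the gains, and the function name `iterative_allocation` are hypothetical. The counter makes the $K\cdot w_{\mathrm{S}}/\Delta w$ objective evaluations behind eq. (17) explicit.

```python
import math


def iterative_allocation(gains, w_min, delta_w=0.05):
    """Greedy sketch: hand out the surplus bandwidth in steps of delta_w."""
    w = list(w_min)
    w_s = 1.0 - sum(w_min)              # normalized surplus bandwidth
    num_iters = round(w_s / delta_w)    # w_S / delta_w iterations
    evaluations = 0

    def objective(alloc):
        # ASSUMED reward: sum_k log(1 + g_k * w_k); not the paper's f_k.
        return sum(math.log(1.0 + g * a) for g, a in zip(gains, alloc))

    for _ in range(num_iters):
        best_k, best_val = 0, -math.inf
        for k in range(len(w)):         # K objective evaluations per iteration
            trial = list(w)
            trial[k] += delta_w
            val = objective(trial)
            evaluations += 1
            if val > best_val:
                best_k, best_val = k, val
        w[best_k] += delta_w            # give this portion to the best user
    return w, evaluations


w_min = [0.1, 0.2, 0.1]
alloc, n_evals = iterative_allocation([2.0, 5.0, 9.0], w_min)
```

With three users, a surplus of 0.6, and a step of 0.05, the loop runs 12 iterations and evaluates the objective 36 times, which is the count that eq. (17) multiplies by $\Omega$.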

Complexity comparison

To obtain the bandwidth allocation in each transmission time interval, the transmitter either uses the forward propagation algorithm to compute the output of the GNN or executes the iterative algorithm. From eqs. (16) and (17), we can see that the computational complexities of both our GNN and the iterative algorithm increase linearly with the number of users. For the GNN, the per-user cost $M_{\mathrm{FNN}}$ in eq. (16) is fixed by the FNN architecture and is quite limited. In contrast, the complexity of the iterative algorithm also increases with the amount of surplus bandwidth and the resource block, and thus depends on the channels of the users. In addition, the complexity of evaluating the objective function in each iteration of the optimization algorithm, denoted by $\Omega$ in eq. (17), could also be extremely high. Thus, the inference complexity of the GNN is much lower than the complexity of the iterative optimization algorithm.
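
To make the comparison concrete, eqs. (16) and (17) can be evaluated for illustrative numbers. All values below are hypothetical choices for the sketch (layer widths, step size $\Delta w$, and the cost $\Omega$ of one objective evaluation); they are not taken from the simulations in this paper.

```python
def fnn_multiplications(layer_widths):
    """M_FNN: sum of products of consecutive layer widths (input to output)."""
    return sum(a * b for a, b in zip(layer_widths, layer_widths[1:]))


def gnn_multiplications(num_users, m_fnn):
    """Eq. (16): K * (M_FNN + 2)."""
    return num_users * (m_fnn + 2)


def iterative_operations(num_users, w_s, delta_w, omega):
    """Eq. (17): K * (w_S / delta_w) * Omega."""
    return num_users * round(w_s / delta_w) * omega


# Hypothetical FNN: 2 inputs, two hidden layers of 16 neurons, 1 output.
m_fnn = fnn_multiplications([2, 16, 16, 1])

for k in (10, 20, 40):
    gnn_cost = gnn_multiplications(k, m_fnn)
    iter_cost = iterative_operations(k, w_s=0.6, delta_w=0.001, omega=1000)
    print(k, gnn_cost, iter_cost)
```

Both counts grow linearly in the number of users, but the iterative count carries the extra factors $w_{\mathrm{S}}/\Delta w$ and $\Omega$, which is the gap the comparison above refers to.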

IV-D Hybrid-Task Meta-Learning

To obtain a GNN with strong generalization ability, we propose an HML algorithm that combines multi-task learning and meta-learning.

IV-D1 Task, Sample, and Taskset

To apply the meta-learning framework, we first define tasks, samples, and tasksets in the context of bandwidth allocation problems. A task is a specific bandwidth allocation problem with a unique combination of system parameters, including the number of users, $U$, the channel model (i.e., path loss model, shadowing, and small-scale channel fading), the QoS requirement, $r_{\tau}^{\Phi,\xi}$, and the reserved bandwidth, $W_{\tau}^{\Phi,\xi}$. Changing any of these system parameters results in a different task. For each task, the samples correspond to the wireless channels that have been transformed into the minimum bandwidth requirement by feature engineering, as specified in constraint (14b). There are four tasksets in meta-learning, and each taskset consists of multiple tasks. We provide their definitions in the sequel.
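
One way to picture this definition is as an immutable record whose equality is decided by its system parameters. The sketch below is only an illustration: the class name `Task`, the field names, and all values are hypothetical and do not come from the paper's simulation setup.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Task:
    """A bandwidth allocation task identified by its system parameters."""
    num_users: int             # number of users, U
    path_loss_model: str       # hypothetical label for the path loss model
    shadowing_std_db: float    # hypothetical shadowing parameter
    fading: str                # hypothetical label for small-scale fading
    qos_requirement: float     # QoS requirement, r_tau
    reserved_bandwidth: float  # reserved bandwidth, W_tau


base = Task(10, "model-A", 8.0, "rayleigh", 1.0e6, 20.0e6)
same = Task(10, "model-A", 8.0, "rayleigh", 1.0e6, 20.0e6)
more_users = replace(base, num_users=20)  # changing one parameter gives a new task

distinct_tasks = {base, same, more_users}
```

A taskset is then simply a collection of such records, and the samples of a task are the channel realizations drawn under its parameters.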

Figure 2: Tasksets of meta-learning algorithms, where different shapes represent different tasks. (a) Model-agnostic meta-learning (MAML). (b) Hybrid-task meta-learning (HML).

IV-D2 Support Set and Query Set in Meta-Training

As shown in Fig. 2(a), most meta-learning algorithms, such as MAML, consist of a meta-training stage and a meta-testing stage. In meta-training, there are two tasksets, the support set $\mathcal{T}^{\mathrm{S}}$ and the query set $\mathcal{T}^{\mathrm{Q}}$. The tasks in the two tasksets are the same, but the samples of each task in the two tasksets are different. Specifically, we first set the initial parameters of the GNN to $\phi$, which is randomly initialized at the beginning of meta-training and updated in every iteration of meta-training. Next, we train the parameters of the GNN by using the tasks and the corresponding samples in the support set, where $\theta$ is initialized with $\phi$. Then, we update the initial parameters $\phi$ by using the tasks and the corresponding samples in the query set. We denote the initial parameters trained in meta-training of MAML by $\phi^{*}$. The details of the MAML algorithm can be found in [37].

IV-D3 Fine-Tuning Set and Evaluation Set in Meta-Testing

To evaluate the generalization ability of the GNN, a different set of tasks that are unseen in the meta-training stage is used in meta-testing. As shown in Fig. 2(a), the tasks in meta-testing are divided into a fine-tuning set and an evaluation set, denoted by $\mathcal{T}^{\mathrm{F}}$ and $\mathcal{T}^{\mathrm{E}}$, respectively. The tasks in $\mathcal{T}^{\mathrm{F}}$ and $\mathcal{T}^{\mathrm{E}}$ are the same, but the samples of each task in these two tasksets are different. For each new task in meta-testing, the samples from the fine-tuning set are used to fine-tune $\theta$, which is initialized by $\phi^{*}$ obtained in meta-training. After fine-tuning, the updated GNN is tested with the samples from the evaluation set. If no sample is used to fine-tune the GNN in the meta-testing stage, we refer to this approach as zero-shot meta-learning. Otherwise, it is known as few-shot meta-learning. The meta-testing algorithm is detailed in Algorithm 3.

Initialize the number of training epochs, $N$, and the learning rate of the target testing task, $\beta_{\theta}$.
Select the $i$-th task from the fine-tuning set and the evaluation set: $\mathcal{T}_{i}^{\mathrm{F}}\in\mathcal{T}^{\mathrm{F}}$ and $\mathcal{T}_{i}^{\mathrm{E}}\in\mathcal{T}^{\mathrm{E}}$.
Set the initial parameters of the GNN as $\theta^{0}=\phi^{*}$.
for $n=0,1,\cdots,N-1$ do
    Randomly select $J$ samples from task $\mathcal{T}_{i}^{\mathrm{F}}$.
    Calculate the loss on the fine-tuning set according to (15), denoted by $f^{\mathrm{L,F}}(\theta^{n})$.
    Update the parameters of the GNN by $\theta^{n+1}=\theta^{n}+\beta_{\theta}\nabla_{\theta}f^{\mathrm{L,F}}(\theta^{n})$.
end for
Randomly select $J^{\prime}$ samples from task $\mathcal{T}_{i}^{\mathrm{E}}$.
Evaluate the fine-tuned policy of the $i$-th task using $\theta^{N}$: $\bm{w}_{i}^{\Phi,\xi}=f_{\mathrm{GNN}}\left(\tilde{\bm{w}}_{\min,i,j^{\prime}}^{\Phi,\xi},\tilde{w}_{\mathrm{S},i,j^{\prime}}^{\Phi,\xi}\,\big|\,\theta^{N}\right)$.
Algorithm 3: Meta-Testing
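
The fine-tuning loop of Algorithm 3 reduces to a few gradient-ascent steps started from the meta-trained initialization. The sketch below replaces the GNN and its loss with a toy scalar objective $-(\theta-t)^2$, which is purely an assumption for illustration; `fine_tune`, `target`, and the numeric values are hypothetical. Setting the number of steps to zero gives the zero-shot case.

```python
def fine_tune(phi_star, grad_fn, lr, num_steps):
    """Gradient ascent from the meta-trained initialization phi_star."""
    theta = phi_star
    for _ in range(num_steps):
        theta = theta + lr * grad_fn(theta)  # ascent step, as in Algorithm 3
    return theta


# Toy unseen task (ASSUMPTION): maximize -(theta - target)^2.
target = 2.0


def objective(theta):
    return -(theta - target) ** 2


def gradient(theta):
    return -2.0 * (theta - target)


phi_star = 0.5  # stands in for the initialization returned by meta-training
zero_shot = fine_tune(phi_star, gradient, lr=0.1, num_steps=0)
few_shot = fine_tune(phi_star, gradient, lr=0.1, num_steps=5)
```

In this toy setting each step shrinks the distance to the task optimum by a factor of 0.8, so `few_shot` scores higher than `zero_shot` on the task objective, while `zero_shot` simply returns the initialization unchanged.
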
1 Randomly initialize the training parameters for all the tasks, ϕitalic-ϕ\phiitalic_ϕ, the number of meta-training epochs, M𝑀Mitalic_M, the learning rate of meta-training, βϕsubscript𝛽italic-ϕ\beta_{\phi}italic_β start_POSTSUBSCRIPT italic_ϕ end_POSTSUBSCRIPT, and the learning rate of each task, βθsubscript𝛽𝜃\beta_{\theta}italic_β start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT.
2 for m=0,1,,M1𝑚01normal-⋯𝑀1m=0,1,\cdots,M-1italic_m = 0 , 1 , ⋯ , italic_M - 1 do
3       Select a batch of I𝐼Iitalic_I tasks from the support set: 𝒯iS𝒯S,i{1,2,,I}formulae-sequencesuperscriptsubscript𝒯𝑖Ssuperscript𝒯S𝑖12𝐼\mathcal{T}_{i}^{\mathrm{S}}\in\mathcal{T}^{\mathrm{S}},\quad i\in\{1,2,\cdots% ,I\}caligraphic_T start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_S end_POSTSUPERSCRIPT ∈ caligraphic_T start_POSTSUPERSCRIPT roman_S end_POSTSUPERSCRIPT , italic_i ∈ { 1 , 2 , ⋯ , italic_I }.
4       for i=1,2,,I𝑖12normal-⋯𝐼i=1,2,\cdots,Iitalic_i = 1 , 2 , ⋯ , italic_I do
5             Set the initial parameters of the GNN to θim=ϕm.superscriptsubscript𝜃𝑖𝑚superscriptitalic-ϕ𝑚\theta_{i}^{m}=\phi^{m}.italic_θ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT = italic_ϕ start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT .
6             for n=0,1,,N1𝑛01normal-⋯𝑁1n=0,1,\cdots,N-1italic_n = 0 , 1 , ⋯ , italic_N - 1 do
7                   Randomly select J𝐽Jitalic_J samples from task 𝒯iSsuperscriptsubscript𝒯𝑖S\mathcal{T}_{i}^{\mathrm{S}}caligraphic_T start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_S end_POSTSUPERSCRIPT.
8                   Calculate the loss function in the support set according to (15), denoted by fL,S(θim,n)superscript𝑓LSsuperscriptsubscript𝜃𝑖𝑚𝑛{f}^{\mathrm{L,S}}(\theta_{i}^{m,n})italic_f start_POSTSUPERSCRIPT roman_L , roman_S end_POSTSUPERSCRIPT ( italic_θ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m , italic_n end_POSTSUPERSCRIPT ).  Update the parameters by: θim,n+1=θim,n+βθθfL,S(θim,n).superscriptsubscript𝜃𝑖𝑚𝑛1superscriptsubscript𝜃𝑖𝑚𝑛subscript𝛽𝜃subscript𝜃superscript𝑓LSsuperscriptsubscript𝜃𝑖𝑚𝑛\theta_{i}^{m,n+1}=\theta_{i}^{m,n}+{\beta_{\theta}}\nabla_{\theta}{f}^{% \mathrm{L,S}}(\theta_{i}^{m,n}).italic_θ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m , italic_n + 1 end_POSTSUPERSCRIPT = italic_θ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m , italic_n end_POSTSUPERSCRIPT + italic_β start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT ∇ start_POSTSUBSCRIPT italic_θ end_POSTSUBSCRIPT italic_f start_POSTSUPERSCRIPT roman_L , roman_S end_POSTSUPERSCRIPT ( italic_θ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m , italic_n end_POSTSUPERSCRIPT ) . 
9             end for
10            Select a batch of Isuperscript𝐼I^{\prime}italic_I start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT tasks from the query set: 𝒯iQ𝒯Q,i{1,2,,I}formulae-sequencesuperscriptsubscript𝒯superscript𝑖Qsuperscript𝒯Qsuperscript𝑖12superscript𝐼\mathcal{T}_{i^{\prime}}^{\mathrm{Q}}\in\mathcal{T}^{\mathrm{Q}},\quad i^{% \prime}\in\{1,2,\cdots,I^{\prime}\}caligraphic_T start_POSTSUBSCRIPT italic_i start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_Q end_POSTSUPERSCRIPT ∈ caligraphic_T start_POSTSUPERSCRIPT roman_Q end_POSTSUPERSCRIPT , italic_i start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ { 1 , 2 , ⋯ , italic_I start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT }. for i=1,,Isuperscript𝑖normal-′1normal-⋯superscript𝐼normal-′i^{\prime}=1,\cdots,I^{\prime}italic_i start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = 1 , ⋯ , italic_I start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT do
11                   Randomly select Jsuperscript𝐽J^{\prime}italic_J start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT samples from task 𝒯iQ,j{1,2,,J}superscriptsubscript𝒯superscript𝑖Qsuperscript𝑗12superscript𝐽\mathcal{T}_{i^{\prime}}^{\mathrm{Q}},\quad j^{\prime}\in\{1,2,...,J^{\prime}\}caligraphic_T start_POSTSUBSCRIPT italic_i start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_Q end_POSTSUPERSCRIPT , italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ { 1 , 2 , … , italic_J start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT }.
12                   Calculate the loss function in the query task: fiL,Q(θim,N)=1I1Ji=1Ij=1Jk𝒦τΦ,ξfkΦ,ξ(w~i,j,kΦ,ξ,mWτ,jNS)superscriptsubscript𝑓𝑖LQsuperscriptsubscript𝜃𝑖𝑚𝑁1superscript𝐼1superscript𝐽superscriptsubscriptsuperscript𝑖1superscript𝐼superscriptsubscriptsuperscript𝑗1superscript𝐽subscript𝑘superscriptsubscript𝒦𝜏Φ𝜉superscriptsubscript𝑓𝑘Φ𝜉superscriptsubscript~𝑤superscript𝑖superscript𝑗𝑘Φ𝜉𝑚superscriptsubscript𝑊𝜏𝑗NSf_{i}^{\mathrm{L,Q}}\left(\theta_{i}^{m,N}\right)=\frac{1}{I^{\prime}}\frac{1}% {J^{\prime}}\sum\limits_{i^{\prime}=1}^{I^{\prime}}\sum\limits_{j^{\prime}=1}^% {J^{\prime}}\sum\limits_{k\in\mathcal{K}_{\tau}^{\Phi,\xi}}f_{k}^{\Phi,\xi}% \left(\tilde{w}_{i^{\prime},j^{\prime},k}^{\Phi,\xi,m}\cdot W_{\tau,j}^{% \mathrm{NS}}\right)italic_f start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_L , roman_Q end_POSTSUPERSCRIPT ( italic_θ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m , italic_N end_POSTSUPERSCRIPT ) = divide start_ARG 1 end_ARG start_ARG italic_I start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG divide start_ARG 1 end_ARG start_ARG italic_J start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_ARG ∑ start_POSTSUBSCRIPT italic_i start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_I start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_J start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_k ∈ caligraphic_K start_POSTSUBSCRIPT italic_τ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_Φ , italic_ξ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT italic_f start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_Φ , italic_ξ end_POSTSUPERSCRIPT ( over~ start_ARG italic_w end_ARG start_POSTSUBSCRIPT italic_i start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_j 
start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_Φ , italic_ξ , italic_m end_POSTSUPERSCRIPT ⋅ italic_W start_POSTSUBSCRIPT italic_τ , italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_NS end_POSTSUPERSCRIPT ),  
13             end for
14            
15       end for
16      Calculate the loss function in meta-training: $f^{\mathrm{L,Meta},m}\left(\phi^{m}\right)=\frac{1}{I}\sum\limits_{i=1}^{I}f_{i}^{\mathrm{L,Q}}\left(\theta_{i}^{m,N}\right)$.  Update the initial parameters: $\phi^{m+1}=\phi^{m}+\beta_{\phi}\nabla_{\phi}f^{\mathrm{L,Meta},m}\left(\phi^{m}\right)$.
17      
18 end for
Return the optimal initial parameters of the GNN: $\phi^{\mathrm{opt}}=\phi^{M}$.
Algorithm 4 Meta-Training of Hybrid-Task Meta-Learning

IV-D4 Meta-Training of Proposed HML Algorithm

Fig. 2(b) illustrates the tasks and tasksets used in the meta-training and meta-testing of the proposed HML algorithm. The difference between MAML and HML lies in the selection of tasks from the query set. In MAML, the tasks selected from the query set are identical to those selected from the support set in each meta-training epoch. To improve the generalization ability, in HML, we select different tasks from the query set to train the initial parameters of the GNN. Specifically, $I^{\prime}$ tasks are randomly selected from the query set to estimate the average loss of the GNN parameterized by $\phi^{m}$ in the $m$-th epoch of meta-training. The step-by-step algorithm for meta-training of the proposed HML algorithm is described in Algorithm 4, and the meta-testing algorithm of HML is the same as that of MAML in Algorithm 3.
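The difference in task selection can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the authors' implementation: the quadratic task loss stands in for the GNN loss, the meta-gradient uses a first-order approximation, the update is written as gradient descent on a loss, and every function name and step size below is hypothetical.

```python
import numpy as np

def task_loss_grad(params, target):
    """Toy quadratic task, loss = 0.5 * ||params - target||^2 (stand-in for the GNN loss)."""
    diff = params - target
    return 0.5 * float(diff @ diff), diff

def inner_adapt(phi, target, beta_theta, n_steps):
    """N gradient steps on one support task, starting from the shared initialization phi."""
    theta = phi.copy()
    for _ in range(n_steps):
        _, grad = task_loss_grad(theta, target)
        theta = theta - beta_theta * grad
    return theta

def meta_epoch(phi, support_tasks, query_tasks, rng, hybrid,
               n_support=4, n_query=2, beta_theta=0.1, beta_phi=0.1, n_steps=3):
    """One meta-training epoch with a first-order approximation of the meta-gradient."""
    support_idx = rng.choice(len(support_tasks), size=n_support, replace=False)
    meta_grad = np.zeros_like(phi)
    for i in support_idx:
        theta_i = inner_adapt(phi, support_tasks[i], beta_theta, n_steps)
        if hybrid:
            # HML: evaluate on I' tasks drawn independently from the query set
            query_idx = rng.choice(len(query_tasks), size=n_query, replace=False)
        else:
            # MAML: the query task is the same task that was used for adaptation
            query_idx = [i]
        for q in query_idx:
            _, grad = task_loss_grad(theta_i, query_tasks[q])
            meta_grad += grad / (len(query_idx) * n_support)
    return phi - beta_phi * meta_grad

rng = np.random.default_rng(0)
tasks = [rng.normal(size=2) for _ in range(8)]  # each toy task is defined by its target
phi = np.zeros(2)
for _ in range(200):
    phi = meta_epoch(phi, tasks, tasks, rng, hybrid=True)
```

Setting `hybrid=False` recovers the MAML-style pairing, so the two selection rules can be compared on the same toy tasks. The default values `n_support=4` and `n_query=2` mirror the meta-optimizer batch sizes $I$ and $I^{\prime}$ in Table II.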

TABLE II: Key Simulation Parameters
Simulation parameter | Value
Transmit power of each user, $P_{u}$ | 23 dBm
Single-sided noise spectral density, $N_{0}$ | -174 dBm/Hz
Channel coherence time, $T_{\mathrm{c}}$ | 1 ms [28]
Duration of one time slot, $T_{\mathrm{s}}$ | 0.125 ms
Decoding error probability, $\epsilon_{u}$ | $10^{-5}$ [28]
Information leakage, $\delta_{u}$ | $10^{-2}$ [28]
QoS exponent of queuing delay, $\vartheta_{u}$ | $10^{-3}$ [28]
Size of bandwidth resource block, $\Delta w$ | 10 kHz
Learning rates, $\beta_{\theta}/\beta_{\phi}$ | $10^{-4}$
Batch sizes of GNN, $J/J^{\prime}$ | 32
Batch sizes of meta optimizer, $I, I^{\prime}$ | 4, 2
TABLE III: System Parameters of Different Tasks
Category | Parameter | $\mathcal{T}^{\mathrm{S}}$ & $\mathcal{T}^{\mathrm{Q}}$ | $\mathcal{T}^{\mathrm{F}}$ & $\mathcal{T}^{\mathrm{E}}$
Network scale | Number of users | $U_{\tau}^{\Phi,\xi}\in\{10,11,\cdots,30\}$ | $U_{\tau}^{\Phi,\xi}=50$
Channel models | Path loss: $\alpha_{u}=(d_{u})^{-\gamma_{u}}$ | $\gamma_{u}\in\{2,3\}$ | $\gamma_{u}=4$
Channel models | Shadowing: $p_{u}^{\mathrm{S}}(\psi)=\frac{10/\ln 10}{\sqrt{2\pi}\sigma_{\psi_{\mathrm{dB}}}\psi}\exp\left(-\frac{(10\log_{10}\psi-\mu_{\psi_{\mathrm{dB}}})^{2}}{2\sigma_{\psi_{\mathrm{dB}}}^{2}}\right)$ | $\psi_{\mathrm{dB}}\in\{3,4,5\}$ | $\psi_{\mathrm{dB}}=8$
Channel models | Small-scale channels: $p_{u}^{\mathrm{I}}(z \mid s,\sigma)=\frac{z}{\sigma^{2}}\exp\left(-\frac{z^{2}+s^{2}}{2\sigma^{2}}\right)\cdot I_{0}\left(\frac{zs}{\sigma^{2}}\right)$, $p_{u}^{\mathrm{N}}(z \mid m,\sigma)=\frac{2m^{m}z^{2m-1}}{\Gamma(m)(2\sigma^{2})^{m}}\exp\left(-\frac{mz^{2}}{2\sigma^{2}}\right)$, $p_{u}^{\mathrm{R}}(z \mid \sigma)=\frac{z}{\sigma^{2}}\exp\left(-\frac{z^{2}}{2\sigma^{2}}\right)$ | $p_{u}^{\mathrm{I}}(z \mid s,\sigma)$, $s\in\{1,\cdots,5\}$; $p_{u}^{\mathrm{N}}(z \mid m,\sigma)$, $m\in\{2,\cdots,6\}$ | $p_{u}^{\mathrm{R}}(z \mid \sigma)$
QoS | Rewards with different QoS requirements | $\max\limits_{\bm{w}}\sum\limits_{u\in\mathcal{U}_{\tau}^{S,\mathcal{I}}}r_{u}^{S,\mathcal{I}}$ | $\max\limits_{\bm{w}}\sum\limits_{u\in\mathcal{U}_{\tau}^{\Phi,\xi}}r_{u}^{\Phi,\xi}$, $\Phi\in\{D,E,S\}$, $\xi\in\{\mathcal{I},\mathcal{F}\}$
QoS | Values of QoS constraints (Mbps) | $r_{\tau}^{S,\mathcal{I}}\in\{1,\cdots,10\}$ | $r_{\tau}^{\Phi,\xi}=10$
Reserved bandwidth | Constraints on reserved bandwidth (MHz) | $W_{\tau}^{S,\mathcal{I}}\in\{10,\cdots,100\}$ | $W_{\tau}^{\Phi,\xi}=100$

V Performance Evaluation

In this section, we evaluate the performance of our GNN-based HML algorithm. The GNN is first initialized by the parameters obtained from meta-training, where all the tasks aim to maximize the sum of the secrecy rate with different numbers of users and channel models. Then, we evaluate the performance of our GNN in unseen tasks with different numbers of users, channel models, objective functions, QoS constraints, and reserved bandwidth.

V-A System Setup

We consider a BS, located at $(0,0)$ m, serving multiple users randomly distributed in a rectangular area, where the coordinates of the users are denoted by $(c_{x,u},c_{y,u})$, with $c_{x,u}, c_{y,u}\in[-100,100]$. When the QoS requirement is secrecy rate, an eavesdropper is randomly located in the above rectangular area. The transmitted signal of each user is a complex Gaussian process with zero mean and equal variance, $\sigma^{2}=1$. Channel models include large-scale channels and small-scale channels. Specifically, the large-scale channels depend on path loss and shadowing, whilst the small-scale channels follow Rice, Nakagami, and Rayleigh distributions with the parameters listed in Table III. The number of neurons in each layer of the GNN is $2/32/64/64/32/1$. Unless otherwise mentioned, the simulation parameters are summarized in Table II, and the parameters of tasksets are defined in Table III.
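To make the channel setup concrete, the following is a minimal sketch of how one channel realization per user could be drawn from the models in Table III. It is an illustration under assumptions, not the simulator used in the paper: the function name, the 1 m distance clipping, the zero-mean shadowing, the reading of $\psi_{\mathrm{dB}}$ as the shadowing standard deviation in dB, and the composition of the overall gain as path loss times shadowing times squared envelope are all assumptions.

```python
import numpy as np

def sample_channel_gains(num_users, gamma, shadow_std_db, fading, rng,
                         rice_s=1.0, nakagami_m=2.0, sigma=1.0, half_width=100.0):
    """Draw one channel power gain per user (illustrative sketch only)."""
    # Users uniformly distributed in [-100, 100] x [-100, 100], BS at the origin
    xy = rng.uniform(-half_width, half_width, size=(num_users, 2))
    # Assumption: clip the distance at 1 m so the path loss stays bounded
    d = np.maximum(np.linalg.norm(xy, axis=1), 1.0)
    path_loss = d ** (-float(gamma))
    # Log-normal shadowing: Gaussian in dB (zero mean assumed), converted to linear scale
    shadow = 10.0 ** (rng.normal(0.0, shadow_std_db, size=num_users) / 10.0)
    # Small-scale fading envelope z
    if fading == "rayleigh":
        z = rng.rayleigh(scale=sigma, size=num_users)
    elif fading == "rice":
        # Envelope of a complex Gaussian with a line-of-sight component of amplitude s
        re = rng.normal(rice_s, sigma, size=num_users)
        im = rng.normal(0.0, sigma, size=num_users)
        z = np.hypot(re, im)
    elif fading == "nakagami":
        # z^2 ~ Gamma(shape=m, scale=2*sigma^2/m) reproduces the Nakagami density of Table III
        z = np.sqrt(rng.gamma(shape=nakagami_m, scale=2.0 * sigma**2 / nakagami_m,
                              size=num_users))
    else:
        raise ValueError(f"unknown fading model: {fading}")
    return path_loss * shadow * z**2

rng = np.random.default_rng(0)
# A meta-training-like task (Rice fading) and a meta-testing-like task (Rayleigh fading)
train_gains = sample_channel_gains(20, gamma=3, shadow_std_db=4.0, fading="rice", rng=rng, rice_s=2.0)
test_gains = sample_channel_gains(50, gamma=4, shadow_std_db=8.0, fading="rayleigh", rng=rng)
```

The two calls at the end mirror the split in Table III: path-loss exponent, shadowing parameter, fading family, and number of users all differ between the meta-training and meta-testing tasksets.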

V-B Performance of GNN

Figure 3: Training losses with different numbers of users, where the secrecy rate in the long blocklength regime is considered, $r_{\tau}^{S,\mathcal{I}}=10$ Mbps, and $W_{\tau}^{S,\mathcal{I}}=100$ MHz.

Fig. 3 shows the training losses when the number of users increases from 10 to 50. The results show that the unsupervised learning algorithm can converge after a few hundred training epochs for different numbers of users, and the convergence time increases slightly with the number of users.

(a) Secrecy rates of scheduled users.
(b) Sum secrecy rate.
Figure 4: Testing samples are selected from tasksets $\mathcal{T}^{\mathrm{F}}$ and $\mathcal{T}^{\mathrm{E}}$ in Table III.

After the training stage of the unsupervised learning algorithm, we select 1000 samples from the evaluation set of the same task to evaluate the constraint and reward achieved by the GNN in Fig. 4. The results in Fig. 4(a) show that the secrecy rates of all the scheduled users are equal to or higher than the requirement, $r_{\tau}^{S,\mathcal{I}}=10$ Mbps. The results in Fig. 4(b) show that the sum secrecy rate achieved by the GNN is close to that achieved by the iterative optimization algorithm in Section IV-A (with legend “Optimal”). In other words, the unsupervised learning algorithm can obtain a near-optimal solution.

V-C Meta-Testing Performance of HML

In this subsection, we evaluate the generalization ability of the proposed HML algorithm. The differences between tasks in meta-training and meta-testing are shown in Table III. In meta-testing, we first select an unseen task that is not included in meta-training. In each training epoch of the meta-testing, 32 samples are randomly selected from $\mathcal{T}^{\mathrm{F}}$ to fine-tune the GNN, whilst all the 1000 testing samples from the same task in $\mathcal{T}^{\mathrm{E}}$ are used to evaluate the performance.

V-C1 Different Wireless Channels and QoS Requirements

In this part, we set $W_{\tau}^{S,\mathcal{I}}=100$ MHz and $r_{\tau}^{S,\mathcal{I}}=10$ Mbps for all types of services. The other parameters follow the rules in $\mathcal{T}^{\mathrm{S}}$ and $\mathcal{T}^{\mathrm{Q}}$ as shown in Table III. We compare the initial performance and sample efficiency of HML with four benchmarks: 1) Optimal, 2) Model-agnostic meta-learning (MAML), 3) Multi-task learning-based transfer learning (MTL Transfer), and 4) Random initialization.

  • Optimal: The optimal solution is obtained by the iterative algorithm detailed in Section IV-A, and its optimality has been proved in [10].

  • MAML: MAML is one of the most widely used meta-learning algorithms, and its key ideas have been discussed in Section IV-D.

  • MTL Transfer: Transfer learning improves the sample efficiency by fine-tuning the parameters of the pre-trained GNN in a task with fewer training samples. With multi-task learning (MTL), the initial performance is much better than random initialization as the GNN is pre-trained in multiple tasks [37, 43]. To execute MTL transfer learning, we only need to replace the initialization in line 2 of Algorithm 2 with the pre-trained parameters.

  • Random Initialization: Random initialization is the conventional method that trains the GNN from scratch with a new task.

In Figs. 5-7, the horizontal axis represents the training epochs used to fine-tune the GNN, where 32 samples from $\mathcal{T}^{\mathrm{F}}$ are used in each epoch. The vertical axis represents the sum of the rewards of all the users, averaged over the 1000 testing samples from $\mathcal{T}^{\mathrm{E}}$. We refer to it as the average sum reward.
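The evaluation protocol behind these curves can be sketched as follows. This is a hedged illustration rather than the authors' code: `train_step` and `reward_fn` are hypothetical callbacks standing in for one fine-tuning update of the GNN and for the per-user reward computation, and the toy callbacks at the end exist only to make the example runnable.

```python
import numpy as np

def average_sum_reward(rewards):
    """rewards has shape (num_samples, num_users): sum over users, then average over samples."""
    rewards = np.asarray(rewards, dtype=float)
    return float(rewards.sum(axis=1).mean())

def fine_tune_curve(params, fine_tune_set, eval_set, train_step, reward_fn,
                    num_epochs, batch_size=32, seed=0):
    """Average sum reward on eval_set after 0, 1, ..., num_epochs fine-tuning epochs."""
    rng = np.random.default_rng(seed)
    # Epoch 0 is the initial (zero-shot) performance of the meta-trained parameters
    curve = [average_sum_reward(reward_fn(params, eval_set))]
    for _ in range(num_epochs):
        # Draw 32 fine-tuning samples per epoch; the full evaluation set is used every time
        idx = rng.choice(len(fine_tune_set), size=batch_size, replace=False)
        params = train_step(params, fine_tune_set[idx])
        curve.append(average_sum_reward(reward_fn(params, eval_set)))
    return curve

# Toy stand-ins: a scalar "parameter" that grows by 1 per epoch, 3 users per sample
toy_train_step = lambda p, batch: p + 1.0
toy_reward_fn = lambda p, data: np.full((len(data), 3), p)
curve = fine_tune_curve(0.0, np.zeros((40, 2)), np.zeros((1000, 2)),
                        toy_train_step, toy_reward_fn, num_epochs=2)
```

Recording the value at epoch 0 separately is what allows the initial performance and the sample efficiency to be read from the same curve.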

(a) Secrecy rate in long blocklength regime.
(b) Secrecy rate in short blocklength regime.
Figure 5: Meta-testing with unseen channel models.

In Fig. 5, we consider the average sum of secrecy rates and illustrate the impacts of the number of users, channel models, and coding blocklength on the initial performance and sample efficiency of different methods. The results in Fig. 5 show that HML achieves the best initial average sum secrecy rate and the highest sample efficiency compared with all the benchmarks. In Fig. 5(a), HML can converge in 8 training epochs, whereas both MAML and MTL transfer learning take more than 30 epochs to converge. Thus, HML can reduce the convergence time by up to 73%. After fine-tuning, the gap between the learning methods and the optimal solution is around 1.45%. In Fig. 5(b), the coding blocklength in meta-testing is also different from that in meta-training. As a result, the gap between the initial performance of HML and the optimal solution is 7.93%. After fine-tuning, the gap is reduced to 3.74%, which is larger than the gap in Fig. 5(a), where the blocklength is the same in meta-training and meta-testing.
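The 73% figure is consistent with the epoch counts quoted for Fig. 5(a). A short check, assuming the reduction is computed relative to a 30-epoch baseline (the text only states that the benchmarks need more than 30 epochs, so this baseline is an assumption):

```python
def relative_reduction(baseline, improved):
    """Fractional reduction of `improved` relative to `baseline`."""
    return (baseline - improved) / baseline

# HML: 8 epochs; MAML and MTL transfer learning: 30 epochs assumed as the baseline
print(f"{relative_reduction(30, 8):.1%}")  # 73.3%
```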

(a) Shannon capacity in long blocklength regime.
(b) Achievable rate in short blocklength regime.
Figure 6: Meta-testing with unseen QoS requirements of data rates and unseen channels.

Fig. 6 shows the average sum of data rates achieved by different methods. The results indicate that when the reward function and the QoS constraint in meta-testing are different from those in meta-training, the gaps between the initial performance of HML and the optimal solution increase to 13.77% and 14.93% in the long and short blocklength regimes, respectively. After fine-tuning, the gaps between the learning methods and the optimal solution are smaller than those in Fig. 5. This is because the Shannon capacity and the achievable rate are special cases of the secrecy rate in the long and short blocklength regimes, respectively, when the wiretapped channels are in deep fading. It is easier to learn a good policy when the problem becomes less complicated.
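The special-case relationship in the long blocklength regime can be checked numerically. The sketch below assumes the rate has the form $w\log_{2}(1+\zeta/w)$, which is inferred from the derivative in Appendix A rather than quoted from the system model, and it uses hypothetical values for the bandwidth and the two channel terms.

```python
import numpy as np

def rate_long_blocklength(w, zeta):
    """w * log2(1 + zeta / w): rate over bandwidth w (assumed form, see lead-in)."""
    return w * np.log2(1.0 + zeta / w)

def secrecy_rate_long_blocklength(w, zeta, zeta_e):
    """Rate of the legitimate link minus rate of the wiretapped link."""
    return rate_long_blocklength(w, zeta) - rate_long_blocklength(w, zeta_e)

w, zeta = 1.0e6, 5.0e6  # hypothetical bandwidth (Hz) and legitimate-link term
for zeta_e in (1.0e6, 1.0e3, 1.0, 0.0):
    # As the wiretapped channel fades (zeta_e -> 0), the gap to the Shannon capacity vanishes
    gap = rate_long_blocklength(w, zeta) - secrecy_rate_long_blocklength(w, zeta, zeta_e)
    print(f"zeta_e = {zeta_e:g}: capacity minus secrecy rate = {gap:.3e} bit/s")
```

When `zeta_e` reaches zero the subtracted term is exactly zero, so the secrecy-rate objective coincides with the Shannon-capacity objective.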

(a) Effective capacity in long blocklength regime.
(b) Effective capacity in short blocklength regime.
Figure 7: Meta-testing with unseen QoS requirements and unseen channels.

Fig. 7 shows the average sum of effective capacities achieved in the meta-testing stage, where the initial parameters of the GNN are obtained from meta-training, and the GNN is trained with tasks maximizing the sum secrecy rate in the long blocklength regime. In other words, the QoS requirement in meta-testing is a queuing delay requirement, which is quite different from the security requirement in meta-training. By comparing the results in Figs. 7 and 5, we can observe that the gaps between HML and the optimal solution in Fig. 7 are larger than the gaps in Fig. 5. Nevertheless, HML can still converge in around 10 to 30 epochs and outperforms the other benchmarks in Fig. 7.

(a) $r_{\tau}^{S,\mathcal{I}}=10$ Mbps in meta-training.
(b) $r_{\tau}^{S,\mathcal{I}}=1$ Mbps in meta-training.
(c) $r_{\tau}^{S,\mathcal{I}}\in\{1,\cdots,10\}$ Mbps in meta-training.
Figure 8: Meta-testing with dynamic secrecy rate requirements, $r_{\tau}^{S,\mathcal{I}}\in\{1,\cdots,10\}$ Mbps, where $W_{\tau}^{S,\mathcal{I}}=100$ MHz and $U_{\tau}^{S,\mathcal{I}}=10$.

V-C2 Meta-Testing with Different System Parameters

In this part, we focus on secrecy rates in the long blocklength regime in both meta-training and meta-testing, and change the values of $r_{\tau}^{S,\mathcal{I}}$, $W_{\tau}^{S,\mathcal{I}}$, and $U_{\tau}^{S,\mathcal{I}}$ to investigate their impacts on the initial performance and sample efficiency of HML in meta-testing.

Figure 9: Meta-testing with dynamic bandwidth $W_{\tau}^{S,\mathcal{I}}\in\{10,\cdots,100\}$ MHz in meta-training, where $r_{\tau}^{S,\mathcal{I}}=10$ Mbps and $U_{\tau}^{S,\mathcal{I}}=10$.
Figure 10: Meta-testing with different numbers of users $U_{\tau}^{S,\mathcal{I}}\in\{5,10,\cdots,50\}$, where $r_{\tau}^{S,\mathcal{I}}=10$ Mbps and $W_{\tau}^{S,\mathcal{I}}=100$ MHz.

In Fig. 8, we evaluate the initial performance and sample efficiency with different $r_{\tau}^{S,\mathcal{I}}$ in support sets and query sets in meta-training. Specifically, we set $r_{\tau}^{S,\mathcal{I}}$ to 10 Mbps and 1 Mbps in meta-training in Figs. 8(a) and 8(b), respectively. In Fig. 8(c), $r_{\tau}^{S,\mathcal{I}}$ is randomly selected from the set $\{1,\cdots,10\}$ Mbps in meta-training. In meta-testing, we increase $r_{\tau}^{S,\mathcal{I}}$ from 1 Mbps to 10 Mbps. The results in Figs. 8(a) and 8(b) indicate that the gaps between zero-shot learning (with 0 training epochs in meta-testing) and the optimal solution increase with the difference between $r_{\tau}^{S,\mathcal{I}}$ in meta-training and $r_{\tau}^{S,\mathcal{I}}$ in meta-testing. To increase the generalization ability, we can increase the diversity of tasks in meta-training as shown in Fig. 8(c). In this way, our GNN is near-optimal with zero-shot learning.

In Fig. 9, we validate the generalization ability of our GNN with dynamic bandwidth $W_{\tau}^{S,\mathcal{I}}$. In meta-training, $W_{\tau}^{S,\mathcal{I}}$ is randomly selected from the set $\{10,\cdots,100\}$ MHz. In meta-testing, we increase $W_{\tau}^{S,\mathcal{I}}$ from 10 to 100 MHz. The results in Fig. 9 show that our GNN is near-optimal with different values of $W_{\tau}^{S,\mathcal{I}}$. In Fig. 10, we further validate the generalization ability of our GNN with different numbers of users. In meta-training, the total number of users is randomly selected, $U_{\tau}^{S,\mathcal{I}}\in\{10,11,\cdots,30\}$. In meta-testing, we increase the total number of users from 5 to 50. The results in Fig. 10 show that the proposed HML can obtain a GNN that has strong generalization ability with different numbers of users. The gap between the GNN and the optimal policy increases slightly with $U_{\tau}^{S,\mathcal{I}}$. This is because the scale of the problem increases with $U_{\tau}^{S,\mathcal{I}}$, and it is more difficult to learn the bandwidth allocation policy of a large-scale problem compared with that of a small-scale problem.
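The reason a single set of parameters can be evaluated for 5 to 50 users is that the same weights are applied to every user node. The sketch below illustrates only this parameter sharing, using the layer widths 2/32/64/64/32/1 from Section V-A. It is not the authors' GNN: message passing between nodes is omitted, and the ReLU activation, the softmax normalization over users, the random input features, and all function names are assumptions.

```python
import numpy as np

LAYER_WIDTHS = [2, 32, 64, 64, 32, 1]  # neurons per layer, from Section V-A

def init_params(rng, widths=LAYER_WIDTHS):
    """One weight matrix and bias per layer, shared by every user node."""
    return [(rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(widths[:-1], widths[1:])]

def num_params(params):
    """Total number of trainable scalars; independent of the number of users."""
    return sum(W.size + b.size for W, b in params)

def allocate(params, features, total_bandwidth):
    """Map per-user features of shape (U, 2) to bandwidths of shape (U,) summing to total_bandwidth."""
    h = features
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)  # ReLU (assumed activation)
    W, b = params[-1]
    scores = (h @ W + b).ravel()
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()  # softmax over users (assumed normalization)
    return total_bandwidth * weights

rng = np.random.default_rng(0)
params = init_params(rng)
for num_users in (5, 10, 50):
    w = allocate(params, rng.normal(size=(num_users, 2)), total_bandwidth=100.0)
    print(num_users, num_params(params), round(float(w.sum()), 6))
```

With these widths the parameter count is 8481 for every value of `num_users`, while the output vector grows with the number of users.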

VI Conclusion

In this paper, we developed an HML approach to train a GNN-based scalable bandwidth allocation policy that can generalize well in various communication scenarios, including different numbers of users, wireless channels, QoS requirements, and bandwidth. The main idea is to train the initial parameters of the GNN with various tasks in meta-training, and then fine-tune the parameters with a few samples in meta-testing. Simulation results showed that the performance gap between the GNN and the optimal policy obtained by an iterative algorithm is less than 5% in most of the cases. For unseen communication scenarios, the GNN can converge in 10 to 30 training epochs, which is much faster than the existing benchmarks. Our approach can be extended beyond bandwidth allocation to other problems, such as power allocation, precoding, and repetitions. Nevertheless, the feature engineering and the structure of the GNN in other scenarios deserve further investigation.

Appendix A Proof of Concavity for Secrecy Rate in Long Blocklength Regimes 

To prove the concavity of the secrecy rate in long blocklength regimes, we only need to prove that the second derivative of the secrecy rate is negative. We first calculate the partial derivative of the secrecy rate of the $k$-th scheduled user as follows,

\begin{equation}
\begin{split}
\frac{\partial r_{k}^{S,\mathcal{I}}(w_{\tau,k}^{D,\mathcal{I}})}{\partial w_{\tau,k}^{D,\mathcal{I}}}
=&\frac{\partial\left(r_{k}^{D,\mathcal{I}}(w_{\tau,k}^{D,\mathcal{I}})-r_{k}^{e,\mathcal{I}}(w_{\tau,k}^{D,\mathcal{I}})\right)}{\partial w_{\tau,k}^{D,\mathcal{I}}}\\
=&\frac{-\zeta_{k}+\ln\left(1+\frac{\zeta_{k}}{w_{\tau,k}^{D,\mathcal{I}}}\right)(w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k})}{\ln(2)(w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k})}\\
&-\frac{-\zeta_{k}^{e}+\ln\left(1+\frac{\zeta_{k}^{e}}{w_{\tau,k}^{D,\mathcal{I}}}\right)(w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k}^{e})}{\ln(2)(w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k}^{e})}\\
=&\frac{\ln\left(\frac{w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k}}{w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k}^{e}}\right)}{\ln(2)}\\
&+\frac{(\zeta_{k}^{e}-\zeta_{k})w_{\tau,k}^{D,\mathcal{I}}}{\ln(2)(w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k})(w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k}^{e})},
\end{split}
\tag{18}
\end{equation}

where $\zeta_{k}=P_{k}h_{k}/N_{0}$ and $\zeta_{k}^{e}=P_{k}h_{k}^{e}/N_{0}$. Since the secrecy rate of the user increases with the allocated bandwidth, we have $\partial r_{k}^{S,\mathcal{I}}(w_{\tau,k}^{D,\mathcal{I}})/\partial w_{\tau,k}^{D,\mathcal{I}}>0$. The second derivative of $r_{k}^{S,\mathcal{I}}(w_{\tau,k}^{D,\mathcal{I}})$ can be derived as follows,

\begin{equation}
\begin{split}
\frac{\partial^{2}r_{k}^{S,\mathcal{I}}(w_{\tau,k}^{D,\mathcal{I}})}{\partial{w_{\tau,k}^{D,\mathcal{I}}}^{2}}
=&\frac{\partial}{\partial w_{\tau,k}^{D,\mathcal{I}}}\left(\frac{\partial r_{k}^{S,\mathcal{I}}(w_{\tau,k}^{D,\mathcal{I}})}{\partial w_{\tau,k}^{D,\mathcal{I}}}\right)\\
=&\frac{(\zeta_{k}^{e}-\zeta_{k})\left((\zeta_{k}^{e}+\zeta_{k})w_{\tau,k}^{D,\mathcal{I}}+2\zeta_{k}^{e}\zeta_{k}\right)}{\ln(2)(w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k})^{2}(w_{\tau,k}^{D,\mathcal{I}}+\zeta_{k}^{e})^{2}}.
\end{split}
\tag{19}
\end{equation}

For any scheduled user, we have $\zeta_{k}>\zeta_{k}^{e}$. Thus, $\partial^{2}r_{k}^{S,\mathcal{I}}(w_{\tau,k}^{D,\mathcal{I}})/\partial{w_{\tau,k}^{D,\mathcal{I}}}^{2}<0$. Therefore, $r_{k}^{S,\mathcal{I}}(w_{\tau,k}^{D,\mathcal{I}})$ is concave. This completes the proof. $\square$
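
As a sanity check on the derivation, the closed forms in (18) and (19) can be compared against finite differences. The following is a minimal sketch rather than part of the original proof. It assumes the long-blocklength rates take the form $w\log_2(1+\zeta/w)$, which is the form whose derivative matches the second line of (18). The function names, the values $\zeta_k=20$ and $\zeta_k^e=5$, and the bandwidth grid are hypothetical and chosen only for illustration.

```python
import math


def secrecy_rate(w, zeta, zeta_e):
    # Assumed rate model: r^S(w) = w*log2(1 + zeta/w) - w*log2(1 + zeta_e/w).
    return w * (math.log2(1 + zeta / w) - math.log2(1 + zeta_e / w))


def first_derivative(w, zeta, zeta_e):
    # Closed form taken from the last two lines of (18).
    log_term = math.log((w + zeta) / (w + zeta_e))
    frac_term = (zeta_e - zeta) * w / ((w + zeta) * (w + zeta_e))
    return (log_term + frac_term) / math.log(2)


def second_derivative(w, zeta, zeta_e):
    # Closed form taken from (19).
    num = (zeta_e - zeta) * ((zeta_e + zeta) * w + 2 * zeta_e * zeta)
    den = math.log(2) * (w + zeta) ** 2 * (w + zeta_e) ** 2
    return num / den


def central_difference(f, w, step=1e-4):
    # Second-order accurate numerical derivative of f at w.
    return (f(w + step) - f(w - step)) / (2 * step)


def check(zeta=20.0, zeta_e=5.0, grid=None):
    # Illustrative values only; the grid covers w from 1 to 100.
    grid = grid or [0.5 * i for i in range(2, 201)]
    worst_first = 0.0
    worst_second = 0.0
    all_increasing = True
    all_concave = True
    for w in grid:
        d1 = first_derivative(w, zeta, zeta_e)
        d2 = second_derivative(w, zeta, zeta_e)
        num_d1 = central_difference(lambda x: secrecy_rate(x, zeta, zeta_e), w)
        num_d2 = central_difference(lambda x: first_derivative(x, zeta, zeta_e), w)
        worst_first = max(worst_first, abs(num_d1 - d1))
        worst_second = max(worst_second, abs(num_d2 - d2))
        all_increasing = all_increasing and d1 > 0
        all_concave = all_concave and d2 < 0
    return worst_first, worst_second, all_increasing, all_concave


if __name__ == "__main__":
    print(check())
```

Swapping the two arguments so that $\zeta_{k}<\zeta_{k}^{e}$ flips the sign of the factor $(\zeta_{k}^{e}-\zeta_{k})$ in (19), so the second derivative becomes positive. This is consistent with the proof relying on $\zeta_{k}>\zeta_{k}^{e}$ for scheduled users.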

References

  • [1] X. Hao, P. L. Yeoh, Y. Liu, C. She, B. Vucetic, and Y. Li, “Graph neural network-based bandwidth allocation for secure wireless communications,” in Proc. 2023 IEEE Int. Conf. Commun. Workshops (ICC workshops), Rome, Italy, 2023, pp. 332–337.
  • [2] M. Z. Chowdhury, M. Shahjalal, S. Ahmed, and Y. M. Jang, “6G wireless communication systems: Applications, requirements, technologies, challenges, and research directions,” IEEE Open J. Commun. Soc., vol. 1, pp. 957–975, 2020.
  • [3] Y. Gu, C. She, Z. Quan, C. Qiu, and X. Xu, “Graph neural networks for distributed power allocation in wireless networks: Aggregation over-the-air,” IEEE Trans. Wireless Commun., Early access.
  • [4] J. Guo and C. Yang, “Learning power allocation for multi-cell-multi-user systems with heterogeneous graph neural networks,” IEEE Trans. Wireless Commun., vol. 21, no. 2, pp. 884–897, Feb. 2022.
  • [5] R. D-Mohammady, M. Y. Naderi, and K. R. Chowdhury, “Spectrum allocation and QoS provisioning framework for cognitive radio with heterogeneous service classes,” IEEE Trans. Wireless Commun., vol. 13, no. 7, pp. 3938–3950, Jul. 2014.
  • [6] B. Han, V. Sciancalepore, X. Costa-Pérez, D. Feng, and H. D. Schotten, “Multiservice-based network slicing orchestration with impatient tenants,” IEEE Trans. Wireless Commun., vol. 19, no. 7, pp. 5010–5024, Jul. 2020.
  • [7] L. Zanzi, V. Sciancalepore, A. Garcia-Saavedra, H. D. Schotten, and X. Costa-Pérez, “LACO: A latency-driven network slicing orchestration in beyond-5G networks,” IEEE Trans. Wireless Commun., vol. 20, no. 1, pp. 667–682, Jan. 2021.
  • [8] Y. Yuan, G. Zheng, K. -K. Wong, B. Ottersten, and Z. -Q. Luo, “Transfer learning and meta learning-based fast downlink beamforming adaptation,” IEEE Trans. Wireless Commun., vol. 20, no. 3, pp. 1742–1755, Mar. 2021.
  • [9] J. Zhang, Y. Yuan, G. Zheng, I. Krikidis, and K. -K. Wong, “Embedding model-based fast meta learning for downlink beamforming adaptation,” IEEE Trans. Wireless Commun., vol. 21, no. 1, pp. 149–162, Jan. 2022.
  • [10] R. Dong, C. She, W. Hardjawana, Y. Li, and B. Vucetic, “Deep learning for radio resource allocation with diverse quality-of-service requirements in 5G,” IEEE Trans. Wireless Commun., vol. 20, no. 4, pp. 2309–2324, Apr. 2021.
  • [11] H. Lee, J. Park, S. H. Lee, and I. Lee, “Message-passing based user association and bandwidth allocation in HetNets with wireless backhaul,” IEEE Trans. Wireless Commun., vol. 22, no. 1, pp. 704–717, Jan. 2023.
  • [12] Q. Xu, Z. Su, D. Fang, and Y. Wu, “Hierarchical bandwidth allocation for social community-oriented multicast in space-air-ground integrated networks,” IEEE Trans. Wireless Commun., vol. 22, no. 3, pp. 1915–1930, Mar. 2023.
  • [13] K. B. Letaief, Y. Shi, J. Lu, and J. Lu, “Edge artificial intelligence for 6G: Vision, enabling technologies, and applications,” IEEE J. Sel. Areas Commun., vol. 40, no. 1, pp. 5–36, Jan. 2022.
  • [14] C. She, C. Sun, Z. Gu, Y. Li, C. Yang, H. V. Poor, B. Vucetic, “A tutorial on ultrareliable and low-latency communications in 6G: Integrating domain knowledge into deep learning,” Proc. IEEE, vol. 109, no. 3, pp. 204–246, Mar. 2021.
  • [15] D. He, C. Liu, H. Wang, and T. Q. S. Quek, “Learning-based wireless powered secure transmission,” IEEE Wireless Commun. Lett., vol. 8, no. 2, pp. 600–603, Apr. 2019.
  • [16] C. Sun, C, She, and C. Yang, “Unsupervised deep learning for optimizing wireless systems with instantaneous and statistic constraints” in Ultra-reliable and low-latency communications (URLLC) theory and practice: Advances in 5G and beyond, 1st ed. Hoboken, NJ, USA: John Wiley&Sons, Ltd. 2023, ch. 4, pp. 85–118.
  • [17] J. Gilmer, S. S. Schoenholz, P. F. Riley, O. Vinyals, and G. E. Dahl, “Neural message passing for quantum chemistry,” in Proc. Int. Conf. Mach. Learn. (ICML), Sydney, Australia, pp. 1263–1272, Apr. 2017.
  • [18] Y. Liu, C. She, Y. Zhong, W. Hardjawana, F.-C. Zheng, and B. Vucetic, “Interference-limited ultra-reliable and low-latency communications: Graph neural networks or stochastic geometry?,” 2022, arXiv:2207.06918.
  • [19] Y. Shen, Y. Shi, J. Zhang, and K. B. Letaief, “Graph neural networks for scalable radio resource management: Architecture design and theoretical analysis,” IEEE J. Sel. Areas Commun., vol. 39, no. 1, pp. 101–115, Jan. 2021.
  • [20] J. Guo and C. Yang, “Deep neural networks with data rate model: Learning power allocation efficiently,” IEEE Trans. Commun., vol. 71, no. 3, pp. 1447–1461, Mar. 2023.
  • [21] C. Guo, L. Liang, and G. Y. Li, “Resource allocation for vehicular communications with low latency and high reliability,” IEEE Trans. Wireless Commun., vol. 18, no. 8, pp. 3887–3902, Aug. 2019.
  • [22] D. Wu and R. Negi, “Effective capacity: a wireless link model for support of quality of service,” IEEE Trans. Wireless Commun., vol. 2, no. 4, pp. 630-–643, Jul. 2003.
  • [23] J. Tang and X. Zhang, “Quality-of-service driven power and rate adaptation over wireless links,” IEEE Trans. Wireless Commun., vol. 6, no. 8, pp. 3058–3068, Aug. 2007.
  • [24] W. Yu, A. Chorti, L. Musavian, H. Vincent Poor, and Q. Ni, “Effective secrecy rate for a downlink NOMA network,” IEEE Trans. Wireless Commun., vol. 18, no. 12, pp. 5673–5690, Dec. 2019.
  • [25] H. Yang, Z. Xiong, J. Zhao, D. Niyato, L. Xiao, and Q. Wu, “Deep reinforcement learning based intelligent reflecting surface for secure wireless communications,” IEEE Trans. Wireless Commun., vol. 20, no. 1, pp. 375–388, Jan. 2021.
  • [26] C. Liu, J. Lee, and T. Q.S. Quek, “Safeguarding UAV communications against full-duplex active eavesdropper,” IEEE Trans. Wireless Commun., vol. 18, no. 6, pp. 2919–2931, Jun. 2019.
  • [27] H. -M. Wang, Q. Yang, Z. Ding, and H. V. Poor, “Secure short-packet communications for mission-critical IoT applications,” IEEE Trans. Wireless Commun., vol. 18, no. 5, pp. 2565–2578, May 2019.
  • [28] C. Li, C. She, N. Yang, and T. Q. S. Quek, “Secure transmission rate of short packets with queueing delay requirement,” IEEE Trans. Wireless Commun., vol. 21, no. 1, pp. 203–218, Jan. 2022.
  • [29] Y. Polyanskiy, H. V. Poor, and S. Verdu, “Channel coding rate in the finite blocklength regime,” IEEE Trans. Inf. Theory, vol. 56, no. 5, pp. 2307–2359, May 2010.
  • [30] M. Alsenwi, N. H. Tran, M. Bennis, S. R. Pandey, A. K. Bairagi, and C. S. Hong, “Intelligent resource slicing for eMBB and URLLC coexistence in 5G and beyond: A deep reinforcement learning based approach,” IEEE Trans. Wireless Commun., vol. 20, no. 7, pp. 4585–4600, Jul. 2021.
  • [31] G. Sun, Z. T. Gebrekidan, G. O. Boateng, D. A.-Mensah, and W. Jiang, “Dynamic reservation and deep reinforcement learning based autonomous resource slicing for virtualized radio access networks,” IEEE Access, vol. 7, pp. 45758–45772, 2019.
  • [32] T. T. Do, T. J. Oechtering, S. M. Kim, M. Skoglund, and G. Peters, “Uplink waveform channel with imperfect channel state information and finite constellation Input,” IEEE Trans. Wireless Commun., vol. 16, no. 2, pp. 1107–1119, Feb. 2017.
  • [33] Y. Lu, P. Cheng, Z. Chen, W. H. Mow, Y. Li and B. Vucetic, “Deep multi-task learning for cooperative NOMA: System design and principles,” IEEE J. Sel. Areas Commun., vol. 39, no. 1, pp. 61–78, Jan. 2021.
  • [34] M. Andrychowicz, M. Denil, S. Gomez, M. W. Hoffman, D. Pfau, T. Schaul, B. Shillingford, N. Freitas, “Learning to learn by gradient descent by gradient descent,” in Proc. 30th Conf. Neural Inf. Process. Syst. (NIPS 2016), Barcelona, Spain.
  • [35] A. Nichol, J. Achiam, and J. Schulman, “On first-order meta-learning algorithms,” 2018, arXiv:1803.02999.
  • [36] A. Raghu, M. Raghu, S. Bengio, and O. Vinyals, “Rapid learning or feature reuse? Towards understanding the effectiveness of MAML,” in Proc. Int. Conf. Learn. Representations (ICLR), Apr. 2020.
  • [37] C. Finn, P. Abbeel, and S. Levine, “Model-agnostic meta-learning for fast adaptation of deep networks,” in Proc. Int. Conf. Mach. Learn. (ICML), Sydney, Australia, pp. 1126–1135, 2017.
  • [38] L. Huang, L. Zhang, S. Yang, L. P. Qian, and Y. Wu, “Meta-learning based dynamic computation task offloading for mobile edge computing networks,” IEEE Commun. Lett., vol. 25, no. 5, pp. 1568–1572, May 2021.
  • [39] Y. Wang, M. Chen, Z. Yang, W. Saad, T. Luo, S. Cui, H. V. Poor, “Meta-reinforcement learning for reliable communication in THz/VLC wireless VR networks,” IEEE Trans. Wireless Commun., vol. 21, no. 9, pp. 7778–7793, Sept. 2022.
  • [40] A. Goldsmith, Wireless communications. Cambridge, U.K.: Cambridge Univ. Press, 2005.
  • [41] C. Xiong, G. Y. Li, Y. Liu, Y. Chen, and S. Xu, “Energy-efficient design for downlink OFDMA with delay-sensitive traffic,” IEEE Trans. Wireless Commun., vol. 12, no. 6, pp. 3085–3095, Jun. 2013.
  • [42] C. Sun, C. She, C. Yang, T. Q. S. Quek, Y. Li, and B. Vucetic, “Optimizing resource allocation in the short blocklength regime for ultra-reliable and low-latency communications,” IEEE Trans. Wireless Commun., vol. 18, no. 1, pp. 402–415, Jan. 2019.
  • [43] N. Ye, X. Li, H. Yu, L. Zhao, W. Liu, and X. Hou, “DeepNOMA: A unified framework for NOMA using deep multi-task learning,” IEEE Trans. Wireless Commun., vol. 19, no. 4, pp. 2208–2225, Apr. 2020.
  • [44] S. Boyd and L. Vandenberghe, Convex Optimization. New York, NY, USA: Cambridge Univ. Press, 2004.