Fronthaul Quantization-Aware MU-MIMO Precoding for Sum Rate Maximization

This work was supported by the Knut and Alice Wallenberg Foundation.

Yasaman Khorsandmanesh, Emil Björnson, and Joakim Jaldén
KTH Royal Institute of Technology, Stockholm, Sweden
Email: {yasamank, emilbjo, jalden}@kth.se
Abstract

This paper considers a multi-user multiple-input multiple-output (MU-MIMO) system where the precoding matrix is selected in a baseband unit (BBU) and then sent over a digital fronthaul to the transmitting antenna array. The fronthaul has a limited bit resolution with a known quantization behavior. We formulate a new sum rate maximization problem where the precoding matrix elements must comply with the quantizer. We solve this non-convex mixed-integer problem to local optimality by a novel iterative algorithm inspired by the classical weighted minimum mean square error (WMMSE) approach. The precoding optimization subproblem becomes an integer least-squares problem, which we solve with a new algorithm using a sphere decoding (SD) approach. We show numerically that the proposed precoding technique vastly outperforms the baseline of optimizing an infinite-resolution precoder and then quantizing it. We also develop a heuristic quantization-aware precoding that outperforms the baseline while having comparable complexity.

Index Terms:
Sum rate maximization, weighted minimum mean square error, quantization-aware precoding.

I Introduction

Multi-user multiple-input multiple-output (MU-MIMO) systems enable high data rates through spatial multiplexing of multiple user equipments (UEs) on the same time-frequency resource [1]. A base station (BS) equipped with multiple antennas and channel state information (CSI) can transmit simultaneously to several UEs using different beamforming directivity to increase the sum rate, which is controlled by the precoding method. In precoding matrix design, it is most common to maximize the sum rate, or even the weighted sum rate, under a constraint on the total transmit power [2]; however, sum rate maximization is known to be NP-hard [3]. One popular approach for sum-rate maximization is the iterative weighted minimum mean square error (WMMSE) algorithm, which finds locally optimal solutions with affordable computational complexity [4]. This paper utilizes a novel iterative algorithm inspired by WMMSE.

A 5G BS typically consists of two main components: an advanced antenna system (AAS) and a baseband unit (BBU). The AAS is a box containing the antenna elements and their respective radio units (RUs). The BBU performs the digital processing related to the received uplink data and transmitted downlink data. The AAS and BBU are connected through a digital fronthaul. The integration of antennas and radios into a single box has made massive MU-MIMO practically feasible [5] and enabled the BBU to be virtualized in an edge cloud through migration to the centralized radio access network architecture [6]. The new implementation bottleneck is the limited fronthaul capacity and the quantization errors it creates.

Both the uplink/downlink data and combining/precoding coefficients are sent over this digital fronthaul and must be quantized to a finite resolution. This paper proposes a novel linear block-level quantization-aware precoding technique that maximizes the sum rate.

I-A Prior Work

Downlink MU-MIMO systems have been widely studied in previous literature regarding impairments in analog hardware [7] and the effect of low-resolution digital-to-analog converters [8]. These prior works are characterized by distortion created either in the RU, analog domain, or converters. Therefore, the transmitted signal is distorted after the precoding. The effect of limited fronthaul capacity is studied in [9], but the precoding design was not quantized. In [10], the authors proposed a fronthaul quantization-aware precoding design that minimizes the sum MSE, which will generally not maximize the sum rate. Many previous studies suggest designing a precoding matrix by maximizing the sum rate, often using the WMMSE approach; see [11, 12] and references therein. Nevertheless, they consider ideal hardware or other types of distortion than precoding quantization.

I-B Contributions

This paper proposes a transmit precoding design that finds a local optimum to the sum-rate maximization problem subject to a transmit power constraint over a limited-capacity fronthaul connection that only accepts precoding matrix elements from a discrete quantization codebook. The main contributions are:

  • We propose maximizing the sum rate by solving a quantization-aware precoding problem. As this mixed-integer problem is non-convex, we rewrite it following the iterative WMMSE algorithm to find a local optimum. Each iteration of the proposed algorithm contains a new integer least-squares problem that minimizes the weighted MSE at each UE. The solution is obtained in a new way inspired by sphere decoding (SD). We also consider a reduced-complexity variation of the proposed algorithm where only the last iteration is quantization-aware.

  • We define quantization-unaware precoding as a baseline and then recommend a low-complexity heuristic algorithm to sequentially refine the quantization-unaware precoding columns to improve the sum rate. The complexity is comparable to quantization-unaware precoding; thus, massive MIMO scenarios can be effectively handled.

  • We provide numerical results that compare different quantization-aware algorithms with the quantization-unaware precoding baseline in terms of sum rate, considering correlated Rician fading and a uniform planar array (UPA).

II System model

We consider a single-cell MU-MIMO downlink system, where the BS contains an AAS with $M$ antenna-integrated radios and serves $K$ single-antenna UEs. The AAS is connected to a BBU through a limited-capacity fronthaul link, which is modeled as a finite-resolution quantizer. The precoding matrix $\bm{P}$ is computed, and the data symbol vector $\bm{s}$ is encoded at the BBU and then sent over the fronthaul to the AAS. As data symbols are bit sequences from a channel code, we can transmit them over the fronthaul without quantization errors as they are already quantized. We can then map the data symbols obtained from the BBU to modulation symbols at the AAS. However, the precoding matrix computed at the BBU based on CSI is quantized due to the digital fronthaul. The quantized precoding matrix is then multiplied with the UEs’ data symbols at the AAS, and finally, the product is transmitted wirelessly.

Before analyzing the proposed quantization-aware precoding in Section III, we introduce the problem formulation and quantization scheme in the following subsections.

II-A Problem Formulation

The BBU uses its available CSI to select the downlink precoding matrix $\bm{P}$. As the main focus of this work is on managing quantization errors in a precoding matrix that maximizes the sum rate, we consider perfect CSI. However, the same algorithms can be used if the BBU has imperfect CSI but treats it as perfect. We postpone the channel estimation part for future work. The transmitted data symbol to the UE $k$ is denoted by $s_k$, which has zero mean and normalized unit power, and the corresponding channel vector $\bm{h}^{\mathrm{T}}_k \in \mathbb{C}^{1 \times M}$ represents a narrowband channel and might be one subcarrier of a multi-carrier system. The algorithm developed in this paper can be applied individually to each subcarrier. The received signal at the UE $k$ is

$$
y_k = \bm{h}^{\mathrm{T}}_k \bm{p}_k s_k + \sum_{i=1, i \neq k}^{K} \bm{h}^{\mathrm{T}}_k \bm{p}_i s_i + n_k, \tag{1}
$$

where $\bm{p}_k \in \mathcal{P}^{M}$ is the quantized linear precoding vector for UE $k$ and $n_k \sim \mathcal{CN}(0, N_0)$ represents the independent additive complex Gaussian receiver noise with power $N_0$. For later use, we define the total received signal as $\bm{y} = [y_1, \ldots, y_K]^{\mathrm{T}}$, the data symbols vector as $\bm{s} = [s_1, \ldots, s_K]^{\mathrm{T}} \in \mathcal{O}^{K}$ ($\mathcal{O}$ is the finite set of constellation points such as a QAM alphabet), the channel matrix as $\bm{H} = [\bm{h}_1, \ldots, \bm{h}_K]^{\mathrm{T}} \in \mathbb{C}^{K \times M}$, and the precoding matrix as $\bm{P} = [\bm{p}_1, \ldots, \bm{p}_K] \in \mathcal{P}^{M \times K}$.

The fronthaul quantization alphabet set $\mathcal{P}$ is defined as

$$
\mathcal{P} = \{ l_R + j l_I : l_R, l_I \in \mathcal{L} \}. \tag{2}
$$

We assume the same quantization alphabet is used for the real and imaginary parts. Here $\mathcal{L} = \{l_0, \ldots, l_{L-1}\}$ is the set of real-valued quantization labels, $L = |\mathcal{L}|$ denotes the number of quantization levels, and $\bar{L} = \log_2(L)$ is the number of quantization bits per real dimension. Note that $\mathcal{P}$ becomes the complex-number set $\mathbb{C}$ in the case of infinite resolution. The quantized precoding matrix $\bm{P}$ and the data symbols vector $\bm{s}$ are sent separately over the fronthaul. The precoded signal vector $\bm{x}$ is calculated at the AAS as $\bm{x} = \alpha \bm{P} \bm{s}$, where the scaling factor $\alpha = \sqrt{q / \mathrm{tr}(\bm{P}\bm{P}^{\mathrm{H}})}$ is computed at the AAS, $q$ denotes the maximum transmit power, and $\mathrm{tr}(\cdot)$ denotes the matrix trace. This scaling factor ensures that $\mathbb{E}[\|\bm{x}\|_2^2] = q$, where $\|\cdot\|_2$ is the Euclidean norm, so that the maximum power is always utilized, despite the finite-resolution quantization.

The achievable rate is $\log_2(1 + \mathrm{SINR}_k(\bm{P}))$, where the signal-to-interference-plus-noise ratio (SINR) depends on the precoding matrix $\bm{P}$ as

$$
\mathrm{SINR}_k(\bm{P}) = \frac{\big| \bm{h}^{\mathrm{T}}_k \bm{p}_k \big|^2}{\sum_{i=1, i \neq k}^{K} \big| \bm{h}^{\mathrm{T}}_k \bm{p}_i \big|^2 + N_0}. \tag{3}
$$
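As a minimal numerical sketch of (3), assuming NumPy and the hypothetical helper names `sinr` and `sum_rate` (neither appears in the paper), the per-UE SINR and the resulting sum rate can be evaluated for a channel matrix `H` of size K x M and a precoding matrix `P` of size M x K:

```python
import numpy as np

def sinr(H, P, N0):
    """Per-UE SINR as in (3). H is K x M (rows h_k^T), P is M x K (columns p_k)."""
    G = np.abs(H @ P) ** 2                 # G[k, i] = |h_k^T p_i|^2
    signal = np.diag(G)                    # desired-signal power of each UE
    interference = G.sum(axis=1) - signal  # power leaked from the other UEs' precoders
    return signal / (interference + N0)

def sum_rate(H, P, N0):
    """Sum over all UEs of log2(1 + SINR_k)."""
    return float(np.sum(np.log2(1.0 + sinr(H, P, N0))))

# Toy example (assumed values): two UEs with orthogonal channels and matched precoders.
H = np.eye(2, dtype=complex)
P = np.eye(2, dtype=complex)
rate = sum_rate(H, P, N0=1.0)   # each UE has SINR = 1, so the sum rate is 2.0
```

The helper evaluates (3) exactly as written, so any power scaling of the precoder has to be applied to `P` before calling it.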

We want to maximize the sum rate of this downlink channel under the mentioned maximum transmit power constraint, and we define this problem as

$$
\mathbb{P}_1: \quad \underset{\bm{P} \in \mathcal{P}^{M \times K}}{\mathrm{maximize}} \ \sum_{k=1}^{K} \log_2 \left( 1 + \mathrm{SINR}_k(\bm{P}) \right) \quad \text{subject to} \quad \mathrm{tr}(\bm{P}\bm{P}^{\mathrm{H}}) \leq q,
$$

where the optimization variable is $\bm{P}$ with elements $p_{m,k} \in \mathcal{P}$ for $k = 1, \ldots, K$ and $m = 1, \ldots, M$. Problem $\mathbb{P}_1$ is not convex since the utility is non-concave and the search space is discrete, so it is hard to find the optimal solution [11].

II-B Quantization Scheme

We model the digital fronthaul as a quantizer. In practice, uniform quantization is often used, so we assume our quantizer function $\mathcal{Q}(\cdot): \mathbb{C} \to \mathcal{P}$ is a symmetric uniform quantizer with step size $\Delta$. Each entry of the quantization labels $\mathcal{L}$ is defined as

$$
l_z = \Delta \left( z - \frac{L-1}{2} \right), \quad z = 0, \ldots, L-1. \tag{4}
$$

Furthermore, we let $\mathcal{T} = \{\tau_0, \ldots, \tau_L\}$, where $-\infty = \tau_0 < \tau_1 < \ldots < \tau_{L-1} < \tau_L = \infty$, specify the set of the $L+1$ quantization thresholds. For uniform quantizers, the thresholds are

$$
\tau_z = \Delta \left( z - \frac{L}{2} \right), \quad z = 1, \ldots, L-1. \tag{5}
$$

The quantizer function $\mathcal{Q}(\cdot)$ can be uniquely described by the set of quantization labels $\mathcal{L} = \{l_z : z = 0, \ldots, L-1\}$ and the set of quantization thresholds $\mathcal{T}$. The quantizer maps an input $r \in \mathbb{C}$ to the quantized output $\mathcal{Q}(r) = l_o + j l_l \in \mathcal{P}$, where the set is defined in (2), if $\Re\{r\} \in [\tau_o, \tau_{o+1})$ and $\Im\{r\} \in [\tau_l, \tau_{l+1})$. The step size $\Delta$ of the quantizer should be chosen to minimize the distortion between the quantized output and the unquantized input. The optimal step size $\Delta$ depends on the dynamic range of the input, which in our case depends on the precoding scheme and channel model. We select the step size to minimize the distortion under the maximum-entropy assumption that each input element to the quantizer is distributed as $\mathcal{CN}(0, \frac{q}{KM})$, where the variance is selected so that the sum power of the elements matches the power constraint in $\mathbb{P}_1$. The corresponding optimal step size for the normal distribution was numerically found in [13].
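The labels (4), the thresholds (5), and the mapping of a complex input to the alphabet in (2) can be sketched as follows. This is an illustrative NumPy sketch: the function names `uniform_quantizer` and `quantize` are hypothetical, and the step size is passed in as a plain argument rather than taken from [13].

```python
import numpy as np

def uniform_quantizer(L, delta):
    """Labels (4) and finite thresholds (5) of a symmetric uniform quantizer."""
    labels = delta * (np.arange(L) - (L - 1) / 2)      # l_0, ..., l_{L-1}
    thresholds = delta * (np.arange(1, L) - L / 2)     # tau_1, ..., tau_{L-1}
    return labels, thresholds

def quantize(r, labels, thresholds):
    """Quantize the real and imaginary parts of r independently."""
    r = np.asarray(r, dtype=complex)
    # side="right" returns the index z with input in [tau_z, tau_{z+1});
    # tau_0 = -inf and tau_L = +inf are handled implicitly by the array ends.
    re = labels[np.searchsorted(thresholds, r.real, side="right")]
    im = labels[np.searchsorted(thresholds, r.imag, side="right")]
    return re + 1j * im

# Assumed toy setting: L = 4 levels (2 bits per real dimension), step size 1.
labels, thresholds = uniform_quantizer(L=4, delta=1.0)
# labels are -1.5, -0.5, 0.5, 1.5 and thresholds are -1, 0, 1
out = quantize(0.3 - 2.0j, labels, thresholds)   # 0.5 - 1.5j
```

Inputs beyond the outermost thresholds are clipped to the outermost labels, which is why the choice of step size relative to the input dynamic range matters.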

III Proposed WMMSE Algorithm

As the optimization problem $\mathbb{P}_1$ is non-convex, the global optimal solution is challenging to find. We instead target finding a local optimum. Inspired by [4], we will rewrite $\mathbb{P}_1$ as an equivalent iterative WMMSE problem for which a local optimum can be found through alternating optimization. In the following, we decompose this equivalent optimization problem into a sequence of convex subproblems.

Let $\hat{s}_k = \beta_k y_k$ denote the estimate at UE $k$ of the transmitted data symbol $s_k$. It is obtained from the received signal $y_k$ using the receiver gain $\beta_k \in \mathbb{C}$ (also known as the precoding factor [8]). For a given receiver gain, the MSE in the data detection at UE $k$ becomes

$$
\begin{aligned}
e_k(\bm{P}, \beta_k) &= \mathbb{E}\left[ |s_k - \hat{s}_k|^2 \right] \\
&= |\beta_k|^2 \left( \left| \bm{h}^{\mathrm{T}}_k \bm{p}_k \right|^2 + \sum_{i=1, i \neq k}^{K} \left| \bm{h}^{\mathrm{T}}_k \bm{p}_i \right|^2 + N_0 \right) - 2 \Re\left( \beta_k \bm{h}^{\mathrm{T}}_k \bm{p}_k \right) + 1.
\end{aligned} \tag{6}
$$

The MSE in (6) is a convex function of $\beta_k$. We can select the value of $\beta_k$ that minimizes the MSE for given $\bm{P}$ as

$$
\bar{\beta}_k(\bm{P}) = \frac{(\bm{h}^{\mathrm{T}}_k \bm{p}_k)^{*}}{\left| \bm{h}^{\mathrm{T}}_k \bm{p}_k \right|^2 + \sum_{i=1, i \neq k}^{K} \left| \bm{h}^{\mathrm{T}}_k \bm{p}_i \right|^2 + N_0}. \tag{7}
$$

By plugging the optimal $\bar{\beta}_k$ into (6), we can see that $e_k(\bm{P}, \bar{\beta}_k)$ is equal to $1/(1 + \mathrm{SINR}_k(\bm{P}))$.
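The relation between the minimum MSE and the SINR can be checked numerically. The following NumPy sketch uses hypothetical helper names (`mse`, `mmse_gain`) and randomly drawn, unquantized `H` and `P` purely for illustration; it evaluates (6) and (7) and compares the result with $1/(1+\mathrm{SINR}_k)$.

```python
import numpy as np

def mse(H, P, beta, N0):
    """Per-UE MSE (6) for receiver gains beta (length-K complex vector)."""
    G = H @ P                                       # G[k, i] = h_k^T p_i
    total = np.sum(np.abs(G) ** 2, axis=1) + N0     # signal + interference + noise
    return np.abs(beta) ** 2 * total - 2 * np.real(beta * np.diag(G)) + 1

def mmse_gain(H, P, N0):
    """MSE-minimizing receiver gains (7)."""
    G = H @ P
    total = np.sum(np.abs(G) ** 2, axis=1) + N0
    return np.conj(np.diag(G)) / total

# Assumed toy dimensions and random data, not taken from the paper.
rng = np.random.default_rng(0)
K, M, N0 = 3, 4, 0.5
H = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))
P = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))

beta = mmse_gain(H, P, N0)
G2 = np.abs(H @ P) ** 2
sinr = np.diag(G2) / (G2.sum(axis=1) - np.diag(G2) + N0)

# With the gains from (7), the MSE of every UE equals 1 / (1 + SINR_k).
identity_holds = np.allclose(mse(H, P, beta, N0), 1 / (1 + sinr))
```

Any other choice of gains can only increase the MSE, since (6) is convex in each gain and (7) is its minimizer.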

Now by defining the auxiliary weight $d_k \geq 0$, we formulate a weighted sum MMSE problem subject to the same total transmit power constraint as in $\mathbb{P}_1$:

$$
\mathbb{P}_2: \quad \underset{\bm{P} \in \mathcal{P}^{M \times K},\, \bm{\beta},\, \bm{d}}{\mathrm{minimize}} \ \sum_{k=1}^{K} \left( d_k e_k(\bm{P}, \beta_k) - \log_2(d_k) \right) \quad \text{subject to} \quad \mathrm{tr}(\bm{P}\bm{P}^{\mathrm{H}}) \leq q,
$$

where $\bm{\beta} = [\beta_1, \ldots, \beta_K]^{\mathrm{T}}$ is a vector containing all receiver gains and $\bm{d} = [d_1, \ldots, d_K]^{\mathrm{T}}$ is a vector containing all the UE weights in the weighted MSE. The problem $\mathbb{P}_2$ is equivalent to $\mathbb{P}_1$ in the sense that the optimal $\bm{P}$ is the same for both problems. This equivalence comes from the fact that the optimal weight for the UE $k$ is

$$
\bar{d}_k = \frac{1}{e_k(\bm{P}, \bar{\beta}_k(\bm{P}))} = 1 + \mathrm{SINR}_k(\bm{P}), \tag{8}
$$

so the objective of $\mathbb{P}_2$ then becomes $K - \sum_{k=1}^{K} \log_2(1 + \mathrm{SINR}_k(\bm{P}))$.
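The substitution behind this expression can be reproduced numerically. The sketch below (NumPy, with assumed random data and illustrative variable names) sets the gains via (7) and the weights via (8), then compares the weighted objective with $K$ minus the sum rate.

```python
import numpy as np

# Assumed toy dimensions and random data, not taken from the paper.
rng = np.random.default_rng(1)
K, M, N0 = 3, 4, 0.5
H = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))
P = rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))

G = H @ P                                        # G[k, i] = h_k^T p_i
total = np.sum(np.abs(G) ** 2, axis=1) + N0      # signal + interference + noise
signal = np.abs(np.diag(G)) ** 2
sinr = signal / (total - signal)                 # SINR as in (3)

beta = np.conj(np.diag(G)) / total               # receiver gains from (7)
e = np.abs(beta) ** 2 * total - 2 * np.real(beta * np.diag(G)) + 1   # MSE (6)
d = 1.0 / e                                      # weights from (8)

objective_p2 = np.sum(d * e - np.log2(d))        # weighted objective at these beta, d
sum_rate = np.sum(np.log2(1 + sinr))
matches = np.isclose(objective_p2, K - sum_rate)
```

Since every product `d * e` equals one at these values, the first term contributes exactly `K` and the logarithmic term reproduces the negative sum rate.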

The cost function in $\mathbb{P}_2$ is convex in each individual optimization variable, which is the key reason for considering this equivalent problem formulation.

For fixed $\beta_k$ and $d_k$ (e.g., calculated as in (7) and (8)), we have the WMMSE problem

$$
\mathbb{P}_3: \quad \underset{\bm{P} \in \mathcal{P}^{M \times K}}{\mathrm{minimize}} \ \sum_{k=1}^{K} d_k e_k(\bm{P}, \beta_k) \quad \text{subject to} \quad \mathrm{tr}(\bm{P}\bm{P}^{\mathrm{H}}) \leq q,
$$

which is a mixed-integer convex problem. It can be solved using general-purpose methods, such as CVX [14]. By iterating between updating $\beta_k$ using (7), $d_k$ using (8), and $\bm{P}$ by solving $\mathbb{P}_3$, we obtain a block coordinate descent algorithm that will converge to a stationary point (for the same reasons as in [4]). This algorithm is summarized in Figure 1.

We can initialize the algorithm using any precoding matrix, including those obtained using classical infinite-resolution precoding schemes. The Wiener filtering (WF) precoding scheme (also known as regularized zero-forcing) is the most desirable one in this context since it can be derived by minimizing the sum MSE [15]. Hence, we suggest setting the initial precoding matrix as $\bm{P}_{\mathrm{initial}} = \bm{H}^{\mathrm{H}} (\bm{H}\bm{H}^{\mathrm{H}} + \frac{K N_0}{q} \bm{I}_K)^{-1}$.
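A minimal NumPy sketch of this initialization is given below. The function names `wf_precoder` and `power_scaling` are hypothetical, and the second helper simply evaluates the scaling factor $\alpha = \sqrt{q/\mathrm{tr}(\bm{P}\bm{P}^{\mathrm{H}})}$ from Section II-A; the result is an infinite-resolution matrix that has not yet been quantized.

```python
import numpy as np

def wf_precoder(H, N0, q):
    """P_initial = H^H (H H^H + (K N0 / q) I_K)^{-1}, with H of size K x M."""
    K = H.shape[0]
    A = H @ H.conj().T + (K * N0 / q) * np.eye(K)
    # Solve A X = H instead of forming the inverse; since A is Hermitian,
    # X^H = H^H A^{-1}, which is the desired M x K precoding matrix.
    return np.linalg.solve(A, H).conj().T

def power_scaling(P, q):
    """Scaling factor alpha = sqrt(q / tr(P P^H))."""
    return np.sqrt(q / np.real(np.trace(P @ P.conj().T)))

# Assumed toy dimensions and random channel, not taken from the paper.
rng = np.random.default_rng(0)
K, M, N0, q = 3, 8, 0.1, 1.0
H = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))

P_init = wf_precoder(H, N0, q)        # shape (M, K)
alpha = power_scaling(P_init, q)      # alpha * P_init has total power q
```

Setting `N0 = 0` in this sketch removes the regularization, in which case `H @ P_init` reduces to the identity matrix (zero-forcing).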

[Figure 1 flowchart: Start → initialize arbitrary $\bm{P}$, set the number of iterations $N$ and $n = 1$ → set $\beta_k$ using (7), $k = 1, \ldots, K$ → set $d_k$ using (8), $k = 1, \ldots, K$ → solve problem $\mathbb{P}_3$ → if $n$ is equal to $N$, end; otherwise update $\bm{P}$, set $n = n + 1$, and repeat from the $\beta_k$ step.]
Figure 1: Flowchart of the proposed iterative algorithm for solving WMMSE problem $\mathbb{P}_2$. Here $N$ is the maximum number of iterations.

III-A Efficient Implementation of Quantization-Aware Precoding

The main complexity in the proposed algorithm originates from solving 3subscript3\mathbb{P}_{3}blackboard_P start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT. Instead of using a general-purpose solver, we will propose an efficient dedicated algorithm. We can rewrite the objective function (III) as

k=1Kdkek(𝑷,βk)=𝔼[diag(𝒅)(𝒔diag(𝜷)𝒚)22],superscriptsubscript𝑘1𝐾subscript𝑑𝑘subscript𝑒𝑘𝑷subscript𝛽𝑘𝔼delimited-[]superscriptsubscriptnormdiag𝒅𝒔diag𝜷𝒚22\displaystyle\sum_{k=1}^{K}{d}_{k}e_{k}\left(\bm{P},{\beta}_{k}\right)=\mathbb% {E}\left[\left\|\sqrt{\mathrm{diag}({\bm{d}})}\Big{(}\bm{s}-\mathrm{diag}({\bm% {\beta}})\bm{y}\Big{)}\right\|_{2}^{2}\right],∑ start_POSTSUBSCRIPT italic_k = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_K end_POSTSUPERSCRIPT italic_d start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT italic_e start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( bold_italic_P , italic_β start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ) = blackboard_E [ ∥ square-root start_ARG roman_diag ( bold_italic_d ) end_ARG ( bold_italic_s - roman_diag ( bold_italic_β ) bold_italic_y ) ∥ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ] , (9)

where diag(𝒅)diag𝒅\mathrm{diag}(\bm{d})roman_diag ( bold_italic_d ) is a diagonal matrix with elements from the vector 𝒅𝒅\bm{d}bold_italic_d with UE weights on the main diagonal. The expression in (9) can be expanded as

$$
\begin{aligned}
&\mathbb{E}\left[\left\|\sqrt{\mathrm{diag}(\bm{d})}\bm{s}-\sqrt{\mathrm{diag}(\bm{d})}\mathrm{diag}(\bm{\beta})\bm{y}\right\|_2^2\right] \\
&=\mathrm{tr}\Big(\mathrm{diag}(\bm{d})-\sqrt{\mathrm{diag}(\bm{d})}\bm{D}\bm{H}\bm{P}-\sqrt{\mathrm{diag}(\bm{d})}\bm{P}^{\mathrm{H}}\bm{H}^{\mathrm{H}}\bm{D}^{\mathrm{H}} \\
&\quad+\bm{D}\bm{H}\bm{P}\bm{P}^{\mathrm{H}}\bm{H}^{\mathrm{H}}\bm{D}^{\mathrm{H}}+N_0\bm{D}\bm{D}^{\mathrm{H}}\Big),
\end{aligned}
\tag{10}
$$

where we introduce the notation $\bm{D}=\sqrt{\mathrm{diag}(\bm{d})}\,\mathrm{diag}(\bm{\beta})$.
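The expansion in (10) can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes the received signal model $\bm{y}=\bm{H}\bm{P}\bm{s}+\bm{n}$ with unit-variance symbols and noise variance $N_0$ (consistent with the terms in (10)), and all sizes, seeds, and variable names are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
M, K, N0 = 4, 2, 0.1          # small illustrative sizes (assumed)

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H = crandn(K, M)              # channel matrix, K x M
P = crandn(M, K)              # an arbitrary precoding matrix, M x K
d = rng.uniform(0.5, 2.0, K)  # UE weights d_k
beta = crandn(K)              # receiver gains beta_k

sqrt_d = np.diag(np.sqrt(d))
D = sqrt_d @ np.diag(beta)    # D = sqrt(diag(d)) diag(beta)
A = D @ H @ P                 # shorthand for D H P

# Closed form: the trace expression on the right-hand side of (10).
closed_form = np.trace(
    np.diag(d) - sqrt_d @ A - sqrt_d @ A.conj().T
    + A @ A.conj().T + N0 * D @ D.conj().T
).real

# Monte Carlo estimate of the expectation in (9), assuming y = H P s + n.
T = 200_000
s = crandn(K, T)
n = np.sqrt(N0) * crandn(K, T)
y = H @ P @ s + n
err = sqrt_d @ (s - np.diag(beta) @ y)
monte_carlo = np.mean(np.sum(np.abs(err) ** 2, axis=0))

print(closed_form, monte_carlo)
```

The two printed values should agree up to Monte Carlo noise, which illustrates that the weighted sum MSE depends on $\bm{P}$ only through a quadratic form.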

We first notice that $\mathbb{P}_3$ is a so-called integer least-squares problem due to the uniform quantizer that we chose. The search space is a scaled finite subset of the infinite integer lattice. Sphere decoding (SD) has previously been proposed as an efficient technique for solving such closest lattice point problems in the Euclidean sense, and it has significantly lower average computational complexity than a naive exhaustive search [16]. The basic principle of SD is to restrict the search to the points of the skewed lattice that lie within a hypersphere of radius $d$, which speeds up the search without loss of optimality. In this paper, we consider the Schnorr-Euchner SD (SESD) algorithm [17], where the enumeration sorts candidate symbols in a zig-zag manner. SESD improves the basic SD algorithm by first checking the smallest child node of the parent node in each layer, because the first feasible solution found is often quite good and quickly enables a reduction of the search radius. Thus, many branches can be pruned, and the computational complexity can be further lowered.
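The search principle described above can be sketched in a few lines. This is a simplified, real-valued illustration under our own assumptions, not the implementation used in the paper: the function name `sesd`, the toy alphabet, and the problem sizes are hypothetical, and a complex-valued problem would first have to be mapped to a real-valued one (for example by stacking real and imaginary parts).

```python
import itertools
import numpy as np

def sesd(c, G, alphabet):
    """Depth-first search for min_x ||c - G x||^2 over x in alphabet^n.

    G must be real, square, and upper triangular with a nonzero diagonal.
    """
    n = G.shape[1]
    alphabet = np.asarray(alphabet, dtype=float)
    best = {"dist": np.inf, "x": None}
    x = np.zeros(n)

    def search(level, partial):
        # Centre of this layer after cancelling the already fixed entries.
        centre = (c[level] - G[level, level + 1:] @ x[level + 1:]) / G[level, level]
        # Schnorr-Euchner ordering: candidates closest to the centre first.
        for a in alphabet[np.argsort(np.abs(alphabet - centre))]:
            dist = partial + (G[level, level] * (centre - a)) ** 2
            if dist >= best["dist"]:
                break  # later candidates are even farther away: prune
            x[level] = a
            if level == 0:
                best["dist"], best["x"] = dist, x.copy()  # shrink the radius
            else:
                search(level - 1, dist)

    search(n - 1, 0.0)
    return best["x"], best["dist"]

# Toy problem, small enough for an exhaustive comparison.
rng = np.random.default_rng(1)
n = 4
alphabet = [-1.5, -0.5, 0.5, 1.5]
G = np.triu(rng.standard_normal((n, n)), k=1) + np.diag(rng.uniform(1, 2, n))
c = 2 * rng.standard_normal(n)
x_sd, d_sd = sesd(c, G, alphabet)

brute = min(np.sum((c - G @ np.array(p)) ** 2)
            for p in itertools.product(alphabet, repeat=n))
print(x_sd, d_sd, brute)
```

The search radius starts at infinity and shrinks every time a full candidate is found, so the result matches the exhaustive search while typically visiting far fewer points.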

We propose using the SESD algorithm to approximately solve $\mathbb{P}_3$ for fixed values of $\beta_k$ and $d_k$. However, the classical SD framework does not contain any power constraint. To bring $\mathbb{P}_3$ into the form required by SD algorithms, we proceed as follows. First, we rewrite the objective function (10) using the Lagrange multiplier $\lambda$ as

$$
\begin{aligned}
\mathfrak{L}(\bm{P},\bm{\beta},\lambda)&=\mathrm{tr}\Big(\mathrm{diag}(\bm{d})-\sqrt{\mathrm{diag}(\bm{d})}\bm{D}\bm{H}\bm{P} \\
&\quad-\sqrt{\mathrm{diag}(\bm{d})}\bm{P}^{\mathrm{H}}\bm{H}^{\mathrm{H}}\bm{D}^{\mathrm{H}}+\bm{D}\bm{H}\bm{P}\bm{P}^{\mathrm{H}}\bm{H}^{\mathrm{H}}\bm{D}^{\mathrm{H}} \\
&\quad+N_0\bm{D}\bm{D}^{\mathrm{H}}\Big)+\lambda\big(\mathrm{tr}(\bm{P}\bm{P}^{\mathrm{H}})-q\big).
\end{aligned}
\tag{11}
$$

When minimizing (11) with respect to $\bm{P}$ and $\lambda$, we can drop the terms $\mathrm{tr}(\mathrm{diag}(\bm{d}))$ and $N_0\mathrm{tr}(\bm{D}\bm{D}^{\mathrm{H}})$, which do not depend on the optimization variables, and obtain problem $\mathbb{P}_4$, given in (III-A). Although strong duality does not hold for the integer least-squares problem $\mathbb{P}_3$, $\mathbb{P}_4$ provides a good approximation. To solve $\mathbb{P}_4$, we can use the SD algorithm for a fixed value of $\lambda$. We then perform a bisection search over $\lambda$ to find the value whose solution satisfies the power constraint in (III) with near equality.
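The bisection over $\lambda$ can be sketched as follows. This is a toy illustration under stated assumptions rather than the paper's procedure: the problem is real-valued and has a single column, the SD step is replaced by a brute-force search (`solve_fixed_lambda`, a hypothetical stand-in), and the alphabet contains a zero level so that any power budget $q\geq 0$ is reachable for large $\lambda$.

```python
import itertools
import numpy as np

def solve_fixed_lambda(V, f, lam, alphabet):
    """Brute-force stand-in for the SD step: minimise
    p^T (V + lam I) p - 2 f^T p over p in alphabet^n (tiny n only)."""
    n = len(f)
    A = V + lam * np.eye(n)
    cands = np.array(list(itertools.product(alphabet, repeat=n)))
    costs = np.einsum("ij,jk,ik->i", cands, A, cands) - 2 * cands @ f
    return cands[np.argmin(costs)]

def bisect_lambda(V, f, q, alphabet, lam_hi=1.0, iters=40):
    """Smallest lam >= 0 (up to tolerance) whose solution has power <= q."""
    def power(lam):
        return np.sum(solve_fixed_lambda(V, f, lam, alphabet) ** 2)
    if power(0.0) <= q:
        return 0.0                      # power constraint is inactive
    while power(lam_hi) > q:            # grow the bracket until feasible
        lam_hi *= 2.0
        if lam_hi > 1e12:
            raise ValueError("power budget q is not reachable")
    lam_lo = 0.0
    for _ in range(iters):
        mid = 0.5 * (lam_lo + lam_hi)
        if power(mid) > q:
            lam_lo = mid                # still too much power: increase lam
        else:
            lam_hi = mid                # feasible: try a smaller penalty
    return lam_hi

rng = np.random.default_rng(4)
alphabet = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical toy levels
B = rng.standard_normal((5, 3))
V = B.T @ B                             # positive definite with probability one
p0 = np.array([1.0, 1.0, -1.0])
f = V @ p0                              # makes p0 the minimiser for lam = 0
q = 1.0                                 # power budget below ||p0||^2 = 3
lam = bisect_lambda(V, f, q, alphabet)
p = solve_fixed_lambda(V, f, lam, alphabet)
print(lam, p, np.sum(p ** 2))
```

The sketch relies on the power of the penalized minimizer being non-increasing in $\lambda$, which is what makes a bisection meaningful even though the search space is discrete.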

$$
\mathbb{P}_4:\quad \underset{\bm{P}\in\mathcal{P}^{M\times K},\ \lambda\geq 0}{\mathrm{minimize}}\quad \mathrm{tr}\Big(\bm{P}^{\mathrm{H}}\big(\bm{H}^{\mathrm{H}}\bm{D}^{\mathrm{H}}\bm{D}\bm{H}+\lambda\bm{I}_M\big)\bm{P}-\sqrt{\mathrm{diag}(\bm{d})}\bm{D}\bm{H}\bm{P}-\big(\sqrt{\mathrm{diag}(\bm{d})}\bm{D}\bm{H}\bm{P}\big)^{\mathrm{H}}\Big)-\lambda q.
$$

For a fixed value of $\lambda$, and by vectorizing $\mathbb{P}_4$ and using $\bm{f}=\mathrm{vec}\big((\sqrt{\mathrm{diag}(\bm{d})}\bm{D}\bm{H})^{\mathrm{T}}\big)$, we obtain $\mathbb{P}_5$,

$$
\mathbb{P}_5:\quad \underset{\bm{p}_i\in\mathcal{P}^{M},\ i=1,\ldots,K}{\mathrm{minimize}}\quad \sum_{i=1}^{K}\Big(\bm{p}_i^{\mathrm{H}}\big(\bm{H}^{\mathrm{H}}\bm{D}^{\mathrm{H}}\bm{D}\bm{H}+\lambda\bm{I}_M\big)\bm{p}_i-\bm{f}_i^{\mathrm{T}}\bm{p}_i-\big(\bm{f}_i^{\mathrm{T}}\bm{p}_i\big)^{\mathrm{H}}\Big),
$$

which finds a suboptimal precoding matrix and has $K$ separable objective functions, each depending on only one of the optimization variables. This feature enables parallel optimization of $\bm{p}_i$ for $i=1,\ldots,K$. Thus, in addition to the more efficient search strategy, the reformulation into $\mathbb{P}_5$ also significantly reduces the dimension of each subproblem [10]. By defining $\hat{\bm{V}}=\bm{H}^{\mathrm{H}}\bm{D}^{\mathrm{H}}\bm{D}\bm{H}+\lambda\bm{I}_M$, we can equivalently write each term of the objective function in (III-A) as

$$
\bm{p}_i^{\mathrm{H}}\hat{\bm{V}}\bm{p}_i-\bm{f}_i^{\mathrm{T}}\bm{p}_i-\left(\bm{f}_i^{\mathrm{T}}\bm{p}_i\right)^{\mathrm{H}}=\lVert\bm{c}_i-\bm{G}\bm{p}_i\rVert_2^2-\bm{c}_i^{\mathrm{H}}\bm{c}_i,
\tag{12}
$$

where $\bm{G}\in\mathbb{C}^{M\times M}$ is obtained from the Cholesky decomposition $\hat{\bm{V}}=\bm{G}^{\mathrm{H}}\bm{G}$ and $\bm{c}_i=(\bm{f}_i^{\mathrm{T}}\bm{G}^{-1})^{\mathrm{H}}$. We can now minimize (12) using the classical SESD algorithm. We call this approach the Proposed SD-based WMMSE algorithm.
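The completion of squares in (12) can be verified numerically. The sketch below uses arbitrary small dimensions and random data (our assumptions) and checks the identity for one column. Note that `numpy.linalg.cholesky` returns a lower-triangular factor, so its conjugate transpose plays the role of $\bm{G}$, and that $\lambda>0$ keeps $\hat{\bm{V}}$ positive definite when $K<M$.

```python
import numpy as np

rng = np.random.default_rng(2)
M, K, lam = 4, 2, 0.3         # illustrative sizes and multiplier (assumed)

def crandn(*shape):
    """Circularly symmetric complex Gaussian samples with unit variance."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

H = crandn(K, M)
d = rng.uniform(0.5, 2.0, K)
beta = crandn(K)
sqrt_d = np.diag(np.sqrt(d))
D = sqrt_d @ np.diag(beta)

# V_hat = H^H D^H D H + lam I_M
V_hat = H.conj().T @ D.conj().T @ D @ H + lam * np.eye(M)

# numpy gives lower-triangular Lc with V_hat = Lc Lc^H, hence G = Lc^H.
G = np.linalg.cholesky(V_hat).conj().T

# Column i of F is f_i, from f = vec((sqrt(diag(d)) D H)^T).
F = (sqrt_d @ D @ H).T

# c_i = (f_i^T G^{-1})^H = G^{-H} conj(f_i), computed for all columns at once.
C = np.linalg.solve(G.conj().T, F.conj())

# Check (12) for a random test vector p and column i = 0.
p = crandn(M)
f_i, c_i = F[:, 0], C[:, 0]
lhs = (p.conj() @ V_hat @ p - f_i @ p - np.conj(f_i @ p)).real
rhs = np.linalg.norm(c_i - G @ p) ** 2 - np.linalg.norm(c_i) ** 2
print(lhs, rhs)
```

Because $\bm{G}$ is upper triangular, the norm $\lVert\bm{c}_i-\bm{G}\bm{p}_i\rVert_2^2$ can be accumulated layer by layer, which is the structure that a sphere decoder exploits.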

III-B Quantization-Unaware Precoding

The conventional WMMSE-based algorithms in [11, 12] (among others) compute a precoding matrix from $\mathbb{C}^{M\times K}$ that maximizes the sum rate with infinite resolution and the available CSI. Hence, the naive baseline approach is to compute such a precoding matrix $\bm{P}_{\mathrm{unquantized}}=\bm{W}\in\mathbb{C}^{M\times K}$ and then quantize each entry using $\mathcal{Q}(\cdot):\mathbb{C}^{M\times K}\to\mathcal{P}^{M\times K}$ so that the result can be sent over the fronthaul. In this case, the BBU follows Fig. 1 but solves $\mathbb{P}_3$ over the domain $\mathbb{C}$ instead of $\mathcal{P}$, so that $\mathbb{P}_3$ becomes a continuous convex optimization problem that is solvable using any general-purpose convex solver. We will refer to this quantization-unaware precoding approach as the Unaware WMMSE. After the WMMSE algorithm has converged, the final precoding matrix is obtained by quantizing each entry as $\bm{P}=\mathcal{Q}(\bm{W})$.
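The final quantization step $\bm{P}=\mathcal{Q}(\bm{W})$ can be illustrated with a simple entrywise quantizer. The exact quantizer is defined earlier in the paper and is not reproduced here, so the level set below (uniform, symmetric around zero, with step `delta`) and the function names are assumptions made for illustration only.

```python
import numpy as np

def quantization_levels(L, delta):
    """L uniformly spaced real levels, symmetric around zero (assumed form)."""
    return delta * (np.arange(L) - (L - 1) / 2)

def quantize(W, L, delta):
    """Map the real and imaginary part of every entry to its nearest level."""
    levels = quantization_levels(L, delta)

    def nearest(x):
        idx = np.argmin(np.abs(x[..., None] - levels), axis=-1)
        return levels[idx]

    return nearest(W.real) + 1j * nearest(W.imag)

# Placeholder for a converged infinite-resolution precoder (values made up).
W = np.array([[0.10 + 0.90j, -0.40 - 0.05j],
              [0.62 - 0.30j, -2.00 + 0.20j]])
P = quantize(W, L=4, delta=0.5)
print(quantization_levels(4, 0.5))
print(P)
```

Entries outside the covered range, such as the real part $-2.00$ above, are mapped to the outermost level, which is one way quantization-unaware precoding loses the structure of $\bm{W}$.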

III-C Combined Quantization-Aware and -Unaware Precoding

Although the proposed SD approach is tailored to the problem, there is a limit to how large a setup (in terms of $M$ and $K$) it can handle before the run time of the Proposed SD-based WMMSE approach becomes an issue. As we run the SD algorithm in each iteration, the algorithm's average complexity over $N$ iterations becomes $O(NKL^{2\gamma M})$ for some $0\leq\gamma\leq 1$ [16].

An alternative is to search for a precoding matrix in $\mathbb{C}^{M\times K}$ for $N-1$ iterations and then use the SD-based algorithm only in the final iteration. In this case, we first identify UE gains $\beta_k$ and weights $d_k$ that are suitable for sum-rate maximization with infinite-resolution precoding and then compute the corresponding optimized quantization-aware precoding. The complexity is then of order $O(KL^{2\gamma M})$ for some $0\leq\gamma\leq 1$. We call this method Half-aware WMMSE.
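As a rough illustration of these orders, consider the setup used in Section IV ($M=16$, $L=8$, $K=4$) with $N=5$ iterations, and take the worst-case exponent $\gamma=1$. The choice $\gamma=1$ is our assumption for illustration only; the bound above allows smaller values, and we assume the same $\gamma$ for both schemes.

$$
L^{2\gamma M}\Big|_{\gamma=1}=8^{32}=2^{96}\approx 7.9\times 10^{28},
\qquad
\frac{NKL^{2\gamma M}}{KL^{2\gamma M}}=N=5.
$$

Under these assumptions, the Half-aware scheme removes the factor $N$ by running the SD step once instead of in every iteration, while the exponential dependence on $M$ is unchanged.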

III-D Heuristic Quantization-Aware Precoding

Although the Half-aware WMMSE scheme has lower complexity than the Proposed SD-based WMMSE, its complexity still grows exponentially with the number of antennas $M$, which makes scenarios with many antennas intractable. Hence, we believe it should primarily be seen as a benchmark for designing quantization-aware precoding schemes with low complexity. In this subsection, we propose such a heuristic precoding scheme. The proposed scheme is an add-on to the Unaware WMMSE precoding. After we quantize the precoding matrix $\bm{W}\in\mathbb{C}^{M\times K}$ to obtain $\bm{P}\in\mathcal{P}^{M\times K}$, we refine the elements sequentially. We consider the second closest quantization levels in both the real and imaginary dimensions according to the Euclidean distance [10]. This search gives us three alternative ways of quantizing each element in the precoding matrix $\bm{W}$. We call this method Heuristic quantization-aware precoding.

As the matrix elements are refined sequentially, we must order the elements properly. We propose to start by updating the column of the quantized precoding matrix corresponding to the UE $k$ with the highest generated interference $\mathrm{GI}_k=\sum_{i=1,i\neq k}^{K}\big|[\bm{H}\hat{\bm{P}}]_{i,k}\big|^2$, where $\hat{\bm{P}}=\alpha\bm{P}=\alpha\mathcal{Q}(\bm{W})$, since this might improve the performance the most. [Footnote 1: We have noticed experimentally that this leads to the largest improvement in sum rate at high SNR.] Then, for that specific UE $k$ and for each transmit antenna $m\in\{1,\ldots,M\}$, we identify the four nearest points in $\mathcal{P}$ to the element $w_{k,m}$ from the original unquantized precoding matrix $\bm{W}$. We evaluate the sum rate

$$
\sum_{k=1}^{K}\log_2\left(1+\frac{\big|[\bm{H}\hat{\bm{P}}]_{k,k}\big|^2}{\sum_{i=1,i\neq k}^{K}\big|[\bm{H}\hat{\bm{P}}]_{k,i}\big|^2+N_0}\right),
\tag{13}
$$

for the four different $\bm{P}$ options obtained with $p_{k,m}\in\{\text{four nearest points to } w_{k,m} \text{ in } \mathcal{P}\}$ while all other elements are fixed. We then replace the corresponding element in $\bm{P}$ with the option that achieves the largest sum rate. The rest of the UEs are ordered based on decreasing generated interference, and the precoding elements are updated similarly.
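The refinement procedure described in this subsection can be sketched as below. This is a hedged illustration, not the authors' implementation: the level set, the fixed scaling `alpha`, the random channel, and all function names are assumptions, and the four candidates per element are formed from the two nearest levels in each of the real and imaginary dimensions.

```python
import numpy as np

def sum_rate(H, P_hat, N0):
    """Sum rate (13) for channel H (K x M) and effective precoder P_hat (M x K)."""
    gains = np.abs(H @ P_hat) ** 2            # entry (k, i): stream i seen at UE k
    signal = np.diag(gains)
    interference = gains.sum(axis=1) - signal
    return np.sum(np.log2(1 + signal / (interference + N0)))

def quantize(W, levels):
    """Entrywise nearest-level quantization of real and imaginary parts."""
    def q(x):
        return levels[np.argmin(np.abs(x[..., None] - levels), axis=-1)]
    return q(W.real) + 1j * q(W.imag)

def nearest_levels(x, levels, count=2):
    """The `count` levels closest to the real number x."""
    return levels[np.argsort(np.abs(levels - x))[:count]]

def heuristic_refine(H, W, P, levels, alpha, N0):
    """Refine P element by element, UEs ordered by generated interference."""
    P = P.copy()
    gains = np.abs(H @ (alpha * P)) ** 2
    generated = gains.sum(axis=0) - np.diag(gains)   # GI_k for every UE k
    for k in np.argsort(-generated):                 # highest GI_k first
        for m in range(P.shape[0]):
            re = nearest_levels(W[m, k].real, levels)
            im = nearest_levels(W[m, k].imag, levels)
            best_val, best_rate = P[m, k], sum_rate(H, alpha * P, N0)
            for cand in (r + 1j * i for r in re for i in im):
                P[m, k] = cand
                rate = sum_rate(H, alpha * P, N0)
                if rate > best_rate:
                    best_val, best_rate = cand, rate
            P[m, k] = best_val                       # keep the best option
    return P

rng = np.random.default_rng(5)
M, K, N0, alpha = 4, 2, 0.01, 1.0                    # illustrative values
levels = 0.25 * (np.arange(8) - 3.5)                 # hypothetical levels, L = 8
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)
W = 0.3 * (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K)))
P0 = quantize(W, levels)
P1 = heuristic_refine(H, W, P0, levels, alpha, N0)
print(sum_rate(H, alpha * P0, N0), sum_rate(H, alpha * P1, N0))
```

Each element update needs only four sum-rate evaluations, and an element is changed only if the sum rate strictly improves, so the refined precoder is never worse than the quantized starting point in this sketch.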

IV Numerical Results

This section compares the sum rates achieved by the aforementioned precoding approaches as a function of the SNR. The sum rate is calculated using Monte Carlo simulations for the case of Gaussian signaling.

IV-A Channel Model

We consider spatially correlated Rician fading channels composed of a line-of-sight (LoS) path component and a non-line-of-sight (NLoS) path component as

$$
\bm{H}=\sqrt{\frac{\kappa}{\kappa+1}}\bm{H}^{\text{LoS}}+\sqrt{\frac{1}{\kappa+1}}\bm{H}^{\text{NLoS}},
\tag{14}
$$

where $\kappa$ is the Rician factor, while $\bm{H}^{\text{LoS}}\in\mathbb{C}^{K\times M}$ and $\bm{H}^{\text{NLoS}}\in\mathbb{C}^{K\times M}$ are the LoS and NLoS components, respectively. We assume that the AAS is a UPA; thus, the LoS channel matrix can be modeled as in [18, Ch. 7].

We model the NLoS channel as $\bm{h}_k^{\text{NLoS}}\sim\mathcal{CN}(\bm{0}_M,\bm{R}_k)$, which is spatially correlated Rayleigh fading. The spatial correlation matrix $\bm{R}_k\in\mathbb{C}^{M\times M}$ is generated following the local scattering model from [18, Ch. 2]. The locations of the UEs are randomly generated with the same elevation angle $\theta=0$ but uniformly distributed azimuth angles $\phi$, as seen from the UPA.
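A channel realization according to (14) could be generated along the following lines. This is only a sketch under several assumptions that are not taken from the paper: half-wavelength element spacing, one common convention for the UPA array response, an azimuth range of $[-\pi/3,\pi/3]$, and a simple exponential correlation matrix used as a stand-in for the local scattering model of [18, Ch. 2].

```python
import numpy as np

rng = np.random.default_rng(3)

def upa_response(azimuth, elevation, rows=4, cols=4, spacing=0.5):
    """Array response of a rows x cols UPA (spacing in wavelengths, assumed 0.5)."""
    h = np.arange(cols)                        # horizontal element index
    v = np.arange(rows)                        # vertical element index
    phase_h = 2 * np.pi * spacing * np.cos(elevation) * np.sin(azimuth) * h
    phase_v = 2 * np.pi * spacing * np.sin(elevation) * v
    return np.exp(1j * np.add.outer(phase_v, phase_h)).ravel()

def rician_channel(kappa, azimuths, R_list, elevation=0.0):
    """One realization of H in (14); row k holds the channel of UE k."""
    H_los = np.stack([upa_response(a, elevation) for a in azimuths])
    H_nlos = np.empty_like(H_los)
    for k, R in enumerate(R_list):
        M = R.shape[0]
        w = (rng.standard_normal(M) + 1j * rng.standard_normal(M)) / np.sqrt(2)
        H_nlos[k] = np.linalg.cholesky(R) @ w  # sample from CN(0, R_k)
    return np.sqrt(kappa / (kappa + 1)) * H_los + np.sqrt(1 / (kappa + 1)) * H_nlos

K, M, kappa = 4, 16, 5.0
azimuths = rng.uniform(-np.pi / 3, np.pi / 3, K)     # assumed azimuth range
idx = np.arange(M)
R_toy = 0.5 ** np.abs(idx[:, None] - idx[None, :])   # stand-in correlation matrix
R_list = [R_toy] * K

H = rician_channel(kappa, azimuths, R_list)
print(H.shape)
```

A larger $\kappa$ moves the realization toward the deterministic LoS component, while $\kappa=0$ gives pure correlated Rayleigh fading.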

IV-B Results and Discussion

We assume the UPA consists of $M=16$ antenna elements arranged as a $4\times 4$ array. The number of quantization levels is $L=8$, the number of UEs is $K=4$, and the Rician factor is $\kappa=5$. The UEs have the same SNR, which is defined as $\mathrm{SNR}=\frac{q\gamma}{N_0}$, where $\gamma$ is the common channel variance.

Figure 2: Sum rate evolution when running the proposed WMMSE algorithm using CVX or the proposed SD-based method.

Fig. 2 presents the convergence behavior of the proposed WMMSE algorithm in Fig. 1 for the cases when $\mathbb{P}_3$ is solved exactly using CVX (denoted by CVX-based WMMSE) and approximately using the proposed SD-based method. We consider $\mathrm{SNR}=20$ dB and generate one user drop. The proposed algorithm reaches a stationary point after $N=5$ iterations. We notice that, despite the approximations made to lower the complexity in the SD-based method, the difference in the sum rate is negligible after convergence. The difference between the sum rate at the starting point and at the stationary point approximately shows the improvement of the proposed sum-rate maximization algorithm compared to our previous results in [10], which considered sum MSE minimization.

Figure 3: The average sum rate versus the SNR for different precoding schemes. We assume that the BS has $M=4\times 4$ antennas and serves $K=4$ UEs with $L=8$ quantization levels.

Fig. 3 depicts the average sum rate as a function of the SNR for the different precoding schemes. The top curve, WMMSE (infinite res), considers the ideal case without quantization and outperforms all the quantized precoding schemes since its rate increases linearly (in dB scale) at high SNR. In all quantized precoding schemes, the sum rate converges to specific limits at high SNR since the interference cannot be canceled entirely due to the limited precoding resolution; i.e., the system is interference-limited at high SNR. The gap between WMMSE (infinite res) and our novel Proposed SD-based WMMSE precoding is remarkably smaller than the gap between Unaware WMMSE and WMMSE (infinite res), where we quantized the precoding matrix used for WMMSE (infinite res). Our algorithm provides twice the rate of Unaware WMMSE at high SNR. The lower-complexity Half-aware WMMSE approach also outperforms the Unaware WMMSE algorithm, but there is a substantial gap to Proposed SD-based WMMSE. This gap demonstrates the importance of computing quantization-aware weights in the WMMSE algorithm instead of relying on those obtained with infinite resolution. The proposed Heuristic quantization-aware precoding reaches nearly the same performance as Half-aware WMMSE and, thus, performs vastly better than Unaware WMMSE precoding. Note that the complexity of Heuristic quantization-aware precoding is polynomial in $M$ and $K$, just as for Unaware WMMSE, and it is therefore implementable in the same scenarios (e.g., massive MIMO).

V Conclusions

A 5G site often consists of an AAS connected to a BBU via a digital fronthaul with limited capacity. Hence, the precoding matrix that is computed at the BBU must be quantized to finite precision. We have proposed a novel WMMSE-based framework for quantization-aware precoding that finds a local optimum to the sum rate maximization problem. Moreover, a reduced-complexity SD algorithm was proposed to find an approximate but tight solution. We demonstrated that the sum rate can be doubled at high SNR by being quantization-aware both when selecting the weights in the WMMSE formulation and when selecting the precoding matrix. In addition, we proposed a heuristic quantization-aware precoding scheme that outperforms the quantization-unaware baseline while having comparable complexity. Finally, we note that the framework can easily be modified to also solve weighted sum rate problems.

References

  • [1] D. Gesbert, M. Kountouris, R. W. Heath, C.-B. Chae, and T. Sälzer, “Shifting the MIMO paradigm,” IEEE Signal Process. Mag., vol. 24, no. 5, pp. 36–46, Sep. 2007.
  • [2] E. Björnson, G. Zheng, M. Bengtsson, and B. Ottersten, “Robust monotonic optimization framework for multicell MISO systems,” IEEE Trans. Signal Process., vol. 60, no. 5, pp. 2508–2523, 2012.
  • [3] Y.-F. Liu, Y.-H. Dai, and Z.-Q. Luo, “Coordinated beamforming for MISO interference channel: Complexity analysis and efficient algorithms,” IEEE Trans. Signal Process., vol. 59, no. 3, pp. 1142–1157, 2010.
  • [4] Q. Shi, M. Razaviyayn, Z.-Q. Luo, and C. He, “An iteratively weighted MMSE approach to distributed sum-utility maximization for a MIMO interfering broadcast channel,” IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4331–4340, 2011.
  • [5] E. Björnson, L. Sanguinetti, H. Wymeersch, J. Hoydis, and T. L. Marzetta, “Massive MIMO is a reality—What is next? Five promising research directions for antenna arrays,” Digit. Signal Process., vol. 94, pp. 3–20, 2019.
  • [6] M. Peng, C. Wang, V. Lau, and H. V. Poor, “Fronthaul-constrained cloud radio access networks: Insights and challenges,” IEEE Wirel. Commun., vol. 22, no. 2, pp. 152–160, 2015.
  • [7] E. Björnson, J. Hoydis, M. Kountouris, and M. Debbah, “Massive MIMO systems with non-ideal hardware: Energy efficiency, estimation, and capacity limits,” IEEE Trans. Inf. Theory, vol. 60, no. 11, pp. 7112–7139, 2014.
  • [8] S. Jacobsson, G. Durisi, M. Coldrey, T. Goldstein, and C. Studer, “Quantized precoding for massive MU-MIMO,” IEEE Trans. Commun., vol. 65, no. 11, pp. 4670–4684, 2017.
  • [9] P. Parida, H. S. Dhillon, and A. F. Molisch, “Downlink performance analysis of cell-free massive MIMO with finite fronthaul capacity,” in VTC-Fall, Chicago, IL, USA, 2018, pp. 1–6.
  • [10] Y. Khorsandmanesh, E. Björnson, and J. Jaldén, “Optimized precoding for MU-MIMO with fronthaul quantization,” arXiv preprint arXiv:2209.01868, 2022.
  • [11] S. S. Christensen, R. Agarwal, E. De Carvalho, and J. M. Cioffi, “Weighted sum-rate maximization using weighted MMSE for MIMO-BC beamforming design,” IEEE Trans. Wirel. Commun., vol. 7, no. 12, pp. 4792–4799, 2008.
  • [12] X. Zhao, S. Lu, Q. Shi, and Z.-Q. Luo, “Rethinking WMMSE: Can its complexity scale linearly with the number of BS antennas?” arXiv preprint arXiv:2205.06225, 2022.
  • [13] D. Hui and D. L. Neuhoff, “Asymptotic analysis of optimal fixed-rate uniform scalar quantization,” IEEE Trans. Inf. Theory, vol. 47, no. 3, pp. 957–977, 2001.
  • [14] M. Grant and S. Boyd, “CVX: Matlab software for disciplined convex programming, version 2.1,” 2014.
  • [15] M. Joham, W. Utschick, and J. A. Nossek, “Linear transmit processing in MIMO communications systems,” IEEE Trans. Signal Process., vol. 53, no. 8, pp. 2700–2712, 2005.
  • [16] J. Jaldén and B. Ottersten, “On the complexity of sphere decoding in digital communications,” IEEE Trans. Signal Process., vol. 53, no. 4, pp. 1474–1484, 2005.
  • [17] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger, “Closest point search in lattices,” IEEE Trans. Inf. Theory, vol. 48, no. 8, pp. 2201–2214, 2002.
  • [18] E. Björnson, J. Hoydis, and L. Sanguinetti, “Massive MIMO networks: Spectral, energy, and hardware efficiency,” Foundations and Trends in Signal Processing, vol. 11, no. 3-4, pp. 154–655, 2017.