Joint Channel and Data Estimation for Multiuser Extremely Large-Scale MIMO Systems
Abstract
This paper proposes a joint channel and data estimation (JCDE) algorithm for uplink multiuser extremely large-scale multiple-input-multiple-output (XL-MIMO) systems. The initial channel estimation is formulated as a sparse reconstruction problem based on the angle and distance sparsity under the near-field propagation condition. This problem is solved using non-orthogonal pilots through an efficient, low-complexity two-stage compressed sensing algorithm. Furthermore, the initial channel estimates are refined by employing a JCDE framework driven by both non-orthogonal pilots and estimated data. The JCDE problem is solved by sequential expectation propagation (EP) algorithms, where the channel and data are alternately updated in an iterative manner. In the channel estimation phase, integrating Bayesian inference with a model-based deterministic approach provides precise estimations to effectively exploit the near-field characteristics in the beam-domain. In the data estimation phase, a linear minimum mean square error (LMMSE)-based filter is designed at each sub-array to address the correlation due to energy leakage in the beam-domain arising from the near-field effects. Numerical simulations reveal that the proposed initial channel estimation and JCDE algorithm outperform the state-of-the-art approaches in terms of channel estimation, data detection, and computational complexity.
Index Terms:
Extremely large-scale-MIMO (XL-MIMO), near-field, joint channel and data estimation, compressed sensing
I Introduction
To meet the demands for high spectral efficiency in future 6G systems [1], it is essential to further exploit spatial multiplexing and abundant spectral resources at mid/high frequency bands such as centimeter-wave (cmWave), millimeter-wave (mmWave), and sub-terahertz (sub-THz). In light of these requirements, extremely large-scale multiple-input-multiple-output (XL-MIMO) [2, 3, 4] has emerged as a promising technology, enabling sharp directive beamforming and extensive spatial multiplexing. However, the significant increase in antenna aperture leads to an expansion of the Rayleigh distance [5, 6], defined as the border between the near-field and far-field regions. Thus, the near-field effects in XL-MIMO systems may not be negligible in some practically-relevant circumstances, such as in small area coverage with high carrier frequency bands [7].
Unlike the conventional far-field channel, the near-field channel depends not only on gains and angles (e.g., angles of arrivals (AoAs)) but also on distances from signal sources such as user equipments (UEs) and scatterers. Hence, conventional channel estimation methods such as [8, 9], which exploit the beam-domain sparsity under the assumption of planar wavefront, experience significant performance degradation in the near-field due to energy leakage effects in the beam-domain. To tackle this issue, the authors in [7] have proposed a polar-domain simultaneous orthogonal matching pursuit (P-SOMP) algorithm, which leverages the angle and distance sparsity known as polar-domain sparsity arising from the aforementioned near-field peculiar characteristics. In P-SOMP, polar (angle-distance) grids are generated by spatially quantizing the polar-domain to utilize compressed sensing techniques. In multiuser systems, however, P-SOMP requires orthogonal pilots to separate multiple UEs. As such, this approach results in non-negligible pilot overhead as the number of UEs grows, especially in XL-MIMO systems capable of spatially multiplexing many UEs.
Considering these challenges, a near-field channel estimation algorithm, which works even with non-orthogonal pilots, has recently been proposed in [10] in the context of grant-free XL-MIMO systems (footnote 1: while this method was originally developed for joint active user detection and channel estimation, it is also applicable to channel estimation alone, without active user detection). However, due to the non-orthogonality among pilots, inter-user interference still remains, so it is necessary to jointly estimate all UE channel components. As a result, this joint estimation significantly increases computational complexity because it requires UE-wise polar grids, which leads to a large grid size. Therefore, the authors in [10] proposed a 2D-compressive sampling matching pursuit (CoSaMP) algorithm, which is based on the CoSaMP algorithm in the polar-UE 2D domain constructed by UE-wise polar grids. While the 2D-CoSaMP algorithm can mitigate computational complexity, its estimation performance is hindered by overfitting to noisy measurements, which results from the inverse operation on over-sampled estimates. Consequently, subsequent data detection suffers from severe performance deterioration, particularly with high-order modulation.
One of the prospective solutions to obtain an accurate channel estimate with non-orthogonal pilots is joint channel and data estimation (JCDE) [11, 12, 13, 14, 15], where not only pilot sequences but also estimated data symbols are utilized as pilot replicas, leveraging their statistical quasi-orthogonality. The JCDE problem can be formulated as a bilinear inference problem (BIP). One of the prominent algorithms based on a Bayesian framework for BIP is bilinear generalized approximate message passing (BiGAMP) [16]. BiGAMP is an extension of GAMP, which was originally designed for high-dimensional generalized-linear problems and utilizes loopy belief propagation (BP) with central limit theorem (CLT) and Taylor-series approximations based on the large-system limit to simplify the BP update. Because BiGAMP depends heavily on the large-system assumption, its convergence performance deteriorates significantly when the system is too small, when the pilots are too short, or when there is an improper prior distribution [11]. To address these issues, the authors in [11] have proposed bilinear Gaussian belief propagation (BiGaBP), which relaxes the BP update rules of BiGAMP, based on GaBP [17], without heavily relying on the approximation under a large system limit assumption. This relaxation of the approximation leads to performance improvements while maintaining the same complexity order as BiGAMP.
Bilinear inference algorithms that exploit physical model structures, such as channel sparsity in the beam-domain, have been investigated in [13, 14]. In these papers, the channel sparsity is modeled using a Bernoulli Gaussian (BG) prior distribution because this prior is analytically tractable with a closed-form posterior. However, the sparse structure cannot be exactly expressed by the analytically tractable prior, which leads to modeling errors resulting in performance deterioration. To tackle this issue, the authors of [12] have integrated a model-based deterministic approach [18] into a Bayesian inference framework [11], referred to as AoA-aided BiGaBP. This deterministic approach rectifies the model mismatch caused by the use of the tractable prior.
However, since this method assumes a far-field model, the model correction by the deterministic estimation is insufficient for the near-field region in XL-MIMO systems. Moreover, AoA-aided BiGaBP relies on a maximum ratio combining (MRC)-based detector, which cannot address the correlation caused by energy leakage in the beam-domain due to near-field effects. In addition, the computational complexity of the data denoising process in AoA-aided BiGaBP based on prior information for modulation constellations scales proportionally to not only the modulation order but also the number of antennas. This increase in complexity stems from the fact that BiGaBP suppresses the self-feedback of messages in the algorithmic iteration by generating antenna-wise extrinsic values based on BP rules without the Onsager correction term [11]. Consequently, the number of inputs to the denoiser function, based on modulation constellations, increases proportionally to the number of antennas.
Within the context outlined above, we propose a JCDE algorithm for multiuser XL-MIMO systems with non-orthogonal pilots. Our contributions are summarized as follows.
• Initial channel estimation for JCDE: A novel initialization mechanism for the multiuser near-field channel estimation problem with pilot contamination due to non-orthogonal pilots is proposed, enabling an accurate initial estimate that is then used in the subsequent JCDE algorithm. The proposed initial channel estimation algorithm consists of two stages to maintain low computational complexity. In the first stage, angle and distance parameters for all UEs are estimated from the polar grids using the simultaneous orthogonal matching pursuit (SOMP) algorithm without pairing between each estimated path and the corresponding UE. Subsequently, the second stage involves the UE-path pairing using 2D-OMP [19] with a reduced number of grids constructed on the angle and distance parameters derived from the first stage. Owing to the above two-stage procedure, our proposed initial channel estimation outperforms the existing state-of-the-art scheme [10], while maintaining comparable computational complexity.
• JCDE algorithm with model-based estimation: A novel bilinear JCDE inference algorithm is proposed, which integrates a model-based deterministic estimation mechanism with a Bayesian inference framework to address possible performance degradation due to modeling errors in the prior knowledge assumed in the Bayesian inference, as in [11]. In contrast to the state-of-the-art AoA-aided BiGaBP under the assumption of far-field propagation, our algorithm estimates the channels as an aggregation of two distinct quantities: 1) a model-based estimate that captures the near-field channel structure and 2) its modeling error that captures how far the current estimate deviates from the true channel. The model-based estimate is alternately updated through a matching pursuit algorithm exploiting the near-field model structures, whereas the residual modeling errors and data symbols are jointly estimated by the expectation propagation (EP) algorithm [20], where an approximate posterior is calculated by minimizing the Kullback-Leibler (KL) divergence between the true posterior and the approximate posterior. To tackle the spatial correlation caused by the energy leakage across neighboring beams in the beam domain while reducing the complexity, we introduce a novel posterior calculation design that enables the implementation of a sub-array-wise linear minimum mean square error (LMMSE)-based filter, allowing parallel computation of the matrix inversion with a much smaller dimension than the array size. As shown in the simulation results, this design results in lower computational complexity compared to the state-of-the-art method, owing to the modification of an extrinsic value generation that does not rely on BP rules.
Notation: The notation indicates the element of the matrix . For a random variable and a probabilistic density function , indicate the expectation of over . For any function , denotes the integral of with respect to except for . The operator denotes the Kronecker product. For the index sets and , denotes the cartesian product of and . represent the set .
II System Model
![Refer to caption](x1.png)
![Refer to caption](x2.png)
We consider an uplink XL-MIMO system, where a base station (BS) has a uniform linear array (ULA) with -antennas, serving single-antenna UEs. The ULA is positioned along the -axis, where is the -th antenna coordinate, and is antenna spacing with wavelength , as shown in Fig. 1.
II-A Channel Model
The near-field channel in the spatial domain between a BS and the -th UE is modeled as
(1) |
where , and denote the AoA and path gain of the -th path and the -th UE, respectively [21].
Without loss of generality, represents the line-of-sight (LoS) component and represents the non-line-of-sight (NLoS) components. Let denote the total number of paths including -UEs. Accordingly, denotes the distance between the BS and the -th UE, and is the distance between the BS and the -th scatterer around the -th UE. In addition, denotes the array response vector defined as
(2) |
where is the distance between the -th antenna and the -th UE or scatterers.
For the -th UE, let us define the collections of AoAs, distances, and path gains as , , and , respectively, and the corresponding array response matrix is defined as . Then, the channel matrix is written as
(3) |
where is the array response matrix consisting of -UEs with AoAs, distances and path gains defined as , , and , respectively.
The array response vector in the far-field region, i.e., when is expressed as from (2). As the far-field array response depends only on the angle, the far-field channel can be converted into a sparse beam-domain channel with the discrete Fourier transform (DFT) matrix . In contrast, as the far-field approximation does not hold in the near-field region, the beam-domain near-field channel exhibits not a simple sparse structure but rather a cluster sparse structure, which is caused by energy leakage due to a model mismatch between the DFT matrix and the near-field array response . To illustrate the energy leakage effects, Fig. 2 depicts the amplitude of the beam-domain channel vector in the near-field and far-field regions. It can be seen that the far-field channel possesses a distinct sparse structure with a peaky spike. On the other hand, the near-field channel exhibits a clustered sparsity with flatter peaks due to energy leakage. Hence, conventional channel estimation methods exploiting the beam-domain sparsity [8, 9, 18] encounter significant performance degradation in the near-field.
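The energy leakage described above can be reproduced numerically. The following is a minimal sketch, assuming a half-wavelength-spaced ULA centred at the origin and the common spherical-wavefront form of the response, in which each phase is proportional to the exact antenna-to-source distance. The function names, the 30 GHz carrier, and the 5 m source distance are illustrative choices, not values taken from the paper.

```python
import numpy as np


def near_field_response(n_ant, theta, r, wavelength):
    """Spherical-wavefront ULA response (assumed form, half-wavelength spacing)."""
    spacing = wavelength / 2.0
    # Antenna coordinates, centred on the array midpoint.
    delta = (np.arange(n_ant) - (n_ant - 1) / 2.0) * spacing
    # Exact distance from every antenna to a source at polar position (theta, r).
    r_n = np.sqrt(r**2 + delta**2 - 2.0 * r * delta * np.sin(theta))
    return np.exp(-2j * np.pi / wavelength * (r_n - r)) / np.sqrt(n_ant)


def far_field_response(n_ant, theta):
    """Planar-wavefront limit of the response above (distance >> aperture)."""
    idx = np.arange(n_ant) - (n_ant - 1) / 2.0
    return np.exp(1j * np.pi * idx * np.sin(theta)) / np.sqrt(n_ant)


def beams_for_energy(a, fraction=0.9):
    """Number of strongest DFT beams needed to capture `fraction` of the energy."""
    beam = np.fft.fft(a) / np.sqrt(a.size)  # unitary DFT
    power = np.sort(np.abs(beam) ** 2)[::-1]
    return int(np.searchsorted(np.cumsum(power), fraction * power.sum()) + 1)


n_ant, wavelength = 256, 0.01   # 256 antennas, 30 GHz carrier (illustrative)
theta = np.arcsin(0.25)         # angle that falls exactly on a DFT beam
far = beams_for_energy(far_field_response(n_ant, theta))
near = beams_for_energy(near_field_response(n_ant, theta, 5.0, wavelength))
print("beams holding 90% of the energy: far-field", far, "near-field", near)
```

With these settings the far-field response concentrates on a single beam, while the near-field response spreads over a cluster of adjacent beams, which is the cluster sparsity that the later sections exploit.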
II-B Received Signal Model
To estimate the near-field channel, the -th UE transmits a non-orthogonal pilot sequence and data symbol subsequently, where and are the length of pilots and data symbols. Each entry of is randomly generated from a -quadrature amplitude modulation (QAM) constellation with average symbol energy . Then, the received pilot and data in the spatial domain are given by
(4) |
where and are the transmitted pilot matrix and data matrix. and are the additive white Gaussian noise (AWGN) matrices, whose entries are generated from with noise variance . By stacking the received pilot and data , the effective received signal becomes , and the sum length of pilots and data , is formulated as
(5) |
with and . For the sake of future convenience, let us define the pilot and data index set as with and .
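As a rough illustration of the stacked pilot-plus-data observation, the sketch below generates a received block for a given channel matrix. It assumes unit average symbol energy, random QPSK pilots (the paper does not state how the non-orthogonal pilots are drawn), and circularly symmetric Gaussian noise. All names are hypothetical.

```python
import numpy as np


def qam_constellation(order):
    """Square QAM constellation normalised to unit average energy."""
    m = int(round(np.sqrt(order)))
    levels = np.arange(-(m - 1), m, 2)  # e.g. [-3, -1, 1, 3] for 16-QAM
    points = (levels[:, None] + 1j * levels[None, :]).ravel()
    return points / np.sqrt(np.mean(np.abs(points) ** 2))


def simulate_uplink(H, n_pilot, n_data, order, noise_var, rng):
    """Return the stacked received block Y = H [X_p, X_d] + noise."""
    n_ant, n_ue = H.shape
    # Assumption: pilots are random QPSK symbols, hence non-orthogonal in general.
    X_p = rng.choice(qam_constellation(4), size=(n_ue, n_pilot))
    X_d = rng.choice(qam_constellation(order), size=(n_ue, n_data))
    X = np.hstack([X_p, X_d])
    shape = (n_ant, n_pilot + n_data)
    noise = np.sqrt(noise_var / 2.0) * (
        rng.standard_normal(shape) + 1j * rng.standard_normal(shape)
    )
    return H @ X + noise, X_p, X_d
```

Choosing `n_pilot` smaller than the number of UEs makes orthogonal pilots impossible, which is the regime the proposed method targets.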
III Overview of the Proposed Algorithm
This section describes the overview of the proposed algorithm. The overall procedures of the proposed algorithm are illustrated in Fig. 3. As shown in the figure, the proposed algorithm mainly consists of two parts: the initial channel estimation part and subsequent JCDE part. The initial channel estimation part yields an accurate near-field channel estimate to support the convergence of the subsequent JCDE algorithm, and is composed of two stages to reduce the computational complexity. In the first stage, the angle and distance candidates from large-size polar-grids are estimated by utilizing the SOMP algorithm. In the second stage, the pairing between the path candidates obtained in the first stage and corresponding UEs is performed via the 2D-OMP algorithm by using UE specific pilot sequences. The first and second stages for initial channel estimation are described in Section IV-A and Section IV-B, respectively.
In the subsequent JCDE process, the channel and data are jointly estimated via the EP algorithm with a deterministic model-based estimation approach using the initial channel estimate. To exploit the near-field model structures, the beam-domain channel matrix is decomposed into a model-based estimate and residual channel error . and are jointly estimated by the EP algorithm, where the approximate joint posterior for and is calculated as described in Section V-C and V-D. The model-based estimate is determined by the initial channel estimate and adaptively updated in the algorithm iterations to further improve estimation performance as described in Section V-F.
![Refer to caption](x3.png)
IV Proposed Initial Channel Estimation
IV-A Angle and Distance Estimation
To leverage the near-field channel sparsity, the virtual channel representation in the polar-domain [8] is utilized with polar-grids. The polar-grids are designed by spatially quantizing the angle and distance domain into grid points as and with and . Using the polar-grids and , the polar-domain dictionary (i.e., virtual array response matrix) is designed as
(7) |
From (6) and (IV-A), the received pilot signal is given by
(8) |
where is the row sparse matrix such that the number of nonzero rows is only and other rows are zero since the channel is composed of a total of paths defined as in (1), with a sufficiently large number of grids, i.e., . Equation (8) exactly holds only if there are no quantization errors in the polar grids. In actual environments, however, it holds only approximately due to the presence of quantization errors. Therefore, to compensate for the quantization errors, we overestimate the number of paths based on the propagation environment in the considered carrier frequency [7]. To estimate path candidates from grids, the sparse reconstruction problem for is formulated as
(9) |
where denotes the number of non-zero rows of .
The problem in (9) can be approximately solved by a compressed sensing algorithm for multiple measurement vectors (MMV) problems, e.g., SOMP [8]. The computational complexity of SOMP at the -th iteration in a naive implementation is . Its complexity can be further reduced by using the matrix inversion lemma (MIL) to [22, 23]. Solving the problem (9) yields the angle and distance candidates corresponding to the non-zero rows of , defined as and .
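The greedy selection performed by SOMP can be sketched as follows. This is a generic, naive implementation (a least-squares refit at every iteration, without the matrix-inversion-lemma speed-up mentioned above), and it assumes a dictionary with unit-norm columns; the name `somp` and its arguments are illustrative.

```python
import numpy as np


def somp(Y, A, n_paths):
    """Naive simultaneous OMP for the MMV problem Y ~ A X with row-sparse X.

    Y: (N, T) measurements, A: (N, G) dictionary with unit-norm columns.
    Returns the selected row indices and the least-squares coefficients.
    """
    residual = Y.copy()
    support = []
    coef = np.zeros((0, Y.shape[1]), dtype=complex)
    for _ in range(n_paths):
        # Correlation energy of every atom with the residual, summed over snapshots.
        corr = np.sum(np.abs(A.conj().T @ residual) ** 2, axis=1)
        if support:
            corr[support] = 0.0  # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Refit all selected atoms jointly and update the residual.
        A_s = A[:, support]
        coef, *_ = np.linalg.lstsq(A_s, Y, rcond=None)
        residual = Y - A_s @ coef
    return support, coef
```

In the first stage described above, the columns of `A` would be the polar-domain dictionary atoms, and the returned indices would map back to angle and distance candidates.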
IV-B UE-Path Pairing
The path candidate set obtained in the first stage does not specify the association of individual paths with each UE. To estimate individual channels for each UE, the second stage performs UE-path pairing, where the estimated path candidates are associated with each user using UE-specific non-orthogonal pilot sequences (footnote 2: in the case of orthogonal piloting, this is a straightforward task). The usage of limited path candidates , , rather than large-size polar-grids , sampling the entire polar domain, can lead to a complexity reduction. Using the path set , the polar-domain dictionary matrix is designed as
(10) |
Reducing the size of the polar grids from in (7) to in (10) can effectively lower the complexity in the following compressed sensing algorithm. Then, the channel vector for the -th UE can be approximated with the polar-domain dictionary as
(11) |
where is the virtual path gain vector.
The equation (12) can be transformed into a 1D linear equation as with , and . Although the estimation for from the vectorized observation can be simply addressed by various methods such as OMP [22], this significantly increases the complexity due to the large-size dictionary . Hence, to circumvent the high computational burden, the 2D signal representation in (12) is directly addressed without the vectorized 1D representation. Then, the sparse reconstruction problem for in (12) is formulated as
(13) |
with .
The optimization problem (13) is solved via a two-dimensional compressed sensing algorithm. The conventional method [10] tackles this problem with the large-size polar dictionary in (7) instead of in (10) via the 2D-CoSaMP algorithm, which sacrifices estimation performance for complexity reduction compared to 2D-OMP [19]. In contrast, our proposed method solves the optimization problem (13) via the 2D-OMP algorithm using the small-size polar-domain dictionary constructed by the path candidates in the first stage. As a result, the proposed method can outperform the conventional approach [10] while retaining comparable computational complexity. Detailed discussions regarding the complexity of the proposed algorithm are presented in Section VI.
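The equivalence between the 2D representation and its vectorized 1D form, and the reason the latter is costly, can be checked numerically with the identity vec(A G P) = (Pᵀ ⊗ A) vec(G). In the sketch below, `A` plays the role of the reduced polar-domain dictionary, `G` the virtual path gain matrix, and `P` the pilot matrix; these names and all sizes are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ant, n_cand, n_ue, n_pilot = 32, 6, 4, 3  # illustrative sizes


def crandn(*shape):
    """Complex standard normal samples."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


def vec(M):
    """Column-major vectorisation, matching the Kronecker identity."""
    return M.reshape(-1, order="F")


A = crandn(n_ant, n_cand)    # dictionary (antennas x path candidates)
G = crandn(n_cand, n_ue)     # virtual path gains (candidates x UEs)
P = crandn(n_ue, n_pilot)    # pilots (UEs x pilot length)

Y2d = A @ G @ P              # 2D form: two small matrix products
D = np.kron(P.T, A)          # 1D form: one large dictionary
y1d = D @ vec(G)

print("1D dictionary shape:", D.shape)  # (n_ant * n_pilot, n_cand * n_ue)
print("forms agree:", np.allclose(vec(Y2d), y1d))
```

The 1D dictionary has `n_ant * n_pilot` rows and `n_cand * n_ue` columns, so its size grows with the product of all four dimensions, whereas the 2D form only ever touches the two small factors.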
Solving the problem (13) yields the estimated path gain vector , angle , and distance corresponding to the non-zero elements of , where is the estimated number of paths for the -th UE. Given the estimates, the initial channel estimate can be obtained as
(14) |
where is the estimated array response. The proposed initial channel estimation method is summarized in Algorithm 1.
V Proposed Joint Channel and Data Estimation
Given the initial estimates obtained from Algorithm 1, we aim to improve both the channel estimation performance as well as the data estimation accuracy by jointly processing the channel estimation and data detection while considering near-field properties. This section elaborates on the proposed JCDE algorithm with the initial channel estimate.
V-A Pre-processing for Channel and Data Estimation
V-A1 Pre-processing for Channel Estimation
To exploit the channel sparsity, the received signal and channel matrix in the spatial-domain are transformed into the beam-domain as and , where is the DFT matrix . As described in Section II-A, the near-field channel has a cluster sparse structure due to energy leakage. To tackle this problem, the channel matrix is first considered as the aggregation of the model-based estimate and the residual channel estimation error , resulting in
(15) |
An initial value for the model-based estimate is determined with the proposed initial channel estimate in (14) as , and it is adaptively updated based on the near-field model structure as described in Section V-F. As the residual error is defined by subtracting the current estimate from the beam-domain channel as in (15), this subtraction results in a sparser domain representation compared to the original beam-domain channel . The dominant path components are removed from by , facilitating the sparse matrix reconstruction by considering (instead of ) as the variable to be estimated by a Bayesian inference framework.
V-A2 Pre-processing for Data Estimation
For low-complexity data estimation, the conventional methods based on the far-field assumption, such as [13, 12, 14], utilize MRC-based detectors that are effective in the far-field region since the beam-domain channel exhibits a simple sparse structure with a peaky spike and no correlation between the beam indices. However, these detectors are ineffective in the near-field scenario because the near-field channel has cluster sparsity due to energy leakage, and the leaked energy is correlated in the beam-domain. Although LMMSE-based detection methods such as [24, 25] are effective to deal with the correlation, these methods require matrix inversion with the size , which is computationally expensive especially in XL-MIMO systems. To balance the computational complexity and detection performance, the array is virtually divided into multiple sub-arrays, and a sub-array-wise LMMSE-based detector is designed similarly to [26]. In contrast to [26], which assumes perfect channel state information (CSI), the proposed method considers the channel estimation error while jointly estimating data and channel, exploiting the near-field model structures.
Accordingly, the extra-large array with antennas is partitioned into sub-arrays, and the sub-array has antennas satisfying . The received signals, residual channel errors, and model-based estimates can also be written as , , and , with , , and . The received signals and can then be rewritten as
(16) |
For convenience, let us define as the antenna index set, and as the antenna index set at the -th sub-array such that and .
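To make the complexity argument concrete, the sketch below partitions the antenna index set into contiguous sub-arrays and applies an LMMSE filter per sub-array, so that each linear solve has the sub-array size rather than the full array size. It is a simplified illustration that assumes perfect CSI, unit-energy symbols, and white noise; the proposed method additionally accounts for channel estimation errors, which are omitted here. All names are hypothetical.

```python
import numpy as np


def subarray_index_sets(n_ant, n_sub):
    """Split antenna indices 0..n_ant-1 into n_sub contiguous, disjoint sets."""
    return np.array_split(np.arange(n_ant), n_sub)


def subarray_lmmse(Y, H, n_sub, noise_var):
    """Per-sub-array LMMSE symbol estimates (perfect-CSI simplification).

    Y: (N, T) received block, H: (N, U) channel. Returns one (U, T) estimate
    per sub-array; each solve involves only an (N_k x N_k) matrix.
    """
    estimates = []
    for idx in subarray_index_sets(H.shape[0], n_sub):
        H_k, Y_k = H[idx], Y[idx]
        # Covariance of the sub-array observation for unit-energy symbols.
        cov = H_k @ H_k.conj().T + noise_var * np.eye(idx.size)
        estimates.append(H_k.conj().T @ np.linalg.solve(cov, Y_k))
    return estimates
```

Because the sub-arrays are processed independently, the solves can run in parallel, and the per-sub-array outputs can then be fused in a later step.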
V-B Bayesian Inference Formulation
Based on the linear observation in (16) with the deterministic variable and random variables and , the likelihood function for and can be expressed as
(17) |
where with , , and .
Since each entry of is randomly selected from the QAM constellation point set , the prior can be written as
(18) |
with and .
Although many conventional methods such as [13, 14] design the i.i.d. sparse prior for the beam-domain channel as (e.g., BG prior), this modeling causes the model mismatch due to energy leakage effects in the near-field region. Therefore, we design the sparse prior for the residual channel error instead of as
(19) |
where is Gaussian prior distribution with zero mean and variance , which is widely used for sparse representation in the sparse Bayesian learning (SBL) algorithm [27], where is the hyper parameter set to be optimized through the expectation maximization (EM) algorithm [28] as described in Section V-E.
From the likelihood in (17) and priors in (18), (19), the posterior can be written as
(20) |
where is the marginal likelihood referred to as the evidence for parameter . Our objective is to estimate , , and through the posterior and the evidence.
The estimator for by the type-II maximum likelihood method [29] is given as
(21) |
However, the calculation of the evidence is intractable due to the multidimensional integral for and . Hence, we utilize the EM algorithm, which maximizes the evidence lower bound (ELBO) in each iteration, instead of directly maximizing the evidence [28]. Given at the -th iteration, at the -th iteration can be obtained as the following E-step and M-step:
(22) | ||||
(23) |
where is the ELBO with the constant value .
Since the E-step requires the calculation of a multidimensional integral that is computationally prohibitive, we approximate the posterior by , using the EP algorithm. After the E-step, the maximization problem in (23) with the approximate posterior is solved, which is described in detail in Section V-E. The EP procedure continues until it reaches the maximum number of iterations . Finally, the last updated parameters at are used as the final estimates as , , and .
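For a zero-mean complex Gaussian prior with variance as the hyper parameter, the M-step has a well-known closed form in sparse Bayesian learning: the new variance equals the posterior second moment. The sketch below shows that textbook update and checks it against the expected log-prior. Whether the paper uses exactly this per-element form is an assumption, and the function names are illustrative.

```python
import numpy as np


def expected_log_prior(gamma, post_mean, post_var):
    """E_q[ln CN(e; 0, gamma)] for a posterior q with the given mean and variance."""
    second_moment = np.abs(post_mean) ** 2 + post_var
    return -np.log(np.pi * gamma) - second_moment / gamma


def em_variance_update(post_mean, post_var):
    """Standard SBL M-step: the maximiser of the expected log-prior over gamma."""
    return np.abs(post_mean) ** 2 + post_var


post_mean, post_var = 0.3 - 0.4j, 0.05
gamma_new = em_variance_update(post_mean, post_var)
print(gamma_new)
```

Entries whose posterior mean and variance both shrink therefore receive a small prior variance in the next iteration, which is how the Gaussian prior promotes sparsity.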
In what follows, let us drop the iteration index for notational simplicity. The approximate posterior is derived by minimizing the KL divergence subject to a Gaussian distribution set as
(24) |
where the approximate posterior is designed as
(25) |
where is a normalizing constant, and , , , and are the approximate factors such that , , and subject to Gaussian distribution set .
These approximate factors are designed as , , , , where , , , and are the parameterized approximate functions defined as
(26a) | ||||
(26b) | ||||
(26c) | ||||
(26d) |
where , , , and are unknown parameters to be optimized by minimizing the KL divergence.
Since the approximate posterior in (25) is designed subject to Gaussian distribution set , the marginalized approximate posterior and can be expressed as and , where and are the posterior means, and and are the posterior variances.
Let denote an unknown parameter set to be optimized. The optimal unknown parameter set is obtained by minimizing the KL divergence in (24). However, the objective function cannot be expressed in closed-form because includes intractable integral operations with respect to the true posterior . To tackle this, we set the target distribution instead of the true posterior into the KL divergence in (24). The target distribution is designed by replacing a part of the true posterior with the approximate functions in (26a)-(26d) as described in the following sections. For the sake of notation convenience for the design of the target distribution in the following section, the approximate distribution and are expressed using (26a)-(26d) as
(27a) | ||||
(27b) |
V-C EP for Data Estimation
V-C1 Update
While the parameter in is updated, the other parameters are fixed as the tentative estimated values, that is, the KL minimization problem for is formulated as
(28) |
where is the target distribution for , which is designed using in (27a) as
(29) |
where is a normalizing constant.
Let denote the objective function in (28), resorting to
(30) |
Since the objective function is convex with respect to , the necessary and sufficient condition for the global optimal , i.e., , is equivalent to
(31) |
where is the projection operator onto Gaussian distribution set , which indicates the moment matching, i.e., the first and second moments of the distribution match those of the target distribution.
The marginalized approximate posterior in (31) is written as
(32) |
with , which can also be represented as
(33) |
with the normalizing constant .
The marginalized target distribution in (31) is written as
(34) |
where is the conditional probability distribution defined in (35), along with , and , with being the normalizing constant and
(35) | ||||
(36) | ||||
(37) |
From the conditional distribution in (35), the mean and covariance , can be calculated as
(38a) | ||||
(38b) |
with .
Under large system conditions with CLT, the conditional distribution can be approximated as . Thus, the approximate function can be expressed as with the mean and variance calculated as
(40a) | ||||
(40b) |
with . The calculation of in (38a) corresponds to a soft interference cancellation (Soft-IC) [12] using data replicas and channel replicas .
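The idea behind soft interference cancellation can be sketched in a few lines: the contributions of all other UEs, rebuilt from the current channel and data replicas, are subtracted from the received block so that only the target UE's contribution (plus residual error) remains. The sketch ignores the sub-array partition and the split into model-based estimate and residual error, and its names are hypothetical.

```python
import numpy as np


def soft_interference_cancellation(Y, H_hat, X_hat, u):
    """Remove the replicas of every UE except `u` from the received block.

    Y: (N, T) received block, H_hat: (N, U) channel replicas,
    X_hat: (U, T) soft symbol replicas. Returns an (N, T) array.
    """
    # Subtract the full replica, then add back the target UE's own term.
    return Y - H_hat @ X_hat + np.outer(H_hat[:, u], X_hat[u])
```

With perfect replicas and no noise, the output reduces to the rank-one term of the target UE alone; with imperfect replicas, the leftover interference is what the variance term of the approximate function has to account for.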
V-C2 Update
The KL minimization problem for is formulated as
(41) |
where is the target distribution defined as
(42) |
where is a normalizing constant.
Similar to the derivation of (31), the optimal condition for is derived as
(43) |
where is the marginalized target distribution calculated as
(44) |
with the approximate function multiplied over the sub-array direction, , calculated as
(45) |
with
(46) |
Note that combining the mean and variance over the sub-array direction , as written in (46), leads to further improvements for data detection owing to the spatial diversity. Substituting (44) into (43), the approximate posterior can be written as
(47) |
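Multiplying Gaussian factors over the sub-array direction amounts to inverse-variance weighting: precisions add, and the combined mean is the precision-weighted average of the individual means. A minimal sketch of this fusion step is given below, assuming each sub-array delivers a scalar Gaussian message per symbol; the function name and array layout are illustrative.

```python
import numpy as np


def combine_gaussians(means, variances):
    """Fuse Gaussian messages along axis 0 (the sub-array axis).

    means: (K, ...) complex means, variances: (K, ...) positive variances.
    Returns the mean and variance of the normalised product of the K factors.
    """
    means = np.asarray(means)
    variances = np.asarray(variances, dtype=float)
    precision = np.sum(1.0 / variances, axis=0)
    var = 1.0 / precision
    mean = var * np.sum(means / variances, axis=0)
    return mean, var


mean, var = combine_gaussians([1.0 + 0.0j, 0.0 + 1.0j], [0.5, 0.5])
print(mean, var)
```

The combined variance is never larger than the smallest input variance, which reflects the spatial diversity gain mentioned above.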
The approximate posterior mean and variance of can be derived using the MMSE denoiser function [31], which is designed based on the prior for QAM constellation in (18). Then, the posterior mean and variance is expressed as and , which can be calculated as
(48a) | ||||
(48b) |
with
As shown in the Soft-IC process in (38a)-(38b), and in (50) are used as soft replicas instead of and in (48a)-(48b) in order to suppress the self-noise feedback in the algorithm iterations [32]. In conventional JCDE algorithms [11, 12], the self-feedback suppression is performed before the denoising process in (48a)-(48b) by generating antenna-wise extrinsic values based on BP rules. Hence, the complexity of the denoising process is . In contrast, the proposed method can reduce the complexity of the denoising process to , since the extrinsic values and are generated after the denoising process in (50).
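An MMSE denoiser for a discrete constellation prior computes the posterior mean and variance of a symbol observed through a Gaussian message, which is the moment-matching step for the data. The following is a generic sketch assuming a uniform prior over the constellation points and a circularly symmetric complex Gaussian message; it is not claimed to reproduce the exact expressions of the paper.

```python
import numpy as np


def qam_mmse_denoiser(msg_mean, msg_var, constellation):
    """Posterior mean and variance of symbols under a uniform constellation prior.

    msg_mean, msg_var: arrays of equal shape describing the Gaussian message.
    constellation: 1D complex array of candidate points.
    """
    r = np.asarray(msg_mean)[..., None]
    v = np.asarray(msg_var, dtype=float)[..., None]
    # Log-likelihood of each constellation point, stabilised before exponentiation.
    log_w = -np.abs(r - constellation) ** 2 / v
    log_w -= log_w.max(axis=-1, keepdims=True)
    w = np.exp(log_w)
    w /= w.sum(axis=-1, keepdims=True)
    mean = np.sum(w * constellation, axis=-1)
    var = np.sum(w * np.abs(constellation) ** 2, axis=-1) - np.abs(mean) ** 2
    return mean, var


qpsk = np.array([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j]) / np.sqrt(2)
mean, var = qam_mmse_denoiser(np.array([0.7 + 0.7j]), np.array([1e-3]), qpsk)
print(mean, var)
```

The cost of one call scales with the number of inputs times the constellation size, so the number of messages passed through the denoiser directly drives the complexity discussed above.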
V-D EP for Residual Channel Error Estimation
V-D1 Update
For , we minimize
(53) |
where is the target distribution designed as
(54) |
where is a normalizing constant.
Through the same procedure as the derivation of in (39), the mean and variance of approximate function are obtained as
(55) |
with
(56a) | ||||
(56b) | ||||
(56c) | ||||
(56d) |
V-D2 Update
For , we have
(57) |
where is the target distribution designed as
(58) |
where is a normalizing constant.
Following the same methodology used to derive in (47), the approximate posterior is derived as
(59) |
where the mean and variance of can be calculated based on the prior distribution in (19) as
(60) |
with
(61) |
Similarly, the approximate function can be derived in the same manner as (49):
(62) |
from which the mean and variance are respectively given by
(63a) | ||||
(63b) |
Finally, the approximate function is obtained in a similar way as the derivation of (51) as
(64) |
with the mean and variance of being
(65) |
V-E Expectation Maximization for Hyper Parameter Learning
V-F Reinforcement for the Model-Based Estimate
To further improve the convergence performance for the EP algorithm, we update the model-based estimate in each iteration. Using the estimated residual channel error at the -th iteration, the channel estimate for the -th UE can be reconstructed as
(68) |
The model-based estimate at the -th iteration is updated with the channel estimate at the previous iteration in (68). To efficiently estimate by leveraging the near-field sparsity, the virtual channel representation with polar grids described in Section IV-A is utilized. The grids are redesigned at each iteration: the center of the grids is set to the angle and distance estimates of the previous iteration, and the range of the grids decreases with the number of iterations. Thus, the angle and distance grids for the -th UE and -th path at the -th iteration are designed as
(69a) | ||||
(69b) |
with , . and are the angle and distance estimates at the -th iteration, respectively, and and are, respectively, the range of angle and distance grids, where the initial values and are determined using the angle and distance estimates obtained by the initial channel estimation as shown in Algorithm 1.
Note that the range of angle and distance grids and are respectively designed by a monotonically decreasing function such as and , where the constant values are uniquely determined with the desired range , , , and . Accordingly, the sets of angle and distance grids for the -th UE are defined as , , , and .
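The shrinking-grid design can be sketched as follows. The sketch assumes an exponential decay for the grid range, with its two constants fixed by the desired range at the first and last iterations, and it assumes that "range" means the full width of a uniformly spaced grid centered on the previous estimate. The function names and the decay form are illustrative assumptions, not the exact functions used in (69a)-(69b).

```python
import math

def range_schedule(first, last, num_iters):
    """Grid range for iterations 1..num_iters, decaying exponentially from first to last."""
    if num_iters == 1:
        return [first]
    rate = math.log(first / last) / (num_iters - 1)   # decay rate fixed by the two end points
    return [first * math.exp(-rate * (t - 1)) for t in range(1, num_iters + 1)]

def centered_grid(center, width, num_points):
    """Uniform grid of num_points values spanning [center - width/2, center + width/2]."""
    if num_points == 1:
        return [center]
    step = width / (num_points - 1)
    return [center - width / 2 + i * step for i in range(num_points)]

# Example: an angle grid that narrows around a (hypothetical) previous estimate of 0.5 rad.
widths = range_schedule(first=0.2, last=0.02, num_iters=5)
grids = [centered_grid(0.5, w, num_points=5) for w in widths]
```

The same two helpers would apply to the distance grids by passing the distance estimate and the distance range instead.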
Using the angle and distance grids in (69a)-(69b), the polar-domain dictionary matrix for the -th UE is designed as
(70) |
where is the virtual array response for the -th UE and -th path defined as
Through the virtual channel representation with the dictionary matrix , the near-field channel for the -th UE can be expressed as
(71) |
where is the virtual path gain vector for the -th path, and is the virtual path gain vector including all paths.
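To make the virtual channel representation concrete, the sketch below builds a small polar-domain dictionary and synthesizes a channel from virtual path gains. It assumes a uniform linear array centered at the origin and a standard spherical-wavefront response with unit-norm columns; the exact array response used in this paper may differ in normalization or geometry, and all function names are hypothetical.

```python
import cmath
import math

def near_field_response(theta, r, num_ant, spacing, wavelength):
    """Unit-norm spherical-wave response of a ULA for a source at angle theta (rad) and distance r (m)."""
    out = []
    for n in range(num_ant):
        pos = (n - (num_ant - 1) / 2) * spacing   # element position, array centered at 0
        r_n = math.sqrt(r * r + pos * pos - 2 * r * pos * math.sin(theta))
        out.append(cmath.exp(-2j * math.pi * (r_n - r) / wavelength) / math.sqrt(num_ant))
    return out

def polar_dictionary(angle_grid, dist_grid, num_ant, spacing, wavelength):
    """One column per (angle, distance) grid pair, returned as a list of columns."""
    return [near_field_response(th, r, num_ant, spacing, wavelength)
            for th in angle_grid for r in dist_grid]

def synthesize(dictionary, gains):
    """Channel vector as the gain-weighted sum of dictionary columns."""
    num_ant = len(dictionary[0])
    return [sum(g * col[n] for g, col in zip(gains, dictionary)) for n in range(num_ant)]
```

With a sparse gain vector, `synthesize` reproduces the idea behind (71): only the few columns whose angle-distance pair matches a physical path contribute to the channel.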
In light of the near-field model in (71), an update of the model-based estimate can be obtained by
(72) |
with denoting the path gain estimates, being the corresponding array responses, which can be computed by solving
subject to | ||||
(73) |
To summarize, the proposed algorithm is encapsulated in Algorithm 2, where a damping scheme [16] is introduced in lines 6, 7, 12, and 18 to enhance convergence performance.
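The effect of damping can be seen in a toy example. The sketch below assumes the common convex-combination form, in which the damped value is `eta * new + (1 - eta) * old`; the exact form applied in Algorithm 2 is not restated here, and the linear map used for the demonstration is purely illustrative.

```python
def damp(new, old, eta):
    """Convex combination of the new and previous values; eta = 1 disables damping."""
    if not 0.0 < eta <= 1.0:
        raise ValueError("eta must lie in (0, 1]")
    return eta * new + (1.0 - eta) * old

def iterate(x0, eta, num_iters):
    """Run the toy update x <- -1.2 * x with damping factor eta."""
    x = x0
    for _ in range(num_iters):
        x = damp(-1.2 * x, x, eta)
    return x

undamped = iterate(1.0, eta=1.0, num_iters=20)   # magnitude grows by a factor of 1.2 per step
damped = iterate(1.0, eta=0.5, num_iters=20)     # effective update is x <- -0.1 * x
```

In this toy case the undamped recursion diverges while the damped one contracts, which is the kind of stabilization damping is meant to provide in message-passing iterations.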
VI Simulation Results
This section evaluates the performance of the proposed initial channel estimation and subsequent JCDE algorithms under the following setup. The carrier frequency is , the number of BS antennas is , the number of UEs is , the modulation order is -QAM, and the lengths of the pilot and data sequences are and , respectively. The non-orthogonal pilot is designed by the frame design method in [15]. The near-field channel is composed of paths, i.e., LoS path and NLoS paths, with a Rician -factor of 10 dB. The total number of paths is , and the corresponding oversampling quantity used in Algorithm 1 is set to . The AoAs and distances are generated uniformly at random in the ranges and m, respectively. The polar-domain dictionary in (IV-A) is designed with , and desired coherence in [10]. The performance is evaluated by the normalized mean-squared error (NMSE) and bit error rate (BER) under various signal-to-noise ratios (SNRs). NMSE and SNR are defined as , and . In what follows, the initial channel estimation and JCDE performance are evaluated in Sections VI-A and VI-B, respectively.
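As a rough illustration of the NMSE metric, the following sketch assumes the common definition, i.e., the squared estimation error normalized by the squared norm of the true channel and averaged over Monte Carlo trials. The normalization used in this paper may differ in detail, and the function names are hypothetical.

```python
import math

def nmse(h_true, h_est):
    """Squared error between two channel vectors, normalized by the energy of the true channel."""
    num = sum(abs(a - b) ** 2 for a, b in zip(h_true, h_est))
    den = sum(abs(a) ** 2 for a in h_true)
    return num / den

def average_nmse_db(trials):
    """Average NMSE in dB over a list of (h_true, h_est) pairs."""
    vals = [nmse(h, h_hat) for h, h_hat in trials]
    return 10 * math.log10(sum(vals) / len(vals))
```

For example, a single trial in which the estimate deviates from a unit-energy channel by an amplitude of 0.1 gives an NMSE of 0.01, i.e., about -20 dB.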
VI-A Initial Channel Estimation Performance
To evaluate the initial channel estimation performance, the following estimation methods are compared: (a) LS: a classical least squares-based channel estimation; (b) P-SOMP [7]: a near-field channel estimation that does not consider the non-orthogonality of pilots; (c) 2D-CoSaMP [10]: a near-field channel estimation that considers the non-orthogonality; and (d) the proposed initial channel estimation method in Algorithm 1.
Fig. 4 shows the NMSE against SNR. The P-SOMP exhibits limited improvement with an increase in SNR due to pilot contamination stemming from non-orthogonal pilots, whereas 2D-CoSaMP demonstrates a performance enhancement compared to P-SOMP. The proposed method surpasses these conventional methods by mitigating noise amplification through the utilization of 2D-OMP in the second stage associated with UE-path pairing, resulting in superior channel estimation. Fig. 4 and Table I show the computational complexity evaluated by floating point operations (FLOPs). As depicted in the figure, the FLOPs of the proposed method are comparable to 2D-CoSaMP, owing to the two-stage procedure separating angle-distance estimation and UE-path pairing.
![Refer to caption](x4.png)
![Refer to caption](x5.png)
VI-B JCDE Performance
In this subsection, we evaluate the performance of the proposed JCDE algorithm.
As for JCDE algorithm parameters,
the damping factor is set to ,
the number of iterations is ,
the number of grids are ,
the grid ranges are
,
,
, and
, respectively.
The extremely large array with antennas is divided into sub-arrays with antennas per sub-array.
For comparison, AoA-aided BiGaBP [12], a state-of-the-art JCDE algorithm, is employed as a benchmark.
In addition, we consider an ideal genie-aided case with perfect knowledge of the CSI or data, which corresponds to the lower bound of the proposed method.
VI-B1 JCDE Performance with Initial Channel Estimation
![Refer to caption](x6.png)
![Refer to caption](x7.png)
This subsection reveals the NMSE and BER performance of the JCDE algorithms with various initial channel estimation methods, including P-SOMP, 2D-CoSaMP, and the proposed initial channel estimation method. To evaluate the data detection capability of the above initial channel estimation methods, the LMMSE detector is used for data estimation.
Fig. 5 shows the BER and NMSE performance. As shown in the figures, LMMSE with LS, which cannot take advantage of the near-field model structure, exhibits poor BER performance, whereas LMMSE with the other initial estimation approaches, which do consider the near-field model structure, achieves only a slight improvement. In both cases, high error floors remain due to the non-orthogonal pilots. In contrast, the JCDE algorithms improve the BER performance by utilizing both the pilots and the subsequent data. In particular, the proposed JCDE algorithm with the proposed initial channel estimation demonstrates a significant performance gain, approaching the lower bound of perfect CSI or perfect data.
Moreover, the proposed JCDE algorithm demonstrates a notable BER improvement over the state-of-the-art AoA-aided BiGaBP [12].
The performance improvement can be attributed to two primary factors.
The first factor is that the proposed algorithm can leverage the near-field model-based estimation described in Section V-F, whereas BiGaBP relies on the far-field assumption.
The second factor is that the proposed sub-array-wise LMMSE-based detection in (40a) is capable of addressing the correlation between the leaked energy in the beam-domain, whereas BiGaBP is incapable of doing so because of its MRC.
To examine these two factors, Section VI-B2 presents the convergence analysis with and without the near-field model information.
In addition, Section VI-B3 evaluates the performance and complexity of the proposed sub-array-wise LMMSE-based detection across various numbers of sub-arrays .
VI-B2 Convergence Analysis
To clarify the advantages gained by leveraging the near-field model structure, we evaluate the proposed JCDE algorithm with and without the model-based estimation process explained in Section V-F.
Fig. 6 illustrates the BER and NMSE convergence behavior with respect to the number of algorithmic iterations.
In the figure, the red triangle marker corresponds to the proposed JCDE algorithm without the model-based estimate, i.e., , where the prior distribution is designed i.i.d. for each element of instead of , akin to [13, 14].
The green square marker corresponds to the proposed JCDE algorithm with the initial model-based estimate but without updating in iterations, i.e., .
Comparing the red triangle marker and the green square marker, we can verify the performance improvement stemming from the use of the near-field model through the decomposition of into and as written in (15).
Furthermore, in comparison to the proposed algorithm with adaptive update, it can be seen that the adaptive updating of the model-based estimate enhances the BER and NMSE performance by further exploiting the near-field model.
![Refer to caption](x8.png)
![Refer to caption](x9.png)
VI-B3 Performance Against the Number of Sub-arrays
Algorithm | FLOPs | ||
---|---|---|---|
BiGaBP [12] | |||
Proposed |
|
![Refer to caption](x10.png)
![Refer to caption](x11.png)
To analyze the impact of the number of sub-arrays on the performance of the proposed JCDE algorithm employing the sub-array-wise LMMSE-based detection, we show in Fig. 7 the BER and FLOPs with respect to various numbers of sub-arrays , where corresponds to the full-array LMMSE-based detection and corresponds to the MRC-based detection. As depicted in the figure, the BER performance degrades as the number of sub-arrays increases (i.e., the number of antennas at each sub-array decreases) because each sub-array fails to effectively whiten the correlation in the beam-domain even in the perfect CSI case. In particular, the MRC-based detection corresponding to exhibits poor performance. In contrast, an increase in the number of sub-arrays leads to a reduction in FLOPs, attributed to the decreased size of the matrix to be inverted in the LMMSE-based detection in (40a). Despite relying on the LMMSE-based detector, the proposed algorithm can achieve lower FLOPs than BiGaBP, which relies on an MRC-based detector. This is because the proposed method suppresses self-feedback in (50) after the denoising process in (48a)-(48b) with FLOPs , whereas BiGaBP suppresses self-feedback before the denoising process [12, 32] with FLOPs , which is the dominant complexity of the entire process, as shown in Table II. From the above results, it is evident that the proposed method outperforms the conventional method in terms of both data detection and complexity.
VII Conclusion
This paper proposed an initial channel estimation algorithm and subsequent JCDE algorithm for multiuser XL-MIMO systems with non-orthogonal pilots. The initial channel estimation is performed by an efficient two-stage compressed sensing algorithm exploiting the polar-domain sparsity. Furthermore, the initial channel estimates are refined by jointly utilizing both non-orthogonal pilots and data via the EP algorithm. To improve channel estimation accuracy, the model-based deterministic approach is integrated into a Bayesian inference framework. In addition, to address the near-field specific correlation in the beam domain, a sub-array-wise LMMSE filter is designed considering the correlation and channel estimation errors for data detection. Computer simulations validated that the proposed method is superior to existing approaches in terms of channel estimation, data detection, and complexity.
References
- [1] H. Tataria, M. Shafi, A. F. Molisch, M. Dohler, H. Sjöland, and F. Tufvesson, “6G wireless systems: Vision, requirements, challenges, insights, and opportunities,” Proc. IEEE, vol. 109, no. 7, pp. 1166–1199, 2021.
- [2] E. D. Carvalho, A. Ali, A. Amiri, M. Angjelichinoski, and R. W. Heath, “Non-stationarities in extra-large-scale massive MIMO,” IEEE Wirel. Commun., vol. 27, no. 4, pp. 74–80, 2020.
- [3] Z. Wang et al., “A tutorial on extremely large-scale MIMO for 6G: Fundamentals, signal processing, and applications,” IEEE Commun. Surveys Tuts., Early Access, 2024.
- [4] H. Iimori, T. Takahashi, K. Ishibashi, G. T. F. de Abreu, D. González G., and O. Gonsa, “Joint activity and channel estimation for extra-large MIMO systems,” IEEE Trans. Wirel. Commun., vol. 21, no. 9, pp. 7253–7270, 2022.
- [5] M. Cui, Z. Wu, Y. Lu, X. Wei, and L. Dai, “Near-field MIMO communications for 6G: Fundamentals, challenges, potentials, and future directions,” IEEE Commun. Mag., vol. 61, no. 1, pp. 40–46, 2023.
- [6] Y. Liu, Z. Wang, J. Xu, C. Ouyang, X. Mu, and R. Schober, “Near-field communications: A tutorial review,” IEEE Open J. Commun. Soc., vol. 4, pp. 1999–2049, 2023.
- [7] M. Cui and L. Dai, “Channel estimation for extremely large-scale MIMO: Far-field or near-field?” IEEE Trans. Commun., vol. 70, no. 4, pp. 2663–2677, 2022.
- [8] J. Rodríguez-Fernández, N. González-Prelcic, K. Venugopal, and R. W. Heath, “Frequency-domain compressive channel estimation for frequency-selective hybrid millimeter wave MIMO systems,” IEEE Trans. Wirel. Commun., vol. 17, no. 5, pp. 2946–2960, 2018.
- [9] C. Hu, L. Dai, T. Mir, Z. Gao, and J. Fang, “Super-resolution channel estimation for mmWave massive MIMO with hybrid precoding,” IEEE Trans. Veh. Technol., vol. 67, no. 9, pp. 8954–8958, 2018.
- [10] X. Xie, Y. Wu, J. An, D. W. K. Ng, C. Xing, and W. Zhang, “Massive unsourced random access for near-field communications,” IEEE Trans. Commun., pp. 1–1, Early Access 2024.
- [11] K. Ito, T. Takahashi, S. Ibi, and S. Sampei, “Bilinear Gaussian belief propagation for massive MIMO detection with non-orthogonal pilots,” IEEE Trans. Commun., vol. 72, no. 2, pp. 1045–1061, 2024.
- [12] K. Ito, T. Takahashi, K. Igarashi, S. Ibi, and S. Sampei, “AoA estimation-aided Bayesian receiver design via bilinear inference for mmWave massive MIMO,” in Proc. IEEE Int. Conf. Commun. (ICC), 2023, pp. 6474–6479.
- [13] W. Yan and X. Yuan, “Semi-blind channel-and-signal estimation for uplink massive MIMO with channel sparsity,” IEEE Access, vol. 7, pp. 95 008–95 020, 2019.
- [14] L. Chen and X. Yuan, “Blind multiuser detection in massive MIMO channels with clustered sparsity,” IEEE Wirel. Commun. Lett., vol. 8, no. 4, pp. 1052–1055, 2019.
- [15] H. Iimori, T. Takahashi, K. Ishibashi, G. T. F. de Abreu, and W. Yu, “Grant-free access via bilinear inference for cell-free MIMO with low-coherence pilots,” IEEE Trans. Wirel. Commun., vol. 20, no. 11, pp. 7694–7710, 2021.
- [16] J. T. Parker, P. Schniter, and V. Cevher, “Bilinear generalized approximate message passing―part I: Derivation,” IEEE Trans. Signal Process., vol. 62, no. 22, pp. 5839–5853, 2014.
- [17] Y. Kabashima, “A CDMA multiuser detection algorithm on the basis of belief propagation,” J. Phys. A, Math. Gen., vol. 36, no. 43, 2003.
- [18] D. Fan et al., “Angle domain channel estimation in hybrid millimeter wave massive MIMO systems,” IEEE Trans. Wirel. Commun., vol. 17, no. 12, pp. 8165–8179, 2018.
- [19] Y. Fang, J. Wu, and B. Huang, “2D sparse signal recovery via 2D orthogonal matching pursuit,” Sci. China Inf. Sci., vol. 55, pp. 889–897, 2012.
- [20] T. P. Minka, “Expectation propagation for approximate Bayesian inference,” in Proc. 17th Conf. Uncertainty Artif. Intell. (UAI), 2001, pp. 362–369.
- [21] H. Iimori, G. T. F. de Abreu, O. Taghizadeh, R.-A. Stoica, T. Hara, and K. Ishibashi, “Stochastic learning robust beamforming for millimeter-wave systems with path blockage,” IEEE Wirel. Commun. Lett., vol. 9, no. 9, pp. 1557–1561, 2020.
- [22] Y. Pati, R. Rezaiifar, and P. Krishnaprasad, “Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition,” in Proc. Asilomar Conf. Signals, Syst., Comput, 1993, pp. 40–44 vol.1.
- [23] B. L. Sturm and M. G. Christensen, “Comparison of orthogonal matching pursuit implementations,” in Proc. 20th Eur. Signal Process. Conf. (EUSIPCO), 2012, pp. 220–224.
- [24] J. Ma and L. Ping, “Orthogonal AMP,” IEEE Access, vol. 5, pp. 2020–2033, 2017.
- [25] S. Rangan, P. Schniter, and A. K. Fletcher, “Vector approximate message passing,” IEEE Trans. Inf. Theory, vol. 65, no. 10, pp. 6664–6684, 2019.
- [26] H. Wang, A. Kosasih, C.-K. Wen, S. Jin, and W. Hardjawana, “Expectation propagation detector for extra-large scale massive MIMO,” IEEE Trans. Wirel. Commun., vol. 19, no. 3, pp. 2036–2051, 2020.
- [27] A. Mishra, A. Rajoriya, A. K. Jagannatham, and G. Ascheid, “Sparse bayesian learning-based channel estimation in millimeter wave hybrid MIMO systems,” in Proc. IEEE Int. Workshop Sig. Process. Ad. Wirel. Commun. (SPAWC), 2017, pp. 1–5.
- [28] C. M. Bishop, Pattern Recognition and Machine Learning (Information Science and Statistics). Berlin, Germany: Springer-Verlag, 2006.
- [29] M. E. Tipping, “Sparse Bayesian learning and the relevance vector machine,” J. Mach. Learn. Res., vol. 1, no. 2, pp. 211–244, 2002.
- [30] P. Jain, P. Kar et al., “Non-convex optimization for machine learning,” Found. Trends Mach. Learn., vol. 10, no. 3-4, pp. 142–363, 2017.
- [31] Q. Zou and H. Yang, “A concise tutorial on approximate message passing,” arXiv:2201.07487, 2022.
- [32] R. Tamaki, K. Ito, T. Takahashi, S. Ibi, and S. Sampei, “Suppression of self-noise feedback in GAMP for highly correlated large MIMO detection,” in Proc. IEEE Int. Conf. Commun. (ICC), 2022, pp. 1300–1305.