Scatterer Recognition from LiDAR Point Clouds for Environment-Embedded Vehicular Channel Modeling via Synesthesia of Machines

Ziwei Huang, Lu Bai, Zengrui Han, and Xiang Cheng

Z. Huang, Z. Han, and X. Cheng are with the State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Peking University, Beijing, 100871, P. R. China (e-mail: [email protected], [email protected], [email protected]). L. Bai is with the Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan, 250101, P. R. China (e-mail: [email protected]).

Abstract

In this paper, a novel environment-embedded vehicular channel model is proposed by scatterer recognition from light detection and ranging (LiDAR) point clouds via Synesthesia of Machines (SoM). To provide a robust data foundation, a new intelligent sensing-communication integration dataset in vehicular urban scenarios is constructed. Based on the constructed dataset, the complex SoM mechanism, i.e., the mapping relationship between scatterers in electromagnetic space and LiDAR point clouds in the physical environment, is explored via a multilayer perceptron (MLP) combined with the electromagnetic propagation mechanism. By using LiDAR point clouds to implement scatterer recognition, channel non-stationarity and consistency are modeled in an environment-embedded manner. Using ray-tracing (RT)-based results as the ground truth, the scatterer recognition accuracy exceeds 90%. The accuracy of the proposed model is further verified by the close fit between simulation results and RT results.

Index Terms:

Intelligent sensing-communication integration, Synesthesia of Machines (SoM), environment-embedded vehicular channel modeling, LiDAR point clouds, scatterer recognition.

I Introduction

To support precise localization sensing and efficient communication link establishment for intelligent vehicles, it is essential to achieve an in-depth understanding of the surrounding environment and high-precision vehicular channel modeling. However, widely used approaches, which rely solely on radio frequency (RF) communication information, struggle to achieve high-precision vehicular channel modeling, and thus cannot support the aforementioned applications related to intelligent vehicles. Fortunately, intelligent vehicles are equipped with multi-modal devices, which can acquire surrounding environmental information and further assist in vehicular channel modeling [1]. To adequately utilize the multi-modal information in the surrounding environment, a novel concept inspired by human synesthesia, i.e., Synesthesia of Machines (SoM), has been proposed [2]. SoM aims to achieve intelligent integration of communications and multi-modal sensing via artificial neural networks. As the cornerstone of SoM research, the exploration of the SoM mechanism, i.e., the mapping relationship between the physical environment and electromagnetic space, is essential. Based on the SoM mechanism, a high-precision vehicular channel model can be constructed in an environment-embedded manner.

Considering the necessity of exploring the SoM mechanism, i.e., the mapping relationship, some preliminary work has been conducted. The authors in [3] proposed an environment reconstruction method based on LiDAR point clouds, and further explored the mapping relationship between LiDAR point clouds and path loss. However, the mapping relationship explored in [3] was limited to sensing and channel large-scale fading. As stated in [4], multipath fading, i.e., channel small-scale fading, is a significant factor that affects communication system design and presents more challenges than channel large-scale fading. To intuitively characterize multipath fading, the concept of scatterers is introduced to model the interaction between radio waves and objects [5]. Currently, extensive vehicular channel measurements [6]–[8] have been conducted and standardized channel models [5, 9] have been developed to explore the spatial attributes of scatterers, including their numbers and positions. By characterizing the spatial attributes of scatterers, channel non-stationarity and consistency can be modeled through the birth-death (BD) process and the visibility region (VR) [9]–[11]. Based on the Markov chain, the BD process characterizes the mathematical relationship for the variation of the scatterer number, thus capturing channel non-stationarity. Based on the geometry, the VR characterizes the spatial relationship for the smooth evolution of scatterers, thus capturing channel non-stationarity and consistency. Nevertheless, the aforementioned two methods model the scatterer variation/evolution only statistically. Since the mapping relationship between objects in the physical environment and scatterers in electromagnetic space is ignored, channel non-stationarity and consistency cannot be accurately captured, and the tight interplay between the physical environment and electromagnetic space cannot be modeled.
Although existing vehicular channel models preliminarily capture the variation/evolution of scatterers and channel non-stationarity/consistency by utilizing BD process and VR method, they cannot meet the high-precision requirements of vehicular channel models. A high-precision vehicular channel model, which can capture channel non-stationarity and consistency in an environment-embedded manner, i.e., environment-channel non-stationarity and consistency, is urgently required.

Figure 1: BEV of vehicular urban crossroad simulation scenarios at Snapshot 200. Figs. (a)–(c) are scenarios in AirSim with low, medium, and high VTDs, respectively. Figs. (d)–(f) are scenarios in Wireless InSite with low, medium, and high VTDs, respectively.

To fill this gap, we propose a novel environment-embedded vehicular channel model via SoM. By using AirSim [12] and Wireless InSite [13], a new intelligent sensing-communication integration dataset in vehicular scenarios with low, medium, and high vehicular traffic densities (VTDs) is constructed. The LiDAR point cloud is processed by the density-based spatial clustering of applications with noise (DBSCAN) algorithm to extract physical environment features, which are further aligned with electromagnetic space. By leveraging a typical artificial neural network, i.e., the multilayer perceptron (MLP), together with electromagnetic propagation mechanisms, the complex SoM mechanism, i.e., the mapping relationship between LiDAR point clouds in the physical environment and scatterers in electromagnetic space, is investigated for the first time. To model environment-channel non-stationarity and consistency, physical environment features obtained from LiDAR point clouds are utilized for scatterer recognition, thus modeling the spatial attributes, i.e., numbers and positions, of scatterers in an environment-embedded manner. Using ray-tracing (RT)-based results as the ground truth, simulation results show that the scatterer recognition accuracy exceeds 90% in each VTD condition. The accuracy of the proposed model is also verified by the close fit between simulation results and RT results.

II Mapping Relationship Exploration: Scatterer Recognition from LiDAR Point Clouds

II-A High-Fidelity Dataset Construction

By using AirSim [12] and Wireless InSite [13], we construct a new dataset for the vehicular urban crossroad. To obtain high-fidelity LiDAR point clouds, the simulation scenarios in AirSim are built with advanced three-dimensional (3D) modeling software that offers high-quality rendering. To collect high-fidelity scatterers, Wireless InSite exploits RT technology based on geometrical optics and the uniform theory of diffraction. Similar to our previous work in [14], the physical environment in AirSim and the electromagnetic space in Wireless InSite are deeply integrated and precisely aligned. In AirSim, the LiDAR equipped on each vehicle has $16$ channels, a $10$ Hz scanning frequency, and $240,000$ points per second, where the upward and downward fields of view (FoV) are $15^{\circ}$ and $-25^{\circ}$, respectively. In Wireless InSite, the communication device equipped on each vehicle operates at a $28$ GHz carrier frequency with $2$ GHz bandwidth, where the numbers of antennas at the transmitter (Tx) and receiver (Rx) are $L_{\mathrm{T}}=L_{\mathrm{R}}=1$. The heights of the car and the bus are $2$ m and $3$ m, respectively.

To ensure the diversity of the dataset, as shown in Fig. 1, we consider three VTD conditions, i.e., low, medium, and high, and three types of streets, i.e., vertical ($x$-axis), horizontal ($y$-axis), and crossing ($xy$-axis) streets. Each type of street has $6$ different transceiver links, e.g., Car5 (Tx) and Car7 (Rx) at the horizontal street, Car1 (Tx) and Car2 (Rx) at the vertical street, and Car1 (Tx) and Car8 (Rx) at the crossing street, thus covering line-of-sight (LoS) and non-LoS (NLoS) conditions. Each transceiver link has $1500$ snapshots, and the same transceiver links are used in every VTD condition. There are therefore $27,000$ snapshots in each VTD condition. Overall, the constructed dataset contains $81,000$ snapshots with high-fidelity LiDAR point clouds and scatterer information.
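The snapshot totals quoted above follow directly from the dataset layout. The short check below reproduces that arithmetic; the variable names are illustrative and do not come from the dataset itself.

```python
# Dataset layout as described in the text; all names here are illustrative.
VTD_CONDITIONS = ("low", "medium", "high")
STREET_TYPES = ("vertical", "horizontal", "crossing")
LINKS_PER_STREET = 6        # transceiver links per street type
SNAPSHOTS_PER_LINK = 1500   # snapshots recorded for each link

# Snapshots within one VTD condition: street types x links x snapshots.
snapshots_per_vtd = len(STREET_TYPES) * LINKS_PER_STREET * SNAPSHOTS_PER_LINK

# The same links are reused in every VTD condition.
total_snapshots = len(VTD_CONDITIONS) * snapshots_per_vtd

print(snapshots_per_vtd)  # 27000
print(total_snapshots)    # 81000
```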

II-B Mapping Relationship Exploration

For clarity, a step list illustrating the exploration of the SoM mechanism, i.e., the mapping relationship between the physical environment and electromagnetic space, is presented below.

Step 1: Unlike in monostatic sensing, the Tx and Rx are at different positions in vehicular communications. Therefore, the LiDAR point clouds at the Tx and Rx are concatenated to obtain the physical environment, as shown in Fig. 2(a).

Step 2: To reduce data redundancy, ground points are removed from the concatenated LiDAR point clouds in pre-processing, and the remaining points are then downsampled, as shown in Fig. 2(b).

Step 3: A typical clustering algorithm in machine learning, i.e., DBSCAN, is leveraged to efficiently obtain physical environment features. For clarity, Fig. 2(c) shows the bird’s-eye view (BEV) of LiDAR point clouds, which contain $18$ clustering groups.

Step 4: Since in-depth integration and precise alignment are ensured in the constructed dataset, the physical environment and electromagnetic space can be matched in the same world coordinate system. In Fig. 2(d), scatterers are located on the clustering groups. According to the RT mechanism, paths are significantly affected by the transmission distance and angle. To calculate the size and orientation of each clustering group, its circumscribed cuboid is obtained. The height of the circumscribed cuboid is the same as that of the clustering group. The projection of the circumscribed cuboid is the minimum-perimeter bounding rectangle of the projection of the clustering group.

Step 5: Since the MLP is well suited to tasks with numerical inputs and numerical outputs, it is exploited to achieve scatterer recognition from LiDAR point clouds, as shown in Fig. 2(e). The input consists of the physical environment features extracted from LiDAR point clouds, including the length, width, height, center point, and orientation vector of the circumscribed cuboid, and the position of the transceiver. The output is the scatterer number at each clustering group. For example, in Fig. 2, the output is a matrix with dimensions of $18$ by $1$. As a result, with the help of the MLP, the number of scatterers in electromagnetic space at each clustering group of LiDAR point clouds in the physical environment can be obtained for the first time.

Step 6: To further enhance the interpretability of the network output, the propagation mechanism is considered via the VR method. Similar to our previous work in [1], the scatterers are divided into dynamic and static scatterers, each of which is assigned a VR. Fig. 2(f) shows the VR assigned to static/dynamic scatterers, i.e., the 3D ellipsoid with the transceiver as the focus, where the major axis, minor axis, and focal length are $2a^{\mathrm{sta/dyn}}(t)$, $2b^{\mathrm{sta/dyn}}(t)$, and $2c^{\mathrm{sta/dyn}}(t)$, respectively. VR-related parameters are accurately obtained via RT-based channel data [1]. Finally, recognized scatterers that are located outside the VR are deleted, and the output number is updated accordingly.
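The VR check in Step 6 can be illustrated with a minimal sketch. It assumes that the VR is an ellipsoid of revolution whose two foci are the Tx and Rx positions, so that a point lies inside exactly when the sum of its distances to the foci is at most the major axis $2a$. The function names, coordinates, and the value of $a$ are hypothetical and are not taken from [1].

```python
import math

def inside_visibility_region(point, tx, rx, semi_major_a):
    """Return True if `point` is inside or on the ellipsoid with foci tx and rx.

    Assumption: ellipsoid of revolution, so membership reduces to
    dist(point, tx) + dist(point, rx) <= 2a.
    """
    return math.dist(point, tx) + math.dist(point, rx) <= 2.0 * semi_major_a

def filter_scatterers(scatterers, tx, rx, semi_major_a):
    """Delete scatterers outside the VR; the updated number is len(result)."""
    return [s for s in scatterers
            if inside_visibility_region(s, tx, rx, semi_major_a)]

# Hypothetical geometry in metres (world coordinates).
tx = (0.0, 0.0, 1.5)
rx = (40.0, 0.0, 1.5)
a = 30.0  # semi-major axis; must satisfy 2a >= |tx - rx|

scatterers = [(20.0, 10.0, 2.0), (20.0, 60.0, 2.0)]
kept = filter_scatterers(scatterers, tx, rx, a)
print(len(kept))  # 1: the second scatterer lies outside the VR
```

Static and dynamic scatterers would each be filtered with their own value of $a$, since the text assigns them separate VRs.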

By using the SoM mechanism, i.e., the mapping relationship, scatterer recognition from LiDAR point clouds is achieved. This facilitates environment-embedded vehicular channel modeling with accurate channel parameters and the capturing of environment-channel non-stationarity and consistency.
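The pre-processing in Steps 1 and 2 can be sketched as follows. The text does not specify how ground points are detected or how downsampling is performed, so the height threshold and the voxel-grid averaging used here are assumptions, as are all function names and parameter values.

```python
import math

def concatenate(cloud_tx, cloud_rx):
    """Step 1: merge Tx and Rx point clouds (assumed already in world coordinates)."""
    return list(cloud_tx) + list(cloud_rx)

def remove_ground(points, ground_z=0.2):
    """Step 2a: drop points at or below a height threshold (assumes flat ground)."""
    return [p for p in points if p[2] > ground_z]

def voxel_downsample(points, voxel=0.5):
    """Step 2b: replace all points in each occupied voxel by their centroid."""
    buckets = {}
    for x, y, z in points:
        key = (math.floor(x / voxel), math.floor(y / voxel), math.floor(z / voxel))
        buckets.setdefault(key, []).append((x, y, z))
    return [tuple(sum(axis) / len(pts) for axis in zip(*pts))
            for pts in buckets.values()]

# Tiny hypothetical clouds: (x, y, z) in metres.
cloud_tx = [(0.1, 0.1, 0.0), (1.0, 1.0, 1.0), (1.1, 1.1, 1.1)]
cloud_rx = [(5.0, 5.0, 2.0), (0.3, 0.2, 0.05)]

merged = concatenate(cloud_tx, cloud_rx)
no_ground = remove_ground(merged)
downsampled = voxel_downsample(no_ground)
print(len(merged), len(no_ground), len(downsampled))  # 5 3 2
```

The downsampled points would then be passed to a DBSCAN implementation in Step 3 to form the clustering groups.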

III Environment-Embedded Channel Modeling

In this section, an environment-embedded vehicular channel model by scatterer recognition from LiDAR point clouds is proposed. The framework of the proposed model is similar to our previous work in [1]. The channel impulse response (CIR) is given as (1). Due to page limitations, the definition of parameters in (1) is omitted, which can be found in [1].

$$
\begin{aligned}
h(t,\tau)={}&\underbrace{\sqrt{\frac{\Omega(t)}{\Omega(t)+1}}\,h^{\mathrm{LoS}}(t)\,\delta\left(\tau-\tau^{\mathrm{LoS}}(t)\right)}_{\mathrm{LoS}}+\underbrace{\sqrt{\frac{\eta^{\mathrm{GR}}(t)}{\Omega(t)+1}}\,h^{\mathrm{GR}}(t)\,\delta\left(\tau-\tau^{\mathrm{GR}}(t)\right)}_{\mathrm{Ground\,Reflection}}\\
&+\underbrace{\sum_{i=1}^{N_{\mathrm{c}}(t)}\sum_{n_{i}=1}^{N_{\mathrm{s}}(t)}\sqrt{\frac{\eta^{\mathrm{sta}}(t)}{\Omega(t)+1}}\,h^{\mathrm{sta}}_{i,n_{i}}(t)\,\delta\left(\tau-\tau^{\mathrm{sta}}_{i,n_{i}}(t)\right)+\sum_{j=1}^{M_{\mathrm{c}}(t)}\sum_{n_{j}=1}^{M_{\mathrm{s}}(t)}\sqrt{\frac{\eta^{\mathrm{dyn}}(t)}{\Omega(t)+1}}\,h^{\mathrm{dyn}}_{j,n_{j}}(t)\,\delta\left(\tau-\tau^{\mathrm{dyn}}_{j,n_{j}}(t)\right)}_{\mathrm{NLoS}}.
\end{aligned}
\tag{1}
$$

Channel non-stationarity and consistency are typical channel characteristics, which can be captured via the BD process and the VR method [9]–[11]. However, since the BD process and the VR method model only the mathematical relationship and the spatial relationship of the scatterer variation, respectively, the tight interplay between the physical environment and channel non-stationarity/consistency cannot be captured. To overcome this limitation and support applications related to intelligent vehicles, the proposed approach exploits the complex mapping relationship to achieve scatterer recognition from LiDAR point clouds, and thus captures environment-channel non-stationarity and consistency. For clarity, Fig. 3 illustrates the difference between the BD process, the VR method, and the proposed approach. Unlike in the BD process and the VR method, scatterers recognized from LiDAR point clouds in the proposed approach correspond to physical objects, i.e., vehicles, trees, and buildings.

In the proposed approach, at the initial time, the scatterer recognition is implemented by LiDAR point clouds based on the mapping relationship. Similar to [1], scatterers are divided into static and dynamic scatterers, which are clustered into static and dynamic clusters. Unlike [1], scatterers recognized by LiDAR point clouds correspond to certain objects in the physical environment. This leads to accurate channel parameters, including number $N_{\mathrm{s}}$/$N_{\mathrm{c}}$/$M_{\mathrm{s}}$/$M_{\mathrm{c}}$, delay $\tau^{\mathrm{sta}}_{i,n_{i}}$/$\tau^{\mathrm{dyn}}_{j,n_{j}}$, and angle $\alpha^{\mathrm{sta}}_{i,n_{i}}$/$\beta^{\mathrm{sta}}_{i,n_{i}}$/$\alpha^{\mathrm{dyn}}_{j,n_{j}}$/$\beta^{\mathrm{dyn}}_{j,n_{j}}$, thus facilitating environment-embedded vehicular channel modeling.

As time evolves and the physical environment changes, the LiDAR point clouds differ between time instants. Through the scatterer recognition from LiDAR point clouds and the capturing of the tight interplay between the physical environment and electromagnetic space, scatterers change with the LiDAR point clouds. As a result, environment-channel non-stationarity in the time domain is mimicked. Furthermore, LiDAR point clouds in the physical environment at adjacent time instants are similar. In this case, scatterers recognized from LiDAR point clouds are also similar at adjacent time instants, thus capturing environment-channel consistency in the time domain. To further model environment-channel non-stationarity and consistency in the frequency domain, a frequency-dependent factor $\left(\frac{f}{f_{c}}\right)^{\chi}$ is introduced to the time-varying transfer function (TVTF). The TVTF is obtained by applying the Fourier transform with respect to delay to the CIR, which is derived from the scatterer recognition from LiDAR point clouds with accurate number and position parameters. Therefore, through the scatterer recognition from LiDAR point clouds, environment-channel non-stationarity and consistency can be accurately captured, thus achieving high-precision environment-embedded vehicular channel modeling.
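As a rough illustration of the frequency-domain step, the sketch below evaluates the TVTF at a single time instant for a CIR made of weighted delta functions, whose Fourier transform over delay is a sum of complex exponentials. Applying $\left(\frac{f}{f_{c}}\right)^{\chi}$ as a common multiplier with one value of $\chi$ for all paths is an assumption made here for simplicity; the exact form is defined in [1]. The function name, path gains, delays, and the value of $\chi$ are hypothetical.

```python
import cmath

def tvtf(freq_hz, path_gains, path_delays_s, fc_hz=28e9, chi=1.0):
    """TVTF at one time instant for a CIR sum_k g_k * delta(tau - tau_k).

    The Fourier transform over delay gives sum_k g_k * exp(-j*2*pi*f*tau_k),
    which is then scaled by the frequency-dependent factor (f / fc) ** chi.
    """
    factor = (freq_hz / fc_hz) ** chi
    return factor * sum(
        g * cmath.exp(-2j * cmath.pi * freq_hz * tau)
        for g, tau in zip(path_gains, path_delays_s)
    )

# Two hypothetical paths: a direct path and a weaker, delayed one.
gains = [1.0, 0.5]
delays = [0.0, 50e-9]

# Sample the 2 GHz band around the 28 GHz carrier used in the dataset.
for f in (27e9, 28e9, 29e9):
    print(f, abs(tvtf(f, gains, delays)))
```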

IV Simulation Results and Analysis

Detailed equipment parameters, e.g., scanning frequency, FoV, carrier frequency, and bandwidth, for the acquisition of LiDAR point clouds and scatterers are given in Section II-A. For neural network training, the hyper-parameter setting is listed in Table I. The dataset is divided into the training set, validation set, and test set in the proportion of $3:1:1$. In Figs. 4–6, the recognition accuracy, the error probability heat map, and the recognized scatterer number are given to demonstrate high-precision scatterer recognition. The scatterer recognition accuracy of the proposed approach is further compared with that of the existing random generation approach in Fig. 7. To validate the accuracy of the proposed model, the simulation result and the RT-based result are compared in Fig. 8.

TABLE I: Hyper-parameter setting.

Parameter	Value
Batch size	16
Starting learning rate	$1\times 10^{-3}$
Learning rate scheduler	Every 4 epochs
Learning-rate decaying factor	0.9
Epochs	200
Optimizer	ADAM
Loss function	MSEloss
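One way to read the scheduler entries in Table I is as a step decay, in which the learning rate is multiplied by 0.9 once every 4 epochs. The sketch below computes the resulting rate per epoch under that assumption; the function name and the zero-based epoch index are illustrative choices.

```python
def learning_rate(epoch, base_lr=1e-3, decay=0.9, step=4):
    """Step-decay schedule: multiply base_lr by `decay` once every `step` epochs.

    `epoch` is zero-based, so epochs 0-3 use base_lr, epochs 4-7 use
    base_lr * decay, and so on.
    """
    return base_lr * decay ** (epoch // step)

# Learning rate at the start, after the first decay, and in the last of 200 epochs.
for epoch in (0, 4, 199):
    print(epoch, learning_rate(epoch))
```

Under this reading, the rate is decayed 49 times over 200 epochs, so the final value is $10^{-3}\times 0.9^{49}$.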

Fig. 4 shows the scatterer recognition accuracy in each clustering group of LiDAR point clouds with different VTDs and streets. The scatterer recognition accuracy is computed by $P=1-\frac{N_{\mathrm{error}}}{N_{\mathrm{all}}}$, where $N_{\mathrm{error}}$ is the sum of differences between the recognized scatterer number and the ground truth, and $N_{\mathrm{all}}$ is the sum of the ground truth. In Fig. 4, the scatterer recognition accuracy in the aforementioned nine conditions exceeds 90%, with an average value of 90.87%.
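The accuracy metric $P$ can be computed as in the following sketch. It assumes that the "sum of differences" means the sum of absolute differences over clustering groups, so that over- and under-estimates do not cancel; the function name and the example numbers are hypothetical.

```python
def recognition_accuracy(predicted, truth):
    """P = 1 - N_error / N_all over a set of clustering groups.

    N_error: sum of absolute differences between recognized and true
             scatterer numbers (absolute value is an assumption).
    N_all:   sum of the true scatterer numbers (must be non-zero).
    """
    n_error = sum(abs(p - t) for p, t in zip(predicted, truth))
    n_all = sum(truth)
    return 1.0 - n_error / n_all

# Hypothetical scatterer numbers for four clustering groups.
predicted = [3, 0, 5, 2]
truth = [3, 1, 5, 1]
accuracy = recognition_accuracy(predicted, truth)  # N_error = 2, N_all = 10
print(round(accuracy, 4))  # 0.8
```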

Fig. 5 illustrates the probability heat map of the scatterer recognition number error in the same nine conditions as Fig. 4. From Fig. 5, it can be seen that the cases where the recognized scatterer number differs from the ground truth by either 0 or 1 account for approximately 90% of the instances.

Fig. 6 compares the recognized scatterer number and the ground truth with different VTDs. Although there are many scatterers in the clustering groups, the scatterer recognition accuracy exceeds 90%. Because the high-VTD condition contains the most scatterers, its scatterer recognition accuracy is the lowest.

Fig. 7 compares the scatterer recognition accuracy of the proposed approach and the random generation approach in [1] with different VTDs. In the random generation approach, the scatterer number in each clustering group is randomly generated according to the number distribution derived in [1]. The binary classification accuracy is the recognition accuracy of whether there are scatterers on each clustering group, whereas the regression accuracy is the recognition accuracy of the scatterer number on each clustering group. The proposed approach achieves an accuracy improvement of over 29.13% compared to the random generation approach.
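The binary classification view in Fig. 7 can be illustrated by reducing each scatterer number to a presence flag and counting the clustering groups on which the flags agree. The text does not give the exact formula, so this definition, the function name, and the example numbers are assumptions.

```python
def presence_accuracy(predicted, truth):
    """Fraction of clustering groups where presence/absence of scatterers agrees.

    A group "has scatterers" when its scatterer number is greater than zero.
    """
    hits = sum((p > 0) == (t > 0) for p, t in zip(predicted, truth))
    return hits / len(truth)

# Hypothetical scatterer numbers for four clustering groups.
predicted_counts = [3, 0, 5, 2]
true_counts = [3, 1, 5, 0]
binary_acc = presence_accuracy(predicted_counts, true_counts)
print(binary_acc)  # 0.5: groups 1 and 3 agree, groups 2 and 4 do not
```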

Since the power delay profile (PDP) represents the power of the received multipath components, which can be described by scatterers, as a function of propagation delay, Fig. 8 compares the PDPs. The transceiver link is Car1 (Tx) and Car8 (Rx) at the crossing street with high VTD. Owing to the accurate scatterer recognition and the modeling of the tight interplay between the physical environment and electromagnetic space, the simulation result based on the proposed model fits well with the RT-based result, where the PDP varies smoothly over time. Therefore, environment-channel non-stationarity and consistency are modeled. In terms of modeling accuracy, the proposed model outperforms the model in [1] based on the random generation approach.

V Conclusions

This paper has proposed a novel environment-embedded vehicular channel model via SoM, where the SoM mechanism, i.e., the mapping relationship between the physical environment and electromagnetic space, has been explored based on a new dataset. By leveraging LiDAR point clouds for scatterer recognition, environment-channel non-stationarity and consistency have been modeled. Simulation results have demonstrated that the proposed approach achieves a scatterer recognition accuracy of over 90% and an improvement of over 29.13% compared to the random generation approach. By further capturing environment-channel non-stationarity and consistency, the accuracy of the proposed environment-embedded vehicular channel model has been validated.

References

[1] Z. Huang et al., “A LiDAR-aided channel model for vehicular intelligent sensing-communication integration,” available on arXiv, 2024. [Online]. Available: https://arxiv.longhoe.net/abs/2403.14185.
[2] X. Cheng et al., “Intelligent multi-modal sensing-communication integration: Synesthesia of Machines,” IEEE Commun. Surveys Tuts., vol. 26, no. 1, pp. 258–301, Firstquarter 2024.
[3] A. Gupta, J. Du, D. Chizhik, R. A. Valenzuela, and M. Sellathurai, “Machine learning-based urban canyon path loss prediction using 28 GHz Manhattan measurements,” IEEE Trans. Antennas Propag., vol. 70, no. 6, pp. 4096–4111, Jun. 2022.
[4] N. Bui et al., “A survey of anticipatory mobile networking: Context-based classification, prediction methodologies, and optimization techniques,” IEEE Commun. Surveys Tuts., vol. 19, no. 3, pp. 1790–1821, Jul.–Sep. 2017.
[5] L. Liu et al., “The COST 2100 MIMO channel model,” IEEE Wireless Commun., vol. 19, no. 6, pp. 92–99, Dec. 2012.
[6] C. Huang et al., “Geometry-cluster-based stochastic MIMO model for vehicle-to-vehicle communications in street canyon scenarios,” IEEE Trans. Wireless Commun., vol. 20, no. 2, pp. 755–770, Feb. 2021.
[7] X. Cai et al., “Hough-transform-based cluster identification and modeling for V2V channels based on measurements,” IEEE Trans. Veh. Technol., vol. 67, no. 5, pp. 3838–3852, May 2018.
[8] M. Yang et al., “A cluster-based three-dimensional channel model for vehicle-to-vehicle communications,” IEEE Trans. Veh. Technol., vol. 68, no. 6, pp. 5208–5220, Jun. 2019.
[9] Technical Specification Group Radio Access Network; Study on Channel Model for Frequencies From 0.5 to 100 GHz (Release 14), Version 14.2.0, document TR 38.901, 3GPP, Sophia Antipolis, France, Sep. 2017. [Online]. Available: http://www.3gpp.org/DynaReport/38901.htm
[10] L. Bai, Z. Huang, Y. Li, and X. Cheng, “A 3D cluster-based channel model for 5G and beyond vehicle-to-vehicle massive MIMO channels,” IEEE Trans. Veh. Technol., vol. 70, no. 9, pp. 8401–8414, Sep. 2021.
[11] H. Chang et al., “A general 3-D nonstationary GBSM for underground vehicular channels,” IEEE Trans. Antennas Propag., vol. 71, no. 2, pp. 1804–1819, Feb. 2023.
[12] S. Shah, D. Dey, C. Lovett, and A. Kapoor, “AirSim: High-fidelity visual and physical simulation for autonomous vehicles,” in Field and Service Robotics, M. Hutter and R. Siegwart, Eds. Cham, Switzerland: Springer, 2018, pp. 621–635.
[13] Remcom. Wireless InSite. [Online]. Available: https://www.remcom.com/wireless-insite-em-propagation-software [Publication date: Jan. 2017, Accessed date: Mar. 2022].
[14] X. Cheng et al., “M³SC: A generic dataset for mixed multi-modal (MMM) sensing and communication integration,” China Commun., vol. 20, no. 11, pp. 13–29, Nov. 2023.