RayProNet: A Neural Point Field Framework for Radio Propagation Modeling in 3D Environments

Ge Cao and Zhen Peng

G. Cao and Z. Peng are with the Center for Computational Electromagnetics, Department of Electrical and Computer Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801 USA (e-mail: [email protected]; [email protected]).
Abstract

The radio wave propagation channel is central to the performance of wireless communication systems. In this paper, we introduce a novel machine learning-empowered methodology for wireless channel modeling. The key ingredients include a point-cloud-based neural network and a Spherical Harmonics encoder with light probes. Our approach offers several significant advantages, including the flexibility to adjust antenna radiation patterns and transmitter/receiver locations, the capability to predict radio power maps, and the scalability to large-scale wireless scenes. As a result, it lays the groundwork for an end-to-end pipeline for network planning and deployment optimization. The proposed work is validated in various outdoor and indoor radio environments.

Figure 1: The schematic illustrates the input, output, and application of our proposed neural point field network framework for predicting wireless radio channel properties in large-scale environments.

I Introduction

Understanding and accurately modeling the characteristics of the propagation channel are essential for the design, deployment, and optimization of wireless communication networks [1, 2, 3]. Although Maxwell’s Equations govern the fundamental physics of wireless information transmission, obtaining full-wave solutions in large-scale environments is typically challenging and time-consuming [4, 5, 6, 7, 8, 9]. Ray tracing-based simulators are commonly employed for modeling wireless channel properties [10, 11, 12, 13, 14]. In the ray tracing process, electromagnetic (EM) rays are uniformly launched from the transmitter antenna, undergoing reflections, transmissions, and diffractions with various buildings and floors, ultimately reaching the receiver locations. These ray paths and interactions yield valuable wireless channel information such as channel gain, channel transfer function, and channel impulse response.

While ray tracing has been a popular tool in wireless channel modeling, its computational complexity escalates with the number of ray-object interactions. Moreover, in wireless deployment and planning scenarios, frequent modifications to transmitter/receiver locations are common. Typically, a new ray tracing simulation is required for each configuration change. This exposes a noticeable gap between the simulation time of ray-tracing simulators and the rapid time-to-solution demand of wireless network design and optimization. To address these needs, neural network-based forward surrogate models emerge as an attractive solution [15, 16, 17, 18]. Neural networks generally offer faster runtime compared to ray tracing algorithms, and their accuracy can be enhanced by refining the training dataset rather than increasing runtime.

The objective of this paper is to develop a neural network surrogate capable of predicting wireless channel properties across large-scale environments. The overview of the proposed framework is given in Fig. 1. In our methodology, we train the neural surrogate using ray-tracing solutions corresponding to a finite set of transmitting locations within a specific radio environment. Once trained, the neural surrogate leverages its understanding of EM propagation physics to predict EM wave propagation for new transmitter/receiver locations and different antenna radiation patterns. This research emphasizes two key features: (1) the neural surrogate’s functionality to predict the spatial distribution of radiated power (i.e., the radio coverage or path loss map), and (2) its effective generalization to large-scale scenes in both outdoor and indoor environments.

In the realm of neural surrogate development for radio wave propagation, the learning of scene representations is an aspect that has received limited attention in previous works. Many existing approaches primarily focus on 2D image tasks, typically from a bird’s-eye view, and lack the incorporation of geometry information as input [19, 20, 21]. Another recent study [22] focuses on explicitly learning the meshed geometries, thereby limiting its generalizability to large outdoor scenes. In contrast, our proposed work offers a fresh perspective on neural scene representation. The 3D propagation environment (wireless scene) is rendered using point clouds, a representation well-known for its adaptability and intuitive scalability to large-scale scenes [23].

Moreover, we introduce the Neural Point Field framework to implicitly embed wireless channel state information into light probes [24]. Each light probe encapsulates EM ray properties, which are interpolated using a Spherical Harmonics encoder and decoder [25]. This facilitates the extraction of propagation information from queries in different ray directions. Conceptually, these light probes are designed to capture the site-specific EM ray propagation physics. Receivers can seamlessly extract path tracing and ray propagation from these probes, streamlining the process and enhancing overall efficiency.

Compared to existing neural ray tracing methods in the literature, the proposed work excels in scalability and flexibility, accommodating diverse levels of geometry complexity while maintaining high-quality channel prediction. We validate our proposed pipeline across small indoor, medium outdoor, and large city scenes. The results demonstrate the efficacy of our approach in predicting wireless channel properties across various scales of scenes.

II Related Works

In this section, we discuss related works from both the machine learning (ML) and wireless communication communities. Given the resemblance between rendering and wireless channel modeling algorithms, we particularly emphasize studies in neural rendering and computer graphics within the deep learning field. Additionally, since our pipeline design necessitates an implicit representation of geometry, we also introduce relevant works on geometry in neural networks.

Neural Rendering: The ray tracing algorithm is widely used in the rendering process in 3D computer graphics. Leveraging this foundational understanding, our research explores valuable insights from advancements in neural rendering, enriching our approach to wireless channel modeling. Recently, advancements in 3D scene representation using neural networks have showcased their ability to render scenes quickly and flexibly. In these approaches, the radiance field is embedded within neural networks, such as Multi-Layer Perceptrons (MLPs), or at a higher level, within the volume space. This implies that the lighting information is typically fixed and cannot be modified. Despite the complexity of light sources in the rendering process [26], several works have achieved relighting techniques [27, 28, 25].

Since the publication of Kerbl et al.’s work on 3D Gaussian Splatting [29], this new neural rendering technique has garnered significant attention. A Gaussian kernel is applied and learned to represent scene geometries in the format of point clouds. Subsequently, several related works have emerged, including research on relighting [30] and the reconstruction of human avatars [31].

Before the development of 3D Gaussian Splatting, a strategy known as Neural Point Light Fields (NeuralPointLF) was introduced, demonstrating the potential of point cloud formats in the domain of neural rendering [23]. The distinction between NeuralPointLF and 3D Gaussian Splatting lies in the fact that NeuralPointLF does not necessitate a rasterization process in the pipeline. Since ray-tracing simulations in wireless channel modeling also do not require rasterization, our network draws inspiration from NeuralPointLF and incorporates attention techniques into the framework [32]. Furthermore, as NeuralPointLF lacks a relighting process, our pipeline incorporates relighting into its design. This addition addresses scenarios involving changing antenna locations or radiation patterns.

Geometry Representation: The representation format of 3D geometry is crucial for all ML tasks involving three-dimensional data. The most common method for representing geometry is through mesh triangles, consisting of a set of vertices ($\mathbb{V}$), edges ($\mathbb{E}$), and faces ($\mathbb{F}$). While meshed geometries are widely utilized in computational science and engineering, their utilization in deep learning is limited due to the non-differentiability of triangle face indices. Although some researchers have attempted to apply statistical methods to make mesh triangles differentiable [33, 34], these strategies are still computationally intensive for neural networks.

Point clouds have emerged as a preferred geometry representation format in neural network-based research. This representation is utilized across various tasks, including 3D surface reconstruction [35, 36], geometry denoising [37, 38, 39], and geometry completion [40, 41]. Leveraging the differentiability of point clouds, our work adopts the PointNet [42] architecture for geometry representation. While mesh triangles and point clouds are prevalent, other representation formats exist, such as the multi-view model [43, 44] and surface random walk [45].

Neural Radio Channel Modelling: Physically-based simulation guided by neural networks is gaining popularity across various scientific domains, including fluid dynamics [46, 47], soft body dynamics [48, 49], and electrodynamics [50, 51]. In the field of applied and computational electromagnetics, several approaches leveraging neural networks have been proposed [52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65]. Many of these neural surrogates aim to learn the scattering process involving obstacles in free space. Given that wireless channel properties are governed by the propagation and scattering of EM waves, our work shares objectives related to those of these approaches. The emphasis of this work is to expand the application domain to encompass more complex scenarios, specifically extending into 3D environments featuring intricate obstacles like buildings.

Until now, there has been limited attention given to the task of wireless channel modeling in complex 3D environments. A recent work addressing this task is WINERT [22]. In their approach, a complete ray tracing process is implemented, with a focus on learning the propagation properties (reflection, transmission, diffraction) of buildings. However, they did not implement the ray-triangle intersection process as differentiable, citing its non-differentiability. Furthermore, their pipeline is not suitable for handling large-scale and complicated scene geometries.

Several other works have also aimed to develop neural surrogates for predicting path loss map information. Nevertheless, most of these works focus on 2D tasks that do not explicitly require geometry representation. Instead, they rely on 2D bird’s-eye-view images (heatmaps) for training [19, 20, 21]. While this format simplifies the learning process and results in a faster pipeline, it may encounter difficulties in effectively capturing the complexities of 3D scenes in an end-to-end manner.

III Neural Point Field for Wireless Channel

The proposed work aims to investigate a neural point field network to simulate the ray tracing process between transmitters and receivers within complex wireless scenes. At its core, this method relies on three fundamental elements: leveraging point clouds for the representation of geometric structures, integrating light probes to capture path tracing and ray propagation information, and utilizing spherical harmonic functions for the extraction of field data. An overview of the pipeline is illustrated in Fig. 2, which we henceforth refer to as RayProNet. The detailed technical ingredients and underlying rationales are provided below.

Figure 2: RayProNet: a neural point field pipeline for wireless channel modeling. (The symbols A - E represent the subsections in Section III.)

III-A Data Preparation

The RayProNet pipeline relies on two primary inputs: the locations of receivers and transmitters, and the 3D geometry of the environment, which is first converted into point clouds as the default representation format. Point clouds offer an efficient means of encoding complex geometric features such as obstacles, buildings, and terrain by sampling points in space to capture the characteristics of interacting objects within the environment, with a particular emphasis on learning geometric features.

In addition, light probes are uniformly placed throughout the scene, capturing the propagation behavior of EM rays through space. Their integration into the pipeline allows the model to acquire essential insights into ray paths, reflections, and diffractions, thereby enhancing the accuracy and efficiency of the learning process. Light probes play a crucial role in encoding propagation information, particularly in environments characterized by sparse geometric structures, as elaborated in Section III-C. The data preparation stage proceeds as follows:

  • Transmitter Setup: Initially, we define the locations of transmitters and configure their antenna patterns. This process ensures an accurate representation of transmitter characteristics in the simulation.

  • Receiver Setup: Similarly, we specify the locations of receivers and configure their antenna patterns to accurately simulate receiver behavior in the wireless environment.

  • Identify $n$ Nearest Light Probes: For each receiver, we identify the $n$ nearest light probes and record ray directions, enabling the collection of electromagnetic field information from the surroundings (Fig. 3).

  • Identify $K$ Nearest Points: Next, we determine the $K$ nearest points for each light probe and record this as a $K$-closest direction attachment. This enables us to capture detailed geometric information about the scene (Fig. 4).

The parameters $n$ and $K$ serve as hyperparameters that offer flexibility for customization based on scene complexity and application-specific considerations, allowing for tailored adjustments to the pipeline. For example, applications requiring high-fidelity predictions or precise localization may benefit from larger values of $n$ and $K$.
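The two nearest-neighbor steps above can be sketched with a brute-force search. This is a minimal illustration, not the authors' implementation: the helper name `k_nearest`, the regular probe grid, the random scene points, and the values of `n` and `K` are all assumptions chosen for the example.

```python
import numpy as np

def k_nearest(queries, candidates, k):
    """For each query, return indices, unit directions, and distances of its k nearest candidates."""
    diff = candidates[None, :, :] - queries[:, None, :]   # (Q, C, 3) offsets query -> candidate
    dist = np.linalg.norm(diff, axis=-1)                   # (Q, C)
    idx = np.argsort(dist, axis=1)[:, :k]                  # (Q, k), nearest first
    rows = np.arange(queries.shape[0])[:, None]
    d = dist[rows, idx]                                    # (Q, k)
    dirs = diff[rows, idx] / np.maximum(d[..., None], 1e-12)  # (Q, k, 3) unit directions
    return idx, dirs, d

rng = np.random.default_rng(0)
points = rng.uniform(-1.0, 1.0, size=(2000, 3))            # hypothetical scene point cloud
g = np.linspace(-1.0, 1.0, 4)
probes = np.stack(np.meshgrid(g, g, g, indexing="ij"), axis=-1).reshape(-1, 3)  # 64 probes on a grid
receivers = rng.uniform(-1.0, 1.0, size=(10, 3))           # hypothetical receiver locations

n, K = 4, 16                                               # illustrative hyperparameter values
probe_idx, rx_dirs, rx_dist = k_nearest(receivers, probes, n)       # receiver -> n nearest probes
point_idx, probe_dirs, probe_dist = k_nearest(probes, points, K)    # probe -> K nearest points
```

For large scenes a spatial index (for example a k-d tree) would replace the dense distance matrix, which grows with the product of the two set sizes.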

III-B Point Cloud Feature Embedding

Given our primary focus on wireless channel modeling rather than rendering, employing a multi-view model presents challenges due to the absence of a specific look-at direction in our task. Therefore, we adopt the PointNet model [42] for its effectiveness and robustness in learning various features of point clouds. PointNet was originally proposed for 3D recognition tasks such as object classification, part segmentation, and semantic segmentation. Unlike traditional convolutional neural networks (CNNs) that operate on grid-like data and images, PointNet can directly process point clouds without requiring any intermediate representation like voxelization. In our implementation, we begin by normalizing all scene point clouds to the range $[-1, 1]$. We then utilize PointNet to generate a feature matrix $\boldsymbol{l}_{j,k} \in \mathbb{R}^{n_p \times 128}$, where $n_p$ is the total number of points in the point cloud. Afterwards, we split this matrix into $\boldsymbol{l}_{j,k1} \in \mathbb{R}^{n_p \times 64}$ and $\boldsymbol{l}_{j,k2} \in \mathbb{R}^{n_p \times 64}$ for use in subsequent interpolation steps.
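The normalization and feature split can be illustrated as follows. This is a sketch under stated assumptions: the text does not say whether the scaling is isotropic or per-axis (isotropic is used here to preserve the scene's aspect ratio), a shared random-weight MLP stands in for the trained PointNet encoder, and splitting by the first and last 64 columns is a guess at how the two halves are formed.

```python
import numpy as np

def normalize_scene(points):
    """Center the scene and scale it isotropically so every coordinate lies in [-1, 1]."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    center = 0.5 * (lo + hi)
    half_extent = 0.5 * (hi - lo).max()          # longest axis sets the scale
    return (points - center) / max(half_extent, 1e-12)

def per_point_features(points, rng, width=128):
    """Placeholder for PointNet: a shared two-layer MLP with random weights (assumption)."""
    w1 = rng.normal(size=(3, 64))
    w2 = rng.normal(size=(64, width))
    return np.maximum(points @ w1, 0.0) @ w2     # (n_p, width)

rng = np.random.default_rng(0)
raw = rng.uniform([0.0, 0.0, 0.0], [120.0, 80.0, 30.0], size=(1000, 3))  # hypothetical scene in meters
pts = normalize_scene(raw)

l_jk = per_point_features(pts, rng)              # (n_p, 128) feature matrix
l_jk1, l_jk2 = np.split(l_jk, 2, axis=1)         # two (n_p, 64) halves: l_{j,k1}, l_{j,k2}
```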

III-C Path Tracing with Light Probes and Point Clouds

Integrating light probes is an important contribution of our pipeline. It represents a strategic solution to address the unique challenges encountered in wireless ray propagation scenarios. In environments characterized by sparse geometric structures, such as open landscapes or urban settings with tall buildings, a straightforward implementation of neural ray tracing may encounter limitations. When rays emitted from antennas fail to intersect with nearby point clouds, one has to extrapolate their trajectories into unobstructed space.

To mitigate potential inaccuracies arising from the absence of precise ray directions, the proposed work draws inspiration from the concept of light probes in computer graphics. Light probes serve as essential tools for capturing and simulating realistic lighting effects within virtual environments. These probes act as virtual cameras that record light information from different directions, allowing for the creation of dynamic and immersive lighting scenarios.

In our pipeline, we place a set of light probes (far fewer than the number of points in the point cloud) throughout the scene. Each light probe serves as a virtual observation point, capturing and encoding surrounding ray propagation information. This encoded data allows nearby receivers to easily decode it using the ray direction and distance as queries. Essentially, individual light probes serve as a neural surrogate for baking the propagation information within their nearby space.

Figure 3: Identifying n-nearest light probes: Each receiver locates its $n$ nearest light probes and retrieves radiance information from them.
Figure 4: Identifying K-nearest points: Each light probe finds its $K$ closest points and encodes occlusion information.

Moreover, the introduction of light probes enables the extraction of EM field information from the embedded features of point clouds. Specifically, we select the $K$ closest points for each light probe. Analogous to the relighting task in neural rendering, our re-transmitting task requires considering the contributions from the transmitter locations. As a result, for each light probe, a total of $K+1$ points, comprising the $K$ point-cloud points and the transmitter, are chosen to provide information on transmitter signal propagation and occlusion by buildings. This selection establishes a physical correspondence, where there is a Line of Sight (LOS) contribution from the transmitter (akin to direct illumination in rendering) and $K$ contributions (resembling wave physics of reflection, diffraction, and scattering) from point clouds.

In our design, an attention technique is employed for this extraction process (Fig. 5), guided by the location information between light probes and transmitter (distance $d_t$, elevation $\theta_t$, and azimuth $\phi_t$), and the information between light probes and point clouds (distance $d_j$, elevation $\theta_j$, and azimuth $\phi_j$). Subsequently, we combine them with our previous embedded feature as $\boldsymbol{K}_{j,k} \in \mathbb{R}^{K \times 67}$ and $\boldsymbol{V}_{j,k} \in \mathbb{R}^{K \times 67}$.

$$\begin{cases} \boldsymbol{K}_{j,k} = \boldsymbol{l}_{j,k1} \oplus \{d_j, \theta_j, \phi_j\} \\ \boldsymbol{V}_{j,k} = \boldsymbol{l}_{j,k2} \oplus \{d_t, \theta_t, \phi_t\} \end{cases} \tag{1}$$
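A small sketch of the concatenation in (1) for a single light probe follows. The angle conventions are assumptions, since the text does not define them: elevation is measured from the horizontal plane and azimuth is `atan2(y, x)`. Repeating the transmitter geometry across all $K$ rows of the value matrix is likewise an assumption made so that both matrices have shape $K \times 67$.

```python
import numpy as np

def spherical(vec):
    """Convert offset vectors (..., 3) into (distance, elevation, azimuth) triples."""
    d = np.linalg.norm(vec, axis=-1)
    elev = np.arcsin(np.clip(vec[..., 2] / np.maximum(d, 1e-12), -1.0, 1.0))
    azim = np.arctan2(vec[..., 1], vec[..., 0])
    return np.stack([d, elev, azim], axis=-1)

def build_key_value(probe, neighbors, tx, feat1, feat2):
    """Form K_{j,k} and V_{j,k} of Eq. (1) for one light probe."""
    geo_pts = spherical(neighbors - probe)                          # (K, 3): d_j, theta_j, phi_j
    geo_tx = np.broadcast_to(spherical(tx - probe), geo_pts.shape)  # (K, 3): d_t, theta_t, phi_t
    keys = np.concatenate([feat1, geo_pts], axis=1)                 # (K, 64 + 3)
    values = np.concatenate([feat2, geo_tx], axis=1)                # (K, 64 + 3)
    return keys, values

rng = np.random.default_rng(0)
K = 16
probe = np.zeros(3)
neighbors = rng.uniform(-0.2, 0.2, size=(K, 3))   # hypothetical K nearest points
tx = np.array([0.5, -0.3, 0.8])                   # hypothetical transmitter location
feat1 = rng.normal(size=(K, 64))                  # rows of l_{j,k1} for these points
feat2 = rng.normal(size=(K, 64))                  # rows of l_{j,k2} for these points
keys, values = build_key_value(probe, neighbors, tx, feat1, feat2)
```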

By adding a query into the pipeline, which is instructed by the $K+1$ closest directions $\boldsymbol{d}_j$ from light probes to point clouds and transmitter, EM field profiles are effectively baked into our light probes. These closest directions are shot from light probes directly to point clouds, and positionally encoded by the encoder $\boldsymbol{F}_{\theta_Q}$ as in (2).

$$\boldsymbol{Q}_j = \boldsymbol{F}_{\theta_Q}(\boldsymbol{d}_j) \tag{2}$$

Subsequently, we apply multi-head attention to learn the power information of light probes: given the key-value pair $(\boldsymbol{K}_{j,k}, \boldsymbol{V}_{j,k})$, the task is to predict a weight corresponding to the query ray vector $\boldsymbol{Q}_j$. The output weight is then encoded into the point cloud feature vector $\boldsymbol{l}_{i,j} \in \mathbb{R}^{n \times 128}$.

$$\boldsymbol{l}_{i,j} = \boldsymbol{F}_{\theta_{atten}}(\boldsymbol{K}_{j,k}, \boldsymbol{V}_{j,k}, \boldsymbol{Q}_j) \tag{3}$$
Figure 5: Multi-head attention: In Section III-C, a multi-head attention module is employed to aggregate the point cloud feature vector $l_{j,k}$ along with the K-closest direction $j$. This process generates a light probe feature. The attention module described in Section III-D follows a similar structure.
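The query encoding of (2) and the attention step of (3) can be sketched for one light probe as follows. Several details here are assumptions rather than the paper's specification: the positional encoding is a NeRF-style sinusoidal encoding with four frequencies, the model width is 128 with four heads, all projection weights are random stand-ins for learned parameters, and the reduction of the $K+1$ output rows to the stored per-probe feature is omitted because the text does not describe it.

```python
import numpy as np

def positional_encoding(d, num_freqs=4):
    """Sinusoidal encoding [d, sin(2^k d), cos(2^k d)] for k = 0..num_freqs-1 (assumed form of F_{theta_Q})."""
    freqs = 2.0 ** np.arange(num_freqs)                          # (L,)
    ang = d[..., None, :] * freqs[:, None]                       # (..., L, 3)
    enc = np.concatenate([np.sin(ang), np.cos(ang)], axis=-1)    # (..., L, 6)
    return np.concatenate([d, enc.reshape(*d.shape[:-1], -1)], axis=-1)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(Q, K, V, params, num_heads=4):
    """Scaled dot-product attention with separate input projections for Q, K, and V."""
    Wq, Wk, Wv, Wo = params

    def split_heads(x, W):
        y = x @ W                                                # (N, d_model)
        return y.reshape(x.shape[0], num_heads, -1).transpose(1, 0, 2)  # (H, N, d_head)

    q, k, v = split_heads(Q, Wq), split_heads(K, Wk), split_heads(V, Wv)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])     # (H, Nq, Nk)
    weights = softmax(scores, axis=-1)                           # attention weights per head
    merged = (weights @ v).transpose(1, 0, 2).reshape(Q.shape[0], -1)  # (Nq, d_model)
    return merged @ Wo, weights

rng = np.random.default_rng(0)
K_pts, d_model = 16, 128
keys = rng.normal(size=(K_pts, 67))                  # K_{j,k}
values = rng.normal(size=(K_pts, 67))                # V_{j,k}
dirs = rng.normal(size=(K_pts + 1, 3))               # K point directions plus one transmitter direction
dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)

Q = positional_encoding(dirs)                        # (K+1, 27)
params = (0.1 * rng.normal(size=(Q.shape[1], d_model)),
          0.1 * rng.normal(size=(67, d_model)),
          0.1 * rng.normal(size=(67, d_model)),
          0.1 * rng.normal(size=(d_model, d_model)))
probe_feature, attn = multi_head_attention(Q, keys, values, params)
```

The same structure would apply to the receiver stage in Section III-D, with receiver-to-probe directions as queries.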

III-D Receivers: Unveiling Ray Physics from Light Probes

Once the EM propagation information has been baked into light probes, the next step is to determine an interpolation format for this data. Similar to the previous section, we employ another attention technique when receivers extract EM power from light probes. In this process, we select the $n$ closest light probes and generate a ray for each of them. The direction of these rays serves as the query for our attention block. We then design two instructions (key and value) from the combination of three variables: distance $d_i$, elevation $\theta_i$, and azimuth $\phi_i$. These are taken between point clouds and receivers (key) and between transmitters and receivers (value).

$$\begin{cases} \boldsymbol{K}_{i,j} = \boldsymbol{l}_{i,j1} \oplus \{d_i, \theta_i, \phi_i\} \\ \boldsymbol{V}_{i,j} = \boldsymbol{l}_{i,j2} \oplus \{d_t, \theta_t, \phi_t\} \end{cases} \tag{4}$$

Upon receiving a key-value pair, we encode the $n$ ray directions $\boldsymbol{d}_i$, which are shot from receivers to light probes. Following positional encoding, a query vector $\boldsymbol{Q}_i$ is generated, which is then applied to another multi-head attention block for feature extraction. The resulting output is a ray feature vector $\boldsymbol{l}_i \in \mathbb{R}^{n+1}$, where $n$ represents the number of rays. Notably, the inclusion of LOS necessitates the addition of another receiver-transmitter ray into our pipeline, thereby augmenting the final feature count to $n+1$ instead of $n$.

$$\boldsymbol{Q}_i = \boldsymbol{F}_{\theta_Q}(\boldsymbol{d}_i) \tag{5}$$
$$\boldsymbol{l}_i = \boldsymbol{F}_{\theta_{atten}}(\boldsymbol{K}_{i,j}, \boldsymbol{V}_{i,j}, \boldsymbol{Q}_i) \tag{6}$$

III-E Spherical Harmonics-based Decoding of Ray Features

After decoding the ray features from our attention neural blocks, we represent this information as a set of spherical harmonics coefficients. Spherical harmonics are special functions defined on the surface of a sphere, widely utilized in various fields such as atomic and molecular physics, quantum mechanics, and computer graphics. These functions constitute an orthogonal and complete set of basis functions, particularly renowned for their utility in encoding or decoding directional information. A visualization of third-order Spherical Harmonics is shown in Fig. 6; a higher order allows higher-frequency directional information to be restored.

In the previous subsection, a ray feature $\boldsymbol{l}_i \in \mathbb{R}^{n+1}$ is provided. Here, we employ an 8-layer multi-layer perceptron (MLP) with 256 channels. The output is a spherical harmonics interpolation coefficient $\boldsymbol{c}_i \in \mathbb{R}^{(n+1) \times n_c}$, where $n_c$ is the number of output channels, typically set as the interpolation degree.

$$\boldsymbol{c}_i = \boldsymbol{F}_{\theta_{MLP}}(\boldsymbol{l}_i) \tag{7}$$

Subsequently, we divide the output coefficient $\boldsymbol{c}_i \in \mathbb{R}^{(n+1) \times n_c}$ into $\boldsymbol{c}_{i1} \in \mathbb{R}^{n \times n_c}$ and $\boldsymbol{c}_{i2} \in \mathbb{R}^{n_c}$. Finally, a Spherical Harmonics decoder is applied.

$$\boldsymbol{o}_i = \boldsymbol{c}_{i1}^{T} SH(\boldsymbol{d}_i) + \boldsymbol{c}_{i2} \tag{8}$$
Figure 6: Spherical Harmonics: This figure visualizes $3^{\textrm{rd}}$-order Spherical Harmonics, whose solution is a multiple of the associated Legendre polynomial $P_{l}^{|m|}$ with input of azimuth $\varphi$ and elevation $\theta$. In our methodology, Spherical Harmonics, which take the ray direction as input, are employed for radiance encoding.
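As an illustration of Eqs. (7) and (8), the following NumPy sketch splits a coefficient array into its directional part $\boldsymbol{c}_{i1}$ and its bias $\boldsymbol{c}_{i2}$ and then applies the decoder. It is a minimal sketch under stated assumptions, not the authors' implementation: the helper names `real_sh_basis` and `sh_decode` are hypothetical, the MLP output is replaced by a random array, and the basis is assumed to be the nine real spherical harmonics up to degree 2 (so $n=9$ in this example), since the text does not specify how the $n$ basis functions are chosen.

```python
import numpy as np

def real_sh_basis(d):
    """Nine real spherical harmonics (degree <= 2) at direction d.

    Assumed basis for illustration; sign conventions differ between libraries.
    """
    x, y, z = np.asarray(d, dtype=float) / np.linalg.norm(d)
    return np.array([
        0.28209479177387814,                        # l = 0
        0.4886025119029199 * y,                     # l = 1
        0.4886025119029199 * z,
        0.4886025119029199 * x,
        1.0925484305920792 * x * y,                 # l = 2
        1.0925484305920792 * y * z,
        0.31539156525252005 * (3.0 * z * z - 1.0),
        1.0925484305920792 * x * z,
        0.5462742152960396 * (x * x - y * y),
    ])

def sh_decode(c, d):
    """Eq. (8): o = c1^T SH(d) + c2, with c of shape (n + 1, n_c)."""
    c1, c2 = c[:-1], c[-1]          # c1: (n, n_c), c2: (n_c,)
    return c1.T @ real_sh_basis(d) + c2

rng = np.random.default_rng(0)
n, n_c = 9, 4                        # assumed sizes for this example
c_i = rng.standard_normal((n + 1, n_c))   # stand-in for the MLP output of Eq. (7)
d_i = np.array([0.0, 0.0, 1.0])           # ray direction
o_i = sh_decode(c_i, d_i)
print(o_i.shape)
```

Keeping the last row of $\boldsymbol{c}_{i}$ as a direction-independent bias means the decoder can represent a constant offset without spending a basis function on it.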

IV Numerical Experiments

In this section, we quantitatively validate our proposed method for predicting radio propagation characteristics across various wireless environments. The section is divided into three parts. Part A details the experimental setup, data collection, and the training process employed. Part B focuses on the validation and verification procedures. Finally, Part C demonstrates and evaluates the proposed methodology in large-scale, 3D wireless scenarios.

IV-A Experimental Setup and Training

Our evaluation aims to assess the effectiveness of the proposed model in accurately predicting signal strength and coverage within wireless environments. We focus on two key outcomes: path loss maps and received signal strength at designated locations. Path loss maps depict the attenuation of electromagnetic signals as they propagate through the wireless scenes, offering valuable insights into coverage areas. Meanwhile, the received signal strength at specific locations provides crucial information for tasks such as localization and connectivity assessment. Both outcomes are essential for network planning and optimization, informing decisions regarding antenna placement and transmission power levels to optimize network performance and reliability.

IV-A1 Data Collection

Our datasets are generated using an open-source ray-tracing simulator, Sionna [66]. We generate datasets at several scene scales: wiindoor (a small indoor room scene), etoile (a medium city block scene), and Munich (a large urban city scene).

As described in Table I, the dataset comprises $175\sim 375$ transmitter locations and $1{,}920\sim 35{,}816$ uniformly sampled receiver locations for each scene. About $85\%$ of them are used for training, with the remaining $15\%$ reserved for validation. The operating frequency is $2.14\,\mathrm{GHz}$. After training, the model serves as a neural surrogate for wireless channel prediction.

TABLE I: Data collection: We validate our methodology across three different scene scales. This table provides the configuration details of our dataset.

| Training Dataset | wiindoor | etoile center | etoile | munich |
|---|---|---|---|---|
| Scale | indoor room | isolated building | city blocks | urban city |
| Covered area | $10\times 10\;\mathrm{m}^{2}$ | $150\times 160\;\mathrm{m}^{2}$ | $853\times 676\;\mathrm{m}^{2}$ | $1475\times 1205\;\mathrm{m}^{2}$ |
| Transmitters | $175$ | $375$ | $175$ | $175$ |
| Receivers | $100\times 100\times 2$ | $30\times 32\times 2$ | $86\times 68\times 2$ | $148\times 121\times 2$ |
| Antenna patterns | $4$ | $4$ | $4$ | $4$ |
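The receiver counts quoted in the text follow directly from the sampling grids in Table I. The short sketch below multiplies out each grid; the dictionary layout and variable names are illustrative choices, while the grid sizes are copied from the table.

```python
# Receiver sampling grids (nx, ny, nz) per scene, copied from Table I.
receiver_grids = {
    "wiindoor": (100, 100, 2),
    "etoile center": (30, 32, 2),
    "etoile": (86, 68, 2),
    "munich": (148, 121, 2),
}

# Total number of receiver locations per scene.
receiver_counts = {
    scene: nx * ny * nz for scene, (nx, ny, nz) in receiver_grids.items()
}

for scene, count in receiver_counts.items():
    print(f"{scene}: {count} receivers")

print("range:", min(receiver_counts.values()), "to", max(receiver_counts.values()))
```

The smallest and largest grids reproduce the stated range of 1,920 to 35,816 receiver locations.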

IV-A2 Training setups

In our experiments, $n$ rays are initially launched from receivers to find the $n$ nearest light probes. Each light probe is attached to $K$ different point clouds. The hyperparameters $n$ and $K$ are both set to $8$.

Our model is trained with a batch size of $1000$ and a learning rate of $0.0001$. We train the model for $500$ epochs, which typically takes between $1.5$ and $10$ hours on an NVIDIA GeForce RTX 3090 Ti GPU. We utilize the Adam optimizer [67] and the mean square error (MSE) loss function for received power optimization.
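To make the role of $n$ concrete, the sketch below selects the $n=8$ light probes closest to a receiver. Note the substitution: the pipeline described here locates probes by launching rays, whereas this example uses a plain Euclidean nearest-neighbour search as a stand-in. The function name `nearest_probes`, the regular probe grid, and the receiver position are all hypothetical.

```python
import numpy as np

def nearest_probes(receiver, probes, n=8):
    """Indices of the n probes closest to the receiver, nearest first.

    Euclidean stand-in for the ray-based probe lookup (an assumption).
    """
    dist = np.linalg.norm(probes - receiver, axis=1)
    idx = np.argpartition(dist, n - 1)[:n]   # n smallest distances, unordered
    return idx[np.argsort(dist[idx])]        # order them by distance

# Hypothetical regular grid of light probes: 10 x 10 x 2 positions.
grid = np.meshgrid(np.arange(10.0), np.arange(10.0), np.arange(2.0), indexing="ij")
probes = np.stack(grid, axis=-1).reshape(-1, 3)

receiver = np.array([4.2, 5.7, 0.3])         # hypothetical receiver location
idx = nearest_probes(receiver, probes, n=8)
print(probes[idx])
```

Because the probes tile the whole volume, a lookup of this kind always returns $n$ valid neighbours, regardless of where the receiver sits relative to the buildings.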
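The optimizer settings above can be illustrated with a hand-written Adam update in NumPy. This is a sketch of the training loop structure only: the linear model and the synthetic data are placeholders for the actual network and the ray-traced dataset, the epoch count is reduced for brevity, and the Adam constants other than the learning rate are the commonly used defaults rather than values reported here.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias-corrected moment estimates."""
    m = beta1 * m + (1.0 - beta1) * grad
    v = beta2 * v + (1.0 - beta2) * grad ** 2
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return param - lr * m_hat / (np.sqrt(v_hat) + eps), m, v

def full_mse(w, x, y):
    """MSE of the placeholder linear model over the whole dataset."""
    return float(np.mean((x @ w - y) ** 2))

rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 3))          # placeholder input features
y = x @ np.array([0.5, -1.0, 2.0])          # placeholder power targets

w = np.zeros(3)                             # placeholder model parameters
m, v, t = np.zeros(3), np.zeros(3), 0
batch_size = 1000                           # batch size from the text

for epoch in range(20):                     # the text reports 500 epochs
    perm = rng.permutation(len(x))
    for start in range(0, len(x), batch_size):
        batch = perm[start:start + batch_size]
        err = x[batch] @ w - y[batch]
        grad = 2.0 * x[batch].T @ err / len(batch)   # gradient of the batch MSE
        t += 1
        w, m, v = adam_step(w, grad, m, v, t)

print(full_mse(np.zeros(3), x, y), "->", full_mse(w, x, y))
```

With a learning rate of $0.0001$ each Adam step moves a parameter by roughly that amount, which is why a long schedule of several hundred epochs is plausible for this setting.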

IV-A3 Evaluation Metric

The evaluation metric serves as a quantitative measure to assess the performance of the proposed method in predicting radio path loss maps. It quantifies the accuracy of the predictions by comparing them to ground truth (GT) data or measurements. The specific evaluation metrics used in our numerical experiments are the Mean Square Error (MSE) and the Peak Signal-to-Noise Ratio (PSNR).

$$\begin{cases}\mathrm{MSE}=\sum_{i=0}^{N}(o_{i}-o_{gt})^{2}\\ \mathrm{PSNR}=20\log_{10}\left(\max(o_{i})/\sqrt{\mathrm{MSE}}\right)\end{cases} \tag{9}$$
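A possible implementation of the two metrics is sketched below. Eq. (9) writes the MSE as a plain sum; the sketch assumes the conventional $1/N$ normalization implied by the word "mean", which is an assumption on our part. The function names and the two-sample example are illustrative only.

```python
import numpy as np

def mse(pred, gt):
    """Mean square error between prediction and ground truth (1/N assumed)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    return float(np.mean((pred - gt) ** 2))

def psnr(pred, gt):
    """PSNR in dB as in Eq. (9): 20 log10(max(pred) / sqrt(MSE))."""
    peak = float(np.max(np.asarray(pred, dtype=float)))
    return float(20.0 * np.log10(peak / np.sqrt(mse(pred, gt))))

pred = np.array([1.0, 0.5])   # illustrative normalized predictions
gt = np.array([1.0, 0.4])     # illustrative ground truth
print(mse(pred, gt), psnr(pred, gt))
```

Since the PSNR depends on the peak predicted value, the two metrics are only directly comparable across models when the power values share the same normalization.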

IV-B Validation and Verification

IV-B1 Assessment in Learning EM Propagation Physics

In this subsection, we evaluate our framework’s ability to model EM propagation physics. Our objective is to determine how effectively the neural network learns and understands key EM principles such as reflection, transmission, and diffraction, particularly in the context of various building structures. To assess this, we train the neural network using scenes with a single isolated building at the center. This setup allows us to isolate and analyze the model’s performance in understanding EM propagation in the presence of architectural elements. We provide two separate datasets for training: one including diffraction effects and the other without. The validation results, shown in Fig. LABEL:fig:_Evaluation_EM_physics_understanding, demonstrate the model’s proficiency in accurately capturing essential features of EM propagation physics. Moreover, enhancements in prediction accuracy can be achieved through the refinement of the training dataset.

IV-B2 Comparison to Other Neural Surrogates

While our primary focus is on 3D end-to-end channel power prediction, the scarcity of open-source 3D-based neural surrogates led us to evaluate our model against a 2D-based neural surrogate PMNet [20] and a standard Multi-Layer Perceptron (MLP) model. The MLP network is designed with four hidden layers of sizes 64, 64, 32, and 64, using leaky ReLU as the activation function.
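For reference, a forward pass through an MLP with the stated hidden sizes (64, 64, 32, 64) and leaky ReLU activations can be sketched as follows. The input dimension of 6 (transmitter and receiver coordinates), the scalar output, the negative slope of 0.01, and the initialization are assumptions made for illustration; they are not specified in the text.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    """Leaky ReLU; the slope value is an assumed default."""
    return np.where(x > 0, x, slope * x)

def init_mlp(sizes, rng):
    """Random weights and zero biases for consecutive layer sizes."""
    return [
        (rng.standard_normal((fan_in, fan_out)) * np.sqrt(2.0 / fan_in),
         np.zeros(fan_out))
        for fan_in, fan_out in zip(sizes[:-1], sizes[1:])
    ]

def mlp_forward(params, x):
    """Leaky ReLU on every hidden layer, linear output layer."""
    for w, b in params[:-1]:
        x = leaky_relu(x @ w + b)
    w, b = params[-1]
    return x @ w + b

rng = np.random.default_rng(0)
sizes = [6, 64, 64, 32, 64, 1]     # input and output sizes are assumptions
params = init_mlp(sizes, rng)

batch = rng.standard_normal((1000, 6))   # placeholder batch of Tx/Rx positions
out = mlp_forward(params, batch)
print(out.shape)
```

A baseline of this form sees only coordinates and has no access to scene geometry, which is one plausible reason for its gap to a geometry-aware model.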

Figure LABEL:fig:_Comparison_with_others and Table II present the visualization and quantitative comparison. The results indicate that our prediction closely matches the ground truth (ray-tracing simulator) and outperforms the other neural surrogates. Notably, even when evaluating only the 2D path loss map (at a fixed height), our 3D RayProNet shows a significant advantage over the 2D pipeline (PMNet). In our numerical experiments, the accuracy of PMNet is substantially lower than in its original 2D validation (an MSE of approximately $10^{-2}$ in our isolated building environment versus approximately $10^{-4}$ in their USC campus setting). The main reason for this difference is the size of the training dataset: this experiment uses only 100 transmitters, whereas PMNet validates its pipeline with 19,016 configurations on the USC campus. Typically, a larger dataset leads to better performance.

TABLE II: Comparison with other neural surrogates: MSE loss and PSNR score comparison of power between ray-tracing results (ground truth) and various neural predictions (Ours, MLP, PMNet) in our isolated building environment. The upward arrow indicates better performance with larger values, while the downward arrow denotes better performance with smaller values. The best scores and lowest errors are highlighted in bold font.

| - | Ours | MLP | PMNet |
|---|---|---|---|
| MSE $\downarrow$ | $\boldsymbol{3\times 10^{-4}}$ | $4\times 10^{-3}$ | $0.039$ |
| PSNR $\uparrow$ | $\boldsymbol{35.24}$ | $23.90$ | $14.27$ |

IV-B3 Verification through Ablation Experiment

One of the key ingredients in RayProNet is the introduction of light probes. We therefore evaluate the impact of this module through an ablation experiment. If the light probe module is removed, receivers directly shoot rays to find the $K$ closest point clouds, rather than the $n$ closest light probes. For this ablation experiment, we use a ray sampling strategy similar to that of NPLF [23]. Both models are trained for 12 hours.

The results of our ablation experiment are shown in Figure LABEL:fig:_Ablation_results and Table III. These results align with our expectation that in both large outdoor scenes and small indoor scenes, it is common for a ray beam to be shot from an antenna but not reach any buildings (point clouds in our pipeline) nearby, causing the ray to be wasted as a default latent feature. The data in Figure LABEL:fig:_Ablation_results and Table III support our analysis that power prediction is significantly limited when light probes are removed. Since light probes cover all areas in space, each antenna can always find a nearby light probe and extract propagation features from it. Hence, our proposed approach consistently aligns well with ray-tracing ground truth, even in large outdoor scenes.

TABLE III: Ablation experiment: This table presents the MSE loss and PSNR score of power for both our dataset (etoile) and WINERT’s dataset (wiindoor).

| Dataset | Metric | Ours | Ablation |
|---|---|---|---|
| etoile | MSE $\downarrow$ | $\boldsymbol{0.0011}$ | $0.005$ |
| etoile | PSNR $\uparrow$ | $\boldsymbol{29.38}$ | $23.07$ |
| wiindoor | MSE $\downarrow$ | $\boldsymbol{0.0017}$ | $0.022$ |
| wiindoor | PSNR $\uparrow$ | $\boldsymbol{27.69}$ | $16.60$ |

IV-C Evaluation in Large-scale, 3D Wireless Scenes

IV-C1 Large-scale Environment

This subsection aims to validate the scalability of the proposed RayProNet in predicting EM propagation across different scene scales, ranging from small indoor rooms to expansive urban cities. By evaluating our model on three distinct scene scales, as depicted in Fig. LABEL:fig:_various_scales, we demonstrate its versatility and robustness. The results show a high degree of consistency with ray-tracing simulation results, confirming the accuracy of our model in diverse settings.

In particular, the experiment involving a small-scale indoor room showcases our model’s capability to accurately capture complex ray trajectories. Despite the inherent complexity of the ray paths, the model effectively recognizes the intricate propagation patterns. This validation underscores our methodology’s ability to handle a wide range of scenarios, making it suitable for applications in both indoor and outdoor wireless environments.

We also provide a time performance evaluation comparing our model to traditional ray-tracing (Table IV). The validation dataset consists of $25$ transmitters, $4$ antenna patterns, and $148\times 121$ receivers ($100\times 100$ in a small-scale indoor room scene). This setup results in 100 different configurations. The results show our methodology is at least 80 times faster than traditional ray-tracing, with an average time consumption of at most $3.2$ seconds per configuration.

TABLE IV: Runtime comparison between our model and ray-tracing: In this table, we present a comparison of runtime performance between our model and ray-tracing with a validation set consisting of $25$ transmitters, $4$ antenna patterns, and $148\times 121$ receivers ($100\times 100$ in small-scale indoor room scene).

| Dataset | urban city | indoor room |
|---|---|---|
| Runtime (ours) | $\boldsymbol{32.86\,\mathrm{s}}$ | $\boldsymbol{13.17\,\mathrm{s}}$ |
| Runtime (ray-tracing) | $2642\,\mathrm{s}$ | $2771\,\mathrm{s}$ |
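The speedup implied by Table IV can be checked with a few lines of arithmetic. The runtimes below are copied from the table; the variable names are illustrative.

```python
# Runtimes in seconds, copied from Table IV.
runtime_ours = {"urban city": 32.86, "indoor room": 13.17}
runtime_ray_tracing = {"urban city": 2642.0, "indoor room": 2771.0}

# Speedup of the neural surrogate over ray tracing for each scene.
speedup = {
    scene: runtime_ray_tracing[scene] / runtime_ours[scene]
    for scene in runtime_ours
}

for scene, factor in speedup.items():
    print(f"{scene}: {factor:.1f}x faster")
```

The ratios come out at roughly 80 for the urban city scene and roughly 210 for the indoor room scene, consistent with the "at least 80 times faster" statement.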

IV-C2 Antenna Radiation Pattern

Furthermore, our RayProNet is capable of accommodating various types of trained antenna radiation patterns as input. This versatility allows the model to adapt to different antenna configurations, enhancing its applicability in wireless planning scenarios. The evaluation of these different antenna radiation patterns, as shown in Fig. LABEL:fig:_Evaluation_Antenna_radiation_pattern, reveals a substantial agreement between our predictions and the ray tracing results. Such capability is crucial for applications requiring detailed antenna placement, highlighting the practical utility and versatility of our proposed methodology in diverse wireless environments.

IV-C3 Quantitative Measurements

So far, our results are primarily displayed in the format of 2D coverage maps for visualization. However, it is important to emphasize that our approach is essentially an end-to-end pipeline capable of predicting received signal strength at designated locations. To rigorously evaluate our model’s performance, we selected five distinct receiver locations on the map: (-167.5 m, 22.5 m), (-162.5 m, 52.5 m), (-157.5 m, 62.5 m), (-147.5 m, 72.5 m), and (-137.5 m, 97.5 m).

For each of these horizontal locations, we assessed the model’s predictions at three different heights: 7.5 m, 10.5 m, and 13.5 m, resulting in a total of 15 evaluation points. This comprehensive selection allows us to test the model’s accuracy and reliability across various spatial configurations. The precise locations of these points, along with the corresponding results, are illustrated in Figure LABEL:fig:_quantitative_measurements. This detailed analysis demonstrates our model’s robustness and flexibility in accurately predicting power propagation in 3D environments.
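The 15 evaluation points are simply the Cartesian product of the five horizontal locations and the three heights, as the sketch below enumerates. The coordinates are those listed above; the variable names are illustrative.

```python
from itertools import product

# Horizontal receiver locations (x, y) in metres, as listed in the text.
horizontal_locations = [
    (-167.5, 22.5),
    (-162.5, 52.5),
    (-157.5, 62.5),
    (-147.5, 72.5),
    (-137.5, 97.5),
]
heights = [7.5, 10.5, 13.5]   # receiver heights in metres

# Every (x, y, z) combination of horizontal location and height.
evaluation_points = [
    (x, y, z) for (x, y), z in product(horizontal_locations, heights)
]

print(len(evaluation_points))
for point in evaluation_points[:3]:
    print(point)
```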

V Conclusion

To the best of our knowledge, this work represents the first effort in 3D neural wireless channel modeling capable of handling large-scale input scenes. Most prior works have focused on 2D image tasks that do not require an explicit geometry representation. A recent work, Winert [22], was primarily designed for small indoor scenes, as its pipeline necessitates mapping the intersection between a ray and a specific mesh triangle into a one-hot vector, an approach that is impractical for large scenes due to its excessive memory requirements.

Our proposed method offers a significant advancement in rapid wireless channel modeling for extensive 3D scenes, achieving speeds $80\sim 200$ times faster than GPU-accelerated ray tracing methods. This efficiency is particularly beneficial in scenarios where transmitter and receiver locations frequently change, such as in wireless deployment and planning.

Our framework does have a notable limitation: geometry and occlusion information are embedded within the neural networks. Consequently, any changes to the scene geometry necessitate re-training the pipeline. Future research will be focused on developing a more flexible framework capable of adapting to geometry changes without the need for re-training, enhancing its applicability and efficiency.

References

  • [1] H. L. Bertoni, Radio Propagation for Modern Wireless Systems.   Pearson Education, 2009. [Online]. Available: https://books.google.com/books?id=YF-s90or91sC
  • [2] B. Mondal, T. A. Thomas, E. Visotsky, F. W. Vook, A. Ghosh, Y.-H. Nam, Y. Li, J. Zhang, M. Zhang, Q. Luo, Y. Kakishima, and K. Kitao, “3D channel model in 3GPP,” IEEE Communications Magazine, vol. 53, pp. 16–23, 2015. [Online]. Available: https://api.semanticscholar.org/CorpusID:6888846
  • [3] T. K. Sarkar, M. S. Palma, and M. N. Abdallah, The Physics and Mathematics of Electromagnetic Wave Propagation in Cellular Wireless Communication.   John Wiley & Sons, 2018.
  • [4] C. Brennan, P. J. Cullen, and L. Rossi, “An MFIE-based tabulated interaction method for UHF terrain propagation problems,” IEEE Transactions on Antennas and Propagation, vol. 48, no. 6, pp. 1003–1005, Jun 2000.
  • [5] P. Xu and L. Tsang, “Propagation over terrain and urban environment using the multilevel UV method and a hybrid UV/SDFMM method,” IEEE Antennas and Wireless Propagation Letters, vol. 3, pp. 336–339, 2004.
  • [6] C. A. Tunc, A. Altintas, and V. B. Erturk, “Examination of existent propagation models over large inhomogeneous terrain profiles using fast integral equation solution,” IEEE Transactions on Antennas and Propagation, vol. 53, no. 9, pp. 3080–3083, Sept 2005.
  • [7] F. Akleman and L. Sevgi, “A novel MoM- and SSPE-based groundwave-propagation field-strength prediction simulator,” IEEE Antennas and Propagation Magazine, vol. 49, no. 5, pp. 69–82, Oct 2007.
  • [8] A. Alighanbari and C. D. Sarris, “Rigorous and efficient time-domain modeling of electromagnetic wave propagation and fading statistics in indoor wireless channels,” IEEE Transactions on Antennas and Propagation, vol. 55, no. 8, pp. 2373–2381, Aug 2007.
  • [9] B. MacKie-Mason, Y. Shao, A. Greenwood, and Z. Peng, “Supercomputing-enabled first-principles analysis of radio wave propagation in urban environments,” IEEE Transactions on Antennas and Propagation, vol. 66, no. 12, pp. 6606–6617, 2018.
  • [10] F. Aguado Agelet, A. Formella, J. Hernando Rabanos, F. Isasi de Vicente, and F. Perez Fontan, “Efficient ray-tracing acceleration techniques for radio propagation modeling,” IEEE Transactions on Vehicular Technology, vol. 49, no. 6, pp. 2089–2104, 2000.
  • [11] Z. Ji, B.-H. Li, H.-X. Wang, H.-Y. Chen, and T. K. Sarkar, “Efficient ray-tracing methods for propagation prediction for indoor wireless communications,” IEEE Antennas and Propagation Magazine, vol. 43, no. 2, pp. 41–49, April 2001.
  • [12] F. S. de Adana, O. G. Blanco, I. G. Diego, J. P. Arriaga, and M. F. Catedra, “Propagation model based on ray tracing for the design of personal communication systems in indoor environments,” IEEE Transactions on Vehicular Technology, vol. 49, no. 6, pp. 2105–2112, Nov 2000.
  • [13] Z. Yun and M. F. Iskander, “Ray tracing for radio propagation modeling: Principles and applications,” IEEE Access, vol. 3, pp. 1089–1100, 2015.
  • [14] D. He, B. Ai, K. Guan, L. Wang, Z. Zhong, and T. Kürner, “The design and applications of high-performance ray-tracing simulation platform for 5G and beyond wireless communications: A tutorial,” IEEE Communications Surveys & Tutorials, vol. 21, no. 1, pp. 10–27, 2019.
  • [15] L. Azpilicueta, M. Rawat, K. Rawat, F. M. Ghannouchi, and F. Falcone, “A ray launching-neural network approach for radio wave propagation analysis in complex indoor environments,” IEEE Transactions on Antennas and Propagation, vol. 62, no. 5, pp. 2777–2786, 2014.
  • [16] T. Imai, K. Kitao, and M. Inomata, “Radio propagation prediction model using convolutional neural networks by deep learning,” in 2019 13th European Conference on Antennas and Propagation (EuCAP), 2019, pp. 1–5.
  • [17] A. Seretis and C. D. Sarris, “An overview of machine learning techniques for radiowave propagation modeling,” IEEE Transactions on Antennas and Propagation, vol. 70, no. 6, pp. 3970–3985, 2022.
  • [18] A. Seretis, C. Xu, and C. Sarris, “Fast selection of indoor wireless transmitter locations with generalizable neural network propagation models,” Oct. 2023. [Online]. Available: http://dx.doi.org/10.36227/techrxiv.24425536.v1
  • [19] T. M. Hehn, T. Orekondy, O. Shental, A. Behboodi, J. Bucheli, A. Doshi, J. Namgoong, T. Yoo, A. Sampath, and J. B. Soriaga, “Transformer-based neural surrogate for link-level path loss prediction from variable-sized maps,” arXiv preprint arXiv:2310.04570, 2023.
  • [20] J.-H. Lee, O. G. Serbetci, D. P. Selvam, and A. F. Molisch, “Pmnet: Robust pathloss map prediction via supervised learning,” in Proceedings of IEEE Global Communications Conference (GLOBECOM), December 2023.
  • [21] S. Bakirtzis, K. Qiu, J. Zhang, and I. Wassell, “Deepray: Deep learning meets ray-tracing,” in 2022 16th European Conference on Antennas and Propagation (EuCAP), 2022, pp. 1–5.
  • [22] T. Orekondy, K. Pratik, S. Kadambi, H. Ye, J. Soriaga, and A. Behboodi, “Winert: Towards neural ray tracing for wireless channel modelling and differentiable simulations,” in The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023.   OpenReview.net, 2023.
  • [23] J. Ost, I. Laradji, A. Newell, Y. Bahat, and F. Heide, “Neural point light fields,” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
  • [24] P. Debevec, “A median cut algorithm for light probe sampling,” in ACM SIGGRAPH 2008 classes, 2008, pp. 1–3.
  • [25] Y. Xu, G. Zoss, P. Chandran, M. Gross, D. Bradley, and P. Gotardo, “Renerf: Relightable neural radiance fields with nearfield lighting,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2023, pp. 22 581–22 591.
  • [26] H. Ren, H. Fan, R. Wang, Y. Huo, R. Tang, L. Wang, and H. Bao, “Data-driven digital lighting design for residential indoor spaces,” ACM Trans. Graph., vol. 42, no. 3, mar 2023. [Online]. Available: https://doi.org/10.1145/3582001
  • [27] X. Zhang, P. P. Srinivasan, B. Deng, P. Debevec, W. T. Freeman, and J. T. Barron, “Nerfactor: Neural factorization of shape and reflectance under an unknown illumination,” ACM Trans. Graph., vol. 40, no. 6, dec 2021. [Online]. Available: https://doi.org/10.1145/3478513.3480496
  • [28] G. Li, A. Meka, F. Mueller, M. C. Buehler, O. Hilliges, and T. Beeler, “Eyenerf: A hybrid representation for photorealistic synthesis, animation and relighting of human eyes,” ACM Trans. Graph., vol. 41, no. 4, jul 2022. [Online]. Available: https://doi.org/10.1145/3528223.3530130
  • [29] B. Kerbl, G. Kopanas, T. Leimkühler, and G. Drettakis, “3d gaussian splatting for real-time radiance field rendering,” ACM Transactions on Graphics, vol. 42, no. 4, July 2023. [Online]. Available: https://repo-sam.inria.fr/fungraph/3d-gaussian-splatting/
  • [30] J. Gao, C. Gu, Y. Lin, H. Zhu, X. Cao, L. Zhang, and Y. Yao, “Relightable 3d gaussian: Real-time point cloud relighting with brdf decomposition and ray tracing,” arXiv:2311.16043, 2023.
  • [31] Y. Liu, X. Huang, M. Qin, Q. Lin, and H. Wang, “Animatable 3d gaussian: Fast and high-quality reconstruction of multiple human avatars,” arXiv preprint arXiv:2311.16482, 2023.
  • [32] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Proceedings of the 31st International Conference on Neural Information Processing Systems, ser. NIPS’17.   Red Hook, NY, USA: Curran Associates Inc., 2017, p. 6000–6010.
  • [33] T.-M. Li, M. Aittala, F. Durand, and J. Lehtinen, “Differentiable monte carlo ray tracing through edge sampling,” ACM Trans. Graph., vol. 37, no. 6, dec 2018. [Online]. Available: https://doi.org/10.1145/3272127.3275109
  • [34] W. Jakob, S. Speierer, N. Roussel, M. Nimier-David, D. Vicini, T. Zeltner, B. Nicolet, M. Crespo, V. Leroy, and Z. Zhang, “Mitsuba 3 renderer,” 2022, https://mitsuba-renderer.org.
  • [35] J. J. Park, P. Florence, J. Straub, R. Newcombe, and S. Lovegrove, “Deepsdf: Learning continuous signed distance functions for shape representation,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
  • [36] J. Choe, B. Joung, F. Rameau, J. Park, and I. S. Kweon, “Deep point cloud reconstruction,” arXiv preprint arXiv:2111.11704, 2021.
  • [37] R. Roveri, A. Öztireli, I. Pandele, and M. Gross, “Pointpronets: Consolidation of point clouds with convolutional neural networks,” Computer Graphics Forum, vol. 37, pp. 87–99, 05 2018.
  • [38] M.-J. Rakotosaona, V. La Barbera, P. Guerrero, N. J. Mitra, and M. Ovsjanikov, “Pointcleannet: Learning to denoise and remove outliers from dense point clouds,” in Computer Graphics Forum, vol. 39, no. 1.   Wiley Online Library, 2020, pp. 185–203.
  • [39] S. Luo and W. Hu, “Score-based point cloud denoising,” in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), October 2021, pp. 4583–4592.
  • [40] X. Wen, P. Xiang, Z. Han, Y.-P. Cao, P. Wan, W. Zheng, and Y.-S. Liu, “Pmp-net: Point cloud completion by learning multi-step point moving paths,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2021.
  • [41] P. Xiang, X. Wen, Y.-S. Liu, Y.-P. Cao, P. Wan, W. Zheng, and Z. Han, “SnowflakeNet: Point cloud completion by snowflake point deconvolution with skip-transformer,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV), 2021.
  • [42] C. R. Qi, H. Su, K. Mo, and L. J. Guibas, “Pointnet: Deep learning on point sets for 3d classification and segmentation,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 652–660.
  • [43] Y. Feng, Z. Zhang, X. Zhao, R. Ji, and Y. Gao, “Gvcnn: Group-view convolutional neural networks for 3d shape recognition,” in 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, pp. 264–272.
  • [44] A. Kanezaki, Y. Matsushita, and Y. Nishida, “Rotationnet: Joint object categorization and pose estimation using multiviews from unsupervised viewpoints,” in Proceedings of IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2018.
  • [45] R. B. Izhak, A. Lahav, and A. Tal, “Attwalk: Attentive cross-walks for deep mesh analysis,” in 2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 2937–2946.
  • [46] Y. Xie, E. Franz, M. Chu, and N. Thuerey, “tempoGAN: A Temporally Coherent, Volumetric GAN for Super-resolution Fluid Flow,” ACM Transactions on Graphics (TOG), vol. 37, no. 4, p. 95, 2018.
  • [47] S. Sato, Y. Dobashi, and T. Kim, “Stream-guided smoke simulations,” ACM Trans. Graph., vol. 40, no. 4, jul 2021. [Online]. Available: https://doi.org/10.1145/3450626.3459846
  • [48] T. Du, K. Wu, P. Ma, S. Wah, A. Spielberg, D. Rus, and W. Matusik, “Diffpd: Differentiable projective dynamics,” ACM Trans. Graph., vol. 41, no. 2, nov 2021. [Online]. Available: https://doi.org/10.1145/3490168
  • [49] P. Ma, T. Du, J. Z. Zhang, K. Wu, A. Spielberg, R. K. Katzschmann, and W. Matusik, “Diffaqua: a differentiable computational design pipeline for soft underwater swimmers with shape interpolation,” ACM Trans. Graph., vol. 40, no. 4, jul 2021. [Online]. Available: https://doi.org/10.1145/3450626.3459832
  • [50] O. Noakoasteen, S. Wang, Z. Peng, and C. Christodoulou, “Physics-informed deep neural networks for transient electromagnetic analysis,” IEEE Open Journal of Antennas and Propagation, vol. 1, pp. 404–412, 2020.
  • [51] O. Noakoasteen, C. Christodoulou, Z. Peng, and S. Goudos, “Physics‐informed surrogates for electromagnetic dynamics using transformers and graph neural networks,” IET Microwaves, Antennas & Propagation, 02 2024.
  • [52] Z. Wei and X. Chen, “Physics-inspired convolutional neural network for solving full-wave inverse scattering problems,” IEEE Transactions on Antennas and Propagation, vol. 67, no. 9, pp. 6138–6148, 2019.
  • [53] L. Li, L. G. Wang, F. L. Teixeira, C. Liu, A. Nehorai, and T. J. Cui, “Deepnis: Deep neural network for nonlinear electromagnetic inverse scattering,” IEEE Transactions on Antennas and Propagation, vol. 67, no. 3, pp. 1819–1825, 2019.
  • [54] H. M. Yao, L. Jiang, and W. E. I. Sha, “Enhanced deep learning approach based on the deep convolutional encoder–decoder architecture for electromagnetic inverse scattering problems,” IEEE Antennas and Wireless Propagation Letters, vol. 19, no. 7, pp. 1211–1215, 2020.
  • [55] K. Xu, L. Wu, X. Ye, and X. Chen, “Deep learning-based inversion methods for solving inverse scattering problems with phaseless data,” IEEE Transactions on Antennas and Propagation, vol. 68, no. 11, pp. 7457–7470, 2020.
  • [56] H. M. Yao, R. Guo, M. Li, L. Jiang, and M. K. P. Ng, “Enhanced supervised descent learning technique for electromagnetic inverse scattering problems by the deep convolutional neural networks,” IEEE Transactions on Antennas and Propagation, vol. 70, no. 8, pp. 6195–6206, 2022.
  • [57] Y. Ge, L. Guo, and M. Li, “Physics-informed deep learning for time-domain electromagnetic radiation problem,” in 2022 IEEE MTT-S International Microwave Biomedical Conference (IMBioC), 2022, pp. 114–116.
  • [58] R. Guo, Z. Lin, T. Shan, X. Song, M. Li, F. Yang, S. Xu, and A. Abubakar, “Physics embedded deep neural network for solving full-wave inverse scattering problems,” IEEE Transactions on Antennas and Propagation, vol. 70, no. 8, pp. 6148–6159, 2022.
  • [59] Y. Hu, Y. Jin, X. Wu, and J. Chen, “A physics-driven deep-learning inverse solver for subsurface sensing,” in 2020 IEEE USNC-CNC-URSI North American Radio Science Meeting (Joint with AP-S Symposium), 2020, pp. 135–136.
  • [60] M. Salucci, M. Arrebola, T. Shan, and M. Li, “Artificial intelligence: New frontiers in real-time inverse scattering and electromagnetic imaging,” IEEE Transactions on Antennas and Propagation, vol. 70, no. 8, pp. 6349–6364, 2022.
  • [61] Y. Hu, Y. Jin, X. Wu, and J. Chen, “A theory-guided deep neural network for time domain electromagnetic simulation and inversion using a differentiable programming platform,” IEEE Transactions on Antennas and Propagation, vol. 70, no. 1, pp. 767–772, 2022.
  • [62] Q. Dai, Y. H. Lee, H.-H. Sun, G. Ow, M. L. M. Yusof, and A. C. Yucel, “3dinvnet: A deep learning-based 3d ground-penetrating radar data inversion,” IEEE Transactions on Geoscience and Remote Sensing, vol. 61, pp. 1–16, 2023.
  • [63] L. Guo, M. Li, S. Xu, F. Yang, and L. Liu, “Electromagnetic modeling using an fdtd-equivalent recurrent convolution neural network: Accurate computing on a deep learning framework,” IEEE Antennas and Propagation Magazine, vol. 65, no. 1, pp. 93–102, 2023.
  • [64] Y. Su, S. Zeng, X. Wu, Y. Huang, and J. Chen, “Physics-informed graph neural network for electromagnetic simulations,” in 2023 XXXVth General Assembly and Scientific Symposium of the International Union of Radio Science (URSI GASS), 2023, pp. 1–3.
  • [65] S. Qi and C. D. Sarris, “Hybrid physics-informed neural network for the wave equation with unconditionally stable time-stepping,” IEEE Antennas and Wireless Propagation Letters, vol. 23, no. 4, pp. 1356–1360, 2024.
  • [66] J. Hoydis, S. Cammerer, F. Ait Aoudia, A. Vem, N. Binder, G. Marcus, and A. Keller, “Sionna: An open-source library for next-generation physical layer research,” arXiv preprint, Mar. 2022.
  • [67] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in International Conference on Learning Representations (ICLR), San Diego, CA, USA, 2015.