-
Quasi-Monte Carlo Algorithms (not only) for Graphics Software
Authors:
Alexander Keller,
Carsten Wächter,
Nikolaus Binder
Abstract:
Quasi-Monte Carlo methods have become the industry standard in computer graphics. For that purpose, efficient algorithms for low discrepancy sequences are discussed. In addition, numerical pitfalls encountered in practice are revealed. We then take a look at massively parallel quasi-Monte Carlo integro-approximation for image synthesis by light transport simulation. Beyond superior uniformity, low discrepancy points may be optimized with respect to additional criteria, such as noise characteristics at low sampling rates or the quality of low-dimensional projections.
Submitted 28 July, 2023;
originally announced July 2023.
-
Sionna RT: Differentiable Ray Tracing for Radio Propagation Modeling
Authors:
Jakob Hoydis,
Fayçal Aït Aoudia,
Sebastian Cammerer,
Merlin Nimier-David,
Nikolaus Binder,
Guillermo Marcus,
Alexander Keller
Abstract:
Sionna is a GPU-accelerated open-source library for link-level simulations based on TensorFlow. Since release v0.14 it integrates a differentiable ray tracer (RT) for the simulation of radio wave propagation. This unique feature allows for the computation of gradients of the channel impulse response and other related quantities with respect to many system and environment parameters, such as material properties, antenna patterns, array geometries, as well as transmitter and receiver orientations and positions. In this paper, we outline the key components of Sionna RT and showcase example applications such as learning radio materials and optimizing transmitter orientations by gradient descent. While classic ray tracing is a crucial tool for 6G research topics like reconfigurable intelligent surfaces, integrated sensing and communications, as well as user localization, differentiable ray tracing is a key enabler for many novel and exciting research directions, for example, digital twins.
Submitted 19 July, 2023; v1 submitted 20 March, 2023;
originally announced March 2023.
-
Rendering along the Hilbert Curve
Authors:
Alexander Keller,
Carsten Wächter,
Nikolaus Binder
Abstract:
Based on the seminal work on Array-RQMC methods and rank-1 lattice sequences by Pierre L'Ecuyer and collaborators, we introduce efficient deterministic algorithms for image synthesis. Enumerating a low discrepancy sequence along the Hilbert curve superimposed on the raster of pixels of an image, we achieve noise characteristics that are desirable with respect to the human visual system, especially at very low sampling rates. As compared to the state of the art, our simple algorithms neither require randomization, nor costly optimization, nor lookup tables. We analyze correlations of space-filling curves and low discrepancy sequences, and demonstrate the benefits of the new algorithms in a professional, massively parallel light transport simulation and rendering system.
Submitted 12 July, 2022;
originally announced July 2022.
-
GPU-Accelerated Machine Learning in Non-Orthogonal Multiple Access
Authors:
Daniel Schäufele,
Guillermo Marcus,
Nikolaus Binder,
Matthias Mehlhose,
Alexander Keller,
Sławomir Stańczak
Abstract:
Non-orthogonal multiple access (NOMA) is an interesting technology that enables massive connectivity as required in future 5G and 6G networks. While purely linear processing already achieves good performance in NOMA systems, in certain scenarios, non-linear processing is mandatory to ensure acceptable performance. In this paper, we propose a neural network architecture that combines the advantages of both linear and non-linear processing. Its real-time detection performance is demonstrated by a highly efficient implementation on a graphics processing unit (GPU). Using real measurements in a laboratory environment, we show the superiority of our approach over conventional methods.
Submitted 13 June, 2022;
originally announced June 2022.
-
Sionna: An Open-Source Library for Next-Generation Physical Layer Research
Authors:
Jakob Hoydis,
Sebastian Cammerer,
Fayçal Ait Aoudia,
Avinash Vem,
Nikolaus Binder,
Guillermo Marcus,
Alexander Keller
Abstract:
Sionna is a GPU-accelerated open-source library for link-level simulations based on TensorFlow. It enables the rapid prototyping of complex communication system architectures and provides native support for the integration of neural networks. Sionna implements a wide breadth of carefully tested state-of-the-art algorithms that can be used for benchmarking and end-to-end performance evaluation. This allows researchers to focus on their research, making it more impactful and reproducible, while saving time implementing components outside their area of expertise. This white paper provides a brief introduction to Sionna, explains its design principles and features, as well as future extensions, such as integrated ray tracing and custom CUDA kernels. We believe that Sionna is a valuable tool for research on next-generation communication systems, such as 6G, and we welcome contributions from our community.
Submitted 20 March, 2023; v1 submitted 22 March, 2022;
originally announced March 2022.
-
GPU-accelerated partially linear multiuser detection for 5G and beyond URLLC systems
Authors:
Matthias Mehlhose,
Guillermo Marcus,
Daniel Schäufele,
Daniyal Amir Awan,
Nikolaus Binder,
Martin Kasparick,
Renato L. G. Cavalcante,
Sławomir Stańczak,
Alexander Keller
Abstract:
In this feasibility study, we have implemented a recently proposed partially linear multiuser detection algorithm in reproducing kernel Hilbert spaces (RKHSs) on a GPU-accelerated platform. Partially linear multiuser detection, which combines the robustness of linear detection with the power of nonlinear methods, has been proposed for a massive connectivity scenario with non-orthogonal multiple access (NOMA). This is a promising approach, but detecting payloads within a received orthogonal frequency division multiplexing (OFDM) radio frame requires the execution of a large number of inner-product operations, which are the main computational burden of the algorithm. Although inner-product operations consist of simple kernel evaluations, their vast number poses a challenge in ultra-low latency (ULL) applications, because the time needed for computing the inner products might exceed the sub-millisecond latency requirement. To address this problem, this study demonstrates the acceleration of the inner-product operations through massive parallelization. The result is a GPU-accelerated real-time OFDM receiver that enables sub-millisecond latency detection to meet the requirements of 5th generation (5G) and beyond ultra-reliable and low latency communications (URLLC) systems. Moreover, the parallelization and acceleration techniques explored and demonstrated in this study can be extended to many other signal processing algorithms in Hilbert spaces, such as those based on projection onto convex sets (POCS) and adaptive projected subgradient method (APSM) algorithms. Experimental results and comparisons with the state of the art confirm the effectiveness of our techniques.
Submitted 17 May, 2022; v1 submitted 13 January, 2022;
originally announced January 2022.
-
Massively Parallel Path Space Filtering
Authors:
Nikolaus Binder,
Sascha Fricke,
Alexander Keller
Abstract:
Restricting path tracing to a small number of paths per pixel for performance reasons rarely achieves a satisfactory image quality for scenes of interest. However, path space filtering may dramatically improve the visual quality by sharing information across vertices of paths classified as proximate. Unlike screen space-based approaches, these paths neither need to be present on the screen, nor is filtering restricted to the first intersection with the scene. While searching proximate vertices had been more expensive than filtering in screen space, we greatly improve over this performance penalty by storing, updating, and looking up the required information in a hash table. The keys are constructed from jittered and quantized information, such that only a single query very likely replaces costly neighborhood searches. A massively parallel implementation of the algorithm is demonstrated on a graphics processing unit (GPU).
Submitted 3 February, 2021; v1 submitted 15 February, 2019;
originally announced February 2019.
-
Massively Parallel Construction of Radix Tree Forests for the Efficient Sampling of Discrete Probability Distributions
Authors:
Nikolaus Binder,
Alexander Keller
Abstract:
We compare different methods for sampling from discrete probability distributions and introduce a new algorithm which is especially efficient on massively parallel processors, such as GPUs. The scheme preserves the distribution properties of the input sequence, exposes constant time complexity on the average, and significantly lowers the average number of operations for certain distributions when sampling is performed in a parallel algorithm that requires synchronization afterwards. Avoiding load balancing issues of naïve approaches, a very efficient massively parallel construction algorithm for the required auxiliary data structure is complemented.
Submitted 30 August, 2019; v1 submitted 2 January, 2019;
originally announced January 2019.
-
Massively Parallel Stackless Ray Tracing of Catmull-Clark Subdivision Surfaces
Authors:
Nikolaus Binder,
Alexander Keller
Abstract:
We present a fast and efficient method for intersecting rays with Catmull-Clark subdivision surfaces. It takes advantage of the approximation democratized by OpenSubdiv, in which regular patches are represented by tensor product Bézier surfaces and irregular ones are approximated using Gregory patches. Our algorithm operates solely on the original patch data and can process both patch types simultaneously with only a small amount of control flow divergence. Besides introducing an optimized method to determine axis aligned bounding boxes of Gregory patches restricted in the parametric domain, several techniques are introduced that accelerate the recursive subdivision process including stackless operation, efficient work distribution, and control flow optimizations. The algorithm is especially useful for quick turnarounds during patch editing and animation playback.
Submitted 8 November, 2018;
originally announced November 2018.
-
Fast, High Precision Ray/Fiber Intersection using Tight, Disjoint Bounding Volumes
Authors:
Nikolaus Binder,
Alexander Keller
Abstract:
Analyzing and identifying the shortcomings of current subdivision methods for finding intersections of rays with fibers defined by the surface of a circular contour swept along a Bézier curve, we present a new algorithm that improves precision and performance. Instead of the inefficient pruning using overlapping axis aligned bounding boxes and determining the closest point of approach of the ray and the curve, we prune using disjoint bounding volumes defined by cylinders and calculate the intersections on the limit surface. This in turn allows for computing accurate parametric position and normal in the point of intersection. The iteration requires only one bit per subdivision to avoid costly stack memory operations. At a low number of subdivisions, the performance of the high precision algorithm is competitive, while for a high number of subdivisions it dramatically outperforms the state-of-the-art. Besides an extensive mathematical analysis, source code is provided.
Submitted 8 November, 2018;
originally announced November 2018.