-
RHINO-VR Experience: Teaching Mobile Robotics Concepts in an Interactive Museum Exhibit
Authors:
Erik Schlachhoff,
Nils Dengler,
Leif Van Holland,
Patrick Stotko,
Jorge de Heuvel,
Reinhard Klein,
Maren Bennewitz
Abstract:
In 1997, the very first tour guide robot RHINO was deployed in a museum in Germany. With the ability to navigate autonomously through the environment, the robot gave tours to over 2,000 visitors. Today, RHINO itself has become an exhibit and is no longer operational. In this paper, we present RHINO-VR, an interactive museum exhibit using virtual reality (VR) that allows museum visitors to experien…
▽ More
In 1997, the very first tour guide robot RHINO was deployed in a museum in Germany. With the ability to navigate autonomously through the environment, the robot gave tours to over 2,000 visitors. Today, RHINO itself has become an exhibit and is no longer operational. In this paper, we present RHINO-VR, an interactive museum exhibit using virtual reality (VR) that allows museum visitors to experience the historical robot RHINO in operation in a virtual museum. RHINO-VR, unlike static exhibits, enables users to familiarize themselves with basic mobile robotics concepts without the fear of damaging the exhibit. In the virtual environment, the user is able to interact with RHINO in VR by pointing to a location to which the robot should navigate and observing the corresponding actions of the robot. To include other visitors who cannot use the VR, we provide an external observation view to make RHINO visible to them. We evaluated our system by measuring the frame rate of the VR simulation, comparing the generated virtual 3D models with the originals, and conducting a user study. The user-study showed that RHINO-VR improved the visitors' understanding of the robot's functionality and that they would recommend experiencing the VR exhibit to others.
△ Less
Submitted 10 June, 2024; v1 submitted 22 March, 2024;
originally announced March 2024.
-
FPO++: Efficient Encoding and Rendering of Dynamic Neural Radiance Fields by Analyzing and Enhancing Fourier PlenOctrees
Authors:
Saskia Rabich,
Patrick Stotko,
Reinhard Klein
Abstract:
Fourier PlenOctrees have shown to be an efficient representation for real-time rendering of dynamic Neural Radiance Fields (NeRF). Despite its many advantages, this method suffers from artifacts introduced by the involved compression when combining it with recent state-of-the-art techniques for training the static per-frame NeRF models. In this paper, we perform an in-depth analysis of these artif…
▽ More
Fourier PlenOctrees have shown to be an efficient representation for real-time rendering of dynamic Neural Radiance Fields (NeRF). Despite its many advantages, this method suffers from artifacts introduced by the involved compression when combining it with recent state-of-the-art techniques for training the static per-frame NeRF models. In this paper, we perform an in-depth analysis of these artifacts and leverage the resulting insights to propose an improved representation. In particular, we present a novel density encoding that adapts the Fourier-based compression to the characteristics of the transfer function used by the underlying volume rendering procedure and leads to a substantial reduction of artifacts in the dynamic model. Furthermore, we show an augmentation of the training data that relaxes the periodicity assumption of the compression. We demonstrate the effectiveness of our enhanced Fourier PlenOctrees in the scope of quantitative and qualitative evaluations on synthetic and real-world scenes.
△ Less
Submitted 31 October, 2023;
originally announced October 2023.
-
TraM-NeRF: Tracing Mirror and Near-Perfect Specular Reflections through Neural Radiance Fields
Authors:
Leif Van Holland,
Ruben Bliersbach,
Jan U. Müller,
Patrick Stotko,
Reinhard Klein
Abstract:
Implicit representations like Neural Radiance Fields (NeRF) showed impressive results for photorealistic rendering of complex scenes with fine details. However, ideal or near-perfectly specular reflecting objects such as mirrors, which are often encountered in various indoor scenes, impose ambiguities and inconsistencies in the representation of the reconstructed scene leading to severe artifacts…
▽ More
Implicit representations like Neural Radiance Fields (NeRF) showed impressive results for photorealistic rendering of complex scenes with fine details. However, ideal or near-perfectly specular reflecting objects such as mirrors, which are often encountered in various indoor scenes, impose ambiguities and inconsistencies in the representation of the reconstructed scene leading to severe artifacts in the synthesized renderings. In this paper, we present a novel reflection tracing method tailored for the involved volume rendering within NeRF that takes these mirror-like objects into account while avoiding the cost of straightforward but expensive extensions through standard path tracing. By explicitly modeling the reflection behavior using physically plausible materials and estimating the reflected radiance with Monte-Carlo methods within the volume rendering formulation, we derive efficient strategies for importance sampling and the transmittance computation along rays from only few samples. We show that our novel method enables the training of consistent representations of such challenging scenes and achieves superior results in comparison to previous state-of-the-art approaches.
△ Less
Submitted 16 October, 2023;
originally announced October 2023.
-
Efficient 3D Reconstruction, Streaming and Visualization of Static and Dynamic Scene Parts for Multi-client Live-telepresence in Large-scale Environments
Authors:
Leif Van Holland,
Patrick Stotko,
Stefan Krumpen,
Reinhard Klein,
Michael Weinmann
Abstract:
Despite the impressive progress of telepresence systems for room-scale scenes with static and dynamic scene entities, expanding their capabilities to scenarios with larger dynamic environments beyond a fixed size of a few square-meters remains challenging.
In this paper, we aim at sharing 3D live-telepresence experiences in large-scale environments beyond room scale with both static and dynamic…
▽ More
Despite the impressive progress of telepresence systems for room-scale scenes with static and dynamic scene entities, expanding their capabilities to scenarios with larger dynamic environments beyond a fixed size of a few square-meters remains challenging.
In this paper, we aim at sharing 3D live-telepresence experiences in large-scale environments beyond room scale with both static and dynamic scene entities at practical bandwidth requirements only based on light-weight scene capture with a single moving consumer-grade RGB-D camera. To this end, we present a system which is built upon a novel hybrid volumetric scene representation in terms of the combination of a voxel-based scene representation for the static contents, that not only stores the reconstructed surface geometry but also contains information about the object semantics as well as their accumulated dynamic movement over time, and a point-cloud-based representation for dynamic scene parts, where the respective separation from static parts is achieved based on semantic and instance information extracted for the input frames. With an independent yet simultaneous streaming of both static and dynamic content, where we seamlessly integrate potentially moving but currently static scene entities in the static model until they are becoming dynamic again, as well as the fusion of static and dynamic data at the remote client, our system is able to achieve VR-based live-telepresence at close to real-time rates. Our evaluation demonstrates the potential of our novel approach in terms of visual quality, performance, and ablation studies regarding involved design choices.
△ Less
Submitted 13 February, 2024; v1 submitted 25 November, 2022;
originally announced November 2022.
-
Incomplete Gamma Kernels: Generalizing Locally Optimal Projection Operators
Authors:
Patrick Stotko,
Michael Weinmann,
Reinhard Klein
Abstract:
We present incomplete gamma kernels, a generalization of Locally Optimal Projection (LOP) operators. In particular, we reveal the relation of the classical localized $ L_1 $ estimator, used in the LOP operator for surface reconstruction from noisy point clouds, to the common Mean Shift framework via a novel kernel. Furthermore, we generalize this result to a whole family of kernels that are built…
▽ More
We present incomplete gamma kernels, a generalization of Locally Optimal Projection (LOP) operators. In particular, we reveal the relation of the classical localized $ L_1 $ estimator, used in the LOP operator for surface reconstruction from noisy point clouds, to the common Mean Shift framework via a novel kernel. Furthermore, we generalize this result to a whole family of kernels that are built upon the incomplete gamma function and each represents a localized $ L_p $ estimator. By deriving various properties of the kernel family concerning distributional, Mean Shift induced, and other aspects such as strict positive definiteness, we obtain a deeper understanding of the operator's projection behavior. From these theoretical insights, we illustrate several applications ranging from an improved Weighted LOP (WLOP) density weighting scheme and a more accurate Continuous LOP (CLOP) kernel approximation to the definition of a novel set of robust loss functions. These incomplete gamma losses include the Gaussian and LOP loss as special cases and can be applied for reconstruction tasks such as normal filtering. We demonstrate the effects of each application in a range of quantitative and qualitative experiments that highlight the benefits induced by our modifications.
△ Less
Submitted 2 May, 2022;
originally announced May 2022.
-
Bonn Activity Maps: Dataset Description
Authors:
Julian Tanke,
Oh-Hun Kwon,
Patrick Stotko,
Radu Alexandru Rosu,
Michael Weinmann,
Hassan Errami,
Sven Behnke,
Maren Bennewitz,
Reinhard Klein,
Andreas Weber,
Angela Yao,
Juergen Gall
Abstract:
The key prerequisite for accessing the huge potential of current machine learning techniques is the availability of large databases that capture the complex relations of interest. Previous datasets are focused on either 3D scene representations with semantic information, tracking of multiple persons and recognition of their actions, or activity recognition of a single person in captured 3D environ…
▽ More
The key prerequisite for accessing the huge potential of current machine learning techniques is the availability of large databases that capture the complex relations of interest. Previous datasets are focused on either 3D scene representations with semantic information, tracking of multiple persons and recognition of their actions, or activity recognition of a single person in captured 3D environments. We present Bonn Activity Maps, a large-scale dataset for human tracking, activity recognition and anticipation of multiple persons. Our dataset comprises four different scenes that have been recorded by time-synchronized cameras each only capturing the scene partially, the reconstructed 3D models with semantic annotations, motion trajectories for individual people including 3D human poses as well as human activity annotations. We utilize the annotations to generate activity likelihoods on the 3D models called activity maps.
△ Less
Submitted 13 December, 2019;
originally announced December 2019.
-
stdgpu: Efficient STL-like Data Structures on the GPU
Authors:
Patrick Stotko
Abstract:
Tremendous advances in parallel computing and graphics hardware opened up several novel real-time GPU applications in the fields of computer vision, computer graphics as well as augmented reality (AR) and virtual reality (VR). Although these applications built upon established open-source frameworks that provide highly optimized algorithms, they often come with custom self-written data structures…
▽ More
Tremendous advances in parallel computing and graphics hardware opened up several novel real-time GPU applications in the fields of computer vision, computer graphics as well as augmented reality (AR) and virtual reality (VR). Although these applications built upon established open-source frameworks that provide highly optimized algorithms, they often come with custom self-written data structures to manage the underlying data. In this work, we present stdgpu, an open-source library which defines several generic GPU data structures for fast and reliable data management. Rather than abandoning previous established frameworks, our library aims to extend them, therefore bridging the gap between CPU and GPU computing. This way, it provides clean and familiar interfaces and integrates seamlessly into new as well as existing projects. We hope to foster further developments towards unified CPU and GPU computing and welcome contributions from the community.
△ Less
Submitted 16 August, 2019;
originally announced August 2019.
-
Efficient 3D Reconstruction and Streaming for Group-Scale Multi-Client Live Telepresence
Authors:
Patrick Stotko,
Stefan Krumpen,
Michael Weinmann,
Reinhard Klein
Abstract:
Sharing live telepresence experiences for teleconferencing or remote collaboration receives increasing interest with the recent progress in capturing and AR/VR technology. Whereas impressive telepresence systems have been proposed on top of on-the-fly scene capture, data transmission and visualization, these systems are restricted to the immersion of single or up to a low number of users into the…
▽ More
Sharing live telepresence experiences for teleconferencing or remote collaboration receives increasing interest with the recent progress in capturing and AR/VR technology. Whereas impressive telepresence systems have been proposed on top of on-the-fly scene capture, data transmission and visualization, these systems are restricted to the immersion of single or up to a low number of users into the respective scenarios. In this paper, we direct our attention on immersing significantly larger groups of people into live-captured scenes as required in education, entertainment or collaboration scenarios. For this purpose, rather than abandoning previous approaches, we present a range of optimizations of the involved reconstruction and streaming components that allow the immersion of a group of more than 24 users within the same scene - which is about a factor of 6 higher than in previous work - without introducing further latency or changing the involved consumer hardware setup. We demonstrate that our optimized system is capable of generating high-quality scene reconstructions as well as providing an immersive viewing experience to a large group of people within these live-captured scenes.
△ Less
Submitted 13 January, 2020; v1 submitted 8 August, 2019;
originally announced August 2019.
-
A VR System for Immersive Teleoperation and Live Exploration with a Mobile Robot
Authors:
Patrick Stotko,
Stefan Krumpen,
Max Schwarz,
Christian Lenz,
Sven Behnke,
Reinhard Klein,
Michael Weinmann
Abstract:
Applications like disaster management and industrial inspection often require experts to enter contaminated places. To circumvent the need for physical presence, it is desirable to generate a fully immersive individual live teleoperation experience. However, standard video-based approaches suffer from a limited degree of immersion and situation awareness due to the restriction to the camera view,…
▽ More
Applications like disaster management and industrial inspection often require experts to enter contaminated places. To circumvent the need for physical presence, it is desirable to generate a fully immersive individual live teleoperation experience. However, standard video-based approaches suffer from a limited degree of immersion and situation awareness due to the restriction to the camera view, which impacts the navigation. In this paper, we present a novel VR-based practical system for immersive robot teleoperation and scene exploration. While being operated through the scene, a robot captures RGB-D data that is streamed to a SLAM-based live multi-client telepresence system. Here, a global 3D model of the already captured scene parts is reconstructed and streamed to the individual remote user clients where the rendering for e.g. head-mounted display devices (HMDs) is performed. We introduce a novel lightweight robot client component which transmits robot-specific data and enables a quick integration into existing robotic systems. This way, in contrast to first-person exploration systems, the operators can explore and navigate in the remote site completely independent of the current position and view of the capturing robot, complementing traditional input devices for teleoperation. We provide a proof-of-concept implementation and demonstrate the capabilities as well as the performance of our system regarding interactive object measurements and bandwidth-efficient data streaming and visualization. Furthermore, we show its benefits over purely video-based teleoperation in a user study revealing a higher degree of situation awareness and a more precise navigation in challenging environments.
△ Less
Submitted 3 February, 2020; v1 submitted 8 August, 2019;
originally announced August 2019.
-
SLAMCast: Large-Scale, Real-Time 3D Reconstruction and Streaming for Immersive Multi-Client Live Telepresence
Authors:
Patrick Stotko,
Stefan Krumpen,
Matthias B. Hullin,
Michael Weinmann,
Reinhard Klein
Abstract:
Real-time 3D scene reconstruction from RGB-D sensor data, as well as the exploration of such data in VR/AR settings, has seen tremendous progress in recent years. The combination of both these components into telepresence systems, however, comes with significant technical challenges. All approaches proposed so far are extremely demanding on input and output devices, compute resources and transmiss…
▽ More
Real-time 3D scene reconstruction from RGB-D sensor data, as well as the exploration of such data in VR/AR settings, has seen tremendous progress in recent years. The combination of both these components into telepresence systems, however, comes with significant technical challenges. All approaches proposed so far are extremely demanding on input and output devices, compute resources and transmission bandwidth, and they do not reach the level of immediacy required for applications such as remote collaboration. Here, we introduce what we believe is the first practical client-server system for real-time capture and many-user exploration of static 3D scenes. Our system is based on the observation that interactive frame rates are sufficient for capturing and reconstruction, and real-time performance is only required on the client site to achieve lag-free view updates when rendering the 3D model. Starting from this insight, we extend previous voxel block hashing frameworks by overcoming internal dependencies and introducing, to the best of our knowledge, the first thread-safe GPU hash map data structure that is robust under massively concurrent retrieval, insertion and removal of entries on a thread level. We further propose a novel transmission scheme for volume data that is specifically targeted to Marching Cubes geometry reconstruction and enables a 90% reduction in bandwidth between server and exploration clients. The resulting system poses very moderate requirements on network bandwidth, latency and client-side computation, which enables it to rely entirely on consumer-grade hardware, including mobile devices. We demonstrate that our technique achieves state-of-the-art representation accuracy while providing, for any number of clients, an immersive and fluid lag-free viewing experience even during network outages.
△ Less
Submitted 12 April, 2019; v1 submitted 9 May, 2018;
originally announced May 2018.