-
EPOCH: Jointly Estimating the 3D Pose of Cameras and Humans
Authors:
Nicola Garau,
Giulia Martinelli,
Niccolò Bisagno,
Denis Tomè,
Carsten Stoll
Abstract:
Monocular Human Pose Estimation (HPE) aims at determining the 3D positions of human joints from a single 2D image captured by a camera. However, a single 2D point in the image may correspond to multiple points in 3D space. Typically, the uniqueness of the 2D-3D relationship is approximated using an orthographic or weak-perspective camera model. In this study, instead of relying on approximations, we advocate for utilizing the full perspective camera model. This involves estimating camera parameters and establishing a precise, unambiguous 2D-3D relationship. To do so, we introduce the EPOCH framework, comprising two main components: the pose lifter network (LiftNet) and the pose regressor network (RegNet). LiftNet utilizes the full perspective camera model to precisely estimate the 3D pose in an unsupervised manner. It takes a 2D pose and camera parameters as inputs and produces the corresponding 3D pose estimation. These inputs are obtained from RegNet, which starts from a single image and provides estimates for the 2D pose and camera parameters. RegNet utilizes only 2D pose data as weak supervision. Internally, RegNet predicts a 3D pose, which is then projected to 2D using the estimated camera parameters. This process enables RegNet to establish the unambiguous 2D-3D relationship. Our experiments show that modeling the lifting as an unsupervised task with a camera in the loop results in better generalization to unseen data. We obtain state-of-the-art results for 3D HPE on the Human3.6M and MPI-INF-3DHP datasets. Our code is available at: [Github link upon acceptance, see supplementary materials].
Submitted 28 June, 2024;
originally announced June 2024.
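The EPOCH abstract contrasts weak-perspective approximations with the full perspective camera model. The following is a minimal sketch of that difference only, not the authors' implementation: the function names, the focal length, and the joint coordinates are all hypothetical values chosen for illustration.

```python
def project_perspective(point, focal):
    """Full perspective: each point is divided by its own depth z."""
    x, y, z = point
    return (focal * x / z, focal * y / z)


def project_weak_perspective(point, focal, mean_depth):
    """Weak perspective: every point shares one scale, focal / mean_depth."""
    x, y, _ = point
    scale = focal / mean_depth
    return (scale * x, scale * y)


# Two hypothetical joints with the same (x, y) but different depths.
joints = [(0.5, 0.2, 4.0), (0.5, 0.2, 6.0)]
focal = 1000.0       # assumed focal length in pixels
mean_depth = 5.0     # assumed average depth of the person

persp = [project_perspective(j, focal) for j in joints]
weak = [project_weak_perspective(j, focal, mean_depth) for j in joints]

# Weak perspective collapses both joints onto the same pixel, so depth
# cannot be told apart; full perspective keeps them distinct.
print("perspective:     ", persp)
print("weak perspective:", weak)
```

Under the weak-perspective model both joints land on the same image point, which is the kind of ambiguity the abstract refers to; with per-point depth division the two projections differ.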
-
fNeRF: High Quality Radiance Fields from Practical Cameras
Authors:
Yi Hua,
Christoph Lassner,
Carsten Stoll,
Iain Matthews
Abstract:
In recent years, the development of Neural Radiance Fields has enabled a previously unseen level of photo-realistic 3D reconstruction of scenes and objects from multi-view camera data. However, previous methods use an oversimplified pinhole camera model, resulting in defocus blur being "baked" into the reconstructed radiance field. We propose a modification to the ray casting that leverages the optics of lenses to enhance scene reconstruction in the presence of defocus blur. This allows us to improve the quality of radiance field reconstructions from the measurements of a practical camera with finite aperture. We show that the proposed model matches the defocus blur behavior of practical cameras more closely than pinhole models and other approximations of defocus blur models, particularly in the presence of partial occlusions. This allows us to achieve sharper reconstructions, improving the PSNR on validation of all-in-focus images, on both synthetic and real datasets, by up to 3 dB.
Submitted 15 June, 2024;
originally announced June 2024.
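To put the "up to 3 dB" PSNR figure from the fNeRF abstract in context, here is a small sketch of the standard PSNR definition, 10 * log10(MAX^2 / MSE). The MSE values are hypothetical and are not taken from the paper; images are assumed to be scaled to [0, 1].

```python
import math


def psnr(mse, max_value=1.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical errors: halving the MSE raises PSNR by about 3 dB.
baseline_mse = 0.01
improved_mse = 0.005

gain_db = psnr(improved_mse) - psnr(baseline_mse)
print(f"PSNR gain: {gain_db:.2f} dB")
```

Because the scale is logarithmic, a gain of roughly 3 dB corresponds to the mean squared error being cut approximately in half.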
-
Personalized 3D Human Pose and Shape Refinement
Authors:
Tom Wehrbein,
Bodo Rosenhahn,
Iain Matthews,
Carsten Stoll
Abstract:
Recently, regression-based methods have dominated the field of 3D human pose and shape estimation. Despite their promising results, a common issue is the misalignment between predictions and image observations, often caused by minor joint rotation errors that accumulate along the kinematic chain. To address this issue, we propose to construct dense correspondences between initial human model estimates and the corresponding images that can be used to refine the initial predictions. To this end, we utilize renderings of the 3D models to predict per-pixel 2D displacements between the synthetic renderings and the RGB images. This allows us to effectively integrate and exploit appearance information of the persons. Our per-pixel displacements can be efficiently transformed to per-visible-vertex displacements and then used for 3D model refinement by minimizing a reprojection loss. To demonstrate the effectiveness of our approach, we refine the initial 3D human mesh predictions of multiple models using different refinement procedures on 3DPW and RICH. We show that our approach not only consistently leads to better image-model alignment, but also to improved 3D accuracy.
Submitted 18 March, 2024;
originally announced March 2024.
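The refinement abstract describes turning 2D displacements into per-vertex targets and minimizing a reprojection loss. The sketch below shows one plausible form of such a loss (mean squared pixel distance under a pinhole projection). It is an assumption made for illustration, not the authors' formulation; the vertices, focal length, and displacements are hypothetical.

```python
def project(vertex, focal):
    """Pinhole projection of a 3D vertex (x, y, z) to pixel coordinates."""
    x, y, z = vertex
    return (focal * x / z, focal * y / z)


def reprojection_loss(vertices, targets_2d, focal):
    """Mean squared 2D distance between projected vertices and their targets."""
    total = 0.0
    for vertex, (tu, tv) in zip(vertices, targets_2d):
        u, v = project(vertex, focal)
        total += (u - tu) ** 2 + (v - tv) ** 2
    return total / len(vertices)


focal = 500.0                                       # assumed focal length
initial = [(0.0, 0.0, 2.0), (0.2, 0.0, 2.0)]        # hypothetical visible vertices
displacements = [(2.0, 0.0), (2.0, 0.0)]            # hypothetical 2D offsets in pixels

# Target pixel positions = projection of the initial estimate + displacement.
targets = [
    (u + du, v + dv)
    for (u, v), (du, dv) in zip((project(p, focal) for p in initial), displacements)
]

loss_before = reprojection_loss(initial, targets, focal)

# Shifting the mesh by 0.008 along x moves each projection by 2 pixels.
refined = [(x + 0.008, y, z) for x, y, z in initial]
loss_after = reprojection_loss(refined, targets, focal)
print(loss_before, loss_after)
```

In this toy case a small rigid shift removes the misalignment entirely; a real refinement would optimize pose and shape parameters of the body model instead of editing vertices directly.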
-
Bitcoin MiCA Whitepaper
Authors:
Juan Ignacio Ibañez,
Lena Klaaßen,
Ulrich Gallersdörfer,
Christian Stoll
Abstract:
This document is written as an academic exercise, with the goal of exploring the feasibility of writing a white paper in accordance with Regulation (EU) 2023/1114 (MiCA). It is meant as a Proof of Concept (PoC) illustrating a concrete application of the requirements of MiCA. Like the MiCA white papers PoC shared by ESMA, this document is solely for the purposes of the PoC, to inform the public as to how a crypto-asset white paper could work, inspire public debate and feedback, and enhance the public conversation around the implementation of EU regulations.
Submitted 15 March, 2024;
originally announced March 2024.
-
Accounting for carbon emissions caused by cryptocurrency and token systems
Authors:
Ulrich Gallersdörfer,
Lena Klaaßen,
Christian Stoll
Abstract:
The energy consumption and related carbon emissions of cryptocurrencies such as Bitcoin are subject to extensive discussion in public, academia, and industry. As cryptocurrencies continue their journey into mainstream finance, incentives to participate in the networks and consume energy to do so remain significant. First guidance on how to allocate the carbon footprint of the Bitcoin network to single investors exists; however, a holistic framework capturing a wider range of cryptocurrencies and tokens remains absent. This white paper explores different approaches to allocating emissions caused by cryptocurrencies and tokens. Based on our analysis of the strengths and limitations of potential approaches, we propose a framework that combines key drivers of emissions in Proof of Work and Proof of Stake networks.
Submitted 20 March, 2023; v1 submitted 11 November, 2021;
originally announced November 2021.
-
MESA Technical Note: Beam Breakup Instability Threshold Current
Authors:
S. Glukhov,
O. Boine-Frankenheim,
C. Stoll,
F. Hug
Abstract:
MESA (Mainz Energy recovery Superconducting Accelerator) is an energy recovery linac (ERL) which is under construction at Johannes Gutenberg University in Mainz. It will be operated in external beam (EB) mode with a 150 μA electron beam at 155 MeV and in energy recovery (ER) mode with a 1 mA (first stage) and later 10 mA (second stage) electron beam at 105 MeV. An important factor which may limit the performance of the machine is a beam breakup (BBU) instability which may occur due to excitation of higher-order modes (HOMs) in the superconducting RF cavities. This effect occurs only when the injected beam current exceeds a threshold value. The aim of the present work is to develop software for reliable determination of the threshold current in MESA, identify the main factors which may change its value, and finally decide whether MESA can operate at 10 mA and whether additional measures for suppressing the BBU instability are needed.
Submitted 1 February, 2021;
originally announced February 2021.
-
ANR: Articulated Neural Rendering for Virtual Avatars
Authors:
Amit Raj,
Julian Tanke,
James Hays,
Minh Vo,
Carsten Stoll,
Christoph Lassner
Abstract:
The combination of traditional rendering with neural networks in Deferred Neural Rendering (DNR) provides a compelling balance between computational complexity and realism of the resulting images. Using skinned meshes for rendering articulating objects is a natural extension for the DNR framework and would open it up to a plethora of applications. However, in this case the neural shading step must account for deformations that are possibly not captured in the mesh, as well as alignment inaccuracies and dynamics, which can confound the DNR pipeline. We present Articulated Neural Rendering (ANR), a novel framework based on DNR which explicitly addresses its limitations for virtual human avatars. We show the superiority of ANR not only with respect to DNR but also with respect to methods specialized for avatar creation and animation. In two user studies, we observe a clear preference for our avatar model and we demonstrate state-of-the-art performance on quantitative evaluation metrics. Perceptually, we observe better temporal stability, level of detail and plausibility.
Submitted 23 December, 2020;
originally announced December 2020.
-
Bitcoin's future carbon footprint
Authors:
Shize Qin,
Lena Klaaßen,
Ulrich Gallersdörfer,
Christian Stoll,
Da Zhang
Abstract:
The carbon footprint of Bitcoin has drawn wide attention, but Bitcoin's long-term impact on the climate remains uncertain. Here we present a framework to overcome uncertainties in previous estimates and project Bitcoin's electricity consumption and carbon footprint in the long term. If we assume Bitcoin's market capitalization grows in line with that of gold, we find that the annual electricity consumption of Bitcoin may increase from 60 to 400 TWh between 2020 and 2100. The future carbon footprint of Bitcoin strongly depends on the decarbonization pathway of the electricity sector. If the electricity sector achieves carbon neutrality by 2050, Bitcoin's carbon footprint has peaked already. However, in the business-as-usual scenario, emissions sum to 2 gigatons by 2100, an amount comparable to 7% of global emissions in 2019. The Bitcoin price spike at the end of 2020 shows, however, that progressive development of market capitalization could yield an electricity consumption of more than 100 TWh as early as 2021, and lead to cumulative emissions of over 5 gigatons by 2100. Therefore, we also discuss policy instruments to reduce Bitcoin's future carbon footprint.
Submitted 28 January, 2021; v1 submitted 4 November, 2020;
originally announced November 2020.
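The abstract ties Bitcoin's footprint to both electricity consumption and the carbon intensity of the grid. As a rough illustration of that dependence, the sketch below converts annual electricity use into annual emissions. The 60 and 400 TWh figures come from the abstract; the grid intensities (500 and 0 gCO2/kWh) are hypothetical placeholders, not values from the paper.

```python
def annual_emissions_mt(electricity_twh, intensity_g_per_kwh):
    """Annual emissions in megatons of CO2 for a given electricity use."""
    kwh = electricity_twh * 1e9            # 1 TWh = 1e9 kWh
    grams = kwh * intensity_g_per_kwh
    return grams / 1e12                    # 1 Mt = 1e12 g


# Hypothetical grid intensity of 500 gCO2/kWh, held constant.
print(annual_emissions_mt(60, 500))    # 2020-level consumption
print(annual_emissions_mt(400, 500))   # 2100-level consumption

# A fully decarbonized grid gives zero emissions regardless of consumption.
print(annual_emissions_mt(400, 0))
```

The same consumption path therefore yields very different footprints depending on how quickly the electricity sector decarbonizes, which is the sensitivity the abstract highlights.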
-
PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations
Authors:
Edgar Tretschk,
Ayush Tewari,
Vladislav Golyanik,
Michael Zollhöfer,
Carsten Stoll,
Christian Theobalt
Abstract:
Implicit surface representations, such as signed-distance functions, combined with deep learning have led to impressive models which can represent detailed shapes of objects with arbitrary topology. Since a continuous function is learned, the reconstructions can also be extracted at any arbitrary resolution. However, large datasets such as ShapeNet are required to train such models. In this paper, we present a new mid-level patch-based surface representation. At the level of patches, objects across different categories share similarities, which leads to more generalizable models. We then introduce a novel method to learn this patch-based representation in a canonical space, such that it is as object-agnostic as possible. We show that our representation trained on one category of objects from ShapeNet can also represent detailed shapes from any other category well. In addition, it can be trained using far fewer shapes than existing approaches. We show several applications of our new representation, including shape interpolation and partial point cloud completion. Due to explicit control over positions, orientations and scales of patches, our representation is also more controllable than object-level representations, which enables us to deform encoded shapes non-rigidly.
Submitted 5 February, 2021; v1 submitted 4 August, 2020;
originally announced August 2020.
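The PatchNets abstract builds on signed-distance functions and notes that a continuous function can be sampled at any resolution. The sketch below illustrates that idea with an analytic sphere in place of a learned network; it is not the PatchNets representation, and all names and values are illustrative assumptions.

```python
import math


def sdf_sphere(point, center=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere: negative inside, zero on the surface."""
    return math.dist(point, center) - radius


def sample_slice(resolution, extent=1.5):
    """Evaluate the SDF on a resolution x resolution grid in the z = 0 plane."""
    step = 2.0 * extent / (resolution - 1)
    coords = [-extent + i * step for i in range(resolution)]
    return [[sdf_sphere((x, y, 0.0)) for x in coords] for y in coords]


print(sdf_sphere((0.0, 0.0, 0.0)))   # inside the sphere
print(sdf_sphere((2.0, 0.0, 0.0)))   # outside the sphere

# The same function can be queried on a coarse or a fine grid.
coarse = sample_slice(5)
fine = sample_slice(65)
print(len(coarse), len(fine))
```

The surface is the zero level set of the function, so the grid resolution is a choice made at extraction time and is not fixed by the representation.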
-
TexMesh: Reconstructing Detailed Human Texture and Geometry from RGB-D Video
Authors:
Tiancheng Zhi,
Christoph Lassner,
Tony Tung,
Carsten Stoll,
Srinivasa G. Narasimhan,
Minh Vo
Abstract:
We present TexMesh, a novel approach to reconstruct detailed human meshes with high-resolution full-body texture from RGB-D video. TexMesh enables high quality free-viewpoint rendering of humans. Given the RGB frames, the captured environment map, and the coarse per-frame human mesh from RGB-D tracking, our method reconstructs spatiotemporally consistent and detailed per-frame meshes along with a high-resolution albedo texture. By using the incident illumination we are able to accurately estimate local surface geometry and albedo, which allows us to further use photometric constraints to adapt a synthetically trained model to real-world sequences in a self-supervised manner for detailed surface geometry and high-resolution texture estimation. In practice, we train our models on a short example sequence for self-adaptation and the model runs at interactive frame rates afterwards. We validate TexMesh on synthetic and real-world data, and show it outperforms the state of the art quantitatively and qualitatively.
Submitted 20 September, 2020; v1 submitted 31 July, 2020;
originally announced August 2020.