Showing 1–25 of 25 results for author: Kontschieder, P

  1. arXiv:2406.18524

    cs.CV

    MultiDiff: Consistent Novel View Synthesis from a Single Image

    Authors: Norman Müller, Katja Schwarz, Barbara Roessle, Lorenzo Porzi, Samuel Rota Bulò, Matthias Nießner, Peter Kontschieder

    Abstract: We introduce MultiDiff, a novel approach for consistent novel view synthesis of scenes from a single RGB image. The task of synthesizing novel views from a single reference image is highly ill-posed by nature, as there exist multiple, plausible explanations for unobserved areas. To address this issue, we incorporate strong priors in form of monocular depth predictors and video-diffusion models. Mo…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Project page: https://sirwyver.github.io/MultiDiff Video: https://youtu.be/zBC4z4qXW_4 - CVPR 2024

  2. arXiv:2406.09404

    cs.CV cs.AI cs.LG

    ConsistDreamer: 3D-Consistent 2D Diffusion for High-Fidelity Scene Editing

    Authors: Jun-Kun Chen, Samuel Rota Bulò, Norman Müller, Lorenzo Porzi, Peter Kontschieder, Yu-Xiong Wang

    Abstract: This paper proposes ConsistDreamer - a novel framework that lifts 2D diffusion models with 3D awareness and 3D consistency, thus enabling high-fidelity instruction-guided scene editing. To overcome the fundamental limitation of missing 3D consistency in 2D diffusion models, our key insight is to introduce three synergetic strategies that augment the input of the 2D diffusion model to become 3D-awa…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: CVPR 2024

  3. arXiv:2406.03175

    cs.CV

    Dynamic 3D Gaussian Fields for Urban Areas

    Authors: Tobias Fischer, Jonas Kulhanek, Samuel Rota Bulò, Lorenzo Porzi, Marc Pollefeys, Peter Kontschieder

    Abstract: We present an efficient neural 3D scene representation for novel-view synthesis (NVS) in large-scale, dynamic urban areas. Existing works are not well suited for applications like mixed-reality or closed-loop simulation due to their limited visual quality and non-interactive rendering speeds. Recently, rasterization-based approaches have achieved high-quality NVS at impressive speeds. However, the…

    Submitted 5 June, 2024; originally announced June 2024.

    Comments: Project page is available at https://tobiasfshr.github.io/pub/4dgf/

  4. arXiv:2404.06109

    cs.CV

    Revising Densification in Gaussian Splatting

    Authors: Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder

    Abstract: In this paper, we address the limitations of Adaptive Density Control (ADC) in 3D Gaussian Splatting (3DGS), a scene representation method achieving high-quality, photorealistic results for novel view synthesis. ADC has been introduced for automatic 3D point primitive management, controlling densification and pruning, however, with certain limitations in the densification logic. Our main contribut…

    Submitted 9 April, 2024; originally announced April 2024.

  5. arXiv:2404.04211

    cs.CV

    Robust Gaussian Splatting

    Authors: François Darmon, Lorenzo Porzi, Samuel Rota-Bulò, Peter Kontschieder

    Abstract: In this paper, we address common error sources for 3D Gaussian Splatting (3DGS) including blur, imperfect camera poses, and color inconsistencies, with the goal of improving its robustness for practical applications like reconstructions from handheld phone captures. Our main contribution involves modeling motion blur as a Gaussian distribution over camera poses, allowing us to address both camera…

    Submitted 5 April, 2024; originally announced April 2024.

  6. arXiv:2404.00168

    cs.CV

    Multi-Level Neural Scene Graphs for Dynamic Urban Environments

    Authors: Tobias Fischer, Lorenzo Porzi, Samuel Rota Bulò, Marc Pollefeys, Peter Kontschieder

    Abstract: We estimate the radiance field of large-scale dynamic areas from multiple vehicle captures under varying environmental conditions. Previous works in this domain are either restricted to static environments, do not scale to more than a single short video, or struggle to separately represent dynamic object instances. To this end, we present a novel, decomposable radiance field approach for dynamic u…

    Submitted 29 March, 2024; originally announced April 2024.

    Comments: CVPR 2024. Project page is available at https://tobiasfshr.github.io/pub/ml-nsg/

  7. arXiv:2312.03160

    cs.CV cs.GR cs.LG

    HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces

    Authors: Haithem Turki, Vasu Agrawal, Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder, Deva Ramanan, Michael Zollhöfer, Christian Richardt

    Abstract: Neural radiance fields provide state-of-the-art view synthesis quality but tend to be slow to render. One reason is that they make use of volume rendering, thus requiring many samples (and model queries) per ray at render time. Although this representation is flexible and easy to optimize, most real-world objects can be modeled more efficiently with surfaces instead of volumes, requiring far fewer…

    Submitted 27 March, 2024; v1 submitted 5 December, 2023; originally announced December 2023.

    Comments: CVPR 2024 Project page: https://haithemturki.com/hybrid-nerf/

  8. VR-NeRF: High-Fidelity Virtualized Walkable Spaces

    Authors: Linning Xu, Vasu Agrawal, William Laney, Tony Garcia, Aayush Bansal, Changil Kim, Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder, Aljaž Božič, Dahua Lin, Michael Zollhöfer, Christian Richardt

    Abstract: We present an end-to-end system for the high-fidelity capture, model reconstruction, and real-time rendering of walkable spaces in virtual reality using neural radiance fields. To this end, we designed and built a custom multi-camera rig to densely capture walkable spaces in high fidelity and with multi-view high dynamic range images in unprecedented quality and density. We extend instant neural g…

    Submitted 4 November, 2023; originally announced November 2023.

    Comments: SIGGRAPH Asia 2023; Project page: https://vr-nerf.github.io

  9. arXiv:2306.06044

    cs.CV cs.GR eess.IV

    GANeRF: Leveraging Discriminators to Optimize Neural Radiance Fields

    Authors: Barbara Roessle, Norman Müller, Lorenzo Porzi, Samuel Rota Bulò, Peter Kontschieder, Matthias Nießner

    Abstract: Neural Radiance Fields (NeRF) have shown impressive novel view synthesis results; nonetheless, even thorough recordings yield imperfections in reconstructions, for instance due to poorly observed areas or minor lighting changes. Our goal is to mitigate these imperfections from various sources with a joint solution: we take advantage of the ability of generative adversarial networks (GANs) to produ…

    Submitted 18 September, 2023; v1 submitted 9 June, 2023; originally announced June 2023.

    Comments: SIGGRAPH Asia 2023, project page: https://barbararoessle.github.io/ganerf , video: https://youtu.be/352ccXWxQVE

    Journal ref: ACM Transactions on Graphics, Vol. 42, No. 6, Article 207 (2023) 1-14

  10. arXiv:2304.02009

    cs.CV

    OrienterNet: Visual Localization in 2D Public Maps with Neural Matching

    Authors: Paul-Edouard Sarlin, Daniel DeTone, Tsun-Yi Yang, Armen Avetisyan, Julian Straub, Tomasz Malisiewicz, Samuel Rota Bulo, Richard Newcombe, Peter Kontschieder, Vasileios Balntas

    Abstract: Humans can orient themselves in their 3D environments using simple 2D maps. Differently, algorithms for visual localization mostly rely on complex 3D point clouds that are expensive to build, store, and maintain over time. We bridge this gap by introducing OrienterNet, the first deep neural network that can localize an image with sub-meter accuracy using the same 2D semantic maps that humans use.…

    Submitted 4 April, 2023; originally announced April 2023.

    Comments: CVPR 2023

  11. arXiv:2212.09802

    cs.CV cs.LG

    Panoptic Lifting for 3D Scene Understanding with Neural Fields

    Authors: Yawar Siddiqui, Lorenzo Porzi, Samuel Rota Buló, Norman Müller, Matthias Nießner, Angela Dai, Peter Kontschieder

    Abstract: We propose Panoptic Lifting, a novel approach for learning panoptic 3D volumetric representations from images of in-the-wild scenes. Once trained, our model can render color images together with 3D-consistent panoptic segmentation from novel viewpoints. Unlike existing approaches which use 3D input directly or indirectly, our method requires only machine-generated 2D panoptic segmentation masks…

    Submitted 19 December, 2022; originally announced December 2022.

    Comments: Project Page: https://nihalsid.github.io/panoptic-lifting/, Video: https://youtu.be/QtsiL-6rSuM

  12. arXiv:2212.01206

    cs.CV

    DiffRF: Rendering-Guided 3D Radiance Field Diffusion

    Authors: Norman Müller, Yawar Siddiqui, Lorenzo Porzi, Samuel Rota Bulò, Peter Kontschieder, Matthias Nießner

    Abstract: We introduce DiffRF, a novel approach for 3D radiance field synthesis based on denoising diffusion probabilistic models. While existing diffusion-based methods operate on images, latent codes, or point cloud data, we are the first to directly generate volumetric radiance fields. To this end, we propose a 3D denoising model which directly operates on an explicit voxel grid representation. However,…

    Submitted 27 March, 2023; v1 submitted 2 December, 2022; originally announced December 2022.

    Comments: Project page: https://sirwyver.github.io/DiffRF/ Video: https://youtu.be/qETBcLu8SUk - CVPR 2023 Highlight - updated evaluations after fixing initial data mapping error on all methods

  13. arXiv:2204.03593

    cs.CV

    AutoRF: Learning 3D Object Radiance Fields from Single View Observations

    Authors: Norman Müller, Andrea Simonelli, Lorenzo Porzi, Samuel Rota Bulò, Matthias Nießner, Peter Kontschieder

    Abstract: We introduce AutoRF - a new approach for learning neural 3D object representations where each object in the training set is observed by only a single view. This setting is in stark contrast to the majority of existing works that leverage multiple views of the same object, employ explicit priors during training, or require pixel-perfect annotations. To address this challenging setting, we propose t…

    Submitted 7 April, 2022; originally announced April 2022.

    Comments: CVPR 2022. Project page: https://sirwyver.github.io/AutoRF/

  14. arXiv:2111.00770

    cs.CV

    Dense Prediction with Attentive Feature Aggregation

    Authors: Yung-Hsu Yang, Thomas E. Huang, Min Sun, Samuel Rota Bulò, Peter Kontschieder, Fisher Yu

    Abstract: Aggregating information from features across different layers is an essential operation for dense prediction models. Despite its limited expressiveness, feature concatenation dominates the choice of aggregation operations. In this paper, we introduce Attentive Feature Aggregation (AFA) to fuse different network layers with more expressive non-linear operations. AFA exploits both spatial and channe…

    Submitted 19 January, 2023; v1 submitted 1 November, 2021; originally announced November 2021.

    Comments: 20 pages, 14 figures, WACV 2023

  15. arXiv:2101.00667

    cs.CV

    Weakly Supervised Multi-Object Tracking and Segmentation

    Authors: Idoia Ruiz, Lorenzo Porzi, Samuel Rota Bulò, Peter Kontschieder, Joan Serrat

    Abstract: We introduce the problem of weakly supervised Multi-Object Tracking and Segmentation, i.e. joint weakly supervised instance segmentation and multi-object tracking, in which we do not provide any kind of mask annotation. To address it, we design a novel synergistic training strategy by taking advantage of multi-task learning, i.e. classification and tracking tasks guide the training of the unsuperv…

    Submitted 3 January, 2021; originally announced January 2021.

    Comments: Accepted at Autonomous Vehicle Vision WACV 2021 Workshop

    Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2021

  16. arXiv:2012.07717

    cs.CV

    Improving Panoptic Segmentation at All Scales

    Authors: Lorenzo Porzi, Samuel Rota Bulò, Peter Kontschieder

    Abstract: Crop-based training strategies decouple training resolution from GPU memory consumption, allowing the use of large-capacity panoptic segmentation networks on multi-megapixel images. Using crops, however, can introduce a bias towards truncating or missing large objects. To address this, we propose a novel crop-aware bounding box regression loss (CABB loss), which promotes predictions to be consiste…

    Submitted 23 March, 2021; v1 submitted 14 December, 2020; originally announced December 2020.

  17. arXiv:2012.05796

    cs.CV

    Are we Missing Confidence in Pseudo-LiDAR Methods for Monocular 3D Object Detection?

    Authors: Andrea Simonelli, Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder, Elisa Ricci

    Abstract: Pseudo-LiDAR-based methods for monocular 3D object detection have received considerable attention in the community due to the performance gains exhibited on the KITTI3D benchmark, in particular on the commonly reported validation split. This generated a distorted impression about the superiority of Pseudo-LiDAR-based (PL-based) approaches over methods working with RGB images only. Our first contri…

    Submitted 13 May, 2021; v1 submitted 10 December, 2020; originally announced December 2020.

  18. arXiv:1912.10739

    cs.CV

    Improving Optical Flow on a Pyramid Level

    Authors: Markus Hofinger, Samuel Rota Bulò, Lorenzo Porzi, Arno Knapitsch, Thomas Pock, Peter Kontschieder

    Abstract: In this work we review the coarse-to-fine spatial feature pyramid concept, which is used in state-of-the-art optical flow estimation networks to make exploration of the pixel flow search space computationally tractable and efficient. Within an individual pyramid level, we improve the cost volume construction process by departing from a warping- to a sampling-based strategy, which avoids ghosting a…

    Submitted 18 July, 2020; v1 submitted 23 December, 2019; originally announced December 2019.

  19. arXiv:1912.08035

    cs.CV

    Towards Generalization Across Depth for Monocular 3D Object Detection

    Authors: Andrea Simonelli, Samuel Rota Bulò, Lorenzo Porzi, Elisa Ricci, Peter Kontschieder

    Abstract: While expensive LiDAR and stereo camera rigs have enabled the development of successful 3D object detection methods, monocular RGB-only approaches lag much behind. This work advances the state of the art by introducing MoVi-3D, a novel, single-stage deep architecture for monocular 3D object detection. MoVi-3D builds upon a novel approach which leverages geometrical information to generate, both at…

    Submitted 2 April, 2020; v1 submitted 17 December, 2019; originally announced December 2019.

  20. arXiv:1912.02096

    cs.CV

    Learning Multi-Object Tracking and Segmentation from Automatic Annotations

    Authors: Lorenzo Porzi, Markus Hofinger, Idoia Ruiz, Joan Serrat, Samuel Rota Bulò, Peter Kontschieder

    Abstract: In this work we contribute a novel pipeline to automatically generate training data, and to improve over state-of-the-art multi-object tracking and segmentation (MOTS) methods. Our proposed track mining algorithm turns raw street-level videos into high-fidelity MOTS training data, is scalable and overcomes the need of expensive and time-consuming manual annotation approaches. We leverage state-of-…

    Submitted 30 March, 2020; v1 submitted 4 December, 2019; originally announced December 2019.

  21. arXiv:1905.12365

    cs.CV

    Disentangling Monocular 3D Object Detection

    Authors: Andrea Simonelli, Samuel Rota Bulò, Lorenzo Porzi, Manuel López-Antequera, Peter Kontschieder

    Abstract: In this paper we propose an approach for monocular 3D object detection from a single RGB image, which leverages a novel disentangling transformation for 2D and 3D detection losses and a novel, self-supervised confidence score for 3D bounding boxes. Our proposed loss disentanglement has the twofold advantage of simplifying the training dynamics in the presence of losses with complex interactions of…

    Submitted 29 May, 2019; originally announced May 2019.

    Comments: Project website at https://research.mapillary.com/publication/MonoDIS/

  22. arXiv:1905.01220

    cs.CV

    Seamless Scene Segmentation

    Authors: Lorenzo Porzi, Samuel Rota Bulò, Aleksander Colovic, Peter Kontschieder

    Abstract: In this work we introduce a novel, CNN-based architecture that can be trained end-to-end to deliver seamless scene segmentation results. Our goal is to predict consistent semantic segmentation and detection results by means of a panoptic output format, going beyond the simple combination of independently trained segmentation and detection models. The proposed architecture takes advantage of a nove…

    Submitted 3 May, 2019; originally announced May 2019.

    Comments: extended version of the accepted CVPR 2019 paper

  23. arXiv:1712.02616

    cs.CV

    In-Place Activated BatchNorm for Memory-Optimized Training of DNNs

    Authors: Samuel Rota Bulò, Lorenzo Porzi, Peter Kontschieder

    Abstract: In this work we present In-Place Activated Batch Normalization (InPlace-ABN) - a novel approach to drastically reduce the training memory footprint of modern deep neural networks in a computationally efficient way. Our solution substitutes the conventionally used succession of BatchNorm + Activation layers with a single plugin layer, hence avoiding invasive framework surgery while providing straig…

    Submitted 26 October, 2018; v1 submitted 7 December, 2017; originally announced December 2017.

  24. arXiv:1704.02966

    cs.CV stat.ML

    Loss Max-Pooling for Semantic Image Segmentation

    Authors: Samuel Rota Bulò, Gerhard Neuhold, Peter Kontschieder

    Abstract: We introduce a novel loss max-pooling concept for handling imbalanced training data distributions, applicable as alternative loss layer in the context of deep neural networks for semantic image segmentation. Most real-world semantic segmentation datasets exhibit long tail distributions with few object categories comprising the majority of data and consequently biasing the classifiers towards them.…

    Submitted 10 April, 2017; originally announced April 2017.

    Comments: accepted at CVPR 2017

  25. arXiv:1603.01250

    cs.CV cs.AI

    Decision Forests, Convolutional Networks and the Models in-Between

    Authors: Yani Ioannou, Duncan Robertson, Darko Zikic, Peter Kontschieder, Jamie Shotton, Matthew Brown, Antonio Criminisi

    Abstract: This paper investigates the connections between two state of the art classifiers: decision forests (DFs, including decision jungles) and convolutional neural networks (CNNs). Decision forests are computationally efficient thanks to their conditional computation property (computation is confined to only a small region of the tree, the nodes along a single branch). CNNs achieve state of the art accu…

    Submitted 3 March, 2016; originally announced March 2016.

    Comments: Microsoft Research Technical Report

    Report number: MSR-TR-2015-58