-
SALVe: Semantic Alignment Verification for Floorplan Reconstruction from Sparse Panoramas
Authors:
John Lambert,
Yuguang Li,
Ivaylo Boyadzhiev,
Lambert Wixson,
Manjunath Narayana,
Will Hutchcroft,
James Hays,
Frank Dellaert,
Sing Bing Kang
Abstract:
We propose a new system for automatic 2D floorplan reconstruction that is enabled by SALVe, our novel pairwise learned alignment verifier. The inputs to our system are sparsely located 360$^\circ$ panoramas, whose semantic features (windows, doors, and openings) are inferred and used to hypothesize pairwise room adjacency or overlap. SALVe initializes a pose graph, which is subsequently optimized using GTSAM. Once the room poses are computed, room layouts are inferred using HorizonNet, and the floorplan is constructed by stitching the most confident layout boundaries. We validate our system qualitatively and quantitatively as well as through ablation studies, showing that it outperforms state-of-the-art SfM systems in completeness by over 200%, without sacrificing accuracy. Our results point to the significance of our work: poses of 81% of panoramas are localized in the first 2 connected components (CCs), and 89% in the first 3 CCs. Code and models are publicly available at https://github.com/zillow/salve.
Submitted 27 June, 2024;
originally announced June 2024.
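The connected-component figures quoted in the SALVe abstract (81% of panoramas in the first 2 CCs, 89% in the first 3) can be illustrated with a small sketch. This is not code from the SALVe repository: the function name `fraction_in_top_components`, the edge-list input, and the reading of "first k CCs" as the k largest components are all assumptions made here for illustration.

```python
from collections import Counter


def fraction_in_top_components(num_nodes, edges, k):
    """Fraction of nodes that fall in the k largest connected components.

    Hypothetical helper: nodes stand for panoramas, edges for accepted
    pairwise alignments. Components are found with a simple union-find.
    """
    parent = list(range(num_nodes))

    def find(x):
        # Walk to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v

    # Component sizes, largest first.
    sizes = sorted(Counter(find(i) for i in range(num_nodes)).values(),
                   reverse=True)
    return sum(sizes[:k]) / num_nodes


# Toy graph (made-up data): components of size 5, 3, 1 and 1.
toy_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (5, 6), (6, 7)]
top2 = fraction_in_top_components(10, toy_edges, 2)
top3 = fraction_in_top_components(10, toy_edges, 3)
```

On this toy graph the two largest components hold 8 of the 10 nodes and the three largest hold 9, which mirrors how such coverage percentages are accumulated.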
-
Graph-CoVis: GNN-based Multi-view Panorama Global Pose Estimation
Authors:
Negar Nejatishahidin,
Will Hutchcroft,
Manjunath Narayana,
Ivaylo Boyadzhiev,
Yuguang Li,
Naji Khosravan,
Jana Kosecka,
Sing Bing Kang
Abstract:
In this paper, we address the problem of wide-baseline camera pose estimation from a group of 360$^\circ$ panoramas under the upright-camera assumption. Recent work has demonstrated the merit of deep learning for end-to-end direct relative pose regression in 360$^\circ$ panorama pairs [11]. To exploit the benefits of multi-view logic in a learning-based framework, we introduce Graph-CoVis, which non-trivially extends CoVisPose [11] from relative two-view to global multi-view spherical camera pose estimation. Graph-CoVis is a novel Graph Neural Network based architecture that jointly learns the co-visible structure and global motion in an end-to-end, fully supervised manner. Using the ZInD [4] dataset, which features real homes presenting wide baselines, occlusion, and limited visual overlap, we show that our model performs competitively with state-of-the-art approaches.
Submitted 25 April, 2023;
originally announced April 2023.
-
U2RLE: Uncertainty-Guided 2-Stage Room Layout Estimation
Authors:
Pooya Fayyazsanavi,
Zhiqiang Wan,
Will Hutchcroft,
Ivaylo Boyadzhiev,
Yuguang Li,
Jana Kosecka,
Sing Bing Kang
Abstract:
While existing deep learning-based room layout estimation techniques demonstrate good overall accuracy, they are less effective for distant floor-wall boundaries. To tackle this problem, we propose a novel uncertainty-guided approach for layout boundary estimation, introducing a new two-stage CNN architecture termed U2RLE. The initial stage predicts both the floor-wall boundary and its uncertainty, and is followed by the refinement of boundaries with high positional uncertainty using a different, distance-aware loss. Finally, outputs from the two stages are merged to produce the room layout. Experiments using the ZInD and Structured3D datasets show that U2RLE improves over the current state of the art, handling both near and far walls better. In particular, U2RLE outperforms current state-of-the-art techniques for the most distant walls.
Submitted 17 April, 2023;
originally announced April 2023.
-
Semantically Supervised Appearance Decomposition for Virtual Staging from a Single Panorama
Authors:
Tiancheng Zhi,
Bowei Chen,
Ivaylo Boyadzhiev,
Sing Bing Kang,
Martial Hebert,
Srinivasa G. Narasimhan
Abstract:
We describe a novel approach to decompose a single panorama of an empty indoor environment into four appearance components: specular, direct sunlight, diffuse and diffuse ambient without direct sunlight. Our system is weakly supervised by automatically generated semantic maps (with floor, wall, ceiling, lamp, window and door labels) that have shown success on perspective views and are trained for panoramas using transfer learning without any further annotations. A GAN-based approach supervised by coarse information obtained from the semantic map extracts specular reflection and direct sunlight regions on the floor and walls. These lighting effects are removed via a similar GAN-based approach and a semantic-aware inpainting step. The appearance decomposition enables multiple applications including sun direction estimation, virtual furniture insertion, floor material replacement, and sun direction change, providing an effective tool for virtual home staging. We demonstrate the effectiveness of our approach on a large and recently released dataset of panoramas of empty homes.
Submitted 26 May, 2022;
originally announced May 2022.
-
LASER: LAtent SpacE Rendering for 2D Visual Localization
Authors:
Zhixiang Min,
Naji Khosravan,
Zachary Bessinger,
Manjunath Narayana,
Sing Bing Kang,
Enrique Dunn,
Ivaylo Boyadzhiev
Abstract:
We present LASER, an image-based Monte Carlo Localization (MCL) framework for 2D floor maps. LASER introduces the concept of latent space rendering, where 2D pose hypotheses on the floor map are directly rendered into a geometrically-structured latent space by aggregating viewing ray features. Through a tightly coupled rendering codebook scheme, the viewing ray features are dynamically determined at rendering time based on their geometries (i.e., length and incident angle), endowing our representation with view-dependent fine-grained variability. Our codebook scheme effectively disentangles feature encoding from rendering, allowing the latent space rendering to run at speeds above 10 kHz. Moreover, through metric learning, our geometrically-structured latent space is common to both pose hypotheses and query images with arbitrary fields of view. As a result, LASER achieves state-of-the-art performance on large-scale indoor localization datasets (i.e., ZInD and Structured3D) for both panorama and perspective image queries, while significantly outperforming existing learning-based methods in speed.
Submitted 26 March, 2023; v1 submitted 31 March, 2022;
originally announced April 2022.
-
PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation
Authors:
Haiyan Wang,
Will Hutchcroft,
Yuguang Li,
Zhiqiang Wan,
Ivaylo Boyadzhiev,
Yingli Tian,
Sing Bing Kang
Abstract:
In this paper, we propose a new deep learning-based method for estimating room layout given a pair of 360$^\circ$ panoramas. Our system, called Position-aware Stereo Merging Network or PSMNet, is an end-to-end joint layout-pose estimator. PSMNet consists of a Stereo Pano Pose (SP2) transformer and a novel Cross-Perspective Projection (CP2) layer. The stereo-view SP2 transformer is used to implicitly infer correspondences between views, and can handle noisy poses. The pose-aware CP2 layer is designed to render features from the adjacent view to the anchor (reference) view, in order to perform view fusion and estimate the visible layout. Our experiments and analysis validate our method, which significantly outperforms state-of-the-art layout estimators, especially for large and complex room spaces.
Submitted 29 March, 2022;
originally announced March 2022.
-
Panoramic Structure from Motion via Geometric Relationship Detection
Authors:
Satoshi Ikehata,
Ivaylo Boyadzhiev,
Qi Shan,
Yasutaka Furukawa
Abstract:
This paper addresses the problem of Structure from Motion (SfM) for indoor panoramic image streams, which is extremely challenging even for state-of-the-art methods due to the lack of textures and minimal parallax. The key idea is the fusion of single-view and multi-view reconstruction techniques via geometric relationship detection (e.g., detecting 2D lines as coplanar in 3D). Rough geometry suffices to perform such detection, and our approach utilizes rough surface normal estimates from an image-to-normal deep network to discover geometric relationships among lines. The detected relationships provide exact geometric constraints in our line-based linear SfM formulation. Constrained linear least squares is used to reconstruct a 3D model and camera motions, followed by bundle adjustment. We have validated our algorithm on challenging datasets, outperforming various state-of-the-art reconstruction techniques.
Submitted 5 December, 2016;
originally announced December 2016.
-
Harmonic motion and Cassini ovals
Authors:
Khristo N. Boyadzhiev,
Irina A. Boyadzhiev
Abstract:
We consider a two-dimensional free harmonic oscillator where the initial position is fixed and the initial velocity can change direction. All possible orbits are ellipses and their enveloping curve is an ellipse too. We show that the locus of the foci of all elliptical orbits is a Cassini oval. Depending on the magnitude of the initial velocity we observe all three kinds of Cassini ovals, one of which is the lemniscate of Bernoulli. These Cassini ovals have the same foci as the enveloping ellipse.
Submitted 23 March, 2013;
originally announced March 2013.
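The Cassini-oval claim in the last abstract can be checked numerically. The sketch below is not the authors' derivation; it assumes unit angular frequency, an initial position at distance `a` along the real axis, and an initial speed `b` in direction `theta`, so the orbit in the complex plane is z(t) = a cos t + b e^{i theta} sin t. Writing this as A e^{it} + B e^{-it} places the foci at ±2√(AB), hence f² = a² + b² e^{2i theta} and |f - a| |f + a| = b², which is a Cassini oval with foci at ±a. The function names are hypothetical.

```python
import cmath
import math


def orbit_point(a, b, theta, t):
    """Position at time t for the assumed unit-frequency oscillator."""
    return a * math.cos(t) + b * cmath.exp(1j * theta) * math.sin(t)


def orbit_foci(a, b, theta):
    """Foci of the elliptical orbit z(t) = A e^{it} + B e^{-it}."""
    u = b * cmath.exp(1j * theta)   # initial velocity as a complex number
    A = (a - 1j * u) / 2            # coefficient of e^{it}
    B = (a + 1j * u) / 2            # coefficient of e^{-it}
    f = 2 * cmath.sqrt(A * B)       # foci sit at +f and -f
    return f, -f


def cassini_product(f, a):
    """Product of distances from f to the fixed points +a and -a."""
    return abs(f - a) * abs(f + a)


# Sweep the launch direction; the product should stay at b**2
# up to floating-point rounding.
a, b = 2.0, 1.5
products = []
for k in range(12):
    theta = 2 * math.pi * k / 12
    f1, _ = orbit_foci(a, b, theta)
    products.append(cassini_product(f1, a))
```

Under these assumptions the case b = a gives the product a², which is the lemniscate of Bernoulli mentioned in the abstract.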