-
RBF Weighted Hyper-Involution for RGB-D Object Detection
Authors:
Mehfuz A Rahman,
Jiju Peethambaran,
Neil London
Abstract:
A vast majority of conventional augmented reality devices are equipped with depth sensors. Depth images produced by such sensors contain complementary information for object detection when used with color images. Despite the benefits, it remains a complex task to simultaneously extract photometric and depth features in real time due to the immanent difference between depth and color images. Moreover, standard convolution operations are not sufficient to properly extract information directly from raw depth images, leading to intermediate representations of depth, which is inefficient. To address these issues, we propose a real-time, two-stream RGB-D object detection model. The proposed model consists of two new components: a depth-guided hyper-involution that adapts dynamically based on the spatial interaction pattern in the raw depth map and an up-sampling based trainable fusion layer that combines the extracted depth and color image features without blocking the information transfer between them. We show that the proposed model outperforms other RGB-D based object detection models on the NYU Depth v2 dataset and achieves comparable (second best) results on SUN RGB-D. Additionally, we introduce a new outdoor RGB-D object detection dataset where our proposed model outperforms other models. The performance evaluation on diverse synthetic data generated from CAD models and images shows the potential of the proposed model to be adapted to augmented reality based applications.
Submitted 30 September, 2023;
originally announced October 2023.
-
2D Points Curve Reconstruction Survey and Benchmark
Authors:
Stefan Ohrhallinger,
Jiju Peethambaran,
Amal D. Parakkat,
Tamal K. Dey,
Ramanathan Muthuganapathy
Abstract:
Curve reconstruction from unstructured points in a plane is a fundamental problem with many applications that has generated research interest for decades. Involved aspects like handling open, sharp, multiple and non-manifold outlines, run-time and provability as well as potential extension to 3D for surface reconstruction have led to many different algorithms. We survey the literature on 2D curve reconstruction and then present an open-sourced benchmark for the experimental study. Our unprecedented evaluation on a selected set of planar curve reconstruction algorithms aims to give an overview of both quantitative analysis and qualitative aspects for helping users to select the right algorithm for specific problems in the field. Our benchmark framework is available online to permit reproducing the results, and easy integration of new algorithms.
Submitted 17 March, 2021;
originally announced March 2021.
-
LSMAT Least Squares Medial Axis Transform
Authors:
Daniel Rebain,
Baptiste Angles,
Julien Valentin,
Nicholas Vining,
Jiju Peethambaran,
Shahram Izadi,
Andrea Tagliasacchi
Abstract:
The medial axis transform has applications in numerous fields including visualization, computer graphics, and computer vision. Unfortunately, traditional medial axis transformations are usually brittle in the presence of outliers, perturbations and/or noise along the boundary of objects. To overcome this limitation, we introduce a new formulation of the medial axis transform which is naturally robust in the presence of these artifacts. Unlike previous work which has approached the medial axis from a computational geometry angle, we consider it from a numerical optimization perspective. In this work, we follow the definition of the medial axis transform as "the set of maximally inscribed spheres". We show how this definition can be formulated as a least squares relaxation where the transform is obtained by minimizing a continuous optimization problem. The proposed approach is inherently parallelizable by performing independent optimization of each sphere using Gauss-Newton, and its least-squares form allows it to be significantly more robust compared to traditional computational geometry approaches. Extensive experiments on 2D and 3D objects demonstrate that our method provides superior results to the state of the art on both synthetic and real data.
Submitted 10 October, 2020;
originally announced October 2020.
-
Dynamic Edge Weights in Graph Neural Networks for 3D Object Detection
Authors:
Sumesh Thakur,
Jiju Peethambaran
Abstract:
A robust and accurate 3D detection system is an integral part of autonomous vehicles. Traditionally, a majority of 3D object detection algorithms focus on processing 3D point clouds using voxel grids or bird's eye view (BEV). Recent works, however, demonstrate the utilization of the graph neural network (GNN) as a promising approach to 3D object detection. In this work, we propose an attention based feature aggregation technique in GNN for detecting objects in LiDAR scans. We first employ a distance-aware down-sampling scheme that not only enhances the algorithmic performance but also retains maximum geometric features of objects even if they lie far from the sensor. In each layer of the GNN, apart from the linear transformation which maps the per node input features to the corresponding higher level features, a per node masked attention is also performed by assigning different weights to different nodes in its first ring neighborhood. The masked attention implicitly accounts for the underlying neighborhood graph structure of every node and also eliminates the need for costly matrix operations, thereby improving the detection accuracy without compromising the performance. The experiments on the KITTI dataset show that our method yields comparable results for 3D object detection.
Submitted 17 September, 2020;
originally announced September 2020.
-
TCM-ICP: Transformation Compatibility Measure for Registering Multiple LIDAR Scans
Authors:
Aby Thomas,
Adarsh Sunilkumar,
Shankar Shylesh,
Aby Abahai T.,
Subhasree Methirumangalath,
Dong Chen,
Jiju Peethambaran
Abstract:
Rigid registration of multi-view and multi-platform LiDAR scans is a fundamental problem in 3D mapping, robotic navigation, and large-scale urban modeling applications. Data acquisition with LiDAR sensors involves scanning multiple areas from different points of view, thus generating partially overlapping point clouds of the real world scenes. Traditionally, the ICP (Iterative Closest Point) algorithm is used to register the acquired point clouds together to form a unique point cloud that captures the scanned real world scene. Conventional ICP faces local minima issues and often needs a coarse initial alignment to converge to the optimum. In this work, we present an algorithm for registering multiple, overlapping LiDAR scans. We introduce a geometric metric called Transformation Compatibility Measure (TCM) which aids in choosing the most similar point clouds for registration in each iteration of the algorithm. The LiDAR scan most similar to the reference LiDAR scan is then transformed using the simplex technique. An optimization of the transformation using gradient descent and simulated annealing techniques is then applied to improve the resulting registration. We evaluate the proposed algorithm on four different real world scenes and experimental results show that the registration performance of the proposed method is comparable or superior to the traditionally used registration methods. Further, the algorithm achieves superior registration results even when dealing with outliers.
Submitted 31 January, 2020; v1 submitted 4 January, 2020;
originally announced January 2020.