Search | arXiv e-print repository

GP-S3Net: Graph-based Panoptic Sparse Semantic Segmentation Network

Authors: Ryan Razani, Ran Cheng, Enxu Li, Ehsan Taghavi, Yuan Ren, Liu Bingbing

Abstract: Panoptic segmentation, as an integrated task of both static environmental understanding and dynamic object identification, has recently begun to receive broad research interest. In this paper, we propose a new computationally efficient LiDAR-based panoptic segmentation framework, called GP-S3Net. GP-S3Net is a proposal-free approach in which no object proposals are needed to identify the objects, in contrast to conventional two-stage panoptic systems, where a detection network is incorporated for capturing instance information. Our new design consists of a novel instance-level network that processes the semantic results by constructing a graph convolutional network to identify objects (foreground), which are later fused with the background classes. Through the fine-grained clusters of the foreground objects from the semantic segmentation backbone, over-segmentation priors are generated and subsequently processed by 3D sparse convolution to embed each cluster. Each cluster is treated as a node in the graph and its corresponding embedding is used as its node feature. Then a GCNN predicts whether edges exist between each cluster pair. We utilize the instance label to generate ground truth edge labels for each constructed graph in order to supervise the learning. Extensive experiments demonstrate that GP-S3Net outperforms the current state-of-the-art approaches by a significant margin across available datasets such as nuScenes and SemanticPOSS, ranking first on the competitive public SemanticKITTI leaderboard upon publication.

Submitted 18 August, 2021; originally announced August 2021.
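
The edge supervision described in the GP-S3Net abstract above (cluster pairs labeled by whether they share an instance) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names `edge_labels` and `merge_clusters`, the input format (one instance label per cluster), and the union-find merging step are all hypothetical.

```python
from itertools import combinations


def edge_labels(cluster_instance_ids):
    """Ground-truth edge label for every cluster pair (i, j) with i < j.

    cluster_instance_ids[i] is the instance label assigned to cluster i
    (hypothetical input format). An edge is 1 when both clusters belong
    to the same instance, else 0.
    """
    return {
        (i, j): int(cluster_instance_ids[i] == cluster_instance_ids[j])
        for i, j in combinations(range(len(cluster_instance_ids)), 2)
    }


def merge_clusters(num_clusters, positive_edges):
    """Group clusters connected by positive edges using union-find.

    Returns one representative id per cluster; clusters sharing an id
    form one object. This merging step is an assumption for illustration.
    """
    parent = list(range(num_clusters))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in positive_edges:
        parent[find(i)] = find(j)
    return [find(i) for i in range(num_clusters)]


# Four over-segmented clusters: the first two come from instance 7,
# the last two from instance 9.
labels = edge_labels([7, 7, 9, 9])
positive = [pair for pair, y in labels.items() if y == 1]
groups = merge_clusters(4, positive)
```

In a trained system the positive edges would come from the network's predictions rather than from the ground-truth labels used here.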

arXiv:2103.08852

Lite-HDSeg: LiDAR Semantic Segmentation Using Lite Harmonic Dense Convolutions

Authors: Ryan Razani, Ran Cheng, Ehsan Taghavi, Liu Bingbing

Abstract: Autonomous driving vehicles and robotic systems rely on accurate perception of their surroundings. Scene understanding is one of the crucial components of perception modules. Among all available sensors, LiDARs are one of the essential sensing modalities of autonomous driving systems due to their active sensing nature with high resolution of sensor readings. Accurate and fast semantic segmentation methods are needed to fully utilize LiDAR sensors for scene understanding. In this paper, we present Lite-HDSeg, a novel real-time convolutional neural network for semantic segmentation of full 3D LiDAR point clouds. Lite-HDSeg achieves the best accuracy vs. computational complexity trade-off on the SemanticKITTI benchmark and is designed on the basis of a new encoder-decoder architecture with light-weight harmonic dense convolutions as its core. Moreover, we introduce ICM, an improved global contextual module to capture multi-scale contextual features, and MCSPN, a multi-class Spatial Propagation Network to further refine the semantic boundaries. Our experimental results show that the proposed method outperforms state-of-the-art semantic segmentation approaches that can run in real time, and is thus suitable for robotic and autonomous driving applications.

Submitted 16 March, 2021; originally announced March 2021.

arXiv:2102.04530

(AF)2-S3Net: Attentive Feature Fusion with Adaptive Feature Selection for Sparse Semantic Segmentation Network

Authors: Ran Cheng, Ryan Razani, Ehsan Taghavi, Enxu Li, Bingbing Liu

Abstract: Autonomous robotic systems and self-driving cars rely on accurate perception of their surroundings, as the safety of the passengers and pedestrians is the top priority. Semantic segmentation is one of the essential components of environmental perception that provides semantic information of the scene. Recently, several methods have been introduced for 3D LiDAR semantic segmentation. While they can lead to improved performance, they either suffer from high computational complexity, and are therefore inefficient, or lack fine details of smaller instances. To alleviate this problem, we propose AF2-S3Net, an end-to-end encoder-decoder CNN network for 3D LiDAR semantic segmentation. We present a novel multi-branch attentive feature fusion module in the encoder and a unique adaptive feature selection module with feature map re-weighting in the decoder. Our AF2-S3Net fuses voxel-based learning and point-based learning into a single framework to effectively process the large 3D scene. Our experimental results show that the proposed method outperforms the state-of-the-art approaches on the large-scale SemanticKITTI benchmark, ranking 1st on the competitive public leaderboard upon publication.

Submitted 8 February, 2021; originally announced February 2021.

Comments: 10 pages, 6 figures, 4 tables

arXiv:2008.10544

TORNADO-Net: mulTiview tOtal vaRiatioN semAntic segmentation with Diamond inceptiOn module

Authors: Martin Gerdzhev, Ryan Razani, Ehsan Taghavi, Bingbing Liu

Abstract: Semantic segmentation of point clouds is a key component of scene understanding for robotics and autonomous driving. In this paper, we introduce TORNADO-Net, a neural network for 3D LiDAR point cloud semantic segmentation. We incorporate multi-view (bird's-eye and range) projection feature extraction with an encoder-decoder ResNet architecture with a novel diamond context block. Current projection-based methods do not take into account that neighboring points usually belong to the same class. To better utilize this local neighbourhood information and reduce noisy predictions, we introduce a combination of Total Variation, Lovász-Softmax, and Weighted Cross-Entropy losses. We also take advantage of the fact that the LiDAR data encompasses a 360-degree field of view and use circular padding. We demonstrate state-of-the-art results on the SemanticKITTI dataset and also provide thorough quantitative evaluations and ablation results.

Submitted 24 August, 2020; originally announced August 2020.

Comments: 10 pages, 4 figures
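
Two ideas from the TORNADO-Net abstract above, circular padding for a 360-degree scan and a total-variation penalty on neighbouring values, can be illustrated with a small sketch. It assumes a range image stored as a list of rows whose columns span the full azimuth, and it uses a generic anisotropic total variation; the paper's exact loss formulation is not given in the abstract, so the names `circular_pad` and `total_variation` and all details here are hypothetical.

```python
def circular_pad(image, pad):
    """Pad each row on both sides by wrapping around the azimuth axis.

    Assumes 0 <= pad <= row width.
    """
    padded = []
    for row in image:
        width = len(row)
        padded.append(row[width - pad:] + row + row[:pad])
    return padded


def total_variation(image):
    """Anisotropic total variation: sum of absolute neighbour differences.

    Horizontal neighbours wrap around (the last column is adjacent to the
    first, matching a 360-degree field of view); vertical neighbours do not.
    """
    height, width = len(image), len(image[0])
    tv = 0.0
    for r in range(height):
        for c in range(width):
            tv += abs(image[r][c] - image[r][(c + 1) % width])
            if r + 1 < height:
                tv += abs(image[r][c] - image[r + 1][c])
    return tv


# A tiny 2 x 3 "range image" with one differing column.
scan = [[0.0, 0.0, 1.0],
        [0.0, 0.0, 1.0]]
padded = circular_pad(scan, 1)
smoothness_penalty = total_variation(scan)
```

A constant image gives a penalty of zero, so minimizing this term favours predictions in which neighbouring cells agree.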

arXiv:1904.08506

Adaptive Hierarchical Down-Sampling for Point Cloud Classification

Authors: Ehsan Nezhadarya, Ehsan Taghavi, Ryan Razani, Bingbing Liu, Jun Luo

Abstract: While several convolution-like operators have recently been proposed for extracting features out of point clouds, down-sampling an unordered point cloud in a deep neural network has not been rigorously studied. Existing methods down-sample the points regardless of their importance for the output. As a result, some important points in the point cloud may be removed, while less valuable points may be passed to the next layers. In contrast, adaptive down-sampling methods sample the points by taking into account the importance of each point, which varies based on the application, task and training data. In this paper, we propose a permutation-invariant learning-based adaptive down-sampling layer, called Critical Points Layer (CPL), which reduces the number of points in an unordered point cloud while retaining the important points. Unlike most graph-based point cloud down-sampling methods that use a $k$-NN search algorithm to find the neighbouring points, CPL is a global down-sampling method, rendering it computationally very efficient. The proposed layer can be used along with any graph-based point cloud convolution layer to form a convolutional neural network, dubbed CP-Net in this paper. We introduce a CP-Net for 3D object classification that achieves the best accuracy for the ModelNet40 dataset among point cloud-based methods, which validates the effectiveness of the CPL.

Submitted 22 May, 2020; v1 submitted 11 April, 2019; originally announced April 2019.

Journal ref: 2020 Conference on Computer Vision and Pattern Recognition
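
The abstract above describes importance-based global down-sampling without spelling out the selection rule. The sketch below shows one plausible reading, under the assumption that a point is "important" when it holds the maximum of at least one feature channel; the function name `critical_points`, the ranking by number of channels won, and the input layout are hypothetical and not taken from the paper.

```python
def critical_points(features, k):
    """Pick up to k point indices, ranked by how many channels they win.

    features: list of N points, each a list of C channel values.
    A point "wins" a channel when it holds that channel's maximum
    (first occurrence on ties). No neighbour search is needed, so the
    selection is global rather than k-NN based.
    """
    num_points = len(features)
    num_channels = len(features[0])
    wins = {}
    for c in range(num_channels):
        best = max(range(num_points), key=lambda i: features[i][c])
        wins[best] = wins.get(best, 0) + 1
    # Most wins first; break ties by original index for determinism.
    ranked = sorted(wins, key=lambda i: (-wins[i], i))
    return ranked[:k]


# Four points with three feature channels each.
feats = [[1, 0, 0],
         [0, 5, 4],
         [0, 1, 0],
         [2, 0, 0]]
kept = critical_points(feats, 2)
```

Here point 1 wins two channels and point 3 wins one, so those two are kept while points 0 and 2, which never hold a channel maximum, are dropped.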

arXiv:1607.01355

Object Recognition and Identification Using ESM Data

Authors: E. Taghavi, D. Song, R. Tharmarasa, T. Kirubarajan, Anne-Claire Boury-Brisset, Bhashyam Balaji

Abstract: Recognition and identification of unknown targets is a crucial task in surveillance and security systems. Electronic Support Measures (ESM) are one of the most effective sensors for identification, especially for maritime and air-to-ground applications. In typical surveillance systems, multiple ESM sensors are usually deployed along with kinematic sensors like radar. Different ESM sensors may produce different types of reports ready to be sent to the fusion center. The focus of this paper is to develop a new architecture for target recognition and identification when non-homogeneous ESM and possibly kinematic reports are received at the fusion center. The new fusion architecture is evaluated using simulations to show the benefit of utilizing different ESM reports such as attributes and signal-level ESM data.

Submitted 22 March, 2016; originally announced July 2016.

arXiv:1603.03450

Multisensor--Multitarget Bearing--Only Sensor Registration

Authors: Ehsan Taghavi, R. Tharmarasa, T. Kirubarajan, Mike McDonald

Abstract: Bearing-only estimation is one of the fundamental and challenging problems in target tracking. As in the case of radar tracking, the presence of offset or position biases can exacerbate the challenges in bearing-only estimation. Modeling various sensor biases is not a trivial task and not much has been done in the literature specifically for bearing-only tracking. This paper addresses the modeling of offset biases in bearing-only sensors and the ensuing multitarget tracking with bias compensation. Bias estimation is handled at the fusion node to which individual sensors report their local tracks in the form of associated measurement reports (AMR) or angle-only tracks. The modeling is based on a multisensor approach that can effectively handle a time-varying number of targets in the surveillance region. The proposed algorithm leads to a maximum likelihood bias estimator. The corresponding Cramér-Rao lower bound, which quantifies the theoretical accuracy that can be achieved by the proposed method or any other algorithm, is also derived. Finally, simulation results on different distributed tracking scenarios are presented to demonstrate the capabilities of the proposed approach. In order to show that the proposed method can work even with false alarms and missed detections, simulation results on a centralized tracking scenario where the local sensors send all their measurements (not AMRs or local tracks) are also presented.

Submitted 9 March, 2016; originally announced March 2016.

Journal ref: IEEE Transactions on Aerospace and Electronic Systems, 52 (4), 2016

arXiv:1603.03449

doi 10.1109/TAES.2015.140574

A Practical Bias Estimation Algorithm for Multisensor--Multitarget Tracking

Authors: Ehsan Taghavi, R. Tharmarasa, T. Kirubarajan, Yaakov Bar-Shalom, Mike McDonald

Abstract: Bias estimation or sensor registration is an essential step in ensuring the accuracy of global tracks in multisensor-multitarget tracking. Most previously proposed algorithms for bias estimation rely on local measurements in centralized systems or tracks in distributed systems, along with additional information like covariances, filter gains or targets of opportunity. In addition, it is generally assumed that such data are made available to the fusion center at every sampling time. In practical distributed multisensor tracking systems, where each platform sends local tracks to the fusion center, only state estimates and, perhaps, their covariances are sent to the fusion center at non-consecutive sampling instants or scans. That is, not all the information required for exact bias estimation at the fusion center is available in practical distributed tracking systems. In this paper, a new algorithm that is capable of accurately estimating the biases even in the absence of filter gain information from local platforms is proposed for distributed tracking systems with intermittent track transmission. Through the calculation of the posterior Cramér-Rao lower bound and various simulation results, it is shown that the performance of the new algorithm, which uses the tracklet idea and does not require track transmission at every sampling time or exchange of filter gains, can approach the performance of the exact bias estimation algorithm that requires local filter gains.

Submitted 9 March, 2016; originally announced March 2016.

Journal ref: IEEE Transactions on Aerospace and Electronic Systems, 52 (1), 2016

Showing 1–8 of 8 results for author: Taghavi, E