Search | arXiv e-print repository

LetsMap: Unsupervised Representation Learning for Semantic BEV Mapping

Authors: Nikhil Gosala, Kürsat Petek, B Ravi Kiran, Senthil Yogamani, Paulo Drews-Jr, Wolfram Burgard, Abhinav Valada

Abstract: Semantic Bird's Eye View (BEV) maps offer a rich representation with strong occlusion reasoning for various decision making tasks in autonomous driving. However, most BEV mapping approaches employ a fully supervised learning paradigm that relies on large amounts of human-annotated BEV ground truth data. In this work, we address this limitation by proposing the first unsupervised representation learning approach to generate semantic BEV maps from a monocular frontal view (FV) image in a label-efficient manner. Our approach pretrains the network to independently reason about scene geometry and scene semantics using two disjoint neural pathways in an unsupervised manner and then finetunes it for the task of semantic BEV mapping using only a small fraction of labels in the BEV. We achieve label-free pretraining by exploiting spatial and temporal consistency of FV images to learn scene geometry while relying on a novel temporal masked autoencoder formulation to encode the scene representation. Extensive evaluations on the KITTI-360 and nuScenes datasets demonstrate that our approach performs on par with the existing state-of-the-art approaches while using only 1% of BEV labels and no additional labeled data.

Submitted 29 May, 2024; originally announced May 2024.

Comments: 23 pages, 5 figures

arXiv:2403.11761 [pdf, other]

BEVCar: Camera-Radar Fusion for BEV Map and Object Segmentation

Authors: Jonas Schramm, Niclas Vödisch, Kürsat Petek, B Ravi Kiran, Senthil Yogamani, Wolfram Burgard, Abhinav Valada

Abstract: Semantic scene segmentation from a bird's-eye-view (BEV) perspective plays a crucial role in facilitating planning and decision-making for mobile robots. Although recent vision-only methods have demonstrated notable advancements in performance, they often struggle under adverse illumination conditions such as rain or nighttime. While active sensors offer a solution to this challenge, the prohibitively high cost of LiDARs remains a limiting factor. Fusing camera data with automotive radars poses a more inexpensive alternative but has received less attention in prior research. In this work, we aim to advance this promising avenue by introducing BEVCar, a novel approach for joint BEV object and map segmentation. The core novelty of our approach lies in first learning a point-based encoding of raw radar data, which is then leveraged to efficiently initialize the lifting of image features into the BEV space. We perform extensive experiments on the nuScenes dataset and demonstrate that BEVCar outperforms the current state of the art. Moreover, we show that incorporating radar information significantly enhances robustness in challenging environmental conditions and improves segmentation performance for distant objects. To foster future research, we provide the weather split of the nuScenes dataset used in our experiments, along with our code and trained models at http://bevcar.cs.uni-freiburg.de.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2402.10776 [pdf]

doi 10.1109/ACCESS.2019.2904788

In-Vivo Hyperspectral Human Brain Image Database for Brain Cancer Detection

Authors: H. Fabelo, S. Ortega, A. Szolna, D. Bulters, J. F. Pineiro, S. Kabwama, A. Shanahan, H. Bulstrode, S. Bisshopp, B. R. Kiran, D. Ravi, R. Lazcano, D. Madronal, C. Sosa, C. Espino, M. Marquez, M. De la Luz Plaza, R. Camacho, D. Carrera, M. Hernandez, G. M. Callico, J. Morera, B. Stanciulescu, G. Z. Yang, R. Salvador, et al. (3 additional authors not shown)

Abstract: The use of hyperspectral imaging for medical applications is becoming more common in recent years. One of the main obstacles that researchers find when developing hyperspectral algorithms for medical applications is the lack of specific, publicly available, and hyperspectral medical data. The work described in this paper was developed within the framework of the European project HELICoiD (HypErspectraL Imaging Cancer Detection), which had as a main goal the application of hyperspectral imaging to the delineation of brain tumors in real-time during neurosurgical operations. In this paper, the methodology followed to generate the first hyperspectral database of in-vivo human brain tissues is presented. Data was acquired employing a customized hyperspectral acquisition system capable of capturing information in the Visual and Near InfraRed (VNIR) range from 400 to 1000 nm. Repeatability was assessed for the cases where two images of the same scene were captured consecutively. The analysis reveals that the system works more efficiently in the spectral range between 450 and 900 nm. A total of 36 hyperspectral images from 22 different patients were obtained. From these data, more than 300 000 spectral signatures were labeled employing a semi-automatic methodology based on the spectral angle mapper algorithm. Four different classes were defined: normal tissue, tumor tissue, blood vessel, and background elements. All the hyperspectral data has been made available in a public repository.

Submitted 16 February, 2024; originally announced February 2024.

Comments: 19 pages, 12 figures

Journal ref: IEEE Access, 2019, 7, pp. 39098-39116
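
The abstract above mentions labeling based on the spectral angle mapper algorithm. As a rough illustration only, the sketch below computes the spectral angle between a pixel spectrum and a reference spectrum and picks the closest reference. The function names, the toy three-band spectra, and the nearest-reference rule are assumptions made for this example; the paper's semi-automatic labeling procedure is not reproduced here.

```python
import math


def spectral_angle(pixel, reference):
    """Angle in radians between two spectra treated as vectors."""
    dot = sum(p * r for p, r in zip(pixel, reference))
    norm_p = math.sqrt(sum(p * p for p in pixel))
    norm_r = math.sqrt(sum(r * r for r in reference))
    if norm_p == 0.0 or norm_r == 0.0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] to guard against floating-point overshoot before acos.
    cos_angle = max(-1.0, min(1.0, dot / (norm_p * norm_r)))
    return math.acos(cos_angle)


def classify(pixel, references):
    """Return the label whose reference spectrum has the smallest angle to the pixel."""
    return min(references, key=lambda label: spectral_angle(pixel, references[label]))


# Hypothetical three-band reference spectra, for illustration only.
references = {
    "tissue_a": [0.9, 0.5, 0.1],
    "tissue_b": [0.1, 0.5, 0.9],
}
label = classify([0.8, 0.6, 0.2], references)
```

Because the angle depends only on direction, scaling a spectrum by a constant (for example, a change in illumination intensity) leaves it unchanged, which is the usual reason given for using this measure.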

arXiv:2402.07192 [pdf]

doi 10.1371/journal.pone.0193721

Spatio-spectral classification of hyperspectral images for brain cancer detection during surgical operations

Authors: H. Fabelo, S. Ortega, D. Ravi, B. R. Kiran, C. Sosa, D. Bulters, G. M. Callico, H. Bulstrode, A. Szolna, J. F. Pineiro, S. Kabwama, D. Madronal, R. Lazcano, A. J. OShanahan, S. Bisshopp, M. Hernandez, A. Baez-Quevedo, G. Z. Yang, B. Stanciulescu, R. Salvador, E. Juarez, R. Sarmiento

Abstract: Surgery for brain cancer is a major problem in neurosurgery. The diffuse infiltration into the surrounding normal brain by these tumors makes their accurate identification by the naked eye difficult. Since surgery is the common treatment for brain cancer, an accurate radical resection of the tumor leads to improved survival rates for patients. However, the identification of the tumor boundaries during surgery is challenging. Hyperspectral imaging is a noncontact, non-ionizing and non-invasive technique suitable for medical diagnosis. This study presents the development of a novel classification method taking into account the spatial and spectral characteristics of the hyperspectral images to help neurosurgeons to accurately determine the tumor boundaries in surgical-time during the resection, avoiding excessive excision of normal tissue or unintentionally leaving residual tumor. The algorithm proposed in this study to approach an efficient solution consists of a hybrid framework that combines both supervised and unsupervised machine learning methods. To evaluate the proposed approach, five hyperspectral images of surface of the brain affected by glioblastoma tumor in vivo from five different patients have been used. The final classification maps obtained have been analyzed and validated by specialists. These preliminary results are promising, obtaining an accurate delineation of the tumor area.

Submitted 11 February, 2024; originally announced February 2024.

arXiv:2305.04614 [pdf, other]

Reducing Onboard Processing Time for Path Planning in Dynamically Evolving Polygonal Maps

Authors: Aditya Shirwatkar, Aman Singh, Jana Ravi Kiran

Abstract: Autonomous agents face the challenge of coordinating multiple tasks (perception, motion planning, controller) which are computationally expensive on a single onboard computer. To utilize the onboard processing capacity optimally, it is imperative to arrive at computationally efficient algorithms for global path planning. In this work, it is attempted to reduce the processing time for global path planning in dynamically evolving polygonal maps. In dynamic environments, maps may not remain valid for long. Hence it is of utmost importance to obtain the shortest path quickly in an ever-changing environment. To address this, an existing rapid path-finding algorithm, the Minimal Construct was used. This algorithm discovers only a necessary portion of the Visibility Graph around obstacles and computes collision tests only for lines that seem heuristically promising. Simulations show that this algorithm finds shortest paths faster than traditional grid-based A* searches in most cases, resulting in smoother and shorter paths even in dynamic environments.

Submitted 8 May, 2023; originally announced May 2023.

Comments: Course Project Report for CP:230 Motion Planning and Autonomous Navigation by Dr. Debasish Ghose

arXiv:2302.10679 [pdf, other]

Evaluating the effect of data augmentation and BALD heuristics on distillation of Semantic-KITTI dataset

Authors: Anh Duong, Alexandre Almin, Léo Lemarié, B Ravi Kiran

Abstract: Active Learning (AL) has remained relatively unexplored for LiDAR perception tasks in autonomous driving datasets. In this study we evaluate Bayesian active learning methods applied to the task of dataset distillation or core subset selection (subset with near equivalent performance as full dataset). We also study the effect of application of data augmentation (DA) within Bayesian AL based dataset distillation. We perform these experiments on the full Semantic-KITTI dataset. We extend our study over our existing work only on 1/4th of the same dataset. Addition of DA and BALD have a negative impact over the labeling efficiency and thus the capacity to distill datasets. We demonstrate key issues in designing a functional AL framework and finally conclude with a review of challenges in real world active learning.

Submitted 21 February, 2023; originally announced February 2023.

Comments: Submitted to VISAPP Springer book extension. arXiv admin note: substantial text overlap with arXiv:2202.02661
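
The abstract above refers to the BALD heuristic. For orientation, here is a minimal sketch of the standard BALD acquisition score (Bayesian Active Learning by Disagreement): the entropy of the mean prediction minus the mean entropy of individual stochastic predictions. It assumes the stochastic predictions come from something like Monte Carlo dropout passes; the function names and toy inputs are illustrative and are not taken from the paper's code.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_score(mc_probs):
    """BALD score for one sample.

    mc_probs is a list of class-probability vectors, one per stochastic
    forward pass (assumed here to come from e.g. MC dropout).
    """
    n = len(mc_probs)
    num_classes = len(mc_probs[0])
    mean_probs = [sum(s[c] for s in mc_probs) / n for c in range(num_classes)]
    expected_entropy = sum(entropy(s) for s in mc_probs) / n
    # High when passes are individually confident but disagree with each other.
    return entropy(mean_probs) - expected_entropy


# Passes that agree give a score near zero; confident disagreement gives a high score.
agree = bald_score([[0.5, 0.5], [0.5, 0.5]])
disagree = bald_score([[1.0, 0.0], [0.0, 1.0]])
```

In an active learning loop, samples with the highest score would typically be sent for annotation first.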

arXiv:2302.08292 [pdf, other]

Navya3DSeg -- Navya 3D Semantic Segmentation Dataset & split generation for autonomous vehicles

Authors: Alexandre Almin, Léo Lemarié, Anh Duong, B Ravi Kiran

Abstract: Autonomous driving (AD) perception today relies heavily on deep learning based architectures requiring large scale annotated datasets with their associated costs for curation and annotation. The 3D semantic data are useful for core perception tasks such as obstacle detection and ego-vehicle localization. We propose a new dataset, Navya 3D Segmentation (Navya3DSeg), with a diverse label space corresponding to a large scale production grade operational domain, including rural, urban, industrial sites and universities from 13 countries. It contains 23 labeled sequences and 25 supplementary sequences without labels, designed to explore self-supervised and semi-supervised semantic segmentation benchmarks on point clouds. We also propose a novel method for sequential dataset split generation based on iterative multi-label stratification, and demonstrated to achieve a +1.2% mIoU improvement over the original split proposed by SemanticKITTI dataset. A complete benchmark for semantic segmentation task was performed, with state of the art methods. Finally, we demonstrate an Active Learning (AL) based dataset distillation framework. We introduce a novel heuristic-free sampling method called ego-pose distance based sampling in the context of AL. A detailed presentation on the dataset is available here https://www.youtube.com/watch?v=5m6ALIs-s20.

Submitted 20 July, 2023; v1 submitted 16 February, 2023; originally announced February 2023.

Comments: Accepted version to IEEE RA-L. Version with supplementary materials

arXiv:2211.00067 [pdf]

COVID-19 Infection Exposure to Customers Shopping during Black Friday

Authors: Braxton Rolle, Ravi Kiran

Abstract: The outbreak of COVID-19 within the last two years has resulted in much further investigation into the safety of large events that involve a gathering of people. This study aims to investigate how COVID-19 can spread through a large crowd of people shopping in a store with no safety precautions taken. The event being investigated is Black Friday, where hundreds or thousands of customers flood stores to hopefully receive the best deals on popular items. A mock store was created, separated into several different shopping sections, and represented using a 2-D grid where each square on the grid represented a 5 feet by 5 feet area of the mock store. Customers were simulated to enter the store, shop for certain items, check out, and then leave the store. A percentage of customers were chosen to be infective when they entered the store, which means that they could spread infection quantum to other customers. Four hours of time was simulated with around 6,000 customers being included. The maximum distance exposure could be spread (2 feet-10 feet), the minimum time of exposure needed to become infected (2 - 15 minutes), and the total percentage of customers who started as infective (1% - 5%) were all changed and their effects on the number of newly infected customers were measured. It was found that increasing the maximum exposure distance by 2 feet resulted in between a 20% to 250% increase in newly infected customers, depending on the distances being used. It was also found that increasing the percentage of customers who started as infective from 1% to 2% and then to 5% resulted in a 200% to 300% increase in newly infected customers.

Submitted 22 October, 2022; originally announced November 2022.

Comments: 22 pages, 11 tables, and 8 figures

MSC Class: 68U35; ACM Class: J.3.2

arXiv:2206.12738 [pdf, other]

Self-Supervised 3D Monocular Object Detection by Recycling Bounding Boxes

Authors: Sugirtha T, Sridevi M, Khailash Santhakumar, Hao Liu, B Ravi Kiran, Thomas Gauthier, Senthil Yogamani

Abstract: Modern object detection architectures are moving towards employing self-supervised learning (SSL) to improve detection performance with related pretext tasks. Pretext tasks for monocular 3D object detection have not yet been explored in the literature. The paper studies the application of established self-supervised bounding box recycling by labeling random windows as the pretext task. The classifier head of the 3D detector is trained to classify random windows containing different proportions of the ground truth objects, thus handling the foreground-background imbalance. We evaluate the pretext task using the RTM3D detection model as baseline, with and without the application of data augmentation. We demonstrate improvements of between 2-3 % in mAP 3D and 0.9-1.5 % BEV scores using SSL over the baseline scores. We propose the inverse class frequency re-weighted (ICFW) mAP score that highlights improvements in detection for low frequency classes in a class imbalanced dataset with long tails. We demonstrate improvements in ICFW both mAP 3D and BEV scores to take into account the class imbalance in the KITTI validation dataset. We see 4-5 % increase in ICFW metric with the pretext task.

Submitted 25 June, 2022; originally announced June 2022.

Comments: Published at ICCVW-SSLAD 2021. arXiv admin note: substantial text overlap with arXiv:2104.10786

arXiv:2202.02666 [pdf, other]

Simulation-to-Reality domain adaptation for offline 3D object annotation on pointclouds with correlation alignment

Authors: Weishuang Zhang, B Ravi Kiran, Thomas Gauthier, Yanis Mazouz, Theo Steger

Abstract: Annotating objects with 3D bounding boxes in LiDAR pointclouds is a costly human driven process in an autonomous driving perception system. In this paper, we present a method to semi-automatically annotate real-world pointclouds collected by deployment vehicles using simulated data. We train a 3D object detector model on labeled simulated data from CARLA jointly with real world pointclouds from our target vehicle. The supervised object detection loss is augmented with a CORAL loss term to reduce the distance between labeled simulated and unlabeled real pointcloud feature representations. The goal here is to learn representations that are invariant to simulated (labeled) and real-world (unlabeled) target domains. We also provide an updated survey on domain adaptation methods for pointclouds.

Submitted 26 February, 2022; v1 submitted 5 February, 2022; originally announced February 2022.

Comments: Accepted at IMPROVE 2022
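
The abstract above describes adding a CORAL loss term to align simulated and real feature representations. As a hedged sketch, the code below follows the commonly used correlation alignment formulation: the squared Frobenius distance between the source and target feature covariance matrices, scaled by 1/(4d^2) for feature dimension d. The function names and toy feature batches are assumptions for illustration; how the paper weights or attaches this term to the detector is not shown in the listing.

```python
def covariance(features):
    """Unbiased sample covariance matrix of a batch of feature vectors (rows)."""
    n = len(features)
    d = len(features[0])
    means = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in features]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def coral_loss(source, target):
    """Squared Frobenius distance between covariances, scaled by 1 / (4 d^2)."""
    d = len(source[0])
    cov_s = covariance(source)
    cov_t = covariance(target)
    squared_diff = sum(
        (cov_s[i][j] - cov_t[i][j]) ** 2 for i in range(d) for j in range(d)
    )
    return squared_diff / (4 * d * d)


# Hypothetical 2-D feature batches standing in for simulated and real features.
simulated = [[0.0, 0.0], [2.0, 0.0]]
real = [[0.0, 0.0], [0.0, 2.0]]
loss = coral_loss(simulated, real)
```

The term needs no labels on the target side, which matches the setting in the abstract where the real pointclouds are unlabeled.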

arXiv:2202.02661 [pdf, other]

LiDAR dataset distillation within bayesian active learning framework: Understanding the effect of data augmentation

Authors: Ngoc Phuong Anh Duong, Alexandre Almin, Léo Lemarié, B Ravi Kiran

Abstract: Autonomous driving (AD) datasets have progressively grown in size in the past few years to enable better deep representation learning. Active learning (AL) has re-gained attention recently to address reduction of annotation costs and dataset size. AL has remained relatively unexplored for AD datasets, especially on point cloud data from LiDARs. This paper performs a principled evaluation of AL based dataset distillation on (1/4th) of the large Semantic-KITTI dataset. Further on, the gains in model performance due to data augmentation (DA) are demonstrated across different subsets of the AL loop. We also demonstrate how DA improves the selection of informative samples to annotate. We observe that data augmentation achieves full dataset accuracy using only 60% of samples from the selected dataset configuration. This provides faster training time and subsequent gains in annotation costs.

Submitted 5 February, 2022; originally announced February 2022.

Comments: Accepted at VISAPP 2022

arXiv:2104.10786 [pdf, other]

Exploring 2D Data Augmentation for 3D Monocular Object Detection

Authors: Sugirtha T, Sridevi M, Khailash Santhakumar, B Ravi Kiran, Thomas Gauthier, Senthil Yogamani

Abstract: Data augmentation is a key component of CNN based image recognition tasks like object detection. However, it is relatively less explored for 3D object detection. Many standard 2D object detection data augmentation techniques do not extend to 3D box. Extension of these data augmentations for 3D object detection requires adaptation of the 3D geometry of the input scene and synthesis of new viewpoints. This requires accurate depth information of the scene which may not be always available. In this paper, we evaluate existing 2D data augmentations and propose two novel augmentations for monocular 3D detection without a requirement for novel view synthesis. We evaluate these augmentations on the RTM3D detection model firstly due to the shorter training times. We obtain a consistent improvement by 4% in the 3D AP (@IoU=0.7) for cars, ~1.8% scores 3D AP (@IoU=0.25) for pedestrians & cyclists, over the baseline on KITTI car detection dataset. We also demonstrate a rigorous evaluation of the mAP scores by re-weighting them to take into account the class imbalance in the KITTI validation dataset.

Submitted 21 April, 2021; originally announced April 2021.

arXiv:2009.04916 [pdf, other]

GoCoronaGo: Privacy Respecting Contact Tracing for COVID-19 Management

Authors: Yogesh Simmhan, Tarun Rambha, Aakash Khochare, Shriram Ramesh, Animesh Baranawal, John Varghese George, Rahul Atul Bhope, Amrita Namtirtha, Amritha Sundararajan, Sharath Suresh Bhargav, Nihar Thakkar, Raj Kiran

Abstract: The COVID-19 pandemic is imposing enormous global challenges in managing the spread of the virus. A key pillar to mitigation is contact tracing, which complements testing and isolation. Digital apps for contact tracing using Bluetooth technology available in smartphones have gained prevalence globally. In this article, we discuss various capabilities of such digital contact tracing, and its implication on community safety and individual privacy, among others. We further describe the GoCoronaGo institutional contact tracing app that we have developed, and the conscious and sometimes contrarian design choices we have made. We offer a detailed overview of the app, backend platform and analytics, and our early experiences with deploying the app to over 1000 users within the Indian Institute of Science campus in Bangalore. We also highlight research opportunities and open challenges for digital contact tracing and analytics over temporal networks constructed from them.

Submitted 10 September, 2020; originally announced September 2020.

Comments: Pre-print of article to appear in the Journal of the Indian Institute of Science

arXiv:2005.13102 [pdf, other]

Road Segmentation on low resolution Lidar point clouds for autonomous vehicles

Authors: Leonardo Gigli, B Ravi Kiran, Thomas Paul, Andres Serna, Nagarjuna Vemuri, Beatriz Marcotegui, Santiago Velasco-Forero

Abstract: Point cloud datasets for perception tasks in the context of autonomous driving often rely on high resolution 64-layer Light Detection and Ranging (LIDAR) scanners. They are expensive to deploy on real-world autonomous driving sensor architectures which usually employ 16/32 layer LIDARs. We evaluate the effect of subsampling image based representations of dense point clouds on the accuracy of the road segmentation task. In our experiments the low resolution 16/32 layer LIDAR point clouds are simulated by subsampling the original 64 layer data, for subsequent transformation into a feature map in the Bird-Eye-View (BEV) and SphericalView (SV) representations of the point cloud. We introduce the usage of the local normal vector with the LIDAR's spherical coordinates as an input channel to existing LoDNN architectures. We demonstrate that this local normal feature in conjunction with classical features not only improves performance for binary road segmentation on full resolution point clouds, but it also reduces the negative impact on the accuracy when subsampling dense point clouds as compared to the usage of classical features alone. We assess our method with several experiments on two datasets: KITTI Road-segmentation benchmark and the recently released Semantic KITTI dataset.

Submitted 26 May, 2020; originally announced May 2020.

Comments: ISPRS 2020

arXiv:2002.00444 [pdf, other]

Deep Reinforcement Learning for Autonomous Driving: A Survey

Authors: B Ravi Kiran, Ibrahim Sobh, Victor Talpaert, Patrick Mannion, Ahmad A. Al Sallab, Senthil Yogamani, Patrick Pérez

Abstract: With the development of deep representation learning, the domain of reinforcement learning (RL) has become a powerful learning framework now capable of learning complex policies in high dimensional environments. This review summarises deep reinforcement learning (DRL) algorithms and provides a taxonomy of automated driving tasks where (D)RL methods have been employed, while addressing key computational challenges in real world deployment of autonomous driving agents. It also delineates adjacent domains such as behavior cloning, imitation learning, inverse reinforcement learning that are related but are not classical RL algorithms. The role of simulators in training agents, methods to validate, test and robustify existing solutions in RL are discussed.

Submitted 23 January, 2021; v1 submitted 2 February, 2020; originally announced February 2020.

Comments: Accepted for publication at IEEE Transactions on Intelligent Transportation Systems

arXiv:1901.01536 [pdf, other]

Exploring applications of deep reinforcement learning for real-world autonomous driving systems

Authors: Victor Talpaert, Ibrahim Sobh, B Ravi Kiran, Patrick Mannion, Senthil Yogamani, Ahmad El-Sallab, Patrick Perez

Abstract: Deep Reinforcement Learning (DRL) has become increasingly powerful in recent years, with notable achievements such as Deepmind's AlphaGo. It has been successfully deployed in commercial vehicles like Mobileye's path planning system. However, a vast majority of work on DRL is focused on toy examples in controlled synthetic car simulator environments such as TORCS and CARLA. In general, DRL is still in its infancy in terms of usability in real-world applications. Our goal in this paper is to encourage real-world deployment of DRL in various autonomous driving (AD) applications. We first provide an overview of the tasks in autonomous driving systems, reinforcement learning algorithms and applications of DRL to AD systems. We then discuss the challenges which must be addressed to enable further progress towards real-world deployment.

Submitted 16 January, 2019; v1 submitted 6 January, 2019; originally announced January 2019.

Comments: Accepted for Oral Presentation at VISAPP 2019

arXiv:1811.12507 [pdf, other]

Regression and Classification by Zonal Kriging

Authors: Jean Serra, Jesus Angulo, B Ravi Kiran

Abstract: Consider a family $Z=\{(\boldsymbol{x_{i}},y_{i}),\ 1\leq i\leq N\}$ of $N$ pairs of vectors $\boldsymbol{x_{i}} \in \mathbb{R}^d$ and scalars $y_{i}$ that we aim to predict for a new sample vector $\boldsymbol{x_{0}}$. Kriging models $y$ as a sum of a deterministic function $m$, a drift which depends on the point $\boldsymbol{x}$, and a random function $z$ with zero mean. The zonality hypothesis interprets $y$ as a weighted sum of $d$ random functions of a single independent variable, each of which is a kriging, with a quadratic form for the variograms drift. We can therefore construct an unbiased estimator $y^{*}(\boldsymbol{x_{0}})=\sum_{i}\lambda^{i}z(\boldsymbol{x_{i}})$ of $y(\boldsymbol{x_{0}})$ with minimal variance $E[y^{*}(\boldsymbol{x_{0}})-y(\boldsymbol{x_{0}})]^{2}$, with the help of the known training set points. We give the explicitly closed form for $\lambda^{i}$ without having calculated the inverse of the matrices.

Submitted 11 December, 2018; v1 submitted 29 November, 2018; originally announced November 2018.

Comments: Technical Report

arXiv:1809.11036 [pdf, other]

Real-time Dynamic Object Detection for Autonomous Driving using Prior 3D-Maps

Authors: B Ravi Kiran, Luis Roldão, Benat Irastorza, Renzo Verastegui, Sebastian Suss, Senthil Yogamani, Victor Talpaert, Alexandre Lepoutre, Guillaume Trehard

Abstract: Lidar has become an essential sensor for autonomous driving as it provides reliable depth estimation. Lidar is also the primary sensor used in building 3D maps which can be used even in the case of low-cost systems which do not use Lidar. Computation on Lidar point clouds is intensive as it requires processing of millions of points per second. Additionally, there are many subsequent tasks such as clustering, detection, tracking and classification which make real-time execution challenging. In this paper, we discuss real-time dynamic object detection algorithms which leverage previously mapped Lidar point clouds to reduce processing. The prior 3D maps provide a static background model and we formulate dynamic object detection as a background subtraction problem. Computation and modeling challenges in the mapping and online execution pipeline are described. We propose a rejection cascade architecture to subtract road regions and other 3D regions separately. We implemented an initial version of our proposed algorithm and evaluated the accuracy on the CARLA simulator.

Submitted 5 July, 2019; v1 submitted 28 September, 2018; originally announced September 2018.

Comments: Preprint Submission to ECCVW AutoNUE 2018 - v2 author name accent correction

arXiv:1806.04456 [pdf]

Impersonation: Modeling Persona in Smart Responses to Email

Authors: Rajeev Gupta, Ranganath Kondapally, Chakrapani Ravi Kiran

Abstract: In this paper, we present the design, implementation, and effectiveness of generating personalized suggestions for email replies. To personalize email responses based on a user's style and personality, we model the user's persona based on her past responses to emails. This model is added to the language-based model created across users using past responses from all users' emails. A user's model captures the typical responses of the user given a particular context. The context includes the email received, recipient of the email, and other external signals such as calendar activities, preferences, etc. The context along with the user's personality (e.g., extrovert, formal, reserved, etc.) is used to suggest responses. These responses can be a mixture of multiple modes: email replies (textual), audio clips, etc. This helps in making responses mimic the user as much as possible and helps the user to be more productive while retaining her mark in the responses.

Submitted 12 June, 2018; originally announced June 2018.

Comments: IJCAI 2018 conference. Workshop on Humanizing AI

arXiv:1801.03149 [pdf, other]

An overview of deep learning based methods for unsupervised and semi-supervised anomaly detection in videos

Authors: B Ravi Kiran, Dilip Mathew Thomas, Ranjith Parakkal

Abstract: Videos represent the primary source of information for surveillance applications and are available in large amounts but in most cases contain little or no annotation for supervised learning. This article reviews the state-of-the-art deep learning based methods for video anomaly detection and categorizes them based on the type of model and criteria of detection. We also perform simple studies to understand the different approaches and provide the criteria of evaluation for spatio-temporal anomaly detection.

Submitted 30 January, 2018; v1 submitted 9 January, 2018; originally announced January 2018.

Comments: 15 pages, double column

arXiv:1705.09339 [pdf, other]

Rejection-Cascade of Gaussians: Real-time adaptive background subtraction framework

Authors: B Ravi Kiran, Arindam Das, Senthil Yogamani

Abstract: Background-Foreground classification is a well-studied problem in computer vision. Due to the pixel-wise nature of modeling and processing in the algorithm, it is usually difficult to satisfy real-time constraints. There is a trade-off between the speed (because of model complexity) and accuracy. Inspired by the rejection cascade of the Viola-Jones classifier, we decompose the Gaussian Mixture Model (GMM) into an adaptive cascade of Gaussians (CoG). We achieve a good improvement in speed without compromising the accuracy with respect to the baseline GMM model. We demonstrate a speed-up factor of 4-5x and 17 percent average improvement in accuracy over Wallflowers surveillance datasets. The CoG is then demonstrated over the latent space representation of images of a convolutional variational autoencoder (VAE). We provide initial results over the CDW-2014 dataset, which could speed up background subtraction for deep architectures.

Submitted 16 November, 2019; v1 submitted 25 May, 2017; originally announced May 2017.

Comments: Accepted for National Conference on Computer Vision, Pattern Recognition, Image Processing and Graphics (NCVPRIPG 2019)

arXiv:1209.6484 [pdf]

doi 10.5120/8409-2043

Vulnerability Management for an Enterprise Resource Planning System

Authors: Shivani Goel, Ravi Kiran, Deepak Garg

Abstract: Enterprise resource planning (ERP) systems are commonly used in technical educational institutions (TEIs). ERP systems should continue providing services to their users irrespective of the level of failure. There could be many types of failures in the ERP systems. There are different types of measures or characteristics that can be defined for ERP systems to handle the levels of failure. In this paper, various types of failure levels are identified along with various characteristics which are concerned with those failures. The relation between all these is summarized. The disruptions causing vulnerabilities in TEIs are identified. A vulnerability management cycle has been suggested along with many commercial and open source vulnerability management tools. The paper also highlights the importance of resiliency in ERP systems in TEIs.

Submitted 28 September, 2012; originally announced September 2012.

Journal ref: International Journal of Computer Applications Foundation of Computer Science, Volume 53, No.4, 2012, pp. 19-22

Showing 1–22 of 22 results for author: Kiran, R