-
Vehicular Applications of Koopman Operator Theory -- A Survey
Authors:
Waqas Manzoor,
Samir Rawashdeh,
Alireza Mohammadi
Abstract:
Koopman operator theory has proven to be a promising approach to nonlinear system identification and global linearization. For nearly a century, there was no efficient means of calculating the Koopman operator for applied engineering purposes. The recent introduction, in the context of fluid dynamics, of a computationally efficient method based on decomposing the system dynamics into a set of normal modes in descending order has overcome this long-standing computational obstacle. The purely data-driven nature of Koopman operators holds the promise of capturing unknown and complex dynamics for reduced-order model generation and system identification, through which the rich machinery of linear control techniques can be utilized. Given the ongoing development of this research area and the many open problems in the fields of smart mobility and vehicle engineering, a survey of the techniques and open challenges of applying Koopman operator theory to this vibrant area is warranted. This review focuses on the various solutions of the Koopman operator that have emerged in recent years, particularly those focusing on mobility applications, ranging from characterization and component-level control operations to vehicle performance and fleet management. Moreover, this comprehensive review of over 100 research papers highlights the breadth of ways Koopman operator theory has been applied to various vehicular applications, with a detailed categorization of the applied Koopman operator-based algorithm type. Furthermore, this review paper discusses theoretical aspects of Koopman operator theory that have been largely neglected by the smart mobility and vehicle engineering community and yet have large potential for contributing to solving open problems in these areas.
Submitted 21 March, 2023; v1 submitted 18 March, 2023;
originally announced March 2023.
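The data-driven linearization that the Koopman abstract above describes can be illustrated with a small least-squares fit in the spirit of extended dynamic mode decomposition. This is a minimal sketch and not the survey's own algorithm: the toy system in `step`, the observable dictionary in `lift`, and all function names are assumptions chosen so that the lifted dynamics are exactly linear.

```python
def step(x, lam=0.9, mu=0.5):
    """Hypothetical nonlinear toy system (an assumption, not from the survey)."""
    x1, x2 = x
    return (lam * x1, mu * x2 + (lam ** 2 - mu) * x1 ** 2)


def lift(x):
    """Assumed observable dictionary: the state plus x1 squared."""
    x1, x2 = x
    return [x1, x2, x1 * x1]


def solve(a, b):
    """Solve a @ X = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row_a[:] + row_b[:] for row_a, row_b in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        piv = m[col][col]
        m[col] = [v / piv for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]


def fit_koopman(pairs):
    """Least-squares K such that lift(x_next) is approximately K @ lift(x)."""
    n = len(lift(pairs[0][0]))
    g = [[0.0] * n for _ in range(n)]    # sum of psi psi^T
    a_t = [[0.0] * n for _ in range(n)]  # sum of psi psi_next^T
    for x, x_next in pairs:
        p, q = lift(x), lift(x_next)
        for i in range(n):
            for j in range(n):
                g[i][j] += p[i] * p[j]
                a_t[i][j] += p[i] * q[j]
    k_t = solve(g, a_t)  # normal equations: G K^T = A^T
    return [[k_t[j][i] for j in range(n)] for i in range(n)]


# Collect snapshot pairs from a few short trajectories.
pairs = []
for x0 in [(1.0, 1.0), (-1.0, 0.5), (0.5, -1.0), (2.0, 0.0)]:
    x = x0
    for _ in range(10):
        x_next = step(x)
        pairs.append((x, x_next))
        x = x_next

K = fit_koopman(pairs)
for row in K:
    print([round(v, 6) for v in row])
```

For this particular toy system the three observables close under the dynamics, so the fitted matrix is a finite-dimensional linear model of a nonlinear system. For real vehicle data the dictionary is generally not closed and the fit is only an approximation.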
-
High-temporal-resolution event-based vehicle detection and tracking
Authors:
Zaid El-Shair,
Samir Rawashdeh
Abstract:
Event-based vision has grown rapidly in recent years, driven by its unique characteristics such as high temporal resolution (~1 µs), high dynamic range (>120 dB), and output latency of only a few microseconds. This work further explores a hybrid, multi-modal approach for object detection and tracking that leverages state-of-the-art frame-based detectors complemented by hand-crafted event-based methods to improve the overall tracking performance with minimal computational overhead. The methods presented include event-based bounding box (BB) refinement, which improves the precision of the resulting BBs, as well as a continuous event-based object detection method that recovers missed detections and generates inter-frame detections to enable a high-temporal-resolution tracking output. The advantages of these methods are quantitatively verified by an ablation study using the higher order tracking accuracy (HOTA) metric. Results show significant performance gains, reflected in an improvement in HOTA from 56.6%, using only frames, to 64.1% and 64.9% for the event and edge-based mask configurations combined with the two proposed methods, at the baseline framerate of 24 Hz. Likewise, incorporating these methods with the same configurations improved HOTA from 52.5% to 63.1%, and from 51.3% to 60.2%, at the high-temporal-resolution tracking rate of 384 Hz. Finally, a validation experiment is conducted to analyze the real-world single-object tracking performance using high-speed LiDAR. Empirical evidence shows that our approaches provide significant advantages compared to using frame-based object detectors at the baseline framerate of 24 Hz and at higher tracking rates of up to 500 Hz.
Submitted 2 January, 2023; v1 submitted 29 December, 2022;
originally announced December 2022.
-
RMOPP: Robust Multi-Objective Post-Processing for Effective Object Detection
Authors:
Mayuresh Savargaonkar,
Abdallah Chehade,
Samir Rawashdeh
Abstract:
Over the last few decades, many architectures have been developed that harness the power of neural networks to detect objects in near real-time. Training such systems requires substantial time across multiple GPUs and massive labeled training datasets. Although the goal of these systems is generalizability, they are often impractical in real-life applications due to flexibility, robustness, or speed issues. This paper proposes RMOPP, a robust multi-objective post-processing algorithm that boosts the performance of fast pre-trained object detectors with a negligible impact on their speed. Specifically, RMOPP is a statistically driven post-processing algorithm that allows for simultaneous optimization of precision and recall. A unique feature of RMOPP is the Pareto frontier, which identifies the dominant possible post-processed detectors when optimizing for both precision and recall. RMOPP explores the full potential of a pre-trained object detector and is deployable for near real-time predictions. We also provide a compelling test case on YOLOv2 using the MS-COCO dataset.
Submitted 8 February, 2021;
originally announced February 2021.
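The Pareto frontier mentioned in the RMOPP abstract above rests on the notion of dominance in (precision, recall). The sketch below shows only that general concept, not RMOPP's statistical procedure; the candidate names and scores are hypothetical values made up for illustration.

```python
def pareto_front(candidates):
    """Return the candidates that no other candidate dominates.

    A candidate is dominated if another one is at least as good on both
    precision and recall and strictly better on at least one of them.
    """
    front = []
    for name, p, r in candidates:
        dominated = any(
            (p2 >= p and r2 >= r) and (p2 > p or r2 > r)
            for _, p2, r2 in candidates
        )
        if not dominated:
            front.append((name, p, r))
    return front


# Hypothetical post-processed detector configurations: (name, precision, recall).
candidates = [
    ("A", 0.90, 0.50),
    ("B", 0.80, 0.70),
    ("C", 0.75, 0.65),  # worse than B on both metrics
    ("D", 0.60, 0.85),
]

for name, p, r in pareto_front(candidates):
    print(name, p, r)
```

Configurations on the frontier represent different precision/recall trade-offs, so choosing among them depends on the application rather than on a single scalar score.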
-
Learning Panoptic Segmentation from Instance Contours
Authors:
Sumanth Chennupati,
Venkatraman Narayanan,
Ganesh Sistu,
Senthil Yogamani,
Samir A Rawashdeh
Abstract:
Panoptic segmentation aims to provide an understanding of background (stuff) and instances of objects (things) at a pixel level. It combines the separate tasks of semantic segmentation (pixel-level classification) and instance segmentation to build a single unified scene understanding task. Typically, panoptic segmentation is derived by combining semantic and instance segmentation tasks that are learned separately or jointly (multi-task networks). In general, instance segmentation networks are built by adding a foreground mask estimation layer on top of object detectors or by using instance clustering methods that assign a pixel to an instance center. In this work, we present a fully convolutional neural network that learns instance segmentation from semantic segmentation and instance contours (boundaries of things). Instance contours, along with semantic segmentation, yield a boundary-aware semantic segmentation of things. Connected component labeling on these results produces instance segmentation. We merge semantic and instance segmentation results to output panoptic segmentation. We evaluate our proposed method on the CityScapes dataset to demonstrate qualitative and quantitative performance, along with several ablation studies. Our overview video can be accessed at https://youtu.be/wBtcxRhG3e0.
Submitted 5 April, 2021; v1 submitted 15 October, 2020;
originally announced October 2020.
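The step from boundary-aware semantic segmentation to instances, as described in the abstract above, can be sketched with a plain connected component labeling pass. This is an illustrative toy under stated assumptions: the 4-connectivity, the breadth-first traversal, and the tiny hand-written masks are choices made here, not details taken from the paper.

```python
from collections import deque


def label_components(mask):
    """Label 4-connected components of a binary grid; return (labels, count)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Hypothetical "thing" mask: two touching objects of the same class.
semantic = [
    [1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
# Hypothetical instance contour separating the two objects.
contour = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]

# Removing contour pixels from the semantic mask splits the touching objects.
boundary_aware = [
    [int(bool(s) and not b) for s, b in zip(s_row, b_row)]
    for s_row, b_row in zip(semantic, contour)
]

labels, n_instances = label_components(boundary_aware)
print(n_instances)
for row in labels:
    print(row)
```

Without the contour, the semantic mask alone would form a single component, which is why the contour prediction is what makes instance separation possible in this toy case.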
-
MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning
Authors:
Sumanth Chennupati,
Ganesh Sistu,
Senthil Yogamani,
Samir A Rawashdeh
Abstract:
Multi-task learning is commonly used in autonomous driving for solving various visual perception tasks. It offers significant benefits in terms of both performance and computational complexity. Current work on multi-task learning networks focuses on processing a single input image, and there is no known implementation of multi-task learning that handles a sequence of images. In this work, we propose a multi-stream multi-task network that takes advantage of feature representations from preceding frames in a video sequence for joint learning of segmentation, depth, and motion. The weights of the current and previous encoders are shared so that features computed in the previous frame can be leveraged without additional computation. In addition, we propose to use the geometric mean of task losses as a better alternative to the weighted average of task losses. The proposed loss function facilitates better handling of the difference in convergence rates of different tasks. Experimental results on the KITTI, Cityscapes and SYNTHIA datasets demonstrate that the proposed strategies outperform various existing multi-task learning solutions.
Submitted 22 April, 2019; v1 submitted 15 April, 2019;
originally announced April 2019.
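The contrast between a weighted average and a geometric mean of task losses, as proposed in the abstract above, can be sketched as follows. The abstract does not give the exact formulation, so the log-space computation, the `eps` guard, and the loss values are assumptions made for illustration.

```python
import math


def weighted_average_loss(losses, weights):
    """Weighted arithmetic mean of per-task losses."""
    total_weight = sum(weights)
    return sum(w * l for w, l in zip(weights, losses)) / total_weight


def geometric_mean_loss(losses, eps=1e-12):
    """Geometric mean of per-task losses, computed in log space.

    eps is an assumed guard against log(0); it is not from the paper.
    """
    log_sum = sum(math.log(l + eps) for l in losses)
    return math.exp(log_sum / len(losses))


# Hypothetical losses for segmentation, depth, and motion at very different scales.
losses = [4.0, 1.0, 0.25]

print(weighted_average_loss(losses, [1.0, 1.0, 1.0]))
print(geometric_mean_loss(losses))

# Halving the largest or the smallest loss changes the geometric mean by the
# same factor, while the arithmetic mean reacts mostly to the largest loss.
print(geometric_mean_loss([2.0, 1.0, 0.25]) / geometric_mean_loss(losses))
print(geometric_mean_loss([4.0, 1.0, 0.125]) / geometric_mean_loss(losses))
```

Because the geometric mean responds to relative rather than absolute changes, a task with a small loss magnitude is not drowned out by a task with a large one, which is one plausible reading of why it copes better with differing convergence rates.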
-
NeurAll: Towards a Unified Visual Perception Model for Automated Driving
Authors:
Ganesh Sistu,
Isabelle Leang,
Sumanth Chennupati,
Senthil Yogamani,
Ciaran Hughes,
Stefan Milz,
Samir Rawashdeh
Abstract:
Convolutional Neural Networks (CNNs) are successfully used for important automotive visual perception tasks, including object recognition, motion and depth estimation, visual SLAM, etc. However, these tasks are typically explored and modeled independently. In this paper, we propose a joint multi-task network design for learning several tasks simultaneously. Our main motivation is the computational efficiency achieved by sharing the expensive initial convolutional layers between all tasks. Indeed, the main bottleneck in automated driving systems is the limited processing power available on deployment hardware. There is also some evidence for other benefits, namely improved accuracy for some tasks and reduced development effort. The design also offers scalability, allowing more tasks to be added by leveraging existing features while achieving better generalization. We survey various CNN-based solutions for visual perception tasks in automated driving. We then propose a unified CNN model for the important tasks and discuss several advanced optimization and architecture design techniques to improve the baseline model. The paper is partly a review and partly a position paper, with a demonstration of several preliminary results that are promising for future research. We first demonstrate results of multi-stream learning and auxiliary learning, which are important ingredients for scaling to a large multi-task model. Finally, we implement a two-stream three-task network that performs better in many cases than the corresponding single-task models, while maintaining network size.
Submitted 9 March, 2024; v1 submitted 10 February, 2019;
originally announced February 2019.
-
AuxNet: Auxiliary tasks enhanced Semantic Segmentation for Automated Driving
Authors:
Sumanth Chennupati,
Ganesh Sistu,
Senthil Yogamani,
Samir Rawashdeh
Abstract:
Decision making in automated driving is highly specific to the environment, and thus semantic segmentation plays a key role in recognizing the objects in the environment around the car. Pixel-level classification, once considered a challenging task, is now becoming mature enough to be productized in a car. However, semantic annotation is time-consuming and quite expensive. Synthetic datasets with domain adaptation techniques have been used to alleviate the lack of large annotated datasets. In this work, we explore an alternative approach of leveraging the annotations of other tasks to improve semantic segmentation. Recently, multi-task learning has become a popular paradigm in automated driving, demonstrating that joint learning of multiple tasks improves the overall performance of each task. Motivated by this, we use auxiliary tasks such as depth estimation to improve the performance of the semantic segmentation task. We propose adaptive task loss weighting techniques to address scale issues in multi-task loss functions, which become more crucial with auxiliary tasks. We experimented on automotive datasets including SYNTHIA and KITTI and obtained 3% and 5% improvements in accuracy, respectively.
Submitted 17 January, 2019;
originally announced January 2019.