Search | arXiv e-print repository

Evaluation of automated driving system safety metrics with logged vehicle trajectory data

Authors: Xintao Yan, Shuo Feng, David J. LeBlanc, Carol Flannagan, Henry X. Liu

Abstract: Real-time safety metrics are important for the automated driving system (ADS) to assess the risk of driving situations and to assist the decision-making. Although a number of real-time safety metrics have been proposed in the literature, systematic performance evaluation of these safety metrics has been lacking. As different behavioral assumptions are adopted in different safety metrics, it is difficult to compare the safety metrics and evaluate their performance. To overcome this challenge, in this study, we propose an evaluation framework utilizing logged vehicle trajectory data, in that vehicle trajectories for both subject vehicle (SV) and background vehicles (BVs) are obtained and the prediction errors caused by behavioral assumptions can be eliminated. Specifically, we examine whether the SV is in a collision unavoidable situation at each moment, given all near-future trajectories of BVs. In this way, we level the ground for a fair comparison of different safety metrics, as a good safety metric should always alarm in advance to the collision unavoidable moment. When trajectory data from a large number of trips are available, we can systematically evaluate and compare different metrics' statistical performance. In the case study, three representative real-time safety metrics, including the time-to-collision (TTC), the PEGASUS Criticality Metric (PCM), and the Model Predictive Instantaneous Safety Metric (MPrISM), are evaluated using a large-scale simulated trajectory dataset. The proposed evaluation framework is important for researchers, practitioners, and regulators to characterize different metrics, and to select appropriate metrics for different applications. Moreover, by conducting failure analysis on moments when a safety metric failed, we can identify its potential weaknesses which are valuable for its potential refinements and improvements.

Submitted 2 January, 2024; originally announced January 2024.
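Of the three metrics named in the abstract above, time-to-collision (TTC) is the simplest to illustrate. The following is a minimal sketch of the common constant-speed, same-lane TTC definition, not the implementation evaluated in the paper; the function name and parameter names are hypothetical.

```python
import math


def time_to_collision(gap_m, follower_speed_mps, leader_speed_mps):
    """Constant-speed TTC for a follower approaching a leader in the same lane.

    Illustrative assumption: both vehicles hold their current speeds.
    Returns math.inf when the follower is not closing the gap.
    """
    closing_speed = follower_speed_mps - leader_speed_mps
    if closing_speed <= 0.0:
        # The gap is constant or growing, so no collision is predicted.
        return math.inf
    return gap_m / closing_speed


if __name__ == "__main__":
    # A 20 m gap closing at 5 m/s gives a TTC of 4 seconds.
    print(time_to_collision(20.0, 15.0, 10.0))
```

The constant-speed assumption is exactly the kind of behavioral assumption the abstract says makes metrics hard to compare.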

arXiv:2312.04724

Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

Authors: Manish Bhatt, Sahana Chennabasappa, Cyrus Nikolaidis, Shengye Wan, Ivan Evtimov, Dominik Gabi, Daniel Song, Faizan Ahmad, Cornelius Aschermann, Lorenzo Fontana, Sasha Frolov, Ravi Prakash Giri, Dhaval Kapil, Yiannis Kozyrakis, David LeBlanc, James Milazzo, Aleksandar Straumann, Gabriel Synnaeve, Varun Vontimitta, Spencer Whitman, Joshua Saxe

Abstract: This paper presents CyberSecEval, a comprehensive benchmark developed to help bolster the cybersecurity of Large Language Models (LLMs) employed as coding assistants. As what we believe to be the most extensive unified cybersecurity safety benchmark to date, CyberSecEval provides a thorough evaluation of LLMs in two crucial security domains: their propensity to generate insecure code and their level of compliance when asked to assist in cyberattacks. Through a case study involving seven models from the Llama 2, Code Llama, and OpenAI GPT large language model families, CyberSecEval effectively pinpointed key cybersecurity risks. More importantly, it offered practical insights for refining these models. A significant observation from the study was the tendency of more advanced models to suggest insecure code, highlighting the critical need for integrating security considerations in the development of sophisticated LLMs. CyberSecEval, with its automated test case generation and evaluation pipeline, covers a broad scope and equips LLM designers and researchers with a tool to broadly measure and enhance the cybersecurity safety properties of LLMs, contributing to the development of more secure AI systems.

Submitted 7 December, 2023; originally announced December 2023.

arXiv:2308.12846

Object level footprint uncertainty quantification in infrastructure based sensing

Authors: Arpan Kusari, Asma Almutairi, Mark E. Gilbert, David J. LeBlanc

Abstract: We examine the problem of estimating footprint uncertainty of objects imaged using the infrastructure based camera sensing. A closed form relationship is established between the ground coordinates and the sources of the camera errors. Using the error propagation equation, the covariance of a given ground coordinate can be measured as a function of the camera errors. The uncertainty of the footprint of the bounding box can then be given as the function of all the extreme points of the object footprint. In order to calculate the uncertainty of a ground point, the typical error sizes of the error sources are required. We present a method of estimating the typical error sizes from an experiment using a static, high-precision LiDAR as the ground truth. Finally, we present a simulated case study of uncertainty quantification from infrastructure based camera in CARLA to provide a sense of how the uncertainty changes across a left turn maneuver.

Submitted 24 August, 2023; originally announced August 2023.

Comments: Submitted to IEEE Sensors journal

arXiv:2110.07111

A Novel Traffic Simulation Framework for Testing Autonomous Vehicles Using SUMO and CARLA

Authors: Pei Li, Arpan Kusari, David J. LeBlanc

Abstract: Traffic simulation is an efficient and cost-effective way to test Autonomous Vehicles (AVs) in a complex and dynamic environment. Numerous studies have been conducted for AV evaluation using traffic simulation over the past decades. However, the current simulation environments fall behind on two fronts -- the background vehicles (BVs) fail to simulate naturalistic driving behavior and the existing environments do not test the entire pipeline in a modular fashion. This study aims to propose a simulation framework that creates a complex and naturalistic traffic environment. Specifically, we combine a modified version of the Simulation of Urban MObility (SUMO) simulator with the Cars Learning to Act (CARLA) simulator to generate a simulation environment that could emulate the complexities of the external environment while providing realistic sensor outputs to the AV pipeline. In a past research work, we created an open-source Python package called SUMO-Gym which generates a realistic road network and naturalistic traffic through SUMO and combines that with OpenAI Gym to provide ease of use for the end user. We propose to extend our developed software by adding CARLA, which in turn will enrich the perception of the ego vehicle by providing realistic sensor outputs of the AV's surrounding environment. Using the proposed framework, AV perception, planning, and control could be tested in a complex and realistic driving environment. The performance of the proposed framework in constructing output generation and AV evaluations is demonstrated using several case studies.

Submitted 13 October, 2021; originally announced October 2021.

arXiv:2109.11620

Enhancing SUMO simulator for simulation based testing and validation of autonomous vehicles

Authors: Arpan Kusari, Pei Li, Hanzhi Yang, Nikhil Punshi, Mich Rasulis, Scott Bogard, David J. LeBlanc

Abstract: Current autonomous vehicle (AV) simulators are built to provide large-scale testing required to prove capabilities under varied conditions in controlled, repeatable fashion. However, they have certain failings including the need for user expertise and complex inconvenient tutorials for customized scenario creation. Simulation of Urban Mobility (SUMO) simulator, which has been presented as an open-source AV simulator, is used extensively but suffers from similar issues which make it difficult for entry-level practitioners to utilize the simulator without significant time investment. In that regard, we provide two enhancements to SUMO simulator geared towards massively improving user experience and providing real-life like variability for surrounding traffic. Firstly, we calibrate a car-following model, Intelligent Driver Model (IDM), for highway and urban naturalistic driving data and sample automatically from the parameter distributions to create the background vehicles. Secondly, we combine SUMO with OpenAI gym, creating a Python package which can run simulations based on real world highway and urban layouts with generic output observations and input actions that can be processed via any AV pipeline. Our aim through these enhancements is to provide an easy-to-use platform which can be readily used for AV testing and validation.

Submitted 23 September, 2021; originally announced September 2021.
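The abstract above refers to the Intelligent Driver Model (IDM). As a rough illustration of what such a car-following model computes, here is a sketch of the standard IDM acceleration equation. It is not the authors' calibrated model: the function name and all default parameter values are placeholders chosen for illustration, and the clamp on the interaction term is a common implementation convention rather than something stated in the abstract.

```python
import math


def idm_acceleration(speed, approach_rate, gap,
                     desired_speed=30.0,   # v0 in m/s (placeholder value)
                     time_headway=1.5,     # T in s (placeholder value)
                     min_gap=2.0,          # s0 in m (placeholder value)
                     max_accel=1.0,        # a in m/s^2 (placeholder value)
                     comfort_decel=2.0,    # b in m/s^2 (placeholder value)
                     exponent=4.0):        # delta (placeholder value)
    """Standard IDM acceleration for a follower.

    speed: follower speed (m/s)
    approach_rate: follower speed minus leader speed (m/s)
    gap: bumper-to-bumper distance to the leader (m), may be math.inf
    """
    # Desired dynamic gap; the interaction term is clamped at zero, a
    # common convention so the desired gap never drops below min_gap.
    interaction = (speed * time_headway
                   + speed * approach_rate
                   / (2.0 * math.sqrt(max_accel * comfort_decel)))
    desired_gap = min_gap + max(0.0, interaction)
    free_road_term = (speed / desired_speed) ** exponent
    return max_accel * (1.0 - free_road_term - (desired_gap / gap) ** 2)


if __name__ == "__main__":
    # Closing on a slower leader at short range yields strong braking.
    print(idm_acceleration(speed=20.0, approach_rate=5.0, gap=10.0))
```

Sampling the keyword parameters per vehicle from fitted distributions, as the abstract describes, would give each background vehicle its own driving style; the distributions themselves are not given in the abstract and are not reproduced here.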

arXiv:1701.08915

Accelerated Evaluation of Automated Vehicles Using Piecewise Mixture Models

Authors: Zhiyuan Huang, Ding Zhao, Henry Lam, David J. LeBlanc

Abstract: The process to certify highly Automated Vehicles has not yet been defined by any country in the world. Currently, companies test Automated Vehicles on public roads, which is time-consuming and inefficient. We proposed the Accelerated Evaluation concept, which uses modified statistics of the surrounding vehicles and the Importance Sampling theory to reduce the evaluation time by several orders of magnitude, while ensuring the evaluation results are statistically accurate. In this paper, we further improve the accelerated evaluation concept by using Piecewise Mixture Distribution models, instead of Single Parametric Distribution models. We developed and applied this idea to forward collision control system reacting to vehicles making cut-in lane changes. The behavior of the cut-in vehicles was modeled based on more than 403,581 lane changes collected by the University of Michigan Safety Pilot Model Deployment Program. Simulation results confirm that the Piecewise Mixture Distribution method outperformed single parametric distribution methods in accuracy and efficiency, and accelerated the evaluation process by almost four orders of magnitude.

Submitted 30 January, 2017; originally announced January 2017.

Comments: 11 pages, 13 figures

arXiv:1607.02687

Accelerated Evaluation of Automated Vehicles in Car-Following Maneuvers

Authors: Ding Zhao, Xianan Huang, Huei Peng, Henry Lam, David J. LeBlanc

Abstract: The safety of Automated Vehicles (AVs) must be assured before their release and deployment. The current approach to evaluation relies primarily on (i) testing AVs on public roads or (ii) track testing with scenarios defined in a test matrix. These two methods have completely opposing drawbacks: the former, while offering realistic scenarios, takes too much time to execute; the latter, though it can be completed in a short amount of time, has no clear correlation to safety benefits in the real world. To avoid the aforementioned problems, we propose Accelerated Evaluation, focusing on the car-following scenario. The stochastic human-controlled vehicle (HV) motions are modeled based on 1.3 million miles of naturalistic driving data collected by the University of Michigan Safety Pilot Model Deployment Program. The statistics of the HV behaviors are then modified to generate more intense interactions between HVs and AVs to accelerate the evaluation procedure. The Importance Sampling theory was used to ensure that the safety benefits of AVs are accurately assessed under accelerated tests. Crash, injury and conflict rates for a simulated AV are simulated to demonstrate the proposed approach. Results show that test duration is reduced by a factor of 300 to 100,000 compared with the non-accelerated (naturalistic) evaluation. In other words, the proposed techniques have great potential for accelerating the AV evaluation process.

Submitted 19 February, 2017; v1 submitted 9 July, 2016; originally announced July 2016.

Comments: 11 pages, 11 figures
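The Accelerated Evaluation abstracts in this listing rely on importance sampling: rare events are sampled from a skewed distribution and each sample is reweighted by a likelihood ratio so the estimate stays unbiased. The toy sketch below shows that idea on a standard normal tail probability. It is not the papers' driving model; the function names, the shifted-normal proposal, and the sample count are assumptions made for illustration.

```python
import math
import random


def estimate_tail_probability(threshold, n_samples=20000, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by importance sampling.

    Samples come from a skewed proposal N(threshold, 1), which makes the
    rare event common; each hit is weighted by the likelihood ratio p/q.
    """
    rng = random.Random(seed)
    shift = threshold  # mean of the proposal distribution (assumed choice)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(shift, 1.0)
        if x > threshold:
            # Ratio of N(0, 1) density to N(shift, 1) density at x.
            total += math.exp(-shift * x + 0.5 * shift * shift)
    return total / n_samples


def exact_tail_probability(threshold):
    """Closed-form P(X > threshold) for X ~ N(0, 1), for comparison."""
    return 0.5 * math.erfc(threshold / math.sqrt(2.0))


if __name__ == "__main__":
    print(estimate_tail_probability(4.0), exact_tail_probability(4.0))
```

Naive sampling from N(0, 1) would need on the order of millions of draws to see a handful of events beyond 4, which mirrors the reduction in test duration the abstracts report for driving scenarios.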

arXiv:1605.04965

doi 10.1109/TITS.2016.2582208

Accelerated Evaluation of Automated Vehicles Safety in Lane Change Scenarios Based on Importance Sampling Techniques

Authors: Ding Zhao, Henry Lam, Huei Peng, Shan Bao, David J. LeBlanc, Kazutoshi Nobukawa, Christopher S. Pan

Abstract: Automated vehicles (AVs) must be evaluated thoroughly before their release and deployment. A widely-used evaluation approach is the Naturalistic-Field Operational Test (N-FOT), which tests prototype vehicles directly on the public roads. Due to the low exposure to safety-critical scenarios, N-FOTs are time-consuming and expensive to conduct. In this paper, we propose an accelerated evaluation approach for AVs. The results can be used to generate motions of the primary other vehicles to accelerate the verification of AVs in simulations and controlled experiments. Frontal collision due to unsafe cut-ins is the target crash type of this paper. Human-controlled vehicles making unsafe lane changes are modeled as the primary disturbance to AVs based on data collected by the University of Michigan Safety Pilot Model Deployment Program. The cut-in scenarios are generated based on skewed statistics of collected human driver behaviors, which generate risky testing scenarios while preserving the statistical information so that the safety benefits of AVs in non-accelerated cases can be accurately estimated. The Cross Entropy method is used to recursively search for the optimal skewing parameters. The frequencies of occurrence of conflicts, crashes and injuries are estimated for a modeled automated vehicle, and the achieved accelerated rate is around 2,000 to 20,000. In other words, in the accelerated simulations, driving for 1,000 miles will expose the AV to challenging scenarios that will take about 2 to 20 million miles of real-world driving to encounter. This technique thus has the potential to reduce greatly the development and validation time for AVs.

Submitted 15 June, 2016; v1 submitted 16 May, 2016; originally announced May 2016.

Comments: submitted to IEEE ITSC

Showing 1–8 of 8 results for author: LeBlanc, D