-
Derivative-based regularization for regression
Authors:
Enrico Lopedoto,
Maksim Shekhunov,
Vitaly Aksenov,
Kizito Salako,
Tillman Weyde
Abstract:
In this work, we introduce a novel approach to regularization in multivariable regression problems. Our regularizer, called DLoss, penalises differences between the model's derivatives and derivatives of the data generating function as estimated from the training data. We call these estimated derivatives data derivatives. The goal of our method is to align the model to the data, not only in terms…
▽ More
In this work, we introduce a novel approach to regularization in multivariable regression problems. Our regularizer, called DLoss, penalises differences between the model's derivatives and derivatives of the data generating function as estimated from the training data. We call these estimated derivatives data derivatives. The goal of our method is to align the model to the data, not only in terms of target values but also in terms of the derivatives involved. To estimate data derivatives, we select (from the training data) 2-tuples of input-value pairs, using either nearest neighbour or random, selection. On synthetic and real datasets, we evaluate the effectiveness of adding DLoss, with different weights, to the standard mean squared error loss. The experimental results show that with DLoss (using nearest neighbour selection) we obtain, on average, the best rank with respect to MSE on validation data sets, compared to no regularization, L2 regularization, and Dropout.
△ Less
Submitted 1 May, 2024;
originally announced May 2024.
-
Demonstrating Software Reliability using Possibly Correlated Tests: Insights from a Conservative Bayesian Approach
Authors:
Kizito Salako,
Xingyu Zhao
Abstract:
This paper presents Bayesian techniques for conservative claims about software reliability, particularly when evidence suggests the software's executions are not statistically independent. We formalise informal notions of "doubting" that the executions are independent, and incorporate such doubts into reliability assessments. We develop techniques that reveal the extent to which independence assum…
▽ More
This paper presents Bayesian techniques for conservative claims about software reliability, particularly when evidence suggests the software's executions are not statistically independent. We formalise informal notions of "doubting" that the executions are independent, and incorporate such doubts into reliability assessments. We develop techniques that reveal the extent to which independence assumptions can undermine conservatism in assessments, and identify conditions under which this impact is not significant. These techniques - novel extensions of conservative Bayesian inference (CBI) approaches - give conservative confidence bounds on the software's failure probability per execution. With illustrations in two application areas - nuclear power-plant safety and autonomous vehicle (AV) safety - our analyses reveals: 1) the confidence an assessor should possess before subjecting a system to operational testing. Otherwise, such testing is futile - favourable operational testing evidence will eventually decrease one's confidence in the system being sufficiently reliable; 2) the independence assumption supports conservative claims sometimes; 3) in some scenarios, observing a system operate without failure gives less confidence in the system than if some failures had been observed; 4) building confidence in a system is very sensitive to failures - each additional failure means significantly more operational testing is required, in order to support a reliability claim.
△ Less
Submitted 11 October, 2023; v1 submitted 16 August, 2022;
originally announced August 2022.
-
The Unnecessity of Assuming Statistically Independent Tests in Bayesian Software Reliability Assessments
Authors:
Kizito Salako,
Xingyu Zhao
Abstract:
When assessing a software-based system, the results of Bayesian statistical inference on operational testing data can provide strong support for software reliability claims. For inference, this data (i.e. software successes and failures) is often assumed to arise in an independent, identically distributed (i.i.d.) manner. In this paper we show how conservative Bayesian approaches make this assumpt…
▽ More
When assessing a software-based system, the results of Bayesian statistical inference on operational testing data can provide strong support for software reliability claims. For inference, this data (i.e. software successes and failures) is often assumed to arise in an independent, identically distributed (i.i.d.) manner. In this paper we show how conservative Bayesian approaches make this assumption unnecessary, by incorporating one's doubts about the assumption into the assessment. We derive conservative confidence bounds on a system's probability of failure on demand (pfd), when operational testing reveals no failures. The generality and utility of the confidence bounds are illustrated in the assessment of a nuclear power-plant safety-protection system, under varying levels of skepticism about the i.i.d. assumption. The analysis suggests that the i.i.d. assumption can make Bayesian reliability assessments extremely optimistic - such assessments do not explicitly account for how software can be very likely to exhibit no failures during extensive operational testing despite the software's pfd being undesirably large.
△ Less
Submitted 22 December, 2022; v1 submitted 31 July, 2022;
originally announced August 2022.
-
Assessing Safety-Critical Systems from Operational Testing: A Study on Autonomous Vehicles
Authors:
Xingyu Zhao,
Kizito Salako,
Lorenzo Strigini,
Valentin Robu,
David Flynn
Abstract:
Context: Demonstrating high reliability and safety for safety-critical systems (SCSs) remains a hard problem. Diverse evidence needs to be combined in a rigorous way: in particular, results of operational testing with other evidence from design and verification. Growing use of machine learning in SCSs, by precluding most established methods for gaining assurance, makes operational testing even mor…
▽ More
Context: Demonstrating high reliability and safety for safety-critical systems (SCSs) remains a hard problem. Diverse evidence needs to be combined in a rigorous way: in particular, results of operational testing with other evidence from design and verification. Growing use of machine learning in SCSs, by precluding most established methods for gaining assurance, makes operational testing even more important for supporting safety and reliability claims. Objective: We use Autonomous Vehicles (AVs) as a current example to revisit the problem of demonstrating high reliability. AVs are making their debut on public roads: methods for assessing whether an AV is safe enough are urgently needed. We demonstrate how to answer 5 questions that would arise in assessing an AV type, starting with those proposed by a highly-cited study. Method: We apply new theorems extending Conservative Bayesian Inference (CBI), which exploit the rigour of Bayesian methods while reducing the risk of involuntary misuse associated with now-common applications of Bayesian inference; we define additional conditions needed for applying these methods to AVs. Results: Prior knowledge can bring substantial advantages if the AV design allows strong expectations of safety before road testing. We also show how naive attempts at conservative assessment may lead to over-optimism instead; why extrapolating the trend of disengagements is not suitable for safety claims; use of knowledge that an AV has moved to a less stressful environment. Conclusion: While some reliability targets will remain too high to be practically verifiable, CBI removes a major source of doubt: it allows use of prior knowledge without inducing dangerously optimistic biases. For certain ranges of required reliability and prior beliefs, CBI thus supports feasible, sound arguments. Useful conservative claims can be derived from limited prior knowledge.
△ Less
Submitted 19 August, 2020;
originally announced August 2020.
-
Assessing the Safety and Reliability of Autonomous Vehicles from Road Testing
Authors:
Xingyu Zhao,
Valentin Robu,
David Flynn,
Kizito Salako,
Lorenzo Strigini
Abstract:
There is an urgent societal need to assess whether autonomous vehicles (AVs) are safe enough. From published quantitative safety and reliability assessments of AVs, we know that, given the goal of predicting very low rates of accidents, road testing alone requires infeasible numbers of miles to be driven. However, previous analyses do not consider any knowledge prior to road testing - knowledge wh…
▽ More
There is an urgent societal need to assess whether autonomous vehicles (AVs) are safe enough. From published quantitative safety and reliability assessments of AVs, we know that, given the goal of predicting very low rates of accidents, road testing alone requires infeasible numbers of miles to be driven. However, previous analyses do not consider any knowledge prior to road testing - knowledge which could bring substantial advantages if the AV design allows strong expectations of safety before road testing. We present the advantages of a new variant of Conservative Bayesian Inference (CBI), which uses prior knowledge while avoiding optimistic biases. We then study the trend of disengagements (take-overs by human drivers) by applying Software Reliability Growth Models (SRGMs) to data from Waymo's public road testing over 51 months, in view of the practice of software updates during this testing. Our approach is to not trust any specific SRGM, but to assess forecast accuracy and then improve forecasts. We show that, coupled with accuracy assessment and recalibration techniques, SRGMs could be a valuable test planning aid.
△ Less
Submitted 18 August, 2019;
originally announced August 2019.