-
Distance Preserving Machine Learning for Uncertainty Aware Accelerator Capacitance Predictions
Authors:
Steven Goldenberg,
Malachi Schram,
Kishansingh Rajput,
Thomas Britton,
Chris Pappas,
Dan Lu,
Jared Walden,
Majdi I. Radaideh,
Sarah Cousineau,
Sudarshan Harave
Abstract:
Providing accurate uncertainty estimations is essential for producing reliable machine learning models, especially in safety-critical applications such as accelerator systems. Gaussian process models are generally regarded as the gold standard method for this task, but they can struggle with large, high-dimensional datasets. Combining deep neural networks with Gaussian process approximation techniques has shown promising results, but dimensionality reduction through standard deep neural network layers is not guaranteed to maintain the distance information necessary for Gaussian process models. We build on previous work by comparing the use of the singular value decomposition against a spectral-normalized dense layer as a feature extractor for a deep neural Gaussian process approximation model and apply it to a capacitance prediction problem for the High Voltage Converter Modulators in the Oak Ridge Spallation Neutron Source. Our model shows improved distance preservation and predicts in-distribution capacitance values with less than 1% error.
Submitted 5 July, 2023;
originally announced July 2023.
-
An Exploratory Study of Project Activity Changepoints in Open Source Software Evolution
Authors:
James Walden,
Noah Burgin,
Kuljit Kaur
Abstract:
To explore the prevalence of abrupt changes (changepoints) in open source project activity, we assembled a dataset of 8,919 projects from the World of Code. Projects were selected based on age, number of commits, and number of authors. Using the nonparametric PELT algorithm, we identified changepoints in project activity time series, finding that more than 90% of projects had between one and six changepoints. Increases and decreases in project activity occurred with roughly equal frequency. While most changes are relatively small, on the order of a few authors or a few dozen commits per month, there were long tails of much larger project activity changes. In future work, we plan to focus on larger changes to search for common open source lifecycle patterns as well as common responses to external events.
Submitted 19 March, 2021;
originally announced March 2021.
-
The Impact of a Major Security Event on an Open Source Project: The Case of OpenSSL
Authors:
James Walden
Abstract:
Context: The Heartbleed vulnerability brought OpenSSL to international attention in 2014. The almost moribund project was a key security component in public web servers and over a billion mobile devices. This vulnerability led to new investments in OpenSSL.
Objective: The goal of this study is to determine how the Heartbleed vulnerability changed the software evolution of OpenSSL. We study changes in vulnerabilities, code quality, project activity, and software engineering practices.
Method: We use a mixed methods approach, collecting multiple types of quantitative data and qualitative data from web sites and an interview with a developer who worked on post-Heartbleed changes. We use regression discontinuity analysis to determine changes in levels and slopes of code and project activity metrics resulting from Heartbleed.
Results: The OpenSSL project made tremendous improvements to code quality and security after Heartbleed. By the end of 2016, the number of commits per month had tripled, 91 vulnerabilities were found and fixed, code complexity decreased significantly, and OpenSSL obtained a CII best practices badge, certifying its use of good open source development practices.
Conclusions: The OpenSSL project provides a model of how an open source project can adapt and improve after a security event. The evolution of OpenSSL shows that the number of known vulnerabilities is not a useful indicator of project security. A small number of vulnerabilities may simply indicate that a project does not expend much effort on finding vulnerabilities. This study suggests that project activity and CII badge best practices may be better indicators of code quality and security than vulnerability counts.
Submitted 28 May, 2020;
originally announced May 2020.