Search | arXiv e-print repository

Establishing Provenance Before Coding: Traditional and Next-Gen Signing

Authors: Taylor R. Schorlemmer, Ethan H. Burmane, Kelechi G. Kalu, Santiago Torres-Arias, James C. Davis

Abstract: Software engineers integrate third-party components into their applications. The resulting software supply chain is vulnerable. To reduce the attack surface, we can verify the origin of components (provenance) before adding them. Cryptographic signatures enable this. This article describes traditional signing, its challenges, and changes introduced by next generation signing platforms Software engineers integrate third-party components into their applications. The resulting software supply chain is vulnerable. To reduce the attack surface, we can verify the origin of components (provenance) before adding them. Cryptographic signatures enable this. This article describes traditional signing, its challenges, and changes introduced by next generation signing platforms △ Less

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2406.08198 [pdf, other]

An Industry Interview Study of Software Signing for Supply Chain Security

Authors: Kelechi G. Kalu, Tanya Singla, Chinenye Okafor, Santiago Torres-Arias, James C. Davis

Abstract: Many software products are composed by the recursive integration of components from other teams or external parties. Each additional link in a software product's supply chain increases the risk of the injection of malicious behavior. To improve supply chain provenance, many cybersecurity frameworks, standards, and regulations recommend the use of software signing. However, recent surveys and measu… ▽ More Many software products are composed by the recursive integration of components from other teams or external parties. Each additional link in a software product's supply chain increases the risk of the injection of malicious behavior. To improve supply chain provenance, many cybersecurity frameworks, standards, and regulations recommend the use of software signing. However, recent surveys and measurement studies have found that the adoption rate and quality of software signatures are low. These findings raise questions about the practical application of software signing, the human factors influencing its adoption, and the challenges faced during its implementation. We lack in-depth industry perspectives on the challenges and practices of software signing. To understand software signing in practice, we interviewed 18 high-ranking industry practitioners across 13 organizations. We provide possible impacts of experienced software supply chain failures, security standards, and regulations on software signing adoption. We also study the challenges that affect an effective software signing implementation. To summarize our findings: (1) We present a refined model of the software supply chain factory model highlighting practitioner's signing practices; (2) We highlight the different challenges -- Technical, Organizational, and Human -- that hamper software signing implementation; (3) We report that expert subjects disagree on the importance of signing; (4) We describe how failure incidents and industry standards affect the adoption of software signing and other security techniques. Our findings contribute to the understanding of software supply chain security by highlighting the impact of human and organizational factors on Software Supply Chain risks and providing nuanced insights for effectively implementing Software Supply Chain security controls -- towards Software signing in practice. △ Less

Submitted 12 June, 2024; originally announced June 2024.

arXiv:2401.14635 [pdf, other]

Signing in Four Public Software Package Registries: Quantity, Quality, and Influencing Factors

Authors: Taylor R Schorlemmer, Kelechi G Kalu, Luke Chigges, Kyung Myung Ko, Eman Abu Isghair, Saurabh Baghi, Santiago Torres-Arias, James C Davis

Abstract: Many software applications incorporate open-source third-party packages distributed by public package registries. Guaranteeing authorship along this supply chain is a challenge. Package maintainers can guarantee package authorship through software signing. However, it is unclear how common this practice is, and whether the resulting signatures are created properly. Prior work has provided raw data… ▽ More Many software applications incorporate open-source third-party packages distributed by public package registries. Guaranteeing authorship along this supply chain is a challenge. Package maintainers can guarantee package authorship through software signing. However, it is unclear how common this practice is, and whether the resulting signatures are created properly. Prior work has provided raw data on registry signing practices, but only measured single platforms, did not consider quality, did not consider time, and did not assess factors that may influence signing. We do not have up-to-date measurements of signing practices nor do we know the quality of existing signatures. Furthermore, we lack a comprehensive understanding of factors that influence signing adoption. This study addresses this gap. We provide measurements across three kinds of package registries: traditional software (Maven, PyPI), container images (DockerHub), and machine learning models (Hugging Face). For each registry, we describe the nature of the signed artifacts as well as the current quantity and quality of signatures. Then, we examine longitudinal trends in signing practices. Finally, we use a quasi-experiment to estimate the effect that various factors had on software signing practices. To summarize our findings: (1) mandating signature adoption improves the quantity of signatures; (2) providing dedicated tooling improves the quality of signing; (3) getting started is the hard part -- once a maintainer begins to sign, they tend to continue doing so; and (4) although many supply chain attacks are mitigable via signing, signing adoption is primarily affected by registry policy rather than by public knowledge of attacks, new engineering standards, etc. These findings highlight the importance of software package registry managers and signing infrastructure. △ Less

Submitted 14 April, 2024; v1 submitted 25 January, 2024; originally announced January 2024.

Comments: Accepted at IEEE Security & Privacy 2024 (S&P'24)

arXiv:2308.12387 [pdf, other]

Reflecting on the Use of the Policy-Process-Product Theory in Empirical Software Engineering

Authors: Kelechi G. Kalu, Taylor R. Schorlemmer, Sophie Chen, Kyle Robinson, Erik Kocinare, James C. Davis

Abstract: The primary theory of software engineering is that an organization's Policies and Processes influence the quality of its Products. We call this the PPP Theory. Although empirical software engineering research has grown common, it is unclear whether researchers are trying to evaluate the PPP Theory. To assess this, we analyzed half (33) of the empirical works published over the last two years in th… ▽ More The primary theory of software engineering is that an organization's Policies and Processes influence the quality of its Products. We call this the PPP Theory. Although empirical software engineering research has grown common, it is unclear whether researchers are trying to evaluate the PPP Theory. To assess this, we analyzed half (33) of the empirical works published over the last two years in three prominent software engineering conferences. In this sample, 70% focus on policies/processes or products, not both. Only 33% provided measurements relating policy/process and products. We make four recommendations: (1) Use PPP Theory in study design; (2) Study feedback relationships; (3) Diversify the studied feedforward relationships; and (4) Disentangle policy and process. Let us remember that research results are in the context of, and with respect to, the relationship between software products, processes, and policies. △ Less

Submitted 23 August, 2023; originally announced August 2023.

Comments: 5 pages, published in the proceedings of the 2023 ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering in the Ideas-Visions-Reflections track (ESEC/FSE-IVR'23)

arXiv:2308.04898 [pdf, other]

An Empirical Study on Using Large Language Models to Analyze Software Supply Chain Security Failures

Authors: Tanmay Singla, Dharun Anandayuvaraj, Kelechi G. Kalu, Taylor R. Schorlemmer, James C. Davis

Abstract: As we increasingly depend on software systems, the consequences of breaches in the software supply chain become more severe. High-profile cyber attacks like those on SolarWinds and ShadowHammer have resulted in significant financial and data losses, underlining the need for stronger cybersecurity. One way to prevent future breaches is by studying past failures. However, traditional methods of anal… ▽ More As we increasingly depend on software systems, the consequences of breaches in the software supply chain become more severe. High-profile cyber attacks like those on SolarWinds and ShadowHammer have resulted in significant financial and data losses, underlining the need for stronger cybersecurity. One way to prevent future breaches is by studying past failures. However, traditional methods of analyzing these failures require manually reading and summarizing reports about them. Automated support could reduce costs and allow analysis of more failures. Natural Language Processing (NLP) techniques such as Large Language Models (LLMs) could be leveraged to assist the analysis of failures. In this study, we assessed the ability of Large Language Models (LLMs) to analyze historical software supply chain breaches. We used LLMs to replicate the manual analysis of 69 software supply chain security failures performed by members of the Cloud Native Computing Foundation (CNCF). We developed prompts for LLMs to categorize these by four dimensions: type of compromise, intent, nature, and impact. GPT 3.5s categorizations had an average accuracy of 68% and Bard had an accuracy of 58% over these dimensions. We report that LLMs effectively characterize software supply chain failures when the source articles are detailed enough for consensus among manual analysts, but cannot yet replace human analysts. Future work can improve LLM performance in this context, and study a broader range of articles and failures. △ Less

Submitted 9 August, 2023; originally announced August 2023.

Comments: 22 pages, 9 figures

Showing 1–5 of 5 results for author: Kalu, K G