Search | arXiv e-print repository

Secure Software Development in the Era of Fluid Multi-party Open Software and Services

Authors: Ivan Pashchenko, Riccardo Scandariato, Antonino Sabetta, Fabio Massacci

Abstract: Pushed by market forces, software development has become fast-paced. As a consequence, modern development projects are assembled from 3rd-party components. Security & privacy assurance techniques once designed for large, controlled updates over months or years, must now cope with small, continuous changes taking place within a week, and happening in sub-components that are controlled by third-party developers one might not even know they existed. In this paper, we aim to provide an overview of the current software security approaches and evaluate their appropriateness in the face of the changed nature in software development. Software security assurance could benefit by switching from a process-based to an artefact-based approach. Further, security evaluation might need to be more incremental, automated and decentralized. We believe this can be achieved by supporting mechanisms for lightweight and scalable screenings that are applicable to the entire population of software components albeit there might be a price to pay.

Submitted 4 March, 2021; originally announced March 2021.

Comments: 7 pages, 1 figure, to be published in Proceedings of International Conference on Software Engineering - New Ideas and Emerging Results

ACM Class: D.2.0; D.2.13

arXiv:2103.03317

Technical Leverage in a Software Ecosystem: Development Opportunities and Security Risks

Authors: Fabio Massacci, Ivan Pashchenko

Abstract: In finance, leverage is the ratio between assets borrowed from others and one's own assets. A matching situation is present in software: by using free open-source software (FOSS) libraries a developer leverages on other people's code to multiply the offered functionalities with a much smaller own codebase. In finance as in software, leverage magnifies profits when returns from borrowing exceed costs of integration, but it may also magnify losses, in particular in the presence of security vulnerabilities. We aim to understand the level of technical leverage in the FOSS ecosystem and whether it can be a potential source of security vulnerabilities. Also, we introduce two metrics change distance and change direction to capture the amount and the evolution of the dependency on third-party libraries. The application of the proposed metrics on 8494 distinct library versions from the FOSS Maven-based Java libraries shows that small and medium libraries (less than 100KLoC) have disproportionately more leverage on FOSS dependencies in comparison to large libraries. We show that leverage pays off as leveraged libraries only add a 4% delay in the time interval between library releases while providing four times more code than their own. However, libraries with such leverage (i.e., 75% of libraries in our sample) also have 1.6 higher odds of being vulnerable in comparison to the libraries with lower leverage. We provide an online demo for computing the proposed metrics for real-world software libraries available under the following URL: https://techleverage.eu/.

Submitted 4 March, 2021; originally announced March 2021.

Comments: 14 pages, 5 figures, to be published in Proceedings of International Conference on Software Engineering (ICSE 2021)

ACM Class: D.2.8; D.2.13

arXiv:2011.06244

A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits

Authors: Steffen Herbold, Alexander Trautsch, Benjamin Ledel, Alireza Aghamohammadi, Taher Ahmed Ghaleb, Kuljit Kaur Chahal, Tim Bossenmaier, Bhaveet Nagaria, Philip Makedonski, Matin Nili Ahmadabadi, Kristof Szabados, Helge Spieker, Matej Madeja, Nathaniel Hoy, Valentina Lenarduzzi, Shangwen Wang, Gema Rodríguez-Pérez, Ricardo Colomo-Palacios, Roberto Verdecchia, Paramvir Singh, Yihao Qin, Debasish Chakroborti, Willard Davis, Vijay Walunj, Hongjun Wu, et al. (23 additional authors not shown)

Abstract: Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise.

Submitted 13 October, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

Comments: Status: Accepted at Empirical Software Engineering

arXiv:1808.09753

Vulnerable Open Source Dependencies: Counting Those That Matter

Authors: Ivan Pashchenko, Henrik Plate, Serena Elisa Ponta, Antonino Sabetta, Fabio Massacci

Abstract: BACKGROUND: Vulnerable dependencies are a known problem in today's open-source software ecosystems because OSS libraries are highly interconnected and developers do not always update their dependencies. AIMS: In this paper we aim to present a precise methodology, that combines the code-based analysis of patches with information on build, test, update dates, and group extracted from the very code repository, and therefore, caters to the needs of industrial practice for correct allocation of development and audit resources. METHOD: To understand the industrial impact of the proposed methodology, we considered the 200 most popular OSS Java libraries used by SAP in its own software. Our analysis included 10905 distinct GAVs (group, artifact, version) when considering all the library versions. RESULTS: We found that about 20% of the dependencies affected by a known vulnerability are not deployed, and therefore, they do not represent a danger to the analyzed library because they cannot be exploited in practice. Developers of the analyzed libraries are able to fix (and actually responsible for) 82% of the deployed vulnerable dependencies. The vast majority (81%) of vulnerable dependencies may be fixed by simply updating to a new version, while 1% of the vulnerable dependencies in our sample are halted, and therefore, potentially require a costly mitigation strategy. CONCLUSIONS: Our case study shows that the correct counting allows software development companies to receive actionable information about their library dependencies, and therefore, correctly allocate costly development and audit resources, which is spent inefficiently in case of distorted measurements.

Submitted 29 August, 2018; originally announced August 2018.

Comments: This is a pre-print of the paper that appears, with the same title, in the proceedings of the 12th International Symposium on Empirical Software Engineering and Measurement, 2018

arXiv:1712.02875

One More Way to Encrypt a Message

Authors: Irina Pashchenko

Abstract: This work describes an example of an application of a novel method for symmetric cryptography. Its purpose is to show how a regular message can be encrypted and then decrypted in an easy, yet secure way. The encrypting method introduced in this work is different from others because it involves decimals as well as integers, encrypting the same initial message differently every time, and inserting misleading digits into every encrypted message, thus making the task of breaking the code even harder. A C++ program was written to support each chapter.

Submitted 6 August, 2023; v1 submitted 7 December, 2017; originally announced December 2017.

Comments: 33 pages, 5 formulas, 3 C++ programs

arXiv:cs/9906022

Zero-Parity Stabbing Information

Authors: Joseph O'Rourke, Irena Pashchenko

Abstract: Everett et al. introduced several varieties of stabbing information for the lines determined by pairs of vertices of a simple polygon P, and established their relationships to vertex visibility and other combinatorial data. In the same spirit, we define the "zero-parity (ZP) stabbing information" to be a natural weakening of their "weak stabbing information," retaining only the distinction among {zero, odd, even>0} in the number of polygon edges stabbed. Whereas the weak stabbing information's relation to visibility remains an open problem, we completely settle the analogous questions for zero-parity information, with three results: (1) ZP information is insufficient to distinguish internal from external visibility graph edges; (2) but it does suffice for all polygons that avoid a certain complex substructure; and (3) the natural generalization of ZP information to the continuous case of smooth curves does distinguish internal from external visibility.

Submitted 22 June, 1999; originally announced June 1999.

ACM Class: F.2.2

Journal ref: Proc. Japan Conf. Discrete Comput. Geom. '98, Dec. 1998, 93--97

Showing 1–6 of 6 results for author: Pashchenko, I