Search | arXiv e-print repository

6GSoft: Software for Edge-to-Cloud Continuum

Authors: Muhammad Azeem Akbar, Matteo Esposito, Sami Hyrynsalmi, Karthikeyan Dinesh Kumar, Valentina Lenarduzzi, Xiaozhou Li, Ali Mehraj, Tommi Mikkonen, Sergio Moreschini, Niko Mäkitalo, Markku Oivo, Anna-Sofia Paavonen, Risha Parveen, Kari Smolander, Ruoyu Su, Kari Systä, Davide Taibi, Nan Yang, Zheying Zhang, Muhammad Zohaib

Abstract: In the era of 6G, develo** and managing software requires cutting-edge software engineering (SE) theories and practices tailored for such complexity across a vast number of connected edge devices. Our project aims to lead the development of sustainable methods and energy-efficient orchestration models specifically for edge environments, enhancing architectural support driven by AI for contempora… ▽ More In the era of 6G, develo** and managing software requires cutting-edge software engineering (SE) theories and practices tailored for such complexity across a vast number of connected edge devices. Our project aims to lead the development of sustainable methods and energy-efficient orchestration models specifically for edge environments, enhancing architectural support driven by AI for contemporary edge-to-cloud continuum computing. This initiative seeks to position Finland at the forefront of the 6G landscape, focusing on sophisticated edge orchestration and robust software architectures to optimize the performance and scalability of edge networks. Collaborating with leading Finnish universities and companies, the project emphasizes deep industry-academia collaboration and international expertise to address critical challenges in edge orchestration and software architecture, aiming to drive significant advancements in software productivity and market impact. △ Less

Submitted 9 July, 2024; v1 submitted 8 July, 2024; originally announced July 2024.

arXiv:2406.17354 [pdf, other]

On the correlation between Architectural Smells and Static Analysis Warnings

Authors: Matteo Esposito, Mikel Robredo, Francesca Arcelli Fontana, Valentina Lenarduzzi

Abstract: Background. Software quality assurance is essential during software development and maintenance. Static Analysis Tools (SAT) are widely used for assessing code quality. Architectural smells are becoming more daunting to address and evaluate among quality issues. Objective. We aim to understand the relationships between static analysis warnings (SAW) and architectural smells (AS) to guide develop… ▽ More Background. Software quality assurance is essential during software development and maintenance. Static Analysis Tools (SAT) are widely used for assessing code quality. Architectural smells are becoming more daunting to address and evaluate among quality issues. Objective. We aim to understand the relationships between static analysis warnings (SAW) and architectural smells (AS) to guide developers/maintainers in focusing their efforts on SAWs more prone to co-occurring with AS. Method. We performed an empirical study on 103 Java projects totaling 72 million LOC belonging to projects from a vast set of domains, and 785 SAW detected by four SAT, Checkstyle, Findbugs, PMD, SonarQube, and 4 architectural smells detected by ARCAN tool. We analyzed how SAWs influence AS presence. Finally, we proposed an AS remediation effort prioritization based on SAW severity and SAW proneness to specific ASs. Results. Our study reveals a moderate correlation between SAWs and ASs. Different combinations of SATs and SAWs significantly affect AS occurrence, with certain SAWs more likely to co-occur with specific ASs. Conversely, 33.79% of SAWs act as "healthy carriers", not associated with any ASs. Conclusion. Practitioners can ignore about a third of SAWs and focus on those most likely to be associated with ASs. Prioritizing AS remediation based on SAW severity or SAW proneness to specific ASs results in effective rankings like those based on AS severity. △ Less

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.10273 [pdf]

Beyond Words: On Large Language Models Actionability in Mission-Critical Risk Analysis

Authors: Matteo Esposito, Francesco Palagiano, Valentina Lenarduzzi

Abstract: Context. Risk analysis assesses potential risks in specific scenarios. Risk analysis principles are context-less; the same methodology can be applied to a risk connected to health and information technology security. Risk analysis requires a vast knowledge of national and international regulations and standards and is time and effort-intensive. A large language model can quickly summarize informat… ▽ More Context. Risk analysis assesses potential risks in specific scenarios. Risk analysis principles are context-less; the same methodology can be applied to a risk connected to health and information technology security. Risk analysis requires a vast knowledge of national and international regulations and standards and is time and effort-intensive. A large language model can quickly summarize information in less time than a human and can be fine-tuned to specific tasks. Aim. Our empirical study aims to investigate the effectiveness of Retrieval-Augmented Generation and fine-tuned LLM in Risk analysis. To our knowledge, no prior study has explored its capabilities in risk analysis. Method. We manually curated \totalscenarios unique scenarios leading to \totalsamples representative samples from over 50 mission-critical analyses archived by the industrial context team in the last five years. We compared the base GPT-3.5 and GPT-4 models versus their Retrieval-Augmented Generation and fine-tuned counterparts. We employ two human experts as competitors of the models and three other three human experts to review the models and the former human expert's analysis. The reviewers analyzed 5,000 scenario analyses. Results and Conclusions. HEs demonstrated higher accuracy, but LLMs are quicker and more actionable. Moreover, our findings show that RAG-assisted LLMs have the lowest hallucination rates, effectively uncovering hidden risks and complementing human expertise. Thus, the choice of model depends on specific needs, with FTMs for accuracy, RAG for hidden risks discovery, and base models for comprehensiveness and actionability. Therefore, experts can leverage LLMs for an effective complementing companion in risk analysis within a condensed timeframe. They can also save costs by averting unnecessary expenses associated with implementing unwarranted countermeasures. △ Less

Submitted 11 June, 2024; originally announced June 2024.

arXiv:2311.07947 [pdf, ps, other]

Towards a Technical Debt for Recommender System

Authors: Sergio Moreschini, Ludovik Coba, Valentina Lenarduzzi

Abstract: Balancing the management of technical debt within recommender systems requires effectively juggling the introduction of new features with the ongoing maintenance and enhancement of the current system. Within the realm of recommender systems, technical debt encompasses the trade-offs and expedient choices made during the development and upkeep of the recommendation system, which could potentially h… ▽ More Balancing the management of technical debt within recommender systems requires effectively juggling the introduction of new features with the ongoing maintenance and enhancement of the current system. Within the realm of recommender systems, technical debt encompasses the trade-offs and expedient choices made during the development and upkeep of the recommendation system, which could potentially have adverse effects on its long-term performance, scalability, and maintainability. In this vision paper, our objective is to kickstart a research direction regarding Technical Debt in Recommender Systems. We identified 15 potential factors, along with detailed explanations outlining why it is advisable to consider them. △ Less

Submitted 11 December, 2023; v1 submitted 14 November, 2023; originally announced November 2023.

arXiv:2311.03114 [pdf, other]

Ignoring Time Dependence in Software Engineering Data. A Mistake

Authors: Mikel Robredo, Nyyti Saarimaki, Rafael Penaloza, Valentina Lenarduzzi

Abstract: Researchers often delve into the connections between different factors derived from the historical data of software projects. For example, scholars have devoted their endeavors to the exploration of associations among these factors. However, a significant portion of these studies has failed to consider the limitations posed by the temporal interdependencies among these variables and the potential… ▽ More Researchers often delve into the connections between different factors derived from the historical data of software projects. For example, scholars have devoted their endeavors to the exploration of associations among these factors. However, a significant portion of these studies has failed to consider the limitations posed by the temporal interdependencies among these variables and the potential risks associated with the use of statistical methods ill-suited for analyzing data with temporal connections. Our goal is to highlight the consequences of neglecting time dependence during data analysis in current research. We pinpointed out certain potential problems that arise when disregarding the temporal aspect in the data, and support our argument with both theoretical and real examples. △ Less

Submitted 12 November, 2023; v1 submitted 6 November, 2023; originally announced November 2023.

arXiv:2308.02843 [pdf, other]

One Microservice per Developer: Is This the Trend in OSS?

Authors: Dario Amoroso d'Aragona, Xiaoxhou Li, Tomas Cerny, Andrea Janes, Valentina Lenarduzzi, Davide Taibi

Abstract: When develo** and managing microservice systems, practitioners suggest that each microservice should be owned by a particular team. In effect, there is only one team with the responsibility to manage a given service. Consequently, one developer should belong to only one team. This practice of "one-microservice-per-developer" is especially prevalent in large projects with an extensive development… ▽ More When develo** and managing microservice systems, practitioners suggest that each microservice should be owned by a particular team. In effect, there is only one team with the responsibility to manage a given service. Consequently, one developer should belong to only one team. This practice of "one-microservice-per-developer" is especially prevalent in large projects with an extensive development team. Based on the bazaar-style software development model of Open Source Projects, in which different programmers, like vendors at a bazaar, offer to help out develo** different parts of the system, this article investigates whether we can observe the "one-microservice-per-developer" behavior, a strategy we assume anticipated within microservice based Open Source Projects. We conducted an empirical study among 38 microservice-based OS projects. Our findings indicate that the strategy is rarely respected by open-source developers except for projects that have dedicated DevOps teams. △ Less

Submitted 5 August, 2023; originally announced August 2023.

arXiv:2306.02036 [pdf, other]

On the Empirical Evidence of Microservice Logical Coupling. A Registered Report

Authors: Dario Amoroso d Aragona, Luca Pascarella, Andrea Janes, Valentina Lenarduzzi, Rafael Penaloza, Davide Taibi

Abstract: [Context] Coupling is a widely discussed metric by software engineers while develo** complex software systems, often referred to as a crucial factor and symptom of a poor or good design. Nevertheless, measuring the logical coupling among microservices and analyzing the interactions between services is non-trivial because it demands runtime information in the form of log files, which are not alwa… ▽ More [Context] Coupling is a widely discussed metric by software engineers while develo** complex software systems, often referred to as a crucial factor and symptom of a poor or good design. Nevertheless, measuring the logical coupling among microservices and analyzing the interactions between services is non-trivial because it demands runtime information in the form of log files, which are not always accessible. [Objective and Method] In this work, we propose the design of a study aimed at empirically validating the Microservice Logical Coupling (MLC) metric presented in our previous study. In particular, we plan to empirically study Open Source Systems (OSS) built using a microservice architecture. [Results] The result of this work aims at corroborating the effectiveness and validity of the MLC metric. Thus, we will gather empirical evidence and develop a methodology to analyze and support the claims regarding the MLC metric. Furthermore, we establish its usefulness in evaluating and understanding the logical coupling among microservices. △ Less

Submitted 3 June, 2023; originally announced June 2023.

arXiv:2306.02034 [pdf, other]

Does Microservices Adoption Impact the Development Velocity? A Cohort Study. A Registered Report

Authors: Nyyti Saarimaki, Mikel Robredo, Sira vegas, Natalia Juristo, David Taibi, Valentina Lenarduzzi

Abstract: [Context] Microservices enable the decomposition of applications into small and independent services connected together. The independence between services could positively affect the development velocity of a project, which is considered an important metric measuring the time taken to implement features and fix bugs. However, no studies have investigated the connection between microservices and de… ▽ More [Context] Microservices enable the decomposition of applications into small and independent services connected together. The independence between services could positively affect the development velocity of a project, which is considered an important metric measuring the time taken to implement features and fix bugs. However, no studies have investigated the connection between microservices and development velocity. [Objective and Method] The goal of this study plan is to investigate the effect microservices have on development velocity. The study compares GitHub projects adopting microservices from the beginning and similar projects using monolithic architectures. We designed this study using a cohort study method, to enable obtaining a high level of evidence. [Results] The result of this work enables the confirmation of the effective improvement of the development velocity of microservices. Moreover, this study will contribute to the body of knowledge of empirical methods being among the first works adopting the cohort study methodology. △ Less

Submitted 21 June, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

arXiv:2305.00760 [pdf, other]

Breaks and Code Quality: Investigating the Impact of Forgetting on Software Development. A Registered Report

Authors: Dario Amoroso d'Aragona, Luca Pascarella, Andrea Janes, Valentina Lenarduzzi, Rafael Penaloza, Davide Taibi

Abstract: Developers interrupting their participation in a project might slowly forget critical information about the code, such as its intended purpose, structure, the impact of external dependencies, and the approach used for implementation. Forgetting the implementation details can have detrimental effects on software maintenance, comprehension, knowledge sharing, and developer productivity, resulting in… ▽ More Developers interrupting their participation in a project might slowly forget critical information about the code, such as its intended purpose, structure, the impact of external dependencies, and the approach used for implementation. Forgetting the implementation details can have detrimental effects on software maintenance, comprehension, knowledge sharing, and developer productivity, resulting in bugs, and other issues that can negatively influence the software development process. Therefore, it is crucial to ensure that developers have a clear understanding of the codebase and can work efficiently and effectively even after long interruptions. This registered report proposes an empirical study aimed at investigating the impact of the developer's activity breaks duration and different code quality properties. In particular, we aim at understanding if the amount of activity in a project impact the code quality, and if developers with different activity profiles show different impacts on code quality. The results might be useful to understand if it is beneficial to promote the practice of develo** multiple projects in parallel, or if it is more beneficial to reduce the number of projects each developer contributes. △ Less

Submitted 28 August, 2023; v1 submitted 1 May, 2023; originally announced May 2023.

arXiv:2304.03254 [pdf, other]

Toward End-to-End MLOps Tools Map: A Preliminary Study based on a Multivocal Literature Review

Authors: Sergio Moreschi, Gilberto Recupito, Valentina Lenarduzzi, Fabio Palomba, David Hastbacka, Davide Taibi

Abstract: MLOps tools enable continuous development of machine learning, following the DevOps process. Different MLOps tools have been presented on the market, however, such a number of tools often create confusion on the most appropriate tool to be used in each DevOps phase. To overcome this issue, we conducted a multivocal literature review map** 84 MLOps tools identified from 254 Primary Studies, on th… ▽ More MLOps tools enable continuous development of machine learning, following the DevOps process. Different MLOps tools have been presented on the market, however, such a number of tools often create confusion on the most appropriate tool to be used in each DevOps phase. To overcome this issue, we conducted a multivocal literature review map** 84 MLOps tools identified from 254 Primary Studies, on the DevOps phases, highlighting their purpose, and possible incompatibilities. The result of this work will be helpful to both practitioners and researchers, as a starting point for future investigations on MLOps tools, pipelines, and processes. △ Less

Submitted 6 April, 2023; originally announced April 2023.

arXiv:2303.17862 [pdf, other]

Architecture Smells vs. Concurrency Bugs: an Exploratory Study and Negative Results

Authors: Damian Andrew Tamburri, Francesca Arcelli Fontana, Riccardo Roveda, Valentina Lenarduzzi

Abstract: Technical debt occurs in many different forms across software artifacts. One such form is connected to software architectures where debt emerges in the form of structural anti-patterns across architecture elements, namely, architecture smells. As defined in the literature, ``Architecture smells are recurrent architectural decisions that negatively impact internal system quality", thus increasing t… ▽ More Technical debt occurs in many different forms across software artifacts. One such form is connected to software architectures where debt emerges in the form of structural anti-patterns across architecture elements, namely, architecture smells. As defined in the literature, ``Architecture smells are recurrent architectural decisions that negatively impact internal system quality", thus increasing technical debt. In this paper, we aim at exploring whether there exist manifestations of architectural technical debt beyond decreased code or architectural quality, namely, whether there is a relation between architecture smells (which primarily reflect structural characteristics) and the occurrence of concurrency bugs (which primarily manifest at runtime). We study 125 releases of 5 large data-intensive software systems to reveal that (1) several architecture smells may in fact indicate the presence of concurrency problems likely to manifest at runtime but (2) smells are not correlated with concurrency in general -- rather, for specific concurrency bugs they must be combined with an accompanying articulation of specific project characteristics such as project distribution. As an example, a cyclic dependency could be present in the code, but the specific execution-flow could be never executed at runtime. △ Less

Submitted 31 March, 2023; originally announced March 2023.

arXiv:2303.07722 [pdf, other]

Does Cyclomatic or Cognitive Complexity Better Represents Code Understandability? An Empirical Investigation on the Developers Perception

Authors: Valentina Lenarduzzi, Terhi Kilamo, Andrea Janes

Abstract: Background. Code understandability is fundamental. Developers need to clearly understand the code they are modifying. A low understandability can increase the amount of coding effort and misinterpretation of code has impact on the entire development process. Ideally, developers should write clear and understandable code with the least possible effort. Objective. The goal of this work is to investi… ▽ More Background. Code understandability is fundamental. Developers need to clearly understand the code they are modifying. A low understandability can increase the amount of coding effort and misinterpretation of code has impact on the entire development process. Ideally, developers should write clear and understandable code with the least possible effort. Objective. The goal of this work is to investigate if the McCabe Cyclomatic Complexity or the Cognitive Complexity can be a good predictor for the developers' perceived code understandability to understand which of the two complexities can be used as criteria to evaluate if a piece of code is understandable. Method. We designed and conducted an empirical study among 216 junior developers with professional experience ranging from one to four years. We asked them to manually inspect and rate the understandability of 12 Java classes that exhibit different levels of Cyclomatic and Cognitive Complexity. Results. Cognitive Complexity slightly outperforms the Cyclomatic Complexity to predict the developers' perceived understandability. Conclusion. The identification of a clear and validated measure for Code Complexity is still an open issue. Neither the old fashioned McCabe Cyclomatic Complexity and the most recent Cognitive Complexity are good predictors for code understandability, at least when considering the complexity perceived by junior developers. △ Less

Submitted 14 March, 2023; originally announced March 2023.

arXiv:2207.06875 [pdf, other]

Open Tracing Tools: Overview and Critical Comparison

Authors: Andrea Janes, Xiaozhou Li, Valentina Lenarduzzi

Abstract: Background. Co** with the rapid growing complexity in contemporary software architecture, tracing has become an increasingly critical practice and been adopted widely by software engineers. By adopting tracing tools, practitioners are able to monitor, debug, and optimize distributed software architectures easily. However, with excessive number of valid candidates, researchers and practitioners h… ▽ More Background. Co** with the rapid growing complexity in contemporary software architecture, tracing has become an increasingly critical practice and been adopted widely by software engineers. By adopting tracing tools, practitioners are able to monitor, debug, and optimize distributed software architectures easily. However, with excessive number of valid candidates, researchers and practitioners have a hard time finding and selecting the suitable tracing tools by systematically considering their features and advantages.Objective. To such a purpose, this paper aims to provide an overview of popular Open tracing tools via comparison. Method. Herein, we first identified \ra{30} tools in an objective, systematic, and reproducible manner adopting the Systematic Multivocal Literature Review protocol. Then, we characterized each tool looking at the 1) measured features, 2) popularity both in peer-reviewed literature and online media, and 3) benefits and issues. We used topic modeling and sentiment analysis to extract and summarize the benefits and issues. Specially, we adopted ChatGPT to support the topic interpretation. Results. As a result, this paper presents a systematic comparison amongst the selected tracing tools in terms of their features, popularity, benefits and issues. Conclusion. The result mainly shows that each tracing tool provides a unique combination of features with also different pros and cons. The contribution of this paper is to provide the practitioners better understanding of the tracing tools facilitating their adoption. △ Less

Submitted 23 June, 2023; v1 submitted 14 July, 2022; originally announced July 2022.

arXiv:2206.08718 [pdf, other]

CATTO: Just-in-time Test Case Selection and Execution

Authors: Dario Amoroso d'Aragona, Fabiano Pecorelli, Simone Romano, Giuseppe Scanniello, Maria Teresa Baldassarre, Andrea Janes, Valentina Lenarduzzi

Abstract: Regression testing ensures a System Under Test (SUT) still works as expected after changes to it. The simplest approach for regression testing consists of re-running the entire test suite against the changed version of the SUT. However, this might result in a time- and resource-consuming process; \eg when dealing with large and/or complex SUTs and test suits. To work around this problem, test Case… ▽ More Regression testing ensures a System Under Test (SUT) still works as expected after changes to it. The simplest approach for regression testing consists of re-running the entire test suite against the changed version of the SUT. However, this might result in a time- and resource-consuming process; \eg when dealing with large and/or complex SUTs and test suits. To work around this problem, test Case Selection (TCS) strategies can be used. Such strategies seek to build a temporary test suite comprising only those test cases that are relevant to the changes made to the SUT, so avoiding executing those test cases that do not exercise the changed parts. In this paper, we introduce CATTO (Commit Adaptive Tool for Test suite Optimization) and CATTO INTELLIJ PLUGIN. The former is a tool implementing a TCS strategy for SUTs written in Java, while the latter is a wrapper to allow developers to use \toolName directly in IntelliJ. We also conducted a preliminary evaluation of CATTO on seven open-source Java SUTs in terms of reductions in test-suite size, fault-reveling test cases, and fault-detection capability. The results are promising and suggest that CATTO can be of help to developers when performing regression testing. The video demo and the documentation of the tool is available at: \url{https://catto-tool.github.io/} △ Less

Submitted 17 June, 2022; originally announced June 2022.

arXiv:2203.12697 [pdf, other]

What is Software Quality for AI Engineers? Towards a Thinning of the Fog

Authors: Valentina Golendukhina, Valentina Lenarduzzi, Michael Felderer

Abstract: It is often overseen that AI-enabled systems are also software systems and therefore rely on software quality assurance (SQA). Thus, the goal of this study is to investigate the software quality assurance strategies adopted during the development, integration, and maintenance of AI/ML components and code. We conducted semi-structured interviews with representatives of ten Austrian SMEs that develo… ▽ More It is often overseen that AI-enabled systems are also software systems and therefore rely on software quality assurance (SQA). Thus, the goal of this study is to investigate the software quality assurance strategies adopted during the development, integration, and maintenance of AI/ML components and code. We conducted semi-structured interviews with representatives of ten Austrian SMEs that develop AI-enabled systems. A qualitative analysis of the interview data identified 12 issues in the development of AI/ML components. Furthermore, we identified when quality issues arise in AI/ML components and how they are detected. The results of this study should guide future work on software quality assurance processes and techniques for AI/ML components. △ Less

Submitted 23 March, 2022; originally announced March 2022.

Comments: 9 pages, 3 figures, accepted for CAIN22 Conference

arXiv:2112.14927 [pdf, other]

An Empirical Study of Security Practices for Microservices Systems

Authors: Ali Rezaei Nasab, Mojtaba Shahin, Seyed Ali Hoseyni Raviz, Peng Liang, Amir Mashmool, Valentina Lenarduzzi

Abstract: Despite the numerous benefits of microservices systems, security has been a critical issue in such systems. Several factors explain this difficulty, including a knowledge gap among microservices practitioners on properly securing a microservices system. To (partially) bridge this gap, we conducted an empirical study. We first manually analyzed 861 microservices security points, including 567 issue… ▽ More Despite the numerous benefits of microservices systems, security has been a critical issue in such systems. Several factors explain this difficulty, including a knowledge gap among microservices practitioners on properly securing a microservices system. To (partially) bridge this gap, we conducted an empirical study. We first manually analyzed 861 microservices security points, including 567 issues, 9 documents, and 3 wiki pages from 10 GitHub open-source microservices systems and 306 Stack Overflow posts concerning security in microservices systems. In this study, a microservices security point is referred to as "a GitHub issue, a Stack Overflow post, a document, or a wiki page that entails 5 or more microservices security paragraphs". Our analysis led to a catalog of 28 microservices security practices. We then ran a survey with 74 microservices practitioners to evaluate the usefulness of these 28 practices. Our findings demonstrate that the survey respondents affirmed the usefulness of the 28 practices. We believe that the catalog of microservices security practices can serve as a valuable resource for microservices practitioners to more effectively address security issues in microservices systems. It can also inform the research community of the required or less explored areas to develop microservices-specific security practices and tools. △ Less

Submitted 18 November, 2022; v1 submitted 30 December, 2021; originally announced December 2021.

Comments: Preprint accepted for publication in Journal of Systems and Software, 2022

arXiv:2108.12411 [pdf, ps, other]

doi 10.1145/3475716.3484273

Towards a Methodology for Participant Selection in Software Engineering Experiments. A Vision of the Future

Authors: Valentina Lenarduzzi, Oscar Dieste, Davide Fucci, Sira Vegas

Abstract: Background. Software Engineering (SE) researchers extensively perform experiments with human subjects. Well-defined samples are required to ensure external validity. Samples are selected \textit{purposely} or by \textit{convenience}, limiting the generalizability of results. Objective. We aim to depict the current status of participants selection in empirical SE, identifying the main threats and h… ▽ More Background. Software Engineering (SE) researchers extensively perform experiments with human subjects. Well-defined samples are required to ensure external validity. Samples are selected \textit{purposely} or by \textit{convenience}, limiting the generalizability of results. Objective. We aim to depict the current status of participants selection in empirical SE, identifying the main threats and how they are mitigated. We draft a robust approach to participants' selection. Method. We reviewed existing participants' selection guidelines in SE, and performed a preliminary literature review to find out how participants' selection is conducted in SE in practice. % and 3) we summarized the main issues identified. Results. We outline a new selection methodology, by 1) defining the characteristics of the desired population, 2) locating possible sources of sampling available for researchers, and 3) identifying and reducing the "distance" between the selected sample and its corresponding population. Conclusion. We propose a roadmap to develop and empirically validate the selection methodology. △ Less

Submitted 27 August, 2021; originally announced August 2021.

arXiv:2105.14232 [pdf, other]

Identification and Measurement of Technical Debt Requirements in Software Development: a Systematic Literature Review

Authors: Ana Melo, Roberta Fagundes, Valentina Lenarduzzi, Wylliams Santos

Abstract: Context: Technical Debt requirements are related to the distance between the ideal value of the specification and the system's actual implementation, which are consequences of strategic decisions for immediate gains, or unintended changes in context. To ensure the evolution of the software, it is necessary to keep it managed. Identification and measurement are the first two stages of the managemen… ▽ More Context: Technical Debt requirements are related to the distance between the ideal value of the specification and the system's actual implementation, which are consequences of strategic decisions for immediate gains, or unintended changes in context. To ensure the evolution of the software, it is necessary to keep it managed. Identification and measurement are the first two stages of the management process; however, they are little explored in academic research in requirements engineering. Objective: We aimed at investigating which evidence helps to strengthen the process of TD requirements management, including identification and measurement. Method: We conducted a Systematic Literature Review through manual and automatic searches considering 7499 studies from 2010 to 2020, and including 61 primary studies. Results: We identified some causes related to Technical Debt requirements, existing strategies to help in the identification and measurement, and metrics to support the measurement stage. Conclusion: Studies on TD requirements are still preliminary, especially on management tools. Yet, not enough attention is given to interpersonal issues, which are difficulties encountered when performing such activities, and therefore also require research. Finally, the provision of metrics to help measure TD is part of this work's contribution, providing insights into the application in the requirements context. △ Less

Submitted 14 July, 2021; v1 submitted 29 May, 2021; originally announced May 2021.

arXiv:2103.11321 [pdf, other]

Fault Prediction based on Software Metrics and SonarQube Rules. Machine or Deep Learning?

Authors: Francesco Lomio, Sergio Moreschini, Valentina Lenarduzzi

Abstract: Background. Developers spend more time fixing bugs and refactoring the code to increase the maintainability than develo** new features. Researchers investigated the code quality impact on fault-proneness focusing on code smells and code metrics. Objective. We aim at advancing fault-inducing commit prediction based on SonarQube considering the contribution provided by each rule and metric. Method… ▽ More Background. Developers spend more time fixing bugs and refactoring the code to increase the maintainability than develo** new features. Researchers investigated the code quality impact on fault-proneness focusing on code smells and code metrics. Objective. We aim at advancing fault-inducing commit prediction based on SonarQube considering the contribution provided by each rule and metric. Method. We designed and conducted a case study among 33 Java projects analyzed with SonarQube and SZZ to identify fault-inducing and fault-fixing commits. Moreover, we investigated fault-proneness of each SonarQube rule and metric using Machine and Deep Learning models. Results. We analyzed 77,932 commits that contain 40,890 faults and infected by more than 174 SonarQube rules violated 1,9M times, on which there was calculated 24 software metrics available by the tool. Compared to machine learning models, deep learning provide a more accurate fault detection accuracy and allowed us to accurately identify the fault-prediction power of each SonarQube rule. As a result, fourteen of the 174 violated rules has an importance higher than 1\% and account for 30\% of the total fault-proneness importance, while the fault proneness of the remaining 165 rules is negligible. Conclusion. Future works might consider the adoption of timeseries analysis and anomaly detection techniques to better and more accurately detect the rules that impact fault-proneness. △ Less

Submitted 21 March, 2021; originally announced March 2021.

arXiv:2101.08832 [pdf, other]

A Critical Comparison on Six Static Analysis Tools: Detection, Agreement, and Precision

Authors: Valentina Lenarduzzi, Savanna Lujan, Nyyti Saarimaki, Fabio Palomba

Abstract: Background. Developers use Automated Static Analysis Tools (ASATs) to control for potential quality issues in source code, including defects and technical debt. Tool vendors have devised quite a number of tools, which makes it harder for practitioners to select the most suitable one for their needs. To better support developers, researchers have been conducting several studies on ASATs to favor th… ▽ More Background. Developers use Automated Static Analysis Tools (ASATs) to control for potential quality issues in source code, including defects and technical debt. Tool vendors have devised quite a number of tools, which makes it harder for practitioners to select the most suitable one for their needs. To better support developers, researchers have been conducting several studies on ASATs to favor the understanding of their actual capabilities. Aims. Despite the work done so far, there is still a lack of knowledge regarding (1) which source quality problems can actually be detected by static analysis tool warnings, (2) what is their agreement, and (3) what is the precision of their recommendations. We aim at bridging this gap by proposing a large-scale comparison of six popular static analysis tools for Java projects: Better Code Hub, CheckStyle, Coverity Scan, Findbugs, PMD, and SonarQube. Method. We analyze 47 Java projects and derive a taxonomy of warnings raised by 6 state-of-the-practice ASATs. To assess their agreement, we compared them by manually analyzing - at line-level - whether they identify the same issues. Finally, we manually evaluate the precision of the tools. Results. The key results report a comprehensive taxonomy of ASATs warnings, show little to no agreement among the tools and a low degree of precision. Conclusions. We provide a taxonomy that can be useful to researchers, practitioners, and tool vendors to map the current capabilities of the tools. Furthermore, our study provides the first overview on the agreement among different tools as well as an extensive analysis of their precision. △ Less

Submitted 21 January, 2021; originally announced January 2021.

arXiv:2011.06244 [pdf, other]

A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits

Authors: Steffen Herbold, Alexander Trautsch, Benjamin Ledel, Alireza Aghamohammadi, Taher Ahmed Ghaleb, Kuljit Kaur Chahal, Tim Bossenmaier, Bhaveet Nagaria, Philip Makedonski, Matin Nili Ahmadabadi, Kristof Szabados, Helge Spieker, Matej Madeja, Nathaniel Hoy, Valentina Lenarduzzi, Shangwen Wang, Gema Rodríguez-Pérez, Ricardo Colomo-Palacios, Roberto Verdecchia, Paramvir Singh, Yihao Qin, Debasish Chakroborti, Willard Davis, Vijay Walunj, Hongjun Wu , et al. (23 additional authors not shown)

Abstract: Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Metho… ▽ More Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Methods: We use a crowd sourcing approach for manual labeling to validate which changes contribute to bug fixes for each line in bug fixing commits. Each line is labeled by four participants. If at least three participants agree on the same label, we have consensus. Results: We estimate that between 17% and 32% of all changes in bug fixing commits modify the source code to fix the underlying problem. However, when we only consider changes to the production code files this ratio increases to 66% to 87%. We find that about 11% of lines are hard to label leading to active disagreements between participants. Due to confirmed tangling and the uncertainty in our data, we estimate that 3% to 47% of data is noisy without manual untangling, depending on the use case. Conclusion: Tangled commits have a high prevalence in bug fixes and can lead to a large amount of noise in the data. Prior research indicates that this noise may alter results. As researchers, we should be skeptics and assume that unvalidated data is likely very noisy, until proven otherwise. △ Less

Submitted 13 October, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

Comments: Status: Accepted at Empirical Software Engineering

arXiv:2010.03525 [pdf]

Empirical Standards for Software Engineering Research

Authors: Paul Ralph, Nauman bin Ali, Sebastian Baltes, Domenico Bianculli, Jessica Diaz, Yvonne Dittrich, Neil Ernst, Michael Felderer, Robert Feldt, Antonio Filieri, Breno Bernard Nicolau de França, Carlo Alberto Furia, Greg Gay, Nicolas Gold, Daniel Graziotin, Pinjia He, Rashina Hoda, Natalia Juristo, Barbara Kitchenham, Valentina Lenarduzzi, Jorge Martínez, Jorge Melegati, Daniel Mendez, Tim Menzies, Jefferson Molleri , et al. (18 additional authors not shown)

Abstract: Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around resear… ▽ More Empirical Standards are natural-language models of a scientific community's expectations for a specific kind of study (e.g. a questionnaire survey). The ACM SIGSOFT Paper and Peer Review Quality Initiative generated empirical standards for research methods commonly used in software engineering. These living documents, which should be continuously revised to reflect evolving consensus around research best practices, will improve research quality and make peer review more effective, reliable, transparent and fair. △ Less

Submitted 4 March, 2021; v1 submitted 7 October, 2020; originally announced October 2020.

Comments: For the complete standards, supplements and other resources, see https://github.com/acmsigsoft/EmpiricalStandards

arXiv:1909.08933 [pdf, other]

doi 10.1016/j.infsof.2021.106600

From Monolithic Systems to Microservices: An Assessment Framework

Authors: Florian Auer, Valentina Lenarduzzi, Michael Felderer, Davide Taibi

Abstract: Context. Re-architecting monolithic systems with Microservices-based architecture is a common trend. Various companies are migrating to Microservices for different reasons. However, making such an important decision like re-architecting an entire system must be based on real facts and not only on gut feelings. Objective. The goal of this work is to propose an evidence-based decision support framew… ▽ More Context. Re-architecting monolithic systems with Microservices-based architecture is a common trend. Various companies are migrating to Microservices for different reasons. However, making such an important decision like re-architecting an entire system must be based on real facts and not only on gut feelings. Objective. The goal of this work is to propose an evidence-based decision support framework for companies that need to migrate to Microservices, based on the analysis of a set of characteristics and metrics they should collect before re-architecting their monolithic system. Method. We designed this study with a mixed-methods approach combining a Systematic Map** Study with a survey done in the form of interviews with professionals to derive the assessment framework based on Grounded Theory. Results. We identified a set consisting of information and metrics that companies can use to decide whether to migrate to Microservices or not. The proposed assessment framework, based on the aforementioned metrics, could be useful for companies if they need to migrate to Microservices and do not want to run the risk of failing to consider some important information. △ Less

Submitted 15 April, 2021; v1 submitted 19 September, 2019; originally announced September 2019.

Comments: Information and Software Technology(2021)

arXiv:1908.11590 [pdf, other]

doi 10.1016/j.jss.2020.110750

Some SonarQube Issues have a Significant but SmallEffect on Faults and Changes. A large-scale empirical study

Authors: Valentina Lenarduzzi, Nyyti Saarimäki, Davide Taibi

Abstract: Context. Companies commonly invest effort to remove technical issues believed to impact software qualities, such as removing anti-patterns or coding styles violations. Objective. Our aim is to analyze the diffuseness of Technical Debt (TD) items in software systems and to assess their impact on code changes and fault-proneness, considering also the type of TD items and their severity. Method. We c… ▽ More Context. Companies commonly invest effort to remove technical issues believed to impact software qualities, such as removing anti-patterns or coding styles violations. Objective. Our aim is to analyze the diffuseness of Technical Debt (TD) items in software systems and to assess their impact on code changes and fault-proneness, considering also the type of TD items and their severity. Method. We conducted a case study among 33 Java projects from the Apache Software Foundation (ASF) repository. We analyzed 726 commits containing 27K faults and 12M changes. The projects violated 173 SonarQube rules generating more than 95K TD items in more than 200K classes. Results. Clean classes (classes not affected by TD items) are less change-prone than dirty ones, but the difference between the groups is small. Clean classes are slightly more change-prone than classes affected by TD items of type Code Smell or Security Vulnerability. As for fault-proneness, there is no difference between clean and dirty classes. Moreover, we found a lot of incongruities in the type and severity level assigned by SonarQube. Conclusions. Our result can be useful for practitioners to understand which TD items they should refactor and for researchers to bridge the missing gaps. They can also support companies and tool vendors in identifying TD items as accurately as possible. △ Less

Submitted 30 August, 2019; originally announced August 2019.

Journal ref: Journal of Systems and Software Volume 170, December 2020, 110750

arXiv:1908.10337 [pdf]

doi 10.1007/978-3-030-29193-8_7

Continuous Architecting with Microservices and DevOps: A Systematic Map** Study

Authors: Davide Taibi, Valentina Lenarduzzi, Claus Pahl

Abstract: Context: Several companies are migrating their information systems into the Cloud. Microservices and DevOps are two of the most common adopted technologies. However, there is still a lack of understanding how to adopt a microservice-based architectural style and which tools and technique to use in a continuous architecting pipeline. Objective: We aim at characterizing the different microservice ar… ▽ More Context: Several companies are migrating their information systems into the Cloud. Microservices and DevOps are two of the most common adopted technologies. However, there is still a lack of understanding how to adopt a microservice-based architectural style and which tools and technique to use in a continuous architecting pipeline. Objective: We aim at characterizing the different microservice architectural style principles and patterns in order to map existing tools and techniques adopted in the context of DevOps. Methodology: We conducted a Systematic Map** Study identifying the goal and the research questions, the bibliographic sources, the search strings, and the selection criteria to retrieve the most relevant papers. Results: We identified several agreed microservice architectural principles and patterns widely adopted and reported in 23 case studies, together with a summary of the advantages, disadvantages, and lessons learned for each pattern from the case studies. Finally, we mapped the existing microservices-specific techniques in order to understand how to continuously deliver value in a DevOps pipeline. We depicted the current research, reporting gaps and trends. Conclusion: Different patterns emerge for different migration, orchestration, storage and deployment settings. The results also show the lack of empirical work on microservices-specific techniques, especially for the release phase in DevOps. △ Less

Submitted 27 August, 2019; originally announced August 2019.

Comments: this paper was mistakenly uploaded as arXiv:1908.04101v2, which has been subsequently replaced to the correct state

Journal ref: Cloud Computing and Services Science. CLOSER 2018 Selected papers. Communications in Computer and Information Science, vol 1073, pp. 126-151, Springer. 2019

arXiv:1908.09321 [pdf, other]

Does Code Quality Affect Pull Request Acceptance? An empirical study

Authors: Valentina Lenarduzzi, Vili Nikkola, Nyyti Saarimäki, Davide Taibi

Abstract: Background. Pull requests are a common practice for contributing and reviewing contributions, and are employed both in open-source and industrial contexts. One of the main goals of code reviews is to find defects in the code, allowing project maintainers to easily integrate external contributions into a project and discuss the code contributions. Objective. The goal of this paper is to understand… ▽ More Background. Pull requests are a common practice for contributing and reviewing contributions, and are employed both in open-source and industrial contexts. One of the main goals of code reviews is to find defects in the code, allowing project maintainers to easily integrate external contributions into a project and discuss the code contributions. Objective. The goal of this paper is to understand whether code quality is actually considered when pull requests are accepted. Specifically, we aim at understanding whether code quality issues such as code smells, antipatterns, and coding style violations in the pull request code affect the chance of its acceptance when reviewed by a maintainer of the project. Method. We conducted a case study among 28 Java open-source projects, analyzing the presence of 4.7 M code quality issues in 36 K pull requests. We analyzed further correlations by applying Logistic Regression and seven machine learning techniques (Decision Tree, Random Forest, Extremely Randomized Trees, AdaBoost, Gradient Boosting, XGBoost). Results. Unexpectedly, code quality turned out not to affect the acceptance of a pull request at all. As suggested by other works, other factors such as the reputation of the maintainer and the importance of the feature delivered might be more important than code quality in terms of pull request acceptance. Conclusions. Researchers already investigated the influence of the developers' reputation and the pull request acceptance. This is the first work investigating if quality of the code in pull requests affects the acceptance of the pull request or not. We recommend that researchers further investigate this topic to understand if different measures or different tools could provide some useful measures. △ Less

Submitted 25 August, 2019; originally announced August 2019.

arXiv:1908.04101 [pdf]

Microservices Anti Patterns: A Taxonomy

Authors: Davide Taibi, Valentina Lenarduzzi, Claus Pahl

Abstract: Several companies are re-architecting their monolithic information systems with microservices. However, many companies migrated without experience on microservices, mainly learning how to migrate from books or from practitioners' blogs. Because of the novelty of the topic, practitioners and consultancy are learning by doing how to migrate, thus facing several issues but also several benefits. In t… ▽ More Several companies are re-architecting their monolithic information systems with microservices. However, many companies migrated without experience on microservices, mainly learning how to migrate from books or from practitioners' blogs. Because of the novelty of the topic, practitioners and consultancy are learning by doing how to migrate, thus facing several issues but also several benefits. In this chapter, we introduce a catalog and a taxonomy of the most common microservices anti-patterns in order to identify common problems. Our anti-pattern catalogue is based on the experience summarized by different practitioners we interviewed in the last three years. We identified a taxonomy of 20 anti-patterns, including organizational (team oriented and technology/tool oriented) anti-patterns and technical (internal and communication) anti-patterns. The results can be useful to practitioners to avoid experiencing the same difficult situations in the systems they develop. Moreover, researchers can benefit of this catalog and further validate the harmfulness of the anti-patterns identified. △ Less

Submitted 21 August, 2019; v1 submitted 12 August, 2019; originally announced August 2019.

Comments: Microservices - Science and Engineering Springer 2019

arXiv:1908.01502 [pdf]

An Empirical Study on Technical Debt in a Finnish SME

Authors: Valentina Lenarduzzi, Teemu Orava, Nyyti Saarimäki, Kari Systä, Davide Taibi

Abstract: Objective. In this work, we report the experience of a Finnish SME in managing Technical Debt (TD), investigating the most common types of TD they faced in the past, their causes, and their effects. Method. We set up a focus group in the case-company, involving different roles. Results. The results showed that the most significant TD in the company stems from disagreements with the supplier and la… ▽ More Objective. In this work, we report the experience of a Finnish SME in managing Technical Debt (TD), investigating the most common types of TD they faced in the past, their causes, and their effects. Method. We set up a focus group in the case-company, involving different roles. Results. The results showed that the most significant TD in the company stems from disagreements with the supplier and lack of test automation. Specification and test TD are the most significant types of TD. Budget and time constraints were identified as the most important root causes of TD. Conclusion. TD occurs when time or budget is limited or the amount of work are not understood properly. However, not all postponed activities generated "debt". Sometimes the accumulation of TD helped meet deadlines without a major impact, while in other cases the cost for repaying the TD was much higher than the benefits. From this study, we learned that learning, careful estimations, and continuous improvement could be good strategies to mitigate TD. These strategies include iterative validation with customers, efficient communication with stakeholders, meta-cognition in estimations, and value orientation in budgeting and scheduling. △ Less

Submitted 5 August, 2019; originally announced August 2019.

Journal ref: International Symposium on Empirical Software Engineering and Measurement (ESEM), Brazil, 2019

arXiv:1908.00827 [pdf]

doi 10.1145/3345629.3345630

The Technical Debt Dataset

Authors: Valentina Lenarduzzi, Nyyti Saarimäki, Davide Taibi

Abstract: Technical Debt analysis is increasing in popularity as nowadays researchers and industry are adopting various tools for static code analysis to evaluate the quality of their code. Despite this, empirical studies on software projects are expensive because of the time needed to analyze the projects. In addition, the results are difficult to compare as studies commonly consider different projects. In… ▽ More Technical Debt analysis is increasing in popularity as nowadays researchers and industry are adopting various tools for static code analysis to evaluate the quality of their code. Despite this, empirical studies on software projects are expensive because of the time needed to analyze the projects. In addition, the results are difficult to compare as studies commonly consider different projects. In this work, we propose the Technical Debt Dataset, a curated set of project measurement data from 33 Java projects from the Apache Software Foundation. In the Technical Debt Dataset, we analyzed all commits from separately defined time frames with SonarQube to collect Technical Debt information and with Ptidej to detect code smells. Moreover, we extracted all available commit information from the git logs, the refactoring applied with Refactoring Miner, and fault information reported in the issue trackers (Jira). Using this information, we executed the SZZ algorithm to identify the fault-inducing and -fixing commits. We analyzed 78K commits from the selected 33 projects, detecting 1.8M SonarQube issues, 38K code smells, 28K faults and 57K refactorings. The project analysis took more than 200 days. In this paper, we describe the data retrieval pipeline together with the tools used for the analysis. The dataset is made available through CSV files and an SQLite database to facilitate queries on the data. The Technical Debt Dataset aims to open up diverse opportunities for Technical Debt research, enabling researchers to compare results on common projects. △ Less

Submitted 2 August, 2019; originally announced August 2019.

Journal ref: The Fifteenth International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE'19), September 18, 2019, Recife, Brazil

arXiv:1908.00737 [pdf, other]

doi 10.1145/3340482.3342747

Towards Surgically-Precise Technical Debt Estimation: Early Results and Research Roadmap

Authors: Valentina Lenarduzzi, Antonio Martini, Davide Taibi, Damian Andrew Tamburri

Abstract: The concept of technical debt has been explored from many perspectives but its precise estimation is still under heavy empirical and experimental inquiry. We aim to understand whether, by harnessing approximate, data-driven, machine-learning approaches it is possible to improve the current techniques for technical debt estimation, as represented by a top industry quality analysis tool such as Sona… ▽ More The concept of technical debt has been explored from many perspectives but its precise estimation is still under heavy empirical and experimental inquiry. We aim to understand whether, by harnessing approximate, data-driven, machine-learning approaches it is possible to improve the current techniques for technical debt estimation, as represented by a top industry quality analysis tool such as SonarQube. For the sake of simplicity, we focus on relatively simple regression modelling techniques and apply them to modelling the additional project cost connected to the sub-optimal conditions existing in the projects under study. Our results shows that current techniques can be improved towards a more precise estimation of technical debt and the case study shows promising results towards the identification of more accurate estimation of technical debt. △ Less

Submitted 2 August, 2019; originally announced August 2019.

Comments: 6 pages

Journal ref: In Proceedings of the 3rd ACM SIGSOFT International Workshop on Machine Learning Techniques for Software QualityEvaluation (MaLTeSQuE '19), August 27, 2019, Tallinn, Estonia

arXiv:1907.10887 [pdf, other]

Towards an Holistic Definition of Requirements Debt

Authors: Valentina Lenarduzzi, Davide Fucci

Abstract: When not appropriately managed, technical debt is considered to have negative effects on the long term success of a software project. However, how the debt metaphor applies to requirements engineering in general, and to requirements engineering activities in particular, is not well understood. Grounded in the existing literature, we present a holistic definition of requirements debt which include… ▽ More When not appropriately managed, technical debt is considered to have negative effects on the long term success of a software project. However, how the debt metaphor applies to requirements engineering in general, and to requirements engineering activities in particular, is not well understood. Grounded in the existing literature, we present a holistic definition of requirements debt which include debt incurred during the identification, formalization, and implementation of requirements. We outline future assessment to validate and further refine our proposed definition. This conceptualization is the first step towards a requirements debt monitoring framework to support stakeholders decisions, such as when to incur and eventually pay back requirements debt, and at what costs △ Less

Submitted 25 July, 2019; originally announced July 2019.

Journal ref: ESEM2019 Vision paper track

arXiv:1907.00376 [pdf, other]

Are SonarQube Rules Inducing Bugs?

Authors: Valentina Lenarduzzi, Francesco Lomio, Heikki Huttunen, Davide Taibi

Abstract: Background. The popularity of tools for analyzing Technical Debt, and particularly the popularity of SonarQube, is increasing rapidly. SonarQube proposes a set of coding rules, which represent something wrong in the code that will soon be reflected in a fault or will increase maintenance effort. However, our local companies were not confident in the usefulness of the rules proposed by SonarQube an… ▽ More Background. The popularity of tools for analyzing Technical Debt, and particularly the popularity of SonarQube, is increasing rapidly. SonarQube proposes a set of coding rules, which represent something wrong in the code that will soon be reflected in a fault or will increase maintenance effort. However, our local companies were not confident in the usefulness of the rules proposed by SonarQube and contracted us to investigate the fault-proneness of these rules. Objective. In this work we aim at understanding which SonarQube rules are actually fault-prone and to understand which machine learning models can be adopted to accurately identify fault-prone rules. Method. We designed and conducted an empirical study on 21 well-known mature open-source projects. We applied the SZZ algorithm to label the fault-inducing commits. We analyzed the fault-proneness by comparing the classification power of seven machine learning models. Result. Among the 202 rules defined for Java by SonarQube, only 25 can be considered to have relatively low fault-proneness. Moreover, violations considered as "bugs" by SonarQube were generally not fault-prone and, consequently, the fault-prediction power of the model proposed by SonarQube is extremely low. Conclusion. The rules applied by SonarQube for calculating technical debt should be thoroughly investigated and their harmfulness needs to be further confirmed. Therefore, companies should carefully consider which rules they really need to apply, especially if their goal is to reduce fault-proneness. △ Less

Submitted 19 December, 2019; v1 submitted 30 June, 2019; originally announced July 2019.

Journal ref: 27th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) London, Ontario, February 18-21, 2020

arXiv:1904.12538 [pdf, other]

Technical Debt Prioritization: State of the Art. A Systematic Literature Review

Authors: Valentina Lenarduzzi, Terese Besker, Davide Taibi, Antonio Martini, Francesca Arcelli Fontana

Abstract: Background. Software companies need to manage and refactor Technical Debt issues. Therefore, it is necessary to understand if and when refactoring Technical Debt should be prioritized with respect to develo** features or fixing bugs. Objective. The goal of this study is to investigate the existing body of knowledge in software engineering to understand what Technical Debt prioritization approac… ▽ More Background. Software companies need to manage and refactor Technical Debt issues. Therefore, it is necessary to understand if and when refactoring Technical Debt should be prioritized with respect to develo** features or fixing bugs. Objective. The goal of this study is to investigate the existing body of knowledge in software engineering to understand what Technical Debt prioritization approaches have been proposed in research and industry. Method. We conducted a Systematic Literature Review among 384 unique papers published until 2018, following a consolidated methodology applied in Software Engineering. We included 38 primary studies. Results. Different approaches have been proposed for Technical Debt prioritization, all having different goals and optimizing on different criteria. The proposed measures capture only a small part of the plethora of factors used to prioritize Technical Debt qualitatively in practice. We report an impact map of such factors. However, there is a lack of empirical and validated set of tools. Conclusion. We observed that technical Debt prioritization research is preliminary and there is no consensus on what are the important factors and how to measure them. Consequently, we cannot consider current research conclusive and in this paper, we outline different directions for necessary future investigations. △ Less

Submitted 30 January, 2020; v1 submitted 29 April, 2019; originally announced April 2019.

arXiv:1904.11755 [pdf, other]

Are Architectural Smells Independent from Code Smells? An Empirical Study

Authors: Francesca Arcelli Fontanaa, Valentina Lenarduzzi, Riccardo Roveda, Davide Taibi

Abstract: Background. Architectural smells and code smells are symptoms of bad code or design that can cause different quality problems, such as faults, technical debt, or difficulties with maintenance and evolution. Some studies show that code smells and architectural smells often appear together in the same file. The correlation between code smells and architectural smells, however, is not clear yet; some… ▽ More Background. Architectural smells and code smells are symptoms of bad code or design that can cause different quality problems, such as faults, technical debt, or difficulties with maintenance and evolution. Some studies show that code smells and architectural smells often appear together in the same file. The correlation between code smells and architectural smells, however, is not clear yet; some studies on a limited set of projects have claimed that architectural smells can be derived from code smells, while other studies claim the opposite. Objective. The goal of this work is to understand whether architectural smells are independent from code smells or can be derived from a code smell or from one category of them. Method. We conducted a case study analyzing the correlations among 19 code smells, six categories of code smells, and four architectural smells. Results. The results show that architectural smells are correlated with code smells only in a very low number of occurrences and therefore cannot be derived from code smells. Conclusion. Architectural smells are independent from code smells, and therefore deserve special attention by researchers, who should investigate their actual harmfulness, and practitioners, who should consider whether and when to remove them. △ Less

Submitted 26 April, 2019; originally announced April 2019.

Comments: (in press)

Journal ref: Journal of System and Software 2019

arXiv:1902.06282 [pdf, other]

doi 10.1016/j.jss.2020.110710.

Does Migrate a Monolithic System to Microservices Decrease the Technical Debt?

Authors: Valentina Lenarduzzi, Francesco Lomio, Nyyti Saarimäki, Davide Taibi

Abstract: Background. The migration from monolithic systems to microservices involves deep refactoring of the systems. Therefore, the migration usually has a big economic impact and companies tend to postpone several activities during this process, mainly to speed-up the migration itself, but also because of the need to release new features. Objective. We monitored the Technical Debt of a small and medium e… ▽ More Background. The migration from monolithic systems to microservices involves deep refactoring of the systems. Therefore, the migration usually has a big economic impact and companies tend to postpone several activities during this process, mainly to speed-up the migration itself, but also because of the need to release new features. Objective. We monitored the Technical Debt of a small and medium enterprise while migrating a legacy monolithic system to an ecosystem of microservices to analyze changes in the code technical debt before and after the migration to microservices. Method. We conducted a case study analyzing more than four years of the history of a big project (280K Lines of Code) where two teams extracted five business processes from the monolithic system as microservices, by first analyzing the Technical Debt with SonarQube and then performing a qualitative study with the developers to understand the perceived quality of the system and the motivation for eventually postponed activities. Result. The development of microservices helps to reduce the Technical Debt in the long run. Despite an initial spike in the Technical Debt, due to the development of the new microservice, after a relatively short period, the Technical Debt tends to grow slower than in the monolithic system. △ Less

Submitted 4 July, 2020; v1 submitted 17 February, 2019; originally announced February 2019.

Journal ref: The Journal of Systems & Software (2020) 110710

arXiv:1810.10855 [pdf]

Microservices, Continuous Architecture, and Technical Debt Interest: An Empirical Study

Authors: Valentina Lenarduzzi, Davide Taibi

Abstract: Continuous Architecture (CA) is an approach that supports companies in decreasing the time between deliveries. Migration to microservices is one of the most common situations when companies adopt continuous architecting processes [4]. Companies commonly adopt an initial migration strategy to extract some components from the monolithic system as microservices, making use of simplified microservices… ▽ More Continuous Architecture (CA) is an approach that supports companies in decreasing the time between deliveries. Migration to microservices is one of the most common situations when companies adopt continuous architecting processes [4]. Companies commonly adopt an initial migration strategy to extract some components from the monolithic system as microservices, making use of simplified microservices patterns [5][4]. As an example, companies commonly directly connect the microservices to the legacy monolithic system and do not adopt any message bus at the beginning. When the system starts to grow in complexity, they usually start re-architecting their systems, considering different architectural patterns. Some companies introduce API gateway patterns to simplify the management and load balancing of the different services, while others also consider a lightweight message bus [4][5][6]. All these architectural changes require deep refactoring of the system, thereby increasing the risk of new faults being introduced. In this paper, we report the preliminary results of work in progress, where we monitored the TD of an SME (SMEs = small and medium enterprises) that adopted a CA approach while migrating a legacy monolithic system to an ecosystem of microservices. To the best of our knowledge, no studies exist on the impact of postponed activities on the TD, especially in the context of CA and microservices. This work will help companies understand how TD grows and changes over time while at the same time opening up new avenues for future research on the analysis of TD interest in continuous architecting processes. △ Less

Submitted 25 October, 2018; originally announced October 2018.

Journal ref: Davide Taibi, Valentina Lenarduzzi. Microservices, Continuous Architecture, and Technical Debt Interest: An Empirical Study. Euromicro SEAA. Work in Progress. 2018

Showing 1–36 of 36 results for author: Lenarduzzi, V