Search | arXiv e-print repository

How do Machine Learning Projects use Continuous Integration Practices? An Empirical Study on GitHub Actions

Authors: João Helis Bernardo, Daniel Alencar da Costa, Sérgio Queiroz de Medeiros, Uirá Kulesza

Abstract: Continuous Integration (CI) is a well-established practice in traditional software development, but its nuances in the domain of Machine Learning (ML) projects remain relatively unexplored. Given the distinctive nature of ML development, understanding how CI practices are adopted in this context is crucial for tailoring effective approaches. In this study, we conduct a comprehensive analysis of 185 open-source projects on GitHub (93 ML and 92 non-ML projects). Our investigation comprises both quantitative and qualitative dimensions, aiming to uncover differences in CI adoption between ML and non-ML projects. Our findings indicate that ML projects often require longer build durations, and medium-sized ML projects exhibit lower test coverage compared to non-ML projects. Moreover, small and medium-sized ML projects show a higher prevalence of increasing build duration trends compared to their non-ML counterparts. Additionally, our qualitative analysis illuminates the discussions around CI in both ML and non-ML projects, encompassing themes like CI Build Execution and Status, CI Testing, and CI Infrastructure. These insights shed light on the unique challenges faced by ML projects in adopting CI practices effectively.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 10 pages, Mining Software Repositories, MSR 2024

arXiv:2309.10205 [pdf, other]

Continuous Integration and Software Quality: A Causal Explanatory Study

Authors: Eliezio Soares, Daniel Alencar da Costa, Uirá Kulesza

Abstract: Continuous Integration (CI) is a software engineering practice that aims to reduce the cost and risk of code integration among teams. Recent empirical studies have confirmed associations between CI and software quality (SQ). However, no existing study investigates causal relationships between CI and SQ. This paper investigates them by applying the causal Directed Acyclic Graphs (DAGs) technique. We combine two other strategies to support this technique: a literature review and a Mining Software Repository (MSR) study. In the first stage, we review the literature to discover existing associations between CI and SQ, which helps us create a "literature-based causal DAG" in the second stage. This DAG encapsulates the literature assumptions regarding CI and its influence on SQ. In the third stage, we analyze 12 activity months for 70 open-source projects by mining software repositories: 35 CI and 35 no-CI projects. This MSR study is not a typical "correlation is not causation" study because it is used to verify the relationships uncovered in the causal DAG produced in the first stages. The fourth stage consists of testing the statistical implications from the "literature-based causal DAG" on our dataset. Finally, in the fifth stage, we build a DAG with observations from the literature and the dataset, the "literature-data DAG". In addition to the direct causal effect of CI on SQ, we find evidence of indirect effects of CI. For example, CI affects teams' communication, which positively impacts SQ. We also highlight the confounding effect of project age.

Submitted 18 September, 2023; originally announced September 2023.

arXiv:2305.16365 [pdf, other]

The Impact of a Continuous Integration Service on the Delivery Time of Merged Pull Requests

Authors: João Helis Bernardo, Daniel Alencar da Costa, Uirá Kulesza, Christoph Treude

Abstract: Continuous Integration (CI) is a software development practice that builds and tests software frequently (e.g., at every push). One main motivator to adopt CI is the potential to deliver software functionalities more quickly than not using CI. However, there is little empirical evidence to support that CI helps projects deliver software functionalities more quickly. Through the analysis of 162,653 pull requests (PRs) of 87 GitHub projects, we empirically study whether adopting a CI service (TravisCI) can quicken the time to deliver merged PRs. We complement our quantitative study by analyzing 450 survey responses from participants of 73 software projects. Our results reveal that adopting a CI service may not necessarily quicken the delivery of merged PRs. Instead, the pivotal benefit of a CI service is to improve the decision making on PR submissions, without compromising the quality or overloading the project's reviewers and maintainers. The automation provided by CI and the boost in developers' confidence are key advantages of adopting a CI service. Furthermore, open-source projects planning to attract and retain developers should consider the use of a CI service in their project, since CI is perceived to lower the contribution barrier while making contributors feel more confident and engaged in the project.

Submitted 25 May, 2023; originally announced May 2023.

arXiv:2301.10315 [pdf, other]

Studying the Characteristics of SQL-related Development Tasks: An Empirical Study

Authors: Daniel Alencar da Costa, Natalie Grattan, Nigel Stanger, Sherlock A. Licorish

Abstract: A key function of a software system is its ability to facilitate the manipulation of data, which is often implemented using a flavour of the Structured Query Language (SQL). To develop the data operations of software (i.e., creating, retrieving, updating, and deleting data), developers are required to excel in writing and combining both SQL and application code. The problem is that writing SQL code in itself is already challenging (e.g., SQL anti-patterns are commonplace) and combining SQL with application code (i.e., for SQL development tasks) is even more demanding. Meanwhile, we have little empirical understanding regarding the characteristics of SQL development tasks. Do SQL development tasks typically need more code changes? Do they typically have a longer time-to-completion? Answers to such questions would prepare the community for the potential challenges associated with such tasks. Our results obtained from 20 Apache projects reveal that SQL development tasks have a significantly longer time-to-completion than SQL-unrelated tasks and require significantly more code changes. Through our qualitative analyses, we observe that SQL development tasks require more spread out changes, effort in reviews and documentation. Our results also corroborate previous research highlighting the prevalence of SQL anti-patterns. The software engineering community should make provision for the peculiarities of SQL coding, in the delivery of safe and secure interactive software.

Submitted 24 January, 2023; originally announced January 2023.

Comments: Accepted to the Journal of Empirical Software Engineering (EMSE), in Jan 2023

arXiv:2208.02598 [pdf, other]

doi 10.1145/3544902.3546244

Investigating the Impact of Continuous Integration Practices on the Productivity and Quality of Open-Source Projects

Authors: Jadson Santos, Daniel Alencar da Costa, Uirá Kulesza

Abstract: Background: Much research has been conducted to investigate the impact of Continuous Integration (CI) on the productivity and quality of open-source projects. Most studies have analyzed the impact of adopting a CI server service (e.g., Travis-CI) but did not analyze CI sub-practices. Aims: We aim to evaluate the impact of five CI sub-practices with respect to the productivity and quality of GitHub open-source projects. Method: We collect CI sub-practices of 90 relevant open-source projects for a period of 2 years. We use regression models to analyze whether projects upholding the CI sub-practices are more productive and/or generate fewer bugs. We also perform a qualitative document analysis to understand whether CI best practices are related to a higher quality of projects. Results: Our findings reveal a correlation between the Build Activity and Commit Activity sub-practices and the number of merged pull requests. We also observe a correlation between the Build Activity, Build Health and Time to Fix Broken Builds sub-practices and the number of bug-related issues. The qualitative analysis reveals that projects with the best values for CI sub-practices face fewer CI-related problems compared to projects that exhibit the worst values for CI sub-practices. Conclusions: We recommend that projects should strive to uphold the several CI sub-practices as they can impact the productivity and quality of projects.

Submitted 4 August, 2022; originally announced August 2022.

Comments: Paper accepted for publication by The ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

arXiv:2206.09762 [pdf]

doi 10.1016/j.jss.2021.111166.

A Systematic Mapping Study Addressing the Reliability of Mobile Applications: The Need to Move Beyond Testing Reliability

Authors: Chathrie Wimalasooriya, Sherlock A. Licorish, Daniel Alencar da Costa, Stephen G. MacDonell

Abstract: Intense competition in the mobile apps market means it is important to maintain high levels of app reliability to avoid losing users. Yet despite its importance, app reliability is underexplored in the research literature. To address this need, we identify, analyse, and classify the state-of-the-art in the field of mobile apps' reliability through a systematic mapping study. From the results of such a study, researchers in the field can identify pressing research gaps, and developers can gain knowledge about existing solutions, to potentially leverage them in practice. We found 87 relevant papers which were then analysed and classified based on their research focus, research type, contribution, research method, study settings, data, quality attributes and metrics used. Results indicate that there is a lack of research on understanding reliability with regard to context-awareness, self-healing, ageing and rejuvenation, and runtime event handling. These aspects have rarely been studied, or if studied, there is limited evaluation. We also identified several other research gaps including the need to conduct more research in real-world industrial projects. Furthermore, little attention has been paid towards quality standards while conducting research. Outcomes here show numerous opportunities for greater research depth and breadth on mobile app reliability.

Submitted 20 June, 2022; originally announced June 2022.

Comments: Journal paper, 29 pages, 12 tables, 7 figures

Journal ref: Journal of Systems and Software 186(2022), pp. 111-166

arXiv:2011.05531 [pdf, other]

Leveraging the Defects Life Cycle to Label Affected Versions and Defective Classes

Authors: Bailey Vandehei, Daniel Alencar da Costa, Davide Falessi

Abstract: Two recent studies explicitly recommend labeling defective classes in releases using the affected versions (AV) available in issue trackers. The aim of our study is threefold: 1) to measure the proportion of defects for which the realistic method is usable, 2) to propose a method for retrieving the AVs of a defect, thus making the realistic approach usable when AVs are unavailable, 3) to compare the accuracy of the proposed method versus three SZZ implementations. The assumption of our proposed method is that defects have a stable life cycle in terms of the proportion of the number of versions affected by the defects before discovering and fixing these defects. Results related to 212 open-source projects from the Apache ecosystem, featuring a total of about 125,000 defects, reveal that the realistic method cannot be used in the majority (51%) of defects. Therefore, it is important to develop automated methods to retrieve AVs. Results related to 76 open-source projects from the Apache ecosystem, featuring a total of about 6,250,000 classes, affected by 60,000 defects, and spread over 4,000 versions and 760,000 commits, reveal that the proportion of the number of versions between defect discovery and fix is pretty stable (STDV < 2) across the defects of the same project. Moreover, the proposed method was significantly more accurate than all three SZZ implementations in (i) retrieving AVs, (ii) labeling classes as defective, and (iii) developing defect repositories to perform feature selection. Thus, when the realistic method is unusable, the proposed method is a valid automated alternative to SZZ for retrieving the origin of a defect. Finally, given the low accuracy of SZZ, researchers should consider re-executing the studies that have used SZZ as an oracle and, in general, should prefer selecting projects with a high proportion of available and consistent AVs.

Submitted 10 November, 2020; originally announced November 2020.

Showing 1–7 of 7 results for author: da Costa, D A