Search | arXiv e-print repository

Naming the Pain in Machine Learning-Enabled Systems Engineering

Authors: Marcos Kalinowski, Daniel Mendez, Görkem Giray, Antonio Pedro Santos Alves, Kelly Azevedo, Tatiana Escovedo, Hugo Villamizar, Helio Lopes, Teresa Baldassarre, Stefan Wagner, Stefan Biffl, Jürgen Musil, Michael Felderer, Niklas Lavesson, Tony Gorschek

Abstract: Context: Machine learning (ML)-enabled systems are being increasingly adopted by companies aiming to enhance their products and operational processes. Objective: This paper aims to deliver a comprehensive overview of the current status quo of engineering ML-enabled systems and lay the foundation to steer practically relevant and problem-driven academic research. Method: We conducted an internation… ▽ More Context: Machine learning (ML)-enabled systems are being increasingly adopted by companies aiming to enhance their products and operational processes. Objective: This paper aims to deliver a comprehensive overview of the current status quo of engineering ML-enabled systems and lay the foundation to steer practically relevant and problem-driven academic research. Method: We conducted an international survey to collect insights from practitioners on the current practices and problems in engineering ML-enabled systems. We received 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrap** with confidence intervals and qualitative analyses on the reported problems using open and axial coding procedures. Results: Our survey results reinforce and extend existing empirical evidence on engineering ML-enabled systems, providing additional insights into typical ML-enabled systems project contexts, the perceived relevance and complexity of ML life cycle phases, and current practices related to problem understanding, model deployment, and model monitoring. Furthermore, the qualitative analysis provides a detailed map of the problems practitioners face within each ML life cycle phase and the problems causing overall project failure. Conclusions: The results contribute to a better understanding of the status quo and problems in practical environments. We advocate for the further adaptation and dissemination of software engineering practices to enhance the engineering of ML-enabled systems. △ Less

Submitted 20 May, 2024; originally announced June 2024.

Comments: arXiv admin note: text overlap with arXiv:2310.06726

arXiv:2404.06011 [pdf, other]

Guidelines for Using Mixed and Multi Methods Research in Software Engineering

Authors: Margaret-Anne Storey, Rashina Hoda, Alessandra Maciel Paz Milani, Maria Teresa Baldassarre

Abstract: Mixed and multi methods research is often used in software engineering, but researchers outside of the social or human sciences often lack experience when using these designs. This paper provides guidelines and advice on how to design mixed and multi method research, and to encourage the intentional, rigourous, and innovative use of mixed methods in software engineering. It also presents key chara… ▽ More Mixed and multi methods research is often used in software engineering, but researchers outside of the social or human sciences often lack experience when using these designs. This paper provides guidelines and advice on how to design mixed and multi method research, and to encourage the intentional, rigourous, and innovative use of mixed methods in software engineering. It also presents key characteristics of core mixed method research designs. Through a number of fictitious but recognizable software engineering research scenarios and personas of prototypical researchers, we showcase how to choose suitable designs and consider the inevitable tradeoffs any design choice leads to. We furnish the paper with recommended best practices and several antipatterns that illustrate what to avoid in mixed and multi method research. △ Less

Submitted 9 April, 2024; originally announced April 2024.

Comments: 21 pages, 6 figures

arXiv:2403.04667 [pdf, other]

doi 10.1145/3582515.3609555

The Social Impact of Generative AI: An Analysis on ChatGPT

Authors: Maria T. Baldassarre, Danilo Caivano, Berenice Fernandez Nieto, Domenico Gigante, Azzurra Ragone

Abstract: In recent months, the social impact of Artificial Intelligence (AI) has gained considerable public interest, driven by the emergence of Generative AI models, ChatGPT in particular. The rapid development of these models has sparked heated discussions regarding their benefits, limitations, and associated risks. Generative models hold immense promise across multiple domains, such as healthcare, finan… ▽ More In recent months, the social impact of Artificial Intelligence (AI) has gained considerable public interest, driven by the emergence of Generative AI models, ChatGPT in particular. The rapid development of these models has sparked heated discussions regarding their benefits, limitations, and associated risks. Generative models hold immense promise across multiple domains, such as healthcare, finance, and education, to cite a few, presenting diverse practical applications. Nevertheless, concerns about potential adverse effects have elicited divergent perspectives, ranging from privacy risks to escalating social inequality. This paper adopts a methodology to delve into the societal implications of Generative AI tools, focusing primarily on the case of ChatGPT. It evaluates the potential impact on several social sectors and illustrates the findings of a comprehensive literature review of both positive and negative effects, emerging trends, and areas of opportunity of Generative AI models. This analysis aims to facilitate an in-depth discussion by providing insights that can inspire policy, regulation, and responsible development practices to foster a human-centered AI. △ Less

Submitted 7 March, 2024; originally announced March 2024.

Comments: Presented at GoodIT2023 - ACM Conference on Information Technology for Social Good

Journal ref: Proceedings of the 2023 ACM Conference on Information Technology for Social Good (GoodIT '23)

arXiv:2402.05340 [pdf]

doi 10.1145/3644815.3644947

POLARIS: A framework to guide the development of Trustworthy AI systems

Authors: Maria Teresa Baldassarre, Domenico Gigante, Marcos Kalinowski, Azzurra Ragone

Abstract: In the ever-expanding landscape of Artificial Intelligence (AI), where innovation thrives and new products and services are continuously being delivered, ensuring that AI systems are designed and developed responsibly throughout their entire lifecycle is crucial. To this end, several AI ethics principles and guidelines have been issued to which AI systems should conform. Nevertheless, relying sole… ▽ More In the ever-expanding landscape of Artificial Intelligence (AI), where innovation thrives and new products and services are continuously being delivered, ensuring that AI systems are designed and developed responsibly throughout their entire lifecycle is crucial. To this end, several AI ethics principles and guidelines have been issued to which AI systems should conform. Nevertheless, relying solely on high-level AI ethics principles is far from sufficient to ensure the responsible engineering of AI systems. In this field, AI professionals often navigate by sight. Indeed, while recommendations promoting Trustworthy AI (TAI) exist, these are often high-level statements that are difficult to translate into concrete implementation strategies. There is a significant gap between high-level AI ethics principles and low-level concrete practices for AI professionals. To address this challenge, our work presents an experience report where we develop a novel holistic framework for Trustworthy AI - designed to bridge the gap between theory and practice - and report insights from its application in an industrial case study. The framework is built on the result of a systematic review of the state of the practice, a survey, and think-aloud interviews with 34 AI practitioners. The framework, unlike most of those already in the literature, is designed to provide actionable guidelines and tools to support different types of stakeholders throughout the entire Software Development Life Cycle (SDLC). Our goal is to empower AI professionals to confidently navigate the ethical dimensions of TAI through practical insights, ensuring that the vast potential of AI is exploited responsibly for the benefit of society as a whole. △ Less

Submitted 7 February, 2024; originally announced February 2024.

Journal ref: Conference on AI Engineering Software Engineering for AI (CAIN 2024), April 14--15, 2024, Lisbon, Portugal

arXiv:2402.05339 [pdf]

Can participation in a hackathon impact the motivation of software engineering students? A preliminary case study analysis

Authors: Allysson Allex Araújo, Marcos Kalinowski, Maria Teresa Baldassarre

Abstract: [Background] Hackathons are increasingly gaining prominence in Software Engineering (SE) education, lauded for their ability to elevate students' skill sets. [Objective] This paper investigates whether hackathons can impact the motivation of SE students. [Method] We conducted an evaluative case study assessing students' motivations before and after a hackathon, combining quantitative analysis usin… ▽ More [Background] Hackathons are increasingly gaining prominence in Software Engineering (SE) education, lauded for their ability to elevate students' skill sets. [Objective] This paper investigates whether hackathons can impact the motivation of SE students. [Method] We conducted an evaluative case study assessing students' motivations before and after a hackathon, combining quantitative analysis using the Academic Motivation Scale (AMS) and qualitative coding of open-ended responses. [Results] Pre-hackathon findings reveal a diverse range of motivations with an overall acceptance, while post-hackathon responses highlight no statistically significant shift in participants' perceptions. Qualitative findings uncovered themes related to networking, team dynamics, and skill development. From a practical perspective, our findings highlight the potential of hackathons to impact participants' motivation. [Conclusion] While our study enhances the comprehension of hackathons as a motivational tool, it also underscores the need for further exploration of psychometric dimensions in SE educational research. △ Less

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2402.05337 [pdf]

Investigating the Impact of SOLID Design Principles on Machine Learning Code Understanding

Authors: Raphael Cabral, Marcos Kalinowski, Maria Teresa Baldassarre, Hugo Villamizar, Tatiana Escovedo, Hélio Lopes

Abstract: [Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. However, ML components are often developed by data scientists with diverse educational backgrounds, poten… ▽ More [Context] Applying design principles has long been acknowledged as beneficial for understanding and maintainability in traditional software projects. These benefits may similarly hold for Machine Learning (ML) projects, which involve iterative experimentation with data, models, and algorithms. However, ML components are often developed by data scientists with diverse educational backgrounds, potentially resulting in code that doesn't adhere to software design best practices. [Goal] In order to better understand this phenomenon, we investigated the impact of the SOLID design principles on ML code understanding. [Method] We conducted a controlled experiment with three independent trials involving 100 data scientists. We restructured real industrial ML code that did not use SOLID principles. Within each trial, one group was presented with the original ML code, while the other was presented with ML code incorporating SOLID principles. Participants of both groups were asked to analyze the code and fill out a questionnaire that included both open-ended and closed-ended questions on their understanding. [Results] The study results provide statistically significant evidence that the adoption of the SOLID design principles can improve code understanding within the realm of ML projects. [Conclusion] We put forward that software engineering design principles should be spread within the data science community and considered for enhancing the maintainability of ML code. △ Less

Submitted 7 February, 2024; originally announced February 2024.

arXiv:2402.05333 [pdf]

ML-Enabled Systems Model Deployment and Monitoring: Status Quo and Problems

Authors: Eduardo Zimelewicz, Marcos Kalinowski, Daniel Mendez, Görkem Giray, Antonio Pedro Santos Alves, Niklas Lavesson, Kelly Azevedo, Hugo Villamizar, Tatiana Escovedo, Helio Lopes, Stefan Biffl, Juergen Musil, Michael Felderer, Stefan Wagner, Teresa Baldassarre, Tony Gorschek

Abstract: [Context] Systems incorporating Machine Learning (ML) models, often called ML-enabled systems, have become commonplace. However, empirical evidence on how ML-enabled systems are engineered in practice is still limited, especially for activities surrounding ML model dissemination. [Goal] We investigate contemporary industrial practices and problems related to ML model dissemination, focusing on the… ▽ More [Context] Systems incorporating Machine Learning (ML) models, often called ML-enabled systems, have become commonplace. However, empirical evidence on how ML-enabled systems are engineered in practice is still limited, especially for activities surrounding ML model dissemination. [Goal] We investigate contemporary industrial practices and problems related to ML model dissemination, focusing on the model deployment and the monitoring of ML life cycle phases. [Method] We conducted an international survey to gather practitioner insights on how ML-enabled systems are engineered. We gathered a total of 188 complete responses from 25 countries. We analyze the status quo and problems reported for the model deployment and monitoring phases. We analyzed contemporary practices using bootstrap** with confidence intervals and conducted qualitative analyses on the reported problems applying open and axial coding procedures. [Results] Practitioners perceive the model deployment and monitoring phases as relevant and difficult. With respect to model deployment, models are typically deployed as separate services, with limited adoption of MLOps principles. Reported problems include difficulties in designing the architecture of the infrastructure for production deployment and legacy application integration. Concerning model monitoring, many models in production are not monitored. The main monitored aspects are inputs, outputs, and decisions. Reported problems involve the absence of monitoring practices, the need to create custom monitoring tools, and the selection of suitable metrics. [Conclusion] Our results help provide a better understanding of the adopted practices and problems in practice and support guiding ML deployment and monitoring research in a problem-driven manner. △ Less

Submitted 7 February, 2024; originally announced February 2024.

Comments: arXiv admin note: text overlap with arXiv:2310.06726

arXiv:2312.04809 [pdf]

Navigating the Path of Women in Software Engineering: From Academia to Industry

Authors: Tatalina Oliveira, Ann Barcomb, Ronnie de Souza Santos, Helda Barros, Maria Teresa Baldassarre, César França

Abstract: Context. Women remain significantly underrepresented in software engineering, leading to a lasting gender gap in the software industry. This disparity starts in education and extends into the industry, causing challenges such as hostile work environments and unequal opportunities. Addressing these issues is crucial for fostering an inclusive and diverse software engineering workforce. Aim. This st… ▽ More Context. Women remain significantly underrepresented in software engineering, leading to a lasting gender gap in the software industry. This disparity starts in education and extends into the industry, causing challenges such as hostile work environments and unequal opportunities. Addressing these issues is crucial for fostering an inclusive and diverse software engineering workforce. Aim. This study aims to enhance the literature on women in software engineering, exploring their journey from academia to industry and discussing perspectives, challenges, and support. We focus on Brazilian women to extend existing research, which has largely focused on North American and European contexts. Method. In this study, we conducted a cross-sectional survey, collecting both quantitative and qualitative data, focusing on women's experiences in software engineering to explore their journey from university to the software industry. Findings. Our findings highlight persistent challenges faced by women in software engineering, including gender bias, harassment, work-life imbalance, undervaluation, low sense of belonging, and impostor syndrome. These difficulties commonly emerge from university experiences and continue to affect women throughout their entire careers. Conclusion. In summary, our study identifies systemic challenges in women's software engineering journey, emphasizing the need for organizational commitment to address these issues. We provide actionable recommendations for practitioners. △ Less

Submitted 7 December, 2023; originally announced December 2023.

Comments: 12 pages

arXiv:2312.03966 [pdf]

Impostor Phenomenon in Software Engineers

Authors: Paloma Guenes, Rafael Tomaz, Marcos Kalinowski, Maria Teresa Baldassarre, Margaret-Anne Storey

Abstract: The Impostor Phenomenon (IP) is widely discussed in Science, Technology, Engineering, and Mathematics (STEM) and has been evaluated in Computer Science students. However, formal research on IP in software engineers has yet to be conducted, although its impacts may lead to mental disorders such as depression and burnout. This study describes a survey that investigates the extent of impostor feeling… ▽ More The Impostor Phenomenon (IP) is widely discussed in Science, Technology, Engineering, and Mathematics (STEM) and has been evaluated in Computer Science students. However, formal research on IP in software engineers has yet to be conducted, although its impacts may lead to mental disorders such as depression and burnout. This study describes a survey that investigates the extent of impostor feelings in software engineers, considering aspects such as gender, race/ethnicity, and roles. Furthermore, we investigate the influence of IP on their perceived productivity. The survey instrument was designed using a theory-driven approach and included demographic questions, an internationally validated IP scale, and questions for measuring perceived productivity based on the SPACE framework constructs. The survey was sent to companies operating in various business sectors. Data analysis used bootstrap** with resampling to calculate confidence intervals and Mann-Whitney statistical significance testing for assessing the hypotheses. We received responses from 624 software engineers from 26 countries. The bootstrap** results reveal that a proportion of 52.7% of software engineers experience frequent to intense levels of IP and that women suffer at a significantly higher proportion (60.6%) than men (48.8%). Regarding race/ethnicity, we observed more frequent impostor feelings in Asian (67.9%) and Black (65.1%) than in White (50.0%) software engineers. We also observed that the presence of IP is less common among individuals who are married and have children. Moreover, the prevalence of IP showed a statistically significant negative effect on the perceived productivity for all SPACE framework constructs. The evidence relating IP to software engineers provides a starting point to help organizations find ways to raise awareness of the problem and improve the emotional skills of software professionals. △ Less

Submitted 6 December, 2023; originally announced December 2023.

Comments: Preprint with the original submission accepted for publication at ICSE-SEIS 2024

arXiv:2310.06726 [pdf]

Status Quo and Problems of Requirements Engineering for Machine Learning: Results from an International Survey

Authors: Antonio Pedro Santos Alves, Marcos Kalinowski, Görkem Giray, Daniel Mendez, Niklas Lavesson, Kelly Azevedo, Hugo Villamizar, Tatiana Escovedo, Helio Lopes, Stefan Biffl, Jürgen Musil, Michael Felderer, Stefan Wagner, Teresa Baldassarre, Tony Gorschek

Abstract: Systems that use Machine Learning (ML) have become commonplace for companies that want to improve their products and processes. Literature suggests that Requirements Engineering (RE) can help address many problems when engineering ML-enabled systems. However, the state of empirical evidence on how RE is applied in practice in the context of ML-enabled systems is mainly dominated by isolated case s… ▽ More Systems that use Machine Learning (ML) have become commonplace for companies that want to improve their products and processes. Literature suggests that Requirements Engineering (RE) can help address many problems when engineering ML-enabled systems. However, the state of empirical evidence on how RE is applied in practice in the context of ML-enabled systems is mainly dominated by isolated case studies with limited generalizability. We conducted an international survey to gather practitioner insights into the status quo and problems of RE in ML-enabled systems. We gathered 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrap** with confidence intervals and qualitative analyses on the reported problems involving open and axial coding procedures. We found significant differences in RE practices within ML projects. For instance, (i) RE-related activities are mostly conducted by project leaders and data scientists, (ii) the prevalent requirements documentation format concerns interactive Notebooks, (iii) the main focus of non-functional requirements includes data quality, model reliability, and model explainability, and (iv) main challenges include managing customer expectations and aligning requirements with data. The qualitative analyses revealed that practitioners face problems related to lack of business domain understanding, unclear goals and requirements, low customer engagement, and communication issues. These results help to provide a better understanding of the adopted practices and of which problems exist in practical environments. We put forward the need to adapt further and disseminate RE-related practices for engineering ML-enabled systems. △ Less

Submitted 10 October, 2023; originally announced October 2023.

Comments: Accepted for Publication at PROFES 2023

arXiv:2302.03649 [pdf, other]

doi 10.1007/s10664-022-10277-5

Registered Reports in Software Engineering

Authors: Neil A. Ernst, Maria Teresa Baldassarre

Abstract: Registered reports are scientific publications which begin the publication process by first having the detailed research protocol, including key research questions, reviewed and approved by peers. Subsequent analysis and results are published with minimal additional review, even if there was no clear support for the underlying hypothesis, as long as the approved protocol is followed. Registered re… ▽ More Registered reports are scientific publications which begin the publication process by first having the detailed research protocol, including key research questions, reviewed and approved by peers. Subsequent analysis and results are published with minimal additional review, even if there was no clear support for the underlying hypothesis, as long as the approved protocol is followed. Registered reports can prevent several questionable research practices and give early feedback on research designs. In software engineering research, registered reports were first introduced in the International Conference on Mining Software Repositories (MSR) in 2020. They are now established in three conferences and two pre-eminent journals, including Empirical Software Engineering. We explain the motivation for registered reports, outline the way they have been implemented in software engineering, and outline some ongoing challenges for addressing high quality software engineering research. △ Less

Submitted 7 February, 2023; originally announced February 2023.

Comments: in press as EMSE J. comment

arXiv:2206.08718 [pdf, other]

CATTO: Just-in-time Test Case Selection and Execution

Authors: Dario Amoroso d'Aragona, Fabiano Pecorelli, Simone Romano, Giuseppe Scanniello, Maria Teresa Baldassarre, Andrea Janes, Valentina Lenarduzzi

Abstract: Regression testing ensures a System Under Test (SUT) still works as expected after changes to it. The simplest approach for regression testing consists of re-running the entire test suite against the changed version of the SUT. However, this might result in a time- and resource-consuming process; \eg when dealing with large and/or complex SUTs and test suits. To work around this problem, test Case… ▽ More Regression testing ensures a System Under Test (SUT) still works as expected after changes to it. The simplest approach for regression testing consists of re-running the entire test suite against the changed version of the SUT. However, this might result in a time- and resource-consuming process; \eg when dealing with large and/or complex SUTs and test suits. To work around this problem, test Case Selection (TCS) strategies can be used. Such strategies seek to build a temporary test suite comprising only those test cases that are relevant to the changes made to the SUT, so avoiding executing those test cases that do not exercise the changed parts. In this paper, we introduce CATTO (Commit Adaptive Tool for Test suite Optimization) and CATTO INTELLIJ PLUGIN. The former is a tool implementing a TCS strategy for SUTs written in Java, while the latter is a wrapper to allow developers to use \toolName directly in IntelliJ. We also conducted a preliminary evaluation of CATTO on seven open-source Java SUTs in terms of reductions in test-suite size, fault-reveling test cases, and fault-detection capability. The results are promising and suggest that CATTO can be of help to developers when performing regression testing. The video demo and the documentation of the tool is available at: \url{https://catto-tool.github.io/} △ Less

Submitted 17 June, 2022; originally announced June 2022.

arXiv:2108.06821 [pdf, other]

doi 10.1145/3554976

Crowdsourcing the State of the Art(ifacts)

Authors: Maria Teresa Baldassarre, Neil Ernst, Ben Hermann, Tim Menzies, Rahul Yedida

Abstract: In any field, finding the "leading edge" of research is an on-going challenge. Researchers cannot appease reviewers and educators cannot teach to the leading edge of their field if no one agrees on what is the state-of-the-art. Using a novel crowdsourced "reuse graph" approach, we propose here a new method to learn this state-of-the-art. Our reuse graphs are less effort to build and verify than… ▽ More In any field, finding the "leading edge" of research is an on-going challenge. Researchers cannot appease reviewers and educators cannot teach to the leading edge of their field if no one agrees on what is the state-of-the-art. Using a novel crowdsourced "reuse graph" approach, we propose here a new method to learn this state-of-the-art. Our reuse graphs are less effort to build and verify than other community monitoring methods (e.g. artifact tracks or citation-based searches). Based on a study of 170 papers from software engineering (SE) conferences in 2020, we have found over 1,600 instances of reuse; i.e., reuse is rampant in SE research. Prior pessimism about a lack of reuse in SE research may have been a result of using the wrong methods to measure the wrong things. △ Less

Submitted 15 August, 2021; originally announced August 2021.

Comments: Submitted to Communications ACM

Journal ref: CACM February 2023 (Vol. 66, No. 2)

arXiv:2105.03312 [pdf, other]

doi 10.1016/j.jss.2021.110937

Studying Test-Driven Development and its Retainment Over a Six-month Time Span

Authors: Maria Teresa Baldassarre, Danilo Caivano, Davide Fucci, Natalia Juristo, Simone Romano, Giuseppe Scanniello, BurakTurhan

Abstract: In this paper, we investigate the effect of TDD, as compared to a non-TDD approach, as well as its retainment (or retention) over a time span of (about) six months. To pursue these objectives, we conducted a (quantitative) longitudinal cohort study with 30 novice developers (i.e., third-year undergraduate students in Computer Science). We observed that TDD affects neither the external quality of s… ▽ More In this paper, we investigate the effect of TDD, as compared to a non-TDD approach, as well as its retainment (or retention) over a time span of (about) six months. To pursue these objectives, we conducted a (quantitative) longitudinal cohort study with 30 novice developers (i.e., third-year undergraduate students in Computer Science). We observed that TDD affects neither the external quality of software products nor developers' productivity. However, we observed that the participants applying TDD produced significantly more tests, with a higher fault-detection capability than those using a non-TDD approach. As for the retainment of TDD, we found that TDD is retained by novice developers for at least six months. △ Less

Submitted 11 May, 2021; v1 submitted 7 May, 2021; originally announced May 2021.

Journal ref: Journal of Systems and Software, Volume 176, 2021, 110937, ISSN 0164-1212

arXiv:2008.12528 [pdf, ps, other]

Researcher Bias in Software Engineering Experiments: a Qualitative Investigation

Authors: Simone Romano, Davide Fucci, Giuseppe Scanniello, Maria Teresa Baldassarre, Burak Turhan, Natalia Juristo

Abstract: Researcher Bias (RB) occurs when researchers influence the results of an empirical study based on their expectations.RB might be due to the use of Questionable Research Practices(QRPs). In research fields like medicine, blinding techniques have been applied to counteract RB. We conducted an explorative qualitative survey to investigate RB in Software Engineering (SE)experiments, with respect to (i… ▽ More Researcher Bias (RB) occurs when researchers influence the results of an empirical study based on their expectations.RB might be due to the use of Questionable Research Practices(QRPs). In research fields like medicine, blinding techniques have been applied to counteract RB. We conducted an explorative qualitative survey to investigate RB in Software Engineering (SE)experiments, with respect to (i) QRPs potentially leading to RB, (ii) causes behind RB, and (iii) possible actions to counteract including blinding techniques. Data collection was based on semi-structured interviews. We interviewed nine active experts in the empirical SE community. We then analyzed the transcripts of these interviews through thematic analysis. We found that some QRPs are acceptable in certain cases. Also, it appears that the presence of RB is perceived in SE and, to counteract RB, a number of solutions have been highlighted: some are intended for SE researchers and others for the boards of SE research outlets. △ Less

Submitted 28 August, 2020; originally announced August 2020.

Comments: Published at SEAA2020

arXiv:2007.07751 [pdf, other]

Secondary Studies in the Academic Context: A Systematic Map** and Survey

Authors: Katia Romero Felizardo, Érica Ferreira de Souza, Bianca Minetto Napoleão, Nandamudi Lankalapalli Vijaykumar, Maria Teresa Baldassarre

Abstract: Context: Several researchers have reported their experiences in applying secondary studies (Systematic Literature Reviews - SLRs and Systematic Map**s - SMs) in Software Engineering (SE). However, there is still a lack of studies discussing the value of performing secondary studies in an academic context. Goal: The main goal of this study is to provide an overview on the use of secondary studies… ▽ More Context: Several researchers have reported their experiences in applying secondary studies (Systematic Literature Reviews - SLRs and Systematic Map**s - SMs) in Software Engineering (SE). However, there is still a lack of studies discussing the value of performing secondary studies in an academic context. Goal: The main goal of this study is to provide an overview on the use of secondary studies in an academic context. Method: Two empirical research methods were used. Initially, we conducted an SM to identify the available and relevant studies on the use of secondary studies as a research methodology for conducting SE research projects. Secondly, a survey was performed with 64 SE researchers to identify their perception related to the value of performing secondary studies to support their research projects. Results: Our results show benefits of using secondary studies in the academic context, such as, providing an overview of the literature as well as identifying relevant research literature on a research area enabling to find reasons to explain why a research project should be approved for a grant and/or supporting decisions made in a research project. Difficulties faced by SE graduate students with secondary studies are that they tend to be conducted by a team and it demands more effort than a traditional review. Conclusions: Secondary studies are valuable to graduate students. They should consider conducting a secondary study for their research project due to the benefits and contributions provided to develop the overall project. However, the advice of an experienced supervisor is essential to avoid bias. In addition, the acquisition of skills can increase student's motivation to pursue their research projects and prepare them for both academic or industrial careers. △ Less

Submitted 10 July, 2020; originally announced July 2020.

arXiv:2004.07524 [pdf, other]

Results from a replicated experiment on the affective reactions of novice developers when applying test-driven development

Authors: Simone Romano, Giuseppe Scanniello, Maria Teresa Baldassarre, Davide Fucci, Danilo Caivano

Abstract: Test-driven Development (TDD) is an incremental approach to software development. Despite it is claimed to improve both quality of software and developers' productivity, the research on the claimed effects of TDD has so far shown inconclusive results. Some researchers have ascribed these inconclusive results to the negative affective states that TDD would provoke. A previous (baseline) experiment… ▽ More Test-driven Development (TDD) is an incremental approach to software development. Despite it is claimed to improve both quality of software and developers' productivity, the research on the claimed effects of TDD has so far shown inconclusive results. Some researchers have ascribed these inconclusive results to the negative affective states that TDD would provoke. A previous (baseline) experiment has, therefore, studied the affective reactions of (novice) developers---i.e., 29 third-year undergraduates in Computer Science (CS)---when practicing TDD to implement software. To validate the results of the baseline experiment, we conducted a replicated experiment that studies the affective reactions of novice developers when applying TDD to develop software. Developers in the treatment group carried out a development task using TDD, while those in the control group used a non-TDD approach. To measure the affective reactions of developers, we used the Self-Assessment Manikin instrument complemented with a liking dimension. The most important differences between the baseline and replicated experiments are: (i) the kind of novice developers involved in the experiments---third-year vs. second-year undergraduates in CS from two different universities; and (ii) their number---29 vs. 59. The results of the replicated experiment do not show any difference in the affective reactions of novice developers. Instead, the results of the baseline experiment suggest that developers seem to like TDD less as compared to a non-TDD approach and that developers following TDD seem to like implementing code less than the other developers, while testing code seems to make them less happy. △ Less

Submitted 16 April, 2020; originally announced April 2020.

Comments: XP2020

arXiv:1907.12290 [pdf, other]

An Empirical Assessment on Affective Reactions of Novice Developers when Applying Test-Driven Development

Authors: Simone Romano, Davide Fucci, Maria Teresa Baldassarre, Danilo Caivano, Giuseppe Scanniello

Abstract: We study whether and in which phase Test-Driven Development (TDD) influences affective states of novice developers in terms of pleasure, arousal, dominance, and liking. We performed a controlled experiment with 29 novice developers. Developers in the treatment group performed a development task using TDD, whereas those in the control group used a non-TDD development approach. We compared the affec… ▽ More We study whether and in which phase Test-Driven Development (TDD) influences affective states of novice developers in terms of pleasure, arousal, dominance, and liking. We performed a controlled experiment with 29 novice developers. Developers in the treatment group performed a development task using TDD, whereas those in the control group used a non-TDD development approach. We compared the affective reactions to the development approaches, as well as to the implementation and testing phases, exploiting a lightweight, powerful, and widely used tool, i.e., Self-Assessment Manikin. We observed that there is a difference between the two development approaches in terms of affective reactions. Therefore, it seems that affective reactions play an important role when applying TDD and their investigation could help researchers to better understand such a development approach △ Less

Submitted 29 July, 2019; originally announced July 2019.

Comments: Accepted for publication at the 20th International Conference on Product-Focused Software Process Improvement (PROFES19)

arXiv:1906.11351 [pdf, other]

Software Engineering Research Community Viewpoints on Rapid Reviews

Authors: Bruno Cartaxo, Gustavo Pinto, Baldoino Fonseca, Márcio Ribeiro, Pedro Pinheiro, Sergio Soares, Maria Teresa Baldassarre

Abstract: Background: One of the most important current challenges of Software Engineering (SE) research is to provide relevant evidence to practice. In health related fields, Rapid Reviews (RRs) have shown to be an effective method to achieve that goal. However, little is known about how the SE research community perceives the potential applicability of RRs. Aims: The goal of this study is to understand th… ▽ More Background: One of the most important current challenges of Software Engineering (SE) research is to provide relevant evidence to practice. In health related fields, Rapid Reviews (RRs) have shown to be an effective method to achieve that goal. However, little is known about how the SE research community perceives the potential applicability of RRs. Aims: The goal of this study is to understand the SE research community viewpoints towards the use of RRs as a means to provide evidence to practitioners. Method: To understand their viewpoints, we invited 37 researchers to analyze 50 opinion statements about RRs, and rate them according to what extent they agree with each statement. Q-Methodology was employed to identify the most salient viewpoints, represented by the so called factors. Results: Four factors were identified: Factor A groups undecided researchers that need more evidence before using RRs; Researchers grouped in Factor B are generally positive about RRs, but highlight the need to define minimum standards; Factor C researchers are more skeptical and reinforce the importance of high quality evidence; Researchers aligned to Factor D have a pragmatic point of view, considering RRs can be applied based on the context and constraints faced by practitioners. Conclusions: In conclusion, although there are opposing viewpoints, there are also some common grounds. For example, all viewpoints agree that both RRs and Systematic Reviews can be poorly or well conducted. △ Less

Submitted 26 June, 2019; originally announced June 2019.

Comments: To appear at ESEM 2019. 12 pages

arXiv:1906.05365 [pdf]

doi 10.1109/CHASE.2019.000040

Work Design and Job Rotation in Software Engineering: Results from an Industrial Study

Authors: Ronnie Santos, Marian Teresa Baldassarre, Fabio Queda Bueno da Silva, Cleyton Magalhaes, Luiz Fernando Capretz, Jorge Correia-Neto

Abstract: Job rotation is a managerial practice to be applied in the organizational environment to reduce job monotony, boredom, and exhaustion resulting from job simplification, specialization, and repetition. Previous studies have identified and discussed the use of project-to-project rotations in software practice, gathering empirical evidence from qualitative and field studies and pointing out set of wo… ▽ More Job rotation is a managerial practice to be applied in the organizational environment to reduce job monotony, boredom, and exhaustion resulting from job simplification, specialization, and repetition. Previous studies have identified and discussed the use of project-to-project rotations in software practice, gathering empirical evidence from qualitative and field studies and pointing out set of work-related factors that can be positively or negatively affected by this practice. Goal: We aim to collect and discuss the use of job rotation in software organizations in order to identify the potential benefits and limitations of this practice supported by the statement of existing theories of work design. Method: Using a survey-based research design, we collected and analyzed quantitative data from software engineers about how software development work is designed and organized, as well as the potential effects of job rotations on this work design. We investigated 21 work design constructs, along with job burnout, role conflict, role ambiguity, and two constructs related to job rotation. Results: We identified one new benefit and six new limitations of job rotation, not observed in previous studies and added new discussions to the existing body of knowledge concerning the use of job rotation in software engineering practice. Conclusion: We believe that these results represent another important step towards the construction of a consistent and comprehensive body of evidence that can guide future research and also inform practice about the potential positive and negative effects of job rotation in software development companies. △ Less

Submitted 12 June, 2019; originally announced June 2019.

Journal ref: IEEE/ACM 12th International Workshop on Cooperative and Human Aspects of Software Engineering, may 2019

arXiv:1904.12742 [pdf, other]

doi 10.1007/s10664-020-09818-7

How software engineering research aligns with design science: A review

Authors: Emelie Engström, Margaret-Anne Storey, Per Runeson, Martin Höst, Maria Teresa Baldassarre

Abstract: Background: Assessing and communicating software engineering research can be challenging. Design science is recognized as an appropriate research paradigm for applied research but is seldom referred to in software engineering. Applying the design science lens to software engineering research may improve the assessment and communication of research contributions. Aim: The aim of this study is 1) to… ▽ More Background: Assessing and communicating software engineering research can be challenging. Design science is recognized as an appropriate research paradigm for applied research but is seldom referred to in software engineering. Applying the design science lens to software engineering research may improve the assessment and communication of research contributions. Aim: The aim of this study is 1) to understand whether the design science lens helps summarize and assess software engineering research contributions, and 2) to characterize different types of design science contributions in the software engineering literature. Method: In previous research, we developed a visual abstract template, summarizing the core constructs of the design science paradigm. In this study, we use this template in a review of a set of 38 top software engineering publications to extract and analyze their design science contributions. Results: We identified five clusters of papers, classifying them according to their alignment with the design science paradigm. Conclusions: The design science lens helps emphasize the theoretical contribution of research output---in terms of technological rules---and reflect on the practical relevance, novelty, and rigor of the rules proposed by the research. △ Less

Submitted 8 November, 2019; v1 submitted 29 April, 2019; originally announced April 2019.

Comments: 32 pages, 10 figures

Journal ref: Empirical Software Engineering, 25(4), 2630-2660(2020)

arXiv:1807.02971 [pdf, other]

A Longitudinal Cohort Study on the Retainment of Test-Driven Development

Authors: Davide Fucci, Simone Romano, Maria Teresa Baldassarre, Danilo Caivano, Giuseppe Scanniello, Burak Thuran, Natalia Juristo

Abstract: Background: Test-Driven Development (TDD) is an agile software development practice, which is claimed to boost both external quality of software products and developers' productivity. Aims: We want to study (i) the TDD effects on the external quality of software products as well as the developers' productivity, and (ii) the retainment of TDD over a period of five months. Method: We conducted a (qu… ▽ More Background: Test-Driven Development (TDD) is an agile software development practice, which is claimed to boost both external quality of software products and developers' productivity. Aims: We want to study (i) the TDD effects on the external quality of software products as well as the developers' productivity, and (ii) the retainment of TDD over a period of five months. Method: We conducted a (quantitative) longitudinal cohort study with 30 third year undergraduate students in Computer Science at the University of Bari in Italy. Results: The use of TDD has a statistically significant effect neither on the external quality of software products nor on the developers' productivity. However, we observed that participants using TDD produced significantly more tests than those applying a non-TDD development process and that the retainment of TDD is particularly noticeable in the amount of tests written. Conclusions: Our results should encourage software companies to adopt TDD because who practices TDD tends to write more tests---having more tests can come in handy when testing software systems or localizing faults---and it seems that novice developers retain TDD. △ Less

Submitted 9 July, 2018; originally announced July 2018.

Comments: ESEM, October 2018, Oulu, Finland

Showing 1–22 of 22 results for author: Baldassarre, T