-
TRIAD: Automated Traceability Recovery based on Biterm-enhanced Deduction of Transitive Links among Artifacts
Authors:
Hui Gao,
Hongyu Kuang,
Wesley K. G. Assunção,
Christoph Mayr-Dorn,
Guoping Rong,
He Zhang,
Xiaoxing Ma,
Alexander Egyed
Abstract:
Traceability allows stakeholders to extract and comprehend the trace links among software artifacts introduced across the software life cycle, which provides significant support for software engineering tasks. Despite its proven benefits, software traceability is challenging to recover and maintain manually. Hence, many approaches for automated traceability have been proposed. Most rely on textual similarities among software artifacts, such as those based on Information Retrieval (IR). However, artifacts at different abstraction levels usually have different textual descriptions, which can greatly hinder the performance of IR-based approaches (e.g., a requirement in natural language may have a small textual similarity to a Java class). In this work, we leverage the consensual biterms and transitive relationships (i.e., inner- and outer-transitive links) based on intermediate artifacts to improve IR-based traceability recovery. We first extract and filter biterms from all source, intermediate, and target artifacts. We then use the consensual biterms from the intermediate artifacts to extend the biterms of both source and target artifacts, and finally deduce outer- and inner-transitive links to adjust text similarities between source and target artifacts. We conducted a comprehensive empirical evaluation based on five systems widely used in other literature to show that our approach can outperform four state-of-the-art approaches, and how its performance is affected by different conditions of source, intermediate, and target artifacts. The results indicate that our approach can outperform baseline approaches by over 15% in AP and over 10% in MAP on average.
Submitted 16 January, 2024; v1 submitted 28 December, 2023;
originally announced December 2023.
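The TRIAD abstract above describes adjusting source-to-target text similarities using links that pass through intermediate artifacts. The following is a minimal sketch of that general idea only, not the paper's actual formula: the function name `adjust_with_transitive_links`, the `weight` parameter, and the product-of-similarities path score are all assumptions made for illustration, and biterm extraction is not modeled.

```python
from typing import Dict, Tuple

# Similarity scores keyed by (artifact_a, artifact_b), each in [0, 1].
Sim = Dict[Tuple[str, str], float]


def adjust_with_transitive_links(
    direct: Sim, src_to_mid: Sim, mid_to_tgt: Sim, weight: float = 0.5
) -> Sim:
    """Boost each source-target score by its best path via an intermediate artifact.

    Hypothetical scoring rule: a path s -> m -> t is scored as the product of
    its two similarities, and the boost is scaled so results stay in [0, 1].
    """
    adjusted: Sim = {}
    for (src, tgt), score in direct.items():
        best_path = 0.0
        for (s, mid), s_m in src_to_mid.items():
            if s != src:
                continue
            m_t = mid_to_tgt.get((mid, tgt), 0.0)
            best_path = max(best_path, s_m * m_t)
        # Move the score toward 1.0 in proportion to the path strength.
        adjusted[(src, tgt)] = score + weight * best_path * (1.0 - score)
    return adjusted


if __name__ == "__main__":
    direct = {("req1", "ClassA"): 0.1, ("req1", "ClassB"): 0.2}
    src_to_mid = {("req1", "design1"): 0.8}
    mid_to_tgt = {("design1", "ClassA"): 0.9}
    print(adjust_with_transitive_links(direct, src_to_mid, mid_to_tgt))
```

In this toy input, the requirement has little direct textual overlap with `ClassA`, but both are similar to the same design artifact, so only that pair's score is raised; `ClassB` has no intermediate path and keeps its original score.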
-
Defining and executing temporal constraints for evaluating engineering artifact compliance
Authors:
Cosmina-Cristina Ratiu,
Christoph Mayr-Dorn,
Alexander Egyed
Abstract:
Engineering processes for safety-critical systems describe the steps and sequence that guide engineers from refining user requirements into executable code, as well as producing the artifacts, traces, and evidence that the resulting system is of high quality. Process compliance focuses on ensuring that the actual engineering work follows the described engineering processes as closely as possible. To this end, temporal constraints describe the ideal sequence of steps. Checking these process constraints, however, is still a daunting task that requires a lot of manual work and delivers feedback to engineers only late in the process. In this paper, we present an automated constraint checking approach that can incrementally check temporal constraints across inter-related engineering artifacts upon every artifact change, thereby enabling timely feedback to engineers on process deviations. Temporal constraints are expressed in the Object Constraint Language (OCL) extended with operators from Linear Temporal Logic (LTL). We demonstrate the ability of our approach to support a wide range of higher-level temporal patterns. We further show that for constraints in an industry-derived use case, the average evaluation time for a single constraint is around 0.2 milliseconds.
Submitted 20 December, 2023;
originally announced December 2023.
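The abstract above mentions temporal constraints built from Linear Temporal Logic operators and evaluated over changing artifacts. As a rough illustration of what such operators mean, the sketch below evaluates `always`, `eventually`, and `until` over a finite list of recorded artifact states. It is an assumption-laden toy: it uses plain Python predicates rather than OCL, finite-trace semantics, and full re-evaluation rather than the incremental checking the paper describes. The state fields `status` and `reviewed` are hypothetical.

```python
from typing import Any, Callable, Dict, Sequence

State = Dict[str, Any]          # one recorded snapshot of an artifact
Pred = Callable[[State], bool]  # a property checked on a single snapshot


def always(p: Pred, trace: Sequence[State]) -> bool:
    """p holds in every state of the trace."""
    return all(p(s) for s in trace)


def eventually(p: Pred, trace: Sequence[State]) -> bool:
    """p holds in at least one state of the trace."""
    return any(p(s) for s in trace)


def until(p: Pred, q: Pred, trace: Sequence[State]) -> bool:
    """q eventually holds, and p holds in every state before that point."""
    for s in trace:
        if q(s):
            return True
        if not p(s):
            return False
    return False  # q never occurred within the finite trace


if __name__ == "__main__":
    history = [
        {"status": "draft", "reviewed": False},
        {"status": "draft", "reviewed": True},
        {"status": "released", "reviewed": True},
    ]
    not_released = lambda s: s["status"] != "released"
    is_reviewed = lambda s: s["reviewed"]
    # "The artifact must not be released until it has been reviewed."
    print(until(not_released, is_reviewed, history))
```

A trace whose first state is already released but unreviewed would fail the same `until` check, which is the kind of process deviation such a constraint is meant to flag.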
-
Balanced Knowledge Distribution among Software Development Teams -- Observations from Open-Source and Closed-Source Software Development
Authors:
Saad Shafiq,
Christoph Mayr-Dorn,
Atif Mashkoor,
Alexander Egyed
Abstract:
In software development teams, developer turnover is among the primary reasons for project failures as it leads to a great void of knowledge and strain for the newcomers. Unfortunately, no established methods exist to measure how knowledge is distributed among development teams. Knowing how this knowledge evolves and is owned by key developers in a project helps managers reduce risks caused by turnover. To this end, this paper introduces a novel, realistic representation of domain knowledge distribution: the ConceptRealm. To construct the ConceptRealm, we employ a latent Dirichlet allocation model to represent textual features obtained from 300k issues and 1.3M comments from 518 open-source projects. We analyze whether the newly emerged issues and developers share similar concepts and how aligned the developers' concepts are with the team over time. We also investigate the impact of leaving members on the frequency of concepts. Finally, we evaluate the soundness of our approach on closed-source software, thus allowing the validation of the results from a practical standpoint. We find that the ConceptRealm can represent the high-level domain knowledge within a team and can be utilized to predict the alignment of developers with issues. We also observe that projects exhibit many keepers independent of project maturity and that keepers leaving abruptly harms the team's concept familiarity.
Submitted 26 July, 2022;
originally announced July 2022.
-
Do Communities in Developer Interaction Networks align with Subsystem Developer Teams? An Empirical Study of Open Source Systems
Authors:
Usman Ashraf,
Christoph Mayr-Dorn,
Atif Mashkoor,
Alexander Egyed,
Sebastiano Panichella
Abstract:
Studies over the past decade demonstrated that developers contributing to open source software systems tend to self-organize in "emerging" communities. This latent community structure has a significant impact on software quality. While several approaches address the analysis of developer interaction networks, the question of whether these emerging communities align with the developer teams working on various subsystems remains unanswered. Work on socio-technical congruence implies that people who work on the same task or artifact need to coordinate and thus communicate, potentially forming stronger interaction ties. Our empirical study of 10 open source projects revealed that developer communities change considerably across a project's lifetime (hence implying that relevant relations between developers change) and that their alignment with subsystem developer teams is mostly low. However, subsystem teams tend to remain more stable. These insights are useful for practitioners and researchers to better understand the developer interaction structure of open source systems.
Submitted 8 April, 2021;
originally announced April 2021.
-
TaskAllocator: A Recommendation Approach for Role-based Tasks Allocation in Agile Software Development
Authors:
Saad Shafiq,
Atif Mashkoor,
Christoph Mayr-Dorn,
Alexander Egyed
Abstract:
In this paper, we propose a recommendation approach -- TaskAllocator -- to predict the assignment of incoming tasks to suitable roles. The proposed approach, identifying team roles rather than individual persons, allows project managers to perform better task allocation in case the individual developers are over-utilized or have moved on to different roles/projects. We evaluated our approach on ten agile case study projects obtained from the Taiga.io repository. To determine the TaskAllocator's performance, we conducted a benchmark study comparing it with contemporary machine learning models. The applicability of the TaskAllocator was assessed through a plugin that can be integrated with JIRA and provides recommendations about suitable roles whenever a new task is added to the project. Lastly, the source code of the plugin and the dataset employed have been made public.
Submitted 3 March, 2021;
originally announced March 2021.
-
Machine Learning for Software Engineering: A Systematic Mapping
Authors:
Saad Shafiq,
Atif Mashkoor,
Christoph Mayr-Dorn,
Alexander Egyed
Abstract:
Context: The software development industry is rapidly adopting machine learning for transitioning modern-day software systems towards highly intelligent and self-learning systems. However, the full potential of machine learning for improving the software engineering life cycle itself is yet to be discovered, i.e., to what extent machine learning can help reduce the effort/complexity of software engineering and improve the quality of resulting software systems. To date, no comprehensive study exists that explores the current state of the art on the adoption of machine learning across software engineering life cycle stages. Objective: This article addresses the aforementioned problem and aims to present the state of the art on the growing number of uses of machine learning in software engineering. Method: We conduct a systematic mapping study on applications of machine learning to software engineering following the standard guidelines and principles of empirical software engineering. Results: This study introduces a machine learning for software engineering (MLSE) taxonomy classifying the state-of-the-art machine learning techniques according to their applicability to various software engineering life cycle stages. Overall, 227 articles were rigorously selected and analyzed as a result of this study. Conclusion: From the selected articles, we explore a variety of aspects that should be helpful to academics and practitioners alike in understanding the potential of adopting machine learning techniques during software engineering projects.
Submitted 27 May, 2020;
originally announced May 2020.