The More the Merrier? Navigating Accuracy vs. Energy Efficiency Design Trade-Offs in Ensemble Learning Systems

Authors: Rafiullah Omar, Justus Bogner, Henry Muccini, Patricia Lago, Silverio Martínez-Fernández, Xavier Franch

Abstract: Background: Machine learning (ML) model composition is a popular technique to mitigate shortcomings of a single ML model and to design more effective ML-enabled systems. While ensemble learning, i.e., forwarding the same request to several models and fusing their predictions, has been studied extensively for accuracy, we have insufficient knowledge about how to design energy-efficient ensembles. Objective: We therefore analyzed three types of design decisions for ensemble learning regarding a potential trade-off between accuracy and energy consumption: a) ensemble size, i.e., the number of models in the ensemble, b) fusion methods (majority voting vs. a meta-model), and c) partitioning methods (whole-dataset vs. subset-based training). Methods: By combining four popular ML algorithms for classification in different ensembles, we conducted a full factorial experiment with 11 ensembles x 4 datasets x 2 fusion methods x 2 partitioning methods (176 combinations). For each combination, we measured accuracy (F1-score) and energy consumption in J (for both training and inference). Results: While a larger ensemble size significantly increased energy consumption (size 2 ensembles consumed 37.49% less energy than size 3 ensembles, which in turn consumed 26.96% less energy than the size 4 ensembles), it did not significantly increase accuracy. Furthermore, majority voting outperformed meta-model fusion both in terms of accuracy (Cohen's d of 0.38) and energy consumption (Cohen's d of 0.92). Lastly, subset-based training led to significantly lower energy consumption (Cohen's d of 0.91), while training on the whole dataset did not increase accuracy significantly. Conclusions: From a Green AI perspective, we recommend designing ensembles of small size (2 or maximum 3 models), using subset-based training, majority voting, and energy-efficient ML algorithms like decision trees, Naive Bayes, or KNN.

Submitted 3 July, 2024; originally announced July 2024.

Comments: Currently under review at a journal

arXiv:2402.07323 [pdf, other]

Lessons Learned from Mining the Hugging Face Repository

Authors: Joel Castaño, Silverio Martínez-Fernández, Xavier Franch

Abstract: The rapidly evolving fields of Machine Learning (ML) and Artificial Intelligence have witnessed the emergence of platforms like Hugging Face (HF) as central hubs for model development and sharing. This experience report synthesizes insights from two comprehensive studies conducted on HF, focusing on carbon emissions and the evolutionary and maintenance aspects of ML models. Our objective is to provide a practical guide for future researchers embarking on mining software repository studies within the HF ecosystem to enhance the quality of these studies. We delve into the intricacies of the replication package used in our studies, highlighting the pivotal tools and methodologies that facilitated our analysis. Furthermore, we propose a nuanced stratified sampling strategy tailored for the diverse HF Hub dataset, ensuring a representative and comprehensive analytical approach. The report also introduces preliminary guidelines, transitioning from repository mining to cohort studies, to establish causality in repository mining studies, particularly within the ML model of HF context. This transition is inspired by existing frameworks and is adapted to suit the unique characteristics of the HF model ecosystem. Our report serves as a guiding framework for researchers, contributing to the responsible and sustainable advancement of ML, and fostering a deeper understanding of the broader implications of ML models.

Submitted 11 February, 2024; originally announced February 2024.

Comments: Accepted at the 2024 ACM/IEEE 1st International Workshop on Methodological Issues with Empirical Studies in Software Engineering (WSESE)

arXiv:2401.12075 [pdf, other]

NLP-based Relation Extraction Methods in Requirements Engineering

Authors: Quim Motger, Xavier Franch

Abstract: In the context of requirements engineering, relation extraction is the task of documenting the traceability between requirements artefacts. When dealing with textual requirements (i.e., requirements expressed using natural language), relation extraction becomes a cognitively challenging task, especially in terms of ambiguity and required effort from domain-experts. Hence, in highly-adaptive, large-scale environments, effective and efficient automated relation extraction using natural language processing techniques becomes essential. In this chapter, we present a comprehensive overview of natural language-based relation extraction from text-based requirements. We initially describe the fundamentals of requirements relations based on the most relevant literature in the field, including the most common requirements relations types. The core of the chapter is composed by two main sections: (i) natural language techniques for the identification and categorization of requirements relations (i.e., syntactic vs. semantic techniques), and (ii) information extraction methods for the task of relation extraction (i.e., retrieval-based vs. machine learning-based methods). We complement this analysis with the state-of-the-art challenges and the envisioned future research directions. Overall, this chapter aims at providing a clear perspective on the theoretical and practical fundamentals in the field of natural language-based relation extraction.

Submitted 22 January, 2024; originally announced January 2024.

Comments: This article will appear as a chapter in a book provisionally titled "Natural Language Processing for Requirements Engineering", to be published by Springer

arXiv:2401.03833 [pdf, other]

T-FREX: A Transformer-based Feature Extraction Method from Mobile App Reviews

Authors: Quim Motger, Alessio Miaschi, Felice Dell'Orletta, Xavier Franch, Jordi Marco

Abstract: Mobile app reviews are a large-scale data source for software-related knowledge generation activities, including software maintenance, evolution and feedback analysis. Effective extraction of features (i.e., functionalities or characteristics) from these reviews is key to support analysis on the acceptance of these features, identification of relevant new feature requests and prioritization of feature development, among others. Traditional methods focus on syntactic pattern-based approaches, typically context-agnostic, evaluated on a closed set of apps, difficult to replicate and limited to a reduced set and domain of apps. Meanwhile, the pervasiveness of Large Language Models (LLMs) based on the Transformer architecture in software engineering tasks lays the groundwork for empirical evaluation of the performance of these models to support feature extraction. In this study, we present T-FREX, a Transformer-based, fully automatic approach for mobile app review feature extraction. First, we collect a set of ground truth features from users in a real crowdsourced software recommendation platform and transfer them automatically into a dataset of app reviews. Then, we use this newly created dataset to fine-tune multiple LLMs on a named entity recognition task under different data configurations. We assess the performance of T-FREX with respect to this ground truth, and we complement our analysis by comparing T-FREX with a baseline method from the field. Finally, we assess the quality of new features predicted by T-FREX through an external human evaluation. Results show that T-FREX outperforms on average the traditional syntactic-based method, especially when discovering new features from a domain for which the model has been fine-tuned.

Submitted 8 January, 2024; originally announced January 2024.

Comments: Accepted at IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2024). 12 pages (including references), 5 figures, 4 tables

arXiv:2312.01981 [pdf]

doi 10.1007/978-3-031-57327-9_16

Unveiling Competition Dynamics in Mobile App Markets through User Reviews

Authors: Quim Motger, Xavier Franch, Vincenzo Gervasi, Jordi Marco

Abstract: User reviews published in mobile app repositories are essential for understanding user satisfaction and engagement within a specific market segment. Manual analysis of reviews is impractical due to the large data volume, and automated analysis faces challenges like data synthesis and reporting. This complicates the task for app providers in identifying patterns and significant events, especially in assessing the influence of competitor apps. Furthermore, review-based research is mostly limited to a single app or a single app provider, excluding potential competition analysis. Consequently, there is an open research challenge in leveraging user reviews to support cross-app analysis within a specific market segment. Following a case-study research method in the microblogging app market, we introduce an automatic, novel approach to support mobile app market analysis. Our approach leverages quantitative metrics and event detection techniques based on newly published user reviews. Significant events are proactively identified and summarized by comparing metric deviations with historical baseline indicators within the lifecycle of a mobile app. Results from our case study show empirical evidence of the detection of relevant events within the selected market segment, including software- or release-based events, contextual events and the emergence of new competitors.

Submitted 7 June, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

Journal ref: Requirements Engineering: Foundation for Software Quality. REFSQ 2024. Lecture Notes in Computer Science, vol 14588

arXiv:2311.13380 [pdf, other]

Analyzing the Evolution and Maintenance of ML Models on Hugging Face

Authors: Joel Castaño, Silverio Martínez-Fernández, Xavier Franch, Justus Bogner

Abstract: Hugging Face (HF) has established itself as a crucial platform for the development and sharing of machine learning (ML) models. This repository mining study, which delves into more than 380,000 models using data gathered via the HF Hub API, aims to explore the community engagement, evolution, and maintenance around models hosted on HF, aspects that have yet to be comprehensively explored in the literature. We first examine the overall growth and popularity of HF, uncovering trends in ML domains, framework usage, authors grouping and the evolution of tags and datasets used. Through text analysis of model card descriptions, we also seek to identify prevalent themes and insights within the developer community. Our investigation further extends to the maintenance aspects of models, where we evaluate the maintenance status of ML models, classify commit messages into various categories (corrective, perfective, and adaptive), analyze the evolution across development stages of commits metrics and introduce a new classification system that estimates the maintenance status of models based on multiple attributes. This study aims to provide valuable insights about ML model maintenance and evolution that could inform future model development strategies on platforms like HF.

Submitted 5 February, 2024; v1 submitted 22 November, 2023; originally announced November 2023.

Comments: Accepted at the 2024 IEEE/ACM 21st International Conference on Mining Software Repositories (MSR)

arXiv:2307.09964 [pdf, other]

Towards green AI-based software systems: an architecture-centric approach (GAISSA)

Authors: Silverio Martínez-Fernández, Xavier Franch, Francisco Durán

Abstract: Nowadays, AI-based systems have achieved outstanding results and have outperformed humans in different domains. However, the processes of training AI models and inferring from them require high computational resources, which pose a significant challenge in the current energy efficiency societal demand. To cope with this challenge, this research project paper describes the main vision, goals, and expected outcomes of the GAISSA project. The GAISSA project aims at providing data scientists and software engineers tool-supported, architecture-centric methods for the modelling and development of green AI-based systems. Although the project is in an initial stage, we describe the current research results, which illustrate the potential to achieve GAISSA objectives.

Submitted 19 July, 2023; originally announced July 2023.

Comments: Accepted for publication as full paper - 2023 49th Euromicro Conference Series on Software Engineering and Advanced Applications (SEAA)

arXiv:2307.05520 [pdf, other]

Do DL models and training environments have an impact on energy consumption?

Authors: Santiago del Rey, Silverio Martínez-Fernández, Luís Cruz, Xavier Franch

Abstract: Current research in the computer vision field mainly focuses on improving Deep Learning (DL) correctness and inference time performance. However, there is still little work on the huge carbon footprint that has training DL models. This study aims to analyze the impact of the model architecture and training environment when training greener computer vision models. We divide this goal into two research questions. First, we analyze the effects of model architecture on achieving greener models while keeping correctness at optimal levels. Second, we study the influence of the training environment on producing greener models. To investigate these relationships, we collect multiple metrics related to energy efficiency and model correctness during the models' training. Then, we outline the trade-offs between the measured energy efficiency and the models' correctness regarding model architecture, and their relationship with the training environment. We conduct this research in the context of a computer vision system for image classification. In conclusion, we show that selecting the proper model architecture and training environment can reduce energy consumption dramatically (up to 81.38%) at the cost of negligible decreases in correctness. Also, we find evidence that GPUs should scale with the models' computational complexity for better energy efficiency.

Submitted 3 January, 2024; v1 submitted 7 July, 2023; originally announced July 2023.

Comments: 49th Euromicro Conference Series on Software Engineering and Advanced Applications (SEAA). 8 pages, 3 figures

arXiv:2305.11164 [pdf, other]

doi 10.1109/ESEM56168.2023.10304801

Exploring the Carbon Footprint of Hugging Face's ML Models: A Repository Mining Study

Authors: Joel Castaño, Silverio Martínez-Fernández, Xavier Franch, Justus Bogner

Abstract: The rise of machine learning (ML) systems has exacerbated their carbon footprint due to increased capabilities and model sizes. However, there is scarce knowledge on how the carbon footprint of ML models is actually measured, reported, and evaluated. In light of this, the paper aims to analyze the measurement of the carbon footprint of 1,417 ML models and associated datasets on Hugging Face, which is the most popular repository for pretrained ML models. The goal is to provide insights and recommendations on how to report and optimize the carbon efficiency of ML models. The study includes the first repository mining study on the Hugging Face Hub API on carbon emissions. This study seeks to answer two research questions: (1) how do ML model creators measure and report carbon emissions on Hugging Face Hub?, and (2) what aspects impact the carbon emissions of training ML models? The study yielded several key findings. These include a stalled proportion of carbon emissions-reporting models, a slight decrease in reported carbon footprint on Hugging Face over the past 2 years, and a continued dominance of NLP as the main application domain. Furthermore, the study uncovers correlations between carbon emissions and various attributes such as model size, dataset size, and ML application domains. These results highlight the need for software measurements to improve energy reporting practices and promote carbon-efficient model development within the Hugging Face community. In response to this issue, two classifications are proposed: one for categorizing models based on their carbon emission reporting practices and another for their carbon efficiency. The aim of these classification proposals is to foster transparency and sustainable model development within the ML community.

Submitted 29 November, 2023; v1 submitted 18 May, 2023; originally announced May 2023.

Comments: Accepted at the 2023 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM)

Journal ref: 2023 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM) (2023) 260-271

arXiv:2304.10265 [pdf, other]

Replication in Requirements Engineering: the NLP for RE Case

Authors: Sallam Abualhaija, F. Başak Aydemir, Fabiano Dalpiaz, Davide Dell'Anna, Alessio Ferrari, Xavier Franch, Davide Fucci

Abstract: [Context] Natural language processing (NLP) techniques have been widely applied in the requirements engineering (RE) field to support tasks such as classification and ambiguity detection. Despite its empirical vocation, RE research has given limited attention to replication of NLP for RE studies. Replication is hampered by several factors, including the context specificity of the studies, the heterogeneity of the tasks involving NLP, the tasks' inherent hairiness, and, in turn, the heterogeneous reporting structure. [Solution] To address these issues, we propose a new artifact, referred to as ID-Card, whose goal is to provide a structured summary of research papers emphasizing replication-relevant information. We construct the ID-Card through a structured, iterative process based on design science. [Results] In this paper: (i) we report on hands-on experiences of replication, (ii) we review the state-of-the-art and extract replication-relevant information, (iii) we identify, through focus groups, challenges across two typical dimensions of replication: data annotation and tool reconstruction, and (iv) we present the concept and structure of the ID-Card to mitigate the identified challenges. [Contribution] This study aims to create awareness of replication in NLP for RE. We propose an ID-Card that is intended to foster study replication, but can also be used in other contexts, e.g., for educational purposes.

Submitted 18 April, 2024; v1 submitted 20 April, 2023; originally announced April 2023.

arXiv:2302.00967 [pdf, other]

Energy Efficiency of Training Neural Network Architectures: An Empirical Study

Authors: Yinlena Xu, Silverio Martínez-Fernández, Matias Martinez, Xavier Franch

Abstract: The evaluation of Deep Learning models has traditionally focused on criteria such as accuracy, F1 score, and related measures. The increasing availability of high computational power environments allows the creation of deeper and more complex models. However, the computations needed to train such models entail a large carbon footprint. In this work, we study the relations between DL model architectures and their environmental impact in terms of energy consumed and CO$_2$ emissions produced during training by means of an empirical study using Deep Convolutional Neural Networks. Concretely, we study: (i) the impact of the architecture and the location where the computations are hosted on the energy consumption and emissions produced; (ii) the trade-off between accuracy and energy efficiency; and (iii) the difference on the method of measurement of the energy consumed using software-based and hardware-based tools.

Submitted 2 February, 2023; originally announced February 2023.

Comments: Accepted in HICSS 2023. For its published version refer to the Proceedings of the 56th Hawaii International Conference on System Sciences; URI https://hdl.handle.net/10125/102727

ACM Class: D.2; I.2

Journal ref: Proceedings of the 56th Hawaii International Conference on System Sciences, pp. 781-790 (2023)

arXiv:2211.12104 [pdf, ps, other]

doi 10.1145/3639478.3643114

Energy Consumption of Automated Program Repair

Authors: Matias Martinez, Silverio Martínez-Fernández, Xavier Franch

Abstract: Automated program repair (APR) aims to automatize the process of repairing software bugs in order to reduce the cost of maintaining software programs. Moreover, the success (given by the accuracy metric) of APR approaches has increased in recent years. However, no previous work has considered the energy impact of repairing bugs automatically using APR. The field of green software research aims to measure the energy consumption required to develop, maintain, and use software products. This paper combines, for the first time, the APR and Green software research fields. We have as main goal to define the foundation for measuring the energy consumption of the APR activity. We measure the energy consumption of ten traditional program repair tools for Java and ten fine-tuned Large-Language Models (LLM) on source code trying to repair real bugs from Defects4J, a set of real buggy programs. The initial results from this experiment show the existing trade-off between energy consumption and the ability to correctly repair bugs: Some APR tools are capable of achieving higher accuracy by spending less energy than other tools.

Submitted 5 February, 2024; v1 submitted 22 November, 2022; originally announced November 2022.

Journal ref: 2024 IEEE/ACM 46th International Conference on Software Engineering: Companion Proceedings (ICSE-Companion '24), April 14--20, 2024, Lisbon, Portugal

arXiv:2207.03689 [pdf, other]

doi 10.7717/peerj-cs.1454

Guiding the retraining of convolutional neural networks against adversarial inputs

Authors: Francisco Durán López, Silverio Martínez-Fernández, Michael Felderer, Xavier Franch

Abstract: Background: When using deep learning models, there are many possible vulnerabilities and some of the most worrying are the adversarial inputs, which can cause wrong decisions with minor perturbations. Therefore, it becomes necessary to retrain these models against adversarial inputs, as part of the software testing process addressing the vulnerability to these inputs. Furthermore, for an energy efficient testing and retraining, data scientists need support on which are the best guidance metrics and optimal dataset configurations. Aims: We examined four guidance metrics for retraining convolutional neural networks and three retraining configurations. Our goal is to improve the models against adversarial inputs regarding accuracy, resource utilization and time from the point of view of a data scientist in the context of image classification. Method: We conducted an empirical study in two datasets for image classification. We explore: (a) the accuracy, resource utilization and time of retraining convolutional neural networks by ordering new training set by four different guidance metrics (neuron coverage, likelihood-based surprise adequacy, distance-based surprise adequacy and random), (b) the accuracy and resource utilization of retraining convolutional neural networks with three different configurations (from scratch and augmented dataset, using weights and augmented dataset, and using weights and only adversarial inputs). Results: We reveal that retraining with adversarial inputs from original weights and by ordering with surprise adequacy metrics gives the best model w.r.t. the used metrics. Conclusions: Although more studies are necessary, we recommend data scientists to use the above configuration and metrics to deal with the vulnerability to adversarial inputs of deep learning models, as they can improve their models against adversarial inputs without using many inputs.

Submitted 12 July, 2022; v1 submitted 8 July, 2022; originally announced July 2022.

arXiv:2110.09366 [pdf]

doi 10.1109/TSE.2021.3113558

Use and Misuse of the Term Experiment in Mining Software Repositories Research

Authors: Claudia Ayala, Burak Turhan, Xavier Franch, Natalia Juristo

Abstract: The significant momentum and importance of Mining Software Repositories (MSR) in Software Engineering (SE) has fostered new opportunities and challenges for extensive empirical research. However, MSR researchers seem to struggle to characterize the empirical methods they use into the existing empirical SE body of knowledge. This is especially the case of MSR experiments. To provide evidence on the special characteristics of MSR experiments and their differences with experiments traditionally acknowledged in SE so far, we elicited the hallmarks that differentiate an experiment from other types of empirical studies and characterized the hallmarks and types of experiments in MSR. We analyzed MSR literature obtained from a small-scale systematic mapping study to assess the use of the term experiment in MSR. We found that 19% of the papers claiming to be an experiment are indeed not an experiment at all but also observational studies, so they use the term in a misleading way. From the remaining 81% of the papers, only one of them refers to a genuine controlled experiment while the others stand for experiments with limited control. MSR researchers tend to overlook such limitations, compromising the interpretation of the results of their studies. We provide recommendations and insights to support the improvement of MSR experiments.

Submitted 18 October, 2021; originally announced October 2021.

arXiv:2110.03820 [pdf]

How Tertiary Studies perform Quality Assessment of Secondary Studies in Software Engineering

Authors: Dolors Costal, Carles Farré, Xavier Franch, Carme Quer

Abstract: Context: Tertiary studies are becoming increasingly popular in software engineering as an instrument to synthesise evidence on a research topic in a systematic way. In order to understand and contextualize their findings, it is important to assess the quality of the selected secondary studies. Objective: This paper aims to provide a state of the art on the assessment of secondary studies' quality as conducted in tertiary studies in the area of software engineering, reporting the frameworks used as instruments, the facets examined in these frameworks, and the purposes of the quality assessment. Method: We designed this study as a systematic mapping responding to four research questions derived from the objective above. We applied a rigorous search protocol over the Scopus digital library, resulting in 47 papers after application of inclusion and exclusion criteria. The extracted data was synthesised using content analysis. Results: A majority of tertiary studies perform quality assessment. It is not often used for excluding studies, but to support some kind of investigation. The DARE quality assessment framework is the most frequently used, with customizations in some cases to cover missing facets. We outline the first steps towards building a new framework to address the shortcomings identified. Conclusion: This paper is a step forward establishing a foundation for researchers in two different ways. As authors of tertiary studies, understanding the different possibilities in which they can perform quality assessment of secondary studies. As readers, having an instrument to understand the methodological rigor upon which tertiary studies may claim their findings.

Submitted 7 October, 2021; originally announced October 2021.

Comments: Preprint of the paper presented at the XXIV Iberoamerican Conference on Software Engineering (ESELAW@CIbSE). Best paper award. If interested in the topic, check also arXiv:2109.08226 (postprint of a paper accepted at ESEM'21)

ACM Class: D.2

arXiv:2109.15284 [pdf, ps, other]

Which Design Decisions in AI-enabled Mobile Applications Contribute to Greener AI?

Authors: Roger Creus Castanyer, Silverio Martínez-Fernández, Xavier Franch

Abstract: Background: The construction, evolution and usage of complex artificial intelligence (AI) models demand expensive computational resources. While currently available high-performance computing environments support well this complexity, the deployment of AI models in mobile devices, which is an increasing trend, is challenging. Mobile applications consist of environments with low computational resources and hence imply limitations in the design decisions during the AI-enabled software engineering lifecycle that balance the trade-off between the accuracy and the complexity of the mobile applications. Objective: Our objective is to systematically assess the trade-off between accuracy and complexity when deploying complex AI models (e.g. neural networks) to mobile devices, which have an implicit resource limitation. We aim to cover (i) the impact of the design decisions on the achievement of high-accuracy and low resource-consumption implementations; and (ii) the validation of profiling tools for systematically promoting greener AI. Method: This confirmatory registered report consists of a plan to conduct an empirical study to quantify the implications of the design decisions on AI-enabled applications performance and to report experiences of the end-to-end AI-enabled software engineering lifecycle. Concretely, we will implement both image-based and language-based neural networks in mobile applications to solve multiple image classification and text classification problems on different benchmark datasets. Overall, we plan to model the accuracy and complexity of AI-enabled applications in operation with respect to their design decisions and will provide tools for allowing practitioners to gain consciousness of the quantitative relationship between the design decisions and the green characteristics of study.

Submitted 30 May, 2023; v1 submitted 28 September, 2021; originally announced September 2021.

Comments: The current version is under review

ACM Class: D.2; I.2

arXiv:2109.08226 [pdf]

doi 10.1145/3475716.3484190

Inclusion and Exclusion Criteria in Software Engineering Tertiary Studies: A Systematic Mapping and Emerging Framework

Authors: Dolors Costal, Carles Farré, Xavier Franch, Carme Quer

Abstract: Context: Tertiary studies in software engineering (TS@SE) are widely used to synthesise evidence on a research topic systematically. As part of their protocol, TS@SE define inclusion and exclusion criteria (IC/EC) aimed at selecting those secondary studies (SS) to be included in the analysis. Aims: To provide a state of the art on the definition and application of IC/EC in TS@SE, and from the results of this analysis, we outline an emerging framework, TSICEC, to be used by SE researchers. Method: To provide the state of the art, we conducted a systematic mapping (SM) combining automatic search and snowballing over the body of SE scientific literature, which led to 50 papers after application of our own IC/EC. The extracted data was synthesised using content analysis. The results were used to define a first version of TSICEC. Results: The SM resulted in a coding schema, and a thorough analysis of the selected papers on the basis of this coding. Our TSICEC framework includes guidelines for the definition of IC/EC in TS@SE. Conclusion: This paper is a step forward establishing a foundation for researchers in two ways. As authors, understanding the different possibilities to define IC/EC and apply them to select SS. As readers, having an instrument to understand the methodological rigor upon which TS@SE may claim their findings.

Submitted 16 September, 2021; originally announced September 2021.

arXiv:2106.10901 [pdf, other]

doi 10.1145/3527450

Software-Based Dialogue Systems: Survey, Taxonomy and Challenges

Authors: Quim Motger, Xavier Franch, Jordi Marco

Abstract: The use of natural language interfaces in the field of human-computer interaction is undergoing intense study through dedicated scientific and industrial research. The latest contributions in the field, including deep learning approaches like recurrent neural networks, the potential of context-aware strategies and user-centred design approaches, have brought back the attention of the community to software-based dialogue systems, generally known as conversational agents or chatbots. Nonetheless, and given the novelty of the field, a generic, context-independent overview on the current state of research of conversational agents covering all research perspectives involved is missing. Motivated by this context, this paper reports a survey of the current state of research of conversational agents through a systematic literature review of secondary studies. The conducted research is designed to develop an exhaustive perspective through a clear presentation of the aggregated knowledge published by recent literature within a variety of domains, research focuses and contexts. As a result, this research proposes a holistic taxonomy of the different dimensions involved in the conversational agents' field, which is expected to help researchers and to lay the groundwork for future research in the field of natural language interfaces.

Submitted 6 February, 2024; v1 submitted 21 June, 2021; originally announced June 2021.

arXiv:2105.13961 [pdf, other]

doi 10.1109/TSE.2021.3087792

A Study about the Knowledge and Use of Requirements Engineering Standards in Industry

Authors: Xavier Franch, Martin Glinz, Daniel Mendez, Norbert Seyff

Abstract: Context: The use of standards is considered a vital part of any engineering discipline. So one could expect that standards play an important role in Requirements Engineering (RE) as well. However, little is known about the actual knowledge and use of RE-related standards in industry. Objective: In this article, we investigate to which extent standards and related artifacts such as templates or guidelines are known and used by RE practitioners. Method: To this end, we have conducted a questionnaire-based online survey. We could analyze the replies from 90 RE practitioners using a combination of closed and open-text questions. Results: Our results indicate that the knowledge and use of standards and related artifacts in RE is less widespread than one might expect from an engineering perspective. For example, about 47% of the respondents working as requirements engineers or business analysts do not know the core standard in RE, ISO/IEC/IEEE 29148. Participants in our study mostly use standards by personal decision rather than being imposed by their respective company, customer, or regulator. Beyond insufficient knowledge, we also found cultural and organizational factors impeding the widespread adoption of standards in RE. Conclusions: Overall, our results provide empirically informed insights into the actual use of standards and related artifacts in RE practice and - indirectly - about the value that the current standards create for RE practitioners.

Submitted 6 September, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

Comments: Preprint accepted for publication at IEEE Transactions on Software Engineering and presented as Journal First at the International Requirements Engineering conference 2021. Latest update: Several smaller corrections along the creation of the final version of the manuscript

Journal ref: Transactions on Software Engineering 2022

arXiv:2105.01984 [pdf, other]

doi 10.1145/3487043

Software Engineering for AI-Based Systems: A Survey

Authors: Silverio Martínez-Fernández, Justus Bogner, Xavier Franch, Marc Oriol, Julien Siebert, Adam Trendowicz, Anna Maria Vollmer, Stefan Wagner

Abstract: AI-based systems are software systems with functionalities enabled by at least one AI component (e.g., for image- and speech-recognition, and autonomous driving). AI-based systems are becoming pervasive in society due to advances in AI. However, there is limited synthesized knowledge on Software Engineering (SE) approaches for building, operating, and maintaining AI-based systems. To collect and analyze state-of-the-art knowledge about SE for AI-based systems, we conducted a systematic mapping study. We considered 248 studies published between January 2010 and March 2020. SE for AI-based systems is an emerging research area, where more than 2/3 of the studies have been published since 2018. The most studied properties of AI-based systems are dependability and safety. We identified multiple SE approaches for AI-based systems, which we classified according to the SWEBOK areas. Studies related to software testing and software quality are very prevalent, while areas like software maintenance seem neglected. Data-related issues are the most recurrent challenges. Our results are valuable for: researchers, to quickly understand the state of the art and learn which topics need more research; practitioners, to learn about the approaches and challenges that SE entails for AI-based systems; and, educators, to bridge the gap among SE and AI in their curricula.

Submitted 2 September, 2021; v1 submitted 5 May, 2021; originally announced May 2021.

Comments: Accepted in ACM Transactions on Software Engineering and Methodology (TOSEM). For its published version refer to the Journal of ACM TOSEM

ACM Class: D.2; I.2

Journal ref: ACM Trans. Softw. Eng. Methodol. 31, 2, Article 37e (March 2022), 59 pages

arXiv:2103.10811 [pdf, other]

Improving Web API Usage Logging

Authors: Rediana Koçi, Xavier Franch, Petar Jovanovic, Alberto Abelló

Abstract: A Web API (WAPI) is a type of API whose interaction with its consumers is done through the Internet. While being accessed through the Internet can be challenging, mostly when WAPIs evolve, it gives providers the possibility to monitor their usage, and understand and analyze consumers' behavior. Currently, WAPI usage is mostly logged for traffic monitoring and troubleshooting. Even though they contain invaluable information regarding consumers' behavior, they are not sufficiently used by providers. In this paper, we first consider two phases of the application development lifecycle, and based on them we distinguish two different types of usage logs, namely development logs and production logs. For each of them we show the potential analyses (e.g., WAPI usability evaluation, consumers' needs identification) that can be performed, as well as the main impediments, that may be caused by the unsuitable log format. We then conduct a case study using logs of the same WAPI from different deployments and different formats, to demonstrate the occurrence of these impediments and at the same time the importance of a proper log format. Next, based on the case study results, we present the main quality issues of WAPI log data and explain their impact on data analyses. For each of them, we give some practical suggestions on how to deal with them, as well as mitigating their root cause.

Submitted 19 March, 2021; originally announced March 2021.

arXiv:2103.07286 [pdf, other]

Integration of Convolutional Neural Networks in Mobile Applications

Authors: Roger Creus Castanyer, Silverio Martínez-Fernández, Xavier Franch

Abstract: When building Deep Learning (DL) models, data scientists and software engineers manage the trade-off between their accuracy, or any other suitable success criteria, and their complexity. In an environment with high computational power, a common practice is making the models go deeper by designing more sophisticated architectures. However, in the context of mobile devices, which possess less computational power, keeping complexity under control is a must. In this paper, we study the performance of a system that integrates a DL model as a trade-off between the accuracy and the complexity. At the same time, we relate the complexity to the efficiency of the system. With this, we present a practical study that aims to explore the challenges met when optimizing the performance of DL models becomes a requirement. Concretely, we aim to identify: (i) the most concerning challenges when deploying DL-based software in mobile applications; and (ii) the path for optimizing the performance trade-off. We obtain results that verify many of the identified challenges in the related work such as the availability of frameworks and the software-data dependency. We provide a documentation of our experience when facing the identified challenges together with the discussion of possible solutions to them. Additionally, we implement a solution to the sustainability of the DL models when deployed in order to reduce the severity of other identified challenges. Moreover, we relate the performance trade-off to a new defined challenge featuring the impact of the complexity in the obtained accuracy. Finally, we discuss and motivate future work that aims to provide solutions to the more open challenges found.

Submitted 11 March, 2021; originally announced March 2021.

Comments: Pre-print. Accepted and to be published in WAIN@ICSE 2021

arXiv:2102.11556 [pdf]

doi 10.1007/s00766-020-00345-x

The State-of-Practice in Requirements Elicitation: An Extended Interview Study at 12 Companies

Authors: Cristina Palomares, Xavier Franch, Carme Quer, Panagiota Chatzipetrou, Lidia López, Tony Gorschek

Abstract: Context. Requirements engineering remains a discipline that is faced with a large number of challenges, including the implementation of a requirements elicitation process in industry. Although several proposals have been suggested by researchers and academics, little is known of the practices that are actually followed in industry. Objective. We investigate the SoTA with respect to requirements elicitation, examining practitioners' practices. We focus on the techniques, the roles involved, and the challenges associated to the process. Method. We conducted an interview-based survey study involving 24 practitioners from 12 different Swedish IT companies. Results. We found that group interaction techniques, including meetings and workshops, are the most popular type of elicitation techniques that are employed by the practitioners, except in the case of small projects. We noted that customers are frequently involved in the elicitation process, except in the case of market-driven organizations. Technical staff (for example, developers and architects) are more frequently involved in the elicitation process compared to the involvement of business- or strategic staff. Finally, we identified a number of challenges with respect to stakeholders. These challenges include difficulties in understanding and prioritizing their needs. Further, it was noted that requirements instability (i.e., caused by changing needs or priorities) was a predominant challenge. These observations need to be interpreted in the context of the study. Conclusion. The relevant observations regarding the survey participants' experiences should be of interest to the industry; experiences that should be analyzed in the practitioners' context. Researchers may find evidence for the use of academic results in practice, thereby inspiring future theoretical work, as well as further empirical studies in the same area.

Submitted 23 February, 2021; originally announced February 2021.

Comments: 32 pages, Accepted for publication in the Requirements Engineering Journal, 2021. Cite as Palomares, C., Franch, X., Quer, C. et al. The state-of-practice in requirements elicitation: an extended interview study at 12 companies. Requirements Eng (2021)

arXiv:2102.08485 [pdf, other]

doi 10.1109/TSE.2022.3212166

Improved management of issue dependencies in issue trackers of large collaborative projects

Authors: Mikko Raatikainen, Quim Motger, Clara Marie Lüders, Xavier Franch, Lalli Myllyaho, Elina Kettunen, Jordi Marco, Juha Tiihonen, Mikko Halonen, Tomi Männistö

Abstract: Issue trackers, such as Jira, have become the prevalent collaborative tools in software engineering for managing issues, such as requirements, development tasks, and software bugs. However, issue trackers inherently focus on the lifecycle of single issues, although issues have and express dependencies on other issues that constitute issue dependency networks in large complex collaborative projects. The objective of this study is to develop supportive solutions for the improved management of dependent issues in an issue tracker. This study follows the Design Science methodology, consisting of eliciting drawbacks and constructing and evaluating a solution and system. The study was carried out in the context of The Qt Company's Jira, which exemplifies an actively used, almost two-decade-old issue tracker with over 100,000 issues. The drawbacks capture how users operate with issue trackers to handle issue information in large, collaborative, and long-lived projects. The basis of the solution is to keep issues and dependencies as separate objects and automatically construct an issue graph. Dependency detections complement the issue graph by proposing missing dependencies, while consistency checks and diagnoses identify conflicting issue priorities and release assignments. Jira's plugin and service-based system architecture realize the functional and quality concerns of the system implementation. We show how to adopt the intelligent supporting techniques of an issue tracker in a complex use context and a large data-set. The solution considers an integrated and holistic system view, practical applicability and utility, and the practical characteristics of issue data, such as inherent incompleteness.

Submitted 15 November, 2022; v1 submitted 16 February, 2021; originally announced February 2021.

Comments: Accepted for publication in IEEE Transactions on Software Engineering. Published online 05 October 2022. 21 pages, 3 figures, 8 tables

ACM Class: D.2.1

arXiv:2102.05920 [pdf]

QFL: Data-Driven Feedback Loop to Manage Quality in Agile Development

Authors: Lidia López, Alessandra Bagnato, Antonin Ahbervé, Xavier Franch

Abstract: Background: Quality requirements (QRs) describe desired system qualities, playing an important role in the success of software projects. In the context of agile software development (ASD), where the main objective is the fast delivery of functionalities, QRs are often ill-defined and not well addressed during the development process. Software analytics tools help to control quality through the measurement of quality-related software aspects to support decision-makers in the process of QR management. Aim: The goal of this research is to explore the benefits of integrating a concrete software analytics tool, Q-Rapids Tool, to assess software quality and support QR management processes. Method: In the context of a technology transfer project, the Softeam company has integrated Q-Rapids Tool in their development process. We conducted a series of workshops involving Softeam members working in the Modelio product development. Results: We present the Quality Feedback Loop (QFL) process to be integrated in software development processes to control the complete QR life-cycle, from elicitation to validation. As a result of the implementation of QFL in Softeam, Modelio's team members highlight the benefits of integrating a data analytics tool with their project planning tool and the fact that project managers can control the whole process making the final decisions. Conclusions: Practitioners can benefit from the integration of software analytics tools as part of their software development toolchain to control software quality. The implementation of QFL promotes quality in the organization and the integration of software analytics and project planning tools also improves the communication between teams.

Submitted 11 February, 2021; originally announced February 2021.

Comments: 9 pages, Accepted for publication in IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Society (ICSE-SEIS), IEEE, 2021

arXiv:2011.05106 [pdf, other]

doi 10.1109/TSE.2020.3042747

How do Practitioners Perceive the Relevance of Requirements Engineering Research?

Authors: Xavier Franch, Daniel Mendez, Andreas Vogelsang, Rogardt Heldal, Eric Knauss, Marc Oriol, Guilherme H. Travassos, Jeffrey C. Carver, Thomas Zimmermann

Abstract: The relevance of Requirements Engineering (RE) research to practitioners is vital for a long-term dissemination of research results to everyday practice. Some authors have speculated about a mismatch between research and practice in the RE discipline. However, there is not much evidence to support or refute this perception. This paper presents the results of a study aimed at gathering evidence from practitioners about their perception of the relevance of RE research and at understanding the factors that influence that perception. We conducted a questionnaire-based survey of industry practitioners with expertise in RE. The participants rated the perceived relevance of 435 scientific papers presented at five top RE-related conferences. The 153 participants provided a total of 2,164 ratings. The practitioners rated RE research as essential or worthwhile in a majority of cases. However, the percentage of non-positive ratings is still higher than we would like. Among the factors that affect the perception of relevance are the research's links to industry, the research method used, and respondents' roles. The reasons for positive perceptions were primarily related to the relevance of the problem and the soundness of the solution, while the causes for negative perceptions were more varied. The respondents also provided suggestions for future research, including topics researchers have studied for decades, like elicitation or requirement quality criteria.

Submitted 3 December, 2020; v1 submitted 10 November, 2020; originally announced November 2020.

Comments: Accepted at IEEE Transactions on Software Engineering

Journal ref: Transaction on Software Engineering 2020

arXiv:2003.05434 [pdf, other]

doi 10.1007/978-3-030-75018-3_14

Developing and Operating Artificial Intelligence Models in Trustworthy Autonomous Systems

Authors: Silverio Martínez-Fernández, Xavier Franch, Andreas Jedlitschka, Marc Oriol, Adam Trendowicz

Abstract: Companies dealing with Artificial Intelligence (AI) models in Autonomous Systems (AS) face several problems, such as users' lack of trust in adverse or unknown conditions, gaps between software engineering and AI model development, and operation in a continuously changing operational environment. This work-in-progress paper aims to close the gap between the development and operation of trustworthy AI-based AS by defining an approach that coordinates both activities. We synthesize the main challenges of AI-based AS in industrial settings. We reflect on the research efforts required to overcome these challenges and propose a novel, holistic DevOps approach to put it into practice. We elaborate on four research directions: (a) increased users' trust by monitoring operational AI-based AS and identifying self-adaptation needs in critical situations; (b) integrated agile process for the development and evolution of AI models and AS; (c) continuous deployment of different context-specific instances of AI models in a distributed setting of AS; and (d) holistic DevOps-based lifecycle for AI-based AS.

Submitted 23 April, 2021; v1 submitted 11 March, 2020; originally announced March 2020.

Comments: 9 pages, 1 figure, preprint. Accepted in RCIS 2021

arXiv:2002.02303 [pdf]

doi 10.1016/j.infsof.2019.106225

Management of quality requirements in agile and rapid software development: A systematic mapping study

Authors: Woubshet Behutiye, Pertti Karhapää, Lidia Lopez, Xavier Burgues, Silverio Martinez-Fernandez, Anna Maria Vollmer, Pilar Rodriiguez, Xavier Franch, Markku Oivo

Abstract: Context: Quality requirements (QRs) describe the desired quality of software, and they play an important role in the success of software projects. In agile software development (ASD), QRs are often ill-defined and not well addressed due to the focus on quickly delivering functionality. Rapid software development (RSD) approaches (e.g., continuous delivery and continuous deployment), which shorten delivery times, are more prone to neglect QRs. Despite the significance of QRs in both ASD and RSD, there is limited synthesized knowledge on their management in those approaches. Objective: This study aims to synthesize state-of-the-art knowledge about QR management in ASD and RSD, focusing on three aspects: bibliometric, strategies, and challenges. Research method: Using a systematic mapping study with a snowballing search strategy, we identified and structured the literature on QR management in ASD and RSD. Check the PDF file to see the full abstract and document.

Submitted 6 February, 2020; originally announced February 2020.

Comments: 31 pages, the article is currently in press in Information and Software Technology journal and can be accessed through https://doi.org/10.1016/j.infsof.2019.106225

arXiv:1902.01822 [pdf]

Do We Preach What We Practice? Investigating the Practical Relevance of Requirements Engineering Syllabi - The IREB Case

Authors: Daniel Méndez Fernández, Xavier Franch, Norbert Seyff, Michael Felderer, Martin Glinz, Marcos Kalinowski, Andreas Volgelsang, Stefan Wagner, Stan Bühne, Kim Lauenroth

Abstract: Nowadays, there exist a plethora of different educational syllabi for Requirements Engineering (RE), all aiming at incorporating practically relevant educational units (EUs). Many of these syllabi are based, in one way or the other, on the syllabi provided by the International Requirements Engineering Board (IREB), a non-profit organisation devoted to standardised certification programs for RE. IREB syllabi are developed by RE experts and are, thus, based on the assumption that they address topics of practical relevance. However, little is known about to what extent practitioners actually perceive those contents as useful. We have started a study to investigate the relevance of the EUs included in the IREB Foundation Level certification programme. In a first phase reported in this paper, we have surveyed practitioners mainly from DACH countries (Germany, Austria and Switzerland) participating in the IREB certification. Later phases will widen the scope both by including other countries and by not requiring IREB-certified participants. The results shall foster a critical reflection on the practical relevance of EUs built upon the de-facto standard syllabus of IREB.

Submitted 10 June, 2019; v1 submitted 1 February, 2019; originally announced February 2019.

Comments: CR version of the manuscript accepted for presentation at the 22nd Ibero-American Conference on Software Engineering

arXiv:1809.00711 [pdf]

doi 10.1016/j.infsof.2018.08.013

Adaptive Monitoring: A Systematic Mapping

Authors: Edith Zavala, Xavier Franch, Jordi Marco

Abstract: Context: Adaptive monitoring is a method used in a variety of domains for responding to changing conditions. It has been applied in different ways, from monitoring systems' customization to re-composition, in different application domains. However, to the best of our knowledge, there are no studies analyzing how adaptive monitoring differs or resembles among the existing approaches. Method: We have conducted a systematic mapping study of adaptive monitoring approaches following recommended practices. We have applied automatic search and snowballing sampling on different sources and used rigorous selection criteria to retrieve the final set of papers. Moreover, we have used an existing qualitative analysis method for extracting relevant data from studies. Finally, we have applied data mining techniques for identifying patterns in the solutions. Conclusions: This cross-domain overview of the current state of the art on adaptive monitoring may be a solid and comprehensive baseline for researchers and practitioners in the field. Especially, it may help in identifying opportunities of research, for instance, the need of proposing generic and flexible software engineering solutions for supporting adaptive monitoring in a variety of systems.

Submitted 3 September, 2018; originally announced September 2018.

Comments: 57 pages, 20 figures, 8 tables, Inf. Softw. Technol., Aug. 2018, pre-print, CC-BY-NC-ND 4.0 license, https://doi.org/10.1016/j.infsof.2018.08.013

arXiv:1808.05376 [pdf, other]

doi 10.1145/3242153.3242159

Towards Automated Data Integration in Software Analytics

Authors: Silverio Martínez-Fernández, Petar Jovanovic, Xavier Franch, Andreas Jedlitschka

Abstract: Software organizations want to be able to base their decisions on the latest set of available data and the real-time analytics derived from them. In order to support "real-time enterprise" for software organizations and provide information transparency for diverse stakeholders, we integrate heterogeneous data sources about software analytics, such as static code analysis, testing results, issue tracking systems, network monitoring systems, etc. To deal with the heterogeneity of the underlying data sources, we follow an ontology-based data integration approach in this paper and define an ontology that captures the semantics of relevant data for software analytics. Furthermore, we focus on the integration of such data sources by proposing two approaches: a static and a dynamic one. We first discuss the current static approach with a predefined set of analytic views representing software quality factors and further envision how this process could be automated in order to dynamically build custom user analysis using a semi-automatic platform for managing the lifecycle of analytics infrastructures.

Submitted 16 August, 2018; originally announced August 2018.

Comments: This is an author's accepted manuscript of a paper to be published by ACM in the 12th International Workshop on Real-Time Business Intelligence and Analytics (BIRTE@VLDB) 2018. The final authenticated version will be available through https://doi.org/10.1145/3242153.3242159

arXiv:1808.02284 [pdf, other]

Needs and Challenges for a Platform to Support Large-scale Requirements Engineering. A Multiple Case Study

Authors: Davide Fucci, Cristina Palomares, Dolors Costal, Xavier Franch, Mikko Raatikainen, Martin Stettinger, Zijad Kurtanovic, Tero Kojo, Lars Koenig, Andreas Falkner, Gottfried Schenner, Fabrizio Brasca, Tomi Männistö, Alexander Felfernig, Walid Maalej

Abstract: Background: Requirement engineering is often considered a critical activity in system development projects. The increasing complexity of software, as well as number and heterogeneity of stakeholders, motivate the development of methods and tools for improving large-scale requirement engineering. Aims: The empirical study presented in this paper aims to identify and understand the characteristics and challenges of a platform, as desired by experts, to support requirement engineering for individual stakeholders, based on the current pain-points of their organizations when dealing with a large number of requirements. Method: We conducted a multiple case study with three companies in different domains. We collected data through ten semi-structured interviews with experts from these companies. Results: The main pain-point for stakeholders is handling the vast amount of data from different sources. The foreseen platform should leverage such data to manage changes in requirements according to customers' and users' preferences. It should also offer stakeholders an estimation of how long a requirements engineering task will take to complete, along with an easier requirements dependency identification and requirements reuse strategy. Conclusions: The findings provide empirical evidence about how practitioners wish to improve their requirement engineering processes and tools. The insights are a starting point for in-depth investigations into the problems and solutions presented. Practitioners can use the results to improve existing or design new practices and tools.

Submitted 6 September, 2018; v1 submitted 7 August, 2018; originally announced August 2018.

Comments: Accepted for publication to the 12th International Symposium on Empirical Software Engineering and Measurement (ESEM18)

arXiv:1804.03416 [pdf, other]

doi 10.1145/3193965.3193970

Protocol and Tools for Conducting Agile Software Engineering Research in an Industrial-Academic Setting: A Preliminary Study

Authors: Katarzyna Biesialska, Xavier Franch, Victor Muntés-Mulero

Abstract: Conducting empirical research in software engineering industry is a process, and as such, it should be generalizable. The aim of this paper is to discuss how academic researchers may address some of the challenges they encounter during conducting empirical research in the software industry by means of a systematic and structured approach. The protocol developed in this paper should serve as a practical guide for researchers and help them with conducting empirical research in this complex environment.

Submitted 10 April, 2018; originally announced April 2018.

Comments: Accepted to CESI 2018 - International Workshop on Conducting Empirical Studies in Industry (in conjunction with ICSE 2018)

Journal ref: 2018 IEEE/ACM 6th International Workshop on Conducting Empirical Studies in Industry (CESI), Gothenburg, 2018, pp. 29-32

arXiv:1803.01896 [pdf]

doi 10.1016/j.eswa.2018.01.009

SACRE: Supporting contextual requirements' adaptation in modern self-adaptive systems in the presence of uncertainty at runtime

Authors: Edith Zavala, Xavier Franch, Jordi Marco, Alessia Knauss, Daniela Damian

Abstract: Runtime uncertainty such as unpredictable resource unavailability, changing environmental conditions and user needs, as well as system intrusions or faults, represents one of the main current challenges of self-adaptive systems. Moreover, today's systems are increasingly complex, distributed, decentralized, etc. and therefore have to reason about and cope with more and more unpredictable events. Approaches to deal with such changing requirements in today's complex systems are still missing. This work presents SACRE (Smart Adaptation through Contextual REquirements), our approach leveraging an adaptation feedback loop to detect self-adaptive systems' contextual requirements affected by uncertainty and to integrate machine learning techniques to determine the best operationalization of context based on sensed data at runtime. SACRE is a step forward from our former approach ACon, whose focus had been on adapting the context in contextual requirements, as well as their basic implementation. SACRE primarily focuses on architectural decisions, addressing self-adaptive systems' engineering challenges. Furthering the work on ACon, in this paper, we perform an evaluation of the entire approach in different uncertainty scenarios in real-time in the extremely demanding domain of smart vehicles. The real-time evaluation is conducted in a simulated environment in which the smart vehicle is implemented through software components. The evaluation results provide empirical evidence about the applicability of SACRE in real and complex software system domains.

Submitted 5 March, 2018; originally announced March 2018.

Comments: 45 pages, journal article, 14 figures, 9 tables, CC-BY-NC-ND 4.0 license

Journal ref: Expert Systems with Applications, Volume 98, 2018, Pages 166-188, ISSN 0957-4174, (http://www.sciencedirect.com/science/article/pii/S0957417418300095)

arXiv:1711.08894 [pdf]

doi 10.1007/978-3-319-69926-4_41

Non-functional Requirements Documentation in Agile Software Development: Challenges and Solution Proposal

Authors: Woubshet Behutiye, Pertti Karhapää, Dolors Costal, Markku Oivo, Xavier Franch

Abstract: Non-functional requirements (NFRs) are determinant for the success of software projects. However, they are characterized as hard to define, and in agile software development (ASD), are often given less priority and usually not documented. In this paper, we present the findings of the documentation practices and challenges of NFRs in companies utilizing ASD and propose guidelines for enhancing NFRs documentation in ASD. We interviewed practitioners from four companies and identified that epics, features, user stories, acceptance criteria, Definition of Done (DoD), product and sprint backlogs are used for documenting NFRs. Please refer to the manuscript for the full abstract.

Submitted 24 November, 2017; originally announced November 2017.

Comments: 8 pages, Authors post print version, the original work is published in Product-Focused Software Process Improvement, PROFES 2017

arXiv:1705.06013 [pdf]

How do Practitioners Perceive the Relevance of Requirements Engineering Research? An Ongoing Study

Authors: X. Franch, D. Méndez Fernández, M. Oriol, A. Vogelsang, R. Heldal, E. Knauss, G. Horta Travassos, J. C. Carver, O. Dieste, T. Zimmermann

Abstract: The relevance of Requirements Engineering (RE) research to practitioners is a prerequisite for problem-driven research in the area and key for a long-term dissemination of research results to everyday practice. To better understand how industry practitioners perceive the practical relevance of RE research, we have initiated the RE-Pract project, an international collaboration conducting an empirical study. This project opts for a replication of previous work done in two different domains and relies on survey research. To this end, we have designed a survey to be sent to several hundred industry practitioners at various companies around the world and ask them to rate their perceived practical relevance of the research described in a sample of 418 RE papers published between 2010 and 2015 at the RE, ICSE, FSE, ESEC/FSE, ESEM and REFSQ conferences. In this paper, we summarise our research protocol and present the current status of our study and the planned future steps.

Submitted 14 June, 2017; v1 submitted 17 May, 2017; originally announced May 2017.

Comments: Accepted for the 25th International Requirements Engineering Conference, 2017

arXiv:1605.07767 [pdf, other]

iStar 2.0 Language Guide

Authors: Fabiano Dalpiaz, Xavier Franch, Jennifer Horkoff

Abstract: The i* modeling language was introduced to fill the gap in the spectrum of conceptual modeling languages, focusing on the intentional (why?), social (who?), and strategic (how? how else?) dimensions. i* has been applied in many areas, e.g., healthcare, security analysis, eCommerce. Although i* has seen much academic application, the diversity of extensions and variations can make it difficult for novices to learn and use it in a consistent way. This document introduces the iStar 2.0 core language, evolving the basic concepts of i* into a consistent and clear set of core concepts, upon which to build future work and to base goal-oriented teaching materials. This document was built from a set of discussions and input from various members of the i* community. It is our intention to revisit, update and expand the document after collecting examples and concrete experiences with iStar 2.0.

Submitted 16 June, 2016; v1 submitted 25 May, 2016; originally announced May 2016.

ACM Class: D.2.1

arXiv:1301.4600 [pdf]

Requirements Management for Service Providers: the Case of Services for Citizens

Authors: Xavier Franch

Abstract: Take the Internet of Things, a piece of cloud computing, a handful of smart cities, don't forget social platforms, flavour it with mobile technologies and ever-changing environments, shake it up and... voila! What a wonderful service! Oops! Wait a minute, where did my requirements go?

Submitted 19 January, 2013; originally announced January 2013.

arXiv:1206.5166 [pdf, ps, other]

Linking Quality Attributes and Constraints with Architectural Decisions

Authors: David Ameller, Xavier Franch

Abstract: Quality attributes and constraints are among the main drivers of architectural decision making. The quality attributes are improved or damaged by the architectural decisions, while restrictions directly include or exclude parts of the architecture (for example, the logical components or technologies). We can determine the impact of a decision of architecture in software quality, or which parts of the architecture are affected by a constraint, but the difficult problem is whether we are respecting the quality requirements (requirements on quality attributes) and constraints with all the architectural decisions made. Currently, the common practice is that architects use their own experience to design architectures that meet the quality requirements and restrictions, but at the end, especially for the crucial decisions, the architect has to deal with complex trade-offs between quality attributes and juggle possible incompatibilities raised by the constraints. In this paper we present Quark, a computer-aided method to support architects in software architecture decision making.

Submitted 22 June, 2012; originally announced June 2012.

arXiv:1110.5574 [pdf]

WeSSQoS: A Configurable SOA System for Quality-aware Web Service Selection

Authors: Oscar Cabrera, Marc Oriol, Xavier Franch, Lidia López, Jordi Marco, Olivia Fragoso, René Santaolaya

Abstract: Web Services (WS) have become one of the most used technologies nowadays in software systems. Among the challenges when integrating WS in a given system, requirements-driven selection occupies a prominent place. A comprehensive selection process needs to check compliance of Non-Functional Requirements (NFR), which can be assessed by analysing WS Quality of Service (QoS). In this paper, we describe the WeSSQoS system that aims at ranking available WS based on the comparison of their QoS and the stated NFRs. WeSSQoS is designed as an open service-oriented architecture that hosts a configurable portfolio of normalization and ranking algorithms that can be selected by the engineer when starting a selection process. WS' QoS can be obtained either from a static, WSDL-like description, or computed dynamically through monitoring techniques. WeSSQoS is designed to work over multiple WS repositories and QoS sources. The impact of having a portfolio of different normalization and ranking algorithms is illustrated with an example.

Submitted 25 October, 2011; originally announced October 2011.
