
Automated categorization of pre-trained models for software engineering: A case study with a Hugging Face dataset

Authors: Claudio Di Sipio, Riccardo Rubei, Juri Di Rocco, Davide Di Ruscio, Phuong T. Nguyen

Abstract: Software engineering (SE) activities have been revolutionized by the advent of pre-trained models (PTMs), defined as large machine learning (ML) models that can be fine-tuned to perform specific SE tasks. However, users with limited expertise may need help to select the appropriate model for their current task. To tackle the issue, the Hugging Face (HF) platform simplifies the use of PTMs by collecting, storing, and curating several models. Nevertheless, the platform currently lacks a comprehensive categorization of PTMs designed specifically for SE, i.e., the existing tags are more suited to generic ML categories. This paper introduces an approach to address this gap by enabling the automatic classification of PTMs for SE tasks. First, we utilize a public dump of HF to extract PTMs information, including model documentation and associated tags. Then, we employ a semi-automated method to identify SE tasks and their corresponding PTMs from existing literature. The approach involves creating an initial mapping between HF tags and specific SE tasks, using a similarity-based strategy to identify PTMs with relevant tags. The evaluation shows that model cards are informative enough to classify PTMs considering the pipeline tag. Moreover, we provide a mapping between SE tasks and stored PTMs by relying on model names.
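The tag-to-task mapping step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measure is used, so the token-level Jaccard similarity, the `threshold` value, and the function names `jaccard` and `map_tags_to_tasks` below are all assumptions made for illustration, not the authors' method.

```python
def _tokens(label):
    """Split a tag or task name into lowercase word tokens (hyphens act as spaces)."""
    return set(label.lower().replace("-", " ").replace("_", " ").split())


def jaccard(a, b):
    """Token-level Jaccard similarity between two labels (assumed measure)."""
    ta, tb = _tokens(a), _tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def map_tags_to_tasks(tags, tasks, threshold=0.5):
    """Map each tag to its most similar SE task, if similarity reaches the threshold.

    `threshold` is a hypothetical cut-off chosen only for this example.
    """
    if not tasks:
        return {}
    mapping = {}
    for tag in tags:
        best = max(tasks, key=lambda task: jaccard(tag, task))
        if jaccard(tag, best) >= threshold:
            mapping[tag] = best
    return mapping


if __name__ == "__main__":
    hf_tags = ["text-generation", "code-generation", "image-classification"]
    se_tasks = ["code generation", "code summarization"]
    print(map_tags_to_tasks(hf_tags, se_tasks))
```

Tags with no sufficiently similar task are simply left unmapped, which mirrors the idea of an initial mapping that a later step can refine.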

Submitted 21 May, 2024; originally announced May 2024.

Comments: Accepted at The International Conference on Evaluation and Assessment in Software Engineering (EASE), 2024 edition

arXiv:2309.02985

Supporting Early-Safety Analysis of IoT Systems by Exploiting Testing Techniques

Authors: Diego Clerissi, Juri Di Rocco, Davide Di Ruscio, Claudio Di Sipio, Felicien Ihirwe, Leonardo Mariani, Daniela Micucci, Maria Teresa Rossi, Riccardo Rubei

Abstract: IoT systems' complexity and susceptibility to failures pose significant challenges in ensuring their reliable operation. Failures can be internally generated or caused by external factors, impacting both the system's correctness and its surrounding environment. To investigate these complexities, various modeling approaches have been proposed to raise the level of abstraction, facilitating automation and analysis. Failure-Logic Analysis (FLA) is a technique that helps predict potential failure scenarios by defining how a component's failure logic behaves and spreads throughout the system. However, manually specifying FLA rules can be arduous and error-prone, leading to incomplete or inaccurate specifications. In this paper, we propose adopting testing methodologies to improve the completeness and correctness of these rules. How failures may propagate within an IoT system can be observed by systematically injecting failures while running test cases to collect evidence useful to add, complete, and refine FLA rules.

Submitted 6 September, 2023; originally announced September 2023.

arXiv:2308.13808

ResyDuo: Combining data models and CF-based recommender systems to develop Arduino projects

Authors: Juri Di Rocco, Claudio Di Sipio

Abstract: While specifying an IoT-based system, software developers have to face a set of challenges, spanning from selecting the hardware components to writing the actual source code. Even though dedicated development environments are in place, a nonexpert user might struggle with the over-choice problem in selecting the proper component. By combining MDE and recommender systems, this paper proposes an initial prototype, called ResyDuo, to assist Arduino developers by providing two different artifacts, i.e., hardware components and software libraries. In particular, we make use of a widely adopted collaborative filtering algorithm by collecting relevant information by means of a dedicated data model. ResyDuo can retrieve hardware components by using tags or existing Arduino projects stored on the ProjectHub repository. Then, the system can retrieve corresponding software libraries based on the identified hardware devices. ResyDuo is equipped with a web-based interface that allows users to easily select and configure the Arduino project under development. To assess ResyDuo's performance, we run ten-fold cross-validation, adopting a grid search strategy to optimize the hyperparameters of the CF-based algorithm. The conducted evaluation shows encouraging results even though there is still room for improvement in terms of the examined metrics.

Submitted 26 August, 2023; originally announced August 2023.

arXiv:2307.09381

Is this Snippet Written by ChatGPT? An Empirical Study with a CodeBERT-Based Classifier

Authors: Phuong T. Nguyen, Juri Di Rocco, Claudio Di Sipio, Riccardo Rubei, Davide Di Ruscio, Massimiliano Di Penta

Abstract: Since its launch in November 2022, ChatGPT has gained popularity among users, especially programmers who use it as a tool to solve development problems. However, while offering a practical solution to programming problems, ChatGPT should be mainly used as a supporting tool (e.g., in software education) rather than as a replacement for the human being. Thus, detecting automatically generated source code by ChatGPT is necessary, and tools for identifying AI-generated content may need to be adapted to work effectively with source code. This paper presents an empirical study to investigate the feasibility of automated identification of AI-generated code snippets, and the factors that influence this ability. To this end, we propose a novel approach called GPTSniffer, which builds on top of CodeBERT to detect source code written by AI. The results show that GPTSniffer can accurately classify whether code is human-written or AI-generated, and outperforms two baselines, GPTZero and OpenAI Text Classifier. Also, the study shows how similar training data or a classification context with paired snippets helps to boost classification performance.

Submitted 7 August, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

arXiv:2304.10409

Dealing with Popularity Bias in Recommender Systems for Third-party Libraries: How far Are We?

Authors: Phuong T. Nguyen, Riccardo Rubei, Juri Di Rocco, Claudio Di Sipio, Davide Di Ruscio, Massimiliano Di Penta

Abstract: Recommender systems for software engineering (RSSEs) assist software engineers in dealing with a growing information overload when discerning alternative development solutions. While RSSEs are becoming more and more effective in suggesting handy recommendations, they tend to suffer from popularity bias, i.e., favoring items that are relevant mainly because several developers are using them. While this rewards artifacts that are likely more reliable and well-documented, it would also mean that missing artifacts are rarely used because they are very specific or more recent. This paper studies popularity bias in Third-Party Library (TPL) RSSEs. First, we investigate whether state-of-the-art research in RSSEs has already tackled the issue of popularity bias. Then, we quantitatively assess four existing TPL RSSEs, exploring their capability to deal with the recommendation of popular items. Finally, we propose a mechanism to defuse popularity bias in the recommendation list. The empirical study reveals that the issue of dealing with popularity in TPL RSSEs has not received adequate attention from the software engineering community. Among the surveyed work, only one starts investigating the issue, albeit getting a low prediction performance.

Submitted 20 April, 2023; originally announced April 2023.

Comments: 13 pages, to appear in the 20th Mining Software Repository Proceedings

arXiv:2301.12202

doi 10.1109/ICSA-C57050.2023.00063

A customizable approach to assess software quality through Multi-Criteria Decision Making

Authors: Francesco Basciani, Daniele Di Pompeo, Juri Di Rocco, Alfonso Pierantonio

Abstract: Over the years, Software Quality Engineering has attracted increasing interest, as demonstrated by the significant research papers published in this area. Determining when a software artifact is qualitatively valid is tricky, given the impossibility of providing an objective definition valid for any perspective, context, or stakeholder. Many quality model solutions have been proposed that reference specific quality attributes in this context. However, these approaches do not consider the context in which the artifacts will operate or the perspective of the stakeholders who evaluate their validity. Furthermore, these solutions suffer from the limitations of being artifact-specific and not extensible. In this paper, we provide a generic and extensible mechanism that makes it possible to aggregate and prioritize quality attributes. The user, taking into account his perspective and the context in which the software artifact will operate, is guided in defining all the criteria for his quality model. The management of these criteria is then facilitated through Multi-Criteria Decision Making (MCDM). In addition, we present the PRETTEF model, a concrete instance of the proposed approach for assessing and selecting MVC frameworks.

Submitted 28 January, 2023; originally announced January 2023.

Comments: 8 pages -- 3rd International Workshop on Model-Driven Engineering for Software Architecture (MDE4SA 2023)

Journal ref: 20th International Conference on Software Architecture, ICSA 2023 - Companion

arXiv:2205.09379

doi 10.1002/spe.3238

GitRanking: A Ranking of GitHub Topics for Software Classification using Active Sampling

Authors: Cezar Sas, Andrea Capiluppi, Claudio Di Sipio, Juri Di Rocco, Davide Di Ruscio

Abstract: GitHub is the world's largest host of source code, with more than 150M repositories. However, most of these repositories are not labeled or inadequately so, making it harder for users to find relevant projects. There have been various proposals for software application domain classification over the past years. However, these approaches lack a well-defined taxonomy that is hierarchical, grounded in a knowledge base, and free of irrelevant terms. This work proposes GitRanking, a framework for creating a classification ranked into discrete levels based on how general or specific their meaning is. We collected 121K topics from GitHub and considered $60\%$ of the most frequent ones for the ranking. GitRanking 1) uses active sampling to ensure a minimal number of required annotations; and 2) links each topic to Wikidata, reducing ambiguities and improving the reusability of the taxonomy. Our results show that developers, when annotating their projects, avoid using terms with a high degree of specificity. This makes the finding and discovery of their projects more challenging for other users. Furthermore, we show that GitRanking can effectively rank terms according to their general or specific meaning. This ranking would be an essential asset for developers to build upon, allowing them to complement their annotations with more precise topics. Finally, we show that GitRanking is a dynamically extensible method: it can currently accept further terms to be ranked with a minimum number of annotations ($\sim$ 15). This paper is the first collective attempt to build a ground-up taxonomy of software domains.

Submitted 19 May, 2022; originally announced May 2022.

Comments: 11 pages, 6 figures, 3 tables

arXiv:2203.06068

MemoRec: A Recommender System for Assisting Modelers in Specifying Metamodels

Authors: Juri Di Rocco, Davide Di Ruscio, Claudio Di Sipio, Phuong T. Nguyen, Alfonso Pierantonio

Abstract: Model Driven Engineering (MDE) has been widely applied in software development, aiming to facilitate the coordination among various stakeholders. Such a methodology allows for a more efficient and effective development process. Nevertheless, modeling is a strenuous activity that requires proper knowledge of components, attributes, and logic to reach the level of abstraction required by the application domain. In particular, metamodels play an important role in several paradigms, and specifying wrong entities or attributes in metamodels can negatively impact the quality of the produced artifacts as well as other elements of the whole process. During the metamodeling phase, modelers can benefit from assistance to avoid mistakes, e.g., getting recommendations like meta-classes and structural features relevant to the metamodel being defined. However, suitable machinery is needed to mine data from repositories of existing modeling artifacts and compute recommendations. In this work, we propose MemoRec, a novel approach that makes use of a collaborative filtering strategy to recommend valuable entities related to the metamodel under construction. Our approach can provide suggestions related to both metaclasses and structured features that should be added in the metamodel under definition. We assess the quality of the work with respect to different metrics, i.e., success rate, precision, and recall. The results demonstrate that MemoRec is capable of suggesting relevant items given a partial metamodel and supporting modelers in their task.

Submitted 11 March, 2022; originally announced March 2022.

Comments: Accepted for publication at the International Journal on Software and Systems Modeling (SoSyM)

arXiv:2201.08201

Providing Upgrade Plans for Third-party Libraries: A Recommender System using Migration Graphs

Authors: Riccardo Rubei, Davide Di Ruscio, Claudio Di Sipio, Juri Di Rocco, Phuong T. Nguyen

Abstract: During the development of a software project, developers often need to upgrade third-party libraries (TPLs), aiming to keep their code up-to-date with the newest functionalities offered by the used libraries. In most cases, upgrading used TPLs is a complex and error-prone activity that must be carefully carried out to limit the ripple effects on the software project that depends on the libraries being upgraded. In this paper, we propose EvoPlan as a novel approach to the recommendation of different upgrade plans given a pair of library-version as input. In particular, among the different paths that can be possibly followed to upgrade the current library version to the desired updated one, EvoPlan is able to suggest the plan that can potentially minimize the efforts being needed to migrate the code of the clients from the library's current release to the target one. The approach has been evaluated on a curated dataset using conventional metrics used in Information Retrieval, i.e., precision, recall, and F-measure. The experimental results show that EvoPlan obtains an encouraging prediction performance considering two different criteria in the plan specification, i.e., the popularity of migration paths and the number of open and closed issues in GitHub for those projects that have already followed the recommended migration paths.
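The Information Retrieval metrics named in this abstract have standard definitions, which the following minimal sketch computes for a single recommendation list. The function name `precision_recall_f1` and the example version-step labels are hypothetical, and the abstract does not state how the authors aggregate these scores across queries, so this is only an illustration of the metrics themselves.

```python
def precision_recall_f1(recommended, relevant):
    """Compute precision, recall, and F-measure (F1) for one recommendation list.

    precision = |recommended ∩ relevant| / |recommended|
    recall    = |recommended ∩ relevant| / |relevant|
    F1        = harmonic mean of precision and recall
    """
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical upgrade steps: four recommended, three in the ground truth.
    recommended_steps = ["1.0->1.1", "1.1->2.0", "2.0->2.1", "2.1->3.0"]
    ground_truth_steps = ["1.0->1.1", "1.1->2.0", "2.0->3.0"]
    print(precision_recall_f1(recommended_steps, ground_truth_steps))
```

Empty inputs return zero rather than raising, which is a design choice of this sketch and not something the abstract specifies.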

Submitted 20 January, 2022; originally announced January 2022.

arXiv:2111.14453

Enhancing syntax expressiveness in domain-specific modelling

Authors: Damiano Di Vicenzo, Juri Di Rocco, Davide Di Ruscio, Alfonso Pierantonio

Abstract: Domain-specific modelling helps tame the complexity of today's application domains by formalizing concepts and their relationships in modelling languages. While meta-editors are widely-used frameworks for implementing graphical editors for such modelling languages, they are best suited for defining novel topological notations, i.e., syntaxes where the model layout does not contribute to the model semantics. In contrast, many engineering fields, e.g., railways systems or electrical engineering, use notations that, on the one hand, are standard and, on the other hand, demand more expressive power than topological syntaxes. In this paper, we discuss the problem of enhancing the expressiveness of modelling editors towards geometric/positional syntaxes. Several potential solutions are experimentally implemented on the jjodel web-based platform with the aim of identifying challenges and opportunities.

Submitted 29 November, 2021; originally announced November 2021.

arXiv:2103.06987

doi 10.1109/TSE.2021.3059907

Development of recommendation systems for software engineering: the CROSSMINER experience

Authors: Juri Di Rocco, Davide Di Ruscio, Claudio Di Sipio, Phuong T. Nguyen, Riccardo Rubei

Abstract: To perform their daily tasks, developers intensively make use of existing resources by consulting open-source software (OSS) repositories. Such platforms contain rich data sources, e.g., code snippets, documentation, and user discussions, that can be useful for supporting development activities. Over the last decades, several techniques and tools have been promoted to provide developers with innovative features, aiming to bring in improvements in terms of development effort, cost savings, and productivity. In the context of the EU H2020 CROSSMINER project, a set of recommendation systems has been conceived to assist software programmers in different phases of the development process. The systems provide developers with various artifacts, such as third-party libraries, documentation about how to use the APIs being adopted, or relevant API function calls. To develop such recommendations, various technical choices have been made to overcome issues related to several aspects including the lack of baselines, limited data availability, decisions about the performance measures, and evaluation approaches. This paper is an experience report to present the knowledge pertinent to the set of recommendation systems developed through the CROSSMINER project. We explain in detail the challenges we had to deal with, together with the related lessons learned when developing and evaluating these systems. Our aim is to provide the research community with concrete takeaway messages that are expected to be useful for those who want to develop or customize their own recommendation systems. The reported experiences can facilitate interesting discussions and research work, which in the end contribute to the advancement of recommendation systems applied to solve different issues in Software Engineering.

Submitted 11 March, 2021; originally announced March 2021.

Comments: 43 pages; 8 figures; Accepted for publication at the Empirical Software Engineering Journal

ACM Class: D.2.3; D.2.13; K.6.3

arXiv:2102.07508

Recommending API Function Calls and Code Snippets to Support Software Development

Authors: Phuong T. Nguyen, Juri Di Rocco, Claudio Di Sipio, Davide Di Ruscio, Massimiliano Di Penta

Abstract: Software development activity has reached a high degree of complexity, guided by the heterogeneity of the components, data sources, and tasks. The proliferation of open-source software (OSS) repositories has stressed the need to reuse available software artifacts efficiently. To this aim, it is necessary to explore approaches to mine data from software repositories and leverage it to produce helpful recommendations. We designed and implemented FOCUS as a novel approach to provide developers with API calls and source code while they are programming. The system works on the basis of a context-aware collaborative filtering technique to extract API usages from OSS projects. In this work, we show the suitability of FOCUS for Android programming by evaluating it on a dataset of 2,600 mobile apps. The empirical evaluation results show that our approach outperforms two state-of-the-art API recommenders, UP-Miner and PAM, in terms of prediction accuracy. We also point out that there is no significant relationship between the categories for apps defined in Google Play and their API usages. Finally, we show that participants of a user study positively perceive the API and source code recommended by FOCUS as relevant to the current development context.

Submitted 15 February, 2021; originally announced February 2021.

Comments: 20 pages, 11 figures, accepted for publication at IEEE Transactions on Software Engineering (TSE)

ACM Class: D.2.3; D.2.13; K.6.3

Showing 1–12 of 12 results for author: Di Rocco, J