Showing 1–33 of 33 results for author: Herbold, S

  1. arXiv:2404.16630  [pdf, other]

    cs.SE cs.AI cs.CY cs.LG

    Legal Aspects for Software Developers Interested in Generative AI Applications

    Authors: Steffen Herbold, Brian Valerius, Anamaria Mojica-Hanke, Isabella Lex, Joel Mittel

    Abstract: Recent successes in Generative Artificial Intelligence (GenAI) have led to new technologies capable of generating high-quality code, natural language, and images. The next step is to integrate GenAI technology into products, a task typically conducted by software developers. Such product development always comes with a certain risk of liability. Within this article, we want to shed light on the cu…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Submission under review

  2. arXiv:2309.12697  [pdf, other]

    cs.CL cs.LG

    Semantic similarity prediction is better than other semantic similarity measures

    Authors: Steffen Herbold

    Abstract: Semantic similarity between natural language texts is typically measured either by looking at the overlap between subsequences (e.g., BLEU) or by using embeddings (e.g., BERTScore, S-BERT). Within this paper, we argue that when we are only interested in measuring the semantic similarity, it is better to directly predict the similarity using a fine-tuned model for such a task. Using a fine-tuned mo…

    Submitted 17 January, 2024; v1 submitted 22 September, 2023; originally announced September 2023.

    Comments: Accepted at TMLR: https://openreview.net/forum?id=bfsNmgN5je

  3. arXiv:2308.12095  [pdf, other]

    cs.SE

    On Using Information Retrieval to Recommend Machine Learning Good Practices for Software Engineers

    Authors: Laura Cabra-Acela, Anamaria Mojica-Hanke, Mario Linares-Vásquez, Steffen Herbold

    Abstract: Machine learning (ML) is nowadays widely used for different purposes and in several disciplines. From self-driving cars to automated medical diagnosis, machine learning models extensively support users' daily activities, and software engineering tasks are no exception. Not embracing good ML practices may lead to pitfalls that hinder the performance of an ML system and potentially lead to unexpecte…

    Submitted 25 August, 2023; v1 submitted 23 August, 2023; originally announced August 2023.

    Comments: Accepted for Publication at ESEC/FSE demonstrations track

  4. arXiv:2304.14276  [pdf, other]

    cs.CL cs.LG

    AI, write an essay for me: A large-scale comparison of human-written versus ChatGPT-generated essays

    Authors: Steffen Herbold, Annette Hautli-Janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch

    Abstract: Background: Recently, ChatGPT and similar generative AI models have attracted hundreds of millions of users and become part of the public discourse. Many believe that such models will disrupt society and will result in a significant change in the education system and information generation in the future. So far, this belief is based on either colloquial evidence or benchmarks from the owners of th…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: Submitted

  5. arXiv:2304.06367  [pdf, other]

    cs.SE

    Understanding issues related to personal data and data protection in open source projects on GitHub

    Authors: Anne Henning, Lukas Schulte, Steffen Herbold, Oksana Kulyk, Peter Mayer

    Abstract: Context: Data protection regulations such as the GDPR and the CCPA affect how software may handle the personal data of its users and how consent for handling of such data may be given. Prior literature focused on how this works in operation, but lacks a perspective of the impact on the software development process. Objective: Within our work, we will address this gap and explore how software dev…

    Submitted 13 April, 2023; originally announced April 2023.

    Comments: Registered Report with Continuity Acceptance (CA) for submission to Empirical Software Engineering granted by RR-Committee of the MSR'23

  6. arXiv:2304.05358  [pdf, ps, other]

    cs.SE

    An exploratory study of bug-introducing changes: what happens when bugs are introduced in open source software?

    Authors: Lukas Schulte, Anamaria Mojica-Hanke, Mario Linares-Vásquez, Steffen Herbold

    Abstract: Context: Many studies consider the relation between individual aspects and bug-introduction, e.g., software testing and code review. Due to the design of the studies the results are usually only about correlations as interactions or interventions are not considered. Objective: Within this study, we want to narrow this gap and provide a broad empirical view on aspects of software development and…

    Submitted 11 April, 2023; originally announced April 2023.

    Comments: Registered Report with Continuity Acceptance (CA) for submission to Empirical Software Engineering granted by RR-Committee of the MSR'23

  7. arXiv:2301.10516  [pdf, other]

    cs.SE cs.LG

    What are the Machine Learning best practices reported by practitioners on Stack Exchange?

    Authors: Anamaria Mojica-Hanke, Andrea Bayona, Mario Linares-Vásquez, Steffen Herbold, Fabio A. González

    Abstract: Machine Learning (ML) is being used in multiple disciplines due to its powerful capability to infer relationships within data. In particular, Software Engineering (SE) is one of those disciplines in which ML has been used for multiple tasks, like software categorization, bugs prediction, and testing. In addition to the multiple ML applications, some studies have been conducted to detect and unders…

    Submitted 25 January, 2023; originally announced January 2023.

  8. arXiv:2209.07623  [pdf, other]

    cs.SE cs.LG

    Studying the explanations for the automated prediction of bug and non-bug issues using LIME and SHAP

    Authors: Benjamin Ledel, Steffen Herbold

    Abstract: Context: The identification of bugs within the reported issues in an issue tracker is crucial for the triage of issues. Machine learning models have shown promising results regarding the performance of automated issue type prediction. However, we have only limited knowledge beyond our assumptions how such models identify bugs. LIME and SHAP are popular techniques to explain the predictions of class…

    Submitted 15 September, 2022; originally announced September 2022.

    Comments: This registered report received an In-Principle Acceptance (IPA) in the ESEM 2022 RR track

  9. arXiv:2207.11976  [pdf, other]

    cs.SE cs.LG

    Differential testing for machine learning: an analysis for classification algorithms beyond deep learning

    Authors: Steffen Herbold, Steffen Tunkel

    Abstract: Context: Differential testing is a useful approach that uses different implementations of the same algorithms and compares the results for software testing. In recent years, this approach was successfully used for test campaigns of deep learning frameworks. Objective: There is little knowledge on the application of differential testing beyond deep learning. Within this article, we want to close…

    Submitted 25 July, 2022; originally announced July 2022.

    Comments: Under review

  10. arXiv:2205.01335  [pdf, ps, other]

    cs.SE cs.CL cs.LG

    Predicting Issue Types with seBERT

    Authors: Alexander Trautsch, Steffen Herbold

    Abstract: Pre-trained transformer models are the current state-of-the-art for natural language processing. seBERT is such a model that was developed based on the BERT architecture, but trained from scratch with software engineering data. We fine-tuned this model for the NLBSE challenge for the task of issue type prediction. Our model dominates the baseline fastText for all three issue types in both…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted for Publication at the NLBSE'22 Tool Competition

  11. arXiv:2111.09188  [pdf, other]

    cs.SE

    Are automated static analysis tools worth it? An investigation into relative warning density and external software quality

    Authors: Alexander Trautsch, Steffen Herbold, Jens Grabowski

    Abstract: Automated Static Analysis Tools (ASATs) are part of software development best practices. ASATs are able to warn developers about potential problems in the code. On the one hand, ASATs are based on best practices so there should be a noticeable effect on software quality. On the other hand, ASATs suffer from false positive warnings, which developers have to inspect and then ignore or mark as invali…

    Submitted 18 November, 2021; v1 submitted 17 November, 2021; originally announced November 2021.

  12. arXiv:2109.11902  [pdf, other]

    cs.SE

    Broccoli: Bug localization with the help of text search engines

    Authors: Benjamin Ledel, Steffen Herbold

    Abstract: Bug localization is a tedious activity in the bug fixing process in which a software developer tries to locate bugs in the source code described in a bug report. Since this process is time-consuming and requires additional knowledge about the software project, information retrieval techniques can aid the bug localization process. In this paper, we investigate if normal text search engines can impr…

    Submitted 10 October, 2021; v1 submitted 24 September, 2021; originally announced September 2021.

  13. arXiv:2109.04738  [pdf, other]

    cs.SE cs.LG

    On the validity of pre-trained transformers for natural language processing in the software engineering domain

    Authors: Julian von der Mosel, Alexander Trautsch, Steffen Herbold

    Abstract: Transformers are the current state-of-the-art of natural language processing in many domains and are gaining traction within software engineering research as well. Such models are pre-trained on large amounts of data, usually from the general domain. However, we only have a limited understanding regarding the validity of transformers within the software engineering domain, i.e., how good such models…

    Submitted 12 May, 2022; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: Review status: submitted

  14. arXiv:2109.03544  [pdf, other]

    cs.SE

    What really changes when developers intend to improve their source code: a commit-level study of static metric value and static analysis warning changes

    Authors: Alexander Trautsch, Johannes Erbel, Steffen Herbold, Jens Grabowski

    Abstract: Many software metrics are designed to measure aspects that are believed to be related to software quality. Static software metrics, e.g., size, complexity and coupling are used in defect prediction research as well as software quality models to evaluate software quality. While this indicates a relationship between quality and software metrics, the extent of it is not well understood. Moreover, rec…

    Submitted 30 May, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

  15. arXiv:2104.02517  [pdf, other]

    cs.SE

    A new perspective on the competent programmer hypothesis through the reproduction of bugs with repeated mutations

    Authors: Zaheed Ahmed, Eike Stein, Steffen Herbold, Fabian Trautsch, Jens Grabowski

    Abstract: The competent programmer hypothesis states that most programmers are competent enough to create correct or almost correct source code. Because this implies that bugs should usually manifest through small variations of the correct code, the competent programmer hypothesis is one of the fundamental assumptions of mutation testing. Unfortunately, it is still unclear if the competent programmer hypoth…

    Submitted 15 May, 2023; v1 submitted 6 April, 2021; originally announced April 2021.

    Comments: Submitted and under review

  16. arXiv:2104.00566  [pdf, other]

    cs.SE

    Exploring the relationship between performance metrics and cost saving potential of defect prediction models

    Authors: Steffen Tunkel, Steffen Herbold

    Abstract: Context: Performance metrics are a core component of the evaluation of any machine learning model and used to compare models and estimate their usefulness. Recent work started to question the validity of many performance metrics for this purpose in the context of software defect prediction. Objective: Within this study, we explore the relationship between performance metrics and the cost saving…

    Submitted 27 July, 2022; v1 submitted 1 April, 2021; originally announced April 2021.

    Comments: Under review

  17. arXiv:2103.00255  [pdf, other]

    cs.SD cs.AI cs.LG eess.AS

    Expert Decision Support System for aeroacoustic source type identification using clustering

    Authors: Armin Goudarzi, Carsten Spehr, Steffen Herbold

    Abstract: This paper presents an Expert Decision Support System for the identification of time-invariant, aeroacoustic source types. The system comprises two steps: first, acoustic properties are calculated based on spectral and spatial information. Second, clustering is performed based on these properties. The clustering aims at helping and guiding an expert for quick identification of different source typ…

    Submitted 18 November, 2021; v1 submitted 27 February, 2021; originally announced March 2021.

    Comments: Preprint for JASA Journal

  18. arXiv:2102.11540  [pdf, other]

    cs.SE

    MSR Mining Challenge: The SmartSHARK Repository Mining Data

    Authors: Alexander Trautsch, Fabian Trautsch, Steffen Herbold

    Abstract: The SmartSHARK repository mining data is a collection of rich and detailed information about the evolution of software projects. The data is unique in its diversity and contains detailed information about each change, issue tracking data, continuous integration data, as well as pull request and code review data. Moreover, the data does not contain only raw data scraped from repositories, but also…

    Submitted 4 August, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

  19. arXiv:2012.09643  [pdf, other]

    cs.SD cs.LG eess.AS

    Automatic source localization and spectra generation from sparse beamforming maps

    Authors: Armin Goudarzi, Carsten Spehr, Steffen Herbold

    Abstract: Beamforming is an imaging tool for the investigation of aeroacoustic phenomena and results in high dimensional data that is broken down to spectra by integrating spatial Regions Of Interest. This paper presents two methods that enable the automated identification of aeroacoustic sources in sparse beamforming maps and the extraction of their corresponding spectra to overcome the manual definition o…

    Submitted 22 July, 2021; v1 submitted 16 December, 2020; originally announced December 2020.

    Comments: Preprint for JASA special issue on machine learning in acoustics, Revision 2

  20. arXiv:2011.06244  [pdf, other]

    cs.SE

    A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits

    Authors: Steffen Herbold, Alexander Trautsch, Benjamin Ledel, Alireza Aghamohammadi, Taher Ahmed Ghaleb, Kuljit Kaur Chahal, Tim Bossenmaier, Bhaveet Nagaria, Philip Makedonski, Matin Nili Ahmadabadi, Kristof Szabados, Helge Spieker, Matej Madeja, Nathaniel Hoy, Valentina Lenarduzzi, Shangwen Wang, Gema Rodríguez-Pérez, Ricardo Colomo-Palacios, Roberto Verdecchia, Paramvir Singh, Yihao Qin, Debasish Chakroborti, Willard Davis, Vijay Walunj, Hongjun Wu, et al. (23 additional authors not shown)

    Abstract: Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Metho…

    Submitted 13 October, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

    Comments: Status: Accepted at Empirical Software Engineering

  21. Smoke Testing for Machine Learning: Simple Tests to Discover Severe Defects

    Authors: Steffen Herbold, Tobias Haar

    Abstract: Machine learning is nowadays a standard technique for data analysis within software applications. Software engineers need quality assurance techniques that are suitable for these new kinds of systems. Within this article, we discuss the question whether standard software testing techniques that have been part of textbooks since decades are also useful for the testing of machine learning software.…

    Submitted 29 October, 2021; v1 submitted 3 September, 2020; originally announced September 2020.

    Comments: Accepted at Empirical Software Engineering, Springer

  22. On the feasibility of automated prediction of bug and non-bug issues

    Authors: Steffen Herbold, Alexander Trautsch, Fabian Trautsch

    Abstract: Context: Issue tracking systems are used to track and describe tasks in the development process, e.g., requested feature improvements or reported bugs. However, past research has shown that the reported issue types often do not match the description of the issue. Objective: We want to understand the overall maturity of the state of the art of issue type prediction with the goal to predict if iss…

    Submitted 8 October, 2021; v1 submitted 11 March, 2020; originally announced March 2020.

  23. arXiv:2001.01972  [pdf, ps, other]

    cs.DL

    With Registered Reports Towards Large Scale Data Curation

    Authors: Steffen Herbold

    Abstract: The scale of manually validated data is currently limited by the effort that small groups of researchers can invest for the curation of such data. Within this paper, we propose the use of registered reports to scale the curation of manually validated data. The idea is inspired by the mechanical turk and replaces monetary payment with authorship of data set publication.

    Submitted 7 January, 2020; originally announced January 2020.

  24. arXiv:2001.01606  [pdf, other]

    cs.SE

    The SmartSHARK Ecosystem for Software Repository Mining

    Authors: Alexander Trautsch, Fabian Trautsch, Steffen Herbold, Benjamin Ledel, Jens Grabowski

    Abstract: Software repository mining is the foundation for many empirical software engineering studies. The collection and analysis of detailed data can be challenging, especially if data shall be shared to enable replicable research and open science practices. SmartSHARK is an ecosystem that supports replicable and reproducible research based on software repository mining.

    Submitted 6 January, 2020; originally announced January 2020.

    Comments: Submitted to ICSE 2020 Demo Track

  25. A Longitudinal Study of Static Analysis Warning Evolution and the Effects of PMD on Software Quality in Apache Open Source Projects

    Authors: Alexander Trautsch, Steffen Herbold, Jens Grabowski

    Abstract: Automated static analysis tools (ASATs) have become a major part of the software development workflow. Acting on the generated warnings, i.e., changing the code indicated in the warning, should be part of, at latest, the code review phase. Despite this being a best practice in software development, there is still a lack of empirical research regarding the usage of ASATs in the wild. In this work,…

    Submitted 27 August, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: preprint

    Journal ref: Empirical Software Engineering 25 (2020) 5137-5192

  26. Problems with SZZ and Features: An empirical study of the state of practice of defect prediction data collection

    Authors: Steffen Herbold, Alexander Trautsch, Fabian Trautsch, Benjamin Ledel

    Abstract: Context: The SZZ algorithm is the de facto standard for labeling bug fixing commits and finding inducing changes for defect prediction data. Recent research uncovered potential problems in different parts of the SZZ algorithm. Most defect prediction data sets provide only static code metrics as features, while research indicates that other features are also important. Objective: We provide an em…

    Submitted 11 November, 2021; v1 submitted 20 November, 2019; originally announced November 2019.

    Comments: Accepted at Empirical Software Engineering, Springer. First three authors are equally contributing

  27. On the costs and profit of software defect prediction

    Authors: Steffen Herbold

    Abstract: Defect prediction can be a powerful tool to guide the use of quality assurance resources. However, while lots of research covered methods for defect prediction as well as methodological aspects of defect prediction research, the actual cost saving potential of defect prediction is still unclear. Within this article, we close this research gap and formulate a cost model for software defect predicti…

    Submitted 11 November, 2019; originally announced November 2019.

    Comments: Under Review (minor revision)

  28. arXiv:1902.07499  [pdf, other]

    cs.SE

    A systematic mapping study of developer social network research

    Authors: Steffen Herbold, Aynur Amirfallah, Fabian Trautsch, Jens Grabowski

    Abstract: Developer social networks (DSNs) are a tool for the analysis of community structures and collaborations between developers in software projects and software ecosystems. Within this paper, we present the results of a systematic mapping study on the use of DSNs in software engineering research. We identified 255 primary studies on DSNs. We mapped the primary studies to research directions, collected…

    Submitted 21 August, 2020; v1 submitted 20 February, 2019; originally announced February 2019.

    Comments: Accepted at the Journal of Systems and Software

  29. arXiv:1812.09746  [pdf, other]

    cs.LG cs.SE stat.ML

    A Multi-Objective Anytime Rule Mining System to Ease Iterative Feedback from Domain Experts

    Authors: Tobias Baum, Steffen Herbold, Kurt Schneider

    Abstract: Data extracted from software repositories is used intensively in Software Engineering research, for example, to predict defects in source code. In our research in this area, with data from open source projects as well as an industrial partner, we noticed several shortcomings of conventional data mining approaches for classification problems: (1) Domain experts' acceptance is of critical importance…

    Submitted 23 December, 2018; originally announced December 2018.

  30. arXiv:1812.09510  [pdf, other]

    cs.SE cs.LG

    An Industrial Case Study on Shrinking Code Review Changesets through Remark Prediction

    Authors: Tobias Baum, Steffen Herbold, Kurt Schneider

    Abstract: Change-based code review is used widely in industrial software development. Thus, research on tools that help the reviewer to achieve better review performance can have a high impact. We analyze one possibility to provide cognitive support for the reviewer: Determining the importance of change parts for review, specifically determining which parts of the code change can be left out from the review…

    Submitted 22 December, 2018; originally announced December 2018.

  31. arXiv:1801.04107  [pdf, other]

    cs.SE

    Benchmarking cross-project defect prediction approaches with costs metrics

    Authors: Steffen Herbold

    Abstract: Defect prediction can be a powerful tool to guide the use of quality assurance resources. In recent years, many researchers focused on the problem of Cross-Project Defect Prediction (CPDP), i.e., the creation of prediction models based on training data from other projects. However, only few of the published papers evaluate the cost efficiency of predictions, i.e., if they save costs if they are us…

    Submitted 12 January, 2018; originally announced January 2018.

    Comments: Rejected at ICSE Technical Track, will be presented as a poster and hopefully appear in an extended version in a journal at some point in 2018

  32. arXiv:1707.09281  [pdf, other]

    cs.SE

    Correction of "A Comparative Study to Benchmark Cross-project Defect Prediction Approaches"

    Authors: Steffen Herbold, Alexander Trautsch, Jens Grabowski

    Abstract: Unfortunately, the article "A Comparative Study to Benchmark Cross-project Defect Prediction Approaches" has a problem in the statistical analysis which was pointed out almost immediately after the pre-print of the article appeared online. While the problem does not negate the contribution of the article and all key findings remain the same, it does alter some rankings of approaches used in th…

    Submitted 27 July, 2017; originally announced July 2017.

  33. arXiv:1705.06429  [pdf, other]

    cs.SE

    A systematic mapping study on cross-project defect prediction

    Authors: Steffen Herbold

    Abstract: Cross-Project-Defect Prediction as a sub-topic of defect prediction in general has become a popular topic in research. In this article, we present a systematic mapping study with the focus on CPDP, for which we found 50 publications. We summarize the approaches presented by each publication and discuss the case study setups and results. We discovered a great amount of heterogeneity in the way case…

    Submitted 18 May, 2017; originally announced May 2017.

    Comments: Under Review