Showing 1–12 of 12 results for author: Trautsch, A

  1. arXiv:2304.14276  [pdf, other]

    cs.CL cs.LG

    AI, write an essay for me: A large-scale comparison of human-written versus ChatGPT-generated essays

    Authors: Steffen Herbold, Annette Hautli-Janisz, Ute Heuer, Zlata Kikteva, Alexander Trautsch

    Abstract: Background: Recently, ChatGPT and similar generative AI models have attracted hundreds of millions of users and become part of the public discourse. Many believe that such models will disrupt society and will result in a significant change in the education system and information generation in the future. So far, this belief is based on either colloquial evidence or benchmarks from the owners of th…

    Submitted 24 April, 2023; originally announced April 2023.

    Comments: Submitted

  2. arXiv:2205.01335  [pdf, ps, other]

    cs.SE cs.CL cs.LG

    Predicting Issue Types with seBERT

    Authors: Alexander Trautsch, Steffen Herbold

    Abstract: Pre-trained transformer models are the current state-of-the-art for natural language processing. seBERT is such a model that was developed based on the BERT architecture, but trained from scratch with software engineering data. We fine-tuned this model for the NLBSE challenge for the task of issue type prediction. Our model dominates the baseline fastText for all three issue types in both…

    Submitted 3 May, 2022; originally announced May 2022.

    Comments: Accepted for Publication at the NLBSE'22 Tool Competition

  3. arXiv:2111.09188  [pdf, other]

    cs.SE

    Are automated static analysis tools worth it? An investigation into relative warning density and external software quality

    Authors: Alexander Trautsch, Steffen Herbold, Jens Grabowski

    Abstract: Automated Static Analysis Tools (ASATs) are part of software development best practices. ASATs are able to warn developers about potential problems in the code. On the one hand, ASATs are based on best practices so there should be a noticeable effect on software quality. On the other hand, ASATs suffer from false positive warnings, which developers have to inspect and then ignore or mark as invali…

    Submitted 18 November, 2021; v1 submitted 17 November, 2021; originally announced November 2021.

  4. arXiv:2109.04738  [pdf, other]

    cs.SE cs.LG

    On the validity of pre-trained transformers for natural language processing in the software engineering domain

    Authors: Julian von der Mosel, Alexander Trautsch, Steffen Herbold

    Abstract: Transformers are the current state-of-the-art of natural language processing in many domains and are gaining traction within software engineering research as well. Such models are pre-trained on large amounts of data, usually from the general domain. However, we only have a limited understanding regarding the validity of transformers within the software engineering domain, i.e., how good such models…

    Submitted 12 May, 2022; v1 submitted 10 September, 2021; originally announced September 2021.

    Comments: Review status: submitted

  5. arXiv:2109.03544  [pdf, other]

    cs.SE

    What really changes when developers intend to improve their source code: a commit-level study of static metric value and static analysis warning changes

    Authors: Alexander Trautsch, Johannes Erbel, Steffen Herbold, Jens Grabowski

    Abstract: Many software metrics are designed to measure aspects that are believed to be related to software quality. Static software metrics, e.g., size, complexity and coupling are used in defect prediction research as well as software quality models to evaluate software quality. While this indicates a relationship between quality and software metrics, the extent of it is not well understood. Moreover, rec…

    Submitted 30 May, 2022; v1 submitted 8 September, 2021; originally announced September 2021.

  6. arXiv:2102.11540  [pdf, other]

    cs.SE

    MSR Mining Challenge: The SmartSHARK Repository Mining Data

    Authors: Alexander Trautsch, Fabian Trautsch, Steffen Herbold

    Abstract: The SmartSHARK repository mining data is a collection of rich and detailed information about the evolution of software projects. The data is unique in its diversity and contains detailed information about each change, issue tracking data, continuous integration data, as well as pull request and code review data. Moreover, the data does not contain only raw data scraped from repositories, but also…

    Submitted 4 August, 2021; v1 submitted 23 February, 2021; originally announced February 2021.

  7. arXiv:2011.06244  [pdf, other]

    cs.SE

    A Fine-grained Data Set and Analysis of Tangling in Bug Fixing Commits

    Authors: Steffen Herbold, Alexander Trautsch, Benjamin Ledel, Alireza Aghamohammadi, Taher Ahmed Ghaleb, Kuljit Kaur Chahal, Tim Bossenmaier, Bhaveet Nagaria, Philip Makedonski, Matin Nili Ahmadabadi, Kristof Szabados, Helge Spieker, Matej Madeja, Nathaniel Hoy, Valentina Lenarduzzi, Shangwen Wang, Gema Rodríguez-Pérez, Ricardo Colomo-Palacios, Roberto Verdecchia, Paramvir Singh, Yihao Qin, Debasish Chakroborti, Willard Davis, Vijay Walunj, Hongjun Wu, et al. (23 additional authors not shown)

    Abstract: Context: Tangled commits are changes to software that address multiple concerns at once. For researchers interested in bugs, tangled commits mean that they actually study not only bugs, but also other concerns irrelevant for the study of bugs. Objective: We want to improve our understanding of the prevalence of tangling and the types of changes that are tangled within bug fixing commits. Metho…

    Submitted 13 October, 2021; v1 submitted 12 November, 2020; originally announced November 2020.

    Comments: Status: Accepted at Empirical Software Engineering

  8. On the feasibility of automated prediction of bug and non-bug issues

    Authors: Steffen Herbold, Alexander Trautsch, Fabian Trautsch

    Abstract: Context: Issue tracking systems are used to track and describe tasks in the development process, e.g., requested feature improvements or reported bugs. However, past research has shown that the reported issue types often do not match the description of the issue. Objective: We want to understand the overall maturity of the state of the art of issue type prediction with the goal to predict if iss…

    Submitted 8 October, 2021; v1 submitted 11 March, 2020; originally announced March 2020.

  9. arXiv:2001.01606  [pdf, other]

    cs.SE

    The SmartSHARK Ecosystem for Software Repository Mining

    Authors: Alexander Trautsch, Fabian Trautsch, Steffen Herbold, Benjamin Ledel, Jens Grabowski

    Abstract: Software repository mining is the foundation for many empirical software engineering studies. The collection and analysis of detailed data can be challenging, especially if data shall be shared to enable replicable research and open science practices. SmartSHARK is an ecosystem that supports replicable and reproducible research based on software repository mining.

    Submitted 6 January, 2020; originally announced January 2020.

    Comments: Submitted to ICSE 2020 Demo Track

  10. A Longitudinal Study of Static Analysis Warning Evolution and the Effects of PMD on Software Quality in Apache Open Source Projects

    Authors: Alexander Trautsch, Steffen Herbold, Jens Grabowski

    Abstract: Automated static analysis tools (ASATs) have become a major part of the software development workflow. Acting on the generated warnings, i.e., changing the code indicated in the warning, should be part of, at latest, the code review phase. Despite this being a best practice in software development, there is still a lack of empirical research regarding the usage of ASATs in the wild. In this work,…

    Submitted 27 August, 2020; v1 submitted 2 December, 2019; originally announced December 2019.

    Comments: preprint

    Journal ref: Empirical Software Engineering 25 (2020) 5137-5192

  11. Problems with SZZ and Features: An empirical study of the state of practice of defect prediction data collection

    Authors: Steffen Herbold, Alexander Trautsch, Fabian Trautsch, Benjamin Ledel

    Abstract: Context: The SZZ algorithm is the de facto standard for labeling bug fixing commits and finding inducing changes for defect prediction data. Recent research uncovered potential problems in different parts of the SZZ algorithm. Most defect prediction data sets provide only static code metrics as features, while research indicates that other features are also important. Objective: We provide an em…

    Submitted 11 November, 2021; v1 submitted 20 November, 2019; originally announced November 2019.

    Comments: Accepted at Empirical Software Engineering, Springer. First three authors are equally contributing

  12. arXiv:1707.09281  [pdf, other]

    cs.SE

    Correction of "A Comparative Study to Benchmark Cross-project Defect Prediction Approaches"

    Authors: Steffen Herbold, Alexander Trautsch, Jens Grabowski

    Abstract: Unfortunately, the article "A Comparative Study to Benchmark Cross-project Defect Prediction Approaches" has a problem in the statistical analysis which was pointed out almost immediately after the pre-print of the article appeared online. While the problem does not negate the contribution of the article and all key findings remain the same, it does alter some rankings of approaches used in th…

    Submitted 27 July, 2017; originally announced July 2017.