-
ASAP-Repair: API-Specific Automated Program Repair Based on API Usage Graphs
Authors:
Sebastian Nielebock,
Paul Blockhaus,
Jacob Krüger,
Frank Ortmeier
Abstract:
Modern software development relies on the reuse of code via Application Programming Interfaces (APIs). Such reuse relieves developers from learning and developing established algorithms and data structures anew, enabling them to focus on their problem at hand. However, there is also the risk of misusing an API due to a lack of understanding or proper documentation. While many techniques target API misuse detection, only limited efforts have been put into automatically repairing API misuses. In this paper, we present our advances on our technique API-Specific Automated Program Repair (ASAP-Repair). ASAP-Repair is intended to fix API misuses based on API Usage Graphs (AUGs) by leveraging API usage templates of state-of-the-art API misuse detectors. We demonstrate that ASAP-Repair is in principle applicable on an established API misuse dataset. Moreover, we discuss next steps and challenges to evolve ASAP-Repair towards a full-fledged Automatic Program Repair (APR) technique.
Submitted 12 February, 2024;
originally announced February 2024.
-
Towards Transformer-based Homogenization of Satellite Imagery for Landsat-8 and Sentinel-2
Authors:
Venkatesh Thirugnana Sambandham,
Konstantin Kirchheim,
Sayan Mukhopadhaya,
Frank Ortmeier
Abstract:
Landsat-8 (NASA) and Sentinel-2 (ESA) are two prominent multi-spectral imaging satellite projects that provide publicly available data. The multi-spectral imaging sensors of the satellites capture images of the earth's surface in the visible and infrared region of the electromagnetic spectrum. Since the majority of the earth's surface is constantly covered with clouds, which are not transparent at these wavelengths, many images do not provide much information. To increase the temporal availability of cloud-free images of a certain area, one can combine the observations from multiple sources. However, the sensors of satellites might differ in their properties, making the images incompatible. This work provides a first glance at the possibility of using a transformer-based model to reduce the spectral and spatial differences between observations from both satellite projects. We compare the results to a model based on a fully convolutional UNet architecture. Somewhat surprisingly, we find that, while deep models outperform classical approaches, the UNet significantly outperforms the transformer in our experiments.
Submitted 14 October, 2022;
originally announced October 2022.
-
Automated Change Rule Inference for Distance-Based API Misuse Detection
Authors:
Sebastian Nielebock,
Paul Blockhaus,
Jacob Krüger,
Frank Ortmeier
Abstract:
Developers build on Application Programming Interfaces (APIs) to reuse existing functionalities of code libraries. Despite the benefits of reusing established libraries (e.g., time savings, high quality), developers may diverge from the API's intended usage, potentially causing bugs or, more specifically, API misuses. Recent research focuses on developing techniques to automatically detect API misuses, but many suffer from a high false-positive rate. In this article, we improve on this situation by proposing ChaRLI (Change RuLe Inference), a technique for automatically inferring change rules from developers' fixes of API misuses based on API Usage Graphs (AUGs). By subsequently applying graph-distance algorithms, we use change rules to discriminate API misuses from correct usages. This allows developers to reuse others' fixes of an API misuse at other code locations in the same or another project. We evaluated the ability of change rules to detect API misuses based on three datasets and found that the best mean relative precision (i.e., for testable usages) ranges from 77.1 % to 96.1 % while the mean recall ranges from 0.007 % to 17.7 % for individual change rules. These results underpin that ChaRLI and our misuse detection are helpful complements to existing API misuse detectors.
Submitted 14 July, 2022;
originally announced July 2022.
-
Addressing Randomness in Evaluation Protocols for Out-of-Distribution Detection
Authors:
Konstantin Kirchheim,
Tim Gonschorek,
Frank Ortmeier
Abstract:
Deep Neural Networks for classification behave unpredictably when confronted with inputs not stemming from the training distribution. This motivates out-of-distribution (OOD) detection mechanisms. The usual lack of prior information on out-of-distribution data renders the performance estimation of detection approaches on unseen data difficult. Several contemporary evaluation protocols are based on open set simulations, which average the performance over up to five synthetic random splits of a dataset into in- and out-of-distribution samples. However, the number of possible splits may be much larger, and the performance of Deep Neural Networks is known to fluctuate significantly depending on different sources of random variation. We empirically demonstrate that current protocols may fail to provide reliable estimates of the expected performance of OOD methods. By casting this evaluation as a random process, we generalize the concept of open set simulations and propose to estimate the performance of OOD methods using a Monte Carlo approach that addresses the randomness.
Submitted 1 March, 2022;
originally announced March 2022.
-
An Experimental Analysis of Graph-Distance Algorithms for Comparing API Usages
Authors:
Sebastian Nielebock,
Paul Blockhaus,
Jacob Krüger,
Frank Ortmeier
Abstract:
Modern software development heavily relies on the reuse of functionalities through Application Programming Interfaces (APIs). However, client developers can have issues identifying the correct usage of a certain API, causing misuses accompanied by software crashes or usability bugs. Therefore, researchers have aimed at identifying API misuses automatically by comparing client code usages to correct API usages. Some techniques rely on certain API-specific graph-based data structures to improve the abstract representation of API usages. Such techniques need to compare graphs, for instance, by computing distance metrics based on the minimal graph edit distance or the largest common subgraphs, whose computations are known to be NP-hard problems. Fortunately, there exist many abstractions for simplifying graph distance computation. However, their applicability for comparing graph representations of API usages has not been analyzed. In this paper, we provide a comparison of different distance algorithms of API-usage graphs regarding correctness and runtime. Particularly, correctness relates to the algorithms' ability to identify similar correct API usages, but also to discriminate similar correct and false usages as well as non-similar usages. For this purpose, we systematically identified a set of eight graph-based distance algorithms and applied them on two datasets of real-world API usages and misuses. Interestingly, our results suggest that existing distance algorithms are not reliable for comparing API usage graphs. To improve on this situation, we identified and discuss the algorithms' issues, based on which we formulate hypotheses to initiate research on overcoming them.
Submitted 27 August, 2021;
originally announced August 2021.
-
AndroidCompass: A Dataset of Android Compatibility Checks in Code Repositories
Authors:
Sebastian Nielebock,
Paul Blockhaus,
Jacob Krüger,
Frank Ortmeier
Abstract:
Many developers and organizations implement apps for Android, the most widely used operating system for mobile devices. Common problems developers face are the various hardware devices, customized Android variants, and frequent updates, forcing them to implement workarounds for the different versions and variants of Android APIs used in practice. In this paper, we contribute the Android Compatibility checkS dataSet (AndroidCompass) that comprises changes to compatibility checks developers use to enforce workarounds for specific Android versions in their apps. We extracted 80,324 changes to compatibility checks from 1,394 apps by analyzing the version histories of 2,399 projects from the F-Droid catalog. With AndroidCompass, we aim to provide data on when and how developers introduced or evolved workarounds to handle Android incompatibilities. We hope that AndroidCompass fosters research to deal with version incompatibilities, address potential design flaws, identify security concerns, and help derive solutions for other developers, among others, helping researchers to develop and evaluate novel techniques, and Android app as well as operating-system developers in engineering their software.
Submitted 17 March, 2021;
originally announced March 2021.
-
Guided Pattern Mining for API Misuse Detection by Change-Based Code Analysis
Authors:
Sebastian Nielebock,
Robert Heumüller,
Kevin Michael Schott,
Frank Ortmeier
Abstract:
Lack of experience, inadequate documentation, and sub-optimal API design frequently cause developers to make mistakes when re-using third-party implementations. Such API misuses can result in unintended behavior, performance losses, or software crashes. Therefore, current research aims to automatically detect such misuses by comparing the way a developer used an API to previously inferred patterns of the correct API usage. While research has made significant progress, these techniques have not yet been adopted in practice. In part, this is due to the lack of a process capable of seamlessly integrating with software development processes. Particularly, existing approaches do not consider how to collect relevant source code samples from which to infer patterns. In fact, an inadequate collection can cause API usage pattern miners to infer irrelevant patterns which leads to false alarms instead of finding true API misuses. In this paper, we target this problem (a) by providing a method that increases the likelihood of finding relevant and true-positive patterns concerning a given set of code changes and agnostic to a concrete static, intra-procedural mining technique and (b) by introducing a concept for just-in-time API misuse detection which analyzes changes at the time of commit. Particularly, we introduce different, lightweight code search and filtering strategies and evaluate them on two real-world API misuse datasets to determine their usefulness in finding relevant intra-procedural API usage patterns. Our main results are (1) commit-based search with subsequent filtering effectively decreases the amount of code to be analyzed, (2) in particular method-level filtering is superior to file-level filtering, (3) project-internal and project-external code search find solutions for different types of misuses and thus are complementary, (4) incorporating prior knowledge of the misused [...]
Submitted 12 July, 2021; v1 submitted 1 August, 2020;
originally announced August 2020.
-
Learning References with Gaussian Processes in Model Predictive Control applied to Robot Assisted Surgery
Authors:
Janine Matschek,
Tim Gonschorek,
Magnus Hanses,
Norbert Elkmann,
Frank Ortmeier,
Rolf Findeisen
Abstract:
One of the key benefits of model predictive control is the capability of controlling a system proactively in the sense of taking the future system evolution into account. However, often external disturbances or references are not a priori known, which renders the predictive controllers shortsighted or uninformed. Adaptive prediction models can be used to overcome this issue and provide predictions of these signals to the controller. In this work we propose to learn references via Gaussian processes for model predictive controllers. To illustrate the approach, we consider robot assisted surgery, where a robotic manipulator needs to follow a learned reference position based on optical tracking measurements.
Submitted 25 November, 2019;
originally announced November 2019.
-
Probabilistic Model-Based Safety Analysis
Authors:
Matthias Güdemann,
Frank Ortmeier
Abstract:
Model-based safety analysis approaches aim at finding critical failure combinations by analysis of models of the whole system (i.e. software, hardware, failure modes and environment). The advantage of these methods compared to traditional approaches is that the analysis of the whole system gives more precise results. Only few model-based approaches have been applied to answer quantitative questions in safety analysis, often limited to analysis of specific failure propagation models, limited types of failure modes or without system dynamics and behavior, as direct quantitative analysis uses large amounts of computing resources. New achievements in the domain of (probabilistic) model-checking now allow for overcoming this problem.
This paper shows how functional models based on synchronous parallel semantics, which can be used for system design, implementation and qualitative safety analysis, can be directly re-used for (model-based) quantitative safety analysis. Accurate modeling of different types of probabilistic failure occurrence is shown as well as accurate interpretation of the results of the analysis. This allows for reliable and expressive assessment of the safety of a system in early design stages.
Submitted 25 June, 2010;
originally announced June 2010.