Search | arXiv e-print repository

doi 10.1109/SANER53432.2022.00048

An Empirical Investigation into the Reproduction of Bug Reports for Android Apps

Authors: Jack Johnson, Junayed Mahmud, Tyler Wendland, Kevin Moran, Julia Rubin, Mattia Fazzini

Abstract: One of the key tasks related to ensuring mobile app quality is the reporting, management, and resolution of bug reports. As such, researchers have committed considerable resources toward automating various tasks of the bug management process for mobile apps, such as reproduction and triaging. However, the success of these automated approaches is largely dictated by the characteristics and properti… ▽ More One of the key tasks related to ensuring mobile app quality is the reporting, management, and resolution of bug reports. As such, researchers have committed considerable resources toward automating various tasks of the bug management process for mobile apps, such as reproduction and triaging. However, the success of these automated approaches is largely dictated by the characteristics and properties of the bug reports they operate upon. As such, understanding mobile app bug reports is imperative to drive the continued advancement of report management techniques. While prior studies have examined high-level statistics of large sets of reports, we currently lack an in-depth investigation of how the information typically reported in mobile app issue trackers relates to the specific details generally required to reproduce the underlying failures. In this paper, we perform an in-depth analysis of 180 reproducible bug reports systematically mined from Android apps on GitHub and investigate how the information contained in the reports relates to the task of reproducing the described bugs. In our analysis, we focus on three pieces of information: the environment needed to reproduce the bug report, the steps to reproduce (S2Rs), and the observed behavior. Focusing on this information, we characterize failure types, identify the modality used to report the information, and characterize the quality of the information within the reports. We find that bugs are reported in a multi-modal fashion, the environment is not always provided, and S2Rs often contain missing or non-specific enough information. These findings carry with them important implications on automated bug reproduction techniques as well as automated bug report management approaches more generally. △ Less

Submitted 3 January, 2023; originally announced January 2023.

Comments: Published in the Proceedings of the 29th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER'22), Honolulu, Hawaii, March 15-18, 2022, pp. 321-332

arXiv:2205.15561 [pdf, other]

doi 10.1145/3533767.3534407

Automatically Detecting API-induced Compatibility Issues in Android Apps: A Comparative Analysis (Replicability Study)

Authors: Pei Liu, Yanjie Zhao, Haipeng Cai, Mattia Fazzini, John Grundy, Li Li

Abstract: Fragmentation is a serious problem in the Android ecosystem. This problem is mainly caused by the fast evolution of the system itself and the various customizations independently maintained by different smartphone manufacturers. Many efforts have attempted to mitigate its impact via approaches to automatically pinpoint compatibility issues in Android apps. Unfortunately, at this stage, it is still… ▽ More Fragmentation is a serious problem in the Android ecosystem. This problem is mainly caused by the fast evolution of the system itself and the various customizations independently maintained by different smartphone manufacturers. Many efforts have attempted to mitigate its impact via approaches to automatically pinpoint compatibility issues in Android apps. Unfortunately, at this stage, it is still unknown if this objective has been fulfilled, and the existing approaches can indeed be replicated and reliably leveraged to pinpoint compatibility issues in the wild. We, therefore, propose to fill this gap by first conducting a literature review within this topic to identify all the available approaches. Among the nine identified approaches, we then try our best to reproduce them based on their original datasets. After that, we go one step further to empirically compare those approaches against common datasets with real-world apps containing compatibility issues. Experimental results show that existing tools can indeed be reproduced, but their capabilities are quite distinct, as confirmed by the fact that there is only a small overlap of the results reported by the selected tools. This evidence suggests that more efforts should be spent by our community to achieve sound compatibility issues detection. △ Less

Submitted 31 May, 2022; originally announced May 2022.

Report number: ISSTA '22 Proceedings of the 31st ACM SIGSOFT International Symposium on Software Testing and Analysis

arXiv:2205.15546 [pdf, other]

Identifying and Characterizing Silently-Evolved Methods in the Android API

Authors: Pei Liu, Li Li, Yichun Yan, Mattia Fazzini, John Grundy

Abstract: With over 500,000 commits and more than 700 contributors, the Android platform is undoubtedly one of the largest industrial-scale software projects. This project provides the Android API, and developers heavily rely on this API to develop their Android apps. Unfortunately, because the Android platform and its API evolve at an extremely rapid pace, app developers need to continually monitor API cha… ▽ More With over 500,000 commits and more than 700 contributors, the Android platform is undoubtedly one of the largest industrial-scale software projects. This project provides the Android API, and developers heavily rely on this API to develop their Android apps. Unfortunately, because the Android platform and its API evolve at an extremely rapid pace, app developers need to continually monitor API changes to avoid compatibility issues in their apps (\ie issues that prevent apps from working as expected when running on newer versions of the API). Despite a large number of studies on compatibility issues in the Android API, the research community has not yet investigated issues related to silently-evolved methods (SEMs). These methods are functions whose behavior might have changed but the corresponding documentation did not change accordingly. Because app developers rely on the provided documentation to evolve their apps, changes to methods that are not suitably documented may lead to unexpected failures in the apps using these methods. To shed light on this type of issue, we conducted a large-scale empirical study in which we identified and characterized SEMs across ten versions of the Android API. In the study, we identified SEMs, characterized the nature of the changes, and analyzed the impact of SEMs on a set of 1,000 real-world Android apps. Our experimental results show that SEMs do exist in the Android API, and that 957 of the apps we considered use at least one SEM. Based on these results, we argue that the Android platform developers should take actions to avoid introducing SEMs, especially those involving semantic changes. This situation highlights the need for automated techniques and tools to help Android practitioners in this task. △ Less

Submitted 31 May, 2022; originally announced May 2022.

Report number: 2021 IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP)

arXiv:2205.15535 [pdf, other]

doi 10.1145/3524842.3527963

Do Customized Android Frameworks Keep Pace with Android?

Authors: Pei Liu, Mattia Fazzini, John Grundy, Li Li

Abstract: To satisfy varying customer needs, device vendors and OS providers often rely on the open-source nature of the Android OS and offer customized versions of the Android OS. When a new version of the Android OS is released, device vendors and OS providers need to merge the changes from the Android OS into their customizations to account for its bug fixes, security patches, and new features. Because d… ▽ More To satisfy varying customer needs, device vendors and OS providers often rely on the open-source nature of the Android OS and offer customized versions of the Android OS. When a new version of the Android OS is released, device vendors and OS providers need to merge the changes from the Android OS into their customizations to account for its bug fixes, security patches, and new features. Because developers of customized OSs might have made changes to code locations that were also modified by the developers of the Android OS, the merge task can be characterized by conflicts, which can be time-consuming and error-prone to resolve. To provide more insight into this critical aspect of the Android ecosystem, we present an empirical study that investigates how eight open-source customizations of the Android OS merge the changes from the Android OS into their projects. The study analyzes how often the developers from the customized OSs merge changes from the Android OS, how often the developers experience textual merge conflicts, and the characteristics of these conflicts. Furthermore, to analyze the effect of the conflicts, the study also analyzes how the conflicts can affect a randomly selected sample of 1,000 apps. After analyzing 1,148 merge operations, we identified that developers perform these operations for 9.7\% of the released versions of the Android OS, developers will encounter at least one conflict in 41.3\% of the merge operations, 58.1\% of the conflicts required developers to change the customized OSs, and 64.4\% of the apps considered use at least one method affected by a conflict. In addition to detailing our results, the paper also discusses the implications of our findings and provides insights for researchers and practitioners working with Android and its customizations. △ Less

Submitted 31 May, 2022; originally announced May 2022.

Report number: MSR '22: Proceedings of the 19th International Conference on Mining Software Repositories

arXiv:2203.12093 [pdf, other]

Enhancing Mobile App Bug Reporting via Real-time Understanding of Reproduction Steps

Authors: Mattia Fazzini, Kevin Moran, Carlos Bernal Cardenas, Tyler Wendland, Alessandro Orso, Denys Poshyvanyk

Abstract: One of the primary mechanisms by which developers receive feedback about in-field failures of software from users is through bug reports. Unfortunately, the quality of manually written bug reports can vary widely due to the effort required to include essential pieces of information, such as detailed reproduction steps (S2Rs). Despite the difficulty faced by reporters, few existing bug reporting sy… ▽ More One of the primary mechanisms by which developers receive feedback about in-field failures of software from users is through bug reports. Unfortunately, the quality of manually written bug reports can vary widely due to the effort required to include essential pieces of information, such as detailed reproduction steps (S2Rs). Despite the difficulty faced by reporters, few existing bug reporting systems attempt to offer automated assistance to users in crafting easily readable, and conveniently reproducible bug reports. To address the need for proactive bug reporting systems that actively aid the user in capturing crucial information, we introduce a novel bug reporting approach called EBug. EBug assists reporters in writing S2Rs for mobile applications by analyzing natural language information entered by reporters in real-time, and linking this data to information extracted via a combination of static and dynamic program analyses. As reporters write S2Rs, EBug is capable of automatically suggesting potential future steps using predictive models trained on realistic app usages. To evaluate EBug, we performed two user studies based on 20 failures from $11$ real-world apps. The empirical studies involved ten participants that submitted ten bug reports each and ten developers that reproduced the submitted bug reports. In the studies, we found that reporters were able to construct bug reports 31 faster with EBug as compared to the state-of-the-art bug reporting system used as a baseline. EBug's reports were also more reproducible with respect to the ones generated with the baseline. Furthermore, we compared EBug's prediction models to other predictive modeling approaches and found that, overall, the predictive models of our approach outperformed the baseline approaches. Our results are promising and demonstrate the potential benefits provided by proactively assistive bug reporting systems. △ Less

Submitted 22 March, 2022; originally announced March 2022.

arXiv:2106.08403 [pdf, other]

doi 10.1109/MSR52588.2021.00082

AndroR2: A Dataset of Manually Reproduced Bug Reports for Android Applications

Authors: Tyler Wendland, **gyang Sun, Junayed Mahmud, S. M. Hasan Mansur, Steven Huang, Kevin Moran, Julia Rubin, Mattia Fazzini

Abstract: Software maintenance constitutes a large portion of the software development lifecycle. To carry out maintenance tasks, developers often need to understand and reproduce bug reports. As such, there has been increasing research activity coalescing around the notion of automating various activities related to bug reporting. A sizable portion of this research interest has focused on the domain of mob… ▽ More Software maintenance constitutes a large portion of the software development lifecycle. To carry out maintenance tasks, developers often need to understand and reproduce bug reports. As such, there has been increasing research activity coalescing around the notion of automating various activities related to bug reporting. A sizable portion of this research interest has focused on the domain of mobile apps. However, as research around mobile app bug reporting progresses, there is a clear need for a manually vetted and reproducible set of real-world bug reports that can serve as a benchmark for future work. This paper presents ANDROR2: a dataset of 90 manually reproduced bug reports for Android apps listed on Google Play and hosted on GitHub, systematically collected via an in-depth analysis of 459 reports extracted from the GitHub issue tracker. For each reproduced report, ANDROR2 includes the original bug report, an apk file for the buggy version of the app, an executable reproduction script, and metadata regarding the quality of the reproduction steps associated with the original report. We believe that the ANDROR2 dataset can be used to facilitate research in automatically analyzing, understanding, reproducing, localizing, and fixing bugs for mobile applications as well as other software maintenance activities more broadly. △ Less

Submitted 15 June, 2021; originally announced June 2021.

Comments: 5 pages, Accepted to the 2021 International Conference on Mining Software Repositories, Data Showcase Track; Links to Datasets: https://doi.org/10.5281/zenodo.4646313; https://github.com/SageSELab/AndroR2

arXiv:1608.03624 [pdf, other]

From Manual Android Tests to Automated and Platform Independent Test Scripts

Authors: Mattia Fazzini, Eduardo Noronha de A. Freitas, Shauvik Roy Choudhary, Alessandro Orso

Abstract: Because Mobile apps are extremely popular and often mission critical nowadays, companies invest a great deal of resources in testing the apps they provide to their customers. Testing is particularly important for Android apps, which must run on a multitude of devices and operating system versions. Unfortunately, as we confirmed in many interviews with quality assurance professionals, app testing i… ▽ More Because Mobile apps are extremely popular and often mission critical nowadays, companies invest a great deal of resources in testing the apps they provide to their customers. Testing is particularly important for Android apps, which must run on a multitude of devices and operating system versions. Unfortunately, as we confirmed in many interviews with quality assurance professionals, app testing is today a very human intensive, and therefore tedious and error prone, activity. To address this problem, and better support testing of Android apps, we propose a new technique that allows testers to easily create platform independent test scripts for an app and automatically run the generated test scripts on multiple devices and operating system versions. The technique does so without modifying the app under test or the runtime system, by (1) intercepting the interactions of the tester with the app and (2) providing the tester with an intuitive way to specify expected results that it then encode as test oracles. We implemented our technique in a tool named Barista and used the tool to evaluate the practical usefulness and applicability of our approach. Our results show that Barista can faithfully encode user defined test cases as test scripts with built-in oracles, generates test scripts that can run on multiple platforms, and can outperform a state-of-the-art tool with similar functionality. Barista and our experimental infrastructure are publicly available. △ Less

Submitted 11 August, 2016; originally announced August 2016.

Showing 1–7 of 7 results for author: Fazzini, M