Search | arXiv e-print repository

doi 10.1109/SANER53432.2022.00069

An Empirical Investigation into the Use of Image Captioning for Automated Software Documentation

Authors: Kevin Moran, Ali Yachnes, George Purnell, Junayed Mahmud, Michele Tufano, Carlos Bernal-Cárdenas, Denys Poshyvanyk, Zach H'Doubler

Abstract: Existing automated techniques for software documentation typically attempt to reason between two main sources of information: code and natural language. However, this reasoning process is often complicated by the lexical gap between more abstract natural language and more structured programming languages. One potential bridge for this gap is the Graphical User Interface (GUI), as GUIs inherently e… ▽ More Existing automated techniques for software documentation typically attempt to reason between two main sources of information: code and natural language. However, this reasoning process is often complicated by the lexical gap between more abstract natural language and more structured programming languages. One potential bridge for this gap is the Graphical User Interface (GUI), as GUIs inherently encode salient information about underlying program functionality into rich, pixel-based data representations. This paper offers one of the first comprehensive empirical investigations into the connection between GUIs and functional, natural language descriptions of software. First, we collect, analyze, and open source a large dataset of functional GUI descriptions consisting of 45,998 descriptions for 10,204 screenshots from popular Android applications. The descriptions were obtained from human labelers and underwent several quality control mechanisms. To gain insight into the representational potential of GUIs, we investigate the ability of four Neural Image Captioning models to predict natural language descriptions of varying granularity when provided a screenshot as input. We evaluate these models quantitatively, using common machine translation metrics, and qualitatively through a large-scale user study. Finally, we offer learned lessons and a discussion of the potential shown by multimodal models to enhance future techniques for automated software documentation. △ Less

Submitted 3 January, 2023; originally announced January 2023.

Comments: Published in the Proceedings of the 29th IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER'22), Honolulu, Hawaii, March 15-18, 2022, pp. 514-525

arXiv:2301.01191 [pdf, other]

Translating Video Recordings of Complex Mobile App UI Gestures into Replayable Scenarios

Authors: Carlos Bernal-Cárdenas, Nathan Cooper, Madeleine Havranek, Kevin Moran, Oscar Chaparro, Denys Poshyvanyk, Andrian Marcus

Abstract: Screen recordings of mobile applications are easy to obtain and capture a wealth of information pertinent to software developers (e.g., bugs or feature requests), making them a popular mechanism for crowdsourced app feedback. Thus, these videos are becoming a common artifact that developers must manage. In light of unique mobile development constraints, including swift release cycles and rapidly e… ▽ More Screen recordings of mobile applications are easy to obtain and capture a wealth of information pertinent to software developers (e.g., bugs or feature requests), making them a popular mechanism for crowdsourced app feedback. Thus, these videos are becoming a common artifact that developers must manage. In light of unique mobile development constraints, including swift release cycles and rapidly evolving platforms, automated techniques for analyzing all types of rich software artifacts provide benefit to mobile developers. Unfortunately, automatically analyzing screen recordings presents serious challenges, due to their graphical nature, compared to other types of (textual) artifacts. To address these challenges, this paper introduces V2S+, an automated approach for translating video recordings of Android app usages into replayable scenarios. V2S+ is based primarily on computer vision techniques and adapts recent solutions for object detection and image classification to detect and classify user gestures captured in a video, and convert these into a replayable test scenario. Given that V2S+ takes a computer vision-based approach, it is applicable to both hybrid and native Android applications. We performed an extensive evaluation of V2S+ involving 243 videos depicting 4,028 GUI-based actions collected from users exercising features and reproducing bugs from a collection of over 90 popular native and hybrid Android apps. Our results illustrate that V2S+ can accurately replay scenarios from screen recordings, and is capable of reproducing $\approx$ 90.2% of sequential actions recorded in native application scenarios on physical devices, and $\approx$ 83% of sequential actions recorded in hybrid application scenarios on emulators, both with low overhead. A case study with three industrial partners illustrates the potential usefulness of V2S+ from the viewpoint of developers. △ Less

Submitted 3 January, 2023; originally announced January 2023.

Comments: Accepted to IEEE Transactions on Software Engineering. arXiv admin note: substantial text overlap with arXiv:2005.09057

arXiv:2103.04531 [pdf, other]

V2S: A Tool for Translating Video Recordings of Mobile App Usages into Replayable Scenarios

Authors: Madeleine Havranek, Carlos Bernal-Cárdenas, Nathan Cooper, Oscar Chaparro, Denys Poshyvnayk, Kevin Moran

Abstract: Screen recordings are becoming increasingly important as rich software artifacts that inform mobile application development processes. However, the amount of manual effort required to extract information from these graphical artifacts can hinder resource-constrained mobile developers. This paper presents Video2Scenario (V2S), an automated tool that processes video recordings of Android app usages,… ▽ More Screen recordings are becoming increasingly important as rich software artifacts that inform mobile application development processes. However, the amount of manual effort required to extract information from these graphical artifacts can hinder resource-constrained mobile developers. This paper presents Video2Scenario (V2S), an automated tool that processes video recordings of Android app usages, utilizes neural object detection and image classification techniques to classify the depicted user actions, and translates these actions into a replayable scenario. We conducted a comprehensive evaluation to demonstrate V2S's ability to reproduce recorded scenarios across a range of devices and a diverse set of usage cases and applications. The results indicate that, based on its performance with 175 videos depicting 3,534 GUI-based actions, V2S is accurate in reproducing $\approx$89\% of actions from collected videos. △ Less

Submitted 7 March, 2021; originally announced March 2021.

arXiv:2101.09194 [pdf, other]

It Takes Two to Tango: Combining Visual and Textual Information for Detecting Duplicate Video-Based Bug Reports

Authors: Nathan Cooper, Carlos Bernal-Cárdenas, Oscar Chaparro, Kevin Moran, Denys Poshyvanyk

Abstract: When a bug manifests in a user-facing application, it is likely to be exposed through the graphical user interface (GUI). Given the importance of visual information to the process of identifying and understanding such bugs, users are increasingly making use of screenshots and screen-recordings as a means to report issues to developers. However, when such information is reported en masse, such as d… ▽ More When a bug manifests in a user-facing application, it is likely to be exposed through the graphical user interface (GUI). Given the importance of visual information to the process of identifying and understanding such bugs, users are increasingly making use of screenshots and screen-recordings as a means to report issues to developers. However, when such information is reported en masse, such as during crowd-sourced testing, managing these artifacts can be a time-consuming process. As the reporting of screen-recordings in particular becomes more popular, developers are likely to face challenges related to manually identifying videos that depict duplicate bugs. Due to their graphical nature, screen-recordings present challenges for automated analysis that preclude the use of current duplicate bug report detection techniques. To overcome these challenges and aid developers in this task, this paper presents Tango, a duplicate detection technique that operates purely on video-based bug reports by leveraging both visual and textual information. Tango combines tailored computer vision techniques, optical character recognition, and text retrieval. We evaluated multiple configurations of Tango in a comprehensive empirical evaluation on 4,860 duplicate detection tasks that involved a total of 180 screen-recordings from six Android apps. Additionally, we conducted a user study investigating the effort required for developers to manually detect duplicate video-based bug reports and compared this to the effort required to use Tango. The results reveal that Tango's optimal configuration is highly effective at detecting duplicate video-based bug reports, accurately ranking target duplicate videos in the top-2 returned results in 83% of the tasks. Additionally, our user study shows that, on average, Tango can reduce developer effort by over 60%, illustrating its practicality. △ Less

Submitted 5 February, 2021; v1 submitted 22 January, 2021; originally announced January 2021.

Comments: 13 pages and 1 figure. Published at ICSE'21

arXiv:2005.09057 [pdf, other]

doi 10.1145/3377811.3380328

Translating Video Recordings of Mobile App Usages into Replayable Scenarios

Authors: Carlos Bernal-Cárdenas, Nathan Cooper, Kevin Moran, Oscar Chaparro, Andrian Marcus, Denys Poshyvanyk

Abstract: Screen recordings of mobile applications are easy to obtain and capture a wealth of information pertinent to software developers (e.g., bugs or feature requests), making them a popular mechanism for crowdsourced app feedback. Thus, these videos are becoming a common artifact that developers must manage. In light of unique mobile development constraints, including swift release cycles and rapidly e… ▽ More Screen recordings of mobile applications are easy to obtain and capture a wealth of information pertinent to software developers (e.g., bugs or feature requests), making them a popular mechanism for crowdsourced app feedback. Thus, these videos are becoming a common artifact that developers must manage. In light of unique mobile development constraints, including swift release cycles and rapidly evolving platforms, automated techniques for analyzing all types of rich software artifacts provide benefit to mobile developers. Unfortunately, automatically analyzing screen recordings presents serious challenges, due to their graphical nature, compared to other types of (textual) artifacts. To address these challenges, this paper introduces V2S, a lightweight, automated approach for translating video recordings of Android app usages into replayable scenarios. V2S is based primarily on computer vision techniques and adapts recent solutions for object detection and image classification to detect and classify user actions captured in a video, and convert these into a replayable test scenario. We performed an extensive evaluation of V2S involving 175 videos depicting 3,534 GUI-based actions collected from users exercising features and reproducing bugs from over 80 popular Android apps. Our results illustrate that V2S can accurately replay scenarios from screen recordings, and is capable of reproducing $\approx$ 89% of our collected videos with minimal overhead. A case study with three industrial partners illustrates the potential usefulness of V2S from the viewpoint of developers. △ Less

Submitted 18 May, 2020; originally announced May 2020.

Comments: In proceedings of the 42nd International Conference on Software Engineering (ICSE'20), 13 pages

arXiv:2005.09046 [pdf, other]

doi 10.1145/3377811.3380418

Improving the Effectiveness of Traceability Link Recovery using Hierarchical Bayesian Networks

Authors: Kevin Moran, David N. Palacio, Carlos Bernal-Cárdenas, Daniel McCrystal, Denys Poshyvanyk, Chris Shenefiel, Jeff Johnson

Abstract: Traceability is a fundamental component of the modern software development process that helps to ensure properly functioning, secure programs. Due to the high cost of manually establishing trace links, researchers have developed automated approaches that draw relationships between pairs of textual software artifacts using similarity measures. However, the effectiveness of such techniques are often… ▽ More Traceability is a fundamental component of the modern software development process that helps to ensure properly functioning, secure programs. Due to the high cost of manually establishing trace links, researchers have developed automated approaches that draw relationships between pairs of textual software artifacts using similarity measures. However, the effectiveness of such techniques are often limited as they only utilize a single measure of artifact similarity and cannot simultaneously model (implicit and explicit) relationships across groups of diverse development artifacts. In this paper, we illustrate how these limitations can be overcome through the use of a tailored probabilistic model. To this end, we design and implement a HierarchiCal PrObabilistic Model for SoftwarE Traceability (Comet) that is able to infer candidate trace links. Comet is capable of modeling relationships between artifacts by combining the complementary observational prowess of multiple measures of textual similarity. Additionally, our model can holistically incorporate information from a diverse set of sources, including developer feedback and transitive (often implicit) relationships among groups of software artifacts, to improve inference accuracy. We conduct a comprehensive empirical evaluation of Comet that illustrates an improvement over a set of optimally configured baselines of $\approx$14% in the best case and $\approx$5% across all subjects in terms of average precision. The comparative effectiveness of Comet in practice, where optimal configuration is typically not possible, is likely to be higher. Finally, we illustrate Comets potential for practical applicability in a survey with developers from Cisco Systems who used a prototype Comet Jenkins plugin. △ Less

Submitted 11 April, 2022; v1 submitted 18 May, 2020; originally announced May 2020.

Comments: Accepted in the Proceedings of the 42nd International Conference on Software Engineering (ICSE'20), 13 pages

arXiv:1908.00614 [pdf, other]

Learning to Identify Security-Related Issues Using Convolutional Neural Networks

Authors: David N. Palacio, Daniel McCrystal, Kevin Moran, Carlos Bernal-Cárdenas, Denys Poshyvanyk, Chris Shenefiel

Abstract: Software security is becoming a high priority for both large companies and start-ups alike due to the increasing potential for harm that vulnerabilities and breaches carry with them. However, attaining robust security assurance while delivering features requires a precarious balancing act in the context of agile development practices. One path forward to help aid development teams in securing thei… ▽ More Software security is becoming a high priority for both large companies and start-ups alike due to the increasing potential for harm that vulnerabilities and breaches carry with them. However, attaining robust security assurance while delivering features requires a precarious balancing act in the context of agile development practices. One path forward to help aid development teams in securing their software products is through the design and development of security-focused automation. Ergo, we present a novel approach, called SecureReqNet, for automatically identifying whether issues in software issue tracking systems describe security-related content. Our approach consists of a two-phase neural net architecture that operates purely on the natural language descriptions of issues. The first phase of our approach learns high dimensional word embeddings from hundreds of thousands of vulnerability descriptions listed in the CVE database and issue descriptions extracted from open source projects. The second phase then utilizes the semantic ontology represented by these embeddings to train a convolutional neural network capable of predicting whether a given issue is security-related. We evaluated SecureReqNet by applying it to identify security-related issues from a dataset of thousands of issues mined from popular projects on GitLab and GitHub. In addition, we also applied our approach to identify security-related requirements from a commercial software project developed by a major telecommunication company. Our preliminary results are encouraging, with SecureReqNet achieving an accuracy of 96% on open source issues and 71.6% on industrial requirements. △ Less

Submitted 5 August, 2019; v1 submitted 1 August, 2019; originally announced August 2019.

Comments: 5 pages, 3 Figures, ICSME 2019 conference

arXiv:1906.07107 [pdf, other]

Assessing the Quality of the Steps to Reproduce in Bug Reports

Authors: Oscar Chaparro, Carlos Bernal-Cardenas, **g Lu, Kevin Moran, Andrian Marcus, Massimiliano Di Penta, Denys Poshyvanyk, Vincent Ng

Abstract: A major problem with user-written bug reports, indicated by developers and documented by researchers, is the (lack of high) quality of the reported steps to reproduce the bugs. Low-quality steps to reproduce lead to excessive manual effort spent on bug triage and resolution. This paper proposes Euler, an approach that automatically identifies and assesses the quality of the steps to reproduce in a… ▽ More A major problem with user-written bug reports, indicated by developers and documented by researchers, is the (lack of high) quality of the reported steps to reproduce the bugs. Low-quality steps to reproduce lead to excessive manual effort spent on bug triage and resolution. This paper proposes Euler, an approach that automatically identifies and assesses the quality of the steps to reproduce in a bug report, providing feedback to the reporters, which they can use to improve the bug report. The feedback provided by Euler was assessed by external evaluators and the results indicate that Euler correctly identified 98% of the existing steps to reproduce and 58% of the missing ones, while 73% of its quality annotations are correct. △ Less

Submitted 17 June, 2019; originally announced June 2019.

Comments: In Proceedings of the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE '19), August 26-30, 2019, Tallinn, Estonia

arXiv:1901.00891 [pdf, other]

Guigle: A GUI Search Engine for Android Apps

Authors: Carlos Bernal-Cardenas, Kevin Moran, Michele Tufano, Zichang Liu, Linyong Nan, Zhehan Shi, Denys Poshyvanyk

Abstract: The process of develo** a mobile application typically starts with the ideation and conceptualization of its user interface. This concept is then translated into a set of mock-ups to help determine how well the user interface embodies the intended features of the app. After the creation of mock-ups developers then translate it into an app that runs in a mobile device. In this paper we propose an… ▽ More The process of develo** a mobile application typically starts with the ideation and conceptualization of its user interface. This concept is then translated into a set of mock-ups to help determine how well the user interface embodies the intended features of the app. After the creation of mock-ups developers then translate it into an app that runs in a mobile device. In this paper we propose an approach, called GUIGLE, that aims to facilitate the process of conceptualizing the user interface of an app through GUI search. GUIGLE indexes GUI images and metadata extracted using automated dynamic analysis on a large corpus of apps extracted from Google Play. To perform a search, our approach uses information from text displayed on a screen, user interface components, the app name, and screen color palettes to retrieve relevant screens given a query. Furthermore, we provide a lightweight query language that allows for intuitive search of screens. We evaluate GUIGLE with real users and found that, on average, 68.8% of returned screens were relevant to the specified query. Additionally, users found the various different features of GUIGLE useful, indicating that our search engine provides an intuitive user experience. Finally, users agree that the information presented by GUIGLE is useful in conceptualizing the design of new screens for applications. △ Less

Submitted 3 January, 2019; originally announced January 2019.

Comments: Accepted to 41st ACM/IEEE International Conference on Software Engineering, Formal Tool Demonstrations Track

arXiv:1802.04749 [pdf, other]

doi 10.1145/3183440.3183492

MDroid+: A Mutation Testing Framework for Android

Authors: Kevin Moran, Michele Tufano, Carlos Bernal-Cárdenas, Mario Linares-Vásquez, Gabriele Bavota, Christopher Vendome, Massimiliano Di Penta, Denys Poshyvanyk

Abstract: Mutation testing has shown great promise in assessing the effectiveness of test suites while exhibiting additional applications to test-case generation, selection, and prioritization. Traditional mutation testing typically utilizes a set of simple language specific source code transformations, called operators, to introduce faults. However, empirical studies have shown that for mutation testing to… ▽ More Mutation testing has shown great promise in assessing the effectiveness of test suites while exhibiting additional applications to test-case generation, selection, and prioritization. Traditional mutation testing typically utilizes a set of simple language specific source code transformations, called operators, to introduce faults. However, empirical studies have shown that for mutation testing to be most effective, these simple operators must be augmented with operators specific to the domain of the software under test. One challenging software domain for the application of mutation testing is that of mobile apps. While mobile devices and accompanying apps have become a mainstay of modern computing, the frameworks and patterns utilized in their development make testing and verification particularly difficult. As a step toward hel** to measure and ensure the effectiveness of mobile testing practices, we introduce MDroid+, an automated framework for mutation testing of Android apps. MDroid+ includes 38 mutation operators from ten empirically derived types of Android faults and has been applied to generate over 8,000 mutants for more than 50 apps. △ Less

Submitted 13 February, 2018; originally announced February 2018.

Comments: 4 Pages, Accepted to the Formal Tool Demonstration Track at the 40th International Conference on Software Engineering (ICSE'18)

arXiv:1802.04732 [pdf, other]

doi 10.1145/3180155.3180246

Automated Reporting of GUI Design Violations for Mobile Apps

Authors: Kevin Moran, Boyang Li, Carlos Bernal-Cárdenas, Dan Jelf, Denys Poshyvanyk

Abstract: The inception of a mobile app often takes form of a mock-up of the Graphical User Interface (GUI), represented as a static image delineating the proper layout and style of GUI widgets that satisfy requirements. Following this initial mock-up, the design artifacts are then handed off to developers whose goal is to accurately implement these GUIs and the desired functionality in code. Given the siza… ▽ More The inception of a mobile app often takes form of a mock-up of the Graphical User Interface (GUI), represented as a static image delineating the proper layout and style of GUI widgets that satisfy requirements. Following this initial mock-up, the design artifacts are then handed off to developers whose goal is to accurately implement these GUIs and the desired functionality in code. Given the sizable abstraction gap between mock-ups and code, developers often introduce mistakes related to the GUI that can negatively impact an app's success in highly competitive marketplaces. Moreover, such mistakes are common in the evolutionary context of rapidly changing apps. This leads to the time-consuming and laborious task of design teams verifying that each screen of an app was implemented according to intended design specifications. This paper introduces a novel, automated approach for verifying whether the GUI of a mobile app was implemented according to its intended design. Our approach resolves GUI-related information from both implemented apps and mock-ups and uses computer vision techniques to identify common errors in the implementations of mobile GUIs. We implemented this approach for Android in a tool called GVT and carried out both a controlled empirical evaluation with open-source apps as well as an industrial evaluation with designers and developers from Huawei. The results show that GVT solves an important, difficult, and highly practical problem with remarkable efficiency and accuracy and is both useful and scalable from the point of view of industrial designers and developers. The tool is currently used by over one-thousand industrial designers and developers at Huawei to improve the quality of their mobile apps. △ Less

Submitted 5 April, 2018; v1 submitted 13 February, 2018; originally announced February 2018.

Comments: 11 pages, Accepted to the 40th International Conference on Software Engineering (ICSE'18)

arXiv:1802.02312 [pdf, other]

Machine Learning-Based Prototy** of Graphical User Interfaces for Mobile Apps

Authors: Kevin Moran, Carlos Bernal-Cárdenas, Michael Curcio, Richard Bonett, Denys Poshyvanyk

Abstract: It is common practice for developers of user-facing software to transform a mock-up of a graphical user interface (GUI) into code. This process takes place both at an application's inception and in an evolutionary context as GUI changes keep pace with evolving features. Unfortunately, this practice is challenging and time-consuming. In this paper, we present an approach that automates this process… ▽ More It is common practice for developers of user-facing software to transform a mock-up of a graphical user interface (GUI) into code. This process takes place both at an application's inception and in an evolutionary context as GUI changes keep pace with evolving features. Unfortunately, this practice is challenging and time-consuming. In this paper, we present an approach that automates this process by enabling accurate prototy** of GUIs via three tasks: detection, classification, and assembly. First, logical components of a GUI are detected from a mock-up artifact using either computer vision techniques or mock-up metadata. Then, software repository mining, automated dynamic analysis, and deep convolutional neural networks are utilized to accurately classify GUI-components into domain-specific types (e.g., toggle-button). Finally, a data-driven, K-nearest-neighbors algorithm generates a suitable hierarchical GUI structure from which a prototype application can be automatically assembled. We implemented this approach for Android in a system called ReDraw. Our evaluation illustrates that ReDraw achieves an average GUI-component classification accuracy of 91% and assembles prototype applications that closely mirror target mock-ups in terms of visual affinity while exhibiting reasonable code structure. Interviews with industrial practitioners illustrate ReDraw's potential to improve real development workflows. △ Less

Submitted 4 June, 2018; v1 submitted 7 February, 2018; originally announced February 2018.

Comments: Accepted to IEEE Transactions on Software Engineering

arXiv:1801.06428 [pdf, other]

doi 10.1109/ICSE-C.2017.16

CrashScope: A Practical Tool for Automated Testing of Android Applications

Authors: Kevin Moran, Mario Linares-Vasquez, Carlos Bernal-Cardenas, Christopher Vendome, Denys Poshyvanyk

Abstract: Unique challenges arise when testing mobile applications due to their prevailing event-driven nature and complex contextual features (e.g. sensors, notifications). Current automated input generation approaches for Android apps are typically not practical for developers to use due to required instrumentation or platform dependence and generally do not effectively exercise contextual features. To be… ▽ More Unique challenges arise when testing mobile applications due to their prevailing event-driven nature and complex contextual features (e.g. sensors, notifications). Current automated input generation approaches for Android apps are typically not practical for developers to use due to required instrumentation or platform dependence and generally do not effectively exercise contextual features. To better support developers in mobile testing tasks, in this demo we present a novel, automated tool called CrashScope. This tool explores a given Android app using systematic input generation, according to several strategies informed by static and dynamic analyses, with the intrinsic goal of triggering crashes. When a crash is detected, CrashScope generates an augmented crash report containing screenshots, detailed crash reproduction steps, the captured exception stack trace, and a fully replayable script that automatically reproduces the crash on a target device(s). Results of preliminary studies show that CrashScope is able to uncover about as many crashes as other state of the art tools, while providing detailed useful crash reports and test scripts to developers. Website: www.crashscope-android.com/crashscope-home Video url: https://youtu.be/ii6S1JF6xDw △ Less

Submitted 17 January, 2018; originally announced January 2018.

Comments: 4 pages, Accepted in the Proceedings of 39th IEEE/ACM International Conference on Software Engineering (ICSE'17). arXiv admin note: text overlap with arXiv:1706.01130

arXiv:1801.06271 [pdf, other]

Mining Android App Usages for Generating Actionable GUI-based Execution Scenarios

Authors: Mario Linares-Vasquez, Martin White, Carlos Bernal-Cardenas, Kevin Moran, Denys Poshyvanyk

Abstract: GUI-based models extracted from Android app execution traces, events, or source code can be extremely useful for challenging tasks such as the generation of scenarios or test cases. However, extracting effective models can be an expensive process. Moreover, existing approaches for automatically deriving GUI-based models are not able to generate scenarios that include events which were not observed… ▽ More GUI-based models extracted from Android app execution traces, events, or source code can be extremely useful for challenging tasks such as the generation of scenarios or test cases. However, extracting effective models can be an expensive process. Moreover, existing approaches for automatically deriving GUI-based models are not able to generate scenarios that include events which were not observed in execution (nor event) traces. In this paper, we address these and other major challenges in our novel hybrid approach, coined as MonkeyLab. Our approach is based on the Record-Mine-Generate-Validate framework, which relies on recording app usages that yield execution (event) traces, mining those event traces and generating execution scenarios using statistical language modeling, static and dynamic analyses, and validating the resulting scenarios using an interactive execution of the app on a real device. The framework aims at mining models capable of generating feasible and fully replayable (i.e., actionable) scenarios reflecting either natural user behavior or uncommon usages (e.g., corner cases) for a given app. We evaluated MONKEYLAB in a case study involving several medium-to-large open-source Android apps. Our results demonstrate that MonkeyLab is able to mine GUI-based models that can be used to generate actionable execution scenarios for both natural and unnatural sequences of events on Google Nexus 7 tablets. △ Less

Submitted 18 January, 2018; originally announced January 2018.

Comments: 12 pages, accepted to the Proceedings of the 12th IEEE Working Conference on Mining Software Repositories (MSR'15)

arXiv:1801.06268 [pdf, other]

How do Developers Test Android Applications?

Authors: Mario Linares Vasquez, Carlos Bernal-Cardenas, Kevin Moran, Denys Poshyvanyk

Abstract: Enabling fully automated testing of mobile applications has recently become an important topic of study for both researchers and practitioners. A plethora of tools and approaches have been proposed to aid mobile developers both by augmenting manual testing practices and by automating various parts of the testing process. However, current approaches for automated testing fall short in convincing de… ▽ More Enabling fully automated testing of mobile applications has recently become an important topic of study for both researchers and practitioners. A plethora of tools and approaches have been proposed to aid mobile developers both by augmenting manual testing practices and by automating various parts of the testing process. However, current approaches for automated testing fall short in convincing developers about their benefits, leading to a majority of mobile testing being performed manually. With the goal of hel** researchers and practitioners - who design approaches supporting mobile testing - to understand developer's needs, we analyzed survey responses from 102 open source contributors to Android projects about their practices when performing testing. The survey focused on questions regarding practices and preferences of developers/testers in-the-wild for (i) designing and generating test cases, (ii) automated testing practices, and (iii) perceptions of quality metrics such as code coverage for determining test quality. Analyzing the information gleaned from this survey, we compile a body of knowledge to help guide researchers and professionals toward tailoring new automated testing approaches to the need of a diverse set of open source developers. △ Less

Submitted 18 January, 2018; originally announced January 2018.

Comments: 11 pages, accepted to the Proceedings of the 33rd IEEE International Conference on Software Maintenance and Evolution (ICSME'17)

arXiv:1801.05940 [pdf, other]

FUSION: A Tool for Facilitating and Augmenting Android Bug Reporting

Authors: Kevin Moran, Mario Linares-Vasquez, Carlos Bernal-Cardenas, Denys Poshyvanyk

Abstract: As the popularity of mobile smart devices continues to climb the complexity of "apps" continues to increase, making the development and maintenance process challenging. Current bug tracking systems lack key features to effectively support construction of reports with actionable information that directly lead to a bug's resolution. In this demo we present the implementation of a novel bug reporting… ▽ More As the popularity of mobile smart devices continues to climb the complexity of "apps" continues to increase, making the development and maintenance process challenging. Current bug tracking systems lack key features to effectively support construction of reports with actionable information that directly lead to a bug's resolution. In this demo we present the implementation of a novel bug reporting system, called Fusion, that facilitates users including reproduction steps in bug reports for mobile apps. Fusion links user-provided information to program artifacts extracted through static and dynamic analysis performed before testing or release. Results of preliminary studies demonstrate that Fusion both effectively facilitates reporting and allows for more reliable reproduction of bugs from reports compared to traditional issue tracking systems by presenting more detailed contextual app information. Tool website: www.fusion-android. com Video url: https://youtu.be/AND9h0ElxRg △ Less

Submitted 18 January, 2018; originally announced January 2018.

Comments: 4 pages, Accepted in the Proceedings of 38thACM/IEEE International Conference on Software Engineering (ICSE'16)

arXiv:1801.05924 [pdf, other]

On-Device Bug Reporting for Android Applications

Authors: Kevin Moran, Richard Bonett, Carlos Bernal-Cardenas, Brendan Otten, Daniel Park, Denys Poshyvanyk

Abstract: Bugs that surface in mobile applications can be difficult to reproduce and fix due to several confounding factors including the highly GUI-driven nature of mobile apps, varying contextual states, differing platform versions and device fragmentation. It is clear that developers need support in the form of automated tools that allow for more precise reporting of application defects in order to facil… ▽ More Bugs that surface in mobile applications can be difficult to reproduce and fix due to several confounding factors including the highly GUI-driven nature of mobile apps, varying contextual states, differing platform versions and device fragmentation. It is clear that developers need support in the form of automated tools that allow for more precise reporting of application defects in order to facilitate more efficient and effective bug fixes. In this paper, we present a tool aimed at supporting application testers and developers in the process of On-Device Bug Reporting. Our tool, called ODBR, leverages the uiautomator framework and low-level event stream capture to offer support for recording and replaying a series of input gesture and sensor events that describe a bug in an Android application. △ Less

Submitted 17 January, 2018; originally announced January 2018.

Comments: 2 pages, Accepted in the Proceedings of 4th IEEE/ACM International Conference on Mobile Software Engineering and Systems (MOBILESoft'17)

arXiv:1707.09038 [pdf, other]

doi 10.1145/3106237.3106275

Enabling Mutation Testing for Android Apps

Authors: Mario Linares-Vásquez, Gabriele Bavota, Michele Tufano, Kevin Moran, Massimiliano Di Penta, Christopher Vendome, Carlos Bernal-Cárdenas, Denys Poshyvanyk

Abstract: Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires "traditional" operators designed for programming languages to be augmented with operators spe… ▽ More Mutation testing has been widely used to assess the fault-detection effectiveness of a test suite, as well as to guide test case generation or prioritization. Empirical studies have shown that, while mutants are generally representative of real faults, an effective application of mutation testing requires "traditional" operators designed for programming languages to be augmented with operators specific to an application domain and/or technology. This paper proposes MDroid+, a framework for effective mutation testing of Android apps. First, we systematically devise a taxonomy of 262 types of Android faults grouped in 14 categories by manually analyzing 2,023 software artifacts from different sources (e.g., bug reports, commits). Then, we identified a set of 38 mutation operators, and implemented an infrastructure to automatically seed mutations in Android apps with 35 of the identified operators. The taxonomy and the proposed operators have been evaluated in terms of stillborn/trivial mutants generated and their capacity to represent real faults in Android apps, as compared to other well know mutation tools. △ Less

Submitted 31 July, 2017; v1 submitted 27 July, 2017; originally announced July 2017.

Comments: Accepted at 11TH Joint Meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 17)

arXiv:1706.01130 [pdf, other]

doi 10.1109/ICST.2016.34

Automatically Discovering, Reporting and Reproducing Android Application Crashes

Authors: Kevin Moran, Mario Linares-Vásquez, Carlos Bernal-Cárdenas, Christopher Vendome, Denys Poshyvanyk

Abstract: Mobile developers face unique challenges when detecting and reporting crashes in apps due to their prevailing GUI event-driven nature and additional sources of inputs (e.g., sensor readings). To support developers in these tasks, we introduce a novel, automated approach called CRASHSCOPE. This tool explores a given Android app using systematic input generation, according to several strategies info… ▽ More Mobile developers face unique challenges when detecting and reporting crashes in apps due to their prevailing GUI event-driven nature and additional sources of inputs (e.g., sensor readings). To support developers in these tasks, we introduce a novel, automated approach called CRASHSCOPE. This tool explores a given Android app using systematic input generation, according to several strategies informed by static and dynamic analyses, with the intrinsic goal of triggering crashes. When a crash is detected, CRASHSCOPE generates an augmented crash report containing screenshots, detailed crash reproduction steps, the captured exception stack trace, and a fully replayable script that automatically reproduces the crash on a target device(s). We evaluated CRASHSCOPE's effectiveness in discovering crashes as compared to five state-of-the-art Android input generation tools on 61 applications. The results demonstrate that CRASHSCOPE performs about as well as current tools for detecting crashes and provides more detailed fault information. Additionally, in a study analyzing eight real-world Android app crashes, we found that CRASHSCOPE's reports are easily readable and allow for reliable reproduction of crashes by presenting more explicit information than human written reports. △ Less

Submitted 4 June, 2017; originally announced June 2017.

Comments: 12 pages, in Proceedings of 9th IEEE International Conference on Software Testing, Verification and Validation (ICST'16), Chicago, IL, April 10-15, 2016, pp. 33-44

Journal ref: IEEE International Conference on Software Testing, Verification and Validation (ICST), Chicago, IL, 2016, pp. 33-44

arXiv:1705.04016 [pdf, other]

doi 10.1145/2786805.2786857

Auto-completing Bug Reports for Android Applications

Authors: Kevin Moran, Mario Linares-Vásquez, Carlos Bernal-Cárdenas, Denys Poshyvanyk

Abstract: The modern software development landscape has seen a shift in focus toward mobile applications as tablets and smartphones near ubiquitous adoption. Due to this trend, the complexity of these apps has been increasing, making development and maintenance challenging. Additionally, current bug tracking systems are not able to effectively support construction of reports with actionable information that… ▽ More The modern software development landscape has seen a shift in focus toward mobile applications as tablets and smartphones near ubiquitous adoption. Due to this trend, the complexity of these apps has been increasing, making development and maintenance challenging. Additionally, current bug tracking systems are not able to effectively support construction of reports with actionable information that directly lead to a bug's resolution. To address the need for an improved reporting system, we introduce a novel solution, called FUSION, that helps users auto complete reproduction steps in bug reports for mobile apps. FUSION links user provided information to program artifacts extracted through static and dynamic analysis performed before testing or release. The approach that FUSION employs is generalizable to other current mobile software platforms, and constitutes a new method by which off device bug reporting can be conducted for mobile software projects. In a study involving 28 participants we applied FUSION to support the maintenance tasks of reporting and reproducing defects from 15 real world bugs found in 14 open source Android apps while qualitatively and qualitatively measuring the user experience of the system. Our results demonstrate that FUSION both effectively facilitates reporting and allows for more reliable reproduction of bugs from reports compared to traditional issue tracking systems by presenting more detailed contextual app information. △ Less

Submitted 11 May, 2017; originally announced May 2017.

Comments: ESEC/FSE 2015 10th Joint Meeting on Foundations of Software Engineering, 14 pages

ACM Class: D.2.7

Showing 1–20 of 20 results for author: Bernal-Cardenas, C