-
A Snowballing Literature Study on Test Amplification
Authors:
Benjamin Danglot,
Oscar Luis Vera-Pérez,
Zhongxing Yu,
Andy Zaidman,
Martin Monperrus,
Benoit Baudry
Abstract:
The adoption of agile development approaches has put an increased emphasis on developer testing, resulting in software projects with strong test suites. These suites include a large number of test cases, in which developers embed knowledge about meaningful input data and expected properties in the form of oracles. This article surveys various works that aim at exploiting this knowledge in order to enhance these manually written tests with respect to an engineering goal (e.g., improve coverage of changes or increase the accuracy of fault localization). While these works rely on various techniques and address various goals, we believe they form an emerging and coherent field of research, which we call "test amplification". We devised a first set of papers from DBLP, looking for all papers containing "test" and "amplification" in their title. We reviewed the 70 papers in this set and selected the 4 papers that fit our definition of test amplification. We use these 4 papers as the seed for our snowballing study, and systematically followed the citation graph. This study is the first that draws a comprehensive picture of the different engineering goals proposed in the literature for test amplification. In particular, we note that the goal of test amplification goes far beyond maximizing coverage only. We believe that this survey will help researchers and practitioners entering this new field to understand more quickly and more deeply the intuitions, concepts and techniques used for test amplification.
Submitted 17 August, 2022; v1 submitted 30 May, 2017;
originally announced May 2017.
-
Test Case Generation for Program Repair: A Study of Feasibility and Effectiveness
Authors:
Zhongxing Yu,
Matias Martinez,
Benjamin Danglot,
Thomas Durieux,
Martin Monperrus
Abstract:
Among the many different kinds of program repair techniques, one widely studied family of techniques is called test suite based repair. Test-suites are in essence input-output specifications and are therefore typically inadequate for completely specifying the expected behavior of the program under repair. Consequently, the patches generated by test suite based program repair techniques pass the test suite, yet may be incorrect. Patches that are overly specific to the used test suite and fail to generalize to other test cases are called overfitting patches. In this paper, we investigate the feasibility and effectiveness of test case generation in alleviating the overfitting issue. We propose two approaches for using test case generation to improve test suite based repair, and perform an extensive evaluation of the effectiveness of the proposed approaches in enabling better test suite based repair on 224 bugs of the Defects4J repository. The results indicate that test case generation can change the resulting patch, but is not effective at turning incorrect patches into correct ones. We identify the problems related with the ineffectiveness, and anticipate that our results and findings will lead to future research to build test-case generation techniques that are tailored to automatic repair systems.
Submitted 1 March, 2017;
originally announced March 2017.
-
Correctness Attraction: A Study of Stability of Software Behavior Under Runtime Perturbation
Authors:
Benjamin Danglot,
Philippe Preux,
Benoit Baudry,
Martin Monperrus
Abstract:
Can the execution of a software be perturbed without breaking the correctness of the output? In this paper, we devise a novel protocol to answer this rarely investigated question. In an experimental study, we observe that many perturbations do not break the correctness in ten subject programs. We call this phenomenon "correctness attraction". The uniqueness of this protocol is that it considers a systematic exploration of the perturbation space as well as perfect oracles to determine the correctness of the output. To this extent, our findings on the stability of software under execution perturbations have a level of validity that has never been reported before in the scarce related work. A qualitative manual analysis enables us to set up the first taxonomy ever of the reasons behind correctness attraction.
Submitted 30 May, 2017; v1 submitted 28 November, 2016;
originally announced November 2016.
-
Production-Driven Patch Generation and Validation
Authors:
Thomas Durieux,
Youssef Hamadi,
Martin Monperrus
Abstract:
We envision a world where the developer would receive each morning in her GitHub dashboard a list of potential patches that fix certain production failures. For this, we propose a novel program repair scheme, with the unique feature of being applicable to production directly. We present the design and implementation of a prototype system for Java, called Itzal, that performs patch generation for uncaught exceptions in production. We have performed two empirical experiments to validate our system: the first one on 34 failures from 14 different software applications, the second one on 16 seeded failures in 3 real open-source e-commerce applications for which we have set up a realistic user traffic. This validates the novel and disruptive idea of using program repair directly in production.
Submitted 12 June, 2018; v1 submitted 22 September, 2016;
originally announced September 2016.
-
BanditRepair: Speculative Exploration of Runtime Patches
Authors:
Thomas Durieux,
Youssef Hamadi,
Martin Monperrus
Abstract:
We propose BanditRepair, a system that systematically explores and assesses a set of possible runtime patches. The system is grounded on so-called bandit algorithms, which are online machine learning algorithms designed for constantly balancing exploitation and exploration. BanditRepair's runtime patches are based on modifying the execution state for repairing null dereferences. BanditRepair constantly trades the ratio of automatically handled failures for searching for new runtime patches and vice versa. We evaluate the system with 16 null dereference field bugs, where BanditRepair identifies a total of 8460 different runtime patches, which are composed of 1 up to 8 decisions (execution modifications) taken in a row. We are the first to finely characterize the search space and the outcomes of runtime repair based on execution modification.
Submitted 24 March, 2016;
originally announced March 2016.
-
A Learning Algorithm for Change Impact Prediction
Authors:
Vincenzo Musco,
Antonin Carette,
Martin Monperrus,
Philippe Preux
Abstract:
Change impact analysis consists in predicting the impact of a code change in a software application. In this paper, we take a learning perspective on change impact analysis and consider the problem formulated as follows. The artifacts that are considered are methods of object-oriented software, the change under study is a change in the code of the method, and the impact is the set of test methods that fail because of the change that has been performed. We propose an algorithm, called LCIP, that learns from past impacts to predict future impacts. To evaluate our system, we consider 7 Java software applications totaling 214,000+ lines of code. We simulate 17574 changes and their actual impact through code mutations, as done in mutation testing. We find that LCIP can predict the impact with a precision of 69%, a recall of 79%, corresponding to an F-Score of 55%.
Submitted 6 May, 2018; v1 submitted 23 December, 2015;
originally announced December 2015.
-
NPEFix: Automatic Runtime Repair of Null Pointer Exceptions in Java
Authors:
Benoit Cornu,
Thomas Durieux,
Lionel Seinturier,
Martin Monperrus
Abstract:
Null pointer exceptions, also known as null dereferences, are the number one exceptions in the field. In this paper, we propose 9 alternative execution semantics when a null pointer exception is about to happen. We implement those alternative execution strategies using code transformation in a tool called NPEfix. We evaluate our prototype implementation on 11 field null dereference bugs and 519 seeded failures and show that NPEfix is able to repair at runtime 10/11 and 318/519 failures.
Submitted 23 December, 2015;
originally announced December 2015.
-
Automatic Software Diversity in the Light of Test Suites
Authors:
Benoit Baudry,
Simon Allier,
Marcelino Rodriguez-Cancio,
Martin Monperrus
Abstract:
A few works address the challenge of automating software diversification, and they all share one core idea: using automated test suites to drive diversification. However, there is a lack of solid understanding of how test suites, programs and transformations interact with one another in this process. We explore this intricate interplay in the context of a specific diversification technique called "sosiefication". Sosiefication generates sosie programs, i.e., variants of a program in which some statements are deleted, added or replaced but still pass the test suite of the original program. Our investigation of the influence of test suites on sosiefication exploits the following observation: test suites cover the different regions of programs in very unequal ways. Hence, we hypothesize that sosie synthesis has different performances on a statement that is covered by one hundred test cases and on a statement that is covered by a single test case. We synthesize 24583 sosies on 6 popular open-source Java programs. Our results show that there are two dimensions for diversification. The first one lies in the specification: the more test cases cover a statement, the more difficult it is to synthesize sosies. Yet, to our surprise, we are also able to synthesize sosies on highly tested statements (up to 600 test cases), which indicates an intrinsic property of the programs we study. The second dimension is in the code: we manually explore dozens of sosies and characterize new types of forgiving code regions that are prone to diversification.
Submitted 23 December, 2018; v1 submitted 1 September, 2015;
originally announced September 2015.
-
Dynamic Analysis can be Improved with Automatic Test Suite Refactoring
Authors:
Jifeng Xuan,
Benoit Cornu,
Matias Martinez,
Benoit Baudry,
Lionel Seinturier,
Martin Monperrus
Abstract:
Context: Developers design test suites to automatically verify that software meets its expected behaviors. Many dynamic analysis techniques are performed on the exploitation of execution traces from test cases. However, in practice, there is only one trace that results from the execution of one manually-written test case.
Objective: In this paper, we propose a new technique of test suite refactoring, called B-Refactoring. The idea behind B-Refactoring is to split a test case into small test fragments, which cover a simpler part of the control flow to provide better support for dynamic analysis.
Method: For a given dynamic analysis technique, our test suite refactoring approach monitors the execution of test cases and identifies small test cases without loss of the test ability. We apply B-Refactoring to assist two existing analysis tasks: automatic repair of if-statements bugs and automatic analysis of exception contracts.
Results: Experimental results show that test suite refactoring can effectively simplify the execution traces of the test suite. Three real-world bugs that could previously not be fixed with the original test suite are fixed after applying B-Refactoring; meanwhile, exception contracts are better verified via applying B-Refactoring to original test suites.
Conclusions: We conclude that applying B-Refactoring can effectively improve the purity of test cases. Existing dynamic analysis tasks can be enhanced by test suite refactoring.
Submitted 5 June, 2015;
originally announced June 2015.
-
Automatic Repair of Real Bugs: An Experience Report on the Defects4J Dataset
Authors:
Matias Martinez,
Thomas Durieux,
Jifeng Xuan,
Romain Sommerard,
Martin Monperrus
Abstract:
Defects4J is a large, peer-reviewed, structured dataset of real-world Java bugs. Each bug in Defects4J is provided with a test suite and at least one failing test case that triggers the bug. In this paper, we report on an experiment to explore the effectiveness of automatic repair on Defects4J. The result of our experiment shows that 47 bugs of the Defects4J dataset can be automatically repaired by state-of-the-art repair. This sets a baseline for future research on automatic repair for Java. We have manually analyzed 84 different patches to assess their real correctness. In total, 9 real Java bugs can be correctly fixed with test-suite based repair. This analysis shows that test-suite based repair suffers from under-specified bugs, for which trivial and incorrect patches still pass the test suite. With respect to practical applicability, it takes on average 14.8 minutes to find a patch. The experiment was done on a scientific grid, totaling 17.6 days of computation time. All their systems and experimental results are publicly available on GitHub in order to facilitate future research on automatic repair.
Submitted 23 December, 2015; v1 submitted 26 May, 2015;
originally announced May 2015.
-
Automatic Repair of Infinite Loops
Authors:
Sebastian R. Lamelas Marcote,
Martin Monperrus
Abstract:
Research on automatic software repair is concerned with the development of systems that automatically detect and repair bugs. One well-known class of bugs is the infinite loop. Every computer programmer or user has, at least once, experienced this type of bug. We state the problem of repairing infinite loops in the context of test-suite based software repair: given a test suite with at least one failing test, generate a patch that makes all test cases pass. Consequently, repairing infinite loops means having at least one test case that hangs by triggering the infinite loop. Our system to automatically repair infinite loops is called Infinitel. We develop a technique to manipulate loops so that one can dynamically analyze the number of iterations of loops; decide to interrupt the loop execution; and dynamically examine the state of the loop on a per-iteration basis. Then, in order to synthesize a new loop condition, we encode this set of program states as a code synthesis problem using a technique based on Satisfiability Modulo Theory (SMT). We evaluate our technique on seven seeded bugs and on seven real bugs. Infinitel is able to repair all of them, within seconds up to one hour on a standard laptop configuration.
Submitted 20 April, 2015;
originally announced April 2015.
-
DSpot: Test Amplification for Automatic Assessment of Computational Diversity
Authors:
Benoit Baudry,
Simon Allier,
Marcelino Rodriguez-Cancio,
Martin Monperrus
Abstract:
Context: Computational diversity, i.e., the presence of a set of programs that all perform compatible services but that exhibit behavioral differences under certain conditions, is essential for fault tolerance and security. Objective: We aim at proposing an approach for automatically assessing the presence of computational diversity. In this work, computationally diverse variants are defined as (i) sharing the same API, (ii) behaving the same according to an input-output based specification (a test-suite) and (iii) exhibiting observable differences when they run outside the specified input space. Method: Our technique relies on test amplification. We propose source code transformations on test cases to explore the input domain and systematically sense the observation domain. We quantify computational diversity as the dissimilarity between observations on inputs that are outside the specified domain. Results: We run our experiments on 472 variants of 7 classes from open-source, large and thoroughly tested Java classes. Our test amplification multiplies by ten the number of input points in the test suite and is effective at detecting software diversity. Conclusion: The key insights of this study are: the systematic exploration of the observable output space of a class provides new insights about its degree of encapsulation; the behavioral diversity that we observe originates from areas of the code that are characterized by their flexibility (caching, checking, formatting, etc.).
Submitted 15 June, 2015; v1 submitted 19 March, 2015;
originally announced March 2015.
-
Casper: Debugging Null Dereferences with Dynamic Causality Traces
Authors:
Benoit Cornu,
Earl T. Barr,
Lionel Seinturier,
Martin Monperrus
Abstract:
Fixing a software error requires understanding its root cause. In this paper, we introduce "causality traces", crafted execution traces augmented with the information needed to reconstruct the causal chain from the root cause of a bug to an execution error. We propose an approach and a tool, called Casper, for dynamically constructing causality traces for null dereference errors. The core idea of Casper is to inject special values, called "ghosts", into the execution stream to construct the causality trace at runtime. We evaluate our contribution by providing and assessing the causality traces of 14 real null dereference bugs collected over six large, popular open-source projects. Over this data set, Casper builds a causality trace in less than 5 seconds.
Submitted 20 November, 2015; v1 submitted 6 February, 2015;
originally announced February 2015.
-
Software that Learns from its Own Failures
Authors:
Martin Monperrus
Abstract:
All non-trivial software systems suffer from unanticipated production failures. However, those systems are passive with respect to failures and do not take advantage of them in order to improve their future behavior: they simply wait for them to happen and trigger hard-coded failure recovery strategies. Instead, I propose a new paradigm in which software systems learn from their own failures. By using an advanced monitoring system, they have a constant awareness of their own state and health. They are designed to automatically explore alternative recovery strategies inferred from past successful and failed executions. Their recovery capabilities are assessed by self-injection of controlled failures; this process produces knowledge in anticipation of future unanticipated failures.
Submitted 3 February, 2015;
originally announced February 2015.
-
A Generative Model of Software Dependency Graphs to Better Understand Software Evolution
Authors:
Vincenzo Musco,
Martin Monperrus,
Philippe Preux
Abstract:
Software systems are composed of many interacting elements. A natural way to abstract over software systems is to model them as graphs. In this paper, we consider software dependency graphs of object-oriented software and we study one topological property: the degree distribution. Based on the analysis of ten software systems written in Java, we show that there exist completely different systems that have the same degree distribution. Then, we propose a generative model of software dependency graphs which synthesizes graphs whose degree distribution is close to the empirical ones observed in real software systems. This model gives us novel insights on the potential fundamental rules of software evolution.
Submitted 10 April, 2017; v1 submitted 29 October, 2014;
originally announced October 2014.
-
ASTOR: Evolutionary Automatic Software Repair for Java
Authors:
Matias Martinez,
Martin Monperrus
Abstract:
Context: In recent years, many automatic software repair approaches have been presented by the software engineering research community. According to the corresponding papers, these approaches are able to repair real defects from open source projects. Problem: Some previous publications in the automatic repair field do not provide the implementation of their approaches. Consequently, it is not possible for the research community to re-execute the original evaluation, to set up new evaluations (for example, to evaluate the performance against new defects) or to compare approaches against each other. Solution: We propose a publicly available automatic software repair tool called Astor. It implements three state-of-the-art automatic software repair approaches in the context of Java programs (including GenProg and a subset of PAR's templates). The source code of Astor is licensed under the GNU General Public Licence (GPL v2).
Submitted 24 October, 2014;
originally announced October 2014.
-
The Multiple Facets of Software Diversity: Recent Developments in Year 2000 and Beyond
Authors:
Benoit Baudry,
Martin Monperrus
Abstract:
Early experiments with software diversity in the mid 1970's investigated N-version programming and recovery blocks to increase the reliability of embedded systems. Four decades later, the literature about software diversity has expanded in multiple directions: goals (fault-tolerance, security, software engineering); means (managed or automated diversity) and analytical studies (quantification of diversity and its impact). Our paper contributes to the field of software diversity as the first paper that adopts an inclusive vision of the area, with an emphasis on the most recent advances in the field. This survey includes classical work about design and data diversity for fault tolerance, as well as the cybersecurity literature that investigates randomization at different system levels. It broadens this standard scope of diversity, to include the study and exploitation of natural diversity and the management of diverse software products. Our survey includes the most recent works, with an emphasis from 2000 to present. The targeted audience is researchers and practitioners in one of the surveyed fields, who miss the big picture of software diversity. Assembling the multiple facets of this fascinating topic sheds a new light on the field.
Submitted 25 September, 2014;
originally announced September 2014.
-
Test Case Purification for Improving Fault Localization
Authors:
Jifeng Xuan,
Martin Monperrus
Abstract:
Finding and fixing bugs are time-consuming activities in software development. Spectrum-based fault localization aims to identify the faulty position in source code based on the execution trace of test cases. Failing test cases and their assertions form test oracles for the failing behavior of the system under analysis. In this paper, we propose a novel concept of spectrum-driven test case purification for improving fault localization. The goal of test case purification is to separate existing test cases into small fractions (called purified test cases) and to enhance the test oracles to further localize faults. Combined with an original fault localization technique (e.g., Tarantula), test case purification results in a better ranking of the program statements. Our experiments on 1800 faults in six open-source Java programs show that test case purification can effectively improve existing fault localization techniques.
Submitted 10 September, 2014;
originally announced September 2014.
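For readers unfamiliar with the spectrum-based fault localization mentioned in the abstract above, the following is a minimal sketch of the commonly cited Tarantula suspiciousness score. It is not taken from the paper and does not implement test case purification; the function names `tarantula` and `rank_statements` and the shape of the `coverage` dictionary are assumptions made for illustration only.

```python
def tarantula(failed_cov, passed_cov, total_failed, total_passed):
    """Tarantula suspiciousness of one statement.

    failed_cov / passed_cov: number of failing / passing tests that
    execute the statement (hypothetical inputs for this sketch).
    """
    fail_ratio = failed_cov / total_failed if total_failed else 0.0
    pass_ratio = passed_cov / total_passed if total_passed else 0.0
    denom = fail_ratio + pass_ratio
    # A statement executed by no test gets score 0.0 by convention here.
    return fail_ratio / denom if denom else 0.0


def rank_statements(coverage, total_failed, total_passed):
    """Order statement ids from most to least suspicious.

    coverage maps a statement id to a (failed_cov, passed_cov) pair.
    """
    scores = {
        stmt: tarantula(f, p, total_failed, total_passed)
        for stmt, (f, p) in coverage.items()
    }
    return sorted(scores, key=lambda stmt: scores[stmt], reverse=True)


if __name__ == "__main__":
    # One failing test and three passing tests over three statements.
    coverage = {"s1": (1, 3), "s2": (1, 0), "s3": (0, 2)}
    print(rank_statements(coverage, total_failed=1, total_passed=3))
```

A statement executed only by failing tests scores 1.0 and one executed only by passing tests scores 0.0, which is why splitting a failing test into smaller fragments can change the ranking.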
-
Static Analysis for Extracting Permission Checks of a Large Scale Framework: The Challenges And Solutions for Analyzing Android
Authors:
Alexandre Bartel,
Jacques Klein,
Martin Monperrus,
Yves Le Traon
Abstract:
A common security architecture is based on the protection of certain resources by permission checks (used e.g., in Android and Blackberry). It has some limitations, for instance, when applications are granted more permissions than they actually need, which facilitates all kinds of malicious usage (e.g., through code injection). The analysis of permission-based frameworks requires a precise mapping between API methods of the framework and the permissions they require. In this paper, we show that naive static analysis fails miserably when applied with off-the-shelf components on the Android framework. We then present an advanced class-hierarchy and field-sensitive set of analyses to extract this mapping. Those static analyses are capable of analyzing the Android framework. They use novel domain specific optimizations dedicated to Android.
Submitted 18 August, 2014;
originally announced August 2014.
-
A Critical Review of "Automatic Patch Generation Learned from Human-Written Patches": Essay on the Problem Statement and the Evaluation of Automatic Software Repair
Authors:
Martin Monperrus
Abstract:
At ICSE'2013, there was the first session ever dedicated to automatic program repair. In this session, Kim et al. presented PAR, a novel template-based approach for fixing Java bugs. We strongly disagree with key points of this paper. Our critical review has two goals. First, we aim at explaining why we disagree with Kim and colleagues and why the reasons behind this disagreement are important for research on automatic software repair in general. Second, we aim at contributing to the field with a clarification of the essential ideas behind automatic software repair. In particular we discuss the main evaluation criteria of automatic software repair: understandability, correctness and completeness. We show that depending on how one sets up the repair scenario, the evaluation goals may be contradictory. Eventually, we discuss the nature of fix acceptability and its relation to the notion of software correctness.
Submitted 9 August, 2014;
originally announced August 2014.
-
Automatic Repair of Buggy If Conditions and Missing Preconditions with SMT
Authors:
Favio Demarco,
Jifeng Xuan,
Daniel Le Berre,
Martin Monperrus
Abstract:
We present Nopol, an approach for automatically repairing buggy if conditions and missing preconditions. As input, it takes a program and a test suite which contains passing test cases modeling the expected behavior of the program and at least one failing test case embodying the bug to be repaired. It consists of collecting data from multiple instrumented test suite executions, transforming this data into a Satisfiability Modulo Theory (SMT) problem, and translating the SMT result -- if there exists one -- into a source code patch. Nopol repairs object oriented code and allows the patches to contain nullness checks as well as specific method calls.
Submitted 11 April, 2014;
originally announced April 2014.
-
Principles of Antifragile Software
Authors:
Martin Monperrus
Abstract:
The goal of this paper is to study and define the concept of "antifragile software". For this, I start from Taleb's statement that antifragile systems love errors, and discuss whether traditional software dependability fits into this class. The answer is somewhat negative, although adaptive fault tolerance is antifragile: the system learns something when an error happens, and always improves. Automatic runtime bug fixing is changing the code in response to errors; fault injection in production means injecting errors in business-critical software. I claim that both correspond to antifragility. Finally, I hypothesize that antifragile development processes are better at producing antifragile software systems.
Submitted 7 June, 2017; v1 submitted 11 April, 2014;
originally announced April 2014.
-
Do the Fix Ingredients Already Exist? An Empirical Inquiry into the Redundancy Assumptions of Program Repair Approaches
Authors:
Matias Martinez,
Westley Weimer,
Martin Monperrus
Abstract:
Much initial research on automatic program repair has focused on experimental results to probe their potential to find patches and reduce development effort. Relatively less effort has been put into understanding the hows and whys of such approaches. For example, a critical assumption of the GenProg technique is that certain bugs can be fixed by copying and re-arranging existing code. In other words, GenProg assumes that the fix ingredients already exist elsewhere in the code. In this paper, we formalize these assumptions around the concept of "temporal redundancy". A temporally redundant commit is only composed of what has already existed in previous commits. Our experiments show that a large proportion of commits that add existing code are temporally redundant. This validates the fundamental redundancy assumption of GenProg.
Submitted 25 March, 2014;
originally announced March 2014.
-
An Approach for Discovering Traceability Links between Regulatory Documents and Source Code Through User-Interface Labels
Authors:
Antoine Mischler,
Martin Monperrus
Abstract:
In application domains that are regulated, software vendors must maintain traceability links between the regulatory items and the code base implementing them. In this paper, we present a traceability approach based on the intuition that the regulatory documents and the user-interface of the corresponding software applications are very close. First, they use the same terminology. Second, most important regulatory pieces of information appear in the graphical user-interface because the end-users in those application domains care about the regulation (by construction). We evaluate our approach in the domain of green building. The evaluation involves a domain expert, lead architect of a commercial product within this area. The evaluation shows that the recovered traceability links are accurate.
Submitted 11 March, 2014;
originally announced March 2014.
-
Tailored Source Code Transformations to Synthesize Computationally Diverse Program Variants
Authors:
Benoit Baudry,
Simon Allier,
Martin Monperrus
Abstract:
The predictability of program execution provides a rich source of knowledge to attackers, who can exploit it to spy on or remotely control the program. Moving target defense addresses this issue by constantly switching between many diverse variants of a program, which reduces the certainty that an attacker can have about the program execution. The effectiveness of this approach relies on the availability of a large number of software variants that exhibit different executions. However, current approaches rely on the natural diversity provided by off-the-shelf components, which is very limited. In this paper, we explore the automatic synthesis of large sets of program variants, called sosies. Sosies provide the same expected functionality as the original program, while exhibiting different executions. They are said to be computationally diverse. This work addresses two objectives: comparing different transformations for increasing the likelihood of sosie synthesis (densifying the search space for sosies); and demonstrating computation diversity in synthesized sosies. We synthesized 30184 sosies in total, for 9 large, real-world, open source applications. For all these programs we identified one type of program analysis that systematically increases the density of sosies; we measured computation diversity for sosies of 3 programs and found diversity in method calls or data in more than 40% of sosies. This is a step towards controlled massive unpredictability of software.
Submitted 29 January, 2014;
originally announced January 2014.
-
Reasoning and Improving on Software Resilience against Unanticipated Exceptions
Authors:
Benoit Cornu,
Lionel Seinturier,
Martin Monperrus
Abstract:
In software, there are the errors anticipated at specification and design time, those encountered at development and testing time, and those that happen in production mode yet were never anticipated. In this paper, we aim at reasoning on the ability of software to correctly handle unanticipated exceptions. We propose an algorithm, called short-circuit testing, which injects exceptions during test suite execution so as to simulate unanticipated errors. This algorithm collects data that is used as input for verifying two formal exception contracts that capture two resilience properties. Our evaluation on 9 test suites, with 78% line coverage on average, analyzes 241 executed catch blocks and shows that 101 of them expose resilience properties and that 84 can be transformed to be more resilient.
Submitted 31 December, 2013;
originally announced January 2014.
-
Abmash: Mashing Up Legacy Web Applications by Automated Imitation of Human Actions
Authors:
Alper Ortac,
Martin Monperrus,
Mira Mezini
Abstract:
Many business web-based applications do not offer application programming interfaces (APIs) to enable other applications to access their data and functions in a programmatic manner. This makes their composition difficult (for instance to synchronize data between two applications). To address this challenge, this paper presents Abmash, an approach to facilitate the integration of such legacy web applications by automatically imitating human interactions with them. By automatically interacting with the graphical user interface (GUI) of web applications, the system supports all forms of integrations including bi-directional interactions and is able to interact with AJAX-based applications. Furthermore, the integration programs are easy to write since they deal with end-user, visual user-interface elements. The integration code is simple enough to be called a "mashup".
Submitted 2 December, 2013;
originally announced December 2013.
-
Mining Software Repair Models for Reasoning on the Search Space of Automated Program Fixing
Authors:
Matias Martinez,
Martin Monperrus
Abstract:
This paper is about understanding the nature of bug fixing by analyzing thousands of bug fix transactions of software repositories. It then places this learned knowledge in the context of automated program repair. We give extensive empirical results on the nature of human bug fixes at a large scale and a fine granularity with abstract syntax tree differencing. We set up mathematical reasoning on the search space of automated repair and the time to navigate through it. By applying our method on 14 repositories of Java software and 89,993 versioning transactions, we show that not all probabilistic repair models are equivalent.
Submitted 14 November, 2013;
originally announced November 2013.
-
Automatically Extracting Instances of Code Change Patterns with AST Analysis
Authors:
Matias Martinez,
Laurence Duchien,
Martin Monperrus
Abstract:
A code change pattern represents a kind of recurrent modification in software. For instance, a known code change pattern consists of the change of the conditional expression of an if statement. Previous work has identified different change patterns. Complementary to the identification and definition of change patterns, the automatic extraction of pattern instances is essential to measure their empirical importance. For example, it enables one to count and compare the number of conditional expression changes in the history of different projects. In this paper we present a novel approach for searching for pattern instances in software history. Our technique is based on the analysis of the Abstract Syntax Trees (ASTs) of the files within a given commit. We validate our approach by counting instances of 18 change patterns in 6 open-source Java projects.
Submitted 15 September, 2013;
originally announced September 2013.
-
Empirical Evidence of Large-Scale Diversity in API Usage of Object-Oriented Software
Authors:
Diego Mendez,
Benoit Baudry,
Martin Monperrus
Abstract:
In this paper, we study how object-oriented classes are used across thousands of software packages. We concentrate on "usage diversity", defined as the different statically observable combinations of methods called on the same object. We present empirical evidence that there is a significant usage diversity for many classes. For instance, we observe in our dataset that Java's String is used in 2460 manners. We discuss the reasons for this observed diversity and the consequences for software engineering knowledge and research.
Submitted 21 August, 2013; v1 submitted 15 July, 2013;
originally announced July 2013.
-
Detecting Missing Method Calls as Violations of the Majority Rule
Authors:
Martin Monperrus,
Mira Mezini
Abstract:
When using object-oriented frameworks it is easy to overlook certain important method calls that are required at particular places in code. In this paper, we provide a comprehensive set of empirical facts on this problem, starting from traces of missing method calls in a bug repository. We propose a new system that searches for missing method calls in software based on the other method calls that are observable. Our key insight is that the voting theory concept of majority rule holds for method calls: a call is likely to be missing if there is a majority of similar pieces of code where this call is present. The evaluation shows that the system predictions go further than missing method calls and often reveal different kinds of code smells (e.g., violations of API best practices).
Submitted 4 June, 2013;
originally announced June 2013.
-
Mashup of Meta-Languages and its Implementation in the Kermeta Language Workbench
Authors:
Jean-Marc Jézéquel,
Benoit Combemale,
Olivier Barais,
Martin Monperrus,
François Fouquet
Abstract:
With the growing use of domain-specific languages (DSL) in industry, DSL design and implementation goes far beyond an activity for a few experts only and becomes a challenging task for thousands of software engineers. DSL implementation indeed requires engineers to care for various concerns, from abstract syntax, static semantics, and behavioral semantics to extra-functional issues such as run-time performance. This paper presents an approach that uses one meta-language per language implementation concern. We show that the usage and combination of those meta-languages is simple and intuitive enough to deserve the term "mashup". We evaluate the approach by completely implementing the non-trivial fUML modeling language, a semantically sound and executable subset of the Unified Modeling Language (UML).
Submitted 4 June, 2013;
originally announced June 2013.
-
XSS-FP: Browser Fingerprinting using HTML Parser Quirks
Authors:
Erwan Abgrall,
Yves Le Traon,
Martin Monperrus,
Sylvain Gombault,
Mario Heiderich,
Alain Ribault
Abstract:
There are many scenarios in which inferring the type of a client browser is desirable, for instance to fight against session stealing. This is known as browser fingerprinting. This paper presents and evaluates a novel fingerprinting technique to determine the exact nature (browser type and version, e.g., Firefox 15) of a web browser, exploiting HTML parser quirks exercised through XSS. Our experiments show that the exact version of a web browser can be determined with 71% accuracy, and that only 6 tests are sufficient to quickly determine the exact family a web browser belongs to.
Submitted 20 November, 2012;
originally announced November 2012.
-
In-Vivo Bytecode Instrumentation for Improving Privacy on Android Smartphones in Uncertain Environments
Authors:
Alexandre Bartel,
Jacques Klein,
Martin Monperrus,
Kevin Allix,
Yves Le Traon
Abstract:
In this paper we claim that an efficient and readily applicable means to improve privacy of Android applications is: 1) to perform runtime monitoring by instrumenting the application bytecode and 2) to do so in-vivo, i.e., directly on the smartphone. We present a tool chain to do this and present experimental results showing that this tool chain can run on smartphones in a reasonable amount of time and with a realistic effort. Our findings also identify challenges to be addressed before running powerful runtime monitoring and instrumentations directly on smartphones. We implemented two use-cases leveraging the tool chain: BetterPermissions, a fine-grained user-centric permission policy system, and AdRemover, an advertisement remover. Both prototypes improve the privacy of Android systems thanks to in-vivo bytecode instrumentation.
Submitted 8 October, 2013; v1 submitted 5 June, 2012;
originally announced August 2012.
-
Automatically Securing Permission-Based Software by Reducing the Attack Surface: An Application to Android
Authors:
Alexandre Bartel,
Jacques Klein,
Martin Monperrus,
Yves Le Traon
Abstract:
A common security architecture, called the permission-based security model (used e.g., in Android and Blackberry), entails intrinsic risks. For instance, applications can be granted more permissions than they actually need, what we call a "permission gap". Malware can leverage the unused permissions for achieving their malicious goals, for instance using code injection. In this paper, we present an approach to detecting permission gaps using static analysis. Our prototype implementation in the context of Android shows that the static analysis must take into account a significant amount of platform-specific knowledge. Using our tool on two datasets of Android applications, we found that a non-negligible share of applications suffer from permission gaps, i.e., do not use all the permissions they declare.
Submitted 20 March, 2013; v1 submitted 22 May, 2012;
originally announced June 2012.
-
What Should Developers Be Aware Of? An Empirical Study on the Directives of API Documentation
Authors:
Martin Monperrus,
Michael Eichberg,
Elif Tekes,
Mira Mezini
Abstract:
Application Programming Interfaces (API) are exposed to developers in order to reuse software libraries. API directives are natural-language statements in API documentation that make developers aware of constraints and guidelines related to the usage of an API. This paper presents the design and the results of an empirical study on the directives of API documentation of object-oriented libraries. Its main contribution is to propose and extensively discuss a taxonomy of 23 kinds of API directives.
Submitted 29 May, 2012;
originally announced May 2012.
-
Querying Source Code with Natural Language
Authors:
Markus Kimmig,
Martin Monperrus,
Mira Mezini
Abstract:
One common task of developing or maintaining software is searching the source code for information like specific method calls or write accesses to certain fields. This kind of information is required to correctly implement new features and to solve bugs. This paper presents an approach for querying source code with natural language.
Submitted 29 May, 2012;
originally announced May 2012.
-
Dexpler: Converting Android Dalvik Bytecode to Jimple for Static Analysis with Soot
Authors:
Alexandre Bartel,
Jacques Klein,
Martin Monperrus,
Yves Le Traon
Abstract:
This paper introduces Dexpler, a software package which converts Dalvik bytecode to Jimple. Dexpler is built on top of Dedexer and Soot. As Jimple is Soot's main internal representation of code, the Dalvik bytecode can be manipulated with any Jimple-based tool, for instance for performing points-to or flow analysis.
Submitted 31 January, 2013; v1 submitted 16 May, 2012;
originally announced May 2012.
-
Towards Ecology Inspired Software Engineering
Authors:
Benoit Baudry,
Martin Monperrus
Abstract:
Ecosystems are complex and dynamic systems. Over billions of years, they have developed advanced capabilities to provide stable functions, despite changes in their environment. In this paper, we argue that the laws of organization and development of ecosystems provide a solid and rich source of inspiration to lay the foundations for novel software construction paradigms that provide stability as much as openness.
Submitted 10 July, 2012; v1 submitted 5 May, 2012;
originally announced May 2012.
-
Semi-Automatically Extracting FAQs to Improve Accessibility of Software Development Knowledge
Authors:
Stefan Henß,
Martin Monperrus,
Mira Mezini
Abstract:
Frequently asked questions (FAQs) are a popular way to document software development knowledge. As creating such documents is expensive, this paper presents an approach for automatically extracting FAQs from sources of software development discussion, such as mailing lists and Internet forums, by combining techniques of text mining and natural language processing. We apply the approach to popular mailing lists and carry out a survey among software developers to show that it is able to extract high-quality FAQs that may be further improved by experts.
Submitted 23 March, 2012;
originally announced March 2012.