-
Fuzzing Symbolic Expressions
Authors:
Luca Borzacchiello,
Emilio Coppa,
Camil Demetrescu
Abstract:
Recent years have witnessed a wide array of results in software testing, exploring different approaches and methodologies ranging from fuzzers to symbolic engines, with a full spectrum of instances in between such as concolic execution and hybrid fuzzing. A key ingredient of many of these tools is Satisfiability Modulo Theories (SMT) solvers, which are used to reason over symbolic expressions collected during the analysis. In this paper, we investigate whether techniques borrowed from the fuzzing domain can be applied to check whether symbolic formulas are satisfiable in the context of concolic and hybrid fuzzing engines, providing a viable alternative to classic SMT solving techniques. We devise a new approximate solver, FUZZY-SAT, and show that it is both competitive with and complementary to state-of-the-art solvers such as Z3 with respect to handling queries generated by hybrid fuzzers.
Submitted 12 February, 2021;
originally announced February 2021.
-
On-Stack Replacement à la Carte
Authors:
Daniele Cono D'Elia,
Camil Demetrescu
Abstract:
On-stack replacement (OSR) dynamically transfers execution between different code versions. This mechanism is used in mainstream runtime systems to support adaptive and speculative optimizations by running code tailored to provide the best expected performance for the actual workload. Current approaches either restrict the program points where OSR can be fired or require complex optimization-specific operations to realign the program's state during a transition. The engineering effort to implement OSR and the lack of abstractions make it rarely accessible to the research community, leaving fundamental questions regarding its flexibility largely unexplored.
In this article we make a first step towards a provably sound abstract framework for OSR. We show that compiler optimizations can be made OSR-aware in isolation, and then safely composed. We identify a class of transformations, which we call live-variable equivalent (LVE), that captures a natural property of fundamental compiler optimizations, and devise an algorithm to automatically generate the OSR machinery required for an LVE transition at arbitrary program locations.
We present an implementation of our ideas in LLVM and evaluate it against prominent benchmarks, showing that bidirectional OSR transitions are possible almost everywhere in the code in the presence of common, unhindered global optimizations. We then discuss the end-to-end utility of our techniques in source-level debugging of optimized code, showing how our algorithms can provide novel building blocks for debuggers for both executables and managed runtimes.
Submitted 8 August, 2017;
originally announced August 2017.
-
A Survey of Symbolic Execution Techniques
Authors:
Roberto Baldoni,
Emilio Coppa,
Daniele Cono D'Elia,
Camil Demetrescu,
Irene Finocchi
Abstract:
Many security and software testing applications require checking whether certain properties of a program hold for any possible usage scenario. For instance, a tool for identifying software vulnerabilities may need to rule out the existence of any backdoor to bypass a program's authentication. One approach would be to test the program using different, possibly random inputs. As the backdoor may only be hit for very specific program workloads, automated exploration of the space of possible inputs is of the essence. Symbolic execution provides an elegant solution to the problem, by systematically exploring many possible execution paths at the same time without necessarily requiring concrete inputs. Rather than taking on fully specified input values, the technique abstractly represents them as symbols, resorting to constraint solvers to construct actual instances that would cause property violations. Symbolic execution has been incubated in dozens of tools developed over the last four decades, leading to major practical breakthroughs in a number of prominent software reliability applications. The goal of this survey is to provide an overview of the main ideas, challenges, and solutions developed in the area, distilling them for a broad audience.
The present survey has been accepted for publication at ACM Computing Surveys. If you are considering citing this survey, we would appreciate it if you could use the following BibTeX entry: http://goo.gl/Hf5Fvc
Submitted 2 May, 2018; v1 submitted 3 October, 2016;
originally announced October 2016.
-
Experimental Evaluation of Algorithms for the Food-Selection Problem
Authors:
Camil Demetrescu,
Irene Finocchi,
Giuseppe F. Italiano,
Luigi Laura
Abstract:
In this paper, we describe the results of our experiments on Algorithms for the Food-Selection Problem, which is the fundamental problem first stated and addressed in the seminal paper \cite{pigout}. Because the key aspect of any experimental evaluation is \textbf{reproducibility}, we describe the setup of all our experiments in detail, thus leaving to the interested eater the opportunity to reproduce all the results described in this paper. More specifically, we describe all the answers we provided to the questions proposed in \cite{pigout}: Where can I have dinner tonight? What is the typical Roman cuisine that I should (not) miss? Where can I find the best coffee or gelato in town?
Submitted 29 January, 2014;
originally announced January 2014.
-
Ball-Larus Path Profiling Across Multiple Loop iterations
Authors:
Daniele Cono D'Elia,
Camil Demetrescu,
Irene Finocchi
Abstract:
Identifying the hottest paths in the control flow graph of a routine can direct optimizations to portions of the code where most resources are consumed. This powerful methodology, called path profiling, was introduced by Ball and Larus in the mid-1990s and has received considerable attention in the last 15 years for its practical relevance. A shortcoming of Ball-Larus path profiling was the inability to profile cyclic paths, making it difficult to mine interesting execution patterns that span multiple loop iterations. Previous results, based on rather complex algorithms, have attempted to circumvent this limitation at the price of significant performance losses even for a small number of iterations. In this paper, we present a new approach to multi-iteration path profiling, based on data structures built on top of the original Ball-Larus numbering technique. Our approach makes it possible to profile all executed paths obtained as a concatenation of up to k Ball-Larus acyclic paths, where k is a user-defined parameter. An extensive experimental investigation on a large variety of Java benchmarks on the Jikes RVM shows that, surprisingly, our approach can be even faster than Ball-Larus due to fewer operations on smaller hash tables, producing compact representations of cyclic paths even for large values of k.
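As an illustration of the Ball-Larus numbering technique that the approach builds on, the following is a minimal sketch of the classic acyclic edge-numbering step. It is not the multi-iteration data structure from the paper; the function names and the toy graph are assumptions made for this example, and parallel edges between the same pair of nodes are not handled.

```python
from collections import defaultdict

def ball_larus_numbering(edges, entry, exit_node):
    """Assign an integer to each edge of a DAG so that the sum of edge
    values along every entry-to-exit path is a distinct id in
    [0, number_of_paths). Hypothetical helper, for illustration only."""
    succs = defaultdict(list)
    for src, dst in edges:
        succs[src].append(dst)

    # Post-order DFS: every node is appended after all of its successors,
    # which gives a reverse topological order of the DAG.
    order, seen = [], set()
    def visit(v):
        if v in seen:
            return
        seen.add(v)
        for w in succs[v]:
            visit(w)
        order.append(v)
    visit(entry)

    num_paths, edge_val = {}, {}
    for v in order:
        if v == exit_node:
            num_paths[v] = 1          # one (empty) path from exit to itself
            continue
        num_paths[v] = 0
        for w in succs[v]:
            # Paths through earlier successors occupy ids [0, num_paths[v]).
            edge_val[(v, w)] = num_paths[v]
            num_paths[v] += num_paths[w]
    return edge_val, num_paths[entry]

def path_id(path, edge_val):
    """Sum the edge values along a path given as a sequence of nodes."""
    return sum(edge_val[(a, b)] for a, b in zip(path, path[1:]))

# Toy control flow graph (assumed for this example).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"),
         ("C", "D"), ("D", "E"), ("D", "F"), ("E", "F")]
vals, total = ball_larus_numbering(edges, "A", "F")
```

A profiler built on this numbering only needs to add the edge value to a running counter on each traversed edge and bump a per-id counter at the exit, which is why path ids can serve as compact hash table keys.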
Submitted 18 April, 2013;
originally announced April 2013.
-
Multithreaded Input-Sensitive Profiling
Authors:
Emilio Coppa,
Camil Demetrescu,
Irene Finocchi,
Romolo Marotta
Abstract:
Input-sensitive profiling is a recent performance analysis technique that makes it possible to estimate the empirical cost function of individual routines of a program, helping developers understand how performance scales to larger inputs and pinpoint asymptotic bottlenecks in the code. A current limitation of input-sensitive profilers is that they specifically target sequential computations, ignoring any communication between threads. In this paper we show how to overcome this limitation, extending the range of applicability of the original approach to multithreaded applications and to applications that operate on I/O streams. We develop new metrics for automatically estimating the size of the input given to each routine activation, addressing input produced by non-deterministic memory stores performed by other threads as well as by the OS kernel (e.g., in response to I/O or network operations). We provide real case studies, showing that our extension makes it possible to characterize the behavior of complex applications more precisely than previous approaches. An extensive experimental investigation on a variety of benchmark suites (including the SPEC OMP2012 and the PARSEC benchmarks) shows that our Valgrind-based input-sensitive profiler incurs an overhead comparable to other prominent heavyweight analysis tools, while collecting significantly more performance points from each profiling session and correctly characterizing both thread-induced and external input.
Submitted 13 April, 2013;
originally announced April 2013.
-
Reactive Imperative Programming with Dataflow Constraints
Authors:
Camil Demetrescu,
Irene Finocchi,
Andrea Ribichini
Abstract:
Dataflow languages provide natural support for specifying constraints between objects in dynamic applications, where programs need to react efficiently to changes in their environment. Researchers have long investigated how to take advantage of dataflow constraints by embedding them into procedural languages. Previous mixed imperative/dataflow systems, however, require syntactic extensions or libraries of ad hoc data types for binding the imperative program to the dataflow solver. In this paper we propose a novel approach that smoothly combines the two paradigms without placing undue burden on the programmer. In our framework, programmers can define ordinary commands of the host imperative language that enforce constraints between objects stored in "reactive" memory locations. Reactive objects can be of any legal type in the host language, including primitive data types, pointers, arrays, and structures. Constraints are automatically re-executed every time their input memory locations change, letting a program behave like a spreadsheet where the values of some variables depend upon the values of other variables. The constraint solving mechanism is handled transparently by altering the semantics of elementary operations of the host language for reading and modifying objects. We provide a formal semantics and describe a concrete embodiment of our technique into C/C++, showing how to implement it efficiently on conventional platforms using off-the-shelf compilers. We discuss relevant applications to reactive scenarios, including incremental computation, the observer design pattern, and data structure repair. The performance of our implementation is compared to ad hoc problem-specific change propagation algorithms and to language-centric approaches such as self-adjusting computation and subject/observer communication mechanisms, showing that the proposed approach is efficient in practice.
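The spreadsheet-like behavior described above can be sketched in a few lines. This is only a toy model under stated assumptions: the paper's C/C++ embodiment intercepts ordinary memory reads and writes transparently, whereas this sketch uses explicit read and write calls on a hypothetical ReactiveStore class, and it re-runs constraints eagerly rather than through a scheduler.

```python
class ReactiveStore:
    """Toy reactive memory: constraints that read a cell are re-executed
    whenever that cell changes. All names here are hypothetical."""

    def __init__(self):
        self._values = {}
        self._readers = {}     # cell name -> set of constraints that read it
        self._running = None   # constraint currently executing, if any

    def read(self, name):
        # Record the dependency only while a constraint is executing.
        if self._running is not None:
            self._readers.setdefault(name, set()).add(self._running)
        return self._values[name]

    def write(self, name, value):
        # Writing an unchanged value triggers nothing, so propagation
        # stops once the constraints have converged.
        if name in self._values and self._values[name] == value:
            return
        self._values[name] = value
        for constraint in list(self._readers.get(name, ())):
            self.run(constraint)

    def run(self, constraint):
        previous, self._running = self._running, constraint
        try:
            constraint(self)
        finally:
            self._running = previous

# Usage: keep cell "sum" equal to "a" + "b".
store = ReactiveStore()
store.write("a", 1)
store.write("b", 2)

def total(s):
    s.write("sum", s.read("a") + s.read("b"))

store.run(total)        # first execution registers the dependencies
store.write("a", 10)    # re-executes total automatically
```

The constraint itself is an ordinary function of the host language; only the read and write operations carry the extra bookkeeping, which mirrors the design choice stated in the abstract.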
Submitted 12 April, 2011;
originally announced April 2011.
-
Maintaining Dynamic Matrices for Fully Dynamic Transitive Closure
Authors:
Camil Demetrescu,
Giuseppe F. Italiano
Abstract:
In this paper we introduce a general framework for casting fully dynamic transitive closure into the problem of reevaluating polynomials over matrices. With this technique, we improve the best known bounds for fully dynamic transitive closure. In particular, we devise a deterministic algorithm for general directed graphs that achieves $O(n^2)$ amortized time for updates, while preserving unit worst-case cost for queries. In the case of deletions only, our algorithm performs updates faster, in $O(n)$ amortized time.
Our matrix-based approach yields an algorithm for directed acyclic graphs that breaks through the $O(n^2)$ barrier on the single-operation complexity of fully dynamic transitive closure. We can answer queries in $O(n^ε)$ time and perform updates in $O(n^{ω(1,ε,1)-ε}+n^{1+ε})$ time, for any $ε\in[0,1]$, where $ω(1,ε,1)$ is the exponent of the multiplication of an $n\times n^ε$ matrix by an $n^ε\times n$ matrix. The current best bounds on $ω(1,ε,1)$ imply an $O(n^{0.58})$ query time and an $O(n^{1.58})$ update time. Our subquadratic algorithm is randomized and has one-sided error.
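To make the query/update trade-off concrete, here is a minimal sketch of an explicitly maintained reachability matrix. It is not the algorithm of the paper: it supports edge insertions only (no deletions), and the function names are assumptions made for this example. It does show the shape of the first bound: a unit-cost lookup per query in exchange for up to $O(n^2)$ work per update.

```python
def make_closure(n):
    """reach[i][j] is True when vertex j is reachable from vertex i.
    Initially every vertex reaches only itself."""
    return [[i == j for j in range(n)] for i in range(n)]

def insert_edge(reach, u, v):
    """Insert edge u -> v and update the closure in O(n^2) time."""
    if reach[u][v]:
        return                      # edge adds no new reachability
    n = len(reach)
    row_v = reach[v]
    for i in range(n):
        if reach[i][u]:             # i reaches u, hence everything v reaches
            row_i = reach[i]
            for j in range(n):
                if row_v[j]:
                    row_i[j] = True

def query(reach, u, v):
    """Constant-time reachability query: a single matrix lookup."""
    return reach[u][v]

# Usage on a 4-vertex graph.
reach = make_closure(4)
insert_edge(reach, 0, 1)
insert_edge(reach, 2, 3)
before = query(reach, 0, 3)
insert_edge(reach, 1, 2)
after = query(reach, 0, 3)
```

Supporting deletions is what makes the fully dynamic problem hard, since removing an edge can invalidate many matrix entries at once; that part is deliberately left out of this sketch.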
Submitted 31 March, 2001;
originally announced April 2001.