-
LTL Model Checking of Self Modifying Code
Authors:
Tayssir Touili,
Xin Ye
Abstract:
Self modifying code is code that can modify its own instructions during the execution of the program. It is extensively used by malware writers to obfuscate their malicious code. Thus, analysing self modifying code is nowadays a big challenge. In this paper, we consider the LTL model-checking problem of self modifying code. We model such programs using self-modifying pushdown systems (SM-PDS), an…
▽ More
Self modifying code is code that can modify its own instructions during the execution of the program. It is extensively used by malware writers to obfuscate their malicious code. Thus, analysing self modifying code is nowadays a big challenge. In this paper, we consider the LTL model-checking problem of self modifying code. We model such programs using self-modifying pushdown systems (SM-PDS), an extension of pushdown systems that can modify its own set of transitions during execution. We reduce the LTL model-checking problem to the emptiness problem of self-modifying Büchi pushdown systems (SM-BPDS). We implemented our techniques in a tool that we successfully applied for the detection of several self-modifying malware. Our tool was also able to detect several malwares that well-known antiviruses such as BitDefender, Kinsoft, Avira, eScan, Kaspersky, Qihoo-360, Baidu, Avast, and Symantec failed to detect.
△ Less
Submitted 27 September, 2019;
originally announced September 2019.
-
Reachability Analysis of Self Modifying Code
Authors:
Tayssir Touili,
Xin Ye
Abstract:
A Self modifying code is code that modifies its own instructions during execution time. It is nowadays widely used, especially in malware to make the code hard to analyse and to detect by anti-viruses. Thus, the analysis of such self modifying programs is a big challenge. Pushdown systems (PDSs) is a natural model that is extensively used for the analysis of sequential programs because they allow…
▽ More
A Self modifying code is code that modifies its own instructions during execution time. It is nowadays widely used, especially in malware to make the code hard to analyse and to detect by anti-viruses. Thus, the analysis of such self modifying programs is a big challenge. Pushdown systems (PDSs) is a natural model that is extensively used for the analysis of sequential programs because they allow to accurately model procedure calls and mimic the program's stack. In this work, we propose to extend the PushDown System model with self-modifying rules. We call the new model Self-Modifying PushDown System (SM-PDS). A SM-PDS is a PDS that can modify its own set of transitions during execution. We show how SM-PDSs can be used to naturally represent self-modifying programs and provide efficient algorithms to compute the backward and forward reachable configurations of SM-PDSs. We implemented our techniques in a tool and obtained encouraging results. In particular, we successfully applied our tool for the detection of self-modifying malware.
△ Less
Submitted 27 September, 2019;
originally announced September 2019.
-
Static Analysis of Multithreaded Recursive Programs Communicating via Rendez-vous
Authors:
Adrien Pommellet,
Tayssir Touili
Abstract:
We present in this paper a generic framework for the analysis of multi-threaded programs with recursive procedure calls, synchronisation by rendez-vous between parallel threads, and dynamic creation of new threads. To this end, we consider a model called Synchronized Dynamic Pushdown Networks (SDPNs) that can be seen as a network of pushdown processes executing synchronized transitions, spawning n…
▽ More
We present in this paper a generic framework for the analysis of multi-threaded programs with recursive procedure calls, synchronisation by rendez-vous between parallel threads, and dynamic creation of new threads. To this end, we consider a model called Synchronized Dynamic Pushdown Networks (SDPNs) that can be seen as a network of pushdown processes executing synchronized transitions, spawning new pushdown processes, and performing internal pushdown actions. The reachability problem for this model is unfortunately undecidable. Therefore, we tackle this problem by introducing an abstraction framework based on Kleene algebras in order to compute an abstraction of the execution paths between two regular sets of configurations. We combine an automata theoretic saturation procedure with constraint solving in a finite domain. We then apply this framework to an iterative abstraction refinement scheme, using multiple abstractions of increasing complexity and precision.
△ Less
Submitted 5 July, 2019;
originally announced July 2019.
-
Reachability Analysis of Pushdown Systems with an Upper Stack
Authors:
Adrien Pommellet,
Marcio Diaz,
Tayssir Touili
Abstract:
Pushdown systems (PDSs) are a natural model for sequential programs, but they can fail to accurately represent the way an assembly stack actually operates. Indeed, one may want to access the part of the memory that is below the current stack or base pointer, hence the need for a model that keeps track of this part of the memory.
To this end, we introduce pushdown systems with an upper stack (UPD…
▽ More
Pushdown systems (PDSs) are a natural model for sequential programs, but they can fail to accurately represent the way an assembly stack actually operates. Indeed, one may want to access the part of the memory that is below the current stack or base pointer, hence the need for a model that keeps track of this part of the memory.
To this end, we introduce pushdown systems with an upper stack (UPDSs), an extension of PDSs where symbols popped from the stack are not destroyed but instead remain just above its top, and may be overwritten by later push rules. We prove that the sets of successors post* and predecessors pre* of a regular set of configurations of such a system are not always regular, but that post* is context-sensitive, so that we can decide whether a single configuration is forward reachable or not.
In order to under-approximate pre* in a regular fashion, we consider a bounded-phase analysis of UPDSs, where a phase is a part of a run during which either push or pop rules are forbidden. We then present a method to over-approximate post* that relies on regular abstractions of runs of UPDSs. Finally, we show how these approximations can be used to detect stack overflows and stack pointer manipulations with malicious intent.
△ Less
Submitted 7 November, 2018;
originally announced November 2018.
-
Branching Temporal Logic of Calls and Returns for Pushdown Systems
Authors:
Huu-Vu Nguyen,
Tayssir Touili
Abstract:
Pushdown Systems (PDSs) are a natural model for sequential programs with (recursive) procedure calls. In this work, we define the Branching temporal logic of CAlls and RETurns (BCARET) that allows to write branching temporal formulas while taking into account the matching between calls and returns. We consider the model-checking problem of PDSs against BCARET formulas with "standard" valuations (w…
▽ More
Pushdown Systems (PDSs) are a natural model for sequential programs with (recursive) procedure calls. In this work, we define the Branching temporal logic of CAlls and RETurns (BCARET) that allows to write branching temporal formulas while taking into account the matching between calls and returns. We consider the model-checking problem of PDSs against BCARET formulas with "standard" valuations (where an atomic proposition holds at a configuration c or not depends only on the control state of c, not on its stack) as well as regular valuations (where the set of configurations in which an atomic proposition holds is regular). We show that these problems can be effectively solved by a reduction to the emptiness problem of Alternating Büchi Pushdown Systems. We show that our results can be applied for malware detection.
△ Less
Submitted 11 May, 2018;
originally announced May 2018.
-
CARET analysis of multithreaded programs
Authors:
Huu-Vu Nguyen,
Tayssir Touili
Abstract:
Dynamic Pushdown Networks (DPNs) are a natural model for multithreaded programs with (recursive) procedure calls and thread creation. On the other hand, CARET is a temporal logic that allows to write linear temporal formulas while taking into account the matching between calls and returns. We consider in this paper the model-checking problem of DPNs against CARET formulas. We show that this proble…
▽ More
Dynamic Pushdown Networks (DPNs) are a natural model for multithreaded programs with (recursive) procedure calls and thread creation. On the other hand, CARET is a temporal logic that allows to write linear temporal formulas while taking into account the matching between calls and returns. We consider in this paper the model-checking problem of DPNs against CARET formulas. We show that this problem can be effectively solved by a reduction to the emptiness problem of Büchi Dynamic Pushdown Systems. We then show that CARET model checking is also decidable for DPNs communicating with locks. Our results can, in particular, be used for the detection of concurrent malware.
△ Less
Submitted 20 September, 2017;
originally announced September 2017.
-
LTL Model-Checking for Dynamic Pushdown Networks Communicating via Locks
Authors:
Fu Song,
Tayssir Touili
Abstract:
A Dynamic Pushdown Network (DPN) is a set of pushdown systems (PDSs) where each process can dynamically create new instances of PDSs. DPNs are a natural model of multi-threaded programs with (possibly recursive) procedure calls and thread creation. Extending DPNs with locks allows processes to synchronize with each other. Thus, DPNs with locks are a well adapted formalism to model multi-threaded p…
▽ More
A Dynamic Pushdown Network (DPN) is a set of pushdown systems (PDSs) where each process can dynamically create new instances of PDSs. DPNs are a natural model of multi-threaded programs with (possibly recursive) procedure calls and thread creation. Extending DPNs with locks allows processes to synchronize with each other. Thus, DPNs with locks are a well adapted formalism to model multi-threaded programs that synchronize via locks. Therefore, it is important to have model-checking algorithms for DPNs with locks. We consider in this work model-checking for DPNs with locks against single-indexed LTL properties of the form V fi s.t. fi is a LTL formula interpreted over the PDS i. We consider the model-checking problems w.r.t. simple valuations (i.e, whether a configuration satisfies an atomic proposition depends only on its control location and held locks) and w.r.t. regular valuations (i.e., the set of the configurations satisfying an atomic proposition is a regular set of configurations). We show that these model-checking problems are decidable.
△ Less
Submitted 11 October, 2016;
originally announced November 2016.
-
Mining Malware Specifications through Static Reachability Analysis
Authors:
Hugo Daniel Macedo,
Tayssir Touili
Abstract:
The number of malicious software (malware) is growing out of control. Syntactic signature based detection cannot cope with such growth and manual construction of malware signature databases needs to be replaced by computer learning based approaches. Currently, a single modern signature capturing the semantics of a malicious behavior can be used to replace an arbitrarily large number of old-fashion…
▽ More
The number of malicious software (malware) is growing out of control. Syntactic signature based detection cannot cope with such growth and manual construction of malware signature databases needs to be replaced by computer learning based approaches. Currently, a single modern signature capturing the semantics of a malicious behavior can be used to replace an arbitrarily large number of old-fashioned syntactical signatures. However teaching computers to learn such behaviors is a challenge. Existing work relies on dynamic analysis to extract malicious behaviors, but such technique does not guarantee the coverage of all behaviors. To sidestep this limitation we show how to learn malware signatures using static reachability analysis. The idea is to model binary programs using pushdown systems (that can be used to model the stack operations occurring during the binary code execution), use reachability analysis to extract behaviors in the form of trees, and use subtrees that are common among the trees extracted from a training set of malware files as signatures. To detect malware we propose to use a tree automaton to compactly store malicious behavior trees and check if any of the subtrees extracted from the file under analysis is malicious. Experimental data shows that our approach can be used to learn signatures from a training set of malware files and use them to detect a test set of malware that is 5 times the size of the training set.
△ Less
Submitted 17 December, 2013;
originally announced December 2013.