Robust Reward Placement under Uncertainty

Authors: Petros Petsinis, Kaichen Zhang, Andreas Pavlogiannis, **gbo Zhou, Panagiotis Karras

Abstract: We consider a problem of placing generators of rewards to be collected by randomly moving agents in a network. In many settings, the precise mobility pattern may be one of several possible, based on parameters outside our control, such as weather conditions. The placement should be robust to this uncertainty, to gain a competent total reward across possible networks. To study such scenarios, we introduce the Robust Reward Placement problem (RRP). Agents move randomly by a Markovian Mobility Model with a predetermined set of locations whose connectivity is chosen adversarially from a known set $Π$ of candidates. We aim to select a set of reward states within a budget that maximizes the minimum ratio, among all candidates in $Π$, of the collected total reward over the optimal collectable reward under the same candidate. We prove that RRP is NP-hard and inapproximable, and develop $Ψ$-Saturate, a pseudo-polynomial time algorithm that achieves an $ε$-additive approximation by exceeding the budget constraint by a factor that scales as $O(\ln |Π|/ε)$. In addition, we present several heuristics, most prominently one inspired by a dynamic programming algorithm for the max-min 0-1 KNAPSACK problem. We corroborate our theoretical analysis with an experimental evaluation on synthetic and real data.

Submitted 3 June, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted for publication in IJCAI 2024
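
Note: the max-min ratio objective described in this abstract can be written compactly. The symbols $V$ (set of states), $B$ (budget), $\mathrm{cost}$, $F(S \mid π)$ (total reward collected from placement $S$ under candidate $π$), and $S^{*}_{π}$ are notation assumed here for illustration; they are not taken from the abstract.

```latex
\max_{S \subseteq V,\ \mathrm{cost}(S) \le B}\ \min_{π \in Π}\
\frac{F(S \mid π)}{F(S^{*}_{π} \mid π)},
\qquad
S^{*}_{π} \in \arg\max_{S' \subseteq V,\ \mathrm{cost}(S') \le B} F(S' \mid π)
```

Under this reading, a ratio of 1 would mean the placement is simultaneously optimal for every candidate in $Π$.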

arXiv:2404.15986

Seed Selection in the Heterogeneous Moran Process

Authors: Petros Petsinis, Andreas Pavlogiannis, Josef Tkadlec, Panagiotis Karras

Abstract: The Moran process is a classic stochastic process that models the rise and takeover of novel traits in network-structured populations. In biological terms, a set of mutants, each with fitness $m\in(0,\infty)$, invades a population of residents with fitness $1$. Each agent reproduces at a rate proportional to its fitness and each offspring replaces a random network neighbor. The process ends when the mutants either fixate (take over the whole population) or go extinct. The fixation probability measures the success of the invasion. To account for environmental heterogeneity, we study a generalization of the standard process, called the Heterogeneous Moran process. Here, the fitness of each agent is determined both by its type (resident/mutant) and the node it occupies. We study the natural optimization problem of seed selection: given a budget $k$, which $k$ agents should initiate the mutant invasion to maximize the fixation probability? We show that the problem is strongly inapproximable: it is $\mathbf{NP}$-hard to distinguish between maximum fixation probability 0 and 1. We then focus on mutant-biased networks, where each node exhibits at least as large mutant fitness as resident fitness. We show that the problem remains $\mathbf{NP}$-hard, but the fixation probability becomes submodular, and thus the optimization problem admits a greedy $(1-1/e)$-approximation. An experimental evaluation of the greedy algorithm along with various heuristics on real-world data sets corroborates our results.

Submitted 10 May, 2024; v1 submitted 24 April, 2024; originally announced April 2024.

Comments: Accepted for publication in IJCAI 2024
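
Note: the process described in this abstract can be illustrated with a small Monte Carlo sketch. This is not the authors' code; the function names, the adjacency-list representation, and the per-node fitness lists are assumptions made for illustration, and the sketch assumes a connected graph with strictly positive fitness values.

```python
import random


def simulate_fixation(adj, mutant_fitness, resident_fitness, seeds, rng):
    """Run one trajectory; return True if the mutants fixate, False if they go extinct."""
    n = len(adj)
    mutants = set(seeds)
    while 0 < len(mutants) < n:
        # Fitness depends on both the agent's type and the node it occupies.
        weights = [
            mutant_fitness[v] if v in mutants else resident_fitness[v]
            for v in range(n)
        ]
        # Pick the reproducing agent with probability proportional to fitness.
        parent = rng.choices(range(n), weights=weights)[0]
        # The offspring replaces a uniformly random neighbor of the parent.
        child = rng.choice(adj[parent])
        if parent in mutants:
            mutants.add(child)
        else:
            mutants.discard(child)
    return len(mutants) == n


def estimate_fixation(adj, mutant_fitness, resident_fitness, seeds, trials=2000, seed=0):
    """Estimate the fixation probability of a seed set by repeated simulation."""
    rng = random.Random(seed)
    wins = sum(
        simulate_fixation(adj, mutant_fitness, resident_fitness, seeds, rng)
        for _ in range(trials)
    )
    return wins / trials


# Complete graph on 4 nodes with neutral fitness everywhere.
K4 = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
ONES = [1.0] * 4
estimate = estimate_fixation(K4, ONES, ONES, seeds={0})
```

In the neutral case on a complete graph, a single mutant fixates with probability $1/n$, so the estimate above should land near 0.25; this gives a quick sanity check before trying non-uniform fitness values.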

arXiv:2403.17818

CSSTs: A Dynamic Data Structure for Partial Orders in Concurrent Execution Analysis

Authors: Hünkar Can Tunç, Ameya Prashant Deshmukh, Berk Çirisci, Constantin Enea, Andreas Pavlogiannis

Abstract: Dynamic analyses are a standard approach to analyzing and testing concurrent programs. Such techniques observe program traces and analyze them to infer the presence or absence of bugs. At its core, each analysis maintains a partial order $P$ that represents order dependencies between events of the analyzed trace $σ$. Naturally, the scalability of the analysis largely depends on how efficiently it maintains $P$. The standard data structure for this task has thus far been vector clocks. These, however, are slow for analyses that follow a non-streaming style, costing $O(n)$ for inserting (and propagating) each new ordering in $P$, where $n$ is the size of $σ$, while they cannot handle the deletion of existing orderings. In this paper we develop collective sparse segment trees (CSSTs), a simple but elegant data structure for generically maintaining a partial order $P$. CSSTs thrive when the width $k$ of $P$ is much smaller than the size $n$ of its domain, allowing inserting, deleting, and querying for orderings in $P$ to run in $O(\log n)$ time. For a concurrent trace, $k$ is bounded by the number of its threads, and is normally orders of magnitude smaller than its size $n$, making CSSTs fitting for this setting. Our experimental results confirm that CSSTs are currently the best data structure for handling a range of dynamic analyses from existing literature.

Submitted 29 March, 2024; v1 submitted 26 March, 2024; originally announced March 2024.

arXiv:2401.05642

Optimistic Prediction of Synchronization-Reversal Data Races

Authors: Zheng Shi, Umang Mathur, Andreas Pavlogiannis

Abstract: Dynamic data race detection has emerged as a key technique for ensuring reliability of concurrent software in practice. However, dynamic approaches can often miss data races owing to nondeterminism in the thread scheduler. Predictive race detection techniques cater to this shortcoming by inferring alternate executions that may expose data races without re-executing the underlying program. More formally, the dynamic data race prediction problem asks, given a trace σ of an execution of a concurrent program, can σ be correctly reordered to expose a data race? Existing state-of-the-art techniques for data race prediction either do not scale to executions arising from real world concurrent software, or only expose a limited class of data races, such as those that can be exposed without reversing the order of synchronization operations. In general, exposing data races by reasoning about synchronization reversals is an intractable problem. In this work, we identify a class of data races, called Optimistic Sync(hronization)-Reversal races, that can be detected in a tractable manner and often include non-trivial data races that cannot be exposed by prior tractable techniques. We also propose a sound algorithm OSR for detecting all optimistic sync-reversal data races in overall quadratic time, and show that the algorithm is optimal by establishing a matching lower bound. Our experiments demonstrate the effectiveness of OSR on our extensive suite of benchmarks: OSR reports the largest number of data races, and scales well to large execution traces.

Submitted 10 January, 2024; originally announced January 2024.

Comments: ICSE'24

arXiv:2311.04319

On-The-Fly Static Analysis via Dynamic Bidirected Dyck Reachability

Authors: Shankaranarayanan Krishna, Aniket Lal, Andreas Pavlogiannis, Omkar Tuppe

Abstract: Dyck reachability is a principled, graph-based formulation of a plethora of static analyses. Bidirected graphs are used for capturing dataflow through mutable heap data, and are usual formalisms of demand-driven points-to and alias analyses. The best (offline) algorithm runs in $O(m+n\cdot α(n))$ time, where $n$ is the number of nodes and $m$ is the number of edges in the flow graph, which becomes $O(n^2)$ in the worst case. In the everyday practice of program analysis, the analyzed code is subject to continuous change, with source code being added and removed. On-the-fly static analysis under such continuous updates gives rise to dynamic Dyck reachability, where reachability queries run on a dynamically changing graph, following program updates. Naturally, executing the offline algorithm in this online setting is inadequate, as the time required to process a single update is prohibitively large. In this work we develop a novel dynamic algorithm for bidirected Dyck reachability that has $O(n\cdot α(n))$ worst-case performance per update, thus beating the $O(n^2)$ bound, and is also optimal in certain settings. We also implement our algorithm and evaluate its performance on on-the-fly data-dependence and alias analyses, and compare it with two best known alternatives, namely (i) the optimal offline algorithm, and (ii) a fully dynamic Datalog solver. Our experiments show that our dynamic algorithm is consistently, and by far, the top performing algorithm, exhibiting speedups in the order of 1000X. The running time of each update is almost always unnoticeable to the human eye, making it ideal for the on-the-fly analysis setting.

Submitted 7 November, 2023; originally announced November 2023.
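
Note: on bidirected graphs, Dyck reachability is an equivalence relation, so the offline problem amounts to merging nodes into classes. The sketch below is a deliberately naive fixed-point computation with union-find, not the $O(m+n\cdot α(n))$ offline algorithm and not the dynamic algorithm of the paper. The function name and the edge encoding `(u, label, x)`, meaning an open-parenthesis edge from `u` to `x` with an implicit matching close-parenthesis edge back, are assumptions made for illustration.

```python
def bidirected_dyck_classes(n, open_edges):
    """Return a class representative for each of the n nodes.

    Two nodes share a representative iff they are Dyck-reachable from each
    other. Merge rule: if u and v both have an open edge with the same label
    into the same class, then u and v belong to the same class.
    """
    parent = list(range(n))

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    changed = True
    while changed:  # each productive pass merges at least two classes
        changed = False
        first_source = {}  # (target class, label) -> one source seen so far
        for u, label, x in open_edges:
            key = (find(x), label)
            ru = find(u)
            rv = find(first_source.setdefault(key, ru))
            if rv != ru:
                parent[ru] = rv  # same label into the same class: merge sources
                changed = True
    return [find(v) for v in range(n)]


# Nodes 0 and 1 both open label 1 into node 2, so they merge; nodes 3 and 4
# then open label 2 into that merged class, so they merge as well.
classes = bidirected_dyck_classes(
    5, [(0, 1, 2), (1, 1, 2), (3, 2, 0), (4, 2, 1)]
)
```

Recomputing such a fixed point from scratch after every edge insertion or deletion is exactly the cost that a dynamic algorithm aims to avoid.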

arXiv:2311.04302

How Hard is Weak-Memory Testing?

Authors: Soham Chakraborty, Shankaranarayanan Krishna, Umang Mathur, Andreas Pavlogiannis

Abstract: Weak-memory models are standard formal specifications of concurrency across hardware, programming languages, and distributed systems. A fundamental computational problem is consistency testing: is the observed execution of a concurrent program in alignment with the specification of the underlying system? The problem has been studied extensively across Sequential Consistency (SC) and weak memory, and proven to be NP-complete when some aspect of the input (e.g., number of threads/memory locations) is unbounded. This unboundedness has left a natural question open: are there efficient parameterized algorithms for testing? The main contribution of this paper is a deep hardness result for consistency testing under many popular weak-memory models: the problem remains NP-complete even in its bounded setting, where candidate executions contain a bounded number of threads, memory locations, and values. This hardness spreads across several Release-Acquire variants of C11, a popular variant of its Relaxed fragment, popular Causal Consistency models, and the POWER architecture. To our knowledge, this is the first result that fully exposes the hardness of weak-memory testing and proves that the problem admits no parameterization under standard input parameters. It also yields a computational separation of these models from SC, x86-TSO, PSO, and Relaxed, for which bounded consistency testing is either known (for SC), or shown here (for the rest), to be in polynomial time.

Submitted 15 November, 2023; v1 submitted 7 November, 2023; originally announced November 2023.

arXiv:2304.03714

Optimal Reads-From Consistency Checking for C11-Style Memory Models

Authors: Hünkar Can Tunç, Parosh Aziz Abdulla, Soham Chakraborty, Shankaranarayanan Krishna, Umang Mathur, Andreas Pavlogiannis

Abstract: Over the years, several memory models have been proposed to capture the subtle concurrency semantics of C/C++. One of the most fundamental problems associated with a memory model M is consistency checking: given an execution X, is X consistent with M? This problem lies at the heart of numerous applications, including specification testing and litmus tests, stateless model checking, and dynamic analyses. As such, it has been explored extensively and its complexity is well-understood for traditional models like SC and TSO. However, less is known for the numerous model variants of C/C++, for which the problem becomes challenging due to the intricacies of their concurrency primitives. In this work we study the problem of consistency checking for popular variants of the C11 memory model, in particular, the RC20 model, its release-acquire (RA) fragment, the strong and weak variants of RA (SRA and WRA), as well as the Relaxed fragment of RC20. Motivated by applications in testing and model checking, we focus on reads-from consistency checking. The input is an execution X specifying a set of events, their program order and their reads-from relation, and the task is to decide the existence of a modification order on the writes of X that makes X consistent in a memory model. We draw a rich complexity landscape for this problem; our results include (i) nearly-linear-time algorithms for certain variants, which improve over prior results, (ii) fine-grained optimality results, as well as (iii) matching upper and lower bounds (NP-hardness) for other variants. To our knowledge, this is the first work to characterize the complexity of consistency checking for C11 memory models. We have implemented our algorithms inside the TruSt model checker and the C11Tester testing tool. Experiments on standard benchmarks show that our new algorithms improve consistency checking, often by a significant margin.

Submitted 11 May, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

arXiv:2304.03692

Sound Dynamic Deadlock Prediction in Linear Time

Authors: Hünkar Can Tunç, Umang Mathur, Andreas Pavlogiannis, Mahesh Viswanathan

Abstract: Deadlocks are one of the most notorious concurrency bugs, and significant research has focused on detecting them efficiently. Dynamic predictive analyses work by observing concurrent executions, and reason about alternative interleavings that can witness concurrency bugs. Such techniques offer scalability and sound bug reports, and have emerged as an effective approach for detecting concurrency bugs such as data races. Effective dynamic deadlock prediction, however, has proven a challenging task, as no deadlock predictor currently meets the requirements of soundness, high precision, and efficiency. In this paper, we first formally establish that this tradeoff is unavoidable, by showing that (a) sound and complete deadlock prediction is intractable, in general, and (b) even the seemingly simpler task of determining the presence of potential deadlocks, which often serve as unsound witnesses for actual predictable deadlocks, is intractable. The main contribution of this work is a new class of predictable deadlocks, called sync(hronization)-preserving deadlocks. Informally, these are deadlocks that can be predicted by reordering the observed execution while preserving the relative order of conflicting critical sections. We present two algorithms for sound deadlock prediction based on this notion. Our first algorithm SPDOffline detects all sync-preserving deadlocks, with running time that is linear per abstract deadlock pattern, a novel notion also introduced in this work. Our second algorithm SPDOnline predicts all sync-preserving deadlocks that involve two threads in a strictly online fashion, runs in overall linear time, and is better suited for a runtime monitoring setting. We implemented both our algorithms and evaluated their ability to perform offline and online deadlock-prediction on a large dataset of standard benchmarks.

Submitted 25 June, 2023; v1 submitted 7 April, 2023; originally announced April 2023.

arXiv:2211.14676

Maximizing the Probability of Fixation in the Positional Voter Model

Authors: Petros Petsinis, Andreas Pavlogiannis, Panagiotis Karras

Abstract: The Voter model is a well-studied stochastic process that models the invasion of a novel trait $A$ (e.g., a new opinion, social meme, genetic mutation, magnetic spin) in a network of individuals (agents, people, genes, particles) carrying an existing resident trait $B$. Individuals change traits by occasionally sampling the trait of a neighbor, while an invasion bias $δ\geq 0$ expresses the stochastic preference to adopt the novel trait $A$ over the resident trait $B$. The strength of an invasion is measured by the probability that eventually the whole population adopts trait $A$, i.e., the fixation probability. In more realistic settings, however, the invasion bias is not ubiquitous, but rather manifested only in parts of the network. For instance, when modeling the spread of a social trait, the invasion bias represents localized incentives. In this paper, we generalize the standard biased Voter model to the positional Voter model, in which the invasion bias is effectuated only on an arbitrary subset of the network nodes, called biased nodes. We study the ensuing optimization problem, which is, given a budget $k$, to choose $k$ biased nodes so as to maximize the fixation probability of a randomly occurring invasion. We show that the problem is NP-hard both for finite $δ$ and when $δ\rightarrow \infty$ (strong bias), while the objective function is not submodular in either setting, indicating strong computational hardness. On the other hand, we show that, when $δ\rightarrow 0$ (weak bias), we can obtain a tight approximation in $O(n^{2ω})$ time, where $ω$ is the matrix-multiplication exponent. We complement our theoretical results with an experimental evaluation of some proposed heuristics.

Submitted 25 February, 2023; v1 submitted 26 November, 2022; originally announced November 2022.

Comments: Accepted for publication in AAAI 2023

arXiv:2210.02394

doi 10.1103/PhysRevE.106.034321

Social Balance on Networks: Local Minima and Best Edge Dynamics

Authors: Krishnendu Chatterjee, Jakub Svoboda, Ðorđe Žikelić, Andreas Pavlogiannis, Josef Tkadlec

Abstract: Structural balance theory is an established framework for studying social relationships of friendship and enmity. These relationships are modeled by a signed network whose energy potential measures the level of imbalance, while stochastic dynamics drives the network towards a state of minimum energy that captures social balance. It is known that this energy landscape has local minima that can trap socially-aware dynamics, preventing it from reaching balance. Here we first study the robustness and attractor properties of these local minima. We show that a stochastic process can reach them from an abundance of initial states, and that some local minima cannot be escaped by mild perturbations of the network. Motivated by these anomalies, we introduce Best Edge Dynamics (BED), a new plausible stochastic process. We prove that BED always reaches balance, and that it does so fast in various interesting settings.

Submitted 5 October, 2022; originally announced October 2022.

Comments: 13 pages, 14 figures

Journal ref: Phys. Rev. E 106, 2022, 034321

arXiv:2204.11799

Reachability in Bidirected Pushdown VASS

Authors: Moses Ganardi, Rupak Majumdar, Andreas Pavlogiannis, Lia Schütze, Georg Zetzsche

Abstract: A pushdown vector addition system with states (PVASS) extends the model of vector addition systems with a pushdown store. A PVASS is said to be bidirected if every transition (pushing/popping a symbol or modifying a counter) has an accompanying opposite transition that reverses the effect. Bidirectedness arises naturally in many models; it can also be seen as an overapproximation of reachability. We show that the reachability problem for bidirected PVASS is decidable in Ackermann time and primitive recursive for any fixed dimension. For the special case of one-dimensional bidirected PVASS, we show reachability is in $\mathsf{PSPACE}$, and in fact in polynomial time if the stack is polynomially bounded. Our results are in contrast to the directed setting, where decidability of reachability is a long-standing open problem already for one-dimensional PVASS, and there is a $\mathsf{PSPACE}$-lower bound already for one-dimensional PVASS with bounded stack. The reachability relation in the bidirected (stateless) case is a congruence over $\mathbb{N}^d$. Our upper bounds exploit saturation techniques over congruences. In particular, we show novel elementary-time constructions of semilinear representations of congruences generated by finitely many vector pairs. In the case of one-dimensional PVASS, we employ a saturation procedure over bounded-size counters. We complement our upper bound with a $\mathsf{TOWER}$-hardness result for arbitrary dimension and $k$-$\mathsf{EXPSPACE}$ hardness in dimension $2k+6$ using a technique by Lazić and Totzke to implement iterative exponentiations.

Submitted 25 April, 2022; originally announced April 2022.

Comments: Accepted for ICALP 2022

arXiv:2201.08207

Invasion Dynamics in the Biased Voter Process

Authors: Loke Durocher, Panagiotis Karras, Andreas Pavlogiannis, Josef Tkadlec

Abstract: The voter process is a classic stochastic process that models the invasion of a mutant trait $A$ (e.g., a new opinion, belief, legend, genetic mutation, magnetic spin) in a population of agents (e.g., people, genes, particles) who share a resident trait $B$, spread over the nodes of a graph. An agent may adopt the trait of one of its neighbors at any time, while the invasion bias $r\in(0,\infty)$ quantifies the stochastic preference towards ($r>1$) or against ($r<1$) adopting $A$ over $B$. Success is measured in terms of the fixation probability, i.e., the probability that eventually all agents have adopted the mutant trait $A$. In this paper we study the problem of fixation probability maximization under this model: given a budget $k$, find a set of $k$ agents to initiate the invasion that maximizes the fixation probability. We show that the problem is NP-hard for both $r>1$ and $r<1$, while the latter case is also inapproximable within any multiplicative factor. On the positive side, we show that when $r>1$, the optimization function is submodular and thus can be greedily approximated within a factor $1-1/e$. An experimental evaluation of some proposed heuristics corroborates our results.

Submitted 2 May, 2022; v1 submitted 20 January, 2022; originally announced January 2022.

Comments: 8 pages, 3 figures. To be published in IJCAI-22
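
Note: the greedy $1-1/e$ guarantee mentioned in this abstract comes from the standard greedy scheme for monotone submodular maximization under a cardinality budget. The sketch below shows that generic scheme, not the authors' implementation. The names `greedy_select` and `coverage` are hypothetical, and the coverage function is only a stand-in for the fixation probability, which in practice would have to be computed or estimated separately.

```python
def greedy_select(candidates, k, objective):
    """Pick up to k items, each time adding the one with the largest marginal gain."""
    chosen = set()
    for _ in range(k):
        base = objective(chosen)
        best, best_gain = None, 0.0
        for v in candidates:
            if v in chosen:
                continue
            gain = objective(chosen | {v}) - base
            if best is None or gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # no candidates left to add
            break
        chosen.add(best)
    return chosen


# Stand-in objective (an assumption for illustration): set coverage, which is
# monotone and submodular just like the objective in the r > 1 regime.
COVERS = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1}}


def coverage(selected):
    covered = set()
    for v in selected:
        covered |= COVERS[v]
    return len(covered)


picked = greedy_select(sorted(COVERS), 2, coverage)
```

With a budget of 2, the first pick is item 2 (four new elements) and the second is item 0 (three new elements), for a total coverage of 7.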

arXiv:2201.06325

A Tree Clock Data Structure for Causal Orderings in Concurrent Executions

Authors: Umang Mathur, Andreas Pavlogiannis, Hünkar Can Tunç, Mahesh Viswanathan

Abstract: Dynamic techniques are a scalable and effective way to analyze concurrent programs. Instead of analyzing all behaviors of a program, these techniques detect errors by focusing on a single program execution. Often a crucial step in these techniques is to define a causal ordering between events in the execution, which is then computed using vector clocks, a simple data structure that stores logical times of threads. The two basic operations of vector clocks, namely join and copy, require $Θ(k)$ time, where $k$ is the number of threads. Thus they are a computational bottleneck when $k$ is large. In this work, we introduce tree clocks, a new data structure that replaces vector clocks for computing causal orderings in program executions. Joining and copying tree clocks takes time that is roughly proportional to the number of entries being modified, and hence the two operations do not suffer the a-priori $Θ(k)$ cost per application. We show that when used to compute the classic happens-before (HB) partial order, tree clocks are optimal, in the sense that no other data structure can lead to smaller asymptotic running time. Moreover, we demonstrate that tree clocks can be used to compute other partial orders, such as schedulable-happens-before (SHB) and the standard Mazurkiewicz (MAZ) partial order, and thus are a versatile data structure. Our experiments show that just by replacing vector clocks with tree clocks, the computation becomes $2.02 \times$ (MAZ), $2.66 \times$ (SHB), and $2.97 \times$ (HB) faster on average per benchmark. These results illustrate that tree clocks have the potential to become a standard data structure with wide applications in concurrent analyses.

Submitted 17 January, 2022; originally announced January 2022.
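
Note: the $Θ(k)$ cost of join and copy that this abstract refers to is visible in a plain vector clock. The sketch below shows the baseline data structure only, not tree clocks; the class and method names are assumptions made for illustration.

```python
class VectorClock:
    """Minimal vector clock over k threads; entry i is the logical time of thread i."""

    def __init__(self, k):
        self.times = [0] * k

    def tick(self, tid):
        # Advance the local time of thread tid by one step.
        self.times[tid] += 1

    def join(self, other):
        # Pointwise maximum: scans all k entries even if only a few change.
        for i, t in enumerate(other.times):
            if t > self.times[i]:
                self.times[i] = t

    def copy_from(self, other):
        # Overwrites all k entries.
        self.times[:] = other.times

    def leq(self, other):
        # True iff self is pointwise at most other, i.e. self is ordered before or equal.
        return all(a <= b for a, b in zip(self.times, other.times))


a = VectorClock(3)
a.tick(0)          # thread 0 performs an event
b = VectorClock(3)
b.tick(1)          # thread 1 performs an event
b.join(a)          # thread 1 learns about thread 0, e.g. via a lock acquire
```

After the join, `b` holds `[1, 1, 0]`, so `a.leq(b)` holds while `b.leq(a)` does not. Both `join` and `copy_from` touch every entry regardless of how many actually change, which is the cost that tree clocks are designed to avoid.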

arXiv:2201.02248

Fixation Maximization in the Positional Moran Process

Authors: Joachim Brendborg, Panagiotis Karras, Andreas Pavlogiannis, Asger Ullersted Rasmussen, Josef Tkadlec

Abstract: The Moran process is a classic stochastic process that models invasion dynamics on graphs. A single "mutant" (e.g., a new opinion, strain, social trait etc.) invades a population of residents spread over the nodes of a graph. The mutant fitness advantage $δ\geq 0$ determines how aggressively mutants propagate to their neighbors. The quantity of interest is the fixation probability, i.e., the probability that the initial mutant eventually takes over the whole population. However, in realistic settings, the invading mutant has an advantage only in certain locations. For example, a bacterial mutation allowing for lactose metabolism only confers an advantage in places where dairy products are present. In this paper we introduce the positional Moran process, a natural generalization in which the mutant fitness advantage is only realized on specific nodes called active nodes. The associated optimization problem is fixation maximization: given a budget $k$, choose a set of $k$ active nodes that maximizes the fixation probability of the invading mutant. We show that the problem is NP-hard, while the optimization function is not submodular, thus indicating strong computational hardness. Then we focus on two natural limits. In the limit of $δ\to\infty$ (strong selection), although the problem remains NP-hard, the optimization function becomes submodular and thus admits a constant-factor approximation using a simple greedy algorithm. In the limit of $δ\to 0$ (weak selection), we show that in $O(m^ω)$ time we can obtain a tight approximation, where $m$ is the number of edges and $ω$ is the matrix-multiplication exponent. Finally, we present an experimental evaluation of the new algorithms together with some proposed heuristics.

Submitted 25 April, 2022; v1 submitted 6 January, 2022; originally announced January 2022.

Comments: 11 pages, 6 figures, to appear at AAAI 2022

arXiv:2111.05923

The Decidability and Complexity of Interleaved Bidirected Dyck Reachability

Authors: Adam Husted Kjelstrøm, Andreas Pavlogiannis

Abstract: Dyck reachability is the standard formulation of a large domain of static analyses, as it achieves the sweet spot between precision and efficiency, and has thus been studied extensively. Interleaved Dyck reachability (denoted $D_k\odot D_k$) uses two Dyck languages for increased precision (e.g., context and field sensitivity) but is well-known to be undecidable. As many static analyses yield a certain type of bidirected graphs, they give rise to interleaved bidirected Dyck reachability problems. Although these problems have seen numerous applications, their decidability and complexity have largely remained open. In a recent work, Li et al. made the first steps in this direction, showing that (i) $D_1\odot D_1$ reachability (i.e., when both Dyck languages are over a single parenthesis and act as counters) is computable in $O(n^7)$ time, while (ii) $D_k\odot D_k$ reachability is NP-hard. In this work we address the decidability and complexity of all variants of interleaved bidirected Dyck reachability. First, we show that $D_1\odot D_1$ reachability can be computed in $O(n^3\cdot α(n))$ time, significantly improving over the existing $O(n^7)$ bound. Second, we show that $D_k\odot D_1$ reachability (i.e., when one language acts as a counter) is decidable, in contrast to the non-bidirected case where decidability is open. We further consider $D_k\odot D_1$ reachability where the counter remains linearly bounded. Our third result shows that this bounded variant can be solved in $O(n^2\cdot α(n))$ time, while our fourth result shows that the problem has a (conditional) quadratic lower bound, and thus our upper bound is essentially optimal. Fifth, we show that full $D_k\odot D_k$ reachability is undecidable. This improves the recent NP-hardness lower-bound, and shows that the problem is equivalent to the non-bidirected case.

Submitted 10 November, 2021; originally announced November 2021.

arXiv:2107.03569 [pdf, other]

Dynamic Data-Race Detection through the Fine-Grained Lens

Authors: Rucha Kulkarni, Umang Mathur, Andreas Pavlogiannis

Abstract: Data races are among the most common bugs in concurrency. The standard approach to data-race detection is via dynamic analyses, which work over executions of concurrent programs, instead of the program source code. The rich literature on the topic has created various notions of dynamic data races, which are known to be detected efficiently when certain parameters (e.g., number of threads) are small. However, the fine-grained complexity of all these notions of races has remained elusive, making it impossible to characterize their trade-offs between precision and efficiency. In this work we establish several fine-grained separations between many popular notions of dynamic data races. The input is an execution trace with $N$ events, $T$ threads and $L$ locks. Our main results are as follows. First, we show that happens-before (HB) races can be detected in $O(N\cdot \min(T, L))$ time, improving over the standard $O(N\cdot T)$ bound when $L=o(T)$. Moreover, we show that even reporting an HB race that involves a read access is hard for 2-orthogonal vectors (2-OV). This is the first rigorous proof of the conjectured quadratic lower-bound in detecting HB races. Second, we show that the recently introduced synchronization-preserving races are hard to detect for 3-OV and thus have a cubic lower bound, when $T=Ω(N)$. This establishes a complexity separation from HB races which are known to be less expressive. Third, we show that lock-cover races are hard for 2-OV, and thus have a quadratic lower-bound, even when $T=2$ and $L = ω(\log N)$. The similar notion of lock-set races is known to be detectable in $O(N\cdot L)$ time, and thus we achieve a complexity separation between the two. Moreover, we show that lock-set races become hitting-set (HS)-hard when $L=Θ(N)$, and thus also have a quadratic lower bound, when the input is sufficiently complex.

Submitted 7 July, 2021; originally announced July 2021.
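
To make the "standard $O(N\cdot T)$ bound" for happens-before races in the abstract above concrete, here is a minimal vector-clock sketch in Python. It is the textbook vector-clock approach, not the $O(N\cdot \min(T, L))$ algorithm of the paper. The trace format (tuples of thread, operation, target) and the function name `hb_races` are assumptions made for illustration only.

```python
from collections import defaultdict


def hb_races(trace, threads):
    """Return indices of events that race with an earlier HB-unordered access.

    trace: list of (thread, op, target) with op in {"acq", "rel", "r", "w"}.
    This event format is a hypothetical one chosen for the sketch.
    """
    pos = {t: i for i, t in enumerate(threads)}
    n = len(threads)
    # C[t]: vector clock of thread t; its own component starts at 1.
    C = {t: [1 if i == pos[t] else 0 for i in range(n)] for t in threads}
    L = defaultdict(lambda: [0] * n)  # clock stored at the last release of each lock
    W = defaultdict(lambda: [0] * n)  # per variable: local time of each thread's last write
    R = defaultdict(lambda: [0] * n)  # per variable: local time of each thread's last read

    def leq(a, b):
        return all(x <= y for x, y in zip(a, b))

    racy = []
    for i, (t, op, target) in enumerate(trace):
        if op == "acq":
            # Acquire: learn everything known at the last release of this lock.
            C[t] = [max(x, y) for x, y in zip(C[t], L[target])]
        elif op == "rel":
            # Release: publish the current clock, then advance local time.
            L[target] = list(C[t])
            C[t][pos[t]] += 1
        elif op == "r":
            # A read races with any earlier write not ordered before it.
            if not leq(W[target], C[t]):
                racy.append(i)
            R[target][pos[t]] = C[t][pos[t]]
        elif op == "w":
            # A write races with any earlier unordered read or write.
            if not (leq(W[target], C[t]) and leq(R[target], C[t])):
                racy.append(i)
            W[target][pos[t]] = C[t][pos[t]]
        else:
            raise ValueError(f"unknown operation: {op}")
    return racy
```

Each event costs $O(T)$ for the clock comparison or join, which gives the $O(N\cdot T)$ total mentioned in the abstract.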

arXiv:2105.06424 [pdf, other]

Stateless Model Checking under a Reads-Value-From Equivalence

Authors: Pratyush Agarwal, Krishnendu Chatterjee, Shreya Pathak, Andreas Pavlogiannis, Viktor Toman

Abstract: Stateless model checking (SMC) is one of the standard approaches to the verification of concurrent programs. As scheduling non-determinism creates exponentially large spaces of thread interleavings, SMC attempts to partition this space into equivalence classes and explore only a few representatives from each class. The efficiency of this approach depends on two factors: (a) the coarseness of the partitioning, and (b) the time to generate representatives in each class. For this reason, the search for coarse partitionings that are efficiently explorable is an active research challenge. In this work we present RVF-SMC, a new SMC algorithm that uses a novel reads-value-from (RVF) partitioning. Intuitively, two interleavings are deemed equivalent if they agree on the value obtained in each read event, and read events induce consistent causal orderings between them. The RVF partitioning is provably coarser than recent approaches based on Mazurkiewicz and "reads-from" partitionings. Our experimental evaluation reveals that RVF is quite often a very effective equivalence, as the underlying partitioning is exponentially coarser than other approaches. Moreover, RVF-SMC generates representatives very efficiently, as the reduction in the partitioning is often met with significant speed-ups in the model checking task.

Submitted 13 May, 2021; originally announced May 2021.

Comments: Full technical report of the CAV2021 work

arXiv:2011.11763 [pdf, other]

The Reads-From Equivalence for the TSO and PSO Memory Models

Authors: Truc Lam Bui, Krishnendu Chatterjee, Tushar Gautam, Andreas Pavlogiannis, Viktor Toman

Abstract: The verification of concurrent programs remains an open challenge due to the non-determinism in inter-process communication. One algorithmic problem in this challenge is the consistency verification of concurrent executions. Consistency verification under a reads-from map allows computing the reads-from (RF) equivalence between concurrent traces, with direct applications to areas such as Stateless Model Checking (SMC). The RF equivalence was recently shown to be coarser than the standard Mazurkiewicz equivalence, leading to impressive scalability improvements for SMC under SC (sequential consistency). However, for the relaxed memory models of TSO and PSO (total/partial store order), the algorithmic problem of deciding the RF equivalence, as well as its impact on SMC, has been elusive. In this work we solve the problem of consistency verification for the TSO and PSO memory models given a reads-from map, denoted VTSO-rf and VPSO-rf, respectively. For an execution of $n$ events over $k$ threads and $d$ variables, we establish novel bounds that scale as $n^{k+1}$ for TSO and as $n^{k+1}\cdot \min(n^{k^2}, 2^{k\cdot d})$ for PSO. Based on our solution to these problems, we develop an SMC algorithm under TSO and PSO that uses the RF equivalence. The algorithm is exploration-optimal, in the sense that it is guaranteed to explore each class of the RF partitioning exactly once, and spends polynomial time per class when $k$ is bounded. We implement all our algorithms in the SMC tool Nidhugg, and perform a large number of experiments over benchmarks from existing literature. Our experimental results show that our algorithms for VTSO-rf and VPSO-rf provide significant scalability improvements over standard alternatives. When used for SMC, the RF partitioning is often much coarser than the standard Shasha-Snir partitioning for TSO/PSO, which yields a significant speedup in the model checking task.

Submitted 6 September, 2021; v1 submitted 23 November, 2020; originally announced November 2020.

Comments: Full technical report of the OOPSLA2021 work

arXiv:2010.16385 [pdf, other]

Optimal Prediction of Synchronization-Preserving Races

Authors: Umang Mathur, Andreas Pavlogiannis, Mahesh Viswanathan

Abstract: Concurrent programs are notoriously hard to write correctly, as scheduling nondeterminism introduces subtle errors that are both hard to detect and to reproduce. The most common concurrency errors are (data) races, which occur when memory-conflicting actions are executed concurrently. Consequently, considerable effort has been made towards developing efficient techniques for race detection. The most common approach is dynamic race prediction: given an observed, race-free trace $σ$ of a concurrent program, the task is to decide whether events of $σ$ can be correctly reordered to a trace $σ^*$ that witnesses a race hidden in $σ$. In this work we introduce the notion of sync(hronization)-preserving races. A sync-preserving race occurs in $σ$ when there is a witness $σ^*$ in which synchronization operations (e.g., acquisition and release of locks) appear in the same order as in $σ$. This is a broad definition that strictly subsumes the famous notion of happens-before races. Our main results are as follows. First, we develop a sound and complete algorithm for predicting sync-preserving races. For moderate values of parameters like the number of threads, the algorithm runs in $\widetilde{O}(\mathcal{N})$ time and space, where $\mathcal{N}$ is the length of the trace $σ$. Second, we show that the problem has an $Ω(\mathcal{N}/\log^2 \mathcal{N})$ space lower bound, and thus our algorithm is essentially time and space optimal. Third, we show that predicting races with even just a single reversal of two sync operations is $\operatorname{NP}$-complete and even $\operatorname{W}[1]$-hard when parameterized by the number of threads. Thus, sync-preservation characterizes exactly the tractability boundary of race prediction, and our algorithm is nearly optimal for the tractable side.

Submitted 30 October, 2020; originally announced October 2020.

arXiv:2006.01491 [pdf, other]

The Fine-Grained and Parallel Complexity of Andersen's Pointer Analysis

Authors: Anders Alnor Mathiasen, Andreas Pavlogiannis

Abstract: Pointer analysis is one of the fundamental problems in static program analysis. Given a set of pointers, the task is to produce a useful over-approximation of the memory locations that each pointer may point-to at runtime. The most common formulation is Andersen's Pointer Analysis (APA), defined as an inclusion-based set of $m$ pointer constraints over a set of $n$ pointers. Existing algorithms solve APA in $O(n^2\cdot m)$ time, while it has been conjectured that the problem has no truly sub-cubic algorithm, with a proof so far having remained elusive. In this work we draw a rich fine-grained and parallel complexity landscape of APA, and present upper and lower bounds. First, we establish an $O(n^3)$ upper-bound for general APA, improving over $O(n^2\cdot m)$ as $n=O(m)$. Second, we show that even on-demand APA ("may a specific pointer $a$ point to a specific location $b$?") has an $Ω(n^3)$ (combinatorial) lower bound under standard complexity-theoretic hypotheses. This formally establishes the long-conjectured "cubic bottleneck" of APA, and shows that our $O(n^3)$-time algorithm is optimal. Third, we show that under mild restrictions, APA is solvable in $\tilde{O}(n^ω)$ time, where $ω<2.373$ is the matrix-multiplication exponent. It is believed that $ω=2+o(1)$, in which case this bound becomes quadratic. Fourth, we show that even under such restrictions, even the on-demand problem has an $Ω(n^2)$ lower bound under standard complexity-theoretic hypotheses, and hence our algorithm is optimal when $ω=2+o(1)$. Fifth, we study the parallelizability of APA and establish lower and upper bounds: (i) in general, the problem is P-complete and hence unlikely to be parallelizable, whereas (ii) under mild restrictions, the problem is parallelizable. Our theoretical treatment formalizes several insights that can lead to practical improvements in the future.

Submitted 14 October, 2020; v1 submitted 2 June, 2020; originally announced June 2020.
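
As an illustration of what an "inclusion-based set of pointer constraints" looks like, the following Python sketch solves Andersen-style constraints by naive fixpoint iteration. It is a textbook formulation with the four usual constraint kinds, not the $O(n^3)$ algorithm of the paper. The tuple encoding (`"addr"`, `"copy"`, `"load"`, `"store"`) and the function name `andersen` are assumptions chosen for this example.

```python
from collections import defaultdict


def andersen(constraints):
    """Compute points-to sets for inclusion-based pointer constraints.

    constraints: list of (kind, a, b) in a hypothetical encoding:
      ("addr",  a, b)  models a = &b   ->  b in pts(a)
      ("copy",  a, b)  models a = b    ->  pts(b) subset of pts(a)
      ("load",  a, b)  models a = *b   ->  pts(c) subset of pts(a) for c in pts(b)
      ("store", a, b)  models *a = b   ->  pts(b) subset of pts(c) for c in pts(a)
    """
    pts = defaultdict(set)
    changed = True
    while changed:  # repeat until no points-to set grows
        changed = False
        for kind, a, b in constraints:
            if kind == "addr":
                updates = [(a, {b})]
            elif kind == "copy":
                updates = [(a, set(pts[b]))]
            elif kind == "load":
                updates = [(a, set(pts[c])) for c in list(pts[b])]
            elif kind == "store":
                updates = [(c, set(pts[b])) for c in list(pts[a])]
            else:
                raise ValueError(f"unknown constraint kind: {kind}")
            for dst, new in updates:
                if not new <= pts[dst]:
                    pts[dst] |= new
                    changed = True
    return dict(pts)
```

For instance, the constraints for `p = &x; q = &y; *p = q; r = *p` yield `pts(x) = {y}` and `pts(r) = {y}`. The naive loop re-scans all constraints on every round; practical solvers use a worklist over a constraint graph instead.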

arXiv:2004.14931 [pdf, other]

The Complexity of Dynamic Data Race Prediction

Authors: Umang Mathur, Andreas Pavlogiannis, Mahesh Viswanathan

Abstract: Writing concurrent programs is notoriously hard due to scheduling non-determinism. The most common concurrency bugs are data races, which are accesses to a shared resource that can be executed concurrently. Dynamic data-race prediction is the most standard technique for detecting data races: given an observed, data-race-free trace $t$, the task is to determine whether $t$ can be reordered to a trace $t^*$ that exposes a data-race. Although the problem has received significant practical attention for over three decades, its complexity has remained elusive. In this work, we address this lacuna, identifying sources of intractability and conditions under which the problem is efficiently solvable. Given a trace $t$ of size $n$ over $k$ threads, our main results are as follows. First, we establish a general $O(k\cdot n^{2\cdot (k-1)})$ upper-bound, as well as an $O(n^k)$ upper-bound when certain parameters of $t$ are constant. In addition, we show that the problem is NP-hard and even W[1]-hard parameterized by $k$, and thus unlikely to be fixed-parameter tractable. Second, we study the problem over acyclic communication topologies, such as server-clients hierarchies. We establish an $O(k^2\cdot d\cdot n^2\cdot \log n)$ upper-bound, where $d$ is the number of shared variables accessed in $t$. In addition, we show that even for traces with $k=2$ threads, the problem has no $O(n^{2-ε})$ algorithm under Orthogonal Vectors. Since any trace with 2 threads defines an acyclic topology, our upper-bound for this case is optimal wrt polynomial improvements for up to moderate values of $k$ and $d$. Finally, we study a distance-bounded version of the problem, where the task is to expose a data race by a witness trace that is similar to $t$. We develop an algorithm that works in $O(n)$ time when certain parameters of $t$ are constant.

Submitted 2 May, 2020; v1 submitted 30 April, 2020; originally announced April 2020.

arXiv:2004.08828 [pdf, other]

Faster Algorithms for Quantitative Analysis of Markov Chains and Markov Decision Processes with Small Treewidth

Authors: Ali Asadi, Krishnendu Chatterjee, Amir Kafshdar Goharshady, Kiarash Mohammadi, Andreas Pavlogiannis

Abstract: Discrete-time Markov Chains (MCs) and Markov Decision Processes (MDPs) are two standard formalisms in system analysis. Their main associated quantitative objectives are hitting probabilities, discounted sum, and mean payoff. Although there are many techniques for computing these objectives in general MCs/MDPs, they have not been thoroughly studied in terms of parameterized algorithms, particularly when treewidth is used as the parameter. This is in sharp contrast to qualitative objectives for MCs, MDPs and graph games, for which treewidth-based algorithms yield significant complexity improvements. In this work, we show that treewidth can also be used to obtain faster algorithms for the quantitative problems. For an MC with $n$ states and $m$ transitions, we show that each of the classical quantitative objectives can be computed in $O((n+m)\cdot t^2)$ time, given a tree decomposition of the MC that has width $t$. Our results also imply a bound of $O(κ\cdot (n+m)\cdot t^2)$ for each objective on MDPs, where $κ$ is the number of strategy-iteration refinements required for the given input and objective. Finally, we make an experimental evaluation of our new algorithms on low-treewidth MCs and MDPs obtained from the DaCapo benchmark suite. Our experimental results show that on MCs and MDPs with small treewidth, our algorithms outperform existing well-established methods by one or more orders of magnitude.

Submitted 19 April, 2020; originally announced April 2020.
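
For readers unfamiliar with the "hitting probabilities" objective named in the abstract above, the sketch below computes it for a small Markov chain by plain fixed-point iteration. This is a generic baseline, not the treewidth-based algorithm of the paper. The dictionary encoding of the chain and the function name `hitting_probabilities` are assumptions for illustration.

```python
def hitting_probabilities(P, targets, tol=1e-12, max_iters=100_000):
    """Approximate, for every state, the probability of eventually reaching `targets`.

    P: dict mapping each state to a dict {successor: probability}
       (a hypothetical encoding; every successor must itself be a key of P).
    """
    # Start from 0 outside the target set, so the iteration converges from
    # below to the least solution of h(s) = sum_{s'} P(s, s') * h(s').
    h = {s: (1.0 if s in targets else 0.0) for s in P}
    for _ in range(max_iters):
        delta = 0.0
        for s in P:
            if s in targets:
                continue
            new = sum(p * h[s2] for s2, p in P[s].items())
            delta = max(delta, abs(new - h[s]))
            h[s] = new
        if delta < tol:
            break
    return h
```

On a four-state random walk with absorbing end states, for example, the two inner states reach the right end with probability 1/3 and 2/3 respectively.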

arXiv:2001.11070 [pdf, other]

Optimal and Perfectly Parallel Algorithms for On-demand Data-flow Analysis

Authors: Krishnendu Chatterjee, Amir Kafshdar Goharshady, Rasmus Ibsen-Jensen, Andreas Pavlogiannis

Abstract: Interprocedural data-flow analyses form an expressive and useful paradigm of numerous static analysis applications, such as live variables analysis, alias analysis and null pointers analysis. The most widely-used framework for interprocedural data-flow analysis is IFDS, which encompasses distributive data-flow functions over a finite domain. On-demand data-flow analyses restrict the focus of the analysis on specific program locations and data facts. This setting provides a natural split between (i) an offline (or preprocessing) phase, where the program is partially analyzed and analysis summaries are created, and (ii) an online (or query) phase, where analysis queries arrive on demand and the summaries are used to speed up answering queries. In this work, we consider on-demand IFDS analyses where the queries concern program locations of the same procedure (aka same-context queries). We exploit the fact that flow graphs of programs have low treewidth to develop faster algorithms that are space and time optimal for many common data-flow analyses, in both the preprocessing and the query phase. We also use treewidth to develop query solutions that are embarrassingly parallelizable, i.e. the total work for answering each query is split to a number of threads such that each thread performs only a constant amount of work. Finally, we implement a static analyzer based on our algorithms, and perform a series of on-demand analysis experiments on standard benchmarks. Our experimental results show a drastic speed-up of the queries after only a lightweight preprocessing phase, which significantly outperforms existing techniques.

Submitted 14 April, 2020; v1 submitted 29 January, 2020; originally announced January 2020.

Comments: A conference version appeared in ESOP 2020

arXiv:1910.00241 [pdf, other]

Optimal Dyck Reachability for Data-Dependence and Alias Analysis

Authors: Krishnendu Chatterjee, Bhavya Choudhary, Andreas Pavlogiannis

Abstract: A fundamental algorithmic problem at the heart of static analysis is Dyck reachability. The input is a graph where the edges are labeled with different types of opening and closing parentheses, and the reachability information is computed via paths whose parentheses are properly matched. We present new results for Dyck reachability problems with applications to alias analysis and data-dependence analysis. Our main contributions, that include improved upper bounds as well as lower bounds that establish optimality guarantees, are as follows. First, we consider Dyck reachability on bidirected graphs, which is the standard way of performing field-sensitive points-to analysis. Given a bidirected graph with $n$ nodes and $m$ edges, we present: (i) an algorithm with worst-case running time $O(m + n \cdot α(n))$, where $α(n)$ is the inverse Ackermann function, improving the previously known $O(n^2)$ time bound; (ii) a matching lower bound that shows that our algorithm is optimal wrt worst-case complexity; and (iii) an optimal average-case upper bound of $O(m)$ time, improving the previously known $O(m \cdot \log n)$ bound. Second, we consider the problem of context-sensitive data-dependence analysis, where the task is to obtain analysis summaries of library code in the presence of callbacks. Our algorithm preprocesses libraries in almost linear time, after which the contribution of the library in the complexity of the client analysis is only linear, and only wrt the number of call sites. Third, we prove that combinatorial algorithms for Dyck reachability on general graphs with truly sub-cubic bounds cannot be obtained without obtaining sub-cubic combinatorial algorithms for Boolean Matrix Multiplication, which is a long-standing open problem. We also show that the same hardness holds for graphs of constant treewidth.

Submitted 1 October, 2019; originally announced October 2019.
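
The bidirected case described in the abstract above lends itself to a union-find formulation, since on bidirected graphs Dyck reachability is commonly treated as an equivalence relation on nodes. The sketch below is a simplified illustration under that assumption: whenever two nodes have an opening edge with the same label into the same class, they are merged, and merging repeats to a fixpoint. It does not claim the $O(m + n \cdot α(n))$ bound of the paper. The edge encoding `(x, label, u)`, meaning an opening edge from `x` to `u` with an implicit matching closing edge back, and the function name are assumptions.

```python
def bidirected_dyck_components(nodes, edges):
    """Map every node to a representative of its Dyck-reachability class.

    edges: list of (x, label, u), read as x --(label--> u together with the
    implicit reverse edge u --)label--> x (hypothetical encoding).
    """
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # incoming[c][label]: one node with an opening `label` edge into class c.
    incoming = {v: {} for v in nodes}
    pending = []  # pairs of nodes that must end up in the same class
    for x, label, u in edges:
        if label in incoming[u]:
            # x --(--> u --)--> y is a matched path, so x and y are equivalent.
            pending.append((incoming[u][label], x))
        else:
            incoming[u][label] = x

    while pending:
        a, b = pending.pop()
        a, b = find(a), find(b)
        if a == b:
            continue
        if len(incoming[a]) < len(incoming[b]):
            a, b = b, a  # merge the smaller edge map into the larger one
        parent[b] = a
        for label, src in incoming.pop(b).items():
            if label in incoming[a]:
                pending.append((incoming[a][label], src))
            else:
                incoming[a][label] = src
    return {v: find(v) for v in nodes}
```

Two nodes are then reported as mutually reachable exactly when they map to the same representative.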

arXiv:1909.00989 [pdf, other]

Value-centric Dynamic Partial Order Reduction

Authors: Krishnendu Chatterjee, Andreas Pavlogiannis, Viktor Toman

Abstract: The verification of concurrent programs remains an open challenge, as thread interaction has to be accounted for, which leads to state-space explosion. Stateless model checking battles this problem by exploring traces rather than states of the program. As there are exponentially many traces, dynamic partial-order reduction (DPOR) techniques are used to partition the trace space into equivalence classes, and explore a few representatives from each class. The standard equivalence that underlies most DPOR techniques is the happens-before equivalence; however, recent works have spawned a vivid interest towards coarser equivalences. The efficiency of such approaches is a product of two parameters: (i) the size of the partitioning induced by the equivalence, and (ii) the time spent by the exploration algorithm in each class of the partitioning. In this work, we present a new equivalence, called value-happens-before and show that it has two appealing features. First, value-happens-before is always at least as coarse as the happens-before equivalence, and can be even exponentially coarser. Second, the value-happens-before partitioning is efficiently explorable when the number of threads is bounded. We present an algorithm called value-centric DPOR (VCDPOR), which explores the underlying partitioning using polynomial time per class. Finally, we perform an experimental evaluation of VCDPOR on various benchmarks, and compare it against other state-of-the-art approaches. Our results show that value-happens-before typically induces a significant reduction in the size of the underlying partitioning, which leads to a considerable reduction in the running time for exploring the whole partitioning.

Submitted 3 September, 2019; originally announced September 2019.

arXiv:1901.08857 [pdf, other]

Fast, Sound and Effectively Complete Dynamic Race Prediction

Authors: Andreas Pavlogiannis

Abstract: Writing concurrent programs is highly error-prone due to the nondeterminism in interprocess communication. The most reliable indicators of errors in concurrency are data races, which are accesses to a shared resource that can be executed consecutively. We study the problem of predicting data races in lock-based concurrent programs. The input consists of a concurrent trace $t$, and the task is to determine all pairs of events of $t$ that constitute a data race. The problem lies at the heart of concurrent verification and has been extensively studied for over three decades. However, existing polynomial-time sound techniques are highly incomplete and can miss many simple races. In this work we develop M2: a new polynomial-time algorithm for this problem, which has no false positives. In addition, our algorithm is complete for input traces that consist of two processes, i.e., it provably detects all races in the trace. We also develop sufficient conditions for detecting completeness dynamically in cases of more than two processes. We make an experimental evaluation of our algorithm on a standard set of benchmarks taken from recent literature on the topic. Our tool soundly reports thousands of races and misses at most one race in the whole benchmark set. In addition, our technique detects all racy memory locations of the benchmark set. Finally, its running times are comparable, and often smaller than the theoretically fastest, yet highly incomplete, existing methods. To our knowledge, M2 is the first sound algorithm that achieves such a level of performance on both running time and completeness of the reported races.

Submitted 5 November, 2019; v1 submitted 25 January, 2019; originally announced January 2019.

arXiv:1810.02687 [pdf, other]

Fixation probability and fixation time in structured populations

Authors: Josef Tkadlec, Andreas Pavlogiannis, Krishnendu Chatterjee, Martin A. Nowak

Abstract: The rate of biological evolution depends on the fixation probability and on the fixation time of new mutants. Intensive research has focused on identifying population structures that augment the fixation probability of advantageous mutants. But these 'amplifiers of natural selection' typically increase fixation time. Here we study population structures that achieve a trade-off between high fixation probability and short fixation time. First, we show that no amplifiers can have asymptotically lower absorption time than the well-mixed population. Then we design population structures that substantially augment the fixation probability with just a minor increase in fixation time. Finally, we show that those structures enable a higher effective rate of evolution than the well-mixed population provided that the rate of generating advantageous mutants is relatively low. Our work sheds light on how population structure affects the rate of evolution. Moreover, our structures could be useful for lab-based, medical or industrial applications of evolutionary optimization.

Submitted 8 March, 2019; v1 submitted 27 September, 2018; originally announced October 2018.
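
The well-mixed population serves as the baseline in the abstract above. As a point of reference, the standard closed form for the fixation probability of a single mutant with relative fitness $r$ in a well-mixed Moran process of size $n$ is $(1 - 1/r)/(1 - 1/r^n)$, which tends to $1 - 1/r$ as $n$ grows. The sketch below evaluates it; the function name is a hypothetical one chosen here, and the formula is the textbook result rather than anything taken from the paper.

```python
def well_mixed_fixation_probability(r, n):
    """Fixation probability of one mutant of relative fitness r among n individuals
    in the well-mixed (complete-graph) Moran process."""
    if n < 1 or r <= 0:
        raise ValueError("need n >= 1 and r > 0")
    r = float(r)
    if r == 1.0:
        # Neutral mutant: every individual is equally likely to take over.
        return 1.0 / n
    return (1.0 - 1.0 / r) / (1.0 - r ** (-n))
```

An amplifier of natural selection, in the sense used above, is a population structure whose fixation probability for advantageous mutants ($r > 1$) exceeds this baseline.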

arXiv:1802.02509 [pdf, other]

Strong Amplifiers of Natural Selection: Proofs

Authors: Andreas Pavlogiannis, Josef Tkadlec, Krishnendu Chatterjee, Martin A. Nowak

Abstract: We consider the modified Moran process on graphs to study the spread of genetic and cultural mutations on structured populations. An initial mutant arises either spontaneously (aka uniform initialization), or during reproduction (aka temperature initialization) in a population of $n$ individuals, and has a fixed fitness advantage $r>1$ over the residents of the population. The fixation probability is the probability that the mutant takes over the entire population. Graphs that ensure fixation probability of 1 in the limit of infinite populations are called strong amplifiers. Previously, only a few examples of strong amplifiers were known for uniform initialization, whereas no strong amplifiers were known for temperature initialization. In this work, we study necessary and sufficient conditions for strong amplification, and prove negative and positive results. We show that for temperature initialization, graphs that are unweighted and/or self-loop-free have fixation probability upper-bounded by $1-1/f(r)$, where $f(r)$ is a function linear in $r$. Similarly, we show that for uniform initialization, bounded-degree graphs that are unweighted and/or self-loop-free have fixation probability upper-bounded by $1-1/g(r,c)$, where $c$ is the degree bound and $g(r,c)$ a function linear in $r$. Our main positive result complements these negative results, and is as follows: every family of undirected graphs with (i) self loops and (ii) diameter bounded by $n^{1-ε}$, for some fixed $ε>0$, can be assigned weights that make it a strong amplifier, both for uniform and temperature initialization.

Submitted 14 May, 2018; v1 submitted 7 February, 2018; originally announced February 2018.

arXiv:1701.04914 [pdf, other]

doi 10.1007/978-3-662-54434-1_11

Faster Algorithms for Weighted Recursive State Machines

Authors: Krishnendu Chatterjee, Bernhard Kragl, Samarth Mishra, Andreas Pavlogiannis

Abstract: Pushdown systems (PDSs) and recursive state machines (RSMs), which are linearly equivalent, are standard models for interprocedural analysis. Yet RSMs are more convenient as they (a) explicitly model function calls and returns, and (b) specify many natural parameters for algorithmic analysis, e.g., the number of entries and exits. We consider a general framework where RSM transitions are labeled from a semiring and path properties are algebraic with semiring operations, which can model, e.g., interprocedural reachability and dataflow analysis problems. Our main contributions are new algorithms for several fundamental problems. As compared to a direct translation of RSMs to PDSs and the best-known existing bounds of PDSs, our analysis algorithm improves the complexity for finite-height semirings (that subsumes reachability and standard dataflow properties). We further consider the problem of extracting distance values from the representation structures computed by our algorithm, and give efficient algorithms that distinguish the complexity of a one-time preprocessing from the complexity of each individual query. Another advantage of our algorithm is that our improvements carry over to the concurrent setting, where we improve the best-known complexity for the context-bounded analysis of concurrent RSMs. Finally, we provide a prototype implementation that gives a significant speed-up on several benchmarks from the SLAM/SDV project.

Submitted 17 January, 2017; originally announced January 2017.

arXiv:1610.01188 [pdf, other]

Data-centric Dynamic Partial Order Reduction

Authors: Marek Chalupa, Krishnendu Chatterjee, Andreas Pavlogiannis, Nishant Sinha, Kapil Vaidya

Abstract: We present a new dynamic partial-order reduction method for stateless model checking of concurrent programs. A common approach for exploring program behaviors relies on enumerating the traces of the program, without storing the visited states (aka stateless exploration). As the number of distinct traces grows exponentially, dynamic partial-order reduction (DPOR) techniques have been successfully used to partition the space of traces into equivalence classes (Mazurkiewicz partitioning), with the goal of exploring only a few representative traces from each class. We introduce a new equivalence on traces under sequential consistency semantics, which we call the observation equivalence. Two traces are observationally equivalent if every read event observes the same write event in both traces. While the traditional Mazurkiewicz equivalence is control-centric, our new definition is data-centric. We show that our observation equivalence is coarser than the Mazurkiewicz equivalence, and in many cases even exponentially coarser. We devise a DPOR exploration of the trace space, called data-centric DPOR, based on the observation equivalence. For acyclic architectures, our algorithm is guaranteed to explore exactly one representative trace from each observation class, while spending polynomial time per class. Hence, our algorithm is optimal w.r.t. the observation equivalence, and in several cases explores exponentially fewer traces than any enumerative method based on the Mazurkiewicz equivalence.
For cyclic architectures, we consider an equivalence between traces which is finer than the observation equivalence but coarser than the Mazurkiewicz equivalence, and in some cases is exponentially coarser. Our data-centric DPOR algorithm remains optimal under this trace equivalence.

Submitted 25 January, 2019; v1 submitted 4 October, 2016; originally announced October 2016.
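
The observation equivalence defined in the abstract above ("every read event observes the same write event in both traces") can be illustrated with a minimal Python sketch. The event encoding, the names `reads_from` and `observationally_equivalent`, and the example traces are hypothetical and not taken from the paper; the sketch assumes both traces contain the same events and that, under sequential consistency, a read observes the latest preceding write to the same variable. It only checks the equivalence and is not the data-centric DPOR exploration itself.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: an event is (thread, op, variable), op is "r" or "w".
Event = Tuple[str, str, str]
# An event is identified by its thread and its position within that thread.
EventId = Tuple[str, int]


def reads_from(trace: List[Event]) -> Dict[EventId, Optional[EventId]]:
    """Map each read to the write it observes (None means the initial value)."""
    last_write: Dict[str, EventId] = {}   # latest write seen per variable
    position: Dict[str, int] = {}         # next per-thread event index
    observed: Dict[EventId, Optional[EventId]] = {}
    for thread, op, var in trace:
        idx = position.get(thread, 0)
        position[thread] = idx + 1
        event_id = (thread, idx)
        if op == "w":
            last_write[var] = event_id
        else:
            observed[event_id] = last_write.get(var)
    return observed


def observationally_equivalent(t1: List[Event], t2: List[Event]) -> bool:
    """Assumes t1 and t2 are interleavings of the same events."""
    return reads_from(t1) == reads_from(t2)


# Two conflicting writes and no read: the two orders are distinguished by the
# Mazurkiewicz equivalence, yet no read can tell them apart.
writes_ab = [("t1", "w", "x"), ("t2", "w", "x")]
writes_ba = [("t2", "w", "x"), ("t1", "w", "x")]

# Adding a trailing read makes the order of the writes observable.
read_ab = writes_ab + [("t3", "r", "x")]
read_ba = writes_ba + [("t3", "r", "x")]
```

Here `writes_ab` and `writes_ba` fall into one observation class, whereas `read_ab` and `read_ba` do not, because the read by `t3` observes a different write in each.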

arXiv:1510.07565 [pdf, other]

Algorithms for Algebraic Path Properties in Concurrent Systems of Constant Treewidth Components

Authors: Krishnendu Chatterjee, Amir Kafshdar Goharshady, Rasmus Ibsen-Jensen, Andreas Pavlogiannis

Abstract: We study algorithmic questions for concurrent systems where the transitions are labeled from a complete, closed semiring, and path properties are algebraic with semiring operations. The algebraic path properties can model dataflow analysis problems, the shortest path problem, and many other natural problems that arise in program analysis. We consider that each component of the concurrent system is a graph with constant treewidth, a property satisfied by the control-flow graphs of most programs. We allow for multiple possible queries, which arise naturally in demand-driven dataflow analysis. The study of multiple queries allows us to consider the tradeoff between the resource usage of the one-time preprocessing and that of each individual query. The traditional approach constructs the product graph of all components and applies the best-known graph algorithm on the product. In this approach, even the answer to a single query requires the transitive closure, which provides no room for tradeoff between preprocessing and query time. Our main contributions are algorithms that significantly improve the worst-case running time of the traditional approach, and provide various tradeoffs depending on the number of queries.
For example, in a concurrent system of two components, the traditional approach requires hexic time in the worst case for answering one query as well as computing the transitive closure, whereas we show that with one-time preprocessing in almost cubic time, each subsequent query can be answered in at most linear time, and even the transitive closure can be computed in almost quartic time. Furthermore, we establish conditional optimality results showing that the worst-case running time of our algorithms cannot be improved without achieving major breakthroughs in graph algorithms.

Submitted 26 October, 2015; originally announced October 2015.

ACM Class: F.3.2
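
To make the notion of an algebraic path property over a semiring concrete, the following Python sketch computes all-pairs path values on a single graph with the classical cubic closure, instantiated once for reachability and once for shortest paths. This is a generic textbook-style baseline and not the algorithm of the paper above; the names `Semiring`, `all_pairs`, `REACHABILITY`, and `SHORTEST_PATH` are hypothetical, and the sketch assumes a semiring in which cycles never improve a path value (true for Boolean reachability and for min-plus with non-negative weights).

```python
from dataclasses import dataclass
from typing import Callable, Generic, List, Tuple, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Semiring(Generic[T]):
    plus: Callable[[T, T], T]    # combines alternative paths
    times: Callable[[T, T], T]   # extends a path by another path
    zero: T                      # value of "no path"
    one: T                       # value of the empty path


def all_pairs(n: int, edges: List[Tuple[int, int, T]], sr: Semiring) -> List[List[T]]:
    """Cubic closure over nodes 0..n-1; assumes cycles never improve a value."""
    d = [[sr.zero for _ in range(n)] for _ in range(n)]
    for i in range(n):
        d[i][i] = sr.one
    for u, v, w in edges:
        d[u][v] = sr.plus(d[u][v], w)
    # Allow paths through intermediate node k, one k at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = sr.plus(d[i][j], sr.times(d[i][k], d[k][j]))
    return d


# Boolean semiring: the path property is reachability.
REACHABILITY = Semiring(plus=lambda a, b: a or b, times=lambda a, b: a and b,
                        zero=False, one=True)
# Min-plus semiring: the path property is the shortest distance.
SHORTEST_PATH = Semiring(plus=min, times=lambda a, b: a + b,
                         zero=float("inf"), one=0.0)

reach = all_pairs(3, [(0, 1, True), (1, 2, True)], REACHABILITY)
dist = all_pairs(3, [(0, 1, 2.0), (1, 2, 3.0), (0, 2, 10.0)], SHORTEST_PATH)
```

The same closure routine serves both analyses; only the semiring changes. In this example `dist[0][2]` is `5.0` (the two-edge path beats the direct edge of weight 10), and node 0 is not reachable from node 2.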

arXiv:1504.07384 [pdf, other]

Faster Algorithms for Quantitative Verification in Constant Treewidth Graphs

Authors: Krishnendu Chatterjee, Rasmus Ibsen-Jensen, Andreas Pavlogiannis

Abstract: We consider the core algorithmic problems related to verification of systems with respect to three classical quantitative properties, namely, the mean-payoff property, the ratio property, and the minimum initial credit for energy property. The algorithmic problem, given a graph and a quantitative property, asks to compute the optimal value (the infimum value over all traces) from every node of the graph. We consider graphs with constant treewidth, and it is well-known that the control-flow graphs of most programs have constant treewidth. Let $n$ denote the number of nodes of a graph, $m$ the number of edges (for constant treewidth graphs $m=O(n)$) and $W$ the largest absolute value of the weights. Our main theoretical results are as follows. First, for constant treewidth graphs we present an algorithm that approximates the mean-payoff value within a multiplicative factor of $ε$ in time $O(n \cdot \log (n/ε))$ and linear space, as compared to the classical algorithms that require quadratic time. Second, for the ratio property we present an algorithm that for constant treewidth graphs works in time $O(n \cdot \log (|a\cdot b|))=O(n\cdot\log (n\cdot W))$, when the output is $\frac{a}{b}$, as compared to the previously best known algorithm with running time $O(n^2 \cdot \log (n\cdot W))$.
Third, for the minimum initial credit problem we show that (i) for general graphs the problem can be solved in $O(n^2\cdot m)$ time and the associated decision problem can be solved in $O(n\cdot m)$ time, improving the previously known $O(n^3\cdot m\cdot \log (n\cdot W))$ and $O(n^2 \cdot m)$ bounds, respectively; and (ii) for constant treewidth graphs we present an algorithm that requires $O(n\cdot \log n)$ time, improving the previously known $O(n^4 \cdot \log (n \cdot W))$ bound.

Submitted 28 April, 2015; originally announced April 2015.

ACM Class: G.2.2

arXiv:1410.7724 [pdf, ps, other]

Faster Algorithms for Algebraic Path Properties in RSMs with Constant Treewidth

Authors: Krishnendu Chatterjee, Rasmus Ibsen-Jensen, Andreas Pavlogiannis, Prateesh Goyal

Abstract: Interprocedural analysis is at the heart of numerous applications in programming languages, such as alias analysis, constant propagation, etc. Recursive state machines (RSMs) are standard models for interprocedural analysis. We consider a general framework with RSMs where the transitions are labeled from a semiring, and path properties are algebraic with semiring operations. RSMs with algebraic path properties can model interprocedural dataflow analysis problems, the shortest path problem, the most probable path problem, etc. The traditional algorithms for interprocedural analysis focus on path properties where the starting point is fixed as the entry point of a specific method. In this work, we consider possible multiple queries as required in many applications such as in alias analysis. The study of multiple queries allows us to bring in a very important algorithmic distinction between the resource usage of the one-time preprocessing and that of each individual query. The second aspect that we consider is that the control flow graphs for most programs have constant treewidth. Our main contributions are simple and implementable algorithms that support multiple queries for algebraic path properties for RSMs that have constant treewidth. Our theoretical results show that our algorithms have small additional one-time preprocessing, but can answer subsequent queries significantly faster as compared to the current best-known solutions for several important problems, such as interprocedural reachability and shortest path.
We provide a prototype implementation for interprocedural reachability and intraprocedural shortest path that gives a significant speed-up on several benchmarks.

Submitted 25 November, 2014; v1 submitted 28 October, 2014; originally announced October 2014.

arXiv:1409.2291 [pdf, other]

A Framework for Automated Competitive Analysis of On-line Scheduling of Firm-Deadline Tasks

Authors: Krishnendu Chatterjee, Andreas Pavlogiannis, Alexander Kößler, Ulrich Schmid

Abstract: We present a flexible framework for the automated competitive analysis of on-line scheduling algorithms for firm-deadline real-time tasks based on multi-objective graphs: Given a taskset and an on-line scheduling algorithm specified as a labeled transition system, along with some optional safety, liveness, and/or limit-average constraints for the adversary, we automatically compute the competitive ratio of the algorithm w.r.t. a clairvoyant scheduler. We demonstrate the flexibility and power of our approach by comparing the competitive ratio of several on-line algorithms, including $D^{over}$, that have been proposed in the past, for various tasksets. Our experimental results reveal that none of these algorithms is universally optimal, in the sense that there are tasksets where other schedulers provide better performance. Our framework is hence a very useful design tool for selecting optimal algorithms for a given application.

Submitted 14 September, 2014; v1 submitted 8 September, 2014; originally announced September 2014.

arXiv:1012.2440 [pdf, ps, other]

Passively Mobile Communicating Machines that Use Restricted Space

Authors: Ioannis Chatzigiannakis, Othon Michail, Stavros Nikolaou, Andreas Pavlogiannis, Paul G. Spirakis

Abstract: We propose a new theoretical model for passively mobile Wireless Sensor Networks, called PM, standing for Passively mobile Machines. The main modification w.r.t. the Population Protocol model is that agents now, instead of being automata, are Turing Machines. We provide general definitions for unbounded memories, but we are mainly interested in computations upper-bounded by plausible space limitations. However, we prove that our results hold for more general cases. We focus on complete communication graphs and define the complexity classes PMSPACE(f(n)) parametrically, consisting of all predicates that are stably computable by some PM protocol that uses O(f(n)) memory on each agent. We provide a protocol that generates unique ids from scratch only by using O(log n) memory, and use it to provide an exact characterization for the classes PMSPACE(f(n)) when f(n)=Ω(log n): they are precisely the classes of all symmetric predicates in NSPACE(nf(n)). In this way, we provide a space hierarchy for the PM model when the memory bounds are Ω(log n). Finally, we explore the computability of the PM model when the protocols use o(loglog n) space per machine and prove that SEMILINEAR=PMSPACE(f(n)) when f(n)=o(loglog n), where SEMILINEAR denotes the class of the semilinear predicates. In fact, we prove that this bound acts as a threshold, so that SEMILINEAR is a proper subset of PMSPACE(f(n)) when f(n)=O(loglog n).

Submitted 11 December, 2010; originally announced December 2010.

Comments: 17 pages

arXiv:1004.3395 [pdf, ps, other]

Passively Mobile Communicating Logarithmic Space Machines

Authors: Ioannis Chatzigiannakis, Othon Michail, Stavros Nikolaou, Andreas Pavlogiannis, Paul G. Spirakis

Abstract: We propose a new theoretical model for passively mobile Wireless Sensor Networks. We call it the PALOMA model, standing for PAssively mobile LOgarithmic space MAchines. The main modification w.r.t. the Population Protocol model is that agents now, instead of being automata, are Turing Machines whose memory is logarithmic in the population size n. Note that the new model is still easily implementable with current technology. We focus on complete communication graphs. We define the complexity class PLM, consisting of all symmetric predicates on input assignments that are stably computable by the PALOMA model. We assume that the agents are initially identical. Surprisingly, it turns out that the PALOMA model can assign unique consecutive ids to the agents and inform them of the population size! This allows us to give a direct simulation of a Deterministic Turing Machine of O(nlogn) space, thus, establishing that any symmetric predicate in SPACE(nlogn) also belongs to PLM. We next prove that the PALOMA model can simulate the Community Protocol model, thus, improving the previous lower bound to all symmetric predicates in NSPACE(nlogn). Going one step further, we generalize the simulation of the deterministic TM to prove that the PALOMA model can simulate a Nondeterministic TM of O(nlogn) space. Although providing the same lower bound, the important remark here is that the bound is now obtained in a direct manner, in the sense that it does not depend on the simulation of a TM by a Pointer Machine.
Finally, by showing that a Nondeterministic TM of O(nlogn) space decides any language stably computable by the PALOMA model, we end up with an exact characterization for PLM: it is precisely the class of all symmetric predicates in NSPACE(nlogn).

Submitted 20 April, 2010; originally announced April 2010.

Comments: 22 pages

Report number: FRONTS-TR-2010-16

Showing 1–36 of 36 results for author: Pavlogiannis, A