Search | arXiv e-print repository

Compositional Value Iteration with Pareto Caching

Authors: Kazuki Watanabe, Marck van der Vegt, Sebastian Junges, Ichiro Hasuo

Abstract: The de-facto standard approach in MDP verification is based on value iteration (VI). We propose compositional VI, a framework for model checking compositional MDPs, that addresses efficiency while maintaining soundness. Concretely, compositional MDPs naturally arise from the combination of individual components, and their structure can be expressed using, e.g., string diagrams. Towards efficiency,… ▽ More The de-facto standard approach in MDP verification is based on value iteration (VI). We propose compositional VI, a framework for model checking compositional MDPs, that addresses efficiency while maintaining soundness. Concretely, compositional MDPs naturally arise from the combination of individual components, and their structure can be expressed using, e.g., string diagrams. Towards efficiency, we observe that compositional VI repeatedly verifies individual components. We propose a technique called Pareto caching that allows to reuse verification results, even for previously unseen queries. Towards soundness, we present two stop** criteria: one generalizes the optimistic value iteration paradigm and the other uses Pareto caches in conjunction with recent baseline algorithms. Our experimental evaluations shows the promise of the novel algorithm and its variations, and identifies challenges for future work. △ Less

Submitted 16 May, 2024; originally announced May 2024.

Comments: Extended version (includes the Appendix) of the paper accepted at CAV-24

arXiv:2401.08377 [pdf, other]

Pareto Curves for Compositionally Model Checking String Diagrams of MDPs

Authors: Kazuki Watanabe, Marck van der Vegt, Ichiro Hasuo, Jurriaan Rot, Sebastian Junges

Abstract: Computing schedulers that optimize reachability probabilities in MDPs is a standard verification task. To address scalability concerns, we focus on MDPs that are compositionally described in a high-level description formalism. In particular, this paper considers string diagrams, which specify an algebraic, sequential composition of subMDPs. Towards their compositional verification, the key challen… ▽ More Computing schedulers that optimize reachability probabilities in MDPs is a standard verification task. To address scalability concerns, we focus on MDPs that are compositionally described in a high-level description formalism. In particular, this paper considers string diagrams, which specify an algebraic, sequential composition of subMDPs. Towards their compositional verification, the key challenge is to locally optimize schedulers on subMDPs without considering their context in the string diagram. This paper proposes to consider the schedulers in a subMDP which form a Pareto curve on a combination of local objectives. While considering all such schedulers is intractable, it gives rise to a highly efficient sound approximation algorithm. The prototype on top of the model checker Storm demonstrates the scalability of this approach. △ Less

Submitted 16 January, 2024; originally announced January 2024.

Comments: Extended version (includes the Appendix) of the paper accepted at TACAS-24

arXiv:2301.11296 [pdf, ps, other]

Robust Almost-Sure Reachability in Multi-Environment MDPs

Authors: Marck van der Vegt, Nils Jansen, Sebastian Junges

Abstract: Multiple-environment MDPs (MEMDPs) capture finite sets of MDPs that share the states but differ in the transition dynamics. These models form a proper subclass of partially observable MDPs (POMDPs). We consider the synthesis of policies that robustly satisfy an almost-sure reachability property in MEMDPs, that is, one policy that satisfies a property for all environments. For POMDPs, deciding the… ▽ More Multiple-environment MDPs (MEMDPs) capture finite sets of MDPs that share the states but differ in the transition dynamics. These models form a proper subclass of partially observable MDPs (POMDPs). We consider the synthesis of policies that robustly satisfy an almost-sure reachability property in MEMDPs, that is, one policy that satisfies a property for all environments. For POMDPs, deciding the existence of robust policies is an EXPTIME-complete problem. In this paper, we show that this problem is PSPACE-complete for MEMDPs, while the policies in general require exponential memory. We exploit the theoretical results to develop and implement an algorithm that shows promising results in synthesizing robust policies for various benchmarks. △ Less

Submitted 26 January, 2023; originally announced January 2023.

Comments: preprint

Showing 1–3 of 3 results for author: van der Vegt, M