11institutetext: Department of Measurement and Information Systems
Budapest University of Technology and Economics
11email: {szekeres, marussy, majzik}@mit.bme.hu

A Lazy Abstraction Algorithm for Markov Decision Processes

Theory and Initial Evaluation
Dániel Szekeres 11 0000-0002-2912-028X    Kristóf Marussy 11 0000-0002-9135-8256    István Majzik 11 0000-0002-1184-2882
Abstract

Analysis of Markov Decision Processes (MDP) is often hindered by state space explosion. Abstraction is a well-established technique in model checking to mitigate this issue. This paper presents a novel lazy abstraction method for MDP analysis based on adaptive simulation graphs. Refinement is performed only when new parts of the state space are explored, which makes partial exploration techniques like Bounded Real-Time Dynamic Programming (BRTDP) retain more merged states. Therefore, we propose a combination of lazy abstraction and BRTDP. To evaluate the performance of our algorithm, we conduct initial experiments using the Quantitative Verification Benchmark Set.

Keywords:
Lazy abstraction Markov Decision Processes Abstraction refinement Probabilistic model checking
\NewEnviron

reptheorem[1]

Theorem 0.1
\BODY

1 Introduction

Ensuring the reliable operation of safety-critical systems, like railway interlocking systems and embedded controllers, is vital. Probabilistic model checking addresses this by offering an automated approach with formal mathematical guarantees for the analysis of quantitative properties, like reliability and availability [20]. We focus on a fundamental task in probabilistic model checking: computing the worst-case probability of reaching an error state.

Markov Decision Processes (MDPs) are discrete-time models able to describe both probabilistic and non-deterministic behavior, used in reliability and safety analysis for worst-case modeling of unknown factors. The analysis of other modeling formalisms, like Markov Automata or Probabilistic Timed Automata, can often be reduced to MDP analysis as well.

State space explosion presents an obstacle for MDP model checking: as the number of components or variables increases, the state space may grow exponentially. Consequently, practical implementations face problems in representing the system in memory and the numerical solution methods also become intractable.

Abstraction aims to counteract this. Several abstraction-based techniques have been adapted to probabilistic systems, like CEGAR [17, 15] and abstract interpretation [7]. Partial state space exploration, like Bounded Real-Time Dynamic Programming (BRTDP) [3, 18] is another approach for counteracting it. As most existing MDP abstraction methods rely on computing the whole abstract model to choose a refinement, they do not lend themselves well to combination with partial state space exploration techniques.

Lazy abstraction [13, 23] in contrast merges state-space exploration and refinement, making it a good candidate for this combination. However, no such method has been proposed for MDPs to our knowledge.

We adapt an existing lazy abstraction algorithm [27] to MDPs (Section 3). We combine it with BRTDP, benefiting from the synergy of lazy abstraction and partial state space exploration and enabling a trade-off between time and accuracy (Subsection 3.1). We evaluate the performance of the proposed algorithms using models from the Quantitative Verification Benchmark Set [12] (Section 4).

1.0.1 Related Work

Counterexample-Guided Abstraction Refinement (CEGAR) [6] is a successful approach for abstraction-based model checking: it starts with a coarse abstraction and refines it based on abstract counterexamples.

Lazy abstraction, introduced in [13], improved CEGAR through on-demand refinement during abstract state space exploration and varying precision from node to node in the state graph. An interpolant-based version was proposed in [23]. This was adapted to timed automata [14], introducing Adaptive Simulation Graphs (ASG) as the abstract model. This allows earlier refinement, cutting spurious paths before reaching a target, and a less expensive covering check. The ASG-based algorithm was adapted to explicit value abstraction of discrete variables in [27], which we, in turn, adapt to MDPs.

Different abstraction methods have been proposed for probabilistic systems. While some CEGAR-based methods employ MDPs as abstraction [8, 15, 5], others utilize stochastic games [24, 28, 17], which we plan to incorporate in the future. Abstract interpretation has also been used for probabilistic systems [7, 11]. Some others include magnifying lens abstraction [9], which explores the whole concrete state space but keeps only its subset in memory and assume-guarantee-style abstraction [19], specialized for composite systems. To our knowledge, no lazy abstraction method has been proposed for probabilistic models yet.

The algorithm presented in this paper uses a symmetric representation constraint, resulting in an approach similar to bisimulation reduction techniques [10, 16]. The main difference is that until the whole ASG is explored, only a limited version of “bisimilarity” holds which does not take the unexplored part of the state space into account, allowing coarser partitions on the already explored part. When combined with partial exploration, the algorithm can stop before exploring the full ASG. Finite-horizon bisimulation minimization [16] and incremental bisimulation abstraction-refinement [25] are similar in that they employ a relaxed version of bisimulation. Both of them limit the bisimilarity to a fixed path length and compute exact quotients w.r.t the relaxed relation, while we base the relaxation on the currently explored state space and do not aim for computing the coarsest relation.

BRTDP was introduced in [22] for Stochastic Shortest Paths, and [3] applied it to general MDPs. We combine it with our lazy abstraction algorithm.

2 Background and Notations

𝔻(A)𝔻𝐴\mathbb{D}(A)blackboard_D ( italic_A ) is the set of probability distributions over the set A𝐴Aitalic_A. For d𝔻(A),aAformulae-sequence𝑑𝔻𝐴𝑎𝐴d\in\mathbb{D}(A),a\in Aitalic_d ∈ blackboard_D ( italic_A ) , italic_a ∈ italic_A, d(a)𝑑𝑎d(a)italic_d ( italic_a ) denotes the probability measure of a𝑎aitalic_a according to d𝑑ditalic_d. f:AB:𝑓𝐴𝐵f\colon A\hookrightarrow Bitalic_f : italic_A ↪ italic_B means f𝑓fitalic_f is a partial function from A𝐴Aitalic_A to B𝐵Bitalic_B, and Supp(f)𝑆𝑢𝑝𝑝𝑓Supp(f)italic_S italic_u italic_p italic_p ( italic_f ) is the set of values for which f𝑓fitalic_f is defined. For d𝔻(A)𝑑𝔻𝐴d\in\mathbb{D}(A)italic_d ∈ blackboard_D ( italic_A ), Supp(d)={aA|d(a)>0}𝑆𝑢𝑝𝑝𝑑conditional-set𝑎𝐴𝑑𝑎0Supp(d)=\{a\in A|d(a)>0\}italic_S italic_u italic_p italic_p ( italic_d ) = { italic_a ∈ italic_A | italic_d ( italic_a ) > 0 }. δxsubscript𝛿𝑥\delta_{x}italic_δ start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT is a Dirac distribution: δx(x)=1,yx:δx(y)=0:formulae-sequencesubscript𝛿𝑥𝑥1for-all𝑦𝑥subscript𝛿𝑥𝑦0\delta_{x}(x)=1,\forall y\neq x\colon\delta_{x}(y)=0italic_δ start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT ( italic_x ) = 1 , ∀ italic_y ≠ italic_x : italic_δ start_POSTSUBSCRIPT italic_x end_POSTSUBSCRIPT ( italic_y ) = 0.

2.1 Markov Decision Process (MDP)

MDPs are low-level mathematical models that describe both probabilistic and non-deterministic behavior in discrete time.

Definition 1 (MDP)

An MDP is a tuple M=(S,Act,T,s0)𝑀𝑆𝐴𝑐𝑡𝑇subscript𝑠0M=(S,Act,T,s_{0})italic_M = ( italic_S , italic_A italic_c italic_t , italic_T , italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ), where S𝑆Sitalic_S is the set of states, Act𝐴𝑐𝑡Actitalic_A italic_c italic_t is the set of actions, T:S×Act×S[0,1]:𝑇absent𝑆𝐴𝑐𝑡𝑆01T\colon S\times Act\times S\xrightarrow{}[0,1]italic_T : italic_S × italic_A italic_c italic_t × italic_S start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW [ 0 , 1 ] is a probabilistic transition function s.t. sS,aAct:sST(s,a,s){0,1}:formulae-sequencefor-all𝑠𝑆𝑎𝐴𝑐𝑡subscriptsuperscript𝑠𝑆𝑇𝑠𝑎superscript𝑠01\forall s\in S,a\in Act\colon\sum_{s^{\prime}\in S}T(s,a,s^{\prime})\in\{0,1\}∀ italic_s ∈ italic_S , italic_a ∈ italic_A italic_c italic_t : ∑ start_POSTSUBSCRIPT italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_S end_POSTSUBSCRIPT italic_T ( italic_s , italic_a , italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∈ { 0 , 1 } and s0Ssubscript𝑠0𝑆s_{0}\in Sitalic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ italic_S is the initial state.

An action aAct𝑎𝐴𝑐𝑡a\in Actitalic_a ∈ italic_A italic_c italic_t is enabled in sS𝑠𝑆s\in Sitalic_s ∈ italic_S if sST(s,a,s)=1subscriptsuperscript𝑠𝑆𝑇𝑠𝑎superscript𝑠1\sum_{s^{\prime}\in S}T(s,a,s^{\prime})=1∑ start_POSTSUBSCRIPT italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_S end_POSTSUBSCRIPT italic_T ( italic_s , italic_a , italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) = 1. In this case, T(s,a)𝔻(S)𝑇𝑠𝑎𝔻𝑆T(s,a)\in\mathbb{D}(S)italic_T ( italic_s , italic_a ) ∈ blackboard_D ( italic_S ) denotes the next state distribution after taking a𝑎aitalic_a in s𝑠sitalic_s, defined as T(s,a)(s)=T(s,a,s)𝑇𝑠𝑎superscript𝑠𝑇𝑠𝑎superscript𝑠T(s,a)(s^{\prime})=T(s,a,s^{\prime})italic_T ( italic_s , italic_a ) ( italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) = italic_T ( italic_s , italic_a , italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ). The intuitive behavior of an MDP is as follows: starting in s0subscript𝑠0s_{0}italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT an action a𝑎aitalic_a is chosen non-deterministically from those enabled in the current state sisubscript𝑠𝑖s_{i}italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT in each step, and the next state is sampled from T(si,a)𝑇subscript𝑠𝑖𝑎T(s_{i},a)italic_T ( italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_a ). A trace of an MDP is an alternating list of states and actions s0a1s1a2s2a3subscript𝑎1subscript𝑠0subscript𝑠1subscript𝑎2subscript𝑠2subscript𝑎3s_{0}\xrightarrow{a_{1}}s_{1}\xrightarrow{a_{2}}s_{2}\xrightarrow{a_{3}}\dotsitalic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_s start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW … such that i:T(si1,ai,si)>0:for-all𝑖𝑇subscript𝑠𝑖1subscript𝑎𝑖subscript𝑠𝑖0\forall i\colon T(s_{i-1},a_{i},s_{i})>0∀ italic_i : italic_T ( italic_s start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT , italic_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) > 0. Fixing a strategy for resolving the non-determinism, the set of traces can be equipped with a probability measure: intuitively, the probability of a trace is the product of the probability of landing in each state of the trace after taking the action specified by the strategy in the previous state. For a detailed formal treatment, see e.g. [20].

Given an MDP of the system behavior and a set of target (error) states E𝐸Eitalic_E, we want to compute (an upper approximation of) the probability of the set of traces involving a state in E𝐸Eitalic_E with non-determinism resolved by a maximizing strategy: max({(s0a1s1a2s2a3)|i+:siE})subscript𝑚𝑎𝑥conditional-setsubscript𝑎1subscript𝑠0subscript𝑠1subscript𝑎2subscript𝑠2subscript𝑎3:𝑖superscriptsubscript𝑠𝑖𝐸\mathbb{P}_{max}(\{(s_{0}\xrightarrow{a_{1}}s_{1}\xrightarrow{a_{2}}s_{2}% \xrightarrow{a_{3}}\dots)\ |\ \exists i\in\mathbb{Z}^{+}\colon s_{i}\in E\})blackboard_P start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT ( { ( italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_s start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW … ) | ∃ italic_i ∈ blackboard_Z start_POSTSUPERSCRIPT + end_POSTSUPERSCRIPT : italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_E } ). The result is the same if we make all target states absorbing, allowing us to restrict the analysis to finite traces.

2.1.1 Symbolic MDPs

Most real-life models are specified symbolically using state variables and operations on them. We assume that the MDP is given by a set of variables 𝒱𝒱\mathcal{V}caligraphic_V and a set of probabilistic guarded commands 𝒞𝒞\mathcal{C}caligraphic_C. Each v𝒱𝑣𝒱v\in\mathcal{V}italic_v ∈ caligraphic_V has a set R(v)𝑅𝑣\mathit{R}(v)italic_R ( italic_v ) of values it can take, and an initial value v0R(v)subscript𝑣0𝑅𝑣v_{0}\in\mathit{R}(v)italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∈ italic_R ( italic_v ). A valuation over 𝒱𝒱\mathcal{V}caligraphic_V is a function val:𝒱v𝒱R(v):𝑣𝑎𝑙absent𝒱subscript𝑣𝒱𝑅𝑣val\colon\mathcal{V}\xrightarrow{}\bigcup_{v\in\mathcal{V}}\mathit{R}(v)italic_v italic_a italic_l : caligraphic_V start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW ⋃ start_POSTSUBSCRIPT italic_v ∈ caligraphic_V end_POSTSUBSCRIPT italic_R ( italic_v ) s.t. v𝒱:val(v)R(v):for-all𝑣𝒱𝑣𝑎𝑙𝑣𝑅𝑣\forall{v\in\mathcal{V}}\colon val(v)\in\mathit{R}(v)∀ italic_v ∈ caligraphic_V : italic_v italic_a italic_l ( italic_v ) ∈ italic_R ( italic_v ), and 𝑉𝐴𝐿𝒱subscript𝑉𝐴𝐿𝒱\mathit{VAL}_{\mathcal{V}}italic_VAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT is the set of all valuations over 𝒱𝒱\mathcal{V}caligraphic_V. The initial valuation of the model is a valuation val0𝑣𝑎subscript𝑙0val_{0}italic_v italic_a italic_l start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT s.t. v𝒱:val0(v)=v0:for-all𝑣𝒱𝑣𝑎subscript𝑙0𝑣subscript𝑣0\forall v\in\mathcal{V}\colon val_{0}(v)=v_{0}∀ italic_v ∈ caligraphic_V : italic_v italic_a italic_l start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ( italic_v ) = italic_v start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. The state space of this MDP is a subset of 𝑉𝐴𝐿𝒱subscript𝑉𝐴𝐿𝒱\mathit{VAL}_{\mathcal{V}}italic_VAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT, and its initial state is val0𝑣𝑎subscript𝑙0val_{0}italic_v italic_a italic_l start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT.

Let 𝒱subscript𝒱\mathcal{B}_{\mathcal{V}}caligraphic_B start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT denote the set of Boolean expressions over 𝒱𝒱\mathcal{V}caligraphic_V, 𝒱vsuperscriptsubscript𝒱𝑣\mathcal{E}_{\mathcal{V}}^{v}caligraphic_E start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v end_POSTSUPERSCRIPT the set of expressions over 𝒱𝒱\mathcal{V}caligraphic_V that result in an element of R(v)𝑅𝑣\mathit{R}(v)italic_R ( italic_v ), and 𝒱=v𝒱𝒱vsubscript𝒱subscript𝑣𝒱superscriptsubscript𝒱𝑣\mathcal{E}_{\mathcal{V}}=\bigcup_{v\in\mathcal{V}}\mathcal{E}_{\mathcal{V}}^{v}caligraphic_E start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT = ⋃ start_POSTSUBSCRIPT italic_v ∈ caligraphic_V end_POSTSUBSCRIPT caligraphic_E start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v end_POSTSUPERSCRIPT. An assignment is a function a:𝒱𝒱:𝑎absent𝒱subscript𝒱a\colon\mathcal{V}\xrightarrow{}\mathcal{E}_{\mathcal{V}}italic_a : caligraphic_V start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW caligraphic_E start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT, such that v𝒱:a(v)𝒱v:for-all𝑣𝒱𝑎𝑣superscriptsubscript𝒱𝑣\forall{v\in\mathcal{V}}\colon a(v)\in\mathcal{E}_{\mathcal{V}}^{v}∀ italic_v ∈ caligraphic_V : italic_a ( italic_v ) ∈ caligraphic_E start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_v end_POSTSUPERSCRIPT. Let 𝒜𝒱subscript𝒜𝒱\mathcal{A}_{\mathcal{V}}caligraphic_A start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT denote the set of assignments for 𝒱𝒱\mathcal{V}caligraphic_V. eval(e,val)𝑒𝑣𝑎𝑙𝑒𝑣𝑎𝑙eval(e,val)italic_e italic_v italic_a italic_l ( italic_e , italic_v italic_a italic_l ) for e𝒱𝑒subscript𝒱e\in\mathcal{E}_{\mathcal{V}}italic_e ∈ caligraphic_E start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT and val𝑉𝐴𝐿𝒱𝑣𝑎𝑙subscript𝑉𝐴𝐿𝒱val\in\mathit{VAL}_{\mathcal{V}}italic_v italic_a italic_l ∈ italic_VAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT is the constant resulting from replacing each v𝒱𝑣𝒱v\in\mathcal{V}italic_v ∈ caligraphic_V in e𝑒eitalic_e with val(v)𝑣𝑎𝑙𝑣val(v)italic_v italic_a italic_l ( italic_v ). eval(a,val)𝑒𝑣𝑎𝑙𝑎𝑣𝑎𝑙eval(a,val)italic_e italic_v italic_a italic_l ( italic_a , italic_v italic_a italic_l ) for a𝒜𝒱𝑎subscript𝒜𝒱a\in\mathcal{A}_{\mathcal{V}}italic_a ∈ caligraphic_A start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT is a valuation val𝑣𝑎superscript𝑙val^{\prime}italic_v italic_a italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT such that v𝒱:val(v)=eval(a(v),val):for-all𝑣𝒱𝑣𝑎superscript𝑙𝑣𝑒𝑣𝑎𝑙𝑎𝑣𝑣𝑎𝑙\forall{v\in\mathcal{V}}\colon val^{\prime}(v)=eval(a(v),val)∀ italic_v ∈ caligraphic_V : italic_v italic_a italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_v ) = italic_e italic_v italic_a italic_l ( italic_a ( italic_v ) , italic_v italic_a italic_l ). eval(d,val)𝑒𝑣𝑎𝑙𝑑𝑣𝑎𝑙eval(d,val)italic_e italic_v italic_a italic_l ( italic_d , italic_v italic_a italic_l ) for d𝔻(𝒜𝒱)𝑑𝔻subscript𝒜𝒱d\in\mathbb{D}(\mathcal{A}_{\mathcal{V}})italic_d ∈ blackboard_D ( caligraphic_A start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT ) is the distribution d𝔻(𝑉𝐴𝐿𝒱)superscript𝑑𝔻subscript𝑉𝐴𝐿𝒱d^{\prime}\in\mathbb{D}(\mathit{VAL}_{\mathcal{V}})italic_d start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ blackboard_D ( italic_VAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT ) such that d(val)=a{aSupp(d)|eval(a,val)=val}d(a)superscript𝑑𝑣𝑎superscript𝑙subscript𝑎conditional-set𝑎𝑆𝑢𝑝𝑝𝑑𝑒𝑣𝑎𝑙𝑎𝑣𝑎𝑙𝑣𝑎superscript𝑙𝑑𝑎d^{\prime}(val^{\prime})=\sum_{a\in\{a\in Supp(d)\ |\ eval(a,val)=val^{\prime}% \}}d(a)italic_d start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_v italic_a italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) = ∑ start_POSTSUBSCRIPT italic_a ∈ { italic_a ∈ italic_S italic_u italic_p italic_p ( italic_d ) | italic_e italic_v italic_a italic_l ( italic_a , italic_v italic_a italic_l ) = italic_v italic_a italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT } end_POSTSUBSCRIPT italic_d ( italic_a ).

A command c𝒞𝑐𝒞c\in\mathcal{C}italic_c ∈ caligraphic_C consists of a guard gc𝒱subscript𝑔𝑐subscript𝒱g_{c}\in\mathcal{B}_{\mathcal{V}}italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ∈ caligraphic_B start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT and a result distribution over assignments dc𝔻(𝒜𝒱)subscript𝑑𝑐𝔻subscript𝒜𝒱d_{c}\in\mathbb{D}(\mathcal{A}_{\mathcal{V}})italic_d start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ∈ blackboard_D ( caligraphic_A start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT ). c𝑐citalic_c is enabled by val𝑣𝑎𝑙valitalic_v italic_a italic_l iff eval(gc,val)=𝑇𝑟𝑢𝑒𝑒𝑣𝑎𝑙subscript𝑔𝑐𝑣𝑎𝑙𝑇𝑟𝑢𝑒eval(g_{c},val)=\mathit{True}italic_e italic_v italic_a italic_l ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_v italic_a italic_l ) = italic_True. Let aicSupp(dc)subscriptsuperscript𝑎𝑐𝑖𝑆𝑢𝑝𝑝subscript𝑑𝑐a^{c}_{i}\in Supp(d_{c})italic_a start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S italic_u italic_p italic_p ( italic_d start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ) denote the i𝑖iitalic_ith assignment of command c𝑐citalic_c for a fixed ordering. The enabled actions in each state val𝑣𝑎𝑙valitalic_v italic_a italic_l of the represented MDP are the commands enabled by val𝑣𝑎𝑙valitalic_v italic_a italic_l, and taking the command c𝑐citalic_c results in the distribution eval(dc,val)𝑒𝑣𝑎𝑙subscript𝑑𝑐𝑣𝑎𝑙eval(d_{c},val)italic_e italic_v italic_a italic_l ( italic_d start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_v italic_a italic_l ). Widely used MDP description formats, like that of PRISM [21] or the JANI [4] format can be mapped to this low-level description.

Our running example is given by the following variables and commands:

𝒱={x,y},R(x)=R(y)=,x0=y0=0formulae-sequenceformulae-sequence𝒱𝑥𝑦𝑅𝑥𝑅𝑦subscript𝑥0subscript𝑦00\displaystyle\mathcal{V}=\{x,y\},\mathit{R}(x)=\mathit{R}(y)=\mathbb{N},x_{0}=% y_{0}=0caligraphic_V = { italic_x , italic_y } , italic_R ( italic_x ) = italic_R ( italic_y ) = blackboard_N , italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = italic_y start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = 0
𝐜𝟏:[𝑡𝑟𝑢𝑒] 0.8:(x:=x+1y:=y),0.2:(x:=xy:=y):subscript𝐜1delimited-[]𝑡𝑟𝑢𝑒0.8:assignsuperscript𝑥𝑥1superscript𝑦assign𝑦0.2:assignsuperscript𝑥𝑥superscript𝑦assign𝑦\displaystyle\mathbf{c_{1}}\colon[\mathit{true}]\ 0.8:(x^{\prime}:=x+1\wedge y% ^{\prime}:=y),0.2:(x^{\prime}:=x\wedge y^{\prime}:=y)bold_c start_POSTSUBSCRIPT bold_1 end_POSTSUBSCRIPT : [ italic_true ] 0.8 : ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_x + 1 ∧ italic_y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_y ) , 0.2 : ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_x ∧ italic_y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_y )
𝐜𝟐:[x==0] 1.0:(x:=1y:=2)\displaystyle\mathbf{c_{2}}\colon[x==0]\ 1.0:(x^{\prime}:=1\wedge y^{\prime}:=2)bold_c start_POSTSUBSCRIPT bold_2 end_POSTSUBSCRIPT : [ italic_x = = 0 ] 1.0 : ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := 1 ∧ italic_y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := 2 )
𝐜𝟑:[x==2y==2] 1.0:(x:=xy:=3)\displaystyle\mathbf{c_{3}}\colon[x==2\wedge y==2]\ 1.0:(x^{\prime}:=x\wedge y% ^{\prime}:=3)bold_c start_POSTSUBSCRIPT bold_3 end_POSTSUBSCRIPT : [ italic_x = = 2 ∧ italic_y = = 2 ] 1.0 : ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_x ∧ italic_y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := 3 )

c1subscript𝑐1c_{1}italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT is enabled in every state, and it increments x𝑥xitalic_x by 1111 with probability 0.80.80.80.8. c2subscript𝑐2c_{2}italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is enabled when x=0𝑥0x=0italic_x = 0, and always sets y𝑦yitalic_y to 2222 and x𝑥xitalic_x to 1111. c3subscript𝑐3c_{3}italic_c start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT is enabled when x𝑥xitalic_x is 2222 and y𝑦yitalic_y is 2, and sets y𝑦yitalic_y to 3333. 1(a) shows this MDP.

2.2 Lazy Abstraction

Abstraction-refinement methods mitigate state-space explosion by disregarding information present in the original concrete model to create an abstract model that is iteratively refined until a conclusion is reached. Lazy abstraction performs refinement on-the-fly and only on a subset of the state space.

For checking safety properties in the qualitative case, a conservative abstraction overapproximates the reachable state set. In the probabilistic setting, the probability of reaching a target state in the abstract model overapproximates that in the concrete one, which we will prove for the proposed algorithm.

We build on the lazy abstraction method of [27] for non-probabilistic systems. It constructs an Adaptive Simulation Graph (ASG) with nodes labeled by both a concrete and an abstract state: the concrete state represents all states in the abstract state regarding possible action sequences. The abstract state labels start very coarse and are refined as needed. Covering edges indicate that action sequences starting from the coverer node encompass those starting from the covered node, eliminating the need to explore paths from the covered node.

If an action is enabled in at least one concrete state described by the abstract label of a node, but not in the concrete label, the abstract label is strengthened by removing states with the action enabled. This operation can trigger additional strengthenings. The algorithm terminates once all enabled actions in non-covered nodes have been explored. The abstract labels in the finished ASG cover all reachable concrete states, and contain a target state only if one is reachable.

2.2.1 Abstract domains

Abstract states are described using an abstract domain. For a set of concrete states S𝑆Sitalic_S, an abstract domain D=(S^,,α,γ)𝐷^𝑆precedes-or-equals𝛼𝛾D=(\hat{S},\preceq,\alpha,\gamma)italic_D = ( over^ start_ARG italic_S end_ARG , ⪯ , italic_α , italic_γ ) consists of the abstract state set S^^𝑆\hat{S}over^ start_ARG italic_S end_ARG, a partial ordering S^×S^\preceq\ \subseteq\hat{S}\times\hat{S}⪯ ⊆ over^ start_ARG italic_S end_ARG × over^ start_ARG italic_S end_ARG, an abstraction function α:2SS^:𝛼absentsuperscript2𝑆^𝑆\alpha\colon 2^{S}\xrightarrow{}\hat{S}italic_α : 2 start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW over^ start_ARG italic_S end_ARG and a concretization function γ:S^2S:𝛾absent^𝑆superscript2𝑆\gamma\colon\hat{S}\xrightarrow{}2^{S}italic_γ : over^ start_ARG italic_S end_ARG start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW 2 start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT satisfying A2S,a^S^:α(A)a^Aγ(a^):formulae-sequencefor-all𝐴superscript2𝑆^𝑎^𝑆precedes-or-equals𝛼𝐴^𝑎iff𝐴𝛾^𝑎\forall{A\in 2^{S},\hat{a}\in\hat{S}}\colon\alpha(A)\preceq\hat{a}\iff A% \subseteq\gamma(\hat{a})∀ italic_A ∈ 2 start_POSTSUPERSCRIPT italic_S end_POSTSUPERSCRIPT , over^ start_ARG italic_a end_ARG ∈ over^ start_ARG italic_S end_ARG : italic_α ( italic_A ) ⪯ over^ start_ARG italic_a end_ARG ⇔ italic_A ⊆ italic_γ ( over^ start_ARG italic_a end_ARG ). γ𝛾\gammaitalic_γ lets us treat abstract states as sets of concrete states; we write “ss^𝑠^𝑠s\in\hat{s}italic_s ∈ over^ start_ARG italic_s end_ARG” for sγ(s^)𝑠𝛾^𝑠s\in\gamma(\hat{s})italic_s ∈ italic_γ ( over^ start_ARG italic_s end_ARG ) when γ𝛾\gammaitalic_γ is clear from the context. xyprecedes-or-equals𝑥𝑦x\preceq yitalic_x ⪯ italic_y denotes (x,y)𝑥𝑦precedes-or-equals(x,y)\in\ \preceq( italic_x , italic_y ) ∈ ⪯. S^^𝑆\hat{S}over^ start_ARG italic_S end_ARG has two special elements: top\top and bottom\bot satisfying γ()=S,γ()={}formulae-sequence𝛾top𝑆𝛾bottom\gamma(\top)=S,\gamma(\bot)=\{\}italic_γ ( ⊤ ) = italic_S , italic_γ ( ⊥ ) = { }.

Our lazy abstraction algorithms are domain agnostic, but need an abstract domain for S=𝑉𝐴𝐿𝒱𝑆subscript𝑉𝐴𝐿𝒱S=\mathit{VAL}_{\mathcal{V}}italic_S = italic_VAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT with the following operations.

For a𝒜𝒱𝑎subscript𝒜𝒱a\in\mathcal{A}_{\mathcal{V}}italic_a ∈ caligraphic_A start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT and s^S^^𝑠^𝑆\hat{s}\in\hat{S}over^ start_ARG italic_s end_ARG ∈ over^ start_ARG italic_S end_ARG, abstract post operator eval(a,s^)S^𝑒𝑣𝑎𝑙𝑎^𝑠^𝑆eval(a,\hat{s})\in\hat{S}italic_e italic_v italic_a italic_l ( italic_a , over^ start_ARG italic_s end_ARG ) ∈ over^ start_ARG italic_S end_ARG applies an assignment in the abstract state space: eval(a,s^)=α({eval(a,s)|ss^})𝑒𝑣𝑎𝑙𝑎^𝑠𝛼conditional-set𝑒𝑣𝑎𝑙𝑎𝑠𝑠^𝑠eval(a,\hat{s})=\alpha(\{eval(a,s)|s\in\hat{s}\})italic_e italic_v italic_a italic_l ( italic_a , over^ start_ARG italic_s end_ARG ) = italic_α ( { italic_e italic_v italic_a italic_l ( italic_a , italic_s ) | italic_s ∈ over^ start_ARG italic_s end_ARG } ). For b𝒱𝑏subscript𝒱b\in\mathcal{B}_{\mathcal{V}}italic_b ∈ caligraphic_B start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT, eval(b,s^){𝑇𝑟𝑢𝑒,𝐹𝑎𝑙𝑠𝑒,𝑈𝑛𝑘𝑛𝑜𝑤𝑛}𝑒𝑣𝑎𝑙𝑏^𝑠𝑇𝑟𝑢𝑒𝐹𝑎𝑙𝑠𝑒𝑈𝑛𝑘𝑛𝑜𝑤𝑛eval(b,\hat{s})\in\{\mathit{True},\mathit{False},\mathit{Unknown}\}italic_e italic_v italic_a italic_l ( italic_b , over^ start_ARG italic_s end_ARG ) ∈ { italic_True , italic_False , italic_Unknown } denotes evaluating b𝑏bitalic_b in the abstract state space: 𝑇𝑟𝑢𝑒𝑇𝑟𝑢𝑒\mathit{True}italic_True if b𝑏bitalic_b evaluates to 𝑇𝑟𝑢𝑒𝑇𝑟𝑢𝑒\mathit{True}italic_True for all ss^𝑠^𝑠s\in\hat{s}italic_s ∈ over^ start_ARG italic_s end_ARG, 𝐹𝑎𝑙𝑠𝑒𝐹𝑎𝑙𝑠𝑒\mathit{False}italic_False if b𝑏bitalic_b evaluates to 𝐹𝑎𝑙𝑠𝑒𝐹𝑎𝑙𝑠𝑒\mathit{False}italic_False for all ss^𝑠^𝑠s\in\hat{s}italic_s ∈ over^ start_ARG italic_s end_ARG, otherwise 𝑈𝑛𝑘𝑛𝑜𝑤𝑛𝑈𝑛𝑘𝑛𝑜𝑤𝑛\mathit{Unknown}italic_Unknown.

We also need a block operation: for an abstract state s^S^^𝑠^𝑆\hat{s}\in\hat{S}over^ start_ARG italic_s end_ARG ∈ over^ start_ARG italic_S end_ARG, a Boolean expression b𝒱𝑏subscript𝒱b\in\mathcal{B}_{\mathcal{V}}italic_b ∈ caligraphic_B start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT and a concrete state ss^𝑠^𝑠s\in\hat{s}italic_s ∈ over^ start_ARG italic_s end_ARG s.t. eval(b,s)=𝐹𝑎𝑙𝑠𝑒𝑒𝑣𝑎𝑙𝑏𝑠𝐹𝑎𝑙𝑠𝑒eval(b,s)=\mathit{False}italic_e italic_v italic_a italic_l ( italic_b , italic_s ) = italic_False, s^=block(s^,b,s)superscript^𝑠𝑏𝑙𝑜𝑐𝑘^𝑠𝑏𝑠\hat{s}^{\prime}=block(\hat{s},b,s)over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_b italic_l italic_o italic_c italic_k ( over^ start_ARG italic_s end_ARG , italic_b , italic_s ) is an abstract state s.t. s^s^,ss^,eval(b,s^)=𝐹𝑎𝑙𝑠𝑒formulae-sequenceprecedes-or-equalssuperscript^𝑠^𝑠formulae-sequence𝑠superscript^𝑠𝑒𝑣𝑎𝑙𝑏superscript^𝑠𝐹𝑎𝑙𝑠𝑒\hat{s}^{\prime}\preceq\hat{s},s\in\hat{s}^{\prime},eval(b,\hat{s}^{\prime})=% \mathit{False}over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⪯ over^ start_ARG italic_s end_ARG , italic_s ∈ over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_e italic_v italic_a italic_l ( italic_b , over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) = italic_False. Its goal is to give a new abstract state by removing at least those states from s^^𝑠\hat{s}over^ start_ARG italic_s end_ARG which satisfy b𝑏bitalic_b (potentially others as well) while kee** s𝑠sitalic_s.

The abstract states must be representable as Boolean expressions: for each s^S^^𝑠^𝑆\hat{s}\in\hat{S}over^ start_ARG italic_s end_ARG ∈ over^ start_ARG italic_S end_ARG a bs^𝒱subscript𝑏^𝑠subscript𝒱b_{\hat{s}}\in\mathcal{B}_{\mathcal{V}}italic_b start_POSTSUBSCRIPT over^ start_ARG italic_s end_ARG end_POSTSUBSCRIPT ∈ caligraphic_B start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT must exist s.t. sS:eval(bs^,s)=𝑇𝑟𝑢𝑒ss^:for-all𝑠𝑆𝑒𝑣𝑎𝑙subscript𝑏^𝑠𝑠𝑇𝑟𝑢𝑒iff𝑠^𝑠\forall s\in S\colon eval(b_{\hat{s}},s)=\mathit{True}\iff s\in\hat{s}∀ italic_s ∈ italic_S : italic_e italic_v italic_a italic_l ( italic_b start_POSTSUBSCRIPT over^ start_ARG italic_s end_ARG end_POSTSUBSCRIPT , italic_s ) = italic_True ⇔ italic_s ∈ over^ start_ARG italic_s end_ARG. Relying on this, we will freely use abstract states in place of Boolean expressions.

We will use the explicit value domain Dexplsubscript𝐷𝑒𝑥𝑝𝑙D_{expl}italic_D start_POSTSUBSCRIPT italic_e italic_x italic_p italic_l end_POSTSUBSCRIPT (abstract states correspond to tracking only a subset of 𝒱𝒱\mathcal{V}caligraphic_V) as an example throughout the paper which we implemented in our prototype, along with predicate abstraction Dpredsubscript𝐷𝑝𝑟𝑒𝑑D_{pred}italic_D start_POSTSUBSCRIPT italic_p italic_r italic_e italic_d end_POSTSUBSCRIPT (abstract states are Boolean predicates over 𝒱𝒱\mathcal{V}caligraphic_V). A partial function pval:𝒱v𝒱R(v):𝑝𝑣𝑎𝑙𝒱subscript𝑣𝒱𝑅𝑣pval\colon\mathcal{V}\hookrightarrow\bigcup_{v\in\mathcal{V}}\mathit{R}(v)italic_p italic_v italic_a italic_l : caligraphic_V ↪ ⋃ start_POSTSUBSCRIPT italic_v ∈ caligraphic_V end_POSTSUBSCRIPT italic_R ( italic_v ) s.t. vSupp(pval):pval(v)R(v):for-all𝑣𝑆𝑢𝑝𝑝𝑝𝑣𝑎𝑙𝑝𝑣𝑎𝑙𝑣𝑅𝑣\forall v\in Supp(pval)\colon pval(v)\in\mathit{R}(v)∀ italic_v ∈ italic_S italic_u italic_p italic_p ( italic_p italic_v italic_a italic_l ) : italic_p italic_v italic_a italic_l ( italic_v ) ∈ italic_R ( italic_v ) is called a partial valuation, 𝑃𝑉𝐴𝐿𝒱subscript𝑃𝑉𝐴𝐿𝒱\mathit{PVAL}_{\mathcal{V}}italic_PVAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT denotes the set of all partial valuations over 𝒱𝒱\mathcal{V}caligraphic_V. Description of these domains and their operations can be seen in Table 1, assuming a concrete state set 𝑉𝐴𝐿𝒱subscript𝑉𝐴𝐿𝒱\mathit{VAL}_{\mathcal{V}}italic_VAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT.

The lazy abstraction algorithm does not use the abstraction and concretization functions α𝛼\alphaitalic_α and γ𝛾\gammaitalic_γ and the abstract post operator eval(a,s^)𝑒𝑣𝑎𝑙𝑎^𝑠eval(a,\hat{s})italic_e italic_v italic_a italic_l ( italic_a , over^ start_ARG italic_s end_ARG ) directly, only as arguments of a block operation (see later), so they need not be efficiently computable if the corresponding block operation can be implemented efficiently.

EXPL PRED
S^^𝑆\hat{S}over^ start_ARG italic_S end_ARG 𝑃𝑉𝐴𝐿𝒱{expl}subscript𝑃𝑉𝐴𝐿𝒱subscriptbottom𝑒𝑥𝑝𝑙\mathit{PVAL}_{\mathcal{V}}\cup\{\bot_{expl}\}italic_PVAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT ∪ { ⊥ start_POSTSUBSCRIPT italic_e italic_x italic_p italic_l end_POSTSUBSCRIPT } 𝒱subscript𝒱\mathcal{B}_{\mathcal{V}}caligraphic_B start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT
precedes-or-equals\preceq pvalpval(Supp(pval)Supp(pval)vSupp(pval):pval(v)=pval(v))pval\preceq pval^{\prime}\iff(Supp(pval^{\prime})\subseteq Supp(pval)\wedge% \forall{v\in Supp(pval^{\prime})}\colon pval(v)=pval^{\prime}(v))italic_p italic_v italic_a italic_l ⪯ italic_p italic_v italic_a italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⇔ ( italic_S italic_u italic_p italic_p ( italic_p italic_v italic_a italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ⊆ italic_S italic_u italic_p italic_p ( italic_p italic_v italic_a italic_l ) ∧ ∀ italic_v ∈ italic_S italic_u italic_p italic_p ( italic_p italic_v italic_a italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) : italic_p italic_v italic_a italic_l ( italic_v ) = italic_p italic_v italic_a italic_l start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_v ) ) b1b2(b1b2)iffprecedes-or-equalssubscript𝑏1subscript𝑏2subscript𝑏1subscript𝑏2b_{1}\preceq b_{2}\iff(b_{1}\implies b_{2})italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ⪯ italic_b start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ⇔ ( italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ⟹ italic_b start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT )
α𝛼\alphaitalic_α Supp(α(A))={v𝒱|kR(v):valA:val(v)=k}𝑆𝑢𝑝𝑝𝛼𝐴conditional-set𝑣𝒱:𝑘𝑅𝑣for-all𝑣𝑎𝑙𝐴:𝑣𝑎𝑙𝑣𝑘Supp(\alpha(A))=\{v\in\mathcal{V}\ |\ \exists k\in\mathit{R}(v)\colon\forall val% \in A\colon val(v)=k\}italic_S italic_u italic_p italic_p ( italic_α ( italic_A ) ) = { italic_v ∈ caligraphic_V | ∃ italic_k ∈ italic_R ( italic_v ) : ∀ italic_v italic_a italic_l ∈ italic_A : italic_v italic_a italic_l ( italic_v ) = italic_k } and vSupp(α(A)):α(A)(v)=kvalA:val(v)=k:for-all𝑣𝑆𝑢𝑝𝑝𝛼𝐴𝛼𝐴𝑣𝑘ifffor-all𝑣𝑎𝑙𝐴:𝑣𝑎𝑙𝑣𝑘\forall v\in Supp(\alpha(A))\colon\alpha(A)(v)=k\iff\forall val\in A\colon val% (v)=k∀ italic_v ∈ italic_S italic_u italic_p italic_p ( italic_α ( italic_A ) ) : italic_α ( italic_A ) ( italic_v ) = italic_k ⇔ ∀ italic_v italic_a italic_l ∈ italic_A : italic_v italic_a italic_l ( italic_v ) = italic_k 𝑣𝑎𝑙A(v𝒱:v=𝑣𝑎𝑙(v))\bigvee_{\mathit{val}\in A}(\forall v\in\mathcal{V}\colon v=\mathit{val}(v))⋁ start_POSTSUBSCRIPT italic_val ∈ italic_A end_POSTSUBSCRIPT ( ∀ italic_v ∈ caligraphic_V : italic_v = italic_val ( italic_v ) )
γ𝛾\gammaitalic_γ γ(pval)={val𝑉𝐴𝐿𝒱|vSupp(pval):val(v)=pval(v)}𝛾𝑝𝑣𝑎𝑙conditional-set𝑣𝑎𝑙subscript𝑉𝐴𝐿𝒱:for-all𝑣𝑆𝑢𝑝𝑝𝑝𝑣𝑎𝑙𝑣𝑎𝑙𝑣𝑝𝑣𝑎𝑙𝑣\gamma(pval)=\{val\in\mathit{VAL}_{\mathcal{V}}\ |\ \forall v\in Supp(pval)% \colon val(v)=pval(v)\}italic_γ ( italic_p italic_v italic_a italic_l ) = { italic_v italic_a italic_l ∈ italic_VAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT | ∀ italic_v ∈ italic_S italic_u italic_p italic_p ( italic_p italic_v italic_a italic_l ) : italic_v italic_a italic_l ( italic_v ) = italic_p italic_v italic_a italic_l ( italic_v ) } γ(b)={val𝑉𝐴𝐿𝒱|𝑣𝑎𝑙b)}\gamma(b)=\{val\in\mathit{VAL}_{\mathcal{V}}\ |\ \mathit{val}\vdash b)\}italic_γ ( italic_b ) = { italic_v italic_a italic_l ∈ italic_VAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT | italic_val ⊢ italic_b ) }
,topbottom\top,\bot⊤ , ⊥ top\top is the empty valuation, bottom\bot is a non-valuation element explsubscriptbottom𝑒𝑥𝑝𝑙\bot_{expl}⊥ start_POSTSUBSCRIPT italic_e italic_x italic_p italic_l end_POSTSUBSCRIPT representing contradiction =𝑇𝑟𝑢𝑒,=𝐹𝑎𝑙𝑠𝑒\top=\mathit{True},\bot=\mathit{False}⊤ = italic_True , ⊥ = italic_False
Boolean representation bpval=(vSupp(pval)v=pval(v))subscript𝑏𝑝𝑣𝑎𝑙subscript𝑣𝑆𝑢𝑝𝑝𝑝𝑣𝑎𝑙𝑣𝑝𝑣𝑎𝑙𝑣b_{pval}=(\bigwedge_{v\in Supp(pval)}v=pval(v))italic_b start_POSTSUBSCRIPT italic_p italic_v italic_a italic_l end_POSTSUBSCRIPT = ( ⋀ start_POSTSUBSCRIPT italic_v ∈ italic_S italic_u italic_p italic_p ( italic_p italic_v italic_a italic_l ) end_POSTSUBSCRIPT italic_v = italic_p italic_v italic_a italic_l ( italic_v ) ) identity (already a Boolean expression)
eval(b,s^)𝑒𝑣𝑎𝑙𝑏^𝑠eval(b,\hat{s})italic_e italic_v italic_a italic_l ( italic_b , over^ start_ARG italic_s end_ARG ) Substituting the values in Supp(s^)𝑆𝑢𝑝𝑝^𝑠Supp(\hat{{s}})italic_S italic_u italic_p italic_p ( over^ start_ARG italic_s end_ARG ), and deciding the satisfiability of the result. 𝑇𝑟𝑢𝑒𝑇𝑟𝑢𝑒\mathit{True}italic_True if s^b^𝑠𝑏\hat{s}\implies bover^ start_ARG italic_s end_ARG ⟹ italic_b, 𝐹𝑎𝑙𝑠𝑒𝐹𝑎𝑙𝑠𝑒\mathit{False}italic_False if s^¬b^𝑠𝑏\hat{s}\implies\neg bover^ start_ARG italic_s end_ARG ⟹ ¬ italic_b, 𝑈𝑛𝑘𝑛𝑜𝑤𝑛𝑈𝑛𝑘𝑛𝑜𝑤𝑛\mathit{Unknown}italic_Unknown if neither.
eval(a,s^)𝑒𝑣𝑎𝑙𝑎^𝑠eval(a,\hat{s})italic_e italic_v italic_a italic_l ( italic_a , over^ start_ARG italic_s end_ARG ) Substituting known variables into the result expressions. If this results in a constant, that is the result, else unknown. strongest postcondition
Table 1: Properties and operations of the explicit value and predicate domains

3 Lazy Abstraction for MDPs

Refer to caption
(a)
Refer to caption
(b)
Figure 1: Our running example MDP (a) and an in-progress PASG for it (b). The result of a refinement step is marked with orange
Algorithm 1 PASG construction
1:N{n0};ET{};EC{}formulae-sequence𝑁subscript𝑛0formulae-sequencesubscript𝐸𝑇subscript𝐸𝐶N\leftarrow\{n_{0}\};E_{T}\leftarrow\{\};E_{C}\leftarrow\{\}italic_N ← { italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT } ; italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ← { } ; italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ← { } ; Lc(n0)s0;La(n0)formulae-sequencesubscript𝐿𝑐subscript𝑛0subscript𝑠0subscript𝐿𝑎subscript𝑛0topL_{c}(n_{0})\leftarrow s_{0};L_{a}(n_{0})\leftarrow\topitalic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ← italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ; italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ← ⊤ ; waitlist {n0}absentsubscript𝑛0\leftarrow\{n_{0}\}← { italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT } \Whilewaitlist is not empty
2:n𝑛absentn\initalic_n ∈ waitlist ; waitlist \leftarrow waitlist{n}𝑛\setminus\{n\}∖ { italic_n } \IfncnN:Lc(n)La(nc)nc not covered:subscript𝑛𝑐𝑛𝑁subscript𝐿𝑐𝑛subscript𝐿𝑎subscript𝑛𝑐subscript𝑛𝑐 not covered\exists\ n_{c}\neq n\in N:L_{c}(n)\in L_{a}(n_{c})\wedge n_{c}\text{ not covered}∃ italic_n start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ≠ italic_n ∈ italic_N : italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ) ∧ italic_n start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT not covered
3:ECEC{(n,nc)}subscript𝐸𝐶subscript𝐸𝐶𝑛subscript𝑛𝑐E_{C}\leftarrow E_{C}\cup\{(n,n_{c})\}italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ← italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ∪ { ( italic_n , italic_n start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ) } ; Block(n,¬La(nc))𝐵𝑙𝑜𝑐𝑘𝑛subscript𝐿𝑎subscript𝑛𝑐Block(n,\neg L_{a}(n_{c}))italic_B italic_l italic_o italic_c italic_k ( italic_n , ¬ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ) ) \Else \triangleright Expansion \ForAllc𝒞𝑐𝒞c\in\mathcal{C}italic_c ∈ caligraphic_C \Ifeval(gc,Lc(n))=𝑇𝑟𝑢𝑒𝑒𝑣𝑎𝑙subscript𝑔𝑐subscript𝐿𝑐𝑛𝑇𝑟𝑢𝑒eval(g_{c},L_{c}(n))=\mathit{True}italic_e italic_v italic_a italic_l ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) = italic_True \triangleright c𝑐citalic_c is enabled in Lc(n)subscript𝐿𝑐𝑛L_{c}(n)italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) \Ifc𝑐citalic_c is target command mark n𝑛nitalic_n as target \EndIf\Ifeval(gc,La(n))=𝑈𝑛𝑘𝑛𝑜𝑤𝑛𝑒𝑣𝑎𝑙subscript𝑔𝑐subscript𝐿𝑎𝑛𝑈𝑛𝑘𝑛𝑜𝑤𝑛eval(g_{c},L_{a}(n))=\mathit{Unknown}italic_e italic_v italic_a italic_l ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) ) = italic_Unknown
4:Block(n𝑛nitalic_n, ¬gcsubscript𝑔𝑐\neg g_{c}¬ italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT) \triangleright c𝑐citalic_c can be disabled in Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT, so we refine \EndIf\ForAllaiSupp(dc)subscript𝑎𝑖𝑆𝑢𝑝𝑝subscript𝑑𝑐a_{i}\in Supp(d_{c})italic_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S italic_u italic_p italic_p ( italic_d start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT )
5:NN{nnew}𝑁𝑁subscript𝑛𝑛𝑒𝑤N\leftarrow N\cup\{n_{new}\}italic_N ← italic_N ∪ { italic_n start_POSTSUBSCRIPT italic_n italic_e italic_w end_POSTSUBSCRIPT }; Lc(nnew)eval(a,Lc(n))subscript𝐿𝑐subscript𝑛𝑛𝑒𝑤𝑒𝑣𝑎𝑙𝑎subscript𝐿𝑐𝑛L_{c}(n_{new})\leftarrow eval(a,L_{c}(n))italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT italic_n italic_e italic_w end_POSTSUBSCRIPT ) ← italic_e italic_v italic_a italic_l ( italic_a , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ); La(nnew)subscript𝐿𝑎subscript𝑛𝑛𝑒𝑤topL_{a}(n_{new})\leftarrow\topitalic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT italic_n italic_e italic_w end_POSTSUBSCRIPT ) ← ⊤; δ(nnew)pic𝛿subscript𝑛𝑛𝑒𝑤subscriptsuperscript𝑝𝑐𝑖\delta(n_{new})\leftarrow p^{c}_{i}italic_δ ( italic_n start_POSTSUBSCRIPT italic_n italic_e italic_w end_POSTSUBSCRIPT ) ← italic_p start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT waitlist \leftarrow waitlist nnewsubscript𝑛𝑛𝑒𝑤\cup\ n_{new}∪ italic_n start_POSTSUBSCRIPT italic_n italic_e italic_w end_POSTSUBSCRIPT \EndFor
6:ETET(n,c,δ)subscript𝐸𝑇subscript𝐸𝑇𝑛𝑐𝛿E_{T}\leftarrow E_{T}\cup(n,c,\delta)italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ← italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ∪ ( italic_n , italic_c , italic_δ ) \ElsIfeval(g(c),La(n))=𝑈𝑛𝑘𝑛𝑜𝑤𝑛𝑒𝑣𝑎𝑙𝑔𝑐subscript𝐿𝑎𝑛𝑈𝑛𝑘𝑛𝑜𝑤𝑛eval(g(c),L_{a}(n))=\mathit{Unknown}italic_e italic_v italic_a italic_l ( italic_g ( italic_c ) , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) ) = italic_Unknown
7:Block(n𝑛nitalic_n, gcsubscript𝑔𝑐g_{c}italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT) \triangleright c𝑐citalic_c can be enabled in Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT, but is not in Lcsubscript𝐿𝑐L_{c}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT, so we refine \EndIf\EndFor\EndIf\EndWhile
8:\ReturnPASG (N,ET,EC,Lc,La)𝑁subscript𝐸𝑇subscript𝐸𝐶subscript𝐿𝑐subscript𝐿𝑎(N,E_{T},E_{C},L_{c},L_{a})( italic_N , italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT , italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT )
Algorithm 2 Block(n, ϕitalic-ϕ\phiitalic_ϕ)
eval(ϕ,Lc(n))=𝐹𝑎𝑙𝑠𝑒𝑒𝑣𝑎𝑙italic-ϕsubscript𝐿𝑐𝑛𝐹𝑎𝑙𝑠𝑒eval(\phi,L_{c}(n))=\mathit{False}italic_e italic_v italic_a italic_l ( italic_ϕ , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) = italic_False
9:La(n)block(La(n),ϕ,Lc(n))subscript𝐿𝑎𝑛𝑏𝑙𝑜𝑐𝑘subscript𝐿𝑎𝑛italic-ϕsubscript𝐿𝑐𝑛L_{a}(n)\leftarrow block(L_{a}(n),\phi,L_{c}(n))italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) ← italic_b italic_l italic_o italic_c italic_k ( italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) , italic_ϕ , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) \ForAll(n,n)ECsuperscript𝑛𝑛subscript𝐸𝐶(n^{\prime},n)\in E_{C}( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_n ) ∈ italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT \triangleright Check nodes nsuperscript𝑛n^{\prime}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT covered by n𝑛nitalic_n \IfLc(n)La(n)subscript𝐿𝑐superscript𝑛subscript𝐿𝑎𝑛L_{c}(n^{\prime})\notin L_{a}(n)italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∉ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) \triangleright Remove if new Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT cannot cover
10:ECEC(n,n);waitlistwaitlistnformulae-sequencesubscript𝐸𝐶subscript𝐸𝐶superscript𝑛𝑛𝑤𝑎𝑖𝑡𝑙𝑖𝑠𝑡𝑤𝑎𝑖𝑡𝑙𝑖𝑠𝑡superscript𝑛E_{C}\leftarrow E_{C}\setminus(n^{\prime},n);waitlist\leftarrow waitlist\cup\ % n^{\prime}italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ← italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ∖ ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_n ) ; italic_w italic_a italic_i italic_t italic_l italic_i italic_s italic_t ← italic_w italic_a italic_i italic_t italic_l italic_i italic_s italic_t ∪ italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT \Else Block(nsuperscript𝑛n^{\prime}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, ¬La(n)subscript𝐿𝑎𝑛\neg L_{a}(n)¬ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n )) \triangleright Else refine covered node \EndIf\EndFor
11:Let e=(npre,c,d)ETs.t.nSupp(d)formulae-sequence𝑒subscript𝑛𝑝𝑟𝑒𝑐𝑑subscript𝐸𝑇𝑠𝑡𝑛𝑆𝑢𝑝𝑝𝑑e=(n_{pre},c,d)\in E_{T}\ s.t.\ n\in Supp(d)italic_e = ( italic_n start_POSTSUBSCRIPT italic_p italic_r italic_e end_POSTSUBSCRIPT , italic_c , italic_d ) ∈ italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT italic_s . italic_t . italic_n ∈ italic_S italic_u italic_p italic_p ( italic_d ) \triangleright npresubscript𝑛𝑝𝑟𝑒n_{pre}italic_n start_POSTSUBSCRIPT italic_p italic_r italic_e end_POSTSUBSCRIPT denotes the parent of this node
12:Let aicsubscriptsuperscript𝑎𝑐𝑖a^{c}_{i}italic_a start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT s.t. nie=nsubscriptsuperscript𝑛𝑒𝑖𝑛n^{e}_{i}=nitalic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_n \triangleright aicsubscriptsuperscript𝑎𝑐𝑖a^{c}_{i}italic_a start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is the assignment which resulted in the node n𝑛nitalic_n
13:Block(npre,¬eval1(aic,La(n))subscript𝑛𝑝𝑟𝑒𝑒𝑣𝑎superscript𝑙1subscriptsuperscript𝑎𝑐𝑖subscript𝐿𝑎𝑛n_{pre},\neg eval^{-1}(a^{c}_{i},L_{a}(n))italic_n start_POSTSUBSCRIPT italic_p italic_r italic_e end_POSTSUBSCRIPT , ¬ italic_e italic_v italic_a italic_l start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_a start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) )) \triangleright Making sure that 2 still holds for Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT
\Require

Now we adapt the lazy algorithm to symbolic MDPs given by a variable set 𝒱𝒱\mathcal{V}caligraphic_V and a command set 𝒞0subscript𝒞0\mathcal{C}_{0}caligraphic_C start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. Given a target formula ϕ𝒱italic-ϕsubscript𝒱\phi\in\mathcal{E}_{\mathcal{V}}italic_ϕ ∈ caligraphic_E start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT, the goal is to compute the maximal probability of reaching a state s𝑠sitalic_s s.t. eval(ϕ,s)=𝑇𝑟𝑢𝑒𝑒𝑣𝑎𝑙italic-ϕ𝑠𝑇𝑟𝑢𝑒eval(\phi,s)=\mathit{True}italic_e italic_v italic_a italic_l ( italic_ϕ , italic_s ) = italic_True.

We select abstract domain (S^,,α,γ)^𝑆precedes-or-equals𝛼𝛾(\hat{S},\preceq,\alpha,\gamma)( over^ start_ARG italic_S end_ARG , ⪯ , italic_α , italic_γ ). The set of commands is extended with a target command: 𝒞=𝒞0{(ϕ,δid)\mathcal{C}=\mathcal{C}_{0}\cup\{(\phi,\delta_{id})caligraphic_C = caligraphic_C start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∪ { ( italic_ϕ , italic_δ start_POSTSUBSCRIPT italic_i italic_d end_POSTSUBSCRIPT )}, where id𝑖𝑑iditalic_i italic_d is the identity assignment. A node is a target if this command is enabled in it. This ensures that the finished ASG contains a node labeled with a target state exactly if a target state is reachable in the concrete state space [27].

3.0.1 Abstract model

We use a probabilistic extension of the ASG. A direct adaptation of the non-probabilistic lazy algorithm by switching to probabilistic actions would overapproximate the target probability with no control over the approximation. Therefore, we use a stricter, symmetric representation constraint for the relation between the concrete and abstract labels of a Probabilistic ASG node.

Definition 2 (PASG)

A Probabilistic Adaptive Simulation Graph is a tuple (N,ET,EC,Lc,La)𝑁subscript𝐸𝑇subscript𝐸𝐶subscript𝐿𝑐subscript𝐿𝑎(N,E_{T},E_{C},L_{c},L_{a})( italic_N , italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT , italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ), where N𝑁Nitalic_N is a set of nodes, ETN×𝒞×𝔻(N)subscript𝐸𝑇𝑁𝒞𝔻𝑁E_{T}\subseteq N\times\mathcal{C}\times\mathbb{D}(N)italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ⊆ italic_N × caligraphic_C × blackboard_D ( italic_N ) is a set of transition “edges” from nodes to node distributions labeled with commands, ECN×Nsubscript𝐸𝐶𝑁𝑁E_{C}\subseteq N\times Nitalic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ⊆ italic_N × italic_N is a set of directed covering edges, Lc:N𝑉𝐴𝐿𝒱:subscript𝐿𝑐absent𝑁subscript𝑉𝐴𝐿𝒱L_{c}:N\xrightarrow{}\mathit{VAL}_{\mathcal{V}}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT : italic_N start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW italic_VAL start_POSTSUBSCRIPT caligraphic_V end_POSTSUBSCRIPT is the concrete labeling function, La:NS^:subscript𝐿𝑎absent𝑁^𝑆L_{a}:N\xrightarrow{}\hat{S}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT : italic_N start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW over^ start_ARG italic_S end_ARG is the abstract labeling function.

For an edge e=(n,c,d)ET𝑒𝑛𝑐𝑑subscript𝐸𝑇e=(n,c,d)\in E_{T}italic_e = ( italic_n , italic_c , italic_d ) ∈ italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT, niesubscriptsuperscript𝑛𝑒𝑖n^{e}_{i}italic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is the i𝑖iitalic_ith element of Supp(d)𝑆𝑢𝑝𝑝𝑑Supp(d)italic_S italic_u italic_p italic_p ( italic_d ) for a fixed ordering. A PASG is well-labeled, if it satisfies the constraints in Table 2. The main difference from the original ASG is that in 2, we use a distribution of results instead of a single one, and 2 requires the set of enabled actions to be exactly the same in all concrete states contained in the abstract label.

In the original lazy algorithm, refinement is performed when an action disabled in Lc(n)subscript𝐿𝑐𝑛L_{c}(n)italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ), but enabled in some element of the abstract label La(n)subscript𝐿𝑎𝑛L_{a}(n)italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ), and we refine by blocking the guard of the action from La(n)subscript𝐿𝑎𝑛L_{a}(n)italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ). Here, we also refine when an action is enabled in Lc(n)subscript𝐿𝑐𝑛L_{c}(n)italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) but disabled in some element of La(n)subscript𝐿𝑎𝑛L_{a}(n)italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) by blocking the negation of the guard. We also need to adapt soundness to probabilities, see LABEL:thm:bi_soundness. 2 is a technical constraint to make our proofs easier.

Constraint Formalisation
A1) Abstract label contains the concrete label: nN:Lc(n)La(n):for-all𝑛𝑁subscript𝐿𝑐𝑛subscript𝐿𝑎𝑛\forall n\in N:L_{c}(n)\in L_{a}(n)∀ italic_n ∈ italic_N : italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n )
A2) Concrete label exactly represents the whole abstract label with respect to the enabled commands nN:c𝒞:eval(gc,Lc(n))=eval(gc,La(n)):for-all𝑛𝑁for-all𝑐𝒞:𝑒𝑣𝑎𝑙subscript𝑔𝑐subscript𝐿𝑐𝑛𝑒𝑣𝑎𝑙subscript𝑔𝑐subscript𝐿𝑎𝑛\forall n\in N:\forall c\in\mathcal{C}:eval(g_{c},L_{c}(n))=eval(g_{c},L_{a}(n))∀ italic_n ∈ italic_N : ∀ italic_c ∈ caligraphic_C : italic_e italic_v italic_a italic_l ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) = italic_e italic_v italic_a italic_l ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) )
B1) The command of a transition edge is enabled in the concrete label of the source (n,c,)ET:𝑒𝑣𝑎𝑙(gc,Lc(n))=𝑇𝑟𝑢𝑒:for-all𝑛𝑐subscript𝐸𝑇𝑒𝑣𝑎𝑙subscript𝑔𝑐subscript𝐿𝑐𝑛𝑇𝑟𝑢𝑒\forall(n,c,\cdot)\in E_{T}\colon{}\mathit{eval}(g_{c},L_{c}(n))=\mathit{True}∀ ( italic_n , italic_c , ⋅ ) ∈ italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT : italic_eval ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) = italic_True
B2) For transition edges e=(n,c,d)ET𝑒𝑛𝑐𝑑subscript𝐸𝑇e=(n,c,d)\in E_{T}italic_e = ( italic_n , italic_c , italic_d ) ∈ italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT, the i𝑖iitalic_ith result node is consistent with the i𝑖iitalic_ith assignment: same probability, concrete label is the result of the assignment, abstract label overapproximates the result d(nie)=dc(aic)𝑑subscriptsuperscript𝑛𝑒𝑖subscript𝑑𝑐subscriptsuperscript𝑎𝑐𝑖d(n^{e}_{i})=d_{c}(a^{c}_{i})italic_d ( italic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) = italic_d start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_a start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) Lc(nie)=eval(aic,Lc(n))subscript𝐿𝑐subscriptsuperscript𝑛𝑒𝑖𝑒𝑣𝑎𝑙subscriptsuperscript𝑎𝑐𝑖subscript𝐿𝑐𝑛L_{c}(n^{e}_{i})=eval(a^{c}_{i},L_{c}(n))italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) = italic_e italic_v italic_a italic_l ( italic_a start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) eval(aic,La(n))La(nie)precedes-or-equals𝑒𝑣𝑎𝑙subscriptsuperscript𝑎𝑐𝑖subscript𝐿𝑎𝑛subscript𝐿𝑎subscriptsuperscript𝑛𝑒𝑖eval(a^{c}_{i},L_{a}(n))\preceq L_{a}(n^{e}_{i})italic_e italic_v italic_a italic_l ( italic_a start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) ) ⪯ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT )
C1) Abstract label of covering node contains the concrete label of covered node (n,n)EC:Lc(n)La(n):for-all𝑛superscript𝑛subscript𝐸𝐶subscript𝐿𝑐𝑛subscript𝐿𝑎superscript𝑛\forall(n,n^{\prime})\in E_{C}\colon L_{c}(n)\in L_{a}(n^{\prime})∀ ( italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∈ italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT : italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT )
C2) Covering node is at least as abstract as the covered node (n,n)EC:La(n)La(n):for-all𝑛superscript𝑛subscript𝐸𝐶precedes-or-equalssubscript𝐿𝑎𝑛subscript𝐿𝑎superscript𝑛\forall(n,n^{\prime})\in E_{C}\colon L_{a}(n)\preceq L_{a}(n^{\prime})∀ ( italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∈ italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT : italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) ⪯ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT )
C3) Covering node is not covered (n,n)EC:¬(n,n′′)EC:for-all𝑛superscript𝑛subscript𝐸𝐶superscript𝑛superscript𝑛′′subscript𝐸𝐶\forall(n,n^{\prime})\in E_{C}\colon\neg\exists(n^{\prime},n^{\prime\prime})% \in E_{C}∀ ( italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∈ italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT : ¬ ∃ ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_n start_POSTSUPERSCRIPT ′ ′ end_POSTSUPERSCRIPT ) ∈ italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT
D1) At most one node labeled with a given concrete label can be non-covered n,nN:nnLc(n)=Lc(n)(n′′:(n,n′′)EC(n,n′′)EC)\forall n,n^{\prime}\in N\colon n\neq n^{\prime}\wedge L_{c}(n)=L_{c}(n^{% \prime})\implies(\exists n^{\prime\prime}\colon(n,n^{\prime\prime})\in E_{C}% \vee(n^{\prime},n^{\prime\prime})\in E_{C})∀ italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∈ italic_N : italic_n ≠ italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∧ italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) = italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ⟹ ( ∃ italic_n start_POSTSUPERSCRIPT ′ ′ end_POSTSUPERSCRIPT : ( italic_n , italic_n start_POSTSUPERSCRIPT ′ ′ end_POSTSUPERSCRIPT ) ∈ italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT ∨ ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_n start_POSTSUPERSCRIPT ′ ′ end_POSTSUPERSCRIPT ) ∈ italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT )
Table 2: PASG constraints
Example 1

1(b) shows an example PASG with black (the orange part is a refinement example explained later). The abstract label tracks only x𝑥xitalic_x in all nodes (this could differ from node to node in general). Lcsubscript𝐿𝑐L_{c}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT is contained in Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT for all nodes as the value of x𝑥xitalic_x is the same in Lcsubscript𝐿𝑐L_{c}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT and Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT (2).

n0subscript𝑛0n_{0}italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT covers n1subscript𝑛1n_{1}italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, n2subscript𝑛2n_{2}italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT covers n4subscript𝑛4n_{4}italic_n start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT and n3subscript𝑛3n_{3}italic_n start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT, satisfying 2 and 2. n3subscript𝑛3n_{3}italic_n start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT could cover n4subscript𝑛4n_{4}italic_n start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT according to the labels, but it would violate 2.

This PASG is unfinished, n4subscript𝑛4n_{4}italic_n start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT and n5subscript𝑛5n_{5}italic_n start_POSTSUBSCRIPT 5 end_POSTSUBSCRIPT are not expanded. n0subscript𝑛0n_{0}italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is an example for the remaining constraints. 2 is satisfied, as tracking x𝑥xitalic_x in Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT is enough to disable c3subscript𝑐3c_{3}italic_c start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT, and both c1subscript𝑐1c_{1}italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and c2subscript𝑐2c_{2}italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT are enabled in Lcsubscript𝐿𝑐L_{c}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT and everywhere in Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT. As the outgoing edges are labeled with c1subscript𝑐1c_{1}italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and c2subscript𝑐2c_{2}italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, 2 is satisfied. Let e=(n0,c1,d)𝑒subscript𝑛0subscript𝑐1𝑑e=(n_{0},c_{1},d)italic_e = ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_d ) be the c1subscript𝑐1c_{1}italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT edge from n0subscript𝑛0n_{0}italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. The assignments in c1subscript𝑐1c_{1}italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT are a1c1=(x:=x+1y:=y)subscriptsuperscript𝑎subscript𝑐11assignsuperscript𝑥𝑥1superscript𝑦assign𝑦a^{c_{1}}_{1}=(x^{\prime}:=x+1\wedge y^{\prime}:=y)italic_a start_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_x + 1 ∧ italic_y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_y ), paired with n1e=n2subscriptsuperscript𝑛𝑒1subscript𝑛2n^{e}_{1}=n_{2}italic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT and a2c1=(x:=xy:=y)subscriptsuperscript𝑎subscript𝑐12assignsuperscript𝑥𝑥superscript𝑦assign𝑦a^{c_{1}}_{2}=(x^{\prime}:=x\wedge y^{\prime}:=y)italic_a start_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_x ∧ italic_y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_y ) paired with n2e=n1subscriptsuperscript𝑛𝑒2subscript𝑛1n^{e}_{2}=n_{1}italic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, which satisfies 2. E.g. for a1c1subscriptsuperscript𝑎subscript𝑐11a^{c_{1}}_{1}italic_a start_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT: eval(a1c1,Lc(n0))=eval((x:=x+1y:=y),(x=0,y=0))=(x=1,y=0)𝑒𝑣𝑎𝑙subscriptsuperscript𝑎subscript𝑐11subscript𝐿𝑐subscript𝑛0𝑒𝑣𝑎𝑙assignsuperscript𝑥𝑥1superscript𝑦assign𝑦formulae-sequence𝑥0𝑦0formulae-sequence𝑥1𝑦0eval(a^{c_{1}}_{1},L_{c}(n_{0}))=eval((x^{\prime}:=x+1\wedge y^{\prime}:=y),(x% =0,y=0))=(x=1,y=0)italic_e italic_v italic_a italic_l ( italic_a start_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ) = italic_e italic_v italic_a italic_l ( ( italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_x + 1 ∧ italic_y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_y ) , ( italic_x = 0 , italic_y = 0 ) ) = ( italic_x = 1 , italic_y = 0 ), eval(a1c1,La(n0))=(x=1)(x=1)𝑒𝑣𝑎𝑙subscriptsuperscript𝑎subscript𝑐11subscript𝐿𝑎subscript𝑛0𝑥1precedes-or-equals𝑥1eval(a^{c_{1}}_{1},L_{a}(n_{0}))=(x=1)\preceq(x=1)italic_e italic_v italic_a italic_l ( italic_a start_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ) = ( italic_x = 1 ) ⪯ ( italic_x = 1 ) and dc1(a1c1)=0.8=d(n2)subscript𝑑subscript𝑐1subscriptsuperscript𝑎subscript𝑐110.8𝑑subscript𝑛2d_{c_{1}}(a^{c_{1}}_{1})=0.8=d(n_{2})italic_d start_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_a start_POSTSUPERSCRIPT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) = 0.8 = italic_d ( italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ).

3.0.2 Exploration

Algorithm 1 shows PASG construction. An initial node n0subscript𝑛0n_{0}italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT labeled Lc(n0)=val0,La(n0)=formulae-sequencesubscript𝐿𝑐subscript𝑛0𝑣𝑎subscript𝑙0subscript𝐿𝑎subscript𝑛0topL_{c}(n_{0})=val_{0},L_{a}(n_{0})=\topitalic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) = italic_v italic_a italic_l start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) = ⊤ is extended to a well-labeled PASG with each node either covered or expanded. Algorithm 2 shows blocking an expression from La(n)subscript𝐿𝑎𝑛L_{a}(n)italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ), used during refinement.

When removing a node n𝑛nitalic_n from the waitlist, we check whether ncnN:Lc(n)La(nc):subscript𝑛𝑐𝑛𝑁subscript𝐿𝑐superscript𝑛subscript𝐿𝑎subscript𝑛𝑐\exists n_{c}\neq n\in N\colon L_{c}(n^{\prime})\in L_{a}(n_{c})∃ italic_n start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ≠ italic_n ∈ italic_N : italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ) s.t. ncsubscript𝑛𝑐n_{c}italic_n start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT is not covered. If so, a covering edge (n,nc)superscript𝑛subscript𝑛𝑐(n^{\prime},n_{c})( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_n start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ) is created and La(n)subscript𝐿𝑎superscript𝑛L_{a}(n^{\prime})italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) is strengthened for 2 to hold. Else, it is expanded.

If a node nN𝑛𝑁n\in Nitalic_n ∈ italic_N is selected for expansion, we check for each c𝒞𝑐𝒞c\in\mathcal{C}italic_c ∈ caligraphic_C whether eval(gc,Lc(n))=𝑇𝑟𝑢𝑒𝑒𝑣𝑎𝑙subscript𝑔𝑐subscript𝐿𝑐𝑛𝑇𝑟𝑢𝑒eval(g_{c},L_{c}(n))=\mathit{True}italic_e italic_v italic_a italic_l ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) = italic_True. If so, a new node nisubscriptsuperscript𝑛𝑖n^{\prime}_{i}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is created for each aiSupp(dc)subscript𝑎𝑖𝑆𝑢𝑝𝑝subscript𝑑𝑐a_{i}\in Supp(d_{c})italic_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_S italic_u italic_p italic_p ( italic_d start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ) with Lc(ni)=eval(ai,Lc(n)),La(ni)=formulae-sequencesubscript𝐿𝑐subscriptsuperscript𝑛𝑖𝑒𝑣𝑎𝑙subscript𝑎𝑖subscript𝐿𝑐𝑛subscript𝐿𝑎subscriptsuperscript𝑛𝑖topL_{c}(n^{\prime}_{i})=eval(a_{i},L_{c}(n)),L_{a}(n^{\prime}_{i})=\topitalic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) = italic_e italic_v italic_a italic_l ( italic_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) = ⊤, and a transition edge (n,c,de)𝑛𝑐subscript𝑑𝑒(n,c,d_{e})( italic_n , italic_c , italic_d start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ) is created such that de(ni)=dc(ai)subscript𝑑𝑒subscriptsuperscript𝑛𝑖subscript𝑑𝑐subscript𝑎𝑖d_{e}(n^{\prime}_{i})=d_{c}(a_{i})italic_d start_POSTSUBSCRIPT italic_e end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) = italic_d start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) for i=1|Supp(dc)|𝑖1𝑆𝑢𝑝𝑝subscript𝑑𝑐i=1\dots|Supp(d_{c})|italic_i = 1 … | italic_S italic_u italic_p italic_p ( italic_d start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ) |. Because of 2, if the abstract label contains states where the transition is disabled, we remove them by blocking out the negated guard (Line 3).

If 𝑒𝑣𝑎𝑙(gc,Lc(n))=𝐹𝑎𝑙𝑠𝑒𝑒𝑣𝑎𝑙subscript𝑔𝑐subscript𝐿𝑐𝑛𝐹𝑎𝑙𝑠𝑒\mathit{eval(g_{c},L_{c}(n))=\mathit{False}}italic_eval ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) = italic_False, we compute eval(gc,La(n))𝑒𝑣𝑎𝑙subscript𝑔𝑐subscript𝐿𝑎𝑛eval(g_{c},L_{a}(n))italic_e italic_v italic_a italic_l ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) ). If 𝐹𝑎𝑙𝑠𝑒𝐹𝑎𝑙𝑠𝑒\mathit{False}italic_False, we move on to the next command. If 𝑈𝑛𝑘𝑛𝑜𝑤𝑛𝑈𝑛𝑘𝑛𝑜𝑤𝑛\mathit{Unknown}italic_Unknown (cannot be 𝑇𝑟𝑢𝑒𝑇𝑟𝑢𝑒\mathit{True}italic_True, as LcLasubscript𝐿𝑐subscript𝐿𝑎L_{c}\in L_{a}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT), 2 is violated, so La(n)subscript𝐿𝑎𝑛L_{a}(n)italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) needs to be strengthened: a new abstract label is computed as s^=block(La(n),gc,Lc(n))superscript^𝑠𝑏𝑙𝑜𝑐𝑘subscript𝐿𝑎𝑛subscript𝑔𝑐subscript𝐿𝑐𝑛\hat{s}^{\prime}=block(L_{a}(n),g_{c},L_{c}(n))over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_b italic_l italic_o italic_c italic_k ( italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) , italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ). Because of the contract of block𝑏𝑙𝑜𝑐𝑘blockitalic_b italic_l italic_o italic_c italic_k, eval(gc,s^)=𝐹𝑎𝑙𝑠𝑒𝑒𝑣𝑎𝑙subscript𝑔𝑐superscript^𝑠𝐹𝑎𝑙𝑠𝑒eval(g_{c},\hat{s}^{\prime})=\mathit{False}italic_e italic_v italic_a italic_l ( italic_g start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT , over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) = italic_False, so this command no longer causes a constraint violation (Line 6).

3.0.3 Refinement

Refinement is interleaved with exploring the abstract state space. Whenever the La(n)subscript𝐿𝑎𝑛L_{a}(n)italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) changes for some nN𝑛𝑁n\in Nitalic_n ∈ italic_N, the constraints may be violated. If constraint 2 is violated, the problematic covering edge is removed from ECsubscript𝐸𝐶E_{C}italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT. This makes n𝑛nitalic_n non-covered, so we expand it later (Line 10).

If constraint 2 is violated by covering edge (n,n)EC𝑛superscript𝑛subscript𝐸𝐶(n,n^{\prime})\in E_{C}( italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∈ italic_E start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT, but constraint 2 still holds, the current La(n)subscript𝐿𝑎𝑛L_{a}(n)italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) must be replaced with s^superscript^𝑠\hat{s}^{\prime}over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT such that Lc(n)s^subscript𝐿𝑐𝑛superscript^𝑠L_{c}(n)\in\hat{s}^{\prime}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ∈ over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, s^La(n)precedes-or-equalssuperscript^𝑠subscript𝐿𝑎superscript𝑛\hat{s}^{\prime}\preceq L_{a}(n^{\prime})over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⪯ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) and s^La(n)precedes-or-equalssuperscript^𝑠subscript𝐿𝑎𝑛\hat{s}^{\prime}\preceq L_{a}(n)over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⪯ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) (referring to the current Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT). An appropriate s^superscript^𝑠\hat{s}^{\prime}over^ start_ARG italic_s end_ARG start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is block(La(n),¬La(n),Lc(n))𝑏𝑙𝑜𝑐𝑘subscript𝐿𝑎𝑛subscript𝐿𝑎𝑛subscript𝐿𝑐𝑛block(L_{a}(n),\neg L_{a}(n),L_{c}(n))italic_b italic_l italic_o italic_c italic_k ( italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) , ¬ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) (Line 10).

Assume that 2 is violated by an edge e=(n,c,d)𝑒𝑛𝑐𝑑e=(n,c,d)italic_e = ( italic_n , italic_c , italic_d ). Because of how the PASG is constructed, the concrete label and probability subconstraints of 2 must still hold, but the abstract label part is violated by some node niesubscriptsuperscript𝑛𝑒𝑖n^{e}_{i}italic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT: Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT of niesubscriptsuperscript𝑛𝑒𝑖n^{e}_{i}italic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT no longer overapproximates applying aicsubscriptsuperscript𝑎𝑐𝑖a^{c}_{i}italic_a start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT (the assignment that led to its creation) to Lasubscript𝐿𝑎L_{a}italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT of its parent. The violation caused by this assignment is eliminated by changing La(n)subscript𝐿𝑎𝑛L_{a}(n)italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) to block(La(n),eval1(aic,La(nie),Lc(n))block(L_{a}(n),eval^{-1}(a^{c}_{i},L_{a}(n^{e}_{i}),L_{c}(n))italic_b italic_l italic_o italic_c italic_k ( italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) , italic_e italic_v italic_a italic_l start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_a start_POSTSUPERSCRIPT italic_c end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT italic_e end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) , italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) (Line 13).

Strengthenings may create new violations, but all of them are eliminated after finite steps (if the concrete label can be finitely represented in the abstract domain), and we continue expanding non-covered nodes. Efficient implementations of the algorithm can employ sequence interpolation to strengthen the whole path up to the root at once [27], which we do when using the predicate domain, as we observed that both simple weakest-precondition-based refinement and binary interpolation lead to predicates growing very fast.

Example 2

1(b) shows an example of refinement in orange. Expanding n5subscript𝑛5n_{5}italic_n start_POSTSUBSCRIPT 5 end_POSTSUBSCRIPT, we realize c3subscript𝑐3c_{3}italic_c start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT is enabled in some states described by abstract label x=2𝑥2x=2italic_x = 2, but not in the concrete label x=2,y=0formulae-sequence𝑥2𝑦0x=2,y=0italic_x = 2 , italic_y = 0. Thus, we strengthen n5subscript𝑛5n_{5}italic_n start_POSTSUBSCRIPT 5 end_POSTSUBSCRIPT by blocking the guard x==2y==2x==2\wedge y==2italic_x = = 2 ∧ italic_y = = 2, resulting in the abstract label x=2,y=0formulae-sequence𝑥2𝑦0x=2,y=0italic_x = 2 , italic_y = 0.

This triggers another strengthening, as La(n5)subscript𝐿𝑎subscript𝑛5L_{a}(n_{5})italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 5 end_POSTSUBSCRIPT ) no longer overapproximates applying x=x+1y=ysuperscript𝑥𝑥1superscript𝑦𝑦x^{\prime}=x+1\wedge y^{\prime}=yitalic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_x + 1 ∧ italic_y start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = italic_y to La(n2)subscript𝐿𝑎subscript𝑛2L_{a}(n_{2})italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ). n2subscript𝑛2n_{2}italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is also strengthened, removing a cover edge as Lc(n3)subscript𝐿𝑐subscript𝑛3L_{c}(n_{3})italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT ) is no longer contained in La(n2)subscript𝐿𝑎subscript𝑛2L_{a}(n_{2})italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ). Strengthening a covering node also strengthens the covered nodes if the covering remains (see n4subscript𝑛4n_{4}italic_n start_POSTSUBSCRIPT 4 end_POSTSUBSCRIPT and n1subscript𝑛1n_{1}italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT).

3.0.4 Numerical analysis

The finished PASG can be treated as an MDP (N,𝒞cover,T𝑃𝐴𝑆𝐺,n0)𝑁𝒞𝑐𝑜𝑣𝑒𝑟subscript𝑇𝑃𝐴𝑆𝐺subscript𝑛0(N,\mathcal{C}\cup cover,T_{\mathit{PASG{}{}}},n_{0})( italic_N , caligraphic_C ∪ italic_c italic_o italic_v italic_e italic_r , italic_T start_POSTSUBSCRIPT italic_PASG end_POSTSUBSCRIPT , italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ). Regarding T𝑃𝐴𝑆𝐺subscript𝑇𝑃𝐴𝑆𝐺T_{\mathit{PASG{}{}}}italic_T start_POSTSUBSCRIPT italic_PASG end_POSTSUBSCRIPT, for a non-covered node n𝑛nitalic_n, a command cC𝑐𝐶c\in Citalic_c ∈ italic_C is enabled if there is an edge (n,c,d)ET𝑛𝑐𝑑subscript𝐸𝑇(n,c,d)\in E_{T}( italic_n , italic_c , italic_d ) ∈ italic_E start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT in the PASG, and T𝑃𝐴𝑆𝐺(n,c)=dsubscript𝑇𝑃𝐴𝑆𝐺𝑛𝑐𝑑T_{\mathit{PASG{}{}}}(n,c)=ditalic_T start_POSTSUBSCRIPT italic_PASG end_POSTSUBSCRIPT ( italic_n , italic_c ) = italic_d. The only action in covered nodes is cover𝑐𝑜𝑣𝑒𝑟coveritalic_c italic_o italic_v italic_e italic_r, which results in their covering node.

{reptheorem}

thm:bi_soundness The maximal/minimal probability of reaching a target node in the PASG of an MDP M𝑀Mitalic_M is the same as in M𝑀Mitalic_M.

Refer to the Appendix for proofs.

Due to the symmetry of constraint 2, this abstraction can be considered similar to bisimulation-reduction, but not aiming for the coarsest bisimulation. Constructing the PASG can be computationally cheaper than the coarsest bisimulation, but the larger state space may result in more expensive numerical computation. The advantages appear when the abstraction is combined with partial state space exploration, as most bisimulation reduction algorithms in the literature cannot be done on the fly. Comparing our method to bisimulation reduction in depth (both theoretically and empirically) is planned for future work.

3.1 Combining with BRTDP

Bounded Real-Time Dynamic Programming (BRTDP) [22, 3] approximates the value function of an MDP during state space exploration. It maintains an upper and a lower bound (U𝑈Uitalic_U and L𝐿Litalic_L) by generating traces and updating the bounds for the encountered states. In each step, the optimal action is chosen according to the current U𝑈Uitalic_U. The strategy for choosing a state from the result distribution is a parameter of the algorithm, for which we implemented the RANDOM and DIFF_BASED trace generation strategies from [3].

Our lazy abstraction algorithm combines well with such methods. As refinement is performed during expansion, the abstract states in an in-progress PASG are coarser than those in a finished PASG. Thus, if BRTDP reaches the required threshold before constructing the full PASG, the abstract labels remain coarser. Existing probabilistic CEGAR schemes like [17] cannot benefit from this, as they need lower and upper value bounds for all nodes for refinement, while BRTDP works best when only the value of the initial node is needed. This combination enables a controlled trade-off between time and accuracy.

The skeleton of the algorithm is the same as the BRTDP algorithm described in [3] for general MDPs. The difference is that instead of generating traces from the final PASG, we use the steps of building the PASG for trace generation. As such, refinement is also possible during trace simulation, potentially removing covers used in previous simulations.

{reptheorem}

thm:bi_brtdp_soundness The maximal probability of reaching a target state is always between L(n0)𝐿subscript𝑛0L(n_{0})italic_L ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) and U(n0)𝑈subscript𝑛0U(n_{0})italic_U ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) if BRTDP is used with PASG construction steps.

This theorem is non-trivial, as the traces are not generated for the finished PASG, but for an in-progress version where transient cover edges can exist which would not be present when finished. The main idea behind its proof is that if a value is propagated through a cover edge, then the value has been updated only based on traces consistent with the cover edge (as we would have already removed it if any trace inconsistent with it had been explored).

4 Evaluation

Refer to caption
Figure 2: Comparison of standard and abstract BVI with the EXPL (top row) and PRED (bottom row) domain. Columns: 1st: number of (non-covered) nodes, 2nd: running time on log scale, 3rd: ratio of the original and abstract state space size (red line marks 1, which would be no reduction). Only for inputs where the two algorithms compared in the plot terminated.

Implementation Our prototype111https://github.com/szdan97/probabilistic-theta/tree/prob-proto with the explicit value and predicate domains is implemented in the Theta model checker [26], taking JANI models[4] as input. Only properties of the form Pmax(pUq)=?subscript𝑃𝑚𝑎𝑥𝑝𝑈𝑞?P_{max}(p\ U\ q)=?italic_P start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT ( italic_p italic_U italic_q ) = ? are supported where p,q𝑝𝑞p,qitalic_p , italic_q are Boolean constraints. Action result probabilities must be constant. The locations of JANI models are always tracked in the abstract states and covering can only occur between nodes with the same locations. To fairly assess the algorithms rather than the implementations, we implemented both Bounded Value Iteration (BVI) [1] and BRTDP as MDP solution techniques in Theta both with and without lazy abstraction. We precompute almost sure reachability and avoidance to speed up the numerical solution if BVI is used.

The main metrics of interest are state space size and running time. As the numerical computations mostly scale with the non-covered PASG nodes, we are especially interested in the number of non-covered nodes. Our evaluation focuses on the following research questions:

RQ1. How does lazy abstraction affect the state space size and analysis time?

RQ2. Does the combination of lazy abstraction with BRTDP lead to reduced abstract model size when converged? How does it affect the running time?

4.0.1 Setup

We used the 104 MDP model-property pairs from the Quantitative Verification Benchmark Set [12] compatible with our current implementation. The experiments were conducted using BenchExec[2], running them on the Komondor HPC 222https://hpc.kifu.hu/en with AMD EPYCTM 7763 CPUs, each run getting 8 CPU cores, 16GB RAM, and a 1-hour timeout. Convergence threshold was 106superscript10610^{-6}10 start_POSTSUPERSCRIPT - 6 end_POSTSUPERSCRIPT (absolute) for all algorithms.

4.0.2 Results and discussion

RQ1. Figure 2 shows the BVI results. Lazy abstraction often significantly reduced the state space: for example, a 3-fold reduction was possible for beb.3-4 [N=3, prop: GaveUp] (from 4632 nodes to 1559 non-covered nodes) and csma.2-6 [prop: all_before_max] (from 67741 to 24837) with EXPL, and for beb.3-4 [N=3, prop: GaveUp] (from 4632 to 1354) with PRED. There are also inputs where the explicit domain could not reduce the state space size (e.g. blocksworld.5, cdrive.2, but that is expected because of its low granularity.

Measuring the analysis time, it turned out that the overhead of more complex operations during exploration outweighed the benefits of numerical computations on a smaller state space. The overhead is apparent for EXPL, but it is much less severe than for PRED.

Predicate abstraction was sometimes able to reduce the state space more than the explicit domain when it terminated before timeout, but the opposite was also present (related plots can be found in the appendix). The overhead of interpolating using an SMT solver was often too large, and so PRED often failed to terminate in time.

We identified several abstraction-specific optimization possibilities for the implementation. For one, the interpolants returned by Z3 were often very large and had a redundant structure, which we could mitigate through structural simplification, improving both time and memory efficiency. Investigating alternative refinement methods and other solvers could also lead to better results.

Another opportunity for optimization is that like the non-probabilistic version [27], we currently create multiple nodes with the same LCsubscript𝐿𝐶L_{C}italic_L start_POSTSUBSCRIPT italic_C end_POSTSUBSCRIPT if a concrete state is reachable on multiple paths. As one of them always covers the others, only that is explored. However, according to our investigation, there are inputs where this led to multiple magnitudes of increase in the number of nodes during exploration. Merging such nodes instead would solve this issue, but refinement must be slightly changed as the path up to the root will no longer be unambiguous. We already had some preliminary measurements with promising results regarding this modification.

Refer to caption
Figure 3: Comparison of abstract BVI and BRTDP. Same plot types as Figure 2.

RQ2. The simulation-based nature of BRTDP makes it harder to gauge the benefits of abstract BRTDP compared to standard BRTDP: there were inputs where abstract BRTDP converged with fewer nodes and where the concrete one did. (We relegated plots related to this comparison to the appendix.)

The benefits of abstract BRTDP compared to abstract BVI are much more apparent. The plots in Figure 3 show this comparison (the results of the strategy leading to fewer nodes were used for each input-domain pair for BRTDP).

The highest relative state space size reduction was on zeroconf [N=20, k=2, !reset, prop: correct] (170-fold from 64109 to 373 non-covered nodes) for EXPL and pnueli-zuck.3 [prop: live] (from 1888 to 239 non-covered nodes) for PRED. There were inputs where no further reduction was achieved though (e.g. blocksworld.5, cdrive.2, rectangle-tireworld.5, ij.10 for both domains). When both BVI and BRTDP terminated, BRTDP was often able to do so in less time, especially with PRED (the results are much more two-sided for EXPL).

5 Conclusions

We proposed a lazy abstraction algorithm for symbolic MDPs and combined it with BRTDP. We provided numerical evaluation for different versions of the proposed algorithm using the Quantitative Verification Benchmark Set, comparing them to explicitly computing the concrete state space.

The initial experimental evaluation shows potential in the proposed algorithm, especially if reducing the state space is paramount for staying within the memory limits. As the time overhead introduced by more complex computations in the state space exploration often outweighed the gains from analyzing a smaller state space, we plan to explore possible improvements for this aspect. Further measurements using other benchmark sets and more parameterizations of the scalable models in the QVBS are also planned.

Additionally, we wish to explore fine-grained transitioning between the strict representation version (see 2) and a more direct overapproximating adaptation by incorporating ideas from game-based abstraction refinement, moving the algorithm closer to standard abstraction approaches instead of bisimulation reduction.

References

  • [1] Baier, C., Klein, J., Leuschner, L., Parker, D., Wunderlich, S.: Ensuring the reliability of your model checker: Interval iteration for Markov Decision Processes. In: CAV’17 (2017). https://doi.org/10.1007/978-3-319-63387-9_8
  • [2] Beyer, D., Löwe, S., Wendler, P.: Reliable benchmarking: requirements and solutions. Int. J. Softw. Tools Technol. Transf. (2019). https://doi.org/10.1007/s10009-017-0469-y
  • [3] Brázdil, T., Chatterjee, K., Chmelik, M., Forejt, V., Kretínský, J., Kwiatkowska, M.Z., Parker, D., Ujma, M.: Verification of Markov Decision Processes using learning algorithms. In: ATVA’14 (2014). https://doi.org/10.1007/978-3-319-11936-6_8
  • [4] Budde, C.E., Dehnert, C., Hahn, E.M., Hartmanns, A., Junges, S., Turrini, A.: JANI: quantitative model and tool interaction. In: TACAS’17 (2017). https://doi.org/10.1007/978-3-662-54580-5_9
  • [5] Chadha, R., Viswanathan, M.: A counterexample-guided abstraction-refinement framework for Markov decision processes. ACM TOCL (2010). https://doi.org/10.1145/1838552.1838553
  • [6] Clarke, E.M., Grumberg, O., Jha, S., Lu, Y., Veith, H.: Counterexample-guided abstraction refinement. In: CAV’00 (2000). https://doi.org/10.1007/10722167_15
  • [7] Cousot, P., Monerau, M.: Probabilistic abstract interpretation. In: ESOP’12 (2012). https://doi.org/10.1007/978-3-642-28869-2_9
  • [8] D’Argenio, P.R., Jeannet, B., Jensen, H.E., Larsen, K.G.: Reduction and refinement strategies for probabilistic analysis. In: PAPM-PROBMIV’02 (2002). https://doi.org/10.1007/3-540-45605-8_5
  • [9] De Alfaro, L., Roy, P.: Magnifying-lens abstraction for Markov decision processes. In: CAV’07 (2007). https://doi.org/10.1007/978-3-540-73368-3_38
  • [10] Dehnert, C., Katoen, J., Parker, D.: SMT-Based bisimulation minimisation of Markov models. In: VMCAI’13 (2013). https://doi.org/10.1007/978-3-642-35873-9_5
  • [11] Esparza, J., Gaiser, A.: Probabilistic abstractions with arbitrary domains. In: SAS’11 (2011). https://doi.org/10.1007/978-3-642-23702-7_25
  • [12] Hartmanns, A., Klauck, M., Parker, D., Quatmann, T., Ruijters, E.: The quantitative verification benchmark set. In: TACAS’19 (2019). https://doi.org/10.1007/978-3-030-17462-0_20
  • [13] Henzinger, T.A., Jhala, R., Majumdar, R., Sutre, G.: Lazy abstraction. In: POPL’02 (2002)
  • [14] Herbreteau, F., Srivathsan, B., Walukiewicz, I.: Lazy abstractions for timed automata. In: CAV’13 (2013). https://doi.org/10.1007/978-3-642-39799-8_71
  • [15] Hermanns, H., Wachter, B., Zhang, L.: Probabilistic CEGAR. In: CAV’08 (2008). https://doi.org/10.1007/978-3-540-70545-1_16
  • [16] Kamaleson, N., Parker, D., Rowe, J.E.: Finite-horizon bisimulation minimisation for probabilistic systems. In: SPIN’16 (2016). https://doi.org/10.1007/978-3-319-32582-8_10
  • [17] Kattenbelt, M., Kwiatkowska, M., Norman, G., Parker, D.: Abstraction refinement for probabilistic software. In: VMCAI’09 (2009)
  • [18] Kelmendi, E., Krämer, J., Křetínskỳ, J., Weininger, M.: Value iteration for simple stochastic games: Stop** criterion and learning algorithm. In: CAV’18 (2018)
  • [19] Komuravelli, A., Păsăreanu, C.S., Clarke, E.M.: Assume-guarantee abstraction refinement for probabilistic systems. In: CAV’12 (2012). https://doi.org/10.1007/978-3-642-31424-7_25
  • [20] Kwiatkowska, M., Norman, G., Parker, D.: Stochastic model checking. In: SFM’07 (2007)
  • [21] Kwiatkowska, M., Norman, G., Parker, D.: PRISM 4.0: Verification of probabilistic real-time systems. In: CAV’11 (2011)
  • [22] McMahan, H.B., Likhachev, M., Gordon, G.J.: Bounded real-time dynamic programming: RTDP with monotone upper bounds and performance guarantees. In: ICML’05 (2005). https://doi.org/10.1145/1102351.1102423
  • [23] McMillan, K.L.: Lazy abstraction with interpolants. In: CAV’06 (2006). https://doi.org/10.1007/11817963_14
  • [24] Parker, D., Norman, G., Kwiatkowska, M.: Game-based abstraction for Markov decision processes. In: QEST’06 (2006). https://doi.org/10.1109/QEST.2006.19
  • [25] Song, L., Zhang, L., Hermanns, H., Godskesen, J.C.: Incremental bisimulation abstraction refinement. ACM TECS (2014). https://doi.org/10.1145/2627352
  • [26] Tóth, T., Hajdu, A., Vörös, A., Micskei, Z., Majzik, I.: Theta: a framework for abstraction refinement-based model checking. In: FMCAD’17 (2017). https://doi.org/10.23919/FMCAD.2017.8102257
  • [27] Tóth, T., Majzik, I.: Configurable verification of timed automata with discrete variables. Acta Informatica (2022). https://doi.org/10.1007/s00236-020-00393-4
  • [28] Wachter, B., Zhang, L.: Best probabilistic transformers. In: VMCAI’10 (2010). https://doi.org/10.1007/978-3-642-11319-2_26

Appendix 0.A Proofs

For a PASG trace n0act1n1act2n2act3actknk𝑎𝑐subscript𝑡1subscript𝑛0subscript𝑛1𝑎𝑐subscript𝑡2subscript𝑛2𝑎𝑐subscript𝑡3𝑎𝑐subscript𝑡𝑘subscript𝑛𝑘n_{0}\xrightarrow{act_{1}}n_{1}\xrightarrow{act_{2}}n_{2}\xrightarrow{act_{3}}% \dots\xrightarrow{act_{k}}n_{k}italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a italic_c italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a italic_c italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a italic_c italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW … start_ARROW start_OVERACCENT italic_a italic_c italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_n start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT such that actkcover𝑎𝑐subscript𝑡𝑘𝑐𝑜𝑣𝑒𝑟act_{k}\neq coveritalic_a italic_c italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ≠ italic_c italic_o italic_v italic_e italic_r, its coverless representation is constructed such that for each i𝑖iitalic_i where acti=cover𝑎𝑐subscript𝑡𝑖𝑐𝑜𝑣𝑒𝑟act_{i}=coveritalic_a italic_c italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT = italic_c italic_o italic_v italic_e italic_r, acti𝑎𝑐subscript𝑡𝑖act_{i}italic_a italic_c italic_t start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and nisubscript𝑛𝑖n_{i}italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is dropped from the trace. The original trace can always be reconstructed from the coverless representation, as

Lemma 1

For any well-labeled finished PASG P𝑃Pitalic_P for the MDP M𝑀Mitalic_M, let s0c1s1c2s2c3cksksubscript𝑐1subscript𝑠0subscript𝑠1subscript𝑐2subscript𝑠2subscript𝑐3subscript𝑐𝑘subscript𝑠𝑘s_{0}\xrightarrow{c_{1}}s_{1}\xrightarrow{c_{2}}s_{2}\xrightarrow{c_{3}}\dots% \xrightarrow{c_{k}}s_{k}italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_s start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW … start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_s start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT be a trace of M𝑀Mitalic_M. Then there must exist a trace of P𝑃Pitalic_P with coverless representation n0act1n1act2n2act3actknk𝑎𝑐subscript𝑡1subscript𝑛0subscript𝑛1𝑎𝑐subscript𝑡2subscript𝑛2𝑎𝑐subscript𝑡3𝑎𝑐subscript𝑡𝑘subscript𝑛𝑘n_{0}\xrightarrow{act_{1}}n_{1}\xrightarrow{act_{2}}n_{2}\xrightarrow{act_{3}}% \dots\xrightarrow{act_{k}}n_{k}italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a italic_c italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a italic_c italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_a italic_c italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW … start_ARROW start_OVERACCENT italic_a italic_c italic_t start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_n start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT such that for all i=1..k:siLa(ni)i=1..k:s_{i}\in L_{a}(n_{i})italic_i = 1 . . italic_k : italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ).

Proof

For any non-covered node n𝑛nitalic_n and concrete state s𝑠sitalic_s, if sLa(n)𝑠subscript𝐿𝑎𝑛s\in L_{a}(n)italic_s ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ), then all commands enabled in s𝑠sitalic_s are enabled in n𝑛nitalic_n because of 2 and 2 (the concrete label has at least those commands enabled that are enabled in any state described by the abstract label, and the actions enabled in a non-covered node are the commands enabled in its concrete label). Because of 2, if there is a covering edge (n,n)𝑛superscript𝑛(n,n^{\prime})( italic_n , italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) and sLa(n)𝑠subscript𝐿𝑎𝑛s\in L_{a}(n)italic_s ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ), the sLa(n)𝑠subscript𝐿𝑎superscript𝑛s\in L_{a}(n^{\prime})italic_s ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) is also true. Because of the abstract label subconstraint of 2, if sLa(n)𝑠subscript𝐿𝑎𝑛s\in L_{a}(n)italic_s ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n ) and choosing the assignment in n𝑛nitalic_n leads to nsuperscript𝑛n^{\prime}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, then eval(a,s)La(n)𝑒𝑣𝑎𝑙𝑎𝑠subscript𝐿𝑎superscript𝑛eval(a,s)\in L_{a}(n^{\prime})italic_e italic_v italic_a italic_l ( italic_a , italic_s ) ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ).

Based on these observations, the theorem can be proven by induction on the trace length. It holds for the 0-length trace s0subscript𝑠0s_{0}italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, as s0=Lc(n0)La(n0)subscript𝑠0subscript𝐿𝑐subscript𝑛0subscript𝐿𝑎subscript𝑛0s_{0}=L_{c}(n_{0})\in L_{a}(n_{0})italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ). Then for an M𝑀Mitalic_M trace with length i+1𝑖1i+1italic_i + 1, we know that the length i𝑖iitalic_i prefix has a corresponding P𝑃Pitalic_P trace ending in nisubscript𝑛𝑖n_{i}italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT. If this is not a covered node, then ci+1subscript𝑐𝑖1c_{i+1}italic_c start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT is enabled in it as (from the induction assumption) siLa(ni)subscript𝑠𝑖subscript𝐿𝑎subscript𝑛𝑖s_{i}\in L_{a}(n_{i})italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ), then we choose any assignment of ci+1subscript𝑐𝑖1c_{i+1}italic_c start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT that leads to si+1subscript𝑠𝑖1s_{i+1}italic_s start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT, and choose the same assignment in P𝑃Pitalic_P leading to a node ni+1subscript𝑛𝑖1n_{i+1}italic_n start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT. As we observed, because of the abstract label subconstraint of 2, si+1La(ni+q)subscript𝑠𝑖1subscript𝐿𝑎subscript𝑛𝑖𝑞s_{i+1}\in L_{a}(n_{i+q})italic_s start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT italic_i + italic_q end_POSTSUBSCRIPT ). If nisubscript𝑛𝑖n_{i}italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is covered, we take the cover𝑐𝑜𝑣𝑒𝑟coveritalic_c italic_o italic_v italic_e italic_r action to nsuperscript𝑛n^{\prime}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, where ci+1subscript𝑐𝑖1c_{i+1}italic_c start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT is enabled, as siLa(n)subscript𝑠𝑖subscript𝐿𝑎superscript𝑛s_{i}\in L_{a}(n^{\prime})italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ italic_L start_POSTSUBSCRIPT italic_a end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) because of 2. We take ci+1subscript𝑐𝑖1c_{i+1}italic_c start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT and an appropriate assignment to ni+1subscript𝑛𝑖1n_{i+1}italic_n start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT, and si+1ni+1subscript𝑠𝑖1subscript𝑛𝑖1s_{i+1}\in n_{i+1}italic_s start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT ∈ italic_n start_POSTSUBSCRIPT italic_i + 1 end_POSTSUBSCRIPT is true for the same reason as in the other case. This concludes our induction proof.

A consequence of this is that if a node n𝑛nitalic_n is covered by nsuperscript𝑛n^{\prime}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT in a finished PASG, then for all traces from Lc(n)subscript𝐿𝑐𝑛L_{c}(n)italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ), there exists a trace from Lc(n)subscript𝐿𝑐superscript𝑛L_{c}(n^{\prime})italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) with the same commands and assignments chosen.

Lemma 2

The maximal probability of reaching a target node in a PASG for an MDP M𝑀Mitalic_M is at least as high as the maximal probability of reaching a target state in M𝑀Mitalic_M.

Proof

Fix a maximizing memoryless strategy σ𝜎\sigmaitalic_σ for the MDP M𝑀Mitalic_M (we can restrict ourselves to memoryless strategies as an optimal memoryless strategy for an MDP is also an optimal general strategy if the optimization goal is a reachability probability). This induces a Discrete-Time Markov Chain Mσsuperscript𝑀𝜎M^{\sigma}italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT, where the probability of reaching a target state from the initial state is the maximal probability of reaching a target in M𝑀Mitalic_M. We will construct a strategy σ^:N𝒞{cover}\hat{\sigma}:N*\xrightarrow{}\mathcal{C}\cup\{cover\}over^ start_ARG italic_σ end_ARG : italic_N ∗ start_ARROW start_OVERACCENT end_OVERACCENT → end_ARROW caligraphic_C ∪ { italic_c italic_o italic_v italic_e italic_r } for the PASG P𝑃Pitalic_P that results in at least as high probability for reaching a target node as σ𝜎\sigmaitalic_σ. Note that σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG is not a memoryless strategy.

The aim of σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG will be to copy the traces of Mσsuperscript𝑀𝜎M^{\sigma}italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT. The non-trivial part of this is that because of covers, multiple states of Mσsuperscript𝑀𝜎M^{\sigma}italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT can correspond to the same node in the PASG – that is why σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG is not a memoryless strategy but uses the whole trace to select an action.

σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG will obviously always choose cover𝑐𝑜𝑣𝑒𝑟coveritalic_c italic_o italic_v italic_e italic_r if the trace ends in a covered node, as there is no other enabled action. Let n0c1n1c2n2c3cncnsubscript𝑐1subscript𝑛0subscript𝑛1subscript𝑐2subscript𝑛2subscript𝑐3subscript𝑐𝑛subscript𝑐𝑛n_{0}\xrightarrow{c_{1}}n_{1}\xrightarrow{c_{2}}n_{2}\xrightarrow{c_{3}}\dots% \xrightarrow{c_{n}}c_{n}italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW … start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_c start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT be a coverless representation of a PASG trace τ^^𝜏\hat{\tau}over^ start_ARG italic_τ end_ARG. Because of 2, there is a one-to-one correspondence between the assignments of cisubscript𝑐𝑖c_{i}italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and the result nodes of the transition edge. Let aisubscript𝑎𝑖a_{i}italic_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT be the assignment corresponding to nisubscript𝑛𝑖n_{i}italic_n start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT in this trace.

We have two cases:

  1. 1.

    A concrete trace τ=s0c1s1c2s2𝜏subscript𝑠0subscript𝑐1subscript𝑠1subscript𝑐2subscript𝑠2\tau=s_{0}\xrightarrow{c_{1}}s_{1}\xrightarrow{c_{2}}s_{2}italic_τ = italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_ARROW start_OVERACCENT italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_OVERACCENT → end_ARROW italic_s start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT can be constructed from τ^^𝜏\hat{\tau}over^ start_ARG italic_τ end_ARG by starting in s0subscript𝑠0s_{0}italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, choosing cisubscript𝑐𝑖c_{i}italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and applying aisubscript𝑎𝑖a_{i}italic_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT to si1subscript𝑠𝑖1s_{i-1}italic_s start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT in the i𝑖iitalic_ith step. This way, we can compute which concrete state we would actually be in if this assignment sequence happened in the original state space. We will call τ𝜏\tauitalic_τ the concretization of τ^^𝜏\hat{\tau}over^ start_ARG italic_τ end_ARG. This is possible only if cisubscript𝑐𝑖c_{i}italic_c start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is enabled in ci1subscript𝑐𝑖1c_{i-1}italic_c start_POSTSUBSCRIPT italic_i - 1 end_POSTSUBSCRIPT, which need not always be true – that will be the second case. Let σ^(τ^)^𝜎^𝜏\hat{\sigma}(\hat{\tau})over^ start_ARG italic_σ end_ARG ( over^ start_ARG italic_τ end_ARG ) be the same command as σ(last(τ))𝜎last𝜏\sigma(\text{last}(\tau))italic_σ ( last ( italic_τ ) ).

  2. 2.

    The trace is not concretizable. In this case, σ^(τ^)^𝜎^𝜏\hat{\sigma}(\hat{\tau})over^ start_ARG italic_σ end_ARG ( over^ start_ARG italic_τ end_ARG ) can be anything, as such traces will never be generated by σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG if it is constructed according to the previous case for concretizable traces, so it cannot change the induced MDP Pσ^superscript𝑃^𝜎P^{\hat{\sigma}}italic_P start_POSTSUPERSCRIPT over^ start_ARG italic_σ end_ARG end_POSTSUPERSCRIPT.

This is a valid strategy, as 1 ensures the selected command is enabled in the PASG node. The fact that the choice for non-concretizable traces does not matter can be seen by induction:

  • The only possible 1-length trace n1subscript𝑛1n_{1}italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT is trivially concretizable, the commands enabled in n0subscript𝑛0n_{0}italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT are exactly those enabled in Lc(n0)=s0subscript𝐿𝑐subscript𝑛0subscript𝑠0L_{c}(n_{0})=s_{0}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) = italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT.

  • Let τ𝜏\tauitalic_τ be the concretization of a concretizable trace τ^^𝜏\hat{\tau}over^ start_ARG italic_τ end_ARG. At the end of τ^^𝜏\hat{\tau}over^ start_ARG italic_τ end_ARG, σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG chooses a command that is enabled at the end of τ𝜏\tauitalic_τ, resulting in a trace one step longer that is still concretizable.

Although there is always a unique assignment that results in entering a node in the PASG, this is not true in general in the original MDP (e.g. in the state x=1𝑥1x=1italic_x = 1, the assignments x:=x+1assignsuperscript𝑥𝑥1x^{\prime}:=x+1italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := italic_x + 1 and x:=2assignsuperscript𝑥2x^{\prime}:=2italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT := 2 both lead to the state x=2𝑥2x=2italic_x = 2 – if these are the assignments of the same command, we do not know which one was chosen), so there are multiple Pσ^superscript𝑃^𝜎P^{\hat{\sigma}}italic_P start_POSTSUPERSCRIPT over^ start_ARG italic_σ end_ARG end_POSTSUPERSCRIPT traces with the same concretization.

Assume for a moment that concretization creates a one-to-one correspondence between abstract and concrete traces, so the aforementioned problem is not present. In this case, the probability subconstraint of 2 would ensure that the probability of τ^^𝜏\hat{\tau}over^ start_ARG italic_τ end_ARG according to the strategy σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG is the same as the probability of its concretization τ𝜏\tauitalic_τ according to σ𝜎\sigmaitalic_σ.

This can be generalized to the case when there are multiple abstract traces τ^1,τ^2,τ^ksubscript^𝜏1subscript^𝜏2subscript^𝜏𝑘\hat{\tau}_{1},\hat{\tau}_{2},\dots\hat{\tau}_{k}over^ start_ARG italic_τ end_ARG start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , over^ start_ARG italic_τ end_ARG start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … over^ start_ARG italic_τ end_ARG start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT with the same concretization τ𝜏\tauitalic_τ: in this case, σ(τ)=i=1kσ^(τ^i)subscript𝜎𝜏superscriptsubscript𝑖1𝑘subscript^𝜎subscript^𝜏𝑖\mathbb{P}_{\sigma}(\tau)=\sum_{i=1}^{k}\mathbb{P}_{\hat{\sigma}}(\hat{\tau}_{% i})blackboard_P start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT ( italic_τ ) = ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT blackboard_P start_POSTSUBSCRIPT over^ start_ARG italic_σ end_ARG end_POSTSUBSCRIPT ( over^ start_ARG italic_τ end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ).

Proof sketch for proving this by induction: if at the first command of a trace there are two assignments a1subscript𝑎1a_{1}italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and a2subscript𝑎2a_{2}italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT in a command leading to the same concrete state ssuperscript𝑠s^{\prime}italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT when applied to s𝑠sitalic_s, then the probability of going from s𝑠sitalic_s to ssuperscript𝑠s^{\prime}italic_s start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is (a1)+(a2)subscript𝑎1subscript𝑎2\mathbb{P}(a_{1})+\mathbb{P}(a_{2})blackboard_P ( italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) + blackboard_P ( italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ), and the probability of the whole trace is (a1)+(a2)subscript𝑎1subscript𝑎2\mathbb{P}(a_{1})+\mathbb{P}(a_{2})blackboard_P ( italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) + blackboard_P ( italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) times the probability of the suffix τsuperscript𝜏\tau^{\prime}italic_τ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT; in the PASG, these two assignments result in different nodes, but the abstract traces corresponding to the suffix are available from both nodes (or there covering nodes, which only multiplies the probability of the suffix by 1.01.01.01.0), so assuming the sum of the probability of these abstract traces is (τ)superscript𝜏\mathbb{P}(\tau^{\prime})blackboard_P ( italic_τ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ), then the probability of the whole trace is (a1)(τ)+(a2)(τ)=((a1)+(a2))(τ)subscript𝑎1superscript𝜏subscript𝑎2superscript𝜏subscript𝑎1subscript𝑎2superscript𝜏\mathbb{P}(a_{1})\mathbb{P}(\tau^{\prime})+\mathbb{P}(a_{2})\mathbb{P}(\tau^{% \prime})=(\mathbb{P}(a_{1})+\mathbb{P}(a_{2}))\mathbb{P}(\tau^{\prime})blackboard_P ( italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) blackboard_P ( italic_τ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) + blackboard_P ( italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) blackboard_P ( italic_τ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) = ( blackboard_P ( italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) + blackboard_P ( italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) ) blackboard_P ( italic_τ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ), which is the original probability.

Let abstr(τ)𝑎𝑏𝑠𝑡𝑟𝜏abstr(\tau)italic_a italic_b italic_s italic_t italic_r ( italic_τ ) denote the set of abstract traces that have the concretization τ𝜏\tauitalic_τ, and (abstr(τ))𝑎𝑏𝑠𝑡𝑟𝜏\mathbb{P}(abstr(\tau))blackboard_P ( italic_a italic_b italic_s italic_t italic_r ( italic_τ ) ) is the sum of their probabilities. From the previous statement, we have (τ)=(abstr(τ))𝜏𝑎𝑏𝑠𝑡𝑟𝜏\mathbb{P}(\tau)=\mathbb{P}(abstr(\tau))blackboard_P ( italic_τ ) = blackboard_P ( italic_a italic_b italic_s italic_t italic_r ( italic_τ ) ) Let concr(τ^)𝑐𝑜𝑛𝑐𝑟^𝜏concr(\hat{\tau})italic_c italic_o italic_n italic_c italic_r ( over^ start_ARG italic_τ end_ARG ) be the concretization of τ^^𝜏\hat{\tau}over^ start_ARG italic_τ end_ARG. Observe that concretization is deterministic, so for any τ1,τ2subscript𝜏1subscript𝜏2\tau_{1},\tau_{2}italic_τ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_τ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, abstr(τ1)abstr(τ2)=𝑎𝑏𝑠𝑡𝑟subscript𝜏1𝑎𝑏𝑠𝑡𝑟subscript𝜏2abstr(\tau_{1})\cap abstr(\tau_{2})=\emptysetitalic_a italic_b italic_s italic_t italic_r ( italic_τ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ∩ italic_a italic_b italic_s italic_t italic_r ( italic_τ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = ∅. Let T^^𝑇\hat{T}over^ start_ARG italic_T end_ARG be the set of Pσ^superscript𝑃^𝜎P^{\hat{\sigma}}italic_P start_POSTSUPERSCRIPT over^ start_ARG italic_σ end_ARG end_POSTSUPERSCRIPT traces that lead to target nodes, and T^c=(τ^T^|concr(τ^) is a target trace in Mσ)subscript^𝑇𝑐^𝜏conditional^𝑇𝑐𝑜𝑛𝑐𝑟^𝜏 is a target trace in Mσ\hat{T}_{c}=(\hat{\tau}\in\hat{T}|concr(\hat{\tau})\text{ is a target trace in% $M^{\sigma}$})over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT = ( over^ start_ARG italic_τ end_ARG ∈ over^ start_ARG italic_T end_ARG | italic_c italic_o italic_n italic_c italic_r ( over^ start_ARG italic_τ end_ARG ) is a target trace in italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT ). T^csubscript^𝑇𝑐\hat{T}_{c}over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT need not be the whole T^^𝑇\hat{T}over^ start_ARG italic_T end_ARG, as target nodes can cover non-target nodes, let T^ssubscript^𝑇𝑠\hat{T}_{s}over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT denote the ”spurious” target traces T^T^c^𝑇subscript^𝑇𝑐\hat{T}\setminus\hat{T}_{c}over^ start_ARG italic_T end_ARG ∖ over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT. Let T𝑇Titalic_T denote the set of target traces of Mσsuperscript𝑀𝜎M^{\sigma}italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT. Observe that T^c=τTabstr(τ)subscript^𝑇𝑐subscript𝜏𝑇𝑎𝑏𝑠𝑡𝑟𝜏\hat{T}_{c}=\bigcup_{\tau\in T}abstr(\tau)over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT = ⋃ start_POSTSUBSCRIPT italic_τ ∈ italic_T end_POSTSUBSCRIPT italic_a italic_b italic_s italic_t italic_r ( italic_τ ). Now we can prove that the target probability on Pσ^superscript𝑃^𝜎P^{\hat{\sigma}}italic_P start_POSTSUPERSCRIPT over^ start_ARG italic_σ end_ARG end_POSTSUPERSCRIPT is at least as high as in Mσsuperscript𝑀𝜎M^{\sigma}italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT:

(T^)^𝑇\displaystyle\mathbb{P}(\hat{T})blackboard_P ( over^ start_ARG italic_T end_ARG ) =(T^c)+(T^s)=(τ^T^c(τ^))+(T^s)=absentsubscript^𝑇𝑐subscript^𝑇𝑠subscript^𝜏subscript^𝑇𝑐^𝜏subscript^𝑇𝑠absent\displaystyle=\mathbb{P}(\hat{T}_{c})+\mathbb{P}(\hat{T}_{s})=(\sum_{\hat{\tau% }\in\hat{T}_{c}}\mathbb{P}(\hat{\tau}))+\mathbb{P}(\hat{T}_{s})== blackboard_P ( over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ) + blackboard_P ( over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ) = ( ∑ start_POSTSUBSCRIPT over^ start_ARG italic_τ end_ARG ∈ over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT end_POSTSUBSCRIPT blackboard_P ( over^ start_ARG italic_τ end_ARG ) ) + blackboard_P ( over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ) =
=(τT(abstr(τ)))+(T^s)=(τT(τ))+(T^s)=(T)+(T^s)absentsubscript𝜏𝑇𝑎𝑏𝑠𝑡𝑟𝜏subscript^𝑇𝑠subscript𝜏𝑇𝜏subscript^𝑇𝑠𝑇subscript^𝑇𝑠\displaystyle=(\sum_{\tau\in T}\mathbb{P}(abstr(\tau)))+\mathbb{P}(\hat{T}_{s}% )=(\sum_{\tau\in T}\mathbb{P}(\tau))+\mathbb{P}(\hat{T}_{s})=\mathbb{P}(T)+% \mathbb{P}(\hat{T}_{s})= ( ∑ start_POSTSUBSCRIPT italic_τ ∈ italic_T end_POSTSUBSCRIPT blackboard_P ( italic_a italic_b italic_s italic_t italic_r ( italic_τ ) ) ) + blackboard_P ( over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ) = ( ∑ start_POSTSUBSCRIPT italic_τ ∈ italic_T end_POSTSUBSCRIPT blackboard_P ( italic_τ ) ) + blackboard_P ( over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ) = blackboard_P ( italic_T ) + blackboard_P ( over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT )

As (T^s)subscript^𝑇𝑠\mathbb{P}(\hat{T}_{s})blackboard_P ( over^ start_ARG italic_T end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ) is non-negative, and (T)𝑇\mathbb{P}(T)blackboard_P ( italic_T ) is exactly the target probability in Mσsuperscript𝑀𝜎M^{\sigma}italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT, we have proven that the target probability in σ^superscript^𝜎\mathbb{P}^{\hat{\sigma}}blackboard_P start_POSTSUPERSCRIPT over^ start_ARG italic_σ end_ARG end_POSTSUPERSCRIPT is at least as high as in Mσsuperscript𝑀𝜎M^{\sigma}italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT. σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG is not necessarily a maximizing strategy, so the maximal probability in P𝑃Pitalic_P can be even higher. σ𝜎\sigmaitalic_σ was chosen to be maximizing, so this proves the theorem.

The probability of reaching a target state in Mσsuperscript𝑀𝜎M^{\sigma}italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT is the sum of the probabilities of all traces reaching a target state. For each of these traces, we have a “copy” in the MDP induced by σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG for PASG, which has the same probability, as the same commands are chosen throughout the trace and the probability of choosing a given assignment in a step of the trace is the same in the PASG as in Mσsuperscript𝑀𝜎M^{\sigma}italic_M start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT. There are other traces as well in the PASG with strategy σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG that end in target nodes: non-concretizable traces because of covering nodes enabling commands that are not enabled in the covered node and traces that end in target nodes that cover non-target nodes.

As σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG already results in at least as high target reachability probability as an optimal strategy on M𝑀Mitalic_M, and it might not even be optimal, this shows that the maximal probability in the PASG must be at least as high as in M𝑀Mitalic_M.

Lemma 3

The maximal probability of reaching a target node in a PASG for an MDP M𝑀Mitalic_M is at most as high as the maximal probability of reaching a target state in M𝑀Mitalic_M.

Proof

Sketch: The proof is done similarly to that of 2, just the other way around. The main difference is that strategies on M𝑀Mitalic_M have less information in the trace than strategies in the PASG, as nodes in the PASG are basically labeled with which assignment was the last. This leads to a problem when two assignments of a command can lead to the same state in M𝑀Mitalic_M, as a strategy on the PASG may choose a different action in them (e.g. they can be covered by different nodes).

Because of this, we construct a different MDP Msuperscript𝑀M^{\prime}italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT from the symbolic description of M𝑀Mitalic_M, where the states also contain the assignment that was applied last time. This MDP is bisimilar to M𝑀Mitalic_M, the bisimulation relation being based on forgetting the last assignment – neither the “targetness” nor the enabled commands and their result distribution depends on the last assignment if the current state is known, so it is easy to prove that this is a bisimulation. As such, the maximal target probability on Msuperscript𝑀M^{\prime}italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is the same as on M𝑀Mitalic_M.

For any maximizing memoryless strategy σ^^𝜎\hat{\sigma}over^ start_ARG italic_σ end_ARG on the PASG, we can construct a non-memoryless strategy σ𝜎\sigmaitalic_σ on Msuperscript𝑀M^{\prime}italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT such that σ(τ)=σ^(last(abstr(τ)))𝜎𝜏^𝜎𝑙𝑎𝑠𝑡𝑎𝑏𝑠𝑡𝑟𝜏\sigma(\tau)=\hat{\sigma}(last(abstr(\tau)))italic_σ ( italic_τ ) = over^ start_ARG italic_σ end_ARG ( italic_l italic_a italic_s italic_t ( italic_a italic_b italic_s italic_t italic_r ( italic_τ ) ) ), which is now unique because of the assignment labels. The strategy is valid, which can be proven using an “inverted” 1 based on 2. This results in at least as high a probability for reaching a target state as in the PASG.

Theorem LABEL:thm:bi_soundness
Proof

The theorem is a result of combining 2 and 3: the probability in the PASG is both a lower and an upper approximation of the probability in the MDP, so it must be the same.

Lemma 4

The maximal probability of reaching a target state is always below U(n0)𝑈subscript𝑛0U(n_{0})italic_U ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) if BRTDP is used with PASG construction steps.

Proof

Sketch: Let Msuperscript𝑀M^{\prime}italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT be an assignment-labeled version of M𝑀Mitalic_M similarly to the proof of 3. For an element s𝑠sitalic_s and an assignment a𝑎aitalic_a, the Msuperscript𝑀M^{\prime}italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT state sadirect-sum𝑠𝑎s\oplus aitalic_s ⊕ italic_a corresponds to the M𝑀Mitalic_M state s𝑠sitalic_s with the latest assignment being a𝑎aitalic_a. The initial state is a special s0s_{0}\oplus\botitalic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⊕ ⊥ element without an assignment label. For all already existing PASG nodes n𝑛nitalic_n, U(n)V(Lc(n)an)𝑈𝑛𝑉direct-sumsubscript𝐿𝑐𝑛subscript𝑎𝑛U(n)\geq V(L_{c}(n)\oplus a_{n})italic_U ( italic_n ) ≥ italic_V ( italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ⊕ italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ), where V𝑉Vitalic_V is the real value function of Msuperscript𝑀M^{\prime}italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, ansubscript𝑎𝑛a_{n}italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT is the assignment corresponding to n𝑛nitalic_n according to constraint 2, and Lc(n)andirect-sumsubscript𝐿𝑐𝑛subscript𝑎𝑛L_{c}(n)\oplus a_{n}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ⊕ italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT is a state of Msuperscript𝑀M^{\prime}italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. This can be proven by induction. This is of course true when the node is created, as it is initialized to 1.01.01.01.0 unless it is a non-target absorbing node, in which case U(n)=V(Lc(n))=0.0𝑈𝑛𝑉subscript𝐿𝑐𝑛0.0U(n)=V(L_{c}(n))=0.0italic_U ( italic_n ) = italic_V ( italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ) = 0.0. Whenever U(n)𝑈𝑛U(n)italic_U ( italic_n ) is updated in later steps, we have two cases:

  1. 1.

    n𝑛nitalic_n is non-covered, so it is updated according to a transition edge. In this case, the update corresponds to an update on Msuperscript𝑀M^{\prime}italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT according to an upper approximation of the value function Usuperscript𝑈U^{\prime}italic_U start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, which is set to U(s)=U(m)superscript𝑈𝑠𝑈𝑚U^{\prime}(s)=U(m)italic_U start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( italic_s ) = italic_U ( italic_m ) if a node exists with Lc(m)amdirect-sumsubscript𝐿𝑐𝑚subscript𝑎𝑚L_{c}(m)\oplus a_{m}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_m ) ⊕ italic_a start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT and 1.01.01.01.0 otherwise. Because of the induction hypothesis, this is a valid upper bound for the value function, which means that computing a Bellman update using it is a valid

  2. 2.

    n𝑛nitalic_n is covered by nsuperscript𝑛n^{\prime}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT. We do not yet know whether the currently covering node will remain covering until the finished PASG, so we cannot say that all action sequences possible from Lc(n)andirect-sumsubscript𝐿𝑐𝑛subscript𝑎𝑛L_{c}(n)\oplus a_{n}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ⊕ italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT are possible from Lc(n)a(n)L_{c}(n^{\prime})\oplus a_{(}n^{\prime})italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ⊕ italic_a start_POSTSUBSCRIPT ( end_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ). However, as the covering edge still exists, we know that those action sequences that are possible from Lc(n)andirect-sumsubscript𝐿𝑐𝑛subscript𝑎𝑛L_{c}(n)\oplus a_{n}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ⊕ italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT but not from Lc(n)andirect-sumsubscript𝐿𝑐superscript𝑛subscript𝑎superscript𝑛L_{c}(n^{\prime})\oplus a_{n^{\prime}}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ⊕ italic_a start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT have not been explored yet: along the way of these sequences, there exists a non-covered non-expanded node. The value of this node is currently overapproximated by 1.01.01.01.0. So for all action sequences possible from Lc(n)andirect-sumsubscript𝐿𝑐𝑛subscript𝑎𝑛L_{c}(n)\oplus a_{n}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ⊕ italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT, they are either also possible from Lc(n)andirect-sumsubscript𝐿𝑐superscript𝑛subscript𝑎superscript𝑛L_{c}(n^{\prime})\oplus a_{n^{\prime}}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ⊕ italic_a start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT, or there is a prefix of it that is possible from it, and ends in a node whose value approximation is set to 1.01.01.01.0. Because of this (and the fact that we are aiming to maximize the value), the current value approximation of nsuperscript𝑛n^{\prime}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT must be higher than the value of Lc(n)andirect-sumsubscript𝐿𝑐𝑛subscript𝑎𝑛L_{c}(n)\oplus a_{n}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ⊕ italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT, so copying it to n𝑛nitalic_n does not violate the induction assumption.

As Msuperscript𝑀M^{\prime}italic_M start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT is bisimilar to M𝑀Mitalic_M, V(s0)=V(s0)V(s_{0})=V(s_{0}\oplus\bot)italic_V ( italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) = italic_V ( italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⊕ ⊥ ), so U(n0)𝑈subscript𝑛0U(n_{0})italic_U ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) is also an upper bound for s0subscript𝑠0s_{0}italic_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT.

Lemma 5

The maximal probability of reaching a target state is always above L(n0)𝐿subscript𝑛0L(n_{0})italic_L ( italic_n start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) if BRTDP is used with PASG construction steps.

Proof

Sketch: This can be proven similarly to 4. The upper approximation U𝑈Uitalic_U is replaced in all statements with the lower approximation L𝐿Litalic_L. The non-covered case goes the same.

If n𝑛nitalic_n is covered by nsuperscript𝑛n^{\prime}italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT in the finished PASG, we know that all action sequences starting from Lc(n)andirect-sumsubscript𝐿𝑐superscript𝑛subscript𝑎superscript𝑛L_{c}(n^{\prime})\oplus a_{n^{\prime}}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ⊕ italic_a start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT are also available from Lc(n)andirect-sumsubscript𝐿𝑐𝑛subscript𝑎𝑛L_{c}(n)\oplus a_{n}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ⊕ italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT. When an update happens during BRTDP, we do not know this yet. However, for all action sequences available from Lc(n)andirect-sumsubscript𝐿𝑐superscript𝑛subscript𝑎superscript𝑛L_{c}(n^{\prime})\oplus a_{n^{\prime}}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ⊕ italic_a start_POSTSUBSCRIPT italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT and not available from Lc(n)andirect-sumsubscript𝐿𝑐𝑛subscript𝑎𝑛L_{c}(n)\oplus a_{n}italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ⊕ italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT, as the covering edge is still there, there must be a state whose current lower approximation is 00, as the corresponding PASG node has not been expanded yet. So the current L(n)𝐿superscript𝑛L(n^{\prime})italic_L ( italic_n start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) must be lower than V(Lc(n)an)𝑉direct-sumsubscript𝐿𝑐𝑛subscript𝑎𝑛V(L_{c}(n)\oplus a_{n})italic_V ( italic_L start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT ( italic_n ) ⊕ italic_a start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ).

Theorem LABEL:thm:bi_brtdp_soundness
Proof

Consequence of combining 4 and 5.

Appendix 0.B Further Experiment Results

Figure 4 compares the PRED and EXPL domains when using BVI. Although EXPL always wins in running time, PRED can sometimes reduce the number of nodes further. If the overhead of refinement in the PRED domain can be mitigated making it terminate in time on more inputs, the better reduction capability might be observed on even more inputs.

Figure 5 compares the results using BRTDP both for the baseline and abstraction-based analysis. No clear advantage of either of them can be observed.

Refer to caption
Figure 4: Comparison of EXPL and PRED BVI.
Refer to caption
Figure 5: Comparison of abstract and standard BRTDP. Unlike in the other two figures, the node number comparison uses log scale, as linear scaling made it hard to see.