License: CC BY 4.0
arXiv:2401.09292v1 [cs.PF] 17 Jan 2024

Hierarchical Analyses Applied to Computer System Performance: Review and Call for Further Studies

Alexander Thomasian
Thomasian and Associates
Pleasantville, NY
[email protected]
Abstract

We review studies based on analytic (A) and simulation (S) methods for hierarchical performance analysis of Queueing Network - QN models. The combination of A at the lower level and S at the higher level has been applied most often. The proposed methods result in an order of magnitude reduction in performance evaluation cost with respect to simulation. The computational cost at the lower level to obtain an exact solution is reduced when the computer system can be modeled as a product-form QN amenable to a low cost solution. A Continuous Time Markov Chain - CTMC or discrete-event simulation can then be used at the higher level. We first consider a multiprogrammed transaction - txn processing system with Poisson arrivals and predeclared lock requests. Txns with lock conflicts with active txns are held in an FCFS queue and txns are activated after they acquire all requested locks. Txn throughputs obtained by the analysis of multiprogrammed computer systems serve as the transition rates in a higher level CTMC to determine txn response times. We next analyze a task system where task precedence relationships are specified by a directed acyclic graph to determine its makespan. Task service demands are specified on the devices of a computer system. The composition of tasks in execution determines their processing time and throughputs, which serve as transition rates among the states of the CTMC model. To reduce memory space requirements the CTMC is built and solved one set of task completions at a time. As a third example we consider the hierarchical simulation of a timesharing system with two user classes. Txn throughputs in processing various combinations of requests are obtained by analyzing a closed product-form QN model. A discrete event simulator is provided. More detailed QN modeling parameters, such as the distribution of the number of cycles of tasks consisting of Fork/Join (F/J) requests, affect performance. This detail can be taken into account in Schwetman’s hybrid simulation method, which counts the remaining number of cycles in a CSM-like queueing model. We discuss an extension to hybrid simulation that adjusts job service demands according to elapsed time, rather than counting cycles. A section reviewing related studies is provided. Equilibrium Point Analysis to reduce the computational cost in applying hierarchical analysis is presented in the Appendix. The discussion is applicable to performance modeling of manufacturing systems.

1 Introduction

Product-form Queueing Networks - QNs were initially restricted to single- and multi-server nodes with exponential service times and FCFS scheduling Jackson 1957 [25]. Product-form QNs were extended to Processor-Sharing - PS and Last-Come First-Served Preemptive Resume - LCFSPR Kleinrock 1976 [29] and delay servers. The latter servers allow general service times according to the BCMP theorem Baskett et al. 1975 [5]. PS is an extreme form of round-robin CPU scheduling, where each job is allowed a quantum of $q \rightarrow 0$ time units before preemption [29].

The Buzen Convolution Algorithm - BCA Buzen 1973 [8] was a first step in efficiently solving product-form closed QNs, where completed jobs are immediately replaced by a new job. BCA was applied to the Central Server Model - CSM described below.

Central Server Model - CSM

CSM is a closed QN model of a multiprogrammed computer system Buzen 1973 [8], which consists of a CPU and multiple disks. Jobs alternate between CPU and disk processing until they are completed. Completed jobs are immediately replaced by another job in closed systems or after think times modeled as delay servers in time-sharing systems.

The CPU is designated as the central station ${\cal S}_1$ and the $N-1$ disks as peripheral stations ${\cal S}_n, 2 \leq n \leq N$. Given the state transition probabilities ${\cal S}_i \xrightarrow{p_{i,j}} {\cal S}_j$, the following transitions are applicable to CSM.

$$p_{1,n}, \; 2 \leq n \leq N, \quad p_{n,1} = 1, \; 2 \leq n \leq N$$

The self-transition $p_{1,1} = 1 - \sum_{n=2}^{N} p_{1,n}$ implies the completion of a job in a closed QN (or a job that leaves the system in an open QN). The number of visits to the CPU ($\bar{v}_1$) is given by the geometric distribution Trivedi 2001 [66].

$$q_k = p_{1,1}(1 - p_{1,1})^{k-1}, \; k \geq 1, \quad \bar{k} = v_1 = 1/p_{1,1}.$$

The relative number of visits to the stations is obtained by solving

$$\underline{v} = \underline{v}{\bf P}.$$

It follows that $v_n = p_{1,n} v_1 = p_{1,n}/p_{1,1}, 2 \leq n \leq N$.

Given that the mean service time at ${\cal S}_n$ per visit is $\bar{x}_n$, the mean loading per job is $X_n = v_n \bar{x}_n, 1 \leq n \leq N$.
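The visit ratios and loadings above can be computed directly from the branching probabilities. The following Python sketch is illustrative only: the function name `csm_loadings` and the numerical values are assumptions, not taken from the cited studies.

```python
def csm_loadings(p_disk, service_times):
    """Visit ratios v_n and loadings X_n for a central server model.

    p_disk: branching probabilities p_{1,n} from the CPU to disks n = 2..N.
    service_times: mean service time per visit, CPU first, then the disks.
    """
    p11 = 1.0 - sum(p_disk)  # job completion probability p_{1,1}
    if p11 <= 0.0:
        raise ValueError("p_{1,1} must be positive for jobs to complete")
    # v_1 = 1/p_{1,1} and v_n = p_{1,n}/p_{1,1} for the disks
    visits = [1.0 / p11] + [p / p11 for p in p_disk]
    # X_n = v_n * mean service time per visit
    loadings = [v * x for v, x in zip(visits, service_times)]
    return visits, loadings


# Hypothetical example: a CPU and two disks.
visits, loadings = csm_loadings([0.5, 0.4], [0.01, 0.03, 0.03])
```

With these assumed inputs a job makes ten CPU visits on average, and the loadings are roughly 0.10, 0.15 and 0.12 time units at the CPU and the two disks.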

Sauer and Chandy 1975 [47] consider the analysis of a CSM whose CPU has FCFS and priority (nonpreemptive and preemptive) scheduling and nonexponential service times.

CSM is a single job class QN but BCA was extended to multiple job classes by Buzen and coworkers at BGS for inclusion in the BEST/1 capacity planning tool for MVS OS, renamed z/OS [9]. This extension was also done independently at IBM by Reiser and Kobayashi in 1975 [41]. The tutorial by Williams and Bhandiwad 1976 [70] on the use of generating functions in developing the convolution algorithm for multiple job classes was extended in Thomasian and Nadji 1981 [54].

The Mean Value Analysis - MVA method developed by Reiser and Lavenberg 1980 [42, 43] has the same computational cost as BCA, but higher memory requirements. It has numerical problems in dealing with state-dependent servers whose service rate varies with the number of jobs, such as multiserver queues. MVA on the other hand has led to several low-cost, iterative solution methods, such as Bard-Schweitzer, see e.g., Lazowska et al. 1984 [34], and Linearizer Chandy and Neuse 1982 [12]. Efficient approximate computational methods were later developed by extending MVA to non-product-form QNs, such as FCFS scheduling with general service times Lazowska et al. 1984 [34].
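To make the MVA recursion concrete, the following Python sketch implements exact single-class MVA for fixed-rate queueing stations only. It is a minimal illustration under that assumption: the function name `mva_throughput` is hypothetical, and delay servers, state-dependent servers and multiple job classes are left out.

```python
def mva_throughput(loadings, max_mpl):
    """Return [T(1), ..., T(max_mpl)] for a closed single-class QN.

    loadings: service demands X_n of fixed-rate queueing stations.
    """
    queue = [0.0] * len(loadings)  # mean queue lengths with m-1 jobs
    throughputs = []
    for m in range(1, max_mpl + 1):
        # Arrival theorem: an arriving job sees the network with m-1 jobs.
        residence = [x * (1.0 + q) for x, q in zip(loadings, queue)]
        t = m / sum(residence)  # Little's result applied to the whole network
        queue = [t * r for r in residence]  # Little's result per station
        throughputs.append(t)
    return throughputs
```

For a balanced network of two stations with unit loadings, `mva_throughput([1.0, 1.0], 3)` yields throughputs of 1/2, 2/3 and 3/4, in agreement with the closed-form result $T(m) = m/(m+1)$ for this special case.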

Analysis of open (resp. closed) QNs requires the arrival rate (resp. degree of concurrency or MultiProgramming Level - MPL) and service demands or loadings, which are the product of the mean number of job visits to the devices of a computer and the mean service time per visit [8].
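Given the loadings and the MPL, the throughput characteristic of a single-class closed QN follows from the normalization constants $G(m)$ as $T(m) = G(m-1)/G(m)$. The following Python sketch of the convolution recursion assumes fixed-rate stations only; the function name `convolution_throughput` is hypothetical and no scaling is applied to guard against overflow or underflow for large populations.

```python
def convolution_throughput(loadings, max_mpl):
    """Return [T(1), ..., T(max_mpl)] via the convolution recursion.

    loadings: service demands X_n of fixed-rate queueing stations.
    """
    # g[m] holds G(m); with no stations included, G(0) = 1 and G(m) = 0.
    g = [1.0] + [0.0] * max_mpl
    for x in loadings:
        # Fold in one station at a time: G_n(m) = G_{n-1}(m) + X_n * G_n(m-1).
        for m in range(1, max_mpl + 1):
            g[m] += x * g[m - 1]
    return [g[m - 1] / g[m] for m in range(1, max_mpl + 1)]
```

The resulting list can serve as the throughput characteristic $T(k)$ of a flow-equivalent service center in a higher level model.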

IBM’s Software Measurement Facility - SMF measures the mean time computer devices (CPU and disk) are busy serving tasks. The service demands differ according to job class, e.g., batch versus online transactions - txns. Given the MPL and service demands, the BEST/1 Buzen et al. 1978 [9, 10] and MAP Lazowska et al. 1984 [34] capacity planning tools use QN analysis to obtain performance metrics of interest such as job throughput, device utilizations, response times, and queue lengths [33, 34, 6, 31].

When tasks are to be processed at heterogeneous computer systems, e.g., with different CPU speeds or different storage systems such as Hard Disk Drives - HDDs versus Solid State Disks - SSDs, task processing requirements should be specified in a device-independent manner, e.g., program pathlengths, which can be converted to CPU time based on the CPU's MIPS rating.

Processing Time of Fork/Join Requests

As an example of hierarchical modeling consider the time it takes to execute the tasks of a single $K$-way F/J request on a multiprogrammed computer system. The approximate hierarchical analysis method based on decomposition Courtois 1975 [16] (see e.g., Section 9.3.1 in Lazowska et al. 1984 [34]) uses a Flow-Equivalent Service Center - FESC, whose throughput characteristic is obtained by analyzing the underlying QN model.

The $K$ tasks are assumed to have identical service demands and can be activated concurrently by a computer system with maximum MPL $M_{max} \geq K$. Task completion rates can be determined at low cost, yielding $T(k), 1 \leq k \leq K$. In hierarchical modeling task completion times are assumed to be exponentially distributed and the completion time of $K$ tasks can be determined by a death process [28].

$$S_k \stackrel{T(k)}{\longrightarrow} S_{k-1}, \quad K \geq k \geq 1.$$

The completion time of F/J requests is:

$$R_{F/J}(K) = \sum_{k=1}^{K} R(k) \mbox{ where } R(k) = [T(k)]^{-1}.$$
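The sum above is a one-line computation once the throughput characteristic is available. In the following Python sketch the function name `fork_join_completion_time` and the throughput values are assumptions chosen for illustration.

```python
def fork_join_completion_time(throughputs):
    """Mean completion time R_{F/J}(K) of a K-way F/J request.

    throughputs[k-1] holds T(k), the task completion rate with k tasks
    in execution, for k = 1..K. The death process spends a mean time
    1/T(k) in state S_k before moving to S_{k-1}.
    """
    return sum(1.0 / t for t in throughputs)


# Hypothetical throughput characteristic T(1), T(2), T(3).
r_fj = fork_join_completion_time([0.5, 2.0 / 3.0, 0.75])
```

For these assumed rates the mean completion time is 2 + 1.5 + 4/3, i.e., about 4.83 time units.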

1.1 Degree of Concurrency Constraints

Product-form QN models of computer systems with Poisson arrivals with rate $\lambda$ are not amenable to a direct solution when the degree of concurrency or MPL, say $M_{max}$, is taken into account, because the number of jobs at the QN may exceed $M_{max}$.

Assuming that the throughput characteristic $T(M)$ is a nondecreasing function of $M$, the maximum system throughput is $\lambda_{max} < T(M_{max})$, since otherwise the system will become saturated, i.e., a queue of infinite length will be formed Kleinrock 1975 [28].

The MPL constraint is taken into account in applying a birth-death queueing model with arrival rate $\lambda$ and service rate $\mu_k$ by “flattening” the throughput characteristic beyond $T(M_{max})$ for the FESC as given by Eq. (1):

$$\mu_k = \begin{cases} T(k), & 1 \leq k \leq M_{max} \\ T(M_{max}), & k \geq M_{max} \end{cases} \tag{1}$$

The mean number of tasks in the system and the memory queue (MemQ) are obtained as follows.

$$\bar{N} = \sum_{k \geq 1} k p_k, \quad \bar{N}_{MemQ} = \sum_{k \geq M_{max}} (k - M_{max}) p_k. \tag{2}$$

The mean response time in the system and the mean waiting time in the memory queue are obtained by applying Little’s result, i.e., dividing by the task arrival rate $\lambda$ [28]:

$$R_{system} = \bar{N}/\lambda \mbox{ and } W_{MemQ} = \bar{N}_{MemQ}/\lambda.$$

State probabilities of the birth-death process with arrival rate $\lambda$ and processing rate $\mu_k, k \geq 1$ are obtained by setting $S = p_0 = 1$, $\bar{N} = 0$ and iterating:

$$p_k = (\lambda/\mu_k) p_{k-1}, \quad S \mathrel{+}= p_k, \quad \bar{N} \mathrel{+}= k p_k \tag{3}$$

$$k \geq 1 \mbox{ till } p_k \leq \epsilon, \quad \bar{N} = \bar{N}/S, \quad R = \bar{N}/\lambda.$$
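The iteration of Eq. (3), combined with the flattened service rates of Eq. (1) and the memory queue length of Eq. (2), can be sketched in Python as follows. The function name `birth_death_metrics`, the default tolerance and the cap on the number of states are assumptions made for this illustration, and the throughput characteristic is assumed to be nondecreasing.

```python
def birth_death_metrics(arrival_rate, throughputs, eps=1e-12,
                        max_states=1_000_000):
    """Solve the birth-death model with a flattened throughput characteristic.

    throughputs[k-1] holds T(k) for k = 1..M_max; the service rate stays
    at T(M_max) for larger k. Returns the mean number in system, the mean
    response time, the mean memory queue length and the mean memory wait.
    """
    m_max = len(throughputs)
    if arrival_rate >= throughputs[-1]:
        raise ValueError("arrival rate must be below T(M_max)")
    p_k = 1.0    # unnormalized p_0
    total = 1.0  # S in Eq. (3)
    n_sum = 0.0  # unnormalized mean number in system
    q_sum = 0.0  # unnormalized mean number in the memory queue
    for k in range(1, max_states + 1):
        mu_k = throughputs[min(k, m_max) - 1]  # Eq. (1)
        p_k *= arrival_rate / mu_k
        total += p_k
        n_sum += k * p_k
        if k > m_max:
            q_sum += (k - m_max) * p_k
        if p_k <= eps:  # truncate the tail once it is negligible
            break
    n_bar = n_sum / total
    n_memq = q_sum / total
    # Little's result gives the two mean delays.
    return n_bar, n_bar / arrival_rate, n_memq, n_memq / arrival_rate
```

With a single-entry characteristic the model reduces to an M/M/1 queue, which gives a convenient check: `birth_death_metrics(1.0, [2.0])` returns values close to 1.0, 1.0, 0.5 and 0.5.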

Example I: Transactions with predeclared lock requests: Txns with predeclared lock requests arrive according to a Poisson process, with class frequencies $f_j, 1 \leq j \leq J$, and can execute concurrently if they have no conflicts Thomasian 1985 [58]. Txn response time is the sum of the queueing delay in an FCFS queue awaiting the acquisition of all locks, at which point the txn is activated, and the task execution time at the computer system. It is assumed that the maximum MPL is not a constraint.

An approximate solution is also presented by analyzing the QN for various degrees of concurrency and using the resulting throughput as if there is a single job class.

Example II: Tasks with precedence relationships: Task precedence relationships are specified by a directed acyclic graph - dag which is referred to as a task system in Coffman and Denning 1973 [15]. Task processing times are specified by their execution time on the devices of a single computer. An optimal scheduling algorithm for two processors is presented in this book, while scheduling with more processors is explored in Adam et al. 1973 [2]. The results were compared against the bound by Fernandez and Bussell 1973 [20].

A Continuous Time Markov Chain - CTMC as the higher level model and a product-form QN model for task execution on a multiprogrammed computer system are considered in Thomasian and Bay 1986 [59].

Example III: Timesharing system: Simulation is a flexible approach for the higher level and its use is illustrated in the context of a timesharing system with two job classes Sauer 1981 [49]. At the lower modeling level task throughputs are obtained by analyzing a closed product-form QN model. Section 4 specifies a discrete event simulation for higher level analysis of a timesharing system, whose tasks are processed in a multiprogrammed computer system.

Example IV: Fork/Join Analysis: A detailed QN model is required in evaluating the performance of Fork/Join - F/J systems Thomasian 2014 [64]. This is because the completion time of several tasks started concurrently is affected by the distribution of the number of processing cycles. Detailed modeling can be better handled by hybrid simulation Schwetman 1978 [51].

The paper is organized as follows. Section 2 discusses a hierarchical model for analyzing a txn processing system with predeclared lock requests. Section 3 determines the makespan of a task system, whose tasks execute at a computer system. Section 4 describes a simulation model to estimate the mean response times of timesharing requests. The effect of transition probabilities on completion times is discussed in Section 5. Section 6 describes the hybrid simulation method and proposes extensions to it, which were earlier discussed in [56] and which require further investigation and validation. Related work is presented in Section 7. Conclusions and further work are provided in Section 8. Equilibrium Point Analysis - EPA applied to reducing the cost of solving a txn processing system is presented in the Appendix.

2 Transaction Processing with Predeclared Lock Requests

The effect of granularity of locking on txn response time is investigated in Thomasian 1985 [58]. Txns are activated after acquiring all locks, while txns with lock conflicts with currently active txns are held in a queue until requested locks are released by completed txns. Txns are processed in FCFS order.

Txn response time is the sum of the queueing delay to acquire all locks and the txn execution time at the computer system, which is represented by a product-form QN model. We consider $J = 5$ txn classes and a maximum degree of concurrency $K = 2$, since only txns in classes ${\cal C}_1$ and ${\cal C}_2$ can be processed concurrently. Txns in ${\cal C}_j$ constitute a fraction $f_j$ of the Poisson arrival stream.

Using the hierarchical decomposition method, the txn throughputs obtained from the QN model are then incorporated into the higher-level model, which is a 2-dimensional CTMC. One dimension is the composition of txns in execution

$${\cal S}_j, 1 \leq j \leq J \mbox{ and } {\cal S}_{1,2}$$

and another dimension the number of requests in the system.

Wallace and Rosenberg’s 1966 Recursive Queueing Analyzer - RQA [68] was used to succinctly specify the sparse, regularly structured state transition matrix ($Q$). The number of states in the second dimension is set to be sufficiently large so that the fraction of txns lost due to the finite capacity is negligibly small for the given arrival rate. The state probabilities are obtained by an iterative method of the form

$$\underline{\pi}(k+1) = \underline{\pi}(k)(c{\bf Q} + {\bf I}),$$

where $c$ is a constant and ${\bf I}$ is the identity matrix Kleinrock 1975 [28], Bolch et al. 2006 [6].
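A minimal Python sketch of this iteration for a small, dense generator matrix is given below. The choice $c = 0.99/\max_i |q_{i,i}|$, which keeps $c{\bf Q} + {\bf I}$ a stochastic matrix with positive diagonal, is an assumption of this sketch rather than a value given in the cited work, as are the function name `ctmc_steady_state` and the convergence test. A sparse representation would be needed for the large state spaces discussed in the text.

```python
def ctmc_steady_state(q, tol=1e-12, max_iter=100_000):
    """Iterate pi(k+1) = pi(k) (cQ + I) until successive vectors agree.

    q: generator matrix as a list of rows, each row summing to zero.
    """
    n = len(q)
    # Assumed step size: slightly below 1 / (largest exit rate).
    c = 0.99 / max(-q[i][i] for i in range(n))
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        new = [pi[j] + c * sum(pi[i] * q[i][j] for i in range(n))
               for j in range(n)]
        total = sum(new)
        new = [x / total for x in new]  # guard against rounding drift
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    raise RuntimeError("iteration did not converge")


# Hypothetical two-state chain with rates 1 (state 0 to 1) and 2 (1 to 0).
pi = ctmc_steady_state([[-1.0, 1.0], [2.0, -2.0]])
```

For this two-state example the balance equation $\pi_0 = 2\pi_1$ gives the steady-state probabilities 2/3 and 1/3, which the iteration reproduces.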

Given the state probabilities we can obtain the mean number of txns in different classes $\overline{N}_j, 1 \leq j \leq J$. The mean txn response times follow as $R_j = \overline{N}_j/\lambda_j$.

The analysis can be extended to FCFS with skipping, as in the analysis of static locking in Thomasian and Ryu 1983 [55]. The latter analysis postulates a fine granularity of locking and that lock requests are uniformly distributed over database granules.

Aggregating Multiple Transaction Classes

When the txn classes are aggregated into a single class, the resulting system can be specified by the throughput characteristic

$$T(k), 1 \leq k \leq K_{max} \mbox{ and } T(k) = T(K_{max}), k \geq K_{max},$$

which is then incorporated into a higher-level birth-death model.

Txn throughput with a single class is a weighted sum according to txn frequencies. Note that txns in the same class are not compatible with each other and only txns in ${\cal C}_1$ and ${\cal C}_2$ can be executed together. With at most two txns in execution and an infinite backlog of txns processed in FCFS order we have the transition rate matrix $T$ among the execution states:

$$T_{j,j} = -\sum_{i \neq j} T_{j,i}, \quad j = 1, \ldots, 6.$$

Solving the set of linear equations yields the state probabilities.

$$\underline{p}{\bf T(2)} = 0 \mbox{ and } \sum_{\forall (i)} p_i = 1.$$

The matrix for a closed task system with two tasks and compatible ${\cal C}_1$ and ${\cal C}_2$ classes is as follows.

$$\begin{pmatrix}
T_{1,1} & 0 & \frac{f_3}{1-f_2}T_1 & \frac{f_4}{1-f_2}T_1 & \frac{f_5}{1-f_2}T_1 & \frac{f_1 f_2}{1-f_2}T_1 \\
0 & T_{2,2} & \frac{f_3}{1-f_1}T_2 & \frac{f_4}{1-f_1}T_2 & \frac{f_5}{1-f_1}T_2 & \frac{f_1 f_2}{1-f_1}T_2 \\
f_1(1-f_2)T_3 & f_2(1-f_1)T_3 & T_{3,3} & f_4 T_3 & f_5 T_3 & 2 f_1 f_4 T_3 \\
f_1(1-f_2)T_4 & f_2(1-f_1)T_4 & f_3 T_4 & T_{4,4} & f_5 T_4 & 2 f_1 f_2 T_4 \\
f_1(1-f_2)T_5 & f_2(1-f_1)T_5 & f_3 T_5 & f_4 T_5 & T_{5,5} & 2 f_1 f_2 T_5 \\
(1-f_2){T'}_2 & (1-f_1){T'}_1 & 0 & 0 & 0 & T_{6,6}
\end{pmatrix} \tag{10}$$

After solving the set of linear equations to obtain the state probabilities $P[E_i], \forall i$, we can obtain the throughput for class $\mathcal{C}_j$ with $k$ txns in execution:

$$T_j(k) = \sum_{\forall i} P[E_i]\, T_j(E_i) \tag{11}$$

The overall txn throughput is

$$T(k) = \sum_{\forall j} T_j(k) \tag{12}$$

The mean number of txns in class $\mathcal{C}_j$ with $k$ txns in the system is:

$$\overline{N}_j(k) = \sum_{\forall i} P[E_i]\, |E_i|_j \tag{13}$$

where $|E_i|_j$ is the number of txns in class $\mathcal{C}_j$ executing in that state (zero or one).
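Eqs. (11) to (13) are probability-weighted sums over the execution states, which the following minimal sketch illustrates. The state labels, probabilities and per-state throughputs are made-up numbers chosen for illustration only; they are not derived from the transition matrix in Eq. (10), and the function names are hypothetical.

```python
from typing import Dict, Iterable


def class_throughputs(prob: Dict[str, float],
                      thr: Dict[str, Dict[int, float]],
                      classes: Iterable[int]) -> Dict[int, float]:
    """Eq. (11): T_j(k) = sum_i P[E_i] * T_j(E_i)."""
    return {j: sum(p * thr[s].get(j, 0.0) for s, p in prob.items())
            for j in classes}


def mean_population(prob: Dict[str, float],
                    count: Dict[str, Dict[int, int]],
                    classes: Iterable[int]) -> Dict[int, float]:
    """Eq. (13): N_j(k) = sum_i P[E_i] * |E_i|_j."""
    return {j: sum(p * count[s].get(j, 0) for s, p in prob.items())
            for j in classes}


# Illustrative (assumed) data: two execution states and two classes.
PROB = {"E1": 0.25, "E2": 0.75}                  # state probabilities
THR = {"E1": {1: 2.0}, "E2": {1: 1.5, 2: 1.0}}   # T_j(E_i), txns per unit time
COUNT = {"E1": {1: 1}, "E2": {1: 1, 2: 1}}       # |E_i|_j, zero or one

T = class_throughputs(PROB, THR, (1, 2))
N = mean_population(PROB, COUNT, (1, 2))
T_TOTAL = sum(T.values())                        # Eq. (12)
```

Classes that are absent from a state simply contribute zero to the corresponding sum.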

Txn throughputs for the five classes with $k=2$ txns are:

$$T_j(2) = P(\mathcal{S}_j)\, T_j(\mathcal{S}_j) + P(\mathcal{S}_{1,2})\, T_j(\mathcal{S}_{1,2}), \quad j = 1, 2,$$
$$T_j(2) = P(\mathcal{S}_j)\, T_j(\mathcal{S}_j), \quad 3 \le j \le 5.$$

With $k$ txns in the system the class throughputs satisfy:

$$T_i(k)/T_j(k) = f_i/f_j \mbox{ and } T_i(k) = f_i T(k),$$

where $T(k) = \sum_{\forall i} T_i(k)$. For degree of concurrency $k=2$ the mean number of txns in the different classes is:

$$\overline{N}_1(2) = P[\mathcal{S}_1]\left(1 + \frac{f_1}{1-f_2}\right) + P[\mathcal{S}_{1,2}] + f_1 \sum_{j=3}^{5} P[\mathcal{S}_j]$$
$$\overline{N}_2(2) = P[\mathcal{S}_2]\left(1 + \frac{f_2}{1-f_1}\right) + P[\mathcal{S}_{1,2}] + f_2 \sum_{j=3}^{5} P[\mathcal{S}_j]$$
$$\overline{N}_j(2) = P[\mathcal{S}_j](1 + f_j) + P[\mathcal{S}_1]\,\frac{f_j}{1-f_2} + P[\mathcal{S}_2]\,\frac{f_j}{1-f_1}, \quad 3 \le j \le 5.$$

3 Makespan of Task System with Multiprogrammed Tasks

A task system is specified by a directed acyclic graph (dag) with precedence relationships among tasks. Tasks in Coffman and Denning 1973 [15] have fixed execution times. We consider instead a task system whose tasks are specified by their service demands at the devices of a multiprogrammed computer system. In Thomasian and Bay 1986 we develop a hierarchical analysis to determine the makespan, i.e., the completion time of the task system.

We consider a simple task system $\mathbf{T}$ with six tasks. Two complementary tasks are added: $\tau_0$, which precedes all tasks, and $\tau_\infty$, which succeeds all tasks that otherwise have no successors. These two tasks are processed instantaneously.

$$\mathbf{T} = \{\tau_0, \tau_1, \tau_2, \tau_3, \tau_4, \tau_5, \tau_6, \tau_\infty\}$$

with the following precedence relationships:

$$\tau_0 \prec \{\tau_1, \tau_2\} \prec \tau_3, \qquad \tau_0 \prec \tau_4 \prec \{\tau_5, \tau_6\}, \qquad \{\tau_3, \tau_5, \tau_6\} \prec \tau_\infty.$$

The task system makespan is $C = \mathrm{Init}_\infty$. We are also interested in the initiation ($\mathrm{Init}_i$), completion ($\mathrm{Comp}_i$) and execution ($\mathrm{Exec}_i = \mathrm{Comp}_i - \mathrm{Init}_i$) times of the $i^{th}$ task. The execution time of a task is the time the task spends in the system.

The task system $\mathbf{T}$ leads to the CTMC for task execution states given in Table 1. Task combinations executed together, known as tasksets, are listed level by level. An implicit instantaneous transition from state $\{\tau_\infty\}$ to state $\{\tau_0\}$ can be postulated, so that the execution of the task system is repeated.

| Level | Tasksets |
|---|---|
| $L_0$ | $\{\tau_0\}$ |
| $L_1$ | $\{\tau_1,\tau_2,\tau_4\}$ |
| $L_2$ | $\{\tau_1,\tau_4\}$, $\{\tau_2,\tau_4\}$, $\{\tau_1,\tau_2,\tau_5,\tau_6\}$ |
| $L_3$ | $\{\tau_1,\tau_2,\tau_5\}$, $\{\tau_1,\tau_2,\tau_6\}$, $\{\tau_1,\tau_5,\tau_6\}$, $\{\tau_2,\tau_5,\tau_6\}$, $\{\tau_3,\tau_4\}$ |
| $L_4$ | $\{\tau_1,\tau_2\}$, $\{\tau_1,\tau_5\}$, $\{\tau_1,\tau_6\}$, $\{\tau_2,\tau_5\}$, $\{\tau_2,\tau_6\}$, $\{\tau_3,\tau_5,\tau_6\}$, $\{\tau_4\}$ |
| $L_5$ | $\{\tau_1\}$, $\{\tau_2\}$, $\{\tau_3,\tau_5\}$, $\{\tau_3,\tau_6\}$, $\{\tau_5,\tau_6\}$ |
| $L_6$ | $\{\tau_3\}$, $\{\tau_5\}$, $\{\tau_6\}$ |
| $L_7$ | $\{\tau_\infty\}$ |

Table 1: The CTMC is built top to bottom, taking into account precedence relationships. Given that one task completes per level, the number of levels equals the number of tasks. Tasksets at a lower level are either subsets of tasksets at the level above or include additional tasks activated when their precedence relationships are satisfied.
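The levels of Table 1 can be generated mechanically from the precedence relationships. The sketch below is one possible way to do so, assuming that every task whose predecessors have completed is activated immediately (no limit on the degree of multiprogramming). The dummy tasks are left implicit, so level $L_0$ is omitted and the empty taskset at the last level stands for $\{\tau_\infty\}$. The names `PREDS`, `active` and `taskset_levels` are illustrative and do not come from the cited work.

```python
from typing import Dict, FrozenSet, List, Set

# Immediate predecessors of the six real tasks (tau_0 and tau_inf are implicit).
PREDS: Dict[int, Set[int]] = {1: set(), 2: set(), 3: {1, 2},
                              4: set(), 5: {4}, 6: {4}}


def active(done: FrozenSet[int]) -> FrozenSet[int]:
    """Tasks in execution once the tasks in `done` have completed."""
    return frozenset(t for t in PREDS
                     if t not in done and PREDS[t] <= done)


def taskset_levels() -> List[Set[FrozenSet[int]]]:
    """Tasksets per level; exactly one task completes between levels."""
    levels: List[Set[FrozenSet[int]]] = []
    frontier: Set[FrozenSet[int]] = {frozenset()}   # sets of completed tasks
    while frontier:
        levels.append({active(done) for done in frontier})
        # Each task in execution may complete next, giving one successor each.
        frontier = {done | {t} for done in frontier for t in active(done)}
    return levels


LEVELS = taskset_levels()   # LEVELS[0] corresponds to L_1 in Table 1
```

Each state is keyed by its set of completed tasks, from which the taskset in execution follows; successor states reached along different paths are merged automatically because sets are used per level.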

The completion of $\tau_4$ leads to the following transition:

$$S = \{\tau_1,\tau_2,\tau_4\} \rightarrow \{\tau_1,\tau_2,\tau_5,\tau_6\}$$

The mean state holding time is the inverse of the sum of task throughputs, which is the rate of the exponentially distributed holding time according to the decomposition principle (Courtois 1975 [16]), which is discussed informally in Lazowska et al. 1984 [34]. The notation used in this section is as follows:

$I$: Number of tasks including the two dummy tasks, which complete instantaneously.
$L$: Number of CTMC levels, with $L = I$, since one task completes per level.
$S$: State representation.
$\mathcal{S}_\ell$: Set of states at level $\ell$.
$|S| = \{\tau_i, \tau_j, \ldots\}$: Set of tasks in execution at state $S$.
$\mathcal{S}_i$: Set of states at which task $\tau_i$ is executed.
$S^+$: Immediate successors to state $S$.
$S^-$: Immediate predecessors to state $S$.
$P(S)$: Steady-state probability of being in state $S$.
$T_i(S)$: Completion rate or throughput of task $\tau_i \in |S|$.
$T(S) = \sum_{\tau_i \in |S|} T_i(S)$: Sum of completion rates at $S$.
$H(S) = [T(S)]^{-1}$: Mean holding time in $S$.
$b_R(S)$: Branching probability from $S$ to $R$.
$p(R) = \sum_{S \in R^-} p(S)\, b_R(S)$: Path probability to $R$.
$D(R) = \sum_{S \in R^-} p(S)\, b_R(S)\, D(S) + H(R)$: Mean delay to complete state $R$.
$C$: Completion time of all tasks.
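To make the roles of $T(S)$, $H(S)$ and $b_R(S)$ concrete, the following minimal sketch computes the mean holding time and the branching probabilities of a single state from its task completion rates, using $b_R(S) = T_i(S)/T(S)$ for the successor $R$ reached when $\tau_i$ completes. The rates for $S = \{\tau_1,\tau_2,\tau_4\}$ are assumed values for illustration; in the hierarchical analysis they would come from the lower-level QN model.

```python
from typing import Dict, Tuple


def holding_and_branching(rates: Dict[str, float]) -> Tuple[float, Dict[str, float]]:
    """Return (H(S), {completing task: branching probability}) for one state.

    `rates` maps each task in execution at S to its completion rate T_i(S).
    """
    total = sum(rates.values())          # T(S)
    if total <= 0.0:
        raise ValueError("a state needs at least one task with a positive rate")
    hold = 1.0 / total                   # H(S) = 1 / T(S)
    branch = {task: r / total for task, r in rates.items()}
    return hold, branch


# Assumed completion rates (tasks per unit time) in state {tau1, tau2, tau4}.
RATES = {"tau1": 0.5, "tau2": 0.25, "tau4": 0.25}
H, B = holding_and_branching(RATES)
```

With these numbers the transition to $\{\tau_1,\tau_2,\tau_5,\tau_6\}$, which is triggered by the completion of $\tau_4$, is taken with probability `B["tau4"]`.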

The path probability to reach state $S$ is:

$$p(S) = \sum_{R \in S^-} p(R)\, b_S(R). \tag{14}$$

The mean delay to the completion of state $S$, weighted by the path probabilities, is:

$$D(S) = H(S) + \sum_{R \in S^-} p(R)\, b_S(R)\, D(R). \tag{15}$$

The initiation time of $\tau_i$ is a weighted sum of the delays of all states $S$ whose completion activates $\tau_i$ in a successor state $R$:

$$\mathrm{Init}_i = \sum_{(S \in R^-) \land (\tau_i \notin S) \land (\tau_i \in R)} p(S)\, D(S). \tag{16}$$

The completion of $\tau_i$ at state $S$ leads to a state $R$ which does not include $\tau_i$, so that its completion time is:

$$\mathrm{Comp}_i = \sum_{(S \in R^-) \land (\tau_i \in S) \land (\tau_i \notin R)} p(S)\, D(S). \tag{17}$$

Unnormalized state probabilities are computed level by level after setting the probability of the initial state $S = \{\tau_0\}$ to one:

$$P(R) = \sum_{S \in R^-} T(S)\, b_R(S)\, P(S). \tag{18}$$

The state probabilities are normalized by

$$\mathrm{NormConstant} = \sum_{\forall S} P(S).$$

The execution time of $\tau_i$ is $\mathrm{Exec}_i = \mathrm{Comp}_i - \mathrm{Init}_i$. Alternatively, it is the completion time $C$ times the sum of the probabilities of the states in which the task is executing:

$$\mathrm{Exec}_i = C \sum_{S \in \mathcal{S}_i} P(S). \tag{19}$$

Each entry in the CTMC is represented as:

$$\left[P(S);\ p(S);\ D(S);\ T_i(S), \tau_i \in |S|;\ T(S);\ H(S);\ b_R(S), R \in S^+\right]$$

Procedure for Performance Analysis of Task System

Input: Set of $I$ tasks, precedence relationships, and service demands at the $N$ devices of a multiprogrammed computer system:

$$X_{i,n}, \quad 1 \le i \le I, \quad 1 \le n \le N.$$

Given $S = \{\tau_0\}$, set $H(S) = 0$, $P(S) = 1$, $p(S) = 1$.

for levels $\ell = 0$ to $L - 1$ do

for states $S \in \mathcal{S}_\ell$ do

Given that the completion rate of $\tau_i \in |S|$ is $T_i(S)$:

$$T(S) = \sum_{\tau_i \in |S|} T_i(S), \mbox{ hence } H(S) = 1/T(S).$$

Determine all successor states $R$ of $S$ and merge them with the set of previously created states at level $\ell+1$:

$$\mathcal{R}_{\ell+1} = \mathcal{R}_{\ell+1} \cup R$$

Obtain the probability of reaching state $R$ via $S$: $p(R) = p(S) \times b_R(S)$.

Completion of $\tau_i$ at $S$ leads to $R$ with probability $b_R(S) = T_i(S)/T(S)$.

Path probability: $p(R) = p(R) + \sum_{S \in R^-} p(S)\, b_R(S)$

$D(R) = D(R) + \sum_{S \in R^-} p_R(S) \times D(S)$

For tasks $\tau_i$ activated at this level, update $\mathrm{Init}_i$ using Eq. (16).

Update the completion time of a task $\tau_i$ completing at this level using Eq. (17).

Obtain the steady-state probability $P(R)$ using Eq. (18).

Update the normalization constant for state probabilities: $\mbox{Norm\_Constant} = \mbox{Norm\_Constant} + P(R)$

end /* all states $S$ in level $\ell$ */

end /* level $L_\ell$ */

Normalize state probabilities:
$P(S) = P(S)/\mbox{Norm\_Constant}, \ \forall S$.

Given the solution of the computer system model for each state, the state probabilities can be used to determine the mean device utilizations over all states.

Two numerical examples validated by simulation are provided in [59].
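The level-by-level idea of the procedure can be illustrated with a compact sketch for the six-task example. It does not reproduce Eqs. (15) to (18) literally: it relies on the standard identity that the mean makespan of an acyclic chain is the sum over states of the path probability times the mean holding time, and it accumulates the same products over the states containing $\tau_i$ to obtain the mean execution time, which corresponds to Eq. (19) if $P(S)$ is read as the fraction of time spent in $S$. The lower-level model is replaced by a hypothetical single processor-sharing device (`ps_rates`), whereas the cited analysis obtains $T_i(S)$ from a QN model; all names and rates are assumptions.

```python
from collections import defaultdict
from typing import Callable, Dict, FrozenSet, Set, Tuple

PREDS: Dict[int, Set[int]] = {1: set(), 2: set(), 3: {1, 2},
                              4: set(), 5: {4}, 6: {4}}


def active(done: FrozenSet[int]) -> FrozenSet[int]:
    """Tasks in execution once the tasks in `done` have completed."""
    return frozenset(t for t in PREDS if t not in done and PREDS[t] <= done)


def ps_rates(tasks: FrozenSet[int], mu: Dict[int, float]) -> Dict[int, float]:
    """Hypothetical lower-level model: one device shared equally (PS)."""
    return {t: mu[t] / len(tasks) for t in tasks}


def analyze(rates_fn: Callable[[FrozenSet[int]], Dict[int, float]]
            ) -> Tuple[float, Dict[int, float]]:
    """Return (mean makespan, mean execution time per task)."""
    path = {frozenset(): 1.0}            # p(S), keyed by completed tasks
    makespan = 0.0
    exec_time: Dict[int, float] = defaultdict(float)
    while path:                          # one CTMC level per iteration
        nxt: Dict[FrozenSet[int], float] = defaultdict(float)
        for done, p in path.items():
            tasks = active(done)
            if not tasks:                # all tasks done: state {tau_inf}
                continue
            rates = rates_fn(tasks)
            total = sum(rates.values())  # T(S)
            hold = 1.0 / total           # H(S)
            makespan += p * hold
            for t in tasks:
                exec_time[t] += p * hold
                nxt[done | {t}] += p * rates[t] / total   # p(S) * b_R(S)
        path = dict(nxt)
    return makespan, dict(exec_time)


MU = {t: 1.0 for t in PREDS}             # assumed unit service rates
MAKESPAN, EXEC = analyze(lambda tasks: ps_rates(tasks, MU))
```

Only two adjacent levels are held in memory at a time, in the spirit of building and solving the CTMC one set of task completions at a time. With unit rates on a single shared device the total completion rate is one in every state, so the makespan equals the number of real tasks, which serves as a sanity check.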

4 Simulation at Higher and Analysis at Lower Level

Hierarchical simulation is a more flexible method than building and solving a higher level CTMC for the analysis of task system performance. It is, however, computationally more expensive, since with the batch method the simulation has to be repeated to obtain confidence intervals at an acceptably high confidence level (Welch 1983 [69]).

The method is specified in the context of the performance analysis of a timesharing system with two sets of users generating requests (Sauer 1981 [49]). The analysis is repeated in Thomasian and Gargeya 1984 [57].

The first (resp. second) set of users is at $L_1$ (resp. $L_2$) terminals and generates small class $C_1$ (resp. large class $C_2$) requests. The think times at the terminals are exponentially distributed with means $Z_1$ and $Z_2$. The maximum MPLs for processing $C_1$ and $C_2$ jobs are $M_1$ and $M_2$. The parameter settings used in the experiments are based on Table 2 in Sauer 1981 [49], which is repeated in Table 2.

| Case | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 |
|---|---|---|---|---|---|---|---|---|---|
| $L_1$ | 20 | 20 | 20 | 30 | 30 | 30 | 40 | 40 | 40 |
| $L_2$ | 2 | 2 | 2 | 3 | 3 | 3 | 4 | 4 | 4 |
| $K_1$ | 4 | 3 | 1 | 7 | 5 | 2 | 14 | 9 | 5 |
| $K_2$ | 2 | 1 | 1 | 2 | 1 | 1 | 4 | 3 | 1 |

Table 2: The nine cases considered in this study.

Requests are processed at the CPU with the PS discipline and access four FCFS disks with exponential service times and uniform access probabilities. Think times and device service times are given in milliseconds as:

$$Z_1 = 5{,}000, \quad X^1_{CPU} = 100, \quad X^1_{Disk_i} = 87.5, \ 1 \le i \le 4,$$
$$Z_2 = 100{,}000, \quad X^2_{CPU} = 2{,}000, \quad X^2_{Disk_i} = 175, \ 1 \le i \le 4.$$

The closed QN, including the terminals, would be product-form if $M_j \geq L_j, j=1,2$, so that there would be no blocking due to MPL constraints. A hierarchical solution method is required since realistically $M_j < L_j$, which implies an extra delay in the memory queue.

The computer system is substituted with an FESC with the throughput characteristic for two classes:

$T_j(k_1,k_2), \quad 0 \leq k_j \leq K_j, \quad j=1,2$

There are $\prod_{j=1}^{2}(N_j+1)$ states in the two-dimensional CTMC, since $0 \leq k_j \leq N_j, j=1,2$. At state $(k_1,k_2)$ class $C_j$ job requests are issued at rate

$\Lambda_j = (L_j - k_j)/Z_j, \quad j=1,2$

and the throughputs obtained by solving the QN of the computer system are:

$T_j(k_1,k_2), \quad 0 \leq k_j \leq K_j, \quad j=1,2.$
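
One way to obtain the throughput table $T_j(k_1,k_2)$ is exact multiclass Mean Value Analysis (MVA). The Python sketch below is illustrative and is not taken from the paper. It assumes that the lower-level model consists of the CPU and the four disks treated as single-server queueing stations of a product-form QN, with the per-class service demands listed above; the function name `mva_throughputs` is hypothetical.

```python
from itertools import product

def mva_throughputs(demands, K):
    """Exact multiclass MVA for a closed QN of single-server queueing stations.

    demands[j][d]: service demand of class j at device d (ms).
    K[j]: maximum number of class-j jobs (maximum MPL).
    Returns {(k_1, ..., k_J): [T_1, ..., T_J]} with throughputs in jobs per ms.
    """
    J, D = len(demands), len(demands[0])
    empty = tuple([0] * J)
    queue = {empty: [0.0] * D}   # mean queue length per device
    thru = {empty: [0.0] * J}
    # product() enumerates populations in lexicographic order, so every
    # population k - e_j has already been solved when k is reached.
    for k in product(*(range(Kj + 1) for Kj in K)):
        if sum(k) == 0:
            continue
        T = [0.0] * J
        resid = [[0.0] * D for _ in range(J)]
        for j in range(J):
            if k[j] == 0:
                continue
            prev = queue[k[:j] + (k[j] - 1,) + k[j + 1:]]
            # Arrival theorem: an arriving job sees the network with one
            # fewer job of its own class.
            resid[j] = [demands[j][d] * (1.0 + prev[d]) for d in range(D)]
            T[j] = k[j] / sum(resid[j])
        queue[k] = [sum(T[j] * resid[j][d] for j in range(J)) for d in range(D)]
        thru[k] = T
    return thru

# Service demands from the text: CPU followed by four disks, per class.
demands = [[100.0, 87.5, 87.5, 87.5, 87.5],
           [2000.0, 175.0, 175.0, 175.0, 175.0]]
table = mva_throughputs(demands, K=(4, 2))   # Case 1 of Table 2
```

With a single job in the system there is no queueing, so `table[(1, 0)][0]` equals the reciprocal of the total class $C_1$ demand, $1/450$ jobs per ms.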

The CTMC can also be built for an infinite source model, i.e., by specifying the fraction of each class in the Poisson arrival stream. The number of states in the CTMC should be set sufficiently large that the probability of blocking due to finite capacity is negligibly small for the given arrival rate. The set of linear equations for the CTMC's steady-state probabilities can be solved using the Gauss-Seidel iterative method, Stewart 2009 [52].
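
The finite-source CTMC described above and its Gauss-Seidel solution can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the state $(k_1,k_2)$ counts class $C_j$ requests at the computer system (active or queued), the active mix is capped at $(K_1,K_2)$, and `T(j, active)` is a hypothetical callable returning the class-$j$ throughput. The function names are also hypothetical.

```python
from itertools import product

def timesharing_ctmc(L, Z, K, T):
    """Transition rates of the two-class finite-source CTMC.

    Returns rates[state][next_state] = rate, with state = (k1, k2).
    """
    rates = {}
    for k in product(range(L[0] + 1), range(L[1] + 1)):
        active = (min(k[0], K[0]), min(k[1], K[1]))   # MPL constraint
        out = {}
        for j in range(2):
            step = (1, 0) if j == 0 else (0, 1)
            if k[j] < L[j]:   # a thinking user issues a request
                out[(k[0] + step[0], k[1] + step[1])] = (L[j] - k[j]) / Z[j]
            if k[j] > 0:      # a class-j request completes
                out[(k[0] - step[0], k[1] - step[1])] = T(j, active)
        rates[k] = out
    return rates

def gauss_seidel_ctmc(rates, tol=1e-12, max_sweeps=100_000):
    """Steady-state probabilities of an irreducible CTMC by Gauss-Seidel.

    Every state must appear as a key of `rates` and have positive exit rate.
    """
    states = list(rates)
    out_rate = {i: sum(rates[i].values()) for i in states}
    inflow = {i: [] for i in states}
    for i in states:
        for j, q in rates[i].items():
            inflow[j].append((i, q))
    pi = {i: 1.0 / len(states) for i in states}
    for _ in range(max_sweeps):
        delta = 0.0
        for i in states:
            # Balance equation for state i, using the newest available values.
            new = sum(pi[j] * q for j, q in inflow[i]) / out_rate[i]
            delta = max(delta, abs(new - pi[i]))
            pi[i] = new
        total = sum(pi.values())
        for i in states:
            pi[i] /= total   # renormalize after each sweep
        if delta < tol:
            break
    return pi

# Two-state check: rate 2 from state 0 to 1, rate 3 back.
pi2 = gauss_seidel_ctmc({0: {1: 2.0}, 1: {0: 3.0}})
```

For the two-state check, the balance equation $2\pi_0 = 3\pi_1$ gives $\pi_0 = 0.6$ and $\pi_1 = 0.4$.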

Procedure: Hierarchical Simulation of Timesharing System

1: Input simulation parameters.

Number of classes of timesharing users: $J=2$.
Number of users or terminals in each class: $L_j, j=1,\ldots,J$.
Maximum MPL in each class: $K_j, j=1,\ldots,J$.
Number of active terminals/users in the two classes: $N_j = L_j, j=1,\ldots,J$.
Think times and job service demands at the CPU and four disks.
Percent Confidence Interval - CI (say $\pm 5\%$) of job response times desired about the mean at a given Confidence Level - CL (say 95%) using the batch means method [69].
Settings such as $NCompTarget_j = 10{,}000, j=1,2$ and $NumBatches = 10$ need to be adjusted to meet the CL and CI targets.

2: Solve lower level model.

Obtain throughputs by solving closed QN for all request compositions in two classes.

$T_j(k_1,k_2), \quad 0 \leq k_j \leq K_j, \quad j=1,2.$

3: Higher Level Discrete-Event Simulation.

(a) Initialization.

$Clock = 0$ /* Simulation clock */
$BatchCtr = 0$ /* Batch means method */
for class $C_j$, $j=1,2$ do
  $Count_j = 0$ /* # of arrivals per class */
  $N_j = L_j$ /* initialize # of thinking users */
  Sample $IntArvlTime$ from Exp($N_j/Z_j$)
  /* Set arrival time */
  $ArvlTime_j = Clock + IntArvlTime$
  /* Set departure times for all requests */
  $DepartTime_j^k = \infty, \quad 1 \leq k \leq K_j$
  /* Sum of class $C_j$ response times */
  $SumResp_j = 0$
  /* # of completed jobs in $C_j$ */
  $NComp_j = 0$
end do /* do j */

(b) Scheduling the next event.

  1. Determine the most imminent event from the ($N_1 + N_2 + J \times K$) possibilities:
     $Next = \min(ArvlTime_j, DepartTime_j^k), \quad 1 \leq k \leq K_j, \quad j=1,2.$

     If the event is a departure, record the request's class and id: $jn = j$ and $kn = k$;
     otherwise, if it is an arrival, set the arriving job class $jn$ based on the arrival stream.

  2. Advance simulation time: $Clock = Next$.

  3. If the event is an arrival goto (c),
     else goto (d).

(c) Arrival of a class $C_{jn}$ request.

  1. $Count_{jn} = Count_{jn} + 1$;

  2. /* one request generated at a time */ Sample $IntArvlT_{jn}$ for class $C_{jn}$ from Exp($N_{jn}/Z_{jn}$).

  3. /* next arrival time */ $ArvlTime_{jn} = Clock + IntArvlT_{jn}$.

  4. If $k_{jn} \geq K_{jn}$ enqueue:
     $Queue[Count_{jn}] = ArvlTime_{jn}$;
     else goto (e).

(d) Task completed at $DepartTime_{jn}^{kn}$.

  • $NComp_{jn}$++ /* increment completions */

  • $SumResp_{jn} = SumResp_{jn} + (Clock - Arrival_{kn,jn})$

  • if $[(NComp_1 \geq NCompTarget_1) \land (NComp_2 \geq NCompTarget_2)]$ goto (g) /* run complete */

  • $k_{jn} = k_{jn} - 1$ /* degree of concurrency */

  • $N_{jn} = N_{jn} + 1$ /* active users */

  • Obtain $IntArvlT$ from Exp($N_{jn}/Z_{jn}$).

  • $ArvlTime_{jn} = Clock + IntArvlT$.

  • If $Queue[jn]$ is nonempty goto (e), else goto (b).

(e) Activate an arriving or waiting task.
Activate $C_{jn}$ task, $k_{jn}$++
$Arrival_{kn,jn} = TaskArvl[Count_{jn}]$ /* Set arrival time */
Sample execution time $E_{jn}^{kn}$ from Exp($T_{jn}(k_1,k_2)$).
$DepartTime_{jn}^{kn} = Clock + E_{jn}^{kn}$
goto (b)

(g) $BatchCtr$++;
$Resp_j^{BatchCtr} = \frac{SumResp_j}{NComp_j}, \quad j=1,2$
Exit if $BatchCtr \geq NumBatches$,
else goto (a) to reinitialize the per-class statistics for the next batch (without resetting $BatchCtr$)
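
The higher-level simulation can be sketched compactly in Python. The sketch below is not a transcription of the procedure above: it swaps in a state-dependent (Gillespie-style) event selection, in which class $C_j$ departures occur at rate $T_j(k_1,k_2)$ for the current active mix, and it assumes that the oldest active request of the class departs first. `T(j, k)` is a hypothetical callable returning the lower-level throughput, and the function name is also hypothetical. It simulates a single batch.

```python
import random
from collections import deque

def simulate_timesharing(L, Z, K, T, n_target, seed=1):
    """Simulate one batch and return the mean response time per class.

    L[j]: terminals, Z[j]: mean think time, K[j]: maximum MPL,
    T(j, k): class-j throughput when the active mix is the tuple k.
    """
    rng = random.Random(seed)
    J = len(L)
    thinking = list(L)                       # N_j, users currently thinking
    active = [deque() for _ in range(J)]     # arrival times of active requests
    waiting = [deque() for _ in range(J)]    # memory queue (MPL constraint)
    sum_resp = [0.0] * J
    n_comp = [0] * J
    clock = 0.0
    while min(n_comp) < n_target:            # both classes must reach target
        k = tuple(len(a) for a in active)
        arrival = [thinking[j] / Z[j] for j in range(J)]
        depart = [T(j, k) if k[j] > 0 else 0.0 for j in range(J)]
        rates = arrival + depart
        clock += rng.expovariate(sum(rates))  # time to the next event
        event = rng.choices(range(2 * J), weights=rates)[0]
        j = event % J
        if event < J:                         # arrival of a class-j request
            thinking[j] -= 1
            target = active[j] if len(active[j]) < K[j] else waiting[j]
            target.append(clock)
        else:                                 # completion of a class-j request
            sum_resp[j] += clock - active[j].popleft()
            n_comp[j] += 1
            thinking[j] += 1
            if waiting[j]:                    # admit a queued request
                active[j].append(waiting[j].popleft())
    return [sum_resp[j] / n_comp[j] for j in range(J)]

# Sanity check: one user, one class, constant throughput of 2 jobs per unit
# time, so the response time is exponential with mean 0.5.
resp = simulate_timesharing([1], [1.0], [1], lambda j, k: 2.0, n_target=20000)
```

Because the response time includes the time spent in the memory queue, the arrival timestamp is carried along when a waiting request is activated.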

The batch means method was used to obtain the mean response times and their CIs at the given CL.
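
The CI computation from the per-batch means $Resp_j^{b}$ can be sketched as follows. This is an illustrative Python fragment with hypothetical names; the default critical value 2.262 is the two-sided 95% Student-t quantile for 9 degrees of freedom, which matches $NumBatches = 10$ and must be changed for other settings.

```python
from math import sqrt
from statistics import mean, stdev

def batch_means_ci(batch_means, t_crit=2.262):
    """Return (grand mean, CI half-width, relative half-width).

    batch_means: one mean response time per batch (assumed to be
    approximately independent and normally distributed).
    t_crit: Student-t critical value for len(batch_means) - 1 degrees
    of freedom at the desired confidence level.
    """
    m = mean(batch_means)
    half = t_crit * stdev(batch_means) / sqrt(len(batch_means))
    return m, half, half / m

m, half, rel = batch_means_ci([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
```

A run would be extended, e.g. by increasing $NCompTarget_j$, until the relative half-width falls below the desired percentage such as 0.05.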


5 Effect of Transition Probabilities on Task Completion Times

We elaborate on the observation in Section 7.2 of Thomasian 2014 [64] that the distribution of the number of task cycles between CPU and disk processing affects the completion time of a task system with parallelism.

In Section 3.3.3 of Kobayashi 1978 [30] it is stated that routing in QNs need not be governed by a homogeneous first-order Markov chain and that it is the mean number of visits to QN nodes that determines the usual performance metrics; the routing does, however, affect a task's sojourn time distribution.

The mean completion time of a 2-way F/J task system in a closed QN model is sensitive to the distribution of the number of cycles. A cyclic server model with two devices, a processor and a disk, is postulated. The disk is accessed following CPU processing and, according to the cyclic server model, after disk processing tasks may require additional CPU time or leave the system. Note the difference with the CSM, where jobs complete their processing at the CPU.

Consider the concurrent processing of two tasks whose number of cycles follows a geometric distribution:

$p_n = (1-p)p^{n-1}, \quad n \geq 1,$ with a mean $\bar{n} = 1/(1-p).$

The number of jobs completed in a time interval approaches a Poisson process when $p \rightarrow 1$, signifying a large number of cycles. Poisson inter-departure times imply exponentially distributed residence times. It can be shown that the geometrically distributed sum of exponential random variables is also exponentially distributed. The argument based on thinning point processes in Salza and Lavenberg 1981 [46] does not require the per cycle residence times to be exponentially distributed or even i.i.d.
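
The statement about geometric sums can be checked with Laplace-Stieltjes transforms. The derivation below is a sketch under the added assumption that the per-cycle residence times are i.i.d. exponential with rate $\mu$ and independent of the number of cycles $n$; the symbols $\mu$ and $\mathcal{L}_R(s)$ are introduced here for illustration only.

```latex
\begin{align*}
\mathcal{L}_R(s)
  &= \sum_{n \geq 1} (1-p)\,p^{\,n-1} \left( \frac{\mu}{s+\mu} \right)^{n}
   = \frac{(1-p)\,\dfrac{\mu}{s+\mu}}{1 - p\,\dfrac{\mu}{s+\mu}}
   = \frac{(1-p)\,\mu}{s + (1-p)\,\mu}.
\end{align*}
```

This is the transform of an exponential distribution with rate $(1-p)\mu$, i.e., with mean $\bar{n}/\mu$.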

Consider the processing of two possibly heterogeneous tasks $\tau_1$ and $\tau_2$, which are processed concurrently at a computer system in state $S_{1,2}(\tau_1,\tau_2)$. Given that $T_i(S_{1,2})$ is the completion rate of $\tau_i, i=1,2$, the mean holding time for state $S_{1,2}$ is:

$H(S_{1,2}) = [T_1(S_{1,2}) + T_2(S_{1,2})]^{-1}.$

The completion of $\tau_i$ in $S_{1,2}$ leads to $S_j = \{\tau_{j \neq i}\}$ and the completion of $\tau_j$ leads to the completion of the 2-way F/J task system.

Assuming that the transition times are exponentially distributed, the probability that $\tau_i$ completes first is:

$p_i = T_i(S_{1,2}) H(S_{1,2}), \quad i=1,2$

The mean time to complete the two tasks is then:

$C_{geo} = H(S_{1,2}) + p_1 H(S_2) + p_2 H(S_1).$

Consider two tasks whose number of cycles is given as follows: Case 1: geometrically distributed with mean $\bar{n} = 5$; Case 2: fixed with $n = 5$. The service demands per cycle are $\bar{x}_c = \bar{x}_d = \bar{x} = 1$ at the CPU and the disk in both cases.

Given balanced service demands, the task residence time per cycle for $N = K = 2$ is:

$r(K) = (N + K - 1)\bar{x} = 3$

according to Lazowska et al. [34] and Thomasian 2023 [65]. Due to the memoryless property of the geometric distribution, the completion time for the two tasks in the two cases is:

$C_{geo} = \bar{n} \times r(2) + \bar{n} \times r(1) = 5 \times 3 + 5 \times 2 = 25.$
$C_{fixed} = n \times 3 = 15.$

We next consider the parallel processing of two tasks $\tau_1$ and $\tau_2$ whose numbers of cycles are geometrically distributed with different means, $\bar{n}_1 = 5$ and $\bar{n}_2 = 10$. Setting the time per visit at the processor $\bar{x}_c = 1$ and at the disk $\bar{x}_d = 1$ yields the service demands $X_C^1 = X_D^1 = 5$ for $\tau_1$ and $X_C^2 = X_D^2 = 10$ for $\tau_2$. Task throughputs when processed together can be obtained by solving the corresponding QN model:

$T_1(S_{1,2}) = 1/15$ and $T_2(S_{1,2}) = 1/30.$

The mean holding time is then

$H(S_{1,2}) = [T_1(S_{1,2}) + T_2(S_{1,2})]^{-1} = 10.$

The probability that $\tau_1$ or $\tau_2$ finishes first is given as:

$\begin{cases} p_1 = T_1(S_{1,2}) H(S_{1,2}) = (1/15) \times 10 = 2/3 \\ p_2 = T_2(S_{1,2}) H(S_{1,2}) = (1/30) \times 10 = 1/3. \end{cases}$

The mean residence time of each task when processed alone is:

$R(S_1) = X_C^1 + X_D^1 = 10$ and $R(S_2) = X_C^2 + X_D^2 = 20.$

It follows that the completion time for the geometric distribution is:

$C_{geo} = 10 + \frac{2}{3}(20) + \frac{1}{3}(10) \approx 26.67.$
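
The arithmetic of this example can be reproduced with exact fractions. The following Python fragment is a small illustrative check; the function name is hypothetical and the inputs are the throughputs and stand-alone residence times given above.

```python
from fractions import Fraction

def fj_completion_time(T1, T2, R1, R2):
    """Mean completion time of a 2-way F/J pair with exponential transitions.

    T1, T2: completion rates of the two tasks when processed together.
    R1, R2: mean residence times of each task when processed alone.
    """
    H = 1 / (T1 + T2)          # mean holding time of the joint state
    p1, p2 = T1 * H, T2 * H    # probability that task 1 (2) finishes first
    # If task 1 finishes first, task 2 remains and needs R2 on average.
    return H + p1 * R2 + p2 * R1

c_geo = fj_completion_time(Fraction(1, 15), Fraction(1, 30), 10, 20)
```

The result is exactly $80/3$, i.e., about 26.67 time units.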

When the number of cycles is fixed at $n_1 = 5$ and $n_2 = 10$, the two tasks share the CPU and disk for $\min(n_1,n_2) = 5$ cycles. The mean residence time per cycle for the balanced QN with $K = 2$ tasks and $N = 2$ devices, based on a balanced F/J queueing system, is as follows [34], also see Thomasian 2023 [65]:

$r(K) = (K + N - 1)\bar{x} = 3.$

After $\tau_1$ completes, $\tau_2$ has five remaining cycles, where each cycle takes $r(1) = 2$ time units. The completion time is:

$C_{fixed} = H(S_{1,2}) + H(S_2) = n_1 \times 3 + (n_2 - n_1) \times 2 = 25.$
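
The fixed-cycle case can be written as a short Python check. This is a sketch with hypothetical function names that assumes a balanced QN, so that the per-cycle residence time is $r(K) = (K + N - 1)\bar{x}$ as stated above.

```python
def balanced_residence_time(n_tasks, n_devices=2, x_bar=1):
    """Per-cycle residence time r(K) = (K + N - 1) * x_bar in a balanced QN."""
    return (n_tasks + n_devices - 1) * x_bar

def fixed_cycles_completion(n1, n2, n_devices=2, x_bar=1):
    """Completion time of two tasks with fixed numbers of cycles n1 and n2."""
    shared = min(n1, n2)   # cycles during which both tasks are active
    alone = abs(n2 - n1)   # remaining cycles of the longer task
    return (shared * balanced_residence_time(2, n_devices, x_bar)
            + alone * balanced_residence_time(1, n_devices, x_bar))

c_fixed = fixed_cycles_completion(5, 10)
```

For $n_1 = n_2 = 5$ the same function returns 15, in agreement with the homogeneous fixed-cycle case considered earlier in this section.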

In counting the number of remaining cycles we have in effect used the hybrid simulation method. The fact that the completion time of F/J and parallel task systems in general is sensitive to the distribution of the number of cycles raises the need for alternative methods to estimate the completion time of task systems.

6 Hybrid Simulation Method

Hybrid simulation is a hierarchical simulation method proposed by Schwetman 1978 [51], which was applied to the CSM QN model with a single job class. Tasks are specified by the number of required cycles and the loadings per cycle. The degree of concurrency is required for the analysis.

Tasks are specified by service demands per cycle and the initial number of required cycles. The number of remaining cycles for the $i^{th}$ task $\tau_i$ ($CR[i]$) is updated based on the elapsed time $X = Next - Clock$, after which $Clock = Next$, where

$Next = \min\{ArvlTime, \; CT[i] \times CR[i], \; 1 \leq i \leq n\}$

$CT[n]$ is the mean cycle time computed by solving the underlying QN model and $CR[i] = CR[i] - X/CT[i]$. As txns arrive and depart, the degree of txn concurrency is updated as $n = n \pm 1$ and $CT[n]$ for active tasks is recomputed.
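
The update of the remaining cycles can be illustrated with a minimal Python sketch. It rests on simplifying assumptions that are not in the original description: a single job class, all tasks present at time zero, no further arrivals, and $CT[i] \times CR[i]$ interpreted as the time remaining until task $i$ completes at the current degree of concurrency. The names `hybrid_makespan` and `cycle_time` are hypothetical.

```python
def hybrid_makespan(cycles, cycle_time):
    """Time to complete all tasks under hybrid simulation.

    cycles: initial number of required cycles per task (CR[i]).
    cycle_time(n): mean cycle time CT[n] with n concurrent tasks,
    obtained from the lower-level QN model.
    """
    remaining = list(cycles)
    clock = 0.0
    while remaining:
        ct = cycle_time(len(remaining))
        x = ct * min(remaining)        # elapsed time to the next completion
        clock += x
        # Every active task advances by x / CT cycles.
        remaining = [cr - x / ct for cr in remaining]
        # Drop completed tasks (tolerance guards against rounding).
        remaining = [cr for cr in remaining if cr > 1e-9]
    return clock

# Balanced two-device example of Section 5: CT[2] = 3 and CT[1] = 2.
makespan = hybrid_makespan([5, 10], lambda n: (2 + n - 1) * 1.0)
```

With these inputs the sketch reproduces $C_{fixed} = 25$ from Section 5: 15 time units with both tasks active, followed by 10 time units for the five remaining cycles of the longer task.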

The hybrid simulation method was extended to multiple job classes in Thomasian 1987 [60], which deals with dynamic load balancing in a distributed system. Simply stated, it is better to process I/O bound jobs together with CPU bound jobs, and it is not just the remaining number of job cycles that matters.

The generalization of hybrid simulation proposed in Thomasian and Bay 1983 [56] does not require cyclic processing and can be applied to multiple job classes. Rather than quantizing task processing times we may modify service demands according to Eq. (20). Given that $\tau_i$ starts its execution at device $d$ with loading $X_d^i$ and its mean residence time obtained by solving the closed QN model for a given task composition is $R_i$, the residual job service demand after elapsed time $t$ is:

$\boxed{X_d^i = X_d^i (1 - t/R_i).} \quad (20)$

Task systems as in [59] can be dealt with by hybrid simulation as follows. Referring back to the example in Section 3, consider the processing of tasks $\tau_1, \tau_2, \tau_4$, which start processing concurrently. The service demands at the CPU and Disks 1 and 2 are $(420, 400, 400)$ for $\tau_1, \tau_2, \tau_5, \tau_6$ and $(620, 600, 600)$ for $\tau_3, \tau_4$, and the number of cycles made by tasks is fixed. Analysis of the QN model processing $\tau_1$, $\tau_2$, $\tau_4$ concurrently yields the mean residence times $R_1 = R_2 < R_4$. When $\tau_1$ and $\tau_2$ complete at time $R_2$, $\tau_4$'s residual loadings are obtained by multiplying its loadings by $(1 - R_2/R_4)$.
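The step above can be sketched numerically. The code below is an illustration under stated assumptions, not the procedure of [59]: it solves the closed QN with exact multiclass mean value analysis, assuming the CPU and the two disks are single-server queueing stations satisfying product-form conditions, with $\tau_1, \tau_2$ forming one class of two jobs and $\tau_4$ a second class of one job. The function name `mva` and the variable names are ours.

```python
from itertools import product


def mva(demands, population):
    """Exact multiclass MVA for a closed QN of single-server stations.

    demands[c][k]: total service demand of a class-c job at device k.
    Returns the mean residence time per class at the given population.
    """
    num_classes, num_devices = len(demands), len(demands[0])
    queue = {tuple([0] * num_classes): [0.0] * num_devices}
    residence = {}
    # Visit population vectors in order of increasing total population.
    vectors = sorted(product(*(range(p + 1) for p in population)), key=sum)
    for n in vectors[1:]:
        q = [0.0] * num_devices
        r_n = [0.0] * num_classes
        for c in range(num_classes):
            if n[c] == 0:
                continue
            less = tuple(n[j] - (j == c) for j in range(num_classes))
            # Arrival theorem: a class-c job sees the queue lengths of the
            # network with one class-c job removed.
            r_ck = [demands[c][k] * (1.0 + queue[less][k])
                    for k in range(num_devices)]
            r_n[c] = sum(r_ck)
            x_c = n[c] / r_n[c]               # class-c throughput
            for k in range(num_devices):
                q[k] += x_c * r_ck[k]
        queue[n] = q
        residence[n] = r_n
    return residence[tuple(population)]


demands = [(420.0, 400.0, 400.0),   # tau_1, tau_2: CPU, Disk 1, Disk 2
           (620.0, 600.0, 600.0)]   # tau_4
r12, r4 = mva(demands, [2, 1])      # R_1 = R_2 and R_4
factor = 1.0 - r12 / r4             # Eq. (20) with t = R_2
residual_tau4 = [x * factor for x in demands[1]]
```

After $\tau_1$ and $\tau_2$ complete, `residual_tau4` would be carried into the next QN solution together with the loadings of the tasks that start next.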

The modified hybrid simulation method is easier to implement and less costly than the method described in Section 3. In fact this method may be more accurate than the method based on decomposition that postulates exponentially distributed completion times. This is especially so when the number of job cycles is not geometrically distributed.

With $n$ identical tasks whose number of cycles is uniformly distributed over $(0, x)$, the expected minimum of the number of cycles to completion is $r_{min} = x/(n+1)$.
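This value follows from the order statistics of the uniform distribution. A short derivation, assuming the $n$ cycle counts are i.i.d. and treated as continuous on $(0, x)$:

```latex
P(r_{min} > r) = \left(1 - \frac{r}{x}\right)^{n}, \quad 0 \leq r \leq x,
\qquad
E[r_{min}] = \int_{0}^{x} \left(1 - \frac{r}{x}\right)^{n} dr = \frac{x}{n+1}.
```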

Batch jobs undergo different processing phases with different loadings from phase to phase, e.g., (i) loading and preprocessing of data, (ii) computation, (iii) visualization processing. Rather than dealing with a single set of task loadings we assume that the loadings associated with the three phases, designated as $\tau_1 \rightarrow \tau_2 \rightarrow \tau_3$, are known. The residence time of the batch job is the sum of the residence times of the three tasks. The discussion is simplified by assuming that the computer also processes transactions whose intensity remains fixed during the execution of the batch job. Programs should be instrumented to signal changes of phase for measurement purposes, i.e., to report per phase loadings based on variation in resource consumption. Phases accessing bottleneck resources will have increased residence times while other phases will have shorter residence times. Experiments are required to determine whether considering phases yields significant differences. There is also the possibility of parallel processing via multitasking.

7 Related Work

Given that many computations exhibit parallelism, several parallel models of computation have been developed since the early days of computing. The early task system model based on a dag was described in Coffman and Denning 1973 [15].

More complex relations in parallel processing can be represented by Petri nets, Peterson 1981 [40]. The so-called UCLA graph model proposed in Martin 1966 [35] can be transformed into a Petri net, but the proof of the equivalence of the two models by Kim Gostelow is flawed according to the book.

A major extension to Petri nets was the Timed Petri Net - TPN model proposed by several researchers including Molloy 1982 [39], which led to workshops on TPNs starting in 1985. Generalized Stochastic Petri Nets - GSPNs allow immediate and exponentially distributed transitions, Ajmone-Marsan et al. 1984 [3]. GSPNs are translatable to CTMCs, which can then be solved using the usual methods, Stewart 2009 [52]. Further examples of this modeling approach are given in Ajmone-Marsan et al. 1995 [4].

Research Queueing Package - RESQ is a software tool for constructing and solving QN models. RESQ under a different name was imported to IBM Research from Univ. of Texas at Austin Sauer and MacNair 1983 [50]. It is also described in Sauer et al. 1977 [48]. RESQ provides a high level language for describing models also in hierarchical fashion.

Job execution in closed QN models allowing parallelism is considered in Heidelberger and Trivedi 1982 [22]. Jobs spawn two or more tasks at some point during execution, which execute independently of one another and do not require synchronization. An approximate solution method is developed and results of the approximation are compared to those of simulations. Bounds on the performance improvement due to overlap are derived.

The same authors in 1983 [23] consider parallel tasks, which wait at the end of their execution for all of their siblings to finish execution. Two approximate solution methods are developed and compared with simulations. The approximations are computationally efficient and highly accurate. A single instance of a task system executing in a multiprogrammed computer is considered in Thomasian and Bay 1986 [59].

Concurrency in parallel processing systems is the topic of Kung 1984 [32]. Jobs are modeled as dags whose nodes represent separate tasks. Four variations are considered: (1) jobs available at time zero or Poisson arrivals; (2) dags: fixed or random; (3) task service times: constant or exponentially distributed; (4) fixed or infinite number of jobs. Algorithm 1 minimizes the expected time to complete all jobs, while Algorithm 2 maximizes processor utilization.

An algorithm to search for module assignments and replications to reduce task response times is explored by Chu and Leung [13]. The objective function is the sum of task response time and a delay penalty for violations of thread response time requirements. The PS queueing discipline is used in this study, so that with Poisson arrivals the output stream is Poisson and the nodes can be analyzed separately when there are no synchronization delays.

Chu et al. [14] propose two submodels for estimating task response times in distributed systems with resource contention. The first submodel is an extended QN to obtain module response times, which is solved by a decomposition technique to reduce computational cost by 2-3 orders of magnitude with respect to a direct approach.

The second submodel is a weighted control-flow graph model from which task response time can be obtained by aggregating module response times in accordance with precedence relationships. Task response times estimated by the analytic model compare closely with simulation results. The model can be used to study the tradeoffs among module assignments, scheduling policies, interprocessor communications, and resource contentions in distributed processing systems.

Response time is affected by interprocessor communications, precedence relationships, module assignments, hardware resource and data resource contention, and processor scheduling policies. A task response time model that considers all of these factors is proposed. A Petri net is used to represent resource contention, and the task control flow graph represents precedence relationships. A QN with resource contention is used to estimate the response time of each module. Module response time consists of delays at the processors and resource queues and is estimated by approximating the extended QN as independent finite capacity QNs. The module response time is mapped onto a control flow graph, and task response time is obtained by aggregating the module response times in accordance with their precedence relationship in the control flow graph. The task response times derived from the analytical model were validated against simulation results.

A modeling methodology for evaluating the execution of parallel programs containing looping constructs by estimating the average execution time of such a program in a distributed, multicomputer environment is proposed in Kapelnikov et al. [26, 27]. A combination of QN analysis and graph models of program behavior is considered in these studies. Complex programs are first decomposed into program segments, which are analyzed independently. Combined results produce an approximate solution for the whole program.

Task graphs represent parallel programs with dags, which are specified as a 4-tuple { T,P,A,E } by Menasce and Barroso 1992 [37].

  • $\mathbf{T} = \{\tau_1, \tau_2, \dots\}$ is the set of tasks of a parallel program.

  • $\mathbf{P}$ is the precedence relationship among tasks.
    $\tau_j$ can be activated when all of its predecessors $\tau_i$ in $\mathbf{P}$ complete their execution.

  • $\mathbf{A}$ is an allocation function for tasks to processors, $\tau_i \rightarrow P_j$.

  • $\mathbf{E}$ determines the execution time based on processor speeds. Similarly to [59] tasks are specified by their service demands on a computer system. The execution time depends on the task mix and hence can be determined by solving the QN.
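The 4-tuple can be sketched as a small data structure. The class and field names below are illustrative and are not taken from Menasce and Barroso [37]; in particular `exec_time` is a crude hypothetical stand-in for the QN solution, which slows a task down in proportion to the number of active tasks.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Set, Tuple


@dataclass
class TaskGraph:
    tasks: Set[str]                          # T: set of tasks
    precedence: Set[Tuple[str, str]]         # P: (tau_i, tau_j), i before j
    allocation: Dict[str, str]               # A: task -> processor
    exec_time: Callable[[str, FrozenSet[str]], float]  # E: (task, mix) -> time

    def predecessors(self, task):
        return {i for (i, j) in self.precedence if j == task}

    def ready(self, completed, running=frozenset()):
        """Tasks that can be activated: all predecessors have completed."""
        done = set(completed)
        return {t for t in self.tasks
                if t not in done and t not in running
                and self.predecessors(t) <= done}


# Hypothetical fork-join dag: t1 -> {t2, t3} -> t4.
base = {"t1": 2.0, "t2": 3.0, "t3": 5.0, "t4": 1.0}
graph = TaskGraph(
    tasks=set(base),
    precedence={("t1", "t2"), ("t1", "t3"), ("t2", "t4"), ("t3", "t4")},
    allocation={"t1": "P1", "t2": "P1", "t3": "P2", "t4": "P1"},
    # Assumed form of E, not the QN-based one: time grows with the task mix.
    exec_time=lambda task, mix: base[task] * max(1, len(mix)),
)
```

A scheduler or CTMC generator would repeatedly call `ready` after each task completion to obtain the next set of activatable tasks.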

A CTMC based technique to obtain the execution time of a task graph in a multiprogrammed computer system, based on Thomasian and Bay 1986 [59], is reported by Menasce et al. in [38]. The study shows that the static processor assignment policy called Largest Task First Minimum Finish Time - LTFMFT is very sensitive to the degree of heterogeneity of the architecture, and that it outperforms all other policies analyzed.

Three dynamic assignment disciplines are compared and it is shown that in heterogeneous environments, the disciplines that perform better are those that consider the structure of the task graph and not only the service demands of the individual tasks. The performance of heterogeneous architectures is compared with cost-equivalent homogeneous ones taking into account different scheduling policies. Static and dynamic processor assignment disciplines are compared in terms of performance.

8 Conclusion and Further Work

Hierarchical modeling is a useful tool in developing approximate analyses when the system does not lend itself to a direct solution, or in reducing the cost of analysis or simulation by replacing a detailed model of a computer system by an FESC.

Several instances of hierarchical analysis are discussed in this paper with the goal of reducing the solution cost: an efficient solution of a CTMC for a task system, and a simulation of a timesharing system with two job classes with MPL constraints. In both cases tasks are processed on a multiprogrammed computer system representable as a product-form QN.

When the tasks of each subtask system execute at the devices of independent computer systems, the completion times of the subtask systems may be used to determine the overall completion time. The data transmission delay among nodes is the ratio of message length to data transmission rate, assuming queueing delays are negligible.

In addition to the mean completion time the variance of completion times is of interest. With the assumption that the holding time in each state of the CTMC is exponentially distributed the variance of completion time is the variance of first passage time from the initial to final state in the CTMC. The variance of first passage times in discrete-time Markov chains is derived in Hunter 2006 [24].
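For a CTMC the first two moments of the first passage time can be obtained by solving two linear systems. The sketch below is an illustration for the continuous-time case (Hunter [24] treats discrete-time chains) and uses the standard moment relations for absorption times: with $-T$ the negated generator restricted to the non-final states, the means $m_1$ solve $(-T) m_1 = 1$ and the second moments $m_2$ solve $(-T) m_2 = 2 m_1$. The function names and the small dense solver are ours; every non-final state is assumed to reach the final state.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def passage_time_moments(rates, start, final):
    """rates[i][j]: transition rate from state i to state j (i != j).

    Returns (mean, variance) of the time to first reach `final` from `start`.
    """
    states = [s for s in range(len(rates)) if s != final]
    idx = {s: k for k, s in enumerate(states)}
    neg_t = [[0.0] * len(states) for _ in states]
    for s in states:
        # Diagonal: total rate out of s (including the rate into `final`).
        neg_t[idx[s]][idx[s]] = sum(rates[s][j]
                                    for j in range(len(rates)) if j != s)
        for j in states:
            if j != s:
                neg_t[idx[s]][idx[j]] = -rates[s][j]
    m1 = solve(neg_t, [1.0] * len(states))
    m2 = solve(neg_t, [2.0 * v for v in m1])
    k = idx[start]
    return m1[k], m2[k] - m1[k] ** 2


# Two exponential stages in series with rates 1 and 2 (illustrative values).
mean, var = passage_time_moments(
    [[0.0, 1.0, 0.0], [0.0, 0.0, 2.0], [0.0, 0.0, 0.0]], start=0, final=2)
```

For the series example the result can be checked by hand, since the passage time is the sum of two independent exponentials.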

Rather than considering all tasks in the distributed computer system at once, as done in [59], the mean and variance of the completion time of task subsystems can be determined separately. In a distributed system we can use separate task systems to determine the mean and variance of completion time per system, which can be used to determine the overall completion time.

Rather than the six-task system in Section 3, consider two subtask systems with three tasks each:
$\{\tau_1, \tau_2, \tau_3\}$ and $\{\tau_4, \tau_5, \tau_6\}$.
The two subtask systems are executed independently at two identical computer systems. The mean makespan of the task system can be determined by obtaining the mean and variance of the completion time of each subtask system.
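For two subtask systems this step can be sketched as follows, under the added assumption that the two completion times are independent and approximately normally distributed. The expression used is Clark's classical formula for the mean of the maximum of two normal random variables; the function names are ours and any numeric inputs are placeholders rather than results from this paper.

```python
import math


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_max_two_normals(mu1, sigma1, mu2, sigma2):
    """Mean of max(X1, X2) for independent normal X1 and X2."""
    theta = math.sqrt(sigma1 ** 2 + sigma2 ** 2)
    if theta == 0.0:
        return max(mu1, mu2)      # both completion times are deterministic
    alpha = (mu1 - mu2) / theta
    return (mu1 * normal_cdf(alpha) + mu2 * normal_cdf(-alpha)
            + theta * normal_pdf(alpha))
```

When the two subtask systems are identical, the correction to the common mean is $\sigma/\sqrt{\pi}$, so the makespan estimate grows with the variability of the subsystem completion times.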

Assuming a normal distribution, a formula for the expected value of the maximum of three such random variables is given in Dasgupta 2023 [17]. In the case of nine random variables a two-step computation can be used, e.g., by computing the maximum of $X_{1:3}^{max}, X_{4:6}^{max}, X_{7:9}^{max}$.

According to David and Nagaraja 2003 [18], the expected value of the maximum of $n$ i.i.d. random variables with mean $\mu_X$ and standard deviation $\sigma_X$, e.g., the components of an F/J request, is given as:

$$\overline{X}_n^{max} \approx \mu_X + \sigma_X G(n). \qquad (21)$$
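The factor $G(n)$ depends on the distribution of the components. As an illustration only, the sketch below assumes normally distributed components, for which $G(n)$ is the mean of the maximum of $n$ i.i.d. standard normal variables, and evaluates it by trapezoidal integration of $z\, n\, \phi(z) \Phi(z)^{n-1}$. The function names and integration limits are our choices.

```python
import math


def normal_g(n, lo=-10.0, hi=10.0, steps=20000):
    """G(n) for normal components: E[max of n i.i.d. standard normals]."""
    def integrand(z):
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        # Density of the maximum is n * pdf * cdf**(n - 1).
        return z * n * pdf * cdf ** (n - 1)

    h = (hi - lo) / steps
    total = 0.5 * (integrand(lo) + integrand(hi))
    for k in range(1, steps):
        total += integrand(lo + k * h)
    return total * h


def expected_max(mu, sigma, n):
    """Eq. (21) with G(n) evaluated for normal components (an assumption)."""
    return mu + sigma * normal_g(n)
```

For $n = 2$ and $n = 3$ the integral reproduces the known closed forms $1/\sqrt{\pi}$ and $3/(2\sqrt{\pi})$, which gives a quick check of the numerical routine.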

A simulation based method to substitute disks with a Shortest Access Time First - SATF scheduling method with an FESC is discussed in [63]. Given $p$ pending requests the service time is reduced in proportion to $p^{1/5}$, i.e., the service time is halved for $p = 32$ random requests. This is an example of using simulation at the lower level to develop an analytic formula for disk service time.
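The resulting state-dependent service time can be written as a one-line function for use in an FESC. This is a sketch of the $p^{1/5}$ rule as stated above; the function name and the base service time argument are hypothetical.

```python
def satf_service_time(base_service_time, pending):
    """Mean disk service time with `pending` queued requests under SATF.

    base_service_time: mean service time of a single random request
    (assumed input); the p**(1/5) reduction follows the rule quoted from [63].
    """
    if pending < 1:
        raise ValueError("at least one pending request is required")
    return base_service_time / pending ** 0.2
```

The service rates of the FESC for queue lengths $p = 1, 2, \dots$ are then the reciprocals of these service times.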

The method developed in [59] is incorporated in the SHARPE reliability and performance modeling package at Duke University.
https://trivedi.pratt.duke.edu/

Hierarchical modeling in the context of reliability and availability engineering is discussed in Chapter 16 in Trivedi and Bobbio 2017 [67].

The discussion is applicable to queueing analysis of communication networks and manufacturing systems Buzacott and Shanthikumar 1993 [7].

Appendix: Equilibrium Point Approximation

A multilevel analysis method of dynamic locking is used to determine txn response times in Ryu and Thomasian 1990 [44]. The analysis takes into account hardware and data resource contention, which is due to lock contention. Txns encountering a lock conflict are blocked and those whose lock requests lead to a deadlock are aborted and restarted. Realistically a txn which has the least resources should be aborted as in the case of the Wait-depth Limited - WDL policy [21]. In realistic models the probability of deadlock is negligibly small.

Txns arrive according to a Poisson process. The number of activated txns ($V$) is restricted by the maximum multiprogramming level ($W$), i.e., $V = \min(A, W)$, where $A$ is the number of txns in the system. Txns making successful lock requests continue their execution, while txns making conflicting lock requests are blocked, so that $J$ txns are active and $V - J$ txns are blocked.

Txns causing a deadlock are aborted and restarted so that the number of active txns remains the same. In fact deadlocks are rare [61] and have a negligible effect on performance, so their effect was ignored in further studies, Thomasian 1993 [62]. Blocked txns are activated when a txn completes or is aborted, releasing all of its locks. The lock and hardware resource contention models are used to determine the txn throughput $\mu(V), 1 \leq V \leq W$, which can then be used in conjunction with the arrival process to determine the mean txn response time.
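The last step can be sketched as a birth-death process in the number of txns in the system $A$, with Poisson arrival rate $\lambda$ and departure rate $\mu(\min(A, W))$, followed by Little's law. This is a minimal illustration, assuming the state space is truncated at a large hypothetical bound `max_txns` and that $\lambda < \mu(W)$; the throughput function passed in is a placeholder for the output of the lower-level models.

```python
def mean_response_time(lam, mu, max_mpl, max_txns=2000):
    """Mean txn response time for arrival rate lam and throughputs mu(V).

    mu: function returning the txn throughput with V activated txns,
        1 <= V <= max_mpl (W in the text).
    """
    if lam >= mu(max_mpl):
        raise ValueError("arrival rate exceeds the maximum throughput")
    p = [1.0]                                  # unnormalized p(A), A = 0
    for a in range(1, max_txns + 1):
        # Balance: lam * p(A - 1) = mu(min(A, W)) * p(A)
        p.append(p[-1] * lam / mu(min(a, max_mpl)))
    total = sum(p)
    mean_txns = sum(a * pa for a, pa in enumerate(p)) / total
    return mean_txns / lam                     # Little's law
```

With `max_mpl = 1` and a constant throughput the sketch reduces to the M/M/1 queue, which provides a simple sanity check.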

To compute the effective system throughput $\mu(V)$ we need to compute the steady-state probabilities of a Markov chain $P_T(J), 1 \leq J \leq V$. The transition from state $J$ to $I$ is determined by the events upon the completion of a txn step. The probabilities for these events are determined at the completion time of txn steps.

$$\mu(V) = \sum_{J \leq V} P_S(J|V)\, t(J) \mbox{ for } 1 \leq V \leq W.$$

An alternative solution method obviates the need to compute the state probabilities for $1 \leq J \leq V$ and reduces the solution cost of lower levels by a factor of $\approx V/\log_2(V)$.

$A(J)$, which is the mean of the difference $I - J$ at completion instants, can be computed as follows:

$$A(J) = \sum_{I=J-1}^{V} (I - J) P_{tran}(J, I|V) \qquad (22)$$

Given that the system is in equilibrium at $\bar{J} = \sum_{J=1}^{V} J P_S(J|V)$ we have

$$A(J) = 0 \mbox{ for } J = \bar{J} \qquad (23)$$

$A(J)$ is positive (resp. negative) when $J < \bar{J}$ (resp. $J > \bar{J}$).

Eq. (23) can be solved using the bisection method, since $A(J)$ is a monotonically decreasing function of $J$. The number of iterations is bounded by $\lceil \log_2(V) \rceil$.
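The bisection can be sketched over the integer states $1 \leq J \leq V$. The drift function below is purely illustrative (a decreasing linear function with hypothetical coefficients); in the actual analysis $A(J)$ would be evaluated from Eq. (22).

```python
def equilibrium_point(drift, max_active):
    """Smallest J in 1..max_active with drift(J) <= 0, found by bisection.

    drift: A(J), assumed monotonically decreasing in J.
    Returns (J, number of evaluations of drift).
    """
    lo, hi = 1, max_active       # the sign change lies in [lo, hi]
    evaluations = 0
    while lo < hi:
        mid = (lo + hi) // 2
        evaluations += 1
        if drift(mid) > 0:
            lo = mid + 1         # still drifting upward: look to the right
        else:
            hi = mid
    return lo, evaluations


V = 16
# Hypothetical drift, for illustration only.
j_eq, evals = equilibrium_point(lambda j: 0.3 * (V - j) / V - 0.05 * j, V)
```

Each iteration halves the candidate range, so only about $\log_2(V)$ evaluations of $A(J)$ are needed instead of the $V$ state probabilities.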

The interpretation of this relationship is that $\bar{J}$ is the system's balance point such that the system tends to stay there [16]. These studies dealt with potential overload due to thrashing in virtual memory systems, where the system throughput increases as the MultiProgramming Level - MPL is increased, but drops beyond a certain MPL. This phenomenon is explored in the context of 2-Phase Locking - 2PL in [62], using a simple model in which the degree of txn concurrency is increased.

The above analysis is motivated by Equilibrium Point Analysis - EPA, which was applied to the analysis of multiaccess protocols in Tasaka [53]: “EPA is a fluid-type approximation which is only applied to the steady state. It assumes that the system is always at an equilibrium point. Therefore, EPA does not necessitate calculating state transition probabilities. An equilibrium point can easily be obtained by numerically solving a set of simultaneous nonlinear equations.”

An application of EPA is illustrated by the example in Figure 20 in [19], where there are $M = 18$ users with think time $Z$ such that $M/Z = 0.9$, so that $Z = 18/0.9 = 20$. The arrival rate of requests to the computer system is $a(N) = (M - N)/Z$. The calculation of the mean number of requests at the computer system is simplified by using the intersection of the graphs of the throughput characteristic $T(N)$ and $a(N)$ to determine $N_{intersect} \approx \bar{N}$. Otherwise we have to solve the following set of equations for $p(N)$, initially setting $p(0) = 1$ and then normalizing so that the probabilities add to one.

$$[(M - N + 1)/Z]\, p(N-1) = t(N)\, p(N), \quad 1 \leq N \leq M$$

such that:

$$p(0) = \left[1 + \sum_{N=1}^{M} p(N)\right]^{-1}.$$
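The two ways of obtaining the mean number of requests can be compared with a short sketch. It uses the birth-death balance between neighboring states, $a(N-1)\, p(N-1) = t(N)\, p(N)$ with $a(N) = (M - N)/Z$, and a bisection for the intersection point. The throughput characteristic $t(N) = N/(N+2)$ is a hypothetical choice (a balanced three-device QN with unit demands) and is not the curve of Figure 20 in [19].

```python
import math


def timesharing_distribution(num_users, think_time, t):
    """p[N], N = 0..M: distribution of the number of requests at the system."""
    p = [1.0]                                   # start with p(0) = 1
    for n in range(1, num_users + 1):
        arrival_rate = (num_users - n + 1) / think_time     # a(N - 1)
        p.append(p[-1] * arrival_rate / t(n))
    total = sum(p)
    return [v / total for v in p]               # normalize to sum to one


def intersection(num_users, think_time, t, tol=1e-9):
    """N at which t(N) = (M - N)/Z, assuming t is increasing in N."""
    lo, hi = 0.0, float(num_users)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if t(mid) < (num_users - mid) / think_time:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def throughput(n):
    return n / (n + 2.0)        # hypothetical throughput characteristic


M, Z = 18, 20.0
p = timesharing_distribution(M, Z, throughput)
mean_n = sum(n * pn for n, pn in enumerate(p))
n_intersect = intersection(M, Z, throughput)
```

Comparing `mean_n` with `n_intersect` indicates how close the equilibrium point is to the exact mean for a given throughput characteristic.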

Acknowledgements

This paper is partially based on papers the author coauthored with PhD student Paul Bay, most notably Thomasian and Bay [59]. The Appendix is based on Ryu and Thomasian [44].

References

  • [2] T. L. Adam, K. M. Chandy, and J. R. Dickson. A comparison of list schedules for parallel processing systems. Commun. ACM 17, 12 (1974), 685-690.
  • [3] M. Ajmone Marsan, G. Conte, and G. Balbo: A class of generalized stochastic Petri nets for the performance evaluation of multiprocessor systems. ACM Trans. on Computer Systems 2, 2 (May 1984), 93-122.
  • [4] M. Ajmone Marsan, G. Conte, G. Balbo, S. Donatelli, and G. Franceschinis. Modelling with Generalised Stochastic Petri Nets. John Wiley & Sons, 1995.
  • [5] F. Baskett, K. M. Chandy, R. R. Muntz, and F. G. Palacios. Open, closed, and mixed networks of queues with different classes of customers. J. ACM 22, 2 (1975), 248-260.
  • [6] G. Bolch, S. Greiner, H. de Meer, and K. S. Trivedi. Queueing Networks and Markov Chains: Modeling and Performance Evaluation with Computer Science Applications, 2nd ed. Wiley-Interscience, 2006.
  • [7] J. A. Buzacott and J. G. Shanthikumar. Stochastic Models of Manufacturing Systems. Prentice Hall, 1993.
  • [8] J. P. Buzen. Computational algorithms for closed queueing networks with exponential servers. Commun. ACM 16(9): 527-531 (1973).
  • [9] J. P. Buzen, R. P. Goldberg, A. M. Langer, E. S. Lentz, H. S. Schwenk, D. A. Sheetz, and A. W. Shum. BEST/1 - Design of a tool for computer system capacity planning. In Proc. AFIPS National Computer Conf. - NCC 1978, 447-455.
  • [10] J. P. Buzen. A Queueing Network Model of MVS. ACM Computing Surveys 10(3): 319-331 (1978).
  • [11] K. M. Chandy and C. H. Sauer. Computational algorithms for product form queueing networks. Commun. ACM 23, 10 (Oct. 1980), 573-583.
  • [12] K. Mani Chandy and D. Neuse. Linearizer: A heuristic algorithm for queueing network models of computing systems. Commun. ACM 25, 2 (Feb. 1982), 126-134.
  • [13] W. W. Chu and K. K. Leung, Module replication and assignment for real-time distributed processing systems. Proc. IEEE 75, 5 (May 1987), pp. 547-562.
  • [14] W. W. Chu, C. Sit, and K. K. Leung. Estimating task response time for real-time distributed systems with resource contentions. IEEE Trans. on Software Engineering 17(10): 1076-1092 (October 1991).
  • [15] E. G. Coffman Jr. and P. J. Denning. Operating Systems Theory. Prentice-Hall 1973.
  • [16] P.-J. Courtois. Decomposability, instabilities, and saturation in multiprogramming systems. Commun. ACM 18(7): 371-377 (1975).
  • [17] A. Dasgupta. A formula for the expected value of the maximum of three independent normals and a sparse high dimensional case. Statistics Dept. at Purdue Univ. downloaded 2023
    https://www.stat.purdue.edu/~dasgupta/orderstat.pdf
  • [18] H. A. David and H. N. Nagaraja. Order Statistics, 3rd edition, Wiley-Interscience 2003.
  • [19] P. J. Denning and J. P. Buzen: The operational analysis of queueing network models. ACM Computing Surveys 10, 3 (1978), 225-261.
  • [20] E. B. Fernandez and B. Bussell. Bounds on the number of processors and time for multiprocessor optimal schedules. IEEE Trans. Computers 22, 8 (1973), 745-751.
  • [21] P. A. Franaszek, J. T. Robinson, and A. Thomasian. Concurrency control for high contention environments. ACM Trans. Database Systems 17, 2 (1992), 304-345.
  • [22] P. Heidelberger and K. S. Trivedi: Queueing Network Models for Parallel Processing with Asynchronous Tasks. IEEE Trans. Computers 31, 11 (Nov. 1982), 1099-1109.
  • [23] P. Heidelberger and K. S. Trivedi: Analytic Queueing Models for Programs with Internal Concurrency. IEEE Trans. Computers 32, 1 (Jan. 1983), 73-82.
  • [24] J. J. Hunter. Variances of first passage times in a Markov Chain with applications to mixing times. Res. Lett. Inf. Math. Sci., 10 (2006), 17-48.
  • [25] J. R. Jackson. Networks of Waiting Lines. Operations Research 5, 4 (1957), 516-521.
  • [26] A. Kapelnikov, R. R. Muntz, and M. D. Ercegovac. A Modeling Methodology for the Analysis of Concurrent Systems and Computations. J. Parallel Distributed Computing 6, 3 (1989), 568-597.
  • [27] A. Kapelnikov, R. R. Muntz, and M. D. Ercegovac. A methodology for performance analysis of parallel computations with looping constructs. J. Parallel Distributed Computing 14, 2 (1992), 105-120.
  • [28] L. Kleinrock. Queueing Systems, Vol I: Theory. Wiley-Interscience 1975.
  • [29] L. Kleinrock. Queueing Systems, Vol. II: Computer Applications, Wiley-Interscience 1976.
  • [30] H. Kobayashi. System Design and Performance Analysis Using Analytic Models. Chapter 3 in K. M. Chandy and R. T. Yeh. Current Trends in Programming Methodology, Vol. III: Software Modeling, Prentice-Hall 1978, 72-114.
  • [31] H. Kobayashi and B. L. Mark. System Modeling and Analysis: Foundations of System Performance Evaluation. Pearson, 2009.
  • [32] K. C.-Y. Kung. Concurrency in Parallel Processing Systems. Ph.D. Dissertation. Computer Science Department, UCLA, 1984.
  • [33] S. S. Lavenberg. Computer Performance Modeling Handbook. Academic Press 1983.
  • [34] E. D. Lazowska, J. Zahorjan, G. Scott Graham, and K. C. Sevcik: Quantitative System Performance: Computer System Analysis Using Queueing Network Models Prentice-Hall 1984.
  • [35] D. F. Martin. The Automatic Assignment and Sequencing of Computations on Parallel Processor Systems. Ph.D. Thesis, U. of California, Los Angeles, Jan. 1966.
  • [36] D. A. Menasce and V. A. F. Almeida. Analytic Models of Supercomputer Performance in Multiprogramming Environments. Int’l J. High Performance Computing Applications 3, 2 (1989), 71-91.
  • [37] D. A. Menasce and L. A. Barroso. A methodology for performance evaluation of parallel applications on multiprocessors. J. Parallel Distributed Computing - JPDC 14, 1 (1992), 1-14.
  • [38] D. A. Menasce, D. Saha, S. C. S. Porto, V. Almeida, and S. K. Tripathi. Static and dynamic processor scheduling disciplines in heterogeneous parallel architectures. J. Parallel Distributed Computing - JPDC 28, 1 (Jan. 1995), 1-18.
  • [39] M. K. Molloy: Performance analysis using stochastic Petri nets. IEEE Trans. Computers 31(9): 913-917 (1982)
  • [40] J. L. Peterson. Petri Net Theory and the Modeling of Systems. Prentice-Hall 1981.
  • [41] M. Reiser and H. Kobayashi. Queuing Networks with Multiple Closed Chains: Theory and Computational Algorithms. IBM J. Research & Development 19, 3 (1975), 283-294.
  • [42] M. Reiser and S. S. Lavenberg. Mean-value Analysis of Closed Multi-chain Queuing Networks. J. ACM 27, 2 (1980), 313-322.
  • [43] M. Reiser: Mean value analysis: A personal account. Performance Evaluation 2000, 491-504
  • [44] I. K. Ryu and A. Thomasian. Analysis of database performance with dynamic locking. J. ACM 37, 3 (1990), 491-523.
  • [45] R. A. Sahner, K. S. Trivedi, and A. Puliafito. Performance and Reliability Analysis of Computer Systems: An Example-Based Approach Using the SHARPE Software Package. Kluwer 1996.
  • [46] S. Salza and S. S. Lavenberg. Approximating response time distributions in closed queueing network models of computer performance. In Proc. 8th Int’l Symp. on Computer Performance Modelling, Measurement and Evaluation, 1981, F. J. Kylstra, Ed., 133-144.
  • [47] C. H. Sauer and K. M. Chandy. Approximate analysis of central server models. IBM J. Research & Development 19, 3 (1975), 301-313.
  • [48] C. H. Sauer, M. Reiser, and E. A. MacNair. RESQ — A package for solution of generalized queueing networks. In Proc. Nat’l Computer Conf. 1977, 978-986.
  • [49] C. H. Sauer. Approximate solution of queueing networks with simultaneous resource possession. IBM J. Research & Development 25, 6 (Nov.-Dec. 1981), 894-903.
  • [50] C. H. Sauer and E. A. MacNair, Extended Queueing Network Models. Chapter 8 in Computer Performance Handbook, S. S. Lavenberg, (ed.), 1983.
  • [51] H. D. Schwetman. Hybrid simulation models of computer systems. Commun. ACM 21, 9 (Sept. 1978), 718-723.
  • [52] W. J. Stewart. Probability, Markov Chains, Queues, and Simulation: The Mathematical Basis of Performance Modeling Princeton Univ. Press. 2009.
  • [53] S. Tasaka. Performance Analysis of Multiple Access Protocols. The MIT Press 1986.
  • [54] A. Thomasian and B. Nadji. Algorithms for queueing network models of multiprogrammed computer systems. Computer Performance 2, 3 (Sept. 1981), 100-123.
  • [55] A. Thomasian and I. K. Ryu. A decomposition solution to the queueing network model of the centralized DBMS with static locking. In Proc. ACM SIGMETRICS on Measurement and Modeling of Computer Systems - SIGMETRICS 1983, 82-92.
  • [56] A. Thomasian and P. F. Bay. Queueing network models for parallel processing of task systems. In Proc. Int’l Conf. on Parallel Processing - ICPP 1983, 421-428.
  • [57] A. Thomasian and K. Gargeya. Speeding up computer system simulations using hierarchical modeling. ACM SIGMETRICS Perform. Evaluation Review 12, 4 (1984), 34-39.
  • [58] A. Thomasian. Performance evaluation of centralized databases with static locking. IEEE Trans. Software Eng. TSE-11, 4 (April 1985), 346-355.
  • [59] A. Thomasian and P. F. Bay. Analytic queueing network models for parallel processing of task systems. IEEE Trans. Computers 35, 12 (Dec. 1986), 1045-1054.
  • [60] A. Thomasian. A performance study of dynamic load balancing in distributed systems. In Proc. Int’l Conf. on Distributed Computing Systems - ICDCS 1987, 178-184.
  • [61] A. Thomasian and I. K. Ryu. Performance analysis of two-phase locking. IEEE Trans. Software Eng. 17, 5 (May 1991), 386-402.
  • [62] A. Thomasian. Two-phase locking performance and its thrashing behavior. ACM Trans. Database Systems 18, 4 (1993), 579-625.
  • [63] A. Thomasian. Survey and analysis of disk scheduling methods. ACM SIGARCH Computer Architecture News 39, 2 (2011), 8-25.
  • [64] A. Thomasian. Analysis of fork/join and related queueing systems. ACM Computing Surveys 47, 2 (Aug. 2014), 17:1-17:71.
  • [65] A. Thomasian. Unbalanced job approximation using Taylor series expansion and review of performance bounds. https://doi.org/10.48550/arXiv.2309.15172
  • [66] K. S. Trivedi. Probability and Statistics with Reliability, Queueing and Computer Science Applications, 2nd ed. Wiley 2001.
  • [67] K. S. Trivedi and A. Bobbio. Reliability and Availability Engineering: Modeling, Analysis, and Applications. Cambridge Univ. Press, 2017.
  • [68] V. L. Wallace and R. S. Rosenberg. Markovian models and numerical analysis of computer system behavior. In Proc. AFIPS Spring Joint Computer Conf. - SJCC 1966, Vol. 27, 141-148.
  • [69] P. D. Welch. Statistical Analysis of Simulation Results. Chapter 6 in Computer Performance Handbook, S. S. Lavenberg (ed.), 1983.
  • [70] A. C. Williams and R. A. Bhandiwad. A generating function approach to queueing network analysis of multiprogrammed computer systems. Networks 6, 1 (1976), 1-22.