Processors: Analysis, Simulation, and Design Implications
4.4 Markov Chain Formalization
4.4.1 State Definition and State Space Determination
Our model considers I processors, each of which contains J threads, and aims to characterize the behavior of processors, threads, and memory. To do that, we first need to clarify the possible activities, i.e. the status transitions, of a thread. They are depicted in Fig. 4.3 and elaborated below. When a packet arrives at an idle thread, the thread either enters the ready queue of the processor to wait for execution, or enters the active status if no thread is currently active. While active, it sometimes issues a memory access, for instance to perform table lookups or manipulate packet descriptors. Once the access is serviced, the thread re-enters the ready queue, or goes directly back to execution if the ready queue is empty. Normally, the thread becomes idle again after the packet is processed and passed to the succeeding thread. However, it gets stuck and enters the finished status if the succeeding thread still has a packet under processing.
Fig. 4.3. Status transitions of a thread.
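According to the above description, we can formally define a state of the system as the tuple of the statuses $s_{i,j}$ of all threads, where $s_{i,j}$ is the status of $T_{i,j}$, the $j$th thread in processor $i$, encoded as idle (0), active (1), memory access (2), ready (3), or finished (4). Furthermore we define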
} Service), rather than FIFO, is assumed as the queuing discipline for both the memory and ready queues. This assumption further reduces the state space by disregarding the ordering information in the queues, and is shown not to affect the correctness of the analytical results in Section 4.5. Taking $(I,J)=(2,2)$ as an example, the state space can be derived by excluding the invalid states that exhibit any of the following properties:
1. A processor has more than one active thread, for instance $(1,1,0,0)$.
2. At least one thread is ready but none is active, such as $(2,3,0,0)$. One of the ready threads must enter the active status as soon as the previous active thread completes its processing.
3. $s_{i,j} = 4$ while $s_{i,j+1} = 0$, $0 \le j < J-1$. In this case, $T_{i,j}$ must pass the packet immediately to the succeeding thread.
4. $s_{i,J-1} = 4$, for the same reason as in item 3.
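The four exclusion rules can be sketched as a brute-force filter over all status tuples. This is only an illustration under stated assumptions: the names `processor_ok` and `valid_states` are hypothetical, the state is taken to be a flat tuple ordered processor by processor, and only the four rules listed above are applied (any further state-space reduction defined elsewhere in the chapter is not modeled).

```python
from itertools import product

# Status codes as labeled in Fig. 4.4.
IDLE, ACTIVE, MEM, READY, FINISHED = range(5)


def processor_ok(p):
    """Check the four exclusion rules for the statuses of one processor."""
    J = len(p)
    # Rule 1: at most one active thread per processor.
    if p.count(ACTIVE) > 1:
        return False
    # Rule 2: a ready thread requires an active thread in the same processor.
    if READY in p and ACTIVE not in p:
        return False
    # Rule 3: a finished thread cannot sit in front of an idle successor.
    for j in range(J - 1):
        if p[j] == FINISHED and p[j + 1] == IDLE:
            return False
    # Rule 4: the last thread can never be blocked in the finished status.
    if p[J - 1] == FINISHED:
        return False
    return True


def valid_states(I, J):
    """Enumerate all flat status tuples of length I*J that pass the rules."""
    states = []
    for s in product(range(5), repeat=I * J):
        if all(processor_ok(s[i * J:(i + 1) * J]) for i in range(I)):
            states.append(s)
    return states


if __name__ == "__main__":
    print(len(valid_states(2, 2)))
```

Under these four rules alone, 12 status pairs survive per processor for J = 2, so the filter keeps 144 tuples for (I,J) = (2,2). The model's actual state count may differ if additional aggregation is applied.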
4.4.2 Determination of the Status Transition Diagram and State Transition Matrix
We need the state transition matrix in order to solve the model. To derive the matrix, however, we first have to deal with the status transition rate diagram of the threads, since a state change occurs when one or more threads alter their statuses. Let $\lambda_i$ denote the packet arrival rate for processor $i$, $r_{i,j}$ and $1/\mu_{i,j}$ the memory access rate and service time of the $j$th thread in that processor, $m$ the memory service rate, and $h$ the number of queued memory accesses from the processor. The resulting status transition rate diagram is shown in Fig. 4.4. Notably, the service rates and the memory access rates of threads having the same thread index are set to be the same in all processors because of the homogeneity among those threads. That is, $\mu_{i,j} = \mu_j$ and $r_{i,j} = r_j$.
Notice that some status transitions in Fig. 4.4 do not have a rate because they are follower transitions. A transition is regarded as a follower if it does not initiate a status change by itself but follows a certain activator transition, which actively launches the change. For example, a finished thread (follower) blocked by its successor can enter the idle status only after the successor (activator) finishes processing and passes down its packet. Another example is that a ready thread (follower) never enters the active status unless a thread switches out from active.
[Figure: status transition rate diagram with statuses idle (0), active (1), mem (2), ready (3), and finished (4); edges are labeled with the rates λ, r_j, mh, and μ_j.]
Fig. 4.4. Status transition rate diagram of $T_{i,j}$.
From the relationship between activators and followers, two additional transitions that are absent from Fig. 4.3 can be identified and are shown in Fig. 4.4: the active-to-active and the active-to-ready transitions. The former occurs when an active thread switches out and is then chosen again to execute the packet passed by its finished predecessor; the latter is similar, except that the thread is not chosen for execution but put into the ready queue.
The state transitions and the transition matrix can therefore be determined from the status transition diagram. More specifically, a state transition is considered valid if it consists of exactly one activation event, which contains an activator transition and possibly a number of corresponding follower transitions. Figure 4.5 shows four example state transitions, assuming $(I,J)=(1,6)$. The detailed matrix derivation is described in the following section.
Fig. 4.5. Example state transitions.
4.4.3 Determination of the State Transition Matrix
A state transition with a non-zero rate consists of one activation event, which contains an activator transition and possibly a number of corresponding follower transitions.
To verify a state transition, we need to characterize the activation event, namely the activator and follower transitions. A transition initiated by a thread in the active (1) or the memory access (2) status is always an activator transition, whereas a transition performed by a thread in the idle (0), ready (3), or finished (4) status is a follower transition, with two exceptions. The exceptions occur when the transitioning thread is the first one in a processor, in which case the idle-to-active and idle-to-ready transitions are activators because they are triggered by a packet arrival.
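The classification rule above can be written as a small predicate. This is a minimal sketch: the function name `is_activator` and the zero-based thread index `j` are assumptions made for illustration.

```python
# Status codes as labeled in Fig. 4.4.
IDLE, ACTIVE, MEM, READY, FINISHED = range(5)


def is_activator(status, j):
    """Return True if a transition leaving `status` on thread j is an activator.

    Assumes j is the zero-based thread index within its processor.
    """
    # Transitions out of active or memory access always initiate an event.
    if status in (ACTIVE, MEM):
        return True
    # Exception: the first thread leaves idle because of a packet arrival
    # (idle-to-active or idle-to-ready).
    if status == IDLE and j == 0:
        return True
    # Every other transition out of idle, ready, or finished is a follower.
    return False
```

For example, `is_activator(IDLE, 0)` holds for the first thread, while `is_activator(IDLE, 1)` does not, since a later thread only leaves idle when its predecessor hands over a packet.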
With the observations above and the conditions defining the statuses of the threads other than the activator thread, all activation events can be identified, as summarized in Table 4.2. An activation event is considered valid if the corresponding conditions of the activator transition are satisfied. For instance, consider an activation event whose activator transition is from active to finished, that is, the thread finishes processing a packet but is blocked by its successor. Two conditions must be met. First, $j < J-1$ and $s_{i,j+1} \in \{2,3,4\}$, since if $j = J-1$ or $s_{i,j+1} = 0$ the thread would have been able to send out the packet. Second, the statuses of the threads other than $T_{i,j}$ in processor $i$ remain unchanged if none of them is in the ready status; otherwise one ready thread shall be chosen for execution. Taking $(I,J)=(1,3)$ as an example, the activation events (2,2,1)=>(2,2,4), (2,1,0)=>(2,4,0), and (3,1,2)=>(3,4,2) are all invalid: the first has $j = J-1$, the second has an idle successor, and the third leaves the ready thread unselected.
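The two conditions for the active-to-finished activator can be checked mechanically. The sketch below covers a single processor only; the function name `valid_act_to_fin` and the tuple representation are assumptions for illustration, and the other activation events of Table 4.2 are not covered.

```python
# Status codes as labeled in Fig. 4.4.
IDLE, ACTIVE, MEM, READY, FINISHED = range(5)


def valid_act_to_fin(src, dst, j):
    """Check an activation event whose activator is thread j going 1 => 4.

    src and dst are status tuples of one processor (assumed representation).
    """
    J = len(src)
    if src[j] != ACTIVE or dst[j] != FINISHED:
        return False
    # Condition 1: thread j has a successor, and that successor is busy.
    if j >= J - 1 or src[j + 1] not in (MEM, READY, FINISHED):
        return False
    others = [k for k in range(J) if k != j]
    changed = [k for k in others if src[k] != dst[k]]
    ready = [k for k in others if src[k] == READY]
    # Condition 2a: with no ready thread, all other statuses stay unchanged.
    if not ready:
        return not changed
    # Condition 2b: otherwise exactly one ready thread is switched to active.
    return (len(changed) == 1
            and src[changed[0]] == READY
            and dst[changed[0]] == ACTIVE)


if __name__ == "__main__":
    # The three invalid examples from the text, followed by a valid variant.
    print(valid_act_to_fin((2, 2, 1), (2, 2, 4), 2))
    print(valid_act_to_fin((2, 1, 0), (2, 4, 0), 1))
    print(valid_act_to_fin((3, 1, 2), (3, 4, 2), 1))
    print(valid_act_to_fin((3, 1, 2), (1, 4, 2), 1))
```

The last call differs from the third invalid example only in that the ready thread is switched to active, which is what the second condition requires.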
Table 4.2. Activation events initiated by $T_{i,j}$, and the corresponding examples ($I = 1$, $J \in \{3,4\}$) and conditions. $s_{i,j}$ and $s'_{i,j}$ denote the source and destination status of $T_{i,j}$, respectively. The status transition rates are shown in Fig. 4.4.
Activator Example Condition
act(1) => fin(4) j 1 j 4
2. There exists an active thread.
mem(2) => act(1) Ex: (2,2,4) => (2,1,4) $\forall j' \neq j,\ s_{i,j'} = s'_{i,j'}$
act(1) => rdy(3) j 4. The same with 1=>1 except
}.
4.4.4 Performance Estimation for the Analytical Model
The performance metrics that we are interested in obtaining from the analytical model include the processor and memory efficiencies. We can compute these measures from the stationary probability vector, π, for the Markov chain. The mean number of executing processors, which we call processing power (Ppower), and the processor utilization, which we call processor efficiency (Pefficiency), are then calculated from the vector as
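$$P_{power} = \sum_{S} \left( \pi_S \cdot \left| \{\, i : \exists j,\ s_{i,j} = 1 \,\} \right| \right), \qquad P_{efficiency} = \frac{P_{power}}{I},$$
where the sum runs over all states $S$ and $s_{i,j}$ is the status of $T_{i,j}$ in $S$.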
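The stationary vector π of a continuous-time Markov chain satisfies πQ = 0 together with the normalization Σπ = 1, where Q is the generator built from the transition rates. The following is a minimal pure-Python sketch, not the solver used in this chapter. It assumes Q is given as a dense list of rows whose entries sum to zero and that the chain is irreducible; the function name `stationary` is hypothetical.

```python
def stationary(Q):
    """Solve pi Q = 0 with sum(pi) = 1 by Gauss-Jordan elimination.

    Q is a dense generator matrix (list of rows, each summing to zero).
    Assumes an irreducible chain, so the solution is unique.
    """
    n = len(Q)
    # Augmented system for Q^T x = 0; the last (redundant) balance
    # equation is replaced by the normalization sum(x) = 1.
    A = [[Q[r][c] for r in range(n)] + [0.0] for c in range(n)]
    A[-1] = [1.0] * n + [1.0]
    for col in range(n):
        # Partial pivoting for numerical stability.
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0.0:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[r][n] for r in range(n)]


if __name__ == "__main__":
    # Toy two-state chain: rate 1 from state 0 to 1, rate 3 back.
    print(stationary([[-1.0, 1.0], [3.0, -3.0]]))
```

For the toy chain, the balance equation gives π = (3/4, 1/4). For the state spaces of realistic (I,J) values, a sparse or iterative solver would likely be preferable to dense elimination.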
Memory utilization, which we call memory efficiency (Mefficiency), number of memory accesses in memory system (Maccesses), and ready queue length of a processor (Rlength) can be calculated as
∑
=I
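As an illustration of how such measures follow from π, the sketch below computes the five metrics from a stationary distribution given as a mapping from state tuples to probabilities. The exact formulas are assumptions inferred from the verbal definitions above, not taken from the original equations: memory efficiency is taken as the probability that at least one thread is in the memory access status (a single shared memory is assumed), and the ready queue length is averaged over the I processors. The function name `metrics` and the flat state layout are hypothetical.

```python
# Status codes as labeled in Fig. 4.4.
IDLE, ACTIVE, MEM, READY, FINISHED = range(5)


def metrics(pi, I, J):
    """Compute performance measures from pi: {flat state tuple: probability}.

    Assumed definitions (see lead-in); states are ordered processor by
    processor, J statuses each.
    """
    p_power = m_eff = m_acc = r_len = 0.0
    for state, prob in pi.items():
        procs = [state[i * J:(i + 1) * J] for i in range(I)]
        # Number of processors that currently have an active thread.
        p_power += prob * sum(1 for p in procs if ACTIVE in p)
        n_mem = state.count(MEM)
        # Mean number of memory accesses in the memory system.
        m_acc += prob * n_mem
        # Memory is busy whenever at least one access is pending (assumed).
        m_eff += prob * (1 if n_mem > 0 else 0)
        # Mean number of ready threads per processor.
        r_len += prob * state.count(READY) / I
    return {
        "Ppower": p_power,
        "Pefficiency": p_power / I,
        "Mefficiency": m_eff,
        "Maccesses": m_acc,
        "Rlength": r_len,
    }


if __name__ == "__main__":
    # Made-up distribution for (I,J) = (1,2), for demonstration only.
    toy = {(0, 0): 0.25, (1, 0): 0.25, (2, 1): 0.25, (1, 3): 0.25}
    print(metrics(toy, 1, 2))
```

The toy distribution is not a solution of the model; it only exercises the bookkeeping.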