
Following the evaluation metrics described in the previous section, the proposed PSWD approach adopts an energy-cost-based sleep window selection policy, in which the reciprocal of the average energy cost serves as the immediate reward $r(s^d_i, s^u_k, a_n) \in R$ of the POMDP model. The rewards are assigned by the Reward Assignment Algorithm shown in Algorithm 1, which also takes the tolerable packet delay $\delta$ into account so that sleep windows of adequate length are selected. If the expected delay of the state/action pair $(s^d_i, s^u_k, a_n)$ calculated by (5.8) satisfies the delay constraint $\delta$, the immediate reward $r(s^d_i, s^u_k, a_n)$ is assigned according to the value from (5.5). Otherwise, the pre-defined value $1/E_{\max}$ is assigned so that the infeasible action is ruled out from $A$ for the traffic states $(s^d_i, s^u_k) \in S$.

Given the sets $S$ and $A$, Algorithm 1 reduces to a table lookup, since $E(s^d_i, s^u_k, a_n)$ and $E_{\max}$ can be calculated in advance. Its time complexity is therefore $O(|S||A|)$, where $|S|$ and $|A|$ denote the number of states in $S$ and the number of actions in $A$, respectively.

Algorithm 1: Reward Assignment Algorithm
Input: $S$, $A$, tolerable delay $\delta$
Output: set of immediate rewards $R(S, A)$
foreach $s^d_i \in S^d$ do
    foreach $s^u_k \in S^u$ do
        foreach $a_n \in A$ do
            if $D(s^d_i, s^u_k, a_n) \le \delta$ then
                $r(s^d_i, s^u_k, a_n) \leftarrow 1 / E(s^d_i, s^u_k, a_n)$
            else
                $r(s^d_i, s^u_k, a_n) \leftarrow 1 / E_{\max}$
            end
        end
    end
end
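The control flow of Algorithm 1 can be sketched in Python as follows. This is only a minimal illustration: it assumes that the expected delay $D$ from (5.8) and the average energy cost $E$ from (5.5) are already available as precomputed lookup tables. The function name `assign_rewards`, the dictionary layout, the value of `e_max`, and all numbers in the toy example are hypothetical and are not taken from the thesis.

```python
from itertools import product


def assign_rewards(dl_states, ul_states, actions, delay, energy, e_max, delta):
    """Build the immediate-reward table R(S, A) following Algorithm 1.

    `delay` and `energy` are hypothetical lookup tables keyed by
    (downlink_state, uplink_state, action); they stand in for the
    quantities computed by (5.8) and (5.5), respectively.
    """
    rewards = {}
    for key in product(dl_states, ul_states, actions):
        if delay[key] <= delta:
            # Delay constraint met: reward is the reciprocal of the energy cost.
            rewards[key] = 1.0 / energy[key]
        else:
            # Delay constraint violated: assign the pre-defined value 1/E_max.
            rewards[key] = 1.0 / e_max
    return rewards


# Toy example with made-up numbers (assumptions for illustration only).
dl_states, ul_states, actions = [0, 1], [0], [2, 4]
energy = {(0, 0, 2): 5.0, (0, 0, 4): 4.0, (1, 0, 2): 8.0, (1, 0, 4): 6.0}
delay = {(0, 0, 2): 1.0, (0, 0, 4): 2.0, (1, 0, 2): 2.5, (1, 0, 4): 4.0}
e_max = 10.0  # hypothetical upper bound on the energy cost
R = assign_rewards(dl_states, ul_states, actions, delay, energy, e_max, delta=3.0)
```

Each state/action triple is visited exactly once and handled with constant-time lookups, which matches the $O(|S||A|)$ complexity stated above.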

The optimal solution of the sleep window selection problem is unavailable because the traffic states are unobservable. However, the belief states of the POMDP model allow the unobservable traffic states to be estimated, and the reward of an action taken in a given traffic state can be obtained from the immediate reward set $R$. Based on these two kinds of information, a suboptimal choice can be made by adopting a $T$-step value function in the PSWD approach. The final decision of sleep window selection at the decision epoch $d_t \in D$ is determined by (5.9) and (5.10).

The term $V^{(d_t)}\langle b(s_i^{d,(d_t)}), b(s_j^{u,(d_t)}) \rangle$ in (5.9) is defined as the $T$-step value function of the energy-cost-based sleep window determination policy, which starts at decision epoch $d_t$ with $T - 1$ decision steps remaining. The first term of (5.10) denotes the immediate reward for the belief state pair $b(s_i^{d,(d_t)}) \in B^{d,(d_t)}$ and $b(s_j^{u,(d_t)}) \in B^{u,(d_t)}$, and the second term represents the expected reward of the future belief state $\langle b(s_i^{d,(d_{t+1})}), b(s_k^{u,(d_{t+1})}) \rangle$. The conditional probability terms can be acquired from the denominator of (5.3). The parameter $\gamma^{(d_{t+1})}$ is the discount factor at step $d_{t+1}$, which controls the convergence of the future value function. In other words, the value function $V^{(d_t)}\langle b(s_i^{d,(d_t)}), b(s_j^{u,(d_t)}) \rangle$ determines the action with the maximum reward (i.e., the minimum energy consumption) according to the currently estimated traffic state and the expected rewards resulting from future actions taken in the subsequent states.
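The structure of a $T$-step value function over belief states can be illustrated with the standard finite-horizon POMDP recursion, sketched below in Python. This is not a reproduction of (5.9) and (5.10): it assumes a single belief vector instead of the downlink/uplink belief pair, a constant discount factor `gamma` instead of a step-dependent one, and a generic transition/observation model. The names `belief_update` and `t_step_value` and the two-state toy model are hypothetical.

```python
def belief_update(belief, action, obs_idx, trans, obs):
    """Bayes update of the belief; returns (new_belief, P(observation))."""
    n = len(belief)
    unnorm = []
    for s2 in range(n):
        # Probability of reaching state s2 under the current belief.
        predicted = sum(belief[s] * trans[action][s][s2] for s in range(n))
        unnorm.append(obs[action][s2][obs_idx] * predicted)
    norm = sum(unnorm)  # normalizer, i.e. the probability of the observation
    if norm == 0.0:
        return list(belief), 0.0
    return [u / norm for u in unnorm], norm


def t_step_value(belief, steps, reward, trans, obs, gamma):
    """Return (best value, best action) for a lookahead of `steps` decisions."""
    n_actions = len(reward[0])
    n_obs = len(obs[0][0])
    best_value, best_action = float("-inf"), None
    for a in range(n_actions):
        # First term: immediate reward expected under the current belief.
        value = sum(b * reward[s][a] for s, b in enumerate(belief))
        if steps > 1:
            # Second term: discounted expected value of the next belief state.
            for o in range(n_obs):
                nxt, p_obs = belief_update(belief, a, o, trans, obs)
                if p_obs > 0.0:
                    future, _ = t_step_value(nxt, steps - 1, reward, trans, obs, gamma)
                    value += gamma * p_obs * future
        if value > best_value:
            best_value, best_action = value, a
    return best_value, best_action


# Toy model (assumed numbers): 2 traffic states, 2 sleep-window actions,
# 2 observations; the same transition/observation matrices for both actions.
T_MAT = [[0.9, 0.1], [0.2, 0.8]]
O_MAT = [[0.8, 0.2], [0.3, 0.7]]
trans = [T_MAT, T_MAT]               # trans[action][state][next_state]
obs = [O_MAT, O_MAT]                 # obs[action][next_state][observation]
reward = [[0.2, 0.5], [0.3, 0.1]]    # reward[state][action], e.g. 1/energy

value, action = t_step_value([0.5, 0.5], 2, reward, trans, obs, gamma=0.9)
```

The recursion enumerates every action and observation at each step, so its cost grows exponentially with the horizon; keeping the number of lookahead steps small is what makes such a lookahead practical at run time.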

Chapter 6

Performance Evaluation

6.1 Model Validation

This section validates the proposed analytical models by comparing them with simulations. Both the sleep ratio and the mean packet delay are evaluated versus the arrival rate $\lambda$, where $\lambda = \lambda_d + \lambda_u$ with $\lambda_d = \lambda_u = \lambda/2$; in particular, $\lambda_{d,i} = \lambda_{u,i} = 0.01$ packets/frame for 16e. The other parameters used in both the analytical models and the simulations are as follows: the service time $1/\mu = 1$ frame, the variance of the service time $\sigma^2 = 0$, the length of the (default) listening window $T_L = 1$ frame, and the idle period $\tau = 4$ frames. Analytical results are represented by lines and simulation results by symbols.

Fig. 6.1 shows the numerical evaluation of Type I for both 16e and 16m sleep mode operation with different lengths of the initial sleep window/cycle, in which the maximum sleep window/cycle is set to 16 frame durations. The outcomes of Type II for 16e and 16m are examined in Fig. 6.2. It can be observed from both figures that the results obtained from the analytical models are consistent with those acquired from the simulations, which validates the correctness of the derived models. There are small deviations in the sleep ratio in Fig. 6.1(a) and in the mean packet delay in Fig. 6.2(b), which are primarily caused by the approximations made in the derivation of ωBη,n in (3.17) and E[WLi] in (4.32), respectively. In both Fig. 6.1(a) and Fig. 6.2(a), the sleep ratio decreases as the traffic load λ increases; 16m clearly maintains a higher sleep ratio, whereas that of 16e degrades dramatically. As for the mean packet delay of Type I and Type II, shown in Fig. 6.1(b) and Fig. 6.2(b), 16m exhibits somewhat higher values than 16e. However, such packet delays still meet the requirements of the individual traffic types, that is, BE traffic for Type I and the bounded delay of QoS-guaranteed services for Type II.

Figure 6.1: Numerical evaluation for Type I of the IEEE 802.16e/m with: (a) sleep ratio vs. packet arrival rate λ; (b) mean packet delay vs. packet arrival rate λ.

These phenomena are mainly attributed to the differences between 16m and 16e mentioned in Section 2.3. Without idle periods, the AMS in 16m can return to the sleep window as soon as it completes data transmission, so energy is conserved. Nevertheless, the opportunities to receive incoming packets immediately are diminished, which consequently incurs slightly longer packet delays. On the other hand, the MS in 16e stays in the normal (active) mode almost all the time under heavy traffic, which leads to poor power-saving performance.

Figure 6.2: Numerical evaluation for Type II of the IEEE 802.16e/m with: (a) sleep ratio vs. packet arrival rate λ; (b) mean packet delay vs. packet arrival rate λ.
