Per-flow sleep scheduling for power management in IEEE 802.16 wireless networks


Jen-Jee Chen a, Shih-Lin Wu b,⇑, Shiou-Wen Wang b, Yu-Chee Tseng c,d

a Department of Electrical Engineering, National University of Tainan, Tainan 70005, Taiwan
b Department of Computer Science and Information Engineering, Chang Gung University, Kweishan Taoyuan 33302, Taiwan
c Department of Computer Science, National Chiao Tung University, Hsin-Chu 30010, Taiwan
d Research Center for Information Technology Innovation, Academia Sinica, Taipei 11529, Taiwan

Article info

Article history: Available online 30 March 2011

Keywords: IEEE 802.16; Mobile communication; Power saving class; Quality of service (QoS); WiMAX; Wireless network

Abstract

Power management is a critical issue in IEEE 802.16 wireless networks. In the standard, a power saving class (PSC) of type II is defined to support real-time traffic flows. It allows a flow to switch periodically between active and sleep states to save energy. However, previous studies either adjust the start frames of PSCs under the assumption that the PSCs are already given, or use one single PSC to accommodate all flows in a mobile station, which leads to higher energy cost. This paper proposes two "per-flow" sleep scheduling schemes, which assign one PSC to each real-time flow according to its QoS parameters. This leads to less energy consumption, more efficient use of bandwidth, and more compact listening windows. We also prove that deciding whether a given scheduling problem is solvable can be reduced to a maximum matching problem, which is computationally tractable. Simulation results show that such per-flow scheduling performs much closer to the active ratio lower bound and achieves higher resource utilization than previous schemes.

1. Introduction

IEEE 802.16 [1] has been defined for broadband wireless access for mobile stations (MSs). One important design issue in IEEE 802.16 is power saving classes (PSCs), which allow an MS to switch between active and sleep modes to reduce unnecessary energy consumption. The standard defines three types of PSCs for different traffic characteristics. For type I, the sizes of listening windows are fixed while the sizes of sleep windows grow exponentially if no data packet arrives. Once a data packet arrives, the corresponding PSC is deactivated. This type is suitable for non-real-time variable-rate (NRT-VR) and Best-Effort (BE) connections. For type II, both listening and sleep windows are of fixed sizes. Such a PSC is deactivated only when asked. This type is suitable for Unsolicited Grant Service (UGS) and real-time variable-rate (RT-VR) connections. Type III is only valid for one sleep window, after which the PSC is deactivated. It is suitable for multicast and management operations. The standard suggests that connections with similar characteristics can be associated with one PSC. If there are multiple PSCs between an MS and a BS, the MS can go to sleep only if all its PSCs are in sleep windows. Fig. 1 shows an example with three PSCs in an MS and the actual intervals in which the MS can go to sleep. Clearly, by increasing the overlapping of listening windows, the MS's energy consumption can be reduced. However, this is left as an open issue for designers.

Extensive work has been devoted to the power saving issues of IEEE 802.11 networks [2–4]. However, these techniques cannot be directly applied to IEEE 802.16 because IEEE 802.11 is a CSMA/CA-based (Carrier Sense Multiple Access with Collision Avoidance) wireless access protocol. Analyses of the energy costs of IEEE 802.16 networks are in [5–8]. These results provide potential guidance for setting PSCs' parameters. However, these schemes all assume non-real-time traffic and most of them consider the arrival patterns to be memoryless, which is not always true in the real world. Numerous works in the literature [9–13] focus on the designs of PSCs of type I. How to adaptively adjust the initial sleep window is addressed in [9]. The relationship between the initial sleep window and the estimated packet inter-arrival time is studied in [10]. How to adjust the initial and the maximum sleep windows is studied in [11]. Assuming that the distribution of the response packet arrival time is known, [12] proposes a decision algorithm. How to adjust the minimum and the maximum sleep windows is discussed in [13].

For PSCs of type II, a Maximum Unavailability Interval (MUI) scheme is proposed in [14] for selecting the optimal start frames of PSCs to maximize the unavailable time. This work does not answer how to decide PSCs' parameters. In [15,16], since one single PSC is applied to serve all real-time connections in an MS, the selection of sleep and listening windows must meet the strictest bandwidth and packet delay bound requirements of all connections. This could incur waste of bandwidth and extra listening windows.

⇑ Corresponding author.
E-mail addresses: jjchen@mail.nutn.edu.tw (J.-J. Chen), slwu@mail.cgu.edu.tw (S.-L. Wu), m9529004@stmail.cgu.edu.tw (S.-W. Wang), yctseng@cs.nctu.edu.tw (Y.-C. Tseng).

Contents lists available at ScienceDirect: Computer Networks

Given a set of real-time flows between an MS and a BS, this paper considers the "per-flow" sleep scheduling problem. This involves not only the selection of PSC parameters of all flows to meet their QoS, but also the scheduling of their active times to reduce the overall duty cycle. This may outperform the results in [15,16] since using one PSC for each flow can more accurately capture its traffic characteristics (Section 2 illustrates this by an example). Our results are thus more bandwidth- and energy-efficient. Two standard-compliant schemes are proposed. Moreover, we prove that deciding whether a given scheduling problem is solvable can be reduced to a maximum matching problem in a bipartite graph, which can be solved in polynomial time. Finally, simulation results are provided to verify these claims.

The rest of this paper is organized as follows. Section 2 shows our motivation and the problem definition. Section 3 presents our schemes. In Section 4, we conduct a feasibility study of the scheduling problem. Simulation results are in Section 5. Section 6 concludes this paper.

2. Motivation and problem definition

We first motivate our work and then present our problem definition. For PSCs of type II, previous works [15,16] try to form one single PSC to serve all connections in an MS. While this is simpler, there are several drawbacks. First, the strictest delay bound among all connections must be followed to avoid missing any deadlines. Second, in each listening window, the BS needs to reserve the maximum bandwidth to accommodate the maximum possible load. This leads to waste of bandwidth and extra awake frames. If each connection has its own PSC according to its packet inter-arrival time and delay bound, the MS only needs to stay awake at proper times, thus incurring less resource waste.

Fig. 2 shows an example with two connections $C_1$ and $C_2$, which have packet inter-arrival times of $PI_1 = 3F$ and $PI_2 = 6F$, delay bounds of $D_1 = 6F$ and $D_2 = 18F$, and expected packet sizes of $S_1 = \frac{2}{5}B$ and $S_2 = \frac{2}{5}B$, respectively, where $F$ is the frame duration and $B$ is the maximum available resource per frame for the MS. As Fig. 2(a) shows, if the periodic on–off scheme (PS) [15] is applied, there is only one PSC of type II, which has a sleeping cycle of 6 frames (i.e., the strictest delay bound $6F$, imposed by $C_1$) and a listening window of 2 frames per cycle (i.e., the traffic demand of $C_1$ and $C_2$ per $6F$, $\lceil \frac{6F}{3F} \rceil S_1 + \lceil \frac{6F}{6F} \rceil S_2 = \frac{6}{5}B$, after taking the ceiling function). As Fig. 2(b) shows, by using two PSCs, we can construct a PSC $P_1$ of period $T_1 = 6$ frames for $C_1$ and a PSC $P_2$ of period $T_2 = 18$ frames for $C_2$. $P_1$ and $P_2$ need 1 and 2 frames of listening windows per cycle, respectively (note that two PSCs can share some active frames as long as their traffic can be accommodated). Compared to Fig. 2(a), which needs 2 listening frames per 6 frames, Fig. 2(b) needs 4 listening frames per 18 frames, thus reducing the energy cost by 33.3%.
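The arithmetic of the single-PSC case in this example can be reproduced with a short sketch. The function name `ps_single_psc` and the choice of units ($F = 1$, $B = 1$, exact fractions) are assumptions made for illustration; the figure of 4 listening frames per 18 frames for the per-flow case is taken from Fig. 2(b) as given rather than computed.

```python
from fractions import Fraction
from math import ceil


def ps_single_psc(flows, B, F):
    """Single-PSC sizing as described for the PS scheme in the example.

    flows: list of (PI, D, S); PI and D are integers in the same time unit
    as F, and S is in the same unit as B. Returns (cycle, window) in frames.
    """
    cycle = min(D for _, D, _ in flows) // F          # strictest delay bound
    demand = sum(ceil(Fraction(cycle * F, PI)) * S for PI, _, S in flows)
    window = ceil(Fraction(demand) / B)               # frames needed per cycle
    return cycle, window


# Example of Fig. 2 with F = 1 and B = 1 (hypothetical normalisation).
S = Fraction(2, 5)
cycle, window = ps_single_psc([(3, 6, S), (6, 18, S)], B=1, F=1)

single_ratio = Fraction(window, cycle)   # 2 listening frames per 6 frames
per_flow_ratio = Fraction(4, 18)         # from Fig. 2(b)
saving = 1 - per_flow_ratio / single_ratio
```

Under these assumptions the single PSC needs a 2-frame window in every 6-frame cycle, and the relative saving evaluates to one third, matching the 33.3% stated in the text.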

Fig. 1. Sleep frame determination of an MS.

This paper considers the per-flow sleep scheduling problem as follows:

Given
(1) $n$ real-time connections $C_i$, $i = 1..n$, between an MS and a BS, where each connection $C_i$ has a packet inter-arrival time $PI_i$ (ms), a delay bound for packets $D_i$ (ms), and an expected packet size $S_i$ (bits), and
(2) the BS can allocate to the MS at most $B$ (bits) per frame,

Find
$n$ PSCs $P_i$, $i = 1..n$, each assigned to the connection $C_i$ and with sleeping cycle $T_i$ (frames), listening window $T_i^L$ (frames), and starting frame of the first listening window $T_i^S$.

Minimize
the ratio of active frames for the MS

Subject to
(1) $T_i \times F \le D_i$ (delay bound) and
(2) $\left\lceil \left\lceil \frac{T_i \times F}{PI_i} \right\rceil \times \frac{S_i}{B} \right\rceil \le T_i^L$ (bandwidth requirement), $i = 1..n$.
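The two constraints can be checked mechanically for a candidate PSC. The following is a minimal sketch, assuming integer values for all parameters (times in ms, sizes in bits); the function names are hypothetical and not part of the paper.

```python
from fractions import Fraction
from math import ceil


def min_listening_window(T, PI, S, B, F):
    """Smallest T^L (frames) allowed by the bandwidth requirement."""
    packets = ceil(Fraction(T * F, PI))    # packets arriving in one cycle
    return ceil(Fraction(packets * S, B))  # frames needed to carry them


def psc_is_valid(T, TL, PI, D, S, B, F):
    """True if the cycle T and window TL satisfy constraints (1) and (2)."""
    meets_delay = T * F <= D
    meets_bandwidth = min_listening_window(T, PI, S, B, F) <= TL
    return meets_delay and meets_bandwidth
```

For instance, with $F = 5$ ms, $B = 20000$ bits, $PI = 10$ ms, $D = 20$ ms and $S = 10000$ bits (illustrative values), a cycle of 4 frames with a 1-frame listening window is accepted, while a cycle of 5 frames violates the delay bound.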

3. Proposed schemes

Next, we present our per-flow sleep scheduling (PSS) algorithm. Our goal is to determine the following parameters for each $C_i$, $i = 1..n$: (1) the cycle length $T_i$, (2) the listening window $T_i^L$ in each cycle, (3) the starting frame $T_i^S$ of the first listening window, and (4) the amount of resource $R_{i,j}$ to be allocated to the $j$th active frame of each cycle, $j = 1..T_i^L$. The calculation should be done at the BS side. Our scheme maintains the property that the cycle length $T_i$ of each $C_i$ is an integer multiple of the previous $T_{i-1}$. Therefore, we will sometimes call $T_1$ the basic cycle $T_{basic}$. The goal is to increase the overlapping of these listening windows so as to reduce the MS's duty cycle. Assuming that $T_{basic}$ is known, our PSS algorithm involves an iterative process for $i = 1..n$, where each iteration has two steps: (i) determine $T_i$ of each $C_i$ and (ii) schedule $R_{i,j}$, $T_i^L$, and $T_i^S$ of each $C_i$. We will discuss how to determine the basic cycle $T_{basic}$ later on. Below, we present some observations, which will serve as guidelines for our design.

Observation 1. The resource of bandwidth $B$ bits per frame allocated to an MS can be regarded as an infinite sequence $S$ with a period of one frame. $S$ can be divided into $p$ sub-sequences $S_i^p$, each with a period of $p$ frames and a resource of $B$ bits per cycle, where $i = 1..p$ and $p$ is a positive integer. Alternatively, $S$ can be divided into $m$ sub-sequences $S_i^{p,k_i}$, each with a period of $p$ and a resource of $k_i \times B$ bits per cycle, where $i = 1..m$, $m < p$, $k_i$ is a positive integer, and $\sum_{i=1..m} k_i = p$.

Observation 2. A sub-sequence $S_i^p$ can be further divided into $p'$ sub-sequences $S_{i,j}^{p \times p'}$, each with a period of $p \times p'$ frames and each shifted by a distance of $p$, where $j = 1..p'$. Similarly, a sub-sequence $S_i^{p,k_i}$ can be further divided into $p'$ sub-sequences $S_{i,j}^{p \times p',k_i}$, $j = 1..p'$, with similar properties except that in each cycle the amount of resource is $k_i \times B$ bits.

Observation 3. The scheduling problem for a connection $C_i$ can be regarded as placing its demand on a sub-sequence with a proper period and resource per cycle. We propose two strategies toward this goal. The first one, called the delay bound-based (DB-based) strategy, tries to accumulate a connection's traffic as much as possible until reaching the delay bound, and serves the connection by a sub-sequence with a period slightly tighter than the delay bound and a resource sufficient for the accumulated traffic per cycle. In this way, fewer active frames are incurred by the connection. The second one, called the packet inter-arrival-based (PI-based) strategy, tries to serve a connection's traffic immediately once a packet arrives, using a sub-sequence with a period slightly tighter than the packet inter-arrival time. If each packet is small, we may overlap multiple connections' active frames and serve their traffic by one or a few active frames.

Observations 1 and 2 indicate some ways to decompose the resource allocated to an MS. Fig. 3(a) shows two examples. With $p = 3$, $S$ is divided into three sub-sequences $S_1^3$, $S_2^3$, and $S_3^3$, each with a period of 3 frames. Alternatively, with $m = 2$, $S$ can be divided into two sub-sequences $S_1^{3,2}$ and $S_2^{3,1}$, which have $2B$ and $B$ bits per cycle, respectively. Following Observation 2, Fig. 3(b) shows how to decompose $S_1^3$ into $p' = 2$ sub-sequences $S_{1,1}^{3 \times 2}$ and $S_{1,2}^{3 \times 2}$, each with a period of $3 \times 2$ frames, and how to decompose $S_1^{3,2}$ into $S_{1,1}^{3 \times 2,2}$ and $S_{1,2}^{3 \times 2,2}$.
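The decompositions of Observations 1 and 2 can be made concrete by listing the frame indices that belong to each sub-sequence. This is a small sketch for the single-frame case (one frame of $B$ bits per cycle); the function names and the finite `horizon` are assumptions introduced for illustration.

```python
def subsequence(p, i, horizon):
    """Frames (1-based) of S^p_i within the first `horizon` frames."""
    return list(range(i, horizon + 1, p))


def split(p, i, p2, j, horizon):
    """Frames of S^{p x p2}_{i,j}: period p * p2, shifted by (j - 1) * p."""
    return list(range(i + (j - 1) * p, horizon + 1, p * p2))


# p = 3: S^3_1 occupies frames 1, 4, 7, 10 of the first 12 frames.
s31 = subsequence(3, 1, 12)
# p' = 2: S^3_1 splits into two sub-sequences of period 6.
s31_1 = split(3, 1, 2, 1, 12)
s31_2 = split(3, 1, 2, 2, 12)
```

The two halves produced by `split` are disjoint and together cover exactly the frames of the parent sub-sequence, which is the property the scheduler relies on when cycles are integer multiples of one another.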

Our PSS follows these observations to pack the traffic of connections together to reduce energy consumption. It has two steps and adds the $C_i$ one by one to the MS such that the additional active frames for the MS are as few as possible. For each $C_i$, we determine the following four sleeping parameters: the sleeping cycle $T_i$, the starting frame of the first listening window $T_i^S$, the listening window $T_i^L$, and the amount of resource $R_{i,j}$ to be allocated to the $j$th active frame of each sleeping cycle, $j = 1..T_i^L$. A data structure $\pi_k$, $k = 1..\frac{T_n}{T_{basic}}$, is maintained to record the amount of remaining free resource in the $k$th basic cycle. Initially, $\pi_k = B \times T_{basic}$.

Fig. 4 shows an example. In the example, a connection $C_i$ is scheduled and assigned $T_i = 6$ (frames), $T_i^L = 2$ (frames), $T_i^S = 4$, $R_{i,1} = 0.6B$, and $R_{i,2} = B$, where $T_{basic} = 3$ frames. Accordingly, the BS records the remaining free resources of the 2nd, 4th, and 6th basic cycles as $\pi_2 = \pi_4 = \pi_6 = B$ and of the 1st, 3rd, and 5th basic cycles as $\pi_1 = \pi_3 = \pi_5 = 2.6B$. Note that $0.4B$ of bandwidth in each basic cycle is consumed by other connections scheduled before $C_i$.
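The bookkeeping of the free-resource record in the Fig. 4 example can be traced with a few lines of code. This is only a sketch: resources are expressed in tenths of $B$ to keep the arithmetic exact, the helper `consume` is hypothetical, and it assumes that the allocation of a connection falls inside a single basic cycle, as it does in this example.

```python
def consume(pi, first_cycle, period_cycles, amount):
    """Subtract `amount` from basic cycle `first_cycle` (1-based) and from
    every `period_cycles`-th basic cycle after it."""
    for k in range(first_cycle - 1, len(pi), period_cycles):
        pi[k] -= amount
    return pi


B = 10                        # one frame of resource, in tenths of B
T_basic, num_cycles = 3, 6    # six basic cycles, as listed in the example
pi = [B * T_basic] * num_cycles   # initially pi_k = B x T_basic

consume(pi, 1, 1, 4)          # 0.4B per basic cycle used by earlier connections
consume(pi, 2, 2, 6 + 10)     # C_i: T_i = 6 frames = 2 basic cycles, T_i^S = 4
                              # lies in basic cycle 2; R = 0.6B + B
```

After both updates the odd-numbered basic cycles hold 26 units ($2.6B$) and the even-numbered ones 10 units ($B$), in agreement with the values recorded by the BS.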

Step (1) Determining $T_i$ of each $C_i$: We propose two approaches for this step. The first one, called PSS-DB (PSS by delay bound), sorts the $C_i$ by their packet delay bounds such that $D_1 \le D_2 \le \cdots \le D_n$. The second one, called PSS-PI (PSS by packet inter-arrival time), sorts the $C_i$ by their packet inter-arrival times such that $PI_1 \le PI_2 \le \cdots \le PI_n$. The design philosophy is in accordance with Observation 3.

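For PSS-DB, we let $T_1 = T_{basic}$ and set $T_i$ for $i = 2..n$ as follows:

$T_i = T_{i-1} \times \left\lfloor \frac{D_i}{T_{i-1} \times F} \right\rfloor$.   (1)

Eq. (1) sets $T_i$ as a positive integer multiple of the previous $T_{i-1}$. In fact, our assignment guarantees in a recursive manner that $T_i \le \lfloor \frac{D_i}{F} \rfloor$. The initial $T_1$ satisfies this condition (to be shown later on). Since $D_{i-1} \le D_i$, $\lfloor \frac{D_i}{T_{i-1} \times F} \rfloor$ in Eq. (1) must be a positive integer. Also, Eq. (1) implies that $T_i \le T_{i-1} \times \frac{D_i}{T_{i-1} \times F} = \frac{D_i}{F}$. Since $T_i$ is an integer, $T_i \le \lfloor \frac{D_i}{F} \rfloor$, which meets the delay bound for $C_i$.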
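The recursion of Eq. (1) is easy to trace in code. The sketch below assumes integer delay bounds and frame duration in the same time unit and a basic cycle that already satisfies $T_{basic} \times F \le D_1$; the function name is hypothetical.

```python
def pss_db_cycles(delay_bounds, F, T_basic):
    """Cycle lengths (frames) per Eq. (1), in ascending order of delay bound."""
    D = sorted(delay_bounds)
    T = [T_basic]                                # T_1 = T_basic
    for d in D[1:]:
        T.append(T[-1] * (d // (T[-1] * F)))     # multiple of the previous cycle
    return T


# Illustrative values: F = 5 ms, T_basic = 4 frames, delay bounds in ms.
cycles = pss_db_cycles([100, 20, 100], F=5, T_basic=4)
```

With these values the cycles come out as 4, 20, and 20 frames: each is a multiple of its predecessor and none exceeds $\lfloor D_i / F \rfloor$.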

Fig. 3. Examples of (a) Observation 1 and (b) Observation 2.

Fig. 4. Output parameters $T_i$, $T_i^L$, $T_i^S$ ...

For PSS-PI, we assume that $PI_i \le D_i$ (this is usually true for most delay-tolerant real-time applications). We let $T_1 = T_{basic}$ and set $T_i$ for $i = 2..n$ as follows:

$T_i = \max\left\{ T_{basic},\ T_{i-1} \times \left\lfloor \frac{PI_i}{T_{i-1} \times F} \right\rfloor \right\}$.   (2)

Eq. (2) also sets $T_i$ as a positive integer multiple of the previous $T_{i-1}$. Our assignment guarantees in a recursive manner that $T_i = T_{basic} \le \left\lfloor \frac{\min_{i=1..n}\{D_i\}}{F} \right\rfloor$ if $PI_i \le \min_{i=1..n}\{D_i\}$ and $T_i \le \lfloor \frac{PI_i}{F} \rfloor$ otherwise. The initial $T_1$ satisfies the former (to be shown later). Since the $PI_i$ are sorted in ascending order, Eq. (2) forces those $C_i$ with $PI_i \le \min_{i=1..n}\{D_i\}$ to choose $T_i = T_{basic}$. The rest of the $C_i$ satisfy the latter condition since $T_i \le T_{i-1} \times \frac{PI_i}{T_{i-1} \times F} = \frac{PI_i}{F}$. It follows that all $C_i$ meet their delay bounds because $PI_i \le D_i$.
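Eq. (2) differs from Eq. (1) only in using the inter-arrival time and in the outer maximum, which keeps the cycle from collapsing to zero when $PI_i < T_{i-1} \times F$. A minimal sketch, assuming integer inputs in a common time unit and a hypothetical function name:

```python
def pss_pi_cycles(inter_arrivals, F, T_basic):
    """Cycle lengths (frames) per Eq. (2), in ascending order of PI."""
    PI = sorted(inter_arrivals)
    T = [T_basic]                                 # T_1 = T_basic
    for pi in PI[1:]:
        T.append(max(T_basic, T[-1] * (pi // (T[-1] * F))))
    return T


# Illustrative values: F = 5 ms, T_basic = 2 frames, PIs of 10, 30, 30 ms.
cycles_pi = pss_pi_cycles([30, 10, 30], F=5, T_basic=2)
# A short PI relative to the basic cycle falls back to T_basic.
fallback = pss_pi_cycles([10, 12], F=5, T_basic=4)
```

In the first call the cycles are 2, 6, and 6 frames; in the second the floor term is zero for the second connection, so the maximum selects $T_{basic} = 4$.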

Theorem 1. In both PSS-DB and PSS-PI, it is guaranteed that each $T_i$ is a positive integer multiple of $T_1 = T_{basic}$, $i = 2..n$, and each $C_i$'s cycle meets its delay bound, i.e., $T_i \le \lfloor \frac{D_i}{F} \rfloor$, $i = 1..n$.

Step (2) Scheduling $R_{i,j}$, $T_i^L$, and $T_i^S$ of each $C_i$: This step is the same for PSS-DB and PSS-PI, so we do not distinguish between them. Recall the data structure $\pi_k$, $k = 1..\frac{T_n}{T_{basic}}$. We sequentially schedule $C_i$, $i = 1..n$, by updating $\pi_k$. Specifically, when $C_i$ is under consideration, we pick one basic cycle among all $\frac{T_n}{T_{basic}}$ basic cycles as the starting point and examine the subsequent basic cycles. For each basic cycle being examined, its remaining resource is allocated; this is repeated until we have allocated sufficient resource for $C_i$. Among all starting points, the one which causes the least increment in the number of active frames is selected. The detailed procedure for placing $C_i$'s demand is as follows:

(a) Calculate the required resource $\gamma_i$ of $C_i$ per $T_i$ by

$\gamma_i = \left\lceil \frac{T_i \times F}{PI_i} \right\rceil \times S_i$.   (3)

(b) Recall that each $T_j$ is an integer multiple of $T_{j-1}$, $j = 2..n$. So after placing the demands of $C_1, C_2, \ldots, C_{i-1}$, the sequence $\pi_k$, $k = 1..\frac{T_n}{T_{basic}}$, has a period of $\frac{T_{i-1}}{T_{basic}}$. To place $C_i$'s demand, we only need to check the first $\frac{T_i}{T_{basic}}$ basic cycles as $T_i$'s starting point. Specifically, for $k = 1$ to $\frac{T_i}{T_{basic}}$ with $\pi_k > 0$, we compute a cost function $f(k)$ to represent the extra active frames incurred to the MS if we place $C_i$'s demand on the $k$th and subsequent basic cycles. Note that since the listening window of a PSC must be continuous, the resources allocated to $C_i$ must be continuous (i.e., we will not leave a frame unallocated if there are frames being allocated before and after the frame).

(c) Let $k^*$ be the index which induces the smallest cost function $f(k)$ in step (b). We place $C_i$'s demand starting from the $k^*$th basic cycle. In case of a tie, we give priority to the one which leaves the least remaining resource in the last frame where $C_i$'s demand is placed.

(d) Then we set $T_i^S$ to the index of the first frame where $C_i$'s demand is placed and set $T_i^L$ to the number of frames from the first to the last frame where $C_i$'s demand is placed. Also, $R_{i,j}$ is set accordingly. Finally, we update the remaining free resources in $\pi_k$, $k = 1..\frac{T_n}{T_{basic}}$, by subtracting from them the amounts of resources allocated to $C_i$ (note that since the period is $T_i$, $\pi_k = \pi_{k + \ell \times \frac{T_i}{T_{basic}}}$, $\ell = 1..\frac{T_n}{T_i} - 1$).
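Steps (b) to (d) can be sketched as a greedy placement over a per-frame record of free resource. The sketch below is one possible reading, not the authors' implementation: it assumes that placement in basic cycle $k$ begins at the first frame of that cycle with free resource, that remaining ties are broken by the lowest $k$, and that resources are integers (here tenths of $B$). All function names are hypothetical. The usage lines replay the four connections of Example 1.

```python
def plan(free, active, T_basic, T, demand, k):
    """Cost of placing `demand` (cycle of T frames) from basic cycle k (1-based).
    Returns (f, leftover_in_last_frame, start, allocs) or None if it cannot fit."""
    H = len(free)
    base = (k - 1) * T_basic
    # Assumption: start at the first frame of basic cycle k with free resource.
    starts = [x for x in range(base, base + T_basic) if free[x] > 0]
    if not starts:
        return None
    start, left, allocs, extra = starts[0], demand, [], 0
    for j in range(T):                      # the window cannot exceed one cycle
        x = (start + j) % H
        take = min(free[x], left)
        allocs.append(take)
        if take > 0 and not active[x]:
            extra += 1                      # this frame becomes newly active
        left -= take
        if left == 0:
            return extra, free[x] - take, start, allocs
    return None


def place(free, active, T_basic, T, demand, k=None):
    """Choose k* (or use a forced k) and commit the allocation in every cycle."""
    candidates = [k] if k else range(1, T // T_basic + 1)
    options = []
    for c in candidates:
        p = plan(free, active, T_basic, T, demand, c)
        if p is not None:
            options.append((p[0], p[1], c, p[2], p[3]))
    extra, _, k_star, start, allocs = min(options)  # least f, then least leftover
    for m in range(len(free) // T):
        for j, take in enumerate(allocs):
            x = (start + j + m * T) % len(free)
            free[x] -= take
            active[x] = active[x] or take > 0
    return {"k": k_star, "f": extra, "TS": start + 1, "TL": len(allocs), "R": allocs}


def remaining(free, T_basic):
    """Free resource per basic cycle (the record pi_k)."""
    return [sum(free[x:x + T_basic]) for x in range(0, len(free), T_basic)]


B, T_basic = 10, 4                                # tenths of B; 4 frames per basic cycle
free, active = [B] * 16, [False] * 16             # horizon T_n = 16 frames
c1 = place(free, active, T_basic, 4, 5)           # gamma_1 = 0.5B
c2 = place(free, active, T_basic, 8, 11, k=2)     # tie broken randomly in Example 1
c3 = place(free, active, T_basic, 8, 3)           # gamma_3 = 0.3B
c4 = place(free, active, T_basic, 16, 26)         # gamma_4 = 2.6B
```

Forcing `k=2` for the second connection mirrors the random tie-break made in Example 1; with that choice the sketch yields the same starting frames, listening windows, and per-frame allocations as the example.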

Example 1. Fig. 5 shows an example of step 2. The MS contains 4 connections $C_1$, $C_2$, $C_3$, and $C_4$ with sleeping cycles of $T_1 = T_{basic}$, $T_2 = 2T_{basic}$, $T_3 = 2T_{basic}$, and $T_4 = 4T_{basic}$ and required resources per cycle of $\gamma_1 = 0.5B$, $\gamma_2 = 1.1B$, $\gamma_3 = 0.3B$, and $\gamma_4 = 2.6B$, respectively, where $T_{basic} = 4$ frames. Initially, $\pi_k = 4B$ for $k = 1..4$. Then, each $C_i$ is scheduled as follows.

For $C_1$, any selection of $k^*$ gives the same result. So we set $k^* = 1$, $R_{1,1} = 0.5B$, $T_1^S = 1$, and $T_1^L = 1$. The BS reserves $\gamma_1 = 0.5B$ of resource for $C_1$ in every basic cycle as shown in Fig. 5(a), so $\pi_1 = \pi_2 = \pi_3 = \pi_4 = 3.5B$.

For $C_2$, its $k^*$ can be 1 or 2. Allocating $\gamma_2$ in the first or second basic cycle would add one more active frame to the MS (i.e., $f(1) = f(2) = 1$) and leave the same remaining resource of $0.4B$ in the last frame. So we randomly select $k^* = 2$, which gives $R_{2,1} = 0.5B$, $R_{2,2} = 0.6B$, $T_2^S = 5$, and $T_2^L = 2$, as shown in Fig. 5(b). Then we update $\pi_1 = 3.5B$, $\pi_2 = 2.4B$, $\pi_3 = 3.5B$, and $\pi_4 = 2.4B$.

For $C_3$, choosing $k^* = 1$ or 2 would require no additional active frame for the MS (i.e., $f(1) = f(2) = 0$). But setting $k^* = 2$ would leave less remaining resource in the last frame (i.e., $0.1B$). So we set $k^* = 2$ and obtain $R_{3,1} = 0.3B$, $T_3^S = 6$, and $T_3^L = 1$, as shown in Fig. 5(c). Then we update $\pi_1 = 3.5B$, $\pi_2 = 2.1B$, $\pi_3 = 3.5B$, and $\pi_4 = 2.1B$.

For $C_4$, setting $k^* = 2$ or 4 would add fewer active frames (i.e., $f(2) = f(4) = 2 < f(1) = f(3) = 3$). Since $k^* = 2$ and 4 both leave the same remaining resource in the last frame, we randomly pick $k^* = 2$. So we set $R_{4,1} = 0.1B$, $R_{4,2} = 1B$, $R_{4,3} = 1B$, $R_{4,4} = 0.5B$, $T_4^S = 6$, and $T_4^L = 4$, as shown in Fig. 5(d). Then we update $\pi_1 = 3.5B$, $\pi_2 = 0$, $\pi_3 = 3B$, and $\pi_4 = 2.1B$.

Fig. 5. Example of scheduling $R_{i,j}$, $T_i^L$, and $T_i^S$ ...

Example 2. Fig. 6 uses an example to compare PSS-DB and PSS-PI. In Fig. 6(a), the MS contains 3 connections $C_1$, $C_2$, and $C_3$ with packet inter-arrival times of $PI_1 = 10$ ms, $PI_2 = 30$ ms, and $PI_3 = 30$ ms, delay bounds of $D_1 = 20$ ms, $D_2 = 100$ ms, and $D_3 = 100$ ms, and packet sizes of $S_1 = \frac{B}{2}$, $S_2 = \frac{B}{2}$, and $S_3 = \frac{B}{2}$. Fig. 6(b) changes the packet sizes to $S_1 = \frac{B}{4}$, $S_2 = \frac{B}{4}$, and $S_3 = \frac{B}{4}$. Fig. 6(a) shows that PSS-DB will consume 9 active frames per 20 frames and PSS-PI will consume 10 active frames per 20 frames, while Fig. 6(b) shows that PSS-DB will consume 7 active frames per 20 frames and PSS-PI will consume only 5 active frames per 20 frames. Intuitively, PSS-DB can pack packets of a connection together according to their delay bound and is more favorable when the $S_i$ are relatively close to $B$. In contrast, PSS-PI tries to pack packets of different connections together and serve them immediately after their arrivals, and is more favorable when the $S_i$ are relatively small compared to $B$.

Lastly, we present how to determine the basic cycle $T_{basic}$. Since $T_{basic} = T_1$ and $T_1$ must satisfy $T_1 \le \lfloor \frac{D_1}{F} \rfloor$ for PSS-DB and $T_1 \le \left\lfloor \frac{\min_{i=1..n} D_i}{F} \right\rfloor$ for PSS-PI, we propose to pick $T_{basic}$ from the interval $\left[1, \lfloor \frac{D_1}{F} \rfloor\right]$ for PSS-DB and the interval $\left[1, \left\lfloor \frac{\min_{i=1..n} D_i}{F} \right\rfloor\right]$ for PSS-PI. For each candidate $T_{basic}$, we adopt an exhaustive search to compute a cost function $g(T_{basic})$ that represents the ratio of active frames of the MS if $T_{basic}$ is used. Let $T_{basic}^*$ be the basic cycle which induces the least cost. Then this $T_{basic}^*$ is chosen for $T_1$.

4. Feasibility study of the scheduling problem

The above discussion did not answer the question: "What happens when a feasible schedule cannot be found?" Below, we conduct a feasibility study of the sleep scheduling problem. It is not hard to see that, given a set of connections, if their traffic can be satisfactorily arranged without violating their deadlines, then "always active" is a straightforward schedule (note that this statement does not imply whether the schedule is energy-efficient or not). Below, we show that deciding whether the traffic of a set of connections can be satisfactorily scheduled can be reduced to a maximum matching problem, which is computationally tractable. Solutions for maximum matching can be found in [17]. Thus, deciding whether a scheduling problem is feasible has a polynomial-time solution (following the above note, it remains a question whether the solution is the most energy-efficient).

We are still given $C_i$ and its $PI_i$, $D_i$, and $S_i$, $i = 1..n$. In our derivation, we assume that the units of $PI_i$ and $D_i$ are frames and the units of $S_i$ and the resource $B$ are bits. Let $L = \mathrm{lcm}\{PI_i, i = 1..n\}$. Note that traffic arrivals repeat with a period of $L$ frames. Below, we model our scheduling problem as a maximum matching problem in a bipartite graph $G = (\{V_l, V_r\}, E)$ as defined below.

1. For each $C_i$, consider its packet arrivals during $L$ frames. There are $\frac{L}{PI_i} \times S_i$ bits. So we construct for $C_i$ the same number of vertices, denoted by $C_{i,j,k}$, $j = 1..\frac{L}{PI_i}$ and $k = 1..S_i$, in $V_l$. So $|V_l| = \sum_{i=1}^{n} \left( S_i \times \frac{L}{PI_i} \right)$.

2. For the $B$ bits in each of the $L$ continuous frames, we construct the same number of vertices, denoted by $F_{x,y}$, $x = 1..L$ and $y = 1..B$, in $V_r$. So $|V_r| = B \times L$.

3. We regard vertex $C_{i,j,k} \in V_l$ as the $k$th bit of the $j$th packet of connection $C_i$. Let $t_{i,j}$ be the frame index of its arrival. Then it has to be delivered by frame $t_{i,j} + D_i - 1$. So we construct edges $(C_{i,j,k}, F_{x,y})$ in $E$ for $x = t_{i,j} + 1..t_{i,j} + D_i - 1 \pmod{L}$ and $y = 1..B$. Each edge means that $C_{i,j,k}$ can be assigned to resource $F_{x,y}$ without violating the delay bound. Note that some $x$ may exceed $L$, so we use "mod $L$" to represent the subsequent $L$ frames in the next round, i.e., they are represented by wrap-around edges.

Example 3. Fig. 7(a) shows an example. The MS contains 2 connections $C_1$ and $C_2$ with $PI_1 = 2$, $PI_2 = 3$, $D_1 = 3$, $D_2 = 3$, $S_1 = 1$, and $S_2 = 2$. Assuming $B = 2$, we can form a bipartite graph as shown in Fig. 7(b). Since $\mathrm{lcm}\{PI_1, PI_2\} = 6$, we consider $\frac{6}{PI_1} = 3$ and $\frac{6}{PI_2} = 2$ packet arrivals of $C_1$ and $C_2$, respectively. Since each packet of $C_1$ has 1 bit and each packet of $C_2$ has 2 bits, there are $3 \times 1 + 2 \times 2 = 7$ vertices in $V_l$. In $V_r$, there are $B \times \mathrm{lcm}\{PI_1, PI_2\} = 2 \times 6 = 12$ vertices. For example, bits $C_{2,2,1}$ and $C_{2,2,2}$ can be assigned to the sixth frame (of the current round) and the first frame (of the next round). The bold lines in Fig. 7(b) show one maximum-matching solution.

Theorem 2. A scheduling problem has a feasible schedule if and only if its corresponding bipartite graph $G = (\{V_l, V_r\}, E)$ has a maximum matching of size $|V_l|$.

Proof. In the above, we show how to translate a scheduling problem to a maximum matching problem in a bipartite graph $G = (\{V_l, V_r\}, E)$. If there is a feasible solution for the scheduling problem, each bit arrival is assigned its bit resource in the solution without violating the delay bound, where each bit arrival (resp., each bit resource) in the scheduling problem has a corresponding vertex in $V_l$ (resp., $V_r$). Therefore, each bit resource assignment in the feasible solution can be mapped to an edge in $E$, and since all the bit arrivals are scheduled, the feasible solution can be mapped to a maximum bipartite matching in $G = (\{V_l, V_r\}, E)$ of size $|V_l|$. This proves the "only if" part.

Conversely, if we can find a maximum matching of size $|V_l|$ in the corresponding bipartite graph $G$ of a scheduling problem, then each bit arrival in the scheduling problem during the period of $L$ continuous frames is assigned a bit of resource without violating the delay bound constraint. Thus, the packet arrivals of all $C_i$ are satisfactorily scheduled. This proves the "if" part. □
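The reduction can be exercised on the instance of Example 3 with a small augmenting-path matcher. This is a sketch under stated assumptions: the arrival frame of the $j$th packet is taken as $t_{i,j} = j \times PI_i - 1$ (chosen so that the second packet of $C_2$ maps to frames 6 and 1, as the example states), the matcher is a plain augmenting-path search rather than any specific algorithm from [17], and all names are hypothetical. `math.lcm` with several arguments requires Python 3.9 or later.

```python
from math import lcm


def max_matching(adj, n_right):
    """Size of a maximum bipartite matching (augmenting-path search)."""
    match_right = [-1] * n_right

    def try_assign(u, seen):
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if match_right[v] == -1 or try_assign(match_right[v], seen):
                    match_right[v] = u
                    return True
        return False

    return sum(try_assign(u, set()) for u in range(len(adj)))


def feasible(conns, B):
    """conns: list of (PI, D, S) with PI, D in frames and S in bits."""
    L = lcm(*(PI for PI, _, _ in conns))
    adj = []                                   # one adjacency list per bit in V_l
    for PI, D, S in conns:
        for j in range(1, L // PI + 1):
            t = j * PI - 1                     # assumed arrival frame t_{i,j}
            frames = [(x - 1) % L for x in range(t + 1, t + D)]  # wrap-around
            slots = [f * B + y for f in frames for y in range(B)]
            adj.extend([slots] * S)            # every bit of the packet
    return max_matching(adj, B * L) == len(adj)


example_3 = feasible([(2, 3, 1), (3, 3, 2)], B=2)
too_little_bandwidth = feasible([(2, 3, 1), (3, 3, 2)], B=1)
```

With $B = 2$ all 7 bit vertices can be matched, so the instance is feasible; with $B = 1$ only 6 resource vertices exist for 7 bits, so no matching of size $|V_l|$ is possible.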

Fig. 7. Example of modeling a scheduling problem as a maximum matching problem.

5. Performance evaluation

To verify our results, we have simulated a BS–MS pair with multiple real-time connections by developing a simulator in C++. Unless otherwise stated, the following assumptions are made in our simulation. The number of connections $n$ ranges from 1 to 30. Each connection $C_i$ has a data rate of 320–3200 bits/frame, a delay bound of 50–1000 ms, and a PI of 20–200 ms, where 320 is the minimum data rate, 3200 is the maximum data rate, 50 is the minimum delay bound, 1000 is the maximum delay bound, 20 is the minimum PI, and 200 is the maximum PI of the MS. The maximum available resource per frame for the MS is $B = 20000$ bits and the length of an OFDM/OFDMA frame is set to 5 ms [18]. We consider three performance metrics: (i) active ratio: the ratio of active frames for the MS, (ii) resource utilization: the ratio of the amount of resource consumed by the MS to the total amount allocated to it, and (iii) fail ratio: the ratio of failures to schedule the MS's sleep. We compare our PSS-DB and PSS-PI against the PS scheme in [15] and an ideal active ratio lower bound (ARL), where ARL stands for the lowest active ratio for the MS to support the traffic of all given connections. We derive ARL by relaxing the delay bounds of connections to $\infty$, i.e., $D_i = \infty$, $i = 1..n$. Then, we can set the sleep cycle of the MS as $\frac{T_c \times \mathrm{lcm}\{PI_i \mid i = 1..n\}}{F}$ frames, where $T_c$ is the minimum integer that makes (1) the sleep cycle an integer and (2) the arrival data during the sleep cycle fill the frames up. Therefore, the listening window of the MS, $T^L$, can be derived as:

$T^L = \sum_{i=1}^{n} \frac{T_c \times \mathrm{lcm}\{PI_i \mid i = 1..n\}}{PI_i} \times \frac{S_i}{B}$.

Then, we can derive ARL as follows:

$ARL = \frac{\sum_{i=1}^{n} \frac{T_c \times \mathrm{lcm}\{PI_i \mid i = 1..n\}}{PI_i} \times \frac{S_i}{B}}{T_c \times \mathrm{lcm}\{PI_i \mid i = 1..n\} / F} = \sum_{i=1}^{n} \frac{S_i}{PI_i \times B} \times F$.   (4)

Note that ARL provides only the value of the active ratio, so we do not compare PSS-DB and PSS-PI against ARL in terms of resource utilization and fail ratio.
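The closed form of Eq. (4) can be evaluated directly. A minimal sketch, assuming integer inputs (PI and F in ms, sizes in bits) and a hypothetical function name; the sample values combine the inter-arrival times of Example 2 with a packet size of $B/2$ and the simulation settings $B = 20000$ bits and $F = 5$ ms purely for illustration.

```python
from fractions import Fraction


def arl(flows, B, F):
    """Active ratio lower bound of Eq. (4): sum of S_i * F / (PI_i * B).

    flows: list of (PI, S) pairs.
    """
    return sum(Fraction(S * F, PI * B) for PI, S in flows)


bound = arl([(10, 10000), (30, 10000), (30, 10000)], B=20000, F=5)
```

For these values the bound is 1/4 + 1/12 + 1/12 = 5/12, meaning that no schedule could keep the MS active in fewer than 5 of every 12 frames on average.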

5.1. Effects of n

Fig. 8 shows the effect of $n$ on the active ratio, resource utilization, and fail ratio by fixing $B = 80$ kbits/frame. Generally, as shown in Fig. 8(a), the active ratio increases as $n$ increases. When $n = 1$, the three schemes, PS, PSS-DB, and PSS-PI, perform the same. As $n$ becomes larger, PSS-DB consumes the fewest active frames of the three schemes and PS performs the worst. This is because PS schedules the sleep of the MS by only considering the strictest delay bound among all connections, while PSS-DB and PSS-PI capture the required resource more accurately by adapting each connection's sleeping cycle to its delay bound and PI, respectively, and thus consume fewer active frames. As can be seen in Fig. 8(a), the difference in active ratio between PSS-DB and PS increases as $n$ increases when $n \le 20$ (the same phenomenon can also be seen between PSS-PI and PS). The difference decreases when $n > 30$ because the network is becoming more and more saturated. Of the three schemes, PSS-DB has the smallest difference to ARL, which is an ideal lower bound for the active ratio obtained by assuming that the delay bounds of connections are $\infty$.

Fig. 8(b) shows the resource utilization over different $n$. In general, the resource utilization decreases as $n$ increases. PSS-DB always performs the best, with a utilization of no less than 88%, followed by the PSS-PI scheme, and PS performs the worst. This is because our approaches assign each connection a PSC such that the resource requirement can be more accurately captured compared to the PS scheme. Furthermore, PSS-DB shows that accumulating packets of a connection together according to the delay bound can help the resource utilization.

Fig. 8(c) shows the schedule fail ratio of each scheme. We can see that PSS-DB always successfully schedules the MS into sleep (with fail ratio zero), while PSS-PI and PS start to have a probability of failing the sleep schedule when $n \ge 30$ and $n \ge 15$, respectively. Our PSS-DB and PSS-PI schemes show better performance than the PS scheme because they capture the resource requirement of connections more precisely than PS, so that less resource needs to be reserved (as already shown in Fig. 8(b)). Note that we have also done some experiments to measure the packet drop rate due to violation of the delay bound for the three schemes. The results are all zero and show that our proposed schemes can guarantee the delay bound of packets like PS. Since the results are zero, we choose not to present these figures.


5.2. Effects of maximum connection delay bound

We then investigate the effect of the maximum connection delay bound on the active ratio, resource utilization, and fail ratio by fixing $n = 5$ and the minimum connection delay bound at 50 ms. As shown in Fig. 9(a), the active ratio decreases as the maximum connection delay bound increases. Of the three schemes, PSS-DB achieves a lower active ratio than PSS-PI and PS and has the smallest difference to ARL. Since ARL does not take the connection delay bound as a constraint, its active ratio is always the same over different delay ranges. Fig. 9(b) shows that PSS-DB achieves the best resource utilization compared with PSS-PI and PS because it evaluates the required resource of connections by delay bounds, causing the least resource waste. Generally, the resource utilization increases as the maximum connection delay bound increases for all three schemes. This is because the MS can have a longer sleeping cycle when the maximum connection delay bound increases. Fig. 9(c) shows that the PS scheme may fail to schedule the MS's sleep, but this is not the case for PSS-DB and PSS-PI.

5.3. Effects of maximum connection packet inter-arrival time

In this experiment, we investigate the effect of the maximum connection packet inter-arrival time (PI) on the active ratio, resource utilization, and fail ratio by fixing n = 5 and the minimum connection PI at 20 ms. Fig. 10(a) shows that the active ratio increases when the maximum connection PI increases. When the data rate of a connection is fixed, a larger PI increases the packet size. This increases the penalty whenever a scheme cannot accurately capture the required resource of connections, which is why the PS scheme performs the worst of the three schemes. On the other hand, our PSS-DB and PSS-PI schemes both perform better than the PS scheme because they assign each connection a PSC according to its traffic characteristics, so that the resource requirement is captured more accurately and fewer active frames are consumed. As shown in Fig. 10(a), compared to the PS scheme, the active ratios of the PSS-DB and PSS-PI schemes improve by almost 20% when the maximum connection PI is 650 ms. In Fig. 10(b), resource utilization decreases as the maximum connection PI increases. Despite this decrease, our PSS-DB and PSS-PI schemes maintain a high utilization (over 86%) and converge. In contrast, the resource utilization of the PS scheme keeps decreasing as the maximum connection PI increases, because a single PSC cannot capture the traffic characteristics of all connections. Fig. 10(c) shows that the PS scheme’s fail ratio increases with the maximum connection PI, while the PSS-DB and PSS-PI schemes have zero fail ratio. This again shows that our PSS-DB and PSS-PI schemes capture the connections’ resource requirements more accurately.
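The link between PI and packet size for a fixed-rate connection can be checked with a few lines of arithmetic. The sketch below is illustrative only: the 5 ms frame length and the helper name `packet_size_bits` are assumptions that do not come from this section, and the data rate is taken in bits per frame, as in the experiments.

```python
FRAME_MS = 5  # assumed frame duration in ms; not stated in this section


def packet_size_bits(rate_bits_per_frame: int, pi_ms: int, frame_ms: int = FRAME_MS) -> int:
    """Bits that accumulate between two packet arrivals of a fixed-rate connection."""
    frames_per_packet = pi_ms // frame_ms  # whole frames elapsed during one PI
    return rate_bits_per_frame * frames_per_packet


if __name__ == "__main__":
    # Same data rate, growing PI: the packet grows proportionally.
    for pi_ms in (20, 200, 650):
        print(pi_ms, "ms ->", packet_size_bits(320, pi_ms), "bits")
```

Under these assumptions the packet size grows linearly with PI, so any over-reservation per packet is magnified as the maximum PI increases.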

5.4. Effects of B

We consider two environments as follows. Environment 1 has five connections, each with a data rate of 320–3200 bits/frame, a delay bound of 50–1000 ms, and a PI of 20–200 ms. Environment 2 is a stress test, where we increase n to 20 and widen the range of PI to 20–500 ms. Fig. 11 shows the effect of B on the active ratio, resource utilization, and fail ratio under environments 1 and 2. Generally, as shown in Fig. 11(a) and (b), the active ratio decreases as B increases. Among the three schemes, PSS-DB always performs the best and has the smallest gap to ARL, followed by the PSS-PI scheme, and the PS scheme performs the worst. As shown in Fig. 11(a), our PSS-DB and PSS-PI schemes perform better than the PS scheme by 12.4–14.7% and 6.6–9.1%, respectively, in environment 1.


Fig. 10. Effects of maximum connection packet inter-arrival time on (a) active ratio, (b) resource utilization, and (c) fail ratio with n = 5.


In a stressful environment (environment 2, as shown in Fig. 11(b)), PS performs worse than in Fig. 11(a) because it considers only the strictest delay bound among the connections, so the QoS characteristics of the other connections are neglected. In contrast, by assigning each connection a PSC, PSS-DB and PSS-PI still have low active ratios even in a stressful environment. As we can see in Fig. 11(b), our PSS-DB and PSS-PI schemes perform much better than PS, by 32.9–51.6% and 27.7–48%, respectively. Fig. 11(c) and (d) show the resource utilization over different B. The resource utilizations of PSS-DB and PSS-PI are always more than 82% and 76%, respectively. This shows that our schemes capture the resource requirement of connections more precisely than PS, which suffers a severe drop in resource utilization when the environment is stressful (only 39.5% when B = 140 kbits/frame, as shown in Fig. 11(d)). This also explains why the PSS-DB and PSS-PI schemes always have better active ratios than PS. Fig. 11(e) and (f) show the fail ratio of each scheme. In general, the fail ratio decreases as B increases. In environment 1, as shown in Fig. 11(e), since all three schemes have high resource utilization (over 80%), none of them fails except the PS scheme at B = 20 kbits/frame. On the other hand, the PS scheme has a severe fail ratio in environment 2, as shown in Fig. 11(f), because it cannot effectively capture the traffic characteristics of connections.
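One way to see why the active ratio falls as B grows is a bandwidth-only bound: if every active frame could be filled completely, the fraction of frames the MS must stay awake is the aggregate data rate divided by B. The sketch below illustrates that bound under this assumption; it is not necessarily the ARL definition used in the paper, and the function name and the sample rates are hypothetical.

```python
from typing import Sequence


def bandwidth_only_active_ratio(rates_bits_per_frame: Sequence[int], b_bits_per_frame: int) -> float:
    """Fraction of frames needed if each active frame carried exactly B bits.

    Ignores delay bounds and packet boundaries, so it can only underestimate
    what a real schedule needs.
    """
    if b_bits_per_frame <= 0:
        raise ValueError("B must be positive")
    return min(1.0, sum(rates_bits_per_frame) / b_bits_per_frame)


if __name__ == "__main__":
    rates = [320, 800, 1600, 2400, 3200]  # five illustrative rates within 320-3200 bits/frame
    for b in (20_000, 40_000, 80_000, 140_000):
        print(b, round(bandwidth_only_active_ratio(rates, b), 4))
```

Because this bound depends only on the aggregate rate and B, it is unaffected by delay bounds, which matches the behavior reported for ARL in Section 5.2.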

5.5. Active ratios by different candidate basic cycles (Tbasic)

Recall that the optimal $T_{basic}$ is the basic cycle that induces the lowest active ratio for PSS-DB and PSS-PI. To find it, PSS-DB and PSS-PI perform an exhaustive search over all candidate $T_{basic}$ values. However, this consumes time and computation power. To reduce this overhead, using the minimum connection delay bound directly as the basic cycle, i.e., $T_{basic} = \frac{\min_{i=1..n}\{D_i\}}{F}$, seems to be a possible strategy because it lets $T_1$ and $T_i$, i = 2..n, be close to $D_1$ and $D_i$, respectively. Thus, the sleep schedule of each $C_i$ can match its traffic characteristics. Fig. 12(a) and (b) show the performance of PSS-DB and PSS-PI, respectively, when the minimum connection delay bound is used directly as the basic cycle. In the experiment, the input parameters are n = 20 and each connection $C_i$ has a data rate of 320–3200 bits/frame, a delay bound of 50–1000 ms, and a PI of 20–500 ms. As shown in Fig. 12(a), with the minimum connection delay bound as the basic cycle, PSS-DB performs between Optimal PSS-DB and Worst PSS-DB, where Optimal PSS-DB uses the optimal $T_{basic}$ as the basic cycle and Worst PSS-DB always selects the worst candidate $T_{basic}$. As B increases, the active ratio of PSS-DB with the minimum connection delay bound as the basic cycle gets closer and closer to that of Optimal PSS-DB. On the other hand, Fig. 12(b) shows that, with the minimum connection delay bound as the basic cycle, PSS-PI performs close to Optimal PSS-PI. To summarize, if the cost of searching for the optimal $T_{basic}$ is not a concern, we suggest using it to obtain the best performance for PSS-DB and PSS-PI. In fact, this exhaustive search is executed only at initialization. Conversely, if time and computation power are a concern, using the minimum connection delay bound directly as the basic cycle is a reasonable compromise and, as shown in our experiment, it is never the worst case. Indeed, as shown in Fig. 12, even with the minimum delay bound as the basic cycle, PSS-DB and PSS-PI still perform better than the PS scheme.
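The trade-off between the exhaustive search and the direct choice of basic cycle can be sketched as follows. This is a minimal illustration under stated assumptions: the candidate range (1 up to the minimum delay bound in frames), the rounding down to whole frames, the 5 ms frame length, and above all the cost function `toy_active_ratio` are hypothetical stand-ins, since the actual PSS-DB and PSS-PI cycle assignment is defined elsewhere in the paper.

```python
from typing import Callable, Sequence


def direct_basic_cycle(delay_bounds_ms: Sequence[int], frame_ms: int) -> int:
    """Low-cost choice from the text: the minimum delay bound, in frames.

    Rounding down to whole frames is an assumption of this sketch.
    """
    return max(1, min(delay_bounds_ms) // frame_ms)


def exhaustive_basic_cycle(
    delay_bounds_ms: Sequence[int],
    frame_ms: int,
    active_ratio: Callable[[int], float],
) -> int:
    """Evaluate every candidate basic cycle and return the one with the lowest cost.

    The candidate range 1..direct_basic_cycle(...) is assumed for illustration.
    """
    upper = direct_basic_cycle(delay_bounds_ms, frame_ms)
    return min(range(1, upper + 1), key=active_ratio)


def toy_active_ratio(t_basic: int, delay_bounds_ms: Sequence[int], frame_ms: int) -> float:
    """Hypothetical stand-in cost, NOT the PSS-DB or PSS-PI rule.

    Each connection gets the largest cycle of the form t_basic * 2**k that fits
    in its delay bound and is charged one active frame per cycle.
    """
    total = 0.0
    for bound_ms in delay_bounds_ms:
        bound_frames = bound_ms // frame_ms
        cycle = t_basic
        while cycle * 2 <= bound_frames:
            cycle *= 2
        total += 1.0 / cycle
    return total


if __name__ == "__main__":
    bounds_ms, frame_ms = [50, 160, 330], 5  # illustrative values only
    cost = lambda t: toy_active_ratio(t, bounds_ms, frame_ms)
    direct = direct_basic_cycle(bounds_ms, frame_ms)
    best = exhaustive_basic_cycle(bounds_ms, frame_ms, cost)
    print("direct:", direct, cost(direct))
    print("exhaustive:", best, cost(best))
```

In this toy setting the search evaluates ten candidates and finds a slightly lower cost than the direct choice, which needs a single evaluation. That mirrors the compromise described above, although the size of the gap here says nothing about the gap in the real schemes.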

Fig. 12. Active ratio by using different Tbasic: (a) PSS-DB and (b) PSS-PI with n = 20, data rate = 320–3200 bits/frame, delay bound = 50–1000 ms, and PI = 20–

6. Conclusions

In this paper, we propose two per-flow sleep scheduling schemes, PSS-DB and PSS-PI, for IEEE 802.16 wireless networks, which guarantee the QoS of real-time connections. For each real-time connection, PSS-DB uses the delay bound to assign the sleeping cycle, while PSS-PI uses the packet inter-arrival time. With these two multiple-PSC solutions, the required resource of the MS can be predicted more accurately than with the single-PSC solution, so the MS can sleep more and achieve higher resource utilization. The proposed schemes are also compatible with the standard and easy to implement. Furthermore, both PSS-DB and PSS-PI perform better than the PS scheme in terms of active ratio, resource utilization, and fail ratio. We also prove that deciding whether a given scheduling problem is solvable can be reduced to a maximum matching problem, which can always be solved in polynomial time. In this work, we consider per-flow sleep scheduling for one MS. For the case of multiple MSs, the operations of our schemes are unchanged, but the number of MSs influences the resource that can be assigned to each MS. Since the network bandwidth is fixed, the maximum available resource per frame B decreases when the number of MSs increases and increases when the number of MSs decreases. As shown in our experiment (Section 5.4), the active ratio of an MS increases as its B decreases. On the other hand, for the system, when the number of MSs increases (that is, when there is more data traffic), the utilization of system resource increases. So, the system resource utilization benefits when the number of MSs increases. In this work, we use an ideal lower bound, ARL, to evaluate the performance of our schemes. As future work, we will try to develop a mathematical model for analyzing the performance of the proposed schemes and find the true optimum against which to evaluate them.

Acknowledgment

Y.-C. Tseng’s research is co-sponsored by MoE ATU Plan, by NSC Grants 97-3114-E-009-001, 97-2221-E-009-142-MY3, 98-2219-E-009-019, 98-2219-E-009-005, and 99-2218-E-009-005, by ITRI, Taiwan, by III, Taiwan, by D-Link, and by Intel. S.-L. Wu’s research is supported by the National Science Council, ROC, under Grant NSC99-2221-E-182-039, and the High Speed Intelligent Communication (HSIC) Research Center of Chang Gung University.

References

[1] IEEE Std 802.16-2009, IEEE Standard for Local and metropolitan area networks. Part 16: Air Interface for Broadband Wireless Access Systems, May 2009.

[2] J.A. Stine, G.D. Veciana, Improving energy efficiency of centrally controlled wireless data networks, ACM/Baltzer Wireless Networks 8 (6) (2002) 681–700.

[3] M. Anand, E.B. Nightingale, J. Flinn, Self-tuning wireless network power management, ACM/Baltzer Wireless Networks 11 (4) (2005) 451–469.

[4] F. Zhang, T.C. Todd, D. Zhao, V. Kezys, Power saving access points for IEEE 802.11 wireless network infrastructure, IEEE Transactions on Mobile Computing 5 (2) (2006) 144–156.

[5] Y. Xiao, Energy saving mechanism in the IEEE 802.16e wireless MAN, IEEE Communications Letters 9 (7) (2005) 595–597.

[6] Y. Zhang, M. Fujise, Energy management in the IEEE 802.16e MAC, IEEE Communications Letters 10 (4) (2006) 311–313.

[7] K. Han, S. Choi, Performance analysis of sleep mode operation in IEEE 802.16e mobile broadband wireless access systems, in: Proceedings of IEEE 63rd Vehicular Technology Conference (VTC’06-Spring), vol. 3, May 2006, pp. 1141–1145.

[8] Y. Zhang, Performance modeling of energy management mechanism in IEEE 802.16e mobile WiMAX, in: Proceedings of IEEE Wireless Communications and Networking Conference (WCNC’07), March 2007, pp. 3205–3209.

[9] J. Xiao, S. Zou, B. Ren, S. Cheng, An enhanced energy saving mechanism in IEEE 802.16e, in: Proceedings of IEEE GLOBECOM’06, November 2006.

[10] S. Cho, Y. Kim, Improving power savings by using adaptive initial-sleep window in IEEE 802.16e, in: Proceedings of IEEE VTC’07-Spring, April 2007, pp. 1321–1325.

[11] F. Xu, W. Zhong, Z. Zhou, A novel adaptive energy saving mode in IEEE 802.16e system, in: Proceedings of Military Communications Conference (MILCOM’06), October 2006.

[12] J.-R. Lee, D.-H. Cho, Performance evaluation of energy-saving mechanism based on probabilistic sleep interval decision algorithm in IEEE 802.16e, IEEE Transactions on Vehicular Technology 56 (4) (2007) 1773–1780.

[13] M.-G. Kim, J.-Y. Choi, M. Kang, Adaptive power saving mechanism considering the request period of each initiation of awakening in the IEEE 802.16e system, IEEE Communications Letters 12 (2) (2008) 106–108.

[14] T.-C. Chen, J.-C. Chen, Y.-Y. Chen, Maximizing unavailability interval for energy saving in IEEE 802.16e wireless MANs, IEEE Transactions on Mobile Computing 8 (4) (2009) 475–487.

[15] S.-L. Tsao, Y.-L. Chen, Energy-efficient packet scheduling algorithms for real-time communications in a mobile WiMAX system, Computer Communications 31 (10) (2008) 2350–2359.

[16] H.-L. Tseng, Y.-P. Hsu, C.-H. Hsu, P.-H. Tseng, K.-T. Feng, A maximal power-conserving scheduling algorithm for broadband wireless networks, in: Proceedings of IEEE WCNC’08, March 2008, pp. 1877–1882.

[17] T.H. Cormen, C.E. Leiserson, R.L. Rivest, Introduction to Algorithms, MIT Press, 2001.

[18] H.S. Kim, S. Yang, Tiny MAP: an efficient MAP in IEEE 802.16/WiMAX broadband wireless access systems, Computer Communications 30 (9) (2007) 2122–2128.

Jen-Jee Chen received his BS and MS degrees in Computer Science and Information Engineering from the National Chiao Tung University, Taiwan, in 2001 and 2003, respectively. He was a Visiting Scholar at the University of Illinois at Urbana-Champaign during the 2007–2008 academic year. Then, he obtained his Ph.D. in Computer Science from the National Chiao Tung University, Taiwan, in October of 2009. He was a post-doctoral research fellow (2010–2011) at the Department of Electrical Engineering, National Chiao Tung University, Taiwan. He is Assistant Professor (2011–present) at the Department of Electrical Engineering, National University of Tainan, Taiwan. His research interests include wireless communications and networks, personal communication networks, mobile computing, cross-layer design, and cloud computing. Dr. Chen is a member of the IEEE and the Phi Tau Phi Society.

Shih-Lin Wu received the B.S. degree in Computer Science from Tamkang University, Taiwan, in June 1987 and the Ph.D. degree in Computer Science and Information Engineering from National Central University, Taiwan, in May 2001. He was an Assistant Professor at Chang Gung University (2001–2007). He is Associate Professor (2007–present) and Chairman (2007–present) at the Department of Computer Science and Information Engineering, Chang Gung University. His current research interests include mobile communications, wireless networks, and distributed robotics. He serves as a member of the editorial boards of Telecommunication Systems, Journal of Positioning, and ISRN Communications. He was a Guest Editor of International Journal of Pervasive Computing and Communications 2007, a Program Chair of Mobile Computing 2005, a Co-Chair of Workshop on IEEE International Conference on Sensor Networks, Ubiquitous, and Trustworthy Computing 2006, a Program Chair of International Workshop on Data Management in Ad Hoc and Pervasive Computing 2009, a Co-Chair of International High Speed Intelligent Communication 2009, and a Co-Chair of International Symposium on Bioengineering 2011. Several of his papers have been chosen as Selected/Distinguished Papers in international/local conferences. Dr. Wu is a member of the IEEE and the Phi Tau Phi Society.

Shiou-Wen Wang received her BS degree in Information Engineering and Computer Science from the Feng Chia University, Taiwan, in 2006. Then, she obtained her MS degree in Computer Science and Information Engineering from the Chang Gung University, Taiwan, in 2009. Her research interests include wireless communications and networks and mobile computing. She has been with Panteck Technology Corp., Taipei, Taiwan, as a software engineer since 2009.


Yu-Chee Tseng received his Ph.D. in Computer and Information Science from the Ohio State University in January of 1994. He is/was Professor (2000–present), Chairman (2005–2009), and Associate Dean (2007–present), Department of Computer Science, National Chiao-Tung University, Taiwan, and Chair Professor, Chung Yuan Christian University (2006–2010).

Dr. Tseng received the Outstanding Research Award (National Science Council, 2001, 2003, and 2009), Best Paper Award (International Conference on Parallel Processing, 2003), Elite I. T. Award (2004), Distinguished Alumnus Award (Ohio State University, 2005), and Y. Z. Hsu Scientific Paper Award (2009). His research interests include mobile computing, wireless communication, and parallel and distributed computing.

Dr. Tseng serves/served on the editorial boards of Telecommunication Systems (2005–present), IEEE Trans. on Vehicular Technology (2005–2009), IEEE Trans. on Mobile Computing (2006–present), and IEEE Trans. on Parallel and Distributed Systems (2008–present).