Low-Complexity Class-Based Scheduling Algorithm for Scheduled Automatic Power-Save Delivery for Wireless LANs


Tsern-Huei Lee, Senior Member, IEEE, and Jing-Rong Hsieh, Member, IEEE

Abstract—Power saving is an important issue when integrating the wireless LAN technology into mobile devices. Besides Quality of Service (QoS) guarantee, the IEEE 802.11e introduces an architecture called Scheduled Automatic Power-Save Delivery (S-APSD) aiming at delivering buffered frames to power save stations. In S-APSD, the Access Point (AP) schedules the Service Period (SP) of stations. To increase power efficiency, SPs should be scheduled to minimize the chance of overlapping. In a recent paper, an algorithm named Overlapping Aware S-APSD (OAS-APSD) was proposed to find the wake-up time schedule for a new Traffic Stream (TS) to minimize the chance of SP overlapping. The combination of OAS-APSD and HCF Controlled Channel Access (HCCA) was proved to outperform 802.11 Power Save Mode (PSM) with Enhanced Distributed Channel Access (EDCA) in power saving efficiency and QoS support. However, the OAS-APSD algorithm requires high online computational complexity which could make it infeasible for real systems. Without harming the optimality, this paper presents an efficient algorithm with much less complexity by exploiting the periodicity of service schedule. Because of largely reduced online computational complexity, the proposed algorithm is much more feasible than OAS-APSD.

Index Terms—Wireless LAN, scheduling, power saving


1 INTRODUCTION

The IEEE 802.11 [1] wireless LAN has been widely deployed due to its low cost and easy installation. One can easily find wireless LAN hotspots in most places of a modern city, such as offices, campuses, cafés, or even on the street. Therefore, more and more mobile devices include wireless LAN functionality as a method for accessing the Internet or sharing files and multimedia data between peer devices. As the hardware performance of mobile devices is greatly improved and many useful features such as location-based services are introduced, it is more likely for people to access the Internet anytime and anywhere through their mobile devices. However, to provide a Quality of Service (QoS) guarantee while prolonging the usage time of mobile devices, several challenges need to be settled.

To cope with QoS support, the IEEE 802.11e standard [2], an enhancement of 802.11, defines a QoS-aware coordination function called Hybrid Coordination Function (HCF). This function consists of two channel access mechanisms. One is contention-based Enhanced Distributed Channel Access (EDCA) and the other is contention-free HCF Controlled Channel Access (HCCA). Because of its contention-free nature, HCCA can provide a much better QoS guarantee than EDCA. EDCA can be used only during the contention period, while HCCA can be used in both the contention period and the contention-free period. Interested readers can find an overview of the 802.11e QoS enhancements in [3]. IEEE 802.11e and the other amendments finished before 2005 have been merged with the 1999 version of the 802.11 standard, and the currently published specification is IEEE 802.11-2007 [16], [17].

Regarding power saving for wireless LAN, most previous works consider ad hoc scenarios because devices in an infrastructure system are usually connected to a power supply or equipped with long-life batteries. The situation is different for multimode mobile devices because their small size severely limits the battery size. The IEEE 802.11 standard provides a power management mechanism at the MAC layer, known as Power Save Mode (PSM). When using PSM, a station (STA) sleeps and wakes up regularly to listen to beacons transmitted by the Access Point (AP). It is assigned an Association ID (AID) during the association process. If its AID is indicated in the Traffic Indication Map of the beacon, meaning that there are data buffered at the AP, the STA remains awake and tries to retrieve the data by sending PS-Poll frames. The AP sets the More Data bit in the data frame it sends to the STA if there are more frames buffered for the STA. The STA enters the Doze state only when all its data are retrieved. Thus, the time it spends listening to the channel is reduced. Obviously, under the PSM mechanism, the delay of downlink frames depends on the STA's listening interval, which is a multiple of the beacon interval. This may not be acceptable for real-time applications. A quantitative evaluation of combinations of PSM and 802.11e can be found in [5].

To provide QoS support for unicast traffic and achieve power saving, the 802.11e standard includes an extension of the PSM mechanism, called Automatic Power Save Delivery (APSD). Two different APSD modes were defined.

The authors are with the Institute of Communication Engineering, National Chiao Tung University, Hsinchu 300, Taiwan, R.O.C. E-mail: {tlee, jingrong}@banyan.cm.nctu.edu.tw.

Manuscript received 2 Dec. 2009; revised 10 Jan. 2011; accepted 30 Dec. 2011; published online 8 May 2012.

For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TMC-2009-12-0524. Digital Object Identifier no. 10.1109/TMC.2012.114.


Unscheduled APSD (U-APSD) is a distributed mechanism where STAs decide when to awake to retrieve their data buffered at the QoS AP (QAP). Scheduled APSD (S-APSD) is a centralized mechanism where the QAP determines the wake-up periods and wake-up times for STAs. There exist some well-known scheduling algorithms, such as Rate Monotonic [11] and Earliest Deadline First [15], that were designed for real-time operating systems and servers/switches which handle delay-sensitive traffic. Unfortunately, their scheduling results are in general not periodic and, therefore, not suitable for power saving. In [4], it was shown that the S-APSD mechanism combined with HCCA provides excellent QoS support and power saving. However, the scheduling algorithm proposed in [4] has high computational complexity, which can make it infeasible for real systems.

The purpose of this paper is to present a low-complexity scheduling algorithm which achieves QoS support and power saving simultaneously. The scheduling criterion is slightly different from that adopted in [4]. To reduce online computational complexity, we classify Traffic Streams (TSs) based on their wake-up periods and store the necessary information to allow fast scheduling. According to simulation results, our proposed scheduling algorithm obtains almost the same energy consumption as that obtained by the scheduling algorithm presented in [4], with a much smaller decision time.

The rest of this paper is organized as follows. In Section 2, we review the S-APSD mechanism and the scheduling algorithm proposed in [4]. Section 3 describes the idea of our proposed scheduling algorithm. The idea is further extended to a system with multiple classes of TSs in Section 4. In Section 5, we compare our proposed algorithms with the existing work [4] in terms of computational complexity and energy consumption. Finally, we draw conclusions in Section 6.

2 RELATED WORKS

2.1 Scheduled Automatic Power-Save Delivery (S-APSD)

To support QoS while dealing with the power saving issue, the U-APSD and S-APSD architectures are defined in IEEE 802.11e [2], [4]. They avoid the necessity of PS-Poll frames when retrieving downlink frames. U-APSD requires STAs to contend for the channel to transmit an uplink data frame or null frame to trigger the delivery of the buffered downlink frames from the QAP. The QAP should use the EDCA access method when selecting the U-APSD scheme. On the other hand, S-APSD lays the burden of channel coordination on the QAP, which calculates the schedule and announces it to STAs. When S-APSD is used, EDCA or HCCA is chosen as the access policy depending on whether the usage is for an Access Category (AC) or for a TS, respectively.

For S-APSD, the STA first communicates with the QAP via an Add Traffic Stream (ADDTS) request frame, setting both the APSD and Schedule subfields in the TS info field, before getting admitted. If the requested service can be satisfied, the QAP notifies the STA of the schedule, including the Service Start Time (SST) and the negotiated Service Interval (SI), in the schedule element. As shown in Fig. 1, both the SST and SI fields are four octets and carry time values in microseconds. Note that although the services are delivered periodically in S-APSD, the conveyed application sources are not necessarily periodic or constant-bit-rate. In APSD, the contiguous time that an STA stays awake to receive the buffered frames from the QAP is defined as a Service Period (SP). STAs using S-APSD should automatically switch to the Awake state at the scheduled starting time of each SP, defined by

$$\mathrm{SST} + m \times \mathrm{SI}, \quad \text{where } m \ge 0,\ m \in \mathbb{N}. \tag{1}$$

They stay awake until they receive a frame with the End Of Service Period (EOSP) flag set, and then fall back to sleep. The service schedule can be updated after negotiation between the QAP and STA finishes. To maintain the QoS guarantee, the new SST should fall into the region between the minimum SI and the maximum SI after the beginning of the previous SP. Compared with the 802.11 PSM, besides QoS support, S-APSD can also reduce signaling loads such as PS-Poll. Moreover, the number of collisions can be decreased as well.

2.2 Overlapping Aware S-APSD (OAS-APSD) [4]

Although IEEE 802.11e defines the architecture of S-APSD, its specific implementation is left as an open issue. For the scenario with multiple STAs which wake up periodically to retrieve their buffered data, the overlapping of SPs is the major source of wasted energy because STAs may be awake for durations longer than their transmissions. Since the medium is shared among STAs, an STA may spend energy on overhearing the transmissions between the QAP and other STAs before returning to sleep.

To reduce the chance of SP overlapping, as described in [4], there could be two approaches to schedule the starting time of SPs. One is contiguous scheduling, which means that the scheduled SPs should be placed one after another. It has the advantage of simplifying the process to determine the SST for the schedule of a new TS. However, it often requires the SIs to be altered to satisfy a certain constraint so that contiguous scheduling is possible. A necessary and sufficient condition for a group of periodic tasks defined by SIs and Transmission Opportunities (TXOPs) to be scheduled contiguously by an Equal-spacing-based Rate Monotonic algorithm was derived in [6]. Altering SIs may shorten the sleeping time of STAs and, as a result, cause more energy consumption to retrieve the same amount of data buffered at the QAP. Besides, contiguous scheduling is only suitable for constant-bit-rate traffic. For variable-bit-rate traffic, it is often a difficult task to determine the duration of an SP to achieve high efficiency in both energy consumption and bandwidth utilization. In [10], Hsieh et al. formulated the problem as one which optimizes energy consumption given an upper bound of bandwidth loss. However, it was assumed that the distribution of traffic arrival of each TS is stationary and known. The assumption may not be realistic and, even if it is acceptable, the huge computational complexity prohibits the optimal solution from being adopted.

In [4], a noncontiguous scheduling algorithm called Overlapping Aware S-APSD was proposed. The OAS-APSD algorithm aims at finding the SST of a new TS which achieves the least probability of SP overlapping. The pseudocode of the OAS-APSD algorithm is shown below. To be concise, we use scheduled instants to represent the scheduled starting times of SPs. The Scheduled Events (SEs) in the OAS-APSD algorithm refer to the scheduled instants known to the QAP, for example, beacons with period BI and already scheduled SPs with periods SIs for TSs. In this algorithm, $SI_{new}$ represents the SI of the new TS to be scheduled.

The OAS-APSD algorithm [4].

SST to be determined given a specific SI_new
N, SST, SST_temp, dist_avg, temp_dist_avg, max_dist_min ← 0
temp_dist_min ← BI
Create empty list of SEs → List_SE
Compute LCM considering all SIs plus BI → LCM
for ∀ SEs ∈ [t_current, t_current + LCM] do
    Insertion in List_SE of SEs
end for
while SST_temp < SI_new do
    while SST_temp + SI_new × N < LCM do
        Find prev_SE and next_SE in List_SE
        dist_next_SE ← next_SE − (SST_temp + SI_new × N)
        dist_prev_SE ← SST_temp + SI_new × N − prev_SE
        Insertion in distances_SST_temp of dist_next_SE and dist_prev_SE
        N ← N + 1
    end while
    temp_dist_min ← Minimum of distances_SST_temp
    temp_dist_avg ← Average of distances_SST_temp
    if temp_dist_min > max_dist_min then
        max_dist_min ← temp_dist_min
        dist_avg ← temp_dist_avg
        SST ← SST_temp
    else if temp_dist_min = max_dist_min then
        if temp_dist_avg > dist_avg then
            dist_avg ← temp_dist_avg
            SST ← SST_temp
        else if temp_dist_avg = dist_avg then
            SST ← random(SST, SST_temp)
        end if
    end if
    SST_temp ← SST_temp + precision
end while

The basic idea of the OAS-APSD algorithm is to find the optimal SST of the new TS in the interval $[0, SI_{new} - 1]$, namely the one which achieves the maximum among the minimum relative distances between SPs of the new TS and existing scheduled events. Here, the relative distances between SPs of the new TS and existing scheduled events are defined as the distances from the starting time of every SP of the new TS to its closest previous and next existing scheduled events, as illustrated in Fig. 2. Note that there are only a finite number of possibilities for the SST because there is a maximum precision used by the 802.11 specification. According to [1], [12], the timing synchronization function of each STA is based on a 1-MHz clock and thus the time ticks in microseconds. As a result, the system is actually slotted, with slot size (or maximum precision) equal to a multiple of one microsecond. Without loss of generality, the duration of the maximum precision is normalized to 1. As a result, the SIs and the scheduled events are integers.

To determine the optimal SST, relative distances are calculated for an interval of duration LCM, the least common multiple of the SIs, including $SI_{new}$. If there is a tie, then the candidate with the maximum average relative distance is selected. In case there is still a tie based on average relative distance, it is broken arbitrarily. Clearly, the computational complexity of the OAS-APSD algorithm is large for large values of LCM. This can make the algorithm infeasible for real systems.
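The exhaustive search described above can be sketched in a few lines of Python. This is only an illustrative reading of the OAS-APSD idea, not the implementation of [4]: the function name `oas_apsd_sst`, the `(period, offset)` input format, the wrap-around treatment of distances over one LCM window, and the deterministic tie-breaking (first candidate wins instead of a random choice) are all assumptions made for the example.

```python
from functools import reduce
from math import gcd


def lcm_all(values):
    """Least common multiple of an iterable of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)


def oas_apsd_sst(existing, si_new, precision=1):
    """Brute-force search for the SST of a new TS (hypothetical helper).

    existing: list of (period, offset) pairs for beacons and scheduled TSs,
              all expressed in integer slots.
    Returns (sst, min_distance, avg_distance) of the best candidate.
    """
    horizon = lcm_all([si_new] + [p for p, _ in existing])
    # All scheduled events inside one repetition window [0, horizon).
    events = sorted({o % p + n * p
                     for p, o in existing
                     for n in range(horizon // p)})
    best_key, best_sst = None, None
    for sst in range(0, si_new, precision):
        dists = []
        for t in range(sst, horizon, si_new):
            # Distance to the closest previous and next scheduled event;
            # the schedule repeats every `horizon` slots, so we wrap around.
            dists.append(min((t - e) % horizon for e in events))
            dists.append(min((e - t) % horizon for e in events))
        key = (min(dists), sum(dists) / len(dists))
        # Larger minimum distance wins; average distance breaks ties.
        if best_key is None or key > best_key:
            best_key, best_sst = key, sst
    return best_sst, best_key[0], best_key[1]


# One existing TS with period 4 and a new TS with period 6.
print(oas_apsd_sst([(4, 0)], 6))
```

The nested loops make the cost grow with both the number of candidates and the LCM of the periods, which is the complexity concern raised in the text.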

Note that the scheduled instants for a specific TS can be represented as a sequence, say,

$$\mathbf{X} = \{x_m\}_{m=-\infty}^{\infty}, \quad \text{such that } x_m = x_{m-1} + p, \tag{2}$$

where $p$, called the period, is the SI of the TS. The origin can be chosen arbitrarily. Consequently, the SST of the TS corresponds to some $x_i$ and the STA to which the TS is attached wakes up periodically at time instants $x_j$ for all $j \ge i$.

3 THE PROPOSED LOW-COMPLEXITY SCHEDULING ALGORITHM

3.1 Basic Idea

In this section, we present the basic idea of determining the optimal SST for a new TS. Consider the simplest case of scheduling the SPs for the first TS with period $p$. Let $\mathbf{X} = \{x_m\}_{m=-\infty}^{\infty}$ be the scheduled instants for the first TS. It is clear that any sequence of period $p$ is optimal. For simplicity, we choose $x_m = p \cdot m$ for all $m$.

Consider now the case where an existing TS with period $p$ has been scheduled and a new TS is to be scheduled. Assume that the period of the new TS to be scheduled is $q$. Let $\mathbf{Y} = \{y_m\}_{m=-\infty}^{\infty}$ be a periodic sequence of period $q$ such that $y_m = q \cdot m$ for all $m$. Further, let

$$\mathbf{Y} + k = \{y_m + k\}_{m=-\infty}^{\infty} \tag{3}$$

be a shifted version of $\mathbf{Y}$. We call $k$ the offset of $\mathbf{Y} + k$ with respect to $\mathbf{Y}$. It is clear that $\mathbf{Y} + k$ is periodic in $k$ with period $q$.


Define the distance between sequences $\mathbf{X}$ and $\mathbf{Y} + k$ as

$$d(\mathbf{X}, \mathbf{Y} + k) = \min_{-\infty < l,\, m < \infty} |y_l + k - x_m|. \tag{4}$$

It is not hard to see that

$$d(\mathbf{X}, \mathbf{Y} + k) = d(\mathbf{Y} + k, \mathbf{X}) = d(\mathbf{X} - k, \mathbf{Y}), \tag{5}$$

and

$$d(\mathbf{X}, \mathbf{Y} + 0) = d(\mathbf{X}, \mathbf{Y}) = 0. \tag{6}$$

Define

$$D(\mathbf{X}, \mathbf{Y}) = \max_{k} \{ d(\mathbf{X}, \mathbf{Y} + k) \}. \tag{7}$$

Our goal is to find $k^*$, the optimal value of $k$, which satisfies

$$k^* = \arg\max_{k}\, d(\mathbf{X}, \mathbf{Y} + k). \tag{8}$$

Once $k^*$ is obtained, the scheduled instants of the new TS are $\mathbf{Y} + k^*$ and the SST can be determined based on the current time. Let $G = \gcd(p, q)$ and $L = \mathrm{lcm}\{p, q\}$ be, respectively, the greatest common divisor and the least common multiple of $p$ and $q$. We prove in Theorem 1 a property of $d(\mathbf{X}, \mathbf{Y} + k)$.

Theorem 1. $d(\mathbf{X}, \mathbf{Y} + k)$ is periodic in $k$ with period $G$.

Proof. It is clear that $d(\mathbf{X}, \mathbf{Y} + k)$ is periodic in $k$ because $d(\mathbf{X}, \mathbf{Y} + q + k) = d(\mathbf{X}, \mathbf{Y} + k)$. Let $n$ be its period. We shall prove that $n \mid G$ (i.e., $n$ divides $G$) and $G \mid n$.

According to the Euclidean algorithm [7], there exist integers $a$ and $b$ such that

$$G = a \cdot p + b \cdot q \quad \text{or} \quad G + (-b) \cdot q = a \cdot p, \tag{9}$$

which implies that one of the scheduled instants of $\mathbf{Y} + G$ coincides with some scheduled instant of $\mathbf{X}$. Consequently, we have $d(\mathbf{X}, \mathbf{Y} + G + k) = d(\mathbf{X}, \mathbf{Y} + k)$, which implies $n \mid G$.

Conversely, since $n$ is the period of $d(\mathbf{X}, \mathbf{Y} + k)$, we have $d(\mathbf{X}, \mathbf{Y} + 0) = d(\mathbf{X}, \mathbf{Y} + n)$, which implies there must exist integers $s$ and $t$ such that

$$n + s \cdot q = t \cdot p \quad \text{or} \quad n = (-s) \cdot q + t \cdot p. \tag{10}$$

As a result, it holds that $G \mid n$ because $G \mid p$ and $G \mid q$. This completes the proof of Theorem 1. □

A consequence of Theorem 1 is that $k^*$ can be chosen to satisfy $0 \le k^* \le G - 1$. Note that to compute $d(\mathbf{X}, \mathbf{Y} + k)$, $0 \le k \le G - 1$, we need only consider finite partial sequences of $\mathbf{X}$ and $\mathbf{Y} + k$ because the same situation repeats every $L$ slots. Two cases are analyzed separately below.

Case 1. $q \ge p$.

For $q \ge p$, we need only consider $\{x_m\}_{m=0}^{L/p-1}$ and $\{y_m + k\}_{m=0}^{L/q-1}$. Let

$$f_m(k) = \lfloor (q \cdot m + k)/p \rfloor, \tag{11}$$

where $\lfloor x \rfloor$ represents the largest integer smaller than or equal to $x$. Define

$$a_m(k) = \min\{\, q \cdot m + k - p \cdot f_m(k),\ \ p \cdot [f_m(k) + 1] - (q \cdot m + k) \,\}, \tag{12}$$

as the shorter distance of $y_m + k$ to the two closest neighboring $x_m$'s. We have

$$d(\mathbf{X}, \mathbf{Y} + k) = \min_{0 \le m \le L/q - 1} \{ a_m(k) \}. \tag{13}$$

Figs. 3a and 3b show an example for $p = 4$ and $q = 6$. For this example, we have $G = 2$ and, therefore, we need only compute $d(\mathbf{X}, \mathbf{Y} + 0)$ and $d(\mathbf{X}, \mathbf{Y} + 1)$. One can easily verify that $a_0(0) = \min\{0, 4\} = 0$, $a_1(0) = \min\{2, 2\} = 2$, $d(\mathbf{X}, \mathbf{Y} + 0) = \min\{0, 2\} = 0$; and $a_0(1) = \min\{1, 3\} = 1$, $a_1(1) = \min\{3, 1\} = 1$, $d(\mathbf{X}, \mathbf{Y} + 1) = \min\{1, 1\} = 1$. As a consequence, we have $D(\mathbf{X}, \mathbf{Y}) = \max\{d(\mathbf{X}, \mathbf{Y} + 0), d(\mathbf{X}, \mathbf{Y} + 1)\} = 1$ and $k^* = 1$.
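As a quick check of (11) to (13), the following Python sketch evaluates $a_m(k)$ and $d(\mathbf{X}, \mathbf{Y} + k)$ directly and reproduces the numbers of the $p = 4$, $q = 6$ example. The function names `a`, `d`, and `best_offset` are chosen here for illustration only; ties in the maximum are resolved by taking the smallest $k$, which is one arbitrary choice among those the text allows.

```python
from math import gcd


def a(m, k, p, q):
    """Shorter distance of y_m + k to its two neighboring x's, Eq. (12)."""
    f = (q * m + k) // p                      # Eq. (11)
    return min(q * m + k - p * f, p * (f + 1) - (q * m + k))


def d(k, p, q):
    """Distance between X (period p) and Y + k (period q), Eq. (13)."""
    L = p * q // gcd(p, q)                    # least common multiple
    return min(a(m, k, p, q) for m in range(L // q))


def best_offset(p, q):
    """Return (D, k_star, [d(0), ..., d(G-1)]) following Eqs. (7)-(8)."""
    G = gcd(p, q)
    dists = [d(k, p, q) for k in range(G)]
    D = max(dists)
    return D, dists.index(D), dists


print(best_offset(4, 6))    # the example of Figs. 3a and 3b
print(best_offset(12, 18))  # a second pair of periods
```

Evaluating `d(k, p, q)` for `k` beyond `G - 1` also gives a simple numerical illustration of Theorem 1, since the values repeat with period `G`.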

Case 2. $q < p$.

For $q < p$, one can compute $a_m(k)$ for $0 \le m \le L/q - 1$ and then determine $d(\mathbf{X}, \mathbf{Y} + k) = \min_{0 \le m \le L/q - 1} \{a_m(k)\}$. Alternatively, one can change the roles of $p$ and $q$ and apply the procedure performed for Case 1. Note that $d(\mathbf{X}, \mathbf{Y} + k) = d(\mathbf{X} - k, \mathbf{Y}) = d(\mathbf{X} + G - k, \mathbf{Y})$ implies that the desired results can be obtained by interchanging the roles of $p$ and $q$. By doing so, the complexity of computing $d(\mathbf{X}, \mathbf{Y} + k)$ is reduced because fewer $a_m(k)$'s ($L/p$ versus $L/q$) are calculated.

To summarize, in order to determine $d(\mathbf{X}, \mathbf{Y} + k)$, we need to compute $L / \max\{p, q\}$ $a_m(k)$'s and then pick the smallest one. Since there are $G$ different values for variable $k$, the complexity of determining $d(\mathbf{X}, \mathbf{Y} + k)$, $0 \le k \le G - 1$, is $O(G \cdot L / \max\{p, q\}) = O(\min\{p, q\})$ multiplications and divisions. The following algorithm eliminates all multiplications and divisions. The algorithm requires roughly $4 \cdot \min\{p, q\}$ comparisons.

Algorithm for computing $D(\mathbf{X}, \mathbf{Y})$ and $k^*$ assuming that $q \ge p$.

R ← q mod p
D, k* ← 0    /* D stores the value of D(X, Y). */
k ← 1
while k < G
    m ← 0
    S ← k    /* S stores the value of q·m + k − p·f_m(k). */
    relative_dist ← min{S, p − S}
    min_relative_dist ← relative_dist
    while m ≤ L/q − 1
        S ← S + R
        if S > p
            S ← S − p
        end if
        relative_dist ← min{S, p − S}
        min_relative_dist ← min{relative_dist, min_relative_dist}
        m ← m + 1
    end while
    if D < min_relative_dist
        D ← min_relative_dist
        k* ← k
    else if D = min_relative_dist
        k* ← random(k*, k)
    end if
    k ← k + 1
end while
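The incremental procedure above translates almost line by line into Python. The sketch below is an illustration under a few stated assumptions rather than a definitive implementation: the name `best_offset_incremental` is hypothetical, the wrap test uses `S >= p` (equivalent for the minimum, since a residue of `p` and a residue of 0 both give distance 0), the inner loop runs `L/q - 1` times because the starting point `m = 0` is handled before the loop, and ties keep the first `k` found instead of picking one at random. The quantities `G` and `L/q` are computed once up front here; in an online setting they would be precomputed constants.

```python
from math import gcd


def best_offset_incremental(p, q):
    """Compute (D, k_star) for periods p <= q using only additions,
    subtractions, and comparisons inside the loops."""
    G = gcd(p, q)
    steps = p // G          # equals L/q, the number of y's per repetition
    R = q % p
    D, k_star = 0, 0        # offset 0 always has distance 0
    for k in range(1, G):
        S = k               # residue of q*m + k modulo p, starting at m = 0
        min_rel = min(S, p - S)
        for _ in range(steps - 1):
            S += R          # advance to the next scheduled instant of Y + k
            if S >= p:
                S -= p
            min_rel = min(min_rel, S, p - S)
        if min_rel > D:     # keep the first maximizer (deterministic tie-break)
            D, k_star = min_rel, k
    return D, k_star


print(best_offset_incremental(4, 6))
print(best_offset_incremental(12, 18))
```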

3.2 Generalization to K Existing TSs

Let us now extend the results to $K \ge 2$ existing TSs when a new TS is to be scheduled. Assume that the period of the $i$th existing TS is $p_i$ and the period of the TS to be scheduled is $q$. Let $\mathbf{X}_i = \{x_{i,m}\}_{m=-\infty}^{\infty}$, $1 \le i \le K$, be a sequence of period $p_i$ such that $x_{i,m} = p_i \cdot m$ for all $m$. Further, let

$$\mathbf{X}'_i = \mathbf{X}_i + O_i \tag{14}$$

be the scheduled instants of the $i$th existing TS. We shall use $\mathbf{X}_1$ as reference and, therefore, assign $O_1 = 0$. Consequently, we have $\mathbf{X}'_1 = \mathbf{X}_1$ and

$$O_i = x_{i,0} - x_{1,0} \tag{15}$$

represents the offset of $\mathbf{X}'_i$ with respect to $\mathbf{X}'_1$. According to the results obtained in Section 3.1, it holds that $0 \le O_2 \le p_2 - 1$. We shall prove that $O_i$ satisfies $0 \le O_i \le p_i - 1$ for all $i$. Let $G_i = \gcd(p_i, q)$, $1 \le i \le K$, and

$$GL = \mathrm{lcm}\{G_1, G_2, \ldots, G_K\}, \tag{16}$$

the least common multiple of $G_1, G_2, \ldots,$ and $G_K$. Also, let $L_i = \mathrm{lcm}\{p_i, q\}$, $1 \le i \le K$.

Again, let $\mathbf{Y} = \{y_m\}_{m=-\infty}^{\infty}$ be a periodic sequence of period $q$ with $y_m = q \cdot m$ for all $m$ and $\mathbf{Y} + k = \{y_m + k\}_{m=-\infty}^{\infty}$ be a shifted version of $\mathbf{Y}$. Let

$$\mathbf{X} = \bigcup_{1 \le i \le K} \mathbf{X}'_i, \tag{17}$$

such that $x$ is an element of $\mathbf{X}$ if and only if it is an element of $\mathbf{X}'_i$ for some $i$, $1 \le i \le K$. Clearly, $\mathbf{X}$ is a periodic sequence with period $\mathrm{lcm}\{p_1, p_2, \ldots, p_K\}$. Define

$$d(\mathbf{X}'_i, \mathbf{Y} + k) = \min_{-\infty < l,\, m < \infty} |y_l + k - x_{i,m} - O_i|, \tag{18}$$

and

$$d(\mathbf{X}, \mathbf{Y} + k) = \min_{1 \le i \le K} \{ d(\mathbf{X}'_i, \mathbf{Y} + k) \}. \tag{19}$$

The optimal value of $k$ is again given by (8). It can be easily shown that $d(\mathbf{X}, \mathbf{Y} + k)$ is periodic with period $GL$ because $d(\mathbf{X}'_i, \mathbf{Y} + k)$ is periodic with period $G_i$, $1 \le i \le K$. As a result, $k^*$ can be chosen to satisfy $0 \le k^* \le GL - 1$. The fact that $G_i \mid q$, $1 \le i \le K$, implies $GL \mid q$. In other words, $GL$ is a factor of $q$ and, therefore, is upper bounded by $q$. If the new TS is considered as the $(K+1)$th TS when the $(K+2)$th TS is to be scheduled, then we have

$$O_{K+1} = k^* \le q - 1 = p_{K+1} - 1. \tag{20}$$

This proves the property that $O_i$ satisfies $0 \le O_i \le p_i - 1$.

To compute $d(\mathbf{X}, \mathbf{Y} + k)$, we need $d(\mathbf{X}'_i, \mathbf{Y} + k)$ for all $i$, $1 \le i \le K$. To determine $k^*$, a scheduling matrix of size $K \times GL$ is constructed, as shown in Fig. 4. The $i$th row of the scheduling matrix is $[d(\mathbf{X}'_i, \mathbf{Y} + 0)\ \ d(\mathbf{X}'_i, \mathbf{Y} + 1)\ \ldots\ d(\mathbf{X}'_i, \mathbf{Y} + G_i - 1)]$ repeated $GL/G_i$ times. Given the scheduling matrix, $d(\mathbf{X}, \mathbf{Y} + k)$ can be obtained as the minimum element of the $k$th column. Finally, the optimal value of $k$ is given by the index of the column with the maximum $d(\mathbf{X}, \mathbf{Y} + k)$.

As derived previously, computing $d(\mathbf{X}'_i, \mathbf{Y} + k)$, $0 \le k \le G_i - 1$, requires $4 \cdot \min\{p_i, q\}$ comparisons. The overall complexity to generate the scheduling matrix for $K$ existing TSs is, therefore, $\sum_{i=1}^{K} 4 \cdot \min\{p_i, q\}$, which is upper bounded by $4 \cdot q \cdot K$ comparisons. To find $k^*$, we need $K \times GL$ comparisons to obtain $d(\mathbf{X}, \mathbf{Y} + k)$, $0 \le k \le GL - 1$, and $GL$ comparisons to determine $k^*$.

Note that our algorithm is unable to break a tie based on average distance. In case there are multiple choices for $k^*$, we select the one with the maximum column sum. As a result, it requires $2(K - 1)$ additions and one comparison for each tie-breaking. If there is still a tie, it is broken arbitrarily.

Example 1. Consider an example for $K = 2$, $p_1 = 12$, $p_2 = 15$, and $q = 18$. Assume that the TS with period $p_1$ was scheduled earlier than the TS with period $p_2$. Since $\mathbf{X}_1$ is used as reference, we assign $\mathbf{X}'_1 = \mathbf{X}_1 = \{12 \cdot m\}_{m=-\infty}^{\infty}$. $O_2$, the offset of $\mathbf{X}'_2$ with respect to $\mathbf{X}'_1$, has to be determined based on the scheduling algorithm. Since $\gcd(12, 15) = 3$, there are only three possible values for $O_2$, as shown in Fig. 5. After some calculations, we get $O_2 = 1$ or $2$. Assume that we choose $O_2 = 2$. To schedule the third TS with period $q = 18$, we need to compute $d(\mathbf{X}'_1, \mathbf{Y} + k)$, $0 \le k \le 5$, and $d(\mathbf{X}'_2, \mathbf{Y} + k)$, $0 \le k \le 2$, because $G_1 = \gcd(12, 18) = 6$ and $G_2 = \gcd(15, 18) = 3$. The results are $[d(\mathbf{X}'_1, \mathbf{Y} + 0)\ \ d(\mathbf{X}'_1, \mathbf{Y} + 1)\ \ldots\ d(\mathbf{X}'_1, \mathbf{Y} + 5)] = [0\ 1\ 2\ 3\ 2\ 1]$ and $[d(\mathbf{X}'_2, \mathbf{Y} + 0)\ \ d(\mathbf{X}'_2, \mathbf{Y} + 1)\ \ d(\mathbf{X}'_2, \mathbf{Y} + 2)] = [1\ 1\ 0]$.

Since $GL = \mathrm{lcm}\{6, 3\} = 6$, we have six choices for $O_3$. Fig. 6 illustrates the relative positions of $\mathbf{X}'_1$, $\mathbf{X}'_2$, and $\mathbf{Y} + k$, $0 \le k \le 5$. For each choice, we need to compare and select the minimum between $d(\mathbf{X}'_1, \mathbf{Y} + k)$ and $d(\mathbf{X}'_2, \mathbf{Y} + k)$. Based on our algorithm, the scheduling matrix is of size $2 \times 6$ and is given by

$$\begin{bmatrix} 0 & 1 & 2 & 3 & 2 & 1 \\ 1 & 1 & 0 & 1 & 1 & 0 \end{bmatrix}.$$

Note that the second row is $[1\ 1\ 0]$ repeated two times. Given the scheduling matrix, we have $[d(\mathbf{X}, \mathbf{Y} + 0)\ \ d(\mathbf{X}, \mathbf{Y} + 1)\ \ldots\ d(\mathbf{X}, \mathbf{Y} + 5)] = [0\ 1\ 0\ 1\ 1\ 0]$. As a result, the value of $k^*$ can be selected as 1, 3, or 4. The column sums are 2, 4, and 3 for columns 1, 3, and 4, respectively. Therefore, $k^*$ is selected as 3.
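The scheduling-matrix construction of this section can be sketched as follows, using Example 1 as input. The helper names `dist_row` and `schedule` and the `(period, offset)` representation of existing TSs are assumptions made for this illustration. Each row is computed by direct evaluation of (18) over one repetition period rather than by the comparison-only procedure of Section 3.1, and a remaining tie after the column-sum rule is resolved by taking the smallest index.

```python
from functools import reduce
from math import gcd


def lcm_all(values):
    """Least common multiple of an iterable of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)


def dist_row(p, offset, q):
    """[d(X'_i, Y + k) for k = 0..G_i - 1], with X'_i = {p*m + offset}."""
    G = gcd(p, q)
    row = []
    for k in range(G):
        best = p
        for m in range(p // G):               # L_i / q instants of Y + k
            r = (q * m + k - offset) % p      # position relative to X'_i
            best = min(best, r, p - r)
        row.append(best)
    return row


def schedule(existing, q):
    """existing: list of (p_i, O_i). Returns (k_star, column_minima, matrix)."""
    rows = [dist_row(p, o, q) for p, o in existing]
    GL = lcm_all(len(r) for r in rows)
    matrix = [r * (GL // len(r)) for r in rows]   # repeat each row GL/G_i times
    col_min = [min(col) for col in zip(*matrix)]
    col_sum = [sum(col) for col in zip(*matrix)]
    # Maximize the column minimum; break ties by the larger column sum.
    k_star = max(range(GL), key=lambda k: (col_min[k], col_sum[k]))
    return k_star, col_min, matrix


k_star, col_min, matrix = schedule([(12, 0), (15, 2)], 18)
print(matrix)
print(col_min, k_star)
```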

4 HANDLING OF MULTIPLE TRAFFIC CLASSES

4.1 Class-Based Scheduling

In real applications, it is likely that there are only a few possible periods to schedule TSs. Therefore, one can partition TSs into classes such that two TSs are in the same class if and only if they have identical periods of schedule. Assume that there are $C$ classes, called Class 1, Class 2, ..., and Class $C$. Let $p_i$ represent the period of Class $i$ and $n_i$ the number of TSs in Class $i$. If $n_i = 0$, then Class $i$ is considered not to exist. For ease of description, we assume that $n_i > 0$ for all $i$, $1 \le i \le C$.

Consider Class $i$ and let $e_1, e_2, \ldots,$ and $e_{n_i}$ be the TSs in the class. A TS, say, $e_1$, is selected as the representative of Class $i$. Let $\mathbf{X}_i = \{x_{i,m}\}_{m=-\infty}^{\infty}$ be a sequence of period $p_i$ such that $x_{i,m} = p_i \cdot m$ for all $m$. We shall use the representative TS as reference within the class and, therefore, assign $\mathbf{X}_i$ as the scheduled instants of TS $e_1$. Let $o_{i,s}$ be the intraclass offset of TS $e_s$ with respect to TS $e_1$. As a result, the scheduled instants for TS $e_s$ are $\mathbf{X}_i + o_{i,s}$. Let

$$\bar{\mathbf{X}}_{i,s} = \mathbf{X}_i + o_{i,s}, \tag{21}$$

and

$$\bar{\mathbf{X}}_i = \bigcup_{1 \le s \le n_i} \bar{\mathbf{X}}_{i,s}. \tag{22}$$

Assume that a new TS of Class $j$ is to be scheduled. The impact of $\bar{\mathbf{X}}_i$ to the new TS can be analyzed as follows. Let $\mathbf{Y} = \{y_m\}_{m=-\infty}^{\infty}$ be a periodic sequence of period $p_j$ with $y_m = p_j \cdot m$ for all $m$. According to the results presented in the previous section, we need to construct a scheduling matrix of size $n_i \times G_{i,j}$, where $G_{i,j} = \gcd(p_i, p_j)$. The $s$th row of the scheduling matrix is $[d(\mathbf{X}_i, \mathbf{Y} + 0)\ \ d(\mathbf{X}_i, \mathbf{Y} + 1)\ \ldots\ d(\mathbf{X}_i, \mathbf{Y} + G_{i,j} - 1)]$ circularly shifted to the right by $o_{i,s}$ positions. By taking the minimum element in each column, we obtain

$$R(\bar{\mathbf{X}}_i, \mathbf{Y}) = [d(\bar{\mathbf{X}}_i, \mathbf{Y} + 0)\ \ d(\bar{\mathbf{X}}_i, \mathbf{Y} + 1)\ \ldots\ d(\bar{\mathbf{X}}_i, \mathbf{Y} + G_{i,j} - 1)]. \tag{23}$$

Note that the optimal SST of the new TS cannot be determined solely by $R(\bar{\mathbf{X}}_i, \mathbf{Y})$ because there are still TSs in other classes.

Assume that $R(\bar{\mathbf{X}}_i, \mathbf{Y})$, $1 \le i \le C$, are obtained. We shall use the representative TS of Class 1 as reference of the overall system. Let

$$\bar{\mathbf{X}} = \bigcup_{1 \le i \le C} \bar{\mathbf{X}}_i. \tag{24}$$

Further, let $O_i$, $1 \le i \le C$, be the interclass offset of the Class $i$ representative TS with respect to the Class 1 representative TS. To determine the optimal SST of the new TS, we construct a global scheduling matrix $M$ of size $C \times GL_j$, where

$$GL_j = \mathrm{lcm}\{G_{1,j}, G_{2,j}, \ldots, G_{C,j}\}. \tag{25}$$

The $i$th row of $M$ is $R(\bar{\mathbf{X}}_i, \mathbf{Y})$ repeated $GL_j / G_{i,j}$ times and then circularly shifted to the right by $O_i$ positions. By taking the minimum element of each column, we obtain

$$R(\bar{\mathbf{X}}, \mathbf{Y}) = [d(\bar{\mathbf{X}}, \mathbf{Y} + 0)\ \ d(\bar{\mathbf{X}}, \mathbf{Y} + 1)\ \ldots\ d(\bar{\mathbf{X}}, \mathbf{Y} + GL_j - 1)]. \tag{26}$$

Finally, the optimal scheduled instants of the new TS are given by $\mathbf{Y} + k^*$, where $k^*$ satisfies

$$k^* = \arg\max_{0 \le k \le GL_j - 1} \{ d(\bar{\mathbf{X}}, \mathbf{Y} + k) \}. \tag{27}$$

If the new TS is considered as the $(n_j + 1)$th TS of Class $j$, then we update the intraclass offset

$$o_{j, n_j + 1} = k^* - O_j. \tag{28}$$

4.2 Suggested Implementation Method

To reduce online scheduling complexity, we allocate $C$ pairs of arrays for each class. Again, consider Class $i$. Denote the $k$th element of the $j$th pair of arrays by $A_{i,j}[k]$ and $B_{i,j}[k]$, $0 \le k \le G_{i,j} - 1$. The array $A_{i,j}$ stores $[d(\mathbf{X}_i, \mathbf{Y} + 0)\ \ d(\mathbf{X}_i, \mathbf{Y} + 1)\ \ldots\ d(\mathbf{X}_i, \mathbf{Y} + G_{i,j} - 1)]$ for $\mathbf{Y} = \{y_m\}_{m=-\infty}^{\infty}$ with $y_m = p_j \cdot m$ for all $m$. Note that $A_{i,j}$ represents the impact of the representative TS $e_1$ to a new TS of Class $j$. The array $B_{i,j}$ stores $R(\bar{\mathbf{X}}_i, \mathbf{Y})$ and represents the impact of all TSs in Class $i$ to a new TS of Class $j$. When a new TS of Class $j$ is to be scheduled, the arrays $B_{i,j}$, $1 \le i \le C$, are used to construct the global scheduling matrix $M$ as illustrated in Fig. 7. All we need to do is take the minimum of each column and then find the maximum among the minima. The complexity is only $C \times GL_j$ comparisons. After the new TS is scheduled, we need to update $B_{j,l}[k]$, $1 \le l \le C$ and $0 \le k \le G_{j,l} - 1$, as

$$B_{j,l}[k] = \min\{ B_{j,l}[k],\ A_{j,l}[k - o_{j, n_j + 1}] \}, \tag{29}$$

because the impact of TSs in Class $j$ to a new TS of every class is changed. Here, the index $k - o_{j, n_j + 1}$ is computed modulo $G_{j,l}$. The complexity of the update process is $\sum_{l=1}^{C} G_{j,l}$ comparisons, which is upper bounded by $C \times p_j$ comparisons.

Fig. 6. The six choices for $O_3$ in Example 1.

When a TS in Class $i$ finishes, we need to update $B_{i,j}[k]$, $1 \le j \le C$ and $0 \le k \le G_{i,j} - 1$, as follows: remove the finished TS so that the updated $n_i$, denoted by $n'_i$, becomes $n_i - 1$. Construct a scheduling matrix of size $n'_i \times G_{i,j}$ such that the $m$th row is $A_{i,j}$ circularly shifted to the right by $o_{i,m}$ positions. The content of $B_{i,j}[k]$ is updated as the minimum of the $k$th column. Of course, a new representative TS is selected before the update process if the finished one was originally the representative TS of Class $i$. The complexity of the update process is $n'_i \sum_{j=1}^{C} G_{i,j}$ comparisons.

Again, we break a tie based on the column sum of the global scheduling matrix, which requires $2(C - 1)$ additions and one comparison. Note that one can precompute $A_{i,j}[k]$, $1 \le i, j \le C$ and $0 \le k \le G_{i,j} - 1$. The initial content of $B_{i,j}[k]$ is set to a sufficiently large value for all $j$ and $k$ if there is no Class $i$ TS, i.e., $n_i = 0$.

Example 2. Assume that $C = 2$, $n_1 = 2$, $n_2 = 1$, $p_1 = 6$, and $p_2 = 9$, and a new TS of Class 2 is to be scheduled. Based on the assumptions, we have $G_{1,1} = 6$, $G_{1,2} = 3$, $G_{2,1} = 3$, and $G_{2,2} = 9$. Besides, the contents of $A_{i,j}[k]$ are given by $A_{1,1} = [0\ 1\ 2\ 3\ 2\ 1]$, $A_{1,2} = [0\ 1\ 1]$ and $A_{2,1} = [0\ 1\ 1]$, $A_{2,2} = [0\ 1\ 2\ 3\ 4\ 4\ 3\ 2\ 1]$. Let $e_1$ and $e_2$ represent the TSs in Class 1 with $e_1$ being the representative. Also, let $f_1$ be the representative TS in Class 2. Assume that TS $e_1$ was scheduled first, followed by TS $f_1$, and then TS $e_2$. As a result, when the new TS $f_2$ is to be scheduled, we have $O_2 = 1$ (which is randomly selected from 1 and 2) and $o_{1,2} = 3$ (which is randomly selected among 2, 3, and 5). At this moment, the contents of $B_{i,j}[k]$ are given by $B_{1,1} = [0\ 1\ 1\ 0\ 1\ 1]$, $B_{1,2} = [0\ 1\ 1]$; and $B_{2,1} = [0\ 1\ 1]$, $B_{2,2} = [0\ 1\ 2\ 3\ 4\ 4\ 3\ 2\ 1]$. The global scheduling matrix is

$$M = \begin{bmatrix} 0 & 1 & 1 & 0 & 1 & 1 & 0 & 1 & 1 \\ 1 & 0 & 1 & 2 & 3 & 4 & 4 & 3 & 2 \end{bmatrix}.$$

Therefore, the value of $k^*$ can be chosen as 2, 4, 5, 7, or 8. We select $k^* = 5$ because it has the maximum column sum. Then we compute $o_{2,2} = k^* - O_2 = 4$ and update $B_{2,1} = [0\ 0\ 1]$; $B_{2,2} = [0\ 1\ 2\ 1\ 0\ 1\ 2\ 2\ 1]$.
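The array bookkeeping of Section 4.2 can be traced with a short Python sketch that replays Example 2. The function names (`base_row`, `shift_right`, `pick_offset`, `update_B`) and the nested-list layout of the $A$ and $B$ arrays are assumptions made for this illustration; a remaining tie after the column-sum rule is resolved by taking the smallest index.

```python
from functools import reduce
from math import gcd


def lcm_all(values):
    """Least common multiple of an iterable of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)


def base_row(p_i, p_j):
    """A[i][j]: d(X_i, Y + k) for k = 0..G_ij - 1, Y of period p_j."""
    G = gcd(p_i, p_j)
    row = []
    for k in range(G):
        residues = [(p_j * m + k) % p_i for m in range(p_i // G)]
        row.append(min(min(r, p_i - r) for r in residues))
    return row


def shift_right(row, s):
    """Circular shift to the right by s positions."""
    n = len(row)
    return [row[(k - s) % n] for k in range(n)]


def pick_offset(B_col, O):
    """Build the global matrix M from B[i][j] (fixed j) and offsets O[i],
    then maximize the column minimum, breaking ties by column sum."""
    GL = lcm_all(len(b) for b in B_col)
    M = [shift_right(b * (GL // len(b)), o) for b, o in zip(B_col, O)]
    mins = [min(c) for c in zip(*M)]
    sums = [sum(c) for c in zip(*M)]
    k_star = max(range(GL), key=lambda k: (mins[k], sums[k]))
    return k_star, M


def update_B(B_row, A_row, o_new):
    """Eq. (29) applied for every class l after a TS with intraclass
    offset o_new joins: B[l][k] = min(B[l][k], A[l][k - o_new])."""
    return [[min(b, a) for b, a in zip(Bl, shift_right(Al, o_new))]
            for Bl, Al in zip(B_row, A_row)]


periods = [6, 9]
A = [[base_row(pi, pj) for pj in periods] for pi in periods]
B1 = update_B(A[0], A[0], 3)      # Class 1 holds e1 (offset 0) and e2 (offset 3)
B2 = A[1]                         # Class 2 holds only its representative f1
O = [0, 1]                        # interclass offsets O_1, O_2
k_star, M = pick_offset([B1[1], B2[1]], O)   # new TS f2 belongs to Class 2
print(M, k_star)
print(update_B(B2, A[1], k_star - O[1]))     # updated B_{2,1} and B_{2,2}
```

Only `pick_offset` and `update_B` run when a TS joins; the `A` arrays depend solely on the class periods and can be prepared offline, as the text notes.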

5 PERFORMANCE EVALUATION

The considered scenario for our simulations is composed of periodic beacons and five classes of traffic. The five classes of traffic in the system are bidirectional real-time voice, real-time video, streaming audio, streaming video, and gaming. The traffic characteristics, listed in Table 1, are obtained from [4], [6], and [8]. The video traces are available online [13]. It is assumed that there are $K$ STAs, each configured with a scheduled TS belonging to one of the five classes. The number of STAs is increased in multiples of five to maintain the same number of TSs in each traffic class. The system parameters conform to the Orthogonal Frequency Division Multiplexing (OFDM) PHY specification [16] and the calculation of frame transmission time can be found in [9]. The repetition period $L$ equals $\mathrm{lcm}\{100, 40, 60, 150, 300\} = 600$ (ms) when all traffic classes, including the beacons, exist in the system. The chosen maximum precision in our simulations is 1s.

5.1 Comparison of Computational Complexity

The OAS-APSD algorithm needs to insert the already scheduled events within $L$ into the $List_{SE}$ and sort the elements. The complexity of sorting is $\log_2 K \cdot \sum_{i=1}^{K} L/p_i$ comparisons if Merge Sort [7] is used. Finding the two closest scheduled events for all the scheduled instants given a candidate SST, by the idea of Insertion Sort [7], requires $\sum_{i=1}^{K} L/p_i$ comparisons. Since there are $q$ candidates, the overall complexity is $q \cdot \sum_{i=1}^{K} L/p_i$ comparisons. Computation of relative distances for all the $q$ candidates takes $q \times 2 \times L/q = 2L$ subtractions. Finding the minimum relative distances for all the $q$ candidates requires $q \times 2 \times L/q = 2L$ comparisons. Finally, it takes $q$ comparisons to determine the optimal SST. Note that the tie-breaking can be realized by using the sum of relative distances, rather than the average distance. It takes $2L - q$ additions to obtain those sums of relative distances and each tie-breaking needs one comparison. Comparisons of online scheduling complexity are listed in Table 2.

In addition to complexity analysis, we also provide some numerical results. In the numerical evaluation, we increase the number of existing TSs in each class of application and check the average online complexity of the OAS-APSD algorithm and that of the proposed algorithms. Given the number of existing TSs, the average complexity for finding the SST is derived by averaging the number of necessary online operations when a new TS belonging to each class of application joins. Since the ListSE of OAS-APSD could be reused after it is established, the complexity of sorting is ignored here. The Low Complexity S-APSD (LCS-APSD) algorithm refers to the idea described in Section 3.2; however, to reduce complexity, the contents of the $d(X'_i, Y + k)$'s are reused for the TSs of the same class. Therefore, the complexity in preparing the scheduling matrix for K existing TSs is reduced from $4qK$ to $4qC$ comparisons. Our proposed algorithm using the suggested implementation method presented in Section 4.2 is referred to as the Class-based LCS-APSD (CLCS-APSD) algorithm. The number of required operations and complexity reduction ratios are shown in Fig. 8. Here, the complexity reduction ratios are defined as

$$(N_{\text{OAS-APSD}} - N_{\text{LCS-APSD}}) / N_{\text{OAS-APSD}}, \qquad (30)$$

and

$$(N_{\text{OAS-APSD}} - N_{\text{CLCS-APSD}}) / N_{\text{OAS-APSD}}, \qquad (31)$$

where $N_{\text{OAS-APSD}}$ is the number of operations (comparisons and subtractions) required by the OAS-APSD algorithm, and $N_{\text{LCS-APSD}}$ and $N_{\text{CLCS-APSD}}$ are those required by LCS-APSD and CLCS-APSD, respectively. As can be seen, the average reduction ratio is as high as about 82 percent for LCS-APSD and 98 percent for CLCS-APSD when there are 50 TSs in the system.
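The reduction ratios in (30) and (31) share one form, sketched below in Python. The function name `reduction_ratio` and the operation counts in the example are made up for illustration and are not simulation results.

```python
def reduction_ratio(n_baseline, n_proposed):
    """Fraction of operations saved relative to the baseline algorithm.

    n_baseline -- operations required by the baseline (e.g., OAS-APSD)
    n_proposed -- operations required by the compared algorithm
    """
    if n_baseline <= 0:
        raise ValueError("baseline operation count must be positive")
    return (n_baseline - n_proposed) / n_baseline


# Hypothetical counts, chosen only to show the arithmetic.
print(reduction_ratio(5000, 1250))  # 0.75
```

A ratio close to 1 means the compared algorithm needs only a small fraction of the baseline's online operations.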

TABLE 1
Traffic Characteristics

TABLE 2
Online Complexity Comparisons (In Comparisons/Subtractions/Additions).

Fig. 8. The performance of complexity reduction.

5.2 Comparison of Energy Consumptions

In this evaluation, we fix the number of existing TSs at 50 (10 for each class) and use OAS-APSD and our proposed algorithms to schedule those TSs. In our simulations, we consider power saving for TSs that require QoS support, and HCCA is chosen as the access policy. As a consequence, the QAP is responsible for coordinating channel access and no collision can happen. We assume that the $(5i + j)$th TS belongs to Class $j$ for $0 \le i \le 9$ and $1 \le j \le 5$. The simulation models 600 seconds of real time. An STA draws 1.4 W in the Awake state but only 0.045 W in the Doze state [14]. A switchover between the two states takes about 250 μs [10] and consumes the same power as the Awake state. The system parameters conform to IEEE 802.11a and are available in [12]. The PHY data rate is 24 Mbps while the PHY control rate is 6 Mbps. In addition to OAS-APSD and LCS-APSD/CLCS-APSD, we also simulate the Random algorithm, which selects the SST for the newly joined TS uniformly at random over $[0, SI_{new} - 1]$. An ideal case that assumes no SP overlapping is also presented as a reference.
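The class assignment and the Random baseline described above can be sketched in a few lines of Python. The function names `class_of_ts` and `random_sst`, and the treatment of the SST as an integer number of precision units, are assumptions made for illustration.

```python
import random


def class_of_ts(n):
    """Class j (1..5) of the n-th TS, where n = 5*i + j and 1 <= j <= 5."""
    if n < 1:
        raise ValueError("TS indices start at 1")
    return (n - 1) % 5 + 1


def random_sst(si_new, rng=random):
    """Pick an SST uniformly from the integers 0 .. si_new - 1.

    si_new is the service interval of the new TS, expressed in
    precision units (an assumption about the simulation granularity).
    """
    return rng.randrange(si_new)


# TS 1..5 map to classes 1..5, TS 6 wraps around to class 1, and so on.
print([class_of_ts(n) for n in (1, 5, 6, 50)])  # [1, 5, 1, 5]

# A seeded generator makes a run of the Random baseline repeatable.
rng = random.Random(0)
sst = random_sst(100, rng)
print(0 <= sst <= 99)  # True
```

Repeating the Random baseline with different seeds gives the spread summarized by its minimum, average, and maximum energy consumption.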

Comparison of total energy consumption of the 50 STAs for the investigated schemes is provided in Fig. 9. In this figure, Rand-Min, Rand-Avg, and Rand-Max are, respectively, the minimum, average, and maximum energy consumptions of the Random algorithm over 500 simulations. As revealed in Fig. 9, OAS-APSD performs slightly better than LCS-APSD/CLCS-APSD, which in turn consumes slightly less energy than Rand-Min. The reason OAS-APSD performs slightly better than the proposed LCS-APSD/CLCS-APSD is that it adopts a more complicated tie-breaking scheme based on average distances. However, the difference is not significant. As for the Random algorithm, its performance varies from run to run. In our simulations, the proposed LCS-APSD/CLCS-APSD algorithms consume about 9 and 26 percent less energy than the average and maximum energy consumptions of the Random algorithm, respectively. Because of the low online complexity, we believe it is worthwhile to use the proposed CLCS-APSD algorithm for energy saving.

The energy consumption of an STA depends on the time spent for data delivery (including interframe spaces and acknowledgments), the waiting time during which the STA has to stay awake before transmission, and the number of switchovers during the simulation. The waiting time of an STA in a given SP starts from its scheduled wake-up time and covers the duration during which it cannot access the medium because of the transmissions of previous STAs. In our simulations, we give higher channel access priorities to the TSs/STAs that are scheduled earlier, i.e., the STAs with smaller indices. Therefore, the later-order STAs tend to wait longer than the earlier ones. The average waiting time and energy consumption of different STAs for the proposed LCS-APSD/CLCS-APSD are shown in Figs. 10 and 11, respectively.
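The three contributions to an STA's energy consumption can be combined in a simple model, sketched below in Python. This is not the simulator used for the reported results. The function name `sta_energy_joules`, the reading of the switchover time as 250 microseconds, and the assumption of two switchovers per SP in the example are all illustrative assumptions.

```python
P_AWAKE_W = 1.4      # power drawn in the Awake state (W)
P_DOZE_W = 0.045     # power drawn in the Doze state (W)
T_SWITCH_S = 250e-6  # duration of one switchover (s), assumed 250 microseconds


def sta_energy_joules(total_s, delivery_s, waiting_s, n_switchovers):
    """Energy used by one STA over a run of total_s seconds.

    Delivery, waiting, and switchovers are all charged at Awake power;
    the remaining time is charged at Doze power.
    """
    awake_s = delivery_s + waiting_s + n_switchovers * T_SWITCH_S
    doze_s = total_s - awake_s
    if doze_s < 0:
        raise ValueError("awake time exceeds the total simulated time")
    return P_AWAKE_W * awake_s + P_DOZE_W * doze_s


# Hypothetical TS: one 0.5 ms SP every 100 ms for 600 s, no waiting,
# and two switchovers per SP (wake up, then return to Doze).
n_sps = 600_000 // 100
energy = sta_energy_joules(
    total_s=600.0,
    delivery_s=n_sps * 0.5e-3,
    waiting_s=0.0,
    n_switchovers=2 * n_sps,
)
print(round(energy, 2))
```

In this model, any waiting caused by overlapping SPs is charged at the full Awake power, which is why reducing overlap saves energy.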

Fig. 9. Comparison of energy consumptions.

Fig. 10. Average waiting time among STAs.

Fig. 11. Energy consumptions among STAs.

To explain the results shown in Figs. 10 and 11, the following statistics are helpful. In our simulations, the ideal average SPs for delivering these five classes of applications are 0.5, 0.22, 1.52, 1, and 2.39 ms, while the SIs are 100, 40, 60, 150, and 300 ms, respectively. Fig. 10 shows the waiting time of STAs. In general, the waiting time increases as the STA index increases. However, since TSs are added one by one, scheduling of their SSTs may slightly affect the results. According to the results shown in Fig. 11, real-time video consumes the most energy among all classes because it requires a large number of switchovers and a long time for delivering data. As for streaming video, although it also takes a long time to deliver data, it needs the fewest switchovers among the five classes of applications due to its long SI. Consequently, its energy consumption is moderate. Define the duty cycle of a TS as the ratio of its average SP to its SI. One can easily compute the duty cycles for the five classes of applications as 0.005, 0.006, 0.025, 0.007, and 0.008, respectively. In general, a larger duty cycle implies more energy consumption. The exception in our simulations is that real-time voice consumes more energy than real-time gaming, streaming audio, and streaming video. The reason is that real-time voice requires a large number of switchovers.
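The duty cycles quoted above follow directly from the listed SPs and SIs, as the following Python sketch shows. It uses decimal arithmetic so that rounding to three places is exact, and it also counts the SPs of each class in a 600-second run as a rough indicator of the number of switchovers. The variable names are illustrative, and the values are kept in the order given in the text without naming the classes.

```python
from decimal import Decimal, ROUND_HALF_UP

# Ideal average SPs and SIs in milliseconds, in the order given in the text.
sp_ms = ["0.5", "0.22", "1.52", "1", "2.39"]
si_ms = [100, 40, 60, 150, 300]

# Duty cycle = average SP / SI, rounded to three decimal places.
duty_cycles = [
    (Decimal(sp) / Decimal(si)).quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)
    for sp, si in zip(sp_ms, si_ms)
]

# Number of SPs in a 600 s run; each SP implies waking up and dozing again.
sps_per_run = [600_000 // si for si in si_ms]

print([str(d) for d in duty_cycles])  # ['0.005', '0.006', '0.025', '0.007', '0.008']
print(sps_per_run)                    # [6000, 15000, 10000, 4000, 2000]
```

The class with the 40 ms SI has a small duty cycle but the largest number of SPs, which is consistent with the observation that the number of switchovers can outweigh the duty cycle.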

6 CONCLUSION

Compared with PSM, the S-APSD scheme defined in IEEE 802.11e provides a better mechanism to increase power saving performance when delivering QoS-sensitive traffic. In this paper, we focus on designing a feasible noncontiguous scheduling algorithm to be used for S-APSD. Our design takes advantage of the periodicity of the service schedule to largely reduce online computational complexity. We also present an efficient implementation method for class-based systems. As demonstrated in the performance comparison, the online computational complexity of our proposed algorithms is much smaller than that of previous related work, with comparable energy consumption performance. Some interesting and challenging further research topics, such as efficient rearrangement of the existing schedule when a new TS is to be added, are currently under investigation.

REFERENCES

[1] IEEE 802.11 WG: IEEE Standard 802.11-1999, Part 11: Wireless LAN MAC and PHY Layer Specifications, ISO/IEC 8802-11:1999(E), IEEE, 1999.

[2] IEEE Std 802.11e-2005, Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications Amendment 8: Medium Access Control (MAC) Quality of Service Enhancements, IEEE, 2005.

[3] S. Mangold, S. Choi, G.R. Hiertz, O. Klein, and B. Walke, “Analysis of IEEE 802.11e for QoS Support in Wireless LANs,” IEEE Wireless Comm. Magazine, vol. 10, no. 6, pp. 40-50, Dec. 2003.

[4] X. Pérez-Costa, D. Camps-Mur, J. Palau, D. Rebolleda, and S. Akbarzadeh, “Overlapping Aware Scheduled Automatic Power Save Delivery Algorithm,” Proc. European Wireless Conf. (EW), Apr. 2007.

[5] X. Pérez-Costa, D. Camps-Mur, and T. Sashihara, “Analysis of the Integration of IEEE 802.11e Capabilities in Battery Limited Mobile Devices,” IEEE Wireless Comm. Magazine, vol. 12, no. 6, pp. 26-32, Dec. 2005.

[6] Q. Zhao and D.H.K. Tsang, “An Equal-Spacing-Based Design for QoS Guarantee in IEEE 802.11e HCCA Wireless Networks,” IEEE Trans. Mobile Computing, vol. 7, no. 12, pp. 1474-1490, Dec. 2008.

[7] T.H. Cormen, C.R. Leiserson, R.L. Rivest, and C. Stein, Introduction to Algorithms, second ed. MIT, 2009.

[8] F. Fitzek, A. Koepsel, A. Wolisz, M. Krishnam, and M. Reisslein, “Providing Application-Level QoS in 3G/4G Wireless Systems: A Comprehensive Framework Based on Multi-Rate CDMA,” IEEE Wireless Comm. Magazine, vol. 9, no. 2, pp. 42-47, Apr. 2002.

[9] Y. Xiao and J. Rosdahl, “Throughput and Delay Limits of IEEE 802.11,” IEEE Comm. Letters, vol. 6, no. 8, pp. 355-357, Aug. 2002.

[10] J.-R. Hsieh, T.-H. Lee, and Y.-W. Kuo, “Energy-Efficient Multi-Polling Scheme for Wireless LANs,” IEEE Trans. Wireless Comm., vol. 8, no. 3, pp. 1532-1541, Mar. 2009.

[11] C.L. Liu and J.W. Layland, “Scheduling Algorithms for Multi-Programming in a Hard-Real-Time Environment,” J. ACM, vol. 20, no. 1, pp. 46-61, 1973.

[12] M.S. Gast, 802.11 Wireless Networks: The Definitive Guide. O’Reilly, 2002.

[13] F.H.P. Fitzek and M. Reisslein, “MPEG-4 and H.263 Video Traces for Network Performance Evaluation,” IEEE Network, vol. 15, no. 6, pp. 40-54, http://www-tkn.ee.tu-berlin.de/research/trace/trace.html, Dec. 2001.

[14] E.-S. Jung and N.H. Vaidya, “An Energy Efficient MAC Protocol for Wireless LANs,” Proc. IEEE INFOCOM, vol. 3, pp. 1756-1764, June 2002.

[15] A. Grilo, M. Macedo, and M. Nunes, “A Scheduling Algorithm for QoS Support in IEEE 802.11e Networks,” IEEE Wireless Comm. Magazine, vol. 10, no. 3, pp. 36-43, June 2003.

[16] IEEE Standard for Information Technology - Telecomm. and Information Exchange Between Systems - Local and Metropolitan Area Networks - Specific Requirements - Part 11: Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications, IEEE Std 802.11-2007, IEEE, 2007.

[17] IEEE 802.11 Working Group, http://www.ieee802.org/11, 2012.

Tsern-Huei Lee received the BS degree from National Taiwan University, Taipei, ROC, the MS degree from the University of California, Santa Barbara, and the PhD degree from the University of Southern California, Los Angeles, in 1981, 1984, and 1987, respectively, all in electrical engineering. Since 1987, he has been a member of the Faculty of National Chiao Tung University, Hsinchu, Taiwan, where he is a professor in the Department of Communications Engineering and a member of the Center for Telecommunications Research. He serves as a consultant of various research institutes and local companies. His current research interests include network security, broadband switching systems, network traffic management, and wireless communications. He received an outstanding paper award from the Institute of Chinese Engineers in 1991. He is a senior member of the IEEE.

Jing-Rong Hsieh received the BS, MS, and PhD degrees from National Chiao Tung University, Hsinchu, Taiwan, ROC, in 2003, 2005, and 2010, respectively, all in communications engineering. Since November 2010, he has been a senior engineer in protocol standardization at the HTC Corporation in Taipei, Taiwan. His current research interests include power management and quality of service issues of wireless local area networks and wireless cellular networks. He is a member of the IEEE.
