Resource Allocation Achieving High System Throughput with QoS Support in OFDMA-Based System


Tsern-Huei Lee, Senior Member, IEEE, and Yu-Wen Huang, Student Member, IEEE

Abstract—In this paper, we present a resource allocation

algorithm for OFDMA-based systems which handles both real-time and non-real-time traffic. For real-time traffic, the QoS requirements are specified with delay bound and loss probability. The resource allocation problem is formulated as one which maximizes system throughput subject to the constraint that the bandwidth allocated to a flow is no less than its minimum requested bandwidth, a value computed based on the loss probability requirement and the running loss probability. A user-level proportional-loss scheduler is adopted to determine the resource share for flows attached to the same subscriber station (SS). In case the available resource is not sufficient to provide every flow its minimum requested bandwidth, we maximize the amount of real-time traffic transmitted subject to the constraint that the bandwidth allocated to an SS is no greater than the sum of minimum requested bandwidths of all flows attached to it. Moreover, a pre-processor is added to maximize the number of real-time flows attached to each SS that meet their QoS requirements. We show that, in any frame, the proposed proportional-loss scheduler guarantees QoS if there is any scheduler which guarantees QoS. Simulation results reveal that our proposed algorithm performs better than previous works.

Index Terms—OFDMA, QoS, delay bound, loss probability,

proportional-loss.

I. INTRODUCTION

RESOURCE allocation is an important component of OFDMA-based wireless systems, such as IEEE 802.16 [1] and the Long Term Evolution (LTE) [2], where channel access is partitioned into frames in the time domain and sub-channels in the frequency domain to achieve multi-user and frequency diversities. One obvious performance metric to evaluate resource allocation schemes is system throughput. A simple strategy to achieve high system throughput is to allocate more resources to users with better channel qualities. This strategy, unfortunately, may lead to starvation and cause QoS violation to real-time applications attached to users who have poor channel qualities. A well-designed resource allocation scheme should, therefore, take QoS support into consideration while maximizing system throughput.

Several previous works, e.g., [3], [4], adopted the concept of proportional fairness (PF) to eliminate starvation while maintaining acceptable system throughput. These schemes, although they achieve a kind of fairness among users, are not suitable for QoS support. In [5] and [6], the ideas of PF and static minimum bandwidth guarantee were combined to support multiple service classes. This enhanced algorithm, however, does not take delay bound and loss probability requirements of real-time flows into consideration and thus is unlikely to provide QoS support well.

Paper approved by A. MacKenzie, the Editor for Game Theory and Cognitive Networking of the IEEE Communications Society. Manuscript received October 15, 2010; revised May 9, 2011 and October 18, 2011.

The authors are with the Institute of Communication Engineering, National Chiao Tung University, Hsinchu, 30010 Taiwan (e-mail: {tlee, vi-cent}@banyan.cm.nctu.edu.tw).

Digital Object Identifier 10.1109/TCOMM.2012.020912.100632

In [7], a power and sub-carrier allocation policy was proposed for system throughput optimization with the constraint that the average delay of each traffic flow is controlled to be lower than its pre-defined level. Guaranteeing average delay, however, is in general not sufficient for real-time applications. The results presented in [8] reveal that dynamic power allocation can only give a small improvement over fixed power allocation with an effective adaptive modulation and coding (AMC) scheme. As a result, to reduce the complexity, it is reasonable to design resource allocation schemes under the assumption that equal power is allocated to each sub-channel. Some resource allocation algorithms were proposed, assuming equal-power allocation, to assign a user a higher priority for channel access if the deadline of its head-of-line (HOL) packet is earlier [9]-[12]. A simple scheme, called modified largest weighted delay first (M-LWDF), which uses a kind of utility function that is sensitive to loss probability and delay bound requirements as well as delay of HOL packets, was presented in [10]. Obviously, considering only the deadlines of HOL packets is not optimal. A QoS scheduling and resource allocation algorithm which considers deadlines of all packets was presented in [13]. This scheme requires high computational complexity and thus may not be practical for real systems. To reduce computational complexity, a matrix-based scheduling algorithm was proposed in [4]-[6]. The M-LWDF, the scheme proposed in [13], and the matrix-based scheduling algorithm are related to our work and will be reviewed in Section III.

The purpose of this paper is to present a resource allocation algorithm which tries to maximize system throughput with QoS support for real-time traffic flows. Our contributions include: 1) define and derive the minimum requested bandwidth of each real-time flow based on the loss probability requirement and the running loss probability, 2) formulate the resource allocation problem as one which maximizes system throughput subject to the constraint that the bandwidth allocated to a flow is greater than or equal to its minimum requested value, 3) propose a user-level proportional-loss (PL) scheduler for multiple real-time traffic flows attached to the same subscriber station (SS) to share the allocated resource, and 4) modify the resource allocation problem to maximize the amount of real-time traffic transmitted and add a pre-processor in front of the PL scheduler to maximize the number of real-time flows attached to each SS that meet their QoS requirements, when the available resource is not sufficient to provide each flow its minimum requested bandwidth. We show that, in any frame, the proposed PL scheduler guarantees QoS if there is any scheduler which guarantees QoS. Simulation results reveal that our proposed algorithm performs better than previous works.

The rest of this paper is organized as follows. In Section II, we describe the investigated system model. Related works are reviewed in Section III. Section IV contains our proposed scheme. Simulation results are presented in Section V. Finally, we draw conclusions in Section VI.

II. SYSTEM MODEL

We consider a single-cell OFDMA-based system which consists of one base station (BS) and multiple users or subscriber stations (SSs). Time is divided into frames, and the duration of a frame is equal to $T_{\mathrm{frame}}$. In a frame, there are $M$ sub-channels and $S$ time slots. We assume that the sub-channel statuses of different SSs are independent. Moreover, for a given SS, its statuses on the $M$ sub-channels are also independent. The channel quality for a given SS on a specific sub-channel is fixed during one frame. Transmission power is equally allocated to each sub-channel. To improve the reliable transmission rate, an effective AMC scheme is adopted to choose a transmission mode based on the reported signal-to-noise ratio (SNR). We only consider downlink transmission.

For ease of description, we assume that no SS is attached with both real-time and non-real-time traffic flows. Let $\Gamma_{RT}$ and $\Gamma_{NRT}$ represent, respectively, the sets of SSs that are attached with real-time and non-real-time traffic flows. Further, let $\Gamma = \Gamma_{RT} \cup \Gamma_{NRT}$. We shall use $K_n$ to denote the number of traffic flows attached to SS $n$. All non-real-time flows attached to the same SS are aggregated into one so that $K_n = 1$ if SS $n \in \Gamma_{NRT}$. The QoS requirements of real-time traffic flows are specified by delay bound and loss probability. The $k$th flow attached to SS $n$ is denoted by $f_{n,k}$. If SS $n \in \Gamma_{RT}$, then the delay bound and loss probability requirements of $f_{n,k}$ are represented by $D_{n,k} \cdot T_{\mathrm{frame}}$ and $P_{n,k}$, respectively. Data are assumed to arrive at the beginning of frames.

In the BS, a separate queue is maintained for each real-time traffic flow while non-real-time data are stored per SS. Assume that SS $n \in \Gamma_{RT}$. The data of flow $f_{n,k}$ are buffered in $Queue_{n,k}$, which can be partitioned into $D_{n,k}$ disjoint virtual sub-queues, denoted by $Queue^d_{n,k}$, $1 \le d \le D_{n,k}$, where $Queue^d_{n,k}$ contains the data in $Queue_{n,k}$ that can be buffered up to $d \cdot T_{\mathrm{frame}}$ without violating their delay bounds. We shall use $Q^d_{n,k}[t]$ to represent the size of $Queue^d_{n,k}$ at the beginning of the $t$th frame (including the newly arrived data), $Q_{n,k}[t] = \sum_{d=1}^{D_{n,k}} Q^d_{n,k}[t]$, and $Q_n[t] = \sum_{k=1}^{K_n} Q_{n,k}[t]$. Data which violate their delay bounds are dropped. It is assumed that the size of each queue is sufficiently large so that no data will be dropped due to buffer overflow. To simplify notation, the queue for storing data of SS $n \in \Gamma_{NRT}$ is denoted by $Queue_n$.
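As a rough illustration of the sub-queue bookkeeping, the following Python sketch advances the virtual sub-queues of one flow by a frame. It assumes, beyond what the text states explicitly, that data are served in order of urgency, that unserved data in the most urgent sub-queue are dropped at the end of the frame, and that new arrivals enter the least urgent sub-queue. The function name `advance_frame` and its parameters are hypothetical.

```python
from typing import List, Tuple


def advance_frame(sub_queues: List[float], served: float,
                  arrivals: float) -> Tuple[List[float], float]:
    """Advance one flow by a frame.

    sub_queues[d - 1] holds Q^d[t] for d = 1..D (index 0 is the most urgent).
    served is the amount of data transmitted in this frame.
    arrivals is the data arriving at the beginning of the next frame
    (assumed to carry the full delay bound of D frames).
    Returns (sub-queue sizes at the next frame, data lost in this frame).
    """
    remaining = list(sub_queues)
    budget = served
    # Assumption: the most urgent data are served first.
    for d in range(len(remaining)):
        used = min(budget, remaining[d])
        remaining[d] -= used
        budget -= used
    # Whatever is left in the most urgent sub-queue misses its deadline.
    lost = remaining[0]
    # Every other sub-queue moves one frame closer to its deadline.
    next_queues = remaining[1:] + [arrivals]
    return next_queues, lost
```

With this convention the data lost in a frame equals the unserved part of the most urgent sub-queue, and nothing is lost once the allocation covers that sub-queue.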

III. RELATED WORKS

In all the reviewed related works, resource allocation is performed at the beginning of each frame and, therefore, it suffices to consider one specific frame, say the $t$th frame. For SS $n$, we denote its maximum achievable transmission rate on the $m$th sub-channel in the $t$th frame and its long-term average throughput up to the $t$th frame by $r_{n,m}[t]$ and $r_n[t]$, respectively.

A. Scheme of [13]

In [13], resource allocation is formulated as an optimization problem which maximizes some utility function subject to QoS guarantee. It consists of two stages. In the first stage, resources are allocated to real-time traffic flows only. If there are un-allocated resources after the first stage, the second stage is performed to allocate the remaining resources to non-real-time traffic.

In the first stage, called real-time QoS scheduling, the minimum requested bandwidth of SS $n$, aggregated over its real-time traffic flows, is calculated by

$$R^{\min}_n = \sum_{k=1}^{K_n} \sum_{d=1}^{D_{n,k}} \frac{Q^d_{n,k}[t]}{d^{\beta}}.$$

Note that substituting $\beta$ with 0, 1, or $\infty$ corresponds, respectively, to strict priority [14], average QoS provisioning [15], or urgent [16] scheduling policy. With the assumption that a sub-channel is the smallest resource granularity, the first stage aims to minimize the total number of sub-channels used to serve the sum of calculated minimum requested bandwidths of all real-time flows. This problem can be modeled as maximum weighted bipartite matching (MWBM) and solved by the well-known Hungarian method, whose complexity is $O(M|\Gamma_{RT}|(\min(M, |\Gamma_{RT}|))^2)$ [17], where $|\Gamma_{RT}|$ is the size of $\Gamma_{RT}$.

In the second stage, the $m$th sub-channel, if still available, is allocated to the SS which satisfies $n^* = \arg\max_{n \in \Gamma_{NRT}} U'_n(r_n[t])\, r_{n,m}[t]$, where $U'_n(x)$, called the marginal utility function, is the first derivative of the utility function. For every SS, the utility function, defined by $\alpha$-proportional fairness [18], is given by

$$U_\alpha(x) = \begin{cases} (1-\alpha)^{-1} x^{1-\alpha}, & \text{if } \alpha \neq 1, \\ \log(x), & \text{otherwise,} \end{cases} \tag{1}$$

where $x$ represents the average throughput. Note that the policy corresponds to maximum throughput, proportional fairness, or max-min fairness if $\alpha$ is chosen to be 0, 1, or $\infty$, respectively.
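The utility function of equation (1) and its first derivative can be written in a few lines of Python. This is only a sketch: the function names are hypothetical, the logarithm is assumed to be natural, and the derivative $x^{-\alpha}$ follows from differentiating both branches of equation (1).

```python
import math


def alpha_fair_utility(x: float, alpha: float) -> float:
    """U_alpha(x) from equation (1); x is the average throughput (x > 0)."""
    if x <= 0:
        raise ValueError("average throughput must be positive")
    if alpha == 1:
        return math.log(x)
    return x ** (1.0 - alpha) / (1.0 - alpha)


def marginal_utility(x: float, alpha: float) -> float:
    """First derivative of U_alpha(x), equal to x ** (-alpha) for alpha >= 0."""
    if x <= 0:
        raise ValueError("average throughput must be positive")
    return x ** (-alpha)
```

For $\alpha = 0$ the marginal utility is constant, so the second-stage rule reduces to picking the SS with the best rate on the sub-channel; for $\alpha = 1$ it becomes the familiar ratio of instantaneous rate to average throughput.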

It was shown in [13] that the above scheme with $\beta = 1$ makes a reasonable trade-off between QoS support and maximization of system utility. However, it has some drawbacks. Firstly, assuming the granularity of resource to be sub-channels can result in waste of bandwidth. In current standards such as IEEE 802.16 and LTE, a sub-channel can be shared by multiple SSs. Secondly, although the number of sub-channels used to serve real-time traffic is minimized in the first stage, the remaining service capability for non-real-time traffic may not be maximized. This is because the qualities of the remaining sub-channels could be poor for SSs attached with non-real-time traffic flows. Thirdly, calculation of the minimum requested bandwidth for each real-time traffic flow does not take its loss probability requirement into consideration. Real-time traffic usually can tolerate data loss to a certain degree. System throughput can be improved significantly if one takes advantage of this feature in resource allocation. Finally, the complexity of the Hungarian method could make this scheme infeasible for a real system.

B. Matrix-based Scheduling Algorithm [4]

A matrix-based scheduling algorithm which tries to maximize the utility sum of all users with acceptable computational complexity was proposed in [4]. In this scheme, a matrix $U = [u_{n,m}]$ of dimension $|\Gamma| \times M$ is defined for resource allocation, where $u_{n,m} = r_{n,m}[t] / r_n[t]$ represents the marginal utility of user $n$ on sub-channel $m$. For sub-channel $m$, let $s_m$ represent the number of slots that have not been allocated and $x_{n,m}$ the number of slots allocated to SS $n$. Initially, we have $s_m = S$ and $x_{n,m} = 0$, $n \in \Gamma$, $1 \le m \le M$. The matrix-based scheduling algorithm consists of three steps:

1) Find an $(n^*, m^*)$ which satisfies $u_{n^*,m^*} = \max_{n \in \Gamma,\, 1 \le m \le M} \{u_{n,m}\}$.

2) Set $x_{n^*,m^*} = \min\left(s_{m^*}, \left\lceil Q_{n^*}[t] / r_{n^*,m^*}[t] \right\rceil\right)$ (allocate $\lceil Q_{n^*}[t] / r_{n^*,m^*}[t] \rceil$ or all the remaining slots of sub-channel $m^*$, whichever is smaller, to user $n^*$), $Q_{n^*}[t] = \max(0, Q_{n^*}[t] - x_{n^*,m^*} \cdot r_{n^*,m^*}[t])$ (update queue status of user $n^*$), and $s_{m^*} = s_{m^*} - x_{n^*,m^*}$ (update the remaining number of slots of sub-channel $m^*$). Replace the $(n^*)$th row of $U$ by an all-zero row if $Q_{n^*}[t] = 0$ (user $n^*$ does not need any more resource) and the $(m^*)$th column of $U$ by an all-zero column if $s_{m^*} = 0$ (all slots of sub-channel $m^*$ are allocated).

3) Update $r_{n^*}[t]$. If $Q_{n^*}[t] > 0$, then re-calculate $u_{n^*,m} = r_{n^*,m}[t] / r_{n^*}[t]$ for all $m \neq m^*$ (update the marginal utilities of user $n^*$ on various sub-channels before allocating the remaining resources).

The above three steps are repeatedly executed until all elements of $U$ are replaced with zeroes. The resulting values of $x_{n,m}$, $n \in \Gamma$, $1 \le m \le M$, are the solutions. Assuming that $M \ge |\Gamma|$, the computational complexity of the matrix-based scheduling algorithm in the worst case is $O(M^2|\Gamma| + |\Gamma|^2)$, which happens when $M - 1$ columns of $U$ are replaced by all-zero columns one by one, followed by replacing the rows by all-zero rows one by one. Its complexity is $O(|\Gamma|^2 M + M^2)$ if $M < |\Gamma|$.
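The first two steps of the matrix-based algorithm can be sketched in Python as follows. This is a simplified illustration rather than the implementation of [4]: the update of the average throughput in step 3 is left out because the text does not give its rule, so the marginal utilities stay fixed within the frame. The function name and argument layout are hypothetical, and all rates and average throughputs are assumed to be positive.

```python
import math
from typing import List


def matrix_based_schedule(rate: List[List[float]], avg_rate: List[float],
                          queue: List[float], slots_per_channel: int) -> List[List[int]]:
    """rate[n][m] = r_{n,m}[t], avg_rate[n] = r_n[t], queue[n] = Q_n[t].

    Returns x[n][m], the number of slots of sub-channel m given to user n.
    """
    n_users, n_channels = len(rate), len(rate[0])
    q = [float(v) for v in queue]
    s = [slots_per_channel] * n_channels
    x = [[0] * n_channels for _ in range(n_users)]
    # Marginal utility matrix U; users with empty queues start with a zero row.
    u = [[rate[n][m] / avg_rate[n] if q[n] > 0 else 0.0
          for m in range(n_channels)] for n in range(n_users)]
    while True:
        # Step 1: locate the largest remaining marginal utility.
        best, n_star, m_star = max((u[n][m], n, m)
                                   for n in range(n_users)
                                   for m in range(n_channels))
        if best <= 0:
            break  # every element of U has been replaced with zero
        # Step 2: allocate slots, then update the queue and slot counters.
        need = math.ceil(q[n_star] / rate[n_star][m_star])
        alloc = min(s[m_star], need)
        x[n_star][m_star] += alloc
        q[n_star] = max(0.0, q[n_star] - alloc * rate[n_star][m_star])
        s[m_star] -= alloc
        if q[n_star] == 0:
            u[n_star] = [0.0] * n_channels   # user needs nothing more
        if s[m_star] == 0:
            for n in range(n_users):
                u[n][m_star] = 0.0           # sub-channel fully allocated
    return x
```

In a small example with two users and two sub-channels of two slots each, user 0 drains its queue on sub-channel 0 and user 1 takes all of sub-channel 1. The loop always terminates because every iteration allocates at least one slot.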

Note that the matrix-based scheduling algorithm takes queue occupancy into consideration. However, it does not consider QoS support. The same authors combined the idea of PF with static minimum bandwidth guarantee to support multiple service classes [5], [6]. A user whose channel quality is better than some threshold is guaranteed a pre-defined minimum bandwidth. This enhanced version, still, cannot provide QoS support well because it does not consider delay bound and loss probability requirements of real-time flows.

C. Modified-largest Weighted Delay First (M-LWDF) [10]

The goal of the M-LWDF scheme is to achieve $P(W_{n,k} > D_{n,k}) \le P_{n,k}$ for all $n \in \Gamma_{RT}$, $1 \le k \le K_n$. In M-LWDF, the marginal utility of flow $f_{n,k}$ on sub-channel $m$ is $\gamma_{n,k} \cdot W_{n,k}[t] \cdot r_{n,m}[t]$, where $W_{n,k}[t] \cdot T_{\mathrm{frame}}$ is the delay of the HOL packet of $Queue_{n,k}$ at the beginning of frame $t$ and $\gamma_{n,k}$ is an arbitrary positive constant. To transmit data, the flow with the largest marginal utility on some available sub-channel is selected for service. It was shown that M-LWDF is throughput-optimal in the sense that it is able to keep all queues stable if this is at all feasible to do with any scheduling algorithm. Moreover, it was reported that $\gamma_{n,k} = a_{n,k} / r_n[t]$, where $a_{n,k} = -(\log P_{n,k}) / D_{n,k}$, performs very well. Clearly, for such a selection of $\gamma_{n,k}$, the marginal utility is sensitive to loss probability and delay bound requirements as well as delay of the HOL packet. When combined with a token bucket control, M-LWDF can provide QoS support to flows with minimum bandwidth requirements. However, how to serve non-real-time flows with zero minimum bandwidth requirements was not studied. To compare its performance with that of our proposed scheme, we shall assume that the operation of M-LWDF is divided into two stages. In the first stage, only real-time traffic flows are considered. As a consequence, the first stage of M-LWDF is the same as that of the matrix-based scheduling, except for a different marginal utility function. The complexity of the first stage is $\max\{O(M^2|\Gamma_{RT}| + |\Gamma_{RT}|^2),\ O(|\Gamma_{RT}|^2 M + M^2)\}$. If there are un-allocated resources after the first stage, then the remaining resources are allocated in the second stage to non-real-time flows with zero minimum resource requirements. The goal of the second stage is to maximize system throughput. Assume that the matrix-based scheduling algorithm is adopted in the second stage. As a result, the complexity of the second stage is $\max\{O(M^2|\Gamma_{NRT}| + |\Gamma_{NRT}|^2),\ O(|\Gamma_{NRT}|^2 M + M^2)\}$.

Fig. 1. Architecture of the proposed scheme. (Block diagram: QoS parameters and queue status feed the minimum requested bandwidth calculation; its output, together with queue status and the channel status report, feeds the resource allocation for maximum throughput with QoS constraints; the per-SS result is passed to the pre-processor and the proportional-loss scheduler.)
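The M-LWDF marginal utility with the reported choice of $\gamma_{n,k}$ can be computed as in the following Python sketch. The function names are hypothetical, and the logarithm is assumed to be natural since the text does not state its base.

```python
import math


def mlwdf_weight(p_req: float, delay_bound_frames: float, avg_rate: float) -> float:
    """gamma_{n,k} = a_{n,k} / r_n[t] with a_{n,k} = -log(P_{n,k}) / D_{n,k}."""
    a = -math.log(p_req) / delay_bound_frames
    return a / avg_rate


def mlwdf_marginal_utility(p_req: float, delay_bound_frames: float, avg_rate: float,
                           hol_delay_frames: float, rate: float) -> float:
    """gamma_{n,k} * W_{n,k}[t] * r_{n,m}[t] for one flow on one sub-channel."""
    return mlwdf_weight(p_req, delay_bound_frames, avg_rate) * hol_delay_frames * rate
```

A tighter loss requirement (smaller $P_{n,k}$) or a shorter delay bound raises the weight, so such flows win the sub-channel sooner for the same HOL delay and rate.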

IV. THE PROPOSED SCHEME

In this section, we present a resource allocation scheme which considers both delay bound and loss probability requirements requested by real-time traffic flows. As shown in Fig. 1, the minimum requested bandwidths of real-time flows are computed, summed for each SS, and then used together with queue occupancy as constraints in resource allocation. After the solution is obtained, a PL scheduler is adopted to determine how multiple real-time traffic flows attached to the same SS share the allocated bandwidth. In case the available resource is not sufficient to provide each flow its minimum requested bandwidth, a pre-processor is required to maximize the number of real-time flows attached to each SS that meet their QoS requirements. We describe calculation of minimum requested bandwidth, resource allocation, PL scheduler, and pre-processor separately below.

Fig. 2. The relationship between $P_{n,k}[t]$ and $R_{n,k}[t]$. (Vertical axis: $P_{n,k}[t]$, with marked values $P^{\max}_{n,k}[t]$, $P^{\mathrm{knee}}_{n,k}[t]$, and $P^{\min}_{n,k}[t]$; horizontal axis: $R_{n,k}[t]$, with marked values $Q^1_{n,k}[t]$ and $Q_{n,k}[t]$.)

A. The Minimum Requested Bandwidth

For flow $f_{n,k}$ attached to SS $n \in \Gamma_{RT}$, define $P_{n,k}[x]$, the running loss probability up to frame $x$, as

$$P_{n,k}[x] = \frac{L_{n,k}[x]}{S_{n,k}[x] + L_{n,k}[x]},$$

where $S_{n,k}[x]$ and $L_{n,k}[x]$ represent, respectively, the accumulated amount of data served and lost up to the end of the $x$th frame. Consider the $t$th frame. Let $R_{n,k}[t]$ be the bandwidth allocated to flow $f_{n,k}$. For convenience, $R_{n,k}[t]$ is expressed in terms of the amount of data served. As a result, we have $0 \le R_{n,k}[t] \le Q_{n,k}[t]$. Let $x^+ = \max(0, x)$. Since data are lost only due to violation of their delay bounds, we have

$$P_{n,k}[t] = \frac{L_{n,k}[t-1] + (Q^1_{n,k}[t] - R_{n,k}[t])^+}{S_{n,k}[t-1] + L_{n,k}[t-1] + \max(R_{n,k}[t], Q^1_{n,k}[t])}. \tag{2}$$

It is not hard to see that $P_{n,k}[t]$ is a continuous, strictly decreasing function of $R_{n,k}[t]$ in the range $0 \le R_{n,k}[t] \le Q_{n,k}[t]$. The curve of $P_{n,k}[t]$ as a function of $R_{n,k}[t]$ is illustrated in Fig. 2. In this figure, there are three special points on the y-axis, namely, $P^{\max}_{n,k}[t]$, $P^{\mathrm{knee}}_{n,k}[t]$, and $P^{\min}_{n,k}[t]$, which can be obtained by substituting $R_{n,k}[t]$ with 0, $Q^1_{n,k}[t]$, and $Q_{n,k}[t]$ into equation (2), respectively. Note that if $Q_{n,k}[t] = 0$, we have $P_{n,k}[t] = P_{n,k}[t-1] = P^{\max}_{n,k}[t] = P^{\mathrm{knee}}_{n,k}[t] = P^{\min}_{n,k}[t]$.
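Equation (2) and the three special points can be evaluated directly, as in the Python sketch below. The function names are hypothetical, and returning 0 when no data have been served, lost, or queued so far is an assumption made only to avoid a division by zero.

```python
from typing import Tuple


def running_loss(l_prev: float, s_prev: float, q1: float, r: float) -> float:
    """Equation (2): running loss probability at the end of frame t.

    l_prev = L[t-1], s_prev = S[t-1], q1 = Q^1[t], r = R[t].
    """
    lost_now = max(0.0, q1 - r)          # (Q^1[t] - R[t])^+
    denom = s_prev + l_prev + max(r, q1)
    if denom == 0:
        return 0.0                       # assumption: no traffic seen yet
    return (l_prev + lost_now) / denom


def special_points(l_prev: float, s_prev: float, q1: float,
                   q: float) -> Tuple[float, float, float]:
    """(P_max, P_knee, P_min): equation (2) at R = 0, Q^1[t], and Q[t]."""
    return (running_loss(l_prev, s_prev, q1, 0.0),
            running_loss(l_prev, s_prev, q1, q1),
            running_loss(l_prev, s_prev, q1, q))
```

For example, with $L[t-1] = 1$, $S[t-1] = 9$, $Q^1[t] = 2$, and $Q[t] = 5$, the three points are $3/12$, $1/12$, and $1/15$.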

The minimum requested bandwidth of $f_{n,k}$, denoted by $R^*_{n,k}[t]$, is determined as follows. If $P_{n,k} \ge P^{\max}_{n,k}[t]$, then we set $R^*_{n,k}[t] = 0$ because there is no loss probability violation even if zero resource is allocated to $f_{n,k}$. Assume that $P^{\max}_{n,k}[t] > P_{n,k} > P^{\min}_{n,k}[t]$. In this case, $R^*_{n,k}[t]$ is obtained by solving $P_{n,k} = P_{n,k}[t]$, where $P_{n,k}[t]$ is described by equation (2). Finally, if $P_{n,k} \le P^{\min}_{n,k}[t]$, then the running loss probability is still greater than or equal to the pre-defined level $P_{n,k}$ even if all buffered data of $f_{n,k}$ are served. Therefore, we assign $R^*_{n,k}[t] = Q_{n,k}[t]$ to minimize the difference between $P_{n,k}[t]$ and $P_{n,k}$. For convenience, we use $P^*_{n,k}[t]$ to denote the running loss probability of $f_{n,k}$ at the end of the $t$th frame if the bandwidth allocated to $f_{n,k}$ is $R^*_{n,k}[t]$. Clearly, $P^*_{n,k}[t]$ equals $P^{\max}_{n,k}[t]$ if $P_{n,k} > P^{\max}_{n,k}[t]$ or $P^{\min}_{n,k}[t]$ if $P_{n,k} < P^{\min}_{n,k}[t]$.
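The case analysis above, with the closed forms listed in Table I, can be sketched in Python as follows. The function name and argument names are hypothetical; the two middle branches come from solving equation (2) for the allocation on either side of the knee.

```python
from typing import Tuple


def min_requested_bandwidth(p_req: float, l_prev: float, s_prev: float,
                            q1: float, q: float) -> Tuple[float, float]:
    """Return (R*, P*) for one flow.

    p_req = P_{n,k}, l_prev = L[t-1], s_prev = S[t-1], q1 = Q^1[t], q = Q[t].
    """
    total = s_prev + l_prev
    if total + q == 0:
        return 0.0, 0.0                  # assumption: nothing seen, nothing queued
    p_max = (l_prev + q1) / (total + q1) if total + q1 > 0 else 0.0
    p_knee = l_prev / (total + q1) if total + q1 > 0 else 0.0
    p_min = l_prev / (total + q)
    if p_req >= p_max:                   # requirement met without any service
        return 0.0, p_max
    if p_req >= p_knee:                  # solution lies at or below Q^1[t]
        return (1 - p_req) * (l_prev + q1) - p_req * s_prev, p_req
    if p_req > p_min:                    # solution lies between Q^1[t] and Q[t]
        return l_prev / p_req - total, p_req
    return q, p_min                      # even full service cannot meet it
```

With $L[t-1] = 1$, $S[t-1] = 9$, $Q^1[t] = 2$, and $Q[t] = 5$, a requirement of 0.1 yields $R^* = 1.8$ and a requirement of 0.08 yields $R^* = 2.5$; substituting either value back into equation (2) returns the requirement.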

The following lemma states that $P^*_{n,k}[t]$ is closer to $P_{n,k}$ than any other $P_{n,k}[t]$.

Lemma 1. It holds that

$$\min_{0 \le R_{n,k}[t] \le Q_{n,k}[t]} |P_{n,k}[t] - P_{n,k}| = |P^*_{n,k}[t] - P_{n,k}|.$$

Proofs of lemmas and theorems are provided in Appendix A. The minimum requested bandwidth for all cases is summarized in Table I. Note that the actual allocated bandwidth could be different from $R^*_{n,k}[t]$. After obtaining $R^*_{n,k}[t]$ for all $k$, $1 \le k \le K_n$, one can compute $R^*_n[t]$, the aggregate minimum requested bandwidth for SS $n$, as $\sum_{k=1}^{K_n} R^*_{n,k}[t]$. The values of $R^*_n[t]$, $n \in \Gamma_{RT}$, are used in the resource allocation algorithm described in the next sub-section.

TABLE I
CALCULATION OF $R^*_{n,k}[t]$ AND THE RESULTING $P^*_{n,k}[t]$ FOR FOUR CONDITIONS

| Condition | $R^*_{n,k}[t]$ | $P^*_{n,k}[t]$ |
|---|---|---|
| $P_{n,k} \ge P^{\max}_{n,k}[t]$ | $0$ | $P^{\max}_{n,k}[t]$ |
| $P^{\max}_{n,k}[t] > P_{n,k} \ge P^{\mathrm{knee}}_{n,k}[t]$ | $(1 - P_{n,k})(L_{n,k}[t-1] + Q^1_{n,k}[t]) - P_{n,k} \cdot S_{n,k}[t-1]$ | $P_{n,k}$ |
| $P^{\mathrm{knee}}_{n,k}[t] > P_{n,k} > P^{\min}_{n,k}[t]$ | $\dfrac{L_{n,k}[t-1]}{P_{n,k}} - (S_{n,k}[t-1] + L_{n,k}[t-1])$ | $P_{n,k}$ |
| $P_{n,k} \le P^{\min}_{n,k}[t]$ | $Q_{n,k}[t]$ | $P^{\min}_{n,k}[t]$ |

B. Resource Allocation for Maximum-throughput With QoS Constraints

As described in Problem P1, the proposed resource allocation algorithm maximizes system throughput while providing QoS guarantee to real-time traffic flows. In Problem P1, we let $R^*_n[t] = 0$ for all SS $n \in \Gamma_{NRT}$. As in the previous section, we use $r_{n,m}[t]$ to denote the maximum achievable transmission rate on the $m$th sub-channel for SS $n$ in the $t$th frame. The variable $x_{n,m}[t]$ represents the number of time slots allocated to SS $n$ on the $m$th sub-channel in the $t$th frame.

P1:

$$\max \sum_{n \in \Gamma} \sum_{m=1}^{M} x_{n,m}[t] \cdot r_{n,m}[t], \tag{3}$$

subject to

$$\sum_{n \in \Gamma} x_{n,m}[t] \le S, \quad \forall m,\ 1 \le m \le M, \tag{4}$$

$$R^*_n[t] \le \sum_{m=1}^{M} x_{n,m}[t] \cdot r_{n,m}[t] \le Q_n[t], \quad \forall n \in \Gamma, \tag{5}$$

and

$$x_{n,m}[t] \in \{0, 1, 2, \ldots, S\}, \quad \forall n \in \Gamma,\ 1 \le m \le M. \tag{6}$$

Problem P1 can be solved by some integer linear programming algorithm [19]. If there is no feasible solution, meaning that the available resource is smaller than the summation of all minimum requested bandwidths, we set $x_{n,m}[t] = 0$ for all $n \in \Gamma_{NRT}$, $1 \le m \le M$, and solve a modified problem, called Problem P2, which is basically the same as Problem P1 except that the constraint shown in equation (5) is replaced by $0 \le \sum_{m=1}^{M} x_{n,m}[t] \cdot r_{n,m}[t] \le R^*_n[t]$, $\forall n \in \Gamma$. Note that the solution of Problem P2 always exists because $x_{n,m}[t] = 0$, for all $n \in \Gamma$, $1 \le m \le M$, is one feasible solution. Unfortunately, integer linear programming is NP-complete [20]. One possible strategy to mitigate the computational complexity is to set $u_{n,m} = r_{n,m}[t]$ for all $n \in \Gamma$, $1 \le m \le M$, and conduct the matrix-based scheduling algorithm for one or two rounds. In the first round, we only consider SSs contained in $\Gamma_{RT}$, assuming that the queue occupancy of SS $n$ is equal to $R^*_n[t]$. The algorithm ends if the resource is exhausted in the first round. Otherwise, the second round is performed to allocate the remaining resource to all SSs, assuming the queue occupancy of SS $n$ is equal to $Q_n[t] - R^*_n[t]$. According to the analysis provided in the last section, the computational complexity of the modified matrix-based scheduling algorithm is $O(\max(M^2|\Gamma| + |\Gamma|^2,\ |\Gamma|^2 M + M^2))$.
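To make the formulation of Problem P1 concrete, the following Python sketch enumerates every slot assignment of a toy instance and keeps the best feasible one. It is an exhaustive search meant only for illustration on very small inputs, not an integer linear programming solver and not the matrix-based heuristic; the function name and argument layout are hypothetical.

```python
from itertools import product
from typing import List, Optional, Tuple


def solve_p1_bruteforce(rate: List[List[float]], r_star: List[float],
                        queue: List[float], S: int
                        ) -> Tuple[Optional[float], Optional[List[List[int]]]]:
    """Exhaustively solve P1 for a tiny instance.

    rate[n][m] = r_{n,m}[t], r_star[n] = R*_n[t], queue[n] = Q_n[t].
    Returns (best objective, best x) or (None, None) if P1 is infeasible.
    """
    n_users, n_ch = len(rate), len(rate[0])
    best_val, best_x = None, None
    # Constraint (6): every x_{n,m} takes a value in {0, 1, ..., S}.
    for flat in product(range(S + 1), repeat=n_users * n_ch):
        x = [flat[n * n_ch:(n + 1) * n_ch] for n in range(n_users)]
        # Constraint (4): at most S slots per sub-channel.
        if any(sum(x[n][m] for n in range(n_users)) > S for m in range(n_ch)):
            continue
        served = [sum(x[n][m] * rate[n][m] for m in range(n_ch))
                  for n in range(n_users)]
        # Constraint (5): between the minimum request and the queue size.
        if any(not (r_star[n] <= served[n] <= queue[n]) for n in range(n_users)):
            continue
        value = sum(served)              # objective (3)
        if best_val is None or value > best_val:
            best_val, best_x = value, [list(row) for row in x]
    return best_val, best_x
```

The search space grows as $(S+1)^{|\Gamma| M}$, which is why the text falls back on a heuristic for realistic sizes. An infeasible instance returns `(None, None)`, which corresponds to the situation in which Problem P2 is solved instead.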

Let $y_{n,m}[t]$ be the solution obtained either from integer linear programming or from the matrix-based scheduling algorithm. We have $R_n[t] = \sum_{m=1}^{M} y_{n,m}[t] \cdot r_{n,m}[t]$. If $R_n[t] = R^*_n[t]$, then the bandwidth allocated to the $k$th attached flow, i.e., $R_{n,k}[t]$, is equal to $R^*_{n,k}[t]$. Assume that $R_n[t] \neq R^*_n[t]$. In this case, we need a user-level resource allocation algorithm for the attached flows to share the allocated bandwidth. In the following sub-section, we define the PL scheduler to solve this problem.

C. Proportional-loss (PL) Scheduler

Consider SS $n$ and assume that it is attached with multiple real-time traffic flows. Define three disjoint sets $U_Z$, $U_P$, and $U_A$ such that flow $f_{n,k}$ is contained in $U_Z$, $U_P$, or $U_A$ iff $R_{n,k}[t] = 0$, $0 < R_{n,k}[t] < Q_{n,k}[t]$, or $R_{n,k}[t] = Q_{n,k}[t]$, respectively. Given $R_{n,k}[t]$, the proposed PL scheduler is a scheduler which achieves, for any $f_{n,z} \in U_Z$, $f_{n,p}, f_{n,p'} \in U_P$, and $f_{n,a} \in U_A$,

$$\frac{P_{n,z}[t]}{P_{n,z}} \le \frac{P_{n,p}[t]}{P_{n,p}} = \frac{P_{n,p'}[t]}{P_{n,p'}} \le \frac{P_{n,a}[t]}{P_{n,a}}, \tag{7}$$

subject to

$$R_n[t] = \sum_{k=1}^{K_n} R_{n,k}[t]. \tag{8}$$

Define $P_{n,k}[t] / P_{n,k}$ as the normalized running loss probability of $f_{n,k}$ up to frame $t$. The proposed PL scheduler achieves min-max optimality, as stated in Lemma 2. In Theorem 3, we show that if there exists a scheduler which guarantees the loss probability requirements, so does the PL scheduler.

Lemma 2. Given $R_n[t] > 0$, $S_{n,k}[t-1]$, $L_{n,k}[t-1]$ and $\{Q^d_{n,k}[t]\}_{d=1}^{D_{n,k}}$, $1 \le k \le K_n$, the proposed PL scheduler minimizes the maximum normalized running loss probability of all the traffic flows attached to SS $n$.

Theorem 3. Given $R_n[t] > 0$, $S_{n,k}[t-1]$, $L_{n,k}[t-1]$ and $\{Q^d_{n,k}[t]\}_{d=1}^{D_{n,k}}$, $1 \le k \le K_n$, if there exists a scheduler which can guarantee the loss probability requirements of all the $K_n$ traffic flows, so can the PL scheduler.

Theorem 3 explains why the PL scheduler is proposed as the user-level resource allocation algorithm. Define $[R_n[t], S_{n,k}[t-1], L_{n,k}[t-1], \{Q^d_{n,k}[t]\}_{d=1}^{D_{n,k}}\ (1 \le k \le K_n)]$ as the state of SS $n$ at the beginning of the $t$th frame. Given the state at the beginning of the first frame, the PL scheduler is preferred over other schedulers in the first frame, according to Theorem 3. Assume that the PL scheduler is adopted in the first frame. The state at the beginning of the second frame is determined once traffic arrivals at the beginning of the second frame are known and $R_n[2]$ is provided. Based on Theorem 3 again, the PL scheduler is still the preferred scheduler in the second frame. The arguments can be applied to all frames.

In the rest of this sub-section, we present a realization of the PL scheduler. Again, consider SS $n$ in the $t$th frame and assume that $R_n[t]$ is given. We need to determine $R_{n,k}[t]$, $1 \le k \le K_n$, so that equations (7) and (8) are satisfied.

Lemma 4. If $R_n[t] = R^*_n[t]$, equations (7) and (8) are satisfied for $R_{n,k}[t] = R^*_{n,k}[t]$, $1 \le k \le K_n$.

Assume that $R_n[t] \neq R^*_n[t]$. We have the following Theorem 5.

Theorem 5. Define $\Delta R_n[t] = R_n[t] - R^*_n[t]$ and $\Delta R_{n,k}[t] = R_{n,k}[t] - R^*_{n,k}[t]$, $1 \le k \le K_n$. Under the PL scheduler, it holds that $\Delta R_{n,k}[t] \ge 0$ $(1 \le k \le K_n)$ if $\Delta R_n[t] \ge 0$ or $\Delta R_{n,k}[t] \le 0$ otherwise.

A consequence of Theorem 5 is that $R^*_{n,k}[t] = Q_{n,k}[t]$ implies $R_{n,k}[t] = Q_{n,k}[t]$ if $R_n[t] \ge R^*_n[t]$; and $R^*_{n,k}[t] = 0$ implies $R_{n,k}[t] = 0$ if $R_n[t] \le R^*_n[t]$. To realize the PL scheduler, we start with $R_{n,k}[t] = R^*_{n,k}[t]$, $1 \le k \le K_n$. If $R_n[t] = R^*_n[t]$, then the solution is found. Adjustment is necessary if $R_n[t] \neq R^*_n[t]$. To do the adjustment, flows are classified into four sets $U_Z$, $U_{P1}$, $U_{P2}$, and $U_A$ such that $f_{n,k}$ is in $U_Z$, $U_{P1}$, $U_{P2}$, or $U_A$ iff $R^*_{n,k}[t] = 0$, $0 < R^*_{n,k}[t] \le Q^1_{n,k}[t]$, $Q^1_{n,k}[t] < R^*_{n,k}[t] < Q_{n,k}[t]$, or $R^*_{n,k}[t] = Q_{n,k}[t]$, respectively. Two cases are considered separately.

Case 1: $R_n[t] > R^*_n[t]$

According to Theorem 5, $R_n[t] > R^*_n[t]$ implies $R_{n,k}[t] \ge R^*_{n,k}[t]$. Therefore, we should increase the value of $R_{n,k}[t]$ for $f_{n,k} \in U_{P1} \cup U_{P2} \cup U_Z$. Our idea is to increase $R_{n,k}[t]$ gradually, keeping equation (7) satisfied, until $R_n[t] = \sum_{k=1}^{K_n} R_{n,k}[t]$ is true. During the process of increasing $R_{n,k}[t]$, we shall either find a solution or have to move a flow from $U_Z$ to $U_{P1}$, from $U_{P1}$ to $U_{P2}$, or from $U_{P2}$ to $U_A$. For example, assume that $f_{n,i} \in U_{P1}$ and the first event, called Event 1, we encounter is to move $f_{n,i}$ from $U_{P1}$ to $U_{P2}$. For Event 1 to happen, the conditions to be met are

1) $\dfrac{P^{\mathrm{knee}}_{n,i}[t]}{P_{n,i}} = \max_{f_{n,k} \in U_{P1}} \dfrac{P^{\mathrm{knee}}_{n,k}[t]}{P_{n,k}}$ (no flow is moved from $U_{P1}$ to $U_{P2}$ earlier than Event 1),

2) $\dfrac{P^{\mathrm{knee}}_{n,i}[t]}{P_{n,i}} \ge \max_{f_{n,k} \in U_{P2}} \dfrac{P^{\min}_{n,k}[t]}{P_{n,k}}$ (no flow is moved from $U_{P2}$ to $U_A$ earlier than Event 1),

3) $\dfrac{P^{\mathrm{knee}}_{n,i}[t]}{P_{n,i}} \ge \max_{f_{n,k} \in U_Z} \dfrac{P^{\max}_{n,k}[t]}{P_{n,k}}$ (no flow is moved from $U_Z$ to $U_{P1}$ earlier than Event 1), and

4) $\sum_{f_{n,k} \in U_{P1} \cup U_{P2}} h_{n,k}\!\left(\dfrac{P^{\mathrm{knee}}_{n,i}[t]}{P_{n,i}} \cdot P_{n,k};\, t\right) + \sum_{f_{n,k} \in U_A} Q_{n,k}[t] < R_n[t]$ (no solution is found earlier than Event 1),

where the definition of $h_{n,k}(x; t)$ is shown in equation (9).

$$h_{n,k}(x; t) = \begin{cases} \dfrac{1}{x} \cdot L_{n,k}[t-1] - S_{n,k}[t-1] - L_{n,k}[t-1], & \text{if } P^{\min}_{n,k}[t] \le x < P^{\mathrm{knee}}_{n,k}[t], \\ L_{n,k}[t-1] + Q^1_{n,k}[t] - x \cdot (S_{n,k}[t-1] + L_{n,k}[t-1] + Q^1_{n,k}[t]), & \text{if } P^{\mathrm{knee}}_{n,k}[t] \le x \le P^{\max}_{n,k}[t]. \end{cases} \tag{9}$$

Note that $h_{n,k}(x; t)$ is the inverse function of $P_{n,k}[t]$ shown in equation (2). The conditions for other events to happen can be similarly determined. After all flows are placed in the correct sets, the solution can be obtained by solving equations (7) and (8). To summarize, we repeatedly check the inequality shown in equation (10). If it holds, flow $f_{n,k^*}$ is moved from one set to another.

$$\sum_{f_{n,k} \in U_{P1} \cup U_{P2}} h_{n,k}(p \cdot P_{n,k};\, t) + \sum_{f_{n,k} \in U_A} Q_{n,k}[t] < R_n[t], \tag{10}$$

where

$$p = \max\left( \max_{f_{n,k} \in U_Z} \frac{P^{\max}_{n,k}[t]}{P_{n,k}},\ \max_{f_{n,k} \in U_{P1}} \frac{P^{\mathrm{knee}}_{n,k}[t]}{P_{n,k}},\ \max_{f_{n,k} \in U_{P2}} \frac{P^{\min}_{n,k}[t]}{P_{n,k}} \right), \tag{11}$$

and

$$k^* = \arg\max\left( \max_{f_{n,k} \in U_Z} \frac{P^{\max}_{n,k}[t]}{P_{n,k}},\ \max_{f_{n,k} \in U_{P1}} \frac{P^{\mathrm{knee}}_{n,k}[t]}{P_{n,k}},\ \max_{f_{n,k} \in U_{P2}} \frac{P^{\min}_{n,k}[t]}{P_{n,k}} \right). \tag{12}$$

All flows are placed in their correct sets once the inequality shown in equation (10) becomes false. The solution can then be obtained as follows. Set $R_{n,k}[t] = 0$ if $f_{n,k} \in U_Z$ or $Q_{n,k}[t]$ if $f_{n,k} \in U_A$. For $f_{n,k} \in U_{P1} \cup U_{P2}$, $R_{n,k}[t]$ can be obtained by $R_{n,k}[t] = h_{n,k}(P^F_n[t] \cdot P_{n,k};\, t)$, where $P^F_n[t]$ represents the normalized running loss probability for any $f_{n,k} \in U_{P1} \cup U_{P2}$ at the end of the $t$th frame and is derived in Appendix B.
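A quick way to check that $h_{n,k}(x; t)$ inverts equation (2) is to code both and compare, as in the Python sketch below. The function names are hypothetical, and a small tolerance is used in the range check to absorb floating-point rounding at the end points.

```python
def running_loss(l_prev: float, s_prev: float, q1: float, r: float) -> float:
    """Equation (2): running loss probability for allocation r."""
    return (l_prev + max(0.0, q1 - r)) / (s_prev + l_prev + max(r, q1))


def h_inverse(x: float, l_prev: float, s_prev: float, q1: float, q: float) -> float:
    """Equation (9): allocation that yields running loss probability x."""
    total = s_prev + l_prev
    p_max = (l_prev + q1) / (total + q1)
    p_knee = l_prev / (total + q1)
    p_min = l_prev / (total + q)
    eps = 1e-12
    if not (p_min - eps <= x <= p_max + eps):
        raise ValueError("x must lie between P_min and P_max")
    if x < p_knee:
        return l_prev / x - total            # branch for R above Q^1[t]
    return l_prev + q1 - x * (total + q1)    # branch for R at or below Q^1[t]
```

With $L[t-1] = 1$, $S[t-1] = 9$, $Q^1[t] = 2$, and $Q[t] = 5$, the call `h_inverse(0.1, 1, 9, 2, 5)` gives about 1.8 and `running_loss(1, 9, 2, 1.8)` gives back 0.1 up to rounding. The two branches agree at the knee, where both return $Q^1[t]$.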

Case 2: $R_n[t] < R^*_n[t]$

Case 2 is similar to Case 1, except that we need to decrease $R_{n,k}[t]$ for $f_{n,k} \in U_{P1} \cup U_{P2} \cup U_A$. For this case, we repeatedly check the inequality shown in equation (13) until it becomes false. If it is true, flow $f_{n,k^*}$ is moved from $U_A$ to $U_{P2}$, from $U_{P2}$ to $U_{P1}$, or from $U_{P1}$ to $U_Z$.

$$\sum_{f_{n,k} \in U_{P1} \cup U_{P2}} h_{n,k}(p \cdot P_{n,k};\, t) + \sum_{f_{n,k} \in U_A} Q_{n,k}[t] > R_n[t], \tag{13}$$

where

$$p = \min\left( \min_{f_{n,k} \in U_{P1}} \frac{P^{\max}_{n,k}[t]}{P_{n,k}},\ \min_{f_{n,k} \in U_{P2}} \frac{P^{\mathrm{knee}}_{n,k}[t]}{P_{n,k}},\ \min_{f_{n,k} \in U_A} \frac{P^{\min}_{n,k}[t]}{P_{n,k}} \right), \tag{14}$$

and

$$k^* = \arg\min\left( \min_{f_{n,k} \in U_{P1}} \frac{P^{\max}_{n,k}[t]}{P_{n,k}},\ \min_{f_{n,k} \in U_{P2}} \frac{P^{\mathrm{knee}}_{n,k}[t]}{P_{n,k}},\ \min_{f_{n,k} \in U_A} \frac{P^{\min}_{n,k}[t]}{P_{n,k}} \right). \tag{15}$$

After the inequality shown in equation (13) becomes false, the solution can be obtained as follows. Set $R_{n,k}[t] = 0$ if $f_{n,k} \in U_Z$ or $Q_{n,k}[t]$ if $f_{n,k} \in U_A$. For $f_{n,k} \in U_{P1} \cup U_{P2}$, $R_{n,k}[t]$ can be obtained by $R_{n,k}[t] = h_{n,k}(P^F_n[t] \cdot P_{n,k};\, t)$. The pseudo code of the above realization of the PL scheduler is provided below.

Algorithm 1: PL scheduler

Data:
  1) $U_Z = \{f_{n,k} : R^*_{n,k}[t] = 0\}$
  2) $U_{P1} = \{f_{n,k} : 0 < R^*_{n,k}[t] \le Q^1_{n,k}[t]\}$
  3) $U_{P2} = \{f_{n,k} : Q^1_{n,k}[t] < R^*_{n,k}[t] < Q_{n,k}[t]\}$
  4) $U_A = \{f_{n,k} : R^*_{n,k}[t] = Q_{n,k}[t]\}$

Result: $R_{n,k}[t]$ for all $f_{n,k}$ with $Q_{n,k}[t] > 0$, $1 \le k \le K_n$

begin
  if $R_n[t] = R^*_n[t]$ then
    $R_{n,k}[t] = R^*_{n,k}[t]$, $1 \le k \le K_n$
  else if $R_n[t] > R^*_n[t]$ then
    while (1) do
      calculate $p$ according to equation (11)
      if equation (10) is false then
        $R_{n,k}[t] = 0$ for all $f_{n,k} \in U_Z$
        $R_{n,k}[t] = Q_{n,k}[t]$ for all $f_{n,k} \in U_A$
        $R_{n,k}[t] = h_{n,k}(P^F_n[t] \cdot P_{n,k};\, t)$ for all $f_{n,k} \in U_{P1} \cup U_{P2}$
        (Flow $f_{n,k}$ is moved from $U_{P2}$ to $U_A$ if $R_{n,k}[t] = Q_{n,k}[t]$.)
        exit
      else
        determine $k^*$ according to equation (12)
        if $f_{n,k^*} \in U_Z$ then
          $U_Z = U_Z - f_{n,k^*}$, $U_{P1} = U_{P1} \cup f_{n,k^*}$
        else if $f_{n,k^*} \in U_{P1}$ then
          $U_{P1} = U_{P1} - f_{n,k^*}$, $U_{P2} = U_{P2} \cup f_{n,k^*}$
        else
          $U_{P2} = U_{P2} - f_{n,k^*}$, $U_A = U_A \cup f_{n,k^*}$
        end
      end
    end
  else
    while (1) do
      calculate $p$ according to equation (14)
      if equation (13) is false then
        $R_{n,k}[t] = 0$ for all $f_{n,k} \in U_Z$
        $R_{n,k}[t] = Q_{n,k}[t]$ for all $f_{n,k} \in U_A$
        $R_{n,k}[t] = h_{n,k}(P^F_n[t] \cdot P_{n,k};\, t)$ for all $f_{n,k} \in U_{P1} \cup U_{P2}$
        (Flow $f_{n,k}$ is moved from $U_{P2}$ to $U_{P1}$ if $R_{n,k}[t] = Q^1_{n,k}[t]$ or from $U_{P1}$ to $U_Z$ if $R_{n,k}[t] = 0$.)
        exit
      else
        determine $k^*$ according to equation (15)
        if $f_{n,k^*} \in U_{P1}$ then
          $U_{P1} = U_{P1} - f_{n,k^*}$, $U_Z = U_Z \cup f_{n,k^*}$
        else if $f_{n,k^*} \in U_{P2}$ then
          $U_{P2} = U_{P2} - f_{n,k^*}$, $U_{P1} = U_{P1} \cup f_{n,k^*}$
        else
          $U_A = U_A - f_{n,k^*}$, $U_{P2} = U_{P2} \cup f_{n,k^*}$
        end
      end
    end
  end
end


Note that, for Case 1, the maximum number of iterations needed for the PL scheduler is $3K_n$, which happens when each flow is moved from $U_Z$ to $U_{P1}$, from $U_{P1}$ to $U_{P2}$, and then from $U_{P2}$ to $U_A$. In each iteration, the computational complexity is $O(K_n)$. Therefore, the total computational complexity is $O(K_n^2)$. Obviously, the complexity for Case 2 is the same.
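The final allocation step, once every flow sits in its correct set, can be illustrated with a short Python sketch. It evaluates the quadratic of Appendix B for $P^F_n[t]$ and then applies $h_{n,k}$. This is only a sketch under stated assumptions: equation (9) is not reproduced here, so the piecewise form of `h` below is an assumption chosen to be consistent with the coefficients A, B, and C of Appendix B, and the names `Flow`, `solve_pf`, `h`, and `allocate` are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Flow:
    S: float   # S_{n,k}[t-1], data served before frame t
    L: float   # L_{n,k}[t-1], data lost before frame t
    Q1: float  # Q^1_{n,k}[t], as defined in the paper
    Q: float   # Q_{n,k}[t], total queued data
    P: float   # P_{n,k}, loss probability requirement


def solve_pf(up1, up2, ua, rn):
    """Normalized running loss probability P^F_n[t] from Appendix B."""
    a = sum(f.P * (f.S + f.L + f.Q1) for f in up1)
    b = (rn + sum(f.S + f.L for f in up2)
         - sum(f.Q for f in ua)
         - sum(f.L + f.Q1 for f in up1))
    c = -sum(f.L / f.P for f in up2)
    if a == 0:  # U_P1 is empty, so the equation is linear
        return -c / b
    # Non-negative root of a*p^2 + b*p + c = 0
    return (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)


def h(f, x):
    """Assumed inverse of the running loss probability (valid between
    P^min and P^max): bandwidth that yields loss probability x."""
    knee = f.L / (f.S + f.L + f.Q1)
    if x >= knee:   # allocation in [0, Q1]: flow belongs to U_P1
        return f.L + f.Q1 - x * (f.S + f.L + f.Q1)
    return f.L / x - f.S - f.L  # allocation in (Q1, Q]: flow belongs to U_P2


def allocate(up1, up2, ua, rn):
    """Return P^F_n[t] and the shares of the flows in U_P1 followed by U_P2.
    Flows in U_A simply receive Q_{n,k}[t]; flows in U_Z receive 0."""
    pf = solve_pf(up1, up2, ua, rn)
    return pf, [h(f, pf * f.P) for f in up1 + up2]
```

For example, with one flow in $U_{P1}$ (`Flow(90, 5, 5, 10, 0.1)`), one in $U_{P2}$ (`Flow(90, 4, 4, 10, 0.05)`), an empty $U_A$, and $R_n[t] = 8$, the sketch gives a normalized loss probability of about 0.8 and shares of about 2 and 6, which sum to $R_n[t]$ and leave both flows at the same ratio of running loss probability to requirement.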

D. Pre-processor

Assume that $R_n[t] < R^*_n[t]$ (i.e., Case 2 occurs) and $R^*_{n,k}[t] > 0$. In this case, flow $f_{n,k}$ will violate its loss probability requirement if the PL scheduler is adopted. As a consequence, all flows attached to SS $n$ violate their loss probability requirements if $R^*_{n,k}[t] > 0$ for all $k$. This is clearly undesirable. One possible remedy is to place a pre-processor in front of the PL scheduler to maximize the number of flows which meet their loss probability requirements. Let $\Omega = U_{P1} \cup U_{P2} \cup \{f_{n,k} \mid f_{n,k} \in U_A,\ P^*_{n,k}[t] = P_{n,k}\}$. The operation of the pre-processor is as follows.

1) Select the flow $f_{n,k}$ which satisfies $R^*_{n,k}[t] = \min_{f_{n,i} \in \Omega} \{R^*_{n,i}[t]\}$.

2) End the pre-processor operation if $R^*_{n,k}[t] > R_n[t]$. Otherwise, set $R_{n,k}[t] = R^*_{n,k}[t]$ and remove $f_{n,k}$ from the set it originally belongs to.

3) Update $R_n[t] = R_n[t] - R^*_{n,k}[t]$ and $\Omega = \Omega - \{f_{n,k}\}$.

4) End the pre-processor operation if $\Omega = \emptyset$. Otherwise, repeat the process from step 1.

After the operation of the pre-processor ends, the remaining resource is allocated to the remaining flows belonging to $U_{P1} \cup U_{P2} \cup U_A$ by the PL scheduler. Clearly, the computational complexity of the pre-processor is $O(K'_n \log K'_n)$, where $K'_n = |U_{P1} \cup U_{P2} \cup \{f_{n,k} \mid f_{n,k} \in U_A,\ P^*_{n,k}[t] = P_{n,k}\}| \le K_n$. As will be seen in the next section, adoption of the pre-processor can significantly increase the number of real-time flows which meet their QoS requirements.
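The four pre-processor steps amount to a smallest-request-first greedy selection, which can be sketched in Python as follows. The function name `preprocess` and the dictionary-based interface are assumptions made for illustration. Sorting the candidates once accounts for the $O(K'_n \log K'_n)$ cost.

```python
def preprocess(min_requests, budget):
    """Greedy pre-processor sketch.

    min_requests: dict mapping a flow id in Omega to its minimum requested
                  bandwidth R*_{n,k}[t] (hypothetical interface).
    budget:       resource R_n[t] available to the SS in this frame.
    Returns the flows granted their full request and the leftover budget,
    which would then be handed to the PL scheduler.
    """
    granted = {}
    # Step 1: always consider the flow with the smallest request first.
    for flow, need in sorted(min_requests.items(), key=lambda kv: kv[1]):
        # Step 2: stop as soon as the smallest remaining request does not fit.
        if need > budget:
            break
        granted[flow] = need
        # Step 3: update the remaining resource.
        budget -= need
    # Step 4: the loop also ends when Omega is exhausted.
    return granted, budget
```

With requests `{"a": 3, "b": 1, "c": 5}` and a budget of 5, flows `b` and `a` are granted in that order, and one unit of resource is left for the PL scheduler to distribute among the remaining flows.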

V. SIMULATION RESULTS

In our simulations, SSs are uniformly distributed in a circular area of radius 2 km and the BS is located at the center. Two types of real-time traffic flows are studied. Parameters of the simulation environment, AMC schemes, traffic specifications, and QoS requirements of real-time flows are summarized in Table II. A frame is decomposed into downlink and uplink sub-frames. We only consider downlink transmission, which is assumed to occupy 30 time slots in a frame. The other time slots are used for uplink transmission and signaling overhead. For non-real-time traffic, we assume that its queue is always non-empty. Two scenarios are investigated. In both scenarios, we assume that $|\Gamma_{NRT}| = 40$ and the minimum requested bandwidth of every non-real-time flow is zero.

In the first scenario, in addition to the 40 non-real-time flows, there is a varying number of SSs, each attached with one Type I real-time flow. The second scenario has 13 SSs, each attached with two real-time flows, one of Type I and another of Type II. Simulations are performed for 10,000 frames using Matlab on a PC with an Intel Core 2 Quad CPU operated at 2.83 GHz with 3072 MB of RAM.

For the first scenario, we compare our proposed scheme with the pure maximum-throughput algorithm, the three scheduling policies proposed in [13], and the M-LWDF scheme. To maximize system throughput, the minimum requested bandwidth of any real-time traffic flow is zero for the pure maximum-throughput algorithm. For fair comparison, we change the resource granularity from sub-channel to time slot for the three policies proposed in [13]. With such a change, their performance is better than that of the original versions. We label our proposed scheme by "proposed:ILP" or "proposed:Matrix" if the resource allocation problem is solved by integer linear programming or the matrix-based scheduling algorithm, respectively. Both the PL scheduler and the pre-processor are adopted in Scenario 2 for all investigated schemes, except the M-LWDF scheme.

Fig. 3. Throughputs of various schemes in the first scenario. (Horizontal axis: number of SSs attached with Type I traffic flows; vertical axis: throughput in Mbps; curves: MAX-throughput, scheme of [13] with β = 0, β = 1, and β = ∞, proposed:Matrix, and M-LWDF.)

In Fig. 3 and Fig. 4, we compare, respectively, total system throughput and loss probability of the investigated schemes for SSs attached with Type I real-time traffic flows in the first scenario. Compared with the schemes presented in [13] for β = 0 and β = 1, our proposed scheme achieves better system throughput. The maximum improvement is about 28% (6.018 Mbps versus 4.696 Mbps), which occurs when $|\Gamma_{RT}| = 60$. Although the pure maximum-throughput algorithm and the scheme presented in [13] for β = ∞ have better throughput performance than our proposed scheme, their loss probabilities are higher than the specified value. In fact, a large proportion (about 80%) of real-time data is lost for the pure maximum-throughput algorithm. The reason is that there are many SSs attached with non-real-time traffic flows that are assumed to always have data for transmission. The improvement of our proposed scheme stops when $|\Gamma_{RT}| \ge 70$. The reason is that, for $|\Gamma_{RT}| \ge 70$, the average running loss probability is greater than the loss probability requirement and, therefore, the resource is allocated to users with good channel qualities by our proposed scheme and the scheme presented in [13] for β = 0 and β = 1. Compared with the M-LWDF scheme, our proposed algorithm achieves higher throughput without sacrificing QoS guarantee.

In Fig. 5 and Fig. 6, we compare the performances of our proposed:ILP and proposed:Matrix schemes. Results show that the difference is not significant. For $|\Gamma_{RT}| = 30$, the execution time of the proposed:Matrix scheme is 0.9 ms, which is much smaller than 47.4 ms, the execution time of the proposed:ILP scheme.


TABLE II

PARAMETERS OF SIMULATION ENVIRONMENT, TRAFFIC CHARACTERISTICS, QOS REQUIREMENTS AND ADOPTED MODULATION AND CODING SCHEME.

Simulation environment

| Parameter | Value |
|---|---|
| Radius of cell | 2 km |
| User distribution | Uniform |
| Bandwidth | 10 MHz |
| Channel model | Rayleigh fading channel |
| Doppler frequency | 4.6 Hz (speed: 2 km/hr) |
| Path loss exponent | 4 |
| Frame duration | 5 ms |
| Time slot duration | 0.1 ms |
| Number of sub-channels | 16 |
| Number of sub-carriers | 64 (per sub-channel) |

Traffic characteristics and QoS requirements

| Traffic type | Type I | Type II [21] |
|---|---|---|
| Content | Voice | Video streaming (Star War II) |
| Codec format | G.711 | MPEG-4 |
| Mean inter-arrival time | 20 ms | 40 ms |
| Mean packet size | 200 bytes | 267 bytes |
| Delay bound | 80 ms | 160 ms |
| Loss probability requirement | 10% | 5, 10, 15, 20, 25% |

The adopted modulation and coding scheme [12]

| Mode | Modulation | Coding rate | Receiver SNR (dB) |
|---|---|---|---|
| 1 | QPSK | 1/2 | 5 |
| 2 | QPSK | 3/4 | 8 |
| 3 | 16QAM | 1/2 | 10.5 |
| 4 | 16QAM | 3/4 | 14 |
| 5 | 64QAM | 1/2 | 16 |
| 6 | 64QAM | 2/3 | 18 |
| 7 | 64QAM | 3/4 | 20 |

TABLE III

LOSS PROBABILITIES FOR USERS ATTACHED WITH ONE TYPE I AND ONE TYPE II REAL-TIME FLOWS.

| Loss probability requirement (Type II) | M-LWDF: PL,I | M-LWDF: PL,II | Scheme of [13] with β = 0: PL,I | Scheme of [13] with β = 0: PL,II | Scheme of [13] with β = 1: PL,I | Scheme of [13] with β = 1: PL,II | proposed:Matrix: PL,I | proposed:Matrix: PL,II |
|---|---|---|---|---|---|---|---|---|
| 5% | 0.0025 | 0.0013 | 0.0182 | 0.0091 | 0.0671 | 0.0336 | 0.1000 | 0.0502 |
| 10% | 0 | 0.0035 | 0.0122 | 0.0122 | 0.0448 | 0.0448 | 0.1000 | 0.1000 |
| 15% | 0 | 0.0036 | 0.0094 | 0.0141 | 0.0342 | 0.0513 | 0.1002 | 0.1505 |
| 20% | 0 | 0.0037 | 0.0079 | 0.0158 | 0.0280 | 0.0561 | 0.1000 | 0.2000 |
| 25% | 0 | 0.0039 | 0.0066 | 0.0165 | 0.0238 | 0.0594 | 0.1001 | 0.2503 |

Fig. 4. Loss probabilities of SSs attached with real-time traffic flows in the first scenario. (Vertical axis: loss probability in %; curves: MAX-throughput, scheme of [13] with β = 0, β = 1, and β = ∞, proposed:Matrix, and M-LWDF.)

Fig. 5. Throughput comparison between proposed:ILP and proposed:Matrix schemes.

Fig. 7 compares the throughput of the investigated schemes which guarantee QoS of all the real-time flows in the second scenario. As one can see, our proposed:Matrix scheme outperforms M-LWDF and the scheme of [13] with β = 0 or 1. The improvement increases as the loss probability requirement increases. The reason is that our proposed:Matrix scheme takes loss probability requirements into consideration in calculating the minimum requested bandwidth of every real-time flow. As shown in Table III, neither M-LWDF nor the scheme of [13] (with β = 0 or 1) takes full advantage of the loss tolerance of real-time flows. By keeping the actual loss probabilities close to the requirements, our proposed scheme improves system throughput.
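The proportional-loss behaviour visible in Table III can be checked with a few lines of Python. The sketch below copies the proposed:Matrix columns of Table III and divides each measured loss probability by its requirement (10% for Type I, per Table II). The variable and function names are hypothetical.

```python
TYPE1_REQUIREMENT = 0.10  # Type I loss probability requirement (Table II)

# Type II requirement -> (PL,I, PL,II) for proposed:Matrix, from Table III
PROPOSED_MATRIX = {
    0.05: (0.1000, 0.0502),
    0.10: (0.1000, 0.1000),
    0.15: (0.1002, 0.1505),
    0.20: (0.1000, 0.2000),
    0.25: (0.1001, 0.2503),
}


def normalized_losses(rows, type1_requirement):
    """Ratio of measured loss probability to requirement for both flows."""
    return {
        req2: (p1 / type1_requirement, p2 / req2)
        for req2, (p1, p2) in rows.items()
    }


ratios = normalized_losses(PROPOSED_MATRIX, TYPE1_REQUIREMENT)
for req2, (r1, r2) in ratios.items():
    print(f"Type II requirement {req2:.2f}: Type I ratio {r1:.3f}, Type II ratio {r2:.3f}")
```

In every row both ratios are within about half a percent of 1, i.e., the two flows of an SS sit at the same normalized loss probability and that value is essentially the full tolerance allowed by the requirements.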


TABLE IV

NUMBER OF TYPE I AND TYPE II FLOWS WHICH MEET THEIR QOS REQUIREMENTS IN THE SECOND SCENARIO.

| Number of SSs | proposed:Matrix: Type I | proposed:Matrix: Type II | proposed:Matrix without pre-processor: Type I | proposed:Matrix without pre-processor: Type II | M-LWDF: Type I | M-LWDF: Type II |
|---|---|---|---|---|---|---|
| 10 | 10 | 10 | 10 | 10 | 10 | 10 |
| 20 | 20 | 20 | 20 | 20 | 19 | 13 |
| 30 | 12 | 30 | 12 | 12 | 28 | 14 |
| 40 | 16 | 40 | 16 | 16 | 30 | 16 |
| 50 | 20 | 50 | 20 | 20 | 32 | 20 |

Fig. 6. Loss probability comparison between proposed:ILP and proposed:Matrix schemes. (Vertical axis: loss probability in %.)

Fig. 7. Throughputs of various schemes in the second scenario. (Horizontal axis: loss probability requirements in %; vertical axis: throughput in Mbps; curves: scheme of [13] with β = 0 and β = 1, proposed:Matrix, and M-LWDF.)

To study the effect of the pre-processor, we conduct simulations for our proposed:Matrix scheme with and without the pre-processor. The results are shown in Table IV. For comparison, we also include simulation results of the M-LWDF scheme. In this table, the loss probability requirement of Type II real-time flows is chosen to be 10%. As one can see, the number of Type II flows which meet their QoS requirements with the pre-processor is much larger than that without the pre-processor when $|\Gamma_{RT}|$ is large. The reason is that, under the PL scheduler, the denominator of the running loss probability, i.e., $S_{n,k}[t] + L_{n,k}[t]$, is often smaller for a real-time flow with a smaller data arrival rate. As a result, a flow with a smaller data arrival rate tends to have a smaller minimum requested bandwidth and is more likely to be selected by the pre-processor. In our simulations, a flow of Type II has a smaller data arrival rate than a flow of Type I. When compared with M-LWDF, the proposed:Matrix scheme with the pre-processor yields more flows which meet their QoS requirements. One interesting observation is that M-LWDF favors Type I flows. This is because Type I flows have more stringent delay bounds than Type II flows, which implies that Type I flows are assigned higher priority than Type II flows when loss probability requirements are identical. We also conducted simulations for a scenario where all SSs are attached with two Type II flows. The loss probability requirement is 10% for one flow and 20% for the other. Results show that the pre-processor favors flows with the 20% loss probability requirement. This is intuitive because, under the same data arrival distribution, a flow with a larger loss probability requirement tends to have a smaller minimum requested bandwidth than one which has a smaller loss probability requirement. Owing to space limitations, we do not show these results.
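The statement that a Type II flow has a smaller data arrival rate than a Type I flow follows directly from the mean packet sizes and mean inter-arrival times in Table II. A minimal Python check, with the hypothetical helper name `mean_rate_bps`:

```python
def mean_rate_bps(mean_packet_bytes, mean_interarrival_ms):
    """Mean arrival rate in bit/s from mean packet size and inter-arrival time."""
    return mean_packet_bytes * 8 * 1000 / mean_interarrival_ms


# Values taken from Table II
type1_rate = mean_rate_bps(200, 20)  # Type I: 200 bytes every 20 ms
type2_rate = mean_rate_bps(267, 40)  # Type II: 267 bytes every 40 ms

print(type1_rate, type2_rate)
```

This gives a mean rate of 80 kbit/s for Type I and 53.4 kbit/s for Type II, so a Type II flow accumulates a smaller $S_{n,k}[t] + L_{n,k}[t]$ over the same number of frames.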

VI. CONCLUSION

We have presented in this paper an efficient resource allocation scheme which tries to maximize system throughput while providing QoS support to real-time traffic flows. The basic idea of our proposed scheme is to calculate a dynamic minimum requested bandwidth for each traffic flow and use it as a constraint in an optimization problem which maximizes system throughput. The minimum requested bandwidth is a function of the pre-defined loss probability and the running loss probability. In addition, a user-level PL scheduler is proposed to determine the bandwidth share for multiple real-time flows attached to the same SS. A pre-processor is adopted to maximize the number of real-time flows attached to each SS which meet their QoS requirements when the resource is not sufficient to provide every flow its minimum requested bandwidth. Computer simulations were conducted to evaluate the performance of our proposed scheme. Results show that the running loss probabilities of traffic flows attached to the same SS are effectively controlled to be proportional to their loss probability requirements. Moreover, compared with previous designs, our proposed scheme achieves higher throughput while providing QoS support. Although we present our designs for the long-time average of loss probabilities, the idea can be applied to other measurements such as an exponentially weighted moving average. How to design a pre-processor which meets users' needs is an interesting topic for further study. Evaluating how various performance measurements affect users' perceived satisfaction is another potential research topic.


APPENDIX A

PROOFS OF LEMMAS AND THEOREMS

Proof of Lemma 1: Lemma 1 is obviously true for $P^{\min}_{n,k}[t] \le P_{n,k} \le P^{\max}_{n,k}[t]$ because, in this case, we have $P^*_{n,k}[t] - P_{n,k} = 0$. For $P_{n,k} > P^{\max}_{n,k}[t]$, it holds that

$$|P^*_{n,k}[t] - P_{n,k}| = P_{n,k} - \frac{L_{n,k}[t-1] + Q^1_{n,k}[t]}{S_{n,k}[t-1] + L_{n,k}[t-1] + Q^1_{n,k}[t]} \le P_{n,k} - \frac{L_{n,k}[t-1] + (Q^1_{n,k}[t] - R_{n,k}[t])^+}{S_{n,k}[t-1] + L_{n,k}[t-1] + \max(R_{n,k}[t], Q^1_{n,k}[t])}$$

since $R_{n,k}[t] \ge 0$. Therefore, Lemma 1 is true for $P_{n,k} > P^{\max}_{n,k}[t]$. For $P_{n,k} < P^{\min}_{n,k}[t]$, we have

$$|P^*_{n,k}[t] - P_{n,k}| = \frac{L_{n,k}[t-1]}{S_{n,k}[t-1] + L_{n,k}[t-1] + Q_{n,k}[t]} - P_{n,k} \le \frac{L_{n,k}[t-1] + (Q^1_{n,k}[t] - R_{n,k}[t])^+}{S_{n,k}[t-1] + L_{n,k}[t-1] + \max(R_{n,k}[t], Q^1_{n,k}[t])} - P_{n,k}$$

since $R_{n,k}[t] \le Q_{n,k}[t]$. This completes the proof of Lemma 1.
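The quantities used in the proof can be made concrete with a small Python sketch. It assumes the running loss probability has the form that appears on the right-hand side of the two bounds above, and that $P^*_{n,k}[t]$ is the achievable loss probability closest to the requirement, which is how the three cases of the proof read. The function names are hypothetical.

```python
def running_loss(S, L, Q1, Q, R):
    """Running loss probability at the end of frame t after allocating R,
    in the form used in the proof of Lemma 1 (0 <= R <= Q assumed)."""
    if not 0 <= R <= Q:
        raise ValueError("allocation must satisfy 0 <= R <= Q")
    return (L + max(Q1 - R, 0.0)) / (S + L + max(R, Q1))


def closest_achievable(S, L, Q1, Q, P):
    """Achievable loss probability closest to the requirement P."""
    p_max = running_loss(S, L, Q1, Q, 0.0)  # nothing is served
    p_min = running_loss(S, L, Q1, Q, Q)    # the whole queue is served
    return min(max(P, p_min), p_max)


# Example flow: S = 90, L = 5, Q1 = 5, Q = 10
print(running_loss(90, 5, 5, 10, 0))    # P^max
print(running_loss(90, 5, 5, 10, 5))    # P^knee
print(running_loss(90, 5, 5, 10, 10))   # P^min
```

For this example the three breakpoints are 0.1, 0.05, and 5/105. A requirement of 0.07 lies between the minimum and maximum and is met exactly, whereas requirements of 0.2 and 0.01 are clipped to 0.1 and 5/105, respectively.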

Proof of Lemma 2: Let $R_{n,k}[t]$ and $P_{n,k}[t]$ be, respectively, the bandwidth allocated to and the resulting running loss probability of $f_{n,k}$ under our proposed PL scheduler. Further, let $R'_{n,k}[t]$ and $P'_{n,k}[t]$ be the same variables under some other scheduler. Assume that $\varphi = \arg\max_{1 \le k \le K_n} \frac{P_{n,k}[t]}{P_{n,k}}$. We shall prove

$$\frac{P_{n,\varphi}[t]}{P_{n,\varphi}} \le \max_{1 \le k \le K_n} \frac{P'_{n,k}[t]}{P_{n,k}}.$$

Let $U_Z$, $U_P$, and $U_A$ be the three sets such that flow $f_{n,k}$ is contained in $U_Z$, $U_P$, or $U_A$ iff $R_{n,k}[t] = 0$, $0 < R_{n,k}[t] < Q_{n,k}[t]$, or $R_{n,k}[t] = Q_{n,k}[t]$, respectively, under the proposed PL scheduler. Assume that $U_A = \emptyset$. Since $R_n[t] > 0$, it must hold that $\varphi \in U_P$. If $\frac{P_{n,\varphi}[t]}{P_{n,\varphi}} > \frac{P'_{n,\varphi}[t]}{P_{n,\varphi}}$, meaning that $R_{n,\varphi}[t] < R'_{n,\varphi}[t]$, there must exist $f_{n,k} \in U_P$ such that $R_{n,k}[t] > R'_{n,k}[t]$. Otherwise, equation (8) is violated. Since

$$\frac{P'_{n,k}[t]}{P_{n,k}} > \frac{P_{n,k}[t]}{P_{n,k}} = \frac{P_{n,\varphi}[t]}{P_{n,\varphi}},$$

Lemma 2 is true for this case. Consider the case $U_A \ne \emptyset$. The proposed PL scheduler allocates $R_{n,i}[t] = Q_{n,i}[t]$ to all $f_{n,i} \in U_A$, which implies $f_{n,\varphi}$ is in $U_A$ or can be selected from $U_A$, according to equation (7). Consequently, Lemma 2 is true because $R_{n,\varphi}[t] \ge R'_{n,\varphi}[t]$, which implies $\frac{P_{n,\varphi}[t]}{P_{n,\varphi}} \le \frac{P'_{n,\varphi}[t]}{P_{n,\varphi}}$.

Proof of Theorem 3: Assume that there exists a scheduler which can guarantee the loss probability requirements of all the $K_n$ traffic flows. In other words, it holds that $\frac{P'_{n,k}[t]}{P_{n,k}} \le 1$, $1 \le k \le K_n$, where $P'_{n,k}[t]$ is the loss probability of flow $f_{n,k}$ at the end of the $t$th frame under the considered scheduler. Let $P_{n,k}[t]$ be the loss probability of flow $f_{n,k}$ at the end of the $t$th frame under the PL scheduler. According to Lemma 2, we have

$$\frac{P_{n,k}[t]}{P_{n,k}} \le \max_{1 \le i \le K_n} \frac{P'_{n,i}[t]}{P_{n,i}} \le 1, \quad 1 \le k \le K_n,$$

and, therefore, Theorem 3 is true.

Proof of Lemma 4: Lemma 4 can be easily verified with the calculation results shown in Table I.

Proof of Theorem 5: We prove Theorem 5 for $\Delta R_n[t] \ge 0$. The other case can be proved similarly. Let $V_Z$, $V_P$, and $V_A$ be three sets such that $f_{n,k}$ is in $V_Z$, $V_P$, or $V_A$ iff $R^*_{n,k}[t] = 0$, $0 < R^*_{n,k}[t] < Q_{n,k}[t]$, or $R^*_{n,k}[t] = Q_{n,k}[t]$, respectively. Similarly, $f_{n,k}$ is in $U_Z$, $U_P$, or $U_A$ iff $R_{n,k}[t] = 0$, $0 < R_{n,k}[t] < Q_{n,k}[t]$, or $R_{n,k}[t] = Q_{n,k}[t]$, respectively. Recall that equations (7) and (8) are satisfied under the PL scheduler. Assume that $\Delta R_{n,i}[t] < 0$ for some flow $f_{n,i}$. Since $\Delta R_n[t] \ge 0$, there must be some other $f_{n,j}$ with $\Delta R_{n,j}[t] > 0$. The assumption $\Delta R_{n,i}[t] < 0$ implies $f_{n,i} \in V_P \cup V_A$, and $\Delta R_{n,j}[t] > 0$ implies $f_{n,j} \in V_Z \cup V_P$. From Lemma 4, we have

$$\frac{P^*_{n,i}[t]}{P_{n,i}} \ge \frac{P^*_{n,j}[t]}{P_{n,j}}.$$

The assumption $\Delta R_{n,i}[t] < 0$ also implies $f_{n,i} \in U_Z \cup U_P$, and $\Delta R_{n,j}[t] > 0$ implies $f_{n,j} \in U_P \cup U_A$. According to equation (7), we have $\frac{P_{n,i}[t]}{P_{n,i}} \le \frac{P_{n,j}[t]}{P_{n,j}}$, a contradiction, because $P_{n,k}[t]$ is a strictly decreasing function of $R_{n,k}[t]$ for $0 \le R_{n,k}[t] \le Q_{n,k}[t]$, which together with $\frac{P^*_{n,i}[t]}{P_{n,i}} \ge \frac{P^*_{n,j}[t]}{P_{n,j}}$, $\Delta R_{n,i}[t] < 0$, and $\Delta R_{n,j}[t] > 0$ implies $\frac{P_{n,i}[t]}{P_{n,i}} > \frac{P_{n,j}[t]}{P_{n,j}}$. This proves Theorem 5.

APPENDIX B

DERIVATION OF $P^F_n[t]$

Given $P^F_n[t]$, one can compute $h_{n,k}(P^F_n[t] \cdot P_{n,k}; t)$ based on equation (9) for any $f_{n,k} \in U_{P1} \cup U_{P2}$. Substituting $h_{n,k}(P^F_n[t] \cdot P_{n,k}; t)$ into

$$\sum_{f_{n,k} \in U_{P1} \cup U_{P2}} h_{n,k}(P^F_n[t] \cdot P_{n,k}; t) = R_n[t] - \sum_{f_{n,k} \in U_A} Q_{n,k}[t],$$

we get $A \cdot (P^F_n[t])^2 + B \cdot P^F_n[t] + C = 0$, where

$$A = \sum_{f_{n,k} \in U_{P1}} P_{n,k} \cdot (S_{n,k}[t-1] + L_{n,k}[t-1] + Q^1_{n,k}[t]),$$

$$B = R_n[t] + \sum_{f_{n,k} \in U_{P2}} (S_{n,k}[t-1] + L_{n,k}[t-1]) - \sum_{f_{n,k} \in U_A} Q_{n,k}[t] - \sum_{f_{n,k} \in U_{P1}} (L_{n,k}[t-1] + Q^1_{n,k}[t]),$$

and

$$C = -\sum_{f_{n,k} \in U_{P2}} \frac{L_{n,k}[t-1]}{P_{n,k}}.$$

If $U_{P1} = \emptyset$, which implies $A = 0$, $P^F_n[t]$ can be obtained by $P^F_n[t] = -\frac{C}{B}$. Assume that $A \ne 0$. In this case, we have

$$P^F_n[t] = \frac{-B + \sqrt{B^2 - 4AC}}{2A}$$

because $B^2 - 4AC \ge B^2$ and $P^F_n[t]$ must be non-negative.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their valuable comments, which led to improvement of the paper.

REFERENCES

[1] IEEE Standard for Local and Metropolitan Area Networks-Part 16: Air Interface for Fixed Broadband Wireless Access Systems, IEEE Std. 802.16-2009, May 2009.

[2] E. Dahlman, S. Parkvall, J. Skold, and P. Beming, 3G HSPA and LTE for Mobile Broadband. Academic, 2007.

[3] M. Kaneko, P. Popovski, and J. Dahl, “Proportional fairness in multi-carrier system with multi-slot frames: upper bound and user multiplexing algorithms,” IEEE Trans. Wireless Commun., vol. 7, no. 1, pp. 22–26, Jan. 2008.

[4] N. Ruangchaijatupon and Y. Ji, “Simple proportional fairness scheduling for OFDMA-based wireless systems,” in Proc. 2008 IEEE WCNC, pp. 1593–1597.

[5] N. Ruangchaijatupon and Y. Ji, “OFDMA resource allocation based on traffic class-oriented optimization,” IEICE Trans. Commun., vol. E92-B, no. 1, pp. 93–101, Jan. 2009.

[6] N. Ruangchaijatupon and Y. Ji, “Integrated approach to proportional fair resource allocation for multiclass services in an OFDMA system,” in Proc. 2009 IEEE GLOBECOM.

[7] D. S. W. Hui, V. K. N. Lau, and W. H. Lam, “Cross-layer design for OFDMA wireless systems with heterogeneous delay requirements,” IEEE Trans. Wireless Commun., vol. 6, no. 8, pp. 2872–2880, Aug. 2007.

[8] J. Jang and K. B. Lee, “Transmit power adaptation for multiuser OFDM system,” IEEE J. Sel. Areas Commun., vol. 21, no. 12, pp. 171–178, Feb. 2003.

[9] S. Shakkottai and A. L. Stolyar, “A study of scheduling algorithms for a mixture of real and non-real time data in HDR,” Bell Labs Tech. Memo., Aug. 2000.

[10] M. Andrews, K. Kumaran, K. Ramanan, A. L. Stolyar, P. Whiting, and R. Vijayakumar, “Providing quality of service over a shared wireless link,” IEEE Commun. Mag., vol. 39, no. 2, pp. 150–154, Feb. 2001.

[11] A. K. F. Khattab and K. M. F. Elsayed, “Opportunistic scheduling of delay sensitive traffic in OFDMA-based networks,” in Proc. 2006 IEEE WOWMOM, pp. 109–114.

[12] X. Zhu, J. Huo, C. Xu, and W. Ding, “QoS-guaranteed scheduling and resource allocation algorithm for IEEE 802.16 OFDMA system,” in

[13] Y. Kim, K. Son, and S. Chong, “QoS scheduling for heterogeneous traffic in OFDMA-based wireless systems,” in Proc. 2009 IEEE GLOBECOM.

[14] R. Chipalkatti, J. Kurose, and D. Towsley, “Scheduling policies for real-time and non-real-time traffic in a statistical multiplexer,” in Proc. 1989 IEEE INFOCOM, pp. 774–783.

[15] R. Yang, C. Yuan, and K. Yang, “Cross layer resource allocation of delay sensitive service in OFDMA wireless systems,” in Proc. 2008 IEEE ICCSC, pp. 862–866.

[16] V. Huang and W. Zhuang, “QoS-oriented packet scheduling for wireless multimedia CDMA communications,” IEEE Trans. Mobile Comput., vol. 3, no. 1, pp. 73–85, Jan. 2004.

[17] A. Frank, “On Kuhn’s Hungarian method—a tribute from Hungary,” Naval Research Logistics, vol. 52, no. 1, pp. 2–5, Dec. 2005.

[18] J. Mo and J. Walrand, “Fair end-to-end window-based congestion control,” IEEE/ACM Trans. Netw., vol. 8, no. 5, pp. 556–567, Oct. 2000.

[19] J. E. Beasley, Advances in Linear and Integer Programming. Oxford Science, 1996.

[20] A. Schrijver, Theory of Linear and Integer Programming. Wiley, 1986.

[21] “MPEG-4 and H.263 video traces for network performance evaluation,” Oct. 2006. Available: http://www.tkn.tu-berlin.de/research/trace/trace.html

Tsern-Huei Lee (S’86-M’87-SM’98) received the B.S. degree from National Taiwan University, Taipei, Taiwan, the M.S. degree from the University of California, Santa Barbara, and the Ph.D. degree from the University of Southern California, Los Angeles, in 1981, 1983, and 1987, respectively, all in electrical engineering.

Since 1987, he has been a member of the faculty of National Chiao Tung University, Hsinchu, Taiwan, where he is a professor in the Department of Electrical Engineering. He received an Outstanding Paper Award from the Institute of Chinese Engineers in 1991. During the past years, he has served as a consultant to various companies to develop large scale QoS-enabled frame-based switches/routers, integrated access devices, and unified threat management Internet appliances. His current research interests are in communication protocols, broadband switching systems, traffic management, wireless communications, and network security.

Yu-Wen Huang (S’07) was born in Gangshan District, Kaohsiung City, Taiwan, in 1982. He received the B.S. and M.S. degrees in communication engineering from National Chiao Tung University, Hsinchu, Taiwan, in 2004 and 2006, respectively. Currently, he is pursuing his Ph.D. degree at the same university. His current research interests include resource allocation, power management, and communication protocols in wireless networks.