Cross Layer Adaptation with QoS Guarantees for Wireless Scalable Video Streaming


IEEE COMMUNICATIONS LETTERS, VOL. 16, NO. 9, SEPTEMBER 2012 1349


Hsuan-Li Lin, Tung-Yu Wu, and Ching-Yao Huang, Member, IEEE

Abstract—In this paper, a cross-layer adaptation scheme is proposed for quality of service (QoS) provision in the scalable video streaming of high definition (HD) content. The cross-layer parameters, which comprise the video rate, the payload length of a packet, and the mode of the modulation and coding scheme (MCS), can be dynamically adapted to minimize the distortion of a video stream under a given delay bound. Based on the channel quality and rate-distortion parameters, the proposed scheme formulates the parameter selection as an optimization problem. Simulation results show that our approach guarantees video quality under QoS constraints.

Index Terms—Cross-layer design, payload length adaptation, wireless video transmission, geometric programming.

I. Introduction

WITH the recent development of wireless technologies, wireless high-definition (HD) video services have become possible for home entertainment systems. However, delivering HD video over fluctuating wireless channels remains challenging. To support HD video, WiMedia ultra wideband (UWB) [1], which supports high data rates, is considered. In addition, the scalable video coding (SVC) extension of H.264/AVC is applied to HD video streaming for its high compression efficiency and its ability to adapt the video rate to channel conditions and terminal types [2].

In this paper, the quality of service (QoS) of video streaming is defined as the video quality under a delay bound on the reception of each group of pictures (GOP). Because the receiver video buffer is limited, the delay bound for each HD GOP is relatively tight. Meeting the delay bound requires improving the throughput. The works in [3] and [4] jointly consider payload length adaptation and the selection of the modulation and coding scheme (MCS) to maximize the throughput. However, the volatility of the transmission rate can violate the delay bound and cause fluctuations in video quality, which is undesirable from the users' point of view. This volatility can be reduced by restricting the payload length and transmission modes to satisfy a packet error rate (PER) constraint [5], [6]. However, the effective throughput is then also reduced by inefficient transmission modes. Hence, a retransmission policy is considered in [7] to improve the QoS. The authors of [7] proposed a scheme that satisfies a frame error rate target while minimizing the required transmission time of a video frame by properly adjusting the payload length and MCS.

Manuscript received April 6, 2012. The associate editor coordinating the review of this letter and approving it for publication was J. van de Beek.

The authors are with the Department of Electronics Engineering & the Institute of Electronics, National Chiao-Tung University, Taiwan (e-mail: x3232.ee95g@nctu.edu.tw).

Digital Object Identifier 10.1109/LCOMM.2012.070512.120760

In this paper, we propose a cross-layer optimization scheme that employs retransmission. Instead of maximizing the effective throughput, the proposed scheme guarantees the HD video quality and smooth play-out of scalable video streaming by minimizing the video distortion under a delay bound. Since SVC enables video rate adaptation, the proposed scheme suggests a suitable video rate to the SVC extractor. It also provides the proper payload length and MCS to the medium access control (MAC) layer to improve the video quality. This cross-layer design helps ensure the quality of video transmission.

II. System Model

Consider a GOP of a video stream that must be delivered from a transmitter to a user within a delay bound. The quality of a compressed video stream can be measured by the mean squared error (MSE), which represents the differences between the reconstructed pixel values and the original pixel values. The parametric rate-distortion (RD) model in [8] is adopted in our proposed optimization framework. For a given video content, the MSE of a video stream in a GOP period can be estimated as

$$D = D_0 + \frac{\theta_0}{V - V_0} \tag{1}$$

where $D$ is the video distortion represented by the MSE, and the peak signal-to-noise ratio (PSNR) is given by $20 \log_{10}(255/\sqrt{D})$. $V$ is the video rate in a GOP period, and $D_0$ and $V_0$ are the distortion offset and video rate offset, respectively. As the video rate $V$ (bits/sec) increases, the MSE distortion decreases non-linearly. The curve-fitting parameters $D_0$, $\theta_0$, and $V_0$ in (1) can be obtained from the video extractor by the least-squares method based on a collection of $K$ empirical pairs of video rate $V_k$ and distortion $D_k$ [9]. These parameters are all positive constants within a GOP period, and they are updated periodically to track time-varying video content.
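As a minimal sketch of how (1) and the PSNR expression behave, the following Python snippet evaluates both for a few video rates. The parameter values `D0`, `THETA0`, and `V0` are hypothetical and chosen only for illustration; they are not fitted values from this letter.

```python
import math

def distortion_mse(video_rate, d0, theta0, v0):
    """Eq. (1): D = D0 + theta0 / (V - V0), defined for V > V0."""
    if video_rate <= v0:
        raise ValueError("video rate must exceed the rate offset V0")
    return d0 + theta0 / (video_rate - v0)

def psnr_db(mse):
    """PSNR = 20 * log10(255 / sqrt(D)) for 8-bit samples."""
    return 20.0 * math.log10(255.0 / math.sqrt(mse))

# Hypothetical curve-fitting parameters (assumptions, not from the paper).
D0, THETA0, V0 = 1.0, 2.0e7, 1.0e6

for rate in (5e6, 20e6, 80e6):
    d = distortion_mse(rate, D0, THETA0, V0)
    print(f"V = {rate:.0e} bps  MSE = {d:.3f}  PSNR = {psnr_db(d):.2f} dB")
```

The loop illustrates the non-linear decrease of the MSE with the video rate: each increase in rate buys a smaller reduction in distortion.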

Suppose the data size of a GOP is $G = nV/f$, where $n$ is the number of frames in a GOP and $f$ denotes the frame rate in frames per second (fps). Before transmission to the receiver, the information data of size $G$ is fragmented into $N_f$ packets of $L$ bytes. The transmission time of an $L$-byte fragmented packet at MCS-mode $m$ is given by

$$T_m = \frac{8L}{R_m} + T_O^m \tag{2}$$

where $R_m$ is the transmission rate corresponding to MCS-mode $m$ in bits per second, and $T_O^m$ is the transmission time of the overhead. This overhead consists of the transmission time of the layer headers and of the acknowledgement, either positive (ACK) or negative (NACK). The immediate ACK (Imm-ACK) scheme [1] is considered in this paper.
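The fragmentation and the packet time in (2) can be sketched as below. The rate of 200 Mb/s and the overhead of 20 microseconds are assumed values for illustration, and rounding the packet count up to an integer is an implementation choice of this sketch.

```python
import math

def num_packets(video_rate, frames_per_gop, fps, payload_bytes):
    """N_f = G / (8 L) with G = n V / f bits, rounded up to whole packets."""
    gop_bits = frames_per_gop * video_rate / fps
    return math.ceil(gop_bits / (8 * payload_bytes))

def packet_time(payload_bytes, rate_bps, overhead_s):
    """Eq. (2): T_m = 8 L / R_m + T_O^m, in seconds."""
    return 8 * payload_bytes / rate_bps + overhead_s

# Assumed example: 50 Mb/s video, 8-frame GOP at 50 fps, 2000-byte payload.
n_f = num_packets(50e6, 8, 50, 2000)
t_m = packet_time(2000, 200e6, 20e-6)
print(n_f, t_m)
```

A longer payload lowers the number of packets and amortizes the per-packet overhead, which is one side of the payload-length trade-off studied in Section III.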


In wireless video communication, there are two reasons for GOP reception failures: packet errors due to fluctuation of the wireless channel, and packet loss when the transmission time of a GOP exceeds the delay bound. If there is no constraint on the delay, all erroneous packets can be retransmitted, and the GOP will eventually be received successfully. However, the latency should be minimized for real-time services, so a delay bound must be considered. Hence, the outage rate $P_e$ is defined as the probability that the GOP is not completely delivered to the receiver within the delay bound. For HD streaming, the delay bound can be tight because of the limited buffer size at the receiver. We assume the wireless link is a memoryless packet erasure channel [4], so errors occur independently among packets. Considering all possible transmission paths under the retransmission policy, the outage rate is

$$P_e = \sum_{i=0}^{N_f-1} \binom{N_r}{i} P^i (1-P)^{N_r-i} \tag{3}$$

where $P$ is the packet success rate (PSR), and $N_r$ is the number of packets, including retransmissions, that can be accommodated within the delay bound. The difference between $N_r$ and $N_f$ is the upper limit on the number of retransmissions. The packet success rate for a convolutional code can be formulated [10] as

$$P = (1 - P_u^m)^{8L} \tag{4}$$

where $P_u^m$ is the union bound of the first-event error probability corresponding to MCS-mode $m$, calculated from the signal-to-noise ratio (SNR). The proposed scheme can also be applied to other channel codes, e.g., block codes or turbo codes, when a proper PSR estimate is available. From (2) and (3), the transmission time of a GOP must respect the delay bound, which is the transmission period allocated by a radio resource scheduler:

$$N_r \cdot T_m \le \text{delaybound} \tag{5}$$

Any packet of the GOP beyond the delay bound is discarded.
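The relations (3) to (5) can be evaluated numerically as in the sketch below. The binomial terms are computed in log space with `math.lgamma` so that large packet counts do not overflow; that is a choice made for this sketch, and the numeric inputs in the example are assumed values.

```python
import math

def packet_success_rate(p_u, payload_bytes):
    """Eq. (4): P = (1 - P_u^m)^(8 L)."""
    return (1.0 - p_u) ** (8 * payload_bytes)

def max_transmissions(delay_bound_s, packet_time_s):
    """Largest integer N_r with N_r * T_m <= delay bound, from Eq. (5)."""
    return math.floor(delay_bound_s / packet_time_s)

def outage_rate(n_f, n_r, psr):
    """Eq. (3): probability of fewer than N_f successes in N_r attempts."""
    if n_r < n_f or psr <= 0.0:
        return 1.0
    if psr >= 1.0:
        return 0.0
    log_p, log_q = math.log(psr), math.log1p(-psr)
    total = 0.0
    for i in range(n_f):
        # log of C(N_r, i) * P^i * (1 - P)^(N_r - i)
        log_term = (math.lgamma(n_r + 1) - math.lgamma(i + 1)
                    - math.lgamma(n_r - i + 1)
                    + i * log_p + (n_r - i) * log_q)
        total += math.exp(log_term)
    return total

# Assumed example: 500 packets of 2000 bytes, P_u = 1e-6, T_m = 100 us.
psr = packet_success_rate(1e-6, 2000)
n_r = max_transmissions(8 / 50, 1e-4)
print(psr, n_r, outage_rate(500, n_r, psr))
```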

III. Proposed Optimization Scheme

A. Video Distortion Minimization with Target Outage Rate and Delay Bound

In this subsection, we investigate the scenario in which a radio resource scheduler allocates a transmission period to a single user. In this scenario, the adaptation of the payload length $L$, video rate $V$, and MCS-mode $m$ can be solved systematically to minimize the video distortion $D$ subject to the target outage rate and delay bound. The target outage rate $P_{out}$ can be selected according to the video application.

Based on (1)-(5), the joint optimization problem with a fixed MCS-mode $m$ is formulated as

$$\begin{aligned} \min_{V,L} \quad & D_0 + \frac{\theta_0}{V - V_0} \\ \text{s.t.} \quad & \sum_{i=0}^{N_f-1} \binom{N_r}{i} P^i (1-P)^{N_r-i} \le P_{out} \\ & N_r \cdot T_m \le \text{delaybound} \\ & V \ge V_0 \\ & L \ge 0 \end{aligned} \tag{6}$$

where $N_f = \frac{nV}{8fL}$, and $P_{out}$ is the target outage rate. The minimum distortion is achieved by adapting the parameters dynamically to the channel conditions. To solve the optimization problem, we transform the outage rate constraint and the packet transmission time $T_m$ in (6) into explicit forms in terms of the design variables $V$, $L$, and $m$. We express $N_r$ as a function of $N_f$, $P$, and $P_{out}$ to avoid the complexity introduced by integer programming. The left side of the outage rate constraint is the cumulative distribution function (CDF) of a binomial distribution. The constraint can be approximated by the CDF of the standard normal distribution [11]:

$$P_{out} \ge \sum_{i=0}^{x} \binom{n}{i} P^i (1-P)^{n-i}, \quad \text{for } x \le n$$

$$\approx \begin{cases} \Phi\left(\sqrt{(4x+3)(1-P)} - \sqrt{(4n-4x-1)P}\right), & \text{for } 0.05 \le P \le 0.93; \\ \Phi\left(\sqrt{(2x+1)(1-P)} - 2\sqrt{(n-x)P}\right), & \text{for } P > 0.93. \end{cases} \tag{7}$$

where $x = N_f - 1$, $n = N_r$, and $\Phi$ is the CDF of the standard normal distribution. This approximation is valid for expressing $N_r$ when $N_f > 10$. In HD video services, a GOP is typically fragmented into many packets ($N_f \gg 10$), so this condition is met. Given the delay bound and fixed $L$, $m$, and $N_r$, a higher video rate in (6) means more packets, which decreases the video distortion but increases the outage rate. In other words, the minimum video distortion is reached when the outage rate approaches its upper bound $P_{out}$. Therefore, the equality in (7) holds, and we can let $s = \Phi^{-1}(P_{out})$, where $\Phi^{-1}$ denotes the inverse CDF of the standard normal distribution. After a small amount of manipulation, (7) can be rewritten as:

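$$N_r \approx N_f - \alpha + \frac{\left(s - \sqrt{(\beta N_f - 1)(1 - P)}\right)^2}{4P} \tag{8}$$

where $\alpha = 3/4$ and $\beta = 4$ for $0.05 \le P \le 0.93$, and $\alpha = 1$ and $\beta = 2$ for $P > 0.93$, from (7). By combining (2), (4), (6), and (8), the original problem in (6) becomes

$$\begin{aligned} \min_{V,L} \quad & D_0 + \frac{\theta_0}{V - V_0} \\ \text{s.t.} \quad & \left(N_f - \alpha + \frac{\left(s - \sqrt{(\beta N_f - 1)\left(1 - (1-p)^{8L}\right)}\right)^2}{4(1-p)^{8L}}\right) \cdot \left(\frac{8L}{R_m} + T_O^m\right) \le \text{delaybound} \\ & V \ge V_0 \\ & L \ge 0 \end{aligned} \tag{9}$$

where $p$ represents the error probability $P_u^m$ in (4). Note that $s$ is determined by the target outage rate.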
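A quick consistency check of (7) and (8) is sketched below: the value of $N_r$ obtained from (8) is substituted back into the normal approximation in (7), which should return the target outage rate. The function names are hypothetical, and `statistics.NormalDist` from the Python standard library supplies $\Phi$ and $\Phi^{-1}$.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def approx_outage(n_f, n_r, psr):
    """Normal approximation (7) with x = N_f - 1 and n = N_r."""
    x, n = n_f - 1, n_r
    if psr <= 0.93:
        arg = math.sqrt((4 * x + 3) * (1 - psr)) - math.sqrt((4 * n - 4 * x - 1) * psr)
    else:
        arg = math.sqrt((2 * x + 1) * (1 - psr)) - 2 * math.sqrt((n - x) * psr)
    return STD_NORMAL.cdf(arg)

def approx_n_r(n_f, psr, p_out):
    """Eq. (8): N_r ~ N_f - alpha + (s - sqrt((beta N_f - 1)(1 - P)))^2 / (4 P)."""
    s = STD_NORMAL.inv_cdf(p_out)           # negative when p_out < 0.5
    alpha, beta = (0.75, 4.0) if psr <= 0.93 else (1.0, 2.0)
    root = math.sqrt((beta * n_f - 1.0) * (1.0 - psr))
    return n_f - alpha + (s - root) ** 2 / (4.0 * psr)

# Assumed example: 500 packets, PSR 0.9, target outage 1e-3.
n_r = approx_n_r(500, 0.9, 1e-3)
print(n_r, approx_outage(500, n_r, 0.9))
```

The result of (8) is generally not an integer; treating $N_r$ as continuous is what removes the integer programming from the problem.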

Further, we convert the constraints and objective function in (9) into posynomials, where a posynomial is a sum of products of positive variables with positive coefficients and real exponents. Because the error probability $p$ is small enough ($p < 10^{-5}$) in our application, $(1-p)^{8L}$ is approximated by $1 - 8Lp$. The term $\alpha$ is neglected since $N_f \gg \alpha$, and $\beta N_f - 1$ is approximated by $\beta N_f$. The delay bound constraint in (9) then becomes

$$\left(N_f + \frac{\left(s - \sqrt{8\beta N_f L p}\right)^2}{4(1 - 8Lp)}\right) \cdot \left(\frac{8L}{R_m} + T_O^m\right) \le \text{delaybound} \tag{10}$$

Note that $s$ is negative, because the operating range of $P_{out}$ is below 0.5. Therefore, $\left(s - \sqrt{8\beta N_f L p}\right)^2$ is a posynomial. By introducing a new variable $k$, we can replace (10) by two posynomial constraints, as shown in Appendix A:

$$\left(N_f + \left(s - \sqrt{8\beta N_f L p}\right)^2 k\right) \cdot \left(\frac{8L}{R_m} + T_O^m\right) \le \text{delaybound} \tag{11}$$

$$\frac{1}{4k} + 8Lp \le 1 \tag{12}$$

Similarly, the objective function $D_0 + \frac{\theta_0}{V - V_0}$ can be replaced by

$$D_0 + \theta_0 q \le D \tag{13}$$

$$\frac{1}{q} + V_0 \le V \tag{14}$$

Combining (11)-(14), the optimization problem (9) is transformed into

$$\begin{aligned} \min_{V,L,k,q} \quad & D_0 + \theta_0 q \\ \text{s.t.} \quad & \left(N_f + \left(s - \sqrt{8\beta N_f L p}\right)^2 k\right) \cdot \left(\frac{8L}{R_m} + T_O^m\right) \le \text{delaybound} \\ & \frac{1}{4k} + 8Lp \le 1 \\ & \frac{1}{q} + V_0 \le V \\ & L \ge 0 \end{aligned} \tag{15}$$

where $N_f = \frac{nV}{8fL}$. Since all the variables and parameters are positive, the objective function and all constraints in (15) are posynomials. As a result, if the MCS-mode $m$ is fixed, the problem is a geometric programming (GP) problem, which can be converted to a convex optimization problem by a logarithmic change of variables and a logarithmic transformation of the objective and constraint functions. It can be solved efficiently for a global solution [12].
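For reference, the logarithmic change of variables can be written out as follows. This is the standard GP-to-convex transformation described in [12], stated in generic notation: the symbols $x_i$, $c_j$, $a_{ij}$, $y_i$, and $b_j$ are placeholders and are not quantities defined in this letter.

```latex
% Generic posynomial in positive variables x_1, ..., x_N with c_j > 0:
f(x) = \sum_{j=1}^{J} c_j \prod_{i=1}^{N} x_i^{a_{ij}}

% Substitute x_i = e^{y_i} and b_j = \log c_j, then take the logarithm:
\log f(e^{y}) = \log \sum_{j=1}^{J} \exp\!\Big( \sum_{i=1}^{N} a_{ij} y_i + b_j \Big)

% A posynomial constraint f(x) \le 1 becomes \log f(e^{y}) \le 0,
% which is convex in y because log-sum-exp of affine functions is convex.
```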

In summary, we first solve the GP problem in (15) to obtain the optimal video rate and payload length for each MCS-mode. Second, since the number of MCS-modes is finite in practice, we select the MCS-mode whose optimal video rate and payload length minimize the video distortion. These decisions are made from the delay bound, SNR, and target outage rate to ensure the QoS of HD video streaming.
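The two-step procedure can be illustrated with the sketch below. It does not call a GP solver: as a stand-in, it scans a grid of payload lengths and bisects on the video rate against the approximated constraint (10). Because (1) decreases in $V$, the largest feasible rate gives the smallest distortion, so $D_0$ and $\theta_0$ are not needed to rank candidates. The mode table, the payload grid, and the upper rate limit `v_max` are assumptions for illustration only.

```python
import math
from statistics import NormalDist

def gop_time(v, l, rate, t_o, p, s, n=8, fps=50.0):
    """Left-hand side of (10): approximate N_r multiplied by T_m from (2)."""
    psr = 1.0 - 8.0 * l * p                 # linearised packet success rate
    beta = 4.0 if psr <= 0.93 else 2.0      # constant selected as in (8)
    n_f = n * v / (8.0 * fps * l)           # packets per GOP
    n_r = n_f + (s - math.sqrt(8.0 * beta * n_f * l * p)) ** 2 / (4.0 * psr)
    return n_r * (8.0 * l / rate + t_o)

def best_for_mode(rate, t_o, p, p_out, delay, v0, v_max=4e8):
    """Return (V, L) with the largest feasible V for one MCS mode."""
    s = NormalDist().inv_cdf(p_out)         # negative for p_out < 0.5
    best_v, best_l = v0, None
    for l in range(256, 4097, 256):         # assumed payload grid, in bytes
        if 1.0 - 8.0 * l * p <= 0.05 or gop_time(v0, l, rate, t_o, p, s) > delay:
            continue                        # outside the range of (7), or infeasible
        lo, hi = v0, v_max
        for _ in range(60):                 # bisection: gop_time grows with V
            mid = 0.5 * (lo + hi)
            if gop_time(mid, l, rate, t_o, p, s) <= delay:
                lo = mid
            else:
                hi = mid
        if lo > best_v:
            best_v, best_l = lo, l
    return best_v, best_l

def select_mode(modes, p_out, delay, v0):
    """Pick the mode index whose best (V, L) has the highest V (lowest D)."""
    results = [best_for_mode(r, t, p, p_out, delay, v0) for (r, t, p) in modes]
    idx = max(range(len(modes)), key=lambda i: results[i][0])
    return idx, results[idx]

# Hypothetical modes: (rate in bit/s, overhead in s, error probability p).
MODES = [(200e6, 20e-6, 1e-6), (400e6, 20e-6, 1e-6)]
mode, (v_opt, l_opt) = select_mode(MODES, p_out=1e-3, delay=8 / 50, v0=1e6)
print(mode, round(v_opt / 1e6, 1), l_opt)
```

In a real link the error probability $p$ differs per mode and depends on the SNR, so a faster mode does not always win; the table above keeps $p$ equal only to keep the example small.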

B. Single-user Transmission Scenario with Target Outage Rate and Constrained Video Distortion

In this subsection, we investigate the minimum transmission duration, allocated by a radio resource scheduler, that satisfies the target outage rate and video quality. In this scenario, the adaptation of the payload length, video rate, and MCS-mode can also be solved systematically. The transmission duration is obtained from the delay bound constraint in (15). If the target video distortion is $D_{target}$, the optimization problem that minimizes the transmission period with a fixed MCS-mode $m$ is given by:

$$\begin{aligned} \min_{L,k} \quad & \left(\frac{nV_{target}}{8Lf} + \left(s - \sqrt{8\beta \frac{nV_{target}}{8Lf} L p}\right)^2 k\right) \cdot \left(\frac{8L}{R_m} + T_O^m\right) \\ \text{s.t.} \quad & \frac{1}{4k} + 8Lp \le 1 \\ & L \ge 0 \end{aligned} \tag{16}$$

where $V_{target} = \frac{\theta_0}{D_{target} - D_0} + V_0$. The objective function is a function of the payload length, which trades transmission time against outage rate. Because both the objective and the constraints are posynomials, this adaptation is also a GP problem, and it can be solved globally with a fixed MCS-mode $m$.

Fig. 1. Approximation of video rate adaptation with delay bound = 8/50 seconds at $P_{out} = 10^{-3}$ and $10^{-5}$, respectively. (Axes: SNR in dB versus video rate in bps; curves: exhaustive and approximated solutions for each $P_{out}$.)
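A small numerical sketch of (16) follows. It inverts (1) to obtain $V_{target}$ and then scans a payload grid for the shortest transmission period, taking constraint (12) with equality so that $k = 1/(4(1 - 8Lp))$. The grid, the link parameters, and the fixed choice $\beta = 2$ (which assumes $P > 0.93$ over the whole grid) are assumptions of this sketch; a GP solver would treat $L$ and $k$ as continuous.

```python
import math
from statistics import NormalDist

def target_rate(d_target, d0, theta0, v0):
    """V_target = theta0 / (D_target - D0) + V0, the inverse of Eq. (1)."""
    if d_target <= d0:
        raise ValueError("target distortion must exceed the offset D0")
    return theta0 / (d_target - d0) + v0

def period(l, v_target, rate, t_o, p, p_out, n=8, fps=50.0, beta=2.0):
    """Objective of (16) with k chosen so that (12) holds with equality."""
    s = NormalDist().inv_cdf(p_out)
    n_f = n * v_target / (8.0 * fps * l)
    k = 1.0 / (4.0 * (1.0 - 8.0 * l * p))
    n_r = n_f + (s - math.sqrt(8.0 * beta * n_f * l * p)) ** 2 * k
    return n_r * (8.0 * l / rate + t_o)

# Hypothetical RD parameters and link settings (not from the paper).
V_T = target_rate(d_target=6.0, d0=1.0, theta0=2.0e7, v0=1.0e6)
GRID = range(256, 4097, 256)                # payload lengths in bytes
best_l = min(GRID, key=lambda l: period(l, V_T, 200e6, 20e-6, 1e-6, 1e-3))
print(V_T, best_l, period(best_l, V_T, 200e6, 20e-6, 1e-6, 1e-3))
```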

IV. Result and Discussion

In our simulation, the HD video sample (1280 by 720 pixels) is encoded using the JSVM 9.19.9 reference code at a frame rate of 50 fps and a GOP length of 8 frames [13]. We set the delay bound of a GOP to 8/50 seconds and to 2/50 seconds to investigate the video rate allocation and PSNR performance. The maximum likelihood (ML) estimate of the SNR is used in this simulation, as in [14]. If the number of pilot samples $N$ is large enough, which can be achieved by continuously tracking the indoor wireless channel, the impact of the SNR estimation error over an additive white Gaussian noise (AWGN) channel is negligible.

Fig. 1 shows the video rates of the exact solution from (6) and of the approximation from the proposed scheme. The approximated video rate is close to the exact solution. The approximation error increases slightly at high SNR values: the approximation is accurate when the number of packets is large, but the payload length tends to increase at high SNR, so a GOP is split into fewer packets.

Fig. 2 compares the PSNR of the proposed scheme with that of a common approach, the restricted PER method, which sets a PER constraint and chooses the payload length that maximizes throughput [5], [6]. For a given MCS mode, the PSNR of the restricted PER method saturates because of the constrained payload length. The proposed scheme trades off the volatility of throughput against video quality. In addition, while the restricted PER method can hardly support video streaming at low SNR values (below 5 dB), the proposed scheme extends the operating range at low SNR values, which can enhance the radio coverage in indoor environments.

Fig. 2. PSNR under (a) delay bound = 8/50 seconds and (b) delay bound = 2/50 seconds at $P_{out} = 10^{-3}$ and $10^{-5}$, respectively. (Axes: SNR in dB versus PSNR in dB; curves: maximized throughput with restricted PER = $10^{-3}$ and $10^{-5}$, and the proposed scheme with $P_{out} = 10^{-3}$ and $10^{-5}$.)

V. Conclusion

We propose a cross-layer optimization scheme for HD scalable video streaming over wireless environments by jointly considering video content rate-distortion characteristics, channel conditions, transmission duration, and QoS constraints. We have shown that the proposed scheme can be approximated by a geometric programming problem, for which globally optimal solutions are guaranteed. The proposed scheme minimizes the video distortion by properly selecting the payload length, video rate, and MCS for applications with various target outage rates and delay bounds. In addition, we analyze the minimum duration of the transmission period under a target outage rate and a target video distortion. Furthermore, the proposed scheme can enhance the radio coverage at low SNR values.

Appendix A

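Claim 1: Let $r(x)$ be a monomial, let $p(x)$, $q(x)$, and $f(x)$ be generalized posynomials, and suppose $q(x) < r(x)$. The inequality

$$\frac{p(x)}{r(x) - q(x)} + f(x) \le k \tag{17}$$

can be replaced with the following constraints:

$$p(x) \cdot t + f(x) \le k \tag{18}$$

$$q(x) + \frac{1}{t} \le r(x) \tag{19}$$

for $k, t \in \mathbb{R}_+$, where $\mathbb{R}_+$ denotes the set of nonnegative real numbers.

Proof: The feasible set of (17) is

$$A = \left\{x \,\middle|\, \frac{p(x)}{r(x) - q(x)} \le k - f(x),\ \forall x \in \mathbb{R}_+\right\}$$

and the feasible sets of (18) and (19) are

$$B = \left\{x \,\middle|\, t \le \frac{k - f(x)}{p(x)},\ \forall x \in \mathbb{R}_+\right\}, \qquad C = \left\{x \,\middle|\, t \ge \frac{1}{r(x) - q(x)},\ \forall x \in \mathbb{R}_+\right\}$$

First, we show $A \subseteq B \cap C$. Let

$$D = \left\{x \,\middle|\, t = \frac{1}{r(x) - q(x)},\ \forall x \in \mathbb{R}_+\right\}$$

and $E = B \cap D$; hence

$$E = \left\{x \,\middle|\, \frac{1}{r(x) - q(x)} \le \frac{k - f(x)}{p(x)},\ \forall x \in \mathbb{R}_+\right\} = A$$

Since $D \subseteq C$, we have $E \subseteq B \cap C$, and hence $A \subseteq B \cap C$. Second, we show $A \supseteq B \cap C$. Let

$$F = B \cap C = \left\{x \,\middle|\, \frac{1}{r(x) - q(x)} \le t \le \frac{k - f(x)}{p(x)},\ \forall x \in \mathbb{R}_+\right\}$$

If $x \in F$, then $\frac{1}{r(x) - q(x)} \le \frac{k - f(x)}{p(x)}$ holds for all $x \in \mathbb{R}_+$, so $x \in A$. Hence $A \supseteq B \cap C$. As a result, $A = B \cap C$, and the proof is complete.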
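As a worked instance, one way to read how Claim 1 leads from (10) to (11) and (12) is sketched below. The identification of terms is an assumption made for illustration and is not spelled out in the text. Note that the claim's $p(x)$ is a generic posynomial, whereas $p$ without an argument is the error probability from (9); the auxiliary variable called $t$ in the claim is the one called $k$ in (11) and (12), and the right-hand side $k$ of (17) corresponds here to $\text{delaybound}/T_m$.

```latex
% Divide (10) by T_m = 8L/R_m + T_O^m > 0:
N_f + \frac{\left(s - \sqrt{8\beta N_f L p}\right)^2}{4 - 32 L p}
  \le \frac{\text{delaybound}}{T_m}

% Match with (17):
p(x) = \left(s - \sqrt{8\beta N_f L p}\right)^2, \quad
r(x) = 4, \quad q(x) = 32 L p, \quad f(x) = N_f

% Apply (18) and (19) with the auxiliary variable t:
N_f + \left(s - \sqrt{8\beta N_f L p}\right)^2 t \le \frac{\text{delaybound}}{T_m},
\qquad 32 L p + \frac{1}{t} \le 4

% Multiply the first by T_m, divide the second by 4, and rename t to k:
\left(N_f + \left(s - \sqrt{8\beta N_f L p}\right)^2 k\right) T_m \le \text{delaybound},
\qquad \frac{1}{4k} + 8 L p \le 1
```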

References

[1] Standard ECMA-368: High Rate Ultra Wideband PHY and MAC Standard, ECMA International, Dec. 2005.

[2] H. Schwarz, D. Marpe, and T. Wiegand, “Overview of the scalable video coding extension of H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 17, no. 9, pp. 1103–1120, Sep. 2007.

[3] T. Yoo and A. Goldsmith, “Throughput optimization using adaptive techniques,” Wireless Systems Lab, Stanford University, CA, USA, Tech. Rep., 2004.

[4] D. Qiao, S. Choi, and K. G. Shin, “Goodput analysis and link adaptation for IEEE 802.11a wireless LAN,” IEEE Trans. Mobile Comput., vol. 1, no. 4, pp. 580–589, June 2002.

[5] S. Choudhury and J. D. Gibson, “Throughput optimization for wireless LANs in the presence of packet error rate constraints,” IEEE Commun. Lett., vol. 12, no. 1, Jan. 2008.

[6] H.-H. Juan, H.-C. Huang, C. Huang, and T. Chiang, “Cross-layer mobile WiMAX MAC designs for the H.264/AVC scalable video coding,” ACM Wireless Networks, vol. 16, no. 1, pp. 113–123, Jan. 2010.

[7] T. Y. Wu, T. T. Chuang, and C. Y. Huang, “Optimal transmission of high definition video transmission in WiMedia systems,” ACM Wireless Networks, vol. 17, no. 2, pp. 291–303, 2011.

[8] K. Stuhlmuller, N. Farber, M. Link, and B. Girod, “Analysis of video transmission over lossy channels,” IEEE J. Sel. Areas Commun., vol. 18, no. 6, pp. 1012–1032, June 2000.

[9] X. Zhu, T. Schierl, T. Wiegand, and B. Girod, “Distributed media-aware rate allocation for video multicast over wireless networks,” IEEE Trans. Circuits Syst. Video Technol., vol. 21, no. 9, pp. 1181–1192, Sep. 2011.

[10] M. B. Pursley and D. J. Taipale, “Error probabilities for spread spectrum packet radio with convolutional codes and Viterbi decoding,” IEEE Trans. Commun., vol. COM-35, pp. 1–12, Jan. 1987.

[11] W. Molenaar, “Approximations to the Poisson, binomial and hypergeometric distribution functions,” Mathematical Centre Tracts, no. 31, Mathematisch Centrum, Amsterdam, 1970.

[12] S. P. Boyd, S. J. Kim, L. Vandenberghe, and A. Hassibi, A Tutorial on Geometric Programming, Information Systems Laboratory, Dept. of Elect. Eng., Stanford Univ., 2004.

[13] JVT, “H.264 SVC reference software (JSVM 9.16.9) and manual,” CVS server at garcon.ient.rwth-aachen.de, Jan. 2010.

[14] M. Mohammad and R. M. Buehrer, “On the impact of SNR estimation error on adaptive modulation,” IEEE Commun. Lett., vol. 9, no. 6, pp. 490–492, June 2005.