An Implementation of End-to-End Controlled Streaming System Using Similarity-based Frame Discarding Approach on DiffServ
Jau-Yuan Chen, Student Member, IEEE, Min-Way Hsu, and Cheng-Fu Chou
Abstract -- Video streaming technology has been widely used for various Internet services. In this paper, we present an implementation strategy of video streaming systems under end-to-end traffic controls, and we adapt this system to fit into the differentiated services (DiffServ) architecture. Unlike other previous works, we suggest that an end-to-end controlled streaming system is viable and easily deployed without any adaptation to core network routers. We will present experimental results to illustrate the effectiveness of our proposed system.
Index Terms - DiffServ, End-to-End, Streaming
I. INTRODUCTION
Video streaming technology has been widely used for various Internet services, such as Internet television programs and video-on-demand services. Rather than downloading the whole video content, the video player at the end host plays only the received parts of the video that can be decoded. However, streaming services are in essence highly sensitive to network congestion, packet loss and delay, which may arise from an overloaded core network; if any of these occurs, some of the received frames may not be decodable, and end users will soon and easily notice such an annoying situation. Some previous works [1], [2], [3], [4], [5] have been proposed to smooth the video streaming by sending parts of the video in advance when the available bandwidth is larger than the video bit rate, and clients have to pre-fetch the pre-sent parts into their own buffers. Nevertheless, such an approach works only when the average bandwidth is larger than the average video bit rate. In the current best-effort Internet, providing guarantees of any amount of bandwidth requires several upgrades to the routers in the current core networks, which makes it infeasible to implement.
Integrated services (IntServ) [6] and differentiated services (DiffServ) [7] are the most prominent architectures to support quality of service (QoS) in the Internet. The guaranteed service model of IntServ [8] provides per-packet delay guarantees and has to maintain per-flow states in core routers, which makes the IntServ model not scalable. DiffServ provides only per-flow bandwidth guarantees by keeping per-class states instead of per-flow states in core routers, and is more scalable [9]. To achieve QoS guarantees, admission control is required so that network resources are not over-allocated and can be utilized more efficiently.
In this paper, we present an implementation strategy of video streaming systems under end-to-end traffic controls, and we adapt this system to fit into the differentiated services architecture. Unlike other previous works, we suggest that an end-to-end controlled streaming system is viable and easily deployed without any adaptation to core network routers. In our system, each streaming video is pre-processed to obtain the degree of similarity between consecutive frames, and a frame similarity report is generated. Instead of sending all the frames of the streaming video as before, our system dynamically adjusts the bit rate according to the current network status and discards some of the frames of the video in the order of the frame priorities stated by the frame similarity report. Not only does our system utilize the bandwidth efficiently, but it also avoids overloading the Internet with traffic. In addition, thanks to the similarity-based frame dropping mechanism, we can provide smoother video streaming services than others when the network resources are highly constrained. Under the DiffServ architecture, we provide a differentiated service with the properties of consistency and controllability. In other words, the higher classes always receive better services, and the server can control the performance of each class and the admission to services based on some policies. We will present experimental results to illustrate the effectiveness of our proposed system.
This work was partially supported by the National Science Council and the Ministry of Education of ROC under the contract No. NSC92-2622-E-002-002 and 89E-FA06-24-8. Jau-Yuan Chen, Min-Way Hsu and Cheng-Fu Chou are with the Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan.
II. RELATED WORKS
Packet loss during a streaming service usually results from insufficient available bandwidth and congestion. In both cases, if the streaming server tries to transmit data at the original sending rate, many packets will be dropped along the transmission path and the quality of the received video declines.
A. Video Smoothing Approaches
Videos are usually compressed to reduce the necessary storage size and transmission time. However, compressed videos, especially variable-bit-rate (VBR) videos, typically exhibit significant bursts because of encoding schemes and content variations. In order to lower the bandwidth requirement when transmitting bursty videos, some video smoothing approaches [1], [2], [3], [4], [5] have been proposed. When the available bandwidth is larger than the video bit rate, the streaming server transmits some bursty parts of the video in advance along with the normal video stream, and the pre-fetched video is temporarily stored in the client-side buffer. Therefore, the bandwidth required is averaged to a smoother and lower value, as shown in Figure 1. However, the video smoothing approach is suitable only when the average available bandwidth is larger than the average video bit rate. If the average bandwidth is less than the average video bit rate, it is impossible to deliver the video stream without any information loss.
Fig. 1. Constraint Curves of Video Smoothing Transmission:
The transmission plan S stays between the upper (U) and the lower (L) smoothing constraints.
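The relationship among S, L and U can be illustrated with a small sketch. It assumes, as is common in the smoothing literature but not stated in this paper, that L(t) is the cumulative amount of data the client has consumed by slot t and that U(t) is L(t) plus a client buffer size, capped at the video length. The function names, the toy frame sizes and the buffer size are hypothetical.

```python
from itertools import accumulate

def smoothing_constraints(frame_sizes, buffer_size):
    """Return (L, U), the cumulative lower and upper constraints per slot.

    Assumption: L(t) is the data consumed by slot t (sending less causes
    buffer underflow); U(t) = min(L(t) + buffer_size, total) (sending more
    causes buffer overflow).
    """
    lower = list(accumulate(frame_sizes))
    total = lower[-1]
    upper = [min(l + buffer_size, total) for l in lower]
    return lower, upper

def is_feasible(plan, lower, upper):
    """A cumulative transmission plan S is feasible if L(t) <= S(t) <= U(t)."""
    return all(l <= s <= u for s, l, u in zip(plan, lower, upper))

def peak_rate(plan):
    """Largest amount sent in any single slot of a cumulative plan."""
    return max(b - a for a, b in zip([0] + plan[:-1], plan))

# Toy bursty video (sizes in KB per slot) and an 8 KB client buffer.
frame_sizes = [4, 2, 2, 10, 2, 2]
lower, upper = smoothing_constraints(frame_sizes, buffer_size=8)

# Sending 5 KB per slot ahead of the burst stays between L and U.
smooth_plan = [5, 10, 15, 20, 22, 22]
print(is_feasible(smooth_plan, lower, upper), peak_rate(smooth_plan), peak_rate(lower))
```

In this toy case the smoothed plan needs a lower peak rate than sending each frame in its own slot, but only because the average rate of the plan is not below the average rate of the video.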
B. Selective Frame Discarding Approaches
A simple and naive way to deliver a video stream is to transmit all frames at the required rates. However, in a bandwidth-constrained network, such a simple approach leads to packets being lost or arriving after their playback time because of network jitter, and in both cases, many frames cannot be decoded at the receiver and have to be dropped. Zhang [10] introduced the concept of the selective frame discarding (SFD) algorithm at the server, which preemptively discards frames by taking network constraints, QoS requirements, and some application-specific information, such as content information of a frame, into consideration, so that the server can make the best of network resources and reduce the probability of frame discarding. Ramanathan [11] proposed a simple and efficient scheme that exploits the inevitability of frame losses; he called the subset of packets of a frame that invalidates the frame once any of its packets is lost the “packet resiliency.” In Ramanathan's scheme, whenever the packet-loss threshold that invalidates a frame has been reached, all other packets that constitute this frame are discarded; thus, network bandwidth and buffer space are released and can be reallocated to other video channels, so more services can be admitted and served. Hemy [12] proposed an integrated MPEG streaming system with an MPEG-aware filter. The main function of this filter is to adjust the video bit rate by discarding frames according to the current network condition. The policy always discards B-frames first, then P-frames, and I-frames last.
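The frame-induced discarding idea attributed to [11] above can be sketched as follows. This is only an illustration of the description in this section, not the algorithm of [11] itself: the per-frame `loss_threshold` parameter and the function name are hypothetical.

```python
def forward_frame(packets_ok, loss_threshold):
    """Forward the packets of one frame until the frame becomes invalid.

    packets_ok: list of booleans, True if the packet survived so far.
    loss_threshold: number of lost packets that invalidates the frame
    (hypothetical parameter standing in for the loss limit of the frame).
    Returns (packets forwarded, whether the frame was invalidated).
    """
    lost = 0
    forwarded = 0
    for ok in packets_ok:
        if lost >= loss_threshold:
            # The frame can no longer be decoded: discard the remaining
            # packets so their bandwidth and buffer space are released.
            break
        if ok:
            forwarded += 1
        else:
            lost += 1
    return forwarded, lost >= loss_threshold

# One lost packet already invalidates this frame, so forwarding stops early.
print(forward_frame([True, False, True, True], loss_threshold=1))
# With a tolerance of two losses the frame stays valid and is fully forwarded.
print(forward_frame([True, False, True, True], loss_threshold=2))
```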
The above SFD mechanisms consider frame type, frame dependency and cost, but they do not mention other factors affecting human vision, such as the content relation between two consecutive frames. Furthermore, the continuity of discarded frames is becoming less important in video schemes with inter-frame dependency like MPEG. In such a video scheme, each frame has a different importance, and those with higher priority, for example, I-frames in MPEG video, are distributed evenly. In our frame discarding scheme, we not only consider the frame type and dependency, but also take the similarity between frames into account. A frame that is less similar to its previous frame is more important because it is likely to be the beginning of a scene. In addition, we adjust the decision window size dynamically to accommodate the network dynamics.
III. SYSTEM ARCHITECTURE
There are two main parts in our streaming system. The first is the similarity-based frame discarding module, which dynamically adapts the streaming bit rate; the other is the service control module, which controls service differentiation. In this section, we explain how the proposed system works, and we use two different strategies to deal with the situation in which the available resources are scarce.
A. System Overview
The architecture of our proposed system is shown in Figure 2. Most video streaming services establish connections over the user datagram protocol (UDP) or the real-time transport protocol (RTP), and thus compete unfairly for bandwidth with connections established over the transmission control protocol (TCP). Therefore, we adopt the widely accepted TCP-friendly rate control (TFRC) protocol [15], which is designed to remove this unfairness, to obtain the network status information. On the server side, the frame similarity report is pre-generated by calculating the similarity degrees between consecutive frames, and the similarity-based frame discarding (SimFD) module determines whether a frame will be sent to the client based on the TFRC report sent back from the client and the frame priority, which is decided from the similarity degree. In addition, the SimFD module should consider the policy running on the service control module to achieve DiffServ provisioning.
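For reference, the TFRC specification [15] derives the allowed sending rate from the TCP throughput equation, using the packet size, the round-trip time and the loss event rate reported by the receiver. The sketch below evaluates that equation; how the SimFD module maps the resulting rate to frame decisions is not shown here, and the function name and sample values are hypothetical.

```python
from math import sqrt

def tfrc_rate(s, rtt, p, b=1, t_rto=None):
    """TCP throughput equation used by TFRC (RFC 3448, Section 3.1).

    s: packet size in bytes, rtt: round-trip time in seconds,
    p: loss event rate (0 < p <= 1), b: packets acknowledged per ACK,
    t_rto: retransmission timeout; RFC 3448 suggests 4 * rtt.
    Returns the allowed sending rate in bytes per second.
    """
    if not 0 < p <= 1:
        raise ValueError("loss event rate must be in (0, 1]")
    if t_rto is None:
        t_rto = 4 * rtt
    denominator = (rtt * sqrt(2 * b * p / 3)
                   + t_rto * (3 * sqrt(3 * b * p / 8)) * p * (1 + 32 * p ** 2))
    return s / denominator

# The allowed rate drops as the reported loss event rate grows.
for loss in (0.001, 0.01, 0.05):
    print(loss, round(tfrc_rate(s=1000, rtt=0.1, p=loss)))
```

A server in the spirit of this section would treat the returned rate as the bandwidth budget for the next decision window.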
In the DiffServ architecture, all clients are divided into several classes, and we assume the number of classes is fixed. Besides, all clients have their own service level agreements (SLA) stating their service class. The aim of the service control module is to manage the policy to provide differentiated services. Under general situations, the service control module reports the user SLA information to the SimFD module, and the SimFD module dynamically adjusts the sending bit rate to the user in accordance with the service class. However, the best-effort Internet does not guarantee that resources are always available for services, and on the other hand, the system operator would like to keep services running for the most benefit; to balance the two, the service admission control policy should be carried out on the service control module.
Fig. 2. System Architecture
B. Similarity-based Frame Discarding
In this section, we introduce the proposed method to determine the similarity degree of each frame and the frame discarding policy taken by the SimFD module.
1) Video Similarity
During a streaming service, frames may arrive too late to be played or even be dropped in the core networks. When either of these situations occurs, clients will suffer an unsmooth video quality, and we found that the degradation of QoS is correlated with the similarity between the absent frame and the previous one. The more dissimilar they are, the more the QoS is hurt. A video is indeed a sequence of pictures, so-called frames, typically played at around thirty pictures per second. Therefore, we define the frame similarity with the peak signal-to-noise ratio (PSNR), which is usually used to estimate the difference between a reconstructed picture and the original one, as follows:
$$\mathrm{Similarity}(f(i), f'(i)) = \mathrm{PSNR}(f(i), f'(i))$$
where $f(i)$ and $f'(i)$ are the original frame and the received frame respectively. We denote the video similarity as follows:
$$\mathrm{Similarity}(V, V') = \frac{\sum_{i} \mathrm{Similarity}(V(i), V'(i))}{\text{number of frames}}$$
where $V$ and $V(i)$ represent the original frame sequence and its i-th frame; $V'$ and $V'(i)$ represent the received frame sequence and its i-th frame. In order to lighten the load on the server, the similarity report can be pre-generated on another host instead of being calculated on-line. The size of the frame sequence can be adjusted dynamically. A larger frame sequence could yield a better QoS, whereas a smaller frame sequence can react to network changes more directly.
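A minimal sketch of these two definitions follows. It makes several assumptions that are not part of the paper: frames are flat lists of 8-bit pixel values, a discarded frame is represented at the receiver by repeating the previous displayed frame, and the PSNR of two identical frames (which is unbounded) is clamped to a hypothetical cap of 100 dB so that the average stays finite.

```python
import math

def psnr(original, received, max_value=255, cap=100.0):
    """PSNR in dB between two frames given as equal-length pixel lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, received)) / len(original)
    if mse == 0:
        return cap  # identical frames: clamp the unbounded PSNR (assumption)
    return min(cap, 10 * math.log10(max_value ** 2 / mse))

def video_similarity(original_frames, received_frames):
    """Average per-frame similarity over a frame sequence."""
    scores = [psnr(f, g) for f, g in zip(original_frames, received_frames)]
    return sum(scores) / len(scores)

# Two tiny 4-pixel frames; the second one is dropped and concealed by
# repeating the first (assumed concealment strategy).
original = [[0, 0, 0, 0], [10, 10, 10, 10]]
received = [[0, 0, 0, 0], [0, 0, 0, 0]]
print(video_similarity(original, received))
```

The same `psnr` function applied to consecutive original frames would give the kind of per-frame similarity degree that a pre-generated similarity report could store.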
2) Frame Discarding Policy
A frame-based MPEG video, such as MPEG-1/2/4, has three types of frames: I-frame, B-frame, and P-frame; a group-of-pictures (GOP) is defined as a frame sequence starting with an I-frame. According to the inter-frame dependency, if an I-frame is discarded or lost, the entire GOP will be lost. Similarly, discarding a P-frame will lead to the loss of the P-frames and B-frames that depend on it, whereas no frame depends on a B-frame. Hence, when the network resources are insufficient during streaming, B-frames are always the first to be considered for discarding due to the inter-frame dependency, then P-frames, and I-frames last. When discarding I-frames and B-frames, the SimFD module selects frames by their similarity. When discarding P-frames, the SimFD module starts from the last one and continues in reverse order because of the dependency between P-frames. Discarding continues until there is enough bandwidth to transmit the frame sequence.
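The policy above can be sketched for one frame sequence as follows. The sketch assumes that, among frames of the same type, the frame most similar to its predecessor is discarded first (consistent with the earlier remark that less similar frames are more important), and that the bandwidth budget is expressed in bytes per frame sequence. The `Frame` record and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Frame:
    index: int         # position in the frame sequence
    kind: str          # "I", "P" or "B"
    size: int          # bytes
    similarity: float  # similarity to the previous frame, e.g. PSNR in dB

def discard_order(frames):
    """B-frames by descending similarity, then P-frames from the last one
    backwards, then I-frames by descending similarity."""
    b = sorted((f for f in frames if f.kind == "B"), key=lambda f: -f.similarity)
    p = sorted((f for f in frames if f.kind == "P"), key=lambda f: -f.index)
    i = sorted((f for f in frames if f.kind == "I"), key=lambda f: -f.similarity)
    return b + p + i

def select_frames(frames, budget):
    """Discard frames in policy order until the sequence fits the budget."""
    kept = {f.index for f in frames}
    total = sum(f.size for f in frames)
    for f in discard_order(frames):
        if total <= budget:
            break
        # All B-frames are gone before any P-frame is touched, so no kept
        # B-frame is left depending on a discarded P-frame.
        kept.discard(f.index)
        total -= f.size
    return [f for f in frames if f.index in kept]

gop = [Frame(0, "I", 100, 20.0), Frame(1, "B", 20, 40.0), Frame(2, "B", 20, 30.0),
       Frame(3, "P", 50, 35.0), Frame(4, "B", 20, 45.0), Frame(5, "B", 20, 25.0),
       Frame(6, "P", 50, 33.0)]
print([f.index for f in select_frames(gop, budget=230)])
print([f.index for f in select_frames(gop, budget=150)])
```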
C. Service Control
Our goal focuses on the implementation of a streaming system with end-to-end frame-based rate control, and on applying this system to the DiffServ architecture. In order to achieve DiffServ provisioning, we have to control the service in terms of the number of clients, client classes, etc.
The concept of proportional differentiation [13], [14] is widely used in various DiffServ services. The main idea is to keep the performance spacing between classes at a fixed ratio, that is, the higher class will receive better service at all times. However, this policy does not guarantee the QoS of any class. When the network resources are highly constrained, the QoS of all classes will diminish but still keep the spacing.
Assume there are n classes in our service system, where n is a finite and fixed number; R(i) denotes the service bit rate of class i, and should be a function of the number of clients. Then, the proportional differentiation is defined as follows:
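As an illustration of the fixed-ratio idea only, and not as the authors' definition, one reading is that the per-client rates of any two classes keep a constant ratio, R(i)/R(j) = w(i)/w(j), where the class weights w are hypothetical parameters. Under that assumption, the sketch below splits a given capacity among the active clients so that the spacing is preserved as clients join.

```python
def proportional_rates(capacity, clients_per_class, weights):
    """Per-client rate of each class such that R(i) / R(j) = w(i) / w(j)
    and the rates of all active clients add up to the capacity.

    clients_per_class and weights are parallel lists indexed by class;
    the weights are hypothetical differentiation parameters.
    """
    share = sum(n * w for n, w in zip(clients_per_class, weights))
    if share == 0:
        return [0.0] * len(weights)
    unit = capacity / share
    return [unit * w for w in weights]

# One client per class on a 1000 kbit/s path, weights 1 : 2 : 5.
print(proportional_rates(1000, [1, 1, 1], [1, 2, 5]))
# More clients shrink every rate, but the spacing between classes remains.
print(proportional_rates(1000, [4, 2, 2], [1, 2, 5]))
```

In this reading, R(i) depends on the number of clients in every class, which matches the remark that the service bit rate should be a function of the number of clients.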
IV. PERFORMANCE EVALUATION
Our experiments have two parts. First, we compare our SimFD scheme with previous SFD schemes under different bandwidth capacities in terms of video similarity; second, we show the viability of our integrated end-to-end controlled DiffServ streaming system.
A. Comparison of Different Frame Discarding Schemes
We perform our experiments on a three-node topology as shown in Figure 3. The bandwidth of the two links is the same and varies as a parameter in our simulation. The server node simulates a real streaming server by reading a pre-processed trace file and then sending packets to the client node. The trace file format is shown in Table I. We parse the video file and record essential information for video streaming, including frame type, frame size and video bit rate, into the trace file. Therefore, a simulated node can act as a real streaming server by reading this trace file. Then, the client in the simulation records received frames into another receiver file for us to calculate the video similarity between the original video and the received video.
In this simulation, we compare our SimFD scheme with the following two schemes: one considers only the frame type, the so-called “Type” scheme; the other considers not only the frame type but the inter-frame dependency as well, the so-called “TypeDepend” scheme. The streaming video bit rate is 300 Kbits/sec, about 38 KB/sec. The bandwidth capacity varies from 10 KB/sec to 40 KB/sec. According to the results shown in Figure 4, our SimFD scheme always performs better than the other two schemes, and it improves significantly when network resources are constrained.
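To put the tested capacities in perspective, the short calculation below converts the video bit rate to KB/sec and estimates the largest fraction of the video data that each capacity could carry. It assumes decimal units (1 KB = 1000 bytes, 8 bits per byte) and ignores protocol overhead; the function name is hypothetical.

```python
VIDEO_RATE_KBITS = 300  # streaming video bit rate from the experiment, Kbits/sec

def deliverable_fraction(capacity_kbytes, video_rate_kbits=VIDEO_RATE_KBITS):
    """Upper bound on the fraction of video data a link can carry."""
    video_rate_kbytes = video_rate_kbits / 8  # 300 Kbits/sec = 37.5 KB/sec
    return min(1.0, capacity_kbytes / video_rate_kbytes)

for capacity in (10, 20, 30, 40):
    print(capacity, round(deliverable_fraction(capacity), 2))
```

At the lowest capacity, well under a third of the data fits, so most frames must be discarded and the choice of which frames to keep dominates the received quality.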
Fig. 3. SimFD Scheme Simulation Topology
B. Integrated End-to-End DiffServ System
In this experiment, we integrate our SimFD module with a modified DiffServ module into an end-to-end controlled streaming system. The original DiffServ module has edge-router functions and core-router functions, and has to keep per-flow states to achieve the DiffServ goal. In our system, both edge and core routers only need to forward packets, and no state information is kept in routers. We use application-specific information and the TFRC report to dynamically control our traffic to carry out the DiffServ goal. Figure 5 illustrates our simulation topology. There are one server, three clients in three different classes, and one bottleneck link. All edge links have a capacity of 10 Mb/sec and the bottleneck link has 1 Mb/sec. Propagation delay is 1 ms on edge links and 10 ms on the bottleneck link. The end-to-end controlled mechanism is based on the TFRC protocol and the class information. Total simulation time is 1200 seconds. Initially, only flow 0 (from S to 0 in Figure 5) is active, and the other flows become active at 400-second intervals. Figure 6 shows the instantaneous rate of the flows. As a new flow joins, the control module dynamically adjusts the rates according to the TFRC reports and class information of each flow.
Fig. 4. Comparison of SimFD, TypeDepend, and Type Schemes (x-axis: Bottleneck Link Capacity, KB)
In order to show the viability of our system, we implement the streaming system shown in Figure 7. We use frame-based MPEG-4 video as the source, and each connection has two streams (part 1): one is the video stream and the other is the audio stream. The SdRate (part 2) shows the current sending bit rate of each streaming connection; this system also shows information about each client (part 3), including the file streamed, the service level, etc. Figure 8 and Figure 9 present the real video screens received by clients of the lower class and the higher class respectively. We can observe that higher-class clients have a much better quality than those in the lower class.
TABLE I (column headers: Video Name, Length, Rate, # of Frames, # of I-Frames, # of P-Frames, # of B-Frames)
Fig. 6. Rate of Different Classes of Flows (x-axis: Time (sec); one curve per class)
Fig. 7. Streaming Server
Fig. 8. Streaming Client (Lower Class)
Fig. 9. Streaming Client (Higher Class)
V. CONCLUSION
Streaming systems are very congestion-sensitive, and most currently proposed approaches have focused on controlling or monitoring the core networks, which usually requires significant upgrades to routers. We proposed a novel approach to implement a DiffServ-enabled streaming system by using a similarity-based frame discarding algorithm and an end-to-end controlled mechanism without any core state information. In addition, we have shown its viability in the real world.
REFERENCES
[1]. J. Rexford and D. Towsley, “Smoothing Variable-Bit-Rate Video in an Internetwork,” in IEEE/ACM Transactions on Networking, 1999, pp. 202-215.
[2]. W. Feng and J. Rexford, “A Comparison of Bandwidth Smoothing Techniques for the Transmission of Prerecorded Compressed Video,” in Proc. IEEE INFOCOM, pp. 58-66, April 1997.
[3]. M. Krunz and S.K. Tripathi, “On the Characteristics of VBR MPEG Streams,” in Proc. ACM SIGMETRICS, pp. 192-202, June 1997.
[4]. J. Rexford, S. Sen, J. Dey, W. Feng, J. Kurose, J. Stankovic, and D. Towsley, “Online Smoothing of Live, Variable-bit-rate Video,” in Proc. Workshop on Network and Operating System Support for Digital Audio and Video, pp. 249-257, May 1997.
[5]. J. Rexford and D. Towsley, “Smoothing Variable-Bit-Rate Video in an Internetwork,” in Proc. SPIE Symposium on Voice, Video, and Data Communications: Multimedia Networks: Security, Displays, Terminals, and Gateways, November 1997.
[6]. R. Braden, L. Zhang, S. Berson, S. Herzog, and S. Jamin, “Resource ReSerVation Protocol (RSVP) Version 1, Functional Specification,” IETF, RFC 2205, September 1997.
[7]. S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang and W. Weiss, “An Architecture for Differentiated Services,” RFC 2475, December 1998.
[8]. S. Shenker, C. Partridge, and R. Guerin, “Specification of Guaranteed Quality of Service,” IETF, RFC 2212, September 1997.
[9]. B. Davie, A. Charny, J.C.R. Bennett, K. Benson, J.Y. Le Boudec, et al., “An Expedited Forwarding PHB (Per-Hop Behavior),” IETF, RFC 3246, March 2002.
[10]. Z.-L. Zhang, S. Nelakuditi, R. Aggarwal, and R.-P. Tsang, “Efficient Selective Frame Discard Algorithms for Stored Video Delivery across Resource Constrained Networks,” Real-Time Imaging, Special Issue on Adaptive Real Time Multimedia Transmission over Packet Switching Networks, Vol. 7, No. 3, June 2001, pp. 255-273.
[11]. S. Ramanathan, P.V. Rangan, and H.M. Vin, “Frame-Induced Packet Discarding: An Efficient Strategy for Video Networking,” in Proc. of 4th NOSSDAV, Lancaster, England, November 1993.
[12]. M. Hemy, U. Hengartner, P. Steenkiste, and T. Gross, “MPEG System Streams in Best-Effort Networks,” Proc. IEEE Packet Video 1999, New York, April 1999.
[13]. C. Dovrolis and D. Stiliadis, “Relative Differentiated Services in the Internet: Issues and Mechanisms,” in ACM SIGMETRICS, May 1999. Short Paper.
[14]. C. Dovrolis, D. Stiliadis, and P. Ramanathan, “Proportional Differentiated Services: Delay Differentiation and Packet Scheduling,” IEEE/ACM Trans. on Networking, 10(1):12-26, 2002.
[15]. M. Handley, S. Floyd, J. Padhye, and J. Widmer, “TCP-friendly Rate Control (TFRC): Protocol Specification,” IETF, RFC 3448, January 2003.
Jau-Yuan Chen received the B.S. degree in computer
science from National Taiwan University (NTU) in 2003. He is currently a master's student at NTU and is a member of the Network Group in the Communication and Multimedia Laboratory. His research interests include applied topics in streaming systems, differentiated services, and overlay networks.
Min-Way Hsu received the B.S. degree from the Department of Computer Science and Information Engineering, National Taiwan University, in 2004. He is currently working at CyberLink Corp. His research interests are streaming systems and wireless multimedia networks.
Cheng-Fu Chou received the M.S. and Ph.D. degrees from the University of Maryland, College Park, in 1999 and 2002, respectively. In 2002, he joined the Department of Computer Science and Information Engineering, National Taiwan University. His current research interests are in wide-area networked applications, distributed multimedia systems, heterogeneous wireless communication systems, and wireless sensor networks and performance evaluation.