An Evaluation Framework for More Realistic Simulations of MPEG Video Transmission

CHIH-HENG KE1, CE-KUEN SHIEH2, WEN-SHYANG HWANG3 AND ARTUR ZIVIANI4

1 Department of Computer Science and Information Engineering
National Kinmen Institute of Technology
Kinmen, 892 Taiwan

2 Department of Electrical Engineering
National Cheng Kung University
Tainan, 701 Taiwan

3 Department of Electrical Engineering
National Kaohsiung University of Applied Sciences
Kaohsiung, 807 Taiwan

4 National Laboratory for Scientific Computing (LNCC)
Petrópolis, Rio de Janeiro, 25651-075 Brazil

We present a novel and complete tool-set for evaluating the delivery quality of MPEG video transmissions in simulations of a network environment. This tool-set is based on the EvalVid framework. We extend the connecting interfaces of EvalVid to replace its simple error simulation model by a more general network simulator like NS2. With this combination, researchers and practitioners in general can analyze through simulation the performance of real video streams, i.e., taking into account the video semantics, under a large range of network scenarios. To demonstrate the usefulness of our new tool-set, we point out that it enables the investigation of the relationship between two popular objective metrics for Quality of Service (QoS) assessment of video quality delivery: the PSNR (Peak Signal to Noise Ratio) and the fraction of decodable frames. The results show that the fraction of decodable frames reflects well the behavior of the PSNR metric, while being less time-consuming. Therefore, the fraction of decodable frames can be an alternative metric to objectively assess through simulations the delivery quality of transmission in a network of publicly available video trace files.

Keywords: network simulation, MPEG video, EvalVid, NS2, PSNR, the fraction of decodable frames

1. INTRODUCTION

The ever-increasing demand for multimedia distribution in the Internet motivates research on how to provide better-delivered video quality through IP-based networks [1]. Previous studies [2-7] often use publicly available real video traces to evaluate their proposed network mechanisms in a simulation environment [8-12]. Results are usually presented using different performance metrics, such as the packet/frame loss rate, packet/frame jitter [13], effective frame loss rate [8], picture quality rating (PQR) [13], and the fraction of decodable frames [9]. Nevertheless, packet loss or jitter rates are network performance metrics and may be insufficient to adequately rate the quality perceived by a (human) end user. Although effective frame loss rate, PQR, and the fraction of decodable frames are application-level Quality of Service (QoS) metrics, they are not as well known and accepted as MOS (Mean Opinion Scores) and PSNR (Peak Signal to Noise Ratio) [14]. Furthermore, it is hard to study the effects of proposed network mechanisms on different characteristics of the same video extensively because the encoding settings for the publicly available video traffic traces are limited. As a consequence, how to best simulate and evaluate the performance of video quality delivery in a simulated network environment is a recurring open issue in network simulation forums, such as [15].

Received January 9, 2006; revised June 19, 2006; accepted August 2, 2006.

EvalVid [16], a complete framework and tool-set for evaluation of the quality of video transmitted over a real or simulated communication network, provides packet/frame loss rate, packet/frame jitter, PSNR, and MOS metrics for video quality assessment purposes. The primary aim of EvalVid is to assist researchers or practitioners in evaluating their network designs or setups in terms of the perceived video quality by the end user. Nevertheless, the simulated environment provided by EvalVid is simply an error model to represent corrupted or missing packets in the real network. The lack of generalization of this simple error model causes problems for researchers or practitioners who seek to assess the delivered video quality to end users in more complex and realistic network scenarios. For example, when transmitting video packets via unicast over an IEEE 802.11 wireless network, the MAC layer at a sender will retransmit an unacknowledged packet at a maximum of N times before it gives up. The perceived correct rate at the application level is thus

$$P_{CORRECT} = \sum_{i=1}^{N} (1 - p)\, p^{i-1} = 1 - p^{N},$$

where N is the maximum number of retransmissions at the MAC layer and p is the packet error rate at the physical level. As a consequence, the application-level error rate is $p_{effective} = p^{N}$. In this kind of scenario, the results obtained from the original EvalVid framework are misleading since the simple error model does not take the retransmission mechanism into consideration.
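As a quick numerical check of the retransmission expression, the following minimal sketch sums the per-attempt success probabilities and compares the result with the closed form. The function names and the example values p = 0.2 and N = 4 are illustrative assumptions, not part of EvalVid or NS2, and errors on successive attempts are assumed to be independent, as the formula implies.

```python
def p_correct(p: float, n: int) -> float:
    """Probability that a packet is delivered within n MAC-layer attempts.

    Sums (1 - p) * p**(i - 1) for i = 1..n: the packet fails on the first
    i - 1 attempts and succeeds on attempt i (independent errors assumed).
    """
    return sum((1 - p) * p ** (i - 1) for i in range(1, n + 1))


def p_effective(p: float, n: int) -> float:
    """Application-level error rate: all n attempts fail."""
    return p ** n


if __name__ == "__main__":
    p, n = 0.2, 4  # illustrative values only
    # Both expressions should agree up to floating-point rounding.
    print(p_correct(p, n), 1 - p_effective(p, n))
```

With a physical-level error rate of 20% and four attempts, the application sees an error rate of only 0.2^4 = 0.16%, which is why an error model that ignores retransmissions can overstate the loss considerably.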

This paper integrates EvalVid with NS2 [17], a widely adopted network simulator. On the one hand, the resulting tool-set from this integration allows network researchers and practitioners to analyze their proposed new network designs in the presence of real video traffic in a straightforward way. On the other hand, mechanisms for enhancing the delivery quality of video streams can be evaluated in more complex simulated network scenarios, including characteristics like relatively large topologies, broadband access, limited bandwidth, wireless, node mobility, and whatever functionality is available at the network simulator. Furthermore, we use our new evaluation framework provided by this tool-set to investigate the relationship between two objective QoS assessment metrics: PSNR [18] and the fraction of decodable frames [9]. PSNR takes into account the video content and hence it is more time-consuming than the fraction of decodable frames, which is straightforward to compute. The new tool-set enables the analysis showing that the fraction of decodable frames can reflect the behavior of the PSNR metric adequately, while being less time-consuming.

To the best of our knowledge, no tool-set is publicly available to perform a comprehensive video quality evaluation of real video streams in a network simulation environment. We argue that the proposed tool-set enables more realistic simulations of video transmission in a dual sense. This tool-set enables video-coding or video-QoS technicians to simulate the effects of a more realistic network on the video sequence resulting from their coding or QoS scheme, respectively. Likewise, the proposed tool-set also enables networking operatives to evaluate the effects of real video streams on proposed network protocols, for instance. Indeed, we believe that our tool-set provides a convergence to more realistic simulations of video transmissions in the broad sense, thus enabling a large range of video transmissions in network scenarios to be evaluated. References [19-21] are examples that use this tool-set to evaluate their respective proposed mechanisms. This new proposed tool-set for evaluating the quality performance of network video transmissions is publicly available at [22].

The remainder of this paper is organized as follows. Section 2 provides a brief overview of EvalVid. Section 3 describes the developed connecting agents between EvalVid and NS2 as well as an improved fix YUV program to replace the conventional one. Section 4 analyzes the proposed QoS assessment framework for video streams using two examples to illustrate the video quality evaluation. Section 5 investigates the relationship between the QoS assessment metrics PSNR and the fraction of decodable frames. Finally, section 6 presents the concluding remarks.

2. OVERVIEW OF EVALVID

The structure of the EvalVid framework is shown in Fig. 1, redrawn from [16].

[Figure 1: block diagram linking the Source, Video Encoder, VS, Network (or Simulation, introducing loss/delay), Video Decoder, play-out buffer, and user, with the ET, FV, PSNR, and MOS tools operating on the video trace, sender trace, and receiver trace. Results: frame loss / frame jitter and user perceived quality.]

Fig. 1. Schematic illustration of the evaluation framework provided by EvalVid.

The main components of the evaluation framework are described as follows:

Source The video source can be either in the YUV QCIF (176 × 144) or in the YUV CIF (352 × 288) formats.

coding. It supports three kinds of MPEG4 codecs, namely the NCTU codec [23], ffmpeg [24], and Xvid [25]. The focus of this investigation is the NCTU codec for video coding purposes.

VS (Video Sender) The VS component reads the compressed video file from the output of the video encoder, fragments each large video frame into smaller segments, and then transmits these segments via UDP packets over a real or simulated network. If the network is a real link, the framework records the timestamp, the packet ID, and the packet payload size of each transmitted UDP packet in the sender trace file with the aid of third-party tools, such as tcp-dump [26] or win-dump [27]. If the network is simulated, the sender trace file is instead provided by the sending entity of the simulation. The VS component also generates a video trace file that contains information about every frame in the real video file. The video trace file and the sender trace file are later used for subsequent video quality evaluation. Examples of a video trace file and a sender trace file are shown in Tables 1 and 2, respectively. It can be seen that the packets with IDs 1 to 4 originate from the same video frame since their transmission times are equal.

Table 1. Example of video trace file.

Frame Number   Frame Type   Frame Size (bytes)   Number of UDP-packets   Sender Time
0              H            29                   1 segment               at 33 ms
1              I            3036                 4 segments              at 67 ms
2              P            659                  1 segment               at 99 ms
3              B            357                  1 segment               at 132 ms
4              B            374                  1 segment               at 165 ms
...            ...          ...                  ...                     ...

Table 2. Example of sender trace file.

Time stamp (sec)   Packet ID   Packet Type   Payload Size (bytes)
0.033333           0           udp           29
0.066666           1           udp           1000
0.066666           2           udp           1000
0.066666           3           udp           1000
0.066666           4           udp           36
0.099999           5           udp           659
0.133332           6           udp           357
0.166665           7           udp           374
...                ...         ...           ...
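The relationship between Tables 1 and 2 can be illustrated with a minimal sketch of the fragmentation step. It is not the VS implementation: the function name `fragment`, the 1000-byte maximum payload (inferred from Table 2), and the fixed frame interval of 0.033333 s are assumptions made for illustration.

```python
from typing import Iterator, List, Tuple

MAX_PAYLOAD = 1000  # bytes per UDP packet; assumption inferred from Table 2


def fragment(frame_sizes: List[int],
             frame_interval: float = 0.033333,
             max_payload: int = MAX_PAYLOAD) -> Iterator[Tuple[float, int, str, int]]:
    """Yield (time stamp, packet ID, packet type, payload size) records.

    Each frame is split into packets of at most max_payload bytes, and all
    packets of one frame share the frame's sending time.
    """
    packet_id = 0
    for n, size in enumerate(frame_sizes):
        send_time = round((n + 1) * frame_interval, 6)
        remaining = size
        while remaining > 0:
            payload = min(remaining, max_payload)
            yield (send_time, packet_id, "udp", payload)
            packet_id += 1
            remaining -= payload


if __name__ == "__main__":
    # Frame sizes taken from Table 1.
    for record in fragment([29, 3036, 659, 357, 374]):
        print(*record)
```

Under these assumptions, the 3036-byte I frame becomes packets 1 to 4 with payloads of 1000, 1000, 1000, and 36 bytes, all stamped with the same sending time, as in Table 2.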

ET (Evaluate Trace) Once the video transmission is over, the evaluation task begins. The evaluation takes place at the sender side. Therefore, the information about the timestamp, the packet ID, and the packet payload size available at the receiver has to be transported back to the sender. Based on the original encoded video file, the video trace file, the sender trace file, and the receiver trace file, the ET component creates a frame/packet loss and frame/packet jitter report and generates a reconstructed video file, which corresponds to the possibly corrupted video found at the receiver side as it would be reproduced to an end user. In principle, the generation of the potentially corrupted video can be regarded as a process of copying the original video trace file frame by frame, omitting frames indicated as lost or corrupted at the receiver side. Nevertheless, the generation of the possibly corrupted video is more complex than this, and the process is explained in more detail in section 3.2. Furthermore, the current version of the ET component implements the cumulative inter-frame jitter algorithm [8] for the play-out buffer. If a frame arrives later than its defined playback time, the frame is counted as a lost frame. This is an optional function. The size of the play-out buffer must also be set; otherwise it is assumed to be of infinite size.
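The frame-level bookkeeping described for ET can be sketched as follows. This is a simplified illustration, not the ET code and not the cumulative inter-frame jitter algorithm of [8]: it assumes that a frame is lost when any of its packets is missing from the receiver trace, that packet IDs are assigned consecutively in frame order (as in Tables 1 and 2), and that an optional per-frame playback deadline is supplied by the caller. The function name and data layout are hypothetical.

```python
from typing import Dict, List, Optional, Set


def lost_frames(packets_per_frame: List[int],
                received: Dict[int, float],
                deadlines: Optional[List[float]] = None) -> Set[int]:
    """Return the numbers of the frames considered lost.

    packets_per_frame[n]: number of UDP packets of frame n (>= 1), from the
                          video trace.
    received:             packet ID -> arrival time, from the receiver trace.
    deadlines[n]:         optional latest acceptable arrival time of frame n.
    """
    lost = set()
    next_id = 0
    for n, count in enumerate(packets_per_frame):
        ids = range(next_id, next_id + count)
        next_id += count
        if any(i not in received for i in ids):
            lost.add(n)  # at least one packet of the frame never arrived
        elif deadlines is not None and max(received[i] for i in ids) > deadlines[n]:
            lost.add(n)  # frame is complete but too late for play-out
    return lost


if __name__ == "__main__":
    # Packet 3 (part of frame 1) is missing; packet 6 (frame 3) arrives late.
    rx = {0: 0.05, 1: 0.08, 2: 0.08, 4: 0.09, 5: 0.12, 6: 0.30, 7: 0.19}
    print(lost_frames([1, 4, 1, 1, 1], rx))
    print(lost_frames([1, 4, 1, 1, 1], rx, [0.10, 0.15, 0.20, 0.25, 0.30]))
```

Omitting the deadlines corresponds to the infinite play-out buffer mentioned in the text, in which case only missing packets cause frame loss.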

FV (Fix Video) Digital video quality assessment is performed frame by frame. Therefore, the total number of video frames at the receiver side, including the erroneous frames, must be the same as that of the original video at the sender side. If the codec cannot handle missing frames, the FV component is used to tackle this problem by inserting the last successfully decoded frame in the place of each lost frame as an error concealment technique [28].
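The concealment rule used by FV can be sketched in a few lines. This is an illustration under stated assumptions rather than the FV program itself: lost frames are represented by `None`, the function name `conceal` is hypothetical, and lost frames at the very start of the sequence, which have no predecessor, are simply left as `None`.

```python
from typing import List, Optional, TypeVar

Frame = TypeVar("Frame")


def conceal(decoded: List[Optional[Frame]]) -> List[Optional[Frame]]:
    """Replace each lost frame (None) with the last successfully decoded one.

    The output has the same length as the input, so it can be compared
    frame by frame with the original sequence.
    """
    fixed = []
    last = None
    for frame in decoded:
        if frame is not None:
            last = frame  # remember the most recent good frame
        fixed.append(last)
    return fixed


if __name__ == "__main__":
    print(conceal(["f0", None, None, "f3", None]))
```

Because the frame count is preserved, a frame-by-frame quality metric can then be applied to the concealed sequence.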

PSNR (Peak Signal to Noise Ratio) PSNR is one of the most widespread objective metrics to assess the application-level QoS of video transmissions. The following equation shows the definition of the PSNR between the luminance component Y of source image S and destination image D:

$$PSNR(n)_{dB} = 20 \log_{10} \left( \frac{V_{peak}}{\sqrt{\dfrac{1}{N_{col} N_{row}} \sum_{i=0}^{N_{col}} \sum_{j=0}^{N_{row}} \left[ Y_S(n, i, j) - Y_D(n, i, j) \right]^2}} \right)$$

where $V_{peak} = 2^k - 1$ and k = number of bits per pixel (luminance component). PSNR measures the error between a reconstructed image and the original one. Prior to transmission, it is possible to compute a reference PSNR value sequence on the reconstruction of the encoded video as compared to the original raw video. After transmission, the PSNR is computed at the receiver for the reconstructed video of the possibly corrupted video sequence received. The individual PSNR values at the source or receiver do not mean much, but the difference between the quality of the encoded video at the source and the received one can be used as an objective QoS metric to assess the transmission impact on video quality at the application level.
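The PSNR definition can be illustrated with a minimal sketch for a single frame. It is not the EvalVid PSNR tool: the luminance planes are assumed to be plain nested lists of integers of equal size, the double sum is taken over every pixel of the plane, and returning infinity for identical frames is a choice made here for illustration.

```python
import math
from typing import Sequence


def psnr(src: Sequence[Sequence[int]],
         dst: Sequence[Sequence[int]],
         bits: int = 8) -> float:
    """PSNR in dB between two luminance planes of equal size."""
    v_peak = 2 ** bits - 1
    n_pixels = sum(len(row) for row in src)
    sq_err = sum((s - d) ** 2
                 for row_s, row_d in zip(src, dst)
                 for s, d in zip(row_s, row_d))
    if sq_err == 0:
        return math.inf  # identical frames: the ratio is unbounded
    rmse = math.sqrt(sq_err / n_pixels)
    return 20 * math.log10(v_peak / rmse)


if __name__ == "__main__":
    original = [[16, 16], [16, 16]]
    received = [[17, 17], [17, 17]]  # every pixel off by one
    print(psnr(original, received))
```

Applying the function once to the encoded video before transmission and once to the reconstructed video after transmission gives the two value sequences whose difference the text proposes as the QoS metric.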

Table 3. Possible PSNR to MOS conversion [29].

PSNR [dB]   MOS
> 37        5 (Excellent)
31-37       4 (Good)
25-31       3 (Fair)
20-25       2 (Poor)
< 20        1 (Bad)
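The conversion in Table 3 can be written as a small lookup function. Table 3 does not say which class the shared boundary values 31 and 25 belong to, so this sketch assumes that each range includes its lower bound; that choice, like the function name, is an assumption made for illustration.

```python
def psnr_to_mos(psnr_db: float) -> int:
    """Map a PSNR value in dB to a MOS grade following Table 3.

    Boundary handling (lower bound inclusive for the middle ranges) is an
    assumption; Table 3 only fixes "> 37" and "< 20".
    """
    if psnr_db > 37:
        return 5  # Excellent
    if psnr_db >= 31:
        return 4  # Good
    if psnr_db >= 25:
        return 3  # Fair
    if psnr_db >= 20:
        return 2  # Poor
    return 1      # Bad


if __name__ == "__main__":
    for value in (40.0, 33.5, 27.0, 22.0, 15.0):
        print(value, psnr_to_mos(value))
```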

ing modes and the agents we developed, the current framework is not suitable for video transmission over bi-directional channels. The video encoding parameters cannot be changed during simulation time, so researchers interested in rate-adaptive design can refer to [34] for more information. In the future, we will incorporate more codecs into the framework and support scalable video coding and multiple description coding (MDC). The prototype of a multiple description coding evaluation framework is publicly available at [35]. Researchers interested in multiple-path transport and load balance designs can try this prototype framework for preliminary evaluation.

REFERENCES

1. S. F. Chang and A. Vetro, “Video adaptation: concepts, technologies, and open issues,” in Proceedings of the IEEE, Vol. 93, 2005, pp. 148-158.

2. F. H. P. Fitzek and M. Reisslein, “MPEG-4 and H.263 video traces for network performance evaluation,” IEEE Network, Vol. 15, 2001, pp. 40-54.

3. P. Seeling, M. Reisslein, and B. Kulapala, “Network performance evaluation using frame size and quality traces of single-layer and two-layer video: a tutorial,” IEEE Communications Surveys and Tutorials, Vol. 6, 2004, pp. 58-78.

4. Traffic trace from Mark Garrett’s MPEG encoding of the Star Wars movie, http://www.research.att.com/~breslau/vint/trace.html.

5. Video traffic generator based on TES (Transform Expand Sample) model of MPEG4 trace files, contributed by Ashraf Matrawy and Ioannis Lambadaris. It generates traffic that has the same first and second order statistics as an original MPEG4 trace, http://www.sce.carleton.ca/~amatrawy/mpeg4.

6. O. Rose, “Statistical properties of MPEG video traffic and their impact on traffic modeling in ATM systems,” Report No. 101, Institute of Computer Science, University of Würzburg, Germany, 1995.

7. D. Saparilla, K. Ross, and M. Reisslein, “Periodic broadcasting with VBR-encoded video,” in Proceedings of IEEE INFOCOM, 1999, pp. 464-471.

8. L. Tionardi and F. Hartanto, “The use of cumulative inter-frame jitter for adapting video transmission rate,” in Proceedings of the Conference on Convergent Technologies for Asia-Pacific Region, Vol. 1, 2003, pp. 364-368.

9. A. Ziviani, B. E. Wolfinger, J. F. Rezende, O. C. M. B. Duarte, and S. Fdida, “Joint adoption of QoS schemes for MPEG streams,” Multimedia Tools and Applications, Vol. 26, 2005, pp. 59-80.

10. J. M. H. Magalhaes and P. R. Guardieiro, “A new QoS mapping for streamed MPEG video over a DiffServ domain,” in Proceedings of the IEEE International Conference on Communications, Circuits and Systems and West Sino Expositions, 2002, pp. 675-679.

11. M. F. Alam, M. Atiquzzaman, and M. A. Karim, “Traffic shaping for MPEG video transmission over the next generation internet,” Computer Communications, Vol. 23, 2000, pp. 1336-1348.

12. N. E. Nasser and M. Al-Abdulmunem, “MPEG traffic over diffserv assured service,” in Proceedings of Asia-Pacific Conference on Communication, 2003, pp. 494-498.

13. J. Takahashi, H. Tode, and K. Murakami, “QoS Enhancement methods for MPEG video transmission on the Internet,” IEICE Transactions on Communications, Vol. E85-B, 2002, pp. 1020-1030.

14. F. A. Shaikh, S. McClellan, M. Singh, and S. K. Chakravarthy, “End-to-end testing of IP QoS mechanisms,” IEEE Computer Magazine, Vol. 35, 2002, pp. 80-87.

15. NS related mailing lists, http://www.isi.edu/nsnam/htdig/search.html.

16. J. Klaue, B. Rathke, and A. Wolisz, “EvalVid - A framework for video transmission and quality evaluation,” in Proceedings of the International Conference on Modelling Techniques and Tools for Computer Performance Evaluation, 2003, pp. 255-272.

17. NS, http://www.isi.edu/nsnam/ns/.

18. S. Olsson, M. Stroppiana, and J. Baina, “Objective methods for assessment of video quality: state of the art,” IEEE Transactions on Broadcasting, Vol. 43, 1997, pp. 487-495.

19. C. H. Ke, C. K. Shieh, W. S. Hwang, and A. Ziviani, “A two-markers system for improved MPEG video delivery in a DiffServ network,” IEEE Communications Letters, Vol. 9, 2005, pp. 381-383.

20. J. Naoum-Sawaya, B. Ghaddar, S. Khawam, H. Safa, H. Artail, and Z. Dawy, “Adaptive approach for QoS support in IEEE 802.11e wireless LAN,” in Proceedings of the IEEE International Conference on Wireless and Mobile Computing, Networking and Communications, 2005, pp. 167-173.

21. H. Huang, J. Ou, and D. Zhang, “Efficient multimedia transmission in mobile network by using PR-SCTP,” in Proceedings of the IASTED International Conference on Communications and Computer Networks, 2005, pp. 213-217.

22. http://hpds.ee.ncku.edu.tw/~smallko/ns2/Evalvid_in_NS2.htm.

23. NCTU codec, http://megaera.ee.nctu.edu.tw/mpeg.

24. ffmpeg, http://ffmpeg.sourceforge.net/index.php.

25. Xvid, http://www.xvid.org/.

26. tcp-dump, http://www.tcpdump.org.

27. win-dump, http://windump.polito.it.

28. Y. Wang and Q. F. Zhu, “Error control and concealment for video communication: a review,” in Proceedings of the IEEE, Vol. 86, 1998, pp. 974-997.

29. J. R. Ohm, Bildsignalverarbeitung fuer multimedia-systeme, Skript, 1999.

30. B. Carpenter and K. Nichols, “Differentiated services in the internet,” in Proceedings of the IEEE, Vol. 90, 2002, pp. 1479-1494.

31. J. Shin, J. Kim, and C. C. J. Kuo, “Quality of service mapping mechanism for packet video in differentiated services network,” IEEE Transactions on Multimedia, Vol. 3, 2001, pp. 219-231.

32. yuvviewer, http://eeweb.poly.edu/~yao/VideobookSampleData/video/application/YUVviewer.exe.

33. YUV video sequences (CIF), http://www.tkn.tu-berlin.de/research/evalvid/cif.html.

34. Evalvid-RA, http://www.item.ntnu.no/~arnelie/Evalvid-RA.htm.

35. Multiple description coding evaluation framework, http://hpds.ee.ncku.edu.tw/~smallko/ns2/MDC.htm.

Chih-Heng Ke (柯志亨) received his B.S. and Ph.D. degrees in Electrical Engineering from National Cheng-Kung University, in 1999 and 2007, respectively. He is an assistant professor of Computer Science and Information Engineering, National Kinmen Institute of Technology, Kinmen, Taiwan. His current research interests include multimedia communications, wireless networks, and QoS networks.

Ce-Kuen Shieh (謝錫堃) is currently a professor teaching in the Department of Electrical Engineering, National Cheng Kung University. He received his Ph.D., M.S., and B.S. degrees from the Electrical Engineering Department of National Cheng Kung University, Tainan, Taiwan. His current research areas include distributed and parallel processing systems, computer networking, and operating systems.

Wen-Shyang Hwang (黃文祥) received his B.S., M.S., and Ph.D. degrees in Electrical Engineering from National Cheng Kung University, Taiwan, in 1984, 1990 and 1996, respectively. He is a professor of Electrical Engineering, National Kaohsiung University of Applied Sciences, Taiwan. His current research focus includes multi-channel WDM networks, performance evaluation, QoS, RSVP, and WWW database applications.

Artur Ziviani received a B.Sc. in Electronics Engineering in 1998 and an M.Sc. in Electrical Engineering in 1999, both from the Federal University of Rio de Janeiro (UFRJ), Brazil. In 2003, he received a Ph.D. in Computer Science from the University of Paris 6, France, where he was also a lecturer from 2003 to 2004. Since 2004, he has been with the National Laboratory for Scientific Computing (LNCC), Brazil. His research interests include QoS, wireless computing, Internet measurements, and the application of networking technologies in telemedicine.