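Multimedia Architecture and Digital Video Watermarking Applications on the Internet -- Prof. Tihao Chiang (蔣迪豪) - Research on MPEG-4 Multimedia Communication Technology: Main Project (III)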

MPEG-21 provides a unified solution, Universal Multimedia Access (UMA), for constructing a multimedia content delivery and rights management framework. Based on the concepts of UMA, we build a simplified UMA model on the Internet. In this framework, the source video material is encoded and archived as FGS bitstreams. To support video content in different formats, we create a transcoder that converts the bitstream from the FGS format to an MPEG-4 Simple Profile format that fits the terminal capabilities. Moreover, a novel FGS coding scheme is presented to improve coding efficiency while retaining the robustness of FGS bitstreams for video streaming over the Internet. Consequently, the multimedia information can be streamed through the network without the network jitter and significant quality degradation found in current commercial implementations. To provide a stricter evaluation methodology under the common conditions specified for scalable coding, an FGS-based unicast streaming system is used as a test bed for scalability over the Internet.

(1) Architecture of the UMA Multimedia Delivery System

To achieve Universal Multimedia Access (UMA), we propose a video server that contains the key modules described in MPEG-21. In this model, we combine tools from MPEG-4 Fine Granularity Scalability (FGS), MPEG-4 Simple Profile, MPEG-7, digital watermarking techniques, and Internet protocols.

Based on the concepts of UMA, we build a simplified UMA model on the Internet. In this framework, the source video material is encoded and archived as FGS bitstreams. To adapt content to terminal capability and to support video content in different formats, we propose a real-time transcoding system that converts the FGS bitstreams into MPEG-4 Simple Profile bitstreams that fit the terminal capabilities. As shown in Fig. 16, the proposed system includes five modules. With the FGS bitstreams saved in the FGS Bitstream Archive module, the proposed system can serve heterogeneous terminals through the Internet. Moreover, the Channel Monitor adapts the resources allotted to each terminal according to the capabilities of the Internet connection and of the terminal device. Because the source video is encoded and archived as FGS bitstreams, we can provide various QoS levels, as SNR-scalable video coding schemes do.
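The adaptation decision described above can be sketched as follows. This is only an illustration under stated assumptions: the names `Terminal`, `Plan`, and `plan_delivery` are hypothetical and do not come from the report, and the rule used here (take the smaller of the terminal limit and the measured channel rate, then either truncate the FGS enhancement layer or hand the target rate to the FGS-to-Simple transcoder) is one plausible reading of how the Channel Monitor and transcoder might cooperate.

```python
from dataclasses import dataclass


@dataclass
class Terminal:
    """Capabilities a terminal reports to the server (hypothetical fields)."""
    supports_fgs: bool      # can the device decode FGS enhancement layers?
    max_bitrate_bps: int    # highest bitrate the device can handle


@dataclass
class Plan:
    """Delivery decision for one terminal (hypothetical structure)."""
    output_format: str        # "FGS" or "SimpleProfile"
    target_bitrate_bps: int   # total rate to deliver
    enhancement_bps: int      # enhancement-layer share (0 when transcoding)


def plan_delivery(terminal: Terminal, channel_bps: int, base_layer_bps: int) -> Plan:
    # The deliverable rate is bounded by both the terminal and the channel.
    target = min(terminal.max_bitrate_bps, channel_bps)
    if terminal.supports_fgs:
        # Send the base layer whole and truncate the enhancement layer
        # to whatever rate is left over (never negative).
        enh = max(0, target - base_layer_bps)
        return Plan("FGS", target, enh)
    # Otherwise transcode the archived FGS bitstream to Simple Profile
    # at the target rate; there is no separate enhancement layer.
    return Plan("SimpleProfile", target, 0)


if __name__ == "__main__":
    print(plan_delivery(Terminal(True, 1_000_000), 512_000, 128_000))
    print(plan_delivery(Terminal(False, 256_000), 512_000, 128_000))
```

Keeping the decision in one small function separates the feedback inputs (terminal capabilities, network condition) from the two server-side actions shown in Fig. 16, format conversion and rate control.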

(2) Efficient FGS-to-Simple Transcoding

To demonstrate the performance of our proposed UMA multimedia delivery system in Fig. 16, we build an FGS-to-Simple transcoder. In the proposed system, each sequence is pre-encoded and stored in the FGS Bitstream Archive.

Three methods are considered for comparison:

1. A Simple Profile encoder using the original video sequence (SP_ME).

2. A cascaded transcoder using the complete FGS enhancement bitstream and motion vectors from the base layer bitstream (SP_MV).

3. An efficient transcoder using the complete FGS enhancement video and motion vectors from the base layer bitstream (FGS-to-SP).

The test video sequences, Foreman, News, and Container, are in CIF resolution and YUV format. The first frame is coded as an I-VOP and the remaining frames are coded as P-VOPs at 30 Hz.

For the FGS encoding, the quantization step size (QP) used in the base layer is set to 10 for I-VOPs and 12 for P-VOPs. The MPEG-4 Simple Profile encoder employs constant quantization, with QP drawn from the set {5, 7, 14, 21, 28}. As shown in Fig. 17, our transcoding scheme (FGS-to-SP) has negligible PSNR loss at low and medium bitrates and about 0.5 to 0.9 dB PSNR loss at high bitrates.

Figure 17. The performance of transcoding with the luminance components of the three video sequences, using various sources of motion vectors and different enhancement information. (Horizontal axis: bit rate in bits, 0 to 1,200,000.)
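For readers unfamiliar with the metric, the PSNR values compared here follow the standard definition for 8-bit samples, PSNR = 10 log10(255^2 / MSE), computed on the luminance plane. The sketch below is a minimal illustration of that definition; the function name `psnr_luma` and the flat-list representation of a frame are assumptions made for this example, not the evaluation code used in the report.

```python
import math


def psnr_luma(ref, test, peak=255):
    """PSNR in dB between two equally sized luminance planes.

    `ref` and `test` are flat sequences of sample values
    (a hypothetical representation chosen for this sketch).
    """
    if len(ref) != len(test) or len(ref) == 0:
        raise ValueError("planes must be non-empty and of equal size")
    # Mean squared error over all luminance samples.
    mse = sum((a - b) ** 2 for a, b in zip(ref, test)) / len(ref)
    if mse == 0:
        return math.inf  # identical planes
    return 10 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    decoded = [101, 99, 100, 100]
    print(f"{psnr_luma(original, decoded):.2f} dB")
```

A difference such as the 0.5 to 0.9 dB loss quoted above is then simply the gap between two such PSNR values measured at comparable bitrates.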

Figure 16. The application scenario of the proposed UMA multimedia delivery system that employs the archived FGS bitstream. (Diagram labels: [Base + Enh] Storage; Transcoder FGS-to-Simple; Enh Layer Rate Reduction; Network; Device 2. Feedback paths: send terminal capabilities to server for format conversion; send network condition to server for rate control.)

(3) FGS Streaming on the Internet

The coding efficiency of SRFGS is compared with that of RFGS and MPEG-4 Part 10 Advanced Video Coding (AVC). For the test conditions, we adopt the testing procedure specified by the MPEG Scalable Video Coding AHG. The sequences Tempete, Bus, and Container in CIF resolution are tested at four bitrate/frame-rate combinations: 128 kbps/15 fps, 256 kbps/15 fps, 512 kbps/30 fps, and 1024 kbps/30 fps. The AVC results use the JM42 test model with RD optimization and CABAC. Quarter-pixel motion vector accuracy is employed with a search range of 32 pixels, and four reference frames are used. Only one I-frame is used, at the beginning. The P-period is 3 at both 15 fps and 30 fps.

For RFGS and SRFGS, the base layer is coded with JM42. The test conditions are the same as for AVC, except that RD optimization is disabled and only one reference frame is used. At 30 fps, the P-period is 6 for Tempete and Container and 4 for Bus. At 15 fps, the P-period is 2. The bitplane and entropy coding are identical to those of MPEG-4 FGS. In SRFGS, two enhancement-layer loops are used for Tempete and Bus, and three for Container. A simple frame-level bit allocation with a truncation module is used in the streaming server to obtain the optimized quality under the given bandwidth budget.
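A frame-level bit allocation with truncation can be sketched as below. The report does not state the allocation rule, so this example assumes the simplest one: every frame receives the same budget (bitrate divided by frame rate), the base layer is always sent whole, and the FGS enhancement layer, which can be cut at any point, is truncated to the remaining room. The function name `truncate_enhancement` and the per-frame sizes in the usage example are hypothetical.

```python
def truncate_enhancement(frames, bitrate_bps, fps):
    """Return the number of enhancement-layer bits to send for each frame.

    `frames` is a list of (base_bits, enh_bits) pairs, one per frame.
    Assumption for this sketch: a uniform per-frame budget, with no
    carry-over of unused bits between frames.
    """
    budget = bitrate_bps // fps  # bits available for each frame
    sent = []
    for base_bits, enh_bits in frames:
        # The base layer must be delivered in full; only the leftover
        # room can be spent on the enhancement layer.
        room = max(0, budget - base_bits)
        # Truncate the enhancement bitstream to the room available.
        sent.append(min(enh_bits, room))
    return sent


if __name__ == "__main__":
    # Hypothetical frame sizes in bits, streamed at 128 kbps and 15 fps.
    frames = [(4000, 20000), (9000, 1000), (3000, 2000)]
    print(truncate_enhancement(frames, 128_000, 15))
```

A uniform budget is easy to reason about but wastes bits on frames whose enhancement layer is smaller than the room available; an allocator that redistributes the surplus would be a natural refinement, and the report's own module may differ.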

The simulation results are shown in Fig. 18. Two RFGS results are shown: one with a lower reference bitrate (labeled RFGS_L) and the other with a higher reference bitrate (labeled RFGS_H). SRFGS performs similarly to RFGS_L at low bitrates and gains 1.7 to 3.0 dB at high bitrates, because SRFGS removes the temporal redundancy at high bitrates while RFGS_L does not. Compared with RFGS_H, SRFGS gains 0.4 to 1.0 dB at low bitrates, because RFGS_H suffers more drift error at low bitrates.

At high bitrates, SRFGS gains 0.8 dB on low-motion sequences such as Container and performs similarly on high-motion sequences such as Tempete and Bus. This is because in high-motion sequences the correlation between successive frames is lower, so the improved prediction technique in SRFGS helps less. At medium bitrates, SRFGS loses at most 0.15 dB relative to RFGS_H, because the increased dynamic range and the sign bits of each layer in SRFGS slightly degrade the coding efficiency. These simulation results show that while RFGS can be optimized at only one operating point, SRFGS can be optimized at several operating points to serve a much wider bandwidth range with superior performance.

Compared with AVC, SRFGS has a 0.4 to 1.5 dB loss at the base layer. This is because the motion vectors in SRFGS are derived by considering not only the base layer but also the enhancement-layer information. Furthermore, the high-quality prediction image for B-frames is not fully received at this bitrate. The PSNR loss is 0.7 to 2.0 dB at low bitrates and 2.0 to 2.7 dB at high bitrates.

Figure 18. PSNR versus bitrate comparison between the SRFGS, RFGS, and AVC coding schemes for the Y component. (Panel title: Bus CIF.)
