Error Resilience - A Study on Advanced Fine Granularity Scalability for Scalable Video Coding

For most applications that use MPEG-4 FGS [15], error resilience is desirable because the video may be transmitted over error-prone channels. For error resilience, MPEG-4 FGS [15] uses a re-sync marker technique. Specifically, a re-sync marker, defined as 23 consecutive 0s followed by a 1, is periodically coded at the macroblock level to prevent error propagation. In addition, the re-sync marker is followed by the location of a macroblock and other information needed to restart decoding. When errors occur, the decoder re-synchronizes by first searching for the next re-sync marker and then resuming decoding after it.
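The re-synchronization step described above can be sketched as a search for the marker pattern. This is only an illustration under simplifying assumptions: the bit-stream is modeled as a text string of '0' and '1' characters, the name find_resync is hypothetical, and the parsing of the macroblock location that follows the marker is omitted.

```python
# Illustrative sketch only; a real decoder works on packed bytes, not text.
RESYNC_MARKER = "0" * 23 + "1"  # 23 consecutive 0s followed by a 1


def find_resync(bits: str, start: int = 0) -> int:
    """Return the index of the first bit after the next re-sync marker.

    `bits` is a string of '0'/'1' characters (an assumption made for
    readability). Returns -1 when no marker is found at or after `start`.
    """
    pos = bits.find(RESYNC_MARKER, start)
    return -1 if pos < 0 else pos + len(RESYNC_MARKER)


if __name__ == "__main__":
    stream = "1101" + RESYNC_MARKER + "0110"
    resume_at = find_resync(stream)
    # Decoding would resume with the bits that follow the marker.
    print(resume_at, stream[resume_at:])
```

After an error is detected, a decoder following this idea would discard bits up to the returned index and restart from the information stored right after the marker.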

In addition to the re-sync marker, the bit-plane start code is also used for error resilience. In particular, the bit-plane start code serves not only for re-synchronization but also for signaling the location of a bit-plane. When several bit-planes of a region are lost, the bit-plane start code can stop error propagation and let decoding restart from a specific bit-plane.

Experimental results show that combining these techniques produces good results when errors occur.

2.6 Summary

In this chapter, we have reviewed the algorithm of MPEG-4 FGS [15]. For video applications in a heterogeneous environment, MPEG-4 FGS [15] provides DCT-based scalable coding using a layered approach. It compresses the video into a base layer and an enhancement layer.

The base layer offers a minimum guaranteed visual quality. Then the enhancement layer refines the quality over that offered by the base layer.

Currently, the base layer is coded by a non-scalable codec, while the enhancement layer is coded with embedded bit-plane coding. To produce an embedded bit-stream, the DCT coefficients of the enhancement layer are coded from the most significant (MSB) bit-plane to the least significant (LSB) bit-plane.

In each bit-plane, the coding proceeds in frame raster order with zigzag scanning of the coefficients. Because the bit-stream is embedded, the enhancement layer can be truncated at an arbitrary point to adapt to the channel bandwidth and the available processing power.
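The embedded property can be illustrated with a small sketch that splits coefficient magnitudes into bit-planes, MSB first, and rebuilds them from only the leading planes. It is a simplified illustration, not the FGS syntax: sign bits and entropy coding are omitted, and the function names to_bitplanes and from_bitplanes are hypothetical.

```python
def to_bitplanes(magnitudes):
    """Split non-negative integer magnitudes into bit-planes, MSB plane first."""
    n_planes = max(magnitudes).bit_length() if magnitudes else 0
    return [[(m >> p) & 1 for m in magnitudes]
            for p in range(n_planes - 1, -1, -1)]


def from_bitplanes(planes, n_planes):
    """Rebuild magnitudes from the received planes.

    `planes` holds the first len(planes) bit-planes (MSB first) out of
    `n_planes`; planes that were truncated away are treated as all zeros.
    """
    values = [0] * (len(planes[0]) if planes else 0)
    for i, plane in enumerate(planes):
        shift = n_planes - 1 - i
        values = [v | (b << shift) for v, b in zip(values, plane)]
    return values


if __name__ == "__main__":
    coeffs = [10, 3, 0, 5]  # example magnitudes in scan order
    planes = to_bitplanes(coeffs)
    print(from_bitplanes(planes, len(planes)))      # all planes received
    print(from_bitplanes(planes[:2], len(planes)))  # truncated after 2 planes
```

Truncating after any plane still yields a valid, coarser approximation of every coefficient, which is what allows the enhancement layer to be cut at an arbitrary rate.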

To deliver better subjective quality, MPEG-4 FGS [15] provides a frequency weighting matrix and a scheme for selective enhancement. The frequency weighting prioritizes the coding of DCT coefficients so that the lower-frequency coefficients are coded with higher priority.

The purpose is to reduce the flickering effect caused by quantization. By applying the same technique at the macroblock level, selective enhancement offers region-of-interest functionality by coding the macroblocks in the specified regions first.
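One way to picture this prioritization is as a bit-plane shift: a coefficient (or macroblock) with a larger shift is lifted into more significant bit-planes and is therefore reached earlier in MSB-first coding. The sketch below assumes that the priority is realized as a left shift of the magnitude; the shift values and the name first_coded_plane are hypothetical and chosen only for illustration.

```python
def first_coded_plane(magnitudes, shifts):
    """Return, per coefficient, the index of the first bit-plane (0 = MSB)
    in which it becomes significant after shifting, or None if it is zero.

    Assumption: priority is modeled as a left shift by `shifts[i]` bits.
    """
    shifted = [m << s for m, s in zip(magnitudes, shifts)]
    n_planes = max(shifted, default=0).bit_length()
    return [n_planes - v.bit_length() if v else None for v in shifted]


if __name__ == "__main__":
    # Two coefficients of equal magnitude; only the first one is prioritized.
    print(first_coded_plane([3, 3], [2, 0]))
    # Without any shift both appear in the same plane.
    print(first_coded_plane([3, 3], [0, 0]))
```

With equal magnitudes, the shifted coefficient becomes significant two planes earlier, so it survives a truncation that would drop the unshifted one.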

Besides these tools, MPEG-4 FGS [15] also incorporates a re-sync marker technique to address error resilience. To stop error propagation, the re-sync marker, represented by a specific codeword, is periodically coded in the bit-stream. Moreover, the re-sync marker is followed by the information required to restart decoding. When errors occur, the error propagation is confined between two re-sync markers, and decoding can be resumed after a re-sync marker.

Although MPEG-4 FGS [15] offers good scalability at fine granularity, the current approach suffers from poor coding efficiency and subjective quality. In the following chapters, we will analyze these problems and provide our solutions.

CHAPTER 3 Enhanced Mode-Adaptive Fine Granularity Scalability

3.1 Introduction

Although MPEG-4 FGS [15] offers good scalability at fine granularity, its compression efficiency is often much lower than that of a non-scalable codec. Currently, in MPEG-4 FGS [15], the enhancement layer is predicted from the base layer. In most applications, the base layer is encoded at a very low bit rate, so the reconstructed base layer is often of poor quality. Because a poor-quality predictor cannot effectively remove the redundancy, the coding efficiency is inferior.

Using the enhancement layer for prediction can improve the coding efficiency [10][18][35].

Particularly, PFGS [35] constructs a macroblock predictor from a previous enhancement-layer frame. In addition to the previous enhancement-layer frame, RFGS [11] further exploits a previous base-layer frame while producing a frame-based predictor. In our previous work [18], we offer three macroblock predictors: (1) Type B: the predictor constructed from the current base-layer frame, (2) Type E: the predictor constructed from the previous enhancement-layer frame, and (3) Type BE: the predictor constructed from the average of the previous two modes.
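The three predictor types can be sketched for a single macroblock as follows. This is a minimal illustration under stated assumptions: the inputs are flat lists of co-located sample values, any motion compensation is assumed to have been applied already, the average is left unrounded because the rounding rule is not given here, and the name build_predictors is hypothetical.

```python
def build_predictors(base_cur, enh_prev):
    """Return the Type B, Type E, and Type BE predictors for one macroblock.

    base_cur: samples from the current base-layer frame.
    enh_prev: co-located samples from the previous enhancement-layer frame
              (assumed to be already aligned with base_cur).
    """
    if len(base_cur) != len(enh_prev):
        raise ValueError("both inputs must cover the same samples")
    return {
        "B": list(base_cur),
        "E": list(enh_prev),
        # Type BE: sample-wise average of the Type B and Type E predictors.
        "BE": [(b + e) / 2 for b, e in zip(base_cur, enh_prev)],
    }


if __name__ == "__main__":
    preds = build_predictors([100, 120], [110, 100])
    print(preds["BE"])
```

A mode decision would then choose one of the three entries per macroblock; that selection logic is not part of this sketch.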

Although they differ in how they construct the enhancement-layer predictor, all the advanced FGS schemes try to find a better predictor to improve the coding efficiency.

Although the coding efficiency can be improved by using the enhancement-layer frame, drifting errors could occur at low bit rates. This is because the enhancement layer is not guaranteed to be received in the expected manner. The resulting predictor mismatch between encoder and decoder produces drifting errors. PFGS [32][33][36] stops the drifting errors by enabling a predictor that artificially creates mismatch errors during encoding. The predictor is enabled by a mode decision mechanism [32][33][34]. RFGS [11] applies a predictive leaky factor between 0 and 1 to decay the drifting errors; the previous enhancement-layer frame is multiplied by a fractional factor, α. In this thesis, we adaptively use Type B and Type BE predictors to offer two schemes, the reset and fading mechanisms, to stop or reduce drifting errors. During the predictor selection, we estimate the possible drifting errors by introducing a dummy reference frame in the encoder.
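The decaying effect of a leaky factor can be illustrated with a toy model. It assumes that a mismatch in the enhancement-layer reference is simply scaled by α each time that reference is reused for prediction, and that no new errors are introduced afterwards; the function drift_after and the numeric values are hypothetical.

```python
def drift_after(initial_error, alpha, n_frames):
    """Return the mismatch magnitude at frame 0 and over the next n_frames.

    Toy model: each predicted frame inherits alpha times the mismatch of
    its reference, with 0 <= alpha <= 1 as the leaky factor.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    errors = [initial_error]
    for _ in range(n_frames):
        errors.append(alpha * errors[-1])
    return errors


if __name__ == "__main__":
    print(drift_after(8.0, 0.5, 3))  # mismatch halves with every frame
    print(drift_after(8.0, 1.0, 3))  # no leak: the mismatch persists
```

In this model a smaller α removes drift faster but also uses less of the enhancement-layer reference, which reflects the trade-off between drift control and coding efficiency discussed above.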

While preserving the scalability of MPEG-4 FGS [15], our goal is to offer better coding efficiency at all bit rates. Figure 3.1 characterizes our goal in terms of rate-distortion performance.

The rest of this chapter is organized as follows: Section 3.2 formulates the problem. Section 3.3 describes our enhanced mode-adaptive FGS (EMFGS) scheme, including the formulations of prediction modes and the mode selection algorithm. Section 3.4 analyzes the distributions of prediction modes in different conditions. Section 3.5 depicts our encoder and decoder structure.

Section 3.6 further compares our approach with other advanced FGS schemes and demonstrates the rate-distortion performance of our proposed codec. Finally, Section 3.7 summarizes our work.
