
Overview of the MPEG-4 Video Standard

2.2 MPEG-4 Video Texture Coding

2.2.4 Texture Coder

The texture information of a VOP is present in the luminance Y and two chrominance components Cb and Cr of the video signal. For an I-VOP, the texture information resides directly in the luminance and chrominance components. For a P-VOP or a B-VOP, however, the texture information represents the residual values remaining after motion-compensated prediction. The texture coder comprises a padding process (for object-based coding, applied only if needed), an 8 × 8 two-dimensional (2D) discrete cosine transform (DCT), quantization, coefficient prediction, coefficient scan, and variable length coding (VLC).

Padding Process

When the shape of the VOP is arbitrary, two types of MB exist: those that lie inside the VOP and those that lie on the boundary of the VOP. The MBs that lie completely inside the VOP are coded using a technique identical to the technique used in H.263. The MBs that lie on the boundary of the shape need to be padded before texture coding. For residual error blocks after motion compensation, the region outside the VOP within the blocks is padded with zeros. For intra blocks, the padding is performed in a three-step procedure called low pass extrapolation (LPE). This procedure is as follows:

1. Compute the arithmetic mean value m of the pixels f (i, j) in the blocks that belong to the VOP as

$$m = \frac{1}{N} \sum_{(i,j) \in \mathrm{VOP}} f(i,j)$$

where N is the number of pixels situated within the VOP.

2. Assign m to each block pixel situated outside of the VOP region.

3. Apply the following filtering operation to each block pixel f (i, j) outside of the VOP region, in raster-scan order:

$$f(i,j) = \frac{f(i,j-1) + f(i-1,j) + f(i,j+1) + f(i+1,j)}{4}.$$

If one or more of the four pixels used for filtering are outside the block, the corresponding pixels are excluded from the filtering operation and the divisor 4 is reduced accordingly.
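The three LPE steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `lpe_pad` and the boolean `mask` representation of the VOP shape are hypothetical, the block is assumed square, and the filter is applied in place so that pixels filtered earlier in raster-scan order feed into later ones.

```python
def lpe_pad(block, mask):
    """Pad a square block by low pass extrapolation (LPE).

    block: list of rows of pixel values.
    mask:  same shape as block, True where the pixel belongs to the VOP.
    Returns a new block of floats; pixels inside the VOP are unchanged.
    """
    n = len(block)
    inside = [block[i][j] for i in range(n) for j in range(n) if mask[i][j]]
    if not inside:
        raise ValueError("block has no pixels inside the VOP")

    # Step 1: arithmetic mean m of the pixels that belong to the VOP.
    m = sum(inside) / len(inside)

    # Step 2: assign m to every pixel outside the VOP.
    out = [[float(block[i][j]) if mask[i][j] else m for j in range(n)]
           for i in range(n)]

    # Step 3: filter the outside pixels in raster-scan order.
    # Assumption of this sketch: the update is done in place.
    for i in range(n):
        for j in range(n):
            if mask[i][j]:
                continue
            candidates = ((i, j - 1), (i - 1, j), (i, j + 1), (i + 1, j))
            # Neighbours outside the block are dropped, which reduces the divisor.
            neighbours = [out[y][x] for y, x in candidates
                          if 0 <= y < n and 0 <= x < n]
            if neighbours:
                out[i][j] = sum(neighbours) / len(neighbours)
    return out
```

For a real boundary MB the function would be called once per 8 × 8 block, with the mask derived from the decoded shape information.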

Discrete Cosine Transform (DCT) Coding

Similar to MPEG-1 and MPEG-2, the transform coding in the MPEG-4 standard is based on the 2D 8 × 8 DCT. The encoder applies the forward transform before quantization, and applies the inverse transform after inverse quantization to reconstruct the VOP.
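As an illustration of the forward and inverse transforms, the following sketch computes a separable, orthonormal 8 × 8 DCT in plain Python. The names `dct2` and `idct2` are hypothetical, and the direct matrix product is chosen for clarity; practical codecs use fast integer approximations instead.

```python
import math

N = 8

def _c(k):
    # Normalisation factor that makes the basis orthonormal.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

# DCT-II basis matrix: BASIS[k][n] = c(k) * cos((2n + 1) * k * pi / (2N)).
BASIS = [[_c(k) * math.cos((2 * n + 1) * k * math.pi / (2 * N))
          for n in range(N)] for k in range(N)]

def _matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]

def _transpose(a):
    return [list(row) for row in zip(*a)]

def dct2(block):
    """Forward 2D DCT of an 8 x 8 block: F = C f C^T."""
    return _matmul(_matmul(BASIS, block), _transpose(BASIS))

def idct2(coeffs):
    """Inverse 2D DCT of an 8 x 8 coefficient block: f = C^T F C."""
    return _matmul(_matmul(_transpose(BASIS), coeffs), BASIS)
```

With this normalisation a flat block of value 100 yields a DC coefficient of 800 and zero AC coefficients, and `idct2(dct2(block))` recovers the input up to floating-point error.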

Quantization

MPEG-4 video supports two quantization techniques, one referred to as the H.263 quantization method and the other as the MPEG quantization method. The H.263 quantization method uses a dead zone for intra and inter AC coefficients and no dead zone for intra DC coefficients. The MPEG quantization method is a uniform quantizer with the default matrix shown in Table 2.3.

Figure 2.12 shows the quantizer characteristics in H.263. It has uniform quantization for intra DC coefficients and nearly uniform midtread quantization for the inter DC and all AC coefficients. All coefficients in an MB are quantized with the same step size Q, which can be changed in increments of 2 from 2 to 62 as desired.
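The difference between a quantizer with and without a dead zone can be illustrated with a small sketch. The two functions below are hypothetical and deliberately simplified; they are not the normative H.263 formulas. The first rounds to the nearest multiple of the step size, while the second truncates toward zero, which makes the zero bin twice as wide as the others.

```python
def quantize_uniform(coef, step):
    """Midtread uniform quantizer: round |coef| to the nearest multiple of step."""
    sign = -1 if coef < 0 else 1
    return sign * ((abs(coef) + step // 2) // step)

def quantize_dead_zone(coef, step):
    """Truncating quantizer: every |coef| < step maps to 0 (the dead zone)."""
    sign = -1 if coef < 0 else 1
    return sign * (abs(coef) // step)

def dequantize(level, step):
    """Simplified reconstruction at multiples of step (an assumption of this sketch)."""
    return level * step
```

For example, with a step size of 8 a coefficient of 7 is kept as level 1 by the uniform quantizer but falls into the dead zone and becomes 0 in the second one.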

Furthermore, to provide higher coding efficiency, a nonlinear scaler is used for the DC coefficient of each 8 × 8 block in MPEG-4 video, as shown in Table 2.4. Note that the characteristics of nonlinear scaling differ between the luminance and chrominance blocks and depend on the quantizer used for the block.
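The nonlinear scaler can be written as a small lookup function. The sketch below is an assumption-laden illustration: the name `dc_scaler` is hypothetical, Q is taken as an integer in the range 1 to 31, the division in (Q + 13)/2 is assumed to be integer division, and the chrominance scaler is assumed to be (Q + 13)/2 over the whole range 5 to 24 and Q − 6 for 25 to 31, following the dc_scaler table of ISO/IEC 14496-2.

```python
def dc_scaler(q, luminance):
    """Return the DC scaler for quantizer value q (1..31).

    luminance: True for a luminance block, False for a chrominance block.
    """
    if not 1 <= q <= 31:
        raise ValueError("q must be in the range 1..31")
    if q <= 4:
        return 8                      # same for both component types
    if luminance:
        if q <= 8:
            return 2 * q
        if q <= 24:
            return q + 8
        return 2 * q - 16
    # Chrominance (assumed ranges, see the note above).
    if q <= 24:
        return (q + 13) // 2
    return q - 6
```

The scaler grows more slowly than Q at low quantizer values, so the DC coefficient is quantized more finely than a linear rule would allow.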

Intra Prediction

When coding an intra block, the DC coefficient and many AC coefficients are coded by intra prediction. Intra prediction is an operation used in the MPEG-4 standard to reduce the spatial redundancy between 8 × 8 blocks.

Figure 2.12: Quantizers in H.263. (a) For intra DC coefficient only. (b) For inter DC and all AC coefficients.

Table 2.3: Default Quantization Matrix (Q) [5]. Panels: Intra, Inter.

Table 2.4: Nonlinear Scaler for DC Coefficients (from [5])

Component      DC Scaler for Q Range
               1–4    5–8           9–24          25–31
Luminance      8      2Q            Q + 8         2Q − 16
Chrominance    8      (Q + 13)/2    (Q + 13)/2    Q − 6

Figure 2.13: Prediction of DC coefficients of blocks in an intra MB (from [6]).

DC prediction is illustrated in Fig. 2.13. The quantized intra DC coefficients are predicted from three previously decoded DC coefficients. For example, the DC coefficient of block X is predicted from the DC coefficients of blocks A, B and C. Unlike MPEG-2, the method of prediction in MPEG-4 is gradient based. In computing the prediction of block X, if the absolute value of a horizontal gradient is less than the absolute value of a vertical gradient, then the quantized DC (QDC) of block C is used as the prediction, else the QDC value of block A is used.
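The gradient rule can be sketched as follows. The block layout is an assumption of this example, since Fig. 2.13 is not reproduced here: A is taken as the block to the left of X, B as the block above and to the left, and C as the block above. The function name `predict_dc` and the direction labels are hypothetical.

```python
def predict_dc(qdc_a, qdc_b, qdc_c):
    """Return (predicted QDC, prediction direction) for block X.

    Assumed layout: A = left of X, B = above-left of X, C = above X.
    """
    horizontal_gradient = abs(qdc_a - qdc_b)
    vertical_gradient = abs(qdc_b - qdc_c)
    if horizontal_gradient < vertical_gradient:
        # Predict from the block above (vertical direction).
        return qdc_c, "vertical"
    # Otherwise predict from the block to the left (horizontal direction).
    return qdc_a, "horizontal"
```

The encoder would then code the difference between the QDC of block X and the returned prediction; the returned direction is also what the AC prediction and the scan selection depend on.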

The AC prediction depends on DC prediction, as shown in Fig. 2.14. The AC coefficients in the first row or in the first column are predicted from previously decoded AC coefficients. The direction of prediction is the same as for DC prediction.

Scan and VLC

Figure 2.15 shows three kinds of scan, alternate-horizontal, alternate-vertical and zigzag (the normal scan used in H.263 and MPEG-1), which order the DC and AC coefficients and convert the 2D block data to 1D data. The actual scan used depends on the coefficient prediction method used. If the prediction direction is vertical, the alternate-horizontal scan is used for the current block. If the direction is horizontal, the alternate-vertical scan is selected for the current block. For all other blocks, the zigzag scan is used.
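A minimal sketch of the scan selection and of the zigzag order is given below. The function names are hypothetical, and the prediction direction is modelled as "vertical", "horizontal" or None (no coefficient prediction), which is an assumption of this example. The two alternate scan tables are not generated here because their exact order is defined by Fig. 2.15.

```python
def select_scan(direction):
    """Map the coefficient prediction direction to the scan name."""
    if direction == "vertical":
        return "alternate-horizontal"
    if direction == "horizontal":
        return "alternate-vertical"
    return "zigzag"

def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        cells = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(cells) if s % 2 == 0 else cells)
    return order

def scan_block(block, order):
    """Convert a 2D block to 1D data by reading it in the given order."""
    return [block[r][c] for r, c in order]
```

Calling `scan_block(block, zigzag_order())` returns the 64 coefficients with the DC term first and the highest-frequency term last.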

Figure 2.14: Prediction of AC coefficients of blocks in an intra MB (from [6]).

Figure 2.15: Scans for 8 × 8 blocks (from [5]).

After the scan, the coefficients usually form a sequence with many zeros at the end. This kind of data stream is well suited to run-length coding. In the MPEG-4 standard, differential DC coefficients in intra blocks are encoded with VLC. However, the AC coefficients are encoded by the variable length codes for EVENTs, where an EVENT consists of a last non-zero coefficient indication (LAST), the number of successive zeros preceding the coded coefficient (RUN), and the non-zero value of the coded coefficient (LEVEL). Some statistically rare events have no VLC words to represent them. For them an escape coding method is used.
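The formation of EVENTs from a scanned coefficient sequence can be sketched as follows. The function name `run_level_events` is hypothetical, and the mapping of each event to a VLC word or an escape code is omitted because the code tables are not part of this section.

```python
def run_level_events(coeffs):
    """Convert scanned coefficients into (LAST, RUN, LEVEL) events.

    LAST  is 1 for the final non-zero coefficient of the block, else 0.
    RUN   is the number of zeros preceding the coded coefficient.
    LEVEL is the non-zero value of the coded coefficient.
    """
    nonzero_positions = [i for i, c in enumerate(coeffs) if c != 0]
    events = []
    run = 0
    for i, c in enumerate(coeffs):
        if c == 0:
            run += 1
            continue
        last = 1 if i == nonzero_positions[-1] else 0
        events.append((last, run, c))
        run = 0
    # Trailing zeros produce no event; LAST = 1 already marks the end.
    return events
```

For the sequence 5, 0, 0, −2, 1, 0, 0, 0 the function returns the events (0, 0, 5), (0, 2, −2) and (1, 0, 1), so the three trailing zeros cost no additional code words.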
