2.3.4 Texture Coder
The texture information of a video object plane is present in the luminance Y and the two chrominance components Cb and Cr of the video signal. In the case of an I-VOP, the texture information resides directly in the luminance and chrominance components. In the case of motion-compensated VOPs, the texture information represents the residual error remaining after motion-compensated prediction. The texture coder includes a padding process (if needed), 8 × 8 block-based DCT, quantization, coefficient prediction, coefficient scan and variable length coding.
Padding Process
When the shape of the VOP is arbitrary, there are two types of MBs that belong to an arbitrarily shaped VOP:
1. Those that lie completely inside the VOP shape.
2. Those that lie on the boundary of the shape.
The macroblocks that lie completely inside the VOP are coded using a technique identical to the one used in H.263. The macroblocks that lie on the boundary of the shape need to be padded before texture coding. For residual error blocks after motion compensation, the region outside the VOP within the blocks is padded with zeros. For intra blocks, the padding is performed in a three-step procedure called low pass extrapolation (LPE). This procedure is as follows:
1. Compute the arithmetic mean value m of the pixels f(i, j) in the block that belong to the VOP as
m = \frac{1}{N} \sum_{(i,j) \in \mathrm{VOP}} f(i, j),
where N is the number of pixels situated within the VOP. Division by N is done by rounding to the nearest integer.
2. Assign m to each block pixel situated outside of the VOP region, that is, f(i, j) = m for all (i, j) ∉ VOP.
3. Apply the following filtering operation to each block pixel f(i, j) outside of the VOP region, in raster-scan order:
f(i, j) = [f(i, j − 1) + f(i − 1, j) + f(i, j + 1) + f(i + 1, j)]/4.
Division is done by rounding to the nearest integer. If one or more of the four pixels used for filtering are outside the block, the corresponding pixels are not included in the filtering operation and the divisor 4 is reduced accordingly. For example, for i = 0 and j = 0, we have
f(i, j) = [f(i, j + 1) + f(i + 1, j)]/2.
After this padding operation the resulting block is ready for DCT coding.
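The three LPE steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name lpe_pad and the boolean mask argument inside are hypothetical, ties in "rounding to the nearest integer" are rounded up, and the filter in step 3 is applied in place, so pixels already filtered earlier in the raster scan feed into later ones.

```python
def lpe_pad(block, inside):
    """Low pass extrapolation padding of one block (sketch, not normative).

    block  : list of rows of non-negative integer pixel values
    inside : same shape, True where the pixel belongs to the VOP (assumed input)
    """
    rows, cols = len(block), len(block[0])

    # Step 1: rounded arithmetic mean of the pixels inside the VOP.
    vop_pixels = [block[i][j] for i in range(rows) for j in range(cols) if inside[i][j]]
    n = len(vop_pixels)
    if n == 0:
        raise ValueError("block has no pixel inside the VOP")
    m = (sum(vop_pixels) + n // 2) // n  # round half up (assumption)

    # Step 2: assign the mean to every pixel outside the VOP.
    out = [[block[i][j] if inside[i][j] else m for j in range(cols)]
           for i in range(rows)]

    # Step 3: filter the outside pixels in raster-scan order, in place.
    for i in range(rows):
        for j in range(cols):
            if inside[i][j]:
                continue
            candidates = ((i, j - 1), (i - 1, j), (i, j + 1), (i + 1, j))
            # Neighbours that fall outside the block are dropped and the
            # divisor shrinks with them.
            values = [out[a][b] for a, b in candidates
                      if 0 <= a < rows and 0 <= b < cols]
            k = len(values)
            out[i][j] = (sum(values) + k // 2) // k
    return out
```

The function accepts any rectangular block, so it can be tried on a small 2 × 2 example before being applied to 8 × 8 blocks.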
Discrete Cosine Transform Coding
Similar to MPEG-1 and MPEG-2, the 2D (8 × 8) DCT is used for spatial data compression in MPEG-4 inter and intra coding. The encoder performs the forward transform before quantization, and the inverse transform after inverse quantization inside the coding loop. The inverse quantization and inverse transform are needed to obtain the reconstructed image that serves as the reference for the next temporal frame.
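The transform and its role in the coding loop can be illustrated with a small pure-Python sketch. It is not the normative MPEG-4 transform: the names dct2, idct2 and reconstruct are hypothetical, the DCT is written in its direct (slow) orthonormal form, and the quantizer is a plain uniform rounding with a single step size standing in for the quantizers described in the next subsection.

```python
import math

N = 8  # block size used by the texture coder


def _scale(k):
    # Orthonormal DCT-II scaling: sqrt(1/N) for k = 0, sqrt(2/N) otherwise.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)


def _basis(x, u):
    return math.cos((2 * x + 1) * u * math.pi / (2 * N))


def dct2(block):
    """Forward 2D DCT of an N x N block given as a list of rows."""
    return [[_scale(u) * _scale(v) *
             sum(block[x][y] * _basis(x, u) * _basis(y, v)
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct2(coef):
    """Inverse 2D DCT, the inverse of dct2 up to floating-point error."""
    return [[sum(_scale(u) * _scale(v) * coef[u][v] *
                 _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def reconstruct(block, step):
    """Encoder-side loop: DCT, quantize, dequantize, inverse DCT.

    The uniform rounding quantizer is a simplification (assumption).
    """
    levels = [[round(c / step) for c in row] for row in dct2(block)]
    dequantized = [[lv * step for lv in row] for row in levels]
    return [[round(p) for p in row] for row in idct2(dequantized)]
```

The output of reconstruct is what the decoder will also see, which is why the encoder predicts the next frame from it rather than from the original block.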
Figure 2.16: Quantizers in H.263. (a) For intra DC coefficient only. (b) For inter DC and all AC coefficients.
Table 2.1: Default Quantization Matrix Q (from [3])
(intra)                    (non intra)
 8 16 19 22 26 27 29 34    16 16 16 16 16 16 16 16
16 16 22 24 27 29 34 37    16 16 16 16 16 16 16 16
19 22 26 27 29 34 34 38    16 16 16 16 16 16 16 16
22 22 26 27 29 34 37 40    16 16 16 16 16 16 16 16
22 26 27 29 32 35 40 48    16 16 16 16 16 16 16 16
26 27 29 32 35 40 48 58    16 16 16 16 16 16 16 16
26 27 29 34 38 46 56 69    16 16 16 16 16 16 16 16
27 29 35 38 46 56 69 83    16 16 16 16 16 16 16 16
Table 2.2: Nonlinear Scaler for DC Coefficients of DCT Blocks (from [3])
Component      DC scaler for quantizer (Q) range
               1–4    5–8         9–24        25–31
Luminance      8      2Q          Q+8         2Q−16
Chrominance    8      (Q+13)/2    (Q+13)/2    Q−6
Quantization
MPEG-4 video supports two quantization (Q) techniques, one referred to as the H.263 quantization method and the other as the MPEG quantization method. The H.263 quantization method uses a dead zone for intra and inter AC coefficients and no dead zone for intra DC coefficients. The MPEG quantization method is a uniform quantizer with the default matrix.
Figure 2.16 shows the quantizer characteristics in H.263. It has uniform quantization for intra DC coefficients and nearly uniform midtread quantization for the inter DC and all AC coefficients. For AC data, input between −Th and +Th is quantized to zero.
All coefficients in a macroblock go through the same quantizer. The step size Q can be changed by the rate controller in increments of 2, from 2 to 62.
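The dead-zone behaviour for AC data can be illustrated with a generic quantizer sketch. It is not the exact H.263 formula: the function name deadzone_quantize is hypothetical, and the bin layout beyond the dead zone (uniform bins of width Q starting at the threshold Th) is an assumption made for the example. Only the property stated above, that inputs between −Th and +Th map to zero, is taken from the text.

```python
def deadzone_quantize(coef, q, th):
    """Map one coefficient to a quantization level (illustrative only).

    coef : DCT coefficient
    q    : step size Q
    th   : dead-zone threshold Th
    """
    magnitude = abs(coef)
    if magnitude < th:
        return 0  # inside the dead zone
    # Assumed layout: [Th, Th+Q) -> level 1, [Th+Q, Th+2Q) -> level 2, ...
    level = int((magnitude - th) // q) + 1
    return level if coef > 0 else -level
```

With q = 8 and th = 4, for example, an input of 3 falls in the dead zone while an input of −13 lands in the second bin on the negative side.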
In the MPEG quantizer, each coefficient produced by 2D DCT is quantized with a uniform quantizer. The default quantizer matrix is defined as shown in Table 2.1. The default quantizer matrix can be changed by the rate controller if the required channel bandwidth is unavailable.
Typically, the DC coefficients of DCT of blocks belonging to an intra macroblock are scaled by a constant scaling factor of 8. However, in MPEG-4 video, a nonlinear scaler as shown in Table 2.2 is used to provide a higher coding efficiency. The characteristics of nonlinear scaling are different between the luminance and chrominance blocks and further depend on the quantizer used for the block.
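The nonlinear DC scaler is a piecewise function of the quantizer Q and can be written as a short lookup, sketched here. The function name dc_scaler is hypothetical. The range boundaries follow Table 2.2; the entries 2Q − 16 and Q − 6 for the range 25 to 31, the chrominance entry (Q + 13)/2, and the use of integer (truncating) division for it are assumptions based on the DC scaler definition in the MPEG-4 Visual specification and should be checked against it.

```python
def dc_scaler(q, luminance=True):
    """Nonlinear DC scaler for quantizer q in 1..31 (sketch, see assumptions)."""
    if not 1 <= q <= 31:
        raise ValueError("quantizer must be in the range 1..31")
    if q <= 4:
        return 8  # constant scaler for both components
    if luminance:
        if q <= 8:
            return 2 * q
        if q <= 24:
            return q + 8
        return 2 * q - 16  # assumed entry for 25..31
    if q <= 24:
        return (q + 13) // 2  # integer division is an assumption
    return q - 6  # assumed entry for 25..31
```

With these entries the scaler grows without jumps at the range boundaries, for example from 32 at Q = 24 to 34 at Q = 25 for luminance.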
Intra Prediction
After quantization, the DC coefficients and many AC coefficients of an intra block are coded by intra prediction. Intra prediction is a new operation used in the MPEG-4 standard to reduce the spatial redundancy between 8 × 8 blocks. There are two types of prediction: DC prediction and AC prediction.
Figure 2.17 shows the prediction of DC coefficients in intra 8 × 8 blocks. The quantized intra DC coefficients are predicted from three previously decoded DC coefficients. For example, the DC coefficient of block X is predicted from the DC coefficients of blocks A, B and C. Unlike MPEG-2, the method of prediction in MPEG-4 is gradient based. In computing the prediction of block X, if the absolute value of the horizontal gradient is less than the absolute value of the vertical gradient, then the quantized DC value (QDC) of block C is used as the prediction; otherwise the QDC of block A is used.
Figure 2.17: Prediction of DC coefficients of blocks in an intra MB (from [5]).
Figure 2.18: Prediction of AC coefficients of blocks in an intra MB (from [5]).
Figure 2.19: Scans for 8 × 8 blocks (from [3]).
The AC prediction depends on DC prediction, as shown in Figure 2.18. The AC coefficients in the first row or in the first column are predicted with three previous decoded AC coefficients. The direction of prediction is the same as DC prediction.
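The gradient rule for DC prediction is compact enough to sketch directly. The text does not spell out the block layout or the gradient definitions, so the following are assumptions: block A is the left neighbour of X, B the above-left neighbour and C the neighbour above, the horizontal gradient is |QDC_A − QDC_B| and the vertical gradient is |QDC_B − QDC_C|. The function name predict_dc is hypothetical.

```python
def predict_dc(qdc_a, qdc_b, qdc_c):
    """Return (predicted QDC, source block) for the current block X.

    Assumed layout: A = left, B = above-left, C = above.
    """
    horizontal_gradient = abs(qdc_a - qdc_b)  # assumed definition
    vertical_gradient = abs(qdc_b - qdc_c)    # assumed definition
    if horizontal_gradient < vertical_gradient:
        return qdc_c, "above"  # predict from block C
    return qdc_a, "left"       # predict from block A
```

The second return value records the prediction direction, which the AC prediction reuses. The encoder would then code the difference between the actual QDC of X and the returned prediction.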
Scan and VLC
The predicted DC and AC coefficients (as well as the unpredicted AC coefficients) of DCT blocks are scanned by one of three scans, alternate-horizontal, alternate-vertical or zigzag (the normal scan used in H.263 and MPEG-1), to convert the 2D block into one-dimensional data; see Figure 2.19. The actual scan used depends on the coefficient predictions used. For instance, if the DC prediction refers to the horizontally adjacent block, the alternate-vertical scan is selected for the current block. If the DC prediction refers to the vertically adjacent block, the alternate-horizontal scan is used for the current block. For all other blocks, the 8 × 8 blocks of transform coefficients are zigzag scanned.
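The scan selection and the zigzag order can be sketched as follows. The function names and the "left"/"above" labels for the prediction source are hypothetical. Only the zigzag order is generated here, since it follows a simple anti-diagonal pattern; the two alternate scan tables are defined in Figure 2.19 and are not reproduced.

```python
def choose_scan(dc_pred_from):
    """Pick the scan name from the DC prediction source.

    dc_pred_from: "left" (horizontally adjacent block), "above"
    (vertically adjacent block) or None when no such prediction applies.
    """
    if dc_pred_from == "left":
        return "alternate-vertical"
    if dc_pred_from == "above":
        return "alternate-horizontal"
    return "zigzag"


def zigzag_order(n=8):
    """Return the (row, column) visiting order of the zigzag scan."""
    order = []
    for s in range(2 * n - 1):  # s = row + column indexes one anti-diagonal
        diagonal = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diagonal.reverse()  # even diagonals run from bottom-left to top-right
        order.extend(diagonal)
    return order


def scan_block(block, order):
    """Flatten a 2D block into a 1D list following the given order."""
    return [block[i][j] for i, j in order]
```

Calling scan_block(block, zigzag_order()) yields the 64 coefficients ordered roughly from low to high frequency, which is what pushes the zeros towards the end of the sequence.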
After scanning, the coefficients usually form a sequence with many zeros at the end. This kind of data stream is well suited to run-length coding. In the MPEG-4 standard, differential DC coefficients in intra blocks are encoded with variable length codes. The AC coefficients, however, are encoded with variable length codes for EVENTs. An EVENT is a combination of a last non-zero coefficient indication (LAST), the number of successive zeros preceding the coded coefficient (RUN), and the non-zero value of the coded coefficient (LEVEL). Some statistically rare events have no variable length codes to represent them. For these, an escape coding method is used.
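Forming the EVENTs from a scanned coefficient sequence can be sketched as below. The function name run_level_events is hypothetical, and the sketch stops at the (LAST, RUN, LEVEL) triples: the variable length code tables and the escape mechanism are not modelled.

```python
def run_level_events(coeffs):
    """Convert scanned coefficients into (LAST, RUN, LEVEL) triples.

    LAST  : 1 for the final non-zero coefficient of the block, else 0
    RUN   : number of zeros preceding the coded coefficient
    LEVEL : the non-zero coefficient value
    """
    nonzero_positions = [k for k, c in enumerate(coeffs) if c != 0]
    events = []
    previous = -1  # position of the previously coded coefficient
    for n, k in enumerate(nonzero_positions):
        run = k - previous - 1
        last = 1 if n == len(nonzero_positions) - 1 else 0
        events.append((last, run, coeffs[k]))
        previous = k
    return events
```

The trailing zeros after the final non-zero coefficient produce no event of their own; they are implied by LAST = 1.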