Motion Coding - MPEG-4 Video Texture Coding (from [6], [7] and [8])

2.3 MPEG-4 Video Texture Coding (from [6], [7] and [8])

2.3.2 Motion Coding

There are four types of VOPs (see Figure 2.4), each using a different coding method: I-VOP, P-VOP, B-VOP and S-VOP. Motion coding is essential for P-VOPs and B-VOPs to reduce temporal redundancy. The motion coder consists of a motion estimator, a motion compensator, a previous/next VOP store, and a motion vector predictor and coder.

Figure 2.6: Detailed structure of VO encoder (from [6]).

Padding Process

Figure 2.7 shows a simplified diagram of the padding process, which defines the values of the luminance and chrominance samples outside the VOP.

A MB that lies on the VOP boundary is padded by replicating the boundary samples of the VOP towards the exterior. This process is divided into horizontal repetitive padding and vertical repetitive padding.
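The two padding passes can be sketched as follows. This is a minimal illustration, not the thesis implementation: the function names, the boolean shape mask, and the rule that a sample lying between two boundary samples takes their integer average are assumptions made here for the example.

```python
def pad_row(values, mask):
    """Repetitive padding along one line of samples.

    values: list of sample values; mask: list of bools (True = inside the VOP).
    Returns (padded_values, has_data). Lines with no inside sample are
    returned unchanged so that a later pass can fill them.
    """
    inside = [i for i, m in enumerate(mask) if m]
    if not inside:
        return list(values), False
    out = list(values)
    for i, m in enumerate(mask):
        if m:
            continue
        left = max((k for k in inside if k < i), default=None)
        right = min((k for k in inside if k > i), default=None)
        if left is None:
            out[i] = values[right]          # replicate nearest boundary sample
        elif right is None:
            out[i] = values[left]
        else:
            # Assumption: between two boundary samples, use their average.
            out[i] = (values[left] + values[right]) // 2
    return out, True


def repetitive_pad(block, mask):
    """Horizontal pass on every row, then vertical pass for rows still empty."""
    rows = [pad_row(r, m) for r, m in zip(block, mask)]
    padded = [r for r, _ in rows]
    filled = [ok for _, ok in rows]          # rows that now hold valid data
    for x in range(len(block[0])):
        column = [padded[y][x] for y in range(len(padded))]
        new_col, _ = pad_row(column, filled)
        for y, v in enumerate(new_col):
            padded[y][x] = v
    return padded


block = [[0, 5, 0, 9], [0, 0, 0, 0], [7, 0, 0, 0]]
mask = [[False, True, False, True],
        [False, False, False, False],
        [True, False, False, False]]
print(repetitive_pad(block, mask))
```

The middle row contains no VOP samples, so it is left untouched by the horizontal pass and filled by the vertical pass from the rows above and below.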

Motion Estimation

Motion estimation (ME) is an important method for prediction between adjacent frames/pictures. The MPEG-4 encoder adopts a block-based motion estimation technique rather than a pixel-based one.

Basic motion estimation is performed for every luminance MB. In addition, motion vectors can be sent for individual blocks to make the prediction more accurate.

One way to perform motion estimation is to run a full search for the vector at integer-pixel accuracy and then, using that vector as the initial estimate, perform a half-pixel search around it.
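A minimal sketch of this two-stage search is given below. Several details are assumptions made for the example and are not stated in the text: the sum of absolute differences (SAD) as the matching cost, the small block size and search range, the helper names, and the rounding used when averaging neighbouring samples at half-pixel positions. Vectors are held in half-pixel units internally.

```python
def sample_half(ref, y2, x2):
    """Sample ref at half-pel coordinates (y2/2, x2/2) by bilinear averaging."""
    y0, x0 = y2 // 2, x2 // 2
    y1, x1 = y0 + (y2 % 2), x0 + (x2 % 2)
    total = ref[y0][x0] + ref[y0][x1] + ref[y1][x0] + ref[y1][x1]
    return (total + 2) // 4


def sad(cur, ref, by, bx, n, mvy2, mvx2):
    """SAD of the n x n block at (by, bx); None if the candidate leaves ref."""
    h, w = len(ref), len(ref[0])
    total = 0
    for i in range(n):
        for j in range(n):
            y2, x2 = 2 * (by + i) + mvy2, 2 * (bx + j) + mvx2
            if y2 < 0 or x2 < 0 or y2 > 2 * (h - 1) or x2 > 2 * (w - 1):
                return None
            total += abs(cur[by + i][bx + j] - sample_half(ref, y2, x2))
    return total


def estimate(cur, ref, by, bx, n=4, search=2):
    """Full integer-pel search, then a half-pel search around the best vector."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cost = sad(cur, ref, by, bx, n, 2 * dy, 2 * dx)
            if cost is not None and (best is None or cost < best[0]):
                best = (cost, 2 * dy, 2 * dx)
    _, cy, cx = best
    for hy in (-1, 0, 1):                      # eight half-pel neighbours
        for hx in (-1, 0, 1):
            cost = sad(cur, ref, by, bx, n, cy + hy, cx + hx)
            if cost is not None and cost < best[0]:
                best = (cost, cy + hy, cx + hx)
    return best[1] / 2, best[2] / 2, best[0]   # (mv_y, mv_x, SAD) in pixels


ref = [[y * 100 + x for x in range(12)] for y in range(12)]
cur = [[(y + 1) * 100 + (x - 2) for x in range(12)] for y in range(12)]
print(estimate(cur, ref, 4, 4))
```

In the usage example the current frame is the reference shifted by one row and minus two columns, so the search recovers that displacement with zero SAD.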

Interpolation of the MB is necessary because the motion vector components may be non-integer. Figure 2.8 illustrates the interpolation method. The half-sample values are calculated by bilinear interpolation, and the half-pixel motion vector is then obtained by searching over the interpolated positions.

Figure 2.7: Padding process (from [7]).

Figure 2.8: Interpolation scheme for half sample search (legend: integer pixel positions and half pixel positions).
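For reference, the bilinear half-sample rule of the MPEG-4 Visual specification can be written as below. The labels are assumptions about how the figure names its points: $A$, $B$, $C$, $D$ are the four surrounding integer-position samples (top-left, top-right, bottom-left, bottom-right), $a$ coincides with $A$, $b$ and $c$ are the horizontal and vertical half positions, and $d$ is the diagonal half position. The term $rc$ stands for the rounding control value signalled in the VOP header, and $/$ is integer division with truncation.

$$a = A$$
$$b = (A + B + 1 - rc) / 2$$
$$c = (A + C + 1 - rc) / 2$$
$$d = (A + B + C + D + 2 - rc) / 4$$

For example, with $A = 10$, $B = 20$, $C = 30$, $D = 40$ and $rc = 0$, the half-sample values are $b = 15$, $c = 20$ and $d = 25$.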

Motion Vector Encoder

The motion vector is coded when INTER mode coding is used.

The motion vector is coded differentially, using a spatial neighborhood of three candidate MVs that have already been coded (see Figure 2.9). At the borders of the current VOP, the following decision rules are applied:

1. If exactly one candidate predictor MB is outside the VOP, its MV is set to zero.

2. If two candidate predictor MBs are outside the VOP, their MVs are set to the third candidate predictor.

3. If all three candidate predictor MBs are outside the VOP, their MVs are set to zero.

For the horizontal and vertical components, the median of the three candidates for the same component is used as the predictor, denoted $P_x$ and $P_y$, respectively:

$$P_x = \mathrm{Median}(MV1_x, MV2_x, MV3_x), \qquad P_y = \mathrm{Median}(MV1_y, MV2_y, MV3_y)$$

Then the vector differences, $MVD_x = MV_x - P_x$ and $MVD_y = MV_y - P_y$, are coded by variable-length coding.
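The border rules and the median predictor can be sketched together as follows. The function names and the convention of passing `None` for a candidate whose MB lies outside the VOP are hypothetical choices for this example; the variable-length coding of the resulting differences is not shown.

```python
def median3(a, b, c):
    """Median of three values."""
    return sorted((a, b, c))[1]


def predict_mv(candidates):
    """Predictor (Px, Py) from three candidate MVs.

    candidates: three (mv_x, mv_y) tuples, or None where the candidate
    MB lies outside the VOP.
    """
    n_out = sum(c is None for c in candidates)
    if n_out == 3:
        cands = [(0, 0)] * 3                       # rule 3: all set to zero
    elif n_out == 2:
        third = next(c for c in candidates if c is not None)
        cands = [third] * 3                        # rule 2: copy the third
    else:
        # rule 1 (or no candidate outside): a missing candidate becomes zero
        cands = [c if c is not None else (0, 0) for c in candidates]
    px = median3(*(c[0] for c in cands))
    py = median3(*(c[1] for c in cands))
    return px, py


def mv_difference(mv, candidates):
    """Differential vector (MVDx, MVDy) = MV - predictor."""
    px, py = predict_mv(candidates)
    return mv[0] - px, mv[1] - py


neighbours = [(2, 4), (6, -2), (3, 1)]   # MV1, MV2, MV3
print(predict_mv(neighbours), mv_difference((5, 0), neighbours))
```

Note that the median is taken per component, so the predictor need not equal any single candidate vector.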

Motion Compensation

Motion compensation forms the prediction block from the reference VOP. In addition to basic motion compensation, three alternatives are supported: unrestricted motion compensation, four-MV motion compensation and overlapped motion compensation.

For unrestricted motion compensation, the motion vectors are allowed to point outside the decoded area of a reference VOP. The position of the reference sample, $(y_{ref}, x_{ref})$, is defined as:

$$x_{ref} = \mathrm{MIN}(\mathrm{MAX}(x_{curr} + dx,\ vhmcsr - 16),\ x_{dim} + vhmcsr)$$

$$y_{ref} = \mathrm{MIN}(\mathrm{MAX}(y_{curr} + dy,\ vvmcsr - 16),\ y_{dim} + vvmcsr)$$

where $vhmcsr$ = vop_horizontal_mc_spatial_ref, $vvmcsr$ = vop_vertical_mc_spatial_ref, $(y_{curr}, x_{curr})$ are the coordinates of a sample in the current VOP, $(y_{ref}, x_{ref})$ are the coordinates of a sample in the reference VOP, $(dy, dx)$ is the motion vector, and $(y_{dim}, x_{dim})$ are the dimensions of the bounding rectangle of the reference VOP.

Figure 2.9: Motion vector prediction (from [7]). MV: current motion vector; MV1: previous motion vector; MV2: above motion vector; MV3: above-right motion vector. The VOP border is marked in the figure.
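The coordinate clamp for unrestricted motion compensation can be sketched as below. The sketch assumes the MIN/MAX form given in the MPEG-4 Visual specification, with a 16-sample margin before the spatial reference; the function and argument names are illustrative only.

```python
def reference_coords(x_curr, y_curr, dx, dy, x_dim, y_dim, vhmcsr, vvmcsr):
    """Clamp the displaced sample position to the padded reference VOP.

    vhmcsr / vvmcsr stand for vop_horizontal_mc_spatial_ref and
    vop_vertical_mc_spatial_ref; (x_dim, y_dim) are the dimensions of the
    bounding rectangle of the reference VOP.
    """
    x_ref = min(max(x_curr + dx, vhmcsr - 16), x_dim + vhmcsr)
    y_ref = min(max(y_curr + dy, vvmcsr - 16), y_dim + vvmcsr)
    return x_ref, y_ref


# A vector pointing far to the left is clamped to 16 samples outside the VOP.
print(reference_coords(10, 10, -40, 3, 64, 48, 0, 0))
# A vector pointing past the bottom-right corner is clamped to the rectangle edge.
print(reference_coords(60, 40, 30, 30, 64, 48, 0, 0))
```

Because the clamped position always falls inside the padded area, the decoder never reads undefined samples even when a vector points outside the VOP.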

The one/two/four vector decision is indicated by the MCBPC codeword and the field prediction flag for each macroblock. If one motion vector is transmitted for a macroblock, it is treated as four vectors with the same value as that MV. When two field motion vectors are transmitted, each of the four block prediction motion vectors is set to the average of the field motion vectors (rounded such that all fractional pixel offsets become half pixel offsets). If MCBPC indicates that four motion vectors are transmitted for the current macroblock, the information for the first motion vector is transmitted as the codeword MVD and the information for the three additional motion vectors is transmitted as the codewords MVD2–4. If four vectors are used, each motion vector is used for all pixels in one of the four luminance blocks of the macroblock.

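Overlapped motion compensation is performed when the flag obmc_disable = 0. Each pixel in a luminance prediction block is a weighted sum of three prediction values, divided by 8. The creation of each pixel, $p(i,j)$, in a luminance prediction block is governed by the following equation:

$$p(i,j) = \big(q(i,j)\,H_0(i,j) + r(i,j)\,H_1(i,j) + s(i,j)\,H_2(i,j) + 4\big) \,//\, 8$$

where $q(i,j)$, $r(i,j)$ and $s(i,j)$ are the prediction values obtained from the reference VOP with the motion vectors $(MV^0_x, MV^0_y)$, $(MV^1_x, MV^1_y)$ and $(MV^2_x, MV^2_y)$, respectively. Here, $(MV^0_x, MV^0_y)$ denotes the motion vector for the current block, $(MV^1_x, MV^1_y)$ denotes the motion vector of the block either above or below, and $(MV^2_x, MV^2_y)$ denotes the motion vector of the block either to the left or to the right of the current block. $H_0(i,j)$, $H_1(i,j)$ and $H_2(i,j)$ denote the weighting of each pixel in the current block and neighbor blocks.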
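The weighted sum can be sketched as follows. The sketch assumes that the three weights at each pixel position sum to 8 and that a rounding offset of 4 is added before the division, as in the MPEG-4 Visual specification. The 2x2 weight matrices in the usage example are hypothetical values chosen only to keep the example short; they are not the weighting tables of the standard.

```python
def obmc_pixel(q, r, s, h0, h1, h2):
    """Weighted sum of three prediction values, rounded and divided by 8."""
    return (q * h0 + r * h1 + s * h2 + 4) // 8


def obmc_block(q_blk, r_blk, s_blk, H0, H1, H2):
    """Apply obmc_pixel at every position of equally sized square blocks.

    q_blk: prediction using the current block's MV
    r_blk: prediction using the MV of the block above or below
    s_blk: prediction using the MV of the block to the left or right
    """
    n = len(q_blk)
    return [[obmc_pixel(q_blk[i][j], r_blk[i][j], s_blk[i][j],
                        H0[i][j], H1[i][j], H2[i][j])
             for j in range(n)]
            for i in range(n)]


# Hypothetical 2x2 weights (each position sums to 8), for illustration only.
H0 = [[4, 5], [5, 6]]
H1 = [[2, 2], [1, 1]]
H2 = [[2, 1], [2, 1]]
q = [[80, 80], [80, 80]]
r = [[120, 120], [120, 120]]
s = [[40, 40], [40, 40]]
print(obmc_block(q, r, s, H0, H1, H2))
```

When all three predictions agree, the output equals that common value, since the weights sum to 8 at every position.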

Since the VOP may be coded in P or B mode, there are three types of motion vector modes: forward mode, backward mode, and bi-directional mode. The different modes produce different predictions.

1. Forward mode

Only the forward vector (MVFx, MVFy) is applied in this mode. The luminance and chrominance prediction blocks are generated from the forward reference VOP.

2. Backward mode

Only the backward vector (MVBx, MVBy) is applied in this mode. The luminance and chrominance prediction blocks are generated from the backward reference VOP.

3. Bi-directional mode

Both the forward vector (MVFx, MVFy) and the backward vector (MVBx, MVBy) are applied in this mode. The luminance and chrominance prediction blocks are generated from the forward and backward reference VOPs by performing the forward prediction and the backward prediction and then averaging both predictions pixel by pixel.
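The three modes can be sketched as a single selection step over already motion-compensated blocks. The mode strings, the function name, and the round-half-up rule used for the average are assumptions made for this example; the text only states that the two predictions are averaged pixel by pixel.

```python
def predict_b(mode, fwd_block=None, bwd_block=None):
    """Form the prediction block for one of the three modes.

    fwd_block / bwd_block: blocks already fetched from the forward and
    backward reference VOPs with (MVFx, MVFy) and (MVBx, MVBy).
    """
    if mode == "forward":
        return [row[:] for row in fwd_block]
    if mode == "backward":
        return [row[:] for row in bwd_block]
    if mode == "bidirectional":
        # Assumption: integer average with half values rounded up.
        return [[(f + b + 1) // 2 for f, b in zip(f_row, b_row)]
                for f_row, b_row in zip(fwd_block, bwd_block)]
    raise ValueError("unknown mode: " + mode)


fwd = [[10, 20], [30, 41]]
bwd = [[20, 20], [31, 40]]
print(predict_b("bidirectional", fwd, bwd))
```

The same routine would be applied separately to the luminance block and to each chrominance block.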

Source document: Software Decoding of MPEG-4 Video Using an ARM9 Processor (pages 24-29)