Proposed MV Decision Method

The H.264 encoder performs motion estimation with quarter-pixel accuracy over block sizes ranging from 16X16 down to 4X4, and the candidate with the best RD cost is used for motion compensation. The computational cost of motion estimation is therefore very high. To speed up motion estimation in MPEG-2 to H.264 transcoding, we propose an efficient motion vector decision method. Our algorithm determines motion vectors only for the inter 16X16, 16X8, 8X16 and 8X8 partition blocks. Figure 4.6 shows the flowchart of the proposed algorithm.

A detailed description is given in the following subsections.

Figure 4.6 The flowchart of proposed MV decision

4.2.1 Mode and Energy Check

The motion vectors in an MPEG-2 coded video stream are estimated for 16X16 blocks, so it would seem that they can be reused for the inter 16X16 mode in H.264. However, although both MPEG-2 and H.264 support half-pixel accuracy, they use different filters to interpolate the half-pixel positions: MPEG-2 uses the two-tap filter (1,1)/2, whereas H.264 uses the six-tap filter (1,-5,20,20,-5,1)/32. Therefore, we reuse the MPEG-2 motion vectors only as integer-pixel motion vectors of 16X16 blocks and apply sub-pixel refinement around these integer motion vectors.
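To make the filter mismatch concrete, the following minimal Python sketch evaluates both half-pixel filters on one row of samples. It is a simplified one-dimensional illustration, not code from either codec: the function names and the sample values are hypothetical, and the rounding offsets (+1 and +16) and the 8-bit clipping follow the usual integer form of these filters.

```python
def half_pel_mpeg2(a, b):
    """Two-tap (1,1)/2 filter: rounded average of the two nearest integer samples."""
    return (a + b + 1) >> 1


def half_pel_h264(samples):
    """Six-tap (1,-5,20,20,-5,1)/32 filter over six consecutive integer samples.

    The half-pixel position lies between samples[2] and samples[3].
    """
    e, f, g, h, i, j = samples
    value = (e - 5 * f + 20 * g + 20 * h - 5 * i + j + 16) >> 5
    return max(0, min(255, value))  # clip to the 8-bit sample range


row = [0, 0, 100, 200, 50, 0]  # hypothetical luma samples
mpeg2_value = half_pel_mpeg2(row[2], row[3])  # 150
h264_value = half_pel_h264(row)               # 180
```

The two filters produce different samples at the same half-pixel position, so a half-pixel vector that was optimal for MPEG-2 is not guaranteed to remain optimal for H.264.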

On the other hand, although the MPEG-2 standard does not support the 16X8, 8X16, and 8X8 block sizes, we still reuse the MPEG-2 motion vector directly for such a partition block in H.264 if the energy of the partition block is equal to zero.

For example, in Figure 4.7, the MPEG-2 motion vector is reused directly for the right 16X8 partition block, since the energy values of the upper-right and bottom-right 8X8 blocks of the corresponding macroblock are both zero.

Figure 4.7 Reuse of MPEG-2 MV for Eng = 0
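The zero-energy check can be sketched as follows. This is only an illustration under stated assumptions: the energies of the four 8X8 blocks are taken as given inputs, and the partition names (`top`, `right`, `ul`, and so on) and the function name are hypothetical labels chosen here, not identifiers from the transcoder.

```python
# Each partition is described by the 8X8 blocks it covers
# (ul = upper-left, ur = upper-right, bl = bottom-left, br = bottom-right).
PARTITIONS = {
    "top": ("ul", "ur"),
    "bottom": ("bl", "br"),
    "left": ("ul", "bl"),
    "right": ("ur", "br"),
    "ul": ("ul",),
    "ur": ("ur",),
    "bl": ("bl",),
    "br": ("br",),
}


def partitions_reusing_mpeg2_mv(energy):
    """Return the partitions whose 8X8 blocks all have zero energy."""
    return [
        name
        for name, blocks in PARTITIONS.items()
        if all(energy[b] == 0 for b in blocks)
    ]


# Hypothetical energies matching the situation of Figure 4.7:
# both right-hand 8X8 blocks have zero energy.
energy = {"ul": 37, "ur": 0, "bl": 12, "br": 0}
reusable = partitions_reusing_mpeg2_mv(energy)  # ["right", "ur", "br"]
```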

4.2.2 Motion Vector Prediction

The motion mapping algorithm [13] uses the motion vectors of the current macroblock and of the macroblocks adjacent to the current target block to derive the output motion vector (Figure 2.2). In our approach, we use the same motion vectors as the candidate motion vectors for H.264 motion vector prediction.

However, the motion vectors of those macroblocks are not always good for the target block, because some of these candidates, called unreliable motion vectors, may point in directions different from the real motion of the target block.

Therefore, we present an efficient method to remove such unreliable motion vectors from the candidate motion vectors. We use the energy of the residual coefficients to identify the unreliable motion vectors and remove them from the candidate set, as shown in Figure 4.8.

Figure 4.8 Select Candidate MV: (a) 16X8, (b) 8X16, (c) 8X8 (figure labels: Eng > 0, Eng = 0, Reused MPEG-2 MV)

We compare the energy of the target block with the energy of each block adjacent to it. If the energy of a neighboring block is larger than two times the energy of the target block, the motion vector of that neighbor is defined as an unreliable motion vector for the target block and is removed from the set of candidate motion vectors. For example, for the 16X8 mode in Figure 4.8(a), partition block A and block 1 are both 16X8 in size, each consisting of two 8X8 blocks, while blocks 0, 2, 3 and 4 are 8X8 blocks. Because partition block A covers two 8X8 blocks, we compare the energy of partition block A with that of blocks 0, 2, 3 and 4, and compare two times the energy of partition block A with that of block 1, as follows:

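$$\mathrm{Eng}(i)_{8\times 8} > \mathrm{Eng}(A)_{16\times 8}, \quad i = 0, 2, 3, 4$$

$$\mathrm{Eng}(1)_{16\times 8} > 2 \times \mathrm{Eng}(A)_{16\times 8}$$

A candidate that satisfies its condition is removed. The motion vector of partition block A is then predicted from the remaining candidates:

$$MV(A) = \mathrm{round}\Big(\sum_{i} w_i \times MV_i\Big) \qquad (1)$$

where the weight $w_i$ is inversely proportional to the distance between the geometric center of each candidate macroblock that has not been removed and that of the target partition block A. Since the motion vectors predicted in this way may not be good enough, the next subsection describes how we decide adaptively between mapping the motion vector directly and refining it.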
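The candidate removal and the distance-weighted prediction can be sketched in Python as below. Several details are assumptions made for this illustration and are not stated in the text: the threshold is scaled by the ratio of block sizes so that it reproduces the 16X8 example (an 8X8 neighbor is compared with Eng(A), a 16X8 neighbor with two times Eng(A)), the weights are normalized to sum to one, and halves are rounded upward. All names, field layouts and numbers are hypothetical.

```python
import math


def remove_unreliable(target_energy, target_blocks, candidates):
    """Keep candidates whose energy does not exceed twice the target energy,
    scaled by size ('blocks' = number of 8X8 blocks a region covers)."""
    kept = []
    for cand in candidates:
        threshold = 2 * target_energy * cand["blocks"] / target_blocks
        if cand["energy"] <= threshold:
            kept.append(cand)
    return kept


def predict_mv(target_center, kept):
    """Weighted average of the kept MVs, weights ~ 1 / distance between centers."""
    # The 1e-6 guard against a zero distance is an assumption of this sketch.
    weights = [
        1.0 / max(math.dist(target_center, c["center"]), 1e-6) for c in kept
    ]
    total = sum(weights)
    x = sum(w * c["mv"][0] for w, c in zip(weights, kept)) / total
    y = sum(w * c["mv"][1] for w, c in zip(weights, kept)) / total
    return (math.floor(x + 0.5), math.floor(y + 0.5))


# Hypothetical 16X8 target partition A: energy 100, two 8X8 blocks, center (8, 4).
candidates = [
    {"mv": (4, 0), "energy": 150, "blocks": 2, "center": (8.0, 8.0)},
    {"mv": (8, 0), "energy": 90, "blocks": 1, "center": (8.0, -4.0)},
    {"mv": (-20, 6), "energy": 130, "blocks": 1, "center": (-4.0, 4.0)},
]
kept = remove_unreliable(100, 2, candidates)  # the third candidate is removed
predicted = predict_mv((8.0, 4.0), kept)      # (5, 0)
```

In this example the outlier vector (-20, 6) exceeds its threshold and no longer influences the prediction, which is the purpose of the energy test.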

4.2.3 MV Mapping or Integer-Pixel Refinement

In equation (1), the largest share of the predicted motion vector comes from the motion vector of the macroblock that contains the target partition block. We therefore use the magnitude of the energy to estimate the accuracy of the original MPEG-2 motion vector of this macroblock and to determine the search range of the refinement window.

The refinement is performed with the full search method of H.264 within this search window, which is centered at the predicted motion vector.
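A full search over a small window centered at the predicted motion vector can be sketched as follows. This is a minimal illustration, not the H.264 reference implementation: it assumes a plain SAD cost rather than the encoder's RD cost, takes the search range as a parameter, and uses hypothetical function names and synthetic frame data.

```python
def sad(cur, ref, bx, by, w, h, mv):
    """Sum of absolute differences between the current block and the
    reference block displaced by mv = (mvx, mvy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + y + mv[1]][bx + x + mv[0]])
        for y in range(h)
        for x in range(w)
    )


def full_search(cur, ref, bx, by, w, h, pred_mv, search_range):
    """Test every integer MV within +/- search_range of pred_mv."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = None, None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            mv = (pred_mv[0] + dx, pred_mv[1] + dy)
            # Skip displacements that point outside the reference frame.
            if not (0 <= bx + mv[0] and bx + mv[0] + w <= width
                    and 0 <= by + mv[1] and by + mv[1] + h <= height):
                continue
            cost = sad(cur, ref, bx, by, w, h, mv)
            if best_cost is None or cost < best_cost:
                best_mv, best_cost = mv, cost
    return best_mv, best_cost


# Synthetic frames: the current frame is the reference displaced by (1, -2).
ref = [[(7 * x + 13 * y) % 251 for x in range(16)] for y in range(16)]
cur = [[(7 * (x + 1) + 13 * (y - 2)) % 251 for x in range(16)] for y in range(16)]
mv, cost = full_search(cur, ref, 6, 6, 4, 4, (0, 0), 2)  # ((1, -2), 0)
```

With a search range of 2 the window contains 25 positions, and with a range of 4 it contains 81, which is why keeping the range small matters for speed.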

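$$\text{Search range} = \begin{cases} 2, & \text{if } \mathrm{Eng} > \mathrm{High\_Th} \\ 4, & \text{if } \mathrm{Eng} > \mathrm{High\_Th} \times 2 \end{cases} \quad \text{for 16X8 or 8X16 mode}$$

$$\text{Search range} = \begin{cases} 2, & \text{if } \mathrm{Eng} > \mathrm{High\_Th} / 2 \\ 4, & \text{if } \mathrm{Eng} > \mathrm{High\_Th} \end{cases} \quad \text{for 8X8 mode} \qquad (2)$$

Figure 4.9 Set the search range: (a) 16X8 or 8X16 mode, (b) 8X8 mode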

The search range is defined in formula (2), where High_Th is the high threshold defined in subsection 4.1.2. In inter 16X8 mode, if the energy of the target block is lower than High_Th, the predicted motion vector is used directly. If the energy is larger than High_Th, or even larger than two times High_Th, the search range of the refinement window is set to 2 or 4 pixels, respectively, in order to find a better motion vector efficiently. The inter 8X16 and 8X8 modes undergo similar motion vector refinement. Figure 4.10 shows an example for inter 8X16 mode, in which Eng(A) is assumed to be less than High_Th while Eng(B) lies between High_Th and High_ThX2. In this case, the 8X16 partition block A uses the predicted motion vector directly, but the predicted motion vector of partition block B needs to be refined.

Figure 4.10 Eng(A) < High_Th and High_ThX2 > Eng(B) > High_Th
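The selection of the search range can be written as a small function. This is a sketch under stated assumptions: the function name, the mode strings and the numeric value of High_Th are hypothetical, and a return value of 0 stands for "use the predicted motion vector directly".

```python
def search_range(energy, high_th, mode):
    """Return the refinement search range in pixels for a partition block.

    For 8X8 mode both thresholds are halved; 16X8 and 8X16 share the same ones.
    """
    scale = 0.5 if mode == "8X8" else 1.0
    if energy > 2 * high_th * scale:
        return 4
    if energy > high_th * scale:
        return 2
    return 0  # predicted motion vector is used without refinement


HIGH_TH = 100  # hypothetical threshold value

# The situation of Figure 4.10 in 8X16 mode:
range_a = search_range(50, HIGH_TH, "8X16")   # Eng(A) < High_Th            -> 0
range_b = search_range(150, HIGH_TH, "8X16")  # High_Th < Eng(B) < 2High_Th -> 2
```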

4.2.4 Sub-Pixel Refinement

After the above process, the best integer motion vector for each target partition block has been estimated. We then perform sub-pixel motion refinement to compensate for the difference between the half-pixel interpolation methods used in MPEG-2 and H.264. We first perform half-pixel refinement around the best integer motion vector, and then quarter-pixel refinement around the best half-pixel motion vector.
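The two-stage refinement can be sketched as follows. This is only an illustration: motion vectors are expressed in quarter-pixel units, each stage is assumed to test the center and its eight neighbors, and the cost function is passed in as a parameter. In a real encoder the cost would be computed on samples interpolated with the H.264 filters; the quadratic cost used in the example is purely hypothetical.

```python
def subpel_refine(int_mv, cost):
    """Refine an integer-pixel MV to quarter-pixel accuracy.

    int_mv is in integer pixels; the returned MV and the argument of
    cost() are in quarter-pixel units.
    """
    best = (int_mv[0] * 4, int_mv[1] * 4)
    for step in (2, 1):  # 2 = half-pixel stage, 1 = quarter-pixel stage
        center = best
        candidates = [
            (center[0] + dx * step, center[1] + dy * step)
            for dy in (-1, 0, 1)
            for dx in (-1, 0, 1)
        ]
        best = min(candidates, key=cost)
    return best


# Toy cost with its minimum at (13, -6) in quarter-pixel units.
def toy_cost(mv):
    return (mv[0] - 13) ** 2 + (mv[1] + 6) ** 2


refined = subpel_refine((3, -1), toy_cost)  # (13, -6)
```

Each stage evaluates at most nine positions, so the sub-pixel refinement adds only a small, fixed cost per partition block.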

Chapter 5 Experimental Results

In this chapter, we compare the proposed method with Chen's algorithm [9] and Xin's algorithm [13], which speed up the block mode decision and the motion vector decision, respectively. We also compare the proposed transcoder with the standard MPEG-2 to H.264 transcoder. The parameters of our experimental environment are set as follows:

• CPU: Intel Pentium 4, 3.0 GHz
• Test sequences (frames): Foreman (120), Coastguard (120), Bus (120), News (120), Mobile (120), Stefan (90)
• Group of Pictures (GOP) structure: I P P P P ……
• GOP size: 30 frames
• Frame rate: 30 fps
• Frame format: CIF (352 x 288 pixels)
• Codec: MPEG-2 (TM5), H.264 (JM13.1)
• MPEG-2 bitrate: 3.2 Mbps
• RD optimization: High complexity mode (if used)
• Motion estimation: search window size = 16
• Rate control: used
• Inter modes: Skip, 16X16, 16X8, 8X16, and 8X8 are enabled

As listed above, the input MPEG-2 bitstream in our experiments is encoded at a bitrate of 3.2 Mbps. The output H.264 bitstreams are encoded at various bitrates, using the rate control of the H.264 standard, in order to compare the performance at different bitrates. Both the MPEG-2 and the H.264 streams use the IPPP frame structure, and the "High complexity mode" is enabled in the H.264 encoding process when RD optimization is used. We also ran the algorithms of [9] and [13] under the conditions described in their papers to verify the correctness of our implementations.

However, in order to make a fair comparison, the following experiments use an experimental environment that differs from the one in their papers.

