Content-aware fast motion estimation algorithm


Yi-Wen Chen *, Ming-Ho Hsiao, Hua-Tsung Chen, Chi-Yu Liu, Suh-Yin Lee

College of Computer Science, National Chiao Tung University, 1001 Ta-Hsueh Road, Hsinchu, Taiwan

Received 31 July 2007; accepted 24 January 2008

Available online 19 February 2008

Abstract

In this paper, we propose the Content-Aware Fast Motion Estimation Algorithm (CAFME), which reduces the computational complexity of motion estimation (ME) in H.264/AVC while maintaining almost the same coding efficiency. Motion estimation can be divided into two phases: a searching phase and a matching phase. For the searching phase, we propose the Simple Dynamic Search Range Algorithm (SDSR), which uses video characteristics to reduce the number of search points (SP). For the matching phase, we integrate the Successive Elimination Algorithm (SEA) with the integral frame to develop a new SEA for the H.264/AVC video compression standard, called the Successive Elimination Algorithm with Integral Frame (SEAIF). In addition, we propose the Early Termination Algorithm (ETA) to terminate the motion estimation of the current block early.

We implement the proposed algorithm in the H.264/AVC reference software JM 9.4. The experimental results show that it reduces the number of search points by about 93.1% and the encoding time by about 42%, while maintaining almost the same bitrate and PSNR.

Keywords: Motion estimation; Successive Elimination Algorithm; Integral frame; Search range; H.264/AVC; SAD; Motion vector; Computational complexity

1. Introduction

Block-matching motion estimation (ME) and compensation is a fundamental process in international video coding standards such as MPEG-1, MPEG-2, MPEG-4, ITU-T H.263, and H.264, because it efficiently removes temporal redundancy. Since the ME module is usually the most computationally intensive part of a typical video encoder (about 50–90% of the entire system), an efficient ME module is essential.

In recent years, many fast motion estimation algorithms have been proposed. Some algorithms, such as Three-Step Search (TSS) [1] and Diamond Search (DS) [2], search for the best matched block following a predefined search pattern to speed up the searching process. The Successive Elimination Algorithm (SEA) [3] is a lossless approach that avoids unnecessary computation of the sum of absolute difference (SAD) and reduces the computational complexity while maintaining the same performance. The Window Follower Algorithm (WFA) [7] dynamically adjusts the size of the search window to avoid unnecessary computations. Since video content varies dramatically, these algorithms do not always perform well for videos with different levels of motion activity. The advantages and drawbacks of these algorithms are listed in Table 1.

In this paper, we propose the Content-Aware Fast Motion Estimation (CAFME) Algorithm, which speeds up motion estimation by taking the video content into account. CAFME consists of the Simple Dynamic Search Range Algorithm (SDSR), the Successive Elimination Algorithm with Integral Frame (SEAIF), and the Early Termination Algorithm (ETA). The SDSR adjusts the search range adaptively according to the motion activity of the video, and the experiments show that it performs well for videos with different kinds of motion activity. The SEAIF is designed to save block-matching computation in the motion estimation of H.264/AVC, and the ETA skips some search points by terminating the searching process early. Although CAFME consists of the SDSR, SEAIF, and ETA, these three algorithms can be used independently. The experimental results show that the proposed algorithms reduce the computing time of motion estimation and maintain almost the same coding efficiency compared with Full Search.

* Corresponding author. Fax: +886 3 5724176.
E-mail addresses: ewchen@csie.nctu.edu.tw (Y.-W. Chen), mhhsiao@csie.nctu.edu.tw (M.-H. Hsiao), huatsung@csie.nctu.edu.tw (H.-T. Chen), liucy@csie.nctu.edu.tw (C.-Y. Liu), sylee@csie.nctu.edu.tw (S.-Y. Lee).

www.elsevier.com/locate/jvci
J. Vis. Commun. Image R. 19 (2008) 256–269

The paper is organized as follows. Section 2 introduces the related background knowledge. Section 3 presents how the Content-Aware Fast Motion Estimation Algorithm is designed and developed. Section 4 reports the experimental results. Finally, conclusions and future work are given in Section 5.

2. Related works

2.1. Matching criterion

Matching criterion is exploited as a quality evaluation metric for motion estimation algorithms to find out the best matched block. Mean square difference (MSD), mean absolute difference (MAD), and sum of absolute difference (SAD) are frequently used criteria. Their definitions can be described by the following equations.

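$$\mathrm{MSD}(f_c, f_r(m,n)) = \frac{1}{MN}\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\bigl(f_c(i,j) - f_r(i+m,\, j+n)\bigr)^2 \tag{1}$$

$$\mathrm{MAD}(f_c, f_r(m,n)) = \frac{1}{MN}\sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\bigl|f_c(i,j) - f_r(i+m,\, j+n)\bigr| \tag{2}$$

$$\mathrm{SAD}(f_c, f_r(m,n)) = \sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\bigl|f_c(i,j) - f_r(i+m,\, j+n)\bigr| \tag{3}$$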

M and N are the width and height of a block, respectively; m and n are the horizontal and vertical components of the motion vector, respectively; and f_c and f_r are the current and reference blocks, respectively. Among these matching criteria, SAD is multiplication-free, which enables efficient implementations in hardware and software. Therefore, SAD is chosen as the criterion for block matching in the international video coding standards.
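As a minimal sketch of Eqs. (1)–(3), the three criteria can be written in plain Python. The function names and the list-of-rows block representation are choices made here for illustration; `fr` is assumed to be the reference block already displaced by the motion vector (m, n).

```python
def _diffs(fc, fr):
    """Pixel-wise differences between two equally sized blocks (lists of rows)."""
    return [a - b for row_c, row_r in zip(fc, fr) for a, b in zip(row_c, row_r)]


def msd(fc, fr):
    """Mean square difference, Eq. (1)."""
    d = _diffs(fc, fr)
    return sum(x * x for x in d) / len(d)


def mad(fc, fr):
    """Mean absolute difference, Eq. (2)."""
    d = _diffs(fc, fr)
    return sum(abs(x) for x in d) / len(d)


def sad(fc, fr):
    """Sum of absolute difference, Eq. (3): additions and absolute values only."""
    return sum(abs(x) for x in _diffs(fc, fr))
```

Only `msd` needs a multiplication per pixel, which is the property that makes SAD attractive for hardware and software implementations.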

Unlike other video coding standards, H.264 uses the Lagrange multiplier to compute the rate distortion cost for selecting the best partition from seven block partitions for inter-prediction of a macroblock. The best matched block is selected by minimizing the following Lagrange cost.

$$J(MV, \lambda_{motion}) = \mathrm{SAD}(f_c, f_r(m,n)) + \lambda_{motion} \cdot \mathrm{Rate}(MV - MV_P) \tag{4}$$

MV = (m, n) is the motion vector, MV_P = (m_P, n_P) is the prediction for the motion vector, and λ_motion is the Lagrange multiplier. The function Rate(MV − MV_P) represents the predicted motion error and is implemented by a look-up table [12].
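The cost of Eq. (4) can be sketched as follows. The rate model below is an assumption made for illustration: it counts the bits of a signed Exp-Golomb code for each component of MV − MV_P, whereas the source only states that the rate comes from a look-up table. The names `se_golomb_bits` and `motion_cost` are hypothetical.

```python
def se_golomb_bits(v):
    """Bit length of the signed Exp-Golomb code of integer v (assumed rate model)."""
    code_num = 2 * abs(v) - (1 if v > 0 else 0)
    return 2 * (code_num + 1).bit_length() - 1


def motion_cost(sad_value, mv, mv_pred, lam):
    """Lagrange cost J = SAD + lambda * Rate(MV - MV_P), following Eq. (4)."""
    rate = (se_golomb_bits(mv[0] - mv_pred[0])
            + se_golomb_bits(mv[1] - mv_pred[1]))
    return sad_value + lam * rate
```

With this model, a candidate that equals the predicted vector pays the smallest rate, so between two candidates with equal SAD the one closer to MV_P wins.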

2.2. Integral frame

The integral frame was proposed by Viola and Jones [13] to efficiently compute the sum of pixel values within any rectangular area of a frame. The main idea is to first calculate the value of the integral frame at each pixel (p, q) of a frame f, denoted by I_f(p, q) and defined in Eq. (5), in which f(i, j) represents the pixel value at position (i, j). From Eq. (5), the value I_f(p, q) is the sum of the pixel values within the rectangle whose top-left corner is (0, 0) and whose bottom-right corner is (p, q). As Fig. 1 shows, I_f(p, q) is the sum of the pixel values within the gray area.

$$I_f(p,q) = \sum_{i=0}^{p}\sum_{j=0}^{q} f(i,j) \tag{5}$$

The computational cost of the integral frame can be analyzed with the following equations. Let R_f(p, q) be the sum of pixel values from pixel (0, q) to pixel (p, q) in row q. By using Eqs. (9) and (10) recursively, one can compute all the integral frame values I_f(i, j) within a frame in one pass. For a frame of W × H pixels, 2WH additions are required to compute all the integral frame values.

$$R_f(p,q) = \sum_{i=0}^{p} f(i,q) \tag{6}$$

$$R_f(-1,q) = 0 \tag{7}$$

$$I_f(p,-1) = 0 \tag{8}$$

$$R_f(p,q) = R_f(p-1,q) + f(p,q) \tag{9}$$

$$I_f(p,q) = I_f(p,q-1) + R_f(p,q) \tag{10}$$

Table 1
Advantages and drawbacks of fast motion estimation algorithms

| Category | Advantage | Drawback |
| Follow certain search pattern | Number of SP is very small; reduce considerable computation | Local minimum problem; unsuitable for high motion; coding efficiency degradation |
| Adjust search window size | Number of SP is small; reduce considerable computation | Need thresholds; unsuitable for sudden motion change; substantial overhead; coding efficiency degradation |
| Reduce matching complexity | Reduce considerable computation; lossless approach | Substantial overhead; unsuitable for hardware |

After all the integral frame values within a frame have been calculated, the sum of pixel values in any rectangular block of the frame can be computed with three arithmetic operations. For example, as illustrated in Fig. 2, the block sum of block D, denoted BS(D), is obtained with only three operations, as Eq. (11) shows:

$$BS(D) = \sum_{i=r+1}^{p}\sum_{j=s+1}^{q} f(i,j) = I_f(p,q) - I_f(r,q) - I_f(p,s) + I_f(r,s) \tag{11}$$

Because the integral frame speeds up the computation of block sums, our proposed fast block-matching algorithm uses it to further improve the efficiency of motion estimation in video encoding.
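The recursion of Eqs. (7)–(10) and the block sum of Eq. (11) can be sketched in a few lines. This is an illustrative sketch, not the JM implementation: the frame is assumed to be a list of rows with `frame[q][p]` the pixel in row q and column p, and the table is padded with one leading row and column of zeros so that the boundary cases of Eqs. (7) and (8) need no special handling.

```python
def integral_frame(frame):
    """Return table I with I[q + 1][p + 1] equal to I_f(p, q) of Eq. (5)."""
    height, width = len(frame), len(frame[0])
    table = [[0] * (width + 1) for _ in range(height + 1)]
    for q in range(height):
        row_sum = 0                                   # R_f(-1, q) = 0, Eq. (7)
        for p in range(width):
            row_sum += frame[q][p]                    # Eq. (9)
            table[q + 1][p + 1] = table[q][p + 1] + row_sum   # Eq. (10)
    return table


def block_sum(table, r, s, p, q):
    """Sum of pixels with column in r+1..p and row in s+1..q, Eq. (11)."""
    return (table[q + 1][p + 1] - table[q + 1][r + 1]
            - table[s + 1][p + 1] + table[s + 1][r + 1])
```

Each pixel costs two additions when the table is built, and each block sum afterwards costs three additions or subtractions regardless of the block size. Passing r = −1 or s = −1 selects a block that touches the frame border.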

2.3. Successive Elimination Algorithm (SEA)

In motion estimation, once the SAD value between the current block and a candidate block is computed, it is compared with the current minimum SAD value. If the newly computed SAD value is smaller than the current minimum SAD, the candidate block becomes the up-to-date best matched block. To reduce the computation of SAD, the Successive Elimination Algorithm (SEA) [3] was proposed to speed up motion estimation by pruning unnecessary computation. The main idea of SEA is shown in Eq. (12), in which f_c and f_r represent the current block and the candidate block, and BS_c and BS_r are the block sums of the current block and the candidate block, respectively. sea(f_c, f_r) is the absolute difference between the block sums BS_c and BS_r.

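$$\begin{aligned}
\mathrm{SAD}(f_c, f_r) &= \sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\bigl|f_c(i,j) - f_r(i,j)\bigr| \\
&\ge \Bigl|\sum_{i=0}^{M-1}\sum_{j=0}^{N-1} f_c(i,j) - \sum_{i=0}^{M-1}\sum_{j=0}^{N-1} f_r(i,j)\Bigr| \\
&= |BS_c - BS_r| = \mathrm{sea}(f_c, f_r)
\end{aligned} \tag{12}$$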

SAD(f_c, f_r) is equal to or larger than sea(f_c, f_r). If sea(f_c, f_r) is larger than the current minimum SAD, SAD(f_c, f_r) must also be larger than the current minimum SAD, and therefore the computation of SAD(f_c, f_r) can be skipped. Moreover, computing the sea value is easier than computing SAD, because BS_c needs to be calculated only once and BS_r can be derived from the previous value of BS_r. Hence, SEA can efficiently reduce the computation of SAD.
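The pruning rule can be illustrated with a small full search. This is a sketch under simplifying assumptions: frames are lists of rows, candidate positions outside the reference frame are ignored, and the candidate block sum is recomputed directly for brevity rather than derived incrementally or from an integral frame. The name `sea_search` and its return values are illustrative.

```python
def block_sad(cur, ref, x, y):
    """SAD between block `cur` and the same-size block of `ref` at top-left (x, y)."""
    return sum(abs(c - ref[y + j][x + i])
               for j, row in enumerate(cur) for i, c in enumerate(row))


def sea_search(cur, ref, x0, y0, sr):
    """Full search around (x0, y0) with SEA pruning.

    Returns (best_mv, best_sad, number_of_sad_computations).
    """
    n_rows, n_cols = len(cur), len(cur[0])
    height, width = len(ref), len(ref[0])
    bs_c = sum(map(sum, cur))                      # computed only once
    best_sad, best_mv, computed = None, None, 0
    for n in range(-sr, sr + 1):
        for m in range(-sr, sr + 1):
            x, y = x0 + m, y0 + n
            if not (0 <= x <= width - n_cols and 0 <= y <= height - n_rows):
                continue
            bs_r = sum(ref[y + j][x + i]
                       for j in range(n_rows) for i in range(n_cols))
            if best_sad is not None and abs(bs_c - bs_r) > best_sad:
                continue                           # SAD >= sea > current minimum
            s = block_sad(cur, ref, x, y)
            computed += 1
            if best_sad is None or s < best_sad:
                best_sad, best_mv = s, (m, n)
    return best_mv, best_sad, computed
```

Because a candidate is skipped only when its lower bound already exceeds the current minimum, the search returns the same minimum SAD as an exhaustive search, which is why SEA is described as lossless.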

The Multilevel SEA (MSEA) proposed in [4] is a generalized SEA. MSEA partitions a block into several sub-blocks and calculates the BS of each sub-block to generate a tighter decision value. The block is partitioned in a multi-level manner: at the L-level partition, the block is divided into 2^{2L} sub-blocks of size N/2^L × N/2^L. The msea(f_c, f_r) value for the current block f_c and the candidate block f_r is then computed by summing the absolute differences of the corresponding sub-block sums. The msea(f_c, f_r) value is always equal to or larger than sea(f_c, f_r) and is itself a lower bound of SAD, as described in Eq. (13), where k is the index of the sub-block and L is the level of division. When the block size is 16 × 16 (M = 16, N = 16), MSEA with level L = 0 corresponds to SEA, and MSEA with level L = 4 corresponds to SAD. The decision bound is tighter when the level L is larger; however, the computational cost is also higher.

Fig. 1. Integral frame.

[Fig. 2: regions A, B, C, and D of a frame; block D spans columns r + 1 to p and rows s + 1 to q.]

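$$\begin{aligned}
\mathrm{SAD}(f_c, f_r) &= \sum_{i=0}^{M-1}\sum_{j=0}^{N-1}\bigl|f_c(i,j) - f_r(i,j)\bigr| \\
&\ge \sum_{k=0}^{2^{2L}-1}\bigl|BS_c^k - BS_r^k\bigr| = \mathrm{msea}(f_c, f_r) \\
&\ge |BS_c - BS_r| = \mathrm{sea}(f_c, f_r)
\end{aligned} \tag{13}$$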
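The chain of bounds in Eq. (13) can be checked numerically with a short sketch. The function below assumes square blocks stored as lists of rows whose side is divisible by 2^L; its name and signature are illustrative.

```python
def msea(fc, fr, level):
    """Multilevel SEA bound for N x N blocks split into 2**level parts per side."""
    n = len(fc)
    parts = 2 ** level
    size = n // parts
    total = 0
    for by in range(parts):
        for bx in range(parts):
            bs_c = bs_r = 0
            for j in range(by * size, (by + 1) * size):
                for i in range(bx * size, (bx + 1) * size):
                    bs_c += fc[j][i]
                    bs_r += fr[j][i]
            total += abs(bs_c - bs_r)    # |BS_c^k - BS_r^k| of sub-block k
    return total
```

For a 4 × 4 block, level 0 gives the sea value and level 2 reaches single pixels and therefore gives SAD, mirroring the L = 0 and L = 4 cases for 16 × 16 blocks.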

2.4. Modified Window Follower Algorithm (MWFA)

The search range is also a critical factor that influences the computational complexity of motion estimation. A small search range results in poor matching, while a large search range produces a higher computational load. A suitable search range can reduce the computational complexity and still maintain good coding performance. The Window Follower Algorithm (WFA) [7] was proposed to adaptively adjust the search range based on the following assumptions:

(1) The change of motion content between frames is gradual and not sudden.

(2) The motion content is constant over a large number of successive frames.

WFA takes the maximum displacement of the MVs in the previous frame plus one unit as the search range for the current frame. The algorithm is presented in Table 2.

However, the characteristics of motion in natural video sequences vary considerably and are hardly predictable, so the assumptions of WFA may not hold in natural video sequences. MWFA [8] modifies WFA by exploiting both temporal and spatial information and by adopting SAD as a measure of the accuracy of the MV. The MWFA algorithm is presented in Table 3. SADmin_{t−1} and d_{t−1} represent the minimum SAD and the maximum MV displacement of the (t − 1)th block in the current frame, respectively. The flag F is set to zero at the beginning of each frame. When F is zero, only temporal information is considered; when F is one, both temporal and spatial information are taken into account. According to the experimental results from simulations of typical video sequences, the thresholds TH1 and TH2 are set to 4096 and 2048, respectively.
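The search range rules of WFA (Table 2) and MWFA (Table 3) can be sketched as below. Two points are assumptions made here and not stated in the source: displacements are taken as magnitudes, and the MWFA rules are evaluated in the order (1), (3), (2) so that exactly one rule fires. The function names are illustrative.

```python
TH1, TH2 = 4096, 2048   # thresholds given in the text


def wfa_range(prev_frame_mvs, p_max):
    """WFA: P = D + 1, with D the largest MV component of the previous frame."""
    if not prev_frame_mvs:          # first frame: default maximum search range
        return p_max
    d_max = max(max(abs(mx), abs(my)) for mx, my in prev_frame_mvs)
    return d_max + 1


def mwfa_range(d_max, sad_min_prev, d_prev, flag, p_max):
    """MWFA search range for block t; returns (P_t, updated flag F).

    d_max is D from the previous frame, sad_min_prev and d_prev are the
    minimum SAD and maximum MV displacement of block t - 1.
    """
    if sad_min_prev >= TH1:                      # rule (1): poor match
        return p_max, 1
    if flag == 1:
        if sad_min_prev <= TH2:                  # rule (3): very good match
            return max(d_max, d_prev), 1
        return max(d_max, d_prev) + 1, 1         # rule (2), F == 1
    return d_max + 1, 0                          # rule (2), F == 0
```

Once a badly matched block sets F to one, later blocks in the same frame also take the displacement of their predecessor into account, which is how spatial information enters the decision.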

3. Content-Aware Fast Motion Estimation Algorithm

In this paper, in order to reduce the computational complexity of motion estimation in H.264/AVC, we analyze the correlation between the search range and the motion activity of the video content, as well as the correlation of the motion vectors between neighboring blocks. These correlations are fully considered in the development of the Content-Aware Fast Motion Estimation Algorithm (CAFME). We first present some observations and analyses of the search range in motion estimation in Section 3.1. Then, the details of the proposed algorithms, the Simple Dynamic Search Range Algorithm (SDSR), SEA with Integral Frame (SEAIF), and the Early Termination Algorithm (ETA), are described in Sections 3.2–3.4, respectively.

3.1. Analysis of search range

The true MV is defined as the displacement of the current block from the matched block in the reference picture with the minimum SAD value. The search range constrains the current block to search for the best matched block within a predefined area of the reference picture. Therefore, exploring the effect of the coding parameters on the search range helps to adaptively determine, for each coding block, a suitable search range in which the true MV can be found. Experiments were conducted to observe and analyze the relationships between search range (SR) and frame rate, frame resolution, motion activity, quantization parameter (QP), and the SAD of the best matched block. The results and discussion of each experiment are presented in the following subsections. The experimental environment is shown in Table 4.

3.1.1. Search range and frame rate

Since the frame rate affects the difference between successive frames, we observe the relationship between SR and frame rate. The test data are the foreman sequence with FPS = 30 and FPS = 15, for which the temporal distances between frames are 1/30 s and 1/15 s, respectively. In theory, when the frame rate is higher, motion estimation needs a smaller search range.

Table 2
Window Follower Algorithm

Step 1: For the kth frame, compute the maximum horizontal and vertical displacement from all MVs in the (k − 1)th frame. The maximum value D is defined in Eq. (14), where d_t represents the maximum displacement of the two components of the MV of the tth block.

$$D = \max_t\,[d_t] \tag{14}$$

$$d_t = \max[MV_{x,t},\, MV_{y,t}] \tag{15}$$

Step 2: Perform motion estimation for the kth frame with search range P = D + 1. For the first frame, the search range P is set to the default maximum search range defined in the sequence parameter set.

Table 3
Modified Window Follower Algorithm

Step 1: For the kth frame, compute the value D as defined in WFA.
Step 2: Perform motion estimation for each block in the kth frame with search range P_t for the tth block. P_t is determined by the following mutually exclusive rules:
  (1) If (SADmin_{t−1} >= TH1): P_t = P_max, F = 1
  (2) If (SADmin_{t−1} <= TH1 and F == 1): P_t = max(D, d_{t−1}) + 1
      If (SADmin_{t−1} <= TH1 and F == 0): P_t = D + 1
  (3) If (SADmin_{t−1} <= TH2 and F == 1): P_t = max(D, d_{t−1})

The quantization parameter (QP) is mapped to the quantization step and affects the bitrate significantly. In our experiments, the QP is fixed and Rate Control (RC) is disabled, so we only need to observe the bitrate for different search ranges. In Table 5, the gray areas indicate that the bitrates are stable, which means that the search ranges are sufficient for most MBs to be coded with the true MVs, which minimize the SAD value in motion estimation. The bitrates are stable when the search range (SR) is larger than or equal to 4 with FPS = 30 and larger than or equal to 8 with FPS = 15. This indicates that the sufficient search range grows with the temporal distance between frames, that is, it is inversely related to the frame rate.

3.1.2. Search range and frame resolution

To find the relationship between search range and frame resolution, we run the test sequence in QCIF and CIF resolution. The experimental results for the coastguard sequence are shown in Table 6. The gray areas in QCIF resolution show that the bitrate is stable when the SR varies from 2 to 32, and the gray areas in CIF resolution show that the bitrate is stable when the SR varies from 4 to 32. These observations indicate that a search range of 2 is sufficient to find the true MVs in QCIF resolution and a search range of 4 is sufficient in CIF resolution. The sufficient search range is proportional to the resolution, which means that the search range should be adjusted adaptively based on the frame resolution.

3.1.3. Search range and motion activity

To see the impact of motion activity on search range, we divide the foreman sequence into two sequences, one of low motion and the other of high motion. The first sequence consists of the first 90 frames, and the second consists of frames 151 to 240. In Table 7, the bitrates are approximately stable from SR = 4 in the low-motion sequence and from SR = 8 in the high-motion sequence, which indicates that a larger search range is needed for high-motion sequences to derive the true motion vectors.

3.1.4. Search range, QP, and SAD of best matched block

In block matching, SAD is used as the matching criterion. In this experiment, we observe the impact of search range (SR) and quantization parameter (QP) on the SAD value. As shown in Table 8, the true MVs can be obtained as long as the search range is larger than or equal to 8, while QP only affects the magnitude of SAD. Fig. 3 depicts the average SAD from frame 0 to frame 299 of the Foreman sequence. In this figure, the vertical axis is the average SAD and the horizontal axis is the frame index; the dotted and solid lines represent the average SAD per frame with SR = 4 and SR = 32, respectively. The average SAD values with SR = 4 and SR = 32 are almost the same from frame 0 to frame 299, except for frames 150 to 220, where the curve of SR = 4 lies above the curve of SR = 32. This phenomenon results from the fact that the motion in frames 150 to 220 is higher than in the rest of the sequence. Since the true MV always results in the minimum SAD value, SR = 4 is not sufficient to find the true MVs for frames 150 to 220, which results in larger SAD values.

Table 4
Experimental environment for analysis of the correlations between search range and coding parameters

Encoder environment
  Software: JM 9.4 [14]
  RDO: Enabled
  Number of reference frames: 1
  Quantization parameter (QP): 36
  GOP size: 15
  Macroblock adaptive inter-layer prediction: Enabled
  Machine: Athlon XP 1700+ with 512 MB memory
  Profile: baseline
  Prediction structure: IPPP
  Fast ME (UMHexagonS) [15]: disabled
  Fast mode selection [16]: disabled

Table 5
The relation between SR and FPS

Table 6
The relation between SR and resolution

Table 7
The relation between SR and motion activity

In summary, an appropriate SR reduces unnecessary computation in motion estimation and still finds the true MVs. Our experiments show that the search range should be adjusted adaptively according to the motion activity of the video and the parameters of the encoder. Hence, we propose a mechanism, Simple Dynamic Search Range (SDSR), to adjust the SR dynamically based on motion activity.

3.2. Simple Dynamic Search Range (SDSR)

In order to adaptively adjust the search range for motion estimation, several approaches (DSWA [5], AFSBM [6], MWFA [8], and MAS [9]) have already been implemented. According to their measurement criteria, these approaches can be classified into block-matching error-based and motion vector-based approaches.

The block-matching error, which represents the degree of matching between the current block and the candidate block, is usually measured by the Mean of Squared Difference (MSD), Mean of Absolute Difference (MAD), or Summation of Absolute Difference (SAD). The value of the block-matching error depends on many factors, including motion activity, texture, and quantization parameter. See Fig. 4 for an example. From frame 220 onward, the SAD values are much higher than in the rest of the frames; the reason is the complicated video texture, not the motion activity. In contrast, the SAD values in frames 150 to 170 rise sharply because of a sudden motion change rather than the video texture. Consequently, approaches based on the block-matching error are usually unsuitable for evaluating motion activity.

On the contrary, the motion vector can represent the motion activity more precisely [9]. Since the MV is the displacement between the current block and the best matched block within the search range in the reference picture, the magnitude of the MV must be less than or equal to the search range. The relation between SR and MV can be modeled by Eq. (16).

$$SR = \max[MV_x,\, MV_y] + \Delta \tag{16}$$

Δ is an offset between the maximum component of the MV and the search range (SR), and it can also be viewed as a measure of the extra search points processed in motion estimation. Ideally, Δ = 0 means that SR equals the maximum component of the MV, so no extra search points are processed. An adaptive SR scheme should set SR as close to the MV as possible by minimizing Δ for each coding block. However, the MV of the current block is unknown before the SR is determined. Therefore, in the proposed Simple Dynamic Search Range (SDSR) scheme, we set the SR according to the MVs of neighboring blocks, because the MV of the current block and those of neighboring blocks are highly correlated.

Table 8
The relation between SR, QP, and SAD

Fig. 3. Motion activity in foreman QCIF (QP = 36) frame by frame with respect to different search ranges (SAD versus frame index for SR = 4 and SR = 32).

[Fig. 4: SAD versus frame index for Foreman CIF with SR = 32; average SAD = 1350.]

The proposed SDSR algorithm is described in Table 9. Because motion activity varies widely across video sequences and also across areas within a single frame, we adjust the search range based on both the temporal and the spatial correlation of the motion field. The proposed SDSR scheme first uses the maximum component of the MVs in the previous frame plus an offset γ as an upper bound for search range prediction. This is described by Eq. (17), in which SR_FRAME_k represents the search range at frame level and max[MV_{x,t}, MV_{y,t}] represents the maximum horizontal and vertical displacement among all motion vectors in the previous frame. In this equation, γ enlarges the upper bound for search range prediction to avoid the bad matches caused by a small search range.

$$\mathrm{SR\_FRAME}_k = \max[MV_{x,t},\, MV_{y,t}] + \gamma, \quad \gamma \ge 0, \quad t \in \{\text{all blocks in the } (k-1)\text{th frame}\} \tag{17}$$

Then the maximum displacement component of the MVs neighboring the current block t, denoted MV_MAX_t, is used as a lower bound for search range prediction.

Finally, the search range for each block is adjusted between the prediction upper bound, SR_FRAME_k, and the prediction lower bound, MV_MAX_t. Once MV_MAX_t is found to be larger than or equal to SR_FRAME_k, the final search range for the current block, SR_BLOCK_t, is set to MV_MAX_t plus an offset β, which acts like γ in Eq. (17), as described in Eq. (18):

$$\mathrm{SR\_BLOCK}_t = \mathrm{MV\_MAX}_t + \beta, \quad \beta \ge 0 \tag{18}$$

Otherwise, the final search range is calculated by Eq. (19):

$$\mathrm{SR\_BLOCK}_t = \alpha \cdot \mathrm{MV\_MAX}_t + (1 - \alpha) \cdot \mathrm{SR\_FRAME}_k, \quad 0 \le \alpha \le 1 \tag{19}$$

The control parameter α adjusts the weights of MV_MAX_t and SR_FRAME_k in the final search range SR_BLOCK_t of each block.
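A minimal sketch of the three SDSR steps, following Eqs. (17)–(19) and Table 9, is given below. Several details are assumptions made for illustration: displacements are taken as magnitudes, the weighted range of Eq. (19) is rounded up to an integer, the range is clamped to the maximum search range, and the first frame falls back to that maximum. The parameter values in any call are examples, not values reported in the paper.

```python
import math


def sr_frame(prev_frame_mvs, gamma, max_sr):
    """Frame-level search range, Eq. (17)."""
    if not prev_frame_mvs:                      # assumption: no previous MVs yet
        return max_sr
    return max(max(abs(x), abs(y)) for x, y in prev_frame_mvs) + gamma


def sr_block(neighbor_mvs, sr_frame_k, alpha, beta, max_sr):
    """Block-level search range, Eqs. (18) and (19).

    neighbor_mvs lists the MVs of the left, above-left, above, and above-right
    blocks; an entry is None when that neighbor is not available.
    """
    available = [mv for mv in neighbor_mvs if mv is not None]
    mv_max = max((max(abs(x), abs(y)) for x, y in available), default=0)
    if len(available) < len(neighbor_mvs):      # some neighbor is missing
        mv_max = max(mv_max, sr_frame_k)
    if mv_max >= sr_frame_k:
        sr = mv_max + beta                                   # Eq. (18)
    else:
        sr = alpha * mv_max + (1 - alpha) * sr_frame_k       # Eq. (19)
    sr = math.ceil(sr)                          # assumption: round up
    return max(1, min(sr, max_sr))              # SR constraint
```

With alpha close to 1 the block-level range follows the neighboring MVs closely and saves more search points, while alpha close to 0 falls back to the more conservative frame-level range.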

3.3. Successive Elimination Algorithm with Integral Frame (SEAIF)

In the H.264/AVC standard, the partition modes of each macroblock in motion estimation include nine intra-modes and seven inter-modes (see Fig. 5). In inter-coding, 41 motion estimations are required for a 16 × 16 macroblock when rate-distortion optimization (RDO) is enabled for mode selection (one for 16 × 16, two for 16 × 8, two for 8 × 16, four for 8 × 8, eight for 8 × 4, eight for 4 × 8, and 16 for 4 × 4). Because of the support for various partition modes, the ME cost in H.264/AVC increases dramatically compared with previous video coding standards. Therefore, it is essential to develop an efficient algorithm to speed up the computation of ME.

In the H.264/AVC reference software JM 9.4 [14], in order to reduce the intensive computation caused by RDO, a Fast Full Pel Search algorithm is implemented by reusing the SAD values of the smallest 4 × 4 blocks. At the beginning of the motion estimation of each macroblock, it first computes the SAD values of all 4 × 4 blocks at all search points within the search window. It then merges these SAD values to obtain the SAD values of larger blocks. In this way, the SAD computation for a macroblock with all block sizes enabled is about equal to the SAD computation for a single 16 × 16 block.

We adopt the concept of reusing SAD and integrate it into our proposed algorithm. We combine SEA and the integral frame technique, introduced in Sections 2.3 and 2.2, to form a new SEA for the H.264/AVC standard called SEAIF. The main idea of SEAIF is to reuse sea values and SAD values. Sections 3.3.1 and 3.3.2 present the techniques for reusing sea values and SAD values, respectively. Finally, the complexity of SEAIF is analyzed in Section 3.3.3.

3.3.1. Reusing of sea value

For each search point, the sea values of the sixteen 4 × 4 blocks of the current macroblock are calculated using the integral frame technique. These 4 × 4 sea values are the basis for the sea values of larger blocks, which are derived from them as follows:

Table 9
Simple Dynamic Search Range Algorithm

Step 1: Determine the search range at frame level. The search range, called SR_FRAME_k, is computed as the maximum horizontal and vertical displacement among all MVs in the (k − 1)th frame plus γ. The definition is
    SR_FRAME_k = max[MV_{x,t}, MV_{y,t}] + γ, γ ≥ 0, t ∈ {all blocks in the (k − 1)th frame}

Step 2: Adjust the search range at macroblock level. Let MV_MAX_t denote the maximum displacement of the two components of the MVs in the neighbor blocks of the tth block, as described by the following rules:
    s ∈ {the left, above-left, above, and above-right blocks of the tth block}
    If any of the neighbor blocks is not available
        MV_MAX_t = max[max[MV_{x,s}, MV_{y,s}], SR_FRAME_k]
    Else
        MV_MAX_t = max[MV_{x,s}, MV_{y,s}]

Step 3: Determine the final search range for the tth block, called SR_BLOCK_t, by the following rules:
    // Adjust SR at block level
    If (MV_MAX_t ≥ SR_FRAME_k)
        SR_BLOCK_t = MV_MAX_t + β, β ≥ 0
    Else
        SR_BLOCK_t = α · MV_MAX_t + (1 − α) · SR_FRAME_k, 0 ≤ α ≤ 1
    // SR constraint
    If (SR_BLOCK_t ≤ 1)
        SR_BLOCK_t = 1
    Else if (SR_BLOCK_t ≥ max search range)

For an 8 × 4 or 4 × 8 block, sum up the sea values of two 4 × 4 blocks.

For an 8 × 8 block, sum up the sea values of two 8 × 4 blocks.

For a 16 × 8 or 8 × 16 block, sum up the sea values of two 8 × 8 blocks.

For a 16 × 16 block, sum up the sea values of two 16 × 8 blocks.

In this way, we obtain the sea values of all blocks of the different partitions. These sea values of larger blocks are always equal to or larger than the sea values computed directly from the block sums (BS) of the corresponding blocks, yet they remain lower bounds of SAD. Because the bound is tighter, more SAD computations can be skipped.
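Both properties follow from the triangle inequality. As a short sketch, let a larger block B be tiled by two sub-blocks B_1 and B_2 (symbols introduced here for illustration), and write BS_c(·) and BS_r(·) for the block sums in the current and candidate blocks:

```latex
\begin{aligned}
\bigl|BS_c(B) - BS_r(B)\bigr|
  &= \bigl|\,(BS_c(B_1) - BS_r(B_1)) + (BS_c(B_2) - BS_r(B_2))\,\bigr| \\
  &\le \bigl|BS_c(B_1) - BS_r(B_1)\bigr| + \bigl|BS_c(B_2) - BS_r(B_2)\bigr|
   = \mathrm{sea}(B_1) + \mathrm{sea}(B_2) \\
  &\le \mathrm{SAD}(B_1) + \mathrm{SAD}(B_2) = \mathrm{SAD}(B)
\end{aligned}
```

The first inequality shows that the merged value is at least the directly computed sea value of B, and the second, which applies Eq. (12) to each sub-block, shows that it never exceeds the SAD of B. Repeating the argument at each merging step carries the result from 4 × 4 blocks up to the 16 × 16 block.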

3.3.2. Reusing of SAD value

In SEAIF, if the sea value is less than the current minimum SAD value, the complete calculation of SAD is performed. In H.264/AVC, overlapped blocks are used in motion estimation. In order to reduce the computation of SAD, we take the 4 × 4 block SAD values as the basis of the SAD values of larger blocks. In this way, there is no redundant computation of SAD. The proposed approach is described in Table 10.
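The reuse of 4 × 4 SAD values can be sketched with a small cache keyed by block position and motion vector. This is an illustration of the idea in Table 10 rather than the JM data structure; the helper names, the dictionary cache, and the list-of-rows frames are assumptions.

```python
def make_sad4x4_cache(cur_mb, ref, mb_x, mb_y):
    """Return (sad4, cache); sad4(bx, by, m, n) is the SAD of 4x4 block (bx, by)
    of the macroblock at (mb_x, mb_y) for motion vector (m, n), computed at most once."""
    cache = {}

    def sad4(bx, by, m, n):
        key = (bx, by, m, n)
        if key not in cache:                     # Step 2: compute only if missing
            total = 0
            for j in range(4):
                for i in range(4):
                    c = cur_mb[4 * by + j][4 * bx + i]
                    r = ref[mb_y + 4 * by + j + n][mb_x + 4 * bx + i + m]
                    total += abs(c - r)
            cache[key] = total
        return cache[key]

    return sad4, cache


def partition_sad(sad4, x, y, w, h, m, n):
    """SAD of the w x h partition at offset (x, y) inside the macroblock
    (all multiples of 4), obtained by adding up 4x4 SAD values (Steps 1 and 3)."""
    return sum(sad4(bx, by, m, n)
               for by in range(y // 4, (y + h) // 4)
               for bx in range(x // 4, (x + w) // 4))
```

After the 16 × 16 SAD has been evaluated for a motion vector, the SAD of any other partition at the same vector needs only additions of cached values.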

3.3.3. Analysis of complexity

SEA is adopted to reduce the computational cost of block matching, so its overhead should be considered and analyzed. The overhead of SEA is mainly the computation of block sums. In SEA [3], Salari and Li proposed a fast algorithm to compute the block sums. The conventional approach, the SEA approach, and the integral frame approach are compared below, and the overhead of each approach is analyzed as follows:

Let W denote the image width, H the image height, M the block width, and N the block height. The operations required to compute the block sums of all M × N blocks in a reference frame are as follows:

Straightforward approach:
  Number of block sums in a frame: (W − M + 1)(H − N + 1)
  Operations required for a block sum: MN − 1
  Total cost: (MN − 1)(W − M + 1)(H − N + 1)
  Approximate cost: MNWH

SEA approach in [3]:
  Total cost: 4WH − (H − N)(M + 3) − 3W(N + 1)
  Approximate cost: 4WH

Integral frame approach:
  Operations required for an integral frame: 2WH
  Operations required for all block sums: 2(W − M + 1)(H − N + 1) (see footnote 1)
  Total cost: 2WH + 2(W − M + 1)(H − N + 1)
  Approximate cost: 4WH

Although the integral frame approach and the SEA approach in [3] have approximately the same complexity, the integral frame approach has an advantage: it can flexibly provide the block sum of any rectangular block.
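To give a feeling for the magnitudes, the sketch below evaluates the total-cost formulas of the straightforward and integral frame approaches. The CIF frame size (352 × 288) and the 16 × 16 block size used in the usage note are an example chosen here, not a measurement from the paper, and the function name is illustrative.

```python
def block_sum_costs(W, H, M, N):
    """Operation counts for computing all M x N block sums of a W x H frame."""
    n_blocks = (W - M + 1) * (H - N + 1)
    return {
        # (MN - 1) additions for every block position
        "straightforward": (M * N - 1) * n_blocks,
        # 2WH to build the integral frame plus 2 operations per block sum
        "integral_frame": 2 * W * H + 2 * n_blocks,
    }
```

Calling `block_sum_costs(352, 288, 16, 16)` shows the integral frame approach needing roughly one sixtieth of the operations of the straightforward approach, in line with the approximate costs MNWH and 4WH.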

For example, if we want to use the multilevel SEA for each block size in H.264/AVC, it is easier to implement with the integral frame approach (note that our approach uses the tighter lower bound in SEA, not the multilevel SEA). Computing the msea value of a 16 × 16 block with level L = 0 needs only five operations: three for obtaining the BS, one subtraction, and one absolute value. In contrast, merging sixteen 4 × 4 sea values to obtain the sea value of a 16 × 16 block needs 15 additions, but the resulting sea value is a tighter lower bound. There is a trade-off between the tighter lower bound and computational complexity.

Table 10
Reusing SAD value algorithm

Regardless of block size, the SAD of a block is calculated as follows:
Step 1: Find all 4 × 4 blocks within the block.
Step 2: Check the SAD values of these 4 × 4 blocks. If the SAD value of any 4 × 4 block is not available, compute it.
Step 3: Obtain the SAD value of the target block by adding up the SAD values of these 4 × 4 blocks.

1 In [11], Viet Anh Nguyen and Yap-Pen Tan proposed a fast approach to calculate block sums by exploiting the adjacency of the blocks.

Fig. 5. Different partition sizes in a macroblock (macroblock types: 16×16, 16×8, 8×16, 8×8; sub-macroblock types in 8×8 mode: 8×8, 8×4, 4×8, 4×4).

3.4. Early Termination Algorithm (ETA)

In the H.264/AVC encoder, the most time-consuming component is variable block-size motion estimation. To reduce the complexity of motion estimation, we propose an Early Termination Algorithm (ETA) that predicts the best motion vector by exploiting the correlation between the MVs of the current block and the neighboring blocks. With the proposed method, some of the search points can be discarded early to speed up motion estimation. Siou-Shen Lin et al. [10] show that, on average, the probability that the variance of the motion vectors of the current block and its neighbor blocks is smaller than 3 is about 79%. Since the variance of the motion vectors in the neighbor blocks is small with high probability, the difference between the MV of the current block and those of neighboring blocks is also likely to be small.

We adopt and modify the variance of motion vectors proposed in [10] to classify the motion activity of the current block and its neighbor blocks into simple motion and complex motion. The variance of motion vectors is defined in Eq. (21).

$$MV_{mean} = (MV_a + MV_b + MV_c + MV_d)/4 \tag{20}$$

$$MV_{var} = |MV_a - MV_{mean}| + |MV_b - MV_{mean}| + |MV_c - MV_{mean}| + |MV_d - MV_{mean}| \tag{21}$$

If any of the neighbor blocks is not available, $MV_{var}$ is set to a large value (999,999). As shown in Eq. (22), the threshold j is used to classify each block as simple_motion or complex_motion by its MV variance. According to the experimental results, setting j to 5 saves more computation time while maintaining good coding performance.

$$M_{activity} = \begin{cases} \text{simple\_motion}, & \text{if } MV_{var} \le j \\ \text{complex\_motion}, & \text{otherwise} \end{cases} \tag{22}$$
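Eqs. (20)–(22) can be sketched in Python as below. Several details are assumptions made for illustration and are not stated in the text: each motion vector is an (x, y) pair, the absolute value in Eq. (21) is taken as the sum of the absolute component differences, and the comparison in Eq. (22) is read as "not greater than the threshold". The function names are hypothetical.

```python
LARGE_MV_VAR = 999_999  # value assigned when a neighbor block is unavailable


def mv_variance(neighbors):
    """neighbors: the four MVs (MVa..MVd) as (x, y) pairs, or None if unavailable."""
    if len(neighbors) != 4 or any(mv is None for mv in neighbors):
        return LARGE_MV_VAR
    mean_x = sum(mv[0] for mv in neighbors) / 4   # Eq. (20), per component
    mean_y = sum(mv[1] for mv in neighbors) / 4
    # Eq. (21): |.| is assumed to be the L1 norm of the vector difference
    return sum(abs(mv[0] - mean_x) + abs(mv[1] - mean_y) for mv in neighbors)


def motion_activity(mv_var, threshold=5):
    """Eq. (22): the comparison operator is an assumption (see lead-in)."""
    return "simple_motion" if mv_var <= threshold else "complex_motion"
```

Four identical neighbor MVs give a variance of 0 and hence simple_motion, while a missing neighbor always yields complex_motion.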

Once a block is classified as simple_motion, its SAD value should be similar to those of the neighboring blocks. In contrast, the SAD values of blocks classified as complex_motion should differ considerably from those of the neighboring blocks. Based on this concept, the lower bound for the termination condition is determined by Eq. (23).

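$$\mathrm{SAD\_threshold} = \begin{cases} \mathrm{SAD\_prediction}, & \text{if } M_{activity} = \text{simple\_motion} \\ \mathrm{SAD\_prediction} - \mathrm{SAD\_standard\_deviation}, & \text{otherwise} \end{cases} \tag{23}$$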

SAD_prediction and SAD_standard_deviation represent the predicted SAD of the current block and the standard deviation of the SAD of all blocks in the previous frame, respectively. They are defined in Eqs. (24)–(26):

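$$\mathrm{SAD\_prediction} = (SAD_a + SAD_b + SAD_c + SAD_d)/4 \tag{24}$$

$$\mathrm{SAD\_mean} = \frac{1}{\mathrm{Number\_MB}} \sum_{t=0}^{\mathrm{Number\_MB}-1} SAD_t \tag{25}$$

$$\mathrm{SAD\_standard\_deviation} = \left[ \frac{1}{M-1} \sum_{t=0}^{M-1} \left( SAD_t - \mathrm{SAD\_mean} \right)^2 \right]^{1/2} \tag{26}$$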

$SAD_t$ is the SAD value of the t-th block in a frame. Number_MB is the total number of MBs in a frame. If the current block has no neighbor block, SAD_prediction is set to a small value (−999,999). Note that SAD_prediction and SAD_standard_deviation are calculated for the 16×16 macroblock. In the H.264/AVC standard, seven block sizes are used in motion estimation. We determine SAD_prediction and SAD_standard_deviation for the other block sizes according to the area occupied by the block.

Finally, the termination condition is tested whenever a new best-matched block is found. If the SAD value of this block is equal to or smaller than SAD_threshold, the motion estimation is terminated.
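The threshold computation and the termination test can be sketched as follows. This is a hedged illustration under stated assumptions: M in Eq. (26) is taken to be the number of macroblocks in the previous frame, the complex_motion threshold is taken as the prediction minus the standard deviation, and "according to the area occupied by the block" is interpreted as linear scaling by block area relative to 16×16. The function names are hypothetical.

```python
import math


def sad_prediction(neighbor_sads):
    """Eq. (24): mean SAD of the four neighboring blocks a, b, c, d."""
    return sum(neighbor_sads) / 4


def sad_standard_deviation(prev_frame_sads):
    """Eqs. (25)-(26): sample standard deviation of the macroblock SADs."""
    m = len(prev_frame_sads)
    mean = sum(prev_frame_sads) / m
    return math.sqrt(sum((s - mean) ** 2 for s in prev_frame_sads) / (m - 1))


def sad_threshold(activity, prediction, std_dev, block_w=16, block_h=16):
    """Eq. (23), scaled by block area (the linear scaling is an assumption)."""
    scale = (block_w * block_h) / 256.0
    if activity == "simple_motion":
        return prediction * scale
    return (prediction - std_dev) * scale


def should_terminate(best_sad, threshold):
    """Stop searching once the best SAD is equal to or smaller than the threshold."""
    return best_sad <= threshold
```

For example, with a prediction of 250 and a standard deviation of 50, a 16×16 block gets a threshold of 250 under simple_motion and 200 under complex_motion, and an 8×8 block under complex_motion gets 50.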

4. Experimental results and discussions

In this section, we present the experimental results of the proposed approaches. We modify the H.264/AVC reference software JM 9.4 and implement the proposed algorithms on it. In the experiments, we record the number of search points for each block to measure the performance of the proposed algorithms. To measure the coding efficiency, we compare the bitrates of the encoded sequences under the same quantization parameter with rate control disabled. In addition, we use the SAD value as a criterion to check whether the determined search range is large enough. Finally, we compare the total encoding time to measure the improvement in a practical setting.

4.1. Experimental environment

Nine test video sequences are used in the experiments. As shown in Table 12, these sequences include video data of various resolutions and different motion activities. The experimental environment and the coding configurations are listed in Table 11. These parameters are applied to all experiments, except in the cases noted later. Note that the maximum search range is set to 24.


4.2. Fast Full Pel Search

The proposed algorithms are compared with Fast Full Pel Search,² which is an improved version of the conventional Full Pel Search that reuses the SAD values of the smallest 4×4 blocks. Fast Full Pel Search computes the SAD values of all 4×4 blocks in advance whenever a new macroblock begins motion estimation. Then, it merges these SAD values to get the SAD values of larger blocks. In this way, the SAD computation for a macroblock with all block sizes enabled is about equal to the SAD computation with only a 16×16 block.

Note that the coding performance of the Fast Full Pel Search and the conventional Full Search is the same, but the Fast Full Pel Search is faster than the conventional Full Search in H.264/AVC. In the following, we denote the Fast Full Pel Search as FFS.

4.3. Simple Dynamic Search Range

The experimental results in Table 13 show that the proposed Simple Dynamic Search Range (SDSR) algorithm greatly outperforms the Fast Full Pel Search (FFS). Compared to FFS, SDSR reduces the number of search points by 80% on average. For test sequences with low and medium motion, SDSR can even reduce the number of search points by over 90%. Table 14 compares the coding bitrates of the test sequences. The bitrate changes only slightly for each test sequence.

In Table 15, the total encoding time is reduced by about 40–50%. The motion activity of the Stefan and Football sequences is higher than that of the others.

To show that SDSR adaptively determines a search range of reasonable size, we plot the search range (SR) determined in each frame, as shown in Figs. 6

2 The Fast Full Pel Search is implemented in the H.264/AVC Reference Software JM 9.4.

Table 13
Search points of FFS and SDSR

Sequence name       Search points (FFS)   Search points (SDSR)   Improvement (%)
Foreman QCIF        2401                  144                    94
Mobile QCIF         2401                  52                     98
Coastguard QCIF     2401                  88                     96
Foreman CIF         2401                  365                    85
Tempete CIF         2401                  261                    89
Flower CIF          2401                  563                    77
Stefan SIF          2401                  860                    64
Football CIF        2401                  1411                   41
Table tennis SIF    2401                  497                    79
Average                                                          80

Table 14
Bitrates of FFS and SDSR

Sequence name       Bitrate (Kbps), FFS   Bitrate (Kbps), SDSR   Improvement (%)
Foreman QCIF        69.203                68.858                 −0.5
Mobile QCIF         173.016               173.250                +0.1
Coastguard QCIF     76.134                76.022                 −0.1
Foreman CIF         188.773               188.490                −0.1
Tempete CIF         425.392               425.810                +0.1
Flower CIF          669.312               669.333                +0.003
Stefan SIF          505.450               505.693                +0.05
Football CIF        413.301               416.525                +0.8
Table tennis SIF    256.259               257.925                +0.65
Average                                                          +0.11

Table 15
Total encoding time of FFS and SDSR

Sequence name       Total encoding time (s), FFS   Total encoding time (s), SDSR   Improvement (%)
Average                                                                            43

Table 11
Testing conditions

Encoder configurations
Software: JM 9.4 [14]
ME search range: ±24 pixels
RDO: enabled
Number of reference frames: 1
Quantization parameter (QP): 36
GOP size: 15
Macroblock Adaptive Inter-Layer Prediction: enabled
Machine: Athlon XP 1700+ with 512 MB memory
Profile: baseline
Prediction structure: IPPP
Fast ME (UMHexagonS) [15]: disabled
Fast mode selection [16]: disabled

Table 12

Descriptions of test video sequences

ID Name Resolution No. of frames Motion activity

A Foreman QCIF 150 Medium

B Mobile QCIF 150 Slow

C Coastguard QCIF 150 Medium

D Foreman CIF 150 Medium

E Tempete CIF 150 Slow, zooming

F Flower CIF 90 Slow

G Stefan SIF 150 High

H Football CIF 90 Very high


and 7. Figs. 6 and 7 also show the SAD values between the original frame and each reconstructed frame using SDSR and the conventional method. The SAD values of SDSR and FFS are very close, which means that SDSR finds the true MVs in most motion estimations, except in frames with higher motion activity. On average, the number of search points is reduced by about 80%, the bitrate increases by about 0.11%, and the total encoding time is reduced by about 43%. Therefore, the proposed SDSR reduces the coding complexity while maintaining almost the same coding efficiency.
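The search-point figures can be checked with a short Python calculation. The sketch assumes that the FFS count of 2401 corresponds to a full square window of ±24 integer pixels, (2·24 + 1)² = 2401; the SDSR values are copied from Table 13, and the helper names are hypothetical.

```python
def full_search_points(search_range):
    """Number of candidates in a square window of +/- search_range integer pixels."""
    return (2 * search_range + 1) ** 2


def reduction_percent(baseline_points, points):
    """Percentage of search points saved relative to the baseline."""
    return 100.0 * (baseline_points - points) / baseline_points


# Average SDSR search points per block, copied from Table 13.
sdsr_points = {
    "Foreman QCIF": 144, "Mobile QCIF": 52, "Coastguard QCIF": 88,
    "Foreman CIF": 365, "Tempete CIF": 261, "Flower CIF": 563,
    "Stefan SIF": 860, "Football CIF": 1411, "Table tennis SIF": 497,
}

baseline = full_search_points(24)
reductions = {name: reduction_percent(baseline, sp) for name, sp in sdsr_points.items()}
average = sum(reductions.values()) / len(reductions)
```

The per-sequence values agree with the Improvement column of Table 13 after rounding, and their mean is about 80%.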

4.4. Successive Elimination Algorithm with Integral Frame

The proposed Successive Elimination Algorithm with Integral Frame (SEAIF) is designed to reduce the number of search points in the process of motion estimation. Tables 16 and 17 show the experimental comparison of SEAIF and Fast Full Pel Search (FFS). In Table 16, the number of search points using SEAIF is on average 95% lower than that using FFS, under the constraint that only the 16×16 block size is enabled for motion estimation. Consequently, the encoding time of SEAIF is about 29% less than that of FFS, as shown in Table 17.

4.5. Early Termination Algorithm

When the desired search point is reached, the proposed Early Termination Algorithm (ETA) terminates the motion estimation early to reduce the number of SP. As shown in Tables 18 and 19, about 44.5% of the SP are saved and the bitrate is nearly the same as that produced by FFS. However, Table 20 shows that the encoding time is not reduced as expected. In the process of motion

Fig. 6. SAD and SR of SDSR frame-by-frame in Foreman QCIF.

Fig. 7. SAD and SR of SDSR frame-by-frame in Football CIF.

Table 16
Search points of FFS and SEAIF (16×16 block size only)

Sequence name       Search points (FFS)   Search points (SEAIF)   Improvement (%)
Foreman QCIF        2401                  61                      97
Mobile QCIF         2401                  71                      97
Tempete CIF         2401                  114                     95
Stefan SIF          2401                  193                     92
Average                                                           95

Table 17
Total encoding time of FFS and SEAIF (16×16 block size only)

Sequence name       Total encoding time (s), FFS   Total encoding time (s), SEAIF   Improvement (%)
Foreman QCIF        112                            77                               31
Mobile QCIF         117                            84                               28
Tempete CIF         458                            332                              28
Stefan SIF          369                            289                              27
Average                                                                             29

Search points of FFS and ETA

Sequence name Number of search points Improvement (%) Fast Full Pel Search ETA

Foreman QCIF 2401 1484 38

Mobile QCIF 2401 1197 50

Tempete CIF 2401 1306 46

Stefan SIF 2401 1350 44


estimation, each search point is evaluated with a matching criterion, usually SAD. Although our ETA can skip a large number of search points, it cannot save the SAD computations, because FFS in JM 9.4 calculates all SAD values in advance. Thus, the encoding time cannot be reduced in this experiment.

4.6. Content-Aware Fast Motion Estimation Algorithm (CAFME)

The Content-Aware Fast Motion Estimation Algorithm (CAFME) is formed by integrating the Simple Dynamic Search Range (SDSR), Successive Elimination Algorithm with Integral Frame (SEAIF), and Early Termination Algorithm (ETA). Here we evaluate the performance of the proposed CAFME and give some discussions.

4.6.1. Performance compared to Fast Full Pel Search (FFS)

Compared to Fast Full Pel Search (FFS), as shown in Table 21, the number of search points used by CAFME is reduced by more than 90% for most of the test sequences. For the sequences with slow and medium motion, the reduction can even reach about 99%. The reduction is much lower (73.8%) for the Football sequence because of its very high motion. On average, the bitrate of the video stream coded by CAFME increases by about 0.26% (Table 22), which is a slight increase. Furthermore, the total encoding time is reduced by about 41.9%, as shown in Table 23.

4.6.2. Performance compared to UMHexagonS

To compare the performance of the proposed CAFME scheme and UMHexagonS, experiments are conducted to evaluate the motion estimation time of both schemes under different search ranges (SR = 24, 48, 96, and 128). Both schemes are implemented on JM 9.4, and three test sequences, Foreman, Tempete, and Flower, are used in the experiments.

Tables 24–26 list the total motion estimation time of the proposed Content-Aware Fast Motion Estimation Algorithm (CAFME) and the UMHexagonS method. As shown in these tables, although the motion estimation time of the proposed CAFME scheme is higher than that of UMHexagonS when the search range is small, CAFME saves more computation time when the search range becomes

Table 21
Search points of FFS and CAFME

Sequence name       Search points (FFS)   Search points (CAFME)   Improvement (%)
Foreman QCIF        2401                  37                      98.5
Mobile QCIF         2401                  12                      99.5
Coastguard QCIF     2401                  29                      98.8
Foreman CIF         2401                  100                     95.8
Tempete CIF         2401                  69                      97.1
Flower CIF          2401                  199                     91.7
Stefan SIF          2401                  184                     92.3
Football CIF        2401                  628                     73.8
Table tennis SIF    2401                  224                     90.7
Average                                                           93.1

Table 22
Bitrates of FFS and CAFME

Sequence name       Bitrate (Kbps), FFS   Bitrate (Kbps), CAFME   Improvement (%)
Foreman QCIF        69.203                69.118                  −0.12
Mobile QCIF         173.016               173.285                 +0.16
Coastguard QCIF     76.134                75.862                  −0.36
Foreman CIF         188.773               188.784                 +0.005
Tempete CIF         425.392               425.955                 +0.13
Flower CIF          669.312               670.211                 +0.13
Stefan SIF          505.450               504.782                 −0.13
Football CIF        413.301               419.357                 +1.5
Table tennis SIF    256.259               258.939                 +1.04
Average                                                           +0.26

Table 23
Total encoding time of FFS and CAFME

Sequence name       Total encoding time (s), FFS   Total encoding time (s), CAFME   Improvement (%)
Average                                                                             41.9

Table 19
Bitrates of FFS and ETA

Sequence name       Bitrate (Kbps), FFS   Bitrate (Kbps), ETA   Improvement (%)
Foreman QCIF        69.203                69.365                +0.2
Mobile QCIF         173.016               173.366               +0.2
Tempete CIF         425.392               424.898               −0.1
Stefan SIF          505.450               505.987               +0.1
Average                                                         +0.1

Table 20
Total encoding time of FFS and ETA

Sequence name       Total encoding time (s), FFS   Total encoding time (s), ETA   Improvement (%)
Foreman QCIF        156                            140                            −10.3
Mobile QCIF         151                            152                            +0.7
Tempete CIF         583                            594                            +1.9
Stefan SIF          508                            498                            −2.0


larger. The proposed CAFME is therefore expected to be more efficient in scenarios with a large search range, such as the high profiles. The differences in PSNR and bitrate between the two schemes are negligible for all test sequences. For example, Tables 27 and 28 show that the differences in PSNR and bitrate are on the order of 0.1% for the Foreman sequence.
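The crossover between the two schemes can be recomputed from the reported times. The Python sketch below uses the Foreman CIF motion estimation times from Table 24 and assumes a signed relative change in which a negative value means that CAFME is faster than UMHexagonS; the function name is hypothetical, and the rounding used in the table may differ slightly.

```python
def time_change_percent(t_umhexagons, t_cafme):
    """Signed change of CAFME time relative to UMHexagonS (negative: CAFME faster)."""
    return 100.0 * (t_cafme - t_umhexagons) / t_umhexagons


# Motion estimation time in seconds for Foreman CIF, copied from Table 24:
# search range -> (UMHexagonS, CAFME)
foreman_cif = {24: (47, 70), 48: (57, 70), 96: (78, 76), 128: (92, 83)}

changes = {sr: time_change_percent(*times) for sr, times in foreman_cif.items()}
```

For this sequence, CAFME is slower at search ranges 24 and 48 and faster at 96 and 128, which matches the trend described above.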

The UMHexagonS scheme searches for the best-matched block following a predefined search pattern to speed up the search, while the proposed CAFME is a hybrid scheme consisting of three approaches to speed up motion estimation. Here we combine UMHexagonS with the proposed scheme and compare the combination with UMHexagonS alone, to see whether the gain of CAFME can be added on top of UMHexagonS. Compared to UMHexagonS, as shown in Tables 29–31, the total motion estimation time of the hybrid approach (the combination of UMHexagonS and CAFME) can be further

Table 27
PSNR of UMHexagonS and CAFME over different search ranges for sequence Foreman_CIF

Search range   PSNR (dB), UMHexagonS   PSNR (dB), CAFME   Improvement (%)
24             32.58                   32.61              +0.1
48             32.58                   32.61              +0.1
96             32.58                   32.61              +0.1
128            32.58                   32.61              +0.1

Table 28
Bitrate of UMHexagonS and CAFME over different search ranges for sequence Foreman_CIF

Search range   Bitrate (bps), UMHexagonS   Bitrate (bps), CAFME   Improvement (%)
24             194.45                      194.75                 +0.15
48             194.53                      194.68                 +0.07
96             194.56                      194.71                 +0.07
128            194.71                      194.71                 +0

Table 29
Motion estimation time of UMHexagonS and hybrid approach over different search ranges for sequence Foreman_CIF

Search range   ME time (s), UMHexagonS   ME time (s), UMHexagonS + CAFME   Improvement (%)
24             47                        40                                15
48             57                        41                                28
96             78                        45                                42
128            92                        48                                48
Average                                                                    33

Table 30
Motion estimation time of UMHexagonS and hybrid approach over different search ranges for sequence Tempete CIF

Search range   ME time (s), UMHexagonS   ME time (s), UMHexagonS + CAFME   Improvement (%)
24             51                        40                                21
48             69                        41                                40
96             98                        46                                53
128            119                       49                                58
Average                                                                    43

Table 31
Motion estimation time of UMHexagonS and hybrid approach over different search ranges for sequence Flower CIF

Search range   ME time (s), UMHexagonS   ME time (s), UMHexagonS + CAFME   Improvement (%)
24             43                        35                                18
48             52                        35                                33
96             76                        39                                49
128            90                        43                                52
Average                                                                    38

Table 24
Motion estimation time of UMHexagonS and CAFME over different search ranges for sequence Foreman_CIF

Search range   ME time (s), UMHexagonS   ME time (s), CAFME   Improvement (%)
24             47                        70                   +48
48             57                        70                   +23
96             78                        76                   −3
128            92                        83                   −10
Average                                                       +14.5

Table 25
Motion estimation time of UMHexagonS and CAFME over different search ranges for sequence Tempete CIF

Search range   ME time (s), UMHexagonS   ME time (s), CAFME   Improvement (%)
24             51                        52                   +2
48             69                        58                   −15
96             98                        66                   −32
128            119                       70                   −41
Average                                                       −24

Table 26
Motion estimation time of UMHexagonS and CAFME over different search ranges for sequence Flower CIF

Search range   ME time (s), UMHexagonS   ME time (s), CAFME   Improvement (%)
24             43                        47                   +8
48             52                        51                   −1
96             76                        58                   −23
128            90                        65                   −27
Average                                                       −11


reduced, which means that the gain of CAFME can be added on top of UMHexagonS for all test sequences.

4.7. Summary

The proposed Simple Dynamic Search Range (SDSR) reduces the number of search points by about 80% while sustaining the coding efficiency (the bitrate increases by 0.11% on average). We also integrate the Successive Elimination Algorithm with Integral Frame (SEAIF) and the Early Termination Algorithm (ETA) with SDSR to form the Content-Aware Fast Motion Estimation Algorithm (CAFME). CAFME improves on SDSR: the number of search points is reduced by 93.1% while the bitrate increases only slightly (0.26%). The overall encoding time is reduced by about 41.9% in our implementation.

5. Conclusions and future works

Motion estimation plays an important role in video coding standards, and it is usually the most computationally intensive part of a typical video encoder. Hence, an efficient motion estimation algorithm is essential. We proposed a fast algorithm called the Content-Aware Fast Motion Estimation Algorithm (CAFME). CAFME consists of the Simple Dynamic Search Range (SDSR), the Successive Elimination Algorithm with Integral Frame (SEAIF), and the Early Termination Algorithm (ETA). The SDSR adjusts the search range for every block adaptively, does not need any predefined thresholds, and performs well for all the test sequences. The SEAIF reuses computations when calculating the SAD of overlapped blocks of variable size, and thus reduces the number of SAD computations without loss of coding efficiency. The ETA terminates the search early when a good candidate block is found, and its performance is stable for test sequences of different motion activity.

The experimental results show that CAFME reduces the number of search points by about 93.1%, while the bitrate increases by only 0.26% and the PSNR stays the same. We modified the H.264/AVC reference software JM 9.4 and implemented our proposed algorithms on it. The total encoding time is reduced by about 41.9%.

The motion search algorithm currently used in CAFME is Fast Full Pel Search (FFS). However, it may be replaced by any fast motion estimation algorithm such as TSS or DS. Future work will focus on developing a fast motion estimation algorithm suitable for dynamic search range, alleviating the implementation overhead, and so on.

References

[1] T. Koga, K. Iinuma, A. Hirano, Y. Iijima, T. Ishiguro, Motion Compensated Interframe Coding for Video Conferencing, in: Proc. Nat. Telecommun. Conf., New Orleans, LA, November 29–December 3 1981, pp. G5.3.1–5.3.5.

[2] S. Zhu, K.-K. Ma, A new diamond search algorithm for fast block-matching motion estimation, IEEE Trans. Image Process. 9 (2) (2000) 287–290.

[3] W. Li, E. Salari, Successive elimination algorithm for motion estimation, IEEE Trans. Image Process. 4 (1) (1995) 105–107.

[4] X.Q. Gao, C.J. Duanmu, C.R. Zou, A multilevel successive elimination algorithm for block matching motion estimation, IEEE Trans. Image Process. 9 (3) (2000) 501–504.

[5] L.-W. Lee, J.-F. Wang, J.-Y. Lee, J.-D. Shie, Dynamic search-window adjustment and interlaced search for block-matching algorithm, IEEE Trans. Circuits Systems Video Technol. 3 (1) (1993) 85–87.

[6] J. Feng, K.-T. Lo, H. Mehrpour, A.E. Karbowiak, Adaptive block matching motion estimation algorithm for video coding, IEE Electron. Lett. 31 (18) (1995) 1542–1543.

[7] J. Minocha, N.-R. Shanbhag, A low power data-adaptive motion estimation algorithm, in: IEEE Third Workshop on Multimedia Signal Processing, September 13–15 1999, pp. 685–690.

[8] S. Saponara, L. Fanucci, Data-adaptive motion estimation algorithm and VLSI architecture design for low-power video systems, IEE Proc. Comput. Digital Techniques 151 (1) (2004) 51–59.

[9] P.-I. Hosur, Motion adaptive search for fast motion estimation, IEEE Trans. Consumer Electron. 49 (4) (2003) 1330–1340.

[10] S.-S. Lin, P.-C. Tseng, C.-P. Lin, L.-G. Chen, Multi-mode content-aware motion estimation algorithm for power-aware video coding systems, in: IEEE Workshop on Signal Processing Systems, 13–15 October 2004, pp. 239–244.

[11] V.-A. Nguyen, Y.-P. Tan, Fast block-based motion estimation using integral frames, IEEE Signal Process. Lett. 11 (9) (2004) 744–747.

[12] K.-P. Lim, G. Sullivan, T. Wiegand, Text Description of Joint Model Reference Encoding Methods and Decoding Concealment Methods, ITU-T, Doc. #JVT-N046, January 2005.

[13] P. Viola, M.-J. Jones, Robust Real-Time Object Detection, Cambridge Res. Lab., Tech. Rep. CRL 2001/01, February 2001.

[14] H.264/AVC reference software, <http://ftp3.itu.ch/av-arch/jvt-site/reference_software/> and <http://iphome.hhi.de/suehring/tml/>.

[15] Z. Chen, P. Zhou, Y. He, Y. Chen, Fast Integer Pel and Fractional Pel Motion Estimation for JVT, ITU-T, Doc. #JVT-F017, December 2002.

[16] B. Jeon, J. Lee, Fast Mode Decision for H.264, ITU-T, Doc. #JVT-J033, December 2003.