2. Motion Estimation

2.2.7. Global Motion Estimation

The objective of global motion vector estimation is to determine a single motion vector from the data produced by the motion estimation process. Practical video sequences contain moving objects, repeated patterns, and similar disturbances. The LMV in each region may therefore represent the global motion vector, the motion vector of a moving object, or an error vector. An error vector may be caused by an ill condition, a repeated pattern, or a mixture of global and moving-object motion. A reliable global motion vector is essentially selected from the LMVs and the RMV. However, in the worst case, i.e. if the LMVs and the RMV are all faulty, compensation produces a worse result than the original images. Including the zero motion vector (ZMV) in the evaluation prevents this case. Similarly, for an image sequence with constant motion in the scene, compensating with the ZMV or an error motion vector gives a worse result than compensating with the average motion vector (AMV). In the proposed DIS technique, seven motion vectors (the four LMVs, the RMV, the ZMV, and the AMV), referred to as pre-selected motion vectors ($pre\_MV$), are employed to estimate the GMV of the current frame. In general, one of the LMVs is the most probable GMV for a regular image. The RMV is the most probable GMV for an ill-conditioned image. The ZMV prevents a worse compensation result caused by unreliable MVs, and the AMV is useful for the constant motion of the car. In addition, if the image sequence contains a large moving object, determining the global motion is troublesome because the determined motion vector may switch between the background and the large moving object, or may be totally dominated by the large moving object. This leads to artificial shaking and poses an important challenge in DIS.

2.2.7.1. Background Based Peer to Peer Evaluation

In this dissertation, a background-based evaluation function is proposed to overcome the large moving object problem. Fig. 2.10 shows the areas for the background-based evaluation.

Five regions, located in the surrounding areas of the image, are selected to evaluate the result. The reason is that, in most cases, the foreground object is located in the center of the image, so the surrounding areas are the best candidates for background detection. The GMV is estimated using the sum of absolute differences (SAD),

$$SAD_{B_i,c} = \sum_{(X,Y) \in B_i} \left| I(t-1, X, Y) - I(t, X + X_c, Y + Y_c) \right|, \tag{2.22}$$

where $I(t-1, X, Y)$ is the intensity of the point $(X, Y)$ at frame $t-1$, $B_i$ is the $i$-th background region in the image, and $X_c$, $Y_c$ are the components of the seven pre-selected motion vectors ($pre\_MV_c$) in the x and y directions.
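The SAD of one candidate vector over one background region can be sketched as follows. This is a minimal illustration, not the author's implementation: the function name `region_sad`, the `frame[y][x]` list layout, the `(x0, y0, width, height)` region format, and the rule of skipping pixels whose shifted position leaves the frame are all assumptions made for the example.

```python
def region_sad(prev, curr, region, mv):
    """Sum of absolute differences over one background region B_i.

    prev, curr: frames t-1 and t as 2-D lists indexed frame[y][x].
    region:     (x0, y0, width, height) of the block (assumed format).
    mv:         (xc, yc), one pre-selected motion vector.
    Pixels whose shifted position falls outside the frame are skipped;
    the text does not say how borders are handled, so this is an assumption.
    """
    x0, y0, w, h = region
    xc, yc = mv
    rows, cols = len(curr), len(curr[0])
    total = 0
    for y in range(y0, y0 + h):
        for x in range(x0, x0 + w):
            xs, ys = x + xc, y + yc
            if 0 <= xs < cols and 0 <= ys < rows:
                # |I(t-1, X, Y) - I(t, X + Xc, Y + Yc)|
                total += abs(prev[y][x] - curr[ys][xs])
    return total
```

A candidate that matches the true shift of the background gives a small SAD in that region, while the zero vector on a shifted scene gives a larger one.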

Each $pre\_MV_c$ obtains its $SAD_{B_i,c}$ in each region. A smaller $SAD_{B_i,c}$ indicates a higher probability that the vector is the desired motion vector among the pre-selected motion vectors. The score of each $pre\_MV_c$ in region $i$ is denoted as $S_{i,c}$, which is the rank of its $SAD_{B_i,c}$ value within that region; a higher $SAD_{B_i,c}$ yields a higher score. The total score for each $pre\_MV_c$ is obtained by

$$S_c = \sum_{i=1}^{5} S_{i,c}. \tag{2.23}$$

Five-region peer-to-peer evaluation prevents a few high-contrast image regions from dominating the evaluation result. In this algorithm, each region has equal priority in determining the result. In (2.23), $S_c$ is the index used to determine the GMV. The $pre\_MV_c$ with the smallest $S_c$ is the desired GMV, which can be expressed as

$$GMV = pre\_MV_i, \quad \text{for } i = \arg\min_c S_c. \tag{2.24}$$

With these evaluation areas, the evaluation function can identify the background motion vector accurately in most circumstances.
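The rank-and-sum selection described above can be sketched as follows. The function name `select_gmv` and the `sad_table[i][c]` layout are hypothetical, and the tie handling (equal SADs share a rank, ties in the total go to the lowest index) is an assumption, since the text does not specify it.

```python
def select_gmv(sad_table):
    """Pick the index of the GMV candidate from per-region SAD values.

    sad_table[i][c] is the SAD of candidate c in background region i.
    In each region a candidate's score is the rank of its SAD
    (1 = smallest). Scores are summed over regions and the candidate
    with the smallest total is returned together with all totals.
    """
    n_cand = len(sad_table[0])
    totals = [0] * n_cand
    for sads in sad_table:
        for c, value in enumerate(sads):
            # Rank = 1 + number of candidates with a strictly smaller SAD.
            totals[c] += 1 + sum(1 for other in sads if other < value)
    best = min(range(n_cand), key=lambda c: totals[c])
    return best, totals


# Region 0 is high contrast and strongly favors candidate 1,
# but the other four regions all favor candidate 0.
sad_table = [
    [900, 100, 500],
    [10, 30, 20],
    [12, 40, 25],
    [8, 35, 20],
    [15, 50, 30],
]
best, totals = select_gmv(sad_table)
```

In this made-up table, summing the raw SAD values would favor candidate 1 because of the single high-contrast region, whereas summing ranks selects candidate 0, which is the behavior the peer-to-peer evaluation is meant to provide.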

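[Figure: a 640x480-pel frame with five background evaluation regions, labeled 1 to 5, placed in the surrounding areas of the image; the block sizes shown are 50x150, 50x150, 150x50, 100x100, and 150x50 pels.]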

Fig. 2.10. Areas for background detection and evaluation.

2.2.7.2. Skyline Detection

To improve the robustness of global motion vector estimation, an adaptive background-based evaluation function is proposed. First, skyline detection is performed. Then five regions, positioned according to the estimated skyline, are selected to evaluate the result. In most outdoor applications, such as in-car video capture, the area above the skyline has low contrast. Skyline detection prevents an invalid result caused by some of the five regions falling in the low-contrast area. Selecting regions around the boundary of the image to evaluate the obtained motion vector avoids the disturbance of moving objects on global motion vector estimation. Fig. 2.11 shows the areas adopted for the adaptive background-based evaluation according to the detected skyline. The proposed skyline detection algorithm combines RPM correlation evaluation, minimum projection, and the inverse triangle method. First, we calculate the absolute differences between the representative point at frame $(t-1)$ and the corresponding neighborhood pixels in the same sub-region at frame $(t)$ by Eq. (2.25), which is an intermediate step of Eq. (2.6),

$$C_{i,j}(p,q) = \left| I(t-1, X_i, Y_j) - I(t, X_{i+p}, Y_{j+q}) \right|, \tag{2.25}$$

where $(i, j)$ denotes the position of a sub-region by row and column as shown in Fig. 2.12. There are 120 sub-regions (10 rows x 12 columns in this paper). $(X_i, Y_j)$ is the coordinate of the representative point in the $(i, j)$ sub-region, $I(t-1, X_i, Y_j)$ is the intensity of the representative point $(X_i, Y_j)$ at frame $(t-1)$, and $(p, q)$ is a shifting vector within the sub-region. We then derive the correlation curve for detecting the skyline by evaluating the $l$-th row of the sub-regions. Initially, $l = 1$, and the minimum projection and inverse triangle method presented in Eqs. (2.9) to (2.13) are applied to $C_l(p, q)$ to get the confidence index in the horizontal direction. The cost level is relatively high when the corresponding area is a low-contrast area such as the sky. If the level is lower than the preset threshold, we stop the evaluation process, and the horizontal position of the representative points of the sub-regions located in the last row of $C_l(p, q)$ is defined as the coarse skyline. Otherwise, we set $l = l + 1$ and continue the evaluation of $C_l(p, q)$ until the level is lower than the preset threshold. Fig. 2.13 shows the results of skyline detection in a video sequence taken from the highway. The coarse skyline is used to adaptively lay out the background-based evaluation blocks in the higher-contrast area, which improves the robustness of global motion vector estimation in image stabilization applications.
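The two mechanical pieces of this procedure, the per-sub-region absolute difference of Eq. (2.25) and the top-down row scan, can be sketched as follows. This is an illustration under stated assumptions: the names `subregion_correlation`, `coarse_skyline_row`, and `max_shift` are hypothetical, frames are 2-D lists indexed `frame[y][x]`, and the per-row level is taken as an input because the minimum projection and inverse triangle steps of Eqs. (2.9) to (2.13) are not reproduced in this section.

```python
def subregion_correlation(prev, curr, rep_point, max_shift):
    """Absolute differences of Eq. (2.25) for one sub-region.

    Compares the representative point in frame t-1 with its shifted
    neighbors in frame t and returns {(p, q): difference}.
    max_shift (assumed parameter) bounds |p| and |q|; shifts that
    would leave the frame are omitted.
    """
    x, y = rep_point
    rows, cols = len(curr), len(curr[0])
    ref = prev[y][x]
    corr = {}
    for q in range(-max_shift, max_shift + 1):
        for p in range(-max_shift, max_shift + 1):
            if 0 <= x + p < cols and 0 <= y + q < rows:
                corr[(p, q)] = abs(ref - curr[y + q][x + p])
    return corr


def coarse_skyline_row(row_levels, threshold):
    """Scan sub-region rows from the top (l = 1).

    row_levels[l - 1] is the level computed for row l by a method not
    shown here. Returns the first row whose level is below the
    threshold, or None if no row qualifies.
    """
    for l, level in enumerate(row_levels, start=1):
        if level < threshold:
            return l
    return None
```

For example, with made-up levels `[9.0, 8.5, 3.0, 2.0]` and a threshold of `5.0`, the scan stops at row 3, so the evaluation blocks would be laid out from that row downward.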

[Figure: a 640x480-pel frame with the detected skyline marked and five background evaluation regions, labeled 1 to 5; the block sizes shown are 50x100, 50x100, 100x50, 50x50, and 100x50 pels.]

Fig. 2.11. Areas for the background-based evaluation adapted by the detected skyline.

Fig. 2.12. The skyline detection algorithm combines RPM correlation evaluation, minimum projection, and the inverse triangle method.

Fig. 2.13. Skyline detection applied to the in-car video sequence taken from the highway, panels (a) and (b).

In document: Digital Image Stabilization Technique and Its Applications (pages 34-39)