Full-Frame Video Stabilization with a Polyline-Fitted Camcorder Path


¹Jong-Shan Lin (林蓉珊), ¹Wei-Ting Huang (黃惟婷), ²Bing-Yu Chen (陳炳宇), ³Ming Ouhyoung (歐陽明)

National Taiwan University

E-mail: ¹{maruko,weiting}@cmlab.csie.ntu.edu.tw, ²

Annoying shaky motion is one of the significant problems in home videos, since hand shake is unavoidable when capturing with a hand-held camcorder. Video stabilization is an important technique for solving this problem. However, the stabilized videos produced by current methods usually have decreased resolution and are still not very stable. In this paper, we propose a novel, robust, and practical method of video stabilization with a polyline-fitted camcorder path. Our method produces full-frame stabilized videos in which not only the high-frequency shaky motions but also the low-frequency unexpected movements are removed. To achieve this, we use a polyline to estimate a new stable camcorder motion path, and then we fill the dynamic and static missing areas caused by frame alignment from other frames to keep the same resolution and quality as the original video. Furthermore, we smooth the discontinuous regions by using a three-dimensional Poisson-based method. After these automatic operations, a full-frame stabilized video is obtained.

1. INTRODUCTION

As the use of digital camcorders grows, capturing videos with hand-held camcorders has become more convenient than ever. However, since most people do not carry a tripod with their camcorders, unwanted vibration in the video sequence is unavoidable due to hand shake. Avoiding or removing this annoying shaky motion is one of the significant problems in home videos, and video stabilization is an important technique for solving it. Many existing video stabilization applications produce a stabilized video by smoothing the camcorder motion path and then truncating the missing areas after aligning the video frames along the smoothed motion path. Hence, the stabilized videos still have many unexpected movements, since only high-frequency shaky motions are removed during the smoothing stage. Moreover, the quality of the stabilized videos is usually decreased by the truncating stage.

In this paper, we propose a novel, robust, and practical method of full-frame video stabilization with a polyline-fitted camcorder path. To achieve this, we use a polyline to estimate a new stable camcorder motion path. Hence, the resulting videos are much more stable and much closer to the videos that the users want to capture. Once we obtain the new stable camcorder motion path of the video, the video frames are aligned along the new motion path to form a stabilized video.

Due to the frame alignment, there are several missing areas in the new stabilized video. Unlike trimming approaches, we fill the dynamic and static missing areas by using motion inpainting and warped neighboring video frames, respectively. This completion method works well even if there are moving objects at the boundary of the video frames, while keeping the same resolution and quality as the original video. However, since we use a polyline to fit the camcorder motion path rather than a parametric curve to smooth it, the missing areas are usually large and cannot be easily completed from neighboring frames. Filling the missing areas with pixels from frames far from the current frame may cause discontinuities at the boundaries of the filled areas, since the intensity of the video frames is not necessarily the same. Hence, we smooth the discontinuous boundaries by using a three-dimensional Poisson-based method, which takes both spatial and temporal consistency into consideration and produces stitching that is seamless both spatially and temporally.

After stabilizing the videos, the blurry video frames, which are also caused by hand shake, become much more noticeable. Hence, we detect the blurry frames and transfer pixels from neighboring sharper frames. Therefore, our method produces full-frame stabilized videos in which not only the high-frequency shaky motions but also the low-frequency unexpected movements are removed. The stabilized videos are stable, comfortable to watch, and much closer to the videos that the users would capture if they had a tripod with them.

2. RELATED WORK

Video stabilization is an important research topic in multimedia, image processing, computer vision, and computer graphics. Buehler et al. proposed an image-based rendering (IBR) method to stabilize videos [4]. Recently, image processing methods have been widely used for video stabilization. For estimating the camcorder motion path, Litvin et al. estimated a new camcorder motion path by altering the camera parameters [9], and Matsushita et al. smoothed the camcorder motion path to reduce the high-frequency shaky motions [11]. However, although the high-frequency shaky motions can be easily reduced, the stabilized videos still have low-frequency unexpected movements.

For filling the missing image areas, several image inpainting approaches have been developed for recovering holes in an image [1, 5, 8]. Although these approaches can complete the missing regions with correct structure, there will be obvious discontinuities if each video frame is recovered independently. Litvin et al. used a mosaicing method to fill the missing areas in the stabilized video [9]; however, they did not consider that moving objects may appear at the boundary of the video. Wexler et al. and Shiratori et al. filled the missing holes by sampling spatio-temporal volume patches from other portions of the video volume [15, 16]. The former approach used the most similar patch in color space for completing the missing holes, and the latter used the patch with the most similar motion vectors. The drawback of these methods is that they need a large amount of computing time to search for a proper patch. Jia et al. and Patwardhan et al. segmented the video into two layers and recovered them individually [7, 13]. These methods rely on moving objects being observed over a long time and with periodic motion, which is not guaranteed in common home videos.

3. OVERVIEW

Fig. 1: System framework. Input (original video sequence) → Motion Path Estimation (Global Path Estimation, Polyline Fitting) → Video Completion (Moving Object Detection, Dynamic Region Completion, Static Region Completion, Poisson-based Smoothing) → Video Deblurring → Output (stabilized video sequence).

Fig. 1 shows the system framework of our algorithm. The input of our system is a video sequence captured by a hand-held camcorder without using a tripod. Hence, the video sequence contains annoying shaky motions due to hand shake. The first process of our system is called motion path estimation (Sec. 4). In this process, the camcorder motion path of the original video is estimated and changed into a stabilized one. There are two steps in this process. In the first step (Sec. 4.1), we find the transformations between consecutive frames and combine all of them to obtain the global camcorder motion path of the original video. In the second step (Sec. 4.2), the estimated global camcorder motion path is approximated by a polyline, since the camcorder motion path of a video captured with a tripod resembles a polyline.

After the stabilized camcorder motion path is obtained, the video completion process is applied (Sec. 5). Because the position of each frame is changed by the frame alignment along the new motion path, there are some missing areas within each aligned frame. The first step in the video completion is to detect whether there are moving objects and where they are (Sec. 5.1). In the second step, we separate the moving objects, as dynamic foreground regions, from the static background regions and complete their missing areas by different methods (Sec. 5.2 and 5.3). Filling the missing areas with pixels from frames far from the current frame may cause discontinuities at the boundaries of the filled areas, since the intensity of the video frames is not necessarily the same. In order to obtain a seamless stitching, we apply a three-dimensional Poisson-based smoothing method to the discontinuous regions (Sec. 5.4).

Fig. 2: Top row: Three frames of the original video. There are annoying shaky motions in the frames. Middle row: Stabilized frames. Black areas show the missing areas of the frames. Bottom row: Completed frames. This is the result of our method; the shaky motions of the video are removed.

The last process is video deblurring (Sec. 6). Because the motion blur of each frame may not match the stabilized camcorder motion path, the blurry frames become much more noticeable in the new stabilized video. Instead of finding an accurate point spread function for image deblurring, we choose a video deblurring method that transfers pixel values from neighboring sharper frames to the blurry frames. After the above automatic processes, the output is a stabilized video with a stable camcorder motion path and the same resolution and quality as the original one. Fig. 2 shows three frames of an input video and their stabilized results before and after the video completion and deblurring processes.

4. MOTION PATH ESTIMATION

In order to generate stabilized videos, we first estimate the camcorder motion path of the original video (Sec. 4.1). Then, the original camcorder motion path is stabilized by using a polyline-based motion path (Sec. 4.2), so that the undesirable motion caused by hand shake can be removed.

4.1. Global Path Estimation

To estimate the global camcorder motion path, we first extract the feature points of each frame by SIFT (Scale Invariant Feature Transform) [10], which is invariant to scaling and rotation of the image. The feature points on every pair of consecutive frames are matched if the distances between their feature descriptors are small enough, and RANSAC (RANdom SAmple Consensus) [6] is used to select the inliers of the matched feature pairs. For accuracy, an over-constrained system is solved in the least-squares sense over these matched feature pairs to derive the affine transformation between the two consecutive frames. The affine transformation is represented by a 3×3 affine model which contains six parameters. That is, if we find the transformation matrix $T_i^{i+1}$ between frames $i$ and $i+1$, the corresponding pixel $p_i$ on frame $i$ and pixel $p_{i+1}$ on frame $i+1$ have the following relationship: $p_{i+1} = T_i^{i+1} \cdot p_i$. Once the transformation matrices between the consecutive frames are obtained, all of the transformations can be combined to derive a global transformation chain.

4.2. Motion Path Fitting

To obtain a stabilized camcorder motion path that is free of not only the high-frequency shaky motions but also the low-frequency unexpected movements, we use a polyline to fit the estimated global camcorder motion path. We first separate the camcorder motion path estimated in Sec. 4.1 into horizontal and vertical components, $M_i = [x, y]$, and process them separately. Then, a Kalman filter is employed to estimate a smooth camcorder motion path $K_i$ of $M_i$ [12], which is represented in two-dimensional space as

$$M_i = H K_i + v, \qquad H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$

and

$$K_{i+1} = \Phi K_i + w, \qquad \Phi = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix},$$

where $\Phi$ is the state transition model which is applied to the previous state $K_i$ to approximate the next state $K_{i+1}$, $H$ is the observation model which maps the state $K_i$ into the observed space of $M_i$, $v \sim N(0, R)$ is the observation noise, which is assumed to be zero-mean Gaussian white noise with covariance $R$, and $w \sim N(0, Q)$ is the process noise, which is assumed to be drawn from a zero-mean multivariate normal distribution with covariance $Q$. The camcorder motion path smoothed by the Kalman filter is shown as the green curves in Fig. 3.

[Fig. 3 plots: horizontal camera motion (upper) and vertical camera motion (lower), in pixels, against the frame number (0 to 500). Curves: original camera path, path filtered by Kalman filter, polyline.]

Fig. 3: The original camcorder motion path (red curve) and the estimated camcorder motion path after applying Kalman filter (green curve) and fitting by a polyline (blue straight line) for horizontal (Upper) and vertical (Lower) motion paths.
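The Kalman filtering step can be illustrated with a minimal sketch, assuming a four-dimensional state (position plus per-frame velocity) that matches the matrices $\Phi$ and $H$ above. The function name `kalman_filter_path`, the noise levels `q` and `r`, and the initial covariance are illustrative assumptions and are not values given in the paper.

```python
import numpy as np

def kalman_filter_path(path, q=1e-3, r=1.0):
    """Filter an (n, 2) array of camera positions M_i with a constant-velocity model.

    q and r scale the process and observation covariances Q and R; both are
    assumed values chosen only for illustration.
    """
    path = np.asarray(path, dtype=float)
    Phi = np.array([[1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0],
                    [0, 0, 0, 1]], dtype=float)   # state transition model
    H = np.array([[1, 0, 0, 0],
                  [0, 1, 0, 0]], dtype=float)     # observation model
    Q = q * np.eye(4)
    R = r * np.eye(2)
    k = np.array([path[0, 0], path[0, 1], 0.0, 0.0])  # initial state, zero velocity
    P = np.eye(4)                                      # assumed initial covariance
    out = np.empty_like(path)
    out[0] = path[0]
    for i in range(1, len(path)):
        # Prediction: K_{i+1} = Phi K_i
        k = Phi @ k
        P = Phi @ P @ Phi.T + Q
        # Correction with the observed position M_i
        S = H @ P @ H.T + R
        G = P @ H.T @ np.linalg.inv(S)
        k = k + G @ (path[i] - H @ k)
        P = (np.eye(4) - G @ H) @ P
        out[i] = H @ k
    return out
```

A smaller ratio of `q` to `r` makes the filtered path trust the constant-velocity model more and therefore follow the observed jitter less.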

After applying the Kalman filter, we obtain the smoothed camcorder motion path (Kalman path). Then, we use a polyline to fit the Kalman path. We first assume there is only one straight line, connecting the camcorder position of the first frame and that of the last frame on the Kalman path. Then, we calculate, for each frame, the difference between the camcorder position on the original motion path and on the temporary motion path (the straight line), and find the maximal difference. If the maximal difference is larger than a threshold, the straight line is broken by connecting the camcorder position of this frame on the Kalman path to the endpoints of the straight line, and hence the temporary camcorder motion path becomes a polyline with two straight line segments. This step is performed iteratively until the temporary camcorder motion path can preserve all important regions in the video. The final camcorder motion path is shown as the blue polyline in Fig. 3.

Once the camcorder motion path is fitted by a polyline, the video frames are aligned along the polyline-fitted camcorder motion path. If the global transition matrix from the first frame to the $i$-th frame is denoted by $M_i$, then the $i$-th frame is aligned to

$$M_i \cdot \prod_{j=i-1}^{0} \left(T_j^{j+1}\right)^{-1} \cdot p_i,$$

where $p_i$ denotes the pixels on the $i$-th frame and $T_j^{j+1}$ represents the affine transformation matrix between the $j$-th frame and the $(j+1)$-th frame. Hence, we obtain a stabilized video after the polyline fitting and frame alignment.
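The iterative splitting described in this section can be sketched as follows for one motion component (horizontal or vertical). This is only a sketch under stated assumptions: the stopping rule is a plain distance threshold, whereas the text stops when all important regions are preserved, and the function name `fit_polyline` and its parameters are hypothetical.

```python
import numpy as np

def fit_polyline(original, kalman, threshold):
    """Fit a polyline to one component of the camera path.

    original: per-frame positions of the original path.
    kalman:   per-frame positions of the Kalman path (polyline vertices lie on it).
    Returns the sorted vertex frame indices and the polyline sampled at every frame.
    """
    original = np.asarray(original, dtype=float)
    kalman = np.asarray(kalman, dtype=float)
    n = len(kalman)
    frames = np.arange(n)
    vertices = [0, n - 1]            # start with one straight line
    while True:
        # Temporary path: straight segments through the Kalman positions at the vertices
        poly = np.interp(frames, vertices, kalman[vertices])
        diff = np.abs(original - poly)
        diff[vertices] = 0.0         # a frame that is already a vertex is not split again
        worst = int(np.argmax(diff))
        if diff[worst] <= threshold:
            return vertices, poly
        vertices = sorted(vertices + [worst])   # break the line at the worst frame
```

Because every iteration adds a frame that is not yet a vertex, the loop ends after at most as many iterations as there are frames.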

5. VIDEO COMPLETION

After aligning the video frames along the stabilized camcorder motion path, there are several missing areas in the new stabilized video. Traditionally, this problem is solved by cutting out the missing areas and scaling the stabilized video to its original size, but this may result in a stabilized video with worse resolution. Hence, to make the resolution of the stabilized video as good as that of the original one, the missing areas are filled from other frames.

To complete the video, we first detect the moving objects to segment the video into a static background region and some dynamic moving object regions (Sec. 5.1). Then, we complete the missing areas by filling dynamic regions (Sec. 5.2) and static regions (Sec. 5.3) separately. Since the camcorder motion path is fitted by a polyline, the missing areas may be large and need to be filled with pixels from frames far from the current frame, so we provide a three-dimensional Poisson-based smoothing method to smooth the discontinuous stitched areas (Sec. 5.4).

5.1. Moving Object Detection

In order to detect moving objects, we first align every pair of adjacent frames by using the affine transformation obtained in Sec. 4.1. Then, we evaluate the optical flow between them by using an efficient and less noisy optical flow approach [2, 3] to obtain the motion vector of each pixel. The motion vector of pixel $p_i$ is written as $F_i(p_i)$, which represents the motion flow at pixel $p_i$ from frame $i$ to frame $i+1$, and the length of the motion vector gives the motion value.

Fig. 4: Left: The frame after changing the position according to the stabilized camcorder motion path. Right: The mask of detected moving objects (white regions).

The motion values in the moving object regions are considered to be relatively larger than those in the static background region. Hence, we can obtain a mask of the regions with large motion values by a simple threshold, as shown in Fig. 4. The dynamic regions are obtained by dilating the mask, which helps to guarantee that the boundaries of the moving objects are included in the dynamic regions. If a missing area falls in a region where the neighboring pixels have been masked as dynamic, this area is treated as a dynamic region and motion inpainting is used to complete it; otherwise, we recover the area by mosaicing.

5.2. Dynamic Region Completion

For the dynamic missing regions, instead of filling in the color values from other frames directly, we want to fill them with correct motion vectors. Once we derive the motion vector of each pixel in the missing areas, we can get the pixel color from the next frame according to the motion vectors. The local motion vectors in the known image areas are propagated to the dynamic missing areas as described in [11].

First, local motion vectors are estimated by computing the optical flow between the stabilized frames [2, 3]. The propagation starts at the pixels on the boundary of the dynamic missing areas, where the motion vector of a pixel is calculated as a weighted average of the motion vectors of its neighboring pixels. The process continues until the dynamic missing areas are completely filled with motion vectors. If $p_i$ is a pixel in the missing area, it is filled according to its motion vector, which is determined by

$$F_i(p_i) = \frac{\sum_{q_i \in N_{p_i}} w(p_i, q_i)\, F_i(q_i)}{\sum_{q_i \in N_{p_i}} w(p_i, q_i)},$$

where $w(p_i, q_i)$ determines the contribution of the motion vector of pixel $q_i$, $N_{p_i}$ denotes the eight neighboring pixels of $p_i$, and $F_i(p_i)$ represents the motion vector at pixel $p_i$ from frame $i$ to frame $i+1$. Suppose the neighboring pixel $q_i \in N_{p_i}$ already has a motion vector; according to its motion vector, we can estimate its position on the next frame as $q_{i+1}$. By using the geometric relationship between the pixels $p_i$ and $q_i$, the position of the pixel $p_{i+1}$ in frame $i+1$ can also be determined, as illustrated in Fig. 5. Since the pixels of the same object have similar color values and move in the same direction, if the difference between the color values of the pixels $q_{i+1}$ and $p_{i+1}$ is small, they likely belong to the same object, and the weight of the motion vector of pixel $q_i$ is set to be large as

$$w(p_i, q_i) = \frac{1}{ClrD(p_{i+1}, q_{i+1}) + \varepsilon},$$

where $\varepsilon$ is a small value for avoiding division by zero and $ClrD(p_{i+1}, q_{i+1})$ is the $l_2$-norm color difference in RGB color space between the pixels $q_{i+1}$ and $p_{i+1}$. This weight term guarantees that the contribution of motion vectors from different objects is small. Fig. 6 shows the result.
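The weighted average of Sec. 5.2 can be sketched for a single missing pixel as below. This is a minimal illustration and not the authors' implementation: the function name `propagate_motion`, the (row, column) flow layout, and the rounding of positions to the nearest pixel are assumptions made for the example.

```python
import numpy as np

def propagate_motion(p, flow, known, next_frame, eps=1e-3):
    """Estimate the motion vector F_i(p_i) of a missing pixel p = (row, col).

    flow:       (H, W, 2) motion vectors as (d_row, d_col), valid where known is True.
    known:      (H, W) bool mask of pixels that already have a motion vector.
    next_frame: (H, W, 3) RGB colors of frame i+1.
    Returns the weighted-average motion vector, or None if no neighbor is usable.
    """
    next_frame = np.asarray(next_frame, dtype=float)
    h, w = known.shape
    num, den = np.zeros(2), 0.0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = p[0] + dr, p[1] + dc
            if (dr, dc) == (0, 0) or not (0 <= r < h and 0 <= c < w) or not known[r, c]:
                continue
            f = flow[r, c]
            # q_{i+1}: where the neighbor lands; p_{i+1}: where p lands under the same motion
            qr, qc = int(round(r + f[0])), int(round(c + f[1]))
            pr, pc = int(round(p[0] + f[0])), int(round(p[1] + f[1]))
            if not (0 <= qr < h and 0 <= qc < w and 0 <= pr < h and 0 <= pc < w):
                continue
            clr_d = np.linalg.norm(next_frame[pr, pc] - next_frame[qr, qc])  # ClrD
            weight = 1.0 / (clr_d + eps)
            num += weight * f
            den += weight
    return num / den if den > 0 else None
```

Calling this function for the pixels on the boundary of a missing area, marking them as known, and repeating inward corresponds to the propagation order described in the text.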

Fig. 5: Suppose the neighboring pixel $q_i$ already has a motion vector; according to its motion vector, we can estimate its position on the next frame as $q_{i+1}$. By using the geometric relationship between the pixels $p_i$ and $q_i$, the position of the pixel $p_{i+1}$ in frame $i+1$ can also be determined. (Panels: frame $i$, where the motion of $p_i$ is unknown, and frame $i+1$.)

Fig. 6: Upper-Left: The frame after changing the position according to the stabilized camcorder motion path. There is a missing area at the left side and a moving object across the missing area. Lower-Left: The result of dynamic region completion. Right Column: The close-up view of the yellow square of the Left Column.

5.3. Static Region Completion

After completing the dynamic regions, we then recover the static ones from the neighboring frames, which are warped to the current frame according to the affine transformation obtained in Sec. 4.1. For a pixel $p_i$ in the static missing area of frame $i$, if its corresponding pixel $p_{i'}$ exists in the warped neighboring frame $i'$, we directly copy that pixel to the missing pixel $p_i$ by $p_i = T_{i'}^{i}(p_{i'})$, where $T_{i'}^{i}$ represents the transformation from frame $i'$ to frame $i$. Fig. 7 illustrates the process and Fig. 8 shows the static region completion result.

Fig. 7: $p_i$ is a pixel in the missing area of the current frame $i$ and $p_{i+1}$ is its corresponding pixel in the next frame $i+1$. The pixel value of $p_{i+1}$ is directly copied to $p_i$ for recovering the missing pixel.

Fig. 8: Left: The frame after changing the position according to the stabilized camcorder motion path. There is a missing area at the right side and upper side. Right: The result of static region completion. Since the missing area is large, there is a discontinuous boundary between the recovered pixels and the original frame.
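The copy step of the static region completion can be sketched as follows, trying the temporally nearest neighboring frames first. The sketch assumes that every frame has already been warped into the coordinates of the current frame (the warping with $T_{i'}^{i}$ is not shown), and the function name `fill_static` and the mask representation are hypothetical.

```python
import numpy as np

def fill_static(frames, valid, i):
    """Fill the missing pixels of frame i from neighboring frames.

    frames: list of (H, W, 3) arrays, all already warped into frame i's coordinates
            (an assumption of this sketch).
    valid:  list of (H, W) bool masks, True where a frame has pixel data.
    Returns the filled frame and a mask of pixels that are still missing.
    """
    out = frames[i].copy()
    missing = ~valid[i]
    n = len(frames)
    for d in range(1, n):                 # temporal distance 1, 2, ...
        if not missing.any():
            break
        for j in (i - d, i + d):          # frames i-d and i+d
            if 0 <= j < n:
                take = missing & valid[j]
                out[take] = frames[j][take]
                missing &= ~take
    return out, missing
```

Pixels that remain in the returned `missing` mask are the ones that no neighboring frame could supply and that would need a separate image inpainting step.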

To find the corresponding pixel $p_{i'}$ of $p_i$, we begin the search from the nearest neighboring frames and propagate the search outward. For example, if $i$ is the current frame we want to recover, we search the frames $i-1$ and $i+1$ first; if there are missing areas that still have not been recovered from these two frames, the following two frames $i-2$ and $i+2$ are used to recover them. We continue the search until all the missing pixels in the static missing areas are completed. Finally, if there are still some missing areas that we cannot recover, we use a simple image inpainting approach to complete them.

5.4. Poisson-Based Smoothing

Although the missing areas caused by the stabilized camcorder motion path are completed, there may be a discontinuous boundary between the recovered pixels and the original frame, since the missing areas may be large and may need to be filled from frames far from the current one. This problem may be addressed by simply applying a smoothing approach to the boundary areas; however, such simple smoothing operations only solve the spatial discontinuity problem. When we play the spatially smoothed video, it still has a temporal discontinuity problem. In order to keep both spatial and temporal continuity, we provide a three-dimensional Poisson-based smoothing method. The Poisson-based approach is often used in image editing [14], and we extend it to video editing.

To solve the discontinuity problem, before filling in a pixel from other frames, the Poisson equation is applied to obtain a smoothed pixel by considering its neighboring pixels in the same frame and in neighboring frames. We first apply the Poisson equation in the spatial domain, which is written as: for all $p \in \Omega$,

$$|N_p|\, f_p - \sum_{q \in N_p \cap \Omega} f_q = \sum_{q \in N_p \cap \partial\Omega} f_q^{*} + \sum_{q \in N_p} v_{pq},$$

where $\Omega$ denotes the missing area, $p$ is a pixel in the missing area $\Omega$, $N_p$ denotes the neighboring pixels of pixel $p$, $|N_p|$ is the number of neighboring pixels in $N_p$, $f_p$ and $f_q$ are the correct pixel values of pixels $p$ and $q$ which we want to derive, $v_{pq}$ determines the divergence of pixels $p$ and $q$, $\partial\Omega$ is the region surrounding the missing area $\Omega$ in the known image areas, and $f_q^{*}$ denotes the known color value of pixel $q$ in $\partial\Omega$.

The Poisson equation keeps the correct structure in the missing area and achieves a seamless stitching between the recovered areas and the known image areas. In order to achieve temporal coherence, after recovering the missing areas of each frame, we correct the pixel values of the missing areas by applying the Poisson equation again. For the missing areas, we now consider not only the spatially neighboring pixels but also the temporally neighboring pixels. Hence, the Poisson equation is the same, but $N_p$ includes all neighboring pixels of pixel $p$ in the video volume. Fig. 9 shows the result.
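The discrete Poisson equation above can be solved in several ways; the following minimal sketch uses plain Jacobi iteration on a single color channel and works unchanged for a 2D frame or a 3D (time, row, column) video volume. The solver choice, the function name `poisson_blend`, the fixed iteration count, and taking $v_{pq}$ as the difference of the copied pixel values are assumptions of this example and are not specified in the text.

```python
import numpy as np

def poisson_blend(target, source, mask, iters=500):
    """Solve |N_p| f_p - sum f_q = sum f*_q + sum v_pq inside the masked area.

    target: known pixel values f* (used outside the mask).
    source: copied pixel values; they define v_pq = source_p - source_q (assumed).
    mask:   True inside the missing area Omega.
    Arrays may be 2D frames or 3D video volumes of identical shape.
    """
    f = np.where(mask, source, target).astype(float)
    g = np.asarray(source, dtype=float)
    for _ in range(iters):
        acc = np.zeros_like(f)
        cnt = np.zeros_like(f)
        for axis in range(f.ndim):
            for shift in (1, -1):
                nb_f = np.roll(f, shift, axis=axis)   # neighbor values f_q
                nb_g = np.roll(g, shift, axis=axis)
                ok = np.ones(f.shape, dtype=bool)     # drop neighbors wrapped by roll
                idx = [slice(None)] * f.ndim
                idx[axis] = 0 if shift == 1 else -1
                ok[tuple(idx)] = False
                acc += np.where(ok, nb_f + (g - nb_g), 0.0)   # f_q + v_pq
                cnt += ok                                      # |N_p|
        f = np.where(mask, acc / cnt, f)   # known pixels stay fixed at f*
    return f
```

In this sketch, passing a whole video volume instead of a single frame makes the neighborhood include the previous and next frames, which corresponds to the temporal extension described in the text.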

Fig. 9: Upper-Left: The frame after video completion. Since the missing area is large, there is a discontinuous boundary between the recovered pixels and the original frame. Lower-Left: The result of video completion with Poisson-based smoothing. Right Column: The close-up view of the yellow square of the Left Column.

6. VIDEO DEBLURRING

After video stabilization, the blurry frames which look smooth in the original video become noticeable. Our video deblurring method is fundamentally based on [11], but we first separate the moving objects from the static background and deal with them separately, as in the video completion process. Since the blur of the static background is much more noticeable than that of the moving objects, in the following explanation we focus only on deblurring the static background.

The main idea of this method is to copy the pixels of neighboring sharper frames to the blurry frames. We first evaluate the "relative blurriness" of each frame by calculating its gradient. Generally, the gradient of a blurry image is smaller than that of a sharper one in the same regions. With this assumption, the blurriness of frame $i$ is defined as:

$$B_i = \sum_{p_i} \left( g_x(p_i)^2 + g_y(p_i)^2 \right),$$

where $p_i$ is a pixel of frame $i$, and $g_x$ and $g_y$ are the gradients in the $x$- and $y$-directions, respectively. We can derive the relative blurriness between the current frame and its neighboring frames by comparing their blurriness values $B_i$ and $B_{i'}$. If the blurriness $B_i$ of the current frame $i$ is smaller than the blurriness $B_{i'}$ of its neighboring frames $i'$, then the frames $i'$ are treated as sharper than frame $i$, and we can use the frames $i'$ to recover the current blurry frame $i$ by transferring the corresponding pixels from these sharper frames $i'$ to the blurry frame $i$ by

$$\hat{p}_i = \frac{p_i + \sum_{i' \in N_i} w_{i}^{i'}\, T_{i'}^{i}\, p_{i'}}{1 + \sum_{i' \in N_i} w_{i}^{i'}},$$

where $\hat{p}_i$ and $p_i$ are the same pixel in frame $i$ after and before the deblurring operation, $N_i$ denotes the neighboring frames of the current frame $i$, $T_{i'}^{i}$ represents the transformation from the neighboring frame $i' \in N_i$ to frame $i$, and $w_{i}^{i'}$ is a weighting factor between $i'$ and $i$ which is defined as:

$$w_{i}^{i'} = \begin{cases} 0 & \text{if } B_{i'} / B_i < 1 \\ B_{i'} / B_i & \text{otherwise} \end{cases}.$$

Fig. 10 shows the result of this deblurring method.

Fig. 10: Upper-Left: A blurry frame. Lower-Left: The result of video deblurring. Right Column: The close-up view of the yellow square of the Left Column.
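The gradient-based measure and the weighted pixel transfer of this section can be sketched as below for grayscale frames. The sketch assumes that all frames have already been warped into the coordinates of the current frame, uses `numpy.gradient` as the gradient operator, and returns a zero weight for a frame with no gradient at all to avoid a division by zero; these choices and the function names are illustrative assumptions.

```python
import numpy as np

def blurriness_measure(frame):
    """B_i: sum of squared x- and y-gradients (larger values indicate a sharper frame)."""
    gy, gx = np.gradient(np.asarray(frame, dtype=float))
    return float(np.sum(gx ** 2 + gy ** 2))

def transfer_weight(b_cur, b_nb):
    """Weight of a neighboring frame: 0 unless it is at least as sharp as the current one."""
    if b_cur <= 0.0:          # guard added for this sketch only
        return 0.0
    ratio = b_nb / b_cur
    return 0.0 if ratio < 1.0 else ratio

def deblur_frame(frames, i, neighbors):
    """Blend frame i with its sharper neighbors.

    frames:    list of (H, W) grayscale frames, all already aligned to frame i
               (an assumption of this sketch).
    neighbors: indices of the neighboring frames N_i.
    """
    b = [blurriness_measure(f) for f in frames]
    num = np.asarray(frames[i], dtype=float).copy()
    den = 1.0
    for j in neighbors:
        w = transfer_weight(b[i], b[j])
        num += w * np.asarray(frames[j], dtype=float)
        den += w
    return num / den
```

A frame whose neighbors are all blurrier receives only zero weights and is therefore returned unchanged.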

7. RESULT

All of the videos used in this paper were captured using a hand-held video camera without a tripod, and the resolution of all the videos is 720×480. The resolution of all stabilized videos is the same as that of the input videos. Fig. 2, Fig. 11 and Fig. 12 show our results. In Fig. 2, the user wants to use the hand-held camcorder to capture a panorama view. Without a tripod, the captured video is shaky due to hand shake. In Fig. 11, the user wants to use the hand-held camcorder to capture a man walking with his child. Without a tripod, the captured video is shaky due to hand shake. The bottom row of Fig. 11 shows our result, which is as stable as if it had been captured with a tripod. In Fig. 12, the user wants to use the hand-held camcorder to capture a man playing with his dog, but due to the limited view angle, the user pans the camcorder a little to capture the whole scene. The bottom row of Fig. 12 shows our result, in which the stabilized camcorder motion path is just like that of a scene captured with a tripod.

8. CONCLUSION AND FUTURE WORK

A full-frame video stabilization approach is proposed in this paper to obtain a stabilized video. Since we use a polyline to fit the original camcorder motion path, the stabilized motion path is much more stable than those of other smoothing approaches. Hence, in the stabilized video, not only the high-frequency shaky motions but also the low-frequency unexpected movements are removed. Although using a polyline to estimate the camcorder motion path may cause large missing areas, this is solved by applying a three-dimensional Poisson-based smoothing method. To fill the missing areas from other frames and deal with blurry frames, we separate the moving objects from the static background and deal with them separately in the completion and deblurring processes.

ACKNOWLEDGEMENT

This paper was partially supported by the National Science Council of Taiwan under NSC 95-2622-E-002-018 and also by the Excellent Research Projects of National Taiwan University under 95R0062-AE00-02.

REFERENCES

[1] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, “Image Inpainting,” Proc. ACM SIGGRAPH 2000, pp. 417-424, 2000.

[2] M. J. Black and P. Anandan, “A Framework for the Robust Estimation of Optical Flow,” Proc. IEEE ICCV 1993, pp. 231-236, 1993.

[3] M. J. Black and P. Anandan, “The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields,” CVIU, Vol. 63, No. 1, pp. 75-104, 1996.

[4] C. Buehler, M. Bosse, and L. McMillan, “Non-Metric Image-Based Rendering for Video Stabilization,” Proc. IEEE CVPR 2001, Vol. 2, pp. 609-614, 2001.

[5] A. Criminisi, P. Perez, and K. Toyama, “Object Removal by Exemplar-Based Inpainting,” Proc. IEEE CVPR 2003, Vol. 2, pp. 721-728, 2003.

[6] M. A. Fischler and R. C. Bolles, “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography,” CACM, Vol. 24, No. 6, pp. 381-395, 1981.

[7] J. Jia, T.-P. Wu, Y.-W. Tai, and C.-K. Tang, “Video Repairing: Inference of Foreground and Background under Severe Occlusion,” Proc. IEEE CVPR 2004, Vol. 1, pp. 364-371, 2004.

[8] A. Levin, A. Zomet, and Y. Weiss, “Learning How to Inpaint from Global Image Statistics,” Proc. IEEE ICCV 2003, Vol. 1, pp. 305-312, 2003.

[9] A. Litvin, J. Konrad, and W. C. Karl, “Probabilistic Video Stabilization using Kalman Filtering and Mosaicking,” Proc. SPIE EI 2003, Vol. 5022, pp. 663-674, 2003.

[10] D. G. Lowe, “Object Recognition from Local Scale-Invariant Features,” Proc. IEEE ICCV 1999, pp. 1150-1157, 1999.

[11] Y. Matsushita, E. Ofek, X. Tang, and H.-Y. Shum, “Full-Frame Video Stabilization,” Proc. IEEE CVPR 2005, Vol. 1, pp. 50-57, 2005.

[12] Z. Pan and C.-W. Ngo, “Structuring Home Video by Snippet Detection and Pattern Parsing,” Proc. ACM MIR 2004, pp. 69-76, 2004.

[13] K. A. Patwardhan, G. Sapiro, and M. Bertalmio, “Video Inpainting under Constrained Camera Motion,” IEEE TIP, Vol. 16, No. 2, pp. 545-553, 2007.

[14] P. Pérez, M. Gangnet, and A. Blake, “Poisson Image Editing,” Proc. ACM SIGGRAPH 2003, pp. 313-318, 2003.

[15] T. Shiratori, Y. Matsushita, X. Tang, and S. B. Kang, “Video Completion by Motion Field Transfer,” Proc. IEEE CVPR 2006, Vol. 1, pp. 411-418, 2006.

[16] Y. Wexler, E. Shechtman, and M. Irani, “Space-Time Video Completion,” Proc. IEEE CVPR 2004, Vol. 1, pp. 120-127, 2004.

Fig. 11: Top row: Five frames of the original video. Middle row: Stabilized frames. Black areas show the missing areas of the frames. Bottom row: Our result.

Fig. 12: Top row: Five frames of the original video. Middle row: Stabilized frames. Black areas show the missing areas of the frames. Bottom row: Our result.
