FULL-FRAME VIDEO STABILIZATION
WITH A POLYLINE-FITTED CAMCORDER PATH
1Jong-Shan Lin (林蓉珊), 1Wei-Ting Huang (黃惟婷), 2Bing-Yu Chen (陳炳宇), 3Ming Ouhyoung (歐陽明)
National Taiwan University
E-mail:
1{maruko,weiting}@cmlab.csie.ntu.edu.tw,
2[email protected],
3[email protected]
ABSTRACT
Annoying shaky motion is one of the significant problems in home videos, since hand shake is unavoidable when capturing with a hand-held camcorder. Video stabilization is an important technique for solving this problem. However, the stabilized videos produced by current methods usually have decreased resolution and are still not very stable. In this paper, we propose a novel, robust, and practical method of video stabilization with a polyline-fitted camcorder path. Our method produces full-frame stabilized videos in which not only the high-frequency shaky motions but also the low-frequency unexpected movements are removed. To achieve this, we use a polyline to estimate a new stable camcorder motion path, and then fill the dynamic and static missing areas caused by frame alignment from other frames to keep the same resolution and quality as the original video. Furthermore, we smooth the discontinuous regions by using a three-dimensional Poisson-based method. After these automatic operations, a full-frame stabilized video is obtained.
1. INTRODUCTION
As the use of digital camcorders grows, capturing videos with hand-held camcorders has become more convenient than ever. However, since most people do not carry a tripod with their camcorders, unwanted vibration in the video sequence is an unavoidable effect of hand shake. Avoiding or removing this annoying shaky motion is one of the significant problems in home videos, and video stabilization is an important technique for solving it. Many existing video stabilization applications produce a stabilized video by smoothing the camcorder motion path and then truncating the missing areas after aligning the video frames along the smoothed motion path. Hence, the stabilized videos still have many unexpected movements, since only high-frequency shaky motions are removed during the smoothing stage. Moreover, the quality of the stabilized videos is usually decreased by the truncating stage.
In this paper, we propose a novel, robust, and practical method of full-frame video stabilization with a polyline-fitted camcorder path. To achieve this, we use a polyline to estimate a new stable camcorder motion path. Hence, the resulting videos are much more stable and much closer to the videos that the users want to capture. Once we obtain a new stable camcorder motion path of the video, the video frames are aligned along the new motion path to form a stabilized video.
Due to the frame alignment, there are several missing areas in the new stabilized video. Unlike trimming approaches, we fill the dynamic and static missing areas by using motion inpainting and warped neighboring video frames, respectively. This completion method works well even if there are moving objects located at the boundary of the video frames, while keeping the same resolution and quality as the original video. However, since we use a polyline to fit the camcorder motion path rather than a parametric curve to smooth it, the missing areas are usually large and cannot be easily completed from neighboring frames. Filling the missing areas with pixels from frames far from the current frame may cause discontinuity at the boundaries of the filled areas, since the intensity of each video frame is not necessarily the same. Hence, we smooth the discontinuous boundaries by using a three-dimensional Poisson-based method, which takes both spatial and temporal consistency into consideration and results in stitching that is seamless both spatially and temporally.
After stabilizing the videos, the blurry video frames that are also caused by hand shake become much more noticeable. Hence, we detect the blurry frames and transfer pixels from neighboring sharper frames. Therefore, our method produces full-frame stabilized videos in which not only the high-frequency shaky motions but also the low-frequency unexpected movements are removed. The stabilized videos are stable, comfortable to watch, and much closer to the videos that the users would have captured if they had brought a tripod with them.
2. RELATED WORK
Video stabilization is an important research topic in multimedia, image processing, computer vision, and computer graphics. Buehler et al. proposed an image-based rendering (IBR) method to stabilize videos [4]. Recently, image processing methods have been widely used for video stabilization. For estimating the camcorder motion path, Litvin et al. estimated a new camcorder motion path by altering the camera parameters [9], and Matsushita et al. smoothed the camcorder motion path to reduce the high-frequency shaky motions [11]. However, although the high-frequency shaky motions can be easily reduced, the stabilized videos still have low-frequency unexpected movements.
For filling the missing image areas, several image inpainting approaches have been developed for recovering missing holes in an image [1, 5, 8]. Although these approaches can complete the missing regions with correct structure, there will be obvious discontinuity if we recover each video frame independently. Litvin et al. used a mosaic method to fill the missing areas in the stabilized video [9]; however, they did not consider that moving objects may appear at the boundary of the video. Wexler et al. and Shiratori et al. filled the missing holes by sampling spatio-temporal volume patches from other portions of the video volume [15, 16]. The former approach used the most similar patch in color space for completing the missing holes, and the latter used the patch with a similar motion vector. The drawback of these methods is that they need a large amount of computing time to search for a proper patch. Jia et al. and Patwardhan et al. segmented the video into two layers and recovered them individually [7, 13]. These methods rely on the moving objects being observed for a long time and with periodic motion, which is not guaranteed in common home videos.
3. OVERVIEW
Fig. 1: System framework. The input (original video sequence) passes through motion path estimation (global path estimation, polyline fitting), video completion (moving object detection, dynamic region completion, static region completion, Poisson-based smoothing), and video deblurring to produce the output (stabilized video sequence).
Fig. 1 shows the system framework of our algorithm. The input of our system is a video sequence captured by a hand-held camcorder without a tripod. Hence, the video sequence has many annoying shaky motions due to hand shake. The first process of our system is called motion path estimation (Sec. 4). In this process, the camcorder motion path of the original video is estimated and replaced by a stabilized one. This process contains several steps. In the first step (Sec. 4.1), we find the transformation between consecutive frames and combine all of the transformations to obtain the global camcorder motion path of the original video. In the second step (Sec. 4.2), the estimated global camcorder motion path is approximated by a polyline, since the camcorder motion path of a video captured with a tripod resembles a polyline.
After the stabilized camcorder motion path is obtained, the video completion process is applied (Sec. 5). Because the position of each frame is changed by the frame alignment along the new motion path, there are some missing areas within each aligned frame. The first step in video completion is to detect whether moving objects exist and where they are (Sec. 5.1). In the second step, we separate the moving objects, as dynamic foreground regions, from the static background regions and complete their missing areas by different methods (Sec. 5.2 and 5.3). Filling the missing areas with pixels from frames far from the current frame may cause discontinuity at the boundaries of the filled areas, since the intensity of each video frame is not necessarily the same. In order to obtain seamless stitching, we apply a three-dimensional Poisson-based smoothing method to the discontinuous regions (Sec. 5.4).
Fig. 2: Top row: Three frames of the original video. There are annoying shaky motions in the frames. Middle row: Stabilized frames. Black areas show the missing areas of the frames. Bottom row: Completed frames. This is the result of our method; the shaky motions of the video are removed.
The last process is video deblurring (Sec. 6). Because the motion blur of each frame may not match the stabilized camcorder motion path, the blurry frames become much more noticeable in the new stabilized video. Instead of finding an accurate point spread function for image deblurring, we choose a video deblurring method that transfers pixel values from neighboring sharper frames to the blurry frames. After the above automatic processes, the output is a stabilized video with a stable camcorder motion path that keeps the same resolution and quality as the original one. Fig. 2 shows three frames of an input video and their stabilized results before and after the video completion and deblurring processes.
4. MOTION PATH ESTIMATION
In order to generate stabilized videos, we first estimate the camcorder motion path of the original video (Sec. 4.1). Then, the original camcorder motion path is stabilized by using a polyline-based motion path (Sec. 4.2), so that the undesirable motion caused by hand shake can be removed.
The Kalman filter employed in Sec. 4.2 uses the following model:
$$ M_i = H K_i + v, \qquad H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}, $$
and
$$ K_{i+1} = \Phi K_i + w, \qquad \Phi = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, $$
where $\Phi$ is the state transition model which is applied to the previous state $K_i$ to approximate the next state $K_{i+1}$, $H$ is the observation model which maps the true state $K_i$ into the observed $M_i$, $v \sim N(0, R)$ is the observation noise which is assumed to be zero-mean Gaussian white noise with covariance $R$, and $w \sim N(0, Q)$ is the process noise which is assumed to be drawn from a zero-mean multivariate normal distribution with covariance $Q$. The camcorder motion path smoothed by the Kalman filter is shown as the green curves in Fig. 3.
4.1. Global Path Estimation
To estimate the global camcorder motion path, we first extract the feature points of each frame by SIFT (Scale Invariant Feature Transform) [10], which is invariant to scaling and rotation of the image. The feature points of every pair of consecutive frames are matched if the distances between their feature descriptors are small enough, and RANSAC (RANdom SAmple Consensus) [6] is used to select the inliers among the matched feature pairs. For accuracy, an over-constrained system is solved in the least-squares sense over these matched feature pairs to derive the affine transformation between the two consecutive frames. The affine transformation is represented by a 3×3 affine model which contains six parameters. That is, if we find the transformation matrix $T_i$ between frames $i$ and $i+1$, the corresponding pixel $p_i$ on frame $i$ and pixel $p_{i+1}$ on frame $i+1$ have the following relationship:
$$ p_{i+1} = T_i \cdot p_i. $$
Once the transformation matrices between the consecutive frames are obtained, all of the transformations can be combined to derive a global transformation chain.
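The chaining of per-frame transformations can be illustrated with a minimal Python sketch. It assumes each $T_i$ is already available as a 3×3 matrix (pure translations are used here only for illustration), and it reads the camcorder path off by tracking where the origin of the first frame lands. That tracking choice and the helper names `matmul3`, `translation`, `global_chain`, and `apply_affine` are assumptions of this sketch, not details stated in the text.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def translation(tx, ty):
    """A 3x3 affine matrix that only translates (illustrative stand-in for T_i)."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def global_chain(transforms):
    """transforms[i] maps frame i to frame i+1, i.e. p_{i+1} = T_i * p_i.

    Returns a list G where G[i] maps frame 0 to frame i.
    """
    chain = [translation(0.0, 0.0)]  # identity for frame 0
    for t in transforms:
        chain.append(matmul3(t, chain[-1]))
    return chain


def apply_affine(m, p):
    """Apply a 3x3 affine matrix to a 2-D point."""
    x, y = p
    return (m[0][0] * x + m[0][1] * y + m[0][2],
            m[1][0] * x + m[1][1] * y + m[1][2])


# Hypothetical frame-to-frame motions (assumed values).
steps = [translation(2.0, -1.0), translation(3.0, 0.5), translation(-1.0, 1.5)]
chain = global_chain(steps)
# Assumed convention: the path is the image of the first frame's origin.
path = [apply_affine(g, (0.0, 0.0)) for g in chain]
```

With these assumed translations, `path` accumulates the per-frame offsets, which is the kind of horizontal and vertical trajectory that Sec. 4.2 then smooths.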
4.2. Motion Path Fitting
To obtain a stabilized camcorder motion path free of not only the high-frequency shaky motions but also the low-frequency unexpected movements, we use a polyline to fit the estimated global camcorder motion path. We first separate the camcorder motion path estimated in Sec. 4.1 into horizontal and vertical components, $M_i = [x_i, y_i]$, and process them separately. Then, a Kalman filter is employed to estimate a smooth camcorder motion path $K_i$ from $M_i$ [12], which is represented in two-dimensional space by the state-space model described in Sec. 4.
Fig. 3: The original camcorder motion path (red curve) and the estimated camcorder motion path after applying the Kalman filter (green curve) and fitting by a polyline (blue straight line) for horizontal (Upper) and vertical (Lower) motion paths. Both plots show the camera position in pixels against the frame index.
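As a rough illustration of the Kalman filtering step, the sketch below filters one axis of the motion path with a constant-velocity model. Because the horizontal and vertical paths are processed separately, it reduces the state to [position, velocity] for a single axis, which corresponds to one axis of $\Phi$ and $H$. The noise levels `q` and `r`, the initial covariance, and the function name `kalman_filter_axis` are assumptions chosen for illustration, not values reported in this paper.

```python
def kalman_filter_axis(measurements, q=1e-3, r=1.0):
    """Forward Kalman filter for one axis with state [position, velocity].

    Per-axis model: Phi = [[1, 1], [0, 1]], H = [1, 0],
    process noise Q = q * I and observation noise variance r (assumed values).
    """
    pos, vel = measurements[0], 0.0
    # Covariance P = [[p00, p01], [p01, p11]], kept symmetric (assumed P0 = I).
    p00, p01, p11 = 1.0, 0.0, 1.0
    out = [pos]
    for z in measurements[1:]:
        # Predict: K_{i+1} = Phi * K_i and P = Phi * P * Phi^T + Q.
        pos, vel = pos + vel, vel
        p00, p01, p11 = p00 + 2.0 * p01 + p11 + q, p01 + p11, p11 + q
        # Update with the observation M_i = H * K_i + v.
        s = p00 + r
        g0, g1 = p00 / s, p01 / s
        resid = z - pos
        pos, vel = pos + g0 * resid, vel + g1 * resid
        p00, p01, p11 = (1.0 - g0) * p00, (1.0 - g0) * p01, p11 - g1 * p01
        out.append(pos)
    return out


# Hypothetical jittery path: alternating +1 / -1 pixel offsets.
jitter = [(-1.0) ** i for i in range(50)]
smooth = kalman_filter_axis(jitter)
```

A smaller `q` relative to `r` makes the filtered path follow the measurements less closely, which is the behavior wanted for suppressing high-frequency shake.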
After applying the Kalman filter, we obtain the smoothed camcorder motion path (Kalman path). Then, we use a polyline to fit the Kalman path. We first assume there is only one straight line connecting the camcorder position of the first frame and that of the last frame on the Kalman path. Then, we calculate the difference between the camcorder position of each frame in the original motion path and the temporary motion path (the straight line), and find the maximal difference. If the maximal difference is larger than a threshold, the straight line is broken by connecting the camcorder position of this frame on the Kalman path to the endpoints of the straight line, and hence the temporary camcorder motion path becomes a polyline with two straight line segments. This step is performed iteratively until the temporary camcorder motion path preserves all important regions in the video. The final camcorder motion path is shown as the blue polyline in Fig. 3.
Once the camcorder motion path is fitted by a polyline, the video frames are aligned along the polyline-fitted camcorder motion path. If the global transition matrix from the first frame to the $i$-th frame is denoted by $M_i$, then the $i$-th frame is aligned to
$$ M_i \cdot \prod_{j=0}^{i-1} T_j^{-1} \cdot p_i, $$
where $p_i$ means the pixels on the $i$-th frame and $T_j$ represents the affine transformation matrix between the $j$-th frame and the $(j+1)$-th frame. Hence, we can obtain a stabilized video after the polyline fitting and frame alignment.
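The iterative splitting described above can be sketched in Python for one axis of the path. The sketch places polyline vertices on the Kalman path and measures deviation against the original path, as in the text. Stopping purely on a fixed deviation `threshold` is an assumption that stands in for the "preserves all important regions" criterion, and the function names are hypothetical.

```python
def fit_polyline(original, kalman, threshold):
    """Return the frame indices of the polyline vertices (on the Kalman path)."""
    def split(a, b):
        # Find the interior frame whose original position deviates most
        # from the straight line joining the Kalman positions at a and b.
        worst, worst_dev = None, threshold
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            on_line = kalman[a] + t * (kalman[b] - kalman[a])
            dev = abs(original[i] - on_line)
            if dev > worst_dev:
                worst, worst_dev = i, dev
        if worst is None:
            return [a]  # segment is good enough; keep only its start vertex
        # Break the segment at the worst frame and refine both halves.
        return split(a, worst) + split(worst, b)

    last = len(original) - 1
    return split(0, last) + [last]


def evaluate_polyline(kalman, vertices):
    """Sample the fitted polyline at every frame between the vertices."""
    path = []
    for a, b in zip(vertices, vertices[1:]):
        for i in range(a, b):
            t = (i - a) / (b - a)
            path.append(kalman[a] + t * (kalman[b] - kalman[a]))
    path.append(kalman[vertices[-1]])
    return path


# Hypothetical pan to the right and back (assumed values, in pixels).
tent = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
vertices = fit_polyline(tent, tent, threshold=0.5)
fitted = evaluate_polyline(tent, vertices)
```

A larger threshold yields fewer segments and a steadier path, at the cost of larger missing areas after alignment.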
5. VIDEO COMPLETION
After aligning the video frames along the stabilized camcorder motion path, there are several missing areas in the new stabilized video. Traditionally, this problem is solved by cutting out the missing areas and scaling the stabilized video to its original size, but this may result in a stabilized video with worse resolution. Hence, to make the resolution of the stabilized video as good as that of the original one, the missing areas are filled from other frames.
To complete the video, we first detect the moving objects to segment the video into a static background region and some dynamic moving object regions (Sec. 5.1). Then, we complete the missing areas by filling dynamic regions (Sec. 5.2) and static regions (Sec. 5.3) respectively. Since the camcorder motion path is fitted by a polyline, the missing areas may be large and need to be filled by pixels from frames far from the current frame, so we provide a three-dimensional Poisson-based smoothing method to smooth the discontinuous stitched areas (Sec. 5.4).
5.1. Moving Object Detection
In order to detect moving objects, we first align every pair of adjacent frames by using the affine transformation obtained in Sec. 4.1. Then, we evaluate the optical flow between them by using an efficient and less noisy optical flow approach [2, 3] to obtain the motion vector of each pixel. The motion vector of pixel $p_i$ can be described as $F_i(p_i)$, which represents the motion flow at pixel $p_i$ from frame $i$ to frame $i+1$, and the length of the motion vector shows the motion value.
Fig. 4: Left: The frame after changing the position according to the stabilized camcorder motion path. Right: The mask of detected moving objects (white regions).
The motion values in the moving object regions are considered to be relatively larger than those in the static background region. Hence, we can get a simple mask showing the regions with large motion values by a simple threshold, as shown in Fig. 4. The dynamic regions are obtained by dilating the mask, which helps to guarantee that the boundary of the moving objects is included in the dynamic regions. If the missing area falls in a region where the neighboring pixels have been masked as the dynamic region, this area is treated as a dynamic region and motion inpainting is used to complete it; otherwise, we recover the area by mosaicking.
5.2. Dynamic Region Completion
For the dynamic missing regions, instead of filling in the color values from other frames directly, we want to fill them up with correct motion vectors. Once we derive the motion vector of each pixel in the missing areas, we can get the pixel color from the next frame according to the motion vectors. The local motion vectors in the known image areas are propagated to the dynamic missing areas as described in [11].
First, local motion vectors are estimated by computing the optical flow between the stabilized frames [2, 3]. The propagation starts at the pixels on the boundary of the dynamic missing areas; the motion vector of each such pixel is calculated as a weighted average of the motion vectors of its neighboring pixels. The process continues until the dynamic missing areas are completely filled with motion vectors. If $p_i$ is a pixel in the
missing area, it will be filled according to its motion vector, which is determined by
$$ F_i(p_i) = \frac{\sum_{q_i \in N_{p_i}} w(p_i, q_i)\, F_i(q_i)}{\sum_{q_i \in N_{p_i}} w(p_i, q_i)}, $$
where $w(p_i, q_i)$ determines the contribution of the motion vector of pixel $q_i$, $N_{p_i}$ denotes the eight neighboring pixels of $p_i$, and $F_i(p_i)$ represents the motion vector at pixel $p_i$ from frame $i$ to frame $i+1$. Suppose the neighboring pixel $q_i \in N_{p_i}$ already has a motion vector; according to its motion vector, we can estimate its position on the next frame as $q_{i+1}$. By using the geometric relationship between the pixels $p_i$ and $q_i$, the position of the pixel $p_{i+1}$ in the frame $i+1$ can also be determined, as illustrated in Fig. 5. Since the pixels in the same object have similar color values and move in the same direction, if the difference between the color values of the pixels $q_{i+1}$ and $p_{i+1}$ is small, they likely belong to the same object, and the weight of the motion vector of pixel $q_i$ is set to be large as
$$ w(p_i, q_i) = \frac{1}{ClrD(p_{i+1}, q_{i+1}) + \varepsilon}, $$
where $\varepsilon$ is a small value for avoiding division by zero and $ClrD(p_{i+1}, q_{i+1})$ is the $l_2$-norm color difference in RGB color space of the pixels $q_{i+1}$ and $p_{i+1}$. This weight term guarantees that the contribution of motion vectors from different objects is small. Fig. 6 shows the result.
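The weighted average for one missing pixel can be sketched in Python as follows. The sketch assumes integer motion vectors and represents the known flow and the next frame as dictionaries keyed by pixel coordinates. These data structures, the default `eps`, and the function names are assumptions made for illustration only.

```python
import math


def color_distance(c1, c2):
    """l2-norm color difference between two RGB triples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def propagate_motion(p, flow, next_frame, eps=1e-3):
    """Estimate the motion vector at missing pixel p from its eight neighbors.

    flow:       {(x, y): (dx, dy)} for neighbors with known motion (assumed integer).
    next_frame: {(x, y): (r, g, b)} colors of frame i+1.
    Returns (dx, dy) as floats, or None if no neighbor can contribute.
    """
    px, py = p
    num_x = num_y = den = 0.0
    for ox in (-1, 0, 1):
        for oy in (-1, 0, 1):
            q = (px + ox, py + oy)
            if q == p or q not in flow:
                continue
            dx, dy = flow[q]
            # q moves to q_next; p keeps the same offset relative to q.
            q_next = (q[0] + dx, q[1] + dy)
            p_next = (q_next[0] - ox, q_next[1] - oy)
            if q_next not in next_frame or p_next not in next_frame:
                continue
            # w = 1 / (ClrD(p_{i+1}, q_{i+1}) + eps)
            w = 1.0 / (color_distance(next_frame[p_next], next_frame[q_next]) + eps)
            num_x += w * dx
            num_y += w * dy
            den += w
    if den == 0.0:
        return None
    return (num_x / den, num_y / den)
```

A neighbor whose predicted position has a similar color to the predicted position of `p` dominates the average, so motion from a different object contributes very little.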
Fig. 5: Suppose the neighboring pixel already has a motion vector; according to its motion vector, we can estimate its position on the next frame as $q_{i+1}$. By using the geometric relationship between the pixels $p_i$ and $q_i$, the position of the pixel $p_{i+1}$ in the frame $i+1$ can also be determined.
Fig. 6: Upper-Left: The frame after changing the position according to the stabilized camcorder motion path. There is a missing area at the left side and a moving object across the missing area. Lower-Left: The result of dynamic region completion. Right Column: The close-up view of the yellow square of the Left Column.
5.3. Static Region Completion
After completing the dynamic regions, we then recover the static ones from the neighboring frames, which are warped to the current frame according to the affine transformation obtained in Sec. 4.1. For the pixel $p_i$ in the static missing area at frame $i$, if its corresponding pixel $p_{i'}$ exists at the warped neighboring frame $i'$, we directly copy that pixel to the missing pixel $p_i$ by $p_i = T_{i'}^{i}(p_{i'})$, where $T_{i'}^{i}$ represents the transformation from frame $i'$ to frame $i$. Fig. 7 illustrates the process and Fig. 8 shows the static region completion result.
Fig. 7: $p_i$ is a pixel in the missing area at the current frame $i$ and $p_{i+1}$ is its corresponding pixel in the next frame $i+1$. The pixel value of $p_{i+1}$ is directly copied to $p_i$ for recovering the missing pixel.
Fig. 8: Left: The frame after changing the position according to the stabilized camcorder motion path. There is a missing area at the right side and upper side. Right: The result of static region completion. Since the missing area is large, there is a discontinuity boundary between the recovered pixels and the original frame.
To find the corresponding pixel $p_{i'}$ of $p_i$, we begin the search from the nearest neighboring frame and propagate the search outward. For example, if $i$ is the current frame we want to recover, we search the frames $i-1$ and $i+1$ first; if there are missing areas that still have not been recovered from these two frames, the following two frames $i-2$ and $i+2$ are used to recover the missing areas. We keep searching until all the missing pixels in the static missing areas are completed. Finally, if there are still some missing areas we cannot recover, we use a simple image inpainting approach to complete them.
5.4. Poisson-Based Smoothing
Although the missing areas caused by the stabilized camcorder motion path are completed, there may be a discontinuous boundary between the recovered pixels and the original frame, since the missing areas may be large and need to be filled from frames far from the current one. This problem may be addressed by simply applying some smoothing approach to the boundary areas; however, such simple smoothing operations can only solve the spatial discontinuity problem. When we play the spatially smoothed video, it still has a temporal discontinuity problem. In order to keep both spatial and temporal continuity, we provide a three-dimensional Poisson-based smoothing method. The Poisson-based smoothing approach is often used in image editing [14], and we extend this approach to video editing.
To solve the discontinuity problem, before filling in a pixel from other frames, the Poisson equation is applied to obtain a smoothed pixel by considering its neighboring pixels in the same frame and in neighboring frames. We first apply the Poisson equation in the spatial domain, which is written as: for all $p \in \Omega$,
$$ |N_p|\, f_p - \sum_{q \in N_p \cap \Omega} f_q = \sum_{q \in N_p \cap \partial\Omega} f_q^{*} + \sum_{q \in N_p} v_{pq}, $$
where $\Omega$ denotes the missing area, $p$ is a pixel in the missing area $\Omega$, $N_p$ denotes the neighboring pixels of pixel $p$, $|N_p|$ is the number of neighboring pixels $N_p$, $f_p$ and $f_q$ are the correct pixel values of pixels $p$ and $q$, which are what we want to derive, $v_{pq}$ determines the divergence of pixels $p$ and $q$, $\partial\Omega$ is the region surrounding the missing area $\Omega$ in the known image areas, and $f_q^{*}$ denotes the known color value of pixel $q$ in $\partial\Omega$.
The Poisson equation can keep the correct structure in the missing area and achieve seamless stitching between the recovered areas and the known image areas. In order to achieve temporal coherence, after recovering the missing areas of each frame, we correct the pixel values of the missing areas by applying the Poisson equation again. For the missing areas, we now consider not only the spatially neighboring pixels but also the temporally neighboring pixels. Hence, the Poisson equation is the same, but $N_p$ includes all neighboring pixels of pixel $p$ in the video volume. Fig. 9 shows the result.
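To make the discrete Poisson equation concrete, the following Python sketch solves it by Gauss-Seidel iteration over voxels indexed by (x, y, t), using the six face neighbors in the video volume. It assumes the guidance term is taken from the donor pixels as $v_{pq} = g_p - g_q$, in the manner of Poisson image editing [14], and that donor values are also available on the boundary. The solver choice, the six-neighborhood, and the names `poisson_smooth`, `omega`, `known`, and `source` are assumptions of this sketch rather than details given in the text.

```python
# Six face neighbors in the (x, y, t) video volume (assumed neighborhood).
OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def poisson_smooth(omega, known, source, iters=200):
    """Solve |N_p| f_p - sum f_q = sum f*_q + sum v_pq on the missing voxels.

    omega:  list of missing voxels (x, y, t).
    known:  {voxel: value} for the boundary voxels with known color f*.
    source: {voxel: value} donor values g on omega and on the boundary,
            used for the guidance term v_pq = g_p - g_q (assumed).
    """
    f = {p: source[p] for p in omega}  # start from the pasted donor values
    for _ in range(iters):
        for p in omega:
            total, count = 0.0, 0
            for dx, dy, dt in OFFSETS:
                q = (p[0] + dx, p[1] + dy, p[2] + dt)
                if q in f:
                    value = f[q]          # unknown neighbor inside omega
                elif q in known:
                    value = known[q]      # known neighbor on the boundary
                else:
                    continue              # outside the video volume
                total += value + (source[p] - source[q])
                count += 1
            if count:
                f[p] = total / count
    return f


# Hypothetical 1-D strip: the donor frame is 40 levels brighter than the target.
omega = [(1, 0, 0), (2, 0, 0)]
known = {(0, 0, 0): 10.0, (3, 0, 0): 16.0}
source = {(0, 0, 0): 50.0, (1, 0, 0): 52.0, (2, 0, 0): 54.0, (3, 0, 0): 56.0}
filled = poisson_smooth(omega, known, source)
```

In this small example the donor's gradient of 2 per pixel is preserved while its brightness offset is removed, so the filled values join the known boundary without a visible seam.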
Fig. 9: Upper-Left: The frame after video completion. Since the missing area is large, there is a discontinuous boundary between the recovered pixels and the original frame. Lower-Left: The result of video completion with Poisson-based smoothing. Right Column: The close-up view of the yellow square of the Left Column.
6. VIDEO DEBLURRING
After video stabilization, the blurry frames which look smooth in the original video become noticeable. Our video deblurring method is fundamentally based on [11], but we separate the moving objects from the static background first and deal with them separately, as in the video completion process. Since the blur of the static background is much more noticeable than that of the moving objects, in the following explanation we focus only on the static background deblurring.
The main idea of this method is to copy the pixels of neighboring sharper frames to the blurry frames. We first evaluate the "relative blurriness" of each frame by calculating its gradient. Generally, the gradient of a blurry image is smaller than that of a sharper one in the same regions. With this assumption, the blurriness of frame $i$ is defined as:
$$ B_i = \sum_{p_i} \left( g_x(p_i)^2 + g_y(p_i)^2 \right), $$
where $p_i$ is a pixel of frame $i$, and $g_x$ and $g_y$ are the gradients in the $x$- and $y$-directions, respectively. We can derive the relative blurriness between the current frame and its neighboring frames by comparing their blurriness values $B_i$ and $B_{i'}$. If the blurriness $B_i$ of the current frame $i$ is smaller than the blurriness $B_{i'}$ of its neighboring frames $i'$, then the frames $i'$ are treated as sharper than frame $i$, and we can use the frames $i'$ to recover the current blurry frame $i$ by transferring the corresponding pixels from these sharper frames $i'$ to the blurry frame $i$ by
$$ \hat{p}_i = \frac{p_i + \sum_{i' \in N_i} w_{ii'}\, T_{i'}^{i}(p_{i'})}{1 + \sum_{i' \in N_i} w_{ii'}}, $$
where $p_i$ and $\hat{p}_i$ are the same pixel in frame $i$ before and after the deblurring operation, $N_i$ denotes the neighboring frames of the current frame $i$, $T_{i'}^{i}$ represents the transformation from the neighboring frame $i' \in N_i$ to frame $i$, and $w_{ii'}$ is a weighting factor between $i'$ and $i$ which is defined as:
$$ w_{ii'} = \begin{cases} 0 & \text{if } B_{i'} / B_i < 1 \\ B_{i'} / B_i & \text{otherwise.} \end{cases} $$
Fig. 10 shows the result of this deblurring method.
Fig. 10: Upper-Left: A blurry frame. Lower-Left: The result of video deblurring. Right Column: The close-up view of the yellow square of the Left Column.
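The blurriness measure and the pixel transfer can be sketched in Python as follows. The sketch approximates $g_x$ and $g_y$ with forward differences on a grayscale image stored as a list of rows, and assumes the neighboring pixel values have already been warped into the current frame. The gradient operator, the requirement that the current frame has nonzero blurriness value, and the function names are assumptions of this sketch.

```python
def blurriness(img):
    """B_i: sum of squared forward-difference gradients of a grayscale image."""
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(h - 1):
        for x in range(w - 1):
            gx = img[y][x + 1] - img[y][x]
            gy = img[y + 1][x] - img[y][x]
            total += gx * gx + gy * gy
    return total


def weight(b_cur, b_nb):
    """w_{ii'}: B_{i'} / B_i if the neighbor is at least as sharp, else 0.

    Assumes b_cur > 0 (a frame with some texture).
    """
    ratio = b_nb / b_cur
    return ratio if ratio >= 1.0 else 0.0


def deblur_pixel(value, b_cur, neighbors):
    """Blend a pixel with its warped counterparts from neighboring frames.

    neighbors: list of (warped_value, b_neighbor) pairs.
    """
    num, den = value, 1.0
    for warped_value, b_nb in neighbors:
        w = weight(b_cur, b_nb)
        num += w * warped_value
        den += w
    return num / den


# Hypothetical 3x3 patches: a blurry frame and a sharper neighbor.
blurry = [[3, 5, 3], [5, 3, 5], [3, 5, 3]]
sharp = [[0, 8, 0], [8, 0, 8], [0, 8, 0]]
b_blurry, b_sharp = blurriness(blurry), blurriness(sharp)
restored = deblur_pixel(5.0, b_blurry, [(8.0, b_sharp)])
```

Because the weight is zero whenever the neighbor is blurrier than the current frame, a sharp frame is left unchanged by its blurry neighbors.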
7. RESULT
All of the videos used in this paper were captured by using a hand-held video camera without using a tripod, and the resolution of the videos is 720×480. The resolution of all stabilized videos is the same as that of the input videos. Fig. 2, Fig. 11, and Fig. 12 show our results. In Fig. 2, the user wants to use the hand-held camcorder to capture a panorama view. Without a tripod, the captured video is shaky due to hand shake. In Fig. 11, the user wants to use the hand-held camcorder to capture a man walking with his child. Without a tripod, the captured video is shaky due to hand shake. The bottom row of Fig. 11 shows our result, which is as stable as if it had been captured with a tripod. In Fig. 12, the user wants to use the hand-held camcorder to capture a man playing with his dog, but due to the view angle limitation, the user pans the camcorder a little to capture the whole scene. The bottom row of Fig. 12 shows our result, and the stabilized camcorder motion path is just like that of a scene captured with a tripod.
8. CONCLUSION AND FUTURE WORK
A full-frame video stabilization approach is proposed in this paper to obtain a stabilized video. Since we use a polyline to fit the original camcorder motion path, the stabilized motion path is much more stable than those of other smoothing approaches. Hence, in the stabilized video, not only the high-frequency shaky motions but also the low-frequency unexpected movements are removed. Although using a polyline to estimate the camcorder motion path may cause large missing areas, this is handled by filling them from other frames and applying a three-dimensional Poisson-based smoothing method. To fill the missing areas from other frames and deal with blurry frames, we separate the moving objects from the static background and deal with them separately in the completion and deblurring processes.
ACKNOWLEDGEMENT
This paper was partially supported by the National Science Council of Taiwan under NSC95-2622-E-002-018 and also by the Excellent Research Projects of National Taiwan University under 95R0062-AE00-02.
REFERENCES
[1] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester, “Image Inpainting,” Proc. ACM SIGGRAPH 2000, pp. 417-424, 2000.
[2] M. J. Black and P. Anandan, “A Framework for the Robust Estimation of Optical Flow,” Proc. IEEE ICCV 1993, pp. 231-236, 1993.
[3] M. J. Black and P. Anandan, “The Robust Estimation of Multiple Motions: Parametric and Piecewise-Smooth Flow Fields,” CVIU, Vol. 63, No. 1, pp. 75-104, 1996.
[4] C. Buehler, M. Bosse, and L. McMillan, “Non-Metric Image-Based Rendering for Video Stabilization,” Proc. IEEE CVPR 2001, Vol. 2, pp. 609-614, 2001.
[5] A. Criminisi, P. Perez, and K. Toyama, “Object Removal by Exemplar-Based Inpainting,” Proc. IEEE CVPR 2003, Vol. 2, pp. 721-728, 2003.
[6] M. A. Fischler and R. C. Bolles, “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography,” CACM, Vol. 24, No. 6, pp. 381-395, 1981.
[7] J. Jia, T.-P. Wu, Y.-W. Tai, and C.-K. Tang, “Video Repairing: Inference of Foreground and Background under Severe Occlusion,” Proc. IEEE CVPR 2004, Vol. 1, pp. 364-371, 2004.
[8] A. Levin, A. Zomet, and Y. Weiss, “Learning How to Inpaint from Global Image Statistics,” Proc. IEEE ICCV 2003, Vol. 1, pp. 305-312, 2003.
[9] A. Litvin, J. Konrad, and W. C. Karl, “Probabilistic Video Stabilization using Kalman Filtering and Mosaicking,” Proc. SPIE EI 2003, Vol. 5022, pp. 663-674, 2003.
[10] D. G. Lowe, “Object Recognition from Local Scale-Invariant Features,” Proc. IEEE ICCV 1999, pp. 1150-1157, 1999.
[11] Y. Matsushita, E. Ofek, X. Tang, and H.-Y. Shum, “Full-Frame Video Stabilization,” Proc. IEEE CVPR 2005, Vol. 1, pp. 50-57, 2005.
[12] Z. Pan and C.-W. Ngo, “Structuring Home Video by Snippet Detection and Pattern Parsing,” Proc. ACM MIR 2004, pp. 69-76, 2004.
[13] K. A. Patwardhan, G. Sapiro, and M. Bertalmio, “Video Inpainting under Constrained Camera Motion,” IEEE TIP, Vol. 16, No. 2, pp. 545-553, 2007.
[14] P. Pérez, M. Gangnet, and A. Blake, “Poisson Image Editing,” Proc. ACM SIGGRAPH 2003, pp. 313-318, 2003.
[15] T. Shiratori, Y. Matsushita, X. Tang, and S. B. Kang, “Video Completion by Motion Field Transfer,” Proc. IEEE CVPR 2006, Vol. 1, pp. 411-418, 2006.
[16] Y. Wexler, E. Shechtman, and M. Irani, “Space-Time Video Completion,” Proc. IEEE CVPR 2004, Vol. 1, pp. 120-127, 2004.
Fig. 11: Top row: Five frames of the original video. Middle row: Stabilized frames. Black areas show the missing areas of the frames. Bottom row: Our result.
Fig. 12: Top row: Five frames of the original video. Middle row: Stabilized frames. Black areas show the missing areas of the frames. Bottom row: Our result.