
Scene Warping: Layer-based Stereoscopic Image Resizing

Ken-Yi Lee Cheng-Da Chung Yung-Yu Chuang

National Taiwan University

Email: kez@cmlab.csie.ntu.edu.tw, johnny751125@cmlab.csie.ntu.edu.tw, cyy@csie.ntu.edu.tw

Abstract

This paper proposes scene warping, a layer-based stereoscopic image resizing method using image warping.

The proposed method decomposes the input stereoscopic image pair into layers according to depth and color information. A quad mesh is placed onto each layer to guide the image warping for resizing. The warped layers are composited by their depth orders to synthesize the resized stereoscopic image. We formulate an energy function to guide the warping of each layer so that the composited image avoids distortions and holes, maintains good stereoscopic properties, and contains as many important pixels as possible in the reduced image space. The proposed method offers the advantages of fewer discontinuity artifacts, less-distorted objects, correct depth ordering, and enhanced stereoscopic quality. Experiments show that our method compares favorably with existing methods.

1. Introduction

Image resizing (image retargeting), adapting images for displays with different sizes and aspect ratios, has received considerable attention recently due to the diversity of displays.

Traditional scaling and cropping methods easily cause significant distortions or information loss. Content-aware image retargeting methods take into account the saliency distribution of the image and attempt to keep the salient features uncontaminated by hiding distortion within the less noticeable areas. As another trend, stereoscopic and autostereoscopic displays have recently been deployed in theaters, televisions, computer screens, and even mobile devices. Due to the diversity of resolutions and aspect ratios among stereoscopic displays, stereoscopic images, like 2D images, need to be retargeted to display properly on stereoscopic displays with different specifications.

Recognizing the importance of stereoscopic image retargeting, several stereoscopic image resizing methods have been proposed in the spirit of their 2D counterparts.

This work was partly supported by grants NSC100-2628-E-002-009 and NSC100-2622-E-002-016-CC2.

Basha et al. extended seam carving to stereoscopic image resizing [2]. Although this method produces geometrically consistent results, as a descendant of seam carving, its discrete nature may cause noticeable discontinuity on structural objects. Chang et al. extended warping-based approaches to stereoscopic image resizing [3]. Although their results contain fewer discontinuity artifacts on structural objects, the stereoscopic quality of the resized images could be reduced because their method models the whole image as a rubber sheet and cannot create proper occlusions and depth discontinuities, which are important cues for human depth perception. As a result, depth edges could become less prominent after resizing, reducing stereoscopic quality.

Our method is inspired by scene carving for 2D image resizing [10], which decomposes an image into layers and applies seam carving to synthesize scene-consistent retargeting results. It can create proper occlusions and depth discontinuities because of its layered nature. However, since it adopts seam carving, scene carving suffers from the same artifacts of discontinuous structured objects. Our method can be seen as a hybrid of scene carving and warping-based methods [3]. Like scene carving, we decompose the input stereoscopic image pair into a set of layers. Since our input is a stereoscopic image, we can take advantage of the disparity map to create layers more easily than scene carving. We adopt a warping-based approach: each layer is warped by its own mesh deformation, and the warped layers are composited together to form the resized image. We formulate an energy function to guide the image warping of each layer so that the composited resized image has the following properties: (1) it avoids distortions and holes as much as possible; (2) it maintains good stereoscopic properties; and (3) it contains as many important pixels as possible in the reduced image space.

Compared to existing stereoscopic image resizing methods, the proposed method offers the following advantages.

(1) It avoids the artifacts of discontinuous structured objects commonly encountered by discrete resizing methods. (2) It shares with scene carving the advantages that objects are protected (i.e., not distorted) and their depth orders are correctly maintained. (3) Thanks to its layered nature, it better preserves depth edges and creates proper occlusions, both of which enhance stereoscopic quality. (4) It applies different deformations to different layers; thus, it has a better chance of hiding distortions in unimportant areas while keeping important areas uncontaminated.

2. Related work

Image retargeting. Shamir and Sorkine categorized content-aware image retargeting into two main classes: discrete approaches and continuous approaches [16]. Seam carving [1] is a well-known discrete method, which removes one seam with the lowest importance from an image at a time. A seam is a connected path crossing the image from top to bottom or from left to right. Seam carving has been improved by many others [14, 15]. Mansfield et al. proposed scene carving [10] to generalize seam carving. With a user-provided relative depth map, the image is decomposed into layers. Seams are removed from the background and foreground objects are re-arranged spatially. Their algorithm has the advantages that objects are protected (i.e., not distorted) and their depth orders are correctly maintained. Warping-based methods, also called continuous approaches, place a quad mesh onto the image and deform the mesh to guide image warps for resizing [17, 18]. Wolf et al. obtained warping functions by a global optimization that squeezes or stretches homogeneous regions to minimize the resulting distortions [17]. Wang et al. [18] proposed to assign spatially varying scaling factors by optimization. They also designed an energy term to preserve the edge orientations of the mesh for important areas.

Stereoscopic image retargeting. Basha et al. [2] extended seam carving to stereoscopic image retargeting. Their method simultaneously carves a pair of seams, one for each view. By defining occluding and occluded pixels, they guarantee that the removed seam pairs are geometrically consistent. Nevertheless, their method suffers from the limitations of seam carving and might cause obvious artifacts on structured objects, especially when the aspect ratio changes drastically. Chang et al. proposed a content-aware display adaptation method which simultaneously resizes a stereoscopic image to the target resolution and adapts its depth to the comfort zone of the display while preserving the perceived shapes of prominent objects [3]. Our method is similar to theirs in that both use image warping for stereoscopic image resizing.

3. Method

Figure 1. The disparity map and the object segmentation map. Given an input stereoscopic image pair ((a) left view, (b) right view), we compute its disparity map (c) and its object segmentation map (d). Each color in (d) represents an object layer. Only the maps for the left view are shown here.

Given a stereoscopic image pair $\{I^L, I^R\}$ whose dimensions are $w \times h$, the goal of stereoscopic image resizing is to change their dimensions to a desired size $\hat{w} \times \hat{h}$. We first compute a disparity map between the two views using the semi-global stereo matching algorithm [5]. Inspired by scene carving [10], we decompose the images into multiple object layers. Each pixel in the input image pair is assigned to one object layer based on the computed disparity map. Corresponding pixels between views should be assigned to the same object layer. Object layers can be obtained automatically or semi-automatically [9] by utilizing color and disparity. In our current implementation, we used a GrabCut system [13] to segment the stereoscopic images with user hints. Pixels are assigned to the background layer if they are not explicitly assigned to any object layer. Through this process, we obtain a set of object layers (including the background layer) $S = \{s_1^L, s_1^R, s_2^L, s_2^R, \ldots, s_N^L, s_N^R\}$, in which the $l$-th object has two corresponding object layers, $s_l^L$ and $s_l^R$, for the left and right views, respectively. We assume that the object layers in $S$ are sorted by their average depths, and that $s_1^L$ and $s_1^R$ are the background layers. We also define a $w \times h$ object segmentation map $O^k$ ($k \in \{L, R\}$), in which $O^k(x, y) = l$ if the pixel $(x, y)$ of $I^k$ belongs to object layer $s_l^k$. Given a stereoscopic image pair (Figure 1(a) and (b)), we compute the disparity map (Figure 1(c)) and obtain the object segmentation map (Figure 1(d)).
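To make the layer bookkeeping concrete, the following is a minimal Python/NumPy sketch of assembling the object segmentation map $O^k$ from per-layer binary masks (e.g., produced by the GrabCut step above). The function name and mask format are illustrative assumptions; the convention that unassigned pixels default to the background label 1 follows the description above.

```python
import numpy as np

def build_segmentation_map(layer_masks):
    """Assemble the object segmentation map O^k from per-layer masks.

    layer_masks: list of (h, w) boolean masks, one per object layer,
                 first entry = background, sorted by average depth
                 (farthest first). Pixels not covered by any foreground
                 mask keep the background label 1, as described above.
    """
    h, w = layer_masks[0].shape
    O = np.ones((h, w), dtype=np.int32)           # label 1 = background
    for label, mask in enumerate(layer_masks[1:], start=2):
        O[mask] = label                           # later (nearer) layers win
    return O
```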

As in most continuous methods, we place a quad mesh onto each object layer and compute a new geometry for each mesh to deform the associated object layer. Each object layer $s_l^k$ is associated with a quad mesh of fixed quad size ($20 \times 20$ in all experiments). Let $V_l^k = \{v_{i,j}^{k,l}\}$ be the vertex set of the quad mesh for $s_l^k$, where $v_{i,j}^{k,l}$ denotes the position of the vertex at the $i$-th column and $j$-th row of the mesh. Let $\hat{V}_l^k = \{\hat{v}_{i,j}^{k,l}\}$ denote the vertex set of the deformed quad mesh. The goal of stereoscopic image resizing is to find the optimal vertex positions $\hat{v}_{i,j}^{k,l}$ of these deformed quad meshes with respect to some energy function.


3.1. Multi-layer image compositing

Before elaborating on how to obtain the optimal vertex positions, we describe the process of compositing the resized image, assuming that the optimal vertex positions have been obtained. To render the resized stereoscopic image pair $\{\hat{I}^L, \hat{I}^R\}$ with the desired size $\hat{w} \times \hat{h}$, each object layer $s_l^k$ is first warped by the associated quad mesh $\hat{V}_l^k$ to obtain the warped object layer $\hat{s}_l^k$. Next, the warped object layers $\{\hat{s}_l^k \mid 1 \leq l \leq N\}$ belonging to the same view $k$ are composited together to obtain $\hat{I}^k$ according to their depth orders. Since the layers are sorted by depth, we can use the painter's algorithm to composite the final retargeted image $\hat{I}^k$ by rendering in the order $\hat{s}_1^k, \hat{s}_2^k, \ldots, \hat{s}_N^k$. Figure 2 shows the compositing process.
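A minimal sketch of this back-to-front compositing, assuming each warped layer has already been rasterized into the target frame as a color image plus a boolean coverage mask; the function and argument names are illustrative, not from the paper.

```python
import numpy as np

def composite_layers(warped_layers, warped_masks, h_hat, w_hat):
    """Painter's algorithm: composite warped layers back to front.

    warped_layers: list of (h_hat, w_hat, 3) images, ordered from the
                   background layer to the nearest layer N.
    warped_masks:  list of (h_hat, w_hat) boolean coverage masks.
    Returns the composited view and a coverage map (False marks a hole).
    """
    image = np.zeros((h_hat, w_hat, 3), dtype=np.float32)
    covered = np.zeros((h_hat, w_hat), dtype=bool)
    for layer, mask in zip(warped_layers, warped_masks):
        image[mask] = layer[mask]      # nearer layers overwrite farther ones
        covered |= mask
    return image, covered
```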

3.2. Problem formulation

The goal of stereoscopic image resizing is to find a stereoscopic image pair $\{\hat{I}^L, \hat{I}^R\}$ with the desired size which (1) has fewer distortions and holes; (2) maintains good stereoscopic properties; and (3) contains as many important pixels as possible. We formulate stereoscopic image resizing as an optimization problem: find the set of optimal vertex positions of the deformed quad meshes, $\hat{V} = \{\hat{V}_l^k \mid k \in \{L, R\},\ 1 \leq l \leq N\}$, which minimizes the following objective function:

$E(\hat{V}) = E_Q(\hat{V}) + \lambda_S E_S(\hat{V}) + \lambda_I E_I(\hat{V}),$   (1)

where $E_Q$ is the image quality energy, $E_S$ is the stereoscopic quality energy, and $E_I$ is the importance energy. These energy terms correspond to the above three requirements, respectively.

To avoid folding artifacts and heavily distorted quads, the following constraints are applied to all deformed mesh vertex positions $\hat{v}_{i,j}^{k,l} = (x_{i,j}, y_{i,j})$:

$x_{i,j} < \min(x_{i+1,j-1}, x_{i+1,j}, x_{i+1,j+1}),$
$x_{i,j} > \max(x_{i-1,j-1}, x_{i-1,j}, x_{i-1,j+1}),$
$y_{i,j} < \min(y_{i-1,j+1}, y_{i,j+1}, y_{i+1,j+1}),$
$y_{i,j} > \max(y_{i-1,j-1}, y_{i,j-1}, y_{i+1,j-1}),$   (2)

where $i$ and $j$ index columns and rows, respectively. These are hard constraints and are strictly enforced in our iterative optimization procedure (Section 4).
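The constraints of Eq. (2) can be checked per candidate vertex position during the local search of Section 4. Below is a sketch, assuming the deformed vertex coordinates are stored in two 2-D arrays indexed by column and row; boundary vertices would need separate handling.

```python
def satisfies_fold_constraints(X, Y, i, j, x_new, y_new):
    """Check the hard constraints of Eq. (2) for moving vertex (i, j)
    to the candidate position (x_new, y_new).

    X, Y: 2-D arrays of the current deformed vertex coordinates,
          indexed as [i, j] with i = column and j = row; callers must
          handle boundary vertices separately.
    """
    ok_x = (x_new < min(X[i+1, j-1], X[i+1, j], X[i+1, j+1]) and
            x_new > max(X[i-1, j-1], X[i-1, j], X[i-1, j+1]))
    ok_y = (y_new < min(Y[i-1, j+1], Y[i, j+1], Y[i+1, j+1]) and
            y_new > max(Y[i-1, j-1], Y[i, j-1], Y[i+1, j-1]))
    return ok_x and ok_y
```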

3.3. Image quality energy

We evaluate image quality from two aspects: image distortion and image incompleteness. The first measures how the layers (images) are distorted by the mesh deformation, and the second counts how many pixels are left uncovered (holes) in the final composited image. Thus, we define the image quality energy $E_Q$ as

$E_Q(\hat{V}) = E_F(\hat{V}) + \lambda_C E_C(\hat{V}),$   (3)

Figure 2. The compositing process. In this example, the width is reduced by 40%. To save space, two object layers are not displayed. (Shown: the left view $I^L$, its object segmentation map $O^L$, and object layers $s_1^L$ through $s_6^L$.)

where $E_F$ is the image distortion energy and $E_C$ is the image incompleteness energy. The total image distortion energy is the sum of the quad distortion energy terms over all layers and all views:

$E_F(\hat{V}) = \sum_{\hat{V}_l^k \in \hat{V}} \sum_{\hat{q} \in \hat{V}_l^k} W_Q^k(q)\,(E_R(\hat{q}) + \lambda_E E_E(\hat{q}) + \lambda_O E_O(\hat{q})),$   (4)


where $\hat{q} = (\hat{v}_{i,j}^{k,l}, \hat{v}_{i,j+1}^{k,l}, \hat{v}_{i+1,j+1}^{k,l}, \hat{v}_{i+1,j}^{k,l})$ represents a quad in a warped mesh; $E_R$ is the similarity energy; $E_E$ is the size energy; $E_O$ is the line bending energy; $W_Q^k$ is an image saliency map, which can be computed from $I^k$ using any saliency detection algorithm; and $W_Q^k(q)$ is the average saliency value of the quad $q$ in the original mesh.

Figure 3. Shape deformation measurement.

For the similarity energy $E_R$, we encourage each quad to undergo a similarity transformation [4] and use a quadratic energy term to measure how far the deformation of a quad is from a similarity transformation [7]. More specifically, as shown in Figure 3, by picking any three vertices of a quad in the counter-clockwise direction, taking $v_0$, $v_1$, and $v_2$ as an example, we can define $v_0$ by $v_1$ and $v_2$ as

$v_0 = v_1 + R_{90}\,\overrightarrow{v_1 v_2}, \quad R_{90} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$   (5)

After deformation, given the vertex positions $\hat{v}_1$ and $\hat{v}_2$, we can obtain the expected position of $v_0$ after deformation as

$\tilde{v}_0 = \hat{v}_1 + R_{90}\,\overrightarrow{\hat{v}_1 \hat{v}_2}.$   (6)

Ideally, if the quad undergoes a similarity transformation, the expected position $\tilde{v}_0$ should be identical to $\hat{v}_0$, the position of $v_0$ after deformation. Thus, $E_R(\hat{q})$ is calculated by summing $(\tilde{v}_0 - \hat{v}_0)^2$ over all combinations of three vertices in a quad $\hat{q}$.
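A sketch of $E_R$ for a single quad, directly following Eqs. (5)-(6): each vertex is predicted from the next two vertices in counter-clockwise order and the squared prediction errors are accumulated. The (4, 2) array layout is an implementation choice; the vertex ordering follows the definition of $\hat{q}$ given above, in image coordinates with $y$ pointing down.

```python
import numpy as np

# 90-degree rotation matrix of Eq. (5), in image coordinates (y down)
R90 = np.array([[0.0, 1.0],
                [-1.0, 0.0]])

def similarity_energy(quad_hat):
    """E_R for one deformed quad (Eqs. (5)-(6)).

    quad_hat: (4, 2) array of deformed vertex positions, listed
    counter-clockwise as (v_{i,j}, v_{i,j+1}, v_{i+1,j+1}, v_{i+1,j}).
    Evaluates to 0 when the quad is a similarity transform of the
    original axis-aligned square quad.
    """
    e = 0.0
    for a in range(4):
        b, c = (a + 1) % 4, (a + 2) % 4          # the next two CCW vertices
        v0_expected = quad_hat[b] + R90 @ (quad_hat[c] - quad_hat[b])
        e += float(np.sum((v0_expected - quad_hat[a]) ** 2))
    return e
```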

Inspired by Wang et al. [18] and Niu et al. [12], we also maintain the sizes and orientations of salient regions. To maintain the original quad size, the energy term $E_E$ measures the edge length differences. Another energy term, $E_O$, maintains the orientation by measuring the degree of line bending:

$E_E(\hat{q}) = (x_{i+1,j} - x_{i,j} - S)^2 + (y_{i,j+1} - y_{i,j} - S)^2 + (x_{i+1,j+1} - x_{i,j+1} - S)^2 + (y_{i+1,j+1} - y_{i+1,j} - S)^2,$   (7)

$E_O(\hat{q}) = (x_{i,j+1} - x_{i,j})^2 + (y_{i+1,j} - y_{i,j})^2 + (x_{i+1,j+1} - x_{i+1,j})^2 + (y_{i+1,j+1} - y_{i,j+1})^2,$   (8)

where $(x_{i,j}, y_{i,j})$ is the position of the vertex $\hat{v}_{i,j}^{k,l}$, and $S$ is the width of the original quad.
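A sketch of the size and line-bending terms of Eqs. (7)-(8) for one deformed quad, using the same vertex ordering as above; `S` is the original quad width (20 pixels in the paper's experiments).

```python
def size_and_orientation_energy(quad_hat, S=20.0):
    """E_E (Eq. 7) penalizes edge-length changes; E_O (Eq. 8) penalizes
    bending of the originally axis-aligned grid lines.

    quad_hat: deformed quad vertices ((x, y) pairs) in the order
              (v_{i,j}, v_{i,j+1}, v_{i+1,j+1}, v_{i+1,j}).
    S:        width of the original (square) quad.
    """
    (x00, y00), (x01, y01), (x11, y11), (x10, y10) = quad_hat
    e_size = ((x10 - x00 - S) ** 2 + (y01 - y00 - S) ** 2 +
              (x11 - x01 - S) ** 2 + (y11 - y10 - S) ** 2)
    e_orient = ((x01 - x00) ** 2 + (y10 - y00) ** 2 +
                (x11 - x10) ** 2 + (y11 - y01) ** 2)
    return e_size, e_orient
```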

The image incompleteness term $E_C$ measures the incompleteness of the final resized stereoscopic image. Holes can exist in the final composited image if some pixels are not covered by any resized object layer. We would like to reduce holes (uncovered pixels) as much as possible for better visual quality. Thus, the image incompleteness term $E_C$ is defined as the number of uncovered pixels in the resized images $\{\hat{I}^L, \hat{I}^R\}$. To count the number of uncovered pixels, we obtain the resized object segmentation map by the following steps. First, for each object layer $s_l^k$, we define a $w \times h$ mask $M_l^k$, where

$M_l^k(x, y) = \begin{cases} l & \text{if } O^k(x, y) = l, \\ 0 & \text{otherwise.} \end{cases}$   (9)

Then we warp each $M_l^k$ by $\hat{V}_l^k$ and denote the warped mask as $\hat{M}_l^k$. With our multi-layer image compositing method (Section 3.1), we composite all the warped masks $\{\hat{M}_l^k \mid 1 \leq l \leq N\}$ to form a resized object segmentation map $\hat{O}^k$ of size $\hat{w} \times \hat{h}$. Figure 2 shows an example of the resized object segmentation map $\hat{O}^L$. With $\hat{O}^k$, $E_C$ is defined as

$E_C(\hat{V}) = Z(\hat{O}^L) + Z(\hat{O}^R),$   (10)

where the operator $Z(\cdot)$ counts the number of zero-valued pixels in the input image.
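$E_C$ then reduces to counting the zero-labeled pixels of the two resized segmentation maps, as in this sketch (the maps are assumed to be integer label arrays with 0 marking uncovered pixels):

```python
import numpy as np

def incompleteness_energy(O_hat_L, O_hat_R):
    """E_C (Eq. 10): number of uncovered pixels (label 0) in the
    resized object segmentation maps of the left and right views."""
    return int(np.count_nonzero(O_hat_L == 0) +
               np.count_nonzero(O_hat_R == 0))
```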

3.4. Stereoscopic quality energy

In order to maintain good stereoscopic properties, we use two criteria. The first is to preserve the original disparity values as much as possible, and the second is to ensure that there is no vertical offset between corresponding points across views. From the disparity map, we obtain a set of corresponding points $F = \{(p_i^L, p_i^R)\}$, in which $p_i^L$ and $p_i^R$ are a pair of corresponding points between the left and right views. After the mesh deformation defined by $\hat{V}$, we have a set of warped corresponding points $\hat{F} = \{(\hat{p}_i^L, \hat{p}_i^R)\}$. To preserve good stereoscopic quality, we require that (1) the disparity between the warped corresponding points $\hat{p}_i^L$ and $\hat{p}_i^R$ be the same as the original disparity between $p_i^L$ and $p_i^R$, and (2) their vertical offset be zero:

$E_S(\hat{V}) = \sum_{(\hat{p}_i^L, \hat{p}_i^R) \in \hat{F}} W_S(p_i^L)\,(E_D(\hat{p}_i^L, \hat{p}_i^R) + \lambda_V E_V(\hat{p}_i^L, \hat{p}_i^R)),$   (11)

where $E_D$ measures disparity consistency and $E_V$ ensures zero vertical drift:

$E_D(\hat{p}_i^L, \hat{p}_i^R) = ((\hat{p}_i^R[x] - \hat{p}_i^L[x]) - (p_i^R[x] - p_i^L[x]))^2,$   (12)

$E_V(\hat{p}_i^L, \hat{p}_i^R) = (\hat{p}_i^R[y] - \hat{p}_i^L[y])^2,$   (13)

where the operator $[x]$ extracts the x-component of the input 2-D vector, $[y]$ extracts the y-component, and $W_S$ is a stereoscopic saliency map that encourages more salient regions to have better stereoscopic quality. We discuss the stereoscopic saliency map in Section 4.
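A vectorized sketch of Eqs. (11)-(13), assuming the correspondences and their warped positions are stacked into (n, 2) arrays of (x, y) coordinates; `lam_V` plays the role of $\lambda_V$, and the argument names are illustrative.

```python
import numpy as np

def stereoscopic_energy(P_L, P_R, P_L_hat, P_R_hat, W_S, lam_V=1e3):
    """E_S over a set of correspondences (Eqs. (11)-(13)).

    P_L, P_R:         (n, 2) original corresponding points (x, y).
    P_L_hat, P_R_hat: (n, 2) corresponding points after mesh deformation.
    W_S:              (n,) stereoscopic saliency weight per correspondence.
    """
    d_orig = P_R[:, 0] - P_L[:, 0]              # original disparities
    d_new = P_R_hat[:, 0] - P_L_hat[:, 0]       # disparities after warping
    e_D = (d_new - d_orig) ** 2                 # disparity consistency, Eq. (12)
    e_V = (P_R_hat[:, 1] - P_L_hat[:, 1]) ** 2  # vertical drift, Eq. (13)
    return float(np.sum(W_S * (e_D + lam_V * e_V)))
```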


3.5. Importance energy

With only the image quality term and the stereoscopic quality term, the optimal solution would be cropping, as cropping perfectly preserves the stereoscopic constraints and introduces neither distortions nor holes. Cropping, however, is not a preferred solution as it could remove important content. In addition to cropping, layer occlusions could also cause important content loss, as some pixels could become occluded and not shown in the resized image. We would like to reduce content loss as much as possible. For example, content loss due to layer occlusions can be reduced if the layers are repositioned so that they occlude less important areas instead of important ones. We add the importance energy term to ensure that the resized image keeps as much important content as possible.

We assume that each object layer has an importance map $W_I^{k,l}$. There are many ways to obtain such importance maps. For example, objects in the front are often more important than the ones in the back, so we could use a layer's depth order as its importance. It can also be provided by the users or set to be the same as the image's saliency map $W_Q^k$. To measure the importance loss $E_I$, we first obtain the visibility map [11], which describes whether a pixel in the original image is visible in the resized image. A pixel usually becomes invisible due to occlusions. The importance loss $E_I$ can then be determined by summing the importance values of all unseen pixels. To determine whether a pixel $(x, y)$ of $I^k$ is visible in the resized image, the pixel is warped from $(x, y)$ to $(\hat{x}, \hat{y})$ by $\hat{V}_{O^k(x,y)}^k$, and its visibility is determined as

$A^k(x, y) = \begin{cases} 1 & \text{if } 1 \leq \hat{x} \leq \hat{w},\ 1 \leq \hat{y} \leq \hat{h} \text{ and } \hat{O}^k(\hat{x}, \hat{y}) = O^k(x, y), \\ 0 & \text{otherwise.} \end{cases}$   (14)

With the visibility map $A^k$, $E_I$ is defined as

$E_I(\hat{V}) = \sum_{k \in \{L,R\}} \sum_{(x,y) \in I^k} (1 - A^k(x, y)) \times W_I^k(x, y).$   (15)
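A sketch of the visibility map of Eq. (14) and the resulting importance loss of Eq. (15) for one view, assuming the warped position of every source pixel has been precomputed from its layer's deformed mesh; it uses 0-based pixel indices and nearest-pixel rounding, which are implementation choices.

```python
import numpy as np

def importance_energy_one_view(xy_hat, O, O_hat, W_I):
    """Visibility map (Eq. 14) and importance loss (Eq. 15) for one view.

    xy_hat: (h, w, 2) warped (x, y) position of every source pixel,
            computed from the deformed mesh of the pixel's own layer.
    O:      (h, w) object segmentation map of the source view.
    O_hat:  (h_hat, w_hat) resized object segmentation map.
    W_I:    (h, w) per-pixel importance map.
    """
    h_hat, w_hat = O_hat.shape
    x = np.rint(xy_hat[..., 0]).astype(int)
    y = np.rint(xy_hat[..., 1]).astype(int)
    inside = (x >= 0) & (x < w_hat) & (y >= 0) & (y < h_hat)
    visible = np.zeros(O.shape, dtype=bool)
    # A pixel stays visible if it lands inside the target frame and is
    # not occluded there by a different layer (Eq. 14, 0-based indices).
    visible[inside] = O_hat[y[inside], x[inside]] == O[inside]
    return float(np.sum(W_I[~visible]))   # Eq. (15); sum over both views
```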

4. Implementation details

Iterative optimization. The major challenge in optimizing $E(\hat{V})$ is the calculation of the terms $E_I$ and $E_C$. They cannot be parameterized and can only be evaluated by counting pixels or summing importance values in the composited images. Thus, for optimization, we adopt a coarse-to-fine strategy and iteratively update one mesh vertex position at a time by searching for the local minimum within a small neighborhood of the current solution.

To find the $\hat{V}$ which minimizes $E$, the input images are first scaled down to the coarsest level. We use five levels with a scaling factor of 2 for all results. At the coarsest level, we take uniform scaling as the initial guess. After obtaining the optimal solution at a coarser level, the resulting $\hat{V}$ is scaled up to the next finer level and used as the initial guess for that level. At each level, we first fix $\hat{V}^R$ and update $\hat{V}^L$; then we fix $\hat{V}^L$ and update $\hat{V}^R$. We alternately update between views for $T_1$ iterations. When updating $\hat{V}^k$, all $N$ object layers $\hat{V}_l^k \in \hat{V}^k$ are optimized one by one while fixing all other layers. This process is repeated for $T_2$ iterations. In the current implementation, both $T_1$ and $T_2$ are 4.

To find the optimal $\hat{V}_l^k$ when fixing all other object layers, we iterate through every vertex $\hat{v}_{i,j}^{k,l} \in \hat{V}_l^k$ and evaluate $E$ in a small neighborhood around the current solution $\hat{v}_{i,j}^{k,l}$ using a local search. More specifically, we take a uniform grid of samples in the small neighborhood of the current solution and search for the minimum of $E$ over these samples. The grid of samples can be written as $\hat{v}_{i,j}^{k,l} + (\delta x, \delta y)$, where $\hat{v}_{i,j}^{k,l}$ is the current solution and $(\delta x, \delta y) \in \{(t_x P, t_y P) \mid t_x, t_y \in \mathbb{Z},\ -K \leq t_x, t_y \leq K\}$. That is, we take $(2K+1) \times (2K+1)$ samples with a sampling interval of $P$ in each dimension. In our implementation, we used $K = 6$ and $P = 0.25$. We evaluate $E$ at these samples and update $\hat{v}_{i,j}^{k,l}$ to the sample $\hat{v}_{i,j}^{k,l} + (\delta x, \delta y)$ with the minimum energy. Although we have to evaluate $E$ at each sample, only one vertex is updated at a time, so the updates of $E_I$ and $E_C$ are local and can be implemented efficiently using incremental calculation.
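A sketch of the per-vertex local grid search, with `energy_fn` standing in for an (assumed) callable that evaluates the total energy $E$ for a candidate position of the current vertex, returning infinity when the hard constraints of Eq. (2) are violated.

```python
def local_search_update(v, energy_fn, K=6, P=0.25):
    """One vertex update of the iterative optimization (Section 4).

    v:         current (x, y) position of the vertex being optimized.
    energy_fn: assumed callable returning the total energy E for a
               candidate position of this vertex (all other vertices
               fixed), and +inf if the candidate violates Eq. (2).
    """
    best_v, best_e = v, energy_fn(v)
    for tx in range(-K, K + 1):
        for ty in range(-K, K + 1):
            cand = (v[0] + tx * P, v[1] + ty * P)
            e = energy_fn(cand)
            if e < best_e:
                best_v, best_e = cand, e
    return best_v
```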

Importance maps. There are three types of weighting maps in the energy, accounting for image quality importance ($W_Q$), stereo quality importance ($W_S$), and content importance ($W_I$). A reasonable choice for $W_Q$ and $W_S$ is the image's saliency map. $W_I$ can be supplied by the users or by the saliency map. In practice, we found that foreground objects usually have higher importance. Thus, the object segmentation map or the estimated disparity map is also a reasonable choice for $W_I$. This observation is consistent with the stereoscopic saliency map used by Lang et al. [8]. In the current implementation, we use the same map for $W_I^k$, $W_Q^k$, and $W_S^k$. The map is defined as

$W^k(x, y) = \begin{cases} 1 & \text{if } O^k(x, y) > 1, \\ 0.01 & \text{otherwise.} \end{cases}$   (16)
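For reference, a sketch of Eq. (16) over the object segmentation map (assuming an integer label array with background label 1):

```python
import numpy as np

def weight_map(O):
    """Eq. (16): weight 1 for pixels on a foreground layer (label > 1),
    0.01 for pixels on the background layer."""
    return np.where(O > 1, 1.0, 0.01)
```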

5. Results

We used the following datasets to test the proposed method: Aloe from the Middlebury stereo dataset [6], and People, Snowman, and Man from Flickr¹. Two methods were compared: a seam carving based approach (ICCV'11) [2] and a warping based approach (TMM) [3]. For all results in the paper, our method took around five minutes. The process could potentially be sped up using parallel processing with GPUs or multithreading.

There are a few adjustable parameters in our method, such as $\lambda_S$, $\lambda_I$, $\lambda_C$, and $\lambda_V$. In all results, $\lambda_C$ is set to a large value of $10^4$ because image incompleteness is not preferred in most cases. $\lambda_V$ controls the degree of vertical offset between the two views, which is crucial for stereoscopic vision, and is also set to a large value of $10^3$. As for $\lambda_S$ and $\lambda_I$, we leave them as control options. We start with small values and adjust them intuitively depending on what we prefer: a larger $\lambda_S$ for better stereoscopic quality, and a larger $\lambda_I$ for better preservation of important content.

¹Downloaded from the website http://www.eng.tau.ac.il/ talib/Data SC.html

Figure 4. Man dataset. Reducing width by 17%. (Columns, left to right: original, TMM, ICCV'11, ours; rows, top to bottom: left view, right view, estimated disparity map.)

Figure 5. People dataset. Reducing width by 17%. (Columns, left to right: original, TMM, ICCV'11, ours; rows, top to bottom: left view, right view, estimated disparity map.)

A good retargeted stereoscopic result has fewer distortions and less information loss in the left and right views, while preserving the original disparity values and depth discontinuities (sharp edges in the disparity map). Based on these criteria, we can compare our method with other approaches. Figure 4 compares our method with ICCV'11 and TMM on Man. Note that the white lines in our results remain straight while they become jagged in ICCV'11; this is a common artifact inherited from discrete methods. TMM distorts the shape of the man, making the head smaller and the lower body fatter. This is because the warping-based approach uses a single mesh for the whole image, so the warping has to trade off between the requirements of the foreground and the background. Our result preserves the shape of the man and the lines in the background. Note that, like scene carving, our method could change the relative positions of objects and rearrange their spatial relationships. Nevertheless, it still yields a geometrically consistent interpretation of the scene, as shown in the disparity map estimated from the resultant left and right images. In Figure 5, again, TMM distorts the shapes of the people, and ICCV'11 shows similar discontinuity artifacts; for example, the boat behind the car appears broken in the ICCV'11 result.

Figure 6. Aloe dataset. Reducing width by 20%. (Columns, left to right: original, ICCV'11, ours; rows, top to bottom: left view, right view, estimated disparity map.)

Figure 6 compares our method to ICCV'11 on Aloe. The discrete nature of ICCV'11 creates a very strange shape for the foreground plant and pot. Our method preserves the shapes much better. Figures 7 and 8 compare our method with TMM on Snowman with two different size changes. For a moderate size change (17% in Figure 7), TMM produces reasonable results. However, for a more aggressive size change (40% in Figure 8), TMM introduces shape distortion. In addition, for this case, the disparity range of the TMM result is highly compressed and the stereoscopic quality is greatly reduced.

As defined by Basha et al. [2], to maintain geometric consistency, the resized image should have a disparity map similar to the original one. The sharp edges in the disparity map are especially important for a good stereo experience. In general, our method produces a better disparity map than TMM in terms of geometric consistency. Compared to ICCV'11, our results show less structural discontinuity in appearance. When viewed in 3D on stereoscopic displays, TMM exhibits "rubber sheet" artifacts in depth and worse 3D perception since it does not handle occlusions well, while the structural discontinuity artifacts of ICCV'11 are often visually annoying in 3D.

Figure 7. Snowman dataset. Reducing width by 17%. (Columns, left to right: original, TMM, ours; rows, top to bottom: left view, right view, estimated disparity map.)

Figure 8. Snowman dataset. Reducing width by 40%. (Columns, left to right: original, TMM, ours; rows, top to bottom: left view, right view, estimated disparity map.)


Figure 9. Visual contributions of $E_Q$, $E_I$ and $E_S$. (Columns, left to right: without $E_Q$, without $E_I$, without $E_S$; rows, top to bottom: left view, right view, estimated disparity map.)

There are three major terms, $E_Q$, $E_S$ and $E_I$, in our formulation (Equation 1). If $E_Q$ is removed, the image could be distorted and contain holes in order to preserve more important content ($E_I$) and maintain better stereo correspondences ($E_S$). If $E_S$ is removed, the stereo correspondences might not be maintained well: the perceived depths could be distorted, or, even worse, the viewer may not be able to fuse the images into a 3D percept. If $E_I$ is removed, the method tends to crop the images to the target size, as cropping perfectly preserves image and stereo quality; this, however, might remove important content. Figure 9 evaluates the visual contributions of these terms. The results show that the content is more distorted if $E_Q$ is removed, the content is cropped if $E_I$ is removed, and the estimated disparity map can deviate greatly from the original disparity map if $E_S$ is removed.

6. Conclusions

This paper proposes scene warping, a layer-based stereoscopic image resizing method using image warping. The input stereoscopic image is decomposed into layers. Each layer is warped by its own mesh deformation, and the warped layers are composited together to form the resized images. The energy function of scene warping encourages the resized image to have fewer distortions and holes, good stereoscopic properties, and important content. Compared to existing methods, scene warping offers the advantages of fewer discontinuity artifacts, less-distorted objects, correct depth ordering, and enhanced stereoscopic quality. Our method suffers from the common limitations shared by warping-based methods: when parts of the input image are crowded with important objects, our method could crop or occlude important objects. In the future, we would like to explore methods to relax this restriction.

References

[1] S. Avidan and A. Shamir. Seam carving for content-aware image resizing. ACM Trans. Graph., 26(3):10, 2007.

[2] T. Basha, Y. Moses, and S. Avidan. Geometrically consistent stereo seam carving. In Proceedings of ICCV, 2011.

[3] C.-H. Chang, C.-K. Liang, and Y.-Y. Chuang. Content-aware display adaptation and interactive editing for stereoscopic images. IEEE Transactions on Multimedia, 13(4):589–601, August 2011.

[4] R. Gal, O. Sorkine, and D. Cohen-Or. Feature-aware texturing. In Proceedings of Eurographics Symposium on Rendering, pages 297–303, 2006.

[5] H. Hirschmüller. Accurate and efficient stereo processing by semi-global matching and mutual information. In Proceedings of CVPR, pages 807–814, 2005.

[6] H. Hirschmüller and D. Scharstein. Evaluation of cost functions for stereo matching. In Proceedings of CVPR, 2007.

[7] T. Igarashi, T. Moscovich, and J. F. Hughes. As-rigid-as-possible shape manipulation. ACM Trans. Graph., 24(3):1134–1141, 2005.

[8] M. Lang, A. Hornung, O. Wang, S. Poulakos, A. Smolic, and M. Gross. Nonlinear disparity mapping for stereoscopic 3D. ACM Trans. Graph., 29(4):75, 2010.

[9] W.-Y. Lo, J. van Baar, C. Knaus, M. Zwicker, and M. Gross. Stereoscopic 3D copy & paste. ACM Trans. Graph., 29(6):147, 2010.

[10] A. Mansfield, P. Gehler, L. Van Gool, and C. Rother. Scene carving: Scene consistent image retargeting. In Proceedings of ECCV, 2010.

[11] A. Mansfield, P. Gehler, L. Van Gool, and C. Rother. Visibility maps for improving seam carving. In Media Retargeting Workshop, European Conference on Computer Vision (ECCV), 2010.

[12] Y. Niu, F. Liu, X. Li, and M. Gleicher. Warp propagation for video resizing. In Proceedings of CVPR, pages 537–544, 2010.

[13] C. Rother, V. Kolmogorov, and A. Blake. GrabCut: Interactive foreground extraction using iterated graph cuts. ACM Trans. Graph., 23(3):309–314, 2004.

[14] M. Rubinstein, A. Shamir, and S. Avidan. Improved seam carving for video retargeting. ACM Trans. Graph., 27(3):16, 2008.

[15] M. Rubinstein, A. Shamir, and S. Avidan. Multi-operator media retargeting. ACM Trans. Graph., 28(3):23, 2009.

[16] A. Shamir and O. Sorkine. Visual media retargeting. In ACM SIGGRAPH Asia 2009 Course Notes, 2009.

[17] L. Wolf, M. Guttmann, and D. Cohen-Or. Non-homogeneous content-driven video retargeting. In Proceedings of ICCV, 2007.

[18] Y.-S. Wang, C.-L. Tai, O. Sorkine, and T.-Y. Lee. Optimized scale-and-stretch for image resizing. ACM Trans. Graph., 27(5):118, 2008.
