Related work - Depth Reconstruction of Folded Surfaces in Single-View Video

This section has three parts. First, we introduce geometry-based 3D construction tools such as Blender3D and Maya. Then we review image-based reconstruction techniques: structured light, multi-view reconstruction, photometric stereo, and shape-from-shading (SFS). Finally, we describe some applications of reconstruction.

2.1 Geometry-based Method

The first approach is interactive modeling with manual assistance, using tools such as Blender3D and Maya. With these tools we can build a model by hand, but creating a photorealistic result is labor-intensive. Several intelligent or hybrid modeling systems have been proposed to reduce manual intervention. [Hengel et al. 2007] proposed building realistic 3D models from video using point clouds, with a small number of simple 2D sketches as constraints.

[Debevec et al. 1996] provided another method to construct 3D models from video. Users needed only to draw some structure lines, and the system then built 3D buildings and mapped the textures onto them.

Fig2. P. Debevec et al.'s 3D reconstruction from video.

2.2 Active Structured-light System

By contrast, the second kind of approach, which uses an active structured-light system, is faster and more convenient. It is also the mainstream approach to high-accuracy 3D recovery. Early 3D scanning techniques were suitable only for static objects and required long scanning times. [S. Rusinkiewicz et al. 2002] developed a system based on structured light and a real-time variant of ICP (iterative closest points) to align the shapes acquired from multiple views, which had a significant impact on rapid 3D recovery. [L. Zhang et al. 2004] used a consistent space-time stereo technique to enhance the reliability of the acquired 3D data. With a structured-light system we can estimate the shape of the target object precisely, but several defects limit its usability: the target object must be nearly Lambertian, and the approach is not suitable for very large objects or outdoor scenes.
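To make the alignment step concrete, the following is a minimal sketch of basic point-to-point ICP. It is not the real-time variant of Rusinkiewicz et al.; it is simplified to 2D point sets so that the rigid fit has a closed form, and the function names `best_rigid_2d`, `apply_rigid`, and `icp_2d` are hypothetical names chosen for illustration.

```python
import math


def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src points onto dst."""
    n = len(src)
    csx = sum(p[0] for p in src) / n
    csy = sum(p[1] for p in src) / n
    cdx = sum(q[0] for q in dst) / n
    cdy = sum(q[1] for q in dst) / n
    s_cos = 0.0
    s_sin = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - csx, py - csy  # centered source point
        bx, by = qx - cdx, qy - cdy  # centered target point
        s_cos += ax * bx + ay * by   # accumulated dot products
        s_sin += ax * by - ay * bx   # accumulated 2D cross products
    theta = math.atan2(s_sin, s_cos)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated source centroid onto the target centroid.
    tx = cdx - (c * csx - s * csy)
    ty = cdy - (s * csx + c * csy)
    return theta, tx, ty


def apply_rigid(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(src, dst, iterations=20):
    """Alternate nearest-neighbor matching and rigid fitting for a fixed number of rounds."""
    cur = list(src)
    for _ in range(iterations):
        matched = [
            min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in cur
        ]
        theta, tx, ty = best_rigid_2d(cur, matched)
        cur = apply_rigid(cur, theta, tx, ty)
    return cur


# Usage: align a slightly rotated and shifted copy back onto the original shape.
target = [(0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 5.0)]
source = apply_rigid(target, 0.05, 0.1, -0.1)
aligned = icp_2d(source, target)
```

Like any basic ICP, this sketch relies on a good initial guess: nearest-neighbor matching is only reliable when the two scans are already roughly aligned, which is consistent with aligning successive views captured at a high frame rate.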

Fig3. L. Zhang et al.'s facial detail reconstruction system and depth data

2.3 Multiple views

Multiple-view techniques play an important role in reconstruction. With good calibration and correspondence matching, we can reconstruct the surface of a scene. However, finding reliable correspondences in untextured or highly repetitive regions is costly, and occlusion of correspondences is also a problem. Even so, the technique is still widely used as a constraint for other techniques or for coarse shape recovery.

[Vogiatzis et al. 2005] proposed a novel technique that combined multi-view stereo with graph-cut-based optimization for detailed surface reconstruction. They used the visual hull as the initial shape, and then encoded a continuous photo-consistency function in a flow graph whose minimum cut yields the detailed surface.

Structure from motion uses the same technique but with only a single camera, and it is suitable for moving rigid objects or static objects. Because it requires no calibration, it is better suited to common video sequences.

[Pollefeys et al. 2006] used corner detection to find feature points, and then found correspondences using epipolar geometric properties. The affine transformations between the multiple views were thereby acquired.
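The epipolar constraint used to validate candidate correspondences can be sketched as follows. This is a minimal illustration, not Pollefeys et al.'s pipeline: it assumes calibrated (normalized) homogeneous image coordinates and a known relative camera motion X2 = R X1 + t, and the helper names are hypothetical.

```python
def cross_matrix(t):
    """Skew-symmetric matrix [t]_x, so that [t]_x v equals the cross product t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def essential_matrix(rotation, translation):
    """E = [t]_x R for the assumed motion X2 = R X1 + t."""
    return matmul3(cross_matrix(translation), rotation)


def epipolar_residual(E, x1, x2):
    """Return x2^T E x1, which is zero (up to noise) for a geometrically valid match."""
    e_x1 = [sum(E[i][j] * x1[j] for j in range(3)) for i in range(3)]
    return sum(x2[i] * e_x1[i] for i in range(3))


def is_consistent(E, x1, x2, tol=1e-6):
    """Accept a candidate match only if it satisfies the epipolar constraint."""
    return abs(epipolar_residual(E, x1, x2)) < tol


# Usage: a second camera shifted one unit along x, with no rotation.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
E = essential_matrix(identity, (1.0, 0.0, 0.0))
x1 = (0.125, 0.05, 1.0)          # projection of (0.5, 0.2, 4.0) in camera 1
x2_good = (0.375, 0.05, 1.0)     # projection of the same 3D point in camera 2
x2_bad = (0.375, 0.2, 1.0)       # a mismatched point off the epipolar line
print(is_consistent(E, x1, x2_good), is_consistent(E, x1, x2_bad))
```

In an uncalibrated setting the same test would use a fundamental matrix estimated from the matches themselves, typically inside a robust loop; that estimation step is outside the scope of this sketch.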

Fig4. Pollefeys et al.'s reconstructed model and the viewpoints

If there are only a few correspondences, only sparse 3D points can be estimated. [M. Lhuillier et al. 2005] proposed an approach to generate quasi-dense 3D points on the surface from fewer feature points. They produced a dense disparity map and used it to improve the number and quality of the feature correspondences matched by the correlation method. Moreover, they proposed a fast gauge-free algorithm to estimate the accuracy of the recovered 3D depth.

For non-rigid bodies, [Torresani et al. 2003] proposed a method combined with structure from motion to recover the target shape from video. They modeled a non-rigid body as a rigid transformation combined with a non-rigid deformation in each time frame. Under the assumption that the object shape at each time frame is drawn from a Gaussian distribution, they simultaneously estimated the 3D shape in each time frame, learned the parameters of the Gaussian, and recovered the missing data points. Finally, they applied a space-time constraint to the object shape for a more consistent result.
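As a rough illustration of this kind of model, the shape in frame t can be written as a mean shape plus a Gaussian-distributed deformation, observed through a per-frame rigid motion. The notation below is an assumption made here for exposition and is not taken from the cited paper.

```latex
\begin{aligned}
\mathbf{S}_t &= \bar{\mathbf{S}} + \sum_{k=1}^{K} z_{t,k}\,\mathbf{V}_k,
\qquad \mathbf{z}_t \sim \mathcal{N}(\mathbf{0}, \mathbf{I}), \\
\mathbf{P}_t &= \mathbf{R}_t\,\mathbf{S}_t + \mathbf{T}_t + \mathbf{N}_t .
\end{aligned}
```

Here \bar{S} is the mean 3D shape, V_k are deformation basis shapes weighted by the Gaussian coefficients z_t, R_t and T_t are the rigid motion (and projection) of frame t, N_t is measurement noise, and P_t holds the observed 2D tracks. Estimating the shapes, the motion, and the Gaussian parameters jointly is what allows missing track points to be filled in from the learned distribution.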

2.4 Photometric stereo

Photometric stereo estimates local surface orientation by using several images of a surface taken from the same viewpoint but under illumination from different directions.
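For a Lambertian surface (an assumption added here, since the definition above does not require it), the intensity of a pixel under light direction l_i is I_i = albedo * (l_i . n). Three images with known, non-coplanar light directions therefore determine the albedo-scaled normal through a 3x3 linear system. A minimal single-pixel sketch, with hypothetical function names:

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix stored as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def solve3(a, b):
    """Solve the 3x3 system a x = b with Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions are (nearly) coplanar")
    solution = []
    for k in range(3):
        # Replace column k of a with b and take the determinant ratio.
        a_k = [[b[i] if j == k else a[i][j] for j in range(3)] for i in range(3)]
        solution.append(det3(a_k) / d)
    return solution


def normal_and_albedo(lights, intensities):
    """Recover the unit normal and albedo of one pixel from three lit observations."""
    g = solve3(lights, intensities)  # g = albedo * normal
    albedo = math.sqrt(sum(v * v for v in g))
    if albedo == 0.0:
        raise ValueError("pixel is black under all lights")
    return [v / albedo for v in g], albedo


# Usage: synthesize three observations of a known normal, then recover it.
lights = [[0.0, 0.0, 1.0], [0.6, 0.0, 0.8], [0.0, 0.6, 0.8]]
true_normal = [0.36, 0.48, 0.8]
observed = [0.5 * sum(l[k] * true_normal[k] for k in range(3)) for l in lights]
normal, albedo = normal_and_albedo(lights, observed)
```

With more than three lights the same system is usually solved in the least-squares sense, and shadowed or specular pixels need to be excluded; neither refinement is shown here.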

Fig5. The left two images are the surface reconstructed by M. Seitz et al.'s method. The right four images are the reference and target object used for

[M. Seitz et al. 2004] proposed an example-based photometric stereo method. They introduced the concept of orientation consistency to reconstruct surface normals from reference images, in which reference objects of the same material were also captured. Combined with traditional photometric stereo, a more detailed surface can be recovered. The technique can be applied reliably to a broader class of objects than previous photometric stereo techniques.

[Carlos et al. 2008] used the silhouettes in multiple views to recover camera motion and then obtained a coarse shape of the object from the visual hull. In addition, they proposed a robust technique to estimate light directions and introduced a novel formulation to combine photometric stereo with 3D points from the visual hull.

2.5 Shape-from-shading (SFS)

Shape-from-shading recovers shape from the gradual variation of shading in a single image. However, it has several limitations. For example, it is sensitive to intensity noise, and it is restricted to simple lighting conditions.
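The shading model that SFS tries to invert can be sketched as below. The sketch assumes a Lambertian surface, a single distant light, and orthographic viewing along the z axis; these assumptions and the helper name `shade` are added here for illustration. It also shows one source of ambiguity: under frontal lighting, a bump and the corresponding dent render to the same image.

```python
import math


def shade(height, light=(0.0, 0.0, 1.0), albedo=1.0):
    """Render Lambertian intensities for the interior pixels of a height map."""
    rows, cols = len(height), len(height[0])
    image = []
    for y in range(1, rows - 1):
        row = []
        for x in range(1, cols - 1):
            # Central differences approximate the surface gradient (p, q).
            p = (height[y][x + 1] - height[y][x - 1]) / 2.0
            q = (height[y + 1][x] - height[y - 1][x]) / 2.0
            norm = math.sqrt(p * p + q * q + 1.0)
            normal = (-p / norm, -q / norm, 1.0 / norm)
            n_dot_l = sum(a * b for a, b in zip(normal, light))
            row.append(albedo * max(0.0, n_dot_l))  # clamp self-shadowed pixels
        image.append(row)
    return image


# Usage: a Gaussian bump and its mirror-image dent.
bump = [[math.exp(-((x - 3) ** 2 + (y - 3) ** 2) / 4.0) for x in range(7)]
        for y in range(7)]
dent = [[-h for h in row] for row in bump]

frontal_bump = shade(bump)                       # light along the viewing axis
frontal_dent = shade(dent)
oblique_bump = shade(bump, light=(0.6, 0.0, 0.8))
oblique_dent = shade(dent, light=(0.6, 0.0, 0.8))
```

With the frontal light the two renderings coincide, so the image alone cannot tell the two shapes apart; with the oblique light they differ. This is one reason SFS methods rely on extra constraints such as known lighting, smoothness, or user input.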

By its principle, SFS works only for objects of a single material. Most importantly, SFS can recover only continuous surfaces, so it cannot deal with folding. Even so, its single-view requirement is a benefit for image-based modeling: only a single shot is needed, and no correspondence matching is required, unlike multiple-view techniques.

Because SFS is intrinsically ill-posed, [Zeng et al. 2005] proposed a user-assisted solution for continuous surfaces. Users input surface normals at specific feature points, and the system propagates the surface variations to the whole face. This method applied a Fast Marching Method to speed up the computation. After optimizing an energy function that combines the local surfaces, it obtains a global solution for both synthetic and real-world data.

[Tai-Pang et al. 2008] extended the above method. To handle bias in the light direction, they reformulated SFS and produced good initial normals for a large region, leaving the most noticeable errors mainly in the smooth parts. They also developed an easy-to-use 2D user interface to edit and correct the normal map.

Fig6. Interactive Shape-from-shading

Fig7. Tai-Pang et al.'s reconstructed surface

2.6 Others

[Fang et al. 2006] combined shape-from-shading and texture synthesis to re-texture the target object in photographs and video. They used optical flow to maintain the texture coordinates in each frame. However, this approach is error-prone because it assumes a Lambertian surface and simple lighting conditions. It was suitable only for simple objects, such as T-shirts or sculptures, and needed manual rectification. Furthermore, it can recover only normal vectors.

Fig8. Fang et al.'s method pastes an image on a surface in video.

[Lin et al. 2004] analyzed the geometry type of textures. They viewed near-regular textures as statistical departures from a regular texture along different dimensions, and so used shape-from-texture to recover geometry. This method worked mainly on surfaces with regular or near-regular texture, and it also recovered only normal vectors. These two methods do not really recover surface geometry, but they produce realistic results through texture synthesis. This motivates our observation that we may not need to recover the exact depth map, only relative depths for view changes.

[Mathieu et al. 2009] provided another optimization method to recover a 3D mesh with inequality constraints. It was also combined with Principal Component Analysis (PCA) so that it can handle folds and individual images. Nevertheless, this method needed an initial mesh on the surface and cannot deal with self-occlusion.

Fig10. Mathieu et al.'s method recovers 3D mesh in video.

Fig9. Lin et al. perform texture replacement on an outdoor photo.
