Surface recovery techniques - Construction and Evaluation of an Active Vision-Based 3-D Scanner

Chapter 1 Introduction

1.1 Surface recovery techniques

The construction of 3-d representations of objects in the digital world has become possible through a variety of shape recovery techniques. These disciplines solve the ranging problem from different points of view and under various constraints. In the taxonomy of these techniques, the vision-based approaches are further categorized into passive vision and active vision. A passive vision approach derives range data from one or multiple images that are sampled under “natural” conditions, whereas an active approach involves the emission of controlled radiation. A common source of radiation is visible light.

The vision-based techniques are also characterized by their temporal and spatial configurations. For example, a classical binocular stereo vision system uses a pair of cameras to capture a depth image, while a similar approach uses one moving camera to achieve the same goal. Table 1.1 lists some related techniques and applications with respect to different camera configurations.

In the following subsections, some well-known vision-based approaches are reviewed.

| | One sensor | Many sensors |
|---|---|---|
| Single view | Shape from shading, single view reconstruction | Binocular/trinocular stereo, range data fusion |
| Multiple views | Shape from contour (or silhouette), photometric stereo, shape from rotation, shape/structure from motion, optical flow | Dynamic stereo, space-time stereo, real-time distancing, the .enpeda. project [1] |

Table 1.1: Different vision-based configurations and related techniques or applications

1.1.1 Shape from shading (SFS)

The attempt to derive depth information from a single photograph can be traced back to the early development of computer vision. The problem of recovering depth values from a gray-scale image was first discussed in Horn's work in the early 1970s [2]. Although the statement of the problem is simple, finding a unique solution is difficult [3]. Shape from shading is itself an ill-posed problem, which means it can only be solved with further regularization: some shading or geometrical assumptions have to be made to eliminate the many possible solutions. An interesting topic closely related to SFS is the 3-d reconstruction of paintings (e.g. [4]).
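The ambiguity can be illustrated with a small numerical sketch. It assumes a Lambertian surface of unit albedo lit from the viewing direction, and the function name `lambertian_intensity` is hypothetical; it is not taken from the cited works. A single brightness value then constrains only the slope magnitude of the surface patch, so different orientations produce the same pixel intensity.

```python
import math

def lambertian_intensity(p, q, light=(0.0, 0.0, 1.0), albedo=1.0):
    """Brightness of a Lambertian patch with surface gradient (p, q).

    The unnormalised surface normal is (-p, -q, 1); `light` is assumed to be
    a unit vector pointing from the surface toward the light source.
    """
    norm = math.sqrt(p * p + q * q + 1.0)
    normal = (-p / norm, -q / norm, 1.0 / norm)
    shading = sum(n * l for n, l in zip(normal, light))
    return albedo * max(0.0, shading)

# Three different orientations that share the slope magnitude p^2 + q^2 = 1.
candidates = [(1.0, 0.0), (0.0, 1.0), (-math.sqrt(0.5), math.sqrt(0.5))]
intensities = [lambertian_intensity(p, q) for p, q in candidates]
print(intensities)  # the three values agree up to rounding
```

One intensity per pixel against two unknowns (p, q) per pixel is what makes additional assumptions necessary.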

1.1.2 Photometric stereo (PSM)

The photometric stereo method, also known as shape from multiple light sources, uses more than one light to overcome the ill-posed problem encountered by SFS. It was proposed by R.J. Woodham in the 1980s [5]. Assuming the surface obeys a Lambertian reflectance model, its gradient field can be numerically recovered from three light sources (and from two sources with an ambiguity). Integration techniques (e.g. [6][7]) then convert the recovered discrete vector field into a depth map that is known only up to a constant, which denotes the absolute distance between the centre of the camera and the measured surface.
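The three-light case can be sketched for a single pixel as follows. This is a minimal illustration under stated assumptions rather than the formulation of [5]: the light directions are known unit vectors that are not coplanar, the pixel is not in shadow, and all helper names (`det3`, `solve3`, `photometric_stereo_pixel`, `unit`) are hypothetical. Each measured intensity is modelled as albedo times the dot product of light direction and unit normal, which gives a 3x3 linear system.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(a, b):
    """Solve the 3x3 system a @ x = b by Cramer's rule."""
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("light directions must not be coplanar")
    return [det3([[b[r] if c == col else a[r][c] for c in range(3)]
                  for r in range(3)]) / d
            for col in range(3)]

def photometric_stereo_pixel(lights, intensities):
    """Recover albedo, unit normal and gradient (p, q) at one pixel."""
    g = solve3(lights, intensities)          # g = albedo * unit normal
    albedo = math.sqrt(sum(v * v for v in g))
    normal = [v / albedo for v in g]
    p = -normal[0] / normal[2]               # dz/dx
    q = -normal[1] / normal[2]               # dz/dy
    return albedo, normal, (p, q)

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

# Synthetic pixel: gradient (0.2, -0.1), albedo 0.8, three assumed lights.
lights = [unit(v) for v in [(0, 0, 1), (1, 0, 1), (0, 1, 1)]]
true_normal = unit((-0.2, 0.1, 1.0))
intensities = [0.8 * sum(l * n for l, n in zip(light, true_normal))
               for light in lights]
albedo, normal, (p, q) = photometric_stereo_pixel(lights, intensities)
print(albedo, p, q)
```

The recovered (p, q) values form the gradient field; integrating them over the image yields the depth map up to the unknown constant.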

1.1.3 Shape from contours (SFC)

Shape from contours (or silhouettes) recovers a model of an object from its 2-d contours viewed from different directions [8]. It is perhaps the most straightforward method among passive vision approaches. A common implementation of SFC places the object on a computer-controlled turntable. For each viewing angle, a frame is taken and processed to extract the object's contours, which are then used by a carving algorithm to reconstruct a volumetric model of the object. Although the accuracy of the reconstruction may be limited by cavities on the object, SFC is an efficient way to acquire a rough model for fast prototyping.
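The carving step can be sketched on a small voxel grid. The example below is only an illustration under simplifying assumptions: an orthographic camera, a turntable rotating about the vertical axis, and silhouettes synthesised from a known sphere instead of being extracted from real frames. The names `project`, `silhouette` and `carve` are hypothetical.

```python
import math

def project(voxel, angle):
    """Orthographic projection of a voxel centre after turntable rotation.

    The turntable rotates about the z axis; the camera looks along the
    rotated y axis, so the pixel is (rotated x, height z).
    """
    x, y, z = voxel
    u = x * math.cos(angle) - y * math.sin(angle)
    return (round(u), z)

def silhouette(voxels, angle):
    """Set of pixels covered by the object at one turntable angle."""
    return {project(v, angle) for v in voxels}

def carve(grid, silhouettes):
    """Keep only voxels whose projection lies inside every silhouette."""
    return {v for v in grid
            if all(project(v, a) in s for a, s in silhouettes.items())}

R = 6
grid = {(x, y, z)
        for x in range(-R, R + 1)
        for y in range(-R, R + 1)
        for z in range(-R, R + 1)}
obj = {v for v in grid if sum(c * c for c in v) <= 16}   # sphere, radius 4
angles = [k * math.pi / 6 for k in range(6)]             # 0 to 150 degrees
sils = {a: silhouette(obj, a) for a in angles}
hull = carve(grid, sils)
print(len(grid), len(obj), len(hull))
```

The carved result always contains the true object but may be larger, since concavities never appear in any silhouette; this is the limitation due to cavities mentioned above.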

1.1.4 Shape/structure from motion (SFM)

The shape and structure of a rigid body are supposed to be consistent across photos taken from arbitrary views [9]. It is possible to identify such invariants from 2-d images even if the positional relationships between viewpoints are not known in advance. The reconstruction of a scene from the images of a moving camera is known as structure from motion. The term “motion” is defined relatively: the problem is equivalent to computing the structure of a moving object with the camera remaining still. Finding shape from motion is closely related to stereo vision systems. The simplest configuration of SFM is to place one camera in two different locations, in which case it becomes a temporal stereo pair with unknown extrinsic parameters. The SFM and binocular methods present similar mathematical problems, which have been extensively studied in terms of multiple view geometry. Common techniques involved in computing structure from motion include correspondence analysis (described later), optical flow analysis, camera calibration, triangulation, and bundle adjustment.

1.1.5 Binocular stereo vision

Using a pair of cameras looking toward the same target to estimate distances is known as stereoscopic vision or binocular vision. Mimicking the human visual system, it extracts depth information from two conjugate images. The stereo vision approach is one of the most classical passive vision methods. The accuracy of a stereoscopic scanner relies on a good solution to the correspondence problem, which aims to find correspondences between pixels in the two images. Correspondence analysis relates to the extraction of view-invariant features and to pattern matching techniques. The existence of features on the analyzed surface is crucial for passive stereo vision systems.
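A minimal sketch of correspondence search followed by triangulation is given below. It assumes a rectified image pair, so that matching reduces to a search along one scanline, and uses a sum-of-absolute-differences (SAD) window as the matching cost. The function names, the synthetic scanlines, the focal length and the baseline are all illustrative assumptions.

```python
def match_scanline(left, right, x, half_window=2, max_disparity=8):
    """Return the disparity d minimising the SAD between windows centred
    at left[x] and right[x - d]."""
    best_d, best_cost = 0, float("inf")
    for d in range(max_disparity + 1):
        if x - d - half_window < 0:
            break                      # window would leave the image
        cost = sum(abs(left[x + k] - right[x - d + k])
                   for k in range(-half_window, half_window + 1))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d

def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth of a point for a rectified pair: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

# Textured synthetic scanline; the left view is the right view shifted by 3.
right = [(i * 37) % 101 for i in range(20)]
true_disparity = 3
left = [0] * true_disparity + right[:-true_disparity]

d = match_scanline(left, right, x=10)
z = depth_from_disparity(d, focal_px=800.0, baseline_m=0.1)
print(d, z)
```

On a scanline of constant intensity every candidate disparity has the same cost, so the match is arbitrary; this is why surface features matter for passive stereo.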

1.1.6 Structured light

By replacing one camera in a stereo vision system with a light projector, the configuration becomes an active stereo pair that solves the correspondence problem in a straightforward way. Such a system is called a structured light system, and it is an active vision method that involves the use of light patterns. Two commonly adopted light sources are laser diodes and video projectors [10]. Measurements based on structured light depend neither on surface features nor on particular reflectance models. Owing to this robustness, the structured light method has been applied in numerous areas such as digital archiving [11], biometric surveillance [12], reverse engineering [13], vision inspection [14], robot navigation and space science [15]. Many commercial structured light scanners can be found in the metrology industry (e.g. InSpeck 3D Mega Capturor [16] and HDI 3D Scanners [17]). More details on structured light will be given in Section 1.2.
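One way to see why the projector simplifies correspondence is the case of a single projected light plane, where depth follows from intersecting a camera ray with that plane. The sketch below assumes a pinhole camera at the origin and a light plane whose equation n · X = c is known from calibration. The intrinsics, the plane parameters and the function names are hypothetical values chosen for illustration, not those of any particular scanner.

```python
def pixel_ray(u, v, focal_px, cx, cy):
    """Direction of the camera ray through pixel (u, v), with z = 1."""
    return ((u - cx) / focal_px, (v - cy) / focal_px, 1.0)

def intersect_light_plane(ray, plane_normal, plane_offset):
    """Intersect the ray t * ray (t > 0) with the plane n . X = offset."""
    denom = sum(n * r for n, r in zip(plane_normal, ray))
    if abs(denom) < 1e-12:
        raise ValueError("camera ray is parallel to the light plane")
    t = plane_offset / denom
    return tuple(t * r for r in ray)

# A lit pixel observed at (400, 208) in a 640x480 image (assumed intrinsics).
ray = pixel_ray(400.0, 208.0, focal_px=800.0, cx=320.0, cy=240.0)
point = intersect_light_plane(ray, plane_normal=(1.0, 0.0, 0.3),
                              plane_offset=0.2)
print(point)  # 3-d point in camera coordinates, in the units of the offset
```

Because the plane is known in advance, every lit pixel yields a 3-d point without searching for a matching pixel in a second image.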

1.1.7 Modulated light

Modulated light is a surface measuring method similar to structured light. In the literature it is occasionally considered a kind of structured light technique. Modulated light scanners differ from a typical structured light system in two major respects: the use of rapidly modulated illumination and of a high frame rate camera. The intensity of the projected light is modulated over a short time according to a predefined function, generally a sinusoid. By analyzing the phase shift of the reflected light, the round-trip time is estimated. A scanner based on modulated light is usually capable of real-time ranging due to its fast operational rate (e.g. [18]).
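The relation between phase shift and distance can be sketched as follows. The example assumes sinusoidal modulation at a single frequency and a four-sample phase estimate, which is one common scheme and not necessarily the one used in [18]. The sampling convention, the 20 MHz frequency and the function names are assumptions made for illustration.

```python
import math

C = 299_792_458.0  # speed of light in m/s

def phase_from_samples(c0, c1, c2, c3):
    """Phase shift from four samples of the returned signal, assuming
    c_k = offset + amplitude * cos(phase - k * pi / 2)."""
    return math.atan2(c1 - c3, c0 - c2) % (2.0 * math.pi)

def distance_from_phase(phase, mod_freq_hz):
    """Distance for a round trip: d = c * phase / (4 * pi * f)."""
    return C * phase / (4.0 * math.pi * mod_freq_hz)

def unambiguous_range(mod_freq_hz):
    """Largest distance before the phase wraps around: c / (2 * f)."""
    return C / (2.0 * mod_freq_hz)

mod_freq = 20e6                      # assumed modulation frequency
true_distance = 3.0                  # metres
true_phase = 4.0 * math.pi * mod_freq * true_distance / C
samples = [1.0 + 0.5 * math.cos(true_phase - k * math.pi / 2.0)
           for k in range(4)]

estimated = distance_from_phase(phase_from_samples(*samples), mod_freq)
print(estimated, unambiguous_range(mod_freq))
```

The differences c0 - c2 and c1 - c3 cancel the constant offset, so the estimate is insensitive to ambient light of constant intensity under this model.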

1.1.8 Laser rangefinder

Although depth recovery using laser rangefinders is strictly speaking not a vision-based technique, it is frequently discussed in comparisons of 3-d reconstruction approaches. The simplest laser rangefinder emits a single laser beam toward the measured surface. The distance is calculated by analyzing the time-of-flight of the beam reflected back to a receiving sensor. The range data can be measured with an accuracy within one millimeter, but at a higher price: a scanner using a laser rangefinder can cost more than 30,000 US dollars. A laser rangefinder can be implemented in different forms and sizes. A laser scanner can be designed as small as a handheld scanner for digitizing a regular sized object, or as a sensor array that integrates multiple lasers to expand the scale of reconstruction and increase the scanning rate. For instance, the Velodyne HDL-64E SE, a lidar (light radar), consists of 64 spinning laser rangefinders for real-time large field scanning. The solution provided by a laser rangefinder usually comes with no colour information. Therefore some computer vision techniques have to be integrated if texture data are desired.
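The time-of-flight relation itself is simple, and a short sketch shows how demanding millimetre accuracy is for a direct pulse-timing design. The function names are hypothetical, and the calculation assumes propagation at the vacuum speed of light.

```python
C = 299_792_458.0  # speed of light in m/s

def tof_distance(round_trip_s):
    """Distance from a measured round-trip time: d = c * t / 2."""
    return C * round_trip_s / 2.0

def timing_resolution(range_resolution_m):
    """Round-trip timing resolution needed for a given range resolution."""
    return 2.0 * range_resolution_m / C

round_trip = 2.0 * 10.0 / C          # echo from a surface 10 m away
print(tof_distance(round_trip))      # distance in metres
print(timing_resolution(1e-3))       # seconds per millimetre of range
```

Resolving one millimetre corresponds to resolving the round-trip time to a few picoseconds, which may be one reason for the cost of such devices.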
