Chapter 1 Introduction

1.2 Survey of Related Works

The related works are grouped into several categories and reviewed as follows.

(1) Catadioptric camera

A catadioptric omni-camera is a combination of a reflective mirror and a CCD camera, as shown in Fig. 1.1(a). An image taken by such a camera is shown in Fig. 1.1(b). With the aid of the reflective surface of the mirror, a camera of this type obtains a larger field of view in the acquired images. The lens of the CCD camera may be of a perspective or orthographic projection type, and the mirror surface may take various shapes, such as hyperbolic, circular, parabolic, or conical, as illustrated in Fig. 1.2. Because the mirrors and lenses differ, the resulting images and the calibration methods differ from camera to camera within this category. Works that use this type of camera can be found in [6]-[14].

(2) Dioptric camera

A dioptric omni-camera looks like a traditional camera and has no reflective mirror, but is equipped with a "wider-angle" lens. It captures incoming light rays from a wider field of view to form an omni-image. The difference in field of view among dioptric, traditional, and catadioptric cameras is illustrated in Fig. 1.3. The lens shape design of this group of cameras determines the formed images and their calibration methods. An example of this kind of omni-camera is the fish-eye camera; an image acquired by a fish-eye camera is shown in Fig. 1.4. Works that use fish-eye cameras can be found in [15]-[17].

Fig. 1.1 A catadioptric camera. (a) Structure of camera. (b) Acquired image.

Fig. 1.2 Illustration of camera and reflective mirror types. (Labels in figure: orthographic camera, paraboloid mirror, hyperboloid mirror, perspective camera.)

Fig. 1.3 FOVs of different camera types. (a) Dioptric camera. (b) Traditional (perspective) camera. (c) Catadioptric camera. [18]

Fig. 1.4 An image acquired by a fish-eye camera.

(3) Binocular vision systems

A binocular vision system is composed of two cameras, typically perspective ones, placed at different locations. By contrast, a binocular omni-vision system consists of two omni-cameras, which can be catadioptric or fish-eye ones. An illustration is shown in Fig. 1.5, where two kinds of such camera pairs are seen. In theory, stereo information can be derived from the corresponding pixels in the two images acquired by the cameras. Most existing research has focused on binocular vision systems using perspective cameras [19][20]; research on binocular vision systems with omni-directional cameras is scarcer [21], with many open problems still to be solved.

(4) Human-machine interface systems

Human-machine interaction has been intensively studied for many years. Laakso and Laakso [22] proposed a multiplayer game system using a top-view camera, which maps player avatar movements to physical ones and uses hand gestures to trigger actions. In [23], Magee et al. proposed a special human-machine interface that uses the symmetry between the left and right human eyes to control computer applications. Zabulis et al. [24] proposed a vision system composed of eight cameras mounted at room corners and two cameras mounted on the ceiling to localize multiple persons for wide-area exercise and entertainment applications. Starck et al. [25] proposed an advanced 3-D production studio with multiple cameras. In that study, the design considerations were first identified, and some evaluation methods were then proposed to provide insight into different design decisions.

Fig. 1.5 Two types of binocular omni-vision systems. (a) Laterally parallel combination. (b) Longitudinally coaxial combination.

(5) Geometric feature extractions

Geometric features in environments, such as points, lines, and spheres, encode important information for on-line calibration and adaptation [26][27]. Several methods have been proposed to detect such features. Ying [28][29] proposed several methods to detect geometric features when calibrating catadioptric cameras, which use the Hough transform to find the camera parameters by fitting detected line features to conic sections. Duan et al. [30] proposed a method to calibrate the effective focal length of a central catadioptric camera using a single space line, under the condition that the other parameters have been calibrated previously.

Von Gioi et al. [31] proposed a method to detect line segments in perspective images, which gives accurate results with a controlled number of false detections and requires no parameter tuning. Wu and Tsai [32] proposed a method to detect lines directly in an omni-image using a Hough transform process without unwarping the omni-image.

Maybank et al. [33] proposed a method based on the Fisher-Rao metric to detect lines in paracatadioptric images, which has the advantage that it does not produce multiple detections of a single space line. Yamazawa et al. [34] proposed a method to reconstruct 3D line segments from images taken with three omni-cameras in known poses, based on trinocular vision and using the Gaussian sphere and a cubic Hough space [35]. Li et al. [36] proposed a vanishing point detection method based on cascaded 1-D Hough transforms, which requires only a small amount of computation time without losing accuracy.

(6) System configuration optimization

Several methods have been proposed to derive optimal vision system configurations. Among them, one popular approach is to assess the 3D measurement error by use of the covariance matrix [37]-[42]. Wenhardt et al. [37] determined the locations of mobile cameras that yield the best 3D model reconstruction by assessing the covariance of the resulting 3D data in three ways, namely, using the determinant, the eigenvalues, and the trace of the covariance matrix. Hoppe et al. [40] used the eigenvalues of the covariance matrix to model the 3D measurement error for precise camera localization and object modeling. Alsadik et al. [39] established a camera network for precise reconstruction of a cultural heritage object by use of the trace of the covariance matrix. Olague and Mohr [41] proposed a multi-cellular genetic algorithm to decide camera locations that yield minimal 3D measurement errors, using the maximum diagonal element of the covariance matrix. Zhang [38] determined the optimal 2D spatial placement of multiple sensors participating in a robot perception task using the determinant of the covariance matrix. Rivera-Rios et al. [42] analyzed 3D measurement errors due to feature-point localization errors and accordingly found the optimal camera pose by the mean-square-error criterion, using the covariance matrix of the 3D measurement data.

In all of these methods, the precision of the 3D measurements is assessed by use of the covariance matrix. Additionally, a local-affineness assumption was made when deriving the covariance matrix (as stated in [43]).
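
The scalar criteria mentioned in the works above (determinant, eigenvalues, trace, and maximum diagonal element) can each be computed directly from a covariance matrix of a 3D measurement. The following is a minimal sketch only: the function name `covariance_criteria` and the example matrix are illustrative assumptions, not taken from any of the cited works, and NumPy is assumed to be available.

```python
import numpy as np


def covariance_criteria(cov):
    """Summarize a symmetric covariance matrix with four scalar criteria.

    `cov` is assumed to be a symmetric positive semi-definite matrix,
    e.g. the 3x3 covariance of a reconstructed 3D point.
    """
    cov = np.asarray(cov, dtype=float)
    # eigvalsh is intended for symmetric matrices and returns
    # the eigenvalues in ascending order.
    eigenvalues = np.linalg.eigvalsh(cov)
    return {
        "determinant": float(np.linalg.det(cov)),     # product of eigenvalues
        "trace": float(np.trace(cov)),                # sum of the variances
        "max_eigenvalue": float(eigenvalues[-1]),     # largest principal variance
        "max_diagonal": float(np.max(np.diag(cov))),  # largest per-axis variance
    }


# Hypothetical covariance: variances of 1, 4, and 9 along x, y, and z,
# with no correlation between the axes.
example_cov = np.diag([1.0, 4.0, 9.0])
criteria = covariance_criteria(example_cov)
```

For each criterion a smaller value indicates a smaller measurement uncertainty, so a configuration search would typically evaluate one of these scalars for every candidate camera placement and keep the placement with the smallest value. Which scalar is preferable depends on the application.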