6.3 Proposed Techniques for Basic Mapping Table Construction and
6.3.3 Mapping Table Modification According to Change of Camera
Assume now that the camera is affixed to the ceiling with a tilt angle of θ and at a height of L with respect to floor F1, as illustrated in Fig. 6.7. Here, the location of the object point P1 on F1, which we want to estimate, is specified by the real-world coordinates (x1, y1) with respect to the downward projection point O of the camera’s lens center onto F1, where the x-axis is assumed to coincide with the projection of the camera’s optical axis on F1. Let the coordinates of P1 in the acquired image be (u, v). Again, the space-mapping table is not directly applicable here: the table lookup result, the real-world coordinates (x0, y0), is actually that of a real-world point on a floor surface F0 at a distance of H0 from the camera’s lens center, rather than the desired real-world coordinates (x1, y1) of P1 on F1. The table must therefore be modified, an operation called camera orientation adaptation in Step 6 of Algorithm 6.1.
To correct the values (x0, y0) into (x1, y1), we rotate floor surface F1 through an angle of 90° − θ with P1 as the rotation pivot point, such that the resulting surface plane F1' becomes perpendicular to the camera’s optical axis and the lateral view of the rotation result seen from the positive y-axis direction becomes the one shown in Fig. 6.8. The original floor surface F0 is also shown in the figure.
Assume that the distance of P1 on F1' to the camera’s optical axis is x'. Then, according to the concept of side proportionality again, we have
$$\frac{x_0}{x'} = \frac{H_0}{H_1}. \qquad (6.3)$$
Also, by geometry and trigonometry we have
sin x'
Fig. 6.7 Illustration of a tilted camera with angle θ with respect to the x-axis of the real-world coordinate system.
Fig. 6.8 Lateral view (from the positive y-axis direction) of rotation result of floor surface F1 in Fig. 6.7 through an angle of 90° − θ with P1 as the rotation pivot point.
which can be reduced to x1 = x0 × (L/H0), which is exactly the case described by Eqs. (6.2) and depicted in Fig. 6.6.
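As a worked sketch of how a closed form for x1 consistent with Eq. (6.3) and with the reduction above can be obtained, consider the following. It is a reconstruction under stated assumptions, not necessarily the derivation used in this chapter: H1 is assumed to denote the distance from the lens center to the rotated plane F1' measured along the optical axis, θ is assumed to be measured between the optical axis and the x-axis on F1 (so θ = 90° is the downward-looking case), and x' is taken as a signed distance.

```latex
% Assumed geometry: lens center at height L above O, optical axis tilted by theta.
% Projecting the vector from the lens center to P_1 = (x_1, 0) onto the optical
% axis and onto its normal in the x-z plane gives
\begin{align*}
  H_1 &= x_1\cos\theta + L\sin\theta, \\
  x'  &= x_1\sin\theta - L\cos\theta .
\end{align*}
% Substituting into x_0 / x' = H_0 / H_1, i.e. x_0 H_1 = H_0 x', and solving for x_1:
\begin{align*}
  x_0\,(x_1\cos\theta + L\sin\theta) &= H_0\,(x_1\sin\theta - L\cos\theta), \\
  x_1 &= L \times \frac{H_0\cos\theta + x_0\sin\theta}{H_0\sin\theta - x_0\cos\theta}.
\end{align*}
% Check: for theta = 90 degrees this gives x_1 = x_0 \times (L / H_0).
```

The numerator and denominator of this sketch are the sums and differences of the quantities H0cosθ, x0sinθ, H0sinθ, and x0cosθ, which are the same four quantities labeled in Fig. 6.9.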
The correctness of (6.11) may also be seen from Fig. 6.9 because according to trigonometry we have H0cosθ = a, x0sinθ = b, H0sinθ = c + d, x0cosθ = d, so that
Fig. 6.9 Illustration for verification of correctness of Eq. (6.11).
On the other hand, because the x-axis on F1 is assumed to be coincident with the projection of the camera’s optical axis on F1, and because the rotation of F1 into F1' is pivoted in the y-direction, we have y' = y1. Also, according to Eqs. (6.2) we have
which is again exactly the case described by Eqs. (6.2) and depicted in Fig. 6.6. The correctness of (6.14) may also be seen from Fig. 6.10, which is a lateral view of Fig. 6.8 from the positive x-axis direction, because from the previous analysis of (6.12) we have H0sinθ − x0cosθ = c, so that
$$\frac{L \times y_0}{H_0\sin\theta - x_0\cos\theta} = L \times \frac{y_0}{c} = L\cot\theta_1 = y_1 \qquad (6.15)$$
which is just (6.14).
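The camera orientation adaptation can be illustrated with a minimal Python sketch. The y-correction follows the left-hand side of Eq. (6.15); the x-correction is an assumed closed form, inferred from the quantities H0cosθ, x0sinθ, and c = H0sinθ − x0cosθ used in the verification above, and should be checked against Eq. (6.11). The function name `adapt_to_tilt` and its parameter names are hypothetical.

```python
import math


def adapt_to_tilt(x0, y0, H0, L, theta_deg):
    """Correct table-lookup coordinates (x0, y0) on F0 into (x1, y1) on F1.

    Assumed closed forms (see lead-in), with c = H0*sin(theta) - x0*cos(theta):
        x1 = L * (H0*cos(theta) + x0*sin(theta)) / c
        y1 = L * y0 / c
    theta_deg is the tilt angle in degrees; 90 means looking straight down.
    """
    theta = math.radians(theta_deg)
    c = H0 * math.sin(theta) - x0 * math.cos(theta)
    if c <= 0:
        # The viewing ray is parallel to the floor or points away from it.
        raise ValueError("viewing ray does not reach floor F1")
    x1 = L * (H0 * math.cos(theta) + x0 * math.sin(theta)) / c
    y1 = L * y0 / c
    return x1, y1


# Downward-looking camera: the correction reduces to scaling by L / H0.
x1, y1 = adapt_to_tilt(50.0, 30.0, 100.0, 200.0, 90.0)
print(round(x1, 6), round(y1, 6))
```

With θ = 90° the sketch scales both coordinates by L/H0, which is the height-only case of Eqs. (6.2). A point on the optical axis (x0 = y0 = 0) is mapped to x1 = L cot θ, the point where the tilted axis meets the floor.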
Fig. 6.10 Lateral view of Fig. 6.8 from direction of positive x-axis for verification of correctness of Eq. (6.14).
6.4 Experimental Results
A series of experiments has been conducted to test the correctness and precision of the proposed method for object location estimation. The fish-eye camera used in the experiments is shown in Fig. 6.11. It is attached to a rotator connected to a rod of adjustable length. The camera can thus be tilted arbitrarily by rotating the rotator, and raised to any height by adjusting the rod length, to simulate environments of different ceiling heights and camera orientations. An image taken with the camera looking straight downward (i.e., with a tilt angle of 90°) is shown in Fig. 6.3. We additionally show here three images (Figs. 6.12(a) through 6.12(c)) taken with the camera in three distinct setups, which are used in the experiments reported in this section: (1) looking downward at a ceiling height of 200 cm; (2) looking downward at a ceiling height of 250 cm; (3) tilted at an angle of 50° at a ceiling height of 200 cm.
All the images have a resolution of 1280×1024.
Fig. 6.11 Fish-eye camera used in this study, which is attached to a rod fixed on ceiling and can be tilted and moved up and down.
Case (1) is regarded as the original camera setup used in the factory for building a basic space-mapping table. After the image of Fig. 6.12(a) was taken with the downward-looking camera at the ceiling height of 200 cm, all the grid points in the image were extracted to get their image coordinates, forming a set denoted by Ic. The real-world coordinates of each grid point were also measured manually to form a set denoted by Wc. The two sets Ic and Wc of coordinate data were then used to construct a basic space-mapping table T by the process described in Section 6.3.1. To test the precision of the constructed table T, nine non-grid points lying among the grid points, which also appear in Fig. 6.12(a), were selected and their image coordinates collected to form a set Ic'. The real-world coordinates of these non-grid points were also measured manually to form another set Wc'. The set Ic' was then used to obtain the corresponding real-world coordinates by table lookup using T, forming a set denoted by Wc''. Finally, the two sets Wc' and Wc'' were compared, and two types of error ratio were defined to measure the discrepancy between them, as follows.
(1) Type 1 --- location error ratio with respect to the distance from the real-world point to the camera’s lens center:
$$\text{location error ratio} = \frac{\sqrt{(\text{real } x_i - \text{estimated } x_i)^2 + (\text{real } y_i - \text{estimated } y_i)^2}}{\sqrt{(\text{real } x_i)^2 + (\text{real } y_i)^2 + L^2}}$$
where real xi and real yi are data in Wc' and estimated xi and estimated yi are data in Wc''.
(2) Type 2 --- location error ratio with respect to the effective field of view of the camera (see Fig. 6.13):
$$\text{location error ratio} = \frac{\sqrt{(\text{real } x_i - \text{estimated } x_i)^2 + (\text{real } y_i - \text{estimated } y_i)^2}}{\text{radius of effective camera's field of view}}.$$
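The two error ratio measures can be sketched in a few lines of Python. The function names are hypothetical, both measures are assumed to use the Euclidean distance between the real and estimated points, and the field-of-view radius passed to the type-2 measure is a placeholder argument, since its measured value is read from Fig. 6.13.

```python
import math


def location_error(real, estimated):
    """Euclidean distance between the real and estimated (x, y) points."""
    return math.hypot(real[0] - estimated[0], real[1] - estimated[1])


def type1_error_ratio(real, estimated, camera_height):
    """Error relative to the distance from the real point to the lens center."""
    distance_to_lens = math.sqrt(real[0] ** 2 + real[1] ** 2 + camera_height ** 2)
    return location_error(real, estimated) / distance_to_lens


def type2_error_ratio(real, estimated, fov_radius):
    """Error relative to the radius of the camera's effective field of view."""
    return location_error(real, estimated) / fov_radius


# First row of Table 6.5 (90-degree tilt), camera height L = 200 cm.
ratio = type1_error_ratio((-20, 96), (-20, 94), 200)
print(f"{ratio:.1%}")  # prints 0.9%
```

The printed value agrees with the type-1 ratio reported for that row of Table 6.5.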
The computed results for the two types of error ratio are summarized in Table 6.2, from which we can see that the ratios are all smaller than 5%, which is practical for object location estimation applications such as robot or vehicle guidance in indoor environments.
For Case (2), the camera, still looking downward, was affixed at a different height of 250 cm, and the previously-mentioned process of error ratio computation was repeated after the proposed method was applied to the image of Fig. 6.12(b). The results are summarized in Table 6.3, from which we can see that the ratios are again all smaller than 5%.
Fig. 6.12 Images used for experiments reported here. (a) Taken with camera looking downward at ceiling height of 200 cm. (b) Taken with camera looking downward at ceiling height of 250 cm. (c) Taken with camera tilted for 50° at ceiling height of 200 cm.
Fig. 6.13 Effective field of view of camera measured by radius of an enclosing red circle.
Table 6.2 Error ratios with camera looking downward at ceiling height 200 cm.
Table 6.3 Error ratios with camera looking downward at ceiling height 250 cm.
Similarly, for Case (3), where the camera was affixed at the ceiling height of 200 cm and tilted for 50°, the error ratio table constructed for the image of Fig. 6.12(c) is shown in Table 6.4, from which we see that the ratios are not all smaller than 5% this time; some are larger (6.0% and 7.1% for the last row in the table). The reason is that the object point concerned is located at (-320, -15), which is quite far from the center of the image and falls within a heavily distorted quadrilateral, and so incurs a larger error in the process of quadrilateral mapping described in Section 6.3.1.
Table 6.4 Error ratios with camera tilted for 50° at ceiling height 200 cm.
For Case (4), where the camera was affixed at the ceiling height of 200 cm but tilted for 90° (looking down), 70°, and 50°, respectively, the error ratios are shown in Table 6.5, from which we see that the average error ratios increase slightly, by 1 to 2%, as the tilt angle changes. For a reason similar to that of Case (3), the larger changes occur only at positions far away from the image center.
Two possible applications of the proposed method are guidance of autonomous vehicles for human tracking and for security patrolling, both of which were studied in our laboratory. Two images taken in such application studies are shown in Fig. 6.14. The vehicle location estimation results yielded by the proposed method were used for path correction and planning in these studies.
Table 6.5 Error ratios with camera at ceiling height 200 cm for different tilt angles: 90° (looking down), 70°, and 50°. Each cell gives the estimated (x, y) in cm followed by the (type-1, type-2) error ratios.
real (x, y) (cm)      Tilted for 90°                Tilted for 70°                Tilted for 50°
(-20, 96)             (-20, 94) (0.9%, 0.6%)        (-22, 94) (1.3%, 0.9%)        (-21, 90) (2.7%, 1.9%)
(-45, -107)           (-44, -106) (0.6%, 0.4%)      (-45, -103) (1.7%, 1.2%)      (-42, -103) (2.2%, 1.6%)
(-111, -55)           (-112, -56) (0.6%, 0.4%)      (-115, -58) (2.1%, 1.6%)      (-110, -60) (2.2%, 1.6%)
(-140, 62)            (-140, 60) (0.8%, 0.6%)       (-144, 66) (2.2%, 1.8%)       (-141, 58) (1.6%, 1.3%)
(-229, -101)          (-228, -104) (1.0%, 1.0%)     (-224, -95) (2.4%, 2.4%)      (-238, -110) (4.0%, 4.0%)
(-253, 76)            (-257, 82) (2.2%, 2.3%)       (-259, 82) (2.6%, 2.6%)       (-271, 81) (4.5%, 4.6%)
(-320, -15)           (-317, -15) (0.8%, 0.9%)      (-330, -16) (2.7%, 3.1%)      (-340, -26) (6.0%, 7.1%)
average error ratio
(type-1, type-2)      (0.9%, 0.7%)                  (1.9%, 1.7%)                  (2.8%, 2.6%)
6.5 Concluding Remarks
A general space-mapping method for object location estimation, based on modifying a basic space-mapping table to adapt it to camera setup changes, has been proposed. The method does not require the conventional camera calibration process, and is general for any type of camera. The method estimates the location of an object by mapping the image coordinates of an object point to the real-world coordinates of the point using a space-mapping table. An algorithm has been designed to construct the table in two stages: the first stage constructs a basic space-mapping table using a bilinear interpolation technique, and the second stage modifies the basic table to adapt it to changes of camera setup, including camera height and orientation adjustments, which often occur after the camera is delivered to a user for use in the application environment. The proposed techniques for table modification are based on a concept of image formation by light rays as well as several properties of geometry and trigonometry. The problem of adapting the space-mapping method to camera-setup changes has not been studied before.
Experimental results show that the method yields location estimation results with error ratios smaller than 5% in most cases, which means that the proposed method is practical for applications like robot or vehicle guidance. Future studies may be directed to applying the proposed method to more application fields, as well as extending the method to outdoor environments. Modifications of the method for more complicated camera structures like omni-camera pairs [16] or two-mirror omni-cameras [63] are also worth investigating.
Fig. 6.14 Illustrative images of applications of proposed location estimation method for autonomous vehicle guidance in an indoor environment (a laboratory where this study was conducted). (a) An image acquired by a downward-looking camera affixed on ceiling. (b) A processed image in which autonomous vehicle center (white point) was detected for vehicle location estimation.
Chapter 7 Unwarping of Images Taken by Misaligned Omni-cameras without Camera Calibration by Curved Quadrilateral Morphing Using Quadratic Pattern Classifiers
7.1 Idea of Proposed Method
As mentioned previously, omni-cameras are becoming popular for various applications owing to the wider FOVs they provide in acquired omni-images. A dioptric omni-camera captures incoming light through the camera lens to form images. An example is the fish-eye camera [13]. A catadioptric omni-camera has, in addition to a CCD camera, a reflective mirror, and captures indirect light reflected by the mirror to form images. The mirror surface may be of various shapes, such as conic, parabolic, hyperbolic, or spherical. A catadioptric omni-camera with a parabolic mirror used in this study is shown in Fig. 1.1(a), in which a transparent plastic hollow cylinder is used to support the mirror at a distance from the CCD camera placed on a platform. The structure of the camera is illustrated in Fig. 7.1(a). If all the reflected light rays go through a common point, the camera is additionally said to be of the single-viewpoint (SVP) type [64]; otherwise, it is of the non-single-viewpoint (non-SVP) type [3].
Omni-images, though providing wider FOVs, are warped in nature. In many applications, it is necessary to transform them into unwarped images. Such image unwarping usually involves camera calibration, in which the intrinsic and extrinsic camera parameters are estimated, followed by the derivation of equations to transform image coordinates into unwarped versions [71]. The camera calibration process, presumably conducted during camera manufacturing, is in general complicated and time-consuming. After a calibrated camera is deployed in an application environment (e.g., installed on a vehicle or attached to a house ceiling) and put to use, it is usually assumed that the camera structure stays fixed, so that the camera parameters do not change.
However, in real applications like vision-based autonomous vehicle navigation or security surveillance, a camera mounted on a vehicle might be shaken by vehicle vibrations, or one installed on a wall might be taken down for reuse elsewhere. Either event may cause the camera misalignment mentioned previously, that is, displacement and/or re-orientation of the CCD camera with respect to the reflective mirror. The previously-mentioned non-SVP property is actually a type of camera misalignment in which the optical axis through the lens center and the mirror axis through the mirror center are axially displaced with respect to each other, degrading the SVP into a locus called a caustic surface [3]. We will call this kind of camera structure change axial-directional camera misalignment. An illustration is shown in Fig. 7.1(b). Note that the optical axis is usually assumed to coincide with the mirror axis, and that the distance from the CCD camera to the mirror surface is usually adjusted properly in advance to obtain the SVP property.
Another type of camera misalignment is re-orientation of the CCD camera with respect to the mirror surface, which destroys the coincidence of the optical axis with the mirror axis. Such misalignment, seldom studied, destroys not only the SVP property [34] but also the rotational invariance property of omni-images, which is used by almost all existing image unwarping methods to simplify computation [71]. We will call this kind of camera structure change lateral-directional camera misalignment. An illustration is shown in Fig. 7.1(c). Note that the rotational invariance property says that the azimuth angle of an incoming light ray from a scene point is identical to that of the corresponding image point in the image space. An image taken by a correctly-aligned catadioptric omni-camera and another taken by a lateral-directionally misaligned one are shown in Figs. 7.2(a) and 7.2(b), respectively, for illustration.
Fig. 7.1 Alignment of catadioptric omni-camera. (a) Correct alignment. (b) Axial-directional misalignment. (c) Lateral-directional misalignment.
Fig. 7.2 Images of a color pattern acquired by a catadioptric omni-camera. (a) Image taken with the camera correctly aligned. (b) Image taken with the camera misaligned.
Camera misalignment renders conventional image unwarping methods inapplicable because of the resulting changes of the camera parameters. It is desired in this study to design a general image unwarping method which can solve this problem in the application environment without the camera calibration that is usually done in the factory. Such an in-field method is useful for applications where sending misaligned cameras back to factories for re-calibration is undesirable or impractical.
For this goal, the idea of a mapping-based approach proposed recently by Jeng and Tsai [36] is adopted. This approach does not conduct camera calibration to estimate camera parameters, but instead creates a so-called pano-mapping table as a substitute for the camera parameters in image unwarping. It is unified and integrated in nature, and is applicable to unwarping of images taken by any type of CCD camera combined with any type of reflective mirror surface.
More specifically, the proposed method has two stages, the first being assumed to be conducted in the factory and the second in the in-field environment. In the first stage, it is assumed that the camera is correctly aligned when taking images, which we call undistorted images. A pano-mapping table is then created according to Jeng and Tsai [36], which defines a coordinate mapping function from the real-world space to the omni-image space. It can be used to unwarp an omni-image into a panoramic or a perspective-view image. In the second stage, where the camera is lateral-directionally misaligned, a distortion correction table is created first, which maps undistorted images to the distorted ones taken by the camera. This table is then combined with the pano-mapping table to create a composite mapping from the real-world space to the distorted image space, in the form of a third table, called a misalignment adjustment table. This third table is finally used for unwarping distorted images into panoramic images in the real-world space.
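The table composition in the second stage can be illustrated with a minimal Python sketch. The data layout is an assumption made for illustration only: the pano-mapping table is a nested list whose entry `[j][i]` holds integer pixel coordinates (u, v) in the undistorted image, and the distortion correction table is a dictionary from undistorted pixels to distorted pixels. The names `compose_tables` and `unwarp` are hypothetical.

```python
def compose_tables(pano_table, distortion_table):
    """Combine the two tables into a misalignment adjustment table.

    pano_table[j][i]          -> (u, v) in the undistorted omni-image
    distortion_table[(u, v)]  -> (u2, v2) in the distorted omni-image
    result[j][i]              -> (u2, v2) in the distorted omni-image
    """
    return [[distortion_table[uv] for uv in row] for row in pano_table]


def unwarp(distorted_image, adjustment_table):
    """Build a panoramic image by reading one distorted pixel per table entry."""
    return [[distorted_image[v][u] for (u, v) in row] for row in adjustment_table]


# Toy data: a 2x2 pano-mapping table and a distortion that shifts u by +1.
pano = [[(0, 0), (1, 0)],
        [(0, 1), (1, 1)]]
distortion = {(u, v): (u + 1, v) for u in range(2) for v in range(2)}
image = [[10, 11, 12],
         [20, 21, 22]]  # distorted omni-image, indexed as image[v][u]

adjust = compose_tables(pano, distortion)
panorama = unwarp(image, adjust)
print(adjust)
print(panorama)
```

Because the composition is done once, unwarping each new distorted image costs one table lookup per output pixel, the same as unwarping an undistorted image with the pano-mapping table alone.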
In generating the distortion correction table, which is essentially an image mapping between patches of a distorted image and those of an undistorted one, a new image morphing technique proposed in this study is applied. The technique is based on the use of the quadratic classifier in pattern recognition theory for two-class pattern classification. The use of such quadratic classifiers improves on the precision of the morphing result of the conventionally-adopted bilinear mapping technique, because the corresponding patches in this study have curved boundaries instead of linear ones.
Furthermore, the misalignment adjustment table is invariant in nature with respect to the camera position, so that the table remains applicable wherever the camera is moved.
In the remainder of this chapter, we describe the proposed two-stage mapping-based image unwarping method as an algorithm in Section 7.2, present the proposed image patch morphing technique using quadratic classifiers in Section 7.3, show some experimental results in Section 7.4, and make concluding remarks finally in Section 7.5.
7.2 Proposed Mapping-based Image Unwarping Method
In this section, Jeng and Tsai’s method [36], which is used in the proposed image unwarping method, is reviewed first, followed by a description of the proposed method.
A. Review of a mapping-based image unwarping method
The pano-mapping table proposed by Jeng and Tsai [36] is created once and for all by a simple learning process for a non-lateral-directionally misaligned omni-camera with any type of reflective mirror surface, as a summary of the information conveyed by all the camera parameters. The learning process takes as input a set of landmark points on a calibration object in the world space and the set of corresponding points in a given image. For example, as illustrated in Fig. 7.3, P1 and P2 are two landmark points in the real world, and p1 and p2 are the corresponding image points, respectively. More generally, let the coordinates of each real-world point P be denoted as (θ, ρ), and those of its corresponding image pixel p as (u, v). The pair (θ, ρ) describes the azimuth angle and the elevation angle of an incident light ray coming from P and reflected by the mirror surface to go through the lens center, yielding the corresponding image pixel at (u, v) on the image plane. Accordingly, the pano-mapping table is designed to be 2D in nature, with the horizontal and vertical axes specifying the possible ranges of θ and ρ in M and N increments, respectively, as shown in Table 7.1. Each entry Eij with indices (i, j) in the table specifies a pair (θi, ρj), which defines an infinite set Sij of real-world points on the light ray with azimuth angle θi and elevation angle ρj. The real-world points in Sij are all projected onto an identical pixel pij in an omni-image taken by the camera, forming a pano-mapping, denoted as fpm, from Sij to pij. An illustration of this mapping is shown in Fig. 7.4. The mapping is recorded in the table by filling entry Eij with the coordinates (uij, vij) of pixel pij in the omni-image.
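The table layout described above can be sketched in Python as a nested list with N rows (one per elevation ρj) and M columns (one per azimuth θi). The sketch assumes the rotational invariance property of a correctly aligned camera, so that the image pixel lies at azimuth θ around an image center and at a radial distance that depends only on ρ. The radial function, the image center, and the ρ range used below are placeholders chosen for illustration, not the values or equations of [36].

```python
import math


def build_pano_table(M, N, rho_min, rho_max, radial_fn, center):
    """Return table[j][i] = (u, v) for azimuth theta_i and elevation rho_j.

    radial_fn(rho) -> radial distance r in pixels (hypothetical placeholder).
    center         -> (u0, v0), assumed image center of the omni-image.
    Requires N >= 2 so that the rho range can be sampled at both ends.
    """
    u0, v0 = center
    table = []
    for j in range(N):
        rho = rho_min + (rho_max - rho_min) * j / (N - 1)
        r = radial_fn(rho)
        row = []
        for i in range(M):
            theta = 2.0 * math.pi * i / M
            u = round(u0 + r * math.cos(theta))
            v = round(v0 + r * math.sin(theta))
            row.append((u, v))
        table.append(row)
    return table


# Toy example: 4 azimuth steps, 2 elevation steps, linear placeholder radial map.
table = build_pano_table(4, 2, 0.0, 1.0, lambda rho: 100.0 * rho, (500, 500))
for row in table:
    print(row)
```

Once filled, the table is used purely by lookup: the panoramic pixel for (θi, ρj) takes its value from the omni-image pixel stored in entry Eij.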
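Table 7.1 A pano-mapping table of size M×N.
        θ1            θ2            θ3            θ4            …    θM
ρ1      (u11, v11)    (u21, v21)    (u31, v31)    (u41, v41)    …    (uM1, vM1)
ρ2      (u12, v12)    (u22, v22)    (u32, v32)    (u42, v42)    …    (uM2, vM2)
…       …             …             …             …             …    …
ρN      (u1N, v1N)    (u2N, v2N)    (u3N, v3N)    (u4N, v4N)    …    (uMN, vMN)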
Fig. 7.4 Mapping between pano-mapping table and omni-image.
Under the assumption of correct camera alignment which leads to the rotational invariance property, Jeng and Tsai [36] derived the following equations for computing the values (uij, vij) of each entry in the table: omni-camera, respectively; fr(ρ) is a nonlinear function specifying the relation between the elevation angle ρ of a real-world point P and the radial distance r from