
Dynamic Camera Calibration

Up to now, only a few research works [38]-[43] have been proposed for dynamic camera calibration. Most existing dynamic calibration techniques concern the extrinsic parameters of cameras. Jain et al. [38] proposed an off-line method that tries to find the relationship between the realized rotation angle and the requested angle. In [39], the pose of a calibrated camera is estimated from a planar target. However, both [38] and [39] only demonstrate the dynamic calibration of a single camera, not the calibration among multiple cameras. In [40], the authors utilize the marks and width of parallel lanes to calibrate PTZ cameras. In [41], the focal length and two external rotation angles are dynamically estimated by using parallel lanes. Although these two methods are practical for traffic monitoring, they are not general enough for other types of surveillance systems. In [42], a dynamic camera calibration method for narrow-range coverage was proposed. For a pair of cameras, this method establishes the correspondence of feature points over the image pair and uses coplanar geometry for camera calibration. In [43], the relative pose between a calibrated camera and a projector is determined via plane-based homography. The authors took two steps to recalibrate the pose parameters: they first estimated the translation vector and then found the rotation matrix. They also offered analytic solutions. Nevertheless, this approach requires the correspondence of feature points.

So far as we know, most calibration algorithms require corresponding feature points, special calibration patterns (coplanar points with known structure, or parallel lines), or known landmarks in the three-dimensional space. However, for the dynamic calibration of multiple cameras, calibration patterns and landmarks are not always applicable, since they may become occluded or even move out of the captured scene when the cameras pan or tilt. On the other hand, if feature-point correspondence is used, the correspondences must be continually updated as the cameras rotate. For surveillance systems with wide-range coverage, the matching of feature points is usually a difficult problem. Hence, in this thesis, we develop a new algorithm for the dynamic calibration of multiple cameras that does not require complicated feature-point correspondence.

CHAPTER 3

Static Calibration of Multiple Cameras


In this chapter, we introduce how to efficiently calibrate the extrinsic parameters of multiple static cameras. In Section 3.1, the camera model of our surveillance system is first described. Next, in Section 3.2, we deduce the 3D-to-2D coordinate transformation in terms of the tilt angle of a camera. In [46], a similar scene model based on the pan angle and the tilt angle has also been established. In this thesis, however, we deduce a more complete formula that takes into account not only the translation effect but also the rotation effect when a camera undergoes a tilt movement. After the 3D-to-2D transformation has been established, the tilt angle and altitude of a camera can be estimated based on the observation of some simple objects lying on a horizontal plane. Then, in Section 3.3, we introduce how to utilize the estimation results to achieve the calibration of multiple cameras. In addition, the sensitivity analysis with respect to parameter fluctuations and measurement errors is discussed in Section 3.4. In Section 3.5, some experimental results on real data are demonstrated to illustrate the feasibility of the proposed static calibration method.

3.1 Introduction to Our Camera System Model

In this section, we give an overview of our system and describe the camera setup model and the basic camera projection model. Although the camera model is built based on our surveillance environment, it is general enough to fit a large class of surveillance scenes that are equipped with multiple cameras.

3.1.1 System Overview

In the setup of our indoor surveillance system, four PTZ cameras are mounted at the four corners of the ceiling of our lab, about 3 meters above the ground plane. The lab is full of desks, chairs, PCs, and monitors. All the tabletops are roughly parallel to the ground plane. The cameras are allowed to pan or tilt while they monitor the activities in the room. Figure 3.14(a) shows four images captured by these four cameras. We first estimate the tilt angle and altitude of each camera based on the captured images of some prominent features, such as corners or line segments, on a horizontal plane. Once the tilt angles and altitudes of the four cameras are individually estimated, we perform the calibration of multiple cameras. Figure 3.1 shows the flowchart of the proposed static calibration procedure.

Fig. 3.1 Flowchart of the proposed static calibration procedure.

3.1.2 Camera Setup Model

Figure 3.2 illustrates the modeling of our camera setup. Here, we assume the observed objects are located on a horizontal plane ∏, while the camera lies above ∏ at a height h. The camera may pan or tilt with respect to the rotation center OR. Moreover, we assume the projection center of the camera, denoted as OC, is away from OR by a distance r. To simplify the following deductions, we define the origin of the rectified world coordinates to be the projection center OC of a camera with zero tilt angle. The Z-axis of the world coordinates is along the optical axis of the camera, while the X- and Y-axis of the world coordinates are parallel to the x- and y-axis of the projected image plane, respectively. When the camera tilts, the projection center moves to OC′ and the projected image plane changes to a new 2-D plane. In this case, the y-axis of the image plane is no longer parallel to the Y-axis of the world coordinates, while the x-axis is still parallel to the X-axis.

Assume P=[X, Y, Z, 1]T denotes the homogeneous coordinates of a 3-D point P in the world coordinates. For the case of a camera with zero tilt angle, we denote the perspective projection of P as p= [x, y, 1]T. Under perspective projection, the

relationship between P and p can be expressed as Equation (2.5),

p = (1/z) K [R t] P. (2.5)

With respect to the rectified world coordinate system, the extrinsic term [R t] becomes [I 0]. To further simplify the mathematical deduction, we ignore the skew angle and assume the image coordinates have been translated by a translation vector (-u0, -v0).

Hence, (2.5) can be simplified as

x = α X/Z, y = β Y/Z,

where α and β denote the focal lengths expressed in x- and y-pixel units, respectively.

3.2 Pose Estimation of a Single Camera

In this section, we first deduce the projection equation that relates the world coordinates of a 3-D point P to its image coordinates on a tilted camera. Then, under the constraint that all observed points are located on a horizontal plane, the mapping between the 3-D space and the 2-D image plane is further developed. Finally, we deduce the formulae for the pose estimation of a camera.

3.2.1 Coordinate Mapping on a Tilted Camera

When the PTZ camera tilts by an angle φ, the projection center OC translates to a new place OC′ = [0 −r sinφ −(r − r cosφ)]^T. Assume we define a tilted world coordinate system (X′, Y′, Z′) with respect to the tilted camera, with the origin being the new projection center OC′, the Z′-axis being the optical axis of the tilted camera, and the X′- and Y′-axis being parallel to the x′- and y′-axis of the new projected image plane, respectively. Then, it can be easily deduced that the coordinates of the 3-D point P in the tilted world coordinate system become

[X′, Y′, Z′]^T = [1 0 0; 0 cosφ sinφ; 0 −sinφ cosφ] · ([X, Y, Z]^T − OC′). (3.2)

After applying the perspective projection formula, we know that the homogeneous coordinates of the projected image point now move to

p′ = [x′, y′, 1]^T = [α X′/Z′, β Y′/Z′, 1]^T.
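To make the forward mapping concrete, the following is a minimal numpy sketch of the tilt transformation and projection as reconstructed above; the function name and parameter packing are ours, and the intrinsics are reduced to (α, β) as in the simplified projection of Section 3.1.

import numpy as np

def project_tilted(P, phi, r, alpha, beta):
    # New projection center O_C' after tilting by phi about the rotation center
    Oc = np.array([0.0, -r * np.sin(phi), -(r - r * np.cos(phi))])
    # Rotation from rectified to tilted world coordinates, cf. (3.2)
    R = np.array([[1.0, 0.0, 0.0],
                  [0.0, np.cos(phi), np.sin(phi)],
                  [0.0, -np.sin(phi), np.cos(phi)]])
    Xp, Yp, Zp = R @ (np.asarray(P, dtype=float) - Oc)
    # Simplified perspective projection onto the tilted image plane
    return np.array([alpha * Xp / Zp, beta * Yp / Zp])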

3.2.2 Constrained Coordinate Mapping

In the rectified world coordinates, all points on a horizontal plane have the same Y coordinate; that is, Y = −h for a constant h. The homogeneous form of this plane ∏ can be defined as π = [0 1 0 h]^T. Assume the camera is tilted by an angle φ. Then, in the tilted world coordinate system, the homogeneous form of this plane ∏ becomes π′ = [0 cosφ −sinφ (h − r sinφ)]^T, as shown in Fig. 3.3.

Fig. 3.3 Geometry of a horizontal plane ∏ with respect to a tilted camera.

Assume a 3-D point P is located on the horizontal plane ∏. Then, in the rectified world coordinate system, we have π · P = 0, where P = [X, Y, Z, 1]^T. Similarly, in the tilted world coordinate system, we have π′ · P′ = 0, where P′ = [X′, Y′, Z′, 1]^T. With (3.2), Z′ can be found to be

Z′ = β (h − r sinφ) / (β sinφ − y′ cosφ).

Moreover, the tilted world coordinates of P become

P′ = Z′ [x′/α, y′/β, 1]^T.

Transforming P′ back to the rectified world coordinate system through the inverse of (3.2) yields

X = β x′ (h − r sinφ) / [α (β sinφ − y′ cosφ)], Y = −h,
Z = (β cosφ + y′ sinφ)(h − r sinφ) / (β sinφ − y′ cosφ) + r (cosφ − 1). (3.7)

If the principal point (u0, v0) is taken into account, then (3.7) can be reformulated as

X = β (x′ − u0)(h − r sinφ) / (α [β sinφ − (y′ − v0) cosφ]), Y = −h,
Z = [β cosφ + (y′ − v0) sinφ](h − r sinφ) / [β sinφ − (y′ − v0) cosφ] + r (cosφ − 1). (3.8)

This formula is the back-projection from the image coordinates of a tilted camera to the rectified world coordinates, under the constraint that all the observed points lie on the horizontal plane Y = −h.
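Under the plane constraint Y = −h, a minimal numpy sketch of the back-projection (3.8) as reconstructed above could read as follows; the function name is ours.

import numpy as np

def back_project(xp, yp, phi, h, r, alpha, beta, u0, v0):
    # Back-project image point (xp, yp) of a camera tilted by phi onto the
    # horizontal plane Y = -h, returning rectified world coordinates, cf. (3.8)
    u, v = xp - u0, yp - v0
    denom = beta * np.sin(phi) - v * np.cos(phi)
    X = beta * u * (h - r * np.sin(phi)) / (alpha * denom)
    Z = ((beta * np.cos(phi) + v * np.sin(phi)) * (h - r * np.sin(phi)) / denom
         + r * (np.cos(phi) - 1.0))
    return np.array([X, -h, Z])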

3.2.3 Pose Estimation Based on the Back-Projections

As mentioned above, in real life, people can usually form a rough estimate of the relative position of a camera with respect to the captured objects based on the image content. In this section, we demonstrate that, with a few corners or line segments lying on a horizontal plane, we can easily estimate the tilt angle of the camera based on the back-projection of the captured image.

3.2.3.1 Back-projected Angle w.r.t. Guessed Tilt Angle

Suppose we use a tilted camera to capture the image of a corner located on a horizontal plane. Based on the captured image and a guessed tilt angle, we may use (3.8) to back-project the captured image onto the horizontal plane Y = −h. Assume three 3-D points, PA, PB, and PC, on a horizontal plane form a rectangular corner at PA. The original image is captured by a camera with φ = 16 degrees, as shown in Fig. 3.4(a). In Fig. 3.4(b), we plot the back-projected images for various guessed tilt angles, ranging from 0 to 30 degrees with a 2-degree step. The back-projection for the choice of 16° is plotted in red. It can be seen that the back-projected corner becomes a rectangular corner only if the guessed tilt angle is correct. Besides, it is worth mentioning that a different choice of h only scales the back-projected shape.

To formulate this example, we express the angle ψ at PA as

ψ = cos⁻¹[ (PB − PA) · (PC − PA) / (‖PB − PA‖ ‖PC − PA‖) ].

After capturing the image of these three points, we can use (3.8) to build the relation between the back-projected angle and the guessed tilt angle φ:

ψ(φ) = cos⁻¹[ (PB(φ) − PA(φ)) · (PC(φ) − PA(φ)) / (‖PB(φ) − PA(φ)‖ ‖PC(φ) − PA(φ)‖) ], (3.10)

where PA(φ), PB(φ), and PC(φ) denote the back-projections of the three captured image points under the guessed tilt angle φ.

In Fig. 3.5, we show the back-projected angle ψ with respect to the guessed tilt angle, assuming α and β are known in advance. In this simulation, the red and blue curves are generated by placing the rectangular corner at two different places on the horizontal plane. Again, the back-projected angle equals 90 degrees only if we choose the tilt angle to be 16 degrees. This simulation demonstrates that if we know the angle of the captured corner in advance, we can easily deduce the camera's tilt angle.

Moreover, the red curve and the blue curve intersect at (φ, ψ) = (16°, 90°). This means that if we do not know the actual angle of the corner in advance, we can simply place the corner at two or more different places on the horizontal plane. Then, based on the intersection of the deduced ψ-v.s.-φ curves, we can estimate not only the tilt angle of the camera but also the actual angle of the corner.


Fig. 3.4 (a) Rectangular corner captured by a tilted camera (b) Back-projections onto the horizontal plane Y = −h for different choices of tilt angles.

Fig. 3.5 Back-projected angle with respect to guessed tilt angles.
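A sketch of how such ψ-v.s.-φ curves can be generated numerically, reusing the back_project sketch above; the image coordinates and the value of r are illustrative placeholders, while the intrinsics are those reported in Section 3.4.

import numpy as np

def corner_angle(img_pts, phi, h, r, alpha, beta, u0, v0):
    # Back-project the images of (P_A, P_B, P_C) under a guessed tilt angle
    # phi and return the back-projected angle psi at P_A, in degrees
    A, B, C = (back_project(x, y, phi, h, r, alpha, beta, u0, v0)
               for (x, y) in img_pts)
    v1, v2 = B - A, C - A
    c = v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(c, -1.0, 1.0)))

# Sweep guessed tilt angles from 0 to 30 degrees with a 2-degree step
alpha, beta, u0, v0 = 770.0, 750.0, 348.0, 257.0   # intrinsics from Section 3.4
r, h = 0.05, 3.0                                   # illustrative r; h of about 3 m
img_pts = [(400.0, 300.0), (500.0, 320.0), (420.0, 420.0)]  # illustrative pixels
curve = [corner_angle(img_pts, p, h, r, alpha, beta, u0, v0)
         for p in np.radians(np.arange(0.0, 30.1, 2.0))]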

3.2.3.2 Back-projected Length w.r.t. Guessed Tilt Angle

Assume two 3-D points, PA and PB, on a horizontal plane form a line segment with length L. We can build a similar relationship between the back-projected length and the guessed tilt angle by imposing the constraint ‖PAPB‖ = L. Based on this constraint and (3.8), we can deduce that

L² = (r sinφ − h)² [ (β/α)² (uB − uA)² + (wB − wA)² ], (3.11)

where, for i ∈ {A, B}, ui = (xi′ − u0) / (β sinφ − (yi′ − v0) cosφ) and wi = (β cosφ + (yi′ − v0) sinφ) / (β sinφ − (yi′ − v0) cosφ).

Similarly, if α, β, r, and h are known in advance, we can deduce the tilt angle directly from the known length L and the measured image coordinates.

Note that in (3.11), the right-hand-side terms contain a common factor (r sinφ − h)². This means the values of r and h only affect the scaling of L. Hence, we can rewrite the formula of the L-v.s.-φ curve as

L(φ) / |r sinφ − h| = √[ (β/α)² (uB − uA)² + (wB − wA)² ]. (3.12)

Then, even if the values of r and h are unknown, we may simply place two or more line segments of the same length at different places on a horizontal plane and find the intersection of the corresponding L-v.s.-φ curves, as shown in Fig. 3.6.

Fig. 3.6 Back-projected length with respect to guessed tilt angle. Each curve is generated by placing a line segment at some place on a horizontal plane.
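The L-v.s.-φ curves can be generated in the same way; a short sketch, again reusing the back_project helper assumed above.

import numpy as np

def backproj_length(img_pair, phi, h, r, alpha, beta, u0, v0):
    # Back-projected length of a segment (P_A, P_B) under a guessed tilt phi
    A = back_project(*img_pair[0], phi, h, r, alpha, beta, u0, v0)
    B = back_project(*img_pair[1], phi, h, r, alpha, beta, u0, v0)
    return np.linalg.norm(B - A)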

As mentioned above, the tilt angle can be easily estimated from the ψ-v.s.-φ curves or L-v.s.-φ curves. However, in practice, due to errors in the estimation of camera parameters and errors in the measurement of the (x′, y′) coordinates, the deduced ψ-v.s.-φ or L-v.s.-φ curves do not intersect at a single point. Hence, we may instead perform parameter estimation based on an optimization process. Here, we take (3.11) as an example. We assume several line segments with known lengths (not necessarily of the same length) are placed at different positions on a horizontal plane, and we use a tilted camera to capture the image. Assume the length of the ith segment is Li. We then aim to find the set of parameters {α, β, u0, v0, φ, r, h} that minimizes

E(α, β, u0, v0, φ, r, h) = Σi [ L̂i(α, β, u0, v0, φ, r, h) − Li ]², (3.13)

where L̂i denotes the back-projected length of the ith segment computed from (3.11).

In this way, the camera parameters can be easily estimated. In the optimization process, we adopt the Levenberg-Marquardt algorithm. Under our scene model, the tilt angle φ and the altitude h can be roughly estimated simply from visual observations. In our experiments, the error range of the guessed tilt angle φ is within ±20 degrees and the error range of the guessed altitude h is within ±1.5 meters. With these initial guesses, the optimization process is very stable and the estimation results are satisfactorily accurate.
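A sketch of this estimation step for the reduced problem of Section 3.4 (estimating only φ and h with the intrinsics fixed, cf. (3.16)), using scipy's Levenberg-Marquardt solver; the pixel measurements and the value of r are illustrative, and the helpers are the sketches above.

import numpy as np
from scipy.optimize import least_squares

def residuals(params, segments, lengths, cam):
    # One residual per segment: back-projected length minus known length
    phi, h = params
    alpha, beta, u0, v0, r = cam
    return np.array([backproj_length(seg, phi, h, r, alpha, beta, u0, v0) - L
                     for seg, L in zip(segments, lengths)])

cam = (770.0, 750.0, 348.0, 257.0, 0.05)           # intrinsics and a guessed r
segments = [((400.0, 300.0), (500.0, 320.0)),      # illustrative measurements
            ((250.0, 350.0), (330.0, 410.0))]
lengths = [0.283, 0.283]                           # known segment lengths (m)
x0 = np.array([np.radians(50.0), 2.5])             # coarse guesses for (phi, h)
fit = least_squares(residuals, x0, method='lm', args=(segments, lengths, cam))
phi_hat, h_hat = fit.x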

3.3 Calibration of Multiple Static Cameras

3.3.1 Static Calibration Method of Multiple Cameras

In our camera model, each camera has its own world coordinate system. If a vector in the 3-D space, such as a line segment on a tabletop, is observed by several cameras at the same time, we can achieve the calibration of these cameras by mapping the individually back-projected world coordinates of this vector to a common reference world coordinate system. In Fig. 3.7, we take two calibrated cameras as an example. Fig. 3.7(a) shows the scene model of these two cameras, and Fig. 3.7(b) shows the vector locations in the world coordinates of the two cameras, respectively. Based on the estimated φ and h, and the image projections of the vector endpoints, we can obtain the world coordinates of the points Aref, Bref and A′, B′ from (3.8). The difference of the rotation angle ω between the two world coordinate systems can then be easily computed by

cos ω = (Bref − Aref) · (B′ − A′) / (‖Bref − Aref‖ ‖B′ − A′‖). (3.14)

After applying the rotation to point A’, the position translation t between these two cameras can be expressed as

t = Aref − [cosω 0 −sinω; 0 1 0; sinω 0 cosω] · A′. (3.15)

Hence, the 3-D relationship between these two cameras can be easily deduced.
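A compact sketch of this registration step under the back-projected coordinates assumed above; a signed atan2 form replaces the arccos of (3.14) so that the sign of ω is resolved automatically.

import numpy as np

def register_pair(A_ref, B_ref, A2, B2):
    # Rotation angle about the vertical axis between the two world coordinate
    # systems, from the headings of the back-projected vector in the X-Z plane
    v_ref, v2 = B_ref - A_ref, B2 - A2
    omega = np.arctan2(v_ref[2], v_ref[0]) - np.arctan2(v2[2], v2[0])
    # Rotation matrix of (3.15), then the translation t = A_ref - R A'
    R = np.array([[np.cos(omega), 0.0, -np.sin(omega)],
                  [0.0, 1.0, 0.0],
                  [np.sin(omega), 0.0, np.cos(omega)]])
    t = A_ref - R @ A2
    return omega, R, t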


Fig. 3.7 (a) Top view of two cameras and a vector in the space (b) The world coordinates of the vector with respect to these two cameras.

3.3.2 Discussion of Pan Angle

Notice that, in the above deductions, we do not care about the pan angles of the cameras. This is because, in the back-projection process, guessing a pan angle merely rotates the X and Z coordinates in our camera model; it does not change the spatial relationship between the camera and the back-projected objects. Fig. 3.8 shows such an example. In Fig. 3.8(a), we show the image captured by a camera with three corner points marked in blue. Fig. 3.8(b) shows the top view (i.e., the X-Z plane) of the back-projected corner points with respect to four guessed pan angles: 0°, 30°, 60°, and 90°. The arrows indicate the optical axes of the camera with respect to these four pan angles. It can be seen that the spatial relationship between the optical axis and the back-projected corner points is almost the same when the camera pans. The slight variation comes from the fact that the panning center is not the same as the projection center. However, since the rotation radius r is small compared with the distance between the camera and the object, this small variation can be ignored.


Fig. 3.8 (a) Three points marked in the image captured by a PTZ camera (b) Top view of the back-projected corners and the optical axes with respect to different guessed pan angles.
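The invariance can also be verified numerically: rotating the back-projected points about the vertical axis by any guessed pan angle leaves their relative geometry unchanged. A small sketch with illustrative points (ignoring the rotation-radius effect discussed above):

import numpy as np

def pan_rotate(points, theta):
    # Rotate back-projected points about the vertical (Y) axis by a guessed
    # pan angle theta; only the X and Z coordinates change
    R = np.array([[np.cos(theta), 0.0, -np.sin(theta)],
                  [0.0, 1.0, 0.0],
                  [np.sin(theta), 0.0, np.cos(theta)]])
    return points @ R.T

pts = np.array([[1.0, -3.0, 4.0], [2.0, -3.0, 5.0], [1.5, -3.0, 6.0]])
for theta in np.radians([0.0, 30.0, 60.0, 90.0]):
    q = pan_rotate(pts, theta)
    # Pairwise distances (hence angles) are identical for every pan angle
    print(np.degrees(theta), np.linalg.norm(q[0] - q[1]))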

3.4 Sensitivity Analysis

The tilt angle of a camera is the key factor in our multi-camera calibration method. It affects the back-projections of the image coordinates, which are the information used to calibrate the cameras. In this section, we analyze how sensitive the estimation of the tilt angle is with respect to parameter fluctuations and measurement errors. Among these parameters, the distance r between the camera center and the rotation center has no impact on (3.10) and (3.12). Even in (3.11), r tends to have negligible impact since the term r sinφ is usually much smaller than h. Hence, the parameter r can be ignored or estimated via direct measurement. This means (3.13) can be reformulated as

E(α, β, u0, v0, φ, h) = Σi [ L̂i(α, β, u0, v0, φ, h) − Li ]². (3.16)

Besides, several parameters are entangled in a fairly complicated way in the formulae relating the back-projected angle and length to the guessed tilt angle. Hence, we additionally evaluate the variations of the tilt angle caused by the individual parameters via computer simulations. In this section, the values of {u0, v0, α, β} are estimated to be {348, 257, 770, 750} based on Zhang's calibration method [4].

3.4.1 Mathematical Analysis of Sensitivity

We assume the total variation of φ or h is the summation of the individual variations with respect to the different parameter fluctuations. This leads to Equation (3.17):

Δφ ≈ Σk (∂φ/∂pk) Δpk + (1/2) Σk (∂²φ/∂pk²) (Δpk)², (3.17)

and similarly for Δh, where pk ranges over the camera parameters and the image measurements, and where we use the second order of the Taylor series to approximate the variations. Next, the partial-derivative terms on the right-hand side, such as ∂φ/∂xi′, have to be evaluated.

For the estimations of φ and h, the optimization of (3.16) conforms to the following stationarity conditions:

∂E/∂φ = 0, ∂E/∂h = 0. (3.19)

We then apply the implicit function theorem to (3.19) to find how φ and h deviate with respect to the measurement error in xi′. Equation (3.20) is the differential of (3.19) with respect to xi′, obtained by using the chain rule:

(∂²E/∂φ²)(∂φ/∂xi′) + (∂²E/∂φ∂h)(∂h/∂xi′) + ∂²E/∂φ∂xi′ = 0,
(∂²E/∂h∂φ)(∂φ/∂xi′) + (∂²E/∂h²)(∂h/∂xi′) + ∂²E/∂h∂xi′ = 0. (3.20)

This 2×2 linear system can be solved for the sensitivities ∂φ/∂xi′ and ∂h/∂xi′.

If we assume the total variations of φ and h are caused by the individual variations with respect to parameter fluctuations in {α, β, u0, v0} and measurement errors in {xi′, yi′}, the total variations can be obtained by summing these individual contributions as in (3.17). Note that we ignore the second differential terms, because the variations can be approximated appropriately by the first differential terms alone.
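The sensitivities predicted by (3.20) can be cross-checked numerically by re-running the optimization with one input perturbed and differencing the estimates. A central finite-difference sketch, where `estimate` is any wrapper (such as the least-squares fit above) that maps the inputs to the optimized (φ, h); the step size is illustrative.

import numpy as np

def sensitivity_fd(estimate, inputs, key, delta=1e-3):
    # Central finite-difference estimate of d(phi, h) / d(inputs[key])
    hi, lo = dict(inputs), dict(inputs)
    hi[key] = inputs[key] + delta
    lo[key] = inputs[key] - delta
    return (np.asarray(estimate(hi)) - np.asarray(estimate(lo))) / (2.0 * delta)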

3.4.2 Sensitivity Analysis via Computer Simulations

We verify the variations predicted in the previous subsection via computer simulations. Ideally, when we place several corners or line segments on a horizontal plane, the deduced ψ-v.s.-φ or L-v.s.-φ curves should intersect at a single point. However, due to errors in the estimation of camera parameters and errors in the measurement of (xi′, yi′), these curves usually do not intersect at a single point. Hence, in practice, we estimate φ and h based on (3.16) using Levenberg-Marquardt optimization.

In the following simulations, two line segments are placed as in Fig. 3.9(a). Their lengths are both equal to 0.283 meters. Moreover, the images are assumed to be captured by a camera with tilt angle φ = 60 degrees. We perturb the values of the camera parameters and the measurements (xi′, yi′) individually. Again, φ and h are estimated via LM optimization. The variations of these estimation results, together with the variations deduced from (3.23) and (3.24), are listed in Table 3.1. It can be seen that the variations deduced from (3.23) and (3.24) approximate the simulation results well.

Besides, we repeat the simulations with the ψ-v.s.-φ curves, for which the corners are placed as in Fig. 3.9(b). We find that the variations of the tilt angle in the ψ-v.s.-φ figures match those in the L-v.s.-φ figures.


Fig. 3.9 (a) Top view of line segments placed on a horizontal plane (b) Top view of corners placed on a horizontal plane.

3.4.2.1 Sensitivity w.r.t. u0 and v0

As indicated in (3.12), the parameter u0 only affects the numerator of the back-projected X coordinate. Since the calculations of ψ and L depend on the distances between back-projected points, not on their absolute positions, this parameter has little impact on the deduced ψ-v.s.-φ and L-v.s.-φ curves. Even if the value of u0 is changed by an amount of 100, the deduced tilt angle only changes by about 0.1 degrees.

On the other hand, the parameter v0 has a larger, but still acceptable impact over the estimation of tilt angle. In Fig. 3.10, we plot the deduced ψ-v.s.-φ curves and L-v.s.-φ curves for the example in Fig. 3.9, with respect to different choices of v0. It can be

