

3.3  Image Unwarping and Generation of Perspective-view Images

3.3.2  Generation of specified perspective-view images with mouse clicks

This section describes a method that makes the user interface friendlier by letting the user change the view direction of the perspective-view image by moving a mouse. Figure 3.6 shows an experimental result consisting of a perspective-view image and its corresponding omni-image. From the derivation given previously and from observation of this perspective-view image and omni-image, we can obtain relations between the mouse motion and the direction of the perspective view.

Specifically, the horizontal motion (Mx) of the mouse affects only the azimuth angle θ, and the vertical motion (My) affects only the elevation angle ρ.

As a result, we reconstruct Eq. (3.15) as follows:

$$\theta_q = \sin^{-1}\left(\frac{h}{L}\right) + \theta_{mouse}, \tag{3.20}$$

where θmouse is an index to record the azimuth angle of the view. It will increase as the mouse moves to the left. On the other hand, it will decrease as the mouse moves to the right. Eq. (3.18) is also reconstructed as follows:

$$\rho_q = \tan^{-1}\left(\frac{H_q}{L}\right) + \rho_{mouse}, \tag{3.21}$$

where ρmouse is an index to record the elevation angle of the view. Similar to θmouse, it increases as the mouse moves down and decreases as the mouse moves up. Hence, a user can conveniently use the mouse to choose any view direction and observe the part of the scene of interest. Adding the two variables θmouse and ρmouse thus makes the user interface of the program friendlier.
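The update of the two indices can be sketched as follows. This is only an illustrative sketch: the function name, the degrees-per-pixel `gain`, and the wrap-around of the azimuth at 360 degrees are assumptions that do not come from the text, and screen coordinates are assumed to grow rightward in x and downward in y.

```python
def update_view_offsets(theta_mouse, rho_mouse, dx, dy, gain=0.5):
    """Return the updated (theta_mouse, rho_mouse), both in degrees.

    dx, dy: mouse displacement in screen pixels (x grows to the right,
            y grows downward).
    gain:   hypothetical sensitivity in degrees per pixel (an assumption).
    """
    # Moving left (dx < 0) increases the azimuth index; moving right decreases it.
    theta_mouse = (theta_mouse - gain * dx) % 360.0
    # Moving down (dy > 0) increases the elevation index; moving up decreases it.
    rho_mouse = rho_mouse + gain * dy
    return theta_mouse, rho_mouse
```

The returned values would then be added to the angles of Eqs. (3.20) and (3.21) when the perspective-view image is regenerated.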

Figure 3.6 Corresponding omni-image and perspective-view image. (a) A perspective-view image. (b) Omni-image from which (a) was generated. (Annotations in the figure: Mx, My, θ.)

Chapter 4

Automatic Detection of a Suspicious Passer-by with a Two-camera Omni-directional Imaging Device

4.1 Introduction

To detect a suspicious passer-by approaching the video surveillance car, an automatic human detection process is required. This chapter describes the proposed method for detecting a suspicious passer-by near a surveillance car and estimating his/her relevant 3D data using a two-camera omni-directional imaging device.

Using the result of this method, a user in a video surveillance car can obtain the passer-by’s position and height for use in various security applications. In addition, a corresponding perspective view can be computed from the passer-by’s position by the proposed method.

The remainder of this chapter is organized as follows. In Section 4.2, we introduce the concepts behind the proposed system. In Section 4.3, we describe the techniques we propose for estimating a passer-by’s distance and height using two omni-images taken by a two-camera omni-directional imaging device.

4.2 Review of Related Concepts in Proposed System

In this study, we use a pair of two-camera omni-directional imaging devices to perform a video surveillance task around a video surveillance car. To segment the shape of a moving object out of an omni-image accurately, we use the moment-preserving thresholding method proposed by Tsai [17] and a so-called dynamic offsetting scheme for image intensity adjustment in the proposed passer-by and passing-by car detection methods. The concepts behind the two methods are described in the following two sections.

4.2.1 Moment-preserving thresholding for object segmentation

An approach to automatic threshold selection for segmenting meaningful objects out of a given image is adopted in this study and is reviewed here [17]. The approach can automatically and deterministically select multiple thresholds without iterations or searches. A bi-level thresholding solution is obtained. The details are described in the following.

Given an image f with n pixels whose gray value at pixel (x, y) is denoted by f(x, y), the ith moment mi of f is defined as

$$m_i = \frac{1}{n} \sum_x \sum_y f^i(x, y), \quad i = 1, 2, 3. \tag{4.1}$$

Moments can also be computed from the histogram of f in the following way:

$$m_i = \frac{1}{n} \sum_j n_j (z_j)^i = \sum_j p_j (z_j)^i, \tag{4.2}$$

where nj is the total number of pixels in f with gray value zj and pj = nj/n. The approach also defines m0 to be 1. Image f can be considered as a blurred version of an ideal bi-level image which consists of pixels with only two gray values z0 and z1, where z0 < z1. The adopted moment-preserving thresholding scheme is to select a threshold value such that if all below-threshold gray values in f are replaced by z0 and all above-threshold gray values are replaced by z1, then the first three moments of image f are preserved in the resulting bi-level image g.

To find the desired threshold value t, we can solve two equations described by Eqs. (4.1) and (4.2) above to obtain p0 and p1, as described in the following [17]:
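For reference, the closed-form bi-level solution that follows from the moment-preserving conditions can be written as below. This is a sketch reconstructed from those conditions with m0 = 1; the auxiliary symbols c0, c1, and cd are introduced here for illustration and are not taken from the surviving text.

```latex
\begin{aligned}
p_0 z_0^{\,i} + p_1 z_1^{\,i} &= m_i, \qquad i = 0, 1, 2, 3, \\
c_d &= m_2 - m_1^2, \qquad
c_0 = \frac{m_1 m_3 - m_2^2}{c_d}, \qquad
c_1 = \frac{m_1 m_2 - m_3}{c_d}, \\
z_0 &= \tfrac{1}{2}\left(-c_1 - \sqrt{c_1^2 - 4 c_0}\right), \qquad
z_1 = \tfrac{1}{2}\left(-c_1 + \sqrt{c_1^2 - 4 c_0}\right), \\
p_0 &= \frac{z_1 - m_1}{z_1 - z_0}, \qquad p_1 = 1 - p_0 .
\end{aligned}
```

Here z0 and z1 are the two roots of z² + c1 z + c0 = 0. In the original method [17], the threshold t is then chosen as the p0-tile of the histogram of f.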

Note that p0 and p1 denote the fractions of the below-threshold pixels and the above-threshold pixels in f, respectively; p0 + p1 = 1; and z0 and z1 can be regarded as the representative gray values for the below-threshold and the above-threshold pixels, respectively. Because a high-resolution omni-camera is used in this study, acquiring the histogram of an omni-image imposes a heavy computation load. We therefore use an alternative method to obtain the threshold value t in this study. An illustration of a histogram with parameters z0, z1, and t is shown in Figure 4.1.

Figure 4.1 A conceptual illustration of a histogram with parameters z0, z1, and t.

Accordingly, we may use the following equation to approximate the value t to speed up the computation:

$$t = z_0 + (z_1 - z_0) \times p_0. \tag{4.3}$$

Our experimental results also support the validity of using the above equation to compute the value of t.
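The computation of t by Eq. (4.3) can be sketched as follows. The sketch assumes the standard closed-form bi-level solution of the moment-preserving conditions (with m0 = 1) and computes the moments directly from the pixel values, so that no histogram is needed; the function name and the flat-list input format are illustrative choices, not part of the original system.

```python
import math

def moment_preserving_threshold(pixels):
    """Return (t, z0, z1, p0) for a flat list of gray values.

    t is the approximation of Eq. (4.3): t = z0 + (z1 - z0) * p0.
    """
    n = len(pixels)
    # First three moments of the image, Eq. (4.1); m0 is defined to be 1.
    m1 = sum(pixels) / n
    m2 = sum(g ** 2 for g in pixels) / n
    m3 = sum(g ** 3 for g in pixels) / n

    cd = m2 - m1 * m1
    if cd == 0:
        raise ValueError("image has a single gray value; no threshold exists")
    # z0 and z1 are the roots of z^2 + c1*z + c0 = 0.
    c0 = (m1 * m3 - m2 * m2) / cd
    c1 = (m1 * m2 - m3) / cd
    root = math.sqrt(max(c1 * c1 - 4.0 * c0, 0.0))
    z0 = 0.5 * (-c1 - root)
    z1 = 0.5 * (-c1 + root)

    p0 = (z1 - m1) / (z1 - z0)   # fraction of below-threshold pixels
    t = z0 + (z1 - z0) * p0      # Eq. (4.3)
    return t, z0, z1, p0
```

For an ideal bi-level image, the sketch recovers the two gray values and the pixel fraction exactly, and t falls between z0 and z1 in proportion to p0.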

4.2.2 Dynamic offsetting

Dynamic offsetting is an approach used in this study to solve the problem in object detection caused by light intensities that vary with weather conditions. To make the intensities of two omni-images similar by dynamic offsetting, we use the following algorithm, where a pixel of an image C at image coordinates (u, v) is denoted as pC(u, v).

Algorithm 4.1 Intensity normalization by dynamic offsetting.

Input: two grayscale omni-images A and B both with size n (the number of pixels).

Output: a modified version A′ of grayscale omni-image A, whose intensity is normalized based on B.

modified version A′ of A as follows:

$$p_{A'}(u, v) = p_A(u, v) - M_{offset}. \tag{4.6}$$

After the completion of the above steps, the intensities of the two omni-images A′ and B will be similar.
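A possible realization of this normalization is sketched below. The intermediate steps of Algorithm 4.1 are not available here, so the sketch assumes that the offset is the difference between the mean intensities of A and B; this definition of the offset, the clamping to the range 0 to 255, and the function name are all assumptions.

```python
def dynamic_offset(image_a, image_b):
    """Return A' = A - M_offset, where images are lists of rows of gray values.

    ASSUMPTION: M_offset is taken as mean(A) - mean(B), so that A' has
    approximately the same mean intensity as B.
    """
    flat_a = [p for row in image_a for p in row]
    flat_b = [p for row in image_b for p in row]
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    m_offset = mean_a - mean_b

    # Apply Eq. (4.6) to every pixel; clamp to the valid 8-bit range
    # (the clamping is an assumption, not stated in the text).
    return [[min(255, max(0, round(p - m_offset))) for p in row]
            for row in image_a]
```

With this choice of offset, a uniformly brighter copy of B is mapped back onto B.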

4.3 Estimation of a Passer-by’s Distance and Height Information

The proposed method for extraction of a passer-by’s information consists of three major stages: (1) moving object extraction; (2) acquisition of the passer-by’s head; and (3) estimation of the passer-by’s 3D data. In the first stage, we use the moment-preserving thresholding technique proposed by Tsai [17] and dynamic offsetting, as described in Section 4.2, to acquire moving objects from an omni-image. The details are described in Section 4.3.1. In the second stage, we propose an algorithm to detect a passer-by’s head in the omni-image using a specific property of an omni-camera, as described in Section 4.3.2. In the final stage, we use the 3D data acquisition technique of Section 2.3.3 to estimate the relevant 3D data of the passer-by. The details are described in Section 4.3.3.

4.3.1 Detection of moving objects in an omni-image

Before moving objects around a video surveillance car are extracted, omni-images without unnecessary objects should be captured; each such image is called a background image. An example of background images is shown in Figure 4.2. To detect moving objects, each omni-camera must capture the current image, and each current image is called a foreground image.

Figure 4.2 Background images of a two-camera omni-directional imaging device. (a) A background taken by an upper omni-camera. (b) A background taken by a lower omni-camera.

First, we transform the background and foreground images, which are color images, into two grayscale omni-images. To keep varying light intensities from degrading the accuracy of object detection, we use the dynamic offsetting technique described in Section 4.2.2 to make the intensities of the two omni-images similar. Second, by subtracting the background image from the foreground one, we obtain all differences between the two images. Third, because there is a lot of noise in the surveillance area, such as that caused by light variations, we threshold the difference image with an appropriate threshold parameter TH to eliminate noise. The value of TH used in this study is a dynamic threshold value yielded by the moment-preserving thresholding method proposed by Tsai [17] and described in Section 4.2.1. Finally, if the difference value of a pixel is larger than TH, it is recorded as “1”; otherwise, as “0”. This process is the so-called bi-level thresholding. At the end, we obtain a binary image IBI with detected moving objects labeled as “1”.
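The subtraction and bi-level thresholding steps can be sketched as follows. The sketch assumes that both images are already grayscale and intensity-normalized, that the difference is taken as an absolute value (an assumption; the text only says the images are subtracted), and that the threshold TH has been computed beforehand, for example by moment-preserving thresholding of the difference image.

```python
def detect_moving_objects(background, foreground, th):
    """Return the binary image I_BI as a list of rows of 0/1 values.

    background, foreground: grayscale images of equal size (lists of rows).
    th: threshold TH applied to the difference image.
    """
    binary = []
    for f_row, b_row in zip(foreground, background):
        # ASSUMPTION: absolute difference, so darker and brighter
        # changes are treated alike.
        binary.append([1 if abs(f - b) > th else 0
                       for f, b in zip(f_row, b_row)])
    return binary
```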

A sequence of images yielded as intermediate results of the above-mentioned object detection process is shown in Figure 4.3. Figure 4.3(d) is the resulting binary image IBI. The result also shows that Eq. (4.3), which we proposed, is feasible for obtaining the threshold t.

Figure 4.3 Related images of passer-by detection. (a) Background image. (b) Foreground image. (c) The difference image obtained after the subtraction process. (d) The binary image obtained by moment-preserving thresholding [17].

4.3.2 Detection of a passer-by’s head by component labeling

Before introducing how to detect a passer-by’s head, we prove a specific property of an omni-camera, as shown in Figure 4.4. The property is that each line L that is parallel to the Z-axis in the WCS is projected onto the image as a line IL that goes through the omni-image center (Oc).

  Figure 4.4 A specific property of an omni-camera.

An explanation is as follows. Owing to the rotation-invariant property of the omni-image, the azimuth angle formed by a light ray with respect to the X-axis in the WCS is identical to the azimuth angle θ of the corresponding pixel with respect to the U-axis in the omni-image. Because the Z-axis is vertical to the ground and L is parallel to the Z-axis, all the points (e.g., P and Q) on L have the same azimuth angle with respect to the X-axis in the WCS. Therefore, these points are projected onto points (e.g., IP and IQ) with an identical azimuth angle θ with respect to the U-axis in the ICS, forming a line IL in the omni-image. This line, when extended, goes through the center of the omni-image, Oc.

A more theoretical verification of the above fact is as follows. By using minimum distance estimation with ρP and ρQ in the corresponding r-ρ Table described in Section 4.3.3, the corresponding radial lengths rP and rQ can be derived, respectively. Because the azimuth angles θ are the same, the corresponding pixels in the omni-images can be described as follows:

$$I_P = (r_P \cos\theta,\ r_P \sin\theta); \quad I_Q = (r_Q \cos\theta,\ r_Q \sin\theta). \tag{4.7}$$

The line IL through IP and IQ has the slope sin θ / cos θ and can therefore be written as

$$v = \frac{\sin\theta}{\cos\theta}\, u + b. \tag{4.8}$$

Because the line passes through the point IP, substituting the coordinates of IP into Eq. (4.8) leads to the following equation:

$$r_P \sin\theta = \frac{\sin\theta}{\cos\theta}\, r_P \cos\theta + b = r_P \sin\theta + b. \tag{4.9}$$

Thus, b = 0 can be obtained, and the line of Eq. (4.8) can be expressed as follows:

$$v = \tan\theta \times u. \tag{4.10}$$

As a result, it is easy to see that the line goes through the center of the omni-image, Oc, with coordinates (u, v) = (0, 0).

Every passer-by around the video surveillance car stands on the ground, which means that his/her body is approximately vertical to the ground. As illustrated in Figure 4.5, according to the property proved above, the midline IL of a passer-by (i.e., the axis of his/her body) goes through the center of the omni-image Oc. As a result, the process of finding the top of a passer-by’s head is simplified, as described in Algorithm 4.2. The result of applying passer-by head detection to the images of Figure 4.3 is shown in Figure 4.6.

 

Figure 4.5 The midline of a passer-by through the center of the omni-image.

Algorithm 4.2 Finding the top of a passer-by’s head.

Input: A binary image IBI which is obtained as described in Section 4.3.1.

Output: The position of the passer-by’s head.

Steps:

Step 1. Initialize the position of the passer-by’s head to be 0 (the minimum distance).

Step 2. Scan each line in the radial direction through the image center based on the polar coordinates (θ, r), where θ is the azimuth angle and r is the radial length. If the whole omni-image has been scanned, output the position of the passer-by’s head and finish the process.

Step 3. If the line contains a segment of fifteen or more continuous points, called a continuous segment, continue; else, go to Step 2.

Step 4. Find the farthest point, with respect to the image center, of each continuous segment in the line.

Step 5. If the farthest point is farther from the image center than the recorded position of the passer-by’s head, record it as the new head position. Go to Step 2.
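The steps above can be sketched as follows. The one-degree angular step, the one-pixel radial step, and the nearest-pixel sampling are assumptions made for illustration, as is the function name; the sketch counts sample points along each radial line, so it only approximates a count of distinct pixels.

```python
import math

def find_head_top(binary, center, min_run=15, angle_step_deg=1):
    """Return (theta_deg, r) of the head top in polar image coordinates.

    binary: image I_BI as a list of rows of 0/1 values.
    center: (cx, cy) pixel coordinates of the omni-image center Oc.
    Returns r = 0 if no continuous segment is found.
    """
    cx, cy = center
    height, width = len(binary), len(binary[0])
    best_theta, best_r = 0, 0                          # Step 1
    for theta_deg in range(0, 360, angle_step_deg):    # Step 2
        cos_t = math.cos(math.radians(theta_deg))
        sin_t = math.sin(math.radians(theta_deg))
        run, r = 0, 0
        while True:
            x = cx + int(round(r * cos_t))
            y = cy + int(round(r * sin_t))
            inside = 0 <= x < width and 0 <= y < height
            if inside and binary[y][x] == 1:
                run += 1
            else:
                # A segment (if any) has just ended at radius r - 1.
                # Steps 3-5: keep its far end if the segment is long
                # enough and farther than the current record.
                if run >= min_run and r - 1 > best_r:
                    best_theta, best_r = theta_deg, r - 1
                run = 0
            if not inside:
                break
            r += 1
    return best_theta, best_r
```

The pixel position of the head top can be recovered from the result as (cx + r cos θ, cy + r sin θ).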

Figure 4.6 A result of passer-by’s head detection. (The top of the passer-by’s head is marked in red.)

4.3.3 Calculation of a passer-by’s distance and height in 3D space

At the beginning, note that we construct an r-ρ Table using the radial stretching function of Section 3.2 in the learning process. An r-ρ Table records the correspondence between the radial lengths r and the elevation angles ρ of an omni-camera, as shown in Table 4.1.

Table 4.1 r-ρ Table

r    r1    r2    r3    …    rn
ρ    ρ1    ρ2    ρ3    …    ρn

To calculate a passer-by’s relevant 3D data, we apply the process described in Section 4.3.2 to the image of each omni-camera of a two-camera omni-directional imaging device. We thereby get the two positions of the passer-by’s head (Pup, Pdown) in the two omni-images taken by the device. When the two positions Pup (xup, yup) and Pdown (xdown, ydown) are obtained, we can derive the radial distances as follows:

$$r_{up} = \sqrt{x_{up}^2 + y_{up}^2}; \quad r_{down} = \sqrt{x_{down}^2 + y_{down}^2}. \tag{4.11}$$

By using minimum distance estimation with r in the corresponding r-ρ Table, the corresponding elevation angles ρup, ρdown can be derived.
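The table lookup by minimum distance estimation can be sketched as follows. The representation of the r-ρ Table as a list of (r, ρ) pairs and the function names are assumptions, and the head position is assumed to be given relative to the omni-image center.

```python
import math

def radial_length(x, y):
    """Radial length of a head position (x, y), as in Eq. (4.11)."""
    return math.hypot(x, y)

def lookup_elevation(r, r_rho_table):
    """Return the rho_i whose r_i is closest to r.

    r_rho_table: list of (r_i, rho_i) pairs (hypothetical layout of the
    r-rho Table built in the learning process).
    """
    _, rho = min(r_rho_table, key=lambda entry: abs(entry[0] - r))
    return rho
```

The same lookup is applied once with rup on the table of the upper camera and once with rdown on the table of the lower camera.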

Because the two omni-cameras are combined coaxially in the longitudinal direction and their camera coordinates (X, Y, Z) have the same axis directions, the two azimuth angles (θup, θdown) of Pup and Pdown are the same by the rotation-invariant property of the omni-camera. However, because of mechanical errors, a difference of less than 10 degrees between θup and θdown is accepted. Thus, if θup and θdown are matched within this tolerance of angular error, we can use the 3D data acquisition method proposed in Section 2.3.3 to estimate a passer-by’s position (X, Y, Z) with respect to the center of the upper omni-camera Cup in the WCS. In more detail, with HCup denoting the height of Cup, the passer-by’s height Hpasser-by can be computed as follows:

Hpasser-by = HCup − Z, (4.12)

where HCup in this study is 256 cm, as measured manually in our experiment.
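The angular matching test and Eq. (4.12) can be sketched as follows. The value Z is taken as given by the 3D data acquisition of Section 2.3.3, which is not reproduced here; the function names and the handling of the wrap-around at 360 degrees are assumptions.

```python
def azimuths_match(theta_up_deg, theta_down_deg, tolerance_deg=10.0):
    """True if the two azimuth angles differ by less than the tolerance."""
    diff = abs(theta_up_deg - theta_down_deg) % 360.0
    diff = min(diff, 360.0 - diff)   # e.g. 355 and 3 degrees differ by 8
    return diff < tolerance_deg

def passer_by_height(z_cm, camera_height_cm=256.0):
    """Eq. (4.12): H_passer-by = H_Cup - Z, all values in centimeters."""
    return camera_height_cm - z_cm
```

For example, with the measured camera height of 256 cm, a value of Z = 86 cm gives a height of 170 cm.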

To summarize, the flowchart in Figure 4.7 shows the overall process used to detect a passer-by in this study.

  Figure 4.7 An overview of passers-by detection proposed in this study.

Chapter 5

Integration of Two Omni-images into a Top-view Image with a Pair of Two-camera Omni-directional Imaging Devices

5.1 Introduction

In order to expand the range of surveillance, two pairs of two-camera omni-directional imaging devices are used in this study. A top-view image around a video surveillance car which comes from merging the two omni-images taken by the upper cameras in the two pairs of two-camera omni-directional imaging devices is available for users in the car. In this chapter, we describe all the relevant methods which are used in computing such an integrated top-view image.

The remainder of this chapter is organized as follows. In Section 5.2, we introduce the techniques we propose for construction of such an integrated top-view image. In Section 5.3, we describe the techniques we propose for superimposition of the video surveillance car shape and filling of ground texture in the top-view image.

5.2 Construction of a Top-view Image

5.2.1 Construction of a top-view image with an omni-camera

Because the view range of the upper camera in a two-camera omni-directional imaging device is wider than that of the lower one, we use the upper omni-camera to construct a top-view image. The height of the omni-camera affixed on the video surveillance car roof is known in this study, so we can derive a simple top-view image by assuming that all pixels in the omni-image are on the ground.

Figure 5.1 illustrates the geometric relationship between a point on the ground and the upper omni-camera. A straightforward derivation (forward mapping) is as follows. Note that forward mapping means mapping the pixels in an omni-image to those of the top-view image. Specifically, for any pixel p at coordinates (u, v) in the ICS, the radial distance r can be computed as follows:

$$r = \sqrt{u^2 + v^2}, \tag{5.1}$$

and the azimuth angle θ as follows:

$$\theta = \cos^{-1}\frac{u}{r} = \sin^{-1}\frac{v}{r}. \tag{5.2}$$

By using minimum-distance estimation with r in the corresponding r-ρ Table as described in Section 4.3.3, the corresponding elevation angle ρ can be derived. More specifically, we find the ri in the r-ρ Table with the minimum distance to r, and regard the corresponding ρi of ri in the table as the elevation angle ρ of r. Then, from the known height of the mirror center dh and the elevation angle ρ so computed, as shown in Figure 5.1, the horizontal distance dw between a scene point and the mirror base center can be computed as follows:

$$\mathit{dw} = \mathit{dh} \times \cot\rho. \tag{5.3}$$

Accordingly, by the rotation-invariant property of the omni-camera, the position (x, y) of a point P on the ground in the WCS can be obtained from Eqs. (5.2) and (5.3) as follows:

$$x = \mathit{dw} \cos\theta; \quad y = \mathit{dw} \sin\theta. \tag{5.4}$$

By mapping all pixels in the omni-image in this way, a top-view image can be obtained.
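Forward mapping of a single pixel by Eqs. (5.1) through (5.4) can be sketched as follows. The list-of-pairs layout of the r-ρ Table, the use of degrees for ρ, and the function name are assumptions; `atan2` is used in place of the inverse cosine and sine of Eq. (5.2) so that the quadrant of θ is resolved automatically.

```python
import math

def forward_map(u, v, r_rho_table, dh):
    """Map an omni-image pixel (u, v) in the ICS to a ground point (x, y).

    r_rho_table: list of (r_i, rho_i) pairs with rho_i in degrees and
                 rho_i > 0 (hypothetical layout of the r-rho Table).
    dh: height of the mirror center above the ground.
    """
    r = math.hypot(u, v)                               # Eq. (5.1)
    theta = math.atan2(v, u)                           # Eq. (5.2)
    # Minimum distance estimation: take rho_i of the r_i closest to r.
    _, rho_deg = min(r_rho_table, key=lambda e: abs(e[0] - r))
    dw = dh / math.tan(math.radians(rho_deg))          # Eq. (5.3)
    return dw * math.cos(theta), dw * math.sin(theta)  # Eq. (5.4)
```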

Figure 5.1 The ray tracing of a scene point P on the ground with a hyperbolic-shaped mirror.

However, forward mapping leads to a “broken” image with many unfilled points, as shown in Figure 5.2(b) (the black portion in the image). Thus, we derive a backward mapping for computing a “complete” top-view image. Note that backward mapping means mapping all pixels in a top-view image to those of an omni-image.

First, we may compute dw for every pixel with coordinates (x, y) in the WCS as follows:

$$\mathit{dw} = \sqrt{x^2 + y^2}, \tag{5.5}$$

and the azimuth angle θ can be derived as follows:

$$\theta = \cos^{-1}\frac{x}{\mathit{dw}} = \sin^{-1}\frac{y}{\mathit{dw}}. \tag{5.6}$$

Eq. (5.3) can be rewritten to derive the elevation angle as follows:

$$\rho = \tan^{-1}\frac{\mathit{dh}}{\mathit{dw}}. \tag{5.7}$$

By minimum distance estimation with ρ in the corresponding r-ρ Table as described in Section 4.3.3, the corresponding radial distance r can be derived. Accordingly, by the rotation-invariant property of the omni-camera, the image coordinates (u, v) of the corresponding pixel p in the ICS can be obtained from Eq. (5.6) as follows:

$$u = r \cos\theta; \quad v = r \sin\theta. \tag{5.8}$$

As a result, a complete top-view image can be obtained, as shown in Figure 5.2(c).
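Backward mapping of a single ground point by Eqs. (5.5) through (5.8) can be sketched in the same way. As before, the table layout, the use of degrees for ρ, and the function name are assumptions, and `atan2` replaces the inverse trigonometric forms so that dw = 0 and all four quadrants are handled.

```python
import math

def backward_map(x, y, r_rho_table, dh):
    """Map a ground point (x, y) in the WCS to omni-image coordinates (u, v).

    r_rho_table: list of (r_i, rho_i) pairs with rho_i in degrees
                 (hypothetical layout of the r-rho Table).
    dh: height of the mirror center above the ground.
    """
    dw = math.hypot(x, y)                            # Eq. (5.5)
    theta = math.atan2(y, x)                         # Eq. (5.6)
    rho_deg = math.degrees(math.atan2(dh, dw))       # Eq. (5.7)
    # Minimum distance estimation: take r_i of the rho_i closest to rho.
    r, _ = min(r_rho_table, key=lambda e: abs(e[1] - rho_deg))
    return r * math.cos(theta), r * math.sin(theta)  # Eq. (5.8)
```

Because every top-view pixel is assigned exactly one source pixel, no unfilled points remain in the result.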

Figure 5.2 An omni-image and its corresponding top-view images. (a) An omni-image. (b) A top-view image obtained from forward mapping. (c) A top-view image obtained from backward mapping.

To improve the program speed, the correspondences of all pixels between an omni-image and a top-view image are stored in a table, called TopviewTable, in the learning process. Accordingly, constructing a top-view image does not require evaluating the above-mentioned equations again and again, but only looking up the corresponding entries in the TopviewTable.
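The idea of such a lookup table can be sketched as follows. The table is built once by backward mapping and is then reused for every frame. The square top-view size, the scale `cm_per_pixel`, the pixel-center conventions, and the function names are assumptions made for illustration; only the name TopviewTable comes from the text.

```python
import math

def build_topview_table(size, cm_per_pixel, r_rho_table, dh, omni_center):
    """Return table[row][col] = (u, v), the omni-image pixel for each
    top-view pixel, computed once by backward mapping."""
    half = size // 2
    cu, cv = omni_center
    table = []
    for row in range(size):
        line = []
        for col in range(size):
            # Ground coordinates of this top-view pixel (camera at center).
            x = (col - half) * cm_per_pixel
            y = (row - half) * cm_per_pixel
            dw = math.hypot(x, y)
            theta = math.atan2(y, x)
            rho_deg = math.degrees(math.atan2(dh, dw))
            r = min(r_rho_table, key=lambda e: abs(e[1] - rho_deg))[0]
            line.append((cu + int(round(r * math.cos(theta))),
                         cv + int(round(r * math.sin(theta)))))
        table.append(line)
    return table

def render_topview(omni, table):
    """Fill a top-view image by table lookup only (0 where out of range)."""
    h, w = len(omni), len(omni[0])
    return [[omni[v][u] if 0 <= u < w and 0 <= v < h else 0
             for (u, v) in line] for line in table]
```

Only `render_topview` needs to run per frame, which is the saving the TopviewTable is meant to provide.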

5.2.2 Calculation of relative position of two omni-cameras

Figure 5.4 is an illustration of the layout of the video surveillance car roof, where the length and the width of the car roof are obtained by manual measurement.

Accordingly, we can easily get the relative position between the pair of two-camera omni-directional imaging devices from the layout we designed. The red circles in the layout are the positions of our devices. If the front device is taken as the origin, the offset between the two devices is (−110, −330).

Figure 5.4 An illustration of the layout of the video surveillance car roof.

5.2.3 Merging of two top-view images into a single one

At the beginning, we construct two top-view images using the omni-images taken from the upper cameras in the pairs of two-camera omni-directional imaging devices, respectively. To make the merging construction simple and fast, we divide a top-view image around the video surveillance car into two parts. As shown in Figure 5.4, CW is the width of the relative distance between the two upper omni-cameras, and CL is the length of the relative distance between the two upper omni-cameras. One part is the front half top-view of the car surrounding (covering the image’s upper part
