Chapter 1 Introduction
1.5 Thesis Organization
In the remainder of this thesis, we introduce the system configuration and the idea of the proposed methods in Chapter 2, where the design of the camera system and the method to obtain stereo information are also described. In Chapter 3, we describe the construction of the pano-mapping tables by the space-mapping technique and the use of these tables for unwarping an omni-image into multiple perspective-view images. In Chapter 4, the proposed methods for computing the car moving direction and for generating the corresponding perspective-view image are described. In Chapter 5, the proposed method for detecting a static surrounding car is described. In Chapter 6, the proposed method for detecting a moving surrounding car is described. In Chapter 7, experimental results and discussions are included. Finally, conclusions and some suggestions for future work are given in Chapter 8.
Chapter 2
Idea of Proposed Methods and System Design
2.1 Idea of Analyzing Surrounding Environment and Vehicles
In order to monitor the surrounding environment of the video surveillance vehicle, we choose omni-cameras instead of traditional projective cameras to acquire environment images. The acquired omni-images can be used to generate corresponding panoramic images and so provide necessary information for security monitoring or driving assistance. In this study we affix a pair of two-camera omni-imaging devices to the surveillance vehicle roof for this purpose as shown in Figure 2.1. Each device includes two omni-cameras aligned coaxially and back to back, as mentioned previously.
Figure 2.1 The video surveillance vehicle used in this study with a pair of two-camera omni-directional devices affixed on the car roof. (a) A front view of the video surveillance vehicle. (b) A side view of the video surveillance vehicle.
An advantage of the mobility of the video surveillance vehicle is that we can move the entire system anywhere to conduct surveillance work. In addition, to obtain as much useful view as possible, we decided to affix one omni-imaging device at the right-front position of the surveillance vehicle roof and the other at the left-rear. As illustrated in Figure 2.2, if we instead affixed a device at the front (or back) middle of the vehicle roof, half of the acquired omni-image would be useless, covering just the roof of the vehicle.
Figure 2.2 Positions of cameras affixed to the video surveillance vehicle roof and the corresponding FOV. (a) The omni-camera is affixed at the rear-middle of the car roof. (b) The omni-camera is affixed at the right-rear of the car roof.
Many car accidents occur because the driver overlooks "blind spots" that cannot be seen in the mirrors mounted inside and outside the car. To show the views of these blind spots, we can use the above-mentioned pair of omni-imaging devices on the surveillance vehicle roof to generate perspective-view images around the car in any driver-specified view direction. The construction of the perspective-view image from an omni-image conducted in this study is based on the space-mapping method proposed by Jeng and Tsai [8].
In addition, when a driver wants to turn to the left or to the right, the blind spots behind the surveillance vehicle are apt to be neglected. Therefore, we analyze the motion vectors produced by the optical flow method in consecutively acquired images. With these motion vectors, we can estimate the vehicle moving direction and show the corresponding perspective-view image. During driving, we can also store the images captured with the omni-cameras. As a driving recorder, the system can then display these images in sequence, or let the user choose a view direction and display the corresponding perspective-view image for closer observation.
Furthermore, in order to detect a static car parked at the nearby roadside and compute its stereo information, we use the color feature to separate the car region from the ground in the acquired omni-image. In doing this, we assume that the background is uncomplicated, with almost the same color as that of an asphalt road, and that the color of the detected car differs from that of the ground. Therefore, the car shape can be extracted as the foreground by eliminating the ground color.
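The ground-color elimination described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation used in this study: the function name `extract_foreground`, the reference ground color, the Euclidean distance in RGB space, and the threshold value are all hypothetical choices introduced here.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def extract_foreground(image: List[List[Pixel]],
                       ground_color: Pixel,
                       threshold: float = 40.0) -> List[List[bool]]:
    """Return a mask that is True where a pixel differs from the ground color.

    Hypothetical helper: the distance measure (Euclidean in RGB) and the
    threshold are illustrative assumptions, not values from the thesis.
    """
    mask = []
    for row in image:
        mask_row = []
        for r, g, b in row:
            dist = ((r - ground_color[0]) ** 2 +
                    (g - ground_color[1]) ** 2 +
                    (b - ground_color[2]) ** 2) ** 0.5
            mask_row.append(dist > threshold)
        mask.append(mask_row)
    return mask


# Tiny 2x3 example: gray asphalt-like pixels and two red "car" pixels.
asphalt = (90, 90, 90)
image = [
    [(92, 88, 91), (200, 30, 30), (89, 90, 93)],
    [(91, 91, 90), (210, 25, 35), (88, 92, 90)],
]
mask = extract_foreground(image, asphalt)
```

A fixed threshold of this kind only works under the stated assumption of a uniform, asphalt-colored background; a car whose color is close to that of the road would not be separated by it.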
Moreover, we also want to obtain the stereo information of the detected car. For this, the corresponding points of the bottom-window edge of the car in a pair of images captured with the upper omni-camera and the lower omni-camera are chosen. Then, the image data of these points are used to compute the desired stereo information. However, outlier points may introduce errors into the computed stereo information, so they are eliminated by a linear regression method in this study. As a result, we can compute the location of the car by using the image data of the remaining points, and generate a surround map accordingly. An example of the result of this process is shown in Fig. 2.3.
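One way to eliminate outliers with linear regression is sketched below. The text does not state which quantities are regressed or which rejection rule is used, so everything here is an assumption: generic (x, y) points, an ordinary least-squares line, and a rule that drops points whose residual exceeds k times the root-mean-square residual. The names `fit_line` and `remove_outliers` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def fit_line(points: List[Point]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = m*x + c."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    m = sxy / sxx  # assumes the x values are not all identical
    return m, mean_y - m * mean_x


def remove_outliers(points: List[Point], k: float = 2.0) -> List[Point]:
    """Drop points whose residual exceeds k times the RMS residual.

    The rejection rule and the value of k are illustrative assumptions.
    """
    m, c = fit_line(points)
    residuals = [y - (m * x + c) for x, y in points]
    rms = (sum(r * r for r in residuals) / len(residuals)) ** 0.5
    if rms == 0.0:
        return list(points)
    return [p for p, r in zip(points, residuals) if abs(r) <= k * rms]


# Points on the line y = x, except for one outlier at x = 3.
pts = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 10.0),
       (4.0, 4.0), (5.0, 5.0), (6.0, 6.0), (7.0, 7.0)]
inliers = remove_outliers(pts)
m, c = fit_line(inliers)  # refit on the remaining points
```

After the outlier is dropped, the line is refitted on the remaining points only, which mirrors the idea of computing the car location from the remaining points.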
Finally, we use motion vectors to analyze the acquired omni-images for several purposes. Such motion vectors are produced directly from consecutive omni-images when objects of interest move in the omni-images. Specifically, the angles and the lengths of these motion vectors are almost all equal when the vehicle is driven on a flat field with roughly identical texture everywhere. Accordingly, if another car drives alongside and gradually overtakes the surveillance vehicle, the angles of the motion vectors of that car will differ from those of the motion vectors produced by the rest of the environment. This characteristic can thus be utilized to detect a nearby car in an acquired omni-image.
Another feature used in this study is motion vector length. If a concerned object is higher than the ground, the lengths of the motion vectors yielded by it will be longer than those yielded by the ground. After roughly locating the car using this feature, we use a third feature, the color of the monitored car, to grow the car region in the omni-image, and compute accordingly the location of the car by the use of a mask model of the car. Finally, a surround map is generated to show the relative position of the nearby car with respect to the surveillance vehicle from the top view for driving assistance.
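The angle feature can be illustrated with a short sketch. It is not the detection procedure of this study: the dominant direction is estimated here as the direction of the sum of unit vectors (a circular mean), and the 30-degree tolerance is an arbitrary value. The names `angle_difference` and `flag_deviating_vectors` are hypothetical, and the length and color features are not modeled.

```python
import math
from typing import List, Tuple

Vector = Tuple[float, float]  # (dx, dy) of one optical-flow motion vector


def angle_difference(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in radians."""
    return abs(math.atan2(math.sin(a - b), math.cos(a - b)))


def flag_deviating_vectors(vectors: List[Vector],
                           angle_tol_deg: float = 30.0) -> List[bool]:
    """Mark vectors whose direction departs from the dominant direction.

    The dominant direction is taken as the direction of the sum of the
    unit vectors; this choice and the tolerance are assumptions.
    """
    angles = [math.atan2(dy, dx) for dx, dy in vectors]
    dominant = math.atan2(sum(math.sin(a) for a in angles),
                          sum(math.cos(a) for a in angles))
    tol = math.radians(angle_tol_deg)
    return [angle_difference(a, dominant) > tol for a in angles]


# Background flow points left; two vectors from an overtaking car point right.
flow = [(-5.0, 0.1), (-5.2, -0.1), (-4.9, 0.0), (-5.1, 0.2),
        (-5.0, -0.2), (-4.8, 0.1), (6.0, 0.3), (6.5, -0.2)]
flags = flag_deviating_vectors(flow)
```

The wrap-around handling in `angle_difference` matters because background vectors pointing near 180 degrees may have angles of opposite sign while still being almost parallel.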
Figure 2.3 An example of static nearby car detection. (a) An omni-image of a static car parked at the nearby roadside. (b) A generated surround map showing the relative position from the top view. Note that the direction of an object is reversed by 180° in the omni-image when compared with the real situation as illustrated in (b).
2.2 System Configuration
In this section, we describe the video surveillance system in detail. The proposed system is divided into three main parts. The first part is the hardware, which includes a video surveillance vehicle, a pair of two-camera omni-directional imaging devices, and two laptop computers. The second part is the software, for which we introduce the software development environment and the accompanying SDK and driver programs for the CCD cameras. The third part is the network. Because we use two laptops, each handling one of the two-camera omni-directional imaging devices, a local network is used for communication between the two laptops.
2.2.1 Hardware configuration
The surveillance vehicle, named Delica, is made by Mitsubishi Co. It is a 469cm × 169cm × 196cm vehicle with a working table and a power supply. System operators may sit inside the surveillance vehicle to operate the laptop computers and monitor the entire surrounding environment. Moreover, a steel frame is affixed to the car roof, on which the omni-imaging devices can be mounted. In addition, four USB extension cords running through the vehicle were added to receive the images captured with the two omni-imaging devices. Detailed descriptions of the imaging devices will be given in Section 2.3. The entire video surveillance system is shown in Fig. 2.4.
In order to control the entire video surveillance system, in this study we use two laptops as control units, each handling an omni-imaging device. Both laptops are produced by TOSHIBA Computer Inc., and their detailed specifications are listed in Table 2.1. To exchange commands and images between the two laptops, we use a cross-over cable to connect them and set up a local network for between-computer communication.
[Diagram labels: Local Network; Computer A; Computer B; Cross-over cable; Camera System A; Camera System B; both camera systems affixed on the video surveillance car]
Figure 2.4 Structure of the proposed surveillance system.
Table 2.1 Specifications of the laptop computers used in this study.
           Tecra M11                          Satellite A660
CPU        Intel Core i7-620M 2.66/3.33GHz    Intel Core i5-480M 2.66/2.93GHz
RAM        4G DDR3 1066MHz                    2G DDR3 1066MHz
GPU        nVidia NVS 2100M                   ATI HD5650
Network    Gigabit LAN                        Fast Ethernet LAN
2.2.2 Software configuration
We use Borland C++ Builder (BCB) V6 as the development platform to build our video surveillance system. BCB is a program development tool for the Windows operating system, with which we can create a graphical user interface (GUI) conveniently and quickly. The programming language we use is C++, a widely used language. One of the laptops, the Tecra M11, runs Windows 7, and the other, the Satellite A660, runs Windows XP.
Before developing the video surveillance system, we have to install the drivers of the ARTCAM-200SO cameras and those of the ARTCAM-200SS cameras on the laptop computers. The camera company also provides corresponding software development kits (SDKs) and some simple sample source code. Accordingly, we can adjust the parameters of each camera, such as the exposure value or the global color gain, through the SDK. The SDK is an object-oriented toolkit, and the camera company provides not only a BCB version but also C, VB.NET, C#.NET, and Delphi versions for programmers.
2.2.3 Network configuration
Figure 2.5 The network architecture of transmission between two laptops.
A network configuration is needed for communication between the two laptop computers because four omni-images are acquired from the pair of two-camera omni-directional imaging devices and each imaging device is handled by its own laptop. We therefore set up a local area network to send images and control signals between the two laptops.
As shown in Fig. 2.5, laptop computer COMB is used to display the perspective-view image and the acquired omni-image; therefore, COMB needs to receive these images from COMA through the local network. Moreover, the control signals of the selected view direction produced by COMA are sent to COMB for generating the corresponding perspective-view image.
2.3 Review of Adopted Camera System and 3D Data Acquisition Process
In this section, we review the adopted camera system and the corresponding 3D data acquisition process proposed in Yuan et al. [19]. First of all, we introduce the details of building the camera system. The entire system includes four lenses of model LV0612H, two CMOS cameras of model ARTCAM-200SO, and two CMOS cameras of model ARTCAM-200MI. Table 2.2 lists the specifications of the CMOS cameras.
To build an omni-camera, the most important task is to combine a projective CCD camera and a hyperboloidal-shaped mirror. In the design process of the omni-camera, an optics manufacturer was requested to produce the hyperboloidal-shaped mirrors. The parameters of each of the mirrors are described here. The radius r of the hyperboloidal-shaped mirror is 4cm. The projective camera has a focal length f of 6mm and a sensor width Sw of 2.4mm, and the axis of the camera is aligned with the axis of the hyperboloidal-shaped mirror. Therefore, by the principle of similar triangles, the distance d between the optical center of the lens and the mirror center can be computed from the following equation:
$$\frac{d}{r} = \frac{f}{S_w}. \qquad (2.1)$$
Also, as shown in Fig. 2.6(a), the hyperboloidal shape of the mirror in the camera coordinate system may be described as:
[equation missing in source]
[...] elevation angle α in Figure 2.6(a) can be obtained from the relation between the CCS and the ICS of an omni-camera system derived by Wu and Tsai [20] as follows.
[equation missing in source]
Furthermore, by the simple formula d = 2c, where c is the distance from the center O shown in Fig. 2.6 to the mirror center Om, the angles θ and β can be [...]
In Eq. (2.3), let the omni-camera have the largest FOV and the incidence angle α be set to 0; then, by using Eq. (2.4), the parameter b can be obtained by solving Eq. (2.3). Finally, the parameter a is derived from the following equation:
$$c = \sqrt{a^2 + b^2}. \qquad (2.5)$$
Figure 2.6 (a) Relation between the world coordinates and the image coordinates. (b) Geometry between the mirror and the CMOS sensor in the camera.
Each omni-camera was built with these parameter values, and a two-camera omni-directional device can be constructed with two omni-cameras aligned vertically.
Table 2.2 Specifications of the CMOS cameras used.
                     ARTCAM-200SO                ARTCAM-200MI
Resolution           2.0 M pixels (1600×1200)    2.0 M pixels (1600×1200)
Dimensions           33mm × 33mm × 50mm          33mm × 33mm × 50mm
CMOS sensor size     1/2" (6.4×4.8mm)            1/2" (6.4×4.8mm)
Mount                C-mount                     C-mount
Frames per second    8 fps                       5 fps
DirectShow camera    Yes                         No
Having described how the cameras are built, we now describe the adopted method to compute stereo information with a two-camera omni-directional imaging device. In the omni-imaging device, the relevant 3D data can be computed from two elevation angles and an azimuth angle of a scene point P. As shown in Figure 2.7(a), the point P is projected onto each hyperboloidal-shaped mirror and forms a pair of corresponding points in the upper image and the lower image captured with the two-camera omni-imaging device. The elevation angles of point P on the hyperboloidal-shaped mirrors are defined as α1 and α2, respectively. Also, the center of the upper hyperboloidal-shaped mirror is assumed to be the origin (0, 0, 0) of the world coordinates. It is desired now to compute the stereo depth data of point P in terms of the two elevation angles α1 and α2.
Figure 2.7 Computation of depth using the two-camera omni-directional imaging device. (a) The ray tracing of a scene point P in the imaging device with a hyperboloidal-shaped mirror. (b) A triangle in detail (part of (a)).
To obtain the stereo depth of a scene point P(x, y, z), the two elevation angles α1 and α2 must first be found by looking up a pano-mapping table; the construction of the pano-mapping table will be described in Chapter 3. As shown in Figure 2.7(b), the distance d between the point P and the upper mirror center c1 is computed by the triangulation principle illustrated in Figure 2.7(a) using the equation below:
$$\frac{d}{\sin(90^\circ + \alpha_2)} = \frac{b}{\sin(\alpha_1 - \alpha_2)}, \qquad (2.6)$$
where the parameter b is the baseline of the stereo imaging device. Eq. (2.6) may be reduced to the following equation by trigonometry:
[equation missing in source]
As a result, the horizontal distance dw and the vertical distance Z may be computed as follows:
[equation missing in source]
[...] image coordinates (u, v) in the image coordinate system (ICS). Then, we can use point I to calculate the azimuth angle. A triangle, illustrated in Figure 2.8, includes the azimuth angle between the X-axis and point I. As a result, the azimuth angle can be computed by the following equation:
[equation missing in source]
[...] property of omni-imaging, the azimuth angle of a point in the ICS is the same as that of the corresponding point in the WCS. We can calculate the parameters x and y from the distance dw and the azimuth angle in the WCS as follows:
[equation missing in source]
As a result, by the use of the pano-mapping table, each pixel in an omni-image can be transformed into an elevation angle and an azimuth angle. Once the azimuth angle and a pair of elevation angles α1 and α2 are obtained, we are able to compute the location of point P in the WCS. Therefore, once a pair of matching points (one in an omni-image taken by the upper omni-camera and the other in an omni-image taken by the lower omni-camera) is known, the stereo information of the corresponding point in the WCS can be obtained.
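The computation from a pair of elevation angles and an azimuth angle to a 3D location can be sketched as follows. Several points are assumptions rather than statements of this study: Eq. (2.6) is read as the law-of-sines relation d / sin(90° + α2) = b / sin(α1 − α2), which reduces to d = b cos α2 / sin(α1 − α2); the elevation angles are taken as measured downward from the horizontal plane through each mirror center; and the decomposition dw = d cos α1, Z = d sin α1, x = dw cos θ, y = dw sin θ is one plausible form. The symbol θ for the azimuth angle and the function name `triangulate` are hypothetical.

```python
import math
from typing import Tuple


def triangulate(alpha1: float, alpha2: float, theta: float,
                baseline: float) -> Tuple[float, float, float]:
    """Locate a scene point from two elevation angles and an azimuth angle.

    Assumptions (not confirmed by the source): angles are in radians,
    alpha1 and alpha2 are measured downward from the horizontal plane
    through the upper and lower mirror centers, and alpha1 > alpha2.
    Returns (x, y, Z): horizontal coordinates and the vertical distance
    from the upper mirror center.
    """
    d = baseline * math.cos(alpha2) / math.sin(alpha1 - alpha2)
    d_w = d * math.cos(alpha1)   # horizontal distance
    z = d * math.sin(alpha1)     # vertical distance
    return d_w * math.cos(theta), d_w * math.sin(theta), z


# Synthetic check: a point 2.0 units away horizontally and 1.0 unit below
# the upper mirror center, seen with a baseline of 0.5 units.
b = 0.5
alpha1 = math.atan2(1.0, 2.0)        # seen from the upper mirror center
alpha2 = math.atan2(1.0 - b, 2.0)    # seen from the lower mirror center
x, y, z = triangulate(alpha1, alpha2, math.radians(90.0), b)
```

The synthetic check builds the two angles from a known point and recovers that point, which tests the internal consistency of the assumed formulas but not their agreement with the actual device geometry.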
Figure 2.8 System configuration of upper omni-camera with a hyperboloidal-shaped mirror.
2.4 System Processes
To get stereo information from a pair of two-camera omni-imaging devices, the omni-cameras need to be calibrated. For this purpose, the space-mapping technique, which is based on the use of a pano-mapping table, is applied. The process of constructing the table is shown in Fig. 2.9 and will be introduced in detail in Chapter 3. Moreover, the method to unwarp an omni-image into a perspective-view image using the pano-mapping table is also described in Chapter 3.
The above-mentioned process is a preparation step carried out before developing the system. As shown in Figure 2.10, to develop the car-driving assistance application, we need to read the related tables at the beginning. Computer COMA used in this study is responsible for analyzing omni-images of the surrounding environment by the optical flow method and estimating the moving direction of the video surveillance vehicle from the produced motion vectors. Then, the analysis result and the acquired images are sent to the other computer, COMB, for generating the corresponding perspective-view image. This process will be introduced in detail in Chapter 4. Computer COMB generates the corresponding perspective-view images with respect to these signals. The method of quickly generating a perspective-view image is described in Chapter 3. Moreover, the images of the driving history in Computer COMA are sent to Computer COMB for off-line inspection. The images of the driving history can be displayed sequentially; in the meantime, the user may select the view direction to inspect the corresponding perspective-view image.
Figure 2.9 Flowchart of proposed learning process.
Figure 2.10 Flowchart of the moving direction analysis.
Both the application of nearby static car detection with a static video surveillance vehicle and the application of nearby static or moving car detection with a moving video surveillance vehicle require reading table files as a preparation task, as shown in Fig. 2.11. In the former application, two omni-images are captured with a two-camera omni-directional imaging device and the detection process is conducted; finally, the stereo information of the car is computed. The detailed process will be introduced in Chapter 5. The latter application requires two consecutively acquired images to conduct the detection process, so we create an image buffer to keep the previous image while the current image is acquired with the omni-camera for analyzing the omni-images of the surrounding environment. The detection and position estimation of the static or moving car will be described in Chapter 6. Finally, both applications display the surround map. Both tasks involve complex processes, which are introduced in detail in the remaining chapters.
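The image buffer that keeps the previous image alongside the current one can be sketched as a two-slot queue. This is an illustrative structure only; the class name `FrameBuffer`, its methods, and the use of `bytes` as a stand-in for an omni-image are assumptions, and the actual system is written in C++ rather than Python.

```python
from collections import deque
from typing import Deque, Optional, Tuple

Frame = bytes  # placeholder type for one omni-image


class FrameBuffer:
    """Keeps the previous and current frames for two-frame analysis."""

    def __init__(self) -> None:
        # maxlen=2 makes the deque drop the oldest frame automatically.
        self._frames: Deque[Frame] = deque(maxlen=2)

    def push(self, frame: Frame) -> None:
        """Store a newly acquired frame, discarding the oldest if full."""
        self._frames.append(frame)

    def pair(self) -> Optional[Tuple[Frame, Frame]]:
        """Return (previous, current), or None until two frames exist."""
        if len(self._frames) < 2:
            return None
        return self._frames[0], self._frames[1]


buf = FrameBuffer()
buf.push(b"frame-0")
first = buf.pair()      # only one frame so far
buf.push(b"frame-1")
buf.push(b"frame-2")
latest = buf.pair()     # the two most recent frames
```

Returning `None` until two frames are available lets the caller skip motion analysis on the very first acquired image.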
[Flowchart labels: Start of Video Surveillance; Load Tables (pano-mapping tables); Sequential omni-images; Image buffer; Detection of a static surrounding car; Detection of a moving surrounding car; Estimating 3D Data of the car; Display Surround Map]
Figure 2.11 Flowchart of vehicle detections.
Chapter 3
Generation of Perspective-view Images Using Pano-mapping Tables
In this chapter, we describe the details of the scheme we use to generate perspective-view images from omni-images acquired with the omni-image devices attached to the roof of the video surveillance vehicle used in this study. Before describing the detail in Sections 3.2 through 3.4, we review first in Section 3.1 a