Thesis Organization - A Study on In-Car Tour Guidance in Park Areas Using Augmented Reality and Omni-Directional Computer Vision Techniques

CHAPTER 1 Introduction

1.5 Thesis Organization

In the remainder of this thesis, we introduce the system configuration and the idea behind the proposed system in Chapter 2, where the structure of the two-camera omni-imaging device is also described. In Chapter 3, the proposed method for using the two-camera omni-imaging device and the PTZ camera to help create a guidance map is described. In Chapter 4, the proposed method for detecting a landmark to obtain stereo data is described. In Chapter 5, the proposed method for projecting tour guidance information shown on an iPad onto the windshield is described. In Chapter 6, the proposed method for organizing graphs in the guidance map and conducting graph traversal during car tours in the guidance area is described. In Chapter 7, experimental results and discussions are presented. Finally, conclusions and some suggestions for future work are given in Chapter 8.

Chapter 2 Ideas of Proposed Methods and System Design

2.1 Ideas of System Design

In order to monitor the surrounding environment of the video surveillance vehicle, we affix a two-camera omni-imaging device and a PTZ camera, instead of traditional projective cameras, on the roof of the surveillance vehicle in this study.

The omni-camera can be used to monitor the full 360-degree surroundings of the car and to enhance acquisition of necessary scene information outside the car. The PTZ camera can pan, tilt, and zoom under computer control. The aforementioned structure of the surveillance vehicle used in this study is shown in Figure 2.1. Note that the two-camera omni-imaging device includes two omni-cameras aligned coaxially and back to back, as mentioned previously.


Figure 2.1 The video surveillance vehicle used in this study with a two-camera omni-imaging device and a PTZ camera affixed on the car roof. (a) A front view of the vehicle. (b) A side view of the vehicle.

The video surveillance vehicle offers high mobility, so we can move the onboard camera system anywhere, but we have to determine the best locations on the car roof where the omni-imaging device and the PTZ camera should be affixed, respectively. We first discuss where to affix the omni-imaging device. As illustrated in Figures 2.2(a) and 2.2(b), if we affix the device at the front middle of the car roof, half of the omni-image acquired with the device is undesirably occupied by the car body. But if we affix it at the right-front position of the surveillance vehicle roof, only a quarter of the omni-image taken with the same imaging device is occupied by the car body. Therefore, in this study we decide to affix the omni-imaging device at the right-front of the surveillance vehicle roof. Second, as illustrated in Figures 2.2(c) and 2.2(d), the PTZ-camera is affixed at the border position of the surveillance vehicle roof. If the PTZ-camera were affixed instead at the middle position of the surveillance car roof, its view would undesirably cover half of the car body.

Furthermore, we analyze the images acquired with these imaging devices to create a guidance map and extract information for tour guidance by detecting landmarks appearing in the acquired images. The landmarks are detected by algorithms proposed in this study, such as region growing, YCbCr color modeling, ellipse fitting, etc. The details will be described in Chapter 4. The landmarks can be used to navigate the vehicle in the guidance area, but they do not all lie on planes of the same height. Using the two omni-cameras in the omni-imaging device, we can solve this problem by estimating relevant 3D data of the landmarks.
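As a rough illustration of the YCbCr color modeling step mentioned above, the following Python sketch converts an RGB pixel to YCbCr with the full-range BT.601 coefficients (as used in JPEG) and checks whether its chroma falls inside a rectangular Cb/Cr range. The function names and the default ranges are assumptions made for illustration only; the color model actually used in this study is the one described in Chapter 4 and is not reproduced here.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert 8-bit RGB values to full-range YCbCr (BT.601 weights, as in JPEG)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def is_landmark_color(pixel, cb_range=(90.0, 125.0), cr_range=(150.0, 200.0)):
    """Return True if the pixel's chroma lies inside the given Cb/Cr box.

    The default ranges are hypothetical placeholders, not values from the thesis.
    """
    _, cb, cr = rgb_to_ycbcr(*pixel)
    in_cb = cb_range[0] <= cb <= cb_range[1]
    in_cr = cr_range[0] <= cr <= cr_range[1]
    return in_cb and in_cr


if __name__ == "__main__":
    print(is_landmark_color((200, 60, 60)))    # a reddish pixel
    print(is_landmark_color((128, 128, 128)))  # a neutral gray pixel
```

Working in the Cb/Cr plane rather than in RGB keeps the test largely independent of brightness, which is presumably why a chroma-based model suits outdoor scenes with changing illumination.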

Moreover, in order to create the guidance map, the PTZ-camera is first used to get feature points of nearby buildings (the details will be described in Chapter 3), then the positions of landmarks are obtained accordingly, and finally the location of the video surveillance vehicle is computed (the details will be described in Chapter 4). Also, we use an iPad-like mobile device (called a pad hereafter) to simulate a head-up display on the windshield. The position at which the image shown on the pad is projected onto the windshield can be estimated, so the driver does not have to place the pad in an exact pose under the windshield (the details will be described in Chapter 5). The projected image is designed to include the names of the currently-visited buildings on the left and right sides of the road, allowing the driver to understand the vehicle’s current location in the guidance area. An example of the guidance map and a displayed image with building names on the pad is shown in Figure 2.3.

Figure 2.2 Positions of cameras affixed to the video surveillance vehicle roof and the corresponding FOVs. (a) The omni-camera is affixed at the rear-middle of the car roof. (b) The omni-camera is affixed at the right-front of the car roof. (c) The PTZ-camera is affixed at the middle of the car roof. (d) The PTZ-camera is affixed at the border of the car roof.

Furthermore, we analyze the motion vectors yielded by an adopted optical flow method in the acquired omni-images so that we can estimate the vehicle’s moving direction when it meets a branching road. In this way, by keeping track of the graph nodes in the guidance map, we will not get lost on any path in the guidance map.
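To make the idea of reading a turning direction from motion vectors concrete, here is a minimal Python sketch. It assumes the optical flow has already been computed and is given as a list of per-feature displacements (dx, dy) in pixels per frame, measured in a forward-looking part of the image. The sign convention (scene content drifting toward positive x when the vehicle turns left), the threshold value, and the function name are all assumptions for illustration; the method actually adopted in this study is described in Chapter 6.

```python
def estimate_turn(flow_vectors, threshold=1.0):
    """Classify the vehicle's motion as 'left', 'right', or 'straight'.

    flow_vectors: iterable of (dx, dy) displacements in pixels per frame.
    threshold: hypothetical minimum mean horizontal flow that counts as a turn.
    """
    vectors = list(flow_vectors)
    if not vectors:
        return "straight"  # no evidence of a turn
    mean_dx = sum(dx for dx, _ in vectors) / len(vectors)
    if mean_dx > threshold:
        return "left"   # assumed convention: scene drifts right when turning left
    if mean_dx < -threshold:
        return "right"
    return "straight"


if __name__ == "__main__":
    print(estimate_turn([(3.0, 0.0), (2.5, 0.1)]))
    print(estimate_turn([(0.2, 0.0), (-0.1, 0.0)]))
```

Averaging over many vectors is a simple way to suppress outliers from moving objects; a median or a voting scheme would be a reasonable alternative.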

2.2 System Configuration

The proposed video surveillance system is described in detail in this section. The description is separated into three parts: hardware configuration, software configuration, and network configuration. The hardware includes: 1) a video surveillance vehicle, 2) a two-camera omni-directional imaging device and a PTZ-camera device, and 3) two laptop computers and a pad. The software includes: 1) network to handle the task of communication among all the equipment.

Figure 2.3 An example of landmark detection and guidance map. (a) An omni-image of a landmark detected at a sidewalk. (b) A generated guidance map showing the relative position of the car. (c) A projected image on the windshield.

2.2.1 Hardware configuration

The surveillance vehicle, named Delica, is made by Mitsubishi Co. It is a 469cm × 169cm × 196cm vehicle with a working table and a power supply. System operators may sit inside the surveillance vehicle to operate the laptop computers and monitor the entire surrounding environment. Moreover, a steel frame is affixed to the car roof, on which the omni-imaging device and the PTZ camera are mounted. Two USB extension cords and a crossover cable running through the video surveillance vehicle were also added to facilitate transmitting images captured with the omni-imaging device and the PTZ-camera. Detailed descriptions of the functions of the imaging devices will be given in Sections 2.3 and 2.4. The entire video surveillance system is shown in Fig. 2.4.

In order to control the entire guidance system, we use two laptop computers and a pad as control units, with the laptops handling the omni-imaging device and the PTZ-camera. Both laptops are produced by TOSHIBA Computer Inc. The pad, named Eee Pad Transformer, is produced by ASUS Computer Inc. We simulate a head-up display device by projecting images appearing on the pad onto the windshield of the vehicle. Detailed specifications of these devices are listed in Table 2.1.

Figure 2.4 Structure of the proposed surveillance system.

Table 2.1 Specifications of the laptop computers and the pad used in this study.

           Tecra M11                         Satellite A660                    Eee Pad Transformer
CPU        Intel Core i7-620M 2.66/3.33GHz   Intel Core i5-480M 2.66/2.93GHz   NVIDIA Tegra 2 1.0GHz
RAM        4GB DDR3 1066MHz                  2GB DDR3 1066MHz                  1GB
GPU        NVIDIA NVS 2100M                  ATI HD5650                        none
Network    Gigabit LAN                       Fast Ethernet LAN                 WLAN 802.11 b/g/n 2.4GHz

To exchange commands and information of guidance between the two laptops and the pad, we use an access point (AP) to connect them and set up a local network for communication.

2.2.2 Software configuration

We use Borland C++ Builder (BCB) V6 as the development platform to build our guidance system. BCB is a program development tool for the Windows operating system with which we can create a graphical user interface (GUI) conveniently and quickly. The programming language we use is C++, a widely used language. One of the laptops, the Tecra M11, runs Windows 7, and the other, the Satellite A660, runs Windows XP.

The operating system of the pad is Android 3.2, and we develop its applications using Eclipse. To do so, we need to install the Java Development Kit (JDK) and add the Android Development Tools (ADT) to Eclipse, so that we can develop Android applications in this environment.

In order to use the camera devices, we have to install the drivers of the ARTCAM-200SS cameras and ARIA on the laptops. The camera company also provides corresponding software development kits (SDKs); in addition, we can study the simple sample source code to learn the purpose of the functions called in the program. Accordingly, we can adjust the parameters of each camera, such as the exposure value or the global color gain, through the SDK. Moreover, the camera company provides the SDK not only for BCB but also for C, VB.NET, and C#.NET programmers.

2.2.3 Network Configuration

A network configuration is needed for communication between the two laptop computers and the pad because images are acquired from both the two-camera omni-directional imaging device and the PTZ-camera, and each imaging device is handled by its own laptop. Moreover, the laptops also send data to the pad. As a result, we set up a local area network for communication among the two laptops and the pad.

As shown in Fig. 2.5, the access point (AP) provides a wireless environment, and the devices can be connected to one another through the AP. The laptop computer COM_A is used to create the guidance map and guide the navigation. It can acquire not only the data of the two omni-images by itself but also the data of the PTZ image (the image taken by the PTZ camera). On the other hand, the laptop computer COM_B needs to receive these images from COM_A through the local network.

Afterwards, the pad is used to display the names of buildings. Using the local network, COM_A sends the information about the current buildings to the pad.
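The text does not specify how the building information is encoded on the wire, so the following Python sketch shows only one plausible way to frame such a message: a JSON payload with a 4-byte big-endian length prefix, which lets the receiver know how many bytes to read from a stream socket. The message layout, the field names "left" and "right", the function names, and the sample building names are all hypothetical.

```python
import json
import struct


def encode_building_message(left_names, right_names):
    """Pack building names into a length-prefixed JSON message (assumed format)."""
    payload = json.dumps({"left": left_names, "right": right_names}).encode("utf-8")
    # "!I" is a 4-byte unsigned integer in network (big-endian) byte order.
    return struct.pack("!I", len(payload)) + payload


def decode_building_message(data):
    """Inverse of encode_building_message for one complete message."""
    (length,) = struct.unpack("!I", data[:4])
    payload = data[4:4 + length]
    return json.loads(payload.decode("utf-8"))


if __name__ == "__main__":
    message = encode_building_message(["Library"], ["Gymnasium"])
    print(decode_building_message(message))
```

On the sending side, the bytes returned by `encode_building_message` could be passed to `socket.sendall` on a TCP connection between COM_A and the pad; the receiver would read the 4-byte prefix first and then the payload.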

Figure 2.5 The architecture of the local network used in this study.

2.3 Review of Adopted PTZ-camera System

In this section, we review the adopted PTZ-camera system with panning, tilting, and zooming capabilities. In this study we use an AXIS 213 PTZ camera made by AXIS Inc., as shown in Figure 2.6. The camera has a height of 130mm, a width of 104mm, a depth of 130mm, and a weight of 700g. The pan angle range is 340 degrees and the tilt angle range is 100 degrees. It has 26x optical zooming and 12x digital zooming capabilities. The captured images have a resolution of 320×240 pixels.


Figure 2.6 The pan-tilt-zoom camera used in this study. (a) A perspective view of the camera. (b) A front view of the camera. (c) A left-side view of the camera. (d) A back view of the camera.

2.4 Review of Adopted Omni-camera System

In this section, we review the adopted omni-camera system, which includes two lenses of model LV0612H, two CMOS cameras of model ARTCAM-200SO, and two CMOS cameras of model ARTCAM-200MI. Table 2.2 lists the specifications of the CMOS cameras.

Table 2.2 Specifications of the CMOS cameras used.

                     ARTCAM-200MI
Resolution           2.0 M pixels (1600 × 1200)
Dimensions           33mm × 33mm × 50mm
CMOS sensor size     1/2” (6.4 × 4.8mm)
Mount                C-mount
Frames per second    5 fps
DirectShow camera    No

To produce an omni-camera, we need to combine a projective CCD camera with a hyperbolic-shaped mirror. The parameters of each of the hyperbolic-shaped mirrors are described here. The radius r of the hyperbolic-shaped mirror is 4cm, the focal length f of the projective camera is 6mm, and the sensor width S_w of the camera is 2.4mm. Also, the axis of the camera is aligned with the axis through the center point of the hyperbolic-shaped mirror.

As shown in Fig. 2.7, by the principle of similar triangles, the distance d between the optical center and the mirror center can be computed by the following equation:

$$\frac{d}{r} = \frac{f}{S_w}. \qquad (2.1)$$

Figure 2.7 Relationship of the mirror and the CMOS sensor in camera.

Also, as shown in Fig. 2.7(a), the hyperbolic-shape of the mirror in the camera

The coordinates (X, Y, Z) specify a point P in the world coordinate system (WCS). Let the projection of P onto the image plane of the camera be the point p with image coordinates (u, v) in the image coordinate system (ICS). In order to get the parameters a and b of the hyperbolic shape of the mirror, we first have to acquire the elevation angle α in Figure 2.7(a) from the relation between the camera coordinate system (CCS) and the ICS according to Wu and Tsai [7] as follows:

2 2

where β is the azimuth angle as shown in Fig. 2.8(a). To compute it, let the distance from the origin O of the camera coordinate system shown in Fig. 2.8 to the mirror center O_m be denoted as c, and let the distance from the lens center O_c of the camera to O_m be denoted as d, which may be measured in advance. Then, we can compute c by the simple formula d = 2c because O is defined to be the midpoint between O_m and O_c. Accordingly, in Eq. (2.3), let the omni-camera have the largest FOV, and let the incidence angle α be set to 0. Then, the angle θ and thereby β, according to Fig. 2.8(a), can be computed as follows:

where r is the radius of the circular base of the mirror. Using Eq. (2.4), the parameter b can be obtained by solving Eq. (2.3). Finally, the parameter a is derived from the following equation:

$$c = \sqrt{a^2 + b^2}. \qquad (2.5)$$
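The chain of computations in this section can be checked numerically with a short Python sketch. It uses the values stated in the text (r = 4cm, f = 6mm, S_w = 2.4mm), the similar-triangle relation of Eq. (2.1), the relation d = 2c, and Eq. (2.5) solved for a. The value of b passed in the example is purely hypothetical, since in the text b comes from solving Eq. (2.3), which is not evaluated here; the function names are likewise assumptions.

```python
import math


def mirror_distance(r_mm, f_mm, sw_mm):
    """Eq. (2.1), d / r = f / S_w, solved for d (all lengths in mm)."""
    return r_mm * f_mm / sw_mm


def half_focal_distance(d_mm):
    """c from d = 2c, since O is the midpoint between O_m and O_c."""
    return d_mm / 2.0


def semi_axis_a(c_mm, b_mm):
    """Eq. (2.5), c = sqrt(a^2 + b^2), solved for a."""
    if not 0.0 < b_mm < c_mm:
        raise ValueError("b must satisfy 0 < b < c")
    return math.sqrt(c_mm ** 2 - b_mm ** 2)


if __name__ == "__main__":
    d = mirror_distance(r_mm=40.0, f_mm=6.0, sw_mm=2.4)  # values from the text
    c = half_focal_distance(d)
    a = semi_axis_a(c, b_mm=30.0)  # b = 30 mm is a hypothetical value
    print(d, c, a)
```

With the stated values, d evaluates to about 100mm and c to about 50mm; the resulting a depends entirely on the b obtained from Eq. (2.3).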

Each omni-camera was built with these parameter values, and a two-camera omni-directional device can be constructed with two omni-cameras aligned vertically.

Figure 2.8 (a) Relation between the world coordinate system and the image coordinate system. (b) Simple geometry between the mirror and the CMOS sensor in the camera.

2.5 System Processes

2.5.1 Learning Process

The first part of our guidance system is the learning process. The camera system is used to get feature points from captured images. In this study we use a two-camera omni-imaging device to get stereo information from the extracted feature points. For camera calibration, the pano-mapping method using pano-tables is applied; this process will be introduced in Chapter 4. Moreover, the PTZ-image can be used to obtain the angle between the PTZ-camera and a point of a building through an angular mapping; this process will be introduced in detail in Chapter 3.

In order to develop the tour guidance system, we need to create a guidance map. Therefore, we propose a learning strategy. As shown in Figure 2.9, the laptop COM_A is used to analyze acquired omni-images of the surrounding environment and compute the position of the landmark in the omni-image through an ellipse fitting method. In addition, the laptop COM_B is used to capture PTZ-images of neighboring buildings; we then choose feature points of the buildings manually and compute their distances using the pano-table. Furthermore, through the angular mapping method, we can get the angles of the feature points. Afterwards, the distance and orientation data of the feature points are sent to the other laptop, COM_A, for creating the desired local map.

A local map includes only one landmark. Once the local maps for all landmarks have been generated, the laptop COM_B no longer needs to send the information of the feature points to the laptop COM_A. Our local maps are independent of one another; therefore, we propose a method for quickly converting the local maps into a global map, which is described in Chapter 3.
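The following Python sketch illustrates, under simplifying assumptions, how distance and orientation data of feature points could be placed in a local map and then moved into a common global frame. It assumes a planar map, that each local map is described by the global position of its origin and a heading angle, and that angles are measured counterclockwise in degrees. These conventions and the function names are assumptions for illustration; the conversion method proposed in this study is the one described in Chapter 3.

```python
import math


def polar_to_local(distance, angle_deg):
    """Convert a (distance, orientation) measurement to local x, y coordinates."""
    angle = math.radians(angle_deg)
    return distance * math.cos(angle), distance * math.sin(angle)


def local_to_global(point, origin, heading_deg=0.0):
    """Rotate a local point by the map heading, then translate by the map origin."""
    x, y = point
    heading = math.radians(heading_deg)
    gx = origin[0] + x * math.cos(heading) - y * math.sin(heading)
    gy = origin[1] + x * math.sin(heading) + y * math.cos(heading)
    return gx, gy


if __name__ == "__main__":
    corner = polar_to_local(distance=10.0, angle_deg=90.0)  # hypothetical feature point
    print(local_to_global(corner, origin=(5.0, 5.0), heading_deg=90.0))
```

Because each local map is independent, applying the same rigid transform to every feature point of a local map is enough to merge it into the global map, provided the origin and heading of that local map are known.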

Figure 2.9 Flowchart of calibration of omni-cameras and PTZ-camera.

Figure 2.10 Flowchart of learning guidance map.

2.5.2 Navigation Process

The second part of our guidance system is the navigation process. In Section 2.5.1, we described how we create a guidance map through the learning process. Accordingly, we can estimate the position of the video surveillance car on the guidance map and implement our tour guidance system in the navigation process. First of all, we use the captured omni-images to detect the landmarks at the sidewalk. Then, by using the pano-mapping table, the 3D information of the landmarks can be estimated. The process will be introduced in detail in Chapter 4. Next, the turning direction of the surveillance vehicle is checked by an optical flow method. We use the turning direction to keep track of the current node corresponding to the vehicle on the graph of the guidance map. The detailed process will be introduced in Chapter 6. After the laptop COM_A receives the position of the currently-visited landmark and the vehicle turning direction, we integrate these data to obtain the position of the video surveillance vehicle and get the information of the neighboring buildings. Finally, we need to send the names of the buildings to the pad. For this, we first use the local wireless network provided by the AP to connect the laptop COM_A and the pad. The laptop COM_A then sends the names of the nearby buildings to the pad. Then, we display the names of the buildings as an image shown on the pad and project the image onto the car windshield. Moreover, we propose a method to estimate the position of the projected images on the windshield, so that the user can place the pad conveniently. The detailed process will be introduced in Chapter 5. The flowchart of the navigation process is shown in Figure 2.11.

Figure 2.11 Flowchart of tour guidance
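The node-tracking step of the navigation process can be pictured with a small Python sketch. It models the guidance map as a dictionary that maps each node to its outgoing edges, keyed by turning direction, and advances the current node each time a turn is detected. The graph layout, the node names, and the rule of staying at the current node when no matching edge exists are assumptions for illustration; the graph organization used in this study is described in Chapter 6.

```python
# Hypothetical guidance graph: node -> {turning direction: next node}.
GUIDANCE_GRAPH = {
    "A": {"straight": "B"},
    "B": {"left": "C", "right": "D"},
    "C": {},
    "D": {},
}


def next_node(graph, current, turn):
    """Follow the edge labeled with the detected turn; stay put if none exists."""
    return graph.get(current, {}).get(turn, current)


def track_path(graph, start, turns):
    """Return the sequence of nodes visited for a sequence of detected turns."""
    path = [start]
    for turn in turns:
        path.append(next_node(graph, path[-1], turn))
    return path


if __name__ == "__main__":
    print(track_path(GUIDANCE_GRAPH, "A", ["straight", "left"]))
```

Keying edges by turning direction means the tracker needs only the output of the turn detector at each branching road, which matches the role the text assigns to the optical flow analysis.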

Chapter 3 Creation of Guidance Map

3.1 Introduction

In this chapter, we describe the details of the method we propose to generate the guidance map for use in the proposed augmented reality based tour guidance. Using the omni-camera device and the PTZ-camera device affixed to the roof of the video surveillance vehicle, we can create the guidance map quickly.

The proposed guidance map includes the center points of the landmarks and some feature points of objects of interest in the guidance area. We can use prominent points on any objects, such as buildings, lamps, and chairs, as feature points. However, in this study, we choose only the corner points of buildings as the feature points because buildings are the most obvious objects in the guidance area and prominent feature points of buildings can be obtained conveniently from images.

In general, we use a ruler and a protractor to measure manually the distance between the PTZ-camera and every feature point, as well as the orientation of the camera direction with respect to the line from the PTZ-camera to the feature point. On the other hand, in this study we use the omni-camera to detect and estimate automatically the location of each landmark, which is defined to be the position of the center of the landmark. The details will be described in Chapter 4.

In addition, Wang and Tsai [9] proposed an angular mapping method for camera calibration that computes the angular information of points in the PTZ-image. The PTZ-camera can be used to estimate the directions of feature points by this method. However, we have to combine the information of the landmark with that of the feature points. Therefore, we propose a method to convert PTZ-camera coordinates into omni-camera coordinates. The details will be described in Sections 3.2 and 3.3.2.

By the aforementioned techniques, local maps of landmarks can be created. In
