
5.2 Object Detection

5.2.1 Visibility-Based U-Disparity Occupancy Grid

The occupancy grid map is a powerful method for describing an environment and has been used in a variety of robotics applications. Over the last decade, occupancy grids have typically been constructed in Cartesian space from beam-type sensors such as ultrasonic sensors or laser range finders. In contrast, a stereo camera is a conic-type sensor and is less commonly used to build an occupancy grid map because of the processing time it requires and its limited accuracy [29: Perrollaz et al. 2012]. To overcome these problems, the occupancy grid map for a stereo camera is more often constructed in u-disparity space than in Cartesian space.

In [29: Perrollaz et al. 2012], the u-disparity occupancy grid map is constructed under the assumption that each pixel of the disparity map has been pre-classified as a road or obstacle pixel by the double correlation framework proposed in [30: Perrollaz et al. 2010], which exploits different matching hypotheses for vertical and horizontal objects. In most applications, however, an occupancy grid map is used to describe the environment without knowing whether each disparity belongs to an obstacle or to the road. Requiring every pixel to be labeled as obstacle or free space before the grid is constructed is therefore impractical. For this reason, the system proposed in this thesis modifies the method slightly so that the disparity pixels do not need to be pre-categorized.

The concept of the visibility-based occupancy grid map considers the ratio between observed pixels and visible pixels in the region of interest (ROI), whose height depends on the disparity of the grid cell. Figure 5.2 shows the concept: assume that a human stands behind a car, as shown in Figure 5.2(a), with the corresponding disparity map shown in Figure 5.2(b). To estimate the occupancy of the grid cell $U_D = (u, d)$ at which the car is located, the disparity pixels in the ROI are classified as visible or non-visible, and as observed or occluded. Here $u$ is the image column coordinate and $d$ is the disparity coordinate of the grid cell at a given distance $Z$ ($d = fB/Z$, where $f$ is the focal length and $B$ is the baseline). Figure 5.2(c) shows the possible pixels in the ROI for the disparity $d$ illustrated in Figure 5.2(d), whereas Figure 5.2(e) shows the classification result for these pixels. First, the pixels colored green are classified as visible because their disparity values are smaller than $d$. This means that the measurement rays pass through the grid cell without hitting any obstacle (note that the larger the disparity $d$, the smaller the distance $Z$). The pixels colored yellow are classified as both observed and visible, since their values in the disparity map equal $d$, which means that the measurements hit the car exactly. The remaining blue pixels are classified as non-visible. Occluded pixels and pixels with invalid disparity both fall into this category.
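The three-way classification described above can be sketched for a single image column as follows. This is a minimal illustration rather than the thesis implementation: the function name `classify_roi`, the use of `None` for an invalid disparity, and the tolerance `tol` for deciding that a pixel has the same disparity as the cell are all assumptions made for the example.

```python
from typing import Iterable, Optional, Tuple


def classify_roi(disparities: Iterable[Optional[float]],
                 d: float,
                 tol: float = 0.5) -> Tuple[int, int, int]:
    """Return (visible, observed, non_visible) pixel counts for a cell at disparity d.

    disparities: disparity values of the ROI pixels in one image column;
                 None marks an invalid disparity (assumed encoding).
    tol:         hypothetical tolerance for "same disparity as the cell".
    """
    visible = observed = non_visible = 0
    for value in disparities:
        if value is None or value > d + tol:
            # Invalid measurement, or a nearer object blocks the ray.
            non_visible += 1
        elif abs(value - d) <= tol:
            # The ray hits an object inside the cell: observed and visible.
            visible += 1
            observed += 1
        else:
            # Smaller disparity: the ray passes through the cell.
            visible += 1
    return visible, observed, non_visible


# One column: background (5), a person (12) partly hidden by a car (20),
# and one invalid pixel.
roi = [5.0, 5.0, 12.0, 20.0, 20.0, 20.0, None]
print(classify_roi(roi, d=20.0))  # cell of the car
print(classify_roi(roi, d=12.0))  # cell of the person
```

For the cell of the person, the three pixels belonging to the nearer car count as non-visible, which is the occlusion case discussed in the text.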

Figure 5.3(e) illustrates the occlusion case more clearly: when estimating the grid cell at which the human stands, the pixels covered by the car neither hit nor pass through the human, because they are occluded by the car in front of the human. These pixels cannot "see" the grid cell and are therefore classified as non-visible.

[Figure 5.3]

Following [29: Perrollaz et al. 2012], the visibility-based occupancy grid construction described above can be formulated as follows:

$$P(O_U) = \sum_{v,c} P(V_U = v)\, P(C_U = c)\, P(O_U \mid V_U = v, C_U = c) \tag{5.1}$$

where $P(O_U)$ is the probability describing the occupancy of a certain grid cell $U_D$. $V_U$, $C_U$ and $O_U$ are binary random variables taking values in $\{0, 1\}$ that describe specific states of the grid cell $U_D$. $V_U$ represents the visibility of the cell: $V_U = 1$ means that the grid cell $U_D$ is visible. $C_U$ indicates the obstacle confidence of the cell: $C_U = 1$ means that an object is seen in the grid cell $U_D$. $O_U$ is the occupancy of the cell: $O_U = 1$ means that the cell is occupied by obstacle pixels.

To solve Equation (5.1), some boundary conditions of $P(O_U \mid V_U, C_U)$ are intuitively known. First, for a grid cell that is not visible, the occupancy cannot be determined because no measurement data are available. That is, nothing is known about its occupancy. This can be written as follows [29: Perrollaz et al. 2012]:

$$\forall c \in \{0, 1\}, \quad P(O_U \mid \neg V_U, C_U = c) = 0.5 \tag{5.2}$$

Secondly, for a grid cell that is fully visible, the boundary conditions $P(O_U \mid V_U = 1, C_U)$ are determined by the obstacle confidence state $C_U$ of that cell. If it is fully certain that an obstacle is observed in the cell, the cell is occupied unless a false positive has occurred. This can be expressed as follows [29: Perrollaz et al. 2012]:

$$P(O_U \mid V_U, C_U) = 1 - P_{FP} \tag{5.3}$$

On the other hand, if the grid cell is fully visible but nothing is observed, the cell can be occupied only when a false negative has occurred. That is:

$$P(O_U \mid V_U, \neg C_U) = P_{FN} \tag{5.4}$$

The four boundary conditions mentioned above are listed in Table 5.2.

Table 5.2: Boundary conditions of $P(O_U \mid V_U, C_U)$

                          Visibility
Observation confidence    $V_U$           $\neg V_U$
$C_U$                     $1 - P_{FP}$    $0.5$
$\neg C_U$                $P_{FN}$        $0.5$

Substituting these boundary conditions into Equation (5.1), it can be expanded as follows [29: Perrollaz et al. 2012]:

$$P(O_U) = P(V_U)\, P(C_U)\, (1 - P_{FP}) + P(V_U)\, (1 - P(C_U))\, P_{FN} + (1 - P(V_U)) \cdot 0.5 \tag{5.5}$$
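As a quick check of the expanded occupancy expression, the sketch below evaluates it directly and confirms that it reproduces the three boundary cases. The function name `occupancy` is hypothetical; the default error rates of 0.02 are the values stated for this thesis.

```python
def occupancy(p_visible: float,
              p_confident: float,
              p_fp: float = 0.02,
              p_fn: float = 0.02) -> float:
    """Occupancy probability P(O_U) from visibility and observation confidence."""
    return (p_visible * p_confident * (1.0 - p_fp)        # visible, obstacle seen
            + p_visible * (1.0 - p_confident) * p_fn      # visible, nothing seen
            + (1.0 - p_visible) * 0.5)                    # not visible: unknown


# Boundary cases of the conditional probability table.
print(occupancy(1.0, 1.0))   # fully visible and confident: 1 - P_FP
print(occupancy(1.0, 0.0))   # fully visible, nothing observed: P_FN
print(occupancy(0.0, 0.7))   # not visible: 0.5 regardless of confidence
```

Intermediate values of the visibility blend the measured evidence with the uninformative prior of 0.5, so a partly occluded cell is pulled toward "unknown" rather than toward "free".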

False positives and false negatives, with probabilities $P_{FP}$ and $P_{FN}$, can occur during the stereo matching step, and both probabilities are assumed to be known constants (both are set to 0.02 in this thesis). Therefore, to obtain the occupancy $P(O_U)$ of a grid cell, the remaining quantities to be estimated are the visibility of the cell, $P(V_U)$, and the confidence of the observation, $P(C_U)$. The visibility is defined as the ratio between the number of visible pixels and the number of possible pixels (the length of the ROI), that is [29: Perrollaz et al. 2012]:

$$P(V_U) = \frac{N_V(U_D)}{N_P(U_D)} \tag{5.6}$$

where $U_D = (u, d)$ denotes a grid cell in u-disparity space. $N_P(U_D)$ is the number of possible pixels at the cell, which depends on its d-coordinate and is defined as follows [29: Perrollaz et al. 2012]:

$$N_P(U_D) = v_0(d) - v_h(d) \tag{5.7}$$

where $v_h(d)$ is the v-coordinate of the pixel situated at the maximum detection height for a given disparity $d$. Similarly, $v_0(d)$ is the v-coordinate of the pixel located on the ground for the same disparity $d$. $v_h(d)$ and $v_0(d)$ can be obtained from the basic pinhole camera model and are expressed as follows:

$$v_h(d) = d\,\frac{Y_h}{B} + v_{center} \tag{5.8}$$

$$v_0(d) = d\,\frac{Y_{ground}}{B} + v_{center} \tag{5.9}$$

where $B$ is the baseline of the stereo camera and $v_{center}$ is the v-coordinate of the center of the image plane. $Y_h$ and $Y_{ground}$ are the Y-coordinates of the preset maximum detection height and of the ground, respectively. Substituting Equations (5.8) and (5.9) into (5.7) gives:

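$$\begin{aligned}
N_P(U_D) &= \left( d\,\frac{Y_{ground}}{B} + v_{center} \right) - \left( d\,\frac{Y_h}{B} + v_{center} \right) \\
&= d\,\frac{Y_{ground} - Y_h}{B} \\
&= d\,C_{ROI}
\end{aligned} \tag{5.10}$$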

$C_{ROI}$ is a constant that depends on the preset detection height and the ground position. Thus, $N_P(U_D)$ depends only on the d-coordinate of a given cell $U_D$.
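The linear dependence of the ROI length on the disparity can be checked numerically with a small sketch. All numbers below (baseline, image center, heights) are hypothetical and chosen only for illustration, and the sketch assumes a camera Y-axis pointing downward, so that the ground has the larger Y-coordinate and therefore the larger image row.

```python
def image_row(y: float, d: float, baseline: float, v_center: float) -> float:
    """Image row of a point with camera Y-coordinate y seen at disparity d."""
    return d * y / baseline + v_center


def possible_pixels(d: float, y_ground: float, y_h: float, baseline: float) -> float:
    """Length of the ROI in pixels, written as d * C_ROI."""
    c_roi = (y_ground - y_h) / baseline
    return d * c_roi


# Hypothetical setup: 0.2 m baseline, camera 1 m above the ground,
# detection limit 1 m above the camera, image center at row 240.
BASELINE, V_CENTER = 0.2, 240.0
Y_GROUND, Y_H = 1.0, -1.0

d = 24.0
v_0 = image_row(Y_GROUND, d, BASELINE, V_CENTER)   # ground row
v_h = image_row(Y_H, d, BASELINE, V_CENTER)        # row at maximum height
print(v_0 - v_h, possible_pixels(d, Y_GROUND, Y_H, BASELINE))
```

Both values agree because the image center cancels in the difference, leaving a term proportional to the disparity only.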

On the other hand, $N_V(U_D)$ is the number of visible pixels, which form a subset of the possible pixels. Together with $N_P(U_D)$, it gives the visibility through Equation (5.6). To estimate the confidence of the observation, $P(C_U)$, the ratio between the observed pixels and the visible pixels is considered. $P(C_U)$ is defined as an exponential function of this ratio [29: Perrollaz et al. 2012], where $r_O$ denotes the obstacle confidence, that is, the ratio between the number of observed pixels, written here as $N_O(U_D)$, and the number of visible pixels [29: Perrollaz et al. 2012]:

$$r_O(U_D) = \frac{N_O(U_D)}{N_V(U_D)}$$

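The number of observed pixels is counted with the following indicator function:

$$F_{observed} = \begin{cases} 1, & \text{if } |I_d(u, v) - d| \le C_{ObservedThreshold} \text{ and } I_d(u, v) \ne \text{invalid} \\ 0, & \text{otherwise} \end{cases} \tag{5.16}$$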
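The observation confidence can be sketched from the indicator function as follows. Two points in this sketch are assumptions rather than statements of the thesis: the confidence is taken to have the form $1 - \exp(-r_O / \tau_O)$ with a hypothetical constant `tau`, and a pixel is counted as visible when its valid disparity does not exceed $d$ by more than the threshold. The names `f_observed` and `confidence` and the `None` encoding of invalid disparities are illustrative.

```python
import math
from typing import Optional, Sequence


def f_observed(value: Optional[float], d: float, threshold: float) -> int:
    """Indicator: 1 if the pixel is valid and its disparity is within threshold of d."""
    if value is None:          # invalid disparity (assumed encoding)
        return 0
    return 1 if abs(value - d) <= threshold else 0


def confidence(column: Sequence[Optional[float]],
               d: float,
               threshold: float = 0.5,
               tau: float = 0.15) -> float:
    """Observation confidence for one cell, assuming 1 - exp(-r_O / tau)."""
    n_visible = sum(1 for v in column if v is not None and v <= d + threshold)
    n_observed = sum(f_observed(v, d, threshold) for v in column)
    if n_visible == 0:
        return 0.0             # nothing visible, so nothing can be confirmed
    r_o = n_observed / n_visible
    return 1.0 - math.exp(-r_o / tau)


column = [5.0, 5.0, 12.0, 20.0, 20.0, 20.0, None]
print(confidence(column, d=20.0))   # half of the visible pixels hit the cell
print(confidence(column, d=8.0))    # visible pixels, none observed
```

With this assumed form, the confidence saturates quickly once a noticeable fraction of the visible pixels lies in the cell; a smaller `tau` makes the saturation sharper.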

For every grid cell $U_D = (u, d)$ in u-disparity space, the occupancy can be obtained from the above expressions, namely Equations (5.5), (5.6) and (5.13). Note that $N_P(U_D)$ depends only on the disparity $d$ and can therefore be pre-calculated before the occupancy grid computation. This pre-calculation reduces the time complexity and speeds up the overall algorithm.
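A compact sketch of the whole computation, including the pre-calculation step, is given below. It is an illustration under stated assumptions, not the thesis implementation: disparities are integers with `None` for invalid pixels, ROI rows are rounded and clipped to the image, the confidence uses the assumed form $1 - \exp(-r_O / \tau_O)$ with a hypothetical `tau`, and all function and parameter names are invented for the example.

```python
import math
from typing import List, Optional, Tuple

DisparityMap = List[List[Optional[int]]]   # indexed [v][u]; None = invalid


def precompute_rois(d_max: int, height: int, baseline: float, v_center: float,
                    y_ground: float, y_h: float) -> List[Tuple[int, int]]:
    """Row range [v_top, v_bottom) of the ROI for each disparity, computed once."""
    rois = []
    for d in range(d_max + 1):
        v_top = int(round(d * y_h / baseline + v_center))
        v_bottom = int(round(d * y_ground / baseline + v_center))
        # Clip to the image (a choice made for this sketch).
        rois.append((max(0, min(height, v_top)), max(0, min(height, v_bottom))))
    return rois


def occupancy_grid(disp: DisparityMap, rois: List[Tuple[int, int]],
                   tol: int = 0, tau: float = 0.15,
                   p_fp: float = 0.02, p_fn: float = 0.02) -> List[List[float]]:
    """Occupancy grid indexed [d][u]; cells without evidence stay at 0.5."""
    width = len(disp[0])
    grid = [[0.5] * width for _ in rois]
    for d in range(1, len(rois)):          # d = 0 (infinite distance) is skipped
        v_top, v_bottom = rois[d]
        n_possible = v_bottom - v_top      # looked up, not recomputed per cell
        if n_possible <= 0:
            continue
        for u in range(width):
            n_visible = n_observed = 0
            for v in range(v_top, v_bottom):
                value = disp[v][u]
                if value is None or value > d + tol:
                    continue               # invalid or occluded: non-visible
                n_visible += 1
                if abs(value - d) <= tol:
                    n_observed += 1
            p_v = n_visible / n_possible
            p_c = 1.0 - math.exp(-(n_observed / n_visible) / tau) if n_visible else 0.0
            grid[d][u] = (p_v * p_c * (1.0 - p_fp)
                          + p_v * (1.0 - p_c) * p_fn
                          + (1.0 - p_v) * 0.5)
    return grid


# Tiny 8 x 2 example: column 0 shows an object at disparity 2 in rows 2..5
# in front of a far background (disparity 0); column 1 is entirely invalid.
disp = [[(2 if 2 <= v < 6 else 0), None] for v in range(8)]
rois = precompute_rois(d_max=3, height=8, baseline=1.0, v_center=4.0,
                       y_ground=1.0, y_h=-1.0)
grid = occupancy_grid(disp, rois)
for d, row in enumerate(grid):
    print(d, [round(p, 3) for p in row])
```

In this toy example the cell at disparity 2 in column 0 comes out as very likely occupied, the cell at disparity 3 as very likely free, and the cell at disparity 1 stays at 0.5 because the object in front hides it. The ROI bounds are the only quantities that depend on the camera geometry, which is why they are computed once per disparity instead of once per cell.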