Principles of Lane Departure Warning and Drowsiness Prediction

Chapter 2 Preliminary

2.4 Principles of Lane Departure Warning and Drowsiness Prediction


The lane departure warning part issues cautionary triggers for drifting off the road, using the lateral lane information extracted by the lane detection algorithm. After measuring the lateral velocity from consecutive frames, the warning system determines when a departure occurs based on the lateral displacement and the TLC (time to lane crossing).
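The relation between lateral displacement, lateral velocity, and TLC can be sketched as follows. This is only a minimal illustration, not the thesis implementation: the function names, the sign convention (distance measured from the vehicle to the lane boundary, shrinking while the vehicle drifts toward it), and both warning thresholds are assumptions made for the example.

```python
def lateral_velocity(prev_distance_m, curr_distance_m, frame_dt_s):
    """Closing speed (m/s) toward the lane boundary from two consecutive frames.

    Positive when the distance to the lane boundary is shrinking.
    """
    return (prev_distance_m - curr_distance_m) / frame_dt_s


def time_to_lane_crossing(distance_m, closing_speed_mps):
    """TLC in seconds; infinite when the vehicle is not drifting toward the lane."""
    if closing_speed_mps <= 0.0:
        return float("inf")
    return distance_m / closing_speed_mps


def should_warn(distance_m, closing_speed_mps, min_distance_m=0.2, min_tlc_s=1.0):
    """Trigger a warning when either margin drops below its threshold.

    Both thresholds are hypothetical values chosen only for illustration.
    """
    tlc = time_to_lane_crossing(distance_m, closing_speed_mps)
    return distance_m < min_distance_m or tlc < min_tlc_s


# Example: the boundary was 0.60 m away, is now 0.55 m away, at 30 frames/s.
speed = lateral_velocity(0.60, 0.55, 1.0 / 30.0)
warn = should_warn(0.55, speed)
```

Combining a distance margin with a TLC margin means that a slow drift very close to the lane and a fast drift from farther away both raise a warning.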

The drowsiness prediction part, on the other hand, combines the experimental results of the BRC (Brain Research Center) at NCTU with real driving video. To estimate the lateral lane position that the driver habitually keeps on a straight road, we construct a single Gaussian model of the stable-state range of the lane position. An additional updating mechanism then lets the system adapt even if the driver changes his or her driving habits. Finally, the proposed proportional gauge of the drowsy degree indicates whether the driver is more or less likely to be drowsy at that moment, based on the amount of reaction time measured while the lane position lies outside the stable-state region.
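A single Gaussian model with an updating mechanism could look like the sketch below. The exponential update rule, the learning rate, the two-sigma stable range, and every name in the code are assumptions for illustration; the text above does not specify them.

```python
import math


class LanePositionModel:
    """Single Gaussian over the lateral lane position (illustrative only)."""

    def __init__(self, mean, std, learning_rate=0.05):
        self.mean = mean
        self.var = std * std
        self.rate = learning_rate  # assumed adaptation speed

    def update(self, position):
        """Exponentially weighted update of the running mean and variance."""
        diff = position - self.mean
        self.mean += self.rate * diff
        self.var = (1.0 - self.rate) * (self.var + self.rate * diff * diff)

    def in_stable_range(self, position, n_sigma=2.0):
        """True when the position lies within n_sigma standard deviations."""
        return abs(position - self.mean) <= n_sigma * math.sqrt(self.var)


def longest_excursion_seconds(positions, model, frame_dt_s, n_sigma=2.0):
    """Longest continuous time the lane position stays outside the stable range."""
    longest = current = 0
    for p in positions:
        if model.in_stable_range(p, n_sigma):
            current = 0
        else:
            current += 1
            longest = max(longest, current)
    return longest * frame_dt_s


model = LanePositionModel(mean=0.0, std=0.1)
duration = longest_excursion_seconds([0.0, 0.3, 0.35, 0.4, 0.05, 0.3], model, 0.1)
```

A longer excursion outside the stable range would map to a higher drowsy degree; the mapping itself is left out because it depends on the experimental data.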

Chapter 3

Lane Detection

3.1 Overview

Figure 3-1 shows the flow chart of lane detection. Because we process only the monochromatic information of each frame, the RGB color space is first transformed into YCbCr so that the luminance component is fully retained. The automatic mechanism for searching the ROI (region of interest) of the image is then described in Section 3.2.1, and the de-noising preprocessing step is presented in Section 3.2.2.

After the preprocessing step, the flow enters the principal detection parts. Because the camera is mounted on the side of the car, the captured image contains mostly lateral-view information next to the wheels. In other words, only the lane trajectory closest to the vehicle can be clearly seen. In Section 3.3, an edge detection operator is developed that adapts to the camera geometry based on the properties of the view angle. In addition, the binarization step proposed in that section depends on the spatial relation induced by the perspective effect. To eliminate the blind-spot region as much as possible, we choose a fish-eye camera, which enlarges the field of view at the cost of obvious distortion. Therefore, the adaptive edge-linking model presented in Section 3.4 is designed to handle lane boundaries in the image sequence whether they appear straight or not.

Fig. 3-1 : The flow chart of lane detection.

3.2 Preprocessing

3.2.1 Automatic ROI Extraction

Before discussing how to search for the lane marking, the color transformation step must be executed. In general, most lane detection algorithms in previous works consider only the grey-level component. The reason is that people can easily see the contrast between the lane boundary and the ordinary road surface even if the lane colors are not all the same. As a result, the luminance information of each frame is retained in our system through the RGB-to-YCbCr transformation. The remaining chrominance components, Cb and Cr, are not used because the human eye is less sensitive to them. The transformation can be described by

$$Y = 0.257\,R + 0.504\,G + 0.098\,B + 16 \tag{3.1}$$

As shown in Section 2.1.1, equations (2-1) and (2-2) give the geometric transformation between the image coordinate system and the vehicle coordinate system, which requires known camera information such as the height, the pan-tilt angle, and the internal focal length. Some methods proposed in previous works must compute the curvature of the real road plane or estimate the lane shape from these intrinsic or extrinsic parameters. However, for practical and commercial applications, an adaptive system should not be sensitive to variations in the camera mounting position. For instance, the system performance should not be influenced by the distance between the rear-view mirror and the road surface, which differs among vehicles.
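Returning to the color transformation of Eq. (3.1), the luminance computation can be sketched as below. The coefficients 0.257, 0.504, 0.098 and the offset 16 are the ones given in the text; the function names and the nested-list image layout are assumptions for the example.

```python
def rgb_to_luma(r, g, b):
    """Luminance Y of one 8-bit RGB pixel, using the coefficients of Eq. (3.1)."""
    return 0.257 * r + 0.504 * g + 0.098 * b + 16.0


def luma_image(rgb_rows):
    """Convert an image given as rows of (R, G, B) tuples into rows of Y values."""
    return [[rgb_to_luma(r, g, b) for (r, g, b) in row] for row in rgb_rows]


# Black maps to the offset 16, and white maps to about 235.
black = rgb_to_luma(0, 0, 0)
white = rgb_to_luma(255, 255, 255)
```

Only the Y channel is computed because the later processing steps use the luminance component alone.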

To meet this goal, we want our detection algorithm to determine automatically the ROI (region of interest) that contains the whole lane trajectory on the road surface, using only the image content of the lateral view. The chosen ROI should not be changed by later frames of the image sequence, whether or not new moving objects are captured. Figure 3-2(a) shows a real frame acquired by the camera alongside the side mirror. Within the image content, the parts that remain fixed can serve as evidence for ROI extraction. In our view, both the side of the car body, which occupies a constant area throughout the image sequence, and the horizon relative to the road plane satisfy this fixed condition. Therefore, the approximate location of the ROI is determined from their edge information.

The ROI is defined as a rectangular region that extends its width to the location next to the wheels and contains all the lane shapes in the image. In general, the top of the ROI lies below the vanishing point, which is situated on the horizon close to the border of the vehicle's window. This 2-D geometry does not depend on the lighting conditions or the view angle of the camera.

Fig. 3-2 : (a) The image acquired by the camera alongside the rearview mirror. (b) The upper left point of ROI next to the boundary of the vehicle window.

Figure 3-2(b) shows the location of the upper left point of the ROI, between the boundary of the window and the vanishing point. In this figure, the green rectangle marks the ROI, and the intersection of the marking cross is the key point that determines the range covered by the ROI. A 2-D gradient operator is used to extract the position of the key point from the boundary information of the vehicle window. Because the window edge is prominent in the horizontal and vertical directions, we use only two of the eight directional Sobel masks for detection, as follows:

x, y : coordinate values of each pixel along the x- and y-axes
f(x, y) : the intensity of this pixel

$$G_x = \bigl[f(x-1,y+1) + 2\,f(x,y+1) + f(x+1,y+1)\bigr] - \bigl[f(x-1,y-1) + 2\,f(x,y-1) + f(x+1,y-1)\bigr] \tag{3.2}$$

$$G_y = \bigl[f(x-1,y-1) + 2\,f(x-1,y) + f(x-1,y+1)\bigr] - \bigl[f(x+1,y-1) + 2\,f(x+1,y) + f(x+1,y+1)\bigr] \tag{3.3}$$

Fig. 3-3 : The mask types of (a) G_x and (b) G_y.
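The two Sobel responses can be evaluated directly from their definitions in Eqs. (3.2) and (3.3), as in the sketch below. The image layout (a list of rows indexed as f[y][x]), the function names, and the |G_x| + |G_y| threshold test are assumptions made for the example.

```python
def sobel_responses(f, x, y):
    """Return (Gx, Gy) at interior pixel (x, y) following Eqs. (3.2) and (3.3).

    f is a list of rows, so the intensity f(x, y) is stored as f[y][x].
    """
    def p(dx, dy):
        return f[y + dy][x + dx]

    # Eq. (3.2): row y+1 minus row y-1, so Gx responds to horizontal edges.
    gx = (p(-1, 1) + 2 * p(0, 1) + p(1, 1)) - (p(-1, -1) + 2 * p(0, -1) + p(1, -1))
    # Eq. (3.3): column x-1 minus column x+1, so Gy responds to vertical edges.
    gy = (p(-1, -1) + 2 * p(-1, 0) + p(-1, 1)) - (p(1, -1) + 2 * p(1, 0) + p(1, 1))
    return gx, gy


def edge_map(f, threshold):
    """Binary edge image (1 = edge) from |Gx| + |Gy|; border pixels stay 0."""
    h, w = len(f), len(f[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx, gy = sobel_responses(f, x, y)
            if abs(gx) + abs(gy) > threshold:
                out[y][x] = 1
    return out


horizontal_edge = [[0, 0, 0], [0, 0, 0], [10, 10, 10]]
gx, gy = sobel_responses(horizontal_edge, 1, 1)
```

With the definitions as written, G_x differences rows and G_y differences columns, which is why the horizontal edge in the example produces a response only in G_x.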

The two mask types are shown in Fig. 3-3. Figure 3-4 displays the results of Sobel edge detection with Gx and Gy. After the border of the window is extracted by thresholding, as in Fig. 3-4(b) to Fig. 3-4(d), the x- and y-coordinates of the key point are found by determining which column and which row retain the most edge pixels along the vertical and horizontal directions, respectively. These coordinates determine the range of the ROI. This process can be expressed as:

( )

The ratio of to the image width is close to 0.5, and the same holds for the ratio of h to the image height. Because more edge pixels naturally lie along the horizon in the horizontal direction and along the perpendicular border of the vehicle in the vertical direction, the intersection point at the car window can be found by searching along the x- and y-directions separately. The detection results under different lighting conditions and view angles are shown in Fig. 3-5.
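The key-point search can be sketched as a pair of projections: count the edge pixels in every row and every column of the thresholded edge image, then take the row and column with the largest counts. The function name and the binary-image layout are assumptions, and the sketch ignores any restriction of the search range.

```python
def find_key_point(edges):
    """Return (x, y) of the key point in a binary edge image.

    edges is a list of rows containing 0 (background) or 1 (edge pixel).
    y is the row with the most edge pixels and x is the column with the
    most edge pixels; ties resolve to the smallest index.
    """
    row_counts = [sum(row) for row in edges]
    col_counts = [sum(col) for col in zip(*edges)]
    y = max(range(len(row_counts)), key=row_counts.__getitem__)
    x = max(range(len(col_counts)), key=col_counts.__getitem__)
    return x, y


# A horizontal edge on row 2 and a vertical edge on column 1.
edges = [
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
key_point = find_key_point(edges)
```

Because the horizon and the window border are long straight edges, their projections dominate the counts even when other edge pixels are scattered through the image.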

Fig. 3-4 : (a) Original image. (b) Edge detection by Gy. (c) Edge detection by Gx. (d) Edge detection by Gx+Gy.

Fig. 3-5 : (a) Daylight. (b) ROI extraction of (a). (c) ROI extraction at night. (d) ROI extraction with a different view angle at night.

Although the horizontal border of the vehicle window may be unclear in the worst conditions, when the illumination from the car and the street lights is inadequate at night, the extraction result remains stable because the edge information of the horizon can be used instead to obtain a similar position along the x-axis, as shown in Fig. 3-5(c) and (d).

3.2.2 De-noise Processing in Spatial and Temporal Domain

The quality of the image sequences collected by a vision-based sensing device is almost always affected by the variation of lighting conditions, such as day versus night. Because of the photosensitivity of cameras, high-frequency noise becomes a serious problem in some driving environments, especially in night vision. Therefore, a preprocessing step that suppresses noise must be included in the detection architecture if the system is expected to work robustly all day long.

In general, a low-pass filter can be applied before the process that extracts information about the boundary, texture, or shape of the objects of interest within the frame. Since the frame is stored as a collection of discrete pixels, we need a discrete approximation of the chosen filter before the convolution step. Hence, the Gaussian smoothing operator, a 2-D point-spread function applied by convolution, is used for the de-noising task in our system. The isotropic form of the Gaussian is shown below:

$$G(x, y) = \frac{1}{2\pi\sigma^{2}} \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)$$

where σ is the standard deviation of this function.

The shape of this distribution is shown in Fig. 3-6(a); the function is assumed to have zero mean. In principle, the Gaussian distribution is non-zero everywhere, but its value is close to zero beyond about three standard deviations from the mean. Therefore, we can truncate it into a finite mask applied at each pixel of the frame. Figure 3-6(b) shows a suitable integer-valued convolution mask for a Gaussian with σ = 1. The Gaussian filter outputs a weighted average of the neighborhood of each pixel. It provides gentler smoothing and preserves edges better than the normal-sized mean filter, owing to the different sizes of the two masks (5x5 versus 3x3). Moreover, with an appropriate Gaussian filter size, determined by the standard deviation, a wider range of spatial frequencies is preserved in the image after filtering, because the Fourier transform of a Gaussian is itself a Gaussian. However, an overly wide filter region causes serious blurring of the image content. Therefore, the 5x5 Gaussian mask is adopted in this part.

Fig. 3-6 : (a) 2-D Gaussian distribution with mean (0,0) and σ = 1. (b) Suitable 5x5 mask of the Gaussian filter with σ = 1 (normalization factor 1/273).
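A 5x5 integer Gaussian mask for σ = 1 can be applied as in the sketch below. The weights are the widely used integer approximation whose entries sum to 273; treating it as the mask of Fig. 3-6(b) is an assumption, as are the function name and the choice to leave the two-pixel border unfiltered.

```python
# Commonly used 5x5 integer approximation of a Gaussian with sigma = 1.
GAUSSIAN_5X5 = [
    [1, 4, 7, 4, 1],
    [4, 16, 26, 16, 4],
    [7, 26, 41, 26, 7],
    [4, 16, 26, 16, 4],
    [1, 4, 7, 4, 1],
]
GAUSSIAN_NORM = sum(sum(row) for row in GAUSSIAN_5X5)  # equals 273


def gaussian_smooth(image):
    """Convolve a grey-level image (list of rows) with the 5x5 mask.

    Pixels closer than two pixels to the border keep their original value.
    """
    h, w = len(image), len(image[0])
    out = [list(row) for row in image]
    for y in range(2, h - 2):
        for x in range(2, w - 2):
            acc = 0
            for dy in range(-2, 3):
                for dx in range(-2, 3):
                    acc += GAUSSIAN_5X5[dy + 2][dx + 2] * image[y + dy][x + dx]
            out[y][x] = acc / GAUSSIAN_NORM
    return out


# A single bright pixel is spread over its neighborhood.
impulse = [[0] * 5 for _ in range(5)]
impulse[2][2] = 273
smoothed_center = gaussian_smooth(impulse)[2][2]
```

Dividing by the sum of the weights keeps the mean brightness of flat regions unchanged.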

Figure 3-7 shows edge detection results, obtained with the method detailed in the next section, after preprocessing by the mean filter and by the Gaussian filter. Comparing (c) and (d) shows that the extraction of the lane boundary is easily disturbed by the remaining noise if the smoothing filter cannot effectively remove the high-frequency perturbation.

Fig. 3-7 : (a) Mean filter. (b) Gaussian filter. (c) Edge detection after (a). (d) Edge detection after (b).

Salt-and-pepper noise, which exists in both the spatial and the temporal domain, is more challenging for the preprocessing tasks, especially in night environments. So that the proposed lane detection method is independent of variations in external lighting conditions, a time-averaging process over the current and previous frames is added after the Gaussian smoothing. The integrated de-noising procedure is shown in Fig. 3-8.

Fig. 3-8 : Flow chart of the complete preprocessing steps.
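The time-averaging step can be sketched as a pixel-wise mean over the current frame and a few previous frames. The equal weighting, the buffer length, and the names below are assumptions, since the text only states that the current and previous frames are averaged after Gaussian smoothing.

```python
from collections import deque


def average_frames(frames):
    """Pixel-wise mean of equally sized grey-level frames (lists of rows)."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(w)] for y in range(h)]


class TemporalAverager:
    """Keeps the last `length` smoothed frames and returns their mean."""

    def __init__(self, length=2):
        self.buffer = deque(maxlen=length)

    def push(self, smoothed_frame):
        self.buffer.append(smoothed_frame)
        return average_frames(list(self.buffer))


averager = TemporalAverager(length=2)
first = averager.push([[0, 100], [50, 50]])
second = averager.push([[100, 100], [50, 150]])
```

A short buffer limits the ghosting that averaging introduces when the scene moves between frames.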

The preprocessing work

3.3 Lane Boundary Detection

3.3.1 Edge Detection

The objective of this section is to find the features of the lane markers from the image information. By observation, lanes have some distinctive boundary properties. The most obvious one is that lane markers are brighter than the neighboring road surface, even when their colors differ. In addition, lane shapes in the image are almost always slender. Using these two properties, extracting the lane boundary is an important step toward locating the real lane position throughout the video.

The choice of edge detection operator must take into account its suitability and effectiveness for the image content. Y. Wang [7] and [10] select the Canny operator to locate the pixels where significant lane edge information exists while also considering the gradient characteristics. However, this operator carries an obvious computing load because of the mechanism that judges the orientation and magnitude of each candidate edge in the whole frame. Kreucher [20], on the other hand, provides a frequency-based extraction method that finds the diagonally dominant edges through part of the DCT coefficients. Although this concept is intuitive, the edge components determined by the DCT may relate to local information less closely than those of the common gradient operators, especially for video acquired by a camera on the side of the vehicle, where the lane markers have no fixed edge direction. Considering both system performance and adaptation to video with various view angles, the LoG (Laplacian of Gaussian) operator is implemented in this step.

LoG is a combined convolution operation: the Gaussian smoothing filter is first convolved with the Laplacian filter, and the resulting hybrid mask is then convolved with the image to obtain the edge detection result. As an approximation of the second-order derivative, the Laplacian mask highlights regions where pixel intensity changes rapidly, such as object boundaries. Nevertheless, this operator cannot be used alone for edge extraction because of its high sensitivity to noise. To reduce this effect, the image is usually smoothed by a Gaussian before the Laplacian mask is applied. Because convolution is associative and the second derivative is a linear operation, convolving the image with the hybrid mask is equivalent to convolving it with the Gaussian first and then computing the Laplacian of the result. The 2-D LoG function is as follows:

$$\mathrm{LoG}(x, y) = -\frac{1}{\pi\sigma^{4}} \left[1 - \frac{x^{2} + y^{2}}{2\sigma^{2}}\right] \exp\!\left(-\frac{x^{2} + y^{2}}{2\sigma^{2}}\right)$$
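A discrete LoG mask can be obtained by sampling the standard 2-D LoG expression on an integer grid, as sketched below. The function names, the default σ = 1, and the mean-subtraction step that makes the sampled mask sum to zero are assumptions for illustration; the resulting values are not the integer mask of Fig. 3-9.

```python
import math


def log_value(x, y, sigma):
    """Standard 2-D Laplacian of Gaussian evaluated at (x, y).

    LoG = -1/(pi*sigma^4) * (1 - r^2/(2*sigma^2)) * exp(-r^2/(2*sigma^2)),
    with r^2 = x^2 + y^2.
    """
    r2 = x * x + y * y
    s2 = sigma * sigma
    return -1.0 / (math.pi * s2 * s2) * (1.0 - r2 / (2.0 * s2)) * math.exp(-r2 / (2.0 * s2))


def log_kernel(size=5, sigma=1.0):
    """Sample the LoG on a size x size grid and shift it to sum to zero.

    The zero-sum shift makes the response vanish on flat image regions.
    """
    half = size // 2
    grid = range(-half, half + 1)
    kernel = [[log_value(x, y, sigma) for x in grid] for y in grid]
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]


kernel = log_kernel(5, 1.0)
center = kernel[2][2]
```

The centre of the mask is negative and the surrounding ring is positive, so the filtered image changes sign across an edge.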

The 5x5 mask approximation to the LoG function and its 3-D plot are shown in Fig. 3-9. Observing the blind-spot view images from the camera alongside the rear-view mirror, we find that the included angle between the lane edge and the vertical Y-axis of the image plane must lie within the range from 0° to 90°. Unlike other gradient operators, the LoG mask has no orientation, so it cannot adapt to specific edge directions of an object. An additional 5x5 mask, similar in form to a Sobel mask with a tilt angle of 45 degrees, is therefore combined with the LoG mask to adapt to the lateral-view image environment.

The convolving relation is explained in the following:

The 3-D plot of the new combined mask is shown in Fig. 3-10(b). Compared with Fig. 3-9(b), this distribution not only maintains most of the LoG shape but also gains an orientation suited to the blind-spot view, owing to its "slant" shape in Fig. 3-10(b). The edge extraction results for the lane boundary with the LoG mask and with the new combined mask are shown in Fig. 3-11.

According to the result in Fig. 3-11(d), only the inner boundary of the lane is extracted, and this property helps to link the lane trajectory, as described in a later section.

Fig. 3-10 : (a) The additional mask for LoG combination. (b) 3-D plot of the new 5x5 combined mask.

Fig. 3-11 : (a) The original image. (b) Gaussian smoothing within the ROI of (a).

The morphological post-processing step thins the lane marking after edge extraction. Two conditions determine which pixel is retained in

⇒ the lane boundary of the row

else

⇒ I(k)=255

The edge-finding approach to determine the location of P(i) will be introduced in Section 3.4.

3.3.2 Adaptive Threshold Determination by Distinct Spatial Region

The pixels within the ROI are extracted for the image processing tasks in our system. According to perspective geometry, the length and width of the lane markers within the ROI vary with position. In other words, the lane boundary in the bottom part of the ROI is always wider and longer than that in the upper part. Taking this perspective effect into account, an adaptive mechanism is developed to adjust the threshold for different sub-regions, whose size depends on the ROI.

After edge extraction, a threshold must be chosen for the image to make the detection result more distinct. Because of the evident contrast between the lane markers and the neighboring road surface, the gradient magnitude produced by the edge operator at the lane boundary is usually larger than at other locations. Therefore, in this section the mean and standard deviation computed for each row within the ROI are used to select the threshold for each region.

Take the normal distribution as an example: the range within one standard deviation of the mean accounts for about 68% of the whole set, and the range within two standard deviations accounts for about 95%. For each row within the ROI, the threshold value is selected with reference to this dispersion property, since the gradient magnitude of the lane markers is certainly higher than that of the normal road surface. That is,

The performance of the binarization approach may be affected by the edge information of adjacent moving vehicles close to the lane, or by their lights, especially in the upper part of the ROI, which may not contain enough lane gradient magnitude. Hence, the ROI is divided into seven sub-regions when it is automatically extracted in the first frame of the video, as illustrated in Fig. 3-12.

Fig. 3-12 : The division of ROI into seven sub-regions.

Where $\text{size} = \dfrac{\text{height of ROI}}{N-1}$, N: number of segments (we choose 7 in this system)

In this way, the thresholds for different locations are selected by tuning the mean value of the pixels in each row, and they are arranged by magnitude from the bottom sub-region to the top, as described in the following:

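Equation (3.10) defines the row threshold Threshold(j) in terms of the row mean $\operatorname{Mean}_{i \in (0,\ \text{width of ROI})} f(i,j)$, the row standard deviation $\operatorname{Standard\ deviation}_{i \in (0,\ \text{width of ROI})} f(i,j)$, the constant k, and the tuning parameter α. The parameter α is computed, with a factor of 0.1, from a sum (∑) over the row standard deviations, where $R_n$ denotes the n-th sub-region of ROI.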
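The idea of a per-row threshold built from the row mean and the row standard deviation can be sketched as below. This is a simplified stand-in, not Eq. (3.10): it assumes the form mean + k·(standard deviation), omits the tuning parameter α and the sub-region handling, and uses hypothetical function names. The sub-region size follows the relation size = (height of ROI)/(N − 1) stated in the text.

```python
import math


def sub_region_size(roi_height, n_segments=7):
    """Sub-region height, following size = (height of ROI) / (N - 1)."""
    return roi_height / (n_segments - 1)


def row_threshold(row, k=1.0):
    """Assumed simplified threshold: row mean plus k row standard deviations."""
    n = len(row)
    mean = sum(row) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in row) / n)
    return mean + k * std


def binarize_rows(gradient, k=1.0):
    """Threshold each row of a gradient-magnitude image independently.

    Pixels above their row threshold become 255, all others 0.
    """
    out = []
    for row in gradient:
        t = row_threshold(row, k)
        out.append([255 if v > t else 0 for v in row])
    return out


# One strong response in a flat row survives the threshold.
binary = binarize_rows([[0, 0, 0, 0, 10]])
```

Computing the statistics per row lets the threshold follow the perspective effect, since rows near the bottom of the ROI contain wider and stronger lane responses than rows near the top.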

Fig. 3-13 : (a) An image photographed in a tunnel. (b) Lane-marker extraction without the sub-region threshold. (c) Lane-marker extraction with the sub-region threshold.

Figure 3-13(a) shows an image captured while driving in a tunnel. The lane boundary in the upper region is hard to see because of the glare from the lights of the vehicle behind, as shown in Fig. 3-13(b). This overexposure effect is reduced by applying the tuning parameter (α), as shown in Fig. 3-13(c).

3.4 Lane-Finding Algorithm

Since the edge information of the lane markers has been acquired by the foregoing steps, the lane trajectory within the ROI can be marked and tracked using the pixels lying on the sides of the lane boundary in the image. Several studies have addressed lane-model construction. Y. U. Yim and S. Y. Oh [21] use the starting position, direction, and saturation of the lanes as three features to initialize the lane vector and find the most probable lane trajectory with the Hough transform. Roland Chapuis [25] uses a statistical model to specify the detection ROI and thereby narrow the search area for lane markings. In contrast to methods based purely on image processing, A. Lopez [28] incorporates the lane geometry into the fitting of the lane model. D. J. Kang [30] combines the vanishing point of the road, obtained from a frontal camera, with the Hough transform for lane tracking.
