Chapter 1 Introduction
1.4 Organization
This thesis is organized as follows. Chapter 2 briefly reviews related work, describing the existing methods and the features they most commonly adopt. Chapter 3 describes in detail the core techniques proposed for this system, which address the inadequacies and disadvantages of those features. Chapter 4 presents the main algorithm. Chapter 5 reports the experimental results and discusses the efficiency and reliability of the proposed algorithm. Finally, Chapter 6 presents our conclusions and future work.
Chapter 2
Related Works
2.1 Overview
In recent years, researchers have developed many techniques for detecting the road and the driving situation in the frontal view of a vehicle. Topics related to our research include “road detection”, “road recognition”, “road following”, and “lane detection”. In this chapter, we classify the related works by the approaches and features they use, and we describe the problems that may arise under real conditions. Section 2.2 groups the existing methods into several bases according to the features adopted to extract the positions of boundaries or roads, and then briefly introduces these methods.
2.2 Approaches Introduction
To distinguish roads and boundaries for driving situation detection in the frontal view of a vehicle, researchers have proposed diverse approaches that rest on different bases and use different features. We describe the most important bases, namely “region-based”, “boundary-based”, and “model-based”, and briefly introduce the most popular features.
The most common features can be divided into four typical types, as seen in Fig. 2-1. Methods usually use one feature or integrate several to understand the situation in front of the vehicle during driving.
Fig. 2-1: The most popular features.
The first feature used to detect the road area is color information. The color feature is somewhat unstable; common causes of this instability include non-homogeneous surfaces and lighting variation. It is therefore important that a method using the color feature can adapt its color settings as conditions change. However, color-feature methods cannot capture the precise road area and cannot compute the current road information adaptively. Furthermore, processing is time-consuming because color images contain three times as much data as grey-scale images.
The second feature locates the road boundary using the edge characteristics of the boundary. If there are no distinct boundary features, or if non-boundary objects produce stronger edge intensity, false detections may result. The third feature is the geometry of the road. The geometric feature depends largely on vanishing-line information, but camera shake during driving destabilizes the position of the vanishing line in the image. In addition, many approaches adopt inverse perspective mapping (IPM) to transform the original image into a bird's-eye-view image. Bird's-eye-view images can capture the parallelism of the road boundaries, but they require adjusting many parameters, such as the placement of the camera and the internal camera information. Finally, the last feature, texture, suffers from the same drawbacks as the color feature. Besides, since most road surfaces are smooth, it is hard to discriminate the driveway from the surroundings. We summarize the disadvantages of the common features in Fig. 2-2.
Fig. 2-2: The disadvantages of common features.
The general approaches can be divided into three kinds of bases: “region-based”, “boundary-based”, and “model-based”. We introduce the main idea of each as follows:
1. Region-Based :
Methods for road detection or road recognition usually belong to the region-based category. The idea of region-based methods resembles the solution of a segmentation problem: they adopt road features to extract the road region. These features include color and texture, and many methods use them to decide whether a pixel or region is road or non-road. However, road detection methods usually identify only the approximate road region, as seen in Fig. 2-3, whereas the ultimate goal is to detect the boundary of the main drive path. Some methods combine edge and geometric features of the boundary and shape of the road to first estimate candidate road regions, and then narrow the search area from the whole image to those candidates in order to save detection time.
(a) (b) (c)
Fig. 2-3: (a) captured image, (b) road detection result, (c) the boundary of the main drive path.
Region-based methods usually focus on the difference in color or texture information between road and non-road regions, and apply a clustering algorithm to these features. They design a threshold or a hyperplane as the decision boundary to determine which category a region or a single pixel belongs to. Such methods are usually insensitive to road shape, but sensitive to illumination variation and to inconsistent surfaces, such as shadows and non-homogeneous appearance, as seen in Fig. 2-4.
Fig. 2-4: The non-homogeneous appearance of the surface causes instability in region-based methods without an adaptive module.
Supervised classification applied to road following (SCARF) [1] and unsupervised clustering applied to road following (UNSCARF) [12] are integrated with the navigation system at the NAVLAB of Carnegie Mellon University (CMU). These systems can deal with unstructured roads in urban areas that have no obvious lane markings but show a considerable difference in color between the road area and its surroundings. Both can be regarded as color segmentation methods based on clustering. SCARF uses RGB as color features to build multiple Gaussian color models for the road and non-road areas, and uses standard nearest-mean clustering (K-means) to group pixels with similar colors into four classes each for road and non-road. UNSCARF adopts the iterative self-organizing data analysis technique algorithm (ISODATA) to cluster pixels, where each pixel x is described by a five-dimensional feature made of its RGB color and its image coordinates, x = [R, G, B, X, Y]. These clustering methods must iterate to reach the best clustering result, and the iterations are too time-consuming to meet real-time requirements. In addition, it is difficult for them to distinguish the road from its surroundings by color alone when the road area and the area beyond the road shoulder have similar colors. Furthermore, they assume that the road width is known and that the road is straight, so the recognition result is less accurate when the road curves sharply or its width is non-uniform. As a result, they cannot explicitly detect arbitrary road shapes.
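To make the clustering idea concrete, the following is a minimal sketch that runs plain K-means on per-pixel feature vectors x = [R, G, B, X, Y]. It is not the SCARF or UNSCARF implementation and it does not reproduce ISODATA (which also splits and merges clusters); the function names, the number of clusters, and the fixed iteration count are assumptions chosen only for illustration.

```python
import numpy as np

def pixel_features(image):
    """Build one [R, G, B, X, Y] row per pixel of an H x W x 3 image."""
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    return np.column_stack([
        image.reshape(-1, 3).astype(float),
        xs.ravel().astype(float),
        ys.ravel().astype(float),
    ])

def kmeans(features, k, iterations=20, seed=0):
    """Plain K-means (assumed stand-in for the clustering step)."""
    rng = np.random.default_rng(seed)
    # Start from k distinct samples picked at random.
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iterations):
        # Squared distance from every sample to every center.
        d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            members = features[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    # Final assignment against the updated centers.
    d = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1), centers
```

Every iteration computes the distance from each pixel to each cluster center, which illustrates why such iterative clustering becomes costly on full-resolution frames.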
J. Huang 2007 [3] uses the HSV color space to detect unstructured roads without a priori knowledge. It simply takes a small area in front of the vehicle as a road sample, computes the means of Hue and of color purity (defined as Saturation multiplied by Value in the HSV color space), and judges whether a pixel belongs to the road by a simple Manhattan distance to the mean color purity. Because the road color is grayish, its R, G, and B values are similar. Therefore, the Hue component is sensitive and unstable, while color purity, composed of the Saturation and Value components, is not discriminative enough to separate road from non-road. In addition, the method is pixel-based, so processing is slow. Because it uses no edge information about the boundary, its effectiveness is not stable.
Mobileye [4] designed an autonomous vehicle for the DARPA Grand Challenge race in the Mojave desert. They adopted the AdaBoost learning principle to detect the off-road path, based on road texture information extracted by oriented filters, Walsh-Hadamard kernels, and moments. The approach is only appropriate for off-road paths whose surface has uniform texture, and it often fails when the current road is dissimilar to the training set. Rasmussen [5] used the texture cue to navigate a vehicle on unstructured roads that have neither lane markings nor a homogeneous surface. Previous approaches use the color feature to build Gaussian mixture models [6][7], but the color of the road is actually not homogeneous. Road colors vary for reasons including different lighting and different surface materials, so the color model is not adaptive enough to perform precisely. Some methods propose a learning model [8], but they have no means of determining where the road regions are. The samples used to build the model are therefore inappropriate, which makes the learning even less effective.
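The color-purity test attributed to [3] above can be sketched as follows. This is only an illustration under stated assumptions: the distance is taken over both Hue and purity, the threshold value is hypothetical, the circular nature of Hue is ignored, and all function names are ours rather than those of [3].

```python
import colorsys

def hue_and_purity(rgb):
    """Return (hue, purity) for an 8-bit RGB triple; purity = S * V."""
    r, g, b = (c / 255.0 for c in rgb)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return h, s * v

def sample_means(sample_pixels):
    """Mean hue and mean purity over a road sample taken in front of the vehicle."""
    feats = [hue_and_purity(p) for p in sample_pixels]
    n = len(feats)
    return (sum(f[0] for f in feats) / n, sum(f[1] for f in feats) / n)

def is_road(rgb, means, threshold=0.15):
    """Manhattan distance to the sample means; threshold is a hypothetical value."""
    h, p = hue_and_purity(rgb)
    return abs(h - means[0]) + abs(p - means[1]) <= threshold
```

With this sketch, the pure gray pixel (100, 100, 100) has Hue 0, while the almost identical pixel (100, 100, 101) has Hue of about 0.67, which illustrates why Hue is unstable on grayish road surfaces.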
We summarize the features selected by the region-based approaches and their common disadvantages in Fig. 2-5.
Fig. 2-5: The features and common disadvantages of the region-based methods.
2. Boundary-Based :
Some methods use characteristics of the roadside, including its edges and shape, to extract the boundary position. Bertozzi et al. [10] proposed the GOLD (Generic Obstacle and Lane Detection) system, a stereo-vision-based hardware and software architecture developed to improve the road safety of moving vehicles. The GOLD system addresses lane detection and obstacle detection at the same time. Lane detection is a boundary-based technique that relies on the presence of road markings, while obstacles in front of the vehicle are localized by processing pairs of stereo images. Inverse perspective mapping (IPM) [11] is the most important component. The IPM algorithm projects the image plane onto the ground plane, which is assumed to be z=0 in the real world, producing a bird's-eye-view scene as shown in Fig. 2-6 (a) to (c). Road boundaries then become two parallel lines in the bird's-eye-view image, and this cue is used to extract the boundaries. However, the IPM algorithm needs appropriate camera parameters, so it cannot be used immediately after the camera is mounted on the vehicle. Besides, some objects may produce interfering edges, as seen in Fig. 2-6 (c), which makes detection difficult.
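The geometric idea behind IPM can be illustrated with a simplified pinhole model. The sketch below is not the method of [11]: it assumes a distortion-free camera whose optical axis is parallel to a flat ground plane, pixel coordinates (u, v) measured from the principal point with v pointing down (so the horizon lies at v = 0 and only v > 0 is valid), and hypothetical values for the focal length and camera height.

```python
import numpy as np

def ipm_homography(focal_px, camera_height):
    """3x3 matrix mapping homogeneous pixels (u, v, 1) to ground (X, Z, w).

    Under the assumed model u = f*X/Z and v = f*h/Z, so that
    X = h*u/v (lateral offset) and Z = f*h/v (distance ahead).
    """
    return np.array([
        [camera_height, 0.0, 0.0],
        [0.0, 0.0, focal_px * camera_height],
        [0.0, 1.0, 0.0],
    ])

def image_to_ground(H, pixels):
    """Apply the homography to an (N, 2) array of pixels; returns (N, 2) ground points."""
    pts = np.column_stack([pixels, np.ones(len(pixels))])
    mapped = pts @ H.T
    return mapped[:, :2] / mapped[:, 2:3]
```

Pixels sampled along a straight lane boundary converge toward the horizon in the image, yet they all map back to the same lateral offset X on the ground, which is why the two boundaries appear parallel in the bird's-eye view. The dependence on focal length and camera height also shows why the mapping cannot be used before the camera parameters are known.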
(a) (b) (c)
Fig. 2-6: The method transforms the captured image to a bird's-eye-view image using inverse perspective mapping, (a) to (b), and extracts the boundaries from the bird's-eye-view image, (b) to (c).
Most boundary-based approaches [10][12][13][14] rely on the distinctness of the boundary; such methods assume a clear, solid lane mark on the road surface. An ambiguous country roadside or a broken lane mark in the middle of the road can cause the detection of the main drive path to fail, as shown in Fig. 2-7 (a) and (b). Shadows or objects at the side of the road can sometimes produce strong edge information, as seen in Fig. 2-7 (b) and (c); such interference must be avoided in order to detect the road boundary precisely.
(a) (b) (c)
Fig. 2-7: Causes of difficulty for boundary-based approaches.
In order to increase accuracy, some methods combine the edge feature with geometric properties of the road [10][12][13]. Due to the perspective effect of the camera, each pair of parallel lines in the real world converges to a vanishing point on the horizon in the captured image. Generally, the key of the approach is to apply a voting procedure such as the Hough transform, which maps each edge point (x, y) of the image to the (θ,γ) plane by equation (2-1), where it forms a sinusoidal curve, as seen in Fig. 2-8 (a) and (b).
\[ x\cos\theta + y\sin\theta = \gamma \tag{2-1} \]
They divide the image into many sections and compute the accumulator values after all edge points have been transformed to the (θ,γ) plane. The pair of boundaries is detected by extracting the pair of lines that have maximal accumulator values and intersect on the vanishing line, as seen in Fig. 2-9. These algorithms are fast and appropriate for highway driving; however, they make assumptions about the structure of the road and do not work on unmarked roads without clear and distinct lanes at the boundaries. They are strongly affected by the edges of surrounding objects, so the results are usually unacceptable when the road is too narrow to yield enough boundary edges. In addition, the camera mounted on the vehicle shakes because of the unevenness of the road while the vehicle is moving. As a result, the vanishing-line position is unstable, which causes the algorithm to extract incorrect boundaries. Moreover, these approaches cannot detect the boundary precisely when the road curvature is very sharp, because the boundary in the far-field sections is unclear and the strong edges produced by surrounding objects make curved-boundary detection fail, as shown in Fig. 2-10. Therefore, the height of the far-field section must be decreased to handle sharply curved boundaries.
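The voting step can be sketched as below, assuming the usual normal form of the line, x cos θ + y sin θ = γ. This is a generic illustration rather than the procedure of [10][12][13]: it votes over the whole image instead of per section, it does not enforce the vanishing-line constraint, and the function names, the one-pixel γ bins, and the parameter max_gamma are assumptions.

```python
import numpy as np

def hough_accumulator(edge_points, max_gamma, n_theta=180):
    """Vote each edge point (x, y) into a (gamma, theta) accumulator.

    max_gamma is an integer bound on |gamma|, e.g. the image diagonal in pixels.
    """
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    acc = np.zeros((2 * max_gamma + 1, n_theta), dtype=int)
    cols = np.arange(n_theta)
    for x, y in edge_points:
        # One sinusoid per edge point: gamma as a function of theta.
        gammas = np.rint(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        # Shift by max_gamma so that negative gamma values index valid rows.
        acc[gammas + max_gamma, cols] += 1
    return acc, thetas

def strongest_line(acc, thetas, max_gamma):
    """Return (theta, gamma) of the accumulator cell with the most votes."""
    row, col = np.unravel_index(np.argmax(acc), acc.shape)
    return thetas[col], int(row) - max_gamma
```

Because every edge point votes, strong edges from objects beside the road contribute peaks of their own, which is the source of the interference discussed above.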
We summarize the features selected by the boundary-based approaches and their common disadvantages in Fig. 2-11.
(a) (b)
Fig. 2-8: Points transferred to the (θ,γ) plane.
Fig. 2-9: Extract the boundaries in each section by selecting the lines with maximal accumulator values [12].
Fig. 2-10: The approach suffers from edges that do not belong to the road boundary and has difficulty detecting curved boundaries precisely
Fig. 2-11: The features and common disadvantages of the boundary-based methods
3. Model-Based :
The last category is the model-based approach. These methods apply a mathematical function to model the shape of the road boundary. They use features such as the edge and geometric features to make the model converge to the boundary position while minimizing its error. Wang [12][13] uses a local interpolating spline, such as the Catmull-Rom spline or the B-spline, to describe the boundary shape by setting control points on the boundary, as shown in Fig. 2-12. Jung 2005 [14] used the edge feature, combining an edge distribution function with a modified Hough transform to locate the boundary first, and then used a linear-parabolic model to perform lane following and lane departure detection. The linear-parabolic model uses a linear equation to track the boundary in the near field and a quadratic equation to track it in the far field. The near field and far field are shown in Fig. 2-13, and the linear-parabolic model is given in equation (2-2). The same idea is adopted in [15], which uses the vanishing point and control points on the boundary to model a lane-curve function (LCF), shown in Fig. 2-14.
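One common way to write a linear-parabolic boundary model is sketched below. The exact form and symbols of equation (2-2) are not reproduced here, so the coefficients a, b, c, the split position x_m, and the convention that larger x means closer to the vehicle are assumptions made only for illustration.

```python
def linear_parabolic(x, a, b, c, x_m):
    """Boundary position at image row x under an assumed linear-parabolic model.

    Near field (x > x_m): straight line through (x_m, a) with slope b.
    Far field (x <= x_m): the same line plus a quadratic term, so that the
    two pieces share value and slope at x = x_m.
    """
    dx = x - x_m
    if x > x_m:
        return a + b * dx
    return a + b * dx + c * dx * dx
```

Since the quadratic term vanishes together with its derivative at x = x_m, the curve stays smooth across the boundary between the near field and the far field. A single quadratic term, however, cannot follow a boundary with several bends.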
(a) Ground Plane (b) Image Plane
Fig. 2-12: Spline model describing the boundary by the control points P1, P2, P3
Fig. 2-13: The position Xm divides the scene into far field and near field.
Fig. 2-14: The lane-curve function (LCF) and its asymptotes. (2) is the near field LCF, (3) is the far field LCF, (4) is the near field asymptote, and (5) is the far field asymptote.
In practice, the model-based approach relies on the accuracy of the features. If the features are unstable, the model cannot converge to the precise position of the boundary. Moreover, if the boundary shape is too twisted, first- and second-degree equations cannot match the curve. Multiple curves in the road can also complicate the parameters of the formula. We summarize the common features of the model-based methods and their disadvantages in Fig. 2-15.
Fig. 2-15: The features and common disadvantages of the model-based methods
Chapter 3
Proposed Techniques
3.1 Objective
Chapter 2 introduced current approaches built on different bases, including region-based, boundary-based, and model-based methods, and identified the frequent inadequacies of the features they adopt. In this chapter, we therefore propose two core techniques to improve the precision and adaptability of the road following system: 1. the on-line L*a*b color model and 2. the LMR boundary window. The first technique trains an adaptive color model to overcome the non-homogeneity of the road color. The second technique uses a new tracking tool, called the LMR boundary window, to extract the boundary position from edge intensity differences. We introduce the details of these core techniques and explain how they address the inadequacies of the features.
3.2 On-line L*a*b color model
3.2.1 Objective
Most region-based approaches use the color feature to detect the road. Color information has three times the dimensions of the original grey-scale image, so it offers better discrimination and therefore yields more complete and precise detection. Most current papers propose off-line methods to determine the range of each color component, which only enables detection of road areas within that color range. However, off-line methods are ineffective because they cope poorly with the heterogeneity of surface materials and the variation of illumination. Therefore, we propose an on-line learning model that is updated continuously during driving. Through this training method, we enhance the adaptability of the system.
Besides, because the road color is unevenly distributed, we select the samples that most closely resemble the detection zone to build a dedicated color model. In addition, we do not run detection on the whole image; instead, we set each LMR boundary window, which is prepared to track the boundary position, as a region of interest (ROI). When tracking the boundary, we therefore only examine each LMR boundary window (ROI), which increases the detection speed and the performance of model updating. Each LMR boundary window has its own dedicated color model, whose on-line sampling area lies very close to the corresponding window. The dedicated model and its suitable sampling area improve detection accuracy in each LMR boundary window when the road surface has a uniform color appearance.
3.2.2 Color space selection
In this section, we illustrate the advantageous properties of the L*a*b color space by comparing it with other color spaces, and explain why we adopt the L*a*b color feature for modeling.
We describe the color appearance of the driving environment by selecting color features and using them to build a color model of the road. We therefore have to choose a color space that is uniform, has little correlation between components, and is concentrated, in order to increase the accuracy of the model. In computer color vision, all visible colors are represented by vectors in a three-dimensional color space. Many common color spaces have been used to facilitate the analysis of color images.
Among the different color spaces, RGB is the most commonly selected color feature because it is the native format of the captured image, without any distortion. RGB-based models have been used in many approaches [16]. However, the RGB components are highly correlated, and similar colors spread widely in the color space. As a result, it is difficult to evaluate the similarity of two colors from their 1-norm or Euclidean distance in the color space.
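For reference, a conversion from RGB to L*a*b can be sketched as follows. The sketch assumes 8-bit sRGB input and the D65 reference white, which are common defaults but are our assumptions here; the function name srgb_to_lab is likewise ours.

```python
import numpy as np

# Linear sRGB -> CIE XYZ matrix and D65 reference white (assumed defaults).
_RGB_TO_XYZ = np.array([
    [0.4124564, 0.3575761, 0.1804375],
    [0.2126729, 0.7151522, 0.0721750],
    [0.0193339, 0.1191920, 0.9503041],
])
_WHITE_D65 = np.array([0.95047, 1.0, 1.08883])

def srgb_to_lab(rgb):
    """Convert one 8-bit sRGB triple to (L*, a*, b*)."""
    c = np.asarray(rgb, dtype=float) / 255.0
    # Undo the sRGB gamma to obtain linear intensities.
    linear = np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)
    t = (_RGB_TO_XYZ @ linear) / _WHITE_D65
    delta = 6.0 / 29.0
    f = np.where(t > delta ** 3, np.cbrt(t), t / (3 * delta ** 2) + 4.0 / 29.0)
    return np.array([
        116.0 * f[1] - 16.0,      # L*: lightness
        500.0 * (f[0] - f[1]),    # a*: green-red axis
        200.0 * (f[1] - f[2]),    # b*: blue-yellow axis
    ])
```

For a neutral gray surface, a shadowed pixel such as (60, 60, 60) and a lit pixel such as (120, 120, 120) differ in all three RGB components, but after conversion they differ almost only in L*, while a* and b* stay close to zero.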
We use experimental results to show that the L*a*b color space is a better choice than the others. The RGB color space is susceptible to illumination variation on the road, such as shadows. In Fig. 3-1, we track the value of (Imax-Imin) for each component on a selected road area over time, which is plotted along the horizontal axis