Chapter 1 Introduction
1.3 Overview of Proposed System
In this study, we design a vision-based computer-assisted driving system for use in outdoor environments. To achieve this goal, first, an autonomous learning process that extracts parameters automatically by image processing techniques is proposed. Second, a vehicle movement analysis method is proposed, which finds lane lines and analyzes them to determine the current movement condition of the vehicle. Finally, a method for detecting and tracking neighboring objects is proposed for risk condition recognition.
More specifically, the system designed in this study with these proposed techniques carries out the following tasks.
(1). Designing a car roof rack with adjustable width that can fit various sizes of vehicles to fix an omni-directional camera on the top, as shown in Figure 1.1.
(2). Analyzing the images grabbed by the omni-directional camera to obtain parameters that we need.
(3). Locating and finding lane lines in input images.
(4). Analyzing the lane lines to find the vehicle’s moving direction.
(5). Detecting and classifying the objects in the vehicle’s surroundings.
(6). Conducting the task of risk condition recognition.
The vision-based computer-assisted driving system proposed in this study operates in roughly four stages. The first stage is outdoor environment learning. In order to perform the task of computer-assisted driving, the system analyzes images of the outdoor environment and extracts pertinent features for the subsequent stages. The second stage is lane line processing. First, the system locates lane lines in each image grabbed by the camera. Then it finds the vehicle movement by analyzing the lane lines. To locate lane lines, pixels within the lane line regions found in the learning process are classified into two groups, namely, lane line pixels and non-lane-line pixels. Then a line fitting method is employed to find lines that fit the lane line pixels. The movement of the vehicle can then be found by analyzing the features of the lane lines.
The third stage is neighboring object detection. In this stage, the tasks are to detect neighboring objects and classify them. Neighboring objects are detected by computing the image difference between two consecutive images. Then a merging method is employed to merge the pixels in the frame differencing results into objects. At this point, what the obtained objects are is still unknown. We classify these objects into three groups, namely, neighboring vehicles, obstacles on the ground, and non-objects, by properties of the objects such as width, height, and centroid. Then we track these objects in the consecutively grabbed images to detect risk conditions, which is the main task of the fourth stage. Risk conditions can be categorized into two main types: conditions caused by dangerous driving behaviors and those caused by neighboring objects. The conditions of the first type are detected by a finite-state transition automaton in this study. The conditions of the second type are detected from the result of tracking the neighboring objects. When a risk condition is detected, a corresponding warning sound is played to inform the driver of the kind of risk condition detected. A flowchart of the proposed system is shown in Figure 1.2.
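As a minimal illustration of the frame differencing step described above, the following Python sketch marks the pixels that change between two consecutive frames and summarizes them by width, height, and centroid. It is a sketch under stated assumptions, not the implementation used in this study: the use of Python with NumPy, the function names, and the threshold value of 30 are all hypothetical, and all changed pixels are treated as a single object instead of being grouped by the merging method.

```python
import numpy as np

def frame_difference_mask(prev_frame, curr_frame, diff_threshold=30):
    """Return a boolean mask of pixels that changed between two frames.

    Both frames are H x W x 3 uint8 RGB arrays. diff_threshold is a
    hypothetical value; the text does not specify one.
    """
    # Convert to a signed type so the subtraction cannot wrap around.
    prev_gray = prev_frame.astype(np.int32).mean(axis=2)
    curr_gray = curr_frame.astype(np.int32).mean(axis=2)
    return np.abs(curr_gray - prev_gray) > diff_threshold

def summarize_changed_pixels(mask):
    """Return (width, height, centroid) of the changed pixels, or None.

    Simplification: every changed pixel is assumed to belong to one
    object; a real system would first merge pixels into separate objects.
    """
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None
    width = int(xs.max() - xs.min() + 1)
    height = int(ys.max() - ys.min() + 1)
    centroid = (float(xs.mean()), float(ys.mean()))
    return width, height, centroid
```

The width, height, and centroid returned here are the kind of object properties that the classification step could then compare against size limits for vehicles and ground obstacles.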
Figure 1.1 Car roof rack used in this study.
Figure 1.2 Flowchart of four stages of proposed system: learning of outdoor environment, processing of lane lines, detection of neighboring objects, and detection of risk conditions.
1.4 Contributions
The main contributions of this study are summarized in the following:
(1) A computer-assisted driving system using an omni-directional camera in outdoor environment is proposed.
(2) An automatic learning process to extract features of outdoor environment is proposed.
(3) A method of finding and analyzing lane lines by computer vision and image processing techniques is proposed.
(4) A method of detecting and classifying neighboring objects by image sequence processing is proposed.
(5) A method for risk condition checking for driving by a state-transition automaton model is proposed.
(6) Error tolerance techniques to reduce the influence of changing lighting conditions of outdoor environment are proposed.
1.5 Thesis Organization
The remainder of this thesis is organized as follows. The configuration of the system used in this study and introductions to outdoor environment properties are described in Chapter 2. The proposed autonomous learning process is described in Chapter 3. In Chapter 4, the proposed methods for vehicle movement analysis are described. In Chapter 5, the proposed methods for detection and tracking of neighboring objects are described. Risk condition definitions and the proposed detection method are described in Chapter 6. Satisfactory experimental results are shown in Chapter 7. Finally, some conclusions and suggestions for future works are given in Chapter 8.
Chapter 2
System Configuration and Outdoor Driving Environment
2.1 Introduction
In this study, an omni-directional camera is used to grab images of the surroundings of a vehicle. The main advantage of using an omni-directional camera is described in Section 2.2, and the configuration of the proposed system is introduced in Section 2.3.
Images of outdoor environment have some properties that make them difficult to analyze by computer vision and image processing techniques. Section 2.4.1 discusses these properties. Some of them cause effects that degrade the accuracy of image processing results. The main effects are lighting and shadow, which are described in Section 2.4.2. Some techniques are developed in this study to eliminate these effects and improve the correctness of image processing; they are described in Section 2.4.3.
2.2 Advantages of Using Omni-Directional Cameras
An omni-directional camera has a wide field of view, as shown in Figure 2.1: 360 degrees horizontally and 115 degrees vertically. A traditional camera has a narrower angle of view, so multiple traditional cameras must be mounted to obtain a wider view. The main advantage of using an omni-directional camera is that it has a 360-degree horizontal field of view without dead space. In order to analyze the surroundings of a vehicle, images of the front, back, left, and right sides of the vehicle should be grabbed. With traditional cameras, four cameras would have to be mounted to cover the four sides of the vehicle. These four traditional cameras can be replaced by a single omni-directional camera, which is why an omni-directional camera is used in this study. Figure 2.2 shows a panoramic image of a moving car in outdoor environment grabbed by an omni-directional camera on the top of the car.
Figure 2.1 Field of view of an omni-directional camera.
2.3 System Configuration
As shown in Figure 2.3, an omni-directional camera is fixed on the top of a vehicle by a car roof rack, as described in Chapter 1. The car roof rack used in this study is removable, and its width is adjustable from 70 cm to 95 cm, as shown in Figure 2.4, so the omni-camera can be used on vehicles of various sizes. We only need to unlock the controllers at the two ends of the car roof rack; then the width can be extended or shortened manually.
Figure 2.2 A panoramic image of surrounding of a driving car in outdoor environment.
Figure 2.3 The omni-camera used in this study.
Figure 2.4 Adjustable width of the car roof rack used in this study. (a) The minimum width of it (70 cm). (b) The maximum width of it (95 cm).
2.3.1 Hardware Configuration
The hardware configuration of the system used in this study contains two main parts: a vision system and a computation system. The vision system consists of the omni-directional camera mentioned previously. The input images grabbed by the camera have a resolution of 880×880 pixels. However, in this study we reduce the resolution of the input images to 440×440 pixels in order to raise the image processing speed. The frame rate of the camera is 20 frames per second. The computation system is simply a notebook PC with a Dothan 2.0G CPU, 512MB of DDR RAM, and a 5400 rpm 80G HDD. The kernel program is executed on this system to analyze input images and give warnings to drivers.
Images grabbed from the camera are transmitted to the computation system through a USB 2.0 port. The transmission rate of the port can be up to 50Mbps, and so the input images can be transmitted with no delay. The power supply for the camera is directly provided by the notebook PC through the USB port.
2.3.2 Software Configuration
In order to send commands to the camera and retrieve panoramic images, we use the ARTCAM 130MI SDK, which is developed by ARTRAY Company. This API provides a complete command interface. We can grab input images and modify image properties such as resolution, RGB gain, and white balance by sending the corresponding commands to the camera through the API. Developers can use this API as an interface to grab the specific kinds of images they need. We use Visual C++ as the development tool in this study.
2.4 Outdoor Driving Environment
In this section, we will discuss relevant issues about outdoor environment for car driving, including properties of outdoor environment and problems caused by them in image processing. We will discuss these problems and their corresponding solutions.
2.4.1 Properties of Outdoor Environment
Unlike indoor environment, outdoor environment contains some features that change constantly. For example, the lighting condition of outdoor environment changes as time goes on. Similarly, the scene also changes while the vehicle is moving. These features are normal in outdoor environment, but they cause serious problems for the results of image processing. In order to analyze the images of outdoor environment taken during driving, some features such as the lane line regions, the lighting condition, and the center of the vehicle in the images need to be found first. Figure 2.5 shows an illustration of the above-mentioned important features of an image of outdoor environment taken during driving. The main problems caused by outdoor conditions are lighting and shadow effects, as described in the following section.
Figure 2.5 An illustration of important features of an image of outdoor environment (labels in the figure: lane line regions, vehicle region, vehicle center, vehicle, and neighboring objects).
2.4.2 Lighting and Shadow Effects
The lighting and shadow effects of outdoor environment have a great influence on the results of image processing. Lighting causes the magnitudes of the RGB values of the pixels in an image to increase abruptly, and shadow causes them to decrease. These two kinds of changes make it difficult to use a global threshold for image thresholding. Besides, lighting also causes the colors of pixels to change unpredictably. The color of a pixel means the proportions of its RGB components. If the proportions of the RGB values change, the color of the pixel also changes. Figure 2.6(a) shows the original color of a lane line and Figure 2.6(b) shows the color of the lane line affected by lighting. In this example, the R, G, and B values of the lane line shown in Figure 2.6(a) are nearly the same. But in Figure 2.6(b), the color of the lane line is affected by lighting and the R value is higher than the G and B values. This phenomenon causes serious problems in the lane line processing stage. Thus some techniques are developed in this study to reduce the effects of lighting and shadow, as described next.
Figure 2.6 An example of lighting effect. (a) The original color of a lane line, RGB = (253, 248, 247). (b) The color of the lane line affected by lighting, RGB = (254, 206, 202).
2.4.3 Elimination of Lighting and Shadow Effects
In order to eliminate lighting and shadow effects, two problems described in the previous section need to be solved. The two problems and their corresponding solutions are discussed as follows:
Problem (1). The intensity of RGB values of each pixel in the input image varies because of lighting and shadow effects. This will cause erroneous results in lane line processing.
We cannot use a global threshold for image thresholding because each region in the input image has a different lighting condition. For example, the left and right lane line regions may have totally different lighting conditions. To solve this problem, for each input image we compute separate local thresholds for the left and right lane line regions. These local thresholds are then used in lane line processing to avoid erroneous image thresholding results.
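The benefit of local thresholds can be illustrated with a short Python sketch. How the local thresholds are actually computed is not specified in this section, so the sketch assumes, purely for illustration, that the threshold of a region is the mean intensity of the pixels in that region; the function names and the region format (x0, y0, width, height) are likewise hypothetical.

```python
import numpy as np

def region_intensity(image, region):
    """Per-pixel intensity (mean of R, G, B) inside a rectangular region.

    region = (x0, y0, width, height) in pixel coordinates (assumed format).
    """
    x0, y0, w, h = region
    patch = image[y0:y0 + h, x0:x0 + w].astype(np.float64)
    return patch.mean(axis=2)

def local_threshold(image, region):
    """Assumed rule: the local threshold is the region's mean intensity."""
    return float(region_intensity(image, region).mean())

def threshold_region(image, region):
    """Mark pixels brighter than the region's own local threshold."""
    return region_intensity(image, region) > local_threshold(image, region)
```

With this rule, a bright lane line in a shadowed left region and one in a sunlit right region are each compared against the brightness of their own surroundings, so neither region is thresholded with a value tuned for the other.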
Problem (2). The proportions of RGB values change as mentioned before.
The idea of white-balancing is used here as a solution. Before we start analyzing input images, we need to find what changes have been applied to the proportions of the RGB values of the pixels because of the lighting effect. Then we use the predefined white color with identical R, G, and B values as a reference to restore the changed white color to be the original one. In this way, errors in lane line processing can be reduced.
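The white-balancing idea can be sketched in Python using the lane line color of Figure 2.6(b), RGB = (254, 206, 202), as the observed white. This is a minimal sketch under assumptions, not the procedure used in this study: the per-channel gain model, the reference level of 250, and the function names are all hypothetical.

```python
import numpy as np

def rgb_proportions(rgb):
    """Proportions of the R, G, and B components (they sum to 1)."""
    rgb = np.asarray(rgb, dtype=np.float64)
    return rgb / rgb.sum()

def white_balance_gains(observed_white, reference_level=250.0):
    """Per-channel gains that map the observed white to a neutral white.

    reference_level is a hypothetical value for the predefined white,
    whose R, G, and B values are identical. Channels must be non-zero.
    """
    observed = np.asarray(observed_white, dtype=np.float64)
    return reference_level / observed

def apply_gains(image, gains):
    """Scale each channel by its gain and clip to the valid 0..255 range."""
    balanced = image.astype(np.float64) * gains
    return np.clip(np.rint(balanced), 0, 255).astype(np.uint8)
```

Applying the gains computed from (254, 206, 202) to a pixel of that same color yields identical R, G, and B values, that is, equal proportions, which is the restored white used as the reference in lane line processing.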
Chapter 3
Proposed Learning Principle and Process
3.1 Introduction
In order to perform the task of computer-assisted driving in outdoor environment, some features of input images must be extracted first for the following two processes: vehicle movement analysis, and detection and tracking of neighboring vehicles. These features can be categorized into three types. The first type includes the features for vehicle location. More specifically, we want to find the center of the vehicle and the region of the vehicle in each input image. These methods are described in Section 3.3.1. The second type includes the features for vehicle movement analysis. We define the lane line regions for lane line detection and find the color of lane lines. These works are described in Section 3.3.2. The third type includes the features for detection and tracking of neighboring objects. We define the warning region around a car and check the left and right regions for detection and tracking of neighboring objects. These works are described in Section 3.3.3.
The results of the two processes, vehicle movement analysis and detection and tracking of neighboring vehicles, are easily affected by changes of lighting conditions. We therefore need to extract not only the features for these processes but also the features for elimination of the lighting effect. In Section 3.3.4, the problem caused by lighting conditions in outdoor environment is discussed. The detailed learning process and the feature extraction algorithm are described in the following sections.
3.2 Proposed Learning Principle and Process
In order to extract features as parameters for image processing, a learning strategy is proposed, whose principle is described here. The input image for learning should have two properties. First, the input image should be clear enough. Second, the scene of the image should not be too complicated, because input images with complicated scenes may cause the learning process to yield erroneous results. In this study, we assume that the first input image is suitable for learning, and we denote it as Ilearn. Three main processes are included in the learning stage, as described above. The first process includes two main steps: the first is to find the vehicle center Cvehicle in Ilearn, and the second is to define the vehicle region Rvehicle in Ilearn. The second process includes two main steps: the first is to define the lane line regions Rllane and Rrlane, and the second is to find the lane line colors of the two regions. The third process includes two main steps: the first is to find the warning region Rwarn in Ilearn for detection and tracking of the neighboring objects, and the second is to check whether each side of the vehicle is a lane or not. Figure 3.1 shows a flowchart of the learning process. The detailed processes and algorithms of each step are described in the following sections.
Figure 3.1 Flowchart of learning process.
3.2.1 Learning of Features for Vehicle Location
In this section, we use image processing techniques to find the features of the vehicle in Ilearn. To find the region of the vehicle Rvehicle, the center of the vehicle Cvehicle needs to be found first.
The main idea of this algorithm is to locate the vehicle center by the black circle as shown in Figure 3.2(a). Because the omni-directional camera is installed at the center of the vehicle, we can find the center of the vehicle by finding the center of the black circle. The center of the black circle can be found by locating the pixels within the black circle and computing the mean positions of the pixels. Then we can find the center of the vehicle Cvehicle. The algorithm is described as follows.
Algorithm 3.1. Computation of the vehicle center in the input image by image processing techniques.
Input: (1) an input image Ilearn; (2) a square region Ra, 80 pixels in width and 80 pixels in height, centered at the position (220, 220), which is the center of the input image, as shown in Figure 3.2(b).
Output: the center of the vehicle Cvehicle.
Steps:
Step 1 For all the pixels in Ilearn, compute the intensity, which is the mean of the R, G, and B values of each pixel.
Step 2 Compute the mean of the intensities of all the pixels in Ilearn to get the mean intensity Ta.
Step 3 Use Ta as a threshold to perform thresholding in region Ra. If the intensity of the current pixel is smaller than Ta, then classify this pixel as a circle pixel; otherwise ignore this pixel.
Step 4 Compute the mean position of the circle pixels found in Step 3 to get the center of the vehicle Cvehicle, as shown in Figure 3.2(c).
In Step 3, we use white color to mark the circle pixels and use black color to mark the ignored pixels, as shown in Figure 3.2(c).
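The four steps of Algorithm 3.1 can be sketched in Python as follows. The sketch follows the steps as stated above; the use of NumPy, the function name, and the parameter names are assumptions made for illustration and are not part of the original implementation, which was developed in Visual C++.

```python
import numpy as np

def find_vehicle_center(image, region_center=(220, 220), region_size=80):
    """Sketch of Algorithm 3.1: locate the vehicle center Cvehicle.

    image is an H x W x 3 RGB array (Ilearn). The square region Ra is
    region_size pixels wide and high, centered at region_center = (x, y).
    Returns (x, y) as floats, or None if no circle pixel is found.
    """
    # Step 1: intensity of each pixel = mean of its R, G, and B values.
    intensity = image.astype(np.float64).mean(axis=2)
    # Step 2: mean intensity Ta over the whole image.
    t_a = intensity.mean()
    # Step 3: inside Ra, pixels darker than Ta are circle pixels.
    cx, cy = region_center
    x0 = cx - region_size // 2
    y0 = cy - region_size // 2
    patch = intensity[y0:y0 + region_size, x0:x0 + region_size]
    ys, xs = np.nonzero(patch < t_a)
    if xs.size == 0:
        return None
    # Step 4: the mean position of the circle pixels gives Cvehicle.
    return float(x0 + xs.mean()), float(y0 + ys.mean())
```

Because Ta is computed over the whole image while the thresholding is restricted to Ra, dark pixels outside the predefined region, such as shadows on the road, do not affect the estimated center.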
Figure 3.2 An example of learning features for vehicle location. (a) Original image Ilearn. (b) Predefined region Ra. (c) Result of finding the center of the vehicle Cvehicle. (d) Result of finding the region of the vehicle Rvehicle. (Labels in the figure: the region Ra of 80 × 80 pixels centered at (220, 220), the black circle, Cvehicle, "after thresholding", 110 pixels, and 133 pixels.)
With Cvehicle found in the above algorithm, the subsequent task is to find the region of the vehicle Rvehicle in the input image. We use a rectangular region Rvehicle centered at Cvehicle to fit the vehicle, and the size of the rectangular region varies with the size of the vehicle. In this study, the rectangle is taken to be 220 pixels in width and 266 pixels in height. Figure 3.2(d) shows an example of finding the region of the vehicle in the input image.
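As a small worked example, the corners of Rvehicle follow directly from Cvehicle and the rectangle size given above. The Python sketch below is illustrative only; the function names and the (left, top, right, bottom) corner format are assumptions.

```python
def vehicle_region(center, width=220, height=266):
    """Return (left, top, right, bottom) of Rvehicle centered at Cvehicle.

    center is (x, y) in pixels. The default size is the 220 x 266 pixel
    rectangle used in this study; the corner format is an assumption.
    """
    cx, cy = center
    left = int(round(cx - width / 2))
    top = int(round(cy - height / 2))
    return left, top, left + width, top + height

def in_region(region, point):
    """True if the point (x, y) lies inside the region (right/bottom exclusive)."""
    left, top, right, bottom = region
    x, y = point
    return left <= x < right and top <= y < bottom
```

For a vehicle center at the image center (220, 220), the region spans 110 pixels to each side horizontally and 133 pixels vertically, that is, from (110, 87) to (330, 353).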
3.2.2 Learning of Features for Vehicle Movement Analysis
In this section, we describe how to find the features for vehicle movement analysis, including the left lane line region Rllane used for left lane line detection and the right lane line region Rrlane used for right lane line detection. These regions can be defined from Cvehicle and Rvehicle. The lane line regions are 27 pixels in width and 27 pixels in height. They are symmetric about Cvehicle. The left lane line region is 15 pixels to the left of Cvehicle and the right lane line region is 15 pixels to the right of Cvehicle. Figure 3.3 shows an example of finding Rllane and Rrlane in the input image.