
Chapter 2 Related Work

2.3 Behavior Analysis

Once the moving objects have been tracked successfully through the image sequences, the main remaining work is to understand the objects' behavior and to extract useful traffic parameters. Behavior understanding involves analyzing and recognizing the objects' motion and producing high-level descriptions of actions and interactions. The summarized information is then presented through a user interface or other output methods.

Traffic information is also an important tool in the planning, maintenance, and control of any modern transport system. Traffic engineers are interested in traffic-flow parameters such as volume, speed, vehicle type, and traffic movements at junctions. Fathy [29] presented a novel approach that applies edge-detection techniques to key regions or windows to measure traffic parameters such as traffic volume and vehicle type. Jung et al. [30] proposed a traffic flow extraction method based on the velocity and trajectory of the moving vehicles. They estimated traffic parameters such as the vehicle count and the average speed, and extracted the traffic flows. Kumar et al. [31] proposed target classification in traffic videos using Bayesian networks (BNs). Using the tracking and classification results, they obtained world-coordinate estimates of target position and velocity. A priori knowledge of context and predefined scenarios was used for behavior recognition. Haag and Nagel [32] proposed a system for incremental recognition of traffic situations. They used fuzzy metric temporal logic (FMTL) as an inference tool to handle the uncertainty and temporal aspects of action recognition. In their system, all actions were modeled using predefined situation trees. Remagnino et al. [33] presented an event-based visual surveillance system for monitoring vehicles and pedestrians that supplies word descriptions for dynamic activities in 3-D scenes. In [34], an approach was presented for interpreting dynamic object interactions in temporal image sequences using fuzzy sets and measures. A multidimensional filter-based tracking algorithm was used to track and classify moving objects. Uncertainties in the assignment of trajectories and in the descriptions of objects were handled by fuzzy logic and fuzzy measures.

Recently, traffic incident detection employing computer vision and image processing has attracted much attention. Ikeda et al. [35] outlined an automatic abnormal incident detection system based on image-processing technology. This system was used to detect four types of incidents: stopped vehicles, slow vehicles, fallen objects, and vehicles that attempted lane changes. Trivedi et al. [36] described a novel architecture for developing distributed video networks for incident detection and management. The networks used both rectilinear and omni-directional cameras. Kamijo et al. [37] developed an accident detection method based on tracking results that can be generally adapted to intersections. The accident detection algorithm used a simple left-to-right HMM. Lin et al. [38] proposed an image tracking module with active contour models and Kalman filtering techniques to perform the vehicle tracking. The system provided three types of traffic information: the velocity of multi-lane vehicles, the number of vehicles, and car accident detection. Veeraraghavan et al. [23] presented a visualization module. This module was useful for visualizing the results of the tracker and served as a platform for the incident detection module. Hu et al. [39] proposed a probabilistic model for predicting traffic accidents using three-dimensional (3-D) model-based vehicle tracking. Vehicle activity was predicted by locating and matching each partial trajectory with the learned activity patterns, and the occurrence probability of a traffic accident was determined.

We propose a framework for accident prediction based on object properties such as velocity, size, and position. In addition, given some preset information, our system can also classify objects accurately. The resulting information is presented in a GUI module in a form that is easy to understand.

Chapter 3 Multi-objects Tracking System with Adaptive Background Reconstruction

In this chapter, we present our system structure and the details of the proposed algorithms. The system is composed of four sub-systems: foreground segmentation, objects extraction, objects tracking, and behavior analysis. In Section 3.1, we use a diagram of the global system to show the four sub-systems and their key modules. In Section 3.2, we present the framework of foreground segmentation and the adaptive background reconstruction technique. In Section 3.3, we present the approach and algorithms of objects extraction. In Section 3.4, we present the framework of objects tracking and its relevant algorithms. In Section 3.5, we present the behavior analysis module and its analysis algorithms.

3.1 System Overview

First, the foreground segmentation module takes the raw surveillance video data directly as input. This sub-system also updates the background image and applies a segmentation algorithm to extract the foreground image. Next, the foreground image is processed with morphological operations and the connected components method to extract individual objects. At the same time, object-based features are extracted from the image of extracted objects.

The main work of the third sub-system is to track objects. The tracking algorithm feeds significant object features into an analysis process to find the optimal matching between previous objects and current objects. Occlusion and other interactions between moving objects are also handled in this sub-system. Once the moving objects are tracked successfully, consistent labels are assigned to the correct objects.

Finally, object behavior is analyzed and recognized. Useful traffic parameters are extracted and shown in the user interface. The diagram of the global system is shown in Fig. 1.

3.2 Foreground Segmentation

The purpose of the first sub-system is to extract the foreground image. At first, we input the raw data of the surveillance video [...] The main processes of this sub-system are foreground segmentation and background reconstruction. In regard to segmentation, there are three basic techniques: 1) frame differencing, 2) background subtraction, and 3) optical flow. Frame differencing easily produces small regions that are difficult to separate from noise when the objects are not sufficiently textured. Optical flow is computationally intensive and difficult to realize in real time. In [10], a probabilistic approach to segmentation was presented, in which the expectation maximization (EM) method was used to classify each pixel as moving object, shadow, or background. In [31], Kumar proposed a background subtraction technique to segment the moving objects from image sequences, with the background pixels modeled by a single Gaussian distribution. In [17], Gupte used a self-adaptive background subtraction method for segmentation.

[Fig. 1 Global system diagram. Sub-systems: Foreground Segmentation, Objects Extraction, Objects Tracking, Behavior Analysis. Module labels: Frames, Color Channels, Adaptive B/G Updating, Adaptive Threshold, Mask Images, Foreground, Pre-Processing, Connected Components, Object Features, Objects Lists, Region Matching, Matching Analysis, Modified Bounding Box, Objects Classification, Accident Prediction, Traffic Parameters Extraction.]

In most surveillance conditions, the video camera is fixed and the background can be regarded as a stationary image, so background subtraction is the simplest way to segment moving objects. For this reason, we adopt it as the basis of our segmentation algorithm. In addition, the results of frame differencing and the state of previously extracted objects are used to make the segmentation more reliable. The process of foreground segmentation and background reconstruction is shown in Fig. 2.

3.2.1 Background Initialization

Before segmenting the foreground from the image sequences, the system needs to construct an initial background image for further processing. The basic idea behind finding a background pixel is that the background appears with high probability: over a continuous stretch of surveillance video, the level that appears most frequently at a pixel is almost always its background level. Several approaches build the background image on this concept, such as the classify-and-cluster method and the Least-Median-Squares (LMedS) method. We use a simpler method: if a pixel's value stays within a criterion for several consecutive frames, the probability of this value appearing is locally high, that is, the value is locally stable. This value is regarded as the background value of the pixel. The pixel value in the background buffer is then copied to the corresponding pixel in the initial background image.

[Fig. 2 The process diagram of foreground segmentation. Block labels: Current Frame, Previous Frame, Frame Diff. Image, Objects Life Mask, Background Reconstruction, Adaptive Updating, Current B/G Image, B/G Temp. Image, B/G Diff. Image, Objects Filter, Foreground Image.]

This method can build an initial background image automatically even though there are [...] The establishing equations are Eqs. (1), (2), and (3). In these equations, the superscript C denotes the different color channels. In our segmentation algorithm, we use the R, G, and intensity channels for the background subtraction process. Hit(i,j) records the number of times the same value has appeared consecutively at pixel (i,j), and Th_appear is the threshold on the number of appearances. σ_BG is a preset variance used as the criterion for checking whether the current value is the same as the value in the buffer image.
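Since Eqs. (1) to (3) are not reproduced here, the following is only a minimal sketch of the initialization idea described above, for a single channel. The function name `init_background`, the default values of `sigma_bg` and `th_appear`, and the choice to restart the buffer with the new value when the criterion fails are all assumptions made for illustration, not the exact equations of this work.

```python
import numpy as np

def init_background(frames, sigma_bg=10.0, th_appear=15):
    """Sketch of hit-count background initialization for one channel.

    frames: iterable of 2-D arrays. Returns (background, done), where
    done marks the pixels whose background value has been established.
    """
    frames = iter(frames)
    buffer = next(frames).astype(np.float64)     # candidate value per pixel
    hit = np.ones(buffer.shape, dtype=np.int32)  # consecutive appearances
    background = np.zeros_like(buffer)
    done = np.zeros(buffer.shape, dtype=bool)
    for frame in frames:
        frame = frame.astype(np.float64)
        same = np.abs(frame - buffer) < sigma_bg  # value still within criterion
        hit = np.where(same, hit + 1, 1)          # restart the count on a change
        buffer = np.where(same, buffer, frame)    # changed pixels get a new candidate
        stable = (hit >= th_appear) & ~done       # locally stable for long enough
        background[stable] = buffer[stable]       # copy buffer into the background
        done |= stable
    return background, done
```

In the full system the same test would be applied per color channel (R, G, and intensity); the sketch keeps one channel to stay short.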

3.2.2 Adaptive Background Updating

We introduce an adaptive threshold for foreground segmentation. The adaptive threshold has two parts: a basic value and an adaptive value. The threshold is produced by Eq. (4). The two statistics (Peak_local, STDEV_local) are calculated over the specific scope shown in Fig. 3. This adaptive threshold helps the background updating algorithm cope with environmental changes and noise effects.

$$Th_{FG} = Value_{basic} + Peak_{local} + 1.5 \times STDEV_{local} \qquad (4)$$

In outdoor environments, several situations easily cause wrong segmentation, including waving tree leaves and gradual light variation. Sudden light changes also occur when clouds cover the sun for a while or when the sun emerges from the clouds. We propose an adaptive background updating framework to cope with these unfavorable situations. First, we introduce a statistical index calculated by Eq. (5). The mean value and standard deviation in Eq. (5) are calculated over the local scope shown in Fig. 3.

$$Index = Mean_{local} + 3 \times STDEV_{local} \qquad (5)$$

According to this index, we adaptively adjust the frequency of updating the current background image; the updating frequency is divided into several phases. The background updating speed increases or decreases with the updating frequency. The final phase is an extra heavy phase designed for severe, sudden changes; in this phase, the background image is updated directly. The phases and their relevant parameters are listed in Tab. 1.

[Fig. 3 The calculating scope for the index of adaptive threshold. Labels: Histogram of Background Subtraction Image, Previous Th_FG, Local Calculating Scope.]

Tab. 1 Phases of adaptive background updating

Phase         Condition           Sampling rate     Freq in Eq. (8)
Normal        Index < 12          1/30              30
Middle I      12 ≤ Index < 18     1/24              24
Middle II     18 ≤ Index < 24     1/16              16
Heavy I       24 ≤ Index < 30     1/8               8
Heavy II      30 ≤ Index < 36     1/4               4
Extra Heavy   36 ≤ Index          Directly update   N/A
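The phase selection in Tab. 1 can be sketched as a simple lookup. In this illustrative sketch, the function names `change_index` and `select_phase` are hypothetical, `local_values` is assumed to be the set of background-subtraction values that fall inside the local calculating scope, and the population standard deviation is assumed for Eq. (5).

```python
import numpy as np

# (exclusive upper bound on Index, phase name, Freq), taken from Tab. 1
PHASES = [
    (12, "Normal", 30),
    (18, "Middle I", 24),
    (24, "Middle II", 16),
    (30, "Heavy I", 8),
    (36, "Heavy II", 4),
]

def change_index(local_values):
    """Eq. (5): Index = Mean_local + 3 * STDEV_local over the local scope."""
    values = np.asarray(local_values, dtype=np.float64)
    return float(values.mean() + 3.0 * values.std())

def select_phase(index):
    """Return (phase name, Freq); Freq is None when the update is direct."""
    for upper, name, freq in PHASES:
        if index < upper:
            return name, freq
    return "Extra Heavy", None
```

For example, `select_phase(20.0)` falls in the Middle II row, so the background would be updated once every 16 frames.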

In the reconstruction process, the temporary background image is the current frame image filtered by a background mask image. The background mask image is updated from the frame differencing image and the objects life mask image; its updating equation is Eq. (6). The current background image is then updated from itself and the temporary background image by Eqs. (7) and (8). The parameter α is a weighting factor, and Th_Diff is a threshold for frame differencing. The parameter Freq is the updating frequency produced by the adaptive background updating algorithm.

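[Eqs. (6) to (8) are not legible in the source. The surviving fragments name Mask_BG, the frame differencing image, and the objects life mask M_life.]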
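Because the update equations themselves are not available, the sketch below shows only one plausible reading of the reconstruction step described above. Everything specific in it is an assumption: the function name `update_background`, the mask rule (a pixel is treated as background when its frame difference is below Th_Diff and no tracked object covers it), the weighting of the temporary image by α, and the use of a frame counter modulo Freq.

```python
import numpy as np

def update_background(bg, frame, prev_frame, life_mask, frame_idx,
                      freq, alpha=0.1, th_diff=15.0):
    """Hypothetical masked running-average update of the background image."""
    bg = bg.astype(np.float64)
    frame = frame.astype(np.float64)
    if freq is None:                      # extra heavy phase: update directly
        return frame.copy()
    if frame_idx % freq != 0:             # update only once every `freq` frames
        return bg
    frame_diff = np.abs(frame - prev_frame.astype(np.float64))
    # Assumed mask rule: static pixel that is not covered by a living object
    mask_bg = (frame_diff < th_diff) & (life_mask <= 0)
    bg_temp = np.where(mask_bg, frame, bg)    # temporary background image
    return alpha * bg_temp + (1.0 - alpha) * bg
```

With a smaller Freq the blend is applied more often, so the background follows scene changes faster, which matches the role of the heavier phases.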

3.2.3 Background Subtraction

As mentioned in Section 3.2, we use the background subtraction method to segment the foreground image. The subtraction is performed on the R, G, and intensity channels, where the intensity channel is calculated by Eq. (9). These channels were chosen because of the computational load, the blue channel's sensitivity to shadow, and the characteristics of traffic intersections. The background subtraction image is then obtained by directly combining the three channels' subtraction results, as shown in Eq. (10).

Next, the background subtraction image is filtered by a foreground mask image. This mask image consists of the previously extracted objects with their life information and the temporal frame differencing image. The temporal frame differencing uses only the intensity channel and is shown in Eq. (11). The objects life mask image is based on how long each object has appeared: the value of an object's life feature is assigned to the pixels that belong to that object. The foreground mask is obtained by Eq. (12). This mask filters out noise and ghost regions that we do not want to extract. As shown in Eq. (13), applying the foreground mask to the background subtraction image gives the foreground image, which is the output of this sub-system.

[Eqs. (9) to (13) are not legible in the source. The surviving fragments name Mask_FG and the frame differencing image.]

3.3 Objects Extraction

In this sub-system, we use the connected components algorithm to extract each object and assign it a specific label so that the system can distinguish different objects easily. Before running the connected components algorithm, we apply a morphological operation to improve the robustness of object extraction. The result of the connected components algorithm is the labeled objects image. The process of this sub-system is shown in Fig. 4.

[Fig. 4 Diagram of object extraction process. Block labels: Foreground Image, Close Operation (twice), Connected Components, Size Filter, Labeled Objects Image.]

Then we build a current objects list with basic features such as position, size, and color from the labeled objects image. We use a spatial filter and a B/G check filter to remove ghost objects and objects at the boundaries. We also calculate the overlap area between the current objects list and the previous objects list. If the overlap area is larger than the threshold, a region relation is established. The diagram of this process is shown in Fig. 5. The current objects list and the region relation list are passed to the tracking module for further processing.

[Fig. 5 Diagram of establishing objects list. Block labels: Labeled Objects Image, Spatial Filter, B/G Check Filter, Current Objects List, Previous Objects List, Overlap Area > Th_overlap (Yes/No), Overlap Relation List, Next Object.]

3.3.1 Pre-Processing

Before running the connected components algorithm, we apply some pre-processing to smooth the contours of objects and remove noise. Our algorithm applies the closing operation twice, which helps fill the holes inside the object regions. The morphological closing operation consists of a dilation and an erosion, and the order in which the two are performed matters: dilation followed by erosion is the closing operation, whereas erosion followed by dilation is the opening operation. After the closing operation, we apply the adaptive threshold of foreground segmentation to the resulting images and then extract moving objects with the connected components algorithm.
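The difference between closing and opening can be seen in a small sketch on a binary image. This is an illustration only: the 3×3 square structuring element, the border handling, and the function names are assumptions, since the structuring element used in the system is not stated here.

```python
import numpy as np

def _neighborhoods(img, pad_value):
    """Stack the nine 3x3-shifted copies of a boolean image."""
    padded = np.pad(img, 1, mode="constant", constant_values=pad_value)
    h, w = img.shape
    return np.stack([padded[r:r + h, c:c + w]
                     for r in range(3) for c in range(3)])

def dilate(img):
    # A pixel is set if any pixel in its 3x3 neighborhood is set
    return _neighborhoods(img, False).any(axis=0)

def erode(img):
    # A pixel survives only if its whole 3x3 neighborhood is set
    # (outside the image counts as set, so borders are not eroded)
    return _neighborhoods(img, True).all(axis=0)

def close(img):
    return erode(dilate(img))   # dilation first, then erosion

def open_(img):
    return dilate(erode(img))   # erosion first, then dilation

# A 3x3 object with a one-pixel hole in its center
obj = np.zeros((7, 7), dtype=bool)
obj[2:5, 2:5] = True
obj[3, 3] = False
filled = close(close(obj))      # closing applied twice, as in the text
```

On this example the closing fills the hole and leaves the object outline unchanged, while the opening would remove the object entirely because no pixel has a fully set neighborhood.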

3.3.2 Connected Components Algorithm

Each object in the foreground image must be extracted and assigned a specific label for further processing. The connected components algorithm [40], [41] is frequently used for this task. Connectivity is a key parameter of this algorithm. The available connectivities are 4 and 8 for 2-D applications, and 6, 10, 18, and 26 for 3-D applications.

We use 8-connectivity in our implementation. The connected components algorithm scans an image pixel by pixel (from top to bottom and left to right) to identify connected pixel regions. The operator moves along a row until it reaches a point p whose value is larger than the preset extraction threshold. It then examines, according to the connectivity, those neighbors of p that have already been encountered in the scan. Based on this information, p is labeled as follows. If all the neighbors are zero, the algorithm assigns a new label to p. If only one neighbor has been labeled, the algorithm assigns that label to p. If more than one neighbor has been labeled, it assigns one of the labels to p and makes a note of the equivalences. After the scan is complete, the equivalent label pairs are sorted into equivalence classes and a unique label is assigned to each class. As a final step, a second scan is made through the image, during which each label is replaced by the label assigned to its equivalence class. Once all groups have been determined, each pixel is labeled with a gray level or a color (color labeling) according to the component it was assigned to.

Next, we use a predefined object-size threshold to filter out larger noise regions and ghost regions. After applying the size filter, we obtain the labeled objects image. In this image, each gray level represents a different object, so the pixels with the same gray level can be gathered to form the region of a specific object.
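The two-pass labeling and the size filter described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `label_components`, the union-find bookkeeping for equivalences, and a size filter that keeps components of at least `min_size` pixels are choices made for this sketch and are not taken from [40], [41].

```python
import numpy as np

def label_components(img, threshold=0, min_size=1):
    """Two-pass 8-connected labeling; returns (label image, object count)."""
    h, w = img.shape
    labels = np.zeros((h, w), dtype=np.int32)
    parent = [0]                      # parent[k]: union-find parent of label k

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]
            k = parent[k]
        return k

    # First pass: assign provisional labels and note equivalences
    for i in range(h):
        for j in range(w):
            if img[i, j] <= threshold:
                continue
            neigh = []
            # Neighbors already visited in the scan: W, NW, N, NE
            for di, dj in ((0, -1), (-1, -1), (-1, 0), (-1, 1)):
                ni, nj = i + di, j + dj
                if 0 <= ni and 0 <= nj < w and labels[ni, nj] > 0:
                    neigh.append(int(labels[ni, nj]))
            if not neigh:
                parent.append(len(parent))        # new label
                labels[i, j] = len(parent) - 1
            else:
                roots = [find(k) for k in neigh]
                root = min(roots)
                labels[i, j] = root
                for r in roots:                   # record the equivalences
                    parent[r] = root

    # Second pass: replace each label by its class representative
    sizes = {}
    for i in range(h):
        for j in range(w):
            if labels[i, j]:
                root = find(int(labels[i, j]))
                labels[i, j] = root
                sizes[root] = sizes.get(root, 0) + 1

    # Size filter, then consecutive relabeling of the surviving objects
    out = np.zeros_like(labels)
    count = 0
    for root in sorted(sizes):
        if sizes[root] >= min_size:
            count += 1
            out[labels == root] = count
    return out, count
```

A U-shaped region is a useful check, because its two arms receive different provisional labels in the first pass and are merged only when the scan reaches the row that joins them.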

3.3.3 Objects List

When building the objects list, we apply two filters to remove unwanted objects. First, the spatial filter removes objects that lie within a preset distance of the boundaries. This avoids the tracking confusion caused by objects that appear only partially at the boundaries when they are just leaving or entering the field of view (FoV). The filtering area of this filter can be extended into a scene mask that restricts the effective region of the FoV. Second, the B/G check filter is a combination of three Sobel operations. The first Sobel operation finds the edges of each object. The second Sobel operation is performed on the current frame at all the edge pixels obtained from the first Sobel operation. The third Sobel operation is performed on the background image at the same edge pixels. We mark a pixel if the value of its third Sobel operation is larger than the value of its second Sobel operation. Finally, we apply a preset threshold to the ratio of marked pixels to all edge pixels to judge whether the object is a background ghost object.
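The B/G check filter can be sketched as below. The sketch makes several assumptions that are not stated in the text: the Sobel response is taken as |Gx| + |Gy|, the first Sobel operation is applied to a binary mask of the object, an object is judged a ghost when the marked ratio exceeds `ratio_threshold`, and the names `sobel_magnitude` and `is_ghost` are hypothetical.

```python
import numpy as np

def sobel_magnitude(img):
    """|Gx| + |Gy| of the 3x3 Sobel operator, with edge-replicated borders."""
    p = np.pad(np.asarray(img, dtype=np.float64), 1, mode="edge")
    gx = (p[:-2, 2:] + 2 * p[1:-1, 2:] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[1:-1, :-2] - p[2:, :-2])
    gy = (p[2:, :-2] + 2 * p[2:, 1:-1] + p[2:, 2:]
          - p[:-2, :-2] - 2 * p[:-2, 1:-1] - p[:-2, 2:])
    return np.abs(gx) + np.abs(gy)

def is_ghost(object_mask, frame, background, ratio_threshold=0.5):
    """Judge whether an extracted object is a background ghost."""
    edges = sobel_magnitude(object_mask) > 0      # first Sobel: object edges
    n_edges = int(edges.sum())
    if n_edges == 0:
        return False
    frame_grad = sobel_magnitude(frame)           # second Sobel: current frame
    bg_grad = sobel_magnitude(background)         # third Sobel: background image
    marked = (bg_grad > frame_grad) & edges       # background edge is stronger
    return bool(marked.sum() / n_edges > ratio_threshold)
```

The intuition is that a real object produces edges in the current frame, whereas a ghost (for example, a vehicle that was absorbed into the background and then drove away) leaves its edges in the background image only.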

When establishing the objects list, we also extract three basic categories of features: central position, bounding box size, and YCbCr color information. At the same time, we calculate the overlap area between each current object and each previous object based on the estimated position. Using the bounding box size and the central position as input data, we calculate the size of the overlap area with the simple method shown in Fig. 6. We then calculate the ratio of the overlap area to the smaller of the two objects' areas by Eq. (14).

$$Ratio_{overlap} = \frac{Area_{overlap}}{\min(Area_{current\_obj},\ Area_{previous\_obj})} \qquad (14)$$

If the overlap ratio is larger than a preset threshold, a relation between the current object and the previous object is established. This overlap relation list is an important reference for the objects tracking sub-system.
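Eq. (14) can be illustrated with a short sketch that takes the central position and bounding box size of two objects. The function name `overlap_ratio` and the axis-aligned intersection computation are assumptions for illustration; the method of Fig. 6 is not reproduced here and may differ in detail.

```python
def overlap_ratio(center_a, size_a, center_b, size_b):
    """Overlap area divided by the smaller of the two bounding box areas."""
    (ax, ay), (aw, ah) = center_a, size_a
    (bx, by), (bw, bh) = center_b, size_b
    # Width and height of the intersection of the two axis-aligned boxes
    ox = min(ax + aw / 2, bx + bw / 2) - max(ax - aw / 2, bx - bw / 2)
    oy = min(ay + ah / 2, by + bh / 2) - max(ay - ah / 2, by - bh / 2)
    if ox <= 0 or oy <= 0:
        return 0.0                      # the boxes do not overlap
    return (ox * oy) / min(aw * ah, bw * bh)

# Two 10x10 boxes offset by 5 pixels in each direction overlap in a 5x5 area
ratio = overlap_ratio((5, 5), (10, 10), (10, 10), (10, 10))   # 25 / 100
```

Dividing by the smaller area means that a small object fully covered by a larger one yields a ratio of 1, which keeps the relation when an object is merged into a bigger region.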

3.4 Objects Tracking

This sub-system is the main process of the entire system because it performs the object tracking function. The inputs of this module are three lists: the current objects list, the previous objects list, and the overlap relation list. This sub-system analyzes the relations between current objects and previous objects and obtains other object properties such as velocity, life, and trajectory. The tracking framework can be divided into several modules; we present each module and introduce its algorithm. The diagram of the tracking process is shown in Fig. 7.

The overlap relation list is simplified by some constraints and rules. In [42], Masoud used an undirected bipartite graph to represent relations among objects and apply a
