
In recent years, video cameras have been installed everywhere in our daily lives because the devices are affordable and easy to install. People can record a massive number of events passing through a scene by storing the data as video. However, most videos contain a great deal of redundancy. What we care about is only the small subsets or portions of the videos that carry semantic meaning. For example, in a traffic surveillance system, we are interested in the traffic density of a road section [1] or whether an incident has happened [2], so that we can select another path to avoid those sections. For cameras set on streets, we would like to know how many pedestrians have passed [3] or whether there is any unusual activity [4]; in a living room, a system may send an alarm if an elderly person falls on the floor or another dangerous event occurs [5]. Therefore, methods to retrieve and summarize useful data efficiently are essential for handling the huge amount of video.

Research topics on surveillance systems have been discussed over the last decades, for example, object detection, tracking and identification, scene understanding, and event detection [6] [7] [8] [9]. Although different applications are developed for different scenarios, many basic techniques can be applied to most surveillance videos. Object detection is usually the first step of a surveillance system; it locates the regions of interest (ROIs), that is, the objects within a scene. After a sequence of video frames is processed, we need to know whether two objects detected in two frames are the same one. This is called object identification or object tracking.

Trajectories and paths of detected objects are collected by tracking the objects over a period of time. We can use this information to retrieve high-level semantics, such as the number of vehicles that passed by and events such as car accidents.

A traffic surveillance system can reveal various kinds of information. We can monitor the traffic condition by analyzing the videos collected from cameras installed along roads and highways. Traffic data such as driving speed or the number of vehicles passing by can be provided to drivers or used in navigation systems to avoid congested areas. Another type of information is whether a specific event has happened, such as a traffic accident or a traffic rule violation. The police nearby can receive alarms from the control center and reach the location quickly. For long-term monitoring, traffic data are stored in databases, so that we can analyze congestion on a road section or retrieve historical records of specific vehicles driven by criminals.

In recent years, researchers have focused on tunnel surveillance, since accidents in a tunnel may cause serious problems [10]. The traffic agency monitors the traffic condition of a tunnel using multiple surveillance cameras and tries to discover unusual events in real time. However, this is not an easy task: most of the time nothing interesting happens, which makes it hard for workers to concentrate on the screens. Another shortcoming is that sometimes there are not enough cameras to cover the whole tunnel, so there are temporal and spatial gaps between videos. Therefore, it is necessary to develop a computer system that can automatically provide precise and brief information. Figure 1-1 shows an example of a tunnel surveillance system that contains multiple cameras. Many surveillance systems for daytime traffic can be directly applied to tunnels because the major features are the same. However, tunnels present additional challenges, such as poor illumination conditions. It is worth considering more aspects in tunnels than in daytime traffic to achieve better performance.

Figure 1-1. An example of a tunnel surveillance system.

There are several difficulties in traffic surveillance: low resolution, low frame rate, limited view, and camera sensor noise. Since a huge number of cameras must be installed and a massive amount of video data must be stored, the cameras have to be relatively inexpensive and the data should be as small as possible. This lowers the quality of the videos and makes their analysis more challenging. In addition, there are some differences between surveillance systems on highways and in tunnels. Most road systems work only in the daytime, where many basic algorithms can be applied because there is sufficient light. In tunnels, however, illumination effects are common and produce more noise and unpredictable problems. Fortunately, the tunnel environment also has some benefits, such as fewer lanes, stricter driving constraints, and lighting conditions that stay the same around the clock.


Another issue is that many traffic surveillance systems do not consider information shared between two or more cameras. As a car drives along a road past multiple cameras, we would first like to know whether two vehicles in two videos are the same. This task is called multi-camera object tracking or multi-camera object identification. It is more challenging than tracking in a single camera because the viewpoints, poses, and lighting differ, and these differences may change the visual appearance of an object from one camera to another. Sometimes we cannot track a vehicle across cameras because there are temporal and spatial gaps in between. However, we can still identify vehicles by considering the features of each one. A naïve method of identifying vehicles between cameras is to obtain the unique license plate number using license plate recognition. However, the plate number is usually not available in a surveillance system because of the poor quality of the videos. Hence, robust image features are needed for multi-camera object identification.
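To make the idea of an appearance feature concrete, the following minimal sketch compares vehicle images by a coarse color histogram and histogram intersection. It is an illustration only, not the feature pipeline of this thesis: the function names, the bin count, the plain list-of-pixels image representation, and the sample pixel values are all assumptions chosen for the example.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each channel in 0..255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Return a normalized joint RGB histogram with bins_per_channel**3 bins.

    Hypothetical helper for illustration; a real system would crop the
    detected vehicle region first.
    """
    if not pixels:
        raise ValueError("image has no pixels")
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = float(len(pixels))
    return [count / total for count in hist]


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Similarity in [0, 1]; 1 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy "images": a red car seen by two cameras, and a blue car (assumed values).
red_cam_a = [(200, 30, 30)] * 80 + [(20, 20, 20)] * 20
red_cam_b = [(220, 40, 40)] * 70 + [(20, 20, 20)] * 30
blue_cam_b = [(30, 30, 200)] * 80 + [(20, 20, 20)] * 20

same_vehicle_score = histogram_intersection(
    color_histogram(red_cam_a), color_histogram(red_cam_b))
different_vehicle_score = histogram_intersection(
    color_histogram(red_cam_a), color_histogram(blue_cam_b))
```

Coarse bins tolerate small color shifts between cameras, but a fixed binning like this does not compensate for strong illumination changes, which is one reason a single feature is rarely sufficient on its own.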

In this thesis, we propose a multi-camera vehicle identification system for tunnels. First, vehicle detection and tracking are performed on a single tunnel surveillance video using a Haar-feature-based cascade detector [11]. After images of vehicles are collected from the videos, their visual features are extracted. Features such as color histograms, Haar-like feature vectors, SIFT-based feature points, and template matching in the pixel domain are studied and evaluated experimentally. Next, we propose a multi-camera vehicle identification method that uses the calculated feature vectors to identify vehicles between two non-overlapping views of different cameras. The first step is the Spatiotemporal Successive Dynamic Programming (S2DP) algorithm, which matches vehicles in two cameras. Then, two algorithms that follow S2DP are proposed for different requirements: the Real-Time (RT) algorithm for real-time tracking and the Offline Refinement (OR) algorithm for offline refinement. Finally, experiments on and discussions of the proposed methods are presented.
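As a rough illustration of how dynamic programming can match vehicles between two cameras, the sketch below aligns two time-ordered vehicle lists given a matrix of pairwise feature distances. It is not the S2DP algorithm, whose details are given in Chapter 3. The assumption that vehicles keep their order between cameras, the function name `match_in_order`, the `skip_penalty` parameter, and the sample cost values are all hypothetical and chosen only for this example.

```python
from typing import List, Tuple


def match_in_order(cost: List[List[float]], skip_penalty: float) -> List[Tuple[int, int]]:
    """Order-preserving minimum-cost matching between two vehicle sequences.

    cost[i][j] is an assumed feature distance between vehicle i in camera A
    and vehicle j in camera B, both listed in order of appearance.
    A vehicle left unmatched in either camera costs skip_penalty.
    Returns the matched (i, j) index pairs in increasing order.
    """
    n = len(cost)
    m = len(cost[0]) if n else 0

    # best[i][j]: minimum total cost using the first i vehicles of A
    # and the first j vehicles of B.
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = best[i - 1][0] + skip_penalty
    for j in range(1, m + 1):
        best[0][j] = best[0][j - 1] + skip_penalty
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = min(
                best[i - 1][j - 1] + cost[i - 1][j - 1],  # match i-1 with j-1
                best[i - 1][j] + skip_penalty,            # leave A's vehicle unmatched
                best[i][j - 1] + skip_penalty,            # leave B's vehicle unmatched
            )

    # Backtrack from the bottom-right cell to recover the matched pairs.
    pairs: List[Tuple[int, int]] = []
    i, j = n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + cost[i - 1][j - 1]:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif best[i][j] == best[i - 1][j] + skip_penalty:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


# Three vehicles seen in camera A, two in camera B; the second vehicle of A
# is assumed to have been missed by camera B.
example_cost = [
    [0.1, 0.9],
    [0.8, 0.7],
    [0.9, 0.2],
]
example_pairs = match_in_order(example_cost, skip_penalty=0.5)
```

The skip penalty controls how readily a vehicle is declared unmatched: a small value favors leaving doubtful vehicles unpaired, while a large value forces pairings even when the feature distance is high.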

The remainder of this thesis is organized as follows. In Chapter 2, we introduce related work on surveillance video analysis. Chapter 3 describes the proposed tunnel surveillance system and the multi-camera vehicle identification methods. The experimental settings and results are presented and discussed in Chapter 4. Finally, conclusions and a discussion of future work are given in Chapter 5.

