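The Proposed Approaches - A Study on Image Processing and Computer Vision Techniques Applied to Complex Document Image Analysis, Nighttime Driver Assistance, and Video Surveillance Systems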

Chapter 1. Introduction

1.2 The Proposed Approaches

In this dissertation, we present several algorithmic, practical, and integrated methods and systems, based on image processing and computer vision techniques, to address the issues described above. These include multilevel thresholding techniques for low-level image segmentation, text extraction for complex document image analysis, nighttime vehicle detection for driver assistance, and multi-channel video surveillance. They are briefly introduced in the following subsections.

1.2.1 Multilevel thresholding approaches for image segmentation

To segment objects from a given image, the objects, each with homogeneous illumination, must be separated into different segmented images. However, most conventional thresholding techniques [12]-[26] were developed for bi-level thresholding, and as the number of desired thresholds increases, the computational cost of obtaining the optimal threshold values increases substantially.

Another problem with these conventional methods is that the number of segments into which the image should be divided cannot be determined suitably and automatically.

To this end, a discriminant criterion that measures the separability among the segmented images containing different objects is employed. By evaluating this separability criterion, the number of objects into which the image should be segmented can be determined automatically. Hence, an automatic multilevel thresholding method based on this criterion is presented in this dissertation.

The concept of using discriminant analysis for classification problems was first introduced by Fisher [55] and was applied to image thresholding by Otsu [12]. It is attractive for the computational simplicity with which it measures the separability among segmented images. In Chapter 2, we analyze the properties of discriminant analysis and then propose an automatic multilevel thresholding method [56]. The proposed method applies the discriminant criterion to analyze the separability among the gray levels in the image and to automatically determine the optimal number of thresholded classes into which the gray levels should be partitioned. A fast recursive selection strategy is also introduced for determining the optimal thresholds, so that objects of interest in complex images are segmented into separate thresholded images at low computational cost. Each threshold determined by this recursive selection strategy is ensured to achieve the maximum separation in the resultant thresholded images, and hence satisfactory thresholded results can be obtained with the smallest number of thresholding levels. To allow an equitable performance comparison between the proposed method and other criterion-based methods (namely the between-class variance method [12], the entropy method [13], and the minimum error method [14]), we also introduce an efficient combinatorial scheme [57] that reduces the computational complexity of performing multilevel thresholding with these methods.
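As an illustration of the idea, the following Python sketch repeatedly splits the class that contributes most to the within-class variance, using Otsu's between-class variance criterion [12], until a separability factor (taken here as between-class variance divided by total variance) reaches a target value. It is a minimal sketch under stated assumptions and not the method of Chapter 2 itself: the function names, the target value sf_target=0.9, and the max_classes cap are hypothetical choices made for illustration only.

```python
def scatter(hist, lo, hi):
    """Sum of squared deviations from the class mean over gray levels lo..hi."""
    n = sum(hist[lo:hi + 1])
    if n == 0:
        return 0.0
    mean = sum(g * hist[g] for g in range(lo, hi + 1)) / n
    return sum(hist[g] * (g - mean) ** 2 for g in range(lo, hi + 1))


def otsu_split(hist, lo, hi):
    """Threshold t (lo <= t < hi) maximizing between-class variance of lo..t vs t+1..hi."""
    total = sum(hist[lo:hi + 1])
    total_sum = sum(g * hist[g] for g in range(lo, hi + 1))
    best_t, best_var = lo, -1.0
    w0 = s0 = 0
    for t in range(lo, hi):
        w0 += hist[t]
        s0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one side is empty, so no valid split here
        m0, m1 = s0 / w0, (total_sum - s0) / w1
        var_b = w0 * w1 * (m0 - m1) ** 2  # unnormalized between-class variance
        if var_b > best_var:
            best_var, best_t = var_b, t
    return best_t


def auto_multilevel_thresholds(hist, sf_target=0.9, max_classes=8):
    """Return (thresholds, separability). sf_target and max_classes are assumed values."""
    total_scatter = scatter(hist, 0, len(hist) - 1)
    if total_scatter == 0:
        return [], 1.0  # a single gray level needs no threshold
    classes = [(0, len(hist) - 1)]
    while True:
        within = [scatter(hist, lo, hi) for lo, hi in classes]
        sf = 1.0 - sum(within) / total_scatter
        if sf >= sf_target or len(classes) >= max_classes:
            break
        # Split the class with the largest within-class scatter.
        k = max(range(len(classes)), key=within.__getitem__)
        lo, hi = classes[k]
        t = otsu_split(hist, lo, hi)
        classes[k:k + 1] = [(lo, t), (t + 1, hi)]
    # Each threshold is the inclusive upper gray level of a class.
    return [hi for _, hi in classes[:-1]], sf
```

For a histogram with three well-separated peaks, this sketch stops after two splits and returns two thresholds that isolate the peaks, so the number of classes is found without being specified in advance.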

1.2.2 A multi-plane segmentation approach for text extraction in complex document images

Extracting textual objects from complex document images involves several difficulties. These difficulties arise from the following properties of complex documents: 1) Character strings in complex document images may have different illuminations, sizes, and font styles, and may overlap with various background objects that have uneven, gradational, and sharp variations in contrast, illumination, and texture, such as illustrations, photographs, pictures, or other background textures. 2) These documents may comprise small characters with very thin strokes as well as large characters with thick strokes, and may be affected by image shading.

Hence, we propose effective region-based approaches for extracting textual objects from these complex document images [58]-[62] and for resolving the above issues associated with the complexity of their backgrounds. The document image is processed by the proposed multi-plane segmentation technique, which decomposes it into separate object planes. The technique comprises two stages: automatic localized histogram multilevel thresholding, and multi-plane region matching and assembling. After multi-plane segmentation has been carried out, homogeneous objects, including textual blocks, other non-text objects, and background textures, are separated into individual object planes. The text extraction process is then performed on the resultant planes to detect and extract textual objects with different characteristics in the respective planes. Because the proposed method processes the document image regionally and adaptively according to local features, the detailed characteristics of the extracted textual objects are well preserved, especially small characters with thin strokes and the gradational illuminations of characters. For the same reason, characters that adjoin or touch graphical objects and backgrounds with uneven, gradational, and sharp variations in contrast, illumination, and texture can also be handled well.
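The first stage, localized thresholding into object planes, can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the helper names tile and decompose_into_planes are hypothetical, the thresholds are supplied by the caller rather than computed from the local histogram, a pixel value v is assigned to plane k when it is at most thresholds[k] and above thresholds[k-1], and the region matching and assembling stage is not shown.

```python
from bisect import bisect_left


def tile(image, size):
    """Yield (top, left, block) for non-overlapping size x size tiles of a 2-D list."""
    for top in range(0, len(image), size):
        for left in range(0, len(image[0]), size):
            block = [row[left:left + size] for row in image[top:top + size]]
            yield top, left, block


def decompose_into_planes(block, thresholds):
    """Split one block into len(thresholds) + 1 binary planes.

    planes[k][r][c] is True when pixel (r, c) falls in gray-level class k.
    thresholds must be sorted in ascending order.
    """
    n_planes = len(thresholds) + 1
    planes = [[[False] * len(row) for row in block] for _ in range(n_planes)]
    for r, row in enumerate(block):
        for c, value in enumerate(row):
            planes[bisect_left(thresholds, value)][r][c] = True
    return planes
```

In a fuller pipeline, each tile would receive its own thresholds from a local histogram analysis, and planes from neighboring tiles that share similar characteristics would then be matched and merged into document-wide object planes.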

1.2.3 Vision-based nighttime vehicle detection for driver assistance

For nighttime driver assistance and the development of autonomous camera-assisted vehicles, an efficient technique for detecting and recognizing moving vehicles in nighttime road-scene image sequences is a practical necessity.

Moreover, such a technique provides useful information that helps the driver perceive the surrounding traffic conditions during nighttime driving, and it can also be applied to versatile control schemes for vehicle equipment. For example, the high-beam and low-beam states of the headlights can be controlled intelligently according to the detected presence of oncoming and preceding vehicles, so that many hazards during nighttime driving, such as headlight dazzle, can be effectively prevented.

Therefore, we present an effective nighttime vehicle detection method [63]-[65] that identifies vehicles by locating and analyzing their headlights and taillights. The proposed method comprises the following processing stages. First, a fast bright object segmentation process based on automatic multilevel histogram thresholding is performed to extract the pixels of bright objects from the captured image sequences of nighttime road scenes. The advantage of this automatic multilevel thresholding approach is its robustness and adaptability in dealing with various illumination conditions at night. Then, a connected-component analysis procedure is applied to the bright pixels obtained in the previous stage to locate the connected components of these bright objects. These bright components are then grouped by a projection-based spatial clustering process to obtain potential paired headlights of oncoming vehicles and taillights of preceding vehicles. Next, a set of identification rules is applied to each group of bright objects to determine whether it represents an actual vehicle. Finally, the distance between each detected vehicle and the camera-assisted car is estimated and reported.
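The middle stages of this pipeline can be sketched in Python as follows. The sketch is illustrative only and rests on several assumptions: the bright-pixel threshold is passed in as a fixed value instead of being derived by multilevel thresholding, a simple pairwise rule (vertical overlap and similar height) stands in for the projection-based spatial clustering and the identification rules, and the names bright_components, pair_lights, and max_height_ratio are hypothetical. Distance estimation is not shown.

```python
from collections import deque


def bright_components(image, threshold):
    """Return bounding boxes (top, left, bottom, right) of 8-connected bright regions."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Breadth-first flood fill from this unvisited bright pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and not seen[ny][nx]
                                and image[ny][nx] > threshold):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def pair_lights(boxes, max_height_ratio=1.5):
    """Pair boxes that overlap vertically and have similar heights (assumed rule)."""
    pairs = []
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            a, b = boxes[i], boxes[j]
            overlap = min(a[2], b[2]) - max(a[0], b[0]) + 1
            height_a, height_b = a[2] - a[0] + 1, b[2] - b[0] + 1
            similar = max(height_a, height_b) <= max_height_ratio * min(height_a, height_b)
            if overlap > 0 and similar:
                pairs.append((a, b))
    return pairs
```

Two lamps of similar size at the same image height are reported as one candidate pair, whereas an isolated bright spot higher in the frame, such as a street lamp, is left unpaired.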

1.2.4 Real-time wavelet-based video compression approach for video surveillance

To develop a digital surveillance system that fulfills the requirements of real-time multi-channel video compression while ensuring high quality of the restored images and efficient compression and decompression, we present a real-time wavelet-based video compression technique and an intelligent multi-channel surveillance system [66]. Based on a low-complexity, low-memory-cost wavelet-based coding scheme and a motion compression strategy, the proposed video codec achieves high visual quality, high compression speed, and a high compression ratio. ActiveX COM component technology is also implemented and integrated with the proposed video codec to support multimedia, Internet, and many other video-intensive applications.
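To illustrate why a wavelet transform can be both low in complexity and low in memory cost, the following Python sketch performs one level of an integer Haar transform (the S-transform) on a row of pixels and inverts it exactly. The choice of the Haar filter is an assumption made for illustration; the wavelet filter actually used by the proposed codec is not specified here, and the function names are hypothetical.

```python
def haar_forward(signal):
    """One-level integer Haar (S-transform) of an even-length sequence.

    Returns (approx, detail): approx holds floor((a + b) / 2) for each pair
    and detail holds a - b. Only integer additions and shifts are needed.
    """
    approx, detail = [], []
    for i in range(0, len(signal), 2):
        a, b = signal[i], signal[i + 1]
        d = a - b
        approx.append(b + (d >> 1))  # equals floor((a + b) / 2)
        detail.append(d)
    return approx, detail


def haar_inverse(approx, detail):
    """Exactly reconstruct the sequence produced by haar_forward."""
    out = []
    for s, d in zip(approx, detail):
        b = s - (d >> 1)
        out.extend((b + d, b))
    return out
```

In smooth image regions the detail coefficients are small and therefore cheap to encode, while the approximation coefficients form a half-resolution signal to which the same step can be applied again.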

Furthermore, an intelligent surveillance system, which integrates the proposed wavelet-based video codec, computer peripherals and mobile communication, is also developed in this study.

In this way, the future e-Home, with controlled home electronics, managed video/audio systems, and home security, can be realized.
