Real-time License Plate Recognition based on Cascaded Rejection Mechanisms to Reduce Spatio-temporal Search Space in Video Sequences

Student: Shen-Zheng Wang (王舜正)

Advisor: Prof. Hsi-Jian Lee (李錫堅)

National Chiao Tung University

Institute of Computer Science and Engineering

Doctoral Dissertation

A Dissertation Submitted to the Institute of Computer Science and Engineering, College of Computer Science, National Chiao Tung University, in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy in Computer Science and Engineering

June 2008

Hsinchu, Taiwan, Republic of China

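Abstract (translation of the Chinese abstract)

In surveillance applications, search space reduction (SSR) is a key to developing efficient algorithms. In this study of license plate recognition systems that take video input, we integrate spatial and temporal SSR techniques. However, as more and more features must be measured, the computational load may increase significantly. Given that most input patterns are non-target samples (not objects to be recognized), rejecting the bulk of them as early as possible appears highly beneficial. We therefore propose a real-time license plate recognition system based on a cascaded rejection framework that reduces the spatio-temporal search space in video while preserving system performance. To extract plates correctly even in complex environments, we first propose two object representations: compact plate regions and repeated regions. A compact plate region is defined as the region enclosed between the upper and lower bounds of the plate characters; it can be extracted in the first step, which avoids an additional procedure for detecting the character bounds. Our method starts with spatial SSR, which comprises three modules: one-pass compact plate region detection, bi-level plate character segmentation, and adaptive machine learning. Using features that are cheap to compute, such as vertical gradients and extended Haar-like features, these modules extract and verify candidate regions of compact plates or characters. Furthermore, we propose rejecting regions of identical appearance that recur in the video; these regions usually contain stopped vehicles or fixed backgrounds and can be rejected to avoid repeated classification. For efficiency, repeated patterns are detected only within plate candidate regions, which we call spatio-temporal SSR. The repeated-pattern matching is designed as a block-based mechanism that computes the tangent distance to overcome variations in position, size, rotation, or brightness. In our experiments, spatio-temporal SSR reduces the search space by 87.9%; the license plate recognition system processes 38 images per second at a resolution of 640 × 480 on a 3-GHz Intel P-IV PC.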

Real-time License Plate Recognition based on Cascaded Rejection

Mechanisms to Reduce Spatio-temporal Search Space in Video Sequences

Student: Shen-Zheng Wang

Advisor: Prof. Hsi-Jian Lee

Submitted to Institute of Computer Science and Engineering

College of Computer Science

National Chiao Tung University

ABSTRACT

In surveillance applications, search space reduction (SSR) is essential to efficient algorithms. In this study, spatial and temporal SSRs are integrated for license plate recognition (LPR) in video sequences. However, as more features are measured, the computational load may increase significantly. Given that most input patterns are negatives, it is efficient to reject the majority of negatives as early as possible. Therefore, we propose a real-time LPR system based on a cascaded rejection framework that rapidly reduces the spatiotemporal search space while maintaining high performance. To extract plates accurately even in complicated situations, two representations, compact plate regions and repeated regions, are first presented. Compact plate regions, which bound the top and bottom of plate characters, can be extracted in the first stage to avoid additional removal procedures. Our method starts from spatial SSR, using algorithms for one-pass compact plate extraction, bi-level plate character segmentation, and adaptive machine learning. Region candidates of compact plates or plate characters are extracted and verified by these algorithms using efficiently computed features, such as vertical gradients and extended Haar-like features. Moreover, we propose to exclude repeated patterns with similar appearances in the same location of consecutive frames; these usually include stopped vehicles or regular backgrounds and can be excluded from repeated classification. For efficiency, repeated patterns are detected only on the plate candidates, an approach named spatiotemporal SSR, using a block-based mechanism that estimates the tangent distance, which is invariant to variations in position, size, rotation, or brightness. In our experiments, the search space was reduced by up to 87.9% by the spatiotemporal SSR; the LPR system recognizes plates at over 38 frames per second at a resolution of 640 × 480 pixels on a 3-GHz Intel P-IV PC.

Acknowledgements

Writing this dissertation is the most difficult, yet the most gratifying, work that I have ever done. The accomplishment of this work, however, could not have been possible without many people's constant support and help. I would like to express my sincere thanks to all the people who have played significant roles during my Ph.D. studies.

First, I am deeply grateful to my advisor Prof. Hsi-Jian Lee for the past years, for his great guidance, infinite patience, and strong faith in me. I am greatly indebted to him for many invaluable contributions he made to the development of this study. On the personal side, I truly appreciate the full confidence he had in my ability and the continuing care and encouragement he gave me at difficult times. He has always known how to help me to go through all the hurdles in this long journey and motivate me to do my best work. It is indeed a great blessing that I have had Prof. Hsi-Jian Lee as my mentor.

Next, I would like to express my special gratitude to the two committee members, Prof. Wen-Hsiang Tsai and Prof. Jen-Hui Chuang. They have been generous with their time giving me detailed comments and eminently helpful suggestions on this study.

Many thanks are also due to my research participants. It has been a wonderful experience working closely on, for, and with them, and I truly appreciate their unreserved sharing and trust in me. Finally, I owe a great deal to my parents and my girlfriend, Trista, for always encouraging me to achieve my goals and comforting me during my emotional storms even though we live apart. With deep love and gratitude, I dedicate this dissertation to them.

List of Tables

2.1 List of research focus

5.1 Recognition results of our plate recognition system (I)

5.2 Recognition results of our plate recognition system (II)

5.3 Detection results of our plate detection module

5.4 Comparison results of the computational speed and the accuracy rate for plate recognition

6.1 The sequences used in the benchmark

6.2 The comparisons between the OPE and the BOPE algorithms

6.3 The comparisons between adopting repeated region prediction and none

6.4 The comparisons adopting different parameters of the plate height Hp used in OPE and BOPE

List of Figures

1 Two image categories in surveillance applications: (a) a still image from a camera, and (b) nine successive frames in a video stream.

2 Examples of variations in license plates and their environments.

3 Examples of variations in plate types and environments: (a) different vehicle headlights, (b) different types of environment lighting, and (c) plate-like background.

4 Two different appearances of license plates.

5 The images of character R in different orientations. The dashed lines represent the major axis.

6 The rotation result of an image block. (a) The original image; (b) The rotated image with a -30-degree transformation.

7 The coordinate diagram.

8 The system diagram.

9 (a) The license plate image; (b)-(d) The possible license plate regions with the sizes of window mask, 7 × 3, 11 × 7 and 15 × 11.

10 (a) The car image with a license plate; (b) The vertical gradients of Fig. 10(a); (c) The local variance of Fig. 10(b); (d) The possible license plate regions.

11 (a) The car image with a license plate; (b) The separated possible license plate regions.

12 (a) The car images with license plates; (b) The possible license plate regions; (c) The license plate detection results.

13 License plates with different sizes in a frame.

14 License plates with partial occlusion in a frame.

15 A license plate with unobvious borders. The plate and background have similar gray values.

16 Schematic diagram of the rejection mechanisms for license plate extraction.

17 Skew correction. The bottom border can be obtained from the original image.

18 Character Segmentation: (a) a source image and (b) the segmentation result part.

19 A difference result of two consecutive images. If the corresponding pixels in the two images have different grey-levels, the original gray value in the current frame is stored in the difference frame.

20 The white blocks have low contrast and will be removed from the current frames. (a) Current frame, Ft1; (b) the result of the frame Ft1; (c) current frame Ft2; (d) the result of the frame Ft2.

21 The results of removing low-contrast and stationary blocks in the current frame of Fig. 19.

22 The example of the misclassified stationary block, block (2, 3).

23 Misclassified moving blocks. (a) Blocks in the reference frame; (b) blocks in the current frame; (c) image blocks after removing low-contrast ones; (d) image blocks after removing stationary ones.

24 The non-white blocks represent the moving vehicle blocks after our block-based rejection module.

25 Results of gradients measurements and the vertical and horizontal projections: (a) the original image, (b) vertical gradients image, and (c) horizontal gradients image.

26 A result of removing horizontal lines from horizontal projections. The characters in license plates become broken. (a) The source image, (b) the horizontal gradient image, and (c) the result image.

27 A result of removing horizontal lines from vertical projections. The characters in license plates remain complete. (a) The source image, (b) the vertical gradient image, and (c) the result image.

28 A comparison of vertical projections of the horizontal and the vertical gradients. Characters have low counts in the projections of vertical gradients.

29 The result of removing non-license plate regions for the image in Fig. 14. (a) Result of the block-based rejection; (b) Result after the horizontal scan lines of non-license plate regions are removed.

30 A result of removing scan lines in blocks. Note that the license plate "IWA-091" is broken into two blocks.

31 Candidate regions selection. Several regions are excluded since they do not satisfy the geometrical constraints.

32 Five license plates detected successfully. (a) The source image; (b) The interesting regions detected.

33 An erroneous example of interesting region detection. (a) Source image; (b) The result of interesting regions detection.

34 A result of license plate detection. (a) The test image; (b) Two regions of a moving object are detected. The plate region is detected successfully.

35 An erroneous example for license plate extraction. (a) Source image; (b) Image after gradients measurement.

36 Patterns of (a) normal plate regions and (b) compact plate regions.

37 Differences of the ROC curves between a rejecter, a strong classifier, and ...

38 (a) Schematic diagram of the cascade framework and (b) ROC curves for the three rejecters.

39 Schematic diagram of the rejection mechanisms for single images.

40 Different gradient results of an input image: (a) Original image, (b) Sobel gradients, (c) vertical gradients, and (d) horizontal gradients.

41 Schematic diagram of the one-pass algorithm for extracting compact plate regions.

42 An example in which the one-pass plate extraction algorithm is applied: (a) Original image, (b) Gp of plate candidates, and (c) detected compact plate region.

43 Schematic diagram of the learning and verification steps.

44 Schematic diagram of the plate verification procedure.

45 Feature prototypes of upright and skewed Haar-like features used in our system.

46 Acceleration tables: (a) SAT and (b) SSAT.

47 Schematic diagram of the learning phase: f denotes the maximum acceptable FPR per stage; t, the minimum acceptable TPR per stage; Ftarget, the overall false positive rate; FPRi, the false positive rate at stage i; and TPRi, the true positive rate at stage i.

48 Schematic diagram of the extraction of one Haar-like feature (1 × 2) from two compact plate regions.

49 Patterns of (a) compact plate regions, (b) distinct characters, and (c) indistinct characters.

50 Inevident difference between the colors of the characters and the background: (a) compact plate region, (b) histogram, and (c) binary representation after thresholding by Otsu's method.

51 One segmentation result: (a) Original compact plate region, (b) histogram of image (a) where peaks and valleys are indicated by white and gray lines, respectively; (c)-(e) three thresholding results with three different peak values, from low to high; (f) indistinct characters detected by histogram segmentation; (g) distinct characters detected by histogram segmentation; (h) projection profile of (f); (i) single characters detected by projection segmentation.

52 Schematic diagram of the mode initialization step.

53 Schematic diagram of the peak-valley decision step.

54 Schematic diagram of the character verification step.

55 Feature prototypes of OCR: (a) CC and (b) DC.

56 Examples of a variety of plates on a wet day: (a) input images, (b) recognized results shown in color images, and (c) enlarged recognized plates.

57 Examples of a variety of plates on a cloudy day: (a) input images, (b) recognized results shown in color images, and (c) enlarged recognized plates.

58 Examples of a variety of plates on a sunny day: (a) input images, (b) recognized results shown in color images, and (c) enlarged recognized plates.

59 Examples of a variety of plates with the pavement background of texture patterns: (a) input images, (b) recognized results shown in color images, and (c) enlarged recognized plates.

60 Examples of a variety of plates captured from a device installed in a car: (a) input images, (b) recognized results shown in color images, and (c) enlarged recognized plates.

61 Comparison of feature numbers between classifiers using only upright and both upright and skewed Haar-like features.

62 Overview of the rejection mechanisms for video sequences.

63 Results of the OPE and the BOPE algorithms in a complex environment of a parking lot: (a) the original frame with multiple motorcycle plates, (b) the corresponding plate candidate map, (c) the result after removing the four types of plate runs from (b), (d) the result after applying the OPE procedure in (b), (e) the result after applying the BOPE procedure in (b), (f)-(h) the extracted regions R1-R3 of (d), and (i)-(k) the extracted regions R1-R3 of (e).

64 Examples of repeated regions in the same locations of two successive frames: (a) Left: the frame at time t − 1, Right: the frame at time t, (b) the candidate regions in the two frames, and (c) the enlarged regions.

65 Schematic diagram of the Euclidean distance and the tangent distance between P and E. The curves Sp and Se represent the sets of patterns after certain transformations to P and E. The lines TP and TE represent the tangent spaces passing through P and E, respectively.

66 Schematic diagram of the repeated block matching mechanism: (a) the candidate region CR extracted from the overlap between the plate candidates in the current frame PCt and the preceding frame PCt−1, (b) the selected candidate blocks, (c) the matched repeated blocks, and (d) the selected regions to detect repeated regions.

67 An example of the repeated block matching mechanism: (a) Left: the frame at time t − 1, Right: the frame at time t, (b) the plate candidates extracted by BOPE algorithm, (c) the candidate regions, (d) the candidate blocks, and (e) the repeated blocks of the frame at time t.

68 Schematic diagram of the acceleration mechanism for the calculations of tangent vectors. (a) For the worst case, tangent vectors are re-calculated at every two frames since the input region varies continuously. (b) For the generic case, tangent vectors are re-calculated at the next time after the input region is identified as a non-repeated region. (c) For the best case, tangent vectors are calculated only once because the same patterns occur continuously.

Chapter 1 Introduction

1.1 Motivations

Information collected automatically for smart visual surveillance applications, such as portal control, traffic monitoring, and vehicle detection, has gained increasing importance in Intelligent Transportation Systems (ITS). With the rapid development of vehicle and visual analysis technologies, automatic vehicle recognition has become increasingly practical in many applications over the past two decades. On vehicles, license plates serve as unique identifiers with regular patterns, such as a sequence of characters. The design of robust license plate recognition (LPR) systems has therefore become an important issue for recording, analyzing, and reporting surveillance targets. The inputs of LPR systems can be divided into two types: still images and frame sequences. In the first type, the processing of a still image from a camera can be designed explicitly to handle complex scenes, such as multiple plates at different locations in an image. In the second type, temporal information can be used to identify plates more robustly. Figure 1 shows examples of the two image types in surveillance applications. Figure 1(a) shows a still image captured by a camera controlled by the police, while Fig. 1(b) shows several frames from a capturing device installed in a car.

Figure 1: Two image categories in surveillance applications: (a) a still image from a camera, and (b) nine successive frames in a video stream.

1.2 The Considered Problems

To develop an LPR system for surveillance, three requirements must be satisfied. The first is high accuracy; for instance, the accuracy rate, measured as the weighted sum of the true positive rate (TPR) and the false positive rate (FPR), should be above 95%. The rate should be measured in a real surveillance environment with variations in plate types and environments. The second requirement is to process all frames of an input stream in real time, because successive frames may contain temporal plate information useful in surveillance applications. Moreover, plates may not be recognized in certain frames due to unexpected events; for example, the plates may be covered by other passing vehicles or appear significantly blurred when they are out of focus. Given the successively recognized results, strategies such as voting can be used to increase the accuracy rate. Currently, a common input stream arrives at a standard frame rate of approximately 30 fps (frames per second) and a resolution of 640 × 480 pixels. Thus, a practical plate recognition system, such as the real-time system in this study, should run at over 30 fps on high-dimensional input (640 × 480 pixels). The third requirement is adaptability. In real-world environments, unfamiliar plate types or backgrounds generally exist. When new plate types or backgrounds arise, the plate recognition system should be able to recognize them correctly after automatically learning the variation in plates and non-plates.
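The voting strategy mentioned above can be illustrated with a minimal sketch. It assumes that each frame yields either a recognized plate string or nothing; the function name `vote_plate` and the `min_votes` threshold are hypothetical and are not taken from this dissertation.

```python
from collections import Counter


def vote_plate(readings, min_votes=2):
    """Return the most frequent plate string among per-frame readings.

    readings:  recognized strings, with None for frames in which no plate
               was recognized (for example, due to occlusion or blur).
    min_votes: hypothetical threshold; with fewer votes no decision is made.
    """
    counts = Counter(r for r in readings if r is not None)
    if not counts:
        return None
    plate, votes = counts.most_common(1)[0]
    return plate if votes >= min_votes else None


# Five successive frames: one missed plate and one misread character.
frames = ["IWA-091", None, "IWA-091", "IWA-O91", "IWA-091"]
print(vote_plate(frames))
```

A single misread or missed frame does not change the decision as long as the correct reading dominates the window, which is why processing every frame of the stream matters.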

In plate recognition applications, since input plates and environments vary widely, the available techniques are often not sufficiently accurate, robust, or practical. For instance, plate sizes and locations may change significantly between inputs because the target or the capturing device moves continuously; meanwhile, the illumination also differs. Moreover, few publications consider motorcycles. Detecting license plates in environments with motorcycles is more difficult because the plates often connect directly with complex backgrounds. However, detecting license plates on motorcycles is desirable because motorcycles are an important means of transportation in many countries. Figure 2 shows several license plates captured in different environments; some of the plates belong to motorcycles with white or green plates. Figure 3 shows further examples of variations that were tested in this study. The headlights of different vehicles at night are shown in Fig. 3(a), and examples of environmental lighting in different weather conditions are shown in Fig. 3(b). Plate-like background patterns on a tiled floor are shown in Fig. 3(c).

Figure 2: Examples of variations in license plates and their environments.

Figure 3: Examples of variations in plate types and environments: (a) different vehicle headlights, (b) different types of environment lighting, and (c) plate-like background.

Statistical mechanisms provide more robust and accurate representations. However, an important problem with statistical features or techniques is that the computational load is high when a statistical mechanism is applied to all input candidates (more than one hundred thousand units). Instead of explicitly segmenting the characters in detected plates, Amit et al. [1] use a coarse-to-fine approach for both the detection and recognition of characters on license plates. Although they achieve high recognition rates, the statistical technique requires 3.5 seconds to process an input image and thus does not satisfy the requirements of surveillance applications. Another problem of the previously proposed algorithms, as detailed in Chapter 2, is the tradeoff between multiscale ability and computational cost. Most of them compute a multiscale image pyramid in order to detect targets of a particular size, which is complicated and time-consuming.

It is challenging to develop an LPR system that satisfies all three requirements, because it is difficult to adjust existing approaches to obtain a real-time system with a high TPR while maintaining a low FPR across various plate types and environments. A system may satisfy the minimum real-time criterion (for example, 15 fps) only for one module, such as plate detection [2]; it is difficult to meet the real-time requirement for the entire LPR system. All variations under consideration for single images and video sequences are summarized as follows.

1.2.1 Difficulties for Single Images

• Plate variations

– Location

∗ Plates may exist in different locations of an input image.

– Quantity

∗ An input image may contain many or no plates.

– Size

∗ Plates with different sizes may exist in an image or in different images.

– Colors of plate characters and backgrounds

∗ Plates may have various character and background colors due to different plate types (taxis, private cars, etc.) or capturing devices.

– Others

∗ In addition to characters, a plate may contain adornments such as frames and screws.

• Environment variations

– Illumination

∗ Different types of illumination may occur in input images, mainly due to environmental lighting and vehicle headlights.

– Plate-like background patterns

∗ A background may contain patterns similar to plates, such as numbers stamped on a vehicle, bumpers with vertical patterns, and textured floors.

1.2.2 Difficulties for Video Sequences

In video-based surveillance applications, the input is usually fed from a video, which consists of a sequence of frames with surveillance targets and backgrounds. Utilizing temporal features over the sequence would speed up license plate recognition by avoiding processing unnecessary areas. This necessarily involves the use of motion models which describe and label the expected structure of the input sequence.

Motion segmentation has four conventional approaches: (1) background subtraction, (2) temporal difference, (3) optical flow, and (4) predictive models. Background subtraction is a simple and popular method for motion segmentation, especially in situations with a relatively static background. Haritaoglu et al. [3], for example, used background subtraction to segment potential foreground objects in the real-time surveillance system W4. Temporal difference uses pixel differences between two or more consecutive frames to extract moving regions and adapts to dynamic environments. Lipton et al. [4], for example, obtained the absolute difference between the current and the previous frame and used a threshold function to determine changes. Optical flow can be used to detect moving objects even in the presence of camera motion. Schunck [5], for example, used characteristics of the flow vectors of moving objects to detect moving regions in an image sequence. Comparisons of these three approaches are summarized in Hu et al. [6]. In predictive models, a set of features, such as points, edges, or contours, is tracked over a sequence of frames, and prediction mechanisms are used to minimize the potential candidates of these features. For example, Zayed et al. [7] applied Kalman filters, while Jia et al. [8] utilized mean-shift filters to track recognized vehicles.
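As an illustration of the temporal-difference approach described above (absolute difference between the current and the previous frame, followed by a threshold), the following is a minimal sketch on plain nested lists. The function name and the threshold value of 25 are assumptions made for illustration, not values from the cited work.

```python
def temporal_difference(prev, curr, threshold=25):
    """Binary motion mask: 1 where |curr - prev| exceeds the threshold.

    prev, curr: grayscale frames as equally sized 2-D lists of intensities.
    threshold:  hypothetical change threshold in gray levels.
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


prev_frame = [[10, 10, 10],
              [10, 10, 10]]
curr_frame = [[10, 200, 12],
              [10, 190, 10]]

# Only the middle column changed by more than the threshold.
print(temporal_difference(prev_frame, curr_frame))  # [[0, 1, 0], [0, 1, 0]]
```

The small change from 10 to 12 is suppressed by the threshold, which is what makes the method tolerant of sensor noise but also sensitive to global illumination changes.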

The motion segmentation approaches are easily degraded by changes in illumination, shadow, or colors between the surveillance targets and backgrounds. Dealing with these problems makes the approaches computationally expensive, since directly enhancing or modifying them increases the computational load.

1.3 Goal

As described above, some important problems remain to be resolved. This dissertation focuses on the following problems:

1. Considerations for the development of a real-time license plate recognition system

2. Machine learning for an adaptive license plate recognition system

3. Search-space reduction (SSR) for improving the computational speed in single images

4. Search-space reduction (SSR) for improving the computational speed in video sequences

Certainly, it is a great challenge to overcome these problems. However, it is well known that license plates have general characteristics ranging from low-level gradient features to high-level contextual meanings, as follows:

1. The color of a plate character is always different from that of the background.

2. Plate characters are arranged in a sequence known as a plate character line.

3. A plate is mainly composed of plate characters.

4. Plate characters usually satisfy the plate specifications; for instance, letters of the English alphabet may not be used in certain parts of the plate, or characters have specific width-to-height ratios, depending on the location of the vehicle.

Thus, our major goal is to find effective approaches to extract license plates in single images and video sequences.
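Two of the characteristics listed above, the arrangement of characters along a line and the width-to-height ratio of each character, can be checked with simple geometry. The sketch below is only an illustration under assumed values: the function name, the ratio range, and the alignment tolerance are hypothetical and are not the constraints used in this dissertation.

```python
def looks_like_character_line(boxes, ratio_range=(0.3, 0.8), max_center_dev=0.25):
    """Check candidate character boxes against two plate characteristics.

    boxes:          list of (x, y, w, h) bounding boxes with h > 0.
    ratio_range:    hypothetical allowed width-to-height ratio of a character.
    max_center_dev: hypothetical tolerance on vertical alignment, expressed
                    as a fraction of the mean character height.
    """
    if len(boxes) < 2:
        return False
    low, high = ratio_range
    # Characteristic 4: every character has an allowed width-to-height ratio.
    if any(not (low <= w / h <= high) for _, _, w, h in boxes):
        return False
    # Characteristic 2: characters lie on one line (aligned vertical centers).
    centers = [y + h / 2 for _, y, _, h in boxes]
    mean_center = sum(centers) / len(centers)
    mean_height = sum(h for _, _, _, h in boxes) / len(boxes)
    return all(abs(c - mean_center) <= max_center_dev * mean_height for c in centers)


aligned = [(0, 10, 10, 20), (12, 11, 10, 20), (24, 9, 10, 20)]
print(looks_like_character_line(aligned))  # True
```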


1.4 Summary of Achievements

To account for the lack of real-time consideration in the overall system, we propose a cascade framework based on rejection mechanisms to develop a real-time statistical plate recognition system that deals with various problems effectively. We summarize the main results of our work as follows:

1. A cascade framework for a real-time plate recognition system: We first propose two representations: spatial compact plate regions and temporal repeated regions. Compact plate regions, which bound the top and bottom of plate characters, can be extracted in the first stage to avoid additional removal procedures. Repeated regions have similar appearances in the same location of consecutive frames; they can be detected and excluded from repeated classification to save computation. Moreover, under the assumption that most input candidates are negative, a cascade framework based on rejection mechanisms is presented to reduce the spatial and temporal search space. The computational speed of the entire system improves because a majority of the negative candidates can be rejected by the initial rejecters using simpler decision rules with a low computational load.

2. Rejection mechanisms for single images: Cascaded rejection mechanisms have been developed to process single images rapidly at high accuracy rates. The mechanisms are designed to meet the requirements of performance, computational speed, and adaptation for vehicle surveillance applications such as stolen car detection systems. One-pass algorithms are proposed to extract candidates of compact plate regions and to segment plate characters precisely and compactly.

3. Rejection mechanisms for video sequences: In this study, we propose to exclude repeated patterns with similar appearances in the same location of consecutive frames, which usually include stopped vehicles or regular backgrounds. For efficiency, repeated patterns are detected only on the candidates of compact plate regions, an approach named spatiotemporal SSR, using a block-based mechanism that estimates the tangent distance, which is invariant to variations in position, size, rotation, or brightness. Moreover, a bi-level one-pass plate extraction (BOPE) algorithm is developed to extract plates accurately even in complicated situations.
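The cascade idea summarized above, in which cheap rejecters run first so that most negatives never reach the expensive stages, can be sketched as follows. The three rejecters and their thresholds are hypothetical placeholders chosen for illustration; they are not the rejecters developed in this dissertation.

```python
def cascade(candidates, rejecters):
    """Pass each candidate through the rejecters in order.

    A candidate is dropped at the first rejecter that does not accept it,
    so later (more expensive) rejecters see only the survivors.
    Returns the survivors and the total number of rejecter evaluations.
    """
    survivors = []
    evaluations = 0
    for cand in candidates:
        accepted = True
        for accept in rejecters:
            evaluations += 1
            if not accept(cand):
                accepted = False
                break
        if accepted:
            survivors.append(cand)
    return survivors, evaluations


# Hypothetical candidates described by three precomputed measurements.
candidates = [
    {"edge_density": 0.05, "aspect": 3.0, "score": 0.90},  # fails stage 1
    {"edge_density": 0.40, "aspect": 1.0, "score": 0.90},  # fails stage 2
    {"edge_density": 0.45, "aspect": 3.2, "score": 0.20},  # fails stage 3
    {"edge_density": 0.50, "aspect": 3.0, "score": 0.95},  # accepted
]

# Rejecters ordered from cheapest to most expensive (illustrative thresholds).
rejecters = [
    lambda c: c["edge_density"] > 0.2,
    lambda c: 2.0 <= c["aspect"] <= 5.0,
    lambda c: c["score"] > 0.5,
]

survivors, evaluations = cascade(candidates, rejecters)
print(len(survivors), evaluations)  # 1 9
```

Without early rejection, the four candidates would need 12 evaluations; here 9 suffice, and the saving grows with the fraction of negatives rejected by the first stages.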


1.5 Organization of this Dissertation

The remainder of this dissertation is organized as follows. Chapter 2 reviews related research. In Chapter 3, we describe approaches to detect license plates under two special situations: license plates with different appearances and frame inputs with high resolution. In Chapter 4, we introduce our proposed cascaded license plate framework in detail. Chapter 5 describes how to reject non-plates or non-characters in single images; there we develop a real-time statistical license plate recognition system using one-pass rejection mechanisms. Although variations in plates and environments exist, only candidates of license plates are extracted and output by our system. In Chapter 6, we present the spatiotemporal search-space reduction for video sequences, which effectively reduces the computational load. Chapter 7 offers conclusions and directions for future work.

Chapter 2 Related Research

In the last decade, many researchers [9–76] have focused heavily on plate recognition. In this chapter, we briefly review a number of works related to our research on automatic license plate extraction in single images and video sequences.

The first step in the recognition process is obtaining a frame of the vehicle, usually with a CCD camera. Most related publications analyze license plates from grayscale frames. Some use color frames in RGB [16,26,56,69–73,76], YCrCb [22,38], rgb [58], HSV [68], HLS [69], or HSI [58,66,67] spaces; few use IR images (cite...). Then, pre-processing algorithms, such as noise reduction and histogram equalization, are performed to enhance the images.

In general, a framework for a license plate recognition system consists of three modules: (1) license plate detection (LPD), (2) plate character segmentation (LPS), and (3) optical character recognition (OCR). In LPD, many methods, ranging from simple techniques to sophisticated mechanisms, have been developed to detect license plates based on features of plate characteristics or statistical representations. In LPS and OCR, the available methods generally use approaches similar to those used in license plate detection, except that features are extracted from characters. Moreover, some publications [10,33,54,60–65] normalize the skew of the license plates before segmenting plate characters. Instead of skew normalization, Naito et al. [9] recognized inclined plate characters in the OCR module. For better representation, a Markov random field [74,75] can be used to perform super-resolution of license plates [30] in videos.

The natural control parameters for extracting license plates are feature extraction and classification. An ideal feature extractor would represent plates and non-plates so distinctly that the classification process becomes trivial; conversely, a strong classification process would not require a sophisticated feature extractor. There are many possible feature types and associated classification measurements that emphasize different license plate properties, such as pixel intensities, color, texture, and edges. Currently, most researchers prefer a hybrid detection algorithm, in which multiple features are involved to make the algorithm more robust. The algorithms proposed in this study are also hybrid algorithms. Different feature types and classification methods are summarized in the following.

Table 2.1: List of research focus

LPD LPS OCR

[8,11,21,24,27,32,37,38,42,46,47,52,56,67,68,72,73,76–92] Yes

[10,16,19,23,62,63,93] Yes Yes

[9,12–15,17,18,25,26,31,33,34,40,43,51,53,58,64,66,69,70,94–111] Yes Yes Yes

[35,112,113] Yes

[39,41,45,54,114–117] Yes Yes

[28,29,50,61,65,118,119] Yes

2.1 List of Research Focus

Due to the diﬀerent goals being emphasized in each publish, the comparison of research focus is shown in Table 2.1. The value, Yes, in the ﬁled means that the publishes listed in the second column mentioned that they did contributions in the corresponding module.

2.2 Features Types

Gradient features:

Gradient features are usually the most important because they are insensitive to scale, rotation, size, or colors. After thresholding the gradient values, edges are analyzed in many publications [12,15,19–21,25,33,36,42,51,53,55,58,62–64,67,79,80,83,91,98,103,104,114,120]. Moreover, vertical edges [15,20,25,33,36,55,64,79,80,83,91,120] are the most popular gradient feature for representing license plate areas. By limiting the colors of license plates, Xu et al. extract color edges from an RGB image; Chang et al. [58] extract color edges from an rgb image; Yang et al. [68] extract color edges from an HSV image.

Statistical features:

Statistical features for LPD or OCR are usually composed of the covariance matrix [84], the density [8,16,25–27,36,44,48,76,79,81,82,85,87,91,120], density variance [85,87,91,120], Haar-like features [85,87–89,91,111,120], or Gabor features [23,117]. The region density may be measured from the edges [8,36,76,82] or the gradients [91,120] of the license plate region. Wu et al. [86] use the frequency of zero crossing on the map of edges.

Shape-based features:

Shape-based features are usually composed of the skeleton [117], the size [10,20,36,81,94,111], the width or the height [10,27,31,36,53,54,79,86], the aspect ratio [8,20,25,27,36,53,60,70,76,79,81,82,86,104,109,111], the rectangularity [8,76,82], or the orientation [25,91,94] of the license plate or plate characters. Although these features may not be scale-invariant or rotation-invariant, they are insensitive to many environmental changes. The plate orientation calculated using the least second moments is adopted in [109]. Moreover, the symmetry [22,38] of license plate regions is measured to eliminate false positives.


2.3 Extracting Methods

Morphological methods:

Morphological operators, such as local or global thresholding, Sobel [19,24,78,83,95], thinning [113], smoothing [10,88,89,93,106,110,121], opening or closing [10,36,62,67,79,80,88,89,107,110], or differencing [10,88,89], are widely used in license plate extraction or character segmentation because the operators do not require complex and heavy mathematical calculations.

Transformation methods:

Hough transformation is a method for detecting the borders of license plates [24,35,53,55,104]. Martin and Borges [31] use the bottom-hat transformation to enhance the characters. Hou et al. [110] measure the differences between the top-hat and bottom-hat transformations to extract the license plate characters. Wu et al. [86] use the bottom-hat transformation to enhance the texture in the input image. Hsieh et al. [81] and Guo et al. [122] perform wavelet decomposition in each block of the input image and generate four subbands (smoothing, horizontal, vertical, and diagonal).

Projection:

License plates or characters can be extracted by analyzing the horizontal and vertical projections [12,34,35,42,51,69,80,93,95,99,104,107]. Because the characters of license plates are usually arranged in horizontal lines, some publications [19,20,31,33,54,60,62,64,81,94,115,123] use only vertical projection histograms to segment plate characters.
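As a minimal illustration of projection-based segmentation (a generic sketch, not the procedure of any cited work), the following code counts the character pixels in each column of a binary plate image and cuts the characters at columns whose count falls below a threshold. The function name `segment_by_vertical_projection` and the parameter `min_count` are assumptions introduced for this example.

```python
def segment_by_vertical_projection(binary, min_count=1):
    """Return (start, end) column ranges whose projection is >= min_count.

    `binary` is a list of rows; 1 marks a character pixel, 0 the background.
    `end` is exclusive.
    """
    if not binary:
        return []
    width = len(binary[0])
    # Vertical projection: number of character pixels in each column.
    projection = [sum(row[x] for row in binary) for x in range(width)]

    segments = []
    start = None
    for x, count in enumerate(projection):
        if count >= min_count and start is None:
            start = x                      # a character run begins here
        elif count < min_count and start is not None:
            segments.append((start, x))    # the run ended before column x
            start = None
    if start is not None:                  # a run that reaches the right border
        segments.append((start, width))
    return segments
```

Each returned range can then be cropped and passed to a character classifier.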


Mean Shift:

The mean shift algorithm [124] is a nonparametric clustering technique that requires no prior knowledge of the number of clusters and does not constrain the shape of the clusters. Some publications [8,76,82] use the mean shift filter to extract candidates of license plates. Kim et al. [71] adopted the continuously adaptive mean shift algorithm (CAMShift) [xxx] to extract regions of license plates from the result of plate color measurement.

Sliding window:

Each sub-window [63,78,85,91,104,106,109,111,122] in the input image is extracted and identified by heuristic rules or classification methods. Anagnostopoulos et al. [109] proposed sliding concentric windows (SCWs) to detect regions of license plates.

Vector quantization(VQ):

Based on vector quantization, Zunino and Rovetta [77] encode each row of the input image for locating license plates. [23]

Hidden Markov Chain(HMC):

Franc and Hlavac [112] use the Hidden Markov Chain (HMC) model to describe a relation between the image and its corresponding segmentation.


2.4 Classification Methods

Template matching:

By thresholding the minimum distance calculated against a predefined database, template matching algorithms [8,9,13,15,18,20,49,60,64,69,76,82,94,96,99,107,121,123] verify an input pattern as a plate or a character.

K nearest neighbors (KNN):

Cano and Perez-Cortes [11,78] classify every pixel of the input image based on the KNN method. Guo et al. use k-means clustering to identify license plate blocks.

Cascade classifier:

For object detection, Viola and Jones [125] propose a cascade classifier trained by AdaBoost [126]. A cascade classifier can be regarded as a degenerate decision tree. Some publications [85,87–89,91,111,120] adopt the cascade classifier to identify license plates.

Neural network(NN):

Various neural network architectures [12,16,28,39,41,44,48,55,56,58,70,84,90,95–97,102,105,108,109,122] are proposed and implemented for plate identification or character recognition. An artificial neural network (ANN) trained by a backpropagation algorithm is used for plate character recognition in [58,97,105,106,121]. Guo et al. [122], Yuan et al. [90], and Chacon and Zimmerman [21] classify license plates based on the pulse coupled neuron model (PCNN) [127], a kind of artificial neural network model. Anagnostopoulos et al. [109] trained a two-layer probabilistic neural network (PNN) to identify plate characters. Hu et al. [115] use a PNN to identify low-dimension test samples segmented from actual license plate images.

Hidden Markov Model(HMM):

For character recognition, Llorens et al. [103] and Duan et al. [104] use an HMM whose observations are the ratios of foreground pixels in a window.

Support vector machine(SVM):

A support vector machine (SVM) with radial basis functions (RBFs) is usually used for plate classification [122] or plate character recognition [111,116]. The pixel values of an input region are scaled directly, and the size of the region is normalized. In contrast, Kim et al. [71] adopted an SVM with a polynomial kernel as the color texture classifier.

Genetic algorithms(GA):

Yoshimori et al. [22,37,38] and Yohimori et al. [49] change the threshold values based on genetic algorithms to extract license plates. Karungaru et al. [121] use a genetic algorithm to select three parameters of the input characters: position, size, and orientation. Xiong et al. [46] apply a GA to search for the possible license plate area in the whole image.


Chapter 3 Detection of License Plates under Two Special Situations

3.1 License Plates with Different Appearances

This section proposes an approach to the automatic detection of license plates with different appearances. The car images are taken outdoors from various positions. Because the angle from the camera to the car varies, the license plates appear at various locations and rotation angles in an image. In the license plate detection phase, since the colors of the characters and of the license plate background are generally different, the magnitude of the gradients is used to detect candidate license plate regions. License plates are usually located on the bumper, and car images contain several horizontal lines. If we used the horizontal gradients, it would be difficult to separate the license plate regions from the bumper. Thus, the magnitude of the vertical gradients is used to detect the candidate license plate regions. These candidate regions are then evaluated based on three geometrical features: the ratio of width to height, the size, and the orientation. The last feature is defined by the major axis. Variously rotated images of a specific character can be normalized to the same orientation based on the major axis of the character image. Two different appearances of license plates are shown in Fig. 4. Experimental results show that the license plate detection method can correctly extract all license plates from 102 car images taken outdoors.


The remaining parts of this section are organized as follows. Section 3.1.1 studies the motivation and modules of the orientation normalization and inverse rotation transformation. Section 3.1.2 presents the procedure of license plate detection. Section 3.1.3 shows experimental results, and Section 3.1.4 includes some concluding remarks.

3.1.1 Orientation Normalization

This section aims to detect license plates at various locations in the car image and to recognize the characters in the license plates regardless of their rotation. To detect and recognize license plates, it is useful to derive the major axis, which shows the orientation of the image. In the license plate detection phase, the major axis is measured in each possible license plate region to evaluate how likely the region is to be a license plate. Figure 5 shows the major axis on each image of the character R at five different rotation angles, where the dashed lines represent the major axis of the character image.

Figure 5: The images of character R in different orientations. The dashed lines represent the major axis.

When the rotation angle of a specific character image is between 90° and 270°, such as in Figs. 5(d)-(f), the normalized character image is inverted. However, this situation can be disregarded because license plate images would not have rotation angles between 90° and 270°.

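The rotation transformation is defined below [128]:

$$\begin{pmatrix} x' \\ y' \end{pmatrix} = R \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{3.1}$$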

Applying the rotation transformation directly results in some holes in the rotated image. In Fig. 6(b), the destination pixel 2 is mapped from two source pixels, while the destination pixel 1 is mapped from none of the source pixels. These undefined destination pixels produce holes in the image. To solve the problem, for each pixel of the rotated image, the corresponding source pixel is checked to see whether it is a black pixel of the character image. If it is, the pixel of the rotated image is marked as a character pixel; otherwise, it is marked as a non-character pixel.

Figure 6: The rotation result of an image block. (a) The original image; (b) The rotated image with a -30-degree transformation.
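The hole-free rotation described above can be sketched as an inverse mapping: every destination pixel is rotated back by the inverse of Eq. (3.1) and takes the value of the source pixel it lands on. This is only an illustrative sketch; the function name `rotate_binary`, rotation about the image center, nearest-neighbor rounding, and an output of the same size as the input are assumptions that the text does not specify.

```python
import math


def rotate_binary(image, angle_deg):
    """Rotate a binary image (list of rows of 0/1) about its center.

    Inverse mapping is used, so every destination pixel is defined and
    no holes appear in the result.
    """
    height, width = len(image), len(image[0])
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)

    rotated = [[0] * width for _ in range(height)]
    for y_dst in range(height):
        for x_dst in range(width):
            dx, dy = x_dst - cx, y_dst - cy
            # Inverse of Eq. (3.1): multiply by the transpose of R.
            x_src = int(round(cos_t * dx + sin_t * dy + cx))
            y_src = int(round(-sin_t * dx + cos_t * dy + cy))
            # Pixels that map outside the source stay non-character (0).
            if 0 <= x_src < width and 0 <= y_src < height:
                rotated[y_dst][x_dst] = image[y_src][x_src]
    return rotated
```

Because the loop runs over destination pixels rather than source pixels, a destination pixel can never be left undefined, which is exactly the property the forward mapping lacks.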

The processes of orientation detection will be discussed in the following.

A. Orientation Detection

In the binary image, we first define the mass as the black pixels, whose gray level is 1. The moment of mass of the binary image describes the distribution of the mass throughout the binary image. Horn [129] mentioned that the first moment of mass, which is defined as mass times distance, can be used to derive the center location of the mass, and that the second moment of mass can be used to measure the distribution of the mass relative to axes through the center of the mass. The orientation of the mass is derived from the least second moment of the mass. Then, the major axis of the mass can be obtained from the orientation and the center. The steps to derive the orientation from the binary image are described in the following.

The first moment of mass in the binary image is defined as

$$C = (x_c, y_c) = \left( \iint x\, g(x, y)\, dx\, dy,\; \iint y\, g(x, y)\, dx\, dy \right) \tag{3.2}$$

where $g(x, y)$ is the gray level of the point $(x, y)$ in the binary image, which is 1 for a black point.

The second moment of mass in the binary image is equal to the mass times the square of the distance from each black point in the binary image to a line, as shown below:

$$S = \iint r^2 g(x, y)\, dx\, dy \tag{3.3}$$

where $r$ is the perpendicular distance from the black point $(x, y)$ to a line $L$. In Fig. 7, two parameters are defined for a particular line in the binary image: the distance $t$ from the origin to the closest point on the line, and the angle $\theta$ between the x-axis and the line, measured counterclockwise. The equation of the line is

$$x \sin\theta - y \cos\theta + t = 0 \tag{3.4}$$

Note that the line intersects the x-axis at $-t/\sin\theta$ and the y-axis at $+t/\cos\theta$. The closest point on the line to the origin is located at $(-t \sin\theta, +t \cos\theta)$.

Figure 7: The coordinate diagram.

Suppose that the point $(x_0, y_0)$ is located on the line $L$. Its coordinates can be written as

$$x_0 = -t \sin\theta + s \cos\theta \quad \text{and} \quad y_0 = +t \cos\theta + s \sin\theta, \tag{3.5}$$

where $s$ is the signed distance along the line from the point closest to the origin.

Given an arbitrary black point $(x, y)$ in the binary image, the squared distance between $(x, y)$ and the point $(x_0, y_0)$ on the line $L$ is

$$r^2 = (x - x_0)^2 + (y - y_0)^2 = t^2 + 2t\,(x \sin\theta - y \cos\theta) - 2s\,(x \cos\theta + y \sin\theta) + s^2 + (x^2 + y^2). \tag{3.6}$$

Differentiating with respect to $s$ and setting the result to zero, we obtain the point on the line closest to $(x, y)$:

$$s = x \cos\theta + y \sin\theta. \tag{3.7}$$

Substituting Eq. (3.7) back into Eq. (3.6) gives $r^2 = (x \sin\theta - y \cos\theta + t)^2$, so that the (signed) perpendicular distance is

$$r = x \sin\theta - y \cos\theta + t. \tag{3.8}$$

Thus, the second moment of mass can be derived as

$$S = \iint (x \sin\theta - y \cos\theta + t)^2\, g(x, y)\, dx\, dy. \tag{3.9}$$

Because the axis of the least second moment of mass crosses the center of the binary image, we can substitute Eq. (3.2) into Eq. (3.9) and differentiate with respect to $t$ to obtain

$$x_c \sin\theta - y_c \cos\theta + t = 0, \tag{3.10}$$

where $(x_c, y_c)$ is the center of the binary image. Without loss of generality, we can change the coordinates to $x' = x - x_c$ and $y' = y - y_c$, so that Eq. (3.10) can be rewritten as follows:

$$x \sin\theta - y \cos\theta + t = x' \sin\theta - y' \cos\theta. \tag{3.11}$$

Substituting Eq. (3.11) back into Eq. (3.9), we obtain

$$S = \iint \left( x'^2 \sin^2\theta - 2 x' y' \sin\theta \cos\theta + y'^2 \cos^2\theta \right) g(x, y)\, dx'\, dy'. \tag{3.12}$$

Differentiating $S$ with respect to $\theta$ and setting the result to zero, we obtain

$$\tan 2\theta = \frac{\iint 2 x' y'\, g(x, y)\, dx'\, dy'}{\iint x'^2\, g(x, y)\, dx'\, dy' - \iint y'^2\, g(x, y)\, dx'\, dy'}. \tag{3.13}$$

Finally, we obtain the orientation $\theta$:

$$\theta = \frac{1}{2} \arctan\left( \frac{\iint 2 x' y'\, g(x, y)\, dx'\, dy'}{\iint x'^2\, g(x, y)\, dx'\, dy' - \iint y'^2\, g(x, y)\, dx'\, dy'} \right). \tag{3.14}$$

Together with the center $(x_c, y_c)$, the orientation determines the major axis, because the major axis crosses the center.
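For a discrete binary image, the integrals in the derivation above become sums over the black pixels. The sketch below computes the center and the orientation in that way. It is a minimal illustration under stated assumptions: the function name `orientation` is hypothetical, the center is normalized by the number of black pixels (the usual centroid definition, which Eq. (3.2) leaves implicit), and `math.atan2` replaces the plain arctangent so that a zero denominator is handled and the axis of least second moment is selected. Because image rows grow downward, the returned angle is measured from the x-axis toward increasing y.

```python
import math


def orientation(binary):
    """Return (xc, yc, theta_deg) for the 1-pixels of a binary image.

    (xc, yc) is the center of mass and theta_deg the angle of the major
    axis (axis of least second moment) in degrees.
    """
    points = [(x, y) for y, row in enumerate(binary)
              for x, value in enumerate(row) if value == 1]
    if not points:
        raise ValueError("image contains no black (mass) pixels")
    n = len(points)
    xc = sum(x for x, _ in points) / n
    yc = sum(y for _, y in points) / n

    # Central second moments: the three sums appearing in Eq. (3.13).
    a = sum((x - xc) ** 2 for x, _ in points)
    b = 2.0 * sum((x - xc) * (y - yc) for x, y in points)
    c = sum((y - yc) ** 2 for _, y in points)

    # tan(2*theta) = b / (a - c); atan2 keeps the correct quadrant.
    theta = 0.5 * math.atan2(b, a - c)
    return xc, yc, math.degrees(theta)
```

For example, a horizontal bar yields an angle of 0 degrees and a bar along the main diagonal of the pixel grid yields 45 degrees.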

3.1.2 License Plates Detection

The first step of the license plate detection system is to detect the license plate regions in the input car images. Because the colors of the license plate background and of the car body may be similar, it is difficult to detect the boundary of the license plate in car images taken outdoors. Because the color of the characters is different from that of the license plate background, the gradients of the original image are adopted to detect candidate license plate regions. Figure 8 shows the processing flow of license plate detection.

Figure 8: The system diagram.

This section presents the details of license plate detection. The procedure consists of four major steps: (1) detection of possible license plate regions, (2) possibility measurement, (3) merging of broken regions, and (4) inverse rotation transformation. The last step, inverse rotation transformation, has already been described in the previous section. The details of the remaining steps are explained as follows.

A. Detection of Possible License Plate Regions

In the first step of the license plate detection phase, the possible license plate regions are detected from the vertical gradients of the input car images. The vertical gradients are derived by applying a mask to each pixel and its neighboring pixels. In the vertical gradient image, the license plate region is the area with large local variance. The local variances of the vertical gradient image are measured with a local window mask. In this section, in order to cover the characters in the license plate of the input car images, the size of the local window mask is set to 11 × 7. A smaller window makes it more likely that a license plate region is split into separate parts, whereas a larger window tends to over-detect license plate regions. Figure 9 shows the possible license plate regions with three different window sizes: 7 × 3, 11 × 7, and 15 × 11.

Figure 9: (a) The license plate image; (b)-(d) The possible license plate regions with window mask sizes of 7 × 3, 11 × 7, and 15 × 11.

Thresholding the local variance image yields the image of possible license plate regions, in which the pixels of possible license plate regions are set to 1. Figure 10(a) shows the image of a car with a license plate, where the colors of the license plate background and of the car body are similar. Figure 10(b) displays the vertical gradient image of Fig. 10(a). Figure 10(c) and Fig. 10(d) show the local variance image of Fig. 10(b) and the possible license plate regions, respectively.

There may be some noise, such as holes and single dots, in the images of possible license plates. A morphological opening operation, in which a dilation operation is performed after an erosion operation, is applied to reduce the undesired effect of noise and to separate regions that are slightly connected.
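The detection step described above (vertical gradient, local variance in an 11 × 7 window, thresholding, and opening) can be sketched as follows. The text does not give the gradient mask, the threshold, or the structuring element, so the mask `[-1, 0, 1]`, the caller-supplied `threshold`, the 3 × 3 neighborhood, the clipping of windows at the image border, and all function names are assumptions made for illustration only.

```python
def vertical_gradient(gray):
    """Absolute horizontal difference; large where vertical edges occur."""
    h, w = len(gray), len(gray[0])
    grad = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(1, w - 1):
            # Assumed mask [-1, 0, 1] applied along the row.
            grad[y][x] = abs(gray[y][x + 1] - gray[y][x - 1])
    return grad


def local_variance(img, win_w=11, win_h=7):
    """Variance inside a win_w x win_h window centered on each pixel."""
    h, w = len(img), len(img[0])
    rx, ry = win_w // 2, win_h // 2
    var = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # The window is clipped at the image border.
            values = [img[j][i]
                      for j in range(max(0, y - ry), min(h, y + ry + 1))
                      for i in range(max(0, x - rx), min(w, x + rx + 1))]
            mean = sum(values) / len(values)
            var[y][x] = sum((v - mean) ** 2 for v in values) / len(values)
    return var


def _morph(mask, op):
    """Apply min (erosion) or max (dilation) over a clipped 3x3 neighborhood."""
    h, w = len(mask), len(mask[0])
    return [[op(mask[j][i]
                for j in range(max(0, y - 1), min(h, y + 2))
                for i in range(max(0, x - 1), min(w, x + 2)))
             for x in range(w)] for y in range(h)]


def opening(mask):
    """Erosion followed by dilation."""
    return _morph(_morph(mask, min), max)


def candidate_mask(gray, threshold, win_w=11, win_h=7):
    """Binary mask of possible plate regions after thresholding and opening."""
    var = local_variance(vertical_gradient(gray), win_w, win_h)
    mask = [[1 if v > threshold else 0 for v in row] for row in var]
    return opening(mask)
```

The pure-Python loops keep the sketch self-contained; a practical implementation would use an array library and integral images to compute the local variance in constant time per pixel.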


Figure 10: (a) The car image with a license plate; (b) The vertical gradients of Fig. 10(a); (c) The local variance of Fig. 10(b); (d) The possible license plate regions.

B. Possibility Measurement

To detect the most likely license plate regions among the candidate plate regions, the geometrical properties of the license plate are introduced to measure the possibility value. The geometrical features are defined as follows:

• Area: If the candidate region is large, it is more likely to be a license plate. A higher possibility value represents a more likely license plate region. The possibility of the area is defined as $N_s / \sum N_s$, where $N_s$ is the number of pixels in the bounding rectangle of the possible license plate region $s$.

• Orientation: As described before, the orientation of each possible license plate region can be measured. A license plate usually appears as a horizontal rectangle. The smaller the orientation of the possible license plate region is, the higher the possibility value is. The possibility of the orientation is given by $(90 - \theta_s)/90$, where $\theta_s$ is the orientation of the possible license plate region $s$.

• Density: The ratio between the black regions and the area of the bounding rectangle is defined as the density of the license plate region. The license plate is always a rectangle. A higher density value means that the region is more likely to be a rectangle and to be viewed as a license plate region. The possibility of the density is defined as $B_s / N_s$, where $B_s$ is the number of black pixels in the possible license plate region $s$.

For each possible license plate region $s$, the possibility value $p(s)$ is defined as the weighted sum of the above three features:

$$p(s) = \omega_1 \frac{N_s}{\sum N_s} + \omega_2 \frac{90 - \theta_s}{90} + \omega_3 \frac{B_s}{N_s} \tag{3.15}$$

where $\omega_i$ are the weighting coefficients. We need to select proper $\omega_i$ that keep a high detection rate. These values are determined according to experimental results. In this study, $\omega_1 = 0.2$, $\omega_2 = 0.3$, and $\omega_3 = 0.5$ are adopted.
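As a worked example of Eq. (3.15), the sketch below evaluates the possibility value for two made-up candidate regions with the weights 0.2, 0.3, and 0.5 adopted in this study. The function name `possibility`, the dictionary keys, and the sample numbers are hypothetical; taking the absolute value of the orientation is an added assumption so that negative angles are handled.

```python
def possibility(region, total_area, weights=(0.2, 0.3, 0.5)):
    """Possibility value p(s) of Eq. (3.15) for one candidate region.

    `region` uses hypothetical keys:
      'area'  : N_s, pixels in the bounding rectangle
      'theta' : orientation in degrees, with |theta| <= 90
      'black' : B_s, black pixels inside the region
    `total_area` is the sum of N_s over all candidate regions.
    """
    w1, w2, w3 = weights
    area_term = region['area'] / total_area
    orientation_term = (90.0 - abs(region['theta'])) / 90.0
    density_term = region['black'] / region['area']
    return w1 * area_term + w2 * orientation_term + w3 * density_term


# Two illustrative candidates (made-up numbers).
regions = [
    {'area': 600, 'theta': 5.0, 'black': 480},   # large, nearly horizontal, dense
    {'area': 400, 'theta': 40.0, 'black': 200},  # smaller, tilted, sparse
]
total = sum(r['area'] for r in regions)
scores = [possibility(r, total) for r in regions]
best = regions[scores.index(max(scores))]
```

With these numbers the first region scores 0.2 × 0.6 + 0.3 × 85/90 + 0.5 × 0.8 ≈ 0.803 and the second 0.2 × 0.4 + 0.3 × 50/90 + 0.5 × 0.5 ≈ 0.497, so the first region would be kept as the more likely plate.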

C. Merging of Broken Regions

After all candidate license plate regions are detected, a license plate may be separated into several adjacent regions. In Fig. 11(a), since the distance between the characters F and 4 in the license plate is larger than the window mask defined above, two separate candidate license plate regions are generated. These separated regions have to be merged to extract the accurate license plate region.

Figure 11: (a) The car image with a license plate; (b) The separated possible license plate regions.

Assume that $s_1$ and $s_2$ are two possible license plate regions and $s$ is the merged region of $s_1$ and $s_2$. Regions $s_1$ and $s_2$ are merged when the following two rules are satisfied.

• The possibility value of the merged region $s$ is larger than those of both $s_1$ and $s_2$.

The merging operation is repeatedly performed until no regions could be merged. Then, the region with the largest possibility value is viewed as the license plate region.

3.1.3 Experimental Results

The system proposed in this chapter has been applied to 102 images with 104 license plates, involving vehicles at different pan/tilt angles. We implemented the proposed system in C++ on a Pentium II 300 MHz PC running Windows and used a Nikon 5700 digital camera as the input device. For the license plate detection method, Fig. 12 shows the original car images, the possible license plate regions, and the resulting images of license plate detection. In total, 108 license plate images were extracted from the test images.


Figure 12: (a) The car images with license plates; (b) The possible license plate regions; (c) The license plate detection results.


3.1.4 Conclusion

In this chapter, we have proposed an automatic detection system for license plates with different appearances. With conventional license plate detection methods, it is difficult to locate a license plate captured at large pan and tilt angles. The proposed methods use major axis information, which is insensitive to rotation variance, to detect the license plate. The major axis is determined by the orientation, which is derived from the second moment of the mass, and by the center, which is derived from the first moment of the mass in the binary image. As a result, the input images can be taken outdoors at large pan and tilt angles relative to the car. Experiments carried out on samples of outdoor car images show the feasibility of using the proposed methods to detect license plates. The proposed system can be applied in general security systems and traffic violation prevention systems.


3.2 License Plates in High Resolution Frames

To reduce the search space in high-resolution inputs, this study presents two fast rejection mechanisms to extract license plates of motorcycles and vehicles on highways. First, a block-based rejection mechanism is proposed to eliminate non-vehicle regions by utilizing temporal information over the input sequence. In this mechanism, three types of blocks (low-contrast, stationary, and false blocks) are removed to reduce the search space. Second, a spatial projection-based rejection mechanism based on orthogonal gradient projections is proposed to extract candidates of license plates. To reduce the candidates, horizontal lines are rejected in the horizontal projections of the vertical gradients, and vertical lines are rejected in the vertical projections of the horizontal gradients. The remaining candidate plate regions are further segmented and verified by the character segmentation and statistical recognition modules. In our experiments, we tested 180 pairs of images, each image containing 2560 × 1920 pixels. Our rejection mechanisms successfully extracted 98% of the license plates and improved the computational speed of the system to 0.075 seconds per image on a personal computer with a Pentium 4 2 GHz CPU, because 88% of the pixels were excluded from processing by the plate extraction, plate segmentation, and character recognition procedures.

Although much work has been done on the recognition of license plates, most studies focused on processing images with only one vehicle, and few studies reported on motorcycles. Moreover, although many of these algorithms are robust to small-scale variations (such as local variations of colors or sizes), they are not intended for matching images with large differences in plate sizes or numbers, as shown in Fig. 13. In many environments, an input image may contain several motorcycles and vehicles. The situation becomes more difficult when the target plates are occluded by other objects, as shown in Fig. 14. In another difficult case, the gray levels of plates are similar to those of backgrounds, as shown in Fig. 15. This study aims to develop methods to extract license plates of motorcycles and vehicles under these situations rapidly and effectively.

Figure 13: License plates with different sizes in a frame.

Figure 14: License plates with partial occlusion in a frame.

Figure 15: A license plate with unobvious borders. The plate and background have similar gray values.

Although license plate extraction in a video stream has drawn much attention, stable segmentation results can be obtained only from computationally heavy approaches with some predefined criteria. The primary purpose of this study is to present two rejection mechanisms to extract license plates in high-resolution image inputs. By combining temporal rejection and spatial rejection mechanisms, we can rapidly extract license plates of motorcycles and vehicles on highways.

The remainder of this section is organized as follows. Section 3.2.1 presents the reject-based plate recognition system. Section 3.2.2 shows the mechanism of block-based rejection to reduce the search space in high-resolution inputs, and Section 3.2.3 describes the mechanism of projection-based rejection to extract license plates rapidly. In Section 3.2.4, some experimental results are presented. Section 3.2.5 concludes this section and gives suggestions for future work.

3.2.1 Overview of the Reject-based License Plate Extraction System

In designing a plate extraction system, it is efficient to reduce the search space as early as possible. In this chapter, we design two rejection mechanisms that can dramatically reduce the search space while ensuring the high performance of the system. The system architecture, mainly composed of temporal and spatial rejection mechanisms, is depicted in Fig. 16(a). To avoid processing unnecessary areas in high-resolution inputs, the block-based rejection, as shown in Fig. 16(b), utilizes temporal features in consecutive frames. Three types of blocks (low-contrast, stationary, and false blocks) are removed in this module. The three removal procedures are detailed in Section 3.2.2.

Figure 16: Schematic diagram of the rejection mechanisms for license plate extraction: (a) the system architecture, (b) the block-based rejection, and (c) the projection-based rejection.

Moreover, instead of detecting license plates directly, we propose the projection-based rejection to eliminate non-character lines. To limit the computational load, a reject-based method is proposed to rapidly remove uniform lines on the vertical and horizontal projections, named orthogonal projections, simultaneously. The projection-based rejection algorithm is shown in Fig. 16(c) and detailed in Section 3.2.3. By detecting only regions of license plate characters, the system can avoid additional procedures for removing the adornments before plate character segmentation. The last block in the system flowchart consists of three modules: plate classification, character segmentation, and character recognition, which are summarized in the following.

After we extract candidates of license plates, the characters in the regions may be skewed and recognized erroneously. Skew correction is important for recognizing the characters correctly. In this study, the skew angle of a license plate is determined from the bottom border of the plate, as shown in Fig. 17.

Figure 17: Skew correction: (a) source image and (b) edge image. The bottom border can be obtained from the original image.

After skew correction, the plate region can be classified by a support vector machine (SVM), a neural network, or cascade classifiers. Then, the classified plate region is divided into individual character regions after binarization. In our collected license plates, the character width is four times the width of the dash “-”. Since there are six characters in our samples, we can divide the plate region into 25 (6 × 4 + 1) units and then cut the characters at the computed locations. For example, a partial segmentation result is depicted in Fig. 18. For character recognition, a classification mechanism based on an SVM is adopted to recognize the segmented characters of license plates. Moreover, the characters in license plates follow regulation rules. In the current study, the following four rules are used.


1. The license plates consist of six characters.

2. The first two characters are capital letters from A to Z.

3. The third character is a letter or a number.

4. The last three characters are numbers from 0 to 9.

The recognition output for a character is a list of candidates sorted by the difference between the input and each candidate. If the first candidate in the list does not satisfy the rules, the next candidate is selected for testing.
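The rule check described above can be sketched as a filter over the sorted candidate lists: for each of the six positions, the first candidate that belongs to the character class allowed by the four rules is accepted. The names `ALLOWED` and `apply_rules` and the sample candidate lists are hypothetical and serve only to illustrate the idea.

```python
import string

LETTERS = set(string.ascii_uppercase)
DIGITS = set(string.digits)

# Allowed character classes for positions 0..5, following the four rules:
# two letters, one letter or digit, then three digits.
ALLOWED = [LETTERS, LETTERS, LETTERS | DIGITS, DIGITS, DIGITS, DIGITS]


def apply_rules(candidate_lists):
    """Pick, per position, the best-ranked candidate that satisfies the rules.

    `candidate_lists` holds one list per character, sorted from the most to
    the least similar candidate. Returns the plate string, or None when the
    plate does not have six characters or a position has no valid candidate.
    """
    if len(candidate_lists) != len(ALLOWED):
        return None                      # rule 1: exactly six characters
    plate = []
    for candidates, allowed in zip(candidate_lists, ALLOWED):
        choice = next((c for c in candidates if c in allowed), None)
        if choice is None:
            return None
        plate.append(choice)
    return ''.join(plate)


# The top-ranked '8', 'O', and 'Z' violate the rules and are skipped.
result = apply_rules([['8', 'B'], ['A'], ['5', 'S'], ['O', '0'], ['1'], ['Z', '2']])
```

In this example the lookalike pairs 8/B, O/0, and Z/2 are resolved by position, which is the practical benefit of applying the format rules after classification.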

Figure 18: Character segmentation: (a) a source image and (b) part of the segmentation result.
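The unit-based cutting of the plate region can be illustrated with a short calculation. With six characters of four units each plus one unit for the dash, the plate width is split into 25 units, and each character occupies four consecutive units. The text does not state where the dash sits, so the parameter `dash_after` (number of characters before the dash, default 2) is an assumption, as is the function name `character_cuts`.

```python
def character_cuts(plate_width, dash_after=2, n_chars=6, char_units=4):
    """Return (left, right) pixel columns of each character; right is exclusive.

    The plate is divided into n_chars * char_units + 1 equal units
    (6 * 4 + 1 = 25); the dash takes one unit after `dash_after` characters.
    """
    total_units = n_chars * char_units + 1
    unit = plate_width / total_units
    cuts = []
    position = 0                          # current offset, in units
    for index in range(n_chars):
        if index == dash_after:
            position += 1                 # skip the unit occupied by the dash
        left = round(position * unit)
        right = round((position + char_units) * unit)
        cuts.append((left, right))
        position += char_units
    return cuts
```

For a plate 250 pixels wide, one unit is 10 pixels, so the first two characters span columns 0 to 40 and 40 to 80, the dash occupies columns 80 to 90, and the remaining four characters start at column 90.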

3.2.2 Block-based Rejection in High Resolution Frames

Figure 19: A difference result of two consecutive images: (a) the reference frame $f_{t-1}$, (b) the current frame $f_t$, and (c) the difference frame $Diff_t$. If the corresponding pixels in the two images have different gray levels, the original gray value in the current frame is stored in the difference frame.
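The difference frame described in the caption of Fig. 19 can be sketched in a few lines. Pixels whose gray levels differ between the reference and the current frame keep the gray value of the current frame; storing 0 for unchanged pixels is an assumption of this sketch, since the caption does not say what is stored there, and the function name `difference_frame` is hypothetical.

```python
def difference_frame(reference, current):
    """Return the difference frame of two equally sized gray-level frames.

    A pixel keeps its gray value from `current` when it differs from the
    same pixel in `reference`; unchanged pixels are set to 0 (assumption).
    """
    return [[cur if cur != ref else 0
             for ref, cur in zip(ref_row, cur_row)]
            for ref_row, cur_row in zip(reference, current)]


reference = [[10, 20], [30, 40]]
current = [[10, 25], [0, 40]]
diff = difference_frame(reference, current)
```

Note that a changed pixel whose current gray value is itself 0 cannot be distinguished from an unchanged pixel under this convention; a separate change mask would avoid the ambiguity.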

The size of an input image in our system may be very large and the image could be composed of various variants of plates and backgrounds. We would spend much time to


