Synthetic Minority Over-sampling Technique (SMOTE)

Chapter 3. Proposed Method

3.2. Top-Down Learning

3.2.4. Synthetic Minority Over-sampling Technique (SMOTE)


A problem for our proposed method is imbalanced training data: there are about 3000 negative samples, but only about 40 samples for each static waymark.

The negative set is therefore about 75 times larger. In Figure 3-24(a), if the training data is balanced, the estimated decision boundary (solid black line) will approximate the true boundary (solid red line) even if there are a few wrong data points (black symbol '+'). In contrast, if the training data is imbalanced (Figure 3-24(b)), the estimated decision boundary (solid black line) may be very far from the true boundary (solid red line) even with only a few wrong data points (black symbol '+').

Figure 3-25 (a) Balanced data (b) Imbalanced data

Thus, if the training data is imbalanced, we have to synthesize new data for the minority set. Here we apply the SMOTE algorithm of [13]. Synthetic samples are generated in the following way: take the difference between the feature vector (sample) under consideration and its nearest neighbor, multiply this difference by a random number between 0 and 1, and add it to the feature vector under consideration.
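The generation step just described can be written compactly as a single interpolation. The symbols are introduced here only for illustration and are not the thesis's own notation: x_i is the minority sample under consideration, x_nn is its chosen nearest neighbor, and lambda is the random number.

```latex
x_{\mathrm{new}} = x_i + \lambda \,(x_{nn} - x_i), \qquad \lambda \sim \mathcal{U}(0, 1)
```

For example, with x_i = (1, 2), x_nn = (3, 6) and lambda = 0.25, the synthetic sample is x_new = (1.5, 3).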

This selects a random point along the line segment between the two feature vectors. This approach effectively forces the decision region of the minority class to become more general. The following is the pseudo-code for SMOTE from [13].

Algorithm SMOTE(T, N, k)

Input: Number of minority class samples T; amount of SMOTE N%; number of nearest neighbors k

Output: (N/100) * T synthetic minority class samples

(* If N is less than 100%, randomize the minority class samples, as only a random percent of them will be SMOTEd. *)
if N < 100 then
    Randomize the T minority class samples
    T = (N/100) * T
    N = 100
endif
N = (int)(N/100)   (* The amount of SMOTE is assumed to be in integral multiples of 100. *)
k = Number of nearest neighbors
numattrs = Number of attributes
Sample[ ][ ]: array for original minority class samples
newindex: keeps a count of number of synthetic samples generated, initialized to 0
Synthetic[ ][ ]: array for synthetic samples

(* Compute k nearest neighbors for each minority class sample only. *)
for i ← 1 to T
    Compute k nearest neighbors for i, and save the indices in the nnarray
    Populate(N, i, nnarray)
endfor

Populate(N, i, nnarray)   (* Function to generate the synthetic samples. *)
while N ≠ 0
    Choose a random number between 1 and k, call it nn. This step chooses one of the k nearest neighbors of i.
    for attr ← 1 to numattrs
        Compute: dif = Sample[nnarray[nn]][attr] - Sample[i][attr]
        Compute: gap = random number between 0 and 1
        Synthetic[newindex][attr] = Sample[i][attr] + gap * dif
    endfor
    newindex++
    N = N - 1
endwhile
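The pseudo-code can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the Matlab implementation used in the thesis: the names `smote`, `k_nearest` and `squared_distance` are hypothetical, Euclidean distance is assumed for the neighbor search, and a single random gap is drawn per synthetic sample so that the new point lies on the line segment between the two samples.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_nearest(samples, i, k):
    """Indices of the k samples closest to samples[i], excluding i itself."""
    others = [j for j in range(len(samples)) if j != i]
    others.sort(key=lambda j: squared_distance(samples[i], samples[j]))
    return others[:k]


def smote(samples, n_percent, k, rng=None):
    """Return (n_percent / 100) * len(samples) synthetic minority samples.

    n_percent is assumed to be a multiple of 100 when it is 100 or more.
    """
    rng = rng or random.Random()
    pool = [list(s) for s in samples]  # all minority samples (neighbor pool)
    n_seeds = len(pool)
    if n_percent < 100:
        # Only a random fraction of the minority samples is used as seeds.
        rng.shuffle(pool)
        n_seeds = int(n_percent / 100 * len(pool))
        n_percent = 100
    per_seed = int(n_percent / 100)  # synthetic samples per seed sample

    synthetic = []
    for i in range(n_seeds):
        nnarray = k_nearest(pool, i, k)
        for _ in range(per_seed):
            neighbor = pool[rng.choice(nnarray)]  # one of the k neighbors
            gap = rng.random()                    # random number in [0, 1)
            synthetic.append(
                [x + gap * (y - x) for x, y in zip(pool[i], neighbor)]
            )
    return synthetic
```

With a hypothetical list `features` of 40 feature vectors, `smote(features, 500, k=5)` would return 200 synthetic vectors, five per original sample. The value k = 5 is only an example; the text does not state the k that was used.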

Chapter 4. Experimental Results

In this chapter, we show and discuss our experimental results. For the computer simulation, the proposed algorithm is coded in Matlab without code optimization and is tested on a PC with an Intel® Core™2 Duo CPU running at 3 GHz. The first experimental stage is to capture video on the campus. We prepared three videos, captured at different places, covering four different static waymarks. Figures 4-1 and 4-2 show the three places and the selected waymarks.

Figure 4-1 (a)-(c) Overview of the three places

Figure 4-2 Left to right: blue signboard, fire hydrant, disabled signboard, and parking signboard

As listed in Table 4-1, for each static waymark we randomly selected forty image patches from the video, and the SMOTE algorithm was used to synthesize two hundred image patches as the positive training samples. As negative samples, three thousand image patches were extracted from the same video. The video was captured in the afternoon.

Table 4-1 Number of training images

Static waymark        Number selected    SMOTE
Blue signboard        40                 200
Fire hydrant          40                 200
Disabled signboard    40                 200
Parking signboard     40                 200
Negative images       3000               X
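The amount of SMOTE implied by Table 4-1 can be checked with a few lines of arithmetic. The sketch below assumes that the 200 patches in the SMOTE column are all synthetic (if they instead included the 40 originals, N would be 400%); the function names are illustrative only.

```python
def smote_amount_percent(n_original, n_synthetic):
    """Amount of SMOTE N (in percent) needed to create n_synthetic samples."""
    return 100 * n_synthetic / n_original


def imbalance_ratio(n_negative, n_positive):
    """How many negative samples there are per positive sample."""
    return n_negative / n_positive


n_percent = smote_amount_percent(40, 200)  # N for one static waymark
before = imbalance_ratio(3000, 40)         # ratio with the selected patches
after = imbalance_ratio(3000, 200)         # ratio with the SMOTE patches
print(n_percent, before, after)            # prints: 500.0 75.0 15.0
```

Under this assumption, SMOTE reduces the negative-to-positive ratio for each waymark from 75:1 to 15:1.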

In Figure 4-3, bounding boxes of four different colors mark the four recognized static waymarks.

Figure 4-3 Bounding boxes of different colors for different static waymarks

The results for the four static waymarks, with variations in scale and rotation, are shown in Figures 4-4 and 4-5.

Figure 4-4 (a)-(c) Different scales of static waymarks in the afternoon

Figure 4-5 (a)-(c) Static waymarks in the afternoon with rotation variations

We also apply our algorithm to videos captured at noon and in the evening. The results are shown in Figures 4-6 and 4-7.

Figure 4-6 (a)-(c) Static waymarks at noon with rotation variations

Figure 4-7 Static waymarks in the evening with rotation variations

Table 4-2 lists the detection rate and false alarm for each static waymark at noon, in the afternoon, and in the evening. The number of test images for each static waymark is about six hundred. Table 4-2 shows that varying illumination has an important effect on the detection rate. Moreover, if there are many objects similar to a static waymark, as with the disabled signboard or the parking signboard, its detection rate is lower than that of the others.

Table 4-2 Detection rate and false alarm at different times for each static waymark

Compared with combining the three methods (Table 4-2), applying each of the three methods individually (Table 4-3) gives a slightly lower detection rate in the afternoon and a much lower detection rate in the evening, because of varying illumination. This result shows that combining the three methods yields better bottom-up detection and a higher detection rate.

Table 4-4, in turn, lists the average computing time per image, covering both detection and recognition, for the different bottom-up detection methods. The test set contains 2520 images, each of size 360×240.

Table 4-3 Detection rate and false alarm for different bottom-up methods

Table 4-4 Computing time for different bottom-up detection methods

Bottom-up method       Time per image (seconds)
Combination            15.1
Feature-pair           16.9
Phase                  22.4
Directional filters    30.8

In Table 4-4, combining the three methods takes the least time because it makes the static waymarks more salient and suppresses false-alarm regions.
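To put the per-image figures of Table 4-4 in perspective, the short sketch below scales them to the 2520 test images. It assumes the reported times are simple per-image averages over the whole set; the function and variable names are illustrative only.

```python
# Average time per image in seconds, as reported in Table 4-4.
SECONDS_PER_IMAGE = {
    "Combination": 15.1,
    "Feature-pair": 16.9,
    "Phase": 22.4,
    "Directional filters": 30.8,
}
N_IMAGES = 2520  # number of test images stated in the text


def total_hours(seconds_per_image, n_images=N_IMAGES):
    """Total processing time in hours for the whole test set."""
    return seconds_per_image * n_images / 3600.0


def relative_cost(method, baseline="Combination"):
    """Per-image time of a method relative to the baseline method."""
    return SECONDS_PER_IMAGE[method] / SECONDS_PER_IMAGE[baseline]


for method, seconds in SECONDS_PER_IMAGE.items():
    print(f"{method}: {total_hours(seconds):.1f} h, "
          f"{relative_cost(method):.2f}x the combined method")
```

On these figures, using directional filters alone takes roughly twice as long per image as the combined method.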

We also evaluate our SVM classifier, which was trained on data from the three places above, at other places. Figure 4-8 gives an overview of three other places.

Figure 4-8 (a)-(c) Overview of the three other places

Table 4-5 shows the detection rate and false alarm at the different places in the afternoon. The number of test images is about 300 for each static waymark.

Table 4-5 Detection rate and false alarm for three new places

In Table 4-5, some static waymarks have a high detection rate and some have a low one. In our opinion, if there are similar objects in the image, the bottom-up detection performs worse and we obtain a lower detection rate. Conversely, if the background differs from the static waymarks, the bottom-up detection performs better and we obtain a higher detection rate.

Chapter 5. Conclusions

In this thesis, we propose a saliency-based system for detecting and recognizing static waymarks that performs well. Based on three physical properties, the saliency detection can quickly identify candidate regions in any newly captured video. Over the candidate regions, the HSV color space and directional filter banks are applied to analyze color and shape, respectively, for the recognition of the pre-selected waymarks. These features are invariant to changes in lighting, scale, and rotation. Our system also addresses the problems of different shapes and imbalanced training data. The results in Chapter 4 show a good detection rate under different conditions, such as at noon, in the afternoon, and in the evening. However, the detection rate is lower for the same static waymarks at different places.

References

[1] S. Maldonado-Bascón, S. Lafuente-Arroyo, P. Gil-Jiménez, H. Gómez-Moreno, and F. López-Ferreras, “Road-Sign Detection and Recognition Based on Support Vector Machines,” IEEE Transactions on Intelligent Transportation Systems, pp. 264–278, 2007.

[2] Y. Xie, L.-F. Liu, C.-H. Li, and Y.-Y. Qu, “Unifying visual saliency with HOG feature learning for traffic sign detection,” IEEE Symposium on Intelligent Vehicles, pp. 24–29, June 2009.

[3] A. de la Escalera and L. Moreno, “Road traffic sign detection and classification,” IEEE Trans. Indust. Electronics, 44:848–859, 1997.

[4] J. Miura, T. Kanda, and Y. Shirai, “An active vision system for real-time traffic signs recognition,” IEEE Symposium on Intelligent Vehicles, pp. 52–57, Oct. 2002.

[5] S. Lafuente-Arroyo, P. Gil-Jiménez, R. Maldonado-Bascón, F. López-Ferreras, and S. Maldonado-Bascón, “Traffic sign shape classification evaluation I: SVM using distance to borders,” in Proc. IEEE Intell. Veh. Symp., Las Vegas, NV, pp. 557–562, Jun. 2005.

[6] L. Itti, C. Koch, and E. Niebur, “A model of saliency-based visual attention for rapid scene analysis,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, pp. 1254–1259, 1998.

[7] X. Hou and L. Zhang, “Saliency detection: A spectral residual approach,” IEEE Conference on Computer Vision and Pattern Recognition, pp. 2280–2287, 2007.

[8] C. Guo, Q. Ma, and L. Zhang, “Spatio-temporal saliency detection using phase spectrum of quaternion Fourier transform,” IEEE Conference on Computer Vision and Pattern Recognition, pp. 2908–2915, 2008.

[9] W.-C. Huang, S.-J. Wang, and C.-H. Hsin, “Visual Saliency Detection Based on Feature-Pair Distributions,” in Proc. Computer Vision, Graphics, and Image Processing, Taiwan, 2009.

[10] H. Bay, T. Tuytelaars, and L. Van Gool, “SURF: Speeded Up Robust Features,” Proceedings of the Ninth European Conference on Computer Vision, May 2006.

[11] M. Varma and A. Zisserman, “A statistical approach to material classification using image patch exemplars,” PAMI, 31(11):2032–2047, November 2009.

[12] S. Belongie, J. Malik, and J. Puzicha, “Shape Matching and Object Recognition Using Shape Contexts,” IEEE Trans. Pattern Analysis and Machine Intelligence, 24(4):509–522, 2002.

[13] N. Chawla, K. Bowyer, L. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” Journal of Artificial Intelligence Research, 16:321–357, 2002.

