K‐means Algorithm for Database Creation

CHAPTER 3 THE PROPOSED METHOD

3.4 K‐means Algorithm for Database Creation

K‐means is a common clustering algorithm. More specifically, it groups objects into K groups based on their attributes.

Hence, for each print set of 10 shoeprint images, we use the K‐means algorithm to select 3 representative images for the database category.

The feature vectors CM, DM, GF, and LF extracted using the methods described above are taken as the attributes f = {CM, DM, GF, LF} and used in the K‐means algorithm to form three groups.

The algorithm performs the following steps until convergence:

1. Determine the centroid coordinate of each group.

In the first iteration, shoeprints are assigned randomly into 3 groups.

2. Determine the distance of each shoeprint to the centroid of each group.

The distance is calculated as the Euclidean distance

$$\mathrm{dist}(f, c) = \sqrt{\sum_{i=1}^{K} (f_i - c_i)^2}$$

where $f_i$ denotes the $i$-th attribute of the shoeprint image, and $c_i$ denotes the $i$-th attribute of the centroid.

3. Group the shoeprints based on the minimum distance.

Each shoeprint is assigned to the group whose centroid is at the minimum distance. If no assignment changes, the grouping procedure ends. Otherwise, steps 1 to 3 are repeated.

After applying the K‐means algorithm, we have three groups of images for each individual shoeprint image set. From each group, the image with the smallest distance to the centroid of the group is taken as the representative image. The three selected images are then taken as database images. The remaining seven images are taken as query images.
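The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: it assumes the four feature vectors of each image have been concatenated into one numeric row, and the function name `kmeans_representatives`, the random seed, the iteration cap, and the handling of empty groups are all hypothetical choices.

```python
import numpy as np


def kmeans_representatives(features, k=3, seed=0, max_iter=100):
    """Cluster the rows of `features` into k groups and pick one
    representative row index per group (the row nearest its centroid)."""
    features = np.asarray(features, dtype=float)
    rng = np.random.default_rng(seed)
    n = len(features)
    # First iteration: random assignment, keeping every group non-empty.
    labels = rng.permutation(np.arange(n) % k)

    def centroids_of(lab):
        return np.stack([features[lab == g].mean(axis=0) for g in range(k)])

    for _ in range(max_iter):
        # Step 1: centroid of each group.
        centroids = centroids_of(labels)
        # Step 2: Euclidean distance of every print to every centroid.
        dists = np.linalg.norm(
            features[:, None, :] - centroids[None, :, :], axis=2
        )
        # Step 3: assign each print to its nearest centroid.
        new_labels = dists.argmin(axis=1)
        # Stop when no assignment changes. Also stop if a group would
        # become empty (an assumption; the text does not cover this case).
        if np.array_equal(new_labels, labels) or len(np.unique(new_labels)) < k:
            break
        labels = new_labels

    # Representative of each group: the member closest to its centroid.
    centroids = centroids_of(labels)
    own_dist = np.linalg.norm(features - centroids[labels], axis=1)
    reps = []
    for g in range(k):
        members = np.flatnonzero(labels == g)
        reps.append(int(members[own_dist[members].argmin()]))
    return labels, sorted(reps)


# Stand-in data: 10 prints of one shoe, 8 made-up feature values each.
demo = np.random.default_rng(42).normal(size=(10, 8))
labels, database_idx = kmeans_representatives(demo)
query_idx = [i for i in range(len(demo)) if i not in database_idx]
print("database images:", database_idx, "query images:", query_idx)
```

For a set of 10 prints this yields 3 database indices and 7 query indices, matching the split described above. Because the initial assignment is random, different seeds may select different representatives.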

CHAPTER 4 EXPERIMENTAL RESULTS

Experiments are conducted to evaluate the performance of the proposed method. A total of 1050 shoeprint images collected from 105 distinct shoes are used to test our algorithm. Of these, 315 prints are taken as the database images according to the K‐means process described above. The remaining 735 shoeprint images become the query images. Fig. 13 shows the 105 distinct shoeprints. Every print in the database is examined in turn for comparison with the input shoeprint. The similarity measures calculated from each feature vector are then used to sort the shoeprint images in the database from the most similar print to the least similar one.
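The sorting step can be illustrated with a short sketch. It assumes that each image is summarized by one numeric feature vector and that SAD, the similarity measure named in the conclusion, stands for the sum of absolute differences; the function `rank_database` and the toy arrays are hypothetical, and how the four feature vectors are combined is not specified here.

```python
import numpy as np


def rank_database(query, database):
    """Return database row indices ordered from most to least similar
    to `query`, using the sum of absolute differences (smaller = closer)."""
    database = np.asarray(database, dtype=float)
    query = np.asarray(query, dtype=float)
    sad = np.abs(database - query).sum(axis=1)
    # A stable sort keeps the original order for tied scores.
    return np.argsort(sad, kind="stable")


# Toy example: three database vectors and one query vector.
toy_database = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
toy_query = np.array([1.0, 1.2])
print(rank_database(toy_query, toy_database))
```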

The method is designed to find similar shoeprints and to sort the corresponding database categories by their similarity to a reference image. The better the performance, the fewer nonmatching shoeprint categories are expected to appear before a matching category. In view of this, the “Average Match Score (AMS)” is used to evaluate the performance of the results.

The “Average Match Score” measures the average percentage of the database categories examined before a correct match is delivered. In our experiments, each shoe pattern has 3 corresponding shoeprint images in the database, all gathered from the same right shoe. Hence, the performance is determined by counting the number of nonmatching categories until the correct one is hit. The process then continues to find the second and the third categories that correctly match the reference shoeprint image, recording the search cost of each.
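One way to compute this score is sketched below. The exact normalization is an assumption: here the cost of the k-th correct match is the number of nonmatching database entries ranked before it, expressed as a percentage of the database size. The functions `match_costs` and `average_match_scores` and the toy ranking are hypothetical.

```python
def match_costs(ranked_labels, query_label, n_matches=3):
    """Percentage of database entries that are nonmatching and ranked
    before each of the first `n_matches` correct matches."""
    costs = []
    nonmatching = 0
    for label in ranked_labels:
        if label == query_label:
            costs.append(100.0 * nonmatching / len(ranked_labels))
            if len(costs) == n_matches:
                break
        else:
            nonmatching += 1
    return costs


def average_match_scores(all_costs):
    """Average the per-query costs separately for the 1st, 2nd, ... match."""
    n_queries = len(all_costs)
    n_matches = len(all_costs[0])
    return [sum(c[k] for c in all_costs) / n_queries for k in range(n_matches)]


# Toy ranking of a 10-entry database for a query from shoe "A".
ranking = ["B", "A", "A", "C", "A", "C", "B", "B", "C", "D"]
print(match_costs(ranking, "A"))
```

In the toy ranking, one nonmatching entry precedes the first and second matches and two precede the third, so the costs grow with the rank of each match, as in the tables of this chapter.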

Table 2 displays an example of a query. For the query shoeprint, each row shows the top 5 query results obtained with a different feature vector. In the first row, which uses the co‐occurrence matrix feature vector, the correct matches are delivered in the 3^rd, the 4^th, and the 5^th positions. When all features are taken into consideration, the correct matches are delivered in the 1^st, the 2^nd, and the 4^th positions.

TABLE 2

An Example of Query Results

[Image table not reproduced: for one query shoeprint, each row shows the top five query results (1^st to 5^th) for one feature vector: Co‐occurrence, Direction, Global Fourier, Local Fourier, and All.]

Each feature vector is first evaluated independently, and the feature vectors are then combined for further assessment. The best performance is obtained with the combination of all proposed feature vectors. Table 3 shows the AMS results of the method in [7], which uses the Fourier transform for matching. From the first column of Table 3, 3.51% of the shoeprint database images must be examined on average to get one correct match, while the average over all three matching shoeprints is 13.88% of the database images. Results of the proposed method for different feature vectors are shown in Table 4. From the table we can see that 1.29%, 4.71%, and 11.59% of the shoeprint database images must be examined on average before the 1^st, the 2^nd, and the 3^rd correct match, respectively, with the co‐occurrence matrix feature vector.

When all features are taken into consideration, 0.38%, 1.19%, and 2.75% of the database images are examined on average before retrieving the 1^st, the 2^nd, and the 3^rd correct shoeprint, respectively. With an average AMS of 1.44% against 13.88%, the proposed method is much more accurate than the method in [7].

| | First Data | Second Data | Third Data | Average |
|---|---|---|---|---|
| AMS (%) | 3.51 | 11.75 | 26.39 | 13.88 |

TABLE 3

Average Match Scores (%) for de Chazal and Flynn’s Method on the Proposed Shoeprint Images with an Image Resolution of 512×512

| Features | First Data | Second Data | Third Data | Average |
|---|---|---|---|---|
| Co‐occurrence | 1.29 | 4.71 | 11.59 | 5.86 |
| Direction | 1.61 | 5.68 | 13.71 | 7.00 |
| Global Fourier | 0.64 | 3.09 | 7.39 | 3.71 |
| Local Fourier | 0.92 | 3.03 | 7.67 | 3.87 |
| All Features | 0.38 | 1.19 | 2.75 | 1.44 |

TABLE 4

Average Match Scores (%) for the Proposed Method on 735 Shoeprint Images from 105 Individual Shoes, Database of 315 Shoeprint Images with Different Features
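As a quick arithmetic check, the Average column of Tables 3 and 4 can be recomputed as the mean of the three per-match scores. The snippet below only restates the reported values; the dictionary names are hypothetical.

```python
# Reported scores (%) for the 1st, 2nd, and 3rd correct match.
scores = {
    "Method in [7]": (3.51, 11.75, 26.39),
    "Co-occurrence": (1.29, 4.71, 11.59),
    "Direction": (1.61, 5.68, 13.71),
    "Global Fourier": (0.64, 3.09, 7.39),
    "Local Fourier": (0.92, 3.03, 7.67),
    "All Features": (0.38, 1.19, 2.75),
}

# Mean of the three per-match scores for each row.
averages = {name: sum(vals) / len(vals) for name, vals in scores.items()}

for name, avg in averages.items():
    print(f"{name}: {avg:.2f}")

# Ratio between the average AMS of the method in [7] and all features combined.
ratio = averages["Method in [7]"] / averages["All Features"]
print(f"ratio: {ratio:.1f}")
```

The recomputed means agree with the reported Average column to two decimal places.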

CHAPTER 5 CONCLUSION

The study proposed a novel method for automatically recognizing shoeprint images using the properties of directionality. First, a series of preprocessing steps is applied to eliminate distortions in the print, including rotation, translation, and noise. Four feature extraction methods based on directionality are then applied to the preprocessed image. Finally, a similarity measure using SAD is calculated with respect to the reference image.

Based on the algorithm, a system can be built to help forensic scientists identify the model of a shoe from a shoeprint image. Experiments designed to assess performance showed the accuracy and efficiency of the proposed method. It can help a human observer identify the shoeprint pattern corresponding to the reference image more quickly. However, the shoeprints used in the study were obtained under controlled conditions, whereas those acquired at crime scenes are of lower quality and severely distorted. Such shoeprints are mostly partial prints and are difficult to match. Improvements may be achieved by employing new de‐noising methods in preprocessing and by carefully designing the database images from the acquired shoeprints.

References

[1] A. Girod, “Computer Classification of the Shoeprint of Burglar Soles,” Forensic Science Int’l, vol. 82, pp. 59‐65, 1996.

[2] A. Girod, “Shoeprints ‐ Coherent Exploitation and Management,” European Meeting for Shoeprint/Toolmark Examiners, The Netherlands, 1997.

[3] W. J. Bodziak, “Footwear Impression Evidence: Detection, Recovery and Examination,” 2^nd ed. CRC Press, 2000.

[4] N. Sawyer, “’SHOE‐FIT’ A Computerized Shoe Print Database,” Proc. European Convention on Security and Detection, pp. 86‐89, May 1995.

[5] W. Ashley, “What Shoe Was That? The Use of Computerized Image Database to Assist in Identification,” Forensic Science Int’l, vol. 82, pp. 67‐79, 1996.

[6] A. Bouridane, A. Alexander, M. Nibouche, and D. Crookes, “Application of Fractals to the Detection and Classification of Shoeprints,” Proc. 2000 Int’l Conf. Image Processing, vol. 1, pp. 474‐477, 2000.

[7] P. de Chazal, J. Flynn, and R. B. Reilly, “Automated Processing of Shoeprint Images Based on the Fourier Transform for Use in Forensic Science,” IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 27, no. 3, pp. 341‐350, March 2005.

[8] K. Jack, “Video Demystified,” 5^th ed. Elsevier, 2007.

[9] R. Fisher, S. Perkins, A. Walker, and E. Wolfart. “Image Processing Learning Resources,” http://homepages.inf.ed.ac.uk/rbf/HIPR2/gsmooth.htm.

[10] R. C. Gonzalez and R. E. Woods, “Digital Image Processing,” 2^nd ed. Addison Wesley, 2002.

[11] R. M. Haralick, “Statistical and Structural Approach to Texture,” Proc. IEEE, vol. 67, no. 5, pp. 786‐804, 1979.

[12] A. K. Julesz, “Dialogues on Perception,” MIT Press, Cambridge MA, 1995.

[13] K. L. Lee and L. H. Chen, “A New Method for Coarse Classification of Textures and Class Weight Estimation for Texture Retrieval,” Pattern Recognition and Image Analysis, vol. 12, no. 4, pp. 400‐410, 2002.
