LIU–TRANSFORMATION - National Science Council, Executive Yuan: Research Project Final Report

HSIANG-CHUAN LIU¹, YA-CHING CHIU¹, CHIEN-HSIUNG LIAO¹, TUNG-SHENG LIU²

¹Department of Bioinformatics, Asia University, Taiwan

²Forensic Science Center of Military Police Command, Taiwan

E-MAIL:[email protected], [email protected], [email protected], [email protected]

Abstract:

The support vector machine (SVM) classifier is a popular and appealing classifier. When its performance is unsatisfactory, it can be improved by transforming the original data before classification. In our previous paper, two transformations, the NWFE-Transformation and the Liu-Transformation, were considered. The results showed that the SVM with our Liu-Transformation algorithm had the best performance.

In this paper, we consider a further improved SVM algorithm based not only on the Liu-Transformation but also on the well-known normalization. To evaluate the performance of the SVM without any transformation or normalization, the SVM with the NWFE-Transformation or the Liu-Transformation alone, and the SVM with one of these two transformations combined with the well-known normalization, an experiment on real data was conducted using 5-fold and leave-one-out cross-validation accuracy. The experimental results show that the SVM with the proposed Liu-Transformation algorithm and the well-known normalization algorithm has the best performance.

Keywords:

SVM; NWFE-Transformation; Liu-Transformation

1. Introduction

The support vector machine (SVM) classifier is a popular and appealing classifier [1], [2], [3], [4]. Because its performance is sometimes unsatisfactory, it can be improved by transforming the original data before classification. Two transformations can be considered: one is the NWFE-Transformation proposed by B. C. Kuo and D. A. Landgrebe in 2001 [5], [6]; the other is the Liu-Transformation proposed in our previous work in 2008 [7], [8]. The results of our previous paper [8] showed that the SVM with our Liu-Transformation algorithm had the best performance. In this paper, we consider a further improved SVM algorithm based not only on the Liu-Transformation but also on the well-known normalization.

To evaluate the performance of the SVM without any transformation or normalization, the SVM with the NWFE-Transformation or the Liu-Transformation alone, and the SVM with one of these two transformations combined with the well-known normalization, an experiment on real data was conducted using 5-fold and leave-one-out cross-validation accuracy. The experimental results show that the SVM with the proposed Liu-Transformation algorithm and the well-known normalization algorithm has the best performance.

This paper is organized as follows: the support vector machine classifier is introduced in Section 2, the NWFE-Transformation in Section 3, and the Liu-Transformation in Section 4. The normalization algorithm is described in Section 5. The experiment and results are described in Section 6, and the final section presents conclusions and future work.

2. Support vector machine (SVM) [1], [2], [3], [4]

Given the training set of instance-label pairs $(x_i, y_i)$, $i = 1, 2, \ldots, N$, where

$$x_i \in R^n, \quad y_i \in \{-1, 1\}, \quad i = 1, 2, \ldots, N \tag{1}$$

The support vector machine (SVM) algorithm (Boser, Guyon, and Vapnik 1992, Cortes and Vapnik 1995) requires

make an assignment according to the following formula:

3. NWFE-Transformation

The main ideas of the nonparametric weighted feature extraction transformation (NWFE-Transformation) (Kuo, B. C. and Landgrebe, 2002, 2004) are to put different weights on every sample to compute the “local means” and to define new nonparametric weighted between-class and within-class scatter matrices to get more features.

The nonparametric weighted between-class scatter matrix $S_b^{NW}$ and the nonparametric weighted within-class scatter matrix $S_w^{NW}$ of the NWFE-Transformation are defined as

$M_j(x_k^{(i)})$ is the nonparametric weighted local mean of $x_k^{(i)}$ in class $j$; $d(x, y)$ is the Euclidean distance from $x$ to $y$.
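As a rough illustration of a weighted local mean built on the Euclidean distance, the sketch below assumes inverse-distance weights normalized to sum to one, as in the published NWFE definition [6]. The weight formula, the function name `weighted_local_mean`, and the `eps` guard are assumptions introduced here, not taken from the text above.

```python
import math

def weighted_local_mean(x, class_samples, eps=1e-12):
    """Inverse-distance weighted mean of class_samples as seen from point x.

    Assumption: weights are proportional to 1 / d(x, s) and sum to one.
    eps is a hypothetical guard against a zero distance.
    """
    inv = [1.0 / max(math.dist(x, s), eps) for s in class_samples]
    total = sum(inv)
    weights = [v / total for v in inv]
    dim = len(x)
    return tuple(
        sum(w * s[k] for w, s in zip(weights, class_samples))
        for k in range(dim)
    )

# Toy example: the nearer sample (1, 0) receives the larger weight.
mean = weighted_local_mean((0.0, 0.0), [(1.0, 0.0), (3.0, 0.0)])
```

With these two samples the weights are 0.75 and 0.25, so the local mean is pulled toward the sample closer to $x$.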

The goal of the NWFE-Transformation is to find a linear transformation $A \in R^{d \times p}$, $p \le d$, which maximizes the between-class scatter and minimizes the within-class scatter.

The columns of $A$ are the optimal features obtained by optimizing the following criterion:

$$A = \arg\max_A \operatorname{tr}\left[ \left( A^T S_w^{NW} A \right)^{-1} A^T S_b^{NW} A \right] \tag{9}$$

This maximization is equivalent to finding the eigen-pairs $(\lambda_i, v_i)$, $i = 1, 2, \ldots, d$, $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_d$, for the generalized eigenvalue problem

$$S_b^{NW} v = \lambda S_w^{NW} v \tag{10}$$

4. Liu-Transformation [7]

The main ideas of the Liu-Transformation, proposed in our previous work (Hsiang-Chuan Liu, 2008) [7], are to put different weights on every sample to compute the “weighted means” by referring to the distances of the points from the “outmost points”, and to define new nonparametric weighted between-class and within-class scatter matrices to get more features.

Let $X_{p \times n}$ be the data set with $n$ sample points and $c$ distances of the sample points from the outmost point of class $j$ satisfying

The nonparametric weighted between-class scatter matrix $S_b^{L}$ and the nonparametric weighted within-class scatter matrix $S_w^{L}$ of the Liu-Transformation are defined as

The goal of the Liu-Transformation is to find a linear transformation $A \in R^{d \times p}$, $p \le d$, which maximizes the between-class scatter and minimizes the within-class scatter.
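A transformation of this kind can be computed from a generalized eigenvalue problem of the form $S_b v = \lambda S_w v$. The sketch below is only illustrative: it uses NumPy, small made-up scatter matrices, and a hypothetical helper `top_p_directions`, and it assumes that $S_w$ is invertible.

```python
import numpy as np

def top_p_directions(S_b, S_w, p):
    """Solve S_b v = lambda S_w v and keep the p largest eigen-pairs.

    Assumption: S_w is invertible, so the problem reduces to the
    ordinary eigenproblem of S_w^{-1} S_b.
    """
    evals, evecs = np.linalg.eig(np.linalg.solve(S_w, S_b))
    evals, evecs = evals.real, evecs.real
    order = np.argsort(evals)[::-1][:p]   # indices of the p largest eigenvalues
    return evals[order], evecs[:, order]  # columns of A are the kept eigenvectors

# Made-up 2 x 2 scatter matrices, for illustration only.
S_b = np.array([[4.0, 0.0], [0.0, 1.0]])
S_w = np.array([[1.0, 0.0], [0.0, 2.0]])

lams, A = top_p_directions(S_b, S_w, p=1)
x = np.array([3.0, 5.0])
y = A.T @ x  # transformed sample with p = 1 feature
```

Here $S_w^{-1} S_b$ is diagonal with entries 4 and 0.5, so the single retained direction is the first coordinate axis (up to sign).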

The columns of $A$ are the optimal features obtained by optimizing the following criterion:

$$A = \arg\max_A \operatorname{tr}\left[ \left( A^T S_w^{L} A \right)^{-1} A^T S_b^{L} A \right] \tag{18}$$

This maximization is equivalent to finding the eigen-pairs $(\lambda_i, v_i)$, $i = 1, 2, \ldots, d$, $\lambda_1 \ge \lambda_2 \ge \ldots \ge \lambda_d$, for the generalized eigenvalue problem

$$S_b^{L} v = \lambda S_w^{L} v \tag{19}$$

5. Normalization algorithm

Given the training set of instance-label pairs $(x_i, y_i)$, $i = 1, 2, \ldots, N$. Let $x_i = (x_{i,1}, x_{i,2}, x_{i,3}, \ldots, x_{i,n}) \in R^n$,

6. Experiment and result

A wine data set was downloaded from ftp://ftp.ics.uci.edu/pub/machine-learning-databases. The sample includes 178 instances, 3 classes of wine, and 13 features for each instance.

This real data set is used to evaluate the performance of six algorithms: the SVM without any transformation, the SVM with the NWFE-Transformation, the SVM with the Liu-Transformation, the SVM with normalization, the SVM with normalization and the NWFE-Transformation, and the SVM with normalization and the Liu-Transformation. For each algorithm, the accuracy of the predicted class variable is computed by 5-fold and leave-one-out cross-validation.
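The two cross-validation schemes differ only in how the 178 instances are partitioned. The following sketch shows one way to build the folds and pool the accuracy; the helper names `k_fold_indices` and `cv_accuracy`, the contiguous fold layout, and the pooled-accuracy definition are assumptions made for illustration, not the procedure used in the experiment.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds whose sizes differ by at most one."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for f in range(k):
        size = base + (1 if f < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cv_accuracy(n, k, predict_fold):
    """Pooled accuracy in percent.

    predict_fold(train_idx, test_idx) is a hypothetical callback that trains
    on train_idx and returns the number of correct predictions on test_idx.
    """
    correct = 0
    for test_idx in k_fold_indices(n, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(n) if i not in held_out]
        correct += predict_fold(train_idx, test_idx)
    return 100.0 * correct / n

fold_sizes = [len(f) for f in k_fold_indices(178, 5)]  # 5-fold CV
loo_folds = k_fold_indices(178, 178)                   # leave-one-out CV
perfect = cv_accuracy(178, 5, lambda train, test: len(test))
```

Leave-one-out is the special case k = N. Because the wine file lists its instances grouped by class, the indices would normally be shuffled or stratified before forming contiguous folds like these.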

Table 1 Accuracy of six classification algorithms

Classification algorithm   5-fold CV accuracy (%)   Leave-one-out CV accuracy (%)
SVM                        45.763                   46.633
SVM_NWFE                   93.023                   96.305
SVM_N                      97.740                   98.740
SVM_Liu                    99.080                   98.773
SVM_N_NWFE                 100                      100
SVM_N_Liu                  100                      100

The experimental results of the six classification algorithms are listed in Table 1. For both 5-fold CV and leave-one-out CV accuracy, the same pattern holds:

(i) The SVM algorithm with normalization and the Liu-Transformation and the SVM algorithm with normalization and the NWFE-Transformation had the same performance, better than the others.

(ii) The SVM algorithm with only a transformation or only normalization is better than the SVM algorithm without any transformation or normalization.
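The normalization step used in SVM_N, SVM_N_NWFE and SVM_N_Liu can be sketched as follows, assuming per-feature min-max scaling to [0, 1]. The choice of min-max scaling is an assumption made here (z-score standardization would be an equally common choice), and the function names and toy values are hypothetical.

```python
def fit_min_max(rows):
    """Return per-feature minima and maxima computed from the training rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]

def apply_min_max(rows, lo, hi):
    """Scale each feature to [0, 1] with the training minima and maxima.

    A constant feature (max == min) is mapped to 0.0 to avoid division by zero.
    """
    return [
        [(v - a) / (b - a) if b > a else 0.0 for v, a, b in zip(row, lo, hi)]
        for row in rows
    ]

# Toy rows with two features on very different scales (made-up values).
train = [[1.0, 1000.0], [2.0, 1500.0], [3.0, 500.0]]
lo, hi = fit_min_max(train)
scaled = apply_min_max(train, lo, hi)
```

In a cross-validation setting, the minima and maxima would be fitted on the training fold only and then applied unchanged to the held-out fold.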

7. Conclusions and future works

The support vector machine (SVM) classifier is a popular and appealing classifier. Because its performance is sometimes unsatisfactory, it can be improved by transforming the original data before classification. Two transformations can be considered: one is the NWFE-Transformation proposed by B. C. Kuo and D. A. Landgrebe in 2001 [5], [6]; the other is the Liu-Transformation proposed in our previous work in 2008 [7], [8]. The results of our previous paper [8] showed that the SVM with our Liu-Transformation algorithm had the best performance. In this paper, we considered a further improved SVM algorithm based not only on the Liu-Transformation but also on the well-known normalization.

To evaluate the performance of the SVM without any transformation or normalization, the SVM with the NWFE-Transformation or the Liu-Transformation alone, and the SVM with one of these two transformations combined with the well-known normalization, an experiment was conducted on a real wine data set comprising 178 instances, 3 classes of wine, and 13 features for each instance.

The experimental results of the six classification algorithms are listed in Table 1. For both 5-fold CV and leave-one-out CV accuracy, the same pattern holds:

(i) The SVM algorithm with the Liu-Transformation is better than the SVM algorithm with the NWFE-Transformation and the SVM algorithm without any transformation.

(ii) The SVM algorithm with normalization and the Liu-Transformation and the SVM algorithm with normalization and the NWFE-Transformation had the same performance, better than the others.

In the future, we will apply our Liu-Transformation with normalization to improve the performance of other classifiers.

Acknowledgements

This work was partially supported by the National Science Council grant (NSC 96-2413-H-468-001).

References

[1] B. E. Boser, I. M. Guyon, and V. Vapnik, “A training algorithm for optimal margin classifiers”, in Fifth Annual Workshop on Computational Learning Theory, Pittsburgh, ACM, 1992.

[2] C. Cortes and V. Vapnik, “Support-vector networks”, Machine Learning, Vol. 20, pp. 273-297, 1995.

[3] V. Vapnik, The Nature of Statistical Learning Theory. New York, NY: Springer-Verlag, 1995.

[4] C.-C. Chang and C.-J. Lin, LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/cjlin/libsvm, 2004.

[5] B. C. Kuo and D. A. Landgrebe, “Improved Statistics Estimation and Feature Extraction for Hyperspectral Data Classification”, Technical Report TR-ECE 01-6, Purdue University, West Lafayette, IN, December 2001.

[6] B. C. Kuo and D. A. Landgrebe, “Nonparametric Weighted Feature Extraction for Classification”, IEEE Trans. on Geoscience and Remote Sensing, Vol. 42, No. 5, pp. 1096-1105, May 2004.

[7] Hsiang-Chuan Liu, “A novel nonparametric weighted feature extraction transformation algorithm based on the outmost points”, Journal of Taichung University (JNTCU), Vol. 22, No. 1, pp. 1-7, June 2008.

[8] Hsiang-Chuan Liu, Chien-Hsiung Liao, Tung-Sheng Liu, and Ya-Ching Chiu, “An improved SVM algorithm based Liu–Transformation”, 2008 International Conference on Management and Technology, Yunlin, Taiwan, June 2008.
