

3.3 Experimental Results

The classification performance of the proposed SVFNN is evaluated on five well-known benchmark datasets. These datasets are obtained from the UCI repository of machine learning databases [42], the Statlog collection [43], and the IJCNN 2001 challenge [44], [45].

A. Data and Implementation

From the UCI Repository we choose one dataset, the Iris dataset, and from the Statlog collection we choose three datasets: the Vehicle, Dna, and Satimage datasets. The Ijcnn1 problem is the first problem of the IJCNN 2001 challenge. These five datasets are used to verify the effectiveness of the proposed SVFNN classifier. The first dataset (Iris) is a collection of 150 samples equally distributed among three classes of iris plant, namely Setosa, Virginica, and Versicolor. Each sample is represented by four features (sepal length, sepal width, petal length, and petal width) and the corresponding class label. The second dataset (Vehicle) consists of 846 samples belonging to four classes, with each sample represented by 18 input features. The third dataset (Dna) consists of 3186 feature vectors, of which 2000 samples are used for training and 1186 for testing. Each sample consists of 180 input attributes, and the data are classified into three physical classes. All Dna examples are taken from GenBank 64.1. The fourth dataset (Satimage) is generated from Landsat Multispectral Scanner image data; 4435 samples are used for training and 2000 for testing. Each sample consists of 36 input attributes, and the data are classified into six physical classes. The fifth dataset (Ijcnn1) uses 49990 samples for training and 45495 samples for testing. Each sample consists of 22 input attributes, and the data are classified into two physical classes.

The computational experiments were performed on a Pentium III-1000 with 1024 MB RAM running the Linux operating system.

For each problem, we estimate the generalization accuracy using different cost parameters C = 2^12, 2^11, 2^10, …, 2^-2 in (3.1). For Dna, Satimage, and Ijcnn1, we apply 2-fold cross-validation 100 times on the whole training data and average the results, then choose the cost parameter C that yields the best average cross-validation rate for SVM training and use it to predict the test set. Because the Iris and Vehicle datasets do not contain explicit testing data, we divide each of them into two halves, used as training and testing sets, respectively, and apply the same procedure. Note that all training and testing data are scaled to [-1, 1].
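As an illustration of this model-selection procedure, the following minimal sketch uses scikit-learn (an assumption; the original experiments used a dedicated SVFNN implementation, so a standard RBF-kernel SVM stands in for it here). It scales the data to [-1, 1] and repeats 2-fold cross-validation 100 times for each candidate C:

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

    def select_cost_parameter(X, y):
        # Scale every feature to [-1, 1], as done for all training/testing data.
        X = MinMaxScaler(feature_range=(-1, 1)).fit_transform(X)

        # Candidate cost parameters C = 2^12, 2^11, ..., 2^-2 from (3.1).
        c_grid = [2.0 ** k for k in range(12, -3, -1)]

        # 2-fold cross-validation repeated 100 times on the whole training data.
        cv = RepeatedStratifiedKFold(n_splits=2, n_repeats=100, random_state=0)

        best_c, best_acc = None, -np.inf
        for c in c_grid:
            # An RBF-kernel SVM stands in for the SVFNN here (an assumption).
            acc = cross_val_score(SVC(kernel="rbf", C=c), X, y, cv=cv).mean()
            if acc > best_acc:
                best_c, best_acc = c, acc
        return best_c, best_acc

The C with the best average cross-validation accuracy is then used to retrain on the full training set and predict the test set.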

B. Experimental Results

Tables 3.1 to 3.5 present the classification accuracy rates and the number of fuzzy rules (i.e., support vectors) used by the SVFNN on the Iris, Vehicle, Dna, Satimage, and Ijcnn1 datasets, respectively. The criterion for determining the number of reduced fuzzy rules is the difference between the accuracy values before and after removing one fuzzy rule: if the difference is larger than 0.5%, meaning that an important support vector has been removed, the rule reduction stops. In Table 3.1, the SVFNN is verified on the Iris dataset, where the constant n in the symbol SVFNN-n denotes the number of learned fuzzy rules. With fourteen fuzzy rules, the SVFNN achieves an error rate of 2.6% on the training data and 4% on the testing data. When the number of fuzzy rules is reduced to seven, the testing error rate increases to 5.3%; when it is reduced to four, the testing error rate increases to 13.3%. Decreasing the number of fuzzy rules further keeps increasing the error rate. Tables 3.2 to 3.5 show results similar to those in Table 3.1.
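The stopping test can be summarized by the following sketch. It is a minimal illustration under stated assumptions: the evaluate function is a hypothetical helper returning classification accuracy for a rule set, and the order in which rules are dropped is assumed, since the text specifies only the 0.5% criterion:

    def reduce_fuzzy_rules(rules, evaluate, threshold=0.005):
        # `rules` is the list of fuzzy rules (support vectors) of the
        # trained SVFNN; `evaluate` is a hypothetical accuracy function.
        acc = evaluate(rules)
        while len(rules) > 1:
            candidate = rules[:-1]  # drop one rule (ordering is assumed)
            new_acc = evaluate(candidate)
            if acc - new_acc > threshold:
                # Accuracy fell by more than 0.5%: an important support
                # vector would be removed, so stop the rule reduction.
                break
            rules, acc = candidate, new_acc
        return rules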

These experimental results show that the proposed SVFNN effectively reduces the number of fuzzy rules while maintaining good generalization ability. Moreover, we also compare against other recently reported classification methods, including the support vector machine and reduced-support-vector methods [46]-[48]. Table 3.6 compares the performance of existing fuzzy neural network classifiers [49], [50], the RBF-kernel-based SVM (without support vector reduction) [46], the reduced support vector machine (RSVM) [48], and the proposed SVFNN.

TABLE 3.1 Experimental results of SVFNN classification on the Iris dataset.

SVFNN-n                Training process          Testing process
(SVFNN with n          Error rate    C           Number of            Error rate
fuzzy rules)                                     misclassifications

SVFNN-14               2.6%          2^12        3                    4%
SVFNN-11               2.6%          2^12        3                    4%
SVFNN-9                2.6%          2^12        3                    4%
SVFNN-7                4%            2^12        4                    5.3%
SVFNN-4                17.3%         2^12        10                   13.3%

1. Input dimension is 4.

2. The number of training data is 75.

3. The number of testing data is 75.

TABLE 3.2 Experimental results of SVFNN classification on the Vehicle dataset.

SVFNN-n                Training process          Testing process
(SVFNN with n          Error rate    C           Number of            Error rate
fuzzy rules)                                     misclassifications

SVFNN-321              13.1%         2^11        60                   14.2%
SVFNN-221              13.1%         2^11        60                   14.2%
SVFNN-171              13.1%         2^11        60                   14.2%
SVFNN-125              14.9%         2^11        61                   14.5%
SVFNN-115              29.6%         2^11        113                  26.7%

1. Input dimension is 18.

2. The number of training data is 423.

3. The number of testing data is 423.

TABLE 3.3 Experimental results of SVFNN classification on the Dna dataset.

(Columns as in Table 3.1: SVFNN-n, training error rate, C, number of misclassifications, testing error rate.)

1. Input dimension is 180.

2. The number of training data is 2000.

3. The number of testing data is 1186.

TABLE 3.4 Experimental results of SVFNN classification on the Satimage dataset.

(Columns as in Table 3.1: SVFNN-n, training error rate, C, number of misclassifications, testing error rate.)

1. Input dimension is 36.

2. The number of training data is 4435.

3. The number of testing data is 2000.

TABLE 3.5 Experimental results of SVFNN classification on the Ijcnn1 dataset.

(Columns as in Table 3.1: SVFNN-n, training error rate, C, number of misclassifications, testing error rate.)

1. Input dimension is 22.

2. The number of training data is 49990.

3. The number of testing data is 45495.

TABLE 3.6 Classification error rate comparisons among FNN, RBF-kernel-based SVM, RSVM and SVFNN classifiers, where NA means “not available”.

FNN [49], [50]     RBF-kernel-based SVM [46]     RSVM [48]     SVFNN
