3. PROPOSED DISCRETIZATION MODEL
3.3 Multi-objective Model
The previous section shows that our objectives conflict with one another: the number of intervals, the number of rules, the accuracy, and the upper-bound value cannot all be improved at once. Therefore, a multi-objective optimization model is added to the discretization process, as designed in Fig. 3-2.
3.3.1 Notations
First, Table 3-2 gives the mathematical notation and the definitions of the parameters.
Note that we set an upper bound on the number of split points and a lower bound on the number of rules; both bounds are chosen by the researcher.
Table 3-2 Notations and definitions

| Terms used in model | Notation and definitions |
|---|---|
| Sum of number of split points | $p \in \mathbb{N}$, $p = \sum$ (all split points) |
| Upper bound of split points | $n_p \in \mathbb{N}$, $p \le n_p$ (value is determined by researchers) |
| Number of rules | $r \in \mathbb{N}$ |
| Lower bound of rules | $n_R \in \mathbb{N}$, $n_R \le r$ (value is determined by researchers) |
| Accuracy | $a$, $0 \le a \le 1$ |
| Weight of each objective | $w_i$ where $i = 1, 2, \ldots$; $0 \le w_i \le 1$, $\sum w_i = 1$ |
Figure 3-2 Proposed Multi-objective Model
3.3.2 Fitness function
Based on these definitions and notations, we standardize each objective to a value between 0 and 1. In addition, we convert accuracy into an error rate, $\varepsilon = 1 - a$, so that every objective is minimized. After standardization we have three functions, and each must be weighted because we use the weighted-sum method.
Table 3-3 Standardization

| Objective | Standardization | Target |
|---|---|---|
| Number of intervals | $f_1 = p / n_p$, $0 \le f_1 \le 1$ | Minimum |
| Number of rules | $f_2 = (r - n_R) / r$, $0 \le f_2 \le 1$ | Minimum |
| Error rate | $f_3 = \varepsilon$, $0 \le f_3 \le 1$ | Minimum |
The fitness function is denoted as:
$F^{*} = w_1 f_1 + w_2 f_2 + w_3 f_3$ (Eq. 3-1)
where $w_1$, $w_2$, $w_3$ are the weights of the three objective functions.
The optimal solution is the one that minimizes the fitness function.
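To make Eq. 3-1 concrete, the following is a minimal Python sketch of the weighted-sum fitness. It assumes the standardizations read from Table 3-3, namely f1 = p / n_p and f2 = (r - n_R) / r; the function name `fitness` and its argument names are illustrative and not taken from the original implementation.

```python
def fitness(p, n_p, r, n_r, error_rate, weights):
    """Weighted-sum fitness F* = w1*f1 + w2*f2 + w3*f3 (Eq. 3-1).

    p, n_p     -- number of split points and its upper bound
    r, n_r     -- number of rules and its lower bound
    error_rate -- 1 - accuracy
    weights    -- (w1, w2, w3), each in [0, 1], summing to 1
    """
    w1, w2, w3 = weights
    if n_p <= 0 or not 0 <= p <= n_p:
        raise ValueError("need 0 <= p <= n_p with n_p > 0")
    if r <= 0 or not 0 <= n_r <= r:
        raise ValueError("need 0 <= n_r <= r with r > 0")
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    if any(not 0.0 <= w <= 1.0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must lie in [0, 1] and sum to 1")

    f1 = p / n_p          # standardized number of split points
    f2 = (r - n_r) / r    # standardized number of rules
    f3 = error_rate       # error rate is already in [0, 1]
    return w1 * f1 + w2 * f2 + w3 * f3


# Illustrative values only: 6 of at most 12 split points, 4 rules with a
# lower bound of 2, and an error rate of 0.25.
print(fitness(p=6, n_p=12, r=4, n_r=2, error_rate=0.25, weights=(0.25, 0.25, 0.5)))
```

Because all three terms lie in [0, 1] and the weights sum to 1, the fitness itself stays in [0, 1], which makes candidate solutions directly comparable.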
3.3.3 Pseudo code
FOR (ITERATION TIMES = 10000) {
    FOR (ALL ATTRIBUTES ONE BY ONE) { //discretization
        SORT;
        GENERATE SPLIT POINTS (TOTAL = UPPER BOUND);
        FOR (ALL SPLIT POINTS) {
            IF (SPLIT POINTS ARE EQUAL) { SPLIT POINTS – 1; }
        }
        CREATE INTERVALS;
    }
    FOR (ITERATION TIMES = 1000) { //generate classification rules
        GENERATE RULES (RANDOM NUMBER UNDER UPPER BOUND);
        CALCULATE ACCURACY;
    }
    CALCULATE FITNESS;
    PSO CORRECTING SPLIT POINTS POSITION;
}
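The loop above can be sketched in Python for a single attribute. This is a simplified illustration under stated assumptions, not the original implementation: the rule-generation step is replaced by a stand-in that predicts the majority class of each interval, the fitness uses only two terms (number of split points and error rate) with hypothetical weights, and the PSO coefficients (0.7, 1.5, 1.5) as well as all function names are assumptions.

```python
import random
from collections import Counter


def make_cuts(position):
    """Sort split points and merge equal ones (the 'SPLIT POINTS - 1' step)."""
    return sorted(set(position))


def interval_index(value, cuts):
    """Index of the interval holding `value`, given sorted cut points."""
    return sum(value >= c for c in cuts)


def error_rate(values, labels, cuts):
    """Stand-in for rule generation: predict each interval's majority class."""
    buckets = {}
    for v, y in zip(values, labels):
        buckets.setdefault(interval_index(v, cuts), Counter())[y] += 1
    correct = sum(c.most_common(1)[0][1] for c in buckets.values())
    return 1.0 - correct / len(values)


def fitness(position, values, labels, n_p, weights=(0.3, 0.7)):
    """Two-term weighted sum: fewer split points and lower error are better."""
    cuts = make_cuts(position)
    return weights[0] * len(cuts) / n_p + weights[1] * error_rate(values, labels, cuts)


def pso_split_points(values, labels, n_p=2, particles=10, iterations=50, seed=0):
    """Move candidate split points with a basic PSO update and keep the best."""
    rng = random.Random(seed)
    lo, hi = min(values), max(values)
    pos = [[rng.uniform(lo, hi) for _ in range(n_p)] for _ in range(particles)]
    vel = [[0.0] * n_p for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_fit = [fitness(p, values, labels, n_p) for p in pos]
    g = min(range(particles), key=best_fit.__getitem__)
    g_pos, g_fit = best_pos[g][:], best_fit[g]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(n_p):
                # Inertia plus pulls toward the personal and global bests.
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
                # Keep each split point inside the observed value range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            f = fitness(pos[i], values, labels, n_p)
            if f < best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], f
                if f < g_fit:
                    g_pos, g_fit = pos[i][:], f
    return make_cuts(g_pos), g_fit


# Toy data (hypothetical): one continuous attribute with three classes.
values = [1.0, 1.2, 1.4, 4.0, 4.5, 5.0, 6.0, 6.5]
labels = ["a", "a", "a", "b", "b", "b", "c", "c"]
print(pso_split_points(values, labels, n_p=2))
```

In the full model, the stand-in error rate would be replaced by the accuracy of the randomly generated rule sets, and the fitness would use all three terms of Eq. 3-1.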
3.3.4 Emulation Picture
Step 1: Randomly select dividing points (the maximum number of bins can be set in advance).
Figure 3-3 Randomly select dividing points
Step 2: List all possible rule sets and calculate the affinity degree.
Figure 3-4 List all possible rule sets and calculate the affinity degree
Step 3: Iteratively select rules at random and calculate the fitness.
Figure 3-5 Iteratively select rules at random
Step 4: Select dividing bins via PSO.
Figure 3-6 Select dividing bins via PSO
Step 5: Repeat Steps 2 to 4 until the terminating conditions are reached.
Figure 3-7 Final dividing bins
4. DATA ANALYSIS
In this paper we use two datasets: the well-known Iris flower data set, and a dataset on multiple enrollment programs and academic achievement obtained at St. John's University. The details of the datasets are listed in Table 4-1.
Table 4-1 Details of datasets

| Data set | Examples | Attributes | Classes |
|---|---|---|---|
| IRIS flower data set | 150 | Sepal length (continuous); Sepal width (continuous) | |
| Multi enrollment data set | 460 | Academic achievements average (continuous); Physical training score (continuous); Score rank in class (continuous); College department (discrete) | |
The IRIS flower data set was obtained from the University of California at Irvine (UCI) data set repository. The multi enrollment data set, which covers the multi enrollment programs of Taiwan's technical–vocational colleges, was obtained from a four-year technical–vocational college in Taipei (personal information was removed to protect the students' privacy). The number of examples, attributes, and classes of these data sets is shown in Table 4-2.
Table 4-2 Main characteristics of the data sets used in the experiments

| Data set | # examples | # attributes | # classes |
|---|---|---|---|
| IRIS flower data set | 150 | 4 | 3 |
| Multi enrollment data set | 460 | 5 | 6 |
4.1 Experiment I: IRIS dataset
The IRIS data describes each flower by two characteristic parts: the sepal and the petal. Researchers can infer a flower's species from its sepal length, sepal width, petal length, and petal width. The input data are numerical values; the output class consists of three species: Setosa, Versicolor, and Virginica.
4.1.1 Descriptive statistics
The attributes and class of the IRIS data set are listed in Table 4-3. For ant colony optimization (ACO) and the traditional affinity set, the attributes were first discretized with k-means clustering into discrete values, denoted sl-1, sl-2, sl-3, sw-1, sw-2, etc.
The parameter settings for ACO, the affinity set, and the proposed multi-objective affinity set are given below.
Table 4-3 Attributes and class coding of IRIS dataset

| Type | Name | Description |
|---|---|---|
| Attribute | Sepal length | Average: 5.843; Standard deviation: 0.825 |
| Attribute | Sepal width | Average: 3.0573; Standard deviation: 0.434 |
| Attribute | Petal length | Average: 3.758; Standard deviation: 1.759 |
| Attribute | Petal width | Average: 1.199; Standard deviation: 0.760 |
| Class | Species | Setosa, Versicolor, Virginica |
4.1.2 Experiment Results
The custom dividing points for ACO and the traditional affinity set, obtained by k-means, are listed in Table 4-4.

Table 4-4 Custom dividing point for ACO and Traditional Affinity Set

| Attribute | Dividing point | Range | Code |
|---|---|---|---|
| Sepal length | Dividing point 1 | Sepal length < 5.6 | sl-1 |
| | Dividing point 2 | 5.6 <= Sepal length <= 6.5 | sl-2 |
| | Dividing point 3 | Sepal length > 6.5 | sl-3 |
| Sepal width | Dividing point 1 | Sepal width < 2.8 | sw-1 |
| | Dividing point 2 | 2.8 <= Sepal width <= 3.4 | sw-2 |
| | Dividing point 3 | Sepal width > 3.4 | sw-3 |
| Petal length | Dividing point 1 | Petal length < 3 | pl-1 |
| | Dividing point 2 | 3 <= Petal length <= 5.1 | pl-2 |
| | Dividing point 3 | Petal length > 5.1 | pl-3 |
| Petal width | Dividing point 1 | Petal width < 1 | pw-1 |
| | Dividing point 2 | 1 <= Petal width <= 1.7 | pw-2 |
| | Dividing point 3 | Petal width > 1.7 | pw-3 |
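As an illustration of how the thresholds of Table 4-4 turn a measurement into a code, here is a minimal Python sketch. It assumes that the first row of the table applies to sepal length, and that both boundaries of the middle range are inclusive, as the table states; the names `CUTS`, `encode`, and `encode_flower` are hypothetical.

```python
# Thresholds from Table 4-4, keyed by the code prefix of each attribute.
CUTS = {
    "sl": (5.6, 6.5),  # sepal length
    "sw": (2.8, 3.4),  # sepal width
    "pl": (3.0, 5.1),  # petal length
    "pw": (1.0, 1.7),  # petal width
}


def encode(prefix, value):
    """Map one continuous value to its code, e.g. ('pl', 1.4) -> 'pl-1'."""
    low, high = CUTS[prefix]
    if value < low:
        k = 1
    elif value <= high:   # the middle range includes both boundaries
        k = 2
    else:
        k = 3
    return f"{prefix}-{k}"


def encode_flower(sepal_length, sepal_width, petal_length, petal_width):
    """Discretize all four measurements of one flower."""
    return {
        "sepal_length": encode("sl", sepal_length),
        "sepal_width": encode("sw", sepal_width),
        "petal_length": encode("pl", petal_length),
        "petal_width": encode("pw", petal_width),
    }


# Illustrative measurement in centimetres.
print(encode_flower(5.1, 3.5, 1.4, 0.2))
```

The same function can be reused with other dividing points by swapping the contents of `CUTS`.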
Table 4-5 Parameters settings for experiment I

| Methodology | Parameters | Value |
|---|---|---|
| Ant colony optimization | Folds | 10 |
| | Number of ants | 10 |
| | Default class | Virginica |
| | Minimum cases per rule | 5 |
| | Maximum uncovered cases | 10 |
| | Rules for convergence | 10 |
| | Number of iterations | 100 |
| Affinity set | Selection of k | 32% |
| | Default class | Virginica |
| Multi-objective affinity set | Maximum number of rules (N) | 5 |
| | Default class | Virginica |
| | Iteration times | 10000 |
The classification rules are presented below, and the comparison results are given in Table 4-9.
Rule set from ant colony optimization:
Table 4-6 Rule set from ant colony optimization

| Rule | IF | THEN |
|---|---|---|
| Rule 1 | Petal length = pl-3 | Species = Setosa |
| Rule 2 | Petal length = pl-1 AND PetalWidth2 = pl-3 | Species = Versicolor |
| Rule 3 | Petal length = pl-2 | Species = Virginica |
| Rule 4 | Sepal length = pl-1 | Species = Virginica |
| Default | | Species = Virginica |
Rule set from multi-objective affinity set:
Table 4-7 Rule set from multi-objective affinity set

| Rule | IF | THEN |
|---|---|---|
| Rule 1 | Sepal length = sl-2 AND Sepal width = sw-1 AND Petal length = pl-1 AND Petal width = pw-3 | Species = Versicolor |
| Rule 2 | Sepal width = sw-1 AND Petal width = pw-1 | Species = Setosa |
| Rule 3 | Petal length = pl-1 AND Petal width = pw-3 | Species = Versicolor |
| Rule 4 | Petal width = pw-1 | Species = Setosa |
| Default | | Species = Virginica |
Comparing Table 4-6 and Table 4-7, the rule set generated by the proposed model draws on a wider variety of attributes; these rules therefore retain more information and are more meaningful for a botanist or biologist to analyze.
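To show how a rule set with a default class can be applied to a discretized record, the following Python sketch encodes the four rules of Table 4-7. It assumes that rules are tried in order and that the first matching rule decides the class; the source does not state the conflict-resolution strategy, so this ordering, along with the names `RULES`, `DEFAULT_CLASS`, and `classify`, is an assumption.

```python
# Rules from Table 4-7 as (conditions, predicted species) pairs.
RULES = [
    ({"sepal_length": "sl-2", "sepal_width": "sw-1",
      "petal_length": "pl-1", "petal_width": "pw-3"}, "Versicolor"),
    ({"sepal_width": "sw-1", "petal_width": "pw-1"}, "Setosa"),
    ({"petal_length": "pl-1", "petal_width": "pw-3"}, "Versicolor"),
    ({"petal_width": "pw-1"}, "Setosa"),
]
DEFAULT_CLASS = "Virginica"


def classify(record):
    """Return the class of the first rule whose conditions all hold.

    `record` maps attribute names to discretized codes. If no rule
    matches, the default class is returned.
    """
    for conditions, species in RULES:
        if all(record.get(attr) == code for attr, code in conditions.items()):
            return species
    return DEFAULT_CLASS


# Illustrative discretized records.
print(classify({"sepal_length": "sl-1", "sepal_width": "sw-3",
                "petal_length": "pl-1", "petal_width": "pw-1"}))
print(classify({"sepal_length": "sl-3", "sepal_width": "sw-2",
                "petal_length": "pl-3", "petal_width": "pw-2"}))
```

Accuracy on a labelled data set is then the fraction of records for which `classify` returns the true species.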
Table 4-8 Dividing point for multi-objective affinity set

| Attribute | Dividing point | Range | Code |
|---|---|---|---|
| Sepal length | Dividing point 1 | Sepal length < 5.62 | sl-1 |
| | Dividing point 2 | 5.62 <= Sepal length <= 6.82 | sl-2 |
| | Dividing point 3 | Sepal length > 6.82 | sl-3 |
| Sepal width | Dividing point 1 | Sepal width < 2.88 | sw-1 |
| | Dividing point 2 | 2.88 <= Sepal width <= 3.68 | sw-2 |
| | Dividing point 3 | Sepal width > 3.68 | sw-3 |
| Petal length | Dividing point 1 | Petal length < 3.16 | pl-1 |
| | Dividing point 2 | 3.16 <= Petal length <= 5.13 | pl-2 |
| | Dividing point 3 | Petal length > 5.13 | pl-3 |
| Petal width | Dividing point 1 | Petal width < 0.98 | pw-1 |
| | Dividing point 2 | 0.98 <= Petal width <= 1.78 | pw-2 |
| | Dividing point 3 | Petal width > 1.78 | pw-3 |
This discretization result shows that the proposed model and k-means split the continuous data at nearby split points and with the same number of split points, which suggests that the two discretizations are of similar quality.
Table 4-9 Classification results of experiment I

| Algorithm | Accuracy | # rules |
|---|---|---|
| Ant colony optimization | 76.67% | 4 |
| Affinity set | 88.00% | 6 |
| *Multi-objective affinity set | 97.33% | 4 |

* denotes the best model
In this IRIS classification case, the multi-objective affinity set is the best of the three classification methods: it achieves the highest accuracy (97.33%) with the fewest rules (4, tied with ACO). In the next section, we apply and compare the three models on a practical problem.
4.2 Experiment II: Multiple Enrollment Dataset
The multi enrollment dataset was collected from a four-year technical–vocational college in Taipei in 2001. From the dataset, we chose several attributes, such as grade, conduct, and sports, to infer the enrollment type under the multi enrollment programs of Taiwan's technical–vocational colleges. Since Gardner proposed the concept of Multiple Intelligences (Gardner, 1993), multi enrollment programs have become a trend in admissions in many countries. In 1995, Taiwan's Ministry of Education published "The report of education in Taiwan, ROC" (Ministry-of-Education, 1995), which promoted and planned the multi enrollment entrance program to enhance students' learning outcomes and interests through a better-adapted selection process. Therefore, in this case we examine attributes that are easy to observe and that reflect students' learning outcomes, in order to assess how effectively multi enrollment worked. The attributes and details are listed in Table 4-10. In addition, students' personal information was removed to protect their privacy.
4.2.1 Descriptive statistics
Table 4-10 Attributes and class coding of multi enrollment dataset

| Type | Name | Description |
|---|---|---|
| Attribute | Conduct | Average: 87.202; Standard deviation: 5.205 |
| Attribute | Grade | Average: 69.741; Standard deviation: 7.503 |
| Attribute | Sports | Average: 76.780; Standard deviation: 10.749 |
| Attribute | Rank in class | Average: 29.407; Standard deviation: 16.804 |
| Attribute | School Score | Average: 87.202; Standard deviation: 5.205 |
| Class | Enrollment entrances | Application (Application); Joint entrance examination (Joint); Audition (Audition); Cerebral palsy disability (Disability); Disaster area admission (Disaster); Technical admission (Tech) |
4.2.2 Experiment Results
The custom dividing points for ACO and the traditional affinity set are listed in Table 4-11.

Table 4-11 Custom dividing point for ACO and traditional Affinity Set

| Attribute | Dividing point | Range | Code |
|---|---|---|---|
| Conduct | Dividing point 1 | Conduct < 83 | Low |
| | Dividing point 2 | 83 <= Conduct <= 88 | Middle |
| | Dividing point 3 | Conduct > 88 | High |
| Grade | Dividing point 1 | Grade < 65 | Low |
| | Dividing point 2 | 65 <= Grade <= 73 | Middle |
| | Dividing point 3 | Grade > 73 | High |
| Sports | Dividing point 1 | Sports < 47 | Low |
| | Dividing point 2 | 47 <= Sports <= 76 | Middle |
| | Dividing point 3 | Sports > 76 | High |
| Rank in class | Dividing point 1 | Rank < 21 | Front |
| | Dividing point 2 | 21 <= Rank <= 40 | Middle |
| | Dividing point 3 | Rank > 40 | Post |
| School Score | Dividing point 1 | SchoolScore < 83 | Low |
| | Dividing point 2 | 83 <= SchoolScore <= 88 | Middle |
| | Dividing point 3 | SchoolScore > 88 | High |
Table 4-12 Parameters settings for experiment II

| Methodology | Parameters | Value |
|---|---|---|
| Ant colony optimization | Folds | 10 |
| | Number of ants | 10 |
| | Default class | Joint |
| | Minimum cases per rule | 5 |
| | Maximum uncovered cases | 10 |
| | Rules for convergence | 10 |
| | Number of iterations | 100 |
| Affinity set | Selection of k | 35% |
| | Default class | Joint |
| Multi-objective affinity set | Maximum number of rules (N) | 5 |
| | Default class | Application |
| | Iteration times | 10000 |
The classification rule sets are presented in Tables 4-13 to 4-15, and the comparison results are presented in Table 4-17.
Rule set from ant colony optimization:
Table 4-13 Rule set from ant colony optimization

| Rule | IF | THEN |
|---|---|---|
| Rule 1 | Conduct = Middle AND Grade = Middle | Enrollment = Joint |
| Rule 2 | Conduct = Low | Enrollment = Joint |
| Rule 3 | ClassRank = Front | Enrollment = Joint |
| Rule 4 | Grade = Low | Enrollment = Joint |
| Rule 5 | Conduct = High AND Grade = High | Enrollment = Audition |
| Rule 6 | Conduct = High AND Sports = Middle | Enrollment = Joint |
| Default | | Enrollment = Joint |
Rule set from affinity set:
Table 4-14 Rule set from affinity set

| Rule | IF | THEN |
|---|---|---|
| Rule 1 | Sports = Middle | Enrollment = Joint |
| Rule 2 | Grade = Middle | Enrollment = Joint |
| Rule 3 | SchoolScore = Middle | Enrollment = Joint |
| Rule 4 | Conduct = Middle | Enrollment = Joint |
| Rule 5 | Conduct = Middle AND SchoolScore = Middle | Enrollment = Joint |
| Rule 6 | Grade = Middle AND Sports = Middle | Enrollment = Joint |
| Rule 7 | Sports = Middle AND SchoolScore = Middle | Enrollment = Joint |
| Rule 8 | Conduct = Middle AND Sports = Middle | Enrollment = Joint |
| Rule 9 | Conduct = Middle AND Sports = Middle AND SchoolScore = Middle | Enrollment = Joint |
| Rule 10 | ClassRank = Middle | Enrollment = Joint |
| Default | | Enrollment = Joint |
In this multi enrollment case, we selected rules by setting k = 35% and obtained 10 rules in total. However, the k-core method cannot prevent every rule in the set from pointing to the same class, so the rule set cannot reveal the features of the dataset. In contrast, the proposed multi-objective affinity set avoids this kind of problem by selecting rules iteratively. The rule set from the multi-objective affinity set is listed in Table 4-15.
The proposed multi-objective affinity set outputs 4 rules in total, fewer than ACO (6 rules) and the traditional affinity set (10 rules). These four rules suggest that "School Score", a composite number that accounts for attendance, bonus points from teachers, merits, and faults, might be a major attribute that shows the difference in students' learning outcomes across enrollment entrances.
Table 4-15 Rule set from multi-objective affinity set

| Rule | IF | THEN |
|---|---|---|
| Rule 1 | SchoolScore = Low | Enrollment = Joint |
| Rule 2 | Grade = Low AND SchoolScore = Middle | Enrollment = Joint |
| Rule 3 | Sports = Low AND SchoolScore = High | Enrollment = Joint |
| Rule 4 | Grade = Middle AND Sports = Low | Enrollment = Joint |
| Default | | Enrollment = Application |
The dividing points are shown in Table 4-16.
Note that the attributes "Conduct" and "Rank in class" each have only one dividing point, fewer than the two that k-means generated.
Table 4-16 Dividing point for multi-objective affinity set

| Attribute | Dividing point | Range | Code |
|---|---|---|---|
| Conduct | Dividing point 1 | Conduct < 95.0 | Low |
| | Dividing point 2 | Conduct >= 95.0 | High |
| Grade | Dividing point 1 | Grade < 49.41 | Low |
| | Dividing point 2 | 49.41 <= Grade <= 62.87 | Middle |
| | Dividing point 3 | Grade > 62.87 | High |
| Sports | Dividing point 1 | Sports < 23.83 | Low |
| | Dividing point 2 | 23.83 <= Sports <= 57.23 | Middle |
| | Dividing point 3 | Sports > 57.23 | High |
| Rank in class | Dividing point 1 | Rank < 62 | Front |
| | Dividing point 2 | Rank >= 62 | Post |
| School Score | Dividing point 1 | SchoolScore < 81.17 | Low |
| | Dividing point 2 | 81.17 <= SchoolScore <= 87.80 | Middle |
| | Dividing point 3 | SchoolScore > 87.80 | High |
The comparison of the three models is as follows.

Table 4-17 Classification results of experiment II

| Algorithm | Accuracy | # rules |
|---|---|---|
| Ant colony optimization | 61.44% | 6 |
| Affinity set | 61.30% | 8 |
| *Multi-objective affinity set | 61.50% | 4 |

* denotes the best model
This result shows that the proposed multi-objective affinity set slightly improves accuracy (61.50%, versus 61.44% for ACO and 61.30% for the affinity set) while reducing the number of split points and rules. A smaller rule set that does not lose information or variety is easier to apply and to build into an educational diagnosis system.
5. CONCLUSION
5.1 Conclusion
The major purpose of this research is to combine multi-objective decision making with the affinity set classification method and to enhance the accuracy of the output rule set. Because the proposed method skips the k-core step of the traditional affinity set, a wider variety of rule combinations can be chosen, and it achieved the highest prediction accuracy in the experiments of this study. Furthermore, our improved multi-objective affinity set can reduce the number of classification rules needed.
As a result, our method improves prediction accuracy with fewer classification rules, which makes real-world systems based on classification rules, such as web interfaces on the internet or educational support software on a PC, easier to build and apply.
5.2 Future Works
This study focuses on increasing classification accuracy and reducing the number of dividing points and the number of classification rules. Because the k-core step of the traditional affinity set is skipped, a wider variety of rule combinations can be chosen, which gives higher prediction accuracy for delayed diagnosis detection. Moreover, further objectives can still be added to the MO affinity set system, such as a higher number of true negatives (TN) or a lower number of false positives (FP). In future applications, improvements to the multi-objective model should target real-world problems, such as making the system more sensitive when predicting particular attributes. For example, a medical diagnosis system that observes patients' blood pressure, body temperature, pulse, etc. could be used to prevent delayed diagnosis or medical error.
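As a rough illustration of how a TN/FP-based objective might be standardized in the same way as the existing ones, the Python sketch below counts the confusion-matrix entries for one chosen positive class and derives a false positive rate in [0, 1] that could be minimized. This is only one possible formulation and is not part of the proposed model; the function names and the labels "delayed" and "on_time" are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, TN, FN, treating `positive` as the class of interest."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def false_positive_rate(y_true, y_pred, positive):
    """FP / (FP + TN): lies in [0, 1], so it could serve as a term to minimize."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred, positive)
    return fp / (fp + tn) if fp + tn else 0.0


# Hypothetical labels for a delayed-diagnosis style problem.
y_true = ["delayed", "on_time", "on_time", "on_time", "delayed"]
y_pred = ["delayed", "delayed", "on_time", "on_time", "on_time"]
print(confusion_counts(y_true, y_pred, "delayed"))
print(false_positive_rate(y_true, y_pred, "delayed"))
```

Adding such a term would require one more weight, with all weights still summing to 1.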
REFERENCES
1. Ahmad, M. A., & Srivastava, J. (2008). An Ant Colony Optimization Approach to Expert Identification in Social Networks. Social Computing, Behavioral Modeling, and Prediction, 120-128.
2. Barakat, N. H. (2007). Rule Extraction from Support Vector Machines: A Sequential Covering Approach. IEEE Transactions on Knowledge and Data Engineering, 19(6), 729-741.
3. Berrado, A., & Runger, G. C. (2007). Using Metarules To Organize And Group Discovered Association Rules. Data Mining and Knowledge Discovery, 14(3), 409-431.
4. Brauers, W. K. M., Zavadskas, E. K., Peldschus, F., & Turskis, Z. (2008). Multi-objective decision-making for road design. Transport, 23(3), 183-193.
5. Chen, Y.-W., & Larbani, M. (2007). Affinity Set and Its Applications. Paper presented at the Proceeding of the International Workshop on Multiple Criteria Decision Making, Poland.
6. Chen, Y.-W., Larbani, M., Shen, C.-M., & Chen, C.-W. (2008). Using Affinity Set on Finding the Key Attributes of Delayed Diagnosis. Applied Mathematical Sciences, 3(7), 217-316.
7. Chen, Y.-W., Larbani, M., Wu, C.-L., & Chen, C.-W. (2007). Using Affinity Set Theory to Enhance the Effectiveness of Head Computed Tomography.
8. Reyes-Sierra, M., & Coello Coello, C. A. (2006). Multi-Objective particle swarm optimizers: A survey of the state-of-the-art. International Journal of Computational Intelligence Research, 2(3), 287-308.
9. Deb, K. (2001). Multi-Objective Optimization using Evolutionary Algorithms. John Wiley & Sons, England.
10. Dougherty, J., Kohavi, R., & Sahami, M. (1995). Supervised and Unsupervised Discretization of Continuous Features. Paper presented at the Machine Learning: Proceedings of the Twelfth International Conference.
11. Gardner, H. (1993). Multiple intelligences: The theory in practice. New York: Basic Books.
12. Grzymala-Busse, J. W. (2002). Data reduction: discretization of numerical attributes. In Handbook of data mining and knowledge discovery. New York, NY: Oxford University Press, Inc.
13. Ho, D. Y. F. (1998). Interpersonal Relationships and Relationship Dominance: An Analysis Based on Methodological Relationism. Asian Journal of Social Psychology, 1, 1-16.
14. Holden, N., & Freitas, A. A. (2004). Web Page Classification with an Ant Colony Algorithm. Lecture Notes in Computer Science, 3242, 1092-1102.
15. Hwang, K.-K. (1987). Face and Favor: The Chinese Power Game. The American Journal of Sociology, 92(4), 944-974.
16. Ishibuchi, H., Nakashima, T., & Nii, M. (2005). Multi-Objective Design of Linguistic Models. In Classification and Modeling with Linguistic Information Granules (pp. 131-141): Springer Berlin Heidelberg.
17. Jensen, R., & Shen, Q. (2006). Webpage Classification with ACO-enhanced Fuzzy-Rough Feature Selection. Paper presented at the Proceedings of the Fifth International Conference on Rough Sets and Current Trends in Computing (RSCTC 2006), LNAI 4259.
18. Kennedy, J., & Eberhart, R. C. (1995). Particle swarm optimization. Paper presented at the IEEE Int. Conf. on Neural Networks, Piscataway, NJ.
19. Kerber, R. (1998). Chimerge: Discretization of numeric attributes. Paper presented at the 10th Conference of the American Association for Artificial Intelligence.
20. Kianmehr, K., Alshalalfa, M., & Alhajj, R. (2008). Effectiveness of Fuzzy Discretization for Class Association Rule-Based Classification. Paper presented at the Foundations of Intelligent Systems.
21. Lhotská, L., Macaš, M., & Burša, M. (2006). PSO and ACO in Optimization Problems. Paper presented at the Intelligent Data Engineering and Automated Learning – IDEAL 2006.
22. Liu, H., Hussain, F., Tan, C. L., & Dash, M. (2002). Discretization: An Enabling Technique. Data Mining and Knowledge Discovery, 6(4), 393-423.
23. Luo, Y. (2000). Guanxi and Business (Vol. 1): World Scientific.
24. Mendelson, B. (1990). Introduction to Topology. Dover Publications.
25. Ministry-of-Education. (1995). The report of education in Taiwan, ROC.
26. Mostaghim, S. (2003). The Role of ε-dominance in Multi Objective Particle Swarm Optimization Methods. Paper presented at the Proceedings of the 2003 Congress on Evolutionary Computation.
27. Tripathi, P. K., Bandyopadhyay, S., & Pal, S. K. (2007). Multi-Objective Particle Swarm Optimization with time variant inertia and acceleration coefficients. Information Sciences, 177(22), 5033-5049.
28. Pfahringer, B. (1995). Compression-Based Discretization of Continuous Attributes. Paper presented at the Proceedings of the 12th International Conference on Machine Learning.
29. Piatrik, T., & Izquierdo, E. (2006). Image Classification Using an Ant Colony Optimization Approach. Lecture Notes in Computer Science, 4306, 159-168.
30. Qu, W., Yan, D., Sang, Y., Liang, H., Kitsuregawa, M., & Li, K. (2008). A Novel Chi2 Algorithm for Discretization of Continuous Attributes. In Progress in WWW Research and Development (Vol. 4976): Springer-Verlag Berlin Heidelberg.
31. Skubacz, M., & Hollmén, J. (2008). Quantization of Continuous Input Variables for Binary Classification. Paper presented at the Intelligent Data Engineering and Automated Learning – IDEAL 2000. Data Mining, Financial Engineering, and Intelligent Agents.
32. Wang, Z., Sun, X., & Zhang, D. (2007). A PSO-Based Classification Rule Mining Algorithm (Vol. 4682). Heidelberg: Springer Berlin.
33. Wu, C.-H., Lin, W.-T., Li, C.-H., Fang, I.-C., & Wu, C.-H. (2008). Ant Colony Optimization On Building An Online Delayed Diagnosis Detection Support System For Emergency Department. Paper presented at the CIEF 2008.
34. Wu, C.-H., Lin, W.-T., Li, C.-H., Fang, I.-C., & Wu, C.-H. (2009). A Novel Multi-Objective Affinity Set Classification System: An Investigation of Delayed Diagnosis Detection. Paper presented at the 1st Asian Conference on Intelligent Information and Database Systems.