CHAPTER 6 CASE STUDY
settings of JRip and J48 may treat them as noise and ignore them in pursuit of overall accuracy; as a result, the models are too simple and explain nothing. We therefore try several different settings that disable the mechanisms designed to keep models from growing too complex and overfitting.
6.3 Inspect the Result Rules
We generate many rule sets from 135 data sets using three algorithms under different settings, and we present some of the rules here. The following rules were generated by J48 on the data set of family 27 at pass 5:
(R2 roll torque_pass5 [kNm]\[1] >= 3078.667) => Torque ratio p5=[0.65,1] (27.0/0.0).
(R2 roll torque_pass5 [kNm]\[1] >= 2585.333) and
(R2 total roll force_pass5 [kN]\[1] <= 23046.67) => Torque ratio p5=[0.65,1] (3.0/0.0).
The first rule states that if the torque measured at the mill motors for pass 5 is greater than or equal to 3078.667 kNm, then the torque ratio of pass 5 falls in the range [0.65,1], which is a slip range. The second rule states that if the torque measured at the mill motors for pass 5 is greater than or equal to 2585.333 kNm and the force measured at the mill motors for pass 5 is less than or equal to 23046.67 kN, then the torque ratio of pass 5 also falls in the slip range [0.65,1]. The pair of numbers in parentheses after each rule gives the number of records the rule covers and the number of those it misclassifies. From these two rules we may conclude that a slip may occur when the torque measured at the mill motors is too high, and sometimes when the force measured at the mill motors is too low.
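To make the reading of the two rules concrete, the following minimal Python sketch encodes them as threshold conditions and applies them to a record. The short attribute keys and the sample records are assumptions introduced for illustration; only the thresholds come from the rules above.

```python
import operator

# Short keys standing in for the full attribute names in the rules (assumed naming).
TORQUE = "torque_pass5_kNm"
FORCE = "force_pass5_kN"

# Each rule is a list of (attribute, comparison, threshold) conditions joined by "and".
SLIP_RULES = [
    [(TORQUE, operator.ge, 3078.667)],
    [(TORQUE, operator.ge, 2585.333), (FORCE, operator.le, 23046.67)],
]

def predicts_slip(record):
    """Return True if at least one rule has all of its conditions satisfied."""
    return any(
        all(compare(record[attr], threshold) for attr, compare, threshold in rule)
        for rule in SLIP_RULES
    )

# Made-up records for illustration only; they are not taken from the mill data.
high_torque = {TORQUE: 3100.0, FORCE: 25000.0}
moderate_torque_low_force = {TORQUE: 2600.0, FORCE: 23000.0}
moderate_torque_high_force = {TORQUE: 2600.0, FORCE: 24000.0}

for name, record in [
    ("high torque", high_torque),
    ("moderate torque, low force", moderate_torque_low_force),
    ("moderate torque, high force", moderate_torque_high_force),
]:
    print(name, "->", predicts_slip(record))
```

The first two records fall in the slip range under these rules and the third does not, because its torque is below the first threshold and its force is above the second rule's force limit.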
Our job is to summarize the results and let the experts inspect them. We need feedback from the experts to improve the experiments.
In the rules given above, the attribute for the torque measured at the mill motors for pass 5 appears twice, and the attribute for the force measured at the mill motors for pass 5 appears once. The more frequently an attribute appears in the rules, the more important it is likely to be, especially when many rules have been built.
Across all rule sets we observe the same pattern: the torque measured at the mill motors when biting into a slab is the most frequent attribute for passes 3, 4, and 5, and the rolling speed measured at the mill motors is the second most frequent attribute for the same passes.
We also find that some attributes never appear in any rule. This finding may help the experts reduce dimensionality when building a predictor.
The third most frequent attribute for passes 3 and 4 is the rolling speed of the top working roll, which is preferred by JRip and J48. The third most frequent attribute for pass 5 is the bottom working roll number, which is preferred by ROUSER and never chosen by JRip or J48. The experts consider both results reasonable. We find that JRip and J48 prefer real-valued attributes and may overlook some important nominal attributes.
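The attribute counting described above can be sketched in Python as follows. This is a minimal illustration, assuming each rule is available as a text line in the format shown earlier; the regular expression, the function names, and the extra attribute in the candidate list are assumptions, not part of the original experiments.

```python
import re
from collections import Counter

# The two rules quoted earlier in this section, as plain text.
RULES = [
    r"(R2 roll torque_pass5 [kNm]\[1] >= 3078.667) => Torque ratio p5=[0.65,1] (27.0/0.0)",
    r"(R2 roll torque_pass5 [kNm]\[1] >= 2585.333) and "
    r"(R2 total roll force_pass5 [kN]\[1] <= 23046.67) => Torque ratio p5=[0.65,1] (3.0/0.0)",
]

# One condition looks like "(attribute operator value)"; capture the attribute name.
CONDITION = re.compile(r"\(([^()]+?)\s*(?:>=|<=|>|<|=)\s*[^()]+\)")

def attribute_counts(rules):
    """Count how often each attribute appears in the condition part of the rules."""
    counts = Counter()
    for rule in rules:
        antecedent = rule.split("=>", 1)[0]  # drop the conclusion and coverage figures
        counts.update(CONDITION.findall(antecedent))
    return counts

def unused_attributes(all_attributes, counts):
    """Return the attributes that never appear in any rule."""
    return sorted(set(all_attributes) - set(counts))

counts = attribute_counts(RULES)
for attribute, n in counts.most_common():
    print(n, attribute)

# Hypothetical candidate list; the last name is invented for illustration.
candidates = list(counts) + ["hypothetical unused attribute"]
print(unused_attributes(candidates, counts))
```

On the two quoted rules, the torque attribute is counted twice and the force attribute once, which matches the counts stated in the text. Attributes returned by `unused_attributes` are candidates for removal when reducing dimensionality.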
The results show that the default settings of running speed, threading speed, force, and torque appear in the rules, while the thickness draft of each pass does not. We look into the data for evidence that explains this finding. First, the thickness draft of each pass differs only slightly from one data record to another, which may be why thickness does not appear. Second, records with exactly the same slab properties (such as family) and the same finished-product size may have different settings on the mill; some setting combinations are rare compared with the other records sharing those slab properties, and these rare settings are usually accompanied by slip. We consider this phenomenon one of the causes of slip.
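The check for rare setting combinations described above can be sketched as follows. This is only an illustration under stated assumptions: the field names (`family`, `product_size`, `setting`, `slip`), the 25% rarity threshold, and the toy records are all hypothetical and do not come from the mill data.

```python
from collections import Counter, defaultdict

def rare_setting_records(records, max_share=0.25):
    """Return records whose mill setting is rare among records with the same slab properties."""
    groups = defaultdict(list)
    for rec in records:
        # Records are comparable only if slab properties and product size match.
        groups[(rec["family"], rec["product_size"])].append(rec)

    rare = []
    for group in groups.values():
        setting_counts = Counter(rec["setting"] for rec in group)
        for rec in group:
            share = setting_counts[rec["setting"]] / len(group)
            if share < max_share:
                rare.append(rec)
    return rare

# Toy records invented for illustration only.
records = [
    {"family": 27, "product_size": "A", "setting": "S1", "slip": False},
    {"family": 27, "product_size": "A", "setting": "S1", "slip": False},
    {"family": 27, "product_size": "A", "setting": "S1", "slip": False},
    {"family": 27, "product_size": "A", "setting": "S1", "slip": False},
    {"family": 27, "product_size": "A", "setting": "S2", "slip": True},
]

rare = rare_setting_records(records)
print(rare)
```

Comparing the share of slip records among the rare settings with the share among the remaining records gives a simple way to examine whether rare settings coincide with slip.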
6.4 Summary
Using data mining techniques, we narrow the range to be explored for the problem that occurs in a rough mill. The attributes chosen in our experiments are considered reasonable, and we find that JRip and J48 are good at capturing important real-valued attributes, while ROUSER is good at capturing important nominal attributes. The results also respond to the experts' doubts about the default settings. Following these narrowed clues, we look into the data and find some evidence that the default settings may be one cause of slip.
CHAPTER 7
CONCLUSIONS AND FUTURE WORK
A rule-based classification algorithm named ROUSER is proposed. It is designed to process nominal data and generate human understandable decision rules. ROUSER uses a rough set approach as its search heuristic, and the rule generation method of ROUSER is based on the separate-and-conquer strategy.
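The separate-and-conquer strategy mentioned above can be illustrated with a generic Python sketch for nominal data. It is not ROUSER itself: the rough-set heuristics PotBound and DiscPow are not reproduced here, and a plain precision score is swapped in as a placeholder heuristic. All function names and the toy data are assumptions.

```python
def covers(rule, features):
    """A rule is a dict of attribute -> required nominal value."""
    return all(features.get(attr) == value for attr, value in rule.items())

def learn_one_rule(examples, target):
    """Conquer: greedily add attribute=value tests until only `target` examples are covered."""
    rule, covered = {}, list(examples)
    while any(label != target for _, label in covered):
        best = None  # (precision, attribute, value)
        for features, _ in covered:
            for attr, value in features.items():
                if attr in rule:
                    continue
                labels = [lab for feat, lab in covered if feat.get(attr) == value]
                precision = labels.count(target) / len(labels)  # placeholder heuristic
                if best is None or precision > best[0]:
                    best = (precision, attr, value)
        if best is None:
            break  # no attribute left to test; contradictory examples remain
        rule[best[1]] = best[2]
        covered = [(f, lab) for f, lab in covered if covers(rule, f)]
    return rule

def separate_and_conquer(examples, target):
    """Learn rules for `target` one at a time, removing covered examples after each rule."""
    rules, remaining = [], list(examples)
    while any(label == target for _, label in remaining):
        rule = learn_one_rule(remaining, target)
        if not any(lab == target and covers(rule, f) for f, lab in remaining):
            break  # safeguard: the rule covers no target example, so stop
        rules.append(rule)
        # Separate: drop every example the new rule covers.
        remaining = [(f, lab) for f, lab in remaining if not covers(rule, f)]
    return rules

# Toy nominal data invented for illustration only.
examples = [
    ({"color": "red", "shape": "round"}, "yes"),
    ({"color": "red", "shape": "square"}, "yes"),
    ({"color": "blue", "shape": "round"}, "no"),
    ({"color": "blue", "shape": "square"}, "no"),
]

rules = separate_and_conquer(examples, "yes")
print(rules)
```

On the toy data a single rule testing `color` is enough to cover every `yes` example. A learner in this style differs from others mainly in the heuristic used inside `learn_one_rule`, which is where a rough-set measure would take the place of the precision score.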
As a prototype without an optimization stage or a pruning stage to reduce errors, ROUSER still provides classification performance comparable to or even better than that of the rule-based and tree-based classification algorithms considered in the experiments.
Since the search heuristics of ROUSER are entirely different from the search heuristics (Entropy and Information Gain) used by the other three algorithms, the results imply that the proposed PotBound and DiscPow are useful. This also shows the potential of ROUSER and suggests directions for future work.
For future work, we plan to conduct more experiments, develop better strategies to select attributes and handle contradictions, and apply ROUSER to data sets obtained from a real-world case study.