IV. Result Analysis and Evaluations
4.1 Decision Tree Analysis
國立政治大學 National Chengchi University
“usable and not usable”, and “warm and not warm”. The selection processes for the full and dynamic questionnaires of each target approach are shown in Tables 6, 10 and 14, respectively. The comparisons of the precision rates and used attributes between the full and dynamic questionnaires for each target category are shown in Tables 7, 11 and 15, respectively. The decision rules and the most-used attributes of “excited and unexcited” for the dynamic questionnaire are shown in Tables 8 and 9, respectively. The decision rules and the most-used attributes of “usable and non-usable” for the full questionnaire are shown in Tables 12 and 13, respectively. These are analysed and discussed in detail below.
4.2.1. Analysis of “excited and unexcited” for Multi-staged Binary Tree
4.2.1.1. The process of full and dynamic questionnaires selection of “excited and unexcited” for Multi-staged Binary Tree
To generate the trees, the researcher used 600 data samples, randomly assigning 80% of them to the training set and the remaining 20% to the testing set. The purpose of a dynamic questionnaire is to contain fewer questions (used attributes) while keeping its precision rate as close as possible to that of the original (full) questionnaire. The researcher applied two methods, similar to those used for the Decision Tree, to determine two dynamic questionnaires for comparison with the full questionnaire. The difference is that the Multi-staged Binary Tree determines a tree in two steps: it first classifies “excited and unexcited”, and then classifies “unexcited” into the “usable” and “warm” perceptions.
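The sampling and the two-step classification described above can be sketched as follows. This is a minimal illustration only: the function names `split_samples` and `classify_two_stage`, the fixed seed, and the use of plain callables for the two classifiers are assumptions, not the tool or code the researcher used.

```python
import random


def split_samples(samples, train_fraction=0.8, seed=None):
    """Randomly split samples into a training set and a testing set."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def classify_two_stage(record, stage_one, stage_two):
    """Step one separates excited from unexcited; step two refines unexcited.

    stage_one and stage_two are hypothetical callables standing in for the
    trained trees: stage_one returns "excited" or "unexcited", and stage_two
    returns "usable" or "warm".
    """
    label = stage_one(record)
    if label == "excited":
        return label
    return stage_two(record)


# 600 samples, as in the text; the integers stand in for questionnaire records.
train, test = split_samples(range(600), seed=0)
print(len(train), len(test))  # 480 120
```

Because the split is random, each run without a fixed seed yields a different training set, which is why repeated runs can produce trees with different used attributes and precision rates.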
Table 6: The tree results of dynamic questionnaires for “excited and unexcited” of Multi-staged Binary Tree⁹
Because the samples were selected at random, this setting generated different trees with different used attributes and precision rates. The procedure for determining the first dynamic tree was (1) to keep generating trees until a tree containing all 23 attributes was obtained, which is called the full questionnaire, and (2) to compare the number of attributes used and the precision rates among the trees. The tree with the fewest used attributes and the higher precision rate was selected as the first dynamic questionnaire. The tree results for “excited and unexcited” of the Multi-staged Binary Tree are shown in Table 6. In Table 6, the input attributes of step one are highlighted in purple and the input attributes of step two are highlighted in orange. The researcher illustrates the details of tree generation as follows:

⁹ ① means used attributes in step one; ② means used attributes in step two.
The attributes used were decided by trial and error. At first, the researcher did not know which input attributes belonged to step one and which to step two of the classification for generating the first tree. Therefore, the researcher used all 23 attributes as input and “excited and unexcited” as the target output. The attributes that appeared in the classification result became the input attributes of step one for generating the first tree, and the attributes that did not appear became the input attributes of step two. With this method, the researcher found that 18 attributes appeared, which became the input attributes of step one of Tree 1 (highlighted in light and dark purple for Tree 1 in Table 6); the remaining 5 attributes, which did not appear, became the input attributes of step two (highlighted in orange and dark brown for Tree 1 in Table 6).
With this combination of input attributes, Tree 1 uses a total of 17 attributes, which is the best solution for giving suggestions when applied to a tourism recommendation system for users. It has a precision rate of 67.333%, comprising a training rate of 68.542% and a testing rate of 62.50%. It includes 13 used attributes in the first step, which are Attributes 1-3, 7-10, 13, 15-17, 21 and 23 (highlighted in light purple in Table 6); and 4 used attributes in the second step, which are Attributes 5, 6, 11 and 20 (highlighted in orange in Table 6).
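The relation between the three rates reported for Tree 1 can be checked with a short calculation. The text does not state how the overall precision rate was computed; the sketch below assumes it is the sample-weighted average of the training and testing rates under the 80/20 split, and the function name `overall_precision` is hypothetical.

```python
def overall_precision(train_rate, test_rate, train_fraction=0.8):
    """Sample-weighted average of training and testing precision rates (%).

    ASSUMPTION: the overall rate is weighted by the 80/20 split; the text
    reports the three rates but not the formula that links them.
    """
    return train_fraction * train_rate + (1.0 - train_fraction) * test_rate


# Tree 1 rates as reported in the text.
tree1_overall = overall_precision(68.542, 62.50)
print(f"{tree1_overall:.2f}%")  # 67.33%
```

Under this assumption the result agrees with the reported 67.333% up to rounding of the published rates.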
To generate further trees by trial and error while remaining compatible with the previous tree, the procedure took the used attributes from the previous tree's result as the input attributes of the next tree. For Tree 2, the input attributes of step one were the used attributes of step one (highlighted in light purple in Table 6) and the unused attributes of step two (highlighted in dark brown in Table 6) of Tree 1; the input attributes of step two were the unused attributes of step one (highlighted in dark purple in Table 6) and the used attributes of step two (highlighted in orange in Table 6) of Tree 1. The result shows that Tree 2 has 18 used attributes with a precision rate of 66.667%, comprising a training rate of 67.708% and a testing rate of 62.50%. It includes 12 used attributes in step one, which are Attributes 1-3, 7-10, 15-17, 21 and 22 (highlighted in light purple in Table 6); and 6 used attributes in step two, which are Attributes 4, 6, 11, 14, 18 and 20 (highlighted in orange in Table 6).
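The rule for passing attributes from one tree to the next amounts to two set unions, sketched below. The function name `next_tree_inputs` is hypothetical. The used attributes of Trees 1 and 2 are taken from the text; the split of Tree 1's unused attributes between the two steps is not listed in the text and is inferred here from the Tree 2 results, so it should be checked against Table 6.

```python
def next_tree_inputs(step1_used, step1_unused, step2_used, step2_unused):
    """Return (step-one inputs, step-two inputs) for the next tree."""
    next_step1 = set(step1_used) | set(step2_unused)
    next_step2 = set(step1_unused) | set(step2_used)
    return next_step1, next_step2


# Tree 1 used attributes, as reported in the text.
tree1_step1_used = {1, 2, 3, 7, 8, 9, 10, 13, 15, 16, 17, 21, 23}
tree1_step2_used = {5, 6, 11, 20}
# ASSUMPTION: inferred split of Tree 1's six unused attributes (5 in step
# one, 1 in step two); the text gives only the counts, not the members.
tree1_step1_unused = {4, 12, 14, 18, 19}
tree1_step2_unused = {22}

tree2_step1_inputs, tree2_step2_inputs = next_tree_inputs(
    tree1_step1_used, tree1_step1_unused, tree1_step2_used, tree1_step2_unused
)

# Tree 2 used attributes, as reported in the text.
tree2_step1_used = {1, 2, 3, 7, 8, 9, 10, 15, 16, 17, 21, 22}
tree2_step2_used = {4, 6, 11, 14, 18, 20}

# A tree can only use attributes it was given as input.
print(tree2_step1_used <= tree2_step1_inputs)  # True
print(tree2_step2_used <= tree2_step2_inputs)  # True
```

Every attribute stays available to exactly one of the two steps, so the 23 attributes remain partitioned between step one and step two from tree to tree.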
Trees 3 and 4 were generated with the same procedure as Tree 2. Taking Tree 3 as an example, its input attributes in step one are the used attributes of step one (highlighted in light purple in Table 6) and the unused attributes of step two (highlighted in dark brown in Table 6) of Tree 2. Its input attributes in step two are the used attributes of step two (highlighted in orange in Table 6) and the unused attributes of step one (highlighted in dark purple in Table 6) of Tree 2. The results of these two trees in Table 6 show that (1) Tree 3 has twenty-two used attributes with a precision rate of 69.333%, comprising a training rate of 70.417% and a testing rate of 65%. It includes fifteen used attributes in step one, which are Attributes 1-3, 5, 7-10, 12, 15-17, 19, 21 and 22; and seven used attributes in step two, which are Attributes 4, 6, 11, 13, 14, 18 and 23. (2) Tree 4 has twenty-three used attributes with a precision rate of 71.333%, comprising a training rate of 72.292% and a testing rate of 67.50%. It includes sixteen used attributes in step one, which are Attributes 1-3, 5, 7-10, 12, 15-17 and 19-22; and seven used attributes in step two, which are Attributes 4, 6, 11, 13, 14, 18 and 23.
As a result, Tree 4 contains all the attributes, so its attributes and precision rate are used for the full questionnaire (highlighted in green in Table 6).
In addition, for the first dynamic questionnaire, the researcher selected from these four trees the tree with the fewest used attributes and the higher (appropriate) precision rate. Tree 3 differs only marginally from the full questionnaire of Tree 4, which makes it ineffective as a dynamic questionnaire. Tree 1 uses fewer attributes than Tree 2 and has the higher precision rate (67.333% against 66.667%).
Thus, Tree 1 is the first dynamic questionnaire of “excited and unexcited” (highlighted in blue in Table 6).
Secondly, the second dynamic questionnaire was intended to contain as few attributes as possible. It was generated from the most important used attributes of the first four trees in Table 6. The researcher first identified 13 most important used attributes (highlighted in yellow in Table 6) from the first four trees, of which 11 were used in step one (highlighted in light and dark purple in Table 6) and 2 were used in step two (highlighted in orange and dark brown in Table 6), and then used these attributes to generate trees.
For Tree 5, the input attributes of step one are the 11 important used attributes of step one, which are Attributes 1-3, 7-10, 15-17 and 21; the input attributes of step two are the 2 important used attributes, which are Attributes 6 and 11. The result in Table 6 shows that Tree 5 has eleven used attributes with a precision rate of 58%, comprising a training rate of 58.542% and a testing rate of 55.833%. It includes ten used attributes in step one, which are Attributes 2, 3, 7-10, 15-17 and 21; and one used attribute in step two, which is Attribute 6.
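The text does not define how the "most important" used attributes were chosen. One reading that is consistent with the attribute lists reported for Trees 1 to 4 is that they are the attributes used in the same step by all four trees. The sketch below checks that reading; it is an interpretation, not the researcher's stated criterion.

```python
# Used attributes per step for Trees 1-4, as reported in the text.
step1_used = [
    {1, 2, 3, 7, 8, 9, 10, 13, 15, 16, 17, 21, 23},               # Tree 1
    {1, 2, 3, 7, 8, 9, 10, 15, 16, 17, 21, 22},                   # Tree 2
    {1, 2, 3, 5, 7, 8, 9, 10, 12, 15, 16, 17, 19, 21, 22},        # Tree 3
    {1, 2, 3, 5, 7, 8, 9, 10, 12, 15, 16, 17, 19, 20, 21, 22},    # Tree 4
]
step2_used = [
    {5, 6, 11, 20},                  # Tree 1
    {4, 6, 11, 14, 18, 20},          # Tree 2
    {4, 6, 11, 13, 14, 18, 23},      # Tree 3
    {4, 6, 11, 13, 14, 18, 23},      # Tree 4
]

# ASSUMPTION: "most important" = used in the same step by every tree.
important_step1 = set.intersection(*step1_used)
important_step2 = set.intersection(*step2_used)

print(sorted(important_step1))  # [1, 2, 3, 7, 8, 9, 10, 15, 16, 17, 21]
print(sorted(important_step2))  # [6, 11]
```

Under this reading the intersections reproduce the 11 step-one and 2 step-two input attributes given for Tree 5.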
This method of tree generation differs from that of the first four trees because the purpose of the dynamic questionnaire is to use the fewest attributes in the classification. Tree 5 is the first tree generated from the most important used attributes in Table 6. Because the number of used attributes was still decreasing, the researcher kept generating trees, each time taking the used attributes of the previous tree as the input attributes of the next, and stopped when the number of used attributes no longer decreased. For Tree 6, the input attributes of step one and step two are the used attributes of step one and step two of Tree 5, respectively. The result shows that Tree 6 has eleven used attributes with a precision rate of 58%, comprising a training rate of 58.542% and a testing rate of 55.833%. It includes ten used attributes in step one, which are Attributes 2, 3, 7-10, 15-17 and 21; and one used attribute in step two, which is Attribute 6. Compared with Tree 5, there is no change in either the used attributes or the precision rate, so Tree 6 is the final tree of the “excited and unexcited” classification.
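The stopping rule described above can be sketched as a loop that refits until no attribute is dropped. The names `shrink_until_stable` and `reported_fit` are hypothetical, and `reported_fit` is only a stand-in that reproduces the outcomes reported for Trees 5 and 6 (Attributes 1 and 11 were supplied but not used); it is not a real tree learner.

```python
def shrink_until_stable(fit_tree, step1_inputs, step2_inputs):
    """Refit with the previously used attributes until none are dropped.

    fit_tree is a hypothetical callable that takes the two input sets and
    returns the attributes the resulting tree actually used in each step.
    """
    history = []
    while True:
        used1, used2 = fit_tree(step1_inputs, step2_inputs)
        history.append((set(used1), set(used2)))
        n_used = len(used1) + len(used2)
        n_inputs = len(step1_inputs) + len(step2_inputs)
        if n_used >= n_inputs:  # no further reduction: stop
            return history
        step1_inputs, step2_inputs = set(used1), set(used2)


def reported_fit(step1_inputs, step2_inputs):
    """Stand-in mirroring the reported results: 1 and 11 are never used."""
    return set(step1_inputs) - {1}, set(step2_inputs) - {11}


history = shrink_until_stable(
    reported_fit,
    {1, 2, 3, 7, 8, 9, 10, 15, 16, 17, 21},  # Tree 5 step-one inputs
    {6, 11},                                  # Tree 5 step-two inputs
)
print(len(history))  # 2 (Tree 5, then Tree 6 with no further reduction)
```

The loop terminates as long as each fit uses only attributes it was given, since the number of inputs can then only shrink or stay the same.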
Furthermore, for the second dynamic questionnaire, the researcher selected the tree with the fewest used attributes from Trees 5 and 6, which were generated from the important used attributes. Tree 6 has the same number of attributes and the same precision rate as Tree 5, so it offers no improvement over Tree 5.
Thus, Tree 5 is the second dynamic questionnaire for the “excited and unexcited” target approach (highlighted in pink in Table 6).
In summary, Tree 4 is the full questionnaire for “excited and unexcited” of the Multi-staged Binary Tree (highlighted in green in Table 6).
Tree 3 differs only marginally from Tree 4, which makes it ineffective as a dynamic questionnaire. Tree 1 uses fewer attributes than Tree 2 and has the higher precision rate. Thus, Tree 1 is a dynamic questionnaire (highlighted in blue in Table 6). Tree 5 is also a dynamic questionnaire, as Tree 6 has the same number of attributes and the same precision rate and therefore offers no improvement over Tree 5. Tree 1, Tree 4 and Tree 5 are analysed in the next section.
4.2.1.2. Precision rate analysis for “excited and unexcited”
In the analysis of “excited and unexcited” under the full and dynamic questionnaires (Table 7) classified by the Multi-staged Binary Tree, the full questionnaire uses 23 attributes with a precision rate of 71.333%. There are 15 attributes in the first step of the full questionnaire: main reasons, total expenditure, causation of participation, weather, guests attending the event, outside facilities around the event, length of the event, time, reference group, relationships with the person who goes with, number of people to go with, sex, impression, the way to make a decision and lifestyle.
There are 8 attributes in the second step, which are traffic condition, economics, transportation, attraction of the content and the planning of the event, relationships among the participants, age, personalities and interest.
For the dynamic questionnaire, the researcher tested two conditions to see which dynamic questionnaire's precision rate is closer to that of the full questionnaire.
For the first condition (Dynamic Q1), 17 attributes are used in total with a precision rate of 67.333%. The first step has 13 attributes: main reasons, total expenditure, causation of participation, guests attending the event, outside facilities around the event, length of the event, time, attraction of the content and the planning of the event, relationships with the person who goes with, number of people to go with, sex, the way to make a decision and interest. The 4 attributes in the second step are weather, economics, transportation and personalities.
For the second condition (Dynamic Q2), 11 attributes are used in total with a precision rate of 58.00%. The first step has 10 attributes: total expenditure, causation of participation, guests attending the event, outside facilities around the event, length of the event, time, relationships with the person who goes with, number of people to go with, sex and the way to make a decision. The single used attribute in the second step is economics.
In conclusion, Table 7 indicates a notable difference between the precision rates of the full and dynamic questionnaires. The full questionnaire has a precision rate of 71.333%, and the dynamic questionnaire closest to it is Dynamic Q1 at 67.333%, a difference of 4 percentage points. Dynamic Q2, by contrast, differs by 13.333 percentage points, which is more significant and indeed material. Furthermore, the full questionnaire uses 23 attributes, whereas Dynamic Q1 uses 17 and Dynamic Q2 uses 11. The number of attributes used therefore decreases by 6 from the full questionnaire to Dynamic Q1, and by 12 from the full questionnaire to Dynamic Q2. Dynamic Q1 is thus more suitable than Dynamic Q2 as the dynamic questionnaire of the “excited and unexcited” approach, as it reduces the attributes while keeping the precision rate closer to that of the full questionnaire. It will be discussed and compared with the dynamic questionnaires of the other approaches of the Multi-staged Binary Tree.
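The comparison can be reproduced with simple arithmetic. The sketch below uses the attribute counts and precision rates reported for Tree 4 (full), Tree 1 (Dynamic Q1) and Tree 5 (Dynamic Q2) in Section 4.2.1.1; the dictionary layout and the function name `compare_to_full` are illustrative assumptions.

```python
# Attribute counts and precision rates (%) as reported for Trees 4, 1 and 5.
results = {
    "Full": {"attributes": 23, "precision": 71.333},
    "Dynamic Q1": {"attributes": 17, "precision": 67.333},
    "Dynamic Q2": {"attributes": 11, "precision": 58.000},
}


def compare_to_full(results, name, baseline="Full"):
    """Return (attributes saved, precision lost in percentage points)."""
    base, other = results[baseline], results[name]
    saved = base["attributes"] - other["attributes"]
    lost = round(base["precision"] - other["precision"], 3)
    return saved, lost


for name in ("Dynamic Q1", "Dynamic Q2"):
    saved, lost = compare_to_full(results, name)
    print(f"{name}: {saved} fewer attributes, {lost:.3f} points lower precision")
```

Dynamic Q1 gives up 4 percentage points for 6 fewer questions, while Dynamic Q2 gives up 13.333 percentage points for 12 fewer questions.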
Table 7: The comparison of the precision rate and used attributes between full and dynamic questionnaires in “excited and unexcited”¹⁰
Attribute | Full Questionnaire | Dynamic Questionnaire
Attraction of the content and the planning of the event | ② | ①
4.2.1.3. Decision rule analysis for “excited and unexcited”
Appendix B shows the diagram of the Multi-staged Binary Tree of “excited and unexcited” for the dynamic questionnaire. The first step classifies the excited and unexcited categories, and the second step classifies the unexcited category into the usable and warm categories. The decision rules from Appendix B that the researcher applied are listed in Table 8. The decision rules highlighted in yellow contain reasonable attribute contents; the unreasonable decision rules are not highlighted. The step-one decision rules of “excited” are analysed below, whereas those of “unexcited” are not. The step-two decision rules of “usable” and “warm” are also analysed below.
For the solutions of “excited”, there are eight possibilities to produce an excited perception. For example, under Rule 2, an event held at night, attended by a crowd of guests, with a very close or average relationship with the accompanying person, with a main reason of being attracted by news or of the event being a festival, and with a female attendee, will produce an excited perception. Moreover, these eight decision rules include both reasonable and unreasonable ones.
Decision rules 1, 2, 6 and 7 contain reasonable attributes to produce an excited perception. Taking decision rule 1 as an example, attending a crowded event at night, with a main reason of being recommended by others, attending at random, a birthday, having attended before or being attracted by an advertisement, will create an excited perception during the event. That is especially the case when the person is interested in food, art, beauty, fashion, shopping, leisure or sport, and attends together with people with whom the relationship is very close or average. However, decision rules 3-5 and 8 are unreasonable for an excited perception. An example is Rule 5, where the person is interested in art, sport or leisure and attends the event at night, with an average or small number of guests and a total expenditure above $3001, even though the content and planning of the event are attractive. This would not be expected to give an excited perception. Thus, this decision rule is not reasonable for the excited perception content.
For the solutions of “usable”, there are three possibilities to produce a usable perception. For example, under Rule 21, the unexcited decision rule, combined with cheap economics and transportation that is average or hard to arrive, will produce a usable perception. Moreover, these three decision rules of the usable content comprise two reasonable rules and one unreasonable rule. Decision rules 21 and 22 contain reasonable attributes to produce a usable perception. For example, under Rule 22, the person feels unexcited at the first step and is introverted or extraverted; if the cost of the event is cheap, the location is easy to arrive at, and the weather is sunny, the event is perceived as usable at the second step. This is a reasonable usable decision content. However, there is one unreasonable decision rule for the usable content, which is Rule 20. Rule 20 describes a person who feels unexcited at the first step and is introverted, attending an event whose cost is expensive or average; even though the transportation to the event is easy, this would not be expected to give a usable perception, so it is not a reasonable content for a usable perception.
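A decision rule of this kind is a conjunction of attribute conditions that leads to one outcome. The sketch below encodes Rule 21 as described above; the dictionary layout, the attribute keys and the function name `rule_fires` are illustrative assumptions rather than the representation used by the researcher.

```python
# Rule 21 as described in the text: unexcited at step one, cheap economics,
# and transportation that is average or hard to arrive -> usable.
RULE_21 = {
    "conditions": {
        "step_one": {"unexcited"},
        "economics": {"cheap"},
        "transportation": {"average", "hard to arrive"},
    },
    "outcome": "usable",
}


def rule_fires(rule, record):
    """Return True when every condition of the rule matches the record."""
    return all(
        record.get(attribute) in allowed
        for attribute, allowed in rule["conditions"].items()
    )


# Hypothetical respondent record, for illustration only.
respondent = {
    "step_one": "unexcited",
    "economics": "cheap",
    "transportation": "hard to arrive",
}
if rule_fires(RULE_21, respondent):
    print(RULE_21["outcome"])  # usable
```

Storing the allowed values of each attribute as a set keeps alternatives such as "average or hard to arrive" in a single condition.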
For the solutions of “warm”, there are three possibilities to produce this perception. For example, under Rule 24, the unexcited decision rule, combined with expensive or average economics and an extraverted personality, will produce a warm perception. Moreover, these three decision rules of the warm content comprise two reasonable rules and one unreasonable rule. Decision rules 24 and 25 contain reasonable attributes to produce a warm perception. Under decision rule 25, the “unexcited” decision rule, combined with cheap economics, transportation to the event that is easy, and weather that is cloudy or raining, is a reasonable content for a warm perception. However, there is one unreasonable decision rule for the warm content. Decision rule 23 describes the “unexcited” decision rule combined with cheap economics and transportation to the event that is average or hard to arrive, for a person who is introverted, which is not a reasonable content for a warm perception.

Table 8: Decision rules of “excited and unexcited” for dynamic questionnaire¹¹
Excited Content
1. Time night + crowded guests attend the event + very close/average relationships with the person who goes with + main reasons of recommended by others/randomly/birthday/attended before/attracted from advertisement + interest of food/art/beauty/fashion/shopping/leisure/sport
2. Time night + crowded guests attend the event + very close/average relationships with the person who goes with + main reasons of attracted from news/festival + sex of female
3. Time night + average/less guests attend the event + average content and planning of the event + 2-3 people to go with + causations of number of people goes with/outside facilities
4. Time night + average/less guests attend the event + average content and planning of the event + 1/above 4 people to go with
¹¹ The attribute is distinguished in bold font, and the value of the attribute is underlined and set in italic font. The decision rules highlighted in yellow contain reasonable attribute contents.
5. Time night + average/less guests attend the event + attractive content and planning of the event +