
CHAPTER 6 EVALUATION

6.3 EXPERIMENT 1

6.3.3 Result of Experiment 1’s Data

National Chengchi University


The first objective of Experiment 1 is to find out which kind of provided anchor yields adjustment indexes that best measure the anchoring effect and fulfill assumption 2. We first use scatter plots to observe the distribution and the mean of the adjustment index for each attribute. We then group the attributes by category and observe the distribution and the mean again, to infer which category's adjustment indexes are best suited to serve as the measurement. Finally, we use ANOVA (analysis of variance) to test more rigorously whether assumption 2 is fulfilled. ANOVA is a collection of statistical models used to analyze the differences between group means. It provides a statistical test of whether or not the means of several groups are equal (Drummond et al., 2012).

To analyze the collected data, we first use the adjustment index formula to convert the adjustment values into adjustment indexes.

However, in this process, we found that most subjects did not adjust the value of the e-shopping service's attribute “amount of goods”. As a result, the median of this attribute's adjustment values is zero, so its adjustment index cannot be calculated from the formula.

After the conversion, we can draw the scatter plots of the adjustment index. From these charts (Figure 6.3.1 and Figure 6.3.2), we can see that most of the adjustment indexes are below 2. Category 2 is clearly more dispersed than the other two categories. For example, the values of “frequency of promotion activities” and “days of free trial” are clearly not concentrated.

Category 1 is less dispersed, but still has some suspected outliers. For example, four data points of “monthly fee” are above 2. Category 3 has the lowest dispersion and seems more stable. From Figure 6.3.3 and Figure 6.3.4, we can see that the mean adjustment index of category 2 is clearly more extreme than those of the other two categories. That means the attributes in category 2 would not fulfill assumption 2, and their adjustment indexes cannot be used in cross-attribute comparison.

Figure 6.3.1 Scatter Plot of Adjustment Index

[Figure: "Scatter Plot of Attribute's Adjustment Index". Y-axis: Adjustment Index. X-axis: attribute number. Legend entries recoverable from the source: 1 = Shipping fee, 2 = Delivery time, 3 = Amount of goods, 4 = Frequency of promotion activities, 5 = Security of personal data, 13 = Amount of authorized mobile devices, 14 = Ease of use of UI, 15 = Update speed of new songs, 16 = Ease of sharing through social network.]

Figure 6.3.2 Scatter Plot of Attribute Category's Adjustment Index

[Figure: "Scatter Plot of Attribute Category's Adjustment Index". Y-axis: Adjustment Index. X-axis: 1 = Countable with Limitation, 2 = Countable without Limitation, 3 = Uncountable with Limitation.]

Figure 6.3.3 Attribute’s Mean of Adjustment Index

Figure 6.3.4 Attribute Category’s Mean of Adjustment Index

Therefore, we infer that category 3 may be the most suitable kind of provided anchor for satisfying assumption 2, “each of the provided anchor’s mean of its adjustment indexes must be equal”. To confirm the inference more rigorously, we further set up the hypothesis that the mean adjustment indexes of all attributes in category 3 are equal (see Figure 6.3.5) and use ANOVA to test it. Table 6.3.6 lists the test targets:

Table 6.3.6 List of the test targets

Number of test attribute   Attribute name
1                          Security of personal data
2                          Convenience of refund
3                          Ease of use of Web UI (e-shopping)
4                          Ease of use of operation
5                          Ease of use of UI (online music service)
6                          Update speed of new songs
7                          Ease of sharing through social network


Figure 6.3.5 Hypothesis of the testing for attribute category 3

We used IBM's statistical software SPSS (Statistical Product and Service Solutions) to run the ANOVA test. For the result to be valid, the ANOVA assumption of homogeneity of variances has to be fulfilled. We tested this assumption in SPSS using Levene's test for homogeneity of variances.

Table 6.3.7 The result of Levene's test for homogeneity of variances

Test of Homogeneity of Variances

Levene Statistic   df1   df2   Sig.
24.587             6     770   0.000
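As a rough illustration of what Levene's test computes, the sketch below implements the mean-centered form of the statistic in plain Python. The function name `levene_statistic` and the sample numbers are hypothetical and are not the thesis data; the analysis itself was run in SPSS, and obtaining the p-value would additionally require the F distribution with (df1, df2) degrees of freedom.

```python
from statistics import fmean


def levene_statistic(groups):
    """Return (W, df1, df2) for Levene's test with mean centering.

    `groups` is a list of lists of numbers, one inner list per group.
    Assumes at least one group has unequal absolute deviations;
    otherwise the within-group term is zero and W is undefined.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)

    # Absolute deviation of each observation from its own group mean.
    z = []
    for g in groups:
        centre = fmean(g)
        z.append([abs(x - centre) for x in g])

    z_means = [fmean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total

    # One-way ANOVA F ratio computed on the absolute deviations.
    between = sum(len(zi) * (m - z_grand) ** 2 for zi, m in zip(z, z_means))
    within = sum((v - m) ** 2 for zi, m in zip(z, z_means) for v in zi)

    df1, df2 = k - 1, n_total - k
    return (between / df1) / (within / df2), df1, df2


# Hypothetical data: three groups with increasing spread.
w, df1, df2 = levene_statistic([[1, 2, 3], [2, 4, 6], [0, 5, 10]])
```

With k groups and N observations the degrees of freedom are k − 1 and N − k, which agrees with df1 = 6 and df2 = 770 in Table 6.3.7 for 7 attributes (implying 777 observations).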

According to Table 6.3.7, the Levene statistic is 24.587 and its significance level is below 0.001 (p < 0.001). This means our data fail the equal-variances assumption: the group variances are significantly different. We therefore need to run a Welch ANOVA in SPSS instead of a regular one-way ANOVA, because Welch's test does not assume that all groups have the same variance (Welch 1951; Brown and Forsythe 1974b; Asiribo, Osebekwin, and Gurland 1990). The result of the Welch ANOVA is shown in Table 6.3.8.

$H_0$: The means of the AIs are the same for all attributes in category 3.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6 = \mu_7$

$H_1$: The mean scores are not the same.

$H_1$: $H_0$ is not true.

$\mu$ = population mean.


Table 6.3.8 The result of Welch ANOVA for attribute category 3

Welch ANOVA

Welch Statistic   df1   df2       Sig.
12.630            6     340.389   0.000
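To make the Welch statistic and its fractional df2 in Table 6.3.8 less opaque, the following is a minimal sketch of Welch's one-way ANOVA in plain Python. The function name `welch_anova` and the sample data are hypothetical; the thesis results come from SPSS, and this sketch returns only the statistic and the degrees of freedom, not the p-value.

```python
from statistics import fmean, variance


def welch_anova(groups):
    """Return (F, df1, df2) for Welch's one-way ANOVA.

    `groups` is a list of lists of numbers; each group needs at least
    two observations and a non-zero sample variance.
    """
    k = len(groups)
    n = [len(g) for g in groups]
    means = [fmean(g) for g in groups]

    # Each group is weighted by n_i / s_i^2, so noisy groups count less.
    w = [ni / variance(g) for ni, g in zip(n, groups)]
    w_sum = sum(w)
    weighted_mean = sum(wi * mi for wi, mi in zip(w, means)) / w_sum

    numerator = sum(wi * (mi - weighted_mean) ** 2
                    for wi, mi in zip(w, means)) / (k - 1)
    lam = sum((1 - wi / w_sum) ** 2 / (ni - 1) for wi, ni in zip(w, n))
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam

    df1 = k - 1
    df2 = (k ** 2 - 1) / (3 * lam)  # generally not an integer
    return numerator / denominator, df1, df2


# Hypothetical data: two small groups whose means differ by 1.
f_stat, df1, df2 = welch_anova([[1, 2, 3], [2, 3, 4]])
```

Because df2 is derived from the group variances rather than from the sample size alone, it is usually fractional, as with the 340.389 reported above. For two groups the statistic reduces to the square of Welch's t statistic.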

The Welch statistic is 12.630 and its significance level is below 0.001 (p < 0.001) (see Table 6.3.8). Therefore, the null hypothesis H0 is rejected, which means there is a statistically significant difference in the mean adjustment indexes among the attributes in category 3. Because this result differs from our inference, we ran a post-hoc test, the Games-Howell test, to locate where the differences between the attributes lie (see Table 6.3.9). Post-hoc analyses are concerned with finding patterns or relationships between subgroups of sampled populations. For unequal variances, the Games and Howell test was consistently more powerful than all other procedures (Jaccard et al., 1984). The post-hoc result shows statistically significant differences between attribute 3, Ease of use of Web UI (e-shopping), and each of the other attributes; the significance levels are all below 0.001 (p < 0.001). We infer that the web user interfaces of e-shopping services are all similar, so the subjects could not imagine what an improvement of this attribute would mean. We may therefore consider the result of this attribute an outlier. Hence, we removed the data of attribute 3 and ran the Welch ANOVA again.

Table 6.3.9 The result of post-hoc test, Games-Howell for attribute category 3

Multiple Comparisons (Games-Howell)
Dependent Variable: Adjustment Index

(I) attribute   (J) attribute   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower Bound   95% CI Upper Bound
6               1                0.00                   0.04         1.00   -0.11                0.11
6               2               -0.04                   0.04         0.98   -0.16                0.09
6               3                0.39                   0.06         0.00    0.22                0.55
6               4               -0.03                   0.04         0.99   -0.15                0.09
6               5               -0.04                   0.04         0.98   -0.17                0.09
6               7               -0.15                   0.06         0.22   -0.35                0.04
7               1                0.15                   0.06         0.22   -0.04                0.34
7               2                0.12                   0.07         0.59   -0.08                0.32
7               3                0.54                   0.08         0.00    0.32                0.77
7               4                0.12                   0.07         0.52   -0.08                0.32
7               5                0.12                   0.07         0.60   -0.09                0.32
7               6                0.15                   0.06         0.22   -0.04                0.35
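The pairwise quantities behind a Games-Howell comparison can be sketched in plain Python as follows. The function name `games_howell_pairs` and the sample data are hypothetical, and the standard error shown is the usual unpooled form, which is an assumption rather than a confirmed description of the SPSS output. Turning q into a significance level requires the studentized range distribution for k groups, which this sketch leaves out.

```python
from itertools import combinations
from math import sqrt
from statistics import fmean, variance


def games_howell_pairs(groups):
    """Return a list of (i, j, mean_diff, std_error, q, df) per pair.

    i and j are zero-based group positions; mean_diff is mean_i - mean_j.
    """
    results = []
    for i, j in combinations(range(len(groups)), 2):
        gi, gj = groups[i], groups[j]
        # Variance of each group mean: s^2 / n (no pooling across groups).
        vi = variance(gi) / len(gi)
        vj = variance(gj) / len(gj)
        diff = fmean(gi) - fmean(gj)
        std_error = sqrt(vi + vj)
        # Welch-Satterthwaite degrees of freedom for this pair only.
        df = (vi + vj) ** 2 / (vi ** 2 / (len(gi) - 1) + vj ** 2 / (len(gj) - 1))
        # Studentized range statistic: Welch t scaled by sqrt(2).
        q = abs(diff) / std_error * sqrt(2)
        results.append((i, j, diff, std_error, q, df))
    return results


# Hypothetical data: three small groups.
pairs = games_howell_pairs([[1, 2, 3], [2, 3, 4], [5, 7, 9]])
```

Each pair uses only its own two variances and sample sizes, which is why the test remains usable when Levene's test rejects homogeneity of variances.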


Table 6.3.10 The result of Welch ANOVA for attribute category 3 without “Ease of use of Web UI (e-shopping)”

Welch ANOVA

Welch Statistic   df1   df2       Sig.
1.345             5     306.242   0.245

After removing “Ease of use of Web UI (e-shopping)”, the Welch statistic decreases to 1.345 and its significance level increases to 0.245 (p = 0.245), which is above 0.05 (see Table 6.3.10). Therefore, if we treat “Ease of use of Web UI (e-shopping)” as an outlier, the null hypothesis H0 is not rejected, which means there is no statistically significant difference in the mean adjustment indexes among the remaining attributes in category 3.

The second objective of Experiment 1 is to determine whether the preference for service attributes affects the level of the anchoring effect. We use two statistical methods, ANOVA and correlation, to examine the relation between the preference and the adjustment level of the anchoring effect. Because the previous test showed that the attributes in category 3 have the best quality, we consider only the attributes in category 3 (the attributes listed in Table 6.3.6) as the targets of this evaluation.

The questions used a Likert scale, so the responses have to be coded into numeric values before running the statistical tests. We coded the options according to Table 6.3.11:


Table 6.3.11 Coding of preference of service attributes

Option                       Value
Strongly disagree            1
Disagree                     2
Neither agree nor disagree   3
Agree                        4
Strongly agree               5
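The coding in Table 6.3.11 amounts to a simple lookup. A minimal sketch, assuming the responses are available as text labels (the names `LIKERT_CODES` and `code_preferences` are hypothetical):

```python
# Numeric codes taken from Table 6.3.11.
LIKERT_CODES = {
    "Strongly disagree": 1,
    "Disagree": 2,
    "Neither agree nor disagree": 3,
    "Agree": 4,
    "Strongly agree": 5,
}


def code_preferences(responses):
    """Map Likert option labels to numeric codes.

    Surrounding whitespace is ignored; an unknown label raises KeyError
    so that data-entry mistakes are not silently coded.
    """
    return [LIKERT_CODES[label.strip()] for label in responses]


# Hypothetical responses from three subjects.
coded = code_preferences(["Agree", "Strongly disagree", "Neither agree nor disagree"])
```

The codes preserve only the order of the options, which is why the later analysis treats the preference level as an ordinal variable.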

To test whether the preference affects the adjustment level of the anchoring effect, we set up a hypothesis (Figure 6.3.6):

Figure 6.3.6 Hypothesis of the testing for the preference

As before, we first run Levene's test for homogeneity of variances (Table 6.3.12):

Table 6.3.12 The result of Levene's test for homogeneity of variances

Test of Homogeneity of Variances

Levene Statistic   df1   df2   Sig.
4.114              4     772   0.003

$H_0$: The means of the AIs are the same for all preference levels.

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$

$H_1$: The mean scores are not the same.

$H_1$: $H_0$ is not true.

$\mu$ = population mean.


The Levene statistic is 4.114 and its significance level is 0.003 (p = 0.003), which is below 0.01. Therefore, we need to run a Welch ANOVA test:

Table 6.3.13 The result of Welch ANOVA for preference levels

Welch ANOVA

Welch Statistic   df1   df2      Sig.
8.854             4     39.202   0.000

From the result, we can see that the Welch statistic is 8.854 and its significance level is below 0.001 (p < 0.001) (see Table 6.3.13). Therefore, the null hypothesis H0 is rejected, which means there is a statistically significant difference in the mean adjustment indexes among the preference levels. To be more precise, we also want to know how the preference level affects the adjustment level of the anchoring effect. Spearman's rank correlation coefficient (Lehman, 2005) is a measure of the monotonic association between two variables X and Y, computed from their ranks. It takes a value between +1 and −1 inclusive, where +1 is a perfect increasing monotonic relationship, 0 is no monotonic association, and −1 is a perfect decreasing monotonic relationship. Spearman's coefficient is appropriate for both continuous and discrete variables, including ordinal variables. We use it to examine the relationship because the preference for service attributes is an ordinal variable.


Table 6.3.14 The Spearman's rank correlation coefficient for preference levels and adjustment index

Spearman's rho
Variables: Preference Level & AI

Correlation Coefficient   Sig. (2-tailed)   N
0.233                     0.000             777

From Table 6.3.14, we can see that Spearman's correlation coefficient is 0.233 (p < 0.001), which is statistically significant evidence of a positive, though modest, association between the two variables. That is, as the preference level of a service attribute increases, the adjustment level of the anchoring effect tends to increase, and vice versa.
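For readers who want to see how such a coefficient is obtained, the sketch below computes Spearman's rho in plain Python as the Pearson correlation of average ranks, which handles the many ties that a five-level preference scale produces. The function names and the sample data are hypothetical, not the thesis data, and the significance test is omitted.

```python
from math import sqrt
from statistics import fmean


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        tied_rank = (pos + end) / 2 + 1
        for idx in order[pos:end + 1]:
            ranks[idx] = tied_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks.

    Assumes neither variable is constant (otherwise the denominator is 0).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = fmean(rx), fmean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    spread = sqrt(sum((a - mx) ** 2 for a in rx) * sum((b - my) ** 2 for b in ry))
    return cov / spread


# Hypothetical data: coded preference levels and adjustment indexes.
rho = spearman_rho([1, 2, 2, 4, 5], [0.2, 0.5, 0.4, 0.9, 1.3])
```

Because only ranks enter the calculation, any strictly increasing relationship gives rho = 1 even when it is far from a straight line.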