CHAPTER 3 SOCIAL SUPPORT
A greater closeness coefficient value indicates that the alternative is simultaneously closer to the IFPIS and farther from the IFNIS. Hence, the ranking of all the alternatives can be determined in descending order of their closeness coefficient values.
Finally, the alternative with the highest ranking is the most preferred alternative.
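The ranking step can be illustrated with a minimal sketch. It assumes the standard TOPSIS closeness coefficient, CC = d_neg / (d_pos + d_neg), where d_pos and d_neg are the distances from an alternative to the IFPIS and the IFNIS. The function names and the distance values are hypothetical and are not taken from the experiments.

```python
def closeness_coefficient(d_pos: float, d_neg: float) -> float:
    """Standard TOPSIS closeness: d_neg / (d_pos + d_neg)."""
    total = d_pos + d_neg
    if total == 0:
        raise ValueError("alternative coincides with both ideal solutions")
    return d_neg / total


def rank_alternatives(distances):
    """Sort alternatives by descending closeness coefficient."""
    scored = [
        (name, closeness_coefficient(d_pos, d_neg))
        for name, (d_pos, d_neg) in distances.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical (distance to IFPIS, distance to IFNIS) pairs.
example_distances = {
    "camera1": (0.6, 0.2),
    "camera2": (0.2, 0.6),
    "camera3": (0.4, 0.4),
}
ranking = rank_alternatives(example_distances)
print(" > ".join(name for name, _ in ranking))
```

With these illustrative distances, the alternative that is nearest to the IFPIS and farthest from the IFNIS is placed first.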
3.2 Experiments
3.2.1 Experiment Source
In order to evaluate the proposed social appraisal support mechanism, we conduct experiments on both search goods and experience goods in the Plurk micro-blogosphere. According to a report from InRev Inc. [7], the Plurk micro-blogosphere is very popular in Taiwan, the Philippines, Indonesia, and the United States. According to the statistics of May 18, 2010, almost 50% of Plurk users are teenagers and 30% of users are aged 20 to 30. Because Plurk is predominantly used by teenagers and young adults for information sharing, we believe that it is an excellent platform for soliciting social appraisal support when users face a purchase decision.
Construction of the friend network. In the experiments, a total of 113 active Plurk users are invited to be support requesters. All of these qualified support requesters have made at least one purchase in the last three months. In addition, to ensure that a support requester has had sufficient time to evaluate his/her satisfaction with the purchased product, the latest purchase decision of a support requester should have been made more than one week ago. We construct the friend network by starting from these support requesters and expanding from them. Data descriptions of the experiments are outlined in Table 3.2. In the experiments, a total of 161 purchase decisions (88 for search goods and 73 for experience goods) are evaluated. A typical decision support request contains 3-5 alternatives, and on average 16 friends (decision supporters) reply to a request with their opinions. To analyze the companionship of the decision supporters who respond, we collected the post and response activity records of the last 6 months from the participants’ public Plurk interface. Figure 3.7 shows the visualization of the collected friend network.
Table 3.2 Data descriptions of the experiment
Statistics of the experiment data
Number of invited participants                                        113
Number of available social appraisal requests                         161
Average number of decision supporters per social appraisal request    16
Average number of friends per participant                             83
Average number of interactions per participant (6 months)             2,967
Average number of requests released per participant                   1.6
Average number of alternatives per social appraisal request           4.2
Figure 3.7 Visualization of collected social appraisal network
Table 3.3 Features of products in different categories
Digital Camera       Computer         MP3 Player       Cell Phone
Resolution           Processor        PC interface     Cellular tech.
Price                Memory           Flash memory     Specific absorption rate
Lens                 Video graphic    Dimension        Band/mode
Storage              Size of case     Weight           Wireless interface
Interfaces           Storage          Resolution       Weight
Exposure controls    Warranty         Battery tech.    Memory
Focus controls       Network          Battery life     Battery life
Flash modes          Audio
Construction of the decision criteria. Four kinds of search goods (“digital camera,” “computer,” “MP3 player,” and “cell phone”) and three kinds of experience goods (“restaurant,” “movie,” and “peripheral products”) are analyzed in the experiments. Note that “peripheral products” mainly refers to the peripheral products of mobile devices (e.g., cases and headsets for tablets or smartphones). As the features and characteristics of search goods can be explicitly evaluated by customers before purchasing, we pre-collect product features as the appraisal criteria from the buying guide of the CNE product review site. The pre-collected product categories and features of search goods are listed in Table 3.3. The participants were asked to initiate a request for decision support and disseminate it over their own social networks on the Plurk platform. For experience goods, we use semantic analysis of the microblog messages to extract the implicit decision criteria, as described in subsection 3.1.2.1.
(a) The synonymous adjective expansion (b) The first-level expansion of the adjective “good”
(c) The second-level expansion of the adjective “good”
(d) The final expanded synonymous adjective graph
Figure 3.8 Synonymous adjective graph creation
Construction of the adjective word graph. Figure 3.8 depicts the evolving process of the word set expansion. From Figure 3.8-(a), we can observe that the growth of the word set diminishes as the expansion proceeds. Altogether, 1,127 non-duplicate adjectives are included in the word set used for building the synonymous adjective graph. Figure 3.8-(b) and -(c) show an example of the two-level synonymous adjective expansion of the adjective “good.” According to WordNet, the word “good” has synonyms such as “full,” “estimable,” and “beneficial” in the first-level expansion. These extracted synonyms are used as the seed words for extracting the second-level synonyms of “good” in the second-level expansion, and so on. The final expanded synonymous adjective graph is shown in Figure 3.8-(d).
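The level-by-level expansion described above can be sketched as a breadth-first traversal. The sketch below is only an illustration: the dictionary TOY_SYNONYMS is a hypothetical stand-in for WordNet lookups, and the function name expand_synonyms is an assumption rather than part of the original system.

```python
# Toy synonym map standing in for WordNet lookups (hypothetical data).
TOY_SYNONYMS = {
    "good": ["full", "estimable", "beneficial"],
    "full": ["complete"],
    "estimable": ["honorable", "good"],
    "beneficial": ["helpful"],
    "helpful": ["useful"],
}


def expand_synonyms(seeds, lookup, max_level):
    """Breadth-first expansion; returns undirected edges and words added per level."""
    seen = set(seeds)
    edges = set()
    frontier = list(seeds)
    added_per_level = []
    for _ in range(max_level):
        next_frontier = []
        for word in frontier:
            for synonym in lookup.get(word, []):
                edges.add(frozenset((word, synonym)))
                if synonym not in seen:
                    seen.add(synonym)
                    next_frontier.append(synonym)
        added_per_level.append(len(next_frontier))
        frontier = next_frontier
        if not frontier:
            break
    return edges, added_per_level


edges, added = expand_synonyms(["good"], TOY_SYNONYMS, max_level=3)
print(added)       # new words found at each expansion level
print(len(edges))  # number of synonym links in the resulting graph
```

Tracking the number of newly added words per level makes it possible to see when the expansion saturates, which is the behavior reported for Figure 3.8-(a).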
Selection of the polar adjectives. As explained in section 3.1.2.2, the semantic orientation of an adjective is calculated by comparing the shortest path between this adjective and the positive polar adjective with the shortest path between this adjective and the negative polar adjective. In this research, we use 27 words (19 words of high popularity and 8 words of low popularity) selected from the list of adjectives used by Vegnaduzzo [118] to evaluate whether the orientation identification mechanism can deal with the adjectives that users employ in daily use. These words, all of which are included in the synonymous adjective graph, form the evaluation word set. The 27 words are sequentially fed into the proposed evaluation extraction process to estimate the accuracy of semantic orientation identification. However, these words come without orientation or polarity information. Therefore, a group of 10 human judges (2 doctoral students and 8 master’s students) was invited to pre-identify the semantic orientation (positive or negative) of each word by majority voting. If an adjective receives an equal number of positive and negative votes, it is marked as having a vague orientation. We experimented with various polar pairs, such as (good, bad), (positive, negative), and (excellent, poor), to study their impact on the accuracy of semantic orientation identification. The experimental results and the paired-sample t-test at the 95% confidence level are shown in Figure 3.9. As we can observe, the accuracy rate of adjective semantic orientation identification using the polar pair (good, bad) is significantly higher than that of the other pairs. Hence, this pair is used for the semantic orientation identification process in the experiments.
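As a rough illustration of the shortest-path comparison and of the judges' majority voting, consider the following sketch. The exact orientation formula is given in section 3.1.2.2 and is not reproduced here; the sketch assumes the simplest rule (the nearer pole wins, and ties or unreachable poles yield a vague orientation). The graph TOY_GRAPH and all function names are hypothetical.

```python
from collections import deque

# Hypothetical undirected synonym graph (adjacency lists), not the real word graph.
TOY_GRAPH = {
    "good": ["beneficial", "fair"],
    "beneficial": ["good", "helpful"],
    "helpful": ["beneficial"],
    "fair": ["good", "mediocre"],
    "mediocre": ["fair", "poor"],
    "poor": ["mediocre", "bad"],
    "bad": ["poor"],
}


def shortest_path_length(graph, source, target):
    """BFS hop count between two words, or None if unreachable."""
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        word, dist = queue.popleft()
        for neighbor in graph.get(word, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None


def orientation(graph, word, positive="good", negative="bad"):
    """Assumed rule: the nearer pole wins; ties or no path give 'vague'."""
    d_pos = shortest_path_length(graph, word, positive)
    d_neg = shortest_path_length(graph, word, negative)
    if d_pos is None and d_neg is None:
        return "vague"
    if d_neg is None or (d_pos is not None and d_pos < d_neg):
        return "positive"
    if d_pos is None or d_neg < d_pos:
        return "negative"
    return "vague"


def majority_label(votes):
    """Label chosen by most judges; equal positive and negative counts give 'vague'."""
    pos = votes.count("positive")
    neg = votes.count("negative")
    if pos == neg:
        return "vague"
    return "positive" if pos > neg else "negative"


for word in ("helpful", "poor", "mediocre"):
    print(word, orientation(TOY_GRAPH, word))
```

Swapping the positive and negative arguments (for example, to "excellent" and "poor") is how different polar pairs could be compared against the judges' labels.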
Figure 3.9 Accuracy comparison between different polar word pairs
3.2.2 Experiment Design
In the experiments, we asked the participants to recall their original decision-making process and report (1) the product they bought and the alternatives they took into account, (2) the criteria they considered, and (3) whether the product purchase decision was satisfactory.
First, we have to know which product they bought because different products have different criteria for decision making. The alternatives together with the suited criteria set were sent to their friends through Plurk. A friend becomes a decision supporter when he/she replies to the message with his/her criteria evaluation.
Second, although we pre-collected a general criteria set (i.e. product features) of products, in order to make the criteria set closer to participants’ considerations, the collective criteria for each product could be additionally collected from the participants.
For search goods, the system would respond with the pre-collected criteria set (as shown in Table 3.3) according to the product category mentioned in the social appraisal request. The decision supporters could give their evaluation (“G,” “B,” or “U”) to each criterion of the alternatives. For experience goods, the system analyzes the opinions posted by decision supporters to extract possible criteria and evaluations.
Third, after gathering the evaluations and building the collective decision matrix, the proposed social appraisal mechanism outputs a ranking list of all the alternatives to support the originator’s decision-making on product purchasing. In order to evaluate the effectiveness of the proposed social appraisal support mechanism, it is necessary to know whether the participants are satisfied with their product purchase decisions. In our mechanism evaluation process, the item ranked in first place is selected as the purchasing target, and it is used to evaluate the effectiveness of the proposed mechanism.
We illustrate the system process in the following example:
User A wants to buy a camera. According to self-survey or other recommendations, he/she has narrowed the choice to three camera alternatives but it is hard to decide which one is most suitable. He/she initiates a support request in the micro-blogosphere.
The request message is formed as “[Digital camera]: [camera1, camera2, camera3].” The extracted criteria set for the digital camera would be posted in the form of “[Criteria]: [resolution, price, lens, storage, interfaces, exposure controls, focus controls, flash modes].” Then, the decision supporters (the friends of A) reply with their criteria evaluation of each alternative in the following form: “[ans]: [G, B, U, G, G, B, U, G], [U, G, G, B, B, B, U, G], [G, G, G, G, U, G, G, B], [1, 3, 8, 4, 2, 7, 5, 6].” After the consensus decision analysis, the system produces a list of ranked cameras for A in the form of “[Rank]: [camera2 > camera3 > camera1],” which indicates that A’s friends think that “camera2” is the most suitable camera.
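The bracketed message format in this example can be parsed with a few lines of code. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the final list of a reply is assumed to be the supporter's ranking of criteria importance, which the example suggests but does not state explicitly.

```python
import re

BRACKET = re.compile(r"\[([^\[\]]*)\]")


def parse_message(message):
    """Split '[tag]: [a, b], [c, d]' into the tag and a list of item lists."""
    groups = BRACKET.findall(message)
    if not groups:
        raise ValueError("no bracketed fields found")
    tag, *bodies = groups
    lists = [[item.strip() for item in body.split(",") if item.strip()] for body in bodies]
    return tag.strip(), lists


def parse_reply(message, n_alternatives):
    """Assumed layout: one G/B/U list per alternative, then a criteria importance ranking."""
    tag, lists = parse_message(message)
    if tag != "ans" or len(lists) != n_alternatives + 1:
        raise ValueError("unexpected reply layout")
    evaluations = lists[:n_alternatives]
    importance = [int(rank) for rank in lists[-1]]
    for row in evaluations:
        if len(row) != len(importance) or any(v not in {"G", "B", "U"} for v in row):
            raise ValueError("malformed evaluation row")
    return evaluations, importance


request = "[Digital camera]: [camera1, camera2, camera3]."
category, (alternatives,) = parse_message(request)

reply = (
    "[ans]: [G, B, U, G, G, B, U, G], [U, G, G, B, B, B, U, G], "
    "[G, G, G, G, U, G, G, B], [1, 3, 8, 4, 2, 7, 5, 6]."
)
evaluations, importance = parse_reply(reply, len(alternatives))
print(category, alternatives)
print(evaluations[1], importance)
```

Validating the row lengths against the criteria count catches replies that skip a criterion before they reach the decision matrix.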
Another example considers experience goods. User B initiates a support request for restaurant selection as “[Restaurant]: [restaurant1, restaurant2, restaurant3]. For a family dinner, which one is the best?” Suppose that friend1 gives his opinion as “[ans]: [the service is great and the food is delicious but the price is expensive], [the distance is too far but food and service are good], [].” After collective opinion analysis, the system transforms the sentences into the criteria set “[Criteria]: [service, food, price, distance]” and the criteria evaluation “[ans]: [G, G, B, U], [G, G, U, B], [U, U, U, U]” for the three restaurants, and feeds them into the consensus decision analysis. Notice that the system posts the current criteria set to the support request message and allows other friends to give their opinions according to these criteria. Then, if friend2 mentions other features of the restaurants, such as “[ans]: [the service is great but I do not like their food and the price is a little bit expensive, distance is ok to me], [service and food are great], [very nice background music],” the criteria set is expanded automatically to “[Criteria]: [service, food, price, distance, music],” friend1’s evaluation of the new criterion “music” is set to “U,” and friend1’s evaluations are updated to “[ans]: [G, G, B, U, U], [G, G, U, B, U], [U, U, U, U, U]” for the consensus decision analysis. Finally, after the consensus decision analysis, the social appraisal system replies to B with the restaurant ranking “[Rank]: [restaurant2 > restaurant1 > restaurant3],” which means that B’s friends think “restaurant2” is the most suitable restaurant for B.
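The automatic expansion of the criteria set, with earlier replies padded by “U,” can be sketched as follows. The data and the function name merge_reply are hypothetical; each reply is assumed to have already been reduced by the opinion analysis to one {criterion: evaluation} mapping per alternative.

```python
def merge_reply(criteria, replies_by_friend, friend, reply):
    """Store a friend's per-alternative {criterion: value} dicts and pad gaps with 'U'.

    `criteria` is extended in place when the reply mentions a new criterion.
    Returns every friend's evaluation rows aligned to the current criteria list.
    """
    for evaluation in reply:
        for criterion in evaluation:
            if criterion not in criteria:
                criteria.append(criterion)
    replies_by_friend[friend] = reply
    return {
        name: [[alt.get(c, "U") for c in criteria] for alt in alts]
        for name, alts in replies_by_friend.items()
    }


criteria = []
replies = {}

# Hypothetical extracted evaluations for two alternatives.
first = merge_reply(criteria, replies, "friend1",
                    [{"service": "G", "price": "B"}, {"food": "G"}])
print(criteria, first["friend1"])

second = merge_reply(criteria, replies, "friend2",
                     [{"service": "G", "music": "G"}, {}])
print(criteria, second["friend1"], second["friend2"])
```

Keeping the raw mappings and rebuilding the aligned rows on every reply means that a criterion introduced late is filled with “U” for all earlier supporters without any special-case code.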
3.3 Results and Evaluations
The effectiveness of social decision support is determined by the recipient’s subjective judgment [39], so the results recommended by the proposed mechanism should be compared with the support requester’s self-evaluation. The detailed comparison rules are listed in Table 3.4.
Table 3.4 Evaluation rule table
                             User evaluation
System recommendation        Satisfied      Unsatisfied
Purchasing                   CSS            1 - CSU
Not purchasing               1 - CSS        CSU
There are two major evaluation rules to judge the effectiveness of the social support mechanism:
(1) Recommend that users buy the products they are satisfied with. If the support requester feels satisfied with the product and the social appraisal mechanism also recommends purchasing it (i.e., it is ranked first by the system), a mark “CSS,” which means that correct social support is made, is given.
\[ \mathrm{CSS} = \frac{|S \cap R|}{|S|}, \tag{3.21} \]
where S stands for the set of satisfactory products purchased and R stands for the set of products recommended for purchasing.
(2) Do not recommend that users buy the products they are unsatisfied with. If the support requester feels unsatisfied with the product and the social appraisal mechanism does not recommend purchasing it, a mark “CSU” is given, which means that wrong social support is avoided.
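\[ \mathrm{CSU} = \frac{|\bar{S} \cap \bar{R}|}{|\bar{S}|}, \tag{3.22} \]
where $\bar{S}$ stands for the set of unsatisfactory products purchased and $\bar{R}$ stands for the set of products not recommended for purchasing. For enterprises, these two rules could enhance customers’ degree of satisfaction and create more business opportunities.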
Finally, the overall successful support is measured as:
\[ \mathrm{SS} = \frac{|S \cap R| + |\bar{S} \cap \bar{R}|}{|S| + |\bar{S}|}. \tag{3.23} \]
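The three measures can be computed directly from sets of purchase decisions. The sketch below reads Eqs. (3.21) to (3.23) as CSS = |S ∩ R| / |S|, CSU = |S̄ ∩ R̄| / |S̄|, and SS = (|S ∩ R| + |S̄ ∩ R̄|) / (|S| + |S̄|), where the complement of R is taken within the set of purchased products. The function name and the sample decisions are hypothetical.

```python
def support_metrics(satisfied, unsatisfied, recommended):
    """Return (CSS, CSU, SS) for sets of purchased products.

    satisfied / unsatisfied: purchases the requester was (un)satisfied with.
    recommended: purchases that the mechanism ranked in first place.
    """
    not_recommended = (satisfied | unsatisfied) - recommended
    css_hits = len(satisfied & recommended)
    csu_hits = len(unsatisfied & not_recommended)
    css = css_hits / len(satisfied) if satisfied else float("nan")
    csu = csu_hits / len(unsatisfied) if unsatisfied else float("nan")
    total = len(satisfied) + len(unsatisfied)
    ss = (css_hits + csu_hits) / total if total else float("nan")
    return css, csu, ss


# Hypothetical purchase decisions d1..d6.
satisfied = {"d1", "d2", "d3", "d4"}
unsatisfied = {"d5", "d6"}
recommended = {"d1", "d2", "d3", "d5"}

css, csu, ss = support_metrics(satisfied, unsatisfied, recommended)
print(css, csu, round(ss, 3))
```

In this toy case, three of the four satisfactory purchases were recommended, and one of the two unsatisfactory purchases was correctly not recommended.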
3.3.1 Comparisons of Criteria Weighting Strategies
We construct three experiments and compare the results of the self-weighting, group-weighting, and equal-weighting strategies. The criteria importance for the self-weighting and group-weighting strategies is obtained from the decision requester and from the group of decision supporters, respectively. For the equal-weighting strategy, the importance of every criterion is set to 1. The results shown in Figure 3.10-(a) and -(b) reveal that the self-weighting strategy is more effective than the other strategies for both search goods and experience goods. This is because, when making a purchasing decision, the decision maker knows his/her individual needs most clearly.
In addition, as our close friends might know us better, the group-weighting strategy performs better than the equal-weighting strategy. Therefore, it is suitable to use the group-weighting strategy as the default criteria weighting if the support requester does not provide his/her own criteria importance settings.
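The three weighting strategies and the suggested default can be sketched as follows. How an importance ranking is converted into a weight is not specified in this section, so the sketch makes two loud assumptions: rank 1 denotes the most important criterion, and a rank r among n criteria maps to a score of n + 1 - r, normalized to sum to 1. All names are hypothetical.

```python
def rank_to_weights(ranks):
    """Assumed conversion: rank 1 gets the largest weight; weights sum to 1."""
    n = len(ranks)
    scores = [n + 1 - r for r in ranks]
    total = sum(scores)
    return [s / total for s in scores]


def criteria_weights(strategy, n_criteria, requester_ranks=None, supporter_ranks=None):
    """Return one weight per criterion for the chosen strategy."""
    if strategy == "equal":
        return [1.0] * n_criteria            # every criterion has importance 1
    if strategy == "self":
        return rank_to_weights(requester_ranks)
    if strategy == "group":
        per_supporter = [rank_to_weights(r) for r in supporter_ranks]
        return [sum(col) / len(per_supporter) for col in zip(*per_supporter)]
    raise ValueError(f"unknown strategy: {strategy}")


def default_strategy(requester_ranks):
    """Fall back to group weighting when the requester gives no importance settings."""
    return "self" if requester_ranks else "group"


self_w = criteria_weights("self", 3, requester_ranks=[1, 3, 2])
group_w = criteria_weights("group", 3, supporter_ranks=[[1, 2, 3], [3, 2, 1]])
equal_w = criteria_weights("equal", 3)
print(self_w, group_w, equal_w, default_strategy(None))
```

Any other monotone rank-to-weight mapping could be substituted in rank_to_weights without changing the rest of the sketch.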
(a) Search goods (b) Experience goods
Figure 3.10 Accuracy rates of different criteria weighting strategies
Tables 3.5 and 3.6 show the results of the paired-sample t-test at the 95% confidence level. The results verify that the self-weighting strategy significantly outperforms the other strategies.
Table 3.5 Statistical verification of the decision analysis results with different weighting methods for search goods
Paired Group       Mean      Std. Deviation   Std. Error Mean   T value   Sig. (2-tailed)
Self vs. Group     -0.063    0.358            0.020             -3.138    0.002
Self vs. Equal     -0.036    0.394            0.022             -1.670    0.003
Group vs. Equal    0.026     0.389            0.021             1.198     0.011
Table 3.6 Statistical verification of the decision analysis results with different weighting methods for experience goods
Paired Group       Mean      Std. Deviation   Std. Error Mean   T value   Sig. (2-tailed)
Self vs. Group     0.099     0.370            0.023             4.306     0.000
Self vs. Equal     0.083     0.376            0.024             3.535     0.000
Group vs. Equal    -0.017    0.381            0.024             -0.699    0.001
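For readers who wish to see how the columns of such tables relate to each other, the following sketch computes the mean difference, standard deviation, standard error, and t value of a paired-sample t-test using only the Python standard library. The sample data are hypothetical and are not the experimental data; the two-tailed significance additionally requires the t distribution with n - 1 degrees of freedom, which is omitted here.

```python
import math
import statistics


def paired_t_statistic(sample_a, sample_b):
    """Return (mean difference, std. deviation, std. error, t value) for paired samples."""
    if len(sample_a) != len(sample_b) or len(sample_a) < 2:
        raise ValueError("need two equally sized samples with at least two pairs")
    diffs = [a - b for a, b in zip(sample_a, sample_b)]
    mean = statistics.mean(diffs)
    std_dev = statistics.stdev(diffs)            # sample standard deviation (n - 1)
    std_err = std_dev / math.sqrt(len(diffs))
    if std_err == 0:
        raise ValueError("all differences are identical; t is undefined")
    return mean, std_dev, std_err, mean / std_err


# Hypothetical per-request outcomes (1 = successful support, 0 = not) for two strategies.
strategy_a = [1, 1, 0, 1, 1, 0, 1, 1]
strategy_b = [1, 0, 0, 1, 0, 0, 1, 0]

mean, std_dev, std_err, t_value = paired_t_statistic(strategy_a, strategy_b)
print(round(mean, 3), round(std_dev, 3), round(std_err, 3), round(t_value, 3))
```

The t value is simply the mean difference divided by its standard error, which is the relationship between the "Mean," "Std. Error Mean," and "T value" columns.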
3.3.2 Comparisons of Support Effectiveness
We construct experiments to compare four product selection approaches: the proposed social appraisal mechanism (SAM), the majority voting method (voting), the five-star rating method (five-star), and the random selection method (random). The majority voting method is one of the baseline social support methods that allow users to aggregate friends’ opinions. For example, Facebook developed a simple social support function, “Questions.” In this scenario, the support requesters are asked to re-post their social appraisal requests, and the decision supporters then vote directly for the candidate they consider most suitable, without giving criteria or evaluations. The five-star rating method is one of the baseline product evaluation methods for gathering the collective opinion of public users. In this scenario, the decision supporters are requested to rate each alternative on a five-star scale. The random selection method is used to simulate the scenario in which there is no social support mechanism. In this scenario, the participants do not know which product is the most suitable and pick one to buy at random. Figure 3.11 indicates that the proposed mechanism is more effective than the baseline social support methods. The measures “CSS” and “CSU” indicate, respectively, how often the support requester indeed buys the most suitable product and how often the support requester indeed avoids buying an unsuitable product.
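The three baseline approaches can be sketched in a few lines. This is an illustration only: the function names, the tie-breaking behavior, and the use of the mean star rating as the aggregate are assumptions, and the votes and ratings are hypothetical.

```python
import random
from collections import Counter


def majority_vote(votes):
    """votes: one chosen alternative per supporter; ties go to the first-seen alternative."""
    return Counter(votes).most_common(1)[0][0]


def five_star(ratings):
    """ratings: {alternative: [stars per supporter]}; pick the highest mean rating (assumed)."""
    return max(ratings, key=lambda alt: sum(ratings[alt]) / len(ratings[alt]))


def random_pick(alternatives, rng):
    """Simulate a requester who has no social support and chooses at random."""
    return rng.choice(alternatives)


alternatives = ["camera1", "camera2", "camera3"]
votes = ["camera2", "camera1", "camera2", "camera3"]
ratings = {"camera1": [3, 4, 2], "camera2": [5, 4, 4], "camera3": [4, 3, 3]}

voted = majority_vote(votes)
starred = five_star(ratings)
picked = random_pick(alternatives, random.Random(0))
print(voted, starred, picked)
```

Neither baseline uses criteria or supporter importance, which is the difference from the SAM that the comparison is meant to expose.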
(a) Search goods (b) Experience goods
Figure 3.11 Accuracy rates of different methods
As we can observe, the performance of our proposed SAM is better than that of the other approaches. First, the SAM, majority voting, and five-star rating methods achieve better performance than the random approach. This indicates that soliciting external appraisal from the social network is helpful for supporting customers’ online shopping behavior. Second, both the SAM and the majority voting method aim to provide social appraisal support for support requesters, but the majority voting method does not consider the relative importance of decision supporters. The better performance of the SAM shows that considering social companionship could improve the social appraisal mechanism.
Third, the result of the five-star rating method is very similar to that of the voting method. From the purchasing perspective, the buyer would like to buy the most suitable product. When a decision supporter gives the highest star rating to a product, this indicates that he/she feels the product is the most appropriate. Similarly, he/she would vote for the most suitable product in the voting method.
Due to the difficulty of complex natural language analysis and the heterogeneity of user tastes, the criteria and evaluations extracted by semantic analysis for experience goods might not perfectly represent the characteristics of a product. As a result, the CSS values of experience goods are lower than those of search goods. Moreover, the CSU is greater than the CSS in the evaluations of experience goods.
Finally, the overall performance of the different approaches is further evaluated by the paired-sample t-test, and the results are shown in Tables 3.7 and 3.8. At the 95% confidence level, all the test results show that the proposed social appraisal mechanism significantly outperforms the other product selection approaches.
Table 3.7 Statistical verification of the decision analysis results with different selection approaches for search goods
Paired Group         Mean       Std. Deviation   Std. Error Mean   T value   Sig. (2-tailed)
SAM vs. Voting       -.01904    .38157           .02140            -.890     .003
SAM vs. Five-star    .02918     .39352           .02207            1.322     .002
SAM vs. Random       -.04526    .39169           .02197            -2.061    .000
Table 3.8 Statistical verification of the decision analysis results with different selection approaches for experience goods
Paired Group         Mean       Std. Deviation   Std. Error Mean   T value   Sig. (2-tailed)
SAM vs. Voting       0.051      0.406            0.017             3.025     0.003
SAM vs. Five-star    0.027      0.392            0.016             1.620     0.000
SAM vs. Random       0.097      0.386            0.016             6.002     0.006
We further compare the effectiveness of various appraisal mechanisms using different social companionship measures: (1) the proposed social appraisal mechanism (SAM), which considers the behavioral and structural tie strengths, (2) an appraisal mechanism using only behavior weighting (SAM-B), (3) an appraisal mechanism using only structural weighting (SAM-S), and (4) an appraisal mechanism using equal weighting (SAM-E). The alternatives are ranked by these different appraisal mechanisms.
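The four variants differ only in how each decision supporter is weighted. The sketch below illustrates this under loud assumptions: every supporter is taken to have a behavioral and a structural tie strength in [0, 1], and the full SAM is assumed to combine them by a simple average, which is a placeholder and not necessarily the combination defined earlier in the chapter. The function name and the tie values are hypothetical.

```python
def supporter_weights(ties, variant):
    """ties: {supporter: (behavioral, structural)}; returns weights that sum to 1."""
    if variant == "SAM":
        raw = {name: (b + s) / 2 for name, (b, s) in ties.items()}  # assumed combination
    elif variant == "SAM-B":
        raw = {name: b for name, (b, _) in ties.items()}            # behavioral only
    elif variant == "SAM-S":
        raw = {name: s for name, (_, s) in ties.items()}            # structural only
    elif variant == "SAM-E":
        raw = {name: 1.0 for name in ties}                          # equal weighting
    else:
        raise ValueError(f"unknown variant: {variant}")
    total = sum(raw.values())
    if total == 0:
        raise ValueError("all tie strengths are zero")
    return {name: value / total for name, value in raw.items()}


# Hypothetical tie strengths for two decision supporters.
ties = {"friend1": (0.8, 0.2), "friend2": (0.2, 0.6)}
for variant in ("SAM", "SAM-B", "SAM-S", "SAM-E"):
    print(variant, supporter_weights(ties, variant))
```

With these toy values, friend1 dominates under behavior-only weighting while friend2 dominates under structure-only weighting, which shows why the variants can rank alternatives differently.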
(a) Search goods (b) Experience goods
Figure 3.12 Accuracy rates of different companionship measures
Figure 3.12 reveals that using both the behavioral and the structural characteristics to evaluate the importance of friends can significantly improve the appraisal effectiveness. The results of the paired-sample t-test are shown in Tables 3.9 and 3.10. At the 95% confidence level, all the test results show that the proposed companionship evaluation approach significantly outperforms the other approaches. This implies that it is beneficial and essential to consider the behavioral information and the structural information together when developing a social support mechanism.
Table 3.9 Statistical verification of the decision analysis results with different models for search goods
Paired Group      Mean       Std. Deviation   Std. Error Mean   T value   Sig. (2-tailed)
SAM vs. SAM-B     -.06406    .36091           .02024            -3.165    .002
SAM vs. SAM-S     -.04501    .37700           .02114            -2.129    .003
SAM vs. SAM-E     -.04043    .39475           .02214            -1.826    .000
Table 3.10 Statistical verification of the decision analysis results with different models for experience goods
Paired Group      Mean       Std. Deviation   Std. Error Mean   T value   Sig. (2-tailed)
SAM vs. SAM-B     .09978     .37075           .02317            4.306     .000
SAM vs. SAM-S     .08013     .37909           .02369            3.382     .001
SAM vs. SAM-E     .08312     .37627           .02352            3.535     .000
3.3.3 Comparison of Search and Experience Goods
The accuracy rates with respect to different products are shown in Figure 3.13. The proposed mechanism achieved an overall accuracy rate of 83%. The accuracy rates for search goods and experience goods are 83% and 82%, respectively. Among the search goods, cell phones have the highest accuracy rate (87%). Among the experience goods, peripheral products have the highest accuracy rate (88%). Mobile devices, such as smartphones and tablets, are trendy products, and most of the decision supporters invited to take part in the experiments already own one or more mobile devices and peripheral products. Peripheral products and mobile devices (the cell phone and computer categories) account for 21% and 32% of the requests for social appraisal support, respectively. Therefore, the decision supporters have sufficient basic knowledge to judge whether a product is good or bad and can provide more appropriate product opinions and criteria evaluations.
As Figure 3.13 shows, movies have the lowest accuracy rate (64%). The result can be explained by two reasons. First, movie choices are highly dependent on individual preferences, so only 11 appraisal requests (about 7%) were released. This number of decision samples might be insufficient to evaluate the performance accurately. Second, there are too many “unknown” criteria evaluations in the movie category. As watching a movie is a costly activity (in both time and money), comparatively few friends have watched all the alternatives in a movie appraisal request and can respond with their opinions. Nevertheless, the proposed mechanism still achieved a support accuracy rate of approximately 64% in the movie category.
(a) Search goods (b) Experience goods
Figure 3.13 Accuracy rates for different products
3.4 Chapter Summary
In this chapter, we proposed a social appraisal mechanism for online purchase support in the micro-blogosphere, which is composed of social companionship analysis, collective opinion analysis, and consensus decision analysis. To measure the social companionship of decision supporters, this study constructed an interaction network based on the posts and responses in micro-blogs to measure the behavioral tie strength of a social relationship, and measured the structural tie strength of the relationship by analyzing the friend network. To analyze the collective opinions, a text-mining technique with semantic orientation identification was developed for criteria and evaluation extraction. In addition, to resolve the inherent issue of information incompleteness in the collective opinions, IFS was applied to model the vague or incompletely known opinions from the micro-blogosphere. Finally, to consolidate the evaluations from various decision supporters and the support requester’s decision criteria preferences, TOPSIS was applied to rank the alternatives. Our experimental results show that the proposed social appraisal support mechanism is more accurate than the other benchmark approaches. The proposed social appraisal framework, which solicits opinions from trusted friends, can thus be effectively applied to support individual decisions, such as online purchasing.