
A Computational Approach to the Study of Political Activities on Social Media (一個研究社群媒體政治活動之計算方法) - 政大學術集成


Academic year: 2021


國立政治大學資訊科學系 (Department of Computer Science, National Chengchi University)
Doctoral Dissertation

一個研究社群媒體政治活動之計算方法
A Computational Approach to the Study of Political Activities on Social Media

Graduate student: 邱淑怡
Advisor: Dr. 徐國偉
May 2018 (ROC year 107)
DOI:10.6814/DIS.NCCU.CS.002.2018.B02

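摘要 (Abstract, translated from the Chinese)

Information and communication technology has transformed society, brought convenience, and changed how people live. Social media such as Facebook, Google, and YouTube have risen and now surround people's daily routines. Social media have changed more than the nature of communication: through political, economic, and social structures they have permeated people's lives and become a part of them. Social media give users a shared platform for conveniently uploading information and sharing pictures and videos. Everyone can be a creator or sharer of information and can freely post opinions and feelings, so social media create a platform for spreading messages. Among the many social media, Facebook has more than 2 billion users worldwide, the largest user base of any social networking site. We therefore choose Facebook as the social media platform for this research.

This research is a comprehensive study of political activities on social media. It analyzes the posts on fan pages of political activities, extracts the interaction features and text features of the posts, and proposes computational methods for the different analysis issues. The analysis of interaction features studies the sunflower student movement: it examines popular posts from different angles, traces the diffusion of Facebook posts, and uses these diffusion traces to discover important Facebook users. However, the sunflower student movement is a political activity on a specific topic and is short-term. Mining text features requires a long-term political activity that also includes parties with different political stances, so the sunflower student movement cannot satisfy this research issue. We therefore use posts from the Facebook fan pages of left- and right-wing American political parties to predict the party tendency of posts. Data obtained from fan pages carry no network structure graph, so this dissertation proposes new methods for analyzing social media when no network structure is available.

This research uses the interaction features of posts to obtain popular posts, examines the capability of information diffusion, and finds important Facebook users during the sunflower student movement. It uses the interaction features and text features of posts to predict the party tendency of American left- and right-wing posts, and compares the prediction results of different classifiers and the influence of the features.

Keywords: information diffusion, social media, skyline query, prediction, classification and clustering, text mining, semantic analysis, Facebook, sunflower student movement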

Abstract

In recent years, with the rise of social media websites and the popularity of mobile devices capable of Internet access, people can quickly publish their statuses and messages to social media anytime and anywhere. The Internet has changed our lives; we use it in almost everything we do. As of 2017, Facebook had 2 billion monthly active users, making it the most popular social networking platform in the world. Therefore, we choose Facebook as our research platform.

This dissertation presents a comprehensive analysis of political activities. We use posts of fan pages to analyze political activities and construct the interaction features and sentiment features of posts on Facebook. We use the characteristics of these features to analyze political activities. Our study of the sunflower student movement focuses on the interaction features. We use methods to search for popular posts and to analyze information diffusion, and then we mine important Facebook users. We obtain popular posts and find, through the sharing-reaction, that three users were active and important during the sunflower student movement. However, the sunflower student movement cannot be used to investigate the sentiment features because all of the fan pages opposed the Cross-Strait Service Trade Agreement (CSSTA). For the sentiment features, we study the prediction of the political tendency of posts: we collect posts from the fan pages of political groups in America, build sentiment features for the prediction, and evaluate the prediction performance. To summarize, in this dissertation, we propose novel methods to analyze social media datasets that contain valuable information but do not contain the network structure required by other methods.

Keywords: information diffusion, social media, skyline query, prediction, clustering and classification, text mining, sentiment analysis, Facebook, sunflower student movement

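誌謝 (Acknowledgements, translated from the Chinese)

My long years as a doctoral student are finally drawing to a close. I recall resolutely resigning from my job to join the ranks of doctoral students. Yet somewhere along what seemed a promising path of study, things slipped out of control. I went through wave after wave of setbacks and failures that brought me to the lowest point of my life, and I completely lost my direction. At that point, I could only tell my advisor that I needed to suspend my studies and find my bearings again.

When I later decided to resume my studies, I first had Professor 陳良弼 to thank, with whose consent I changed advisors. Professor 徐國偉 then guided me in establishing a new research area, allowing me to stand up again in another place, leave the past behind, and face challenges and life anew. Professor 徐 gave careful guidance throughout the writing of this dissertation and frequently discussed methods and implementation with me. From framing the topic and the research direction to analyzing the results and discussing the conclusions, he offered timely guidance that allowed me to complete my doctoral research. I also sincerely thank all the members of the oral defense committee for their valuable comments, which made this dissertation more complete. I thank the faculty of the academic group of the Department of Computer Science at National Chengchi University for their comments while I was publishing my papers, which helped me present my work more skillfully, and I thank all the faculty of the department for their guidance and advice during my studies. For the period of the research project, I thank Professor 劉慧雯 of the Department of Journalism, Professor 紀明德 of the Department of Computer Science, and Professor 徐 for their explanations and suggestions when I first encountered the field of social media, which allowed me to develop my own research topic.

I thank my doctoral classmates 吳中信 and 黃啟禎. We took courses and prepared for the qualifying examination together and shared happy times. I thank my comrades and junior colleagues 林明慶 and 黃凱彬 for their encouragement and assistance while I was writing this dissertation; we grew together and completed every task. Your company carried me through hard and difficult days, and I will never forget it. I also thank the junior master's students who have graduated and those who share the same laboratory for letting me feel the fun and warmth of this joint laboratory.

Finally, I thank my parents for raising and nurturing me and my brothers and sisters for their company and support. I dedicate this dissertation to my dearest family and friends, who kept encouraging me even when I was at the end of my rope. You are the force that moves me forward, and you made it possible for me to complete my studies. I am also grateful to have had my lovely children by my side through the difficult times.

邱淑怡
Doctoral Program, Department of Computer Science, National Chengchi University
May 2018 (ROC year 107)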

Content

摘要 (Chinese abstract)
Abstract
誌謝 (Acknowledgements)
Content
List of Tables
List of Figures
1. Introduction
2. Literature Review
   2.1 Web2.0
   2.2 Social media
   2.3 Facebook
   2.4 The sunflower student movement
       2.4.1 323 the Executive Yuan event
       2.4.2 330 demonstration
   2.5 The civil movement of the other countries
   2.6 Information diffusion on social media
   2.7 Skyline query
   2.8 The left-wing and right-wing politics
   2.9 Text mining
   2.10 Sentiment analysis
   2.11 Algorithms
       2.11.1 Naïve Bayes
       2.11.2 k-Nearest Neighbor (kNN)
       2.11.3 Support Vector Machines (SVM)
       2.11.4 AdaBoost
       2.11.5 Decision Tree
       2.11.6 Classification and Regression Trees (CART)
3. Method
   3.1 Research design
   3.2 Data collection and extraction
   3.3 Search popular posts approach
   3.4 The information diffusion approach
4. Anatomy of the sunflower student movement
5. The application of the skyline query on Facebook
   5.1 Method
   5.2 Experiment results
       5.2.1 Synthetic datasets
           5.2.1.1 Datasets from random distribution
           5.2.1.2 Datasets from Gaussian distribution
       5.2.2 Real datasets
           5.2.2.1 The review dataset
           5.2.2.2 The dataset from Facebook
       5.2.3 Comparison with other skyline query processing algorithms
6. Information diffusion on Facebook
   6.1 The speed of information diffusion
       6.1.1 Post-to-sharing reaction
       6.1.2 Post-to-commenting reaction
   6.2 The acceleration analysis
       6.2.1 The sharing-reaction
       6.2.2 The commenting-reaction
7. User mining to find important Facebook users
   7.1 The sharing-reaction
   7.2 The commenting-reaction
8. Predicting political tendency of posts on Facebook
   8.1 Method
   8.2 Explorative analysis
   8.3 Predictive analysis
9. Conclusions and Suggestions
   9.1 Conclusions
   9.2 Limitations and directions for future research
References

List of Tables

Table 1. Facebook versus Twitter
Table 2. The features of skyline query processing algorithms
Table 3. The summary of related works of skyline query
Table 4. The summary of related works of skyline query on probabilistic data
Table 5. The summary of comparisons between our proposed skyline query and related works
Table 6. Notations for a post P with n shares and m comments
Table 7. The total number of posts, shares, and comments for 20 selected fan pages
Table 8. The posts of the top-10 shares
Table 9. The posts of the top-10 comments
Table 10. The posts of the top-10 Facebook likes
Table 11. The maximum values of daily shares
Table 12. Top-10 users who like maximal posts
Table 13. Top-10 users who share maximal posts
Table 14. Top-10 users who comment on maximal posts
Table 15. The last post for 20 selected fan pages as of May 2018
Table 16. A post is shared by users
Table 17. The object data for P1
Table 18. Example 1 score dataset
Table 19. The multi-instance data transformed from Example 1 score dataset
Table 20. A summary table for three monotonic aggregation functions
Table 21. Statistics for share and comment ratings in the FB dataset
Table 22. Post-to-sharing-reaction time (hours:minutes)
Table 23. Types of posts during the sunflower student movement
Table 24. Types are defined by Facebook Graph API
Table 25. Post-to-commenting-reaction time (hours:minutes)
Table 26. The acceleration of the sharing-reaction
Table 27. The acceleration of the commenting-reaction
Table 28. Two Lexical Databases
Table 29. Four functions for interaction feature adjustments
Table 30. The number of posts for each fan page
Table 31. Types of posts for American political fan pages

List of Figures

Figure 1. The conceptual framework
Figure 2. Daily number of posts
Figure 3. Number of Facebook likes per day
Figure 4. Number of shares per day
Figure 5. Number of comments per day
Figure 6. Daily number of posts
Figure 7. Daily number of Facebook likes for BIN fan page
Figure 8. Daily number of shares for BIN fan page
Figure 9. Daily number of comments for BIN fan page
Figure 10. Hourly number of posts
Figure 11. Hourly numbers of shares
Figure 12. Hourly numbers of comments
Figure 13. Frequency distribution of the number of Facebook likes
Figure 14. Frequency distribution of the number of shares
Figure 15. Frequency distribution of the number of comments
Figure 16. The algorithm for dominance relation comparison
Figure 17. The algorithm for skyline query processing and dominance relation comparison
Figure 18. The data flow of the skyline query
Figure 19. The running time and the number of dominance relation comparisons for datasets from random distribution
Figure 20. The performance comparison between datasets from random distribution of different sizes
Figure 21. The running time and the number of dominance relation comparisons for datasets from Gaussian distribution
Figure 22. The performance comparison between 10-dimensional datasets from Gaussian distribution with different score ranges
Figure 23. The performance comparison between the RC dataset and two synthetic datasets
Figure 24. The running time and the number of dominance relation comparison for the FB dataset
Figure 25. The frequency distributions of share and comment ratings in the FB dataset
Figure 26. Our methods compared with BNL algorithm
Figure 27. Our methods compared with D&C algorithm
Figure 28. Distribution of sharing-reaction numbers
Figure 29. Distributions for cumulative sharing-reaction counts
Figure 30. 50% and 90% cumulative sharing counts among different types of posts
Figure 31. Trends of the sharing-reaction time [hh:mm] among four groups (group 1 to group 4, from top to bottom)
Figure 32. Distribution of commenting-reaction numbers
Figure 33. Distributions for cumulative commenting-reaction counts
Figure 34. Histograms of cumulative commenting-reaction counts for 50% and 90%
Figure 35. 50% and 90% cumulative commenting counts among different types of posts
Figure 36. Trends of commenting-reaction time [hh:mm] among five groups (group 1 to group 5, from top to bottom)
Figure 37. Algorithm for maximum acceleration for sharing-reaction
Figure 38. Ranking users for sharing-reaction
Figure 39. Ranking users for commenting-reaction
Figure 40. The cumulative sharing-reaction counts for two posts
Figure 41. Algorithm for top-k users for maximal influence for sharing-reaction
Figure 42. Data processing for predicting political posts
Figure 43. The number of posts of the left-wing political fan pages
Figure 44. The number of posts of the right-wing political fan pages
Figure 45. The word cloud for all posts
Figure 46. The word cloud for the left-wing politics
Figure 47. The word cloud for the right-wing politics
Figure 48. F1-scores given by TDM_TF and TDM_TFIDF
Figure 49. F1-scores given by TDM_TF
Figure 50. F1-scores given by TDM_TFIDF
Figure 51. F1-scores given by the two lexical databases
Figure 52. F1-scores given by the interaction features
Figure 53. F1-scores given by logarithm function of the interaction features
Figure 54. F1-scores given by normalization function of the interaction features
Figure 55. F1-scores given by standardization function of the interaction features
Figure 56. F1-scores given by similarity_L of the interaction features
Figure 57. F1-scores given by similarity_R of the interaction features

1. Introduction

The Internet has developed rapidly over the past several decades. It has revolutionized communications and turned our lives upside down; we use the Internet in almost everything we do. The Internet itself has been transformed. In its early development, it was a static network designed to send a short file between two terminals. Today, immense quantities of information are uploaded and downloaded through wireless electronic equipment. In the first decade of the 21st century, Web 2.0 gave rise to social media and other interactive, crowd-based communication tools. The Internet was therefore no longer concerned only with information exchange. It is a sophisticated tool enabling individuals to create content and communicate with one another. A great deal of the content is our own; we are all publishers, critics, and creators.

Traditional public media include newspapers, radio, television, and movies. Their content is edited by the owners with the goal of mass production and sales [1]. In recent years, social media have generated a prodigious wealth of real-time content at an incessant rate [2]. Social media are a group of Internet-based applications that build on the ideological and technological foundations of Web 2.0 and that allow the creation and exchange of user-generated content (UGC) [1]. UGC is any form of content created by users of a system or service and made publicly available on that system. UGC most often appears as a supplement to online platforms, such as social media websites. A social networking service (also social networking site, or SNS) is a platform for building social networks among people who share similar interests, activities, backgrounds, or real-life connections. Many Web 2.0 websites, such as Facebook, Google, Myspace, YouTube, and Twitter, are SNS websites, and social media have become one of the most popular Internet services in the world. Social media and Web 2.0 changed the way people interact, so politicians had to adapt their communication to these social changes.

Facebook is a very successful social media website. According to the latest Internet World Stats reports [3], as of 2017, Facebook had 2 billion monthly active users, and its worldwide penetration rate was 26.3%. In Taiwan, Facebook had 18 million monthly active users and 16 million daily active users, for a penetration rate of 76.9%. Of those Americans who are online, 71% are on Facebook, 63% of whom check Facebook at least once a day. This shows that using Facebook every day has become a general lifestyle. Facebook is no longer just a social media platform; it has become a social utility that uses movements to activate users. In this dissertation, we focus on political activities on social media and apply computational methods to study them. Facebook is the most representative social networking platform, so we choose Facebook as our research platform.

This dissertation presents a comprehensive analysis of political activities. We use posts of fan pages to explore political activities. In general, posts on Facebook fan pages carry interaction features and sentiment features, and we use the characteristics of these features to analyze political activities. First, this dissertation discusses the sunflower student movement because it was an important movement in Taiwan's political history. This civil movement was a protest driven by a coalition of students and civil groups between March 18 and April 10, 2014. The police clashed with hundreds of students who opposed a trade deal with China and occupied the government headquarters. These students wrote and shared posts on Facebook and even created fan pages to fight against the Cross-Strait Service Trade Agreement (CSSTA). The analysis of this civil movement focuses on the interaction features.
It is a special, short-term activity. We use methods to search for popular posts and to analyze information diffusion, and then we mine important Facebook users. However, this movement cannot be used to investigate the sentiment features, because all of the fan pages opposed the CSSTA: their posts share the same tendency and therefore cannot be classified according to their sentiment features. Consequently, we collect posts from left- and right-wing political fan pages in the United States of America for sentiment analysis. This is a general, long-term activity on Facebook. We predict the political tendency of posts: we build the interaction features and sentiment features for the prediction and evaluate the prediction performance. We collect both datasets from political fan pages on Facebook, and neither contains any network structure. Therefore, we propose novel methods to analyze information diffusion, search for popular posts, perform sentiment analysis, and mine important Facebook users. This dissertation gives an insight into political activities on Facebook.

We study two kinds of political activity: left- and right-wing politics in America, which is a general activity, and the sunflower student movement, which is a special event. One of the goals of broadcasting content is to reach a large audience. Since the rise of social media, it has become a trend for politicians to publish personal news and political comments on social media. Politicians like to post on Facebook, and we would like to study both the posters and their posts.

In the early 21st century, several revolutions or movements across the globe were led by college students or young adults through the Internet. In Taiwan, the sunflower student movement was a protest movement driven by a coalition of students and civil groups between March 18 and April 10, 2014. As in most revolutions or movements of the early 21st century, the Taiwanese students used the Internet and shared information on platforms such as Facebook and Twitter during the sunflower student movement.
Taiwanese students also shared information through posts, photos, and videos on Facebook during the sunflower student movement. We analyze information diffusion and investigate the reaction time on Facebook during this movement from more than one perspective. In the United States of America, some analysts attribute Obama's victory to a large extent to his online strategy [4]. Some studies present evidence of an association between the level of discussion surrounding the 2012 U.S. presidential candidates in the new social media sphere and the candidates' own social media activities [5]. These studies provide some empirical evidence of the potential impact of social media on U.S. political elections, and they find that social media substantially expand the possible modes and methods of election campaigning. We use left- and right-wing political posts to predict political tendency. In Western countries, left-wing politics supports social equality and egalitarianism, whereas right-wing politics typically defends its positions on the basis of natural law, economics, or tradition [6].

In this dissertation, we develop novel approaches to analyze Facebook data because our dataset does not contain any network structure. Most studies are based on the structure of the social network, such as users' friend graphs on Facebook, and existing approaches require that structure. Facebook fan pages differ from personal pages because the dynamics of posting and reacting on them diverge significantly from those of personal pages [7]. Our dataset is collected through the Facebook Graph API and stored in a database. Although we have no network structure, we still attempt to answer the following questions: How does information diffuse through sharing or commenting? Which posts are popular? Who are the important Facebook users? We analyze various factors that affect the reactions on fan pages, where reactions comprise sharing and commenting on posts.
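The reaction time mentioned above can be illustrated with a minimal Python sketch. It assumes only that each post has a creation timestamp and that each share or comment has its own timestamp. The function names `reaction_times` and `time_to_fraction` and the idea of reading off the delay at a chosen fraction of reactions are illustrative assumptions, not the definitions used later in the dissertation.

```python
import math
from datetime import datetime, timedelta
from typing import Optional


def reaction_times(post_created: datetime,
                   reaction_stamps: list[datetime]) -> list[timedelta]:
    """Delay between post creation and each reaction, sorted ascending.

    Reactions stamped before the post are ignored (assumed to be data errors).
    """
    return sorted(t - post_created for t in reaction_stamps if t >= post_created)


def time_to_fraction(post_created: datetime,
                     reaction_stamps: list[datetime],
                     fraction: float) -> Optional[timedelta]:
    """Delay until the given fraction of all observed reactions has appeared.

    Hypothetical helper: returns None when the post has no reactions.
    """
    delays = reaction_times(post_created, reaction_stamps)
    if not delays:
        return None
    k = max(1, math.ceil(fraction * len(delays)))  # number of reactions needed
    return delays[k - 1]


# Toy timestamps invented for illustration only.
created = datetime(2014, 3, 18, 12, 0)
shares = [created + timedelta(minutes=m) for m in (120, 10, 300, 30)]
print(time_to_fraction(created, shares, 0.5))
```

Working with delays rather than absolute timestamps makes posts published at different times directly comparable, and it needs no friend graph or other network structure.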
This dissertation is to provide more information to researchers for studying the political activities on the social network. In this dissertation, we apply the skyline query to search popular posts on Facebook; we use the reaction time to analyze posts that are more popular. Reactions include sharing posts and commenting on posts. For a post, there is a significant 4. DOI:10.6814/DIS.NCCU.CS.002.2018.B02.

(18) difference between the time when the post is created and the time when any reaction appears. Related to the maximum vector problem, a skyline query is to discover dominating tuples from a set of tuples. Traditionally, skyline queries are defined upon single-instance data or upon objects each of which is associated with an instance. However, in some cases, an object is not associated with a single instance but rather by multiple instances. On a review website, many users assign scores to a product or service, and a user’s score is an instance of the object representing the product or service. Such data is an example of multi-instance data. For Facebook data, every post contains two attributes, namely the number of shares and the number of comments.. 政 治 大 and comments. The skyline query can search popular posts. 立. The task is to retrieve the posts that dominate others (or the skyline posts) in shares. ‧ 國. 學. The main contributions of this dissertation are as follows:. 1. We describe the 20 selected fan pages’ activities in a period of the sunflower. ‧. student movement. We discuss the numbers of posts and reactions from these. y. sit. io. er. posts.. Nat. fan pages. The reaction behavior includes liking, sharing, and commenting on. al. n. v i n Ch U propose a way to speed up the dominance relation on multi-instance e n g cdata h i and. 2. We apply the skyline query to search popular posts. Consequently, we define. the skyline query processing. 3. We define and calculate the speed and acceleration of information diffusion. Then we distinguish different types of posts. 4. We mine users who are important Facebook users during the civil movement. 5. We predict the political tendency of posts on Facebook in the United States of America. The remainder of this dissertation is organized as follows. Chapter 2 reviews related studies. Chapter 3 introduces the research design, data collection and 5. DOI:10.6814/DIS.NCCU.CS.002.2018.B02.

extraction, the skyline query, and information diffusion approaches. Chapter 4 explores the sunflower student movement. Chapter 5 applies the skyline query to search for popular posts. Chapter 6 discusses information diffusion on Facebook. Chapter 7 mines the important Facebook users. Chapter 8 predicts the political tendency of posts on Facebook. Finally, Chapter 9 draws conclusions and discusses research limitations and directions for future research.
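The dominance relation on posts described in this chapter can be illustrated with a minimal sketch. It assumes the simple single-instance case in which each post carries only a share count and a comment count; the `Post` type, the field names, and the sample values are hypothetical and are not taken from the dissertation's dataset. The quadratic scan below is only the naive definition, not the accelerated multi-instance method that the dissertation proposes.

```python
from typing import NamedTuple


class Post(NamedTuple):
    post_id: str   # hypothetical identifier, for illustration only
    shares: int    # number of shares
    comments: int  # number of comments


def dominates(a: Post, b: Post) -> bool:
    """a dominates b if a is at least as good in both attributes
    and strictly better in at least one of them."""
    at_least_as_good = a.shares >= b.shares and a.comments >= b.comments
    strictly_better = a.shares > b.shares or a.comments > b.comments
    return at_least_as_good and strictly_better


def skyline(posts: list[Post]) -> list[Post]:
    """Return the posts that no other post dominates (naive O(n^2) scan)."""
    return [p for p in posts if not any(dominates(q, p) for q in posts)]


# Made-up sample data.
posts = [
    Post("p1", 120, 30),
    Post("p2", 80, 95),
    Post("p3", 60, 20),
    Post("p4", 120, 10),
]
print([p.post_id for p in skyline(posts)])
```

In this made-up sample, `p3` and `p4` are both dominated by `p1`, while `p1` and `p2` are incomparable because each is better in one attribute, so both remain in the skyline.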

2. Literature Review

This chapter presents the results of the literature review. Section 1 introduces Web 2.0. Section 2 describes social media and Section 3 describes Facebook. Section 4 presents the sunflower student movement. Section 5 presents civil movements in other countries. Section 6 reviews the literature on information diffusion on social media. Section 7 reviews the literature on the skyline query. Section 8 describes left-wing and right-wing politics. Section 9 covers text mining and Section 10 covers sentiment analysis. Finally, the last section introduces text mining algorithms.

2.1 Web 2.0

Web 2.0 denotes the current state of online technology as compared with the early days of the Web; it is characterized by greater user interactivity and collaboration and by more pervasive network connectivity. One of the most significant differences between Web 2.0 and the traditional World Wide Web (WWW, referred to as Web 1.0) is greater collaboration among Internet users and content providers. Web 2.0 allows hundreds of millions of Internet users to produce and consume content.

The Internet, whose origins lie in research commissioned by the United States federal government in the 1960s, developed to carry an extensive range of information resources and services, such as the hypertext documents and applications of the WWW, e-mail, instant messaging (IM), Internet telephony, newsgroups, and file transfers [8].

The WWW is a global, interactive, distributed, multi-platform application built on the Internet. The Web project was started by Tim Berners-Lee at the European particle physics laboratory (CERN) in Geneva, Switzerland [9]. Berners-Lee wanted to find a way for scientists working on projects at CERN to

collaborate with each other online. He started the WWW project at CERN in March 1989. In January 1992, the first versions of the WWW software, known as Hypertext Transfer Protocol (HTTP), appeared on the Internet [8]. The WWW originated as a means of sharing information among scientists, who used graphs and text to share their work and documents. The WWW is a new way of viewing information, and its initial concept is hypertext. Viewing a web page on the WWW normally begins either by typing the Uniform Resource Locator (URL) of the page into a web browser or by following a hyperlink to that page [8]. A URL is a reference to a web resource that specifies its location on a computer network so that it can be retrieved.

Web 2.0 is a term that was first used in 2004 to describe a new way in which software developers and end-users started to utilize the WWW [10]. It refers to a web-based platform whose content is dominated by user-generated content (UGC), in contrast to the traditional content generated by the websites' own employees. In Web 2.0, content is generated and personalized through each user's participation, and it is shared among people. While Web 2.0 represents the ideological and technological foundation, UGC can be seen as the sum of all ways in which people make use of social media [11]. As early as 1999, the famous management scholar Peter F. Drucker stated that the development of information technology was going in the wrong direction: because it is "information" rather than "technology" that drives real social progress, we should focus our attention on information content [11]. The term "Web 2.0" originated in a brainstorming session hosted by Tim O'Reilly [10]. Web 2.0 comprises a series of new network technologies that shift the network from centralized to decentralized control, so that users can spread, share, and communicate more freely on the Internet.
O'Reilly's media company organized the first "Web 2.0 Conference" in October 2004 and established the term Web 2.0 [12].

2.2 Social media

By 1979, Tom Truscott and Jim Ellis from Duke University had created the Usenet, a worldwide discussion system that allowed Internet users to post public messages [11]. The era of social media as we understand it today probably started about 20 years earlier. On a social media platform, content is selected or edited by users. The authors of [14] identified two essential trends in the use of social media. First, users engage a range of tools for communication. Second, users embrace new tools and adopt them as part of their communication repertoire. These two trends suggest that users do not completely replace one form of social media with another, because each form supports unique communication needs that the others cannot completely fulfill. Emphasizing user-generated content, usability, and interoperability is the spirit of Web 2.0.

A social networking service (SNS) is a platform for building social networks or social relations among people who share similar interests, activities, backgrounds, or real-life connections. An SNS consists of a representation of each user, their social links, and a variety of additional services. Social network sites are web-based services that allow individuals to create a public profile, create a list of users with whom to share connections, and view and traverse the connections within the system. SNSs adopt distributed technology (i.e., P2P technology) to build the next generation of Internet-based software. Early social networking on the WWW began in the form of generalized online communities. Many of these early communities focused on bringing people together to interact with each other through chat rooms, and they encouraged users to share personal information and ideas via personal web pages by providing easy-to-use publishing tools and free or inexpensive web space.
Social networking services are social networking software based on the theory of the Six Degrees of Separation [13]. Users can expand their contacts through friends of friends. Most social networking sites let users interact by sending e-mail, writing blogs, sharing data, joining discussion groups, and so on. Now, many Web 2.0 websites are SNS websites, such as Facebook, YouTube,

Myspace, Twitter, etc. The development of social networking websites verified the "Six Degrees of Separation" [13]. This is the theory that everyone is six or fewer steps away, by way of introduction, from any other person in the world, so that a chain of "a friend of a friend" statements can be made to connect any two people in a maximum of six steps [15]. The concept was originally set out by Frigyes Karinthy in 1929 and popularized by a 1990 play written by John Guare.

Social media play a crucial role in recent social movements. Several revolutions or movements across the globe have been led by young adults through the Internet [16], [17], [18], [19], [20], [21], such as the Arab Spring, the Indignados protest in Spain, and the Occupy Wall Street movement in North America. These movements highlighted the particular role of digital social media networks and their contribution to the facilitation of protest movements [22]. The authors of [22] argue that social media were a crucial factor in the prolongation and success of the civil movement they studied. They indicate that another important factor in protestors' use of social media was mistrust in the traditional media [22].

2.3 Facebook

Facebook is a social media website and social networking service. The Facebook website was launched on February 4, 2004, by Mark Zuckerberg [24]. Facebook is a very successful social media website and is also the most popular social networking platform worldwide. Facebook's users can create profiles, update their status, publish posts, and share photos and videos on their walls [1]. There have been many studies of Facebook, most of which concern the structures and dynamics of social networks [1]. According to a study in 2015, 63% of the users of Facebook or Twitter in the USA consider these networks to be their main source of news, with entertainment news being the most seen [25].
Facebook and Twitter are the most popular social

networking platforms. Based on the Diffen website report [23], Table 1 summarizes a comparison of Facebook with Twitter.

Table 1. Facebook versus Twitter

| Attribute | Facebook | Twitter |
| --- | --- | --- |
| Brief | It is a popular free social networking website that allows registered users to create profiles, upload photos and video, send messages and keep in touch with friends, family, and colleagues. | It is a free microblogging service that allows registered members to broadcast short posts called tweets. |
| Features | Friends, Fans, Wall, News Feed, Fan Pages, Groups, Apps, Live Chat, Likes, Photos, Videos, Text, Polls, Links, Status, Pokes, Gifts, Games, Messaging, Classified section, upload and download options for photos | Tweet, Retweet, Direct Messaging, Follow People and Trending Topics, Links, Photos, Videos |
| Registration | Required | Required |
| Upload photographs | Yes | Yes |
| Instant messaging | Yes | No |
| Launch date | February 4, 2004 | July 6, 2006 |
| Number of users | 2 billion monthly active users (2017) | 319 million monthly active users (2016) |
| Number of users in Taiwan | 18 million monthly active users (2017) | - |
| Founded by | Mark Zuckerberg | Jack Dorsey |
| Post update | Yes | Yes |
| Play games | Yes | No |
| Users express approval of content by | Like, Share | "Retweet" or "Favorite" |
| Share links | Yes | Yes |
| Languages | Available in 140 languages (2016) | Available in 29 languages (2016) |
| Employees | 12,691 (2015) | 3,628 (2015) |
| Reblog posts | Yes | Yes |
| Follow trending topics | No | Yes |
| Follow people | Yes | Yes |
| Add friends | Yes | No |
| Privacy settings | Yes | Either public or private |
| Post length | Unlimited | 140 characters |
| Edit posts | Yes | No |
| Users express opinions about content by | "Comment" | "Reply" |
| Hashtag | Yes | Yes |
| Check-in place | Yes | No |
| Activities of every minute | 382,000 Facebook likes | 350,000 tweets |

As of 2017, Facebook had 2 billion monthly active users all over the world. Facebook is a very successful Web 2.0 website. It is an online social networking service founded by Mark Zuckerberg with Andrew McCollum and Eduardo Saverin. "The Facebook" was launched on February 4, 2004, when Zuckerberg was a Harvard student [26]. The website initially limited its membership to Harvard students but, in the following two months, expanded to colleges in the Boston area (e.g., MIT), Stanford University, New York University, Northwestern University, and the Ivy League [26]. In 2005, students at various other universities were also invited, and later any student whose e-mail address was registered at any

university in the world was allowed to register. The idea of Facebook came from a high school directory. When Zuckerberg was admitted to Phillips Exeter Academy in his junior year, he received the student directory like the other students of the school. This directory, "The Photo Address Book" (containing the photo and address of each student), was referred to by students as "The Facebook" [26]. Such photo directories were an important part of the student social experience at many private schools like Phillips Exeter Academy. The directory is still in use today. This "Facebook" became an important resource for students to contact each other, just as the Facebook website is nowadays. The student directory even created a social culture before the birth of the Internet, and this culture is consistent with the spirit of Web 2.0. Since 2006, anyone who is at least 13 years old has been allowed to become a registered member of Facebook, though the age requirement may be higher depending on applicable local laws.

At present, Facebook has become the world's largest SNS. If you use the Internet, you are increasingly likely to use Facebook. It is the second-most-visited website, after Google, and it had 2 billion monthly active users worldwide as of 2017 [26]. According to Facebook, a monthly active user is a registered Facebook user who logged in and visited Facebook through its website or a mobile device, or used its Messenger app, in the 30 days before the date of measurement. This platform is also the most popular social network worldwide. Facebook's users spend an average of 20+ minutes per day on the social network, liking, commenting, and scrolling through status updates, according to analysts from Needham, citing comScore data in 2015.
Therefore, many enterprises use Facebook to carry out marketing activities, sell products or services, and enhance their brand image [27], [28]. Although 20+ minutes is the global average, people in America spend much more time than that, according to Facebook's internal information. In 2014, Facebook's CEO Mark Zuckerberg said that the average US user spends 40 minutes on Facebook per day. According to a study commissioned by analytics firm Verto

reported on the Search Engine Journal website, Facebook's 222 million monthly users in the U.S. spend an average of 14 hours each month within the company's mobile app. Given the above, our research dataset from Facebook is more representative than datasets from other SNSs.

Unlike just about any other website or technology business, Facebook is profoundly, centrally, about people [26]. It is a platform for people to get more out of their lives; it is a new form of communication [26]. Facebook's service is constantly being updated, and the website provides many functions. It allows a user to add friends, send messages, and update personal profiles to notify friends and peers about themselves [14]. Users can post information about themselves ranging from their occupation to their religious and political views to their favorite movies and musicians. On this profile, both the user and their "friends" can post web links, pictures, and videos of interest. Further, Facebook offers the facility to send private and public messages to other users and even to engage in real-time instant messaging [29]. Facebook users can also form and join virtual groups, develop applications, host content, and learn about one another's interests, hobbies, and relationship statuses through users' online profiles.

Furthermore, News Feed was announced on September 6, 2006. It appears on every user's homepage and highlights information including profile changes, upcoming events, and birthdays of the user's friends. News Feed was more than just a change to Facebook. It was the harbinger of an important shift in the way that information is exchanged between people [26]. Until then, when you wanted to get information about yourself to someone, you had to initiate a process or "send" them something, as you do when you make a phone call, send a letter or an e-mail, or even conduct a dialogue by instant message.
Facebook provides many user-friendly tools, and it has therefore become a way of life. You can instantly communicate with old high school friends or peruse your neighbor's vacation photos. Its wide variety of picture, video, advertising, and security features will keep this social media

service at the top for the time being. In summary, the core of the Facebook experience centers on users' ability to (1) post self-relevant information on an individualized profile page, (2) link to other members, and (3) interact with other members [30].

2.4 The sunflower student movement

March 18, 2014 is an important day in Taiwan's history, because many students occupied the Taiwanese Parliament Building on that day. On March 17, Taiwan's ruling Kuomintang party (KMT) attempted a unilateral move in the Legislative Yuan to force the Cross-Strait Service Trade Agreement (CSSTA) to the legislative floor without giving it a clause-by-clause review as previously established in a June 2013 agreement with the opposing Democratic Progressive Party (DPP). The CSSTA is expected to have a huge impact on the lives of ordinary people, including considerable job losses or worsened working conditions in several enterprises [31]. This movement is an important student movement in the democratic development of Taiwan. "What they have demanded, they have not yet received; thus they keep pressing on. As the 4th day passed by since student activists taking over Taiwan's national legislative building on midnight March 18th, the momentum behind this protest against Cross-Strait Service Trade Agreement between Taiwan and China has only grown stronger than ever." These statements were made by CNN News about the sunflower movement. They show that the international media paid attention to this civil movement.

The term "the sunflower student movement" derives from a post that the Black Island Nation Youth Front fan page issued on Facebook on March 19, 2014. The content of this post was "we hope to help buy sunflowers to cheer for the student movement". The post was shared widely on Facebook. Its authors noted that sunflowers are phototropic and hoped that sunflowers would cheer on the student movement.
The term was popularized after a florist contributed 1,000 sunflowers to the students

outside the Legislative Yuan Building. "Sunflower" was also an allusion to the Wild Lily movement of 1990, which set a milestone in the democratization of Taiwan. Taken together, these events show that Facebook is clearly a tool for spreading messages in the new generation.

The sunflower student movement is also called the "March 18 Student Movement" or "Occupying Taiwan Legislature". The students fended off a series of attempts by the police to breach the Taiwanese Parliament Building. This was only the beginning of the students' resistance. They attempted to stop the undemocratic coalition of some political elites in both Taiwan and China. The students spread messages and videos to anyone committed to the principles of democracy, transparency, and participation. They posted and shared posts on Facebook and even created fan pages on Facebook to fight against the CSSTA. Since March 18, protesters have spread messages and shared thoughts on the fan pages established by supporters on Facebook and on other social media.

2.4.1 The 323 Executive Yuan event

The protest swelled further to comprise over 3,000 people; it was supposed to be peaceful. On the night of March 23, a group of protesters snuck into the Executive Yuan to occupy the chamber. They posted information on Facebook inviting supporters to join. A few hours later, Premier Jiang Yi-Huah, who had consulted President Ma, ordered hundreds of police officers to use force to evict the protesters. Over 100 people were injured during the attack on March 24; fifty of them were occupiers, and the others were police officers who were accidentally injured by their peers. In Taiwan, democracy is under threat. Most newspapers and TV stations failed to provide full and accurate information about this event. Journalists belonging to the mainstream media downplayed or dismissed the activities of the protesters.
However, the news was quickly circulated on Facebook and attracted

thousands of supporters. Witnesses have given narratives of what is being called "the 323 event" [32].

2.4.2 The 330 demonstration

The 330 demonstration refers to the event in which the students occupying the Legislative Yuan called on the masses to go to Ketagalan Boulevard on March 30, 2014. The organizers asked people participating in the 330 demonstration to wear black clothes as a mark, so many media outlets also called the participants "Black-shirts". Their demands were "the defense of democracy, the return of the service trade agreement, and people standing up".

During the sunflower student movement, the students' representative announced the expansion of the protests and called on the people to go to Ketagalan Boulevard for the 330 demonstration [32]. On March 29, the students stressed that they would continue to occupy the Legislative Yuan after the 330 demonstration if the government still made no response to their four demands. At 14:00 on March 30, the students gathered 350,000 people to participate in the demonstration. The day after the demonstration, KMT policy committee chief executive Lin Hung-Chih and KMT legislator Chang Ching-Chung held a press conference and apologized to the public. Throughout these days, the students of this movement called on the people through Facebook.

2.5 The civil movement of the other countries

Social media have played a crucial role in recent civil movements, and the most prominent examples are the Arab Spring and the Occupy Wall Street movement in 2011 [33]. However, using social media for mobilization is not a recent phenomenon. In 1994, a group of indigenous people in Mexico called the Zapatistas made use of

Internet technology in its early stage [34]. They demanded socio-economic reforms as well as respect for indigenous people. Martinez-Torres describes their uprising as the "first informational guerilla movement" [34]. The Zapatistas wanted to raise awareness of the injustice they encountered and called for international support. With the help of e-mail, BBS, and fax, they were able to connect to other parts of the world. The Zapatistas built a dialogue between themselves and the outside world.

In recent years, due to the rise of social media websites on the Internet, people can publish their statuses and messages to social media from any place. When a civil movement or revolution happens, people use such websites to publish whatever they see at any time. The Arab Spring was a revolutionary wave of both violent and non-violent demonstrations, protests, riots, coups, and civil wars in the Arab world that began on 17 December 2010 in Tunisia with the Tunisian Revolution and spread throughout the countries of the Arab League and their surroundings [16]. The trigger was the self-immolation of the Tunisian Mohamed Bouazizi. Unable to find work and selling fruit at a roadside stand, Bouazizi had his wares confiscated by a municipal inspector on 17 December 2010. An hour later he doused himself with gasoline and set himself on fire. The news of his death on 4 January 2011 spread quickly, although it was not covered by the traditional news outlets in Tunisia. Images of Bouazizi spread and fueled the public's anger against the government [36]. These images and videos were spread through social media websites such as YouTube, Facebook, and Twitter. According to The Washington Post, with the success of the protests in Tunisia, a wave of unrest sparked by Bouazizi struck Algeria, Jordan, Egypt, and Yemen, and then spread to other countries.
The Indignados movement in Spain had origins in social networks and began with demonstrations on 15 May 2011 close to the local and regional elections, held on 22 May. Since the ongoing economic crisis began, Spain has had one of the highest unemployment rates in Europe, reaching a Eurozone record of 21.3% [22]. The youth

unemployment rate stands at 43.5%, the highest in the European Union. According to Yahoo News, the Spanish government approved a sweeping overhaul of the labor market designed to reduce unemployment and revive the economy. However, the main trade unions rejected the plan because it made it easier and cheaper for employers to hire and fire workers. In 2011, Spanish young adults created digital platforms on Spanish social networks and forums. The Facebook groups "Real Democracy NOW" and "Platform of Coordination of Groups Pro-Citizen Mobilization" were created. They criticized how the Spanish government managed the economic crisis in Europe.

This dissertation focuses on Facebook fan pages. According to the Facebook website, a fan page is the only way for entities like organizations, celebrities, and political figures to represent themselves on Facebook. Unlike Facebook groups, fan pages are visible to everyone on the Internet by default. Any person on Facebook can connect with these fan pages by becoming a fan and can then receive their updates in the News Feed and interact with them. Facebook groups are the place for small-group communication and for people to share their common interests and opinions. When you create a group, you can decide whether to make it publicly available for anyone to participate, to require administrator approval for members to participate, or to keep it private and by invitation only. We consider fan pages to be more public than groups from a user's perspective. Anyone can become a fan (i.e., by "Liking" the page). In this dissertation, we want to collect messages from as many people as possible and then analyze message diffusion. Therefore, this dissertation uses the public datasets of fan pages on Facebook.
In North America, Occupy Wall Street is the name given to a protest movement that began on September 17, 2011, in Zuccotti Park, located in New York City's Wall Street financial district [37]. The main issues raised by Occupy Wall Street were social and economic inequality, greed, and corruption in government. It was inspired by the anti-austerity protests of the Indignados movement in Spain. The protesters created Facebook fan pages for the demonstrations and shared videos on

YouTube. In [38], the authors explore the notion that Twitter may support multiple opportunities for participation in the Occupy Wall Street movement. Today, social media platforms provide protesters with an open space to discuss and share their emotions and feelings with one another and to mobilize for demonstrations.

The three civil movements have three characteristics in common: they are all more or less spontaneous, they are leaderless, and every country has its preferred social media [33]. Social media are an important tool for diffusing messages; they help protesters not only to mobilize but also to gain international attention and to prolong the movement over a long period. Facebook is a representative social media platform. Therefore, we use Facebook data to analyze message diffusion and investigate the reaction time for some particular events during the sunflower student movement.

2.6 Information diffusion on social media

Social networks have become powerful tools for information diffusion. They facilitate the rapid and large-scale propagation of content and of the consequences of information [38]. Today, many political events and critical news items are spread through these social networks. Social networks play a major role in the diffusion of this information and have proven to be very powerful in many situations, such as the 2010 Arab Spring or the 2008 U.S. presidential elections [39]. Twitter is considered to have played an important role in the Arab Spring and other political activities. On the day of the 2016 U.S. presidential election, Twitter proved to be the largest source of breaking news, with 40 million tweets sent by 10 p.m. that day, according to the New York Times. In Taiwan, according to a survey by Commonwealth Magazine, the usage rate of Twitter stands at 5.5%, compared with 87.2% for Facebook. Therefore, Facebook is the most representative and suitable social network for this study.

One study examines the relationship between students' Facebook use and their political participation [40]. Its authors mention that "About students' Facebook use, motivation of use and degree of use are two major items." In addition, their study shows that the degree of students' Facebook use played a key role in their participation in the sunflower student movement [40]. Another study explains that social media were a crucial factor in the prolongation and success of this movement [22]. The authors of [22] mention that "Besides the high penetration rate of social media, such as Facebook, among young Taiwanese, another important factor for protestors to use social media were the mistrust in the traditional Taiwanese media." These studies show that ordinary people, through social media, exert the same influence as the mainstream media. The authors of [22] had to interview participants in the sunflower student movement and assess their use of social media during the movement. Our study, in contrast, analyzes Facebook data in place of interview work. We use a novel approach to search for important users or participants during this movement.

Most studies discuss message flow and diffusion on social network websites [2], [41], [42], [43], [44]. They have analyzed the roles that factors such as the topological structure of social networks play in message spreading. Some studies [34], [45], [46], [47], [48], [49] examine the relative roles of strong and weak ties in information propagation. The study in [45] shows that weak ties may play a more dominant role in the dissemination of information online than currently believed. These studies are based on the structure of the social network, such as users' friend graphs on Facebook. Our study, however, develops a novel approach to analyzing information diffusion on Facebook.
Our method searches for important users who change the speed of information diffusion, without using social connections to build a diffusion network of users. Related studies on message flow and diffusion in social networks include [2], [41], [46], [47], [48]; they likewise analyze the roles that factors such as the topological structure of social networks play in message spreading. Yang and Counts present

results of network analysis for information diffusion on Twitter by using users' ongoing interactions as denoted by "@username"; they build the diffusion network and develop models for three dimensions of diffusion networks on Twitter [44]. Spasojevic et al. examine the flow of messages in the entire network and recommend the best times for a user to post on social networks, and they collect users' friend graphs on Facebook [2]. These studies are also based on the structure of social networks.

Social network analysis (SNA) is not a formal theory in sociology but rather a strategy for investigating social structures [48]. It is an idea that can be applied in many fields. SNA is the process of investigating social structures through the use of network and graph theories [48]. Examples of social structures commonly visualized through social network analysis include social media networks, meme spread, friendship and acquaintance networks, and collaboration graphs [49]. On the other hand, some studies address the issue of predicting the temporal dynamics of the information diffusion process [38], [39], [50]. In [38], the authors also construct a graph-based model for information diffusion prediction; they build a platform that helps in understanding social network users' interests and activity by providing detection of emerging topics and events as well as network analysis functionalities. These researchers need to construct graph structures. However, our collected dataset does not contain such graph structures. We develop a new method to handle SNA without graph structures.

2.7 Skyline query

The skyline operation was proposed as an extension to database systems [51]. This operation discovers a set of interesting tuples from a potentially large set of tuples. The basic way to compute the skyline is to apply the block-nested-loop (BNL) algorithm and compare every tuple with every other tuple [52].
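The block-nested-loop idea can be sketched as follows. This is a simplified, in-memory illustration written for this section, not the algorithm exactly as specified in [52]: it assumes that larger attribute values are better and that the whole candidate window fits in memory, so the temporary-file handling of a real BNL implementation is omitted. The function and variable names are hypothetical.

```python
from typing import Sequence

Point = Sequence[float]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: no worse in every dimension, strictly better in one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def bnl_skyline(tuples: Sequence[Point]) -> list[Point]:
    """In-memory sketch of block-nested-loop skyline computation."""
    window: list[Point] = []  # current skyline candidates
    for t in tuples:
        # Discard t if some candidate already dominates it.
        if any(dominates(w, t) for w in window):
            continue
        # Otherwise t evicts every candidate it dominates, then joins the window.
        window = [w for w in window if not dominates(t, w)]
        window.append(t)
    return window


data = [(1, 5), (3, 3), (2, 2), (5, 1), (3, 4)]
print(bnl_skyline(data))
```

Each incoming tuple is compared only against the current candidates rather than against every other tuple, which is why the window usually stays much smaller than the input.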
In [52], the authors also use the divide-and-conquer (D&C) algorithm [53] to implement the skyline query. Two progressive techniques, Bitmap and Index, are proposed in [54]. A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional storage space and extra writes to maintain the index. An index is used to locate data quickly without having to search every row each time the table is accessed. The nearest neighbor (NN) algorithm uses the results of nearest neighbor search to partition the data universe recursively [54]; it executes nearest neighbor queries on R-trees. In [54], the authors mention that NN has some desirable features (such as high speed for returning the initial skyline tuples) but presents several inherent disadvantages (such as the need for duplicate elimination if the dimension is larger than 2, multiple accesses of the same node, and large space overhead). Therefore, the authors developed the branch-and-bound skyline (BBS) algorithm [54], [55].

Some works sort the input data to speed up query processing [51], [56], [57], [58], [59], [60], [61], [62]. The sorting-based algorithms aim to optimize pivot ordering to prune non-skyline tuples early. The first sorting-based algorithm is the sort-filter-skyline (SFS) algorithm [56]. In [55], the authors define a monotone scoring function (ordered from highest to lowest score), which yields a topological sort with respect to the skyline dominance partial relation. We also define a monotone function in our study. Godfrey et al. mention that the maximal vector problem has been rediscovered in the database context with the introduction of the skyline query [58]. Computing the skyline is known as the maximum vector problem [57], [63], [64].
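The role of the monotone scoring function in sorting-based algorithms can be illustrated with a short sketch in the spirit of SFS. It is an assumption-laden illustration rather than the algorithm of [56]: larger values are assumed to be better, the sum of attributes is used as one possible strictly monotone score, and the names `dominates` and `sfs_skyline` are hypothetical. Because a dominating tuple always has a strictly higher score, a tuple that passes the filter can never be dominated by a tuple that appears later in the sorted order.

```python
from typing import Callable, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b: at least as good everywhere, strictly better somewhere.

    Assumption: larger is better in every dimension.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def sfs_skyline(points: List[Point],
                score: Callable[[Point], float] = sum) -> List[Point]:
    """Sort-then-filter sketch.

    `score` must be strictly monotone with respect to dominance:
    if a dominates b, then score(a) > score(b). The sum of attributes
    satisfies this; it is only one possible choice.
    """
    ordered = sorted(points, key=score, reverse=True)  # highest score first
    skyline: List[Point] = []
    for p in ordered:
        # Only tuples already in the skyline can dominate p, so one pass suffices
        # and no tuple ever has to be removed from the skyline again.
        if not any(dominates(s, p) for s in skyline):
            skyline.append(p)
    return skyline


# Hypothetical tuples, e.g. (likes, comments, shares) of five posts.
posts = [(10, 2, 1), (8, 5, 0), (9, 1, 1), (10, 5, 1), (3, 9, 0)]
result = sfs_skyline(posts)
```

Compared with pairwise comparison of all tuples, the presorting step means each tuple is tested only against confirmed skyline tuples.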
In [58], the authors present a new algorithm for maximal vector computation, linear elimination sort for skyline (LESS), which combines aspects of SFS, BNL, and fast linear expected-time (FLET) [62] but does not contain any aspects of D&C. LESS must sort the tuples initially; it is an optimized version of SFS [58]. In [61], the sort and limit skyline algorithm (SaLSa) exploits the idea of presorting the input data so as to effectively limit the number of tuples to be read and compared. SaLSa strives to avoid scanning the complete set of sorted tuples, and its distinguishing feature is the ability to compute the result without having to apply dominance tests to all the tuples in the input relation [61]. Many algorithms, such as SFS, LESS, and SaLSa, need to sort tuples first, and so do our methods.

Besides, the partitioning-based algorithms aim to group tuples into sub-regions, which are used for region-based dominance tests. D&C [52] simply divides the problem into multiple sub-problems and merges the local skyline tuples into global ones. Zhang et al. [59] propose an object-based space partitioning (OSP) scheme, which recursively divides the data space into separate partitions with respect to a reference skyline tuple and facilitates progressive retrieval in high-dimensional spaces.

Table 2 summarizes the features of skyline query processing algorithms in the literature, and Table 3 summarizes related works according to their features.

Table 2. The features of skyline query processing algorithms

| Features | Description | Abbreviation |
| Sorting technique | Researchers sort the input data by using some functions | ST |
| Dominance checking approach | Researchers use some methods to reduce calculations | DA |
| Indexing technique | Researchers build an index to speed up query processing | IT |
| Application | Researchers evaluate their algorithms with real data | Ap |

Table 3. The summary of related works on skyline query

| Papers | Algorithms | ST | DA | IT | Ap |
| [53] | BNL, D&C | No | No | Yes | Yes |
| [54] | Bitmap, Index | Yes | No | Yes | No |
| [56], [57] | SFS | Yes | No | No | No |
| [60] | NN | Yes | No | Yes | Yes |
| [54], [58] | BBS | Yes | Yes | Yes | Yes |
| [58] | LESS | Yes | Yes | No | No |
| [61] | SaLSa | Yes | Yes | No | No |
| [59] | OSP | Yes | Yes | Yes | No |

In the typical application to which our methods can be applied, users assign scores to items or objects, and these scores are then transformed into multi-instance data by the proportion method. An object can contain a series of probability values. Many prior works show that the skyline query is very useful in multi-criteria decision-making applications [56], [53], [54], [55], [57], [58], [59], [60], [61], [62]. Uncertainty in data is inherent in many applications, such as sensor networks, scientific data management, and data integration, where data take different values with probabilities [65]. Probabilistic data are unavoidable in some important applications. The first work on supporting the skyline query on such data, called p-skyline, is reported in [66], in which the authors consider analyzing professional basketball players using multiple technical statistics criteria and attempt to find the player who can achieve the best performance in all aspects. In [66], the authors propose a probabilistic skyline model in which an uncertain tuple has some probability of being on the skyline. Given a threshold p (0 ≤ p ≤ 1), the p-skyline is the set of uncertain objects, each of which has a probability of at least p of being on the skyline [67], [68]. In [67], the definition of an instance is different from that in our study. Atallah and Qi propose a general probabilistic skyline query that takes into account different user utilities without any restriction, but they do not use any probability threshold [65], [66]. Liu et al. propose a new uncertain skyline model called u-skyline [51], which aims to return an uncertain skyline answer set from a perspective complementary to p-skyline.
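The threshold definition of the p-skyline can be made concrete with a small sketch. It rests on assumptions that are not taken from this text: each uncertain object is a list of equally likely instances, objects are independent of one another, and larger values are better. Under those assumptions, the probability that an object is on the skyline is the average, over its instances, of the probability that no other object takes an instance dominating it. The function names and data are hypothetical, and the notion of an instance here is not the one used in our study.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b (assumption: larger is better in every dimension)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def skyline_probability(name: str, objects: Dict[str, List[Point]]) -> float:
    """Probability that object `name` is on the skyline.

    Assumptions: instances of an object are equally likely and
    different objects are independent.
    """
    instances = objects[name]
    total = 0.0
    for u in instances:
        not_dominated = 1.0
        for other, other_instances in objects.items():
            if other == name:
                continue
            # Fraction of the other object's instances that dominate u.
            k = sum(dominates(v, u) for v in other_instances)
            not_dominated *= 1.0 - k / len(other_instances)
        total += not_dominated
    return total / len(instances)


def p_skyline(objects: Dict[str, List[Point]], p: float) -> List[str]:
    """Objects whose skyline probability is at least the threshold p."""
    return [n for n in objects if skyline_probability(n, objects) >= p]


# Hypothetical uncertain objects with two-dimensional instances.
data = {
    "A": [(5, 5), (1, 1)],
    "B": [(4, 4), (6, 6)],
    "C": [(2, 7)],
}
answer = p_skyline(data, 0.5)
```

For this hypothetical data the skyline probabilities are 0.25 for A, 0.75 for B, and 1.0 for C, so a threshold of 0.5 keeps B and C.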
The p-skyline returns individual data tuples whose non-dominance probabilities are greater than or equal to a specified threshold [51], while the u-skyline returns an answer set that forms a valid skyline with the maximum probability [51]. Most works assume that uncertainty exists only in attribute values [69]. Zhang et al.
