A Posting Recommendation System for Question Formulation: A Healthcare Q&A Forum Study - NCCU Academic Hub


Graduate Institute of Management Information Systems, National Chengchi University
Master's Thesis

A posting recommendation system for question formulation: A healthcare Q&A forum study

Advisor: Dr. Yi-Ling Lin
Student: 陳怡儒
July 2019

DOI:10.6814/NCCU201900771

ACKNOWLEDGEMENT

To my family,
To the participants, who joined the user study to support my data collection,
To my advisor, Dr. Yi-Ling Lin, who gave me guidance and helped me to complete my research,
To the thesis committee, Dr. Szu-Yin Lin and Dr. Hsin-Lu Chang, and
To the community that is exploring question-answering forums and recommendation systems.

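摘要 (Chinese abstract)

Search engines are the tool people habitually use to look for answers, but not everyone can quickly find the information they need through precise queries or keywords. Unlike search engines, online Q&A forums do not restrict how users must phrase a question in order to find an answer. This free-form writing environment has gradually attracted many users to these platforms to look for useful information and resolve difficulties they meet in specific domains. However, user-generated content inevitably suffers from unclear expression, especially for questions in a specific domain. Because askers usually lack a professional background, the forum's classification system may fail to match their posts, or the purpose of the question may be too vague to receive answers from professionals. We therefore design a posting recommendation system, hoping to improve the usability of online forums. Common recommendation mechanisms tend to match suitable repliers or show users similar existing questions, but no one has studied how to write good posts during the asking process. This thesis focuses on whether, in a healthcare case, a new kind of recommendation system can improve askers' posts. We adopt the design science methodology to develop a topic-based posting recommendation system. The goal is to help users better understand their own questions, organize their thoughts to write decent posts, and finally obtain effective answers. Our recommendation system is evaluated through a user study that recruited 27 participants and used three models under two healthcare scenarios, analyzed with generalized estimating equations, to see whether users improve their posts because of the recommendation system. In addition, we recruited two experts in medical-related professions and asked them to rate the posts revised with the recommendation system, to verify whether such posts let experts resolve the problem quickly. Our results show that, although the usability of the topic-based recommendation system still needs more evidence, writing posts with the posting recommendation system does differ from writing posts without it. Furthermore, the two experts' ratings show that adding more information about the condition and using the topic-based posting recommendation system does increase the likelihood of receiving higher ratings. Although this thesis encountered some problems and cannot directly confirm that our current topic-based design is feasible, we have established that a recommendation mechanism that assists post writing is worth exploring. For service providers that frequently need to assist askers, a post-optimization mechanism can help them compile frequently asked questions and answers, and guide askers to the answer page during writing so that problems are resolved quickly.

Keywords: Q&A forum, medical informatics, recommendation system, topic-based recommendation, word embedding, semantic relatedness, Word2Vec, WordNet, user-generated content, design science, user study, generalized estimating equations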

Abstract

People have become used to googling whenever they are confused about something, but not everyone can formulate clear queries on a search engine. Unlike search engines, online Q&A forums allow people to write anything in any style they want. This free-writing environment gradually attracts more users to search for useful information and deal with their difficulties in specific domains on these platforms. However, we cannot control the quality of user-generated content. Few askers have professional backgrounds, so unclear posts may be miscategorized by supporting systems or be difficult for experts to understand. We therefore design a posting recommender and expect it to improve the usability of online forums. In the past, recommendation mechanisms matched suitable repliers or suggested similar questions to users, but no one investigated how to write better posts. This thesis focuses on a new recommendation system that supports askers in optimizing posts, studied in a healthcare setting. We apply the design science methodology to develop a topic-based posting recommender. Its goal is to support users in understanding more about their own issues, organizing their thoughts to write reasonable posts, and then receiving effective answers. Our posting recommender is evaluated using generalized estimating equations on a user study with 27 participants, three models, and two healthcare conditions, to see whether users become more engaged in question generation. In addition, we recruited two medical-related experts, who rated participants' posts, to verify whether modified posts attract experts to answer quickly. Though the usefulness of our posting recommender still needs more evidence, the results indicate that writing with a posting recommender differs from writing without one. The analysis of the expert ratings further shows that adding more information and using a posting recommender increase the likelihood of receiving higher ratings. The thesis encounters some problems in confirming the effectiveness of the design at this stage, but the posting recommender proves to be worth investigating. Those who must spend considerable effort dealing with askers may find the posting recommender mechanism useful. A mechanism for optimizing posts can help service providers compile frequently asked questions and guide askers to the correct answer before posting.

Keywords: Question-answering forum, Healthcare informatics, Recommendation system, Topic-based recommendation, Word embedding, Semantic relatedness, Word2Vec, WordNet, User-generated content, Design science, User study, Generalized estimating equations

Table of Contents

CHAPTER 1 INTRODUCTION
  1.1 Background and Motivation
  1.2 Research Purpose and Questions
  1.3 Contribution
  1.4 Content organization
CHAPTER 2 LITERATURE REVIEW
  2.1 Recommendation mechanisms in healthcare domain
  2.2 Recommendation in the asking process
CHAPTER 3 RESEARCH METHODOLOGY
  3.1 Identify problem & motivation
  3.2 Define objectives of a solution
  3.3 Design and development
    3.3.1 System layout
    3.3.2 Data preparation of the RS
    3.3.3 Features extraction
    3.3.4 Recommender systems’ implementation
  3.4 User study
    3.4.1 Dataset
    3.4.2 Models
    3.4.3 Tasks and experimental materials
    3.4.4 Participants and procedure
  3.5 Evaluation
    3.5.1 User study evaluation
    3.5.2 Judgements
  3.6 Communication
CHAPTER 4 ANALYSIS AND RESULTS
  4.1 Demographic information
  4.2 Analysis of results
    4.2.1 Effectiveness
    4.2.2 Efficiency
    4.2.3 Circumstance
  4.3 Experts rate posts
    4.3.1 Pharmacy expert
    4.3.2 Medical expert
    4.3.3 Interaction analysis between model and description
  4.4 Satisfaction questionnaire
CHAPTER 5 DISCUSSION
  5.1 Effectiveness perspective
  5.2 Efficiency perspective
  5.3 Circumstance perspective
  5.4 Subjective adjudgment
  5.5 Discussion of questionnaire
CHAPTER 6 CONCLUSION
APPENDIX 1 – Supportive paragraphs for participants
REFERENCE

Tables

Table 1 The order setting of our user study
Table 2 Demographics of the participants by models
Table 3 Research questions, hypotheses, and measurements
Table 4 Descriptive of three main hypotheses
Table 5 Descriptive (mean ± S.E.) of the length of a post in pairwise comparison by applying a RS and not applying a RS
Table 6 Descriptive (mean ± S.E.) of the number of medical-related features in pairwise comparison by applying a RS and not applying a RS
Table 7 Significant effect influencing the existence of description between applying a RS and not applying a RS under model and illness parameter
Table 8 Descriptive (mean ± S.E.) of the existence of description in pairwise comparison by applying a RS and not applying a RS
Table 9 Summary of significant effects relating to effectiveness
Table 10 Descriptive of recommended features adoption
Table 11 Significant effect influencing the number of medical-related features between applying a word embedding RS and a semantic RS under model and illness parameter
Table 12 Summary of significant effects relating to efficiency
Table 13 Descriptive (mean ± S.E.) of the amount of used time associating with model and task variables in pairwise comparison
Table 14 Summary of significant effects relating to circumstance
Table 15 Summary of significant effects of the pharmacy expert analysis
Table 16 Summary of significant effects of the medical expert analysis
Table 17 Interaction analysis between labeled true description and models to find rating points’ effects
Table 18 Satisfaction questions on the post-questionnaire and their average points from participants

Figures

Figure 1 Six phases of our design process
Figure 2 Four steps of our design’s procedure
Figure 3 The display of the word2vec posting recommender
Figure 4 The whole operation from a user’s perspective
Figure 5 The material that a participant reads before writing a post in the try-out
Figure 6 Distribution plots

Chapter 1 INTRODUCTION

1.1 Background and motivation

Owing to the rapid development of computational power and data analysis over the past decades, it is easy to find information from Internet-based resources. The appearance of Google, with its optimized search algorithms, has cultivated people's habit of “googling” the moment they are confused about something. However, not everyone can come up with precise queries that efficiently and effectively narrow down the range of diverse results. These users then often decide to add more details to the search to find closer answers. In fact, the search results become more ambiguous and noisier if items in the search pool do not have exactly the same structure as the search query. Unlike search engines, which carry this constraint, online Q&A forums (where users can post questions and respond to others’ questions) allow people to write anything in full sentences if they find it hard to summarize their problems. Further, some online Q&A forums are more community-oriented, focusing mainly on a particular domain to maintain the quality of their content. As time passes, the text data collected from a platform’s users accumulate. With suitable processing, it is feasible to optimize the whole asking process. This natural language-based environment encourages researchers and forum operators to make the asking process of users on Q&A websites more automatic and customized.

We summarize the asking process of users on Q&A websites in four steps: (a) the intention to look for a solution to a problem, (b) the action of searching for relevant questions already asked, (c) the action of formulating a new post if existing threads cannot resolve the doubts, and finally (d) getting solutions from other forum users. After observing several Q&A websites, we found that the most direct way of supporting the asking process is to utilize various recommendation systems. Yahoo! Answers, one of the earlier and well-known Q&A websites, asks users to categorize their questions by chosen topics before submitting their posts. Quora, a famous general Q&A website that announced in September 2018 that it was visited by 300 million user accounts monthly, includes many mechanisms of modern websites. It combines traditional forums with the social mechanism (i.e. tagging) to provide recommendations and narrow down the scope of existing questions. Interestingly, most recommendation systems appear in steps (b) and (d). Even though a forum like Quora combines (b) and (c), its system seldom considers how to give the new post good expression and a direct objective (rather than only pairing it with threads of similar questions). This is probably because the analysis of posts, the user-generated content (UGC), is very complicated.

In the past, most studies focused on analyzing transaction logs to trace users’ behavior, but the results were often insufficient to reveal their information needs (McCray, Loane, Browne, & Bangalore, 1999; Rose & Levinson, 2004). UGCs contain useful messages and information such as contextual factors, including the linguistic features that users formulated and the motivations and timing for asking the questions. Understanding UGCs, the cognitive representation of the problem, can improve the design of information systems (Zhang, 2010). However, as the growing volume of UGCs has become more complex in lexicon and syntax, supportive methods (recommendation systems) sometimes turn out to be time-consuming and ineffective. For example, recommenders present irrelevant threads when a post is too complicated to categorize well, because it was not written by a native speaker or contains too many professional details. To create a great user experience, why not develop a semi-automatic system that encourages users to reconsider and reformulate posts (UGCs) before submitting them to forums? The goal of this design is to address the difficulty of interpreting UGCs by converging the ambiguous concepts of posts on the user side.

Another feature of Q&A forums is that they are domain-specific. Healthcare and its social groups have long been an attractive domain to users and researchers. A Pew Research study (Fox & Fallows, 2003) showed that more than 63% of Internet users in the U.S.A. searched for health information online. More than 54% of Internet users have visited websites sharing people’s experiences of medical conditions and personal situations. Another Pew survey from 2013 (Fox, 2013) reported that 35% of adults went online to figure out what medical condition they or someone else might have. These online search results may guide users in deciding on further medical appointments. Although clinician care and conversations about serious health episodes take place mostly offline, 53% of people who have figured out a medical condition themselves online have a habit of sharing what they found with a clinician. In addition, when facing a problem they are unfamiliar with, especially in highly specialized areas like healthcare and medicine, people will try to ask someone who has more expertise or can share experiences (Wildemuth et al., 1994). It is not surprising that Pew Research in 2009¹ concluded that online information significantly impacts users’ decisions on how they treat their own illnesses or care for the health of their significant others.

¹ https://www.pewinternet.org/2009/06/11/the-social-life-of-health-information/

Healthcare Q&A forums not only let users describe more details about their conditions but also become a good source of contextual factors, that is, of the knowledge behind users’ information needs, which is fundamental to the design of information systems. Domain-specific Q&A forums used to be limited by low computing power and sparse theoretical coverage; however, current techniques are able to retrieve rich resources, comprehend more about natural languages and provide deeper reasoning (Arora, Li, Liang, Ma, & Risteski, 2016; Gittens, Achlioptas, & Mahoney, 2017; Mimno & Thompson, 2017). Our study confirms that online behaviour in the healthcare field will be a steady presence in people’s lives, so optimizing the asking process on healthcare forums is important to help users deal with health-related problems. In addition, if recommendation methods derived from data analysis become more reliable in supporting the asking process, online users may be more willing to discuss their medical conditions on Q&A websites.

1.2 Research purpose and questions

Even though search engines are the most popular channel for everyone, search results are sometimes too general to lead directly to solutions. Searching for information on Google.com usually brought up well-regarded preselected hosts and Wikipedia results at the top of the listing (Höchstötter & Lewandowski, 2006). These are hard to follow for users who want to solve unfamiliar problems, such as healthcare and medical conditions, because the answers are usually not customized or oriented toward professional information. In addition, a search engine returns results efficiently only when queries are kept succinct, whereas on a Q&A forum users can type questions as long as they want.
When a problem needs more details to be understood, the asker has to explain the whole situation. Online Q&A websites have therefore gradually attracted more users; however, if a forum’s topics are too broad and it has no mechanism to control the quality of answers, finding professional and reliable answers may be tough. Take Yahoo! Answers as an example: some replies in a question thread are just advertisements or unrelated descriptions meant to attract specific users. So, does participating in a domain-specific Q&A forum solve all the difficulties?

As we mentioned, not all users are familiar with specific domains. Take the domain of our study, healthcare, as an example: when formulating queries, people without medical training often find it hard to express their requests. First, different people may have different mental representations (e.g. various descriptions on the pain scale for the same illness) (Zeng-Treitler, Kogan, Ash, & Greenes, 2002). Second, most users can only come up with simple words to describe their diseases and medical conditions. The vocabulary in queries also does not match medical terminology. Sometimes even the intention behind a question is not concrete, not to mention lexical barriers such as partial misspellings and the use of abbreviations. Therefore, simply changing the place where questions are asked is not enough. This encourages us to propose a design that suggests more ideas (e.g. topics and features) that users have never thought of and helps them formulate posts with more reasonable details, rather than just presenting existing questions in the search pool as most Q&A websites do. According to Baltadzhieva (2015), enhancing the quality of the input content does affect whether users get useful answers. If a system can interpret the main idea of the input content and recommend directions for modification to make queries better, people will eventually get good feedback within a short time. We want to help participants engage in the process of making queries via a posting recommender. This design could be quite useful when general users are not familiar with a domain.

We plan to build two recommendation systems, a word embedding method and a semantic method, based on the concept of a posting recommender. The word embedding method is known as one of the better tools for mapping words to vectors in a vector space and improving machine understanding of human languages. This recommender, implemented with a Word2Vec model (Mikolov, Chen, Corrado, & Dean, 2013), is trained on 5,319 questions and 500 publication abstracts crawled from health-related websites. The second recommender adopts a WordNet² model; WordNet is a lexical database for English that provides manually tagged synonyms. Grouping words by their meanings makes it a useful tool for computational linguistics and natural language processing (NLP). Both the Word2Vec model and the WordNet model are meant to recommend ideas and features that users may need in their current post but have not yet considered. These feature-based recommendations push users’ posts to become more subject-specific.

² https://wordnet.princeton.edu/
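The idea of recommending features that a post has not yet mentioned can be illustrated with a minimal sketch. It is not the implementation used in this thesis: in place of a trained Word2Vec model it uses plain co-occurrence count vectors with cosine similarity, and the toy corpus as well as the names `cooccurrence_vectors`, `cosine` and `recommend` are assumptions made purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy corpus standing in for the crawled health questions (invented for illustration).
CORPUS = [
    "headache fever cough",
    "headache nausea dizziness",
    "fever cough sore throat",
    "nausea vomiting dizziness",
]


def cooccurrence_vectors(sentences):
    """Map each word to a Counter of the other words seen in the same sentence."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for word in words:
            for other in words:
                if other != word:
                    vectors[word][other] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(post, vectors, top_n=3):
    """Suggest known words that are absent from the post, ranked by their
    highest similarity to any word the post already contains."""
    post_words = [w for w in post.lower().split() if w in vectors]
    scores = {}
    for candidate, vector in vectors.items():
        if candidate in post_words:
            continue
        scores[candidate] = max(
            (cosine(vector, vectors[w]) for w in post_words), default=0.0
        )
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, score in ranked[:top_n] if score > 0]
```

Calling `recommend("headache", cooccurrence_vectors(CORPUS))` returns up to three terms from the toy corpus that do not appear in the post. With a real embedding model the ranking would come from learned vectors rather than raw counts, and the asker would still decide which suggestions to adopt.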

To implement the recommendation action, we first adopt a processor to tag parts of speech (POS) and extract features as the main idea of the input sentence(s). Then, users decide whether or not to accept the new ideas. We think that participating more in the user’s asking process on the basis of text analytics may successfully support users in formulating UGCs and eventually enhance the integrity and distinctiveness of askers’ posts. But how can we judge whether our two posting recommenders are better than the original designs? The last step is therefore to verify whether reformulating queries organizes users’ wording better and makes it easier to find the desired answers. We propose a user study and a satisfaction questionnaire to understand users’ perspectives on our recommender. In addition, posts written by our study’s participants will be evaluated by experts with health-related backgrounds to see whether each UGC can be resolved easily at first sight.

To sum up, two research questions are posited:
RQ1: Does the posting recommender help users formulate questions in healthcare Q&A forums?
RQ2: Do questions supported by the posting recommender attract experts to answer?

1.3 Contribution

In this work, we investigate how feature-based mechanisms enhance rational and reliable recommendations when formulating questions. Meanwhile, we want to strengthen the convergence of UGCs in a healthcare forum, which sometimes contain ambiguous terms or lack a clear focus. If online answerers and experts find it hard to answer about a condition without details, they may forgo the current query and browse another one. Therefore, this thesis sets the goal of building a closer link between questioners and answerers via the posting recommender. If the numeric and interview results are positive, adding the semi-automatic method to the asking process may be a feasible way to construct clear posts and acquire high-quality feedback more easily (e.g. more tagging recommendations or suitable answers, whether chosen by machine or replied by experts).

Further, the recommendation mechanism we propose may not be limited to the healthcare area. Take e-commerce as an example: when people purchase products they are unfamiliar with, it is common for them to ask about details before and after buying. If a system helps formulate questions before posting, problems may be solved by FAQs or posts that already exist on the customer service webpage but are currently hard to find because of ambiguous wording. Thus, the unanswered rate in a Q&A forum may go down and the possibility of getting solutions may go up. We think any industry that needs to deal with queries and interpret UGCs is suited to building a posting recommender into the users’ asking process.

1.4 Content organization

In Chapter 1, we elaborate on the obstacles of searching for information only on a search engine. The asking process and the analysis of UGCs are discussed in the second and third paragraphs. Next, we state that Q&A forums, especially domain-specific platforms such as those for healthcare and medicine, encourage users to solve problems online. This chapter gives an overview of our research background and motivations and defines the research purpose and questions. In Chapter 2, the literature review, we discuss the theoretical support that helps us define our research objectives and organize what other researchers have done in the fields of recommendation mechanisms and healthcare Q&A forums. In Chapter 3, we go into the details of how we conduct our research by building a visualization system with the recommendation mechanism. Because we have to validate the feasibility of posting recommenders, this chapter also lays out the design of our user study, including the methods we use to judge this artefact. In Chapter 4, we apply statistical methods to analyze the data collected from participants’ behavior on our system in the user study. In addition, we compare the experts’ ratings of participants’ posts with the statistical results and briefly present the participants’ average satisfaction scores. In Chapter 5, we discuss the experimental results from Chapter 4. In Chapter 6, we conclude our research with respect to the two research questions and discuss its limitations, contributions and future work.

Chapter 2 LITERATURE REVIEW

In this chapter, we review recommendation mechanisms in the healthcare domain and recommendations in the asking process.

2.1 Recommendation mechanisms in healthcare domain

Healthcare services have been active in health information systems. In this context, the recommendation mechanism is a complementary tool that supports users in making decisions or that a service provider uses to improve the effectiveness of communication channels. Isern et al. (2016) organized the various agents applied in healthcare to support decisions. For example, intelligent agents are able to give treatment plans when patients face specific situations, and patient-centered applications are created to alert patients when abnormal messages are detected. Other researchers demonstrated that recommendation systems have already been employed in health areas such as education, diet and assistance agents, using different classes of recommendation techniques (collaborative, content-based, demographic, and knowledge-based) according to the knowledge source (Kim, Lee, Park, Lee, & Rim, 2009; Pattaraintakorn, Zaverucha, & Cercone, 2007; Sami, Nagatomi, Terabe, & Hashimoto, 2008; Wiesner & Pfeifer, 2010). Sezgin et al. (2013) summarized that papers related to health recommender systems are usually analyzed by user group and system design (Pattaraintakorn et al., 2007; Wiesner & Pfeifer, 2010). Besides, electronic records on healthcare websites are another focal point when doing health marketing and making recommendations based on personal information and self-examination (Lopez-Nores, Blanco-Fernández, Pazos-Arias, Garcia-Duque, & Martin-Vicente, 2011; Pattaraintakorn et al., 2007; Wiesner & Pfeifer, 2010). Though Morrell et al. (2012) underlined that semantics on the web make predicting user behavior a challenging task, researchers have still tried content-based filters (which examine users’ historical data and current preferences to predict items) to attract users’ attention through designed algorithms (Kim et al., 2009; Park, Kim, Choi, & Kim, 2012; Sami et al., 2008).

Unlike recommendation mechanisms that act before and after the post action, our research puts more effort into the searching and posting steps. We roughly observed the recommendation patterns of the whole asking procedure (e.g. starting to type, checking the recommendation, and deciding whether to use it) and found that recommendation mechanisms in health forums are similar to those in general-topic forums.

(15) Users type keywords such as symptoms or conditions into a blank search box, and the recommender pops up wordings of related illnesses or previously asked sentences containing identical words (Patient³ and WebMD⁴). In particular, most online healthcare communities tend to combine the characteristics of information conveyers (e.g. medical news and effective treatments) with those of platform providers (e.g. offering places where askers and experts can share their experiences). This is a feasible way to keep members engaged, because the more useful information a forum has, the more users will adopt solutions and stay. If more users stay and remain active, the connection between users becomes tight, so the community is able to survive longer (Williams & Cothrel, 2000). In addition, unlike visitors who only browse and search messages, forum members can sometimes get more assistance because predictions based on behavioral data may satisfy each user's needs. It is obvious that most recommender agents or systems in the asking procedure follow the tradition of guessing intentions and giving suggestions related to what a user has typed.

2.2 Recommendation in the asking process

Most recommendations during the posting step of the whole asking process apply various analyses to the existing Q&A database to help new askers solve their problems faster. Earlier question routing and recommendation studies (Dror, Koren, Maarek, & Szpektor, 2010; Li & King, 2010; Riahi, Zolaktaf, Shafiei, & Milios, 2012) have explored methods to find potential answers and answerers (people who have similar experiences in a thing) in Q&A forums. They consider underlying social network features (e.g. which queries get more than x hits), users' activities (e.g. which category an expert likes to reply to and always gets the honour of best answer in) and public personal data on websites to improve systems' usability.
Several methods were developed to make systems efficient for users: building word collections by grouping questions from two similar answers (Jeon, Croft, & Lee, 2005), finding different relationships between specific data (Zhao, Collins, Chevalier, & Balakrishnan, 2013), and adding neural models into applications (Feng, Xiang, Glass, Wang, & Zhou, 2016; Shen, Rong, Sun, Ouyang, & Xiong, 2015; Zhou, He, Zhao, & Hu, 2015). Question-question similarity and answer selection are both common recommendations seen in real forums.

3. https://patient.info/forums
4. https://www.webmd.com/

(16) Referring to the Quora forum, we found that most questions on general forums (covering various topics including food, education, country recognition, etc.) start from 5W1H, and short questions usually get answers faster than longer ones. Perhaps this is because Quora nudges querents to ask in a specific structure and provides limited space to keep writing short. The number of short questions then grows. After finishing typing the post, askers can invite answerers who specialize in the corresponding category to reply to their questions. Even if askers do not send any invitations, their posts will be made public and shown to users who are interested in them. Operations on Quora such as the real name policy, answer recommendations, content moderation, and the top writer program are components we can mimic when developing a Q&A-related system.

However, universal situations aren't always suitable for the healthcare environment. Yan Zhang (2010) stated that when askers are posting questions in a health-related forum, they prefer more space that lets them write as many details as they want to share. Many askers valued personal experiences and actively sought someone they could talk to. Further, some askers have difficulty with spelling and with coming up with proper terms to describe their condition. So, this paper focuses on developing a model to suggest potential ideas, formulate users' questions, and avoid ambiguous terms that decrease the possibility of being answered and increase the time taken to get an answer. At the beginning of proposing a question, users usually have a strong intention to get a beneficial reply, which motivates them to accept or follow instructions easily. Therefore, even though there is justification for giving users short guidance (Quora's 5W1H and limited space), we decided to recommend related features to users during query construction, and most importantly our research stresses the healthcare domain, which has obvious differences from general forums. Questions like “having severe lower back stiffness/pain. can you take celebrex with coumadin” and “pain on both side of chest” seen on WebMD answer, which look far from a standard formulation, cause difficulties for background programs that categorize questions by analyzing wordings. Specific-domain questions do sometimes need a description of the situation, so our goal is to develop an effective solution to formulate UGCs.

(17) Chapter 3 RESEARCH METHODOLOGY

We followed the design science guideline (Peffers, Tuunanen, Rothenberger, & Chatterjee, 2008) to design the process of our research, which includes building the posting recommender and conducting a user study to evaluate it. The six phases of the design process are as follows. Figure 1 is the overview of the methodology.

Figure 1. Six phases of our design process

3.1 Identify problem & motivation

Askers sometimes find it hard to write concrete queries on a search engine. An online Q&A forum then became another suitable environment for solving askers' problems. This free asking platform encourages askers to create free-style posts, regarded as UGCs. But it is not easy to control the quality of UGCs, which is one of the vital factors in enhancing a Q&A service. Incomplete and unclear posts sometimes confuse answerers (so-called experts) and the category system (which matches questions with those who are able to answer). This keeps the number of unsolved and unclear questions large. Moreover, general askers have no specific background and know little about terminology in a professional domain. For example, in a healthcare Q&A online forum, a general post looks like “I have a back pain recently. What kind of the lifestyle help prevent back pain?” while a more background-informed

(18) post looks like “I am staying at a healthy weight, but a severe back pain occurred recently. Then, I searched for some information online and found smoking ages the spine. I seldom smoke but my husband smokes a lot. Is it because of inhaling too much secondhand smoke?”. It is easy to see the difference in the amount of information between these two posts. So, if there is a posting recommender to support askers in formulating their questions, askers may get new ideas for writing more specific and detailed queries. We attempted to encourage users to participate more in the asking process so that the quality of UGCs can be improved. After all, high quality posts bring high quality solutions faster (Agichtein, Castillo, Donato, Gionis, & Mishne, 2008; Li, Jin, Lyu, King, & Mak, 2012).

3.2 Define objectives of a solution

From our observations, many unsolved questions exist because posts on the Q&A online forum are still unoptimized. The posting recommender is designed to help users formulate requests and enhance the possibility of being answered. It is a kind of recommendation system (RS). RSs are useful for various services (e.g. research articles, commercial activities, and healthcare forums). They keep track of data and support users in making decisions. Common recommender techniques can be divided into five groups: collaborative, content-based, demographic, utility-based, and knowledge-based (Burke, 2002). The system in this thesis is a content-based RS. All features are extracted and processed from users' forum posts. In the past, most researchers mainly focused on recommendations at the querying and browsing moments. Few applications (e.g. spelling correction) were activated when users are formulating posts.
If the usefulness of applying RSs in decision-making has been organized and proved (Bobadilla, Ortega, Hernando, & Gutiérrez, 2013), then suggesting to users what to post and how to post, which is another category of RSs, can be a useful function. Good input quality is an important factor in getting useful answers (Baltadzhieva, 2015). If users can get useful answers, they can make decisions more easily. But how do we judge whether a recommendation system, especially a posting recommender, can trigger users to adopt recommended items and help them formulate posts? Generally speaking, four evaluation approaches are applied to RSs: offline, user study, online, and qualitative (Beel et al., 2013). Offline (no access to the real world) and user study are the most common choices. So, besides learning users' perspectives on a RS through the user-centric web-based survey (Pu, Chen,

(19) & Hu, 2011), we conducted a user study to collect and analyze contents written by users. In this thesis, content (posts) can be distinguished between affective and informative content (Denecke & Nejdl, 2009). If posts contain actions a user performs during a day and feelings or thoughts on treatments, they are affective content (Denecke & Nejdl, 2009). If posts contain specific information about diseases and treatments, news from research results or healthcare websites, and experiences of a particular disease or treatment, they are informative content (Denecke & Nejdl, 2009). Carrying a large amount of information, these posts are considered medically relevant (Denecke & Nejdl, 2009). Therefore, medically relevant posts may be longer, and the frequency of medical-related terms in them may be higher. Our system (a posting recommender) was first arranged to extract features from users' input content (see details in 3.3). Medical-related terms were then picked from the features extracted from our text documents.

How do we discriminate medical-related features? We decided to type each picked feature into a medical dictionary website⁵. If the website shows some explanations and possible word combinations, we regard the feature as a medical-related term. Last, the existence of a description may be confirmed by finding fruitful information. We record them and posit the following:

H1: There is a difference between conditions having recommendations and a condition without recommendations.
H1a. There is a difference between the length of a post using recommendations and without recommendations.
H1b. There is a difference between the number of medical-related terms in a post using recommendations and without recommendations.
H1c. There is a difference between with or without the description in a post using recommendations and without recommendations.
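The measures behind H1a and H1b can be illustrated with a minimal sketch. It assumes that post length is counted in words and that the manual dictionary lookup is replaced by a small hand-made set of terms; `MEDICAL_TERMS` and `post_metrics` are hypothetical names introduced only for this illustration and are not part of the system described in this thesis.

```python
import re

# Hypothetical stand-in for the manual lookup on a medical dictionary website.
MEDICAL_TERMS = {"back pain", "spine", "secondhand smoke", "fever"}


def post_metrics(post, medical_terms=MEDICAL_TERMS):
    """Return the word count and the distinct medical-related terms of a post."""
    text = post.lower()
    words = re.findall(r"[a-z']+", text)
    # A term counts once if it appears as a whole word or phrase in the post.
    found = sorted(
        term for term in medical_terms
        if re.search(r"\b" + re.escape(term) + r"\b", text)
    )
    return {"length": len(words), "medical_terms": len(found), "terms": found}


general_post = ("I have a back pain recently. "
                "What kind of the lifestyle help prevent back pain?")
metrics = post_metrics(general_post)
print(metrics)
```

Running the same function on a post written with and without recommendations gives the paired values that H1a and H1b compare; whether a description exists (H1c) still needs human judgement.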
In this work, we planned to design two posting recommender systems: recommendations of related features (the word embedding model: Word2Vec) and recommendations of synonym features (the semantic model: WordNet). Both needed to be compared with the baseline model, the no-recommendation situation (as H1 states). We also investigated which of these two posting recommenders is better able to help users formulate posts. So, it is necessary to observe the usefulness of the word embedding model and the semantic model. Besides collecting users' opinions from the satisfaction questionnaire, the number of times users adopt

5. https://www.medilexicon.com/

(20) the recommendation can roughly show whether the recommender is useful or not. A greater number of medical features used in a post could be regarded as a sign of informative and medically relevant content (Denecke & Nejdl, 2009). Therefore, we posited the following hypothesis:

H2: There is a difference between the word embedding recommender and the semantic one.
H2a. There is a difference between the number of adopted recommendations under the word embedding recommender and the semantic one.
H2b. There is a difference between the number of medical-related features in a post under the word embedding recommender and the semantic one.

Time is one of the noticeable factors in the asking process. Examples are the amount of time that a RS takes to give out suggestions (Beel et al., 2013), the amount of time that a user takes to understand and adopt suggestions (Pu et al., 2011), and the amount of time that a user takes to finish posting. We planned to run three models (Word2Vec, WordNet, and baseline) and two experimental situations, one more general (simple) and the other more uncommon (complicated), in our user study (see details in 3.4). It is possible that different pairs of conditions take different amounts of time. In a healthcare environment, an uncommon situation is rarer among people than a general situation. It sometimes needs more professional terms to describe and is harder to understand. Writing these kinds of questions is difficult for users who are not familiar with healthcare topics. So, generally speaking, the time used under complicated conditions may be longer. However, what if the general and uncommon situations encounter a RS? With a posting recommender, participants may use more time because there is a lot of information they can refer to, or they may use less time because the recommendation is useful for formulating posts quickly.
Therefore, we want to know the relation between the amount of time used and the use of recommendations under different scenarios. The third research hypothesis is as follows:

H3: There is a difference between the used time of using recommendations and without recommendations.
H3a. There is a difference between the used time under a general situation with recommendations and that without recommendations.
H3b. There is a difference between the used time under an uncommon situation with recommendations and that without recommendations.

(21) 3.3 Design and development

After looking over Q&A online forums (e.g. Quora.com, Yahoo! Answer, Stack Overflow⁶, English Language & Usage⁷), we found that many platforms have powerful query recommendations but focus less on cold-start posting. To assist askers in optimizing the posting process, this thesis developed a posting recommender system.

Figure 2. Four steps of our design's procedure

There are four steps (Figure 2) in our design's procedure after users enter any sentences into the system. First, we attempted to understand what users want to ask or the concepts they are interested in. A post, known as natural language or UGC, in Q&A forums is usually composed of a question sentence or a short narrative paragraph. Our collected UGCs, which are used to train a tool for finding intentions, are mostly in a sentence pattern. Such text needs more effort to process than a whole-document pattern. We finally decided to adopt a features extractor with a POS-tagger base to make a preliminary guess at what users want. A features extractor is an application of machine learning that selects a subset of terms and uses this subset as features in text classification. In a sentence, different terms play different roles, such as nouns or noun phrases for the main topic and verbs or verb phrases for actions. This work's extractor uses a POS tagger to fetch nouns and noun phrases as main ideas.

Next, we developed a word embedding model and a semantic model to provide possible ideas that users may need. Preparing suitable data resources is the first thing we have to address. It is not complicated to prepare a semantic model. There is a well-processed English lexical database from Princeton University. We ran our application with NLTK Python modules⁸ to get

6. https://stackoverflow.com/
7. https://english.stackexchange.com/
8. https://www.nltk.org/api/nltk.corpus.reader.html?highlight=wordnet#module-nltk.corpus.reader.wordnet

(22) recommendations. The word embedding method, the Word2Vec model, was first published in 2013 (Mikolov et al., 2013). It proposed software packages to train efficient word representations (also called word vectors or word embeddings) and triggered other researchers to generate variants in the data mining field. The authors of How to generate a good word embedding (Lai, Liu, He, & Zhao, 2016) discovered that the corpus domain is more vital than the corpus size. They recommend that researchers first choose a suitable domain corpus for the desired task and then find a large corpus to yield better results. Therefore, our design doesn't adopt pre-trained, well-performing word embedding models based on English Wikipedia. We collect medical information from related publications and retrieve healthcare forum wording data from specific domain websites to create a new data pool for training our own Word2Vec model.

Thirdly, the model arranges possible ideas from top 1 to top 10 in the table on the interface. Users can copy what they want to add to the current content, and repeat modifications until they are satisfied or the system gives nothing new. Finally, all modification records and the final input will be compared to the control groups of each condition: (1) Word2Vec, WordNet (with a RS) and baseline (without a RS) (2) Word2Vec and WordNet.

3.3.1 System layout

Operation methods of our posting recommender include feature extraction, recommendation calculation and the display of recommendations. After establishing all operations, we visualize our design. There should be an input area and an area for placing recommendations in our system layout, which is presented as a web form. The current version of our posting recommender does not yet consider a real-time mechanism, so we also need a trigger to start the recommendation calculation. Figure 3 displays the interface of our system. As in most Q&A forums, the green-lined space is where askers type text.
If they don't know or forget some terminology, or are not sure whether the wording and writing style suit the forum, they can click the yellow execute button to fetch some recommendations. When a one-time edit is complete, askers can click the green finish button, or the red next round button to start a new calculation. As for the display of recommendations, we put features processed by the recommenders into a table to present results in order. The first column of the table shows topics that askers may focus on, and the remaining columns rank the top 1 to top 10 terms related to the particular topic. We pick 10 candidates in total because the default number in the Word2Vec tutorial is 10 and George A. Miller summarized that

(23) there is a magical number, 7 plus or minus 2, that limits people's capacity for processing information. Although the number of candidate features on the interface does not fall between 5 and 9, 10 is close. Moreover, the width of a table containing 10 features is just the same as the width of the input area and the buttons we designed.

Figure 3. The display of the Word2Vec posting recommender

We use XAMPP⁹, a web server application, to build our system, which runs in the Chrome browser. The complete operation is as follows. Askers first formulate posts with several sentences in the green input region. We also allow askers to run an automatic spelling check with Grammarly¹⁰, an extension installed from the Chrome Web Store, while writing. If askers need ideas, they can click the execute button to receive suggestions from our system (making sure each sentence ends with a period). Askers then wait a few seconds for the calculation and browse the recommendation table that pops up. Askers can press the copy picture button under each feature once a feature is ideal. If askers have already edited posts according to the first recommendation and want to see another one, they are welcome to click the next round button. When askers have no more adjustments, they should click the finish button to record all modifications. We arranged several text processing steps to set up the system. Section 3.3.2 delineates data preparation. Sections 3.3.3 and 3.3.4 cover feature extraction and the recommender systems' implementation.

9. https://www.apachefriends.org/index.html
10. https://www.grammarly.com/

(24) 3.3.2 Data preparation of the RS

To develop a recommender, how the historical data resource is prepared is very important. Besides the development of algorithms, a useful healthcare-related recommendation system needs to be trained on suitable data topics, correct data types, and enough data volume. In our healthcare study, we collected questions and answers from WebMD answer, a healthcare Q&A forum, to train our feature-based recommender model. Posts covering health issues such as pain, flu, skin, and diabetes can be easily browsed because of their clear categories. WebMD answer is also one of the few health communities that provide specialists in the medical field (called experts in the forum) to help querents find suggestions for their illnesses. We processed crawled data covering the range from March 2010 to September 2014 and finally decided to keep 25,319 questions and remove the answers for this work, because answers like “not normal”, “I do not know”, and a related link cannot deliver useful messages for our tasks. The remaining questions contain various health issues such as pain, flu, skin, diabetes, shingles, and period.

Besides collecting questions from general askers, regarded as a resource of daily conversations, we think medical terminology and specialists' wordings may be helpful when askers are trying to come up with proper terms to describe a condition. We used a web crawling program and Selenium web browser automation¹¹ to collect abstracts of publications from PubMed¹², owned by the National Center for Biotechnology Information, U.S., in March 2019. The Selenium server simulates how a human browses a website, and it can be extended through a browser's plugin as a web driver.
We chose the Chrome browser driver to execute our program and controlled it to enter the keywords food allergy, allergy, and food poisoning into the search box on the website. After that, the program scans page 1 to page 5 of the results for each keyword (each page has twenty publications). The content of an abstract usually has descriptions of the background, objectives, method, results, and conclusions sections. Because we only investigate which words usually appear near a given word, all abstract paragraphs were split into sentences and processed by the Word2Vec model.

11. https://www.seleniumhq.org/
12. https://www.ncbi.nlm.nih.gov/pubmed/
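The splitting step can be sketched as follows. This is a minimal illustration using only the Python standard library; the helper names and the sample abstract are invented for the example, and the actual splitting rules used in this work may differ. The result is a list of token lists, one per sentence, which is the input shape that gensim's Word2Vec expects.

```python
import re


def split_sentences(paragraph):
    """Split a paragraph after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [part for part in parts if part]


def tokenize(sentence):
    """Lower-case a sentence and keep alphanumeric (optionally hyphenated) words."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())


# Invented sample text, standing in for one crawled abstract.
abstract = "Food allergy is common in children. Symptoms include hives and vomiting."
corpus = [tokenize(sentence) for sentence in split_sentences(abstract)]
print(corpus)
```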

(25) 3.3.3 Features extraction

Because simply recommending posts that contain the same words cannot always categorize UGCs well or reveal the intention of askers (e.g. several recommended posts containing the same words, such as symptoms and medicines, does not mean they concern the same illness), we create a noun phrase extractor to extract the main topic from each typed sentence. From the linguistic aspect, we usually say that the main building blocks of a sentence are noun phrases and verb phrases. The noun phrases are usually the core topics or objects in a sentence, while verb phrases describe actions between the objects in a sentence. For example, take the sentence “My kid has a fever.” About who/what (the objects)? “My kid” and “fever”. What happened (the action)? “Has”.

Noting the need to extract noun phrase topics (we also regard them as features of the sentence) and to speed up the feature extraction process for a better user experience, we implement a tokenizer method¹³ which makes the process faster and can process many sentences or full documents in a short period at one time.

The three steps of pre-processing are as follows. In the first step, the tokenizer method loads the Brown corpus¹⁴ from the NLTK¹⁵ data package to define our own POS tagger. It is built from a bigram (John, 1996) tagger, a regular expression¹⁶ tagger and a unigram tagger with the Brown category: reviews. Second, a semi-CFG (context free grammar) pattern is defined to organize the basic rules of regular noun phrases. Thirdly, the extractor is set to split the sentence into tokens (single words) and rename some POS tags to normalize them (e.g. a noun phrase belongs to the pack of nouns). After pre-processing, we execute the trained extractor to split input sentences, tag the tokens, and search for patterns in order.
Take “My kid has a fever.” for example: the sentence will be divided into “My”, “kid”, “has”, “a”, “fever”, and each word (token) will be tagged with “pronoun”, “noun”, “verb”, “article”, “noun”. Last, the feature extractor picks out all noun phrases as the main topics of the sentence.

13. https://thetokenizer.com/2013/05/09/efficient-way-to-extract-the-main-topics-of-a-sentence/
14. https://www.nltk.org/book/ch02.html
15. https://www.nltk.org/
16. https://www.regular-expressions.info/tutorial.html
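The tag-then-chunk idea in this example can be sketched in a few lines. This is only an illustration under stated assumptions: a tiny hand-written lexicon (`LEXICON`, hypothetical) stands in for the tagger trained on the Brown corpus, unknown words default to nouns, and a noun phrase is taken to be any run of pronoun, adjective, or noun tags that ends in a noun. It is not the extractor used in this work.

```python
import re

# Hypothetical mini-lexicon standing in for a trained POS tagger.
LEXICON = {"my": "PRON", "kid": "NOUN", "has": "VERB", "a": "DET", "fever": "NOUN"}
# Tags that are allowed inside a noun phrase in this simplified grammar.
NP_TAGS = {"PRON", "ADJ", "NOUN"}


def tag(sentence):
    """Split a sentence into word tokens and attach a coarse POS tag to each."""
    tokens = re.findall(r"[A-Za-z']+", sentence)
    # Unknown words default to NOUN (an assumption made for this sketch).
    return [(token, LEXICON.get(token.lower(), "NOUN")) for token in tokens]


def noun_phrases(tagged):
    """Collect runs of NP_TAGS tokens that end in a noun."""
    phrases, run = [], []
    # The sentinel forces the last run to be flushed.
    for word, pos in tagged + [("", "END")]:
        if pos in NP_TAGS:
            run.append((word, pos))
            continue
        # Drop trailing tokens that are not nouns, then keep what is left.
        while run and run[-1][1] != "NOUN":
            run.pop()
        if run:
            phrases.append(" ".join(w for w, _ in run))
        run = []
    return phrases


tagged = tag("My kid has a fever.")
topics = noun_phrases(tagged)
print(tagged)
print(topics)
```

With this lexicon the two extracted topics match the objects named in the example above, “My kid” and “fever”.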

(26) 3.3.4 Recommender systems' implementation

To give suitable ideas that help users construct their posts, we apply a word embedding model, Word2Vec, after extracting features. We want to test whether the posting recommender can normalize personal wordings and generalize medical terminology to reduce ambiguous expressions. The first step in training a better Word2Vec model is to pre-process the sentences we collected. Data collected from websites and online forums usually contain colloquial sayings and abbreviations (e.g. please -> plz, pls). If we use unprocessed sentences to train the model, our recommendations will contain a lot of meaningless words and punctuation marks (e.g. “?”, “.”, “;”). Therefore, after splitting each sentence into a collection of words, we compare each word to the stop words list from the NLTK corpus, which includes be verbs, reflexive pronouns and conjunctions, before training. Next, the step of stemming or lemmatization is also important. To prevent the topic-based (features-based) function from failing to match the same word in different forms, we choose the lemmatization method in NLTK (e.g. am, are, is -> be) to get the general pattern of words. Stemming (e.g. automate, automatic -> automat), on the other hand, sometimes removes characters from a word, so it may not be suitable for our situation. After the preparation for training, we put the word packs of all sentences into a collection and run gensim, a Python library for scalable statistical semantics.

Word2Vec is a shallow, two-layer neural network model which uses a large corpus of texts to produce a vector space through unsupervised learning (Sapatinas, 2004) and is able to reconstruct the linguistic contexts of words. The word embedding function assigns each unique word a corresponding vector in the new vector space.
In the new vector space, words sharing common contexts in the corpus are located in close proximity to each other. The vector relationship can be expressed as “kitten - cat = puppy - dog”. Therefore, when the expression turns into “kitten - cat + dog = ?”, we can infer which word should be filled in. There are two categories of Word2Vec model: skip-gram (inferring context words from input words) and continuous bag of words (CBOW) (inferring input words from context words). In this work, we follow the gensim tutorial¹⁷ and use the skip-gram method to train Word2Vec on a corpus containing medical terms and healthcare forum wording data. The Word2Vec model will be

17. https://radimrehurek.com/gensim/tutorial.html

(27) loaded after a user clicks the execute button. When the system encounters a period followed by a space (the full stop of a sentence), it regards the preceding section as a sentence and outputs the features of that section. The Word2Vec model then presents the related features in a table, based on the previously extracted features.

The other recommendation system is a semantic model: WordNet. We did not modify it much because its database is already well organized. Users can easily query the English lexical database to fetch synonyms of the features of the input content, which is one of the methods provided in Python NLTK's WordNet package. To sum up, our system comprises a feature extractor, a recommender, and a layout setting.

After the above steps, the user can decide to keep the original writing, substitute recommended words, or add brand-new ones to the current content. If the user wants to see other suggestions for the current content, he/she can click the next round button and then click the execute button again. In the meantime, our system scans for differences from the previous content and records them in log files, such as which similar words were adopted, which words were added, and which words were deleted. However, not all details can be collected directly by the program, so we still need to note some differences manually. Figure 4 shows the whole operation from a user's perspective.

Figure 4. The whole operation from a user's perspective
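One round of this loop, ranking related terms for a feature and then logging what the user changed, can be sketched as below. The three-dimensional vectors are invented toy values rather than output of the trained model, and `recommend` and `log_changes` are hypothetical helper names; in a trained gensim model the ranking step would be served by its own nearest-neighbour query.

```python
import math
import re

# Toy vectors invented for illustration; a trained model would supply these.
VECTORS = {
    "fever": (0.9, 0.1, 0.0),
    "temperature": (0.8, 0.2, 0.1),
    "chills": (0.7, 0.3, 0.0),
    "rash": (0.1, 0.9, 0.2),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recommend(feature, vectors=VECTORS, top_n=10):
    """Return up to top_n other words, most similar to the feature first."""
    target = vectors[feature]
    scored = [(cosine(target, vec), word)
              for word, vec in vectors.items() if word != feature]
    return [word for _, word in sorted(scored, reverse=True)[:top_n]]


def log_changes(before, after, recommended):
    """Compare two versions of a post at word level."""
    old = set(re.findall(r"[a-z']+", before.lower()))
    new = set(re.findall(r"[a-z']+", after.lower()))
    added = new - old
    return {
        "adopted": sorted(added & set(recommended)),  # new words taken from the table
        "added": sorted(added - set(recommended)),    # new words the user came up with
        "deleted": sorted(old - new),                 # words removed from the post
    }


suggestions = recommend("fever")
changes = log_changes(
    "My kid has a fever.",
    "My kid has a fever and chills since yesterday.",
    suggestions,
)
print(suggestions)
print(changes)
```

A word-level set comparison like this cannot tell where in the post a word was inserted or whether it replaced another one, which is consistent with the note above that some differences still have to be recorded by hand.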

(28) 3.4 User study

We organized the procedure of our design and the preparation and operations of building our system in section 3.3. After creating the posting recommender with the word embedding and the semantic synonym mechanisms, we need to recruit participants to experience this innovative design. The following describes how we conducted the user study and what a participant has to accomplish during the experiment.

3.4.1 Dataset

The data for the later evaluation can be categorized into two groups: data collected for building the recommendation systems and data collected for testing experts' reactions. Details about where we gathered data resources and how we processed them for building the system are in 3.3. Data for experts' judgement come from posts written by our user study participants.

3.4.2 Models

The posting recommenders with a word embedding method (Word2Vec) and a semantic method (WordNet) are designed to be compared with the baseline model (no recommendation). The Word2Vec model, trained on questions asked on WebMD and abstracts from PubMed, gives related ideas (features) for the main topics of the input content. The WordNet model, backed by an English lexical database, gives synonym ideas (features) for the main topics of the input content. The baseline model only provides a space for participants to write posts and a delete all button. To reduce the effect of the intervention, every participant has to perform each model three times matched with different conditions.

3.4.3 Tasks and experimental materials

We build a Word2Vec model and a WordNet model in contrast to a baseline model (3 models in total). To evaluate whether the two posting recommenders are useful, a user study is arranged. It is a simulation experiment, so to help participants come up with posting ideas, we plan to provide a short introduction (background) for every formulation. In addition, we want to know

(29) more about participants’ post formulation under different situations. Each model will assign two tasks: a general situation and a particular situation. The general situation means it is easy to encounter or imagine the condition. For example, (1) scenarios of common illnesses happen around us (e.g. flu, allergy, and abdominal pain) (2) reports written as a popular science. The particular situation means it is hard to see similar conditions around or to be understood by general users easily. For example, (1) health problems need to be judged by professionals (2) conditions contain a lot of medical information. Finally, a daily scenario task is picked to be the general situation while a class discussion task is picked to be the particular situation. After all setting, we performed a pilot study to observe our study’s validity. As we have mentioned, there are three different models in our experiment. Each model has two. 政治大 6 times, it is possible that a learning effect may happen. Therefore, corresponding to models, 立 we assign three different health conditions (illnesses). In the pilot study, three health conditions. tasks. Only one participant needs to complete 6 posts in total. If we give the same introduction. ‧ 國. 學. are flu, asthma, and pregnancy because we want to discriminate common and uncommon topics. Flu and asthma are regarded as common topics on WebMD18. Only pregnancy is unfamiliar to. ‧. the public. However, except for flu, pilot participants state asthma and pregnancy are both difficult for them to write the content because they have no similar experiences. The selection. Nat. sit. y. of health condition should be reconsidered in the official user study. For example, flu, allergy,. er. io. and foodborne illness which are more frequent to be discussed among the public. Besides, only a short introduction seems not enough for formulating posts under a simulation condition. Thus,. n. al. Ch. i n U. v. 
we decided to put supportive paragraphs into one task. Participants will get one-page document. engchi. containing an introduction and some supportive paragraphs when they’re writing one condition. Our next important step is to prepare suitable supportive paragraphs. To cover various aspects of a health situation, articles from health agencies, relative news reports and scenarios of specific illness happened around us are considered to be our materials. Finally, we selected passages that excerpted from a health agency’s announcement with statistical data (the rate of an illness in a region) and sections gathered from news reports with common knowledge that the public can understand easier. Figure 5 is a try-out of a daily scenario task.. 18. https://www.webmd.com/a-to-z-guides/common-topics 27. DOI:10.6814/NCCU201900771.

Participants need to imagine the background and then write down their own or the character's experiences of the specific illness. If they still find it hard to construct the content, participants can read the articles we provide under the introduction of each task (e.g., "A guide to a heart attack" in Figure 5; this section is collected from WebMD, see footnote 19). In addition, we wrote an example question in the try-out. This example question implicitly encourages participants to produce longer questions, rather than a single question sentence such as "What are the symptoms of heart disease?". If the content is very short, the recommendation mechanism becomes useless. However, this part is removed in the real user study to avoid affecting the post content of participants. All 6 tasks can be found in Appendix 1.

[Task]
Background: Sandy’s Grandfather has a family history of the heart attack. Unluckily, his illness occurred yesterday and was sent to the hospital. After receiving a phone call from Dad, Sandy tried to search for some information about the sickness. She will go to pick up Grandpa Johnson tomorrow on her way home but she has no ideas what she should know in advance. The following is the information she has now. If you are Sandy and want to get help on the health care online forum, what you will say?

Supportive paragraphs of a daily scenario task:
A guide to a heart attack
When blood can't get to your heart, your heart muscle doesn't get the oxygen it needs. Without oxygen, its cells can be damaged or die. Over time, cholesterol and a fatty material called plaque can build up on the walls inside blood vessels that take blood to your heart, called arteries. This makes it harder for blood to flow freely. Most heart attacks happen when a piece of this plaque breaks off. A blood clot forms around the broken-off plaque, and it blocks the artery.

The following is the call, from Sandy’s dad: "If Tracy (paid cleaner) wasn't there at that time, it may have been too late to rescue your grandpa. You know, Grandpa Johnson had a heart attack. He told me before that his chest was sometimes painful and that made it difficult for him to breath. And our hometown was pretty cold in the winter. I'm afraid that if Grandpa forgets to dress warm enough, the low temperature may stimulate another heart attack. Do you think I should find a personal physician for grandpa? Near his house? We are all working outside the county. When emergency happens, this protection may work."

Example question from Sandy: My grandpa's heart attack occurred yesterday. I'm going to ask the cleaner who found my grandpa fainted what happened before talking to the doctor. What should I know? The body reaction at that moment? In addition, Grandpa is an emotional person, so I consider talking to the doctor by myself first and decide what can tell him directly. Is that good? By the way, the temperature here is pretty low. Does anyone know what things should be prepared for when grandpa goes back home?

Figure 5. The material that a participant reads before writing a post in the try-out

19. https://www.webmd.com/heart-disease/ss/slideshow-heart-attack

3.4.4 Participants and procedure
According to the Latin square design (Aigner & Ziegler, 2018), at least 27 experimental participants, whose English reading and writing abilities are above average, should be recruited to accomplish our user study. Every participant goes through the following 6 steps of the procedure:
Step 1. Filling in a pre-questionnaire
Step 2. Explaining each part of the try-out task
Step 3. Showing an example regarded as a good post
Step 4. Becoming familiar with the mechanism of our recommendation
(1) When an asker finishes a sentence, click the <execute> button
(2) If any suggestion is good, add it into the asker’s content
(3) If the asker wants new ideas, click the <next round> button and then click the <execute> button again
P.S. If a table cell is empty, the model cannot find any similar topics.
Step 5. Following our instructions to complete six tasks
Step 6. Finishing the post-questionnaire

The pre-questionnaire collects the basic information of users and their past experiences in using Q&A forums. We ask participants about their experiences of posting and answering questions on online forums. What the try-out contains and how it is conducted are shown in Figure 5. After understanding the whole procedure, participants have to write 6 posts corresponding to healthcare conditions (illnesses), with or without a recommendation system (RS). Possible pairs include the first illness with model A, the second illness with model B, the third illness with model C, and so on. The selected illnesses are flu (denoted F), foodborne illness (denoted FB), and allergy (denoted A), which are commonly known to the public. Models A, B, and C denote Word2Vec, WordNet, and the baseline model, respectively. Table 1 presents the order setting of our user study.
Last, the post-questionnaire focuses on participants' experiences after using our system, including visual appeal, system process, system speed, and the extent to which they want to use our recommender.

Table 1 The order setting of our user study
Model sequence    Illness sequences
1) A, B, C        AF[FB]    [FB]AF    F[FB]A
2) B, C, A        AF[FB]    [FB]AF    F[FB]A
3) C, A, B        AF[FB]    [FB]AF    F[FB]A
Note. Each of the 9 combinations of a model sequence and an illness sequence needs three participants, so 9*3 = 27.
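The counterbalancing in Table 1 can be reproduced with a short sketch. It is an illustration under stated assumptions rather than the procedure actually used to assign participants: the function names, the full-word labels (used here because the letter A denotes both a model and an illness in Table 1), and the order in which participants fill the cells are all hypothetical. Only the three model sequences, the three illness sequences, and the count of three participants per combination are taken from Table 1.

```python
from itertools import product


def cyclic_latin_square(items):
    """Build a Latin square by rotating the item list one position per row."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


# Model sequences 1) A, B, C  2) B, C, A  3) C, A, B from Table 1,
# with A = Word2Vec, B = WordNet, C = baseline.
MODEL_SEQUENCES = cyclic_latin_square(["Word2Vec", "WordNet", "baseline"])

# Illness sequences AF[FB], [FB]AF, F[FB]A from Table 1,
# with A = allergy, F = flu, FB = foodborne illness.
ILLNESS_SEQUENCES = [
    ["allergy", "flu", "foodborne"],
    ["foodborne", "allergy", "flu"],
    ["flu", "foodborne", "allergy"],
]

PARTICIPANTS_PER_CELL = 3  # from the note under Table 1


def build_schedule():
    """Return one plan per participant: a list of (model, illness) pairs in order."""
    schedule = []
    for models, illnesses in product(MODEL_SEQUENCES, ILLNESS_SEQUENCES):
        for _ in range(PARTICIPANTS_PER_CELL):
            schedule.append(list(zip(models, illnesses)))
    return schedule


if __name__ == "__main__":
    plans = build_schedule()
    print(len(plans))
    print(plans[0])
```

Each plan pairs every model with exactly one illness, so a participant never writes about the same illness under two models; each pair then covers the two tasks (general and particular situation) described in Section 3.4.3.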