
Predictive Modeling of Customer Retention in Online Reservation Services (客戶回訪預測模型應用於線上預約服務) - 政大學術集成


Academic year: 2021


(1) National Chengchi University (國立政治大學), Department of Management Information Systems. Master's Thesis. Advisor: Dr. 周彥君.

客戶回訪預測模型應用於線上預約服務

PREDICTIVE MODELING OF CUSTOMER RETENTION IN ONLINE RESERVATION SERVICES

Graduate student: 林煥禹. March 2017 (ROC year 106).

(2) Abstract

Electronic commerce has continued to grow rapidly in recent years, and innovative services have been introduced accordingly. This research analyzes a representative online reservation service provider, EZTable. Customer loyalty is crucial because EZTable can obtain more market share with high customer loyalty. Therefore, we expect to answer the following research questions: (1) What are relevant and useful consumer, restaurant, and demographic factors to predict customer retention? (2) How do we develop and determine effective predictive models? We apply the generalized additive model, decision tree, bagging, and random forest to a large volume of operational data from EZTable and develop a set of predictive models. We find that predictive performance is determined by identifying critical variables from the research context rather than by model complexity. Transaction-dependent factors could substantially enhance predictive performance. Our findings enable companies like EZTable to understand which predictors are critical to customers' loyalty. Further, the company can design effective promotions for customers with a higher return probability. Our modeling effort could help such service providers reduce advertising cost by allocating limited resources to customers with a higher probability of placing orders again.

Keywords: e-commerce, online services, return rate, machine learning, gam

摘要 (Chinese abstract, translated)

With the vigorous growth of e-commerce in recent years, all kinds of innovative services have emerged. This study analyzes EZTable (簡單桌), a benchmark online restaurant reservation platform. For EZTable, the customers who make reservations are very important, because high customer loyalty brings it more market share. This study therefore addresses two questions: (1) To predict the customer return rate accurately, which factors, such as information on the booker, the restaurant, and the geographic location, are highly correlated with the return rate? (2) How should a highly accurate predictive model be built and validated? To answer these questions, this study uses the generalized additive model (GAM), decision tree, bagging, and random forest, together with a large volume of EZTable reservation data, to build different models that predict the customer return rate. For the EZTable data, we find that the choice of model variables affects predictive performance more clearly than the choice of training method, and that information about the reservation itself, such as the reservation status, substantially improves predictive accuracy. These findings help service providers such as EZTable understand which variables matter for customer loyalty; moreover, companies can use this information to tailor promotions to members with a higher return rate. By concentrating marketing resources on specific customers, these service providers can also reduce their marketing costs.

Keywords: e-commerce, online services, return rate, machine learning, generalized additive model.

(3) Table of Contents

1. Introduction (p. 4)
2. Literature Review (p. 7)
2.1 Online purchase behavior (p. 7)
2.2 Churn analysis in industries (p. 7)
3. Data (p. 9)
3.1 Data resource (p. 9)
3.2 Measures (p. 9)
4. Method (p. 18)
5. Empirical result (p. 22)
5.1 GAM (p. 22)
5.2 Tree-Based Learning (p. 27)
6. Discussion (p. 35)
7. References (p. 37)

(4) List of figures

Fig 3.1a (p. 17)
Fig 3.1b (p. 17)
Fig 3.2 (p. 18)
Fig 4.1 (p. 20)
Fig 4.2 (p. 22)
Fig 5.1 (p. 23)
Fig 5.2 (p. 24)
Fig 5.3 (p. 25)
Fig 5.4 (p. 26)
Fig 5.5 (p. 27)
Fig 5.6 (p. 30)
Fig 5.7 (p. 30)
Fig 5.8 (p. 31)
Fig 5.9 (p. 31)
Fig 5.10 (p. 32)
Fig 5.11 (p. 33)
Fig 5.12 (p. 34)
Fig 5.13 (p. 34)

List of tables

Table 1 Variables (p. 12)
Table 2 Statistics of Training Data (p. 14)
Table 3 Statistics of Test Data (p. 15)
Table 4 Statistics of people (p. 16)
Table 5 Statistics of timediff (p. 16)
Table 6 Statistics of status (p. 17)
Table 7 confusion table (p. 20)
Table 8 GAM model comparison (p. 26)
Table 9 Learning methods comparison (p. 34)

(5) 1. Introduction

The electronic commerce industry has continued to grow rapidly in recent years. One report shows that business-to-consumer (B2C) e-commerce sales reached $1.471 trillion in 2014, an increase of nearly 20% [1]. In addition, mobile commerce (m-commerce) grew to 30% of e-commerce sales in 2014 (Eichmann, 2014). The penetration of smartphones allows everyone to buy products anytime and anywhere. Consequently, more and more consumers choose to shop online.

With the advancement of Internet technologies and the growing number of Internet users, the need for online services has become much stronger and more diverse. To meet the needs of people's daily lives, innovative services have been introduced in recent years. For instance, people can sit in front of a laptop or use their smartphone to place a restaurant reservation over the Internet. There is no need to make a phone call or go to the restaurant for a reservation. In addition, reservation websites provide complete restaurant information for customers. By following the instructions, people can easily place a reservation in a short time. Moreover, the system that provides the service allows a restaurant to accept a large number of reservations simultaneously.

To understand such a brand-new service, our study analyzes a representative online reservation service provider, EZTable. EZTable is the largest online restaurant booking platform in Taiwan and offers 24-hour online reservation services. Customers can make reservations through smartphones, laptops, and personal computers. It has over 1 million members and 300 thousand monthly active users, and covers more than 2,600 restaurants and five-star hotels in Taiwan, Hong Kong, Thailand, and Indonesia [2]. The success of EZTable makes it a suitable example for studying this new type of electronic commerce service.

Because the operation of EZTable relies on people who place reservations, customers are the most important asset of the service provider. Customers are also key to studying this kind of service. However, attracting new customers is very costly (Reichheld and Schefter, 2000). Therefore, customer loyalty is an imperative issue. Besides, according to Jarvis and Mayo (1986), customer loyalty is crucial because a company can obtain more market share with high customer loyalty. As a result, how to retain customers is a key question for EZTable. With such a research context,

[1] B2C e-commerce sales worldwide, Retrieved August 15 2016, from http://www.statista.com/statistics/261245/b2c-e-commerce-sales-worldwide/
[2] Eztable, Retrieved June 17 2016, from https://en.wikipedia.org/wiki/EZTABLE

(6) our study aims to construct a model for EZTable to predict whether a customer will return and continue to use the service. Accordingly, we expect to answer the following research questions: (1) What are relevant and useful consumer, restaurant, and demographic factors to predict customer retention? (2) How do we develop and determine effective predictive models?

To answer the research questions, we apply the generalized additive model (GAM) and tree-based learning methods (decision tree, bagging, and random forest) to a large volume of operational data from EZTable and develop a set of predictive models. We then use the receiver operating characteristic (ROC) to determine the most predictive model across various specifications. Using EZTable data, we find that GAM has higher predictive power than the more computationally complex models. Predictive performance is determined by identifying critical variables from the research context rather than by model complexity. Our results show that traditional variables such as age and gender are not effective predictors. We further identify some transaction-dependent factors that could substantially enhance predictive performance.

Our findings enable companies like EZTable to understand which predictors are critical to customers' intention to place a reservation again within a certain period. Further, as mentioned above, attracting new customers is costly. As a result, if a service provider like EZTable knows which customers are more likely to use its service offerings again, the company can design effective promotions for returning customers. Our modeling effort could help such service providers reduce advertising cost by allocating limited resources to customers with a higher probability of placing orders again.

(7) 2. Literature Review

2.1 Online purchase behavior

With the explosive growth of Internet penetration, electronic commerce businesses have sprung up like mushrooms. Companies pay a lot of attention to predicting whether a customer will purchase products online. Accordingly, prior studies examine the question across industries. Van den Poel and Buckinx (2005) use clickstream data, customer demographics, and historical purchase behavior to predict a customer's purchase decision in the e-business context. Their research shows that detailed clickstream variables are the most valuable for predicting and classifying customers' purchase tendency. Moe (2004) develops a model based on historical visit and purchase data from the leading online bookstore, Amazon.com, to predict each customer's purchase probability. Similarly, Sismeiro and Bucklin (2003) focus on online car retailing and develop a predictive model using clickstream data. The study shows that consumers' visit experiences and navigational behavior are predictive of a customer's online car-buying decision.

Nevertheless, compared with studies of customers' willingness to purchase products online, relatively few studies focus on whether a customer will purchase products online again. Morrisonn (2001) stated that, in the travel industry, only the communicability of booking online affected both being a booker and being a repeat booker. Most of the key elements of online purchase retention differ from those of online purchase behavior. In addition, evidence shows that even a small percentage improvement in the retention rate can lead to a huge profit increase (Van den Poel and Lariviere, 2004). This provides a strong incentive to study customers' online repurchase behavior.

2.2 Churn analysis in industries

Customer relationship management is one of the biggest concerns of companies at present. More and more companies are interested in keeping their customers. In addition, retention affects profit drastically (Van den Poel and Lariviere, 2004). As a result, churn research has spread across industries. Verbeke (2011) provided a profit-based measure to predict churn in the mobile telecom sector. Xie (2009) proposed a novel method, improved balanced random forests (IBRF), to enhance churn prediction performance for banks in China. Moreover, Coussement and Van den Poel (2008) developed a churn prediction model for a subscription service, a newspaper, using support vector machines.

(8) Dasgupta (2008) studied the influence of users' social networks on potential churners in a mobile telecom network. There are many studies of offline churn analysis across different industries.

As B2C e-commerce sales reached $1.471 trillion in 2014 [3], more and more consumers shop online, and online customer retention has become another important research area. Unlike physical stores, online storefronts are open 24/7 and provide convenient and reachable service to customers. In addition, through the online storefront, firms can easily collect customer browsing and purchasing information and provide customized offerings and promotions accordingly. Shankar et al. (2003) argue that customer loyalty to online service providers is higher than customer loyalty to offline service providers. This means there are plenty of highly loyal online shoppers. Thus, customer retention is of great importance in the online context. In this study, we focus on online reservation services and build a prediction model for online customer retention.

[3] B2C e-commerce sales worldwide, Retrieved August 15 2016, from http://www.statista.com/statistics/261245/b2c-e-commerce-sales-worldwide/

(9) 3. Data

3.1 Data resource

EZTable would like to find out whether or not a customer will place a reservation again. If EZTable knows in advance which members will return to use its services, it will be able to design promotions effectively. Our goal is therefore to use EZTable's booking records to accurately estimate the return probabilities of members.

The data set contains over 100 thousand booking records made by different users on EZTable between 2012 and 2014. Each record comes from a different member's booking history, so each reservation record represents one member. EZTable defines 90 days as a meaningful period over which to record whether the member places another reservation. Each row holds the information of one member's booking record, including member id, restaurant id, booking date, dining date, number of diners, purpose, and status. Two further data sources accompany these records: (1) MEMBER data, covering over 620 thousand members' information, such as member id, gender, and birthday, and (2) RESTAURANT data, containing the profiles of 724 registered restaurants, such as nationality, country, and whether WIFI is provided.

There are over 100,000 booking records, with the variables listed in Table 1. With such a large data set, as is common in data mining research, we randomly split the data into two groups: a training set for fitting models and a test set for evaluating their performance. The training set contains 62,083 bookings, accounting for about 60% of all cases, and the remaining 40% (41,546 cases) are used for evaluation. The probability of return (placing an order within 90 days, Return90 = 1) is 0.1979 in the training set and 0.20105 in the test set. The probability of no return (not placing an order within 90 days, Return90 = 0) is 0.8021 in the training set and 0.79895 in the test set.

3.2 Measures

Return90 is the binary dependent variable of the data set: 1 denotes that the member returns to place another reservation within 90 days, and 0 otherwise. The independent variables are taken from the last reservation record, which includes reservation information and restaurant information. Reservation information contains the detailed booking information, including the member's age (age16-25: 1 represents age 16-25, while 0 represents other ranges;

(10) age26-35: 1 represents age 26-35, while 0 represents other ranges; age36-45: 1 represents age 36-45, while 0 represents other ranges; ageOther: 1 represents an age outside the former three ranges or a missing age, while 0 represents other ranges), the member's gender (gender: 1 represents men, while 0 represents women), the number of days between the dining day and the order-placing day (timediff), and the size of the party of diners (people). The status dummy variables are recorded by EZTable according to the booking information from the restaurant. If the status is new or ok, the restaurant has no extra status information for the record (status_ok is 1). If the member changes his/her booking, status_changed is 1. Finally, if status_canceled is 1, the member cancelled the reservation. Restaurant information includes dummy variables that denote the city area in which the restaurant is located (area), whether the restaurant is situated in a hotel (is_hotel, 1/0), and whether it provides wifi (wifi, 1/0). Table 2 and Table 3 show the summary statistics of the variables for the training data and the test data, respectively.

Fig 3.1 shows the trend between People and Return90, and between Timediff and Return90, respectively. As the number of People goes up, the value of Return90 decreases. This implies that if the dining party is large, the member is less likely to place an order within 90 days. Generally speaking, we seldom dine out with a large group of people unless there is a family gathering or a class reunion. Therefore, it is hard to motivate such members to place another order in the short term. By contrast, a smaller party is more likely to place an order within 90 days. If the member usually dines out with his/her intimate partner, he/she is more likely to place a reservation in advance.

The relation between Return90 and Timediff runs in the opposite direction. When there is a long period between the dining day and the order-placing day, the member has a higher chance of placing another order in the short term. This indicates that the earlier the member places an order, the higher the chance that he/she will place another order soon. Such members place orders early out of a habit of preparing in advance. They are used to reserving early to make sure they can dine on time even if the restaurant becomes fully booked. Accordingly, the probability of Return90 is higher when Timediff increases. Conversely, if the value of timediff is small, the member might have booked on short notice, wanting to dine out on the spur of the moment. Hence, the probability of Return90 is smaller.

Fig 3.2 displays the return rates for the four types of status. There are four status groups: status_ok (0.155), status_canceled (0.379), status_changed (0.253), and no status recorded (0.119). If a reservation status is 'canceled' (status_canceled), the return probability of its booker is higher than in all the other situations. On the other hand, if a reservation status is 'ok' or 'new' (status_ok),

(11) its booker has a lower chance of placing another order within 90 days. This indicates that if a member canceled a reservation, there is a higher chance that the member will place another order within 90 days. The member might have canceled the reservation for some reason and decided to dine out on another day, and therefore places another reservation within 90 days.

(12) Table 1 Variables

Variable | Description | Coding
Reservation Information | |
Return90 | Place new order in 90 days or not? | 1: True, 0: False
Age | Member's age |
age16-25 | Age 16-25 | Dummy variable (1: True, 0: False)
age26-35 | Age 26-35 | Dummy variable (1: True, 0: False)
age36-45 | Age 36-45 | Dummy variable (1: True, 0: False)
ageOther | Age not in 16-45 or missing | Dummy variable (1: True, 0: False)
Gender | Member's gender | Dummy variable (1: Male, 0: Female)
Timediff | Number of days between dining day and order-placing day |
People | Size of the party of diners |
Status | Status of reservation |
status_ok | No information from the restaurant | Dummy variable (1: True, 0: False)
status_canceled | Reservation cancelled | Dummy variable (1: True, 0: False)

(13) Table 1 Variables (continued)

Variable | Description | Coding
status_changed | Reservation changed | Dummy variable (1: True, 0: False)
Restaurant Information | |
Is_hotel | Is located in a hotel | Dummy variable (1: True, 0: False)
Cityarea | Area in which the restaurant is located |
new_taipei_city | In New Taipei City | Dummy variable (1: True, 0: False)
out_of_greater_taipei | Outside Taipei and New Taipei City | Dummy variable (1: True, 0: False)
Wifi | Provides wifi or not | Dummy variable (1: True, 0: False)
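The dummy coding in Table 1 can be illustrated with a minimal Python sketch. The thesis fits its models in R and does not show its preprocessing code, so the function names and the raw status labels ("new", "ok", "canceled", "changed") below are assumptions made for illustration only.

```python
from typing import Dict, Optional


def age_dummies(age: Optional[int]) -> Dict[str, int]:
    """Map a member's age to the four age dummies described in Table 1."""
    known = age is not None
    return {
        "age16-25": int(known and 16 <= age <= 25),
        "age26-35": int(known and 26 <= age <= 35),
        "age36-45": int(known and 36 <= age <= 45),
        # Missing ages and ages outside 16-45 fall into the residual group.
        "ageOther": int(not known or not (16 <= age <= 45)),
    }


def status_dummies(status: Optional[str]) -> Dict[str, int]:
    """Map a raw status label (hypothetical labels) to the three status dummies."""
    return {
        "status_ok": int(status in ("new", "ok")),
        "status_canceled": int(status == "canceled"),
        "status_changed": int(status == "changed"),
    }


if __name__ == "__main__":
    print(age_dummies(30))
    print(age_dummies(None))
    print(status_dummies(None))  # all zeros: the "no status recorded" group
```

Exactly one age dummy is set for every member, while all three status dummies can be zero, which corresponds to the group with no status recorded.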

(14) Table 2 Statistics of Training Data

Variable | n | Mean | Stdev. | Min | Max
Dependent Variable | | | | |
Return90 | 62083 | 0.20 | 0.40 | 0.00 | 1.00
Reservation Information | | | | |
Age | | | | |
age16-25 | 9314 | 0.15 | 0.36 | 0.00 | 1.00
age26-35 | 17770 | 0.29 | 0.45 | 0.00 | 1.00
age36-45 | 9097 | 0.15 | 0.35 | 0.00 | 1.00
ageOther | 25902 | 0.42 | 0.49 | 0.00 | 1.00
Gender | 62083 | 0.45 | 0.50 | 0.00 | 1.00
Timediff | 62083 | 9.75 | 13.57 | 0.00 | 142.00
People | 62083 | 4.05 | 2.93 | 1.00 | 45.00
Status | | | | |
status_ok | 62083 | 0.73 | 0.45 | 0.00 | 1.00
status_canceled | 62083 | 0.19 | 0.40 | 0.00 | 1.00
status_changed | 62083 | 0.01 | 0.12 | 0.00 | 1.00
Restaurant Information | | | | |
Is_hotel | 62083 | 0.36 | 0.48 | 0.00 | 1.00
Cityarea | | | | |
new_taipei_city | 62083 | 0.03 | 0.17 | 0.00 | 1.00
out_of_greater_taipei | 62083 | 0.06 | 0.23 | 0.00 | 1.00
Wifi | 62083 | 0.61 | 0.49 | 0.00 | 1.00

(15) Table 3 Statistics of Test Data

Variable | n | Mean | Stdev. | Min | Max
Dependent Variable | | | | |
Return90 | 41546 | 0.20 | 0.40 | 0.00 | 1.00
Reservation Information | | | | |
Age | | | | |
age16-25 | 6144 | 0.15 | 0.35 | 0.00 | 1.00
age26-35 | 11874 | 0.29 | 0.45 | 0.00 | 1.00
age36-45 | 6046 | 0.15 | 0.35 | 0.00 | 1.00
ageOther | 17482 | 0.42 | 0.49 | 0.00 | 1.00
Gender | 41546 | 0.45 | 0.50 | 0.00 | 1.00
Timediff | 41546 | 9.68 | 13.39 | 0.00 | 147.00
People | 41546 | 4.06 | 3.00 | 0.00 | 39.00
Status | | | | |
status_ok | 41546 | 0.72 | 0.45 | 0.00 | 1.00
status_canceled | 41546 | 0.20 | 0.40 | 0.00 | 1.00
status_changed | 41546 | 0.02 | 0.12 | 0.00 | 1.00
Restaurant Information | | | | |
Is_hotel | 41546 | 0.36 | 0.48 | 0.00 | 1.00
Cityarea | | | | |
new_taipei_city | 41546 | 0.03 | 0.17 | 0.00 | 1.00
out_of_greater_taipei | 41546 | 0.06 | 0.23 | 0.00 | 1.00
Wifi | 41546 | 0.61 | 0.49 | 0.00 | 1.00
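The random partition summarized in Tables 2 and 3 can be sketched as follows. This is a minimal illustration in Python rather than the thesis's own procedure: the function name, the fixed seed, and the exact 0.6 fraction are assumptions (the reported split is 62,083 versus 41,546 records, which is close to but not exactly 60/40).

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_train_test(
    records: Sequence[T], train_fraction: float = 0.6, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle the records reproducibly and cut them into training and test sets."""
    shuffled = list(records)               # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded generator for repeatability
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    booking_ids = list(range(100))         # stand-in for booking records
    train, test = split_train_test(booking_ids)
    print(len(train), len(test))
```

Because the split is random, the share of returning members should be similar in both parts, which is what the reported return probabilities of 0.1979 and 0.20105 suggest.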

(16) Table 4 Statistics of people

Group | People | Sample size | Return90=True : Return90=False | Return prob.
people1-2 | 1-2 | 28909 | 6250 : 22659 | 0.216
people3-5 | 3-5 | 26936 | 5156 : 21780 | 0.191
people6-10 | 6-10 | 11813 | 2105 : 9708 | 0.178
people_others | others | 2342 | 344 : 1998 | 0.147

Table 5 Statistics of timediff

Group | Timediff | Sample size | Return90=True : Return90=False | Return prob.
timediff0-3 | 0-3 | 31869 | 5138 : 26731 | 0.161
timediff4-7 | 4-7 | 13711 | 2620 : 11091 | 0.191
timediff8-14 | 8-14 | 10292 | 2311 : 7981 | 0.225
timediff_others | others | 14128 | 3786 : 10342 | 0.268

Fig 3.1a

Fig 3.1b
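The grouping used in Tables 4 and 5 amounts to binning two numeric variables. A minimal Python sketch of that binning is given below; the group labels and ranges are taken from the tables, while the function names are hypothetical.

```python
def people_group(people: int) -> str:
    """Assign a party size to the groups listed in Table 4."""
    if 1 <= people <= 2:
        return "people1-2"
    if 3 <= people <= 5:
        return "people3-5"
    if 6 <= people <= 10:
        return "people6-10"
    return "people_others"


def timediff_group(days: int) -> str:
    """Assign the booking lead time in days to the groups listed in Table 5."""
    if 0 <= days <= 3:
        return "timediff0-3"
    if 4 <= days <= 7:
        return "timediff4-7"
    if 8 <= days <= 14:
        return "timediff8-14"
    return "timediff_others"


if __name__ == "__main__":
    print(people_group(4), timediff_group(10))
```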

(17) Table 6 Statistics of status

Group | Status | Sample size | Return90=True : Return90=False | Return prob.
new | status_ok=1, status_canceled=0, status_changed=0 | 50504 | 7844 : 42660 | 0.155
cancel | status_ok=0, status_canceled=1, status_changed=0 | 13669 | 5184 : 8485 | 0.379
change | status_ok=0, status_canceled=0, status_changed=1 | 999 | 253 : 746 | 0.253
none | status_ok=0, status_canceled=0, status_changed=0 | 4828 | 574 : 4254 | 0.119

Fig 3.2
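Each return probability is the number of returning bookers divided by the group's sample size. The short Python sketch below recomputes the rates from the Return90=True : Return90=False counts in Table 6; the variable and function names are illustrative.

```python
# (returned, not returned) counts per status group, copied from Table 6.
status_counts = {
    "new": (7844, 42660),
    "cancel": (5184, 8485),
    "change": (253, 746),
    "none": (574, 4254),
}


def return_prob(returned: int, not_returned: int) -> float:
    """Share of bookers in a group who placed another order within 90 days."""
    return returned / (returned + not_returned)


rates = {group: round(return_prob(*counts), 3) for group, counts in status_counts.items()}

if __name__ == "__main__":
    for group, rate in rates.items():
        print(group, rate)
```

The computed values are 0.155, 0.379, 0.253, and 0.119, which match the rates quoted for Fig 3.2 in Section 3.2.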

(18) 4. Method

Since the dependent variable, Return90, is dichotomous, fitting an ordinary linear regression model is not appropriate. Therefore, we employ logit models for estimation. Let $Y_i$ denote Return90 for the $i$th EZTable user:

$$Y_i \sim \mathrm{Bernoulli}(P_i) \tag{1}$$

where $P_i$ denotes the probability of a return visit within 90 days. From a regression perspective, $P_i$ is the expectation of Return90 and can be specified as

$$E(Y_i \mid X_i, \beta) = \Pr(Y_i = 1 \mid X_i, \beta) = P_i \tag{2}$$

where $X_i$ is the vector of independent variables. Consequently, we can specify a logit model

$$P_i = \frac{\exp(\alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki})}{1 + \exp(\alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki})} \tag{3}$$

where $(\beta_1, \beta_2, \dots, \beta_k)$ are the parameters of the explanatory variables. Equation (3) can be written as

$$\log\left(\frac{P_i}{1 - P_i}\right) = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} \tag{4}$$

where $\log\left(\frac{P_i}{1 - P_i}\right)$ is the logit of $P_i$, which allows us to form a linear relationship between the independent variables and $\mathrm{logit}(P_i)$.

The right-hand side of the generalized linear model in equation (4) is additive and linear. However, empirical data may come from various data generation processes and may require non-linear forms of the independent variables. In other words, the relation between the independent variables on the right-hand side of equation (4) and the response variable may be nonlinear. For example, Figure 4.1 shows that the model fits better when we allow a non-linear form of age (see below). It plots simulated ages against the corresponding wages. The straight line illustrates a linear relation between the independent variable and the dependent variable, and the curve shows a non-linear relation between the two variables. The non-linear model fits the data better than the linear one. It captures the non-linear form of age between about 20 and 30 years old.
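The relationship between equations (3) and (4) can be checked numerically. The sketch below is a minimal Python illustration, not the estimation code of the thesis; the coefficient values in the usage example are hypothetical and are not estimates from the EZTable data.

```python
import math
from typing import Sequence


def inv_logit(eta: float) -> float:
    """Equation (3): exp(eta) / (1 + exp(eta)), written in an equivalent form."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p: float) -> float:
    """Left-hand side of equation (4): the log-odds of a probability p."""
    return math.log(p / (1.0 - p))


def return_probability(alpha: float, betas: Sequence[float], xs: Sequence[float]) -> float:
    """P_i for one member, given an intercept, coefficients, and predictor values."""
    eta = alpha + sum(b * x for b, x in zip(betas, xs))
    return inv_logit(eta)


if __name__ == "__main__":
    # Hypothetical coefficients for two predictors, for illustration only.
    p = return_probability(alpha=-1.5, betas=[0.02, -0.05], xs=[10.0, 4.0])
    print(p, logit(p))  # logit(p) recovers the linear predictor
```

Applying `logit` to the output of `inv_logit` returns the linear predictor, which is the sense in which the model is linear on the logit scale even though $P_i$ itself is bounded between 0 and 1.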

(19) Fig 4.1

In order to capture non-linear patterns, we apply the generalized additive model (GAM). GAM is a generalized linear model that uses smooth functions to incorporate non-linear patterns of the independent variables:

$$\log\left(\frac{P_i}{1 - P_i}\right) = \alpha + \beta_1 S_1(X_{1i}) + \beta_2 S_2(X_{2i}) + \dots + \beta_k S_k(X_{ki}) \tag{5}$$

where $S(\cdot)$ stands for the smooth function of a continuous independent variable. We use the package gam in R to fit the GAM. Through this package, we want to find a function $S$ that fits the data well but is smooth at the same time (James et al., 2013). One natural way is to determine the function $S$ that minimizes the spline objective function

$$\frac{1}{n} \sum_{i=1}^{n} \left(y_i - S(x_i)\right)^2 + \lambda \int \left(S''(x)\right)^2 \, dx \tag{6}$$

With $y$ predicted by the curve $S(x)$, the first term of equation (6) is the mean squared error (MSE), which tries to make $S(x)$ match the data at each $x_i$. The second term measures the curvature of $S$, which controls how wiggly $S(x)$ is. It is modulated by the tuning parameter $\lambda \ge 0$: the higher the value of $\lambda$, the smoother the curve. Equivalently, the smoothing function minimizes the MSE subject to a restriction on the average curvature. The smoothness of $S(x)$ needs to strike a balance between minimizing the mean squared error and the penalty due to increased curvature.

Such a model extends the form of a generalized linear regression by allowing

(20) nonlinear patterns. This is more flexible because the relation between the independent variable and the dependent variable is not required to be linear. In other words, although the regression is not linear in $x$, it is linear in the transformed variable $S(x)$. As mentioned for Fig 4.1, the transformation can capture the nonlinearity between age and wage that a general linear regression would miss (Kim Larsen, 2015).

The purpose of our study is to enhance the predictive performance for return visits to EZTable. We explore different model specifications and evaluate their performance by the receiver operating characteristic (ROC). ROC is a common tool for evaluating the prediction accuracy of a binary classification system. It is widely used in data mining and machine learning research (James et al., 2013). Among ROC methods, the ROC curve is the most popular. It is a two-dimensional graph defined by the TP rate (Equation (7)) and the FP rate (Equation (8)). In a binary classification problem, there are four possible outcomes (see Table 7). Given that the outcome is in fact positive, a true positive (TP) occurs when the model predicts a positive result, and a false negative (FN) occurs when the model predicts a negative result. Similarly, given that the outcome is in fact negative, a true negative (TN) occurs when the model predicts a negative result, and a false positive (FP) occurs when the model predicts a positive result.

$$\mathrm{TPR\ (true\ positive\ rate)} = \frac{TP}{TP + FN} \tag{7}$$

$$\mathrm{FPR\ (false\ positive\ rate)} = \frac{FP}{TN + FP} \tag{8}$$

Table 7 confusion table

Predict \ Actual | Positive | Negative | Total
Positive | True Positive (TP) | False Positive (FP) | TP+FP
Negative | False Negative (FN) | True Negative (TN) | FN+TN
Total | TP+FN | FP+TN | TP+FP+FN+TN

The ROC curve utilizes TPR (true positive rate) and FPR (false positive rate).
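Equations (7) and (8) follow directly from the four cells of the confusion table. The Python sketch below counts the cells for a small set of made-up labels and predictions and then computes both rates; the data and function names are illustrative assumptions.

```python
from typing import Sequence, Tuple


def confusion_counts(actual: Sequence[int], predicted: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (TP, FP, FN, TN) for binary labels coded as 1 (positive) and 0 (negative)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def true_positive_rate(tp: int, fn: int) -> float:
    """Equation (7): share of actual positives that are predicted positive."""
    return tp / (tp + fn)


def false_positive_rate(fp: int, tn: int) -> float:
    """Equation (8): share of actual negatives that are predicted positive."""
    return fp / (tn + fp)


if __name__ == "__main__":
    actual = [1, 1, 0, 0, 1, 0]      # made-up Return90 values
    predicted = [1, 0, 0, 1, 1, 0]   # made-up model predictions
    tp, fp, fn, tn = confusion_counts(actual, predicted)
    print(true_positive_rate(tp, fn), false_positive_rate(fp, tn))
```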
The true positive rate is also known as sensitivity: the probability of classifying a positive response as positive. The false positive rate, on the other hand, is the probability of classifying a negative response as positive. TPR is plotted on the Y axis and FPR on the X axis, so each point on the curve represents one combination of true positive rate and false positive rate. In Fig 4.2, the dotted line traces the combinations of TPR and FPR. If the dotted line passes through the point (0, 1), the model classifies all outcomes perfectly, with TPR equal to 1 and FPR equal to 0. In our case, it would correctly predict whether every member returns to place an order within 90 days. On the other hand, if the dotted line passes through the point (1, 0), the model wrongly predicts every member's behavior. The diagonal line y = x divides the ROC space into two parts, A and B. The diagonal represents a random-guessing classification model, whose TPR and FPR are equal. If the curve is above the diagonal, the model classifies better than random guessing; if it is below the diagonal, the model performs worse than random guessing (see Fig 4.2).

Fig 4.2 (ROC space divided by the diagonal y = x into region A below the diagonal and region B above it)

In addition to comparing the shapes of the curves from different models, the area under the curve (AUC) is another metric for evaluating the performance of predictive models. AUC is the area under the ROC curve and ranges from 0 to 1. The area of A, which lies under the diagonal line, is 0.5; this corresponds to a random-guessing model. If the AUC is larger than 0.5, as for the dotted line, the model performs better than random guessing. That is, the larger the AUC, the better the model performs. In our study, we use AUC to compare the predictive performance of different models.
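The ROC curve and AUC described above can be sketched in a few lines: sweep a threshold over the predicted scores, record one (FPR, TPR) point per threshold, and integrate with the trapezoid rule. This is an illustrative sketch with made-up scores, not the procedure or software used in the thesis.

```python
def roc_points(actual, scores):
    """Return ROC points (FPR, TPR) from (0, 0) to (1, 1), one per distinct score."""
    n_pos = sum(actual)
    n_neg = len(actual) - n_pos
    # Sort by score, highest first; lowering the threshold admits one score level at a time.
    ranked = sorted(zip(scores, actual), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Records with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under a piecewise-linear ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Hypothetical scores for two positives followed by two negatives.
labels = [1, 1, 0, 0]
perfect = auc_trapezoid(roc_points(labels, [0.9, 0.8, 0.2, 0.1]))
uninformative = auc_trapezoid(roc_points(labels, [0.5, 0.5, 0.5, 0.5]))
inverted = auc_trapezoid(roc_points(labels, [0.1, 0.2, 0.8, 0.9]))
print(perfect, uninformative, inverted)  # prints: 1.0 0.5 0.0
```

The three cases match the text: a curve through (0, 1) gives AUC 1, a model that cannot separate the classes lies on the diagonal with AUC 0.5, and a curve through (1, 0) gives AUC 0.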

5. Empirical result

5.1 GAM

In order to identify which variables would practically enhance predictive performance, we specify models using reservation information and restaurant information and evaluate their performance by AUC (see Section 4 for details). We note that all the variables used for forecasting are statistically significant. In our study, 𝑃𝑖 denotes Pr(𝑅𝑒𝑡𝑢𝑟𝑛90𝑖 = 1 | 𝑋𝑖 β).

$$\log\left(\frac{P_i}{1 - P_i}\right) = \alpha + \beta_1 S_1(\text{Gender}) + \beta_2 S_2(\text{Age}) \tag{9}$$

Equation (9) is a generalized additive model with a member's age (Age) and gender (Gender) as predictors. Traditionally, such member profiles are considered powerful predictors of consumer behavior. However, the AUC (Fig 5.1) is only 7.7% larger than that of a random-guessing model (AUC = 0.5). That is, Age and Gender are of little help in predicting whether a customer returns (Return90). Moreover, the independent variable Age has many missing values, which may lead to a loss of information.

Fig 5.1
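To make the left-hand side of Equation (9) concrete, the sketch below shows how an additive predictor on the log-odds scale is converted into a return probability through the inverse logit. The smooth functions, their constants, the intercept, and the 0/1 coding of gender are all hypothetical placeholders (the β coefficients are folded into the smooth terms); they are not the fitted terms from the thesis.

```python
import math


def inverse_logit(eta):
    """Map log-odds eta to a probability: P = 1 / (1 + exp(-eta))."""
    return 1.0 / (1.0 + math.exp(-eta))


def logit(p):
    """Log-odds of a probability, the left-hand side of Equation (9)."""
    return math.log(p / (1.0 - p))


# Hypothetical smooth terms, made up for illustration only.
def s_age(age):
    return -0.0005 * (age - 35) ** 2


def s_gender(gender):
    return 0.1 * gender  # assumes gender is coded 0/1


ALPHA = -1.5  # hypothetical intercept


def return_probability(age, gender):
    """Additive predictor on the log-odds scale, then the inverse logit."""
    eta = ALPHA + s_gender(gender) + s_age(age)
    return inverse_logit(eta)


p = return_probability(age=35, gender=1)
print(round(p, 4), round(logit(p), 4))
```

Because the smooth terms enter additively on the log-odds scale, each predictor's contribution can be inspected separately, which is what makes a GAM easier to interpret than most black-box learners.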

We next consider the time between the day the order is placed and the dining day (Timediff) and the reservation group size (People), as in Equation (10).

$$\log\left(\frac{P_i}{1 - P_i}\right) = \alpha + \beta_1 S_1(\text{Timediff}) + \beta_2 S_2(\text{People}) \tag{10}$$

From Fig 5.2, the ROC curve of Equation (10) is closer to the point (0, 1) than that of Equation (9) in Fig 5.1. Likewise, the AUC of Equation (10) is 0.5799, so Equation (10) has better prediction performance than Equation (9) (0.5799 versus 0.5385). As a result, we consider that these two predictors improve prediction performance.

Fig 5.2

Since some reservation variables improve prediction performance, we further examine whether the status of a reservation improves it as well.

$$\log\left(\frac{P_i}{1 - P_i}\right) = \alpha + \beta_1\,\text{status\_ok} + \beta_2\,\text{status\_canceled} + \beta_3\,\text{status\_changed} \tag{11}$$

Equation (11) is another classification model using reservation status: the reservation is new or ok (status_ok), canceled (status_canceled), or changed (status_changed). As mentioned in Section 3, these three independent variables record the status of a reservation. Fig 5.3 compares model M1 (Equation (10)) and model M2 (Equation (11)). M2 predicts better: its AUC is 0.6236, which is higher than the 0.5799 of M1.

Consequently, the status of a reservation improves the prediction as well.

Fig 5.3

$$\log\left(\frac{P_i}{1 - P_i}\right) = \alpha + \beta_1\,\text{status\_ok} + \beta_2\,\text{status\_canceled} + \beta_3\,\text{status\_changed} + \beta_4 S_1(\text{Timediff}) + \beta_5 S_2(\text{People}) \tag{12}$$

Equation (12) combines the independent variables of Equation (10) and Equation (11). Fig 5.4 shows the ROC curves of the new model M3 and of the other two models, M1 and M2. The AUC of M3 is 0.6605, an improvement over M2 (from 0.6236 to 0.6605).

Fig 5.4

With the improvement at M3, we attempt to introduce all the other variables into the classification models. We conjecture that the prediction power of the member profile may increase in a larger predictor set. As mentioned in Section 3, besides reservation information, the dataset also includes information on the restaurants in the booking records. It is possible that restaurant information influences members' behavior.

$$
\begin{aligned}
\log\left(\frac{P_i}{1 - P_i}\right) = \alpha &+ \beta_1\,\text{age16-25} + \beta_2\,\text{ag2} + \beta_3\,\text{age36-45} + \beta_4\,\text{gender} \\
&+ \beta_5\,\text{status\_ok} + \beta_6\,\text{status\_canceled} + \beta_7\,\text{status\_changed} \\
&+ \beta_8 S_1(\text{Timediff}) + \beta_9 S_2(\text{People}) + \beta_{10}\,\text{is\_hotel} \\
&+ \beta_{11}\,\text{new\_taipei\_city} + \beta_{12}\,\text{out\_of\_greater\_taipei} + \beta_{13}\,\text{wifi}
\end{aligned} \tag{13}
$$

Equation (13) includes the member's profile and restaurant information: age, gender, the location of the restaurant (area), whether the restaurant is in a hotel (is_hotel), and whether the restaurant provides wifi (wifi). In Fig 5.5, the ROC curve of M4 is almost the same as that of M3. This implies that, in our case, a restaurant's geographic information and facilities are not powerful predictors of a member's usage of an online booking service. In other words, the restaurant's location has little influence on member behavior. Table 8 compares the AUC of these GAM models.

Fig 5.5

Table 8 GAM model comparison

Model Name    Variables                                AUC
M0            age, gender                              0.5305
M1            timediff, people                         0.5799
M2            status                                   0.6236
M3            timediff, people, status                 0.6605
M4            timediff, people, status, restaurant     0.6644

5.2 Tree-Based Learning

After constructing GAM models with different combinations of independent variables, we are not able to improve their AUC further. Therefore, we try tree-based methods, which also perform well in data classification. The goal of tree-based methods is to divide the predictor space into several regions for classification. At the end of the division, the regions are summarized as a tree. A decision tree is a method that constructs a tree through this process. It can classify data, and its results are easy to interpret. However, its prediction power is poor compared with other learning methods such as logistic regression and GAM (James, et al., 2013). Therefore, we also fit our data with two other tree-based methods, bagging and random forest. Unlike a single decision tree, these two approaches resample the training data through the bootstrap. The bootstrap resamples the data multiple times by sampling the training data with replacement, and it is commonly used to reduce the variance in predictions that arises from differences between training sets.

Bagging first bootstraps new training data sets from the original one. Each new training set is used separately to fit a prediction model with the full set of p predictors, and the prediction results are then aggregated into a single model. Generally speaking, the results are aggregated through voting among the classifiers. For instance, if the training data is resampled 100 times through the bootstrap, there are 100 classifiers and therefore 100 predictions for each observation. The final prediction is decided by a vote among the 100 predictions: if 90 of them are “Yes”, the final prediction for the observation is “Yes”. Random forest, on the other hand, is a refined version of bagging. Like bagging, multiple training data sets are bootstrapped.
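The bootstrap-and-vote procedure described above for bagging can be sketched as follows. The base learner here is a deliberately tiny one-split classifier on a single binary feature, and the data are hypothetical; both are assumptions chosen to keep the example self-contained, not the trees or data used in the thesis.

```python
import random


def bootstrap_sample(data, rng):
    """Draw len(data) records from data with replacement."""
    return [rng.choice(data) for _ in range(len(data))]


def fit_stump(sample):
    """Fit a one-split classifier: for each value of x, predict the majority label."""
    counts = {}
    for x, y in sample:
        yes, no = counts.get(x, (0, 0))
        counts[x] = (yes + 1, no) if y == "Yes" else (yes, no + 1)

    def predict(x):
        yes, no = counts.get(x, (0, 0))
        return "Yes" if yes > no else "No"

    return predict


def majority_vote(predictions):
    """Aggregate the classifiers' predictions into one final label."""
    yes = sum(1 for p in predictions if p == "Yes")
    return "Yes" if yes > len(predictions) - yes else "No"


def bagged_predict(data, x, n_bootstrap=100, seed=0):
    """Fit one stump per bootstrap sample and let the stumps vote."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_bootstrap):
        stump = fit_stump(bootstrap_sample(data, rng))
        votes.append(stump(x))
    return majority_vote(votes)


# Hypothetical records (x, label): x = 1 mostly "Yes", x = 0 mostly "No".
data = [(1, "Yes")] * 8 + [(1, "No")] * 2 + [(0, "Yes")] * 1 + [(0, "No")] * 9

print(majority_vote(["Yes"] * 90 + ["No"] * 10))  # the 90-of-100 case from the text
print(bagged_predict(data, 1), bagged_predict(data, 0))
```

Individual stumps can disagree because each sees a different resample, but the vote smooths those differences out, which is the variance reduction bagging aims for.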
However, each time a split in one of these trees is considered, only a random subset of m predictors is chosen from the full set of p predictors as split candidates. The number m is typically the square root of p, that is, 𝑚 = √𝑝. This improves on bagging by de-correlating the trees, which helps random forest reduce overfitting (James, et al., 2013).

Fig 5.6 visualizes two types of decision tree: a classification tree and a regression tree. The classification tree has only one split, on status_canceled. Whether a record's status_canceled is 1 or 0, the record is classified as not placing another reservation within 90 days. The tree therefore makes no positive predictions at all, which leads to a poor AUC close to 0.5. In contrast, the AUC of the regression tree is about 0.61, because it produces graded predictions rather than all-negative ones, which improves prediction performance. The regression tree also has only one split. If a record's status_canceled is 1, the member's return probability is 0.3834; if it is 0, the return probability is 0.1553. Consequently, we use regression trees to construct the models in the following analysis, including bagging and random forest.

Fig 5.7 compares the AUC of the tree-based models (decision tree, bagging, and random forest) and GAM. The independent variables used to construct the models are the same as in Equation (13). The AUC of GAM is 0.6675, which is higher than that of all the other models, including random forest (0.6582). This indicates that, with the same predictors as M4, GAM still performs better than the tree-based methods. Interestingly, the AUC of the decision tree with only one predictor is 0.6145, which is higher than that of bagging (0.6058).
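The difference between the two one-split trees can be illustrated with the leaf probabilities quoted above (0.3834 and 0.1553). The sketch assumes the classification tree labels a leaf positive only when its return probability exceeds 0.5; that cutoff and the six records below are hypothetical, so the resulting AUC values illustrate the mechanism and are not the thesis results.

```python
# Leaf return probabilities reported for the one-split regression tree.
LEAF_RETURN_PROB = {1: 0.3834, 0: 0.1553}


def regression_tree_score(status_canceled):
    """Graded prediction: the return probability of the leaf."""
    return LEAF_RETURN_PROB[status_canceled]


def classification_tree_label(status_canceled, cutoff=0.5):
    """Hard prediction; the 0.5 cutoff is an assumption for this sketch."""
    return 1 if LEAF_RETURN_PROB[status_canceled] > cutoff else 0


def pairwise_auc(actual, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for a, s in zip(actual, scores) if a == 1]
    neg = [s for a, s in zip(actual, scores) if a == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical records: (status_canceled, returned within 90 days).
records = [(1, 1), (1, 0), (0, 1), (0, 0), (0, 0), (0, 0)]
returned = [r for _, r in records]

labels = [classification_tree_label(c) for c, _ in records]
scores = [regression_tree_score(c) for c, _ in records]

print(labels)                          # every record is predicted negative
print(pairwise_auc(returned, labels))  # prints: 0.5
print(pairwise_auc(returned, scores))  # prints: 0.625
```

Both leaf probabilities are below 0.5, so the hard labels are constant and carry no ranking information, while the graded scores still rank canceled reservations above the rest.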

Fig 5.6

Fig 5.7

Fig 5.8

Fig 5.9

Fig 5.8 and Fig 5.9 compare the variable importance in bagging and random forest. For each tree, the original prediction error is recorded, and then the prediction error is recorded again after permuting each predictor variable in turn. For example, if there are 10 variables in the model, there is one original prediction error for the model with the full set of variables and 10 permuted prediction errors, one for each variable in the tree. The differences between the original prediction error and the permuted prediction errors are then averaged across trees and normalized by the standard deviation of these differences. The normalized difference is the relative importance of the variable. The variable with the largest error difference is the most important one, because the prediction error increases substantially when its values are permuted. As the structure of the decision tree (Fig 5.6) suggests, status_canceled is the most important variable in both the bagging and random forest models. However, restaurant information such as is_hotel and wifi also appears to improve prediction considerably, unlike in the GAM models.

In order to improve the performance of tree-based models, we also include two-way interaction terms. First, we add the squared terms of people and timediff to Equation (13), resulting in 15 independent variables. Then each of these independent variables is multiplied with the others, giving 96 two-way interaction terms. These are all pairwise combinations of the variables, except that interactions within the status, age, and area groups are excluded because they are categorical variables, and the interactions between people, timediff and their squared terms are also excluded:

$$96 = \binom{15}{2} - \binom{3}{2} - \binom{3}{2} - \binom{2}{2} - 2$$

As a result, there are 111 independent variables in total (96 + 15).
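The variable counts above can be checked with a short calculation. The grouping in the comments (three status dummies, three age dummies, two area dummies, and one excluded product for each of people and timediff with its own square) is our reading of the exclusions, stated as an assumption. The last line applies the usual 𝑚 = √𝑝 rule to the 111 variables, which gives the 11 candidate variables quoted for the random forest model.

```python
from math import comb, sqrt

n_main = 15                      # 13 variables in Equation (13) + people^2 + timediff^2
all_pairs = comb(n_main, 2)      # every two-way product of distinct variables

excluded = (
    comb(3, 2)    # products among the three status dummies
    + comb(3, 2)  # products among the three age dummies
    + comb(2, 2)  # product of the two area dummies
    + 2           # people x people^2 and timediff x timediff^2 (assumed reading)
)

n_interactions = all_pairs - excluded
n_total = n_interactions + n_main
m = round(sqrt(n_total))         # random forest subset size, m close to sqrt(p)

print(all_pairs, n_interactions, n_total, m)  # prints: 105 96 111 11
```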
Moreover, the random forest model considers only a randomly chosen subset of the independent variables, unlike the bagging model. Consequently, the number of variables considered is 111 in the bagging model and 11 (≈ √111) in the random forest model.

Fig 5.10

Fig 5.10 compares the AUC of the tree-based learning models with two-way interaction terms and GAM without interaction terms. GAM still performs better than the tree-based learning models. The AUC of the decision tree model (0.6145) is higher than that of the bagging model (0.6058) and the random forest model (0.6092). The random forest model with interaction terms (0.6092) also performs worse than the one without interaction terms (0.6582). Fig 5.11 lists the variable importance in the random forest model with interaction terms. No interaction term appears on the list, which explains why the performance of the random forest model does not improve after these interaction terms are added. In other words, interaction terms cannot improve the prediction performance in this context. Moreover, it is surprising that is_hotel and wifi are two of the most important predictors in the model. We believe this is because the well-performing variable, status_canceled, does not appear in every randomly selected variable set in the random forest. Other variables, such as is_hotel and wifi, may therefore perform well in the sets that lack status_canceled, so their importance may be relatively higher than that of status_canceled.
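The permutation-based variable importance behind Fig 5.8, Fig 5.9 and Fig 5.11 can be sketched in a simplified form. The version below uses one fixed, hypothetical model instead of an ensemble of trees, measures the average increase in error after shuffling one column, and omits the normalization by the standard deviation across trees that the text describes. The data and the model are made up for illustration.

```python
import random


def error_rate(model, rows, labels):
    """Share of records the model classifies incorrectly."""
    wrong = sum(1 for row, y in zip(rows, labels) if model(row) != y)
    return wrong / len(rows)


def permutation_importance(model, rows, labels, column, n_repeats, rng):
    """Average increase in error after shuffling the values of one column."""
    base = error_rate(model, rows, labels)
    increases = []
    for _ in range(n_repeats):
        shuffled = [row[column] for row in rows]
        rng.shuffle(shuffled)
        permuted = [
            row[:column] + (value,) + row[column + 1:]
            for row, value in zip(rows, shuffled)
        ]
        increases.append(error_rate(model, permuted, labels) - base)
    return sum(increases) / n_repeats


# Hypothetical records: (status_canceled, wifi); label = returned within 90 days.
rows = [(1, 0), (1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0), (0, 1)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]


def model(row):
    """Hypothetical fitted model that only looks at status_canceled."""
    return row[0]


rng = random.Random(0)
importance_status = permutation_importance(model, rows, labels, 0, 50, rng)
importance_wifi = permutation_importance(model, rows, labels, 1, 50, rng)
print(importance_status, importance_wifi)
```

Shuffling a column the model relies on breaks its link to the outcome and raises the error, whereas shuffling an unused column leaves the error unchanged, so its importance is zero.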

Fig 5.11

Given the poor performance of the random forest model with interaction terms, we believe that having too many predictors raises the probability of selecting variables without prediction power when training the model. Therefore, we eliminate redundant variables and keep the predictors that improve performance. According to our results, the status of the reservation (status_ok, status_canceled, status_changed) improves prediction performance considerably. We therefore try specifications focusing on reservation status to see whether GAM still performs better than the tree-based learning methods in this context. In the first model, we use the three status variables (status_ok, status_canceled, status_changed) and the interactions between them and the remaining 10 variables in Equation (13), so the model has 33 (3 + 3 × 10) variables. In the second model, the 10 variables themselves are also added, giving 43 (3 + 3 × 10 + 10) independent variables. Fig 5.12 and Fig 5.13 compare the AUC of the four training methods under these two sets of independent variables: Fig 5.12 shows the first model with 33 variables, and Fig 5.13 shows the second model with 43 variables. GAM still performs better than all the tree-based models. The performance of the random forest model also improves greatly under this combination of independent variables: its AUC grows from 0.6092 to about 0.665. Table 9 compares the learning methods with and without interaction terms.

Fig 5.12

Fig 5.13

Table 9 Learning methods comparison

                     Model AUC
Learning method      Without interaction terms    With interaction terms (status)
bagging              0.6058                       0.6067
random forest        0.6582                       0.665
GAM                  0.6675                       0.6675

6. Discussion

Our study has two major findings, pertaining to variable selection and to the modeling method for predicting customer retention in online reservation services. Regarding variable selection, the comparison of the predictive models shows that age and gender have weak prediction power. Traditionally, information about members themselves is considered strongly connected to their behavior: it is expected to influence whether a member buys products or services again and thus to help predict member behavior. However, in our study, the models with strong prediction power are those related to the reservation itself, with variables such as the status of the reservation (status), the time between the day the order is placed and the dining day (timediff), and the size of the dining group (people). Restaurant information barely improves prediction performance. Although these independent variables are all statistically significant in the models, they may not have strong prediction power. In other words, statistically significant variables may help explain customer behavior without necessarily improving the performance of predictive models.

Regarding the modeling method, in Section 5.2 we apply tree-based learning methods to the training data. These methods, such as decision tree and random forest, are widely used for data classification because of their prediction power. In order to provide more information to the tree-based learning methods, two-way interaction terms are added to the models. However, the prediction performance of this specification is worse than that of GAM. Because not all variables are selected in random forest, some trees may contain variables without prediction power, e.g. age and restaurant information. We therefore focus on the interactions between status and the other predictors to fit the training data. Under this specification, GAM still performs better than the tree-based learning methods. That is, computationally complex models cannot improve AUC further; instead, GAM, with less computational effort, provides the best prediction accuracy. As a result, we conclude that neither model selection nor model complexity is the most important issue in our research context. It is variable selection that determines prediction power across models. For example, choosing transaction-dependent variables such as status over restaurant information significantly improves prediction performance.

In conclusion, service providers like EZTable can focus on collecting data about reservations. For example, reservation status can be collected automatically by the system. Also, the time the order is placed and the size of the dining group must be correct for a member to place a reservation. These data are efficient to collect because they are highly accurate, which avoids recording erroneous data entered by members. Because data selection is crucial in this context, we may increase prediction performance by collecting more information about reservations. For instance, a service provider can collect the member's reserved dining time, which may reveal members' tendencies in choosing dining times. If dining time is connected with customer retention, EZTable could offer discounts to people who tend to dine at a particular time because they have a higher probability of placing another reservation. With improved prediction performance, service providers can target advertisements accurately at members with high return probability. Marketing cost will then decrease because they no longer have to spend on advertising to members with a low probability of returning. Hopefully, the company's profit would increase accordingly.

7. References

Au, W., Chan, K., & Yao, X. (2003). A novel evolutionary data mining algorithm with applications to churn prediction. IEEE Transactions on Evolutionary Computation, Vol. 7, No. 6, 532-545.
Coussement, K., & Van den Poel, D. (2008). Churn prediction in subscription services: An application of support vector machines while comparing two parameter-selection techniques. Expert Systems with Applications, Vol. 34, 313-327.
Coussement, K., Benoit, D. F., & Van den Poel, D. (2010). Improved marketing decision making in a customer churn prediction context using generalized additive models. Expert Systems with Applications, Vol. 37, No. 3, 2132-2143.
Dasgupta, K., Singh, R., Viswanathan, B., Chakraborty, D., Mukherjea, S., Nanavati, A., & Joshi, A. (2008). Social ties and their relevance to churn in mobile telecom networks. In: Proceedings of the 11th international conference on Extending Database Technology: Advances in database technology, ACM, 697-711.
Difference between glm and splines, Retrieved June 20 2016, from http://stats.stackexchange.com/questions/115245/what-is-the-difference-between-glm-and-splines
Eastin, M. S. (2002). Diffusion of e-commerce: an analysis of the adoption of four e-commerce activities. Telematics and Informatics, Vol. 19, No. 3, 251-267.
eCommerce Industry Outlook 2015, Retrieved August 15 2016, from http://www.criteo.com/media/1432/criteo-ecommerce-industry-outlook-2015.pdf
Emmanouilides, C., & Hammond, K. (2000). Internet usage: Predictors of active users and frequency of use. Journal of Interactive Marketing, Vol. 14, No. 2, 17-32.
From Functional Data to Smooth Functions, Retrieved June 19 2016, from http://www.psych.mcgill.ca/misc/fda/downloads/FDAtalks/smooth_talk.pdf
Fawcett, T. (2006). An introduction to ROC analysis. Pattern Recognition Letters, Vol. 27, No. 8, 861-874.
GAM: The Predictive Modeling Silver Bullet, Retrieved June 20 2016, from http://multithreaded.stitchfix.com/blog/2015/07/30/gam/
Ganesh, J., Arnold, M., & Reynolds, K. (2000). Understanding the customer base of service providers: An examination of the differences between switchers and stayers. Journal of Marketing, Vol. 64, No. 3, 65-87.
Generalized additive model, Retrieved June 17 2016, from https://en.wikipedia.org/wiki/Generalized_additive_model
Häubl, G., & Trifts, V. (2000). Consumer decision making in online shopping environments: The effects of interactive decision aids. Marketing Science, Vol. 19, No. 1, 4-21.
Hastie, Trevor J., & Robert J. Tibshirani. (1990). Generalized additive models. CRC Press, Vol. 43.
James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An introduction to statistical learning. New York: Springer, Vol. 6.
Jarvis, L. P., & E. J. Mayo. (1986). Winning the Market-Share Game. Cornell Hotel Restaurant Administration Quarterly, Vol. 27, No. 3, 72-79.
Lemmens, A., & Croux, C. (2006). Bagging and boosting classification trees to predict churn. Journal of Marketing Research, Vol. 43, No. 2, 276-286.
Lemon, Katherine N., Tiffany Barnett White, & Russell S. Winer. (2002). Dynamic customer relationship management: Incorporating future considerations into the service retention decision. Journal of Marketing, Vol. 66, No. 1, 1-14.
Luarn, Pin, & Hsin-Hui Lin. (2003). A Customer Loyalty Model for E-Service Context. J. Electron. Commerce Res, Vol. 4, No. 4, 156-167.
Moe, W. W., & Fader, P. S. (2004). Capturing Evolving Visit Behaviour in Clickstream Data. Journal of Interactive Marketing, Vol. 18, No. 1, 5-19.
Moe, W. W., & Fader, P. S. (2004). Dynamic Conversion Behaviour at e-Commerce Sites. Management Science, Vol. 50, No. 3, 326-335.
Montgomery, A. L. (2001). Applying quantitative marketing techniques to the internet. Interfaces, Vol. 31, No. 2, 90-108.
Morrison, A. M., Jing, S., O'Leary, J. T., & Cai, L. A. (2001). Predicting usage of the Internet for travel bookings: An exploratory study. Information Technology & Tourism, Vol. 4, No. 1, 15-30.
Mozer, M. C., Wolniewicz, R., Grimes, D. B., Johnson, E., & Kaushansky, H. (2000). Predicting subscriber dissatisfaction and improving retention in the wireless telecommunications industry. IEEE Transactions on Neural Networks, Vol. 11, No. 3, 690-696.
Neslin, S., Gupta, S., Kamakura, W., Lu, J., & Mason, C. (2006). Defection detection: Measuring and understanding the predictive accuracy of customer churn models. Journal of Marketing Research, Vol. 43, No. 2, 204-211.
Reichheld, F. F., & P. Schefter. (2000). E-Loyalty: Your Secret Weapon on the Web. Harvard Business Review, Vol. 78, No. 4, 105-113.
ROC curve, Retrieved June 17 2016, from http://estat.pixnet.net/blog/post/61795603-roc%E6%9B%B2%E7%B7%9A-(receiver-operating-characteristic-curve)
Rust, R. T., & Zahorik, A. J. (1993). Customer satisfaction, customer retention, and market share. Journal of Retailing, Vol. 69, No. 2, 193-215.
Shankar, V., Smith, A. K., & Rangaswamy, A. (2003). Customer satisfaction and loyalty in online and offline environments. International Journal of Research in Marketing, Vol. 20, No. 2, 153-175.
Shmueli, G. (2010). To explain or to predict? Statistical Science, Vol. 25, No. 3, 289-310.
Shim, S., Eastlick, M. A., Lotz, S. L., & Warrington, P. (2001). An online prepurchase intentions model: The role of intention to search. Journal of Retailing, Vol. 77, No. 3, 397-416.
Sismeiro, C., & Bucklin, R. E. (2004). Modeling purchase behavior at an e-commerce web site: A task-completion approach. Journal of Marketing Research, Vol. 41, No. 3, 306-323.
Smoothing spline, Retrieved June 17 2016, from https://en.wikipedia.org/wiki/Smoothing_spline
Smooth Splines: Advanced Methods for Data Analysis, Retrieved June 20 2016, from http://www.stat.cmu.edu/~ryantibs/advmethods/notes/smoothspline.pdf
Van den Poel, Dirk, & Wouter Buckinx. (2005). Predicting online-purchasing behaviour. European Journal of Operational Research, Vol. 166, No. 2, 557-575.
Xie, Y., Li, X., Ngai, E. W. T., & Ying, W. (2009). Customer churn prediction using improved balanced random forests. Expert Systems with Applications, Vol. 36, No. 3, 5445-5449.
