Chapter 4. Hybrid Multiple Channels-based (HMC) Method
4.1.2 The Recommendation Engine
The proposed hybrid multiple-channel method derives recommendations by combining the association-rule and most-frequent-items approaches. For each group of users, two kinds of association rules are extracted: product-level association rules and category-level association rules. The former are extracted from the product transactions; the latter are extracted from category-level transactions, which are derived by replacing the products in the product transactions with their respective categories. The recommendation engine comprises three components: the product association rules ($X_H^{PR_i} \rightarrow Y_H^{PR_i}$) component, the product category association rules ($X_H^{CR_j} \rightarrow Y_H^{CR_j}$) component, and the most frequent items ($Y_H^{Mf}$) component, as shown in Fig. 12. In the figure, $H$ represents M, T, C, or W, which denote the mobile, television, catalog, and Web channels respectively.
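The derivation of category-level transactions described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the product-to-category mapping, and the toy data are hypothetical, and the choice to collapse repeated categories within one transaction (by using a set) is an assumption.

```python
def to_category_transactions(product_transactions, product_to_category):
    """Replace every product in each transaction with its category.

    Assumption: a transaction is a set, so two products from the same
    category collapse into a single category entry.
    """
    return [
        {product_to_category[product] for product in transaction}
        for transaction in product_transactions
    ]


# Hypothetical toy data (not taken from the study's dataset).
product_to_category = {"p1": "watches", "p2": "watches", "p3": "skincare"}
product_transactions = [{"p1", "p2"}, {"p2", "p3"}]

category_transactions = to_category_transactions(
    product_transactions, product_to_category
)
print(category_transactions)
```

Category-level rules can then be mined from `category_transactions` in the same way that product-level rules are mined from `product_transactions`.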
In the multiple-channel approach, let $X_H^{PR_i} \rightarrow Y_H^{PR_i}$, $H \in \{M, T, C, W\}$, be the product-level association rules extracted from the product transactions of a group of channel users, comprising mobile, television, catalog, and Web channel users; and let their associated confidence scores be $cf_M^{PR_i}$, $cf_T^{PR_i}$, $cf_C^{PR_i}$, and $cf_W^{PR_i}$ respectively.
In addition, let $X_u$ represent the set of products that the target user $u$ previously browsed in the mobile channel; and let $Y_u^{AR}$ be the set of candidate products generated from the union of $Y_H^{PR_i} - X_u$ according to all the association rules [...] catalog, and Web channels respectively.
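As an illustration of how candidate products might be generated from product-level rules and scored across channels, the sketch below makes several assumptions that the text does not confirm: a rule fires when its antecedent is contained in the user's browsed set, the highest confidence is kept when several rules in one channel propose the same product, and the channel scores are combined as a weighted sum. The weight values, rule data, and function name are hypothetical.

```python
def rank_ar_candidates(rules_by_channel, weights, browsed):
    """Score candidate products from association rules in every channel.

    rules_by_channel: {channel: [(antecedent, consequent, confidence), ...]}
    weights:          {channel: weight}  (hypothetical hybrid weights)
    browsed:          set of products the target user already browsed
    """
    scores = {}
    for channel, rules in rules_by_channel.items():
        best = {}  # assumption: keep the best confidence per product per channel
        for antecedent, consequent, confidence in rules:
            if antecedent <= browsed:  # assumption: rule fires on a subset match
                for product in consequent - browsed:
                    best[product] = max(best.get(product, 0.0), confidence)
        for product, confidence in best.items():
            scores[product] = scores.get(product, 0.0) + weights[channel] * confidence
    # Highest weighted confidence first; ties broken by product id.
    return sorted(scores, key=lambda product: (-scores[product], product))


# Hypothetical toy rules and weights.
weights = {"M": 0.4, "T": 0.2, "C": 0.1, "W": 0.3}
rules_by_channel = {
    "M": [({"a"}, {"b"}, 0.8)],
    "T": [({"z"}, {"d"}, 0.9)],  # does not fire: "z" was not browsed
    "C": [],
    "W": [({"a"}, {"c"}, 0.9), ({"a"}, {"b"}, 0.5)],
}
ranked_ar = rank_ar_candidates(rules_by_channel, weights, browsed={"a"})
print(ranked_ar)
```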
Let $Y_H^{Mf}$, $H \in \{M, T, C, W\}$, denote the set of most frequent items derived from the user groups of target user $u$ in multiple channels. The frequency count of an item $v$ for a user group $U_g$ is equal to the number of users in $U_g$ who have browsed or purchased item $v$. Let $f_{v,M}^{Mf}$, $f_{v,T}^{Mf}$, $f_{v,C}^{Mf}$, and $f_{v,W}^{Mf}$ represent the frequency counts of an item $v$ in $Y_H^{Mf}$, respectively. Let $Y_u^{Mf}$ be the set of candidate products generated from the union of $Y_H^{Mf} - X_u$. The products in $Y_u^{Mf}$ are ranked according to the weighted sum of their frequency counts, calculated as in Eq. (11).
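The ranking of most frequent items by a weighted sum of per-channel frequency counts can be sketched as below. This is only an illustration under stated assumptions: the weight values, the toy counts, and the function name are hypothetical, and the exact form of Eq. (11) is not reproduced here.

```python
def rank_most_frequent(freq_by_channel, weights, browsed):
    """Rank unbrowsed items by the weighted sum of their frequency counts.

    freq_by_channel: {channel: {item: number of group users who browsed/purchased it}}
    weights:         {channel: weight}  (hypothetical hybrid weights)
    """
    scores = {}
    for channel, counts in freq_by_channel.items():
        for item, count in counts.items():
            if item not in browsed:  # exclude items the target user has already browsed
                scores[item] = scores.get(item, 0.0) + weights[channel] * count
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked, scores


# Hypothetical toy counts and weights.
weights = {"M": 0.5, "T": 0.25, "C": 0.0, "W": 0.25}
freq_by_channel = {
    "M": {"a": 10, "b": 4},
    "T": {"b": 8, "c": 4},
    "C": {},
    "W": {"c": 8},
}
ranked_mf, mf_scores = rank_most_frequent(freq_by_channel, weights, browsed={"a"})
print(ranked_mf, mf_scores)
```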
Let $X_H^{CR_j} \rightarrow Y_H^{CR_j}$, $H \in \{M, T, C, W\}$, be the category-level association rules extracted from the category-level transactions of a group of channel users, comprising mobile, television, catalog, and Web channel users; and let their associated confidence scores be $cf_M^{CR_j}$, $cf_T^{CR_j}$, $cf_C^{CR_j}$, and $cf_W^{CR_j}$ respectively. In addition, let $X_u^C$ represent the set of product categories that the target user $u$ previously browsed in the mobile channel; and let $Y_u^C$ be the set of candidate product categories generated from the union of $Y_H^{CR_j}$ according to all the category-level association rules [...] the weighted sum of their confidence scores (Eq. 12).
Let $Y_u^{CMf}$ denote the set of most frequent candidate items derived from the candidate product categories $Y_u^C$ and the most frequent candidate items $Y_u^{Mf}$. Recall that $Y_u^{Mf}$ is derived from the user groups of target user $u$ in multiple channels. $Y_u^{CMf}$ is the set of items in $Y_u^{Mf}$ that also belong to the candidate categories in $Y_u^C$. Each item $v$ in $Y_u^{CMf}$ is associated with a pair $(cf^{C_k}, f_v^{Mf})$, where $cf^{C_k}$ is the confidence score of $v$'s category $C_k$ derived using Eq. (12), and $f_v^{Mf}$ is the frequency count of item $v$ calculated using Eq. (11). The product items in $Y_u^{CMf}$ are ranked as follows. The items with the highest frequency counts in each category of $Y_u^C$ are selected first and ranked according to their associated confidence scores. Then, the items with the highest frequency counts among the remaining items in each category are selected and ranked according to their associated confidence scores. The process repeats until all items in $Y_u^{CMf}$ have been selected and ranked, so that the most frequent items are recommended from diverse candidate categories.
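The round-by-round ranking described above can be sketched as follows. The data layout, the toy values, and the tie-breaking by item id are assumptions made for illustration; the category confidence and item frequency are taken as already computed.

```python
def rank_category_frequent(items):
    """Rank items round by round across candidate categories.

    items: {item: (category, category_confidence, frequency_count)}
    In each round, the most frequent remaining item of every category is
    picked, and the picks are ordered by category confidence (descending).
    """
    by_category = {}
    for item, (category, confidence, frequency) in items.items():
        by_category.setdefault(category, []).append((frequency, item, confidence))
    for bucket in by_category.values():
        # Most frequent first; ties broken by item id (an assumption).
        bucket.sort(key=lambda entry: (-entry[0], entry[1]))

    ranked = []
    round_index = 0
    while True:
        picks = [
            bucket[round_index]
            for bucket in by_category.values()
            if round_index < len(bucket)
        ]
        if not picks:
            break
        picks.sort(key=lambda entry: (-entry[2], entry[1]))
        ranked.extend(item for _, item, _ in picks)
        round_index += 1
    return ranked


# Hypothetical toy items: (category, category confidence, frequency count).
items = {
    "w1": ("watches", 0.9, 5),
    "w2": ("watches", 0.9, 9),
    "s1": ("skincare", 0.6, 20),
    "s2": ("skincare", 0.6, 1),
}
ranked_cmf = rank_category_frequent(items)
print(ranked_cmf)
```

In this toy case the first round yields the top item of each category, ordered by category confidence, before any second item from the same category appears, which is what spreads the recommendations over diverse categories.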
We compare the number of candidate products $|Y_u^{AR}|$ with the number of top-N recommendations. Note that $Y_u^{AR}$ is the set of candidate products generated from the product-level association rules. If the number of candidate products is at least the number of top-N recommendations ($|Y_u^{AR}| \geq N$), the system will recommend the top-N products from $Y_u^{AR}$. If the number of candidate products $|Y_u^{AR}|$ is less than the number of top-N recommendations ($|Y_u^{AR}| < N$), but $|Y_u^{AR} \cup Y_u^{CMf}|$ is at least the number of top-N recommendations ($|Y_u^{AR} \cup Y_u^{CMf}| \geq N$), the system will recommend the $|Y_u^{AR}|$ products from $Y_u^{AR}$. The remaining $N - |Y_u^{AR}|$ products for recommendation are selected from $Y_u^{CMf}$. Note that $Y_u^{CMf}$ is the set of most frequent product items belonging to the associated product categories in $Y_u^C$.
If $|Y_u^{AR} \cup Y_u^{CMf}|$ is less than the number of top-N recommendations ($|Y_u^{AR} \cup Y_u^{CMf}| < N$), the remaining $N - |Y_u^{AR} \cup Y_u^{CMf}|$ products for recommendation are selected from $Y_u^{Mf} - (Y_u^{AR} \cup Y_u^{CMf})$, which is the set of most frequent items that the target user $u$ has not browsed in the mobile channel and that are not in $Y_u^{AR} \cup Y_u^{CMf}$. These products are ranked according to the weighted sum of their frequency counts.
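The three cases above amount to filling the top-N list from three ranked sources in order of priority. A minimal sketch, assuming each input list is already ranked and already excludes the products the user has browsed (the function name and toy lists are hypothetical):

```python
def assemble_top_n(ranked_ar, ranked_cmf, ranked_mf, n):
    """Fill up to n recommendations from three ranked candidate lists.

    Priority: association-rule candidates, then most frequent items in the
    candidate categories, then the remaining most frequent items. Items that
    were already taken from a higher-priority list are skipped.
    """
    recommendations = []
    for source in (ranked_ar, ranked_cmf, ranked_mf):
        for item in source:
            if len(recommendations) == n:
                return recommendations
            if item not in recommendations:
                recommendations.append(item)
    return recommendations


# Case 1: enough association-rule candidates.
case_1 = assemble_top_n(["a", "b", "c"], ["d"], ["e"], n=2)
# Case 2: association-rule candidates plus category-frequent items suffice.
case_2 = assemble_top_n(["a"], ["a", "d", "e"], ["f"], n=3)
# Case 3: the remaining slots come from the other most frequent items.
case_3 = assemble_top_n(["a"], ["d"], ["d", "e", "f"], n=4)
print(case_1, case_2, case_3)
```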
Figure 12 The recommendation engine
4.2 Experimental Setup and Datasets
The multichannel company studied is a home shopping company in Taiwan that operates television, catalog, and Web channels. Because of the rapid development of 3G mobile networks, the company planned to develop a new mobile channel. The television channel is a sales channel of the home shopping company: products are introduced on television, and people can purchase them through a toll-free telephone number.
The mobile channel is an online experimental mobile shopping website that was set up to study the consumption behavior of the new mobile channel's users. Users could access the mobile website from their own mobile phones via 2G, 3G, 3.5G, and Wi-Fi networks. Data for the mobile channel and the existing channels were collected from the mobile website and the retailer's CRM system from October 2006 to January 2007. The data cover 1,692 users, who own 184 different mobile phone models, and 1,416 products in 194 product categories. The most frequently browsed product categories are mobile phones, lingerie, digital cameras, skincare, MP3 players, watches, living products, cosmetics, cordless phones, and travel coupons. The products offered by the mobile channel were also available in the other three channels.
The dataset was divided as follows: 80% was used for training and 20% for testing. The training set also served as the dataset for the preliminary analytical experiment. Specifically, 55% of the dataset was used to derive recommendation rules, and 25% was used as a preliminary analytical dataset to determine the hybrid weights assigned to the mobile, television, catalog, and Web channels based on the quality of the recommendations; together these make up the 80% training set. There were 1,353 users in the training dataset and 339 users in the test dataset.
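The reported user counts are consistent with splitting the 1,692 users by these percentages and rounding down. The short check below assumes a user-level split with floor rounding, which the text does not state explicitly; the sizes of the rule-derivation and preliminary-analysis subsets are therefore derived values, not reported figures.

```python
TOTAL_USERS = 1692  # reported in the text

# Integer arithmetic with floor rounding (an assumption about how the split was made).
train_users = TOTAL_USERS * 80 // 100       # 80% training
test_users = TOTAL_USERS - train_users      # remaining 20% testing
rule_users = TOTAL_USERS * 55 // 100        # 55% for deriving rules (derived value)
analysis_users = train_users - rule_users   # remaining 25% for weight tuning (derived value)

print(train_users, test_users, rule_users, analysis_users)
```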
Consumption behavior differs across e-commerce applications, so their datasets differ as well. The minimum support and confidence thresholds of the association rules must therefore be set to suit each dataset so that interesting patterns are retrieved. Based on the characteristics of our dataset, the minimum support and confidence were set at 0.004 and 0.4 respectively, both higher than the values used by Cooley et al. [12] but lower than those used by Cho et al. [9].
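To make the two thresholds concrete, the sketch below computes the support and confidence of a single rule using the standard definitions and checks it against the minimum values stated above. The toy transactions and helper names are hypothetical; only the thresholds 0.004 and 0.4 come from the text.

```python
MIN_SUPPORT = 0.004    # threshold stated in the text
MIN_CONFIDENCE = 0.4   # threshold stated in the text


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    return support(antecedent | consequent, transactions) / support(
        antecedent, transactions
    )


# Hypothetical toy transactions.
transactions = [{"a", "b"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a"}]

rule_support = support({"a", "b"}, transactions)            # 2 of 5 transactions
rule_confidence = confidence({"a"}, {"b"}, transactions)    # 2 of the 4 containing "a"
keep_rule = rule_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE
print(rule_support, rule_confidence, keep_rule)
```

With a support threshold as low as 0.004, a rule in a dataset of this study's size needs to hold for only a handful of transactions, which is why the confidence threshold carries most of the filtering.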
4.3 Experimental Results