The example in Section 5.2 is very simple and serves only to illustrate how the algorithm works. A larger data set is needed to illustrate the implications in practice. We therefore conducted a data-collection experiment in which 20 subjects browsed a Web shopping site. Each subject was asked to shop the site hypothetically twice, the first time with a very limited purchasing budget and the second time with an unlimited budget as the purchase-decision constraint. To meet the assumptions of WTM, the shopping Web site was modified so that each page contains only one merchandise item and no external links. Only scroll-bar browsing data were collected on each page; they were recorded automatically by the program described in [31].
5.3.1 IWTMp with Purchases
IWTMp used the data set generated by the subjects with an unlimited purchasing budget. Because the experimental data are large, only part of the 1-transaction pattern candidate set C1 and the final large 3-transaction pattern set T3 are shown here, in Table 10 and Table 11, respectively.
The Web mining results generated these clues: <ABHI:B{i1,b3}H{i7,b4}I{i8,b3}> with Sup=5, <AB:B{i1,b3}> with Sup=14, and <ABH:B{i1,b3}H{i7,b4}> with Sup=7. Accordingly, two rules were derived as follows:
Rule 1: <ABHI:B{i1,b3} → H{i7,b4}I{i8,b3}>.
Confidence = <ABHI:B{i1,b3}H{i7,b4}I{i8,b3}> / <AB:B{i1,b3}> = 5/14 = 0.36.
Rule 2: <ABHI:B{i1,b3}H{i7,b4} → I{i8,b3}>.
Confidence = <ABHI:B{i1,b3}H{i7,b4}I{i8,b3}> / <ABH:B{i1,b3}H{i7,b4}> = 5/7 = 0.71.
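The confidence arithmetic above can be reproduced with a minimal sketch. The function name `confidence` and the dictionary layout are illustrative assumptions, not part of the IWTMp algorithm; only the three Sup values come from the text.

```python
def confidence(sup_full_pattern: int, sup_antecedent: int) -> float:
    """Confidence of a rule = Sup(full pattern) / Sup(antecedent pattern)."""
    if sup_antecedent <= 0:
        raise ValueError("antecedent support must be positive")
    return sup_full_pattern / sup_antecedent

# Sup values quoted in the text; the dictionary keys are just labels.
sup = {
    "<AB:B{i1,b3}>": 14,
    "<ABH:B{i1,b3}H{i7,b4}>": 7,
    "<ABHI:B{i1,b3}H{i7,b4}I{i8,b3}>": 5,
}

full = sup["<ABHI:B{i1,b3}H{i7,b4}I{i8,b3}>"]
rule1 = confidence(full, sup["<AB:B{i1,b3}>"])            # B -> H, I
rule2 = confidence(full, sup["<ABH:B{i1,b3}H{i7,b4}>"])   # B, H -> I

print(f"Rule 1 confidence: {rule1:.2f}")
print(f"Rule 2 confidence: {rule2:.2f}")
```

Rule 2 has the higher confidence because its antecedent (purchases on both B and H) is more specific and therefore has a smaller Sup value.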
Along the path ABHI, if a user has purchased at Web page B, the next likely Web pages with purchases are H and I (Rule 1); if a user has purchased at both Web pages B and H, the most likely Web page for a further purchase is I (Rule 2). The implied EC strategies are discussed as follows.
1. Enhancement within one rule
The occurrences of IBD, i.e., scroll-bar actions, can enhance the derived rules by providing extra information for judging whether an online user’s intention will lead to an actual purchase. For example, if a user has purchased on Web pages B and H, Rule 2 infers that the user may also purchase the merchandise on Web page I, which is what the WTM algorithm alone would have provided. With the extra clue of the scroll-bar IBD, however, there are two more situations to consider. First, if the user also had some occurrences of the scroll-bar IBD on Web page I, then the user was more likely interested in the merchandise on page I. Otherwise, zero occurrences of the scroll-bar IBD on page I indicate no immediate interest so far in the merchandise on page I, and more monitoring is desirable.
The other situation arises when two users, for example Tom and Jim, have both purchased on Web pages B and H. Assume that Tom had four occurrences of the scroll-bar IBD on Web page I, while Jim had only one; this difference may be an important indicator of the relative interest of the two users. From the perspective of online marketing, Tom has the higher likelihood of purchasing on Web page I at this time. If a marketing promotion is triggered to motivate potential buyers, Tom should be the one to receive the more favorable discount or free gifts. This implication illustrates how a Web site can better deploy its strategies and resources to its browsers.
2. Enhancement between rules
The occurrences of the scroll-bar IBD can also be used to judge the relative suitability of the two derived rules. If a user has purchased on Web page B, then the next likely Web page with a purchase can be judged by the Sup values. In the above case, Web page H has a higher Sup value than Web page I (7 versus 5) and should thus receive the stronger promotion. Accordingly, the overall promotion cost can be lowered for better potential results. For instance, assume Tom has purchased on Web pages B and H and had 5 occurrences of the scroll-bar IBD on Web page I, while Jim has purchased on Web page B and had 2 occurrences of the scroll-bar IBD on Web page H. The Web site can then allocate more promotional resources to Tom than to Jim by judging the occurrences of the scroll-bar IBD.
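The two Tom-and-Jim comparisons above amount to ordering users by their scroll-bar IBD occurrences on the page the rule points to. The following is a minimal sketch of that ordering; the `Candidate` type and `rank_for_promotion` function are hypothetical names introduced for illustration, and treating a larger count as a stronger interest is the working assumption of this section rather than an established fact.

```python
from typing import NamedTuple

class Candidate(NamedTuple):
    user: str
    target_page: str       # next page the derived rule points to
    ibd_occurrences: int   # scroll-bar actions observed on that page

def rank_for_promotion(candidates):
    """Order users so that higher IBD counts come first (ties keep input order)."""
    return sorted(candidates, key=lambda c: c.ibd_occurrences, reverse=True)

# Enhancement within one rule: both users bought on B and H, target page is I.
within_rule = [Candidate("Jim", "I", 1), Candidate("Tom", "I", 4)]

# Enhancement between rules: Tom follows Rule 2 (target I), Jim Rule 1 (target H).
between_rules = [Candidate("Jim", "H", 2), Candidate("Tom", "I", 5)]

ranked_within = [c.user for c in rank_for_promotion(within_rule)]
ranked_between = [c.user for c in rank_for_promotion(between_rules)]

print(ranked_within)
print(ranked_between)
```

In both cases Tom is ranked first, matching the allocation suggested in the text. A real deployment would likely combine this count with the rule's confidence, which this sketch deliberately leaves out.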
5.3.2 IWTMnp with No Purchases
IWTMnp used the data set generated by the subjects who hypothetically shopped the experimental Web site with very limited purchasing budgets, in order to obtain more no-purchase Web pages. Because Section 5.2 did not list the steps of the IWTMnp algorithm, the step-by-step process is also illustrated below.
Step 1: Calculate the Sup values of both the IBD and Web pages for those pages without any purchase. Partial results for the 1-transaction pattern candidate set C1 are shown in Table 12.
Step 2: Set the Sup hurdle value to 2, and retain only those large 1-transaction patterns with their support values greater than or equal to 2. The results are seen in Table 13.
Step 3: Continue from T1 by joining items to generate the 2-transaction pattern candidate set C2. Many candidate patterns do not actually exist, such as path BK, which has no connection in the Web structural chart; their Sup values are therefore 0. The paths with Sup = 0 are not listed in Table 14.
Step 4: Keep only those large 2-transaction patterns with their support values greater than or equal to 2. The results are shown in Table 15.
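Steps 1 to 4 follow a level-wise, Apriori-style pattern: count, filter by the Sup hurdle, join, and filter again. The sketch below is a simplified illustration of that pattern only, not the IWTMnp algorithm itself. The sessions and the link set are hypothetical toy data (they do not reproduce Tables 12 to 15), the IBD counts are omitted, and candidates without a structural link are dropped at join time rather than being listed with Sup = 0.

```python
from collections import Counter
from itertools import permutations

MIN_SUP = 2  # Sup hurdle value used in Steps 2 and 4

# Hypothetical toy sessions: ordered no-purchase pages visited by each user.
sessions = [
    ["B", "C"],
    ["B", "C", "Q"],
    ["B", "K"],
    ["C"],
]

# Hypothetical site structure: directed links between pages.
links = {("B", "C"), ("C", "Q")}

def large_1_patterns(sessions, min_sup):
    """Steps 1-2: count sessions containing each page, keep Sup >= min_sup."""
    counts = Counter(page for s in sessions for page in set(s))
    return {page: n for page, n in counts.items() if n >= min_sup}

def large_2_patterns(sessions, t1, links, min_sup):
    """Steps 3-4: join large pages into linked pairs, keep Sup >= min_sup."""
    candidates = [pair for pair in permutations(sorted(t1), 2) if pair in links]
    counts = Counter()
    for s in sessions:
        consecutive = set(zip(s, s[1:]))  # page pairs visited back to back
        for pair in candidates:
            if pair in consecutive:
                counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_sup}

t1 = large_1_patterns(sessions, MIN_SUP)
t2 = large_2_patterns(sessions, t1, links, MIN_SUP)

print("T1:", t1)
print("T2:", t2)
```

With this toy input, pages Q and K fall below the hurdle in Step 2, and only the sub-path BC survives Step 4, which mirrors the way a single pattern remains at the end of the procedure.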
The algorithm then stops since only one pattern remains in Table 15. The final browsing path is thus ABC, where BC is the sub-path with Sup = 2, and its total occurrences of browsing data are 3 and 5 for B and C, respectively. In other words, the Web mining results generate these clues: <ABC:B{i1,b3}C{i2,b5}> with Sup=2 and <AB:B{i1,b3}> with Sup=13. Accordingly, one rule is derived as <ABC:B{i1,b3} → C{i2,b5}> with Confidence = <ABC:B{i1,b3}C{i2,b5}> / <AB:B{i1,b3}> = 0.15. Similar to the implications for IWTMp, the implied EC strategies for IWTMnp are discussed as follows:
1. Screening out no-purchase and low-interest merchandise
Managing unpopular merchandise is an important issue in Web site management. Unpopular merchandise may have to be taken offline if it has no potential at all, or may need a larger budget for stronger promotion if it still has good potential. In our example, Table 12 reveals that Web page B has a high Sup of 13 but only 3 scroll-bar IBD occurrences. Compared with Web page C, which has a Sup of 8 and 5 occurrences of the scroll-bar IBD, Web page B may be considered for removal from the catalog for more effective use of the space and marketing budget.
2. Promoting no-purchase but high-interest merchandise
To allocate the budget for promoting merchandise sales effectively, the IBD occurrence serves as an indicator of the level of interest shown by users during their browsing. In Table 12, although Web pages C and Q have high Sup values, these values are still lower than that of Web page B. The occurrences of the scroll-bar IBD on C and Q are, however, higher than those on B. Therefore, instead of replacing Web pages C and Q, more effort should be spent on investigating why users did not purchase potentially interesting merchandise. Is it because of the price or the specification? With adequate research and investigation, appropriate marketing strategies may be applied to realize actual sales from Web pages C and Q.
3. Promoting positively correlated merchandise
From Table 14, we learn that when a user had no purchase on Web page B, his/her chance to purchase on Web page C was also low. However, considering the scroll-bar IBD of these two pages together, Web page B has 3 occurrences and C has 5, which indicates a higher level of interest in C. For instance, if Web page B lists a hard disk while Web page C lists a CPU, there are two possible situations. First, the user may not want the hard disk on Web page B but may be interested in the CPU on Web page C. A marketing strategy of cross-selling or package promotion can then be applied, such as offering 20% off the CPU price when a hard disk is bought, in order to increase the sales of both merchandise items. Second, the Web site management can use the hard disk on Web page B as a basis of comparison by strongly promoting the CPU on Web page C while holding the price of the merchandise on Web page B firm, so that the user perceives the CPU offer as a real bargain and makes the deal.
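The first two strategies above (screening out low-interest pages and promoting high-interest ones) can be sketched as a simple rule over the Sup and IBD values quoted for Web pages B and C. The cut-off `IBD_THRESHOLD` is a hypothetical value chosen only to separate these two examples; the text gives no such threshold, and Web page Q is omitted because its values are not stated here.

```python
IBD_THRESHOLD = 4  # hypothetical cut-off, NOT taken from the paper

# No-purchase statistics quoted in the text for Web pages B and C.
pages = {
    "B": {"sup": 13, "ibd": 3},
    "C": {"sup": 8, "ibd": 5},
}

def screen(page_stats, ibd_threshold):
    """Label each no-purchase page by its scroll-bar IBD occurrences."""
    decisions = {}
    for page, stats in page_stats.items():
        if stats["ibd"] >= ibd_threshold:
            # Interest without purchase: look into price or specification.
            decisions[page] = "investigate and promote"
        else:
            # Many no-purchase visits and little interest shown.
            decisions[page] = "review for removal"
    return decisions

decisions = screen(pages, IBD_THRESHOLD)
for page, action in decisions.items():
    print(page, pages[page], "->", action)
```

A single fixed threshold is the simplest possible design; in practice the cut-off would need to account for page length, since long pages attract more scrolling regardless of interest.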
5.3.3 Discussions
The illustration in Section 5.3.2 has its pros and cons. IBD is meant to bring “potential” effectiveness into WUM; the occurrence of the scroll-bar IBD therefore does not guarantee users’ actual levels of interest. For example, there may be no scroll bars on Web pages whose contents are shorter than one page, or Web pages may be very long so that more scrolling actions are always performed. The scroll-bar IBD in this example is nevertheless an additional indicator for strategy deployment from a conservative perspective, which is better than no indicator at all for decision support. Moreover, the benefits of the implications depend heavily on the application domain, the data set, the selected IBD, and the persons who generate or implement the strategies.
Theoretically speaking, if the two potential benefits defined in Section 4.4 can be carefully manipulated, positive benefits would eventually justify the value of IBD in practice.
WTM is only an exemplar algorithm adopted for the purpose of illustration in this paper. From the way IBD was incorporated into the existing WTM algorithm, we learn that the IBD treatment can be thought of as an add-on process applied to the Web log data items of an original algorithm. Accordingly, any existing WUM algorithm with a similar data structure should be able to accommodate IBD for potential effectiveness.
Although the above discussion only focuses on the potential benefits of IBD, it is clearly an enabling component in practical WUM applications and an innovative one in further WUM research and development.