This paper presented foundational research results on IBD, which help broaden the scope of WUM applications and decision support for knowledge discovery by moving beyond the existing algorithmic framework and conventional Web log records. It also described a mechanism for collecting online browsing data and justified two benefits of IBD. Moreover, an EC example of WTM was used to illustrate the potential effectiveness of IBD via two simple modifications of intention-based WTM (IWTM) algorithms.
The major implication of IBD for decision support is that, by capturing more subtle and personal online browsing behavior in addition to the fixed Web log records, businesses can identify and deploy strategies with a greater personal touch than was previously possible, and so gain online competitive advantage. Moreover, IBD can serve as a value-added component of regular WUM algorithms at a lower cost. In other words, IBD can be used just like Web log records in WUM algorithms, with potential extra value for decision making.
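As a rough illustration of the last point, the sketch below treats IBD as optional extra fields on an ordinary log record, so that a routine reading only the log fields works unchanged while an IBD-aware routine can use the extra data. The record layout, the field names, and the helpers `page_durations` and `pages_with_action` are assumptions made for illustration, loosely following the log-file items and intentional actions listed in Table 1; they are not the data format used in this paper.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class BrowsingRecord:
    # Conventional Web log fields (hypothetical layout).
    user_ip: str
    page: str
    enter_time: float  # seconds
    leave_time: float  # seconds
    # Optional IBD: counts of intentional actions on the page, e.g. {"copy": 2}.
    ibd: dict = field(default_factory=dict)


def page_durations(records):
    """Total viewing time per page, using log fields only (IBD is ignored)."""
    totals = defaultdict(float)
    for r in records:
        totals[r.page] += r.leave_time - r.enter_time
    return dict(totals)


def pages_with_action(records, action):
    """Pages on which the given intentional action occurred at least once."""
    return {r.page for r in records if r.ibd.get(action, 0) > 0}


records = [
    BrowsingRecord("10.0.0.1", "A", 0.0, 30.0),
    BrowsingRecord("10.0.0.1", "B", 30.0, 45.0, ibd={"copy": 2, "scroll": 5}),
]
durations = page_durations(records)
copied_pages = pages_with_action(records, "copy")
```

Records without IBD remain valid input to both helpers, which is one way to read the claim that IBD adds value without changing how log records are consumed.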
An immediate item of future work is to enhance the online data collection module [29] so that IBD can be easily adopted in WUM and its potential benefits realized. Because IBD opens up a wide range of possible future WUM applications, another item of future work is to survey the importance and priority of research and development issues in WUM applications with IBD in context. With this, practical innovations in WUM algorithms can be pursued. Examples would be unifying IWTMp and IWTMnp into a single IWTM algorithm for easier use, or relaxing the constraint of one merchandise item or one IBD in IWTM algorithms.
References
[1] Agrawal, R. & Mehta, M. (1996). SPRINT: A scalable parallel classifier for data mining, The International Conference on Very Large Database, 544-555, Bombay, India.
[2] Agrawal, R. & Srikant, R. (1994). Fast algorithms for mining association rules, The International Conference on Very Large Database, 487-499, Santiago, Chile.
[3] Agrawal, R. & Srikant, R. (1995). Mining sequential patterns, in Proceedings of the Eleventh International Conference on Data Engineering, 3-14, Taipei, Taiwan.
[4] Bonchi, F., Giannotti, F., Gozzi, C., Manco, G., Nanni, M., Pedreschi, D., Renso, C. & Ruggieri, S. (2001). Web log data warehousing and mining for intelligent web caching, Data and Knowledge Engineering, 39(2), 165-189.
[5] Catledge, L. D. & Pitkow, J. E. (1995). Characterizing browsing strategies in the World-Wide Web, Computer Networks and ISDN Systems, 27(6), 1065-1073.
[6] Chan, C. (1997). The Access Log to Web Site and Query Language on WWW, Unpublished Master Thesis, Information Engineering Graduate School, National Central University.
[7] Chen, M. S., Park, J. S., & Yu, P. S. (1998). Efficient data mining for path traversal patterns, IEEE Transactions on Knowledge and Data Engineering, 10(2), 209-221.
[8] Chen, Z. & Shen, H. (2000). A study of a new method of browsing path data mining, The Sixth International Conference of Information Management Research and Practice, TsingHua University, HsingChu, Taiwan, R.O.C.
[9] Cooley, R., Mobasher, B., & Srivastava, J. (1999). Data preparation for mining World Wide Web browsing patterns, Journal of Knowledge and Information Systems, 1(1), 5-32.
[10] Cunha, C. R. & Jaccoud, C. F. B. (1997). Determining WWW user’s next access and its application to pre-fetching, The Second IEEE Symposium on Computers and Communications, 6-11, Alexandria, Egypt.
[11] Fann, C. (1999). Personalized Interactive Marketing Mechanisms On WWW, Unpublished Master Thesis, Information Management Graduate School, National Pingtung University of Science and Technology.
[12] Han, J. & Kamber, M. (2001). Data Mining: Concepts and Techniques, Academic Press.
[13] Holsapple, C. W. & Whinston, A. B. (1996). Decision Support Systems: A Knowledge-Based Approach, West Publishing Company.
[14] Hong, T. P., Lin, K. Y. & Wang, S. L. (2002). Mining linguistic browsing patterns in the World Wide Web, Soft Computing, 6(5), 329-336.
[15] Hsieh, C. C. & Chang, C. T. (2001). An enhanced transaction identification module on Web usage mining, Asia Pacific Management, 241-252.
[16] Iyengar, A., MacNair, E. & Nguyen, T. (1997). An analysis of Web server performance, The IEEE Global Telecommunications Conference, 3, 1943-1947, Phoenix, AZ, U.S.A.
[17] Joshi, K. P., Joshi, A., Yesha, Y. & Krishnapuram, R. (1999). Warehousing and mining Web logs, WIDM, Kansas City, MO, U.S.A., ACM, 63-68.
[18] Kitsuregawa, M., Shintani, T. & Pramudiono, I. (2001). Web mining and its SQL based parallel execution, Proceedings on Information Technology for Virtual Enterprises, 5(5), 128-134.
[19] Lin, Y. (1997). A Design and Implementation of a Data Collection Mechanism in Mining WWW Information, Unpublished Master Thesis, National Taiwan University.
[20] Maglio, P. P., Campbell, C. S., Barrett, R. & Selker, T. (2001). An architecture for developing attentive information systems, Knowledge-Based Systems, 14(1-2), 103-110.
[21] Mehta, M., Agrawal, R. & Rissanen, J. (1996). SLIQ: A fast scalable classifier for data mining, In Proc. of the Fifth Int'l Conference on Extending Database Technology, 8-32, France.
[22] Mannila, H. & Ronkainen, P. (1997). Similarity of event sequences, The Fourth International Workshop on Temporal Representation and Reasoning, 136-139.
[23] Mittal, B. & Lassar, W. M. (1987). The role of personalization in service encounters, Journal of Retailing, 72(1), 95-109.
[24] Modahl, M. (1999). Now or Never: How Companies Must Change Today to Win the Battle for Internet Consumers, Harper Business.
[25] Newman, W. M. & Lamming, M. G. (1995). Interactive System Design, Addison-Wesley.
[26] Park, J. S., Chen, M. S. & Yu, P. S. (1997). Using a hash-based method with transaction trimming for mining association rules, The IEEE International Conference on Knowledge and Data Mining, 9(5), 813-825.
[27] Perkowitz, M. & Etzioni, O. (2000). Towards adaptive Web sites: conceptual framework and case study, Artificial Intelligence, 118(1-2), 245-275.
[28] Reeder, R. W., Pirolli, P. & Card, S. K. (2000). WebLogger: a data collection tool for Web-use studies, UIR Technical report UIR-R-2000-06, Xerox PARC.
[29] Tao, Y.-H., Chung, S., Chung, M., Kao, H., Yang, K. & Lin, I. (2004). The data-collection mechanism of Web browsing behavior, Association of Electronic Commerce in Taiwan, March 26-27.
[30] Yu, P. (1999). Data mining and personalization technologies, The Sixth IEEE International Conference on Database Systems for Advanced Applications, 6-13.
[31] Yun, C. & Chen, M. (2000). Mining Web transaction patterns in an electronic commerce environment, The Fourth Pacific-Asia Conference on Knowledge Discovery and Data Mining, 216-219.
[32] Zhang, D. & Dong, Y. (2002). A novel Web usage mining approach for search engines, Computer Networks, 39(3), 303-310.
Figure 1. Taxonomy of Browsing Data (figure not reproduced; recoverable labels: Log Files, Non-Log Files)
Table 1. Examples of Browsing Data in Taxonomy
Classification: Examples of Browsing Data
Precise Browsing Data
  Non-Log File Items
    User Data: Member ID, Name, Sex, Phone Number, Birth Date, Address, Social Security Number
    Transaction Data: Transaction ID, Transaction Time, Merchandise, Quantity, Cost, Payment Type
  Log File Items: User IP, User's Browser Info., Web Page Name, Duration, Entering Time, Leaving Time
Intentional Browsing Data
  Explicit Intention: Message Board, Historical Records, Email, Homepage, FTP, Add to Favorite, Hyperlink, Refresh, Editing
  Implicit Intention: Chat, Copy, Select, Scroll Bar, Search, News Group, Key-in Subjective Content, Back, Forward
Other Browsing Data
  Object: Open File, Save As, Online Help, View Original Code
  Unknown: Internet Option
Aggregated Browsing Data
  Transaction: Transaction Volume, Number of Transactions, Transaction Value
  Action: Key word, Occurrences of Copy, Content of Copy
  Aggregated Sample #1: Occurrences of Web Page, Browsing Time
  Aggregated Sample #2: Browsing Time, Scroll Bar, Copy
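The classification in Table 1 can be read as a two-level lookup from class and sub-class to example data items. The sketch below is one possible encoding, given only for illustration: the names `TAXONOMY` and `classify`, the case-insensitive matching, and the choice to list only a few example items per row are assumptions, not part of the paper.

```python
# Hypothetical encoding of Table 1: class -> sub-class -> example items.
# Only a few example items per row are listed to keep the sketch short.
TAXONOMY = {
    "Precise Browsing Data": {
        "User Data": ["Member ID", "Name", "Birth Date"],
        "Transaction Data": ["Transaction ID", "Merchandise", "Quantity"],
        "Log File Items": ["User IP", "Web Page Name", "Duration"],
    },
    "Intentional Browsing Data": {
        "Explicit Intention": ["Message Board", "Add to Favorite", "Refresh"],
        "Implicit Intention": ["Chat", "Copy", "Scroll Bar", "Search"],
    },
    "Other Browsing Data": {
        "Object": ["Open File", "Save As", "Online Help"],
        "Unknown": ["Internet Option"],
    },
    "Aggregated Browsing Data": {
        "Transaction": ["Transaction Volume", "Transaction Value"],
        "Action": ["Key word", "Occurrences of Copy"],
        "Aggregated Sample #1": ["Occurrences of Web Page", "Browsing Time"],
        "Aggregated Sample #2": ["Browsing Time", "Scroll Bar", "Copy"],
    },
}


def classify(item):
    """Return every (class, sub-class) pair whose examples include the item."""
    key = item.strip().lower()
    return [
        (cls, sub)
        for cls, subs in TAXONOMY.items()
        for sub, examples in subs.items()
        if key in {e.lower() for e in examples}
    ]


copy_classes = classify("copy")
```

Because an item such as Copy appears both as an implicit intention and inside an aggregated sample, the lookup returns a list of matches rather than a single class.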
Figure 2. Online data collection mechanism (figure not reproduced; recoverable label: Setup A: Web Log Files Setup and Execution)
Figure 3. An illustration of Web page structure and corresponding transaction data