
CHAPTER 4 DATA PROCESSING AND ANALYSIS

4.3. Hot-paths

4.3.2. Hot-paths Mining

Although each visitor’s tour path can be extracted as shown in Section 4.3.1, it is still difficult to recognize patterns through visual inspection. Agrawal and Srikant (1995) proposed sequential pattern mining to extract patterns in such situations, originally for consumer purchasing behavior. Han et al. (2000) proposed the “FreeSpan” algorithm to improve the efficiency of sequential pattern mining; Pei et al. (2001) extended this work and improved efficiency further with the “PrefixSpan” algorithm. Lee and Wang (2003) applied sequential pattern mining to find frequent calling paths in GSM networks.

Mining hot-paths in a museum poses problems similar to those of mining frequent calling paths in GSM networks; therefore this research adopted, with modification, Lee and Wang’s algorithm for hot-paths mining. In a GSM network, because cells are hexagonal, each cell is surrounded by at most six neighboring cells; consequently, a user moving into a neighboring cell has at most six possible path selections. In the museum case, path selection is not structured at all; a visitor can move to any zone, including circling back to the same zone by passing through the gray zones. Before the hot-paths mining algorithm is presented, the terms used later are defined below.

Definition 1. A visit path P, denoted by v1=>v2=>…=>vn, n≥2, is the sequence of zones visited during a specific visitor’s tour, where v1, v2,…, vn are zone codes.

Definition 2. The support of a visit path P is the ratio of visit paths containing P to all of the visit paths in the database D. The support of P is defined as sup(P) = |P| ⁄ |D|, where |P| denotes the number of visit paths containing P in D, and |D| denotes the number of all visit paths in D.
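The support in Definition 2 can be illustrated with a short Python sketch. It is not the program in the Appendix (which is written in C#). The function names `contains` and `support` are hypothetical, the sample database is invented for illustration, and the sketch assumes that a visit path "contains" P when P appears in it as a run of consecutive zones.

```python
def contains(path, pattern):
    """Return True if `pattern` occurs in `path` as consecutive zones.

    Assumption: "containing P" means contiguous containment.
    """
    n = len(pattern)
    return any(path[i:i + n] == pattern for i in range(len(path) - n + 1))


def support(pattern, database):
    """sup(P) = |P| / |D|: share of visit paths in `database` containing `pattern`."""
    if not database:
        return 0.0
    hits = sum(1 for path in database if contains(path, pattern))
    return hits / len(database)


# Invented sample database of four visit paths (zone codes as strings).
database = [
    ["B4", "B5", "B6", "SA"],
    ["F1", "F2", "F3"],
    ["B4", "B5", "SA"],
    ["B5", "B6"],
]
print(support(["B5", "B6"], database))  # 0.5
```

With a minimum support of 30%, the path B5=>B6 in this invented database would qualify as a hot-path under Definition 3.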

Definition 3. A visit path P is said to be a hot-path if sup(P) is not less than a user-specified threshold, which is termed the minimum support. If all the paths in a graph G are hot-paths, G is called a frequent visit graph.

Definition 4. A visit path graph G consists of vertexes (v1, v2,…, vn) and edges (e1, e2,…, en). An edge is a link between two vertexes. An in-edge of vertex v in a visit path graph is an edge ending at v. An out-edge of vertex v in a visit path graph is an edge starting at v. If a visit path graph G is not connected, it consists of components G1, G2,…, Gn, which are called sub-graphs of G.

This research proposes the following algorithm, which handles the special “move back to itself” pattern and so allows for visitors moving between zones more freely. The algorithm includes 6 steps:

Step 1: Divide all paths into edges to construct a visit path graph;

Step 2: Delete the edges that do not meet the support level needed;

Step 3: Find the special “move back to itself” paths and store them in a collection of frequent paths;

Step 4: Find the nodes which have no in-edge as the start-vertexes;

Step 5: Trace the graph from each start-vertex and store the mined paths in the frequent paths collection until all start-vertexes are traced; and

Step 6: Trace the rest of the cyclic sub-graphs from any untraced node and store the mined paths in the frequent paths collection until all nodes are traced.
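Steps 1 to 3 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the C# program in the Appendix. The function name `build_frequent_edges` and the sample paths are hypothetical, and the sketch assumes that an edge is counted once per visit path, in line with Definition 2.

```python
from collections import Counter


def build_frequent_edges(visit_paths, min_support):
    """Steps 1-3: split paths into edges, prune by support, set aside self-loops.

    Returns (self_loops, graph_edges), each a sorted list of (src, dst) pairs.
    """
    # Step 1: divide every path into edges and count them.
    edge_counts = Counter()
    for path in visit_paths:
        # set() so that an edge is counted at most once per visit path (assumption).
        edge_counts.update(set(zip(path, path[1:])))

    # Step 2: keep only the edges that meet the minimum support.
    total = len(visit_paths)
    frequent = [e for e, c in edge_counts.items() if c / total >= min_support]

    # Step 3: "move back to itself" edges go straight to the frequent paths.
    self_loops = sorted(e for e in frequent if e[0] == e[1])
    graph_edges = sorted(e for e in frequent if e[0] != e[1])
    return self_loops, graph_edges


# Invented sample data: four visit paths, minimum support 50%.
paths = [
    ["F1", "F1", "F2"],
    ["F1", "F2", "F3"],
    ["F1", "F1", "F2"],
    ["B4", "B5"],
]
loops, edges = build_frequent_edges(paths, 0.5)
print(loops)  # [('F1', 'F1')]
print(edges)  # [('F1', 'F2')]
```

The edges that remain after step 3 form the frequent visit graph that steps 4 to 6 trace.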


In step 1, all paths are divided into edges to construct a visit graph G, and the count of each edge is stored (see Figure 4-2). Step 2 removes the edges that do not meet the minimum support, so G becomes a frequent visit graph. Because the special “move back to itself” paths are not easy to process together with other paths, they are identified and removed in step 3. G may not be connected, and it may consist of two kinds of sub-graphs: cyclic sub-graphs and non-cyclic sub-graphs (see Figure 4-3). In Figure 4-3, the cyclic graph contains a cyclic path v2=>v4=>v5=>v3=>v2 and a non-cyclic path v1=>v2, while the non-cyclic graph contains two paths, v1=>v3=>v4 and v1=>v2=>v5. In step 4, the vertexes that have no in-edge are identified as start-vertexes, from which tracing of the frequent visit graph begins. All non-cyclic sub-graphs, and the cyclic sub-graphs connected to non-cyclic parts, are traced in step 5. For example, tracing the cyclic sub-graph in Figure 4-3 begins at v1 and yields the path v1=>v2=>v4=>v5=>v3=>v2. In step 6, the rest of the cyclic sub-graphs are traced.
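The tracing in steps 4 to 6 can be sketched in Python as below. This is an illustrative reading of the description above, not the C# `trace` routine in the Appendix. The function name `trace_hot_paths` is hypothetical, and the sketch assumes that a trace stops either at a vertex with no out-edge or as soon as it returns to a vertex already on the current path.

```python
from collections import defaultdict


def trace_hot_paths(edges):
    """Steps 4-6 on a pruned graph given as (src, dst) pairs with src != dst."""
    out_edges = defaultdict(list)
    has_in_edge = set()
    nodes = []  # first-seen order keeps the output deterministic
    for src, dst in edges:
        out_edges[src].append(dst)
        has_in_edge.add(dst)
        for node in (src, dst):
            if node not in nodes:
                nodes.append(node)

    traced = set()
    paths = []

    def trace(path):
        node = path[-1]
        traced.add(node)
        successors = out_edges.get(node, [])
        if not successors:
            paths.append(path)  # dead end: the path is complete
            return
        for nxt in successors:
            if nxt in path:
                paths.append(path + [nxt])  # cycle closed: stop here (assumption)
            else:
                trace(path + [nxt])

    # Steps 4 and 5: start from every vertex that has no in-edge.
    for node in nodes:
        if node not in has_in_edge:
            trace([node])
    # Step 6: trace the remaining cyclic sub-graphs from any untraced vertex.
    for node in nodes:
        if node not in traced:
            trace([node])
    return paths


# Cyclic sub-graph of Figure 4-3, as described in the text.
cyclic = [("v1", "v2"), ("v2", "v4"), ("v4", "v5"), ("v5", "v3"), ("v3", "v2")]
for p in trace_hot_paths(cyclic):
    print("=>".join(p))  # v1=>v2=>v4=>v5=>v3=>v2
```

For the non-cyclic sub-graph of Figure 4-3, the same function returns the two paths v1=>v3=>v4 and v1=>v2=>v5. A sub-graph that is a pure cycle has no start-vertex and is picked up only by the second loop.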

Figure 4-2: Visit Graph Construction


[Figure: two panels, “Non-cyclic Sub-graphs” and “Cyclic Sub-graphs”, each drawn over vertexes V1 to V5]

Figure 4-3: Cyclic and Non-cyclic Sub-graphs

Using the algorithm above, a computer program (see Appendix) was written to process the positioning data. The minimum support can be modified according to the researchers’ needs. Figure 4-4 shows the result of hot-paths mining with 30% minimum support. Five hot-paths are identified: F1=>F1, SA=>SA, F3=>F3, B4=>B5=>B6=>SA, and F1=>F2=>F3=>SC=>B1=>B2=>B3.

Figure 4-4: Result of Hot-paths Mining


Strictly speaking, the special “move back to itself” paths are not paths. After excluding these special paths, we obtain two hot-paths: B4=>B5=>B6=>SA and F1=>F2=>F3=>SC=>B1=>B2=>B3.

According to the museum tour guides, a comprehensive visit covering all exhibitions would take 8 hours, but most visitors have far less time to spend. For visitors’ convenience, managers may use these hot-paths to design recommended touring paths with different time requirements for visitors to choose from. In our example, there may be two recommended touring paths. One is B4=>B5=>B6=>SA, which includes part of the prehistory exhibitions and the scientific archaeology exhibition and takes about 1.5 hours to complete. The other is F1=>F2=>F3=>SC=>B1=>B2=>B3, which includes the natural history of Taiwan, part of the prehistory exhibitions, and the human evolution exhibits, and takes about 2 hours.


CHAPTER 5

CONCLUSION AND DISCUSSION

In this research, we have developed a new application for wireless locating technology that is different from the traditional “product positioning,” “product tracking,” and “product history.” In the past, wireless locating technologies such as RFID and Wi-Fi positioning were used in logistics management, warehouse management, and material/product management, to name just a few; but none of these applications focused on consumer behavior research. In our research we first treat visitors as “objects” so that we may take advantage of traditional applications such as VLRS to collect the whereabouts of visitors during their tours, and then analyze the collected data to assess their behaviors; in other words, we add the “humanity” back to these “objects”. This approach and concept can greatly expand the applications in psychology, sociology, and especially consumer marketing research.

The other objective of this thesis is to develop a system that makes previously infeasible research concepts work. This research designed and implemented the VLRS system to record the time (when) and location (where) of a specific visitor’s (who) tour. This system is more efficient and costs less than older approaches to collecting visitor behavior data. The methodology is also easier to apply in large-scale exhibitions.

Many things that researchers wanted to understand in the past may now be measured, identified, or found from the objective data provided by this system. The system can observe visitors continuously. This was hard to accomplish in the past, especially in large exhibitions and over long observation periods, because it was too complex and required too much manpower. Time, location, and the object (visitor) are the most important data that a wireless locating system can provide. In fact, these three kinds of data are the only things a wireless locating system can provide. Based on these data, hotspots are identified by calculating average viewing time as done in section 4.2, paths are extracted by the approach proposed in section 4.3.1, and hot-paths are identified by the modified algorithm introduced in section 4.3.2. All of this objective information on visitor behavior can be combined with the questionnaire results (subjective information) of the satisfaction survey team of the National Museum of Prehistory to add a dimension to the data analysis of the satisfaction study. For example, the average viewing time of each zone can be compared with the exhibitions that visitors reported as most impressive in the questionnaire. The objective data may explain the questionnaire results or conflict with them.

Because of software and hardware defects in the ERTLS, the results of the locating system carry a range of error. While setting up the system, we found that the old APs could not provide a stable and strong signal for the wireless locating engine. After the APs were replaced with newer models, the accuracy of the locating system improved. Hardware problems of this kind are solvable. Even the triangular locating algorithm could be redesigned and improved. In short, we believe that the accuracy of the locating system can be improved, but doing so is not the objective of this research, as mentioned in section 1.5, even though the accuracy of the locating system affects the correctness of the results and creates the gray zone problem. Once wireless locating technology improves, the results will also improve. In the meantime, only a few implementation details of the system (i.e., the data schema and the size of zones) need to be modified to ensure that the system works as well as it does in this research. Scalability was considered when the system was designed.

The gray zone, which is caused by the inaccuracy of the locating system, may affect the hotspots and hot-paths. However, we did not study in depth how the gray zone affects them. When the paths are extracted, the parameter α is used in section 4.3.1 to decide the lowest staying level. Is there an approach for setting the best value of α? How do the gray zones affect the choice of α? Further study of these topics is encouraged.

VLRS can be integrated with a touring-path recommendation system, an active guide system, and a questionnaire system. Objective data can be collected from VLRS and subjective data can be collected from the questionnaire system. Once both kinds of data are available, it is possible to find out who is interested in which exhibitions. These patterns could be used to construct the touring-path recommendation system and to provide the motion trends needed by the active guide system. This will be helpful for visitors and raise their satisfaction.


References

[1] Agrawal, R. and Srikant, R., (1995), “Mining Sequential Patterns,” Proceedings of the 11th International Conference on Data Engineering, Taipei, Taiwan, March 6-10, 1995, pp. 3-14.

[2] Anderson, W. and Sullivan, M., (1993), “The Antecedents and Consequences of Customer Satisfaction for Firms,” Marketing Science, Vol. 12, pp. 125-143.

[3] Barker, S. and Kizil, M., (2009), “System Management Approach to Improvement in Longwall Development,” The Proceedings of the 2009 Underground Coal Operators’ Conference, Gwynneville, NSW, Australia, February 13.

[4] Bergman, A.B., Dassel, S.W., and Wedgwood, R.J., (1966), “Time-Motion Study of Practicing Pediatricians,” Pediatrics, Vol. 38, pp. 254-263.

[5] Bitgood, S., Patterson, D., and Benefield, A., (1988), “Empirical Relationship between Exhibit Design and Visitor Behavior,” Environment and Behavior, Vol. 20, No. 4.

[6] Chang, Y.T., (1994), The Future of Museums in Globalization, Taipei: Daw Shiang, pp. 219.

張譽騰,1994,「全球村中博物館的未來」,台北:稻鄉,頁 219。

[7] Chen, H.C., (2001), “The Research of Visitor Behavior: ‘The World of Calcite Special Exhibition’,” Museology Quarterly, Vol. 15, No. 3.

陳慧娟,2001,「碳酸鈣礦物展觀眾行為研究」,博物館學季刊,15(3)。

[8] Chen, J.C.H., Chong, P.P., and Chen, Y.S., (2001), “Decision Criteria Consolidation: A Theoretical Foundation of Pareto Principle to Porter's Competitive Forces,” Journal of Organizational Computing and Electronic Commerce, Vol. 11, No. 1, pp. 1-14.

[9] Daviaud, E. and Chopra, M., (2008), “How Much is Not Enough? Human Resources Requirements for Primary Health Care: a Case Study from South Africa,” Bulletin of the World Health Organization, Vol. 86, No. 1, pp. 46-51.

[10] Ekahau RTLS Brochure, http://www.ekahau.com/file.php?id=99414, retrieve date: 2008/3/26

[11] Fact Sheet of Ubisense Series 7000 Sensor, http://www.ubisense.net, retrieve date: 2008/9/26

[12] Gillespie, J.M., Wyatt, W., Venuto, B., Blouin, D., and Boucher, R., (2008), “The Roles of Labor and Profitability in Choosing a Grazing Strategy for Beef Production in the U.S. Gulf Coast Region,” Journal of Agricultural and Applied Economics, Vol. 40, No. 1, pp. 301-313.

[13] Gorry, G.M., and Scott-Morton, M.S., (1971), “A Framework of Management Information Systems,” Sloan Management Review, Vol. 13, No. 1, pp. 55-70.


[14] Han, J., Pei, J., Mortazavi-Asl, B., Chen, Q., Dayal, U., and Hsu, M. C., (2000), “FreeSpan: Frequent Pattern-Projected Sequential Pattern Mining,” Proceedings of the Sixth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Boston, MA, August 20-23, pp. 355-359.

[15] Hsi, S. and Fait, H., (2005), “RFID Enhances Visitors’ Museum Experience at the Exploratorium,” Communications of the ACM, Vol. 48, No. 9.

[16] Hsieh, M. C., Chong, P.P., Wang, W. C., and Shin, S.S., (2008), Taiwan National Museum of Prehistory Visitor Satisfaction Survey 2008, National Museum of Prehistory, Taitung, Taiwan.

謝明哲、鍾青萍、王文清、辛信興,2008,「國立臺灣史前文化博物館九十七年度遊客滿意度調查研究報告」,國立臺灣史前文化博物館。

[17] Hsu, T.P., (2006), “A Study of Applying 7-11 Location to Model the Section of Hot Point Mechanism Commercial,” master’s thesis, Graduate Institute of Information Management, Ming Chuan University, Taiwan.

徐子鵬,2006,「應用 7-11 超商成功模式於選點分析機制之研究」,銘傳大學資訊管理學系碩士在職專班碩士論文。

[18] IEEE Standard 802.11, http://standards.ieee.org/getieee802/download/802.11-2007.pdf, retrieve date: 2009/4/20.

[19] Keen, P.G.W. and Scott-Morton, M.S., (1978), Decision Support System: An Organizational Perspective, MA: Addison-Wesley.

[20] Lee, A.J.T. and Wang, Y.T., (2003), “Efficient Data Mining for Calling Path Patterns in GSM Networks,” Information Systems, Vol. 28, No. 8, pp. 929-948.

[21] Li, M.D., (2002), “Discuss the Police’s Reaction from Criminal Hot Spots: a Case of Taipei County,” master’s thesis, Graduate Institute of Police Administration, Central Police University, Taiwan.

李明道,2002,「以犯罪熱點論警察因應作為--以台北縣為例」,中央警察大學行政警察研究所碩士論文。

[22] Lin, Y.C., (2002), “Visitor Behavior Study at Children Environmental Education Exhibition in Taroko National Park,” master’s thesis, Graduate Institute of Tourism & Recreation Management, National Dong Hwa University, Taiwan.

林宜君,2002,「太魯閣國家公園兒童環境教育館觀眾行為研究」,國立東華大學觀光暨遊憩管理研究所碩士論文。

[23] McDonald, J.S. and Dzwonczyk, R.R., (1988), “A Time and Motion Study of the Anaesthetist’s Intraoperative Time,” British Journal of Anaesthesia, Vol. 61, pp. 738-742.

[24] Pei, J., Han, J., Mortazavi-Asl, B., and Pinto, H., (2001), “PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-projected Pattern Growth,” Proceedings of the 17th International Conference on Data Engineering, Heidelberg, Germany, April 2-6, pp. 106-115.


[25] Robinson, E.S., (1928), The Behavior of the Museum Visitor, Washington, DC: American Association of Museums.

[26] Shi, H.Y., (1989), “A Study of Global Positioning System,” master’s thesis, Institute of Maritime Technology, National Taiwan Ocean University, Taiwan.

石宏揚,1989,「GPS 衛星定位設計之研究」,國立海洋大學航運技術研究所碩士論文。

[27] Taylor, F.W., (1911), The Principles of Scientific Management, New York: Norton.

[28] Turban, E., Aronson, J.E., Liang, T.P., Sharda, R., (2007), Decision Support and Business Intelligence Systems, 8th Edition, Prentice Hall.


Appendix

The code below is written in C# to implement the mining algorithm.

public partial class Form1 : Form

private LinkedList<string> UnvisitedNodes;

AppSettingsReader ASReader = new AppSettingsReader();

public Form1()

private void button4_Click(object sender, EventArgs e) {


private FrequentGraph<String> addLinklist(FrequentGraph<String> FG, LinkedList<HotSpot> llhs)


private void trace(LinkedList<string> currentPath, FrequentGraph<String> FG)


Unvisited.Remove(str);

UnvisitedNodes.Remove(str);

VisitedNodes.AddLast(str);

tmp.AddLast(str);

trace(tmp, FG);

} } } }
