The following steps are taken to prepare the data for analysis.


1. Define Taiwan’s cluster codes (see Appendix - The US Clusters Code) – The Taiwan Cluster Code basically follows the US Cluster Code definitions. To confirm that every cluster code matches the US clusters closely, it is necessary to step through each code and evaluate the match. This was done in collaboration with the US Cluster Code manager, Richard Bryden, in February 2015.

Each code was carefully evaluated and its best match confirmed. Over ninety percent of the codes match exactly, including their names. Some categories do not apply, and six clusters are not used. Agriculture, mining, forestry, government, and primary education data are not included in the Industry and Commerce data; therefore, all six clusters are recorded as 0. These empty clusters are: Agriculture (code: 2), Coal Mining (code: 7), Electric Power Generation (code: 14), Forestry (code: 20), Local Education (code: 113), and Tobacco (code: 43). Taiwan stopped tobacco production during this period. In total, 61 clusters are actively evaluated: 46 traded clusters and 15 local clusters.

2. Processing the Taiwan Census data sets took time and effort. For privacy reasons, parts of the data are not displayed but are replaced with the mark (D) by the National Statistics Bureau. Any industry with fewer than 3 enterprises is masked within another, larger selected group, so that easily identifiable enterprises are protected. In the first estimation pass, a reverse calculation was done, using an averaging method to estimate the masked data. Once the first pass is done, the researcher can request access to the National Statistics Bureau to use the real, unmasked data to calculate the true ratios. The researcher can take the calculated ratios out, but not the original data sets. This adds time to the data processing. The simulated reverse calculation was done first to create a model for the cluster data. Once the model is verified, it can be used for other sets of Census data. Currently, two sets of Census data, 2006 and 2011, are used for this research. Earlier census data will require more work to transform from the old format into the newest cluster-coded format. Some earlier data exist only on paper, which is very cumbersome to use. Fortunately, for the proof of concept, the two sets of data provided are sufficient to model the process for future data processing.
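The first-pass estimate of masked cells can be sketched as follows. This is only an illustration under stated assumptions, not the procedure actually used in this research: it assumes that the total of the larger group is published, that the masked (D) cells are the only unknowns, and that the undisclosed remainder is split evenly among them. The function name `fill_suppressed` and the sample figures are hypothetical.

```python
from __future__ import annotations

from typing import Optional


def fill_suppressed(parent_total: float,
                    children: list[Optional[float]]) -> list[float]:
    """Estimate masked child cells (None) by averaging the undisclosed remainder.

    Assumption: parent_total is the published total of the larger group, and
    every masked cell receives an equal part of what the disclosed cells
    leave unexplained.
    """
    disclosed = sum(v for v in children if v is not None)
    n_masked = sum(1 for v in children if v is None)
    if n_masked == 0:
        return [float(v) for v in children]
    remainder = parent_total - disclosed
    if remainder < 0:
        raise ValueError("disclosed values exceed the parent total")
    estimate = remainder / n_masked
    return [estimate if v is None else float(v) for v in children]


# Hypothetical group: total 1000, two of the four industries masked with (D).
filled = fill_suppressed(1000, [400, None, 350, None])

# Balance check: the estimates must add back up to the published total.
assert abs(sum(filled) - 1000) < 1e-9
```

Once the true ratios are obtained from the unmasked data, they would replace the equal split used here.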

3. Six data categories were extracted from these two data sets: enterprise establishments, employment counts, revenue, labor compensation, expenditure, and operating revenue. The data sums are checked against the original data at each step of the process, so that data quality is maintained and the totals always balance, ensuring the data are error free.

4. Create a cluster map for every City and every cluster - Since the original data were collected on a City basis, the smallest region is defined at the City level.

A matrix of 135 by 45 data maps is created for one set of census raw data. Because this is the actual collected enterprise data, it provides close-to-accurate estimates.

5. Calculate the shares – The units of the data categories are “counts” for establishments, “counts” for employment, and “NT$1,000” for revenue, labor compensation, expenditure, and operating revenue. The first step is to calculate the share of each cluster for each City. This shows the differences among the clusters’ shares: some are predominantly large, and some are very small (see Figure 47 Taiwan’s Cluster Composition). The total Traded and Local cluster shares are also calculated, giving a 63:37 ratio. This means that 63% of employment is in Traded clusters and 37% is in Local clusters. The US has a 40:60 ratio, the reverse of Taiwan’s pattern: the US has more employment in Local clusters, and its Business Services are more developed according to Ketels^{47}. Taiwan is a small economy that depends on export surplus; therefore, the Traded clusters carry more weight.
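The share calculation can be sketched as below. This is a minimal example, assuming employment counts are held in a dictionary keyed by cluster name. The cluster names, the counts, and the function names are hypothetical, and the toy counts were chosen only so that the split reproduces the 63:37 ratio reported above.

```python
from __future__ import annotations


def cluster_shares(employment: dict[str, float]) -> dict[str, float]:
    """Return each cluster's share of total employment."""
    total = sum(employment.values())
    if total <= 0:
        raise ValueError("total employment must be positive")
    return {name: count / total for name, count in employment.items()}


def traded_local_split(employment: dict[str, float],
                       cluster_type: dict[str, str]) -> tuple[float, float]:
    """Return (traded share, local share) of total employment."""
    total = sum(employment.values())
    if total <= 0:
        raise ValueError("total employment must be positive")
    traded = sum(count for name, count in employment.items()
                 if cluster_type[name] == "traded")
    return traded / total, (total - traded) / total


# Hypothetical employment counts for three made-up clusters.
employment = {"Cluster A": 450, "Cluster B": 180, "Cluster C": 370}
cluster_type = {"Cluster A": "traded", "Cluster B": "traded", "Cluster C": "local"}

shares = cluster_shares(employment)
traded_share, local_share = traded_local_split(employment, cluster_type)
print(f"Traded {traded_share:.0%} : Local {local_share:.0%}")
```

The same functions apply unchanged to the other data categories (establishments, revenue, and so on), since a share is unit-free.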

6. Calculate the growth rates - Between 2006 and 2011 there are 5 years over which the compound annual growth rate (CAGR) is calculated. Employment, revenue, and wage growth rates are calculated for all clusters in all regions (National, City, District).
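For reference, the standard CAGR formula is (end value / start value)^(1 / years) - 1. A minimal sketch follows, assuming the two census years serve as the start and end points; the function name `cagr` and the sample values are hypothetical.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate between two observations."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Hypothetical cluster whose employment grows from 10,000 (2006) to 12,000 (2011).
rate = cagr(10_000, 12_000, 2011 - 2006)
print(f"CAGR: {rate:.2%}")
```

Clusters recorded as 0 in either year have no defined CAGR; the sketch raises an error for a zero start value rather than guessing.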

47 In the clusters ratio discussion on 3/2/2016

7. Plot graphs - The most meaningful graphs are the three-dimensional bubble plots, which show the dynamics of each kind of growth. A wage graph uses average wage on the x-axis, employment share on the y-axis, and employment size as the bubble size to indicate cluster growth and decline. The bubble chart makes the dynamics easy to identify alongside cluster sizes, especially when plotting growth rates.

The three-dimensional charts convey dynamic information well; therefore, they are used throughout this report.

8. Run statistical correlations. Although the Taiwan cluster definitions are adapted from the US cluster definitions, a statistical correlation is run to examine their relations. Correlations of all three growth rates are run to show that the traded clusters are highly correlated and the local clusters are not. This confirms Porter’s claims about the importance of Traded clusters. (See Appendix IV Pearson’s Correlation on Traded Clusters’ Growth Rates and Appendix V Pearson’s Correlation on Local Clusters’ Growth Rates.)
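Pearson’s correlation between two growth-rate series can be sketched as below. This is a plain-Python illustration of the textbook coefficient, not the tool used to produce the appendices, and the growth rates shown are hypothetical.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical employment and revenue growth rates for four clusters.
employment_growth = [0.02, 0.05, -0.01, 0.03]
revenue_growth = [0.03, 0.07, 0.00, 0.04]
r = pearson(employment_growth, revenue_growth)
print(f"r = {r:.3f}")
```

Running the same function on each pair of the three growth rates (employment, revenue, wage), separately for traded and local clusters, would yield the kind of comparison described in this step.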

9. Calculate Location Quotients for each cluster of each city. This gives an indication of the regional concentration of a cluster. The higher the concentration, the more important the city is to the nation for that cluster. The Location Quotient (LQ) represents the concentration of a cluster, which is also an indicator of specialization.

An LQ equal to 1 means the cluster has the same concentration as Taiwan’s overall ratio. LQ and shares are interpreted differently. LQ means more to a region: if the LQ is very high, the city specializes in the cluster and may have some special knowledge to offer. Shares mean more to the nation, since larger shares mean greater employment impact. Population size matters too: if a larger, highly populated region has a specialization, the cluster may have spillover effects when it saturates. IT in Hsinchu is an example of this (Figure 98).
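The Location Quotient is conventionally the cluster's share of city employment divided by the cluster's share of national employment. A minimal sketch under that standard definition follows; the function name, parameter names, and figures are hypothetical.

```python
def location_quotient(city_cluster_emp: float, city_total_emp: float,
                      nation_cluster_emp: float, nation_total_emp: float) -> float:
    """LQ = (city cluster share) / (national cluster share)."""
    if min(city_total_emp, nation_cluster_emp, nation_total_emp) <= 0:
        raise ValueError("totals and national cluster employment must be positive")
    city_share = city_cluster_emp / city_total_emp
    nation_share = nation_cluster_emp / nation_total_emp
    return city_share / nation_share


# Hypothetical city: 20% of its jobs are in a cluster that holds 10% nationally.
lq = location_quotient(200, 1_000, 1_000, 10_000)
print(f"LQ = {lq:.2f}")  # a value above 1 indicates specialization
```

Because the national cluster employment appears in the denominator, the six clusters recorded as 0 have no defined LQ and would need to be excluded before this calculation.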

10. Draw maps of the cities individually to see the clustered industries and their growth rates and employment shares. This provides a clear picture of how dynamic the cluster activities and labor movements are. (See Figure 86 to Figure 142.)
