Simulation Study

National Chengchi University (國立政治大學)

Table 4.2: The maximizer(s) and the corresponding maximal pAUC value found by the grid search and by the proposed multiple-initial method.

Case  Method            Maximizer(s)                                                Optimal pAUC
I     Grid search       (0.7077, 0.7065)                                            0.0253
I     Multiple-initial  (0.7071, 0.7071)                                            0.0253
II    Grid search       (0.8144, -0.5803), (-0.5803, 0.8144)                        0.0372
II    Multiple-initial  (0.8145, -0.5802), (-0.5802, 0.8145)                        0.0372
III   Grid search       (-0.5544, 0.8322)                                           0.0342
III   Multiple-initial  (-0.5568, 0.8308)                                           0.0342
IV    Grid search       (0.6580, 0.7530)                                            0.0365
IV    Multiple-initial  (0.6215, 0.7834)                                            0.0364
V     Grid search       (0.4255, 0.4184, 0.8024)                                    0.0387
V     Multiple-initial  (0.4222, 0.4223, 0.8021)                                    0.0387
VI    Grid search       (-0.4762, 0.8120, 0.3375), (0.8122, -0.4758, 0.3376)        0.0411
VI    Multiple-initial  (-0.4798, 0.8149, 0.3251), (0.8149, -0.4798, 0.3251)        0.0411

and Case III have a unique maximizer. However, Case III has one local maximum other than the global maximum, and Case II has two optimal linear combinations, as shown in the previous chapter. Next, two cases with three biomarkers are taken as Case V and Case VI. Case V is a simple extension of Case I with a unique pAUC maximizer, and Case VI is a simple extension of Case II with two pAUC maximizers.

In all scenarios, the optimal linear combinations found by the grid search are taken as the gold standard. All results are listed in Table 4.2. The solutions found by the proposed multiple-initial method are all close to those of the grid search. Moreover, in the cases with multiple maximizers, Case II and Case VI, our method successfully finds all the solutions. Overall, the proposed multiple-initial method performs satisfactorily in these synthetic examples.

4.2 Statistical Inference

In this section, we use simulations to validate the proposed methods, including the estimated best linear combination of all biomarkers, the global test of the discriminatory power of a set of biomarkers, and the two biomarker selection methods.

We provide simulation results for two, three, and four biomarkers.

First, we discuss the cases of two biomarkers. Again, the mean vector in the non-diseased population is fixed at the zero vector, $\mu_0 = 0$, so the mean vector in the diseased population equals the vector of mean differences, $\mu_1 = \Delta\mu = (\Delta_1, \Delta_2)^T$. Various values of $\Delta\mu$ are considered. Further, every biomarker has unit variance in both groups.

Denote the within-group correlation by $\rho_d$, $d = 0, 1$. Hence, the covariance matrices of the two biomarkers in the two groups are

$$\Sigma_d = \begin{pmatrix} 1 & \rho_d \\ \rho_d & 1 \end{pmatrix}, \quad d = 0, 1.$$

In addition to the independent case, three population correlations, 0.1, 0.5, and 0.9, are considered in the simulation study; see Table 4.3. Given the true parameter values, the true best linear combination maximizing the pAUC with $t = 0.1$, denoted by $a^*$, is found by a grid search with $10^6$ grid points. The distributions of the true best linear combination, $a^{*T}X$, are reported in Table 4.3. The corresponding true pAUC values, $\mathrm{pAUC}(a^*)$, are listed in the last column of Table 4.3.
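The grid search described above can be sketched as follows. This is a minimal illustration and not the thesis code: it assumes the binormal setting of this section ($\mu_0 = 0$, unit variances), coefficient vectors normalized to unit length, the cutoff $c(u) = \sqrt{Q_0}\,\Phi^{-1}(1-u)$, and a much coarser grid than the $10^6$ points used in the study. The function names `make_cutoffs`, `pauc`, and `grid_search` are hypothetical.

```python
from math import cos, sin, pi, sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def make_cutoffs(t=0.1, n_nodes=200):
    """Midpoint-rule step h and the standard-normal quantiles Phi^{-1}(1 - u)."""
    h = t / n_nodes
    return h, [STD.inv_cdf(1.0 - (k + 0.5) * h) for k in range(n_nodes)]


def pauc(a, delta, rho0, rho1, h, cutoffs):
    """Binormal pAUC of a'X for two unit-variance biomarkers with mu0 = 0."""
    m = a[0] * delta[0] + a[1] * delta[1]              # a' mu1
    q0 = a[0] ** 2 + a[1] ** 2 + 2 * rho0 * a[0] * a[1]  # variance, D = 0
    q1 = a[0] ** 2 + a[1] ** 2 + 2 * rho1 * a[0] * a[1]  # variance, D = 1
    s0, s1 = sqrt(q0), sqrt(q1)
    # Integrate the sensitivity over false-positive rates u in (0, t).
    return h * sum(STD.cdf((m - s0 * c) / s1) for c in cutoffs)


def grid_search(delta, rho0, rho1, t=0.1, n_angles=720):
    """Search unit vectors a = (cos theta, sin theta) on an angular grid."""
    h, cutoffs = make_cutoffs(t)
    best_a, best_val = None, -1.0
    for k in range(n_angles):
        theta = 2 * pi * k / n_angles
        a = (cos(theta), sin(theta))
        val = pauc(a, delta, rho0, rho1, h, cutoffs)
        if val > best_val:
            best_a, best_val = a, val
    return best_a, best_val
```

For instance, `grid_search((0.0, 1.0), 0.5, 0.0)` corresponds to the setting of Case 10, and under the stated assumptions its result should be close to the entries reported for that case in Table 4.3.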

The first case is a complete null scenario, that is, the distributions in the two groups are the same. No linear combination has any discriminatory power for the disease, and every pAUC equals $t^2/2 = 0.005$. We simply define $a^* = (0, 0)^T$ in this case.
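The null value $t^2/2$ can be verified in one line. As a short worked step, and assuming only that the score $a^TX$ has the same continuous distribution in both groups, the sensitivity at false-positive rate $u$ equals $u$, so the ROC curve is the diagonal:

```latex
\mathrm{ROC}(u) = u
\quad\Longrightarrow\quad
\mathrm{pAUC} = \int_0^t u \, du = \frac{t^2}{2} = \frac{(0.1)^2}{2} = 0.005 .
```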

In Cases 2-31, the mean difference of the first biomarker is zero, but the second biomarker has a positive mean difference. As a result, the second biomarker is the more important one in these scenarios. In Cases 2-4, the two biomarkers are independent, so the first biomarker is unrelated to the disease and the second biomarker is the only contributor. Nevertheless, when the two biomarkers are correlated, the first biomarker provides a non-ignorable contribution; see Cases 5-31. Compared with Cases 2-4, the global discriminatory power is generally improved by a positive correlation between the two biomarkers.

To investigate the effect of correlation, we consider various covariance matrices: the two biomarkers are correlated only in the non-diseased group (Cases 5-13), only in the diseased group (Cases 14-22), or equally in both groups (Cases 23-31). Except for Cases 14-22, the best linear combination rotates in the same direction as the correlation increases: the coefficient of the first biomarker becomes more negative in Cases 5-13 and Cases 23-31, whereas it is positive in Cases 14-22. In addition, when the correlations in the non-diseased and diseased groups are the same, the best linear combination does not change with the mean difference; see Cases 23-31.
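The invariance seen in Cases 23-31 can be checked against a closed form. The sketch below relies on a standard property of the binormal model that is not stated in this section and is therefore an assumption here: when both groups share the same covariance matrix $\Sigma$, the ROC curve of $a^TX$ depends on $a$ only through $a^T\Delta\mu / \sqrt{a^T\Sigma a}$, which is maximized by $a \propto \Sigma^{-1}\Delta\mu$. The function name `equal_cov_direction` is hypothetical.

```python
from math import sqrt


def equal_cov_direction(delta, rho):
    """Unit vector proportional to Sigma^{-1} delta for Sigma = [[1, rho], [rho, 1]]."""
    # Sigma^{-1} = [[1, -rho], [-rho, 1]] / (1 - rho^2); the scalar factor
    # cancels once the vector is normalized to unit length.
    b1 = delta[0] - rho * delta[1]
    b2 = delta[1] - rho * delta[0]
    norm = sqrt(b1 * b1 + b2 * b2)
    return (b1 / norm, b2 / norm)


# Mean differences (0, 0.3), (0, 0.5), (0, 1.0) are proportional, so the
# direction depends on rho only, not on the size of the mean difference.
for rho in (0.1, 0.5, 0.9):
    print(rho, [equal_cov_direction((0.0, d2), rho) for d2 in (0.3, 0.5, 1.0)])
```

Because the three mean-difference vectors within each correlation level are proportional to $(0, 1)^T$, the normalized direction is the same for all of them, which is consistent with the constant coefficients listed for Cases 23-31.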

Figure 4.1: The distributions of the best linear combination, $a^{*T}X$, for Case 4 (top left), Case 7 (top right), Case 10 (bottom left), and Case 13 (bottom right).

Recall that, in the integral defining the pAUC, the non-diseased population distribution, which is related to the specificity, determines the threshold range, while the diseased population distribution, which is related to the sensitivity, determines the magnitude of the integrand. Table 4.3 reveals several relationships between the distributions of $a^{*T}X$ in the two groups and $\mathrm{pAUC}(a^*)$, which we discuss in detail below.

First, we fix the mean difference at $\Delta\mu = (0, 1)^T$ and assess the effect of the distribution of $a^{*T}X$ in the non-diseased group on $\mathrm{pAUC}(a^*)$. Specifically, we compare Case 4, Case 7, Case 10, and Case 13. Figure 4.1 plots the distributions of $a^{*T}X$ in the two groups for these four cases.

When the correlation of the two biomarkers in the non-diseased group, $\rho_0$, increases from 0.1 to 0.9, the variance of the best linear combination in the non-diseased group, $Q_0$, decreases quickly. The mean of the best linear combination in the diseased group, $a^{*T}\mu_1$, also decreases, but the reduction in $Q_0$ dominates, so the pAUC in Case 13 is dramatically larger than that in Case 4. From Figure 4.1, we see that when the variance of the non-diseased population decreases, the cutoff point $c$ moves to the left. Given the cutoff points, the pAUC is the integral of the tail probability, so the range of integration becomes larger and the pAUC increases.

Similarly, we fix the mean difference at $\Delta\mu = (0, 1)^T$ and assess the effect of the distribution of $a^{*T}X$ in the diseased group on $\mathrm{pAUC}(a^*)$. Specifically, Case 4, Case 16, Case 19, and Case 22 are compared. When the correlation of the two biomarkers in the diseased group, $\rho_1$, increases from 0.1 to 0.9, the variance of the best linear combination in the diseased group, $Q_1$, increases slowly, and the means remain close to the mean in Case 4. Since the non-diseased distributions are all standard normal, as in Case 4, and $\mu_1 = \Delta\mu$, the pAUC of the linear combination $a^{*T}X$ defined in Equation (2.1.1) is

$$\mathrm{pAUC}(a^*) = \int_0^t \left[ 1 - \Phi\!\left( \frac{c(u) - a^{*T}\mu_1}{\sqrt{Q_1}} \right) \right] du = \int_0^t \Phi\!\left( \frac{a^{*T}\Delta\mu - c(u)}{\sqrt{Q_1}} \right) du.$$

From this expression, we have the following findings. First, the pAUC increases as $a^{*T}\Delta\mu$ increases. Second, if $a^{*T}\Delta\mu < \Phi^{-1}(1 - t)$, the pAUC increases as $Q_1$ increases; otherwise, the pAUC decreases as $Q_1$ increases. Hence, as Table 4.3 shows, the pAUC in Case 22 is larger than the pAUC in Case 4. See Figure 4.2 for the distribution plots of these cases.
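These findings can be checked numerically. The sketch below evaluates the integral by the midpoint rule, assuming that the non-diseased score is standard normal so that $c(u) = \Phi^{-1}(1 - u)$; this form of $c(u)$ is an assumption, since Equation (2.1.1) is not reproduced in this section. The function name `pauc_std_reference` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def pauc_std_reference(m, q1, t=0.1, n_nodes=2000):
    """pAUC over false-positive rates (0, t) when the non-diseased score is
    N(0, 1) and the diseased score is N(m, q1)."""
    h = t / n_nodes
    total = 0.0
    for k in range(n_nodes):
        u = (k + 0.5) * h                 # midpoint of the k-th subinterval
        c = STD.inv_cdf(1.0 - u)          # assumed cutoff c(u)
        total += STD.cdf((m - c) / sqrt(q1))
    return h * total


# Mean and variance pairs (a*' mu1, Q1) taken from Table 4.3.
print(pauc_std_reference(1.00, 1.00))  # setting of Case 4
print(pauc_std_reference(0.95, 1.55))  # setting of Case 22
```

In the setting of Case 22, the mean 0.95 lies below $\Phi^{-1}(0.9) \approx 1.28$, so the larger $Q_1$ raises the pAUC, in line with the second finding.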

Comparing Figure 4.1 with Figure 4.2, we find that the change in the distributions brought by the correlation in the non-diseased group is larger than the change brought by the correlation in the diseased group. Hence, we conclude that the appearance of a positive correlation in the non-diseased group has a greater effect than that in the diseased group.

The last three cases (Cases 32-34) simulate scenarios in which the two biomarkers have the same mean difference and are independent of each other. Their pAUC increases with the common mean difference.

Figure 4.2: The distributions of the best linear combination, $a^{*T}X$, for Case 4 (top left), Case 16 (top right), Case 19 (bottom left), and Case 22 (bottom right).

Next, we study the performance of the proposed estimated best linear combination, $\hat{a}_n$, and the corresponding pAUC, $\mathrm{pAUC}(\hat{a}_n)$. For a balanced design, the sample size is 100 in each group. The empirical means and standard errors of these estimators over 1000 replications are listed in Table 4.4, denoted by Ave and SE, respectively. Note that the estimated best linear combinations are found by our multiple-initial algorithm, because the best linear combination may not be unique and local maxima may exist, as discussed in Section 2.2.

Regarding the estimated best linear combination, we find that it tends to be conservative, with coefficients shrunk toward zero. In Case 1, where no best linear combination exists in theory, the estimation is the least stable: the standard errors of the estimated coefficients under the complete null scenario are the largest among all cases. Figure 4.3 shows the density plots of the two coefficients of the estimated optimal linear combination. Both are bimodal, with a high chance of falling near the two boundaries, -1 and 1. The variation decreases as the association of the two biomarkers with the disease becomes stronger. In addition, the corresponding estimated pAUC overestimates the true value, and, similarly, the estimation improves as the biomarkers become more strongly associated with the disease. The last column of Table 4.4 is the empirical power of the proposed test for the global discriminatory ability at the significance level $\alpha = 0.05$. The test not only adequately controls the type I error rate but also performs satisfactorily in the alternative cases. We also examined other complete null cases, in which the non-diseased and diseased distributions are identical, and found that the type I error rate is well controlled in each of them.
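The Ave and SE columns of Table 4.4 summarize a Monte Carlo experiment of the following shape. The sketch below is only an illustration under several assumptions that are not taken from the thesis: the estimator is a parametric plug-in (sample means and covariances inserted into the binormal pAUC), the maximization uses a coarse angular grid in place of the multiple-initial algorithm, and the number of replications is reduced to keep the run short. All function names are hypothetical.

```python
import random
from math import cos, sin, pi, sqrt
from statistics import NormalDist, mean, stdev

STD = NormalDist()


def draw_group(n, mu, rho, rng):
    """Draw n bivariate-normal points with unit variances and correlation rho."""
    sample = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        sample.append((mu[0] + z1, mu[1] + rho * z1 + sqrt(1.0 - rho * rho) * z2))
    return sample


def moments(sample):
    """Sample mean vector and (var1, cov12, var2) of a list of pairs."""
    n = len(sample)
    m1 = sum(x for x, _ in sample) / n
    m2 = sum(y for _, y in sample) / n
    v1 = sum((x - m1) ** 2 for x, _ in sample) / (n - 1)
    v2 = sum((y - m2) ** 2 for _, y in sample) / (n - 1)
    c12 = sum((x - m1) * (y - m2) for x, y in sample) / (n - 1)
    return (m1, m2), (v1, c12, v2)


def plug_in_maximizer(x0, x1, h, cutoffs, n_angles=180):
    """Grid-search the unit vector maximizing the plug-in binormal pAUC."""
    mean0, cov0 = moments(x0)
    mean1, cov1 = moments(x1)
    d = (mean1[0] - mean0[0], mean1[1] - mean0[1])
    best_a, best_val = None, -1.0
    for k in range(n_angles):
        th = 2 * pi * k / n_angles
        a1, a2 = cos(th), sin(th)
        m = a1 * d[0] + a2 * d[1]
        s0 = sqrt(a1 * a1 * cov0[0] + 2 * a1 * a2 * cov0[1] + a2 * a2 * cov0[2])
        s1 = sqrt(a1 * a1 * cov1[0] + 2 * a1 * a2 * cov1[1] + a2 * a2 * cov1[2])
        val = h * sum(STD.cdf((m - s0 * c) / s1) for c in cutoffs)
        if val > best_val:
            best_a, best_val = (a1, a2), val
    return best_a, best_val


def simulate(delta, rho0, rho1, n=100, reps=30, t=0.1, n_nodes=50, seed=1):
    """Return a list of (a_hat, pAUC_hat) pairs, one per replication."""
    rng = random.Random(seed)
    h = t / n_nodes
    cutoffs = [STD.inv_cdf(1.0 - (k + 0.5) * h) for k in range(n_nodes)]
    results = []
    for _ in range(reps):
        x0 = draw_group(n, (0.0, 0.0), rho0, rng)   # non-diseased group
        x1 = draw_group(n, delta, rho1, rng)        # diseased group
        results.append(plug_in_maximizer(x0, x1, h, cutoffs))
    return results
```

Given `res = simulate((0.0, 0.0), 0.0, 0.0)`, the analogues of Ave and SE are `mean(v for _, v in res)` and `stdev(v for _, v in res)` for the pAUC, and likewise for each coefficient. In the complete null setting the maximized plug-in pAUC tends to exceed 0.005, which illustrates the overestimation discussed above.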

Now, we compare the performance of the Forward method and the Backward method in variable selection. For every testing procedure, the significance level $\alpha$ is 0.05 and the bootstrap sample size is 500 in every replication. For a study of two biomarkers, the four possible conclusions are defined as follows: $(c_1, c_2)$, if both biomarkers are selected; $(1, 0)$, if only the first biomarker is selected; $(0, 1)$, if only the second biomarker is selected; $(0, 0)$, if neither biomarker is selected. The last outcome means that no biomarker is significant for detecting the disease, so the final reduced biomarker set is empty and no corresponding pAUC is obtained. Once we obtain a non-empty reduced set, we compute the best linear combination of the reduced set and its corresponding pAUC.

Table 4.5 reports the performance of the resulting best linear combination of the non-empty reduced set over the 1000 replications. The proportions of the four possible outcomes of the two methods over the 1000 replications are listed in Table 4.6. In every scenario, the figure in boldface corresponds to the most likely outcome.

From Table 4.5, we find that the Forward method performs better than the Backward method in most cases except the complete null case. In addition, when the first biomarker has a non-ignorable contribution mainly due to a positive correlation between the two biomarkers, as in Cases 8-13 and Cases 28-31, the Backward approach may perform unsatisfactorily. From Table 4.6, we find that in these cases the proportion of selecting only the first biomarker is quite high, contrary to our expectation that both biomarkers would be selected. This confusing result arises from an improper null distribution generated in these scenarios. When the Backward approach is applied in these scenarios, the usual sequence is as follows. After a significant global test is obtained at step 0, the first biomarker is tested first because it tends to have the smaller absolute coefficient. Because it makes a non-ignorable contribution to the pAUC, it is usually declared significant. Next, the conditional discriminatory power of the second biomarker is assessed using its corresponding coefficient as the test statistic. In the proposed parametric bootstrap sampling method, the null distribution assumes that the tested biomarker has a common mean and a common variance in the two groups and is uncorrelated with the other biomarker. However, in this case the significant effect of the first biomarker comes from the correlation, and eliminating the correlation produces a complete null scenario like Case 1. As discussed in the previous paragraph, the estimated coefficient varies greatly in this situation. As a result, it is difficult to obtain significance when testing the conditional discriminatory power of the second biomarker. Finally, we reach an inadequate conclusion: the first biomarker is the only selected biomarker.

Figure 4.3: Density plots of the coefficients of the estimated linear combination for Case 1 (left: $\hat{a}_1$, N = 1000, bandwidth = 0.1604; right: $\hat{a}_2$, N = 1000, bandwidth = 0.1593). Horizontal axes range from -1.0 to 1.0; vertical axes show the density.

On the other hand, in these scenarios, the Forward approach, which starts by testing the biomarker with the largest absolute coefficient, cannot take advantage of the correlation. When the mean difference of the second biomarker is small, as in Cases 8-9, Cases 11-12, and Cases 29-30, the Forward approach has a larger chance of selecting no biomarker than the Backward approach and hence is less powerful. But as the mean difference of the biomarker becomes moderate to large, the Forward approach has a greater proportion of selecting both biomarkers; see Cases 10, 13, and 31. Moreover, the average optimal pAUC of the reduced set selected by the Forward approach over 1000 replications is always larger than that of the Backward approach in Cases 8-13 and Cases 29-31. Hence, we conclude that when the mean difference is not too small, the Forward method performs better than the Backward method.

To investigate the robustness of our biomarker selection methods to departures from the normality assumption, two-biomarker data are also generated from multivariate t distributions with 3 degrees of freedom. Table 4.7 reports the true maximal pAUC, $\mathrm{pAUC}(a^*)$, found under the multivariate t distribution. It also presents, over 1000 replications, the average and the standard error of the estimated maximal pAUC of the reduced biomarker set selected by our biomarker selection methods under the normality assumption. From this table, we find that our methods overestimate the pAUC in these cases. Thus, we conclude that the proposed optimal pAUC estimation and the biomarker selection methods are sensitive to the normality assumption.
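Heavy-tailed data of this kind can be generated by the usual normal scale-mixture construction. The sketch below is a minimal illustration, not the thesis code: it assumes the parameterization $X = \mu + Z / \sqrt{W/\nu}$ with $Z \sim N(0, \Sigma)$ and $W \sim \chi^2_\nu$, in which $\Sigma$ is the scale matrix rather than the covariance matrix. The parameterization used in the study is not stated here, and the function name `draw_bivariate_t` is hypothetical.

```python
import random
from math import sqrt


def draw_bivariate_t(n, mu, rho, df=3, seed=None):
    """Draw n points from a bivariate t distribution with location mu,
    scale matrix [[1, rho], [rho, 1]], and df degrees of freedom."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        # Correlated standard-normal pair via the Cholesky factor of the scale matrix.
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y1, y2 = z1, rho * z1 + sqrt(1.0 - rho * rho) * z2
        # Chi-square(df) is a gamma variate with shape df/2 and scale 2.
        w = rng.gammavariate(df / 2.0, 2.0)
        scale = sqrt(df / w)
        sample.append((mu[0] + scale * y1, mu[1] + scale * y2))
    return sample
```

For example, `draw_bivariate_t(100, (0.0, 1.0), 0.5, df=3, seed=1)` gives one diseased-group sample of size 100. Fitting the normal-theory procedure to such samples is one way to probe the sensitivity reported in Table 4.7.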

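Next, we study the cases of three and four biomarkers, i.e., $p = 3$ or $4$.

Again, assume $\mu_0 = 0$ in the non-diseased group and $\mu_1 = \Delta = (\Delta_1, \ldots, \Delta_p)^T$ in the diseased group. Further, the covariance matrices are of the following form: for $d = 0, 1$,

$$\text{if } p = 3,\ \Sigma_d = \begin{pmatrix} 1 & \rho_d & 0 \\ \rho_d & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \quad \text{and if } p = 4,\ \Sigma_d = \begin{pmatrix} 1 & \rho_d & 0 & 0 \\ \rho_d & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}.$$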

The performance of the estimated pAUC of the best linear combination of the full biomarker set, and that of the reduced biomarker set found by the two biomarker selection approaches, are presented in Table 4.8. As in the cases of $p = 2$, the estimated pAUC tends to overestimate the true value. With the Backward approach, we are less likely to reach a confusing conclusion than in the case of $p = 2$. Here, the two selection approaches have comparable performance in most cases, except Case 11 of $p = 3$ and Case 8 of $p = 4$.

Table 4.9 reports the true value of the best linear combination and the empirical mean and standard error of the estimated $\hat{a}_n$ based on 1000 replications, denoted by True, Ave, and SE. Tables 4.10, 4.11, and 4.12 present the proportions of outcomes from the two biomarker selection methods over 1000 replications. Table 4.10 reports the cases of $p = 3$, while Tables 4.11 and 4.12 give the cases of $p = 4$. For three biomarkers, there are eight possible conclusions: (i) $(0, 0, 0)$, if all biomarkers are insignificant; (ii) $(1, 0, 0)$, if only the first biomarker is selected; (iii) $(0, 1, 0)$, if only the second is selected; (iv) $(0, 0, 1)$, if only the third is selected; (v) $(c_1, c_2, 0)$, if the first and the second are selected; (vi) $(c_1, 0, c_3)$, if the first and the third are selected; (vii) $(0, c_2, c_3)$, if the second and the third are selected; (viii) $(c_1, c_2, c_3)$, if all biomarkers are selected. Table 4.10 lists the proportions of these eight conclusions for the two approaches over the 1000 replications.

For four biomarkers, there are sixteen possible conclusions: (i) $(0, 0, 0, 0)$, if all biomarkers are insignificant; (ii) $(1, 0, 0, 0)$, if only the first biomarker is selected; (iii) $(0, 1, 0, 0)$, if only the second is selected; (iv) $(0, 0, 1, 0)$, if only the third is selected; (v) $(0, 0, 0, 1)$, if only the fourth is selected; (vi) $(c_1, c_2, 0, 0)$, if the first and the second are selected; (vii) $(c_1, 0, c_3, 0)$, if the first and the third are selected; (viii) $(c_1, 0, 0, c_4)$, if the first and the fourth are selected; (ix) $(0, c_2, c_3, 0)$, if the second and the third are selected; (x) $(0, c_2, 0, c_4)$, if the second and the fourth are selected; (xi) $(0, 0, c_3, c_4)$, if the third and the fourth are selected; (xii) $(c_1, c_2, c_3, 0)$, if only the fourth is not selected; (xiii) $(c_1, c_2, 0, c_4)$, if only the third is not selected; (xiv) $(c_1, 0, c_3, c_4)$, if only the second is not selected; (xv) $(0, c_2, c_3, c_4)$, if only the first is not selected; (xvi) $(c_1, c_2, c_3, c_4)$, if all biomarkers are selected. The proportions of the sixteen possible conclusions of the two approaches over the 1000 replications are reported in Table 4.11 and Table 4.12, respectively. In each scenario, the figure in boldface corresponds to the most likely outcome.
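The outcome labels above follow a simple pattern that extends to any number of biomarkers. The sketch below enumerates them in the order used in the text (by the number of selected biomarkers) and tallies proportions over replications. The function names are hypothetical, and the input format, a collection of 1-based indices of the selected biomarkers per replication, is an assumption made for illustration.

```python
from collections import Counter
from itertools import combinations


def outcome_label(selected, p):
    """Label a selection: '1' marks a single selected biomarker, 'c_j' marks
    the j-th coefficient when several are selected, and '0' marks the rest."""
    chosen = set(selected)
    if len(chosen) == 1:
        parts = ["1" if j in chosen else "0" for j in range(1, p + 1)]
    else:
        parts = [f"c{j}" if j in chosen else "0" for j in range(1, p + 1)]
    return "(" + ", ".join(parts) + ")"


def all_outcomes(p):
    """All 2^p conclusions, ordered by the number of selected biomarkers."""
    labels = []
    for size in range(p + 1):
        for subset in combinations(range(1, p + 1), size):
            labels.append(outcome_label(subset, p))
    return labels


def outcome_proportions(selections, p):
    """Proportion of replications ending in each possible conclusion."""
    counts = Counter(outcome_label(s, p) for s in selections)
    total = len(selections)
    return {label: counts[label] / total for label in all_outcomes(p)}
```

Calling `all_outcomes(3)` yields the eight conclusions (i) to (viii), and `all_outcomes(4)` yields the sixteen conclusions (i) to (xvi) in the same order as listed above.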

From Table 4.10, we can see that the most likely outcomes of the Forward approach and the Backward approach differ in Case 6 and Case 8 of $p = 3$. In these two cases, the first two biomarkers have nearly equal coefficients (Table 4.9) and thus appear equally important, so the resulting difference in pAUC is small. In contrast, in Case 11 of $p = 3$ the difference in pAUC between the Forward approach and the Backward approach is large, although their most likely outcomes are the same. Similarly, the same result is found in Case 8 of $p = 4$ from Table 4.11 and Table 4.12. In the next chapter, we apply the proposed methods to four real examples: atherosclerotic coronary heart disease, Duchenne muscular dystrophy (DMD), electrical impedance spectroscopy of breast tissue, and MAGIC gamma telescope data.

Table 4.3: The settings of the populations. For each case, the columns give the mean differences (∆1, ∆2) and the within-group correlations (ρ0, ρ1) of the population X, the true best linear combination a* = (a1*, a2*), the mean and variance of a*ᵀX in the non-diseased group (D = 0: a*ᵀµ0, Q0) and in the diseased group (D = 1: a*ᵀµ1, Q1), and the true pAUC.

Case  ∆1   ∆2   ρ0   ρ1   a1*    a2*   a*ᵀµ0  Q0    a*ᵀµ1  Q1    pAUC(a*)
 1    0.0  0.0  0.0  0.0   0.00  0.00  NA     NA    NA     NA    0.0050
 2    0.0  0.3  0.0  0.0   0.00  1.00  0.00   1.00  0.30   1.00  0.0088
 3    0.0  0.5  0.0  0.0   0.00  1.00  0.00   1.00  0.50   1.00  0.0123
 4    0.0  1.0  0.0  0.0   0.00  1.00  0.00   1.00  1.00   1.00  0.0245
 5    0.0  0.3  0.1  0.0  -0.38  0.92  0.00   0.93  0.28   1.00  0.0093
 6    0.0  0.5  0.1  0.0  -0.28  0.96  0.00   0.95  0.48   1.00  0.0127
 7    0.0  1.0  0.1  0.0  -0.16  0.99  0.00   0.97  0.98   1.00  0.0249
 8    0.0  0.3  0.5  0.0  -0.65  0.77  0.00   0.51  0.23   1.00  0.0164
 9    0.0  0.5  0.5  0.0  -0.61  0.80  0.00   0.52  0.40   1.00  0.0204
10    0.0  1.0  0.5  0.0  -0.52  0.86  0.00   0.56  0.86   1.00  0.0333
11    0.0  0.3  0.9  0.0  -0.69  0.72  0.00   0.10  0.22   1.00  0.0367
12    0.0  0.5  0.9  0.0  -0.68  0.73  0.00   0.10  0.37   1.00  0.0422
13    0.0  1.0  0.9  0.0  -0.66  0.76  0.00   0.11  0.75   1.00  0.0567
14    0.0  0.3  0.0  0.1   0.32  0.95  0.00   1.00  0.28   1.06  0.0091
15    0.0  0.5  0.0  0.1   0.19  0.98  0.00   1.00  0.49   1.04  0.0125
16    0.0  1.0  0.0  0.1   0.06  0.99  0.00   1.00  1.00   1.01  0.0245
17    0.0  0.3  0.0  0.5   0.56  0.83  0.00   1.00  0.25   1.46  0.0119
18    0.0  0.5  0.0  0.5   0.47  0.88  0.00   1.00  0.44   1.41  0.0148
19    0.0  1.0  0.0  0.5   0.24  0.98  0.00   1.00  0.97   1.23  0.0256
20    0.0  0.3  0.0  0.9   0.60  0.80  0.00   1.00  0.24   1.87  0.0144
21    0.0  0.5  0.0  0.9   0.53  0.85  0.00   1.00  0.42   1.81  0.0172
22    0.0  1.0  0.0  0.9   0.33  0.95  0.00   1.00  0.95   1.55  0.0270
23    0.0  0.3  0.1  0.1  -0.09  0.99  0.00   0.98  0.30   0.98  0.0088
24    0.0  0.5  0.1  0.1  -0.09  0.99  0.00   0.98  0.49   0.98  0.0123
25    0.0  1.0  0.1  0.1  -0.09  0.99  0.00   0.98  0.99   0.98  0.0246
26    0.0  0.3  0.5  0.5  -0.45  0.89  0.00   0.60  0.27   0.60  0.0095
27    0.0  0.5  0.5  0.5  -0.45  0.89  0.00   0.60  0.45   0.60  0.0138
28    0.0  1.0  0.5  0.5  -0.45  0.89  0.00   0.60  0.89   0.60  0.0292
29    0.0  0.3  0.9  0.9  -0.67  0.74  0.00   0.10  0.22   0.10  0.0163
30    0.0  0.5  0.9  0.9  -0.67  0.74  0.00   0.10  0.37   0.10  0.0290
31    0.0  1.0  0.9  0.9  -0.67  0.74  0.00   0.10  0.74   0.10  0.0690
32    0.3  0.3  0.0  0.0   0.71  0.71  0.00   1.00  0.43   1.00  0.0109
33    0.5  0.5  0.0  0.0   0.71  0.71  0.00   1.00  0.71   1.00  0.0167
34    1.0  1.0  0.0  0.0   0.71  0.71  0.00   1.00  1.41   1.00  0.0380

Table 4.4: The true value, empirical mean (Ave), and standard error (SE) of the estimated $\hat{a}_n$ and $\mathrm{pAUC}(\hat{a}_n)$, and the power of the global test, over 1000 replications.

      a1                      a2                    pAUC
Case  True    Ave     SE      True   Ave    SE      True    Ave     SE      Power(Tg)
 1     0.000  -0.014  0.707   0.000  0.046  0.706   0.0050  0.0084  0.0020  0.043
 2     0.000  -0.005  0.552   1.000  0.763  0.337   0.0088  0.0106  0.0029  0.271
 3     0.000   0.016  0.427   1.000  0.892  0.147   0.0123  0.0138  0.0039  0.631
 4     0.000   0.018  0.238   1.000  0.970  0.042   0.0245  0.0256  0.0055  0.999
 5    -0.383  -0.181  0.552   0.924  0.735  0.350   0.0093  0.0109  0.0030  0.279
 6    -0.276  -0.223  0.394   0.961  0.876  0.163   0.0127  0.0138  0.0037  0.635
 7    -0.157  -0.164  0.229   0.988  0.958  0.051   0.0249  0.0258  0.0055  1.000
 8    -0.645  -0.569  0.324   0.765  0.694  0.299   0.0164  0.0169  0.0038  0.907
 9    -0.606  -0.598  0.116   0.795  0.787  0.100   0.0204  0.0211  0.0043  0.995
10    -0.519  -0.514  0.088   0.855  0.852  0.052   0.0333  0.0340  0.0053  1.000
11    -0.692  -0.659  0.215   0.722  0.689  0.212   0.0367  0.0369  0.0042  1.000
12    -0.682  -0.680  0.050   0.731  0.730  0.049   0.0422  0.0425  0.0044  1.000
13    -0.657  -0.656  0.024   0.754  0.754  0.020   0.0567  0.0568  0.0045  1.000
14     0.321   0.107  0.568   0.947  0.745  0.333   0.0091  0.0107  0.0030  0.266
15     0.195   0.133  0.422   0.981  0.882  0.161   0.0125  0.0137  0.0037  0.607
16     0.061   0.059  0.238   0.998  0.969  0.042   0.0245  0.0254  0.0052  0.997
17     0.563   0.407  0.441   0.826  0.686  0.412   0.0119  0.0127  0.0033  0.505
18     0.467   0.436  0.238   0.884  0.853  0.157   0.0148  0.0158  0.0041  0.799
19     0.239   0.234  0.163   0.971  0.958  0.042   0.0256  0.0266  0.0053  0.999
20     0.604   0.451  0.438   0.797  0.653  0.423   0.0144  0.0151  0.0035  0.792
21     0.529   0.498  0.232   0.848  0.812  0.195   0.0172  0.0178  0.0041  0.923
22     0.325   0.326  0.123   0.946  0.936  0.044   0.0270  0.0275  0.0052  0.999
23    -0.099  -0.076  0.567   0.995  0.756  0.320   0.0088  0.0106  0.0029  0.260
24    -0.099  -0.079  0.439   0.995  0.879  0.168   0.0123  0.0136  0.0036  0.617
25    -0.099  -0.095  0.236   0.995  0.966  0.051   0.0246  0.0253  0.0054  0.998
26    -0.447  -0.331  0.473   0.894  0.779  0.245   0.0095  0.0114  0.0032  0.349
27    -0.447  -0.400  0.294   0.894  0.859  0.126   0.0138  0.0151  0.0041  0.731
28    -0.447  -0.428  0.129   0.894  0.892  0.059   0.0292  0.0303  0.0059  1.000
29    -0.669  -0.655  0.098   0.743  0.746  0.067   0.0163  0.0176  0.0043  0.895
30    -0.669  -0.666  0.039   0.743  0.744  0.035   0.0290  0.0301  0.0060  0.999
31    -0.669  -0.668  0.019   0.743  0.743  0.017   0.0690  0.0695  0.0059  1.000
32     0.707   0.603  0.364   0.707  0.607  0.368   0.0109  0.0125  0.0034  0.478
33     0.707   0.664  0.241   0.707  0.667  0.237   0.0167  0.0179  0.0044  0.903
34     0.707   0.696  0.117   0.707  0.698  0.117   0.0380  0.0390  0.0065  1.000

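Table 4.5: The true pAUC and the pAUC estimates after biomarker selection. Columns: Case; true pAUC(a*); Ave and SE under Forward selection; Ave and SE under Backward selection.

Case pAUC(a*) Forward-Ave Forward-SE Backward-Ave Backward-SE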

1 0.0050 0.0106 0.0016 0.0114 0.0018

2 0.0088 0.0120 0.0023 0.0129 0.0025

3 0.0123 0.0137 0.0032 0.0150 0.0033

4 0.0245 0.0250 0.0054 0.0248 0.0056

5 0.0093 0.0119 0.0022 0.0128 0.0027

6 0.0127 0.0139 0.0032 0.0143 0.0033

7 0.0249 0.0250 0.0057 0.0242 0.0061

8 0.0164 0.0119 0.0026 0.0095 0.0027

9 0.0204 0.0145 0.0049 0.0118 0.0038

10 0.0333 0.0305 0.0091 0.0123 0.0095

11 0.0367 0.0149 0.0096 0.0085 0.0032

12 0.0422 0.0203 0.0141 0.0099 0.0048

13 0.0567 0.0526 0.0139 0.0075 0.0082

14 0.0091 0.0119 0.0023 0.0128 0.0025

15 0.0125 0.0137 0.0032 0.0146 0.0032

16 0.0245 0.0252 0.0056 0.0246 0.0054

17 0.0119 0.0122 0.0028 0.0114 0.0027

18 0.0148 0.0135 0.0032 0.0139 0.0036

19 0.0256 0.0251 0.0056 0.0248 0.0059

20 0.0144 0.0120 0.0028 0.0102 0.0027

21 0.0172 0.0140 0.0039 0.0128 0.0038

22 0.0270 0.0251 0.0059 0.0236 0.0067

23 0.0088 0.0118 0.0021 0.0128 0.0024

24 0.0123 0.0138 0.0032 0.0145 0.0030

25 0.0246 0.0248 0.0056 0.0242 0.0057

26 0.0095 0.0119 0.0024 0.0122 0.0030

27 0.0138 0.0140 0.0034 0.0138 0.0036

28 0.0292 0.0276 0.0080 0.0180 0.0099

29 0.0163 0.0125 0.0039 0.0092 0.0031

30 0.0290 0.0172 0.0093 0.0100 0.0040

31 0.0690 0.0628 0.0192 0.0077 0.0101

32 0.0109 0.0123 0.0025 0.0131 0.0025

33 0.0167 0.0159 0.0047 0.0157 0.0044

34 0.0380 0.0387 0.0071 0.0387 0.0070

Table 4.6: Biomarker selection: Forward method vs. Backward method over 1000 replications.

                      Forward method                    Backward method
Case  a1      a2      (c1,c2)  (1,0)  (0,1)  (0,0)     (c1,c2)  (1,0)  (0,1)  (0,0)
 1     0.000  0.000   0.001    0.036  0.051  0.912     0.000    0.019  0.024  0.957
 2     0.000  1.000   0.002    0.040  0.416  0.542     0.001    0.042  0.228  0.729
 3     0.000  1.000   0.005    0.012  0.799  0.184     0.003    0.031  0.597  0.369
 4     0.000  1.000   0.015    0.000  0.984  0.001     0.007    0.006  0.986  0.001
 5    -0.383  0.924   0.000    0.036  0.424  0.540     0.000    0.046  0.233  0.721
 6    -0.276  0.961   0.004    0.020  0.802  0.174     0.003    0.040  0.592  0.365
 7    -0.157  0.988   0.034    0.002  0.964  0.000     0.009    0.025  0.966  0.000
 8    -0.645  0.765   0.008    0.027  0.412  0.553     0.001    0.231  0.675  0.093
 9    -0.606  0.795   0.077    0.008  0.713  0.202     0.009    0.167  0.819  0.005
10    -0.519  0.855   0.622    0.000  0.377  0.001     0.037    0.614  0.349  0.000
11    -0.692  0.722   0.062    0.013  0.380  0.545     0.006    0.300  0.694  0.000
12    -0.682  0.731   0.200    0.001  0.593  0.206     0.012    0.337  0.651  0.000
13    -0.657  0.754   0.898    0.000  0.102  0.000     0.027    0.876  0.097  0.000
14     0.321  0.947   0.002    0.034  0.416  0.548     0.002    0.044  0.220  0.734
15     0.195  0.981   0.003    0.030  0.779  0.188     0.002    0.038  0.567  0.393
16     0.061  0.998   0.024    0.000  0.976  0.000     0.012    0.005  0.980  0.003
17     0.563  0.826   0.013    0.030  0.430  0.527     0.001    0.074  0.430  0.495
18     0.467  0.884   0.015    0.019  0.769  0.197     0.006    0.057  0.736  0.201
19     0.239  0.971   0.027    0.000  0.973  0.000     0.015    0.020  0.964  0.001
20     0.604  0.797   0.011    0.023  0.417  0.549     0.004    0.149  0.639  0.208
21     0.529  0.848   0.034    0.006  0.775  0.185     0.012    0.086  0.825  0.077
22     0.324  0.946   0.073    0.000  0.926  0.001     0.025    0.059  0.915  0.001
23    -0.099  0.995   0.000    0.031  0.430  0.539     0.001    0.043  0.216  0.740
24    -0.099  0.995   0.007    0.021  0.784  0.188     0.004    0.032  0.581  0.383
25    -0.099  0.995   0.021    0.000  0.979  0.000     0.013    0.017  0.968  0.002
26    -0.447  0.894   0.008    0.021  0.424  0.547     0.000    0.064  0.285  0.651
27    -0.447  0.894   0.011    0.007  0.788  0.194     0.002    0.066  0.663  0.269
28    -0.447  0.894   0.355    0.000  0.645  0.000     0.038    0.330  0.632  0.000
29    -0.669  0.743   0.030    0.000  0.394  0.576     0.006    0.242  0.647  0.105
30    -0.669  0.743   0.189    0.001  0.613  0.195     0.011    0.340  0.648  0.001
31    -0.669  0.743   0.884    0.000  0.115  0.001     0.026    0.891  0.083  0.000
32     0.707  0.707   0.011    0.304  0.367  0.318     0.005    0.234  0.239  0.522
33     0.707  0.707   0.165    0.391  0.408  0.036     0.113    0.402  0.388  0.097
34     0.707  0.707   0.965    0.014  0.021  0.000     0.964    0.017  0.019  0.000

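Table 4.7: The true pAUC and the pAUC estimates after biomarker selection for the specificity range (0.9, 1), based on the multivariate t distribution with 3 degrees of freedom. Columns: mean differences ∆1 and ∆2 and correlations ρ0 and ρ1 of the population X; true pAUC(a*); Ave and SE under Forward selection; Ave and SE under Backward selection.

∆1 ∆2 ρ0 ρ1 pAUC(a*) Forward-Ave Forward-SE Backward-Ave Backward-SE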

0.0 0.3 0.5 0.5 0.0070 0.0159 0.0057 0.0143 0.0062

0.0 0.5 0.5 0.5 0.0088 0.0165 0.0058 0.0140 0.0063

0.0 1.0 0.5 0.5 0.0160 0.0205 0.0074 0.0166 0.0083

0.0 0.3 0.5 0.0 0.0116 0.0175 0.0069 0.0108 0.0066

0.0 0.5 0.5 0.0 0.0136 0.0187 0.0079 0.0121 0.0075

0.0 1.0 0.5 0.0 0.0206 0.0224 0.0089 0.0147 0.0093

0.0 0.3 0.0 0.5 0.0088 0.0166 0.0058 0.0137 0.0062

0.0 0.5 0.0 0.5 0.0101 0.0168 0.0060 0.0143 0.0066

0.0 1.0 0.0 0.5 0.0150 0.0197 0.0067 0.0178 0.0073

0.3 0.3 0.0 0.0 0.0076 0.0158 0.0051 0.0149 0.0055

0.5 0.5 0.0 0.0 0.0101 0.0169 0.0059 0.0158 0.0063

1.0 1.0 0.0 0.0 0.0206 0.0241 0.0086 0.0225 0.0094

Table 4.8: The true pAUC, the estimated pAUC, and the power of the global test using the full biomarker set, and the estimated pAUC using the reduced biomarker set after biomarker selection, based on 1000 replications for three and four biomarkers.

                                       Full biomarker set                  Forward         Backward
Case  ∆1   ∆2   ∆3   ∆4   ρ0   ρ1     True    Ave     SE      Power(Tg)   Ave     SE      Ave     SE
p = 3
 1    0.0  0.0  0.0  -    0.0  0.0    0.0050  0.0099  0.0022  0.052       0.0107  0.0018  0.0126  0.0018
 2    0.5  0.0  0.0  -    0.0  0.0    0.0123  0.0147  0.0036  0.576       0.0137  0.0033  0.0149  0.0032
 3    0.5  0.5  0.0  -    0.0  0.0    0.0167  0.0187  0.0044  0.870       0.0156  0.0047  0.0157  0.0045
 4    0.5  0.5  0.5  -    0.0  0.0    0.0207  0.0225  0.0050  0.977       0.0180  0.0061  0.0182  0.0066
 5    0.5  1.0  0.0  -    0.0  0.0    0.0281  0.0301  0.0059  1.000       0.0277  0.0070  0.0270  0.0077
 6    0.5  0.5  0.0  -    0.1  0.1    0.0159  0.0182  0.0042  0.845       0.0154  0.0044  0.0159  0.0042
 7    0.5  0.5  0.0  -    0.5  0.5    0.0138  0.0161  0.0039  0.713       0.0150  0.0040  0.0160  0.0035
 8    0.5  0.5  0.0  -    0.9  0.9    0.0125  0.0151  0.0038  0.616       0.0142  0.0035  0.0156  0.0035
 9    0.5  1.0  0.0  -    0.1  0.1    0.0268  0.0289  0.0060  0.998       0.0267  0.0062  0.0266  0.0070
10    0.5  1.0  0.0  -    0.5  0.5    0.0245  0.0263  0.0056  0.995       0.0250  0.0055  0.0249  0.0059
11    0.5  1.0  0.0  -    0.9  0.9    0.0360  0.0375  0.0063  1.000       0.0342  0.0095  0.0208  0.0171
p = 4
 1    0.0  0.0  0.0  0.0  0.0  0.0    0.0050  0.0111  0.0021  0.051       0.0107  0.0016  0.0118  0.0020
 2    0.5  0.0  0.0  0.0  0.0  0.0    0.0123  0.0157  0.0036  0.520       0.0139  0.0036  0.0150  0.0034
 3    0.5  0.5  0.0  0.0  0.0  0.0    0.0167  0.0197  0.0044  0.862       0.0157  0.0045  0.0160  0.0048
 4    0.5  1.0  0.0  0.0  0.0  0.0    0.0281  0.0306  0.0060  1.000       0.0275  0.0070  0.0263  0.0083
 5    0.5  1.0  1.0  0.0  0.0  0.0    0.0410  0.0433  0.0065  1.000       0.0418  0.0075  0.0420  0.0075
 6    0.5  1.0  0.0  0.0  0.1  0.1    0.0268  0.0291  0.0056  0.995       0.0267  0.0065  0.0260  0.0067
 7    0.5  1.0  0.0  0.0  0.5  0.5    0.0245  0.0272  0.0054  0.993       0.0251  0.0057  0.0246  0.0062
 8    0.5  1.0  0.0  0.0  0.9  0.9    0.0360  0.0380  0.0064  1.000       0.0351  0.0094  0.0212  0.0172
 9    0.5  1.0  1.0  0.0  0.1  0.1    0.0400  0.0422  0.0066  1.000       0.0405  0.0075  0.0407  0.0076
10    0.5  1.0  1.0  0.0  0.5  0.5    0.0380  0.0400  0.0063  1.000       0.0394  0.0071  0.0386  0.0075
11    0.5  1.0  1.0  0.0  0.9  0.9    0.0477  0.0497  0.0065  1.000       0.0472  0.0094  0.0490  0.0076

Table 4.9: The true value, empirical mean, and standard error of the estimated â_n based on 1000 replications.

      a1                      a2                      a3                      a4
Case  True    Ave     SE      True    Ave     SE      True    Ave     SE      True    Ave     SE
p = 3
1     0.000   0.015   0.578   0.000   0.003   0.577   0.000   0.008   0.578   -       -       -
2     1.000   0.803   0.207   0.000   -0.002  0.378   0.000   -0.017  0.412   -       -       -
3     0.707   0.613   0.241   0.707   0.637   0.233   0.000   0.023   0.324   -       -       -
4     0.578   0.534   0.223   0.578   0.536   0.225   0.578   0.528   0.221   -       -       -
5     0.440   0.420   0.181   0.898   0.861   0.097   0.000   -0.001  0.201   -       -       -
6     0.706   0.620   0.254   0.708   0.614   0.262   0.000   0.007   0.325   -       -       -
7     0.706   0.553   0.359   0.708   0.523   0.376   0.000   -0.011  0.389   -       -       -
8     0.709   0.395   0.536   0.705   0.384   0.533   0.000   0.002   0.354   -       -       -
9     0.371   0.367   0.202   0.928   0.876   0.103   0.000   -0.001  0.216   -       -       -
10    0.000   0.032   0.256   1.000   0.937   0.077   0.000   0.009   0.221   -       -       -
11    -0.584  -0.578  0.053   0.812   0.811   0.035   0.000   0.005   0.069   -       -       -
p = 4
1     0.000   0.022   0.494   0.000   0.026   0.499   0.000   0.025   0.514   0.000   -0.031  0.491
2     1.000   0.751   0.205   0.000   0.003   0.359   0.000   -0.011  0.369   0.000   -0.014  0.358
3     0.707   0.614   0.226   0.707   0.595   0.238   0.000   -0.006  0.290   0.000   0.000   0.279
4     0.440   0.400   0.200   0.898   0.801   0.199   0.000   -0.001  0.196   0.000   -0.012  0.194
5     0.330   0.333   0.138   0.657   0.642   0.117   0.678   0.639   0.114   0.000   0.004   0.154
6     0.371   0.355   0.200   0.928   0.857   0.102   0.000   0.005   0.211   0.000   -0.001  0.212
7     0.000   0.028   0.253   1.000   0.911   0.094   0.000   -0.009  0.220   0.000   -0.004  0.221
8     -0.584  -0.579  0.053   0.812   0.807   0.034   0.000   0.002   0.073   0.000   -0.001  0.068
9     0.274   0.277   0.154   0.668   0.639   0.117   0.692   0.663   0.112   0.000   0.003   0.159
10    -0.001  0.021   0.186   0.717   0.672   0.125   0.697   0.675   0.117   0.000   0.000   0.170
11    -0.554  -0.553  0.058   0.787   0.774   0.033   0.273   0.281   0.084   0.000   -0.003  0.067
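As a reading aid for the Ave and SE columns of Table 4.9, the following Python sketch shows one way such summaries can be computed from replicate estimates. It assumes that SE denotes the sample standard deviation of the estimates across replications, which the table does not state explicitly. The function name summarize_replications and the simulated input are hypothetical placeholders, not the estimator or the data used in this study.

```python
import numpy as np


def summarize_replications(estimates):
    """Column-wise empirical mean and sample standard deviation.

    `estimates` has one row per replication and one column per coefficient.
    Assumption: the tabulated SE is the sample SD across replications.
    """
    estimates = np.asarray(estimates, dtype=float)
    ave = estimates.mean(axis=0)
    se = estimates.std(axis=0, ddof=1)
    return ave, se


# Placeholder data only: noisy copies of a hypothetical direction,
# rescaled to unit length so that each row is a normalized coefficient vector.
rng = np.random.default_rng(0)
true_a = np.array([0.707, 0.707, 0.0])
raw = true_a + rng.normal(scale=0.3, size=(1000, 3))
estimates = raw / np.linalg.norm(raw, axis=1, keepdims=True)

ave, se = summarize_replications(estimates)
print("Ave:", np.round(ave, 3))
print("SE: ", np.round(se, 3))
```

Passing the 1000 estimated vectors of a given case as rows would then give one Ave and one SE per coefficient, in the layout of the table.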


Table 4.10: The proportions of the outcomes of the two biomarker selection methods among 1000 replications in the three-dimensional case (p = 3).

I. Forward selection

Case (0,0,0) (1,0,0) (0,1,0) (0,0,1) (c1,c2,0) (c1,0,c3) (0,c2,c3) (c1,c2,c3)

1 0.848 0.057 0.046 0.047 0.000 0.001 0.001 0.000

2 0.194 0.764 0.013 0.023 0.003 0.003 0.000 0.000

3 0.045 0.400 0.430 0.009 0.105 0.002 0.002 0.007

4 0.006 0.228 0.236 0.224 0.075 0.065 0.057 0.109

5 0.000 0.022 0.546 0.000 0.405 0.000 0.012 0.015

6 0.037 0.424 0.410 0.017 0.101 0.003 0.002 0.006

7 0.075 0.413 0.396 0.021 0.087 0.006 0.001 0.001

8 0.149 0.367 0.387 0.010 0.083 0.003 0.000 0.001

9 0.000 0.011 0.680 0.000 0.292 0.000 0.010 0.007

10 0.001 0.002 0.928 0.001 0.053 0.000 0.015 0.000

11 0.000 0.000 0.193 0.000 0.783 0.000 0.001 0.023

II. Backward selection

Case (0,0,0) (1,0,0) (0,1,0) (0,0,1) (c1,c2,0) (c1,0,c3) (0,c2,c3) (c1,c2,c3)

1 0.948 0.014 0.020 0.018 0.000 0.000 0.000 0.000

2 0.424 0.520 0.026 0.025 0.004 0.000 0.000 0.001

3 0.130 0.364 0.386 0.021 0.083 0.002 0.002 0.012

4 0.023 0.219 0.239 0.220 0.050 0.054 0.047 0.148

5 0.000 0.056 0.567 0.003 0.344 0.000 0.008 0.022

6 0.155 0.356 0.362 0.024 0.095 0.002 0.001 0.005

7 0.287 0.309 0.298 0.028 0.064 0.003 0.005 0.006

8 0.384 0.278 0.272 0.021 0.037 0.001 0.003 0.004

9 0.002 0.031 0.666 0.009 0.273 0.000 0.005 0.014

10 0.005 0.007 0.925 0.010 0.044 0.000 0.006 0.003

11 0.000 0.373 0.194 0.000 0.404 0.007 0.000 0.022


Table 4.11: The proportions of the outcomes of the forward selection method among 1000 replications in the four-dimensional case (p = 4).

Case           1      2      3      4      5      6      7      8      9      10     11
(0,0,0,0)      0.808  0.189  0.041  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
(1,0,0,0)      0.047  0.727  0.377  0.026  0.002  0.013  0.009  0.000  0.000  0.000  0.000
(0,1,0,0)      0.043  0.030  0.412  0.582  0.013  0.665  0.904  0.179  0.015  0.018  0.019
(0,0,1,0)      0.045  0.022  0.010  0.000  0.017  0.000  0.000  0.000  0.026  0.019  0.001
(0,0,0,1)      0.054  0.015  0.007  0.000  0.000  0.001  0.001  0.001  0.000  0.000  0.000
(c1,c2,0,0)    0.001  0.005  0.134  0.344  0.004  0.274  0.045  0.770  0.003  0.000  0.006
(c1,0,c3,0)    0.000  0.006  0.002  0.000  0.001  0.000  0.000  0.000  0.005  0.006  0.000
(c1,0,0,c4)    0.000  0.003  0.001  0.000  0.000  0.001  0.000  0.000  0.000  0.000  0.000
(0,c2,c3,0)    0.001  0.000  0.008  0.012  0.404  0.014  0.014  0.000  0.553  0.861  0.141
(0,c2,0,c4)    0.001  0.000  0.004  0.008  0.000  0.011  0.022  0.000  0.000  0.000  0.000
(0,0,c3,c4)    0.000  0.001  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
(c1,c2,c3,0)   0.000  0.000  0.001  0.017  0.523  0.007  0.002  0.025  0.369  0.067  0.798
(c1,c2,0,c4)   0.000  0.001  0.003  0.009  0.000  0.012  0.003  0.023  0.001  0.000  0.000
(c1,0,c3,c4)   0.000  0.001  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
(0,c2,c3,c4)   0.000  0.000  0.000  0.001  0.014  0.002  0.000  0.000  0.015  0.026  0.005
(c1,c2,c3,c4)  0.000  0.000  0.000  0.001  0.022  0.000  0.000  0.002  0.013  0.003  0.030

Table 4.12: The proportions of the outcomes of the backward selection method among 1000 replications in the four-dimensional case (p = 4).

Case           1      2      3      4      5      6      7      8      9      10     11
(0,0,0,0)      0.949  0.480  0.138  0.000  0.000  0.005  0.007  0.000  0.000  0.000  0.000
(1,0,0,0)      0.012  0.441  0.382  0.099  0.001  0.029  0.010  0.366  0.000  0.001  0.003
(0,1,0,0)      0.016  0.018  0.337  0.520  0.014  0.661  0.896  0.191  0.022  0.033  0.002
(0,0,1,0)      0.009  0.026  0.011  0.001  0.010  0.006  0.012  0.001  0.018  0.032  0.001
(0,0,0,1)      0.014  0.026  0.015  0.007  0.001  0.008  0.014  0.000  0.000  0.000  0.000
(c1,c2,0,0)    0.000  0.001  0.086  0.318  0.001  0.257  0.040  0.389  0.002  0.000  0.002
(c1,0,c3,0)    0.000  0.003  0.001  0.000  0.003  0.000  0.000  0.003  0.006  0.002  0.003
(c1,0,0,c4)    0.000  0.004  0.002  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
(0,c2,c3,0)    0.000  0.000  0.002  0.004  0.389  0.003  0.002  0.000  0.518  0.790  0.001
(0,c2,0,c4)    0.000  0.000  0.000  0.002  0.000  0.003  0.004  0.000  0.000  0.000  0.000
(0,0,c3,c4)    0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000  0.000
(c1,c2,c3,0)   0.000  0.000  0.015  0.020  0.534  0.011  0.009  0.031  0.392  0.085  0.943
(c1,c2,0,c4)   0.000  0.001  0.009  0.025  0.000  0.013  0.004  0.017  0.000  0.000  0.000
(c1,0,c3,c4)   0.000  0.000  0.002  0.000  0.000  0.000  0.000  0.000  0.000  0.002  0.000
(0,c2,c3,c4)   0.000  0.000  0.000  0.001  0.016  0.002  0.002  0.000  0.030  0.049  0.000
(c1,c2,c3,c4)  0.000  0.000  0.000  0.003  0.031  0.002  0.000  0.002  0.012  0.006  0.045
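The proportions in Tables 4.10 to 4.12 are frequencies of selection outcomes over the replications. A minimal Python sketch of this tabulation is given below. It assumes that each replication yields the set of selected biomarker indices, and it mirrors the labelling in the tables, where a single selected biomarker is written with a 1 and a combination of several biomarkers with c1, c2, and so on. The function names pattern_label and selection_proportions and the five example outcomes are hypothetical and serve only as an illustration.

```python
from collections import Counter


def pattern_label(selected, p):
    """Label a selected subset of {0, ..., p-1} in the notation of the tables."""
    selected = set(selected)
    if len(selected) == 1:
        # A single selected biomarker is marked with a 1 in its position.
        marks = ["1" if j in selected else "0" for j in range(p)]
    else:
        # Several selected biomarkers are marked with c1, c2, ...;
        # the empty selection gives all zeros.
        marks = [f"c{j + 1}" if j in selected else "0" for j in range(p)]
    return "(" + ",".join(marks) + ")"


def selection_proportions(selections, p):
    """Proportion of replications that end in each selection pattern."""
    counts = Counter(pattern_label(s, p) for s in selections)
    n = len(selections)
    return {label: count / n for label, count in counts.items()}


# Hypothetical outcomes of five replications with p = 3 biomarkers.
selections = [{0}, {0, 1}, set(), {1}, {0, 1}]
props = selection_proportions(selections, p=3)
for label, prop in sorted(props.items()):
    print(label, prop)
```

With 1000 replications per case, each column of Tables 4.11 and 4.12 corresponds to one such dictionary of proportions.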
