Simulation Study
4.3 Two-Biomarker Study
National Chengchi University
In this section, numerical studies of two-biomarker examples are conducted in four parts. The first part investigates which parameter settings result in a bimodal pAUC plot. The second examines the effect of an unbalanced design on our proposed statistical procedures. The third is a power analysis using three different sample sizes. Finally, two tables of critical values for our three proposed statistical tests are presented.
In the first study, we consider three classes of two-biomarker models in which the mean vectors of the two populations are fixed at µ0^T = (0, 0) and µ1^T = (1, 1), respectively, while different covariance structures are applied. Note that in this setting, where both biomarkers share the same mean difference, the reciprocal of the variance reflects the relative importance of a biomarker. We denote by σdj the variance of the j-th biomarker and by ρd the correlation between the two biomarkers in the group D = d. The variances are set up as in the following table. Moreover, the impact of ρd on the pAUC plot is investigated thoroughly, with ρd = ±0.1, ±0.5 and ±0.9. Note that the two examples discussed in Section 2.2 belong to Class I and Class II, respectively.
         Non-diseased     Diseased
Class    σ01     σ02      σ11     σ12
I        1       1        1       1
II       1       1        0.25    1
III      0.25    1        1       1
In each scenario, the pAUC plot introduced in Section 2.2 is produced; see Figures 4.4, 4.5 and 4.6 for Classes I, II and III, respectively. Within each class, the plots are arranged from left to right with increasing ρ0 and from top to bottom with increasing ρ1. When ρ0 < 0, we always obtain a unimodal pAUC plot, so only the plots with ρ0 > 0 are shown. From these figures, we find that a bimodal pAUC plot is more likely when the two biomarkers are more strongly positively correlated in the non-diseased population and more strongly negatively correlated in the diseased population. In such cases the computational difficulty increases drastically. Moreover, in Class I the two modes of the pAUC plot are both global maxima, and hence the solution is not unique. In Classes II and III, a local maximum exists in addition to the global maximum. Class II appears much more difficult to compute, since the local maximum and the global maximum are closer to each other.
In summary, more care in computation is required when ρ0 ≈ 1 and ρ1 ≈ −1. When some biomarkers have about the same marginal effect size, multiple solutions may exist. When a biomarker has a relatively homogeneous diseased population, one should check whether the solution found is truly the global maximum rather than a local maximum.
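The bimodality described above can be explored numerically. The following is a minimal sketch, not the procedure used in this study: it assumes that the pAUC plot traces the pAUC (false positive rate restricted to (0, t), t = 0.1) of the score a'X against the direction angle θ of a = (cos θ, sin θ), and it evaluates the pAUC from the binormal ROC curve by a midpoint rule. The function names `binormal_pauc` and `count_modes` are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()

def binormal_pauc(theta, mu0, mu1, var0, var1, rho0, rho1, t=0.1, n_grid=200):
    """pAUC over FPR in (0, t) of the score a'X, a = (cos theta, sin theta).

    var0 and var1 hold the two variances in the non-diseased and diseased groups;
    a larger score is taken to indicate disease (an assumption of this sketch).
    """
    a1, a2 = math.cos(theta), math.sin(theta)
    shift = a1 * (mu1[0] - mu0[0]) + a2 * (mu1[1] - mu0[1])

    def score_sd(var, rho):
        cov = rho * math.sqrt(var[0] * var[1])
        return math.sqrt(a1 * a1 * var[0] + 2 * a1 * a2 * cov + a2 * a2 * var[1])

    s0, s1 = score_sd(var0, rho0), score_sd(var1, rho1)
    h = t / n_grid
    total = 0.0
    for k in range(n_grid):
        u = (k + 0.5) * h  # midpoint rule avoids the endpoint u = 0
        total += STD_NORMAL.cdf((shift + s0 * STD_NORMAL.inv_cdf(u)) / s1)
    return total * h

def count_modes(values):
    """Count strict local maxima of a curve sampled on a circular grid."""
    n = len(values)
    return sum(
        1 for i in range(n)
        if values[i] > values[i - 1] and values[i] > values[(i + 1) % n]
    )

# Class I with rho0 = 0.9 and rho1 = -0.9, scanned over the full circle of directions.
angles = [-math.pi + 2 * math.pi * k / 360 for k in range(360)]
curve = [binormal_pauc(th, (0, 0), (1, 1), (1, 1), (1, 1), 0.9, -0.9) for th in angles]
print("number of modes:", count_modes(curve))
```

In Class I the two biomarkers are exchangeable, so the curve is symmetric about θ = π/4. With ρ0 = 0.9 and ρ1 = −0.9, the equal-weight direction θ = π/4 gives a smaller pAUC than either single biomarker, which is consistent with the two equally high modes reported for this class.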
Next, to explore the effect of sample allocation on our proposed statistical procedures, further simulations with a fixed total sample size n0 + n1 = 200 are conducted. In addition to the balanced design of Section 4.2, two unbalanced designs are used: (n0, n1) = (50, 150) and (n0, n1) = (150, 50). For simplicity, only the two-dimensional cases of Section 4.2 with ∆2 = 0.5 are considered. One thousand replicates are generated in each case. The results for the balanced design are presented in Tables 4.4-4.6. Note that the following comparison does not take the sampling cost into account; that is, we assume that sampling a diseased subject costs the same as sampling a non-diseased subject.
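As an illustration of how one replicate under a given allocation could be generated, the sketch below draws the two groups from bivariate normal distributions using a Cholesky-type construction. Unit variances in both groups are an assumption of this sketch, and the function names `draw_bivariate_normal` and `simulate_design` are hypothetical.

```python
import math
import random

def draw_bivariate_normal(n, mean, rho, rng):
    """Draw n pairs with unit variances (assumed) and correlation rho."""
    scale = math.sqrt(1.0 - rho * rho)
    sample = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # (z1, rho * z1 + scale * z2) has unit variances and correlation rho.
        sample.append((mean[0] + z1, mean[1] + rho * z1 + scale * z2))
    return sample

def simulate_design(n0, n1, delta, rho0, rho1, seed=0):
    """One replicate: n0 non-diseased subjects at mean 0, n1 diseased at mean delta."""
    rng = random.Random(seed)
    non_diseased = draw_bivariate_normal(n0, (0.0, 0.0), rho0, rng)
    diseased = draw_bivariate_normal(n1, delta, rho1, rng)
    return non_diseased, diseased

# The two unbalanced allocations, with Delta1 = 0, Delta2 = 0.5, rho0 = 0.5, rho1 = 0.
for n0, n1 in [(50, 150), (150, 50)]:
    x0, x1 = simulate_design(n0, n1, (0.0, 0.5), 0.5, 0.0, seed=1)
    print(n0, n1, len(x0) + len(x1))
```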
The corresponding empirical powers of the global test are listed in Table 4.13. The type I error rate of the test is well controlled under all three designs. In the alternative scenarios, the balanced design performs at least as well as, and usually better than, the two unbalanced designs. Between the two unbalanced designs, the one with the larger non-diseased sample has the higher power.
The following study focuses on biomarker selection. The average and the mean square error (MSE) of the estimated maximal pAUC over 1000 replications are reported in Table 4.14 for the full data set and for the reduced data sets obtained from each of the two proposed biomarker selection procedures. Recall that in Section 4.2 we presented the SE of the maximal pAUC rather than its MSE. Because the estimated pAUC is observed to be biased in finite samples, the MSE is reported here to take both the bias and the variation into account. The computed MSEs are also plotted in Figure 4.7 by the data set used. In summary, the findings are similar to those for the global test: the balanced design dominates the two unbalanced designs in general, and collecting more non-diseased subjects has a slightly better effect than collecting more diseased subjects.
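The MSE combines the two sources of error through the identity MSE = bias² + variance. The sketch below computes these summaries from a list of replicate estimates; the function name `summarize_estimates` and the toy numbers are illustrative only and are not taken from the simulations.

```python
def summarize_estimates(estimates, true_value):
    """Average, bias, variance and MSE of replicate estimates of a known target."""
    n = len(estimates)
    average = sum(estimates) / n
    bias = average - true_value
    variance = sum((e - average) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return {"average": average, "bias": bias, "variance": variance, "mse": mse}

# Toy replicate estimates around a true value of 0.010.
summary = summarize_estimates([0.010, 0.012, 0.008, 0.014], true_value=0.010)
print(summary)
```

As a rough reading of Table 4.14, taking the tabulated values at face value: in Case 1 with (n0, n1) = (50, 150), the full-data average 0.0092 against the true value 0.0050 gives a squared bias of about 0.18 × 10^−4, a large part of the reported MSE of 0.253 × 10^−4.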
Because the evaluation of the pAUC depends on both the non-diseased and the diseased populations, both distributions must be estimated well, which is why a balanced design is better than an unbalanced design in our study. On the other hand, since the pAUC considered here is restricted to high specificity, estimating the non-diseased distribution becomes relatively more important than estimating the diseased distribution. Consequently, when an unbalanced design is used, collecting more non-diseased subjects results in better performance than collecting more diseased subjects.
Our proposed statistical procedures lack explicit formulations and require many complex computations, so finding a suitable or optimal sampling ratio of the non-diseased group to the diseased group prior to the experiment is a difficult research problem. Fortunately, this empirical study suggests that the balanced design is a reasonable choice. In real applications, the cost of sampling a diseased subject usually differs from that of sampling a non-diseased subject, and different conclusions may be reached when the cost is taken into account.
Note that the MSEs in Cases 12 and 30 appear exceptionally high in Figure 4.7. This is partly because the true maximal pAUC values in these two cases are larger than in the others. In addition, as discussed in Section 4.2, the two biomarker selection procedures perform unsatisfactorily in Case 12, owing either to the lack of power of the marginal test used in the Forward procedure or to the deficiency of the Backward procedure.
Next, we investigate the effect of the sample size on the power of the global test in the two-biomarker cases. In addition to the original results, in which the sample size is 100 in each group, results for two smaller sample sizes, 20 and 50 per group, are presented in Table 4.15. One thousand replicates are generated in each case. From this table, we find that the type I error rate exceeds the significance level when the sample size is 20 (0.058 in Case 1). However, when the sample size increases to 50, the type I error rate is controlled. Given ∆2 = 1.0, the global test with n0 = n1 = 50 achieves at least 80% power in every case listed in Table 4.15 (the lowest power is 0.913). Moreover, when the correlation between the biomarkers in the non-diseased group is high, not only does the true maximal pAUC become large, but the power of the global test also increases. Hence, in this situation, the required sample size can be reduced.
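When reading empirical rejection rates based on 1000 replicates, it can help to keep their Monte Carlo uncertainty in mind. The sketch below uses the usual binomial standard error; it is offered as a rough gauge only, and the function name `monte_carlo_se` is hypothetical.

```python
import math

def monte_carlo_se(rate, n_replicates):
    """Binomial standard error of an empirical rejection rate."""
    return math.sqrt(rate * (1.0 - rate) / n_replicates)

# Standard error of an empirical rate whose true value is 0.05, over 1000 replicates.
print(round(monte_carlo_se(0.05, 1000), 4))
```

At a true rate of 0.05 the standard error is roughly 0.007, so differences of a few thousandths between empirical rates in Tables 4.13 and 4.15 are of the same order as simulation noise.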
Finally, for practical purposes, we provide critical values for the three proposed statistical tests at the significance levels α = 0.01, 0.05 and 0.1 with t = 0.1, for three sample sizes (20, 50 and 100 per group), based on 10^5 replicates. The critical values for the global and the marginal discriminatory power tests are reported in Table 4.16. For the marginal test, the non-diseased and the diseased data are both generated from a standard normal distribution. For the global discriminatory power test, the two groups of biomarkers are generated from the same multivariate normal distribution with mean µp = 0 and covariance matrix
\[
\Sigma_p = \begin{pmatrix} 1 & \rho_p \\ \rho_p & 1 \end{pmatrix}.
\]
Four values of ρp are used: 0.0, 0.1, 0.5 and 0.9. The critical values d1 and d2 for testing the conditional discriminatory power of the first biomarker correspond to the 100×(1−α/2)th and 100×(α/2)th percentiles, respectively, of its coefficient in the optimal linear combination under the null scenario, among 10^5 replicates. The results are presented in Table 4.17. Consider µ0 = 0, µ1 = ∆µ = (∆1, ∆2)^T with ∆2 ≠ 0, and Σ0 = Σ1 = Σp. In this setting, ∆1 = ρp ∆2 implies that the first biomarker has no conditional discriminatory power given the second biomarker. For ∆2 = 0.5, 1.0 and ρp = 0.0, 0.1, 0.5, 0.9, the corresponding ∆1 is obtained from this relation. Note that, for simplicity, we propose to find the critical values under the scenario ρp = 0. When ρp = 0, d1 and d2 are almost equal in absolute value but opposite in sign. However, when ρp is not zero, the distribution of the coefficient of the first biomarker under ∆1 = ρp ∆2 is not symmetric about zero; in addition, the null distribution is shifted to the right. This means that when the correlation is positive, the proposed test, which assumes zero correlation when finding the critical values, tends to produce an inflated type I error rate and hence a high power. Nevertheless, in our previous simulation results the type I error rate is reasonably well controlled.
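The relation ∆1 = ρp ∆2 can be checked directly. The short derivation below assumes, as is standard for two normal populations with a common covariance matrix (and not restated in this section), that the optimal linear combination is proportional to Σp^{-1}∆µ:

```latex
\Sigma_p^{-1}\Delta\mu
= \frac{1}{1-\rho_p^{2}}
  \begin{pmatrix} 1 & -\rho_p \\ -\rho_p & 1 \end{pmatrix}
  \begin{pmatrix} \Delta_1 \\ \Delta_2 \end{pmatrix}
= \frac{1}{1-\rho_p^{2}}
  \begin{pmatrix} \Delta_1 - \rho_p \Delta_2 \\ \Delta_2 - \rho_p \Delta_1 \end{pmatrix}.
```

The coefficient of the first biomarker is therefore zero exactly when ∆1 = ρp ∆2. For example, with ∆2 = 1.0 and ρp = 0.5 the null value is ∆1 = 0.5.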
[Figure 4.7 has three panels: Full Data, Forward Selection and Backward Selection. Each panel plots the MSE (×10^−4) against the case number (1, 3, 6, ..., 30) for the designs (n0, n1) = (50, 150), (150, 50) and (100, 100).]
Figure 4.7: Case versus MSE (×10^−4).
Table 4.13: Effect of the sample allocation (n0, n1) on the empirical power of the global test, over 1000 replications.
(n0, n1)
Case (50,150) (150,50) (100,100)
1 0.043 0.039 0.043
3 0.444 0.538 0.631
6 0.475 0.595 0.635
9 0.930 0.947 0.995
12 1.000 1.000 1.000
15 0.471 0.550 0.607
18 0.621 0.659 0.799
21 0.783 0.851 0.923
24 0.482 0.551 0.617
27 0.593 0.684 0.731
30 0.992 0.998 0.999
Table 4.14: Effect of the sample allocation (n0, n1) on the estimated maximal pAUC for unbalanced data, over 1000 replications. Ave is the average estimate; MSE is in units of 10^−4, as in Figure 4.7.
                (n0, n1) = (50, 150)                               (n0, n1) = (150, 50)
                Full Data        Forward          Backward         Full Data        Forward          Backward
Case  pAUC(a∗)  Ave        MSE   Ave        MSE   Ave        MSE   Ave        MSE   Ave        MSE   Ave        MSE
1     0.0050    0.0092   0.253   0.0120   2.740   0.0148   6.584   0.0087   0.192   0.0112   2.710   0.0128   6.792
3     0.0123    0.0145   0.258   0.0152   0.809   0.0167   2.250   0.0139   0.185   0.0141   0.691   0.0156   1.500
6     0.0127    0.0148   0.252   0.0151   0.943   0.0163   2.087   0.0143   0.196   0.0145   0.768   0.0153   1.277
9     0.0204    0.0214   0.272   0.0153   2.672   0.0129   1.087   0.0210   0.258   0.0152   2.016   0.0122   1.042
12    0.0422    0.0428   0.186   0.0197  14.696   0.0109  10.057   0.0424   0.339   0.0198  13.854   0.0104  10.327
15    0.0125    0.0145   0.241   0.0151   0.800   0.0162   2.027   0.0140   0.187   0.0143   0.720   0.0155   1.472
18    0.0148    0.0164   0.244   0.0150   1.045   0.0154   1.518   0.0152   0.199   0.0142   0.983   0.0143   1.270
21    0.0172    0.0178   0.199   0.0154   1.758   0.0139   1.095   0.0178   0.220   0.0145   1.250   0.0135   0.814
24    0.0123    0.0146   0.247   0.0151   0.819   0.0165   1.931   0.0139   0.176   0.0143   0.651   0.0153   1.422
27    0.0138    0.0160   0.296   0.0154   1.124   0.0153   1.518   0.0154   0.197   0.0144   0.850   0.0142   1.003
30    0.0290    0.0308   0.620   0.0172   6.618   0.0114   3.394   0.0298   0.370   0.0169   5.456   0.0109   3.504
Table 4.15: Effect of the sample size on the empirical power of the global test for balanced data, over 1000 replications.
Mean Difference Correlation (n0, n1)
Case ∆1 ∆2 ρ0 ρ1 (20,20) (50,50) (100,100)
1 0.0 0.0 0.0 0.0 0.058 0.044 0.043
2 0.0 0.3 0.0 0.0 0.104 0.157 0.271
3 0.0 0.5 0.0 0.0 0.182 0.357 0.631
4 0.0 1.0 0.0 0.0 0.547 0.916 0.999
5 0.0 0.3 0.1 0.0 0.085 0.168 0.279
6 0.0 0.5 0.1 0.0 0.203 0.395 0.635
7 0.0 1.0 0.1 0.0 0.566 0.941 1.000
8 0.0 0.3 0.5 0.0 0.229 0.606 0.907
9 0.0 0.5 0.5 0.0 0.345 0.793 0.995
10 0.0 1.0 0.5 0.0 0.776 0.995 1.000
11 0.0 0.3 0.9 0.0 0.957 1.000 1.000
12 0.0 0.5 0.9 0.0 0.979 1.000 1.000
13 0.0 1.0 0.9 0.0 1.000 1.000 1.000
14 0.0 0.3 0.0 0.1 0.104 0.150 0.266
15 0.0 0.5 0.0 0.1 0.173 0.350 0.607
16 0.0 1.0 0.0 0.1 0.554 0.913 0.997
17 0.0 0.3 0.0 0.5 0.117 0.255 0.505
18 0.0 0.5 0.0 0.5 0.172 0.482 0.799
19 0.0 1.0 0.0 0.5 0.567 0.937 0.999
20 0.0 0.3 0.0 0.9 0.161 0.451 0.792
21 0.0 0.5 0.0 0.9 0.228 0.613 0.923
22 0.0 1.0 0.0 0.9 0.590 0.948 0.999
23 0.0 0.3 0.1 0.1 0.096 0.172 0.260
24 0.0 0.5 0.1 0.1 0.186 0.376 0.617
25 0.0 1.0 0.1 0.1 0.556 0.921 0.998
26 0.0 0.3 0.5 0.5 0.108 0.197 0.349
27 0.0 0.5 0.5 0.5 0.220 0.488 0.731
28 0.0 1.0 0.5 0.5 0.690 0.970 1.000
29 0.0 0.3 0.9 0.9 0.305 0.621 0.895
30 0.0 0.5 0.9 0.9 0.710 0.968 0.999
31 0.0 1.0 0.9 0.9 1.000 1.000 1.000
32 0.3 0.3 0.0 0.0 0.139 0.273 0.478
33 0.5 0.5 0.0 0.0 0.293 0.647 0.903
34 1.0 1.0 0.0 0.0 0.868 0.998 1.000
Table 4.16: Critical values of the optimal pAUC, taken as the 100×(1−α)th percentile among 10^5 replicates.
                        ρp
p   α      n0    n1     0.0      0.1      0.5      1.0
1   0.01   20    20     0.0272   -        -        -
1   0.01   50    50     0.0166   -        -        -
1   0.01   100   100    0.0124   -        -        -
1   0.05   20    20     0.0201   -        -        -
1   0.05   50    50     0.0131   -        -        -
1   0.05   100   100    0.0102   -        -        -
1   0.1    20    20     0.0169   -        -        -
1   0.1    50    50     0.0115   -        -        -
1   0.1    100   100    0.0093   -        -        -
2   0.01   20    20     0.0337   0.0336   0.0340   0.0336
2   0.01   50    50     0.0200   0.0199   0.0199   0.0198
2   0.01   100   100    0.0144   0.0143   0.0143   0.0143
2   0.05   20    20     0.0268   0.0268   0.0267   0.0267
2   0.05   50    50     0.0164   0.0164   0.0163   0.0163
2   0.05   100   100    0.0122   0.0122   0.0122   0.0122
2   0.1    20    20     0.0232   0.0232   0.0233   0.0234
2   0.1    50    50     0.0146   0.0147   0.0146   0.0147
2   0.1    100   100    0.0112   0.0112   0.0112   0.0112
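The percentile construction behind the p = 1 rows can be sketched as follows. This is only an illustration under stated assumptions: it uses a simple nonparametric pAUC estimator for a single biomarker, which may differ from the estimator used in this study, and far fewer replicates than 10^5, so it is not expected to reproduce the values in Table 4.16. The function names `empirical_pauc` and `null_critical_value` are hypothetical.

```python
import random

def empirical_pauc(non_diseased, diseased, t=0.1):
    """Nonparametric pAUC over FPR in [0, t] for one biomarker (higher = diseased)."""
    n0, n1 = len(non_diseased), len(diseased)
    k = int(t * n0 + 1e-9)  # number of largest non-diseased values within the FPR range
    top = sorted(non_diseased, reverse=True)[:k]
    wins = sum(1 for x0 in top for x1 in diseased if x1 > x0)
    return wins / (n0 * n1)

def null_critical_value(n0, n1, alpha, t=0.1, n_rep=2000, seed=0):
    """100(1 - alpha)th percentile of the statistic when both groups are N(0, 1)."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_rep):
        x0 = [rng.gauss(0.0, 1.0) for _ in range(n0)]
        x1 = [rng.gauss(0.0, 1.0) for _ in range(n1)]
        stats.append(empirical_pauc(x0, x1, t))
    stats.sort()
    index = min(n_rep - 1, int((1.0 - alpha) * n_rep))
    return stats[index]

for alpha in (0.01, 0.05, 0.1):
    print(alpha, null_critical_value(20, 20, alpha))
```

An observed statistic larger than the returned value would lead to rejection at level α; a smaller α gives a critical value that is at least as large.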
Table 4.17: The critical values d1 and d2 of the coefficient in the optimal linear combination are the 100×(1−α/2)th and 100×(α/2)th percentiles among 10^5 replicates.
α      n0    n1    ∆2     ρp    d2        d1
0.01   20    20    0.50   0.0   -0.9992   0.9999
0.01   20    20    0.50   0.1   -0.9989   0.9999
0.01   20    20    0.50   0.5   -0.9957   0.9999
0.01   20    20    0.50   0.9   -0.9317   1.0000
0.01   20    20    1.00   0.0   -0.9815   0.9888
0.01   20    20    1.00   0.1   -0.9702   0.9934
0.01   20    20    1.00   0.5   -0.8548   0.9997
0.01   20    20    1.00   0.9   -0.7260   1.0000
0.01   50    50    0.50   0.0   -0.9963   0.9985
0.01   50    50    0.50   0.1   -0.9933   0.9991
0.01   50    50    0.50   0.5   -0.9438   0.9999
0.01   50    50    0.50   0.9   -0.7840   1.0000
0.01   50    50    1.00   0.0   -0.8095   0.8061
0.01   50    50    1.00   0.1   -0.7686   0.8546
0.01   50    50    1.00   0.5   -0.6611   0.9843
0.01   50    50    1.00   0.9   -0.6322   1.0000
0.01   100   100   0.50   0.0   -0.9674   0.9701
0.01   100   100   0.50   0.1   -0.9344   0.9843
0.01   100   100   0.50   0.5   -0.8103   0.9996
0.01   100   100   0.50   0.9   -0.7024   1.0000
0.01   100   100   1.00   0.0   -0.6102   0.6072
0.01   100   100   1.00   0.1   -0.5761   0.6406
0.01   100   100   1.00   0.5   -0.5106   0.8358
0.01   100   100   1.00   0.9   -0.5563   1.0000
0.05   20    20    0.50   0.0   -0.9869   0.9919
0.05   20    20    0.50   0.1   -0.9827   0.9936
0.05   20    20    0.50   0.5   -0.9308   0.9976
0.05   20    20    0.50   0.9   -0.7876   0.9983
0.05   20    20    1.00   0.0   -0.8718   0.8780
0.05   20    20    1.00   0.1   -0.7284   0.8092
0.05   20    20    1.00   0.5   -0.7124   0.9897
0.05   20    20    1.00   0.9   -0.6583   0.9989
Table 4.17 (continued).
α      n0    n1    ∆2     ρp    d2        d1
0.05   50    50    0.50   0.0   -0.9429   0.9519
0.05   50    50    0.50   0.1   -0.9204   0.9714
0.05   50    50    0.50   0.5   -0.7988   0.9954
0.05   50    50    0.50   0.9   -0.6997   0.9998
0.05   50    50    1.00   0.0   -0.6315   0.6343
0.05   50    50    1.00   0.1   -0.5154   0.5698
0.05   50    50    1.00   0.5   -0.5370   0.8652
0.05   50    50    1.00   0.9   -0.5720   0.9987
0.05   100   100   0.50   0.0   -0.8222   0.8326
0.05   100   100   0.50   0.1   -0.7880   0.8709
0.05   100   100   0.50   0.5   -0.6786   0.9844
0.05   100   100   0.50   0.9   -0.6405   1.0000
0.05   100   100   1.00   0.0   -0.4637   0.4684
0.05   100   100   1.00   0.1   -0.3789   0.4090
0.05   100   100   1.00   0.5   -0.4104   0.6353
0.05   100   100   1.00   0.9   -0.4921   0.9947
0.1    20    20    0.50   0.0   -0.9535   0.9651
0.1    20    20    0.50   0.1   -0.9419   0.9741
0.1    20    20    0.50   0.5   -0.8497   0.9902
0.1    20    20    0.50   0.9   -0.7309   0.9919
0.1    20    20    1.00   0.0   -0.7622   0.7694
0.1    20    20    1.00   0.1   -0.8376   0.9089
0.1    20    20    1.00   0.5   -0.6302   0.9563
0.1    20    20    1.00   0.9   -0.6189   0.9945
0.1    50    50    0.50   0.0   -0.8654   0.8790
0.1    50    50    0.50   0.1   -0.8269   0.9044
0.1    50    50    0.50   0.5   -0.7113   0.9831
0.1    50    50    0.50   0.9   -0.6589   0.9954
0.1    50    50    1.00   0.0   -0.5412   0.5362
0.1    50    50    1.00   0.1   -0.6079   0.6737
0.1    50    50    1.00   0.5   -0.4681   0.7443
0.1    50    50    1.00   0.9   -0.5310   0.9929
0.1    100   100   0.50   0.0   -0.7183   0.7227
0.1    100   100   0.50   0.1   -0.6891   0.7677
0.1    100   100   0.50   0.5   -0.5951   0.9407
0.1    100   100   0.50   0.9   -0.6014   0.9965
0.1    100   100   1.00   0.0   -0.3971   0.3897
0.1    100   100   1.00   0.1   -0.4496   0.4879
0.1    100   100   1.00   0.5   -0.3538   0.5202
0.1    100   100   1.00   0.9   -0.4520   0.9752