4. Performance Comparisons of Four Bootstrap Methods
4.2. Error Probability Analysis
The error probability is the proportion of times that we wrongly reject the null hypothesis $H_0: Y_{q1} \ge Y_{q2}$ when $H_0$ is actually true. For the test, we calculate the proportion of times the LCB of $Y_{q2} - Y_{q1}$ is positive and the proportion of times the LCB of $Y_{q2}/Y_{q1}$ is larger than 1. A sample of size $n = 100$ was drawn with $B = 3,000$ bootstrap resamples, and the single simulation was then replicated $N = 3,000$ times.
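The simulation loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in this study: the sample mean of standard normal data stands in for the estimator of $Y_q$ (an assumption), the lower bound is a simple percentile-style bound, and the function names `percentile_lcb` and `error_probability` are hypothetical.

```python
import random
from statistics import fmean


def percentile_lcb(x1, x2, n_boot, alpha, rng):
    """Percentile-style bootstrap lower bound for mean(x2) - mean(x1).

    The sample mean is only a stand-in for the estimator of Y_q (assumption).
    """
    diffs = []
    for _ in range(n_boot):
        r1 = rng.choices(x1, k=len(x1))  # resample with replacement
        r2 = rng.choices(x2, k=len(x2))
        diffs.append(fmean(r2) - fmean(r1))
    diffs.sort()
    # Empirical alpha-quantile of the bootstrap differences.
    return diffs[int(alpha * n_boot)]


def error_probability(n, n_boot, n_rep, alpha=0.05, seed=0):
    """Fraction of replications in which the lower bound is positive
    although both samples come from the same distribution (H0 true)."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_rep):
        x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x2 = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if percentile_lcb(x1, x2, n_boot, alpha, rng) > 0:
            rejections += 1
    return rejections / n_rep
```

Calling `error_probability(n=100, n_boot=3000, n_rep=3000)` would mirror the sizes quoted in the text, although pure Python is slow at that scale; smaller values are enough to see the mechanics. The ratio test would follow the same pattern, comparing a lower bound for the ratio with 1.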
Figures 2 and 3 show the error probability of the four bootstrap methods for the difference and ratio statistics, respectively, over the 25 combinations (referred to below as the 25 cases) tabulated in Table 3. Usually, a reasonable probability of error selection should not exceed a maximum value $\alpha^*$, called the $\alpha$-condition. The frequency of error selection is a binomial random variable with $N = 3,000$ and $\alpha^* = 0.05$. A 99% confidence interval for the error probability is then

$$\alpha^* \pm Z \sqrt{\alpha^*(1-\alpha^*)/N} = 0.05 \pm 2.576 \times \sqrt{(0.05 \times 0.95)/3000} = 0.05 \pm 0.0103,$$

where $Z = 2.576$ is the standard normal critical value for a two-sided 99% interval. That is, if we set $\alpha^* = 0.05$, the reasonable interval ranges from 0.0397 to 0.0603.
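The interval arithmetic can be checked with a few lines of Python. This is a small sketch using the normal approximation to the binomial proportion; the function name `binomial_proportion_interval` is hypothetical, and the inputs are the values quoted in the text ($\alpha^* = 0.05$, $N = 3,000$, critical value 2.576).

```python
import math


def binomial_proportion_interval(p, n_rep, z):
    """Normal-approximation interval p +/- z * sqrt(p * (1 - p) / n_rep)."""
    half_width = z * math.sqrt(p * (1.0 - p) / n_rep)
    return p - half_width, p + half_width


lower, upper = binomial_proportion_interval(0.05, 3000, 2.576)
print(f"{lower:.4f} to {upper:.4f}")  # prints "0.0397 to 0.0603"
```

Because the half-width shrinks with $1/\sqrt{N}$, quadrupling $N$ would halve the width of the interval.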
Before selecting the parameters of the error test, we tried many different combinations of $N$, $B$ and $n$. Too low a value of $N$ would make the random error significant, and the tendency across the different cases would not be obvious. On the other hand, too high a value of $N$ would make the 99% confidence bound too narrow; cases would then easily fall outside the interval, making it difficult to judge which bootstrap method was better. As a result, we tested many combinations of the parameters and finally selected $N = 3,000$, $B = 3,000$, and $n = 100$ to perform the error test. Because different values of $Y_q$ may also affect the layout of the error curves for the four bootstrap methods, different values of $Y_q$ were also simulated under the same combination ($N = 3,000$, $B = 3,000$, $n = 100$). We found that the tendency of the curves became more pronounced as the value of $Y_q$ increased, while the relative location of the curves of the four methods did not change. Based on these tests, we finally selected the case of $Y_q = 0.8$ with $N = 3,000$, $B = 3,000$, and $n = 100$ to perform the error test for the four bootstrap methods.

[Figure 2 plot, titled "difference error": error probability (vertical axis, 0.01 to 0.09) against case number (horizontal axis, 0 to 25) for SB, PB, BCPB and BT, with reference lines at 0.0397, 0.05 and 0.0603.]
Figure 2. Error probability of four bootstrap methods under $Y_{q2} - Y_{q1} = 0$ ($Y_{q1} = Y_{q2} = 0.8$).

[Figure 3 plot, titled "ratio error": error probability (vertical axis, 0.01 to 0.09) against case number (horizontal axis, 0 to 25) for SB, PB, BCPB and BT, with reference lines at 0.0397, 0.05 and 0.0603.]

Figure 3. Error probability of four bootstrap methods under $Y_{q2}/Y_{q1} = 1$ ($Y_{q1} = Y_{q2} = 0.8$).

For the difference statistic, three of the 25 cases were outside the interval (0.0397, 0.0603) for the SB method, and three were outside for the PB method. Only two cases were outside the interval for the BCPB method, and three were beyond these limits for the BT method. For the ratio test, three of the 25 cases were outside the interval (0.0397, 0.0603) for each of the SB, PB and BCPB methods. The BT method had the most cases outside the limits (7 occurrences). Overall, the PB, SB and BCPB methods had a similar number of cases outside the interval, and the BT method had the fewest cases above the upper bound in the ratio test.
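Counting the cases outside the interval, as done above, amounts to a simple filter over the per-case error probabilities. The sketch below uses made-up values for five cases purely for illustration (they are not results from this study), and the helper name `out_of_limit_cases` is hypothetical.

```python
def out_of_limit_cases(error_probs, lower=0.0397, upper=0.0603):
    """Return 1-based case numbers whose error probability lies outside (lower, upper)."""
    return [
        case
        for case, p in enumerate(error_probs, start=1)
        if not (lower < p < upper)
    ]


# Hypothetical error probabilities for five cases (illustration only).
example = [0.051, 0.062, 0.048, 0.039, 0.055]
print(out_of_limit_cases(example))  # prints [2, 4]
```

Returning the case numbers rather than only a count makes it possible to see whether the same cases fall outside the limits for several methods.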
Tables 4 and 5 let us further examine the mean and standard deviation of the error probability for the four methods. In the difference test, the mean error probability of the BCPB method was the farthest from the target, while its standard deviation was the lowest of the four methods. In the ratio test, the mean for the BT method was the farthest from the 0.05 setting and its standard deviation was the highest of the four bootstrap methods, while the BCPB method again had the lowest standard deviation. The SB method had the mean closest to the target, but its standard deviation was higher than those of the PB and BCPB methods in Table 5. Across the two tests, the BCPB method had a comparatively high mean error probability but kept the lowest standard deviation of the four methods. We also performed the error test with $Y_{q1} = Y_{q2} = 0.83$ (see Tables 10-11 and Figures 11-12 in Appendix A) and found that as the value of $Y_q$ became larger, the standard deviation of the error probability for the four methods became larger. In this condition, the BCPB method still had the smallest variation among the four bootstrap methods. For applications in high-quality measurement, the BCPB method therefore kept the steadiest error probability.
Table 4. Error statistics of the four bootstrap methods for the difference test ($Y_{q1} = Y_{q2} = 0.8$).

Method   Mean error (25 cases)   Std. dev. of error (25 cases)   Number out of limits   Cases out of limits
SB       0.0552528               0.004475521                     3                      21, 22, 23
PB       0.0556936               0.003744748                     3                      21, 22, 23
BCPB     0.0567064               0.002709163                     2                      3, 22
BT       0.0550536               0.00543679                      4                      21, 22, 23, 24
Table 5. Error statistics of the four bootstrap methods for the ratio test ($Y_{q1} = Y_{q2} = 0.8$).

Method   Mean error (25 cases)   Std. dev. of error (25 cases)   Number out of limits   Cases out of limits
SB       0.0499332               0.005729382                     3                      21, 22, 23
PB       0.0556936               0.003744748                     3                      21, 22, 23
BCPB     0.0569744               0.00272262                      3                      3, 8, 22
In addition, we calculated an average lower bound and the standard deviation of the lower bound based on the