A Study of Robust Sequential Statistical Tests for Right-Censored Data


GROUP SEQUENTIAL TESTS BASED ON THE SIMULTANEOUS USE OF WEIGHTED LOGRANK AND WEIGHTED KAPLAN-MEIER TESTS

Yunchan Chi

Department of Statistics

National Cheng-Kung University Tainan, Taiwan

ycchi@email.stat.ncku.edu.tw

ABSTRACT

This paper proposes a robust group sequential test procedure based on the joint use of the group sequential weighted Kaplan-Meier test (Li, 1999; Gu et al., 1999) and the group sequential weighted logrank test (Gu and Lai, 1991). None of the tests considered in the simulation study performs uniformly better than the others. Nevertheless, the overall performance of the proposed group sequential test procedure is slightly better than that of the group sequential WKM and logrank tests.

Key Words : group sequential test procedures, weighted Kaplan-Meier test, weighted logrank test


1. INTRODUCTION

In clinical trials with survival time as the response, a study usually lasts several years before a conclusion can be drawn. Using group sequential tests to assess the treatment effect while saving cost and time has therefore become an important issue in medical research. Recently, many statisticians have devoted attention to group sequential analysis with censored observations and staggered entry. Slud (1984) gave a detailed development of these methods (Jones and Whitehead, 1979; Slud and Wei, 1982; Tsiatis, 1982; Sellke and Siegmund, 1983) and generalized the results of Tsiatis (1982) and Sellke and Siegmund (1983) to weighted logrank statistics with weights independent of time. Gu and Lai (1991) further developed weak convergence for weighted logrank tests with weights that may depend on time. However, Pepe and Fleming (1989) pointed out that weighted logrank tests are based on ranks and might not be sensitive to the magnitude of the difference in survival times against a specific alternative. Consequently, they suggested a class of test statistics based on the integrated weighted difference in Kaplan-Meier estimates and showed that these statistics compare favorably with the logrank test even under proportional hazards alternatives, and may perform better under early and crossing hazards alternatives. Further, Li (1999) and Gu et al. (1999) each extended weighted Kaplan-Meier tests to sequential clinical trials and derived the weak convergence of group sequential weighted Kaplan-Meier tests.

For a given data set, however, one seldom knows what the exact alternative is. Therefore, Lee (1996) constructed versatile tests by combining four weighted logrank tests whose weight functions detect proportional, early, late, and middle hazards differences, respectively. Chi and Tsai (2001) further constructed robust tests based on the simultaneous use of weighted logrank and weighted Kaplan-Meier tests. Their results showed that the overall performance of the linear combination of weighted logrank and weighted Kaplan-Meier tests is better than that of Lee’s linear combination test, which is based on weighted logrank tests only. Therefore, group sequential tests based on the joint use of weighted logrank and weighted Kaplan-Meier tests are derived in this paper.

In Section 2, the group sequential weighted logrank and group sequential weighted Kaplan-Meier tests are reviewed. Next, the proposed test statistics are developed in Section 3. Finally, simulation results comparing the power of the proposed test, group sequential weighted logrank tests, and group sequential weighted Kaplan-Meier tests are presented in Section 4.

2. LITERATURE REVIEW

2.1. Preliminary

In comparative clinical trials, patients usually enter the trial serially and are assigned to treatments according to some random mechanism. Let $Y_{ij}$ be the entry time (calendar time) of the $j$th subject in group $i$, $j = 1, 2, \ldots, n_i$, $i = 1, 2$, where $n_i$ is the sample size of the $i$th group and $n = n_1 + n_2$ is the total sample size. Let $T_{ij}$ denote the survival time from entry for patient $j$ in group $i$ until the endpoint (death, tumor recurrence, etc.) under investigation, and let $C_{ij}$ denote the time from entry until patient $j$ in group $i$ is lost to follow-up. Thus the patient entering the trial at $Y_{ij}$ is on test during the time interval $[Y_{ij},\, Y_{ij} + (T_{ij} \wedge C_{ij})]$, where $\wedge$ denotes minimum. At time $Y_{ij} + (T_{ij} \wedge C_{ij})$, we observe an event if $T_{ij} \le C_{ij}$ and otherwise observe that the patient is censored. Hence the censoring indicator for the $j$th subject in group $i$ at calendar time $t$ is defined as $\delta_{ij}(t) = I\{T_{ij} \le C_{ij} \wedge (t - Y_{ij})^+\}$, where $a^+ = \max(0, a)$, and $X_{ij}(t) = T_{ij} \wedge C_{ij} \wedge (t - Y_{ij})^+$ is the observed survival time.

Gu and Lai (1991) provided a very general setting, based on counting processes and two-dimensional martingales, for deriving the weak convergence of time-sequential censored rank statistics. Let $N_i(t, s) = \sum_{j=1}^{n_i} I\{T_{ij} \le C_{ij} \wedge (t - Y_{ij})^+ \wedge s\}$ be the counting process recording the events that have occurred by time $s$, and let $R_i(t, s) = \sum_{j=1}^{n_i} I\{X_{ij}(t) \ge s\}$ be the risk process counting the subjects still at risk at time $s$. We assume that the survival times $T_{ij}$ are independent of the entry and censoring times $(Y_{ij}, C_{ij})$, and that the distributions of $Y_{ij}$, $T_{ij}$ and $C_{ij}$ for the $i$th population are continuous and denoted by $E_i(t)$, $F_i(t)$ and $L_i(t)$, respectively. In addition, let $S_i(t) = 1 - F_i(t)$ and $G_i(t) = 1 - L_i(t)$ be the survival functions of the survival and censoring times, respectively; the cumulative hazard function of the $i$th group is defined by $\Lambda_i(t) = -\log S_i(t)$.
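The quantities defined in this subsection can be illustrated with a small sketch. It evaluates the observed time $X_{ij}(t)$, the indicator $\delta_{ij}(t)$, and the processes $N_i(t,s)$ and $R_i(t,s)$ for one group at a calendar time $t$. The function names and the three-patient data set are hypothetical and are used only for illustration.

```python
def observe_at(t, entry, surv, cens):
    """Observed times X_ij(t) and indicators delta_ij(t) at calendar time t."""
    x, delta = [], []
    for y, t_ij, c_ij in zip(entry, surv, cens):
        follow_up = max(0.0, t - y)          # (t - Y_ij)^+
        limit = min(c_ij, follow_up)         # C_ij ^ (t - Y_ij)^+
        x.append(min(t_ij, limit))           # X_ij(t)
        delta.append(1 if t_ij <= limit else 0)
    return x, delta


def counting_process(x, delta, s):
    """N_i(t, s): events observed at calendar time t with survival time <= s."""
    return sum(1 for xi, di in zip(x, delta) if di == 1 and xi <= s)


def at_risk(x, s):
    """R_i(t, s): subjects whose observed time X_ij(t) is at least s."""
    return sum(1 for xi in x if xi >= s)


# Hypothetical data: entry times, survival times, censoring times.
entry = [0.0, 0.5, 1.2]
surv = [0.4, 1.0, 0.3]
cens = [2.0, 0.6, 2.0]

x, delta = observe_at(1.0, entry, surv, cens)
print(x, delta)                        # observed times and indicators at t = 1.0
print(counting_process(x, delta, 0.45), at_risk(x, 0.3))
```

At $t = 1.0$ the third patient has not yet entered, so its follow-up time $(t - Y_{ij})^+$ is zero and it contributes neither an event nor time at risk for $s > 0$.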


2.2. Group Sequential Weighted Logrank Tests

For testing the equality of two survival distributions ($H_0 : S_1(t) = S_2(t)$ for all $t$) against the stochastic ordering alternative ($H_1 : S_1(t) \le S_2(t)$, with strict inequality for some $t$) in sequential clinical trials, Gu and Lai (1991) unified the most commonly used group sequential testing procedures as time-sequential censored rank statistics,

$$U_1(t) = \sqrt{1/n} \int_0^t K(t, s)\, d\left\{\hat\Lambda_1(t, s) - \hat\Lambda_2(t, s)\right\},$$

where

$$\hat\Lambda_i(t, s) = \int_0^s \frac{dN_i(t, u)}{R_i(t, u)}, \quad i = 1, 2, \qquad K(t, s) = Q(t, s)\, \frac{R_1(t, s)\, R_2(t, s)}{R_1(t, s) + R_2(t, s)}.$$

For example, $Q(t, s) = 1$ corresponds to the group sequential logrank test (Tsiatis, 1982) and $Q(t, s) = R_1(t, s) + R_2(t, s)$ yields the group sequential Gehan-Wilcoxon test statistic (Slud and Wei, 1982). Moreover, an extended version of the weight function of Harrington and Fleming (1982) has the form $Q(t, s) = \hat S(t, s)^{\varrho}\,(1 - \hat S(t, s))^{\varrho}$, where $\hat S(t, s)$ is the Kaplan-Meier estimator of the survival distribution based on the combined data.

The null asymptotic distribution of $U_1(t)$ based on the data at time $t$ is normal with mean zero and variance $\sigma_{11}(t, t)$, which is given in (A.1) in the Appendix and can be consistently estimated by $\hat\sigma_{11}(t, t)$ listed in (A.5) in the Appendix. Suppose the group sequential trial is examined at time points $t_1 < t_2 < \cdots < t_k$. A two-sample group sequential weighted logrank test (GSWLR) rejects $H_0$ in favor of $H_1$ at the $j$th examination time point $t_j$ if

$$U_1(t_j) \Big/ \sqrt{\hat\sigma_{11}(t_j, t_j)} \ge z_{\alpha_j^*}(t_j),$$

where the critical values $z_j = z_{\alpha_j^*}(t_j)$ are derived by choosing positive values $\alpha_1, \alpha_2, \ldots, \alpha_k$ so that the overall level of significance is $\alpha = \sum_{j=1}^{k} \alpha_j$, and then recursively solving the equations

$$P\left(Z_1(t_1) < z_1,\ Z_2(t_2) < z_2,\ \ldots,\ Z_{j-1}(t_{j-1}) < z_{j-1},\ Z_j(t_j) \ge z_j\right) = \alpha_j, \quad j = 1, \ldots, k,$$

where $(Z_1(t_1), Z_2(t_2), \ldots, Z_j(t_j))$ is multivariate normal with mean zero and covariance matrix with elements equal to $\sigma_{11}(t_i, t_j) \big/ \sqrt{\sigma_{11}(t_i, t_i)\, \sigma_{11}(t_j, t_j)}$. Note that a program by Schervish (1984) for calculating multivariate normal probabilities can be applied for finding $z_j$. A consistent estimator of $\sigma_{11}(t_i, t_j)$ is also given in (A.5) in the Appendix.
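The recursive equations for the critical values can be sketched numerically. The block below is a minimal Monte Carlo illustration, not the Schervish (1984) program mentioned above: it simulates the multivariate normal vector and picks each $z_j$ as an empirical quantile among the paths that have not yet crossed an earlier boundary. The function name, the three-look design, the Brownian-type correlation $\sqrt{t_i/t_j}$, and the equal split of $\alpha$ are all assumptions made for the example.

```python
import numpy as np


def sequential_critical_values(corr, alphas, n_draws=400_000, seed=0):
    """Approximate z_1..z_k solving
    P(Z_1 < z_1, ..., Z_{j-1} < z_{j-1}, Z_j >= z_j) = alpha_j by simulation."""
    corr = np.asarray(corr, dtype=float)
    rng = np.random.default_rng(seed)
    # Correlated standard normal paths, one row per simulated trial.
    z = rng.standard_normal((n_draws, len(alphas))) @ np.linalg.cholesky(corr).T
    alive = np.ones(n_draws, dtype=bool)      # paths that have not crossed yet
    crit = []
    for j, alpha_j in enumerate(alphas):
        n_cross = int(round(alpha_j * n_draws))
        if n_cross < 1:
            raise ValueError("each alpha_j must be positive and not too small")
        survivors = np.sort(z[alive, j])
        z_j = float(survivors[-n_cross])      # exactly n_cross paths cross at look j
        crit.append(z_j)
        alive &= z[:, j] < z_j
    return crit


# Hypothetical three-look design with correlation sqrt(t_i / t_j) for t_i <= t_j.
looks = np.array([1.0, 2.0, 3.0])
corr = np.sqrt(np.minimum.outer(looks, looks) / np.maximum.outer(looks, looks))
crit = sequential_critical_values(corr, alphas=[0.05 / 3] * 3)
print(crit)
```

In practice the correlation matrix would be built from the estimates $\hat\sigma_{11}(t_i, t_j)$, and a deterministic multivariate normal integration routine would replace the simulation when higher accuracy is needed.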

2.3. Group Sequential Weighted Kaplan-Meier Tests


As noted in the Introduction, weighted logrank tests are based on ranks and might not be sensitive to the magnitude of the difference in survival time against a specific alternative. The reason is illustrated by an example in Pepe and Fleming (1989). Consequently, they proposed a class of weighted Kaplan-Meier test statistics (WKM) based on the integrated weighted difference in Kaplan-Meier estimators. Moreover, they showed that WKM compares favorably with the logrank test even under the proportional hazards alternative, and may perform better than the logrank test under early and crossing hazards difference alternatives. However, WKM may perform worse under late hazards difference alternatives, since the weight function is chosen to put less weight on later time periods if censoring is heavy. Li (1999) further extended their procedure to sequential clinical trials, and the extended weighted Kaplan-Meier tests based on the data at time $t$ have the form

$$U_2(t) = \sqrt{n} \int_0^{T_c(t)} \hat\omega(t, s) \left\{\hat S_2(t, s) - \hat S_1(t, s)\right\} ds,$$

where $T_c(t) = \sup\{s : \hat b_1(t, s) \wedge \hat b_2(t, s) > 0\}$, $\wedge$ denotes minimum, $\hat S_i(t, s)$ is the Kaplan-Meier estimator of the survival distribution of group $i$, $\hat b_i(t, s)$ is an empirical estimate of $\Pr(C_i \ge s,\ t - Y_i \ge s)$, and the random weight function $\hat\omega(t, s)$ estimates a deterministic function $\omega(t, s)$ which downweights the contributions of the difference of the two Kaplan-Meier estimators over later time periods if censoring is heavy.

If the weight function is chosen appropriately, the resulting statistic will be stable. For instance,

$$\hat\omega(t, s) = \frac{\hat b_1(t, s)\, \hat b_2(t, s)}{\hat p_1 \hat b_1(t, s) + \hat p_2 \hat b_2(t, s)}, \qquad \hat p_i = \frac{n_i}{n}, \quad i = 1, 2,$$

was used by Li (1999).

Furthermore, Gu et al. (1999) investigated more general weight functions in terms of

$$U_2(t) = \sqrt{n} \int_0^{T_c(t)} \hat H(t, s)\, d\left\{\hat S_2(t, s) - \hat S_1(t, s)\right\},$$

where $T_c(t)$ is chosen so that the risk set at $T_c(t)$ for each of the two groups is large enough, and $\hat H(t, s)$ is a random function of bounded variation which converges pointwise in probability to a deterministic piecewise continuous function $H(t, s)$. Several test statistics belong to this class. For example, if $\hat H(t, s) = \hat S(t, s)$, where $\hat S(t, s)$ is the Kaplan-Meier estimator of the survival distribution based on the combined data, then $U_2(t)$ is equivalent to a group sequential and truncated version of Efron’s (1967) two-sample test statistic with $T_c(t) = \infty$. For $\hat H(t, s) = \int_s^{T_c(t)} \hat\omega(t, u)\, du$, the above statistic reduces to Li’s test. More examples can be found in Gu et al. (1999).
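To make the integrated-difference form of $U_2(t)$ concrete, here is a minimal sketch that computes $\sqrt{n}\int_0^{T_c} \omega(s)\{\hat S_2(s) - \hat S_1(s)\}\,ds$ from the data available at one analysis time. It assumes the truncation point `t_c` is supplied by the user and uses a constant weight by default rather than Li’s $\hat\omega$; a non-constant weight is evaluated at interval midpoints, which is exact only when the weight is constant between jump points. All function names are hypothetical.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (jump time, survival after jump)."""
    data = sorted(zip(times, events))
    n_at_risk, surv, steps = len(data), 1.0, []
    i = 0
    while i < len(data):
        t = data[i][0]
        tied = [e for u, e in data if u == t]   # all subjects observed at time t
        deaths = sum(tied)
        if deaths:
            surv *= 1.0 - deaths / n_at_risk
            steps.append((t, surv))
        n_at_risk -= len(tied)
        i += len(tied)
    return steps


def step_value(steps, s):
    """Evaluate the right-continuous Kaplan-Meier step function at s."""
    value = 1.0
    for t, v in steps:
        if t <= s:
            value = v
    return value


def wkm_statistic(x1, d1, x2, d2, t_c, weight=lambda s: 1.0):
    """sqrt(n) * integral over [0, t_c] of weight * (S2_hat - S1_hat)."""
    km1, km2 = kaplan_meier(x1, d1), kaplan_meier(x2, d2)
    grid = sorted({0.0, t_c, *[t for t, _ in km1 + km2 if t < t_c]})
    total = 0.0
    for a, b in zip(grid, grid[1:]):
        diff = step_value(km2, a) - step_value(km1, a)   # constant on [a, b)
        total += weight(0.5 * (a + b)) * diff * (b - a)
    return math.sqrt(len(x1) + len(x2)) * total


# Hypothetical uncensored toy data: group 2 survives one time unit longer on average.
print(wkm_statistic([1.0, 2.0], [1, 1], [2.0, 3.0], [1, 1], t_c=3.0))
```

With a constant weight and no censoring the integral equals the difference of the two sample means restricted to $[0, T_c]$, which is why the statistic reflects the magnitude of the survival difference rather than only the ranks.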

The null asymptotic distribution of $U_2(t)$ evaluated at time $t$ is normal with mean zero and variance $\sigma_{22}(t, t)$, which is given in (A.2) in the Appendix and can be consistently estimated by $\hat\sigma_{22}(t, t)$ listed in (A.6) in the Appendix. It is worth noting that for $\hat H(t, s) = \int_s^{T_c(t)} \hat\omega(t, u)\, du$ there were typos in the formula for $\sigma_{22}(t, t)$ in Li (1999, pages 278-279); the correct formula is given in the Appendix. As a result, the expression for the variance estimate of $\sigma_{22}(t, t)$ in Li (1999) also needs to be modified, and the corrected expression is likewise given in the Appendix. Thus, a two-sample group sequential weighted Kaplan-Meier test (GSWKM) rejects $H_0$ in favor of $H_1$ at the $j$th examination time point $t_j$ if

$$U_2(t_j) \Big/ \sqrt{\hat\sigma_{22}(t_j, t_j)} \ge z_{\alpha_j^*}(t_j),$$

where the critical values $z_j = z_{\alpha_j^*}(t_j)$ are, again, derived by choosing positive values $\alpha_1, \alpha_2, \ldots, \alpha_k$ so that the overall level of significance is $\alpha = \sum_{j=1}^{k} \alpha_j$, and then recursively solving the equations

$$P\left(Z_1(t_1) < z_1,\ Z_2(t_2) < z_2,\ \ldots,\ Z_{j-1}(t_{j-1}) < z_{j-1},\ Z_j(t_j) \ge z_j\right) = \alpha_j, \quad j = 1, \ldots, k,$$

where $(Z_1(t_1), Z_2(t_2), \ldots, Z_j(t_j))$ is multivariate normal with mean zero and covariance matrix with elements equal to $\sigma_{22}(t_i, t_j) \big/ \sqrt{\sigma_{22}(t_i, t_i)\, \sigma_{22}(t_j, t_j)}$. A consistent estimator of $\sigma_{22}(t_i, t_j)$ is also given in (A.6) in the Appendix.

In Li’s limited simulation results, WKM performs well under early and crossing hazard difference alternatives; however, no simulation results are reported for the late hazard difference alternative. This may be because the weight function is chosen to put less weight on later time periods if censoring is heavy.

3. THE PROPOSED TEST STATISTICS

For a given data set, however, one seldom knows what the exact alternative is. Therefore, Lee (1996) constructed versatile tests by combining four weighted logrank (WLR) tests whose weight functions are designed to detect proportional, early, late and middle hazards differences, respectively. To take advantage of both WKM and WLR, Chi and Tsai (2001) further proposed the simultaneous use of weighted logrank and weighted Kaplan-Meier statistics. Their simulation results demonstrated that the joint use of WLR and WKM produces better power than Lee’s linear combination test and is very robust against a broad range of alternatives. The proposed test therefore extends the linear combination of WLR and WKM to the group sequential setting and is defined as $U(t) = \lambda_1 U_1(t) + \lambda_2 U_2(t)$ for some real numbers $\lambda_i$, $i = 1, 2$, such that $\lambda_1 + \lambda_2 = 1$. Although the derivations of the asymptotic distributions of group sequential WLR and WKM can each be found in the literature, the linear combination of these two statistics has not yet been investigated. The tests $U_1(t)$ and $U_2(t)$ are correlated for a given data set; therefore, it is necessary to examine the asymptotic distribution of $U(t)$ based on the data at time $t$ in detail. The derivations are outlined in the Appendix.

When equal weights $\lambda_1 = \lambda_2 = 1/2$ are used, by the Theorem in the Appendix the null asymptotic distribution of $U(t)$ is normal with mean zero and variance $(1 + \rho(t))/2$, where $\rho(t) = \sigma_{12}(t, t) \big/ \sqrt{\sigma_{11}(t, t)\, \sigma_{22}(t, t)}$ is the correlation between $U_1(t)$ and $U_2(t)$, and the expressions for $\sigma_{11}(t, t)$, $\sigma_{22}(t, t)$ and $\sigma_{12}(t, t)$ are given in (A.1), (A.2), and (A.3), respectively, in the Appendix. A consistent estimator of $\rho(t)$ is $\hat\rho(t) = \hat\sigma_{12}(t, t) \big/ \sqrt{\hat\sigma_{11}(t, t)\, \hat\sigma_{22}(t, t)}$, where $\hat\sigma_{11}(t, t)$, $\hat\sigma_{22}(t, t)$ and $\hat\sigma_{12}(t, t)$ are given in (A.5), (A.6), and (A.7), respectively, in the Appendix. Thus, we suggest rejecting $H_0$ in favor of $H_1$ if

$$GSKL(t_j) = U(t_j) \Big/ \sqrt{\frac{1 + \hat\rho(t_j)}{2}} \ge z_{\alpha_j^*}(t_j),$$

where the critical values $z_j = z_{\alpha_j^*}(t_j)$ are, again, derived by choosing positive values $\alpha_1, \alpha_2, \ldots, \alpha_k$ so that the overall level of significance is $\alpha = \sum_{j=1}^{k} \alpha_j$, and then recursively solving the equations

$$P\left(Z_1(t_1) < z_1,\ Z_2(t_2) < z_2,\ \ldots,\ Z_{j-1}(t_{j-1}) < z_{j-1},\ Z_j(t_j) \ge z_j\right) = \alpha_j, \quad j = 1, \ldots, k,$$

where $(Z_1(t_1), Z_2(t_2), \ldots, Z_j(t_j))$ is multivariate normal with mean zero and covariance matrix with elements equal to $\sigma(t_i, t_j) \big/ \sqrt{\sigma(t_i, t_i)\, \sigma(t_j, t_j)}$, where $\sigma(t, t') = \lambda_1^2 \sigma_{11}(t, t') + \lambda_2^2 \sigma_{22}(t, t') + \lambda_1 \lambda_2 \sigma_{12}(t, t') + \lambda_1 \lambda_2 \sigma_{21}(t, t')$ is the covariance function given in the Theorem in the Appendix.
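The variance $(1 + \rho(t))/2$ quoted for equal weights can be checked with a short calculation. The sketch below assumes, which the text does not state explicitly, that $U_1(t)$ and $U_2(t)$ have each been standardized to unit null variance before being combined; it also uses the fact that $\sigma_{12}(t, t) = \sigma_{21}(t, t)$, which follows from (A.3) and (A.4) with $t' = t$.

```latex
\begin{align*}
\operatorname{Var}\{U(t)\}
  &= \lambda_1^2 \operatorname{Var}\{U_1(t)\}
   + \lambda_2^2 \operatorname{Var}\{U_2(t)\}
   + 2 \lambda_1 \lambda_2 \operatorname{Cov}\{U_1(t), U_2(t)\} \\
  &= \tfrac{1}{4} \cdot 1 + \tfrac{1}{4} \cdot 1 + 2 \cdot \tfrac{1}{4}\, \rho(t)
   = \frac{1 + \rho(t)}{2}.
\end{align*}
```

Without standardization the same expansion gives $\lambda_1^2 \sigma_{11}(t, t) + \lambda_2^2 \sigma_{22}(t, t) + 2 \lambda_1 \lambda_2 \sigma_{12}(t, t)$, which is the covariance function of the Theorem evaluated at $t' = t$.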


4. SIMULATION RESULTS

To assess the accuracy of the asymptotic distribution of the proposed group sequential test statistic, GSKL, and its power properties, Monte Carlo simulations were carried out under Li’s simulation settings. To make the results comparable with those in Li (1999), the information fractions from Li’s simulation settings, 16/190, 60/190, 124/190, and 190/190, were used. Moreover, a total of 824 entry times were generated from the uniform (0, 1.5) distribution. Each patient was assigned to the treatment or the placebo group with equal probability. In the absence of random loss to follow-up, and under the null hypothesis with an exponential survival function with scale parameter 0.5108, numerical integration gives the covariance matrix of the group sequential WKM test statistic at the four analysis times 0.5, 1.0, 1.5, and 2.0 as

$$\Sigma = \begin{pmatrix} 1.0 & & & \\ 0.5894 & 1.0 & & \\ 0.4145 & 0.7450 & 1.0 & \\ 0.3673 & 0.6543 & 0.8807 & 1.0 \end{pmatrix}.$$

Moreover, the proportional hazards, early hazard difference, and crossing hazard difference alternatives described in Li (1999, page 280) were used for comparing power. The O’Brien-Fleming-type and Pocock-type critical values are (6.891, 3.473, 2.423, 1.99) and (2.703, 2.397, 2.288, 2.233), respectively. Note that these critical values are for testing omnibus alternatives, not for stochastic ordering alternatives.
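As a rough plausibility check of the two boundary sets, one can simulate the standardized statistics under the null hypothesis and count how often any boundary is crossed. The sketch below assumes that the four standardized statistics are jointly normal with correlation matrix $\Sigma$ as reported above and that rejection is two-sided, in line with the remark that the critical values are for omnibus alternatives. The function name and the simulation size are arbitrary choices for illustration.

```python
import numpy as np

# Correlation matrix of the WKM statistic at the four analysis times (from the text).
SIGMA = np.array([
    [1.0,    0.5894, 0.4145, 0.3673],
    [0.5894, 1.0,    0.7450, 0.6543],
    [0.4145, 0.7450, 1.0,    0.8807],
    [0.3673, 0.6543, 0.8807, 1.0],
])
OBF_TYPE = [6.891, 3.473, 2.423, 1.99]
POCOCK_TYPE = [2.703, 2.397, 2.288, 2.233]


def two_sided_crossing_probability(bounds, corr, n_draws=400_000, seed=1):
    """Monte Carlo estimate of P(|Z_j| >= bound_j for at least one look j)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_draws, len(bounds))) @ np.linalg.cholesky(corr).T
    crossed = (np.abs(z) >= np.asarray(bounds)).any(axis=1)
    return float(crossed.mean())


level_obf = two_sided_crossing_probability(OBF_TYPE, SIGMA)
level_pocock = two_sided_crossing_probability(POCOCK_TYPE, SIGMA)
print(level_obf, level_pocock)
```

The estimates depend on the assumed correlation structure, so they indicate only whether the boundaries are in the neighborhood of the intended overall level, not the exact level attained in the simulation study.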

The type I error rate is estimated from data generated from the exponential distribution with scale parameter 0.5108. As Table 1 shows, the test statistics considered here maintain reasonable type I error rates. Under the proportional hazards model, the statistic based on the joint use of WLR and WKM, GSKL, has higher power than group sequential WKM but slightly lower power than the group sequential logrank test. Under the early hazard difference alternative, the powers of GSKL, group sequential WKM, and group sequential logrank are almost the same. The group sequential logrank test performs much worse under the crossing hazard difference alternative, where GSKL has lower power than WKM. Under the late hazard difference alternative, group sequential WKM has the lowest power and GSKL again ranks second. In conclusion, none of the tests considered in the simulation performs uniformly better than the others. Nevertheless, the overall performance of GSKL is slightly better than that of the group sequential WKM and logrank tests. In Table 1, the number in parentheses is the average stopping time.

Table 1: Estimated Power and Nominal Level

Boundary                Pocock                    O’Brien-Fleming
Alternative     GSWKM   GSLR    GSKL      GSWKM   GSLR    GSKL
Proportional    0.817   0.858   0.847     0.866   0.894   0.886
                (3.06)  (2.99)  (3.01)    (3.39)  (3.36)  (3.37)
Early           0.743   0.735   0.746     0.794   0.770   0.786
                (3.09)  (3.05)  (3.06)    (3.43)  (3.43)  (3.48)
Crossing        0.364   0.260   0.315     0.330   0.186   0.259
                (3.41)  (3.54)  (3.47)    (3.75)  (3.86)  (3.80)
Late            0.059   0.083   0.069     0.065   0.099   0.078
                (3.92)  (3.93)  (3.92)    (3.98)  (3.98)  (3.98)
Null            0.049   0.055   0.053     0.050   0.054   0.051
                (3.94)  (3.93)  (3.93)    (3.98)  (3.98)  (3.98)


APPENDIX

The martingale form of $U_1(t)$ evaluated at time $t$, the nonstandardized form of GSWLR, is

$$U_1(t) = \int_0^s H_{11}(t, u)\, dM_1(t, u) - \int_0^s H_{12}(t, u)\, dM_2(t, u),$$

where

$$H_{1j}(t, u) = \frac{1}{\sqrt{n}}\, Q(t, u)\, \frac{R_1(t, u)\, R_2(t, u)}{R_1(t, u) + R_2(t, u)}\, \frac{I\{R_j(t, u) > 0\}}{R_j(t, u)}, \quad j = 1, 2.$$

Gu and Lai (1991) developed a certain maximal inequality and rigorously stated and proved the asymptotic distribution of GSWLR under a weight function $Q(t, u)$ that may change with $t$. Thus, their results will not be reproduced here. Gu et al. (1999) further examined the group sequential test based on weighted Kaplan-Meier statistics, whose martingale form is

$$U_2(t) = \int_0^s H_{21}(t, u)\, dM_1(t, u) - \int_0^s H_{22}(t, u)\, dM_2(t, u) + o_p(1),$$

where

$$H_{2j}(t, u) = \sqrt{n}\, \hat\xi(t, u)\, \frac{\hat S_j(t, u-)}{\hat S_j(t, u)}\, \frac{I\{R_j(t, u) > 0\}}{R_j(t, u)}, \quad j = 1, 2,$$

with

$$\hat\xi(t, u) = \hat H(t, u)\, S(u) + \int_u^{T_c(t)} \hat H(t, v)\, dS(v).$$

Hence, the linear combination of $U_1(t)$ and $U_2(t)$, $U(t)$, can be expressed as a martingale plus a negligible term,

$$U_n(t) = \int_0^s \{\lambda_1 H_{11}(t, u) + \lambda_2 H_{21}(t, u)\}\, dM_1(t, u) - \int_0^s \{\lambda_1 H_{12}(t, u) + \lambda_2 H_{22}(t, u)\}\, dM_2(t, u) + o_p(1).$$

In order to show the weak convergence of $U_n(t)$, several assumptions are needed and are listed below.

A1. $\lim_{n\to\infty} \frac{1}{n} \sum_{i=1}^{n} \Pr(C_{1i} \ge s,\ t - Y_{1i} \ge s) = b_1(t, s)$ and $\lim_{n\to\infty} \frac{1}{n} \sum_{j=1}^{n} \Pr(C_{2j} \ge s,\ t - Y_{2j} \ge s) = b_2(t, s)$, where $n = n_1 + n_2$, $n_i/n \to p_i$, $0 < p_i < 1$, $i = 1, 2$, and $b_i(t, s) = \Pr(C_i \ge s,\ Y_i \le t - s)$, $i = 1, 2$. Note that the $b_i(t, s)$ are also assumed to be continuous for $0 \le s \le t$.

A2. $\{Q(t, s),\ 0 \le s \le t\}$ is a predictable process with respect to $\{\mathcal{F}(s)\}$, for every $t \ge 0$.

A3. Assume that there exist $0 \le \gamma < 1/2$ and a nonrandom function $q(t, s)$ such that

$$\sup_{0 \le t \le \tau} |q(t, 0)| < \infty,$$

$$\sup_{0 \le s \le t \le \tau} |Q(t, s) - q(t, s)|\, I\left\{n^{-1} R_1(t, s) + n^{-1} R_2(t, s) \ge \varepsilon\right\} \xrightarrow{p} 0 \quad \text{for every } \varepsilon > 0, \text{ as } n \to \infty,$$

$$\sup_{0 \le t \le \tau} \left\{ V_{0 \le s \le t}\left[\left(n^{-1} R_1(t, s) + n^{-1} R_2(t, s)\right)^{\gamma} Q(t, s)\right] + V_{0 \le s \le t}\left[\left((1 - F_1(s))\, b_1(t, s) + (1 - F_2(s))\, b_2(t, s)\right)^{\gamma} q(t, s)\right] \right\} = O_p(1).$$

A4. The $T_{1i}$ are independent and identically distributed random variables with cumulative hazard function $\Lambda_1$; likewise, the $T_{2j}$ are independent and identically distributed random variables with cumulative hazard function $\Lambda_2$.

A5. There exists a finite set $D$ such that, as a function of $s$, $H(t, s)$ is continuous except for $s \in D$ for every $t$. As a function of $t$, $H(t, u)$ is continuous for every $u \notin D$. Also, for all $0 \le s \le T_c(t)$, $\tau_0 \le t \le \tau_1$ and $s \notin D$, $\hat H(t, s) \xrightarrow{p} H(t, s)$ as $n \to \infty$.

A6. Let $V_{[a,b]}\{f(u)\}$ denote the total variation of the function $f(u)$ on the interval $[a, b]$. The functions $\hat H(t, u)$ and $H(t, u)$ satisfy that

$$\sup_{\tau_0 \le t \le \tau_1} \left[ V_{[0, T_c(t)]}\{\hat H(t, u)\} + V_{[0, T_c(t)]}\{H(t, u)\} \right]$$

is bounded in probability.

A7. For some $c > 0$ and any $t \in [\tau_0, \tau_1]$, $S_i(T_c(t))\, b_i(t, T_c(t)) \ge c$, $i = 1, 2$. This essentially says that there should be enough information at the point $T_c(t)$ for all $t \in [\tau_0, \tau_1]$.

A8. The function $T_c(t)$ is continuous in $t$, and is chosen so that the risk set at $T_c(t)$ for each of the two groups is large enough.

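Theorem: Under assumptions A1-A8 and $H_0 : S_1 = S_2$, the process $\{U_n(t),\ 0 \le t \le \tau\}$ converges weakly in $D[0, \tau]$ to a zero-mean Gaussian process $\{Z(t),\ 0 \le t \le \tau\}$ with covariance function

$$\sigma(t, t') = \operatorname{Cov}(Z(t), Z(t')) = \lambda_1^2 \sigma_{11}(t, t') + \lambda_2^2 \sigma_{22}(t, t') + \lambda_1 \lambda_2 \sigma_{12}(t, t') + \lambda_1 \lambda_2 \sigma_{21}(t, t'),$$

where

$$\sigma_{11}(t, t') = \sum_{i=1}^{2} \int_0^{t \wedge t'} q(t, s)\, q(t', s)\, \frac{\pi_1(t, s)\, \pi_2(t, s)}{\pi_1(t, s) + \pi_2(t, s)}\, \frac{\pi_1(t', s)\, \pi_2(t', s)}{\pi_1(t', s) + \pi_2(t', s)}\, \frac{\pi_i(t \wedge t', s)}{\pi_i(t, s)\, \pi_i(t', s)}\, d\Lambda(s), \tag{A.1}$$

with $\pi_i(t, s) = p_i S(s)\, b_i(t, s)$, $i = 1, 2$,

$$\sigma_{22}(t, t') = \sum_{i=1}^{2} \int_0^{t \wedge t'} \xi(t, s)\, \xi(t', s)\, \frac{S_i(s-)\, S_i(s-)}{(S_i(s))^2}\, \frac{\pi_i(t \wedge t', s)}{\pi_i(t, s)\, \pi_i(t', s)}\, d\Lambda(s), \tag{A.2}$$

with $\xi(t, s) = H(t, s)\, S(s) + \int_s^t H(t, u)\, dS(u)$,

$$\sigma_{12}(t, t') = \sum_{i=1}^{2} \int_0^{t \wedge t'} \xi(t, s)\, q(t', s)\, \frac{\pi_1(t', s)\, \pi_2(t', s)}{\pi_1(t', s) + \pi_2(t', s)}\, \frac{\pi_i(t \wedge t', s)}{\pi_i(t, s)\, \pi_i(t', s)}\, d\Lambda(s), \tag{A.3}$$

$$\sigma_{21}(t, t') = \sum_{i=1}^{2} \int_0^{t \wedge t'} q(t, s)\, \xi(t', s)\, \frac{\pi_1(t, s)\, \pi_2(t, s)}{\pi_1(t, s) + \pi_2(t, s)}\, \frac{\pi_i(t \wedge t', s)}{\pi_i(t, s)\, \pi_i(t', s)}\, d\Lambda(s). \tag{A.4}$$

A consistent estimator of $\operatorname{Cov}(Z(t), Z(t'))$ is $\hat\sigma(t, t') = \lambda_1^2 \hat\sigma_{11}(t, t') + \lambda_2^2 \hat\sigma_{22}(t, t') + \lambda_1 \lambda_2 \hat\sigma_{12}(t, t') + \lambda_1 \lambda_2 \hat\sigma_{21}(t, t')$, where the expressions for $\hat\sigma_{ij}(t, t')$, $i = 1, 2$, $j = 1, 2$, are given below:

$$\hat\sigma_{11}(t, t') = \frac{1}{n} \int_0^{t \wedge t'} Q(t, s)\, Q(t', s)\, \frac{R_1(t, s)\, R_2(t, s)}{R_1(t, s) + R_2(t, s)} \left(1 - \frac{\Delta N_1(t, s) + \Delta N_2(t, s) - 1}{R_1(t, s) + R_2(t, s) - 1}\right) \frac{dN_+(t', s)}{R_+(t', s)}, \tag{A.5}$$

$$\hat\sigma_{22}(t, t') = n \int_0^{t \wedge t'} \hat\xi(t, s)\, \hat\xi(t', s) \left(\frac{1}{R_1(t', s)} + \frac{1}{R_2(t', s)}\right) \left(1 - \frac{\Delta N_1(t, s) + \Delta N_2(t, s) - 1}{R_1(t, s) + R_2(t, s) - 1}\right) \frac{dN_+(t', s)}{R_+(t', s)}, \tag{A.6}$$

$$\hat\sigma_{12}(t, t') = \int_0^{t \wedge t'} \hat\xi(t, s)\, Q(t', s)\, \frac{R_1(t', s)\, R_2(t', s)}{R_1(t', s) + R_2(t', s)} \left(\frac{1}{R_1(t', s)} + \frac{1}{R_2(t', s)}\right) \cdot \left(1 - \frac{\Delta N_1(t, s) + \Delta N_2(t, s) - 1}{R_1(t, s) + R_2(t, s) - 1}\right) \frac{dN_+(t', s)}{R_+(t', s)}, \tag{A.7}$$

$$\hat\sigma_{21}(t, t') = \int_0^{t \wedge t'} Q(t, s)\, \hat\xi(t', s)\, \frac{R_1(t, s)\, R_2(t, s)}{R_1(t, s) + R_2(t, s)} \left(\frac{1}{R_1(t', s)} + \frac{1}{R_2(t', s)}\right) \cdot \left(1 - \frac{\Delta N_1(t, s) + \Delta N_2(t, s) - 1}{R_1(t, s) + R_2(t, s) - 1}\right) \frac{dN_+(t', s)}{R_+(t', s)}. \tag{A.8}$$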


BIBLIOGRAPHY

Chi, Y. C. and Tsai, M. H., Some versatile tests based on the simultaneous use of weighted logrank and weighted Kaplan-Meier statistics, to appear in Communications in Statistics, 2001.

Gu, M., Follmann, D. and Geller, N. L., Monitoring a general class of two-sample survival statistics with applications, Biometrika, 1999, 86, 45-57.

Gu, M. and Lai, T. L., Weak convergence of time-sequential rank statistics with applications to sequential testing in clinical trials, Annals of Statistics, 1991, 19, 1403-1433.

Harrington, D. P. and Fleming, T. R., A class of rank test procedures for censored survival data, Biometrika, 1982, 69, 553-566.

Jones, D. and Whitehead, J., Sequential forms of the logrank and modified Wilcoxon tests for censored data, Biometrika, 1979, 66, 105-113.

Lee, J. W., Some versatile tests based on the simultaneous use of weighted log-rank statistics, Biometrics, 1996, 52, 721-725.

Li, Z., A group sequential test for survival trials: An alternative to rank-based procedures, Biometrics, 1999, 55, 277-283.

O’Brien, P. C. and Fleming, T. R., A multiple testing procedure for clinical trials, Biometrics, 1979, 35, 549-556.

Pepe, M. S. and Fleming, T. R., Weighted Kaplan-Meier statistics: a class of distance tests for censored survival data, Biometrics, 1989, 45, 497-507.

Pocock, S. J., Group sequential methods in the design and analysis of clinical trials, Biometrika, 1977, 64, 191-199.

Sellke, T. and Siegmund, D., Sequential analysis of the proportional hazards model, Biometrika, 1983, 70, 315-326.

Slud, E. V., Sequential linear rank tests for two-sample censored survival data, Annals of Statistics, 1984, 12, 551-571.

Slud, E. V. and Wei, L. J., Two-sample repeated significance tests based on the modified Wilcoxon statistic, Journal of the American Statistical Association, 1982, 77, 862-868.

Tsiatis, A. A., Repeated significance testing for a general class of statistics used in censored survival analysis, Journal of the American Statistical Association, 1982, 77, 855-861.