GROUP SEQUENTIAL TESTS BASED ON THE SIMULTANEOUS USE OF WEIGHTED LOGRANK AND WEIGHTED KAPLAN-MEIER TESTS
Yunchan Chi
Department of Statistics
National Cheng-Kung University Tainan, Taiwan
ycchi@email.stat.ncku.edu.tw
ABSTRACT
This paper proposes a robust group sequential test procedure based on the joint use of the group sequential weighted Kaplan-Meier (WKM) test (Li, 1999; Gu et al., 1999) and the group sequential weighted logrank test (Gu and Lai, 1991). None of the tests considered in the simulation study performs uniformly better than the others. Nevertheless, the overall performance of the proposed group sequential test procedure is slightly better than that of the group sequential WKM and logrank tests.
Key Words : group sequential test procedures, weighted Kaplan-Meier test, weighted logrank test
1. INTRODUCTION
In clinical trials with survival time as the response, a study usually lasts several years before a conclusion can be drawn. Therefore, using group sequential tests to assess the treatment effect while saving cost and time has become an important issue in medical research. Recently, many statisticians have devoted attention to group sequential analysis with censored observations and staggered entry. Slud (1984) gave a detailed development of these methods (Jones and Whitehead, 1979; Slud and Wei, 1982; Tsiatis, 1982; Sellke and Siegmund, 1983) and generalized the results of Tsiatis (1982) and Sellke and Siegmund (1983) to weighted logrank statistics with weights independent of time. Gu and Lai (1991) further developed weak convergence for weighted logrank tests with weights that may depend on time. However, Pepe and Fleming (1989) pointed out that weighted logrank tests are based on ranks, so they might not be sensitive to the magnitude of the difference in survival times against a specific alternative. Consequently, they suggested a class of test statistics based on the integrated weighted differences in Kaplan-Meier estimates and showed that these statistics compare favorably with the logrank test even under proportional hazards alternatives, and may perform better under early and crossing hazards alternatives. Further, Li (1999) and Gu et al. (1999) extended weighted Kaplan-Meier tests to sequential clinical trials and derived the weak convergence of group sequential weighted Kaplan-Meier tests. For a given data set, however, one seldom knows what the exact alternative is. Therefore, Lee (1996) constructed versatile tests by combining four weighted logrank tests with weight functions detecting proportional, early, late, and middle hazards differences, respectively. Chi and Tsai (2001) further constructed robust tests based on the simultaneous use of weighted logrank and weighted Kaplan-Meier tests.
Their results showed that the overall performance of the linear combination of weighted logrank and weighted Kaplan-Meier tests is better than Lee’s linear combination test based on weighted logrank tests only. Therefore, group sequential tests based on the joint use of weighted logrank and weighted Kaplan-Meier tests will be derived in this paper.
In Section 2, the group sequential weighted logrank and group sequential weighted Kaplan-Meier tests are reviewed. Next, the proposed test statistics are developed in Section 3. Finally, simulation results on the power comparisons among the proposed test, the group sequential weighted logrank tests and the group sequential weighted Kaplan-Meier tests are presented in Section 4.
2. LITERATURE REVIEW
2.1. Preliminary
In comparative clinical trials, patients usually enter the trial serially and are assigned to treatments according to some random mechanism. Let $Y_{ij}$ be the entry time (calendar time) of the $j$th subject in group $i$, $j = 1, 2, \ldots, n_i$, $i = 1, 2$, where $n_i$ is the sample size of the $i$th group and $n = n_1 + n_2$ is the total sample size. Let $T_{ij}$ denote the survival time from entry for patient $j$ in group $i$ until the endpoint (death, tumor recurrence, etc.) under investigation, and let $C_{ij}$ denote the time from entry until patient $j$ in group $i$ is lost to follow-up. Thus the patient entering the trial at $Y_{ij}$ is on test during the time interval $[Y_{ij}, Y_{ij} + (T_{ij} \wedge C_{ij})]$, where $\wedge$ denotes minimum. At time $Y_{ij} + (T_{ij} \wedge C_{ij})$, we observe an event if $T_{ij} \le C_{ij}$ and otherwise observe that the patient is censored. Hence, at calendar time $t$, the censoring indicator for the $j$th subject in group $i$ is defined as $\delta_{ij}(t) = I\{T_{ij} \le C_{ij} \wedge (t - Y_{ij})^+\}$, where $a^+ = \max(0, a)$, and $X_{ij}(t) = T_{ij} \wedge C_{ij} \wedge (t - Y_{ij})^+$ is the observed survival time.
Gu and Lai (1991) provided a very general setting, based on counting processes and martingales in two dimensions, for deriving the weak convergence of time-sequential censored rank statistics. Let $N_i(t, s) = \sum_{j=1}^{n_i} I\{T_{ij} \le C_{ij} \wedge (t - Y_{ij})^+ \wedge s\}$ be the counting process recording the events that have occurred by time $s$, and let $R_i(t, s) = \sum_{j=1}^{n_i} I\{X_{ij}(t) \ge s\}$ be the risk process counting the subjects still at risk at time $s$. We assume that the survival times $T_{ij}$ are independent of the entry and censoring times $(Y_{ij}, C_{ij})$, and that the distributions of $Y_{ij}$, $T_{ij}$ and $C_{ij}$ for the $i$th population are continuous and denoted by $E_i(t)$, $F_i(t)$ and $L_i(t)$, respectively. In addition, let $S_i(t) = 1 - F_i(t)$ and $G_i(t) = 1 - L_i(t)$ be the survival functions of the survival and censoring times, respectively, and let the cumulative hazard function of the $i$th group be $\Lambda_i(t) = -\log(S_i(t))$.
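The definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `observed_at`, `at_risk` and `event_count` are hypothetical and do not come from the paper, and the inputs are the entry, survival and censoring times of one group.

```python
import numpy as np

def observed_at(t, entry, surv, cens):
    """Observed time X_ij(t) and indicator delta_ij(t) at calendar time t."""
    entry, surv, cens = (np.asarray(a, dtype=float) for a in (entry, surv, cens))
    follow = np.maximum(t - entry, 0.0)             # (t - Y_ij)^+
    x = np.minimum(np.minimum(surv, cens), follow)  # T ^ C ^ (t - Y)^+
    delta = surv <= np.minimum(cens, follow)        # I{T <= C ^ (t - Y)^+}
    return x, delta

def at_risk(x, s):
    """R_i(t, s): number of subjects with observed time X_ij(t) >= s."""
    return int(np.sum(np.asarray(x) >= s))

def event_count(x, delta, s):
    """N_i(t, s): number of observed events with survival time <= s."""
    return int(np.sum(np.asarray(delta) & (np.asarray(x) <= s)))
```

A subject who has not yet entered by calendar time $t$ gets observed time 0 and indicator 0, so it contributes to neither process for $s > 0$.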
2.2. Group Sequential Weighted Logrank Tests
For testing the equality of two survival distributions ($H_0 : S_1(t) = S_2(t)$ for all $t$) against the stochastic ordering alternative ($H_1 : S_1(t) \le S_2(t)$, with strict inequality for some $t$) in sequential clinical trials, Gu and Lai (1991) unified the most commonly used group sequential testing procedures as time-sequential censored rank statistics,
$$U_1(t) = \sqrt{1/n} \int_0^t K(t, s)\, d\{\hat\Lambda_1(t, s) - \hat\Lambda_2(t, s)\},$$
where
$$\hat\Lambda_i(t, s) = \int_0^s \frac{dN_i(t, u)}{R_i(t, u)}, \quad i = 1, 2, \qquad \text{and} \qquad K(t, s) = Q(t, s)\, \frac{R_1(t, s)\, R_2(t, s)}{R_1(t, s) + R_2(t, s)}.$$
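As a rough illustration of this statistic, the sketch below evaluates $U_1(t)$ for the constant weight $Q(t, s) = 1$ (the logrank case). It assumes the inputs are the observed times $X_{ij}(t)$ and indicators $\delta_{ij}(t)$ of the two groups at one calendar time $t$; the function name `logrank_u1` is hypothetical.

```python
import numpy as np

def logrank_u1(x1, d1, x2, d2):
    """U_1(t) with Q = 1, from observed times and event indicators at time t."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    d1, d2 = np.asarray(d1, bool), np.asarray(d2, bool)
    n = len(x1) + len(x2)
    u = 0.0
    # The integral reduces to a sum over the distinct observed event times.
    for s in np.unique(np.concatenate([x1[d1], x2[d2]])):
        r1, r2 = np.sum(x1 >= s), np.sum(x2 >= s)   # R_1(t, s), R_2(t, s)
        if r1 == 0 or r2 == 0:
            continue                                # K(t, s) = 0 here
        dn1 = np.sum(d1 & (x1 == s))                # jump of N_1 at s
        dn2 = np.sum(d2 & (x2 == s))                # jump of N_2 at s
        k = r1 * r2 / (r1 + r2)                     # K(t, s) with Q = 1
        u += k * (dn1 / r1 - dn2 / r2)
    return u / np.sqrt(n)
```

Positive values indicate more events than expected in group 1, which is the direction of the alternative $S_1(t) \le S_2(t)$.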
For example, $Q(t, s) = 1$ corresponds to the group sequential logrank test (Tsiatis, 1982) and $Q(t, s) = R_1(t, s) + R_2(t, s)$ yields the group sequential Gehan-Wilcoxon test statistic (Slud and Wei, 1982). Moreover, an extended version of the weight function of Harrington and Fleming (1982) has the form $Q(t, s) = \hat S(t, s)^{\varrho}\,(1 - \hat S(t, s))^{\varrho}$, where $\hat S(t, s)$ is the Kaplan-Meier estimator of the survival distribution based on the combined data. The null asymptotic distribution of $U_1(t)$ based on the data at time $t$ is normal with mean zero and variance $\sigma_{11}(t, t)$, which is given in (A.1) in the Appendix and can be consistently estimated by $\hat\sigma_{11}(t, t)$ listed in (A.5) in the Appendix. Suppose the group sequential trial is examined at time points $t_1 < t_2 < \cdots < t_k$. A two-sample group sequential weighted logrank test (GSWLR) rejects $H_0$ in favor of $H_1$ at the $j$th examination time point $t_j$ if
$$U_1(t_j) \big/ \sqrt{\hat\sigma_{11}(t_j, t_j)} \ge z^*_{\alpha_j}(t_j),$$
where the critical values $z_j = z^*_{\alpha_j}(t_j)$ are derived by choosing positive values $\alpha_1, \alpha_2, \cdots, \alpha_k$ so that the overall level of significance is $\alpha = \sum_{j=1}^{k} \alpha_j$, and then recursively solving the equations
$$P(Z_1(t_1) < z_1, Z_2(t_2) < z_2, \cdots, Z_{j-1}(t_{j-1}) < z_{j-1}, Z_j(t_j) \ge z_j) = \alpha_j, \quad j = 1, \cdots, k,$$
where $(Z_1(t_1), Z_2(t_2), \cdots, Z_j(t_j))$ is multivariate normal with mean zero and covariance matrix with elements equal to $\sigma_{11}(t_i, t_j) / \sqrt{\sigma_{11}(t_i, t_i)\, \sigma_{11}(t_j, t_j)}$. Note that a program by Schervish (1984) for calculating multivariate normal probabilities can be applied for finding $z_j$. A consistent estimator of $\sigma_{11}(t_i, t_j)$ is also given in (A.5) in the Appendix.
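The recursive equations for the critical values can be approximated numerically. The sketch below is not the Schervish (1984) program; it swaps in plain Monte Carlo simulation of the multivariate normal vector, and the function name `mc_boundaries`, the number of simulations and the seed are all assumptions made for illustration.

```python
import numpy as np

def mc_boundaries(corr, alphas, n_sim=200_000, seed=0):
    """Approximate z_1..z_k with P(Z_1<z_1,...,Z_{j-1}<z_{j-1}, Z_j>=z_j) = alpha_j."""
    rng = np.random.default_rng(seed)
    chol = np.linalg.cholesky(np.asarray(corr, dtype=float))
    z = rng.standard_normal((n_sim, len(alphas))) @ chol.T  # rows ~ N(0, corr)
    alive = np.ones(n_sim, dtype=bool)      # paths that have not yet rejected
    bounds = []
    for j, a in enumerate(alphas):
        zj = np.sort(z[alive, j])
        m = int(round(a * n_sim))           # paths allowed to reject at look j
        cut = zj[-m] if m > 0 else np.inf   # m-th largest surviving value
        bounds.append(cut)
        alive &= z[:, j] < cut              # continue only below the boundary
    return np.array(bounds)
```

With a single look and $\alpha_1 = 0.025$, the result should be close to the usual one-sided normal critical value, up to simulation error.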
2.3. Group Sequential Weighted Kaplan-Meier Tests
Weighted logrank tests are based on ranks and might not be sensitive to the magnitude of the difference in survival time against a specific alternative. The reason is illustrated by an example in Pepe and Fleming (1989). Consequently, they proposed a class of weighted Kaplan-Meier (WKM) test statistics based on the sum of weighted differences in Kaplan-Meier estimators. Moreover, they showed that WKM compares favorably with the logrank test even under the proportional hazards alternative, and may perform better than the logrank test under early and crossing hazards difference alternatives. However, WKM may perform worse under late hazards difference alternatives, since the weight function is chosen to put less weight on later time periods if the censoring rate is heavy. Nevertheless, Li (1999) further extended their procedure to sequential clinical trials, and the extended weighted Kaplan-Meier tests based on the data at time $t$ have the following form
$$U_2(t) = \sqrt{n} \int_0^{T_c(t)} \hat\omega(t, s)\, \{\hat S_2(t, s) - \hat S_1(t, s)\}\, ds,$$
where $T_c(t) = \sup\{s : \hat b_1(t, s) \wedge \hat b_2(t, s) > 0\}$, $\wedge$ denotes minimum, $\hat S_i(t, s)$ is the Kaplan-Meier estimator of the survival distribution of group $i$, $\hat b_i(t, s)$ is an empirical estimate of $\Pr(C_i \ge s, t - Y_i \ge s)$, and the random weight function $\hat\omega(t, s)$ estimates a deterministic function $\omega(t, s)$ which downweights the contributions of the difference of the two Kaplan-Meier estimators over later time periods if censoring is heavy.
If the weight function is chosen appropriately, the resulting statistics will be stable. For instance,
$$\hat\omega(t, s) = \frac{\hat b_1(t, s)\, \hat b_2(t, s)}{\hat p_1 \hat b_1(t, s) + \hat p_2 \hat b_2(t, s)},$$
where $\hat p_i = n_i / n$, $i = 1, 2$, was used by Li (1999).
Furthermore, Gu et al. (1999) investigated more general weight functions in terms of
$$U_2(t) = \sqrt{n} \int_0^{T_c(t)} \hat H(t, s)\, d\{\hat S_2(t, s) - \hat S_1(t, s)\},$$
where $T_c(t)$ is chosen so that the risk set at $T_c(t)$ for each of the two groups is large enough, and $\hat H(t, s)$ is a random function of bounded variation which converges pointwise in probability to a deterministic piecewise continuous function $H(t, s)$. Several test statistics belong to this class. For example, if $\hat H(t, s) = \hat S(t, s)$, where $\hat S(t, s)$ is the Kaplan-Meier estimator of the survival distribution based on the combined data, then $U_2(t)$ is equivalent to a group sequential and truncated version of Efron’s (1967) two-sample test statistic with $T_c(t) = \infty$. For $\hat H(t, s) = \int_s^{T_c(t)} \hat\omega(t, u)\, du$, the above statistic reduces to Li’s test. More examples can be found in Gu et al. (1999).
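To make the integrated form concrete, the following sketch computes $\sqrt{n} \int_0^{\tau} \omega(s)\{\hat S_2(s) - \hat S_1(s)\}\, ds$ from observed times and event indicators at one analysis time. It is a simplification under stated assumptions: `tau` stands in for $T_c(t)$, the default weight is the constant 1 rather than Li’s $\hat\omega(t, s)$, the weight is evaluated at the left end of each interval between jumps (exact only for weights that are constant between jumps), and the function names are hypothetical.

```python
import numpy as np

def km_curve(x, d):
    """Distinct event times and the Kaplan-Meier estimate just after each."""
    x, d = np.asarray(x, float), np.asarray(d, bool)
    times = np.unique(x[d])
    surv, s = [], 1.0
    for u in times:
        s *= 1.0 - np.sum(d & (x == u)) / np.sum(x >= u)
        surv.append(s)
    return times, np.array(surv)

def step_value(times, surv, s):
    """Right-continuous Kaplan-Meier estimate at time s."""
    idx = np.searchsorted(times, s, side="right")
    return 1.0 if idx == 0 else surv[idx - 1]

def wkm_u2(x1, d1, x2, d2, tau, weight=lambda s: 1.0):
    """sqrt(n) times the weighted integral of S2_hat - S1_hat over [0, tau]."""
    n = len(x1) + len(x2)
    t1, s1 = km_curve(x1, d1)
    t2, s2 = km_curve(x2, d2)
    grid = np.unique(np.concatenate([[0.0, tau], t1[t1 < tau], t2[t2 < tau]]))
    total = 0.0
    for a, b in zip(grid[:-1], grid[1:]):
        diff = step_value(t2, s2, a) - step_value(t1, s1, a)
        total += weight(a) * diff * (b - a)   # both curves are flat on [a, b)
    return np.sqrt(n) * total
```

With the constant weight, the integral is the difference in restricted mean survival up to `tau`, which gives a simple way to check the code by hand.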
The null asymptotic distribution of $U_2(t)$ evaluated at time $t$ is normal with mean zero and variance $\sigma_{22}(t, t)$, which is given in (A.2) in the Appendix and can be consistently estimated by $\hat\sigma_{22}(t, t)$ listed in (A.6) in the Appendix. It is worth noting that for $\hat H(t, s) = \int_s^{T_c(t)} \hat\omega(t, u)\, du$ there were typos in the $\sigma_{22}(t, t)$ formula in Li (1999, pages 278-279); the correct formula is given in the Appendix. As a result, the expression for the variance estimate of $\sigma_{22}(t, t)$ in Li (1999) also needs to be modified. The correct formula is also given in the Appendix. Thus, a two-sample group sequential weighted Kaplan-Meier test (GSWKM) rejects $H_0$ in favor of $H_1$ at the $j$th examination time point $t_j$ if
$$U_2(t_j) \big/ \sqrt{\hat\sigma_{22}(t_j, t_j)} \ge z^*_{\alpha_j}(t_j),$$
where the critical values $z_j = z^*_{\alpha_j}(t_j)$ are, again, derived by choosing positive values $\alpha_1, \alpha_2, \cdots, \alpha_k$ so that the overall level of significance is $\alpha = \sum_{j=1}^{k} \alpha_j$, and then recursively solving the equations
$$P(Z_1(t_1) < z_1, Z_2(t_2) < z_2, \cdots, Z_{j-1}(t_{j-1}) < z_{j-1}, Z_j(t_j) \ge z_j) = \alpha_j, \quad j = 1, \cdots, k,$$
where $(Z_1(t_1), Z_2(t_2), \cdots, Z_j(t_j))$ is multivariate normal with mean zero and covariance matrix with elements equal to $\sigma_{22}(t_i, t_j) / \sqrt{\sigma_{22}(t_i, t_i)\, \sigma_{22}(t_j, t_j)}$. A consistent estimator of $\sigma_{22}(t_i, t_j)$ is also given in (A.6) in the Appendix.
From Li’s limited simulation results, WKM performs well under early and crossing hazard difference alternatives; however, no simulation results were reported for the late hazard difference alternative. This may be due to the weight function, which is chosen to put less weight on later time periods if the censoring rate is heavy.
3. THE PROPOSED TEST STATISTICS
For a given data set, however, one seldom knows what the exact alternative is. Therefore, Lee (1996) constructed versatile tests by combining four weighted logrank (WLR) tests with weight functions designed to detect proportional, early, late and middle hazards differences, respectively. To take into account the advantages of both WKM and WLR, Chi and Tsai (2001) further proposed the simultaneous use of weighted logrank and weighted Kaplan-Meier statistics. Their simulation results demonstrated that the joint use of WLR and WKM produces better power than Lee’s linear combination test and is very robust against a broad range of alternatives. Therefore, the proposed test extends this linear combination to the group sequential WLR and WKM statistics and is defined as $U(t) = \lambda_1 U_1(t) + \lambda_2 U_2(t)$ for some real numbers $\lambda_i$, $i = 1, 2$, such that $\lambda_1 + \lambda_2 = 1$. Although the asymptotic distributions of group sequential WLR and WKM have each been derived in the literature, the linear combination of these two statistics has not yet been investigated. The statistics $U_1(t)$ and $U_2(t)$ are correlated for a given data set; therefore, it is necessary to examine the asymptotic distribution of $U(t)$ based on the data at time $t$ in detail. The derivations are outlined in the Appendix.
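When using equal weights, $\lambda_1 = \lambda_2 = 1/2$, with $U_1(t)$ and $U_2(t)$ each standardized to unit null variance, the Theorem in the Appendix implies that the null asymptotic distribution of $U(t)$ is normal with mean zero and variance $(1 + \rho(t))/2$, where $\rho(t) = \sigma_{12}(t, t) / \sqrt{\sigma_{11}(t, t)\, \sigma_{22}(t, t)}$ is the correlation between $U_1(t)$ and $U_2(t)$, and the expressions of $\sigma_{11}(t, t)$, $\sigma_{22}(t, t)$ and $\sigma_{12}(t, t)$ are given in (A.1), (A.2), and (A.3), respectively, in the Appendix. A consistent estimator of $\rho(t)$ is $\hat\rho(t) = \hat\sigma_{12}(t, t) / \sqrt{\hat\sigma_{11}(t, t)\, \hat\sigma_{22}(t, t)}$, where $\hat\sigma_{11}(t, t)$, $\hat\sigma_{22}(t, t)$ and $\hat\sigma_{12}(t, t)$ are given in (A.5), (A.6), and (A.7), respectively, in the Appendix. Thus, we suggest rejecting $H_0$ in favor of $H_1$ if
$$\mathrm{GSKL}(t_j) = U(t_j) \Big/ \sqrt{\frac{1 + \hat\rho(t_j)}{2}} \ge z^*_{\alpha_j}(t_j),$$
where the critical values $z_j = z^*_{\alpha_j}(t_j)$ are, again, derived by choosing positive values $\alpha_1, \alpha_2, \cdots, \alpha_k$ so that the overall level of significance is $\alpha = \sum_{j=1}^{k} \alpha_j$, and then recursively solving the equations
$$P(Z_1(t_1) < z_1, Z_2(t_2) < z_2, \cdots, Z_{j-1}(t_{j-1}) < z_{j-1}, Z_j(t_j) \ge z_j) = \alpha_j, \quad j = 1, \cdots, k,$$
where $(Z_1(t_1), Z_2(t_2), \cdots, Z_j(t_j))$ is multivariate normal with mean zero and covariance matrix with elements equal to $\sigma(t_i, t_j) / \sqrt{\sigma(t_i, t_i)\, \sigma(t_j, t_j)}$, where $\sigma(t, t') = \lambda_1^2 \sigma_{11}(t, t') + \lambda_2^2 \sigma_{22}(t, t') + \lambda_1 \lambda_2 \sigma_{12}(t, t') + \lambda_1 \lambda_2 \sigma_{21}(t, t')$ is the covariance function given in the Theorem in the Appendix.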
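The equal-weight standardization can be checked with a small numerical sketch. It assumes that the two component statistics have already been standardized to unit null variance (which is what makes the variance of their average equal to $(1 + \rho)/2$); the function names `combined_variance` and `gskl` are hypothetical.

```python
import math

def combined_variance(s11, s22, s12, lam1=0.5, lam2=0.5):
    """Variance of lam1*U1 + lam2*U2 at a single time, using sigma12 = sigma21."""
    return lam1 ** 2 * s11 + lam2 ** 2 * s22 + 2.0 * lam1 * lam2 * s12

def gskl(z1, z2, rho_hat):
    """Equal-weight combination of two standardized statistics, restandardized."""
    u = 0.5 * z1 + 0.5 * z2
    return u / math.sqrt((1.0 + rho_hat) / 2.0)
```

For unit variances and correlation 0.6, `combined_variance(1.0, 1.0, 0.6)` equals $(1 + 0.6)/2$, so the general variance formula and the equal-weight shortcut agree.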
4. SIMULATION RESULTS
To understand the accuracy of the asymptotic distribution of the proposed group sequential test statistic, GSKL, and its power properties, Monte Carlo simulations were carried out under Li’s simulation settings. To compare the simulation results with those in Li (1999), the information fractions 16/190, 60/190, 124/190 and 190/190 from Li’s simulation settings were used. Moreover, a total of 824 entry times were generated according to the uniform (0, 1.5) distribution. Each patient was assigned to the treatment or the placebo group with equal chance. In the absence of random loss to follow-up, and under the null hypothesis with an exponential survival function with scale parameter 0.5108, numerical integration gives the covariance matrix of the group sequential WKM test statistic at the four times 0.5, 1.0, 1.5, 2.0 as (lower triangle shown)
$$\Sigma = \begin{pmatrix} 1.0 & & & \\ 0.5894 & 1 & & \\ 0.4145 & 0.7450 & 1.0 & \\ 0.3673 & 0.6543 & 0.8807 & 1.0 \end{pmatrix}.$$
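One replicate of this null setting might be generated as in the sketch below. Two points are assumptions rather than statements from the paper: the value 0.5108 is interpreted as a hazard rate (note that exp(-0.5108) is about 0.60), and the function names `simulate_null_trial` and `snapshot` are hypothetical.

```python
import numpy as np

def simulate_null_trial(n=824, accrual=1.5, hazard=0.5108, seed=0):
    """Entry times, group labels and survival times under the null hypothesis."""
    rng = np.random.default_rng(seed)
    entry = rng.uniform(0.0, accrual, size=n)           # staggered entry
    group = rng.integers(1, 3, size=n)                  # 1 or 2, equal chance
    surv = rng.exponential(scale=1.0 / hazard, size=n)  # ASSUMED rate parameter
    return entry, group, surv

def snapshot(t, entry, surv):
    """Observed times and event indicators at calendar time t (no loss to follow-up)."""
    follow = np.maximum(t - entry, 0.0)
    return np.minimum(surv, follow), surv <= follow
```

Calling `snapshot` at the analysis times 0.5, 1.0, 1.5 and 2.0 gives the accumulating data on which the group sequential statistics are evaluated.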
Moreover, the proportional hazards model and the early and crossing hazard difference alternatives described in Li (1999, page 280) were used for comparing the power. The O’Brien-Fleming-type and Pocock-type critical values are (6.891, 3.473, 2.423, 1.99) and (2.703, 2.397, 2.288, 2.233), respectively. Note that these critical values are for testing omnibus alternatives, not for stochastic ordering alternatives.
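A quick way to see how a set of critical values behaves under a given correlation structure is to simulate the overall two-sided rejection probability. The sketch below uses the WKM covariance matrix $\Sigma$ reported in this section and the two sets of critical values quoted above. It is only an illustration: the paper does not say that the critical values were derived from this particular matrix, and the function name `overall_error`, the simulation size and the seed are assumptions.

```python
import numpy as np

# Symmetric version of the covariance matrix Sigma reported in the text.
SIGMA = np.array([
    [1.0,    0.5894, 0.4145, 0.3673],
    [0.5894, 1.0,    0.7450, 0.6543],
    [0.4145, 0.7450, 1.0,    0.8807],
    [0.3673, 0.6543, 0.8807, 1.0],
])
OBF = (6.891, 3.473, 2.423, 1.99)       # O'Brien-Fleming-type values from the text
POCOCK = (2.703, 2.397, 2.288, 2.233)   # Pocock-type values from the text

def overall_error(corr, crit, n_sim=200_000, seed=1):
    """Monte Carlo estimate of P(|Z_j| >= c_j for at least one look j)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n_sim, len(crit))) @ np.linalg.cholesky(corr).T
    return float(np.mean(np.any(np.abs(z) >= np.asarray(crit), axis=1)))
```

Comparing `overall_error(SIGMA, OBF)` and `overall_error(SIGMA, POCOCK)` with the intended overall level shows how sensitive the boundaries are to the assumed correlation between looks.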
The type I error rate is estimated from data generated from the exponential distribution with scale parameter 0.5108. In Table 1, the test statistics considered here maintain reasonable type I error rates. For the proportional hazards model, the statistic based on the joint use of WLR and WKM, GSKL, has higher power than the group sequential WKM test but slightly lower power than the group sequential logrank test. For the early hazard difference, the powers of GSKL, group sequential WKM, and group sequential logrank are almost the same. The group sequential logrank test performs much worse under the crossing hazard difference alternative, and GSKL has lower power than WKM. For late hazard difference alternatives, group sequential WKM has the lowest power and the power of GSKL is still in second place. In conclusion, none of the tests considered in the simulation performs uniformly better than the others. Nevertheless, the overall performance of GSKL is slightly better than that of the group sequential WKM and logrank tests. Note that the numbers in parentheses are average stopping times.
Table 1 : Estimated Power and Nominal Level
Boundary                 Pocock                  O’Brien-Fleming
               GSWKM    GSLR     GSKL      GSWKM    GSLR     GSKL
Proportional   0.817    0.858    0.847     0.866    0.894    0.886
               (3.06)   (2.99)   (3.01)    (3.39)   (3.36)   (3.37)
Early          0.743    0.735    0.746     0.794    0.770    0.786
               (3.09)   (3.05)   (3.06)    (3.43)   (3.43)   (3.48)
Crossing       0.364    0.260    0.315     0.330    0.186    0.259
               (3.41)   (3.54)   (3.47)    (3.75)   (3.86)   (3.80)
Late           0.059    0.083    0.069     0.065    0.099    0.078
               (3.92)   (3.93)   (3.92)    (3.98)   (3.98)   (3.98)
Null           0.049    0.055    0.053     0.050    0.054    0.051
               (3.94)   (3.93)   (3.93)    (3.98)   (3.98)   (3.98)
APPENDIX
The martingale form of $U_1(t)$ evaluated at time $t$, the nonstandardized form of GSWLR, is
$$U_1(t) = \int_0^s H_{11}(t, u)\, dM_1(t, u) - \int_0^s H_{12}(t, u)\, dM_2(t, u),$$
where
$$H_{1j}(t, u) = \frac{1}{\sqrt{n}}\, Q(t, u)\, \frac{R_1(t, u)\, R_2(t, u)}{R_1(t, u) + R_2(t, u)}\, \frac{I\{R_j(t, u) > 0\}}{R_j(t, u)}, \quad j = 1, 2.$$
Gu and Lai (1991) developed a certain maximal inequality and rigorously stated and proved the asymptotic distribution of GSWLR under a weight function $Q(t, u)$ that may change with $t$. Thus, their results will not be reproduced here. Gu et al. (1999) further examined the group sequential test based on weighted Kaplan-Meier statistics, and its martingale form is
$$U_2(t) = \int_0^s H_{21}(t, u)\, dM_1(t, u) - \int_0^s H_{22}(t, u)\, dM_2(t, u) + o_p(1),$$
where
$$H_{2j}(t, u) = \sqrt{n}\, \hat\xi(t, u)\, \frac{\hat S_j(t, u-)}{\hat S_j(t, u)}\, \frac{I\{R_j(t, u) > 0\}}{R_j(t, u)}, \quad j = 1, 2,$$
with
$$\hat\xi(t, u) = \hat H(t, u)\, S(u) + \int_u^{T_c(t)} \hat H(t, v)\, dS(v).$$
Hence, the linear combination of $U_1(t)$ and $U_2(t)$, $U(t)$, can be expressed as a martingale plus a negligible term,
$$U_n(t) = \int_0^s \{\lambda_1 H_{11}(t, u) + \lambda_2 H_{21}(t, u)\}\, dM_1(t, u) - \int_0^s \{\lambda_1 H_{12}(t, u) + \lambda_2 H_{22}(t, u)\}\, dM_2(t, u) + o_p(1).$$
In order to show the weak convergence of $U_n(t)$, several assumptions are needed and are listed below.
A1. $\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} \Pr(C_{1i} \ge s, t - Y_{1i} \ge s) = b_1(t, s)$ and $\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} \Pr(C_{2j} \ge s, t - Y_{2j} \ge s) = b_2(t, s)$, where $n = n_1 + n_2$, $n_i / n \to p_i$, $0 < p_i < 1$, $i = 1, 2$, and $b_i(t, s) = \Pr(C_i \ge s, Y_i \le t - s)$, $i = 1, 2$. Note that $b_i(t, s)$ are also assumed to be continuous for $0 \le s \le t$.
A2. $\{Q(t, s), 0 \le s \le t\}$ is a predictable process with respect to $\{F(s)\}$, for every $t \ge 0$.
A3. Assume that there exist $0 \le \gamma < 1/2$ and a nonrandom function $q(t, s)$ such that
$$\sup_{0 \le t \le \tau} |q(t, 0)| < \infty,$$
$$\sup_{0 \le s \le t \le \tau} |Q(t, s) - q(t, s)|\, I\{n^{-1} R_1(t, s) + n^{-1} R_2(t, s) \ge \varepsilon\} \stackrel{p}{\longrightarrow} 0 \quad \text{for every } \varepsilon > 0, \text{ as } n \to \infty,$$
$$\sup_{0 \le t \le \tau} \Big\{ V_{0 \le s \le t}\big[ (n^{-1} R_1(t, s) + n^{-1} R_2(t, s))^{\gamma}\, Q(t, s) \big] + V_{0 \le s \le t}\big[ ((1 - F_1(s))\, b_1(t, s) + (1 - F_2(s))\, b_2(t, s))^{\gamma}\, q(t, s) \big] \Big\} = O_p(1).$$
A4. $T_{1i}$ are independent and identically distributed random variables with cumulative hazard function $\Lambda_1$; likewise, $T_{2j}$ are independent and identically distributed random variables with cumulative hazard function $\Lambda_2$.
A5. There exists a finite set $D$ such that, as a function of $s$, $H(t, s)$ is continuous except for $s \in D$ for every $t$. As a function of $t$, $H(t, u)$ is continuous for every $u \notin D$. Also, for all $0 \le s \le T_c(t)$, $\tau_0 \le t \le \tau_1$ and $s \notin D$, we have $\hat H(t, s) \stackrel{p}{\longrightarrow} H(t, s)$ as $n \to \infty$.
A6. Let $V_{[a,b]}\{f(u)\}$ denote the total variation of the function $f(u)$ on the interval $[a, b]$. The functions $H(t, u)$ and $\hat H(t, u)$ satisfy that
$$\sup_{\tau_0 \le t \le \tau_1} \big[ V_{[0, T_c(t)]}\{\hat H(t, u)\} + V_{[0, T_c(t)]}\{H(t, u)\} \big]$$
is bounded in probability.
A7. For some $c > 0$ and any $t \in [\tau_0, \tau_1]$, $S_i(T_c(t))\, b_i(t, T_c(t)) \ge c$, $i = 1, 2$. This essentially says that there should be enough information at the point $T_c(t)$ for all $t \in [\tau_0, \tau_1]$.
A8. The function $T_c(t)$ is continuous in $t$, and is chosen so that the risk set at $T_c(t)$ for each of the two groups is large enough.
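Theorem : Under assumptions A1-A8 and $H_0 : S_1 = S_2$, the process $\{U_n(t), 0 \le t \le \tau\}$ converges weakly in $D[0, \tau]$ to a zero-mean Gaussian process $\{Z(t), 0 \le t \le \tau\}$ with covariance function
$$\sigma(t, t') = \mathrm{Cov}(Z(t), Z(t')) = \lambda_1^2 \sigma_{11}(t, t') + \lambda_2^2 \sigma_{22}(t, t') + \lambda_1 \lambda_2 \sigma_{12}(t, t') + \lambda_1 \lambda_2 \sigma_{21}(t, t'),$$
where
$$\sigma_{11}(t, t') = \sum_{i=1}^{2} \int_0^{t \wedge t'} q(t, s)\, q(t', s)\, \frac{\pi_1(t, s)\, \pi_2(t, s)}{\pi_1(t, s) + \pi_2(t, s)}\, \frac{\pi_1(t', s)\, \pi_2(t', s)}{\pi_1(t', s) + \pi_2(t', s)}\, \frac{\pi_i(t \wedge t', s)}{\pi_i(t, s)\, \pi_i(t', s)}\, d\Lambda(s), \quad (A.1)$$
with $\pi_i(t, s) = p_i S(s)\, b_i(t, s)$, $i = 1, 2$,
$$\sigma_{22}(t, t') = \sum_{i=1}^{2} \int_0^{t \wedge t'} \xi(t, s)\, \xi(t', s)\, \frac{S_i(s-)\, S_i(s-)}{(S_i(s))^2}\, \frac{\pi_i(t \wedge t', s)}{\pi_i(t, s)\, \pi_i(t', s)}\, d\Lambda(s), \quad (A.2)$$
with $\xi(t, s) = H(t, s)\, S(s) + \int_s^t H(t, u)\, dS(u)$,
$$\sigma_{12}(t, t') = \sum_{i=1}^{2} \int_0^{t \wedge t'} \xi(t, s)\, q(t', s)\, \frac{\pi_1(t', s)\, \pi_2(t', s)}{\pi_1(t', s) + \pi_2(t', s)}\, \frac{\pi_i(t \wedge t', s)}{\pi_i(t, s)\, \pi_i(t', s)}\, d\Lambda(s), \quad (A.3)$$
$$\sigma_{21}(t, t') = \sum_{i=1}^{2} \int_0^{t \wedge t'} q(t, s)\, \xi(t', s)\, \frac{\pi_1(t, s)\, \pi_2(t, s)}{\pi_1(t, s) + \pi_2(t, s)}\, \frac{\pi_i(t \wedge t', s)}{\pi_i(t, s)\, \pi_i(t', s)}\, d\Lambda(s). \quad (A.4)$$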
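A consistent estimator of $\mathrm{Cov}(Z(t), Z(t'))$ is $\hat\sigma(t, t') = \lambda_1^2 \hat\sigma_{11}(t, t') + \lambda_2^2 \hat\sigma_{22}(t, t') + \lambda_1 \lambda_2 \hat\sigma_{12}(t, t') + \lambda_1 \lambda_2 \hat\sigma_{21}(t, t')$, where the expressions of $\hat\sigma_{ij}(t, t')$, $i = 1, 2$, $j = 1, 2$, are given below:
$$\hat\sigma_{11}(t, t') = \frac{1}{n} \int_0^{t \wedge t'} Q(t, s)\, Q(t', s)\, \frac{R_1(t, s)\, R_2(t, s)}{R_1(t, s) + R_2(t, s)} \left( 1 - \frac{\Delta N_1(t, s) + \Delta N_2(t, s) - 1}{R_1(t, s) + R_2(t, s) - 1} \right) \frac{dN_+(t', s)}{R_+(t', s)}, \quad (A.5)$$
$$\hat\sigma_{22}(t, t') = n \int_0^{t \wedge t'} \hat\xi(t, s)\, \hat\xi(t', s) \left( \frac{1}{R_1(t', s)} + \frac{1}{R_2(t', s)} \right) \left( 1 - \frac{\Delta N_1(t, s) + \Delta N_2(t, s) - 1}{R_1(t, s) + R_2(t, s) - 1} \right) \frac{dN_+(t', s)}{R_+(t', s)}, \quad (A.6)$$
$$\hat\sigma_{12}(t, t') = \int_0^{t \wedge t'} \hat\xi(t, s)\, Q(t', s)\, \frac{R_1(t', s)\, R_2(t', s)}{R_1(t', s) + R_2(t', s)} \left( \frac{1}{R_1(t', s)} + \frac{1}{R_2(t', s)} \right) \left( 1 - \frac{\Delta N_1(t, s) + \Delta N_2(t, s) - 1}{R_1(t, s) + R_2(t, s) - 1} \right) \frac{dN_+(t', s)}{R_+(t', s)}, \quad (A.7)$$
$$\hat\sigma_{21}(t, t') = \int_0^{t \wedge t'} Q(t, s)\, \hat\xi(t', s)\, \frac{R_1(t, s)\, R_2(t, s)}{R_1(t, s) + R_2(t, s)} \left( \frac{1}{R_1(t', s)} + \frac{1}{R_2(t', s)} \right) \left( 1 - \frac{\Delta N_1(t, s) + \Delta N_2(t, s) - 1}{R_1(t, s) + R_2(t, s) - 1} \right) \frac{dN_+(t', s)}{R_+(t', s)}. \quad (A.8)$$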
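For the simplest case, (A.5) with $Q = 1$ and $t = t'$, the integral becomes a sum over the observed event times and coincides with the familiar tie-corrected logrank variance divided by $n$. The sketch below illustrates that special case only. It assumes that $N_+$ and $R_+$ denote the pooled counting and risk processes $N_1 + N_2$ and $R_1 + R_2$ (the text does not define them), and the function name `logrank_variance` is hypothetical.

```python
import numpy as np

def logrank_variance(x1, d1, x2, d2):
    """Estimate of sigma_11(t, t) from (A.5) with Q = 1, at one analysis time."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    d1, d2 = np.asarray(d1, bool), np.asarray(d2, bool)
    n = len(x1) + len(x2)
    v = 0.0
    for s in np.unique(np.concatenate([x1[d1], x2[d2]])):
        r1, r2 = np.sum(x1 >= s), np.sum(x2 >= s)
        if r1 == 0 or r2 == 0:
            continue                       # integrand vanishes at this time
        r_plus = r1 + r2                   # ASSUMED meaning of R_+
        dn_plus = np.sum(d1 & (x1 == s)) + np.sum(d2 & (x2 == s))
        tie = 1.0 - (dn_plus - 1.0) / (r_plus - 1.0)   # tie correction factor
        v += (r1 * r2 / r_plus) * tie * dn_plus / r_plus
    return v / n
```

Dividing the logrank form of $U_1(t)$ by the square root of this quantity gives the standardized statistic that is compared with the boundary at each look.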
BIBLIOGRAPHY
Chi, Y. C. and Tsai, M. H., Some versatile tests based on the simultaneous use of weighted logrank and weighted Kaplan-Meier statistics, to appear in Communications in Statistics, 2001.
Gu, M., Follmann, D. and Geller, N. L., Monitoring a general class of two-sample survival statistics with applications, Biometrika, 1999, 86, 45-57.
Gu, M. and Lai, T. L., Weak convergence of time-sequential rank statistics with applications to sequential testing in clinical trials, Annals of Statistics, 1991, 19, 1403-1433.
Harrington, D. P. and Fleming, T. R., A class of rank test procedures for censored survival data, Biometrika, 1982, 69, 553-566.
Jones, D. and Whitehead, J., Sequential forms of the logrank and modified Wilcoxon tests for censored data, Biometrika, 1979, 66, 105-113.
Lee, J. W., Some versatile tests based on the simultaneous use of weighted log-rank statistics, Biometrics, 1996, 52, 721-725.
Li, Z., A group sequential test for survival trials: An alternative to rank-based procedures, Biometrics, 1999, 55, 277-283.
O’Brien, P. C. and Fleming, T. R., A multiple testing procedure for clinical trials, Biometrics, 1979, 35, 549-556.
Pepe, M. S. and Fleming, T. R., Weighted Kaplan-Meier statistics: a class of distance tests for censored survival data, Biometrics, 1989, 45, 497-507.
Pocock, S. J., Group sequential methods in the design and analysis of clinical trials, Biometrika, 1977, 64, 191-199.
Sellke, T. and Siegmund, D., Sequential analysis of the proportional hazards model, Biometrika, 1983, 70, 315-326.
Slud, E., Sequential linear rank tests for two-sample censored survival data, Annals of Statistics, 1984, 12, 551-571.
Slud, E. and Wei, L. J., Two-sample repeated significance tests based on the modified Wilcoxon statistic, Journal of the American Statistical Association, 1982, 77, 862-868.
Tsiatis, A. A., Repeated significance testing for a general class of statistics used in censored survival analysis, Journal of the American Statistical Association, 1982, 77, 855-861.