Short-term Effect: Proportional Hazards Model

We now illustrate two forms of $\tilde{F}(t|Z)$ that are the most popular choices of regression model in survival analysis. We also propose model-checking procedures to verify the model assumption. To simplify the presentation, we focus on the two-sample case with $Z = 0, 1$.

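Assume that the survival function $\tilde{S}(t|Z)$ follows a PH model such that

$$\tilde{S}(t|Z=1) = \tilde{S}(t|Z=0)^{k}, \tag{15}$$

where $k$ is a pre-determined constant. Notice that

$$S(t|Z) = \exp\left\{-\theta(z)\tilde{F}(t|Z)\right\} = \exp\left\{\theta(z)\left[\tilde{S}(t|Z) - 1\right]\right\}. \tag{16}$$

Thus we have

$$\frac{\log S(t|Z)}{\theta(z)} + 1 = \tilde{S}(t|Z). \tag{17}$$

It follows that

$$\frac{\log S(t|Z=1)}{\theta(1)} + 1 = \tilde{S}(t|Z=1) = \left\{\tilde{S}(t|Z=0)\right\}^{k} = \left\{\frac{\log S(t|Z=0)}{\theta(0)} + 1\right\}^{k}.$$

Accordingly,

$$\log\left\{\frac{\log S(t|Z=1)}{\theta(1)} + 1\right\} = k \log\left\{\frac{\log S(t|Z=0)}{\theta(0)} + 1\right\}. \tag{18}$$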
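As a quick check of equation (18), consider a hypothetical baseline that is not taken from the thesis: $\tilde{S}(t|Z=0) = e^{-t}$ with $k = 2$. Under the PH form (15) this gives $\tilde{S}(t|Z=1) = e^{-2t}$, and by (17) each side of (18) reduces to the logarithm of the corresponding $\tilde{S}$:

$$\log\left\{\frac{\log S(t|Z=1)}{\theta(1)} + 1\right\} = \log \tilde{S}(t|Z=1) = -2t = 2 \cdot (-t) = k \log\left\{\frac{\log S(t|Z=0)}{\theta(0)} + 1\right\}.$$

The values of $\theta(0)$ and $\theta(1)$ cancel out in this check, so the two groups may have different cure rates without affecting the linear relationship.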

Notice that equation (18) now involves only estimable quantities. The function $S(t|z)$ can be estimated by the Kaplan-Meier estimator

$$\hat{S}(t|Z=z) = \prod_{0<u\le t}\left\{1 - \frac{\sum_{i=1}^{n} I(X_i = u, \delta_i = 1, Z_i = z)}{\sum_{i=1}^{n} I(X_i \ge u, Z_i = z)}\right\}.$$
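The product-limit formula above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the thesis: the function name `kaplan_meier` and the toy data are assumptions, and the function is meant to be applied separately to the observations of each group $Z = z$.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Kaplan-Meier estimate for one group.

    times  : observed times X_i = min(T_i, C_i)
    events : indicators delta_i (1 = failure observed, 0 = censored)
    Returns the distinct failure times and the estimate just after each.
    """
    # Number of observed failures at each distinct failure time u.
    deaths = Counter(t for t, d in zip(times, events) if d == 1)
    failure_times, survival = [], []
    s = 1.0
    for u in sorted(deaths):
        at_risk = sum(1 for t in times if t >= u)  # sum_i I(X_i >= u)
        s *= 1.0 - deaths[u] / at_risk             # product-limit update
        failure_times.append(u)
        survival.append(s)
    return failure_times, survival

# Toy data for illustration only (not from the thesis).
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 0]
ft, sv = kaplan_meier(times, events)
print(ft, sv)
```

The risk set is recomputed at every failure time, which is quadratic in the sample size but keeps the correspondence with the formula easy to read.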

Let $\hat{\theta}(z)$ denote the nonparametric estimate of $\theta(z)$. Define

$$X_i = \log\left\{\frac{\log \hat{S}(t_i|z=1)}{\hat{\theta}(1)} + 1\right\}$$

and

$$Y_i = \log\left\{\frac{\log \hat{S}(t_i|z=0)}{\hat{\theta}(0)} + 1\right\},$$

where $t_1 < t_2 < \cdots < t_D$ are the ordered failure points. If the PH assumption holds, equation (18) indicates that the points $(X_i, Y_i)$ follow a linear relationship passing through the origin. We present some plots of $\hat{S}(t|Z)$ for $z = 0, 1$ and the corresponding diagnostic plots of $X_i$ versus $Y_i$ for $i = 1, 2, \cdots, D$, based on 1000 simulated observations.
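A minimal Python sketch of how the diagnostic points might be computed is given below. It assumes that the survival estimates for both groups are already evaluated at the common failure points and that $\hat{\theta}(1)$ and $\hat{\theta}(0)$ are supplied by the user; the function name `ph_diagnostic_points` and the numerical settings in the illustration (unit-exponential baseline, $k = 2$, cure rates 0.5 and 0.3) are hypothetical. The illustration uses exact curves instead of estimates, so the points fall exactly on a line with slope $k$.

```python
import math

def ph_diagnostic_points(surv1, surv0, theta1, theta0):
    """Return the pairs (X_i, Y_i) = (log{log S1/theta1 + 1}, log{log S0/theta0 + 1}).

    surv1, surv0   : survival values of the two groups at the same ordered
                     failure points t_1 < ... < t_D
    theta1, theta0 : values (or estimates) of theta(1) and theta(0)
    Points at which a logarithm is undefined are skipped.
    """
    points = []
    for s1, s0 in zip(surv1, surv0):
        if s1 <= 0.0 or s0 <= 0.0:
            continue
        inner1 = math.log(s1) / theta1 + 1.0
        inner0 = math.log(s0) / theta0 + 1.0
        if inner1 > 0.0 and inner0 > 0.0:
            points.append((math.log(inner1), math.log(inner0)))
    return points

# Illustration with exact curves under a PH model (hypothetical settings).
# The cure rate is exp(-theta), so theta = -log(cure rate).
k, theta0, theta1 = 2.0, -math.log(0.5), -math.log(0.3)
grid = [0.2, 0.5, 1.0, 1.5]
surv0 = [math.exp(theta0 * (math.exp(-t) - 1.0)) for t in grid]
surv1 = [math.exp(theta1 * (math.exp(-k * t) - 1.0)) for t in grid]
pts = ph_diagnostic_points(surv1, surv0, theta1, theta0)
slopes = [x / y for x, y in pts]  # each ratio X_i / Y_i should equal k
print(slopes)
```

With estimated curves the points scatter around the line, and the points near the plateau of $\hat{S}$ are the least stable because the inner term approaches zero there.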

The diagnostic plots in Figure 5-2 reveal a clear linear pattern for most points. Figure 5-3 presents the plots when the PH assumption is violated; there, the relationship between $X_i$ and $Y_i$ is curved.

[Figure 5-2 panels. K-M curves: Survival against Time. First case: Gamma(1, 1), cure rate = 0.3, censoring rate = 0.4, versus Gamma(1, 2), cure rate = 0.5, censoring rate = 0.58. (b) Case II: Weibull(1, 3), cure rate = 0.3, censoring rate = 0.35, versus Weibull(1, 2), cure rate = 0.5, censoring rate = 0.53. Each case is accompanied by a diagnostic plot titled "PH model".]

Figure 5-2: K-M curves and diagnostic plots when the PH assumption holds.

[Figure 5-3 panels. K-M curves: Survival against Time. First case: Log-normal(0.3, 5), cure rate = 0.3, censoring rate = 0.36, versus Log-normal(1, 0.1), cure rate = 0.4, censoring rate = 0.53. (b) Case II: Log-normal(1, 1), cure rate = 0.3, censoring rate = 0.62, versus Log-normal(2.5, 0.25), cure rate = 0.4, censoring rate = 0.79. Each case is accompanied by a diagnostic plot titled "PH model".]

Figure 5-3: K-M curves and diagnostic plots when the PH assumption is violated.

5.3 Short Term Effect: Accelerated Failure Time Model

Under the AFT model, the survival function $\tilde{S}(t|z)$ for $z = 0, 1$ follows the relationship

$$\tilde{S}(t|Z=1) = \tilde{S}(kt|Z=0)$$

for $k$ being a prespecified constant. It follows that

$$S(t|z=1) = \exp\left\{\theta(1)\left[\tilde{S}(t|z=1) - 1\right]\right\} = \exp\left\{\theta(1)\left[\tilde{S}(kt|z=0) - 1\right]\right\}.$$

Accordingly we have

$$\frac{\log S(t|Z=1)}{\theta(1)} + 1 = \tilde{S}(t|Z=1) = \tilde{S}(kt|Z=0) = \frac{\log S(kt|Z=0)}{\theta(0)} + 1. \tag{19}$$

We now extract from equation (19) a relationship that can be checked graphically.

Let $p_1 < \cdots < p_M$ be constants located in $(0, 1)$. Since $-\log S(t|Z)/\theta(z) = \tilde{F}(t|Z)$ takes values in $(0, 1)$, we solve for $X_i$ satisfying

$$-\frac{\log S(X_i|Z=1)}{\theta(1)} = p_i, \quad i = 1, 2, \cdots, M.$$

Thus, for each $p_i$ we have

$$X_i = \hat{S}_{Z=1}^{-1}\left\{\exp\left(-p_i\hat{\theta}(1)\right)\right\}.$$

Similar steps can be derived based on the right-hand side of equation (19). Set

$$Y_i = \hat{S}_{Z=0}^{-1}\left\{\exp\left(-p_i\hat{\theta}(0)\right)\right\}$$

for $i = 1, 2, \cdots, M$. When the AFT model holds, equation (19) gives $Y_i = kX_i$, so the points $(X_i, Y_i)$, $i = 1, 2, \cdots, M$, follow a straight line through the origin. Plots following the AFT model are presented in Figure 5-4, in which $n = 1000$.
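The quantile matching described above can be sketched in Python as follows. The sketch makes two assumptions that should be read as such: the level is taken as $\tilde{F} = -\log S/\theta = p_i$, so that the argument $\exp\{-p_i\hat{\theta}(z)\}$ lies in the range of $S$, and the inverse of the step function $\hat{S}$ is taken to be the smallest failure time at which the curve drops to or below the target value. The function names and the numerical settings (unit-exponential baseline, $k = 2$, cure rates 0.5 and 0.3) are hypothetical, and exact curves on a fine grid stand in for K-M estimates.

```python
import math

def step_inverse(failure_times, survival, u):
    """Smallest failure time t with S_hat(t) <= u, or None if u is never reached."""
    for t, s in zip(failure_times, survival):
        if s <= u:
            return t
    return None

def aft_diagnostic_points(ft1, sv1, theta1, ft0, sv0, theta0, probs):
    """Return (X_i, Y_i) with S1(X_i) = exp(-p_i*theta1) and S0(Y_i) = exp(-p_i*theta0)."""
    points = []
    for p in probs:
        x = step_inverse(ft1, sv1, math.exp(-p * theta1))
        y = step_inverse(ft0, sv0, math.exp(-p * theta0))
        if x is not None and y is not None:
            points.append((x, y))
    return points

# Illustration with exact curves under an AFT model (hypothetical settings):
# S~(t|Z=0) = exp(-t) and S~(t|Z=1) = S~(k t|Z=0) = exp(-k t).
k, theta0, theta1 = 2.0, -math.log(0.5), -math.log(0.3)
step = 0.001
ft1 = [j * step for j in range(1, 3001)]   # grid for group Z = 1
ft0 = [k * t for t in ft1]                 # matching grid for group Z = 0
sv1 = [math.exp(theta1 * (math.exp(-k * t) - 1.0)) for t in ft1]
sv0 = [math.exp(theta0 * (math.exp(-t) - 1.0)) for t in ft0]
probs = [0.1 * m for m in range(1, 10)]
pts = aft_diagnostic_points(ft1, sv1, theta1, ft0, sv0, theta0, probs)
print(pts)  # each Y_i should be close to k * X_i
```

With real data the same functions can be fed the failure times and K-M values of each group; levels $p_i$ for which either curve never reaches the target are dropped.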

The diagnostic plots in Figure 5-4 reveal a clear linear pattern for most points. Figure 5-5 presents the plots when the AFT assumption is violated; there, the relationship between $X_i$ and $Y_i$ is curved.

[Figure 5-4 panels. K-M curves: Survival against Time. First case: Gamma(1, 3), cure rate = 0.3, censoring rate = 0.38, versus Gamma(1, 3), cure rate = 0.5, censoring rate = 0.57. Second case: Gamma(2, 3), cure rate = 0.3, censoring rate = 0.41, versus Weibull(2, 5), cure rate = 0.5, censoring rate = 0.64. Each case is accompanied by a diagnostic plot.]

Figure 5-4: K-M curves and diagnostic plots when the AFT assumption holds.

[Figure 5-5 panels. K-M curves: Survival against Time. First case: Weibull(0.5, 1), cure rate = 0.4, censoring rate = 0.47, versus Log-normal(1, 5), cure rate = 0.3, censoring rate = 0.62. Second case: Gamma(5, 2), cure rate = 0.4, censoring rate = 0.47, versus Log-normal(1, 5), cure rate = 0.3, censoring rate = 0.48. Each case is accompanied by a diagnostic plot.]

Figure 5-5: K-M curves and diagnostic plots when the AFT assumption is violated.

Chapter 6 Conclusion

In this thesis, we study the non-mixture approach for analyzing survival data in the presence of cure. This formulation has an interesting biological interpretation. In parametric analysis, we find that outliers in $T$, which often occur when $N = 1$, affect estimation of the cure rate. For nonparametric inference, we propose two algorithms to solve the score functions of the nonparametric MLE: one is the classical Lagrange multiplier method and the other is based on a change of variables. Two regression models are considered under the simplified two-sample setting: the proportional hazards model and the accelerated failure time model. For each, we propose diagnostic plots that can verify the form of the regression effect.


Appendix: Additional Figures

[Figure A-1 panels: a 3 × 3 grid with $t$ on the horizontal axis. Columns: survival function $S_{T|Z}(t)$, density function $f_{T|Z}(t)$, hazard function $h_{T|Z}(t)$. Row 1: Gamma(1, 1), Gamma(4, 1.5), Gamma(6, 2). Row 2: Weibull(1, 3), Weibull(2, 10), Weibull(4, 15). Row 3: Log-normal(1, 1), Log-normal(2, 0.4), Log-normal(2.5, 0.25).]

Figure A-1: Survival, density, and hazard functions for selected parametric distributions.

[Figure A-2 panels: three plots of $S(t)$ against Time, each showing the True, MLE, and NPMLE curves, with p = 0.3 and censoring rates 0.33, 0.45, and 0.49.]

Figure A-2: Estimated survival functions when $\tilde{F}$ is correctly specified as Gamma(1,1)

[Figure A-3 panels: three plots of $S(t)$ against Time, each showing the True, MLE, and NPMLE curves, with p = 0.3 and censoring rates 0.34, 0.48, and 0.64.]

Figure A-3: Estimated survival functions when $\tilde{F}$ is correctly specified as Gamma(4,1.5)

[Figure A-4 panels: three plots of $S(t)$ against Time, each showing the True, MLE, and NPMLE curves, with p = 0.3 and censoring rates 0.3, 0.42, and 0.51.]

Figure A-4: Estimated survival functions when $\tilde{F}$ is correctly specified as Gamma(6,2)

[Figure A-5 panels: three plots of $S(t)$ against Time, each showing the True, MLE, and NPMLE curves, with p = 0.3 and censoring rates 0.33, 0.38, and 0.53.]

Figure A-5: Estimated survival functions when $\tilde{F}$ is correctly specified as Weibull(1,3)

[Figure A-6 panels: three plots of $S(t)$ against Time, each showing the True, MLE, and NPMLE curves, with p = 0.3 and censoring rates 0.36, 0.47, and 0.6.]

Figure A-6: Estimated survival functions when $\tilde{F}$ is correctly specified as Weibull(2,10)

[Figure A-7 panels: three plots of $S(t)$ against Time, each showing the True, MLE, and NPMLE curves, with p = 0.3 and censoring rates 0.33, 0.47, and 0.63.]

Figure A-7: Estimated survival functions when $\tilde{F}$ is correctly specified as Weibull(4,15)

[Figure A-8 panels: three plots of $S(t)$ against Time, each showing the True, MLE, and NPMLE curves, with p = 0.3 and censoring rates 0.34, 0.43, and 0.51.]

Figure A-8: Estimated survival functions when $\tilde{F}$ is correctly specified as log-normal(1,1)

[Figure A-9 panels: three plots of $S(t)$ against Time, each showing the True, MLE, and NPMLE curves, with p = 0.3 and censoring rates 0.3, 0.38, and 0.5.]

Figure A-9: Estimated survival functions when $\tilde{F}$ is correctly specified as log-normal(2,0.4)

[Figure A-10 panels: three plots of $S(t)$ against Time, each showing the True, MLE, and NPMLE curves, with p = 0.3 and censoring rates 0.39, 0.49, and 0.57.]

Figure A-10: Estimated survival functions when $\tilde{F}$ is correctly specified as log-normal(2.5,0.25)

Appendix: Simulation Results

Table A-1: Relationship between C ∼ Unif[0, k] and Pr(δ = 0) for Gamma distributions

| p | Pr(δ = 0) | k for Gamma(1,1) | k for Gamma(4,1.5) | k for Gamma(6,2) |
|---|---|---|---|---|
| 0.3 | 0.3 | × | × | × |
| 0.3 | 0.4 | 4 | 11 | 13 |
| 0.3 | 0.5 | 2 | 7 | 7 |
| 0.5 | 0.5 | × | × | × |
| 0.5 | 0.6 | 3 | 8 | 10 |
| 0.5 | 0.7 | 2 | 5 | 6 |
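The link between $k$ and the censoring probability can be explored numerically. The sketch below assumes that $C$ is independent of $T$, so that $\Pr(\delta = 0) = \Pr(C < T) = k^{-1}\int_0^k S(c)\,dc$ with $S(t) = \exp\{-\theta\tilde{F}(t)\}$ and cure rate $p = e^{-\theta}$; the function name `censoring_probability` and the use of Simpson's rule are illustrative choices, not the procedure used to build the table.

```python
import math

def censoring_probability(k, cure_rate, base_cdf, steps=2000):
    """Approximate Pr(delta = 0) = (1/k) * integral_0^k S(c) dc for C ~ Unif[0, k].

    S(t) = exp{-theta * F~(t)} with theta = -log(cure_rate).
    The integral uses the composite Simpson rule; steps must be even.
    """
    theta = -math.log(cure_rate)

    def surv(c):
        return math.exp(-theta * base_cdf(c))

    h = k / steps
    total = surv(0.0) + surv(k)
    for j in range(1, steps):
        total += (4.0 if j % 2 else 2.0) * surv(j * h)
    return total * h / 3.0 / k

def exp_cdf(t):
    """CDF of Gamma(1, 1), i.e. the unit exponential distribution."""
    return 1.0 - math.exp(-t)

v2 = censoring_probability(2, 0.3, exp_cdf)
v4 = censoring_probability(4, 0.3, exp_cdf)
print(v2, v4)
```

For Gamma(1,1) with p = 0.3, the computed probabilities for k = 2 and k = 4 fall near the 0.5 and 0.4 targets listed in Table A-1. The probability always exceeds the cure rate for finite k, since $S(c) > p$ for every finite $c$.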

Table A-2: Relationship between C ∼ Unif[0, k] and Pr(δ = 0) for Weibull distributions

| p | Pr(δ = 0) | k for Weibull(1,3) | k for Weibull(2,10) | k for Weibull(4,15) |
|---|---|---|---|---|
| 0.3 | 0.3 | × | × | × |
| 0.3 | 0.4 | 10 | 33 | 58 |
| 0.3 | 0.5 | 6 | 22 | 36 |
| 0.5 | 0.5 | × | × | × |
| 0.5 | 0.6 | 8 | 18 | 45 |
| 0.5 | 0.7 | 4 | 17 | 26 |

Table A-3: Relationship between C ∼ Unif[0, k] and Pr(δ = 0) for Log-normal distributions

| p | Pr(δ = 0) | k for log-normal(1,1) | k for log-normal(2,0.4) | k for log-normal(2.5,0.25) |
|---|---|---|---|---|
| 0.3 | 0.3 | × | × | × |
| 0.3 | 0.4 | 14 | 33 | 54 |
| 0.3 | 0.5 | 8 | 19 | 33 |
| 0.5 | 0.5 | × | × | × |
| 0.5 | 0.6 | 12 | 25 | 40 |
| 0.5 | 0.7 | 8 | 15 | 25 |

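Table A-4: Maximum likelihood estimators for Gamma distributions with p = 0.3 and Pr(δ = 0) = 0.3

| n | Parameter | Gamma(1,1) bias | Gamma(1,1) sd | Gamma(4,1.5) bias | Gamma(4,1.5) sd | Gamma(6,2) bias | Gamma(6,2) sd |
|---|---|---|---|---|---|---|---|
| 100 | α | 0.032 | 0.159 | 0.129 | 0.706 | 0.252 | 0.086 |
| 100 | β | 0.069 | 0.285 | 0.079 | 0.345 | 0.082 | 0.344 |
| 100 | θ | 0.014 | 0.147 | 0.003 | 0.165 | 0.030 | 0.163 |
| 100 | p | 0.001 | 0.044 | 0.005 | 0.049 | 0.005 | 0.048 |
| 300 | α | 0.001 | 0.074 | 0.042 | 0.359 | 0.061 | 0.512 |
| 300 | β | 0.014 | 0.134 | 0.022 | 0.175 | 0.021 | 0.206 |
| 300 | θ | 0.003 | 0.091 | 0.004 | 0.093 | 0.012 | 0.102 |
| 300 | p | 0.002 | 0.027 | 0.003 | 0.028 | 0.005 | 0.031 |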

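Table A-5: Maximum likelihood estimators for Weibull distributions with p = 0.3 and Pr(δ = 0) = 0.3

| n | Parameter | Weibull(1,3) bias | Weibull(1,3) sd | Weibull(2,10) bias | Weibull(2,10) sd | Weibull(4,15) bias | Weibull(4,15) sd |
|---|---|---|---|---|---|---|---|
| 100 | k | 0.008 | 0.107 | 0.028 | 0.197 | 0.079 | 0.461 |
| 100 | λ | 0.028 | 0.483 | 0.051 | 0.746 | 0.068 | 0.683 |
| 100 | θ | 0.013 | 0.141 | 0.012 | 0.154 | 0.034 | 0.190 |
| 100 | p | 0.007 | 0.043 | 0.007 | 0.046 | 0.005 | 0.053 |
| 300 | k | 0.001 | 0.057 | 0.024 | 0.102 | 0.049 | 0.240 |
| 300 | λ | 0.016 | 0.287 | 0.039 | 0.503 | 0.048 | 0.383 |
| 300 | θ | 0.007 | 0.095 | 0.021 | 0.101 | 0.008 | 0.096 |
| 300 | p | 0.001 | 0.028 | 0.005 | 0.030 | 0.001 | 0.028 |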

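Table A-6: Maximum likelihood estimators for Log-normal distributions with p = 0.3 and Pr(δ = 0) = 0.3

| n | Parameter | log-normal(1,1) bias | log-normal(1,1) sd | log-normal(2,0.4) bias | log-normal(2,0.4) sd | log-normal(2.5,0.25) bias | log-normal(2.5,0.25) sd |
|---|---|---|---|---|---|---|---|
| 100 | µ | 0.007 | 0.110 | 0.005 | 0.058 | 0.001 | 0.037 |
| 100 | σ² | 0.014 | 0.094 | 0.010 | 0.031 | 0.001 | 0.021 |
| 100 | θ | 0.010 | 0.166 | 0.039 | 0.173 | 0.028 | 0.178 |
| 100 | p | 0.001 | 0.048 | 0.007 | 0.050 | 0.004 | 0.005 |
| 300 | µ | 0.008 | 0.091 | 0.001 | 0.033 | 0.001 | 0.023 |
| 300 | σ² | 0.005 | 0.057 | 0.005 | 0.020 | 0.001 | 0.015 |
| 300 | θ | 0.035 | 0.101 | 0.007 | 0.094 | 0.001 | 0.091 |
| 300 | p | 0.009 | 0.029 | 0.001 | 0.028 | 0.001 | 0.027 |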

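Table A-7: Maximum likelihood estimators for Gamma distributions with p = 0.3 and Pr(δ = 0) = 0.4

| n | Parameter | Gamma(1,1) bias | Gamma(1,1) sd | Gamma(4,1.5) bias | Gamma(4,1.5) sd | Gamma(6,2) bias | Gamma(6,2) sd |
|---|---|---|---|---|---|---|---|
| 100 | α | 0.036 | 0.175 | 0.154 | 0.768 | 0.221 | 1.183 |
| 100 | β | 0.095 | 0.423 | 0.071 | 0.384 | 0.090 | 0.480 |
| 100 | θ | 0.041 | 0.260 | 0.035 | 0.201 | 0.018 | 0.194 |
| 100 | p | 0.003 | 0.067 | 0.005 | 0.057 | <0.001 | 0.055 |
| 300 | α | 0.001 | 0.091 | 0.093 | 0.426 | 0.117 | 0.677 |
| 300 | β | 0.022 | 0.237 | 0.046 | 0.215 | 0.051 | 0.275 |
| 300 | θ | 0.016 | 0.139 | 0.005 | 0.107 | 0.005 | 0.112 |
| 300 | p | 0.002 | 0.039 | <0.001 | 0.032 | 0.003 | 0.021 |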

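Table A-8: Maximum likelihood estimators for Weibull distributions with p = 0.3 and Pr(δ = 0) = 0.4

| n | Parameter | Weibull(1,3) bias | Weibull(1,3) sd | Weibull(2,10) bias | Weibull(2,10) sd | Weibull(4,15) bias | Weibull(4,15) sd |
|---|---|---|---|---|---|---|---|
| 100 | k | 0.016 | 0.131 | 0.049 | 0.232 | 2.979 | 0.134 |
| 100 | λ | 51.455 | 1111.197 | 0.060 | 1.314 | 85.573 | 2163.867 |
| 100 | θ | 0.535 | 7.980 | 0.030 | 0.226 | 0.375 | 6.338 |
| 100 | p | 0.016 | 0.085 | 0.002 | 0.058 | 0.008 | 0.081 |
| 300 | k | 0.001 | 0.076 | 0.003 | 0.127 | 0.032 | 0.243 |
| 300 | λ | 0.112 | 0.698 | 0.001 | 0.623 | 0.053 | 0.377 |
| 300 | θ | 0.024 | 0.169 | 0.007 | 0.113 | 0.002 | 0.110 |
| 300 | p | 0.003 | 0.047 | <0.001 | 0.033 | 0.001 | 0.033 |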

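Table A-9: Maximum likelihood estimators for Log-normal distributions with p = 0.3 and Pr(δ = 0) = 0.4

| n | Parameter | log-normal(1,1) bias | log-normal(1,1) sd | log-normal(2,0.4) bias | log-normal(2,0.4) sd | log-normal(2.5,0.25) bias | log-normal(2.5,0.25) sd |
|---|---|---|---|---|---|---|---|
| 100 | µ | 0.035 | 0.289 | 0.003 | 0.063 | 0.002 | 0.038 |
| 100 | σ² | 0.001 | 0.149 | 0.007 | 0.041 | 0.003 | 0.026 |
| 100 | θ | 0.067 | 0.354 | 0.022 | 0.182 | 0.007 | 0.179 |
| 100 | p | 0.007 | 0.073 | 0.002 | 0.052 | 0.003 | 0.053 |
| 300 | µ | 0.010 | 0.152 | <0.001 | 0.037 | <0.001 | 0.023 |
| 300 | σ² | 0.004 | 0.085 | 0.002 | 0.025 | 0.001 | 0.014 |
| 300 | θ | 0.013 | 0.144 | 0.001 | 0.101 | 0.001 | 0.114 |
| 300 | p | 0.001 | 0.042 | 0.002 | 0.030 | 0.001 | 0.032 |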

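Table A-10: NPMLE of θ and p for Gamma distributions with p = 0.3 and Pr(δ = 0) = 0.3.

| n | Parameter | Gamma(1,1) bias | Gamma(1,1) sd | Gamma(4,1.5) bias | Gamma(4,1.5) sd | Gamma(6,2) bias | Gamma(6,2) sd |
|---|---|---|---|---|---|---|---|
| 100 | θ | 0.158 | 0.172 | 0.131 | 0.143 | 0.131 | 0.139 |
| 100 | p | 0.040 | 0.043 | 0.034 | 0.036 | 0.034 | 0.036 |
| 300 | θ | 0.130 | 0.142 | 0.125 | 0.138 | 0.12 | 0.129 |
| 300 | p | 0.034 | 0.036 | 0.033 | 0.031 | 0.032 | 0.033 |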

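Table A-11: NPMLE of θ and p for Weibull distributions with p = 0.3 and Pr(δ = 0) = 0.3.

| n | Parameter | Weibull(1,3) bias | Weibull(1,3) sd | Weibull(2,10) bias | Weibull(2,10) sd | Weibull(4,15) bias | Weibull(4,15) sd |
|---|---|---|---|---|---|---|---|
| 100 | θ | 0.130 | 0.142 | 0.048 | 0.249 | 0.026 | 0.266 |
| 100 | p | 0.034 | 0.036 | 0.004 | 0.083 | 0.004 | 0.090 |
| 300 | θ | 0.124 | 0.134 | 0.039 | 0.250 | 0.033 | 0.258 |
| 300 | p | 0.033 | 0.034 | 0.001 | 0.085 | 0.001 | 0.087 |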

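Table A-12: NPMLE of θ and p for Log-normal distributions with p = 0.3 and Pr(δ = 0) = 0.3.

| n | Parameter | log-normal(1,1) bias | log-normal(1,1) sd | log-normal(2,0.4) bias | log-normal(2,0.4) sd | log-normal(2.5,0.25) bias | log-normal(2.5,0.25) sd |
|---|---|---|---|---|---|---|---|
| 100 | θ | 0.137 | 0.309 | 0.141 | 0.289 | 0.056 | 0.246 |
| 100 | p | 0.035 | 0.037 | 0.036 | 0.037 | 0.007 | 0.082 |
| 300 | θ | 0.136 | 0.297 | 0.139 | 0.279 | 0.059 | 0.239 |
| 300 | p | 0.035 | 0.037 | 0.035 | 0.037 | 0.008 | 0.080 |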

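Table A-13: NPMLE of θ and p for Gamma distributions with p = 0.3 and Pr(δ = 0) = 0.4.

| n | Parameter | Gamma(1,1) bias | Gamma(1,1) sd | Gamma(4,1.5) bias | Gamma(4,1.5) sd | Gamma(6,2) bias | Gamma(6,2) sd |
|---|---|---|---|---|---|---|---|
| 100 | θ | 0.090 | 0.337 | 0.148 | 0.389 | 0.096 | 0.327 |
| 100 | p | 0.017 | 0.073 | 0.036 | 0.041 | 0.019 | 0.071 |
| 300 | θ | 0.092 | 0.331 | 0.147 | 0.381 | 0.097 | 0.321 |
| 300 | p | 0.018 | 0.072 | 0.036 | 0.040 | 0.020 | 0.070 |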

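Table A-14: NPMLE of θ and p for Weibull distributions with p = 0.3 and Pr(δ = 0) = 0.4.

| n | Parameter | Weibull(1,3) bias | Weibull(1,3) sd | Weibull(2,10) bias | Weibull(2,10) sd | Weibull(4,15) bias | Weibull(4,15) sd |
|---|---|---|---|---|---|---|---|
| 100 | θ | 0.143 | 0.375 | 0.079 | 0.231 | 0.146 | 0.405 |
| 100 | p | 0.035 | 0.040 | 0.015 | 0.075 | 0.036 | 0.041 |
| 300 | θ | 0.144 | 0.413 | 0.082 | 0.227 | 0.145 | 0.396 |
| 300 | p | 0.035 | 0.040 | 0.016 | 0.073 | 0.036 | 0.040 |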

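Table A-15: NPMLE of θ and p for Log-normal distributions with p = 0.3 and Pr(δ = 0) = 0.4.

| n | Parameter | log-normal(1,1) bias | log-normal(1,1) sd | log-normal(2,0.4) bias | log-normal(2,0.4) sd | log-normal(2.5,0.25) bias | log-normal(2.5,0.25) sd |
|---|---|---|---|---|---|---|---|
| 100 | θ | 0.140 | 0.264 | 0.142 | 0.274 | 0.066 | 0.239 |
| 100 | p | 0.036 | 0.039 | 0.036 | 0.038 | 0.010 | 0.079 |
| 300 | θ | 0.138 | 0.259 | 0.141 | 0.267 | 0.069 | 0.234 |
| 300 | p | 0.035 | 0.039 | 0.036 | 0.037 | 0.012 | 0.077 |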
