Numerical algorithm and variance estimation 17

2 Literature Review for Semi-parametric Linear Models with Cure 4

3.4 Numerical algorithm and variance estimation 17

( | )w

with S_ being replaced by the following explicit formula:

The estimation procedure requires many iterations by updating the weights using previous estimates. Specifically let w^{( )}^m be the m-th step estimate of w based on (^ˆ^{( )}^m,^ˆ^{( )}^m ,S^ˆ_^{( )}^m ).

The procedure is repeated for m0,1, 2,... until convergence.

3.4.1 Re-sampling based on bootstrap approach

The non-differentiability and the complicated and dynamic weight components make it difficult to derive an analytic formula for variance estimation. The bootstrap approach provides a simulation scheme without extra analytic work. Say R bootstrap samples are drawn from the original data



⁽Xi^{, ,}i Zi^T^{), 1,...,}i n



. For each bootstrap sample, we perform the estimation procedure which involves  ( |w^{( )}^m ) and 0 U( | w^{( )}^m) for say m1,...,M . The sampling distributions of ^ˆ and ˆ can be approximated based on the K bootstrap estimates. The bootstrap method is time-consuming which involves solving the roots R M times. Note that solving U( | w^{( )}^m ) even once is not an easy task. 0

3.4.2 Re-sampling based on pivotal estimating functions

Parzen, Wei and Ying (1994) proposed a re-sampling method which has become a popular tool for variance estimation for many semi-parametric inference problems. This approach is useful when the estimating function is not smooth. In other situations, the derivative of the score function can be derived under some regularity conditions but still contains unknown density functions which cannot be estimated based on the simple plug-in approach.

Now we apply and modify the idea of Parzen et al. (1994). For our problem, the pivotal estimating function are the asymptotic distributions of

* 0

( | ) ( | )

U w

 



 

 

 

 

  

 .

Directly applying this approach, we first need to generate many replicates from the pivotal distribution denoted as (_j,U_j) for j1,...,R. Then solve

( | { , , ˆ }) ( | { , , ˆ })

w S

U w S U



   

  

   

   

 

   

  . (3.9c)

Let ( ,   be the corresponding solution for _j _j) j1,...,R. Then the conditional distribution of

 , given the observed sample, is asymptotic equivalent to the unconditional distribution

of ⁰ conditional on the observed sample, can be used to approximate the unconditional distribution of

( , ) ˆ ˆ .

The above procedure, however, is very time-consuming since it still involves many iterations to obtain the solution of (3.9c). We propose to modify the procedure by solving

where w denotes the final estimated weight. This modification can avoid the time-consuming ˆ^* iterations within each re-sampling run. In Appendix 3, we will see that this modification still produces valid results for variance estimation.

Now we derive the algorithm to simulate random samples from the pivotal distributions.

Since our interest is in , we only need to focus on U( | w^*) since it does not involve other

1/ 2 * 0 ˆ

( ; ) ~ _p(0, )

n^ U  w N  , (3.10c)

where the covariance matrix can be estimated by

ˆ ˆ

ˆ ( , ) ( , ) /ˆ ˆ

n T

i i

w w n

   



 



^. ^(3.10d)

In Section 3.3, we have shown that for  in a small neighborhood of ₀, n U^^{1/ 2} ( | wˆ^)n U^^{1/ 2} (₀;wˆ^)An^{1/ 2}(  ₀)o_p(1),

where A is the asymptotic slope matrix of n^^{1/ 2}U(₀;w^*) . We simulate G_i ~N(0,1) independently for i1,...,n. Let ^ˆ^* be the solution to







  ⁿ

i i w Gi

w U

ˆ ) ˆ, ( ˆ )

(   . (3.10e)

We can show that the conditional distribution of







 n

i i w Gi

1 2 /

1  (ˆ,ˆ ) , given the observed data, is also (0, )N_p  . Accordingly the conditional distribution of n¹^/²(ˆ^ˆ) follows N_p(0,A^¹A^¹), which is equivalent to the unconditional distribution of n¹^/²(ˆ₀) . To implement the re-sampling algorithm, we repeat (3.10e) for R times and then obtain ˆ^*

j for j1,...,R. The sample variance can be used to estimate Var( )ˆ . The proposed re-sampling procedure is much faster than the bootstrap approach since no iteration is needed in solving (3.10e) and also there is no need to deal with the estimating function of .

3.5 Model Checking

We utilize the martingale framework to construct a model checking procedure for the latency distribution. Here the model assumption refers to the chosen form of (.)h . Define the residual process:

1/ 2 1

( ; ) ˆ ( ; )

i i

V t  n^ Z M t 





^, ^(3.11a)

where

ˆ ( ; )_i departure from the imposed model.

First of all we need to show that, under the assumed model, V t( ; )ˆ converges weakly to a mean-zero Gaussian process. The argument is similarly to Ghosh (2003). Here we summarize the sketch of proof. One can write

Notice that

By the martingale central limit theorem and the consistency of ˆ, V t( ; )ˆ ^dN_p(0, ) ,

where the covariance matrix ( ) can be estimated by (3.10d). Furthermore its asymptotic

distribution can be approximated by ˆ ( )

V t ^{1/ 2} ^*

1 0

ˆ ˆ ˆ ˆ ˆ

{ ( ; , )} ( ; ) ( ; ) ( ; )

n t

i i i

n^ Z Z u  w dM u  G V t  V t 





  ^  . (3.12)

For informal model diagnostics, we can plot the sample curve of V t( ; )ˆ along with several simulated curves of ˆ ( )V t . If the sample curve is located within the range of simulated curves, the model assumption is reasonable. Formally, we can generate many replicates of ˆ ( )V t and compute the value of sup_t n^^{1/ 2}V tˆ( ) for the model candidates under consideration. The p-value refers to the empirical frequency that the observed value of sup_t n^^{1/ 2}V t( ; )ˆ exceeds the simulated

values of sup_t n^^{1/ 2}V tˆ( ) .

3.6 Simulation analysis

3.6.1 Data generation

We first generate Z from Bernoulli (0.5) and compute ₁

* (1) * (1)

0 (1, 1) ( 0, 0 ) 0 0 1

T T

Z   Z     Z ,

where the values of ₀^* are ₀⁽¹⁾ are specified. Then generate  ~Bernoulli(p_Z) with

0 0

exp( ) ( | )

1 exp( )

Z T

p Z Z

  

  

 .

If 1  , we generate the latency variable T which follows log T ₀⁽¹⁾Z₁₀⁽²⁾Z₂ 

where  follows the log-exponential distribution. If  0, we set T to be a very large number exceeding the support of C which follows a uniform distribution. Observed variables include replications of



^X^{, ,}^ ^Z



^{, where} ^X ^{ }^{T C}^and^ ^^{I T C}⁽ ^ ⁾. We consider two settings with A:

) 5 . 0 (

1 ~ Ber

Z and Z₂ ~ Ber(0.5); and B: Z₁ ~Ber(0.5) and Z₂ ~Unif(0,1). 3.6.2 Simulation results

Tables 1A and 1B show the results for estimating ₀⁽⁰⁾, ₀⁽¹⁾,

(0) 0

0 (0)

exp( ) ( | 0)

1 exp( )

p   Z 

   

 , and

(0) (1)

0 0

1 (0) (1)

0 0

exp( )

( | 1)

1 exp( )

p   Z  

 

   

  .

We calculate the average bias and standard deviation based on 1000 replications. In the two tables, the estimators have reasonable performances which improve as the sample size increases.

Transforming )(₀⁽⁰⁾,₀⁽¹⁾ into the probability scale based on (p₀,p₁), the performances of ˆ )

ˆ ,

(p₀ p₁ look satisfactory. Comparing the two tables which differ in the values of p₁, we see that the corresponding estimator becomes more variable when p₁ is closer to 0.5.

Our main proposal is developed for estimating ₀ in the latency model. Table 2A and Table 2B correspond to the incidence models in Table 1A and Table 1B respectively. The proposed estimators of ₀⁽¹⁾ and ₀⁽²⁾ have reasonable performances but sometimes produce larger bias when the sample size is small or the censoring rate is high. Our another important proposal is the re-sampling scheme for variance estimation. To examine the performance, we first check whether the sample average of ˆ^*

j which solves (3.10e) is close to the true parameter value. Then we examine whether the proposed estimator  ˆ (ˆ_j), which is sample standard deviation of ˆ^*

j

(j1,..., )R , is close to the simulated estimate denoted as se(ˆ_j). The results are satisfactory. As a consequence, the coverage probability is close to the 95% nominal level in most cases. Notice that the results in Table 2B appear to be better than those in Table 2A since the former corresponds to higher incidence rate which provides more data to estimate the latency distribution.

Finally we examine the proposed model checking procedure. We first simulate data from an AFT model and then analyze it by an AFT model. Figures 4.1A and 4.1B show the two

components of V t( ; )ˆ based on Z and ₁ Z respectively. The observed curves are mostly ₂ located within 20 simulated curves which show that the fitted model is acceptable. Then we generate an AFT model and fit a location shift model. Figures 4.2A and 4.2B, the observed curves are located outside the simulated curves which show that the fitted model is not satisfactory.

Chapter 4 Literature Review for

Transformation Models with Cure 4.1 Background

In this chapter we consider the second class of models with the incidence model given by Pr( 1| )Z  ( ₀| )Z ⁰

exp( ) 1 exp( )

T T

Z Z



 

 .

And for   , 1 T follows a transformation model of the form _i ( ) ^T 0

h T Z   ,

where ( )h  is a unknown monotone function but the distribution of  is completely specified.

Note that we denote the distribution, survival and cumulative hazard functions of  as F_, S_ and  which are fully specified. The most well-known example is the proportional hazards (PH) _ model in which  follows the extreme value distribution with F s_

 

1 exp{ exp( )}- - s . When

 follows the standard logistic distribution with F sε

 

^exp

 

s / + ^{{1 exp}

 

s ^}, the model

becomes the proportional odds (PO) model.

For the discussions in this chapter, observed data are denoted as



⁽Xi^{, ,}i Zi^{), 1,...,}i n



, where X_i   and T_i C_i _i I T( _i C_i). The parameters of interest are( , )  while h t is an ( ) infinite-dimensional nuisance function. In the early stage of methodology development, statisticians including Kuk and Chen (1992), Sy and Taylor (2000) and Peng and Dear (2000) focused on the special case that the latency distribution follows the PH model. Then a new trend starting from Lu and Ying (2004) considers statistical inference for the whole of class of transformation models. In this chapter, we review existing literature for transformation cure models. Roughly speaking, existing inference approaches can be classified into two types. One is based on the likelihood principle and the other is based on moment properties.

4.2 Different model expressions

We first review different formulations of a transformation model since the form of model expression affects subsequent inference development. The most well-known representation is given by

( ) ^T 0

h T Z   , (4.1a)

which states that the failure time T for a susceptible subject can be written as a parametric linear model after an unknown monotone transformation. Alternatively one can also write

{S t_Z( )} h t( ) Z^T 0

     , (4.1b)

where S t_Z( )Pr(T t| 1, )Z is the survival function of T| 1,Z and F t_( ) 1 ^¹( )t which is a known function. The representation of (4.1b) says that a known transformation of the survival function leads to a linear structure in the parameters which contains an un-specified intercept function. One can also write (4.1b) in terms of the cumulative hazard function, defined as

( ) log{ ( )}

Z t S tZ

    , such that

0 0

( ) log[ { ( ) ^T }] { ( ) ^T }

Z t ^ h t Z  H h t Z 

      , (4.1c)

where H t( ) log{^¹( )}t is also completely specified. Notice that the above three equivalent expressions only allow for time-independent covariates. Later in Chapter 5, we will discuss the extension of including time-dependent covariates.

4.3 Likelihood approach under the PH model

The likelihood function under the transformation model can be written as

1 1

[ ( ) ( )] [ⁱ ( ) ( ) 1 ( )] ⁱ

i i

i Z i Z i i i

f x ^ S x ^

      ^



   



^ ^ ^, (4.2a)

where S t_Z( )^¹{ ( )h t Z^T} and f t_Z( ) S t_Z( ) /t . Expressing the function in terms of hazard and survival functions, we obtain

_      



  n 

i i

Z i i Z i Z i

i i

i x S x S x

( 1 )

~ ( ) ( )

~ ( )

(  ^     ^

 . (4.2b)

When the latency follows the PH model, (4.2b) can be written as

Equation (4.3) can be simplified by first considering complete data with _i being observed.

The completed likelihood function for (,) can be written as follows:

 

be written as the product of the following two terms:

_      

Both terms involve possibly missing _i and the second equation involves the nuisance function

0(.) which involves the complicated summation, is not easy to handle numerically, a Monte Carlo approach was suggested to implement the procedure.

Later researchers proposed to analyze the two terms in (4.5a) and (4.5b) separately. The log-likelihood can be rewritten as the sum of

      

There are two ways of handling the nuisance function in (4.5b) – ignore it as in the classical analysis without cure or estimate it using explicit formula. Now we discuss both approaches.

Motivated by the marginal distribution of ranks discussed in Kalbfleisch and Prentice (1973), Peng and Dear (2000) suggested to ignore the nuisance function ₀(.) in (4.5b). To simplify the discussion, assume there are no ties. Let x₍₁₎  x₍_k₎ be ordered uncensored failure times with

. Finally an estimator of

 can be obtained by maximizing

 

( )

( ) 2

1 ( )

( ) exp

{ (1 ) }exp

k T

i i T

i i i j

j R x

L Z

w Z

 

  







 

 

 

   

 

 

 .

In each iteration of the maximization, w is treated as a fixed value by plugging in previous _i estimates of ( , ,  S^₀). The final estimator is obtained when the convergence criteria is satisfied.

Sy and Taylor (2000) proposed two methods for handling (4.5b). Their first proposal suggested to substitute ₀( )t in ~ )

, ( ₀

L by the Breslow estimator given by

 

:( )

{1/ exp }

j j

l l

j x t l R

  z

 

 

. Their first approach becomes very similar to that of Peng and Dear

(2000). Their second proposal was motivated by Kalbfleisch and Prentice (1980) such that the nuisance function S t₀( ) is decomposed as the product-limit form of hazard rates at observed failure times. Then ~ )

( ₀

2  

L can be expressed as a function of  and the (baseline) hazard rates. The idea of profile likelihood estimation is adopted such that the hazard rates are estimated first, assuming that  is known, and then  is maximized based on the profile likelihood.

4.4 Moment approach for the class of transformation models

Lu and Ying (2004) extended the analysis to the whole class of transformation models.

Unknown parameters become { , , ( )}  h  with true values denoted by { , ₀ ₀,h₀( )} respectively.

Properties of counting processes and martingales are applied to construct estimating functions.

Specifically define N t_i( )I X( _i t,_i  and 1) F as the corresponding filtration up to time t . _t It follows that

log{ ( )}

[ _i( ) | _t ] ( _i ) _Z( ) ( _i ) S t^Z E dN t F I X t d t I X t

 t

      

 ,

where

0 0

exp( )

( ) 1 ( )

1 exp( ) 1 exp( )

Z T T Z

S t Z S t

Z Z



 

 

   .

Additional re-parameterization is needed to express [E dN t F_i( ) | _t_] as a simpler function of the unknown parameters. It follows that

0 0 0 0

M ti   h is a mean-zero martingale. This expression is nice since it is a tractable function of linear terms in the unknown parameters. Two sets of estimating functions can be constructed:

1 for the transformation cure model, estimation for the latency distribution does not use the idea of E-step since the compensator (I X_i t d) _Z( )t does not need the information of _i. Nevertheless since  is also unknown, additional set of estimating equation is needed. Lu and Ying (2004) suggested to modify the estimating function:

1 0

{ , , ( )}  h  .

The approach proposed by Lu and Ying is attractive since it can be applied to the whole class of transformation models. Their idea is based on martingale properties which can be viewed as a moment approach. In the next chapter, we also consider general transformation models but adopt the likelihood principle for inference.

Chapter 5 Proposed Approach for Transformation Cure Model under Independent Censoring

Recently Zeng and Lin (2006) extended the transformation model without cure to allow for time-dependent covariates. The model assumption is imposed on the cumulative intensity function which can be expressed as

0 0

( ) ( ) exp{ ( )} ( )

A tZ G^ I T s  Z s dR s ^





^, ^(5.1)

where ( )Z t denotes the vector of time-dependent covariates, (.)G is a known transformation and (.)R is an unknown increasing function. We derive the connection of model (5.1) with the original model expressions discussed in Chapter 4 under time-independent covariates with

Z t( ) . Notice that the cumulative hazards function can be written as: Z

0 0

( ) exp{ ( )} ( )

Z t G^  Z s dR s ^

   







 



^exp ⁰^T ^{log[ ( )]}



G  Z R t

 

{ 0^T ( )}

H  Z h t

  , (5.2)

where ^{H t}^{( )}^^G



^{exp( )}^t



^and^{h t}^{( )}^^{log ( )}^{R t} which correspond to (4.1c). The advantage of (5.1) is the inclusion of time-dependent covariates in the model. Besides the model extension, Zeng and Lin (2006) also proposed likelihood-based inference methods which may yield more efficient results. We consider two extensions. In this chapter we apply the approach to the mixture model when the latency distribution follows the transformation models. In Chapter 6, we discuss a complicated situation of the cure model when dependent censoring exists.

5.1 Model assumptions

Let  be the susceptible indicator which, given the covariate Z , follows the logistic

model:

exp( ) ( | )

1 exp( )

T T

Z Z

  

 

 . (5.3a)

For 1  , the cumulative intensity function can be written as



⁰ ^(s)



( | ) ^t ( ) ^T^Z ( )

A t Z^Z G



I Ts e^ dR s ^, ^(5.3b) where (.)G is a known function and (.)R is an unknown increasing function. Define

( ) ( )

Y s I T  and s

( )

( ; , )t R 0^tY s e( ) ^^T^{Z s} dR s( )

  



Accordingly for   , the intensity function can be written as 1

 

( ) ( ; , )

a t G t R

t  

 

 



^{( ; , )}



^{( ; , )}

( ; , )

t R

G t R

t R t

   

 

    

    

^^g



^{ }^{( ; , )}^t ^R



^{I T t e}⁽ ^ ⁾ ^^T^{Z t}^{( )}^{dR t}^{( )}^, ^(5.3c)

where ( ) G(t) t t

g 

  . For time-independent covariates, the cumulative hazard function and hazard

function are defined as _Z( )t and ^_Z( )t which have similar expressions as A t_Z( ) and a t_Z( ) respectively but without the stochastic component (I T t . Accordingly we have )

1 exp( )

( ) Pr( | ) exp{ ( )}

1 exp( ) 1 exp( )

Z T T Z

S t T t Z Z t

Z Z



 

    

  

and

1 exp( )

( ) log{ ( )} log exp{ ( )}

1 exp( ) 1 exp( )

Z Z T T Z

t S t Z t

Z Z



 

 

          

 .

5.2 Likelihood analysis for complete data

Suppose we have complete data

 

X_i,_i,_i



,i1,...,n



. Let N_i(t)I(T_i t,_i 1),

( ) ( )

i i

Y s I T  and s ^{( )}

( ; , ) 0^t ( ) ^T^{Z s}ⁱ ( )

i t R Y s ei ^ dR s

  



. The log-likelihood function for



dR,



proportion to

Directly maximizing the above likelihood is difficult. Nevertheless Chen (2009) proposed closed-form score equations which yield nice analytical results. Here we modify his results for the extended cure model. We need to assume that R t is a step function which takes jumps only at ( ) observed failure points. Let t_(0,] be an observed time and R_ R t( )_ is a step function which jumps at time t_. The score function for dR involves the following two derivative equations:

 

Re-arranging the terms in the above equation and applying the formula in (5.5a)-(5.5c),



 

5.3 Imputation for handling missing data

The next job is to deal with the unknown value of _i . We can replace _i by The final estimating equations can be written as

 

5.4 Numerical algorithm

Implementation of the proposed estimation procedure is stated as follows.

i. Starting with initial the Breslow estimator dR⁽⁰⁾( ) 1 /t_  n and



^⁽⁰⁾^,^⁽⁰⁾



^^{(0, 0)}^{, we}

obtain

⁽⁰⁾( | ) exp t

S t Z G

  

    ,

and

w_i^EM⁽⁰⁾

(0) (0)

( ) ( ) 1

i i

S t S t

  .

ii. Denote k as the indicator of iterations. Given w and _i^{( )}^k w_i^{EM k}^{( )}, first obtain dR⁽^k^¹⁾ from (5.9b) and then



^⁽^k^¹⁾^,^⁽^k^¹⁾



from (5.9a) and (5.9c).

iii. The estimate of the survival function is updated as

 



^{( )}



( 1) ( )

( | ) exp 0^t ^k ( )

k Z k

S ^ t Z  G



e^ dR s ^, which is applied to obtain

w_i^{EM k}⁽ ^¹⁾

( ) ( )

( ) ( ) ( )

( ) ( )

( ) ( ) 1 ( )

k k

i i

k k k

i i i

S t S t

 

   

 

  

and ˆ_i⁽^k^¹⁾   _i (1 _i)w_i^{EM k}⁽ ^¹⁾. Then the weights w_i⁽^k^¹⁾ are obtained from (5.5a),(5.5b) and (5.5c).

iv. Repeat the steps (ii) and (iii) for k 0,1, 2,... until convergence.

5.5 Simulation analysis

5.5.1 Data generation

We first generate covariate Z (Z Z₁, ₂)^T where Z₁~Ber(0.5)and Z₂ ~ N(0,1) truncated at

 and. Then we generate 2  which follows the Bernoulli distribution with probability

(0) (1)

0 0 1

0 (0) (1)

0 0 1

exp( )

( | )

1 exp( )

p Z Z

 

   

  

  ,

where ₀ ( ₀⁽⁰⁾, ₀⁽¹⁾)^T.

If 1  , we generate the latency variable T with



⁰ ⁰



( ) exp ^t ^Z^T ( ) S t^Z  G



e ^ dR u ^,

where Z^T₀ (Z Z₁, ₂) (^T  ₀⁽¹⁾, ₀⁽²⁾)₀⁽¹⁾Z₁₀⁽²⁾Z₂. To obtain T , we can generate U ~ Unif(0,1) and then solve

Z( )

U  S T ^^exp^^_^^G

 ^

⁰^T ^e^Z^T^⁰^{dR u}^{( )}



^^_^.

Accordingly

 

⁰

log ^Z^T 0^T ( ) G^  U e ^



dR u ^.

In the simulations, we choose the proportional odds model with G(t)log(1t) and R(t) . t² Hence G t^¹( )exp( ) 1t  and e^^log^U  1 e^Z^T^⁰ . Finally for T²   , we let 1

^T ^ ^e^^Z^T^⁰



^e^^log^U ^{ }¹



¹^_U^U ^e^^Z^T^⁰ ^.

If 0  , we set T to be a very large number exceeding the support of the censoring variable C which follows the uniform distribution Uniform(0,_C). The values of _C are set to yield the censoring rates Pr( 0) with 25% and 40% respectively. Note that when Pr(   , this 1) 1 setting is the same as that in Chen (2009).

5.5.2 Simulation results

Tables 3A and 3B summarize the results for n200 and n500 based on 1000 replications. In Table 3A, the bias and standard error of ^ˆ₀ and ˆ₁ as well as those of p ˆ₀ ( p₀ 0.88) and p (ˆ₁ p₁0.73) look reasonable. From Table 3B, the proposed parameter estimators of _j (j1, 2) are virtually unbiased and have satisfactory standard deviations which improve as the sample size increases. The performances of R t are evaluated at selected value of ( ) t t which is the pth percentile of p S(t |Z 0). The estimator of R is roughly unbiased but _p becomes more variable as t t increases. Recall that the setting is an extension of the one in _p

Chen (2009) in which Pr( 1) and hence we can make comparison to examine the effect of cure on estimation of . In present of cure, the proposed estimators are still roughly unbiased, but the standard deviations slightly increase.

Chapter 6 Proposed Approach for Transformation Cure Model Under Dependent Censoring

6.1 Model assumptions

Now we consider the more complicated but commonly-seen situation that, under the mixture framework, the event of interest is subject to competing risks such as death. Let T be the time to ₁ the event of interest, say disease onset and T be the time to dependent censoring. We denote Z ₂ as the vector of covariates. Also assume the mixture framework such that only a proportion of subjects with  1 develop the disease. The incidence rate is also described by the logistic model:

exp( ) ( | )

1 exp( )

T T

Z Z

  

 

 . (6.1a)

For  1, the marginal cumulative hazard function of T_j | 1 can be written as

^^ ^j^{( | )}^{t Z} ^^G^j

 ^

⁰^t^e^^T^j^{Z s}^{( )}^{dR s}^j^{( )}



^, ^(6.1b)

where (.)G_j is a known functions and R_j(.) is an unknown increasing function for both j1, 2. For those with  0, we assume that T₁  and

Pr(T₂ t| 0, )Z Pr(T₂ t| 1, )Z Pr(T₂ t Z| ). (6.1c) This assumption implies that the competing risk events follow the same distribution for the susceptible and cured populations. The joint survival function of ( ,T T₁ ₂) | 1 is assumed to follow a copula model of the form:

S s t( , )Pr(T₁ s T, ₂ t Z| ,  1) C S s Z S t Z_{ ( | ),₁ ₂( | )}, (6.1d) where ( | )S t Z_j exp{ _j( | )}t Z and  measures the degree of association between T₁ and

T2.

We introduce some popular copula models.

a. Clayton copula (Clayton, 1978): C_

 

u,v (u¹^^ v¹^^ 1)¹^/(¹^^⁾, 1;

b. Frank copula (Frank,1979):

 











 

 



 1

) 1 )(

1 1 ( log

, u v

v u

C ,0 1;

c. Positive stable copula (Hougaard, 1986):

^C^

 

^u^,^v ^^exp



^

 

^^log^u



¹^/^ ^



^^log^v



¹^/^



^



^,⁰^^ ^¹^.

Note that the independent copula is degenerated case with C_

 

u,v uv.

It is important to mention that when Pr(   , the model assumptions reduce to the 1) 1 framework discussed in Chen (2010) who adopted the likelihood approach for parameter estimation. For inference, we will take the same likelihood principle combined with the EM technique. Before presenting the detailed likelihood derivations, it is useful to discuss how to impute the unknown _i under the new assumptions.

6.2 Imputation under the new models

In absence of external censoring, data consists of {( ,T_i _i,Z i_i)( 1,..., )}n , where T_i T₁_i T₂_i and  _i I T( ₁_iT₂_i). Notice that

( _i 1| ,_i _i) _i (1 _i) ( _i 1| ,_i _i 0) E   T       E   T   . and

Pr(_i 1| ,T_i  _i 0) ¹ ²

1 2 1 2

Pr( 1, , )

Pr( 1, , ) Pr( 0, , )

i i i i i

i i i i i i i i i i

T T T T

T T T T T T T T



 

  

       

1 2

1 2 2

Pr( , | 1) Pr( 1)

Pr( , | 1) Pr( 1) Pr( 0) Pr( | 0)

i i i i i i

i i i i i i i i i i

T T T T

T T T T T T

 

   

   

         ,

where

1 2

Pr(T_i t T, _i t|_i 1) C_⁰¹{ ( | ),S s Z S t Z S₁ ₂( | )}₂(t Z| ),

2 2

Pr(T_i t|_i 0)S (t)

and (S_j t Z| )exp{ _j( | )}t Z   _j( t Z| ). Note that if g t( ) is increasing, g t( ) g t( )g t( ); while if g t( ) is decreasing, g t( ) g t(  ) g t( ).

In presence of censoring, data consists of {( ,T_i  ₁_i, ₂_i,Z i_i)( 1,..., )}n , where

Notice that the effect of censoring is cancelled out in the above derivations which implies that the conditional means only contain model parameters. This simplification avoids estimation of nuisance parameter in subsequent inference.

6.3 Likelihood analysis under dependent censorship

We extend the results of Chen (2010) by temporarily assuming that _i is observable.

Additional step of imputation is needed for further implementation. Based on completed data

 



T_i,  1_i, 2_i, _i ,i1,...,n



, the log-likelihood function can be written as

1 2

Notice that the log-likelihood function contains the cause-specified hazard probabilities:

1^CS( t) Pr(T1 [ ,t t ),T2 t T| 1 t T, 2 t, 1) objective is to re-parameterize the log-likelihood function in terms of the model parameters in

(6.1a) ~ (6.1d). To simplify the presentation, we illustrate the analysis in terms of time-independent covariates since adapting to time-varying covariates is straightforward. For j 1,2, we derive the following quantities. The probability density function is given by

( | ) exp{ ( | )} ( | )

  



. The corresponding cause-specified hazard function is given by

re-expressed as follows:



¹, ²



| exp_ _ ( ; , )__ ^ ^{( ; , )}^



( ; , )



( ) ^{( )} ( )

To simplify the notations, define

   

 

_ ( ; , )_ ^ ^

 

Accordingly we obtain

^{E dN t F}



^j^{( ) |} ^t^



^^g^j



^^j^{( ; , )}^t^ ^ ^R



^^{Y t e}^{( )}^ ^^T^j^{Z t}^j^{( )}^^{dR t}^j^{( )}^.

The cumulative intensity function can be re-expressed as follows:

Gj^{( , )} R 



₀^ gj



j^{( ;}t j^,Rj⁾



Y t e^{( )} ^^T^j^{Z t}^{( )}dR tj^{( )}^. ^(6.4c)

6.4 Score equations under dependent censorship

Let t_ be the observed event-j time and assume R t_j( )_ is step function at jump time t_, for 2

1

j . Differentiating the log-likelihood function with respect to dR t_j( )_ involves the following

two derivative equations. By (6.3c), we can define the derivative _jk( ; , )t  R as the differentiation

of ( ; , )_j t  R with respect to _k( ;t _j,R_j) (k,j1,2) as follows.

 

² ^{( )}

 

6.5 Numerical algorithm

Implementation of the proposed estimation procedure is stated as follows.

i. Starting with initial the Breslow estimator dR⁽⁰⁾_j ( )t_ 1 /n, for j1, 2, and



^⁽⁰⁾^,^⁽⁰⁾



^^{(0, 0)}^,

ii. Denote k as the indicator of iterations. Given w ,₁^{( )}_i^k w ,₂^{( )}^k_i w and _i^{( )}^k w_i^{EM k}^{( )} , first obtain

( 1) 1

dRk^ from (6.8), dR₂⁽^k^¹⁾ from (6.9) and then



^⁽^k^¹⁾^,^¹⁽^k^¹⁾^,^²⁽^k^¹⁾



from (6.10) and (6.11).

iii. The estimate of the survival function is updated as

 



^{( )}



( 1) ( )

( | ) exp 0 ( )

t Z

k k

j j j

S ^ t Z  G



e^ dR s ^{, for} ^j^^{1, 2}^,

 

( 1) ( 1) ( 1)

00^k ( | ) 1^k ( | ), 2^k ( | ) C ^ t Z C S_ ^ t Z S ^ t Z which is applied to obtain

( ) ( ) ( )

( 1) 01 2 00

1 2 ( ) ( ) ( ) 1 2 ( ) ( )

01 2 2 00 2

( ) ( | ) ( )

(1 )(1 )

( ) ( | ) ( | ) ( ) ( | )

k k k

EM k i i

i i i k k k i i k k

i i

C t S t Z C t

w ^   C t S t Z ^S t Z    C t S t Z

   

and ˆ_i⁽^k^¹⁾   _i (1 _i)w_i^{EM k}⁽ ^¹⁾ . Then the weights w_1i⁽⁰⁾ , w_2i⁽⁰⁾ , w_i⁽⁰⁾ are obtained from (6.7a)~(6.7e),(6.8a)~(6.8e).

iv. Repeat the steps (ii) and (iii) for k 0,1, 2,... until convergence.

6.6 Simulation analysis

6.6.1 Data generation

We also generate covariate Z (Z Z₁, ₂)^T where Z₁~Ber(0.5)and Z₂ ~ N(0,1) truncated at

 and set 2

(0) (1)

0 0 1

0 (0) (1)

0 0 1

exp( )

( | )

1 exp( )

p Z Z

 

   

  

  ,

where ₀ ( ₀⁽⁰⁾, ₀⁽¹⁾)^T.

We assume that (Z Z affects the ₁, ₂) ( ,T T . Marginally for ₁ ₂) T₁|  , we set 1 G t₁( )log(1 t) and R t₁( ) which corresponds to a proportional odds model with the survival function: t²

^{S t Z}^¹^{( | )}^^exp



^^G¹

 ^

⁰^t^e^^{1 1}^Z^^²^Z²^{dR u}¹^{( )}

 

For T₂|  , we set 1 G t₂( ) and t R t₂( ) which corresponds to the survival function: ct

^{S t Z}^²^{( | )}^^exp



^^G²

 ^

⁰^t^e^^{3 1}^Z^^⁴^Z²^{dR u}²^{( )}

 

The value of c controls the proportion of experiencing the two events (i.e. Pr(T₁T₂|  ) and 1) in the simulations we set c0.3 and c0.15.

Now we describe how to simulate ( ,T T₁ ₂) | 1,Z Z₁, ₂ which jointly follows a Clayton model. At first, we generate a pair of correlated failure times ( ,Y Y following the Clayton ₁ ₂) distribution with exponential marginals and the association parameter  related to Kendall’s tau

在文檔中治癒模式之半母數迴歸分析 (頁 24-0)

Numerical algorithm and variance estimation 17

2 Literature Review for Semi-parametric Linear Models with Cure 4

3.4 Numerical algorithm and variance estimation 17











3.5 Model Checking





3.6 Simulation analysis





Chapter 4

Literature Review for

Transformation Models with Cure 4.1 Background

 

 

 

 





4.2 Different model expressions

4.3 Likelihood approach under the PH model



      

 

      

      

 

 

 

 

 

4.4 Moment approach for the class of transformation models

Chapter 5

Proposed Approach for Transformation Cure Model under Independent Censoring





 









5.1 Model assumptions









 









5.2 Likelihood analysis for complete data

 











 

 

5.3 Imputation for handling missing data

 

5.4 Numerical algorithm









 







5.5 Simulation analysis







 

_      

_      

 ^

 ^

 ^

 ^