Semiparametric Regression Analysis of Cure Models

1 Introduction

1.1 Literature background

Traditional survival models assume that every subject in the study will eventually experience the event of interest. However, Kaplan-Meier curves based on empirical data often level off at the right tail and exhibit a stable plateau. Survival analysis that accounts for the possibility of cure or non-susceptibility has received increasing attention in the literature since it provides reasonable explanations for some scientific phenomena. The most popular approach to analyzing survival data in the presence of cure is to represent the population as a mixture of susceptible and cured subjects. Define $\zeta$ as the indicator of susceptibility. The population is divided into two groups: the susceptible with $\zeta = 1$ and the cured with $\zeta = 0$. For $\zeta = 1$, define $T$ as the time to the failure event. When $\zeta = 0$, $T$ is undefined or conventionally set to be infinity. Accordingly one can write

$$\Pr(T > t) = \Pr(T > t \mid \zeta = 1)\Pr(\zeta = 1) + \Pr(\zeta = 0).$$

In the presence of covariates, denoted by $Z$, the mixture model can be written as

$$\Pr(T > t \mid Z) = \Pr(T > t \mid \zeta = 1, Z)\Pr(\zeta = 1 \mid Z) + \Pr(\zeta = 0 \mid Z). \tag{1.1}$$

Under the above mixture framework, most of the literature assumes that the incidence function follows the logistic regression model, which can be written as

$$\Pr(\zeta = 1 \mid Z) = \pi(\gamma_0 \mid Z) = \frac{\exp(Z^T\gamma_0)}{1 + \exp(Z^T\gamma_0)}. \tag{1.2}$$

Different proposals for modeling the latency variable $T \mid \zeta = 1$ have appeared in the literature. Parametric models including the Weibull, generalized gamma and generalized F distributions have been proposed by Farewell (1982), Yamaguchi (1992) and Peng et al. (1998), respectively.

Semi-parametric models are more popular choices because of their flexibility and robustness. Most popular semi-parametric models can, after some transformation, be written in a linear regression form. For modeling $T \mid \zeta = 1$, one can write

$$h(T) = Z^T\beta_0 + \varepsilon, \tag{1.3}$$

where $\beta_0$ ($p \times 1$) is the unknown regression parameter of interest, $h(\cdot)$ is a monotone function and $\varepsilon$ is the error term whose distribution does not depend on $Z$.

Now we discuss two general classes of model (1.3). The first class, referred to as semi-parametric linear models, assumes that $h(\cdot)$ is given but the error distribution is unknown. For example, if $h(t) = \log(t)$, the model becomes an accelerated failure time (AFT) model. If $h(t) = t$, it becomes a location-shift model. Hence the unknown quantities are $\beta_0$ and $F_\varepsilon^0(t) = \Pr(\varepsilon \le t)$.

The other class, known as transformation models, assumes that $h(\cdot)$ is unknown but the distribution of $\varepsilon$ is specified. For the Cox proportional hazards (PH) model, $\varepsilon$ follows the extreme value distribution with $S_\varepsilon^0(t) = \exp\{-\exp(t)\}$ and $\Lambda_\varepsilon^0(t) = \exp(t)$. For the proportional odds model, $\varepsilon$ follows the logistic distribution with $S_\varepsilon^0(t) = \exp(-t)/\{1 + \exp(-t)\}$. The unknown quantities are $\beta_0$ and $h(\cdot)$.
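The two specified error distributions can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and it only verifies that the extreme value error has cumulative hazard $\exp(t)$, that the logistic error has failure odds $\exp(t)$, and that shifting the argument by a constant rescales the cumulative hazard by a constant factor.

```python
import numpy as np

def surv_extreme_value(t):
    """Survival function exp{-exp(t)} of the extreme value error (PH model)."""
    return np.exp(-np.exp(t))

def surv_logistic(t):
    """Survival function exp(-t)/{1+exp(-t)} of the logistic error (PO model)."""
    return np.exp(-t) / (1.0 + np.exp(-t))

t = np.linspace(-2.0, 2.0, 9)

# Cumulative hazard of the extreme value error: -log S(t) = exp(t).
cumhaz_ph = -np.log(surv_extreme_value(t))

# Failure odds of the logistic error: F(t)/S(t) = exp(t).
odds_po = (1.0 - surv_logistic(t)) / surv_logistic(t)

# A constant shift c in the argument (hypothetical value, standing in for a
# linear predictor) multiplies the cumulative hazard by the constant exp(-c).
c = 0.7
hazard_ratio = -np.log(surv_extreme_value(t - c)) / cumhaz_ph
```

The constant ratio in the last line is what makes the extreme value choice correspond to proportional hazards; the analogous statement for the logistic choice holds for the odds.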

Nonparametric analysis for cure models with right censored observations may suffer from the inherent non-identifiability problem. A censored observation indicates two possible situations:

the subject may be susceptible but the event has not occurred by the end of study; or he/she is cured. To distinguish the two different cases, the follow-up period has to be long enough to observe the susceptible ones as much as possible. The book of Maller and Zhou (1996) discusses the issue of identifiability and presents nonparametric tests to verify the condition of sufficient follow-up. Despite the theoretical contribution, these tests are not practical due to their low power.

Therefore for practical applications, expert opinions about whether cure exists or not are important for choosing an appropriate model (Farewell, 1986).

1.2 Outline of the thesis

This thesis considers semi-parametric inference based on models in (1.2) and (1.3). The problem of non-identifiability is not as serious as in the nonparametric setting since additional

model assumptions will be imposed. For the latency distribution, we consider both classes of models. We review the literature for semi-parametric linear models and present our proposal in Chapters 2 and 3, respectively. Then we consider transformation models in Chapters 4, 5 and 6.

Chapter 4 reviews existing literature and Chapters 5 and 6 present our proposals under independent censoring and dependent censoring respectively. Chapter 7 contains concluding remarks.

Chapter 2 Literature Review for Semi-parametric Linear Models with Cure

2.1 Overview

Under the mixture framework, assume that

$$\Pr(\zeta_i = 1 \mid Z_i) = \pi_i(\gamma_0 \mid Z_i) = \frac{\exp(Z_i^T\gamma_0)}{1 + \exp(Z_i^T\gamma_0)}$$

and, for $\zeta_i = 1$, we have

$$h(T_i) = Z_i^T\beta_0 + \varepsilon_i,$$

where $h(\cdot)$ is specified and $\varepsilon_i$ $(i = 1, \ldots, n)$ form an iid sample with an unknown marginal distribution independent of $Z_i$. Define $f_\varepsilon^0(t)$, $F_\varepsilon^0(t)$, $S_\varepsilon^0(t)$ and $\Lambda_\varepsilon^0(t)$ as the density, distribution, survival and cumulative hazard functions of $\varepsilon$, respectively, all of which are unspecified. In this chapter, we review the existing literature on estimating $(\gamma_0, \beta_0)$ in the presence of the nuisance function $S_\varepsilon^0(t)$. Note that when $h(t) = \log(t)$ the model becomes the AFT model, and when $h(t) = t$ it becomes the location-shift model.

Let $C_i$ be the censoring variable for the $i$th subject. We will assume that $C_i$ and $T_i$ are independent. Denote the observed data as $\{(X_i, \delta_i, Z_i),\ i = 1, \ldots, n\}$, where $X_i = \min(T_i, C_i)$ and $\delta_i = I(T_i \le C_i)$. Before we discuss specific methods, it is useful to examine the inference problem using the classical likelihood approach. One can express the data on the scale of the error variable. Let $\varepsilon_i(\beta) = h(T_i) - Z_i^T\beta$, $\varepsilon_i^C(\beta) = h(C_i) - Z_i^T\beta$ and $\tilde\varepsilon_i(\beta) = h(X_i) - Z_i^T\beta$. Note that $\varepsilon_i(\beta_0)$ has the same distribution as $\varepsilon$ when $\beta_0$ is the true value of $\beta$. The likelihood function can be written as

$$L(\gamma, \beta, f_\varepsilon^0) = \prod_{i=1}^{n} \left\{\pi_i(\gamma)\, f_\varepsilon^0(\tilde\varepsilon_i(\beta))\right\}^{I(\delta_i = 1)} \left\{\pi_i(\gamma)\, S_\varepsilon^0(\tilde\varepsilon_i(\beta)) + 1 - \pi_i(\gamma)\right\}^{I(\delta_i = 0)}, \tag{2.1a}$$

where $\pi_i(\gamma) = \exp(Z_i^T\gamma)/\{1 + \exp(Z_i^T\gamma)\}$.

The second component on the right-hand side of (2.1a) becomes complicated after taking the logarithm.
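To make the structure of the observed-data likelihood concrete, here is a minimal sketch that evaluates its logarithm for user-supplied error density and survival functions. The function and argument names are hypothetical. In the semi-parametric setting the error density is unknown, so the extreme value error used in the usage lines is purely an illustrative assumption.

```python
import numpy as np

def pi_logistic(Z_inc, gamma):
    """Logistic incidence probability exp(Z'gamma) / {1 + exp(Z'gamma)}."""
    return 1.0 / (1.0 + np.exp(-(Z_inc @ gamma)))

def cure_loglik(gamma, beta, hX, delta, Z_inc, Z_lat, f0, S0):
    """Observed-data log-likelihood of a mixture cure model.

    hX     : transformed observed times h(X_i)
    delta  : censoring indicators (1 = event observed)
    Z_inc  : covariates of the incidence model (may include an intercept)
    Z_lat  : covariates of the latency model
    f0, S0 : density and survival function of the error term (callables)
    """
    pi = pi_logistic(Z_inc, gamma)
    resid = hX - Z_lat @ beta                 # observed residuals on the error scale
    ll_event = np.log(pi * f0(resid))         # susceptible subject with an observed event
    ll_cens = np.log(pi * S0(resid) + 1.0 - pi)  # censored: susceptible or cured
    return float(np.sum(np.where(delta == 1, ll_event, ll_cens)))

# Illustrative assumption: extreme value error, as in the PH model.
f_ev = lambda e: np.exp(e - np.exp(e))
S_ev = lambda e: np.exp(-np.exp(e))

Z_inc = np.array([[1.0], [1.0]])
Z_lat = np.array([[0.0], [0.0]])
ll = cure_loglik(np.array([0.0]), np.array([0.0]),
                 np.array([0.0, 0.0]), np.array([1, 0]),
                 Z_inc, Z_lat, f_ev, S_ev)
```

The censored contribution is the logarithm of a sum, which is why the parameters do not separate and an EM-type treatment is attractive.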

The idea of the EM algorithm is often adopted in statistical inference for cure models. If the “complete” data, denoted as $\{(X_i, Z_i, \delta_i, \zeta_i),\ i = 1, \ldots, n\}$, are available, the above likelihood in (2.1a)

The resulting log-likelihood function can be written as

$$\log L(\gamma, \beta, f_\varepsilon^0) = \ell(\gamma, \beta, f_\varepsilon^0) = \ell_1(\gamma) + \ell_2(\beta, f_\varepsilon^0).$$

Notice that the parameters $\gamma$ and $\beta$ become separated in (2.3a) and (2.3b), respectively.

Accordingly the score functions for  and  become least its parametric form, is known. However these two conditions often do not hold in practical applications. Now we discuss how to handle these problems.

To deal with a possibly unknown value of $\zeta_i$, a common approach is to replace it by an imputed value, often an estimate of its conditional mean given the observed data. Notice that when

$\delta_i = 1$

Under the imposed models, we write

Under the semi-parametric setting, the major challenge is the log-likelihood function in (2.3b) or the score equation in (2.4b), which involves the nuisance functions $f_\varepsilon^0(\cdot)$, $f_\varepsilon^{0\prime}(\cdot)$ and $S_\varepsilon^0(\cdot)$, the first two of which are complicated. Existing methods try to get rid of the density function

$f_\varepsilon^0(\cdot)$

Note that we use the same notation in (2.7a) and (2.7b) to simplify the presentation since these two functions are asymptotically equivalent. Replacing $\zeta_i$ by $\delta_i + (1 - \delta_i) w_i(\gamma, \beta, S_\varepsilon)$, we have

ˆ ( | , ) estimating equation along with the score equations in (2.4a) and (2.4b) or its modified version.

We now introduce two papers which provide different ways of modifying the second score equation. Note that, since the transformation $h(\cdot)$ is known, these papers usually assume the accelerated failure time model with $h(t) = \log(t)$.

2.2 M-Estimation by Li & Taylor (2002)

Li and Taylor (2002) extended the idea of M-estimators by Ritov (1990) to cure models. First, the covariate Z is centered to exclude the unknown intercept term:

0 0





. Following Ritov (1990), Li and Taylor (2002) suggested to replace

Finally Li and Taylor (2002) proposed to modify (2.4b) by

2.3 Log-rank type Estimation by Zhang & Peng (2007)

The log-likelihood function in (2.1b) is expressed in terms of density and survival functions.

Zhang and Peng (2007) re-wrote the function in terms of hazard and survival function such that

Zhang and Peng (2007) found new insights from (2.11). Specifically, replacing _i by (1 )

In particular, the expression in (2.12b) can be viewed as the likelihood from the model such that ( )_i _i^T _i*

h T Z   , (2.13a)

where _i^* has the hazard function  _i _⁰( ) . Notice that the problem in (2.13a) becomes a semi-parametric model without cure. It has the form of a semi-parametric linear model since (.)h is specified while the distribution of _i^* is unknown.

The proposal of Zhang and Peng (2007) was motivated by the work of Wei (1992) who incorporated the rank estimation method under the framework of PH models. Specifically (2.13a) can be written as

*( ) ( ) ^T

i h Ti Zi

     .

As mentioned earlier _i^* has the hazard function  _i _⁰( ) . Consider a more general type of

proportional hazards model for ^*

PH( )t _( ) exp(t Z^T )

    . (2.14a)

The score equation for  deriving from the partial likelihood function based on model (2.14a) is given by

Notice that (2.14b) has the form of log-rank statistics. When   , which reduces to the true 0 model (2.13a), the above score function becomes

Zhang and Peng (2007) suggested to add a weight function (0) to (2.14c) and proposed the following estimating function:

2.4 Sketch of Numerical Algorithm for EM-type Estimation

Now we discuss how to implement the estimation procedures, which will also be adopted by the proposed approach discussed in the next chapter. We need to solve two estimating equations given the weights $w$: the score equation for $\gamma$ and the estimating equation for $\beta$.

For numerical implementation, let $w^{(m)}$ be the $m$th step estimate of $w$ based on $(\hat\gamma^{(m)}, \hat\beta^{(m)}, \hat S_\varepsilon^{(m)})$. It is used to solve the score equation for $\gamma$ and $U^*(\beta \mid w^{(m)}) = 0$ to obtain $(\hat\gamma^{(m+1)}, \hat\beta^{(m+1)})$ and

$$\hat S_\varepsilon^{(m+1)}(t) = \exp\left\{-\sum_{u \le t} \frac{\sum_{i=1}^{n} I\big(\tilde\varepsilon_i(\hat\beta^{(m+1)}) = u,\ \delta_i = 1\big)}{\sum_{i=1}^{n} \big\{\delta_i + (1 - \delta_i)\, w_i^{(m)}\big\}\, I\big(\tilde\varepsilon_i(\hat\beta^{(m+1)}) \ge u\big)}\right\}.$$

The procedure is repeated for $m = 0, 1, 2, \ldots$ until convergence.
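The survival update in each iteration is a weighted Breslow-type estimator computed from the current residuals and weights. A minimal sketch follows; the function name is hypothetical, and the residuals are assumed to have been computed already at the current value of the regression parameter.

```python
import numpy as np

def weighted_breslow_survival(t, resid, delta, w):
    """Weighted Breslow-type estimate of the error survival function at t.

    resid : observed residuals on the error scale
    delta : censoring indicators (1 = event observed)
    w     : current weights for the censored subjects (ignored when delta = 1)
    """
    # Imputed susceptibility: 1 for observed events, w_i for censored subjects.
    zeta_tilde = delta + (1 - delta) * w
    cumhaz = 0.0
    # Sum over the distinct event residuals u <= t.
    for u in np.unique(resid[(delta == 1) & (resid <= t)]):
        n_events = np.sum((resid == u) & (delta == 1))
        at_risk = np.sum(zeta_tilde * (resid >= u))  # weighted risk set size
        cumhaz += n_events / at_risk
    return float(np.exp(-cumhaz))

resid = np.array([1.0, 2.0, 3.0])
delta = np.array([1, 0, 1])
w = np.array([0.0, 0.5, 0.0])
s_at_3 = weighted_breslow_survival(3.0, resid, delta, w)
```

In the toy input the censored subject counts as half a subject in the risk set at the first event, so the two hazard increments are 1/2.5 and 1/1.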

It is important to note that solving $U^{LT}(\beta \mid w^{(m)}, \hat S_\varepsilon^{(m)}) = 0$ is more difficult than solving $U^{ZP}(\beta \mid w^{(m)}) = 0$, since $\hat S_\varepsilon^{(m)}$ plays a more important role in the former equation. As a result, a grid search with a large number of finely spaced points is suggested by Li and Taylor (2002). In the simulation studies conducted by Zhang and Peng (2007), the estimator proposed by Li and Taylor (2002) may fail to be consistent.

The dependency of $w$ on $(\gamma, \beta, S_\varepsilon)$ complicates theoretical analysis. Neither paper derived asymptotic properties of its proposed estimators. The bootstrap approach was suggested by Zhang and Peng (2007) for variance estimation. We will briefly discuss this approach in the next chapter.

Chapter 3 Proposed Approach for Semiparametric Linear Models

In this chapter we present our proposal to replace the second score function in (2.4b):

3.1 Martingale estimating function based on complete data

Temporarily we assume that the information of $\zeta_i$ is available. Recall that

the at-risk process for a susceptible subject:

$$Y_i(t; \beta) = I\big(\tilde\varepsilon_i(\beta) \ge t,\ \zeta_i = 1\big) \tag{3.1b}$$

and the corresponding filtration for the susceptible group:



3.2 The proposed estimating functions

A possibly unknown $\zeta_i$ can be replaced by its imputed value: $\tilde\zeta_i = \delta_i + (1 - \delta_i) w_i$, where

We propose the following estimating function for $\beta$:

The expression of $U(\beta \mid w)$ in (3.4b) is equivalent to $U^{ZP}(\beta \mid w)$ proposed by Zhang and Peng (2007), although the two proposals are developed from different ideas. Our approach, however, starts from the concept of martingales, which provides a useful framework for further analysis including large-sample analysis, variance estimation and model checking.

3.3 Large sample analysis

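Recall that the proposed estimators of $(\gamma_0, \beta_0)$, denoted as $(\hat\gamma, \hat\beta)$, jointly solve the score equation for $\gamma$ and the estimating equation $U(\beta \mid \hat w) = 0$, with both equations evaluated at the weights $w\{\gamma, \beta, \hat S_\varepsilon\}$. Here $\hat w = \{\hat w_j,\ j = 1, \ldots, n\}$, $\hat w_i = w_i\{\gamma, \beta, \hat S_\varepsilon(\cdot \mid \beta, \hat w)\}$ and

$$\hat S_\varepsilon(t \mid \beta, w) = \exp\left(-\sum_{u \le t} \frac{\sum_{i=1}^{n} I\big(\tilde\varepsilon_i(\beta) = u,\ \delta_i = 1\big)}{\sum_{i=1}^{n} \{\delta_i + (1 - \delta_i) w_i\}\, I\big(\tilde\varepsilon_i(\beta) \ge u\big)}\right).$$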

We also define

$$w_i^* = w_i(\gamma_0, \beta_0, S_\varepsilon^0) = \frac{\pi_i(\gamma_0)\, S_\varepsilon^0(\tilde\varepsilon_i(\beta_0))}{\pi_i(\gamma_0)\, S_\varepsilon^0(\tilde\varepsilon_i(\beta_0)) + \{1 - \pi_i(\gamma_0)\}},$$

where $S_\varepsilon^0(\cdot)$ is the true survival function. It is not easy to establish asymptotic properties of $(\hat\gamma, \hat\beta)$ jointly since $\hat w$ still depends on $(\gamma, \beta)$ in a complicated way. To proceed with the theoretical development, we need the following assumption.

Assumption: $\sup_i |\hat w_i - w_i^*| = o(n^{-1/3})$ a.s. for all $\gamma$ and $\beta$.

Note that the quality of the weights still plays an important role. We ran simulations to evaluate the effect of using arbitrary weights, and the results show that arbitrary weights lead to a biased estimator of $\beta$. The imposed assumption is a condition to ensure that $\hat w$ is a good weight. Under this assumption, the score equation for $\gamma$ can be ignored in the evaluation of $\hat\beta$ since $\hat\gamma$ affects $\hat\beta$ only through $\hat w$.
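The weight has the interpretation of the conditional probability that a censored subject is susceptible. A minimal sketch of the computation is given below; the function names are hypothetical, and the incidence probabilities and the survival values at the residuals are assumed to be supplied by earlier steps.

```python
import numpy as np

def susceptibility_weights(pi, S_resid):
    """Weight w_i = pi_i S(e_i) / {pi_i S(e_i) + 1 - pi_i}.

    pi      : incidence probabilities of being susceptible
    S_resid : error survival function evaluated at the observed residuals
    """
    num = pi * S_resid
    return num / (num + 1.0 - pi)

def imputed_zeta(delta, w):
    """Imputed susceptibility: 1 for observed events, w_i for censored subjects."""
    return delta + (1 - delta) * w

pi = np.array([0.5, 0.5, 0.9])
S_resid = np.array([0.5, 0.0, 1.0])
delta = np.array([0, 0, 1])
w = susceptibility_weights(pi, S_resid)
zeta_tilde = imputed_zeta(delta, w)
```

A censored subject whose residual lies beyond the support of the error survival function receives weight zero, which reflects the role of sufficient follow-up in identifying the cured group.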

Now we can focus on

0 1

Temporarily ignoring the estimated weight, first we examine the property of

U( | w^) ^* the sum of the following three terms:

  

(   . We apply similar techniques of Ying (1993) to prove that

1n 2n

B B has the order o n( ^{1/ 2}) in a o(n¹^/³) neighborhood of ₀. See Appendix 1 for the proof.

By the Taylor’s expansion,

^^{( ; )}^t   ^^{( ;}^t ⁰⁾



^^{( ;}^t ⁰⁾^o⁽¹⁾



^Z^T⁽  ⁰⁾, where _^*( ; )t   _^( ; ) /t   or equivalently t

 

estimated weight. Our original goal is to show that n^^{1/ 2}U( | ) wˆ n^^{1/ 2}U( | w^)o_p(1). get the desirable property. However since this is not a realistic assumption, we have to try other approaches.

We obtain some intermediate results. Applying similar techniques of expansion, we can write U( | ) wˆ U(₀| )wˆ A n^ˆ_n (  ₀)r_n, (3.7b) where the components of A^ˆ_n are similar to A with w_n ^ being replaced by wˆ . The difference of (3.7a) and (3.7b) directly follows that

* *

[ ( |U  w )U(0|w )][ ( | )U  wˆ U(₀| )]wˆ

 



0 0



1 1

ˆ ˆ

( ; , ) ( ; , ) ( ; ) ( ; , ) ( ; , ) ( ; )

n n

i i

Z t  w Z t  w dN t  Z t  w Z t  w dN t 

   

 



 

 

 

 ^{ .}^dⁿ

In Appendix 2, we show that d_n o n( ^{1/ 3}). Notice that, based on the right-hand sides of (3.7a) and (3.7b), one can also write

d_n  A n_n (  ₀)o n( ^{1/ 2}n   ₀ ) A nˆ_n (  ₀)r_n.

In Appendix 3, we show that r_n o n( ^{1/ 2}) and hence A_n  A^ˆ_n. We aim to establish the result:



^ˆ ⁰



^{(0, ( )} ¹ ^{( ) )}¹

n   Normal A ^  A ^ , (3.8)

where A is the limit of A and _n  is the limit of



⁰



² ⁰

1 ( ; , ) ( ; )

n i

Z Z t w dN t

n  ^ ^ 



 



 ^.

However the above proofs are not enough to make this conclusion. Let’s summarize the results that we have obtained:

U( | ) wˆ U(₀|w^*) { ( U ₀| )w Uˆ  (₀|w^*)}A n_n (  ₀)o n( ^1/2).

Note that U(₀|w^*)Normal(0, ) . If U(₀| )wˆ U(₀|w^*)o n( ^{1/ 2}) , it follows that asymptotically, 0n^^{1/ 2}U( | )ˆ ˆw A n( ^ˆ ₀) which implies the normality of ˆ . In

developing the variance estimator of ˆ, we still rely on the result in (3.8).

In Appendix 4, we show that for each t, Z t( ;₀, )wˆ Z t( ;₀,w^)o_p(1), but the order after taking the sum is still not derived yet. The final goal is to prove

1 ₀ 1 ₀ ^*

( | )ˆ ( | ) _p(1)

U w U w o

n   n   .

Note that

U(₀| )wˆ U(₀|w^*) 0



0 0



0 1

( ; , ) ( ; , )ˆ ( ; )

i i

Z t  w Z t  w dN t 

 



 

 ^.

In Appendix 4, we show that for each t, Z t( ;₀, )wˆ Z t( ;₀,w^)o_p(1), but the order after

taking the sum is still not derived yet. The difficulty comes from the dynamic weight which is a complicated function of



  . We have conducted simulations to check whether ^,





⁰ ⁰ ^*



3.4 Numerical algorithm and variance estimation

The proposed estimators solve

( | )w

with S_ being replaced by the following explicit formula:

The estimation procedure requires many iterations, each updating the weights using the previous estimates. Specifically, let $w^{(m)}$ be the $m$-th step estimate of $w$ based on $(\hat\gamma^{(m)}, \hat\beta^{(m)}, \hat S_\varepsilon^{(m)})$.

The procedure is repeated for $m = 0, 1, 2, \ldots$ until convergence.

3.4.1 Re-sampling based on bootstrap approach

The non-differentiability of the estimating function and the complicated, dynamic weight components make it difficult to derive an analytic formula for variance estimation. The bootstrap approach provides a simulation scheme that requires no extra analytic work. Suppose that $R$ bootstrap samples are drawn from the original data $\{(X_i, \delta_i, Z_i^T),\ i = 1, \ldots, n\}$. For each bootstrap sample, we perform the estimation procedure, which involves solving the score equation for $\gamma$ and $U(\beta \mid w^{(m)}) = 0$ for, say, $m = 1, \ldots, M$. The sampling distributions of $\hat\gamma$ and $\hat\beta$ can be approximated based on the $R$ bootstrap estimates. The bootstrap method is time-consuming since it involves solving for the roots $R \times M$ times. Note that solving $U(\beta \mid w^{(m)}) = 0$ even once is not an easy task.
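The resampling loop itself is generic. The following minimal sketch shows it for an arbitrary user-supplied estimator; the function name and arguments are hypothetical, and the sample mean in the usage lines is only a stand-in for the iterative cure-model fit, which would replace it in practice.

```python
import numpy as np

def bootstrap_se(data, estimator, R=200, seed=0):
    """Nonparametric bootstrap standard errors.

    data      : array with one row per subject, e.g. columns (X, delta, Z)
    estimator : function mapping a data array to a parameter estimate
    R         : number of bootstrap samples
    """
    rng = np.random.default_rng(seed)
    n = data.shape[0]
    estimates = []
    for _ in range(R):
        idx = rng.integers(0, n, size=n)       # resample subjects with replacement
        estimates.append(estimator(data[idx]))  # refit on the bootstrap sample
    return np.std(np.array(estimates), axis=0, ddof=1)

# Stand-in example: standard error of a sample mean.
rng = np.random.default_rng(1)
toy = rng.normal(size=(400, 1))
se = bootstrap_se(toy, lambda d: d.mean(axis=0), R=200, seed=0)
```

Each call to `estimator` corresponds to one full run of the iterative procedure, which is the source of the computational burden described above.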

3.4.2 Re-sampling based on pivotal estimating functions

Parzen, Wei and Ying (1994) proposed a re-sampling method which has become a popular tool for variance estimation for many semi-parametric inference problems. This approach is useful when the estimating function is not smooth. In other situations, the derivative of the score function can be derived under some regularity conditions but still contains unknown density functions which cannot be estimated based on the simple plug-in approach.

Now we apply and modify the idea of Parzen et al. (1994). For our problem, the pivotal estimating function are the asymptotic distributions of

* 0

( | ) ( | )

U w

 



 

 

 

 

  

 .

Directly applying this approach, we first need to generate many replicates from the pivotal distribution denoted as (_j,U_j) for j1,...,R. Then solve

( | { , , ˆ }) ( | { , , ˆ })

w S

U w S U



   

  

   

   

 

   

  . (3.9c)

Let ( ,   be the corresponding solution for _j _j) j1,...,R. Then the conditional distribution of

 , given the observed sample, is asymptotic equivalent to the unconditional distribution

of ⁰ conditional on the observed sample, can be used to approximate the unconditional distribution of

( , ) ˆ ˆ .

The above procedure, however, is very time-consuming since it still involves many iterations to obtain the solution of (3.9c). We propose to modify the procedure by solving

where w denotes the final estimated weight. This modification can avoid the time-consuming ˆ^* iterations within each re-sampling run. In Appendix 3, we will see that this modification still produces valid results for variance estimation.

Now we derive the algorithm to simulate random samples from the pivotal distributions.

Since our interest is in , we only need to focus on U( | w^*) since it does not involve other

1/ 2 * 0 ˆ

( ; ) ~ _p(0, )

n^ U  w N  , (3.10c)

where the covariance matrix can be estimated by

ˆ ˆ

ˆ ( , ) ( , ) /ˆ ˆ

n T

i i

w w n

   



 



^. ^(3.10d)

In Section 3.3, we have shown that for  in a small neighborhood of ₀, n U^^{1/ 2} ( | wˆ^)n U^^{1/ 2} (₀;wˆ^)An^{1/ 2}(  ₀)o_p(1),

where A is the asymptotic slope matrix of n^^{1/ 2}U(₀;w^*) . We simulate G_i ~N(0,1) independently for i1,...,n. Let ^ˆ^* be the solution to







  ⁿ

i i w Gi

w U

ˆ ) ˆ, ( ˆ )

(   . (3.10e)

We can show that the conditional distribution of







 n

i i w Gi

1 2 /

1  (ˆ,ˆ ) , given the observed data, is also (0, )N_p  . Accordingly the conditional distribution of n¹^/²(ˆ^ˆ) follows N_p(0,A^¹A^¹), which is equivalent to the unconditional distribution of n¹^/²(ˆ₀) . To implement the re-sampling algorithm, we repeat (3.10e) for R times and then obtain ˆ^*

j for j1,...,R. The sample variance can be used to estimate Var( )ˆ . The proposed re-sampling procedure is much faster than the bootstrap approach since no iteration is needed in solving (3.10e) and also there is no need to deal with the estimating function of .

3.5 Model Checking

We utilize the martingale framework to construct a model checking procedure for the latency distribution. Here the model assumption refers to the chosen form of (.)h . Define the residual process:

1/ 2 1

( ; ) ˆ ( ; )

i i

V t  n^ Z M t 





^, ^(3.11a)

where

ˆ ( ; )_i departure from the imposed model.

First of all we need to show that, under the assumed model, $V(t; \hat\beta)$ converges weakly to a mean-zero Gaussian process. The argument is similar to that of Ghosh (2003). Here we sketch the proof. One can write

Notice that

By the martingale central limit theorem and the consistency of ˆ, V t( ; )ˆ ^dN_p(0, ) ,

where the covariance matrix ( ) can be estimated by (3.10d). Furthermore its asymptotic

distribution can be approximated by ˆ ( )

V t ^{1/ 2} ^*

1 0

ˆ ˆ ˆ ˆ ˆ

{ ( ; , )} ( ; ) ( ; ) ( ; )

n t

i i i

n^ Z Z u  w dM u  G V t  V t 





  ^  . (3.12)

For informal model diagnostics, we can plot the sample curve of $V(t; \hat\beta)$ along with several simulated curves of $\hat V(t)$. If the sample curve lies within the range of the simulated curves, the model assumption is reasonable. Formally, we can generate many replicates of $\hat V(t)$ and compute the value of $\sup_t |n^{-1/2}\hat V(t)|$ for the model candidates under consideration. The p-value is the empirical frequency with which the simulated values of $\sup_t |n^{-1/2}\hat V(t)|$ exceed the observed value of $\sup_t |n^{-1/2} V(t; \hat\beta)|$.
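The formal test reduces to comparing one observed supremum with many simulated ones. A minimal sketch is given below, assuming the observed and simulated processes have already been evaluated on a common time grid; the function name and the toy numbers are hypothetical, and any common scaling factor is assumed to have been applied to both inputs.

```python
import numpy as np

def sup_test_pvalue(V_obs, V_sim):
    """Empirical p-value of a supremum-type model check.

    V_obs : observed process on a time grid, shape (T,)
    V_sim : simulated processes on the same grid, shape (R, T)
    """
    obs_sup = np.max(np.abs(V_obs))
    sim_sup = np.max(np.abs(V_sim), axis=1)
    # Proportion of simulated suprema at least as large as the observed one.
    return float(np.mean(sim_sup >= obs_sup))

V_obs = np.array([0.5, -2.0])
V_sim = np.array([[1.0, 1.0], [3.0, 0.0], [-2.5, 0.0], [0.0, 0.0]])
p_value = sup_test_pvalue(V_obs, V_sim)
```

A small p-value means the observed curve is more extreme than most simulated curves, which is the formal counterpart of the observed curve falling outside the plotted band.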

3.6 Simulation analysis

3.6.1 Data generation

We first generate $Z_1$ from Bernoulli(0.5) and compute

$$Z^{*T}\gamma_0^* = (1, Z_1)(\gamma_0^{(0)}, \gamma_0^{(1)})^T = \gamma_0^{(0)} + \gamma_0^{(1)} Z_1,$$

where the values of $\gamma_0^{(0)}$ and $\gamma_0^{(1)}$ are specified. Then generate $\zeta \sim \text{Bernoulli}(p_Z)$ with

$$p_Z = \pi(\gamma_0^* \mid Z) = \frac{\exp(Z^{*T}\gamma_0^*)}{1 + \exp(Z^{*T}\gamma_0^*)}.$$

If $\zeta = 1$, we generate the latency variable $T$, which follows

$$\log T = \beta_0^{(1)} Z_1 + \beta_0^{(2)} Z_2 + \varepsilon,$$

where $\varepsilon$ follows the log-exponential distribution. If $\zeta = 0$, we set $T$ to be a very large number exceeding the support of $C$, which follows a uniform distribution. The observed variables are replications of $(X, \delta, Z)$, where $X = \min(T, C)$ and $\delta = I(T \le C)$. We consider two settings: in setting A, $Z_1 \sim \text{Ber}(0.5)$ and $Z_2 \sim \text{Ber}(0.5)$; in setting B, $Z_1 \sim \text{Ber}(0.5)$ and $Z_2 \sim \text{Unif}(0, 1)$.

3.6.2 Simulation results

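Tables 1A and 1B show the results for estimating $\gamma_0^{(0)}$, $\gamma_0^{(1)}$,

$$p_0 = \pi(\gamma_0 \mid Z_1 = 0) = \frac{\exp(\gamma_0^{(0)})}{1 + \exp(\gamma_0^{(0)})}$$

and

$$p_1 = \pi(\gamma_0 \mid Z_1 = 1) = \frac{\exp(\gamma_0^{(0)} + \gamma_0^{(1)})}{1 + \exp(\gamma_0^{(0)} + \gamma_0^{(1)})}.$$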

We calculate the average bias and standard deviation based on 1000 replications. In the two tables, the estimators perform reasonably well, and their performance improves as the sample size increases.
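The data-generating scheme of Section 3.6.1 can be sketched as follows. This is a hedged illustration rather than the code used for the tables: the function name, the unit rate of the exponential variable behind the log-exponential error, the censoring range `(0, c_max)` and the parameter values in the usage line are all assumptions. Cured subjects are given an infinite failure time, which plays the same role as a very large number exceeding the support of the censoring variable.

```python
import numpy as np

def simulate_cure_data(n, gamma0, beta0, c_max, setting="A", seed=0):
    """Simulate one data set from a mixture cure model with an AFT latency part.

    gamma0 : (intercept, slope) of the logistic incidence model in Z1
    beta0  : (coefficient of Z1, coefficient of Z2) of the latency model
    c_max  : upper limit of the uniform censoring variable (assumed lower limit 0)
    """
    rng = np.random.default_rng(seed)
    Z1 = rng.binomial(1, 0.5, size=n)
    if setting == "A":
        Z2 = rng.binomial(1, 0.5, size=n).astype(float)
    else:
        Z2 = rng.uniform(0.0, 1.0, size=n)
    # Incidence: susceptibility indicator from the logistic model.
    p = 1.0 / (1.0 + np.exp(-(gamma0[0] + gamma0[1] * Z1)))
    zeta = rng.binomial(1, p)
    # Latency: log T = beta1*Z1 + beta2*Z2 + eps, eps = log of a unit exponential.
    eps = np.log(rng.exponential(1.0, size=n))
    T = np.exp(beta0[0] * Z1 + beta0[1] * Z2 + eps)
    T = np.where(zeta == 1, T, np.inf)   # cured subjects never fail
    C = rng.uniform(0.0, c_max, size=n)
    X = np.minimum(T, C)
    delta = (T <= C).astype(int)
    return X, delta, Z1, Z2, zeta

X, delta, Z1, Z2, zeta = simulate_cure_data(
    500, gamma0=(0.5, 1.0), beta0=(1.0, -0.5), c_max=10.0, setting="B", seed=1)
```

In a simulation study the indicator `zeta` would be discarded before estimation, since it is not observed for censored subjects; it is returned here only so that the generated data can be inspected.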

Transforming $(\hat\gamma_0^{(0)}, \hat\gamma_0^{(1)})$ to the probability scale $(p_0, p_1)$, the performance of $(\hat p_0, \hat p_1)$ looks satisfactory. Comparing the two tables, which differ in the values of $p_1$, we see that the corresponding estimator becomes more variable when $p_1$ is closer to 0.5.

Our main proposal is developed for estimating $\beta_0$ in the latency model. Table 2A and Table 2B correspond to the incidence models in Table 1A and Table 1B, respectively. The proposed estimators of $\beta_0^{(1)}$ and $\beta_0^{(2)}$ perform reasonably well but sometimes produce larger bias when the sample size is small or the censoring rate is high. Another important proposal of ours is the re-sampling scheme for variance estimation. To examine its performance, we first check whether the sample average of $\hat\beta_j^*$, which solves (3.10e), is close to the true parameter value. Then we examine whether the proposed estimator $\hat\sigma(\hat\beta_j)$, which is the sample standard deviation of $\hat\beta_j^*$ $(j = 1, \ldots, R)$, is close to the simulated estimate denoted as $se(\hat\beta_j)$. The results are satisfactory. As a consequence, the coverage probability is close to the 95% nominal level in most cases. Notice that the results in Table 2B appear to be better than those in Table 2A since the former corresponds to a higher incidence rate, which provides more data to estimate the latency distribution.

Finally we examine the proposed model checking procedure. We first simulate data from an AFT model and then analyze it with an AFT model. Figures 4.1A and 4.1B show the two components of $V(t; \hat\beta)$ based on $Z_1$ and $Z_2$, respectively. The observed curves are mostly located within the 20 simulated curves, which shows that the fitted model is acceptable. Then we generate data from an AFT model and fit a location-shift model. In Figures 4.2A and 4.2B, the observed curves are located outside the simulated curves, which shows that the fitted model is not satisfactory.

Chapter 4 Literature Review for Transformation Models with Cure

4.1 Background

In this chapter we consider the second class of models, with the incidence model given by

$$\Pr(\zeta = 1 \mid Z) = \pi(\gamma_0 \mid Z) = \frac{\exp(Z^T\gamma_0)}{1 + \exp(Z^T\gamma_0)}.$$

For $\zeta = 1$, $T$ follows a transformation model of the form

$$h(T) = Z^T\beta_0 + \varepsilon,$$

where $h(\cdot)$ is an unknown monotone function but the distribution of $\varepsilon$ is completely specified.

Note that we denote the distribution, survival and cumulative hazard functions of $\varepsilon$ as $F_\varepsilon$, $S_\varepsilon$ and $\Lambda_\varepsilon$, which are fully specified. The most well-known example is the proportional hazards (PH) model, in which $\varepsilon$ follows the extreme value distribution with $F_\varepsilon(s) = 1 - \exp\{-\exp(s)\}$. When $\varepsilon$ follows the standard logistic distribution with $F_\varepsilon(s) = \exp(s)/\{1 + \exp(s)\}$, the model becomes the proportional odds (PO) model.

For the discussions in this chapter, the observed data are denoted as $\{(X_i, \delta_i, Z_i),\ i = 1, \ldots, n\}$, where $X_i = \min(T_i, C_i)$ and $\delta_i = I(T_i \le C_i)$. The parameters of interest are $(\gamma, \beta)$, while $h(t)$ is an infinite-dimensional nuisance function. In the early stage of methodology development, statisticians including Kuk and Chen (1992), Sy and Taylor (2000) and Peng and Dear (2000) focused on the special case in which the latency distribution follows the PH model. A new trend, starting from Lu and Ying (2004), considers statistical inference for the whole class of transformation models. In this chapter, we review the existing literature on transformation cure models. Roughly speaking, existing inference approaches can be classified into two types. One is based on the likelihood principle and the other is based on moment properties.

4.2 Different model expressions

We first review different formulations of a transformation model since the form of the model expression affects subsequent inference development. The most well-known representation is given by

$$h(T) = Z^T\beta_0 + \varepsilon, \tag{4.1a}$$

which states that the failure time $T$ for a susceptible subject can be written as a parametric linear model after an unknown monotone transformation. Alternatively one can also write

$$g\{S_Z(t)\} = h(t) - Z^T\beta_0, \tag{4.1b}$$

where $S_Z(t) = \Pr(T > t \mid \zeta = 1, Z)$ is the survival function of $T \mid \zeta = 1, Z$ and $F_\varepsilon(t) = 1 - g^{-1}(t)$, which is a known function. The representation in (4.1b) says that a known transformation of the survival function leads to a linear structure in the parameters which contains an unspecified intercept function. One can also write (4.1b) in terms of the cumulative hazard function, defined as $\Lambda_Z(t) = -\log\{S_Z(t)\}$, such that

$$\Lambda_Z(t) = -\log[g^{-1}\{h(t) - Z^T\beta_0\}] = H\{h(t) - Z^T\beta_0\}, \tag{4.1c}$$

where $H(t) = -\log\{g^{-1}(t)\}$ is also completely specified. Notice that the above three equivalent
