
Chapter 3 Inference without Covariates


Let X_i = T_i ∧ C_i and δ_i = I(T_i ≤ C_i) (i = 1, 2), where (T₁,T₂) are the failure times of interest and (C₁,C₂) are a pair of censoring times independent of (T₁,T₂). The observed data can be denoted as {(X₁_j,X₂_j,δ₁_j,δ₂_j) (j = 1,2,...,n)}. We assume that the joint survival function follows a copula model:

$$\Pr(T_1 \ge t_1, T_2 \ge t_2) = C_\alpha\{S_1(t_1), S_2(t_2)\} = \phi_\alpha^{-1}\{\phi_\alpha[S_1(t_1)] + \phi_\alpha[S_2(t_2)]\}, \tag{3.1}$$

where S_i(t) is the marginal survival function of T_i. The objective is to estimate α without specifying the marginal distributions.
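As a minimal sketch of how the generator representation in (3.1) turns two marginal survival probabilities into a joint survival probability, the Python snippet below composes a generator with its inverse. It assumes the Clayton generator in the form ϕ_α(s) = s^{−(α−1)} − 1 with α > 1; this parametrization is an assumption made for illustration (it agrees with the α values used later in the simulations), and the function names are hypothetical.

```python
def clayton_phi(s, alpha):
    """Assumed Clayton generator: phi_alpha(s) = s**(-(alpha - 1)) - 1, alpha > 1."""
    return s ** (-(alpha - 1.0)) - 1.0


def clayton_phi_inv(y, alpha):
    """Inverse of the assumed generator: phi_alpha^{-1}(y) = (1 + y)**(-1 / (alpha - 1))."""
    return (1.0 + y) ** (-1.0 / (alpha - 1.0))


def joint_survival(s1, s2, alpha):
    """Pr(T1 >= t1, T2 >= t2) from the marginal survival values s1 = S1(t1), s2 = S2(t2)."""
    return clayton_phi_inv(clayton_phi(s1, alpha) + clayton_phi(s2, alpha), alpha)


if __name__ == "__main__":
    # With S1(t1) = 1 the joint survival probability reduces to the other margin.
    print(joint_survival(1.0, 0.4, 3.0))
    # Positive dependence: the value lies above the independence product 0.25.
    print(joint_survival(0.5, 0.5, 3.0))
```

Only the generator pair would change for another Archimedean family; the composition in `joint_survival` stays the same.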

In this chapter, we review three semi-parametric inference approaches. In Section 3.1, we first review the concept of martingales, which Hsu and Prentice (1996) used to construct an estimating equation for α. In Section 3.2, we review the paper of Shih and Louis (1995), who proposed a two-stage estimation procedure. Specifically, they suggest estimating the marginal distributions first; these marginal estimates are then treated as pseudo-observations of {S₁(T₁),S₂(T₂)} in the likelihood based on C_α(·,·). In Section 3.3, we apply the idea of the log-rank statistic by constructing a series of two-by-two tables to estimate the association parameter. This idea has been used by Day et al. (1997) and Wang (2003) for analyzing semi-competing risks data.

3.1 Estimation based on Martingale Residuals

3.1.1 Preliminary on Counting Processes and Martingales

Based on the observed variables, one can define the counting process N(t) = I(X ≤ t, δ = 1) and the filtration F_t = σ[N(u), I(X ≤ u, δ = 0); 0 ≤ u ≤ t], which describes the history of N(t) up to and including time t.

Define the cumulative intensity process:

$$\Lambda(t) = \int_0^t I(X \ge u)\,\lambda(u)\,du, \tag{3.2}$$

where

$$\lambda(u) = \lim_{\Delta \to 0} \Pr(u \le T < u + \Delta \mid T \ge u)/\Delta$$

is the hazard function of T. Let M(t) = N(t) − Λ(t). According to the Doob-Meyer decomposition (Fleming and Harrington, 1991), M(t) is a mean-zero martingale with respect to F_t. We now briefly verify the properties of M(t):

(A) Given F_{t−}, which contains the history of the process prior to time t, the conditional expectation of dN(t) is the intensity process:

$$E[dN(t) \mid \mathcal{F}_{t-}] = I(X \ge t)\,\lambda(t)\,dt = d\Lambda(t).$$

(B) By the Doob-Meyer decomposition, the conditional expectation of dM(t) equals zero:

$$E[dM(t) \mid \mathcal{F}_{t-}] = E[dN(t) - d\Lambda(t) \mid \mathcal{F}_{t-}] = 0.$$
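The mean-zero property can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original analysis: it takes a unit-rate exponential failure time, so that λ(u) = 1 and the cumulative intensity reduces to min(X, t), and a uniform censoring time; the function names and default values are hypothetical.

```python
import random


def martingale_residual(x, delta, t):
    """M(t) = N(t) - Lambda(t) for a unit-rate exponential failure time.

    With lambda(u) = 1, Lambda(t) = integral_0^t I(X >= u) du = min(X, t).
    """
    n_t = 1.0 if (x <= t and delta == 1) else 0.0
    return n_t - min(x, t)


def mean_residual(t, n=20000, censor_upper=5.5, seed=1):
    """Monte Carlo average of M(t) over n independently censored subjects."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        failure = rng.expovariate(1.0)           # T ~ exp(1)
        censor = rng.uniform(0.0, censor_upper)  # C ~ U(0, censor_upper), assumed
        x = min(failure, censor)
        delta = 1 if failure <= censor else 0
        total += martingale_residual(x, delta, t)
    return total / n


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        print(t, mean_residual(t))  # each average should be close to zero
```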

For the bivariate case, define the martingales M_i(t) = N_i(t) − Λ_i(t) (i = 1, 2) for the two margins. When T₁ and T₂ are correlated, M₁ and M₂ are also correlated. Hsu and Prentice (1996) define a cumulative covariance function of (M₁,M₂), or covariance rate, ψ(t₁,t₂), which can be estimated non-parametrically or under a specified model.

A nonparametric estimator of ψ(dt₁,dt₂) can be constructed from the observed data.

Besides α, the model-based expression of the covariance function, ψ(t₁,t₂;α), contains nuisance parameters. If the marginal functions S_i(t) and Λ_i(dt) (i = 1, 2) can be estimated, say by the Kaplan-Meier and Aalen's estimators, respectively, one can plug these estimates into the model-based expression.

Hsu and Prentice (1996) suggested an estimating equation based on the weighted difference between the nonparametric estimator of ψ(t₁,t₂) and its model-based estimator, integrated over (t₁,t₂).

Example: the Clayton model

For the Clayton model, the model-based expression ψ(t₁,t₂;α) is a double integral that involves α together with the marginal functions S_i and Λ_i.

3.2 Two-Stage Estimation

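We can view C_α(u₁,u₂) as the joint distribution function of (U₁,U₂) = (S₁(T₁),S₂(T₂)). Shih and Louis (1995) proposed to estimate the marginal survival functions first and then plug the estimates into the likelihood function of α. There are two methods for estimating the marginal distributions. One is the parametric approach, in which the marginal distributions are specified up to some unknown parameters. By applying the maximum likelihood approach to the marginal parameters, one can estimate (S₁(X₁),S₂(X₂)) by (S̃₁(X₁),S̃₂(X₂)). The other approach does not assume the form of the marginal distributions; the Kaplan-Meier method can be applied for estimating S_i(T_i) (i = 1, 2). We illustrate this approach using the Clayton model as an example.

Example: the Clayton model

The likelihood of α is a product over the n pairs, where the contribution of each pair is determined by its censoring indicators (δ₁_j,δ₂_j) and is evaluated at the estimated margins.

If the marginal form is unknown, we estimate S_i(X_i) by the Kaplan-Meier estimator:

$$\hat S_i(t) = \prod_{u \le t} \left\{ 1 - \frac{\sum_{j=1}^n I(X_{ij} = u, \delta_{ij} = 1)}{\sum_{j=1}^n I(X_{ij} \ge u)} \right\} \quad (i = 1, 2).$$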
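The two stages can be sketched in Python as follows. This is a hedged illustration rather than the exact procedure of Shih and Louis (1995): it assumes the Clayton copula in the form C_α(u,v) = (u^{1−α} + v^{1−α} − 1)^{−1/(α−1)}, uses the standard four censoring-pattern contributions (density, two partial derivatives, and the copula itself), floors the Kaplan-Meier values away from zero, and maximizes over a simple grid. All function names, the floor, and the grid are hypothetical choices.

```python
import math


def kaplan_meier(times, events):
    """Kaplan-Meier curve as a list of (failure time, survival estimate)."""
    data = sorted(zip(times, events))
    at_risk, surv, curve, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, leaving = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= leaving
    return curve


def km_value(curve, t, floor=1e-6):
    """Step-function value at t, floored so negative powers stay finite (assumption)."""
    value = 1.0
    for time, surv in curve:
        if time > t:
            break
        value = surv
    return max(value, floor)


def clayton_loglik_term(u, v, d1, d2, alpha):
    """Log-likelihood contribution of one pair under the assumed Clayton form."""
    theta = alpha - 1.0
    a = u ** (-theta) + v ** (-theta) - 1.0
    if d1 and d2:   # both failures observed: copula density
        return (math.log(alpha) - alpha * (math.log(u) + math.log(v))
                - (1.0 / theta + 2.0) * math.log(a))
    if d1:          # only T1 observed: dC/du
        return -alpha * math.log(u) - (1.0 / theta + 1.0) * math.log(a)
    if d2:          # only T2 observed: dC/dv
        return -alpha * math.log(v) - (1.0 / theta + 1.0) * math.log(a)
    return -math.log(a) / theta  # both censored: C(u, v)


def two_stage_alpha(x1, d1, x2, d2):
    """Stage 1: Kaplan-Meier pseudo-observations. Stage 2: grid maximization."""
    c1, c2 = kaplan_meier(x1, d1), kaplan_meier(x2, d2)
    u = [km_value(c1, t) for t in x1]
    v = [km_value(c2, t) for t in x2]

    def loglik(alpha):
        return sum(clayton_loglik_term(a, b, e1, e2, alpha)
                   for a, b, e1, e2 in zip(u, v, d1, d2))

    grid = [1.0 + 0.05 * k for k in range(1, 581)]  # alpha in (1, 30]
    return max(grid, key=loglik)
```

A grid search is used only to keep the sketch dependency-free; any one-dimensional optimizer could replace it.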

3.3 Estimation based on Two-by-Two Tables

3.3.1 The Proposed Method

In this section, we propose an estimator of α for Archimedean copula (AC) models of the form:

$$\Pr(T_1 \ge t_1, T_2 \ge t_2) = C\{S_1(t_1), S_2(t_2)\} = \phi_\alpha^{-1}\{\phi_\alpha[S_1(t_1)] + \phi_\alpha[S_2(t_2)]\}. \tag{3.20}$$

This idea is an application of the papers of Day et al. (1997) and Wang (2003), who considered semi-competing risks data. In the presence of censoring, we observe {(X₁_j,X₂_j,δ₁_j,δ₂_j) (j = 1,...,n)}. The proposed method is motivated by the log-rank statistic, which can be constructed based on a series of two-by-two tables. At an observed failure point (t₁,t₂), we can construct the following two-by-two table:

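Table: Two-by-two table at time (t₁,t₂)

The cell counts are defined as follows. Let R(t₁,t₂) be the number of pairs at risk at (t₁,t₂), N₁₀(dt₁,t₂) the number of those pairs with a failure of T₁ at t₁, N₀₁(t₁,dt₂) the number with a failure of T₂ at t₂, and N₁₁(dt₁,dt₂) the number with failures at both t₁ and t₂.

Notice that the odds ratio of the table converges to a theoretical odds ratio determined by the AC model. Conditioning on the marginal counts, N₁₁(dt₁,dt₂) follows a hypergeometric distribution with mean E₁₁(dt₁,dt₂,α).

By equating the empirical count with its model-based expected value and combining the tables with different (t₁,t₂), we can construct the following estimating equation:

$$\int\!\!\int \{N_{11}(dt_1, dt_2) - E_{11}(dt_1, dt_2, \alpha)\} = 0.$$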

Example: the Clayton model

$$E_{11}(dt_1, dt_2, \alpha) = \frac{\alpha \times N_{10}(dt_1, t_2) \times N_{01}(t_1, dt_2)}{R(t_1, t_2) - N_{10}(dt_1, t_2) + \alpha N_{10}(dt_1, t_2)}. \tag{3.28}$$
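To make the Clayton expected count concrete, the following Python sketch evaluates E₁₁ and solves an unweighted version of the estimating equation for complete, untied data. The count definitions used here are assumptions made for the illustration: at t₁ = X₁_i and t₂ = X₂_j, N₁₀ = 1 when subject i is still at risk in the second margin, N₀₁ = 1 when subject j is still at risk in the first margin, and R(t₁,t₂) counts pairs with X₁ ≥ t₁ and X₂ ≥ t₂. The absence of a weight function, the bisection bounds, and all function names are likewise hypothetical choices.

```python
import math


def expected_n11(alpha, n10, n01, r):
    """Model-based expected count for the Clayton model, as in (3.28)."""
    return alpha * n10 * n01 / (r - n10 + alpha * n10)


def table_risk_sets(x1, x2):
    """Risk-set sizes R(t1, t2) of all tables with N10 = N01 = 1 (complete, untied data)."""
    n = len(x1)
    sizes = []
    for i in range(n):        # subject i fails in margin 1 at t1 = x1[i]
        for j in range(n):    # subject j fails in margin 2 at t2 = x2[j]
            # Both marginal failures must belong to the risk set at (t1, t2).
            if x2[i] >= x2[j] and x1[j] >= x1[i]:
                sizes.append(sum(1 for k in range(n)
                                 if x1[k] >= x1[i] and x2[k] >= x2[j]))
    return sizes


def estimating_function(alpha, sizes, n):
    """Sum over tables of N11 - E11; exactly n tables (i = j) have N11 = 1."""
    return n - sum(expected_n11(alpha, 1, 1, r) for r in sizes)


def solve_alpha(x1, x2, lo=1e-6, hi=1e6, iters=100):
    """Root of the estimating function by bisection on the log scale.

    The function is decreasing in alpha, so a positive value means alpha is too small.
    """
    sizes = table_risk_sets(x1, x2)
    n = len(x1)
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if estimating_function(mid, sizes, n) > 0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


if __name__ == "__main__":
    print(solve_alpha([1, 2, 3, 4], [2, 1, 4, 3]))
```

Because each term α/(R − 1 + α) is increasing in α, the estimating function is monotone and bisection suffices; with censored or tied data the counts would have to be generalized.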

3.4 Discussion

Shih and Louis (1995) proposed a two-stage estimation procedure. This approach is semi-parametric in the sense that the first stage can be estimated non-parametrically. However, for some complicated data structures, such as semi-competing risks data, nonparametric estimation in the first stage is not applicable. Hsu and Prentice (1996) constructed their estimating function based on martingale residuals. However, the model-based expression involves many high-dimensional nuisance parameters, so the resulting estimating equation becomes very complicated. The practical performance of this estimator depends heavily on the accuracy of the plug-in estimates over the whole data range. In our simulations, we have found that estimation in the tail region is not satisfactory.

The latter two approaches use only moment conditions. The approach based on two-by-two tables seems to be a more natural way of describing the dependence structure for AC models. Specifically, for the Clayton model, E₁₁(dt₁,dt₂,α) does not contain any nuisance parameters. The key is that the theoretical odds ratio of the tables captures the association information for AC models well. In contrast, ψ(t₁,t₂;α) is much less natural. It involves high-dimensional nuisance parameters that affect the subsequent inference procedure.

Chapter 4 Inference with Covariates

In this chapter, we discuss the estimation of α when there exist covariates that may affect the marginal distributions. Let (T₁,T₂) and (C₁,C₂) be the failure times and censoring times, respectively, and let Z_i be the covariate associated with T_i (i = 1, 2). In the following analysis, we assume that marginally the covariate effect follows the Cox proportional hazards model, such that

$$\lambda_i(t \mid Z_i) = \lambda_{i,0}(t) \exp(Z_i'\beta_i) \quad (i = 1, 2),$$

where λ_{i,0}(t) is the baseline hazard function. In Section 4.1, we modify the two-stage estimation approach. In Section 4.2, we apply our idea based on two-by-two tables to handle this more general situation.

4.1 Two-Stage Estimation

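The first stage involves estimating pseudo-observations of (U₁,U₂). Note that, under the Cox PH model, U_i = S_i(t_i | Z_i) = S_{i,0}(t_i)^{exp(Z_i'β_i)} (i = 1, 2). This implies that we need to estimate β_i and S_{i,0}(t_i) first. The regression parameter β_i can be estimated by maximizing the following partial likelihood:

$$L(\beta_i) = \prod_{j=1}^n \left\{ \frac{\exp(Z_{ij}'\beta_i)}{\sum_{k=1}^n I(X_{ik} \ge X_{ij}) \exp(Z_{ik}'\beta_i)} \right\}^{\delta_{ij}}.$$

The estimator of the baseline function can be expressed by Breslow's estimator:

$$\hat\Lambda_{i,0}(t) = \sum_{j=1}^n \frac{\delta_{ij}\, I(X_{ij} \le t)}{\sum_{k=1}^n I(X_{ik} \ge X_{ij}) \exp(Z_{ik}'\hat\beta_i)}, \qquad \hat S_{i,0}(t) = \exp\{-\hat\Lambda_{i,0}(t)\}.$$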
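A minimal Python sketch of the pseudo-observation step is given below. It assumes a scalar covariate and takes the regression coefficient as given (in practice it would come from maximizing the partial likelihood, which is not implemented here); the function names are hypothetical.

```python
import math


def breslow_cumhaz(times, events, z, beta):
    """Return a function t -> Breslow estimate of the baseline cumulative hazard.

    times, events, z describe one margin; beta is an already estimated coefficient.
    """
    risk = [math.exp(beta * zk) for zk in z]

    def lam0(t):
        total = 0.0
        for xj, dj in zip(times, events):
            if dj == 1 and xj <= t:
                # Sum of exp(z * beta) over subjects still at risk at time xj.
                denom = sum(r for xk, r in zip(times, risk) if xk >= xj)
                total += 1.0 / denom
        return total

    return lam0


def pseudo_observation(t, zval, lam0, beta):
    """U = S0(t) ** exp(z * beta), with S0(t) = exp(-Lambda0(t))."""
    return math.exp(-lam0(t) * math.exp(beta * zval))


if __name__ == "__main__":
    lam0 = breslow_cumhaz([1, 2, 3], [1, 1, 1], [0, 1, 0], beta=0.8)
    print([pseudo_observation(t, z, lam0, 0.8) for t, z in [(1, 0), (2, 1), (3, 0)]])
```

With beta = 0 the estimator reduces to the Nelson-Aalen form, which gives a convenient sanity check.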

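In the second stage, the pseudo-observations are plugged into the likelihood of α based on C_α(·,·), as in Section 3.2. For Clayton's model, this likelihood has an explicit form.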

4.2 Estimation based on Two-by-Two Tables

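Based on the pseudo-observations {(U₁_j,U₂_j,δ₁_j,δ₂_j) (j = 1,...,n)}, we can construct the following two-by-two table:

Table: Two-by-two table based on pseudo-observations

The cell counts are defined as in Section 3.3.1, with the observed times replaced by the pseudo-observations. Accordingly, the estimating function is again obtained by equating N₁₁ with its model-based expected value and combining the tables.

For the Clayton model, the expected count takes the same form as in (3.28).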

Chapter 5 Simulations

5.1 Data Generation

Via simulations, we examine the finite-sample performance of several estimators of α without and with covariates. We generate (T₁,T₂) from the Clayton model, using the data generation algorithm for the Clayton model proposed by Prentice and Cai (1992).

5.1.1 Data Generation without Covariates

Step (i) Specify the value of τ and compute the corresponding value of α.

Finally, with {(T₁_j,T₂_j,C₁_j,C₂_j) (j = 1,...,n)}, we can create the observed data {(X₁_j,X₂_j,δ₁_j,δ₂_j) (j = 1,...,n)} such that X_ij = T_ij ∧ C_ij and δ_ij = I(T_ij ≤ C_ij).
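The generation of censored Clayton data can be sketched in Python as below. Several points are assumptions rather than statements about the original algorithm: the relation τ = (α − 1)/(α + 1) between Kendall's τ and α (which reproduces the α values listed in the tables), unit-rate exponential margins, uniform censoring on (0, censor_upper), and the use of standard conditional inversion of the copula, which need not coincide step by step with the algorithm of Prentice and Cai (1992). Function names are hypothetical.

```python
import math
import random


def tau_to_alpha(tau):
    """Assumed Clayton relation tau = (alpha - 1) / (alpha + 1)."""
    return (1.0 + tau) / (1.0 - tau)


def generate_pair(alpha, rng):
    """One (T1, T2) with exp(1) margins, via conditional inversion of the copula."""
    theta = alpha - 1.0
    u1 = 1.0 - rng.random()   # uniform on (0, 1], plays the role of S1(T1)
    w = 1.0 - rng.random()    # uniform on (0, 1], conditional probability level
    u2 = (u1 ** (-theta) * (w ** (-theta / (1.0 + theta)) - 1.0) + 1.0) ** (-1.0 / theta)
    return -math.log(u1), -math.log(u2)


def generate_data(n, tau, censor_upper=None, seed=0):
    """Rows (X1, X2, delta1, delta2); no censoring when censor_upper is None."""
    rng = random.Random(seed)
    alpha = tau_to_alpha(tau)
    rows = []
    for _ in range(n):
        t1, t2 = generate_pair(alpha, rng)
        if censor_upper is None:
            c1 = c2 = math.inf
        else:
            c1 = rng.uniform(0.0, censor_upper)
            c2 = rng.uniform(0.0, censor_upper)
        rows.append((min(t1, c1), min(t2, c2), int(t1 <= c1), int(t2 <= c2)))
    return rows


def kendall_tau(x, y):
    """Sample Kendall's tau by direct pair counting (O(n^2))."""
    n, s = len(x), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


if __name__ == "__main__":
    rows = generate_data(500, 0.5, seed=2)
    print(kendall_tau([r[0] for r in rows], [r[1] for r in rows]))
```

Comparing the sample Kendall's τ of uncensored draws with the target τ is a quick check that the generator and the τ-to-α conversion agree.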

5.1.2 Data Generation with Covariates

Now we generate data that include a binary covariate. We assume that the marginal effect follows the Cox (1972) proportional hazards model:

$$S_i(t \mid Z_i) = S_{i,0}(t)^{\exp(Z_i'\beta_i)}, \tag{5.2}$$

where Z_i is the covariate and S_{i,0}(t) is the baseline survival function at time t (i = 1, 2). The general procedure can be stated as follows. Let S(T) = U, where U ~ U(0,1). Under the Cox proportional hazards model, we have S₀(T)^{exp(Z'β)} = U. Hence it follows that

$$\log(S_0(T)) = \frac{\log(U)}{\exp(Z'\beta)}, \tag{5.3}$$

which implies that

$$T = S_0^{-1}\left[ \exp\left( \frac{\log(U)}{\exp(Z'\beta)} \right) \right]. \tag{5.4}$$

We still need to specify the form of S₀(t). For most distributions, the inverse of S₀(t) does not have an explicit expression, which increases the numerical difficulty of the analysis. In our simulations, we specify the baseline survival function to be S₀(t) = exp(−t) and obtain the following explicit expression:

$$T = -\frac{\log(U)}{\exp(Z'\beta)}, \tag{5.5}$$

where −log(U) follows exp(1).
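The inverse-transform formula (5.5) can be sketched in Python as follows, for a single margin with a scalar binary covariate. The function names, the seed, and the restriction to a scalar covariate are assumptions of this illustration; for a vector covariate, `z * beta` would be replaced by the inner product Z'β.

```python
import math
import random


def failure_time(z, beta, rng):
    """Draw T from (5.5): T = -log(U) / exp(z * beta), with baseline S0(t) = exp(-t)."""
    u = 1.0 - rng.random()   # uniform on (0, 1], avoids log(0)
    return -math.log(u) / math.exp(z * beta)


def simulate_margin(n, beta, seed=0):
    """Return n pairs (Z, T) with Z ~ Bernoulli(0.5) and T from the Cox model."""
    rng = random.Random(seed)
    sample = []
    for _ in range(n):
        z = 1 if rng.random() < 0.5 else 0
        sample.append((z, failure_time(z, beta, rng)))
    return sample


if __name__ == "__main__":
    sample = simulate_margin(40000, 0.8, seed=3)
    for group in (0, 1):
        times = [t for z, t in sample if z == group]
        # Under the model, T given Z = z is exponential with mean exp(-z * beta).
        print(group, sum(times) / len(times))
```

Since T given Z = z is exponential with rate exp(zβ), the group means should be close to 1 for Z = 0 and to exp(−β) for Z = 1.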

The data generation procedure is summarized below.

Step (i) Generate Z_ij from Bernoulli(0.5) for i = 1, 2.

Step (ii) Generate failure times (T₁^*_j,T₂^*_j) for the baseline group with Z_ij = 0 using the above algorithm, where T_ij^* ~ exp(1) (i = 1, 2).

Step (iii) Given the value of β_i, for those with Z_ij = 1, we set T_ij = T_ij^* / exp(Z_ij'β_i) (i = 1, 2).

Step (iv) Generate {(C₁_j,C₂_j) (j = 1,2,...,n)}, both of which are exponentially distributed.

Finally, with {(T₁_j,T₂_j,C₁_j,C₂_j,Z₁_j,Z₂_j) (j = 1,...,n)}, we can create the observed data {(X₁_j,X₂_j,δ₁_j,δ₂_j,Z₁_j,Z₂_j) (j = 1,2,...,n)} such that X_ij = T_ij ∧ C_ij and δ_ij = I(T_ij ≤ C_ij).

5.2 Simulation Results

5.2.1 Results without Covariates

In this section, we evaluate two approaches, based on two-stage estimation and on the construction of two-by-two tables. Two sample sizes, n = 100 and n = 500, are considered. The parameter α ranges from 1.22222 to 19, which corresponds to τ from 0.1 to 0.9. For each estimator, the average bias and standard deviation of α̂ are reported based on 500 replications. To achieve the targeted censoring rates of 30 % and 60 %, we let C_i follow U(0,5.5) and U(0,2.5), respectively, for i = 1, 2.

Table 1.1 summarizes the results in the absence of censoring. Note that for the two-stage approach, we also present the results when the first stage of estimation is performed parametrically. Recall that we assume the marginal distribution of each failure time is exp(λ = 1). Hence, for the parametric two-stage procedure, we use

$$\hat\lambda = \frac{\sum_{j=1}^n \delta_j}{\sum_{j=1}^n x_j} \quad \left(\text{in complete data, } \hat\lambda = 1/\bar{x}\right) \tag{5.6}$$

in the second stage for estimating α. This approach yields better results (smaller bias and smaller variation) than the semi-parametric two-stage procedure. The approach based on two-by-two tables produces fairly good results even though it makes no assumption on the marginal distributions. Specifically, it is nearly unbiased and its variation is only slightly larger than that of the parametric two-stage procedure. Note that the variation of all the estimators becomes larger as α increases.

Tables 1.2 and 1.3 give the results in the presence of external censoring. Although the variation of the estimators is larger than without external censoring, the estimators still perform well. In Tables 1.1 to 1.3, the variation decreases markedly as the sample size increases from 100 to 500, which is in line with the consistency of all the estimators.

5.2.2 The Method Proposed by Hsu & Prentice

In this section, we examine the performance of the estimator proposed by Hsu and Prentice (1996) under the same settings. The results based on complete data are summarized below.

Table A: Original version of Hsu and Prentice's estimator with no censoring (n = 500). Entries are bias × 10⁻² (st. error × 10⁻²).

α         bias (st. error)
1.22222   -2.283 (5.861)
1.5       -6.701 (7.396)
1.85714   -15.267 (8.659)
2.33333   -31.961 (8.996)
3.0       -66.637 (7.981)
4.0       -138.223 (5.786)
5.66667   -284.485 (3.156)
9.0       -606.891 (1.162)
19.0      -1602.436 (0.860)

Strangely, Table A suggests that α̂ is not consistent when α ≥ 2.33333. To investigate what caused the problem, we checked several things; the details are summarized in the Appendix. We suspect that the problem can be attributed to the instability of the estimate of ψ(t₁,t₂;α) in (3.10) in some regions of (t₁,t₂). The plugged-in marginal estimators are usually unstable in the tail area. Hence we trimmed the integration area from [0,∞)×[0,∞) to a bounded region. Because this modification has no theoretical justification, we only evaluate the case without censoring.

Table B below contains the results for the modified estimator, which is computed over a bounded region. Specifically, for each margin, we trim 50 % of the tail region. Table B shows that the bias of α̂ improves substantially. As the sample size increases from 100 to 500, the standard deviation of α̂ also decreases. Despite the improvement, the method still does not perform as well as the previous two approaches. One possible reason is that the model-based expectation ψ(t₁,t₂;α) in (3.10) contains more nuisance parameters than the other two approaches.

Table B: Modified version of Hsu and Prentice's estimator with no censoring. Entries are bias × 10⁻² (st. error × 10⁻²).

α         n = 500               n = 100
1.22222   0.229 (10.741)        4.398 (24.211)
1.5       -0.162 (13.234)       3.805 (31.675)
1.85714   -0.145 (16.388)       3.804 (39.416)
2.33333   -0.489 (20.883)       4.530 (49.133)
3.0       -0.161 (28.358)       5.530 (64.276)
4.0       0.922 (39.522)        9.705 (89.874)
5.66667   2.931 (63.757)        12.754 (149.819)
9.0       13.172 (147.715)      43.050 (351.349)
19.0      104.310 (678.372)     322.146 (1652.888)

5.2.3 Results with Covariates

We have proposed to extend two inference approaches to the more complex situation in which covariates affect the marginal distributions. We now check the validity of the extension by simulations. Here we assume the Cox proportional hazards model to describe marginal heterogeneity. For the two-stage estimation approach, we only report the results when the marginal distributions are estimated non-parametrically. The parameter α ranges from 1.22222 to 19 and β₁ = β₂ = 0.8. We also evaluate the situation in the presence of censoring, with 30 % and 60 % censoring. To achieve the targeted censoring rates, we let C_i follow exp(λ = 3.5) and exp(λ = 1.5), respectively (i = 1, 2). Two sample sizes, n = 100 and n = 500, are evaluated. For each estimator, the average bias and standard deviation are reported based on 500 replications.

Table 2.1 summarizes the results with covariates in the absence of censoring. Our focus is on comparing the two methods after adjustment for the effects of β₁ and β₂. The variation of the two approaches is similar. However, the two-by-two table approach seems to produce less biased estimates. Tables 2.2 and 2.3 give the results in the presence of right censoring. The estimators of α have larger variation but still perform well. All of the estimators appear consistent as the sample size increases. In simulations not reported here, we find that the estimators of α become invalid if the marginal heterogeneity is ignored.

Chapter 6 Conclusion

In this thesis, we review three inference approaches for estimating the association parameter of copula models. The existing methods were originally developed for analyzing homogeneous data. Here we extend these methods to account for marginal heterogeneity explained by covariates.

The two-stage estimation procedure proposed by Shih and Louis (1995) is easy to implement but is not applicable under more complicated data structures, such as semi-competing risks data that involve dependent censoring. The proposed approach based on two-by-two tables is motivated by the log-rank statistic. In comparison, it is a simple procedure in terms of both analytic derivation and computation. It also performs well in simulations. Since this approach only utilizes some moment conditions, it can be easily modified for different data structures. The estimator of Hsu and Prentice (1996) performs poorly in our simulations. If our numerical algorithm is correct, the poor performance may be caused by the plugged-in estimators of the nuisance functions.

The proposed method and the method of Hsu and Prentice (1996) are both moment-based procedures, but their performances are very different. We have found that, for AC models, the odds ratio of the two-by-two table provides a better descriptive measure of the association. In contrast, the covariance function of martingale residuals proposed by Hsu and Prentice is much less natural, which is why it produces an estimating function that involves many nuisance parameters.

Table 1.1 Comparison of two approaches without external censoring. Entries are bias × 10⁻² (st. error × 10⁻²).

n = 500
α         Two-Stage (parametric)   Two-Stage (semi-parametric)   Two-by-Two Table
1.22222   -0.213 (8.114)           0.669 (8.382)                 -0.026 (8.275)
1.5       -0.191 (10.047)          1.179 (10.531)                0.065 (10.536)
1.85714   -0.252 (12.514)          1.192 (13.326)                0.031 (13.401)
2.33333   -0.393 (15.841)          0.844 (17.039)                -0.042 (17.170)
3.0       -0.578 (20.526)          0.018 (22.080)                -0.208 (22.244)
4.0       -0.882 (27.588)          -1.561 (29.449)               -0.306 (29.807)
5.66667   -1.438 (39.439)          -5.398 (41.709)               -0.683 (42.362)
9.0       -2.532 (63.241)          -16.53 (65.670)               -1.721 (66.599)
19.0      -5.370 (134.733)         -72.600 (138.037)             -2.178 (142.870)

n = 100
α         Two-Stage (parametric)   Two-Stage (semi-parametric)   Two-by-Two Table
1.22222   0.968 (14.767)           3.8 (16.271)                  0.383 (15.232)
1.5       1.464 (19.040)           6.054 (20.966)                0.28 (20.749)
1.85714   2.138 (23.801)           7.592 (26.739)                -0.074 (27.063)
2.33333   2.793 (30.204)           7.993 (33.793)                -0.028 (34.737)
3.0       3.637 (39.343)           7.128 (43.565)                -1.11 (45.182)
4.0       4.775 (53.258)           4.507 (57.379)                -2.252 (60.220)
5.66667   6.37 (76.751)            -2.423 (81.810)               -0.963 (85.493)
9.0       9.213 (123.879)          -29.453 (128.361)             0.159 (136.928)
19.0      16.84 (264.620)          -173.33 (271.930)             9.597 (303.381)

Table 1.2 Comparison of two approaches with censoring rate 0.3. Entries are bias × 10⁻² (st. error × 10⁻²).

n = 500
α         Two-Stage (parametric)   Two-Stage (semi-parametric)   Two-by-Two Table
1.22222   0.073 (9.088)            0.566 (9.182)                 0.142 (9.068)
1.5       0.234 (11.437)           1.024 (11.641)                0.34 (11.635)
1.85714   0.19 (14.198)            0.965 (14.721)                0.334 (14.909)
2.33333   0.024 (17.916)           0.395 (18.709)                0.244 (19.105)
3.0       -0.314 (22.888)          -0.743 (24.042)               0.255 (24.793)
4.0       -1.224 (30.484)          -3.377 (31.640)               0.078 (33.127)
5.66667   -2.872 (43.457)          -9.567 (44.718)               0.114 (47.279)
9.0       -9.223 (68.992)          -31.33 (69.585)               0.634 (74.918)
19.0      -70.26 (158.490)         -189.043 (152.155)            3.405 (161.384)

n = 100
α         Two-Stage (parametric)   Two-Stage (semi-parametric)   Two-by-Two Table
1.22222   2.061 (16.984)           4.258 (18.502)                1.702 (17.562)
1.5       2.747 (22.384)           5.737 (23.998)                1.252 (23.918)
1.85714   3.487 (27.112)           6.941 (29.197)                1.423 (30.365)
2.33333   4.139 (34.322)           7.02 (36.550)                 1.254 (39.061)
3.0       4.853 (44.807)           5.003 (47.158)                0.051 (51.647)
4.0       3.853 (60.379)           -2.213 (61.451)               0.297 (69.924)
5.66667   0.406 (86.207)           -21.197 (83.557)              5.469 (99.999)
9.0       -21.92 (141.180)         -88.543 (128.043)             16.75 (165.275)
19.0      -212.22 (348.587)        -481.44 (260.105)             33.59 (369.187)

Table 1.3 Comparison of two approaches with censoring rate 0.6. Entries are bias × 10⁻² (st. error × 10⁻²).

n = 500
α         Two-Stage (parametric)   Two-Stage (semi-parametric)   Two-by-Two Table
1.22222   0.44 (10.714)            0.584 (10.759)                0.53 (10.736)
1.5       0.323 (13.202)           0.703 (13.357)                0.442 (13.401)
1.85714   0.317 (16.350)           0.685 (16.582)                0.409 (16.822)
2.33333   0.199 (20.272)           0.409 (20.862)                0.416 (21.371)
3.0       0.073 (26.114)           -0.205 (27.040)               0.577 (28.097)
4.0       -0.299 (34.732)          -1.645 (36.026)               0.928 (37.965)
5.66667   -1.692 (49.035)          -5.835 (50.423)               1.059 (53.934)
9.0       -7.103 (78.580)          -19.496 (79.399)              2.336 (86.260)
19.0      -54.14 (175.675)         -124.01 (168.489)             6.457 (184.836)

n = 100
α         Two-Stage (parametric)   Two-Stage (semi-parametric)   Two-by-Two Table
1.22222   4.018 (21.149)           5.067 (22.596)                0.846 (21.655)
1.5       4.091 (27.924)           6.032 (29.777)                0.639 (29.280)
1.85714   5.072 (34.088)           7.577 (36.085)                1.563 (36.934)
2.33333   6.378 (42.449)           8.277 (43.924)                0.798 (46.024)
3.0       7.54 (54.756)            7.428 (55.933)                0.861 (60.081)
4.0       7.942 (72.487)           2.874 (73.717)                2.214 (83.575)
5.66667   6.253 (104.281)          -10.142 (103.610)             7.152 (119.437)
9.0       -6.881 (166.990)         -53.666 (159.488)             25.38 (196.765)
19.0      -165.24 (373.784)        -377.02 (315.480)             32.51 (453.817)

Table 2.1 Comparison of two approaches under marginal heterogeneity without external censoring (β₁ = β₂ = 0.8). Entries are bias × 10⁻² (st. error × 10⁻²).

n = 500
α         Two-Stage α̂           Two-by-Two α̂          β̂₁                β̂₂
1.22222   -0.501 (6.391)        0.059 (6.288)         -0.385 (9.188)    -0.188 (9.640)
1.5       -1.983 (7.925)        -1.230 (7.934)        -0.257 (9.150)    -0.345 (9.759)
1.85714   -3.055 (9.437)        -1.924 (9.747)        0.729 (9.110)     0.194 (9.501)
2.33333   -4.229 (11.624)       -2.016 (11.763)       -0.265 (8.924)    0.400 (9.970)
3.0       -5.615 (15.486)       -2.360 (15.760)       0.626 (9.976)     0.494 (9.766)
4.0       -7.863 (20.590)       -2.752 (21.118)       1.845 (9.404)     1.580 (9.646)
5.66667   -14.769 (26.894)      -5.651 (27.678)       -0.111 (9.259)    0.027 (9.180)
9.0       -31.589 (42.410)      -11.676 (43.023)      0.402 (9.888)     0.364 (9.788)
19.0      -114.648 (89.175)     -41.539 (93.940)      0.494 (9.542)     0.473 (9.435)

n = 100
α         Two-Stage α̂           Two-by-Two α̂          β̂₁                β̂₂
1.22222   -0.694 (14.900)       -0.396 (14.132)       0.022 (21.557)    -0.758 (21.390)
1.5       -3.612 (17.381)       -2.002 (17.066)       -0.526 (20.297)   1.460 (21.300)
1.85714   -7.404 (21.619)       -5.029 (21.922)       1.015 (21.017)    0.766 (22.258)
2.33333   -11.885 (25.475)      -6.569 (26.423)       1.496 (21.028)    1.476 (21.540)
3.0       -15.534 (34.260)      -6.149 (36.794)       1.836 (21.810)    2.830 (22.377)
4.0       -25.774 (47.262)      -10.616 (50.432)      1.717 (22.160)    2.244 (23.109)
5.66667   -42.934 (59.398)      -14.262 (66.131)      -0.876 (21.050)   -0.829 (21.554)
9.0       -98.841 (99.761)      -34.205 (105.528)     0.519 (21.985)    0.929 (21.968)
19.0      -341.087 (219.164)    -127.069 (245.475)    0.420 (23.016)    0.205 (22.920)

Table 2.2 Comparison of two approaches under marginal heterogeneity with censoring rate 0.3 (β₁ = β₂ = 0.8). Entries are bias × 10⁻² (st. error × 10⁻²).

n = 500
α         Two-Stage α̂           Two-by-Two α̂          β̂₁                β̂₂
1.22222   -0.267 (7.589)        -0.250 (8.413)        -0.345 (10.011)   0.080 (10.170)
1.5       -1.858 (9.418)        -1.567 (10.379)       0.083 (9.424)     -0.124 (10.435)
1.85714   -1.963 (10.775)       -1.007 (11.722)       0.337 (9.852)     1.079 (10.411)
2.33333   -3.272 (14.306)       -0.576 (16.869)       0.232 (10.123)    0.704 (10.714)
3.0       -6.165 (17.556)       -1.408 (20.790)       0.327 (10.057)    0.704 (10.849)
4.0       -7.834 (23.841)       0.285 (27.205)        -0.431 (11.057)   -0.273 (10.434)
5.66667   -17.583 (31.227)      -1.378 (36.732)       0.574 (10.675)    0.352 (10.524)
9.0       -109.124 (55.087)     -58.938 (66.582)      0.586 (10.134)    0.349 (10.517)
19.0      -292.511 (127.019)    -97.013 (136.070)     0.397 (11.115)    0.562 (11.293)

n = 100
α         Two-Stage α̂           Two-by-Two α̂          β̂₁                β̂₂
1.22222   0.638 (16.577)        1.978 (18.085)        -0.614 (22.271)   0.593 (23.457)
1.5       -2.140 (21.365)       -1.309 (23.301)       -0.466 (22.783)   1.363 (25.114)
1.85714   -4.958 (23.924)       -0.703 (28.945)       0.933 (22.150)    0.658 (23.185)
2.33333   -10.582 (31.164)      -3.821 (36.515)       -0.872 (23.256)   -0.840 (24.870)
3.0       -21.709 (37.096)      -10.014 (45.931)      1.367 (24.254)    1.187 (23.103)
4.0       -24.520 (52.442)      3.311 (69.246)        1.304 (24.693)    1.672 (24.308)
5.66667   -57.033 (69.257)      -2.362 (98.199)       1.592 (23.854)    1.033 (23.943)
9.0       -259.518 (111.386)    -135.602 (152.304)    2.719 (23.804)    0.992 (24.344)
19.0      -668.705 (210.578)    -183.206 (335.635)    2.082 (24.015)    2.155 (24.233)

Table 2.3 Comparison of two approaches under marginal heterogeneity with censoring rate 0.6 (β₁ = β₂ = 0.8). Entries are bias × 10⁻² (st. error × 10⁻²).

n = 500
α         Two-Stage α̂           Two-by-Two α̂          β̂₁                β̂₂
1.22222   -0.022 (9.175)        0.647 (11.317)        0.480 (11.516)    0.484 (12.338)
1.5       -0.700 (11.438)       0.997 (14.396)        -0.311 (11.200)   0.597 (12.072)
1.85714   -1.103 (13.657)       0.187 (16.934)        -0.092 (11.824)   0.619 (11.884)
2.33333   -2.726 (18.460)       0.039 (23.483)        -0.025 (11.456)   0.230 (12.124)
3.0       -5.260 (21.654)       -1.091 (27.524)       -0.243 (11.857)   0.166 (11.525)
4.0       -13.133 (28.383)      -5.477 (37.248)       0.005 (12.170)    0.090 (11.847)
5.66667   -22.755 (42.830)      -6.343 (56.938)       -0.314 (11.934)   1.115 (13.131)
9.0       -70.929 (68.613)      -24.824 (92.450)      -0.423 (12.367)   0.509 (11.945)
19.0      -395.364 (166.251)    -185.900 (228.931)    0.254 (12.735)    0.040 (12.080)

n = 100
α         Two-Stage α̂           Two-by-Two α̂          β̂₁                β̂₂
1.22222   0.535 (19.467)        5.426 (27.392)        0.403 (27.579)    1.314 (28.132)
1.5       0.269 (27.556)        2.762 (34.610)        -0.788 (25.720)   0.609 (27.642)
1.85714   -2.658 (29.413)       1.710 (39.691)        1.143 (26.587)    4.003 (29.461)
2.33333   -4.620 (40.159)       5.159 (57.367)        -0.679 (27.484)   -0.800 (28.109)
3.0       -14.801 (51.198)      -1.219 (74.896)       1.004 (28.668)    2.123 (27.966)
4.0       -34.456 (62.953)      2.777 (97.260)        1.595 (27.887)    -0.243 (26.747)
5.66667   -70.193 (94.419)      -13.234 (133.771)     0.490 (27.494)    0.838 (26.873)
9.0       -196.595 (142.673)    -53.852 (234.286)     0.169 (29.724)    -1.050 (26.876)
19.0      -840.543 (281.685)    -326.036 (561.663)    2.275 (29.854)    0.901 (28.127)

References

Clayton, D. G. (1978). A Model for Association in Bivariate Life Tables and Its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence. Biometrika, 65, 141-151.

Fleming, T. R. and Harrington, D. P. (1991). Counting Processes and Survival Analysis. Wiley.

Genest, C. and Mackay, J. (1986). The Joy of Copulas: Bivariate Distributions with Uniform Marginals. The American Statistician, 40, No. 4.

Genest, C. and Rivest, L. P. (1993). Statistical Inference Procedures for Bivariate Archimedean Copulas. Journal of the American Statistical Association, 88, No. 423.

Hsu, L. and Prentice, R. L. (1996). On Assessing the Strength of Dependency Between Failure Time Variates. Biometrika, 83, 491-506.

Hsu, L. and Zhao, L. P. (1996). Assessing Familial Aggregation of Age at Onset, by Using Estimating Equations, with Application to Breast Cancer. Am. J. Hum. Genet., 58, 1057-1071.

Nelsen, R. B. (1997). Dependence and Order in Families of Archimedean Copulas. Journal of Multivariate Analysis, 60, 111-122.

Oakes, D. (1989). Bivariate Survival Models Induced by Frailties. Journal of the American Statistical Association, 84, 487-493.

Prentice, R. L. and Cai, J. (1992). Covariance and Survival Function Estimation Using Censored Multivariate Failure Time Data. Biometrika, 79, 495-512.

Shih, J. H. and Louis, T. A. (1995). Inference on the Association Parameter in Copula Models for Bivariate Survival Data. Biometrics, 51, 1384-1399.

Wang, W. (2003). Estimating the Association Parameter for Copula Models under Dependent Censoring. Journal of the Royal Statistical Society: Series B, 65, 257-273.

Wei, L. J., Lin, D. Y. and Weissfeld, L. (1989). Regression Analysis of Multivariate Incomplete Failure Time Data by Modeling Marginal Distributions. Journal of the American Statistical Association, 84, 1065-1073.

Appendix: Checking the Validity of the Method by Hsu and Prentice

Investigation #1: Is the distribution of α̂ reasonable?

Figure A.1

Finding: There seems to be a bound on α̂.

Investigation #2: Is the above problem caused by the root-finding procedure?

Figure A.2 Plot for α = 9

Finding: The estimating equation has a unique but wrong solution in some situations.

Investigation #3: Are the plug-in estimators for the nuisance functions inaccurate?

Figure A.3 The marginal survival function and its estimator

Figure A.4 The cumulative hazard function and its estimator

Finding: The plugged-in estimators have reasonable performance only in some regions.
