
Excerpt from the document 二維存活資料之模式檢驗 (Model Checking for Bivariate Survival Data)

Chapter 5: Data Generation Algorithms

5.2 Conditional Distribution Approach


5.2.1 Theoretical Background

The idea was proposed by Lee (1993). Given the marginal distribution of $X_1$, if the conditional distribution of $X_2 \mid X_1$ is specified, then $X_2$ can be generated. In general, $X_k$ can be generated provided that the form of $X_k \mid X_1, X_2, \ldots, X_{k-1}$ is specified. The algorithm can be performed successively for $k = 2, \ldots, p$.

Now we apply the above idea to the family of Archimedean copula constructions of the form

$$F(x_1, x_2, \ldots, x_p) = C\big(F_1(x_1), F_2(x_2), \ldots, F_p(x_p)\big) = \phi^{-1}\big\{\phi(F_1(x_1)) + \cdots + \phi(F_p(x_p))\big\}.$$

The joint distribution function is given by

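[Displayed formula and step 1 of the algorithm not recoverable from the source.]

2. Let $X_1 = F_1^{-1}(U_1)$.

[Remaining steps and the resulting equation, which involves nested log and exp terms, not recoverable from the source.]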

The above equation does not admit an explicit solution, so it must be solved numerically.
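As an illustration of the numerical step, the sketch below applies the conditional distribution approach to a bivariate Gumbel copula. It is a minimal sketch under assumptions not taken from the source: the parameterisation $C(u, v) = \exp\{-[(-\log u)^{\theta} + (-\log v)^{\theta}]^{1/\theta}\}$ with $\theta \ge 1$, unit exponential margins, and the helper names `gumbel_conditional_cdf`, `solve_v` and `draw_pair` are all illustrative. The conditional distribution function of $V$ given $U = u$ is $\partial C(u, v)/\partial u$, which is increasing in $v$, so bisection is used to solve $\partial C(u, v)/\partial u = w$.

```python
import math
import random


def gumbel_conditional_cdf(v, u, theta):
    """Pr(V <= v | U = u) = dC(u, v)/du for the Gumbel copula (assumed form)."""
    lu, lv = -math.log(u), -math.log(v)
    s = lu ** theta + lv ** theta
    c = math.exp(-s ** (1.0 / theta))  # copula value C(u, v)
    return c * s ** (1.0 / theta - 1.0) * lu ** (theta - 1.0) / u


def solve_v(u, w, theta, tol=1e-10):
    """Solve gumbel_conditional_cdf(v, u, theta) = w for v by bisection."""
    lo, hi = 1e-12, 1.0 - 1e-12
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gumbel_conditional_cdf(mid, u, theta) < w:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def draw_pair(theta, rng=random):
    """Draw (X, Y) with unit exponential margins (an assumption of this sketch)."""
    eps = 1e-12
    u = rng.uniform(eps, 1.0 - eps)  # U1, gives the first coordinate
    w = rng.uniform(eps, 1.0 - eps)  # U2, the target conditional probability
    v = solve_v(u, w, theta)
    # Inverse distribution function of the unit exponential: -log(1 - p).
    return -math.log(1.0 - u), -math.log(1.0 - v)
```

Every replicate requires one root search, which is the cost the later comparison refers to. For $\theta = 1$ the conditional distribution function reduces to $v$, so `solve_v` simply returns $w$.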

5.3: The Proposed Data Generation Method

The idea is based on a theorem in Genest & Rivest (1993). Briefly speaking, for $(X, Y)$ which follow an AC model, we can define two random variables $(U, V)$ where [...] $(X, Y)$ which follows an AC model. The algorithm can be stated as follows.

Hence we have

5.4: Comparisons of the Three Approaches

For the frailty approach, to generate a random replication of $(X, Y)$ we need to generate γ and a pair of uniform random variables. For the latter two approaches, we only need to generate a pair of uniform random variables. Hence the frailty approach requires generating at least 50% more random numbers (three instead of two per replication), which is a drawback. For the Clayton model, in which γ follows the Gamma distribution, the algorithm is simpler.

However, for an arbitrary distribution of γ, generating a random replicate of γ requires additional work. Moreover, not every member of the AC family can be derived from a frailty model; that is, not every generator $\phi(\cdot)$ can be expressed as the inverse of the Laplace transform of some random variable.
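To make the count of random numbers concrete, here is a minimal sketch of the frailty approach for the Clayton model. It assumes the standard construction in which γ follows a Gamma distribution with shape $1/\theta$ and unit scale, whose Laplace transform is $\psi(s) = (1 + s)^{-1/\theta}$; the parameter name `theta` (with $\theta > 0$) and the function names are illustrative and not taken from the source. Each pair consumes three random draws: γ and two uniforms.

```python
import math
import random


def clayton_from_frailty(gamma, w1, w2, theta):
    """Map a frailty value and two uniforms to a pair with a Clayton copula."""
    def psi(s):
        # Laplace transform of Gamma(shape=1/theta, scale=1).
        return (1.0 + s) ** (-1.0 / theta)

    return psi(-math.log(w1) / gamma), psi(-math.log(w2) / gamma)


def draw_clayton_pair(theta, rng=random):
    """Draw one pair on the copula scale using three random numbers."""
    # Guard against a frailty that underflows to exactly zero.
    gamma = max(rng.gammavariate(1.0 / theta, 1.0), 1e-300)
    w1 = 1.0 - rng.random()  # lies in (0, 1], so log(w1) is finite
    w2 = 1.0 - rng.random()
    return clayton_from_frailty(gamma, w1, w2, theta)
```

The returned values are on the copula scale; the margins are then obtained by inverting the marginal survival or distribution functions, whichever the model is stated for.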

Although the idea of the conditional distribution approach is straightforward, the solution for $X_k$ in (5.2) usually does not have a closed-form expression, even in the bivariate case. Solving the complicated equation numerically for every replicate is very time consuming.

The proposed method is easier to use than the previous two methods. In comparison with the frailty approach, we do not have to generate the random numbers γ, which are used only for a temporary purpose. Compared with the conditional distribution approach, our method is technically easier to handle. Sometimes the inverse of $K(\cdot)$ has an explicit form. If not, we can take advantage of the monotonicity of $K(\cdot)$ and obtain its inverse using the bisection method. Despite the simplicity of the proposed method, the result of Genest and Rivest (1993) currently cannot handle higher dimensions with $p > 2$. This implies that more general theoretical results are needed in order to extend the proposed algorithm to general multivariate situations.
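The bisection step can be sketched as follows for the Gumbel generator. This is a minimal sketch rather than the thesis's own statement of the algorithm: it relies on the standard consequence of Genest & Rivest (1993) that, for a bivariate Archimedean copula with generator $\phi$, the variable $T = C(U, V)$ has distribution function $K(t) = t - \phi(t)/\phi'(t)$ and is independent of $S = \phi(U)/\{\phi(U) + \phi(V)\}$, which is uniform on $(0, 1)$. The generator $\phi(t) = (-\log t)^{\theta}$, for which $K(t) = t - t \log t / \theta$, the copula scale (rather than a survival version), and all function names are assumptions made for illustration.

```python
import math
import random


def K(t, theta):
    """K(t) = t - phi(t)/phi'(t) for the Gumbel generator phi(t) = (-log t)^theta."""
    return t - t * math.log(t) / theta


def K_inverse(q, theta, tol=1e-12):
    """Invert the increasing function K on (0, 1) by bisection."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if K(mid, theta) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi(t, theta):
    return (-math.log(t)) ** theta


def phi_inverse(y, theta):
    return math.exp(-y ** (1.0 / theta))


def gumbel_pair(s, q, theta):
    """Map two independent uniforms (s, q) to (u, v) with a Gumbel copula."""
    t = K_inverse(q, theta)      # t plays the role of C(u, v)
    total = phi(t, theta)        # phi(u) + phi(v) must equal phi(t)
    u = phi_inverse(s * total, theta)
    v = phi_inverse((1.0 - s) * total, theta)
    return u, v


def draw_gumbel_pair(theta, rng=random):
    """Only two uniform random numbers are consumed per pair."""
    return gumbel_pair(rng.random(), rng.random(), theta)
```

Because $K$ is available in closed form, each bisection iteration is a single cheap function evaluation, and no auxiliary frailty variable is drawn.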

In Figure 5.1, we plot data generated using the proposed algorithm. The two models appear more similar as the level of $\tau$ decreases.

Fig.5.1. Simulated Data using the Proposed Data Generation Algorithm

Chapter 6: Numerical Analysis

Here we examine the performance of the proposed test by simulations. Since we expect the proposed test to be applicable to any Archimedean copula model, we use the Gumbel model for illustration.

We generate bivariate failure times following the Gumbel model, also called the positive stable frailty model. We evaluate the performance under Kendall's $\tau$ equal to 0.3, 0.4, 0.5, 0.6 and 0.7. The marginal distributions of the two variables are both exponential with mean 1. The censoring variables for the two coordinates are mutually independent and also follow exponential distributions, chosen such that the probability of censoring in each coordinate ranges from 0 to 0.5.
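One way to set up such a design is sketched below. The source does not say how the censoring rates were chosen, so the following is an assumption: if a failure time is exponential with mean 1 and its censoring time is an independent exponential with rate $\lambda$, then $\Pr(C < X) = \lambda/(1 + \lambda)$, so a target censoring probability $p$ corresponds to $\lambda = p/(1 - p)$. The conversion $\theta = 1/(1 - \tau)$ is the standard relation between Kendall's $\tau$ and the Gumbel parameter in the parameterisation $C(u, v) = \exp\{-[(-\log u)^{\theta} + (-\log v)^{\theta}]^{1/\theta}\}$; the thesis may use a different symbol or parameterisation. All function names are illustrative.

```python
import random


def gumbel_theta_from_tau(tau):
    """Gumbel parameter giving Kendall's tau (assumed parameterisation)."""
    return 1.0 / (1.0 - tau)


def censoring_rate(p):
    """Exponential censoring rate giving censoring probability p against Exp(1)."""
    return p / (1.0 - p)


def censor(x, rate, rng=random):
    """Return (observed time, event indicator) for one coordinate."""
    if rate == 0.0:
        return x, 1                      # no censoring
    c = rng.expovariate(rate)            # censoring time with the given rate
    return (x, 1) if x <= c else (c, 0)
```

For example, censoring probabilities of 20% and 50% correspond to rates 0.25 and 1 under this assumption, and `censor` is applied to each coordinate separately with independent censoring times.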

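After estimating the association parameter $\alpha$, we obtain $\hat{\alpha}$ and $\hat{\alpha}_w$ and let $\hat{\gamma} = \log \hat{\alpha}$ and $\hat{\gamma}_w = \log \hat{\alpha}_w$. We then estimate the variance of $\hat{\gamma}_w - \hat{\gamma}$ by $\hat{\sigma}^2_{\text{Jackknife}}$. The Gumbel model is rejected if the test statistic [...]

The empirical probabilities of accepting the Gumbel model under different settings are reported.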
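The jackknife step can be sketched generically. In the sketch below, `statistic` stands for any function of the sample, for instance one returning $\hat{\gamma}_w - \hat{\gamma}$; the two estimators themselves are not reproduced here. The two-sided rule with critical value 1.96 (5% level under asymptotic normality) is an assumption consistent with the nominal acceptance probability of 0.95, not a formula recovered from the source, and the function names are illustrative.

```python
import math


def jackknife_variance(data, statistic):
    """Delete-one jackknife variance estimate of statistic(data); data is a list."""
    n = len(data)
    # Recompute the statistic with each observation left out in turn.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    mean_loo = sum(leave_one_out) / n
    return (n - 1) / n * sum((v - mean_loo) ** 2 for v in leave_one_out)


def accept_null(data, statistic, critical=1.96):
    """Standardise the statistic and apply a two-sided test (assumed rule)."""
    z = statistic(data) / math.sqrt(jackknife_variance(data, statistic))
    return abs(z) <= critical, z
```

For the sample mean, the jackknife variance coincides with the usual $s^2/n$, which gives a quick sanity check of the implementation. In a bivariate survival setting each element of `data` would be one observed pair together with its censoring indicators.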

Tables 6.1 and 6.2 report the empirical probabilities of choosing the Gumbel model. When the true model is Gumbel, the nominal probability is 0.95. When the true model is Clayton or Frank, the reported probability is an estimate of the type II error rate. Hence we hope that, when the Gumbel model is correct, the proportion of choosing Gumbel is close to 95/100, and that the power is as large as possible. From Table 6.1, we find that the type I error is a little smaller than 0.05 when $\tau$ equals 0.3. This may result from the jackknife variance estimator: the jackknife tends to overestimate the variance, which results in a lower type I error. When the sample size increases to 200, we see some improvement.

Specifically, the results in Table 6.2 give more accurate type I error probabilities, and Tables 6.4 and 6.6 show better power. In Tables 6.3 and 6.4, we evaluate the type II error probabilities when the true model is the Clayton model. In Tables 6.5 and 6.6, we evaluate the type II error probabilities when the true model is the Frank model. From Tables 6.3 to 6.6, we find that the power decreases as Kendall's $\tau$ decreases. This is reasonable, since all three models reduce to the independence model as Kendall's $\tau$ tends to zero, that is, $\Pr(X > x, Y > y) = S_X(x)\, S_Y(y)$. This implies that it becomes more difficult to distinguish the models when they are similar.
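For the Gumbel model this limit can be checked directly. As a worked illustration, assume the parameterisation below with parameter $\theta \ge 1$ (the symbol $\theta$ and this form of the copula are assumptions; the thesis may parameterise the model differently). Kendall's $\tau$ is then $1 - 1/\theta$, so $\tau \to 0$ corresponds to $\theta \to 1$:

```latex
C_\theta(u, v) = \exp\!\left\{ -\left[ (-\log u)^{\theta} + (-\log v)^{\theta} \right]^{1/\theta} \right\},
\qquad \tau = 1 - \frac{1}{\theta},
\\[4pt]
C_1(u, v) = \exp\{ \log u + \log v \} = uv .
```

With $u = S_X(x)$ and $v = S_Y(y)$ this is exactly the product form. The Clayton and Frank copulas reduce to $uv$ in the corresponding limits of their own parameters.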

Figures 6.1 to 6.4 show the power when the true model is the Clayton or the Frank model, with sample sizes 100 and 200.

Table 6.1: Empirical Probabilities of Accepting the Gumbel Model with n = 100

Censoring proportion = 0%; true model: Gumbel
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -0.038   -0.029   -0.02    0.012    0.047
Sample Standard Deviation       0.88     0.959    1.01     0.993    0.99
Proportion of choosing Gumbel   99/100   97/100   96/100   96/100   95/100

Censoring proportion = 20%; true model: Gumbel
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -0.115   -0.127   -0.141   -0.133   -0.134
Sample Standard Deviation       0.893    0.968    1.036    1.018    0.987
Proportion of choosing Gumbel   98/100   93/100   95/100   97/100   96/100

Censoring proportion = 50%; true model: Gumbel
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -0.255   -0.252   -0.234   -0.207   -0.199
Sample Standard Deviation       0.882    0.933    0.941    0.886    0.838
Proportion of choosing Gumbel   97/100   96/100   95/100   96/100   99/100

Table 6.2: Empirical Probabilities of Accepting the Gumbel Model with n = 200

Censoring proportion = 0%; true model: Gumbel
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     0.146    0.16     0.15     0.164    0.132
Sample Standard Deviation       0.986    1.033    1.022    1.006    1.001
Proportion of choosing Gumbel   97/100   95/100   92/100   93/100   93/100

Censoring proportion = 20%; true model: Gumbel
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     0.097    0.109    0.088    0.118    0.114
Sample Standard Deviation       1.031    1.087    1.054    1.021    1.009
Proportion of choosing Gumbel   95/100   94/100   94/100   93/100   96/100

Censoring proportion = 50%; true model: Gumbel
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -0.012   0.006    -0.034   -0.032   -0.032
Sample Standard Deviation       0.946    0.923    0.856    0.907    0.879
Proportion of choosing Gumbel   95/100   97/100   98/100   97/100   96/100

Table 6.3: Empirical Type II Error Probabilities of Accepting the Gumbel Model when the True Model is Clayton with n = 100

Censoring proportion = 0%; true model: Clayton
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -2.458   -3.203   -3.721   -4.118   -4.358
Sample Standard Deviation       1.113    1.226    1.194    1.193    1.246
Proportion of choosing Gumbel   30/100   17/100   6/100    3/100    3/100

Censoring proportion = 20%; true model: Clayton
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -1.826   -2.397   -2.793   -3.113   -3.382
Sample Standard Deviation       1.034    1.108    1.108    1.13     1.236
Proportion of choosing Gumbel   55/100   39/100   24/100   10/100   12/100

Censoring proportion = 50%; true model: Clayton
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -1.031   -1.379   -1.64    -1.919   -2.137
Sample Standard Deviation       0.879    0.983    1.059    1.144    1.135
Proportion of choosing Gumbel   83/100   72/100   65/100   58/100   52/100

Table 6.4: Empirical Type II Error Probabilities of Accepting the Gumbel Model when the True Model is Clayton with n = 200

Censoring proportion = 0%; true model: Clayton
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -3.644   -4.76    -5.65    -6.303   -6.78
Sample Standard Deviation       1.217    1.511    1.695    1.832    1.894
Proportion of choosing Gumbel   6/100    2/100    0/100    0/100    0/100

Censoring proportion = 20%; true model: Clayton
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -2.876   -3.782   -4.501   -5.055   -5.432
Sample Standard Deviation       1.083    1.342    1.525    1.672    1.738
Proportion of choosing Gumbel   23/100   6/100    3/100    0/100    0/100

Censoring proportion = 50%; true model: Clayton
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -1.771   -2.405   -2.882   -3.314   -3.628
Sample Standard Deviation       0.951    1.218    1.392    1.549    1.579
Proportion of choosing Gumbel   58/100   31/100   28/100   24/100   15/100

Table 6.5: Empirical Type II Error Probabilities of Accepting the Gumbel Model when the True Model is Frank with n = 100

Censoring proportion = 0%; true model: Frank
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -1.865   -2.248   -2.547   -2.807   -2.977
Sample Standard Deviation       0.959    0.949    0.944    0.936    0.964
Proportion of choosing Gumbel   52/100   39/100   22/100   17/100   13/100

Censoring proportion = 20%; true model: Frank
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -1.666   -2.047   -2.293   -2.552   -2.691
Sample Standard Deviation       0.945    0.957    0.928    0.92     0.956
Proportion of choosing Gumbel   67/100   50/100   38/100   21/100   17/100

Censoring proportion = 50%; true model: Frank
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -1.24    -1.53    -1.729   -1.961   -2.018
Sample Standard Deviation       0.923    1.03     1.029    1.056    0.999
Proportion of choosing Gumbel   78/100   67/100   60/100   58/100   53/100

Table 6.6: Empirical Type II Error Probabilities of Accepting the Gumbel Model when the True Model is Frank with n = 200

Censoring proportion = 0%; true model: Frank
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -2.829   -3.426   -3.882   -4.255   -4.597
Sample Standard Deviation       1.194    1.329    1.309    1.264    1.199
Proportion of choosing Gumbel   21/100   12/100   4/100    2/100    1/100

Censoring proportion = 20%; true model: Frank
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -2.77    -3.382   -3.837   -4.142   -4.363
Sample Standard Deviation       1.167    1.333    1.345    1.346    1.315
Proportion of choosing Gumbel   24/100   15/100   5/100    2/100    2/100

Censoring proportion = 50%; true model: Frank
                                tau=0.3  tau=0.4  tau=0.5  tau=0.6  tau=0.7
Sample Mean                     -2.229   -2.762   -3.139   -3.281   -3.413
Sample Standard Deviation       1.176    1.385    1.442    1.376    1.335
Proportion of choosing Gumbel   39/100   30/100   24/100   21/100   18/100

Fig.6.1: Curves of empirical power for H0: Gumbel vs. Ha: Clayton (n=100)

Fig.6.2: Curves of empirical power for H0: Gumbel vs. Ha: Frank (n=100)

Fig.6.3: Curves of empirical power for H0: Gumbel vs. Ha: Clayton (n=200)

Fig.6.4: Curves of empirical power for H0: Gumbel vs. Ha: Frank (n=200)

Fig.6.5: The local odds ratio functions at different levels of Kendall’s tau for the Gumbel model, the Clayton model and the Frank model

Chapter 7: Conclusion

In this article, we propose a test for checking whether data follow an AC model. In our analysis, we use the Gumbel model for illustration. To verify whether the proposed test statistic is asymptotically normal, we examine its distribution by simulations, and the results support this conjecture. We have also found that the power of the proposed test is satisfactory.

Shih (1998) has analyzed the situation when the null hypothesis is the Clayton model while the alternative hypothesis is Gumbel’s model. In our simulations, we reverse the roles of the two models in setting the hypotheses. Our result is similar to that of Shih.

The power decreases as the censoring proportion increases. When the null hypothesis is the Gumbel model, the power is higher under the Clayton alternative than under the Frank alternative. Recall from Figure 6.5 that the Gumbel model is closer to the Frank model and less similar to the Clayton model. Models that differ more are easier to distinguish, which results in higher power.

As for future investigation, we may try more model combinations. It may also be interesting to compare the proposed test with the test of Wang and Wells (2000) by simulations.

Appendix

Here, we prove the survival version of the theorem in Genest & Rivest. The proof can be divided into several parts.

Consider the survival AC model:

[Model equation and parts (i) and (ii) of the proof not recoverable from the source.]

(iii) Show that the conditional survival function can be written as [formula not recoverable from the source].

Here, we try to prove the asymptotic normality of $\hat{\gamma}_w - \hat{\gamma}$. The idea is first to prove the asymptotic normality of the untransformed estimator $\hat{\alpha}_{W} - \hat{\alpha}_{UW}$, and then to use the delta method to derive the asymptotic normality of $\hat{\gamma}_w - \hat{\gamma}$.

[...] $(i, j)$ pairs of observations. So, we can use the U-statistic to derive the analytic properties of $S(\alpha)$.

References:

CLAYTON, D. G. (1978). A model for association in bivariate life tables and its application to epidemiological studies of familial tendency in chronic disease incidence. Biometrika. 65, 141-51.

DABROWSKA, D. (1988). Kaplan-Meier estimate on the plane. The Annals of Statistics. 16, 1475-89.

FREES, E. W. & VALDEZ, E. (1998). Understanding relationships using copulas. North American Actuarial Journal. 2, 1-25.

GENEST, C. & RIVEST, L.-P. (1993). Statistical inference procedures for bivariate Archimedean copulas. Journal of the American Statistical Association. 88, 1034-43.

LEE, A. J. (1993). Generating random binary deviates having fixed marginal distributions and specified degrees of association. The American Statistician. 47, 209-215.

OAKES, D. (1986). Semiparametric inference in a model for association in bivariate survival data. Biometrika. 73, 353-61.

OAKES, D. (1989). Bivariate survival models induced by frailties. Journal of the American Statistical Association. 84, 487-93.

SHIH, J. H. (1998). A goodness-of-fit test for association in a bivariate survival model. Biometrika. 85, 189-200.

WANG, W. & WELLS, M. (1997). Nonparametric estimators of the bivariate survival function under simplified censoring conditions. Biometrika. 84, 863-880.
