Monte Carlo estimation of extrapolation of quality-adjusted survival for follow-up studies


* Correspondence to: Jing-Shiang Hwang, Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan. E-mail: [email protected]

Contract/grant sponsor: National Science Council of ROC, Taiwan

CCC 0277-6715/99/131627-14$17.50 Received January 1998

Statist. Med. 18, 1627-1640 (1999)


JING-SHIANG HWANG* AND JUNG-DER WANG

Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan

 Center for Research of Environmental and Occupational Disease, Institute of Occupational Medicine and Industrial Hygiene, College of Public Health, National Taiwan University, No. 1, Sec. 1, Jen-Ai Rd, Taipei 100, Taiwan

SUMMARY

The expected quality-adjusted survival (QAS) for an index population with a specific disease can be estimated by summing the product of the survival function and the mean quality of life function of the population. In many follow-up studies with heavy censoring, the expected QAS may not be well estimated because of the lack of data beyond the close of follow-up. In this paper, we first created a reference population from the life tables of the general population according to the Monte Carlo method. Secondly, we fitted a simple linear regression line to the logit of the ratio of quality-adjusted survival functions for the index and reference populations up to the end of follow-up. Finally, combining information on the reference population with the fitted line, we predicted the expected quality-adjusted survival curve beyond the follow-up period for the index population. Simulation studies have shown that the simple Monte Carlo estimation procedure is a potential approach for estimating expected QAS and the survival function beyond the follow-up with a certain degree of accuracy. Copyright © 1999 John Wiley & Sons, Ltd.

1. INTRODUCTION

Quality of life (QOL) and quality-adjusted life year (QALY), as measures of outcome evaluation in health contexts, have been increasingly used in recent years in the analysis of health-related studies, especially cost-effectiveness analysis. The QOL is often measured by a utility scale or health profile and then summarized to reference points between 0 and 1; perfect health is assigned a weight of 1, and a state equivalent to being dead a weight of 0. An individual's quality-adjusted survival (QAS) is given as the integral of the patient's utility over his/her survival duration. Hwang et al. treated a patient's QOL as a stochastic process $q(t \mid T, x)$, $t \in [0, T]$, where $T$ is the survival time and $x$ is a covariate vector representing a specific cohort or population with the index disease. The QAS time of the patient, measured in quality-adjusted survival years or months, is therefore represented by

$$\mathrm{QAS}(x) = \int_0^T q(t \mid T, x)\,dt. \tag{1}$$

If patients are assumed to experience a finite number $k$ of health states which differ in their QOL, as in Glasziou et al., the QAS is given as

$$\mathrm{QAS}(x) = \sum_{i=1}^{k} q_i s_i$$

where $q_1, \ldots, q_k$ are the utilities assigned to each of the $k$ health states and $s_1, \ldots, s_k$ are the times spent in each of the states.

To estimate the expected QAS for a specific disease population or subpopulation with covariate vector $x$, Hwang et al. derived a simple approximation

$$E[\mathrm{QAS}(x)] \approx \int_0^\infty E[q(t \mid x)]\,S(t \mid x)\,dt$$

where $E[q(t \mid x)]$ is the mean QOL at time $t$ after onset for the disease subpopulation with covariate vector $x$, and $S(t \mid x)$ is the survival function of that subpopulation. The estimated survival function, denoted by $\hat{S}(t \mid x)$, can be obtained by applying commonly used methods such as the life table, the Kaplan-Meier method and parametric models to the available survival data. Patients' QOL utilities at some time points can be obtained through a cross-sectional survey. The estimated $E[q(t \mid x)]$, denoted by $\hat{q}(t \mid x)$, is calculated using kernel smoothing techniques or by fitting a non-linear curve to the QOL survey data. In a simulation study, Hwang et al. demonstrated that the area under the quality-adjusted survival curve, defined by $\widehat{\mathrm{qasc}}(t \mid x) = \hat{q}(t \mid x)\hat{S}(t \mid x)$, is a good estimate of the expected QAS for a specific population. However, this estimator of expected QAS requires both QOL survey data and a complete follow-up for survival. In many commonly encountered chronic diseases, the life expectancy may be very long. For example, patients with diabetes mellitus, hypertension or papillary thyroid cancer frequently survive for more than 15-20 years if the disease is recognized early and carefully treated. Thus, one may be faced with sets of survival data with very high censoring rates, say more than 80 per cent. Then, the QAS can only be estimated up to the end of follow-up, say 5 years, instead of over the whole life span.
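As a rough illustration of the estimator just described, the following Python sketch computes a Kaplan-Meier estimate of the survival function, a kernel-smoothed estimate of the mean QOL, and the area under their product by the trapezoidal rule. The function names, the Gaussian (Nadaraya-Watson) kernel, the bandwidth and the toy data are assumptions made for illustration only; they are not taken from Hwang et al.

```python
import numpy as np

def kaplan_meier(times, events, grid):
    """Product-limit estimate of S(t) on `grid` (events: 1 = death, 0 = censored)."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    death_times = np.unique(times[events == 1])
    at_risk = np.array([(times >= u).sum() for u in death_times], dtype=float)
    deaths = np.array([((times == u) & (events == 1)).sum() for u in death_times],
                      dtype=float)
    steps = np.concatenate(([1.0], np.cumprod(1.0 - deaths / at_risk)))
    # The number of death times <= t selects the right-continuous step value.
    return steps[np.searchsorted(death_times, grid, side="right")]

def kernel_mean_qol(interview_times, utilities, grid, bandwidth):
    """Gaussian-kernel weighted mean of the surveyed utilities at each grid time."""
    v = np.asarray(interview_times, dtype=float)
    q = np.asarray(utilities, dtype=float)
    weights = np.exp(-0.5 * ((grid[:, None] - v[None, :]) / bandwidth) ** 2)
    return weights @ q / weights.sum(axis=1)

def expected_qas(times, events, interview_times, utilities, grid, bandwidth):
    """Area under qasc(t) = q_hat(t) * S_hat(t), by the trapezoidal rule."""
    qasc = (kernel_mean_qol(interview_times, utilities, grid, bandwidth)
            * kaplan_meier(times, events, grid))
    return float(np.sum(0.5 * (qasc[1:] + qasc[:-1]) * np.diff(grid)))

# Toy data (hypothetical): four uncensored deaths and a constant utility of 1,
# so the estimate should be close to the mean survival time.
grid = np.linspace(0.0, 5.0, 5001)
toy_qas = expected_qas([1, 2, 3, 4], [1, 1, 1, 1],
                       [0.5, 1.5, 2.5], [1.0, 1.0, 1.0], grid, bandwidth=1.0)
```

With heavy censoring the Kaplan-Meier curve stops well above zero at the end of follow-up, so the same computation only yields the QAS restricted to the follow-up period.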

Because of the lack of data on both survival and QOL after censoring, the QAS method cannot be applied directly to clinical and public health decisions. While the mean QOL for the whole time span might be extrapolated by assigning some constant utility to the time beyond the follow-up, accurate survival estimates are not easily obtained from heavily censored data. Parametric extrapolation of survival estimates beyond the follow-up limits is a common approach. Gelber et al. proposed an estimator that is a composite of the Kaplan-Meier product limit estimate and a parametric estimate of the tail of the survival function. The estimator is especially useful whenever a parametric model can feasibly be fitted to the tail rather than to the entire survival curve. Mark et al. proposed another composite modelling approach for estimating survival rates after the end of follow-up for the Global Utilization of Streptokinase and Tissue Plasminogen Activator for Occluded Coronary Arteries (GUSTO) study, which extended 1-year survival data by an additional 14 years from the Duke Cardiovascular Disease Database and used a Gompertz function to extrapolate the tail of the survival curve. Such direct substitution of survival data may produce accurate results to some degree. However, it works only when a relevant database is available.

In this paper, we propose a more feasible approach to projecting quality-adjusted survival estimates beyond the follow-up, especially when the data have been heavily censored. The main idea of this approach is to borrow information from a reference population whose survival function is easily obtained from available life table data, such as a table of vital statistics. The approach consists of roughly three phases. First, we create a reference population, with its survival function estimated by the Monte Carlo method from a population with known hazard functions. Second, we fit a simple linear regression to the logit transform of the ratio of the QAS curves of the index and reference populations up to the end of follow-up. Finally, the estimated regression line and the survival curve of the reference population are used to estimate the entire quality-adjusted survival curve, and therefore the projected long-term QAS beyond the follow-up limit.

The general data structure for these studies is given in Section 2. Section 3 describes the detailed modelling and estimation procedures. In Section 4, simulation studies are designed to mimic practical follow-up studies, and the performance of the proposed approach is evaluated through them. The potential applications of the proposed approach and its limitations are discussed in the final section.

2. DATA STRUCTURE

2.1. Follow-up data for an index population

The methodology has been developed to estimate the expected QAS beyond the follow-up period for an index population with a disease or an injury, based on a typical medical follow-up study that allows for arbitrary censorship and offers the opportunity for quality of life interviews. Suppose that by the end of the study $N$ patients have been recruited to form a sample of the index population. For the $i$th patient in the case sample, let $Y_i$ denote the duration from onset to the current study time or the end of the follow-up study. If the patient is still alive, we assign the censor status variable $d_i = 0$; otherwise, $Y_i$ represents the complete survival time. Usually we have already collected a covariate vector $z_i$ describing the patient's characteristics, such as sex, onset age, race, family history and social status. We also assume that a random cross-sectional survey has been conducted on patients still alive to collect a sample of $V_i$ and $q_i$, where $V_i$ is the duration from onset to the interview, and $q_i$ is the measured utility of the patient's quality of life at time $V_i$. The survival function can be estimated using the Kaplan-Meier method and is denoted by $\hat{S}(t \mid \text{index})$ for $t \ge 0$. The mean quality of life function can also be estimated using the kernel smoothing method, as described in Hwang et al., and is denoted by $\hat{q}(t \mid \text{index})$ for $t \ge 0$. Note that the estimates of both survival and mean QOL are usually reliable only up to the end of follow-up.

2.2. Simulated data for a reference population

The first phase of our estimation procedure is to create an informative reference population. The selection of reference subjects is similar to that in a cohort study, where one usually chooses a group according to comparability of effects, contrasted populations and information. For feasibility, we suggest matching at least the gender and onset age, because vital statistics of a nation's general population are readily accessible. If other sources of reference subjects are available that offer better comparability with respect to potential confounders, we may consider those alternatives instead.

The idea of borrowing information from the general population to improve estimation of expected life in survival studies with incomplete follow-up data was proposed, for example, by Hakama and Hakulinen. We have integrated this idea with Monte Carlo techniques to establish a base for the first phase of the estimation procedure for expected quality-adjusted survival. The detailed procedure for constructing a reference population from the general population is described as follows.

For the $i$th patient in the sample of the index population, we choose a person with the same covariate $z_i$, mainly onset age and gender, from the general population with known hazard function in vital statistics to form a reference sample. The survival time of the selected individual in the reference population is then generated by the Monte Carlo method, based on the hazard function for the individual's matched gender and onset age. For example, a survival time of a reference subject corresponding to a male patient of age $x$ may be generated as follows. From the life table of the general population, we first find $p_{x+k}^{x+k+1}$, the proportion of male persons alive at the beginning of the age interval $(x+k, x+k+1)$ but dying during the interval, for $k \ge 0$. The conditional survival function of the male general population who have survived to age $x$ is given by $S(t \mid x) = \prod_{k=0}^{t-1} (1 - p_{x+k}^{x+k+1})$ for $t > 0$, and $S(0 \mid x) = 1$. Secondly, a uniform random number between zero and one is generated. The time $t_x$ such that $S(t_x \mid x)$ equals the uniform random number is a survival time for the reference population.
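The two steps above amount to inverse-transform sampling from a life table, which might be sketched in Python as follows. The toy death probabilities are hypothetical (they are not the Taiwan vital statistics), and the linear interpolation of $S(t \mid x)$ within each one-year age interval is an added assumption, since the text defines the survival function only through yearly proportions.

```python
import numpy as np

def conditional_survival(death_probs, onset_age):
    """S(0|x), S(1|x), ... from one-year death probabilities, for x = onset_age."""
    q = np.asarray(death_probs[onset_age:], dtype=float)
    return np.concatenate(([1.0], np.cumprod(1.0 - q)))

def invert_survival(surv, u):
    """Time t (years since onset) at which the survival curve equals u."""
    below = np.nonzero(surv <= u)[0]
    if below.size == 0:
        return float(len(surv) - 1)      # table exhausted: truncate at its end
    k = int(below[0])
    if k == 0:
        return 0.0                       # only possible when u >= 1
    s_hi, s_lo = surv[k - 1], surv[k]
    # Assumption: S is linear inside the age interval (k - 1, k).
    return (k - 1) + (s_hi - u) / (s_hi - s_lo)

def sample_reference_times(death_probs, onset_ages, rng):
    """One simulated reference survival time per matched onset age."""
    return np.array([
        invert_survival(conditional_survival(death_probs, int(age)), rng.uniform())
        for age in onset_ages
    ])

# Hypothetical Gompertz-like life table for ages 0..100 (illustration only).
ages = np.arange(101)
toy_death_probs = np.minimum(1.0, 0.0005 * np.exp(0.085 * ages))
rng = np.random.default_rng(0)
reference_times = sample_reference_times(toy_death_probs, [55, 60, 70], rng)
```

In practice a separate column of death probabilities would be used for each gender, so that each reference subject matches the index patient on both gender and onset age.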

The survival curve of the reference population, denoted by $\hat{S}(t \mid \text{ref})$, is then obtained by applying life table or product limit methods to the simulated survival times. In practical applications, $\hat{S}(t \mid \text{ref})$ should be greater than $\hat{S}(t \mid \text{index})$. If quality of life data are available for the general population, we may use similar techniques to obtain an estimated mean quality of life function, denoted by $\hat{q}(t \mid \text{ref})$. Usually we would simply assign $\hat{q}(t \mid \text{ref})$ a suitable constant close to 1 to ensure that the quality-adjusted survival curve for the reference population, $\widehat{\mathrm{qasc}}(t \mid \text{ref}) = \hat{q}(t \mid \text{ref})\hat{S}(t \mid \text{ref})$, is above the estimated curve $\widehat{\mathrm{qasc}}(t \mid \text{index})$ for the index population.

3. MODEL AND ESTIMATION

Since both index and reference populations have the same distribution for the covariate vector, which includes the factors affecting people's survival and QOL, the ratio of $\widehat{\mathrm{qasc}}(t \mid \text{index})$ to $\widehat{\mathrm{qasc}}(t \mid \text{ref})$, denoted by $W(t)$, is assumed to behave in a stable manner after an initial unstable stage. During that stage the disease is first noticed, more invasive diagnostic and/or therapeutic procedures are carried out to study the extent of the disease or to remove the lesion surgically, and patients are usually under psychological stress and physical discomfort as well as at higher risk of mortality because of these invasive procedures. In most practical situations, especially in populations with chronic diseases, $W(t)$ will slowly decrease or remain constant once $t$ exceeds this critical time point $T_0$, which marks the end of the unstable stage. Meanwhile, the hazard and quality of life are usually worse in the index population than in the reference population; therefore, we may assume that $W(t)$ lies between 0 and 1.

In a follow-up study, we can confidently estimate $W(t)$ only up to some specific time $T$, which is usually the end of the follow-up. Assuming that $W(t)$ has stabilized after the unstable stage, we may use the predicted $W(t)$ and the estimated $\widehat{\mathrm{qasc}}(t \mid \text{ref})$ to estimate $\widehat{\mathrm{qasc}}(t \mid \text{index})$ for $t > T$. More precisely, we apply a logit transform to $W(t)$, which extends the range to run from negative infinity to positive infinity; the transformed curve approximates a straight line, except at the two tail ends. The left extreme tail corresponds to the period between onset and the time $T_0$ at which the curve begins to behave more stably, indicating the time necessary for stabilization. The right extreme tail corresponds to the time of near death, when the survival function is near zero. The central part of the logit curve is close to a straight line.

Figure 1. Plots of logit of $W(t)$ for the proportional hazard examples with survival time in the reference populations having Weibull distributions with scale value one and shape values 0.5, 1.0 and 1.2 and $\exp(Z\theta) = 2$

To illustrate the linearity of the logit transform of the ratio of these two quality-adjusted survival functions, we assume that the hazard function for the index population is proportional to the hazard function for the reference population. For example, we may have $S(t \mid \text{index}) = S(t \mid \text{ref})^{\exp(Z\theta)}$, where $Z$ is a covariate vector and $\theta$ is the model parameter vector. Also assuming that the ratio of the mean quality of life functions of the index and reference populations is a slowly decreasing function of time, for example $0.8\exp(-t)$, we then have $W(t) = 0.8\exp(-t)\,S(t \mid \text{ref})^{\exp(Z\theta) - 1}$. Suppose that the survival times of the reference population follow Weibull distributions with a scale parameter value of one and shape parameter values $c = 0.5, 1.0, 1.2$, indicating decreasing, constant and increasing hazard rates, respectively. In Figure 1 we can see clear linearity of the logit of $W(t)$ for $t > 0$, except for a short period at the beginning.
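The proportional hazard example can be reproduced numerically. The Python sketch below evaluates the logit of $W(t)$ for the three Weibull shapes with a hazard ratio of 2 and reports how well a straight line fits each curve. The time grid, which starts at 0.5 to skip the initial period, and the use of $R^2$ as a linearity summary are choices made here for illustration, not part of the original analysis.

```python
import numpy as np

def logit_w(t, shape, hazard_ratio=2.0):
    """logit of W(t) = 0.8 exp(-t) S_ref(t)^(hazard_ratio - 1), Weibull scale 1."""
    s_ref = np.exp(-t ** shape)                      # Weibull survival function
    w = 0.8 * np.exp(-t) * s_ref ** (hazard_ratio - 1.0)
    return np.log(w / (1.0 - w))

def r_squared(t, y):
    """Coefficient of determination of the least squares line through (t, y)."""
    slope, intercept = np.polyfit(t, y, 1)
    residuals = y - (intercept + slope * t)
    return 1.0 - residuals.var() / y.var()

t = np.linspace(0.5, 5.0, 200)                       # assumed illustration grid
linearity = {shape: r_squared(t, logit_w(t, shape)) for shape in (0.5, 1.0, 1.2)}
```

Values of `linearity` close to one indicate that the central part of each logit curve is nearly straight, which is the property the regression in this section relies on.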

The linearity property provides us with an alternative and easy way of projecting the survival and quality-adjusted survival estimates beyond the follow-up period. Therefore, we propose fitting a simple linear regression to the logit of $W(t)$ for $t \in [T_0, T]$, that is

$$\log\frac{W(t)}{1 - W(t)} = a + bt + \varepsilon_t, \quad \text{for } T_0 \le t \le T. \tag{2}$$


Figure 2. Quality-adjusted survival curves for index sample and reference populations, and the estimated whole curve for the index population in a simulated 60-month follow-up study

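Given the least squares estimates of the two parameters, $\hat{a}$ and $\hat{b}$, the new estimate $\widehat{\mathrm{qasc}}(t \mid \text{index})$ for $t > T$ is given by

$$\widehat{\mathrm{qasc}}(t \mid \text{index}) = \widehat{\mathrm{qasc}}(t \mid \text{ref})\,\frac{\exp(\hat{a} + \hat{b}t)}{1 + \exp(\hat{a} + \hat{b}t)}. \tag{3}$$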
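A minimal Python sketch of the regression and projection steps follows. It assumes the two curves have already been evaluated on a common time grid, with the index curve unknown (`NaN`) beyond the end of follow-up; the function names and the synthetic curves are hypothetical and serve only to show the mechanics.

```python
import numpy as np

def fit_logit_line(t, w, t0, t_end):
    """Least squares fit of logit W(t) = a + b t on the window [t0, t_end]."""
    mask = (t >= t0) & (t <= t_end)
    logit = np.log(w[mask] / (1.0 - w[mask]))
    slope, intercept = np.polyfit(t[mask], logit, 1)
    return intercept, slope

def project_qasc_index(t, qasc_index, qasc_ref, t0, t_end):
    """Keep the observed index curve up to t_end, then project it beyond."""
    a_hat, b_hat = fit_logit_line(t, qasc_index / qasc_ref, t0, t_end)
    # exp(x) / (1 + exp(x)) written in the equivalent form 1 / (1 + exp(-x)).
    w_pred = 1.0 / (1.0 + np.exp(-(a_hat + b_hat * t)))
    return np.where(t <= t_end, qasc_index, qasc_ref * w_pred)

# Synthetic check: a reference curve and a ratio whose logit is exactly linear.
t = np.arange(0.0, 301.0)                                  # months
qasc_ref = 0.95 * np.exp(-0.004 * t)
w_true = 1.0 / (1.0 + np.exp(-(1.5 - 0.01 * t)))
truth = qasc_ref * w_true
observed = np.where(t <= 60.0, truth, np.nan)              # 60-month follow-up
projected = project_qasc_index(t, observed, qasc_ref, t0=6.0, t_end=60.0)
```

Because the synthetic ratio is exactly linear on the logit scale, the projection recovers the full curve here; with real data the quality of the projection depends on how well the linearity holds beyond the follow-up.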

To gain insight into the estimation procedure, we calculated an example of $\widehat{\mathrm{qasc}}(t \mid \text{index})$ from 60-month follow-up data and plotted the estimated quality-adjusted survival curve for the index population together with the reference curve in Figure 2.

Once we have obtained a better estimate of the whole curve $\widehat{\mathrm{qasc}}(t \mid \text{index})$, the area under the curve from 0 to $t$ is an estimate of the expected quality-adjusted survival restricted to $t$ for the specific disease population, denoted by $\widehat{\mathrm{QAS}}_t(T_0)$. Note that the estimate $\widehat{\mathrm{QAS}}_t(T_0)$ is affected by the choice of the necessary stabilization time $T_0$. If we choose too small a $T_0$, we may confront the problem of a lack of fit. If $T_0$ is too close to $T$, fewer data are available for fitting a significant line. In substantive terms, we might choose $T_0$ according to the characteristics of the disease. For instance, it may take up to 1-3 months to complete a work-up for hypertension that rules out secondary causes, and another 1-3 months to adjust the treatment regimen. As another example, it may take 1-3 months for a patient to recover from surgery for thyroid cancer. In practice, the choice of $T_0$ can follow a procedure similar to that of Gelber et al. That is, use the plot of the logit of $W(t)$ to determine whether a simple linear regression fits the tail well and what value of $T_0$ is appropriate. The plot of the logit of $W(t)$ may suggest several possible $T_0$ values for modelling; thus we may calculate $\widehat{\mathrm{QAS}}_t(T_0)$ for each possible $T_0$ and save the slope estimate, denoted as $\hat{b}_{T_0}$, only


slopes should be very close for each $T_0$, but it is possible to have an inaccurate estimate of the slope due to outliers or influential points in the selected time interval, especially when the sample size is not large enough. In order to have a stable estimate, we suggest that the expected quality-adjusted survival time, $\widehat{\mathrm{QAS}}_t$, be given by $\widehat{\mathrm{QAS}}_t(T_0^*)$, where $T_0^*$ is the $T_0$ value such that $\hat{b}_{T_0^*}$ is the median of all the saved $\hat{b}_{T_0}$.
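The median-slope rule might be sketched in Python as below. The candidate stabilization times, the synthetic ratio with an early unstable period, and the tie-breaking convention (the lower median when the number of candidates is even) are assumptions introduced here; the text does not specify them.

```python
import numpy as np

def logit_slope(t, w, t0, t_end):
    """Slope of the least squares line through logit W(t) on [t0, t_end]."""
    mask = (t >= t0) & (t <= t_end)
    slope, _ = np.polyfit(t[mask], np.log(w[mask] / (1.0 - w[mask])), 1)
    return slope

def median_slope_t0(t, w, candidates, t_end):
    """Return the candidate T0 whose fitted slope is the median of all slopes."""
    slopes = np.array([logit_slope(t, w, t0, t_end) for t0 in candidates])
    order = np.argsort(slopes)
    pick = order[(len(order) - 1) // 2]   # assumption: lower median if even count
    return candidates[pick], slopes[pick]

# Synthetic ratio: linear logit with slope -0.02 plus an early unstable bump.
t = np.arange(0.0, 61.0)                                   # months
logit_true = 1.0 - 0.02 * t + 0.8 * np.exp(-t / 2.0)
w = 1.0 / (1.0 + np.exp(-logit_true))
t0_star, slope_star = median_slope_t0(t, w, [2.0, 4.0, 6.0, 8.0, 10.0], t_end=60.0)
```

Taking the median guards against a single window whose slope is distorted by a few influential points, which is the concern raised above for small samples.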

The standard error of $\widehat{\mathrm{QAS}}_t$ can also be estimated by using resampling techniques similar to the bootstrap method. The $b$th bootstrap data set of size $N$ for the index population, denoted by $X^b$, is sampled with replacement from the original data set $X = \{Y_i, d_i, z_i, V_i, q_i\}_{i=1}^{N}$. We follow the entire Monte Carlo estimation procedure to create a new reference sample based on the bootstrap data $X^b$. The modelling and estimation procedures are then applied to the bootstrap data set and the corresponding reference sample to produce a bootstrap estimate of the expected QAS, denoted by $\widehat{\mathrm{QAS}}_t^b$. We may repeat the bootstrap procedure $B$ times to collect a sample of $B$ bootstrap estimates of the expected QAS. The standard error of $\widehat{\mathrm{QAS}}_t$ is therefore given by the sample standard deviation of $\{\widehat{\mathrm{QAS}}_t^b\}_{b=1}^{B}$.
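The resampling loop can be outlined in Python as follows. The `estimator` argument is a hypothetical callable standing in for the whole pipeline (reference-sample simulation, regression and projection); in the demonstration it is replaced by a simple sample mean so that the sketch stays short and its result can be compared with a known standard error.

```python
import numpy as np

def bootstrap_se(records, estimator, n_boot, rng):
    """Sample standard deviation of `estimator` over bootstrap resamples.

    records   : array with one row per patient
    estimator : callable (resampled_records, rng) -> float; a stand-in for
                the full Monte Carlo estimation procedure
    """
    records = np.asarray(records)
    n = len(records)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)        # draw patients with replacement
        estimates[b] = estimator(records[idx], rng)
    return estimates.std(ddof=1)

# Demonstration with a stand-in estimator (the sample mean of one column).
rng = np.random.default_rng(0)
toy_records = rng.normal(size=(400, 1))
se_mean = bootstrap_se(toy_records, lambda rec, r: rec[:, 0].mean(), 500, rng)
```

Passing the random generator to the estimator reflects the requirement above that a new reference sample be simulated for every bootstrap data set.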

4. SIMULATION STUDY

4.1. Hypothetical disease populations

Three hypothetical populations of size 50,000 with specific diseases representing moderate, longer and shorter mean survival times were generated for the performance evaluation of our Monte Carlo estimation procedures. Each patient's gender was generated according to a Bernoulli distribution with probability 0.5 for the three populations. In populations I and III, the onset ages were generated from gamma distributions with means of 55 and 60 years for men and women, respectively. The standard deviations of the onset ages were 12 and 6 years for both men and women in these two populations. In population II, the onset ages were also generated from gamma distributions, with smaller means of 37 and 42 years for men and women, respectively. A larger standard deviation of 42 years was set for these onset ages.

Each patient's hazards after the onset age are assumed to be proportional to the hazards in the general population with the same gender and age for populations I and II. The hazard function of the general population is based on the 1993 vital statistics of Taiwan. The exact survival times $T$ of patients in populations I and II with gender $z_1$ and onset age $z_2$ are generated from the following two hazard functions, respectively:

$$h(t \mid z_1, z_2) = h_0(t \mid z_1, z_2)\,(1 + A \times (1 + z_1) + B\log(z_2)) \tag{4}$$

and

$$h(t \mid z_1, z_2) = h_0(t \mid z_1, z_2)\,(1 + A \times (1 + z_1) + B\log(z_2)) \tag{5}$$

where $h_0(t \mid z_1, z_2)$ is the hazard function of the general population for age $z_2$ and gender $z_1$, with $z_1 = 1$ for male and 0 for female. The random variable $A$ is uniformly distributed in (0, 1), and $B$ is beta distributed with both parameters equal to 0.5. These two populations were constructed so that a patient's hazard is worse than that of the general population of the same age and gender. The degree of worsening is determined partly by the personal unknown random factor $A$ and partly by the patient's onset age, scaled by the random factor $B$. Males in these two populations have


The survival time is much longer and more variable in population II than in population I, while the survival times in population III were constructed independently of the hazard function $h_0(t \mid z_1, z_2)$ used in the other two populations, from a gamma distribution with a shorter mean of 8 years and a standard deviation of 4 years. The survival curves given in Figure 3 depict the survival differences among these three hypothetical index populations.

The quality of life function of the $i$th patient with simulated survival time $T_i$ among these three hypothetical index populations is determined by the same function as in Hwang et al., that is

$$q_i(t \mid p, g, c, d) = p\,(1 - t/T_i)^{g} + d\,(1 - p)\sin^2(c\pi t/T_i) \tag{6}$$

where $p$, $g$, $c$ and $d$ are uniformly distributed in (0.8, 1), (0.01, 0.5), (0, 4) and (0, 1), respectively. Let $\mathrm{QAS}_i(t)$ be the $i$th patient's quality-adjusted survival from onset to time $t$, that is

$$\mathrm{QAS}_i(t) = \int_0^t q_i(s \mid p, g, c, d)\,ds = \frac{pT_i}{1 + g}\left[1 - \left(1 - \frac{t}{T_i}\right)^{1+g}\right] + \frac{d(1 - p)t}{2}\left[1 - \frac{T_i}{2c\pi t}\sin\frac{2c\pi t}{T_i}\right]. \tag{7}$$

With the above equation we can easily obtain the population mean QAS up to any time $t$ by averaging these 50,000 $\mathrm{QAS}_i(t)$; the mean is denoted by $E(\mathrm{QAS}_t)$. The true cumulative distributions of mean quality-adjusted survival times, generated from (7), for these three hypothetical index populations are plotted in Figure 4. The mean QASs restricted to 60 months are very close: 48.6, 52.4 and 47.2 months for the three populations, respectively. As time extends to the whole life span, these three mean QASs grow at different speeds to 131.9, 242.4 and 74.4 months. In the following steps of the simulation studies, we use $k$-month data sampled from each of these three populations to project results beyond $k$ months, which are then compared with the above true mean QAS to evaluate the accuracy of the Monte Carlo approach.
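The closed form (7) can be checked against direct numerical integration of the quality of life function (6). The Python sketch below does this for one hypothetical patient; it reads the oscillating term of (6) as a squared sine, which is the form whose integral gives the second bracket in (7). The parameter values are arbitrary illustration choices within the stated ranges.

```python
import numpy as np

def qol(t, T, p, g, c, d):
    """Quality of life at time t for a patient with survival time T, as in (6)."""
    return p * (1.0 - t / T) ** g + d * (1.0 - p) * np.sin(c * np.pi * t / T) ** 2

def qas_closed_form(t, T, p, g, c, d):
    """Closed-form integral of qol from 0 to t, as in (7); needs 0 < t <= T, c > 0."""
    decay = p * T / (1.0 + g) * (1.0 - (1.0 - t / T) ** (1.0 + g))
    wave = d * (1.0 - p) * t / 2.0 * (
        1.0 - T / (2.0 * c * np.pi * t) * np.sin(2.0 * c * np.pi * t / T))
    return decay + wave

def qas_numeric(t, T, p, g, c, d, n=200_001):
    """Trapezoidal integration of qol from 0 to t on n grid points."""
    s = np.linspace(0.0, t, n)
    y = qol(s, T, p, g, c, d)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(s)))

# One hypothetical patient: survival time 100 months, QAS restricted to 60 months.
params = dict(T=100.0, p=0.9, g=0.3, c=2.5, d=0.6)
closed_60 = qas_closed_form(60.0, **params)
numeric_60 = qas_numeric(60.0, **params)
```

Averaging `qas_closed_form` over all simulated patients, with $t$ capped at each patient's survival time, would give the population mean described in the paragraph above.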

4.2. Sample from the index populations

A hypothetical $k$-month follow-up study was designed to have patients entering the sample every year to accumulate a final sample of size $N$ by the end of the follow-up. In the simulation, we selected a random sample of $N$ patients uniformly from the above hypothetical index population. For the $i$th patient we recorded his/her gender and onset age and compared the survival time $Y_i$ with the follow-up time $U_i$, which is $k$ months times a random number generated from beta(9.5, 0.5). Note that we have used a random $U_i \le k$ to make the sample heavily censored. If $Y_i$ is smaller than $U_i$ months, we treat $Y_i$ as a complete survival time and assign the censor status variable $d_i = 1$. Otherwise we assign $d_i = 0$, indicating that the case is still alive with right censoring time $U_i$. Meanwhile we also generate another uniform random number $V_i$ from 0 to $U_i$ to represent the time of the quality of life interview. To allow for sampling error, the patient's quality of life $q_i(V_i)$ at this time point is recorded as the patient's simulated quality of life value at a time point drawn uniformly between three months before and three months after $V_i$.
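The censoring scheme described above might be simulated as in the following Python sketch. The gamma-distributed true survival times are a hypothetical stand-in for a sample from one of the index populations, and the function name is illustrative.

```python
import numpy as np

def simulate_follow_up(true_survival_months, k, rng):
    """Apply the random follow-up times U_i = k * Beta(9.5, 0.5) to true survival."""
    y = np.asarray(true_survival_months, dtype=float)
    u = k * rng.beta(9.5, 0.5, size=y.size)       # follow-up time, at most k months
    died = y < u                                  # complete survival time observed
    observed = np.where(died, y, u)               # otherwise right censored at U_i
    interview = rng.uniform(0.0, u)               # QOL interview time V_i in (0, U_i)
    return observed, died.astype(int), u, interview

rng = np.random.default_rng(0)
toy_survival = rng.gamma(shape=4.0, scale=24.0, size=300)   # hypothetical, in months
observed, status, follow_up, interview = simulate_follow_up(toy_survival, 60.0, rng)
censoring_rate = 1.0 - status.mean()
```

Because the beta(9.5, 0.5) draws concentrate near one, most follow-up times are close to $k$ months, and patients whose true survival exceeds that are censored.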

4.3. Simulation results

The accuracy and precision of projected estimates of quality-adjusted survival in a follow-up study are determined mostly by the underlying true survival curve, the length of follow-up, the time to extrapolation and the sample size. For demonstration, we used 60-month follow-up data with sample sizes $N = 100$ and 300 from these three hypothetical data sets to estimate and project the expected quality-adjusted survival time restricted to several limits beyond the follow-up period. The short follow-up data end up with censoring rates of 0.85, 0.95 and 0.79 for the three index populations.

Figure 3. The true and projected survival functions based on Monte Carlo and parametric model approaches for the three hypothetical index populations

Figure 4. The true and projected quality-adjusted survival estimates with bounds of one standard deviation for the three hypothetical index populations

For each setup we repeated the sampling and estimation procedures 300 times to obtain 300 projected estimates of quality-adjusted survival time and 300 bootstrap standard errors. The averages of these 300 QAS estimates and standard errors are denoted as $\widehat{\mathrm{QAS}}_t$ and $\mathrm{SE}(\widehat{\mathrm{QAS}}_t)$ for projection time $t$, respectively. We then assessed the accuracy of the Monte Carlo estimator by the relative deviation of $\widehat{\mathrm{QAS}}_t$ from the true mean QAS of the index population, $E(\mathrm{QAS}_t)$. The simulation results in Tables I, II and III and Figure 4 show that the projected short-term results are quite good in terms of relative bias. The relative biases, as expected, tend to increase as the extrapolation extends. However, small relative biases of 5-7 per cent are still found for the estimates even when restricted to a long projection of 300 months.
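As a worked check of the relative bias column, take the Table I entry for $N = 100$ and a 300-month projection. The calculation assumes that relative bias means the estimated value minus the true value, divided by the true value; the text describes it only as a relative deviation.

```latex
\frac{\widehat{\mathrm{QAS}}_t - E(\mathrm{QAS}_t)}{E(\mathrm{QAS}_t)}
  = \frac{122.30 - 131.45}{131.45}
  = \frac{-9.15}{131.45}
  \approx -0.070
```

This agrees with the tabulated value of -0.070 for that row.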

In population I with sample size 100, the projected QAS tends to be substantially underestimated at longer projection times. To study the bias of the projected estimates further, we may extend the follow-up period. Suppose that we have a longer follow-up of 240 months and have computed the logit of $W$, as depicted in Figure 5. We see that the slope of the logit of $W$ in population I has a slight upward change around the 140th month. This is why the projected estimates are too low for times beyond that point in population I. Therefore, seeking a more appropriate reference population and continuing the follow-up are necessary when high accuracy is needed for inference.

The true standard error of the estimator is estimated by the mean squares of these 300 QAS estimate deviations from the true $E(\mathrm{QAS}_t)$. Comparing the average of the 300 bootstrap standard errors, $\mathrm{SE}(\widehat{\mathrm{QAS}}_t)$, with the estimated true standard error, we see that the bootstrap approach to standard error estimation works well in populations I and III. For the 95 per cent censoring rate example of population II, the bootstrap standard error estimation tends to be too conservative. We also found that the main contribution of a larger sample size is in reducing the standard error; it is of limited help in reducing bias.

Table I. Simulation results for population I based on 60-month follow-up sample with censoring rate 0.85

| Sample size ($N$) | Projected months ($t$) | True $E(\mathrm{QAS}_t)$ | Estimated $E(\mathrm{QAS}_t)$ | Relative bias | True SE | Estimated SE |
|---|---|---|---|---|---|---|
| 100 | 0 | 48.64 | 48.13 | -0.011 | 1.46 | 1.99 |
| 100 | 60 | 84.47 | 82.86 | -0.019 | 6.63 | 10.12 |
| 100 | 120 | 108.45 | 103.90 | -0.042 | 15.81 | 20.94 |
| 100 | 180 | 122.50 | 115.16 | -0.060 | 24.33 | 30.53 |
| 100 | 240 | 129.19 | 120.44 | -0.068 | 30.06 | 37.36 |
| 100 | 300 | 131.45 | 122.30 | -0.070 | 32.65 | 40.79 |
| 300 | 0 | 48.64 | 48.45 | -0.004 | 0.85 | 0.82 |
| 300 | 60 | 84.47 | 84.50 | -0.000 | 3.64 | 3.79 |
| 300 | 120 | 108.45 | 106.92 | -0.014 | 9.02 | 9.06 |
| 300 | 180 | 122.50 | 118.63 | -0.032 | 14.64 | 14.01 |
| 300 | 240 | 129.19 | 123.69 | -0.043 | 18.50 | 17.28 |
| 300 | 300 | 131.45 | 125.26 | -0.047 | 20.17 | 18.74 |

Table II. Simulation results for population II based on 60-month follow-up sample with censoring rate 0.95

| Sample size ($N$) | Projected months ($t$) | True $E(\mathrm{QAS}_t)$ | Estimated $E(\mathrm{QAS}_t)$ | Relative bias | True SE | Estimated SE |
|---|---|---|---|---|---|---|
| 100 | 0 | 52.42 | 52.09 | -0.006 | 1.17 | 2.93 |
| 100 | 60 | 99.42 | 98.53 | -0.009 | 6.64 | 14.63 |
| 100 | 120 | 139.45 | 136.91 | -0.018 | 18.67 | 31.15 |
| 100 | 180 | 172.37 | 167.61 | -0.028 | 34.46 | 49.15 |
| 100 | 240 | 198.08 | 191.89 | -0.031 | 51.37 | 67.04 |
| 100 | 300 | 216.84 | 210.82 | -0.028 | 67.52 | 83.72 |
| 300 | 0 | 52.42 | 52.26 | -0.003 | 0.56 | 0.83 |
| 300 | 60 | 99.42 | 100.08 | 0.007 | 3.10 | 6.19 |
| 300 | 120 | 139.45 | 141.61 | 0.016 | 9.52 | 16.84 |
| 300 | 180 | 172.37 | 176.06 | 0.021 | 18.83 | 30.35 |
| 300 | 240 | 198.08 | 203.34 | 0.027 | 29.75 | 44.75 |
| 300 | 300 | 216.84 | 223.98 | 0.033 | 40.98 | 58.68 |

Although the methodology was developed for expected quality-adjusted survival estimation, it can be applied directly to survival function estimation simply by assuming that the quality of life function is a constant of one in the index population. Parametric model approaches for extrapolating the survival function in follow-up studies are popular and are also available in statistical software such as SAS and IMSL. For demonstration, we compared the Monte Carlo approach with three models, log-normal, Weibull and extreme value, in the FORTRAN library IMSL. Survival data in the above three hypothetical samples of size 300 were used for these approaches to project survival functions. In the simulation study, 100 repetitions were implemented, and the averages of the 100 estimated survival functions obtained from each of the two approaches are plotted with the true ones in Figure 3. It is very clear that the Monte Carlo approach performs quite well in long-term projection in these three hypothetical populations. It also does not seem easy to identify proper parametric models that give such accurate projections, although we have not tried all available models.

Table III. Simulation results for population III based on 60-month follow-up sample with censoring rate 0.79

| Sample size ($N$) | Projected months ($t$) | True $E(\mathrm{QAS}_t)$ | Estimated $E(\mathrm{QAS}_t)$ | Relative bias | True SE | Estimated SE |
|---|---|---|---|---|---|---|
| 100 | 0 | 47.19 | 47.11 | -0.002 | 1.02 | 1.21 |
| 100 | 60 | 67.91 | 68.50 | -0.009 | 5.48 | 5.76 |
| 100 | 120 | 73.23 | 74.19 | -0.013 | 9.38 | 9.55 |
| 100 | 180 | 74.23 | 75.53 | -0.017 | 11.10 | 11.42 |
| 300 | 0 | 47.19 | 47.19 | 0.000 | 0.63 | 0.65 |
| 300 | 60 | 67.91 | 69.10 | 0.018 | 3.40 | 3.24 |
| 300 | 120 | 73.23 | 74.77 | 0.021 | 5.60 | 5.37 |
| 300 | 180 | 74.23 | 75.88 | 0.022 | 6.37 | 6.19 |

Figure 5. Plots of logit of $W(t)$ calculated from samples (size = 100) of the three hypothetical populations


5. DISCUSSION

Survival data with heavy censoring are often encountered in follow-up studies of cohorts with long-term survival. Without other auxiliary information, there is probably little chance for complex techniques to produce any convincing result beyond the end of follow-up from heavily censored data. In this paper we proposed to borrow information from easily accessible data, such as vital statistics, to match the index population, to apply simple procedures including a logit transform of the ratio of quality-adjusted survivals for the reference and index populations, and to use the linearity of the transformed curve to make inferences. We successfully applied the methodology to data from three different types of hypothetical disease populations. In simulation, we predicted from 12-month to life-long results using a sample of 60-month follow-up with heavy censoring and compared these predictions with the true ones. Given a sample of 100 or 300, the relative biases of the projected estimates are within 5-7 per cent even for a long projection of 300 months. The true expected QAS beyond any time point of follow-up is also within one standard error of the projected estimate.

The methodology can be applied directly to estimation of the whole survival function in follow-up studies. Simulation studies showed quite convincing results on three hypothetical examples, compared with approaches based on popular parametric models. In fact, the usual parametric models may be suitable only for short-term projection, as shown in Figure 3, or for extrapolations based on a better fit to the tail part of the observed survival, as proposed by Gelber et al. Figure 3 also indicates that these approaches may not produce sufficiently accurate estimates for a longer time projection.

The performance of the proposed methodology is mainly determined by the availability of a proper reference database, but for most follow-up studies of chronic disease, using the life table of the general population as the reference is probably good enough. Besides, if a more valid reference population is available, it can still be used directly in our method, in a manner conceptually similar to the approach proposed by Mark et al. Therefore, our simple methodology is more feasible and may be a universal solution for practical applications.

The accuracy and precision of the Monte Carlo estimator for the expected QAS are also affected by the length of follow-up, the behaviour of the two tails of the true curve of logit W, and, of course, sample size. Among these factors, we note that the non-linear left tail, reflecting the patient's unstable quality of life in the beginning stage, is usually not uniform and must be adjusted for different diseases in terms of magnitude and extent. For diseases which require both surgery and chemotherapy, such as breast cancer, this unstable stage may last up to 1 or 1.5 years, depending on the time required for comprehensive diagnosis and treatment. For a chronic disease which only needs an initial diagnostic work-up and selection of a suitable therapeutic regimen, such as diabetes mellitus, this early unstable period may take only 2–3 months. The performance of the Monte Carlo estimator applied to a study with too short a follow-up period and too small a sample size is therefore not guaranteed for a long-term projection. In order to judge whether the Monte Carlo estimator of expected QAS is appropriate for a follow-up study, we suggest a diagnostic tool which checks the linearity of the logit of W, such as that depicted in Figure 5. If the linearity is significant, our approach can be applied to produce an accurate estimate of the expected QAS beyond the follow-up. On the other hand, the life-long projected QAS might be slightly underestimated when the true curvature of the right tail of the logit of W is large. This may happen when an inappropriate reference population is used, and it cannot be fully evaluated based only on the observed data. Although a larger sample size will increase the precision, the gain in reducing bias may be limited. Therefore, it may be necessary to continue the follow-up to ensure the accuracy. However, the estimator and its confidence limits still provide a ballpark idea of prognosis and may be useful for policy and resource planning.
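The linearity diagnostic suggested above can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the ratio W(t) of the quality-adjusted survival functions for the index and reference populations has already been estimated on a grid of follow-up times and lies strictly between 0 and 1. The function name `linearity_check`, the `skip_before` argument (used to drop the early unstable stage) and the use of R-squared as a rough summary of linearity are all hypothetical choices made for illustration; a formal significance test could be substituted.

```python
import math


def logit(p):
    """Log-odds of a proportion strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def linearity_check(times, w_ratio, skip_before=0.0):
    """Fit a least-squares line to logit(W(t)) for t >= skip_before.

    times       : follow-up times (same units as skip_before)
    w_ratio     : estimated W(t) at those times, each in (0, 1)
    skip_before : hypothetical cut-off for the early unstable stage

    Returns (intercept, slope, r_squared).
    """
    pts = [(t, logit(w)) for t, w in zip(times, w_ratio) if t >= skip_before]
    if len(pts) < 3:
        raise ValueError("need at least three points to judge linearity")
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pts)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pts)
    syy = sum((y - mean_y) ** 2 for _, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    # A perfectly flat logit curve is treated as perfectly linear.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return intercept, slope, r_squared


# Synthetic illustration only: W(t) built so that logit W(t) = 2 - 0.3 t.
demo_times = [0.5 * k for k in range(1, 21)]
demo_w = [1.0 / (1.0 + math.exp(-(2.0 - 0.3 * t))) for t in demo_times]
intercept, slope, r_squared = linearity_check(demo_times, demo_w, skip_before=1.0)
print(intercept, slope, r_squared)
```

An R-squared close to 1 over the stable part of the follow-up would support extrapolating along the fitted line. A value well below 1, or visible curvature in the right tail, would suggest that a longer follow-up or a different reference population is needed.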

ACKNOWLEDGEMENTS

The authors are particularly grateful to the referees and the editor who provided helpful comments and suggestions on this paper. This study was supported in part by the National Science Council of ROC, Taiwan.

REFERENCES

1. Gold, M. R., Siegel, J. E., Russell, L. B. and Weinstein, M. C. Cost-effectiveness in Health and Medicine, Oxford University Press, New York, 1996.

2. Patrick, D. L. and Erickson, P. Health Status and Health Policy: Allocating Resources to Health Care, Oxford University Press, New York, 1993.

3. Testa, M. A. and Simonson, D. 'Assessment of quality-of-life outcomes', New England Journal of Medicine, 334, 835–840 (1996).

4. Hwang, J. S., Tsauo, J. Y. and Wang, J. D. 'Estimation of expected quality-adjusted survival by cross-sectional survey', Statistics in Medicine, 15, 93–102 (1996).

5. Glasziou, P. P., Simes, R. J. and Gelber, R. D. 'Quality adjusted survival analysis', Statistics in Medicine, 9, 1259–1276 (1990).

6. Shaper, A. G., Wannamethee, S. G. and Walker, M. 'Body weight: implication for the prevention of coronary heart disease, stroke, and diabetes mellitus in a cohort study of middle aged men', British Medical Journal, 314, 1311–1317 (1997).

7. Sytkowski, P. A., D'Agostino, R. B., Belanger, A. J. and Kannel, W. B. 'Secular trends in long-term sustained hypertension, long-term treatment, and cardiovascular mortality: The Framingham heart study 1950 to 1990', Circulation, 93, 697–703 (1996).

8. Mazzaferri, E. L. and Jhiang, S. M. 'Long-term impact of initial surgical and medical therapy on papillary and follicular thyroid cancer', American Journal of Medicine, 97, 418–428 (1994).

9. Gelber, R. D., Goldhirsch, A. and Cole, B. F. 'Parametric extrapolation of survival estimates with applications to quality of life evaluation of treatments', Controlled Clinical Trials, 14, 485–499 (1993).

10. Mark, D. B., Hlatky, M. A., Califf, R. M. et al. 'Cost effectiveness of thrombolytic therapy with tissue plasminogen activator as compared with streptokinase for acute myocardial infarction', New England Journal of Medicine, 332, 1418–1424 (1995).

11. Wang, J. D. and Miettinen, O. S. 'Occupational mortality studies: principles of validity', Scandinavian Journal of Work and Environmental Health, 8, 153–158 (1982).

12. Hakama, M. and Hakulinen, T. 'Estimating the expectation of life in cancer survival studies with incomplete follow-up information', Journal of Chronic Diseases, 30, 585–597 (1977).

13. Lee, E. T. Statistical Methods for Survival Data Analysis, 2nd edn, Wiley, New York, 1992.

14. Efron, B. and Tibshirani, R. J. An Introduction to the Bootstrap, Chapman and Hall, New York, 1993.

15. IMSL Stat/Library, Visual Numerics, Inc., 1997.
