On power and sample size calculations for likelihood ratio tests in generalized linear models

(1)

On Power and Sample Size Calculations for Likelihood

Ratio

Tests

in Generalized Linear Models

Gwowen Shieh

Department of Management Science, National Chiao Tung University,

Hsinchu, Taiwan 30050, Republic of China

email:gwshieh@cc.nctu.edu.tw

SUMMARY. A direct extension of the approach described in Self, Mauritsen, and Ohara (1992, Biometrics 48, 31-39) for power and sample size calculations in generalized linear models is presented. The major feature of the proposed approach is that the modification accommodates both a finite and an infinite number of covariate configurations. Furthermore, for the approximation of the noncentrality of the noncentral chi-square distribution for the likelihood ratio statistic, a simplification is provided that not only reduces substantial computation but also maintains the accuracy. Simulation studies are conducted t o assess the accuracy for various model configurations and covariate distributions.

KEY WORDS: Generalized linear models; Likelihood ratio test; Logistic regression; Noncentral chi-square; Poisson regression; Sample size; Score test; Statistical power.

1. Introduction

Generalized linear models were first introduced by Nelder and Wedderburn (1972) and are broadly applicable in almost all scientific fields. (See McCullagh and Nelder (1989) for fur- ther details.) The class of generalized linear models is specified by assuming independent scalar response variables Yi

,

i = 1,. . .

,

N , follow a probability distribution belonging to the exponential family of probability distributions with probability density of the form

e x p [ W - b ( 4 ) / 4 + )

+

4 y ,

4)l.

(1.1) The expected value E(Y) = p is related t o the canonical pa- rameter 0 by the function p = b’(B), where b’ denotes the first derivative of b. The link function g relates the linear predic- tors 17 to the mean response 17 = g ( p ) . The linear predictors can be written as

17 = ZT$

+

XTX,

_(1.2)

where Z ( p x 1) and X ( q x 1) are vectors of covariates, and $

( p x 1) and X (q x 1) represent the corresponding unknown regression coefficients. The scale parameter

#I

is assumed known. We wish to test the null hypothesis of Ho: $ = $0 against the alternative hypothesis HI: $

#

$ 0 , while treating X as a nuisance parameter.

For the purpose of power and sample size calculations within the framework of generalized linear models, two major tests have been proposed. They are the score test and likelihood ratio statistic developed by Self and Mauritsen (1988) and Self, Mauritsen, and Ohara (1992), respectively. However, these two approaches are limited t o models where the number of covariate configurations is finite. This is somehow impracti-

cal since it is quite common for generalized linear models used in medical and clinical research t o include continuous covari- ates as confounding factors, which have an infinite number of covariate configurations. For example, Whittemore (1981) illustrated the sample-size calculations for logistic regression with the problem of testing whether the incidence of coronary heart disease among white males aged 39-59 is related to their serum cholesterol and triglyceride levels. Also, previous studies indicate the joint distribution of cholesterol and log triglyceride is well presented by a bivariate normal distribution (see Hulley et al. (1980) for a thorough description of the analysis and other possible risk factors). In this case, to apply the approaches proposed by Self and Mauritsen (1988) and Self et al. (1992), one may apply a class grouping scheme over the range of covariate configurations. Such a strategy results in a categorical approximation of the true covariate distribution; hence, they are then still applicable with the consensus that the categorization is arbitrary. At first look, this seems to be a questionable approach because information about the actual serum cholesterol and triglyceride levels is thrown away. Furthermore, the interrelation between these two covariates may be distorted t o some extent. Consequently, these two approaches do not fully exploit the distribution information about continuous covariates when it is available. More impor- tantly, it is not clear how the results will be affected for uti- lizing a categorical approximation, not to mention that there is no unified rule for categorizing the covariates into a finite number of configurations.

In the present article, we generalize the Self et al. (1992) approach t o accommodate covariates with an infinite number of configurations. With this natural modification, one can perform power and sample-size calculations in generalized linear

1192

(2)

Power and Sample Size Calculations

1193

models with discrete and/or continuous covariates without any subjective or arbitrary class grouping process. In Sec- tion 2, the model and an approximation t o the distribution of the likelihood ratio statistic are described. Section 3 pro- vides the details of implementation. In Section 4, simulation studies are performed and results are presented that evaluate the adequacy of the proposed approach under various covariate distributions with an infinite number of configurations. Section 5 contains some remarks.

2. Model and Approximation

Consider a generalized linear model consisting of the response yi and covariate (zi,xi) defined in (1.1) and (1.2), respec- tively] for i = 1 , .

. .

,

N . Assume (yi, zi, xi) is a random sample from the joint distribution of (Y, Z, X) with p.d.f.

f(Yl Z ,

X ) =

f ( Y

I

Z, X) . f ( Z , X), where

f(Y

I

Z,

X) has the form defined

in (1.1) and

f ( Z , X)

is the joint p.d.f. for Z and X. The form of

f

(Z,

X)

is assumed to depend on none of the unknown pa- rameters

+

and A. The joint likelihood function of these N subjects is

N N

U+,N

= n f ( Y i , z i , x i ) = Hf(Yi

I

% , X i ) . f ( Z i I X i ) .

i=l i = l

It follows that the likelihood ratio statistic is 2{l(4,A) -

1(&, Ao)}, where l ( + ] A) is the log-likelihood function based

on L(+, A) and

(4,

A)

and

(+o,

Ao)

are the maximum likelihood estimators of (+, A) under the alternative and null mod-

els, respectively. The actual likelihood ratio test statistic is referred t o its asymptotic distribution under the null hypoth- esis, which is a central chi-square distribution with p d.f. In order to perform power analysis, one needs t o derive the distribution of the likelihood ratio statistic under the alternative hypothesis. Our formulation is analogous to that of Self et al. (1992). We approximate the distribution of the likelihood ratio statistic by a noncentral chi-square distribution with p d.f. The noncentrality parameter used in the approximation is computed by equating the expected value of a noncentral chi- square random variable to a n approximation of the expected value of the likelihood ratio statistic. The expected value of the likelihood ratio statistic is approximated by the expected value of lead terms in an asymptotic expansion of the likelihood ratio statistic under the alternative hypothesis. As in Self et al. (1992), the likelihood ratio statistic is decomposed into three terms,

where A: is the limiting value of A0 as described in Self and Mauritsen (1988). To approximate the expected value of the first term in (2.1), only the lead term in Cordeiro's (1983) expansion for generalized linear models is retained, and it results in a value of p

+

q.

The approximation of the second term is more troublesome because none of the expansions in Bartlett (1953), Lawley (1956), and Cordeiro (1983) are performed about the param- eter (@o,Ag). However, the expected value of the first term in the expansion is the trace of two q x q matrices, l 2 - I and

8, tr(X-'=), where

- [b'(0) - b'(0*)]

(;;:)

__

1

and

E[.] denotes the expectation taken with respect t o the joint distribution of (Yl, . .

. ,

Y N , Z , X) under the alternative hypothesis with true parameter values (+, A), and E(Z,x,[-] denotes the expectation taken with respect to the joint distribu- tion of (Z, X)

,

which is not dependent on the value of (+

,

A). Also, b" denotes the second derivative of b, 0 and

t?*

denote the canonical parameter values evaluated at (+, A) and (+a, A;), respectively, and Q* denotes the linear predictor evaluated at

(+o,Xg). It was found in Self et al. (1992) and Shieh and

O'Brien (1998) that the value of tr(E-'E) is very close t o q

for certain parameter values and discrete covariate distributions in several generalized linear models. This phenomenon is fortified here from the numerical results in this paper. We found that there is essentially no practical difference in the adequacy for power and sample size calculations by replacing it with q.

The third term does not involve any maximum likelihood estimators of (+,A). Its expectation can be written as N A * , where

A* = E ( z , x ) [2a-l(4){b'(0)[0 - t?*] - [b(t?) - b(t?*)]}] .

(2.2) By equating the expected value of a noncentral chi-square ran- dom variable with p d.f. and noncentrality y, namely p

+

y, to the respective expected value approximations of (2.1) derived above, which is (p

+

q )

-

q

+

N A *

.

This leads to an estimate of noncentrality, denoted by Y N , of the proposed noncentral chi-square distribution for the likelihood ratio statistic under the alternative hypothesis, i.e., yp.~ = N A * . The subscript N of 7~ represents its dependency on sample size N . Hence, for given parameter values (+, A) and covariate distribution

f(Z,

X), the actual likelihood ratio statistic of sample size N

under the alternative hypothesis is performed by referring it t o a noncentral chi-square distribution with p d.f. and noncen- trality N A * . Our approach differs from Self et al. (1992) in two respects. First, they proposed considering only categorical covariates, which are restricted t o have a finite number of configurations such as Bernoulli or multinomial distribu- tions. This is naturally extended here since the joint distribution of covariates ( Z , X ) could be either discrete or continuous with an infinite number of configurations, e.g., Pois- son and normal distributions. The expression of A* in (2.2) subsumes Self et al.'s equation (2.3) as a special case when

(3)

the joint distribution of (Z,

X)

is categorical with probabili- ties r j j , j = 1,. .

. ,

m. Second, their noncentrality estimate is N A *

+

q - tr(Z-'Z), whereas ours is simply NA*.

3. Implementation

In this section, we shall describe the necessary steps t o imple- ment the proposed approach. For a generalized linear model with specified parameter values ($, A) and chosen covariate

distribution f ( Z ,

X),

the sample size needed to test hypothesis Ho: $ =

+o

with specified significance level a and power 1 -

p

against the alternative H i : $ # $0 is computed as follows. First, find the lOO(1 - a ) t h percentile of a central

chi-square distribution with p d.f., denoted by

&,.

Next, find the noncentrality Y N of a noncentral chi-square distribution with p d.f. such that the 100. Pth percentile, denoted by X ; , ~ ( Y N ) , equals

&-,.

Then the sample size estimate N is computed as YN/A*, where A* is as defined in (2.2). For continuous covariate distributions, numerical integration is needed to carry out the expectation for A*.

4. Simulation Studies

Simulation studies are performed for evaluating the accuracy of the proposed approach for logistic regression models and Poisson regression models with an infinite number of covariate configurations. For illustrative purposes, we will restrict our attention to the logistic regression models here.

Two sets of linear predictors of the form q = XI

+

Z$

and 77 = XI

+

XXs

+

Zq5 are examined. In the case of the simple linear predictor, 7 = XI

+

Z$J, we consider normal, double exponential, exponential, and Poisson distributions for covariate 2. For the second linear predictor, 7 = XI

+

XXs

+

Z $ , the joint distribution of (2, X ) is of the form f(2, X) = f ( X

I

Z)f(Z). We assume 2 is a Bernoulli covariate with

p ( Z = 1) = 7r for 7r = 0.1, 0.5, and 0.9. The conditional

distribution of X given 2, denoted by [X

1

21, is [ X

1

2 = 11 [ X

I

2 = 01 + d , where [X

I

2 = 01 is a standardized version of

a normal, double exponential, exponential, or Poisson with a

mean of 10 random variables and d is described in the footnote of Table 2.

The parameter of interest, $, is taken to be log(2) and

log(5) for the two linear predictors, respectively. The confounding parameter As in the second model is set as log(2). The intercept parameter, XI, is chosen to satisfy different values of overall response probability ,!i = EtZ,x,[exp(q)/{l

+

For a given model, covariate distribution, regression coefficients, and overall response probability, the estimates of sample size required for testing Ho: $ = 0 against the alternative hypothesis HI: $

# 0 with significance level

0.05 and power 0.80, 0.90, and 0.95 are calculated. Due to the rounding of sample sizes, the precise nominal powers are not exactly 0.80, 0.90, and 0.95. They are recalculated with the inversion of the proposed formula discussed in Section 3.

Estimates of actual power associated with a given sample size and model configurations are then computed through Monte Carlo simulation based on 5000 replicate data sets. The adequacy of the sample size formula is determined by the difference between the estimated power and nominal power specified above. All calculations are performed using programs written with SAS/IML (SAS, 1989).

The results of the simulation studies are presented in Ta- bles 1 and 2. Table 1 contains results for the simple linear predictor, while Table 2 contains results of the multiple lin-

ear predictor for p ( 2 = 1) = 0.5. In general, the sample sizc: needed to achieve the significance level and power is larger for overall response probability

fi

= 0.02 than for ,!i = 0.15. However, for most occasions, the absolute errors of overall response probability ,G = 0.02 are larger than those of

p

= 0.15. Hence, the proposed method is comparatively more accurate for larger overall response probability. Generally, it maintains the accuracy within a reasonable range of nominal power for both cases. The simulation results for logistic regression in different settings and for Poisson regression models also suggest the proposed method performs well and may be of practical use; however, they are not reported here.

5. Discussion

We propose in this article an approach for sample size and power calculations in generalized linear models. This approach is a direct extension of the work by Self et al. (1992) to ac- exp(l7))I.

Table 1

Calculated sample sizes and estimates of actual power f o r the logistic regression model with linear predictor qa = XI

+

2$

Distribution of Zb

Normal Double exponential Exponential Poisson

ji = 0.02 Sample size 849 1136 1405 668 894 1105 Nominal power ,8003 ,9001 ,9501 3004 ,9002 .9500 Estimated power ,7774 .8818 .9406 .7504 3602 ,9168 Error -.0229 -.0183 -.0095 -.0500 -.0400 -.0332 ,!i = 0.15 Sample size 141 189 233 139 186 230

Nominal power .8011 .go12 ,9502 .8018 ,9012 ,9507 Estimated power .7930 .9016 .9470 .7968 .8908 .9444 Error -.0081 ,0004 -.0032 -.0050 -.0104 -.0063

a i = log(2) and E(Z) b P ( l ? ) / { l + exp(l?)Il =

B.

The distribution of 2 is standardized to have mean zero and variance one.

368 ,8009 .7190 -.0819 101 ,8002 .7742 -.0260 492 ,9003 3328 -.0675 136 .go18 3768 -.0250 608 567 ,9500 ,8006 ,8888 .7586 -.0612 -.0420 - 168 113 .9306 3002 ,9306 .7908 -.0203 -.0094 - 758 ,9001 .8534 ..0467 152 .go15 .8890 -.0125 938 ,9502 .9144

-

,0358 187 ,9500 .9338 -.0162

(4)

Power and Sample Size Calculations

1195

Table 2

Calculated sample sizes and estimates of actual power f o r the logistic regression model with linear predictor

va

= XI

+

X X s

-I- Z $ , where Z has a Bernoulli distribution (T = 0.5)

Distribution of [ X

1

ZIb

Normal Double exponential Exponential Poisson

,ii= 0.02

Sample size 2250 3011 3724 1552 2078 2569 1460 1955 2417 648 867 1073

Nominalpower 3002 ,9000 .9500 ,8001 .go01 ,9500 ,8001 .go01 ,9500 ,8003 ,9000 .9502 Estimatedpower ,8354 .9280 ,9680 ,8420 ,9252 ,9744 .8326 ,9328 ,9728 3372 .9218 ,9692 Error ,0352 .0280 .OH0 .0419 .0251 .0244 ,0325 .0327 ,0228 .0369 ,0218 ,0190

fi

= 0.15 Sample size 272 364 450 207 276 342 194 260 322 129 172 213 Nominal power ,8004 .9002 ,9501 ,8017 .9001 .9504 ,8000 .9003 .9505 3027 ,9008 .9508 Estimatedpower 3262 ,9150 .9610 3168 ,9114 ,9582 3226 .9082 .9576 ,8092 ,9072 .9540 Error ,0258 .0148 ,0109 ,0151 .0113 .0078 .0226 ,0079 .0071 ,0065 ,0064 .0032

a ?L = log(51, As = log(% and E ( z , x ) [exp(rl)/(l+ e v ( r l ) l l =

P .

The distribution [ X

I

Z = 01 is standardized to have mean zero and variance one. The distribution of [ X

I

2 = 11 F [ X

I

2 = 01

+

d ,

where d is 1.6832, 1.2958, 1.3863, and 5/(10)1/2 for normal, double exponential, exponential, and Poisson distributions, respectively.

commodate covariate distributions with an infinite number of configurations. Their approach is restricted to the generalized linear models with a finite number of covariate configurations such as Bernoulli and multinomial distributions. Furthermore, we modify the approximation of the noncentrality parameter in a noncentral chi-square distribution of the likelihood ratio statistic. This simple structure permits computational sim- plifications and maintains great accuracy based on the simulation results for different settings of logistic regression and Poisson regression models.

For generalized linear models with continuous covariates of natural interval and ratio measurement scales, some re- searchers may prefer to work with a categorical approximation by grouping the range of covariate values into finite intervals and then choosing representative class values (usually the class midpoints) and proportions for each class. This process will make the Self et al. (1992) approach still applicable for models with an infinite number of covariate configurations. However, there is no consensus in determining the covariate distribution approximation in terms of numbers of classes, the choices of class boundaries, and the class representative values. Consequently, one classification scheme may perform well for some cases but do poorly for others. Along with the proposed approach, we have simultaneously evaluated two different classification schemes for each of the four covariate distributions in the simulation studies. The results indicate that the proposed approach outperforms those two with categorical approximations of covariate distributions for most of the cases that we have considered. Therefore, when the covariate distributions are available, one should incorporate such information into the sample size calculations instead of their categorical approximations. However, as pointed out by the referee, the latter may be more robust when the distribution information about covariates is not accurately known. In such cases, one may try several different settings of finite configurations t o provide guidance about the sample sizes required for the study.

ACKNOWLEDGEMENTS

The author wishes t o thank the associate editor and referees for their helpful comments and Dr Ralph O’Brien for enlight- ening discussions.

RESUME

Cet article prksente une extension directe de l’approche dkcrite dans Self, Mauritsen and Ohara (1992, Biornetrics 48, 31-39) pour les calculs de puissance et de taille d’kchantillon pour les modkles linhaires gknkralisks. L’apport majeur de l’approche propos6e est la modification qui permet aussi bien un nom- bre fini qu’infini de configurations de covariables. De plus il est aussi propos6 une simplification pour l’approximation du paramktre de non centralit6 de la distribution du chi-deux non centrk, approximation qui non seulement rkduit les calculs de manikre apprkciable, mais aussi conserve la prkcision. Des ktudes de simulation sont faites pour Bvaluer cette pr6cision pour diff6rentes configurations de modkles et de distributions des covariables.

REFERENCES

Bartlett, M. A. (1953). Approximate confidence intervals. 11. More than one unknown parameter. Biometrika 40,306- 317.

Cordeiro, G. M. (1983). Improved likelihood ratio statistics for generalized linear models. Journal of the Royal Statistical Society, Series B 45, 404-413.

Hulley, S. B., Rosenman, R. A , , Bawol,

R., and Brand,

R.

J. (1980). Epidemiology as a guide t o clinical decisions: The association between triglyceride and coronary heart disease. N e w England Journal of Medicine 302, 1383- 1389.

Lawley, D. N. (1956). A general method for approximating the distribution of likelihood ratio criteria. Biometrika McCullagh, P. and Nelder, J. A. (1989). Generalized Linear

43, 295-303.

(5)

Nelder, J.

A.

and Wedderburn, R. W. M. (1972). Generalized linear models. Journal of the Royal Statistical Society, Series A 135, 370-384.

SAS. (1989). SAS/IML Software: Usage and Reference, Ver- sion 6 . Carey, North Carolina: SAS Institute.

Self, S.

G.

and Mauritsen, R. H. (1988). Power/sample size calculations for generalized linear models. Biometrics 44, Self, S. G., Mauritsen, R. H., and Ohara, J. (1992). Power calculations for likelihood ratio tests in generalized linear models. Biornetrics 48, 31-39.

79-86.

Shieh, G. and O’Brien,

R.

G. (1998).

A

simpler method t o compute power for likelihood ratio tests in generalized linear models. Paper presented at the Annual Joint Sta- tistical Meetings of the American Statistical Association, Dallas, Texas.

Whittemore, A . S. (1981). Sample size for logistic regression with small response probability. Journal of the American Statistical Association 76, 27-32.

Received August 1999. Revised January 2000. Accepted March 2000.