Sample size calculations for logistic and Poisson regression models

全文

(1)Biometrika (2001), 88, 4, pp. 1193–1199 © 2001 Biometrika Trust Printed in Great Britain. Sample size calculations for logistic and Poisson regression models B GWOWEN SHIEH Department of Management Science, National Chiao T ung University, Hsinchu, T aiwan 30050, R.O.C. [email protected] S A method is proposed for improving sample size calculations for logistic and Poisson regression models by incorporating the limiting value of the maximum likelihood estimates of nuisance parameters under the composite null hypothesis. The method modifies existing approaches of Whittemore (1981) and Signorini (1991) and provides explicit formulae for determining the sample size needed to test hypotheses about a single parameter at a specified significance level and power. Simulation studies assess its accuracy for various model configurations and covariate distributions. The results show that the proposed method is more accurate than the previous approaches over the range of conditions considered here. Some key words: Generalised linear model; Information matrix; Logistic regression; Maximum likelihood estimator; Poisson regression; Power; Sample size; Wald statistic.. 1. I The class of generalised linear models, first introduced by Nelder & Wedderburn (1972) and later expanded by McCullagh & Nelder (1989, Ch. 2), is specified by assuming independent scalar response variables Y (i=1, . . . , N) to have an exponential-family probability density of the form i exp[{Y h−b(h)}/a(w)+c(Y, w)]. (1) The expected value E(Y )=m is related to the canonical parameter h by the function m=b∞(h), where b∞ denotes the first derivative of b. The link function g relates the linear predictors g to the mean response g=g(m). The linear predictors can be written as g=b +XTb, 0 where X is a K×1 vector of covariates, and b and b=(b , . . . , b )T represent the corresponding 0 1 K K+1 unknown regression coefficients. The scale parameter w is assumed to be known. Assume (y , x ), for i=1, . . . , N, is a random sample from the joint distribution of (Y, X) with probability i i density function f (Y, X)= f (Y | X)f (X), where f (Y | X) has the form defined in (1) and f (X) is the probability density function for X. The form of f (X) is assumed to depend on none of the unknown parameters b and b. The likelihood function associated with the data is 0 N N L (b , b)= a f (y , x )= a f (y | x )f (x ). 0 i i i i i i=1 i=1 It follows from the standard asymptotic theory that the maximum likelihood estimator (b@ , b@ T)T 0 is asymptotically normally distributed with mean (b , bT)T and with variance–covariance matrix 0 given by the inverse of the (K+1)×(K+1) Fisher information matrix I(b , b), where the (i, j )th 0.

(2) G S. 1194 element of I is. A. B. ∂2 log L (i, j=0, . . . , K ). ∂b ∂b i j We wish to test the composite null hypothesis H : b =0 against the alternative hypothesis 0 1 H : b N0, while treating (b , b , . . . , b ) as a nuisance parameter. The Wald-type test of this 1 1 0 2 K hypothesis is based on b@ /{var(b@ )}1/2, where b@ is the first entry of b@ =(b@ , b@ , . . . , b@ )T and 1 1 1 1 2 K var(b@ ) is the second diagonal term of I−1(b@ , b@ ). The actual test is performed by referring the 1 0 statistic to its asymptotic distribution under the null hypothesis, which is the standard normal distribution. In general, there is no simple closed-form expression for Fisher’s information matrix except in some special cases. In view of their practical importance and the existence of explicit formulae, we restrict attention to the cases of logistic and Poisson regression models. For the logistic regression model, an approximate expression for Fisher’s information matrix was provided in Whittemore (1981). The approximation is based on the moment generating function of the distribution of the covariates and is valid when the overall response probability is small. A formula for determining the sample size is based on the resulting asymptotic variance of the maximum likelihood estimator of the parameters. Later, the technique was extended to the Poisson regression model in Signorini (1991). However, in this case, the expression of Fisher’s information matrix is exact and there is no restriction of use in terms of the overall response level. This paper presents a direct modification of the approaches of Whittemore (1981) and Signorini (1991) to the question of sample size calculations. The major difference between our approach and theirs is that the value of the nuisance parameter under the null model is different from that under the alternative model. We use the limiting value of the maximum likelihood estimator of (b , b , . . . , b ) under the 0 2 K constraint b =0 as specified in the null hypothesis. Self & Mauritsen (1988) and Self et al. (1992) 1 took a similar approach to the determination of sample sizes for the score statistic and likelihood ratio test, respectively, within the framework of generalised linear models. Although their methods are applicable to logistic and Poisson regression models, they assumed all the covariates in the model to be categorical with a finite number of covariate configurations. However, here we allow the covariate to be continuous or discrete with an infinite number of configurations, as in the cases of normal and Poisson covariates. In § 2, the methodology is described and the procedures are illustrated with examples. In § 3, simulation studies are performed and comparison made with the methods of Whittemore (1981), Signorini (1991) and Self et al. (1992). I =−E ij. 2. T   It was shown in Whittemore (1981) that I j N exp(b )E{X X exp(XTb)} (i, j=0, . . . , K ), ij 0 i j with X =1, for logistic regression with small response probabilities, while the equation is exact 0 for the case of Poisson regression as described by Signorini (1991). Based on this question and the moment generating function of the distribution of covariates, m(t)=E{exp(XTt)}, the asymptotic variance of the maximum likelihood estimator b@ of b can be expressed as V (b , b)/N, 1 1 0 where V (b , b)=n(b)/exp(b ); n(b) is the second diagonal element of M−1(b) with M=(m ), 0 0 ij m =m =m, m =m =m , m =∂m/∂t , and m =∂2m/∂t ∂t , for i, j=0, . . . , K. 00 0 i0 0i i i i ij i j In order to approximate the required sample size, we need to examine the asymptotic mean and variance of b@ under the null model, as in Self & Mauritsen (1988). Let (b* , b* , . . . , b* ) denote 1 0 2 K the solution of the equation lim N−1E{S (b , 0, b , . . . , b )}=0, where S represents derivaN2 N 0 2 K N tives of the loglikelihood function with respect to (b , b , . . . , b ) and E{.} denotes expectation 0 2 K taken with respect to the true value of (b , b , b , . . . , b ). We synthesise the ideas of Whittemore 0 1 2 K (1981) and Self & Mauritsen (1988) by incorporating the value (b* , b*) with b*=(0, b* , . . . , b* ) 0 2 K as the asymptotic mean under the null model. Thus, the sample size needed to test the hypothesis.

(3) Miscellanea. 1195. H : b =0 with specified significance level a and power 1−c against the alternative H : b N0 is 0 1 1 1 V 1/2(b* , b*)Z +V 1/2(b , b)Z 2 0 a/2 0 c , (2) N PM b 1 where Z is the 100(1−p)th percentile of the standard normal distribution. In contrast, Whittemore p (1981) and Signorini (1991) set the values of the nuisance parameter equal to (b , b(0)) with b(0)= 0 (0, b , . . . , b ), under both the null and alternative models. In our notation this gives 2 K n1/2(b(0))Z +n1/2(b)Z 2 a/2 c . N exp(−b ) (3) WS 0 b 1 In order to improve the accuracy, some modification was provided in Whittemore (1981). For the univariate case, K=1, the sample size is more accurately calculated as. q. r. q. N =N {1+2 exp(b )d(b )}, W1 WS 0 1. r. (4). where n1/2(0)+n1/2(b )R(b ) 1 1 , n1/2(0)+n1/2(b ) 1 R(b )=n(b ){m (2b )−2m−1(b )m (b )m (2b )+m−2(b )m(2b )m2 (b )}. 1 1 11 1 1 1 1 1 1 1 1 1 1 For the multivariate case, K2, the correction is too complicated, and a simple version was proposed for routine use: d(b )= 1. N =N {1+2 exp(b )}. (5) W2 WS 0 In summary, the actual implementation of the proposed approach is as follows. For a chosen logistic or Poisson regression model with specified parameter value (b , b), distribution of covariates 0 f (X) and required significance a and power 1−c, the minimum sample size N needed is determined from equation (2). The test statistic is b@ {N exp(b@ )/n(b@ )}1/2, (6) 1 0 which is referred to the standard normal distribution. The null hypothesis is rejected if the absolute value of the statistic exceeds Z . a/2 To illustrate the general formula we continue the sample size calculations in Whittemore (1981) and Signorini (1991) for logistic and Poisson regression models, respectively. Whittemore (1981) presented the problem of testing whether or not the incidence of coronary heart disease among white males aged 39–59 is related to their serum cholesterol level. During an 18-month follow-up period, the probability of a coronary heart disease event for a subject with the population mean serum cholesterol level is judged to be 0·07, and the cholesterol levels in this population are well represented by a standard normal distribution. Hence, we consider a simple logistic regression with g=b +Xb , b =log(0·07), and X~N(0, 1). To detect the odds ratio of 0 1 0 e0·1, corresponding to b =0·1, and e0·5, corresponding to b =0·5, for a subject with a cholesterol 1 1 level of one standard deviation above the mean at a significance level 0·05 and with power 0·95, equation (4) yields N =21 147 and N =839, respectively. Here W1 W1 1+(1+b2 ) exp(5b2 /4) 1 1 , n(b )=exp(−b2 /2). d(b )= 1 1 1 1+exp(−b2 /4) 1 To detect the same effects with the proposed method (2), the required sample sizes are N = PM 18 478 and N =662, respectively. The respective values of b* are −2·6549 and −2·5541. PM 0 Signorini (1991) discussed a study of water pollution in terms of the number of illnesses and infections contracted per swimming season for ocean swimmers versus non-ocean or infrequent swimmers. We continue to consider a simple Poisson regression for the number of infections, with.

(4) G S. 1196. Table 1. Calculated sample sizes and estimates of actual power at specified sample size for logistic regression Proposed method Power 0·90 0·95 Sample sizea (N , N , N ) PM W1 S Nominal powerb at N PM Estimated power Error. 1739 0·9000 0·8802 −0·0198. 2165 0·9500 0·9392 −0·0108. Sample sizea (N , N , N ) PM W1 S Nominal powerb at N PM Estimated power Error. 364 0·8997 0·8792 −0·0205. 441 0·9499 0·9290 −0·0209. Sample sizea (N , N , N ) PM W1 S Nominal powerb at N PM Estimated power Error. 411 0·9002 0·8612 −0·0390. 507 0·9500 0·9222 −0·0278. Sample sizea (N , N , N ) PM W2 S Nominal powerb at N PM Estimated power Error. 5785 0·9000 0·9216 0·0216. Sample sizea (N , N , N ) PM W2 S Nominal powerb at N PM Estimated power Error Sample sizea (N , N , N ) PM W2 S Nominal powerb at N PM Estimated power Error. Whittemore (1981) Power 0·90 0·95 Bernoulli 2405 0·7737 0·8802 0·1065. (0·5) 2923 0·8644 0·9392 0·0748. Self et al. (1992) Power 0·90 0·95 1938 0·8667 0·8658 −0·0009. 2396 0·9288 0·9304 0·0016. Normalised Poisson (5)c 530 634 0·7302 0·8260 0·8792 0·9290 0·1490 0·1030. 484 0·8020 0·8404 0·0384. 599 0·8711 0·9022 0·0311. Standard normald 547 666 0·7933 0·8755 0·8612 0·9222 0·0679 0·0467. 545 0·8030 0·8346 0·0316. 675 0·8777 0·9014 0·0237. Multinomial (0·76, 6929 6305 0·9500 0·8684 0·9542 0·9216 0·0042 0·0532. 0·19, 0·01, 0·04) 7545 5783 0·9290 0·9001 0·9542 0·8834 0·0252 −0·0167. 7152 0·9439 0·9262 −0·0177. 3311 0·9000 0·9274 0·0274. Multinomial (0·40, 4075 3869 0·9500 0·8457 0·9602 0·9274 0·0102 0·0817. 0·10, 0·10, 0·40) 4711 3245 0·9152 0·9056 0·9602 0·9170 0·0450 0·0114. 4014 0·9528 0·9546 0·0018. 1726 0·8999 0·8790 −0·0209. Multinomial (0·25, 2150 2315 0·9500 0·7875 0·9376 0·8790 −0·0124 0·0915. 0·25, 0·25, 0·25) 2813 1947 0·8756 0·8627 0·9376 0·8612 0·0620 −0·0015. 2408 0·9260 0·9256 −0·0004. a Sample sizes needed to achieve power 0·9 and 0·95, respectively. b Nominal powers at calculated sample sizes of the proposed method in a. c The categorical approximation of Po(5) is (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) with probabilities (0·0404, 0·0842, 0·1404, 0·1755, 0·1755, 0·1462, 0·1044, 0·0653, 0·0363, 0·0318). d The categorical approximation is (−2·2, −1·7, −1·2, −0·7, −0·2, 0·2, 0·7, 1·2, 1·7, 2·2) with probabilities (0·0228, 0·0441, 0·0918, 0·1499, 0·1915, 0·1915, 0·1499, 0·0918, 0·0441, 0·0228).. a single Bernoulli covariate indicating ocean swimming if X=1 and with p=pr(X=1)=0·5. In this case, the estimated infection rate of non-ocean or infrequent swimmers is 0·85, corresponding to b =log(0·85), and the ratio of mean response for X=1 to mean response for X=0 is 1·3, 0 corresponding to b =log(1·3). It follows from equation (3) with n(b )={p exp(b )}−1+(1−p)−1 1 1 1 that the sample sizes N needed for a significance level 0·05 with power 0·80, 0·90 and 0·95 are WS 518, 685 and 841, respectively. Our equation (2) gives 469, 629 and 779 at the three corresponding power levels and b* =−0·0228. 0.

(5) Miscellanea. 1197. Table 2. Calculated sample sizes and estimates of actual power at specified sample size for Poisson regression Proposed method Power 0·90 0·95 Sample sizea (N , N , N ) PM WS S Nominal powerb at N PM Estimated power Error. 1835 0·9001 0·8880 −0·0121. 2285 0·9500 0·9494 −0·0006. Sample sizea (N , N , N ) PM WS S Nominal powerb at N PM Estimated power Error. 389 0·8999 0·8966 −0·0033. 472 0·9498 0·9283 −0·0116. Sample sizea (N , N , N ) PM WS S Nominal powerb at N PM Estimated power Error. 437 0·8997 0·8762 −0·0235. 541 0·9500 0·9404 −0·0096. Sample sizea (N , N , N ) PM WS S Nominal powerb at N PM Estimated power Error. 6165 0·9000 0·9462 0·0462. Sample sizea (N , N , N ) PM WS S Nominal powerb at N PM Estimated power Error Sample sizea (N , N , N ) PM WS S Nominal powerb at N PM Estimated power Error. Signorini (1991) Power 0·90 0·95 Bernoulli 2354 0·8069 0·8880 0·0811. (0·5) 2861 0·8905 0·9494 0·0589. Self et al. (1992) Power 0·90 0·95 1856 0·8968 0·8918 −0·0050. 2295 0·9492 0·9514 0·0022. Normalised Poisson (5)c 464 554 0·8299 0·9060 0·8966 0·9382 0·0667 0·0322. 455 0·8505 0·8802 0·0297. 563 0·9104 0·9278 0·0174. Standard normald 508 619 0·8485 0·9185 0·8762 0·9404 0·0277 0·0219. 512 0·8502 0·8748 0·0282. 633 0·9154 0·9390 0·0203. Multinomial (0·76, 7383 6218 0·9500 0·8971 0·9706 0·9462 0·0206 0·0491. 0·19, 0·01, 0·04) 7442 4933 0·9483 0·9519 0·9706 0·9364 0·0223 −0·0155. 6101 0·9776 0·9656 −0·0120. 3537 0·9000 0·9342 0·0342. Multinomial (0·40, 4353 3963 0·9500 0·8615 0·9692 0·9342 0·0192 0·0727. 0·10, 0·10, 0·40) 4825 3143 0·9265 0·9305 0·9692 0·9400 0·0427 0·0095. 3886 0·9682 0·9710 0·0028. 1835 0·9001 0·8904 −0·0097. Multinomial (0·25, 2285 2354 0·9500 0·8069 0·9484 0·8904 −0·0016 0·0835. 0·25, 0·25, 0·25) 2861 1856 0·8905 0·8968 0·9484 0·8954 0·0579 −0·0014. 2295 0·9492 0·9506 0·0014. a Sample sizes needed to achieve power 0·9 and 0·95, respectively. b Nominal powers at calculated sample sizes of the proposed method in a. c The categorical approximation of Po(5) is (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) with probabilities (0·0404, 0·0842, 0·1404, 0·1755, 0·1755, 0·1462, 0·1044, 0·0653, 0·0363, 0·0318). d The categorical approximation is (−2·2, −1·7, −1·2, −0·7, −0·2, 0·2, 0·7, 1·2, 1·7, 2·2) with probabilities (0·0228, 0·0441, 0·0918, 0·1499, 0·1915, 0·1915, 0·1499, 0·0918, 0·0441, 0·0228).. 3. S  The finite-sample adequacy of our formula was assessed through simulations, in which we also compared the proposed method to those of Whittemore (1981), Signorini (1991) and Self et al. (1992). The results are presented in Tables 1 and 2 for logistic and Poisson regression models, respectively. For both regression models, two linear predictors are examined, namely g=b +X b and g= 0 1 1 b +X b +X b . In the case of the simple linear predictor g=b +X b , we consider Ber(0·5), 0 1 1 2 2 0 1 1 normalised Po(5) and standard normal distributions for the covariate X . For the second predictor 1.

(6) 1198. G S. g=b +X b +X b , the joint distribution of (X , X ) is assumed to be multinomial with prob0 1 1 2 2 1 2 abilities p , p , p and p , corresponding to (x , x ) values of (0, 0), (0, 1), (1, 0) and (1, 1), respect1 2 3 4 1 2 ively. Three sets of (p , p , p , p ) are studied to represent different distributional shapes, namely 1 2 3 4 (0·76, 0·19, 0·01, 0·04), (0·4, 0·1, 0·1, 0·4) and (0·25, 0·25, 0·25, 0·25). Both the parameter of interest b and the confounding parameter b are taken to be log 2. The intercept parameter b is chosen 1 2 0 to satisfy the overall response m: =0·05, where m: =E[exp(g)/{1+exp(g)}] and m: =E{exp(g)} for the logistic and Poisson regression models, respectively. First, we calculated from equations (2)–(5) the sample sizes (N , N , N , N ) required to PM WS W1 W2 achieve the selected significance 0·05 and power (0·90, 0·95) within the model specifications. Let N S denote the sample size for the approach of Self et al. (1992). Since they assumed all of the covariates to be categorical with a finite number of configurations, discretisation schemes are needed for the cases of Poisson and standard normal covariates; the chosen schemes are listed in the footnotes of Tables 1 and 2. These estimates of sample size allow comparison of relative efficiencies of the approaches. Since the magnitude of the sample size affects the accuracy of the asymptotic distribution and the resulting formulae, we unify the sample sizes in the simulations; the sample size N is chosen as the benchmark and is used to recalculate the nominal powers for all competing PM approaches. Estimates of the true power associated with given sample size and model configuration are then computed through Monte Carlo simulation based on 5000 independent datasets. For each replicate, N covariate values are generated from the selected distribution. These covariate values determine PM the incidence rates for generating N Bernoulli or Poisson outcomes. Then the test statistic is PM computed and the estimated power is the proportion of the 500 replicates whose test statistic values exceed the critical value. The adequacy of the sample size formula is determined by the difference between the estimated power and nominal power specified above. All calculations are performed using programs written with SAS/IML (SAS Institute, 1989). The results in Tables 1 and 2 suggest that there is a close agreement between the estimated power and the nominal power for the proposed method regardless of the model configuration and covariate distribution. The only exceptions are with the extremely skewed multinomial covariate distribution (0·76, 0·19, 0·01, 0·04). The approach of Self et al. (1992) is also very good at achieving the nominal levels, but the approaches of Whittemore (1981) and Signorini (1991) incur much larger errors. We conclude that the proposed method maintains the accuracy within a reasonable range of nominal power and is much more accurate than the previous approaches proposed by Whittemore (1981) and Signorini (1991). Nevertheless, unbalanced allocations appear to degrade the accuracy of sample size calculations.. A I would like to thank the editor and referees for their helpful comments that improved the presentation of this paper. This work was partially supported by the National Science Council of Taiwan.. R MC, P. & N, J. A. (1989). Generalized L inear Models, 2nd ed. London: Chapman and Hall. N, J. A. & W, R. W. M. (1972). Generalized linear models. J. R. Statist. Soc. A 135, 370–84. SAS I (1989). SAS/IML Software: Usage and Reference, Version 6. Carey, NC: SAS Institute Inc. S, S. G. & M, R. H. (1988). Power/sample size calculations for generalized linear models. Biometrics 44, 79–86..

(7) Miscellanea. 1199. S, S. G., M, R. H. & O, J. (1992). Power calculations for likelihood ratio tests in generalized linear models. Biometrics 48, 31–9. S, D. F. (1991). Sample size for Poisson regression. Biometrika 78, 446–50. W, A. S. (1981). Sample size for logistic regression with small response probability. J. Am. Statist. Assoc. 76, 27–32.. [Received March 2000. Revised April 2001].

(8)