© 2010 The Psychonomic Society, Inc.
In view of the widespread recognition and increased use of moderated multiple regression (MMR) in behavioral and related disciplines, various attempts have been made to address methodological and computational issues in the detection of interaction effects. It is evident from the comprehensive review of Aguinis, Beaty, Boik, and Pierce (2005) that MMR studies focus mainly on null hypothesis significance testing for drawing conclusions about moderating effects. This dominance of hypothesis testing for making statistical inferences does not occur exclusively in MMR analysis. It more broadly reflects the longstanding and prevalent reliance of applied research on significance tests across many scientific fields. However, the dichotomous accept–reject decision of null hypothesis significance testing ignores other useful information in its analysis. As an alternative, confidence intervals are more informative about the location and precision of the statistic, and they are the best reporting strategy according to the recommendations of Wilkinson and the American Psychological Association Task Force on Statistical Inference (1999), as well as of the Publication Manual of the American Psychological Association (APA, 2001). Consequently, the notion of interval estimation has been stressed repeatedly in the literature on education, psychology, and social sciences. For example, see Algina and Olejnik (2000), Kelly and Maxwell (2003), Smithson (2001), and Steiger and Fouladi (1997) for in-depth discussions on constructing confidence intervals for the squared multiple correlation coefficient, regression coefficient, and related parameters within the multiple regression framework.
The most common application of MMR is in the context of simple interaction models with criterion variable $Y$, predictor variable $X$, moderator variable $Z$, their cross product term $XZ$, and an error term $\varepsilon$ in the formulation of $Y = \beta_I + X\beta_X + Z\beta_Z + XZ\beta_{XZ} + \varepsilon$. The moderator $Z$ is essentially the second predictor variable hypothesized to moderate the $X$–$Y$ relationship. In the present article, we consider the situation in which both the predictor $X$ and the moderator $Z$ are continuous variables, since it is applicable to a wide range of problems encountered in applied research. Because of the nature of continuous measurements, it is conceivable that not only are the values of the response variables for each participant available only after the observations are made, but the levels of predictor and moderator variables are also outcomes of the study. In order to take account of this stochastic feature of explanatory variables, the appropriate strategy is to consider a random regression formulation rather than a fixed or conditional setting. Similar emphasis and related implications can be found in Dunlap, Xin, and Myers (2004), Gatsonis and Sampson (1989), Mendoza and Stafford (2001), Shieh (2006), and Shieh and Kung (2007). In practice, the
Sample size determination for confidence intervals of interaction effects in moderated multiple regression with continuous predictor and moderator variables
Gwowen Shieh
National Chiao Tung University, Hsinchu, Taiwan
Moderated multiple regression (MMR) has been widely employed to analyze the interaction or moderating effects in behavior and related disciplines of social science. Much of the methodological literature in the context of MMR concerns statistical power and sample size calculations of hypothesis tests for detecting moderator variables. Notably, interval estimation is a distinct and more informative alternative to significance testing for inference purposes. To facilitate the practice of reporting confidence intervals in MMR analyses, the present article presents two approaches to sample size determinations for precise interval estimation of interaction effects between continuous moderator and predictor variables. One approach provides the necessary sample size so that the designated interval for the least squares estimator of moderating effects attains the specified coverage probability. The other gives the sample size required to ensure, with a given tolerance probability, that a confidence interval of moderating effects with a desired confidence coefficient will be within a specified range. Numerical examples and simulation results are presented to illustrate the usefulness and advantages of the proposed methods that account for the embedded randomness and distributional characteristic of the moderator and predictor variables.
doi:10.3758/BRM.42.3.824
cifically constructed to relate managers' self-assurance ($Y$) with length of time in the position ($X$), managerial ability ($Z$), and their interaction. Note that both explanatory variables (time in the position and managerial ability) are not typically fixed in advance and that they are available only after collecting the data. Therefore, there is no problem in regarding them as random, provided that the managers are drawn randomly from the relevant population. Hence, the appropriate approach is random regression modeling. The purpose of the present investigation was to find out to what extent the relation between self-assurance and time in position varies with managerial ability. Essentially, it is constructive to assess the systematic change in the strength of the relationship between the manager's self-assurance and length of time in the managerial position that results from a one-unit change in managerial ability.
In a continual effort to support analytical development and to improve the practical use of research findings in MMR, the present article contributes to the derivation and evaluation of sample size methodology in two important aspects. On one hand, it provides the necessary sample size so that the designated interval for the least squares estimator of moderating effects attains the specified coverage probability. The following discussion shows that this problem is identical to the computation of the minimum sample size so that the prescribed confidence interval formula of moderating effects attains a desired level of confidence. On the other hand, the present study gives the sample size required to ensure, with a given tolerance probability, that a confidence interval of moderating effects with a desired confidence coefficient will be within a specified range. Notably, the sample size formulas of Guenther and Thomas (1965), Hahn and Meeker (1991), Kupper and Hafner (1989), and Nelson (1994) are concerned exclusively with the length of confidence intervals. Since the actual values of the resulting confidence interval depend not only on the estimated width but also on the realized value of the location estimator, their procedures do not consider the stochastic nature of the point estimator for central tendency. Moreover, these previous studies focused on the interval estimation procedures in one- and two-sample problems; hence, they did not address the associated issues in an MMR application. Sample size tables are provided for a variety of situations to demonstrate the individual impact of deterministic factors and how they pertain to the two aforementioned precision considerations of confidence intervals. Furthermore, numerical examples and simulation results are presented to illustrate the usefulness and advantage of the proposed methods that account for the embedded randomness and distributional characteristic of the moderator and predictor variables.
Interval Estimation Procedures of Moderated Effects
Consider the simple interaction model or MMR model within the fixed modeling framework
$$Y_i = \beta_I + X_i\beta_X + Z_i\beta_Z + X_iZ_i\beta_{XZ} + \varepsilon_i, \tag{1}$$
where $Y_i$ is the value of the response variable $Y$; $X_i$ and $Z_i$ are the known constants of the predictor $X$ and modera-
inferential procedures of hypothesis testing and interval
estimation are the same under both fixed and random formulations. However, the distinction between the two modeling approaches becomes crucial when power, coverage probability, and corresponding sample size calculations are to be made. See Cramer and Appelbaum (1978) and Sampson (1974) for clear and succinct presentations on the intrinsic appropriateness and theoretical properties of fixed and random models.
For the simple interaction model described above, in most illustrative and theoretical treatments of MMR, it is generally assumed that the two continuous predictor and moderator variables have a joint bivariate normal distribution (see, e.g., O'Connor, 2006). Obviously, the product of two normally distributed variables does not have a normal distribution. Moreover, there are also many situations in which the predictor and moderator variables are continuous but the assumption of normality is completely untenable. These results are concerned with fixed- or multinormal-regressors settings and are thus not applicable to the great diversity of random frameworks. Recently, Shieh (2007) considered using a unified approach to accommodate arbitrary distributional formulations of the stochastic explanatory variables and demonstrated power calculation as well as sample size determination for hypothesis tests of coefficient parameters within the random regression framework. The general results of Shieh (2007) were utilized in Shieh (2009) to perform power and sample size computations in MMR to detect interaction effects between continuous predictor and moderator variables, regardless of whether they follow a jointly bivariate normal distribution. It is well known that there exists a direct connection between hypothesis testing and interval estimation, although the two procedures are philosophically different in the power and precision viewpoints. In particular, the necessary sample size required for significance testing is a function of coefficient parameters. On the other hand, it will be shown later that the sample size needed for precise interval estimation is affected by the interval width and does not depend on the magnitude of coefficient parameters. Related discussions and examples can be found in Algina and Olejnik (2000) and in Kelly and Maxwell (2003).
Not surprisingly, the sample size required to test a hypothesis regarding the specific value of a parameter with desired power can be markedly different from the sample size needed to obtain adequate precision of interval estimation in the same study. The planning of sample size should be included as an integral part in the design of MMR studies, and it is of both methodological and practical importance to develop feasible methods for sample size determination considering precise interval estimation.
To elucidate the key concepts in the present article, consider a study on the self-assurance of managers that examines how the impact of length of time in the position on self-assurance is moderated by managerial ability (Aiken & West, 1991, chap. 2). A sample of managers is randomly selected from the participating source corporation, and various measurements for each manager are recorded. The MMR model $Y = \beta_I + X\beta_X + Z\beta_Z + XZ\beta_{XZ} + \varepsilon$ is
spe-rate in achieving the desired reliability. In the following sections, two interval estimation approaches to sample size determination are developed.
Sample Size Methodology for Designated Interval Estimation of Moderating Effects
When the focus is on the inferential procedure of interval estimation, it is prudent to ensure that the resulting estimate is in the neighborhood of the actual or possible parameter value with sufficiently high probability. In the context of MMR analysis, therefore, it is of interest to calculate the sample size required for a designated interval so that the least squares estimator of moderating effects simultaneously satisfies the desired levels of precision and probability. Ultimately, the corresponding method for sample size determination requires considering the sampling distribution of the least squares estimator $\hat\beta_{XZ}$ of $\beta_{XZ}$. Analogous to the practical standpoint of Shieh (2009) for providing a generally useful and versatile solution without being specifically confined to any particular joint probability function $g(X_i, Z_i)$, the large-sample distribution of
$$T_{XZ}(\beta) = \frac{\hat\beta_{XZ} - \beta}{\{\hat\sigma^2 M\}^{1/2}} \tag{5}$$
is presented in Equation A6 of Appendix A, where $\beta$ is a constant and is not necessarily equivalent to $\beta_{XZ}$. The asymptotic property of $T_{XZ}(\beta)$ will later be employed to implement a variety of probability calculations and sample size determinations.
With the specified quantities of population configurations for a moderating effect $\beta_{XZ}$, error variance $\sigma^2$, joint distribution $g(X, Z)$ of $(X, Z)$, probability level $1 - \alpha$, and the designated interval $(\beta_{XZ} - b_L, \beta_{XZ} + b_U)$ with proper bounds $b_L > 0$ and $b_U > 0$, the smallest sample size $N$ needed for the interval $(\beta_{XZ} - b_L, \beta_{XZ} + b_U)$ of $\hat\beta_{XZ}$ with coverage probability of at least $1 - \alpha$ can be computed from
$$P\{\beta_{XZ} - b_L < \hat\beta_{XZ} < \beta_{XZ} + b_U\} \ge 1 - \alpha. \tag{6}$$
Alternatively, Equation 6 can be expressed as
$$P\{\hat\beta_{XZ} - b_U < \beta_{XZ} < \hat\beta_{XZ} + b_L\} \ge 1 - \alpha.$$
Therefore, the sample size problem just described is equivalent to finding the minimum sample size $N$ needed for the designated confidence interval $(\hat\beta_{XZ} - b_U, \hat\beta_{XZ} + b_L)$ of $\beta_{XZ}$ to attain the desired level of confidence $1 - \alpha$. However, unlike the fixed predictor and moderator setting, the computation of $P\{\beta_{XZ} - b_L < \hat\beta_{XZ} < \beta_{XZ} + b_U\}$ in Equation 6 is fairly complicated due to the arbitrary and stochastic characteristics of $(X, Z)$. The theoretical properties of the proposed procedure are presented in Appendix B.
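As a brief worked step, the equivalence of the two probability statements can be seen by rearranging each inequality separately. The display below only restates Equation 6, with $\hat\beta_{XZ}$ denoting the least squares estimator; it adds no assumption beyond the notation already introduced.

```latex
\begin{align*}
\beta_{XZ} - b_L < \hat\beta_{XZ}
  &\iff \beta_{XZ} < \hat\beta_{XZ} + b_L, \\
\hat\beta_{XZ} < \beta_{XZ} + b_U
  &\iff \hat\beta_{XZ} - b_U < \beta_{XZ}, \\
\text{hence}\quad
P\{\beta_{XZ} - b_L < \hat\beta_{XZ} < \beta_{XZ} + b_U\}
  &= P\{\hat\beta_{XZ} - b_U < \beta_{XZ} < \hat\beta_{XZ} + b_L\}.
\end{align*}
```

Note that the bounds exchange roles: the upper bound $b_U$ of the interval for the estimator becomes the lower margin of the confidence interval for the parameter, and vice versa. The two intervals have the same form only when $b_L = b_U$.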
To illustrate the approach, selected sample size computations for precise interval estimation of moderating effects are performed. For analytical tractability, and in keeping with the primary focus of the literature, the MMR model with bivariate normal predictor
tor $Z$; $\varepsilon_i$ are independent and identically distributed $N(0, \sigma^2)$ random errors for $i = 1, \ldots, N$; and $\beta_I$, $\beta_X$, $\beta_Z$, and $\beta_{XZ}$ are unknown parameters. To examine the moderator effect, we are concerned with the distributional property associated with the least squares estimator $\hat\beta_{XZ}$ of $\beta_{XZ}$. According to the standard results (Rencher, 2000, Section 8.6), a $100(1 - \alpha)\%$ confidence interval of $\beta_{XZ}$ is
$$(\hat\beta_{XZ} - t_{N-4,\alpha_1}\{\hat\sigma^2 M\}^{1/2},\ \hat\beta_{XZ} + t_{N-4,\alpha_2}\{\hat\sigma^2 M\}^{1/2}), \tag{2}$$
where $\hat\sigma^2$ is the usual unbiased estimator of $\sigma^2$; $M$ is the (3, 3) element of $\mathbf{A}^{-1}$, $\mathbf{A} = \sum_{i=1}^{N}(\mathbf{X}_i - \bar{\mathbf{X}})(\mathbf{X}_i - \bar{\mathbf{X}})^T$; $\bar{\mathbf{X}} = \sum_{i=1}^{N}\mathbf{X}_i/N$; and $\mathbf{X}_i = (X_i, Z_i, X_iZ_i)^T$ is the $3 \times 1$ column vector for values of predictor $X_i$, moderator $Z_i$, and their cross product $X_iZ_i$ for $i = 1, \ldots, N$. In addition, $t_{N-4,\alpha_1}$ and $t_{N-4,\alpha_2}$ are the $100(1 - \alpha_1)$th and $100(1 - \alpha_2)$th percentiles of the $t$ distribution with $N - 4$ degrees of freedom, respectively, and $\alpha = \alpha_1 + \alpha_2$. See Rencher (2000, chap. 7 and 8) for general treatments and further details on linear models and their analysis. The most common practice is to assume $\alpha_1 = \alpha_2 = \alpha/2$, and this leads to the shortest
$100(1 - \alpha)\%$ two-sided confidence interval for $\beta_{XZ}$:
$$(\hat\beta_{XZ} - t_{N-4,\alpha/2}\{\hat\sigma^2 M\}^{1/2},\ \hat\beta_{XZ} + t_{N-4,\alpha/2}\{\hat\sigma^2 M\}^{1/2}). \tag{3}$$
Furthermore, the $100(1 - \alpha)\%$ one-sided lower and upper confidence intervals can be readily obtained from Equation 2 by setting either $\alpha_1$ or $\alpha_2$, respectively, to zero, as follows:
$$(-\infty,\ \hat\beta_{XZ} + t_{N-4,\alpha}\{\hat\sigma^2 M\}^{1/2}) \quad\text{and}\quad (\hat\beta_{XZ} - t_{N-4,\alpha}\{\hat\sigma^2 M\}^{1/2},\ \infty). \tag{4}$$
Here, we concentrate exclusively on the specific circumstance that both the predictor $X$ and the moderator $Z$ are continuous variables. Due to the nature of continuous measurements encountered in practical research, the explanatory variables typically cannot be controlled and are available only after observation. Hence, in order to extend the concept and applicability to MMR, the continuous predictor and moderator variables $\{(X_i, Z_i),\ i = 1, \ldots, N\}$ in Equation 1 are assumed to have a joint probability function $g(X_i, Z_i)$ with finite moments. Moreover, the form of $g(X_i, Z_i)$ does not depend on any of the unknown parameters $(\beta_I, \beta_X, \beta_Z, \beta_{XZ})$ or $\sigma^2$. From the investigations of Shieh (2007, 2009), it is conceivable that the extended consideration of random features associated with the predictor $X$ and the moderator $Z$ complicates the fundamental statistical properties of the inferential procedures. As was noted above, however, the inferential procedures of hypothesis testing and interval estimation are the same under both fixed and random formulations. Hence, the two- and one-sided confidence limits given in Equations 3 and 4 are still valid under random predictor and moderator settings. The follow-up analyses can be performed without any alteration or extra effort. In view of the practical value of interval estimation, it is important to determine the necessary sample size so that the resulting interval estimate is not only precise enough to identify meaningful findings but also sufficiently
accu-
hypothesis, but in that the determined value of the point estimate and the width of the interval also give ideas of the inherent location and precision of the estimation. However, the interval estimation procedures are intrinsically stochastic in nature. From a study-planning point of view, researchers may wish to obtain meaningful research findings so that the resulting confidence interval will meet the prespecified assurance and precision requirements. The corresponding approach to determining the required sample size is presented next.
With the specified quantities of population configurations for moderating effect $\beta_{XZ}$, error variance $\sigma^2$, joint distribution $g(X, Z)$ of $(X, Z)$, tolerance probability $1 - \gamma$, and the prescribed range $(\beta_{XZ} - w_L, \beta_{XZ} + w_U)$ with proper bounds $w_L > 0$ and $w_U > 0$, the minimum sample size $N$ required to ensure that the $100(1 - \alpha)\%$ two-sided confidence interval given in Equation 3 is within the range of $(\beta_{XZ} - w_L, \beta_{XZ} + w_U)$ with a tolerance probability of at least $1 - \gamma$ can be determined by
$$P\{\beta_{XZ} - w_L < \hat\beta_{XZ} - t_{N-4,\alpha/2}(\hat\sigma^2 M)^{1/2} \ \text{and}\ \hat\beta_{XZ} + t_{N-4,\alpha/2}(\hat\sigma^2 M)^{1/2} < \beta_{XZ} + w_U\} \ge 1 - \gamma. \tag{7}$$
This procedure is complex since it must consider the stochastic nature of the confidence limits $\hat\beta_{XZ} - t_{N-4,\alpha/2}\{\hat\sigma^2 M\}^{1/2}$ and $\hat\beta_{XZ} + t_{N-4,\alpha/2}\{\hat\sigma^2 M\}^{1/2}$ within the unconditional framework that the predictor and moderator are random variables. The corresponding analytical presentation is summarized in Appendix C.
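The event inside Equation 7 can also be examined by brute force. The following Python sketch is not the analytical procedure of Appendix C; it is a Monte Carlo illustration under the bivariate normal configuration used for Table 3. The function names (`t_quantile`, `solve_linear`, `interaction_ci`, `simulate_tolerance`) are hypothetical, and the $t$ percentile is approximated by a two-term Cornish-Fisher expansion rather than computed exactly, an assumption that is adequate only when $N - 4$ is moderately large.

```python
import math
import random
from statistics import NormalDist


def t_quantile(p, df):
    """Approximate t percentile (two-term Cornish-Fisher expansion; an assumption)."""
    z = NormalDist().inv_cdf(p)
    g1 = (z ** 3 + z) / 4.0
    g2 = (5.0 * z ** 5 + 16.0 * z ** 3 + 3.0 * z) / 96.0
    return z + g1 / df + g2 / df ** 2


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def interaction_ci(rows, alpha):
    """Two-sided interval of Equation 3 from rows of (x, z, y)."""
    n = len(rows)
    xtx = [[0.0] * 4 for _ in range(4)]
    xty = [0.0] * 4
    for x, z, y in rows:
        d = (1.0, x, z, x * z)  # intercept, X, Z, XZ
        for j in range(4):
            xty[j] += d[j] * y
            for k in range(4):
                xtx[j][k] += d[j] * d[k]
    beta = solve_linear(xtx, xty)
    # M equals the last diagonal element of the inverse cross-product matrix
    # of the design with an intercept column.
    m_elem = solve_linear(xtx, [0.0, 0.0, 0.0, 1.0])[3]
    rss = sum((y - (beta[0] + beta[1] * x + beta[2] * z + beta[3] * x * z)) ** 2
              for x, z, y in rows)
    sigma2_hat = rss / (n - 4)
    half = t_quantile(1.0 - alpha / 2.0, n - 4) * math.sqrt(sigma2_hat * m_elem)
    return beta[3] - half, beta[3] + half


def simulate_tolerance(n, rho, beta, sigma, w_lower, w_upper, alpha, reps, seed=0):
    """Monte Carlo estimate of the probability in Equation 7."""
    rng = random.Random(seed)
    b_i, b_x, b_z, b_xz = beta
    hits = 0
    for _ in range(reps):
        rows = []
        for _ in range(n):
            x = rng.gauss(0.0, 1.0)
            z = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            y = b_i + b_x * x + b_z * z + b_xz * x * z + rng.gauss(0.0, sigma)
            rows.append((x, z, y))
        lower, upper = interaction_ci(rows, alpha)
        if b_xz - w_lower < lower and upper < b_xz + w_upper:
            hits += 1
    return hits / reps
```

A call such as `simulate_tolerance(303, 0.4, (1, 1, 1, 1), 1.0, 0.2, 0.2, 0.05, reps=2000)` gives an estimate that can be set against the nominal tolerance probability of .90 for the corresponding entry of Table 3. The estimate carries Monte Carlo error, so it should be read as a plausibility check rather than as a replacement for the analytical result.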
As additional illustrations, we continue to exemplify the sample size procedures for the preceding MMR model with bivariate normal predictor and moderator variables. In this case, Table 3 presents the minimum sample sizes required to ensure that the 95% two-sided confidence interval $(\hat\beta_{XZ} - t_{N-4,.025}\{\hat\sigma^2 M\}^{1/2},\ \hat\beta_{XZ} + t_{N-4,.025}\{\hat\sigma^2 M\}^{1/2})$ is within the range of $(\beta_{XZ} - w, \beta_{XZ} + w)$ with a tolerance probability of at least .90 and .95 for values of $\rho$ ranging from 0 to .8 in increments of .2, and $w = 0.2$, 0.225, and 0.25. In addition, Table 4 shows the corresponding sample sizes that ensure that the 95% one-sided confidence intervals $(-\infty,\ \hat\beta_{XZ} + t_{N-4,.05}\{\hat\sigma^2 M\}^{1/2})$ and $(\hat\beta_{XZ} - t_{N-4,.05}\{\hat\sigma^2 M\}^{1/2},\ \infty)$ are within the ranges of $(-\infty, \beta_{XZ} + w)$ and $(\beta_{XZ} - w, \infty)$, respectively, with tolerance probabilities of at least .90
and moderator variables is used as the base for numerical exposition. Specifically, the coefficient parameters and variance of the MMR model are set as $\beta_I = \beta_X = \beta_Z = \beta_{XZ} = 1$ and $\sigma^2 = 1$, respectively. For the joint distribution of the predictor and moderator, the $(X, Z)$ variables are jointly normally distributed with mean (0, 0), variance (1, 1), and correlation $\rho$. The minimum sample sizes that are needed to control the designated two-sided intervals $(\beta_{XZ} - b, \beta_{XZ} + b)$ of $\hat\beta_{XZ}$ with coverage probability of at least .90 and .95 are presented in Table 1 for values of $\rho$ ranging from 0 to .8 in increments of .2, and $b = 0.1$, 0.125, and 0.15. Similarly, the corresponding sample size calculations for the one-sided intervals $(-\infty, \beta_{XZ} + b)$ and $(\beta_{XZ} - b, \infty)$ of $\hat\beta_{XZ}$ are listed in Table 2. Note that the sample sizes presented in Table 2 are applicable to both one-sided intervals under the chosen model configurations. An inspection of both tables reveals the expected general relations: Sample sizes increase with an increasing level of confidence $1 - \alpha$, and they increase with decreasing value of bound $b$ when all other factors are fixed. Moreover, the sample size reported in Table 1 for a two-sided confidence interval is greater than the corresponding value of a one-sided confidence interval in Table 2 for fixed values of $\rho$, $b$, and $1 - \alpha$.
Sample Size Methodology for Confidence Intervals of Moderating Effects With Specified Ranges and Tolerances
It is well known that confidence intervals are superior to hypothesis tests not only in that they reveal what parameter values would be rejected if they were used in a null
Table 1
Minimum Sample Sizes Required for the Prescribed Two-Sided Interval $(\beta_{XZ} - b, \beta_{XZ} + b)$ of $\hat\beta_{XZ}$ With Coverage Probability of at Least .90 and .95 for Bivariate Normal Predictor and Moderator Variables ($\sigma^2 = 1$, $\mu_X = \mu_Z = 0$, $\sigma_X^2 = \sigma_Z^2 = 1$)

         b = 0.1      b = 0.125     b = 0.15
  ρ     .90   .95     .90   .95     .90   .95
  0     280   396     183   258     130   183
 .2     270   382     177   249     126   177
 .4     245   346     162   228     117   164
 .6     213   300     142   199     104   145
 .8     181   254     123   171      92   127

Table 2
Minimum Sample Sizes Required for the Chosen One-Sided Interval $(-\infty, \beta_{XZ} + b)$ and $(\beta_{XZ} - b, \infty)$ of $\hat\beta_{XZ}$ With Coverage Probability of at Least .90 and .95 for Bivariate Normal Predictor and Moderator Variables ($\sigma^2 = 1$, $\mu_X = \mu_Z = 0$, $\sigma_X^2 = \sigma_Z^2 = 1$)

         b = 0.1      b = 0.125     b = 0.15
  ρ     .90   .95     .90   .95     .90   .95
  0     171   280     112   183      81   130
 .2     166   270     109   177      79   126
 .4     151   245     100   162      73   117
 .6     132   213      89   142      66   104
 .8     112   181      77   123      58    92

Table 3
Minimum Sample Sizes Required to Ensure That the 95% Two-Sided Confidence Interval $(\hat\beta_{XZ} - t_{N-4,.025}\{\hat\sigma^2 M\}^{1/2},\ \hat\beta_{XZ} + t_{N-4,.025}\{\hat\sigma^2 M\}^{1/2})$ Is Within the Range of $(\beta_{XZ} - w, \beta_{XZ} + w)$ With a Tolerance Probability of at Least .90 and .95 for Bivariate Normal Predictor and Moderator Variables ($\sigma^2 = 1$, $\mu_X = \mu_Z = 0$, $\sigma_X^2 = \sigma_Z^2 = 1$)

         w = 0.2      w = 0.225     w = 0.25
  ρ     .90   .95     .90   .95     .90   .95
  0     343   405     274   325     226   268
 .2     332   393     266   315     220   260
 .4     303   359     245   290     203   241
 .6     266   315     216   257     180   215
 .8     227   270     186   222     157   187
of their numerical study was to determine whether the relationship between the self-assurance of managers ($Y$) and the length of time in the position ($X$) changes as a function of managerial ability ($Z$). To facilitate the following illustration in the context of MMR research, suppose that there are 60 pairs of observations for predictor variable $X$ and moderator variable $Z$ obtained from a pilot study. The values of $(X, Z)$ that are presented in Table 5 represent random samples generated from a bivariate normal population with $\mu_X = \mu_Z = 0$, $\sigma_X^2 = \sigma_Z^2 = 1$, and correlation $\rho = .4$. In view of the continuous characteristics of measurements $X$ and $Z$, it is clear that the sample values in the subsequent study vary from one application to another. However, the observed configurations from the pilot study can be employed as an empirical approximation for the underlying joint distribution of $X$ and $Z$. Moreover, it is shown next that the suggested approach and a simplified method utilize the empirical features that are associated with the predictor and moderator variables in distinctive ways and, thus, the two formulas lead to substantially different results in sample size calculations and in accuracy in achieving satisfactory levels of precision for interval estimation.
Following the analysis results in Aiken and West (1991, p. 10), the parameters of the MMR model are set to $\beta_I = 2.54$, $\beta_X = 1.14$, $\beta_Z = 3.58$, $\beta_{XZ} = 2.58$, and $\sigma^2 = 1$. On the basis of the 60 observed configurations of pilot data in Table 5 with $\mathbf{X}_i = (X_i, Z_i, X_iZ_i)^T$ and the empirical probability 1/60 for $i = 1, \ldots, 60$, the estimated moment matrices for the quantities in Equation A4 can be obtained by $\hat{\boldsymbol\mu} = \sum_{i=1}^{60}\mathbf{X}_i/60$, $\hat{\boldsymbol\Sigma} = \sum_{i=1}^{60}(\mathbf{X}_i - \hat{\boldsymbol\mu})(\mathbf{X}_i - \hat{\boldsymbol\mu})^T/60$, and $\hat{\boldsymbol\Psi} = \sum_{i=1}^{60}[(\mathbf{X}_i - \hat{\boldsymbol\mu})(\mathbf{X}_i - \hat{\boldsymbol\mu})^T \otimes (\mathbf{X}_i - \hat{\boldsymbol\mu})(\mathbf{X}_i - \hat{\boldsymbol\mu})^T]/60$. Thus, the approximate normal distribution of $W^*$ in Equation A5 has the estimated mean $\hat\mu_{W^*} = 1.2348$ and estimated variance $\hat\sigma^2_{W^*} = 22.6511$. In planning a research study according to the present information, the minimum sample sizes needed to control the designated two-sided interval $(\beta_{XZ} - b, \beta_{XZ} + b) = (2.58 - 0.15, 2.58 + 0.15) =$
and .95. As in the numerical evaluations associated with Table 2, the sample sizes given in Table 4 are applicable to both cases of one-sided intervals because of the special feature of the noncentral $t$ distribution. It can be seen from Tables 3 and 4 that required sample sizes increase with an increasing level of tolerance probability $1 - \gamma$, and with a decreasing value of bound $w$ when all other factors are fixed. As before, the sample size reported in Table 3 for the two-sided confidence interval is greater than the corresponding value of the one-sided confidence interval reported in Table 4 for fixed values of $\rho$, $w$, and $1 - \alpha$. Furthermore, although the results are not completely comparable, the sample sizes in Tables 3 and 4 are larger than those in Tables 1 and 2.
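For orientation only, the order of magnitude of the two-sided entries in Table 1 can be compared with a crude normal benchmark. The sketch below is neither Equation B2 nor the proposed procedure. It rests on an assumption introduced here for illustration: for zero-mean, unit-variance bivariate normal $(X, Z)$ with correlation $\rho$, $\mathrm{Var}(XZ) = 1 + \rho^2$ and $XZ$ is uncorrelated with $X$ and $Z$, so that $NM$ is close to $1/(1 + \rho^2)$ for large $N$, and $\hat\beta_{XZ}$ is then treated as normal with variance $\sigma^2/\{N(1 + \rho^2)\}$. Fixing $M$ at this limit ignores exactly the randomness of the predictor and moderator that the article addresses. The function name `naive_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def naive_sample_size(b, rho, coverage, sigma=1.0):
    """Smallest N with 2*Phi(b*sqrt(N*(1+rho^2))/sigma) - 1 >= coverage.

    Crude benchmark: treats M as the constant 1/(N*(1+rho^2)), which is an
    illustrative assumption and not the method of the article.
    """
    z = NormalDist().inv_cdf((1.0 + coverage) / 2.0)
    return math.ceil((z * sigma / b) ** 2 / (1.0 + rho ** 2))
```

For $b = 0.1$, $\rho = 0$, and coverage .95, `naive_sample_size(0.1, 0.0, 0.95)` returns 385, compared with 396 in Table 1; for coverage .90 it returns 271, compared with 280. A gap in this direction is consistent with the benchmark ignoring the variability of $M$.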
Numerical Examples
The following numerical assessment represents a typical research situation frequently encountered in the planning stage of a study in order to assess interaction effects in the context of MMR. The ultimate aim is to demonstrate the sample size calculations for precise interval estimation of moderating effects based on a pilot sample and to show the potential consequence of failing to account for the underlying stochastic property of the explanatory variables. As a continued exposition of the illustration of Aiken and West (1991), it is important to remember that the aim
Table 4
Minimum Sample Sizes Required to Ensure That the 95% One-Sided Confidence Interval $(-\infty,\ \hat\beta_{XZ} + t_{N-4,.05}\{\hat\sigma^2 M\}^{1/2})$ and $(\hat\beta_{XZ} - t_{N-4,.05}\{\hat\sigma^2 M\}^{1/2},\ \infty)$ Is Within the Range of $(-\infty, \beta_{XZ} + w)$ and $(\beta_{XZ} - w, \infty)$, Respectively, With a Tolerance Probability of at Least .90 and .95 for Bivariate Normal Predictor and Moderator Variables ($\sigma^2 = 1$, $\mu_X = \mu_Z = 0$, $\sigma_X^2 = \sigma_Z^2 = 1$)

         w = 0.2      w = 0.225     w = 0.25
  ρ     .90   .95     .90   .95     .90   .95
  0     227   287     182   230     150   190
 .2     220   278     177   223     146   185
 .4     201   254     162   206     135   171
 .6     176   224     144   182     120   153
 .8     151   192     124   158     105   134

Table 5
Observed Values of Predictor Variable X and Moderator Variable Z of the Pilot Study

       X        Z        X        Z        X        Z        X        Z
 -0.9121  -0.7970  -0.3581   0.1677   0.4875   0.0481  -0.2312  -2.6297
  0.6161   0.4406  -1.7096  -0.0614  -1.3712  -0.2643   0.1967   0.6026
 -0.3459  -0.2503  -1.2201   1.0737  -0.3063   0.4640  -0.7609  -0.1105
 -0.3654   0.7871   1.9457  -0.4328  -1.2158  -0.8524  -1.3095  -0.1378
  0.2258  -0.7407  -0.0119   0.4386   1.1241   0.5519  -2.0270   0.3233
  0.0206   0.5837   0.1606   0.2365  -1.3135   1.5577   1.4949   0.7624
  0.8080   2.2212  -0.1174  -1.1017   0.1751   0.1340   0.5943  -0.3610
 -0.0031  -0.9145   0.2718   1.0854   0.2313   0.3495  -0.2982  -0.2510
  0.7696   0.6172   0.8000   0.2615  -0.4457   0.9176  -1.3263  -0.1808
  0.5753  -0.5732  -1.2381  -0.1725   2.8890   1.2777   1.2771   1.4634
 -1.6247  -0.3238  -0.8302  -1.1981   0.3750   0.2207  -0.8958   0.4195
  0.5934  -0.5248  -0.6407  -0.6331   0.7223   1.2787  -1.6284  -0.5142
  1.6639   0.8816  -0.3646   0.9514   0.8073   1.2787   0.4745   1.2441
 -0.2153   1.3834  -2.9043  -2.2853   0.9276   1.5124  -0.7966   0.5477
  0.4095   0.1387   0.1980   0.1679   0.5019   0.4255   0.5386   0.9979
SAS/IML (SAS Institute, 2008) programs employed to perform the sample size calculations of the proposed approaches are presented in Appendixes D and E.
Simulation Study
In order to compare the performance and to reinforce the fundamental distinction of the two competing approaches, further simulation studies are conducted. For demonstration, the MMR model with the bivariate normal predictor and moderator variables described above is exploited as the basis for a Monte Carlo examination. The numerical study is conducted in two steps. First, under the selected values of coefficient parameters, error variance, and distributional configurations of the bivariate predictor and moderator variables, the approximate coverage probabilities of the two methods are calculated with the reported sample size of the proposed approach. The corresponding results are presented in Table 6. It follows that the approximate coverage probabilities of .8033, .9005, and .9507 for the proposed method are almost identical to the desired values of .80, .90, and .95 for sample sizes 74, 116, and 162, respectively, whereas the computed coverage probabilities of .8484, .9274, and .9661 associated with the simplified method are somewhat greater than the desired values of .80, .90, and .95, respectively.
In the second step, the sample size $N$ calculated by the proposed approach is utilized as a benchmark to assess the simulated coverage probability. Estimates of the true coverage probability associated with given sample size and parameter configurations are then computed through a Monte Carlo simulation of 10,000 independent data sets. For each replicate, $N$ sets of predictor and moderator values are generated from the designated bivariate normal distribution. These values of predictor and moderator, in turn, determine the mean responses for generating $N$ normal outcomes with the MMR model. Next, the estimate $\hat\beta_{XZ}$ is computed, and the simulated coverage probability is the proportion of the 10,000 replicates whose values of $\hat\beta_{XZ}$ fall between 2.43 and 2.73. The adequacy of the examined procedure for coverage probability and sample size calculation is determined by the formula error = simulated coverage probability − approximate coverage probability, comparing the simulated coverage probability and the approximate coverage probability that were computed earlier. All of the calculations are performed using programs written with SAS/IML (SAS Institute, 2008). The
(2.43, 2.73) of $\hat\beta_{XZ}$ with a desired coverage probability can be determined by the approximate coverage probability function defined in Equation B2. The resulting sample sizes are 74, 116, and 162 for coverage probabilities of .80, .90, and .95, respectively. On the other hand, the researcher may presume that the identical empirical structure of predictor and moderator variables in the pilot data will continue to occur in the investigation. Therefore, the inference of moderating effects can be conducted with the simplified or conditional distribution of $T_{XZ}(\beta)$ in Equation A1. With the fixed modeling formulation of Equation A3, the minimum sample sizes needed to control the designated two-sided interval $(\beta_{XZ} - b, \beta_{XZ} + b) = (2.43, 2.73)$ of $\hat\beta_{XZ}$, with coverage probability of at least .80, .90, and .95, are 60, 98, and 139, respectively. These sample sizes are smaller than those reported earlier on the basis of the more involved normal mixture of noncentral $t$ distributions in Equation B1. The sizable discrepancy between these two procedures indicates the need to assess their adequacy for interval estimation in achieving the nominal coverage probability.
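The Monte Carlo step of the simulation study (generate $N$ predictor and moderator pairs, generate normal responses from the MMR model, compute the least squares estimate of the interaction coefficient, and record whether it falls in the designated interval) can be sketched in Python as follows. This is an illustrative reimplementation under the bivariate normal configuration described in the article, not the SAS/IML programs of the appendixes; the function names `solve_linear` and `simulate_coverage` and the seeding scheme are hypothetical.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def simulate_coverage(n, rho, beta, sigma, b_lower, b_upper, reps, seed=0):
    """Proportion of replicates with beta_XZ - b_L < estimate < beta_XZ + b_U."""
    rng = random.Random(seed)
    b_i, b_x, b_z, b_xz = beta
    hits = 0
    for _ in range(reps):
        xtx = [[0.0] * 4 for _ in range(4)]
        xty = [0.0] * 4
        for _ in range(n):
            # Standard bivariate normal (X, Z) with correlation rho.
            x = rng.gauss(0.0, 1.0)
            z = rho * x + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
            y = b_i + b_x * x + b_z * z + b_xz * x * z + rng.gauss(0.0, sigma)
            row = (1.0, x, z, x * z)
            for j in range(4):
                xty[j] += row[j] * y
                for k in range(4):
                    xtx[j][k] += row[j] * row[k]
        estimate = solve_linear(xtx, xty)[3]  # least squares interaction estimate
        if b_xz - b_lower < estimate < b_xz + b_upper:
            hits += 1
    return hits / reps
```

For instance, `simulate_coverage(116, 0.4, (2.54, 1.14, 3.58, 2.58), 1.0, 0.15, 0.15, reps=10000)` estimates the coverage of the interval (2.43, 2.73) at $N = 116$, which can be compared with the simulated coverage probability reported for that sample size in Table 6. The result depends on the random seed and the number of replicates, so exact agreement should not be expected.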
Furthermore, so that the resulting confidence interval of a desired confidence coefficient will fall into a scientifically credible range with a specified level of tolerance probability, the numerical study is extended to illustrate the advantage of the suggested procedure and the deficiency of the alternative simplified method for sample size calculations. For the MMR model with the bivariate normal predictor and moderator variables examined above, the minimum sample sizes required for the suggested formula in Equation C2 to ensure that the 95% two-sided confidence interval $(\hat\beta_{XZ} - t_{N-4,.025}\{\hat\sigma^2 M\}^{1/2},\ \hat\beta_{XZ} + t_{N-4,.025}\{\hat\sigma^2 M\}^{1/2})$ is within the range of $(\beta_{XZ} - w, \beta_{XZ} + w) = (2.58 - 0.225, 2.58 + 0.225) = (2.355, 2.805)$, with tolerance probabilities of at least .80, .90, and .95, are 192, 239, and 285, respectively. Accordingly, the minimum sample sizes required for the conditional formulation in Equation C4 to ensure that the 95% two-sided confidence interval $(\hat\beta_{XZ} - t_{N-4,.025}\{\hat\sigma^2 M\}^{1/2},\ \hat\beta_{XZ} + t_{N-4,.025}\{\hat\sigma^2 M\}^{1/2})$ is within the range of $(\beta_{XZ} - w_L, \beta_{XZ} + w_U) = (2.355, 2.805)$, with tolerance probabilities of at least .80, .90, and .95, are 169, 208, and 246, respectively. Obviously, the calculated sample sizes of the two procedures differ considerably for the setting considered presently. The differences between the two approaches are further examined in the following simulation study. The
Table 6
Approximate Coverage Probabilities and Simulated Coverage Probabilities at Specified Sample Sizes for the Prescribed Two-Sided Interval $(\beta_{XZ} - 0.15, \beta_{XZ} + 0.15)$ of $\beta_{XZ}$ With Bivariate Normal Predictor and Moderator Variables ($\beta_{XZ} = 2.58$, $\sigma^2 = 1$, $\mu_X = \mu_Z = 0$, $\sigma_X^2 = \sigma_Z^2 = 1$, $r = .4$)

                      Proposed Method              Simplified Method
       Simulated      Approximate                  Approximate
       Coverage       Coverage                     Coverage
  N    Probability    Probability    Error         Probability    Error
  74   .7983          .8033          -0.0050       .8484          -0.0501
 116   .8943          .9005          -0.0062       .9274          -0.0331
research is needed before it can be accepted in place of the commonly used fixed linear regression model. The present article aimed to demonstrate the technical development of precise interval estimation and related sample size methodology with sufficient clarity so that MMR practitioners can perceive the applicability and usefulness of the information. Specifically, the proposed approach fully accommodates arbitrary distributional formulations of the stochastic explanatory variables. The consequences of failing to account for the randomness of predictor and moderator variables are elucidated through analytical presentations and numerical assessments. It is shown that the existing fixed modeling formulation may distort the precision analysis and lead to a poor choice of sample sizes. More importantly, although the suggested general procedures for sample size determination are derived from large-sample theory, the simulation study demonstrates their accuracy in achieving desired levels of coverage and tolerance for interval estimation over a wide range of model settings. The generality and accuracy of the proposed methodology not only facilitate the widely recommended practice of reporting confidence intervals but also strengthen the potential applicability of MMR analysis. Accordingly, the results provide the basis for probing related considerations in more complicated situations, such as the three-way interactions discussed in Aiken and West (1991) and Dawson and Richter (2006).
AUTHOR NOTE
The author thanks the editor, Gregory Francis, and the two anonymous reviewers for their valuable comments on earlier drafts of the article. This research was partially supported by National Science Council Grant NSC-97-2410-H-009-011-MY2. Correspondence concerning this article should be addressed to G. Shieh, Department of Management Science, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu, Taiwan 30050 (e-mail: [email protected]).
REFERENCES
Aguinis, H., Beaty, J. C., Boik, R. J., & Pierce, C. A. (2005). Effect size and power in assessing moderating effects of categorical variables using multiple regression: A 30-year review. Journal of Applied Psychology, 90, 94-107.
Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Algina, J., & Olejnik, S. (2000). Determining sample size for accurate estimation of the squared multiple correlation coefficient. Multivariate Behavioral Research, 35, 119-137.
simulated coverage probability and error for the proposed and simplified methods are also summarized in Table 6. As seen from the results, the performance of the proposed method appears to be remarkably good for the range of model specifications considered in the present article. In contrast, the simplified method yielded much larger errors and, in particular, the error is as large as 0.0501 for the sample size of 74 with coverage probability around .80. Comparatively, these errors associated with the simplified method may be too large to be satisfactory.
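The error columns of Table 6 follow from a single subtraction, error = simulated coverage probability minus approximate coverage probability. A minimal Python sketch of that arithmetic is given below; it uses only the two rows of Table 6 that are reproduced in the text, and the names `table6` and `coverage_errors` are illustrative choices rather than anything defined in the article.

```python
# Rows copied from Table 6: (N, simulated, approximate proposed, approximate simplified).
table6 = [
    (74, 0.7983, 0.8033, 0.8484),
    (116, 0.8943, 0.9005, 0.9274),
]


def coverage_errors(rows):
    """Return (N, error_proposed, error_simplified), with error = simulated - approximate."""
    return [
        (n, round(sim - proposed, 4), round(sim - simplified, 4))
        for n, sim, proposed, simplified in rows
    ]


errors = coverage_errors(table6)
for n, err_proposed, err_simplified in errors:
    print(n, err_proposed, err_simplified)
```

A negative error means that the approximation overstates the coverage actually attained in the simulation, which is the direction of every entry in the table.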
As in the previous case, we first evaluate the approximate tolerance probabilities with sample sizes of 192, 239, and 285 for the two distinct procedures, and the resulting values are summarized in Table 7. Then, the simulated tolerance probability for the prescribed parameter setting and sample size is computed as the proportion of 10,000 replicates of the 95% two-sided confidence interval $(\hat{\beta}_{XZ} - t_{N-4,.025}\{\hat{\sigma}^2 M\}^{1/2}, \hat{\beta}_{XZ} + t_{N-4,.025}\{\hat{\sigma}^2 M\}^{1/2})$ that fall within the range of (2.355, 2.805). The differences between the simulated and approximate tolerance probabilities (error = simulated tolerance probability − approximate tolerance probability) are also presented in Table 7. Clearly, the errors of the simplified method are substantially larger than those associated with the suggested approach. Hence, it can be concluded that the sample sizes calculated with the conditional formula in Equation C4 are too small to ensure sufficient tolerance probability, and this shortfall can be expected to persist in other settings of random explanatory variables.
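The simulated tolerance probability described above is simply the share of replicate confidence intervals that lie entirely inside the target range. The following Python sketch shows that counting step only; the function name `tolerance_proportion` and the four toy intervals are hypothetical and are not simulation output from the article.

```python
def tolerance_proportion(intervals, lower, upper):
    """Share of (lo, hi) confidence intervals lying entirely inside (lower, upper)."""
    inside = sum(1 for lo, hi in intervals if lower < lo and hi < upper)
    return inside / len(intervals)


# Hypothetical toy intervals, used only to illustrate the counting rule.
toy_intervals = [(2.40, 2.75), (2.30, 2.70), (2.45, 2.81), (2.36, 2.80)]
share = tolerance_proportion(toy_intervals, 2.355, 2.805)
print(share)
```

In the study itself the list would hold the 10,000 replicate intervals generated for a given sample size.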
Conclusions
Because of the prevalence of MMR applications in various disciplines of the social sciences, it seems prudent to ensure and to extend the understanding of fundamental properties of related inference procedures. When assessing the extent of interaction effects between continuous predictor and moderator variables, the underlying stochastic configurations of the predictor and moderator vary from one research study to another and inevitably necessitate random modeling instead of the commonly used fixed or conditional model setting. It is important that the corresponding theoretical implications of power and precision appraisal be well understood when MMR analyses are adopted by researchers. As presented above, random MMR modeling is comparatively more complex so that more
Table 7
Approximate Tolerance Probabilities and Simulated Tolerance Probabilities at Specified Sample Sizes for the 95% Two-Sided Confidence Interval $(\hat{\beta}_{XZ} - t_{N-4,.025}\{\hat{\sigma}^2 M\}^{1/2}, \hat{\beta}_{XZ} + t_{N-4,.025}\{\hat{\sigma}^2 M\}^{1/2})$ Within the Range of $(\beta_{XZ} - 0.225, \beta_{XZ} + 0.225)$ With Bivariate Normal Predictor and Moderator Variables ($\beta_{XZ} = 2.58$, $\sigma^2 = 1$, $\mu_X = \mu_Z = 0$, $\sigma_X^2 = \sigma_Z^2 = 1$, $r = .4$)

                      Proposed Method              Simplified Method
       Simulated      Approximate                  Approximate
       Tolerance      Tolerance                    Tolerance
  N    Probability    Probability    Error         Probability    Error
 192   .7834          .8018          -0.0184       .8662          -0.0828
 239   .8902          .9005          -0.0103       .9425          -0.0523
American Psychological Association (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.
Cramer, E. M., & Appelbaum, M. I. (1978). The validity of polynomial regression in the random regression model. Review of Educational Research, 48, 511-515.
Dawson, J. F., & Richter, A. W. (2006). Probing three-way interactions in moderated multiple regression: Development and application of a slope difference test. Journal of Applied Psychology, 91, 917-926.
Dunlap, W. P., Xin, X., & Myers, L. (2004). Computing aspects of power for multiple regression. Behavior Research Methods, Instruments, & Computers, 36, 695-701.
Gatsonis, C., & Sampson, A. R. (1989). Multiple correlation: Exact power and sample size calculations. Psychological Bulletin, 106, 516-524.
Guenther, W. C., & Thomas, P. O. (1965). Some graphs useful for statistical inference. Journal of the American Statistical Association, 60, 334-343.
Hahn, G. J., & Meeker, W. Q. (1991). Statistical intervals: A guide for practitioners. New York: Wiley.
Kelly, K., & Maxwell, S. E. (2003). Sample size for multiple regression: Obtaining regression coefficients that are accurate, not simply significant. Psychological Methods, 8, 305-321.
Kupper, L. L., & Hafner, K. B. (1989). How appropriate are popular sample size formulas? American Statistician, 43, 101-105.
Mendoza, J. L., & Stafford, K. L. (2001). Confidence interval, power calculation, and sample size estimation for the squared multiple correlation coefficient under the fixed and random regression models: A computer program and useful standard tables. Educational & Psychological Measurement, 61, 650-667.
Nelson, L. S. (1994). Sample size for confidence intervals with specified lengths and tolerances. Journal of Quality Technology, 26, 54-63.
O’Connor, B. P. (2006). Programs for problems created by continuous variable distributions in moderated multiple regression. Organizational Research Methods, 9, 554-567.
Rencher, A. C. (2000). Linear models in statistics. New York: Wiley.
Sampson, A. R. (1974). A tale of two regressions. Journal of the American Statistical Association, 69, 682-689.
SAS Institute (2008). SAS/IML user’s guide, Version 9.2. Cary, NC: Author.
Shieh, G. (2006). Exact interval estimation, power calculation and sample size determination in normal correlation analysis. Psychometrika, 71, 529-540.
Shieh, G. (2007). A unified approach to power calculation and sample size determination for random regression models. Psychometrika, 72, 347-360.
Shieh, G. (2009). Detecting interaction effects in moderated multiple regression with continuous variables: Power and sample size considerations. Organizational Research Methods, 12, 510-528.
Shieh, G., & Kung, C. F. (2007). Methodological and computational considerations for multiple correlation analysis. Behavior Research Methods, 39, 731-734.
Smithson, M. (2001). Correct confidence intervals for various regression effect sizes and parameters: The importance of noncentral distributions in computing intervals. Educational & Psychological Measurement, 61, 605-632.
Steiger, J. H., & Fouladi, R. T. (1997). Noncentrality interval estimation and the evaluation of statistical models. In L. Harlow, S. Mulaik, & J. H. Steiger (Eds.), What if there were no significance tests? (pp. 222-257). Mahwah, NJ: Erlbaum.
Wilkinson, L., & the Task Force on Statistical Inference, APA Board of Scientific Affairs (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.
APPENDIX A
The Distribution of $T_{XZ}$
It follows from the standard assumption in Equation 1 under a fixed modeling framework that the variable

$$T_{XZ}(\beta) = \frac{\hat{\beta}_{XZ} - \beta}{\{\hat{\sigma}^2 M\}^{1/2}} \tag{A1}$$

has a noncentral $t$ distribution $t(N - 4, \Gamma)$ with $N - 4$ degrees of freedom and noncentrality parameter $\Gamma$, where $\hat{\sigma}^2$ is the usual unbiased estimator of $\sigma^2$; $M$ is the (3, 3) element of $\mathbf{A}^{-1}$, $\mathbf{A} = \sum_{i=1}^{N} (\mathbf{X}_i - \bar{\mathbf{X}})(\mathbf{X}_i - \bar{\mathbf{X}})^T$, $\bar{\mathbf{X}} = \sum_{i=1}^{N} \mathbf{X}_i / N$, and $\mathbf{X}_i = (X_i, Z_i, X_i Z_i)^T$ is the $3 \times 1$ column vector of values of predictor $X_i$, moderator $Z_i$, and their cross product $X_i Z_i$ for $i = 1, \ldots, N$; $\beta$ is a constant; and the noncentrality parameter is $\Gamma = (\beta_{XZ} - \beta)/\{\sigma^2 M\}^{1/2}$. Accordingly, a particular formulation can be obtained by replacing $\beta$ with $\beta_{XZ}$ in $T_{XZ}$ as follows:

$$T_{XZ}(\beta_{XZ}) = \frac{\hat{\beta}_{XZ} - \beta_{XZ}}{\{\hat{\sigma}^2 M\}^{1/2}}, \tag{A2}$$

and $T_{XZ}(\beta_{XZ})$ is distributed as $t(N - 4)$, a $t$ distribution with $N - 4$ degrees of freedom. Note that $T_{XZ}(\beta_{XZ})$ in Equation A2 provides a useful tool for conducting statistical inferences of hypothesis testing and interval estimation about the magnitude of the moderating effect $\beta_{XZ}$. In this case, the coverage probability for a designated interval $(\beta_{XZ} - b_L, \beta_{XZ} + b_U)$ of $\beta_{XZ}$ can be computed using the simple expression

$$P\{\beta_{XZ} - b_L < \hat{\beta}_{XZ} < \beta_{XZ} + b_U\} = P\{t(N - 4, -\delta_U) < 0\} - P\{t(N - 4, \delta_L) < 0\}, \tag{A3}$$

where $b_L > 0$, $b_U > 0$, $\delta_U = b_U/\{\sigma^2 M\}^{1/2}$, and $\delta_L = b_L/\{\sigma^2 M\}^{1/2}$.
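As a side remark that is not stated in the article, the two probabilities in Equation A3 can be checked against the standard normal distribution. The sketch below assumes the usual representation of a noncentral $t$ variate, $t(\nu, \delta) = (Z + \delta)/\sqrt{V/\nu}$, with $Z$ standard normal, $V$ an independent chi-square variate with $\nu$ degrees of freedom, and $\Phi$ the standard normal cumulative distribution function.

```latex
\begin{align*}
P\{t(\nu, \delta) < 0\}
  &= P\{Z + \delta < 0\} = \Phi(-\delta)
  && \text{(the denominator $\sqrt{V/\nu}$ is positive)} \\
P\{t(N-4, -\delta_U) < 0\} - P\{t(N-4, \delta_L) < 0\}
  &= \Phi(\delta_U) - \Phi(-\delta_L) \\
  &= 2\,\Phi(\delta) - 1
  && \text{when } b_L = b_U = b,\ \delta = b/\{\sigma^2 M\}^{1/2}.
\end{align*}
```

Under this reading, the conditional coverage in Equation A3 agrees with the familiar normal-theory result that $\hat{\beta}_{XZ}$ has variance $\sigma^2 M$ when the explanatory values are held fixed.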
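Instead of a mere fixed or conditional formulation, we focus on the particular random regression situation in which both the predictor $X$ and the moderator $Z$ are continuous random variables within the context of MMR. Specifically, the continuous predictor and moderator variables $\{(X_i, Z_i), i = 1, \ldots, N\}$ are assumed to have a joint probability density function $g(X_i, Z_i)$ with finite moments. Moreover, the form of $g(X_i, Z_i)$ does not depend on any of the unknown parameters $(\beta_I, \beta_X, \beta_Z, \beta_{XZ})$ and $\sigma^2$. The moments of the explanatory vectors $\mathbf{X}_i = (X_i, Z_i, X_i Z_i)^T$ are defined as

$$\boldsymbol{\mu} = E[\mathbf{X}_i], \quad \boldsymbol{\Sigma} = E[(\mathbf{X}_i - \boldsymbol{\mu})(\mathbf{X}_i - \boldsymbol{\mu})^T], \quad \text{and} \quad \boldsymbol{\Psi} = E[\{(\mathbf{X}_i - \boldsymbol{\mu})(\mathbf{X}_i - \boldsymbol{\mu})^T\} \otimes \{(\mathbf{X}_i - \boldsymbol{\mu})(\mathbf{X}_i - \boldsymbol{\mu})^T\}], \tag{A4}$$

where $E[\cdot]$ denotes the expectation taken with respect to the joint probability density function $g(X_i, Z_i)$ of $(X_i, Z_i)$, and $\otimes$ represents the Kronecker product. According to the formulations of $\mathbf{A}$ and $M$ presented in Equation A1 for $T_{XZ}$, both $\mathbf{A}$ and $M$ are functions of the random variables $(X_i, Z_i)$, $i = 1, \ldots, N$, within the random regression framework and, therefore, $T_{XZ}$ has a noncentral $t$ distribution with random noncentrality $\Gamma$. It follows from Shieh (2009) that $W^* = 1/\{(N - 1)M\}$ has an asymptotic normal distribution:

$$W^* \overset{\cdot}{\sim} N(\mu_{W^*}, \sigma_{W^*}^2), \tag{A5}$$

where $\mu_{W^*} = 1/(\mathbf{c}^T \boldsymbol{\Sigma}^{-1} \mathbf{c})$, $\sigma_{W^*}^2 = \mu_{W^*}^4 \{(\mathbf{c}^T \boldsymbol{\Sigma}^{-1} \otimes \mathbf{c}^T \boldsymbol{\Sigma}^{-1}) \boldsymbol{\Psi} (\boldsymbol{\Sigma}^{-1} \mathbf{c} \otimes \boldsymbol{\Sigma}^{-1} \mathbf{c}) - \mu_{W^*}^{-2}\}/(N - 1)$, $\mathbf{c} = (0, 0, 1)^T$ is a $3 \times 1$ column vector, and $\boldsymbol{\Sigma}$ and $\boldsymbol{\Psi}$ are defined in Equation A4. Therefore, the distribution of $T_{XZ}(\beta)$ under the random regression setting can be well approximated by the following two-stage distribution:

$$T_{XZ}(\beta) \mid W^* \sim t\{N - 4, (\beta_{XZ} - \beta)[(N - 1)W^*/\sigma^2]^{1/2}\} \quad \text{and} \quad W^* \overset{\cdot}{\sim} N(\mu_{W^*}, \sigma_{W^*}^2). \tag{A6}$$

The approximate distribution of $T_{XZ}(\beta)$ is particularly useful for evaluating the cumulative distribution function of $\hat{\beta}_{XZ}$ in terms of $F_{XZ}(c) = P\{\hat{\beta}_{XZ} < c\}$, where $c$ is a constant. It can be readily shown from the definition of $T_{XZ}$ that $F_{XZ}(c) = P\{\hat{\beta}_{XZ} - c < 0\} = P\{T_{XZ}(c) < 0\}$. Accordingly, the cumulative distribution function $F_{XZ}(c)$ can be approximated by

$$F_{XZ}(c) \doteq E_{W^*}[P(t\{N - 4, [(N - 1)W^*/\sigma^2]^{1/2}(\beta_{XZ} - c)\} < 0)], \tag{A7}$$

where the expectation $E_{W^*}[\cdot]$ is taken with respect to the approximate normal distribution of $W^*$ presented in Equation A5.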
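To make the moment quantities behind Equation A5 concrete, the following Python sketch computes sample analogues of $\mu_{W^*}$ and of $(N - 1)\sigma_{W^*}^2$ from a list of (x, z) pairs, in the spirit of the MUW and VARW quantities in the Appendix D program. It is an illustration under stated assumptions rather than the article's implementation: the function name `w_star_moments` is hypothetical, moments are divided by the number of pairs, and the Kronecker expression is evaluated through the mixed-product identity $(\mathbf{v}^T \otimes \mathbf{v}^T)(\mathbf{B} \otimes \mathbf{B})(\mathbf{v} \otimes \mathbf{v}) = (\mathbf{v}^T \mathbf{B} \mathbf{v})^2$ with $\mathbf{v} = \boldsymbol{\Sigma}^{-1}\mathbf{c}$.

```python
def w_star_moments(pairs):
    """Sample analogues of mu_W* and (N - 1) * sigma^2_W* from (x, z) pairs."""
    k = len(pairs)
    rows = [(x, z, x * z) for x, z in pairs]
    means = [sum(r[j] for r in rows) / k for j in range(3)]
    xc = [[r[j] - means[j] for j in range(3)] for r in rows]
    # Sample version of Sigma (divisor k).
    s = [[sum(r[a] * r[b] for r in xc) / k for b in range(3)] for a in range(3)]
    # Cofactors of the third row give the third column of Sigma^{-1}.
    c0 = s[0][1] * s[1][2] - s[0][2] * s[1][1]
    c1 = -(s[0][0] * s[1][2] - s[0][2] * s[1][0])
    c2 = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    det = s[2][0] * c0 + s[2][1] * c1 + s[2][2] * c2
    v = [c0 / det, c1 / det, c2 / det]  # Sigma^{-1} c with c = (0, 0, 1)^T
    mu_w = 1.0 / v[2]
    # The Kronecker term reduces to the mean of (v' xc_i)^4.
    fourth = sum(sum(v[j] * r[j] for j in range(3)) ** 4 for r in xc) / k
    var_w_scaled = mu_w ** 4 * (fourth - mu_w ** -2)
    return mu_w, var_w_scaled


# Hypothetical four-point design with uncorrelated X, Z, and XZ.
mu_w, var_w_scaled = w_star_moments([(-1, -1), (-1, 1), (1, -1), (1, 1)])
print(mu_w, var_w_scaled)
```

Dividing `var_w_scaled` by $N - 1$ gives the variance in Equation A5 for a candidate sample size $N$.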
APPENDIX B
Sample Size Calculations for Designated Interval Estimation of Moderating Effects
It follows from the definition of $T_{XZ}(\beta)$ given in Equation 5 and the associated asymptotic approximation of the cumulative distribution function $F_{XZ}$ of $\hat{\beta}_{XZ}$ presented in Equation A7 that the probability $P\{\beta_{XZ} - b_L < \hat{\beta}_{XZ} < \beta_{XZ} + b_U\}$ in Equation 6 can be approximated by

$$P\{\beta_{XZ} - b_L < \hat{\beta}_{XZ} < \beta_{XZ} + b_U\} = F_{XZ}(\beta_{XZ} + b_U) - F_{XZ}(\beta_{XZ} - b_L) \doteq E_{W^*}[P\{t(N - 4, -\Delta_U) < 0\}] - E_{W^*}[P\{t(N - 4, \Delta_L) < 0\}], \tag{B1}$$

where $\Delta_U = b_U\{(N - 1)W^*/\sigma^2\}^{1/2}$, $\Delta_L = b_L\{(N - 1)W^*/\sigma^2\}^{1/2}$, and the expectation $E_{W^*}[\cdot]$ is taken with respect to the approximate normal distribution of $W^*$ presented in Equation A5. Hence, the suggested computation of the smallest sample size $N$ needed for the prescribed interval $(\beta_{XZ} - b_L, \beta_{XZ} + b_U)$ of $\beta_{XZ}$ with coverage probability of at least $1 - \alpha$ is performed with the approximate coverage probability formula

$$E_{W^*}[P\{t(N - 4, -\Delta_U) < 0\} - P\{t(N - 4, \Delta_L) < 0\}] \geq 1 - \alpha. \tag{B2}$$

It should be noted that numerical computations of the expected value in Equation B2 require the evaluation of a noncentral $t$ cumulative distribution function and a one-dimensional integration with respect to a normal distribution. This procedure is not as simple as using a $z$ or $t$ table, but it is not unreasonable in light of modern computing capabilities. Moreover, two important aspects of the proposed procedure should be pointed out. First, neither of the probability functions $P\{t(N - 4, -\Delta_U) < 0\}$ and $P\{t(N - 4, \Delta_L) < 0\}$ involves the regression coefficient $\beta_{XZ}$, which corresponds to the extent of the moderating effect. However, there is a direct functional relation between the magnitudes of cumulative probability and the bounds $b_L$ and $b_U$. Second, the mean values of the predictor, moderator, and their product are not included in the asymptotic distribution of $W^*$ defined in Equation A5. Hence, the mean vector (first moments) associated with the joint distribution of explanatory variables does not have any influence on the resulting probability levels and required sample sizes.
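The search implied by Equation B2 can be sketched in a few lines of Python. The sketch below is not the article's SAS program; it assumes a symmetric bound ($b_L = b_U = b$), integrates over $W^*$ with composite Simpson weights on a truncated standard normal grid, and uses the fact that a noncentral $t$ variate $(Z + \delta)/\sqrt{V/\nu}$ is negative exactly when $Z < -\delta$, so that $P\{t(\nu, \delta) < 0\} = \Phi(-\delta)$. The function names and the inputs `mu_w` and `var_w_scaled` (standing for $\mu_{W^*}$ and $(N - 1)\sigma_{W^*}^2$) are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def approx_coverage(n, b, sigma2, mu_w, var_w_scaled, steps=1000):
    """Approximate coverage of (beta - b, beta + b) from Equation B2; steps must be even."""
    z_max = STD_NORMAL.inv_cdf(0.999995)
    h = 2 * z_max / steps
    sd_w = (var_w_scaled / (n - 1)) ** 0.5
    total = 0.0
    for i in range(steps + 1):
        z = -z_max + i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)  # Simpson weights
        w = max(mu_w + sd_w * z, 0.0)  # W* is truncated at zero
        delta = b * ((n - 1) * w / sigma2) ** 0.5
        # P{t(N-4, -delta) < 0} - P{t(N-4, delta) < 0} = Phi(delta) - Phi(-delta)
        inner = STD_NORMAL.cdf(delta) - STD_NORMAL.cdf(-delta)
        total += weight * STD_NORMAL.pdf(z) * inner
    return total * h / 3


def min_sample_size(target, b, sigma2, mu_w, var_w_scaled, start=11, limit=100000):
    """Smallest N in [start, limit) whose approximate coverage reaches the target."""
    for n in range(start, limit):
        if approx_coverage(n, b, sigma2, mu_w, var_w_scaled) >= target:
            return n
    raise ValueError("no sample size below the limit reaches the target coverage")


# Hypothetical inputs: mu_w = 1.16 and two illustrative values of var_w_scaled.
n_fixed = min_sample_size(0.80, 0.15, 1.0, 1.16, 0.0)
n_random = min_sample_size(0.80, 0.15, 1.0, 1.16, 2.0)
print(n_fixed, n_random)
```

Setting `var_w_scaled` to zero removes the randomness of $W^*$ and mimics a fixed-design calculation, so comparing the two results gives a rough feel for how much the random formulation raises the required sample size. The inputs here are not estimated from the pilot data, so the output is not expected to reproduce the sample sizes reported in the text.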
In a similar fashion, the corresponding sample size calculations for the prescribed lower and upper one-sided intervals in the form of $(-\infty, \beta_{XZ} + b_U)$ and $(\beta_{XZ} - b_L, \infty)$ for $\beta_{XZ}$ with coverage probability of at least $1 - \alpha$ can be conducted with the approximate probability functions in terms of a normal mixture of a noncentral $t$ cumulative distribution function given by

$$P\{\hat{\beta}_{XZ} < \beta_{XZ} + b_U\} \doteq E_{W^*}[P\{t(N - 4, -\Delta_U) < 0\}] \geq 1 - \alpha$$

and

$$P\{\beta_{XZ} - b_L < \hat{\beta}_{XZ}\} \doteq 1 - E_{W^*}[P\{t(N - 4, \Delta_L) < 0\}] \geq 1 - \alpha, \tag{B3}$$

respectively, where $\Delta_U$, $\Delta_L$, and $W^*$ are given above in Equation B1. It can be readily shown from Equation B3, with $b_U = b_L = b$ and $\Delta_U = \Delta_L = \Delta = b\{(N - 1)W^*/\sigma^2\}^{1/2}$, that

$$P\{t(N - 4, -\Delta) < 0\} = 1 - P\{t(N - 4, \Delta) < 0\}$$

and
APPENDIX C
Sample Size Calculations for Confidence Intervals of Moderating Effects With Specified Ranges and Tolerances
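According to the asymptotic results for $\hat{\beta}_{XZ}$ presented in Appendix A, we propose to consider the following alternative formula for computing the probability described in Equation 7:

$$P\{\beta_{XZ} - w_L < \hat{\beta}_{XZ} - t_{N-4,\alpha/2}(\hat{\sigma}^2 M)^{1/2} \text{ and } \hat{\beta}_{XZ} + t_{N-4,\alpha/2}(\hat{\sigma}^2 M)^{1/2} < \beta_{XZ} + w_U\} \doteq E_{W^*}[P\{t(N - 4, -\Lambda_U) < -t_{N-4,\alpha/2}\} - P\{t(N - 4, \Lambda_L) < t_{N-4,\alpha/2}\}], \tag{C1}$$

where $\Lambda_U = w_U\{(N - 1)W^*/\sigma^2\}^{1/2}$, $\Lambda_L = w_L\{(N - 1)W^*/\sigma^2\}^{1/2}$, and the expectation $E_{W^*}[\cdot]$ is taken with respect to the approximate normal distribution of $W^*$ presented in Equation A5. Thus, the minimum sample size $N$ required to ensure that the $100(1 - \alpha)\%$ two-sided confidence interval $(\hat{\beta}_{XZ} - t_{N-4,\alpha/2}\{\hat{\sigma}^2 M\}^{1/2}, \hat{\beta}_{XZ} + t_{N-4,\alpha/2}\{\hat{\sigma}^2 M\}^{1/2})$ is within the range of $(\beta_{XZ} - w_L, \beta_{XZ} + w_U)$ with a tolerance probability of at least $1 - \gamma$ can be determined by

$$E_{W^*}[P\{t(N - 4, -\Lambda_U) < -t_{N-4,\alpha/2}\} - P\{t(N - 4, \Lambda_L) < t_{N-4,\alpha/2}\}] \geq 1 - \gamma. \tag{C2}$$

Moreover, the sample size calculations for the lower and upper one-sided confidence intervals in the form of $(-\infty, \hat{\beta}_{XZ} + t_{N-4,\alpha}\{\hat{\sigma}^2 M\}^{1/2})$ and $(\hat{\beta}_{XZ} - t_{N-4,\alpha}\{\hat{\sigma}^2 M\}^{1/2}, \infty)$ that fall within the ranges of $(-\infty, \beta_{XZ} + w_U)$ and $(\beta_{XZ} - w_L, \infty)$ with a tolerance probability of at least $1 - \gamma$ can be performed with

$$P\{\hat{\beta}_{XZ} + t_{N-4,\alpha}(\hat{\sigma}^2 M)^{1/2} < \beta_{XZ} + w_U\} \doteq E_{W^*}[P\{t(N - 4, -\Lambda_U) < -t_{N-4,\alpha}\}] \geq 1 - \gamma$$

and

$$P\{\beta_{XZ} - w_L < \hat{\beta}_{XZ} - t_{N-4,\alpha}(\hat{\sigma}^2 M)^{1/2}\} \doteq 1 - E_{W^*}[P\{t(N - 4, \Lambda_L) < t_{N-4,\alpha}\}] \geq 1 - \gamma, \tag{C3}$$

respectively, where $\Lambda_U$ and $\Lambda_L$ are given above for Equation C1. It can be readily shown from Equation C3, with $w_U = w_L = w$ and $\Lambda_U = \Lambda_L = \Lambda = w\{(N - 1)W^*/\sigma^2\}^{1/2}$, that

$$P\{t(N - 4, -\Lambda) < -t_{N-4,\alpha}\} = 1 - P\{t(N - 4, \Lambda) < t_{N-4,\alpha}\}$$

and

$$E_{W^*}[P\{t(N - 4, -\Lambda) < -t_{N-4,\alpha}\}] = 1 - E_{W^*}[P\{t(N - 4, \Lambda) < t_{N-4,\alpha}\}].$$

In contrast, the tolerance probability with respect to the conditional distribution in Equation A1 is

$$P\{\beta_{XZ} - w_L < \hat{\beta}_{XZ} - t_{N-4,\alpha/2}(\hat{\sigma}^2 M)^{1/2} \text{ and } \hat{\beta}_{XZ} + t_{N-4,\alpha/2}(\hat{\sigma}^2 M)^{1/2} < \beta_{XZ} + w_U\} = P\{t(N - 4, -\lambda_U) < -t_{N-4,\alpha/2}\} - P\{t(N - 4, \lambda_L) < t_{N-4,\alpha/2}\}, \tag{C4}$$

where $\lambda_U = w_U/\{\sigma^2 M\}^{1/2}$ and $\lambda_L = w_L/\{\sigma^2 M\}^{1/2}$.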
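The conditional expression in Equation C4 can be evaluated without a statistics package, as the Python sketch below suggests. It is an illustration under stated assumptions and not the article's SAS code: the noncentral $t$ cumulative distribution function is obtained by Simpson integration of $\Phi(x\sqrt{v/\nu} - \delta)$ over the chi-square density (which presumes more than two degrees of freedom), the $t$ percentile is found by bisection, and the function names as well as the value supplied for $M$ are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf


def nct_cdf(x, df, nc, steps=2000):
    """P{t(df, nc) <= x} via Simpson integration over the chi-square(df) density, df > 2."""
    spread = 12.0 * math.sqrt(2.0 * df)
    lo, hi = max(0.0, df - spread), df + spread
    h = (hi - lo) / steps
    log_norm = (df / 2.0) * math.log(2.0) + math.lgamma(df / 2.0)
    total = 0.0
    for i in range(steps + 1):
        v = lo + i * h
        if v <= 0.0:
            continue  # the chi-square density is 0 at v = 0 when df > 2
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        dens = math.exp((df / 2.0 - 1.0) * math.log(v) - v / 2.0 - log_norm)
        total += weight * dens * PHI(x * math.sqrt(v / df) - nc)
    return total * h / 3.0


def t_upper_quantile(alpha, df):
    """Upper 100*alpha percentile of the central t(df) for alpha < .5, by bisection."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if nct_cdf(mid, df, 0.0) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def conditional_tolerance(n, w_l, w_u, sigma2, m, alpha=0.05):
    """Right-hand side of Equation C4 for a fixed (conditional) value of M."""
    df = n - 4
    t_crit = t_upper_quantile(alpha / 2.0, df)
    lam_u = w_u / math.sqrt(sigma2 * m)
    lam_l = w_l / math.sqrt(sigma2 * m)
    return nct_cdf(-t_crit, df, -lam_u) - nct_cdf(t_crit, df, lam_l)


# Hypothetical M = 1/((N - 1) * 1.16); it is not derived from the pilot data.
tol_150 = conditional_tolerance(150, 0.225, 0.225, 1.0, 1.0 / (149 * 1.16))
tol_300 = conditional_tolerance(300, 0.225, 0.225, 1.0, 1.0 / (299 * 1.16))
print(tol_150, tol_300)
```

Replacing the fixed `m` with values implied by draws of $W^*$ and averaging over them would move this sketch toward the random formulation in Equation C1.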
APPENDIX D
SAS Program to Perform Sample Size Calculations for Designated Interval Estimation of Moderating Effects
PROC IML;
*REQUIRED USER SPECIFICATIONS PORTION;
*SPECIFY THE VALUES OF ALPHA, SIGMA2, BETAXZ AND BOUND;
ALPHA=0.2;SIGMA2=1;BETAXZ=2.58;B=0.15;
*SPECIFY THE PAIRED-VALUES OF X AND Z SEQUENTIALLY;
XT={-0.9121 -0.7970,-0.3581 0.1677,0.4875 0.0481,-0.2312 -2.6297,
0.6161 0.4406,-1.7096 -0.0614,-1.3712 -0.2643,0.1967 0.6026,
-0.3459 -0.2503,-1.2201 1.0737,-0.3063 0.4640,-0.7609 -0.1105,
-0.3654 0.7871,1.9457 -0.4328,-1.2158 -0.8524,-1.3095 -0.1378,
0.2258 -0.7407,-0.0119 0.4386,1.1241 0.5519,-2.0270 0.3233,
0.0206 0.5837,0.1606 0.2365,-1.3135 1.5577,1.4949 0.7624,
0.8080 2.2212,-0.1174 -1.1017,0.1751 0.1340,0.5943 -0.3610,
-0.0031 -0.9145,0.2718 1.0854,0.2313 0.3495,-0.2982 -0.2510,
0.7696 0.6172,0.8000 0.2615,-0.4457 0.9176,-1.3263 -0.1808,
0.5753 -0.5732,-1.2381 -0.1725,2.8890 1.2777,1.2771 1.4634,
-1.6247 -0.3238,-0.8302 -1.1981,0.3750 0.2207,-0.8958 0.4195,
0.5934 -0.5248,-0.6407 -0.6331,0.7223 1.2787,-1.6284 -0.5142,
1.6639 0.8816,-0.3646 0.9514,0.8073 1.2787,0.4745 1.2441,
-0.2153 1.3834,-2.9043 -2.2853,0.9276 1.5124,-0.7966 0.5477,
0.4095 0.1387,0.1980 0.1679,0.5019 0.4255,0.5386 0.9979};
*END OF REQUIRED USER SPECIFICATIONS;
****************************************************************;
*CENTERED EXPLANATORY VECTORS (X, Z, XZ) AND THEIR MOMENT SUMS;
XE=XT||(XT[,1]#XT[,2]);
K=NROW(XE);
XM=XE[:,];
XC=XE-J(K,1,1)*XM;
H=J(3,3,0);
HH=H@H;
DO I=1 TO K;
  H=H+XC[I,]`*XC[I,];
  HH=HH+(XC[I,]`*XC[I,])@(XC[I,]`*XC[I,]);
END;
CIL=BETAXZ-B;CIU=BETAXZ+B;COVERP=1-ALPHA;
*SIMPSON WEIGHTS ON A STANDARD NORMAL GRID FOR THE EXPECTATION OVER W;
NUMINT=1000;L=NUMINT+1;
COEVEC=({1}||REPEAT({4 2},1,NUMINT/2-1)||{4 1})`;
INT=PROBIT(0.999995);
INTERVAL=2#INT/NUMINT;
ZVEC=((INTERVAL#(0:NUMINT))+(-INT))`;
WZPDF=(INTERVAL/3)#COEVEC#PDF('NORMAL',ZVEC,0,1);
*MEAN AND SCALED VARIANCE OF W (EQUATION A5);
SIGM=H/K;PSI=HH/K;
ISIGM=INV(SIGM);
MUW=1/ISIGM[3,3];
VW=ISIGM[3,];
VARW=(MUW##4)#((VW@VW)*PSI*(VW`@VW`)-MUW##(-2));
DELTAW=BETAXZ#SQRT(MUW/SIGMA2);
*INCREASE N UNTIL THE APPROXIMATE COVERAGE (EQUATION B2) REACHES COVERP;
*WVEC IS DIVIDED BY SIGMA2 TO MATCH EQUATION B1 (NO EFFECT WHEN SIGMA2=1);
N=10;NCOVERP=0;
DO WHILE(NCOVERP<COVERP);
  N=N+1;
  WVEC=SQRT(VARW/(N-1))#ZVEC+MUW;
  WVEC=WVEC#(WVEC>0);
  NCOVERP=WZPDF`*(CDF('T',0,N-4,(BETAXZ-CIU)#SQRT((N-1)#WVEC/SIGMA2))
    -CDF('T',0,N-4,(BETAXZ-CIL)#SQRT((N-1)#WVEC/SIGMA2)));
END;
PRINT ALPHA BETAXZ B CIL CIU COVERP N;
QUIT;
APPENDIX E
SAS Program to Perform Sample Size Calculations for Confidence Intervals of Moderating Effects With Specified Ranges and Tolerances
PROC IML;
*REQUIRED USER SPECIFICATIONS PORTION;
*SPECIFY THE VALUES OF ALPHA, SIGMA2, BETAXZ, BOUND AND TOLERANCE;
ALPHA=0.05;SIGMA2=1;BETAXZ=2.58;W=0.225;LGAMMA=0.20;
*SPECIFY THE PAIRED-VALUES OF X AND Z SEQUENTIALLY;
XT={-0.9121 -0.7970,-0.3581 0.1677,0.4875 0.0481,-0.2312 -2.6297,
0.6161 0.4406,-1.7096 -0.0614,-1.3712 -0.2643,0.1967 0.6026,
-0.3459 -0.2503,-1.2201 1.0737,-0.3063 0.4640,-0.7609 -0.1105,
-0.3654 0.7871,1.9457 -0.4328,-1.2158 -0.8524,-1.3095 -0.1378,
0.2258 -0.7407,-0.0119 0.4386,1.1241 0.5519,-2.0270 0.3233,
0.0206 0.5837,0.1606 0.2365,-1.3135 1.5577,1.4949 0.7624,
0.8080 2.2212,-0.1174 -1.1017,0.1751 0.1340,0.5943 -0.3610,
-0.0031 -0.9145,0.2718 1.0854,0.2313 0.3495,-0.2982 -0.2510,
0.7696 0.6172,0.8000 0.2615,-0.4457 0.9176,-1.3263 -0.1808,
0.5753 -0.5732,-1.2381 -0.1725,2.8890 1.2777,1.2771 1.4634,
-1.6247 -0.3238,-0.8302 -1.1981,0.3750 0.2207,-0.8958 0.4195,
0.5934 -0.5248,-0.6407 -0.6331,0.7223 1.2787,-1.6284 -0.5142,
1.6639 0.8816,-0.3646 0.9514,0.8073 1.2787,0.4745 1.2441,
-0.2153 1.3834,-2.9043 -2.2853,0.9276 1.5124,-0.7966 0.5477,
0.4095 0.1387,0.1980 0.1679,0.5019 0.4255,0.5386 0.9979};
*END OF REQUIRED USER SPECIFICATIONS;
****************************************************************;
*CENTERED EXPLANATORY VECTORS (X, Z, XZ) AND THEIR MOMENT SUMS;
XE=XT||(XT[,1]#XT[,2]);
K=NROW(XE);
XM=XE[:,];
XC=XE-J(K,1,1)*XM;
H=J(3,3,0);
HH=H@H;
DO I=1 TO K;
  H=H+XC[I,]`*XC[I,];
  HH=HH+(XC[I,]`*XC[I,])@(XC[I,]`*XC[I,]);
END;
CIL=BETAXZ-W;CIU=BETAXZ+W;COVERP=1-LGAMMA;
*SIMPSON WEIGHTS ON A STANDARD NORMAL GRID FOR THE EXPECTATION OVER W;
NUMINT=1000;L=NUMINT+1;
COEVEC=({1}||REPEAT({4 2},1,NUMINT/2-1)||{4 1})`;
INT=PROBIT(0.999995);
INTERVAL=2#INT/NUMINT;
ZVEC=((INTERVAL#(0:NUMINT))+(-INT))`;
WZPDF=(INTERVAL/3)#COEVEC#PDF('NORMAL',ZVEC,0,1);
*MEAN AND SCALED VARIANCE OF W (EQUATION A5);
SIGM=H/K;PSI=HH/K;
ISIGM=INV(SIGM);
MUW=1/ISIGM[3,3];
VW=ISIGM[3,];
VARW=(MUW##4)#((VW@VW)*PSI*(VW`@VW`)-MUW##(-2));
DELTAW=BETAXZ#SQRT(MUW/SIGMA2);
*INCREASE N UNTIL THE APPROXIMATE TOLERANCE (EQUATION C2) REACHES COVERP;
*WVEC IS DIVIDED BY SIGMA2 TO MATCH EQUATION C1 (NO EFFECT WHEN SIGMA2=1);
N=10;NCOVERP=0;
DO WHILE(NCOVERP<COVERP);
  N=N+1;
  WVEC=SQRT(VARW/(N-1))#ZVEC+MUW;
  WVEC=WVEC#(WVEC>0);
  NCOVERP=WZPDF`*(CDF('T',TINV(ALPHA/2,N-4),N-4,(BETAXZ-CIU)#SQRT((N-1)#WVEC/SIGMA2))
    -CDF('T',TINV(1-ALPHA/2,N-4),N-4,(BETAXZ-CIL)#SQRT((N-1)#WVEC/SIGMA2)));
END;
PRINT ALPHA BETAXZ W CIL CIU COVERP N;
QUIT;
(Manuscript received September 24, 2009; revision accepted for publication March 6, 2010.)