Sample size requirements for interval estimation of the strength of association effect sizes in multiple regression analysis


Multiple regression analysis is one of the major methods of statistical analysis in applied research across many scientific fields. For descriptive purposes, the sample squared multiple correlation coefficient, usually denoted by R², is commonly employed to assess the strength of association between the response variable and the predictor variables. See Bobko (2001) and Cohen et al. (2003) for operational guidelines and practical implications in the areas of management and behavioral sciences. A primary concern of regression analysis is the distinction between the two scenarios of fixed (conditional) and random (unconditional) modeling formulations, which ultimately lead to different inferential procedures. One must have a clear understanding of the respective setups and how they can be utilized before the issues involved in the construction of an appropriate regression model can be fully explained. Notably, Sampson (1974) gave an excellent and thorough description of the two modeling formulations, in which the random setting adopts the convenient assumption that all variables have a joint multivariate normal distribution. The procedures for power calculation, interval estimation, and sample size determination under the fixed regression models are well known; see Murphy and Myors (2004) and Smithson (2003) and the references therein for further details. However, the statistical properties of the corresponding inferential procedures are more complex under the random model.

Although the underlying normality assumption provides a convenient and useful setup, the resulting probability density function of the sample squared multiple correlation coefficient R² is notoriously complicated in form. This complexity has prompted numerous investigations that give various expressions, approximations, and computing algorithms for the distribution of the sample squared multiple correlation coefficient. See Johnson et al. (1995, Chapter 32) and Stuart and Ord (1994, Chapter 16) for further details. For the purpose of point estimation, it is well known that R² is a positively biased estimator of the population squared multiple correlation coefficient ρ². To reduce the bias, several shrinkage estimators have been suggested in the literature. See Raju et al.


Gwowen Shieh

National Chiao Tung University

Abstract

Background: Effect size reporting and interpreting practices have been extensively recommended in academic journals when analyzing primary outcomes of all empirical studies. Accordingly, the sample squared multiple correlation coefficient is the commonly reported strength of association index in practical applications of multiple linear regression. Method: This paper examines the sample size procedures proposed by Bonett and Wright for precise interval estimation of the squared multiple correlation coefficient. Results: The simulation results showed that their simple method for attaining the desired precision of expected width provides satisfactory results only when sample sizes are large. Moreover, the suggested sample size formula for achieving the designated assurance probability is inaccurate and problematic. Conclusions: According to these findings, their sample size procedures are not recommended.

Keywords: Assurance probability, expected width, squared multiple correlation.

Resumen (Spanish abstract, in English translation)

Sample size requirements for interval estimation of the strength of association effect sizes in regression analysis. Background: The practice of reporting and interpreting effect sizes has been extensively recommended in academic journals when analyzing primary outcomes in empirical studies. Consequently, the sample squared multiple correlation coefficient is the most frequently reported strength of association index in practical applications of multiple linear regression. Method: This work examines the sample size procedure that Bonett and Wright proposed for precise interval estimation of the squared multiple correlation coefficient. Results: The simulation results indicate that their simple method for attaining the desired precision of expected width gives satisfactory results only when the sample size is large. Moreover, the suggested sample size formula for achieving the designated assurance probability is inaccurate and problematic. Conclusions: According to these findings, the sample size procedure is not recommended.

Keywords: assurance probability, expected width, squared multiple correlation.

Psicothema 2013, Vol. 25, No. 3, 402-407

doi: 10.7334/psicothema2012.221

Received: July 25, 2012 • Accepted: January 18, 2013
Corresponding author: Gwowen Shieh
Department of Management Science and Institute of Statistics
National Chiao Tung University
30010 Hsinchu (Taiwan)
e-mail: gwshieh@mail.nctu.edu.tw


(1997), Shieh (2008), and Yin and Fan (2001) for further details. On the other hand, Helland (1987) suggested a simple approximate confidence interval using the ordinary F distribution, and the accuracy of the approximation is remarkably good for practical purposes. Moreover, exact confidence interval procedures were presented in Mendoza and Stafford (2001), Shieh (2006), Shieh and Kung (2007), and Steiger and Fouladi (1992). Unlike the approximate method, the exact approach inverts the distribution of R² and is called the “cumulative distribution function” pivotal method in Casella and Berger (2002, Section 9.2.3) and Mood, Graybill and Boes (1974, Section 4.2). Therefore, the calculations of exact confidence intervals for ρ² are methodologically and computationally more involved than those for the standard interval procedures of treatment contrasts in ANOVA. Consequently, the calculation of confidence intervals requires a special purpose computer program for performing the necessary computations of the probability distribution function of R².

Instead of a direct accept-or-reject conclusion in a simple hypothesis test, confidence intervals are more informative about location and precision of the statistic, and they should be the best reporting strategy according to the recommendations of Wilkinson and the American Psychological Association Task Force on Statistical Inference (1999), as well as the Publication Manual of the American Psychological Association (2009). In addition, the editorial guidelines and methodological recommendations of several prominent educational and psychological journals stress that it is necessary to include some measures of effect size and confidence intervals for all primary outcomes. For example, see Alhija and Levy (2009), Dunst and Hamby (2012), Fritz, Morris and Richler (2012), Odgaard and Fowler (2010), and Sun, Pan and Wang (2010). The emphasis on reporting effect sizes and confidence intervals implies that researchers should plan studies not only to select practically meaningful effect size indices but also to have sufficiently accurate interval estimates of effect sizes. Thus it is prudent to facilitate this research practice by determining the necessary sample sizes to satisfy the desired precision of interval estimation in the planning stage of research design.

It follows from the general reviews of effect size estimates in Breaugh (2003), Ferguson (2009), Fern and Monroe (1996), Kirk (1996), Richardson (1996), and Vacha-Haase and Thompson (2004) that the squared multiple correlation coefficient is one of the most commonly used strength of association measures in social science research. Accordingly, there is a considerable recent literature on sample size determination for precise interval estimation of the squared multiple correlation coefficient within the linear regression framework. Due to the complexity of the exact probability density function of R², the calculations of required sample size are extremely complicated to perform both efficiently and reliably. Therefore, Kelley (2008) and Krishnamoorthy and Xia (2008) utilized simulation-based or trial-and-error approaches to circumvent the difficulties in calculating the necessary sample sizes for adequate interval precision with respect to the control of expected width, and to the assurance probability of interval width within a designated value. Both computer programs and tabulated sample sizes are provided in Kelley (2008) and Krishnamoorthy and Xia (2008) for constructing precise confidence intervals under the selected precision criterion. Although the difficulty of exact sample size computations has been avoided in Kelley (2008) and Krishnamoorthy and Xia (2008), their suggested simulation procedures are still computationally intensive.

In view of the importance of accurate sample size formulas for a precise confidence interval of the squared multiple correlation coefficient and the computational demands of the current methods, Bonett and Wright (2011) proposed a simple procedure for approximating the sample size required to obtain a squared multiple correlation confidence interval with desired precision. The suggested sample size formula is derived from the approximate confidence interval of the squared multiple correlation coefficient using the asymptotic normal distribution of the sample multiple correlation coefficient. It is noted in Bonett and Wright (2011) that the resulting technique is attractive in its simplicity and is surprisingly accurate for controlling the expected width of the nearly exact confidence intervals of Helland (1987). Moreover, they also presented a closed-form sample size formula for computing the necessary sample size that will yield a confidence interval that is not wider than the designated bound with a nominal assurance probability. Numerical illustrations and practical recommendations are provided to enhance the practical usefulness of their procedures.

Despite the appealing simplicity of the sample size procedures of Bonett and Wright (2011), two obvious caveats in their arguments and expositions should be noted. First, it is well known that the distribution of the sample squared multiple correlation is generally skewed. Hence, the equidistant confidence interval derived from the asymptotic normal distribution for a transformation of the sample squared multiple correlation is presumably inappropriate and is not likely to be accurate. Second, the justification for the accuracy of their formulas is based only on the computed relative precision of confidence limits for the approximate interval procedure of Helland (1987) under selected model configurations. Accordingly, the lack of a rigorous assessment of the accuracy of the sample size formulas through a comprehensive simulation study is an obvious drawback of the current explication in Bonett and Wright (2011). The actual performance of the suggested sample size procedures should be extensively evaluated before they can be adopted as a general methodology in practice. To this end, the article conducts detailed numerical investigations to assess the adequacy of the sample size methods in Bonett and Wright (2011). We respectfully suggest that our article serves as an update and clarification of their recent work.

The remainder of the paper is organized in the following manner. In the next section, the fundamental results of the approximate interval procedure of Helland (1987) and the sample size techniques for precise confidence intervals of Bonett and Wright (2011) are described. Then, Monte Carlo simulation studies are performed to appraise the accuracy of the sample size formulas under a variety of model and precision configurations. The accuracy of the approximate techniques of Bonett and Wright (2011) is evaluated with the computed Helland (1987) confidence intervals in terms of the control of expected width, and of the assurance probability of interval width within a designated value. Finally, some concluding remarks are provided.

Confidence intervals and sample size calculations

Consider the standard multiple linear regression model with criterion variable Y and p predictor variables (X₁, ..., X_p) for N independent sets of these jointly multivariate normal variables. The sample squared multiple correlation coefficient R² is a prevailing strength of association effect size measure for the population squared multiple correlation coefficient ρ² between the criterion variable and the set of predictor variables. For practical use, Helland (1987) presented an approximate interval estimation procedure for ρ². Specifically, the approximate 100(1 − α)% confidence interval for ρ² is

$$(\hat{\rho}_L^2,\ \hat{\rho}_U^2), \tag{1}$$

where

$$\hat{\rho}_L^2 = \frac{(N - p - 1)R^2 - (1 - R^2)\,pF_L}{(N - p - 1)\{R^2 + (1 - R^2)F_L\}}, \qquad \hat{\rho}_U^2 = \frac{(N - p - 1)R^2 - (1 - R^2)\,pF_U}{(N - p - 1)\{R^2 + (1 - R^2)F_U\}},$$

F_L is the 100(1 − α/2) percentile of the F distribution with ν_L and N − p − 1 degrees of freedom, F_U is the 100(α/2) percentile of the F distribution with ν_U and N − p − 1 degrees of freedom, and

$$\nu_L = \frac{\{(N - p - 1)\hat{\rho}_L^2 + p\}^2}{N - 1 - (N - p - 1)(1 - \hat{\rho}_L^2)^2}, \qquad \nu_U = \frac{\{(N - p - 1)\hat{\rho}_U^2 + p\}^2}{N - 1 - (N - p - 1)(1 - \hat{\rho}_U^2)^2}.$$

Essentially, since F_L and F_U, or ν_L and ν_U, also depend on the confidence limits ρ̂²_L and ρ̂²_U, the optimal values of ρ̂²_L and ρ̂²_U in Equation 1 need to be found by a simple iterative search. From a numerical comparison with the exact results, Helland (1987) concluded that the accuracy of the approximate interval estimates is surprisingly good because the error is negligible for practical purposes. Moreover, the approximate confidence intervals for ρ² can be computed with the SAS procedure PROC CANCORR (SAS Institute, 2011).

To ensure the precision of Helland’s (1987) confidence intervals for the squared multiple correlation coefficient, two methods were considered in Bonett and Wright (2011). The width of the 100(1 − α)% confidence interval (ρ̂²_L, ρ̂²_U) is denoted by W = ρ̂²_U − ρ̂²_L. One formula gives the minimum sample size such that the expected confidence interval width E[W] is within the designated bound. The other provides the sample size needed to guarantee, with a given assurance probability P{W ≤ ω}, that the width of a confidence interval will not exceed the planned range. Specifically, for a given value ρ² = ρ̃², the sample size N_EW needed for the expected width of a 100(1 − α)% confidence interval (ρ̂²_L, ρ̂²_U) to fall within the designated bound ω is the minimum integer N such that

$$N \ge 16\tilde{\rho}^2\{z_{\alpha/2}/\ln(\tilde{e})\}^2 + p + 2, \tag{2}$$

where z_{α/2} is the upper 100(α/2) percentile of the standard normal distribution and ẽ = (1 − ρ̃² + ω/2)/(1 − ρ̃² − ω/2). On the other hand, for a given value ρ² = ρ̃², the sample size N_AP required to guarantee with a given assurance probability (1 − γ) that the width of a 100(1 − α)% confidence interval (ρ̂²_L, ρ̂²_U) will not exceed the planned range ω is the smallest integer N such that

$$N \ge 16\tilde{\rho}_U^2\{z_{\alpha/2}/\ln(\tilde{e})\}^2 + p + 2, \tag{3}$$

where ρ̃²_U = 1 − exp[ln(1 − ρ̃²) + z_γ{4ρ̃²/(N − p − 2)}^{1/2}], and z_γ is the upper 100·γ percentile of the standard normal distribution. Bonett and Wright (2011) suggested repeating the calculations of N and ρ̃²_U two or three times for better approximation. Analytical arguments and theoretical justifications can be found in Bonett and Wright (2011).

It is clear from Equations 2 and 3 that the sample sizes required to attain the designated precision of expected width and assurance probability can be readily computed without a complex algorithm. For illustration, consider the model and precision settings with ρ² = 0.2, p = 5, 1 − α = 0.95, and ω = 0.3. It follows from Equation 2 that the sample size N_EW needed for the expected width of a 95% confidence interval (ρ̂²_L, ρ̂²_U) to fall within the designated bound 0.3 is N_EW = 93 when the underlying population ρ² = 0.2. Likewise, the corresponding sample size is N_EW = 43 for the configuration of ρ² = 0.7, p = 5, 1 − α = 0.95, and ω = 0.3. In contrast, the necessary sample sizes computed with the simulation-based approach of Kelley (2008, Table 4, p. 547) are 87 and 50, respectively, whereas Krishnamoorthy and Xia (2008, Table 2, p. 401) obtained the respective values 84 and 49 for ρ² = 0.2 and 0.7 according to their trial-and-error procedure. It can be readily seen from these results that there are discrepancies between the sample sizes computed by the different techniques of Bonett and Wright (2011), Kelley (2008), and Krishnamoorthy and Xia (2008). Presumably, the computationally intensive methods of Kelley (2008) and Krishnamoorthy and Xia (2008) are more accurate than the simplified formula of Bonett and Wright (2011). However, these illustrative sample size calculations for precise confidence intervals are not detailed enough to elucidate whether the trade of accuracy for simplicity is a wise bargain. To the best of our knowledge, no research to date has examined the performance of Bonett and Wright’s (2011) simple formulas in greater detail. Consequently, it is worthwhile to clarify the issue surrounding the adequacy of their techniques because computational simplicity is not the only concern in sample size planning. For pedagogical and practical purposes, the accuracy of their sample size procedures is examined in the following numerical investigation.
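The two sample size formulas can be sketched in standard-library Python as below; the function names are hypothetical and the code is an illustration rather than the implementation of Bonett and Wright (2011). Two points are assumptions. First, Equation 3 is iterated from the Equation 2 value until N stops changing, instead of a fixed two or three passes. Second, the z_γ term is subtracted inside the exponential, with z_γ taken as the positive upper percentile, so that ρ̃²_U lies above ρ̃²; with this sign convention the sketch returns N_AP = 195, 216, and 132, which are the Table 2 entries for (ρ², ω) = (0.1, 0.2), (0.5, 0.2), and (0.3, 0.3), whereas adding the term does not reproduce them.

```python
from math import ceil, exp, log, sqrt
from statistics import NormalDist


def _scale(rho2, omega, alpha):
    """Common factor 16 * {z_{alpha/2} / ln(e~)}^2 of Equations 2 and 3."""
    if not 0.0 < rho2 < 1.0 or omega / 2.0 >= 1.0 - rho2:
        raise ValueError("requires 0 < rho2 < 1 and omega / 2 < 1 - rho2")
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    e_tilde = (1.0 - rho2 + omega / 2.0) / (1.0 - rho2 - omega / 2.0)
    return 16.0 * (z / log(e_tilde)) ** 2


def n_expected_width(rho2, p, omega, alpha=0.05):
    """Equation 2: smallest integer N for the expected-width criterion."""
    return ceil(_scale(rho2, omega, alpha) * rho2 + p + 2)


def n_assurance(rho2, p, omega, alpha=0.05, gamma=0.10, max_iter=50):
    """Equation 3: smallest integer N for the assurance-probability criterion."""
    scale = _scale(rho2, omega, alpha)
    z_gamma = NormalDist().inv_cdf(1.0 - gamma)  # positive upper percentile
    n = n_expected_width(rho2, p, omega, alpha)  # assumed starting value
    for _ in range(max_iter):
        # Assumed sign: subtracting the z_gamma term puts rho2_upper above rho2.
        shift = z_gamma * sqrt(4.0 * rho2 / (n - p - 2))
        rho2_upper = 1.0 - exp(log(1.0 - rho2) - shift)
        n_new = ceil(scale * rho2_upper + p + 2)
        if n_new == n:
            break
        n = n_new
    return n


print(n_expected_width(0.2, 5, 0.3))  # 93
print(n_expected_width(0.7, 5, 0.3))  # 43
print(n_assurance(0.1, 5, 0.2))       # 195
```

The guard in `_scale` reflects that ẽ is undefined or non-positive when ω/2 ≥ 1 − ρ̃², so the formulas cannot be evaluated for combinations such as ρ² = 0.9 with ω = 0.2.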

Numerical study

In order to demonstrate the features of Bonett and Wright’s (2011) sample size procedures in Equations 2 and 3, empirical examinations were performed for precise interval estimation of the squared multiple correlation coefficient. The numerical study was carried out in two stages. The first stage involved extensive sample size calculations for the two precision principles of expected width and assurance probability across a variety of model configurations. In the second stage, Monte Carlo simulation studies were conducted to assess the actual precision outcome for the suggested sample sizes under the design characteristics described in the first stage.

Sample size calculations

The determination of sample sizes needed for the chosen precision of the confidence intervals requires the specification of the confidence level, the magnitude of the squared multiple correlation coefficient, and the number of predictor variables. It is evident that the influence of each of these components on the precision behavior not only differs but also depends on the concurrent impact of other factors. To provide a concise explication, the numerical assessments are specified by fixing the number of predictors p = 5 and confidence level 1 − α = 0.95, and varying the squared multiple correlation coefficient ρ² from 0.1 to 0.9 with an increment of 0.1 in the appraisals. Moreover, the interval bound ω = 0.2, 0.3, and 0.4, and assurance probability 1 − γ = 0.90 are selected for the two precision criteria of expected width and assurance probability.


These levels were chosen to reflect common sample sizes used in typical research settings. Accordingly, the computed sample sizes N_EW and N_AP with respect to the selected precision requirements are listed in Tables 1 and 2 for the expected width and assurance probability principles, respectively. As expected, the sample sizes vary with the parameter and precision specifications of ρ² and ω in the two tables. It is evident from the reported results that the sample size increases as ω decreases when all other factors are fixed. Also, the sample size is a concave function of ρ² with a maximum around 0.3 when all other factors are fixed. Since the bound of interval width is identical in Tables 1 and 2, the consistent difference in magnitude between the sample sizes N_EW and N_AP indicates that it typically requires a larger sample size to meet the necessary precision of assurance probability than to control a designated expected width. In other words, the sample sizes computed by the expected width consideration tend to be inadequate to guarantee the desired assurance level of interval width.

Simulation study

We then evaluate the accuracy of the sample size calculations through the following Monte Carlo simulation study. Under the computed sample sizes, parameter configurations, and precision settings described in Tables 1 and 2, estimates of the true expected width or assurance probability are computed through Monte Carlo simulation of 10,000 independent data sets. Note that the exact probability density function of R² is extremely complex, and it is therefore difficult to generate a pseudo random variable from the common expression of R² in terms of the hypergeometric and beta functions. However, it is well known that there is a direct connection between the correlation model with multinormal variables and the multivariate normal regression model. Hence, inferences for ρ² can be accomplished with the usual F* statistic:

$$F^* = \frac{R^2/p}{(1 - R^2)/(N - p - 1)}.$$

Additionally, there is an important correspondence between the derived F* distribution and the following generic form suggested by Gurland (1968), namely

$$\frac{\{Z + (\Lambda W_1)^{1/2}\}^2 + W_2}{W_3},$$

where Λ = ρ²/(1 − ρ²), Z has the standard normal distribution N(0, 1), W₁ ~ χ²(N − 1), W₂ ~ χ²(p − 1), W₃ ~ χ²(N − p − 1), χ²(df) denotes a chi-square distribution with df degree(s) of freedom, and the random variables Z, W₁, W₂, and W₃ are mutually independent. Consequently, the pseudo F* random variable or, equivalently, the pseudo random variable R² = pF*/{(N − p − 1) + pF*}, can be generated by employing the random number functions for the standard normal and chi-square distributions provided in most modern statistical packages.
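A minimal standard-library Python sketch of this generation step is given below. The function name `simulate_r2` and the seed are hypothetical, and the code assumes that the ratio displayed above is the distribution of R²/(1 − R²) = pF*/(N − p − 1), so that R² is recovered as ratio/(1 + ratio), which equals pF*/{(N − p − 1) + pF*}. Chi-square variates are drawn as gamma variates with shape df/2 and scale 2.

```python
import random
from math import sqrt


def simulate_r2(rho2, n, p, rng):
    """Draw one pseudo R^2 from normal and chi-square variates."""
    lam = rho2 / (1.0 - rho2)

    def chi2(df):
        # chi-square(df) is Gamma(shape=df/2, scale=2); df = 0 gives the constant 0
        return rng.gammavariate(df / 2.0, 2.0) if df > 0 else 0.0

    z = rng.gauss(0.0, 1.0)
    w1, w2, w3 = chi2(n - 1), chi2(p - 1), chi2(n - p - 1)
    ratio = ((z + sqrt(lam * w1)) ** 2 + w2) / w3  # assumed to be R^2 / (1 - R^2)
    return ratio / (1.0 + ratio)


rng = random.Random(2013)  # arbitrary seed, for reproducibility only
draws = [simulate_r2(0.5, 100, 5, rng) for _ in range(5000)]
print(sum(draws) / len(draws))  # average R^2, slightly above rho^2 = 0.5
```

The average of the draws sits a little above ρ², in line with the positive bias of R² noted in the introduction.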

For each replicate of R², the confidence limits and corresponding interval width of the two-sided 95% confidence interval (ρ̂²_L, ρ̂²_U) of ρ² are calculated. Then the simulated expected width is the mean of the 10,000 replicates of interval widths, whereas the simulated assurance probability is the proportion of the 10,000 replicates whose values of interval width are less than or equal to the specified bound ω. The adequacy of the sample size procedure for precise interval estimation is determined by one of the following formulas: error = simulated expected width − nominal expected width, or error = simulated assurance probability − nominal assurance probability. Both the simulated expected width and simulated assurance probability, along with the associated errors, are summarized in Tables 1 and 2 as well.
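The two summary measures and their errors can be computed from a list of replicate widths as in the following sketch. The function name `precision_summary` and the four widths are hypothetical toy values chosen for illustration, not simulation output, and the nominal expected width is taken to be the bound ω, which is how the errors in Table 1 are formed.

```python
def precision_summary(widths, omega, nominal_assurance):
    """Simulated expected width, assurance probability, and their errors."""
    n_rep = len(widths)
    expected_width = sum(widths) / n_rep
    assurance = sum(w <= omega for w in widths) / n_rep
    return {
        "expected_width": expected_width,
        "expected_width_error": expected_width - omega,
        "assurance": assurance,
        "assurance_error": assurance - nominal_assurance,
    }


toy_widths = [0.18, 0.19, 0.21, 0.22]  # hypothetical replicate widths
print(precision_summary(toy_widths, omega=0.2, nominal_assurance=0.9))
```

In this toy case the mean width matches the bound while only half of the widths fall within it, which illustrates why meeting the expected width criterion does not by itself deliver the desired assurance probability.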

Results and discussions

For the simulated results of expected width and assurance probability in Tables 1 and 2, the sample size procedures of Bonett and Wright (2011) exhibit some disturbing behavior. First, the simulated expected widths and corresponding errors in Table 1 show that the sample sizes computed by Equation 2 are not uniformly accurate for the 27 combined settings of ρ² and ω. The absolute error is less than 0.01 for ρ² ≤ 0.7 when ω = 0.2, and for 0.4 ≤ ρ² ≤ 0.6 when ω = 0.3, whereas all the resulting absolute errors are greater than 0.01 except for the single case of ρ² = 0.5 when ω = 0.4. Moreover, it is noteworthy that the error tends to increase with ρ² for each of the three interval bounds. This distinct pattern reveals the potential deficiency in the sample size computation of Bonett and Wright (2011). Unfortunately, this undesirable property was not addressed in their numerical assessment.

Table 1
Computed sample size and simulated expected width for the nearly exact 95% two-sided confidence interval of strength of association effect size ρ² when the number of predictors p = 5 (error = simulated E[W] − ω)

| ρ² | N_EW (ω = 0.2) | Simulated E[W] | Error | N_EW (ω = 0.3) | Simulated E[W] | Error | N_EW (ω = 0.4) | Simulated E[W] | Error |
|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 131 | 0.1920 | -0.0080 | 82 | 0.2390 | -0.0610 | 56 | 0.2861 | -0.1139 |
| 0.2 | 202 | 0.1955 | -0.0045 | 93 | 0.2846 | -0.0154 | 55 | 0.3612 | -0.0388 |
| 0.3 | 230 | 0.1972 | -0.0028 | 105 | 0.2899 | -0.0101 | 61 | 0.3764 | -0.0236 |
| 0.4 | 225 | 0.1982 | -0.0018 | 102 | 0.2939 | -0.0061 | 59 | 0.3853 | -0.0147 |
| 0.5 | 194 | 0.1999 | -0.0001 | 88 | 0.2984 | -0.0016 | 50 | 0.3985 | -0.0015 |
| 0.6 | 149 | 0.2017 | 0.0017 | 67 | 0.3058 | 0.0058 | 38 | 0.4164 | 0.0164 |
| 0.7 | 97 | 0.2067 | 0.0067 | 43 | 0.3250 | 0.0250 | 24 | 0.4658 | 0.0658 |
| 0.8 | 48 | 0.2237 | 0.0237 | 20 | 0.4088 | 0.1088 | 8 | 0.8872 | 0.4872 |
| 0.9 | 8 | 0.8458 | 0.6458 | 8 | 0.8458 | 0.5458 | 8 | 0.8461 | 0.4461 |

On the other hand, the assurance performance in Table 2 also demonstrates the underlying drawback of the simple formula in Equation 3. Due to the highly skewed distribution of R² and the underlying metric of integer sample sizes, some of the simulated assurance probabilities are 1 for the calculated sample sizes. Hence the corresponding errors have the value of 0.1 for three, four, and five cases when ω = 0.2, 0.3, and 0.4, respectively. The only two cases giving acceptable results are associated with the simulated assurance probabilities 0.9745 and 0.9062 for N_AP = 195 and 101 under the settings of ρ² = 0.1 and ω = 0.2, and ρ² = 0.5 and ω = 0.3, respectively. However, there are five, four, and four occurrences in which the computed sample sizes do not guarantee the desired assurance probability level for ω = 0.2, 0.3, and 0.4, respectively. All but one of these simulated assurance probabilities are substantially lower than the nominal probability 0.9; the only exception is the simulated value 0.8987 with error −0.0013 for ρ² = 0.5 and ω = 0.2. Consequently, the sample size formula has the serious disadvantage of underestimating the necessary sample size for achieving the specified assurance level. Moreover, our extended calculations confirm that this phenomenon continues to exist in other model configurations. Overall, the presented numerical evidence suggests that the sample size formulas of Bonett and Wright (2011) are not accurate enough to serve as a general method for computing the sample sizes that ensure precise confidence intervals of the squared multiple correlation.

Conclusions

There is a considerable recent literature pertaining to the illuminating applications of effect sizes and confidence intervals in quantitative study. Accordingly, the desirability of achieving required precision in effect size estimation and the importance of sample size planning in constructing precise confidence intervals are repeatedly emphasized in applied research across many scientific fields. Researchers should become methodologically conscious that most rules of thumb for sample size calculations are inadequate to warrant the conclusion that the resulting confidence interval is of statistical precision and practical importance. Due to the computational complexity of sample size computation for precise interval estimation of strength of association effect sizes, Bonett and Wright (2011) presented alternative and simple sample size techniques in two distinct aspects. One method gives the minimum sample size such that the expected confidence interval width is within the designated bound. The other provides the sample size needed to guarantee, with a given assurance probability, that the width of a confidence interval will not exceed the planned range. To assess the usefulness of the suggested methodology, numerical investigations were performed here to evaluate the accuracy of their sample size procedures. In view of the comprehensive empirical assessments conducted, the approximate formulas of Bonett and Wright (2011) are not accurate enough to give optimal sample sizes for achieving the desired precision. Therefore, their procedures are not recommended for precise interval estimation of the squared multiple correlation coefficient in multiple regression analysis.

In order to enhance the applicability of confidence intervals for strength of association effect sizes, in the present article, we present a comprehensive and updated account of the corresponding sample size techniques. It is important to realize that the simplicity of an explicit formula may be appealing as a computational shortcut, but it does not involve all of the key factors in sample size calculation and, thus, is generally error prone. Without our appraisal and demonstration in this paper, applied researchers and practitioners may unknowingly adopt their sample size formulas for the advantage of simplicity. This may lead to miscomputed sample sizes, distorted precision performance, and unsatisfactory research outcomes for the planned study. Consequently, instead of the simplified formulas, it is prudent to consider a more sophisticated approach such as the prescribed simulation-based method.

Table 2
Computed sample size and simulated assurance probability for the nearly exact 95% two-sided confidence interval of strength of association effect size ρ² when the number of predictors p = 5 and assurance probability 1 − γ = 0.9 (error = simulated P{W ≤ ω} − 0.9)

| ρ² | N_AP (ω = 0.2) | Simulated P{W ≤ ω} | Error | N_AP (ω = 0.3) | Simulated P{W ≤ ω} | Error | N_AP (ω = 0.4) | Simulated P{W ≤ ω} | Error |
|---|---|---|---|---|---|---|---|---|---|
| 0.1 | 195 | 0.9745 | 0.0745 | 129 | 1.0000 | 0.1000 | 93 | 1.0000 | 0.1000 |
| 0.2 | 257 | 1.0000 | 0.1000 | 127 | 1.0000 | 0.1000 | 79 | 1.0000 | 0.1000 |
| 0.3 | 273 | 1.0000 | 0.1000 | 132 | 1.0000 | 0.1000 | 80 | 1.0000 | 0.1000 |
| 0.4 | 256 | 1.0000 | 0.1000 | 122 | 1.0000 | 0.1000 | 73 | 1.0000 | 0.1000 |
| 0.5 | 216 | 0.8987 | -0.0013 | 101 | 0.9062 | 0.0062 | 60 | 1.0000 | 0.1000 |
| 0.6 | 163 | 0.6750 | -0.2250 | 75 | 0.6155 | -0.2845 | 44 | 0.5755 | -0.3245 |
| 0.7 | 105 | 0.5355 | -0.3645 | 48 | 0.4662 | -0.4338 | 27 | 0.3718 | -0.5282 |
| 0.8 | 51 | 0.3946 | -0.5054 | 22 | 0.2800 | -0.6200 | 8 | 0.0527 | -0.8473 |
| 0.9 | 8 | 0.0494 | -0.8506 | 8 | 0.0752 | -0.8248 | 8 | 0.1048 | -0.7952 |


References

American Psychological Association (2009). Publication manual of the American Psychological Association (6th ed.). Washington, DC: Author.

Alhija, F.N.A., & Levy, A. (2009). Effect size reporting practices in published articles. Educational and Psychological Measurement, 69, 245-265.

Bobko, P. (2001). Correlation and regression: Applications for Industrial Organizational Psychology and Management (2nd ed.). Thousand Oaks, CA: Sage.

Bonett, D.G., & Wright, T.A. (2011). Sample size requirements for multiple regression interval estimation. Journal of Organizational Behavior, 32, 822-830.

Breaugh, J.A. (2003). Effect size estimation: Factors to consider and mistakes to avoid. Journal of Management, 29, 79-97.

Casella, G., & Berger, R.L. (2002). Statistical inference (2nd ed.). Pacific Grove, CA: Duxbury.

Cohen, J., Cohen, P., West, S.G., & Aiken, L.S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Erlbaum.

Dunst, C.J., & Hamby, D.W. (2012). Guide for calculating and interpreting effect sizes and confidence intervals in intellectual and developmental disability research studies. Journal of Intellectual & Developmental Disability, 37, 89-99.

Ferguson, C.J. (2009). An effect size primer: A guide for clinicians and researchers. Professional Psychology: Research and Practice, 40, 532-538.

Fern, E.F., & Monroe, K.B. (1996). Effect-size estimates: Issues and problems in interpretation. Journal of Consumer Research, 23, 89-105.

Fritz, C.O., Morris, P.E., & Richler, J.J. (2012). Effect size estimates: Current use, calculations, and interpretation. Journal of Experimental Psychology: General, 141, 2-18.

Gurland, J. (1968). A relatively simple form of the distribution of the multiple correlation coefficient. Journal of the Royal Statistical Society, Series B, 30, 276-283.

Helland, I.S. (1987). On the interpretation and use of R² in regression analysis. Biometrics, 43, 61-69.

Johnson, N.L., Kotz, S., & Balakrishnan, N. (1995). Continuous univariate distributions (2nd ed., Vol. 2). New York: Wiley.

Kelley, K. (2008). Sample size planning for the squared multiple correlation coefficient: Accuracy in parameter estimation via narrow confidence intervals. Multivariate Behavioral Research, 43, 524-555.

Kirk, R. (1996). Practical significance: A concept whose time has come. Educational and Psychological Measurement, 56, 746-759.

Krishnamoorthy, K., & Xia, Y. (2008). Sample size calculation for estimating or testing a nonzero squared multiple correlation coefficient. Multivariate Behavioral Research, 43, 382-410.

Mendoza, J.L., & Stafford, K.L. (2001). Confidence interval, power calculation, and sample size estimation for the squared multiple correlation coefficient under the fixed and random regression models: A computer program and useful standard tables. Educational and Psychological Measurement, 61, 650-667.

Mood, A.M., Graybill, F.A., & Boes, D.C. (1974). Introduction to the theory of statistics (3rd ed.). New York: McGraw-Hill.

Murphy, K.R., & Myors, B. (2004). Statistical Power Analysis: A simple and general model for traditional and modern hypothesis tests (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum Associates.

Odgaard, E.C., & Fowler, R.L. (2010). Confidence intervals for effect sizes: Compliance and clinical significance in the Journal of Consulting and Clinical Psychology. Journal of Consulting and Clinical Psychology, 78, 287-297.

Raju, N.S., Bilgic, R., Edwards, J.E., & Fleer, P.F. (1997). Methodology review: Estimation of population validity and cross-validity, and the use of equal weights in prediction. Applied Psychological Measurement, 21, 291-305.

Richardson, J.T.E. (1996). Measures of effect size. Behavior Research Methods, Instruments, & Computers, 28, 12-22.

Sampson, A.R. (1974). A tale of two regressions. Journal of the American Statistical Association, 69, 682-689.

SAS Institute (2011). SAS/STAT User’s Guide, Version 9.3. Cary, NC: SAS Institute Inc.

Shieh, G. (2006). Exact interval estimation, power calculation and sample size determination in normal correlation analysis. Psychometrika, 71, 529-540.

Shieh, G. (2008). Improved shrinkage estimation of squared multiple correlation coefficient and squared cross-validity coefficient. Organizational Research Methods, 11, 387-407.

Shieh, G., & Kung, C.F. (2007). Methodological and computational considerations for multiple correlation analysis. Behavior Research Methods, 39, 731-734.

Smithson, M. (2003). Confidence intervals. Sage University Papers Series on Quantitative Applications in the Social Sciences, 07-140. Thousand Oaks, CA: Sage.

Steiger, J.H., & Fouladi, R.T. (1992). R2: A computer program for interval estimation, power calculations, sample size estimation, and hypothesis testing in multiple regression. Behavior Research Methods, Instruments, & Computers, 24, 581-582.

Stuart, A., & Ord, J.K. (1994). Kendall’s advanced theory of statistics (6th ed., Vol. 1). New York: Halsted Press.

Sun, S., Pan, W., & Wang, L.L. (2010). A comprehensive review of effect size reporting and interpreting practices in academic journals in education and psychology. Journal of Educational Psychology, 102, 989-1004.

Vacha-Haase, T., & Thompson, B. (2004). How to estimate and interpret various effect sizes. Journal of Counseling Psychology, 51, 473-481.

Wilkinson, L., & the Task Force on Statistical Inference (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.

Yin, P., & Fan, X. (2001). Estimating R² shrinkage in multiple regression: A comparison of different analytical methods. Journal of Experimental Education, 69, 203-224.