The inequality between the coefficient of determination and the sum of squared simple correlation coefficients


Publisher: Taylor & Francis


The American Statistician


Gwowen Shieh

Gwowen Shieh is Associate Professor, Department of Management Science, National Chiao Tung University, 1001 Ta Hsueh Road, Hsinchu, Taiwan 30050, Republic of China (E-mail: gwshieh@cc.nctu.edu.tw). This work was partially supported by National Science Council of Taiwan under contract number NSC89-2118-M-009-016. The author thanks the associate editor and referees for helpful comments that improved the presentation of this article.

Published online: 01 Jan 2012.

To cite this article: Gwowen Shieh (2001) The Inequality Between the Coefficient of Determination and the Sum of Squared Simple Correlation Coefficients, The American Statistician, 55:2, 121-124, DOI: 10.1198/000313001750358437

To link to this article: http://dx.doi.org/10.1198/000313001750358437


The inequality between the coefficient of determination and the sum of two squared simple correlation coefficients in a two-variable regression model is reexamined through two relative measures: the relative coefficient of determination, which is the ratio of the coefficient of determination to the sum of squares of the two simple correlations, and the relative simple correlation, which is the ratio of the two simple correlations. This approach not only permits new insights into their relationship but also allows clear and informative visual representations of various aspects of the counterintuitive condition. We consider the occurrence and corresponding magnitude, probability, and expected magnitude of the enhancement-synergism situation. Numerical examples are presented to illustrate these phenomena.

KEY WORDS: Coefficient of determination; Multiple regression; Simple correlation coefficient.

1. INTRODUCTION

Hamilton (1987) discussed the counterintuitive nature of multivariate relationships in standard multiple regression models: the coefficient of determination can exceed the sum of the squared correlation coefficients between the response variable and each explanatory variable. To explain this interesting and surprising feature, a theoretical proof and a geometrical argument that such an inequality can occur were provided for the regression model with two explanatory variables. A slightly simpler proof was given by Bertrand and Holder (1988). Related comments and discussions can be found in Currie and Korabinski (1984), Freund (1988), Hamilton (1988), Mitra (1988), Cuadras (1993), and their references. As a visual supplement, Currie and Korabinski (1984) and Freund (1988) contain several diagrams intended to illustrate when the counterintuitive conditions can occur. Since more than three measures are involved in those plots, their use is limited to selected conditional values of the chosen measure. Consequently, the interrelation is not completely shown by a single plot or even by several plots together. Hence more concise diagrams are needed to effectively conceive and evaluate the occurrences and magnitudes of these phenomena. Although the existence of the inequality is well presented and recognized in the aforementioned articles, it is still not clear exactly how often it can happen.

The purpose of this article is to provide more informative plots for the existence and subsequent analyses of the inequality and to quantify the probability and expected magnitude of the occurrence.

2. MAIN RESULTS FOR TWO-VARIABLE REGRESSION

Consider the standard regression model with one response variable $Y$ and two explanatory variables $X_1$ and $X_2$,

$$Y_i = \alpha + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i, \quad i = 1, \ldots, n, \tag{1}$$

where $\alpha$, $\beta_1$, and $\beta_2$ are parameters, and the $\varepsilon_i$ are iid $N(0, \sigma^2)$ random variables. Let $r_{Y1}$, $r_{Y2}$, and $r_{12}$ be the usual simple (product-moment) correlation coefficients between $Y$ and $X_1$, $Y$ and $X_2$, and $X_1$ and $X_2$, respectively. It follows from Equation (8) of Hamilton (1987) or Equation (3-81) of Johnston (1991) that the coefficient of determination $R^2$ in terms of the simple correlation coefficients is

$$R^2 = \frac{r_{Y1}^2 + r_{Y2}^2 - 2 r_{Y1} r_{Y2} r_{12}}{1 - r_{12}^2}. \tag{2}$$
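As a quick numerical check of Equation (2), the following minimal sketch compares the value obtained from the three simple correlations with the $R^2$ of an explicit least-squares fit. The data values and all function names are hypothetical and chosen only for illustration; they do not come from the article.

```python
import math

def pearson(u, v):
    """Simple (product-moment) correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)

def r2_from_correlations(ry1, ry2, r12):
    """Coefficient of determination computed from Equation (2)."""
    return (ry1**2 + ry2**2 - 2 * ry1 * ry2 * r12) / (1 - r12**2)

def r2_from_least_squares(y, x1, x2):
    """R^2 of the fit Y ~ intercept + X1 + X2, solving the 2x2 normal equations."""
    n = len(y)
    my, m1, m2 = sum(y) / n, sum(x1) / n, sum(x2) / n
    yc = [v - my for v in y]          # centered response
    a = [v - m1 for v in x1]          # centered X1
    b = [v - m2 for v in x2]          # centered X2
    s11 = sum(p * p for p in a)
    s22 = sum(p * p for p in b)
    s12 = sum(p * q for p, q in zip(a, b))
    s1y = sum(p * q for p, q in zip(a, yc))
    s2y = sum(p * q for p, q in zip(b, yc))
    det = s11 * s22 - s12**2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    sse = sum((t - b1 * p - b2 * q) ** 2 for t, p, q in zip(yc, a, b))
    sst = sum(t * t for t in yc)
    return 1 - sse / sst

# Hypothetical data, for illustration only.
y = [3.1, 4.0, 5.2, 6.1, 6.8, 8.3, 9.0, 9.7]
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0]

r2_formula = r2_from_correlations(pearson(y, x1), pearson(y, x2), pearson(x1, x2))
r2_fit = r2_from_least_squares(y, x1, x2)
print(r2_formula, r2_fit)  # the two values should agree up to rounding error
```

The agreement is an algebraic identity for any data set in which $X_1$ and $X_2$ are not perfectly collinear, so the particular numbers used here carry no significance.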

We will focus on the inequality between the coefficient of determination and the sum of two squared simple correlation coefficients

$$R^2 > r_{Y1}^2 + r_{Y2}^2. \tag{3}$$

The inequality may seem surprising or counterintuitive at first. It will be shown later that it occurs more often than one may think. Currie and Korabinski (1984) called such an occurrence "enhancement," while Hamilton (1988) suggested "synergism" as an alternative. Here it will be termed "enhancement-synergism."
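To make inequality (3) concrete, here is a small sketch that evaluates Equation (2) for three correlation triples and reports which of them show enhancement-synergism. The triples and the helper names are assumptions made for illustration; the attainability check (positive definiteness of the correlation matrix) is included only so that the examples correspond to data sets that could exist.

```python
def r2_from_correlations(ry1, ry2, r12):
    """Coefficient of determination from Equation (2)."""
    return (ry1**2 + ry2**2 - 2 * ry1 * ry2 * r12) / (1 - r12**2)

def is_attainable(ry1, ry2, r12):
    """True if the 3x3 correlation matrix of (X1, X2, Y) is positive definite."""
    det = 1 - ry1**2 - ry2**2 - r12**2 + 2 * ry1 * ry2 * r12
    return abs(r12) < 1 and det > 0

# Hypothetical (r_Y1, r_Y2, r_12) triples, for illustration only.
cases = [
    (0.5, 0.0, 0.5),   # X2 is uncorrelated with Y but correlated with X1
    (0.5, 0.5, 0.5),   # both predictors related to Y in the same direction
    (0.4, -0.3, 0.6),  # predictors related to Y in opposite directions
]

results = []
for ry1, ry2, r12 in cases:
    assert is_attainable(ry1, ry2, r12)
    r2 = r2_from_correlations(ry1, ry2, r12)
    total = ry1**2 + ry2**2
    results.append(r2 > total)  # True means inequality (3) holds
    print(f"r_Y1={ry1}, r_Y2={ry2}, r_12={r12}: R^2={r2:.4f}, sum={total:.4f}")
```

In this sketch the first and third triples satisfy (3), while the second does not.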

2.1 When Does the Enhancement-Synergism Condition Hold?

Since it is obvious from (2) that $R^2 = r_{Y1}^2 + r_{Y2}^2$ for $r_{12} = 0$, we will assume $r_{12} \ne 0$ in the remainder of this article. It is shown in Hamilton (1987) that the necessary and sufficient condition for (3) in terms of the simple correlation coefficients $r_{Y1}$, $r_{Y2}$, and $r_{12}$ is

$$r_{12} \left( r_{12} - \frac{2 r_{Y1} r_{Y2}}{r_{Y1}^2 + r_{Y2}^2} \right) > 0. \tag{4}$$

Accordingly, the validity of inequality (4) depends upon the interrelation of $r_{Y1}$, $r_{Y2}$, and $r_{12}$. In this case a graphical representation is extremely informative in understanding its occurrence. To lay the basis for developing a simplified view and providing a concise visualization of when the inequality can occur, we first define the relative simple correlation, denoted by $Q$, as the ratio of $r_{Y2}$ to $r_{Y1}$:

$$Q = r_{Y2} / r_{Y1}. \tag{5}$$

Figure 1. The Regions of Enhancement-Synergism. (Axes: $Q$ and $r_{12}$, each from -1.0 to 1.0.)

© 2001 American Statistical Association. The American Statistician, May 2001, Vol. 55, No. 2.

Since the designation of $X_1$ and $X_2$ is arbitrary, as long as only one of $r_{Y1}$ and $r_{Y2}$ is zero, $Q$ can be set to zero (label the variable with zero correlation as $X_2$). The case in which both $r_{Y1}$ and $r_{Y2}$ are zero is excluded because $R^2$ is obviously zero from (2).

Equation (5) enables us to rewrite (4) in terms of $Q$ and $r_{12}$: one gets $r_{12} > g(Q)$ for $r_{12} > 0$ and $r_{12} < g(Q)$ for $r_{12} < 0$, where $g(Q) = 2Q/(1 + Q^2)$. Equivalently, $Q < q_0$ or $Q > 1/q_0$ for $r_{12} > 0$, and $Q < 1/q_0$ or $Q > q_0$ for $r_{12} < 0$, where $q_0 = (1 - \sqrt{1 - r_{12}^2})/r_{12}$. Note that $g(Q) = g(1/Q)$ for $Q \ne 0$. Therefore, the relation between $r_{12}$ and $Q$ can be represented by the relation between $r_{12}$ and $Q$ for $|Q| \le 1$. Figure 1 presents the occurrence of (4) for combinations of $r_{12}$ and $Q$ for $|Q| \le 1$. The four shaded areas are the regions in which enhancement-synergism occurs. We believe Figure 1 is more effective for communicating the results than Figure 2 of Currie and Korabinski (1984), where the occurrence of enhancement-synergism was presented with multiple conditional plots (combinations of $r_{Y2}$ and $r_{12}$ for selected values of $r_{Y1}$).
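The equivalence between condition (4) and the thresholds on $Q$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, and the grid of correlation values was chosen to stay away from the boundary $r_{12} = g(Q)$, where floating-point comparisons would be unreliable. The check is purely algebraic and does not test whether each triple is attainable by real data.

```python
import math

def g(q):
    """g(Q) = 2Q / (1 + Q^2), the boundary function in the (Q, r12) plane."""
    return 2 * q / (1 + q * q)

def q0(r12):
    """Root of r12 = g(Q) with |q0| < 1; the other root is 1/q0."""
    return (1 - math.sqrt(1 - r12 * r12)) / r12

def enhancement_by_condition_4(ry1, ry2, r12):
    """Condition (4), stated in terms of the three simple correlations."""
    return r12 * (r12 - 2 * ry1 * ry2 / (ry1**2 + ry2**2)) > 0

def enhancement_by_q(q, r12):
    """Same condition via Q: Q lies outside the interval between q0 and 1/q0."""
    lo, hi = sorted((q0(r12), 1 / q0(r12)))
    return q < lo or q > hi

ry_grid = [-0.9, -0.6, -0.3, 0.3, 0.6, 0.9]    # nonzero so that Q is defined
r12_grid = [-0.7, -0.4, -0.1, 0.1, 0.4, 0.7]   # avoids the boundary values

mismatches = 0
for ry1 in ry_grid:
    for ry2 in ry_grid:
        q = ry2 / ry1
        for r12 in r12_grid:
            if enhancement_by_condition_4(ry1, ry2, r12) != enhancement_by_q(q, r12):
                mismatches += 1

print("mismatches:", mismatches)
```

Because $g(Q) = g(1/Q)$, swapping the roles of $X_1$ and $X_2$ leaves the outcome unchanged, which is why the region only needs to be drawn for $|Q| \le 1$.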

2.2 What is the Magnitude of Enhancement-Synergism?

Instead of measuring the magnitude of enhancement-synergism with the direct difference of $R^2$ and $r_{Y1}^2 + r_{Y2}^2$, for the purpose of simplification, we suggest using the relative coefficient of determination, denoted by $H$,

$$H = R^2 / (r_{Y1}^2 + r_{Y2}^2), \tag{6}$$

which is the ratio of $R^2$ to $r_{Y1}^2 + r_{Y2}^2$. A useful expression for $H$ is

$$H = \frac{1 + Q^2 - 2 r_{12} Q}{(1 - r_{12}^2)(1 + Q^2)} \tag{7}$$

as a function of $Q$ and $r_{12}$. A close examination of (7) shows that $H$ attains its maximum $1/(1 - |r_{12}|)$ and minimum $1/(1 + |r_{12}|)$ at $Q = -\mathrm{sign}(r_{12})$ and $Q = \mathrm{sign}(r_{12})$, respectively. The function $\mathrm{sign}(x)$ returns 1, 0, or $-1$ if $x > 0$, $x = 0$, or $x < 0$, respectively. Moreover, $H(Q) = H(1/Q)$ for $Q \ne 0$. As in the previous discussion of $g(Q)$, the relation between $H$ and $Q$ can be represented by the relation between $H$ and $Q$ for $|Q| \le 1$. To visualize these facts, we plot $H$ against $Q$ for $|Q| \le 1$ in Figure 2 for three different values $r_{12} = .1$, $.5$, and $.9$. (The plots for negative values of $r_{12}$ are mirror images of those for positive values.) We believe Figure 2 provides a clearer presentation of the relative magnitude and range of the enhancement-synergism condition than the plots in Freund (1988) and Figure 1 of Currie and Korabinski (1984).
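The stated extremes and the symmetry of $H$ can be verified with a short sketch. The function names and the grid resolution are illustrative assumptions; the formula itself is Equation (7).

```python
def H(q, r12):
    """Relative coefficient of determination, Equation (7)."""
    return (1 + q * q - 2 * r12 * q) / ((1 - r12**2) * (1 + q * q))

def sign(x):
    """Returns 1, 0, or -1 for x > 0, x = 0, or x < 0."""
    return (x > 0) - (x < 0)

grid = [i / 1000 for i in range(-1000, 1001)]  # Q values covering |Q| <= 1

summary = {}
for r12 in (0.1, 0.5, 0.9, -0.5):
    h_max = H(-sign(r12), r12)   # claimed maximum, 1 / (1 - |r12|)
    h_min = H(sign(r12), r12)    # claimed minimum, 1 / (1 + |r12|)
    values = [H(q, r12) for q in grid]
    summary[r12] = (h_min, h_max, min(values), max(values))
    print(f"r12={r12}: min H={h_min:.4f}, max H={h_max:.4f}")
```

Since $H(Q) = H(1/Q)$, the range of $H$ over $|Q| \le 1$ is also its range over all $Q \ne 0$.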

2.3 How Often Does the Enhancement-Synergism Condition Occur and What is the Expected Magnitude of Enhancement-Synergism?

Because we know the enhancement-synergism condition exists, it is of interest to know how often it occurs and what the expected magnitude is in terms of the relative coefficient of determination; that is, what are $P(H > 1)$ and $E[H]$? To determine these two quantities, we start with the derivation of the pdf of $Q$. Assume the true values of the coefficient parameters $\beta_1$ and $\beta_2$ in (1) are $\beta_1^*$ and $\beta_2^*$, respectively. It can be shown from the standard assumptions that $Q = Z_2/Z_1$, where $Z_j = S_{Yj}/(\sigma S_j) \sim N(\mu_j, 1)$; $S_{Yj} = \sum_{i=1}^{n} (Y_i - \bar{Y})(X_{ji} - \bar{X}_j)$; $S_j$ is the square root of $S_j^2 = \sum_{i=1}^{n} (X_{ji} - \bar{X}_j)^2$, $j = 1$ and 2; $\mu_1 = (\beta_1^* S_1 + \beta_2^* S_2 r_{12})/\sigma$ and $\mu_2 = (\beta_2^* S_2 + \beta_1^* S_1 r_{12})/\sigma$. Note that $\mathrm{corr}(Z_1, Z_2) = r_{12}$. Hence the distribution of $Q$ is exactly the same as the distribution of the ratio of two correlated normal random variables with means $(\mu_1, \mu_2)$, variances $(1, 1)$, and correlation $r_{12}$. This is a special case of the ratio of two correlated normal random variables discussed by Fieller (1932) and Hinkley (1969). Its explicit pdf and cdf were given by Hinkley (1969, eq. (1)–(3)). It appears, however, that there is no analytic form for $P(H > 1) = 1 - P(q_0 < Z_2/Z_1 < 1/q_0)$ for $r_{12} > 0$, or $1 - P(1/q_0 < Z_2/Z_1 < q_0)$ for $r_{12} < 0$, except in the following special case.

Figure 2. The Relation Between H and Q. (Axes: $Q$ from -1.0 to 1.0 and $H$ from 0 to 11; curves for $r_{12} = 0.1$, $0.5$, and $0.9$.)

Table 1. The Occurrence Probability of Enhancement-Synergism and the Expected Relative Coefficient of Determination. The columns headed $P$ give $P(H > 1 \mid \mu_1, \mu_2, r_{12})$ and the columns headed $E$ give $E[H \mid \mu_1, \mu_2, r_{12}]$.

| $\mu_1$ | $\mu_2$ | $P$, $r_{12} = .1$ | $P$, $r_{12} = .5$ | $P$, $r_{12} = .9$ | $E$, $r_{12} = .1$ | $E$, $r_{12} = .5$ | $E$, $r_{12} = .9$ |
|---|---|---|---|---|---|---|---|
| -2 | -2 | .0549 | .0972 | .1360 | .9329 | .7881 | .8951 |
| -2 | -1 | .1992 | .3129 | .6875 | .9583 | .9670 | 2.5033 |
| -2 | 0 | .5383 | .7192 | .9925 | 1.0100 | 1.3343 | 5.4156 |
| -2 | 1 | .8508 | .9514 | 1.0000 | 1.0607 | 1.6597 | 7.5596 |
| -2 | 2 | .9656 | .9953 | 1.0000 | 1.0854 | 1.8107 | 8.5373 |
| -1 | -2 | .1992 | .3129 | .6875 | .9583 | .9670 | 2.5033 |
| -1 | -1 | .2824 | .3284 | .3586 | .9707 | .9922 | 1.6872 |
| -1 | 0 | .5137 | .5864 | .8683 | 1.0071 | 1.2358 | 4.2373 |
| -1 | 1 | .7507 | .8551 | .9984 | 1.0453 | 1.5342 | 6.7198 |
| -1 | 2 | .8508 | .9514 | 1.0000 | 1.0607 | 1.6597 | 7.5596 |
| 0 | -2 | .5383 | .7192 | .9925 | 1.0100 | 1.3343 | 5.4156 |
| 0 | -1 | .5137 | .5864 | .8683 | 1.0071 | 1.2358 | 4.2373 |
| 0 | 0 | .5000 | .5000 | .5000 | 1.0050 | 1.1547 | 2.2942 |
| 0 | 1 | .5137 | .5864 | .8683 | 1.0071 | 1.2358 | 4.2373 |
| 0 | 2 | .5383 | .7192 | .9925 | 1.0100 | 1.3343 | 5.4156 |
| 1 | -2 | .8508 | .9514 | 1.0000 | 1.0607 | 1.6597 | 7.5596 |
| 1 | -1 | .7507 | .8551 | .9984 | 1.0453 | 1.5342 | 6.7198 |
| 1 | 0 | .5137 | .5864 | .8683 | 1.0071 | 1.2358 | 4.2373 |
| 1 | 1 | .2824 | .3284 | .3586 | .9707 | .9922 | 1.6872 |
| 1 | 2 | .1992 | .3129 | .6875 | .9583 | .9670 | 2.5033 |
| 2 | -2 | .9656 | .9953 | 1.0000 | 1.0854 | 1.8107 | 8.5373 |
| 2 | -1 | .8508 | .9514 | 1.0000 | 1.0607 | 1.6597 | 7.5596 |
| 2 | 0 | .5383 | .7192 | .9925 | 1.0100 | 1.3343 | 5.4156 |
| 2 | 1 | .1992 | .3129 | .6875 | .9583 | .9670 | 2.5033 |
| 2 | 2 | .0549 | .0972 | .1360 | .9329 | .7881 | .8951 |

Assume $\beta_1^* = \beta_2^* = 0$, or equivalently $\mu_1 = \mu_2 = 0$. In this case, $Q$ can be viewed as the ratio of two correlated standard normal variables and has a Cauchy distribution with location parameter $\theta = r_{12}$ and scale parameter $\lambda = (1 - r_{12}^2)^{1/2}$; see Johnson, Kotz, and Balakrishnan (1994, eq. 16.1). Since its cdf is of the form $F_Q(q) = .5 + \pi^{-1} \tan^{-1}\{(q - \theta)/\lambda\}$, and since $(q_0 - \theta)/\lambda = -q_0$ and $(1/q_0 - \theta)/\lambda = 1/q_0$, one has

$$P(H > 1) = 1 - |F_Q(q_0) - F_Q(1/q_0)| = 1 - |\tan^{-1}(-q_0) - \tan^{-1}(1/q_0)|/\pi = .5.$$

The last equality follows from the fact that $-q_0 \times 1/q_0 = -1$, so the two arctangents differ by $\pi/2$. Therefore it is equally likely to have or not to have enhancement-synergism in a two-variable regression with explanatory variables that are absolutely irrelevant for describing the response variable. More importantly, this is true for all $r_{12} \ne 0$. This outcome may be easy to guess; however, it is not as trivial as one may think.
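The special case can be confirmed numerically from the Cauchy cdf alone. The sketch below assumes the location and scale parameters stated above ($\theta = r_{12}$, $\lambda = (1 - r_{12}^2)^{1/2}$); the function names are hypothetical.

```python
import math

def q0(r12):
    """Threshold q0 = (1 - sqrt(1 - r12^2)) / r12."""
    return (1 - math.sqrt(1 - r12 * r12)) / r12

def cauchy_cdf(q, theta, lam):
    """Cdf of a Cauchy distribution with location theta and scale lam."""
    return 0.5 + math.atan((q - theta) / lam) / math.pi

def prob_enhancement_null(r12):
    """P(H > 1) when mu1 = mu2 = 0, so that Q is Cauchy(r12, sqrt(1 - r12^2))."""
    theta, lam = r12, math.sqrt(1 - r12 * r12)
    a = q0(r12)
    return 1 - abs(cauchy_cdf(a, theta, lam) - cauchy_cdf(1 / a, theta, lam))

null_probs = {r12: prob_enhancement_null(r12) for r12 in (-0.9, -0.5, -0.1, 0.1, 0.5, 0.9)}
for r12, p in null_probs.items():
    print(f"r12={r12}: P(H > 1) = {p:.6f}")
```

Each probability comes out as one half up to rounding error, regardless of the sign or size of $r_{12}$.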

The actual occurrence probabilities of enhancement-synergism under various values of $(\mu_1, \mu_2)$ are calculated through numerical integration and are listed in Table 1 for $r_{12} = .1$, $.5$, and $.9$. In general, $P(H > 1 \mid \mu_1, \mu_2, r_{12}) = P(H > 1 \mid -\mu_1, -\mu_2, r_{12})$ and $P(H > 1 \mid \mu_1, -\mu_2, r_{12}) = P(H > 1 \mid -\mu_1, \mu_2, r_{12})$ due to symmetry. It is also true that $P(H > 1 \mid \mu_1, \mu_2, r_{12}) = P(H > 1 \mid \mu_1, -\mu_2, -r_{12})$.
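The article does not specify its integration scheme, so the following is only a sketch of one way such probabilities could be computed. It assumes a midpoint rule over $Z_1$ and uses the fact that, given $Z_1 = z$, $Z_2$ is normal with mean $\mu_2 + r_{12}(z - \mu_1)$ and variance $1 - r_{12}^2$. The function names, grid size, and integration range are illustrative choices, not taken from the source.

```python
import math

def norm_cdf(x):
    """Standard normal cdf."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def norm_pdf(x):
    """Standard normal pdf."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

def prob_enhancement(mu1, mu2, r12, n_grid=20000, half_width=8.0):
    """P(H > 1 | mu1, mu2, r12) by midpoint-rule integration over Z1."""
    q0 = (1 - math.sqrt(1 - r12 * r12)) / r12
    lo, hi = sorted((q0, 1 / q0))      # no enhancement when lo < Z2/Z1 < hi
    s = math.sqrt(1 - r12 * r12)       # conditional sd of Z2 given Z1
    h = 2 * half_width / n_grid
    p_none = 0.0
    for k in range(n_grid):
        z = mu1 - half_width + (k + 0.5) * h
        m = mu2 + r12 * (z - mu1)      # conditional mean of Z2 given Z1 = z
        a, b = sorted((lo * z, hi * z))  # bounds on Z2; order flips when z < 0
        inside = norm_cdf((b - m) / s) - norm_cdf((a - m) / s)
        p_none += norm_pdf(z - mu1) * inside * h
    return 1 - p_none

for mu1, mu2, r12 in [(0, 0, 0.5), (2, 2, 0.1), (-2, 2, 0.1), (1, -1, 0.5)]:
    print(mu1, mu2, r12, round(prob_enhancement(mu1, mu2, r12), 4))
```

Under these assumptions the results can be compared with the corresponding entries of Table 1, and the three symmetry relations stated above can be checked by changing the signs of the arguments.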

Based on the pdf of $Q$ and (7), we can evaluate the expected relative coefficient of determination $E[H]$ for any $r_{12} \ne 0$. Again it does not appear to have a simple analytic form and numerical integration is necessary to carry out the expectation.

Table 1 also presents the expected magnitude of enhancement-synergism in terms of $H$ for $r_{12} = .1$, $.5$, and $.9$. In particular, when $(\mu_1, \mu_2) = (0, 0)$, the values are 1.0050, 1.1547, and 2.2942 for $r_{12} = .1$, $.5$, and $.9$, respectively. This suggests that the relative coefficient of determination is greater than one "on average" for all $r_{12} \ne 0$ when $(\mu_1, \mu_2) = (0, 0)$. So far we are unable to prove that $E[H \mid \mu_1 = 0, \mu_2 = 0, r_{12}] > 1$ for all $r_{12} \ne 0$. Moreover, Table 1 shows that the differences of $E[H]$ among different values of $(\mu_1, \mu_2)$ are more dramatic as $|r_{12}|$ gets larger. Overall, $E[H \mid \mu_1, \mu_2, r_{12}] = E[H \mid -\mu_1, -\mu_2, r_{12}]$, $E[H \mid \mu_1, -\mu_2, r_{12}] = E[H \mid -\mu_1, \mu_2, r_{12}]$, and $E[H \mid \mu_1, \mu_2, r_{12}] = E[H \mid \mu_1, -\mu_2, -r_{12}]$.
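For the case $(\mu_1, \mu_2) = (0, 0)$, the expectation can be approximated with a short sketch. It assumes the Cauchy distribution of $Q$ described earlier and uses the substitution $Q = \theta + \lambda \tan u$, under which $u$ is uniform on $(-\pi/2, \pi/2)$; the midpoint rule and the function names are illustrative choices, since the article does not describe its integration method.

```python
import math

def H(q, r12):
    """Relative coefficient of determination, Equation (7)."""
    return (1 + q * q - 2 * r12 * q) / ((1 - r12**2) * (1 + q * q))

def expected_H_null(r12, n_grid=20000):
    """E[H | mu1 = 0, mu2 = 0, r12], with Q ~ Cauchy(r12, sqrt(1 - r12^2)).

    With Q = theta + lam * tan(u), u is uniform on (-pi/2, pi/2), so the
    expectation is the average of H over an evenly spaced grid of u values.
    """
    theta, lam = r12, math.sqrt(1 - r12 * r12)
    h = math.pi / n_grid
    total = 0.0
    for k in range(n_grid):
        u = -math.pi / 2 + (k + 0.5) * h   # midpoints avoid the endpoints
        total += H(theta + lam * math.tan(u), r12)
    return total / n_grid

null_means = {r12: expected_H_null(r12) for r12 in (0.1, 0.5, 0.9)}
for r12, value in null_means.items():
    print(f"r12={r12}: E[H] is approximately {value:.4f}")
```

The computed values can be set against the $(0, 0)$ row of Table 1.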

3. CONCLUSION

We provide a simplified and systematic view of the counterintuitive inequality, or the enhancement-synergism condition, in which the coefficient of determination exceeds the sum of two squared simple correlation coefficients. The major difference between our approach and others is that the relative simple correlation $Q$ and the relative coefficient of determination $H$ are the primary tools for analyzing this phenomenon, rather than the original measures and their difference. The following four major questions are studied: (1) When does the enhancement-synergism condition occur? (2) What is the magnitude of enhancement-synergism? (3) How often does the enhancement-synergism condition occur? (4) What is the expected magnitude of enhancement-synergism? Both theoretical arguments and graphical presentations are given. Numerical examples are provided to illustrate the levels of the enhancement-synergism and its dependence on the other measures. In addition to the surprising enhancement-synergism condition itself, we point out two interesting features when the explanatory variables are absolutely irrelevant for describing the response variable. It is shown that even if the two true coefficient parameters are indeed zero ($\beta_1^* = \beta_2^* = 0$), the occurrence probability of the enhancement-synergism condition is .5 for all $r_{12} \ne 0$. Furthermore, under the same assumption, it is shown numerically that the expected relative coefficient of determination appears to be greater than 1 for all $r_{12} \ne 0$ and is increasing with $|r_{12}|$.

[Received October 1999. Revised April 2000.]

REFERENCES

Bertrand, P. V., and Holder, R. L. (1988), "A Quirk in Multiple Regression: The Whole Regression can be Greater Than the Sum of its Parts," The Statistician, 37, 371–374.

Cuadras, C. M. (1993), "Interpreting an Inequality in Multiple Regression," The American Statistician, 47, 256–258.

Currie, I., and Korabinski, A. (1984), "Some Comments on Bivariate Regression," The Statistician, 33, 283–293.

Fieller, E. C. (1932), "The Distribution of the Index in a Normal Bivariate Population," Biometrika, 24, 428–440.

Freund, R. J. (1988), "When is $R^2 > r_{y1}^2 + r_{y2}^2$ (Revisited)," The American Statistician, 42, 89–90.

Hamilton, D. (1987), "Sometimes $R^2 > r_{yx_1}^2 + r_{yx_2}^2$, Correlated Variables are not Always Redundant," The American Statistician, 41, 129–132.

Hamilton, D. (1988), "Sometimes $R^2 > r_{yx_1}^2 + r_{yx_2}^2$, Correlated Variables are not Always Redundant" (Reply), The American Statistician, 42, 90–91.

Hinkley, D. V. (1969), "On the Ratio of Two Correlated Normal Random Variables," Biometrika, 56, 635–639.

Johnson, N. L., Kotz, S., and Balakrishnan, N. (1994), Distributions in Statistics: Continuous Univariate Distributions I, New York: Wiley.

Johnston, J. (1991), Econometric Methods, New York: McGraw-Hill.

Mitra, S. (1988), "The Relationship Between the Multiple and the Zero-Order Correlation Coefficients," The American Statistician, 42, 89.
