4.5 Covariance and Correlation
In earlier sections, we discussed the absence or presence of a relationship between two random variables, that is, independence or nonindependence. But if there is a relationship, it may be strong or weak. In this section, we discuss two numerical measures of the strength of a relationship between two random variables: the covariance and the correlation.
Throughout this section, we will use the notation $EX = \mu_X$, $EY = \mu_Y$, $\operatorname{Var} X = \sigma_X^2$, and $\operatorname{Var} Y = \sigma_Y^2$.
Definition 4.5.1 The covariance of X and Y is the number defined by Cov(X, Y ) = E((X − µX)(Y − µY)).
Definition 4.5.2 The correlation of X and Y is the number defined by
\[ \rho_{XY} = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}. \]
The value $\rho_{XY}$ is also called the correlation coefficient.
Theorem 4.5.3 For any random variables X and Y ,
Cov(X, Y ) = EXY − µXµY.
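The identity in Theorem 4.5.3 can be checked exactly on a small discrete example. The sketch below uses a hypothetical joint pmf (the four support points and their probabilities are assumptions chosen only for illustration, not taken from the text) and compares the definition of the covariance with the shortcut formula using exact rational arithmetic.

```python
from fractions import Fraction as F

# Hypothetical joint pmf of (X, Y); the points and weights are illustrative only.
pmf = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 1): F(1, 4), (2, 3): F(1, 4)}

def expect(g):
    """Return E g(X, Y) under the joint pmf above."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

mu_x = expect(lambda x, y: x)
mu_y = expect(lambda x, y: y)

# Definition 4.5.1: Cov(X, Y) = E((X - mu_X)(Y - mu_Y)).
cov_definition = expect(lambda x, y: (x - mu_x) * (y - mu_y))

# Theorem 4.5.3: Cov(X, Y) = EXY - mu_X * mu_Y.
cov_shortcut = expect(lambda x, y: x * y) - mu_x * mu_y

print(cov_definition, cov_shortcut)
```

Both expressions agree because expanding the product inside the expectation and using linearity gives EXY − µXµY − µXµY + µXµY.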
Theorem 4.5.5 If X and Y are independent random variables, then Cov(X, Y ) = 0 and ρXY = 0.
Theorem 4.5.6 If X and Y are any two random variables and a and b are any two constants, then
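\[ \operatorname{Var}(aX + bY) = a^2 \operatorname{Var} X + b^2 \operatorname{Var} Y + 2ab \operatorname{Cov}(X, Y). \]
If X and Y are independent random variables, then
\[ \operatorname{Var}(aX + bY) = a^2 \operatorname{Var} X + b^2 \operatorname{Var} Y. \]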
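As a sanity check on Theorem 4.5.6, the following sketch computes the variance of aX + bY two ways for a dependent pair: directly from the distribution of aX + bY, and from the three-term formula. The joint pmf and the constants a and b are assumptions made up for illustration.

```python
from fractions import Fraction as F

def expect(pmf, g):
    """Return E g(X, Y) under a joint pmf given as {(x, y): probability}."""
    return sum(p * g(x, y) for (x, y), p in pmf.items())

def var_direct(pmf, a, b):
    """Variance of aX + bY computed straight from its definition."""
    m = expect(pmf, lambda x, y: a * x + b * y)
    return expect(pmf, lambda x, y: (a * x + b * y - m) ** 2)

def var_formula(pmf, a, b):
    """a^2 Var X + b^2 Var Y + 2ab Cov(X, Y), as in Theorem 4.5.6."""
    mx = expect(pmf, lambda x, y: x)
    my = expect(pmf, lambda x, y: y)
    var_x = expect(pmf, lambda x, y: (x - mx) ** 2)
    var_y = expect(pmf, lambda x, y: (y - my) ** 2)
    cov = expect(pmf, lambda x, y: (x - mx) * (y - my))
    return a * a * var_x + b * b * var_y + 2 * a * b * cov

# Hypothetical dependent pair; values chosen only for illustration.
dependent = {(0, 0): F(1, 4), (0, 1): F(1, 4), (1, 1): F(1, 4), (2, 3): F(1, 4)}
a, b = 2, -3

print(var_direct(dependent, a, b), var_formula(dependent, a, b))
```

When X and Y are independent the covariance term vanishes, which is the second statement of the theorem.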
Theorem 4.5.7 For any random variables X and Y,
a. −1 ≤ ρXY ≤ 1.
b. |ρXY| = 1 if and only if there exist numbers a ≠ 0 and b such that P(Y = aX + b) = 1.
If ρXY = 1, then a > 0, and if ρXY = −1, then a < 0.
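Proof: Consider the function h(t) defined by
\[ h(t) = E\big[((X - \mu_X)t + (Y - \mu_Y))^2\big] = t^2\sigma_X^2 + 2t\operatorname{Cov}(X, Y) + \sigma_Y^2. \]
Since $h(t) \ge 0$ for all t and h is a quadratic function of t, it has at most one real root, so its discriminant is nonpositive:
\[ (2\operatorname{Cov}(X, Y))^2 - 4\sigma_X^2\sigma_Y^2 \le 0. \]
This is equivalent to
\[ -\sigma_X\sigma_Y \le \operatorname{Cov}(X, Y) \le \sigma_X\sigma_Y. \]
That is,
\[ -1 \le \rho_{XY} \le 1. \]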
Also, $|\rho_{XY}| = 1$ if and only if the discriminant is equal to 0, that is, if and only if h(t) has a single root. But since $((X - \mu_X)t + (Y - \mu_Y))^2 \ge 0$, h(t) = 0 if and only if
\[ P\big((X - \mu_X)t + (Y - \mu_Y) = 0\big) = 1. \]
This is $P(Y = aX + b) = 1$ with $a = -t$ and $b = \mu_X t + \mu_Y$, where t is the root of h(t). Using the quadratic formula, we see that this root is $t = -\operatorname{Cov}(X, Y)/\sigma_X^2$. Thus $a = -t$ has the same sign as $\rho_{XY}$, proving the final assertion. □
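Part (b) of Theorem 4.5.7 can be illustrated in the easy direction: if Y = aX + b exactly, then ρ² = 1 and the sign of the covariance matches the sign of a. The sketch below assumes a hypothetical three-point pmf for X and two illustrative slopes; it compares Cov(X, Y)² with Var X · Var Y so that no square roots are needed.

```python
from fractions import Fraction as F

# Hypothetical marginal pmf of X, chosen only for illustration.
pmf_x = {0: F(1, 2), 1: F(1, 3), 4: F(1, 6)}

def moments_of_line(a, b):
    """Return (Cov(X, Y), Var X, Var Y) when Y = a*X + b with probability 1."""
    mx = sum(p * x for x, p in pmf_x.items())
    my = sum(p * (a * x + b) for x, p in pmf_x.items())
    var_x = sum(p * (x - mx) ** 2 for x, p in pmf_x.items())
    var_y = sum(p * (a * x + b - my) ** 2 for x, p in pmf_x.items())
    cov = sum(p * (x - mx) * (a * x + b - my) for x, p in pmf_x.items())
    return cov, var_x, var_y

for a in (F(3), F(-1, 2)):
    cov, var_x, var_y = moments_of_line(a, F(5))
    # rho^2 = Cov^2 / (Var X * Var Y); it should equal 1 for any a != 0.
    rho_squared = cov ** 2 / (var_x * var_y)
    print(a, rho_squared, "positive" if cov > 0 else "negative")
```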
Example 4.5.8 (Correlation-I) Let X have a uniform(0,1) distribution and Z have a uniform(0,0.1) distribution. Suppose X and Z are independent. Let Y = X + Z and consider the random vector (X, Y). The joint pdf of (X, Y) is
f (x, y) = 10, 0 < x < 1, x < y < x + 0.1
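Note that f(x, y) can be obtained from the relationship f(x, y) = f(y|x)f(x). Then
\[
\begin{aligned}
\operatorname{Cov}(X, Y) &= EXY - (EX)(EY) \\
&= EX(X + Z) - (EX)(E(X + Z)) \\
&= EX^2 - (EX)^2 + EXZ - (EX)(EZ) \\
&= \sigma_X^2 = \frac{1}{12},
\end{aligned}
\]
where the last two terms of the third line cancel because X and Z are independent.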
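The variance of Y is $\sigma_Y^2 = \operatorname{Var} X + \operatorname{Var} Z = \frac{1}{12} + \frac{1}{1200}$. Thus
\[ \rho_{XY} = \frac{1/12}{\sqrt{1/12}\,\sqrt{1/12 + 1/1200}} = \sqrt{\frac{100}{101}}. \]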
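The value √(100/101) from Example 4.5.8 can be compared against a simulation. The following is a minimal Monte Carlo sketch using only the Python standard library; the sample size and the seed are arbitrary assumptions, and the helper `sample_correlation` is written here for illustration rather than taken from any library.

```python
import math
import random

def sample_correlation(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)   # fixed seed so the run is reproducible
n = 100_000              # arbitrary sample size

xs = [rng.uniform(0.0, 1.0) for _ in range(n)]   # X ~ uniform(0, 1)
ys = [x + rng.uniform(0.0, 0.1) for x in xs]     # Y = X + Z, Z ~ uniform(0, 0.1)

r_hat = sample_correlation(xs, ys)
exact = math.sqrt(100 / 101)
print(r_hat, exact)
```

The estimate should sit very close to the exact value, reflecting that Y differs from X only by a small independent perturbation.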
The next example illustrates that there may be a strong relationship between X and Y , but if the relationship is not linear, the correlation may be small.
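Example 4.5.9 (Correlation-II) Let X ∼ Unif(−1, 1), Z ∼ Unif(0, 0.1), and let X and Z be independent. Let $Y = X^2 + Z$ and consider the random vector (X, Y). Given X = x, $Y \sim \text{Unif}(x^2, x^2 + 0.1)$, so the joint pdf of X and Y is
\[ f(x, y) = 5, \quad -1 < x < 1, \quad x^2 < y < x^2 + \tfrac{1}{10}. \]
Then
\[
\begin{aligned}
\operatorname{Cov}(X, Y) &= E(X(X^2 + Z)) - (EX)(E(X^2 + Z)) \\
&= EX^3 + EXZ - 0 \cdot E(X^2 + Z) \\
&= 0,
\end{aligned}
\]
since $EX^3 = 0$ by symmetry and $EXZ = (EX)(EZ) = 0$ by independence. Thus, $\rho_{XY} = \operatorname{Cov}(X, Y)/(\sigma_X\sigma_Y) = 0$.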
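A simulation makes the contrast in Example 4.5.9 concrete. The sketch below (standard library only; seed and sample size are arbitrary assumptions) estimates the correlation between X and Y, which should be near 0, and, as an added illustration not in the example itself, the correlation between X² and Y, which is close to 1 because Y is nearly a linear function of X².

```python
import math
import random

def sample_correlation(xs, ys):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)   # fixed seed so the run is reproducible
n = 100_000              # arbitrary sample size

xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]      # X ~ Unif(-1, 1)
ys = [x * x + rng.uniform(0.0, 0.1) for x in xs]     # Y = X^2 + Z
x_squared = [x * x for x in xs]

r_xy = sample_correlation(xs, ys)            # linear association of X and Y
r_x2y = sample_correlation(x_squared, ys)    # linear association of X^2 and Y
print(r_xy, r_x2y)
```

The correlation only measures linear association, so it misses the strong quadratic dependence of Y on X.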
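Definition 4.5.10 Let $-\infty < \mu_X < \infty$, $-\infty < \mu_Y < \infty$, $0 < \sigma_X$, $0 < \sigma_Y$, and $-1 < \rho < 1$ be five real numbers. The bivariate normal pdf with means $\mu_X$ and $\mu_Y$, variances $\sigma_X^2$ and $\sigma_Y^2$, and correlation $\rho$ is the bivariate pdf given by
\[
f(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2}}
\exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x - \mu_X}{\sigma_X}\right)^2 - 2\rho\left(\frac{x - \mu_X}{\sigma_X}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right) + \left(\frac{y - \mu_Y}{\sigma_Y}\right)^2 \right] \right\}
\]
for $-\infty < x < \infty$ and $-\infty < y < \infty$.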
The many nice properties of this distribution include these:
a. The marginal distribution of X is $N(\mu_X, \sigma_X^2)$.
b. The marginal distribution of Y is $N(\mu_Y, \sigma_Y^2)$.
c. The correlation between X and Y is $\rho_{XY} = \rho$.
d. For any constants a and b, the distribution of aX + bY is $N(a\mu_X + b\mu_Y,\ a^2\sigma_X^2 + b^2\sigma_Y^2 + 2ab\rho\sigma_X\sigma_Y)$.
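Properties (c) and (d) can be probed by simulation. The sketch below is one possible check, not a proof. It assumes a standard construction that is not given in the text: with independent standard normals Z1 and Z2, set X = µX + σX·Z1 and Y = µY + σY·(ρ·Z1 + √(1 − ρ²)·Z2). The parameter values, constants a and b, seed, and sample size are all illustrative assumptions.

```python
import math
import random

def sample_bivariate_normal(rng, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Draw one (X, Y) pair built from two independent standard normals."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    x = mu_x + sigma_x * z1
    y = mu_y + sigma_y * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    return x, y

def mean(vs):
    return sum(vs) / len(vs)

def variance(vs):
    m = mean(vs)
    return sum((v - m) ** 2 for v in vs) / len(vs)

def correlation(xs, ys):
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / math.sqrt(variance(xs) * variance(ys))

# Illustrative parameter values, not taken from the text.
mu_x, mu_y, sigma_x, sigma_y, rho = 1.0, -2.0, 2.0, 3.0, 0.6
a, b = 2.0, -1.0

rng = random.Random(0)   # fixed seed so the run is reproducible
pairs = [sample_bivariate_normal(rng, mu_x, mu_y, sigma_x, sigma_y, rho)
         for _ in range(100_000)]
xs = [p[0] for p in pairs]
ys = [p[1] for p in pairs]
ws = [a * x + b * y for x, y in pairs]   # samples of aX + bY

r_hat = correlation(xs, ys)                                   # compare with rho, property (c)
w_mean_exact = a * mu_x + b * mu_y                            # mean in property (d)
w_var_exact = (a**2 * sigma_x**2 + b**2 * sigma_y**2
               + 2 * a * b * rho * sigma_x * sigma_y)         # variance in property (d)
print(r_hat, mean(ws), w_mean_exact, variance(ws), w_var_exact)
```

The check only compares the first two moments of aX + bY with those in (d); it does not test normality of the linear combination.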
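Assuming (a) and (b) are true, we will prove (c). Let
\[ s = \left(\frac{x - \mu_X}{\sigma_X}\right)\left(\frac{y - \mu_Y}{\sigma_Y}\right) \quad\text{and}\quad t = \frac{x - \mu_X}{\sigma_X}. \]
Then $x = \sigma_X t + \mu_X$, $y = (\sigma_Y s/t) + \mu_Y$, and the Jacobian of the transformation is $J = \sigma_X\sigma_Y/t$.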
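With this change of variables, we obtain
\[
\begin{aligned}
\rho_{XY} &= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} s\, f\!\left(\sigma_X t + \mu_X,\ \frac{\sigma_Y s}{t} + \mu_Y\right) \left|\frac{\sigma_X\sigma_Y}{t}\right| ds\, dt \\
&= \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} s \left(2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2}\right)^{-1} \exp\left(-\frac{1}{2(1 - \rho^2)}\left(t^2 - 2\rho s + \left(\frac{s}{t}\right)^2\right)\right) \frac{\sigma_X\sigma_Y}{|t|}\, ds\, dt \\
&= \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{t^2}{2}\right) \left[ \int_{-\infty}^{\infty} \frac{s}{\sqrt{2\pi}\sqrt{(1 - \rho^2)t^2}} \exp\left(-\frac{(s - \rho t^2)^2}{2(1 - \rho^2)t^2}\right) ds \right] dt.
\end{aligned}
\]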
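The inner integral is ES, where S is a normal random variable with $ES = \rho t^2$ and $\operatorname{Var} S = (1 - \rho^2)t^2$. Thus,
\[ \rho_{XY} = \int_{-\infty}^{\infty} \frac{\rho t^2}{\sqrt{2\pi}} \exp\{-t^2/2\}\, dt = \rho, \]
since the remaining integral is ρ times the second moment of a standard normal random variable, which equals 1.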