4.5 Covariance and Correlation

In earlier sections, we have discussed the absence or presence of a relationship between two random variables, that is, independence or nonindependence. But if there is a relationship, it may be strong or weak. In this section, we discuss two numerical measures of the strength of a relationship between two random variables, the covariance and the correlation.

Throughout this section, we will use the notation EX = µ_X, EY = µ_Y, VarX = σ²_X, and VarY = σ²_Y.

Definition 4.5.1 The covariance of X and Y is the number defined by Cov(X, Y ) = E((X − µ_X)(Y − µ_Y)).

Definition 4.5.2 The correlation of X and Y is the number defined by

ρ_XY = Cov(X, Y )/(σ_Xσ_Y).

The value ρ_XY is also called the correlation coefficient.
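As a small illustration of the two definitions, the sketch below computes the covariance and correlation of a finite data set. It assumes the list of pairs is treated as a distribution that puts probability 1/n on each pair, so every expectation becomes a plain average. The function names `covariance` and `correlation` and the sample values are made up for this example.

```python
import math

def covariance(xs, ys):
    """Average of (x - mean_x)(y - mean_y), mirroring Definition 4.5.1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Covariance divided by the product of standard deviations (Definition 4.5.2)."""
    sd_x = math.sqrt(covariance(xs, xs))  # Cov(X, X) is Var X
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Illustrative data only (not from the text).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 1.0, 4.0, 3.0]

print(covariance(xs, ys))   # 0.75 for this data
print(correlation(xs, ys))  # approximately 0.6 for this data
```

Note that the covariance depends on the units of X and Y, while the correlation is unit-free.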

Theorem 4.5.3 For any random variables X and Y ,

Cov(X, Y ) = EXY − µ_Xµ_Y.
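The theorem is stated here without proof. One way to see it, sketched below in LaTeX, is to expand the product in Definition 4.5.1 and use linearity of expectation; only the notation already introduced in this section is used.

```latex
\begin{align*}
\operatorname{Cov}(X,Y)
  &= E\bigl((X-\mu_X)(Y-\mu_Y)\bigr) \\
  &= E\bigl(XY - \mu_X Y - \mu_Y X + \mu_X\mu_Y\bigr) \\
  &= EXY - \mu_X\,EY - \mu_Y\,EX + \mu_X\mu_Y \\
  &= EXY - \mu_X\mu_Y - \mu_X\mu_Y + \mu_X\mu_Y \\
  &= EXY - \mu_X\mu_Y .
\end{align*}
```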

Theorem 4.5.5 If X and Y are independent random variables, then Cov(X, Y ) = 0 and ρ_XY = 0.

Theorem 4.5.6 If X and Y are any two random variables and a and b are any two constants, then

Var(aX + bY ) = a²VarX + b²VarY + 2abCov(X, Y ).

If X and Y are independent random variables, then

Var(aX + bY ) = a²VarX + b²VarY.
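The variance formula of Theorem 4.5.6 can be checked numerically. The sketch below again assumes a finite data set treated as a distribution with mass 1/n on each pair, for which the identity holds exactly up to rounding. The helper names `mean` and `cov`, the data, and the constants a and b are chosen only for illustration.

```python
def mean(values):
    return sum(values) / len(values)

def cov(xs, ys):
    """Covariance under the empirical distribution; cov(xs, xs) is the variance."""
    mx, my = mean(xs), mean(ys)
    return mean([(x - mx) * (y - my) for x, y in zip(xs, ys)])

# Illustrative data and constants (not from the text).
xs = [0.0, 1.0, 2.0, 5.0]
ys = [1.0, 3.0, 2.0, 6.0]
a, b = 2.0, -3.0

combo = [a * x + b * y for x, y in zip(xs, ys)]

lhs = cov(combo, combo)  # Var(aX + bY)
rhs = a**2 * cov(xs, xs) + b**2 * cov(ys, ys) + 2 * a * b * cov(xs, ys)

print(lhs, rhs)  # the two values agree up to floating-point rounding
```

Dropping the cross term 2abCov(X, Y ) from `rhs` would break the agreement for this data, since the two lists are not uncorrelated.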

Theorem 4.5.7 For any random variables X and Y ,

a. −1 ≤ ρ_XY ≤ 1.

b. |ρ_XY| = 1 if and only if there exist numbers a ≠ 0 and b such that P (Y = aX + b) = 1.

If ρ_XY = 1, then a > 0, and if ρ_XY = −1, then a < 0.

Proof: Consider the function h(t) defined by

h(t) = E((X − µ_X)t + (Y − µ_Y))²

= t²σ_X² + 2tCov(X, Y ) + σ_Y².

Since h(t) ≥ 0 for all t and h is a quadratic function of t, it has at most one real root, so its discriminant must be nonpositive:

(2Cov(X, Y ))² − 4σ_X²σ_Y² ≤ 0.

This is equivalent to

−σ_Xσ_Y ≤ Cov(X, Y ) ≤ σ_Xσ_Y.

That is,

−1 ≤ ρ_XY ≤ 1.

Also, |ρ_XY| = 1 if and only if the discriminant is equal to 0, that is, if and only if h(t) has a single root. But since ((X − µ_X)t + (Y − µ_Y))² ≥ 0, h(t) = 0 if and only if

P ((X − µ_X)t + (Y − µ_Y) = 0) = 1.

This is P (Y = aX + b) = 1 with a = −t and b = µ_Xt + µ_Y, where t is the root of h(t). Using the quadratic formula, we see that this root is t = −Cov(X, Y )/σ_X². Thus a = −t has the same sign as ρ_XY, proving the final assertion. □
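Part (b) of the theorem can be illustrated with exactly linear data. The sketch below assumes a finite data set treated as a distribution with mass 1/n on each point. The helper `corr` and the particular slopes and intercepts are hypothetical choices for this example.

```python
import math

def corr(xs, ys):
    """Correlation under the empirical distribution of the pairs (x, y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative x values (not from the text).
xs = [0.5, 1.0, 2.5, 4.0, 7.0]

rho_pos = corr(xs, [3.0 * x + 1.0 for x in xs])   # Y = aX + b with a > 0
rho_neg = corr(xs, [-2.0 * x + 5.0 for x in xs])  # Y = aX + b with a < 0

print(rho_pos, rho_neg)  # close to 1 and -1 respectively
```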

Example 4.5.8 (Correlation-I) Let X have a uniform(0,1) distribution and Z have a uniform(0,0.1) distribution. Suppose X and Z are independent. Let Y = X + Z and consider the random vector (X, Y ). The joint pdf of (X, Y ) is

f (x, y) = 10, 0 < x < 1, x < y < x + 0.1

Note f (x, y) can be obtained from the relationship f (x, y) = f (y|x)f (x). Then, using EXZ = (EX)(EZ) by independence,

Cov(X, Y ) = EXY − (EX)(EY )

= EX(X + Z) − (EX)(E(X + Z))

= EX² + (EX)(EZ) − (EX)² − (EX)(EZ)

= σ_X² = 1/12.

The variance of Y is σ_Y² = VarX + VarZ = 1/12 + 1/1200. Thus

ρ_XY = (1/12)/(√(1/12) √(1/12 + 1/1200)) = √(100/101).
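The value √(100/101) ≈ 0.995 can be checked with a quick Monte Carlo simulation of this example. The sketch below is only a sanity check under assumed settings: the seed, the sample size `n`, and the helper `corr` are arbitrary choices, and the estimate varies slightly from run to run if the seed is changed.

```python
import math
import random

def corr(xs, ys):
    """Sample correlation of paired observations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)   # arbitrary seed, fixed for reproducibility
n = 200_000      # arbitrary sample size

xs = [random.random() for _ in range(n)]         # X ~ uniform(0, 1)
ys = [x + random.uniform(0.0, 0.1) for x in xs]  # Y = X + Z, Z ~ uniform(0, 0.1)

rho_hat = corr(xs, ys)
rho_exact = math.sqrt(100 / 101)

print(rho_hat, rho_exact)  # the estimate should be close to the exact value
```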

The next example illustrates that there may be a strong relationship between X and Y , but if the relationship is not linear, the correlation may be small.

Example 4.5.9 (Correlation-II) Let X ∼ Unif (−1, 1), Z ∼ Unif (0, 0.1), and X and Z be independent. Let Y = X² + Z and consider the random vector (X, Y ). Given X = x, Y ∼ Unif (x², x² + 0.1), so the joint pdf of X and Y is

f (x, y) = 5, −1 < x < 1, x² < y < x² + 1/10.

Then

Cov(X, Y ) = E(X(X² + Z)) − (EX)(E(X² + Z))

= EX³ + EXZ − 0 · E(X² + Z)

= 0,

since EX³ = 0 by symmetry and EXZ = (EX)(EZ) = 0 by independence. Thus, ρ_XY = Cov(X, Y )/(σ_Xσ_Y) = 0.
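A simulation makes the point of this example concrete: the correlation of X and Y is near zero even though Y is almost determined by X. The sketch below uses an arbitrary seed and sample size. The second quantity, the correlation between X² and Y, is not part of the example and is added here only to show that the dependence is strong once the relationship is made linear.

```python
import math
import random

def corr(xs, ys):
    """Sample correlation of paired observations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)   # arbitrary seed, fixed for reproducibility
n = 200_000      # arbitrary sample size

xs = [random.uniform(-1.0, 1.0) for _ in range(n)]    # X ~ Unif(-1, 1)
ys = [x * x + random.uniform(0.0, 0.1) for x in xs]   # Y = X^2 + Z

rho_xy = corr(xs, ys)                     # near 0: no linear relationship
rho_x2y = corr([x * x for x in xs], ys)   # near 1: Y is almost linear in X^2

print(rho_xy, rho_x2y)
```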

Definition 4.5.10 Let −∞ < µ_X < ∞, −∞ < µ_Y < ∞, 0 < σ_X, 0 < σ_Y, and −1 < ρ < 1 be five real numbers. The bivariate normal pdf with means µ_X and µ_Y, variances σ_X² and σ_Y², and correlation ρ is the bivariate pdf given by

f (x, y) = (2πσ_Xσ_Y√(1 − ρ²))⁻¹ exp{ −(1/(2(1 − ρ²))) [ ((x − µ_X)/σ_X)² − 2ρ((x − µ_X)/σ_X)((y − µ_Y)/σ_Y) + ((y − µ_Y)/σ_Y)² ] }

for −∞ < x < ∞ and −∞ < y < ∞.

The many nice properties of this distribution include these:

a. The marginal distribution of X is N(µ_X, σ_X² ).

b. The marginal distribution of Y is N(µ_Y, σ_Y²).

c. The correlation between X and Y is ρ_XY = ρ.

d. For any constants a and b, the distribution of aX + bY is N(aµ_X+ bµ_Y, a²σ²_X+ b²σ²_Y + 2abρσ_Xσ_Y).
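Properties (c) and (d) can be checked by simulation. The sketch below draws from a bivariate normal distribution using a standard construction that is not given in the text: with Z1 and Z2 independent N(0, 1), set X = µ_X + σ_X Z1 and Y = µ_Y + σ_Y(ρ Z1 + √(1 − ρ²) Z2). The parameter values, the constants a and b, the seed, and the sample size are all arbitrary choices for illustration.

```python
import math
import random

random.seed(1)   # arbitrary seed, fixed for reproducibility
n = 200_000      # arbitrary sample size

# Illustrative parameter values (not from the text).
mu_x, mu_y = 1.0, -2.0
sigma_x, sigma_y = 2.0, 0.5
rho = 0.7
a, b = 3.0, -1.0

xs, ys = [], []
for _ in range(n):
    z1 = random.gauss(0.0, 1.0)
    z2 = random.gauss(0.0, 1.0)
    xs.append(mu_x + sigma_x * z1)
    ys.append(mu_y + sigma_y * (rho * z1 + math.sqrt(1.0 - rho**2) * z2))

def mean(values):
    return sum(values) / len(values)

def cov(us, vs):
    mu, mv = mean(us), mean(vs)
    return mean([(u - mu) * (v - mv) for u, v in zip(us, vs)])

# Property (c): the sample correlation should be close to rho.
rho_hat = cov(xs, ys) / math.sqrt(cov(xs, xs) * cov(ys, ys))

# Property (d): mean and variance of aX + bY.
combo = [a * x + b * y for x, y in zip(xs, ys)]
mean_hat, var_hat = mean(combo), cov(combo, combo)
mean_exact = a * mu_x + b * mu_y
var_exact = a**2 * sigma_x**2 + b**2 * sigma_y**2 + 2 * a * b * rho * sigma_x * sigma_y

print(rho_hat, rho)
print(mean_hat, mean_exact)
print(var_hat, var_exact)
```

This checks only the first two moments of aX + bY; it does not by itself verify that the linear combination is normally distributed.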

Assuming (a) and (b) are true, we will prove (c). Let

s = ((x − µ_X)/σ_X)((y − µ_Y)/σ_Y) and t = (x − µ_X)/σ_X.

Then x = σ_Xt+µ_X, y = (σ_Ys/t)+µ_Y, and the Jacobian of the transformation is J = σ_Xσ_Y/t.

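With this change of variables, we obtain

ρ_XY = ∫_{−∞}^{∞} ∫_{−∞}^{∞} s f (σ_Xt + µ_X, σ_Ys/t + µ_Y) |σ_Xσ_Y/t| ds dt

= ∫_{−∞}^{∞} ∫_{−∞}^{∞} s (2πσ_Xσ_Y√(1 − ρ²))⁻¹ exp{ −(1/(2(1 − ρ²))) (t² − 2ρs + (s/t)²) } (σ_Xσ_Y/|t|) ds dt

= ∫_{−∞}^{∞} (1/√(2π)) exp(−t²/2) [ ∫_{−∞}^{∞} (s/(√(2π)√((1 − ρ²)t²))) exp{ −(s − ρt²)²/(2(1 − ρ²)t²) } ds ] dt,

where the last step completes the square in s: t² − 2ρs + (s/t)² = (s − ρt²)²/t² + (1 − ρ²)t².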

The inner integral is ES, where S is a normal random variable with ES = ρt² and VarS = (1 − ρ²)t². Thus,

ρ_XY = ∫_{−∞}^{∞} (ρt²/√(2π)) exp{−t²/2} dt = ρ.
