4. Multiple Random Variables
4.1 Joint and Marginal Distributions
Definition 4.1.1 An n-dimensional random vector is a function from a sample space S into $\mathbb{R}^n$, n-dimensional Euclidean space.
Suppose, for example, that with each point in a sample space we associate an ordered pair of numbers, that is, a point $(x, y) \in \mathbb{R}^2$, where $\mathbb{R}^2$ denotes the plane. Then we have defined a two-dimensional (or bivariate) random vector (X, Y).
Example 4.1.2 (Sample space for dice)
Consider the experiment of tossing two fair dice. The sample space for this experiment has 36 equally likely points. Let
X = sum of the two dice and Y = |difference of the two dice|.
In this way we have defined the bivariate random vector (X, Y).
The random vector (X, Y ) defined above is called a discrete random vector because it has only a countable (in this case, finite) number of possible values. The probabilities of events defined in terms of X and Y are just defined in terms of the probabilities of the corresponding events in the sample space S. For example,
\[ P(X = 5, Y = 3) = P(\{(4, 1), (1, 4)\}) = \frac{2}{36} = \frac{1}{18}. \]
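The calculation above can be checked by listing the sample space directly. The following is a minimal sketch, assuming each outcome is recorded as an ordered pair (d1, d2) of die faces; the names `outcomes`, `X`, `Y`, and `matching` are illustrative and do not come from the text.

```python
from fractions import Fraction
from itertools import product

# Each of the 36 ordered outcomes (d1, d2) of two fair dice has probability 1/36.
outcomes = list(product(range(1, 7), repeat=2))
p_outcome = Fraction(1, len(outcomes))

def X(d1, d2):
    """Sum of the two dice."""
    return d1 + d2

def Y(d1, d2):
    """Absolute difference of the two dice."""
    return abs(d1 - d2)

# P(X = 5, Y = 3): add the probabilities of the sample points mapped to (5, 3).
matching = [(d1, d2) for d1, d2 in outcomes if X(d1, d2) == 5 and Y(d1, d2) == 3]
prob = p_outcome * len(matching)

print(matching)  # [(1, 4), (4, 1)]
print(prob)      # 1/18
```

Exact rational arithmetic (`fractions.Fraction`) is used so the result can be compared with 1/18 without rounding error.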
Definition 4.1.3 Let (X, Y) be a discrete bivariate random vector. Then the function f(x, y) from $\mathbb{R}^2$ into $\mathbb{R}$ defined by f(x, y) = P(X = x, Y = y) is called the joint probability mass function or joint pmf of (X, Y). If it is necessary to stress the fact that f is the joint pmf of the vector (X, Y) rather than some other vector, the notation $f_{X,Y}(x, y)$ will be used.
The joint pmf can be used to compute the probability of any event defined in terms of (X, Y ).
Let A be any subset of $\mathbb{R}^2$. Then
\[ P((X, Y) \in A) = \sum_{(x, y) \in A} f(x, y). \]
Expectations of functions of random vectors are computed just as with univariate random variables. Let g(x, y) be a real-valued function defined for all possible values (x, y) of the discrete random vector (X, Y ). Then g(X, Y ) is itself a random variable and its expected value Eg(X, Y ) is given by
\[ \mathrm{E}\, g(X, Y) = \sum_{(x, y) \in \mathbb{R}^2} g(x, y) f(x, y). \]
Example 4.1.4 (Continuation of Example 4.1.2)
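The joint pmf of (X, Y) is given in the following table. Rows are indexed by the values y of Y and columns by the values x of X.

y \ x   2     3     4     5     6     7     8     9     10    11    12
0       1/36  0     1/36  0     1/36  0     1/36  0     1/36  0     1/36
1       0     1/18  0     1/18  0     1/18  0     1/18  0     1/18  0
2       0     0     1/18  0     1/18  0     1/18  0     1/18  0     0
3       0     0     0     1/18  0     1/18  0     1/18  0     0     0
4       0     0     0     0     1/18  0     1/18  0     0     0     0
5       0     0     0     0     0     1/18  0     0     0     0     0
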
Letting g(x, y) = xy, we have
\[ \mathrm{E}\,XY = (2)(0)\frac{1}{36} + \cdots + (7)(5)\frac{1}{18} = \frac{245}{18} = 13\tfrac{11}{18}. \]
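The table and the expectation can be reproduced by enumeration. This is a sketch under the assumption that the joint pmf is stored as a dictionary keyed by (x, y); the helper name `expectation` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Build the joint pmf f(x, y) = P(X = x, Y = y) by enumerating the 36 outcomes.
joint_pmf = defaultdict(Fraction)
for d1, d2 in product(range(1, 7), repeat=2):
    joint_pmf[(d1 + d2, abs(d1 - d2))] += Fraction(1, 36)

def expectation(g, pmf):
    """E g(X, Y): sum of g(x, y) * f(x, y) over the points where f is positive."""
    return sum(g(x, y) * p for (x, y), p in pmf.items())

e_xy = expectation(lambda x, y: x * y, joint_pmf)
print(e_xy)  # 245/18, that is, 13 11/18
```

Only points with positive probability are stored, which is enough because every other point contributes zero to the sum.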
The expectation operator continues to have the properties listed in Theorem 2.2.5 (textbook).
For example, if $g_1(x, y)$ and $g_2(x, y)$ are two functions and a, b and c are constants, then
\[ \mathrm{E}(a g_1(X, Y) + b g_2(X, Y) + c) = a\,\mathrm{E}\, g_1(X, Y) + b\,\mathrm{E}\, g_2(X, Y) + c. \]
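As an illustration of this linearity property, the sketch below evaluates both sides for the dice vector with $g_1(x, y) = x$ and $g_2(x, y) = y$. The constants a = 2, b = 3, c = 1 are arbitrary choices made here for illustration, and the helper `expectation` is a hypothetical name.

```python
from fractions import Fraction
from itertools import product

# Joint pmf of X = sum and Y = |difference| for two fair dice.
joint_pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    key = (d1 + d2, abs(d1 - d2))
    joint_pmf[key] = joint_pmf.get(key, Fraction(0)) + Fraction(1, 36)

def expectation(g):
    """E g(X, Y) computed as the sum of g(x, y) * f(x, y)."""
    return sum(g(x, y) * p for (x, y), p in joint_pmf.items())

a, b, c = 2, 3, 1  # arbitrary constants chosen for illustration
lhs = expectation(lambda x, y: a * x + b * y + c)
rhs = a * expectation(lambda x, y: x) + b * expectation(lambda x, y: y) + c

print(lhs, rhs)  # 125/6 125/6
```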
For any (x, y), f(x, y) ≥ 0 since f(x, y) is a probability. Also, since (X, Y) is certain to be in $\mathbb{R}^2$,
\[ \sum_{(x, y) \in \mathbb{R}^2} f(x, y) = P((X, Y) \in \mathbb{R}^2) = 1. \]
Theorem 4.1.6
Let (X, Y) be a discrete bivariate random vector with joint pmf $f_{X,Y}(x, y)$. Then the marginal pmfs of X and Y, $f_X(x) = P(X = x)$ and $f_Y(y) = P(Y = y)$, are given by
\[ f_X(x) = \sum_{y \in \mathbb{R}} f_{X,Y}(x, y) \quad \text{and} \quad f_Y(y) = \sum_{x \in \mathbb{R}} f_{X,Y}(x, y). \]
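Proof: For any $x \in \mathbb{R}$, let $A_x = \{(x, y) : -\infty < y < \infty\}$. That is, $A_x$ is the line in the plane with first coordinate equal to x. Then, for any $x \in \mathbb{R}$,
\begin{align*}
f_X(x) &= P(X = x) \\
&= P(X = x, -\infty < Y < \infty) && (P(-\infty < Y < \infty) = 1) \\
&= P((X, Y) \in A_x) && (\text{definition of } A_x) \\
&= \sum_{(x, y) \in A_x} f_{X,Y}(x, y) \\
&= \sum_{y \in \mathbb{R}} f_{X,Y}(x, y).
\end{align*}
The proof for $f_Y(y)$ is similar. □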
Example 4.1.7 (Marginal pmf for dice)
Using the table given in Example 4.1.4, compute the marginal pmf of Y. Using Theorem 4.1.6, we have
\[ f_Y(0) = f_{X,Y}(2, 0) + \cdots + f_{X,Y}(12, 0) = \frac{1}{6}. \]
Similarly, we obtain
\[ f_Y(1) = \frac{5}{18}, \quad f_Y(2) = \frac{2}{9}, \quad f_Y(3) = \frac{1}{6}, \quad f_Y(4) = \frac{1}{9}, \quad f_Y(5) = \frac{1}{18}. \]
Notice that $\sum_{i=0}^{5} f_Y(i) = 1$.
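The marginal pmf of Y can also be obtained mechanically by summing the joint pmf over x, as Theorem 4.1.6 states. A minimal sketch, assuming the joint pmf is rebuilt by enumerating the dice outcomes; the names `joint_pmf` and `marginal_y` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Joint pmf of X = sum and Y = |difference| for two fair dice.
joint_pmf = defaultdict(Fraction)
for d1, d2 in product(range(1, 7), repeat=2):
    joint_pmf[(d1 + d2, abs(d1 - d2))] += Fraction(1, 36)

# Marginal pmf of Y: for each y, sum f(x, y) over all x.
marginal_y = defaultdict(Fraction)
for (x, y), p in joint_pmf.items():
    marginal_y[y] += p

for y in sorted(marginal_y):
    print(y, marginal_y[y])
# 0 1/6
# 1 5/18
# 2 2/9
# 3 1/6
# 4 1/9
# 5 1/18
```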
The marginal distributions of X and Y do not completely describe the joint distribution of X and Y. Indeed, there are many different joint distributions that have the same marginal distributions. Thus, it is hopeless to try to determine the joint pmf from knowledge of only the marginal pmfs. The next example illustrates the point.
Example 4.1.9 (Same marginals, different joint pmf)
Consider the following two joint pmfs,
\[ f(0, 0) = \frac{1}{12}, \quad f(1, 0) = \frac{5}{12}, \quad f(0, 1) = f(1, 1) = \frac{3}{12}, \quad f(x, y) = 0 \text{ for all other values,} \]
and
\[ f(0, 0) = f(0, 1) = \frac{1}{6}, \quad f(1, 0) = f(1, 1) = \frac{1}{3}, \quad f(x, y) = 0 \text{ for all other values.} \]
It is easy to verify that they have the same marginal distributions. The marginal of X is $f_X(0) = \frac{1}{3}$, $f_X(1) = \frac{2}{3}$. The marginal of Y is $f_Y(0) = \frac{1}{2}$, $f_Y(1) = \frac{1}{2}$.
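The verification left to the reader can be sketched as follows. The dictionaries `pmf_a` and `pmf_b` hold the two joint pmfs of this example, and the helper `marginals` is a hypothetical name introduced for illustration.

```python
from fractions import Fraction

# The two joint pmfs of the example, keyed by (x, y).
pmf_a = {(0, 0): Fraction(1, 12), (1, 0): Fraction(5, 12),
         (0, 1): Fraction(3, 12), (1, 1): Fraction(3, 12)}
pmf_b = {(0, 0): Fraction(1, 6), (0, 1): Fraction(1, 6),
         (1, 0): Fraction(1, 3), (1, 1): Fraction(1, 3)}

def marginals(pmf):
    """Return (f_X, f_Y) as dictionaries by summing the joint pmf over the other variable."""
    fx, fy = {}, {}
    for (x, y), p in pmf.items():
        fx[x] = fx.get(x, Fraction(0)) + p
        fy[y] = fy.get(y, Fraction(0)) + p
    return fx, fy

fx_a, fy_a = marginals(pmf_a)
fx_b, fy_b = marginals(pmf_b)

print(fx_a == fx_b, fy_a == fy_b)  # True True
print(pmf_a == pmf_b)              # False
```

The marginals agree even though the joint pmfs differ, which is the point of the example.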
In the following we consider random vectors whose components are continuous random variables.
Definition 4.1.10 A function f(x, y) from $\mathbb{R}^2$ into $\mathbb{R}$ is called a joint probability density function or joint pdf of the continuous bivariate random vector (X, Y) if, for every $A \subset \mathbb{R}^2$,
\[ P((X, Y) \in A) = \iint_A f(x, y)\,dx\,dy. \]
If g(x, y) is a real-valued function, then the expected value of g(X, Y) is defined to be
\[ \mathrm{E}\, g(X, Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y) f(x, y)\,dx\,dy. \]
The marginal probability density functions of X and Y are defined as
\[ f_X(x) = \int_{-\infty}^{\infty} f(x, y)\,dy, \quad -\infty < x < \infty, \]
\[ f_Y(y) = \int_{-\infty}^{\infty} f(x, y)\,dx, \quad -\infty < y < \infty. \]
Any function f(x, y) satisfying f(x, y) ≥ 0 for all $(x, y) \in \mathbb{R}^2$ and
\[ 1 = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x, y)\,dx\,dy \]
is the joint pdf of some continuous bivariate random vector (X, Y).
Example 4.1.11 (Calculating joint probabilities-I)
Define a joint pdf by
\[ f(x, y) = \begin{cases} 6xy^2 & 0 < x < 1 \text{ and } 0 < y < 1, \\ 0 & \text{otherwise.} \end{cases} \]
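Now, consider calculating a probability such as P(X + Y ≥ 1). Let $A = \{(x, y) : x + y \ge 1\}$. Because f(x, y) = 0 outside the unit square, only the part of A inside the square contributes to the probability, and that part can be re-expressed as
\[ \{(x, y) : x + y \ge 1,\ 0 < x < 1,\ 0 < y < 1\} = \{(x, y) : 1 - y \le x < 1,\ 0 < y < 1\}. \]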
Thus, we have
\[ P(X + Y \ge 1) = \iint_A f(x, y)\,dx\,dy = \int_0^1 \int_{1-y}^{1} 6xy^2\,dx\,dy = \int_0^1 3y^2\left(2y - y^2\right)dy = \frac{9}{10}. \]
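A numerical check of this probability can be sketched with a simple midpoint rule on the unit square. This is an approximation chosen here for illustration, not a method used in the text; the function name `midpoint_double_integral` and the grid size n = 400 are assumptions.

```python
def f(x, y):
    """Joint pdf of Example 4.1.11: 6*x*y**2 on the open unit square, 0 elsewhere."""
    return 6.0 * x * y**2 if 0.0 < x < 1.0 and 0.0 < y < 1.0 else 0.0

def midpoint_double_integral(h, n=400):
    """Approximate the integral of h over [0, 1] x [0, 1] with an n-by-n midpoint rule."""
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * step
        for j in range(n):
            y = (j + 0.5) * step
            total += h(x, y)
    return total * step * step

# Total mass of the pdf (should be close to 1).
total_mass = midpoint_double_integral(f)

# P(X + Y >= 1): integrate f only over the part of the square where x + y >= 1.
prob = midpoint_double_integral(lambda x, y: f(x, y) if x + y >= 1.0 else 0.0)

print(total_mass, prob)
```

The two printed values should be near 1 and 0.9 respectively. The second is less accurate because the integrand jumps along the line x + y = 1, so a finer grid only improves it slowly.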
The joint cdf is the function F(x, y) defined by
\[ F(x, y) = P(X \le x, Y \le y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(s, t)\,dt\,ds. \]
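Hence, at every continuity point of f(x, y),
\[ \frac{\partial^2 F(x, y)}{\partial x\,\partial y} = f(x, y) \quad \text{and} \quad -\frac{\partial^2 P(X \le x, Y \ge y)}{\partial x\,\partial y} = f(x, y). \]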
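These two identities can be illustrated with the pdf of Example 4.1.11. Integrating $6st^2$ gives $F(x, y) = x^2 y^3$ and $P(X \le x, Y \ge y) = x^2(1 - y^3)$ for points in the unit square; these closed forms are worked out here for illustration and do not appear in the text. The sketch below compares a central finite-difference estimate of the mixed partial derivative with f at one interior point; the helper `mixed_partial` and the step size are assumptions.

```python
def F(x, y):
    """Joint cdf on the unit square: F(x, y) = x**2 * y**3 (derived here, not from the text)."""
    return x**2 * y**3

def upper_tail(x, y):
    """P(X <= x, Y >= y) = x**2 * (1 - y**3) on the unit square (derived here)."""
    return x**2 * (1.0 - y**3)

def f(x, y):
    """Joint pdf of Example 4.1.11 inside the unit square."""
    return 6.0 * x * y**2

def mixed_partial(g, x, y, h=1e-3):
    """Central finite-difference estimate of the mixed partial d^2 g / (dx dy) at (x, y)."""
    return (g(x + h, y + h) - g(x + h, y - h)
            - g(x - h, y + h) + g(x - h, y - h)) / (4.0 * h * h)

x0, y0 = 0.5, 0.5  # an interior point of the unit square
d2F = mixed_partial(F, x0, y0)
d2G = mixed_partial(upper_tail, x0, y0)

print(f(x0, y0), d2F, -d2G)  # all three values should be close to 0.75
```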