Multivariate Distribution
The random vector $\mathbf{X} = (X_1, \ldots, X_n)$ has a sample space that is a subset of $\mathbb{R}^n$. If $\mathbf{X}$ is a discrete random vector, then the joint pmf of $\mathbf{X}$ is the function defined by $f(\mathbf{x}) = f(x_1, \ldots, x_n) = P(X_1 = x_1, \ldots, X_n = x_n)$ for each $(x_1, \ldots, x_n) \in \mathbb{R}^n$. Then for any $A \subset \mathbb{R}^n$,
$$P(\mathbf{X} \in A) = \sum_{\mathbf{x} \in A} f(\mathbf{x}).$$
If $\mathbf{X}$ is a continuous random vector, the joint pdf of $\mathbf{X}$ is a function $f(x_1, \ldots, x_n)$ that satisfies
$$P(\mathbf{X} \in A) = \int \cdots \int_A f(\mathbf{x})\,d\mathbf{x} = \int \cdots \int_A f(x_1, \ldots, x_n)\,dx_1 \cdots dx_n.$$
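Let $g(\mathbf{x}) = g(x_1, \ldots, x_n)$ be a real-valued function defined on the sample space of $\mathbf{X}$. Then $g(\mathbf{X})$ is a random variable and the expected value of $g(\mathbf{X})$ is
$$Eg(\mathbf{X}) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} g(\mathbf{x}) f(\mathbf{x})\,d\mathbf{x}$$
and
$$Eg(\mathbf{X}) = \sum_{\mathbf{x} \in \mathbb{R}^n} g(\mathbf{x}) f(\mathbf{x})$$
in the continuous and discrete cases, respectively.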
The marginal distribution of $(X_1, \ldots, X_k)$, the first $k$ coordinates of $(X_1, \ldots, X_n)$, is given by the pdf or pmf
$$f(x_1, \ldots, x_k) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(x_1, \ldots, x_n)\,dx_{k+1} \cdots dx_n$$
or
$$f(x_1, \ldots, x_k) = \sum_{(x_{k+1}, \ldots, x_n) \in \mathbb{R}^{n-k}} f(x_1, \ldots, x_n)$$
for every $(x_1, \ldots, x_k) \in \mathbb{R}^k$.
If $f(x_1, \ldots, x_k) > 0$, the conditional pdf or pmf of $(X_{k+1}, \ldots, X_n)$ given $X_1 = x_1, \ldots, X_k = x_k$ is the function of $(x_{k+1}, \ldots, x_n)$ defined by
$$f(x_{k+1}, \ldots, x_n \mid x_1, \ldots, x_k) = \frac{f(x_1, \ldots, x_n)}{f(x_1, \ldots, x_k)}.$$
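The marginal and conditional definitions can be illustrated in the discrete case with a short Python sketch. The joint pmf below is a hypothetical one on $\{0,1\}^3$, invented only for illustration, and the helper names `marginal` and `conditional` are likewise assumptions rather than anything defined in these notes.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint pmf of (X1, X2, X3) on {0,1}^3 (illustration only).
joint = {
    (0, 0, 0): Fraction(1, 8), (0, 0, 1): Fraction(1, 8),
    (0, 1, 0): Fraction(1, 16), (0, 1, 1): Fraction(3, 16),
    (1, 0, 0): Fraction(1, 4), (1, 0, 1): Fraction(0),
    (1, 1, 0): Fraction(1, 8), (1, 1, 1): Fraction(1, 8),
}


def marginal(joint_pmf, k):
    """Marginal pmf of the first k coordinates: sum out the remaining ones."""
    out = defaultdict(Fraction)
    for x, p in joint_pmf.items():
        out[x[:k]] += p
    return dict(out)


def conditional(joint_pmf, k, given):
    """Conditional pmf of the last n-k coordinates given the first k equal `given`."""
    given = tuple(given)
    denom = marginal(joint_pmf, k).get(given, Fraction(0))
    if denom == 0:
        raise ValueError("the conditioning values must have positive marginal pmf")
    return {x[k:]: p / denom for x, p in joint_pmf.items() if x[:k] == given}


m1 = marginal(joint, 1)          # pmf of X1
c = conditional(joint, 1, (0,))  # pmf of (X2, X3) given X1 = 0
print(m1)
print(c)
```

Exact fractions are used so that the conditional pmf sums to exactly 1, which makes the normalization easy to confirm by eye.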
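Example 4.6.1 (Multivariate pdfs) Let $n = 4$ and
$$f(x_1, x_2, x_3, x_4) = \begin{cases} \frac{3}{4}(x_1^2 + x_2^2 + x_3^2 + x_4^2) & 0 < x_i < 1,\ i = 1, 2, 3, 4 \\ 0 & \text{otherwise.} \end{cases}$$
The joint pdf can be used to compute probabilities such as
$$P\left(X_1 < \tfrac{1}{2}, X_2 < \tfrac{3}{4}, X_4 > \tfrac{1}{2}\right) = \int_{1/2}^{1} \int_{0}^{1} \int_{0}^{3/4} \int_{0}^{1/2} \frac{3}{4}(x_1^2 + x_2^2 + x_3^2 + x_4^2)\,dx_1\,dx_2\,dx_3\,dx_4 = \frac{171}{1024}.$$
The marginal pdf of $(X_1, X_2)$ is
$$f(x_1, x_2) = \int_{0}^{1} \int_{0}^{1} \frac{3}{4}(x_1^2 + x_2^2 + x_3^2 + x_4^2)\,dx_3\,dx_4 = \frac{3}{4}(x_1^2 + x_2^2) + \frac{1}{2}$$
for $0 < x_1 < 1$ and $0 < x_2 < 1$.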
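The probability in Example 4.6.1 can be evaluated exactly with a few lines of Python. This is only a sketch: the function name `box_probability` is hypothetical, and it relies on the fact that the integrand is a sum of single-variable terms, so each term integrates to $(b_i^3 - a_i^3)/3$ in its own coordinate times the side lengths of the box in the other coordinates.

```python
from fractions import Fraction as F


def box_probability(bounds):
    """Integrate (3/4) * (x1^2 + ... + xn^2) over a box inside the unit cube.

    `bounds` is a list of (lo, hi) pairs, one per coordinate.
    """
    total = F(0)
    for i, (lo_i, hi_i) in enumerate(bounds):
        # Integral of x_i^2 over its own interval ...
        term = (F(hi_i) ** 3 - F(lo_i) ** 3) / 3
        # ... times the lengths of the intervals for the other coordinates.
        for j, (lo_j, hi_j) in enumerate(bounds):
            if j != i:
                term *= F(hi_j) - F(lo_j)
        total += term
    return F(3, 4) * total


# P(X1 < 1/2, X2 < 3/4, X4 > 1/2); X3 ranges over all of (0, 1).
p = box_probability([(0, F(1, 2)), (0, F(3, 4)), (0, 1), (F(1, 2), 1)])
total_mass = box_probability([(0, 1)] * 4)  # integral over the whole unit cube
print(p, total_mass)
```

The second call confirms that the density integrates to 1 over the unit cube, as a pdf must.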
Definition 4.6.2 Let $n$ and $m$ be positive integers and let $p_1, \ldots, p_n$ be numbers satisfying $0 \le p_i \le 1$, $i = 1, \ldots, n$, and $\sum_{i=1}^{n} p_i = 1$. Then the random vector $(X_1, \ldots, X_n)$ has a multinomial distribution with $m$ trials and cell probabilities $p_1, \ldots, p_n$ if the joint pmf of $(X_1, \ldots, X_n)$ is
$$f(x_1, \ldots, x_n) = \frac{m!}{x_1! \cdots x_n!}\, p_1^{x_1} \cdots p_n^{x_n} = m! \prod_{i=1}^{n} \frac{p_i^{x_i}}{x_i!}$$
on the set of $(x_1, \ldots, x_n)$ such that each $x_i$ is a nonnegative integer and $\sum_{i=1}^{n} x_i = m$.
Example 4.6.3 (Multivariate pmf) Consider tossing a six-sided die 10 times. Suppose the die is unbalanced so that the probability of observing an $i$ is $i/21$. Now consider the vector $(X_1, \ldots, X_6)$, where $X_i$ counts the number of times $i$ comes up in the 10 tosses.
Then $(X_1, \ldots, X_6)$ has a multinomial distribution with $m = 10$ and cell probabilities $p_1 = \frac{1}{21}, \ldots, p_6 = \frac{6}{21}$. For example, the probability of the vector $(0, 0, 1, 2, 3, 4)$ is
$$f(0, 0, 1, 2, 3, 4) = \frac{10!}{0!\,0!\,1!\,2!\,3!\,4!} \left(\frac{1}{21}\right)^0 \left(\frac{2}{21}\right)^0 \left(\frac{3}{21}\right)^1 \left(\frac{4}{21}\right)^2 \left(\frac{5}{21}\right)^3 \left(\frac{6}{21}\right)^4 = 0.0059.$$
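The calculation in Example 4.6.3 can be reproduced with a minimal Python sketch of the multinomial pmf. The function name `multinomial_pmf` is an assumption introduced here for illustration; it implements the formula of Definition 4.6.2 directly and does not check that the cell probabilities sum to 1.

```python
from math import factorial, prod


def multinomial_pmf(x, p):
    """Multinomial pmf at counts x with cell probabilities p; m is sum(x)."""
    coef = factorial(sum(x))
    for xi in x:
        coef //= factorial(xi)  # each partial quotient is an integer
    return coef * prod(pi ** xi for pi, xi in zip(p, x))


# Unbalanced die: P(face i) = i / 21, tossed 10 times.
p = [i / 21 for i in range(1, 7)]
value = multinomial_pmf([0, 0, 1, 2, 3, 4], p)
print(round(value, 4))  # 0.0059
```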
The factor $\frac{m!}{x_1! \cdots x_n!}$ is called a multinomial coefficient. It is the number of ways that $m$ objects can be divided into $n$ groups with $x_1$ in the first group, $x_2$ in the second group, ..., and $x_n$ in the $n$th group.
Theorem 4.6.4 (Multinomial Theorem)
Let $m$ and $n$ be positive integers. Let $A$ be the set of vectors $\mathbf{x} = (x_1, \ldots, x_n)$ such that each $x_i$ is a nonnegative integer and $\sum_{i=1}^{n} x_i = m$. Then, for any real numbers $p_1, \ldots, p_n$,
$$(p_1 + \cdots + p_n)^m = \sum_{\mathbf{x} \in A} \frac{m!}{x_1! \cdots x_n!}\, p_1^{x_1} \cdots p_n^{x_n}.$$
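The Multinomial Theorem can be checked numerically for small $m$ and $n$ by enumerating the set $A$. The sketch below uses exact rational arithmetic; the helper names `compositions` and `multinomial_sum`, and the particular values of $p_1, p_2, p_3$, are assumptions chosen for illustration. The $p_i$ need not be probabilities, so one of them is negative.

```python
from fractions import Fraction
from math import factorial


def compositions(m, n):
    """Yield every n-tuple of nonnegative integers summing to m (the set A)."""
    if n == 1:
        yield (m,)
        return
    for first in range(m + 1):
        for rest in compositions(m - first, n - 1):
            yield (first,) + rest


def multinomial_sum(p, m):
    """Right-hand side of the theorem: sum over A of coefficient times powers."""
    total = Fraction(0)
    for x in compositions(m, len(p)):
        coef = factorial(m)
        term = Fraction(1)
        for pi, xi in zip(p, x):
            coef //= factorial(xi)
            term *= Fraction(pi) ** xi
        total += coef * term
    return total


p = [Fraction(1, 2), Fraction(-3, 4), Fraction(2)]
lhs = sum(p) ** 4
rhs = multinomial_sum(p, 4)
print(lhs, rhs)
```

Taking the $p_i$ to be cell probabilities makes the left-hand side equal to $1^m = 1$, which is why the multinomial pmf sums to 1.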
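Definition 4.6.5 Let $\mathbf{X}_1, \ldots, \mathbf{X}_n$ be random vectors with joint pdf or pmf $f(\mathbf{x}_1, \ldots, \mathbf{x}_n)$.
Let $f_{\mathbf{X}_i}(\mathbf{x}_i)$ denote the marginal pdf or pmf of $\mathbf{X}_i$. Then $\mathbf{X}_1, \ldots, \mathbf{X}_n$ are called mutually independent random vectors if, for every $(\mathbf{x}_1, \ldots, \mathbf{x}_n)$,
$$f(\mathbf{x}_1, \ldots, \mathbf{x}_n) = f_{\mathbf{X}_1}(\mathbf{x}_1) \cdots f_{\mathbf{X}_n}(\mathbf{x}_n) = \prod_{i=1}^{n} f_{\mathbf{X}_i}(\mathbf{x}_i).$$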
If the $X_i$'s are all one dimensional, then $X_1, \ldots, X_n$ are called mutually independent random variables.
Mutually independent random variables have many nice properties. The proofs of the following theorems are analogous to the proofs of their counterparts in Sections 4.2 and 4.3.
Theorem 4.6.6 (Generalization of Theorem 4.2.10)
Let $X_1, \ldots, X_n$ be mutually independent random variables. Let $g_1, \ldots, g_n$ be real-valued functions such that $g_i(x_i)$ is a function only of $x_i$, $i = 1, \ldots, n$. Then
$$E\bigl(g_1(X_1) \cdots g_n(X_n)\bigr) = \bigl(Eg_1(X_1)\bigr) \cdots \bigl(Eg_n(X_n)\bigr).$$
Theorem 4.6.7 (Generalization of Theorem 4.2.12)
Let $X_1, \ldots, X_n$ be mutually independent random variables with mgfs $M_{X_1}(t), \ldots, M_{X_n}(t)$.
Let $Z = X_1 + \cdots + X_n$. Then the mgf of $Z$ is
$$M_Z(t) = M_{X_1}(t) \cdots M_{X_n}(t).$$
In particular, if $X_1, \ldots, X_n$ all have the same distribution with mgf $M_X(t)$, then $M_Z(t) = (M_X(t))^n$.
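For discrete random variables with finite support, the product rule for mgfs can be checked directly by building the pmf of the sum from the independent joint pmf. The following is a minimal sketch; the pmfs `die` and `coin`, the value of `t`, and the helper names are hypothetical choices made for illustration.

```python
import math
from itertools import product


def mgf(pmf, t):
    """Moment generating function E[exp(tX)] of a finite discrete pmf."""
    return sum(p * math.exp(t * x) for x, p in pmf.items())


def pmf_of_sum(pmfs):
    """pmf of X1 + ... + Xn for mutually independent Xi with the given pmfs.

    Independence is used here: the joint pmf is the product of the marginals.
    """
    out = {}
    for combo in product(*(pmf.items() for pmf in pmfs)):
        z = sum(x for x, _ in combo)
        prob = math.prod(p for _, p in combo)
        out[z] = out.get(z, 0.0) + prob
    return out


die = {k: 1 / 6 for k in range(1, 7)}  # fair die (hypothetical example)
coin = {0: 0.3, 1: 0.7}                # biased coin (hypothetical example)
pmfs = [die, die, coin]

t = 0.25
lhs = mgf(pmf_of_sum(pmfs), t)                # M_Z(t) computed from the pmf of Z
rhs = math.prod(mgf(pmf, t) for pmf in pmfs)  # product of the individual mgfs
print(lhs, rhs)
```

The two printed values agree up to floating-point rounding.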
Example 4.6.8 (Mgf of a sum of gamma variables)
Suppose $X_1, \ldots, X_n$ are mutually independent random variables, and the distribution of $X_i$ is gamma$(\alpha_i, \beta)$. Thus, if $Z = X_1 + \cdots + X_n$, the mgf of $Z$ is
$$M_Z(t) = M_{X_1}(t) \cdots M_{X_n}(t) = (1 - \beta t)^{-\alpha_1} \cdots (1 - \beta t)^{-\alpha_n} = (1 - \beta t)^{-(\alpha_1 + \cdots + \alpha_n)}, \quad t < 1/\beta.$$
This is the mgf of a gamma$(\alpha_1 + \cdots + \alpha_n, \beta)$ distribution. Thus, the sum of independent gamma random variables that have a common scale parameter $\beta$ also has a gamma distribution.
Example
Let $X_1, \ldots, X_n$ be mutually independent random variables with $X_i \sim N(\mu_i, \sigma_i^2)$. Let $a_1, \ldots, a_n$ and $b_1, \ldots, b_n$ be fixed constants. Then
$$Z = \sum_{i=1}^{n} (a_i X_i + b_i) \sim N\left(\sum_{i=1}^{n} (a_i \mu_i + b_i),\ \sum_{i=1}^{n} a_i^2 \sigma_i^2\right).$$
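As a small worked illustration of the result above, the following sketch computes the mean and variance of $Z$ from the constants and the parameters of the $X_i$. The function name `linear_combination_params` and the numerical inputs are hypothetical.

```python
def linear_combination_params(a, b, mu, sigma2):
    """Mean and variance of Z = sum(a_i * X_i + b_i) for independent normal X_i.

    mu[i] and sigma2[i] are the mean and variance of X_i.
    """
    mean = sum(ai * mi + bi for ai, bi, mi in zip(a, b, mu))
    var = sum(ai ** 2 * s2 for ai, s2 in zip(a, sigma2))
    return mean, var


# Hypothetical inputs: X1 ~ N(0, 1), X2 ~ N(4, 9), X3 ~ N(2, 4).
mean, var = linear_combination_params(
    a=[2, -1, 0.5], b=[1, 0, 3], mu=[0, 4, 2], sigma2=[1, 9, 4]
)
print(mean, var)  # 1.0 14.0
```

Note that the constants $b_i$ shift the mean only, while the sign of each $a_i$ has no effect on the variance.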
Theorem 4.6.11 (Generalization of Lemma 4.2.7)
Let $\mathbf{X}_1, \ldots, \mathbf{X}_n$ be random vectors. Then $\mathbf{X}_1, \ldots, \mathbf{X}_n$ are mutually independent random vectors if and only if there exist functions $g_i(\mathbf{x}_i)$, $i = 1, \ldots, n$, such that the joint pdf or pmf of $(\mathbf{X}_1, \ldots, \mathbf{X}_n)$ can be written as
$$f(\mathbf{x}_1, \ldots, \mathbf{x}_n) = g_1(\mathbf{x}_1) \cdots g_n(\mathbf{x}_n).$$
Theorem 4.6.12 (Generalization of Theorem 4.3.5)
Let $\mathbf{X}_1, \ldots, \mathbf{X}_n$ be mutually independent random vectors. Let $g_i(\mathbf{x}_i)$ be a function only of $\mathbf{x}_i$, $i = 1, \ldots, n$. Then the random vectors $\mathbf{U}_i = g_i(\mathbf{X}_i)$, $i = 1, \ldots, n$, are mutually independent.
Let $(X_1, \ldots, X_n)$ be a random vector with pdf $f_{\mathbf{X}}(x_1, \ldots, x_n)$. Let $A = \{\mathbf{x} : f_{\mathbf{X}}(\mathbf{x}) > 0\}$.
Consider a new random vector $(U_1, \ldots, U_n)$, defined by $U_1 = g_1(X_1, \ldots, X_n), \ldots, U_n = g_n(X_1, \ldots, X_n)$. Suppose that $A_0, A_1, \ldots, A_k$ form a partition of $A$ with these properties.
The set $A_0$, which may be empty, satisfies $P((X_1, \ldots, X_n) \in A_0) = 0$. The transformation $(U_1, \ldots, U_n) = (g_1(\mathbf{X}), \ldots, g_n(\mathbf{X}))$ is a one-to-one transformation from $A_i$ onto $B$ for each $i = 1, 2, \ldots, k$. Then for each $i$, the inverse functions from $B$ to $A_i$ can be found. Denote the $i$th inverse by $x_1 = h_{1i}(u_1, \ldots, u_n), \ldots, x_n = h_{ni}(u_1, \ldots, u_n)$. Let $J_i$ denote the Jacobian computed from the $i$th inverse. That is,
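$$J_i = \begin{vmatrix}
\frac{\partial h_{1i}(\mathbf{u})}{\partial u_1} & \frac{\partial h_{1i}(\mathbf{u})}{\partial u_2} & \cdots & \frac{\partial h_{1i}(\mathbf{u})}{\partial u_n} \\
\frac{\partial h_{2i}(\mathbf{u})}{\partial u_1} & \frac{\partial h_{2i}(\mathbf{u})}{\partial u_2} & \cdots & \frac{\partial h_{2i}(\mathbf{u})}{\partial u_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial h_{ni}(\mathbf{u})}{\partial u_1} & \frac{\partial h_{ni}(\mathbf{u})}{\partial u_2} & \cdots & \frac{\partial h_{ni}(\mathbf{u})}{\partial u_n}
\end{vmatrix},$$
the determinant of an $n \times n$ matrix. Assuming that these Jacobians do not vanish identically on $B$, we have the following representation of the joint pdf, $f_{\mathbf{U}}(u_1, \ldots, u_n)$, for $\mathbf{u} \in B$:
$$f_{\mathbf{U}}(u_1, \ldots, u_n) = \sum_{i=1}^{k} f_{\mathbf{X}}\bigl(h_{1i}(u_1, \ldots, u_n), \ldots, h_{ni}(u_1, \ldots, u_n)\bigr)\,|J_i|.$$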
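The change-of-variables formula can be illustrated with the simplest non-one-to-one case, which is not taken from this section: $n = 1$, $X \sim N(0, 1)$ and $U = X^2$. Here one may take $A_0 = \{0\}$, $A_1 = (-\infty, 0)$, $A_2 = (0, \infty)$ and $B = (0, \infty)$, with inverses $h_1(u) = -\sqrt{u}$ and $h_2(u) = \sqrt{u}$. The Python sketch below, whose function names are assumptions, sums the two branches and compares the result with the known chi-squared pdf with one degree of freedom.

```python
import math


def normal_pdf(x):
    """Standard normal pdf f_X."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# One (inverse, Jacobian) pair per set A_i of the partition.
branches = [
    (lambda u: -math.sqrt(u), lambda u: -1 / (2 * math.sqrt(u))),  # A_1 = (-inf, 0)
    (lambda u: math.sqrt(u), lambda u: 1 / (2 * math.sqrt(u))),    # A_2 = (0, inf)
]


def transformed_pdf(u, f_x, branches):
    """f_U(u) = sum over i of f_X(h_i(u)) * |J_i(u)|, valid for u in B."""
    return sum(f_x(h(u)) * abs(jac(u)) for h, jac in branches)


def chi2_1_pdf(u):
    """Chi-squared pdf with 1 degree of freedom, for u > 0."""
    return math.exp(-u / 2) / math.sqrt(2 * math.pi * u)


for u in (0.1, 1.0, 2.5, 7.0):
    print(u, transformed_pdf(u, normal_pdf, branches), chi2_1_pdf(u))
```

In higher dimensions the same structure applies, with each Jacobian replaced by the determinant of the $n \times n$ matrix of partial derivatives of the $i$th inverse.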