
Multivariate Distribution

The random vector X = (X1, . . . , Xn) has a sample space that is a subset of Rn. If X is a discrete random vector, then the joint pmf of X is the function defined by f(x) = f(x1, . . . , xn) = P(X1 = x1, . . . , Xn = xn) for each (x1, . . . , xn) ∈ Rn. Then for any A ⊂ Rn,

P(X ∈ A) = ∑_{x∈A} f(x).

If X is a continuous random vector, the joint pdf of X is a function f (x1, . . . , xn) that satisfies

P(X ∈ A) = ∫···∫_A f(x) dx = ∫···∫_A f(x1, . . . , xn) dx1 ··· dxn.

Let g(x) = g(x1, . . . , xn) be a real-valued function defined on the sample space of X. Then g(X) is a random variable and the expected value of g(X) is

Eg(X) = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} g(x)f(x) dx

and

Eg(X) = ∑_{x∈Rn} g(x)f(x)

in the continuous and discrete cases, respectively.
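As a quick illustration of these two formulas (not part of the original text), the discrete versions can be checked directly on a small made-up joint pmf; the table of probabilities below is arbitrary and only meant to show the summations, written as a minimal Python sketch:

# Minimal sketch: a made-up joint pmf for (X1, X2) on a small support.
pmf = {(0, 0): 0.10, (0, 1): 0.20, (0, 2): 0.10,
       (1, 0): 0.15, (1, 1): 0.25, (1, 2): 0.20}
assert abs(sum(pmf.values()) - 1.0) < 1e-12   # a valid pmf sums to 1

# P(X in A) = sum of f(x) over x in A; here A = {x : x1 + x2 >= 2}
prob = sum(p for x, p in pmf.items() if x[0] + x[1] >= 2)

# E g(X) = sum over the support of g(x) f(x); here g(x1, x2) = x1 * x2
expectation = sum(x[0] * x[1] * p for x, p in pmf.items())

print(prob, expectation)   # ~0.55  ~0.65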

The marginal distribution of (X1, . . . , Xk), the first k coordinates of (X1, . . . , Xn), is given by the pdf or pmf

f(x1, . . . , xk) = ∫_{−∞}^{∞} ··· ∫_{−∞}^{∞} f(x1, . . . , xn) dx_{k+1} ··· dx_n

or

f(x1, . . . , xk) = ∑_{(x_{k+1},...,x_n)∈R^{n−k}} f(x1, . . . , xn)

for every (x1, . . . , xk) ∈ Rk.

If f (x1, . . . , xk) > 0, the conditional pdf or pmf of (Xk+1, . . . , Xn) given X1 = x1, . . . , Xk = xk is the function of (xk+1, . . . , xn) defined by

f(x_{k+1}, . . . , xn | x1, . . . , xk) = f(x1, . . . , xn) / f(x1, . . . , xk).


Example 4.6.1 (Multivariate pdfs) Let n = 4 and

f(x1, x2, x3, x4) = (3/4)(x1^2 + x2^2 + x3^2 + x4^2) for 0 < xi < 1, i = 1, 2, 3, 4, and f(x1, x2, x3, x4) = 0 otherwise.

The joint pdf can be used to compute probabilities such as

P(X1 < 1/2, X2 < 3/4, X4 > 1/2) = ∫_{1/2}^{1} ∫_{0}^{1} ∫_{0}^{3/4} ∫_{0}^{1/2} (3/4)(x1^2 + x2^2 + x3^2 + x4^2) dx1 dx2 dx3 dx4 = 171/1024.

The marginal pdf of (X1, X2) is

f(x1, x2) = ∫_{0}^{1} ∫_{0}^{1} (3/4)(x1^2 + x2^2 + x3^2 + x4^2) dx3 dx4 = (3/4)(x1^2 + x2^2) + 1/2

for 0 < x1 < 1 and 0 < x2 < 1.
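The probability above can be double-checked exactly (this check is not part of the original text): each xi^2 term integrates separately over the rectangular region, so the whole integral reduces to a short exact computation, sketched here with Python's standard fractions module.

from fractions import Fraction as F

# Region for P(X1 < 1/2, X2 < 3/4, X4 > 1/2); x3 is unrestricted on (0, 1).
bounds = [(F(0), F(1, 2)), (F(0), F(3, 4)), (F(0), F(1)), (F(1, 2), F(1))]
lengths = [b - a for a, b in bounds]
sq_int = [(b**3 - a**3) / 3 for a, b in bounds]   # integral of x^2 over (a, b)

# (3/4) * sum_i x_i^2 splits into four terms; term i integrates to
# (integral of x_i^2 over its interval) times the lengths of the other intervals.
total = F(0)
for i in range(4):
    term = sq_int[i]
    for j in range(4):
        if j != i:
            term *= lengths[j]
    total += term

print(F(3, 4) * total)   # 171/1024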

Definition 4.6.2 Let n and m be positive integers and let p1, . . . , pn be numbers satisfying 0 ≤ pi ≤ 1, i = 1, . . . , n, and ∑_{i=1}^{n} pi = 1. Then the random vector (X1, . . . , Xn) has a multinomial distribution with m trials and cell probabilities p1, . . . , pn if the joint pmf of (X1, . . . , Xn) is

f(x1, . . . , xn) = (m! / (x1! ··· xn!)) p1^{x1} ··· pn^{xn} = m! ∏_{i=1}^{n} (pi^{xi} / xi!)

on the set of (x1, . . . , xn) such that each xi is a nonnegative integer and ∑_{i=1}^{n} xi = m.

Example 4.6.3 (Multivariate pmf) Consider tossing a six-sided die 10 times. Suppose the die is unbalanced so that the probability of observing an i is i/21. Now consider the vector (X1, . . . , X6), where Xi counts the number of times i comes up in the 10 tosses.

Then (X1, . . . , X6) has a multinomial distribution with m = 10 and cell probabilities p1 = 1/21, . . . , p6 = 6/21. For example, the probability of the vector (0, 0, 1, 2, 3, 4) is

f(0, 0, 1, 2, 3, 4) = (10! / (0! 0! 1! 2! 3! 4!)) (1/21)^0 (2/21)^0 (3/21)^1 (4/21)^2 (5/21)^3 (6/21)^4 = 0.0059.

The factor m! / (x1! ··· xn!) is called a multinomial coefficient. It is the number of ways that m objects can be divided into n groups with x1 in the first group, x2 in the second group, . . ., and xn in the nth group.
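A short computational check of this number (not in the original text), using only the Python standard library:

from math import factorial, prod

def multinomial_pmf(x, p):
    """Joint pmf of a multinomial vector x with cell probabilities p."""
    m = sum(x)
    coef = factorial(m) // prod(factorial(xi) for xi in x)   # multinomial coefficient
    return coef * prod(pi ** xi for pi, xi in zip(p, x))

p = [i / 21 for i in range(1, 7)]               # unbalanced die: P(i) = i/21
print(multinomial_pmf((0, 0, 1, 2, 3, 4), p))   # ~0.0059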


Theorem 4.6.4 (Multinomial Theorem)

Let m and n be positive integers. Let A be the set of vectors x = (x1, . . . , xn) such that each xi is a nonnegative integer and ∑_{i=1}^{n} xi = m. Then, for any real numbers p1, . . . , pn,

(p1 + ··· + pn)^m = ∑_{x∈A} (m! / (x1! ··· xn!)) p1^{x1} ··· pn^{xn}.
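A brute-force numerical check of the theorem (not in the original text) for a small m and n, reusing the multinomial coefficient from above; note that the pi here are arbitrary reals and need not sum to 1.

from itertools import product
from math import factorial, prod

def multinomial_expansion(p, m):
    """Right-hand side of the multinomial theorem: sum over all x with sum(x) = m."""
    total = 0.0
    for x in product(range(m + 1), repeat=len(p)):
        if sum(x) == m:
            coef = factorial(m) // prod(factorial(xi) for xi in x)
            total += coef * prod(pi ** xi for pi, xi in zip(p, x))
    return total

p, m = [0.25, 0.5, 1.25], 4
print(multinomial_expansion(p, m), sum(p) ** m)   # both print 16.0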

Definition 4.6.5 Let X1, . . . , Xn be random vectors with joint pdf or pmf f (x1, . . . , xn).

Let fXi(xi) denote the marginal pdf or pmf of Xi. Then X1, . . . , Xn are called mutually independent random vectors if, for every (x1, . . . , xn),

f(x1, . . . , xn) = fX1(x1) ··· fXn(xn) = ∏_{i=1}^{n} fXi(xi).

If the Xi's are all one-dimensional, then X1, . . . , Xn are called mutually independent random variables.

Mutually independent random variables have many nice properties. The proofs of the following theorems are analogous to the proofs of their counterparts in Sections 4.2 and 4.3.

Theorem 4.6.6 (Generalization of Theorem 4.2.10)

Let X1, . . . , Xn be mutually independent random variables. Let g1, . . . , gn be real-valued functions such that gi(xi) is a function only of xi, i = 1, . . . , n. Then

E(g1(X1) ··· gn(Xn)) = (Eg1(X1)) ··· (Egn(Xn)).

Theorem 4.6.7 (Generalization of Theorem 4.2.12)

Let X1, . . . , Xn be mutually independent random variables with mgfs MX1(t), . . . , MXn(t).

Let Z = X1+ · · · + Xn. Then the mgf of Z is

MZ(t) = MX1(t) · · · MXn(t).

In particular, if X1, . . . , Xn all have the same distribution with mgf MX(t), then MZ(t) = (MX(t))^n.


Example 4.6.8 (Mgf of a sum of gamma variables)

Suppose X1, . . . , Xn are mutually independent random variables, and the distribution of Xi is gamma(αi, β). Thus, if Z = X1+ . . . + Xn, the mgf of Z is

MZ(t) = MX1(t) ··· MXn(t) = (1 − βt)^{−α1} ··· (1 − βt)^{−αn} = (1 − βt)^{−(α1+···+αn)}. This is the mgf of a gamma(α1 + ··· + αn, β) distribution. Thus, the sum of independent gamma random variables that have a common scale parameter β also has a gamma distribution.
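A Monte Carlo sketch of this fact (not from the text): it assumes numpy, treats β as the scale parameter to match the mgf (1 − βt)^(−α) used above, and only compares the first two moments of the simulated sum with those of gamma(α1 + ··· + αn, β).

import numpy as np

rng = np.random.default_rng(0)
alphas, beta, n = [1.5, 2.0, 0.5], 2.0, 200_000

# Independent gamma(alpha_i, beta) draws with a common scale beta, then summed.
z = sum(rng.gamma(a, beta, size=n) for a in alphas)

# gamma(sum(alphas), beta) has mean sum(alphas)*beta and variance sum(alphas)*beta^2.
print(z.mean(), sum(alphas) * beta)       # both ~ 8.0
print(z.var(),  sum(alphas) * beta ** 2)  # both ~ 16.0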

Example

Let X1, . . . , Xn be mutually independent random variables with Xi ∼ N(µi, σi^2). Let a1, . . . , an and b1, . . . , bn be fixed constants. Then

Z = ∑_{i=1}^{n} (ai Xi + bi) ∼ N(∑_{i=1}^{n} (ai µi + bi), ∑_{i=1}^{n} ai^2 σi^2).
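Another Monte Carlo sketch (not from the text, assumes numpy): simulate Z for some arbitrary constants and compare its sample mean and variance with the stated normal parameters.

import numpy as np

rng = np.random.default_rng(1)
mu    = np.array([0.0, 2.0, -1.0])
sigma = np.array([1.0, 0.5,  2.0])
a     = np.array([3.0, -1.0, 0.5])
b     = np.array([1.0, 0.0,  4.0])

x = rng.normal(mu, sigma, size=(500_000, 3))   # column i holds draws of X_i
z = (a * x + b).sum(axis=1)

print(z.mean(), (a * mu + b).sum())            # both ~ 2.5
print(z.var(),  (a**2 * sigma**2).sum())       # both ~ 10.25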

Theorem 4.6.11 (Generalization of Lemma 4.2.7)

Let X1, . . . , Xn be random vectors. Then X1, . . . , Xn are mutually independent random vectors if and only if there exist functions gi(xi), i = 1, . . . , n, such that the joint pdf or pmf of (X1, . . . , Xn) can be written as

f (x1, . . . , xn) = g1(x1) · · · gn(xn).
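As an illustration (not part of the original text): the pdf f(x, y) = 6e^{−2x−3y}, x > 0, y > 0, factors as g1(x)g2(y) with g1(x) = 2e^{−2x} and g2(y) = 3e^{−3y}, so X and Y are independent; by contrast, the pdf in Example 4.6.1 involves the sum x1^2 + x2^2 + x3^2 + x4^2 and admits no such factorization, so X1, . . . , X4 are not mutually independent.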

Theorem 4.6.12 (Generalization of Theorem 4.3.5)

Let X1, . . . , Xn be mutually independent random vectors. Let gi(xi) be a function only of xi, i = 1, . . . , n. Then the random vectors Ui = gi(Xi), i = 1, . . . , n, are mutually independent.

Let (X1, . . . , Xn) be a random vector with pdf fX(x1, . . . , xn). Let A = {x : fX(x) > 0}. Consider a new random vector (U1, . . . , Un), defined by U1 = g1(X1, . . . , Xn), . . . , Un = gn(X1, . . . , Xn). Suppose that A0, A1, . . . , Ak form a partition of A with these properties. The set A0, which may be empty, satisfies P((X1, . . . , Xn) ∈ A0) = 0. The transformation (U1, . . . , Un) = (g1(X), . . . , gn(X)) is a one-to-one transformation from Ai onto B for each i = 1, 2, . . . , k. Then for each i, the inverse functions from B to Ai can be found. Denote the ith inverse by x1 = h1i(u1, . . . , un), . . . , xn = hni(u1, . . . , un). Let Ji denote the Jacobian computed from the ith inverse. That is,

Ji = | ∂h1i(u)/∂u1   ∂h1i(u)/∂u2   ···   ∂h1i(u)/∂un |
     | ∂h2i(u)/∂u1   ∂h2i(u)/∂u2   ···   ∂h2i(u)/∂un |
     |      ⋮              ⋮        ⋱         ⋮      |
     | ∂hni(u)/∂u1   ∂hni(u)/∂u2   ···   ∂hni(u)/∂un |,

the determinant of an n×n matrix. Assuming that these Jacobians do not vanish identically on B, we have the following representation of the joint pdf, fU(u1, . . . , un), for u ∈ B:

fU(u1, . . . , un) = ∑_{i=1}^{k} fX(h1i(u1, . . . , un), . . . , hni(u1, . . . , un)) |Ji|.
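As a small worked illustration of this formula (not from the text; it assumes sympy and an invented transformation and density), take n = 2 with U1 = X1 + X2 and U2 = X1 − X2, which is one-to-one on all of A, so k = 1 and the sum has a single term:

import sympy as sp

u1, u2, x1, x2 = sp.symbols('u1 u2 x1 x2', real=True)

# Single inverse (k = 1): x1 = h11(u), x2 = h21(u).
h11 = (u1 + u2) / 2
h21 = (u1 - u2) / 2

J1 = sp.Matrix([h11, h21]).jacobian(sp.Matrix([u1, u2])).det()
print(J1)   # -1/2, so |J1| = 1/2

# Example density fX(x1, x2) = exp(-x1 - x2) on x1, x2 > 0 (an invented choice);
# on the image set B the formula gives fU(u1, u2) = fX(h11, h21) * |J1|.
fX = sp.exp(-x1 - x2)
fU = sp.simplify(fX.subs({x1: h11, x2: h21}) * sp.Abs(J1))
print(fU)   # exp(-u1)/2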
