Sums of Random Variables from a Random Sample

Definition 5.2.1

Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a population and let $T(x_1, \ldots, x_n)$ be a real-valued or vector-valued function whose domain includes the sample space of $(X_1, \ldots, X_n)$.

Then the random variable or random vector $Y = T(X_1, \ldots, X_n)$ is called a statistic. The probability distribution of a statistic $Y$ is called the sampling distribution of $Y$.

The definition of a statistic is very broad, with the only restriction being that a statistic cannot be a function of a parameter. Three statistics that are often used and provide good summaries of the sample are now defined.

Definition

The sample mean is the arithmetic average of the values in a random sample. It is usually denoted by

$$\bar{X} = \frac{X_1 + \cdots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i.$$

Definition

The sample variance is the statistic defined by

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2.$$

The sample standard deviation is the statistic defined by $S = \sqrt{S^2}$.

The sample variance and standard deviation are measures of variability in the sample that are related to the population variance and standard deviation.

Theorem 5.2.4

Let $x_1, \ldots, x_n$ be any numbers and $\bar{x} = (x_1 + \cdots + x_n)/n$. Then

(a) $\min_a \sum_{i=1}^{n} (x_i - a)^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$.

(b) $(n-1)s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$.
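
The two identities of Theorem 5.2.4 can be checked on a small data set. The following is a minimal sketch using exact rational arithmetic; the data values and the helper names (`sample_mean`, `sample_variance`, `sum_sq_about`) are illustrative assumptions, not part of the text.

```python
from fractions import Fraction

def sample_mean(xs):
    """Arithmetic average of the values."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance with the n - 1 divisor."""
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

def sum_sq_about(xs, a):
    """Sum of squared deviations of the values about the point a."""
    return sum((x - a) ** 2 for x in xs)

# Hypothetical data, stored as exact fractions to avoid rounding error.
xs = [Fraction(v) for v in (2, 4, 4, 5, 7, 9)]
n = len(xs)
xbar = sample_mean(xs)
s2 = sample_variance(xs)

# Part (b): three expressions for the same quantity.
lhs = (n - 1) * s2
mid = sum_sq_about(xs, xbar)
rhs = sum(x * x for x in xs) - n * xbar ** 2

# Part (a): no point on a grid of candidate values of a beats a = xbar.
grid = [Fraction(k, 10) for k in range(0, 101)]
best_on_grid = min(sum_sq_about(xs, a) for a in grid)

print(xbar, s2, lhs, mid, rhs, best_on_grid >= mid)
```

The grid search only illustrates part (a); the theorem itself covers every real $a$.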

Lemma 5.2.5

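Let $X_1, \ldots, X_n$ be a random sample from a population and let $g(x)$ be a function such that $E g(X_1)$ and $\operatorname{Var} g(X_1)$ exist. Then

$$E\left(\sum_{i=1}^{n} g(X_i)\right) = n\,(E g(X_1))$$

and

$$\operatorname{Var}\left(\sum_{i=1}^{n} g(X_i)\right) = n\,(\operatorname{Var} g(X_1)).$$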

Theorem 5.2.6

Let $X_1, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2 < \infty$.

Then

(a) $E\bar{X} = \mu$.

(b) $\operatorname{Var}\bar{X} = \sigma^2/n$.

(c) $ES^2 = \sigma^2$.

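Proof: We just prove part (c) here. By Theorem 5.2.4(b), $(n-1)S^2 = \sum_{i=1}^{n} X_i^2 - n\bar{X}^2$. Using $EX_1^2 = \sigma^2 + \mu^2$ and, from parts (a) and (b), $E\bar{X}^2 = \sigma^2/n + \mu^2$, we obtain

$$\begin{aligned}
ES^2 &= E\left(\frac{1}{n-1}\left[\sum_{i=1}^{n} X_i^2 - n\bar{X}^2\right]\right) \\
&= \frac{1}{n-1}\left(nEX_1^2 - nE\bar{X}^2\right) \\
&= \frac{1}{n-1}\left(n(\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right)\right) = \sigma^2.
\end{aligned}$$

□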
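
Theorem 5.2.6 can be verified exactly for a small discrete population by enumerating every possible sample. The sketch below assumes a fair six-sided die as the population and a sample size of 3; both choices, and the function names, are illustrative rather than taken from the text.

```python
from fractions import Fraction
from itertools import product

# Assumed population: a fair die, each face with probability 1/6.
population = [Fraction(v) for v in range(1, 7)]
n = 3

mu = sum(population) / len(population)
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def xbar(sample):
    """Sample mean of one sample."""
    return sum(sample) / len(sample)

def s2(sample):
    """Sample variance of one sample (n - 1 divisor)."""
    m = xbar(sample)
    return sum((x - m) ** 2 for x in sample) / (len(sample) - 1)

# All 6**n equally likely samples of size n.
samples = list(product(population, repeat=n))
p = Fraction(1, len(samples))

e_xbar = sum(p * xbar(s) for s in samples)
var_xbar = sum(p * (xbar(s) - e_xbar) ** 2 for s in samples)
e_s2 = sum(p * s2(s) for s in samples)

print(e_xbar == mu, var_xbar == sigma2 / n, e_s2 == sigma2)
```

Because the arithmetic is exact, the three comparisons test equality rather than closeness. Replacing the divisor `len(sample) - 1` with `len(sample)` makes the last comparison fail, which is the reason for the $n - 1$ in the definition of $S^2$.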

About the distribution of a statistic, we have the following theorems.

Theorem 5.2.7

Let $X_1, \ldots, X_n$ be a random sample from a population with mgf $M_X(t)$. Then the mgf of the sample mean is

$$M_{\bar{X}}(t) = [M_X(t/n)]^n.$$

Example (Distribution of the mean)

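Let $X_1, \ldots, X_n$ be a random sample from a $N(\mu, \sigma^2)$ population. Then the mgf of the sample mean is

$$M_{\bar{X}}(t) = \left[\exp\left(\mu\frac{t}{n} + \frac{\sigma^2 (t/n)^2}{2}\right)\right]^n = \exp\left(\mu t + \frac{(\sigma^2/n)\,t^2}{2}\right).$$

Thus, $\bar{X}$ has a $N(\mu, \sigma^2/n)$ distribution.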

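The mgf of the sample mean of a gamma$(\alpha, \beta)$ random sample is

$$M_{\bar{X}}(t) = \left[\left(\frac{1}{1 - \beta(t/n)}\right)^{\alpha}\right]^n = \left(\frac{1}{1 - (\beta/n)t}\right)^{n\alpha},$$

which we recognize as the mgf of a gamma$(n\alpha, \beta/n)$, the distribution of $\bar{X}$.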
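
Both mgf calculations above can be checked numerically by evaluating $[M_X(t/n)]^n$ and the claimed mgf of $\bar{X}$ at a few points. This is a rough sketch; the parameter values, the evaluation points and the function names are assumptions chosen for illustration.

```python
import math

def mgf_of_sample_mean(mgf, n):
    """Theorem 5.2.7: the mgf of the sample mean is t -> [M_X(t/n)]^n."""
    return lambda t: mgf(t / n) ** n

def normal_mgf(mu, sigma2):
    """mgf of N(mu, sigma2)."""
    return lambda t: math.exp(mu * t + sigma2 * t * t / 2)

def gamma_mgf(alpha, beta):
    """mgf of gamma(alpha, beta) with scale beta; valid for t < 1/beta."""
    return lambda t: (1 - beta * t) ** (-alpha)

# Assumed parameter values and evaluation points.
n = 5
mu, sigma2 = 1.5, 4.0
alpha, beta = 2.0, 3.0
ts = [-0.2, -0.1, 0.05, 0.1, 0.2]

normal_ok = all(
    math.isclose(
        mgf_of_sample_mean(normal_mgf(mu, sigma2), n)(t),
        normal_mgf(mu, sigma2 / n)(t),
        rel_tol=1e-9,
    )
    for t in ts
)
gamma_ok = all(
    math.isclose(
        mgf_of_sample_mean(gamma_mgf(alpha, beta), n)(t),
        gamma_mgf(n * alpha, beta / n)(t),
        rel_tol=1e-9,
    )
    for t in ts
)
print(normal_ok, gamma_ok)
```

Agreement at finitely many points does not prove the identity, but it is a quick way to catch an algebra slip when matching an mgf to a known family.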

Theorem 5.2.7 is not applicable if either the resulting mgf of $\bar{X}$ is unrecognizable or the population mgf does not exist. In such cases, the following convolution formula is useful.

Theorem 5.2.9

If $X$ and $Y$ are independent continuous random variables with pdfs $f_X(x)$ and $f_Y(y)$, then the pdf of $Z = X + Y$ is

$$f_Z(z) = \int_{-\infty}^{\infty} f_X(w)\,f_Y(z - w)\,dw.$$

Proof: Let $W = X$. The Jacobian of the transformation from $(X, Y)$ to $(Z, W)$ is 1. So the joint pdf of $(Z, W)$ is

$$f_{Z,W}(z, w) = f_{X,Y}(w, z - w) = f_X(w)\,f_Y(z - w).$$

Integrating out $w$, we obtain the marginal pdf of $Z$ and finish the proof. □
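
The convolution formula can be evaluated numerically when a closed form is not at hand. The sketch below uses a plain midpoint rule and, as an assumed test case not taken from the text, two independent exponential(1) variables, whose sum has the gamma(2, 1) pdf $z e^{-z}$. The function names, integration limits and step count are illustrative choices.

```python
import math

def convolve_pdf(f_x, f_y, z, lo, hi, steps=200_000):
    """Midpoint-rule approximation of the integral of f_x(w) * f_y(z - w) over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        w = lo + (k + 0.5) * h
        total += f_x(w) * f_y(z - w)
    return total * h

def exp_pdf(x):
    """Exponential(1) density (assumed test case)."""
    return math.exp(-x) if x >= 0 else 0.0

def gamma2_pdf(z):
    """gamma(2, 1) density, the known distribution of the sum."""
    return z * math.exp(-z) if z >= 0 else 0.0

zs = [0.5, 1.0, 3.0]
# [0, 40] covers essentially all of the mass of f_x.
approx = [convolve_pdf(exp_pdf, exp_pdf, z, 0.0, 40.0) for z in zs]
exact = [gamma2_pdf(z) for z in zs]

for z, a, e in zip(zs, approx, exact):
    print(z, a, e)
```

The integration limits must cover the support of `f_x`. A density with heavy tails or an unbounded support needs wider limits or a change of variable.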

Example (Sum of Cauchy random variables)

As an example of a situation where the mgf technique fails (the Cauchy distribution has no mgf), consider sampling from a Cauchy distribution. Let $U$ and $V$ be independent Cauchy random variables, $U \sim \text{Cauchy}(0, \sigma)$ and $V \sim \text{Cauchy}(0, \tau)$; that is,

$$f_U(u) = \frac{1}{\pi\sigma}\,\frac{1}{1 + (u/\sigma)^2}, \qquad f_V(v) = \frac{1}{\pi\tau}\,\frac{1}{1 + (v/\tau)^2},$$

where $-\infty < u, v < \infty$. Based on the convolution formula, the pdf of $Z = U + V$ is given by

$$f_Z(z) = \int_{-\infty}^{\infty} \frac{1}{\pi\sigma}\,\frac{1}{1 + (w/\sigma)^2}\,\frac{1}{\pi\tau}\,\frac{1}{1 + ((z - w)/\tau)^2}\,dw = \frac{1}{\pi(\sigma + \tau)}\,\frac{1}{1 + (z/(\sigma + \tau))^2},$$

where $-\infty < z < \infty$. Thus, the sum of two independent Cauchy random variables is again Cauchy, with the scale parameters adding. It therefore follows that if $Z_1, \ldots, Z_n$ are iid Cauchy(0, 1) random variables, then $\sum_{i=1}^{n} Z_i$ is Cauchy$(0, n)$ and $\bar{Z}$ is Cauchy(0, 1). The sample mean has the same distribution as the individual observations.
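
The closed form for the Cauchy convolution can be checked numerically. The following sketch integrates the product of the two densities over a wide but finite interval with a midpoint rule and compares the result with the Cauchy$(0, \sigma + \tau)$ pdf. The scale values, the truncation width and the function names are assumptions made for illustration.

```python
import math

def cauchy_pdf(x, scale):
    """Cauchy(0, scale) density."""
    return 1.0 / (math.pi * scale * (1.0 + (x / scale) ** 2))

def convolved_cauchy_pdf(z, sigma, tau, half_width=1000.0, steps=200_000):
    """Midpoint-rule value of the convolution integral, truncated to
    [-half_width, half_width]; the neglected tails decay like 1/w**4."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps):
        w = -half_width + (k + 0.5) * h
        total += cauchy_pdf(w, sigma) * cauchy_pdf(z - w, tau)
    return total * h

# Assumed scale parameters and evaluation points.
sigma, tau = 1.0, 2.0
zs = [-3.0, 0.0, 2.5]
numeric = [convolved_cauchy_pdf(z, sigma, tau) for z in zs]
closed_form = [cauchy_pdf(z, sigma + tau) for z in zs]

for z, a, e in zip(zs, numeric, closed_form):
    print(z, a, e)
```

The truncation is harmless here because the integrand is a product of two Cauchy densities, even though each density alone has tails too heavy for a mean to exist.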

Theorem 5.2.11

Suppose $X_1, \ldots, X_n$ is a random sample from a pdf or pmf $f(x|\theta)$, where

$$f(x|\theta) = h(x)\,c(\theta)\exp\left(\sum_{i=1}^{k} w_i(\theta)\,t_i(x)\right)$$

is a member of an exponential family. Define statistics $T_1, \ldots, T_k$ by

$$T_i(X_1, \ldots, X_n) = \sum_{j=1}^{n} t_i(X_j), \quad i = 1, \ldots, k.$$

If the set $\{(w_1(\theta), w_2(\theta), \ldots, w_k(\theta)),\ \theta \in \Theta\}$ contains an open subset of $\mathbb{R}^k$, then the distribution of $(T_1, \ldots, T_k)$ is an exponential family of the form

$$f_T(u_1, \ldots, u_k|\theta) = H(u_1, \ldots, u_k)\,[c(\theta)]^n \exp\left(\sum_{i=1}^{k} w_i(\theta)\,u_i\right).$$

The open set condition eliminates a density such as the N(θ, θ²) and, in general, eliminates curved exponential families from Theorem 5.2.11.
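
As an illustration of Theorem 5.2.11, consider a Bernoulli($p$) sample, a case chosen here for simplicity. Its pmf can be written with $h(x) = 1$ on $\{0, 1\}$, $c(p) = 1 - p$, $w(p) = \log(p/(1-p))$ and $t(x) = x$, and the set of values of $w(p)$ for $0 < p < 1$ is all of $\mathbb{R}$, so the open set condition holds. The sketch below computes the pmf of $T = \sum_j X_j$ by brute-force enumeration and compares it with the form given by the theorem, taking $H(u) = \binom{n}{u}$. The values of $n$ and $p$ and the function names are assumptions.

```python
import math
from itertools import product

def h(x):
    return 1.0 if x in (0, 1) else 0.0

def c(p):
    return 1.0 - p

def w(p):
    return math.log(p / (1.0 - p))

def t(x):
    return x

def f(x, p):
    """Bernoulli(p) pmf written in exponential family form."""
    return h(x) * c(p) * math.exp(w(p) * t(x))

def pmf_T_by_enumeration(n, p):
    """pmf of T = sum of t(X_j), summing the joint pmf over all 2**n samples."""
    pmf = [0.0] * (n + 1)
    for sample in product((0, 1), repeat=n):
        prob = math.prod(f(x, p) for x in sample)
        pmf[sum(t(x) for x in sample)] += prob
    return pmf

def pmf_T_from_theorem(n, p):
    """H(u) * c(p)**n * exp(w(p) * u) with H(u) = C(n, u)."""
    return [math.comb(n, u) * c(p) ** n * math.exp(w(p) * u) for u in range(n + 1)]

# Assumed sample size and parameter value.
n, p = 6, 0.3
enumerated = pmf_T_by_enumeration(n, p)
from_theorem = pmf_T_from_theorem(n, p)
print(enumerated)
print(from_theorem)
```

The result is the binomial$(n, p)$ pmf. Here $H(u)$ counts the samples that share the same value of $T$.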
