Sums of Random Variables from a Random Sample
Definition 5.2.1
Let $X_1, \ldots, X_n$ be a random sample of size $n$ from a population and let $T(x_1, \ldots, x_n)$ be a real-valued or vector-valued function whose domain includes the sample space of $(X_1, \ldots, X_n)$.
Then the random variable or random vector $Y = T(X_1, \ldots, X_n)$ is called a statistic. The probability distribution of a statistic $Y$ is called the sampling distribution of $Y$.
The definition of a statistic is very broad, with the only restriction being that a statistic cannot be a function of a parameter. Three statistics that are often used and provide good summaries of the sample are now defined.
Definition
The sample mean is the arithmetic average of the values in a random sample. It is usually denoted by
\[ \bar{X} = \frac{X_1 + \cdots + X_n}{n} = \frac{1}{n} \sum_{i=1}^{n} X_i. \]
Definition
The sample variance is the statistic defined by
\[ S^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar{X})^2. \]
The sample standard deviation is the statistic defined by $S = \sqrt{S^2}$.
The sample variance and standard deviation are measures of variability in the sample that are related to the population variance and standard deviation.
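As a minimal sketch, the three statistics can be computed directly from their definitions. The data values are hypothetical, and the function names `sample_mean`, `sample_variance`, and `sample_sd` are illustrative choices, not notation from the text. The standard library's `statistics.variance` also uses the $n - 1$ divisor, so it serves as a cross-check.

```python
import math
import statistics

def sample_mean(xs):
    """Arithmetic average of the observations."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Sample variance S^2 with divisor n - 1."""
    n = len(xs)
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (n - 1)

def sample_sd(xs):
    """Sample standard deviation S = sqrt(S^2)."""
    return math.sqrt(sample_variance(xs))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical observations

xbar = sample_mean(data)
s2 = sample_variance(data)
s = sample_sd(data)

# Cross-check against the standard library (also divides by n - 1).
s2_stdlib = statistics.variance(data)
```

For these eight values the squared deviations about the mean sum to 32, so $S^2 = 32/7$.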
Theorem 5.2.4
Let $x_1, \ldots, x_n$ be any numbers and $\bar{x} = (x_1 + \cdots + x_n)/n$. Then
(a) $\min_a \sum_{i=1}^{n} (x_i - a)^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$.
(b) $(n-1)s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - n\bar{x}^2$.
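Both parts of Theorem 5.2.4 can be checked numerically on a small data set. The sketch below uses hypothetical numbers and a coarse grid search for part (a). The grid only illustrates the minimization and does not prove it, and the helper name `sum_sq` is an assumption of this example.

```python
def sum_sq(xs, a):
    """Sum of squared deviations of xs about the point a."""
    return sum((x - a) ** 2 for x in xs)

xs = [1.0, 3.0, 4.0, 8.0]   # hypothetical numbers
n = len(xs)
xbar = sum(xs) / n

# Part (a): among grid points 0.00, 0.01, ..., 10.00, the minimizer is xbar.
grid = [i / 100 for i in range(0, 1001)]
best_a = min(grid, key=lambda a: sum_sq(xs, a))

# Part (b): centered sum of squares versus the computational shortcut.
centered = sum_sq(xs, xbar)
shortcut = sum(x * x for x in xs) - n * xbar ** 2
```

Here `centered` and `shortcut` agree, and `best_a` coincides with `xbar`.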
Lemma 5.2.5
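Let $X_1, \ldots, X_n$ be a random sample from a population and let $g(x)$ be a function such that $E\,g(X_1)$ and $\operatorname{Var}\,g(X_1)$ exist. Then
\[ E\Big( \sum_{i=1}^{n} g(X_i) \Big) = n \big( E\,g(X_1) \big) \]
and
\[ \operatorname{Var}\Big( \sum_{i=1}^{n} g(X_i) \Big) = n \big( \operatorname{Var}\,g(X_1) \big). \]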
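Lemma 5.2.5 can be verified exactly for a small discrete population by enumerating every possible sample. The sketch below assumes a fair six-sided die as the population, $g(x) = x^2$, and $n = 3$. These choices are illustrative only, and exact rational arithmetic avoids any rounding.

```python
from fractions import Fraction
from itertools import product

support = [1, 2, 3, 4, 5, 6]   # fair die, an illustrative population
p = Fraction(1, 6)             # probability of each face
n = 3

def g(x):
    return x * x

# Moments of g(X_1) for a single observation.
e_g = sum(p * g(x) for x in support)
var_g = sum(p * g(x) ** 2 for x in support) - e_g ** 2

# Exact moments of the sum over all 6**n equally likely samples.
prob = p ** n
totals = [sum(g(x) for x in sample) for sample in product(support, repeat=n)]
e_sum = sum(prob * t for t in totals)
var_sum = sum(prob * t * t for t in totals) - e_sum ** 2
```

The enumeration gives `e_sum == n * e_g` and `var_sum == n * var_g` exactly, as the lemma states.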
Theorem 5.2.6
Let $X_1, \ldots, X_n$ be a random sample from a population with mean $\mu$ and variance $\sigma^2 < \infty$.
Then
(a) $E\bar{X} = \mu$.
(b) $\operatorname{Var}\bar{X} = \sigma^2/n$.
(c) $ES^2 = \sigma^2$.
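Proof: We prove only part (c). Using Theorem 5.2.4(b) to rewrite $(n-1)S^2$,
\[
\begin{aligned}
ES^2 &= E\Big( \frac{1}{n-1} \Big[ \sum_{i=1}^{n} X_i^2 - n\bar{X}^2 \Big] \Big) \\
&= \frac{1}{n-1} \big( n\,EX_1^2 - n\,E\bar{X}^2 \big) \\
&= \frac{1}{n-1} \Big( n(\sigma^2 + \mu^2) - n\Big( \frac{\sigma^2}{n} + \mu^2 \Big) \Big) = \sigma^2.
\end{aligned}
\]
The last line uses $EW^2 = \operatorname{Var} W + (EW)^2$, applied to $X_1$ and, with parts (a) and (b), to $\bar{X}$.
□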
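A small Monte Carlo experiment can illustrate Theorem 5.2.6. The sketch below assumes a normal population with $\mu = 1$ and $\sigma = 2$ and samples of size $n = 5$. The population, seed, and replication count are arbitrary choices, and the averages only approximate the expectations. It also records the divisor-$n$ version of the variance to show why $n - 1$ is used.

```python
import random
import statistics

rng = random.Random(0)                 # fixed seed for reproducibility
mu, sigma, n, reps = 1.0, 2.0, 5, 20000

means, s2_unbiased, s2_biased = [], [], []
for _ in range(reps):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    ss = sum((x - xbar) ** 2 for x in sample)
    means.append(xbar)
    s2_unbiased.append(ss / (n - 1))   # S^2 as defined in the text
    s2_biased.append(ss / n)           # divisor n, for comparison

avg_mean = statistics.fmean(means)           # should be near mu
var_mean = statistics.pvariance(means)       # should be near sigma^2 / n
avg_s2 = statistics.fmean(s2_unbiased)       # should be near sigma^2
avg_s2_biased = statistics.fmean(s2_biased)  # near (n - 1) / n * sigma^2
```

With these settings the targets are $\mu = 1$, $\sigma^2/n = 0.8$, and $\sigma^2 = 4$, while the divisor-$n$ estimator averages near $3.2$.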
The following theorems concern the distribution of a statistic.
Theorem 5.2.7
Let $X_1, \ldots, X_n$ be a random sample from a population with mgf $M_X(t)$. Then the mgf of the sample mean is
\[ M_{\bar{X}}(t) = [M_X(t/n)]^n. \]
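The theorem is stated without proof. A short sketch of the argument, for values of $t$ at which the mgfs exist, uses only the independence and identical distribution of the $X_i$:

```latex
\begin{aligned}
M_{\bar{X}}(t) &= E\, e^{t\bar{X}} = E\, e^{(t/n)(X_1 + \cdots + X_n)} \\
&= \prod_{i=1}^{n} E\, e^{(t/n) X_i} && \text{(independence)} \\
&= [M_X(t/n)]^n && \text{(identical distributions)}.
\end{aligned}
```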
Example (Distribution of the mean)
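Let $X_1, \ldots, X_n$ be a random sample from a $N(\mu, \sigma^2)$ population. Then the mgf of the sample mean is
\[ M_{\bar{X}}(t) = \Big[ \exp\Big( \mu \frac{t}{n} + \frac{\sigma^2}{2} (t/n)^2 \Big) \Big]^n = \exp\Big( \mu t + \frac{\sigma^2/n}{2} t^2 \Big). \]
Thus, $\bar{X}$ has a $N(\mu, \sigma^2/n)$ distribution.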
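The mgf of the sample mean of a gamma$(\alpha, \beta)$ random sample is
\[ M_{\bar{X}}(t) = \Big[ \Big( \frac{1}{1 - \beta(t/n)} \Big)^{\alpha} \Big]^n = \Big( \frac{1}{1 - (\beta/n)t} \Big)^{n\alpha}, \]
which we recognize as the mgf of a gamma$(n\alpha, \beta/n)$, the distribution of $\bar{X}$.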
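The gamma result can be checked by simulation: the empirical mgf of simulated sample means should be close to the gamma$(n\alpha, \beta/n)$ mgf. The sketch below assumes $\alpha = 2$, $\beta = 1.5$, and $n = 4$. The evaluation points $t$ are arbitrary values below $n/\beta$, and the agreement is only approximate.

```python
import math
import random

rng = random.Random(1)                 # fixed seed for reproducibility
alpha, beta, n, reps = 2.0, 1.5, 4, 20000

# random.gammavariate(alpha, beta) treats beta as a scale parameter,
# so the population mean is alpha * beta, matching gamma(alpha, beta) here.
xbars = [
    sum(rng.gammavariate(alpha, beta) for _ in range(n)) / n
    for _ in range(reps)
]

def mgf_xbar(t):
    """mgf of gamma(n * alpha, beta / n), valid for t < n / beta."""
    return (1.0 - (beta / n) * t) ** (-n * alpha)

def empirical_mgf(t):
    """Monte Carlo estimate of E exp(t * Xbar)."""
    return sum(math.exp(t * x) for x in xbars) / reps

# Pairs (empirical, theoretical) at two illustrative values of t.
comparison = {t: (empirical_mgf(t), mgf_xbar(t)) for t in (-0.5, 0.2)}
```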
Theorem 5.2.7 is not applicable when either the resulting mgf of $\bar{X}$ is unrecognizable or the population mgf does not exist. In such cases, the following convolution formula is useful.
Theorem 5.2.9
If $X$ and $Y$ are independent continuous random variables with pdfs $f_X(x)$ and $f_Y(y)$, then the pdf of $Z = X + Y$ is
\[ f_Z(z) = \int_{-\infty}^{\infty} f_X(w) f_Y(z - w)\,dw. \]
Proof: Let $W = X$. The Jacobian of the transformation from $(X, Y)$ to $(Z, W)$ is 1. So the joint pdf of $(Z, W)$ is
\[ f_{Z,W}(z, w) = f_{X,Y}(w, z - w) = f_X(w) f_Y(z - w). \]
Integrating out $w$, we obtain the marginal pdf of $Z$ and finish the proof. □
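The convolution integral can be approximated numerically when no closed form is at hand. The sketch below uses a midpoint rule on a truncated range and checks it on two independent $N(0, 1)$ variables, whose sum is known to be $N(0, 2)$. The function names, the integration limits, and the step count are assumptions of this example.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def convolve_pdf(f_x, f_y, z, lo=-20.0, hi=20.0, steps=4000):
    """Midpoint-rule approximation of the integral of f_x(w) * f_y(z - w) dw.

    The infinite range is truncated to [lo, hi], which is adequate only
    when both densities are negligible outside that interval.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        w = lo + (k + 0.5) * h
        total += f_x(w) * f_y(z - w)
    return total * h

z = 1.3                                          # arbitrary evaluation point
approx = convolve_pdf(normal_pdf, normal_pdf, z)
exact = normal_pdf(z, sigma=math.sqrt(2.0))      # N(0, 2) density at z
```

For heavy-tailed densities the truncation limits would need to be much wider.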
Example (Sum of Cauchy random variables)
As an example of a situation where the mgf technique fails, consider sampling from a Cauchy distribution. Let $U$ and $V$ be independent Cauchy random variables, $U \sim \text{Cauchy}(0, \sigma)$ and $V \sim \text{Cauchy}(0, \tau)$; that is,
\[ f_U(u) = \frac{1}{\pi\sigma} \frac{1}{1 + (u/\sigma)^2}, \qquad f_V(v) = \frac{1}{\pi\tau} \frac{1}{1 + (v/\tau)^2}, \]
where $-\infty < u, v < \infty$. Based on the convolution formula, the pdf of $Z = U + V$ is given by
\[ f_Z(z) = \int_{-\infty}^{\infty} \frac{1}{\pi\sigma} \frac{1}{1 + (w/\sigma)^2} \, \frac{1}{\pi\tau} \frac{1}{1 + ((z - w)/\tau)^2}\,dw = \frac{1}{\pi(\sigma + \tau)} \frac{1}{1 + (z/(\sigma + \tau))^2}, \]
where $-\infty < z < \infty$. Thus, the sum of two independent Cauchy random variables is again a Cauchy, with the scale parameters adding. It therefore follows that if $Z_1, \ldots, Z_n$ are iid Cauchy$(0, 1)$ random variables, then $\sum Z_i$ is Cauchy$(0, n)$ and also $\bar{Z}$ is Cauchy$(0, 1)$. The sample mean has the same distribution as the individual observations.
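The claim that averaging does not concentrate a Cauchy sample can be illustrated by simulation. Since the quartiles of a Cauchy$(0, 1)$ distribution are $\pm 1$, the probability that $|\bar{Z}| \le 1$ should stay near $0.5$ for every $n$. The sketch below draws Cauchy variates by inverting the cdf. The helper names, seed, and replication count are assumptions of this example.

```python
import math
import random

rng = random.Random(2)   # fixed seed for reproducibility

def cauchy(scale=1.0):
    """Draw from Cauchy(0, scale) by inverting the cdf."""
    return scale * math.tan(math.pi * (rng.random() - 0.5))

def frac_within_one(n, reps=4000):
    """Estimate P(|mean of n Cauchy(0, 1) draws| <= 1)."""
    hits = 0
    for _ in range(reps):
        zbar = sum(cauchy() for _ in range(n)) / n
        hits += abs(zbar) <= 1.0
    return hits / reps

# The estimated probability should be near 0.5 regardless of n.
results = {n: frac_within_one(n) for n in (1, 10, 100)}
```

For a population with finite variance the same probability would approach 1 as $n$ grows. Here it does not move.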
Theorem 5.2.11
Suppose $X_1, \ldots, X_n$ is a random sample from a pdf or pmf $f(x|\theta)$, where
\[ f(x|\theta) = h(x) c(\theta) \exp\Big( \sum_{i=1}^{k} w_i(\theta) t_i(x) \Big) \]
is a member of an exponential family. Define statistics $T_1, \ldots, T_k$ by
\[ T_i(X_1, \ldots, X_n) = \sum_{j=1}^{n} t_i(X_j), \quad i = 1, \ldots, k. \]
If the set $\{(w_1(\theta), w_2(\theta), \ldots, w_k(\theta)), \theta \in \Theta\}$ contains an open subset of $\mathbb{R}^k$, then the distribution of $(T_1, \ldots, T_k)$ is an exponential family of the form
\[ f_T(u_1, \ldots, u_k|\theta) = H(u_1, \ldots, u_k) [c(\theta)]^n \exp\Big( \sum_{i=1}^{k} w_i(\theta) u_i \Big). \]
The open set condition eliminates a density such as the $N(\theta, \theta^2)$ and, in general, eliminates curved exponential families from Theorem 5.2.11.
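A one-parameter case makes Theorem 5.2.11 concrete. The sketch below assumes a Bernoulli$(p)$ population, written in exponential-family form with $h(x) = 1$, $c(p) = 1 - p$, $w(p) = \log(p/(1-p))$, and $t(x) = x$. It then compares the exact pmf of $T = \sum_j X_j$, obtained by enumeration, with the form given by the theorem, taking $H(u) = \binom{n}{u}$. The values $n = 5$ and $p = 0.3$ are illustrative.

```python
import math
from itertools import product

n, p = 5, 0.3   # illustrative sample size and parameter

# Bernoulli(p) pmf as h(x) c(p) exp(w(p) t(x)) with h = 1 and t(x) = x.
c = 1.0 - p
w = math.log(p / (1.0 - p))

def f(x):
    """Bernoulli pmf in exponential-family form, for x in {0, 1}."""
    return c * math.exp(w * x)

# Exact pmf of T = sum of t(X_j), by enumerating all 2**n samples.
pmf_T = [0.0] * (n + 1)
for sample in product((0, 1), repeat=n):
    pmf_T[sum(sample)] += math.prod(f(x) for x in sample)

# Form given by the theorem: H(u) [c(p)]^n exp(w(p) u), with H(u) = C(n, u).
claimed = [math.comb(n, u) * c ** n * math.exp(w * u) for u in range(n + 1)]

max_gap = max(abs(a - b) for a, b in zip(pmf_T, claimed))
```

Here $w(p)$ ranges over the whole real line as $p$ varies in $(0, 1)$, so the open set condition holds, and the resulting pmf is the binomial$(n, p)$.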