5.1 Basic Concepts of Random Samples

Definition 5.1.1

The random variables X₁, . . . , X_n are called a random sample of size n from the population f(x) if X₁, . . . , X_n are mutually independent random variables and the marginal pdf or pmf of each X_i is the same function f(x). Alternatively, X₁, . . . , X_n are called independent and identically distributed random variables with pdf or pmf f(x), commonly abbreviated to iid random variables.

If the population pdf or pmf is a member of a parametric family with pdf or pmf given by f (x|θ), then the joint pdf or pmf is

\[
f(x_1, \ldots, x_n \mid \theta) = \prod_{i=1}^{n} f(x_i \mid \theta),
\]

where the same parameter value θ is used in each of the terms in the product.

Example Let X₁, . . . , X_n be a random sample from an exponential(β) population. Specifically, X₁, . . . , X_n might correspond to the times (measured in years) until failure for n identical circuit boards that are put on test and used until they fail. The joint pdf of the sample is

\[
f(x_1, \ldots, x_n \mid \beta) = \prod_{i=1}^{n} f(x_i \mid \beta) = \frac{1}{\beta^n} e^{-\sum_{i=1}^{n} x_i / \beta}.
\]
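The product form and the closed form of the joint pdf can be compared numerically. The following is a minimal sketch, assuming that β is the mean (scale) of the exponential distribution, so that the marginal pdf is (1/β)e^(−x/β) for x ≥ 0. The function names and the failure times are hypothetical and chosen only for illustration.

```python
import math


def exponential_pdf(x: float, beta: float) -> float:
    """Marginal pdf of exponential(beta), assuming beta is the mean (scale)."""
    return math.exp(-x / beta) / beta if x >= 0 else 0.0


def joint_pdf_product(xs, beta):
    """Joint pdf of an iid sample: the product of the marginal pdfs."""
    return math.prod(exponential_pdf(x, beta) for x in xs)


def joint_pdf_closed_form(xs, beta):
    """Closed form: (1 / beta^n) * exp(-sum(x_i) / beta)."""
    if any(x < 0 for x in xs):
        return 0.0
    return math.exp(-sum(xs) / beta) / beta ** len(xs)


# Hypothetical failure times in years (illustrative data, not from the text).
failure_times = [0.5, 1.2, 3.0, 2.4]
beta = 2.0

print(joint_pdf_product(failure_times, beta))
print(joint_pdf_closed_form(failure_times, beta))
```

The two printed values should agree up to floating-point rounding, since the same value of β is used in every factor of the product.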

This pdf can be used to answer questions about the sample. For example, what is the probability that all the boards last more than 2 years?

\[
\begin{aligned}
P(X_1 > 2, \ldots, X_n > 2) &= P(X_1 > 2) \cdots P(X_n > 2) \\
&= [P(X_1 > 2)]^n = (e^{-2/\beta})^n = e^{-2n/\beta}.
\end{aligned}
\]
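The result e^(−2n/β) can be checked by simulation. The sketch below assumes β is the mean of the exponential distribution, so the rate passed to `random.expovariate` is 1/β. The values n = 3 and β = 4, the number of trials, and the function names are hypothetical choices for illustration.

```python
import math
import random


def prob_all_exceed(t: float, n: int, beta: float) -> float:
    """Closed form P(X_1 > t, ..., X_n > t) = exp(-t * n / beta) for an iid sample."""
    return math.exp(-t * n / beta)


def simulate_all_exceed(t: float, n: int, beta: float, trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # expovariate takes the rate 1/beta, not the mean beta.
        if all(rng.expovariate(1.0 / beta) > t for _ in range(n)):
            hits += 1
    return hits / trials


n, beta = 3, 4.0  # hypothetical: 3 boards with mean lifetime 4 years
exact = prob_all_exceed(2.0, n, beta)
estimate = simulate_all_exceed(2.0, n, beta, trials=50_000)
print(exact, estimate)
```

The estimate should be close to the exact value, with a Monte Carlo error that shrinks as the number of trials grows.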

Random sampling models

(a) Sampling from an infinite population. The samples are iid.

(b) Sampling with replacement from a finite population. The samples are iid.

(c) Sampling without replacement from a finite population. This sampling is sometimes called simple random sampling. The samples are not exactly iid: each X_i has the same marginal distribution, but the X_i are not mutually independent. However, if the population size N is large compared to the sample size n, the samples will be approximately iid.
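Models (b) and (c) can be illustrated with Python's standard `random` module. This is a minimal sketch, assuming a hypothetical population {1, . . . , 1000} and a sample of size 10; `choices` draws with replacement and `sample` draws without replacement.

```python
import random

rng = random.Random(0)  # fixed seed so the run is reproducible
population = list(range(1, 1001))  # hypothetical finite population, N = 1000

# Model (b): with replacement. Draws are iid; a value may repeat.
with_replacement = rng.choices(population, k=10)

# Model (c): without replacement. Values are distinct, so draws are dependent.
without_replacement = rng.sample(population, k=10)

print(with_replacement)
print(without_replacement)
```

In model (c) a value that has already been drawn cannot appear again, which is exactly the dependence that prevents the sample from being iid.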

Example 5.1.3 (Finite population model)

Suppose {1, . . . , 1000} is the finite population, so N = 1000. A sample of size n = 10 is drawn without replacement. What is the probability that all ten sample values are greater than 200? If X₁, . . . , X₁₀ were mutually independent we would have

\[
P(X_1 > 200, \ldots, X_{10} > 200) = \left( \frac{800}{1000} \right)^{10} = .107374.
\]

Without the independence assumption, we can calculate the probability by counting samples directly (a hypergeometric probability), as follows.

\[
P(X_1 > 200, \ldots, X_{10} > 200) = \frac{\binom{800}{10} \binom{200}{0}}{\binom{1000}{10}} = .106164.
\]

The two values differ by only about .0012. Thus, the independence assumption is approximately correct.
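Both probabilities in this example can be reproduced with exact integer binomial coefficients. The sketch below uses `math.comb` from the Python standard library; the variable names are illustrative. It also cross-checks the hypergeometric value against the equivalent sequential product (800/1000)(799/999) · · · (791/991).

```python
from math import comb, isclose, prod

N, n, threshold = 1000, 10, 200
above = N - threshold  # 800 population values exceed 200

# Independence approximation (as if sampling with replacement).
p_independent = (above / N) ** n

# Exact value without replacement: choose all 10 from the 800 values above 200.
p_exact = comb(above, n) * comb(threshold, 0) / comb(N, n)

# Same value as a product of conditional probabilities, one factor per draw.
p_sequential = prod((above - i) / (N - i) for i in range(n))

print(f"independent:         {p_independent:.6f}")
print(f"without replacement: {p_exact:.6f}")
print(f"difference:          {p_independent - p_exact:.6f}")
```

The printed values should match the figures in the example to the displayed precision. Increasing n relative to N in this sketch shows the approximation getting worse.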
