Chapter 7 Sampling Distribution and Central Limit Theorem

7.1 Introduction

In this chapter, we will be working with functions of the variables $Y_1, Y_2, \ldots, Y_n$ observed in a random sample selected from a population of interest. Certain functions of the random variables observed in a sample are used to estimate or make decisions about unknown population parameters. For example, we usually use the sample mean $\bar{Y}$ to estimate a population mean $\mu$. Notice that the random variable $\bar{Y}$ is a function of (only) the random variables $Y_1, Y_2, \ldots, Y_n$ and the (constant) sample size $n$. The random variable $\bar{Y}$ is therefore an example of a statistic.

【Supplementary Definition】Random Sample

Let the random variables $Y_1, Y_2, \ldots, Y_n$ have a joint density function

$$f_{Y_1, Y_2, \ldots, Y_n}(y_1, y_2, \ldots, y_n) = f(y_1)\, f(y_2) \cdots f(y_n),$$

where $f(\cdot)$ is the common density function of each $Y_i$. Then $Y_1, Y_2, \ldots, Y_n$ is defined to be a random sample of size $n$ from a population with density $f(\cdot)$.

【Note】

【Example 1】

If $Y_1, Y_2, \ldots, Y_n$ is a random sample from a density $f(\cdot\,; \theta)$, then

$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$$

is a statistic.

Statistics are used to make inferences (estimates or decisions) about unknown population parameters. Because a statistic is a function of the random variables observed in a sample, the statistic itself is a random variable. Consequently, using the methods of Chapter 6, we will derive its probability distribution, which we will call the sampling distribution of the statistic.

The form of the theoretical sampling distribution of a statistic will depend upon the distribution of the observable random variables in the sample. In the next section, we will derive the sampling distributions for some statistics used to make inferences about the parameters of a normal distribution.

7.2 Sampling Distribution Related to the Normal Distribution

【Theorem 7.1】

Let $Y_1, Y_2, \ldots, Y_n$ be a random sample of size $n$ from $N(\mu, \sigma^2)$. Then

$$\bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i \sim N\!\left(\mu, \frac{\sigma^2}{n}\right).$$

【pf】

【Example 2】

A bottling machine can be regulated so that it discharges an average of μ ounces per bottle. It has been observed that the amount of fill dispensed by the machine is normally distributed with σ =1 ounce. A sample of n=9 filled bottles is randomly selected from the output of the machine on a given day (all bottled with the same machine setting) and the ounces of fill measured for each.

(1) Please find the probability that the sample mean will be within 0.3 ounce of the true mean $\mu$ for that particular setting.
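The calculation asked for in part (1) can be sketched numerically. The following is a minimal illustration, assuming only Theorem 7.1 (so that the sample mean has standard error σ/√n); the helper `phi` and the variable names are introduced here for illustration and are not part of the original notes.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


sigma = 1.0   # population standard deviation in ounces (from the example)
n = 9         # sample size (from the example)
delta = 0.3   # half-width of the interval around mu, in ounces

# By Theorem 7.1, Ybar ~ N(mu, sigma^2 / n), so its standard error is sigma / sqrt(n).
std_err = sigma / math.sqrt(n)

# P(|Ybar - mu| <= delta) = P(|Z| <= delta / std_err) = 2 * Phi(z) - 1.
z = delta / std_err
prob_within = 2.0 * phi(z) - 1.0

print(f"z = {z:.2f}, P(|Ybar - mu| <= {delta}) = {prob_within:.4f}")
```

Under these assumptions the standardized bound is z = 0.9 and the probability is roughly 0.63, which does not depend on the unknown μ.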

【Theorem 7.2】

Let $Y_1, Y_2, \ldots, Y_n$ be a random sample of size $n$ from $N(\mu, \sigma^2)$. Then $Z_i = (Y_i - \mu)/\sigma$, $i = 1, 2, \ldots, n$, are independent standard normal random variables, and

$$\sum_{i=1}^{n} Z_i^2 = \sum_{i=1}^{n} \left( \frac{Y_i - \mu}{\sigma} \right)^2$$

has a $\chi^2$ distribution with $n$ degrees of freedom.

【pf】The proof of Theorem 7.2 is the same as the proof of Theorem 6.4.

【Theorem 7.3】

Let $Y_1, Y_2, \ldots, Y_n$ be a random sample of size $n$ from $N(\mu, \sigma^2)$. Then

$$\frac{(n-1) S^2}{\sigma^2} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (Y_i - \bar{Y})^2$$

has a $\chi^2$ distribution with $(n-1)$ degrees of freedom.

Also, $\bar{Y}$ and $S^2$ are independent random variables.
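A quick simulation can make the first claim of Theorem 7.3 plausible. The sketch below is only an illustration, not a proof: the parameter values (`mu`, `sigma`, `n`, `reps`) and the seed are hypothetical choices, and it checks just the first two moments, using the fact that a χ² variable with k degrees of freedom has mean k and variance 2k.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

mu, sigma = 10.0, 2.0   # hypothetical population parameters
n = 5                   # sample size
reps = 20000            # number of simulated samples

scaled_vars = []        # simulated values of (n - 1) * S^2 / sigma^2
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(sample) / n
    s2 = sum((y - ybar) ** 2 for y in sample) / (n - 1)   # sample variance S^2
    scaled_vars.append((n - 1) * s2 / sigma ** 2)

sim_mean = sum(scaled_vars) / reps
sim_var = sum((v - sim_mean) ** 2 for v in scaled_vars) / (reps - 1)

# Chi-square with n - 1 d.f. has mean n - 1 and variance 2 * (n - 1).
print(f"simulated mean = {sim_mean:.3f}  (theory: {n - 1})")
print(f"simulated var  = {sim_var:.3f}  (theory: {2 * (n - 1)})")
```

The simulated mean and variance should land close to n − 1 and 2(n − 1); agreement of two moments is consistent with the theorem but does not by itself establish the full distribution.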

【pf】

【Definition 7.2】

Let $Z$ be a standard normal random variable, and let $W$ be a $\chi^2$-distributed variable with $\nu$ degrees of freedom. Then, if $Z$ and $W$ are independent,

$$T = \frac{Z}{\sqrt{W/\nu}}$$

is said to have a t-distribution with $\nu$ degrees of freedom (d.f.). The notation used is $T \sim t_\nu$.

【Remark】

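Properties of the t distribution:

(1) p.d.f.:

$$f_T(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}{\sqrt{\nu\pi}\;\Gamma\!\left(\frac{\nu}{2}\right)} \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \quad -\infty < t < \infty, \ \nu > 0.$$

(2) $E(T) = 0$ (for $\nu > 1$); $\mathrm{Var}(T) = \dfrac{\nu}{\nu - 2}$, $\nu > 2$.

(3) Coefficient of skewness $\beta_1 = 0$ (the t distribution is symmetric about 0). Coefficient of kurtosis $\beta_2 = \dfrac{3(\nu-2)}{\nu-4}$, $\nu > 4$ (leptokurtic, i.e. more peaked and heavier-tailed than the normal).

(4) $t_\nu \xrightarrow{\ d\ } N(0,1)$ as $\nu \to \infty$.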
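Property (4) can be explored numerically by evaluating the t density for increasing degrees of freedom and comparing it with the standard normal density. This is a minimal sketch: the function names `t_pdf` and `normal_pdf` and the evaluation point t = 2 are illustrative choices, and the density is computed with log-gamma so that large ν does not overflow.

```python
import math


def t_pdf(t: float, nu: float) -> float:
    """Density of the t distribution with nu degrees of freedom.

    Works on the log scale (math.lgamma) so that large nu does not overflow.
    """
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(t * t / nu))


def normal_pdf(z: float) -> float:
    """Density of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


# The tail density at t = 2 shrinks toward the normal value as nu grows.
for nu in (1, 5, 30, 1000):
    print(f"nu = {nu:5d}: f_T(2) = {t_pdf(2.0, nu):.5f}")
print(f"N(0,1)    : phi(2) = {normal_pdf(2.0):.5f}")
```

For small ν the t density puts noticeably more mass in the tails than the normal density, which matches the kurtosis statement in property (3).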

【Definition 7.3】

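Let $W_1$ and $W_2$ be independent $\chi^2$-distributed random variables with $\nu_1$ and $\nu_2$ degrees of freedom, respectively. Then

$$F = \frac{W_1/\nu_1}{W_2/\nu_2}$$

is said to have an F distribution with $\nu_1$ numerator degrees of freedom and $\nu_2$ denominator degrees of freedom.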

【Remark】

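Properties of the F distribution:

(1) p.d.f.:

$$f_F(x) = \frac{\Gamma\!\left(\frac{\nu_1+\nu_2}{2}\right)}{\Gamma\!\left(\frac{\nu_1}{2}\right)\Gamma\!\left(\frac{\nu_2}{2}\right)} \left(\frac{\nu_1}{\nu_2}\right)^{\nu_1/2} x^{\frac{\nu_1}{2}-1} \left(1 + \frac{\nu_1}{\nu_2}\,x\right)^{-\frac{\nu_1+\nu_2}{2}}, \quad x > 0,$$

where $\nu_1$ and $\nu_2$ are positive.

(2) $E(F) = \dfrac{\nu_2}{\nu_2 - 2}$, $\nu_2 \ge 3$;

$\mathrm{Var}(F) = \dfrac{2\nu_2^2(\nu_1 + \nu_2 - 2)}{\nu_1(\nu_2 - 2)^2(\nu_2 - 4)}$, $\nu_2 \ge 5$.

(3) If $X \sim F_{\nu_1,\nu_2}$, then $Y = \dfrac{1}{X} \sim F_{\nu_2,\nu_1}$.

(4) If $F \sim F_{\nu_1,\nu_2}$, then:

a. $F_{1,\nu} = t_\nu^2 \ \Rightarrow\ F_{1,\nu,\alpha} = t^2_{\nu,\alpha/2}$

b. $F_{1,\infty} = Z^2 \ \Rightarrow\ F_{1,\infty,\alpha} = Z^2_{\alpha/2}$

c. $F_{\nu,\infty} = \dfrac{\chi^2_\nu}{\nu} \ \Rightarrow\ F_{\nu,\infty,\alpha} = \dfrac{\chi^2_{\nu,\alpha}}{\nu}$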
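The mean formula in property (2) and the reciprocal relation in property (3) can be checked by simulation. The sketch below is illustrative only: the degrees of freedom `nu1 = 8`, `nu2 = 12`, the seed, and the helper `chi_square` are hypothetical choices, and each χ² variate is drawn as a gamma variate with shape k/2 and scale 2.

```python
import random

random.seed(1)  # fixed seed for reproducibility

nu1, nu2 = 8, 12   # hypothetical numerator and denominator degrees of freedom
reps = 50000


def chi_square(k: int) -> float:
    """Draw a chi-square variate with k d.f. as Gamma(shape = k / 2, scale = 2)."""
    return random.gammavariate(k / 2, 2.0)


# F = (W1 / nu1) / (W2 / nu2) with independent chi-square W1 and W2.
f_draws = [(chi_square(nu1) / nu1) / (chi_square(nu2) / nu2) for _ in range(reps)]

# By property (3), 1 / F should follow an F distribution with (nu2, nu1) d.f.
recip_draws = [1.0 / f for f in f_draws]

mean_f = sum(f_draws) / reps
mean_recip = sum(recip_draws) / reps

print(f"mean of F     = {mean_f:.3f}  (theory: {nu2 / (nu2 - 2):.3f})")
print(f"mean of 1 / F = {mean_recip:.3f}  (theory: {nu1 / (nu1 - 2):.3f})")
```

Only the means are compared here; the F distribution has heavy tails, so higher sample moments converge slowly and are a poor basis for a quick check.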

7.3 The Central Limit Theorem And the Method of Proof C.L.T.

【Theorem 7.4】Central Limit Theorem (C.L.T)

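Let $Y_1, Y_2, \ldots, Y_n$ be i.i.d. r.v.'s with finite expectation $\mu$ and finite and positive variance $\sigma^2$, and let $\bar{Y}$ be the sample mean of $Y_1, Y_2, \ldots, Y_n$. Then

$$U_n = \frac{\bar{Y} - E(\bar{Y})}{\sqrt{\mathrm{Var}(\bar{Y})}} = \frac{\bar{Y} - \mu}{\sqrt{\sigma^2/n}} = \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}} = \sqrt{n}\left(\frac{\bar{Y} - \mu}{\sigma}\right) \xrightarrow[n \to \infty]{d} Z \sim N(0,1),$$

or

$$P\!\left[\frac{\sqrt{n}\,(\bar{Y} - \mu)}{\sigma} \le z\right] \xrightarrow[n \to \infty]{} \Phi(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-y^2/2}\, dy, \quad z \in \mathbb{R}.$$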
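The second form of the theorem says that the distribution function of the standardized sample mean approaches Φ(z). A small simulation illustrates this for a clearly non-normal population. The sketch assumes an Exponential(1) population (so μ = σ = 1), which is a choice made here for illustration; `n`, `reps`, the seed, and the helper names are likewise hypothetical.

```python
import math
import random

random.seed(2)  # fixed seed for reproducibility


def phi(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


n = 50              # sample size
reps = 20000        # number of simulated samples
mu = sigma = 1.0    # mean and standard deviation of Exponential(rate = 1)

# Simulate U_n = sqrt(n) * (Ybar - mu) / sigma many times.
u_values = []
for _ in range(reps):
    ybar = sum(random.expovariate(1.0) for _ in range(n)) / n
    u_values.append(math.sqrt(n) * (ybar - mu) / sigma)


def empirical_cdf(z: float) -> float:
    """Fraction of simulated U_n values that are <= z."""
    return sum(u <= z for u in u_values) / reps


for z in (-1.0, 0.0, 1.0):
    print(f"z = {z:+.1f}: P(U_n <= z) ~ {empirical_cdf(z):.4f}, Phi(z) = {phi(z):.4f}")
```

Because the exponential population is skewed, a small discrepancy near z = 0 is expected at moderate n and shrinks as n increases.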

【Theorem 7.5】

Let $Y_n$ and $Y$ be random variables with moment-generating functions $m_{Y_n}(t)$ and $m_Y(t)$, respectively. If

$$\lim_{n \to \infty} m_{Y_n}(t) = m_Y(t)$$

for all real $t$, then the distribution function of $Y_n$ converges to the distribution function of $Y$ as $n \to \infty$.

7.4 A Proof of the Central Limit Theorem

Review C.L.T

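Suppose $Y_1, Y_2, \ldots, Y_n$ are i.i.d. with $E(Y_i) = \mu$ and $\mathrm{Var}(Y_i) = \sigma^2$, $i = 1, 2, \ldots, n$.

Let $S_n = \sum_{i=1}^{n} Y_i$ and $\bar{Y}_n = S_n / n$. Then

$$\frac{\sqrt{n}\,(\bar{Y}_n - \mu)}{\sigma} = \frac{\sum_{i=1}^{n} Y_i - n\mu}{\sqrt{n}\,\sigma} = \frac{S_n - E S_n}{\sqrt{\mathrm{Var}\, S_n}} \xrightarrow{\ d\ } N(0,1).$$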

【pf】

7.5 The Normal Approximation to the Binomial Distribution

The central limit theorem also can be used to approximate probabilities for some discrete random variables when the exact probabilities are tedious to calculate. One useful example involves the binomial distribution for large values of the number of trials, n.

Suppose that $Y$, the number of successes in $n$ trials, is regarded as a sum of a sample consisting of 0's and 1's; that is,

$$Y = \sum_{i=1}^{n} X_i, \quad \text{where} \quad X_i = \begin{cases} 1 & \text{if the } i\text{th trial results in success} \\ 0 & \text{otherwise.} \end{cases}$$

The random variables $X_i$ for $i = 1, 2, \ldots, n$ are independent (because the trials are independent), and it is easy to show that $E(X_i) = p$ and $\mathrm{Var}(X_i) = p(1-p)$ for $i = 1, 2, \ldots, n$. Consequently, when $n$ is large, the sample fraction of successes,

$$\frac{Y}{n} = \frac{1}{n} \sum_{i=1}^{n} X_i = \bar{X},$$

possesses an approximately normal sampling distribution with mean $E(X_i) = p$ and variance $\mathrm{Var}(X_i)/n = p(1-p)/n$.

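Continuity Correction

In the formulas below, $ES_n$ and $\sigma_{S_n}$ denote the mean and standard deviation of $S_n$, and $a$ and $b$ are integers.

$$P(a \le S_n \le b) \approx P(a - 0.5 \le S_n \le b + 0.5) \cong \Phi\!\left(\frac{b + 0.5 - ES_n}{\sigma_{S_n}}\right) - \Phi\!\left(\frac{a - 0.5 - ES_n}{\sigma_{S_n}}\right)$$

$$P(a < S_n < b) \approx P(a + 0.5 \le S_n \le b - 0.5) \cong \Phi\!\left(\frac{b - 0.5 - ES_n}{\sigma_{S_n}}\right) - \Phi\!\left(\frac{a + 0.5 - ES_n}{\sigma_{S_n}}\right)$$

$$P(a \le S_n < b) \approx P(a - 0.5 \le S_n \le b - 0.5) \cong \Phi\!\left(\frac{b - 0.5 - ES_n}{\sigma_{S_n}}\right) - \Phi\!\left(\frac{a - 0.5 - ES_n}{\sigma_{S_n}}\right)$$

$$P(a < S_n) \approx P(a + 0.5 \le S_n), \qquad P(a \le S_n) \approx P(a - 0.5 \le S_n)$$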

【Example 3】

Candidate A believes that she can win a city election if she can earn at least 55% of the votes in precinct I. She also believes that about 50% of the city's voters favor her. If n = 100 voters show up to vote at precinct I, what is the probability that candidate A will receive at least 55% of their votes?

【sol】
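One way to work this example numerically is sketched below. It assumes the 100 voters behave like a random sample with success probability p = 0.5, so that the vote count Y is binomial; it then compares the plain normal approximation, the continuity-corrected approximation, and the exact binomial tail. The helper `phi` and all variable names are introduced here for illustration.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


n, p = 100, 0.5    # number of voters and assumed support, from the example
threshold = 55     # "at least 55%" of n = 100 votes means Y >= 55

mean = n * p                        # E(Y) = 50
sd = math.sqrt(n * p * (1 - p))     # sd(Y) = 5

# Plain normal approximation: P(Y/n >= 0.55) ~ P(Z >= (55 - 50) / 5).
approx_plain = 1.0 - phi((threshold - mean) / sd)

# With continuity correction: P(Y >= 55) ~ P(W >= 54.5) for normal W.
approx_cc = 1.0 - phi((threshold - 0.5 - mean) / sd)

# Exact binomial tail probability.
exact = sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(threshold, n + 1))

print(f"normal approximation      : {approx_plain:.4f}")
print(f"with continuity correction: {approx_cc:.4f}")
print(f"exact binomial            : {exact:.4f}")
```

The plain approximation corresponds to P(Z ≥ 1), about 0.159; the continuity-corrected value is noticeably closer to the exact binomial probability.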

【Example 4】

Suppose that $Y$ has a binomial distribution with n = 25 and p = 0.4. Please find the exact probabilities that $Y \le 8$ and $Y = 8$, and compare these to the corresponding values found by using the normal approximation.

【sol】
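The comparison requested in this example can be sketched as follows. The code computes the exact binomial probabilities directly from the probability mass function and the normal approximations with the continuity correction (Y ≤ 8 becomes W ≤ 8.5, and Y = 8 becomes 7.5 ≤ W ≤ 8.5). The helper functions `phi` and `binom_pmf` are illustrative names, not part of the original notes.

```python
import math


def phi(z: float) -> float:
    """Standard normal CDF, written in terms of the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binom_pmf(k: int, n: int, p: float) -> float:
    """P(Y = k) for Y ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)


n, p = 25, 0.4                      # from the example
mean = n * p                        # E(Y) = 10
sd = math.sqrt(n * p * (1 - p))     # sd(Y) = sqrt(6)

# Exact binomial probabilities.
exact_le_8 = sum(binom_pmf(k, n, p) for k in range(0, 9))
exact_eq_8 = binom_pmf(8, n, p)

# Normal approximations with the continuity correction.
approx_le_8 = phi((8.5 - mean) / sd)
approx_eq_8 = phi((8.5 - mean) / sd) - phi((7.5 - mean) / sd)

print(f"P(Y <= 8): exact = {exact_le_8:.4f}, normal approx = {approx_le_8:.4f}")
print(f"P(Y  = 8): exact = {exact_eq_8:.4f}, normal approx = {approx_eq_8:.4f}")
```

Even with n as small as 25, the corrected approximations agree with the exact values to within a few thousandths.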