Statistical Inference: Tests of Hypotheses. Statistics (I), Academic Year 101. Instructor: 蔡碧紋


(2)

Scientific questions

1. Whether the true average lifetime of a certain brand of tire is at least 22,000 kilometers.

2. Whether fertilizer A produces a higher yield of soybeans than fertilizer B.

3. A pharmaceutical company must decide, on the basis of samples, whether at least 90% of all patients given a new medication will recover from a certain disease.

4. A manufacturer claims that each cup of citi coffee weighs at least 300 g.

(3)

Statistical Hypotheses testing

1. If the lifetime of the tire has pdf $f(x) = \lambda e^{-\lambda x}$, $x > 0$, decide whether the expected lifetime, $1/\lambda$, is at least 22,000.

2. Decide whether $\mu_A > \mu_B$, where $\mu_A$ and $\mu_B$ are the means of the two populations.

3. Decide whether $p$, the parameter of a binomial distribution, is greater than or equal to 0.9.

4. Decide whether $\mu \ge 300$.

In each case, it is assumed that the stated distribution correctly describes the experimental conditions, and the hypothesis concerns the parameter(s) of the distribution.

(4)

Hypotheses Testing

A hypothesis is our claim (statement) about a population parameter.

The scientist formulates a statement concerning the value of a parameter.

A test of a statistical hypothesis is a procedure for deciding whether to "accept" or "reject" the hypothesis.

The purpose of hypothesis testing is to subject these claims to statistical scrutiny, using statistical tests to infer whether a hypothesis is "true or false".

(5)

Terms you should know about hypotheses testing

1. Null and alternative hypotheses ($H_0$ vs $H_1$)

2. Test statistic $T(X)$, its observed value $T(x)$, and the distribution of $T(X)$

3. Significance level of the test, $\alpha$

4. Rejection region (critical region) (RR) and acceptance region

5. Type I and Type II error probabilities

6. p-value

7. Power

(6)

Null and Alternative Hypotheses

Assume that the form of the population distribution is known: $X \sim F(x; \theta)$, where $\theta \in \Omega$. The set $\Omega$ of all values that $\theta$ can take is called the parameter space.

A statistical hypothesis is a statement about the value of the parameter(s) of the distribution, such as

"$\theta \in \omega$",

where $\omega$ is a subset of $\Omega$.

This statistical hypothesis is denoted by $H$ (or $H_0$) and is called the null hypothesis.

(7)

Null and Alternative Hypotheses

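On the other hand, the statement "$\theta \in \bar{\omega}$" (where $\bar{\omega}$ is the complement of $\omega$ with respect to $\Omega$) is called the alternative to $H_0$ and is denoted by $H_a$ (or $H_1$).

We write

$H_0: \theta \in \omega$ and $H_a: \theta \in \bar{\omega}$ (or $\theta \notin \omega$)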

(8)

$H_0$, $H_1$

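1. In some cases, we want to know whether the mean of something is what people state it to be (or whether it matches the status quo). We put that statement in the null hypothesis $H_0$.

The Directorate-General of Budget, Accounting and Statistics (主計處) reports from its survey that the average monthly national income is 20,000 NT dollars.

$H_0$: ____ vs $H_1$: ____

The average January snowfall on Hehuanshan (合歡山) is 20 cm.

$H_0$: ____ vs $H_1$: ____

2. Hypotheses often arise because we want to know whether a new product, technique, teaching method, etc., is better than the existing one. In this context, $H_0$ is a statement that nullifies the theory, which is why it is called a null hypothesis. In this case, we set the hypothesis we want to establish as $H_1$, and $H_0$ is its opposite.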

(9)

$H_0$, $H_1$

1. If the lifetime of the tire has pdf $f(x) = \lambda e^{-\lambda x}$, $x > 0$, decide whether the expected lifetime, $1/\lambda$, is at least 22,000.

$H_0$: ____ vs $H_1$: ____

2. Decide whether $\mu_A > \mu_B$, where $\mu_A$ and $\mu_B$ are the means of the two populations.

$H_0$: ____ vs $H_1$: ____

3. Decide whether $p$, the parameter of a binomial distribution, is greater than or equal to 0.9.

$H_0$: ____ vs $H_1$: ____

4. A manufacturer claims that each cup of citi coffee weighs at least 300 g.

$H_0$: ____ vs $H_1$: ____

(10)

Hypotheses Testing

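Hypothesis testing means that, before collecting sample data and drawing inferences, we first state a reasonable hypothesis about some characteristic of the population. We then use a randomly drawn sample and its sampling distribution, together with probability theory, to judge whether the hypothesis is true.

When making decisions by statistical methods, we state two hypotheses:

$H_0$: null hypothesis.

$H_1$: alternative or research hypothesis.

Possible conclusions:

1. There is enough statistical evidence to infer that $H_1$ is true (reject $H_0$ and accept $H_1$).

2. There is not enough statistical evidence to infer that $H_1$ is true (do not reject $H_0$, also phrased as "fail to reject" or "retain" $H_0$). The data do not provide enough evidence to support the alternative hypothesis.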

(11)

Hypotheses Testing

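The main idea of hypothesis testing is to look for evidence to reject $H_0$ and accept $H_1$. We cannot prove that $H_0$ is absolutely correct; we can only fail to reject it.

The role of evidence: assuming that $H_0$ is true, we look for a contradiction and then draw an inference.

Assuming that $H_0$ is true, we ask how likely it is to collect the observed data. If the data would be an exceptionally rare event (a significant anomaly), we judge that $H_0$ is wrong and reject it.

Hypothesis testing is therefore also called a "test of significance" (significance test).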

(12)

Simple and composite hypotheses

$H_0: \mu = 166$ vs $H_1: \mu > 166$, $H_1: \mu < 166$, or $H_1: \mu \neq 166$

If $\omega$ contains only one point, i.e., if $\omega = \{\theta : \theta = \theta_0\}$, then $H_0$ is called a simple hypothesis; it completely specifies the null distribution, and we write $H_0: \theta = \theta_0$. Otherwise, if the hypothesis does not completely specify the distribution, it is called a composite hypothesis.

(13)

Test Statistics

Test statistic: $T(X)$,

a function of a set of i.i.d. random variables $X_1, \dots, X_n$ that follow some distribution $F(x; \theta)$,

such as $\bar{X}$ or $S^2$,

or a pivotal quantity used as the test statistic.

(14)

Significance level of the test

We define an event with small probability.

When $H_0$ is true, the probability of an extreme event such as $\{\bar{X} > c\}$ is very small. If $\{\bar{X} > c\}$ occurs, we reject the null hypothesis.

This small probability is called the significance level of the test, denoted by $\alpha$:

$P(\bar{X} > c \mid H_0: \mu = 160) = \alpha$

For this reason, hypothesis testing is also called a test of significance.

(15)

Rejection Region

If $\bar{X} > c$, we reject $H_0$.

The set $\{\bar{X} > c\}$ is called the rejection region, RR (or critical region), denoted by $R$.

Note that the critical value $c$ is defined before (ex ante) we collect the data.

We make the decision by comparing the observed $\bar{x}$ with $c$.

(16)

Decision rules

We use data (a random sample) to test whether the data provide significant evidence to reject the null hypothesis.

If $\bar{X} > c$, reject $H_0$.

A test of hypotheses is a rule, or decision procedure, based on a sample from a given distribution, that shows whether the data support our hypothesis.

(17)

Decision

Data:

If $\bar{x} > c$, reject $H_0$.

Normally, the conclusion is either

1. Reject $H_0$ and conclude that $H_1$ is true, or

2. Do not reject $H_0$.

Remark:

Note that we do not say we "accept" $H_0$, because that implies a stronger action than is really warranted. Failing to find enough evidence to reject $H_0$ does not mean that $H_0$ is absolutely true.

(18)

Example: 7.2-1

Let $X$ be the breaking strength of a steel bar.

Either $X \sim N(50, 36)$ or $X \sim N(55, 36)$. We want to know whether a new method increases the mean strength to 55.

Draw a sample and test whether it comes from a normally distributed population with mean 55.

1. Two hypotheses: $H_0: \mu = 50$ vs $H_a: \mu = 55$

2. Test statistic: $\bar{X}$

3. Decision rule: if $\bar{X} > 53$, reject $H_0$. Rejection region: $R = \{\bar{x} > 53\}$

Data: a random sample of size 16 is drawn.

We observe $\bar{x} = 53.75$, so we reject $H_0$.

(19)

Example: 7.2-1


(21)

[Figure: distribution of the sample mean under H0 (y-axis "prob", x-axis from 43 to 57), with the critical value c and the observed value marked.]

(22)

Example: The significance level of the test

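The significance level is the probability of a rare event when $H_0$ is true.

The significance level of the test is

$\alpha = P(\bar{X} > 53 \mid H_0: X_i \sim N(50, 36)) = P\left(\frac{\bar{X} - 50}{6/\sqrt{16}} > \frac{53 - 50}{6/\sqrt{16}}\right) = P(Z > 2) = 0.0228$

The significance level of the test given $R = \{\bar{x} > 53\}$ is $\alpha \approx 0.023$.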
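The significance level for the rejection region $\{\bar{x} > 53\}$ can be checked numerically. The following is a minimal sketch using `statistics.NormalDist` from the Python standard library; the variable names are illustrative and not part of the original example. It also shows the reverse step of choosing the cutoff $c$ for a target level, where 0.05 is an assumed target.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n = 50.0, 6.0, 16      # H0 mean, known sd, sample size (from the example)
se = sigma / sqrt(n)               # sd of the sample mean: 6/4 = 1.5
xbar_h0 = NormalDist(mu0, se)      # sampling distribution of the sample mean under H0

# Forward: significance level implied by the rejection region {xbar > 53}
c = 53.0
alpha = 1.0 - xbar_h0.cdf(c)
print(f"alpha for c = 53: {alpha:.4f}")   # 0.0228

# Reverse: cutoff c that yields a chosen level (0.05 is an illustrative choice)
c_05 = xbar_h0.inv_cdf(1.0 - 0.05)
print(f"c for alpha = 0.05: {c_05:.3f}")
```

A smaller target level pushes the cutoff further from 50, so stronger evidence is needed before rejecting.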

(23)

Type I and Type II error probabilities

Probability of making a wrong decision:

Two types of errors can occur.

                      Our decision
Actual state          Reject H0           Do not reject H0
H0 is true            Type I error        correct decision
H0 is false           correct decision    Type II error

The probabilities associated with the two incorrect decisions are called the type I and type II error probabilities.

(24)

Type I and Type II error probabilities

1. Type I error: reject $H_0$ when it is true.

$\alpha = P(\text{Type I error}) = P(\text{reject a true } H_0) = P(\text{reject } H_0 \mid H_0 \text{ is true})$

2. Type II error: fail to reject $H_0$ when it is false (fail to accept $H_1$ when $H_1$ is true).

$\beta = P(\text{Type II error}) = P(\text{retain a false } H_0) = P(\text{retain } H_0 \mid H_1 \text{ is true})$
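Both error probabilities can be read as long-run frequencies. The sketch below is a small Monte Carlo illustration, not part of the original slides: it reuses the steel-bar setting of Example 7.2-1 ($\sigma = 6$, $n = 16$, reject when $\bar{x} > 53$), while the function name, the number of repetitions, and the seed are arbitrary assumptions.

```python
import random
from statistics import fmean

def rejection_rate(true_mu, sigma=6.0, n=16, c=53.0, reps=20_000, seed=0):
    """Fraction of simulated samples whose mean exceeds c (i.e. H0 is rejected)."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if fmean(sample) > c:
            rejections += 1
    return rejections / reps

type1_rate = rejection_rate(true_mu=50.0)         # H0 true, yet rejected
type2_rate = 1.0 - rejection_rate(true_mu=55.0)   # H1 true, yet H0 retained
print(f"estimated alpha: {type1_rate:.3f}")
print(f"estimated beta:  {type2_rate:.3f}")
```

The estimates fluctuate from run to run with a different seed, but with this many repetitions they should land close to the exact normal-theory probabilities.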

(25)

Test statistics I

Often we work with the distribution of $T(X)$, a pivotal quantity used as the test statistic, which follows a distribution such as the standard normal, $t$, $\chi^2$, or $F$.

Pivotal quantity: a function of the observations (data) and the unknown parameters whose distribution does not depend on the unknown parameters.

(26)

Test statistics II

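1. If $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, where $\mu$ is the unknown parameter and $\sigma^2$ is a known constant, we have

$\bar{X} \sim N(\mu, \sigma^2/n)$, $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$

2. If $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, where $\mu$ and $\sigma^2$ are unknown parameters, we have

$\frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t(n - 1)$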

(27)

Test statistics III

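3. If $X_1, \dots, X_n \sim \text{Bernoulli}(p)$ and $Y = X_1 + \cdots + X_n$, we have

$\frac{(Y/n) - p}{\sqrt{p(1-p)/n}} \to N(0, 1)$

4. If $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, where $\mu$ and $\sigma^2$ are unknown parameters, let $S^2 = \sum (X_i - \bar{X})^2/(n - 1)$. Then

$W = \frac{(n - 1)S^2}{\sigma^2} \sim \chi^2(n - 1)$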
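The four pivotal quantities (known-variance $z$, unknown-variance $t$, Bernoulli proportion, and $\chi^2$ for the variance) can be computed directly from data once the parameter is replaced by its hypothesized value. The following is a minimal sketch with the standard library only; the function names and the sample values are hypothetical and chosen purely for illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance

def z_statistic(sample, mu0, sigma):
    """(xbar - mu0) / (sigma / sqrt(n)); N(0, 1) under H0 when sigma is known."""
    n = len(sample)
    return (mean(sample) - mu0) / (sigma / sqrt(n))

def t_statistic(sample, mu0):
    """(xbar - mu0) / (S / sqrt(n)); t(n - 1) under H0 when sigma is unknown."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

def proportion_z_statistic(successes, n, p0):
    """(Y/n - p0) / sqrt(p0 (1 - p0) / n); approximately N(0, 1) for large n."""
    return (successes / n - p0) / sqrt(p0 * (1.0 - p0) / n)

def chi_square_statistic(sample, sigma0_sq):
    """(n - 1) S^2 / sigma0^2; chi-square(n - 1) under H0: sigma^2 = sigma0^2."""
    n = len(sample)
    return (n - 1) * variance(sample) / sigma0_sq

# Hypothetical measurements, made up for this illustration
data = [52.1, 49.8, 55.3, 51.0, 53.6, 50.4, 54.2, 52.8]
print("z  :", z_statistic(data, mu0=50.0, sigma=6.0))
print("t  :", t_statistic(data, mu0=50.0))
print("z_p:", proportion_z_statistic(successes=95, n=100, p0=0.9))
print("W  :", chi_square_statistic(data, sigma0_sq=36.0))
```

Note that `statistics.stdev` and `statistics.variance` use the $n - 1$ divisor, which matches the definition of $S^2$ in the text.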

(28)

Test statistics IV

Based on the distribution of the test statistic, we define the critical value $t^*$, such as

$z_\alpha$, $t_\alpha$, $\chi^2_\alpha$, or $F_\alpha$,

and construct the decision rule (rejection region) for a given significance level $\alpha$.

(29)

Critical value for α=0.05

Example: $P(Z \ge 1.645) = 0.05$, so $z_{0.05} = 1.645$.

[Figure: standard normal density, dnorm(x), with the critical value 1.645 marked.]
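Upper-tail critical values of the standard normal can be obtained from the inverse cdf instead of a table. This is a small sketch using `statistics.NormalDist`; the helper name `z_critical` and the list of levels are assumptions made for illustration.

```python
from statistics import NormalDist

def z_critical(alpha):
    """Upper-tail critical value z_alpha with P(Z >= z_alpha) = alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)

for alpha in (0.10, 0.05, 0.025, 0.01):
    print(f"alpha = {alpha:<5}  z_alpha = {z_critical(alpha):.3f}")
```

For $\alpha = 0.05$ this reproduces the value 1.645 quoted above.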

(30)

The procedure

1. Specify the null and alternative hypotheses.

2. Specify the significance level of the test, $\alpha$ (this controls the Type I error probability).

3. Define a test statistic $T(X)$ and its distribution under $H_0$.

4. Decision rule: obtain the rejection region $R = \{x : T(x) \in R(\theta_0)\}$.

5. Obtain the data and calculate the value of the test statistic $T(x)$ ($T_{obs}$).

6. Conclusion: if $T(x) \in R(\theta_0)$, reject $H_0$ and conclude that there is strong evidence against the null hypothesis at significance level $\alpha$.
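The six steps can be sketched as a small function for one common case: an upper-tailed $z$ test of $H_0: \mu = \mu_0$ with known $\sigma$. The names `ZTestResult` and `upper_tailed_z_test` are hypothetical, and other tests would swap in a different statistic and reference distribution. The usage line plugs in the numbers of Example 7.2-1.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist

@dataclass
class ZTestResult:
    z_obs: float       # observed value of the test statistic
    z_crit: float      # critical value z_alpha
    reject_h0: bool    # conclusion at the chosen significance level

def upper_tailed_z_test(xbar, mu0, sigma, n, alpha):
    """Steps 3 to 6 for H0: mu = mu0 against an upper-tailed alternative, sigma known."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha)         # step 4: rejection region {z > z_crit}
    z_obs = (xbar - mu0) / (sigma / sqrt(n))           # step 5: observed test statistic
    return ZTestResult(z_obs, z_crit, z_obs > z_crit)  # step 6: conclusion

# Steps 1 and 2: H0: mu = 50 vs Ha: mu = 55, alpha = 0.05
result = upper_tailed_z_test(xbar=53.75, mu0=50.0, sigma=6.0, n=16, alpha=0.05)
print(result)
```

Fixing $\alpha$ and the rejection region before looking at the data is what keeps the Type I error probability at the stated level.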

(31)

Example ($n = 16$, $\bar{x} = 53.75$)

1. Hypotheses: $H_0: \mu = 50$ vs $H_a: \mu = 55$

2. Significance level of the test: $\alpha = 0.05$

3. Test statistic:

$Z = \frac{\bar{X} - 50}{6/\sqrt{16}} \sim N(0, 1)$ under $H_0$

4. Decision rule: if $z_{obs} > z_{0.05} = 1.645$ (critical value), reject $H_0$.

5. Data: with $\bar{x} = 53.75$ we have

$z_{obs} = \frac{53.75 - 50}{6/4} = 2.5$

6. Because $z_{obs} = 2.5 > 1.645$, reject $H_0: \mu = 50$ at $\alpha = 0.05$.

(32)

Probability value (p-value) of the test

The p-value is the probability, when $H_0$ is true, that the test statistic equals or exceeds the actually observed value in the direction of the alternative hypothesis.

It is the tail-end probability under $H_0$ toward $H_1$. With $\bar{X} \sim N(50, 36/16)$ under $H_0$,

$P(\bar{X} \ge 53.75 \mid H_0) = P\left(Z \ge \frac{3.75}{6/4}\right) = 1 - \Phi(2.5) = 0.006$

The p-value of the test is 0.006.

A small p-value provides evidence to reject the null hypothesis $H_0$ given the data.
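The tail probability for the steel-bar data can be reproduced in a few lines. This is a minimal sketch with `statistics.NormalDist`; the function name `upper_tail_p_value` is an assumption, and it covers only the upper-tailed, known-$\sigma$ case.

```python
from math import sqrt
from statistics import NormalDist

def upper_tail_p_value(xbar, mu0, sigma, n):
    """P(Xbar >= xbar | H0), computed as 1 - Phi(z_obs) for a known sigma."""
    z_obs = (xbar - mu0) / (sigma / sqrt(n))
    return 1.0 - NormalDist().cdf(z_obs)

p_value = upper_tail_p_value(xbar=53.75, mu0=50.0, sigma=6.0, n=16)
print(f"p-value = {p_value:.4f}")   # 0.0062
```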

(33)

Decision rule

If the p-value of a test is as small as or smaller than the significance level of the test, $\alpha$, we say that the data are statistically significant at significance level $\alpha$.

Decision rule by p-value: if p-value $\le \alpha$, reject $H_0$ at significance level $\alpha$.

(We don't need to find a different critical value $t^*$ for each significance level $\alpha$.)

Recall the decision rule by critical value: if $T(x) \in R(\theta_0)$, reject $H_0$ at significance level $\alpha$.
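The two decision rules always agree, and the p-value has the practical advantage of being computed once and compared against any $\alpha$. The sketch below illustrates this for the observed statistic $z_{obs} = 2.5$ from the steel-bar example; the set of significance levels is an arbitrary choice for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()
z_obs = 2.5                              # observed statistic from the steel-bar example
p_value = 1.0 - std_normal.cdf(z_obs)    # computed once, reused for every alpha

decisions = {}
for alpha in (0.10, 0.05, 0.01, 0.001):
    by_p_value = p_value <= alpha
    # The critical-value rule needs a new cutoff for each alpha
    by_critical_value = z_obs >= std_normal.inv_cdf(1.0 - alpha)
    decisions[alpha] = (by_p_value, by_critical_value)
    print(f"alpha = {alpha}: reject by p-value = {by_p_value}, "
          f"reject by critical value = {by_critical_value}")
```

Here $H_0$ is rejected at the 0.10, 0.05, and 0.01 levels but not at 0.001, whichever rule is used.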


(35)

[Figure: standard normal density, dnorm(x), for x from −4 to 4, with the critical value 1.645 and the observed z.obs marked.]

(36)

Power

Power = $P(\text{accept } H_1 \mid H_1 \text{ is true}) = P(\text{reject } H_0 \mid H_1 \text{ is true})$

Power = $1 - \beta$; that is, $1 - \beta$ is defined as the power of the test.

(37)

α, β and Power

Example (Ex 7.2-1)

Let $X$ be the breaking strength of a steel bar.

$H_0: \mu = 50$ vs $H_a: \mu = 55$

Given the critical region $C = \{(x_1, \dots, x_n) : \bar{x} \ge 53\}$, or $C = \{\bar{x} : \bar{x} \ge 53\}$.

Data: $n = 16$. What are the type I and type II error probabilities?

$\bar{X} \sim N(50, 36/16)$ under $H_0: \mu = 50$

$\bar{X} \sim N(55, 36/16)$ under $H_1: \mu = 55$

(38)

[Figure: two density plots on a common x-axis from 45 to 59.]

(39)

α, β and Power

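Type I error rate: $\alpha = P(\bar{X} \ge 53 \mid \mu = 50) = P(Z \ge 2) = 0.0228$

Type II error rate: $\beta = P(\bar{X} < 53 \mid \mu = 55) = P(Z < -4/3) = 0.0912$

The significance level of the test is $\alpha = 0.0228$.

The power of the test is $1 - \beta = 0.9088$.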
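These three numbers follow from the two sampling distributions of $\bar{X}$ and the cutoff 53. The sketch below recomputes them with `statistics.NormalDist`; the variable names are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, c = 6.0, 16, 53.0
se = sigma / sqrt(n)              # sd of the sample mean: 1.5

xbar_h0 = NormalDist(50.0, se)    # distribution of the sample mean under H0
xbar_h1 = NormalDist(55.0, se)    # distribution of the sample mean under H1

alpha = 1.0 - xbar_h0.cdf(c)      # P(reject H0 | H0 true)
beta = xbar_h1.cdf(c)             # P(retain H0 | H1 true)
power = 1.0 - beta                # P(reject H0 | H1 true)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, power = {power:.4f}")
# alpha = 0.0228, beta = 0.0912, power = 0.9088
```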

(40)

The relationship between α and β

[Figure: density plot on an x-axis from 45 to 59 illustrating the relationship between α and β.]
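For a fixed sample size, moving the cutoff trades one error probability against the other. The following sketch tabulates $\alpha$ and $\beta$ for several cutoffs in the steel-bar setting; only $c = 53$ comes from the example, and the other cutoffs and the helper name `error_probabilities` are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

se = 6.0 / sqrt(16)               # sd of the sample mean
xbar_h0 = NormalDist(50.0, se)    # sampling distribution under H0: mu = 50
xbar_h1 = NormalDist(55.0, se)    # sampling distribution under H1: mu = 55

def error_probabilities(c):
    """Return (alpha, beta) for the rejection region {xbar >= c}."""
    return 1.0 - xbar_h0.cdf(c), xbar_h1.cdf(c)

cutoffs = [51.0, 52.0, 53.0, 54.0]   # illustrative cutoffs; only 53 appears in the example
table = [(c, *error_probabilities(c)) for c in cutoffs]
for c, alpha, beta in table:
    print(f"c = {c:4.1f}  alpha = {alpha:.4f}  beta = {beta:.4f}")
```

Raising the cutoff lowers $\alpha$ but raises $\beta$; with $n$ fixed, both cannot be reduced at once.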
