
The Best Significance Tests

In document: 最高密度顯著性檢定 (pages 41-47)

It is known that Fisherian significance tests are not universally accepted, since tests of this type do not automatically satisfy any desirable optimality property. Given that all HDS tests are best significance tests in some sense, a natural question is whether there is a guide indicating that some Fisherian significance tests are also best significance tests. The following theorem provides such a guide for a special case.

Theorem 3.9. Let X = (X1, ..., Xn) be a random sample from f(x, θ) with observation X = x. Consider the hypothesis H0 : θ = θ0 and assume that the family of densities {f(x, θ) : θ ∈ Θ} has a monotone likelihood in a sufficient statistic T = t(X):

(a) If the monotone likelihood is nondecreasing in t(x), then the left-hand Fisherian significance test with p-value

px = Pθ0(t(X) ≤ t(x)) is a best significance test.

(b) If the monotone likelihood is nonincreasing in t(x), then the right-hand Fisherian significance test with p-value

px = Pθ0(t(X) ≥ t(x)) is a best significance test.
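The two one-sided p-values of Theorem 3.9 can be approximated by Monte Carlo for any model we can sample from under H0. The sketch below is illustrative (the function names and interface are not from the text); it only assumes we can evaluate the sufficient statistic t and draw samples under θ0.

```python
import random

def one_sided_pvalue(t, sample_h0, x, side="left", m=100_000, rng=None):
    """Monte Carlo estimate of the one-sided Fisherian p-values of
    Theorem 3.9: P_theta0(t(X) <= t(x)) for side="left" and
    P_theta0(t(X) >= t(x)) for side="right".
    t: the sufficient statistic; sample_h0: draws one sample under H0."""
    rng = rng or random.Random(0)
    tx = t(x)
    if side == "left":
        hits = sum(t(sample_h0(rng)) <= tx for _ in range(m))
    else:
        hits = sum(t(sample_h0(rng)) >= tx for _ in range(m))
    return hits / m
```

For instance, with t = sum and a sampler drawing n exponential variates under θ0, this reproduces the p-value of Example 3.5(a) up to Monte Carlo error.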

Example 3.5. (a) Let X = (X1, ..., Xn) be a random sample from the exponential distribution with pdf f(x, θ) = θe^{−θx} I(0 < x < ∞). This exponential family has a monotone likelihood nonincreasing in the sufficient statistic Σ_{i=1}^n Xi. Then the Fisherian significance test with p-value

px = Pθ0(Σ_{i=1}^n Xi ≥ Σ_{i=1}^n xi) = P(Γ(n, 1/θ0) ≥ Σ_{i=1}^n xi)

is a best significance test.

(b) Let X = (X1, ..., Xn) be a random sample from the uniform distribution U(0, θ), θ > 0. Considering the null hypothesis H0 : θ = θ0, the likelihood function is

L(θ, x) = (1/θ^n) I(0 < x(n) < θ),

where X(n) is the largest order statistic, which is sufficient for the parameter θ. Since the space of x(n) is (0, ∞), the family of uniform distributions under H0 has a monotone likelihood nonincreasing in the sufficient statistic X(n). The right-hand Fisherian test with p-value

px = Pθ0(X(n) ≥ x(n)) = 1 − (x(n)/θ0)^n, for x(n) < θ0,

is a best significance test.
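Both p-values in Example 3.5 are available in closed form. The following sketch (with illustrative function names) computes them using only the standard library, exploiting the fact that the survival function of a Gamma with integer shape n and rate θ0 is a Poisson partial sum.

```python
import math

def pvalue_exponential(x, theta0):
    """Example 3.5(a): p_x = P(Gamma(n, rate=theta0) >= sum(x)).
    For integer shape n, this equals P(Poisson(theta0*sum(x)) <= n-1),
    i.e. the chance of fewer than n Poisson events by time sum(x)."""
    n, lam = len(x), theta0 * sum(x)
    return sum(math.exp(-lam) * lam**k / math.factorial(k) for k in range(n))

def pvalue_uniform(x, theta0):
    """Example 3.5(b): p_x = P(X_(n) >= x_(n)) = 1 - (x_(n)/theta0)^n
    when x_(n) < theta0, and 0 otherwise (x_(n) > theta0 is impossible
    under H0)."""
    n, xmax = len(x), max(x)
    return 1.0 - (xmax / theta0) ** n if xmax < theta0 else 0.0
```

For a single observation x1 = 1 with θ0 = 1, the exponential p-value is e^{−1}, matching P(Γ(1, 1) ≥ 1).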

From the theory we have developed for best significance tests and Fisherian significance tests, we may draw several conclusions from the fact that the HDS test applies the likelihood function:

(a) If the likelihood function L(x, θ) involves a univariate parameter θ and has a monotone likelihood in a sufficient statistic, then a one-sided Fisherian significance test based on the sufficient statistic is a best significance test. This provides a guide in selecting the test statistic and in choosing a one-sided or two-sided test so that the test shares an optimal property.

(b) If the likelihood function L(x, θ) involves a univariate parameter θ and statistics including a sufficient one together with some ancillary statistics, then the Fisherian significance test based on the sufficient statistic alone does not share the optimal property. For example, the best significance test for the hypothesis H0 : θ = θ0 in the negative exponential distribution of Example 3.2 involves Σ_{i=1}^n (Xi − θ0), which is an ancillary statistic.

(c) If the likelihood function L(x, θ1, ..., θk) involves k parameters and we test the hypothesis H0 : θ1 = θ10, ..., θk = θk0, the Fisherian significance test is classically constructed by the Bonferroni technique, combining tests for the hypotheses H0 : θj = θj0 that use separate sufficient statistics. This construction does not share the optimal property. An example illustrates this point: the HDS test for the hypothesis H0 : µ = µ0, σ = σ0, where the sample is drawn from a normal distribution, is based on the statistic Σ_{i=1}^n (Xi − µ0)². This test statistic is the sum of Σ_{i=1}^n (Xi − X̄)² and n(X̄ − µ0)², where the first is a function of the sample variance S² and the second is a function of the sample mean X̄; here S² and X̄ are separately sufficient for σ² and µ.
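The decomposition in (c) is the usual sum-of-squares identity; a quick numerical check on made-up illustrative data confirms Σ(xi − µ0)² = Σ(xi − x̄)² + n(x̄ − µ0)²:

```python
xs = [1.2, 0.7, -0.3, 2.1, 0.5]   # arbitrary illustrative data
mu0 = 0.0
n = len(xs)
xbar = sum(xs) / n
total = sum((x - mu0) ** 2 for x in xs)       # the HDS test statistic
within = sum((x - xbar) ** 2 for x in xs)     # (n - 1) * S^2, function of S^2
between = n * (xbar - mu0) ** 2               # function of the sample mean
assert abs(total - (within + between)) < 1e-12
```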

(d) If the likelihood function L(x, θ1, ..., θk) involves k parameters but we preassume that

θ2 = θ20, ..., θk = θk0 are known, and the interest is to test H0 : θ1 = θ10, the traditional Fisherian significance test constructs the test statistic from the sufficient statistic for the parameter θ1. This test definitely does not share the optimal property. In this situation, the likelihood function is of the form L(x, θ10, ..., θk0), which generally involves extra sufficient statistics for θ2, ..., θk. However, these extra sufficient statistics are ancillary since their corresponding parameters are preassumed known, and hence they are not involved in the Fisherian significance test.

(e) The HDS tests always employ all the information in the statistics involved in the likelihood function. This is the reason that they are always optimal in the sense of smallest volume.

In the rest of this section, we evaluate the performance of the HDS and Fisherian significance tests in several hypothesis testing problems. For a significance test with only a null hypothesis, the data may depart not only from the null hypothesis but also from any pre-assumption set on the statistical model, including independence, identical distribution, or some pre-assumed parameter values. We expect that a significance test should provide a p-value giving strong evidence for any of these departures.

The model from which the data are drawn for simulation is called the true model. The model assumed by the statistician before the execution of the hypothesis test is called the pre-assumed statistical model. This pre-assumed model may or may not coincide with the true statistical model. We design the following situations for the simulation study:

(a) The true model and the pre-assumed model are identical.

(b) The pre-assumed model has parameter values that differ from those of the true model. This inconsistency could mean that the null hypothesis is incorrect or that other parameters in the model are not correctly specified.

(c) The true model is with correlated sample and the pre-assumed model is with random sample assumption.

Example 3.6. We consider two cases for the simulation study: in one, the model assumption and the null hypothesis are both correct; in the other, the model assumption is incorrect. In every simulation, for each model assumption we choose sample sizes n = 5, 10, 20, 40, 100 and m = 10000 replications. Letting pj, j = 1, ..., m, denote the p-values computed over all replications for one test, we compute the average p-value p̄ = (1/m) Σ_{j=1}^m pj and its standard error.

In the first case, we assume we have a random sample drawn from N(µ, 1) and the considered hypothesis is H0 : µ = 0. This is the situation where H0 is correct and all pre-assumed assumptions, namely random sample, normality, and σ = 1, together with the null hypothesis on µ, are correct. Significance tests for this case are expected to have large p-values. The HDS test has p-value, with xi, i = 1, ..., n,

phd = Pµ=0,σ=1(Σ_{i=1}^n Xi² ≥ Σ_{i=1}^n xi²) = P(χ²_n ≥ Σ_{i=1}^n xi²). (7)

The Fisherian significance two-sided test has p-value

px = Pµ=0,σ=1(√n |X̄| ≥ √n |x̄|) = P(|Z| ≥ √n |x̄|). (8)

We display the average p-values and their standard errors for the two significance tests.
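Both p-values for this case can be computed without special libraries. The sketch below assumes the HDS p-value takes the chi-square form P(χ²_n ≥ Σ xi²) used again in Case B, and restricts to even n so that the chi-square survival function reduces to a Poisson partial sum; the function names are illustrative.

```python
import math
from statistics import NormalDist

def hds_pvalue(x):
    """HDS p-value under H0: mu=0, sigma=1: P(chi2_n >= sum(x_i^2)).
    For even n, chi2_n is Gamma(n/2, scale=2), whose survival function
    at s is the Poisson(s/2) cdf at n/2 - 1; odd n would need erf terms."""
    n = len(x)
    assert n % 2 == 0, "this closed form assumes even n"
    lam = sum(xi * xi for xi in x) / 2.0
    return sum(math.exp(-lam) * lam**j / math.factorial(j)
               for j in range(n // 2))

def fisher_pvalue(x):
    """Two-sided Fisherian p-value: P(|Z| >= sqrt(n) * |xbar|)."""
    n = len(x)
    z = math.sqrt(n) * abs(sum(x) / n)
    return 2.0 * (1.0 - NormalDist().cdf(z))
```

Averaging these two functions over m simulated N(0, 1) samples of size n reproduces the scheme behind Tables 2-4.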

Table 2. Average p-value for two significance tests under H0 : µ = 0 when N(0, 1) is true.

n    p̄hds    σ̂p,hds    p̄z    σ̂p,z

We draw two conclusions from the results in Table 2:

(a) The average p-values and their corresponding standard errors for the HDS and Fisherian significance tests are very close in value. Hence, the two tests perform quite similarly when the assumed statistical model is identical to the true statistical model.

(b) With average p-values near 0.5, the two significance tests both provide no real evidence against the assumed statistical model.

Case B: In this simulation, we draw a random sample X = (X1, ..., Xn) from the normal distribution N(0, 4).

In the first case, we assume we have a random sample drawn from N(µ, 1) and the considered hypothesis is H0 : µ = 0. This is the situation where H0 is correct but the pre-assumed σ = 1 is not true, so smaller p-values are desired. The HDS test has p-value exactly as in (7), and the Fisherian significance two-sided test has p-value exactly as in (8). We display the average p-values and their standard errors for the two significance tests.

Table 3. Average p-value for two significance tests under H0 : µ = 0 when N(0, 4) is true.

n     p̄hds        σ̂p,hds      p̄z      σ̂p,z
5     0.0780      0.0282      0.2939  0.0930
10    0.0200      0.0063      0.2992  0.0951
20    0.0015      0.0002      0.2952  0.0932
40    8.4316e−07  2.0733e−07  0.3008  0.0956
100   1.7347e−15  1.1108e−26  0.2967  0.0929

In the second case, we again draw samples from the same normal distribution N(0, 4). We assume that we draw them from N(µ, 1) and the considered hypothesis is H0 : µ = 1. This is a situation where H0 and the pre-assumed value of σ are both incorrect. Then smaller p-values are again desired. The HDS test defines the p-value as

phd = Pµ=1,σ=1(Σ_{i=1}^n (Xi − 1)² ≥ Σ_{i=1}^n (xi − 1)²) = P(χ²_n ≥ Σ_{i=1}^n (xi − 1)²).

The Fisherian significance two-sided test has p-value

px = Pµ=1,σ=1(√n |X̄ − 1| ≥ √n |x̄ − 1|) = P(|Z| ≥ √n |x̄ − 1|).

The following table displays the average p-values and their standard errors for the two significance tests.

Table 4. Average p-value for two significance tests under H0 : µ = 1 when N(0, 4) is true.

n     p̄hds        σ̂p,hds      p̄z          σ̂p,z
5     0.0500      0.0177      0.1719      0.0704
10    0.0086      0.0022      0.1016      0.0459
20    0.0003      4.9857e−05  0.0341      0.0156
40    1.8765e−07  8.706e−11   0.0039      0.0012
100   0           0           9.3025e−07  2.3016e−09

We draw the following conclusions from Tables 3 and 4:

(a) The average p-values of both the HDS and Fisherian significance tests decrease as the sample size increases. This shows that the two significance tests are more efficient at detecting evidence against the model change when the sample size is large.

(b) At each sample size, the average p-value of the HDS test is smaller than that of the Fisherian significance test. This shows that the former is more efficient than the latter at detecting evidence of a model shift.

(c) The standard errors of the p-values for the HDS test are relatively smaller than those of the Fisherian significance test. This shows the stability of using the HDS test.

Case C: In this simulation, we draw a sample X = (X1, ..., Xn) from the following AR(1) model:

Xi = ηi, i = 1, ..., n, with ηi = ρηi−1 + εi,

where the εi's are i.i.d. with normal distribution N(0, 1).

In this case, we assume that we have a random sample from N(µ, 1) and the considered hypothesis is H0 : µ = 0. This is the situation where H0 is correct; however, the error variables are not iid but have an AR(1) structure. Small p-values are therefore expected for the significance tests. The p-values for the HDS and Fisherian significance tests take the same forms as (7) and (8), respectively. Suppose we reject H0 when the p-value is less than 0.05.
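The Case C data can be generated with a short recursion. The sketch below is one way to set up the draws; the initialization η0 = 0 is an assumption, since the text does not specify one.

```python
import random

def simulate_ar1(n, rho, rng):
    """Draw X_1, ..., X_n with X_i = eta_i, eta_i = rho*eta_{i-1} + eps_i,
    eps_i iid N(0, 1), starting from eta_0 = 0 (assumed)."""
    eta, xs = 0.0, []
    for _ in range(n):
        eta = rho * eta + rng.gauss(0.0, 1.0)
        xs.append(eta)
    return xs

# one replication: a sample of size n = 20 at rho = 0.5
sample = simulate_ar1(20, 0.5, random.Random(1))
```

Feeding each replication through the p-value formulas (7) and (8), computed as if the sample were iid N(µ, 1), and counting p-values below 0.05 yields the rejection counts reported in Table 5.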

Table 5. Average p-value for two significance tests under H0 : µ = 0 when AR(1) is true.

ρ     p̄hds        nrej    p̄z      nrej
0.1   0.4819      624     0.2330  1398
0.2   0.4245      1001    0.2155  1866
0.3   0.3345      1873    0.1957  2423
0.4   0.2163      3675    0.1713  3282
0.5   0.1131      6228    0.1477  4061
0.6   0.0432      8337    0.1226  5013
0.7   0.0098      9608    0.0947  6106
0.8   0.0012      9945    0.0647  7345
0.9   6.8908e−05  9996    0.0335  8600
1.0   6.4842e−07  10000   0.0051  9777

We draw several conclusions from Table 5:

(a) In this situation the data are drawn from an AR(1) model, yet we compute the two significance tests under a model of iid random variables. The p-values of both significance tests decrease as ρ increases from zero. This is reasonable, since when ρ is close to zero the true statistical model and the assumed statistical model are similar.

(b) For ρ smaller than 0.4, the Fisherian significance test seems better than the HDS test. For ρ larger than 0.5, our new test seems better.

(c) When we set the significance level at 0.05, the numbers of rejections of both significance tests increase with ρ in a reasonable trend. One interesting fact is that at ρ = 0.4 the HDS test has the larger average p-value yet also the larger number of rejections.
