
HDS Test for Continuous Distributions


We illustrate the differences between the classical Fisherian significance test and the HDS test in two respects through a study of both tests under several continuous distributions. The first is to see how much of the information contained in the model is used to measure the evidence against H0. We present a study of the HDS test through several examples in which the evidence against H0 is drawn from several sources in the statistical model. The second is that the HDS test has the advantage of detecting a distributional shift. For this, we design some special situations that help to explain these advantages. As recommended by R.A. Fisher, the classical Fisherian significance test should involve only the sufficient statistics. We first use the normal distribution to interpret these two aspects.

Example 3.1. Let X1, ..., Xn be a random sample drawn from a normal distribution N(µ, σ²) and consider the null hypothesis H0 : µ = µ0, σ = σ0. An appropriate way to interpret the evidence against this hypothesis is through the HDS p-value

$$p_{hd} = P\left(\chi^2_n \geq \sum_{i=1}^n \frac{(x_{i0}-\mu_0)^2}{\sigma_0^2}\right),$$

where χ²_n is the random variable distributed as χ² with n degrees of freedom.

For this hypothesis, which involves assumptions on both µ and σ, the HDS test gives an exact p-value that provides the evidence against the assumption that N(µ0, σ0²) is true. The extreme set of this HDS test when the observation x0 = (x10, ..., xn0)' is observed is

$$E_{hd} = \left\{(x_1, \ldots, x_n)' : n(\bar{x}-\mu_0)^2 + \sum_{i=1}^n (x_i-\bar{x})^2 \geq n(\bar{x}_0-\mu_0)^2 + \sum_{i=1}^n (x_{i0}-\bar{x}_0)^2\right\},$$

where the determination of an extreme sample point x relies on two variations, $(\bar{x}-\mu_0)^2$ and $\sum_{i=1}^n (x_i-\bar{x})^2$. One measures the departure of the sample mean from the location parameter and the other measures the dispersion of the sample point.
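As a numerical illustration, the following is a minimal sketch (not part of the original text) of how this HDS p-value can be computed, assuming the reconstruction $p_{hd} = P(\chi^2_n \geq \sum_{i=1}^n (x_{i0}-\mu_0)^2/\sigma_0^2)$; the function name and the data are illustrative only.

```python
# Sketch: HDS p-value for H0: mu = mu0, sigma = sigma0 under a normal model.
import numpy as np
from scipy.stats import chi2

def hds_pvalue_normal(x, mu0, sigma0):
    """HDS p-value based on the full null likelihood of N(mu0, sigma0^2)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # The null likelihood decreases in sum((x_i - mu0)^2), so extreme samples
    # are those with a larger value of this statistic than the observed one.
    t_obs = np.sum((x - mu0) ** 2) / sigma0 ** 2
    return chi2.sf(t_obs, df=n)

# Hypothetical usage
x0 = np.array([0.3, -1.2, 0.8, 2.1, -0.5])
print(hds_pvalue_normal(x0, mu0=0.0, sigma0=1.0))
```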

Let us further consider the hypothesis H0 : µ = 0, where σ = 1 is assumed known. The classical Fisherian significance test based on the sufficient statistic X̄ gives the p-value

$$p_x = P_{\mu_0}\left(\sqrt{n}\,|\bar{X}| \geq \sqrt{n}\,|\bar{x}|\right) = P\left(|Z| \geq \sqrt{n}\,|\bar{x}|\right),$$

where Z has the standard normal distribution N(0, 1). In fact, the HDS test for this hypothesis is exactly the same as that for the joint hypothesis H0 : µ = 0, σ = 1 treated above, and hence it has the p-value $p_{hd} = P(\chi^2_n \geq \sum_{i=1}^n x_{i0}^2)$. Let us interpret the difference between the HDS test and the classical significance test. Suppose that we have drawn a sample of even size n and that the observation is as follows:

$$x_{i0} = \begin{cases} i \times 1000 & \text{if } i \text{ is odd}, \\ -(i-1) \times 1000 & \text{if } i \text{ is even}, \end{cases} \qquad i = 1, \ldots, n.$$

Here $\sum_{i=1}^n x_{i0}^2$ is huge but x̄0 = 0, so the Fisherian p-value is px = P(|Z| ≥ 0) = 1 while the HDS p-value phd is approximately 0. The p-values of the two significance tests point in completely opposite directions. The insignificant p-value of the Fisherian significance test provides no evidence against H0, but the HDS test provides very strong evidence against H0. It is interesting that the results of these two tests are completely different. Without a specified alternative hypothesis, the HDS test gives a p-value indicating that the pre-assumption σ = 1 may be wrong even though H0 : µ = 0 is probably valid. In the Fisherian significance test, the insignificant p-value leads us to accept H0 and do nothing further about this wild observation. In fact, the strong evidence provided by the HDS test indicates that we should not blindly believe that the population mean µ has changed; rather, we should suspect that the assumed σ, or the distribution itself, is no longer correct.
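A small sketch, with an arbitrary even sample size and the reconstructed p-values above (σ = 1 known, H0 : µ = 0), shows this contrast numerically:

```python
# Sketch: Fisherian vs. HDS p-values for the constructed "wild" observation.
import numpy as np
from scipy.stats import norm, chi2

n = 10                                  # any even sample size
i = np.arange(1, n + 1)
# Pairs (1000, -1000), (3000, -3000), ... cancel, so the sample mean is 0.
x0 = np.where(i % 2 == 1, i * 1000.0, -(i - 1) * 1000.0)

# Fisherian p-value based on the sufficient statistic: P(|Z| >= sqrt(n)|xbar|)
p_fisher = 2 * norm.sf(np.sqrt(n) * abs(x0.mean()))

# HDS p-value based on the full null likelihood: P(chi^2_n >= sum(x_i^2))
p_hds = chi2.sf(np.sum(x0 ** 2), df=n)

print(p_fisher)  # 1.0: no evidence against H0
print(p_hds)     # essentially 0: strong evidence against the assumed model
```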

Let us now see the use of the HDS test on multivariate data. Consider X1, ..., Xn, a random sample drawn from a multivariate normal distribution Nk(µ, Σ). Suppose that the null hypothesis is H0 : µ = µ0, Σ = Σ0, where µ0 is a known k-vector and Σ0 is a known k × k positive definite matrix, and that we have the observation (x10, ..., xn0). It is seen that L(X, µ0, Σ0) ≤ L(x0, µ0, Σ0) if and only if

$$\sum_{i=1}^n (X_i - \mu_0)' \Sigma_0^{-1} (X_i - \mu_0) \geq \sum_{i=1}^n (x_{i0} - \mu_0)' \Sigma_0^{-1} (x_{i0} - \mu_0).$$

Then the p-value of the HDS test for this multivariate normal distribution is

$$p_{hd} = P\left(\chi^2_{nk} \geq \sum_{i=1}^n (x_{i0} - \mu_0)' \Sigma_0^{-1} (x_{i0} - \mu_0)\right),$$

since under H0 each quadratic form $(X_i - \mu_0)' \Sigma_0^{-1} (X_i - \mu_0)$ has a $\chi^2_k$ distribution and the n terms are independent.
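The following sketch computes this multivariate HDS p-value under the reconstruction above; the function name and the example data are hypothetical.

```python
# Sketch: HDS p-value for H0: mu = mu0, Sigma = Sigma0 under N_k(mu, Sigma).
import numpy as np
from scipy.stats import chi2

def hds_pvalue_mvnormal(X, mu0, Sigma0):
    """X is an (n, k) array of observations."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    diff = X - mu0
    # Sum of the n Mahalanobis-type quadratic forms, chi^2_{nk} under H0.
    t_obs = np.einsum('ij,jk,ik->', diff, np.linalg.inv(Sigma0), diff)
    return chi2.sf(t_obs, df=n * k)

# Hypothetical usage
mu0, Sigma0 = np.zeros(2), np.eye(2)
X = np.random.default_rng(0).normal(size=(5, 2))
print(hds_pvalue_mvnormal(X, mu0, Sigma0))
```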

Example 3.2. Let X1, ..., Xn be a random sample drawn from the negative exponential distribution with density function f(x, θ) = e^{−(x−θ)} I(θ < x < ∞). Consider significance tests for the null hypothesis H0 : θ = θ0. Let X(1) represent the first order statistic of this random sample, and write Γ(a, b) for a random variable having the gamma distribution with parameters a and b. Given an observation (x10, ..., xn0), the Fisherian significance test generally chooses the sufficient statistic X(1) as the test statistic, which yields the p-value

$$p_x = P_{\theta_0}\left(X_{(1)} - \theta_0 \geq x_{(1)0} - \theta_0\right) = P\left(\Gamma(1, 1/n) \geq x_{(1)0} - \theta_0\right),$$

since X(1) − θ has the gamma distribution Γ(1, 1/n), where x(1)0 is the observed value of X(1). Now let us consider the HDS test for this hypothesis testing problem. Since L(xa, θ0) ≥ L(xb, θ0) is equivalent to $\sum_{i=1}^n x_{ai} \leq \sum_{i=1}^n x_{bi}$, the HDS test yields the p-value

$$p_{hd} = P_{\theta_0}\left(\bar{X} - \theta_0 \geq \bar{x}_0 - \theta_0\right) = P\left(\Gamma(n, 1/n) \geq \bar{x}_0 - \theta_0\right),$$

since X̄ − θ0 has the gamma distribution Γ(n, 1/n).

The Fisherian significance test traditionally uses only the sufficient statistic X(1) to compute the p-value, and it claims strong evidence of departure from H0 when x(1)0 − θ0 is large enough to yield a small p-value. However, the HDS test computes the p-value based on both X(1) and $\sum_{i=1}^n (X_i - X_{(1)})$, where the latter measures the sum of distances between each observation Xi and the first order statistic X(1). Thus it uses more information than that provided by the sufficient statistic to compute the p-value.
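The link between these two pieces of information can be made explicit through the following identity (an added step, consistent with the reconstructed p_hd above), which shows that the HDS statistic combines the sufficient statistic X(1) with the ancillary quantity $\sum_{i=1}^n (X_i - X_{(1)})$:

$$n(\bar{X} - \theta_0) = \sum_{i=1}^n (X_i - \theta_0) = n\left(X_{(1)} - \theta_0\right) + \sum_{i=1}^n \left(X_i - X_{(1)}\right).$$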

To consider an extreme case, assume that θ0 = 0 and that the observation is x(1)0 = 0.01 and x(i)0 = 100, i = 2, ..., n. In this situation, the Fisherian significance test will claim that there is not enough evidence against the null hypothesis. However, since it is rare for an exponential distribution to produce an observation like this, the small p-value of the HDS test provides strong evidence against the null hypothesis. Because there is no specified alternative hypothesis in a significance test, this leads us to suspect that the distribution is no longer an exponential one, which requires further investigation. □
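A sketch of this extreme case, assuming the reconstructed p-values above (the sample size n = 10 here is arbitrary):

```python
# Sketch: Fisherian vs. HDS p-values for the extreme case in Example 3.2.
import numpy as np
from scipy.stats import gamma

n, theta0 = 10, 0.0
x0 = np.array([0.01] + [100.0] * (n - 1))

# Fisherian p-value: X_(1) - theta0 ~ Gamma(1, scale 1/n), i.e. Exp(rate n)
p_fisher = gamma.sf(x0.min() - theta0, a=1, scale=1.0 / n)

# HDS p-value: Xbar - theta0 ~ Gamma(n, scale 1/n); extreme samples have a
# larger average distance from theta0 than the observed one.
p_hds = gamma.sf(np.mean(x0 - theta0), a=n, scale=1.0 / n)

print(p_fisher)  # close to 1: the classical test sees nothing unusual
print(p_hds)     # essentially 0: strong evidence against the assumed model
```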

Two conclusions can be drawn from the above two examples:

(I) The results shown in the examples indicate that the HDS test is more sensitive to rare events than the Fisherian significance test.

(II) This sensitivity gives the HDS test a desirable property, in the sense that its non-extreme set has the smallest volume, a property that classical Fisherian significance tests do not have.

Occasionally, the HDS test leads to a test statistic that is exactly the same as that of a classical Fisherian significance test.

Example 3.3. Let X1, ..., Xn be a random sample drawn from a distribution with pdf f(x, θ) = θx^{θ−1}, 0 < x < 1, θ > 0.

The HDS test automatically uses the distributional shape under H0 to determine the extreme set, producing

$$E_{hd} = \left\{(x_1, \ldots, x_n)' : \prod_{i=1}^n x_i^{\theta_0 - 1} \leq \prod_{i=1}^n x_{i0}^{\theta_0 - 1}\right\},$$

which is one-sided in the positive statistic $-\sum_{i=1}^n \ln x_i$, with the direction determined by whether θ0 is larger or smaller than 1. The Fisherian significance test traditionally uses the best statistic $\prod_{i=1}^n x_i^{\theta_0}$. One of the one-sided tests with extreme sets such as $\{(x_1, \ldots, x_n)' : \prod_{i=1}^n x_i^{\theta_0} > \prod_{i=1}^n x_{i0}^{\theta_0}\}$ or $\{(x_1, \ldots, x_n)' : \prod_{i=1}^n x_i^{\theta_0} < \prod_{i=1}^n x_{i0}^{\theta_0}\}$, or a two-sided test, is an alternative choice; Mood et al. (1974) chose a two-sided version.

In this example, the HDS test and the classical test are both constructed from the same statistic. However, the HDS test has the advantage of automatically determining the extreme set. □
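A sketch of how the HDS p-value of Example 3.3 can be computed, assuming the reconstruction of E_hd given above; the direction of the extreme set is chosen automatically from θ0, and the function name is illustrative:

```python
# Sketch: HDS p-value for H0: theta = theta0 under f(x, theta) = theta*x^(theta-1).
import numpy as np
from scipy.stats import gamma

def hds_pvalue_power(x, theta0):
    x = np.asarray(x, dtype=float)
    n = x.size
    t_obs = -np.sum(np.log(x))      # T = -sum(log X_i) ~ Gamma(n, scale 1/theta0) under H0
    if theta0 > 1:                  # likelihood decreases as T grows
        return gamma.sf(t_obs, a=n, scale=1.0 / theta0)
    elif theta0 < 1:                # likelihood decreases as T shrinks
        return gamma.cdf(t_obs, a=n, scale=1.0 / theta0)
    return 1.0                      # theta0 = 1: uniform case, all samples equally likely
```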

From these examples, we summarize some other conclusions in support of the HDS test.

Firstly, the HDS test automatically determines the extreme set used to compute the p-value. On the other hand, the Fisherian significance test may struggle to determine a test statistic or to decide whether the test should be one-sided or two-sided. Secondly, the HDS test constructs the extreme set Ehd so that it contains sample points more extreme than those in the non-extreme set, in the sense that L(x1, θ0) < L(x2, θ0) for x1 ∈ Ehd and x2 ∉ Ehd. This property holds in general only for the HDS test. Thirdly, the HDS test usually uses a statistic that contains information in the data related to both location and scale parameters to determine the extreme set; this statistic therefore often combines sufficient and ancillary statistics. However, the Fisherian significance test involves only the information in the data related to the parameter discussed in the null hypothesis. By using this richer information, the HDS test appears quite satisfactory in detecting a distributional shift when the observation gives a small p-value. In this situation, further investigation is needed to determine the cause of the small p-value: it may be that H0 is not true, or that some other distributional shift has occurred. These possible conclusions result from the use of the likelihood function.
