
Summary

In document: Highest Density Significance Test (pp. 15-23)

Starting from the introduction of the statistical model, we reviewed approaches to statistical inference in the previous sections. The approach based on the likelihood function is especially interesting. Maximum likelihood estimation has the desirable property of asymptotically attaining the Cramér-Rao lower bound, while the likelihood ratio yields the most powerful test. In the following sections, we study an aspect of statistical inference that differs from sufficiency and conditionality. We investigate the information contained in the statistical model and introduce two statistical principles. We then follow these principles to define a new extreme set and construct a new significance test, the highest density significance test, in Sections 3 and 4. Finally, we develop control charts for several distributions in Section 5.

2 Probability and Plausibility

2.1 Concepts of Probability and Plausibility

Birnbaum (1962) proposed that “report of experimental results in scientific journals should in principle be descriptions of likelihood functions, when adequate mathematical-statistical models can be assumed, rather than reports of significance levels or interval estimates.” The likelihood function derived from a statistical model has a long history of broad application in statistical inference. In its simplest form, the statistical model assumes a random sample X1, ..., Xn drawn from a distribution with pdf f (x, θ), x ∈ Rx, θ ∈ Θ, where Rx is the sample space of the variable X and Θ is the parameter space.

Most inference problems in a parametric model arise because we do not know the true parameter θ when we have the observations. Given the observations, we wish to infer whatever the problem asks about the true parameter. To handle such a problem, we generally want to summarize the important and necessary information and use it to develop a statistical procedure with desirable properties.

Among the many available procedures, we expect a desirable one to achieve the following:

(a) It satisfies a generally accepted criterion of optimality.

(b) The best procedure can be shown to exist, i.e. it is obtainable.

Different statistical problems need different amounts of information to achieve this. For a general study, it is worth noting that the total amount of information is contained in the statistical model postulated by Fisher as

Γx = {f (x, θ) : x ∈ Rx, θ ∈ Θ, X = (X1, ..., Xn) is a random sample}. (1)

To solve various statistical problems, several data reduction techniques have been proposed that try to gather the available information for inference about θ. It is not known whether there is a single data reduction technique that serves all statistical inference problems. The question arises from the concept of sufficiency, introduced by Fisher (1922). This concept now plays a central role in frequency-based inference, and many frequentist approaches are recommended to rely on the sufficient statistic. In the literature, a statistic S(X) is generally defined to be sufficient for the family {f (x, θ), θ ∈ Θ} if the conditional distribution of the random sample X given a value of S(X) does not depend on θ (see, for example, Spanos (1999, p. 627)). If the sufficient statistic provided enough information for all statistical inference problems, then for every inference problem there would be a technique based on the sufficient statistic that can be shown to have some desirable properties. However, this is not the case in some important statistical inference problems.

The sufficient statistic perhaps plays its most important part in developing the minimum variance unbiased estimator. Let us look at its role in some other statistical inference problems. The pivotal quantity, an elegant technique due mainly to Barnard (1949, 1980), has been widely applied to construct confidence intervals. A significance test, formulated by Fisher (1915), is a method for measuring statistical evidence against a null hypothesis H0. It selects a test statistic T and computes the probability of the tail area of the distribution of T beyond its observed value, which is called the p-value. The pivotal quantity for a confidence interval and the test statistic for a significance test are generally recommended to be built from the sufficient statistic, without careful justification. Unfortunately, the statistical procedures based on them have not been shown to have any desired optimality properties. It may be that the sufficient statistic, especially the minimal one, condenses the information in the statistical model Γx so much that it is not appropriate for all statistical problems.

Vapnik (1998, p. 12) argued that a restricted amount of information can solve only some special problems and can never solve all statistical problems with effective procedures. In interval estimation and significance testing, the lack of optimality properties suggests that there is relevant evidence that cannot be found in a sufficient statistic. Silvey (1975) and Lindsey (1996) both pointed out that much frequentist theory and many techniques for confidence sets appear ad hoc because they are not wholly model-based (that is, reliant on the likelihood function) and do not follow any other single unified principle. What is lacking is a technique that captures the useful information embedded in the data and the model for constructing inference methods.

Without the likelihood function, the techniques for confidence intervals in the literature are not convincing in terms of plausibility, the information represented by the size of the likelihood function L(θ, x) when X = x is observed. This indicates that we should be careful in using the sufficient statistic to construct a confidence interval. The problem raised by Silvey (1975) and Lindsey (1996) also occurs in significance testing. Assuming that H0 is true, the existing significance tests do not restrict the non-extreme set to probable points. As a result, the p-values computed from these significance tests are not appropriate as evidence against H0.

Let us examine the concern of Silvey (1975) and Lindsey (1996) about the plausibility and probability of confidence intervals and significance tests based on the sufficient statistic. When the vector X = x is observed, we say that θ1 is more plausible than θ2 if L(θ1, x) > L(θ2, x), where L(θ, x) is the likelihood function for the random sample X. Under the null hypothesis H0 : θ = θ0, we say that the sample point x1 is more probable than another point x2 if L(θ0, x1) > L(θ0, x2). Suppose that C(X) is a 100(1 − α)% confidence interval for θ and A(x0) is the non-extreme set for a significance test. The likelihood sets corresponding to the confidence interval C(X) and to the significance test when X = x0 is observed are, respectively,

LSC = {L(θ, x) : θ ∈ C(x), x ∈ R_x^n} for the confidence interval, and

LSA(x0) = {L(θ0, x) : x ∈ A(x0)} for the significance test.
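The two orderings defined above can be sketched numerically. The following is a minimal illustration, assuming a normal model with known σ = 1; the function names and sample values are hypothetical and are not taken from the original text.

```python
import math

def normal_likelihood(mu, xs, sigma=1.0):
    """L(mu, x): joint N(mu, sigma^2) density evaluated at the sample xs."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return (2 * math.pi * sigma ** 2) ** (-n / 2) * math.exp(-ss / (2 * sigma ** 2))

def more_plausible(theta1, theta2, xs):
    """theta1 is more plausible than theta2 given x if L(theta1, x) > L(theta2, x)."""
    return normal_likelihood(theta1, xs) > normal_likelihood(theta2, xs)

def more_probable(x1, x2, theta0):
    """Under H0: theta = theta0, x1 is more probable than x2 if L(theta0, x1) > L(theta0, x2)."""
    return normal_likelihood(theta0, x1) > normal_likelihood(theta0, x2)

# Hypothetical data, for illustration only.
x = [0.2, -0.1, 0.4]
print(more_plausible(0.0, 1.0, x))
print(more_probable([0.1, 0.0, -0.1], [2.0, 2.0, 2.0], 0.0))
```

Plausibility compares parameter values for a fixed sample, while probability compares sample points for a fixed parameter; both comparisons use the same function L(θ, x).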

For interval estimation, the likelihood set is the set of plausibility values for a confidence interval. As discussed above, a set with higher plausibility is better suited to serve as a confidence interval or as the non-extreme set of a significance test. We therefore want a confidence set whose likelihood set stays away from zero. In other words, we want to construct a confidence interval that includes the most plausible points. The likelihood sets of some typical confidence intervals are shown in the following examples.

Example 2.1. Let X1, ..., Xn be a random sample drawn from the normal distribution N(µ, σ²). First we consider the confidence interval for the mean µ with known variance σ² = 1 for convenience, so that the sample space is Rx = R. The popularly used 100(1 − α)% confidence interval for µ based on the sufficient statistic $\bar{X}$ is

$\left(\bar{X} - z_{\alpha/2}/\sqrt{n},\ \bar{X} + z_{\alpha/2}/\sqrt{n}\right)$.

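Three facts are used in deriving the likelihood set. (1) $\bar{x} - z_{\alpha/2}/\sqrt{n} \le \mu \le \bar{x} + z_{\alpha/2}/\sqrt{n}$ if and only if $n(\bar{x} - \mu)^2 \le z_{\alpha/2}^2$. (2) $\sum_{i=1}^{n} (X_i - \bar{X})^2 \sim \chi^2(n-1)$, which has sample space $(0, \infty)$. (3) $\bar{X}$ and $\sum_{i=1}^{n} (X_i - \bar{X})^2$ are independent.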

In this example, the likelihood set extends all the way down to zero, which indicates that this confidence interval is implausible.
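A small numerical sketch may make this concrete. It assumes σ = 1 and uses made-up three-point samples that share the same sample mean but differ in spread. The interval based on the sample mean is identical for all of them, yet the likelihood evaluated at a point inside the interval (here µ equal to the sample mean) shrinks toward zero as the spread grows. The helper names are hypothetical.

```python
import math
from statistics import NormalDist

def z_interval(xs, alpha=0.05):
    """100(1 - alpha)% interval for mu with known sigma = 1."""
    n = len(xs)
    xbar = sum(xs) / n
    half = NormalDist().inv_cdf(1 - alpha / 2) / math.sqrt(n)
    return xbar - half, xbar + half

def likelihood_at_mean(xs):
    """L(xbar, x) with sigma = 1; xbar always lies inside z_interval(xs)."""
    n = len(xs)
    xbar = sum(xs) / n
    s = sum((x - xbar) ** 2 for x in xs)
    return (2 * math.pi) ** (-n / 2) * math.exp(-s / 2)

def spread_sample(c):
    """Hypothetical sample with mean 0 and within-sample sum of squares 2*c**2."""
    return [-c, 0.0, c]

values = []
for c in (0.0, 1.0, 3.0, 6.0):
    xs = spread_sample(c)
    values.append(likelihood_at_mean(xs))
    print(c, z_interval(xs), values[-1])
```

Because the interval depends on the sample only through the sample mean, it cannot exclude samples whose likelihood is negligible at every µ in the interval.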

Similarly, the popularly used significance test based on the sufficient statistic $\sum_{i=1}^{n} X_i$, when X = x0 is observed, computes the p-value $P(\sqrt{n}\,|\bar{X}| \ge \sqrt{n}\,|\bar{x}_0| \mid \mu = 0)$, where $\bar{x}_0$ denotes the sample mean of the vector x0 and µ = 0 represents that H0 is true. The likelihood set of this significance test is identical to that of the confidence interval for the mean.
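The p-value above can be computed directly from the standard normal distribution, since √n X̄ is N(0, 1) when µ = 0 and σ = 1. The sketch below assumes exactly that setting; the function name is hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_z_pvalue(x0):
    """P(sqrt(n)|Xbar| >= sqrt(n)|xbar0| | mu = 0) for a sample x0 with sigma = 1."""
    n = len(x0)
    xbar0 = sum(x0) / n
    z = math.sqrt(n) * abs(xbar0)
    # Two equal tails of the standard normal beyond +/- z.
    return 2 * (1 - NormalDist().cdf(z))

# Hypothetical observation, for illustration only.
print(two_sided_z_pvalue([0.5, 1.2, -0.3, 0.9]))
```

The p-value depends on x0 only through its sample mean, so two observations with the same mean receive the same p-value regardless of how probable each is under H0.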

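Next, we consider the confidence interval for the variance σ², where µ = µ0 is assumed to be known. With the sufficient statistic $\sum_{i=1}^{n} (x_i - \mu_0)^2$, the widely adopted 100(1 − α)% confidence interval for σ² is

$\left( \sum_{i=1}^{n} (x_i - \mu_0)^2 / \chi^2_{1-\alpha/2},\ \sum_{i=1}^{n} (x_i - \mu_0)^2 / \chi^2_{\alpha/2} \right)$.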

With σ² restricted by the inequality $\chi^2_{\alpha/2} \le \sum_{i=1}^{n} (x_i - \mu_0)^2 / \sigma^2 \le \chi^2_{1-\alpha/2}$, and with each xi free to take any value in R, the likelihood set is

LSC = {(2πσ12)n/2e

This result also shows that the likelihood set includes zero and its neighborhood. The confidence interval for σ² is therefore not a desirable confidence interval in the sense of plausibility.

Example 2.2. (Likelihood set for a confidence interval based on the sufficient statistic) Let X1, ..., Xn be a random sample drawn from the negative exponential distribution with pdf f (x, θ) = e^{−(x−θ)} I(θ < x < ∞). The typical confidence interval for θ is constructed from the pivotal quantity Y1 = X(1) − θ, which uses the sufficient statistic X(1) and has distribution Y1 ∼ Gamma(1, 1/n). Let a and b be two positive constants satisfying 1 − α = P (a < Y1 < b); then a 100(1 − α)% confidence interval for θ is

(X(1) − b, X(1) − a). (2)

The likelihood set for confidence interval (2) is

LSC = {e

Here too, using only the sufficient statistic leaves the likelihood set unable to stay away from zero, so the confidence interval has a likelihood set that includes less plausible points.
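The construction of interval (2) can be sketched as follows. The text only requires that a and b satisfy 1 − α = P(a < Y1 < b); the equal-tailed choice below is an assumption made for illustration, as are the function names and the sample values. The sketch uses P(Y1 > y) = e^{−ny}, which is the Gamma(1, 1/n) distribution of the pivot.

```python
import math

def pivot_bounds(n, alpha=0.05):
    """Equal-tailed a, b (an assumed choice) with P(a < Y1 < b) = 1 - alpha."""
    a = -math.log(1 - alpha / 2) / n   # P(Y1 <= a) = alpha / 2
    b = -math.log(alpha / 2) / n       # P(Y1 >= b) = alpha / 2
    return a, b

def interval_for_theta(xs, alpha=0.05):
    """Interval (X(1) - b, X(1) - a) built from the smallest observation."""
    a, b = pivot_bounds(len(xs), alpha)
    x_min = min(xs)
    return x_min - b, x_min - a

def likelihood(theta, xs):
    """L(theta, x) = exp(-sum(x_i - theta)) when theta < x(1), and 0 otherwise."""
    if theta >= min(xs):
        return 0.0
    return math.exp(-sum(x - theta for x in xs))

# Hypothetical samples with the same minimum but increasing spread.
for c in (0.0, 2.0, 10.0):
    xs = [1.0, 1.0 + c, 1.0 + 2 * c]
    lo, hi = interval_for_theta(xs)
    print(c, (lo, hi), likelihood((lo + hi) / 2, xs))
```

The interval is the same for all three samples because it depends only on X(1), while the likelihood at the midpoint of the interval falls rapidly as the remaining observations move away from the minimum.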

For the null hypothesis H0 : θ = θ0, suppose that the observed random sample is x0 = (x10, ..., xn0)′, with x0(1) the observed value of the first order statistic X(1). The significance test then defines the p-value as P (X(1) ≥ x0(1) | θ = θ0). The significance test has the same likelihood set as the confidence interval, as shown in the following:

LSA(x0)= {e
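For reference, the p-value P (X(1) ≥ x0(1) | θ = θ0) has a simple closed form, because X(1) − θ0 has survival function e^{−ny} under H0. The sketch below assumes this model; the function name and the data are hypothetical.

```python
import math

def min_stat_pvalue(x0, theta0):
    """P(X(1) >= x0(1) | theta = theta0) for the shifted exponential model."""
    n = len(x0)
    gap = min(x0) - theta0
    # For gap <= 0 the event X(1) >= x0(1) has probability 1 under H0.
    return 1.0 if gap <= 0 else math.exp(-n * gap)

# Hypothetical observation, for illustration only.
print(min_stat_pvalue([1.5, 2.0, 4.0], theta0=1.0))
```

As with the normal example, the p-value uses only the sufficient statistic x0(1) and ignores how the remaining observations are spread.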

We have examined the likelihood sets for the confidence intervals and significance tests that are constructed from the sufficient statistic. These results lead to several conclusions and comments:

(a) Classically, a confidence interval may fail to contain the most plausible point, the maximum likelihood estimate, and the non-extreme set of a significance test may fail to contain the most probable point, the point x achieving $\max_{x \in R_x^n} L(\theta_0, x)$. However, the examples above do contain the corresponding most plausible and most probable points in their intervals or non-extreme sets. This indicates that statistical procedures based on the sufficient statistic seem to be efficient at capturing the information in the statistical model (1) for problems that search for the most plausible and most probable points.

(b) In the examples analyzed, the likelihood sets for the confidence intervals include all the implausible points (those with likelihood close to zero), and the likelihood sets for the significance tests include all the improbable points. This supports the concern of Silvey (1975) and Lindsey (1996) about the implausibility of existing confidence intervals, and the analogous result holds for significance tests. We may then conclude that the sufficient statistic does not contain all the plausibility information and probability information, so it is inappropriate to say that it is sufficient for the family {f (x, θ), θ ∈ Θ}.

(c) As argued by Vapnik (1998), dealing with more general statistical problems requires much more information than a sufficient statistic provides. In particular, the information needed to construct a confidence interval is much more than what the sufficient statistic provides.
