
Power of a Test

In document: Highest Density Significance Test (最高密度顯著性檢定) (page 51-0)

In the Neyman-Pearson formulation, a test is evaluated by its power, the probability of rejecting $H_0$ when $H_0$ is false. We now consider a two-sided test, for which no uniformly most powerful test exists. Suppose the sample is drawn from $N(\mu, \sigma^2)$, the null hypothesis is $H_0 : \mu = \mu_0$ with known $\sigma = \sigma_0$, and the alternative hypothesis is $H_1 : \mu \neq \mu_0$. The typical level $\alpha$ test based on the sufficient statistic $\bar{X}$ uses the rejection region

$$\left| \frac{\bar{X} - \mu_0}{\sigma_0/\sqrt{n}} \right| > z_{1-\alpha/2},$$

where $z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of the standard normal distribution. This is not an SVNES significance test. Its power is

$$\beta_{NP}(\mu, \sigma) = P_{\mu,\sigma}\left( |\bar{X} - \mu_0| > \frac{\sigma_0}{\sqrt{n}}\, z_{1-\alpha/2} \right) = 1 - \Phi\left( \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} + z_{1-\alpha/2}\frac{\sigma_0}{\sigma} \right) + \Phi\left( \frac{\mu_0 - \mu}{\sigma/\sqrt{n}} - z_{1-\alpha/2}\frac{\sigma_0}{\sigma} \right) \tag{13}$$

when $N(\mu, \sigma^2)$ is the true distribution.
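As a rough illustration, equation (13) can be evaluated directly with the Python standard library. This is only a sketch: the function name `power_np` and the default level `alpha = 0.05` are assumptions made for the example, since the text does not state the level used for the figures.

```python
from math import sqrt
from statistics import NormalDist


def power_np(mu, sigma, mu0=0.0, sigma0=1.0, n=5, alpha=0.05):
    """Power of the two-sided test based on the sample mean, equation (13).

    alpha = 0.05 is an assumed level, not a value taken from the text.
    """
    std_normal = NormalDist()
    z = std_normal.inv_cdf(1 - alpha / 2)       # z_{1 - alpha/2}
    shift = (mu0 - mu) / (sigma / sqrt(n))      # (mu0 - mu) / (sigma / sqrt(n))
    scale = z * sigma0 / sigma                  # z_{1 - alpha/2} * sigma0 / sigma
    return 1 - std_normal.cdf(shift + scale) + std_normal.cdf(shift - scale)
```

When the true parameters equal the null values, `power_np(0.0, 1.0)` returns the level `alpha`, which is a quick sanity check on the formula.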

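For the HDS test, the test statistic is

$$\sum_{i=1}^{n} \frac{(x_i - \mu_0)^2}{\sigma_0^2}$$

and the power is

$$\beta_{HDS}(\mu, \sigma) = P_{\mu,\sigma}\left( \sum_{i=1}^{n} \frac{(X_i - \mu_0)^2}{\sigma_0^2} > q_{\chi^2_n, 1-\alpha} \right) = P_{\mu,\sigma}\left( \chi^2_{n,\, ncp = n(\mu - \mu_0)^2/\sigma^2} > \frac{\sigma_0^2}{\sigma^2}\, q_{\chi^2_n, 1-\alpha} \right), \tag{14}$$

where $\chi^2_{n,\, ncp = n(\mu - \mu_0)^2/\sigma^2}$ is a noncentral $\chi^2$ random variable with $n$ degrees of freedom and noncentrality parameter $ncp = n(\mu - \mu_0)^2/\sigma^2$, and $q_{\chi^2_n, 1-\alpha}$ is the $1 - \alpha$ quantile of a central $\chi^2_n$ random variable.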
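Equation (14) can be evaluated numerically along the following lines. This is a sketch under stated assumptions rather than the authors' computation: the helper names are hypothetical, the level `alpha = 0.05` is assumed, the central chi-square CDF uses a plain series expansion that is adequate only for moderate arguments, and the noncentral distribution is written as a Poisson mixture of central chi-squares.

```python
from math import exp, lgamma, log, sqrt


def chi2_cdf(x, df):
    """Central chi-square CDF via the series for the regularized incomplete gamma.

    Adequate for moderate x; not intended for extreme arguments.
    """
    if x <= 0:
        return 0.0
    a, h = df / 2, x / 2
    term = exp(a * log(h) - h - lgamma(a + 1))
    total, k = term, 1
    while term > 1e-17 * total:
        term *= h / (a + k)
        total += term
        k += 1
    return min(total, 1.0)


def chi2_quantile(p, df):
    """Quantile of the central chi-square distribution by bisection."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def ncx2_sf(x, df, ncp):
    """P(noncentral chi-square > x) as a Poisson(ncp / 2) mixture of central ones."""
    lam = ncp / 2
    if lam == 0:
        return 1 - chi2_cdf(x, df)
    cdf = 0.0
    j_max = int(lam + 10 * sqrt(lam) + 50)  # truncation point of the Poisson sum
    for j in range(j_max + 1):
        weight = exp(-lam + j * log(lam) - lgamma(j + 1))
        cdf += weight * chi2_cdf(x, df + 2 * j)
    return max(0.0, 1 - cdf)


def power_hds(mu, sigma, mu0=0.0, sigma0=1.0, n=5, alpha=0.05):
    """Power of the HDS test, equation (14); alpha = 0.05 is an assumed level."""
    q = chi2_quantile(1 - alpha, n)
    ncp = n * (mu - mu0) ** 2 / sigma ** 2
    return ncx2_sf(sigma0 ** 2 / sigma ** 2 * q, n, ncp)
```

At the null values the function returns the level `alpha`, and with the mean held at `mu0` the power grows as the true `sigma` exceeds `sigma0`, which is the behaviour the comparison in the text relies on.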

The power functions of the two tests for $H_0 : \mu = 0$ with known $\sigma_0 = 1$ and sample size $n = 5$ are shown in Figures 1 and 2. When the assumed variance is the true one (Figure 3), the power of the HDS test is equal to or smaller than that of the test based on the sufficient statistic, which is denoted NP in the figures. When the true variance equals 2 (Figure 4), the HDS test has larger power when the true mean is close to the null value, but not otherwise. In Figure 5, where the true variance equals 3, the power of the HDS test is larger than that of the NP test. Hence the HDS test is more sensitive to a distributional change when the true variance is larger than the assumed value. When the true variance equals 0.5, which is smaller than the value under the null hypothesis (Figure 6), the power of the HDS test is always smaller. In this situation the HDS test does not easily reject the null hypothesis.

Figure 1: Power of HDS test and n=5 (surface plot; axes: mu, sigma, power)

Figure 2: Power of Neyman-Pearson test and n=5 (surface plot; axes: mu, sigma, power)

Figure 3: Powers of the two tests as σ2 = 1 and n=5 (x-axis: mu; y-axis: power; curves: NP, HDS)

Figure 4: Powers of the two tests as σ2 = 2 and n=5 (x-axis: mu; y-axis: power; curves: NP, HDS)

Figure 5: Powers of the two tests as σ2 = 3 and n=5 (x-axis: mu; y-axis: power; curves: NP, HDS)

Figure 6: Powers of the two tests as σ2 = 0.5 and n=5 (x-axis: mu; y-axis: power; curves: NP, HDS)

Figure 7: Powers of the two tests as µ = 0 and n=5 (x-axis: sigma; y-axis: power; curves: NP, HDS)

Figure 8: Powers of the two tests as µ = 1 and n=5 (x-axis: sigma; y-axis: power; curves: NP, HDS)

Furthermore, with the mean fixed at 0 and 1 (Figures 7 and 8, respectively), the power of the HDS test is smaller than that of the NP test when the true variance is small. This evaluation of power shows that the HDS test for a normal distribution readily detects a false null hypothesis when the true variance is larger than assumed.

Although the HDS test has smaller power when the assumed variance is larger than the true one, this is not the main concern for a control chart, which focuses on increased dispersion.

The extension of the HDS test to control charts is discussed in the following section.

4 Control Charts

Statistical process control aims to determine whether a process is stable. A stable process is one in which the distribution of the quality characteristic is unchanged, so statistical process control amounts to testing whether there is a distribution shift. Usually a distribution involves several parameters. Suppose that the density function $f$ with $p$ parameters is denoted $f(x, \theta_1, \ldots, \theta_p)$. The common approach is to design a statistical process control scheme for each of the $p$ parameters separately, which can be interpreted as testing each hypothesis with type I error probability $\alpha$. For example, when a characteristic follows a normal distribution with mean $\mu$ and variance $\sigma^2$, the popular technique is to construct an $\bar{X}$-chart to monitor shifts in the mean $\mu$ and an R-chart to monitor changes in the standard deviation $\sigma$.

Suppose that the control charts are constructed from statistics $\hat\theta_1, \ldots, \hat\theta_p$ separately for the $p$ parameters and that these statistics are independent. One rule declares the process statistically out of control if at least one of these charts signals rejection; the overall probability of a type I error then becomes $1 - (1 - \alpha)^p$. A second rule rejects only when all charts signal rejection; the overall probability of a type I error then becomes $\alpha^p$. Because different decision rules lead to different type I error probabilities, users may find it confusing to control the error probability. This is one deficit of the classical control charts.
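The two overall error rates are easy to tabulate. The snippet below is a minimal sketch; the function names are hypothetical, and the values $\alpha = 0.0027$ and $p = 2$ are chosen only for illustration.

```python
def overall_type1_any(alpha, p):
    """Overall type I error when the process is rejected if ANY of p independent charts signals."""
    return 1 - (1 - alpha) ** p


def overall_type1_all(alpha, p):
    """Overall type I error when the process is rejected only if ALL p independent charts signal."""
    return alpha ** p


# Illustrative values (assumptions): two independent charts, each at level 0.0027.
any_rule = overall_type1_any(0.0027, 2)
all_rule = overall_type1_all(0.0027, 2)
```

With these inputs the "any chart" rule roughly doubles the nominal level, while the "all charts" rule makes a false alarm far rarer than the per-chart level suggests.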

Two further deficits often arise in the classical approach to developing control charts. One is that possible correlation among the $p$ test statistics is ignored. For example, the test statistics involved in the $\bar{X}$-chart and the R-chart are actually correlated, which leads to incorrect (probability) control charts, as mentioned above. The other is that a shift in distribution may not be detectable through a shift in one or more of the parameters involved in $f$. In explaining this point, Hoerl and Palm (1992) and Woodall (2000) argued that control charts are aimed at detecting deviations from the model, including the distributional assumption itself.

Considering these deficits, Grimshaw and Alt (1997) argued that the traditional $\bar{X}$ and R charts are efficient in detecting changes in the distributional mean and variation, but that their efficiency can be markedly reduced by departures from the assumed shape of the density function. They therefore proposed a quantile control chart whose control limits are estimated by a confidence band for a quantile vector, and showed that such charts are quite effective in detecting changes in distributional shape that go undetected by the $\bar{X}$ and R charts. However, such a nonparametric control chart is generally less efficient than an appropriate parametric one when the distribution is known.

Unlike the classical Shewhart control chart, which tracks a sample point through its mean or range, we track the value of its density. Whether a sample point $x$ is attributed to a chance cause or an assignable cause is determined only by the size of its density. Thus we need only construct a lower limit for the sample density. This allows a single chart to monitor the process no matter how many parameters are involved in the distribution, and the probability of a type I error can be controlled at a specified value.

4.1 Density Control Charts

In the hypothesis testing problem, when $H_0$ is false the density function $f$ can be any $f_1$ other than $f_0$; $f_1$ may differ from $f_0$ in its mean, its variation, or the shape of the density. In control charting, practitioners are concerned with whether the process is in statistical control, as described by a distribution. Thus we can extend the optimality of the significance test to control charting, although there is a debate over the relation between hypothesis testing and control charting (Woodall, 2000). This property has also been applied to tolerance intervals based on coverage intervals of highest density values by Huang, Chen, and Welsh (2006).

We can now establish a control chart based on the same idea as the HDS test, using the relative values of a density function. Let $X_1, \ldots, X_n$ be a random sample drawn from a distribution with pdf $f$. A level $\alpha$ HDS test leads to the region $\{x : L(x, f_0) \le \ell(f_0)\}$, where $\ell(f_0)$ satisfies

$$\alpha = P_{f_0}\left( L(X, f_0) \le \ell(f_0) \right). \tag{15}$$

We now introduce the framework of a new control chart, consisting of a lower density limit and a tracking variable.

Definition 5.1. Let $x$ be the sample point and $L(x, f_0)$ be its joint density. The density Shewhart control chart specifies the control limit for the tracking variable $L(x, f_0)$ as

LCL = $\ell(f_0)$

Tracking variable: $L(x, f_0)$

where the constant $\ell(f_0)$ satisfies (15).
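Definition 5.1 only requires the lower $\alpha$ quantile of $L(X, f_0)$ under $f_0$. When no closed form is at hand, that quantile can be approximated by simulation, as in the sketch below. Everything named here is an assumption made for illustration: the helper functions, the choice of a standard normal $f_0$ with $n = 5$, the level $\alpha = 0.05$, and the number of replications. The limit is computed on the log scale, which is equivalent because the logarithm is monotone.

```python
import math
import random


def log_joint_density_normal(sample, mu=0.0, sigma=1.0):
    """Log of the joint normal density L(x, f0) for an i.i.d. sample."""
    n = len(sample)
    ss = sum((x - mu) ** 2 for x in sample)
    return -0.5 * n * math.log(2 * math.pi * sigma ** 2) - ss / (2 * sigma ** 2)


def draw_standard_normal(rng, n=5):
    """Draw one in-control sample of size n from the assumed f0 = N(0, 1)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def simulated_log_lcl(log_density, draw_sample, alpha, reps=20000, seed=0):
    """Empirical lower alpha quantile of log L(X, f0) over simulated in-control samples."""
    rng = random.Random(seed)
    values = sorted(log_density(draw_sample(rng)) for _ in range(reps))
    return values[int(alpha * reps)]


log_lcl = simulated_log_lcl(log_joint_density_normal, draw_standard_normal, alpha=0.05)
```

A new sample would then be flagged when its log joint density falls below `log_lcl`. For the normal case the simulated limit can be compared with the closed-form limit derived later in the text.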

Although there are many other applications, a control chart is most useful in online process monitoring. To explain how the density control chart is used to detect assignable causes, we start from the classical control chart. The classical Shewhart control chart monitors a parameter $\theta$ by tracking an estimator $\hat\theta$ with the control limits

$$\mathrm{UCL} = \mu_{\hat\theta} + 3\sigma_{\hat\theta}, \qquad \mathrm{LCL} = \mu_{\hat\theta} - 3\sigma_{\hat\theta} \tag{16}$$

where $\mu_{\hat\theta}$ and $\sigma_{\hat\theta}$ are, respectively, the mean and standard deviation of $\hat\theta$. For the control chart in (16), if the sample values of $\hat\theta$ fall within the control limits, i.e., $\hat\theta \in (\mathrm{LCL}, \mathrm{UCL})$, and do not exhibit any systematic pattern, we say that the process is in statistical control at the level indicated by the chart; generally this level is not known. For the density control chart, if the process density $f$ remains at $f_0$, then the values of $L(x, f_0)$ should be larger than $\ell(f_0)$. Here $\ell(f_0)$ is the lower $\alpha$ percentage point of the distribution of the joint density $L(X, f_0)$ when $H_0$ is true. This gives the following rule for online process monitoring with the density control chart:

If the sample values of $L(x, f_0)$ lie above the lower limit LCL = $\ell(f_0)$ and do not exhibit any systematic pattern, we say that the process is in statistical control at the level $1 - \alpha$.

In this setting we may choose any suitable level, such as $1 - \alpha = 0.9$, $0.95$ or $0.9973$. This is generally difficult with the classical control chart.

Usually the pdf $f$ can be written in the form $f(x, \theta_1, \ldots, \theta_p)$, so that the joint pdf $L(x, f)$ and $\ell(f_0)$ can be written as $L(x, \theta_1, \ldots, \theta_p) = \prod_{i=1}^{n} f(x_i, \theta_1, \ldots, \theta_p)$ and $\ell(\theta_1, \ldots, \theta_p)$, respectively. In general $\theta_1, \ldots, \theta_p$ are unknown and must be estimated from preliminary samples taken when the process is thought to be in statistical control. In fact, whether the process is in statistical control may itself not be known, so we need another way to handle the $p$ parameters, such as substituting estimates for them. Suppose that $m$ samples are available, each containing $n$ observations on the quality characteristic of interest. In practice, Shewhart (1931) substituted $\frac{1}{m}\sum_{j=1}^{m} \bar{x}_j$ and $\frac{1}{m}\sum_{j=1}^{m} s_j$ for $\mu$ and $\sigma$ in the control charts, where $\bar{x}_j$ and $s_j$ are the $j$-th sample mean and standard deviation. This method was also adopted by Chao and Cheng (1996) and Spiring and Cheng (1998). We extend it to all the parameters of a distribution. Let $\hat\theta_{i1}, \hat\theta_{i2}, \ldots, \hat\theta_{ip}$, $i = 1, \ldots, m$, be the estimates of $\theta_1, \ldots, \theta_p$ from the $m$ groups of the sample. The grand averages of these estimates are

$$\bar\theta_j = \frac{1}{m} \sum_{i=1}^{m} \hat\theta_{ij}, \quad j = 1, \ldots, p.$$
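For the normal case the grand averages reduce to the mean of the subgroup means and the mean of the subgroup standard deviations. The sketch below illustrates this with the standard library; the function name and the two small subgroups are hypothetical and are not data from the text.

```python
from statistics import mean, stdev


def grand_averages(groups):
    """Return (mean of subgroup means, mean of subgroup standard deviations)."""
    subgroup_means = [mean(g) for g in groups]
    subgroup_sds = [stdev(g) for g in groups]   # sample standard deviation, n - 1 divisor
    return mean(subgroup_means), mean(subgroup_sds)


# Hypothetical preliminary data: m = 2 subgroups of size n = 3.
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
x_double_bar, s_bar = grand_averages(groups)
```

Here the subgroup means are 2 and 4 and the subgroup standard deviations are 1 and 2, so the grand averages are 3 and 1.5.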

Woodall (2000) argued that a control chart is a test of the hypothesis that the in-control parameter values are true. From this point of view, the appropriate hypothesis about the distribution $f$ for a control chart is

$$H_0 : f(x, \theta_1, \ldots, \theta_p) = f(x, \bar\theta_1, \ldots, \bar\theta_p).$$

We then have a control chart that agrees with Woodall's view when the underlying distribution involves parameters $\theta_1, \ldots, \theta_p$.

Definition 5.2. The density Shewhart control chart specifies the framework of the density control chart as

LCL = $\ell(\bar\theta_1, \ldots, \bar\theta_p)$

Tracking variable: $L(x, \bar\theta_1, \ldots, \bar\theta_p)$

The rule for online process monitoring remains valid with $L(x, f_0)$ replaced by the sample density values $L(x, \bar\theta_1, \ldots, \bar\theta_p)$.

4.2 Density Control Charts for Some Distributions

4.2.1 Density Control Charts for Normal Distribution

One of the most important uses of a control chart is to improve the process. Consequently, we may also use the density control chart to assess whether there are assignable causes. For example, when a sample point $X = x$ falls below the density control limit, this $x$ may be an out-of-control sample point, and an assignable cause may exist. If assignable causes can be eliminated from the process, variability will be reduced and the process will be improved. In this section we introduce density control charts for the normal distribution.

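Suppose that a quality characteristic is normally distributed with unknown mean $\mu$ and variance $\sigma^2$. The joint density at $x$ is

$$L(x, \mu, \sigma) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left( -\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{2\sigma^2} \right).$$

Since $\sum_{i=1}^{n} (X_i - \mu)^2/\sigma^2 \sim \chi^2_n$, the inequality $L(X, \mu, \sigma) \ge \ell(\mu, \sigma)$ subject to $1 - \alpha = P_{\mu,\sigma}\left( L(X, \mu, \sigma) \ge \ell(\mu, \sigma) \right)$ yields

$$\ell(\mu, \sigma) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left( -\frac{\chi^2_{n,1-\alpha}}{2} \right).$$

The framework of the normal density control chart is

$$\mathrm{LCL} = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left( -\frac{\chi^2_{n,1-\alpha}}{2} \right)$$

$$\text{Tracking variable: } L(x, \mu, \sigma) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left( -\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{2\sigma^2} \right)$$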

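Following Shewhart (1931), we can replace $\mu$ and $\sigma$ by $\bar{\bar{x}} = \frac{1}{m}\sum_{j=1}^{m} \bar{x}_j$ and $\bar{s} = \frac{1}{m}\sum_{j=1}^{m} s_j$, respectively, where $\bar{x}_j$ and $s_j$ are the sample mean and the standard deviation of the $j$-th group. The density control chart then has the framework

$$\mathrm{LCL} = \frac{1}{(2\pi\bar{s}^2)^{n/2}} \exp\left( -\frac{\chi^2_{n,1-\alpha}}{2} \right)$$

$$\text{Tracking variable: } L(x, \bar{\bar{x}}, \bar{s}) = \frac{1}{(2\pi\bar{s}^2)^{n/2}} \exp\left( -\frac{\sum_{i=1}^{n} (x_i - \bar{\bar{x}})^2}{2\bar{s}^2} \right)$$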

Because the value of the likelihood function is usually too small to monitor conveniently, and the difference between in-control and out-of-control values is not obvious on that scale, out-of-control points are easier to identify with the log-likelihood. The framework for the log-likelihood control chart is

$$\mathrm{LCL} = -\frac{n}{2}\ln(2\pi\bar{s}^2) - \frac{\chi^2_{n,1-\alpha}}{2}$$

$$\text{Tracking variable: } \ln L(x, \bar{\bar{x}}, \bar{s}) = -\frac{n}{2}\ln(2\pi\bar{s}^2) - \frac{\sum_{i=1}^{n} (x_i - \bar{\bar{x}})^2}{2\bar{s}^2}$$

For a given observation $x = (x_1, \ldots, x_n)'$, we compare its density $L(x, \bar{\bar{x}}, \bar{s})$ with the lower control limit LCL; the control chart signals out of control if $L(x, \bar{\bar{x}}, \bar{s}) <$ LCL.

Example 4.1. The control of vane opening, an important functional parameter of a component part for a jet aircraft engine, was studied by Montgomery, Runger and Hubele (2004), who constructed statistical control charts to assess the statistical stability of this manufacturing process. With 20 preliminary samples of size 5, they first constructed an $\bar{X}$ chart, which indicates that samples 6, 8, 11 and 19 depart from the process mean, and an R chart, which indicates that sample 9 shows a shift in variation. After removing these samples, which potentially resulted from assignable causes, they constructed $\bar{X}$ and R charts from the remaining 15 samples for future judgement of the statistical stability of the manufacturing process. These procedures are shown in Figures 9, 10, 11, and 12.

Figure 9: ¯X chart for Vane Opening (x-axis: sample number; y-axis: X-bar; LCL = 29.97, UCL = 36.67)

Figure 10: R chart for Vane Opening (x-axis: sample number; y-axis: Range; UCL = 12.27)

Figure 11: ¯X chart for Vane Opening, revised limits (x-axis: sample number; y-axis: X-bar; LCL = 30.33, UCL = 36.10)

Figure 12: R chart for Vane Opening, revised limit (x-axis: sample number; y-axis: Range; UCL = 10.57)

Figure 13: Log-density control chart for Vane Opening (x-axis: sample number; y-axis: logL; LCL = −17.4010)

We now construct a density control chart from these preliminary samples. For ease of presentation, we construct the log-density control chart as

$$\mathrm{LCL} = -\frac{n}{2}\ln(2\pi\bar{s}^2) - \frac{\chi^2_{n,1-\alpha}}{2}$$

$$\text{Test statistic function: } \ln L(x_1, \ldots, x_n, \bar{\bar{x}}, \bar{s}) = -\frac{n}{2}\ln(2\pi\bar{s}^2) - \frac{\sum_{i=1}^{n} (x_i - \bar{\bar{x}})^2}{2\bar{s}^2} \tag{17}$$

Computed from these 20 preliminary samples, we have $\bar{\bar{x}} = 33.32$ and $\bar{s} = 2.09748$. Hence the lower control limit for the log-density control chart is

LCL = −17.4010.

Plotting the test statistic function

$$\ln L(x_1, \ldots, x_5, \bar{\bar{x}} = 33.32, \bar{s} = 2.09748) = -8.298398 - \frac{\sum_{i=1}^{n} (x_i - 33.32)^2}{8.7989}$$

for these twenty samples together with the control limit, we have the log-density control chart in Figure 13.
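The limit in this example can be checked numerically. The sketch below is not the authors' code: the helper names are hypothetical, the chi-square quantile is obtained with a simple series and bisection, and the level $\alpha = 0.0027$ is an assumption (the example does not state it, but it corresponds to the level 0.9973 mentioned earlier and is consistent with the reported limit). The two test samples at the end are invented for illustration.

```python
from math import exp, lgamma, log, pi


def chi2_cdf(x, df):
    """Central chi-square CDF via a series expansion (adequate for moderate x)."""
    if x <= 0:
        return 0.0
    a, h = df / 2, x / 2
    term = exp(a * log(h) - h - lgamma(a + 1))
    total, k = term, 1
    while term > 1e-17 * total:
        term *= h / (a + k)
        total += term
        k += 1
    return min(total, 1.0)


def chi2_quantile(p, df):
    """Chi-square quantile by bisection on chi2_cdf."""
    lo, hi = 0.0, float(df)
    while chi2_cdf(hi, df) < p:
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def log_density_lcl(n, s_bar, alpha):
    """Lower control limit of the log-density chart, as in (17)."""
    return -0.5 * n * log(2 * pi * s_bar ** 2) - chi2_quantile(1 - alpha, n) / 2


def log_density(sample, x_dbar, s_bar):
    """Tracking variable ln L(x, x_dbar, s_bar) of (17)."""
    n = len(sample)
    ss = sum((x - x_dbar) ** 2 for x in sample)
    return -0.5 * n * log(2 * pi * s_bar ** 2) - ss / (2 * s_bar ** 2)


# alpha = 0.0027 is an assumed level; x_dbar and s_bar are the values reported in the text.
lcl = log_density_lcl(n=5, s_bar=2.09748, alpha=0.0027)
near_center = log_density([33.32] * 5, 33.32, 2.09748)   # hypothetical sample
far_away = log_density([40.0] * 5, 33.32, 2.09748)       # hypothetical sample
```

Under that assumption `lcl` comes out close to the reported −17.4010; a sample sitting at the grand mean plots above the limit, and a sample far from it plots below.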

Samples 6, 8, 9 and 19 are out of control on this log-density control chart.

We discard these four samples, regarded as resulting from assignable causes, and recompute the log-density control limit. From the remaining 16 samples we obtain $\bar{\bar{x}} = 33$ and $\bar{s} = 1.81833$. Hence the lower control limit for the log-density control chart is LCL = −16.6869.

Figure 14: Log-density control chart for Vane Opening, revised limits (x-axis: sample number; y-axis: logL; LCL = −16.6869)

Again, plotting the test statistic function

$$\ln L(x_1, \ldots, x_5, \bar{\bar{x}} = 33, \bar{s} = 1.81833) = -7.584304 - \frac{\sum_{i=1}^{n} (x_i - 33)^2}{6.6127}$$

for the twenty preliminary samples together with the new control limit in Figure 14, we see that the samples possibly resulting from assignable causes are those numbered 6, 8, 9, 11, 15, 19. Since one more sample, number 11, is found to possibly result from an assignable cause, we discard these five samples and recompute the log-density control limit, which yields $\bar{\bar{x}} = 33.0428$ and $\bar{s} = 1.79721$. Hence we have a new log-density control chart:

LCL = −16.6285

$$\text{Test statistic function: } \ln L(x_1, \ldots, x_n) = -7.525884 - \frac{\sum_{i=1}^{n} (x_i - 33.0428)^2}{6.4599} \tag{18}$$

The samples possibly resulting from assignable causes are still those numbered 6, 8, 9, 11, 15, 19. Hence the log-density control chart of (18), shown in Figure 15, can now be used to judge the statistical control of the manufacturing process. It is interesting that sample 15 appears out of control on the density control chart but not on the $\bar{X}$ and R charts. This shows that a sample possibly resulting from assignable causes may go undetected by the $\bar{X}$ and R charts. The control chart of (18) should be revised periodically and whenever the process has been improved.

Figure 15: Log-density control chart for Vane Opening, twice revised limits (x-axis: sample number; y-axis: logL; LCL = −16.6285)

Let $\chi^2_o = \sum_{i=1}^{n} (x_i - \bar{\bar{x}})^2 / \bar{s}^2$, which has a $\chi^2$ distribution with $n$ degrees of freedom, where $\bar{\bar{x}}$ is the in-control mean and $\bar{s}$ is the in-control standard deviation. Since $L(x, \bar{\bar{x}}, \bar{s}) \le$ LCL if and only if $\chi^2_o \ge \chi^2_{n,1-\alpha}$, the density control chart is equivalent to the following chi-square control chart:

$$\mathrm{UCL} = \chi^2_{n,1-\alpha}, \qquad \mathrm{LCL} = 0$$

$$\text{Tracking variable: } \chi^2_o = \frac{\sum_{i=1}^{n} (x_i - \bar{\bar{x}})^2}{\bar{s}^2}$$

When $x$ is observed, the control chart checks $x$ by comparing the chi-square value $\chi^2_o$ with the lower and upper control limits. The rule for online process monitoring is: if the sample chi-square values $\chi^2_o$ fall within the control limits LCL and UCL and do not exhibit any systematic pattern, we say that the process is in statistical control at level $1 - \alpha$. The chi-square control chart may be represented graphically.

Figure 16: A control parabola for normal distribution (axes: $\bar{x}$ and $s^2$; parabola control region: $\frac{1}{\bar{s}^2}\left[ (n-1)s^2 + n(\bar{x} - \bar{\bar{x}})^2 \right] \le \chi^2_{n,1-\alpha}$, centered at $\bar{\bar{x}}$)

In other words, the monitored value of the chi-square control chart is composed of two parts, since

$$\chi^2_o = \frac{n(\bar{x} - \bar{\bar{x}})^2}{\bar{s}^2} + \frac{(n-1)s^2}{\bar{s}^2} \tag{19}$$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the mean and $s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ is the variance of the sample point $x$ with $x' = (x_1, \ldots, x_n)$. Equation (19) defines a parabola centered at $(\bar{\bar{x}}, 0)$ with principal axis parallel to the $s^2$ axis, as shown in Figure 16. This representation of the control chart is analogous to the semicircle chart proposed by Chao and Cheng (1996). Hence the semicircle chart is a density control chart for the normal distribution.
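The decomposition in (19) is the usual split of a sum of squares around the in-control mean into a between part and a within part, and it is easy to confirm numerically. In the sketch below the function names and the five-point sample are hypothetical; the in-control values 33.32 and 2.09748 are those reported in Example 4.1, and the upper limit passed to `inside_parabola` is an assumed value of $\chi^2_{n,1-\alpha}$.

```python
from statistics import mean, variance


def chi2_direct(sample, x_dbar, s_bar):
    """Tracking variable: sum of (x_i - x_dbar)^2 divided by s_bar^2."""
    return sum((x - x_dbar) ** 2 for x in sample) / s_bar ** 2


def chi2_decomposed(sample, x_dbar, s_bar):
    """Same quantity written as in (19): mean part plus variance part."""
    n = len(sample)
    xbar = mean(sample)
    s2 = variance(sample)               # sample variance with n - 1 divisor
    return (n * (xbar - x_dbar) ** 2 + (n - 1) * s2) / s_bar ** 2


def inside_parabola(xbar, s2, n, x_dbar, s_bar, ucl):
    """True when the point (xbar, s2) lies inside the control parabola."""
    return (n * (xbar - x_dbar) ** 2 + (n - 1) * s2) / s_bar ** 2 <= ucl


sample = [31.0, 33.0, 34.0, 35.0, 32.0]          # hypothetical observation
direct = chi2_direct(sample, 33.32, 2.09748)
split = chi2_decomposed(sample, 33.32, 2.09748)
in_control = inside_parabola(mean(sample), variance(sample), 5, 33.32, 2.09748, ucl=18.205)
```

Both routes give the same value, so plotting $(\bar{x}, s^2)$ against the parabola carries the same decision as comparing $\chi^2_o$ with the upper limit.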

Setting $\chi^2_o$ in (19) equal to $\chi^2_{n,1-\alpha}$ means that a sample $x$ with sample mean $\bar{x}$ and sample variance $s^2$ lies on the boundary curve. Any observation located inside the parabola indicates that the process is statistically in control; otherwise it is statistically out of control. We call it the control parabola; Figure 17 shows the example of Vane Opening. Although this chart loses the time sequence, as also noted for the semicircle chart by Cheng and Thaga (2006), this problem is common to single charts that monitor multiple parameters. The log-likelihood control chart is suitable for monitoring changes in a process with multiple parameters over time.

Figure 17: Parabola control chart for vane opening (x-axis: xbar; y-axis: s2)

4.2.2 Density Control Charts for Negative Exponential Distribution

Suppose that the quality characteristic follows a negative exponential distribution with probability density function

$$f(x, \theta_1, \theta_2) = \frac{1}{\theta_1} e^{-\frac{x - \theta_2}{\theta_1}}, \quad x \ge \theta_2 \tag{20}$$

where $\theta_1 > 0$ and $\theta_2 \in R$ are unknown parameters. In this situation, the joint density at $x_1, \ldots, x_n$ is $L(x_1, \ldots, x_n, \theta_1, \theta_2) = \frac{1}{\theta_1^n} \exp\left( -\frac{\sum_{i=1}^{n} (x_i - \theta_2)}{\theta_1} \right)$. Suppose that we have a training sample $x_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, m$, of $m$ groups of size $n$ from an in-control distribution.

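We can then calculate the $m$ location estimates $\hat\theta_{2j} = \min\{x_{1j}, \ldots, x_{nj}\}$, $j = 1, \ldots, m$, and scale estimates $\hat\theta_{1j} = \frac{1}{n}\sum_{i=1}^{n} (x_{ij} - \hat\theta_{2j})$, $j = 1, \ldots, m$, as well as the average $\bar\theta_1 = \frac{1}{m}\sum_{j=1}^{m} \hat\theta_{1j}$ and $\hat\theta_2 = \min\{\hat\theta_{21}, \ldots, \hat\theta_{2m}\}$. The appropriate hypothesis for this in-control process distribution is $H_0 : X \sim f(x, \bar\theta_1, \hat\theta_2)$. Treating $\bar\theta_1$ and $\hat\theta_2$ as the true $\theta_1$ and $\theta_2$, we have $\sum_{i=1}^{n} (X_i - \hat\theta_2)/\bar\theta_1 \sim \Gamma(n, 1)$. The inequality $L(X_1, \ldots, X_n, \bar\theta_1, \hat\theta_2) \ge \ell(\bar\theta_1, \hat\theta_2)$ subject to $1 - \alpha = P_{\bar\theta_1, \hat\theta_2}\left( L(X_1, \ldots, X_n, \bar\theta_1, \hat\theta_2) \ge \ell(\bar\theta_1, \hat\theta_2) \right)$ yields

$$\ell(\bar\theta_1, \hat\theta_2) = \frac{1}{\bar\theta_1^n} \exp\left( -\frac{\chi^2_{2n,1-\alpha}}{2} \right).$$

We then have the following new control chart when the process distribution is negative exponential:

$$\mathrm{LCL} = \frac{1}{\bar\theta_1^n} \exp\left( -\frac{\chi^2_{2n,1-\alpha}}{2} \right)$$

$$\text{Test statistic function: } L(x_1, \ldots, x_n, \bar\theta_1, \hat\theta_2) = \frac{1}{\bar\theta_1^n} \exp\left( -\frac{\sum_{i=1}^{n} (x_i - \hat\theta_2)}{\bar\theta_1} \right). \tag{21}$$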

For a given observation $x_1, \ldots, x_n$, we compare its density value $L(x_1, \ldots, x_n, \bar\theta_1, \hat\theta_2)$ with the lower control limit LCL. The control chart signals out of control if $L(x_1, \ldots, x_n, \bar\theta_1, \hat\theta_2) <$ LCL.

We may choose $t(X_1, \ldots, X_n) = \frac{1}{n}\sum_{i=1}^{n} (X_i - \hat\theta_2)$ as the test statistic. Since $\chi^2_0 = 2\sum_{i=1}^{n} (X_i - \hat\theta_2)/\bar\theta_1$ has the $\chi^2_{2n}$ distribution when $H_0$ is true, and $L(x_1, \ldots, x_n, \bar\theta_1, \hat\theta_2) \le$ LCL if and only if $\chi^2_0 \ge \chi^2_{2n,1-\alpha}$, the new control chart is exactly a chi-square control chart:

$$\mathrm{UCL} = \frac{\bar\theta_1}{2n}\chi^2_{2n,1-\alpha}, \qquad \mathrm{LCL} = 0$$

$$\text{Test function: } t(x_1, \ldots, x_n) = \frac{1}{n}\sum_{i=1}^{n} (x_i - \hat\theta_2) \tag{22}$$

The average run length (ARL) tells us, for a given state of the distribution, how many successive control chart points we plot on average before a point falls beyond the control limit. The ARL of the chart in (22) for this negative exponential distribution is

$$\mathrm{ARL} = \frac{1}{P\left( \chi^2_{2n} \ge \frac{\bar\theta_1}{\theta_1}\chi^2_{2n,1-\alpha} - \frac{2n(\theta_2 - \hat\theta_2)}{\theta_1} \right)}.$$
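The ARL expression can be evaluated without simulation, because a chi-square variable with an even number of degrees of freedom has a closed-form tail probability (the Erlang sum). The following sketch uses that fact. The function names are hypothetical, and the level $\alpha = 0.0027$ is an assumption chosen because it gives an in-control ARL of $1/\alpha \approx 370.37$.

```python
from math import exp


def chi2_even_sf(x, df):
    """P(chi-square with even df > x), using the Erlang closed form."""
    if x <= 0:
        return 1.0
    half = x / 2
    term = total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return exp(-half) * total


def chi2_even_upper_quantile(alpha, df):
    """Value q with P(chi-square_df > q) = alpha, found by bisection."""
    lo, hi = 0.0, float(df)
    while chi2_even_sf(hi, df) > alpha:
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if chi2_even_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def arl(theta1, theta2, n=5, alpha=0.0027, theta1_bar=1.0, theta2_hat=0.0):
    """ARL of chart (22) from the closed-form expression; alpha is an assumed level."""
    q = chi2_even_upper_quantile(alpha, 2 * n)
    threshold = theta1_bar / theta1 * q - 2 * n * (theta2 - theta2_hat) / theta1
    return 1 / chi2_even_sf(threshold, 2 * n)
```

With no location shift and $n = 5$, `arl(1.0, 0.0)` is about 370.37 and `arl(2.0, 0.0)` is about 5.01, so a doubling of the scale is detected after roughly five samples on average.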

We performed a simulation to study the ARL with respect to shifts in location ($\theta_2$) and scale ($\theta_1$). With in-control parameter values $\bar\theta_1 = 1.0$ and $\hat\theta_2 = 0$, the following table shows the resulting ARL for sample sizes $n = 5$ and $n = 10$.

Table 9. ARL for the mean and location shifts.

            θ2 = 0    θ2 = 0.2  θ2 = 0.5  θ2 = 1.0  θ2 = 2.0  θ2 = 3.0
n = 5
θ1 = 1.0    370.37    258.01    151.41    64.03     13.06     3.43
θ1 = 1.2    76.26     57.57     38.10     19.65     5.91      2.23
θ1 = 1.5    17.83     14.57     10.84     6.79      2.97      1.57
θ1 = 2.0    5.01      4.42      3.68      2.77      1.71      1.22
θ1 = 2.5    2.66      2.44      2.17      1.80      1.34      1.10
θ1 = 3.0    1.87      1.76      1.62      1.44      1.18      1.05
n = 10
θ1 = 1.0    370.37    204.90    87.36     23.55     2.97      1.09
θ1 = 1.2    50.63     32.81     17.69     7.00      1.78      1.03
θ1 = 1.5    9.25      6.98      4.71      2.67      1.26      1.01
θ1 = 2.0    2.53      2.20      1.81      1.40      1.06      1.00
θ1 = 2.5    1.51      1.40      1.27      1.13      1.02      1.00
θ1 = 3.0    1.21      1.16      1.11      1.05      1.00      1.00

Several conclusions can be drawn from the results in Table 9:

1. The ARL is strictly decreasing as $\theta_1$, $\theta_2$, or both increase. This density control chart detects large shifts in the process level rapidly.
