Distributional and Inferential Properties of the Estimated Precision $C_p$ Based on Multiple Samples


Note


W. L. PEARN¹ and Y. S. YANG²

¹Department of Industrial Engineering & Management, National Chiao Tung University; ²Department of Industrial Engineering, Da-Yeh University, Taiwan ROC

Abstract. The process precision index $C_p$ has been widely used in the manufacturing industry to provide numerical measures of process potential. Pearn et al. (1998) considered an unbiased estimator of $C_p$ for a single sample and showed that it is the UMVUE. They also proposed an efficient test for $C_p$ based on a single sample, and showed that the test is the UMP test. In this paper, we consider an unbiased estimator of $C_p$ for multiple samples. We show that the unbiased estimator is the UMVUE of $C_p$ and that it is asymptotically efficient. We also consider an efficient test for $C_p$, and show that the test is the UMP test for multiple samples. Practitioners can use the proposed test in their in-plant applications to obtain reliable decisions.

Key words: process precision index, unbiased estimator, UMVUE, asymptotically efficient, UMP test, p-value, power.

1. Introduction

Process capability indices, which establish the relationships between the actual process performance and the manufacturing specifications, have been the focus of recent research in quality assurance and process capability analysis. These capability indices, which quantify process potential and performance, are important for any successful quality improvement activity and quality program implementation. The first process capability index to appear in the literature is the precision index $C_p$, which is defined by Kane (1986) as:

$$C_p = \frac{USL - LSL}{6\sigma},$$

where USL is the upper specification limit, LSL is the lower specification limit, and $\sigma$ is the process standard deviation. The numerator of $C_p$ gives the range over which the process measurements are allowable. The denominator gives the range over which the process is actually varying. The index $C_p$ was designed to measure the magnitude of the overall process variation relative to the manufacturing tolerance, and is to be used for processes with data that are normal, independent, and in statistical control. Clearly, the index only measures the potential of a process (the potential to reproduce acceptable product), and does not take into account whether the process is centered. The use of the capability indices was first explored within the automotive industry. Ford Motor Company (1986) has used $C_p$ to keep track of process performance and to reduce process variation. Recently, the manufacturing industries have been making an extensive effort to implement statistical process control (SPC) in their plants and supply bases. Capability indices derived from SPC have received increasing usage not only in capability assessments, but also in the evaluation of purchasing decisions. Capability indices are becoming standard tools for quality reporting, particularly at the management level, around the world. Understanding these indices properly and estimating them accurately is essential for a company to maintain a capable supplier.
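As a minimal illustration of the definition above, the following Python sketch evaluates $C_p$ from the specification limits and a known process standard deviation. The function name `process_precision` and the value $\sigma = 0.01$ are illustrative assumptions and do not come from the paper.

```python
def process_precision(usl: float, lsl: float, sigma: float) -> float:
    """Precision index C_p = (USL - LSL) / (6 * sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if usl <= lsl:
        raise ValueError("USL must exceed LSL")
    return (usl - lsl) / (6.0 * sigma)


# Hypothetical numbers: a tolerance band of width 0.10 and sigma = 0.01.
cp = process_precision(usl=74.05, lsl=73.95, sigma=0.01)
print(round(cp, 2))
```

Because only the width of the tolerance band enters the formula, shifting both limits by the same amount leaves the index unchanged, which reflects the remark that $C_p$ ignores process centering.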

2. Estimating $C_p$ Based on Multiple Samples

For cases where the data are collected as one single sample, Pearn et al. (1998) considered an unbiased estimator of $C_p$. They showed that the unbiased estimator is the UMVUE (uniformly minimum variance unbiased estimator) of $C_p$. They also proposed an efficient test for $C_p$ based on one single sample, and showed that the test is the UMP test. For cases where the data are collected as multiple samples, Kirmani et al. (1991) considered $m$ samples, each of size $n$, and suggested the following estimator of $C_p$, where $\bar{X}_i$ is the $i$th sample mean and $S_i$ is the $i$th sample standard deviation:

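$$\hat{C}_p^* = \frac{(USL - LSL)\, d_p}{6},$$

where

$$d_p = \sqrt{\frac{m(n-1)-1}{m(n-1)}}\;\frac{b_{m(n-1)-1}}{S_p},$$

$$b_{m(n-1)-1} = E\left(\frac{S_i}{\sigma}\right) = E\left[\frac{\chi_{m(n-1)-1}}{\sqrt{m(n-1)-1}}\right] = \sqrt{\frac{2}{m(n-1)-1}}\;\Gamma\left[\frac{m(n-1)}{2}\right]\left\{\Gamma\left[\frac{m(n-1)-1}{2}\right]\right\}^{-1},$$

$$S_p^2 = \frac{1}{m(n-1)}\sum_{i=1}^{m}(n-1)S_i^2 = \frac{1}{m}\sum_{i=1}^{m}S_i^2,$$

noting that the statistic $S_i/\sigma$ is distributed as $\chi_{m(n-1)-1}/[m(n-1)-1]^{1/2}$. Under the normality assumption, the estimator $\hat{C}_p^*$ is distributed as:

$$\hat{C}_p^* \sim \frac{\sqrt{m(n-1)-1}\; b_{m(n-1)-1}}{\sqrt{\chi^2_{m(n-1)}}}\; C_p.$$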
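The estimator can be sketched in a few lines of Python. This is a minimal sketch, assuming the subsample variances $S_i^2$ have already been computed; the names `b` and `cp_hat_star` and the numerical inputs are hypothetical. The correction factor $b_f$ is evaluated through `math.lgamma` to avoid overflow of the gamma function for large degrees of freedom.

```python
import math


def b(f: int) -> float:
    """b_f = sqrt(2/f) * Gamma((f+1)/2) / Gamma(f/2), computed on the log scale."""
    return math.sqrt(2.0 / f) * math.exp(
        math.lgamma((f + 1) / 2.0) - math.lgamma(f / 2.0)
    )


def cp_hat_star(usl: float, lsl: float, sample_vars: list, n: int) -> float:
    """Estimator of C_p from m subsample variances, each based on n observations."""
    m = len(sample_vars)
    nu = m * (n - 1)                       # pooled degrees of freedom m(n-1)
    if nu < 2:
        raise ValueError("need m(n-1) >= 2")
    s_p = math.sqrt(sum(sample_vars) / m)  # pooled standard deviation S_p
    d_p = math.sqrt((nu - 1) / nu) * b(nu - 1) / s_p
    return (usl - lsl) * d_p / 6.0


# Hypothetical data: m = 3 subsamples of size n = 5.
variances = [1.0e-4, 0.8e-4, 1.2e-4]
estimate = cp_hat_star(10.05, 9.95, variances, n=5)
naive = (10.05 - 9.95) / (6.0 * math.sqrt(sum(variances) / len(variances)))
print(estimate, naive)
```

The factor $\sqrt{[m(n-1)-1]/m(n-1)}\,b_{m(n-1)-1}$ is below one, so the estimate is always somewhat smaller than the naive plug-in value $(USL - LSL)/(6S_p)$.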

The estimator $\hat{C}_p^*$ is unbiased, and its probability density function (PDF) is given below for $y > 0$, where $k = [m(n-1)-1]\,C_p^2\,b^2_{m(n-1)-1}$ is a function of $C_p$:

$$g(y) = \frac{2\,k^{m(n-1)/2}}{2^{m(n-1)/2}\,\Gamma[m(n-1)/2]}\; y^{-[m(n-1)+1]} \exp\left(-\frac{k}{2}\cdot\frac{1}{y^2}\right).$$

The variance of $\hat{C}_p^*$ can be calculated as follows (Kirmani et al. (1991)):

$$\begin{aligned}
\mathrm{Var}(\hat{C}_p^*) &= E[(\hat{C}_p^*)^2] - [E(\hat{C}_p^*)]^2 \\
&= \frac{(USL - LSL)^2\, b^2_{m(n-1)-1}\,[m(n-1)-1]}{36\,m(n-1)}\; E\left(\frac{1}{S_p^2}\right) - C_p^2 \\
&= C_p^2\left[\frac{m(n-1)-1}{m(n-1)-2}\; b^2_{m(n-1)-1} - 1\right] \\
&= C_p^2\left[\frac{1}{b^2_{m(n-1)-2}} - 1\right].
\end{aligned}$$

Tables I(a)–I(d) display the values of the variance for $C_p$ = 1.00, 1.33, 1.67, 2.00, $m$ = 10(5)25, and $n$ = 2(1)15. We note that for a fixed number of $m \times n$ sample observations, $\mathrm{Var}(\hat{C}_p^*)$ for large $m$ and small $n$ is greater than that for small $m$ and large $n$. For example, for $C_p$ = 1.00 with $m \times n$ = 60, $\mathrm{Var}(\hat{C}_p^*)$ = 0.0133 for $m$ = 20, $n$ = 3; $\mathrm{Var}(\hat{C}_p^*)$ = 0.0117 for $m$ = 15, $n$ = 4; and $\mathrm{Var}(\hat{C}_p^*)$ = 0.0105 for $m$ = 10, $n$ = 6. Similarly, for $C_p$ = 1.00 with $m \times n$ = 100, $\mathrm{Var}(\hat{C}_p^*)$ = 0.0068 for $m$ = 25, $n$ = 4; $\mathrm{Var}(\hat{C}_p^*)$ = 0.0064 for $m$ = 20, $n$ = 5; and $\mathrm{Var}(\hat{C}_p^*)$ = 0.0056 for $m$ = 10, $n$ = 10.
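The closed form $C_p^2[1/b^2_{m(n-1)-2} - 1]$ is easy to evaluate numerically. The sketch below is an assumption-laden illustration rather than the authors' own computation: the function names are hypothetical, and the printed values should agree with Table Ia only up to rounding in the last digit.

```python
import math


def b(f: int) -> float:
    """b_f = sqrt(2/f) * Gamma((f+1)/2) / Gamma(f/2), computed on the log scale."""
    return math.sqrt(2.0 / f) * math.exp(
        math.lgamma((f + 1) / 2.0) - math.lgamma(f / 2.0)
    )


def var_cp_hat_star(cp: float, m: int, n: int) -> float:
    """Variance of the estimator: C_p^2 * (1 / b_{m(n-1)-2}^2 - 1)."""
    nu = m * (n - 1)
    if nu <= 2:
        raise ValueError("the variance requires m(n-1) > 2")
    return cp ** 2 * (1.0 / b(nu - 2) ** 2 - 1.0)


# The three designs with m * n = 60 discussed in the text, for C_p = 1.00.
for m, n in [(20, 3), (15, 4), (10, 6)]:
    print(m, n, round(var_cp_hat_star(1.0, m, n), 4))
```

The variance depends on the design only through $m(n-1)$, which is why, for a fixed total of $m \times n$ observations, fewer but larger subsamples give a smaller variance.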

In the following, we investigate some other statistical properties of $\hat{C}_p^*$. We show that $\hat{C}_p^*$ is the UMVUE of $C_p$, and that it is also asymptotically efficient. Under regularity conditions, an estimator $\hat{\theta}_n$ is said to be asymptotically efficient if its asymptotic efficiency satisfies $\lim_{n\to\infty} e(\hat{\theta}_n) = \lim_{n\to\infty} 1/[I(\theta)\,\mathrm{Var}(\hat{\theta}_n)] = 1$, where $1/I(\theta)$ is the Cramér–Rao lower bound.


Table Ia. Variance of $\hat{C}_p^*$ for $C_p$ = 1.00, for m = 10(5)25 and n = 2(1)15.

  n     m = 10    m = 15    m = 20    m = 25
  2     0.0643    0.0391    0.0282    0.0220
  3     0.0282    0.0180    0.0133    0.0105
  4     0.0180    0.0117    0.0087    0.0068
  5     0.0133    0.0087    0.0064    0.0050
  6     0.0105    0.0068    0.0050    0.0040
  7     0.0087    0.0056    0.0042    0.0034
  8     0.0074    0.0048    0.0036    0.0028
  9     0.0064    0.0042    0.0032    0.0026
 10     0.0056    0.0038    0.0028    0.0022
 11     0.0050    0.0034    0.0026    0.0020
 12     0.0046    0.0030    0.0022    0.0018
 13     0.0042    0.0028    0.0020    0.0016
 14     0.0040    0.0026    0.0020    0.0016
 15     0.0036    0.002     0.0018    0.0014

Table Ib. Variance of $\hat{C}_p^*$ for $C_p$ = 1.33, for m = 10(5)25 and n = 2(1)15.

  n     m = 10    m = 15    m = 20    m = 25
  2     0.1138    0.0692    0.0499    0.0388
  3     0.0499    0.0319    0.0236    0.0185
  4     0.031     0.0207    0.0153    0.0121
  5     0.0236    0.0153    0.0114    0.0089
  6     0.0185    0.0121    0.0089    0.0071
  7     0.0153    0.0099    0.0075    0.0060
  8     0.0132    0.0085    0.0064    0.0050
  9     0.0114    0.0075    0.0057    0.0046
 10     0.0099    0.0067    0.0050    0.0039
 11     0.0089    0.0060    0.0046    0.0035
 12     0.0082    0.0053    0.0039    0.0032
 13     0.0075    0.0050    0.0035    0.0028
 14     0.0071    0.0046    0.0035    0.0028
 15     0.0064    0.0043    0.0032    0.0025


Table Ic. Variance of $\hat{C}_p^*$ for $C_p$ = 1.67, for m = 10(5)25 and n = 2(1)15.

  n     m = 10    m = 15    m = 20    m = 25
  2     0.1795    0.1091    0.0786    0.0612
  3     0.0786    0.0503    0.0372    0.0292
  4     0.0503    0.0326    0.0241    0.0191
  5     0.0372    0.0241    0.0179    0.0140
  6     0.0292    0.0191    0.0140    0.0112
  7     0.0241    0.0157    0.0118    0.0095
  8     0.0208    0.0134    0.0101    0.0078
  9     0.0179    0.0118    0.0089    0.0073
 10     0.0157    0.0106    0.0078    0.0061
 11     0.0140    0.0095    0.0073    0.0056
 12     0.0129    0.0084    0.0061    0.0050
 13     0.0118    0.0078    0.0056    0.0045
 14     0.0112    0.0073    0.0056    0.0045
 15     0.0101    0.0067    0.0050    0.0039

Table Id. Variance of $\hat{C}_p^*$ for $C_p$ = 2.00, for m = 10(5)25 and n = 2(1)15.

  n     m = 10    m = 15    m = 20    m = 25
  2     0.2574    0.1564    0.1127    0.0878
  3     0.1127    0.0722    0.0533    0.0419
  4     0.0722    0.0468    0.0346    0.0273
  5     0.0533    0.0346    0.0257    0.0201
  6     0.0419    0.0273    0.0201    0.0160
  7     0.0346    0.0225    0.0169    0.0136
  8     0.0298    0.0193    0.0144    0.0112
  9     0.0257    0.0169    0.0128    0.0104
 10     0.0225    0.0152    0.0112    0.0088
 11     0.0201    0.0136    0.0104    0.0080
 12     0.0185    0.0120    0.0088    0.0072
 13     0.0169    0.0112    0.0080    0.0064
 14     0.0160    0.0104    0.0080    0.0064
 15     0.0144    0.0096    0.0072    0.0056


THEOREM 1.

(a) $\hat{C}_p^*$ is the UMVUE of $C_p$.

(b) $(mn)^{1/2}(\hat{C}_p^* - C_p)$ converges to $N(0, C_p^2/2)$ in distribution.

(c) $\hat{C}_p^*$ is asymptotically efficient.

Proof: (a) It is easy to show that the statistic $S_p^2$ is a complete sufficient statistic for $C_p$, since the probability density function of $\hat{C}_p^*$ belongs to the exponential family. Further, since $\hat{C}_p^*$ is an unbiased estimator of $C_p$ and is a function of $S_p^2$ only, it follows from the Lehmann–Scheffé theorem (see Arnold (1990)) that $\hat{C}_p^*$ is the UMVUE of $C_p$.

(b) If the process characteristic is normally distributed, then the statistic $(mn)^{1/2}(S_p^2 - \sigma^2)$ converges to $N(0, 2\sigma^4)$ in distribution. We define the continuous function $g(t)$ as

$$g(t) = (USL - LSL)/(6t^{1/2}),$$

and its derivative is

$$g'(t) = -(USL - LSL)/(12t^{3/2}).$$

By the Cramér δ-theorem (see Arnold (1990)), we have

$$\sqrt{mn}\,[g(S_p^2) - g(\sigma^2)] \to N(0, 2\sigma^4[g'(\sigma^2)]^2)$$

in distribution, where $[g'(\sigma^2)]^2 = C_p^2/(4\sigma^4)$. Writing $d = (USL - LSL)/2$, so that $g(S_p^2) = d/(3S_p)$, the result is equivalent to $\sqrt{mn}\,(d/(3S_p) - C_p) \to N(0, C_p^2/2)$ in distribution. Kirmani et al. (1991) proved that $\hat{C}_p^*$ is a consistent estimator of $C_p$, so $\hat{C}_p^*$ converges to $C_p$ in probability. Thus, by Slutsky's theorem (see Arnold (1990)) we have

$$(mn)^{1/2}(\hat{C}_p^* - d/(3S_p)) \to N(0, C_p^2/2)$$

in distribution, and so

$$(mn)^{1/2}(\hat{C}_p^* - C_p) \to N(0, C_p^2/2)$$

in distribution.

(c) Note that $\mathrm{Var}(\hat{C}_p^*) = C_p^2\{1/b^2_{m(n-1)-2} - 1\}$ by Kirmani et al. (1991). For a single sample of size $n$, the information for $C_p$ is

$$I_1(C_p) = E[-\partial^2 \log f_1(x; C_p)/\partial C_p^2] = 2(n-1)/C_p^2,$$

where

$$f_1(x; C_p) = \frac{2\left(\sqrt{(n-1)/2}\;C_p\right)^{n-1}}{\Gamma[(n-1)/2]}\; x^{-n} \exp[-(n-1)\,C_p^2\,(2x^2)^{-1}], \quad x > 0,$$

by Chou and Owen (1989). Therefore, the information for $m$ independent subgroups, each of size $n$, is $I(C_p) = m I_1(C_p) = 2m(n-1)/C_p^2$. Next, we compute the asymptotic efficiency of $\hat{C}_p^*$. Letting $k = m(n-1)$,

$$\begin{aligned}
\lim_{m(n-1)\to\infty} e(\hat{C}_p^*) &= \lim_{m(n-1)\to\infty} \frac{C_p^2}{2m(n-1)\,C_p^2\,\left(b^{-2}_{m(n-1)-2} - 1\right)} \\
&= \lim_{k\to\infty} \frac{b^2_{k-2}}{2k\,(1 - b^2_{k-2})} \\
&= \lim_{k\to\infty} \frac{1 - (1/2k) + (1/8k^2) + o(1/k^2)}{2k\,[(1/2k) - (1/8k^2) + o(1/k^2)]} = 1.
\end{aligned}$$

Therefore, $\hat{C}_p^*$ is asymptotically efficient.
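The limit in part (c) can be checked numerically. The following sketch, with the hypothetical function name `efficiency`, evaluates the ratio of the Cramér–Rao bound to the exact variance, $b^2_{k-2}/[2k(1 - b^2_{k-2})]$ with $k = m(n-1)$, for increasing $k$. Note that $C_p$ cancels, so the efficiency depends on $k$ only.

```python
import math


def b(f: int) -> float:
    """b_f = sqrt(2/f) * Gamma((f+1)/2) / Gamma(f/2), computed on the log scale."""
    return math.sqrt(2.0 / f) * math.exp(
        math.lgamma((f + 1) / 2.0) - math.lgamma(f / 2.0)
    )


def efficiency(k: int) -> float:
    """Efficiency 1 / (I(C_p) * Var) = b_{k-2}^2 / (2k * (1 - b_{k-2}^2)), k = m(n-1)."""
    if k <= 2:
        raise ValueError("requires k = m(n-1) > 2")
    b2 = b(k - 2) ** 2
    return b2 / (2.0 * k * (1.0 - b2))


for k in (10, 100, 1000, 10000):
    print(k, round(efficiency(k), 4))
```

The printed values increase towards one as $k$ grows, which is the numerical counterpart of the limit derived above. For small designs such as $k = 10$ the efficiency is noticeably below one.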

3. UMP Test for $C_p$ Based on Multiple Samples

For cases with multiple samples, to determine whether a given process meets the preset requirement and runs under the desired quality condition, we consider testing the null hypothesis $H_0: C_p \le C$ (the process is incapable) versus the alternative $H_1: C_p > C$ (the process is capable). Thus, we may consider the test $\phi^*(x) = 1$ if $\hat{C}_p^* > c^*$, and $\phi^*(x) = 0$ otherwise. The test $\phi^*$ rejects the null hypothesis if $\hat{C}_p^* > c^*$, with type I error $\alpha(c^*) = \alpha$, the chance of incorrectly judging an incapable process as capable. Kirmani et al. (1991) obtained the critical value $c^*$, which satisfies the following equation:

$$P[\hat{C}_p^* > c^* \mid H_0: C_p \le C] = P\left\{\chi^2_{m(n-1)} < \frac{[m(n-1)-1]\,b^2_{m(n-1)-1}}{(c^*/C)^2}\right\} = \alpha,$$

$$c^* = C\,\sqrt{\frac{[m(n-1)-1]\,b^2_{m(n-1)-1}}{\chi^2_{m(n-1),\alpha}}},$$

where $\chi^2_{m(n-1),\alpha}$ is the lower $\alpha$-percentage point of the chi-square distribution with $m(n-1)$ degrees of freedom. The null hypothesis ($C_p \le C$) is rejected and the process is declared capable if the value of $\hat{C}_p^*$ is greater than $c^*$.
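The critical value formula above can be sketched in Python without external libraries. This is an illustrative sketch under stated assumptions: the chi-square distribution function is evaluated with a plain series for the regularized incomplete gamma function and inverted by bisection, which is adequate for moderate degrees of freedom such as those in Tables I(a)–I(d) but is not a substitute for a statistical library. All function names are hypothetical.

```python
import math


def b(f: int) -> float:
    """b_f = sqrt(2/f) * Gamma((f+1)/2) / Gamma(f/2), computed on the log scale."""
    return math.sqrt(2.0 / f) * math.exp(
        math.lgamma((f + 1) / 2.0) - math.lgamma(f / 2.0)
    )


def chi2_cdf(x: float, nu: int) -> float:
    """P(chi-square with nu d.f. <= x), via the series for the lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = nu / 2.0, x / 2.0
    term = total = 1.0 / a
    denom = a
    while term > total * 1e-16:      # all terms are positive, so no cancellation
        denom += 1.0
        term *= z / denom
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def chi2_lower_quantile(alpha: float, nu: int) -> float:
    """Lower alpha-percentage point of chi-square(nu), found by bisection on the CDF."""
    lo, hi = 0.0, nu + 20.0 * math.sqrt(2.0 * nu) + 20.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, nu) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def critical_value(c: float, m: int, n: int, alpha: float) -> float:
    """c* = C * sqrt([m(n-1)-1] * b_{m(n-1)-1}^2 / chi2_{m(n-1), alpha})."""
    nu = m * (n - 1)
    return c * math.sqrt((nu - 1) * b(nu - 1) ** 2 / chi2_lower_quantile(alpha, nu))


# Requirement C = 1.33 with m = 10 subsamples of size n = 5 and alpha = 0.05.
print(round(critical_value(1.33, 10, 5, 0.05), 2))
```

For $C$ = 1.33, $m$ = 10, $n$ = 5 and $\alpha$ = 0.05, this sketch gives a critical value of about 1.60.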

A test is said to be the uniformly most powerful (UMP) test against the alternative $H_1$ (but not against another, if $H_0$ is simple but $H_1$ is composite) if it is the most powerful against every simple alternative in $H_1$. As noted by Lindgren (1968), if the process characteristic $X$ has a distribution in the exponential family, with $f(x; \theta) = B(\theta)\,h(x) \exp[Q(\theta)S(x)]$, and if $Q(\theta)$ is monotone increasing, then the critical region $S(X) > K$ is uniformly most powerful for $\theta \le \theta^*$ against $\theta > \theta^*$. Thus, we can show the following:


THEOREM 2. For testing the hypotheses $H_0: C_p \le C$ versus $H_1: C_p > C$, the test defined as $\phi^*(x) = 1$ if $\hat{C}_p^* > c^*$, and $\phi^*(x) = 0$ otherwise, is the UMP test of level $\alpha$.

Proof: Under the assumption of normality, the density function of $\hat{C}_p^*$ is given below, where $k = [m(n-1)-1]\,C_p^2\,b^2_{m(n-1)-1}$:

$$f(x) = \frac{2\,k^{m(n-1)/2}}{2^{m(n-1)/2}\,\Gamma[m(n-1)/2]}\; x^{-[m(n-1)+1]} \exp\left(-\frac{k}{2}\cdot\frac{1}{x^2}\right).$$

We note that the above probability density function belongs to the exponential family with $S(x) = -x^{-2}$ and $Q(C_p) = (1/2)[m(n-1)-1]\,C_p^2\,b^2_{m(n-1)-1}$, where $x$ is real and $Q(C_p)$ is strictly increasing in $C_p$. Thus, by the theory described in Lindgren (1968), it is clear that the test $\phi^*$ is uniformly most powerful. The UMP test rejects the null hypothesis if, and only if, $-x^{-2} \ge -(c^*)^{-2}$, where $P[-x^{-2} \ge -(c^*)^{-2}] = \alpha$. Since $\hat{C}_p^* = [m(n-1)-1]^{1/2}\,b_{m(n-1)-1}\,C_p\,K^{-1/2}$, where $K$ is distributed as $\chi^2_{m(n-1)}$, the critical region can be expressed as follows:

$$\left\{ \frac{K}{[m(n-1)-1]\,b^2_{m(n-1)-1}\,C^2} \le \frac{1}{(c^*)^2} \right\}.$$

The critical value $c^*$ for an $\alpha$ level of significance is obtained by solving the equation

$$\frac{[m(n-1)-1]\,b^2_{m(n-1)-1}\,C^2}{(c^*)^2} = \chi^2_{m(n-1),\alpha}, \qquad\text{that is,}\qquad (c^*)^2 = \frac{[m(n-1)-1]\,b^2_{m(n-1)-1}\,C^2}{\chi^2_{m(n-1),\alpha}}.$$

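In the following, we first calculate the p-value (the risk of wrongly rejecting the null hypothesis $H_0: C_p \le C$) given an observed value of the statistic. Suppose the observed value of the statistic is $\hat{C}_p^* = W$; then the p-value can be calculated as follows, where $K$ is distributed as $\chi^2_{m(n-1)}$:

$$\begin{aligned}
p\text{-value} &= P\{\hat{C}_p^* \ge W \mid C_p \le C\} \\
&= P\left\{\frac{\sqrt{m(n-1)-1}\;b_{m(n-1)-1}\,C}{\sqrt{K}} \ge W \;\middle|\; C_p \le C\right\} \\
&= P\left\{\frac{[m(n-1)-1]\,b^2_{m(n-1)-1}\,C^2}{K} \ge W^2 \;\middle|\; C_p \le C\right\} \\
&= P\left\{\chi^2_{m(n-1)} \le \frac{[m(n-1)-1]\,b^2_{m(n-1)-1}\,C^2}{W^2} \;\middle|\; C_p \le C\right\}.
\end{aligned}$$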
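The last expression reduces the p-value to a single chi-square distribution function evaluation. The sketch below is a hedged illustration: the function names are hypothetical, the chi-square CDF is computed with a simple series suitable for moderate degrees of freedom, and the inputs $W$ = 1.69, $C$ = 1.33, $m$ = 10, $n$ = 5 are borrowed from the application in Section 4.

```python
import math


def b(f: int) -> float:
    """b_f = sqrt(2/f) * Gamma((f+1)/2) / Gamma(f/2), computed on the log scale."""
    return math.sqrt(2.0 / f) * math.exp(
        math.lgamma((f + 1) / 2.0) - math.lgamma(f / 2.0)
    )


def chi2_cdf(x: float, nu: int) -> float:
    """P(chi-square with nu d.f. <= x), via the series for the lower incomplete gamma."""
    if x <= 0.0:
        return 0.0
    a, z = nu / 2.0, x / 2.0
    term = total = 1.0 / a
    denom = a
    while term > total * 1e-16:
        denom += 1.0
        term *= z / denom
        total += term
    return total * math.exp(a * math.log(z) - z - math.lgamma(a))


def p_value(w: float, c: float, m: int, n: int) -> float:
    """P(chi2_{m(n-1)} <= [m(n-1)-1] * b^2 * C^2 / W^2), evaluated at C_p = C."""
    nu = m * (n - 1)
    return chi2_cdf((nu - 1) * b(nu - 1) ** 2 * c ** 2 / w ** 2, nu)


print(p_value(1.69, 1.33, 10, 5))
```

With these inputs the p-value falls below 0.05, consistent with rejecting $H_0$ at the 5% level. Evaluating the same function at $W = c^*$ returns approximately $\alpha$.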


The power of the UMP test (the probability of correctly rejecting the null hypothesis $C_p \le C$ when the true $C_p > C$) can also be computed for a given alternative hypothesis $H_1: C_p = C_1 > C$. The power of the test, denoted by $\pi(C_p)$, can be computed as follows:

$$\begin{aligned}
\pi(C_p) &= P\{\hat{C}_p^* > c^* \mid C_p = C_1\} \\
&= P\left\{\frac{\sqrt{m(n-1)-1}\;b_{m(n-1)-1}\,C_1}{\sqrt{K}} \ge c^* \;\middle|\; C_p = C_1\right\} \\
&= P\left\{\frac{[m(n-1)-1]\,b^2_{m(n-1)-1}\,C_1^2}{K} \ge (c^*)^2 \;\middle|\; C_p = C_1\right\} \\
&= P\left\{\chi^2_{m(n-1)} \le \frac{[m(n-1)-1]\,b^2_{m(n-1)-1}\,C_1^2}{(c^*)^2} \;\middle|\; C_p = C_1\right\}.
\end{aligned}$$

4. An Application

Consider a forging manufacturing process making a specific type of piston ring for automotive engines. The engineers wish to establish precision control of the inside diameter of the piston rings to monitor the process performance for this particular type of piston ring, using the process precision index $C_p$. The specification limits for the inside diameter of the piston ring are set to the upper specification limit USL = 74.050 mm and the lower specification limit LSL = 73.950 mm.

Ten samples, each of size five, are taken from the process, which is demonstrably in control (stable). The inside diameter measurements for the ten samples are displayed in Table II. The minimal precision requirement of this process is set to $C_p$ = 1.33 in the factory, a value commonly used within the automotive industry as a capability benchmark. To test whether the piston ring manufacturing process meets the precision requirement, we use the UMP test developed for multiple samples for the hypotheses $H_0: C_p \le 1.33$ versus the alternative $H_1: C_p > 1.33$, to obtain a reliable decision with risk $\alpha$ = 0.05. The calculated sample means and sample variances for the ten samples are tabulated in Table III. We also ran the SAS computer software to obtain the critical value 1.60 for risk $\alpha$ = 5%. Thus, we have

$$S_p^2 = \frac{1}{m(n-1)}\sum_{i=1}^{m}(n-1)S_i^2 = \frac{1}{m}\sum_{i=1}^{m}S_i^2 = 0.000093, \qquad \hat{C}_p^* = 1.69, \qquad c^* = 1.60.$$

Since the value of $\hat{C}_p^*$ calculated from the sample data, 1.69, is greater than the critical value 1.60, we may conclude, with 95% confidence, that the process meets the precision requirement $C_p > 1.33$. The probability of wrongly judging an incapable process as a capable one is 5%.

Table II. The collected sample data (10 samples, a total of 50 observations), inside diameter in mm.

 Sample       Observations
    1      73.995   73.992   74.001   74.011   74.004
    2      73.992   74.007   74.015   73.989   74.014
    3      73.985   74.003   73.993   74.015   73.988
    4      73.988   74.000   73.990   74.007   73.995
    5      73.994   73.998   73.994   73.995   73.990
    6      74.012   74.014   73.998   73.999   74.007
    7      74.006   74.010   74.018   74.003   74.000
    8      73.988   74.001   74.009   74.005   73.996
    9      74.015   74.008   73.993   74.000   74.010
   10      73.982   73.984   73.995   74.017   74.013

Table III. The calculated sample mean and sample variance for the 10 samples.

 Sample    Sample mean    Sample variance
    1        74.001          0.000056
    2        74.003          0.000149
    3        73.997          0.000150
    4        73.996          0.000060
    5        73.994          0.000008
    6        74.006          0.000053
    7        74.007          0.000049
    8        74.000          0.000067
    9        74.005          0.000076
   10        73.998          0.000262
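The calculation in this section can be retraced with a short Python sketch. It is a sketch under stated assumptions: it starts from the rounded sample variances of Table III rather than the raw data, takes the critical value 1.60 as quoted from the text, and uses hypothetical variable names. Small differences in the last digit relative to the reported 1.69 may therefore arise from rounding.

```python
import math


def b(f: int) -> float:
    """b_f = sqrt(2/f) * Gamma((f+1)/2) / Gamma(f/2), computed on the log scale."""
    return math.sqrt(2.0 / f) * math.exp(
        math.lgamma((f + 1) / 2.0) - math.lgamma(f / 2.0)
    )


USL, LSL = 74.050, 73.950           # specification limits in mm
n = 5                               # observations per sample
sample_vars = [0.000056, 0.000149, 0.000150, 0.000060, 0.000008,
               0.000053, 0.000049, 0.000067, 0.000076, 0.000262]  # Table III

m = len(sample_vars)
nu = m * (n - 1)                    # pooled degrees of freedom, here 40
s_p2 = sum(sample_vars) / m         # pooled variance S_p^2
d_p = math.sqrt((nu - 1) / nu) * b(nu - 1) / math.sqrt(s_p2)
cp_hat = (USL - LSL) * d_p / 6.0    # estimator of C_p

c_star = 1.60                       # critical value quoted in the text
print(round(s_p2, 6), cp_hat, cp_hat > c_star)
```

The pooled variance reproduces the reported 0.000093, and the estimate lies between 1.69 and 1.70, above the critical value, so the sketch reaches the same decision as the text.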

5. Conclusions

The process precision index $C_p$ has been widely used in the manufacturing industry to provide numerical measures of process potential. It measures the overall process variation relative to the specification tolerance. The statistical properties of the estimated $C_p$ based on one single sample have been investigated extensively, but the properties of the estimated $C_p$ based on multiple samples have been comparatively neglected. In this paper, we considered an estimator of $C_p$, denoted by $\hat{C}_p^*$, based on multiple random samples, each of size $n$, and investigated its statistical properties. We showed that the estimator $\hat{C}_p^*$ is the UMVUE of $C_p$ and that it is asymptotically efficient. In addition, we showed that the test based on the UMVUE of $C_p$ is the UMP test. Using this test, practitioners can make reliable decisions on whether their processes meet the precision requirement preset in the factory, with the decision error minimized.

References

Arnold, S. F. (1990). Mathematical Statistics. Prentice Hall.

Bickel, P. & Doksum, K. A. (1977). Mathematical Statistics. San Francisco: Holden-Day.

Chou, Y. M. & Owen, D. B. (1989). On the distributions of the estimated process capability indices. Communications in Statistics: Theory and Methods 18: 4549–4560.

Lindgren, B. W. (1968). Statistical Theory. New York: Macmillan.

Kane, V. E. (1986). Process capability indices. Journal of Quality Technology 18(1): 41–52.

Kirmani, S. N. U. A., Kocherlakota, K. & Kocherlakota, S. (1991). Estimation of σ and the process capability index based on subsamples. Communications in Statistics: Theory and Methods 20: 275–291.

Kotz, S. & Lovelace, C. R. (1998). Process Capability Indices in Theory and Practice.

Kocherlakota, S. (1992). Process capability index: Recent developments. Sankhya: The Indian Journal of Statistics 54: 352–369.

Pearn, W. L., Lin, G. H. & Chen, K. S. (1998). Distributional and inferential properties of the process accuracy and process precision indices. Communications in Statistics: Theory and Methods 27: 985–1000.
