
Robust Estimation in Multivariate Control Chart

In most regression problems, the parameter estimators are correlated. Thus a popular procedure in the retrospective Phase I analysis is the $T^2$ control chart, which is a tool for detecting multivariate outliers and shifts. To construct the $T^2$ chart, we need to estimate the mean vector and covariance matrix from a set of historical data. However, multiple outliers may go undetected because of their biasing effect on the estimators, which is known as the masking effect. Efforts to address this problem have focused on robust estimation in the presence of multiple outliers, especially for the covariance matrix. A number of researchers have studied robust estimation in multivariate settings, for example, Rousseeuw (1984) and Rousseeuw and Leroy (1987).

The $T^2$ statistics based on the usual sample variance-covariance matrix estimator (denoted by $T^2_{usu}$) and the successive-difference variance-covariance matrix estimator (denoted by $T^2_{dif}$) are most commonly used for multivariate control charts. Sullivan and Woodall (1996) showed that $T^2_{usu}$ and $T^2_{dif}$ are not effective in detecting more than a very small number of outliers. Sullivan and Woodall (1996) and Vargas (2003) both showed that $T^2_{dif}$ is effective in detecting both sustained step and ramp shifts in the mean vector. Sullivan and Woodall (1996) found not only that $T^2_{usu}$ is less effective in detecting a shift in the mean vector, but also that its power to detect the shift decreases as the magnitude of the shift increases. They found that $T^2_{usu}$ has the effect of “pooling” all the data together, so that a large step shift “inflates” the variance and makes the shift difficult to detect.
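To make the two estimators concrete, the following minimal sketch computes $T^2$ statistics from the usual sample covariance matrix and from a successive-difference covariance matrix. It assumes the common form of the successive-difference estimator, $S_D = \frac{1}{2(m-1)} \sum_{i=1}^{m-1} v_i v_i'$ with $v_i = x_{i+1} - x_i$, which is not written out in the text above. The function names and the simulated data are illustrative only.

```python
import numpy as np

def t2_statistics(x, cov):
    """T^2 value for each row of x, centered at the sample mean."""
    centered = x - x.mean(axis=0)
    # Solve cov * z = centered' instead of forming the inverse explicitly.
    solved = np.linalg.solve(cov, centered.T).T
    return np.einsum("ij,ij->i", centered, solved)

def usual_cov(x):
    """Usual sample covariance matrix (divisor m - 1)."""
    return np.cov(x, rowvar=False)

def successive_difference_cov(x):
    """Assumed form: sum of v_i v_i' over successive differences, / (2 (m - 1))."""
    diffs = np.diff(x, axis=0)  # (m - 1) x p matrix of x_{i+1} - x_i
    return diffs.T @ diffs / (2.0 * (x.shape[0] - 1))

# Simulated Phase I data with a sustained step shift halfway through.
rng = np.random.default_rng(0)
m, p = 40, 2
x = rng.normal(size=(m, p))
x[m // 2:] += 3.0

t2_usu = t2_statistics(x, usual_cov(x))
t2_dif = t2_statistics(x, successive_difference_cov(x))
```

In this simulated setting the step shift enters every term of the usual covariance estimate but only one of the successive differences, which is one way to see the “pooling” effect described above.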

Vargas (2003) and Jensen, Birch, and Woodall (2005) showed that the $T^2$ statistics based on high-breakdown estimators, such as the minimum volume ellipsoid (MVE) and minimum covariance determinant (MCD) methods of Rousseeuw (1984) (denoted respectively by $T^2_{mve}$ and $T^2_{mcd}$), are excellent in detecting multiple outliers in Phase I. Jensen, Birch, and Woodall (2005) further investigated which of the MVE and MCD estimators is more advantageous for given combinations of the sample size and the number of outliers present in multivariate Phase I applications. The MVE estimator is preferred for smaller sample sizes and a smaller percentage of outliers, while the MCD estimator is preferred for larger sample sizes and/or larger percentages of outliers. The simulations and generated control limits presented there give useful guidelines about the situations for which the high-breakdown approach is most appropriate. Jensen, Birch, and Woodall (2005) also discussed some properties of the MVE and MCD methods along with their computing algorithms.

The distributions of the exact MCD and MVE estimators of location and scale are not known in closed form. However, the asymptotic distributions of the MVE and MCD estimators can be derived. Davies (1992) showed that the exact MVE estimators of location and scale are consistent for the mean vector and covariance matrix, respectively, provided that the random error vectors are i.i.d. Similar results were given in Butler, Davies, and Jhun (1993) for the exact MCD estimators. However, the MCD estimators converge to their population counterparts at a rate of $n^{-1/2}$, while the MVE estimators converge at the slower rate of $n^{-1/3}$; thus the MCD estimators are more efficient. In addition, the distribution of the MCD estimator of location converges to a normal distribution, which is not necessarily the case for the MVE estimator of location. Thus, the asymptotic properties of the MCD estimators are superior to those of the MVE estimators. Davies (1997) and Butler, Davies, and Jhun (1993) also indicated that the $T^2_{mve}$ and $T^2_{mcd}$ statistics converge in distribution to a $\chi^2_p$ distribution for $i = 1, \ldots, m$. Hardin and Rocke (2005) provided an improved $F$ approximation for the MCD estimator that gives accurate outlier rejection points for various sample sizes. So it may be useful to study the use of approximate control limits, which are much simpler to obtain than those obtained via simulation. They believed that large sample sizes are likely to be needed for the $\chi^2_p$ approximation to be sufficiently accurate.

The hybrid algorithm of Rocke and Woodruff (1996) is a combination of the data partitioning methods of Woodruff and Rocke (1994), the FSA algorithm involving the MCD from Hawkins (1994), and M-estimation. This hybrid algorithm is very effective in detecting a larger percentage of outliers. Rousseeuw and Van Driessen (1999) proposed an algorithm, which they called FAST-MCD, that is based on an iterative scheme and the MCD estimators. The FAST-MCD method is able to handle large data sets within a reasonable amount of time.
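The text describes FAST-MCD only as an iterative scheme. As a rough illustration, the sketch below implements what is usually described as its core, the concentration step: refit the mean and covariance on a subset of $h$ points, then keep the $h$ points with the smallest Mahalanobis distances. This is an assumption about the scheme meant here, and it is deliberately simplified to a single random start, whereas the published algorithm uses many starts and further refinements. All names and the simulated data are hypothetical.

```python
import numpy as np

def mahalanobis_sq(x, center, cov):
    """Squared Mahalanobis distance of every row of x from center."""
    d = x - center
    return np.einsum("ij,ij->i", d, np.linalg.solve(cov, d.T).T)

def c_step(x, subset, h):
    """Refit mean and covariance on `subset`, then keep the h closest points."""
    center = x[subset].mean(axis=0)
    cov = np.cov(x[subset], rowvar=False)
    dist = mahalanobis_sq(x, center, cov)
    return np.sort(np.argsort(dist)[:h])

def concentrate(x, h, max_iter=50, seed=0):
    """Iterate concentration steps from one random start until the subset is stable."""
    rng = np.random.default_rng(seed)
    subset = np.sort(rng.choice(x.shape[0], size=h, replace=False))
    # Track the covariance determinant of each subset visited.
    dets = [np.linalg.det(np.cov(x[subset], rowvar=False))]
    for _ in range(max_iter):
        new_subset = c_step(x, subset, h)
        if np.array_equal(new_subset, subset):
            break
        subset = new_subset
        dets.append(np.linalg.det(np.cov(x[subset], rowvar=False)))
    return subset, dets

# Simulated data: 60 bivariate points, the first ten shifted to act as outliers.
rng = np.random.default_rng(1)
x = rng.normal(size=(60, 2))
x[:10] += 8.0
subset, dets = concentrate(x, h=45)
```

The recorded determinants should never increase from one step to the next, which is why the iteration terminates. A single start can still stop at a local solution, so a practical implementation would compare several starts.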

In Section 4, we give a brief overview of various robust estimation methods based on the MCD method for multivariate Phase I applications. In addition to using the MCD estimators in the $T^2$ statistic, we use the false discovery rate (FDR) procedure proposed by Benjamini and Hochberg (1995) for determining outliers in the data set, in an attempt to enhance the detection power of the control chart. The proposed scheme is compared with the MCD-based $T^2$ control chart in Sections 5 and 6.
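For readers unfamiliar with the FDR procedure, the following minimal sketch shows the standard Benjamini-Hochberg step-up rule: order the $m$ p-values, find the largest rank $k$ with $p_{(k)} \le kq/m$, and reject the $k$ hypotheses with the smallest p-values. How p-values are obtained from the $T^2$ statistics is not specified in this section, so the inputs below are made-up numbers and the function name is hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans, True where the hypothesis is rejected at FDR level q."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that p_(k) <= k * q / m (0 if there is none).
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Made-up p-values, one per observation in a hypothetical Phase I data set.
p_values = [0.02, 0.9, 0.035, 0.03]
flags = benjamini_hochberg(p_values, q=0.05)
```

Because the rule is step-up, an observation can be flagged even when its own p-value exceeds its rank-specific threshold, provided a larger ordered p-value meets its threshold.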

3 Modeling Nonlinear Profiles

3.1 Nonlinear Regression Model

Assume that there are $m$ sample profiles in the historical data for Phase I analysis. For each sample $i$ we observe the response variable $y_{ij}$ and a set of predictor variables $x_{ij}$ (here $k = 1$), $j = 1, \ldots, n$, $i = 1, \ldots, m$. We present the nonlinear regression model (2.3) in matrix form. The covariate matrix is $X = (x_{ij})$; the parameter matrix collects the vectors $\beta_1, \ldots, \beta_m$, where $\beta_i$ is the $p \times 1$ vector to be estimated for profile $i$; and $\varepsilon = (\varepsilon_{ij})$ is the random error matrix, where the $\varepsilon_{ij}$ are assumed to be i.i.d. normal random variables with mean zero and variance $\sigma^2$.

To simplify notation, we rewrite the form in (2.3) by stacking the $n$ observations within each profile as $y_i = (y_{i1}, y_{i2}, \ldots, y_{in})'$, $f(x_i, \beta_i) = (f(x_{i1}, \beta_i), f(x_{i2}, \beta_i), \ldots, f(x_{in}, \beta_i))'$, and $\varepsilon_i = (\varepsilon_{i1}, \varepsilon_{i2}, \ldots, \varepsilon_{in})'$. The vector form is then given by

$$y_i = f(x_i, \beta_i) + \varepsilon_i, \quad i = 1, 2, \ldots, m. \tag{3.5}$$

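For the nonlinear regression model given in (3.5), we must first obtain the estimate of $\beta_i$ for each profile. This is usually accomplished by employing the Gauss-Newton procedure and iterating until convergence to obtain the least squares estimates. Define the $n \times p$ matrix of the derivatives of $f(x_i, \beta_i)$ with respect to $\beta_i$ as

$$F(\beta_i) = \frac{\partial f(x_i, \beta_i)}{\partial \beta_i'} = \left[ \frac{\partial f(x_{ij}, \beta_i)}{\partial \beta_{ir}} \right]_{j = 1, \ldots, n;\ r = 1, \ldots, p}. \tag{3.6}$$

Then an iterative solution for $\hat\beta_i$ is given by

$$\hat\beta_i^{(a+1)} = \hat\beta_i^{(a)} + \left( \hat F_i^{(a)\prime} \hat F_i^{(a)} \right)^{-1} \hat F_i^{(a)\prime} \left( y_i - f(x_i, \hat\beta_i^{(a)}) \right), \tag{3.7}$$

where $\hat F_i^{(a)} = F(\hat\beta_i^{(a)})$ is the derivative matrix evaluated at the current iterate.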
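The iteration in (3.7) can be sketched for a single profile as follows. The profile function $f(x, \beta) = \beta_0 (1 - e^{-\beta_1 x})$, the starting values, and the simulated data are all hypothetical choices made for illustration, since the text does not fix a particular $f$. The step is computed by least squares on the derivative matrix, which is algebraically the same update as (3.7) without forming the inverse explicitly.

```python
import numpy as np

def model(x, beta):
    # Hypothetical profile function f(x, beta) = beta0 * (1 - exp(-beta1 * x)).
    return beta[0] * (1.0 - np.exp(-beta[1] * x))

def jacobian(x, beta):
    # n x p matrix of derivatives of f with respect to beta, as in (3.6).
    e = np.exp(-beta[1] * x)
    return np.column_stack([1.0 - e, beta[0] * x * e])

def gauss_newton(x, y, beta0, max_iter=50, tol=1e-10):
    """Plain Gauss-Newton iteration (no step-size control)."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        F = jacobian(x, beta)
        resid = y - model(x, beta)
        # Least-squares solution of F * step = resid, i.e. (F'F)^{-1} F' resid.
        step, *_ = np.linalg.lstsq(F, resid, rcond=None)
        beta = beta + step
        if np.linalg.norm(step) < tol * (1.0 + np.linalg.norm(beta)):
            break
    return beta

# One simulated profile with n = 20 observations.
rng = np.random.default_rng(0)
x = np.linspace(0.5, 10.0, 20)
beta_true = np.array([2.0, 0.5])
y = model(x, beta_true) + rng.normal(scale=0.05, size=x.size)
beta_hat = gauss_newton(x, y, beta0=[1.5, 0.4])
```

Plain Gauss-Newton can fail to converge from poor starting values, so a practical implementation would usually add step halving or a similar safeguard.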

See Myers (1990, Chapter 9) or Schabenberger and Pierce (2002, Chapter 5) for a concise discussion of nonlinear regression model estimation. A more detailed treatment can be found in Gallant (1987) or Seber and Wild (2003).

Unlike linear regression, the small-sample distribution of parameter estimators in nonlinear regression is unobtainable, even when the errors $\varepsilon_{ij}$ are assumed to be i.i.d. normal random variables. Let $F(\hat\beta_i)$ $(= \hat F_i)$ be the derivative matrix in (3.6) evaluated at the parameter vector estimate $\hat\beta_i$. Seber and Wild (2003, Chapter 12) gave the asymptotic distribution of $\hat\beta_i$ as well as the assumptions and regularity conditions needed for it. When the following assumptions and regularity conditions

1. The $\varepsilon_{ij}$ are i.i.d. with mean zero and variance $\sigma^2$.

2. For each $i$, $f(x_i, \beta_i)$ is a continuous function of $\beta_i$ for $\beta_i \in B$, where $B$ is a closed, bounded subset of $\mathbb{R}^p$.

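3. $\beta_i$ is an interior point of $B$. Let $B^*$ be an open neighborhood of $\beta_i$ in $B$.

4. The first and second derivatives, $\partial f(x_i, \beta_i) / \partial \beta_{ir}$ and $\partial^2 f(x_i, \beta_i) / \partial \beta_{ir} \partial \beta_{is}$ $(r, s = 1, 2, \ldots, p)$, exist and are continuous for all $\beta_i \in B^*$.

5. $n^{-1} F(\beta_i)' F(\beta_i)$ converges to some matrix $\Omega(\beta_i)$ uniformly in $\beta_i$ for $\beta_i \in B^*$.

6. $n^{-1} \sum_{j=1}^{n} \left[ \partial^2 f(x_{ij}, \beta_i) / \partial \beta_{ir} \partial \beta_{is} \right]^2$ converges uniformly in $\beta_i$ for $\beta_i \in B^*$.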

7. $\Omega_i = \Omega(\beta_i)$ is nonsingular.

hold, the asymptotic distribution of $\hat\beta_i$ is given by

$$\sqrt{n}\,(\hat\beta_i - \beta_i) \longrightarrow N_p(0, \sigma^2 \Omega_i^{-1}). \tag{3.8}$$

Also, $n^{-1} \hat F_i' \hat F_i$ is a strongly consistent estimator of $\Omega_i$. For practical purposes, the distribution given by (3.8) cannot be calculated since the matrix $\Omega_i$ is unknown. Instead, the following approximate asymptotic distribution of $\hat\beta_i$ is commonly used:

$$\hat\beta_i \approx N_p(\beta_i, \sigma^2 (F_i' F_i)^{-1}). \tag{3.9}$$
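As an illustration of how (3.9) might be used in practice, the sketch below forms a plug-in covariance estimate by replacing $F_i$ with the estimated derivative matrix $\hat F_i$ and $\sigma^2$ with the residual mean square $\mathrm{RSS}/(n - p)$. Both substitutions are assumptions made here for illustration and are not stated in the text above. The function name and the toy inputs are hypothetical.

```python
import numpy as np

def approx_covariance(F_hat, residuals):
    """Plug-in estimate of sigma^2 (F'F)^{-1} from (3.9)."""
    n, p = F_hat.shape
    # Assumption: sigma^2 is estimated by the residual mean square RSS / (n - p).
    sigma2_hat = residuals @ residuals / (n - p)
    return sigma2_hat * np.linalg.inv(F_hat.T @ F_hat)

# Toy inputs (made up): a 3 x 2 derivative matrix and a residual vector
# chosen to be orthogonal to its columns, as least squares residuals would be.
F_hat = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
residuals = np.array([1.0, 1.0, -1.0])

cov_beta = approx_covariance(F_hat, residuals)
std_errors = np.sqrt(np.diag(cov_beta))  # approximate standard errors of beta_hat
```

The diagonal of the result gives approximate variances of the individual parameter estimates, and the off-diagonal entries show the correlation between estimators that motivates a multivariate chart.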

For the “in-control” case, we have $\beta_i = \beta$ for all $m$ samples, where $\beta$ is the in-control parameter vector. Accordingly, the $\Omega_i$ (and $F_i$) matrices are the same for all $m$ profiles if all profiles have the same underlying function $f$, the same $x$-values, and the same values of $\beta_i$. However, the $\hat F_i$ matrices are not equal, since the $\hat\beta_i$ values vary from profile to profile.
