Process monitoring based on control charting usually consists of two phases: Phase I and Phase II. The purpose of this paper is to propose and study a new strategy for Phase I process monitoring. In Phase I, process data are collected and analyzed with the goals of evaluating process stability and modeling the in-control process. Control charts are therefore often used in Phase I to signal out-of-control conditions of the process so that corrective actions can be taken to bring the process to the in-control state.
To construct a suitable control chart for the on-line process monitoring in Phase II, good estimates of the in-control process parameters, e.g., the mean and standard deviation of the monitoring statistic, are needed for setting up reliable control limits. For this, we need a set of in-control process data.
In Phase I, samples are collected from the process to determine whether they come from the in-control process. The current practice is to use the dataset to set up preliminary control limits for the monitoring statistic, such as X̄, R, or S, to identify potential “out-of-control” points. For simplicity, we consider only the points that exceed the control limits as “out-of-control” points in this study; other rules, such as run rules, can be added in real applications. If there are samples exceeding the control limits, the operators or process engineers should investigate the process to see whether there are any assignable causes for these beyond-limits points. If assignable causes are indeed found, appropriate corrective actions should be taken to eliminate them. If any of the corrective actions changes the process itself, we need to re-collect data from the new process and restart the whole screening process. If the assignable causes do not affect the current process, we can simply discard the out-of-control points and redo the control charting using the remaining data. If no assignable causes can be found, one can choose either to keep or to discard these points. No one knows which action is correct without further information, since the points may exceed the limits simply by chance or there may be some undiscovered assignable causes. To be conservative, many practitioners choose to discard these beyond-limits points to avoid potential contamination of the dataset. After these beyond-limits points are excluded, another set of preliminary control limits is calculated from the remaining data for further screening of out-of-control data points. The above screening steps are repeated until no more beyond-limits points are found.
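The discard-all screening loop described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the procedure as implemented in this study: the function names `xbar_limits` and `discard_all`, the pooled-variance estimate of the process standard deviation, the normal-quantile limits, the default individual false-alarm rate of 0.0027, and the data are all hypothetical choices made for the example.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def xbar_limits(subgroups, alpha=0.0027):
    """Preliminary X-bar limits from equal-size subgroups (hypothetical helper).

    Assumes normal-theory limits and a pooled within-subgroup variance.
    """
    n = len(subgroups[0])
    center = mean(mean(g) for g in subgroups)
    sigma = sqrt(mean(variance(g) for g in subgroups))
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return center - half_width, center + half_width


def discard_all(subgroups, alpha=0.0027):
    """Drop every beyond-limits subgroup, recompute the limits, and repeat."""
    kept = list(range(len(subgroups)))
    removed = []
    while len(kept) > 1:
        lcl, ucl = xbar_limits([subgroups[i] for i in kept], alpha)
        out = [i for i in kept if not lcl <= mean(subgroups[i]) <= ucl]
        if not out:
            break  # no beyond-limits points remain
        removed.extend(out)
        kept = [i for i in kept if i not in out]
    return kept, removed


# Nine subgroups with means near 0 plus one clearly shifted subgroup (made-up data).
data = [[-1, 1], [0, 2], [-2, 0]] * 3 + [[9, 11]]
kept, removed = discard_all(data)
print(removed)
```

The sketch covers only the case in which no assignable cause is found and the practitioner chooses to discard; the investigation step itself is outside what code can represent.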
In this paper, we study this practice and find that it tends to mistakenly screen out too many in-control data points. Thus we propose a more effective procedure for collecting in-control data for Phase II usage.
Statistically, in any control charting there is a possibility that some in-control samples are wrongly discarded and some out-of-control samples remain undetected; these are analogous to committing Type I and Type II errors in hypothesis testing, respectively. A good control chart should be able to control both error rates. However, it is surprising to observe that the practice of discarding all the beyond-limits points (when no assignable causes are found) can be quite inefficient, in the sense that more in-control samples than expected are discarded.
On the other hand, it is also well known that, when the data are contaminated by out-of-control samples, more out-of-control samples than expected usually go undetected by the preliminary control limits. The reason is that these out-of-control samples usually introduce more variation into the data, which makes the preliminary control limits too wide.
To detect these out-of-control samples while avoiding the loss of too many in-control samples, we propose an add-on iterative procedure, called the One-At-A-Time (OAAT) procedure, that discards only the most extreme beyond-limits point and then updates the control limits at each iteration. The simulation study shows that, with control limits constructed under the same overall false-alarm rate (defined later), the OAAT procedure screens out far fewer in-control samples than the traditional discard-all practice and in general still has about the same power in detecting out-of-control samples.
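A minimal Python sketch of the one-at-a-time idea follows. It is an illustration under assumptions rather than the exact procedure studied here: "most extreme" is taken to mean the beyond-limits subgroup mean farthest from the center line, the limits use a pooled-variance sigma estimate with normal quantiles, and the names `xbar_limits` and `oaat` and the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def xbar_limits(subgroups, alpha=0.0027):
    """Preliminary X-bar limits (hypothetical helper; pooled variance, normal quantile)."""
    n = len(subgroups[0])
    center = mean(mean(g) for g in subgroups)
    sigma = sqrt(mean(variance(g) for g in subgroups))
    half_width = NormalDist().inv_cdf(1 - alpha / 2) * sigma / sqrt(n)
    return center - half_width, center + half_width


def oaat(subgroups, alpha=0.0027):
    """Remove only the most extreme beyond-limits subgroup, then update the limits."""
    kept = list(range(len(subgroups)))
    removed = []
    while len(kept) > 1:
        lcl, ucl = xbar_limits([subgroups[i] for i in kept], alpha)
        center = (lcl + ucl) / 2
        out = [i for i in kept if not lcl <= mean(subgroups[i]) <= ucl]
        if not out:
            break  # no beyond-limits points remain
        # Assumption: "most extreme" = farthest from the current center line.
        worst = max(out, key=lambda i: abs(mean(subgroups[i]) - center))
        kept.remove(worst)
        removed.append(worst)
    return kept, removed


# Made-up data: subgroup 9 is strongly shifted; subgroup 8 is a low but plausible point.
# The first-pass limits are roughly (-1.25, 4.75), so both 8 and 9 start beyond limits.
data = [[-1, 1]] * 8 + [[-3.5, -1.5], [19, 21]]
kept, removed = oaat(data)
print(removed)
```

In this made-up dataset the shifted subgroup pulls the first-pass center line upward, so subgroup 8 initially falls below the lower limit. A discard-all rule would drop both points on the first pass, whereas removing only subgroup 9 and updating the limits brings subgroup 8 back inside.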
We remark here that the traditional practice inspects the process for assignable causes whenever an out-of-control signal occurs, while the new procedure performs the investigation only at the end of the whole iterative process. The final number of beyond-limits points is most likely less than or equal to that of the discard-all practice. Thus the new practice can reduce the number of investigations and possible adjustments of the process, which may considerably reduce costs. Of course, the new procedure needs more computation, which is no longer an issue given the computing power available today.
Another important issue to address is the criterion for evaluating the performance of control charts used in Phase I analysis. Currently in the literature, the evaluation is mostly based on the so-called “signal probability” (Sullivan and Woodall, 1996), which is defined as the family-wise signal rate, that is, the probability that at least one sample point in the dataset signals out of control. The overall false-alarm rate mentioned above is the signal probability when the process is in control. The signal probability criterion can therefore only evaluate how effectively a control scheme judges whether the whole dataset comes from the in-control process.
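To give a feel for how the overall false-alarm rate grows with the number of samples, the short Python sketch below evaluates the textbook relation 1 - (1 - a)^m between an individual rate a and the probability of at least one signal among m in-control samples. The independence of the m signals is an assumption made for illustration; in Phase I the limits are estimated from the same data, so the relation is only approximate. The function name `overall_false_alarm_rate` is hypothetical.

```python
def overall_false_alarm_rate(individual_rate, m):
    """P(at least one of m in-control samples signals), assuming independent signals."""
    return 1 - (1 - individual_rate) ** m


# Individual rate 0.0027 (the usual 3-sigma value) for several dataset sizes.
for m in (20, 50, 100):
    print(m, round(overall_false_alarm_rate(0.0027, m), 3))
```

With m = 50 samples, for example, the overall rate under these assumptions is already about 0.13, far above the individual rate.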
Since signal probability cannot distinguish cases with different numbers of out-of-control points, it is not a good measure for comparing the performance of different screening methods. To better fit the purpose of Phase I analysis, we suggest using the expected number of correctly rejected samples and the expected number of wrongly rejected samples as the comparison criteria. The former measures the detection power and the latter measures the frequency of false alarms.
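The two suggested criteria can be estimated from simulated datasets by counting, for each dataset, how many rejected samples were truly out of control and how many were not, and then averaging. The Python sketch below illustrates the bookkeeping only; the function names `rejection_counts` and `expected_counts` and the two example runs are hypothetical.

```python
from statistics import mean


def rejection_counts(removed, truly_out):
    """Return (correctly rejected, wrongly rejected) for one screened dataset."""
    removed, truly_out = set(removed), set(truly_out)
    return len(removed & truly_out), len(removed - truly_out)


def expected_counts(runs):
    """Average the two counts over simulated datasets.

    `runs` is a list of (removed, truly_out) index pairs, one per dataset.
    """
    counts = [rejection_counts(r, t) for r, t in runs]
    return mean(c for c, _ in counts), mean(w for _, w in counts)


# Two made-up runs: both signal, but the second also discards an in-control sample.
runs = [([9], [9]), ([8, 9], [9])]
print(expected_counts(runs))
```

Both example runs would count equally toward a signal probability, since each contains at least one signal, while the average number of wrongly rejected samples separates them.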
In genetic research, significant genes are often selected by controlling the family-wise error rate (FWER) in multiple hypothesis testing. Classical methods such as the Bonferroni approach for testing the significance of each gene suffer a tremendous loss of power since the number of genes under investigation is huge, say, in the thousands. To overcome this difficulty, Benjamini and Hochberg (1995) suggested a sequential p-value method to control the false discovery rate (FDR) (to be defined later) for finding the significant genes. They claimed that this FDR procedure has better power than the Bonferroni approach. Since screening out-of-control data points is similar to finding significant genes, we are interested in the effectiveness of this sequential p-value method in our application.
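For readers unfamiliar with the sequential p-value method, the following Python sketch shows the standard Benjamini-Hochberg step-up rule: sort the m p-values, find the largest rank k with p_(k) <= k q / m, and reject the k smallest. How a p-value is attached to each sample in the control-charting setting is not specified here, so the p-values below are made up, and the function name `benjamini_hochberg` and the level q = 0.05 are assumptions for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return the indices rejected by the Benjamini-Hochberg step-up rule at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p-value
    k = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * q / m:
            k = rank  # keep the largest rank that meets its threshold
    return sorted(order[:k])


# Made-up p-values for m = 10 samples.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(pvals))
# A Bonferroni rule at the same level compares every p-value with q / m.
print([i for i, p in enumerate(pvals) if p <= 0.05 / len(pvals)])
```

On these made-up values the step-up rule rejects the two smallest p-values, while the Bonferroni threshold of 0.005 rejects only the smallest, which illustrates the power difference the text refers to.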
The traditional approach in Phase I control charting is to control the individual false-alarm rate for each sample no matter how many samples are tested. A more recent approach is to control the overall false-alarm rate (α); usually a Bonferroni-type error rate (α/m) is used as the individual false-alarm rate for each of the m samples. We compare the Bonferroni method and the FDR method in this paper. We also apply the OAAT procedure to the FDR method (denoted by OAAT/FDR) to improve the screening process.
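As a rough illustration of the Bonferroni-type rate, the Python sketch below converts an overall rate α and a sample count m into the individual rate α/m and the corresponding two-sided limit multiplier. The normal-quantile multiplier treats the in-control parameters as known, which is an assumption for illustration only; the function name `bonferroni_multiplier` and the values α = 0.05 and m = 25 are hypothetical.

```python
from statistics import NormalDist


def bonferroni_multiplier(alpha, m):
    """Two-sided normal-quantile multiplier when each of m samples gets rate alpha/m."""
    individual_rate = alpha / m
    return NormalDist().inv_cdf(1 - individual_rate / 2)


alpha, m = 0.05, 25
k = bonferroni_multiplier(alpha, m)   # limits: center +/- k * sigma / sqrt(n)
overall = 1 - (1 - alpha / m) ** m    # overall rate if the m signals were independent
print(round(k, 3), round(overall, 4))
```

Under these assumptions the multiplier is a little above 3 and the resulting overall rate stays slightly below α, which is the conservative behavior expected of a Bonferroni-type bound.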
We shall describe the new strategy with the Shewhart X̄ control chart. The rest of this paper is organized as follows. Section 2 reviews the related fundamentals of Phase I analysis. Section 3 compares the traditional method, which controls the individual false-alarm rate, with its OAAT version (denoted by OAAT/traditional), and compares the Bonferroni method, which controls the overall false-alarm rate, with its OAAT version (denoted by OAAT/Bonferroni). Section 4 compares three methods: Bonferroni, FDR, and OAAT/FDR. Section 5 summarizes the results of the study and gives some possible future research directions.