2. Literature Review
2.4 Normal Transformation
For most industrial applications, normality is assumed because it offers analytical convenience and a body of effective statistical methods. For example, Platt et al. [60] assumed that the lead time demand is normally distributed, so the asymptotic results can be used as the EOQ ranges from zero to positive infinity to fit a theoretical curve for the order quantity Q and the reorder point R. Silva Filho [68] proposed that the cumulative demand is a random variable represented by a compound Poisson process. Because the demand affects the inventory system, a chance constraint is used to preserve the inventory constraint explicitly in a stochastic optimization model. A Gaussian approximation to the compound Poisson process is also proposed. You et al. [80] used the Box-Cox transformation method to transform the experimental data collected from a microcircuit process. However, for many engineering operations such as locating pins or automatic sensors, the manufacturing data are often truncated or appear to be non-normal. Pezdek [64] gave a non-normal data example and
performed process performance analysis. Pezdek [64] demonstrated how a non-normal characteristic can significantly affect the results and conclusions of the data analysis, and thus convey incorrect process information. If the process characteristic is not normally distributed, there are two popular approaches to transforming the non-normal data into normal data. First, Johnson [46] proposed a system of three transformation families from which a transformation to normality can be selected. Let X be a random variable and Z be a standard normal variable.
The three transformation families in the Johnson system are given by Equations (9)-(11), respectively:
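$$Z = \gamma + \eta \ln\left[\frac{X - \varepsilon}{\lambda + \varepsilon - X}\right], \quad \varepsilon < X < \varepsilon + \lambda, \qquad (9)$$

$$Z = \gamma + \eta \ln(X - \varepsilon), \quad X > \varepsilon, \qquad (10)$$

$$Z = \gamma + \eta \sinh^{-1}\left[\frac{X - \varepsilon}{\lambda}\right], \quad -\infty < X < \infty, \qquad (11)$$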
where −∞ < γ, ε < ∞, η > 0, and λ > 0 are four parameters. The distribution determined by (9) is called the SB distribution, denoted by SB(γ, η, ε, λ). Similarly, the distribution determined by (10) is called the SL distribution, denoted by SL(γ, η, ε), and the one determined by (11) is called the SU distribution, denoted by SU(γ, η, ε, λ). The subscripts B, L, and U refer to X being bounded, lognormal, and unbounded, respectively. Hahn and Shapiro [37] gave further description of these distributions. In using the Johnson system, the first step is to determine which of the three
families should be used. The next step is to estimate the parameters of the selected transformation family. A moment approach to the selection step chooses the transformation family according to the region of the (β1, β2) plane into which the estimated third (β1) and fourth (β2) standardized sample moments fall. Slifker and Shapiro [69] pointed out the major shortcomings of this procedure, such as the high mean-square errors of the sample third and fourth moments and their vulnerability to outliers.
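As an illustration of Equations (9)-(11), the three Johnson transformations can be sketched in Python as follows. This is a minimal sketch that assumes NumPy is available; the function names are hypothetical, and no checking of the domain restrictions on X is attempted.

```python
import numpy as np


def johnson_sb(x, gamma, eta, eps, lam):
    """Equation (9): bounded family, valid for eps < x < eps + lam."""
    x = np.asarray(x, dtype=float)
    return gamma + eta * np.log((x - eps) / (lam + eps - x))


def johnson_sl(x, gamma, eta, eps):
    """Equation (10): lognormal family, valid for x > eps."""
    x = np.asarray(x, dtype=float)
    return gamma + eta * np.log(x - eps)


def johnson_su(x, gamma, eta, eps, lam):
    """Equation (11): unbounded family, valid for any real x."""
    x = np.asarray(x, dtype=float)
    return gamma + eta * np.arcsinh((x - eps) / lam)
```

Each function returns the value Z that would be standard normal if X followed the corresponding Johnson distribution with the given parameters.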
Another approach, based on percentiles, prevails and is in fact the one most often adopted in practice. Johnson [46] proposed a method that uses four percentiles. Based on symmetrical points, Bukac [12] suggested procedures for estimating the parameters of the SB distribution. Later, Mage [54] presented a method of reducing Bukac’s quadratic equations to a single quadratic equation. Slifker and Shapiro [69] suggested choosing four symmetric standard normal deviates equally spaced at intervals of 2z, i.e., 3z, z, -z, and -3z, which is admittedly not a serious restriction. Bowman and Shenton [9] proposed a simple algorithmic solution for the normal deviates -sz, -z, z, and sz, where s and z are arbitrary positive constants and s > 1. Meanwhile, Owen [56] proposed the starship procedure, which searches for the transformation that most nearly transforms the sample to normality; it is not tied to the Johnson system, and many other transformations are possible. Chou et al. [20] recommended using the set Z = {z0: z0 = 0.25, 0.26, …, 1.25}, instead of a single chosen value, to fit all the Johnson distributions which are feasible
for Slifker and Shapiro’s estimation formulas. The best-fit Johnson distribution is the one that best transforms the data to normality among the z0 values in Z. However, this procedure cannot discriminate the SL distribution family from the other two families.
Chen and Kamburowska [16] proposed a procedure, called the M procedure, which achieves consistency by setting a bound on the parameter, thereby preventing an incorrect selection when the underlying distribution is an SL distribution.
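As an illustration of the percentile approach, the sketch below takes the four sample percentiles that correspond to the normal deviates -3z, -z, z, and 3z and forms a ratio from their spacings. The selection rule used here (a ratio above 1 suggesting SU, below 1 suggesting SB, and close to 1 suggesting SL) is the one usually attributed to Slifker and Shapiro [69]; it is not stated in this section and should be checked against the original paper. The function name, the default z = 0.5, and the tolerance tol are assumptions made purely for illustration.

```python
from statistics import NormalDist

import numpy as np


def johnson_family_by_percentiles(x, z=0.5, tol=0.05):
    """Suggest a Johnson family from four symmetric percentiles.

    Returns (family, ratio), where ratio = m * n / p**2.
    `tol` is an illustrative band around 1 treated as "SL".
    """
    x = np.asarray(x, dtype=float)
    # Cumulative probabilities of the normal deviates -3z, -z, z, 3z.
    std_normal = NormalDist()
    probs = [std_normal.cdf(v) for v in (-3 * z, -z, z, 3 * z)]
    x_m3z, x_mz, x_z, x_3z = np.quantile(x, probs)

    m = x_3z - x_z    # spacing of the upper pair of percentiles
    n = x_mz - x_m3z  # spacing of the lower pair of percentiles
    p = x_z - x_mz    # spacing of the central pair of percentiles
    ratio = m * n / p**2

    if ratio > 1 + tol:
        family = "SU"
    elif ratio < 1 - tol:
        family = "SB"
    else:
        family = "SL"
    return family, ratio
```

A fixed tolerance is a crude device; as noted above, distinguishing the SL family reliably is exactly the difficulty that the M procedure is meant to address.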
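Box and Cox [10] modified the family of power transformations proposed by Tukey [71].
Its simple form is defined as T_λ: y → y^(λ), where
$$y^{(\lambda)} = \begin{cases} (y^{\lambda} - 1)/\lambda, & \lambda \neq 0, \\ \ln y, & \lambda = 0. \end{cases} \qquad (12)$$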
The transformation in Equation (12) is defined for y > 0. It is hoped that, for some value of λ, non-normal data can be fitted to a normal distribution. Box and Cox [10] used the maximum likelihood method to estimate the parameter λ. An analytical expression for the accuracy of the maximum likelihood estimate of λ was derived by Draper and Cox [29]. Hinkley [39] used order statistics to estimate the transformation parameter. Later, Hinkley [40] assumed that there might be a value of λ making the transformed data nearly symmetric and proposed a similar method for choosing a symmetrizing transformation based on the degree of asymmetry of the sample, which is measured by Equation (13):
d = (sample mean – sample median)/sample scale (13)
If the underlying distribution is symmetric, then the mean and the median must be identical.
Thus, sample data drawn from such a distribution should reflect this property, and a good estimate of λ should bring the value of d as close to zero as possible.
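The two quantities discussed above can be sketched in Python as follows. The sketch assumes the standard Box-Cox form, (y^λ − 1)/λ for λ ≠ 0 and ln y for λ = 0, and takes the inter-quartile range as the sample scale in Equation (13); the function names are hypothetical.

```python
import numpy as np


def box_cox(y, lam):
    """Box-Cox power transformation, defined only for y > 0."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox transformation requires y > 0")
    if abs(lam) < 1e-12:
        # Limiting case lambda = 0 is the natural logarithm.
        return np.log(y)
    return (y**lam - 1.0) / lam


def asymmetry_d(sample):
    """Equation (13) with the inter-quartile range as the sample scale."""
    q1, median, q3 = np.percentile(sample, [25, 50, 75])
    return (np.mean(sample) - median) / (q3 - q1)
```

For a perfectly symmetric sample such as 1, 2, 3, 4, 5 the mean equals the median and d is zero; stretching the upper tail pulls the mean above the median and makes d positive.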
Based on Tukey’s [71] recommendation of restricting λ to -2 ≤ λ ≤ 2, Hinkley [40]
proposed a step-by-step procedure for computing the power of the Box-Cox transformation based on sample moments and percentiles, which may be presented as follows:
Step 1: Choose -2 as an initial guess λ0 of λ for a given random sample.
Step 2: Transform the original sample by taking the power λ0, and then find the sample mean, sample median, and sample inter-quartile range for the transformed random sample.
Step 3: Calculate d defined in Equation (13) using the inter-quartile range as the sample scale.
Step 4: Check whether the magnitude of d is less than a predetermined precision level. If not, repeat Steps 1-3 with λ increased by 0.05 to give a new value λc, until the difference between λ0 and λc is smaller than the predetermined precision level.
Step 5: Use the λ derived from Step 4 as the optimal estimate λ̂. Employ the Shapiro-Wilk [66] test to check the normality of the transformed sample.
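The steps above can be sketched in Python as a simple grid search. This is one reading of the procedure, not a definitive implementation: it assumes the standard Box-Cox form, scans λ from -2 to 2 in increments of 0.05, and stops at the first λ for which |d| falls below the precision level. Falling back to the λ with the smallest |d| when no value meets the precision level is an added assumption, as are the function names and the default precision of 0.01. The normality check relies on `scipy.stats.shapiro`.

```python
import numpy as np


def box_cox(y, lam):
    """Box-Cox power transformation (standard form), for y > 0."""
    y = np.asarray(y, dtype=float)
    if abs(lam) < 1e-12:
        return np.log(y)
    return (y**lam - 1.0) / lam


def asymmetry_d(sample):
    """Equation (13) with the inter-quartile range as the sample scale."""
    q1, median, q3 = np.percentile(sample, [25, 50, 75])
    return (np.mean(sample) - median) / (q3 - q1)


def hinkley_lambda(y, start=-2.0, stop=2.0, step=0.05, precision=0.01):
    """Steps 1-4: scan lambda upward from `start` until |d| < precision.

    Returns (lambda_hat, |d| at lambda_hat).
    """
    y = np.asarray(y, dtype=float)
    n_steps = int(round((stop - start) / step))
    best_lam, best_abs_d = start, np.inf
    for k in range(n_steps + 1):
        # Rounding avoids floating-point drift such as 1e-16 instead of 0.
        lam = round(start + k * step, 10)
        abs_d = abs(asymmetry_d(box_cox(y, lam)))
        if abs_d < best_abs_d:
            best_lam, best_abs_d = lam, abs_d
        if abs_d < precision:
            break
    return best_lam, best_abs_d


def shapiro_wilk_pvalue(sample):
    """Step 5: p-value of the Shapiro-Wilk test on the transformed sample."""
    # Imported here so that the search itself depends only on NumPy.
    from scipy.stats import shapiro

    _, p_value = shapiro(sample)
    return p_value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.lognormal(mean=0.0, sigma=0.5, size=200)  # illustrative data
    lam_hat, abs_d = hinkley_lambda(data)
    print(lam_hat, abs_d, shapiro_wilk_pvalue(box_cox(data, lam_hat)))
```

Because the scan stops at the first λ that meets the precision level, the result depends on the scan direction and the chosen precision; a finer grid or a root-finding step around the sign change of d would be natural refinements.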