6 Hierarchical priors for variable selection

In the preceding sections, we considered only independent priors for γ. In this section, we incorporate the dependent prior structure proposed by Chipman et al. (1997) into our analysis approach.

The significance of an interaction can be assumed to depend on the significance of its corresponding main effects, which is the “effect heredity” principle (Wu and Hamada, 2000), so the independent prior assumption may not be appropriate in this situation. We therefore introduce below the hierarchical priors for γ proposed by Chipman et al. (1997), which incorporate the “effect heredity” principle.

Consider an example with three main effects A, B, C and three two-factor interactions AB, AC and BC. The significance of each two-factor interaction depends on whether its corresponding main effects are included in the model. This belief can be expressed in the prior for γ = (γ_A, γ_B, γ_C, γ_AB, γ_AC, γ_BC) as follows:

P(γ) = P(γ_A) P(γ_B) P(γ_C) P(γ_AB | γ_A, γ_B) P(γ_AC | γ_A, γ_C) P(γ_BC | γ_B, γ_C). (10)

In equation (10), we assume that the second-order terms (i.e., AB, AC, BC) are independent of one another conditional on the first-order terms (i.e., A, B, C). Independence is also assumed among the main effects.

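The probability that an interaction term is active given the states of its corresponding main effects, e.g. P(γ_AB = 1 | γ_A, γ_B), takes one of four values:

P(γ_AB = 1 | γ_A, γ_B) =
    p₀₀ if (γ_A, γ_B) = (0, 0),
    p₀₁ if (γ_A, γ_B) = (0, 1),
    p₁₀ if (γ_A, γ_B) = (1, 0),
    p₁₁ if (γ_A, γ_B) = (1, 1).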

Different choices of these values represent different principles of variable selection.

For example, one might choose (p₀₀, p₀₁, p₁₀, p₁₁) = (0, 0, 0, p). Under this prior, the interaction term AB may be active only if both main effects A and B are active; otherwise AB must be inactive. This is called the “strong heredity” principle (Chipman et al., 1997). Another choice is (p₀₀, p₀₁, p₁₀, p₁₁) = (0, p₁, p₂, p₃), under which AB may be active if at least one of the main effects A and B is active. This is called the “weak heredity” principle (Chipman et al., 1997) or the “effect heredity” principle (Wu and Hamada, 2000).

Instead of setting the probabilities p₀₀, p₀₁, p₁₀ exactly to zero, a small value ε can be used. Strong heredity (0, 0, 0, p₁₁) is thus relaxed to (ε, ε, ε, p₁₁) or (0, ε, ε, p₁₁), and weak heredity (0, p₀₁, p₁₀, p₁₁) is relaxed to (ε, p₀₁, p₁₀, p₁₁). The ordering p₀₀ ≤ (p₀₁, p₁₀) ≤ p₁₁ seems natural.
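The factorization in equation (10) can be illustrated with a short sketch. The function names, the main-effect probability 0.5 and the numerical heredity values below are illustrative assumptions, not settings taken from this thesis.

```python
from itertools import product

TERMS = ("A", "B", "C", "AB", "AC", "BC")

def interaction_prob(g_a, g_b, p):
    """P(gamma_AB = 1 | gamma_A, gamma_B) for p = (p00, p01, p10, p11)."""
    p00, p01, p10, p11 = p
    return {(0, 0): p00, (0, 1): p01, (1, 0): p10, (1, 1): p11}[(g_a, g_b)]

def bernoulli(g, prob):
    """Probability of the indicator g (0 or 1) when P(g = 1) = prob."""
    return prob if g == 1 else 1.0 - prob

def prior(gamma, p_main, p_int):
    """Prior probability of gamma (a dict over TERMS) as in equation (10)."""
    value = 1.0
    for m in ("A", "B", "C"):              # independent main effects
        value *= bernoulli(gamma[m], p_main)
    for term in ("AB", "AC", "BC"):        # interactions given their parents
        a, b = term
        value *= bernoulli(gamma[term],
                           interaction_prob(gamma[a], gamma[b], p_int))
    return value

# Illustrative (assumed) heredity settings.
strong = (0.0, 0.0, 0.0, 0.5)
relaxed_weak = (0.01, 0.1, 0.1, 0.5)

# A model containing AB while A is inactive.
g = dict(zip(TERMS, (0, 1, 0, 1, 0, 0)))
print(prior(g, 0.5, strong))        # zero under strong heredity
print(prior(g, 0.5, relaxed_weak))  # small but positive when relaxed

# The prior sums to one over all 2^6 models.
models = [dict(zip(TERMS, bits)) for bits in product((0, 1), repeat=6)]
total = sum(prior(m, 0.5, relaxed_weak) for m in models)
```

Under the strong heredity setting, any model that contains an interaction without both of its main effects receives prior probability zero, whereas the relaxed setting only down-weights it.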

Higher-order polynomial terms and interactions may also be included in the model. For example, a model may contain the fourth-order term A²B², the third-order terms A²B and AB², the second-order terms A², B² and AB, and the first-order terms A and B. We define the corresponding main effects (parent terms) of a given term as those terms of the next lower order that yield the given term when multiplied by a first-order term. For example, A²B² has parents A²B (A²B multiplied by B is A²B²) and AB² (AB² multiplied by A is A²B²). These relations can be expressed as a network graph, shown in Figure 14.

Figure 14. Network graph of relations among polynomial interactions.
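The parent relation described above can be sketched as follows. A term is represented here as a dictionary of exponents, which is an assumed encoding chosen for illustration; the function names are hypothetical.

```python
def parents(term):
    """Return the parents of a polynomial term.

    `term` maps factor names to positive integer exponents, e.g.
    {"A": 2, "B": 2} for A^2 B^2. A parent is obtained by lowering one
    exponent by one, so that parent times a first-order term gives `term`.
    First-order terms have no parents in this sketch.
    """
    result = []
    for factor in sorted(term):
        reduced = dict(term)
        reduced[factor] -= 1
        if reduced[factor] == 0:
            del reduced[factor]
        if reduced:                      # skip the constant term
            result.append(reduced)
    return result

def label(term):
    """Readable label such as 'A^2B' for {"A": 2, "B": 1}."""
    return "".join(f if e == 1 else f"{f}^{e}" for f, e in sorted(term.items()))

for t in ({"A": 2, "B": 2}, {"A": 2, "B": 1}, {"A": 1, "B": 1}, {"A": 2}):
    print(label(t), "<-", [label(p) for p in parents(t)])
```

Applying `parents` repeatedly, starting from A²B², traverses the same relations that the network graph describes.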

Example 6.1. We revisit the blood glucose experiment of Example 4.1.2 in Section 4.1. We consider a model consisting of the 15 main effects and all 98 two-factor interactions. For the prior parameters ν, λ and τ, we set (ν, λ) = (0, 1) and use the modified LOOCV method to tune τ. Following Chipman et al. (1997), the hierarchical priors for γ are:

P (γ_AB² = 1|γ_AB, γ_B²) =

This prior allows interactions to be active if only one parent term is active, and even if both parents are inactive, there is a small probability that the interaction term will be active.
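The counts of 15 main effects and 98 two-factor interactions in Example 6.1 can be checked with a short enumeration. The sketch assumes the usual description of this experiment, namely one two-level factor A and seven three-level factors B to H, each three-level factor contributing a linear and a quadratic main effect; this layout is not restated in this section, so it is treated as an assumption.

```python
from itertools import combinations

# Assumed factor layout: A has two levels, B..H have three levels.
levels = {"A": 2, "B": 3, "C": 3, "D": 3, "E": 3, "F": 3, "G": 3, "H": 3}

# Each entry is (factor, effect label).
main_effects = []
for factor, n_levels in levels.items():
    main_effects.append((factor, factor))              # linear effect
    if n_levels == 3:
        main_effects.append((factor, factor + "^2"))   # quadratic effect

# Two-factor interactions pair main effects of two different factors.
interactions = [
    (u[1], v[1])
    for u, v in combinations(main_effects, 2)
    if u[0] != v[0]
]

print(len(main_effects), len(interactions))
```

Equivalently, 1 + 7 × 2 = 15 main effects, and 14 interactions involving A plus 21 × 4 = 84 among the three-level factors give 98.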

For CGS, we run 100,000 iterations and discard the first 50,000 as burn-in. From the last 50,000 iterations, we keep every fifth draw, which gives 10,000 samples. We tune τ over {1, 2, 3, 4, 5, 6} and choose τ = 3 by the modified LOOCV method.
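The burn-in and thinning scheme described above can be sketched as follows. The helper name `thin_chain` is hypothetical, and the chain here is a placeholder list of iteration indices rather than actual CGS output.

```python
def thin_chain(draws, burn_in, thin):
    """Drop the first `burn_in` draws, then keep every `thin`-th draw."""
    return draws[burn_in::thin]

# Placeholder chain: the iteration index stands in for a sampled gamma.
chain = list(range(100_000))
kept = thin_chain(chain, burn_in=50_000, thin=5)

print(len(kept))   # number of retained samples
```

With 100,000 iterations, a burn-in of 50,000 and a thinning interval of 5, this retains 50,000 / 5 = 10,000 samples.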

After applying CGS with τ = 3, the marginal posterior probabilities of all effects are shown in Figure 15. According to the median probability criterion, the result suggests that B, BH² and B²H² are active, which is consistent with the results in Hamada and Hamada (2009). Chipman et al. (1997) identified the top 10 models, and our selected model is one of them.

Figure 15. Marginal posterior probabilities for the blood glucose experiment using the relaxed weak heredity prior.
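The median probability criterion (Barbieri and Berger, 2004) selects the effects whose marginal posterior inclusion probability is at least 1/2. A minimal sketch, assuming the retained samples are stored as 0/1 indicator tuples; the effect names and the toy samples are made up for illustration and are not results from the experiment.

```python
def inclusion_probabilities(samples, names):
    """Estimate P(gamma_j = 1 | data) by the sample mean of each indicator."""
    n = len(samples)
    return {name: sum(s[j] for s in samples) / n
            for j, name in enumerate(names)}

def median_probability_model(probs, threshold=0.5):
    """Effects whose inclusion probability is at least the threshold."""
    return [name for name, p in probs.items() if p >= threshold]

# Toy indicator samples for three hypothetical effects.
names = ["x1", "x2", "x3"]
samples = [(1, 0, 0), (1, 1, 0), (1, 0, 0), (0, 0, 1)]

probs = inclusion_probabilities(samples, names)
selected = median_probability_model(probs)
print(probs, selected)
```

In this toy case only the first effect reaches the 1/2 threshold, so it alone would be declared active.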

7 Conclusion

This thesis adopts an analysis approach based on CGS, proposed by Chen et al. (2009), to identify active factors in supersaturated designs. In this approach, the prior parameter τ is tuned via LOOCV. To avoid over-fitting, we propose a modified LOOCV approach that combines the original cross-validation with an information criterion by adding a penalty term. We demonstrate the performance of the modified LOOCV via a simulation study.

Simulation results show that tuning the parameter by adding the penalty term, 2k², to the cross-validation value performs well, and the approach also performs well in real examples. We also compare CGS with the Dantzig selector method using the modified AIC criterion, and CGS performs better. However, we cannot guarantee that the approach works well in all cases, since Rao and Wu (2005) prove only that, if C_n satisfies (8), the criterion with a penalty term chooses the true model for all large n. In addition to our analysis approach, we also use a graphical approach, similar to that of Phoa et al. (2009), to identify active factors in real examples. Its selection results are similar to those of our analysis approach.

In addition to independent priors for variable selection, we also consider the dependent prior structure of Chipman et al. (1997) for our method. We show that CGS with the dependent prior assumption still works well for the blood glucose experiment with main effects and all two-factor interactions. However, different prior assumptions may lead to different selection results, so how to choose a proper dependent prior requires further study.

CGS may not work well when the variables are highly correlated or the residual variance is small. In such situations, we suggest adopting the stochastic matching pursuit algorithm proposed by Chen et al. (2009). This algorithm consists of two processes, an addition process and a deletion process, so wrongly selected variables can be screened out in the deletion process. However, it takes more time than CGS, so when these situations do not arise we still suggest using CGS.

References

[1] Akaike, H. (1973), “Information theory and an extension of the maximum likelihood principle”, in: Petrov, B. N., Csáki, F. (eds.), Second International Symposium on Information Theory. Budapest, 267-281.

[2] Barbieri, M. and Berger, J. O. (2004), “Optimal predictive model selection”, Annals of Statistics, 32, 870-897.

[3] Beattie, S. D., Fong, D. K. H. and Lin, D. K. J. (2002), “A two-stage Bayesian model selection for supersaturated designs”, Technometrics, 44, 55-63.

[4] Box, G. E. P. and Meyer, P. D. (1986), “An analysis for unreplicated fractional factorials”, Technometrics, 28, 11-18.

[5] Chen, R. B., Chu, C. H., Lai, T. Y. and Wu, Y. N. (2009), “Stochastic matching pursuit for Bayesian variable selection”, Accepted by Statistics and Computing.

[6] Chipman, H. (1996), “Bayesian variable selection with related predictors”, Canadian Journal of Statistics, 24, 17-36.

[7] Chipman, H., Hamada, M. and Wu, C. F. J. (1997), “A Bayesian variable selection approach for analyzing designed experiments with complex aliasing”, Technometrics, 39, 372-381.

[8] Févotte, C. and Godsill, S. J. (2006), “Sparse linear regression in unions of bases via Bayesian variable selection”, IEEE Signal Processing Letters, 13, 441-444.

[9] George, E. I. and McCulloch, R. E. (1993), “Variable selection via Gibbs sampling”, Journal of the American Statistical Association, 88, 881-889.

[10] George, E. I. and McCulloch, R. E. (1997), “Approaches for Bayesian variable selection”, Statistica Sinica, 7, 339-373.

[11] Georgiou, S. D. (2008), “Modelling by supersaturated designs”, Computational Statistics and Data Analysis, 53, 428-435.

[12] Geweke, J. (1996), “Variable selection and model comparison in regression”, in Bernardo, J. M., Berger, J. O., Dawid, A. P. and Smith, A. F. M. (eds.), Proceedings of the Fifth Valencia International Meeting on Bayesian Statistics.

[13] Hamada, C. A. and Hamada, M. S. (2009), “All-subsets regression under effect heredity restrictions for experimental designs with complex aliasing”, Quality and Reliability Engineering International, 26, 75-81.

[14] Hannan, E. J. and Quinn, B. G. (1979), “The determination of the order of an autoregression”, Journal of the Royal Statistical Society. Series B, 41, 190-195.

[15] Li, R. and Lin, D. K. J. (2003), “Analysis method for supersaturated design: Some comparisons”, Journal of Data Science, 1, 249-260.

[16] Li, R. and Lin, D. K. J. (2009), “Variable selection for screening experiments”, Quality Technology & Quantitative Management, 6, 271-280.

[17] Lin, D. K. J. (1993), “A New Class of Supersaturated Design”, Technometrics, 35, 28-31.

[18] Phoa, F. K. H., Pan, Y. H. and Xu, H. (2009), “Analysis of supersaturated designs via the Dantzig selector”, Journal of Statistical Planning and Inference, 139, 2362-2372.

[19] Rais, F., Kamoun, A., Chaabouni, M., Claeys-Bruno, M., Phan-Tan-Luu, R. and Sergent, M. (2009), “Supersaturated design for screening factors influencing the preparation of sulfated amides of olive pomace oil fatty acids”, Chemometrics and Intelligent Laboratory Systems, 99, 71-78.

[20] Rao, C. R. and Wu, Y. (2005), “Linear model selection by cross-validation”, Journal of Statistical Planning and Inference, 128, 231-240.

[21] Smith, M. and Kohn, R. (1996), “Nonparametric regression using Bayesian variable selection”, Journal of Econometrics, 75, 317-343.

[22] Westfall, P. H., Young, S. S. and Lin, D. K. J. (1998), “Forward selection error control in the analysis of supersaturated designs”, Statistica Sinica, 8, 101-117.

[23] Wolfe, P. J., Godsill, S. J. and Ng, W. J. (2004), “Variable selection and regularization for time-frequency surface estimation”, Journal of the Royal Statistical Society. Series B, 66, 575-589.

[24] Wu, C. F. J. and Hamada, M. (2000), “Experiments: Planning, analysis and parameter design optimization”, Wiley: New York, U.S.A.

[25] Zhang, Q. Z., Zhang, R. C. and Liu, M. Q. (2007), “A method for screening active effects in supersaturated designs”, Journal of Statistical Planning and Inference, 137, 2068-2079.

Appendix I

Table 2

Design matrix and response data, cast fatigue experiment.

Run A B C D E F G Response

Table 3

Design matrix and response data, blood glucose experiment.

Run A B C D E F G H Response

Table 4

A two-level supersaturated design (Lin, 1993).

Run  Factors                                                             Response
      1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23    Y
  1   +  +  +  −  −  −  +  +  +  +  +  −  −  −  +  +  −  −  +  −  −  −  +  133
  2   +  −  −  −  −  −  +  +  +  −  −  −  +  +  +  −  +  −  −  +  +  −  −   62
  3   +  +  −  +  +  −  −  −  −  +  −  +  +  +  +  +  −  −  −  −  +  +  −   45
  4   +  +  −  +  −  +  −  −  −  +  +  −  −  +  +  −  +  +  +  −  −  −  −   52
  5   −  −  +  +  +  +  −  +  +  −  −  −  −  +  +  +  −  −  +  −  +  +  +   56
  6   −  −  +  +  +  +  +  −  +  +  +  −  +  +  −  +  +  +  +  +  +  −  −   47
  7   −  −  −  −  +  −  −  +  −  +  −  +  +  −  +  +  +  +  +  +  −  −  +   88
  8   −  +  +  −  −  +  −  +  −  +  −  −  −  −  −  −  −  +  −  +  +  +  −  193
  9   −  −  −  −  −  +  +  −  −  −  +  +  −  +  −  +  +  −  −  −  −  +  +   32
 10   +  +  +  +  −  +  +  +  −  −  −  +  +  +  −  +  −  +  −  +  −  −  +   53
 11   −  +  −  +  +  −  −  +  +  −  +  −  +  −  −  −  +  +  −  −  −  +  +  276
 12   +  −  −  −  +  +  +  −  +  +  +  +  −  −  +  −  −  +  −  +  +  +  +  145
 13   +  +  +  +  +  −  +  −  +  −  −  +  −  −  −  −  +  −  +  +  −  +  −  130
 14   −  −  +  −  −  −  −  −  −  −  +  +  +  −  −  −  −  −  +  −  +  −  −  127

Table 5

A two-level supersaturated design (Rais et al., 2009).

Run Factors
