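Conclusion - A Fuzzy Clustering Method Based on Fuzzy Linear Discriminant Analysis and Support Vector Machines Incorporating Spatial Information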

This study proposes a clustering algorithm, called FLDC, and two kinds of spatial-contextual support vector machines (SCSVMs). FLDC is based on the Fisher criterion composed of the fuzzy between- and within-cluster scatter matrices extended from LDA. Experimental results with both synthetic and real data indicate that the proposed clustering algorithm outperformed the KMS, KMD, FCM, GK, GG, PCM, FPCM, PFCM, FCS, FSMM, and FMSFA algorithms.

The results of clustering synthetic data sets reveal that FLDC worked well only when the clusters were approximately normally distributed.

Hence, future research should extend FLDC using kernel tricks, that is, develop a clustering algorithm based on an unsupervised version of kernel-based LDA for data sets that are not normally distributed.

Another direction for future research arises because the proposed optimization problem is non-convex and nonlinear. Although the proposed methods work well, the solver may converge to a local minimum rather than the global optimum, and the interior-point optimization method is time-consuming. Thus, it is necessary to find a more efficient algorithm for solving such problems.

The number of clusters is an important factor in all clustering algorithms. Future research should develop or choose an appropriate criterion for FLDC, such as the Akaike or Bayesian information criterion (AIC or BIC), to determine the number of clusters.
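The selection step mentioned above can be sketched as follows. This is a minimal illustration, not part of the proposed method: it assumes that a log-likelihood-type fit measure and a free-parameter count are available for each candidate number of clusters, which FLDC does not provide by itself. The function names and the numbers in the usage example are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_samples) - 2 * log_likelihood


def select_num_clusters(fits, n_samples, criterion="bic"):
    """Pick the candidate cluster count with the smallest criterion value.

    fits maps each candidate number of clusters to a pair
    (log_likelihood, n_params); both values are assumed to come from
    whatever model was fitted for that candidate.
    """
    scores = {}
    for c, (log_likelihood, n_params) in fits.items():
        if criterion == "bic":
            scores[c] = bic(log_likelihood, n_params, n_samples)
        else:
            scores[c] = aic(log_likelihood, n_params)
    best = min(scores, key=scores.get)
    return best, scores


# Hypothetical fit results for 2, 3, and 4 clusters on 200 samples.
example_fits = {2: (-520.0, 5), 3: (-480.0, 8), 4: (-476.0, 11)}
best_bic, _ = select_num_clusters(example_fits, n_samples=200, criterion="bic")
best_aic, _ = select_num_clusters(example_fits, n_samples=200, criterion="aic")
```

With these illustrative numbers BIC selects three clusters and AIC selects four, because BIC penalizes additional parameters more heavily once the sample size is moderately large.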

For SCSVMs, the results show that an SCSVM based on the neighborhood system in the original space can overcome the problem of classes with similar spectral properties.

SCSVM modifies the decision function and the constraints of SVM based on spatial-contextual information. A PR step consisting of a fixed-window-based postfiltering was employed to reduce the remaining noise in the classification map. The experiments in this study compared the classification accuracy and classification maps of the proposed SCSVM with those of an ML classifier, an ML-MRF classifier, a k-NN classifier, a standard supervised SVM, a CS⁴VM, and SVM+EM.
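A fixed-window postfilter of the kind described above can be sketched as a majority vote over a square window. The exact filter used in the PR step is not specified in this section, so the majority rule, the 3 x 3 default window, the tie handling, and the function name are all assumptions made for illustration.

```python
from collections import Counter


def majority_filter(label_map, window=3):
    """Relabel each pixel with the most frequent label in its window.

    label_map is a list of rows of class labels. The window is clipped
    at the image border. A pixel keeps its original label unless another
    label is strictly more frequent in the window.
    """
    rows, cols = len(label_map), len(label_map[0])
    radius = window // 2
    filtered = [row[:] for row in label_map]
    for i in range(rows):
        for j in range(cols):
            votes = Counter(
                label_map[a][b]
                for a in range(max(0, i - radius), min(rows, i + radius + 1))
                for b in range(max(0, j - radius), min(cols, j + radius + 1))
            )
            top_label, top_count = votes.most_common(1)[0]
            if top_count > votes[label_map[i][j]]:
                filtered[i][j] = top_label
    return filtered


# An isolated pixel of class 2 inside a region of class 1 is removed.
noisy = [[1, 1, 1],
         [1, 2, 1],
         [1, 1, 1]]
smoothed = majority_filter(noisy)
```

The filter reads from the unmodified input map and writes to a copy, so the result does not depend on the order in which pixels are visited.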

The experimental results obtained from two different hyperspectral image data sets, the Indian Pine site (a mixed forest/agricultural site in Indiana) and the Washington D.C. Mall hyperspectral image (an urban site in Washington D.C.), confirm that the proposed SCSVM improves the classification accuracies and kappa coefficients.
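For reference, overall accuracy and the kappa coefficient can both be computed from a confusion matrix, as in the sketch below. The function name and the 2 x 2 matrix in the usage example are hypothetical and are not taken from the experiments.

```python
def accuracy_and_kappa(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] counts samples of true class i assigned to class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical two-class confusion matrix with 100 test samples.
acc, kappa = accuracy_and_kappa([[45, 5], [10, 40]])
```

In this example the accuracy is 0.85 and kappa is 0.70, which shows how kappa discounts the agreement expected by chance.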

This discussion leads to the following conclusions about SCSVMs.

1. SCSVM (OAA) performs better than or similarly to SCSVM (OAO) on the IPS data set. The classification map of the IPS data set obtained from SCSVM (OAA) with the PR step (Fig. 27 (h)) is very close to the ground truth, and the corresponding classification accuracy and kappa coefficient are 95.5% and 94.9%, respectively. However, on the Washington D.C. Mall data set, SCSVM (OAO) performs better than or similarly to SCSVM (OAA), and SCSVMF (OAA) performs better than or similarly to SCSVMF (OAO).

2. This study shows that selecting a suitable spatial parameter improves SCSVM performance, and the best choice of this parameter becomes larger as the training sample size increases. That is, the spatial parameter has a significant influence on performance, especially for the SCSVM (OAA).

3. The computational cost of the learning phase of the proposed SCSVM is slightly higher than that of the standard SVM in each round. From a theoretical viewpoint, a standard supervised SVM is a special case of SCSVM in which the spatial parameter is equal to 0. CS⁴VM, however, requires a huge set of semilabeled samples drawn from the neighborhoods of each training sample in the objective function, whereas SCSVM uses only the same training samples in the objective function in each round. Hence, the computational cost of the CS⁴VM learning phase is much higher than that of the SCSVM learning phase. For example, in the IPS data set experiment, the training phase of a supervised SVM (OAA) took about 7.566 s on a PC with an Intel Core 2 Duo CPU at 2.4 GHz and 4 GB of DDR2 RAM. The training phase of SCSVM (OAA) took about 7.909 s on the same machine, but the training phase of CS⁴VM required about 185.56 s, more than 20 times as long.

4. The SVM+EM method is particularly suitable for classifying images with large spatial structures (e.g., the IPS image) when the spectral responses of different classes are dissimilar and the classes contain a comparable number of pixels. However, real data do not always satisfy these conditions (e.g., the Washington D.C. Mall image). Hence, SVM+EM is not suitable for all situations. In the SCSVM classifier, by contrast, the spatial neighborhood system can be modified according to the spatial structures of different data sets.
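The OAA (one-against-all) and OAO (one-against-one) strategies compared in the first conclusion differ in how many binary classifiers they train and in how the binary outputs are combined. The sketch below illustrates the standard decision rules only; it assumes the binary decision values and pairwise winners are already available, and all function names and example values are hypothetical.

```python
from itertools import combinations


def num_binary_classifiers(n_classes):
    """OAA trains one classifier per class; OAO trains one per class pair."""
    return {"OAA": n_classes, "OAO": n_classes * (n_classes - 1) // 2}


def predict_oaa(decision_values):
    """Pick the class whose 'class versus rest' decision value is largest.

    decision_values maps each class to the output of its binary classifier.
    """
    return max(decision_values, key=decision_values.get)


def predict_oao(pairwise_winner, classes):
    """Max-wins voting: each pairwise classifier casts one vote.

    pairwise_winner maps each pair (a, b), in the order produced by
    itertools.combinations(classes, 2), to the class that won that pair.
    Ties are broken in favor of the class listed first in classes.
    """
    votes = {c: 0 for c in classes}
    for pair in combinations(classes, 2):
        votes[pairwise_winner[pair]] += 1
    return max(votes, key=votes.get)


counts = num_binary_classifiers(4)
oaa_label = predict_oaa({"a": -0.2, "b": 0.7, "c": 0.1})
oao_label = predict_oao(
    {("a", "b"): "b", ("a", "c"): "a", ("b", "c"): "b"},
    classes=["a", "b", "c"],
)
```

The OAO strategy trains more classifiers as the number of classes grows, but each one sees only the samples of two classes, which is one reason the two strategies can rank differently on different data sets.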

