Conclusions and Discussion

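The method in this study builds on the Growing Hierarchical Self-Organizing Map (GHSOM). Its training algorithm yields clustering results with a tree-shaped hierarchical structure, in which the relationship between clusters is positively correlated with the distance between them. In preprocessing, images and annotations are each converted into feature vectors through the vector space model. Once both are quantified as fixed-length vectors, we borrow a concept from multilingual information retrieval and treat images and annotations as two different languages. GHSOM then trains on and clusters the data, and the clustering results form a tree-shaped hierarchy according to degree of relatedness. The image-annotation mapping method developed in this study establishes appropriate associations between the image hierarchy and the annotation hierarchy. A new image is first assigned to the appropriate image cluster and is then annotated automatically through the established image-annotation associations. The results show that new images can be annotated automatically, and that the annotations produced by the proposed method are correct about 50 percent of the time.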
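The annotation step summarized above can be illustrated with a minimal sketch. It assumes each image cluster is represented by a single centroid vector and that the image-annotation mapping has already been reduced to a lookup table; the cluster names, vectors, keywords, and the function `annotate` are all hypothetical and are not taken from the experiments in this study.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def annotate(image_vector, image_clusters, cluster_to_keywords):
    """Assign the image to its nearest image cluster and return
    that cluster's associated annotation keywords."""
    nearest = min(
        image_clusters,
        key=lambda name: euclidean(image_vector, image_clusters[name]),
    )
    return nearest, cluster_to_keywords.get(nearest, [])

# Hypothetical toy data: cluster centroids and their mapped keywords.
image_clusters = {
    "img_c1": [0.9, 0.1, 0.0],
    "img_c2": [0.1, 0.8, 0.1],
}
cluster_to_keywords = {
    "img_c1": ["sky", "cloud"],
    "img_c2": ["grass", "tree"],
}

cluster, keywords = annotate([0.2, 0.7, 0.1], image_clusters, cluster_to_keywords)
print(cluster, keywords)
```

In a real GHSOM the nearest-cluster search would descend the hierarchy level by level rather than scan a flat dictionary; the flat lookup here only shows the idea.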

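During the experiments we found that the data in the test image set are clearly clustered. The test images were pre-classified into 21 categories, and the images within each category are very similar, so good clustering results were obtained early in training and the tree hierarchy is therefore not pronounced. In addition, this study reduces image pixels to grayscale, and the image information lost in this step may also affect the effectiveness of the method. The query keywords were selected at random according to the categories in which the annotation keywords appear, and the test images were taken as the first 20 percent of each original image category.

Some keywords occur with very high frequency in the annotation vocabulary yet did not produce correspondingly good results. Closer investigation showed why: a query keyword with a high overall frequency does not necessarily appear in the first 20 percent of that cluster. It may instead appear frequently in the annotations of the remaining 80 percent of the images, which lowers the recall of this study.

Finally, we found that after the image and annotation data are trained with GHSOM and the clusters are generated, the resulting annotation cluster hierarchy is unusually flat. When the image hierarchy is flat, each image cluster tends to contain too many images, and the annotation hierarchy usually shows the same condition. As a result, after the mapping method of this study is applied, annotation clusters at shallower levels tend to receive higher weights, so the annotation results become generalized. Generalization here means that when a cluster in the image hierarchy contains a large number of images, the annotation cluster it maps to usually also contains a large number of annotations, and the annotation results are then neither fine-grained nor accurate enough. This study therefore proposes a remedy called the hierarchy multiplier method. It is based on the depth of the data in the hierarchy: the deeper the data has been clustered, the higher the probability that it has been classified correctly, so each item can be given a suitable weight according to the level at which it resides.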
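One way to picture the hierarchy multiplier method (階層倍數法) is the following minimal sketch. The text gives only the concept, not a formula, so the linear rule used here (score = base weight × level depth, with the top level at depth 1) is an assumption made for illustration, as are the function name `rank_keywords` and the sample keywords and weights.

```python
def rank_keywords(candidates):
    """Rank candidate keywords by a depth-weighted score.

    candidates: list of (keyword, base_weight, depth) tuples, where depth 1
    is the top level of the annotation hierarchy.
    ASSUMPTION: score = base_weight * depth; the source does not specify
    the actual weighting formula.
    """
    scores = {}
    for keyword, base_weight, depth in candidates:
        scores[keyword] = scores.get(keyword, 0.0) + base_weight * depth
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked, scores

# Hypothetical candidates: a broad keyword from a shallow cluster and a
# specific keyword from a deeper cluster.
candidates = [
    ("outdoor", 0.6, 1),  # shallow cluster, high raw weight
    ("beach", 0.4, 3),    # deeper cluster, lower raw weight
]

ranked, scores = rank_keywords(candidates)
print(ranked)
```

Without the depth factor, the broad keyword from the shallow cluster would rank first because of its higher raw weight; with it, the keyword from the deeper cluster is preferred, which is the effect the method aims for.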

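In the validation carried out in this study, both the training and the test images had already been annotated manually, so their annotations are highly accurate. If more such correctly hand-annotated images can be added as experimental data in the future, better annotation accuracy can be expected, and the coverage of new images can extend beyond the image database currently used. Because image data are large, they usually have to be quantified to reduce their volume, and each kind of image feature carries its own image information. This study first reduces image pixels to grayscale and then uses two image features: the color histogram, which represents the color distribution of an image, and the energy spectrum, which shows spatial relationships and the frequency of color variation. In future work, handling full-color images and incorporating further feature representations such as MPEG-7, or combining several image features, would provide more complete and sufficient information about image content and would greatly help in determining what an image actually depicts.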
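The grayscale reduction and histogram feature mentioned above can be sketched as follows. This is a minimal illustration and not the implementation used in this study: the ITU-R BT.601 luma weights, the bin count, and the function names `to_gray` and `gray_histogram` are assumptions, and the energy spectrum feature is not shown.

```python
def to_gray(pixel):
    """Convert an (R, G, B) pixel with 0..255 channels to one gray level.
    ASSUMPTION: ITU-R BT.601 luma weights; the source does not state
    which conversion was used."""
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

def gray_histogram(pixels, bins=8):
    """Return a normalized histogram of gray levels as a fixed-length
    feature vector with `bins` equal-width bins over 0..255."""
    if not pixels:
        raise ValueError("pixels must not be empty")
    counts = [0] * bins
    for pixel in pixels:
        gray = to_gray(pixel)
        index = min(gray * bins // 256, bins - 1)
        counts[index] += 1
    total = len(pixels)
    return [count / total for count in counts]

# Hypothetical four-pixel image: black, white, pure red, pure blue.
pixels = [(0, 0, 0), (255, 255, 255), (255, 0, 0), (0, 0, 255)]
feature = gray_histogram(pixels, bins=4)
print(feature)
```

Pure red and pure blue land in different bins only because their luma differs. Two colors with equal luma would become indistinguishable, which is the kind of information loss the text attributes to the grayscale step.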

