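A Study on the Integration and Enhancement of a Robust Fuzzy Clustering Method and a Novel Fuzzy Modeling Method

National Science Council, Executive Yuan: Research Project Final Report

Final Research Report (Condensed Version)

Project type: Individual project

Project number: NSC 98-2221-E-011-118-

Project period: August 1, 2009 to July 31, 2010 (ROC years 98 to 99)

Executing institution: Department of Electrical Engineering, National Taiwan University of Science and Technology

Principal investigator: Ying-Kuei Yang (楊英魁)

Project participants:
Master's student, part-time assistant: 吳姿嫚
Master's student, part-time assistant: 陳冠禹
Doctoral student, part-time assistant: 方偉力

Report attachments: report on attending an international conference, and the paper presented

Disposition: this project is open to public inquiry

January 4, 2011 (ROC year 100)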

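National Science Council, Executive Yuan: Subsidized Research Project Final Report

A Study on the Integration and Enhancement of a Robust Fuzzy Clustering Method and a Novel Fuzzy Modeling Method

Project type: ■ Individual project □ Integrated project

Project number: NSC 98-2221-E-011-118

Project period: August 1, 2009 to July 31, 2010

Principal investigator: Ying-Kuei Yang (楊英魁)

Co-principal investigator:

This report includes the following required attachments:

□ One report on an overseas business trip or study visit

□ One report on a business trip or study visit to mainland China

□ One report on attending an international academic conference, with one copy of the paper presented

□ One overseas research report for an international cooperative research project

Executing institution: National Taiwan University of Science and Technology

December 5, 2010 (ROC year 99)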

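National Science Council, Executive Yuan: Research Project Final Report

Project number: NSC 98-2221-E-011-118

Project period: August 1, 2009 to July 31, 2010

Principal investigator: Ying-Kuei Yang (楊英魁), Graduate Institute of Electrical Engineering, National Taiwan University of Science and Technology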

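I. Chinese Abstract (in translation)

Sampled data from practical systems often contain ordinary noise or outliers for a variety of reasons. If such data are used directly to model the system's behavior function, the resulting model often deviates greatly from the actual system behavior, a phenomenon called overfitting.

To solve this problem, this study proposes an unsupervised fuzzy modeling approach that extracts fuzzy rules directly from input-output sampled data containing noise and outliers in order to build an approximate function of the system's behavior. The proposed method has two main features: (1) a robust fuzzy c-means (RFCM) algorithm that reduces the influence of noise and outliers; and (2) a fuzzy-based data sifter (FDS) that finds the turning points of the data, so that a nonlinear system can be partitioned into a combination of piecewise subsystems. A Takagi-Sugeno fuzzy model with fewer rules and smaller error can therefore be built.

Keywords: fuzzy clustering, function approximation, robustness, fuzzy modeling, noise, outlier

Abstract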

Owing to various factors, the sampled data used for system modeling often contain noise and outliers. If such data are used directly to model a system, the behavior of the resulting model can differ greatly from that of the actual system, a phenomenon called overfitting.

To overcome this problem, this study presents an unsupervised fuzzy model construction approach that extracts fuzzy rules directly from numerical input-output sampled data contaminated by noise and outliers for nonlinear systems. The proposed method has two core ideas: (1) a robust fuzzy c-means (RFCM) algorithm that greatly mitigates the influence of noise and outliers; and (2) a fuzzy-based data sifter (FDS) that locates good turning points for partitioning a given nonlinear data domain into piecewise clusters, so that a Takagi-Sugeno fuzzy model can be constructed with fewer rules and smaller errors.

Keywords: fuzzy clustering, function approximation, robust, fuzzy modeling, noise, outlier, pattern classification, data distribution, feature weight

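II. Background and Objectives

In recent years, modeling an unknown system from imprecise knowledge has been a very important research topic in scientific applications, and neural networks and fuzzy theory are the methods most often used for such modeling. Most of these methods are based on analysis of the system's input and output sampled data. They try to build a working model, or an approximate function, of the system's behavior, and they use the sampled data as training samples to adjust the model parameters so that the model approximates the system's behavior function more closely. In practical systems, however, the sampled data often contain ordinary noise or outliers because of environmental influences, the sampling instruments, or the sampling method. If such data are used directly to model the system's behavior function, the resulting model often deviates greatly from the actual system behavior, a phenomenon called overfitting. Minimizing the influence of noise and outliers during modeling is therefore one of the key factors in system modeling.

In system modeling, robust learning is one way to minimize the influence of noise and outliers. Most current robust learning algorithms add a tuning function (for example, a loss function) to the objective function so that noise and outliers have less influence during learning. The biggest problem with this approach is that it is not known which data points are correct and which are outliers, so the result of robust learning is itself subject to error.

Although adding a tuning function to the objective function is commonly used for modeling systems whose samples contain outliers, these methods still have problems. The most important is that the initial state of the robust learning algorithm must be chosen carefully. Because it is not known which data points are correct and which are outliers, the tuning function that estimates the outliers may produce wrong results. When an unsuitable initial model state is chosen, the tuning function cannot identify the outliers correctly: some outliers are treated as genuine data points and some genuine data points are treated as outliers, so the resulting model does not match the system's behavior. In this study, therefore, the modeling process first uses an unsupervised clustering algorithm to find initial cluster centers for the noisy sampled data, and then uses the distances between clusters to correct the influence of outliers during learning. Genuine data points and outliers never need to be told apart, which avoids the tuning-function problem described above.

This study therefore proposes a new fuzzy modeling approach for building an approximate function of system behavior. The approach is based on data-analysis techniques. Without having to designate which data points are outliers, it reduces the influence of noise and outliers. Without specifying the membership functions or the number of rules in advance, it automatically divides the data into clusters according to the characteristics of their distribution. Based on these clusters it builds an approximate function for the system's behavior, and it uses the data points whose noise and outlier influence has been reduced as training samples to adjust the parameters of that function, so that the resulting model matches the system's behavior function more closely.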

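III. Results and Discussion

The main purpose of this study is to propose a robust algorithm that minimizes the influence of noise and outliers in the sampled data, yielding a set of data points that adequately represents the system's behavior. These points then serve as the sampled data for fuzzy modeling, from which an approximate function of the system behavior is built.

Over the past two or three decades, fuzzy modeling has been applied successfully to complex systems. According to the structure of the knowledge base, fuzzy systems fall into three types: linguistic fuzzy models, fuzzy relational models, and Takagi-Sugeno fuzzy models (TS models). A TS fuzzy model partitions the system's input space into multiple subspaces, each regarded as a cluster describing a linear or nonlinear system. Once the influence of noise and outliers in the sampled data has been minimized, the main issue becomes how to build a good and accurate system model effectively. To this end, this study proposes a new method of constructing a TS fuzzy model to solve the function approximation problem for nonlinear systems with noise and outliers. To model the system's behavior with a TS fuzzy model, the cluster centers that represent the system's behavior are analyzed to find their turning points. A turning point is a local maximum or local minimum of the system's behavior. In a TS fuzzy model, the consequent of each rule describes the model output as a linear combination of the input variables. Using the turning points as cut points therefore partitions the system into a combination of linear sub-clusters, each expressed by one TS fuzzy rule.

To compute the turning points of the cluster centers, this study uses a fuzzy-based data sifter with a fuzzy pattern-matching sliding window (slip window). The sifter computes the match degree of each cluster center against the peak-value and valley-value features. Cluster centers with higher match degrees serve as the cut points for modeling and are called turning points. A cluster center with the peak feature is a local maximum among the cluster centers; conversely, a cluster center with the valley feature is a local minimum. Turning points with peak and valley features therefore represent the trend of the sampled data distribution and are also the best points at which to partition the system. These turning points are used to partition the system into multiple clusters, and linear regression analysis is then applied to each cluster to obtain the local linear regression parameters that represent it. The regression parameters of each cluster are substituted into the TS fuzzy model, with one rule per cluster. During partitioning, the size of the error between the target system's output and the model's output determines whether the system is partitioned further into smaller subspaces. When partitioning is complete, a rough initial fuzzy model of the target system has been built. Finally, the gradient descent algorithm fine-tunes the parameters of this initial TS fuzzy model, giving an effective and accurate TS fuzzy model of the system's behavior function.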
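The partition-and-regress idea described above can be sketched roughly as follows. This is a minimal illustration and not the report's FDS: a crisp comparison with the two neighbouring centers stands in for the fuzzy peak/valley match degree, the data are one-dimensional, and the names `fit_line`, `turning_points`, and `piecewise_fit` are hypothetical.

```python
def fit_line(points):
    """Least-squares line y = a*x + b through a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx if sxx else 0.0  # degenerate segment: flat line
    return a, mean_y - a * mean_x


def turning_points(centers):
    """Indices of interior centers that are local maxima or local minima.

    Assumption: a crisp neighbour test replaces the fuzzy match degree.
    """
    found = []
    for k in range(1, len(centers) - 1):
        prev_y, y, next_y = centers[k - 1][1], centers[k][1], centers[k + 1][1]
        is_peak = y > prev_y and y > next_y
        is_valley = y < prev_y and y < next_y
        if is_peak or is_valley:
            found.append(k)
    return found


def piecewise_fit(centers):
    """Cut the centers at their turning points and fit one line per segment.

    Returns a list of (x_start, x_end, slope, intercept), one per segment,
    which corresponds to one linear rule consequent per sub-cluster.
    """
    centers = sorted(centers)
    cuts = [0] + turning_points(centers) + [len(centers) - 1]
    segments = []
    for lo, hi in zip(cuts, cuts[1:]):
        pts = centers[lo:hi + 1]  # a turning point belongs to both neighbours
        a, b = fit_line(pts)
        segments.append((pts[0][0], pts[-1][0], a, b))
    return segments
```

For centers sampled from y = |x| on the integers from -3 to 3, the single valley at x = 0 splits the data into two segments with slopes -1 and +1.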

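To solve this problem, this study proposes an unsupervised fuzzy modeling approach that extracts fuzzy rules directly from input-output sampled data containing noise and outliers in order to build an approximate function of system behavior. The proposed method has the following main steps.

1. First, an unsupervised clustering algorithm based on fuzzy distance relations is proposed. It finds the initial cluster centers for sampled data containing noise and outliers. The algorithm treats every sampled data point as a cluster center and corrects these centers according to the distances between clusters, so that clusters close to one another gradually move together. In the end, the cluster centers represented by the data points of one cluster converge to a single point, which is the center of that cluster.

2. A robust fuzzy C-means algorithm (RFCM, robust FCM) is proposed to reduce the influence of noise and outliers. RFCM takes the cluster centers produced by the unsupervised clustering algorithm as its initial values and uses a smooth-curve concept to reduce the influence of outliers, with the aim that the behavior curve found for the system is closer to the true curve. Because the method does not require the number of clusters to be set in advance, it is better suited to modeling systems whose number of clusters cannot be known beforehand.

3. To build the approximate function, this study proposes a new method of constructing a TS fuzzy model to solve the function approximation problem for nonlinear systems with noise and outliers. A fuzzy-based data sifter (FDS) is proposed to compute the turning points of the cluster centers, and these turning points serve as the basis for cutting the system into a combination of linear sub-clusters. FDS is a sifter with fuzzy pattern-recognition capability. Using a sliding window, it selects m cluster centers at a time from the data stream formed by the cluster centers and, for the two feature patterns of peak and valley, uses fuzzy theory to compute each data point's membership value for those patterns. The higher a cluster center's membership value for the peak or valley pattern, the more closely it matches the feature of a local maximum or local minimum, and the point can serve as a cut point. When the system is cut, cluster centers are taken as cut points in descending order of membership value. These turning points partition the system into a combination of linear sub-clusters, so the input-output relation of each sub-cluster can be expressed by a linear equation; that is, each sub-cluster can be described by one TS fuzzy rule.

After the system has been cut, linear regression computes the input-output regression parameters of each sub-cluster, and these parameters are used to build the TS fuzzy model of the system. Because the cluster centers obtained by RFCM are the training data, the influence of noise and outliers on the resulting parameters has been eliminated or is very small.

This study uses the cluster centers obtained by RFCM as training data and fine-tunes the parameters of the TS fuzzy model by gradient descent to obtain parameters that approximate the system's behavior function more closely.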
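Step 1 above (every sample starts as its own cluster center, and nearby centers pull one another together until each group collapses to one point) can be illustrated with the sketch below. The report does not give the fuzzy distance relation, so a Gaussian weight on the distance between centers is assumed here; the `width` and `iterations` parameters and the function names are likewise hypothetical, and the data are one-dimensional for brevity.

```python
import math


def merge_centers(samples, width=0.5, iterations=30):
    """Move every center toward a distance-weighted mean of all centers.

    Assumption: Gaussian weighting replaces the report's fuzzy distance
    relation. Far-away centers receive weights close to zero, so they
    barely influence one another.
    """
    centers = [float(s) for s in samples]  # each sample is its own center
    for _ in range(iterations):
        updated = []
        for c in centers:
            weights = [math.exp(-((c - other) ** 2) / (2 * width ** 2))
                       for other in centers]
            total = sum(weights)
            updated.append(sum(w * o for w, o in zip(weights, centers)) / total)
        centers = updated
    return centers


def distinct(centers, tol=1e-3):
    """Collapse centers that have converged to (almost) the same point."""
    found = []
    for c in sorted(centers):
        if not found or c - found[-1] > tol:
            found.append(c)
    return found
```

With samples grouped around 0.1 and 5.1, the six starting centers collapse to two points, so the number of clusters emerges from the data instead of being set in advance.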

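IV. Future Applications

1. The robust fuzzy clustering algorithm proposed in this project can, in character-shape analysis, find a skeleton that adequately represents a character's glyph. This result can therefore be extended to research on font recognition and pattern recognition.

2. The theory of this project can be developed further for application to data mining, knowledge management, control, and other fields.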


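Report on Attending the ICAI2010 Conference

Reported by: Ying-Kuei Yang (楊英魁), 2010.07.13

Dates: July 12 to July 15, 2010

Venue: Las Vegas, USA

Report:

The 2010 International Conference on Artificial Intelligence (ICAI2010), held over four days in Las Vegas, USA, is an important academic meeting in the fields of artificial intelligence and control. ICAI2010 is one of the 25 conferences of WORLDCOMP10 (The 2010 World Congress in Computer Science, Computer Engineering, and Applied Computing). Because of its importance, nine other conferences related to artificial intelligence were held together with it. More than 2,500 experts and scholars from different fields around the world took part. The atmosphere was lively, local hotel rooms were hard to book, and every keynote speaker was a leading figure in artificial intelligence.

The main purpose of attending ICAI2010 was to present a paper reporting research supported by the National Science Council: A Novel Approach of Enhancing CMAC Learning Mechanism. More than 90 experts and scholars joined the discussion when the paper was presented. The discussion was very lively, and the attendees responded very positively to the method and theory proposed in the paper.

During these days I had in-depth discussions with scholars and experts from many places on a range of fields, benefited greatly, and gained an accurate view of the current state of artificial intelligence. The first keynote speech each day was especially good: the speakers were learned and humorous, and they pointed out clearly several directions that can be pursued in this field, which serve as an excellent reference.

Attending the conference gave me the chance to exchange ideas widely with scholars and experts from around the world and confirmed that my current research direction is correct. Being able to attend six related conferences at the same time made the trip well worthwhile.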


A Novel Approach of Enhancing CMAC Learning Mechanism

Ying-Kuei Yang¹, Po-Lun Chang¹,² and Jin-Fang Liu¹

¹Department of Electrical Engineering, National Taiwan University of Science and Technology, Taipei, TAIWAN

²Department of Electrical Engineering, Lunghwa University of Science and Technology, Taipei, TAIWAN

e-mail: ¹[email protected], ²[email protected]

Abstract - Learning and convergence are the two issues of greatest concern in research on the Cerebellar Model Articulation Controller (CMAC). This paper proposes incorporating grey relational analysis with the number of learning iterations to obtain an appropriate learning rate that improves CMAC convergence. It also proposes that, to minimize learning interference, the amount of weight adjustment applied to a memory cell of an addressed hyper cube be proportional to the learned input area, the grey relational grade, and the inverse of the number of learning instances. A credit apportionment approach is derived to implement this idea and achieve fast and accurate learning. The results of the experiments conducted in this study demonstrate that the proposed approach provides a more accurate learning mechanism and faster convergence.

Keywords: CMAC, learning interference, credit apportionment, learning instances, grey relational grade

1. Introduction

Numerous studies have employed the Cerebellar Model Articulation Controller (CMAC) model in various applications [1][2][3]. Among related work on CMAC, Chiang and Lin proposed embedding Gaussian functions in the CMAC model to improve learning accuracy [4]. Sayil and Lee presented a maximum error algorithm that adopts the neighborhood training concept to accelerate CMAC convergence [5]. Horváth and Szabó presented the generalization features of the CMAC model and a strategy for enhancing them [6]. Lin and Chiang established some convergence properties of CMAC [7]. However, the convergence performance of these approaches was inadequate and could not effectively satisfy the requirements of real-time applications [8][9][10].

In related research on learning rate and accuracy, Su, Tao, and Hung [8][9] and Lu and Chang [11] employed credit assignment to reform the CMAC learning strategy. However, learning is accelerated only during the early learning cycles, and the lack of an adaptive learning rate makes the system unstable. Lu and Chang [11] applied the credit assignment concept to the hyper cubes mapped by the inputs and to their neighborhood states to increase learning speed and accuracy. Unfortunately, this approach failed to achieve significant improvement [9][10].

In addition, learning interference reduces learning performance and accuracy during this phase [8][9][10][11]. This paper therefore proposes a novel learning framework that combines credit assignment based on the grey relational grade, the trained input area, and the number of trainings with an adaptive grey learning rate in a CMAC model, in order to mitigate learning interference and so improve learning speed and output accuracy. During the learning phase, the grey relational coefficient is calculated for each input state after a learning iteration. An appropriate grey learning rate is then derived by combining the calculated grey relational coefficient with the number of learning iterations. The credit distributed to an addressed hyper cube is in inverse proportion to the number of trainings.

2. CMAC Architecture

The basic structure of a conventional CMAC model [1] is shown in Figure 1. A CMAC model quantizes the learning space into discrete states, represented as set S in Figure 1, that serve as its input states. Each input state is mapped through the indexed memory A to the corresponding real memory cells W, which store information about the input states and are summed to produce the actual output value. A mapping block is a hyper cube between axes. Together, the hyper cubes are the real memory cells that store relational information about the addressed input states.

[Figure 1. Basic structure of a CMAC model: input states S1, ..., Sk of the learning space S are mapped through the memory indices A to the memory cells W, whose summed contents give the actual output y; the error is the target output ŷ minus y.]

If there are $N_h$ hyper cubes and each input state maps to $N_e$ of them, the actual output is given by Equation (1), where $y_s$ denotes the actual system output for input state $s$, $\mathbf{a}_s^T$ is the index vector, and $\mathbf{w}$ is the memory cell vector.

$$y_s = \mathbf{a}_s^T \mathbf{w} = [a_{s,1}, a_{s,2}, \ldots, a_{s,N_h}] \begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_{N_h} \end{bmatrix} = \sum_{j=1}^{N_h} a_{s,j} w_j \qquad (1)$$

During the learning phase, the error between the actual output and the desired output is distributed uniformly to train the memory cells of a CMAC model. The relation between the weight of a memory cell before and after training is shown in Equation (2).

$$w_j^{(i)} = w_j^{(i-1)} + \frac{\alpha}{N_e} \cdot a_{s,j} \cdot \left( \hat{y}_s - \mathbf{a}_s^T \mathbf{w}^{(i-1)} \right) \qquad (2)$$

In Equation (2), $s$ denotes an input state, $w_j^{(i)}$ represents the weight of the j-th hyper cube in training iteration $i$, $a_{s,j}$ is the element of the index vector for input state $s$ and hyper cube $j$, $(\hat{y}_s - \mathbf{a}_s^T \mathbf{w}^{(i-1)})$ denotes the learning error, $\alpha$ represents the learning rate, and each input state corresponds to $N_e$ hyper cubes.
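Equations (1) and (2) can be read as the short sketch below. It assumes a binary index vector (a_{s,j} is 1 for the N_e addressed hyper cubes and 0 otherwise), so the addressed cells are passed as a list of indices; the function names and that calling convention are illustrative choices, not taken from the paper.

```python
def cmac_output(weights, addressed):
    """Equation (1): sum the weights of the addressed hyper cubes.

    Assumption: a_{s,j} is 1 for addressed cells and 0 for all others.
    """
    return sum(weights[j] for j in addressed)


def cmac_update(weights, addressed, target, alpha=1.0):
    """Equation (2): spread the error uniformly over the N_e addressed cells.

    Modifies `weights` in place and returns the learning error that was
    measured before the update.
    """
    error = target - cmac_output(weights, addressed)
    share = alpha * error / len(addressed)  # (alpha / N_e) * error
    for j in addressed:
        weights[j] += share
    return error
```

With alpha = 1 and all weights initially zero, one update for a target of 1.0 over two addressed cells sets each of them to 0.5, so the output for that state matches the target immediately. Cells shared with other input states are changed as well, which is the learning interference the paper sets out to reduce.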

3. A Novel Learning Framework

The goal of this study is to propose the concept of credit assignment [9][10] that incorporates the grey relational grade with adaptive learning rate to achieve better learning performance for a CMAC model.

3.1 Grey Relational Analysis

Grey relational analysis is a method for measuring similarity [10]. Given a reference sequence $x_0 = \{x_0(1), x_0(2), \ldots, x_0(n)\}$ and $m$ comparison sequences $x_i = \{x_i(1), x_i(2), \ldots, x_i(n)\}$, $i = 1, 2, \ldots, m$, the grey relational coefficient between $x_0$ and $x_i$ at the k-th state is defined as follows [1][10].

$$c(x_0(k), x_i(k)) = \frac{\Delta_{\min} + \xi \cdot \Delta_{\max}}{\Delta_{0i}(k) + \xi \cdot \Delta_{\max}} \qquad (3)$$

where $c(x_0(k), x_i(k))$ is termed the grey relational coefficient, $\Delta_{0i}(k) = |x_0(k) - x_i(k)|$, $\xi \in (0, 1]$ is the distinguishing coefficient that controls the resolution between $\Delta_{\max}$ and $\Delta_{\min}$, $\Delta_{\max} = \max_i \max_k \Delta_{0i}(k)$, and $\Delta_{\min} = \min_i \min_k \Delta_{0i}(k)$.

Once the grey relational coefficients are determined for all n states, their weighted average, termed the grey relational grade, is calculated by

$$g(x_0, x_i) = \sum_{k=1}^{n} \left[ w_k \cdot c(x_0(k), x_i(k)) \right] \qquad (4)$$

where $w_k$ denotes the weighting factor of the grey relational coefficient $c(x_0(k), x_i(k))$ and $\sum_{k=1}^{n} w_k = 1$. Generally, $w_k = 1/n$ is selected for all $k$.

As described above, the grey relational coefficient and the grey relational grade are two effective measures of the difference and similarity between the actual system outputs and their corresponding target outputs during the learning phase of a CMAC model.
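As an illustration of the grey relational coefficient and grade defined in this subsection, the sketch below computes both for plain Python lists. The function names are hypothetical, and returning a coefficient of 1 when every difference is zero is an added convention to avoid dividing by zero; the paper does not discuss that case.

```python
def grey_relational_coefficients(reference, comparisons, xi=0.5):
    """Grey relational coefficients of each comparison sequence.

    `reference` is x_0, `comparisons` is a list of sequences x_i, and
    `xi` is the distinguishing coefficient in (0, 1].
    """
    deltas = [[abs(r - x) for r, x in zip(reference, seq)]
              for seq in comparisons]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:  # assumption: identical sequences get coefficient 1
        return [[1.0] * len(reference) for _ in comparisons]
    return [[(d_min + xi * d_max) / (d + xi * d_max) for d in row]
            for row in deltas]


def grey_relational_grade(coefficients, weights=None):
    """Weighted average of the coefficients of one comparison sequence.

    Uses equal weights 1/n when no weights are given.
    """
    n = len(coefficients)
    if weights is None:
        weights = [1.0 / n] * n
    return sum(w * c for w, c in zip(weights, coefficients))
```

For reference [1, 2, 3], comparison [1, 2, 5], and xi = 0.5, the differences are 0, 0, and 2, the coefficients are 1, 1, and 1/3, and the grade is 7/9.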

3.2 Grey Learning Rate

If the learning rate of a CMAC is set to a larger value during the learning phase, the CMAC may converge faster, but with lower accuracy and possible instability. If the learning rate is set to a smaller value, the CMAC converges more slowly, but with better accuracy and less risk of instability.

Assume that the input space is partitioned into n states and that the target function $\hat{y}$ to be learned is known. The desired system output for a specific input state $s_k$, $k = 1, 2, \ldots, n$, can then be expressed as $\hat{y}(k)$. These n desired outputs form a reference sequence $\hat{y} = \{\hat{y}(1), \hat{y}(2), \ldots, \hat{y}(n)\}$. To analyze the grey relation between the desired outputs and their corresponding actual outputs, the comparison sequence generated from the actual output of the CMAC at every state is denoted $y = \{y(1), y(2), \ldots, y(n)\}$. According to Equation (3), the grey relational coefficient for input state $s_k$ in a single comparison sequence can be written as follows.

$$GRC(\hat{y}(k), y(k)) = \frac{\Delta_{\min} + \xi \cdot \Delta_{\max}}{\Delta_y(k) + \xi \cdot \Delta_{\max}} \qquad (5)$$

where $\Delta_y(k) = |y(k) - \hat{y}(k)|$, $\Delta_{\max} = \max_k \Delta_y(k)$, $\Delta_{\min} = \min_k \Delta_y(k)$, and

$$\frac{\Delta_{\min}/\Delta_{\max} + \xi}{1 + \xi} \le GRC(\hat{y}(k), y(k)) \le 1.$$

Because both $\Delta_{\max}$ and $\Delta_{\min}$ are constant values for a CMAC model, the grey relational coefficient increases as the output error $\Delta_y(k)$ decreases.

During the learning phase, the output errors of input states should also be related to the inverse of the number of training iterations. This paper therefore proposes an adaptive regulation of the learning rate, termed the grey learning rate, based on the number of training iterations and the grey relational coefficients of input states in a CMAC model. During the i-th training iteration, the grey learning rate at input state $s_k$ is proposed as follows.

$$\text{grey}\_\alpha^{(i)}(k) = \frac{1}{i \cdot GRC^{(i-1)}(\hat{y}(k), y(k))} \qquad (6)$$

where $GRC^{(i-1)}(\hat{y}(k), y(k))$ is the grey relational coefficient at state $s_k$ in the (i-1)-th training iteration. Initially, $GRC^{(0)}(\hat{y}(k), y(k)) = 1$ and $\text{grey}\_\alpha^{(1)}(k) = 1$. A larger training iteration i means the system has already been trained more times. Similarly, a larger grey relational coefficient GRC means the system is closer to its final state. In both cases, Equation (6) gives a smaller learning rate, so the system is tuned by smaller changes to avoid overshooting or instability.
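The behaviour of the grey learning rate can be checked with a few lines of code. The sketch assumes the rate is the reciprocal of the product of the iteration number and the previous grey relational coefficient, which matches the stated initial values (rate 1 at iteration 1 with coefficient 1) and the stated trend; the function name and the argument checks are additions for illustration.

```python
def grey_learning_rate(iteration, previous_grc):
    """Grey learning rate for iteration i, given GRC from iteration i-1.

    Assumed form: 1 / (i * GRC). The rate falls as the iteration count
    grows and as the coefficient approaches 1.
    """
    if iteration < 1:
        raise ValueError("iteration numbers start at 1")
    if not 0.0 < previous_grc <= 1.0:
        raise ValueError("a grey relational coefficient lies in (0, 1]")
    return 1.0 / (iteration * previous_grc)
```

For example, `grey_learning_rate(1, 1.0)` gives 1.0, the initial value, and `grey_learning_rate(4, 0.5)` gives 0.5.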

3.3 Grey-area-time Credit Apportionment

A hyper cube that includes more input states is more affected by learning interference during the learning phase. To mitigate this effect, the distribution of errors among the addressed hyper cubes must be proportional to hyper cube credibility. The key information available for use as credit is the number of times a hyper cube has been updated [8][9]. In addition, the accuracy of the weights stored in hyper cubes should conceptually increase with the number of input states during a learning phase. The trained proportion of input states is therefore proposed as one factor of credibility. Further, as discussed previously, an adaptive learning rate is necessary to avoid an unstable system, so the grey relational grade of a hyper cube should also be a factor in its credibility. Consequently, the number of times a hyper cube has been updated, the trained proportion of its input states, and its grey relational grade are integrated into an indicator of hyper cube credibility. This credibility, termed grey-area-time, is defined as shown below.

$$\text{grey-area-time}^{(i)}(j) = \frac{(t(j)+1)^{-1} \times a^{(i)}(j) \times GRG^{(i)}(j)}{\sum_{c=1}^{m} \left[ (t(c)+1)^{-1} \times a^{(i)}(c) \times GRG^{(i)}(c) \right]} \qquad (7)$$

where $t(j)$ denotes the accumulated learning times of the j-th hyper cube and $m$ represents the number of hyper cubes addressed by an input state $s_k$. Note that 1 is added to $t(c)$ to prevent division by zero. $a^{(i)}(j)$ is defined in Equation (8) and $GRG^{(i)}(j)$ is defined in Equation (9).

$$a^{(i)}(j) = \frac{\text{the number of trained input states in the } j\text{-th hyper cube} + 1}{\text{max number of states in the hyper cube}} \qquad (8)$$

where $a^{(i)}(j)$ denotes the trained area proportion of input states in the j-th hyper cube at iteration i. Note that 1 is added to the numerator of Equation (8) to prevent the value from being zero.

$$GRG^{(i)}(j) = \sum_{c=1}^{p} \left[ w_c \cdot GRC^{(i-1)}\left( \hat{y}(c), y(c) \right) \right] \qquad (9)$$

where $GRG^{(i)}(j)$ represents the grey relational grade of the j-th hyper cube at the present iteration i, and p is the number of addressed input states corresponding to the j-th hyper cube. Moreover, $w_c$ is the weighting factor of the grey relational coefficient $GRC^{(i-1)}(\hat{y}(c), y(c))$ and $\sum_{c=1}^{p} w_c = 1$. Normally, $w_c = 1/p$ for all $c$. Initially, $GRG^{(1)}(j) = 1$.
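The grey-area-time credit is a normalised share, which the following sketch makes concrete. It assumes each addressed hyper cube's raw credit is its trained-area proportion times its grey relational grade, divided by its update count plus one, and that the shares are the raw credits divided by their sum. The function names and the list-based interface are hypothetical.

```python
def trained_area(trained_states, max_states):
    """Trained area proportion: (trained input states + 1) / max states."""
    return (trained_states + 1) / max_states


def grey_area_time(update_counts, areas, grades):
    """Credit share of each addressed hyper cube.

    update_counts[c] is t(c), areas[c] is the trained area proportion,
    and grades[c] is the grey relational grade of hyper cube c.
    The returned shares sum to 1.
    """
    raw = [a * g / (t + 1) for t, a, g in zip(update_counts, areas, grades)]
    total = sum(raw)
    return [r / total for r in raw]
```

With two addressed hyper cubes that differ only in update count (0 and 1), the raw credits are 1 and 0.5, so the less-trained hyper cube receives two thirds of the correction and the other one third.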

Based on the above discussion, this study modifies the weight updating formula in Equation (2) according to the grey-area-time credit assignment. That is, Equation (2) is rewritten as

$$w_j^{(i)} = w_j^{(i-1)} + \alpha \cdot a_{s,j} \cdot \left( \hat{y}_s - \mathbf{a}_s^T \mathbf{w}^{(i-1)} \right) \cdot \text{grey-area-time}^{(i)}(j) \qquad (10)$$

3.4 A Novel Learning Framework

Two key factors affect the learning result of hyper cubes: (1) the amount of error distributed to a hyper cube, and (2) the learning rate. Combining the previous discussions, Equation (11) shows how the grey-area-time credit assignment and the grey learning rate are integrated to update the weights of hyper cubes. The learning rate $\alpha$ of Equation (10) is replaced by $\text{grey}\_\alpha^{(i)}(k)$ of Equation (6).

$$w_j^{(i)} = w_j^{(i-1)} + \text{grey}\_\alpha^{(i)}(k) \cdot a_{s,j} \cdot \left( \hat{y}_s - \mathbf{a}_s^T \mathbf{w}^{(i-1)} \right) \cdot \text{grey-area-time}^{(i)}(j) \qquad (11)$$

A CMAC model that uses Equation (11) as its learning mechanism is termed a Grey-area-time CMAC. In Equation (11), it is easy to verify that $\text{grey}\_\alpha^{(i)}(k) \cdot \text{grey-area-time}^{(i)}(j)$ does not exceed 1 and gradually approaches zero in the later cycles of the learning phase.
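One weight update of the Grey-area-time CMAC can be sketched as below. The sketch assumes a binary index vector (only the addressed cells change), takes the grey learning rate to be 1 / (i x GRC from the previous iteration), and expects the credit shares of the addressed hyper cubes to be supplied by the caller. The function name and argument layout are hypothetical.

```python
def grey_area_time_update(weights, addressed, target, iteration,
                          previous_grc, credits):
    """Apply one Grey-area-time weight update to the addressed cells.

    `credits` holds the grey-area-time share of each addressed hyper
    cube, in the same order as `addressed`. Modifies `weights` in place
    and returns the learning error measured before the update.
    """
    output = sum(weights[j] for j in addressed)
    error = target - output
    rate = 1.0 / (iteration * previous_grc)  # assumed grey learning rate
    for j, credit in zip(addressed, credits):
        weights[j] += rate * error * credit  # a_{s,j} = 1 for these cells
    return error
```

In the first iteration (rate 1) with credit shares 2/3 and 1/3, a target of 1.0 is reached in one step, but the correction is split unevenly in favour of the hyper cube with the higher credit.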

4. Simulation Results

In each experiment, four approaches are compared: the conventional CMAC, the Time-credit CMAC [8], the Fuzzy-time-credit CMAC [9], and the Grey-area-time CMAC proposed in this paper.

Example 1:

The target function is y(x) = sin x / x, where −30 < x < 30. The distinguishing coefficient of the grey relational coefficients is set to ξ = 0.1. There are 10 training cycles. The root mean square error (RMSE) is used for comparison. Figure 2 shows the performance of the different CMAC models. The proposed Grey-area-time CMAC has the best performance in Figure 2. Furthermore, the Grey-area-time CMAC is stable and converges faster than the other three methods.
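For readers reproducing the comparison, the target function of Example 1 and the RMSE measure can be written as follows. Defining the target as 1 at x = 0 (the limit of sin x / x) is an assumption added here, since the paper does not say how that point is handled; the function names are illustrative.

```python
import math


def target(x):
    """Example 1 target y(x) = sin(x) / x, with y(0) taken as the limit 1."""
    return 1.0 if x == 0 else math.sin(x) / x


def rmse(actual, desired):
    """Root mean square error between two equally long sequences."""
    squared = sum((a - d) ** 2 for a, d in zip(actual, desired))
    return math.sqrt(squared / len(actual))
```

An RMSE curve like the one in Figure 2 would be obtained by calling `rmse` on the model outputs and `target` values over all quantized input states after each training cycle.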

[Figure 2: Learning comparison for different CMAC models. RMSE (axis range 0 to 0.07) versus training iteration (1 to 10) for the Conventional CMAC, Time-credit CMAC, Fuzzy-time-credit CMAC, and Grey-area-time CMAC.]

Example 2:

The target function is z(x, y) = (x² − y²) sin 5x, where −1 < x < 1 and −1 < y < 1. The distinguishing coefficient is set to ξ = 0.1. The root mean square error (RMSE) is used for the learning comparison. Figure 3 shows the performance of the different CMAC models. The proposed Grey-area-time CMAC yields a smaller RMSE than the other three models from the first learning cycle onward. Furthermore, the RMSE of the Grey-area-time CMAC decreases monotonically until it stabilizes after 9 learning cycles.

[Figure 3: Learning comparison for different CMAC models. RMSE (axis range 0 to 0.04) versus training iteration (1 to 10) for the Conventional CMAC, Time-credit CMAC, Fuzzy-time-credit CMAC, and Grey-area-time CMAC.]

5. Conclusions

This paper presents an enhanced strategy for creating a novel learning framework for the CMAC model. The accumulated number of updates to the hyper cubes [2][6], the trained proportion of input states, and the grey relational grade of the hyper cubes are integrated into a measure of the credibility of the hyper cubes for each input state. This paper also regulates the learning rate adaptively according to the number of training iterations and the grey relational coefficients in the CMAC model. The credit apportionment is combined with the grey learning rate to improve system performance. The experiments indicate that the proposed approach works well in terms of system stabilization, fast convergence, and approximation to the target function.

References

[1] M. F. Yeh and H. C. Lu, "On-Line Adaptive Quantization Input Space in CMAC Neural Network", IEEE International Conference on Systems, Man and Cybernetics, vol.4, 2002

[2] H. M. Lee and C. M. Chen, "Self-Organizing HCMAC Neural-Network Classifier", IEEE Transactions on Neural Networks, vol.14, no.1, pp.15-27, 2003

[3] J. C. Jan and S. L. Hung, "High-order MS_CMAC Neural Network", IEEE Transactions on Neural Networks, vol.12, no.3, pp.598-603, 2001

[4] C. T. Chiang and C. S. Lin, "CMAC with General Basis Functions", Neural Networks, vol.9, no.7, pp.1199-1211, 1996

[5] S. Sayil and K. Y. Lee, "A Hybrid Maximum Error Algorithm with Neighborhood Training for CMAC", IEEE Proceedings of the International Joint Conference on Neural Networks, vol.1, pp.165-170, 2002

[6] G. Horváth and T. Szabó, "CMAC Neural Network with Improved Generalization Property for System Modeling", IEEE Instrumentation and Measurement Technology Conference, vol.2, pp.1603-1608, 2002

[7] C. S. Lin and C. T. Chiang, "Learning Convergence of CMAC Technique", IEEE Transactions on Neural Networks, vol.8, no.6, pp.1281-1292, 1997

[8] S. F. Su, T. Tao, and T. H. Hung, "Credit Assigned CMAC and Its Application to Online Learning Robust Controllers", IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, vol.33, no.2, pp.202-213, 2003

[9] Shun-Feng Su, Zne-Jung Lee, and Yan-Ping Wang, "Robust and Fast Learning for Fuzzy Cerebellar Model Articulation Controllers", IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, vol.36, no.1, pp.203-208, 2006

[10] Ming-Feng Yeh and Kuang-Chiung Chang, "A Self-Organizing CMAC Network With Grey Credit Assignment", IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, vol.36, no.3, pp.623-635, 2006

[11] H. C. Lu and J. C. Chang, "Enhance the Performance of CMAC Neural Network via Fuzzy Theory and Credit Apportionment", IEEE Proceedings of the 2002 International Joint Conference on Neural Networks, vol.1, pp.715-720, 2002

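NSC-Subsidized Project: Derived R&D Results Promotion Data Sheet

Date: 2011/01/04

NSC-subsidized project

Project title: A Study on the Integration and Enhancement of a Robust Fuzzy Clustering Method and a Novel Fuzzy Modeling Method

Principal investigator: Ying-Kuei Yang (楊英魁)

Project number: 98-2221-E-011-118-

Field: Artificial intelligence

No R&D results promotion data.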

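Summary of Research Results for Research Projects of ROC Year 98 (2009)

Principal investigator: Ying-Kuei Yang (楊英魁)
Project number: 98-2221-E-011-118-
Project title: A Study on the Integration and Enhancement of a Robust Fuzzy Clustering Method and a Novel Fuzzy Modeling Method

Columns: Achieved = number actually achieved (accepted or published); Expected = expected total (including the number actually achieved); Contribution = actual contribution of this project. The remarks column (qualitative notes, for example results shared by several projects or a result featured as a journal cover story) is empty in every row.

Item                                    Achieved  Expected  Contribution  Unit

Domestic
  Publications
    Journal papers                          0         0        100%       papers
    Research/technical reports              0         0        100%       papers
    Conference papers                       0         0        100%       papers
    Books                                   0         0        100%       chapters/books
  Patents
    Pending                                 0         0        100%       items
    Granted                                 0         0        100%       items
  Technology transfer
    Number of cases                         0         0        100%       items
    Royalties                               0         0        100%       thousand NT$
  Project personnel (ROC nationals)
    Master's students                       0         0        100%       person-times
    Doctoral students                       0         0        100%       person-times
    Postdoctoral researchers                0         0        100%       person-times
    Full-time assistants                    0         0        100%       person-times

Foreign
  Publications
    Journal papers                          2         2        100%       papers
    Research/technical reports              0         0        100%       papers
    Conference papers                       3         3        100%       papers
    Books                                   0         0        100%       chapters/books
  Patents
    Pending                                 0         0        100%       items
    Granted                                 0         0        100%       items
  Technology transfer
    Number of cases                         0         0        100%       items
    Royalties                               0         0        100%       thousand NT$
  Project personnel (foreign nationals)
    Master's students                       2         2        100%       person-times
    Doctoral students                       1         1        100%       person-times
    Postdoctoral researchers                0         0        100%       person-times
    Full-time assistants                    0         0        100%       person-times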

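Other results

(Results that cannot be expressed quantitatively, such as organizing academic activities, awards received, important international cooperation, the international impact of the research results, and other concrete benefits to industrial technology development. Please describe in text.)

None

Additional items for Department of Science Education projects (result item, quantity; name or brief description of content)

Assessment tools (qualitative and quantitative): 0

Courses/modules: 0

Computer and network systems or tools: 0

Teaching materials: 0

Activities/competitions held: 0

Seminars/workshops: 0

E-newsletters, websites: 0

Number of participants (audience) in the promotion of project results: 0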

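NSC-Subsidized Research Project Final Report: Self-Evaluation Form

Please give an overall evaluation covering how closely the research content matches the original plan, how far the expected goals were achieved, the academic or applied value of the results (briefly describe their significance, value, impact, or potential for further development), whether they are suitable for publication in academic journals or for patent application, and the main findings or other relevant value.

1. Overall evaluation of how closely the research content matches the original plan and how far the expected goals were achieved

■ Goals achieved

□ Goals not achieved (please explain, within 100 characters)

□ Experiment failed

□ Experiment interrupted for some reason

□ Other reasons

Explanation:

2. Publication of the results in academic journals, patent applications, and the like:

Papers: ■ Published □ Unpublished manuscript □ In preparation □ None

Patents: □ Granted □ Pending ■ None

Technology transfer: □ Transferred □ Under negotiation ■ None

Other: (within 100 characters)

One SCI journal paper resulting from this project has been published, another has been accepted by an SCI journal and will appear in 2011, and three conference papers have been presented.

3. Please evaluate the academic or applied value of the research results in terms of academic achievement, technical innovation, and social impact (briefly describe the significance, value, impact, or potential for further development of the results) (within 500 characters)

Results of this project: 1. An unsupervised clustering algorithm based on fuzzy distance relations is proposed. 2. A robust fuzzy C-means algorithm (RFCM, robust FCM) is proposed to reduce the influence of noise and outliers. 3. A fuzzy-based data sifter (FDS) is proposed to compute the turning points of the cluster centers, from which a TS fuzzy model is built. Because the cluster centers obtained by RFCM are the training data, the influence of noise and outliers on the resulting parameters has been eliminated or is very small.

Future applications: 1. The robust fuzzy clustering algorithm proposed in this project can, in character-shape analysis, find a skeleton that adequately represents a character's glyph. This result can therefore be extended to research on font recognition and pattern recognition. 2. The theory of this project can be developed further for application to data mining, knowledge management, control, and other fields.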
