Grey Prediction Model and Hierarchical CMAC Neural Networks and Their Applications on Web Mining (III)

National Science Council, Executive Yuan: Research Project Report (3/3)

Project type: Individual project

Project number: NSC92-2213-E-011-001-

Project period: August 1, 2003 to July 31, 2004

Institution: Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology

Principal investigator: Hahn-Ming Lee (李漢銘)

Project participants (graduate students, Graduate Institute of Computer Science and Information Engineering, National Taiwan University of Science and Technology): 王榮英, 洪大為, 劉承剛, 葉柏毅

Report type: Full report

Attachments: Report on attending an international conference, and published papers

Availability: This project report is open to public access

September 7, 2004

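I. Chinese Abstract (Keywords: Web mining, CMAC neural network, hierarchical CMAC neural network, personalized user profile, classifier)

The volume of information has grown rapidly in recent years, especially on the World Wide Web (WWW). Because the amount of data that must be searched through search engines has exploded, search results do not always match what the user needs. Providing personalized information suited to each user's needs is therefore very important, and Web mining techniques are widely regarded as practical tools for this task. Earlier studies identified the main problems in Web mining as handling high-dimensional data, incremental learning strategies, scalability to large volumes of information, and parallel and distributed data mining algorithms. The self-organizing HCMAC neural network classifier proposed in the second year of this project was designed for exactly these problems. The network is built from two-dimensional weighted grey CMACs; it handles high-dimensional classification problems and self-organizes its memory structure according to the distribution of the training samples. We also proposed a learning algorithm that learns incrementally from newly added data samples to train the self-organizing HCMAC neural network. The architecture can further be used to learn incrementally from user feedback in order to identify personalized Web pages.

In this year's project we applied the method to a benchmark data set of Web pages covering four user topics, implemented a system, and tested it to demonstrate the effectiveness of the method. The experimental results show that the self-organizing HCMAC neural network learns incrementally and avoids the heavy memory consumption that conventional methods incur on high-dimensional data. The experiments also confirm that the method predicts the pages a user is interested in better than well-known classifiers; that is, we successfully applied grey prediction and hierarchical CMAC neural networks to Web mining. The goals set for the three-year project have therefore been achieved, and the related results have been published in international journals and at international conferences.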

English Abstract (Keywords: Web Mining, Cerebellar Model Arithmetic Computer, Hierarchical CMAC Neural Network, Personalized User Profile, Classifier)

In recent years, information has grown rapidly, especially on the World Wide Web. The volume of information returned by search engines tends to be large, and the documents are not tailored to a user's actual needs and interests. Offering a personalized service that includes only information of interest to the user has therefore become increasingly important. Web mining techniques have proven to be very useful tools for mining information of interest on the Web.

However, earlier studies have indicated that the main challenges in Web mining are handling high-dimensional data, achieving incremental learning (or incremental mining), and providing scalable, parallel, and distributed mining algorithms. This study presents a novel self-organizing Hierarchical CMAC (HCMAC) neural network composed of two-dimensional Weighted Grey CMACs (WGCMACs). The network handles higher-dimensional classification problems and self-organizes its memory structure according to the distribution of the training patterns.

Moreover, a learning algorithm that learns incrementally from newly added data without forgetting prior knowledge is proposed to train the self-organizing HCMAC neural network. It can be applied to learn user profiles incrementally from user feedback in order to identify personalized Web pages. A benchmark data set of Web page ratings that contains four topics of user profiles is used to demonstrate the effectiveness of the proposed method. Experimental results show that the self-organizing HCMAC neural network learns incrementally and overcomes the enormous memory requirement of the conventional CMAC when it is applied to higher-dimensional classification problems. Furthermore, the experiments confirm that the self-organizing HCMAC neural network identifies Web pages of interest to the user more accurately than other well-known classifiers.

Finally, the goal of our three-year plan on applying the Grey Prediction Model and Hierarchical CMAC Neural Networks to Web mining has been achieved. Related results have been published in international journals and international conference proceedings.

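II. Background and Objectives

Given the explosive growth of Web information, Web mining is regarded as a very effective way to extract information from the Web [1]. Earlier research, however, has concentrated on high-dimensional data processing [2], incremental learning [3], and mining algorithms that are scalable [4] and parallel and distributed [5]. Many intelligent search agents now offer users a personalized search service, using information about the user to filter out Web information the user does not need. For example, Pazzani et al. [6] let users mark retrieved pages as interesting (positive examples) or uninteresting (negative examples) and build a user profile from these marks for later personalized Web mining. Starr et al. [7] proposed the Do-I-Care Agent (DICA), which explores Web information shared among related users such as peers and groups. The CiteSeer system [8] likewise finds relevant scientific literature automatically according to a user's research interests.

Building on earlier work [9], [10], in the earlier phases of this project and last year [11], [12] we proposed a self-organizing hierarchical CMAC neural network. It comprises an input-space partitioning and analysis module, which analyzes the input sample space automatically, and a hierarchical CMAC neural network architecture. The self-organizing partitioning and analysis module uses an entropy measure and the golden section search method to decide, from the data distribution, how the input space is quantized. The hierarchical CMAC neural network decomposes a high-dimensional classification problem into many tractable two-dimensional subproblems, which overcomes the large memory requirement of conventional classifiers on high-dimensional problems. It is well suited to the high-dimensional information found in Web pages and to the incremental revision of user profiles that personalized Web mining requires. Finally, the method predicts a user's browsing behavior by learning the user profile incrementally. Experiments confirm that the method learns user behavior incrementally, predicts the pages a user is interested in well, handles high-dimensional classification, needs little memory, and is easy to implement on distributed and parallel systems.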
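The entropy-guided quantization described above can be illustrated with a minimal sketch. It assumes a single feature, two classes, and one cut point, and it returns the best point probed during the search so that flat regions of the piecewise-constant entropy do not mislead the result. The function names and the choice of weighted class entropy as the objective are illustrative assumptions, not the project's actual module.

```python
import math

def class_entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def split_entropy(values, labels, threshold):
    """Weighted entropy of the two intervals produced by cutting at threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    return (len(left) / n) * class_entropy(left) + (len(right) / n) * class_entropy(right)

def golden_section_min(f, lo, hi, tol=1e-3):
    """Golden section search on [lo, hi]; returns the best point probed.

    Assumes f is (weakly) unimodal on the interval.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    c = hi - inv_phi * (hi - lo)
    d = lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    best_x, best_f = (c, fc) if fc <= fd else (d, fd)
    while hi - lo > tol:
        if fc <= fd:
            # Minimum lies in [lo, d]: reuse c as the new upper probe.
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
            x, fx = c, fc
        else:
            # Minimum lies in [c, hi]: reuse d as the new lower probe.
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
            x, fx = d, fd
        if fx < best_f:
            best_x, best_f = x, fx
    return best_x

# Toy one-dimensional feature with two well-separated classes.
values = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
cut = golden_section_min(lambda t: split_entropy(values, labels, t), 0.0, 1.0)
```

On this toy data the cut falls in the gap between the two classes, so a quantization boundary placed there separates them with one boundary instead of a uniform grid.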

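III. Research Methods and Results

(A) Weighted grey CMAC with differentiable output (WGCMAC)

To build the hierarchical CMAC neural network architecture, we first need the notion of a differentiable CMAC. The conventional CMAC builds its hypercube structure from constant basis functions, so its output is constant within each quantized state and the derivative of the output with respect to the input cannot be obtained. This severely limits the conventional CMAC in practice and makes it impossible to build a hierarchical CMAC architecture from it. The weighted grey CMAC proposed in this project has lower computational complexity than the differentiable-output methods proposed by other researchers [15], [16], and it adds a grey prediction model to compute the predicted output value. Combined with an efficient algorithm, it learns faster and more accurately in our experiments, provides output derivative information, and needs little memory.

The results of this part were published at the international conferences International Fuzzy Systems Association and the North American Fuzzy Information Processing Society Joint Conference (IFSA/NAFIPS) 2001 [12] and International Computer Symposium, Taiwan 2000 [22].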
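The differentiability problem can be seen in a small numerical sketch. It compares a constant-basis lookup with a normalized Gaussian-basis output on a one-dimensional input. The Gaussian form, the cell centers, and the radius are assumptions chosen for illustration; the grey prediction step of the WGCMAC is not reproduced here.

```python
import math

def constant_basis_output(s, weights, width=1.0):
    """Conventional CMAC-style lookup: the output is the weight of the cell containing s."""
    index = min(int(s // width), len(weights) - 1)
    return weights[index]

def gaussian_basis_output(s, weights, centers, radius=0.5):
    """Normalized Gaussian-weighted sum of the cell weights; smooth in s."""
    acts = [math.exp(-((s - c) ** 2) / (2 * radius ** 2)) for c in centers]
    return sum(a * w for a, w in zip(acts, weights)) / sum(acts)

def numeric_derivative(f, s, h=1e-5):
    """Central-difference estimate of df/ds."""
    return (f(s + h) - f(s - h)) / (2 * h)

weights = [0.0, 1.0, 3.0]
centers = [0.5, 1.5, 2.5]

# Inside a cell the constant-basis output does not change, so its derivative is zero.
d_const = numeric_derivative(lambda s: constant_basis_output(s, weights), 1.2)
# The Gaussian-basis output varies smoothly, so a usable derivative exists.
d_gauss = numeric_derivative(lambda s: gaussian_basis_output(s, weights, centers), 1.2)
```

A zero derivative inside every cell gives gradient descent nothing to propagate to an earlier layer, which is why a differentiable output is a prerequisite for stacking CMACs.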

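(B) Architecture of the self-organizing HCMAC neural network

A hierarchical CMAC (HCMAC) neural network is composed of two-dimensional WGCMACs. Figure 1 shows the smallest HCMAC topology: each WGCMAC takes two feature inputs, and the outputs of the first-layer WGCMACs feed the inputs of the second-layer WGCMAC. The whole HCMAC architecture can be unfolded into a full binary tree topology. Using the output differentiability of the WGCMAC, we also derive a gradient descent learning rule for this network structure.

[Figure 1. Minimal architecture of the HCMAC neural network. The inputs s1 to s4 feed the hidden-layer units WGCMAC2 and WGCMAC3, whose outputs y1 and y2 feed the output unit WGCMAC1, which produces y(s).]

Figure 1 has four input features $s_i$ ($i = 1, 2, \ldots, 4$), where $s_i$ is the $i$-th network input, $y_j$ ($j = 1, 2$) is the output of the $j$-th hidden-layer unit, and $y(s)$ is the output of the HCMAC for a given input $s$. As Figure 1 shows, a problem with four input features can be split into two-dimensional WGCMACs. Figure 2 shows the detailed structure of the network in Figure 1. Unlike the conventional CMAC, which builds its memory structure by quantizing the input space uniformly [17], we adopt a more effective self-organizing quantization of the input space based on Shannon's entropy measure and the golden section search method [11]. This yields a more efficient memory structure and greatly reduces the memory used when the input space is high-dimensional and sparsely populated. The results of this part were published in the international journal IEEE Transaction on Neural Networks, vol. 14, no. 1, pp. 15-27, 2003 [11] and at the international conference International Joint INNS-IEEE Conference on Neural Networks, Washington DC, 2001 [23].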
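A rough cell count shows why the binary-tree decomposition saves memory. The sketch assumes every input, including the hidden outputs, is quantized into the same number of levels, and it ignores CMAC hashing and overlapping receptive fields; both functions are illustrative, not the report's memory model.

```python
def conventional_cells(num_features, levels):
    """Cells in a single grid over all features: levels ** num_features."""
    return levels ** num_features

def hcmac_cells(num_features, levels):
    """Cells in a full binary tree of two-input nodes.

    A full binary tree with num_features leaves has num_features - 1 internal
    nodes, and each two-input node needs a levels x levels grid.
    """
    if num_features < 2 or num_features & (num_features - 1) != 0:
        raise ValueError("this sketch assumes a power-of-two number of features")
    return (num_features - 1) * levels ** 2

# Four features as in the minimal architecture, 10 quantization levels each.
small_flat = conventional_cells(4, 10)    # one 4-D grid
small_tree = hcmac_cells(4, 10)           # three 2-D grids

# 128 features, the number of keywords used in the experiments.
large_tree = hcmac_cells(128, 10)
```

Under these assumptions the tree grows linearly with the number of features, while the single grid grows exponentially.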

[Figure 2. Detailed internal structure of the HCMAC neural network in Figure 1. The inputs s1 to s4 enter WGCMAC2 and WGCMAC3 in the (L-1)-th layer, whose outputs y1 and y2 feed WGCMAC1 in the output (L-th) layer. Each WGCMAC consists of a mapping stage, an association memory A, an actual memory, and a summation unit, with parameters (a1, b1), (a2, b2), (a3, b3) and weights w1,i, w2,i, w3,i. The error between the desired output and the actual output y(s), together with the derivative information, is fed back to the units.]

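(C) Learning rule of the self-organizing HCMAC neural network

The learning rule of the HCMAC neural network is derived for the minimal HCMAC architecture (Figure 1). The derivation uses the differentiable WGCMAC together with the gradient descent learning rule. Because the WGCMAC in the $L$-th layer of Figure 2 is the output node, the conventional CMAC learning rule can be used for its weights. To derive the learning rule, we define the cost function

$$E = \frac{1}{2}\left(\hat{y}(s) - y(s)\right)^2 \tag{1}$$

where $\hat{y}(s)$ is the desired output and $y(s)$ is the actual output.

Because the HCMAC neural network is built from two-dimensional WGCMACs, three kinds of parameters must be adjusted: the weights, and the radius and mean of the Gaussian function. We propose an error back-propagation algorithm to train the network. When the actual output differs from the desired output, the parameters of WGCMAC1 in the $L$-th (output) layer are updated first; the error is then propagated back from the $L$-th layer to the $(L-1)$-th hidden layer to update the parameters of WGCMAC2 and WGCMAC3. The learning rule is summarized as follows.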

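(1) Update WGCMAC1 in the output layer

$$\Delta a_1 = -\eta_a \frac{\partial E}{\partial a_1} = \eta_a \left(\hat{y}(s) - y(s)\right) \frac{\partial y(s)}{\partial a_1} \tag{2}$$

where $\Delta a_1$ is the correction to the coefficient $a$ in WGCMAC1, and $\partial y(s)/\partial a_1$ is evaluated from $a_1$, $b_1$, $N_e$, and the centers and weights of the addressed hypercubes. $a_1$ and $b_1$ are the current values of $a$ and $b$ in WGCMAC1. $c_{1,i}$ is the center, along the $i$-th feature dimension, of the first hypercube addressed by the input state $s$. $w_{1,1}$ is the weight of the first hypercube addressed by the input state $s$. $\eta_a$ is the learning rate. $N_e$ is the number of hypercubes addressed by the input state $s$.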

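$$\Delta b_1 = -\eta_b \frac{\partial E}{\partial b_1} = \eta_b \left(\hat{y}(s) - y(s)\right) \frac{\partial y(s)}{\partial b_1} \tag{3}$$

where $\Delta b_1$ is the correction to the coefficient $b$ in WGCMAC1 and $\eta_b$ is the learning rate.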

$$w_{1,i}(t) = w_{1,i}(t-1) + \alpha_w \left[x_t^{(0)}(i) - x^{(0)}(i)\right] \tag{4}$$

where $w_{1,i}(t)$ is the weight of the $i$-th hypercube addressed in WGCMAC1 by the input state $s$ at time $t$, and $w_{1,i}(t-1)$ is the same weight at time $t-1$. $x_t^{(0)}(i)$ is the target memory content of the $i$-th hypercube addressed by the input state $s$, and $x^{(0)}(i)$ is its actual memory content. $\alpha_w$ is the learning rate.
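The weight update of equation (4) touches only the hypercubes addressed by the current input. A minimal sketch follows; the list-and-dictionary data layout and the function name are assumptions made for illustration.

```python
def update_weights(weights, addressed, target, actual, alpha_w=0.1):
    """Apply w_i(t) = w_i(t-1) + alpha_w * (target_i - actual_i) to addressed cells only.

    weights   : list of current weights w_i(t-1)
    addressed : indices of the hypercubes addressed by the input state
    target    : dict index -> target memory content
    actual    : dict index -> actual memory content
    """
    new_weights = list(weights)
    for i in addressed:
        new_weights[i] = weights[i] + alpha_w * (target[i] - actual[i])
    return new_weights

old = [0.0, 0.0, 0.0, 0.0]
new = update_weights(old, addressed=[1, 2],
                     target={1: 1.0, 2: 0.5},
                     actual={1: 0.0, 2: 0.5},
                     alpha_w=0.5)
```

Cells 0 and 3 are not addressed and keep their weights, which is the local-update property that later makes incremental learning cheap.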

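(2) Update WGCMAC2 and WGCMAC3 in the hidden layer

First, the learning rule for WGCMAC2 in the hidden layer is

$$\Delta a_2 = -\eta_a \frac{\partial E}{\partial a_2} = \eta_a \left(\hat{y}(s) - y(s)\right) \frac{\partial y(s)}{\partial y_1} \cdot \frac{\partial y_1}{\partial a_2} \tag{5}$$

where $\Delta a_2$ is the correction to the coefficient $a$ in WGCMAC2. $\partial y_1/\partial a_2$ and $\partial y(s)/\partial y_1$ are computed from equations (6) and (10). $\eta_a$ is the learning rate.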

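Equation (6) gives $\partial y_1/\partial a_2$ in closed form in terms of $a_2$, $b_2$, $N_e$, and the centers and weights of the hypercubes addressed by the input state $s$, where $a_2$ and $b_2$ are the current values of $a$ and $b$ in WGCMAC2. $c_{1,i}$ is the center, along the $i$-th feature dimension, of the first hypercube addressed by the input state $s$. $w_{2,1}$ is the weight of the first hypercube addressed by the input state $s$. $\eta_a$ is the learning rate. $N_e$ is the number of hypercubes addressed by the input state $s$.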

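$$\Delta b_2 = -\eta_b \frac{\partial E}{\partial b_2} = \eta_b \left(\hat{y}(s) - y(s)\right) \frac{\partial y(s)}{\partial y_1} \cdot \frac{\partial y_1}{\partial b_2} \tag{7}$$

where $\Delta b_2$ is the correction to the coefficient $b$ in WGCMAC2. $\partial y_1/\partial b_2$ and $\partial y(s)/\partial y_1$ are computed from equations (8) and (10). $\eta_b$ is the learning rate.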

$$\frac{\partial y_1}{\partial b_2} = -\frac{1}{a_2}\left(1 - e^{a_2}\right) e^{-a_2 (N_e + 1)} \tag{8}$$

where $a_2$ and $b_2$ are the current values of $a$ and $b$ in WGCMAC2.

$$w_{2,i}(t) = w_{2,i}(t-1) + \alpha_w \left[x_t^{(0)}(i) - x^{(0)}(i)\right] \tag{9}$$

where $w_{2,i}(t)$ is the weight of the $i$-th hypercube addressed in WGCMAC2 by the input state $s$ at time $t$, and $w_{2,i}(t-1)$ is the same weight at time $t-1$. $x_t^{(0)}(i)$ is the target memory content of the $i$-th hypercube addressed by the input state $s$, and $x^{(0)}(i)$ is its actual memory content. $\alpha_w$ is the learning rate.

The learning rule for WGCMAC3 in the hidden layer is derived in the same way as for WGCMAC2. Equations (5) and (7) show that the difference between the desired and the actual output is propagated back from the output layer to the hidden layer through the derivative information $\partial y(s)/\partial y_1$ and $\partial y(s)/\partial y_2$. Equations (10) and (11) give these two derivatives in closed form as sums over the hypercubes addressed in WGCMAC1, where $y_i$ is the $i$-th feature (input) of WGCMAC1, $c_{1,i}$ is the center, along the $i$-th feature dimension, of the first hypercube addressed by the input state $s$, and $w_{1,1}$ is the weight of the first hypercube addressed by the input state $s$.
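The chain rule used in the hidden-layer update (5) can be checked numerically on a toy three-node tree. The sketch below substitutes a simple smooth two-input node, b * tanh(a * x1 + x2), for the WGCMAC, so `node`, its derivatives, and all parameter values are hypothetical stand-ins; only the structure of the update is taken from the text.

```python
import math

def node(x1, x2, a, b):
    """Hypothetical smooth two-input node standing in for a WGCMAC."""
    return b * math.tanh(a * x1 + x2)

def d_node_da(x1, x2, a, b):
    """Partial derivative of the node output with respect to its parameter a."""
    return b * (1 - math.tanh(a * x1 + x2) ** 2) * x1

def d_node_dx1(x1, x2, a, b):
    """Partial derivative of the node output with respect to its first input."""
    return b * (1 - math.tanh(a * x1 + x2) ** 2) * a

def forward(s, p):
    """Minimal tree: nodes 2 and 3 feed node 1, as in Figure 1."""
    y1 = node(s[0], s[1], p["a2"], p["b2"])
    y2 = node(s[2], s[3], p["a3"], p["b3"])
    y = node(y1, y2, p["a1"], p["b1"])
    return y1, y2, y

def cost(s, target, p):
    """E = 1/2 (target - y)^2, as in equation (1)."""
    return 0.5 * (target - forward(s, p)[2]) ** 2

def delta_a2(s, target, p, eta=0.1):
    """Update for a2: eta * (target - y) * dy/dy1 * dy1/da2, the form of equation (5)."""
    y1, y2, y = forward(s, p)
    dy_dy1 = d_node_dx1(y1, y2, p["a1"], p["b1"])
    dy1_da2 = d_node_da(s[0], s[1], p["a2"], p["b2"])
    return eta * (target - y) * dy_dy1 * dy1_da2

params = {"a1": 0.5, "b1": 1.0, "a2": 0.7, "b2": 0.9, "a3": 0.3, "b3": 1.1}
sample = [0.2, 0.4, 0.6, 0.8]
step = delta_a2(sample, 1.0, params)
```

Comparing `step` with a finite-difference estimate of `-eta * dE/da2` is a quick way to confirm that the back-propagated derivative is wired correctly before swapping in the real node equations.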

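(D) Incremental learning algorithm of the self-organizing HCMAC neural network

Storing weights in the self-organizing HCMAC neural network distributes the information across the WGCMACs in different layers. For a given input, only the few addressed hypercubes contribute to the output of the network. The difference between the HCMAC and a multilayer perceptron is that HCMAC learning modifies only part of the network, whereas a multilayer perceptron requires a global modification. The HCMAC can therefore learn from newly added training samples through local updates and readily achieves incremental learning. We divide learning into two phases: a batch learning phase and an incremental learning phase. Batch learning classifies the Web page data collected in advance. Because the data can never be collected completely in advance, new information obtained during later operation is learned incrementally. The steps of incremental learning are as follows:

Step 1. When a new pattern arrives, test it against the output obtained from batch learning. If it is classified correctly, skip this pattern and go to Step 4; otherwise go to Step 2.

Step 2. Using the previous training results, identify the neighboring patterns of the new pattern.

Step 3. Retrain on the neighboring patterns and the new pattern, using the existing training parameters and the same number of training epochs.

Step 4. If no new pattern has been added, end the learning procedure; otherwise go to Step 1.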
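The four steps above can be sketched as a loop around any classifier that supports in-place retraining. In this sketch a small perceptron stands in for the self-organizing HCMAC, neighbors are taken to be the k nearest stored patterns by Euclidean distance, and every pattern is stored after it is processed; all three choices, and the names `Perceptron` and `incremental_learn`, are assumptions for illustration.

```python
import math

class Perceptron:
    """Hypothetical stand-in classifier; the report uses the self-organizing HCMAC."""

    def __init__(self, dim, rate=0.1):
        self.w = [0.0] * (dim + 1)   # w[0] is the bias
        self.rate = rate

    def predict(self, x):
        z = self.w[0] + sum(wi * xi for wi, xi in zip(self.w[1:], x))
        return 1 if z > 0 else 0

    def train(self, examples, epochs):
        for _ in range(epochs):
            for x, label in examples:
                err = label - self.predict(x)
                self.w[0] += self.rate * err
                for i, xi in enumerate(x):
                    self.w[i + 1] += self.rate * err * xi

def incremental_learn(model, seen, new_patterns, k=3, epochs=20):
    """Process new patterns one at a time, retraining locally only on mistakes."""
    for x, label in new_patterns:
        if model.predict(x) != label:                                  # Step 1
            neighbours = sorted(seen, key=lambda p: math.dist(p[0], x))[:k]  # Step 2
            model.train(neighbours + [(x, label)], epochs)             # Step 3
        seen.append((x, label))
    return model                                                       # Step 4

# Batch phase on data collected in advance, then the incremental phase.
batch = [((0.0, 0.0), 0), ((0.1, 0.2), 0), ((1.0, 1.0), 1), ((0.9, 0.8), 1)]
model = Perceptron(dim=2)
model.train(batch, epochs=20)
seen = list(batch)
arrivals = [((0.2, 0.1), 0), ((0.5, 0.5), 1), ((0.8, 0.9), 1)]
incremental_learn(model, seen, arrivals, k=2)
```

Retraining only on the new pattern and its neighbors keeps the cost of each update independent of the size of the full training set, which mirrors the local-update argument made for the HCMAC.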

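(E) Results

To demonstrate the performance of the proposed method, we use the Syskill & Webert Web page ratings data [18] to test whether the self-organizing HCMAC neural network can learn a user profile incrementally and then mine the Web pages that interest the user, which is the goal of this application study. The data set was collected with a system whose user interface lets users rate pages at three interest levels: hot (very interesting), medium, and cold (not interesting). The user feedback is recorded in the user profile, and these ratings become the positive or negative examples. The data set also contains the original HTML source of the pages and four user topics: Bands-recording artists, Goats, Sheep, and BioMedical. In addition, it records the HTML file name, the user's rating, the page URL, the date visited, and the page title. Table 1 shows the topics used in the experiments and the distribution of user ratings.

Table 1. Web page topics used in the experiments and the distribution of user ratings (number of pages per interest level)

Topic                     Hot   Medium   Cold   Total
Bands-recording artists    15        7     39      61
Goat                       32        1     37      70
Sheep                      14        0     51      65
BioMedical                 32        3    101     136

We randomly select a subset of the rated pages to train the self-organizing HCMAC neural network and use the remaining pages to test the system. The 128 most informative keywords in the training set are selected as features by the Fair feature subset selection algorithm [19]; the training set is then converted into feature vectors to train the network, and the test set is converted into feature vectors in the same way. Finally, the learned user profile is used to judge whether the user is interested in each page in the test set. We ran 10 independent trials. Figure 3 shows the average prediction performance over the four user topics for different numbers of features; the proposed method has the best average performance. The results also show that too few features lead to low accuracy because most of the important discriminating features are lost, and that too many features also lower the accuracy because of noise or redundant features. The results of this part were published at the international conference Joint 9th IFSA World Congress and 20th NAFIP International Conference (IFSA/NAFIPS'01), 2001 [19], and have been accepted for publication in the international journal Applied Intelligence [20].

[Figure 3. Average prediction performance over the four user topics for different numbers of features. Horizontal axis: number of features (0 to 400). Vertical axis: testing accuracy (60 to 90). Series: VSM, LTC, BackP, SOI-HCMAC.]

Next, we compare the proposed method with the seven classifiers evaluated by Pazzani et al. [6]. That paper used 20 examples for training and the rest for testing. It built the feature vectors with information-based feature selection [6] and a Boolean vector representation, and its experiments showed that the Bayesian classifier performed best. Under the same experimental conditions, the proposed method matches or exceeds the other seven classifiers on BioMedical, Sheep, and Bands, and is second to Rocchio on Goats (69.0 versus 69.4), as Table 2 shows.

Table 2. Average prediction accuracy (%) of different classification algorithms using 128 keywords as features. Features: information-based feature selection algorithm (Boolean vector representation) [7], [8].

Topic        N Neigh   ID3    Percept   BackP   PEBLS   Bayes   Rocchio   SOI-HCMAC
BioMedical   74.5      70.2   73.2      76.0    74.6    77.3    77.5      77.6
Sheep        79.3      78.4   78.9      80.5    79.3    81.5    78.8      81.5
Goats        62.0      64.7   66.3      67.0    62.7    62.9    69.4      69.0
Bands        74.4      70.7   71.4      73.1    74.5    73.4    73.7      74.7

Figure 4 shows the incremental learning performance of the user profile for the BioMedical topic in the self-organizing HCMAC; with enough training examples, the accuracy is between about 75% and nearly 80%. The personalization results of this part were published at the international conference International Computer Symposium, Workshop on Artificial Intelligence, Don Wha University, 2002 [24], and another paper has been accepted for publication in the international journal Applied Intelligence [21].

[Figure 4. Incremental learning of the user profile for the BioMedical topic in the self-organizing HCMAC. Horizontal axis: number of examples (20 to 60). Vertical axis: accuracy (60 to 100). Series: training accuracy, testing accuracy.]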
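The feature pipeline used in the comparison (informative keywords, then Boolean vectors) can be sketched as follows. The sketch uses plain expected information gain on binary labels; it does not reproduce the Fair feature subset selection algorithm [19], and the toy documents and function names are invented for illustration.

```python
import math

def entropy(pos, neg):
    """Binary entropy (in bits) of a set with pos positive and neg negative examples."""
    total = pos + neg
    h = 0.0
    for c in (pos, neg):
        if c:
            h -= (c / total) * math.log2(c / total)
    return h

def information_gain(docs, labels, word):
    """Reduction in label entropy from knowing whether a page contains word."""
    with_w = [l for d, l in zip(docs, labels) if word in d]
    without = [l for d, l in zip(docs, labels) if word not in d]
    n = len(labels)
    gain = entropy(sum(labels), n - sum(labels))
    for part in (with_w, without):
        if part:
            gain -= (len(part) / n) * entropy(sum(part), len(part) - sum(part))
    return gain

def select_features(docs, labels, k):
    """Return the k words with the highest information gain (ties broken alphabetically)."""
    vocab = sorted(set().union(*docs))
    return sorted(vocab, key=lambda w: -information_gain(docs, labels, w))[:k]

def to_boolean_vector(doc, features):
    """1 if the page contains the feature word, else 0."""
    return [1 if w in doc else 0 for w in features]

# Toy pages as sets of words; label 1 = interesting (hot), 0 = not interesting (cold).
docs = [{"goat", "farm"}, {"goat", "milk"}, {"band", "tour"}, {"band", "farm"}]
labels = [1, 1, 0, 0]
features = select_features(docs, labels, k=2)
vector = to_boolean_vector({"goat", "milk"}, features)
```

In the experiments the same conversion would be applied with 128 selected keywords, giving each page a 128-dimensional Boolean vector as network input.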

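IV. Conclusion

In its third year, this project built on the results of the first two years, namely the gradient forecasting search algorithm and the self-organizing hierarchical CMAC neural network, and studied their application to Web mining. Motivated by the difficulty users have in finding documents of interest as Web information expands rapidly, we chose incremental, personalized Web mining as the application. The system implemented this year and its validation results show that the proposed method learns and builds a user profile promptly and effectively while the user interacts with Web pages, and that it resolves the large memory requirement that high-dimensional data causes in the conventional CMAC. Moreover, the self-organizing input-space method adapts the quantization of the input space to the input samples, so the memory structure is determined automatically and does not grow unreasonably. Overall, the experiments show good results in Web mining. We will continue to apply the results of these three years to other research in prediction, classification, and information mining, and will publish further findings at international conferences and in international journals.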

V. References

[1] Raymond Kosala and Hendrik Blockeel, "Web Mining Research: A Survey," ACM SIGKDD, vol. 1, issue 1, pp. 1-15, 2000.

[2] Jiawei Han and Micheline Kamber, Data Mining: Concepts and Techniques, A Harcourt Science and Technology Company, 2001.

[3] S. Thomas, S. Bodgala, K. Alsabti and S. Ranka, "An Efficient Algorithm for Incremental Updation of Association Rules in Large Databases," Proceedings of Third KDD Conference, August 1997.

[4] Mohammed J. Zaki, "Scalable Algorithms for Association Mining," IEEE Transactions on Knowledge and Data Engineering, vol. 12, no. 3, pp. 372-390, 2000.

[5] Jong Soo Park, Ming-Syan Chen and Philip S. Yu, "Efficient Parallel Data Mining for Association Rules," Proceedings of the ACM 4th International Conference on Information and Knowledge Management, November 29-December 2, pp. 31-36, 1995.

[6] Michael Pazzani and Daniel Billsus, "Learning and Revising User Profiles: The Identification of Interesting Web Sites," Machine Learning 27, pp. 313-331, 1997.

[7] Starr B., Ackerman M., and Pazzani M., "Do-I-Care: A Collaborative Web Agent," Proceedings of the ACM Conference on Human Factors in Computing Systems (CHI'96), pp. 273-274, 1996.

[8] Kurt D. Bollacker, Steve Lawrence and C. Lee Giles, "A System for Automatic Personalized Tracking of Scientific Literature on the Web," Proceedings of the Fourth ACM Conference on Digital Libraries, Berkeley, California, United States, pp. 105-113, August 11-14, 1999.

[9] J.S. Albus, "A New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC)," Journal of Dynamic Systems, Measurement, and Control, Transactions of ASME, pp. 220-227, 1975.

[10] J.S. Albus, "Data Storage in the Cerebellar Model Articulation Controller (CMAC)," Journal of Dynamic Systems, Measurement, and Control, Transactions of ASME, pp. 228-233, 1975.

[11] Hahn-Ming Lee, Chih-Ming Chen and Yung-Feng Lu, "A Self-organizing HCMAC Neural Network Classifier," IEEE Transaction on Neural Networks, vol. 14, no. 1, pp. 15-27, 2003.

[12] Chih-Ming Chen, Chin-Ming Hong, "A Weighted Grey CMAC Neural Network with Output Differentiability," International Fuzzy Systems Association and the North American Fuzzy Information Processing Society Joint Conference (IFSA/NAFIPS), vol. 2, pp. 1009-1014, Vancouver, Canada, July 25-28, 2001.

[13] J. L. Deng, "Control Problems of Grey System," Systems & Control Letters, vol. 1, pp. 288-294, 1982.

[14] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning Internal Representation by Error Propagation," Parallel Distributed Processing, vol. 1, pp. 318-362, 1996.

[15] C.T. Chiang and C.S. Lin, "CMAC with General Basis Functions," Neural Networks, vol. 9, no. 7, pp. 1199-1211, 1996.

[16] Chun-Shin Lin and Ching-Tsan Chiang, "Integration of CMAC Technique and Weighted Regression for Efficient Learning and Output Differentiability," IEEE Transactions on Systems, Man, and Cybernetics, vol. 28, no. 2, pp. 231-237, 1998.

[17] C.J. Lee and W.S. Lin, "A Method of Clustering Quantization for Better Training of CMAC," Journal of the Chinese Institute of Engineers, vol. 19, no. 3, pp. 309-320, 1996.

[18] Datasets for Data Mining, http://www.kdnuggets.com/datasets/index.html.

[19] Hahn-Ming Lee, Chih-Ming Chen and Chia-Chen Tan, "An Intelligent Web-Page Classifier with Fair Feature-Subset Selection," Proceedings of Joint 9th IFSA World Congress and 20th NAFIP International Conference (IFSA/NAFIPS'01), Vancouver, Canada, July 25-28, vol. 1, pp. 395-400, 2001.

[20] Hahn-Ming Lee, Chih-Ming Chen, and Chia-Chen Tan, 2003, "An Intelligent Web-Page Classifier with Fair Feature-Subset Selection," accepted by Applied Intelligence.

[21] Hahn-Ming Lee, Chi-Chun Huang and Tzu-Ting Kao, 2004, "Personalized Course Navigation based on Grey Relational Analysis," accepted by Applied Intelligence.

[22] Chih-Ming Chen and Hahn-Ming Lee, 2000, "An Efficient Gradient Forecasting Search Method Utilizing the Discrete Difference Equation Prediction Model," Proc. of 2000 International Computer Symposium, Taiwan.

[23] Hahn-Ming Lee, Chih-Ming Chen, and Yung-Feng Lu, 2001, "A Self-organizing HCMAC Neural Network Classifier," Proc. of the International Joint INNS-IEEE Conference on Neural Networks, Washington DC, 2001.

[24] Chih-Ming Chen, Hahn-Ming Lee, and Ya-Hui Chen, 2002, "A Personalized E-Learning System by Using Item Response Theory," International Computer Symposium, Workshop on Artificial Intelligence, Don Wha University, 2002.
