
National Science Council, Executive Yuan: Research Project Final Report


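A Study of Fuzzy-Measure Choquet Integrals Applied to Educational Test Analysis (I): Research Report (Condensed Version)

Project type: Individual

Project number: NSC 97-2410-H-468-014-

Project period: 1 August 2008 to 31 July 2009

Executing institution: Department of Biotechnology, Asia University

Principal investigator: Hsiang-Chuan Liu (劉湘川)

Co-principal investigator: 郭伯臣

Project participants: full-time assistant (bachelor's level): 林莞惠

Report attachments: report on attending an international conference, and the published papers

Handling: this project involves patents or other intellectual property rights; open to public inquiry after 2 years

30 October 2009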


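Final Report (Condensed Version)

(2008/08/01~2009/07/31)

Project type: ■ Individual project □ Integrated project

Project number: NSC 97-2413-H-468-014-

Project period: 1 August 2008 to 31 July 2009

Principal investigator: Hsiang-Chuan Liu (劉湘川), Department of Bioinformatics and Department of Psychology, Asia University

Co-principal investigator: 郭伯臣, Graduate Institute of Educational Measurement and Statistics, National Taichung University of Education

Project participants:

Full-time assistant:

林莞惠, Statistics and Information Section, Department of Applied Mathematics, Providence University

Part-time assistants:

劉育隆, doctoral program, Department of Computer Science and Information Engineering, Asia University

杜雨潔, doctoral program, Graduate Institute of Educational Measurement and Statistics, National Taichung University of Education

林士勛, Graduate Institute of Educational Measurement and Statistics, National Taichung University of Education

Report type (submitted according to the approved funding list): condensed report

Report attachments: one report on attending an international academic conference and one copy of the published papers

Handling: this research project involves patents or other intellectual property rights; open to public inquiry after two years

Executing institution: Asia University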


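... an additive measure with probabilities as its density. The greatest advantage of this convention is computational simplicity, but the additivity of probability clearly requires the basic assumption that different probability densities do not interact. In many practical applications this assumption does not fully hold, and so various non-additive measures have emerged, for example the possibility measure, the plausibility measure, the belief measure, and the necessity measure.

In fact, these four non-additive measures and the familiar probability measure are all special cases of monotone measures. Because a monotone measure has many monotonicity conditions that are hard to determine simultaneously, and is therefore imprecise, Sugeno, when proposing the λ-measure in 1974, was the first to call monotone measures fuzzy measures. Following Sugeno, Zhenyuan Wang and George J. Klir published in 1992 the first book on monotone measures and titled it Fuzzy Measure Theory. This theory generalizes classical measure theory. However, in its early development the theory of monotone measures dealt only with crisp numbers, not fuzzy numbers, so the name "fuzzy measure" was not really appropriate. In particular, monotone measures have now been extended from crisp numbers to fuzzy numbers, and a monotone measure on fuzzy numbers may be called a fuzzified monotone measure. If monotone measures were still called fuzzy measures, a "fuzzy measure on fuzzy numbers" would be ambiguous. In addition, non-monotone measures have also been introduced. For these reasons, the expanded edition published by Wang and Klir in 2009 was renamed Generalized Measure Theory. Because "fuzzy measure" is still the term that researchers find familiar and easy to communicate with, this project also refers to monotone measures as fuzzy measures.

A monotone measure must be paired with a monotone integral to be useful; in other words, the Lebesgue integral must be extended correspondingly to a monotone integral. Just as monotone measures are called fuzzy measures, monotone integrals are often called fuzzy integrals. The first to propose a monotone integral that improves on the Lebesgue integral was probably the Italian mathematician Giuseppe Vitali (1925, 1997). He published it in Italian in 1925, and it became known only after it was translated into English in 1997. It was later rediscovered by the French mathematician Gustave Choquet, who proposed it again in 1954. After more than twenty years of vigorous development, researchers have become used to calling it the Choquet integral, and this project does the same. Sugeno (1974) also proposed a new fuzzy integral, called the Sugeno integral, which together with his λ-measure is widely applied in engineering and management. However, the Sugeno integral is less sensitive than the Choquet integral and is not a generalization of the Lebesgue integral, so it is not included in this project for now.

Sugeno (1974) classified fuzzy measures into three kinds: subadditive, additive, and superadditive. The main reason is that his λ-measure admits only these three possibilities, depending on the value of λ, and researchers currently also assume that fuzzy measures fall into only these three classes. In 2006 the author first pointed out that, for practical purposes, monotone mixed fuzzy measures cannot be ignored. The author also pointed out that Sugeno's λ-measure and Zadeh's (1978) possibility measure are both univalent fuzzy measures with limited applicability, so multivalent fuzzy measures covering all four of the above classes need to be developed. Since 2007 the author has proposed a series of families of multivalent fuzzy measures with these improved properties. Besides serving applications in engineering and management, they are intended to advance both theory and application and to be transferred to the field of educational testing.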

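II. Research Objectives and Methods

This project is the first year of a three-year research project, "A Study of Fuzzy-Measure Choquet Integrals Applied to Educational Test Analysis (I)". This year's project mainly investigates "applying the Choquet integral methods with several new fuzzy measures provided by the author to establish ...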


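1. The author proposes the L-measure and investigates its basic mathematical properties.

2. The author proposes the improved completed L-measure and investigates its basic mathematical properties.

3. The author proposes a new measure, the δ-measure, and investigates its basic mathematical properties.

4. The author proposes a composed multivalent fuzzy measure based on the L-measure and the δ-measure and investigates its basic mathematical properties.

5. The author and colleagues propose the C-measure for grouped data and investigate its basic properties.

6. The author proposes the γ fuzzy density (γ-support) and verifies that it outperforms the C-support and the V-support.

B. Programming of the computer analysis system

1. Programming of the computer analysis system for the Choquet integral regression model based on the L-measure and γ-support

2. Programming of the computer analysis system for the Choquet integral regression model based on the completed L-measure and γ-support

3. Programming of the computer analysis system for the Choquet integral regression model based on the δ-measure and γ-support

4. Programming of the computer analysis system for the Choquet integral regression model based on the L(δ) composed multivalent fuzzy measure and γ-support

5. Programming of the computer analysis system for the Choquet integral regression model based on the C-measure and γ-support

C. Two empirical data sets for educational assessment prediction

1. For 60 students at a junior high school in Miaoli, graduation grades in junior-high mathematics, physics and chemistry, biology, and earth science are used to predict their natural science scores on the Basic Competence Test for junior high school students.

2. For 128 students at an elementary school in Taichung, body-fat percentages estimated from skinfold thickness at four sites (triceps, biceps, subscapular, and suprailiac) are used to predict the body-fat percentage measured by a body-fat meter.

D. With the mean squared error of prediction as the comparison criterion, the prediction models are compared on the above data by k-fold cross-validation.

The prediction models are listed below:

1. Multiple regression prediction model

2. Ridge regression prediction model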


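5. Choquet integral regression model based on the L-measure and γ-support

6. Choquet integral regression model based on the completed L-measure and γ-support

7. Choquet integral regression model based on the δ-measure and γ-support

8. Choquet integral regression model based on the L(δ) composed multivalent fuzzy measure and γ-support

9. Choquet integral regression model based on the C-measure and γ-support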
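The comparison criterion described above, mean squared prediction error under k-fold cross-validation, can be sketched as follows. This is a minimal illustration, not the project's analysis system: the function names `k_fold_mse`, `fit_line`, and `fit_mean` are hypothetical, and a one-predictor least-squares line stands in for the regression and Choquet integral models, whose formulas are not given in this report.

```python
import random


def k_fold_mse(xs, ys, fit, k=5, seed=0):
    """Mean squared prediction error of `fit` over k cross-validation folds.

    `fit(train_xs, train_ys)` must return a function that predicts y from x.
    """
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)          # reproducible random split
    folds = [idx[i::k] for i in range(k)]     # k roughly equal folds
    sq_err, count = 0.0, 0
    for test_idx in folds:
        held_out = set(test_idx)
        train = [i for i in idx if i not in held_out]
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in test_idx:                    # score only on held-out cases
            sq_err += (predict(xs[i]) - ys[i]) ** 2
            count += 1
    return sq_err / count


def fit_line(xs, ys):
    """Ordinary least squares with one predictor (illustrative stand-in model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)


def fit_mean(xs, ys):
    """Baseline that always predicts the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m
```

Any model that exposes the same `fit` interface can be passed in, so several prediction models can be ranked by the same held-out error.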

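III. Research Results and List of Published Papers

A. Key mathematical properties of the L-measure and its Choquet integral regression model

(1) The following key mathematical properties of the L-measure are established:

1. The L-measure satisfies boundedness and monotonicity and is therefore a fuzzy measure.

2. The L-measure is a continuous increasing function of its determining coefficient L on the domain $[0, \infty)$.

3. The L-measure is a multivalent fuzzy measure with $L \in [0, \infty)$: different values of the coefficient L determine different fuzzy measures. In other words, the L-measure has infinitely many fuzzy-measure solutions, and its formula has a closed form.

4. When L = 0, the L-measure is exactly Zadeh's P-measure.

5. The L-measure can be a mixed fuzzy measure, a subadditive fuzzy measure, or a superadditive fuzzy measure.

(2) Programming of the computer analysis system for the Choquet integral regression model based on the L-measure and γ-support was completed.

(3) Two EI-indexed papers, listed below, were published. They verify that the L-measure Choquet integral regression model outperforms the multiple regression model, the ridge regression model, the Choquet integral regression model based on Zadeh's P-measure and γ-support, and the Choquet integral regression model based on Sugeno's λ-measure and γ-support.

(See the attached international conference paper and conference report.)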

1. Hsiang-Chuan Liu, Yu-Chieh Tu, Wen-Chih Lin, and Chin-Chun Chen (2008). Choquet integral regression model based on L-Measure and γ-Support. Proceedings of 2008 International Conference on Wavelet Analysis and Pattern Recognition (Hong Kong, 30-31 Aug. 2008), Volume 2, pp. 777-782. ISBN: 978-1-4244-2238-8. (EI paper)

2. Hsiang-Chuan Liu, Yu-Du Jheng, Guey-Shya Chen, and Bai-Cheng Jeng (2008). Choquet Integral Logistic Regression Algorithms Based on L-Measure and γ-Support. Proceedings of 2008 International Conference on Wavelet Analysis and Pattern Recognition (Hong Kong, 30-31 Aug. 2008), Volume 2, pp. 771-776. ISBN: 978-1-4244-2238-8. INSPEC Accession Number: 10299006. (EI paper)

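B. Key mathematical properties of the completed L-measure and its Choquet integral regression model

(1) The following key mathematical properties of the completed L-measure are established:

1. For a given fuzzy density, the maximal fuzzy measure (the B-measure) and the notion of a completed measure are defined. It is shown that Sugeno's λ-measure, Zadeh's P-measure, and the L-measure do not include the B-measure; in other words, none of the λ-measure, the P-measure, and the L-measure is a completed measure.

2. The completed L-measure satisfies boundedness and monotonicity and is therefore a fuzzy measure.

3. The completed L-measure is a continuous increasing function of its determining coefficient L on the domain $[0, \infty)$.

4. The completed L-measure is a multivalent fuzzy measure with $L \in [0, \infty)$: different values of the coefficient L determine different fuzzy measures. In other words, the L-measure has infinitely many fuzzy-measure solutions, and its formula has a closed form.

5. When L = 0, the completed L-measure is exactly the minimal measure, Zadeh's P-measure.

6. As $L \to \infty$, the completed L-measure is exactly the maximal measure, the B-measure.

7. The completed L-measure can be a mixed fuzzy measure, a subadditive fuzzy measure, or a superadditive fuzzy measure.

(2) Programming of the computer analysis system for the Choquet integral regression model based on the completed L-measure and γ-support was completed.

(3) An EI-indexed paper, which was also published in a book, is listed below. It verifies that the completed L-measure Choquet integral regression model outperforms the multiple regression model, the ridge regression model, the Zadeh P-measure Choquet integral regression model, the Sugeno λ-measure Choquet integral regression model, and the L-measure Choquet integral regression model.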

1. Hsiang-Chuan Liu (2009). "A theoretical approach to the completed L-fuzzy measure", Conference Proceedings of 2009 International Institute of Applied Statistics Studies (2009IIASS), July 24-28, 2009, Qingdao, China, pp. 1121-1124. ISBN: 978-0-9806057-4-7. (EI paper)

2. Hsiang-Chuan Liu (2009). "A theoretical approach to the completed L-fuzzy measure", Quantitative Analysis Technology and Related Engineering Applications, pp. 1121-1124, 2009, AUSSINO ACADEMIC PUBLISH HOUSE, Sydney, Australia, ZHU Koulai & Henry ZHANG. ISBN: 978-0-9806057-4-7.


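1. The δ-measure satisfies boundedness and monotonicity and is therefore a fuzzy measure.

2. The δ-measure is a continuous increasing function of its determining coefficient δ on the domain $[-1, 1]$.

3. The δ-measure is a multivalent fuzzy measure with $\delta \in [-1, 1]$: different values of the coefficient δ determine different fuzzy measures. In other words, the δ-measure has infinitely many fuzzy-measure solutions, and its formula has a closed form.

4. When δ = -1, the δ-measure is exactly Zadeh's P-measure.

5. When δ = 0, the δ-measure is exactly the additive measure. When the fuzzy densities sum to 1, Sugeno's λ-measure is the additive measure, so in that case the δ-measure also coincides with the λ-measure. It is also shown that neither the L-measure nor the completed L-measure includes the additive measure.

6. When $-1 \le \delta < 0$, the δ-measure is a subadditive measure; when $0 < \delta \le 1$, the δ-measure is a superadditive measure.

7. The δ-measure cannot be a mixed fuzzy measure or a completed measure.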
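Property 5 rests on a standard fact about Sugeno's λ-measure: for fuzzy densities $g_1, \dots, g_n$, the parameter λ > -1 solves $\prod_i (1 + \lambda g_i) = 1 + \lambda$, and λ = 0 (the additive case) occurs exactly when the densities sum to 1. The sketch below illustrates only that fact; the δ-measure formula itself is not given in this report, so it is not implemented. The function names `solve_lambda` and `lambda_measure` are hypothetical, and the bisection assumes at least two densities strictly between 0 and 1.

```python
from math import prod


def solve_lambda(densities, iterations=200):
    """Solve prod(1 + lam * g_i) = 1 + lam for the nonzero root lam > -1.

    Returns 0.0 when the densities sum to 1 (the additive case).
    Assumes at least two densities, each strictly between 0 and 1.
    """
    total = sum(densities)
    if abs(total - 1.0) < 1e-12:
        return 0.0

    def f(lam):
        return prod(1.0 + lam * g for g in densities) - (1.0 + lam)

    if total > 1.0:
        lo, hi = -1.0 + 1e-12, -1e-9      # root lies in (-1, 0): subadditive
    else:
        lo, hi = 1e-9, 1.0                # root lies in (0, inf): superadditive
        while f(hi) < 0.0:
            hi *= 2.0
    for _ in range(iterations):           # plain bisection on a sign change
        mid = (lo + hi) / 2.0
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0


def lambda_measure(subset_densities, lam):
    """Value of the lambda-measure on a subset, given its members' densities."""
    if lam == 0.0:
        return sum(subset_densities)
    return (prod(1.0 + lam * g for g in subset_densities) - 1.0) / lam
```

With densities summing to less than 1 the solved λ is positive, and with a sum above 1 it lies in (-1, 0), which matches the superadditive and subadditive cases named in the list above.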

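(2) Programming of the computer analysis system for the Choquet integral regression model based on the δ-measure and γ-support was completed.

(3) An EI-indexed journal paper, listed below, was published. It verifies that the δ-measure Choquet integral regression model outperforms the multiple regression model, the ridge regression model, the Zadeh P-measure Choquet integral regression model, and the Sugeno λ-measure Choquet integral regression model.

Hsiang-Chuan Liu, Der-Bang Wu, Yu-Du Jheng, and Tian-Wei Sheu (2009). "Theory of Multivalent Delta-Fuzzy Measures and its Application", WSEAS TRANSACTION ON INTERNATIONAL SCIENCE AND APPLICATION, Vol. 6, No. 6, pp. 1061-1070, June 2009. ISSN: 1790-0832. (EI Journal)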

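D. Key mathematical properties of the composed multivalent fuzzy measure based on the L-measure and the δ-measure, and its Choquet integral regression model

(1) The following key mathematical properties of the composed multivalent fuzzy measure based on the L-measure and the δ-measure, the L(δ)-measure, are established:

1. The L(δ)-measure satisfies boundedness and monotonicity and is therefore a fuzzy measure.

2. The L(δ)-measure is a continuous increasing function of its determining coefficient L on the domain $[-1, \infty)$.

3. The L(δ)-measure is a multivalent fuzzy measure with $L \in [-1, \infty)$: different values of the coefficient L determine different fuzzy measures. In other words, the L(δ)-measure has infinitely many fuzzy-measure solutions, and its formula has a closed form ... fuzzy-measure solutions.

4. When L = -1, the L(δ)-measure is exactly Zadeh's P-measure.

5. When L = 0, the L(δ)-measure is exactly the additive measure. When the fuzzy densities sum to 1, Sugeno's λ-measure is the additive measure, so in that case the L(δ)-measure also coincides with the λ-measure.

6. When $-1 \le L < 0$, the L(δ)-measure is a subadditive measure; when $0 < L < \infty$, the L(δ)-measure is a superadditive measure.

7. The L(δ)-measure cannot be a mixed fuzzy measure or a completed measure.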

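(2) Programming of the computer analysis system for the Choquet integral regression model based on the L(δ)-measure and γ-support was completed.

(3) An EI-indexed journal paper, listed below, was published. It verifies that the L(δ)-measure Choquet integral regression model outperforms the multiple regression model, the ridge regression model, the Zadeh P-measure Choquet integral regression model, the Sugeno λ-measure Choquet integral regression model, the L-measure Choquet integral regression model, and the δ-measure Choquet integral regression model.

Hsiang-Chuan Liu, Chin-Chun Chen, Der-Bang Wu, and Tian-Wei Sheu (2009). "Theory and Application of the Composed Fuzzy Measure of L-Measure and Delta-Measures", WSEAS TRANSACTION ON INTERNATIONAL SCIENCE AND CONTROL, Issue 8, Vol. 4, pp. 359-368, August 2009. ISSN: 1991-8763. (EI Journal)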

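E. Key mathematical properties of the composed multivalent fuzzy measure based on the C-measure and the δ-measure, and its Choquet integral regression model

(1) The following key properties of the C-measure are established:

1. The complexity-based C-measure satisfies boundedness and monotonicity and is therefore a fuzzy measure.

2. The C-measure is a fuzzy measure suited to grouped data.

(2) Programming of the computer analysis system for the C-measure Choquet integral prediction model was completed.

(3) An SCI-indexed journal paper, listed below, was published. It verifies that the C-measure Choquet integral prediction model outperforms the multiple regression model, the ridge regression model, the Zadeh P-measure Choquet integral regression model, the Sugeno λ-measure ...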


F. The γ fuzzy density (γ-support), based on the Pearson correlation coefficient, is proposed and shown to outperform the C-support and the V-support. The published papers (the same papers as above) are as follows:

1. Hsiang-Chuan Liu, Yu-Chieh Tu, Wen-Chih Lin, and Chin-Chun Chen (2008). Choquet integral regression model based on L-Measure and γ-Support. Proceedings of 2008 International Conference on Wavelet Analysis and Pattern Recognition (Hong Kong, 30-31 Aug. 2008), Volume 2, pp. 777-782. ISBN: 978-1-4244-2238-8. (EI paper)

2. Hsiang-Chuan Liu, Yu-Du Jheng, Guey-Shya Chen, and Bai-Cheng Jeng (2008). Choquet Integral Logistic Regression Algorithms Based on L-Measure and γ-Support. Proceedings of 2008 International Conference on Wavelet Analysis and Pattern Recognition (Hong Kong, 30-31 Aug. 2008), Volume 2, pp. 771-776. ISBN: 978-1-4244-2238-8. INSPEC Accession Number: 10299006. (EI paper)

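IV. Conclusion

In its first year, this project used mathematical analysis to propose one effective fuzzy density (the γ-support), four improved multivalent fuzzy measures, and one fuzzy measure usable with grouped data. It also completed the programming of the computer analysis system for the Choquet integral regression models based on the γ-support and each of these fuzzy measures, together with the multiple regression and ridge regression prediction models, and carried out a five-fold cross-validation comparison on two educational testing data sets. The Choquet integral regression models for all of the fuzzy measures produced effective results. One SCI journal paper, two EI journal papers, and three EI conference papers were published.

V. Appendix: Published Paper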


Applying a complexity-based Choquet integral to evaluate students’ performance

Jiunn-I Shieh (a,*), Hsin-Hung Wu (b), Hsiang-Chuan Liu (c)

(a) Department of Information Science and Applications, Asia University, No. 500 Lioufeng Road, Wufeng Shiang, Taichung 413, Taiwan

(b) Department of Business Administration, National Changhua University of Education, Taiwan

(c) Institute of Bioinformatics, Asia University, Taiwan

Article info

Keywords: Fuzzy measure; Discrete Choquet integral; Entropy; Complexity

Abstract

The weighted arithmetic mean and regression methods are the operators most often used to aggregate criteria in decision-making problems, under the assumption that there are no interactions among criteria. When interactions among criteria exist, the discrete Choquet integral has been shown to be an adequate aggregation operator because it takes the interactions into account. In this study, we propose a complexity-based method to construct the fuzzy measures needed by the discrete Choquet integral, and a real data set is analyzed. The advantage of the complexity-based method is that no population probability has to be estimated, so the error of estimating the population probability is reduced. Four methods, namely the weighted arithmetic mean method, the regression-based method, the discrete Choquet integral with the entropy-based method, and our proposed discrete Choquet integral with the complexity-based method, are used in this study to evaluate students' performance based on a Basic Competence Test. The results show that the students' overall performance evaluated by our proposed discrete Choquet integral with the complexity-based method is the best among the four methods when interactions among criteria exist.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

The operator most often used to aggregate criteria in decision-making problems is the classical weighted arithmetic mean (Fishburn, 1970). In many practical applications the decision criteria interact. However, modeling such interaction remains a difficult question that is often overlooked (Domingo & Torra, 2002). The reason is that practitioners lack suitable tools to deal with the interactions, so criteria are assumed to be independent and exhaustive. This comes primarily from the absence of a precise definition of interaction, as well as the complexity and difficulty of identifying interaction phenomena among criteria. It is known that mutual independence among the criteria is a necessary condition for an aggregation operator to be additive. That is, if some criteria are preferentially dependent on the others, then no additive aggregation operator can model the preferences of the decision maker (Domingo & Torra, 2002).

The weighted arithmetic mean and regression methods are unable to overcome the undesirable phenomenon of dependence. In contrast, the Choquet integral takes into account the interactions among criteria. In addition, a key issue remains unsolved in the application of the fuzzy integral: the determination of the density values that decide the fuzzy measures in the fusion process. In this study, the entropy-based method and our proposed complexity-based method for constructing the fuzzy measures in the discrete Choquet integral are discussed.

This paper is outlined as follows. Section 2 reviews weighting methods, fuzzy measures, and discrete Choquet integrals with two different constructions of fuzzy measures. A procedure for using the Choquet integral is provided in Section 3. A case study applying the weighted arithmetic mean method, the regression method, the Choquet integral with the entropy-based method, and our proposed Choquet integral with the complexity-based method is performed in Section 4 to analyze students' overall performance on the Basic Competence Test when interactions exist. Finally, conclusions are summarized in Section 5.

2. Weighting methods, fuzzy measures, and discrete Choquet integral

The classical weighted arithmetic mean method is the operator most commonly used to aggregate criteria in decision-making problems, and it does not consider the interactions among criteria. The regression method maximizes the linear relation among the criteria, again without considering the interactions among criteria. In contrast, the discrete Choquet integral has been shown to be an adequate aggregation operator that ...

... by Murofushi and Sugeno (1989, 1991).

The Choquet integral is defined to integrate functions with respect to fuzzy measures (Murofushi & Sugeno, 1989). Fuzzy integrals are very useful for global evaluation models, but the number of parameters of a fuzzy measure is large. The definitions of fuzzy measures and Choquet integrals are as follows (Murofushi & Sugeno, 1989):

Definition 1. Let N be a finite set of criteria. A discrete fuzzy measure on N is a set function $v: 2^N \to [0, 1]$ which satisfies the following axioms:

(i) $v(\emptyset) = 0$, $v(N) = 1$ (boundary conditions);

(ii) $A \subseteq B$ implies $v(A) \le v(B)$ (monotonicity) for $A, B \in 2^N$.

For each subset of criteria $S \subseteq N$, $v(S)$ can be interpreted as the weight of the coalition S.

Definition 2. Let v be a fuzzy measure on N = {1, 2, ..., n}. The discrete Choquet integral of a function $x: N \to \mathbb{R}$ with respect to v is defined by

$C_v(x) = \sum_{i=1}^{n} x_{(i)} \left[ v(A_{(i)}) - v(A_{(i+1)}) \right]$,

where $(\cdot)$ indicates a permutation on N such that $x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)}$. Also $A_{(i)} = \{(i), \ldots, (n)\}$ and $A_{(n+1)} = \emptyset$. For instance, if $x_1 \le x_3 \le x_2$, then rank $x_1, x_2, x_3$ from the smallest to the largest. The result is $x_{(1)} = x_1$, $x_{(2)} = x_3$, $x_{(3)} = x_2$. Finally,

$C_v(x_1, x_2, x_3) = x_1 \cdot v(\{(1), (2), (3)\}) + (x_3 - x_1) \cdot v(\{(2), (3)\}) + (x_2 - x_3) \cdot v(\{(3)\})$.
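Definition 2 can be sketched in code as follows. This is a minimal illustration in which the function name `choquet` and both example measures are hypothetical values chosen for the demonstration, not the paper's Table 2. It uses the equivalent incremental form from the worked example: each increase in the sorted scores is weighted by the measure of the coalition of criteria scoring at least that much.

```python
def choquet(scores, measure):
    """Discrete Choquet integral of `scores` with respect to `measure`.

    scores:  dict mapping each criterion name to its score x_i
    measure: dict mapping frozenset of criteria to v(S), with v(N) = 1
    """
    order = sorted(scores, key=scores.get)       # criteria by ascending score
    total, previous = 0.0, 0.0
    for pos, criterion in enumerate(order):
        coalition = frozenset(order[pos:])       # A_(i) = {(i), ..., (n)}
        total += (scores[criterion] - previous) * measure[coalition]
        previous = scores[criterion]
    return total


# Hypothetical non-additive measure on three courses (illustrative values only).
v = {
    frozenset(): 0.0,
    frozenset({"D1"}): 0.2,
    frozenset({"D2"}): 0.3,
    frozenset({"D3"}): 0.2,
    frozenset({"D1", "D2"}): 0.6,
    frozenset({"D1", "D3"}): 0.5,
    frozenset({"D2", "D3"}): 0.7,
    frozenset({"D1", "D2", "D3"}): 1.0,
}

# Hypothetical additive measure built from weights 0.5, 0.3, 0.2.
weights = {"D1": 0.5, "D2": 0.3, "D3": 0.2}
additive = {s: sum(weights[c] for c in s) for s in v}

student = {"D1": 70, "D2": 81, "D3": 75}
overall = choquet(student, v)
weighted_mean = sum(weights[c] * student[c] for c in student)
```

For the additive measure, `choquet(student, additive)` agrees with `weighted_mean`, which is the special case in which the Choquet integral reduces to the weighted arithmetic mean.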

The discrete Choquet integral takes the interaction into account by means of the fuzzy measure v. If the criteria are independent, the fuzzy measure is additive. Then the discrete Choquet integral coincides with the weighted arithmetic mean method. That is, $C_v(x) = \sum_{i=1}^{n} v(\{i\}) \, x_i$, $x \in \mathbb{R}^n$. For example, there are five students and three courses (D1, D2, and D3). Assume the raw data and a fuzzy measure μ on each subset are in Tables 1 and 2, respectively. In Table 2, (0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0), (0, 0, 1), (1, 0, 1), (0, 1, 1), and (1, 1, 1) represent the empty set, {D1}, {D2}, {D1, D2}, {D3}, {D1, D3}, {D2, D3}, and {D1, D2, D3}, respectively. For the first student, the raw scores are 70, 81, and 75. First, rank the scores from the smallest to the largest, i.e., 70, 75, and 81. Then, the overall performance ...

... Choquet integral are 77.6667, 74.6667, 81.0002, and 77.8334, respectively.

To evaluate a discrete Choquet integral, we first need a fuzzy measure, so finding a suitable fuzzy measure becomes an issue. To be a fuzzy measure, a set function needs to satisfy the axioms of the fuzzy measure. We note that the entropy measure and the complexity-based measure both qualify as fuzzy measures. The former was proposed by Kojadinovic (2004) and the latter is proposed in our study.

To measure the uncertainty of a random variable, the concept of entropy was introduced (Shannon, 1948). The basic idea is that an item with large entropy in its ratings is more important in a user's interest than an item with small entropy. Based on this idea, an entropy-based method is as follows (Yu, Wen, Xu, & Ester, 2001). Given a discrete random variable A, let $p_A$ be the probability of A, and define the entropy of A to be $h(A) = -\sum p_A \log_2 p_A$, where $p_A > 0$. Similarly, let B be a discrete random vector that contains at least two discrete random variables; generalizing the idea to a random vector, call $p_B$ the joint probability and $h(B)$ the joint entropy. Using the joint entropy of the subsets of criteria of N, define the fuzzy measure $\mu_1$ as $\mu_1(S) = h(S)/h(N)$ for all $S \subseteq N$ (Kojadinovic, 2004). To use entropy, we need to decide the number of levels used to classify the raw data into score levels for each criterion. For example, let the number of levels be 2 and let S contain only two random variables $X_1$ and $X_2$. In addition, assume the raw data are in Table 3.
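The entropy-based construction $\mu_1(S) = h(S)/h(N)$ can be sketched as below. This is a minimal sketch: the function names `joint_entropy` and `entropy_measure` are hypothetical, and the level data are an assumed five-student example with two criteria and two levels (three students at level 1 and two at level 2 on $X_1$, all five at level 2 on $X_2$), not values read from a table.

```python
from collections import Counter
from itertools import combinations
from math import log2


def joint_entropy(columns):
    """Shannon entropy (base 2) of the joint patterns formed by `columns`."""
    if not columns:
        return 0.0                                # h(empty set) = 0
    patterns = Counter(zip(*columns))             # count each joint pattern
    n = sum(patterns.values())
    return sum(-(c / n) * log2(c / n) for c in patterns.values())


def entropy_measure(levels):
    """Return {frozenset(S): h(S) / h(N)} for every subset S of the criteria.

    Assumes h(N) > 0, i.e. the full data show more than one joint pattern.
    """
    names = list(levels)
    h_all = joint_entropy([levels[k] for k in names])
    measure = {}
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            h_sub = joint_entropy([levels[k] for k in subset])
            measure[frozenset(subset)] = h_sub / h_all
    return measure


# Assumed level data for five students (illustrative).
levels = {"X1": [1, 1, 1, 2, 2], "X2": [2, 2, 2, 2, 2]}
mu1 = entropy_measure(levels)
h_N = joint_entropy([levels["X1"], levels["X2"]])
```

Because every probability is estimated from pattern frequencies, a small sample makes these estimates noisy, which is the weakness the complexity-based method is meant to avoid.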

The raw data in Table 3 can be classified into Table 4 by histogram equalization with the "hist.m" program of Matlab 7.0 for each random variable. To generate the complete information of the fuzzy measure $\mu_1$, first compute $h(N)$. A joint pattern (1, 2) means that $X_1 = 1$ and $X_2 = 2$, and (2, 2) means that $X_1 = 2$ and $X_2 = 2$. There are 3 of pattern (1, 2) and 2 of pattern (2, 2). Thus, $p_S(X_1 = 1, X_2 = 2) = 3/5 = 0.6$ and $p_S(X_1 = 2, X_2 = 2) = 2/5 = 0.4$. Therefore, $h(N) = -0.6 \log_2(0.6) - 0.4 \log_2(0.4) = 0.9710$. Next, $h(S)$ is computed for $S = X_1$ and $S = X_2$, i.e., $h(X_1)$ and $h(X_2)$. In this case, there are 3 of pattern "1" and 2 of pattern "2" in $X_1$. From Table 4, $p_{X_1}(X_1 = 1) = 3/5 = 0.6$, $p_{X_1}(X_1 = 2) = 2/5 = 0.4$, and $h(X_1) = -0.6 \log_2(0.6) - 0.4 \log_2(0.4) = 0.9710$. In contrast to $X_1$, there are 5 of pattern "2" in $X_2$. From Table 4, $p_{X_2}(X_2 = 1) = 0/5 = 0$, $p_{X_2}(X_2 = 2) = 5/5 = 1$,

Table 2
A fuzzy measure used to demonstrate computation of the overall performance by Choquet integral

D1   D2   D3   Fuzzy measure μ
0    0    0    0
1    0    0    0.1667
0    1    0    0.1667
1    1    0    0.5

Table 1
Example of the raw data used to demonstrate computation of the overall performance by Choquet integral

Student   D1   D2   D3
1         70   81   75
2         70   85   86
3         65   85   84
4         75   91   85
5         75   80   82

Table 3
Example of the raw data used to construct fuzzy measures based on entropy and complexity methods

Student   X1   X2
1         70   81
2         70   85
3         65   85
4         75   91
5         75   80

Table 4
The level of the score for each criterion classified from the raw data in Table 3 when the number of levels is two

Student   X1   X2
1         1    2


... method is also easy to use to compute the fuzzy measure of a random vector with more than two discrete random variables. However, the entropy-based weighting scheme carries the risk of having to estimate the probability for each criterion. If the sample size is small, the estimate of the population probability often has a larger error. Under such circumstances, we propose a complexity method to improve the prediction.

The basic concept of complexity is that the more substructures a system has, the more complex the system is. This concept agrees with our intuitive understanding that it is the connectedness of the system elements that matters most. Thus, the more connected the system, the higher the number of substructures in it. It is therefore reasonable to count how many substructures a structure contains (Bonchev & Rouvray, 2003). The complexity C of a discrete random variable X is defined to be the function which counts the number of distinct patterns in X. The complexity C of n discrete random variables $X_1, X_2, \ldots, X_n$ is defined as the function which counts the number of distinct patterns in the joint pattern of $X_1, X_2, \ldots, X_n$. For a finite number of random variables $X_1, X_2, \ldots, X_n$, the complexity is finite. Thus, $C(X_1, X_2, \ldots, X_n)$ can always be normalized to 1. Moreover, it is natural to define $C(\emptyset)$ to be zero, where $\emptyset$ is the empty set. Using the complexity C of the subsets of criteria of N, define $C_1$ as $C_1(S) = C(S)/C(N)$ for all $S \subseteq N$. It is easy to check that $C_1$ is monotone. That is, $X \subseteq Y$ implies $C_1(X) \le C_1(Y)$ for $X, Y \in 2^N$. In addition, $C_1(\emptyset) = 0$. By the definition of a fuzzy measure, $C_1$ is a fuzzy measure.

Let the number of levels be 2 and let S contain only two random variables $X_1$ and $X_2$. Using the raw data from Table 3, the raw data can be classified by histogram equalization with the "hist.m" program of Matlab 7.0 for each random variable, as shown in Table 4. To generate the complete information of the fuzzy measure $C_1$, compute $C(N)$. From Table 4, there are two different joint patterns, i.e., (1, 2) and (2, 2). Thus, the complexity of N is 2. Next, $C(S)$ is computed for $S = X_1$ and $S = X_2$. That is, compute $C(X_1)$ and $C(X_2)$. There are two different patterns in $X_1$, so $C(X_1) = 2$. Moreover, there is only 1 pattern in $X_2$, i.e., $C(X_2) = 1$. By $C_1(S) = C(S)/C(N)$ for all $S \subseteq N$, the fuzzy measure $C_1$ is completely defined by Table 6. Although our example computes the fuzzy measure of a random vector with two discrete random variables, the complexity method is also easy to use to compute the fuzzy measure of a random vector with more than two discrete random variables.
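The complexity-based measure $C_1(S) = C(S)/C(N)$ only counts distinct patterns, so a sketch is short. As before, the function names `complexity` and `complexity_measure` are hypothetical, and the level data are an assumed five-student example matching the pattern counts described in the text (two joint patterns, two patterns in $X_1$, one in $X_2$).

```python
from itertools import combinations


def complexity(columns):
    """Number of distinct joint patterns across the given columns."""
    if not columns:
        return 0                                  # C(empty set) = 0
    return len(set(zip(*columns)))


def complexity_measure(levels):
    """Return {frozenset(S): C(S) / C(N)} for every subset S of the criteria."""
    names = list(levels)
    c_all = complexity([levels[k] for k in names])
    measure = {}
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            c_sub = complexity([levels[k] for k in subset])
            measure[frozenset(subset)] = c_sub / c_all
    return measure


# Assumed level data for five students (illustrative).
levels = {"X1": [1, 1, 1, 2, 2], "X2": [2, 2, 2, 2, 2]}
c1 = complexity_measure(levels)
```

No probabilities are estimated here, which is the advantage the paper claims for small samples.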

a Basic Competence Test to evaluate the students’ performance.

3. A procedure of using the discrete Choquet integral

A five-step procedure for applying the Choquet integral, based on Calvo, Kolesarova, Komornikova, and Mesiar (2001), is as follows:

Step 1: Decide the range of the number of levels used to classify the raw data into score levels for each criterion, using Scott's rule and Sturges' formula. Let m be the number of score levels; m = 2, 3, 4, 5, 6, 7, 8, 9 is the range in our study. Then transform the raw scores into score levels for each item when m = 2, 3, 4, 5, 6, 7, 8, 9.

Step 2: Check for mutual interaction among criteria and its strength. First, calculate the Chi-square divergence between each pair of criteria, and use a statistical test to determine whether there is any mutual interaction among the criteria for each m = 2, 3, 4, 5, 6, 7, 8, 9. For the analysis of correlation, we chose Cramer's coefficients to determine whether there is strong mutual interaction among criteria. Compute Cramer's coefficients for each m = 2, 3, 4, 5, 6, 7, 8, 9. Note that if there is no interaction among criteria, we expect the accuracy of the Choquet integral method to be as good as that of the weighted arithmetic mean method.

Step 3: For each m, make the following calculations: (1) use credit hours to get the weight for each course; (2) use the regression method to get the weight for each course; (3) using the results from Step 2, compute fuzzy measures based on entropy and joint entropy for each subset of all courses, which gives the importance of each subset; (4) using the results from Step 2, compute fuzzy measures based on complexity for each subset of all courses, which again gives the importance of each subset.

Step 4: Calculate the weighted arithmetic mean and regression results over all courses from the raw data. Then transform the results into score levels for each course when m = 2, 3, 4, 5, 6, 7, 8, 9. Use the Choquet integral with the entropy method and with the complexity-based method, as discussed in Step 3, to compute overall performance values for each m = 2, 3, 4, 5, 6, 7, 8, 9. Finally, transform the results into score levels for each m = 2, 3, 4, 5, 6, 7, 8, 9.

Step 5: Calculate the accuracy of each method for each m = 2, 3, 4, 5, 6, 7, 8, 9.
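The level transformation (Steps 1 and 4) and the accuracy comparison (Step 5) might be sketched as below. Two points are assumptions rather than the paper's definitions: `to_levels` uses plain equal-width binning, which need not reproduce Matlab's "hist.m" on boundary values, and `accuracy` is taken to be the fraction of students whose predicted level equals the level of the test score, since the text does not define accuracy. Both function names are hypothetical.

```python
def to_levels(scores, m):
    """Map raw scores to levels 1..m using m equal-width bins (assumed scheme)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1] * len(scores)                  # no spread: a single level
    width = (hi - lo) / m
    # The maximum score would fall in bin m + 1, so cap it at level m.
    return [min(int((s - lo) / width) + 1, m) for s in scores]


def accuracy(predicted_levels, target_levels):
    """Fraction of cases where the predicted level equals the target level."""
    matches = sum(p == t for p, t in zip(predicted_levels, target_levels))
    return matches / len(target_levels)
```

Running both functions for each m from 2 to 9 and for each method would give a comparison grid of the kind the procedure describes.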

4. A case study

The data set comes from a class of 45 students in a junior high school. Each student took three natural science courses: physics and chemistry, biology, and geoscience. The credit hours for these three courses are 16, 4, and 4, respectively. The maximum score for each course is 100 points. Later, all students took the Basic Competence Test for junior high school students. The maximum and minimum scores of the Basic Competence Test are 60 and 1, respectively. To simplify the notation, physics and chemistry, biology, and geoscience are denoted by C1, C2, and C3, while the score of the Basic Competence Test is denoted by Obj. The detailed information is given in Table 7.

Table 5
A fuzzy measure constructed by the entropy method

X1   X2   Fuzzy measure μ1
0    0    0/0.9710 = 0
1    0    0.9710/0.9710 = 1
0    1    0/0.9710 = 0
1    1    0.9710/0.9710 = 1

Table 6

A fuzzy measure constructed by the complexity method

X1 X2 Fuzzy measure C1


(Scott, 1979). In practice, σ is replaced by the estimated standard deviation, s. In our study, the sample size n is 45. From the raw data, R = 35, 26, 28, and 45 for each item and s = 8.4887, 6.7182, 7.6480, and 10.7691, respectively. By the above formula, m would be 4.2021, 3.9443, 3.7313, and 4.2587, respectively. Thus, m = 4 or 5 are possible candidates. The other one is Sturges' formula: $m = 1 + 3.3 \log_{10}(n)$ (Scott, 1992). From the latter formula,
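The reported values of m can be reproduced with a short sketch. The formula referred to as "the above formula" is not preserved in this text, so the code assumes the usual form of Scott's rule, bin width $3.49 \, s \, n^{-1/3}$ with $m = R / \text{width}$; under that assumption the four reported values are recovered to the printed precision. The function names are hypothetical.

```python
from math import log10


def scott_levels(data_range, s, n):
    """Number of levels from Scott's rule (assumed form: width = 3.49 * s * n^(-1/3))."""
    width = 3.49 * s * n ** (-1.0 / 3.0)
    return data_range / width


def sturges_levels(n):
    """Number of levels from Sturges' formula as written in the text."""
    return 1 + 3.3 * log10(n)


n = 45
ranges = [35, 26, 28, 45]
std_devs = [8.4887, 6.7182, 7.6480, 10.7691]
scott = [scott_levels(r, s, n) for r, s in zip(ranges, std_devs)]
sturges = sturges_levels(n)
```

Under these assumptions `sturges` evaluates to roughly 6.46, which is consistent with studying a range of m values rather than a single choice.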

... interactions at a significance level of 0.01 among courses and observe the strength of mutual interactions among courses. First, use the results of Step 1 and the "crosstab.m" program of Matlab 7.0 to compute the corresponding p-values and Chi-square divergence between each pair of criteria for each m = 2, 3, 4, 5, 6, 7, 8, 9. Then compute Cramer's correlation coefficient from the Chi-square values by the following formula: $G = \sqrt{\chi^2 / (nL)}$, where n = 45 and L = m - 1 for each m = 2, 3, 4, 5, 6, 7, 8, 9. From the p-values under m = 2, 3, 4, 5, 6, 7, 8, 9, summarized in Table 8, there clearly exist mutual interactions at a significance level of 0.01 among courses when m = 2, 3, 4, 5, 6, 7, 8, but not when m = 9. From Cramer's correlation coefficients in Table 8, we see that the mutual interactions among courses are strong. Thus, we expect the accuracy of the Choquet integral method to be better than that of the weighted arithmetic mean and regression methods when m = 2, 3, 4, 5, 6, 7, 8.
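The coefficient $G = \sqrt{\chi^2 / (nL)}$ can be sketched directly from a contingency table of two criteria's levels. This is a plain-Python illustration, not a reimplementation of "crosstab.m": the function names are hypothetical, and L is taken as the smaller table dimension minus one, which equals m - 1 when both criteria have m levels.

```python
from math import sqrt


def chi_square(table):
    """Pearson chi-square statistic and sample size of a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:                      # skip empty rows or columns
                chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramer_coefficient(table):
    """Cramer's coefficient G = sqrt(chi2 / (n * L)), L = min(rows, cols) - 1."""
    chi2, n = chi_square(table)
    L = min(len(table), len(table[0])) - 1
    return sqrt(chi2 / (n * L))
```

A value of 0 indicates no association between the two criteria and a value of 1 indicates perfect association, so the coefficients near 0.5 to 0.7 in Table 8 sit in between.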

The third step is to calculate the importance of each course by the weighted arithmetic mean and regression methods; the results are summarized in Table 9. From Table 9, C1 (physics and chemistry) has higher importance than C2 (biology) and C3 (geoscience) under the weighted arithmetic mean method, i.e., C1 > C2 = C3. In contrast, the regression method gives a different ordering: C1 > C3 > C2. That is, it suggests that the class needs to put more effort into geoscience to improve the score on the Basic Competence Test. For the Choquet integral with the entropy-based and complexity-based methods, calculate the importance of each subset generated by all courses for m = 2, 3, 4, 5, 6, 7, 8, 9. The numerical values of the fuzzy measures for each subset are computed with Matlab and provided in Table 10. From Table 10, the importance given by the complexity-based method is larger than that given by the entropy-based method for each subset of all criteria. This means that the entropy-based method underestimates the importance. The reason may be the error of estimating a population probability from a small sample of size 45.

The fourth step is to compute the overall performance of the students by the four methods. For each student, the overall performance and the score on the Basic Competence Test are transformed into score levels for each item, as shown in Table 11, where M1, M2, M3, and M4 represent the weighted arithmetic mean method, the regression method, the Choquet integral with the entropy-based method, and the Choquet integral with the complexity-based method, respectively. The different numerical values in the Choquet integral columns of Table 11 have different meanings: the higher the value of the Choquet integral, the better. Finally, the fifth step is to compare the predictions of the different methods under different m, as shown in Table 12, where a higher value means better accuracy. The Choquet integral with the complexity-based method has the best accuracy among the four methods. A possible reason is that estimating the population probability from the sample becomes worse when m is greater than 4. It is worth noting that the regression method has better accuracy than the weighted arithmetic mean method, since the regression method minimizes the error without assuming mutual interaction among courses.

Student  C1  C2  C3  Obj    Student  C1  C2  C3  Obj
2        70  85  86  42     27       88  84  80  35
3        65  85  84  33     28       55  65  60  5
4        75  91  85  25     29       78  85  75  27
5        75  80  82  27     30       72  84  78  47
6        68  75  76  33     31       64  76  70  27
7        70  77  72  35     32       60  70  65  20
8        80  78  70  31     33       69  80  70  35
9        83  81  85  50     34       66  78  66  17
10       75  79  83  31     35       62  70  66  13
11       62  74  68  35     36       61  72  65  28
12       68  74  80  30     37       68  74  71  11
13       77  85  81  37     38       53  65  59  9
14       66  76  74  29     39       67  70  64  36
15       78  88  83  31     40       59  65  68  16
16       57  67  62  15     41       74  82  75  49
17       56  70  63  12     42       58  66  62  15
18       68  80  74  31     43       76  74  78  38
19       53  66  58  21     44       84  81  78  37
20       65  81  73  32     45       76  72  74  35
21       62  76  69  12
22       67  75  71  22
23       74  71  68  28
24       61  69  65  28
25       64  70  67  24

Table 8
The results of Cramer's correlation coefficients

      m = 2                        m = 3
      C1      C2      C3           C1      C2      C3
C1    1       0.5307  0.5737       1       0.5437  0.5284
C2    0.5307  1       0.7441       0.5437  1       0.6765
C3    0.5737  0.7441  1            0.5284  0.6765  1
      p < 0.01                     p < 0.01

      m = 4                        m = 5
      C1      C2      C3           C1      C2      C3
C1    1       0.5097  0.5744       1       0.5131  0.5885
C2    0.5097  1       0.6026       0.5131  1       0.5583
C3    0.5744  0.6026  1            0.5885  0.5583  1
      p < 0.01                     p < 0.01

      m = 6                        m = 7
      C1      C2      C3           C1      C2      C3
C1    1       0.4848  0.5537       1       0.4991  0.537
C2    0.4848  1       0.6212       0.4991  1       0.5821
C3    0.5537  0.6212  1            0.537   0.5821  1
      p < 0.01                     p < 0.01

      m = 8                        m = 9
      C1      C2      C3           C1      C2      C3
C1    1       0.515   0.5329       1       0.5164  0.5375
C2    0.515   1       0.5336       0.5164  1       0.5049*
C3    0.5329  0.5336  1            0.5375  0.5049* 1
      p < 0.01

*p > 0.01.

Table 9

Weights for each course by the weighted arithmetic mean and regression methods
