基於資訊理論熵之特徵選取 (Entropy-Based Feature Selection Using Information-Theoretic Entropy) — 政大學術集成 (NCCU Academic Hub)

Academic year: 2021

Entropy-Based Feature Selection
民國 106 年 6 月 (June 2017)

摘要

Feature selection is a common preprocessing step in machine learning. This study uses the entropy of information theory to build a similarity matrix between features, constructs a DCG-tree to separate the variables into clusters, and keeps a representative variable from each core cluster to reduce dimension. The method is compared with FCBF, Lasso, F-score, random forest, and a genetic algorithm on real-world datasets.

Abstract

Feature selection is a common preprocessing technique in machine learning. Although a large pool of feature selection techniques exists, no single method dominates across all datasets. Because data come in many complex forms, establishing a new method can bring more insight into the data, and applying the proper technique to each dataset is the best one can do. In this study, we used the concept of entropy from information theory to build a similarity matrix between features. We then constructed a DCG-tree to separate the variables into clusters. Each core cluster consists of rather uniform variables that share similar covariate information. Using the core clusters, we reduced the dimension of high-dimensional datasets. We assessed our method by comparing it with FCBF, Lasso, F-score, random forest, and a genetic algorithm. Prediction performance was demonstrated on real-world datasets using hierarchical clustering with a voting algorithm as the classifier. The results show that our entropy method yields more stable prediction performance while substantially reducing the dimension of the datasets.

Keywords: machine learning, feature selection, dimension reduction, entropy

Contents

Chapter 1  Introduction
Chapter 2  Literature Review
Chapter 3  Data
  BioBehavioral Assessment dataset
  Wisconsin breast cancer dataset
  Connectionist Bench (Sonar, Mines vs. Rocks)
  SPECTF heart
Chapter 4  Methods
  Classifier: hierarchical clustering; voting
  Feature selection: entropy; FCBF (Fast Correlation-Based Filter); Lasso; F-score; random forest; genetic algorithm
  Parameter settings: entropy; Lasso; F-score; random forest; genetic algorithm
Chapter 5  Results
  BioBehavioral Assessment dataset; Wisconsin breast cancer dataset; Sonar; SPECTF
Chapter 6  Conclusion
References

Tables
  5-1  Results on the BioBehavioral Assessment dataset
  5-2  Results on the Wisconsin breast cancer dataset
  5-3  Results on the Sonar dataset
  5-4  Results on the SPECTF dataset

Figures
  3-1 to 3-8  Summary plots of the four datasets
  4-1  Hierarchical clustering with voting
  4-2, 4-3  DCG-tree eigenvalue profile and example tree
  4-4  Feature tree of the entropy method
  4-5  Lasso: cross-validated MSE against log(λ)
  4-6  F-scores of the breast cancer features
  4-7  Random-forest variable importances
  5-1 to 5-4  Selected-feature comparisons on the four datasets

Chapter 1  Introduction

In machine learning, the variables (features) describing the data largely determine how well a model can perform. High-dimensional data raise model complexity and can hurt both precision and accuracy, so dimension reduction and feature selection are common preprocessing steps: they keep the informative variables and discard redundant ones, lowering complexity while preserving predictive power.

In this study we propose an entropy-based feature selection method and compare it with Lasso, F-score, random forest, and a genetic algorithm. The methods are evaluated on real-world datasets, including the Wisconsin breast cancer data, the Connectionist Bench (Sonar, Mines vs. Rocks) data, and the SPECTF heart data. The rest of the thesis is organized as follows: Chapter 2 reviews related work, Chapter 3 describes the datasets, Chapter 4 presents the methods, Chapter 5 reports the results, and Chapter 6 concludes.

Chapter 2  Literature Review

Feature reduction methods divide into feature selection and feature extraction. Feature extraction transforms the original variables into a new, lower-dimensional space (e.g., LDA), whereas feature selection keeps a subset of the original variables, so the selected features remain directly interpretable. Feature selection techniques are commonly grouped into filter, wrapper, and embedded methods. Filter methods rank features by a criterion computed from the data alone, independently of any classifier; they are fast, but ignore interactions with the learning algorithm. Wrapper methods (e.g., sequential forward selection) search over feature subsets and score each candidate subset with the classifier itself, which is accurate but computationally expensive. Embedded methods perform selection as part of model training (Saeys, Inza, & Larrañaga, 2007).

Among filter methods, Yu and Liu (2003) proposed FCBF (Fast Correlation-Based Filter), an entropy-based algorithm. Given a feature set S' = {X_1, …, X_n}, FCBF identifies "predominant" features: for each feature X_i it examines its redundant peers S_{p_i} (features X_j, i ≠ j, more correlated with X_i than X_i is with the class), splits them into S_p^+(i) and S_p^-(i) according to whether the peer is more or less correlated with the class than X_i, and removes features dominated by others (Yu & Liu, 2003). FCBF thus combines an entropy-based relevance measure with an explicit redundancy analysis.

Another widely used filter criterion is the F-score. Chen and Lin (2006) combined F-score ranking with an SVM classifier in the 2005 NIPS Feature Selection Challenge; the F-score measures how well a single feature separates the two classes, but ignores interactions among features (Chen & Lin, 2006). Akay (2009) likewise applied F-score-based selection together with an SVM to breast cancer diagnosis (Akay, 2009).

For wrapper methods, Raymer, Punch, Goodman, Kuhn, and Jain (2000) used a genetic algorithm to perform feature selection and feature weighting simultaneously, evaluating each candidate subset with a KNN (K-Nearest-Neighbor) classifier with k = 7 (Raymer, Punch, Goodman, Kuhn, & Jain, 2000).

For embedded methods, Díaz-Uriarte and Alvarez de Andrés (2006) selected genes from microarray data using random forests: the forest's variable-importance measure ranks the genes, and the least important genes are discarded iteratively.

In their procedure, variable importances are computed from a fitted forest, a fixed fraction of the least important genes is dropped at each iteration, a new forest is fitted on the remaining genes, and the gene set whose out-of-bag error is within a small tolerance of the minimum is chosen. They also examined the forest's tuning parameters, nodesize, mtry (the number of candidate variables per split), and ntree (the number of trees), and found the results largely insensitive to mtry over the range they tested. Their method produced small gene sets with error rates comparable to SVM and KNN (Díaz-Uriarte & Alvarez de Andrés, 2006).

Svetnik, Liaw, Tong, and Wang (2004) applied random forests to modeling structure-activity relationships of pharmaceutical molecules, using the OOB (out-of-bag) samples both to estimate prediction error and to measure variable importance; they cautioned that selecting variables and assessing error on the same OOB data can bias the error estimate optimistically (Svetnik, Liaw, Tong, & Wang, 2004).

Saeys, Abeel, and Van de Peer (2008) studied ensemble feature selection techniques: following the idea of bagging, they drew repeated subsamples of the data, ran a base selector on each, and aggregated the resulting rankings. They compared four base selectors, SU (symmetrical uncertainty), Relief, SVM-RFE, and random-forest importance, and found, consistent with Díaz-Uriarte and Alvarez de Andrés (2006), that the ensemble versions gave markedly more stable feature sets with comparable classification performance (Saeys et al., 2008).

Building on this literature, the present study proposes an entropy-based filter method and compares it against representatives of these families.

Chapter 3  Data

BioBehavioral Assessment dataset

The first dataset comes from the BioBehavioral Assessment Project led by Capitanio, which assessed infant rhesus monkeys on behavioral and physiological measures (Golub, Hogrefe, Widaman, & Capitanio, 2009). The raw data contain 1,907 animals and 130 variables covering temperament, behavior, and physiology. After removing records with excessive missing values, 1,520 animals remained; the response variable is binary (0/1). Figure 3-1 summarizes the variables of the dataset, and Figure 3-2 shows the distribution of the response.

Wisconsin breast cancer dataset

The Wisconsin Diagnostic Breast Cancer dataset was contributed by Wolberg in 1995 to the UC Irvine Machine Learning Repository. Its features are computed from digitized images of fine needle aspirates (FNA) of breast masses and describe characteristics of the cell nuclei. The data contain 569 samples (357 benign, 212 malignant) and 32 columns: an ID, the diagnosis label (B = benign, M = malignant), and 30 real-valued features. Benign cases make up about 63% of the samples. Figure 3-3 summarizes the features, and Figure 3-4 shows the class distribution.

Connectionist Bench (Sonar, Mines vs. Rocks)

The Connectionist Bench (Sonar, Mines vs. Rocks) dataset was contributed by Gorman and Sejnowski to the UC Irvine Machine Learning Repository. It contains 208 sonar returns: 111 bounced off a metal cylinder (mines, label M) and 97 off rocks (label R). Each sample has 60 features, each in the range 0 to 1, representing the energy within a particular frequency band. Figure 3-5 summarizes the features, and Figure 3-6 shows the class distribution.

SPECTF heart

The SPECTF heart dataset was contributed by Cios and Kurgan to the UC Irvine Machine Learning Repository. It describes cardiac Single Photon Emission Computed Tomography (SPECT) images: each of the 267 patients is classified as normal (212 cases) or abnormal (55 cases), and is described by 44 continuous features extracted from regions of interest under rest and stress conditions. Figure 3-7 summarizes the features, and Figure 3-8 shows the class distribution.

Chapter 4  Methods

Classifier

Hierarchical clustering. We use agglomerative hierarchical clustering as the base of our classifier: starting from singleton clusters, the two closest clusters are merged repeatedly. The distance between clusters is determined by a linkage criterion, such as complete-linkage clustering or average-linkage clustering; under complete-linkage, the distance between clusters A and B is max{ d(a, b) : a ∈ A, b ∈ B }. Ward's minimum variance criterion is another common linkage criterion; Ward's method merges the pair of clusters whose union yields the smallest increase in total within-cluster variance.

Voting. After hierarchical clustering (HC) partitions the training samples, each cluster votes: a cluster is labeled 1 if the majority of its training samples belong to class 1, and 0 otherwise, and a test sample receives the label of the cluster it falls into.
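The complete-linkage clustering described above can be sketched in a few lines. This is a minimal brute-force implementation for illustration, not the thesis' actual code; it assumes numeric points and a target number of clusters k.

```python
import math

def complete_linkage(points, k):
    """Agglomerative clustering with complete linkage: repeatedly merge the
    two clusters whose farthest-pair distance max{d(a,b): a in A, b in B}
    is smallest, until k clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # complete linkage: distance between the farthest members
                d = max(math.dist(a, b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)   # merge the closest pair
    return clusters

out = complete_linkage([(0, 0), (0, 1), (10, 0), (10, 1)], k=2)
```

On the toy points above, the two tight pairs end up in separate clusters, since the cross-pair complete-linkage distance is about 10.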

Each dataset is split into training and test sets at a ratio of 3:2 (for the BioBehavioral data, 916 training samples and 604 test samples). Figure 4-1 illustrates hierarchical clustering with voting: for test sample i (i = 1 … n), the prediction is 1 if the cluster it joins voted 1, and 0 otherwise.

Feature selection

Entropy method

Our method uses entropy to measure the information shared between features and then clusters the features accordingly.

1. Shannon entropy. Shannon entropy measures the uncertainty of a random variable. For a discrete feature F taking values {G_1, …, G_H} with probabilities P(G_i), the entropy is

J(F) = − Σ_{i=1}^{H} P(G_i) log P(G_i).

A feature whose values concentrate on few levels has low entropy, while a feature spread evenly across many levels has high entropy; continuous features are discretized before computing the entropy.
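The Shannon entropy above has a direct implementation for a discrete (or discretized) feature; this is an illustrative sketch using base-2 logarithms.

```python
import math
from collections import Counter

def shannon_entropy(values):
    """J(F) = -sum_i P(G_i) * log2 P(G_i) for a discrete feature."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# A balanced binary feature carries the maximum 1 bit of uncertainty;
# a constant feature carries none.
h_balanced = shannon_entropy([0, 1, 0, 1])
h_constant = shannon_entropy(["a", "a", "a"])
```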

2. DCG-tree (Data Cloud Geometry Tree). Standard hierarchical clustering depends on the choice of linkage criterion (complete, average, single). Fushing and McAssey (2010) and Fushing, Wang, VanderWaal, McCowan, and Koehl (2013) proposed the Data Cloud Geometry tree, which builds a multi-scale clustering tree from regulated random walks:

I. Given pairwise distances D_ij = d(x_i, x_j) between points x_i ∈ X (i = 1, …, n), define a temperature-dependent similarity matrix W_T with entries

W_T[i, j] = exp( −d(x_i, x_j) / T ),

a Boltzmann-type weight in which the temperature T controls the scale at which points count as similar (Fushing & McAssey, 2010).

II. Convert W_T into a Markov transition probability matrix P_T = D_T^{-1} W_T, where D_T = diag( Σ_j W_1j, …, Σ_j W_nj ).

III. Rather than running a plain MCMC on P_T, Fushing and McAssey use a regulated random walk: the walk moves from node i to node j with probability proportional to P_ij, but revisits are penalized through a removal record e, so the walk is driven to explore new nodes, and a node is removed once it has accumulated m visits. Repeating the walk N = 1000 times at temperature T yields an ensemble matrix E, whose entry E_ij records how often points i and j fall into the same segment of a walk.

IV. For each temperature T, compute the eigenvalues {λ_1, …, λ_n} of I_n − B^{−1/2} E B^{−1/2}, where B is the diagonal matrix of row sums of E (Fushing & McAssey, 2010). The profile of 1 − λ_i / max{λ_1, …, λ_n}, plotted as in Figures 4-1 and 4-2, indicates the number of clusters present at that scale.

V. Scanning a sequence of temperatures {T_1, …, T_k, …, T_K} yields nested cluster configurations. For each pair (i, j), let δ_ij(k) = 1 if i and j belong to the same cluster at temperature T_k and 0 otherwise, and set u_ij = min{ T_k : δ_ij(k) > 0 }, the first temperature at which the pair merges. The matrix ∆ = (∆_ij) assembled from these merging temperatures defines an ultrametric on the points, and the corresponding tree is the DCG-tree.
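Steps I and II above, building the similarity matrix and row-normalizing it into a Markov transition matrix, can be sketched directly; the regulated random walk itself is omitted for brevity. This is an illustrative sketch, not the authors' implementation.

```python
import math

def similarity_matrix(points, T):
    """W_T[i][j] = exp(-d(x_i, x_j) / T): Boltzmann-style similarity at
    temperature T (small T keeps only very close points similar)."""
    n = len(points)
    return [[math.exp(-math.dist(points[i], points[j]) / T)
             for j in range(n)] for i in range(n)]

def transition_matrix(W):
    """Row-normalize W into the Markov transition matrix P = D_T^{-1} W_T,
    the starting point of the regulated random walks."""
    return [[w / sum(row) for w in row] for row in W]

pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0)]
P = transition_matrix(similarity_matrix(pts, T=1.0))
```

Each row of P sums to 1, and a walker at the first point is far more likely to step to its near neighbor than to the distant third point.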

Figure 4-2 shows the eigenvalue profile used to choose the number of clusters, and Figure 4-3 shows an example DCG-tree.

In our method, Shannon entropy is computed on the cluster configurations that the features induce rather than on raw feature values: two features that partition the samples in similar ways share covariate information, even when their marginal Shannon entropies differ.

Following Lee (2017), we measure the similarity of two features by comparing the cluster configurations they induce on the samples.

1. Let feature F_k partition the N samples into s_k clusters C^k = {C^k_1, …, C^k_{s_k}}, and similarly F_l into s_l clusters. Define the conditional (mutual) entropy of F_k given F_l as

H{C^k | C^l} = Σ_{j=1}^{s_l} w_j Σ_{i=1}^{s_k} { −P_{i|j} log P_{i|j} },

where P_{i|j} is the proportion of the samples in cluster j of C^l that fall into cluster i of C^k, and w_j = N_j / N is the weight of cluster j.

2. Normalize by the unconditional entropy to obtain the entropy ratio, and symmetrize:

Entropy ratio{C^k vs C^l} = H{C^k | C^l} / H{C^k},
mutual entropy_{k,l} = ( Entropy ratio{C^k vs C^l} + Entropy ratio{C^l vs C^k} ) / 2.

3. Computing this quantity for every pair of features yields a pairwise similarity matrix, which is small when two features induce nearly the same clustering.

4. Building a DCG-tree on this matrix separates the features into clusters.

5. Each core cluster consists of features that share covariate information.

6. One representative feature is kept from each core cluster, which reduces the dimension of the dataset.
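The conditional-entropy and entropy-ratio computation in steps 1 and 2 can be sketched on two cluster-label vectors. This follows the reconstruction above; the thesis' exact normalization may differ, so treat it as an illustrative sketch.

```python
import math
from collections import Counter

def cond_entropy(a, b):
    """H(a | b): entropy of labels `a` within each cluster of `b`,
    weighted by cluster size w_j = N_j / N."""
    n = len(a)
    h = 0.0
    for bl, nb in Counter(b).items():
        sub = [ai for ai, bi in zip(a, b) if bi == bl]
        c = Counter(sub)
        h += (nb / n) * -sum((v / nb) * math.log2(v / nb) for v in c.values())
    return h

def entropy_ratio(a, b):
    """Conditional entropy normalized by the unconditional entropy of `a`."""
    h_a = cond_entropy(a, [0] * len(a))   # H(a) = H(a | one big cluster)
    return cond_entropy(a, b) / h_a if h_a else 0.0

def mutual_entropy(a, b):
    """Symmetrized score; 0 when the two clusterings coincide."""
    return (entropy_ratio(a, b) + entropy_ratio(b, a)) / 2
```

Two features inducing identical partitions score 0; partitions that tell us nothing about each other score near 1.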

FCBF (Fast Correlation-Based Filter)

Like our entropy method, FCBF is a filter method built on entropy. For a discrete variable X, the entropy is

H(X) = − Σ_i P(x_i) log₂ P(x_i),

and the entropy of X conditional on another variable Y is

H(X|Y) = − Σ_j P(y_j) Σ_i P(x_i|y_j) log₂ P(x_i|y_j).

The information gain

IG(X|Y) = H(X) − H(X|Y)

measures how much knowing Y reduces the uncertainty of X. Because information gain is biased toward variables with many values, FCBF uses the symmetrical uncertainty, which normalizes it to [0, 1]:

SU(X, Y) = 2 · IG(X|Y) / ( H(X) + H(Y) ).

Given a threshold δ, FCBF selects features in two stages.

1. Predominant correlation. A feature F_i is relevant if SU_{i,c} ≥ δ, where c denotes the class C. For a relevant F_i, call F_j (j ≠ i) a redundant peer of F_i if SU_{j,i} ≥ SU_{i,c}; collect such peers in S_{p_i}, and split S_{p_i} into S_p^+(i) = { F_j ∈ S_{p_i} : SU_{j,c} > SU_{i,c} } and S_p^-(i) = { F_j ∈ S_{p_i} : SU_{j,c} ≤ SU_{i,c} }.

2. Predominant feature. A relevant feature is predominant if no redundant peer is more correlated with the class, i.e. S_p^+(i) = ∅. FCBF processes the relevant features in decreasing order of SU_{i,c}; whenever a feature F_i is kept, the features in S_p^-(i) are removed as redundant. The features that survive are the predominant ones: FCBF keeps strongly class-relevant features while discarding those they dominate.
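The FCBF selection described above can be sketched as follows; `features` is a list of discrete columns, and the tie-breaking order among equally relevant features is an assumption of this sketch, not part of the original algorithm.

```python
import math
from collections import Counter

def entropy(xs):
    n = len(xs)
    return -sum((c / n) * math.log2(c / n) for c in Counter(xs).values())

def cond_entropy(xs, ys):
    """H(X|Y) = sum_y P(y) * H(X | Y = y)."""
    n = len(xs)
    h = 0.0
    for y, ny in Counter(ys).items():
        sub = [x for x, yy in zip(xs, ys) if yy == y]
        h += (ny / n) * entropy(sub)
    return h

def su(xs, ys):
    """Symmetrical uncertainty SU(X,Y) = 2*IG(X|Y) / (H(X)+H(Y)), in [0,1]."""
    hx, hy = entropy(xs), entropy(ys)
    if hx + hy == 0:
        return 0.0   # convention for two constant variables
    return 2 * (hx - cond_entropy(xs, ys)) / (hx + hy)

def fcbf(features, target, delta=0.0):
    """Keep features with SU(f, C) > delta, scan them by decreasing
    relevance, and drop any feature dominated by one already kept
    (SU(f_kept, f) >= SU(f, C)) -- Yu & Liu's redundancy test."""
    relevant = [(su(f, target), i) for i, f in enumerate(features)]
    order = sorted([t for t in relevant if t[0] > delta],
                   key=lambda t: (-t[0], t[1]))
    selected = []
    for s_j, j in order:
        if all(su(features[i], features[j]) < s_j for _, i in selected):
            selected.append((s_j, j))
    return [j for _, j in selected]
```

On three toy columns, a duplicate of the best feature is removed as redundant and a class-irrelevant one is filtered out.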

In this way FCBF removes redundant features while keeping the predominant ones.

Lasso

The Lasso adds an L1 penalty to the ordinary least squares objective:

Σ_{i=1}^{n} ( y_i − β_0 − Σ_{j=1}^{p} β_j x_{ij} )² + λ Σ_{j=1}^{p} |β_j| = RSS + λ Σ_{j=1}^{p} |β_j|.

1. When λ = 0 the solution coincides with the OLS estimate; as λ grows, the penalty shrinks the coefficients toward zero, and, unlike the L2 penalty of ridge regression, the L1 penalty sets some coefficients exactly to zero. The Lasso therefore performs variable selection: the features with non-zero coefficients at the chosen λ are kept.
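The Lasso objective above can be minimized by cyclic coordinate descent with soft thresholding. The sketch below is illustrative (no intercept, features assumed centered and on comparable scales), not the cross-validated solver used in the thesis.

```python
def soft_threshold(z, g):
    """S(z, g): shrink z toward 0 by g, clipping to 0 inside [-g, g]."""
    return z - g if z > g else z + g if z < -g else 0.0

def lasso_cd(X, y, lam, n_iter=200):
    """Minimize sum_i (y_i - sum_j b_j x_ij)^2 / ... + lam * sum_j |b_j|
    by cyclic coordinate descent over the coefficients."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # partial residual excluding feature j
            r = [y[i] - sum(beta[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n))
            z = sum(X[i][j] ** 2 for i in range(n))
            beta[j] = soft_threshold(rho, lam) / z if z else 0.0
    return beta

# Feature 0 drives y; feature 1 is orthogonal noise and is zeroed out.
X = [[-1.5, 1], [-0.5, -1], [0.5, -1], [1.5, 1]]
y = [-3.0, -1.0, 1.0, 3.0]
b = lasso_cd(X, y, lam=0.1)
```

The irrelevant coefficient lands exactly at 0, illustrating how the L1 penalty selects variables rather than merely shrinking them.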

2. The result depends on the choice of λ, which is typically tuned by cross-validation. 3. With strongly correlated features, the Lasso tends to keep one of a group and drop the others, which can make the selected set unstable. Its main strength is that estimation and selection happen together in a single convex optimization.

F-score

The F-score measures how well a single feature discriminates between two classes. With n₊ positive and n₋ negative samples, the F-score of the j-th feature is

F(j) = [ ( x̄_j^{(+)} − x̄_j )² + ( x̄_j^{(−)} − x̄_j )² ] / [ (1/(n₊−1)) Σ_{k=1}^{n₊} ( x_{k,j}^{(+)} − x̄_j^{(+)} )² + (1/(n₋−1)) Σ_{k=1}^{n₋} ( x_{k,j}^{(−)} − x̄_j^{(−)} )² ],

where x̄_j, x̄_j^{(+)}, and x̄_j^{(−)} are the means of the j-th feature over all, positive, and negative samples. The numerator measures between-class separation and the denominator within-class scatter, so a larger F-score indicates a more discriminative feature; features whose F-score exceeds a threshold are kept.
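The F-score formula transcribes directly into code; this sketch takes the feature's values on the two classes as separate lists.

```python
def f_score(pos, neg):
    """F-score of one feature, given its values on the positive (pos)
    and negative (neg) class samples (Chen & Lin, 2006)."""
    mean = lambda v: sum(v) / len(v)
    m_all, m_p, m_n = mean(pos + neg), mean(pos), mean(neg)
    # between-class separation
    between = (m_p - m_all) ** 2 + (m_n - m_all) ** 2
    # within-class scatter (sample variances)
    within = (sum((x - m_p) ** 2 for x in pos) / (len(pos) - 1)
              + sum((x - m_n) ** 2 for x in neg) / (len(neg) - 1))
    return between / within

score = f_score([1, 1, 2, 2], [3, 3, 4, 4])
```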

Because the F-score evaluates features one at a time, it cannot capture joint effects among features, and the threshold must be chosen by inspecting the score distribution.

Random forest

A random forest is an ensemble of decision trees grown on bootstrap (bagging) samples, with each split chosen from a random subset of the features. The forest provides a variable-importance measure, which we use for selection: 1. fit a random forest on the training data and compute the variable importances; 2. rank the features by importance and discard the unimportant ones, using a threshold read from the importance plot; 3. the remaining features form the selected set.
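The iterative backward-elimination pattern used with random-forest importances (as in Díaz-Uriarte & Alvarez de Andrés, 2006) can be sketched with a stand-in importance function; a real run would plug in out-of-bag permutation importances from a fitted forest, which this sketch does not implement.

```python
def rf_backward_elimination(features, importance, drop_frac=0.2, min_keep=2):
    """Repeatedly score the current features and drop the least important
    fraction, recording each intermediate set.  `importance` stands in
    for a random-forest variable-importance measure."""
    kept = list(features)
    history = [list(kept)]
    while len(kept) > min_keep:
        scored = sorted(kept, key=importance, reverse=True)
        n_keep = max(min_keep, int(len(kept) * (1 - drop_frac)))
        kept = scored[:n_keep]
        history.append(list(kept))
    return kept, history

# Toy importance: higher feature index = more important.
kept, hist = rf_backward_elimination(list(range(10)), importance=lambda f: f)
```

In the full procedure one would refit the forest at each step and finally pick the set whose out-of-bag error is within a tolerance of the minimum, rather than always eliminating down to `min_keep`.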

Genetic algorithm

A genetic algorithm treats each candidate feature subset as an individual encoded as a 0/1 string. An initial population is evolved over generations: individuals are scored by a fitness function (here, the classification performance of the subset), fitter individuals are selected to reproduce, crossover exchanges segments between parents, and mutation flips bits at a small rate. After the final generation, the best individual defines the selected feature subset.

Parameter settings

Entropy method: the DCG-tree built on the pairwise mutual-entropy matrix separates the features into clusters, and one representative feature is kept from each core cluster (Figure 4-4).
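The genetic-algorithm loop above can be sketched over 0/1 feature masks. The fitness function here is a toy stand-in for classifier accuracy, and the operator choices (tournament of two, one-point crossover, 5% bit-flip mutation, elitism) are assumptions of this sketch, not the thesis' settings.

```python
import random

def ga_select(n_features, fitness, pop_size=20, gens=30, p_mut=0.05, seed=0):
    """Minimal GA over 0/1 feature masks: tournament selection, one-point
    crossover, bit-flip mutation, and elitism (best mask always survives)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(gens):
        def pick():
            a, b = rng.sample(pop, 2)            # tournament of two
            return a if fitness(a) >= fitness(b) else b
        nxt = [max(pop, key=fitness)]            # elitism
        while len(nxt) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)   # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

# Toy fitness: reward masks that keep features 0 and 1 and drop the rest.
target = {0, 1}
fit = lambda m: sum(1 for i, g in enumerate(m) if (i in target) == (g == 1))
best = ga_select(6, fit)
```

With a real fitness (e.g. held-out accuracy of the classifier on the masked features), the same loop searches the 2^p subset space without enumerating it.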

Figure 4-4  Feature tree produced by the entropy method.

Lasso: λ was chosen by 10-fold cross-validation. Figure 4-5 plots the cross-validated MSE against log(λ); we took the λ minimizing the MSE and kept the features with non-zero coefficients at that value.

Figure 4-5  Cross-validated MSE against log(λ) for the Lasso.

F-score: Figure 4-6 shows the F-scores of the features of the breast cancer dataset. Most scores fall below 0.2 while the discriminative features score up to about 1.5, so we kept the features with F-score above the threshold of 0.2; nearby thresholds (0.1 to 0.4) give similar selections.

Figure 4-6  F-scores of the features of the breast cancer dataset.

Random forest: Figure 4-7 shows the random-forest variable importances for the breast cancer dataset. A clear gap appears around 2.5, so we kept the features with importance above 2.5.

Figure 4-7  Random-forest variable importances for the breast cancer dataset.

Genetic algorithm: the population was evolved for a fixed number of generations, with fitness given by the classifier's performance on the candidate subset; the best individual of the final generation defined the selected features.

Chapter 5  Results

Each dataset was split 6:4 into training and test sets, and hierarchical clustering with voting was used as the classifier. We compare the entropy method against using all features and against Lasso, F-score, random forest, genetic algorithm, and FCBF, reporting the accuracy, the number of selected features (dimension), the overlap of each selected set with the entropy-selected set, and the sensitivity and specificity on the test set.
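The sensitivity and specificity reported below follow the usual definitions from the confusion counts; a minimal sketch for 0/1 labels:

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

sens, spec = sensitivity_specificity([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```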

BioBehavioral Assessment dataset

Table 5-1  Prediction results on the BioBehavioral Assessment dataset
(RF = random forest, GA = genetic algorithm; "overlap" = share of the method's selected features also selected by the entropy method)

                      All     Entropy  Lasso     F-score     RF        GA         FCBF
  Accuracy            0.6125  0.6329   0.6145    0.6276      0.6375    0.6257     0.6184
  Dimension           130     29       40        46          21        75         2
  Overlap w/ entropy  -       -        30% (12)  26.08% (12) 23.8% (5) 22.7% (17) 50% (1)
  Sensitivity         0.2334  0.4371   0.0613    0.3510      0.4950    0.3825     0.0679
  Specificity         0.8624  0.7620   0.9793    0.8100      0.7314    0.7860     0.9814

As Table 5-1 shows, random forest attains the highest accuracy (0.6375) with 21 features, and the entropy method is second (0.6329) with 29 of the 130 features. The overlap between the sets selected by different methods is modest, reflecting the heterogeneous variable types in this dataset (Table 3-1). Lasso and FCBF, despite high specificity, suffer very low sensitivity, predicting almost everything as the majority class, whereas the entropy and random-forest selections give a much better balance between sensitivity and specificity.

Figure 5-1  BioBehavioral Assessment dataset: features selected by (a) the entropy method and (b) random forest.

Wisconsin breast cancer dataset

Table 5-2  Prediction results on the Wisconsin breast cancer dataset

                      All     Entropy  Lasso      RF         F-score   GA         FCBF
  Accuracy            0.9227  0.9262   0.9332     0.9297     0.9262    0.9279     0.9490
  Dimension           30      12       13         15         16        13         3
  Overlap w/ entropy  -       -        30.7% (4)  26.7% (4)  75% (12)  61.5% (8)  66.7% (2)
  Sensitivity         0.8349  0.9009   0.8962     0.9009     0.9150    0.8632     0.9057
  Specificity         0.9748  0.9412   0.9552     0.9468     0.8627    0.9664     0.9748

On the breast cancer data, FCBF attains the highest accuracy (0.9490) with only 3 of the 30 features, followed by Lasso (0.9332) with 13. The entropy method selects 12 features with accuracy 0.9262, matching F-score, and 12 of the 16 F-score features (75%) coincide with the entropy selection. Every reduced feature set performs at least as well as using all 30 features.

Figure 5-2  Wisconsin breast cancer dataset: features selected by (a) the entropy method and (b) FCBF.

Sonar dataset

Table 5-3  Prediction results on the Sonar dataset

                      All     Entropy  Lasso      F-score     RF         GA          FCBF
  Accuracy            0.8269  0.8606   0.8413     0.8702      0.8990     0.8221      0.8173
  Dimension           60      20       16         22          11         37          10
  Overlap w/ entropy  -       -        31.2% (5)  45.4% (10)  63.6% (7)  35.1% (13)  40% (4)
  Sensitivity         0.8649  0.8739   0.8829     0.8739      0.9279     0.8829      0.8468
  Specificity         0.7835  0.8454   0.7938     0.8660      0.8660     0.7526      0.7835

On the sonar data, random forest is again the most accurate (0.8990) with only 11 features, followed by F-score (0.8702) and the entropy method (0.8606, 20 of 60 features); all three clearly beat using the full feature set (0.8269). Seven of the 11 random-forest features (63.6%) are also chosen by the entropy method.

Figure 5-3  Sonar dataset: features selected by (a) the entropy method and (b) random forest.

SPECTF dataset

Table 5-4  Prediction results on the SPECTF dataset

                      All     Entropy  Lasso      F-score     RF        GA          FCBF
  Accuracy            0.7818  0.8000   0.7273     0.8182      0.7909    0.6636      0.7818
  Dimension           44      12       16         9           2         29          4
  Overlap w/ entropy  -       -        37.5% (6)  88.9% (8)   100% (2)  27.6% (8)   50% (2)
  Sensitivity         0.7090  0.6727   0.5636     0.7090      0.7272    0.4545      0.6364
  Specificity         0.8545  0.9273   0.8909     0.9273      0.8545    0.8727      0.9273

On the SPECTF data, F-score is the most accurate (0.8182) with 9 features, and 8 of those 9 (88.9%) are also selected by the entropy method, which is second (0.8000) with 12 of the 44 features. Random forest keeps only 2 features, both also chosen by the entropy method (100% overlap), and still reaches 0.7909, while the genetic algorithm performs worst (0.6636).

Figure 5-4  SPECTF dataset: features selected by (a) the entropy method and (b) F-score.

Chapter 6  Conclusion

Across the four datasets, the entropy method was never the single best performer, but it was consistently close to the best and never failed badly, whereas each competing method broke down on at least one dataset. FCBF, the best filter on the breast cancer data, kept only two features on the heterogeneous BioBehavioral data (Table 3-1) and suffered very low sensitivity there and on the sonar data; random forest and F-score each led on some datasets but fell behind on others; the genetic algorithm was both expensive and unstable. At the same time, the entropy method reduced the dimension substantially on every dataset, for example from 130 to 29 variables on the BioBehavioral data.

We attribute this stability to the way the method groups variables: each core cluster of the DCG-tree collects features that share covariate information, so the retained representatives preserve the covariate structure of the data rather than only marginal relevance to the class. Because the DCG-tree is built from the data's own geometry, the entropy method does not require the number of clusters to be fixed in advance. Overall, the entropy method offers stable prediction performance together with effective dimension reduction, making it a useful complement to existing feature selection techniques.

References

Akay, M. F. (2009). Support vector machines combined with feature selection for breast cancer diagnosis. Expert Systems with Applications, 36(2), 3240-3247.

Chen, Y.-W., & Lin, C.-J. (2006). Combining SVMs with various feature selection strategies. In Feature Extraction (pp. 315-324). Springer.

Díaz-Uriarte, R., & Alvarez de Andrés, S. (2006). Gene selection and classification of microarray data using random forest. BMC Bioinformatics, 7(1), 3. doi:10.1186/1471-2105-7-3

Fushing, H., & McAssey, M. P. (2010). Time, temperature, and data cloud geometry. Physical Review E, 82(6), 061110.

Fushing, H., Wang, H., VanderWaal, K., McCowan, B., & Koehl, P. (2013). Multi-scale clustering by building a robust and self-correcting ultrametric topology on data points. PLoS ONE, 8(2), e56259.

Golub, M. S., Hogrefe, C. E., Widaman, K. F., & Capitanio, J. P. (2009). Iron deficiency anemia and affective response in rhesus monkey infants. Developmental Psychobiology, 51(1), 47-59.

Lee, O. (2017). Data-driven computation for pattern information. ProQuest, UMI Dissertations Publishing.

Raymer, M. L., Punch, W. F., Goodman, E. D., Kuhn, L. A., & Jain, A. K. (2000). Dimensionality reduction using genetic algorithms. IEEE Transactions on Evolutionary Computation, 4(2), 164-171. doi:10.1109/4235.850656

Saeys, Y., Abeel, T., & Van de Peer, Y. (2008). Robust feature selection using ensemble feature selection techniques. In W. Daelemans, B. Goethals, & K. Morik (Eds.), Machine Learning and Knowledge Discovery in Databases: ECML PKDD 2008, Proceedings, Part II (pp. 313-325). Berlin, Heidelberg: Springer.

Saeys, Y., Inza, I., & Larrañaga, P. (2007). A review of feature selection techniques in bioinformatics. Bioinformatics, 23(19), 2507-2517.

Svetnik, V., Liaw, A., Tong, C., & Wang, T. (2004). Application of Breiman's random forest to modeling structure-activity relationships of pharmaceutical molecules. In F. Roli, J. Kittler, & T. Windeatt (Eds.), Multiple Classifier Systems: MCS 2004, Proceedings (pp. 334-343). Berlin, Heidelberg: Springer.

Yu, L., & Liu, H. (2003). Feature selection for high-dimensional data: A fast correlation-based filter solution. In Proceedings of the 20th International Conference on Machine Learning (ICML).

