
A novel feature selection method for large-scale data set


Academic year: 2022


Title: A novel feature selection method for large-scale data set

Authors: W. C. Chen; S. S. Tseng

Contributor: Department of Information Science and Applications

Keywords: machine learning; knowledge discovery; feature selection; bitmap indexing; rough set

Date: 2005

Upload time: 2009-11-30T08:03:24Z

Publisher: Asia University

Abstract: Feature selection is the task of finding the useful (relevant) features that describe an application domain. Finding a minimal subset of features that can describe all of the concepts in a given data set is NP-hard. We previously proposed a feature selection method, based on rough set and bitmap indexing techniques, that efficiently selects the optimal (minimal) feature set for a given data set. Although that method guarantees an optimal solution, its computation cost is very high when the number of features is huge. In this paper, we propose a nearly optimal method, called the bitmap-based feature selection method with discernibility matrix, which reduces processing time by using a discernibility matrix to record the important features during the construction of the cleansing tree. The corresponding indexing and selecting algorithms are also proposed. Finally, experiments and comparisons are given, and the results show the efficiency and accuracy of the proposed method.
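The role of a discernibility matrix can be illustrated with a minimal sketch. It is not the authors' algorithm: the record does not describe the cleansing tree or the bitmap index, so the code below assumes the standard rough-set definition (for every pair of objects with different decision labels, record the features on which they differ) and swaps in a simple greedy covering heuristic to pick a small, not necessarily minimal, feature set. The function names `discernibility_matrix` and `greedy_reduct` and the toy data are hypothetical.

```python
from itertools import combinations


def discernibility_matrix(rows, labels):
    """Return one entry per pair of objects with different labels.

    Each entry is the set of feature indices on which the two objects
    differ. Pairs that differ in no feature (inconsistent data) are skipped.
    """
    entries = []
    for (i, a), (j, b) in combinations(enumerate(rows), 2):
        if labels[i] != labels[j]:
            diff = frozenset(k for k, (x, y) in enumerate(zip(a, b)) if x != y)
            if diff:
                entries.append(diff)
    return entries


def greedy_reduct(entries):
    """Greedy heuristic (an assumption, not the paper's method).

    Repeatedly pick the feature that appears in the most uncovered
    entries, until every entry contains at least one selected feature.
    """
    selected = set()
    remaining = list(entries)
    while remaining:
        counts = {}
        for entry in remaining:
            for feature in entry:
                counts[feature] = counts.get(feature, 0) + 1
        # Ties are broken in favour of the lowest feature index.
        best = max(sorted(counts), key=counts.get)
        selected.add(best)
        remaining = [entry for entry in remaining if best not in entry]
    return selected


# Toy data set: feature 1 alone determines the label.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0)]
labels = ["no", "yes", "no", "yes"]

matrix = discernibility_matrix(rows, labels)
print(greedy_reduct(matrix))
```

On this toy data every entry of the matrix contains feature 1, so the heuristic selects it alone. A greedy cover of this kind trades guaranteed minimality for speed, which is the same general trade-off the abstract describes for its nearly optimal method.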
