
Data Structures and Algorithms (NTU, Spring 2012) instructor: Hsuan-Tien Lin

Homework #5

TA in charge: Yu-Cheng Chou, Ya-Hsuan Chang and Wei-Yuan Shen

RELEASE DATE: 05/04/2012 DUE DATE: 05/18/2012, 17:00

Specification of 5.3(2) and 5.3(5)

We provide two data sets, heart and wine, for you to test your program. We may use other data sets to evaluate its performance. If you are interested in trying more data sets, see the UCI Machine Learning Repository.

http://archive.ics.uci.edu/ml/datasets.html

Input Format

For both 5.3(2) and 5.3(5), the first argument of your program should be the training data file, which contains the examples. For the Random Forest in 5.3(5), the second argument should be the number of decision trees to build. For example,

./tree heart
./forest heart 20

Data Format

The first line contains two integers n and m: n is the number of examples and m is the total number of factors. Each of the following n lines represents one example in the following format, where the numbers are separated by a space:

label factor[0] factor[1] ... factor[m-1]

For instance, in the line

1 14.23 1.71 2.43 15.6 127 2.8 3.06 0.28 2.29 5.64 1.04 3.92 1065

1 is the label and the rest are the factors.

Output Format

Decision Tree

Please output your tree as a function in C/C++ language. The function must follow this signature:

int tree_predict(double *attr);

The only argument is a double array that contains the factors of one example, in the same format as the input. The function should return the predicted label of the example (1 or -1 for heart, for instance). Name your output file "tree_pred.h". You can then compile and run the provided "tree_predictor.cpp" to check how good your decision tree is (see README). For example, your "tree_pred.h" should look like:

int tree_predict(double *attr){
    if(attr[0] > 5){
        return 1;
    } else{
        return -1;
    }
}


Random Forest

As with the decision tree function, you need to output your forest as a function in C/C++. The function must follow this signature:

int forest_predict(double *attr);

The argument and return value are specified in the same way as for the decision tree function. Name your output file "forest_pred.h". You can then compile and run the provided "forest_predictor.cpp" to check how good your Random Forest is (see README). For example, your "forest_pred.h" should look like:

int forest_predict(double *attr){
    tree1_predict:
    tree2_predict:
    treeT_predict:
    voting:
}
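To make the outline above concrete, here is a hedged illustration of what a complete "forest_pred.h" could look like for T = 3. The three tree functions and their thresholds are made up for illustration. Majority voting with ties broken in favor of the earliest tree is one reasonable choice and is an assumption here, since the specification only names a voting step.

```cpp
// Hypothetical per-tree functions; a real forest would contain the T trees
// built by your program.
static int tree1_predict(double *attr){
    if(attr[0] > 5){ return 1; } else{ return -1; }
}
static int tree2_predict(double *attr){
    if(attr[1] > 2){ return 1; } else{ return -1; }
}
static int tree3_predict(double *attr){
    if(attr[2] > 1){ return 1; } else{ return -1; }
}

int forest_predict(double *attr){
    const int T = 3;
    int votes[T];
    votes[0] = tree1_predict(attr);
    votes[1] = tree2_predict(attr);
    votes[2] = tree3_predict(attr);

    // Voting: return the label predicted by the most trees.
    // Ties go to the label that appears first (an assumption).
    int best_label = votes[0];
    int best_count = 0;
    for(int i = 0; i < T; ++i){
        int count = 0;
        for(int j = 0; j < T; ++j){
            if(votes[j] == votes[i]){ ++count; }
        }
        if(count > best_count){
            best_count = count;
            best_label = votes[i];
        }
    }
    return best_label;
}
```

Counting votes per label, rather than summing the predictions, keeps the sketch valid for data sets whose labels are not 1 and -1.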

Data Set Description

This section describes the meaning of the factors and the label in each data set.

Wine

label indicates one of two types of wine, from two different cultivars.

factors are:

1) Alcohol
2) Malic acid
3) Ash
4) Alcalinity of ash
5) Magnesium
6) Total phenols
7) Flavanoids
8) Nonflavanoid phenols
9) Proanthocyanins
10) Color intensity
11) Hue
12) OD280/OD315 of diluted wines
13) Proline

Heart

The data set describes the diagnosis of cardiac Single Photon Emission Computed Tomography (SPECT) images.

label indicates one of two categories of patients: normal and abnormal.

factors are:

1. F1R: continuous (count in ROI (region of interest) 1 in rest)
2. F1S: continuous (count in ROI 1 in stress)
3. F2R: continuous (count in ROI 2 in rest)
4. F2S: continuous (count in ROI 2 in stress)
5. F3R: continuous (count in ROI 3 in rest)
6. F3S: continuous (count in ROI 3 in stress)
7. F4R: continuous (count in ROI 4 in rest)
8. F4S: continuous (count in ROI 4 in stress)
...

- all continuous attributes have integer values from 0 to 100
