
CHAPTER IV EXPERIMENT RESULTS

4.1 EXPERIMENT RESULTS

We calculated the contribution of each source corpus to GSP's segmentation of Chinese words; the results are reported in Table 4.1. The average length of the words extracted by the two segmentation systems is reported in Table 4.2, and the average results of 5-fold cross validation are shown in Table 4.3 and Table 4.4.

We also used the experiment to determine which term weighting method and which state-of-the-art classifier to recommend.

First, the overall performance of each term weighting method is compared in Fig. 4.1 through Fig. 4.6. The results show that Binary, TF, TFIDF, and TFCHI achieved the highest performance among all methods.

Second, the overall performance of each multi-class classifier is compared in terms of Exact Match Ratio, Micro-average F1, and Macro-average F1 in Fig. 4.7. The results show that BTMSVM is the recommended classifier among those tested.

Table 4.1 Contribution to Chinese word segmentation from the two source corpora.

| Source corpus | Full text: Number of words | Full text: Percentage | Title: Number of words | Title: Percentage |
|---|---|---|---|---|
| Wikipedia | 2,751 | 17.17% | 712 | 16.52% |
| Google suggest | 13,267 | 82.83% | 3,582 | 83.48% |
| Total | 16,108 | 100.00% | 4,294 | 100.00% |

Table 4.2 The average length of extracted words.

| System | Full text: Number of characters (A) | Full text: Number of words (B) | Full text: Keyword avg. length (C) = A/B | Title: Number of characters (A) | Title: Number of words (B) | Title: Keyword avg. length (C) = A/B |
|---|---|---|---|---|---|---|
| CKIP | 71,762 | 22,550 | 3.18 | 20,621 | 6,532 | 3.16 |
| GSP | 62,459 | 16,108 | 3.90 | 17,149 | 4,291 | 4.00 |
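The average keyword length in column (C) is the character count divided by the word count. A minimal sketch of that arithmetic, using the CKIP row of Table 4.2; the function name `average_keyword_length` is an illustrative assumption and not part of either segmentation system:

```python
def average_keyword_length(num_characters: int, num_words: int) -> float:
    """Return (C) = A / B, rounded to two decimals as in Table 4.2."""
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    return round(num_characters / num_words, 2)


# CKIP counts taken from Table 4.2 (A = characters, B = words).
ckip_full_text = average_keyword_length(71_762, 22_550)
ckip_title = average_keyword_length(20_621, 6_532)

print(ckip_full_text, ckip_title)
```

A larger value of (C) means the system tends to keep longer character sequences together as one word.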

We report the average results of 5-fold cross validation for eight term weighting methods, four classifiers, and three evaluation measures in Table 4.3 and Table 4.4. The dataset in Table 4.3 is the full text, while the dataset in Table 4.4 is the title.
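For readers unfamiliar with the three evaluation measures, the sketch below shows one common way to compute them for single-label predictions. It is an illustration under stated assumptions, not the evaluation code used in the experiment: the function names and the toy labels are hypothetical, and Macro-F1 is averaged over every class that appears in either the true or the predicted labels.

```python
from collections import Counter


def exact_match_ratio(y_true, y_pred):
    """Fraction of documents whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def _counts(y_true, y_pred):
    """Per-class true positives, false positives, and false negatives."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # class p was predicted but is wrong
            fn[t] += 1  # true class t was missed
    return tp, fp, fn


def _f1(tp, fp, fn):
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_f1(y_true, y_pred):
    """F1 computed from counts pooled over all classes."""
    tp, fp, fn = _counts(y_true, y_pred)
    return _f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    tp, fp, fn = _counts(y_true, y_pred)
    classes = set(y_true) | set(y_pred)
    return sum(_f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)


# Hypothetical toy labels, not data from the experiment.
y_true = ["A", "A", "A", "B", "B", "C"]
y_pred = ["A", "A", "B", "B", "B", "A"]

emr = exact_match_ratio(y_true, y_pred)
micro = micro_f1(y_true, y_pred)
macro = macro_f1(y_true, y_pred)
print(f"{emr:.4f} {micro:.4f} {macro:.4f}")
```

When every document receives exactly one predicted label, each error counts once as a false positive and once as a false negative, so Micro-F1 equals Exact Match Ratio. Macro-F1 weights every class equally, so rarely predicted classes pull it down more strongly.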

Table 4.3 The average performance of 5-fold cross validation where the dataset is full text.

| Term weighting | Classifier | Exact Match Ratio (%) CKIP | Exact Match Ratio (%) GSP | Micro-F1 (%) CKIP | Micro-F1 (%) GSP | Macro-F1 (%) CKIP | Macro-F1 (%) GSP |
|---|---|---|---|---|---|---|---|
| Binary | OAOSVM | 82.60 | 80.54 | 82.60 | 80.54 | 76.98 | 73.69 |
| Binary | OAASVM | 79.00 | 75.75 | 81.84 | 79.71 | 74.01 | 71.04 |
| Binary | BSVM | 83.35 | 81.28 | 83.35 | 81.28 | 77.19 | 73.89 |
| Binary | BTMSVM | 91.37* | 88.58 | 91.37* | 88.58 | 90.07* | 87.10 |
| TF | OAOSVM | 79.62 | 77.50 | 79.62 | 77.50 | 71.99 | 69.65 |
| TF | OAASVM | 75.94 | 72.38 | 79.40 | 76.31 | 70.28 | 67.35 |
| TF | BSVM | 80.85 | 76.75 | 80.85 | 76.75 | 74.07 | 66.76 |
| TF | BTMSVM | 91.02 | 87.59 | 91.02 | 87.59 | 84.67 | 80.27 |
| TFIDF | OAOSVM | 65.49 | 74.39 | 65.49 | 74.39 | 37.36 | 57.79 |
| TFIDF | OAASVM | 53.70 | 54.46 | 68.51 | 68.16 | 50.36 | 49.20 |
| TFIDF | BSVM | 73.62 | 73.40 | 73.62 | 73.40 | 57.37 | 55.00 |
| TFIDF | BTMSVM | 87.45 | 86.52 | 87.45 | 86.52 | 78.80 | 86.52 |
| TFCHI | OAOSVM | 79.53 | 79.56 | 79.53 | 79.56 | 71.81 | 73.07 |
| TFCHI | OAASVM | 77.25 | 78.83 | 76.97 | 79.44 | 71.27 | 73.07 |
| TFCHI | BSVM | 10.48 | 10.48 | 10.48 | 10.48 | 4.74 | 4.74 |
| TFCHI | BTMSVM | 91.02 | 88.73 | 91.02 | 88.73 | 84.67 | 84.42 |
| TFIG | OAOSVM | 40.41 | 40.41 | 40.41 | 40.41 | 14.39 | 14.39 |
| TFIG | OAASVM | 0 | 0 | 0 | 0 | 0 | 0 |
| TFIG | BSVM | 40.41 | 40.41 | 40.41 | 40.41 | 14.39 | 14.39 |
| TFIG | BTMSVM | 63.82 | 72.55 | 63.82 | 72.55 | 38.84 | 41.88 |
| TFGR | OAOSVM | 40.41 | 40.41 | 40.41 | 40.41 | 14.39 | 14.39 |
| TFGR | OAASVM | 0 | 0 | 0 | 0 | 0 | 0 |
| TFGR | BSVM | 40.41 | 40.41 | 40.41 | 40.41 | 14.39 | 14.39 |
| TFGR | BTMSVM | 63.82 | 72.55 | 63.82 | 72.55 | 38.84 | 41.88 |
| TFOR | OAOSVM | 60.34 | 57.03 | 60.34 | 57.03 | 34.84 | 32.91 |
| TFOR | OAASVM | 22.53 | 18.37 | 35.86 | 30.27 | 17.09 | 15.54 |
| TFOR | BSVM | 61.88 | 57.56 | 61.88 | 57.56 | 35.70 | 33.14 |
| TFOR | BTMSVM | 68.79 | 76.12 | 68.79 | 76.12 | 45.24 | 51.99 |
| TFRF | OAOSVM | 65.49 | 62.98 | 65.49 | 62.98 | 37.36 | 36.68 |
| TFRF | OAASVM | 44.51 | 37.87 | 59.17 | 52.69 | 34.51 | 31.08 |
| TFRF | BSVM | 66.51 | 64.48 | 66.51 | 64.48 | 37.78 | 37.58 |
| TFRF | BTMSVM | 83.46 | 84.49 | 83.46 | 84.49 | 67.43 | 70.07 |

Note: Numbers in bold indicate the best performance for each classifier.

Note: Numbers marked with '*' indicate the best performance overall.
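As a reading aid for Table 4.3, the sketch below transcribes the CKIP full-text Exact Match Ratio column and picks the best classifier for each term weighting method. The dictionary layout and variable names are illustrative assumptions; the numbers are those given in the table.

```python
# Exact Match Ratio (%), full text, CKIP segmentation, from Table 4.3.
emr_full_text_ckip = {
    "Binary": {"OAOSVM": 82.60, "OAASVM": 79.00, "BSVM": 83.35, "BTMSVM": 91.37},
    "TF":     {"OAOSVM": 79.62, "OAASVM": 75.94, "BSVM": 80.85, "BTMSVM": 91.02},
    "TFIDF":  {"OAOSVM": 65.49, "OAASVM": 53.70, "BSVM": 73.62, "BTMSVM": 87.45},
    "TFCHI":  {"OAOSVM": 79.53, "OAASVM": 77.25, "BSVM": 10.48, "BTMSVM": 91.02},
    "TFIG":   {"OAOSVM": 40.41, "OAASVM": 0.0,   "BSVM": 40.41, "BTMSVM": 63.82},
    "TFGR":   {"OAOSVM": 40.41, "OAASVM": 0.0,   "BSVM": 40.41, "BTMSVM": 63.82},
    "TFOR":   {"OAOSVM": 60.34, "OAASVM": 22.53, "BSVM": 61.88, "BTMSVM": 68.79},
    "TFRF":   {"OAOSVM": 65.49, "OAASVM": 44.51, "BSVM": 66.51, "BTMSVM": 83.46},
}

# Best classifier for each term weighting method.
best_classifier = {
    weighting: max(scores, key=scores.get)
    for weighting, scores in emr_full_text_ckip.items()
}

# Best term weighting method for BTMSVM.
best_weighting_for_btmsvm = max(
    emr_full_text_ckip, key=lambda w: emr_full_text_ckip[w]["BTMSVM"]
)

for weighting, classifier in best_classifier.items():
    print(f"{weighting}: {classifier}")
print("Best weighting for BTMSVM:", best_weighting_for_btmsvm)
```

The same lookup can be repeated for the GSP columns or for the other two measures by transcribing the corresponding columns.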

Table 4.4 The average performance of 5-fold cross validation where the dataset is title.

| Term weighting | Classifier | Exact Match Ratio (%) CKIP | Exact Match Ratio (%) GSP | Micro-F1 (%) CKIP | Micro-F1 (%) GSP | Macro-F1 (%) CKIP | Macro-F1 (%) GSP |
|---|---|---|---|---|---|---|---|
| Binary | OAOSVM | 85.20 | 83.93 | 85.20 | 83.93 | 78.91 | 77.27 |
| Binary | OAASVM | 81.59 | 79.28 | 84.07 | 82.31 | 76.89 | 74.93 |
| Binary | BSVM | 85.94 | 83.41 | 85.94 | 83.41 | 80.77 | 76.76 |
| Binary | BTMSVM | 90.84* | 90.55* | 90.84* | 90.55* | 89.60 | 85.26 |
| TF | OAOSVM | 84.18 | 84.17 | 84.18 | 84.17 | 77.16 | 78.59 |
| TF | OAASVM | 79.84 | 80.29 | 81.95 | 82.52 | 75.30 | 74.75 |
| TF | BSVM | 82.59 | 82.87 | 82.59 | 82.87 | 75.90 | 76.18 |
| TF | BTMSVM | 89.35 | 89.91 | 89.35 | 89.91 | 88.79 | 88.28 |
| TFIDF | OAOSVM | 81.09 | 81.33 | 81.09 | 81.33 | 71.58 | 72.05 |
| TFIDF | OAASVM | 70.13 | 66.51 | 78.53 | 75.44 | 66.05 | 65.71 |
| TFIDF | BSVM | 82.40 | 71.40 | 82.40 | 71.40 | 73.68 | 52.10 |
| TFIDF | BTMSVM | 91.94* | 90.06 | 91.94 | 90.06 | 87.00 | 81.75 |
| TFCHI | OAOSVM | 79.81 | 84.42 | 79.81 | 84.42 | 72.60 | 78.67 |
| TFCHI | OAASVM | 79.81 | 77.71 | 78.46 | 76.31 | 72.69 | 69.19 |
| TFCHI | BSVM | 10.48 | 10.48 | 10.48 | 10.48 | 4.74 | 4.74 |
| TFCHI | BTMSVM | 86.18 | 88.55 | 86.18 | 88.55 | 80.51 | 85.39 |
| TFIG | OAOSVM | 40.41 | 40.41 | 40.41 | 40.41 | 14.39 | 14.39 |
| TFIG | OAASVM | 0 | 0 | 0 | 0 | 0 | 0 |
| TFIG | BSVM | 40.41 | 40.41 | 40.41 | 40.41 | 14.39 | 14.39 |
| TFIG | BTMSVM | 65.85 | 72.55 | 65.85 | 72.55 | 39.32 | 41.88 |
| TFGR | OAOSVM | 40.41 | 40.41 | 40.41 | 40.41 | 14.39 | 14.39 |
| TFGR | OAASVM | 0 | 0 | 0 | 0 | 0 | 0 |
| TFGR | BSVM | 40.41 | 40.41 | 40.41 | 40.41 | 14.39 | 14.39 |
| TFGR | BTMSVM | 65.85 | 72.55 | 65.85 | 72.55 | 39.32 | 41.88 |
| TFOR | OAOSVM | 67.02 | 53.50 | 67.02 | 53.50 | 38.07 | 30.49 |
| TFOR | OAASVM | 65.22 | 15.62 | 66.94 | 26.24 | 38.20 | 13.15 |
| TFOR | BSVM | 66.47 | 52.91 | 66.47 | 52.91 | 37.80 | 30.07 |
| TFOR | BTMSVM | 80.73 | 74.81 | 80.73 | 74.81 | 70.57 | 50.57 |
| TFRF | OAOSVM | 72.65 | 70.65 | 72.65 | 70.65 | 52.25 | 53.37 |
| TFRF | OAASVM | 63.99 | 54.21 | 72.65 | 67.08 | 46.23 | 47.35 |
| TFRF | BSVM | 73.42 | 71.40 | 73.42 | 71.40 | 54.05 | 52.10 |
| TFRF | BTMSVM | 84.15 | 86.43 | 84.15 | 86.43 | 68.16 | 72.16 |

Note: Numbers in bold indicate the best performance for each classifier.

Note: Numbers marked with '*' indicate the best performance overall.

Figs. 4.1 to 4.3 report the overall performance of each term weighting method with the four multi-class classifiers under the three evaluation measures, where the dataset is the full text.

Fig. 4.1 The performance of Exact Match Ratio on full text.

Fig. 4.2 The performance of Micro-F1 on full text.

Fig. 4.3 The performance of Macro-F1 on full text.

[Figure panels not reproduced. In each figure the x-axis lists the term weighting methods (Binary, TF, TFIDF, TFCHI, TFIG, TFGR, TFOR, TFRF), the y-axis is the named evaluation measure, and the panels are labeled by classifier (e.g. OAASVM).]

Figs. 4.4 to 4.6 report the overall performance of each term weighting method with the four multi-class classifiers under the three evaluation measures (Exact Match Ratio, Micro-F1, and Macro-F1), where the dataset is the title. Figs. 4.1 to 4.6 also show that GSP achieves overall performance close to that of CKIP.

Fig. 4.4 The performance of Exact Match Ratio on title.

Fig. 4.5 The performance of Micro-F1 on title.

Fig. 4.6 The performance of Macro-F1 on title.

[Figure panels not reproduced. In each figure the x-axis lists the term weighting methods (Binary, TF, TFIDF, TFCHI, TFIG, TFGR, TFOR, TFRF), the y-axis is the named evaluation measure, the legend entries are CKIP-TITLE and GSP-TITLE, and the panels are labeled by classifier (e.g. OAOSVM, BTMSVM).]

Fig. 4.7 compares the overall performance on the full-text dataset and on the title dataset. BTMSVM performs better than the other classifiers. In addition, most results are better when only the title is used as the dataset.

Fig. 4.7 The comparison of four multi-class SVMs in each term weighting method.

[Figure panels not reproduced. The panels are labeled by term weighting method and segmentation system (e.g. TFIDF - CKIP, TFIG - CKIP, TFOR - CKIP, TFOR - GSP, TFRF - CKIP, TFRF - GSP); the axis categories are Exact Match Ratio, Micro-F1, and Macro-F1; the surviving legend entries include Fulltext, OAASVM, and OAOSVM.]

We also selected one article and compared the results produced by each segmentation system.

The words extracted by the two systems are compared in Table 4.5. The keyword “兩公約” (the International Covenant on Civil and Political Rights and the International Covenant on Economic, Social and Cultural Rights) is a new, out-of-vocabulary (OOV) word and also an absolute keyword for categorization. CKIP segmented “兩公約” as “公約”. From Table 3.14, we found that “公約” may have no discriminative value, based on the conclusions drawn from Table 3.6 and Table 3.7. On manual inspection, this article was in fact categorized into the Student Affairs class because of the OOV word “兩公約”. It is therefore necessary to segment “兩公約” out as a single word.

Table 4.5 The comparison of segmented words.

Raw content:

請尚未填報「兩公約」及「兩公約施行法」辦理情形調查表之學校,儘速於 98 年 12 月 30 日前填報完畢,尚未辦理宣導及講習活動者,也請於 98 年 12 月 30 日前辦理完畢並填報,請 查照。

Results of the two word segmentation systems:

| System | Segmented words | Number of words |
|---|---|---|
| CKIP | 公約/公約/施行法/辦理/情形/調查表/學校/日前/完畢/辦理/宣導/講習/活動/日前/辦理/完畢/查照 | 17 |
| GSP | 尚未/兩公約/兩公約施行法/理情/調查表/學校/日前/尚未/宣導/講習/活動/日前/查照 | 13 |

Note: Words in bold are absolute keywords for categorization.
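The comparison in Table 4.5 can be checked mechanically: split each system's output on “/” and test whether the keyword “兩公約” survives as a single token. This is a minimal sketch; the variable and function names are illustrative assumptions, and the segmented strings are copied from the table.

```python
from typing import List

KEYWORD = "兩公約"

# Segmentation outputs copied from Table 4.5.
ckip_output = (
    "公約/公約/施行法/辦理/情形/調查表/學校/日前/完畢/辦理/"
    "宣導/講習/活動/日前/辦理/完畢/查照"
)
gsp_output = (
    "尚未/兩公約/兩公約施行法/理情/調查表/學校/日前/尚未/"
    "宣導/講習/活動/日前/查照"
)


def tokens(segmented: str) -> List[str]:
    """Split a '/'-delimited segmentation result into its words."""
    return segmented.split("/")


ckip_tokens = tokens(ckip_output)
gsp_tokens = tokens(gsp_output)

for name, toks in (("CKIP", ckip_tokens), ("GSP", gsp_tokens)):
    # Report the word count and whether the keyword is kept whole.
    print(name, len(toks), KEYWORD in toks)
```

An exact token match is used deliberately: a substring test would also succeed on “兩公約施行法” and would hide the difference between the two systems.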
