
CHAPTER 4 EXPERIMENT DESIGN AND RESULTS

4.2 Performance Evaluation

National Chengchi University

Figure 5: The 24th simulation set.


Here we briefly describe how the DSM's performance is evaluated in this study. Each instance is defined as a theoretical non-outlier or a theoretical outlier when the experiment is designed, by judging ε_t, so we can compare this theoretical type with the identification result of the proposed DSM. There are four possible outcomes (the first character of each term stands for the theoretical type and the second character stands for the type identified by the proposed DSM):

(1) N-O: A theoretical non-outlier that has been incorrectly specified as an outlier candidate.

(2) O-N: A theoretical outlier that has been incorrectly specified as a non-outlier.

(3) O-O: A theoretical outlier that has been correctly specified as an outlier candidate.

(4) N-N: A theoretical non-outlier that has been correctly specified as a non-outlier.

Table 4 lists the two types of misidentification.

Table 4: The possible experiment outcome measurement

Error Type | Meaning
Ratio of Type I error | The proportion of theoretical non-outliers that have been incorrectly identified as outlier candidates.
Ratio of Type II error | The proportion of theoretical outliers that have been incorrectly identified as non-outliers.
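
As a minimal sketch, the two ratios in Table 4 could be computed from per-instance labels as follows. The function names and the toy labels are hypothetical and only illustrate the definitions; they are not the code used in the experiment.

```python
from collections import Counter


def outcome_counts(theoretical, identified):
    """Count the four outcomes; each label is 'N' (non-outlier) or 'O' (outlier)."""
    if len(theoretical) != len(identified):
        raise ValueError("label sequences must have the same length")
    counts = Counter(f"{t}-{d}" for t, d in zip(theoretical, identified))
    return {key: counts.get(key, 0) for key in ("N-O", "O-N", "O-O", "N-N")}


def error_ratios(counts):
    """Return (Type I ratio, Type II ratio) following the definitions in Table 4."""
    non_outliers = counts["N-O"] + counts["N-N"]  # all theoretical non-outliers
    outliers = counts["O-N"] + counts["O-O"]      # all theoretical outliers
    type_1 = counts["N-O"] / non_outliers if non_outliers else 0.0
    type_2 = counts["O-N"] / outliers if outliers else 0.0
    return type_1, type_2


# Toy labels invented for illustration only (not experiment data).
theoretical = ["N", "N", "O", "N", "O", "N", "N", "N"]
identified = ["N", "O", "O", "N", "N", "N", "N", "N"]

counts = outcome_counts(theoretical, identified)
type_1, type_2 = error_ratios(counts)
print(counts, type_1, type_2)
```

With these toy labels, one of six theoretical non-outliers is flagged (Type I) and one of two theoretical outliers is missed (Type II).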

In practical applications, a Type II error is more serious than a Type I error, mainly because it may do much more harm to our decisions or operations. In other words, Type II errors may incur greater losses for companies or organizations (Joo, Hong & Han, 2003). For this reason, we hope that the proportion of Type II errors can be kept very low.

Here we take the 95th simulation set as an example; Table 5 shows its result. We also present the overall performance of the 100 simulation sets in Table 6 for discussion.

Table 5: The experiment result of the 95th simulation set

[Table body not recoverable. Surviving column headings: M; Time Stamp; Training block; Testing block; Learning block; Potential outlier; Evaluation.]

Table 6: The performance of the whole experiment

[Table body not recoverable. Surviving headings: Total 100; Total amount of O-O; Total amount of O-N; Total amount of N-O; Potential outlier; Testing block; Learning block.]

Table 5 shows that the accuracy of every window is higher than 95%, and some windows even reach 100%. Here, we adopt the Type I and Type II indicators of Table 4 to measure the experiment performance.

Across the 100 simulation sets, the total number of theoretical outliers is 281. Of these, 275 have been identified as outlier candidates: 79 were detected within the training block (28.7%) and 196 within the testing block (71.3%). This is encouraging, since it means that theoretical outliers can be detected as early as possible, namely while they are still in the testing block. Note that some theoretical outliers already lie in the training block when M = 1. If the DSM is applied to long-term or unbounded time series data, it is therefore likely that an even larger share of the theoretical outliers will be output as outlier candidates in the testing block.

In our study, we hope to keep the proportion of Type II errors low. To that end, we adopt the following rule: if an instance is identified as an outlier candidate in any window, it is output to the decision maker the first time this happens. The decision maker then evaluates whether it is a real outlier or not. Under this mechanism, the decision maker only has to review the outlier candidates provided by the DSM, which amount to approximately 12% of the data. The DSM thus greatly reduces the amount of data that needs to be reviewed and achieves the time-saving goal.
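
The first-time output rule can be sketched as follows, assuming the DSM yields one set of outlier candidates per window. The function name `first_time_outputs` and the candidate sets are hypothetical and serve only to illustrate the rule.

```python
def first_time_outputs(candidates_per_window):
    """Yield (window M, instance index) the first time an instance is flagged.

    An instance flagged again in a later window is not yielded again,
    because the decision maker has already reviewed it.
    """
    already_output = set()
    for m, candidates in enumerate(candidates_per_window, start=1):
        for t in sorted(candidates):
            if t not in already_output:
                already_output.add(t)
                yield m, t


# Hypothetical candidate sets for three consecutive windows (M = 1, 2, 3).
windows = [{32, 55}, {32, 55, 82, 109}, {55, 82}]
review_queue = list(first_time_outputs(windows))
print(review_queue)
```

Here the decision maker reviews four instances in total, although eight flags were raised across the three windows.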

In Table 6, 6 theoretical outliers are not detected as outlier candidates, which is approximately 2% of the theoretical outliers. The total number of outlier candidates output is 2408, around 12% of the data. As with the theoretical outliers, a large proportion of the outlier candidates are output while they are in the testing block. Furthermore, the 6 theoretical outliers that are not identified as outlier candidates are distributed over 6 different simulation sets.
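
The aggregate figures quoted in this section can be checked with a short calculation. This is only a sketch: the counts are taken from the text, while the assumption that every simulation set contains 200 instances is ours, based on the 200 instances reported for individual sets.

```python
SETS = 100                  # simulation sets
INSTANCES_PER_SET = 200     # assumed to hold for every set
THEORETICAL_OUTLIERS = 281  # theoretical outliers over all sets
DETECTED = 275              # theoretical outliers output as candidates (O-O)
DETECTED_TRAINING = 79      # of which detected within the training block
DETECTED_TESTING = 196      # of which detected within the testing block
CANDIDATES_OUTPUT = 2408    # all outlier candidates output

missed = THEORETICAL_OUTLIERS - DETECTED        # O-N cases
type_2_ratio = missed / THEORETICAL_OUTLIERS    # Ratio of Type II error
training_share = DETECTED_TRAINING / DETECTED
testing_share = DETECTED_TESTING / DETECTED
review_fraction = CANDIDATES_OUTPUT / (SETS * INSTANCES_PER_SET)

print(f"missed outliers:    {missed}")
print(f"Type II ratio:      {type_2_ratio:.1%}")
print(f"training share:     {training_share:.1%}")
print(f"testing share:      {testing_share:.1%}")
print(f"data to be reviewed: {review_fraction:.1%}")
```

Under these assumptions the calculation agrees with the rounded figures in the text: 6 missed outliers (about 2%), a 28.7% / 71.3% split between blocks, and about 12% of the data sent for review.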

To investigate what causes the Type II errors, we examine each of these experiment results and discuss the possible reasons. Recall that a Type II error arises when a theoretical outlier is misidentified as a non-outlier. Given the characteristics of the application area, we also discuss some possible solutions for these cases.

In the 23rd, 37th and 75th simulation sets, the theoretical outliers are in the testing block of the last window, which consists of the 196th to 200th instances. Furthermore, in each case the theoretical outlier is close to the majority of the data.

Figure 6 shows all 200 instances of the 23rd simulation set. In this set, the 41st, 120th, 183rd and 196th instances are theoretical outliers. Only the 197th instance was not identified as an outlier candidate by the proposed DSM. In the last window, where M = 20, the training block consists of the 95th to 195th instances and the testing block consists of the 196th to 200th instances, where the 197th instance is a theoretical outlier and the 196th and 198th to 200th instances are theoretical non-outliers. The result of the last window shows that the 120th and 183rd instances had already been correctly identified as outlier candidates. However, the 196th instance was incorrectly identified as a non-outlier, and its value is close to the values of its neighboring instances. We think that this small deviation between the instance and its neighbors causes the misidentification, so that the 196th theoretical outlier is identified as a non-outlier. However, if more time series data were available for further windows, we think the DSM would have a high chance of identifying it successfully.

Figure 6: The 23rd simulation set.

Figures 7 and 8 show the 37th and the 45th simulation sets, respectively. In these two sets, the problem is similar to that of the 23rd set discussed above: one theoretical outlier in the last window is not correctly identified as an outlier candidate. In the 37th set, the 197th instance is close to the majority of the data. In the 45th set, the 196th instance clearly has the same feature. We think that the small deviation between these instances and the majority of the data causes the misidentification, so that the 197th theoretical outlier in the 37th set and the 196th theoretical outlier in the 45th set are incorrectly identified as non-outliers. Although these instances have not been identified as outlier candidates, we believe that, if more time series data were available for further windows, the proposed DSM would have a high chance of identifying them correctly in the following windows.

Figure 7: The 37th simulation set.

Figure 8: The 45th simulation set.

In the 56th simulation set, Figure 9 shows that both the 24th and 153rd instances are theoretical outliers, but the 24th instance is not detected as an outlier candidate while M = 1, 2, 3 and 4. This theoretical outlier is detected as a potential outlier in the first window, where M = 1. Regrettably, in the subsequent windows it is regarded as a non-outlier until it is discarded. We think that, if more historical data with the same concept were available and processed together with this instance, this theoretical outlier might be distinguished as an outlier candidate. Another solution is to shrink the window size to fit the nature of the data, so that the outlier may be detected successfully.

Figure 9: The 56th simulation set.

Consider the 49th and 97th simulation sets, whose situation is similar to that of the 56th set. The difference is that the theoretical outlier in the 49th and 97th sets is not even detected as a potential outlier. In Figures 10 and 11, although the 6th instance in both the 49th and 97th sets seems obvious enough to be distinguished as an outlier candidate, the envelope module does not distinguish it correctly. Considering the whole window where M = 1 or 2, the majority of the data seems highly similar to the latter part of that window.

Owing to the similarity between the theoretical outlier and the majority of the data in these windows, we think the possible solutions are the same as for the 56th set. One is to have more historical data with the same concept, so that the theoretical outlier might be distinguished as an outlier candidate in an earlier window. The other is to adjust the window size to fit the nature of the data, so that the outlier may be detected successfully. For the initial window in particular, we consider that the DSM could run an extra window of a suitable size for the early instances.

Based on this simulation data, we find that the trend of the data may make a theoretical outlier near the origin strongly similar to the instances in the later part of the window. We therefore suggest that the DSM run an extra window of an appropriately small size to identify such theoretical outliers correctly.

Figure 10: The 49th simulation set.

Figure 11: The 97th simulation set.

Let us return to a successful experiment set; refer to Table 5 and Figure 3. In the 95th simulation set, the theoretical outliers are identified well in most windows. However, while M = 8, 9, 10 and 11, the SLFN incorrectly specifies a theoretical outlier as a non-outlier, which produces a Type II error case. Given the characteristics of the application area, we use an auxiliary decision support rule: if an outlier candidate has been output before, we do not spend further effort on it, because the decision maker has already determined whether that potential outlier is a real outlier or just a normal instance.

For discussion, we show charts of some representative windows of the 95th experiment set below. First, we explain the elements of each chart. The green line is the fitting function trained by the DSM. The yellow upper and lower lines are the boundaries of the envelope. The envelope's bulk ε is set to 5: it is based on the standard deviation of the error term, which is 2, multiplied by 2.5 according to the confidence level, giving 2 × 2.5 = 5. The total width of the envelope is therefore 2ε = 10.
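
The envelope arithmetic can be sketched as follows. The values σ = 2 and the factor 2.5 come from the text; the linear `fitted_value` function, the observations, and the rule that a point strictly outside the envelope is flagged are assumptions made only for illustration, not the trained model or the exact envelope module.

```python
SIGMA = 2.0                   # standard deviation of the error term (from the text)
MULTIPLIER = 2.5              # factor tied to the confidence level (from the text)
EPSILON = SIGMA * MULTIPLIER  # envelope bulk: 5
ENVELOPE_WIDTH = 2 * EPSILON  # total width of the envelope: 10


def outside_envelope(y, fitted, epsilon=EPSILON):
    """True when y lies outside [fitted - epsilon, fitted + epsilon]."""
    return abs(y - fitted) > epsilon


def fitted_value(t):
    # Hypothetical stand-in for the fitting function (the green line).
    return 0.5 * t + 10.0


# Hypothetical observations, keyed by time stamp t.
observations = {10: 16.0, 11: 22.5, 12: 15.0}
flags = {t: outside_envelope(y, fitted_value(t)) for t, y in observations.items()}
print(EPSILON, ENVELOPE_WIDTH, flags)
```

In this toy example only the observation at t = 11 deviates from the fitted value by more than 5, so it is the only one flagged.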

Next, we explain the meaning of the markers in the charts. The blue circle is a theoretical non-outlier distinguished as a non-outlier in the training block; it lies inside the envelope. The yellow triangle is a theoretical non-outlier that is distinguished and output as an outlier candidate. The white square is a theoretical outlier correctly distinguished as an outlier candidate. The red square is a theoretical outlier incorrectly distinguished as a non-outlier, which causes a Type II error. The white circle is a theoretical non-outlier distinguished as a non-outlier in the testing block.

Figure 12: The moving window of the 95th simulation set when M = 2

Figure 12 shows the 2nd window. In this window, the 6th to 105th instances compose the training block. There are 2 theoretical outliers in this window, the 32nd and the 55th instances, both in the training block. Looking at the trend, especially in the testing block, the data appear to be rising.

As for the result, the proposed DSM successfully identifies the 32nd and 55th instances, the theoretical outliers, as outlier candidates in this window. On the other hand, it also outputs 2 theoretical non-outliers: the 82nd instance in the training block and the 109th instance in the testing block. These 2 instances are Type I errors.
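
To make the window layout concrete, the following sketch computes the training and testing blocks of the M-th window. The sizes are assumptions inferred from the text, not stated parameters: a training block of 100 instances (instances 6 to 105 when M = 2), a testing block of 5 instances (instances 196 to 200 in the last window), and a step of 5 instances between windows. The function names are hypothetical.

```python
TRAIN_SIZE = 100  # assumed from the training block 6..105 when M = 2
TEST_SIZE = 5     # assumed from the last testing block 196..200
STEP = 5          # assumed: each window advances by one testing block


def window_blocks(m):
    """Return ((train_first, train_last), (test_first, test_last)), 1-based and inclusive."""
    if m < 1:
        raise ValueError("M starts at 1")
    train_first = (m - 1) * STEP + 1
    train_last = train_first + TRAIN_SIZE - 1
    return (train_first, train_last), (train_last + 1, train_last + TEST_SIZE)


def block_of(t, m):
    """Return 'training', 'testing', or None for instance t in window m."""
    (train_first, train_last), (test_first, test_last) = window_blocks(m)
    if train_first <= t <= train_last:
        return "training"
    if test_first <= t <= test_last:
        return "testing"
    return None


print(window_blocks(2))
print(block_of(82, 2), block_of(109, 2))
```

Under these assumptions the 82nd instance falls in the training block and the 109th instance in the testing block of the 2nd window, which matches the description above.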

Figure 13: The moving window of the 95th simulation set when M = 13

Figure 13 shows the 13th window. In this window, the 119th and 162nd instances are theoretical outliers, and the DSM successfully identifies them as outlier candidates. It is worth noting that the 162nd instance is correctly distinguished as an outlier candidate while it is in the testing block.

Figure 14: The moving window of the 95th simulation set when M = 16

Figure 14 shows the 16th window. We can see that the concept has changed after t = 151. Even with the change of concept, the DSM still handles outlier detection well. Although the 82nd instance is incorrectly detected as an outlier candidate, which is a Type I error, given the intended application area we still output this instance to the decision maker as an outlier candidate. Meanwhile, the 119th and 162nd instances are still correctly distinguished as outlier candidates.

In summary, the 95th simulation set contains 4 theoretical outliers in total, and all of them have been detected. One promising result is that both the 119th and 162nd theoretical outliers are correctly distinguished as outlier candidates the first time they appear in the testing block. The 32nd and 55th theoretical outliers are correctly distinguished as outlier candidates while M = 1; that is, they are detected by the proposed DSM in the initial window. In total, 16 instances, merely 8% of the data in this simulation set, have been output as outlier candidates. This result clearly shows that the DSM provides good decision support.

Another interesting issue in computer applications is the zero-day attack or vulnerability. A zero-day attack is a cyber-attack exploiting a vulnerability that has not been disclosed publicly (Bilge & Dumitras, 2012). In practice, there is almost no defense against a zero-day attack. Furthermore, an IDS that relies on signature-based scanning can hardly detect such an attack while it remains publicly unknown.

From the literature review on zero-day attacks, the zero-day attack problem can be dealt with using unsupervised learning techniques. Because the DSM is not informed that the concept drifts at t = 151 in our experiment, we regard the first theoretical outlier that appears at t ≥ 151 as the zero-day attack. In our research, all of the zero-day attacks in the 100 simulations have been detected successfully. Table 7 shows that the proportion of zero-day attacks identified as outlier candidates in the testing block is 45%; that is, in 45 sets the zero-day attack is detected in the testing block.

Table 7: Performance in detecting zero-day attacks

Detected in which block | Testing block | Training block
Total 100 simulation sets | 45 | 55
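
The zero-day definition used here can be sketched as a small helper that picks the first theoretical outlier at or after the drift point. The function name is hypothetical; the drift time, the outlier positions of the 95th simulation set, and the Table 7 totals are taken from the text.

```python
DRIFT_TIME = 151  # the concept drifts at t = 151 (from the text)


def zero_day_instance(theoretical_outliers, drift_time=DRIFT_TIME):
    """Return the first theoretical outlier with t >= drift_time, or None if there is none."""
    after_drift = [t for t in theoretical_outliers if t >= drift_time]
    return min(after_drift) if after_drift else None


# Theoretical outliers of the 95th simulation set, as listed in the text.
zero_day = zero_day_instance([32, 55, 119, 162])

# Totals from Table 7: sets whose zero-day attack was detected in each block.
detected_in_testing, detected_in_training = 45, 55
testing_share = detected_in_testing / (detected_in_testing + detected_in_training)
print(zero_day, f"{testing_share:.0%}")
```

For the 95th simulation set this selects the 162nd instance, the one reported as detected in the testing block when M = 13.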

From the document: A Decision Support Mechanism for Detecting Outliers in Dynamic Environments - NCCU Academic Hub (pages 47-63)