
2. LITERATURE REVIEW



Family structure. Family structure affects the likelihood of a student loan default in several ways. First, the more family members a student claims as dependents, the more likely the loan is to default (Volkwein & Szelest, 1995). Volkwein and Szelest (1995) found that each dependent child in a household increased the probability of default by 4.5%.

Being a single parent was also associated with a greater risk of loan default (Volkwein et al., 1998), and being separated, divorced, or widowed was found to increase the probability of defaulting by more than 7 percent (Volkwein & Szelest, 1995). Finally, families can affect student loan default by providing financial support: students who can rely on family support are less likely to default than those who cannot (Volkwein et al., 1998).

Income. As common sense suggests, students from low-income families tend to incur more debt during school than their wealthier peers (Herr & Burt, 2005). Therefore, the higher the family income, the lower the probability of a student defaulting. Wealthier families can provide a stronger financial safety net for students than low-income families can, and this safety net also helps students meet their loan obligations despite fluctuations in personal income (Gross, 2009).

2.3 Artificial Neural Networks

2.3.1 Single-Hidden Layer Feedforward Neural Networks (SLFN)

To handle anomaly detection and resistant learning, Tsaih and Cheng (2009) implemented an adaptive SLFN. The fitting function of the SLFN is defined as:

$$a_i(\mathbf{x}) = \tanh\!\left(w_{i0}^{H} + \sum_{j=1}^{m} w_{ij}^{H} x_j\right)$$

where m is the number of input nodes, the superscript H refers to quantities related to the hidden layer, $w_{i0}^{H}$ is the bias of the $i^{th}$ hidden node, and $w_{ij}^{H}$ is the weight between the $j^{th}$ input node and the $i^{th}$ hidden node; the superscript o throughout the paper refers to quantities related to the output layer, and $w_i^{o}$ is the weight between the $i^{th}$ hidden node and the output node. In their study, a character in bold represents a column vector, a matrix, or a set, and the superscript T indicates transposition.

Through this SLFN, the input information $\mathbf{x}$ is first transformed into $\mathbf{a} \equiv (a_1, a_2, \ldots, a_p)^{T}$, where p is the number of hidden nodes, and the corresponding value of f is generated from $\mathbf{a}$ rather than $\mathbf{x}$.

That is, given an observation, the corresponding values of all hidden nodes are first calculated with $a_i \equiv \tanh(w_{i0}^{H} + \sum_{j=1}^{m} w_{ij}^{H} x_j)$ for all i, and the corresponding value $f(\mathbf{x})$ is then calculated as $f(\mathbf{x}) = g(\mathbf{a}) \equiv w_0^{o} + \sum_{i=1}^{p} w_i^{o} a_i$ (Tsaih and Lian, 2017).
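To make this notation concrete, the following is a minimal numpy sketch of the SLFN forward pass described above; the array shapes and the names slfn_forward, w_H, and w_o are illustrative assumptions rather than notation taken from Tsaih and Cheng (2009).

```python
import numpy as np

def slfn_forward(x, w_H, w_o):
    """Forward pass of a single-hidden layer feedforward network (SLFN).

    x   : (m,) input vector
    w_H : (p, m+1) hidden-layer weights; column 0 holds the biases w_i0^H
    w_o : (p+1,) output weights; w_o[0] is the output bias w_0^o
    """
    # a_i = tanh(w_i0^H + sum_j w_ij^H * x_j) for each hidden node i
    a = np.tanh(w_H[:, 0] + w_H[:, 1:] @ x)
    # f(x) = w_0^o + sum_i w_i^o * a_i
    return w_o[0] + w_o[1:] @ a

# Example with m = 3 inputs and p = 2 hidden nodes (random illustrative weights)
rng = np.random.default_rng(0)
x = rng.normal(size=3)
w_H = rng.normal(size=(2, 4))
w_o = rng.normal(size=3)
print(slfn_forward(x, w_H, w_o))
```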

2.3.2 The Resistant Learning with Envelope Module (RLEM)

Tsaih and Cheng (2009) proposed a resistant learning outlier detection mechanism that combines the SLFN with a tiny pre-specified value (10⁻⁶) to deduce a function form. The mechanism dynamically adapts the number of hidden nodes and the weights of the SLFN during the training process. In addition, they implemented both robustness analysis and deletion diagnostics to exclude potential outliers at an early stage, thus preventing the SLFN from learning them (Rousseeuw and Driessen, 2006).


In summary, the weight-tuning mechanism, the recruiting mechanism, and the reasoning mechanism are implemented to allow the SLFN to evolve dynamically during the learning process and to explore an acceptable non-linear relationship between the explanatory variables and the response in the presence of outliers.

Huang et al. (2014) proposed an envelope bulk mechanism integrated with the SLFN to handle the outlier detection problem. This outlier detection algorithm is performed with an envelope bulk whose width is 2ε. With the envelope module, ε is changed from a tiny value (10⁻⁶) to a non-tiny value (1.96); the value 1.96 corresponds to the 5% significance level when the distribution is assumed to be normal, since P(|Z| > 1.96) ≈ 0.05 for a standard normal variable Z.

An instance is distinguished as an outlier if its residual is greater than ε ∗ γ ∗ σ, where σ is the standard deviation of the residuals of the current reference observations and γ is a constant equal to or greater than 1.0, chosen according to how stringent the user wants the outlier detection to be. The smaller the γ value, the more stringent the outlier detection. Furthermore, if stricter requirements are needed, the ε value can also be adjusted to an appropriate value (Tsaih and Wu, 2017).
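To make the criterion concrete, the following is a minimal sketch (variable names are my own, not from Huang et al. (2014) or Tsaih and Wu (2017)) that flags observations whose absolute residual exceeds ε ∗ γ ∗ σ; the absolute value reflects the symmetric ±ε shape of the envelope.

```python
import numpy as np

def flag_outliers(residuals, epsilon=1.96, gamma=1.0):
    """Flag observations whose |residual| exceeds epsilon * gamma * sigma.

    sigma is taken here as the standard deviation of the supplied residuals;
    in the RLEM it is computed over the current reference observations only.
    A smaller gamma (>= 1.0) makes the detection more stringent.
    """
    residuals = np.asarray(residuals, dtype=float)
    sigma = residuals.std()
    return np.abs(residuals) > epsilon * gamma * sigma

# Example: only the last residual lies far outside the +/- epsilon*gamma*sigma band
print(flag_outliers([0.10, -0.20, 0.05, 0.15, 3.50]))
```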

In short, the envelope module encapsulates the response values regarded as inliers within the envelope, whereas the responses regarded as outliers are not wrapped inside it. The number of inliers is determined by ε and γ: the stricter these parameters are, the fewer inliers fall inside the envelope and, correspondingly, the more potential outliers are identified by the envelope module. As stated in Huang et al. (2014), the resistant learning algorithm with the envelope module is presented in Table 2. In Step 2, k refers to the percentage of potential outliers, which means that at least (1 − k) of the data will be wrapped into the envelope. For instance, if there are at least approximately 95% non-outliers and at most 5% outliers, the SLFN will take 95% of the data into consideration while it is being built.

Table 2 summarizes the resistant learning algorithm with the envelope module for coping with the outlier detection problem. In Table 2, N is the total number of training cases, m is the number of input nodes, and the width of the envelope bulk is 2ε.

In Step 2, k refers to the percentage of potential outliers, and these potential outliers will not be wrapped into the envelope.

Table 2: The resistant learning with envelope module (Huang et al., 2014)

Step 1: Use the first m+1 reference observations in the training data set to set up an acceptable SLFN estimate with one hidden node. Set n = m+2.

Step 2: If n > N*(1 – k), STOP.

Step 3.1: Use the obtained SLFN to calculate the squared residuals regarding all N training data.

Step 3.2: Present the n reference observations (xc, yc) that have the n smallest squared residuals among the current squared residuals of all N training data.

Step 4: If all of the smallest n squared residuals are less than ε (half the width 2ε of the envelope), then go to Step 7; otherwise, there is one and only one squared residual that is larger than ε.

Step 5: Set 𝐰̃ = 𝐰.

Step 6: Apply the gradient descent mechanism to adjust weights w of SLFN. Use the obtained SLFN to calculate the squared residuals regarding all training data. Then, either one of the following two cases occurs:

(1) If the envelope of obtained SLFN does contain at least n observations, then go to Step 7.

(2) If the envelope of obtained SLFN does not contain at least n observations, then set 𝐰 = 𝐰̃ and apply the augmenting mechanism to add extra hidden nodes to obtain an acceptable SLFN estimate.

Step 7: Implement the pruning mechanism to delete all of the potentially irrelevant hidden nodes; n + 1 → n; go to Step 2.
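The following is a highly simplified sketch of the incremental loop in Table 2. It assumes a fixed number of hidden nodes and a plain gradient-descent weight-tuning step; the augmenting and pruning mechanisms of Huang et al. (2014) are omitted, and "inside the envelope" is interpreted as a squared residual below ε². All function and variable names are illustrative, not from the original paper.

```python
import numpy as np

def fit_step(X, y, w_H, w_o, lr=0.01, iters=200):
    """Plain gradient-descent weight tuning of a fixed-size SLFN (a stand-in
    for the weight-tuning mechanism; the augmenting and pruning mechanisms
    of Huang et al. (2014) are not reproduced here)."""
    for _ in range(iters):
        a = np.tanh(w_H[:, 0] + X @ w_H[:, 1:].T)            # hidden activations, shape (n, p)
        err = (w_o[0] + a @ w_o[1:]) - y                      # prediction errors, shape (n,)
        grad_o = np.concatenate(([err.sum()], a.T @ err)) / len(y)
        d_a = np.outer(err, w_o[1:]) * (1.0 - a ** 2)         # back-propagated signal, shape (n, p)
        grad_H = np.hstack([d_a.sum(0)[:, None], d_a.T @ X]) / len(y)
        w_o = w_o - lr * grad_o
        w_H = w_H - lr * grad_H
    return w_H, w_o

def envelope_training(X, y, p=3, k=0.05, eps=1.96):
    """Sketch of the incremental inclusion loop of Table 2 with a fixed number
    of hidden nodes p; 'inside the envelope' is read as squared residual < eps**2."""
    N, m = X.shape
    rng = np.random.default_rng(0)
    w_H = rng.normal(size=(p, m + 1)) * 0.1
    w_o = rng.normal(size=p + 1) * 0.1
    w_H, w_o = fit_step(X[:m + 1], y[:m + 1], w_H, w_o)      # Step 1: fit the first m+1 observations
    n = m + 2
    while n <= int(N * (1 - k)):                              # Step 2: stop once n > N*(1 - k)
        a = np.tanh(w_H[:, 0] + X @ w_H[:, 1:].T)
        sq_res = ((w_o[0] + a @ w_o[1:]) - y) ** 2            # Step 3.1: squared residuals of all N data
        ref = np.argsort(sq_res)[:n]                          # Step 3.2: the n smallest squared residuals
        if not np.all(sq_res[ref] < eps ** 2):                # Step 4: some reference obs. leave the envelope
            w_H, w_o = fit_step(X[ref], y[ref], w_H, w_o)     # Steps 5-6: re-tune weights (no augmenting here)
        n += 1                                                # Step 7 (pruning omitted): n + 1 -> n
    return w_H, w_o

# Illustrative run on synthetic data with a few injected outliers
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, size=(60, 2))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=60)
y[:3] += 5.0                                                  # three artificial outliers
w_H, w_o = envelope_training(X, y)
```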

Wu (2017) presented a mechanism that combines the moving window with the RLEM, adapted from the works of Tsaih and Cheng (2009) and Huang et al. (2014). In Table 3, M is the index of the current window, N is the sample size of the training block, B is the sample size of the testing block, σ is the standard deviation of the training data in the training block, and ε represents the maximal deviation between the actual and predicted outputs.

Table 3: The proposed mechanism implemented by Wu (2017)

Step 0: Set M as 1.

Step 1.1: Apply the RLEM stated in Table 2 (with envelope width = 2ε) to the N training examples {(x(M-1)B+1, y(M-1)B+1+h), (x(M-1)B+2, y(M-1)B+2+h), …, (x(M-1)B+N, y(M-1)B+N+h)} to filter out Nk potential outliers and obtain an acceptable SLFN.

Step 1.2: Remove the outlier candidates, and then use the SLFN obtained in Step 1.1 and the RLEM stated in Table 2 (with envelope width = 2ε) again to learn the remaining N(1-k) training examples.

Step 2: Apply the SLFN obtained in Step 1.2 to the B testing examples {(x(M-1)B+N+1, y(M-1)B+N+1+h), (x(M-1)B+N+2, y(M-1)B+N+2+h), …, (xMB+N, yMB+N+h)}.

Step 3: If there are more data, set M ← M+1 and go to Step 1.1; otherwise, STOP.
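The sketch below illustrates only the window-sliding logic of Table 3. So that the example runs on its own, the RLEM of Table 2 is replaced by a trivial least-squares stand-in, and the forecast shift +h in the response indices is omitted; the helper names and the rlem interface are assumptions, not Wu's (2017) implementation.

```python
import numpy as np

class LinearStandIn:
    """Trivial least-squares model used only so the sketch runs on its own;
    Wu (2017) uses the SLFN trained by the RLEM of Table 2 instead."""
    def fit(self, X, y):
        A = np.hstack([np.ones((len(X), 1)), X])
        self.coef_, *_ = np.linalg.lstsq(A, y, rcond=None)
        return self
    def predict(self, X):
        return np.hstack([np.ones((len(X), 1)), X]) @ self.coef_

def rlem_stand_in(X, y, k, model=None):
    """Stand-in for the RLEM of Table 2: fit, then flag the k share of
    observations with the largest squared residuals as potential outliers."""
    model = (LinearStandIn() if model is None else model).fit(X, y)
    sq_res = (model.predict(X) - y) ** 2
    cutoff = np.quantile(sq_res, 1.0 - k) if k > 0 else np.inf
    return model, sq_res > cutoff

def moving_window(X, y, N, B, rlem=rlem_stand_in, k=0.05):
    """Window-sliding logic of Table 3 (the forecast shift +h in the response
    indices is omitted)."""
    M, predictions = 1, []                                    # Step 0: M = 1
    while (M - 1) * B + N + B <= len(y):                      # stop when no full testing block remains
        lo, hi = (M - 1) * B, (M - 1) * B + N                 # training block of window M
        model, outliers = rlem(X[lo:hi], y[lo:hi], k)         # Step 1.1: filter out N*k potential outliers
        model, _ = rlem(X[lo:hi][~outliers], y[lo:hi][~outliers], 0.0, model)  # Step 1.2: relearn inliers
        predictions.append(model.predict(X[hi:hi + B]))       # Step 2: predict the B testing examples
        M += 1                                                # Step 3: slide the window forward
    return np.concatenate(predictions) if predictions else np.array([])

# Illustrative run on synthetic data
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))
y = X @ np.array([1.5, -0.5]) + 0.1 * rng.normal(size=200)
print(moving_window(X, y, N=80, B=20).shape)                  # six windows of 20 predictions each
```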
