National Chengchi University (國立政治大學)

6. Discussions and implications

This study develops the dual approach as a DSS architecture for FFD. The approach is data-driven: the system is modeled directly from the sampled data. As shown in Figure 5, the system architecture based on the dual approach consists of four sequential phases. Their details and associated modules are explained phase by phase in section 3.1 to section 3.4.

Below, we summarize these four phases and the corresponding modules.

In the training phase, data preprocessing is first performed by the sampling module and the variable-selecting module. All samples, with their values of the selected variables, are then fed to the clustering module to generate two GHSOMs (i.e., the fraud samples generate FT and the non-fraud samples generate NFT).

The modeling phase consists of the statistic-gathering module, the rule-forming module, the feature-extracting module, and the pattern-extracting module. The first two modules use the statistics of the FT and NFT leaf nodes to form classification rules, which differ because they rest on different spatial hypotheses. The classification rules are then tuned separately and compete with each other; the winner becomes the dominant rule. The last two modules discover features (e.g., principal components) and patterns (e.g., fraud categories) in the FT leaf nodes. The features and patterns extracted for each FT leaf node are retrieved in the decision support phase, where they support FFD decisions.

The analyzing phase consists of the group-finding module and the classifying module. Based on the GHSOM clustering rule, each investigated sample is assigned to the leaf node it belongs to in FT and to the leaf node it belongs to in NFT, and these two leaf nodes are paired. The classifying module then uses the dominant classification rule obtained in the modeling phase to determine whether the investigated sample is fraudulent. If the sample is identified as fraud, the decision support phase is executed: the feature-retrieving module retrieves the features and patterns of the FT leaf node to which the sample belongs, and the decision-supporting module integrates them for the purpose of decision aid.
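The analyzing phase can be illustrated with a minimal Python sketch. It assumes a flattened view of each GHSOM in which every leaf node is summarized by a single weight vector, and it assumes that a rule flags a sample by comparing its distance to the central leaf node against a tuned threshold. The names `nearest_leaf` and `classify`, the rule fields, and all numeric values are hypothetical and are not taken from the system described in this study.

```python
import math

def nearest_leaf(sample, leaves):
    """Group-finding step: id of the leaf whose weight vector is closest to the sample."""
    return min(leaves, key=lambda leaf_id: math.dist(sample, leaves[leaf_id]))

def classify(sample, ft_leaves, nft_leaves, rules):
    """Classifying step: apply the dominant rule of the paired (FT, NFT) leaf nodes."""
    ft_id = nearest_leaf(sample, ft_leaves)
    nft_id = nearest_leaf(sample, nft_leaves)
    rule = rules[(ft_id, nft_id)]
    if rule["central"] == "non-fraud":
        # Assumed non-fraud-central rule: far from the NFT leaf node means fraud.
        is_fraud = math.dist(sample, nft_leaves[nft_id]) > rule["threshold"]
    else:
        # Assumed fraud-central rule: close to the FT leaf node means fraud.
        is_fraud = math.dist(sample, ft_leaves[ft_id]) <= rule["threshold"]
    return ft_id, nft_id, is_fraud

# Toy leaf nodes and rules (illustrative values only).
ft_leaves = {"F1": (5.0, 5.0), "F2": (9.0, 1.0)}
nft_leaves = {"N1": (1.0, 1.0), "N2": (4.0, 0.0)}
rules = {
    ("F1", "N1"): {"central": "non-fraud", "threshold": 2.0},
    ("F1", "N2"): {"central": "fraud", "threshold": 1.5},
    ("F2", "N1"): {"central": "non-fraud", "threshold": 2.0},
    ("F2", "N2"): {"central": "fraud", "threshold": 1.5},
}

result = classify((4.5, 4.0), ft_leaves, nft_leaves, rules)
print(result)
```

In a full implementation the group-finding step would traverse the GHSOM hierarchy layer by layer rather than scan a flat list of leaf nodes; the flat scan is used here only to keep the sketch short.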

The implications for decision support in FFD, the research implications, and the FFR managerial implications are given in the following subsections.

6.1 The decision support in FFD

The proposed system architecture yields a process for identifying interesting patterns that can facilitate FFD decision making. In addition, the dual approach can be integrated with other statistical, mathematical, artificial intelligence, or machine learning techniques to extract and identify useful information that contributes to the domain knowledge.

Ngai et al. (2010) conducted a comprehensive academic review of FFD and concluded that the data mining techniques of outlier detection and visualization have seen only limited use. In real-world FFD cases, the number of fraud cases is small relative to the normal majority, so detecting a fraud case may be regarded as recognizing an outlier among the healthy majority. As Agyemang et al. (2006) pointed out, outlier detection is a very complex task, akin to finding a needle in a haystack. Although we use pair-matching for sampling in the FFR case in Chapter 4, the proposed sampling module is not restricted to pair-matching, as shown in section 3.1. Since the dual approach is data-driven, it can be applied to outlier detection in FFD.

In the proposed dual approach based on the GHSOM, the fraud samples and the non-fraud samples are clustered separately, and the matched pairs of groups then help narrow the scope of focus. As a result, the classification rule developed from the associated spatial hypothesis (i.e., non-fraud-central or fraud-central) can identify the fraud samples (i.e., outliers) more accurately. Note that the classification rule based on a spatial hypothesis is developed through the proposed optimization technique for the corresponding discriminant boundary, in which decision makers can objectively set their weightings of type I and type II errors. The dominant classification rule is therefore flexible enough to be applied to other FFD application domains with different preferences regarding type I and type II errors.
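A minimal Python sketch of how such a weighted trade-off could drive the choice of a discriminant boundary is given here. It is not the optimization technique of this study: it assumes a single distance threshold around the central leaf node, a plain search over the observed distances, and the convention that a type I error is a non-fraud sample flagged as fraud while a type II error is a fraud sample that is missed. The function name `tune_threshold` and the toy data are hypothetical.

```python
def tune_threshold(distances, labels, w_type1=1.0, w_type2=1.0):
    """Return (threshold, cost) minimizing the weighted sum of error rates.

    distances: distance of each sample to the central leaf node.
    labels:    1 for fraud, 0 for non-fraud.
    A sample is predicted as fraud when its distance exceeds the threshold.
    """
    n_fraud = sum(labels)
    n_nonfraud = len(labels) - n_fraud
    if n_fraud == 0 or n_nonfraud == 0:
        raise ValueError("both classes must be present")
    # Candidate thresholds: every observed distance, plus one below the minimum
    # so that "flag every sample" is also considered.
    candidates = [min(distances) - 1.0] + sorted(set(distances))
    best = None
    for t in candidates:
        false_pos = sum(1 for d, y in zip(distances, labels) if y == 0 and d > t)
        false_neg = sum(1 for d, y in zip(distances, labels) if y == 1 and d <= t)
        cost = w_type1 * false_pos / n_nonfraud + w_type2 * false_neg / n_fraud
        if best is None or cost < best[1]:
            best = (t, cost)
    return best

# Toy data: three non-fraud samples near the center, three fraud samples.
distances = [0.1, 0.2, 0.3, 0.9, 1.0, 0.25]
labels = [0, 0, 0, 1, 1, 1]

miss_averse = tune_threshold(distances, labels, w_type1=1.0, w_type2=5.0)
alarm_averse = tune_threshold(distances, labels, w_type1=5.0, w_type2=1.0)
print(miss_averse[0], alarm_averse[0])
```

Raising the weight of type II errors pulls the boundary inward so that fewer fraud samples are missed, while raising the weight of type I errors pushes it outward so that fewer non-fraud samples are flagged.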

Providing fraud-related patterns for a suspected sample can also contribute to FFD decision making. The feature-extracting module and the pattern-extracting module can be applied to other financial fraud scenarios (e.g., bank fraud, insurance fraud) and to financial crisis scenarios (e.g., bankruptcy, stock market crashes). When applied to such scenarios, the feature-extracting module changes the input variables according to the problem domain, and the pattern-extracting module adjusts the definition of fraud categories (or crisis categories) to develop the pattern map of FT.

Such reference information can enhance the quality of decision support by pinpointing the risk areas that require attention (i.e., the variables in the principal components and the fraud categories). It thereby helps reduce the likelihood of issuing doubtful loan-related decisions and helps provide sufficient information for decision support.

Implementing the proposed DSS architecture for a specific FFD-related application domain is worthwhile future work. The implemented DSS may contain an additional data-importing module and a visualization module. The visualization module would visualize the identification results for a given investigated sample and provide an overall view of the FT (i.e., the pattern map), in which the fraud categories and the principal components of each leaf node can be selected for display on the diagram. The decision support module can be extended to include the results of other feature-extracting mechanisms (such as statistical and data mining approaches). A voting mechanism would then integrate all the obtained features so that decision makers receive balanced and well-founded decision support.
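One simple form such a voting mechanism could take is sketched below in Python. The sketch assumes that each feature-extracting mechanism proposes a list of variable names and that a feature is kept when at least a given number of mechanisms propose it. The function `vote_on_features`, the mechanism names, and the variable names are all hypothetical.

```python
from collections import Counter

def vote_on_features(feature_sets, min_votes=2):
    """Keep the features proposed by at least `min_votes` extraction mechanisms."""
    votes = Counter(
        feature
        for features in feature_sets.values()
        for feature in set(features)  # each mechanism votes once per feature
    )
    return sorted(f for f, n in votes.items() if n >= min_votes)

# Hypothetical proposals from three feature-extracting mechanisms.
proposals = {
    "pca": ["debt_ratio", "roa", "inventory_turnover"],
    "statistical": ["debt_ratio", "current_ratio"],
    "data_mining": ["roa", "debt_ratio"],
}

selected = vote_on_features(proposals)
print(selected)
```

A weighted vote, in which mechanisms that performed better on validation data count more, would be a natural variation of the same idea.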

6.2 The research implications

This study exploits the advantages of the GHSOM and pioneers a novel dual approach for constructing a DSS architecture for FFD. The proposed DSS architecture is data-driven and can adapt to any FFD scenario with two basic groups, fraud and non-fraud (unhealthy and healthy), in which the fraud group can be divided into subcategories that represent distinctive fraud patterns. The designed modules and processes are described and evaluated phase by phase. The methods within several modules (the sampling, variable-selecting, clustering, feature-extracting, and pattern-extracting modules) can be replaced with other similar methods, which makes the proposed DSS architecture more generalizable for real-world use.

The experimental results show that an implementation of the DSS architecture based on the proposed dual approach can support FFD decisions by providing an alternative way of investigating financial data, which includes dual clustering by the GHSOM and the development of adaptive classifiers for each pair of subgroups (i.e., leaf nodes).

The implementation of the proposed DSS architecture can not only identify the fraud cases, but also provide the extracted features and patterns for reference.

Furthermore, the clustering results in FT yield a larger number of subgroups, and more fraud-related information within each subgroup, than the dichotomous detection results generally provided by conventional FFD studies. A comprehensive exploration of the relationships between different subgroups therefore becomes both possible and worthwhile. In addition, the GHSOM of the proposed system architecture accommodates a changing sample size (i.e., it is data-driven), since the GHSOM is re-developed accordingly, and the feature-extracting module and the pattern-extracting module can provide the corresponding characteristics (e.g., the inherent variable features and the fraud patterns) as indications of the fraud potential of the investigated samples.

Unlike traditional GHSOM studies, which cluster all training samples at once, the clustering module of the proposed approach separates the training samples into a fraud group and a non-fraud group to generate two GHSOMs. The idea behind this design is to improve the unsupervised learning mechanism by exploiting the spatial relationship between a pair of leaf nodes from these two GHSOMs. That is, an adaptive classification rule is developed for each pair of leaf nodes based on this spatial relationship. The discriminant boundary can be tuned through the proposed optimization method, in which the weightings of type I and type II errors are adjustable according to the decision makers’ preference, so that the analyzing phase delivers a more acceptable classification performance for a given application domain.

For each leaf node of FT, the feature extraction mechanism extracts the fraud categories from the exogenous information and the principal components from the input variables. Therefore, for any sample clustered into a leaf node of FT, the corresponding principal components and fraud categories can be used to represent the associated fraud regularities. These fraud regularities can serve as a pre-warning signal and can reveal the associated potential fraud activities, which helps monitor the suspected sample. Furthermore, the pattern-extracting module needs a definition of domain categories from authoritative references. The pattern-extracting module can be implemented either by consulting domain experts or by applying text mining techniques.
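How fraud categories might be aggregated per FT leaf node can be sketched in a few lines of Python. The sketch assumes that the leaf node of each fraud training sample is already known and that each sample has been labeled with zero or more fraud categories (by experts or by text mining). The function `build_pattern_map`, the sample identifiers, and the category names are hypothetical.

```python
from collections import Counter, defaultdict

def build_pattern_map(leaf_assignments, fraud_categories):
    """Map each FT leaf node to the share of its samples showing each fraud category."""
    category_counts = defaultdict(Counter)
    leaf_sizes = Counter(leaf_assignments.values())
    for sample_id, leaf_id in leaf_assignments.items():
        # A sample may carry several categories; count each once per sample.
        category_counts[leaf_id].update(set(fraud_categories.get(sample_id, [])))
    return {
        leaf_id: {cat: n / leaf_sizes[leaf_id] for cat, n in counts.items()}
        for leaf_id, counts in category_counts.items()
    }

# Hypothetical fraud training samples and their labeled categories.
leaf_assignments = {"A": "F1", "B": "F1", "C": "F2"}
fraud_categories = {
    "A": ["revenue recognition"],
    "B": ["revenue recognition", "asset overstatement"],
    "C": ["related-party transaction"],
}

pattern_map = build_pattern_map(leaf_assignments, fraud_categories)
print(pattern_map)
```

For an investigated sample assigned to a given FT leaf node, the categories with the highest shares in that leaf node would be the first candidates to report as its likely fraud patterns.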

The theoretical meaning of the spatial relationship is an interesting topic that merits deeper analysis. The spatial hypotheses (or beliefs) of this study are that, for a pair of leaf nodes from FT and NFT, either the associated fraud samples tend to be located around their non-fraud counterparts, or the associated non-fraud samples tend to be located around their fraud counterparts. In the modeling phase of the dual approach, the spatial relationships between the fraud samples and the non-fraud samples of the paired subgroups are identified and then used to develop the associated classification rule, which is the dominant one of two candidate classification rules (the non-fraud-central rule and the fraud-central rule) derived from the two spatial hypotheses.

The dominance of the non-fraud-central rule implies that most fraud samples cluster around their non-fraud counterparts, and the dominance of the fraud-central rule implies that most non-fraud samples cluster around their fraud counterparts. If one of these two spatial regularities fits the sample data, the corresponding classification rule can provide superior classification performance; moreover, the spatial relationship between fraud and non-fraud samples can provide valuable insights for FFD domain experts.

The above-mentioned implications highlight the advantages of the outcome of the proposed dual approach. That is, the rich information associated with the outcome can augment conventional dichotomous detection for decision aid.

6.3 The FFR managerial implications

In contrast with prior FFR studies, which focus on finding the significant input variables regarding FFR and provide a dichotomous prediction result without further explanation, this study has shown that the proposed dual approach can help not only identify FFR but also interpret the FFR behaviors of samples.

The proposed approach involves a feature inspection of the fraud training samples, and the accumulated understanding of FFR helps creditors and capital providers evaluate the integrity of financial statements when making investment or credit decisions. This understanding also facilitates the development of credit risk evaluation models for internal use. In addition, auditors can incorporate the feature results into their audit plans to ensure that their firms or clients remain competitive.

Regarding the implications for forensic accounting, the retrieved information, including the FFR fraud categories and the principal components, can help forensic accountants by providing the common features of similar samples belonging to the same leaf node, and can help them perform extended procedures as part of the statutory audit.

Furthermore, the clustering results of the proposed approach can help produce a list of companies whose financial statements should be reviewed for integrity, and the regulatory bodies of publicly traded companies (e.g., the Securities and Exchange Commission) can use such information to set up FFR prevention strategies.
