Situation/action identification and case-based reasoning

Chapter 4. Knowledge Support based on Case-based Reasoning and Data Mining

4.2. Discovery of problem-solving knowledge

4.2.2. Situation/action identification and case-based reasoning

Each situation or action is a case that is characterized by a text description and a set of attribute values. The attribute values provide additional features that help identify the situation/action case, such as the symptoms of a situation or the standard operating procedures of an action. Both the text description and the attribute values contribute to similarity matching and situation/action identification. For historical problem-solving instances, similar situation/action cases are transformed into the same situation/action identifier to facilitate the mining of decision-making and dependency knowledge patterns. Moreover, for the target situation/action, namely, the case that workers are currently handling, the system identifies an existing case identifier or retrieves similar cases based on CBR. In the following, we describe the steps taken to transform existing cases and how to compute the similarity measures for case-based reasoning.

Extraction of identifying term vectors. The data stored in the Subject field of an existing case is a text description of the situation/action. For example, Subject: “FAB8D Cu-BSC DI Water flow capacity insufficient issue” describes the situation of insufficient water flow capacity. The terms extracted from the subject field are used to identify the situation/action. The terms are extracted using term transformation steps, including case folding, stemming, and stop word removal. We simply extract the terms without considering term frequency, since the subject field generally contains a short text description. The extracted terms form the identifying terms of a situation/action case. Moreover, the user needs to provide a text description for the target case, namely, the situation or action that he/she is handling. Similarly, the identifying terms of the target case are extracted from the text description using the term transformation steps. Let Tj be the set of identifying terms extracted from the subject field of a situation/action case Cj. An identifying term vector $\vec{C}_j$ is created to represent Cj. The weight of a term ti in $\vec{C}_j$ is defined by Equation 5.

$$ w(t_i, \vec{C}_j) = \begin{cases} 1 & \text{if } t_i \in T_j \\ 0 & \text{otherwise} \end{cases} \tag{5} $$

Similarity value by text description. Equation 6 defines the similarity value sim^T(Ck, Cj) of two situation/action cases Ck and Cj based on their text descriptions. The similarity value is derived by computing the cosine value of the identifying term vectors of Ck and Cj.

$$ sim^T(C_k, C_j) = \frac{\vec{C}_k \cdot \vec{C}_j}{\lVert \vec{C}_k \rVert \, \lVert \vec{C}_j \rVert} \tag{6} $$
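The term extraction and cosine computation can be illustrated with a minimal Python sketch. It assumes binary term weights (a term either is or is not an identifying term), in which case the cosine of two term vectors reduces to |Tk ∩ Tj| / sqrt(|Tk| · |Tj|). The stop-word list, the `crude_stem` helper, the function names, and the second subject line are hypothetical and serve only as illustration; a real system would likely use a proper stemmer such as Porter's.

```python
import math
import re

# Assumption: a tiny illustrative stop-word list, not the one used in the source.
STOP_WORDS = {"a", "an", "the", "of", "is", "are", "in", "on", "for", "and", "to", "with"}


def crude_stem(term: str) -> str:
    """Very rough suffix stripping; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def identifying_terms(subject: str) -> set[str]:
    """Case folding, tokenizing, stop-word removal, and stemming of a subject line."""
    tokens = re.findall(r"[a-z0-9]+", subject.lower())
    return {crude_stem(t) for t in tokens if t not in STOP_WORDS}


def text_similarity(terms_k: set[str], terms_j: set[str]) -> float:
    """Cosine of two binary term vectors, computed from the term sets."""
    if not terms_k or not terms_j:
        return 0.0
    return len(terms_k & terms_j) / math.sqrt(len(terms_k) * len(terms_j))


subject_k = "FAB8D Cu-BSC DI Water flow capacity insufficient issue"
subject_j = "FAB8D DI water flow capacity unstable issue"  # hypothetical second case
terms_k = identifying_terms(subject_k)
terms_j = identifying_terms(subject_j)
print(round(text_similarity(terms_k, terms_j), 3))
```

Because term frequency is ignored, representing each vector as a set keeps the computation simple and avoids building a shared vocabulary.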

Similarity value by attribute. An attribute value may be nominal, binary, or numeric. For numeric attributes, a data discretization process is conducted to transform their values into value ranges or user-defined concept terms (such as low, middle, or high). Equation 7 defines the similarity value sim^A(Ck(attrb_x), Cj(attrb_x)) of two situation/action cases Ck and Cj, derived according to their values of attribute x; value(Ck(attrb_x)) denotes the transformed value of attribute x of Ck, which is calculated by the discretization process.

$$ sim^A(C_k(attrb_x), C_j(attrb_x)) = \begin{cases} 1 & \text{if } value(C_k(attrb_x)) = value(C_j(attrb_x)) \\ 0 & \text{otherwise} \end{cases} \tag{7} $$
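As a rough illustration of the attribute comparison, the sketch below discretizes a numeric attribute into concept terms and then compares the transformed values. It assumes that Equation 7 scores an exact match of transformed values as 1 and anything else as 0. The cut points, the pressure readings, and the function names are hypothetical.

```python
def discretize(value: float, low_max: float, middle_max: float) -> str:
    """Map a numeric value to a concept term; the cut points are assumptions."""
    if value <= low_max:
        return "low"
    if value <= middle_max:
        return "middle"
    return "high"


def attribute_similarity(value_k, value_j) -> float:
    """Return 1.0 if the transformed attribute values match, 0.0 otherwise."""
    return 1.0 if value_k == value_j else 0.0


# Hypothetical water-pressure readings of two cases, with assumed cut points 3.0 and 6.0.
pressure_k = discretize(2.1, low_max=3.0, middle_max=6.0)
pressure_j = discretize(2.8, low_max=3.0, middle_max=6.0)
print(pressure_k, pressure_j, attribute_similarity(pressure_k, pressure_j))
```

Nominal and binary attributes need no discretization and can be passed to `attribute_similarity` directly.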

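Similarity function for case-based reasoning. Equation 8 defines the similarity function used to compute the similarity measure between two cases Ck and Cj. The similarity function is modified from Guardati (1998) by considering the cosine measure and attribute discretization.

$$ similarity(C_k, C_j) = w_T \, sim^T(C_k, C_j) + \sum_x w_x \, sim^A(C_k(attrb_x), C_j(attrb_x)) \tag{8} $$

Here sim^T(Ck, Cj) is the cosine similarity of the text descriptions, sim^A(Ck(attrb_x), Cj(attrb_x)) is the similarity by attribute x, wT is the weight factor for the text description, and wx is the weight given to attribute x. Note that wT and all wx sum to 1.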
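The weighting can be sketched in Python as follows, assuming the similarity function is a weighted sum of the text similarity and the per-attribute similarities, with weights that sum to 1. The function name, the attribute names, and the weight values are hypothetical.

```python
def case_similarity(text_sim: float,
                    attr_sims: dict[str, float],
                    w_text: float,
                    attr_weights: dict[str, float]) -> float:
    """Weighted sum of text similarity and per-attribute similarities."""
    total_weight = w_text + sum(attr_weights.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("w_text and the attribute weights must sum to 1")
    weighted_attrs = sum(attr_weights[x] * attr_sims[x] for x in attr_weights)
    return w_text * text_sim + weighted_attrs


# Hypothetical inputs: text similarity 0.5, matching pressure, different material.
score = case_similarity(
    text_sim=0.5,
    attr_sims={"pressure": 1.0, "material": 0.0},
    w_text=0.6,
    attr_weights={"pressure": 0.3, "material": 0.1},
)
print(round(score, 3))
```

Since every component lies in [0, 1] and the weights sum to 1, the combined score also lies in [0, 1], which makes a single threshold θ meaningful.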

Transforming existing cases. Similar cases are transformed into the same situation/action identifier to discover decision-making and dependency knowledge patterns. The similarity measures between cases are computed to identify cases with high similarity measures (i.e., similarity(Ck, Cj) > θ, where θ is a similarity threshold). Cases with the same or high similarity measures are transformed into the same situation/action identifier. The transformation procedure is conducted in an incremental and greedy manner. Assume that r situation identifiers have been created, and that one or more situation cases have been transformed into each identifier Si. Let Ck be the situation case that needs to be transformed into a situation identifier, and let minsim(Ck, Si) be the minimum similarity(Ck, Cj) over all Cj that have been transformed into Si. The procedure finds a situation identifier Sf such that minsim(Ck, Sf) is the maximum of minsim(Ck, Si) over all Si (for i = 1 to r). Ck is transformed into Sf if minsim(Ck, Sf) is greater than θ; otherwise, Ck is transformed into a new situation identifier. The transformation procedure for action cases is conducted in a similar way. Table 1 lists the situations and actions in each stage of the water supply problem-solving process.
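The incremental, greedy transformation procedure can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the `assign_identifiers` name, the integer identifiers, and the Jaccard stand-in used for the similarity function in the usage lines are assumptions.

```python
from typing import Callable, Sequence


def assign_identifiers(cases: Sequence,
                       similarity: Callable[[object, object], float],
                       theta: float) -> list[int]:
    """Assign each case to the identifier with the largest minsim, if it exceeds theta."""
    groups: list[list] = []   # groups[i] holds the cases transformed into identifier i
    labels: list[int] = []
    for case in cases:
        best_index, best_minsim = None, float("-inf")
        for i, members in enumerate(groups):
            # minsim: the lowest similarity between the case and any member of the group
            minsim = min(similarity(case, member) for member in members)
            if minsim > best_minsim:
                best_index, best_minsim = i, minsim
        if best_index is not None and best_minsim > theta:
            groups[best_index].append(case)
            labels.append(best_index)
        else:
            groups.append([case])          # create a new identifier
            labels.append(len(groups) - 1)
    return labels


def jaccard(a: set, b: set) -> float:
    """Toy similarity used only for this demonstration."""
    return len(a & b) / len(a | b)


cases = [{"water", "flow"}, {"water", "flow", "low"}, {"pipe", "broken"}, {"pipe", "broken"}]
labels = assign_identifiers(cases, jaccard, theta=0.5)
print(labels)
```

Using the minimum similarity to a group's members means a case joins an identifier only if it is similar to every case already assigned to it. The result depends on the order in which cases are processed, which is the usual trade-off of a greedy, incremental pass.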

Table 1: Situations/actions in the water supply problem-solving process

Water supply problem-solving process

Situations:
[S1] Flow Capacity Abnormal Issue (Subject: Insufficient/Unstable/Overflow)
[S2] Supply Quantity Abnormal Issue (Subject: Insufficient/Unstable/Overflow)
[S3] Power Supply Abnormal Issue (Subject: Insufficient/Unstable/Excess)
[S4] Water Pressure Abnormal Issue (Subject: Insufficient/Unstable/Excess)
[S5] Cleaning Quality Abnormal Issue (Subject: Low/Unstable)
[S6] Pipe Abnormal Issue (Subject: Broken/Clogged)
[S7] Controller Temperature Abnormal Issue (Subject: Excess/Unstable)
…

Actions:
[A1] Testing based on SOPs
[A2] Consult expert information
[A3] Modify the configuration
[A4] Recycle the material
[A5] Monitor the output
[A6] Discuss with workers
[A7] Report the outcome
…

Case-based reasoning for a target case. A target case is a situation or action that a worker is currently handling. After the worker enters a target case Ck of a situation/action, the system identifies an existing case identifier for Ck or, if Ck is a new case, retrieves similar situation/action cases.

The similarity measures between the target case and previous cases are computed using Equation 6. The identification procedure is similar to the transformation procedure. Assume there are r situation identifiers. Let minsim(Ck, Si) be the minimum similarity(Ck, Cj) over all Cj transformed into Si. The procedure finds a situation identifier Sf such that minsim(Ck, Sf) is the maximum of minsim(Ck, Si) over all Si (for i = 1 to r). An existing situation identifier Sf is identified if minsim(Ck, Sf) is greater than θ; otherwise, the situation is a new case and the system assigns a new identifier to it. The case and its identifier are then stored in the knowledge base, and CBR is initiated to retrieve similar cases based on their similarity measures and to suggest possible knowledge related to the similar cases.
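The identification and retrieval steps for a target case can be sketched as follows. This is a minimal illustration under stated assumptions: the knowledge base is a plain dictionary from identifier to cases, new identifiers are named by a simple counter, and the Jaccard function stands in for the similarity function. All names here are hypothetical.

```python
from typing import Callable


def identify_or_create(target, groups: dict, similarity: Callable, theta: float) -> str:
    """Return the identifier with the largest minsim above theta, else a new one."""
    best_id, best_minsim = None, float("-inf")
    for ident, members in groups.items():
        minsim = min(similarity(target, member) for member in members)
        if minsim > best_minsim:
            best_id, best_minsim = ident, minsim
    if best_id is None or best_minsim <= theta:
        best_id = f"S{len(groups) + 1}"    # assumed naming scheme for new identifiers
        groups[best_id] = []
    groups[best_id].append(target)         # store the case with its identifier
    return best_id


def retrieve_similar(target, groups: dict, similarity: Callable, top_n: int = 3) -> list:
    """Rank stored cases (excluding the target itself) by similarity to the target."""
    scored = [(similarity(target, case), ident, case)
              for ident, members in groups.items()
              for case in members if case is not target]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_n]


def jaccard(a: set, b: set) -> float:
    """Toy similarity used only for this demonstration."""
    return len(a & b) / len(a | b)


knowledge_base = {"S1": [{"water", "flow"}], "S2": [{"pipe", "broken"}]}
target = {"water", "flow", "low"}
known_id = identify_or_create(target, knowledge_base, jaccard, theta=0.5)
new_id = identify_or_create({"power", "excess"}, knowledge_base, jaccard, theta=0.5)
top = retrieve_similar(target, knowledge_base, jaccard, top_n=1)
print(known_id, new_id, top[0][1])
```

The retrieved cases, together with their identifiers, are what the system would use to look up knowledge related to the similar cases.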
