Context-based situation identification and case-based reasoning

Chapter 4. Knowledge Support based on Case-based Reasoning and Data Mining

5.2. Discovery of context-based problem-solving knowledge

5.2.1. Context-based situation identification and case-based reasoning

features and a set of attribute values. The attribute values provide important information, such as the symptoms of a situation, that helps identify the situation case. Situation features are derived from previous problem situations and actions and can be predefined in the system. They may be collected at run time by the system or selected by the user. For undefined situation cases, users need to provide a text description of the situation, from which identifying terms are extracted. Moreover, the situation features collected by the system are usually partial and incomplete, so context-based inference can be initiated to infer additional situation features. The text descriptions, situation features, and attribute values all contribute to similarity matching and situation identification. For the target situation/action, namely the case that workers are currently handling, the system either identifies an existing case identifier or retrieves similar cases based on CBR.

Extraction of identifying term vectors. The data stored in the Subject field of an existing case is a text description of the situation. For example, Subject: “FAB8D Cu-BSC DI Water flow capacity insufficient issue” describes the situation of insufficient water flow capacity. The terms extracted from the subject field are used to identify the situation and its attributes, e.g., situation name: insufficient water flow capacity; factory name: FAB8; department identification: D; system type: DI; system status: water flow capacity insufficient. The relevant context entities and features include staff: Annie; role: DG; time: 20040502-PM; location: Hsinchu; service name: DI water supply service, etc. Note that the terms are extracted using term transformation steps, including case folding, stemming, and stop word removal. We simply extract the terms without considering term frequency, since the subject field generally contains only a short text description. The extracted terms form the identifying terms of a situation case. Moreover, the user needs to provide a text description of the target case, namely the situation or action that he/she is handling. Similarly, the identifying terms of the target case are extracted from the text description using the same term transformation steps. Let Tj be the set of identifying terms extracted from the subject field of a situation case Cj. An identifying term vector $\vec{C}_j$ is created to represent Cj. The weight of a term ti in $\vec{C}_j$ is defined by Equation 5. Equation 6 defines the similarity value sim^T(Ck, Cj) of two situation cases Ck and Cj based on their text descriptions. The similarity value is derived by computing the cosine value of the identifying term vectors of Ck and Cj.
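The term extraction and cosine matching described above can be sketched as follows. This is a minimal illustration, not the system's implementation: the stop-word list, the toy `stem` function, and the function names are hypothetical, and the binary term weight (1 if the term is present, 0 otherwise) is an assumption suggested by the remark that term frequency is ignored.

```python
import math
import re

# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "on", "for", "is", "to"}


def stem(token):
    """Toy suffix stripper standing in for a real stemmer (assumption)."""
    for suffix in ("ing", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def identifying_terms(text):
    """Case folding, stop word removal, and stemming; returns a set of terms."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {stem(t) for t in tokens if t not in STOP_WORDS}


def sim_t(terms_k, terms_j):
    """Cosine of two binary term vectors: |Tk & Tj| / sqrt(|Tk| * |Tj|)."""
    if not terms_k or not terms_j:
        return 0.0
    return len(terms_k & terms_j) / math.sqrt(len(terms_k) * len(terms_j))


existing = identifying_terms("FAB8D Cu-BSC DI Water flow capacity insufficient issue")
target = identifying_terms("DI water flow insufficient")
print(sim_t(target, existing))
```

With binary weights the cosine reduces to the size of the shared term set divided by the geometric mean of the two set sizes, so no explicit vectors need to be built.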

Similarity value by attribute. An attribute value may be nominal, binary, or numeric. For numeric attributes, a data discretization process is conducted to transform their values into value ranges or user-defined concept terms (such as low, middle, or high). Equation 7 defines the similarity value sim^A(Ck(attrbx), Cj(attrbx)) of two situation cases Ck and Cj, derived from their values of attribute x; value(Ck(attrbx)) denotes the transformed value of attribute x of Ck, which is produced by the discretization process.
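A minimal sketch of the discretization step and an attribute-level comparison is given below. Equation 7 is not reproduced in this section, so the exact-match rule (1 if the transformed values are equal, 0 otherwise) is an assumption, and the cut points and labels are hypothetical.

```python
# Hypothetical cut points: values below 30 are "low", below 70 "middle", else "high".
CUT_POINTS = [(30.0, "low"), (70.0, "middle")]
TOP_LABEL = "high"


def discretize(value, cut_points=CUT_POINTS, top_label=TOP_LABEL):
    """Map a numeric value to the label of the first range whose upper bound exceeds it."""
    for upper, label in cut_points:
        if value < upper:
            return label
    return top_label


def sim_a(value_k, value_j):
    """Assumed attribute similarity: 1.0 for equal transformed values, else 0.0."""
    return 1.0 if value_k == value_j else 0.0


# Two numeric readings that fall in the same range are treated as matching.
print(sim_a(discretize(25.0), discretize(10.0)))
print(sim_a(discretize(25.0), discretize(90.0)))
```

Nominal and binary attributes can be passed to `sim_a` directly, since they need no transformation.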

Context modeling. Context information is any information about the status of an entity. An entity can be a user, a physical location, a service, a service-relevant object, etc. Because context information is so varied, it is difficult to represent the complete context information of an entity. Therefore, based on the problem-solving environment, this work uses a modeling mechanism composed of three levels to formalize the context information: the context entity level, the context feature level, and the context association level.

• Context entity level. This level represents the conceptual abstraction of the context entities spread across a problem-solving environment, including physical, organization, process, staff, service, and document entities, etc.

• Context feature level. A context feature, which may be predefined by a domain expert, shows relevant information about a specific entity. A context entity may include one or more context features. For example, a physical entity covers the identification, time, and location features; an organization entity may include the factory and department features; a process entity contains stage, task, and status features; a staff entity has user, role, degree, and activity features; a service entity may involve system, component, and parameter features; a document entity includes original, type, author, and score features, etc.

• Context association level. This level defines the association relationships between relevant features and attributes of the context entities. These relationships are used to collect more relevant information about the current problem-solving process based on context characteristics. Some predefined association types are listed as follows.

  ◦ The organization-staff association describes the relationship between the organization and staff entities, e.g., Annie belongs to the DG role in the B department of the Fab8 factory.

  ◦ The staff-process association describes the relationship in which a user-role carries out a specific process, e.g., DG-Annie carries out the water supply problem-solving process.

  ◦ The staff-service association describes the relationship in which a user-role uses a specific system service, e.g., DG-Annie uses the DI water supply system service.

  ◦ The process-service association shows the relationship between the process and service entities, e.g., the water supply process contains the DI water supply and pipe control system services.

  ◦ The process-document association describes the relationship in which some documents support a specific process, e.g., expert or experiential reports on a specific situation.

  ◦ The service-document association shows the relationship in which some documents belong to a specific service, e.g., the user guide or technical documents of a specific system service.

Based on context modeling, the system proactively collects the relevant context entities and features of the current situation. For example, when staff member Annie encounters the controller temperature abnormal situation, the relevant entities include physical time, location, organization, Annie, the water supply problem-solving process, the DI water supply system, relevant knowledge documents, etc. The system also gathers the relevant features of the context entities in a controller temperature abnormal situation, such as physical time: 20040502-PM 3:24; location: Hsinchu; factory: Fab8; department: B; user-role: DG-Annie; process: water supply problem-solving process; stage: normal management stage; situation: controller temperature abnormal situation; service: DI water supply system service; document: AF0001C0F25; author: PTC; score: 4; original: DIFF knowledge base, etc. The collected context entities and features of a specific situation are stored in the enterprise knowledge base for context-based inference rule discovery. Context entities and situation/action features are represented in a meta-rule format predefined by an expert. The proposed system applies constraint-based association rule mining to discover the context-based inference rules from the problem-solving log.

Context-based inference rule mining. The context-based inference rules discovered by association rule mining represent the associations between situation features and context characteristics. The rule format is shown in Equation 10:

[feature_p … and context_q …] → [feature_r] [Support = s%, Confidence = c%] (10)

For example, for the controller temperature abnormal situation, the features of the staff entity “Annie” and the service entity “DI water supply system service” are associated with the feature “Parameter incorrect” of the DI water supply system service entity. The context-based inference rule is shown as follows.

[Staff(Annie) and DI water supply system service()] → [DI water supply system service(Parameter: incorrect)] [Support = 2%, Confidence = 13%]
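The support and confidence attached to such a rule can be illustrated with a short sketch. It only evaluates one candidate rule against a log and does not implement the constraint-based mining itself; the five-record log and the item strings are made up for illustration and do not come from the problem-solving log described here.

```python
def support(log, items):
    """Fraction of log records that contain every item in `items`."""
    return sum(1 for record in log if items <= record) / len(log)


def confidence(log, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    s_antecedent = support(log, antecedent)
    if s_antecedent == 0:
        return 0.0
    return support(log, antecedent | consequent) / s_antecedent


# Hypothetical toy log: each record is the set of features observed together.
toy_log = [
    frozenset({"Staff(Annie)", "Service(DI)", "Parameter(incorrect)"}),
    frozenset({"Staff(Annie)", "Service(DI)"}),
    frozenset({"Staff(Annie)", "Service(Pipe)"}),
    frozenset({"Staff(PTC)", "Service(DI)", "Parameter(incorrect)"}),
    frozenset({"Staff(PTC)", "Service(Pipe)"}),
]

antecedent = frozenset({"Staff(Annie)", "Service(DI)"})
consequent = frozenset({"Parameter(incorrect)"})

rule_support = support(toy_log, antecedent | consequent)
rule_confidence = confidence(toy_log, antecedent, consequent)
print(rule_support, rule_confidence)
```

In the toy log the rule holds in one of five records, and in one of the two records that match the antecedent.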

For a specific situation, the collected context entities and features are used to discover relevant actions. The format of this context-based inference rule is given in Equation 11:

[feature_p … and context_q …] → [Action_r] [Support = s%, Confidence = c%] (11)

For example, for the controller temperature abnormal situation, the features of the staff entity “Annie” and the service entity “DI water supply system service” are associated with the action “Reporting the outcome”. The context-based inference rule is shown as follows.

[Staff(Annie) and DI water supply system service()] → [Reporting the outcome action()] [Support = 2%, Confidence = 13%]

For a specific problem-solving process, the context entities and features collected for specific situations and actions are used to discover further relevant features. Equation 12 shows the format of a context-based inference rule that infers a relevant action feature for a specific situation; Equation 13 shows the format of a rule that infers a relevant situation feature for a specific action:

[feature_p … and context_q …]_Si* … and [feature_u … and context_v …]_Aj* → [feature_r]_Ak [Support = s%, Confidence = c%] (12)

[feature_p … and context_q …]_Si* … and [feature_u … and context_v …]_Aj* → [feature_r]_Sk [Support = s%, Confidence = c%] (13)

The examples are illustrated as follows. The feature “Annie” of the context entity Staff in the controller temperature abnormal situation of the Normal Management stage and the feature “PTC” of the context entity Staff in the consulting with the expert action of the Engineering Improvement stage are associated with the feature “Parameter: increasing water pressure” of the DI water supply system service entity in the modifying the configuration action of the Exception Management stage. The context-based inference rule is shown as follows, where NM, EI, and EM abbreviate the three stages and S and A denote situations and actions.

[Staff(Annie)]NM_S7 and [Staff(PTC)]EI_A2 → [DI water supply system service(Parameter: increasing water pressure)]EM_A1 [Support = 1%, Confidence = 14%]

The feature “Parameter: output value” of the context entity DI water supply system service in the monitoring the output action of the Normal Management stage and the feature “A9600400762” of the context entity Document in the testing based on the SOP action of the Engineering Improvement stage are associated with the feature “Parameter: water quantity” of the DI water supply system service entity in the supply quantity abnormal situation of the Exception Management stage. The context-based inference rule is shown as follows.

[DI water supply system service(Parameter: output value)]NM_A5 and [Document(A9600400762)]EI_A1 → [DI water supply system service(Parameter: water quantity)]EM_S2 [Support = 3%, Confidence = 11%]

Certainty Factor value of context-based inference rule. The certainty degree of a situation feature collected by the system is set to 1. For inferred situation features, this work employs the Certainty Factor (CF) method (Shortliffe et al., 1975) to derive the certainty degree during inference, as defined in Equation 3. The preceding set denotes the run-time situation features and context characteristics; the succeeding set is the situation feature whose certainty degree we want to infer. For example, the CF value of the context-based inference rule [Staff(Annie)] → [DI water supply system service(Parameter: incorrect)] is 0.033. The details of the calculation are shown as follows.

[Staff(Annie)] → [DI water supply system service(Parameter: incorrect)] [Support = 2%, Confidence = 13%]

S([DI water supply system service(Parameter: incorrect)]) = 10%

CF([Staff(Annie)] → [DI water supply system service(Parameter: incorrect)]) = (13% - 10%) / (1 - 10%) = 0.033
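The calculation above can be reproduced with a small function. Equation 3 is not shown in this section, so only the branch used in the worked example (confidence greater than the support of the succeeding set) is confirmed by the text; the negative branch follows a common formulation of certainty factors for association rules and is an assumption here. The function and parameter names are illustrative.

```python
def certainty_factor(confidence, consequent_support):
    """Certainty factor of a rule from its confidence and the consequent's support.

    Positive branch: matches the worked example in the text.
    Negative branch: assumed, not confirmed by this section.
    """
    if confidence > consequent_support:
        return (confidence - consequent_support) / (1.0 - consequent_support)
    if confidence < consequent_support:
        return (confidence - consequent_support) / consequent_support
    return 0.0


# Worked example from the text: Confidence = 13%, S(consequent) = 10%.
cf = certainty_factor(0.13, 0.10)
print(round(cf, 3))
```

A positive value means that observing the preceding set raises the certainty of the succeeding feature above its baseline frequency in the log.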

Inference for situation features. Based on the CF values of the situation features and the context-based inference rules, the inference process follows the rules defined in Equation 4. An example is illustrated in Fig. 11, and the details of the inference process are as follows. The context-based inference rule [Role(DG)] → [Staff(Annie)] indicates that the feature DG of the context entity Role infers the feature Annie of the context entity Staff; its CF value is 0.7. The two context entities [Staff(Annie)] and [DI water supply system service()] have an “AND” relationship, whose output CF value is 0.5, the minimum of the two input CF values. The CF value obtained from [Staff(Annie) and DI water supply system service()] → [DI water supply system parameter(Incorrect)] is 0.3, and the CF value obtained from [Pipe system service()] → [DI water supply system parameter(Incorrect)] is 0.2. Finally, the two inference conditions have a “JOIN” relationship, so the CF value of [DI water supply system parameter(Incorrect)] is 0.3, the maximum of the two. Inferred situation features with highly ranked CF values are considered the inferred knowledge that assists CBR in identifying the situation encountered.

Fig. 11: An example of inference process

CF(Annie) = CF(DG) * CF(IF DG THEN Annie) = 1.0 * 0.7 = 0.7

CF(DI) = CF(Service) * CF(IF Service THEN DI) = 1.0 * 0.5 = 0.5

CF(Annie ∧ DI) = MIN(CF(Annie), CF(DI)) = 0.5

CF(Incorrect) = MAX(CF(Annie ∧ DI) * CF(IF Annie ∧ DI THEN Incorrect), CF(Pipe) * CF(IF Pipe THEN Incorrect)) = MAX(0.5 * 0.6, 1.0 * 0.2) = 0.3
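The computation in Fig. 11 can be traced step by step with the sketch below. The three combination rules (multiply along a rule, take the minimum for “AND”, take the maximum for “JOIN”) are read off the figure; the function names are illustrative and not taken from the source.

```python
def propagate(cf_premise, cf_rule):
    """CF of a conclusion: CF of the premise times the CF of the rule."""
    return cf_premise * cf_rule


def cf_and(*cfs):
    """'AND' relationship: the weakest premise bounds the conjunction."""
    return min(cfs)


def cf_join(*cfs):
    """'JOIN' relationship: keep the strongest of several inference paths."""
    return max(cfs)


# Values from Fig. 11; system-collected features have CF = 1.0.
cf_annie = propagate(1.0, 0.7)             # Role(DG) -> Staff(Annie)
cf_di = propagate(1.0, 0.5)                # Service -> DI water supply system service
cf_annie_and_di = cf_and(cf_annie, cf_di)  # 'AND' of the two entities
cf_incorrect = cf_join(
    propagate(cf_annie_and_di, 0.6),       # Annie AND DI -> Incorrect
    propagate(1.0, 0.2),                   # Pipe -> Incorrect
)
print(cf_annie, cf_di, cf_annie_and_di, cf_incorrect)
```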

The inferred situation features are then used by CBR for situation identification. Let Fj be the set of situation features of Cj that are collected by the system or inferred by the context-based inference rules. A feature vector $\vec{C}^{F}_j$ is created to represent Cj. The weight of a feature fi in $\vec{C}^{F}_j$ is defined by Equation 14. Equation 15 defines the similarity value sim^F(Ck, Cj) of two situation cases Ck and Cj based on their situation features. The similarity value is derived by computing the cosine value of the feature vectors of Ck and Cj.

Similarity function for case-based reasoning. Equation 16 defines the similarity function used to compute the similarity measure between two cases Ck and Cj. The similarity function is modified from Guardati (1998) by considering the combination of the similarity of text descriptions, attribute values and situation features.

$$\mathrm{similarity}(C_k, C_j) = w_T \, sim^T(C_k, C_j) + w_F \, sim^F(C_k, C_j) + \sum_{x} w_x \, sim^A(C_k(attrb_x), C_j(attrb_x)) \qquad (16)$$

where sim^T(Ck, Cj) is the similarity value derived from the identifying term vectors of Ck and Cj; sim^F(Ck, Cj) is the similarity value derived from the situation features of Ck and Cj; sim^A(Ck(attrbx), Cj(attrbx)) is the similarity value obtained from the values of attribute x; wT is the weight factor for the text description; wF is the weight factor for the situation features; and wx is the weight given to attribute x. Note that wT, wF, and all wx sum to 1.
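A minimal sketch of the combined measure follows, assuming that Equation 16 is a weighted sum of the three kinds of similarity, as the description of the weights suggests. The attribute names and the numeric values in the usage example are hypothetical.

```python
def similarity(sim_t, sim_f, attr_sims, w_t, w_f, attr_weights):
    """Weighted sum of text, feature, and per-attribute similarity values.

    attr_sims and attr_weights are dicts keyed by attribute name.
    Raises ValueError if the weights do not sum to 1.
    """
    total_weight = w_t + w_f + sum(attr_weights.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError("w_t, w_f and all attribute weights must sum to 1")
    attr_part = sum(attr_weights[x] * attr_sims[x] for x in attr_weights)
    return w_t * sim_t + w_f * sim_f + attr_part


# Hypothetical values: two attributes, each given weight 0.1.
score = similarity(
    sim_t=0.8,
    sim_f=0.5,
    attr_sims={"factory": 1.0, "department": 0.0},
    w_t=0.4,
    w_f=0.4,
    attr_weights={"factory": 0.1, "department": 0.1},
)
print(score)
```

Because every component lies in [0, 1] and the weights sum to 1, the combined value also lies in [0, 1], which keeps it comparable with a fixed threshold.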

Case-based reasoning for a target case. A target case is a situation that a worker is currently handling. After a target case Ck of a situation is entered, the system identifies an existing case identifier for Ck or, if Ck is a new case, retrieves similar situation cases. The similarity measures between the target case and previous cases are computed using Equation 16. The identification procedure is similar to the transformation procedure. Assume there are r situation identifiers. Let minsim(Ck, Si) be the minimum of similarity(Ck, Cj) over all cases Cj that have been transformed into Si. The procedure finds a situation identifier Sf such that minsim(Ck, Sf) is the maximum of minsim(Ck, Si) over all Si (for i = 1 to r). The existing situation identifier Sf is assigned to Ck if minsim(Ck, Sf) is greater than the threshold θ; otherwise, the situation is a new case and the system assigns a new identifier to it. The case and its identifier are then stored in the knowledge base, and CBR is initiated to retrieve similar cases based on their similarity measures and to suggest possible knowledge related to those cases.
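The identification step can be sketched as a max-min search over the situation identifiers. The function below takes the similarity function as a parameter; the toy cases (plain numbers), the toy similarity measure, and the threshold values are hypothetical and serve only to make the sketch runnable.

```python
def identify(target, cases_by_identifier, similarity, theta):
    """Return the identifier Sf that maximizes minsim(target, Si), or None for a new case.

    minsim(target, Si) is the smallest similarity between the target and any
    case already assigned to Si. Sf is accepted only if its minsim exceeds theta.
    """
    best_id, best_minsim = None, float("-inf")
    for identifier, cases in cases_by_identifier.items():
        if not cases:
            continue  # an identifier without cases cannot be matched
        minsim = min(similarity(target, case) for case in cases)
        if minsim > best_minsim:
            best_id, best_minsim = identifier, minsim
    return best_id if best_minsim > theta else None


def toy_similarity(a, b):
    """Stand-in for the combined case similarity; cases are numbers in [0, 1]."""
    return 1.0 - abs(a - b)


known = {"S1": [0.4, 0.6], "S2": [0.5, 0.0]}
print(identify(0.5, known, toy_similarity, theta=0.8))   # S1
print(identify(0.5, known, toy_similarity, theta=0.95))  # None
```

Using the minimum within each identifier means that a target case must resemble every case already grouped under that identifier, not just its closest member, before the identifier is reused.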

Knowledge Support for Problem Solving in Documents (pp. 41-49)