(Artifact Anomaly Table)

Let AAT_w be the artifact anomaly table for an LRTS workflow w.

∀aar ∈ AAT_w, aar = (op, type, SRC), where

aar.op indicates the abnormal artifact operation,

aar.type ∈ {Useless Definition, Null Kill, Undefined Usage, Ambiguous Usage} indicates the anomaly type, and

aar.SRC represents the set of operations leading to the anomaly.
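As an illustration only, the record structure defined above can be sketched in Python. The class and field names (`AnomalyType`, `AnomalyRecord`, `src`) and the use of string identifiers for operations are assumptions made for this sketch; they are not part of the original definition.

```python
from dataclasses import dataclass
from enum import Enum


class AnomalyType(Enum):
    """The four anomaly types allowed in aar.type."""
    USELESS_DEFINITION = "Useless Definition"
    NULL_KILL = "Null Kill"
    UNDEFINED_USAGE = "Undefined Usage"
    AMBIGUOUS_USAGE = "Ambiguous Usage"


@dataclass(frozen=True)
class AnomalyRecord:
    """One record aar = (op, type, SRC) of the artifact anomaly table."""
    op: str             # the abnormal artifact operation (assumed: a string id)
    type: AnomalyType   # the anomaly type
    src: frozenset      # the set of operations leading to the anomaly


# AAT_w is modelled as a set, so an identical record is stored only once.
aat_w = set()
aat_w.add(AnomalyRecord("op1", AnomalyType.UNDEFINED_USAGE, frozenset({"s"})))
aat_w.add(AnomalyRecord("op1", AnomalyType.UNDEFINED_USAGE, frozenset({"s"})))
```

Freezing the record and storing SRC as a `frozenset` makes records hashable, which is what allows the table itself to be a set.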

For each record in AAT_w, the source operations producing the anomaly are recorded. For example, a usage of an artifact is undefined when a preceding kill has removed the definition of the artifact; that kill is recorded in the artifact anomaly record to provide information for fixing the anomaly. The following algorithms illustrate the detection of artifact anomalies and the calculation of output states for operations of different types.

Algorithm 15 Identifying Artifact Anomalies for No Operations - IAAN

Input: an LRTS workflow w, an artifact operation op, and a set of state items InState

For an artifact a, the no operation made by the end process is recorded in AOPLa to detect if any useless definition exists at the end of the LRTS workflow. Since only a definition transits an artifact to state DN, the algorithm records the operations generating DN state directly before the end of the LRTS workflow as useless definitions.

Algorithm 16 Identifying Artifact Anomalies for Definitions - IAAD

Input: an LRTS workflow w, an artifact operation op, and a set of state items InState

Pre-Condition: op.type == Def

IAAD {

Algorithm 16 identifies the artifact anomalies generated from a definition, and calculates its output state. For an artifact a, a definition that is not referenced by any usage before a is defined again is a useless definition. Finally, a definition transits a to state DN, and the output state generated by the definition is recorded accordingly.
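The listing of Algorithm 16 is not reproduced in this text, so the following Python sketch is only a reading of the description above, not the original algorithm. It assumes that state items are (state, SRC) pairs, that operations are string identifiers, and that a useless definition is recorded in the form (useless definition, Useless Definition, {redefining operation}), as in the record (op₂, Useless Definition, {op₉}) of the case study in section 4.3. Any handling of concurrent operations is omitted.

```python
def iaad(op, in_state, aat_w):
    """Sketch of the definition case: report useless definitions, return OutState."""
    for state, src in in_state:
        if state == "DN":
            # The artifact was defined but not yet referenced, and `op`
            # defines it again: each defining operation in SRC is useless.
            for definition in src:
                aat_w.add((definition, "Useless Definition", frozenset({op})))
    # A definition always transits the artifact to state DN.
    return {("DN", frozenset({op}))}


# Values taken from the op9 walk-through in the case study.
aat_w = set()
first = iaad("op9", {("DR", frozenset({"op2"})), ("UD", frozenset({"op7"}))}, aat_w)
anomalies_first_pass = set(aat_w)   # snapshot before the second pass
second = iaad("op9", {("DN", frozenset({"op2"}))}, aat_w)
```

The first call corresponds to the initial input state of op₉ and raises nothing; the second corresponds to the recalculated input state after the blank branch is removed.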

Algorithm 17 Identifying Artifact Anomalies for Kills - IAAK

Input: an LRTS workflow w, an artifact operation op, and a set of state items InState

anomaly is detected at lines 2 and 3. Besides, if an artifact remains undefined before a kill, the kill is redundant, and a Null Kill is raised accordingly. A kill transits an artifact to state UD, and the output state generated from the kill is recorded at line 5.
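A minimal Python sketch of the two behaviours that the surviving description of Algorithm 17 states explicitly: raising a Null Kill for an undefined artifact, and producing the output state UD. The anomaly detected at lines 2 and 3 of the original listing cannot be recovered from this text and is deliberately left out. The (state, SRC) pair representation, the string identifiers, the choice of the input SRC as the source set of the Null Kill record, and the operation names in the example are all assumptions.

```python
def iaak(op, in_state, aat_w):
    """Sketch of the kill case: report Null Kills, return OutState."""
    for state, src in in_state:
        if state == "UD":
            # The artifact is already undefined, so the kill is redundant.
            # Assumption: the record keeps the input SRC as its source set.
            aat_w.add((op, "Null Kill", src))
    # A kill always transits the artifact to state UD.
    return {("UD", frozenset({op}))}


# Hypothetical example: a kill "opk" reached while the artifact is still undefined.
aat_w = set()
out_state = iaak("opk", {("UD", frozenset({"s"}))}, aat_w)
```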

Algorithm 18 Identifying Artifact Anomalies for Usages - IAAU

Input: an LRTS workflow w, an artifact operation op, and a set of state items InState

Pre-Condition: op.type == Use

IAAU {
01: ∀stItem∈InState {
02:   if( stItem.state == AB )
03:     add( op, Ambiguous Usage, stItem.SRC∪
04:          ConcD_op∪ConcK_op ) to AATw;
05:   else if( stItem.state == UD ) {
06:     if( ConcD_op ≠ Ø )
07:       add( op, Ambiguous Usage, stItem.SRC∪ConcD_op ) to AATw;
08:     else add( op, Undefined Usage, stItem.SRC ) to AATw;
09:   }
10:   else if( stItem.state∈{DR, DN} )
11:     if( ConcD_op∪ConcK_op ≠ Ø )
12:       add( op, Ambiguous Usage, stItem.SRC∪
13:            ConcD_op∪ConcK_op ) to AATw;
14:   if( stItem.state == DN ) add (DR, stItem.SRC) to op_i.OutState;
15:   else add stItem to op_i.OutState;
16: } }

Algorithm 18 identifies whether a usage is abnormal, and calculates its output state. The input state AB indicates that the definition of the artifact is ambiguous when the operation is executed, which makes the usage an ambiguous usage. If the input state of the usage is UD, the algorithm checks at line 6 whether any definition is concurrent with the usage. If no concurrent definition exists, the usage is undefined. Otherwise, the usage is ambiguous because it may reference either an undefined artifact or the value defined by the concurrent definition(s). If the input state of the usage is DN or DR, the concurrent definitions or kills that cause ambiguity to the usage are checked at line 11, and an Ambiguous Usage is raised if any exist.

The usage transits a DN artifact to state DR; otherwise, it simply propagates the input state to the following operations.
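The listing of Algorithm 18 can be transcribed into Python almost line by line. The sketch below assumes that state items are (state, SRC) pairs with SRC a `frozenset`, that operations are string identifiers, and that ConcD_op and ConcK_op are passed in as the arguments `conc_d` and `conc_k`; these representations and the function name `iaau` are choices made for the sketch only.

```python
def iaau(op, in_state, conc_d, conc_k, aat_w):
    """Transcription of Algorithm 18; returns the output state of `op`."""
    conc_d, conc_k = frozenset(conc_d), frozenset(conc_k)
    conc = conc_d | conc_k
    out_state = set()
    for state, src in in_state:                       # line 01
        if state == "AB":                             # lines 02-04
            aat_w.add((op, "Ambiguous Usage", src | conc))
        elif state == "UD":                           # lines 05-09
            if conc_d:
                aat_w.add((op, "Ambiguous Usage", src | conc_d))
            else:
                aat_w.add((op, "Undefined Usage", src))
        elif state in ("DR", "DN"):                   # lines 10-13
            if conc:
                aat_w.add((op, "Ambiguous Usage", src | conc))
        if state == "DN":                             # line 14: DN becomes DR
            out_state.add(("DR", src))
        else:                                         # line 15: propagate as-is
            out_state.add((state, src))
    return out_state


# Values of op10 from the case study in section 4.3.
in_state = {("DR", frozenset({"op2"})), ("UD", frozenset({"op7"}))}
aat_w = set()
out_state = iaau("op10", in_state, {"op9"}, set(), aat_w)
```

With these inputs the sketch raises two Ambiguous Usage records, with sources {op2, op9} and {op7, op9}, and returns the input state unchanged, which agrees with the result reported for op₁₀ in the case study.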

The expressions adopted in Algorithm 15 to Algorithm 18 are based on the description in section 4.1. With all the definitions and algorithms described in this chapter, the methodology for detecting artifact anomalies in a TS workflow is introduced as follows.

Algorithm 19 Identifying Artifact Anomalies - IAA

Algorithm 19 first gathers information like EAIs, ABStacks, and artifact operation lists for the input LRTS workflow. For each artifact a, Algorithm 19 then identifies the concurrency between artifact operations with Algorithm 10 at line 3, and analyzes each operation in AOPL_a in order, starting from line 4.

Algorithm 11 is invoked at line 6 to collect the operations directly before op_i, and the operation sets directly before op_i are produced by Algorithm 12 from that result at line 7. At line 8, Algorithm 14 gathers the input state of op_i, and the corresponding algorithm is invoked at lines 9 to 12 to detect artifact anomalies and calculate the output state of op_i. At line 13, Algorithm 13 is invoked to detect whether there is any blank branch before op_i. If not, the anomaly detection work for op_i is accomplished. Otherwise, all the operations residing in the decision structures with blank branches are removed from OPL_opi, and Algorithm 19 repeats the analysis of artifact anomalies for op_i until all the blank branches have been considered. The completeness of the artifact anomalies detected by our methodology is decided by the completeness of the operation sets identified by Algorithm 12. Developing an algorithm able to collect more operation sets would enhance our methodology, and is left as future work of this study.

4.3 Case Study

In this section, a case study is made to illustrate the feasibility of our methodology.

Figure 16 The Sample TS Workflow for the Case Study in Chapter 4

Figure 16 shows the sample TS workflow for our case study. The processes, flows, working durations, and the artifact operations made on artifact a are illustrated in the sample.

To analyze the sample TS workflow with our methodology, the structured loops in the TS workflow should first be reduced. After loop reduction, the LRTS workflow generated from the sample TS workflow is illustrated in Figure 17. Then, Algorithm 9 is invoked to gather the temporal and structural information such as the EAI and the ABStack for each process, and the artifact operation list for each artifact.

Figure 17 The Sample LRTS Workflow Derived from Figure 16 with Decoration of EAIs and ABStacks

Table 3 illustrates the artifact operation list and the concurrent operations for artifact a generated by Algorithm 9 and Algorithm 10.

Table 3 Artifact Operation List for a, and the Corresponding Concurrent Operations

op_i    AOPL_a                     ConcD          ConcK
op₁     (v₁, a, 0, 2, Use)         Ø              Ø
op₂     (v₂, a, 1, 4, Def)         Ø              Ø
op₃     (v₁₀, a, 1, 4, Use)        {op₂}          {op₇}
op₄     (v₁₀¹, a, 1, 4, Use)       {op₂}          {op₇}
op₅     (v₃, a, 2, 6, Use)         Ø              Ø
op₆     (v₁₀², a, 2, 6, Use)       {op₂, op₉}     {op₇}
op₇     (v₆, a, 3, 8, Kill)        Ø              Ø
op₈     (v₁₀³, a, 3, 8, Use)       {op₂, op₉}     {op₇}
op₉     (v₇, a, 3, 10, Def)        Ø              Ø
op₁₀    (v₉, a, 9, 14, Use)        {op₉}          Ø
op₁₁    (v₁₁, a, 10, 16, Use)      Ø              Ø
op₁₂    (v₁₂, a, 11, 18, Def)      Ø              Ø
op₁₃    (e, a, 12, 18, Nop)        Ø              Ø

To be brief, we do not show all the details of detecting artifact anomalies in this case study, and focus on two representative examples, op₉ and op₁₀. Therefore, we assume that the operations before op₉ are calculated already, and Table 4 shows the output states of the operations with LETs smaller than op₉'s.

Table 4 The Output State of the Operations before op₉ is Calculated

op_i    OutState
op₁     { (UD, {s}) }
op₂     { (DN, {op₂}) }
op₃     { (UD, {s}) }
op₄     { (UD, {s}) }
op₅     { (DR, {op₂}) }
op₆     { (UD, {s}) }
op₇     { (UD, {op₇}) }
op₈     { (AB, {s, op₂, op₉}) }

op₁ is an undefined usage because it is executed before any activity process defines artifact a. op₃, op₄, op₆, and op₈ are ambiguous usages because there exist definitions concurrent with them. Before op₉ is calculated, the artifact anomaly table, AAT_w, records the following anomalies:

AAT_w = { (op₁, Undefined Usage, {s}),
          (op₃, Ambiguous Usage, {s, op₂}),
          (op₄, Ambiguous Usage, {s, op₂}),
          (op₆, Ambiguous Usage, {s, op₂, op₇}),
          (op₈, Ambiguous Usage, {s, op₂, op₇}) }

For op₉, Algorithm 19 retrieves all the operations with smaller LETs from AOPL_a as OPL_op9 = {op₁, op₂, op₃, op₄, op₅, op₆, op₇, op₈}, and invokes Algorithm 11 to calculate DB4_op9 = {op₅, op₇}. Since all the operations directly before op₉ are mutually exclusive, i.e. case (2) described in section 4.2, DB4OPS_op9 is calculated by Algorithm 12 as { {op₅}, {op₇} }.

With DB4OPS_op9, Algorithm 14 gathers the input state of op₉ as the union of the output states of op₅ and op₇, { (DR, {op₂}), (UD, {op₇}) }. op₉ is a definition, so Algorithm 16 is invoked to detect artifact anomalies and generate its output state. As a result, no artifact anomaly is found and the output state of op₉ is generated as { (DN, {op₉}) }. However, during the blank branch detection, (xs1, 2) is found to be a blank branch, and the operations in the same decision structure should be removed to eliminate the effect of the blank branch. op₅ and op₇ are removed from OPL_op9. DB4_op9, DB4OPS_op9, and the InState of op₉ are recalculated as {op₂}, { {op₂} }, and { (DN, {op₂}) }. After invoking Algorithm 16 once again, an artifact anomaly (op₂, Useless Definition, {op₉}) is raised because the definition made by op₂ is not used before redefinition when the blank branch is taken.
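The input-state step of this walk-through can be sketched as a plain set union. Algorithm 14 itself is not listed in this section, so the function below is an assumption that covers only the mutually exclusive case, in which every operation set is a singleton; racing sets such as RUS and RKU are not handled. The output states are the ones given in Table 4.

```python
def gather_instate_exclusive(db4ops, out_state):
    """Union of the output states of mutually exclusive predecessors."""
    in_state = set()
    for op_set in db4ops:
        if len(op_set) != 1:
            raise ValueError("this sketch only handles singleton operation sets")
        (pred,) = op_set
        in_state |= out_state[pred]
    return in_state


# Output states of the relevant predecessors, copied from Table 4.
out_state = {
    "op2": {("DN", frozenset({"op2"}))},
    "op5": {("DR", frozenset({"op2"}))},
    "op7": {("UD", frozenset({"op7"}))},
}

# First pass: DB4OPS_op9 = { {op5}, {op7} }.
first_pass = gather_instate_exclusive([{"op5"}, {"op7"}], out_state)
# After removing the decision structure with the blank branch: { {op2} }.
second_pass = gather_instate_exclusive([{"op2"}], out_state)
```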

DB4_op10 is generated as {op₃, op₅, op₇, op₈}, and DB4OPS_op10 is generated as { {op₃, op₅}, {op₇, op₈} }. Since this case is relatively simple, we can easily see that the operation sets {op₃, op₇} and {op₅, op₈} are neglected by our methodology. With DB4OPS_op10, the input states of op₁₀ are generated. According to the definition of racing operations introduced in section 4.1.1, {op₃, op₅} is an RUS and {op₇, op₈} is an RKU, and { (DR, {op₂}) } and { (UD, {op₇}) } are generated as op₁₀'s input states correspondingly. Algorithm 18 is invoked to detect artifact anomalies and identify the output state of op₁₀. Two artifact anomalies, (op₁₀, Ambiguous Usage, {op₇, op₉}) and (op₁₀, Ambiguous Usage, {op₂, op₉}), are generated because op₉ defines a concurrently and makes op₁₀ ambiguous. The output state of op₁₀ is { (DR, {op₂}), (UD, {op₇}) }. Then the algorithm removes the blank branches for op₁₀, and finds no further anomalies.

Besides the artifact anomalies listed and described above, (op₁₃, Useless Definition, {e}) is detected and recorded in AAT_w when Algorithm 19 completes its work throughout w. The useless definition is detected at the end process of the LRTS workflow because the definition made by op₁₂ is not used by any other activity process before the end of w.

4.4 Discussion

4.4.1 Related Works in Analysis of Artifact Anomalies

Sun et al. extend the Activity Diagram in UML for modeling data flow in a business process [51]. Three classes of data-flow anomalies, missing data, redundant data, and conflicting data, are defined. With the routing information defined in a workflow specification, a detecting algorithm for the data-flow anomalies is constructed [51]. However, Sun et al. do not build an explicit data model in characterizing the data behaviors, and consider only read and initial write in data operations.

In [26], Sadiq et al. point out the importance of validating workflow data, and introduce seven basic data validation problems in workflow models: Redundant Data, Lost Data, Missing Data, Mismatched Data, Inconsistent Data, Misdirected Data, and Insufficient Data. Redundant Data occurs when designers specify an activity to define a data item that is not required by any succeeding activity. Lost Data occurs when designers specify two activities that may be executed in parallel to define the same data item, and one of the definitions is lost when the data item is preempted by the process executed in advance. Missing Data occurs when designers specify an activity to consume a data item that is never defined by any preceding activity. Mismatched Data arises when the structure of the data is incompatible between its definition and its usage. Inconsistent Data happens when the data required by a workflow are externally updated by other applications during the workflow execution, and the polluted data might cause errors in the workflow. Misdirected Data occurs when the direction of the data flow conflicts with the direction of the control flow of the workflow. Insufficient Data happens when the data specified by designers are insufficient to successfully complete an activity.

Destruction of artifacts is not considered in either Sun's or Sadiq's studies. In [27] and [28], Hsu et al. consider the effect of destroying an artifact and re-model inaccurate artifact manipulation by separating initialization and update into two different artifact operations. In [28], six inaccurate artifact usages, No Producer, No Consumer, Redundant Specification, Contradiction, Parallel Hazard, and Branch Hazard, are defined. No Producer is a warning indicating that a data item is operated on before it is specified. No Consumer indicates that an artifact is not requested after its definition (initialization). Redundant Specification indicates that an artifact is repeatedly specified in a workflow. Contradiction implies the defect that the state of an artifact does not match the pre-condition or post-condition of the activity accessing it. Parallel Hazard occurs due to conflicting interleaving of concurrent artifact operations, and is recognized if multiple concurrent activities operate on the same artifact. Branch Hazard occurs when branches in a decision structure contain operations on artifacts have been selected, or when there is inconsistency between the condition testing in the XOR-split process or the branches in the decision structure.

In [29], Wang et al. develop a systematic notation to describe artifact anomalies and simplify the description of artifact anomalies from [28] into three categories: Missing Production, Redundant Write, and Conflict Write. Missing Production occurs when an artifact is consumed before it is produced or after it is destroyed. Redundant Write occurs when an artifact is written by an activity but the artifact is neither required by the succeeding activities nor a member of the process outputs. Conflict Write occurs when parallel processes race for access to the same artifact. According to the different structural relationships between the activities accessing an artifact, thirteen abnormal usage patterns are described for the three categories to follow the previous models made by Sadiq et al. [26], Hsu et al. [29], and Sun et al. [51].

4.4.2 Comparison between Our Approach and the Related Works

Table 5 Comparison between Our Approach and the Related Works

(Table 5 body not recoverable; surviving cells: Initialization, Missing Data, No Production, Delayed Initialization, Lost Data, Contradiction, Conflict Write.)

... mapped into the three basic categories made in [51]. By comparing the definitions of the artifact anomalies in our approach and in the related works, we conclude that Undefined Usage and Useless Definition map directly to Missing Data and Redundant Data described in [51]. On the other hand, the Conflict Data defined in [51] are anomalies generated when multiple definitions are made in parallel. In our approach, concurrent definitions are considered to be executed in an undetermined order, and generate ambiguity in artifacts. They are not directly considered an anomaly because (1) an anomaly actually occurs only when a usage refers to the ambiguous definitions, and (2) a similar anomaly may also occur when kills or definitions are concurrent with usages. Therefore, Ambiguous Usage is categorized in this dissertation, and covers the Conflict Data discussed in the previous works. Besides, Sadiq et al. additionally define Insufficient Data and Mismatched Data in [26] for conflicts in content or format between definitions and usages. Since the studies in [28], [29], [51] and this dissertation do not discuss the contents of artifacts, Insufficient Data and Mismatched Data are ignored in these studies. Finally, although destruction of artifacts is considered in [28] and [29], the redundancy generated by unnecessary destruction is not discussed in these works. In our study, Null Kill is categorized and detected to eliminate such redundancies.

Our approach also considers how temporal factors may affect the detection of artifact anomalies. The intertwined temporal and structural relationships between activity processes are modeled and analyzed, and the artifact anomalies generated along with them are detected.

Besides, whereas the previous works focus only on the detection of artifact anomalies, our approach also helps designers locate the problems hidden in a workflow schema by providing information about the sources leading to the artifact anomalies.

Chapter 5. Incremental Detection of Resource Conflicts in LRTS Workflow

In this chapter, we describe an incremental methodology that detects the resource conflicts generated or eliminated by each edit operation designers make during the construction of an LRTS workflow. With the methodology, designers obtain information after each step they make, and may respond to any conflict immediately. In section 5.1, the resource conflicts in LRTS workflows are first defined, and the edit operations and additional elements necessary for building an LRTS workflow are modeled in section 5.2. The methods for incremental detection of resource conflicts are depicted in section 5.3. Several examples are described in section 5.4 to illustrate the feasibility of our methodology, and the related works are discussed in section 5.5.

5.1 Resource Conflicts in LRTS workflow

Each activity in a workflow needs certain resources to accomplish its business objective. In this dissertation, it is assumed that all the resources required by an LRTS workflow w are recorded in the set RES_w, and designers may assign a resource in RES_w to activity processes in P_w to show that the resource is necessary to the process. The resource model used in this dissertation is defined as follows.

Definition 31 (Resources)
