
Institute of Computer Science and Engineering

Detecting the Artifact Anomalies in

Business Process Specifications with a Formal Model

Student: Chia-Lin Hsu (許嘉麟)

Advisor: Prof. Feng-Jian Wang (王豐堅)

October 2007 (Year 96 of the Republic of China)

Detecting the Artifact Anomalies in

Business Process Specifications with a Formal Model

Student: Chia-Lin Hsu

Advisor: Feng-Jian Wang

National Chiao Tung University

Institute of Computer Science and Engineering

Doctoral Dissertation

A Dissertation

Submitted to Institute of Computer Science and Engineering, College of Computer Science

National Chiao Tung University in Partial Fulfillment of the Requirements

for the Degree of Doctor of Philosophy

in

Computer Science

October 2007

Hsinchu, Taiwan, Republic of China

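Student: Chia-Lin Hsu Advisor: Dr. Feng-Jian Wang

Doctoral Program, Institute of Computer Science and Engineering, National Chiao Tung University

Abstract (translated from Chinese)

Although many business process models have been proposed, few of them analyze artifact usage. Because of improper artifact operations, such as inconsistency between the artifact flow and the control flow or conflicting artifact operations, a well-structured business process with sufficient resources may still produce unexpected results during execution. Analyzing artifact usage is therefore important, since activities cannot execute correctly without accurate information. This dissertation proposes a process model to describe business processes and analyzes artifact usage on this model. Three types (thirteen cases) of anomalous artifact usage that affect process execution are identified and expressed in a systematic way. In addition, this dissertation proposes algorithms to detect these anomalies and demonstrates them with a practical example.

Keywords: workflow, business process, analysis, control flow, data flow, anomaly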

Detecting the Artifact Anomalies

in Business Process Specifications

with a Formal Model

Student:Chia-Lin Hsu Advisor:Dr. Feng-Jian Wang

Institute of Computer Science and Engineering

National Chiao Tung University

Abstract

Although many business process models have been proposed, analyses of artifact usage are seldom discussed. A well-structured business process with sufficient resources may still fail or yield unexpected results during process execution due to inaccurate artifact specification, e.g., inconsistency between artifact flow and control flow, or contradictions between artifact operations. Thus, the analysis of artifact usage is very important, since activities cannot be executed properly without accurate information. This dissertation presents a process model for describing a business process and analyzes artifact usage on this model. Three types with thirteen cases of artifact usage anomalies affecting process execution are identified and formulated, and a set of algorithms to detect these anomalies in business process specifications is presented. Furthermore, an example is given to validate the usability of the proposed algorithms.

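Acknowledgments

For the completion of this dissertation, I first owe my deepest thanks to my advisor, Dr. Feng-Jian Wang. Throughout my studies (from the master's program to the doctoral program), Professor Wang's continuous guidance and encouragement gave me invaluable experience in research and also taught me a great deal about how to conduct myself and deal with others. Whatever modest achievements I may have today owe much to his guidance.

Next, I thank Professors 吳毅成, 陳耀宗, 朱治平, 朱正忠, 黃悅民, and 楊鎮華, who agreed despite their busy schedules to serve on my doctoral oral defense committee and offered many valuable comments that strengthened the weaker parts of this dissertation. Professors 吳 and 陳 also served on my dissertation proposal committee and gave me considerable guidance on how to present my work. I also thank the senior and junior members of the laboratory with whom I studied, discussed, and exchanged encouragement, including 楊基載, 黃國展, 王建偉, 王靜慧, and 許懷中.

Finally, I would like to share the joy of completing this dissertation with my family, classmates, friends, 溶璘, 徐媽媽, and 徐嬤嬤. Your support and care accompanied me through this long course of study. I dedicate this dissertation to my beloved parents and to the relatives and friends who supported me.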

Table of Contents

ABSTRACT (CHINESE)

ABSTRACT

ACKNOWLEDGMENTS

TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER 1. INTRODUCTION

CHAPTER 2. RELATED WORK AND BACKGROUND

CHAPTER 3. PROCESS MODELING

3.1. PROCESS SPECIFICATIONS

3.2. CONTROL FLOW SPECIFICATION

3.2.1. Activities and Control Blocks

3.2.2. Relations among Activities and Control Blocks

3.3. ARTIFACT FLOW SPECIFICATION

3.3.1. Artifacts and Artifact Operations

3.3.2. Artifact Flow and Artifact Usages

CHAPTER 4. ARTIFACT USAGE ANOMALIES

4.1. ARTIFACT USAGE ANOMALIES

4.1.1. Missing Production Anomalies

4.1.2. Redundant Write Anomalies

4.1.3. Conflict Write Anomalies

4.1.4. Summary of Usage Patterns Causing Artifact Usage Anomalies

CHAPTER 5. ALGORITHMS TO DETECTING ARTIFACT USAGE ANOMALIES

5.1. THE TRAVERSAL ALGORITHM

5.2. THE DETECTION ALGORITHM

5.2.1. Method for Detecting Missing Production Anomalies

5.2.2. Method for Detecting Redundant Production/Update Anomalies

5.2.3. Method for Detecting Conflict Writes Anomalies

5.2.4. Complexities of Traversal and Detection Algorithms

CHAPTER 6. ILLUSTRATIVE EXAMPLE

6.1. AN EXAMPLE: PROPERTY LOAN APPROVAL PROCESS

6.2. DETECTION OF MISSING PRODUCTION ANOMALIES

CHAPTER 7. COMPARISONS OF DATA-FLOW ANALYSIS APPROACHES

CHAPTER 8. CONCLUSION AND FUTURE WORK

REFERENCE

List of Tables

Table 4.1. Symbols Used in Usage Patterns

Table 4.2. Summary of Usage Patterns Causing Artifact Usage Anomalies

Table 6.1. Activities and Artifacts in the Property Loan Approval Process

Table 6.2. Artifacts Usages in the Property Loan Approval Process

Table 6.3. Steps to Detect Missing Production Anomalies

Table 6.4. Steps to Calculate the Unused Artifacts for Every Activity

Table 7.1. Comparison of Anomalies Addressed

List of Figures

Figure 3.1. Notations of Control Flow Graph

Figure 3.2. Four Primitive Types of Control Structures

Figure 3.3. The State Diagram of an Artifact

Figure 5.1. Transform a Repeat-Until Loop

Figure 5.2. Transform a While Loop

Chapter 1.

Introduction

Workflow can be viewed as a set of interrelated tasks that are systematized to achieve certain business goals by completing each task in a particular order under automatic control [1]. Resources are required for workflow implementation, and support process execution. Resource allocation and resource constraint analysis [2-6] are popular workflow research topics. However, data flow within workflow is seldom addressed [7-10].

An artifact is an abstraction of a data instance within a workflow. Introducing the analysis of artifact usage into control-oriented workflow designs helps maintain consistency between execution order and data transition, and prevents the exceptions that result from contradictions between data flow and control flow. In contrast to structural correctness, accuracy in artifact manipulation helps determine whether the execution result of a workflow is meaningful and desirable.

This dissertation proposes a process model for describing business processes and addresses three types of artifact usage anomalies. An artifact usage analysis procedure associated with the model is applied before deploying the workflow schema. Reports of consistency checking between data flow and control flow, together with information about artifact manipulation, are automatically provided to designers when they edit or adjust a workflow specification. The model is based on the component-based design technique [11, 12] and is compatible with existing control-oriented workflow design models. It provides an easier way to extract knowledge of artifact usages in a workflow. In our earlier work [13, 14], we introduced artifact usage analysis into the workflow design phase and preliminarily identified the improper artifact usages affecting workflow execution. In this dissertation, the artifact usages are formalized and concrete algorithms for discovering the improper usages in workflow specifications are proposed. In addition, an example demonstrating the contribution of our work and a comparison between related works and ours are presented.

The remainder of this dissertation is organized as follows. Chapter 2 presents the research background and related work. Chapter 3 presents our process modeling, including the control flow and artifact flow. Chapter 4 then defines three types with thirteen cases of artifact usage anomalies. Next, chapter 5 proposes a set of algorithms to detect artifact usage anomalies in a process schema. Chapter 6 demonstrates the algorithms through an example. Chapter 7 compares our approach with related works. Conclusions are finally drawn in chapter 8, along with recommendations for future work.

Chapter 2.

Related Work and Background

A workflow can be deemed a collection of cooperating and coordinated activities designed to carry out a well-defined complex process, such as trip planning, a conference registration procedure, or a business process in an enterprise. A workflow model is used to describe a workflow in terms of various elements, such as roles and resources, tools and applications, activities, and data, which represent different perspectives of a workflow [15, 16]. Roles and resources elements represent the organizational perspective, which describes where and by whom tasks are performed and which resources in the organization the tasks can utilize. Tools and applications elements represent the operational perspective by specifying what tools and applications are used to execute a particular task. Activity elements are defined with two perspectives: 1) functional: what tasks a workflow performs; and 2) behavioral: when and how tasks are performed. Data elements represent the informational perspective, i.e., what information entities are produced or manipulated in the corresponding activities in a workflow.

A well-defined workflow model leads to the efficient development of an effective and reliable workflow application. The correctness issues in a workflow might be classified into three dimensions: control-flow, resource, and data-flow. Generally, the analyses in control-flow dimension are focused on correctness issues of control structure in a workflow. The common control-flow anomalies include deadlock, livelock (infinite loop), lack of synchronization, and dangling reference [17-28]. A deadlock anomaly occurs if it is no longer possible to make any progress for a workflow instance, e.g. synchronization on two mutually exclusive alternative paths. A livelock anomaly indicates an infinite loop, such as iteration without possible exit condition, which causes a workflow to make continuous progress, however, without progressing toward successful completion. A lack of synchronization anomaly represents the case of more than one incoming vertex merging into an or-join vertex. Activities without termination or without activation are two common cases of dangling reference anomaly.

Activities belonging to different workflows or parallel activities in the same workflow might access the same resources. A resource conflict occurs when these activities execute over the same time interval. Thus, the analyses in resource dimension include the identification of resource conflicts under resource allocation constraints and/or under the temporal and/or causality constraints [2-6]. On the other hand, missing, redundancy, and conflict use of data are common anomalies in data-flow dimension [7-10]. A missing data anomaly occurs when an artifact is accessed before it is initialized. A redundant data anomaly occurs when an activity produces an intermediate data output but this data is not required by any succeeding activity. A conflicting data anomaly represents the existence of different versions of the same artifact.

Current workflow modeling and analyzing paradigms are mainly focused on the soundness of control logic, i.e., in the control-flow dimension, including process model analysis [19-30], workflow patterns [20-33] and automatic control of workflow process [34]. Aalst and ter Hofstede [19] proposed a WorkFlow net (WF-net), based on Petri nets, to model a workflow: transitions representing activities, places representing conditions, tokens representing cases, and directed arcs connecting transitions and places. Furthermore, control-flow anomalies, such as deadlock, livelock, and dangling reference (activities without termination or activation) have been identified through Petri net modeling and analysis. Son [35] defined a well-formed workflow based on the concepts of closure and control block. He claimed that a well-formed workflow is free from structural errors, and that complex control flows can be made with nested control blocks. Son [35] and Chang [36] identified and extracted the workflow critical path from the context of the workflow schema. They proposed extraction procedures from various non-sequential control structures to sequential paths, thus obtaining appropriate sub-critical paths in non-sequential control structures. Sadiq and Orlowska [30] proposed a visual verification approach and algorithm with a set of graph reduction rules to discover structural conflicts in process models for given workflow modeling languages.

Several research topics are discussed in the resource dimension, including resource allocation constraints [2, 3], resource availability [4], resource management [5], and resource modeling [6]. Senkul [2] developed an architecture to model and schedule workflows with resource allocation constraints and traditional temporal/causality constraints. Li [3] concluded that a correct workflow specification should have resource consistency. His algorithms can verify resource consistency and detect potential resource conflicts in workflow specifications. Both Senkul and Li extended workflow specifications with constraint descriptions. Liu [4] proposed a three-level bottom-up workflow design method to effectively incorporate confirmation and compensation in case of failure. In Liu's model, data resources are modeled as resource classes, and the only interface to a data resource is via a set of operations.

Current analysis techniques, including the above approaches, pay little attention to the data-flow dimension, although analysis in this dimension is very important since activities cannot be executed properly without sufficient data information. Two works on the data-flow dimension are found in the literature. Sadiq et al. [7] presented data flow validation issues in workflow modeling, including identifying the requirements of data modeling and seven basic data validation problems: redundant data, lost data, missing data, mismatched data, inconsistent data, misdirected data, and insufficient data. However, no concrete verification procedure is presented. Sun et al. [8-10] presented a data-flow analysis framework for detecting data-flow anomalies such as missing data, redundant data, and potential conflicts of data. In addition, they provided several analysis algorithms; however, the work is based only on read and initial write data operations.

Chapter 3.

Process Modeling

3.1. Process Specifications

Based on BPMN, a process consists of a network of activities designed to produce a product or service for a particular customer or market. A process specification, a formalized view of a business process, defines a set of linked (parallel and/or sequential) activities across time and space, with a beginning and an end, associated with clearly defined inputs and outputs, respectively. Each activity takes a subset of the process inputs or of the outputs of previous activities and transforms them to create data for later use or as process outputs. The inputs and outputs of a process, as well as the intermediate outputs of activities, are called artifacts. Thus, a process specification contains not only the control flow but also the artifact flow of a business process. Definition 3.1 is a formal description of a business process.

Definition 3.1. A process specification is a tuple $BP = (G, VT, D, I_W, O_W)$, where

- $G = (V, E)$, representing the control flow, is a directed, connected, and acyclic graph, where $V$ is a set of vertices, each of which represents an activity, and $E \subset V \times V$ is a set of directed edges indicating the precedence relation between two activities.

- $VT : V \to T$ is a type function that maps each activity into one of the activity types defined as $T = \{Task, SubProcess, ProcessStart, ProcessEnd, AndSplit, AndJoin, XorSplit, XorJoin, LoopStart, LoopEnd\}$.

Activities whose types are Task are called task activities while the others are called control activities.

- $D$ is a set of artifacts used in the process.

- $I_W \subset D$, a subset of $D$, denotes the set of process inputs.

- $O_W \subset D$, a subset of $D$, denotes the set of process outputs.
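As an illustration only, the tuple of Definition 3.1 could be held in a small Python record such as the one below. The class name `ProcessSpec`, its field names, and the `problems` helper are hypothetical and are not part of the dissertation; the sketch checks only the set-membership side conditions and does not test connectivity or acyclicity.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

# The activity types T listed in Definition 3.1.
ACTIVITY_TYPES = {
    "Task", "SubProcess", "ProcessStart", "ProcessEnd", "AndSplit",
    "AndJoin", "XorSplit", "XorJoin", "LoopStart", "LoopEnd",
}


@dataclass
class ProcessSpec:
    """Hypothetical container for BP = (G, VT, D, I_W, O_W), G = (V, E)."""
    vertices: Set[str]             # V
    edges: Set[Tuple[str, str]]    # E, expected to be a subset of V x V
    vertex_type: Dict[str, str]    # VT : V -> T
    artifacts: Set[str]            # D
    inputs: Set[str]               # I_W
    outputs: Set[str]              # O_W

    def problems(self) -> List[str]:
        """Return the violated side conditions (empty if none is found)."""
        found = []
        for u, v in sorted(self.edges):
            if u not in self.vertices or v not in self.vertices:
                found.append(f"edge ({u}, {v}) leaves V")
        for v in sorted(self.vertices):
            if self.vertex_type.get(v) not in ACTIVITY_TYPES:
                found.append(f"vertex {v} has no valid type")
        if not self.inputs <= self.artifacts:
            found.append("I_W is not a subset of D")
        if not self.outputs <= self.artifacts:
            found.append("O_W is not a subset of D")
        return found


spec = ProcessSpec(
    vertices={"start", "review", "end"},
    edges={("start", "review"), ("review", "end")},
    vertex_type={"start": "ProcessStart", "review": "Task", "end": "ProcessEnd"},
    artifacts={"application", "decision"},
    inputs={"application"},
    outputs={"decision"},
)
print(spec.problems())
```

The three-activity process used here is invented for the example and does not come from the dissertation's case study.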

3.2. Control Flow Specification

3.2.1. Activities and Control Blocks

An activity in a business process might be atomic or non-atomic (compound). An atomic activity is the smallest unit of work that is scheduled by a workflow engine during process enactment and cannot be decomposed. A sub-process included within a process is represented as a compound activity. Atomic activities are classified into two major types, task activities and control activities, based on their functionalities. A task activity performs a processing step. Control activities come in pairs, and each pair delimits a group of activities called a control block. In general, there are eight types (four pairs) of primitive control activities: (1) ProcessStart (PS) and ProcessEnd (PE) are unique control activities of a process that represent the start and the end of the process, respectively; (2) AndSplit (AS) and AndJoin (AJ) are control activities for constructing a parallel structure; (3) XorSplit (XS) and XorJoin (XJ) are control activities for constructing a branch structure; (4) LoopStart (LS) and LoopEnd (LE) are control activities representing an iteration structure.

Figure 3.1 shows the corresponding notations of control activities, task activity, sub-process activity, and the precedence relation [37].

With typed activities and their precedence relation, various kinds of control structures can be constituted. This dissertation considers the four primitive control structures defined in [1]: "sequential", "parallel branch", "conditional branch", and "iterative structure".

Figure 3.2 shows these control structures.

Sequential Block: the activities within this structure are executed sequentially under a single thread. The main characteristic is that the target activity cannot execute until its preceding activity completes. In other words, the completion of a target activity triggers the execution of its succeeding activity.

Iteration Control Block: The activities within the block enclosed by LoopStart and LoopEnd control activities are executed repetitively until certain conditions are met. There are two kinds of iteration control blocks: while loop and repeat-until loop. A while loop checks the conditions before the first activity within the block is executed and thus, it is often also known as a pre-test loop. On the contrary, a repeat-until loop, also known as a post-test loop, tests the conditions after the activities within the block are executed.

AND Control Block: All outflows of an AndSplit activity are executed in parallel, and finally converge into an AndJoin activity synchronously.

XOR (eXclusive OR) Control Block: An XorSplit activity decides one among multiple alternative outflows (process branches) to continue. These branches converge to a single XorJoin activity. No synchronization is required since only one thread is chosen for execution.


Figure 3.2. Four Primitive Types of Control Structures.

According to our notations, the control flow $G = (V, E)$ of a process specification is well-formed if the following constraints hold:

- $G$ has a unique Process Start vertex $v_{ps}$ of type ProcessStart, which has no incoming edge and one outgoing edge.

$\exists! v_{ps} : VT(v_{ps}) = ProcessStart \to InDegree(v_{ps}) = 0 \land OutDegree(v_{ps}) = 1$

- $G$ has a unique Process End vertex $v_{es}$ of type ProcessEnd, which has one incoming edge and no outgoing edge.

$\exists! v_{es} : VT(v_{es}) = ProcessEnd \to InDegree(v_{es}) = 1 \land OutDegree(v_{es}) = 0$

- Vertices of type Task, LoopStart, and LoopEnd have one incoming edge and one outgoing edge.

$\forall v_i : (VT(v_i) = Task \lor LoopStart \lor LoopEnd) \to InDegree(v_i) = OutDegree(v_i) = 1$

- Vertices of type AndSplit and XorSplit have one incoming edge and more than one outgoing edge.

$\forall v_{bs} : (VT(v_{bs}) = AndSplit \lor XorSplit) \to InDegree(v_{bs}) = 1 \land OutDegree(v_{bs}) > 1$

- Vertices of type AndJoin and XorJoin have more than one incoming edge and one outgoing edge.

$\forall v_{bj} : (VT(v_{bj}) = AndJoin \lor XorJoin) \to InDegree(v_{bj}) > 1 \land OutDegree(v_{bj}) = 1$

- Any two control blocks can be nested but not overlapped.

$\forall b_1 = [v_i, v_j], b_2 = [v_x, v_y], b_1 \neq b_2 \to b_1 \subset b_2 \lor b_1 \supset b_2 \lor b_1 \cap b_2 = \emptyset$
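The degree constraints above lend themselves to a mechanical check. The following Python sketch is one possible way to do it, not the procedure used in the dissertation; the names `DEGREE_RULES`, `degree_violations`, and `has_unique_start_and_end` are assumptions made for this example. The nesting constraint on control blocks is not covered, and types without a stated degree constraint (such as SubProcess) are skipped.

```python
from collections import Counter


def is_zero(n):
    return n == 0


def is_one(n):
    return n == 1


def is_many(n):
    return n > 1


# activity type -> (test on in-degree, test on out-degree)
DEGREE_RULES = {
    "ProcessStart": (is_zero, is_one),
    "ProcessEnd": (is_one, is_zero),
    "Task": (is_one, is_one),
    "LoopStart": (is_one, is_one),
    "LoopEnd": (is_one, is_one),
    "AndSplit": (is_one, is_many),
    "XorSplit": (is_one, is_many),
    "AndJoin": (is_many, is_one),
    "XorJoin": (is_many, is_one),
}


def degree_violations(vertex_type, edges):
    """Return the sorted vertices whose degrees break the rule of their type."""
    in_degree = Counter(dst for _, dst in edges)
    out_degree = Counter(src for src, _ in edges)
    bad = []
    for v, kind in vertex_type.items():
        rule = DEGREE_RULES.get(kind)
        if rule is None:
            continue  # no degree constraint is stated for this type
        in_ok, out_ok = rule
        if not (in_ok(in_degree[v]) and out_ok(out_degree[v])):
            bad.append(v)
    return sorted(bad)


def has_unique_start_and_end(vertex_type):
    """Check that exactly one ProcessStart and one ProcessEnd exist."""
    kinds = Counter(vertex_type.values())
    return kinds["ProcessStart"] == 1 and kinds["ProcessEnd"] == 1


types = {"PS": "ProcessStart", "XS": "XorSplit", "A": "Task",
         "B": "Task", "XJ": "XorJoin", "PE": "ProcessEnd"}
edges = [("PS", "XS"), ("XS", "A"), ("XS", "B"),
         ("A", "XJ"), ("B", "XJ"), ("XJ", "PE")]
print(degree_violations(types, edges), has_unique_start_and_end(types))
```

Removing the edge from B to XJ in this invented example leaves B without an outgoing edge and XJ with a single incoming edge, so both vertices would be reported.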

3.2.2. Relations among Activities and Control Blocks

In this section, relations among activities and control blocks are identified as follows.

Definition 3.2 (Paths).

A path from $v_1$ to $v_k$ is a sequence of vertices $\langle v_1, \ldots, v_k \rangle$ in a control graph $G = (V, E)$ such that each vertex is connected to the next vertex in the sequence (the edges $(v_i, v_{i+1})$ for $i = 1, 2, \ldots, k-1$ are in the edge set $E$). A path from $v_1$ to $v_k$ is denoted by $Path(v_1, v_k)$.

Definition 3.3 (Reachability).

Given two vertices, $u$ and $v$, $IsReachable(u, v)$ is a Boolean function that indicates whether there exists a path from $u$ to $v$.

$\forall u, v \in V : IsReachable(u, v) = true \leftrightarrow \exists Path(u, v) \lor u = v$
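A breadth-first search is one straightforward way to evaluate the reachability predicate of Definition 3.3. The sketch below is illustrative; the function name `is_reachable` and the edge-list representation are assumptions of this example rather than the dissertation's own implementation.

```python
from collections import deque


def is_reachable(edges, u, v):
    """True if u == v or a path from u to v exists in the edge list."""
    if u == v:
        return True
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
    seen = {u}
    queue = deque([u])
    while queue:
        current = queue.popleft()
        for nxt in successors.get(current, ()):
            if nxt == v:
                return True
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


edges = [("PS", "XS"), ("XS", "A"), ("XS", "B"),
         ("A", "XJ"), ("B", "XJ"), ("XJ", "PE")]
print(is_reachable(edges, "PS", "PE"), is_reachable(edges, "A", "B"))
```

In this invented graph, A and B sit on different branches of the same split, so neither is reachable from the other.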

Definition 3.4 (Predecessors and Successors).

$V_v^{IsPredecessor} = \{u \in V \mid (u, v) \in E\}$

$\overline{V}_v^{IsPredecessor} = \{t \in V \mid t \in V_v^{IsPredecessor} \lor (\exists u \in V_v^{IsPredecessor} : t \in \overline{V}_u^{IsPredecessor})\}$

$V_v^{IsSuccessor} = \{u \in V \mid (v, u) \in E\}$

$\overline{V}_v^{IsSuccessor} = \{t \in V \mid t \in V_v^{IsSuccessor} \lor (\exists u \in V_v^{IsSuccessor} : t \in \overline{V}_u^{IsSuccessor})\}$

$V_v^{IsPredecessor}$ comprises the set of vertices which are the source of an edge with destination vertex $v \in V$. Each element $u$ in $V_v^{IsPredecessor}$ is called a direct predecessor of the vertex and is denoted by $u \to v$. $\overline{V}_v^{IsPredecessor}$ denotes the transitive closure of $V_v^{IsPredecessor}$; it comprises those vertices from which $v$ is reachable. Each element $u$ in $\overline{V}_v^{IsPredecessor}$ is called a predecessor of $v$ and is denoted by $u \twoheadrightarrow v$. $V_v^{IsSuccessor}$ and its transitive closure $\overline{V}_v^{IsSuccessor}$ are defined similarly.
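The transitive closure in Definition 3.4 can be computed by repeatedly adding the direct predecessors of the vertices found so far. The sketch below assumes an edge-list representation; the function names are hypothetical. The successor closure reuses the same routine on the reversed edges.

```python
def predecessor_closure(edges, v):
    """All vertices from which v can be reached through one or more edges."""
    direct = {}
    for src, dst in edges:
        direct.setdefault(dst, set()).add(src)
    closure = set()
    frontier = set(direct.get(v, ()))
    while frontier:
        closure |= frontier
        # Step to the direct predecessors of the frontier, skipping known ones.
        frontier = {p for u in frontier for p in direct.get(u, ())} - closure
    return closure


def successor_closure(edges, v):
    """All vertices reachable from v through one or more edges."""
    return predecessor_closure([(dst, src) for src, dst in edges], v)


edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
print(sorted(predecessor_closure(edges, "c")), sorted(successor_closure(edges, "b")))
```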

Definition 3.5 (Ancestor Blocks and Level of an Activity).

$\forall v \in V$, let $v.PB$ denote the parent control block containing $v$. $AncestorBlock(v)$ comprises the set of all control blocks that contain $v$.

$AncestorBlock(v) = \{b \mid b = v.PB \lor b \in AncestorBlock(v.PB.StartVertex)\}$

In addition, the cardinality of $AncestorBlock(v)$ identifies the nested level of $v$.

$Level(v) = \begin{cases} |AncestorBlock(v)| & \text{if } v \in V \\ |AncestorBlock(v.StartVertex)| & \text{if } v \text{ represents a control block} \end{cases}$

Definition 3.6 (Common Ancestor Blocks and Nearest Common Ancestor Blocks).

A control block $B_i$ is a common ancestor block of $v_1, \ldots, v_n$ if and only if the following holds:

$B_i \in \bigcap_{k=1}^{n} AncestorBlock(v_k)$, denoted by $B_i \in CAB(v_1, \ldots, v_n)$.

$B_i$ is the nearest common ancestor block of $v_1, \ldots, v_n$ if and only if the following holds:

$\forall B_j \in CAB(v_1, \ldots, v_n) \land B_j \neq B_i : Level(B_j) < Level(B_i)$, denoted by $NCAB(v_1, \ldots, v_n) = B_i$.
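To make Definitions 3.5 and 3.6 concrete, the sketch below stores, for every activity and every control block, the block that directly encloses it, and then follows that chain. The dictionary `parent_block` and the function names are assumptions of this example; for a block, the stored parent stands for the parent block of its start vertex.

```python
def ancestor_blocks(parent_block, item):
    """Blocks containing item, listed from the innermost to the outermost."""
    chain = []
    block = parent_block.get(item)
    while block is not None:
        chain.append(block)
        block = parent_block.get(block)
    return chain


def level(parent_block, item):
    """Nesting level: the number of blocks that contain item."""
    return len(ancestor_blocks(parent_block, item))


def ncab(parent_block, first, *rest):
    """Nearest common ancestor block: the common ancestor of highest level."""
    common = set(ancestor_blocks(parent_block, first))
    for v in rest:
        common &= set(ancestor_blocks(parent_block, v))
    if not common:
        return None
    return max(common, key=lambda b: level(parent_block, b))


# Invented nesting: root contains and1 and d; and1 contains xor1 and c;
# xor1 contains a and b.
parent_block = {"and1": "root", "xor1": "and1",
                "a": "xor1", "b": "xor1", "c": "and1", "d": "root"}
print(ancestor_blocks(parent_block, "a"), ncab(parent_block, "a", "c"))
```

Because properly nested blocks containing the same activity form a chain, the common ancestor with the highest level is unique whenever one exists.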

Definition 3.7 (Parallel Activities).

Given two vertices, $u$ and $v$, $IsParallel(u, v)$ is a Boolean function that represents whether $u$ and $v$ might be executed in parallel within a workflow instance.

$IsParallel(u, v) = true \Leftrightarrow NCAB(u, v).Type = \text{"AND"} \land \lnot IsReachable(u, v) \land \lnot IsReachable(v, u)$

$IsParallel(u, v) = true$, denoted as $u \oplus v$, indicates that $u$ and $v$ might be executed in parallel, and $v$ is called a parallel activity of $u$.

Definition 3.8 (Exclusive Activities).

Given two vertices, $u$ and $v$, $IsExclusive(u, v)$ is a Boolean function that represents the XOR characteristic of $u$ and $v$: within a workflow instance, if $u$ is selected for execution then $v$ is not selected for execution, and vice versa.

$IsExclusive(u, v) = true \Leftrightarrow NCAB(u, v).Type = \text{"XOR"} \land \lnot IsReachable(u, v) \land \lnot IsReachable(v, u)$

$IsExclusive(u, v) = true$, denoted as $u \otimes v$, indicates that at most one of $u$ and $v$ can be selected for execution, and $v$ is called an exclusive activity of $u$.
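Definitions 3.7 and 3.8 differ only in the type of the nearest common ancestor block, so both can be expressed through one helper. In the sketch below the two callables `ncab_type` and `is_reachable` are supplied by the caller; they, the function name `execution_relation`, and the hard-coded example are assumptions made for illustration.

```python
def execution_relation(u, v, ncab_type, is_reachable):
    """Classify a pair of activities as ordered, parallel, exclusive, or other."""
    if is_reachable(u, v) or is_reachable(v, u):
        return "ordered"  # one activity precedes the other
    kind = ncab_type(u, v)
    if kind == "AND":
        return "parallel"   # IsParallel(u, v) holds
    if kind == "XOR":
        return "exclusive"  # IsExclusive(u, v) holds
    return "other"


# Invented block: split -> a -> join and split -> b -> join.
REACH = {("split", "a"), ("split", "b"), ("split", "join"),
         ("a", "join"), ("b", "join")}


def reach(x, y):
    return x == y or (x, y) in REACH


print(execution_relation("a", "b", lambda x, y: "AND", reach))
print(execution_relation("a", "b", lambda x, y: "XOR", reach))
print(execution_relation("split", "a", lambda x, y: "AND", reach))
```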

Definition 3.9 (Companion Activities).

Given two vertices, $u$ and $v$, $IsCompanion(u, v)$ is a Boolean function which indicates whether, if $u$ is selected for execution, then $v$ will always be selected for execution and vice versa.

$IsCompanion(u, v) = true \Leftrightarrow \begin{cases} \forall b \in (AncestorBlock(u) \cup AncestorBlock(v)) \setminus CAB(u, v) : b.type = \text{"AND"} & \text{if } IsReachable(u, v) \lor IsReachable(v, u) \\ \forall b \in \big((AncestorBlock(u) \cup AncestorBlock(v)) \setminus CAB(u, v)\big) \cup \{NCAB(u, v)\} : b.type = \text{"AND"} & \text{otherwise} \end{cases}$

$IsCompanion(u, v) = true$, denoted as $u \odot v$, indicates that either neither of $u$ and $v$ or both of them will be selected for execution, and $v$ is called a companion activity of $u$.

3.3. Artifact Flow Specification

Currently, as identified in [7], there are three major implementation models for artifact flow: explicit data flow, implicit data flow through control flow, and implicit data flow through a process data store. In this dissertation, we adopt the model of implicit data flow through a common process data store. The exchanges of artifacts between tasks are passed through global variables stored in a common database. In a workflow, some activities store their output artifacts in the database, and their following activities may access these artifacts later. The activities in our model are regarded as black boxes, i.e., their internal computations are not visible. Neither are the intermediate execution states. Thus, the artifact usages of an activity are identified through the inputs/outputs of the activity.

3.3.1. Artifacts and Artifact Operations

Artifacts are information entities involved in a process, including the input data to the process, the intermediate data produced within the process, and the final output data from the process. An artifact is an atomic data item (e.g. a number, a character string, or an image) or a collection of atomic data items (e.g. a document). Intuitively, all artifacts participating in a workflow execution must be pre-defined in process specifications. Each artifact contains a set of legal operations for its internal data. An activity designed to manipulate a certain artifact can work only with that artifact's legal operations. From the data storage point of view, every artifact operation can be regarded as one of the following operations, regardless of its semantic meaning.

Initialize: all definition operations, e.g. "fill in", "create", and "define" operations.

Read: all reference operations, e.g. "use", "fetch", "select", and "retrieve" operations.

Update: all modification operations, e.g. "write", "change", and "update" operations.

Destroy: all deletion operations, e.g. "remove", "erase", "cancel", and "discard" operations.

In general, an Initialize operation is used to create an artifact instance in a process. Read and Update operations are then used to access the instance. Finally, a Destroy operation is used to delete the artifact instance. Destroy operations apply to temporary artifacts created during workflow execution, but are not strictly required for all artifacts.

Figure 3.3 shows the state diagram of an artifact with the above four kinds of operations. There are four states: "Uninitialized", "Initialized", "Updated", and "Read". "Uninitialized" represents the initial state of an artifact. "Initialized", "Updated", and "Read" represent the states after an Initialize, Update, and Read operation is performed, respectively. In addition, the state of an artifact resets to "Uninitialized" after a Destroy operation.
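The four states and four operations can be read as a small state machine. The transition table below is a sketch under an explicit assumption: since Figure 3.3 is not reproduced here, it is assumed that Initialize is legal only in the Uninitialized state and that Read, Update, and Destroy are legal only once the artifact has been initialized. The names `TRANSITIONS` and `final_state` are hypothetical.

```python
LIVE_STATES = ("Initialized", "Updated", "Read")

# (current state, operation) -> next state; assumed, not taken from Figure 3.3.
TRANSITIONS = {("Uninitialized", "Initialize"): "Initialized"}
for live in LIVE_STATES:
    TRANSITIONS[(live, "Read")] = "Read"
    TRANSITIONS[(live, "Update")] = "Updated"
    TRANSITIONS[(live, "Destroy")] = "Uninitialized"


def final_state(operations, state="Uninitialized"):
    """Apply the operations in order and return the resulting state."""
    for op in operations:
        key = (state, op)
        if key not in TRANSITIONS:
            raise ValueError(f"{op} is not allowed in state {state}")
        state = TRANSITIONS[key]
    return state


print(final_state(["Initialize", "Read", "Update"]))
```

Under this assumed table, a Read applied to an artifact that was never initialized raises an error, which corresponds to the situation later described as a missing production anomaly.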


3.3.2. Artifact Flow and Artifact Usages

To simplify the discussion of artifact usages, a formal and complete definition of a task/control activity is given below:

Definition 3.10 (Task/Control Activities).

A task/control activity is a tuple $v = (AT_v, SC_v, EC_v, RC_v, I_v, O_v, AS_v)$, where

- $AT_v$ represents the type of the activity.

- $SC_v$, $EC_v$, and $RC_v$ are sets of logical expressions which are evaluated by a workflow engine.

- $SC_v$ is the set of pre-conditions, each of which is evaluated to decide whether an activity within a process instance can be started (only used by task activities).

- $EC_v$ is the set of post-conditions, each of which is evaluated to decide whether an activity within a process instance is completed (only used by task activities).

- $RC_v$ is the set of routing conditions, each of which is evaluated to decide the sequence of activity execution within a process (only used by control activities).

- $I_v$, the input set, identifies all the artifacts required to be accessed by the activity. For a task activity, $I_v$ contains all the artifacts required for computation. For a control activity, $I_v$ contains all the artifacts required for evaluating the routing conditions.

- $O_v$, the output set, identifies all the artifacts produced, updated, or destroyed after executing the activity. $O_v$ is divided into two disjoint subsets, $O_v^+$ and $O_v^-$, where $O_v^+$ represents the set of the artifacts initialized or updated by $v$ and $O_v^-$ represents the set of the artifacts destroyed by $v$.

Based on Definition 3.10, an artifact usage, which represents the relation between an activity and an artifact, is defined as follows:

Definition 3.11 (Consumer, Producer, Updator, and Destroyer Activities of an Artifact).

For a given artifact $d$, the memberships between artifact $d$ and $I_v$, $O_v^+$, and $O_v^-$ can be applied for identifying the usage of artifact $d$ at activity $v$. All the possible usages are categorized as follows:

- if $d \in I_v$, $d \notin O_v^+$, and $d \notin O_v^-$, $v$ is called a Reader (Activity) of artifact $d$.

- if $d \in I_v$ and $d \in O_v^+$, $v$ is called an Updator (Activity) of artifact $d$.

- if $d \in O_v^-$, whether $d \in I_v$ or $d \notin I_v$, $v$ is called a Destroyer (Activity) of artifact $d$.

- if $d \notin I_v$ and $d \in O_v^+$, $v$ is called a Producer (Activity) of artifact $d$.

- if $d \notin I_v$, $d \notin O_v^+$, and $d \notin O_v^-$, $v$ is called an Irrelevantor (Activity) of artifact $d$.

In addition, if $d \in I_v$, $v$ is generally called a Consumer (Activity) of artifact $d$, and if $d \in O_v^+$, $v$ is generally called a Writer (Activity) of artifact $d$.
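The case analysis of Definition 3.11 amounts to three membership tests. The sketch below is illustrative; the function name `usage_role` and its parameters are assumptions, and it relies on $O_v^+$ and $O_v^-$ being disjoint as stated in Definition 3.10.

```python
def usage_role(d, inputs, out_plus, out_minus):
    """Return the usage of artifact d at an activity.

    inputs, out_plus, and out_minus stand for I_v, O_v^+, and O_v^-.
    """
    if d in out_minus:
        return "Destroyer"  # holds whether or not d is also an input
    if d in out_plus:
        return "Updator" if d in inputs else "Producer"
    return "Reader" if d in inputs else "Irrelevantor"


def is_consumer(d, inputs):
    """General role: any activity with d in its input set."""
    return d in inputs


def is_writer(d, out_plus):
    """General role: any activity with d in O_v^+."""
    return d in out_plus


print(usage_role("form", {"form"}, {"form"}, set()))
```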

Definition 3.12 (Consumer, Updator, Destroyer and Producer Activity Sets for an Artifact).

- $V_d^{IsConsumer} = \{v \in V \mid d \in I_v\}$ is called the Consumer Activity Set of artifact $d$.

- $V_d^{IsUpdator} = \{v \in V \mid d \in I_v \text{ and } d \in O_v^+\}$ is called the Updator Activity Set of artifact $d$.

- $V_d^{IsDestroyer} = \{v \in V \mid d \in O_v^-\}$ is called the Destroyer Activity Set of artifact $d$.

- $V_d^{IsProducer} = \{v \in V \mid d \notin I_v \text{ and } d \in O_v^+\}$ is called the Producer Activity Set of artifact $d$.

Chapter 4.

Artifact Usage Anomalies

4.1. Artifact Usage Anomalies

In a process specification, some of the following three types of anomalies might occur: (1) Missing Production, (2) Redundant Write, and (3) Conflict Write. In the subsections, these anomalies are defined and the corresponding usage patterns that cause the anomalies are identified. Every usage pattern is given a name, description, and formulated detection conditions. Table 4.1 shows the symbols used in usage patterns.

Table 4.1. Symbols Used in Usage Patterns

$P_d$ : a producer ($d \notin I_v$ and $d \in O_v^+$)

$C_d$ : a consumer ($d \in I_v$)

$U_d$ : an updator ($d \in I_v$ and $d \in O_v^+$)

$W_d$ : a writer ($d \in O_v^+$)

$R_d$ : a reader ($d \in I_v$, $d \notin O_v^+$, and $d \notin O_v^-$)

$\twoheadrightarrow$ : reachable

$\neg P_d$ : no producer of $d$ exists

$\neg C_d$ : no consumer of $d$ exists

$\neg R_d$ : no reader of $d$ exists

( ) : a control block

( ) : XOR control block

( ) : AND control block

4.1.1. Missing Production Anomalies

A missing production anomaly occurs when an artifact is consumed before it is produced or after it is destroyed. Formally speaking, given an activity v and an artifact d such that v is a consumer of d, a missing production anomaly occurs if d is not produced or is destroyed when v is selected for execution. To formulate this type of anomaly, the propagation of an artifact is introduced in Definition 4.1.

Definition 4.1 (Propagation of Artifacts to an Activity).

Given an activity $v$, let a preceding execution order to $v$ denote an execution order leading to $v$ without parallel activities of $v$, i.e., only consisting of the predecessors of $v$. Given an artifact $d$, if there exists at least one preceding execution order to $v$ such that $d$ is produced but not destroyed (i.e., $d$ is not in the Uninitialized state), we say that $d$ can be propagated from the predecessors of $v$ to $v$. The propagation of artifact $d$ regarding only the preceding execution orders to $v$ is called the preceding propagation of $d$ to $v$ and can be classified into three cases: no preceding propagation, conditional preceding propagation, and unconditional preceding propagation.

No preceding propagation indicates that $d$ is Uninitialized for all preceding execution orders to $v$. Conditional preceding propagation indicates that whether $d$ is Uninitialized depends on the preceding execution order to $v$ that is taken. Unconditional preceding propagation denotes that $d$ is not Uninitialized for any preceding execution order to $v$.

Based on Definition 4.1, let $AA_v$ contain all the artifacts that can be propagated from the predecessors of v. $AA_v$ can be divided into two disjoint subsets, $AA_v^u$ and $AA_v^c$, where $AA_v^u$ contains the artifacts unconditionally propagated from the predecessors of v and $AA_v^c$ contains the artifacts conditionally propagated from the predecessors of v.

The causes of missing production anomalies can be classified into three categories: No Preceding Propagation, Conditional Preceding Propagation, and Uncertain Preceding Propagation. Intuitively, if $v \in V_d^{\mathit{IsConsumer}}$ and $d \notin AA_v$ hold, a missing production anomaly might occur due to No Preceding Propagation of d to v. Similarly, if $v \in V_d^{\mathit{IsConsumer}}$ and $d \in AA_v^c$ hold, a missing production anomaly might occur owing to Conditional Preceding Propagation of d to v. Furthermore, when the parallel activities of v are considered, even though $v \in V_d^{\mathit{IsConsumer}}$ and $d \in AA_v^u$ hold, a missing production anomaly might still occur due to Uncertain Preceding Propagation of d to v.

For each cause of the missing production anomaly, the possible usage patterns are characterized by name, description, and required conditions as follows:

(1). No Preceding Propagation: $v \in V_d^{\mathit{IsConsumer}} \wedge d \notin AA_v$

Usage Pattern 1: $\overline{P_d} \to C_d \to \overline{P_d}$

Name: No Production

Description: Artifact d has at least one consumer activity v; however, no producer activity of d exists in the process.

Conditions: $\exists v \in V_d^{\mathit{IsConsumer}} \wedge V_d^{\mathit{IsProducer}} = \emptyset$

Usage Pattern 2: $\overline{P_d} \to C_d \to P_d \to$

Name: Delayed Production

Description: Artifact d has a consumer activity v which precedes every producer activity of d.

Conditions: $\exists v \in V_d^{\mathit{IsConsumer}} \wedge (V_v^{\mathit{IsPredecessor}} \cap V_d^{\mathit{IsProducer}}) = \emptyset \wedge (V_v^{\mathit{IsSuccessor}} \cap V_d^{\mathit{IsProducer}}) \neq \emptyset$

Usage Pattern 3: $\to P_d \to D_d \to C_d \to$

Name: Early Destruction

Description: Artifact d is produced and then destroyed before it is consumed.

Conditions: $\exists v \in V_d^{\mathit{IsConsumer}} \wedge d \notin AA_v \wedge (V_v^{\mathit{IsPredecessor}} \cap V_d^{\mathit{IsProducer}} \cap V_d^{\mathit{IsDestroyer}}) \neq \emptyset$

Usage Pattern 4: $\overline{P_d} \to (C_d\ \mathrm{XOR}\ P_d) \to$

Name: Exclusive Production

Description: Given two exclusive activities v and u such that v is a consumer of artifact d and u is a producer of d. Because the activities are exclusive, only one of v and u is selected for execution. Although u is a producer of d, it makes no contribution to the propagation of d to v, and thus a missing production anomaly occurs if artifact d cannot be propagated from the predecessors of v.

Conditions: $\exists v \in V_d^{\mathit{IsConsumer}} \wedge d \notin AA_v \wedge (V_v^{\mathit{IsExclusive}} \cap V_d^{\mathit{IsProducer}}) \neq \emptyset$

Usage Pattern 5: $\overline{P_d} \to (C_d\ \mathrm{AND}\ P_d) \to$

Name: Uncertain Production

Description: Given two parallel activities v and u such that v is a consumer of artifact d and u is a producer of d. Due to the race hazard of parallel activities, v might be executed before u. Therefore, u may not contribute to the propagation of d to v, and consequently a missing production anomaly occurs if artifact d cannot be propagated from the predecessors of v.

Conditions: $\exists v \in V_d^{\mathit{IsConsumer}} \wedge d \notin AA_v \wedge (V_v^{\mathit{IsParallel}} \cap V_d^{\mathit{IsProducer}}) \neq \emptyset$

(2). Conditional Preceding Propagation: $v \in V_d^{\mathit{IsConsumer}} \wedge d \in AA_v^c$

Whether d is propagated depends on the preceding path to v that is taken. Consequently, a missing production anomaly occurs whenever a preceding path to v on which d is not propagated is taken.

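Usage Pattern 6: $\overline{P_d} \to (P_d\ \mathrm{XOR}\ \overline{P_d}) \to C_d \to$

Name: Conditional Production

Description: Artifact d is produced conditionally before a consumer activity of d.

Conditions: $\exists v \in V_d^{\mathit{IsConsumer}} \wedge d \in AA_v^c$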

Usage Pattern 7: $\to P_d \to (D_d\ \mathrm{XOR}\ \overline{D_d}) \to C_d \to$

Name: Conditional Destruction

Description: Artifact d is destroyed conditionally before a consumer activity of d.

Conditions: $\exists v \in V_d^{\mathit{IsConsumer}} \wedge d \in AA_v^c$

(3). Uncertain Preceding Propagation: $v \in V_d^{\mathit{IsConsumer}} \wedge d \in AA_v^u$

Usage Pattern 8: $\to P_d \to (D_d\ \mathrm{AND}\ C_d) \to$

Name: Uncertain Destruction

Description: Given two parallel activities v and u such that v is a consumer of artifact d and u is a destroyer of d. Due to the race hazard of parallel activities, u might be executed before v. Therefore, even though d is unconditionally propagated from the predecessors of v, d might be destroyed by u before v is executed, and a missing production anomaly occurs.

Conditions: $\exists v \in V_d^{\mathit{IsConsumer}} \wedge d \in AA_v^u \wedge (V_v^{\mathit{IsParallel}} \cap V_d^{\mathit{IsDestroyer}}) \neq \emptyset$

Theorem 1 (Missing Production Verification).

A process BP is free from missing production anomalies if the following condition holds: $\forall v \in V, \forall d \in I_v$: $d \in AA_v^u$ and $(V_v^{\mathit{IsParallel}} \cap V_d^{\mathit{IsDestroyer}}) = \emptyset$.

Proof: This theorem is proved by contradiction. Suppose that there exists a missing production anomaly in BP. Then there exist an activity $v \in V$, an artifact $d \in I_v$, and an execution order $\Gamma$ such that $v \in \Gamma$ and d is Uninitialized when v is selected for execution. However, $d \in AA_v^u$ implies that d is always propagated from the predecessors of v. Furthermore, $(V_v^{\mathit{IsParallel}} \cap V_d^{\mathit{IsDestroyer}}) = \emptyset$ implies that no parallel activity of v affects the propagation of d from the predecessors of v. Thus, d is always propagated to v regardless of the execution order leading to v; that is, $\Gamma$ does not exist. This contradicts the hypothesis, and thus Theorem 1 holds.
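The condition of Theorem 1 can be checked mechanically once the unconditionally propagated sets and the parallel relation are known. The following Python fragment is a minimal sketch under stated assumptions: the `Activity` class, the `aa_u` mapping (activity name to $AA_v^u$), and the `parallel` mapping (activity name to the names of its parallel activities) are hypothetical inputs chosen for illustration and are not part of the proposed model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    inputs: frozenset = frozenset()     # I_v: artifacts consumed by v
    destroyed: frozenset = frozenset()  # O_v^-: artifacts destroyed by v


def satisfies_theorem_1(activities, aa_u, parallel):
    """True if every consumed artifact d is in AA_v^u and no parallel
    activity of v destroys d (the sufficient condition of Theorem 1)."""
    by_name = {a.name: a for a in activities}
    for v in activities:
        for d in v.inputs:
            if d not in aa_u.get(v.name, set()):
                return False
            if any(d in by_name[u].destroyed for u in parallel.get(v.name, set())):
                return False
    return True


producer = Activity("produce")
consumer = Activity("consume", inputs=frozenset({"order"}))
cleaner = Activity("clean", destroyed=frozenset({"order"}))
aa_u = {"produce": set(), "consume": {"order"}, "clean": {"order"}}

# Sequential producer and consumer: the condition holds.
print(satisfies_theorem_1([producer, consumer], aa_u, {}))
# A destroyer running in parallel with the consumer violates it.
print(satisfies_theorem_1(
    [producer, consumer, cleaner], aa_u,
    {"consume": {"clean"}, "clean": {"consume"}}))
```

The second call corresponds to Usage Pattern 8 (Uncertain Destruction): the artifact is unconditionally propagated, yet a parallel destroyer exists.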

4.1.2. Redundant Write Anomalies

A redundant write anomaly occurs when an artifact is written (produced or updated) by an activity but the artifact is neither required by succeeding activities nor a member of the process outputs. Redundancy is not an error; nevertheless, it causes inefficiency. To formulate this type of anomaly, the set of artifacts unused before an activity is introduced in Definition 4.2.

Definition 4.2 (The Set of Artifacts Unused before an Activity).

Given an activity v and an artifact d, if there exists at least one preceding execution order to v such that d is written but not consumed when v is selected for execution, d is called unused for the predecessors of v, or simply unused before v. Intuitively, if artifact d is unused for the predecessors of the Process End vertex and is not a member of the set of process outputs, a redundant write anomaly occurs. There are two cases: completely unused and conditionally unused. Completely unused indicates that d is unused for all preceding paths to v. Conditionally unused indicates that whether d is unused depends on the preceding path to v that is taken.

Let $NC_v$ contain all the artifacts unused for the predecessors of v. $NC_v$ can be divided into two disjoint subsets, $NC_v^u$ and $NC_v^c$. $NC_v^u$ contains the artifacts which are completely unused, and $NC_v^c$ contains the artifacts which are conditionally unused.

Based on Definition 4.2, redundant update anomalies can be classified into two categories: Explicit Redundant Update and Potential Redundant Update. For every artifact $d \in NC^u_{\mathit{ProcessEnd}}$ such that $d \notin O_w$, a redundant update anomaly always occurs for artifact d of the process. Similarly, for every artifact $d \in NC^c_{\mathit{ProcessEnd}}$ such that $d \notin O_w$, a redundant update anomaly might occur for artifact d depending on the execution paths taken.

For each category of the redundant write anomaly, the possible usage patterns are characterized by name, description, and required conditions as follows:

(1). Explicit Redundant Update

Usage Pattern 9: $\to W_d \to \overline{C_d}$

Name: No Consumption After Last Write

Description: For an artifact d not belonging to the process outputs, when d is written by an activity v and the artifact is unused for all succeeding activities of v, a redundant update always occurs for the artifact.

Conditions: $\exists d \in NC^u_{\mathit{ProcessEnd}} : d \notin O_w$

(2). Potential Redundant Update

Usage Pattern 10: $\to W_d \to (C_d\ \mathrm{XOR}\ \overline{C_d}) \to$

Name: Conditional Consumption After Last Write

Description: For an artifact d not belonging to the process outputs, when d is written by an activity v and the artifact is conditionally unused for some succeeding activities of v, a redundant update might occur.

Conditions: $\exists d \in NC^c_{\mathit{ProcessEnd}} : d \notin O_w$

Theorem 2 (Redundant Write Verification).

A process BP is free from redundant write anomalies if $NC_{\mathit{ProcessEnd}} \setminus O_w = \emptyset$ holds.

Proof: $NC_{\mathit{ProcessEnd}} \setminus O_w = \emptyset$ indicates that every artifact d is either a process output ($d \in O_w$) or is read after its last write for all possible (preceding) execution orders leading to the Process End vertex ($d \notin NC_{\mathit{ProcessEnd}}$). Therefore, no redundant write anomaly exists if $NC_{\mathit{ProcessEnd}} \setminus O_w = \emptyset$ holds.

4.1.3. Conflict Write Anomalies

A multiple parallel productions anomaly occurs when more than one activity tries to initialize the same artifact in parallel. When this anomaly occurs, different versions of an artifact will exist.

A conflict update anomaly occurs when more than one activity in parallel updates the same artifact.

Usage Pattern 11: $\to (P_d\ \mathrm{AND}\ P_d) \to$

Name: Multiple Parallel Productions

Description: More than one activity initializes the same artifact in parallel.

Conditions: $\exists v \in V_d^{\mathit{IsProducer}} \wedge u \in (V_d^{\mathit{IsProducer}} \cap V_v^{\mathit{IsParallel}})$

Usage Pattern 12: $\to (U_d\ \mathrm{AND}\ U_d) \to$

Name: Multiple Parallel Updates

Description: More than one activity updates the same artifact in parallel.

Conditions: $\exists v \in V_d^{\mathit{IsUpdator}} \wedge u \in (V_d^{\mathit{IsUpdator}} \cap V_v^{\mathit{IsParallel}})$

Usage Pattern 13: $\to (R_d\ \mathrm{AND}\ U_d) \to$

Name: Parallel Read and Update

Description: Two activities perform read and update respectively on the same artifact concurrently.

Conditions: $\exists v \in (V_d^{\mathit{IsReader}}) \wedge u \in (V_d^{\mathit{IsUpdator}} \cap V_v^{\mathit{IsParallel}})$

Theorem 3 (Conflict Writes Verification).

A process BP is free from conflict write anomalies if, for any two parallel activities v and u, $(O_v^+ \setminus I_v) \cap (O_u^+ \setminus I_u) = \emptyset$, $(O_v^+ \cap I_v) \cap (O_u^+ \cap I_u) = \emptyset$, $I_v \cap O_u^+ = \emptyset$, and $I_u \cap O_v^+ = \emptyset$ hold.

Proof: If $(O_v^+ \setminus I_v) \cap (O_u^+ \setminus I_u) = \emptyset$ for any two parallel activities v and u, then no two activities initialize the same artifact in parallel. If $(O_v^+ \cap I_v) \cap (O_u^+ \cap I_u) = \emptyset$, then no two activities update the same artifact in parallel. Furthermore, $I_v \cap O_u^+ = \emptyset$ and $I_u \cap O_v^+ = \emptyset$ indicate that no two activities perform read and update respectively on the same artifact. Thus, BP is free from conflict write anomalies.
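The four set conditions of Theorem 3 translate directly into set operations on one pair of parallel activities. The sketch below is illustrative only: the function names and the convention of passing $I$ and $O^+$ as plain Python sets are assumptions, and the caller is assumed to supply only pairs that are actually parallel.

```python
def conflict_writes(i_v, o_v, i_u, o_u):
    """Evaluate the conditions of Theorem 3 for two parallel activities.

    i_* is I (inputs) and o_* is O^+ (artifacts produced or updated).
    Each returned set lists the artifacts violating one condition; the
    third entry combines the two symmetric read/update conditions.
    """
    return {
        "Multiple Parallel Productions": (o_v - i_v) & (o_u - i_u),
        "Multiple Parallel Updates": (o_v & i_v) & (o_u & i_u),
        "Parallel Read and Update": (i_v & o_u) | (i_u & o_v),
    }


def is_conflict_free(i_v, o_v, i_u, o_u):
    """True if the pair satisfies all conditions of Theorem 3."""
    return not any(conflict_writes(i_v, o_v, i_u, o_u).values())


# v only reads "invoice" while u updates it in parallel.
print(conflict_writes({"invoice"}, set(), {"invoice"}, {"invoice"}))
# Two parallel activities touching disjoint artifacts.
print(is_conflict_free({"a"}, {"b"}, {"c"}, {"d"}))
```

Because an updator also reads the artifact it updates, a pair of parallel updators violates the read/update conditions as well as the update condition, so the three result sets may overlap.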

4.1.4. Summary of Usage Patterns Causing Artifact Usage Anomalies

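Table 4.2: Summary of Usage Patterns Causing Artifact Usage Anomalies

| Type | Case | Pattern |
|---|---|---|
| Missing Production | No Production | $\overline{P_d} \to C_d \to \overline{P_d}$ |
| Missing Production | Delayed Production | $\overline{P_d} \to C_d \to P_d \to$ |
| Missing Production | Early Destruction | $\to P_d \to D_d \to C_d \to$ |
| Missing Production | Exclusive Production | $\overline{P_d} \to (C_d\ \mathrm{XOR}\ P_d) \to$ |
| Missing Production | Conditional Production | $\overline{P_d} \to (P_d\ \mathrm{XOR}\ \overline{P_d}) \to C_d \to$ |
| Missing Production | Conditional Destruction | $\to P_d \to (D_d\ \mathrm{XOR}\ \overline{D_d}) \to C_d \to$ |
| Missing Production | Uncertain Production | $\overline{P_d} \to (C_d\ \mathrm{AND}\ P_d) \to$ |
| Missing Production | Uncertain Destruction | $\to P_d \to (D_d\ \mathrm{AND}\ C_d) \to$ |
| Redundant Write | No Consumption After Last Write | $\to W_d \to \overline{C_d}$ |
| Redundant Write | Conditional Consumption After Last Write | $\to W_d \to (C_d\ \mathrm{XOR}\ \overline{C_d}) \to$ |
| Conflict Write | Multiple Parallel Productions | $\to (P_d\ \mathrm{AND}\ P_d) \to$ |
| Conflict Write | Multiple Parallel Updates | $\to (U_d\ \mathrm{AND}\ U_d) \to$ |
| Conflict Write | Parallel Read and Update | $\to (R_d\ \mathrm{AND}\ U_d) \to$ |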

Chapter 5.

Algorithms for Detecting Artifact Usage Anomalies

This chapter presents a solution for detecting artifact usage anomalies in a process specification. To simplify the discussion, the solution is divided into two algorithms: a traversal algorithm and a detection algorithm. The traversal algorithm is applied first; it transforms the control graph of a process into a structure that simplifies the presentation of the detection algorithm. The detection algorithm for artifact usage anomalies is then applied to the transformed structure.

5.1.

The Traversal Algorithm

Viewed from the top level, a well-formed control flow can be regarded as a sequence of task activities and top-level control blocks. Thus, an entire process can be regarded as a sequence of nodes, where each node represents either a task activity or a control block. The same perspective can be applied to the branches of a control block. Based on this perspective, a control flow graph can be recursively transformed into a sequence of nodes.

Thus, for an input process schema, the traversal algorithm begins by traversing the main sequence enclosed by the start vertex and the end vertex of the process. The traversal algorithm is recursively applied until every task activity and control block in each level are processed.

Besides, the traversal algorithm also transforms each iteration control block into a corresponding XOR control block during the analysis of artifact usage anomalies. Figure 5.1 and Figure 5.2 show the transformation of a loop with at-least-once iteration and zero iteration respectively.

Figure 5.1: Transform a Repeat-Until Loop.

Figure 5.2: Transform a While Loop.

Algorithm ControlGraphToSequence(G, v, level) {
//Input:  G=(V,E): a directed connected graph
//        v: a vertex of G representing the next vertex to traverse
//        level: nested level
//Output: S: a structure containing a sequence of nodes (a node can represent a task or a control block)
//        S.startVertex: corresponding vertex in G for the beginning of S
//        S.endVertex: corresponding vertex in G for the end of S
//        S.nodes: nodes collection (an ordered set of nodes)

  sequence.startVertex=currentVertex=v;
  sequence.level=level;
  while (currentVertex!=null) {
    currentVertex.ownerSequence=sequence;
    switch (currentVertex.type) {
      case "Task":
        sequence.nodes.append(currentVertex);
        nextVertex=currentVertex.next;
        break;
      case "ProcessStart", "AndSplit", "XorSplit", "LoopStart":
        if (currentVertex.type=="LoopStart") {
          //Transform loop to corresponding XOR control block
          //based on Figure 5.1 and 5.2
        }
        newNode.type=currentVertex.type;
        newNode.startVertex=currentVertex;
        for each edge (currentVertex, w)∈E {
          //recursively transform every branch within a control block
          subSequence=ControlGraphToSequence(G, w, level+1);
          subSequence.parentBlock=newNode;
          //collect every subSequence (corresponding to each branch)
          newNode.subSequences.append(subSequence);
        }
        newNode.endVertex=subSequence.endVertex.next;
        sequence.nodes.append(newNode);
        nextVertex=newNode.endVertex.next;
        break;
      default: //any other vertex type (a join or end vertex) closes the current sequence
        exit while;
    }
    previousVertex=currentVertex; //remember last traversed vertex
    currentVertex=nextVertex; //continue to traverse next node
  }
  sequence.endVertex=previousVertex;
  return sequence;
}
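To make the recursion concrete, the traversal can be sketched in Python on a small control graph. This is a simplified illustration and not the algorithm above verbatim: the `Vertex`, `Sequence`, and `Block` classes and the vertex type names `AndJoin` and `XorJoin` used in the example graph are assumptions, loops are taken to be already transformed into XOR blocks, and every branch is assumed to contain at least one node.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Vertex:
    type: str                                  # "Task", "AndSplit", "XorSplit", or a join type
    next: list = field(default_factory=list)   # names of the direct successors


@dataclass
class Sequence:
    level: int
    nodes: list = field(default_factory=list)  # task names or Block objects, in order
    end: Optional[str] = None                  # last vertex covered by this sequence


@dataclass
class Block:
    type: str
    start: str
    end: Optional[str] = None                  # the matching join vertex
    sub_sequences: list = field(default_factory=list)


def to_sequence(graph, v, level=0):
    """Recursively turn a well-formed control graph into nested sequences."""
    seq = Sequence(level=level)
    current, previous = v, None
    while current is not None:
        vertex = graph[current]
        if vertex.type == "Task":
            seq.nodes.append(current)
            previous = current
            current = vertex.next[0] if vertex.next else None
        elif vertex.type in ("AndSplit", "XorSplit"):
            block = Block(type=vertex.type, start=current)
            for w in vertex.next:              # one sub-sequence per branch
                sub = to_sequence(graph, w, level + 1)
                block.sub_sequences.append(sub)
            block.end = graph[sub.end].next[0]  # branch ends just before the join
            seq.nodes.append(block)
            previous = block.end
            join = graph[block.end]
            current = join.next[0] if join.next else None
        else:                                  # a join vertex closes the current branch
            break
    seq.end = previous
    return seq


graph = {
    "a": Vertex("Task", ["s"]),
    "s": Vertex("XorSplit", ["b", "c"]),
    "b": Vertex("Task", ["j"]),
    "c": Vertex("Task", ["j"]),
    "j": Vertex("XorJoin", ["e"]),
    "e": Vertex("Task", []),
}
top = to_sequence(graph, "a")
print([n if isinstance(n, str) else n.type for n in top.nodes])
```

For this graph the top-level sequence holds the task `a`, one XOR block with two single-task branches, and the task `e`.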

5.2.

The Detection Algorithm

The detection algorithm is subdivided into several sub-algorithms described in subsections.

Algorithm AnalyzeProcess(G, D, I_W, O_W) {
  S = ControlGraphToSequence(G);
  //O_{S.startVertex} = I_W;
  //I_{S.endVertex} = O_W;
  DetectMissingProduction(S, I_W);
  DetectRedundantWrite(S, I_W, O_W);
  DetectConflictWrites(S, ∅);
}

5.2.1. Method for Detecting Missing Production Anomalies

5.2.1.1. Calculation of Propagated Artifacts from the Predecessors

Given a sequence S of the input process schema and an activity v of S, let $S.AA_v$ denote the set of artifacts propagated from the predecessors of v and $S.{AA'}_v$ be the set of artifacts that can be propagated to the direct successors of v after the execution of v. Initially, $S.AA_{S.startVertex} = I_w$ if S is the top-level sequence. During the traversal of the sequence S, $S.AA$ is calculated after every traversed node n as follows.

If n represents a task activity v, v has only one direct successor x. $S.{AA'}_v$ and $S.AA_x$ are calculated as follows:

For every destroyed artifact $d \in O_v^-$, remove d from $S.AA_v^u$ and $S.AA_v^c$.

For every produced artifact $d \in (O_v^+ \setminus I_v)$, add d to $S.AA_v^u$ and remove d from $S.AA_v^c$.

$$S.AA_x^u = S.{AA'}_v^u = (S.AA_v^u \setminus O_v^-) \cup (O_v^+ \setminus I_v)$$

$$S.AA_x^c = S.{AA'}_v^c = S.AA_v^c \setminus O_v^- \setminus (O_v^+ \setminus I_v)$$
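The update rule for a task activity amounts to two set expressions. A minimal Python sketch, assuming the artifact sets are passed as plain sets and using a hypothetical function name:

```python
def update_propagated(aa_u, aa_c, i_v, o_plus_v, o_minus_v):
    """Return (AA'^u, AA'^c) after executing a task activity v.

    Destroyed artifacts (O_v^-) leave both sets; newly produced
    artifacts (O_v^+ \\ I_v) become unconditionally propagated.
    """
    produced = o_plus_v - i_v
    new_u = (aa_u - o_minus_v) | produced
    new_c = (aa_c - o_minus_v) - produced
    return new_u, new_c


# v consumes and destroys "a" and produces "b", which was only
# conditionally available before v.
print(update_propagated({"a"}, {"b", "c"}, {"a"}, {"b"}, {"a"}))
```

After the call, `b` has moved from the conditional set to the unconditional set and `a` is no longer propagated.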

If n represents a control block with subsequences $SS = \{SS_i \mid 1 \le i \le k\}$, every vertex within the block is recursively traversed as follows:

n.startVertex, the start vertex of the control block, is traversed first. $S.{AA'}_{n.startVertex} = S.AA_{n.startVertex}$ since n.startVertex is a control node.

For every subsequence $SS_i$, the same traversal algorithm is recursively applied to calculate each $SS_i.AA$.

n.endVertex is traversed last, and each $SS_i.AA$ is merged according to the type of the control block.

If n is an XOR control block,

$$S.AA_{n.endVertex}^u = \bigcap_{i=1}^{k} SS_i.AA_{SS_i.endVertex}^u$$

$$S.AA_{n.endVertex}^c = \bigcup_{i=1}^{k} SS_i.AA_{SS_i.endVertex} \setminus S.AA_{n.endVertex}^u$$

$$S.{AA'}_{n.endVertex} = S.AA_{n.endVertex}$$

If n is an AND control block,

$$S.AA_{n.endVertex}^u = \bigcup_{i=1}^{k} SS_i.AA_{SS_i.endVertex}^u \setminus \bigcup_{i=1}^{k} (SS_i.O^- \setminus SS_i.AA_{SS_i.endVertex}^u)$$

$$S.AA_{n.endVertex}^c = \bigcup_{i=1}^{k} SS_i.AA_{SS_i.endVertex} \setminus S.AA_{n.endVertex}^u$$

$$S.{AA'}_{n.endVertex} = S.AA_{n.endVertex}$$
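The merge step at the end vertex of a control block can be sketched as follows. The sketch assumes that an XOR join intersects the unconditional sets of its branches, that an AND join unions them and then removes artifacts destroyed in a branch without being re-produced there, and that everything else reaching the join is conditional. The function names and the tuple layout of `branches` are hypothetical.

```python
def merge_xor(branches):
    """branches: list of (AA^u, AA^c) pairs, one per XOR branch end."""
    aa_u = set.intersection(*(set(u) for u, _ in branches))
    everything = set().union(*(u | c for u, c in branches))
    return aa_u, everything - aa_u


def merge_and(branches):
    """branches: list of (AA^u, AA^c, O^-) triples, one per AND branch end."""
    produced = set().union(*(u for u, _, _ in branches))
    # destroyed in a branch and not unconditionally available at its end
    lost = set().union(*(destroyed - u for u, _, destroyed in branches))
    aa_u = produced - lost
    everything = set().union(*(u | c for u, c, _ in branches))
    return aa_u, everything - aa_u


# XOR: "b" survives on one branch only, so it becomes conditional.
print(merge_xor([({"a", "b"}, set()), ({"a"}, {"x"})]))
# AND: one branch produces "p", the other destroys "b".
print(merge_and([({"a", "b", "p"}, set(), set()), ({"a"}, set(), {"b"})]))
```

In the AND example `p` is unconditionally available after the join because its branch always executes, while `b` drops out of the unconditional set because a parallel branch destroys it.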

5.2.1.2. Rules for Detecting Missing Production Anomalies

No Propagation

When visiting an activity v such that $I_v \cup O_v^- \neq \emptyset$, if $MA_v^u = (I_v \cup O_v^-) \setminus AA_v \neq \emptyset$, a missing production anomaly occurs for each artifact $d \in MA_v^u$ due to No Propagation.

Conditional Propagation

When visiting an activity v such that $I_v \cup O_v^- \neq \emptyset$, if $MA_v^c = (I_v \cup O_v^-) \cap AA_v^c \neq \emptyset$, a missing production anomaly occurs for each artifact $d \in MA_v^c$ due to Conditional Propagation.

Uncertain Propagation

For an AND control block with subsequences $SS = \{SS_i \mid 1 \le i \le k, k \ge 2\}$, before merging $SS_i.AA_{SS_i.endVertex}$ from every subsequence, if $\exists i,j : 1 \le i,j \le k \wedge i \neq j \wedge (UP_{SS_i,SS_j} = SS_i.I \cap SS_j.O^-) \neq \emptyset$, a missing production anomaly occurs for each artifact $d \in UP_{SS_i,SS_j}$ due to Uncertain Propagation.
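The three detection rules reduce to a few set operations. The following Python sketch is illustrative: the function names are hypothetical, $AA_v$ is passed as its two parts $AA_v^u$ and $AA_v^c$, and each AND branch is summarized by the pair $(SS_i.I, SS_i.O^-)$.

```python
def check_missing_production(i_v, o_minus_v, aa_u, aa_c):
    """Return (MA^u, MA^c) for an activity v.

    MA^u: artifacts v needs that are never propagated (No Propagation).
    MA^c: artifacts v needs that are only conditionally propagated.
    """
    needed = i_v | o_minus_v
    ma_u = needed - (aa_u | aa_c)
    ma_c = needed & aa_c
    return ma_u, ma_c


def check_uncertain_propagation(branches):
    """branches: list of (I, O^-) pairs, one per branch of an AND block.

    Returns {(i, j): UP} for every ordered pair of distinct branches where
    branch i consumes an artifact that branch j destroys.
    """
    findings = {}
    for i, (inputs_i, _) in enumerate(branches):
        for j, (_, destroyed_j) in enumerate(branches):
            if i != j and inputs_i & destroyed_j:
                findings[(i, j)] = inputs_i & destroyed_j
    return findings


print(check_missing_production({"a", "b", "c"}, set(), {"a"}, {"b"}))
print(check_uncertain_propagation([({"d"}, set()), (set(), {"d"})]))
```

The first call reports `c` as never propagated and `b` as conditionally propagated; the second reports that branch 0 consumes `d` while branch 1 may destroy it first.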

5.2.1.3. Algorithm to Detect Missing Production Anomalies

Algorithm DetectMissingProduction(S, AA_P) {
//Input: S: a structure containing a sequence of nodes (of which each is a task or a control block)
//       AA_P: the set of artifacts propagated from preceding nodes.

  S.AA = AA_P;   //the set of artifacts propagated from preceding nodes.
  S.I = ∅;       //the set of artifacts consumed by activities of this sequence.
  S.O^- = ∅;     //the set of artifacts destroyed by activities of this sequence.
  for each n ∈ S { //process every task or control block
    switch (n.type) {
      case Task: //a task activity
        v = n.startVertex;
        CheckMissingProduction(v, S.AA); //check missing production on v
        //Calculate the set of artifacts that can be propagated to the successors of v.
        S.AA = UpdatePropagatedArtifactSet(v, S.AA);
        S.I = S.I ∪ I_v;         //update the set of artifacts consumed after execution of v.
        S.O^- = S.O^- ∪ O_v^-;   //update the set of artifacts destroyed after execution of v.
        break;
      default: //control blocks
        CheckMissingProduction(n.startVertex, S.AA); //check missing production on n.startVertex
        SS = n.subSequences; //the set of subsequences of the control block.
        for each SS_i ∈ SS {
          //recursively apply the algorithm on every subsequence by passing S.AA as an argument
          DetectMissingProduction(SS_i, S.AA);
        }
        //check uncertain destruction before merging
        CheckUncertainDestruction(n, SS);
        //merge artifact sets propagated from all subsequences
        S.AA = MergePropagatedArtifactSets(n, SS);
    }
  }
}

Algorithm CheckMissingProduction(v, AA) {
  if ((I_v ∪ O_v^-) ≠ ∅) {
    MA^u = (I_v ∪ O_v^-) \ AA;
    MA^c = (I_v ∪ O_v^-) ∩ AA^c;
    for each d ∈ MA^u
      print "a missing production occurs on v due to no propagation of artifact d";
    for each d ∈ MA^c
      print "a missing production occurs on v due to conditional propagation of artifact d";
  }
}

Algorithm UpdatePropagatedArtifactSet(v, AA) {
  //after traversing v, update the set of propagated artifacts.
  //remove artifacts destroyed and add artifacts produced by v.
  AA_v^u = (AA^u \ O_v^-) ∪ (O_v^+ \ I_v);
  AA_v^c = AA^c \ O_v^- \ (O_v^+ \ I_v);
  return AA_v;
}

Algorithm MergePropagatedArtifactSets(n, SS) {
  if (n.endVertex.type=="XorJoin") {
    AA^u_{n.endVertex} = ⋂_{i=1..|SS|} SS_i.AA^u;
    AA^c_{n.endVertex} = ⋃_{i=1..|SS|} SS_i.AA \ AA^u_{n.endVertex};
  } else {
    AA^u_{n.endVertex} = ⋃_{i=1..|SS|} SS_i.AA^u \ ⋃_{i=1..|SS|} (SS_i.O^- \ SS_i.AA^u);
    AA^c_{n.endVertex} = ⋃_{i=1..|SS|} SS_i.AA \ AA^u_{n.endVertex};
  }
  AA_{n.endVertex} = AA^u_{n.endVertex} ∪ AA^c_{n.endVertex};
  return AA_{n.endVertex};
}

Algorithm CheckUncertainDestruction(n, SS) {
  for each SS_i ∈ SS {
    for each SS_j ∈ SS and SS_j ≠ SS_i {
      UP_{SS_i,SS_j} = SS_i.I ∩ SS_j.O^-;
      for each d ∈ UP_{SS_i,SS_j} {
        print "A missing production anomaly for artifact d might occur due to uncertain production on parallel branches (SS_i, SS_j).";
      }
    }
  }
}

5.2.2. Method for Detecting Redundant Production/Update Anomalies

5.2.2.1. Calculation of Redundant Production/Update

Given a sequence S of the input process schema and an activity v of S, let $S.NC_v$ denote the set of artifacts unused before v and $S.{NC'}_v$ denote the set of artifacts unused after executing v. During the traversal of the sequence S, $S.NC$ is calculated on every traversed node n as follows.

If n represents a task activity v,

$$S.{NC'}_v^u = (S.NC_v^u \setminus I_v \setminus O_v^-) \cup O_v^+$$

$$S.{NC'}_v^c = S.NC_v^c \setminus I_v \setminus O_v^- \setminus O_v^+$$

For every read or destroyed artifact ($d \in I_v$ or $d \in O_v^-$), remove d from $NC_v^u$ and $NC_v^c$. For every produced or updated artifact $d \in O_v^+$, add d to $NC_v^u$ and remove d from $NC_v^c$.
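The bookkeeping of unused artifacts across one task activity can be sketched in Python as follows; the function name and the use of plain sets are assumptions made for illustration.

```python
def update_unused(nc_u, nc_c, i_v, o_plus_v, o_minus_v):
    """Return (NC'^u, NC'^c) after executing a task activity v.

    Reading or destroying an artifact clears its "unused" mark; writing it
    (O_v^+) marks it as completely unused again.
    """
    new_u = (nc_u - i_v - o_minus_v) | o_plus_v
    new_c = nc_c - i_v - o_minus_v - o_plus_v
    return new_u, new_c


# v reads "x" and writes "y"; "z" stays conditionally unused.
print(update_unused({"x"}, {"y", "z"}, {"x"}, {"y"}, set()))
```

An updator both reads and writes its artifact, so the artifact is first cleared and then marked unused again by the same activity.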

If n represents a control block with subsequences $SS = \{SS_i \mid 1 \le i \le k\}$, the same algorithm is recursively applied to calculate each $SS_i.NC$, and the results are then merged according to the type of the control block.

If n is an XOR control block,

$$S.NC_{n.endVertex}^u = \bigcap_{i=1}^{k} SS_i.NC_{SS_i.endVertex}^u$$

$$S.NC_{n.endVertex}^c = \bigcup_{i=1}^{k} SS_i.NC_{SS_i.endVertex} \setminus S.NC_{n.endVertex}^u$$

$$S.{NC'}_{n.endVertex} = S.NC_{n.endVertex}$$

If n is an AND control block,

$$S.NC_{n.endVertex}^u = \Big[ \bigcap_{i=1}^{k} SS_i.NC_{SS_i.endVertex}^u \cup \bigcup_{i=1}^{k} \big( SS_i.NC_{SS_i.endVertex}^u \setminus S.NC_{n.startVertex}^u \big) \Big] \setminus \bigcup_{i=1}^{k} \big( (SS_i.I \cup SS_i.O^-) \setminus SS_i.NC_{SS_i.endVertex}^u \big)$$

$$S.NC_{n.endVertex}^c = \bigcup_{i=1}^{k} SS_i.NC_{SS_i.endVertex} \setminus S.NC_{n.endVertex}^u$$

$$S.{NC'}_{n.endVertex} = S.NC_{n.endVertex}$$

5.2.2.2. Rules for Detecting Redundant Production/Update Anomalies

Explicit Redundant Update

After visiting the endVertex of the top-level sequence, i.e., the end vertex of the process, if $EC = NC_{S.endVertex}^u \setminus O_w \neq \emptyset$, a redundant update anomaly occurs for every artifact $d \in EC$ due to No Consumption After Last Write.

Potential Redundant Update

After visiting the endVertex of the top-level sequence, i.e., the end vertex of the process, if $CC = NC_{S.endVertex}^c \setminus O_w \neq \emptyset$, a redundant update anomaly occurs for every artifact $d \in CC$ due to Conditional Consumption After Last Write.
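Both rules are evaluated once, at the end vertex of the process. A minimal Python sketch, assuming the two unused sets at the end vertex and the process outputs $O_w$ are already available as sets (the function name is hypothetical):

```python
def redundant_writes(nc_u_end, nc_c_end, process_outputs):
    """Return (EC, CC): artifacts with an explicit and a potential
    redundant update, i.e. unused at the process end and not an output."""
    explicit = nc_u_end - process_outputs
    potential = nc_c_end - process_outputs
    return explicit, potential


ec, cc = redundant_writes({"draft", "report"}, {"log"}, {"report"})
print(ec, cc)
```

Here `report` is excluded because it is a process output, `draft` is never consumed after its last write, and `log` is consumed only on some execution paths.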

5.2.2.3. Algorithm to Detect Redundant Production and Update Anomalies

Algorithm DetectRedundantWrite(S, NC_P, O_W) {

//Input: S: a structure containing a sequence of nodes (of which each is a task or a control block)
//       NC_P: the set of artifacts unused after preceding nodes.
