Chapter 7. Implementation and experimental results
7.2. Experimental results of dynamic effective testing
To perform dynamic effective testing, we implemented the algorithms described in Chapter 5. The Java code of our implementation can be obtained at
http://www.csie.ntnu.edu.tw/~ghhwang/DET/DET_RV_gen.zip. It contains the implementation of the Γ function and of Algorithms 1, 2, 3, 5, and 6, together with some examples that demonstrate how to use these algorithms to derive race variants from a SYN-sequence.
The measured data from the experiments are divided into three parts:
• Part I: We applied dynamic effective testing according to the proposed framework shown in Figure 17. The Γ function was implemented according to Definition 6 and Algorithm 7. We also applied the event compression scheme (Algorithm 8 in Section 5.3) to the partial order graph to reduce the time required to generate race variants. Here we report the results for several measurement parameters. First, we show the number of SYN-sequences produced in dynamic effective testing (i.e., the number of elements in the effective test set). Since dynamic effective testing may duplicate some tests, we also show the number of different SYN-sequences generated. Second, we show the number of times that the Γ function encountered cases 3, 4, 5A, and 6. Note that the Γ function either prunes some events or abandons SYN-sequences in these cases, so these counts indicate how the Γ function works to avoid boundless state reiteration. Third, we show the maximum number of elements in the reiterated state set of the obtained SYN-sequences. Finally, we show the time required to perform a complete run of dynamic effective testing.
• Part II: Algorithm 9 can construct a DFA (i.e., the simplified reachable state graph of the target concurrent program) from the effective test set derived during dynamic effective testing. We show the number of states and transitions of the derived DFA and the running time of this algorithm. In addition, we developed a tool to generate the visual graph of the derived DFA. We used an open source tool called Graphviz [70], which produces a graph from data in a text file written in a graph description language. Our tool first outputs the simplified reachable state graph as a DOT file (see footnote 28) and then invokes Graphviz to convert the DOT file into a PNG file. Figure 29 shows one of the generated PNG files.
• Part III: We also show some static analysis results according to the source code of the target concurrent program. The reported numbers, which we calculated manually, demonstrate the complexity of the target concurrent program in terms of the number of states and interleavings. The execution-state upper bound is that defined in Chapter 4. Note that we did not count statements unrelated to synchronization. Assume that there are n processes (P0, P1, …, Pn–1) in the target concurrent program and that the number of synchronization events of Pi is NE(Pi) according to the ASET rules. The combinatorial interleaving number of this concurrent program is then $\frac{(NE(P_0) + NE(P_1) + \cdots + NE(P_{n-1}))!}{NE(P_0)! \times NE(P_1)! \times \cdots \times NE(P_{n-1})!}$. (Note that the combinatorial interleaving number is only a rough upper bound of the potential interleavings of a concurrent program since it does not consider the iteration of loops or conditional branching.) In the case where there is no synchronization event in an iterative statement or conditional branching construct and events of different processes always race, the combinatorial interleaving number is the number of possible interleavings [31].
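The combinatorial interleaving number described above is a multinomial coefficient: the factorial of the total number of synchronization events divided by the product of the per-process factorials. The following is a minimal sketch of that calculation, not part of the published implementation. The class name `InterleavingCount`, the method name, and the per-process event counts passed in `main` are hypothetical and chosen only for illustration.

```java
import java.math.BigInteger;

class InterleavingCount {
    // Computes (e_0 + e_1 + ... + e_{n-1})! / (e_0! * e_1! * ... * e_{n-1}!)
    // as a running product of binomial coefficients, so that no large
    // intermediate factorial is ever materialized.
    static BigInteger combinatorialInterleavings(int[] eventsPerProcess) {
        BigInteger result = BigInteger.ONE;
        int placed = 0; // events already accounted for
        for (int events : eventsPerProcess) {
            // Multiply by C(placed + events, events); each division is exact
            // because every partial product is itself a binomial coefficient.
            for (int k = 1; k <= events; k++) {
                placed++;
                result = result.multiply(BigInteger.valueOf(placed))
                               .divide(BigInteger.valueOf(k));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical event counts, not taken from the measured programs.
        System.out.println(combinatorialInterleavings(new int[] {1, 1}));
        System.out.println(combinatorialInterleavings(new int[] {4, 3, 2}));
        System.out.println(combinatorialInterleavings(new int[] {10, 10}));
    }
}
```

With two processes of 10 events each, the formula gives 184,756, the value quoted for R3 in this section. The actual per-process event counts of R3 are not stated here, so that split is an assumption.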
28 For the details of the DOT language, refer to http://www.graphviz.org/doc/info/lang.html.
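To illustrate the DOT export step described in Part II, the following minimal sketch writes a small state graph as DOT text. It is not the authors' tool. The class `DfaDotExport`, the `Transition` type, and the state and event names are hypothetical, and the sketch assumes that names contain no double quotes or backslashes.

```java
import java.util.Arrays;
import java.util.List;

class DfaDotExport {
    // Hypothetical representation of one DFA transition: from --label--> to.
    static final class Transition {
        final String from;
        final String label;
        final String to;

        Transition(String from, String label, String to) {
            this.from = from;
            this.label = label;
            this.to = to;
        }
    }

    // Emits one DOT edge statement per transition inside a digraph block.
    static String toDot(String graphName, List<Transition> transitions) {
        StringBuilder sb = new StringBuilder();
        sb.append("digraph ").append(graphName).append(" {\n");
        for (Transition t : transitions) {
            sb.append("  \"").append(t.from).append("\" -> \"").append(t.to)
              .append("\" [label=\"").append(t.label).append("\"];\n");
        }
        sb.append("}\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical three-state graph used only for illustration.
        List<Transition> transitions = Arrays.asList(
                new Transition("s0", "send", "s1"),
                new Transition("s1", "receive", "s2"),
                new Transition("s2", "retry", "s1"));
        System.out.print(toDot("ReachableStates", transitions));
    }
}
```

If the output is saved to a file such as `graph.dot`, the Graphviz command `dot -Tpng graph.dot -o graph.png` renders it as a PNG image.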
The experiment was divided into two parts. We first show the results for three artificial programs, and then those obtained with three popular programs reported in the literature.
Table 2 presents the first part of our experiment: A1 and A2 are the programs shown in Figure 1A and B, respectively, and A3 is a concurrent program with three processes that might form sophisticated livelocks. Appendix A provides the source code of A3. The second part of our experiment involved three well-known programs. Referring to Table 3, R1, R2, and R3 were used for dynamic effective testing; see Appendix A for their source programs. R1 and R2 are previously published programs [71], and R3 is Peterson’s algorithm, which is a solution to the two-process critical-section problem [65]. We let R1, R2, and R3 attempt to enter the critical section twice. Processes in R1 and R2 get stuck in infinite loops in some SYN-sequences, and dynamic effective testing successfully detected this type of error. In calculating the combinatorial interleaving number of the three programs, we allowed loop unrolling to occur beforehand.
Table 2. Experimental results of dynamic effective testing for three sample programs
Measurement parameters                                  A1          A2          A3
Part I
  Number of different SYN-sequences generated/total
  number of tests (effective test set)                  3/3         23/23       325/325
  Number of times that case 3/case 4/case 5A/case 6
  occurred                                              1/0/0/0     4/8/0/0     236/0/0/0
  Maximum number of elements in the reiterated state
  set                                                   1           1           1
  Time required to perform dynamic effective testing
  (in seconds)                                          0.047       0.324       2.868
Part II
  Number of states/number of transitions of the DFA
  derived by Algorithm 9                                5/6         29/46       97/252
  Time required to execute Algorithm 9 (in seconds)     0.008       0.046       2789
Part III
  Execution-state upper bound                           12          243         58320
  Combinatorial interleaving number                     2           6           1260
Table 3. Experimental results of dynamic effective testing for three well-known programs
Measurement parameters                                  R1          R2          R3
Part I
  Number of different SYN-sequences generated/total
  number of tests (effective test set)                  22/22       56/56       2748/2748
  Number of times that case 3/case 4/case 5A/case 6
  occurred                                              6/3/1/4     22/0/0/0    758/454/3/1
  Maximum number of elements in the reiterated state
  set                                                   1           1           1
  Time required to perform dynamic effective testing
  (in seconds)                                          10.169      10.383      66.979
Part II
  Number of states/number of transitions of the DFA
  derived by Algorithm 9                                16/31       33/63       413/821
  Time required to execute Algorithm 9 (in seconds)     0.059       0.051       1593
Part III
  Execution-state upper bound                           162         572         34992
We checked all the generated simplified reachable state graphs against the semantics of the target concurrent programs and found that they were all correct. The number of occurrences of cases 3, 4, 5A, and 6 also shows that the Γ function does avoid boundless state reiteration. In all cases the number of states of the simplified reachable state graph was smaller than the execution-state upper bound. This is because our scheme executes the target program during prefix-based replay, and hence infeasible execution states cannot be explored in the testing process. For sophisticated concurrent programs, the feasible execution states constitute only a tiny portion of the execution-state upper bound. For example, the feasible execution states in A3 and R3 are only 0.16% and 1.18% of the execution-state upper bound, respectively.
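As a worked check of these proportions, the DFA state counts from Part II of Tables 2 and 3 can be divided by the corresponding execution-state upper bounds from Part III:

```latex
\[
\text{A3: } \frac{97}{58320} \approx 0.00166 \;(\approx 0.166\%),
\qquad
\text{R3: } \frac{413}{34992} \approx 0.01180 \;(\approx 1.18\%).
\]
```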
An intuitive assumption would be that the combinatorial interleaving number will always be larger than the number of elements in the derived effective test set. However, this might not be true for concurrent programs with a small number of synchronization
operations, such as in A1 and A2. This is because calculating the combinatorial interleaving number does not consider the loop iteration, which may involve multiple synchronization events. However, our scheme forces the loop to iterate at least twice, after which the Γ function can prohibit further iterations. Thus, in small concurrent programs such as A1 and A2, the combinatorial interleaving number is smaller than the number of elements in the derived effective test set. In general, dynamic effective testing filters out many infeasible SYN-sequences that may be counted in the combinatorial interleaving number. For example, in R3, we only need to perform 2748 tests out of the combinatorial interleaving number of 184,756.
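The comparison for R3 can be written out as a short calculation, using the size of the effective test set from Table 3 and the combinatorial interleaving number quoted above:

```latex
\[
\frac{2748}{184756} \approx 0.0149 \;(\approx 1.49\%).
\]
```

For A1 and A2 the relation is reversed: Table 2 lists effective test sets of 3 and 23 elements against combinatorial interleaving numbers of 2 and 6, respectively.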
To provide more empirical evidence of the applicability and effectiveness of dynamic effective testing for concurrent software, we applied it to concurrent programs that use different synchronization primitives. We used the race variant generator at http://www.csie.ntnu.edu.tw/~ghhwang/DET/DET_RV_gen.zip to perform race analysis and designed new protocols to perform prefix-based replay for these synchronization primitives. First, we applied dynamic effective testing to semaphore-based Java concurrent programs. We obtained some semaphore-based Java programs from the benchmark suite Clash of the Titans [72] and modified some of them so that some of the semaphore operations occurred in busy-waiting loops. The experimental results showed that dynamic effective testing detected all the potential deadlock states. The programmer can use the derived reachable state graph to check why the semaphore primitives cause deadlocks. The experimental results can be obtained at http://www.csie.ntnu.edu.tw/~ghhwang/DET/experiments.html. Second, we also obtained preliminary results from applying dynamic effective testing to a service-oriented architecture [73]. The tested applications were based on service-oriented principles, using Web services [74] and WS-BPEL (Web Services Business Process Execution Language [75]) technologies together. The Web service transactions were modeled as shared objects for reading and writing. Several WS-BPEL applications with an infinite number of SYN-sequences were successfully tested by dynamic effective testing, which demonstrates the portability and practicability of dynamic effective testing.
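For readers unfamiliar with the pattern, the following minimal sketch shows what a semaphore operation inside a busy-waiting loop might look like in Java. It is not taken from the benchmark suite or from the modified programs. The class `BusyWaitSemaphore`, the method `busyWaitAcquire`, and the `maxSpins` bound are hypothetical; the bound is added here only so that the sketch always terminates.

```java
import java.util.concurrent.Semaphore;

class BusyWaitSemaphore {
    // Polls the semaphore with tryAcquire() instead of blocking in acquire().
    // Each failed poll is one more loop iteration. Without the hypothetical
    // maxSpins bound, the loop could iterate indefinitely while no permit
    // is released.
    static boolean busyWaitAcquire(Semaphore semaphore, int maxSpins) {
        for (int i = 0; i < maxSpins; i++) {
            if (semaphore.tryAcquire()) {
                return true; // permit obtained
            }
            Thread.yield(); // give other threads a chance to release a permit
        }
        return false; // gave up after maxSpins failed polls
    }

    public static void main(String[] args) {
        Semaphore available = new Semaphore(1);   // one permit is free
        Semaphore exhausted = new Semaphore(0);   // no permit is ever released
        System.out.println(busyWaitAcquire(available, 1000));
        System.out.println(busyWaitAcquire(exhausted, 1000));
    }
}
```

Because the number of failed polls depends on thread scheduling, a loop of this kind can produce an unbounded number of synchronization events, which is the situation in which a mechanism that limits repeated iterations becomes relevant.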