a node with the most fanin nodes to bi-decompose. The process iterates until no more co-bidecomposition or bi-decomposition can be made.

5.2 Experimental Results

The algorithms were implemented in C++ in ABC [2] with MiniSAT [7] as the underlying solver. All experiments were conducted on a Linux machine with a 3.4 GHz Xeon CPU and 6 GB of RAM.

Two sets of experiments were designed to demonstrate the scalability of bi-decomposition and the optimality of variable partitioning.

Only circuits containing output functions with large support sizes (≥ 30) were chosen from the iscas, itc, and lgsynth benchmark suites.¹ To show the efficiency of decomposing large functions, Table 5.1 reports the results of or2- and xor-decompositions on the output functions of the listed circuits. As can be seen, functions with about 200 inputs, such as i2, can be decomposed effectively, which may not be possible with BDD-based methods.

¹ Sequential circuits are converted to combinational ones by replacing register inputs and outputs with primary outputs and inputs, respectively.

Figure 5.1: |XC|/|X| and ||XA| − |XB||/|X| in the enumeration of variable partitions in OR2-decomposition.

To measure the quality of a variable partition, we use two metrics: |XC|/|X| for disjointness, and ||XA| − |XB||/|X| for balancedness. The smaller they are, the better the partition is. In particular, we prefer disjointness to balancedness, since the former yields better variable reduction. Experience suggests that |XC|/|X| can very often be maximally reduced within the first few enumerations while ||XA| − |XB||/|X| is kept as low as possible. Figure 5.1 shows how these two values change while enumerating different variable partitions under or2-decomposition on some sample circuits; every variable partition corresponds to two markers of the same symbol, one in black and the other in gray.
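The two metrics can be illustrated with a minimal sketch. It assumes that XA, XB and XC are pairwise disjoint and together make up X, so that |X| = |XA| + |XB| + |XC|. The function name `partition_metrics`, the variable names and the candidate partitions are hypothetical, and the lexicographic comparison is only one possible way to encode the stated preference of disjointness over balancedness.

```python
def partition_metrics(xa, xb, xc):
    """Return (disjointness, balancedness) of a partition {XA | XB | XC}.

    Assumption: the three sets are disjoint and their union is X.
    Smaller values are better for both metrics.
    """
    total = len(xa) + len(xb) + len(xc)
    if total == 0:
        raise ValueError("the support set X must not be empty")
    disjointness = len(xc) / total                  # |XC| / |X|
    balancedness = abs(len(xa) - len(xb)) / total   # ||XA| - |XB|| / |X|
    return disjointness, balancedness


# Hypothetical partitions of an 8-variable support (illustrative only).
candidates = [
    ({"x1", "x2", "x3"}, {"x4"}, {"x5", "x6", "x7", "x8"}),
    ({"x1", "x2", "x3"}, {"x4", "x5", "x6"}, {"x7", "x8"}),
    ({"x1", "x2", "x3", "x4", "x5"}, {"x6"}, {"x7", "x8"}),
]

# Tuples compare lexicographically, so disjointness is compared first
# and balancedness only breaks ties.
best = min(candidates, key=lambda p: partition_metrics(*p))
```

Here the second and third candidates share the same disjointness, and the second one is selected because it is perfectly balanced.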

It is interesting to note that or2- and xor-decompositions exhibit very different characteristics in variable partitioning, as Figures 5.2 and 5.3 show. In these two plots, each marker corresponds to the first valid variable partition found in decomposing some function. As can be seen, the decomposition quality is generally good in or2-decomposition, but not in xor-decomposition. This is because xor-decomposable circuits, e.g., arithmetic circuits, possess some regular structures in their functionality. This regularity makes disjointness and balancedness mutually exclusive in variable partitioning.

Figure 5.2: Variable partition in OR2-decomposition.

Figure 5.3: Variable partition in XOR-decomposition.

To measure the quality of a variable partition in co-bidecomposition, we use seven metrics: |XA|/|X|, |XB1|/|X|, |XB2|/|X|, |XC1|/|X|, |XC2|/|X|, |XC3|/|X|, and |XC|/|X|. Here X denotes the set of common fanins of the two chosen nodes.

Figures 5.4 and 5.5 show the variable partition distributions without and with the minimal UNSAT core, respectively. Comparing the two figures, we find that XA increases slightly and XC decreases dramatically.

Figure 5.4: Variable partition without UNSAT core.

Figure 5.5: Variable partition with UNSAT core.

Moreover, we use co-bidecomposition to synthesize networks. That is, we bi-decompose the network iteratively until it cannot be bi-decomposed any further. We are concerned with the size and the disjointness of the resulting network. After bi-decomposing the network, we perform structural hashing on the decomposed network to convert it to an AIG, and use the number of nodes to denote its size. To measure the disjointness of the network, which is an AIG, we define a cost function:

cp(n1): The number of primary inputs that are in the support of both fanin nodes of node n1.

Overlap(f): The sum of cp(n1) over every node n1 in f, divided by the number of nodes and primary inputs.

Figure 5.6: An example of overlap cost function.

Figure 5.6 shows an example. Node 7 has two fanin nodes, node 5 and node 6. Primary input b is in the support of both node 5 and node 6; therefore, cp(7) is 1. The numbers of nodes and primary inputs are 3 and 3, respectively.
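The computation of cp can be sketched as follows. This is an illustration under stated assumptions, not the implementation used in the experiments: the dictionary `fanins` and the helper names `support`, `cp` and `total_cp` are hypothetical, and the example network is only one that is consistent with the description of Figure 5.6 (the primary input names a and c, and which of them feeds node 5 or node 6, are assumed). Because the text does not spell out how the number of nodes and the number of primary inputs are combined in the denominator of Overlap(f), the sketch stops at the summation of cp.

```python
def support(node, fanins, cache=None):
    """Return the set of primary inputs in the support of `node`.

    `fanins` maps each AIG node to its two fanins; anything that is not
    a key of `fanins` is treated as a primary input supporting itself.
    """
    if cache is None:
        cache = {}
    if node in cache:
        return cache[node]
    if node not in fanins:
        result = frozenset([node])
    else:
        left, right = fanins[node]
        result = support(left, fanins, cache) | support(right, fanins, cache)
    cache[node] = result
    return result


def cp(node, fanins, cache=None):
    """Number of primary inputs shared by the supports of both fanins."""
    left, right = fanins[node]
    return len(support(left, fanins, cache) & support(right, fanins, cache))


def total_cp(fanins):
    """Sum of cp(n) over all nodes n (the numerator of Overlap)."""
    cache = {}
    return sum(cp(n, fanins, cache) for n in fanins)


# Assumed network: 3 primary inputs (a, b, c) and 3 nodes (5, 6, 7).
fanins = {5: ("a", "b"), 6: ("b", "c"), 7: (5, 6)}
shared_at_7 = cp(7, fanins)
```

In this assumed network only node 7 has fanins with overlapping supports, so the summation equals cp(7). The recursive traversal is kept for brevity; a large AIG would call for an iterative traversal in topological order.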

In each decomposition, we compare the original network with the decomposed network by dividing the new size (overlap cost) by the old size (overlap cost). If the ratio exceeds a threshold, the bi-decomposition is not accepted and the original network is restored. In our experiments, the threshold is set to 1, 1.05, 1.1, or infinity.
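The acceptance test can be sketched for a single cost, either size or overlap. The function name `accept` is hypothetical, and the handling of a zero old cost is an assumption of this sketch, since the text does not cover that case.

```python
import math


def accept(old_cost, new_cost, threshold):
    """Accept a bi-decomposition iff new_cost / old_cost <= threshold.

    Assumption: when the old cost is 0 the ratio is undefined, so the
    step is accepted only if the cost stays 0 or the threshold is
    infinite.
    """
    if old_cost == 0:
        return new_cost == 0 or threshold == math.inf
    return new_cost / old_cost <= threshold


# The thresholds used in the experiments.
thresholds = [1, 1.05, 1.1, math.inf]

# Hypothetical costs: the decomposed network is 4% larger.
decisions = [accept(100, 104, t) for t in thresholds]
```

With these hypothetical costs, a 4% increase is rejected only under the threshold 1; a rejected step would restore the original network.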

Figure 5.7: Trade-off curve with Size and Overlap

Figure 5.7 shows the relation between size and overlap after iteratively bi-decomposing the network with different threshold values. The green dot denotes the result of synthesizing the network with the ABC command "dc2". The blue dots denote the bi-decomposed results under different threshold values, and the pink dots denote the results of applying "dc2" to the bi-decomposed circuits. The dots on the blue and pink lines are the Pareto points. The red dots mark the minimum of the product of size ratio and overlap ratio. Consider the red dot on the pink line: its size ratio is less than that of the circuit obtained with "dc2", and its overlap ratio is reduced by about 8%.
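The selection of Pareto points and of the minimum-product point can be sketched as below. The function names and the sample (size ratio, overlap ratio) pairs are hypothetical and are not taken from the experiments; smaller is assumed to be better on both axes.

```python
def pareto_front(points):
    """Return the points not dominated by any other point.

    A point q dominates p when q is no worse than p on both axes and
    q differs from p.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p for q in points
        )
        if not dominated:
            front.append(p)
    return front


def min_product(points):
    """Return the point minimizing size ratio times overlap ratio."""
    return min(points, key=lambda p: p[0] * p[1])


# Made-up (size ratio, overlap ratio) pairs, one per threshold setting.
points = [(1.00, 0.90), (0.95, 1.00), (1.10, 0.85), (1.05, 0.95)]
front = pareto_front(points)
best = min_product(front)
```

In this made-up data the last point is dominated by the first, so it is left off the front, and the first point has the smallest product.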

Table 5.1: Bi-decomposition of PO functions

                         | OR2-decomposition             | XOR-decomposition
circuit    #in #max #out | #dec   #slv  time(s)  mem(Mb) | #dec   #slv  time(s)  mem(Mb)
b04c        76   38   74 |   49   3878    12.26    19.35 |   49   2714    28.82    20.02
b07c        49   42   57 |   14  12985    27.59     22.3 |   39    601     5.43    18.72
b12c       125   37  127 |   80  12526    25.14    23.32 |   84   4862    19.22    26.93
C1355       41   41   32 |    0  26240      354    20.32 |   TO
C432        36   36    7 |    7    102    13.15    18.54 |    0   3654   197.81    17.46
C880        60   45   26 |   16    222     8.36    20.72 |   11   4192    83.08    18.72
comp        32   32    3 |    0   1488     2.61    15.86 |    1   1014    13.69     16.9
dalu        75   75   16 |    1  26848   352.87    24.14 |   16    210    26.59    19.68
e64         65   65   65 |    0  45760    17.98    22.91 |    0  45760   388.18    24.37
i2         201  201    1 |    1      1     1.07     18.6 |    1     34     2.16    18.59
i3         132   32    6 |    6     82     0.96    16.32 |    0   1986     9.28    16.36
i4         192   47    6 |    4      6     0.58    16.08 |    0   4326    60.04    16.54
k2          45   45   45 |   33   1071    17.51    22.33 |   33    612     5.29    20.71
my_adder    33   33   17 |    0   3656     2.61    18.05 |   16    577     4.92    17.32
o64        130  130    1 |    1      1     0.36    16.17 |    0   8385   623.43    16.12
pair       173   53  137 |  119   4429    20.63    21.56 |  101   5676    41.81    21.61
rot        135   63  107 |   49  19927    65.97    23.21 |   46   4975    59.23    21.96
s1423c      91   59   79 |   26  42744   121.49    27.17 |   68   7281   161.98    20.25
s3330c     172   87  205 |   60   2941     9.42    23.09 |   71   3135    16.45    21.87
s3384c     226   48  209 |   76  12685    28.16    30.21 |  147   2467    24.95    21.33
s6669c     322   49  294 |  101  24423   198.14    29.13 |  176   3120   279.03    22.87
s938c       66   66   33 |    1   5985     2.81    19.86 |   33    426     4.49    16.28
too_large   38   36    3 |    3     22     9.89    19.87 |    2    629    33.38     18.4

#in: number of PIs; #max: maximum number of support variables in POs; #out: number of POs; #dec: number of decomposable POs; #slv: number of SAT solving runs; time: runtime in seconds; mem: memory usage in Mb; TO: time out at 1500 sec.

Chapter 6
