The Floorplanning Algorithm - Placement of Digital Microfluidic Biochips

Placement of Digital Microfluidic Biochips

4.5 The Floorplanning Algorithm

The proposed algorithm is based on the simulated annealing (SA) method [27].

We adopt SA instead of a genetic algorithm (GA) as the optimization method because it has been shown that SA is typically more efficient and economical than GA for problems in electronic design automation (EDA). GA needs to maintain a set of solutions, called the population. At each iteration, GA needs to evaluate the fitness function for each solution in the current population. On the other hand, SA maintains only two solutions: the current solution and the best one. At each iteration, SA needs only to evaluate the current solution. As a result, SA typically needs less CPU time than GA. Moreover, SA uses less memory than GA because it maintains fewer solutions.

Before performing SA, we first cluster one generation operation with one reconfigurable operation to reduce the CPU time and to increase the chance of obtaining more compact 3D floorplans. During SA, given a feasible T-tree, we perturb it to obtain another feasible T-tree through a set of pre-defined SA operations. After perturbation, we perform a feasibility detection and tree reconstruction process to obtain a feasible topology with respect to the precedence constraints and the storage constraints. Finally, a packing procedure that places all operations and optical detectors is invoked to evaluate the solution quality.
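The overall flow described above can be sketched as a generic simulated-annealing loop. This is a minimal sketch under stated assumptions, not the implementation used in this work: `perturb`, `repair`, and `cost` are hypothetical callbacks standing in for the SA operations, the feasibility detection and tree reconstruction, and the packing-based evaluation, and the geometric cooling schedule and all parameter values are placeholders.

```python
import math
import random

def simulated_annealing(initial, perturb, repair, cost,
                        t_start=1.0, t_end=1e-3, cooling=0.95,
                        moves_per_temp=50, seed=0):
    """Generic SA loop that keeps only the current and the best solution."""
    rng = random.Random(seed)
    current = best = initial
    cur_cost = best_cost = cost(initial)
    temp = t_start
    while temp > t_end:
        for _ in range(moves_per_temp):
            # Perturb, then restore feasibility (hypothetical callbacks).
            candidate = repair(perturb(current, rng))
            cand_cost = cost(candidate)      # stands in for packing + evaluation
            delta = cand_cost - cur_cost
            # Accept improvements; accept uphill moves with Boltzmann probability.
            if delta <= 0 or rng.random() < math.exp(-delta / temp):
                current, cur_cost = candidate, cand_cost
                if cur_cost < best_cost:
                    best, best_cost = current, cur_cost
        temp *= cooling                      # geometric cooling (assumption)
    return best, best_cost

# Toy usage: minimize (x - 3)^2 over the integers, starting from x = 20.
best, best_cost = simulated_annealing(
    initial=20,
    perturb=lambda x, rng: x + rng.choice([-1, 1]),
    repair=lambda x: x,
    cost=lambda x: (x - 3) ** 2,
)
```

Only two solutions are stored at any time, which mirrors the memory argument made for SA above.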

4.5.1 Clustering of Generation and Reconfigurable Operations

In this subsection, we detail the clustering algorithm. The goal of the clustering algorithm is to obtain a more compact 3D floorplan and to reduce the CPU time by eliminating unnecessary storage units. This clustering algorithm is motivated by two observations. First, a generation operation and a reconfigurable operation are always performed in sequence, since a droplet must first be generated and then used for reactions. Second, both the solution quality (e.g., volume) and the CPU time may be improved by reducing the number of storage units required via clustering. Recall that a storage unit is needed for two data-dependent tasks if they are not scheduled at consecutive time steps. The duration of this storage unit also varies with the starting and ending times of these two tasks. Since storage units occupy volume, their number and duration greatly affect the total volume of a 3D floorplan. If the volume of these storage units can be minimized, we may obtain a more compact 3D floorplan. We use the sample sequencing graph shown in Figure 4.18 (a) as an example. Figure 4.18 (b) shows a partial floorplan. For simplicity, we only show the X and T dimensions.¹ In this floorplan, since task v_f finishes much earlier than task v_k starts, the storage unit v_s has a very long duration. Therefore, we may obtain a less compact 3D floorplan due to the non-overlapping requirement among v_s and other tasks. On the other hand, Figure 4.18 (c) shows another partial floorplan. In this floorplan, since v_f finishes right before v_k starts, no storage unit is required between v_f and v_k. By scheduling a generation operation as close as possible to its data-dependent reconfigurable operation, the unnecessary volume occupied by a storage unit can be effectively minimized. In this way, there is a higher chance of obtaining a 3D floorplan with smaller area.

Figure 4.18: (a) A sample sequencing graph. (b) A partial floorplan with v_f being scheduled long before v_k starts. (c) Another partial floorplan with v_f being scheduled right before v_k starts.

¹ For illustration purposes, the width of the generation operation v_f is not zero in this figure.

The idea of the proposed clustering algorithm is as follows: given a sequencing graph G, we randomly cluster one generation operation v_g with one reconfigurable operation v_r if there exists an edge between v_g and v_r in G. After clustering, the ending time of v_g is the same as the starting time of v_r, so no storage unit is needed between v_g and v_r. Another advantage is that the number of nodes in a T-tree is reduced, which speeds up the packing process. However, one disadvantage is that the assay completion time is potentially increased. The reason is as follows. Recall that we assign virtual precedence constraints among tasks that are bound to the same non-reconfigurable device. Suppose that there exists a virtual precedence constraint between two generation operations v_g and v_q. If we cluster v_g and v_r, they are merged into a new task v_l, so the virtual precedence constraint now holds between v_l and v_q. This means that v_q starts after v_l finishes rather than after v_g finishes. Therefore, the assay completion time is potentially increased due to clustering. In order not to increase the assay completion time, we do not actually cluster v_g and v_r into a new task. In the current implementation, we instead add an additional requirement on nodes n_g and n_r in a T-tree: n_r must always be the left child of n_g. This requirement has the same effect as clustering the two tasks, since the ending time of v_g is the same as the starting time of v_r if n_r is the left child of n_g. In the proposed floorplanning algorithm, if an SA operation is performed on n_g, the same SA operation is performed on n_r. We also check whether the two clustered nodes are in their correct positions in a T-tree during the feasibility detection and tree reconstruction process, which is presented in Section 4.5.5.
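The random pairing step can be illustrated with a short sketch. The function name, the edge-list representation, and the operation names are hypothetical. The sketch also assumes that each reconfigurable operation is paired with at most one generation operation, since a T-tree node has a single parent and n_r can therefore be the left child of only one n_g.

```python
import random

def cluster_pairs(edges, generation_ops, reconfigurable_ops, seed=0):
    """Randomly pair each generation op with one adjacent reconfigurable op.

    Returns {v_g: v_r}. Each v_r is used at most once (assumption, see lead-in).
    """
    rng = random.Random(seed)
    successors = {}
    for u, v in edges:
        if u in generation_ops and v in reconfigurable_ops:
            successors.setdefault(u, []).append(v)
    pairs, used = {}, set()
    for v_g in sorted(successors):            # deterministic visiting order
        free = [v for v in successors[v_g] if v not in used]
        if free:
            v_r = rng.choice(free)
            pairs[v_g] = v_r
            used.add(v_r)
    return pairs

# Hypothetical sequencing graph: two dispense operations feed one mixer,
# a third dispense operation feeds a dilutor.
edges = [("g1", "mix1"), ("g2", "mix1"), ("g3", "dil1"), ("mix1", "dil1")]
pairs = cluster_pairs(edges, {"g1", "g2", "g3"}, {"mix1", "dil1"})
```

Each returned pair would then translate into the requirement that n_r is the left child of n_g, rather than into a merged task.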

4.5.2 Perturbations for Biochip Placement

For the placement of digital microfluidic biochips, we introduce a new type of SA operation, called Rebind. Rebind binds a task to another functional resource. For a reconfigurable task, such as the mix operation, we randomly select a resource instance for this task. For example, we change from a 2 × 2-array mixer to a 2 × 4-array mixer with a different mixing time. For a non-reconfigurable task, we randomly change a task from one instance to another. For example, suppose that there are two optical detectors p₁ and p₂ for detection operations. For a detection operation v_d that originally uses p₁, we can rebind it to p₂. Note that since virtual precedence constraints are added among tasks corresponding to the same non-reconfigurable resource instance, these virtual precedence constraints are modified after rebinding. For instance, when we rebind v_d from p₁ to p₂, we delete all virtual precedence constraints of v_d and add virtual precedence constraints between v_d and all other tasks that are bound to p₂.

4.5.2.1 Enhancement for the Fixed-cube Constraint

We now show how to add the virtual precedence constraints between tasks that are bound to the same non-reconfigurable device when performing the Rebind SA operation. We add these constraints based on the execution level, lv(i), of each operation v_i. The intuition is that if lv(i) is larger than lv(j), then operation v_i is executed before operation v_j. We therefore add a virtual precedence constraint from v_i to v_j if lv(i) is larger than lv(j). Given a sequencing graph G, lv(i) can be recursively calculated for each task v_i in G. We first assign lv(i) = 0 to all tasks v_i with zero out-degree in G. For example, in Figure 4.18 (a), lv(l) and lv(m) are both zero. We then delete all assigned tasks and assign lv(i) = 1 to all remaining tasks with zero out-degree in G. For example, in Figure 4.18 (a), lv(j) and lv(k) are both one. The above process repeats until every task is assigned its execution level.
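The level assignment amounts to repeatedly peeling off tasks with zero out-degree. The following is a minimal sketch with hypothetical function names. The example edges are assumed for illustration and are not taken from Figure 4.18 (a), although they reproduce the levels quoted for l, m, j, and k.

```python
def execution_levels(tasks, edges):
    """Peel zero-out-degree tasks level by level; sinks get level 0."""
    succ = {v: set() for v in tasks}
    for u, v in edges:
        succ[u].add(v)
    level, remaining, lv = {}, set(tasks), 0
    while remaining:
        # Tasks with out-degree 0 once all assigned tasks are deleted.
        sinks = {v for v in remaining if not (succ[v] & remaining)}
        if not sinks:
            raise ValueError("sequencing graph contains a cycle")
        for v in sinks:
            level[v] = lv
        remaining -= sinks
        lv += 1
    return level

def virtual_constraints(bound_tasks, level):
    """Ordered pairs (v_i, v_j) with lv(i) > lv(j) among tasks on one device."""
    return [(a, b) for a in bound_tasks for b in bound_tasks
            if level[a] > level[b]]

# Assumed example graph (edges point from a task to its successor).
tasks = ["f", "j", "k", "l", "m"]
edges = [("f", "k"), ("j", "l"), ("k", "m")]
level = execution_levels(tasks, edges)
constraints = virtual_constraints(["f", "j"], level)
```

Under the rule as stated, two tasks with equal execution levels receive no virtual precedence constraint from this step.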

We also enhance the SA operations defined in Section 4.3.2.4 to handle the fixed-cube constraint. We bias the Move operation based on the probability of violating the fixed-cube constraint in each dimension. Let k_w (k_h, k_t) be the number of floorplans whose width (height, completion time) exceeds the user-specified width (height, completion time) in the last r iterations. In this dissertation, we set r to 500. We bias the selection of the destination of the Move operation based on the values k_w/r, k_h/r, and k_t/r. For example, a larger k_w/r implies that it is more difficult to fit the floorplans into the 3D cube in the X direction. Therefore, tasks should be placed along the Y or T direction to satisfy the fixed-cube constraint. The details of this approach are listed below.
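The three ratios can be maintained with a sliding window over the last r floorplans. The class below is a hypothetical sketch. It divides by r even before r floorplans have been recorded, which is an assumption about how the start-up phase is handled.

```python
from collections import deque

class ViolationTracker:
    """Tracks how often each dimension overflowed in the last r floorplans."""

    def __init__(self, r=500):
        self.r = r
        # One (w_violated, h_violated, t_violated) tuple per iteration.
        self.history = deque(maxlen=r)

    def record(self, size, spec):
        """size and spec are (width, height, completion_time) triples."""
        self.history.append(tuple(s > p for s, p in zip(size, spec)))

    def ratios(self):
        """Return (k_w/r, k_h/r, k_t/r)."""
        if not self.history:
            return (0.0, 0.0, 0.0)
        return tuple(sum(col) / self.r for col in zip(*self.history))

# Usage with a small window (r = 4) and a 10 x 10 x 20 cube.
tracker = ViolationTracker(r=4)
for size in [(12, 9, 18), (11, 10, 25), (10, 8, 19), (13, 11, 20)]:
    tracker.record(size, spec=(10, 10, 20))
ratios = tracker.ratios()
```

In this example the width overflows in three of the four floorplans, so k_w/r is the largest of the three ratios.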

Based on the values of k_w/r, k_h/r, and k_t/r, we consider the following four cases when inserting node n_i as the left, middle, or right child of n_j.

1. If all three values are larger than 0.9, the fixed-cube constraint is very tight and the floorplans rarely fit into the desired 3D cube in any of the three directions. In this case, we randomly insert n_i as the left, middle, or right child of n_j.

2. If both k_w/r and k_h/r are larger than 0.9, it is hard to fit into the 3D cube in both the X and Y directions. Therefore, n_i should be placed along the T dimension; that is, n_i should be inserted as the left child of n_j. The other two cases, in which both k_w/r and k_t/r (k_h/r and k_t/r) are larger than 0.9, can be handled similarly.

3. If only k_w/r is larger than 0.9, it is hard to fit into the 3D cube in the X direction. Therefore, n_i should be placed along the Y or T dimension; i.e., n_i is inserted as the left or middle child of n_j. Then, n_i is inserted based on the values of p′_h = (1 − k_h/r)
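The case analysis for the destination can be sketched as follows. The function name is hypothetical. The mapping left = T and middle = Y follows the cases above, while right = X is inferred by elimination. Two further points are assumptions: the remaining options in cases 2 and 3 are picked uniformly because the probability weighting is not fully recoverable here, and a uniform pick is also used when no ratio exceeds 0.9.

```python
import random

def choose_child_slot(kw, kh, kt, rng, tight=0.9):
    """Pick 'left' (T), 'middle' (Y), or 'right' (X) for inserting n_i under n_j.

    kw, kh, kt are the ratios k_w/r, k_h/r, and k_t/r.
    """
    slot_of = {"x": "right", "y": "middle", "t": "left"}
    hard = {d for d, k in (("x", kw), ("y", kh), ("t", kt)) if k > tight}
    if len(hard) in (0, 3):
        # Case 1 (all tight); the no-tight-dimension case is an assumption.
        allowed = ["x", "y", "t"]
    else:
        # Cases 2 and 3: avoid growing along the tight dimensions.
        allowed = [d for d in ("x", "y", "t") if d not in hard]
    return slot_of[rng.choice(allowed)]   # uniform pick is a placeholder

rng = random.Random(0)
slot_xy_tight = choose_child_slot(0.95, 0.95, 0.20, rng)   # X and Y tight
slot_x_tight = choose_child_slot(0.95, 0.10, 0.10, rng)    # only X tight
```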

Besides the destination selection, we also heuristically choose the node n_i to perform the Deletion operation based on three cases:

1. If k_w/r, k_h/r, and k_t/r are all larger than 0.9 or all less than 0.1, the fixed-cube constraint is hard (easy) to satisfy in every dimension. In this case, we randomly choose n_i to perform the Move operation.

2. If both k_w/r and k_h/r are larger than 0.9, it is hard to satisfy the 3D cube constraint in the X and Y dimensions. In this case, we randomly choose n_i for the Move operation from the nodes whose x′_i or y′_i is larger than the outline width or height, respectively. The reason is that moving n_i to another position can effectively reduce the floorplan width or height so that the floorplan fits into the desired outline. The other two cases, in which both k_w/r and k_t/r (k_h/r and k_t/r) are larger than 0.9, can be handled similarly.

3. For the last case, we choose n_i for the Move operation based on the maximum of k_w/r, k_h/r, and k_t/r. For example, if k_w/r is the maximum among the three, then we randomly choose n_i from the nodes whose x′_i is larger than the outline width.
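The three selection cases can be sketched in a few lines. The function name and data layout are hypothetical, and falling back to a uniform choice when no node exceeds the outline is an assumption that the text does not cover.

```python
import random

def choose_node_to_move(nodes, ratios, outline, rng, tight=0.9, loose=0.1):
    """nodes: {name: (x_end, y_end, t_end)}; ratios: (k_w/r, k_h/r, k_t/r);
    outline: (W_p, H_p, T_p). Returns the name of the node to Move."""
    names = sorted(nodes)
    if all(k > tight for k in ratios) or all(k < loose for k in ratios):
        return rng.choice(names)                   # case 1: uniform choice
    hard = [d for d in range(3) if ratios[d] > tight]
    if len(hard) != 2:                             # case 3: worst dimension only
        hard = [max(range(3), key=lambda d: ratios[d])]
    # Case 2 (two tight dimensions) or case 3: nodes exceeding the outline.
    outside = [n for n in names
               if any(nodes[n][d] > outline[d] for d in hard)]
    return rng.choice(outside or names)            # fallback is an assumption

rng = random.Random(0)
nodes = {"a": (5, 5, 5), "b": (12, 4, 6), "c": (6, 11, 30)}
outline = (10, 10, 20)
pick_x = choose_node_to_move(nodes, (0.95, 0.30, 0.20), outline, rng)
pick_xy = choose_node_to_move(nodes, (0.95, 0.95, 0.20), outline, rng)
pick_t = choose_node_to_move(nodes, (0.50, 0.40, 0.70), outline, rng)
```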

With the above method, we can effectively fit the 3D floorplan into the desired 3D cube.

4.5.3 Placement of Optical Detectors

In this subsection, we describe how to place the optical detectors in the floorplanning algorithm. After the chemical reaction among droplets, optical detectors are needed to detect the reaction results. The locations of these optical detectors need to be determined during floorplanning. These detectors are fixed after fabrication. Therefore, if two detection operations map to the same optical detector, they should be placed at the same physical location. Note that segregation cells are also needed for the optical detectors to avoid optical interference.

Suppose that two detection operations v_i and v_j are bound to the same optical detector and that we first determine the location of v_i. The basic idea is that we simultaneously determine the locations of v_i and v_j. Once the locations of v_i and v_j are determined, the location of the optical detector is also determined. Note that when placing the detection operations, we also wrap these operations with the segregation cells. By this method, we guarantee that the optical detectors are wrapped with the segregation cells after floorplanning. After determining the location of v_i, we set v_j at the same location as v_i. The original packing algorithm of T-tree maintains a list L to store all tasks whose locations are already determined [63]. Finally, we add v_j into L to indicate that the location of v_j is already determined. Note that we need to check whether v_j overlaps with any other task in L. If it does, we shift both v_i and v_j along the X direction to avoid the overlap.
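The co-location and shifting steps can be illustrated with axis-aligned boxes in (X, Y, T). This is a sketch only: the `Box` type and function names are hypothetical, the boxes are assumed to already include the segregation cells, and shifting one cell at a time while also re-checking v_i is an assumption about how the shift is carried out.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Box:
    name: str
    x: int
    y: int
    t: int
    w: int
    h: int
    d: int   # duration along T

def overlaps(a, b):
    """True if two boxes intersect in all of X, Y, and T."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h and
            a.t < b.t + b.d and b.t < a.t + a.d)

def place_shared_detector(v_i, v_j, placed):
    """Give v_j the (x, y) of v_i, then shift both along X until neither
    collides with another placed task. Returns the updated (v_i, v_j)."""
    others = [b for b in placed if b.name not in (v_i.name, v_j.name)]
    v_j = replace(v_j, x=v_i.x, y=v_i.y)
    while any(overlaps(v, b) for v in (v_i, v_j) for b in others):
        v_i = replace(v_i, x=v_i.x + 1)   # unit-cell step is an assumption
        v_j = replace(v_j, x=v_j.x + 1)
    return v_i, v_j

# Two detections share one detector; a mixer occupies part of the later slot.
det1 = Box("det1", x=0, y=0, t=0, w=3, h=3, d=5)
det2 = Box("det2", x=7, y=7, t=10, w=3, h=3, d=5)
mixer = Box("mix", x=2, y=0, t=10, w=4, h=4, d=6)
new_det1, new_det2 = place_shared_detector(det1, det2, [det1, mixer])
```

Both detection operations end up at the same (x, y) position, which then fixes the location of the shared optical detector.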

4.5.4 Cost Function

The goal is to simultaneously optimize the biochip area and assay completion time under the design specification. Therefore, the cost function Φ used in the proposed floorplanning algorithm is given by

$$\Phi = \alpha \frac{V}{V_{norm}} + \beta \frac{S}{S_{norm}} + \gamma M, \tag{4.5}$$

where V is the volume of the 3D floorplan, S is the sum of the volume of all storage units, V_norm is the normalized volume, S_norm is the normalized sum of the volumes of all storage units, and M is the penalty term for fixed-cube constraint. α, β, and γ are user-specified constants. M is defined as

$$M = \frac{\max(W_f - W_p, 0) \times W_f}{N_w^2} + \frac{\max(H_f - H_p, 0) \times H_f}{N_h^2} + \frac{\max(T_f - T_p, 0) \times T_f}{N_t^2}, \tag{4.6}$$

where N_w (N_h, N_t) is the normalized width (height, assay completion time), and W_p (H_p, T_p) and W_f (H_f, T_f) denote the width (height, assay completion time) of the design specification and of a 3D floorplan, respectively. Since all tasks must be packed into a pre-defined 3D cube, we penalize the excessive width, height, and completion time in the cost function. The rationale behind M is that when SA minimizes the cost function, it also drives the penalty term toward zero. Thus, the fixed-cube constraint tends to be satisfied without a separate repair step.
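Equations (4.5) and (4.6) translate directly into code. In the sketch below the function names, the default weights, and the numbers in the example are placeholders chosen for illustration.

```python
def penalty(size, spec, norm):
    """M from Eq. (4.6).

    size = (W_f, H_f, T_f), spec = (W_p, H_p, T_p), norm = (N_w, N_h, N_t).
    """
    return sum(max(f - p, 0) * f / n ** 2
               for f, p, n in zip(size, spec, norm))

def cost(volume, storage, size, spec, v_norm, s_norm, norm,
         alpha=1.0, beta=1.0, gamma=1.0):
    """Phi from Eq. (4.5); the default weights are placeholders."""
    return (alpha * volume / v_norm + beta * storage / s_norm
            + gamma * penalty(size, spec, norm))

# A floorplan that is 2 cells too wide for a 10 x 10 x 20 cube.
m = penalty(size=(12, 8, 20), spec=(10, 10, 20), norm=(10, 10, 20))
phi = cost(volume=1920, storage=100, size=(12, 8, 20), spec=(10, 10, 20),
           v_norm=2000, s_norm=200, norm=(10, 10, 20))
```

Only the width term contributes to M in this example, since the height and completion time are within the specification.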

4.5.5 Feasibility Detection and Tree Reconstruction

After perturbation, we perform feasibility detection and tree reconstruction to satisfy all precedence constraints and storage constraints. We iteratively adjust the topology of a T-tree to satisfy the precedence and storage constraints. After obtaining a feasible topology of a T-tree, we invoke the packing procedure described in Section 4.3.2.2 to determine the physical locations of all tasks. We do not perform packing during feasibility detection and tree reconstruction because all precedence and storage constraints depend only on the starting time of each task. The physical location of each task has no effect on whether the precedence constraints are satisfied. Thus, we do not need to perform packing in the main loop of the tree-reconstruction process.

Given a T-tree H, we first check whether a clustered node n_i is the left child of another clustered node n_j. If not, we Move n_i to the position of the left child of n_j. Then we check whether every storage unit is in one of its feasible positions. If a storage unit n_s is not in one of its feasible positions, we Move n_s to one of its feasible positions. Note that since we modify the topology of H during the tree reconstruction process, the duration of each storage unit may change. To reduce the floorplanning complexity, we therefore require that no storage unit has a left child. By doing so, the starting time of a task is not affected by any storage unit during the tree reconstruction process. Next we explain how to remove the left child of a storage unit. Suppose that a storage unit v_s stores the result of task v_a and n_k is the left child of n_s in H. We perform the move subtree procedure described below to move the subtree rooted at n_k to another place in H. First we choose one node n_z that is in the subtree rooted at n_a but not in the subtree rooted at n_k. Then we randomly move the subtree rooted at n_k to the position of the left subtree, middle subtree, or right subtree of n_z based on the values of k_w/r, k_h/r, and k_t/r defined in Section 4.5.2. For example, if k_t/r is large, then we have a lower probability of moving the subtree rooted at n_k to the position of the left subtree of n_z. Without loss of generality, assume that we move the subtree rooted at n_k to the position of the left subtree of n_z. The other two cases can be handled similarly. First, if n_z has no left child, then we simply move the subtree rooted at n_k to the position of the left subtree of n_z. Second, if n_z has a left child, there are two situations:

1. n_k has a left child: In this case, we first move the subtree rooted at n_z’s left child to the position of the left subtree of n_k. Then we move the subtree originally rooted at n_k’s left child to the position of the left subtree of n_f, where n_f is a node in the subtree rooted at n_k that has no left child.

2. n_k has no left child: In this case, we simply move the subtree rooted at n_z’s left child to the position of the left subtree of n_k.

Figure 4.19 gives an example in which the subtree rooted at n_k is moved to the position of the left subtree of n_z. Figure 4.20 summarizes the move subtree procedure.
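The left-subtree variant of the move subtree procedure can be sketched on a ternary tree with parent pointers. The `Node` class and helper names are hypothetical, and the sketch is not a transcription of Figure 4.20. Picking n_f as the first node without a left child found by a depth-first search is an assumption, and the caller is assumed to pass an n_z that lies outside the subtree of n_k.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.left = self.middle = self.right = None

def attach(parent, slot, child):
    setattr(parent, slot, child)
    child.parent = parent

def detach(node):
    p = node.parent
    if p is not None:
        for slot in ("left", "middle", "right"):
            if getattr(p, slot) is node:
                setattr(p, slot, None)
    node.parent = None

def find_without_left(root):
    """Return some node in the subtree of root that has no left child."""
    stack = [root]
    while stack:
        n = stack.pop()
        if n.left is None:
            return n
        stack.extend(c for c in (n.left, n.middle, n.right) if c is not None)

def move_subtree_to_left(n_k, n_z):
    """Move the subtree rooted at n_k to the left-subtree position of n_z."""
    detach(n_k)
    old_z_left = n_z.left
    if old_z_left is not None:
        detach(old_z_left)
        old_k_left = n_k.left
        if old_k_left is not None:          # situation 1: n_k has a left child
            detach(old_k_left)
        attach(n_k, "left", old_z_left)     # n_z's old left subtree goes under n_k
        if old_k_left is not None:
            n_f = find_without_left(n_k)    # choice of n_f is an assumption
            attach(n_f, "left", old_k_left)
    attach(n_z, "left", n_k)

# Storage unit s has left child k; z (with left child z1) is the target.
a, s, k, k1, z, z1 = (Node(n) for n in ("a", "s", "k", "k1", "z", "z1"))
attach(a, "left", s); attach(s, "left", k); attach(k, "left", k1)
attach(a, "middle", z); attach(z, "left", z1)
move_subtree_to_left(k, z)
```

After the call, the storage unit s no longer has a left child, which is the property the reconstruction step is after.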

Once all storage units are in their feasible positions and have no left child, we traverse H to obtain the starting time of each task. Next, we check the precedence constraints, which describe the temporal ordering of tasks for correct execution, and reconstruct H if necessary. From Property 1, if node n_j is in the left subtree of n_i, task v_j must be executed after task v_i. Therefore, to ensure that no precedence constraint is violated, a node n_k must be placed in the left subtree of n_p, where n_p has the latest ending time among the tasks that must be executed before task v_k. The feasibility detection of the precedence constraint can thus be summarized in the following theorem:

Theorem 4.4 Let I_k, 1 ≤ k ≤ n, denote the set of tasks that must be executed before task v_k. If node n_k is in node n_p’s left subtree, where t′_p = max{t′_i | v_i ∈ I_k}, then v_k is guaranteed to satisfy the precedence constraint.

Proof: Based on Property 1, if node n_j is in the left subtree of node n_i, task v_j must be executed after task v_i. Thus, if node n_k is in node n_p’s left subtree, where t′_p = max{t′_i | v_i ∈ I_k}, then task v_k must be executed after task v_p. It is easy to see that if v_k is executed after v_p, then v_k is executed after all tasks in I_k. Thus v_k satisfies the precedence constraint.
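The condition of Theorem 4.4 can be checked by walking from n_k up to the root. The `Node` class and function names below are hypothetical, and the ending times are assumed to be available from a previous traversal. The check implements only the sufficient condition stated in the theorem.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.parent = None
        self.left = self.middle = self.right = None

    def set_child(self, slot, child):
        setattr(self, slot, child)
        child.parent = self
        return child

def in_left_subtree(n_k, n_p):
    """True if n_k lies in the left subtree of n_p."""
    child, node = n_k, n_k.parent
    while node is not None:
        if node is n_p:
            return node.left is child   # did the path enter n_p via its left child?
        child, node = node, node.parent
    return False

def satisfies_precedence(n_k, predecessors, end_time):
    """Sufficient condition of Theorem 4.4 for task v_k.

    predecessors: nodes of the tasks in I_k; end_time: {node: ending time}.
    """
    if not predecessors:
        return True
    n_p = max(predecessors, key=lambda n: end_time[n])   # latest ending time
    return in_left_subtree(n_k, n_p)

# Small example: a -> (left) b, b -> (left) e and (middle) c.
a = Node("a")
b = a.set_child("left", Node("b"))
c = b.set_child("middle", Node("c"))
e = b.set_child("left", Node("e"))
end_time = {a: 5, b: 9}
ok_e = satisfies_precedence(e, [a, b], end_time)
ok_c = satisfies_precedence(c, [a, b], end_time)
```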

Once we identify a node that violates the precedence constraint, we reconstruct the T-tree to remove the violation. Assume that task v_i violates the precedence constraints and that v_p is the task with the latest ending time in I_i. Let U = {all nodes in the left subtree of n_p} ∪ {n_p}. In U, we look for a node n_j that minimizes |t_j − t_i| with I_j = ∅. If n_j ≠ n_p, n_i is swapped with n_j; otherwise, it means that n_j = n_p or I_j ≠ ∅ for every n_j ∈ U. In this case, we make n_i the left child of n_p.

To avoid infeasible T-trees that may trigger the tree reconstruction process, we develop a mechanism to filter out the operations that will definitely cause precedence constraint violations. Consider the T-tree shown in Figure 4.21. Assume that v_b must be executed before v_e. Node n_e can only be swapped with a node in n_b’s left subtree, or be inserted into an internal/external position in n_b’s left subtree.

Further, node n_b cannot be swapped with any node in the subtree rooted at n_e or be inserted into any internal/external position in the subtree rooted at n_e. Based on this observation, we number each node in a T-tree in DFS order starting from the right subtree. For example, in Figure 4.21, the number in parentheses at each node denotes the order of the tree traversal. That is, we traverse the T-tree shown in Figure 4.21 in the following order: n_a, n_d, n_c, n_b, n_e, n_h, n_g, and n_f. We number the nodes starting from the right subtree because this makes it easy to detect whether node n_j is in the left subtree of node n_i by comparing the DFS numbers of n_j and n_i’s left child. If the DFS number of n_j is greater than the DFS number of n_i’s left child, then n_j is in the left subtree of n_i. Conventional DFS ordering, i.e., traversing the tree starting from the left subtree, does not provide this property. After the numbering, each node is associated with two values, DFS_up and DFS_low.
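The numbering can be sketched as a preorder traversal that visits the right child first. The visiting order right, middle, left is an assumption (the text only states that the traversal starts from the right subtree), and the small tree below is hypothetical rather than a copy of Figure 4.21.

```python
class Node:
    def __init__(self, name, left=None, middle=None, right=None):
        self.name = name
        self.left, self.middle, self.right = left, middle, right

def number_right_first(root):
    """Preorder DFS visiting right, then middle, then left.

    Returns {node: DFS number}, with numbers starting at 1.
    """
    order, stack = {}, [root]
    while stack:
        n = stack.pop()
        order[n] = len(order) + 1
        # Push left first so that the right child is popped (visited) first.
        stack.extend(c for c in (n.left, n.middle, n.right) if c is not None)
    return order

# Hypothetical tree: a has children b (left), c (middle), d (right); b has left child e.
e = Node("e")
b = Node("b", left=e)
c = Node("c")
d = Node("d")
a = Node("a", left=b, middle=c, right=d)
order = number_right_first(a)
visit = [n.name for n in sorted(order, key=order.get)]
```

With this order, the left subtree of a node is numbered last within that node's subtree, so inside the subtree of n_i the nodes numbered at or after n_i's left child are exactly those in its left subtree.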

Let O_k, 1 ≤ k ≤ n, be the set of tasks that must be executed after v_k. A node n_k’s DFS_low is the DFS number of n_p’s left child, where t′_p = max{t′_i | v_i ∈ I_k}. Similarly, n_k’s DFS_up is n_q’s DFS number, where t_q = min{t_i | v_i ∈ O_k}. For the Swap and Move operations performed on n_k, we heuristically choose nodes in the

In the document: Synthesis of Digital Microfluidic Biochips: Modeling, Placement, and Routing (pages 71-81)