相關研究與背景說明 - 可動態擴充之數位訊號處理核心於系統單晶片內之整合架構研究(III)

2.1 Relative Works

Pozzi [8] proposed an algorithm, called the exact algorithm, to examine all possible ISE candidates such that it can obtain an optimal solution. The exact algorithm maps the ISE search space, such as a basic block, to a binary tree, and then discards some portion of the tree that violates predefined constraints. Nevertheless,

this algorithm is highly computing-intensive, so it does not process a larger search space. For instance, if a BB has N operations, and each operation has only one hardware implementation option, then it has 2^N possible ISE patterns (legal or illegal).

Notably, one ISE candidate may consists of one or multiple legal ISE pattern(s).

When N = 100 (the standard case), then the number of possible ISE patterns is 2¹⁰⁰. Obviously, this number of patterns cannot be computed in a reasonable time. To decrease the computing complexity, heuristic algorithms derived from genetic [8], Kernighan-Lin (KL) [9] and greedy-like algorithms [10] have been developed.

Yu [11] investigated the effect of various constraints, such as ISA format, hardware area and control flow, for ISE generation. Such constraints restrict the performance improvement of the ISEs. The ISA format limits the number of read and write ports to the register file. The limitation of the control flow is whether the search space of ISE exploration can cross basic block boundaries. To meet time constraint in real-time applications, the operations locating on the worst-case execution path would have higher opportunity to be grouped into ISE than others, as in [12]. This is because the most frequently executed instruction pattern may not contribute execution time reduction to the worst-case execution path. The granularity of each vertex within the search space can be varied from one instruction to multiple subroutine calls [13].

Borin [13] also claims that one search space can consist of multiple basic blocks in their proposed algorithm. From a different perspective, Peymandoust [14]

characterized each basic block as a polynomial representation. First, the multiple-input single-output (MISO) algorithm extracts symbolic algebraic patterns from the search spaces, and represents them as polynomials on behalf of ISE candidates. These ISE candidates are then mapped to the polynomial representations of program segments using symbolic algebraic manipulations. Nevertheless, some algorithms [8, 9, 10, 11, 12, 13 and 14] do not consider the pipestage timing constraint, and therefore waste silicon area unnecessarily.

2.2 Background - Ant Colony Optimization (ACO) Algorithm Basic Idea of Ant Colony Optimization Algorithm

Ant Colony Optimization algorithm is inspired by the behavior of ants in finding paths from the colony to food. This algorithm [1, 2] has been extensively applied to solve many optimization problems.

In the real world, ants wander randomly when they begin to find paths from the colony to food. The ants lay down pheromone on the paths which they have passed through. The density of the pheromone on one path determines the probability that the next ant will pass through this path. Because the pheromone evaporates with time, the shortest path is marched over fastest, and thus has the highest pheromone density.

After a period of time, an increasing number of ants select the shortest path, causing the density of pheromone on this path gradually grow. Finally, the shortest path is obtained. The shortest path can be treated as the optimal solution for an optimization problem.

Figure 2 depicts an example of ACO. Suppose that 50 ants are going to find food, and can choose from among two paths, namely left-path (left hand side path) and right-path (right hand side path). Left-path is twice as long than right-path, as illustrated in Fig. 2 (a). In Fig. 2, D and P are the number of unit of distance and pheromone, respectively, and t represents the time unit. At t = 0, neither path has pheromone, and the ants choose paths with equal probability. Suppose that 25 ants choose left-path, and 25 ants choose right-path at the beginning of t = 1, as shown in Fig. 2 (b). Every ant leaves one unit of pheromone on the path. At the end of t = 1, 25 ants arrive food source, and another 25 ants are in the middle of left-path. Black spots in Fig. 2 (c) depict the locations of ants at the end of t = 1. Assume that the pheromone evaporates at a rate of 5 units per time unit. The paths ant passed have 20 (=25−5) units of pheromone after evaporation, as displayed in Fig. 2 (c). Then, the ants start walking again at the beginning of t = 2, as demonstrated in Fig. 2 (d). 25 ants return to their colony and another 25 ants arrive food source, at the end of t = 2.

The locations (black spots) of ants at the end of t = 2 is shown in Fig. 2 (e). Fig. 2 (f) depicts the number of pheromone units on each path segment after t = 2. At next iteration, right-path has a higher probability of being chosen by ants than left-path owing to the higher pheromone density. After a period of time, an increasing number of ants select right-path due to higher density of pheromone, causing the density of pheromone on right-path grows. Finally, the shortest path, i.e. right-path, is obtained.

Figure 2: An example of ant colony

Why Use the Ant Colony Optimization Algorithm?

ISE exploration is an optimization problem. Many computation models, such as genetic algorithms and simulated annealing, have been successfully adopted to solve many optimization problems. One of those computation models, named ACO, is very similar to the ISE exploration problem that explores not only ISE candidates, but also their hardware implementation options. A computation model that is more similar to the original problem than others should be chosen for minimizing the modification required. In other word, a computation model that more closely resembles the original problem requires less modification than other models do.

The ISE exploration problem is clearly analogous to ACO. Selecting the shortest path among multiple paths in ACO can be viewed as similar to choosing the best implementation option (hardware or software) among different implementation options for each operation in the basic block. One can imagine that different implementation options for all operations in a basic block can be modeled as a tree.

An operation can be seen as a node in the tree, and different implementation options are different links between two nodes in the tree. Figure 3, in which O1 is operation 1 as well as SW-IO and HW-IO denote the software and hardware implementation option, respectively, illustrates this concept. In Fig. 3, the ants start from their colony (operation 1) to food (operation m), and make a decision at every node (choose one

P=25→20

Food

Food Food

Ant Colony Ant Colony

25 ants 25 ants

Ant Colony (50 ants)

D=20

implementation option of each operation). After a period of time, i.e. several iterations, the ants find the shortest path (the dotted line with arrow in Fig. 3) between their colony and food. The shortest path is the optimal solution in ISE exploration.

O 1

HW -IO 1 WH

-IO n SW

-IO

O 2

HW -IO 1 W H

-IO n

SW-IO

O m

HW -IO 1 W H

-IO n

SW-IO O m

HW -IO 1 WH

-IO n

SW-IO

Operation 1

Operation 2

Operation m

Ant Colony

Food

Figure 3: Analogy between ACO and ISE exploration

在文檔中可動態擴充之數位訊號處理核心於系統單晶片內之整合架構研究(III) (頁 11-15)