FPGA Placement - Macro and Analytical Placement for Large-Scale Heterogeneous FPGAs

Placement is one of the main stages of the typical FPGA CAD flow, falling between technology mapping and routing. FPGA placement determines non-overlapping positions for the technology-mapped logic blocks in the netlist while optimizing cost metrics such as total wirelength or routability. Figure 1.7 shows an example of FPGA placement, in which every block is assigned to a unique CLB position. The placement problem is computationally difficult: wirelength minimization for unit-size block placement with only two-pin nets has been proven NP-complete [19]. Therefore, placement is considered one of the most critical and time-consuming stages of the FPGA CAD flow.

Many works on FPGA placement are mainly based on the following three approaches:

1. Simulated annealing (SA) approach: This approach optimizes placement with SA techniques. Given an initial solution, the SA approach iteratively perturbs the current solution to generate a new one. The new solution is kept if it is better than the current solution; otherwise, an acceptance probability function decides whether to keep it. This probability function helps the search escape from local optima. The state-of-the-art academic FPGA CAD tool VPR [7, 8, 34, 37] adopted SA as its optimization engine. Beyond the basic SA techniques, VPR also improved aspects including (1) incremental net bounding box updating to reduce the placement runtime, and (2) better temperature updating, so that the annealing process spends more time when perturbations produce more improvement and saves time when they produce less. The SA-based method has dominated for decades because it can achieve very high-quality placement results. Nevertheless, it tends to have long runtimes on large circuits.

2. Partitioning-based approach: The partitioning-based approach was proposed to achieve better speedups than the SA approach. Example works of this approach are FPR [4] and PPFF [35], of which PPFF is the classic. PPFF applies the well-known multilevel partitioner hMetis, which recursively partitions the design, and places it hierarchically. At each hierarchical level, PPFF employs an alignment cost in the objective function for delay and congestion minimization. Finally, PPFF applies a low-temperature SA flow, which is essentially the VPR SA flow with a lower starting temperature. Although the partitioning-based approach achieves a much better speedup, it lacks a global view when performing the partitioning and can therefore easily fall into local optima. As a result, the partitioning-based approach usually suffers some quality loss.

3. Analytical-based approach: In recent years, the analytical-based approach has emerged as a rising star in FPGA placement: it runs rather fast and delivers comparable or even higher solution quality compared to the SA approach. The analytical-based approach approximates the objective functions with smooth functions and solves the problem with efficient numerical methods. Example works of this approach are QPF [50], CAPRI [20], StarPlace [49], HeAP [21], and Lin et al. [32].
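The acceptance rule described for the simulated annealing approach (item 1) can be sketched in a few lines. The following is a minimal toy placer, not VPR's implementation: the swap move, the half-perimeter wirelength cost, the geometric cooling schedule, and all names and parameter values (`hpwl`, `anneal`, `t_start`, `t_end`, `alpha`, `moves_per_t`) are assumptions chosen for illustration.

```python
import math
import random


def hpwl(nets, pos):
    """Half-perimeter wirelength: sum of each net's bounding-box half-perimeter."""
    total = 0
    for net in nets:
        xs = [pos[b][0] for b in net]
        ys = [pos[b][1] for b in net]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total


def anneal(nets, pos, t_start=10.0, t_end=0.01, alpha=0.9, moves_per_t=200, seed=0):
    """Toy SA placer (illustrative assumption, not VPR's algorithm).

    Each move swaps the sites of two blocks, so positions stay unique.
    """
    rng = random.Random(seed)
    pos = dict(pos)
    blocks = sorted(pos)
    cost = hpwl(nets, pos)
    best_pos, best_cost = dict(pos), cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_t):
            a, b = rng.sample(blocks, 2)
            pos[a], pos[b] = pos[b], pos[a]          # perturb: swap two blocks
            delta = hpwl(nets, pos) - cost
            # Keep improvements; otherwise accept with probability exp(-delta / t).
            if delta < 0 or rng.random() < math.exp(-delta / t):
                cost += delta
                if cost < best_cost:
                    best_pos, best_cost = dict(pos), cost
            else:
                pos[a], pos[b] = pos[b], pos[a]      # reject: undo the swap
        t *= alpha                                   # geometric cooling (assumed)
    return best_pos, best_cost


if __name__ == "__main__":
    nets = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
    start = {"a": (0, 0), "b": (2, 1), "c": (0, 1),
             "d": (2, 0), "e": (1, 0), "f": (1, 1)}
    placed, cost = anneal(nets, start)
    print(hpwl(nets, start), "->", cost)
```

At high temperature almost every swap is accepted, which is what lets the search leave a local optimum; as the temperature drops, the rule degenerates into accepting only improvements. Recomputing the full cost after every swap is the simplest choice here and is the kind of overhead that incremental bounding-box updating avoids.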

QPF applied the quadratic wirelength model widely used in VLSI placement [31, 45]. The quadratic wirelength model computes the wirelength of a net as the sum of the squared Euclidean distances over every pair of fan-outs of the net. By solving the quadratic programming problem that minimizes the total quadratic wirelength over all nets, QPF finds the locations of CLBs. Finally, a low-temperature SA flow is applied.
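As a minimal sketch of the quadratic wirelength model, the code below sums squared Euclidean distances over every pair of blocks on a net and then minimizes the total with a simple Gauss-Seidel sweep in which each movable block moves to the average position of its pairwise neighbors. The solver choice, the function names (`quadratic_wirelength`, `minimize_quadratic`), and the fixed "pad" blocks are assumptions for illustration; the text does not say which solver QPF uses.

```python
from itertools import combinations


def quadratic_wirelength(nets, pos):
    """Sum of squared Euclidean distances over every pair of blocks on each net."""
    total = 0.0
    for net in nets:
        for a, b in combinations(net, 2):
            (xa, ya), (xb, yb) = pos[a], pos[b]
            total += (xa - xb) ** 2 + (ya - yb) ** 2
    return total


def minimize_quadratic(nets, pos, movable, sweeps=200):
    """Gauss-Seidel sketch: setting the gradient to zero puts each movable
    block at the mean of its pairwise neighbors. Assumes every group of
    connected movable blocks is tied to at least one fixed block.
    """
    pos = dict(pos)
    neighbors = {m: [] for m in movable}
    for net in nets:
        for a, b in combinations(net, 2):
            if a in neighbors:
                neighbors[a].append(b)
            if b in neighbors:
                neighbors[b].append(a)
    for _ in range(sweeps):
        for m in movable:
            nbrs = neighbors[m]
            if nbrs:
                pos[m] = (sum(pos[n][0] for n in nbrs) / len(nbrs),
                          sum(pos[n][1] for n in nbrs) / len(nbrs))
    return pos


if __name__ == "__main__":
    # Two fixed pads and two movable blocks connected in a chain.
    nets = [("p0", "u"), ("u", "v"), ("v", "p1")]
    pos = {"p0": (0.0, 0.0), "p1": (3.0, 0.0), "u": (0.0, 5.0), "v": (3.0, 5.0)}
    solved = minimize_quadratic(nets, pos, movable=["u", "v"])
    print(solved["u"], solved["v"], quadratic_wirelength(nets, solved))
```

The solution is continuous: blocks land on real-valued coordinates and may overlap, so it is a starting point for later legalization and refinement rather than a legal placement.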

CAPRI is based on metric geometry and graph embedding. CAPRI constructs a metric space graph according to the routing architecture and embeds the netlist graph into this metric space graph with bipartite graph matching. This bipartite graph matching minimizes the distortion between the placement of the netlist graph and the metric space graph. Finally, CAPRI applies legalization and a low-temperature SA flow to refine its solution.

StarPlace proposed a star+ model. This star+ model is modified from the well-known star model [40] and is near-linear and continuously differentiable. The traditional star model estimates the wirelength of each net by summing the Euclidean distances between each block connected to the net and the center of gravity of the net. Unlike the star model, the star+ model adds a square root to the wirelength, which is claimed to better approximate the real routed wirelength. StarPlace minimizes the sum of the wirelengths of all nets using a successive over-relaxation (SOR) optimization solver.
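The traditional star model described above can be written down directly. The sketch below computes it for a single net; the function name `star_wirelength` and the sample coordinates are hypothetical. The star+ modification is not reproduced, because the text does not give its exact formula.

```python
import math


def star_wirelength(net, pos):
    """Traditional star model: sum of Euclidean distances from each block
    on the net to the net's center of gravity."""
    cx = sum(pos[b][0] for b in net) / len(net)
    cy = sum(pos[b][1] for b in net) / len(net)
    return sum(math.hypot(pos[b][0] - cx, pos[b][1] - cy) for b in net)


if __name__ == "__main__":
    # Four blocks on the corners of a 2 x 2 square; the center is (1, 1).
    pos = {"a": (0, 0), "b": (2, 0), "c": (0, 2), "d": (2, 2)}
    print(star_wirelength(("a", "b", "c", "d"), pos))
```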

Lin et al. applied non-linear optimization using the log-sum-exponential (LSE) function as the wirelength approximation and a bell-shaped overlap function as the density function. This work applied a multilevel framework to accelerate the placement algorithm and enhance its scalability. A partitioning-based look-ahead legalization is introduced to better forecast the solution. Finally, Lin et al. refined their solution with window-based bipartite matching and a low-temperature SA.
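To illustrate how a smooth function can stand in for a non-differentiable wirelength objective, the sketch below implements the commonly used log-sum-exponential approximation of half-perimeter wirelength for one net. It is not taken from Lin et al.'s implementation: the function names and the smoothing parameter `gamma` are assumptions, and the bell-shaped density term is omitted.

```python
import math


def _smooth_max(values, gamma):
    """gamma * log(sum(exp(v / gamma))), shifted by max(values) for stability."""
    m = max(values)
    return m + gamma * math.log(sum(math.exp((v - m) / gamma) for v in values))


def lse_wirelength(net, pos, gamma):
    """Smooth approximation of HPWL: smooth max plus smooth max of the
    negated coordinates (i.e. minus a smooth min), per axis."""
    xs = [pos[b][0] for b in net]
    ys = [pos[b][1] for b in net]
    return (_smooth_max(xs, gamma) + _smooth_max([-x for x in xs], gamma)
            + _smooth_max(ys, gamma) + _smooth_max([-y for y in ys], gamma))


def hpwl(net, pos):
    """Exact half-perimeter wirelength of one net, for comparison."""
    xs = [pos[b][0] for b in net]
    ys = [pos[b][1] for b in net]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


if __name__ == "__main__":
    pos = {"a": (0, 0), "b": (3, 1), "c": (1, 4)}
    net = ("a", "b", "c")
    for gamma in (1.0, 0.1, 0.01):
        print(gamma, lse_wirelength(net, pos, gamma), hpwl(net, pos))
```

The approximation never falls below the exact half-perimeter wirelength and exceeds it by at most `4 * gamma * log(n)` for a net with `n` blocks, so a smaller `gamma` tracks the true wirelength more closely at the price of a less smooth function.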

With the dramatically increasing gate count of modern FPGAs, developing and applying analytical placement tools that are both fast and high-quality has become an inevitable trend in FPGA design. Therefore, in this thesis, we develop our placer based on the analytical approach.

Most existing works on FPGA placement focus on CLB placement. Nevertheless, with the advances in process technology, IP cores have become indispensable components in modern FPGAs. As a result, in heterogeneous FPGA placement, the legal positions of blocks are further constrained by the block type. The configurable locations for these IP cores, however, are limited and scattered across the FPGA. For example, as shown in Figure 1.1, RAMs can only be placed in Column two, and DSPs can only be placed in Column six. Figure 1.8 illustrates heterogeneous FPGA placement. Notice that analytical-based approaches use continuous and differentiable objective functions, which conflict with the discrete nature of FPGAs. The analytical FPGA placement problem has thus been reshaped. Moreover, the increasing design complexity has made the FPGA placement problem even more challenging. Therefore, in this thesis, we propose a guiding function for analytical FPGA placement to cope with the mismatch between discrete and continuous features.
