Handbook of algorithms for physical design automation part 67.pdf (Sổ tay thuật toán)

Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C032 642 Finals Page 642 9-10-2008 #17 Handbook of Algorithms for Physical Design Automation chip may be partially or completely blocked and buffers cannot be placed in these areas. As the capacity constraints for the edges in the global routing graph ensure that not too many wires cross the boundaries between two adjacent global routing tiles, similar constraints ensure that not too many buffers are placed in one global routing tile. Vygen [26] considers the coupling capacitance and minimizes the total power consumption while ensuring the timing constraints for individual nets and certain paths. A Steiner tree for a net is characterized not only by the edges of the global routing graph, but also each edge of the Steiner tree has a continuous parameter specifying the spacing to each side of the final route. It is assumed that the coupling capacitance decreases linearly with the spacing. The timing constraints are ensured by bounding the weighted capacitance for subsets of the nets, a constraint similar to the constraint that bounds the total weighted wirelength. While more space decreases the coupling capacitance also more routing resources are used. The problem can be formulated as a fractional packing problem of infinitely many Steiner trees, infinitely many because of the continuous spacing parameters. Because the capacitance depends linearly on the spacing, every edge of the Steiner tree that minimizes the cost function with respect to the dual variables either has the maximum or minimum spacing. The task of the subroutine is still to find a Steiner minimal tree in the grid graph with respect to a nonuniform length function. Müller [27] describes a parallel multithreaded implementation of the approximation scheme. He shows that it is possible to update the dual variables at the end of each phase for all nets instead of updating them immediately after a Steiner tree is found. The set of nets is split into subsets and each thread computes the minimal Steiner trees for one subset in the global routing graph. 32.9 CONCLUSION This chapter is about the global routing problem, and specifically about algorithms solving the linear programming relaxation. The complexity of the linear program is enormous and hence it is not possible to solve the linear program optimally. The linear programming relaxation for global routing is a special case of a fractional packing problem and is similar to the multicommodity flow problem. We showed that the approximation algorithm for the multicommodity flow problem can be applied to the fractional global routing problem. A final integer solution is derived by randomized rounding. This approach has been used successfully in practice and has been extended to consider additional constraints and objectives. The approach of the linear relaxation and randomized rounding is general and it may be possible to apply it to other combinatorial optimization problems in physical design. For global routing, the approach works well because the capacities of the edges are relatively large and hence randomized rounding does not disturb the solution much. REFERENCES 1. P. Raghavan and C. D. Thompson, Randomized rounding: A technique for provably good algorithms and algorithmic proofs, Combinatorica, 7(4): 365–374, 1987. 2. P. Raghavan and C. D. Thompson, Multiterminal global routing: A deterministic approximation, Algorithmica, 6: 73–82, 1991. 3. F. Shahrokhi and D. W. Matula, The maximum concurrent flow problem, Journal of the Association for Computing Machinery, 37: 24–31, 1990. 4. S. Plotkin, D. Shmoys, and E. Tardos, Fast approximation algorithms for fractional packing and covering problems, Mathematics of Operations Research, 20: 257–301, 1995. 5. A. V. Goldberg, J. D. Oldham, S. Plotkin, and C. Stein, An implementation of a combinatorial approximation algorithm for minimum-cost multicommodity flow, in Integer Programming and Combinatorial Optimization (6th International IPCO Conference), Houston, TX, pp. 338–352, 1998. 6. N. Garg and J. Könemann, Faster and simpler algorithms for multicommodity flow and other fractional packing problems, in Proceedings of the 39th Annual Symposium on Foundations of Computer Science, Palo Alto, CA, pp. 300–309, 1998. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C032 Optimization Techniques in Routing Finals Page 643 9-10-2008 #18 643 7. L. K. Fleischer, Approximating fractional multicommodity flow independent of the number of commodities, SIAM Journal on Discrete Mathematics, 13(4): 505–520, 2000. (FOCS 1999). 8. G. Karakostas, Faster approximation schemes for fractional multicommodity flow problems, in Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms, San Francisco, CA, pp. 166–173, 2002. 9. C. Albrecht, Global routing by new approximation algorithms for multicommodity flow, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 20: 622–632, May 2001. (ISPD 2000). 10. M. R. Kramer and J. van Leeuwen, The complexity of wire routing and finding minimum area layouts for arbitrary VLSI circuits, in Advances in Computing Research, Vol. 2: VLSI Theory, F. P. Preparata (Eds), JAI Press, Greenwhich, CT, pp. 192–146, 1984. 11. J. Vygen, Disjoint paths, Technical Report 94816, Research Institute for Discrete Mathematics, University of Bonn, Bonn, Germany, 1994. 12. V. Chvátal, Linear Programming. New York: Freeman, 1983. 13. A. Schrijver, Theory of Linear and Integer Programming. Chichester, United Kingdom: Wiley, 1986. 14. J. Werber, Das Multicommodity-Flow-Problem und seine Anwendung im Global Routing. Diplomarbeit, Universität Bonn, Bonn, Germany, 2000. 15. T. C. Hu and M. T. Shing, A decomposition algorithm for circuit routing, in VLSI Circuit Layout: Theory and Design, T. C. Hu and E. S. Kuh (Eds), IEEE Press, pp. 144–152, 1985. 16. G. B. Dantzig, Maximization of a linear function of variables subject to linear inequalities, in Activity Analysis of Production and Allocation, Tj. C. Koopmans (Eds), Wiley, NY, pp. 399–347, 1951. 17. H. T. Jongen, K. Meer, and E. Triesch, Optimization Theory. Norwell, MA: Kluwer Academic Publishers, 2004. 18. A. Vannelli, An adaptation of the interior point method for solving the global routing problem, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 10: pp. 193–203, February 1991. (CICCC 1989). 19. M. Hanan, On Steiner’s problem with rectilinear distance, Soviet Mathematics Doklady, 14(2): 255–265, 1966. 20. N. Karmarkar, A new polynomial-time algorithm for linear programming, Combinatorica, 4: 373–395, 1984. 21. L. G. Khachiyan, A polynomial-time algorithm in linear programming, Soviet Mathematics Doklady, 20: 191–194, 1979. 22. R. C. Carden IV, J. Li, and C. -K. Cheng, A global router with a theoretical bound on the optimum solution, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 15: 208–216, February 1996. 23. P. Raghavan, Probabilistic construction of deterministic algorithms: Approximating packing integer programs, Journal of Computer and System Sciences, 37: 130–143, 1988. 24. H. Chernoff, A measure of asymptotic efficiency for tests based on the sum of observations, Annals of Mathematical Statistics, 23: 493–509, 1952. 25. C. Albrecht, A. Kahng, I. Măndoiu, and A. Zelikovsky, Multicommodity flow algorithms for buffered global routing, in Handbook of Approximation Algorithms and Metaheuristics, T. F. Gonzales (Ed.), Boca Raton, FL: Chapman & Hall/CRC, pp. 80.1–80.18, 2007. (ASPDAC 2002). 26. J. Vygen, Near-optimum global routing with coupling, delay bounds, and power consumption, in Integer Programming and Combinatorial Optimization (10th International IPCO Conference), LNCS 3064, G. Nemhauser and D. Bienstock (Eds.). Berlin, Germany: Springer, pp. 308–324, 2004. 27. D. Müller, Optimizing yield in global routing, in Digest of Technical Papers of the IEEE/ACM International Conference on Computer-Aided Design. San Jose, CA, November 2006. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C032 Finals Page 644 9-10-2008 #19 Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 Finals Page 645 29-9-2008 #2 33 Global Interconnect Planning Cheng-Kok Koh, Evangeline F.Y. Young, and Yao-Wen Chang CONTENTS 33.1 Buffer Planning Basics. . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.1.1 Feasible Regions . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.1.2 Independent Feasible Regions .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.1.3 Two-Dimensional Feasible Region . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.2 Buffer Blocks and Sites . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.2.1 Buffer Block Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.2.2 Buffer Site Planning .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.3 Interconnect Planning and Buffer Planning .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.3.1 Routability-Driven Buffer Planning .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.3.1.1 Routability-Driven Buffer Planning with Dead Space Redistribution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.3.1.2 Interconnect Planning with Fixed Interval Buffer Insertion Constraint . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.3.1.3 Methodology for Interconnect Planning in Buffer Site . . . .. . . . . . . . . . . . . . 33.3.1.4 Other Routability-Driven Buffer Planning Approaches . . .. . . . . . . . . . . . . . 33.3.2 Pin Assignment with Buffer Planning.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.3.3 Noise-Aware Buffer Planning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.3.3.1 Independent Feasible Regions with Transition Time Constraints.. . . . . . 33.3.3.2 Common Independent Feasible Region . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.3.3.3 Buffer Block Planning Considering Transition Time and Delay . . . . . . . . 33.3.4 Buffer Planning with Noise Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.4 Flip-Flop and Buffer Planning (Wire Retiming) .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.4.1 Minimizing Latency... . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.4.1.1 Two-Pin Net Optimization Using Analytical Formulas . . .. . . . . . . . . . . . . . 33.4.1.2 Multiple-Terminal Net Optimization .. . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.4.2 Latency Constrained Optimization .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.4.3 Wire Retiming . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.4.4 Area Constrained Wire Retiming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 33.5 Concluding Remarks . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 646 648 649 649 649 650 652 653 653 654 655 656 656 657 658 658 660 660 661 663 664 664 666 666 667 668 669 670 With the growing dominance of global interconnects on circuit performance, it is desirable to optimize interconnects as early as possible. Recall from Chapter 26 that buffer insertion is generally considered the most effective and popular technique to reduce interconnect delay, especially for global signals. A buffer is composed of two inverters while a repeater is referred to as a buffer or an inverter. To simplify the discussions, we shall use buffer and repeater interchangeably throughout this chapter. 645 Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 646 Finals Page 646 29-9-2008 #3 Handbook of Algorithms for Physical Design Automation As hundreds of thousands of buffers may be inserted for modern high-performance VLSI designs, it is imperative to plan for the buffer positions as early as possible to ensure timing closure and design convergence. In this chapter, we shall first present the enabling concept of buffer planning, namely, the feasible region in which a buffer can be inserted such that the timing constraint is met. Following a description of two fundamental approaches to buffer planning, taking into account only timing constraints, we address also other important design issues such as noise constraints and routability in buffer planning. When buffer insertion fails to meet the timing constraints, pipelining of global interconnects with flip-flops becomes necessary. We devote a section on flip-flop and buffer planning to deal with the challenges that arise from the additional latency introduced by interconnect pipelining. 33.1 BUFFER PLANNING BASICS Some VLSI designs may not allow buffers to be inserted inside a circuit block as they consume silicon resource and require connections to the power/ground network. Consequently, buffers are placed in channels and dead spaces of a floorplan, and they are often clustered to form buffer blocks between existing circuit blocks of the floorplan, which inevitably increases the chip area [1]. It is thus desirable to carefully plan for the buffer blocks during/after floorplanning to minimize the area overhead and facilitate routing. This is known as buffer block planning. However, the existence of buffer blocks imposes more design constraints. Because buffers connect global nets, the routing regions where buffer blocks are located might be congested. Furthermore, buffers might be placed in poor locations because buffers are clustered into blocks and thus the best location for a buffer is forbidden. A remedy to this deficiency is to distribute buffers more uniformly in a chip, so as to naturally spread out global nets. This approach looks promising in handling the aforementioned problems with wire congestion and buffer blockages. In contrast to the buffer block planning methodology, Alpert et al. [2] proposed the buffer site methodology. The methodology allocates a buffering resource within a block by inserting a buffer site that can accommodate buffers (or other logic gates if the buffer site is not used for buffering). For buffer site planning, we shall plan for the buffers during/after floorplanning such that the given buffer sites can accommodate buffers and the routing timing and congestion constraints are satisfied. To determine the optimal location for buffer insertion, we shall first consider the feasible region (FR) for a buffer, which is referred to as the region where the buffer can be placed to satisfy the timing constraint. Figure 33.1 shows respective FRs for inserting (a) one buffer and (b) multiple buffers into a net between a source and a sink, where the FRs are shaded. The concepts of the feasible region come in two forms. Cong, Kong, and Pan first defined in Ref. [1] the feasible region for buffer insertion to be the region where a buffer could be placed to satisfy a target timing constraint, assuming that all the remaining buffers were placed in their optimal positions. In contrast, Sarkar and Koh [3] introduced the idea of independent feasible region (IFR) for buffer insertion, which was defined as the region where a buffer could be placed such that the timing constraint of the net was satisfied, assuming that the other buffers were also placed within their respective independent feasible regions. Before presenting the analytical formulas for computing the feasible regions, we shall first introduce the notation and delay model that will be used throughout this chapter. Each driver/buffer is modeled as a switch-level RC circuit [4] and each wire is modeled as a π-type circuit, as shown in Figure 33.2. We use the Elmore delay model [5] covered in Chapter 3 for delay computation. The notation for the physical parameters of wire and buffer is listed in Table 33.1. Given a wire segment of length l with driver output resistance R and sink capacitance C, the Elmore delay of this segment is given by D(R, C, l) = rc 2 l2 + (Rc + rC)l + RC. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 Finals Page 647 29-9-2008 #4 647 Global Interconnect Planning Feasible region Rd x1,min x1 x1,max (a) Feasible regions for the corresponding buffers Rd x1 x2 xn (b) FIGURE 33.1 Feasible regions for buffer insertion. (a) Single-buffer insertion and (b) multiple-buffer insertion. (From Cong, J., Kong, T., and Pan, Z., IEEE Trans. VLSI Sys., 9, 929, 2001 (ICCAD 1999).) Rb Buffer Cb Tb/Rb (a) Length l Wire rl cl/2 cl/2 (b) FIGURE 33.2 Buffer and wire model. (a) Switch-level buffer model and (b) wire model. Using the preceding expression, the Elmore delay of a single-source, single-sink net (i.e., two-pin net) N of length L with n buffers can be computed as DN (x1 , x2 , . . . xn , L) = D(Rd , Cb , x1 ) + D(Rb , Cs , L − xn ) + n−1 D(Rb , Cb , xi+1 − xi ) + nTb , i=1 where Rd is the driver resistance Cs is the sink capacitance xi is the location of the ith buffer For convenience, we reexpress the optimal locations of the n buffers for the delay minimization of a net [6], presented in Chapter 26, as follows: xi∗ = (i − 1)yL∗ + xL∗ i ∈ {1, 2, . . . n}, Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 648 Finals Page 648 29-9-2008 #5 Handbook of Algorithms for Physical Design Automation TABLE 33.1 Parameters of Wire and Buffer Parameter Description r c Tb Cb Rb Wire resistance per unit length Wire capacitance per unit length Intrinsic buffer delay Buffer input capacitance Buffer output resistance where xL∗ = n(Rb − Rd ) (Cs − Cb ) 1 L+ + n+1 r c and yL∗ = (Rb − Rd ) (Cs − Cb ) 1 L− . + n+1 r c We denote the optimal delay for the net N, of length L, with n buffers by N Dopt (n, L) = DN (x1∗ , x2∗ , . . . . . . , xn∗ , L). In the following subsections, we first discuss the computation of the feasible region and the independent feasible region of a buffer on a one-dimensional line segment, and then extend the idea to a two-dimensional chip plane. 33.1.1 FEASIBLE REGIONS For n buffers inserted in a two-pin net N as shown in Figure 33.1b, their feasible regions can be computed as follows [1]. Theorem 1 For a two-pin net N of length L and with n buffers inserted and a given timing bound N Dtgt , the feasible region for the ith buffer (i ≤ n) is xi ∈ [xi,min , xi,max ] with xi,min = max 0, K2 − K22 − 4K1 K3 2K1 , and xi,max = min L, K2 − K22 + 4K1 K3 2K1 , where K1 = (n + 1)rc , 2i(n − i + 1) Cb )r + rcL , and K2 = (Rb −i Rd )c + (Cs − n−i+1 − i)rL C + R ((n − 1)C + C + cL) + N K3 = nTb − Dtgt + Rd + (i − 1)Rb + (n b b s n−i+1 b 2 2 b − Rd ) − (n − i)r(Cb − Cs ) . + rLCs − (i − 1)c(R 2ir 2(n − i + 1)c rcL 2 2(n − i + 1) Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 Finals Page 649 29-9-2008 #6 649 Global Interconnect Planning We denote the width of the feasible region for a given buffer by WFR . Cong, Kong, and Pan gave an analytical expression for WFR in Ref. [1]. Sarkar and Koh presented an equivalent analytical expression in Ref. [3], as given below. Theorem 2 net N is N N For Dtgt ≥ Dopt (n, L), the width of the feasible region for the ith buffer (i ≤ n) of the WFR = 2 · N − DN (n, L))(n − i + 1)(i) 2(Dtgt opt rc(n + 1) . 33.1.2 INDEPENDENT FEASIBLE REGIONS In contrast to the definition of feasible region, the IFR of a buffer is the region where it can be placed to satisfy the timing constraints of the net, assuming that the other buffers are placed within their respective IFRs [3]. To provide every buffer in the net with an equal degree of freedom to move within its IFR, the IFRs are chosen to have the same width, denoted by WIFR . Hence, the IFR for the N ith buffer of a net N with a corresponding target delay Dtgt is given by IFRi = (xi∗ − WIFR /2, xi∗ + WIFR /2) ∩ (0, L), N . The such that ∀ (x1 , x2 , . . . , xi , . . . , xn ) ∈ IFR1 × IFR2 × . . . × IFRn and DN (x1 , x2 , . . . , xn , L) ≤ Dtgt following theorem gives an analytical expression for WIFR . N N ≥ Dopt (n, L), the width of the independent feasible region for the ith buffer Theorem 3 For Dtgt (i ≤ n) of the net N is WIFR = 2 N − DN (n, L) Dtgt opt rc(2n − 1) . 33.1.3 TWO-DIMENSIONAL FEASIBLE REGION Implicit in the preceding discussions are the assumptions that a routing from source to sink exists, which is not true for buffer planning during floorplanning, and that buffer insertion occurs only along an one-dimensional line. For buffer planning, we typically assume that the two terminals of a net are connected with a shortest path within the bounding box of the net. The union of the one-dimensional FRs (or one-dimensional IFRs) of a buffer on all monotonic Manhattan routes between source and sink forms the two-dimensional FR (or two-dimensional IFR) of that buffer (see Figure 33.3). The feasible region of a buffer may be reduced by circuit blocks. Moreover, 2D IFRs of buffers of the same net are not entirely independent of each other. As the widths and locations of a 2D IFR are determined under the assumption of a monotonic Manhattan route between the source and the sink, an assignment of buffers to locations within their respective 2D IFRs is legal only if the buffers lie along a monotone source-to-sink path. Figure 33.3 shows a nonmonotonic buffer assignment, which may not meet the timing constraint, even though the buffers are all within their respective 2D IFRs. Therefore, when we have committed a buffer to a location in its 2D IFR, it may be necessary to update the 2D IFRs of all other buffers in the net. 33.2 BUFFER BLOCKS AND SITES There are two approaches to buffer planning: buffer block planning and buffer site planning. For buffer block planning, top-level macroblocks with only buffers, or buffer blocks, are inserted into the floorplan [1,3,7–9]. The underlying idea to this methodology is that when one moves a buffer considerably from its optimal location, only a small delay penalty is incurred. As a result, buffers can Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 650 Finals Page 650 29-9-2008 #7 Handbook of Algorithms for Physical Design Automation Assignment of buffers to these locations results in a nonmonotonic path Sink Circuit block A Circuit block B Source Feasible regions of three buffers, two of which have their areas reduced by circuit blocks FIGURE 33.3 2D feasible regions and their implications on buffer assignment. be relocated within their respective feasible regions or independent feasible regions such that they can be clustered together to form buffer blocks. The buffer site methodology puts the onus on block designers to allocate a buffering resource within a block by inserting a buffer site. The allocation of buffer sites within blocks may not be uniform; a low-performance block may accommodate more buffer sites than a high-performance one, and some blocks, such as a cache, may not have any buffer sites. A preallocated buffer site may remain unassigned to a net after planning. In that case, unused buffer sites can be used to accommodate other useful circuit elements, such as decoupling capacitors. To facilitate buffer planning, a chip is typically divided into tiles first. Figure 33.4 shows a tiled chip layout with channel regions, hard blocks, and soft blocks. The capacity of each tile for buffer insertion depends on whether the tile overlaps with channel regions, dead areas, or hard blocks. Channel regions and dead areas of the floorplan have high capacity for buffer insertion. In contrast, hard blocks have very low capacity for buffer insertion unless some buffer sites have been inserted intentionally [2]. As the exact layout of each soft block is yet to be determined, it is typically assumed that as long as the total area of functional units and buffers in a soft block is not larger than its preallocated space, the layout of this block can be completed in the placement stage. For ease of problem formulation, all the tiles in a soft block may be merged together, as in Figure 33.4. The buffer capacity of this merged block tile is the total area less the area consumed by its functional units. It is the responsibility of the placement tool to ensure that buffers are placed at appropriate locations in the physical realization of a soft block. Let VT denote the set of tiles obtained as described in the preceding paragraph. We can construct a tile graph GT (VT , ET ), where every two neighboring tiles u and v in VT are connected by an edge eu,v in ET . For a tile v, let B(v) be the number of buffer sites within v and b(v) be the number of buffers assigned to v. Let W (eu,v ) be the wire capacity of the edge eu,v , and w(eu,v ) denote the actual wire usage of eu,v . It is clear that a buffer planning solution is feasible only if b(v) ≤ B(v) for all v ∈ VT and w(e) ≤ W (e) for all e ∈ ET . 33.2.1 BUFFER BLOCK PLANNING The buffer block planning problem can be informally stated as follows: Given a set of circuit blocks and a set of connections with feasible regions for buffer insertion to satisfy the design constraints (e.g., timing, noise), plan the locations of buffer blocks within the available free space (e.g., dead spaces Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 Finals Page 651 29-9-2008 #8 651 Global Interconnect Planning Hard block Soft block Dead space/ channel region FIGURE 33.4 Tile graph for buffer planning. and channels) so as to route a maximum number of connections. Buffer blocks can be planned after floorplanning [1,3,7,9] or during floorplanning [10–13]. Postfloorplanning buffer block planning is more efficient, but is often limited by the quality of a given floorplan because the location and size of the space for buffer insertion are fixed. Furthermore, as the dead spaces are often treated as undesired cost during floorplanning, they are usually avoided or minimized. As a result, the size and location of dead spaces may not be suitable for postfloorplanning buffer insertion. Therefore, there are also efforts that integrate buffer block planning into floorplanning to fully utilize useful dead spaces for performance optimization. This approach typically enjoys higher design flexibility, but inevitably incurs higher time complexity. Cong, Kong, and Pan first considered postfloorplanning buffer block planning in Ref. [1]; they derived feasible region formulas to determine where to insert buffers to meet timing constraints and proposed a greedy algorithm to plan buffer blocks in a slicing floorplan. Sarkar and Koh also considered routability and addressed the concept of independent feasible regions in Ref. [3]. Moreover, both approaches in Refs. [1,3] expand channels to provide more buffers if necessary. On the basis of a network-flow formulation, Tang and Wong in Ref. [9] optimally planned as many buffers into buffer blocks as possible for all nets, each with at most one buffer. Given an existing buffer block plan, Dragan et al. in Ref. [7] performed buffering of global nets. Nets are routed using available buffer blocks such that required upper and lower bounds on buffer intervals and the wirelength upper bounds per connection are satisfied. We describe the generic approach for postfloorplanning buffer block planning as presented in Ref. [1]. First, we construct a directed horizontal constraint graph and a vertical constraint graph for a given floorplan, denoted by GH and GV , respectively. Each vertex v in GH models a vertical routing channel, and an edge e = (v1 , v2 ) denotes a circuit block whose respective left and right boundaries are adjacent to the routing channels v1 and v2 . The weight of a vertex v, w(v), denotes the corresponding channel width while the weight of an edge e, w(e), represents the corresponding

Handbook of algorithms for physical design automation part 67

Nội dung