Handbook of algorithms for physical design automation part 59.pdf (Sổ tay thuật toán)

Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C027 562 Finals Page 562 23-9-2008 #7 Handbook of Algorithms for Physical Design Automation algorithm enhancements, it is shown how to choose technology constants, how to trade off wirelength and slack, and how to deal with blockages, congestion, and high fanout nets. Experimental results demonstrate that generated trees are of good quality and that the algorithm execution time is extremely small (one million trees are computed in less than a minute; note that if a robust buffer insertion is needed, additional CPU time may be required for postprocess buffering). 27.3 SIMULTANEOUS TREE CONSTRUCTION AND BUFFER INSERTION In this section, some of the methods that combine buffer insertion and topology construction are presented. Most of them belong to the P-Tree class of algorithms. The work started with the first version of the P-Tree algorithm [8] that was designed to construct timing-driven routing tree and it has seen a decade long evolution of various improvements and extensions that were building on the original ideas. These algorithms are designed to handle a variety of challenges seen in modern designs. Simultaneous tree construction and repeater insertion (with multiple buffers and inverters in the library) while being able to optimize multiple objectives, again, simultaneously (delay, cost, congestion, wirelength) are achieved through the core optimization engine. Practical issues such as obstacles (i.e., placement and routing blockages), multilayer routing and vias, and nonorthogonal routing are handled by the capability of the algorithms to work on general graph models as routing targets. Spatial, temporal, and polarity localities (all of them independently) are captured and exploited by implicit specification of the set of tree topologies that will be searched. In the following subsections, we give an overview of the mentioned contributions in chronological order. 27.3.1 P-TREE ALGORITHM The work in Ref. [8] presents an algorithm that constructs rectilinear Steiner trees while explicitly optimizing both delay and wire area. Contrary to the methods presented in the previous section, it introduces the paradigm of finding an optimal solution in the constrained solution space; in other words, the solution search space is defined in advance, and then the algorithm finds an optimal solution in this constrained space. The P-Tree algorithm simultaneously optimizes over topologies and their embeddings. To illustrate the degrees of freedom in embedding a particular topology consider Figure 27.6. The driver is the root, leaves represent sinks, and internal nodes represent Steiner or branching points. However, the branching points do not have defined placement locations. As an example, topology in Figure 27.6a can yield very different embedded solutions as shown in Figure 27.6b and c. Notice that sinks A and B when the topology is embedded (Figure 27.6b and c) are always in the same subtree, as specified by the topology (Figure 27.6a). Thus, how one embeds a particular topology can be of great importance. But what about the topology itself? The P-Tree algorithm does not limit itself to a single topology. Instead, it optimizes D D D C A (a) A C A B B B (b) C (c) FIGURE 27.6 Routing tree topology (a) can have different embedded solutions (b) and (c). Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C027 Finals Page 563 23-9-2008 #8 563 Generalized Buffer Insertion A C A B (a) B D D A C C B (b) D (c) FIGURE 27.7 Topologies (a) and (b) satisfy permutation BDAC while topology (c) does not. over all topologies induced by a sink permutation (a set exponential in size). These topology trees are simultaneously explored and embedded to the routing domain. The routing topologies that are produced are permutation-constrained routing trees (giving rise to the name P-Tree). Given an ordering of sinks (i.e., permutation) a topology satisfies the permutation constraint if some depth-first traversal of the topology tree produces the same sink ordering. As an example, given the permutation BDAC, trees in Figure 27.7a and b satisfy the permutation while the one in Figure 27.7c does not. The method proposed to find a high-quality permutation consists of three steps. The first step constructs a minimum spanning tree (MST) on all pins (both driver and sinks). The next step converts the MST into topology by reorienting MST such that the driver is at the root and binarizing it such that each internal node has exactly two outgoing edges. The last step is to apply the dynamic programming algorithm to optimize the induced permutation further. The proposed approach is to optimize the tour length of the permutation (in the sense of a traveling salesman problem (TSP)). The intuition is that TSP provides good clustering information (i.e., sinks that are close in the placement should be close in the topology as well). In addition, because the permutation is consistent with the MST, it guarantees that the minimum area solution induced by this permutation can be at most 50 percent larger than the optimal Steiner tree. A more detailed description of the algorithm can be also found in Ref. [17]. Once the permutation is obtained, thus defining all topologies to be explored, the algorithm proceeds to topology embedding. The routing target is specified by a Hanan grid (note that in Ref. [10] this step of the flow is redesigned to handle general graph model, giving the capability to account for blockages and congestion). Instead of embedding topologies one at a time, algorithm exploits the structure of permutation constrained topologies and achieves polynomial computational complexity while exploring an exponential search space. For example, topologies in Figure 27.7a and b have identical subtree containing sinks B and D, and there is no need to compute solution for this subtree more than once. The dynamic programming approach proceeds in a bottom-up fashion computing the following solution sets: S(v, i, j) contains signatures∗ of the solutions over all permutation-induced topologies driving sinks from i to j in the permutation that are rooted at the vertex v in the target routing graph. Set Sb (v, i, j) contains signatures of the solutions over all permutation-induced topologies driving sinks from i to j in the permutation that are rooted at the vertex v with a constraint that vertex v is also a branching point. The top-level view of the P-Tree algorithm is given in Figure 27.8. Depending on the solution signature, the algorithm can optimize many objectives. In the P – TreeA mode, solution signature contains only one parameter: total wire capacitance c. This mode is used to optimize wire area. In the P – TreeAT mode, both timing and area are optimized simultaneously. ∗ Solution signature is described in the following paragraph. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C027 564 Finals Page 564 23-9-2008 #9 Handbook of Algorithms for Physical Design Automation P-Tree Algorithm a1 Compute S (v, i, i) a2 for I = 1 to n − 2 do a3 for i = 1 to n − 1 − I do a4 j=i+I a5 Compute S b (v, i, j) a6 Compute S (v, i, j) a7 endfor a8 endfor a9 return S (v d , 1, n − 1) FIGURE 27.8 P-Tree algorithmic framework. Instead of one single final solution, algorithm produces a family of nondominated solutions with area/delay trade-off. In this mode, solution signature is specified by an ordered pair (c, q), where c represents total downstream capacitance, while q represents signal required arrival time. Under the Elmore delay model [2], managing these primitives is done in a way similar to Ref. [1], as explained in Chapter 26. Just as a reminder the primitives include joining, augmenting, merging, and pruning solutions. In the followup work, P – TreeA and P – TreeAT modes are referred to as one-dimensional (1D) and two-dimensional (2D) modes because the algorithm optimizes only one parameter in the 1D mode, and simultaneously optimizes two parameters in the 2D mode. Details about computing S and Sb sets can be found in Ref. [8]. As a side note, computation of the set S is done by performing four sweeps of the routing grid (once in each direction). This step is very efficient under assumption that the routing target is a Hanan grid, but it cannot handle obstacles. Once the set S is computed at the top level S(vd , 1, n − 1), the actual topology embeddings can be obtained by backtracking through the data structure containing these solution signatures. In Ref. [7], the original algorithm is extended to perform simultaneous route construction and buffer insertion. The general flow of the algorithm is almost identical to Ref. [8], except that each internal vertex in the topology is also considered as a buffer insertion candidate location. However, some adjustments had to be made to the solution signature and, because of that, to all primitives that operate on them as well. Because inserting a buffer decouples downstream load, and effectively resets the load visible from the parent vertex to the input capacitance of the inserted buffer, the load value cannot be used for cost estimation (but remains necessary for delay estimation). Thus, a third parameter that represents solution cost (e.g., area, power, congestion, or any composite function of those) is added to the solutions signature. The solution signature now becomes a triple (p, c, q). More details about the three-dimensional (3D) (simultaneously optimizing three parameters) version of the primitives can be found in Ref. [12]. Also, methods from Ref. [12] can be applied here to extend the algorithm to support multiple buffers and inverters in the library. Although the algorithm’s computational complexity is O(n5 ) in the 1D mode, and pseudopolynomially bounded in the 2D mode, it produces solutions of very high quality in terms of both cost and delay. The algorithm may not be suitable for optimizing every single net in the design, but using this approach to optimize a smaller number of highly critical nets can be very beneficial, especially when one considers that the algorithm produces a family of solutions with delay/cost trade-off giving more choices in the overall design optimization process. Experimental results support these claims. A detailed description of the algorithm and its extensions that were summarized above can be also found in Ref. [17]. 27.3.2 S-TREE ALGORITHM The work in Ref. [11] adopts the overall philosophy of P-Tree while improving runtime and scalability, introducing a general strategy (stitching) for incorporating multiple measures of locality between Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C027 Finals Page 565 23-9-2008 #10 565 Generalized Buffer Insertion sinks and subtrees (e.g., temporal, polarity), and finally adopting a general graph model as the routing target enabling natural solutions for obstacles, multilayer routing, etc. Although the P-Tree topology space is very large and has a very good ability to capture spatial sink locality, it may create solutions of higher cost in certain scenarios that include, for example, highly critical sinks physically located among noncritical sinks. To address this issue, the S-Tree defines its topology search set using a topology tree and a sink partition. Sink partition specifies which sinks can break the original topology tree and be stitched to any other part of the topology while maintaining the order within its own partition set. The topology set is then composed of all valid stitching trees (thus, giving the name S-Tree). In Figure 27.9, given the topology on the left and a sink partition {{A, C, D, F}, {B, E}}, one can find all possible topologies that satisfy the stitching requirement. Assuming that sinks B and E are critical, note how they are allowed to break away from noncritical sinks and climb toward the root. The fact that these topologies are in the solution space guarantees only that they will be explored but whether they will be chosen at the end depends on the cost and quality of the embedded trees. In the extreme cases, if all sinks belong to a single partition, S-Tree is equivalent to a single topology tree embedding. The other extreme case is if each sink belongs to its own partition, which explores all possible tree topologies in a way similar to Ref. [9]. The initial topology is obtained as in P-Tree, using the first two steps only: construct an MST and convert it to a topology tree by reorientation and binarization. The sink partitions can be determined based on estimated timing criticality, input signal polarity requirements, or any other criteria. Once the topology set is defined, the algorithm proceeds with the dynamic programming algorithm that performs topology set embedding. In the embedding process, no sink (nor a group of sinks) receives any special treatment. Modifications to the topology embedding algorithm include support of a general graph model for a routing target. Propagation of candidate solutions through the routing graph is performed using timing-driven maze routing approach with simultaneous buffer insertion as in Ref. [14]. Note that solution quality depends a lot on the routing target graph. Careful graph construction is a key to S S B S A C C F D A E B F F C B D S S A E D E A C B E B D F A C D E F FIGURE 27.9 S-Tree topology space example. The initial topology tree on the left and a sink partition {{A, C, D, F}, {B, E}} define a set of five different topologies that will be explored simultaneously. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C027 566 Finals Page 566 23-9-2008 #11 Handbook of Algorithms for Physical Design Automation the high-quality solution, but one should have in mind that the size of the graph has a large impact on the execution time. Some suggestions how to construct the routing target graph can be found in Refs. [9–11,14]; however, the algorithm itself is designed to work on general graph model, thus, the routing target graph may be constructed by any other means. Optimization modes include all three modes seen in the P-Tree algorithm (1D, 2D, and 3D). Experimental results demonstrate the effectiveness of the approach in identifying solutions with very good timing and low cost. Also, the algorithm has very good predictability characteristics. 27.3.3 SP-TREE ALGORITHM The SP-Tree algorithm [10], as the name suggests (the stitching permutation constrained trees), combines the topology tree spaces of P-Tree and S-Tree. After obtaining the sink permutation and constructing a set of topology trees, sink partitions are specified and the entire set of topology trees is expanded by the stitching operation, producing a new larger set of tree topologies. The SP-Tree topology set contains both the P-Tree and the S-Tree topologies in addition to some new topologies that are not contained in either P-Tree or S-Tree. The topology set embedding algorithm is identical to that of the S-Tree with the main difference in the construction of the topology tree set that is provided as an input to the embedder. More details about both algorithms can be found in Ref. [18], while executable solvers and examples can be found at Ref. [19]. Algorithms like buffered P-Tree, S-Tree, and SP-Tree, based on a general graph model as a routing target, provide high-quality solutions and have large flexibility with different optimization objectives. As expected, experimental results demonstrate that SP-Tree flow produces solutions that are always better or equal to those of both S-Tree and P-Tree, usually with a smaller cost, but at a runtime penalty. 27.3.4 COMPLETE TREE TOPOLOGY EXPLORATION For the experimental purposes, one can construct a tree topology set that contains all possible tree topologies. This is similar to topology decomposition from Ref. [20]. One can achieve that either by using the S-Tree and requesting that each sink belongs to its own partition, or by using a specific approach as in Ref. [9]. Embedded routing trees obtained in this fashion are indeed optimal, but the computational complexity prevents any practical applications, although they are valuable for research purposes and evaluation of other heuristic approaches. REFERENCES 1. L. P. P. P. van Ginneken, Buffer placement in distributed RC-Tree networks for minimal elmore delay, in Proceedings of the IEEE International Symposium on Circuits and Systems, New Orleans, LA, May 1990, pp. 865–868. 2. W. C. Elmore, The transient response of damped linear networks with particular regard to wideband amplifiers, Journal of Applied Physics, 19: 55–63, Jan. 1948. 3. C. J. Alpert, G. Gandham, M. Hrkić, J. Hu, A. B. Kahng, J. Lillis, B. Liu, S. T. Quay, S. S. Sapatnekar, and A. J. Sullivan, Buffered Steiner trees for difficult instances, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 21(1): 3–14, Jan. 2002. 4. C. J. Alpert, G. Gandham, M. Hrkić, J. Hu, and S. T. Quay, Porosity aware buffered Steiner tree construction, in Proceedings of the ACM International Symposium on Physical Design, Monterey, CA, Apr. 2003, pp. 158–165. 5. C. J. Alpert, M. Hrkić, J. Hu, and S. T. Quay, Fast and flexible buffer trees that navigate the physical layout environment, in Proceedings of the 41st Design Automation Conference, San Diego, CA, Jun. 2004, pp. 24–29. 6. C. Bartoschek, S. Held, D. Rautenbach, and J. Vygen, Efficient generation of short and fast repeater tree topologies, in Proceedings of the ACM International Symposium on Physical Design, San Jose, CA, Apr. 2006, pp. 120–127. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C027 Generalized Buffer Insertion Finals Page 567 23-9-2008 #12 567 7. J. Lillis, C. K. Cheng, and T. T. Y. Lin, Simultaneous routing and buffer insertion for high performance interconnect, in Proceedings of the 6th IEEE Great Lakes Symposium on VLSI, Ames, IA, Mar. 1996, pp. 148–153. 8. J. Lillis, C. K. Cheng, T. T. Y. Lin, and C. Y. Ho, New performance driven routing techniques with explicit area/delay tradeoff and simultaneous wire sizing, in Proceedings of the 33rd Design Automation Conference, Las Vegas, NV, Jun. 1996, pp. 395–400. 9. J. Cong and X. Yuan, Routing tree construction under fixed buffer locations, in Proceedings of the 37th Design Automation Conference, Los Angeles, CA, Jun. 2000, pp. 379–384. 10. M. Hrkić and J. Lillis, Buffer tree synthesis with consideration of temporal locality, sink polarity requirements, solution cost and blockages, in Proceedings of the ACM International Symposium on Physical Design, Del Mar, CA, Apr. 2002, pp. 98–103. 11. M. Hrkić and J. Lillis, S-Tree: A technique for buffered routing tree synthesis, in Proceedings of the 39th Design Automation Conference, New Orleans, LA, Jun. 2002, pp. 578–583. 12. J. Lillis, C. K. Cheng, and T. T. Y. Lin, Optimal wire sizing and buffer insertion for low power and a generalized delay model, IEEE Journal of Solid-State Circuits, 31(3): 437–447, Mar. 1996. 13. C. J. Alpert, T. C. Hu, J. H. Huang, A. B. Kahng, and D. Karger, Prim-Dijkstra tradeoffs for improved performance-driven routing tree design, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 14(7): 890–898, Jul. 1995. 14. S. W. Hur, A. Jagannathan, and J. Lillis, Timing-driven maze routing, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 19(2): 234–241, Feb. 2000. 15. A. Jagannathan, S. -W. Hur, and J. Lillis, A fast algorithm for context-aware buffer insertion, in Proceedings of the 37th Design Automation Conference, Los Angeles, CA, Jun. 2000, pp. 368–373. 16. H. Zhou, D. F. Wong, I. M. Liu, and A. Aziz, Simultaneous routing and buffer insertion with restrictions on buffer locations, in Proceedings of the 36th Design Automation Conference, New Orleans, LA, Jun. 1999, pp. 96–99. 17. J. Lillis, Algorithms for Performance Driven Design of Integrated Circuits, 1996. Available at http://www.cs.uic.edu/ ˜jlillis/papers/thesis.ps. 18. M. Hrkić and J. Lillis, Buffer tree synthesis with consideration of temporal locality, sink polarity requirements, solution cost, congestion and blockages, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 22(4): 481–491, Apr. 2003. 19. M. Hrkić and J. Lillis, GSRC Single Interconnect Tree Synthesis Web Page. Available at http://eda.cs. uic.edu/software/interconnect/gsrc.html. 20. S. E. Dreyfus and R. A. Wagner, The Steiner problem in graphs, Networks, 1: 195–208, 1972. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C027 Finals Page 568 23-9-2008 #13 Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C028 Finals Page 569 30-9-2008 #2 in the Layout 28 Buffering Environment Jiang Hu and Cliff C. N. Sze CONTENTS 28.1 Introduction.. . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.2 Placement and Routing Blockages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.3 Buffered Path with Blockage Avoidance .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.3.1 Dynamic Programming-Based Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.3.2 Graph-Based Approach .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.4 Buffered Tree with Blockage Avoidance.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.4.1 Tree Adjustment Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.4.2 Simultaneous Tree Construction and Buffer Insertion . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.4.2.1 Dynamic Programming-Based Method . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.4.2.2 Graph-Based Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.5 Layout Environment Aware Buffered Steiner Tree .. . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.5.1 Measurement of Placement and Routing Congestion . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.5.2 Plate-Based Tree Adjustment.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.5.2.1 Dynamic Programming-Based Adjustment .. . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.5.2.2 Hybrid Approach for Tree Adjustment.. . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.5.3 Layout Navigation . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 28.5.4 Relating Buffering Candidate Locations to Layout Environment . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 569 569 571 571 572 574 574 574 575 575 577 577 578 578 579 581 582 583 28.1 INTRODUCTION Chapters 26 and 27 presented buffering algorithms where the buffering problem was isolated from the general problem of timing closure. The main problem with this extraction is that there are no necessarily resources available to put the buffers or wires in their desired locations. One way to manage this flow is to put the buffers in their ideal locations and allow a legalization procedure to move them to actual locations. The problem with this approach is that buffers may be moved quite far from their ideal locations, which could completely corrupt the quality of the solution. It is much better to place the buffers in regions where there appear to be sufficient space, so that legalization would move the buffers by at most 10–20 routing tracks, which would preserve the original solution. To do this, one must certainly take into account the blockages and preferably local placement and routing congestion. In this chapter, we explore techniques that consider these factors. 28.2 PLACEMENT AND ROUTING BLOCKAGES In realistic chip designs, some regions may be occupied by IP blocks, memory arrays, and macros. Such regions allow wires to pass through but have no room for buffer insertion. Therefore, buffer insertion has to be performed with consideration of these buffer blockages. If a wire path has large 569 Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C028 570 Finals Page 570 30-9-2008 #3 Handbook of Algorithms for Physical Design Automation (a) (b) (c) FIGURE 28.1 (a) The wire path has large overlap with the buffer blockage, which is the shaded region, and there is no feasible buffering solution. (b) Rerouting the wire to completely avoid the blockage may result in large wire detour. (c) Ideally, the wire path should largely avoid the blockage with limited detour. overlap with blockages as in Figure 28.1a, no feasible buffering solution can be found. However, avoiding buffer blockages completely as Figure 28.1b may cause unnecessary wire detour as well as delay degradation. Hence, it is important to have algorithmic techniques that can find a proper routing topology (as Figure 28.1c) together with the buffer insertion solution. Another more intuitive example was previously described in Figure 27.4. To find the best buffered interconnect solution, the routing must consider feasible buffer insertion locations. A similar problem is buffer insertion considering placement and routing congestions. If a buffer is inserted in a crowded place as in Figure 28.2a, it might be moved far away later during cell placement legalization. Thus, such insertion is not favorable although it is feasible, i.e., buffers are preferred to be inserted in relatively sparse regions like Figure 28.2b. Similarly, buffer insertion in a routing congested region may intensify the wire routability problem. Hence, buffer insertion algorithms need to be aware of layout environment and be able to handle the trade off between timing performance and congestion avoidance. Section 28.3 introduces algorithms on blockage avoidance for two-pin nets, i.e., buffered paths. Blockage buffered Steiner tree methods for multipin nets are described in Section 28.4. In Section 28.5, techniques for handling congestions are discussed. (a) (b) FIGURE 28.2 (a) If a buffer is placed in a crowded region, it may be moved by placement legalizer far away from its preferred location. (b) If the buffer is placed in a sparse region, its location will not be significantly changed by placement legalization. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C028 Finals Page 571 30-9-2008 #4 571 Buffering in the Layout Environment 28.3 BUFFERED PATH WITH BLOCKAGE AVOIDANCE For two-pin nets, the problem is to find a buffered path with the minimum delay under the blockage constraints. In the literature, there exists two major approaches: (1) dynamic programming and (2) graph-based algorithm. Both are based on the Elmore delay model. It has not been showed that these approaches are fast enough to be practical for general optimization cases, but they can be very useful to a handful of the most critical nets. 28.3.1 DYNAMIC PROGRAMMING-BASED METHOD The dynamic programming-based method [1,2] propagates partial solutions from the sink node t through a routing graph G = (V , E) and picks the optimal solution at the source node s. The routing graph can be either a uniform grid reflecting routing tracks (Figure 28.3a) or an extended Hanan grid, which is obtained by drawing vertical and horizontal lines through the given pins and blockage boundaries (Figure 28.3b). For each edge (u, v) ∈ E, R(u, v) and C(u, v) are the edge resistance and capacitance, respectively. For each node v ∈ V , there is a label p(v) ∈ {0, 1} which is equal to 0 if it overlaps with a buffer blockage and equal to 1, otherwise. Besides the routing graph, the driver resistance Rd , sink capacitance Ct , and a buffer library B are assumed to be given. Each buffer type b ∈ B is modeled by its input capacitance C(b), intrinsic delay K(b), and output resistance R(b). A partial solution at a node v is characterized by a quadruple α = (c, d, m, v), where c is the current input capacitance seen at v, d is the delay from v to the sink t, and m is a labeling for the buffered path from v to t. The label of m(v) = b indicates that buffer b ∈ B is inserted at node v and m(v) = 0 implies that no buffer is inserted there. The solution α1 = (c1 , d1 , m1 , v) is inferior to solution α2 = (c2 , d2 , m2 , v) if c1 ≥ c2 and d1 ≥ d2 . The partial solutions are maintained in a priority queue Q initialized with the solution (Ct , 0, 0, t) at the sink t. Each time, the top solution in Q, which has the minimum delay, is extracted for expansion. A solution (c, d, m, u) is expanded to its neighbor node v if there is an edge (u, v) ∈ E. The expanded solution is (c + C(u, v), d + R(u, v) · (c + C(u, v)/2), m, v) where the delay increase is based on the Elmore delay model. If a solution (c, d, m, v) is at node v where m(v) = 0 and p(v) = 1, buffers of each type are inserted there to generate new partial solutions. If the buffer type is b ∈ B, its corresponding buffered solution is (C(b), d + R(b) · c + K(b), m, v) t s (a) FIGURE 28.3 Routing graph. (b)

Handbook of algorithms for physical design automation part 59

Nội dung