Handbook of algorithms for physical design automation part 62.pdf (Sổ tay thuật toán)

Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C029 592 Finals Page 592 29-9-2008 #9 Handbook of Algorithms for Physical Design Automation convexifying the look-up table data with minimum perturbation can be formulated as a convex semidefinite optimization [Roy 2005] problem and hence optimality can be reached in polynomial time. Thus, given a numerical function g(x) for the original delay, let f (x) = g(x) + δ(x). δ(x) is the perturbation of g(x), and f (x) is the transformed function. Any function φ(x) is convex if and only if the Hessian matrix ∇ 2 φ(x) 0 for all x ∈ DOMφ. (∇ 2 φ(x) 0 means the Hessian of φ(x) is positive semidefinite, i.e., all the eigenvalues of the Hessian are greater than or equal to zero.) Thus, the fitting problem is to minimize δ(x) to make the Hessian of f (x) positive semidefinite. The problem is defined as follows: minimize |δ (x)| x∈DOMg subject to ∇ 2 (g (x + δ(x))) ≥ 0, x ∈ DOMg 29.3.5 SEQUENTIAL QUADRATIC PROGRAMMING ALGORITHM The convex optimization problem of concurrent gate and wire sizing can also be solved using the sequential quadratic programming (SQP) method [Menezes 1997, Chu 1999b]. SQP reduces a nonlinear optimization to a sequence of quadratic programming (QP) subproblems. A general convex quadratic program can be represented as X T QX + X T C minimize 1 2 subject to ATi X ≤ bi , i∈I where Q is a symmetric positive semidefinite matrix I is the set of inequalities Now if we want to minimize a function F(X) subject to the constraints hi (X) ≤ 0, i = 1 . . . m, then we can express the Lagrangian of F(x) as L(X, λ) = F(X) + m λi hi (X) (29.1) i=1 where λi is the Lagrange multiplier associated with the ith constraint. Now, if G(X) = ∇F(x) be the gradient of the objective function, the original optimization problem can be solved by solving a sequence of QP subproblems as shown below: minimize 1 2 (X − X0 )T B(X0 ) (X − X0 ) + (X − X0 )T G(X0 ) subject to (X − X0 )T ∇hi (X0 ) + hi (X0 ) ≤ 0, i = 1...m where X0 is the solution of the previous QP iteration B(X0 ) is the approximation of the Hessian of the Lagrangian 29.3.6 VARIATIONAL CALCULUS-BASED NONUNIFORM SIZING ALGORITHM All the wire-sizing techniques presented so far are uniform wire-sizing techniques. Now we illustrate a case of nonuniform wire sizing. Figure 29.7 shows a nonuniform wire segment W of length L, with source driver resistance Rd , and sink load capacitance CL . Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C029 Finals Page 593 29-9-2008 #10 593 Wire Sizing Driver Wire Load f(x) Rd CL L x 0 FIGURE 29.7 Nonuniform wire sizing function. For each x ∈ [0, L], let f (x) be the wire width of W at position x. Let the wire resistance and capacitance per unit square be r0 and c0 , respectively. Let t be the Elmore delay from the source to the sink of W . Then the optimal wire-sizing functionf that minimizes t is given by f (x) = ae−bx . a > 0 and b > 0 are constants given by a = bRr0 , b Rrd Cc L − e(−bL)/2 = 0. This can be proved by d 0 0 using variational calculus [Lee 2002]. In case of constrained wire sizing, where the wire widths are bounded by L ≤ f (x) ≤ U, 0 ≤ x ≤ L, the wire sizing solution will be a truncated version of ae−bx as shown in Figure 29.8. This formula can be iteratively applied to optimally size the wire segments in a routing tree. 29.3.7 OPTIMAL PROPAGATION SPEED WITH WIRES Nonuniform wire sizing is not used widely because routing such wires is nontrivial, and it can also lead to poor track utilization. If we get a reasonably good solution by uniform wire sizing, buffer insertion, and gate sizing, it may not be worthwhile to spend a high effort in routing nonuniform wires if the delay improvement is marginal. Figure 29.9 shows two wire-sizing solutions Figure 29.9a showing optimal uniform wire sizing with buffering and Figure 29.9b showing optimal nonuniform wire-sizing solution with buffering. It has been shown that the ratio of maximum attainable signal velocities of the optimal nonuniform wire-sizing configuration to the optimal uniform wire-sizing configuration is 1.0354 with full buffering [Alpert 2001]. This means that theoretically, tapering in the best case only gives an improvement of 3.54 percent over uniform wire sizing, and this ratio is independent of technology parameters. Hence tapering only gives a small performance gain in the best case. 29.3.8 HIGH-ORDER MOMENT-BASED ALGORITHM EWA [Kay 1998] or efficient wire-sizing algorithm is an example of an algorithm heuristic for minimizing the total wiring area of an interconnect tree, subject to hard constraints on the Elmore delay. This algorithm can use the Elmore delay model or can be extended to use higher order delay models. Driver Wire Load f (x) = ae−bx U Rd FIGURE 29.8 Optimal wire sizing. 0 L x L CL Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C029 594 Finals Page 594 29-9-2008 #11 Handbook of Algorithms for Physical Design Automation Lu Lu w Lu w Lu w w (a) Lt (b) FIGURE 29.9 Optimal propagation speed. 29.4 SIGNAL INTEGRITY OPTIMIZATION ALGORITHM Some other advances in interconnect optimization include noise-aware repeater insertion and wire sizing. In the following section we describe an algorithm for noise-aware optimization. 29.4.1 NOISE AWARE OPTIMIZATION Noise aware optimization is a hierarchical and accurate noise estimation algorithm [Chen 1999] which can handle arbitrarily shifted attacking noise waveforms. Moment-matching techniques are used for accurate RC delay estimation. The transfer functions between nodes i and j, and nodes j and k in Figure 29.10 are computed hierarchically. The delay tik is computed by convolution of the input signal with the composite transfer function up to node k. During backward propagation of a pair at node j, the transfer function Hik (s) is computed. The electrical models for computation of Hjk (s) and Yi (s) are shown in Figure 29.10b. They are calculated as follows: Hij (s) = 1 R(Cs + Yj (s)) + 1 Yi (s) = Cs + Cs + Yj (s) R(Cs + Yj (s)) + 1 where R = Ri and C = Ci /2. The RC delay is computed by the convolution of the waveform at i and Hik (s). The moments are then stored in the pair at node i. Hik (s) = f (Hjk (s), Yj (s), Ri , Ci) ei i Hjk (s) = m0 + m1s + m1s 2 + … j Yi (s) k Yj (s) tij (a) i Ci 2 Ri j Ci 2 (b) FIGURE 29.10 Hierarchical moment computation. Hjk (s) Yj (s) k Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C029 Wire Sizing Finals Page 595 29-9-2008 #12 595 The hierarchical moment generation for the transfer function and input admittance always starts from either a receiver or a repeater. For this base case, if c represents the receiver/repeater input capacitance, the moment representation of the transfer function and admittance is given by H(s) = 1, and Y (s) = cs. Wire sizing can be handled during this step by backward propagation of the pairs from node j to node i for different wire widths of segment ei . Ri and Ci are functions of the wire width. REFERENCES [Alpert 2001] [Chen 1998] [Chen 1999] [Chen 1997] [Chen 1996] [Chu 1999a] [Chu 1999b] [Cong 1996a] [Cong 1994] [Cong 1996b] [Cong 1993] [Gao 1999] [Kay 1998] [Lee 2002] [Lillis 1995] [Menezes 1997] C.J. Alpert, A. Devgan, J.P. Fishburn, and S.T. Quay, Interconnect synthesis without wire tapering, IEEE Transactions on Computer Aided Design of Intergrated Circuits and Systems, 20(1), 90–104, January 2001. C.P. Chen, C.C.N. Chu, and D.F. Wong, Fast and exact simultaneous gate and wire sizing by Lagrangian relaxation, in IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, November 1998, pp. 617–624. C.P. Chen and N. Menezes, Noise-aware repeater insertion and wire sizing for on-chip interconnect using hierarchical moment-matching, in Proceedings of the 36th Design Automation Conference, New Orleans, LA, June 1999, pp. 502–506. C.P. Chen and D.F. Wong, Optimal wire-sizing function with fringing capacitance consideration, in Proceedings of the 34th Design Automation Conference, Anaheim, CA, June 1997, pp. 604–607. C.P. Chen, H. Zhou, and D.F. Wong, Optimal non-uniform wire-sizing under the Elmore delay model, in IEEE/ACM International Conference on Computer-Aided Design, San Jose, CA, November 1996, pp. 38–43. C.C.N. Chu and M.D.F Wong, Greedy wire-sizing is linear time, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 18(4), 398–405, April 1999. C.C.N. Chu and D.F. Wong, A quadratic programming approach to simultaneous buffer insertion/sizing and wire sizing, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 18(6), 787–798, June 1999. J. Cong, L. He, C.K. Koh, and P.H. Madden, Performance optimization of VLSI interconnect layout, Integration, the VLSI Journal, 21(1–2), 1–94, November 1996. J. Cong and C.K. Koh, Simultaneous driver and wire sizing for performance and power optimization, in Proceedings of the IEEE/ACM International Conference on ComputerAided Design, San Jose, CA, November 1994, pp. 206–212. J. Cong, C.K. Koh, and K.S. Leung, Simultaneous buffer and wire sizing for performance and power optimization, in International Symposium on Low Power Electronics and Design, Monterey, CA, August 1996, pp. 271–276. J. Cong and K.S. Leung, Optimal wiresizing under the distributed Elmore delay model, in Proceedings of the IEEE/ACM International Conference on Computer Aided Design, Santa Clara, CA, 1993, pp. 634–639. Y. Gao and D.F. Wong, Wire-sizing optimization with inductance consideration using transmission-line model, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 18(12), 1759–1767, December 1999. R. Kay and L.T. Pileggi, EWA: Efficient wiring-sizing algorithm for signal nets and clock nets, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 17(1), 40–49, January 1998. Y. Lee, C.C.P. Chen, and D.F. Wong, Optimal wire-sizing function under the Elmore delay model with bounded wire sizes, in IEEE Transactions on Circuits and Systems-I, 49(11), 1671–1677, November 2002. J. Lillis, C.K. Cheng, and T.T.Y. Lin, Optimal and efficient buffer insertion and wire sizing, in Proceedings of the IEEE Custom Integrated Circuits Conference, Santa Clara, CA, May 1995, pp. 259–262. N. Menezes, R. Baldick, and L.T. Pileggi, A sequential quadratic programming approach to concurrent gate and wire sizing, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 16(8), 867–881, August 1997. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C029 596 Finals Page 596 29-9-2008 #13 Handbook of Algorithms for Physical Design Automation [Odabasioglu 1998] A. Odabasioglu, M. Celik, and L.T. Pileggi, PRIMA: Passive reduced-order interconnect macromodeling algorithm, IEEE Transactions on Computer Aided Design of Intergrated Circuits and Systems, 17(8), 645–654, August 1998. [Pillage 1990] L.T. Pillage and R.A. Rohrer, Asymptotic waveform evaluation for timing analysis, IEEE Transactions on Computer Aided Design, 9(4), 352–366, April 1990. [Roy 2005] S. Roy, W. Chen, and C.C.P. Chen, ConvexFit: An optimal minimum-error convex fitting and smoothing algorithm with application to gate sizing, in Proceedings of the International Conference on Computer Aided Design, San Jose, CA, November 2005 pp. 196–203. [Sapatnekar 1996] S.S. Sapatnekar, Wire sizing as a convex optimization problem: Exploring the areadelay tradeoff, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 15(8), 1001–1011, August 1996. [Tsai 2004] J.L. Tsai, T.H. Chen, and C.C.P. Chen, Zero skew clock-tree optimization with buffer insertion/sizing and wire sizing, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 23(4), 565–572, April 2004. [Zhang 2004] L. Zhang, Z. Luo, X. Hong, Y. Cai, S.X.D Tan, and J. Fu, Optimal wire sizing in early-stage design of on-chip power/ground (P/G) networks, in Proceedings of the 7th International Conference on Solid-State and Integrated Circuits Technology, 3, 1936–1939, October 2004. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_S006 Finals Page 597 24-9-2008 #2 Part VI Routing Multiple Signal Nets Alpert/Handbook of Algorithms for Physical Design Automation AU7242_S006 Finals Page 598 24-9-2008 #3 Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C030 Finals Page 599 19-9-2008 #2 of Routing 30 Estimation Congestion Rupesh S. Shelar and Prashant Saxena CONTENTS 30.1 Introduction.. . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 30.2 Postrouting Congestion Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 30.3 Placement-Level Congestion Estimation.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 30.3.1 Fast Metrics for Routing Congestion.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 30.3.2 Probabilistic Estimation Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 30.3.3 Estimation Based on Fast Global Routing . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 30.3.3.1 Comparison of Fast Global Routing with Probabilistic Methods . . . . . . . 30.4 Congestion Metrics for Technology Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 30.5 Congestion Metrics for Logic Synthesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 30.6 Final Remarks . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . 599 600 601 601 602 606 607 608 610 611 612 30.1 INTRODUCTION A design is said to exhibit routing congestion if the demand for the routing resources in some region within its layout exceeds their supply. Congestion is undesirable because it can degrade the performance and the yield of a design, and can add uncertainty to its convergence. With wire delays no longer being insignificant in modern process technologies, an unexpected increase in the delay of a net that lies on a critical path can cause a design to miss its frequency target. The routing of a net passing through a congested region may be detoured significantly, or forced to use the more resistive metal layers. Consequently, the delay estimates for nets that pass through congested regions are often erroneous. These estimates may mislead the design optimization trajectory by failing to correctly identify the truly critical paths, thus aggravating the design convergence problem. A densely congested design is also likely to result in a lower manufacturing yield than a similar uncongested design. Congestion typically results in an increased number of vias in the routes, which can affect the yield. Additionally, congested layouts tend to have larger critical areas for the creation of shorts and opens because of random defects. Furthermore, it can be shown using first-order scaling models that the congestion problem is likely to worsen in the future, as design sizes increase and process geometries shrink [SSS07]. As a result, it is desirable to minimize the routing congestion in a design. Congestion can be measured accurately only after the routing has been completed. However, if the design exhibits congestion problems at that stage, mere rerouting of the nets may not be able to resolve these problems. This may necessitate a new design iteration with changes being made to the placement or to the netlist. However, one has to be able to measure routing congestion before one can optimize it. This chapter describes the measurement of congestion at all levels of abstraction, from a routed layout up to a multilevel Boolean network. 599 Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C030 600 Finals Page 600 19-9-2008 #3 Handbook of Algorithms for Physical Design Automation The rest of this chapter is organized as follows. Section 30.2 describes the postrouting metrics for congestion, and Section 30.3 discusses placement-level congestion estimation. Congestion metrics at the technology mapping level are covered in Section 30.4, whereas those that serve as proxies for congestion during logic synthesis are presented in Section 30.5. Finally, some closing remarks are presented in Section 30.6. 30.2 POSTROUTING CONGESTION METRICS Before discussing the metrics used to measure postrouting congestion, it is useful to describe the underlying routing model. As was discussed in Section 23.2.2, the entire routing space is usually tessellated into a grid array. The small subregions created by this tessellation of the routing region have variously been referred to as grid cells, global routing cells, or bins. The bins are usually gridded employing horizontal and vertical gridlines, referred to as routing tracks, along which wires can be created. The dual graph of the tessellation is the routing graph. In this graph, each vertex represents a bin and each edge denotes the boundary between the bins corresponding to its vertices. Routing graphs used for congestion estimation may bundle the horizontal (vertical) routing tracks on all the layers, or they may distinguish individual metal layers to identify the congestion on each layer. The number of tracks available in a bin denotes the supply of routing resources for that bin; this number is also known as the capacity of the bin. Similarly, the number of tracks crossing a bin boundary is referred to as the supply or the capacity of the routing graph edge corresponding to that boundary. A route passing through a bin (or crossing a bin boundary) requires a track in either the horizontal or the vertical direction. Thus, each such route contributes to the routing demand for that bin (or edge). Further details on capacity computation may be obtained in Section 23.3. One of the metrics commonly used to gauge the severity of routing congestion is the track overflow that measures the number of extra tracks required to route the wires in a bin. It can be defined formally∗ as follows: Definition 1 The horizontal (vertical) track overflow Txv (Tyv ) for a given bin v is defined as the difference between the number of horizontal (vertical) tracks required to route the nets through the bin and the available number of horizontal (vertical) tracks when this difference is positive, and zero otherwise. In other words, T v = max{[demand(v) − supply(v)], 0}. The formal definition of the congestion metric is as follows: Definition 2 The horizontal (vertical) congestion Cxv (Cyv ) for a given bin v is the ratio of the number of horizontal (vertical) tracks required to route the nets assigned to that bin to the number of horizontal (vertical) tracks available. Thus, the congestion in a given bin is simply the ratio of the demand of the tracks to their supply in that bin, and can be written as C v = demand(v) . The overflow and congestion metrics can be defined supply(v) similarly for the bin boundaries (or equivalently, for the routing graph edges). These definitions can also be extended to consider each routing layer individually. The notion of a congestion map is often used to obtain the complete picture of routing congestion over the entire routing area. The congestion map is a three-dimensional array of congestion twotuples indexed by bin locations and can be visualized by plotting congestion on the z-axis while ∗ Throughout this chapter, whenever the routing direction is left unspecified in some equation or discussion, it is implied that the equation or discussion is equally applicable to both the horizontal and the vertical directions. Thus, for instance, the notation T v in a statement implies that the statement is equally applicable to both Txv and Tyv . Similarly, if the bin to which a congestion metric pertains is clear from the context, it may be dropped from the notation. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C030 Estimation of Routing Congestion Finals Page 601 19-9-2008 #4 601 denoting bins on the xy-plane. Such a visualization helps designers easily identify densely congested areas (that correspond to peaks in the congestion map). Some other commonly used metrics that capture the overall routability of the design rely on scalar values (in contrast to three-dimensional congestion map vectors). These metrics include the total track overflow, maximum congestion, and the number of congested bins. The total track overflow is defined as the sum of the individual track overflows in all the bins. The maximum congestion is defined as the maximum of the congestion values over all the bins. The number of congested bins is defined as the number of bins whose congestion is greater than some specified threshold Cth . 30.3 PLACEMENT-LEVEL CONGESTION ESTIMATION Most industrial congestion-aware physical synthesis flows rely on improving the routability of a design during the placement stage itself. However, for a placement algorithm to be congestion aware, it must first be able to evaluate whether a given placement configuration is likely to be congested after routing, as well as discriminate between any two placement configurations based on their expected congestion. Different congestion metrics involve different trade-offs between the computational overhead required for their estimation and the accuracy that they can provide. They range from quick-and-dirty proxies for congestion, such as the total wirelength, to expensive but accurate congestion prediction techniques such as probabilistic estimation or fast global routing. The quick-and-dirty metrics are often employed during the early stages of placement, whereas the expensive but accurate ones are better suited to the later stages, when the the placement is relatively stable. 30.3.1 FAST METRICS FOR ROUTING CONGESTION The fast placement-level metrics for congestion include the total wirelength, the pin density, and the perimeter degree. They are best used by fast congestion analyzers embedded within optimizers during the early stages of global placement. During these applications, their fidelity to the actual congestion can help choose between alternative optimization moves based on their expected congestion impact, without incurring a significant runtime overhead. Traditionally, placers have targeted the minimization of cost functions involving wirelength in the belief that the optimization of the wirelength also leads to a reduction in the average congestion. The length of a net can be estimated using metrics such as the half-rectangle perimeter (HRPM) of its bounding box or the length of a minimum spanning tree (MST) for the net. However, this metric does not capture the spatial aspects (i.e., the locality) of the congested regions. A design can easily have low average congestion and yet have a few densely congested bins that may be very difficult to route successfully. Moreover, the predicted netlength of a given net can be quite erroneous because it ignores congestion-caused detours and uses simplistic topology generation, and because the placement itself may change during the remainder of the physical synthesis flow. Consequently, the HRPM metric is often preferred to the slightly more accurate but slower MST scheme. Indeed, the accuracy of the HRPM metric can be improved by the use of an empirical multiplicative factor depending on the pin count for the net, to compensate for its tendency to underestimate the netlength for multipin nets [Che94]. Two other fast metrics that have been used for congestion optimization during placement are the pin density and the perimeter degree. Unlike the total wirelength, which is a scalar that characterizes the entire design, these metrics are good at identifying the specific bins that are likely to be congested. The pin density metric is defined for a bin as the ratio of the number of pins in the bin to the area of the bin [HM02]. This metric captures the contributions of the intrabin nets and those interbin nets that have at least one pin within the bin. It, however, ignores the global wires that are routed through the bin but do not connect to any pins inside the bin, even though they consume routing resources within the bin. The perimeter degree of a bin is defined as the ratio of the number of interbin nets that

Handbook of algorithms for physical design automation part 62

Nội dung