Handbook of algorithms for physical design automation part 68.pdf (Sổ tay thuật toán)

Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 652 Finals Page 652 29-9-2008 #9 Handbook of Algorithms for Physical Design Automation block width. The graph GV is constructed similarly. The respective width Wc and height Hc of the chip can be computed by applying a longest-path algorithm on GH and GV . Then, the algorithm divides the dead spaces and routing channels into tiles to facilitate buffer block planning. For each tile, its area slack is computed from the longest paths in GH and GV . For dead spaces and routing channels that are not on the critical paths of the constraint graph GH (GV ), they will each have a positive area slack in width (height). If there are buffers required to be inserted to meet timing constraints, the algorithm picks a tile that can accommodate the most number of these buffers and then inserts appropriate buffers into this tile. If there are no tiles with positive area slack, we have to shift some circuit blocks for the buffer insertion, thereby increasing the overall chip area. This block shifting might make rooms for other tiles, resulting in new positive slacks for these tiles. We pick the dead space or the routing channel with the maximum buffer-insertion demand, and then select one tile in it. For the selected tile, we insert appropriate buffers into the tile. If there is not sufficient space in the tile for buffer insertion, the associated routing channel will be expanded to make room for the buffers. After inserting buffers into the selected tile, the information of the constraint graphs, feasible regions, and the chip dimension is updated and the buffer insertion/clustering process is repeated until all buffers are placed. More recently, there have been attempts to perform simultaneous buffer block planning and floorplanning to fully utilize useful dead spaces for performance optimization [10–13]. Jiang et al. provided a generic paradigm along this direction in Ref. [11]. The algorithm presented in Ref. [11] simultaneously considersfloorplanning and buffer block planning for a general floorplan. The method adopts simulated annealing to refine the floorplan so that buffers can be inserted more effectively. In each iteration, it constructs a routing tree for each net and calculate the longest path from the source to the sink in each routing tree. On the basis of the aforementioned formulas presented in preceding sections, it computes the number of buffers needed for the longest path, the optimal distance from the source terminal to each buffer, and the width of independent feasible regions. After allocating buffers for all nets, it inserts buffer blocks as soft circuit blocks into the constraint graphs. These buffer blocks may occupy dead spaces or be inserted into routing channels. After all buffers for all nets are allocated, the area of each buffer block is determined as the bounding area of inserted buffers. It then reshapes the floorplan by Lagrangian relaxation. Unlike the work for buffer block planning after floorplanning that generates buffer blocks before buffer assignment, this work generates buffer blocks after buffer assignment. Consequently, the area of buffer blocks can be properly controlled, especially for the buffer blocks in routing channels. 33.2.2 BUFFER SITE PLANNING In general, buffer block planning algorithms assume that routing has not been performed. In contrast, the first two steps of the buffer site planning algorithm proposed in Ref. [2] are Steiner tree construction and wire congestion reduction. The purpose is to establish the global routing so that accurate estimation of delay of global interconnects in subsequent steps can be obtained. Any timing-driven and congestion-aware global router can be used for these two steps. The third step, buffer assignment, is the heart of the planning algorithm. Buffer assignment is performed net-by-net in decreasing order of net delay. For a general multiple-terminal net Ni , let Li be the maximum number of tiles that can be driven by either the source or an inserted buffer. If net Ni crosses tile v, the probability of net Ni using a buffer site in v is 1/Li . With p(v) denoting the sum of these probabilities for tile v over all unprocessed nets, B(v) the number of buffer sites within v, and b(v) the number of buffers assigned to v, the cost of using a buffer site in v is defined as q(v) = b(v)+p(v)+1 B(v)−b(v) ∞ b(v) if B(v) < 1, otherwise. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 Finals Page 653 29-9-2008 #10 Global Interconnect Planning 653 A minimum cost buffering solution can be computed using a van Ginneken-style dynamic programming algorithm [14] as follows. First, we describe the computation of a minimum cost buffering solution for a two-pin net with source s and sink t. Let γ = (c, l) denote a solution at v, where c is the cost of the solution and and l is the number of un-buffered tiles seen at v. Let u be the parent tile of v in the route. If l < Li , there are two solutions at u that can be derived from γ : an un-buffered solution (c, l + 1) and a buffered solution (c + q(u), 0). As in any van Ginneken-style dynamic programming approach, pruning of inferior solutions is crucial for achieving runtime efficiency. Given two solutions γ = (c, l) and γ = (c , l ), we say that γ is inferior and can be pruned if c > c and l ≥ l . Consequently, there are no more than Li noninferior solutions at each tile along the route. The algorithm in Ref. [2] can generate the noninferior solutions at each tile in O(Li ) time and compute the optimal solution of the net in O(nLi ) time, where n is the number of tiles the net crosses. For a multiple-pin net, it is necessary to consider the case where a parent tile u drives multiple child tiles. For simplicity, we assume that u has only two child tiles v and w. Given the sets of noninferior solutions at v and w, denoted respectively as Cv and Cw , we compute two sets of noninferior solutions at u, denoted as Cuv and Cuw , derived respectively from Cv and Cw , using the procedure outlined in the preceding paragraph. Let γ = (c, l) and γ = (c , l ) be solutions in Cuv and Cuw , respectively. If l + l ≤ Li , there are two solutions in Cu that can be constructed from γ and γ : (1)(c + c + q(u), 0) is a solution if a buffer is inserted at u to drive γ and γ , and (2)(c + c , l + l ) is an un-buffered solution. If we assume as in Ref. [2] that a net can only be assigned one buffer in a tile, the solution in (1) is feasible only if both γ and γ have no buffers at u, and the solution in (2) is feasible only if there is at most one buffer at u in γ and γ . The time complexity of the algorithm is O(nLi2 ) for a multiple-pin net that spans n tiles. A final postprocessing step, which involves ripping up and rerouting nets, is then applied to reduce buffer and wire congestion, as well as the number of nets that fail to meet their length constraints. 33.3 INTERCONNECT PLANNING AND BUFFER PLANNING Routability is a critical issue in modern VLSI design flow due to the dominance of system performance by today’s interconnects. Interconnect planning must be done early to ensure an achievable routing solution. The locations of buffer blocks/sites are places where signals get in and out and it is thus essential to consider routability and buffer planning simultaneously. Early planning of buffer and wiring resources is a critical component of high-performance VLSI design methodologies. Such planning is required to evaluate the quality of RT-level partitioning, floorplanning, placement and pin assignment, etc. In this section, the topic of performing interconnect planning with buffer planning is explored. 33.3.1 ROUTABILITY-DRIVEN BUFFER PLANNING In Ref. [3], one of the earliest works that consider congestion in the buffer block planning step, a two-level tile structure (Figure 33.5) is used. The coarser tile structure is used for estimating routing congestion, and the finer one is used for defining the candidate buffer block (CBB) locations. For each buffer b to be inserted, let Sb denote the set of CBBs in which b can be placed in order to satisfy the timing constraint. Sb contains all the finer tiles that overlap with the FR or IFR of b. The objective of buffer block planning is to assign each buffer to one CBB such that the congestion cost is minimized. A congestion-driven iterative deletion algorithm is used to obtain such an assignment. In each iteration of the congestion-driven iterative deletion algorithm, the candidate set of each buffer is first generated. A bipartite graph Ga = (Va , Ea ) that represents the set of all possible buffer assignments is then constructed. The edge set in Ga is defined as Ea = {(b, c)|b ∈ B, c ∈ Sb } where B is the set of buffers needed to be inserted. The iterative deletion algorithm starts with all possible assignments of the buffers to their CBBs. The edges in Ga are weighted according to the compatibility of the corresponding buffer assignment. An incompatible buffer assignment (an edge Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 654 Finals Page 654 29-9-2008 #11 Handbook of Algorithms for Physical Design Automation Independent feasible region Sink Coarse routing tiles Circuit block Fine tiles to define CBBs in feasible region Source FIGURE 33.5 Two-level tile structure. of large weight), corresponding to an assignment that may result in high-routing congestion or too many buffer blocks, will be deleted one at a time until only one assignment is left for each buffer. The weight of each edge e = (b, c) is a composite function of the routing congestion cost C1 (e) and the buffer block cost C2 (e), assuming that all the other buffers have equal probability of being assigned to their CBBs. The congestion cost of a routing tile in the two-level tile structure is estimated by the traditional 2D grid based probabilistic map assuming two-bend shortest Manhattan route for each wire segment (a wire segment can be from the source to a buffer, from a buffer to another buffer, or from a buffer to the sink). The routing congestion cost C1 (e) of an edge e is then defined as the maximum congestion cost among all the routing tiles in a one-bend routing path of the net segment represented by the edge. The buffer block cost C2 (e) is computed according to the number of buffers already assigned to the CBB c and the maximum number of buffers allowed in a CBB. This iterative deletion process will remove the highest cost redundant assignment at each step, and the bipartite graph and its associated edge costs are needed to be updated dynamically. 33.3.1.1 Routability-Driven Buffer Planning with Dead Space Redistribution Dead spaces in a floorplan or placement can be redistributed by moving some circuit blocks within their rooms to achieve better buffer insertion result. Chen et al. [15,16] considered this problem of congestion-driven buffer block planning with dead space redistribution. Dead spaces can be classified into two types, detached dead-space (DDS) and attached dead-space (ADS). A DDS is generated because of the existence of an empty room, while an ADS is generated because a room is not entirely occupied by a circuit block. The ADSs can be redistributed while keeping the topology and the total area unchanged. Each ADS or DDS is associated with a circuit block or a dummy block, which can be found efficiently from a floorplan. In order to find a good dead space distribution to insert as many buffers as possible such that the number of nets meeting their timing constraints without considering congestion is maximized [15], a bipartite graph can first be constructed to represent all possible assignments of buffers to tiles (each dead space is divided into small tiles). An s–t graph can then be constructed based on this bipartite graph to find a maximum cardinality matching in it. To consider congestion, a preprocessing step, presented in Ref. [16], first inserts some vertical and horizontal channels into the boundaries of the circuit blocks, based on an estimation of the buffer distribution. Instead of performing a maximum cardinality matching, a more sophisticated congestion-driven buffer planning algorithm is employed. The traditional 2D grid based probabilistic map assuming two-bend shortest Manhattan routes is used to estimate congestion. At the beginning, Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 Finals Page 655 29-9-2008 #12 Global Interconnect Planning 655 the congestion estimations are initialized without considering buffers. Then buffer planning and congestion updates are performed net by net. For each net i, a single-source-single-sink shortestpath problem is set up and solved to insert buffers to minimize the sum of the congestion levels at the most congested grid in each wire segment of i. The congestion information at each grid is then updated by erasing the original congestion contribution of net i and adding back the new contribution by its wire segments. The whole process is repeated until all the nets are routed and buffered. It is possible that a net cannot be buffered successfully, when there is no more dead spaces for buffer insertion, or when all the possible routing paths constrained by the buffer locations are nonmonotonic. A local search is performed to explore different ways to redistribute the ADSs. A new redistribution can be generated by randomly selecting an ADS and moving its associated block to change the dead space distribution around it. The cost function to evaluate a floorplan with different dead space redistribution is a composite function of the number of nets that fail to satisfy their delay constraints and the average congestion of the top 5 percent most congested grids. 33.3.1.2 Interconnect Planning with Fixed Interval Buffer Insertion Constraint In fixed interval buffer insertion constraint, buffers are constrained to be inserted such that the distance between adjacent buffers is within a range [low, up] given by the user. Sham and Young [17] introduced this concept and performed interconnect planning and buffer planning based on this assumption. In their approach, wires are routed over-the-cell with multibend shortest Manhattan distance and buffers are inserted in the dead space area. A floorplan is first divided into a 2D array of grids and the size of the dead space in each grid is computed for estimating the amount of buffer resources in that grid bspace (). The probability of successful buffer insertion bsuccess (x, y) at a grid (x, y) can be estimated from the amount of dead space bspace (x, y) at (x, y) and the number of possible buffer insertions busage (x, y) at (x, y) according to the formula: bsuccess (x, y) = min{1, bspace (x, y)/busage (x, y)}, where busage (x, y) is obtained by considering all the nets and all their possible multibend shortest Manhattan distance routes satisfying the fixed interval buffer insertion constraint, assuming that every possible route of a net and every feasible way of buffer insertion is equally likely to occur. These busage () values can be computed efficiently by dynamic programming, and then saved and reused. After estimating the probability of successful buffer insertion at each grid, the congestion information is computed and interconnect planning is performed accordingly, taking into account the fact that the probability of occurrence of a route will be higher if it passes through those grids with larger bsuccess (), that is, with a higher chance of successful buffer insertion. All these computations can be done efficiently by dynamic programming and by making use of some table look-up techniques. Wong and Young [18] have also assumed the fixed interval buffer insertion constraint in their interconnect planning. Similar to the work in Ref. [17], all multipin nets are first broken down into a set of two-pin nets by the MST approach. For each two-pin subnet, a simple dynamic programming approach is employed to find a path from the source to the destination with buffer insertion satisfying the fixed interval buffer insertion constraint and minimizing a cost function that is the sum of the costs at the grids with buffer insertion, where the cost at a grid is a composite function of the congestion cost and the buffer insertion cost. The buffer insertion cost is computed as a ratio between the number of buffers already inserted and the largest number of buffers allowed while the congestion cost is computed by the traditional 2D grid based probabilistic map assuming multibend shortest Manhattan route for each wire segment. The wire segments of each two-pin subnet are processed one after another and the congestion cost is updated accordingly for further buffer insertions. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 656 33.3.1.3 Finals Page 656 29-9-2008 #13 Handbook of Algorithms for Physical Design Automation Methodology for Interconnect Planning in Buffer Site The buffer site methodology was first proposed by Alpert et al. [2]. Details of their approach have been described in Section 33.2. Albrecht et al. [19] studied a similar problem of performing timing-driven buffered routing given a buffer site map. The constraints are on wire loading (maximum number of tiles driven by either the source or an inserted buffer), buffer site capacity, wire congestion, and individual sink delay, and the objective is to minimize the routing area, that is, the total wirelength and the total number of buffers. Unlike many previous works that solves each net optimally one after another, the problem of buffering all the nets simultaneously is formulated as an integer linear program (ILP). However, as solving ILP exactly is NP-hard, the approach taken in Ref. [19] is to first solve the corresponding fractional relaxed linear program and then obtain a near-optimal integral solution by randomized rounding. Their approximation algorithm can be extended to find paths with bounded delay and handle multiple buffer and wire width libraries. 33.3.1.4 Other Routability-Driven Buffer Planning Approaches Ma et al. [20] also developed an efficient algorithm to perform congestion-driven buffer planning that could be budgeted into the floorplanning process to give better timing performance and chip area. In their approach, the dead spaces are first partitioned into rectangular empty space (ES) blocks. The ES blocks in a floorplan can be obtained efficiently based on the CBL representation, and it is proved that all the dead spaces in a packing can be partitioned into no more than 2n ES blocks without overlapping, where n is the total number of circuit blocks. Intersection between the feasible region (FR) of a buffer and a ES block will be a regular hexagon (with possibly some degenerations) of which two edges are parallel to the x-axis, two edges are parallel to the y-axis, and the other two edges have a slope of +1 or −1. To facilitate data manipulation, the ES blocks are further partitioned into grids and each grid contains buffer sites for buffer insertion. The whole process of computing the intersection between the FR of a buffer and the ES blocks is divided into two steps. The first step computes the intersection between the ES blocks and the bounding box of the net, and the second step computes the overlapping between the rectangular regions obtained from step one and the two parallel slanted lines of the FR. The grids in a ES block are ordered in parallel to these slanted lines with slope +1 or −1, so that the overlapping obtained from step two will just be two indices specifying the range of grids where a buffer can be inserted. Because it is only needed to compute the first and the last grid number in the range, the complexity is linear and independent of the grid size. Assuming that the probabilities of buffer insertion are equal at each possible buffer insertion site, the probability of a buffer b being inserted into a grid g is computed as 1/NFRb where NFRb is the total number of grids in some ES blocks overlapping with the FR of b. This number can be easily obtained from step two above by knowing the range of grids in each ES block overlapping with the FR of b. By summing up these probabilities for all the buffers, the expected number of buffers to be inserted in each grid can be obtained. These probabilities of successful buffer insertion are then used for buffer allocation with consideration of routing congestion. The congestion model used is essentially the traditional 2D grid based probabilistic map assuming multibend shortest Manhattan route for each wire segment. Because the probabilities of successful buffer insertion are different at different routing tiles, the chance of the occurrence of a route will be higher if it runs through those tiles with high probabilities of successful buffer insertion. Consider the route of a net i passing through a set of tiles {T1 , T2 , . . . , Tk }. Suppose that a buffer b of this net is to be inserted in one of these tiles Tj where 1 ≥ j ≥ k. The size of the overlapping area between the feasible region of buffer b and the dead spaces in tile Tj will affect the probability of the occurrence of a route. This fact is taken into account in their congestion model. The nets are then considered one after another for buffer allocation. For each net, all its feasible routes are processed in a nondescending order of their sums of congestion levels of all the tiles it goes through. A route will be taken if all of its required buffers can be inserted successfully when it is being processed. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 Finals Page 657 29-9-2008 #14 657 Global Interconnect Planning 33.3.2 PIN ASSIGNMENT WITH BUFFER PLANNING It is beneficial to perform pin assignment and buffer planning simultaneously to achieve better timing performance. Given a placement of macroblocks and buffer blocks, the objective is to assign pins and insert buffers for a given set of nets to minimize a weighted sum of the total wirelength, W , and the total number of buffers inserted, R: C = αW + βR. Xiang, Tang, and Wong [21] presented a polynomial time algorithm to perform simultaneous pin assignment and buffer planning optimally for all two-pin nets from one particular macroblock to all the other blocks. The algorithm minimizes the cost C for any given positive constants α and β while enforcing the lower and upper bound constraints on wire segment length for each net. By applying this algorithm iteratively (picking one source block each time), pin assignment and buffer planning can be done for all the nets. The subproblem of performing simultaneous pin assignment and buffer planning from one source block to all the other blocks can be formulated as a min-cost max-flow problem. In the example shown in Figure 33.6, the source block is bs , and it has two nets connecting to block b1 and b2 , and each block bi has a set of available pin locations Pi . There are two buffer blocks r1 and r2 , and each buffer block ri is associated with a capacity ci denoting the maximum number of buffers ri can hold. A network flow problem can be set up as shown in Figure 33.6b. In the constructed flow network, a directed edge from a pin p ∈ Ps to a buffer block rj (or from a buffer block rj to a pin p ∈ Pi where bi is not the source block) exists if and only if the shortest Manhattan distance between p and rj satisfies the lower and upper bound constraints on wire segment length. Similar conditions hold for other edges between two pins or between two buffers. Then a source node s, which is connected to all the pins in the source block bs with capacity one and zero cost, is added. An intermediate sink node ti is added for each block bi other than the source block and is connected from every pin p ∈ Pi with capacity one and zero cost. Each of these intermediate sink nodes ti is then connected to the final sink node t with zero cost and capacity |Ni | where Ni is the set of nets from bs to bi . For all the remaining edges, the capacities are one and the costs are α × d where d is the shortest Manhattan length of the edge. For each buffer node ri , there will be a cost of β and a capacity of ci . For all the remaining nodes, the capacity is one and the cost is zero. It is not difficult to see that a min-cost max-flow solution f in the flow network constructed as described above can give an optimal solution that minimizes the cost and maximizes the number t t2 p21 b2 p23 p22 r1 p24 (a) p12 p13 b1 p14 p21 p22 p23 p24 ps2 ps1 ps3 r2 p11 bs t1 r1 r2 ps1 ps2 ps3 ps4 p11 p12 p13 p14 s ps4 (b) FIGURE 33.6 Min-cost max-flow method for simultaneous pin assignment and buffer planning: (a) a problem instance and (b) the corresponding flow network. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 658 Finals Page 658 29-9-2008 #15 Handbook of Algorithms for Physical Design Automation of connections made. If the flow | f | is equal to |N| where N = ∪i Ni , a feasible solution of pin assignment and buffer planning for all the nets in N is found. However, if | f | < |N|, there is no way to make all the connections and have the constraints satisfied. Every flow solution can be mapped to a pin assignment and buffer planning solution of the given set of nets. However, this min-cost maxflow approach can only consider nets connecting between one source block and all the other blocks. To route all the nets between all the macroblocks, each macroblock is treated as the source block once and the min-cost max-flow algorithm is invoked; a solution for all the nets between multiple macroblocks can be obtained at the end. To reduce the complexity of this approach, neighboring pins are grouped together at the beginning. Once several nodes are grouped together, the average coordinate is used as the location of the new super-node. After getting a solution with the supernodes, the flow solution is mapped back to the original problem, that is, distributing the flow from one super-node to its pin nodes. This subproblem can also be solved by the min-cost max-flow method. 33.3.3 NOISE-AWARE BUFFER PLANNING The aforementioned buffer block planning is done for delay or routability optimization. It is also of crucial importance to consider the signal integrity, for example, maintain fast transition time (the inverse of slew rate) of the signal at the receiver and along the net and minimize the crosstalkinduced noise during buffer planning. Otherwise, a slow transition on a net may be highly susceptible to coupling noise injection from faster switching signal lines in its vicinity. Furthermore, if the length of the net between two successive buffers is too long, the interconnect resistance becomes comparable to the driver resistance. This effectively decouples the receiver from the driver and makes the receiver highly susceptible to any attacking signals. Another reason for concern is that a slow transition on the input causes higher short-circuit power consumption. To avoid such signal integrity problems and higher power consumption, guidelines on most modern large-circuit designs require signal transition times to be no slower than a specified value. As a rule of thumb, the allowed signal transition time is between 10 and 15 percent of the clock cycle time for modern circuit design. Without considering the signal transition time constraint for buffer planning, as pointed out in Ref. [3], the buffer insertion solution may not maintain the required signal transition rate even though the target delay (as defined by 50 percent input to 50 percent output) on a net may be satisfied. Therefore, it is desirable to consider the problem of buffer block planning under delay, rise/fall time, and crosstalk-induced noise constraints for interconnect-centric floorplanning. In the following subsections, we describe the independent feasible regions defined by the transition time and delay constraints presented in Ref. [22], as well as the crosstalk-induced noise constraint introduced in Ref. [23]. 33.3.3.1 Independent Feasible Regions with Transition Time Constraints In addition to inserting buffers to improve signal delay, it is also popular to insert buffers at regular intervals to ensure a proper slew rate (or transition time) at the input to all gates. Let R be the driver output resistance, C be the sink capacitance, and L(R, C, Rtgt ) be the maximum length of a net such that the signal transition time at the sink is no more than Rtgt (it can be assumed that the input signal transition time is also Rtgt ). The independent feasible region under the signal transition time constraint for the ith buffer of a net N is given by IFR(R)i = xi⊗ − WIFR(R) 2, xi⊗ + WIFR(R) 2 ∩ (0, l), such that ∀(x1 , x2 , . . . , xi , . . . , xn ) ∈ IFR(R)1 × IFR(R)2 × . . . × IFR(R)n and the input transition times for all buffers and the sink satisfy the signal transition time constraint. Here, WIFR(R) denotes the width of independent feasible region IFR(R)i , and xi⊗ the center of IFR(R) for the ith buffer. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 Finals Page 659 29-9-2008 #16 659 Global Interconnect Planning For all segments of the net N to satisfy the signal transition time constraint, the following inequalities must hold: x1⊗ + WIFR(R) /2 ≤ l1 , ⊗ xi+1 − xi⊗ + WIFR(R) ≤ l2 for 1 ≤ i ≤ n − 1, and l − xn⊗ + WIFR(R) /2 ≤ l3 , where N l1 = L RD , CB , Rtgt N l2 = L RB , CB , Rtgt N l3 = L RB , CS , Rtgt N denotes the target signal transition time for the net N. Summing up the preceding n + 1 Here, Rtgt inequalities and making the IFR(R) intervals of equal width, we get l1 + (n − 1)l2 + l3 − l . n WIFR(R) = For this WIFR(R) , the centers of the feasible regions are determined by the following equalities: x1⊗ = l1 − WIFR(R) /2, ⊗ xi+1 − xi⊗ = l2 − WIFR(R) for 1 ≤ i ≤ n − 1, and ⊗ n x = l − l3 + WIFR(R) /2. To find the IFR of the ith buffer considering both delay and signal transition time constraints for a given number of inserted buffers, we should find the intersection of IFRi and IFR(R)i . If IFRi and IFR(R)i do not overlap, we have to insert a different number of buffers. Given l1 , l2 , and l3 , the minimum number of buffers required to satisfy the transition time constraint is given by n min R = l − l1 − l3 +1 . l2 max be the respective minimum and maximum numbers of buffers that can be inserted Let nmin D and nD N max into a net to satisfy the delay constraints. For the delay constraint, Dtgt , nmin D to nD can be computed as follows: √ −B − (B2 − 4AC) min nD = max 0, , 2A and n max D = −B + √ 2 (B − 4AC) , 2A where A = Rb Cb + Tb N B = Dtgt + rc (Cb − Cs )2 + cr (Rb − Rd )2 − (rCb + cRb )l − Tb − Rd Cb − Rb Cs N C = 12 rcl2 + (rCs + cRd ) l − Dtgt Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 660 Finals Page 660 29-9-2008 #17 Handbook of Algorithms for Physical Design Automation If nMax ≤ 0, the delay constraint on the net cannot be satisfied by inserting buffers of this type D alone. The number of buffers required to satisfy both delay and signal transition time constraints, min max denoted by nD,R , is bounded by max{nmin R , nD } ≤ nD,R ≤ nD . A linear search within the interval would typically find a feasible solution. 33.3.3.2 Common Independent Feasible Region The common IFR for buffer i of net N under both delay and signal transition time constraints is referred to as the maximal region where the buffer can be placed such that both the constraints can be satisfied, assuming that the other buffers are placed within their respective common IFRs. Both the constraints cannot be satisfied by buffer insertion if Min Max max{nMin R , nD } > nD . For a fixed value of nD,R in the feasible range, the IFR(D, R) for the ith buffer (IFR(D, R)(i)) on the net is the region common to both IFR(R)(i) and IFR(D)(i). Let Wmin = min{WIFR(R) , WIFR(D) }, δi = |xi∗ − xi⊗ |, and δw = |WIFR (R) − WIFR(D) |. The width of the common independent feasible region is given by ⎧ ⎪ ; if δi ≤ δw 2 ⎨Wmin WIFR(D,R) (i) = Wmin − δi + δw 2 ; if δw 2 ≤ δi ≤ WIFR(R) + WIFR(D) 2, ⎪ ⎩ undefined ; otherwise. We can observe that min1≤i≤nD,R {WIFR(D,R) (i)} will occur at i = 1 or i = nD,R . To fix the number of buffers inserted on the net, we pick the value of nD,R that maximizes the minimum width of IFR(D, R)(i). 33.3.3.3 Buffer Block Planning Considering Transition Time and Delay In this section, we briefly describe the algorithm for buffer block planning considering both transition time and delay constraints. The inputs to the algorithm are the initial floorplan and the transition time and delay constraints on the global nets. The algorithm determines the locations, assignments, and sizes of buffer blocks to be inserted in a dead space or a routing channel such that the two constraints are satisfied. Figure 33.7 summarizes the algorithm. Step 1 divides the available channel space, as performed in Section 33.2.1, into a set of buffer-block tiles to meet both constraints. Step 2 determines the type and the number of buffers to be inserted in each net to satisfy its timing constraints. The buffer type chosen for a net is the smallest size buffer, such that all the buffers on the net have a nonzero IFR(D, R) width. Furthermore, it constrains all the buffers on a net to be of the same size. The number of buffers required is then obtained by searching the common feasible range of buffer numbers for delay and transition time requirements. The chosen value of nD,R maximizes the minimum IFR(D, R) width, as mentioned in Section 33.3.3.2. Steps 4–6 find the set of buffer-block tiles into which each buffer can be placed. Let B be the set of buffers needed to be inserted to satisfy the constraints, and the candidate buffer blocks (CBB) set of b, Sb , b ∈ B, be the set of buffer-block tiles into which it can be placed. The intersection of the two-dimensional IFR(D, R) of a buffer with the buffer-block tiles defines the CBB set of the buffer. If there exists one buffer to be placed along the monotonic Manhattan route with an empty CBB, non-monotone detour routes are considered. Steps 7–9 compute the CBB sets for buffers to be placed along the shortest detour route. Step 8 finds the shortest detour path. The optimal number of buffers Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C033 Finals Page 661 29-9-2008 #18 661 Global Interconnect Planning Algorithm: Buffer block planning with transition time and delay constraints 1. Divide dead spaces and routing channels into buffer block tiles; 2. Find type of buffers and nD,R for each net N 3. Compute IFR(D,R) for each buffer b ∈ B; 4. foreach net N 5. foreach buffer b in net N 6. Obtain CBB set Sb ; 7. If there exists (Sb = ∅) for a net N 8. Find the shortest detour path; 9. Obtain Sb along the detour path; 10. Generate the bipartite graph G; 11. While there exists a buffer to be assigned do 12. Delete the highest cost edge of G; 13. Update edge costs; 14. Assign a buffer to a CBB if required; FIGURE 33.7 Algorithm for buffer block planning considering both transition time and delay constraints. (From Sarkar, P. and Koh, C.-K., Proceedings of IEEE/ACM Design, Automation and Test in Europe Conference, IEEE Press, Piscataway, NJ, 2001.) (nD,R ) to be inserted to satisfy the timing constraints is computed based on this path length. Then, the width of the IFR(D, R) for each buffer along this path is computed. Step 9 applies this width to compute the CBB set for the net N. Each buffer could have several feasible CBBs to be assigned. A method based on the iterative deletion and bipartite graph formulation introduced in Section 33.3.1 is used to assign each buffer. Step 10 constructs the bipartite graph G, and Steps 12–14 prune G by removing incompatible buffer assignments or edges in each iteration, similar to the process in Section 33.3.1. The algorithm terminates when a unique CBB is assigned to each buffer. 33.3.4 BUFFER PLANNING WITH NOISE CONSTRAINTS Coupling noise between adjacent nets could induce unexpected circuit behavior. Figure 33.8 shows a noise model that considers coupling capacitance cc . The coupling capacitance is proportional to the fringing capacitance (cf ) and the coupling length (lc ), and it is inversely proportional to the distance (d) between the aggressor and the victim nets, that is, cc ∝ cf lc /d. Because the detailed routing information is not available during floorplanning or postfloorplanning, we may adopt a more conservative approach and make d the minimum wire distance dmin specified by the design rule. Furthermore, we set the coupling length (lc ) to be twice the victim net’s length (lν ), which is the 1 Aggressor net 0 Cs Cc Ic X Victim net Cs FIGURE 33.8 Noise model resulting from the coupling capacitance and crosstalk-induced current. (From Cheng, Y. -H. and Chang Y. -W., Proceedings of IEEE/ACM Asia South Pacific Design Automation Conference, IEEE Press, Piscataway, NJ, 2004.)

Handbook of algorithms for physical design automation part 68

Nội dung