Handbook of algorithms for physical design automation part 43.pdf (Sổ tay thuật toán)

Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 402 Finals Page 402 23-9-2008 #5 Handbook of Algorithms for Physical Design Automation For designs that are dense, legalization is difficult, and more robust methods are necessary if one is to obtain good results. For legalization of designs that contain both macroblocks and standard cells, there are two distinct approaches. One method first first fixes the positions of macroblocks, and then fills the space between them with standard cells. A second approach is to legalize macroblocks and standard cells simultaneously. Once a placement is legal, with all overlaps removed and logic elements properly aligned, optimizations that are traditionally classified as detailed placement can be applied. Small groups of standard cells can be shifted or reordered—these local optimizations can have a dramatic impact on wirelength. To maximize the size of an optimization window with acceptable runtimes, dynamic programming is frequently used. 20.1.1 NOTATION When presenting specific algorithms, we utilize the following notation. For an integrated circuit, the netlist will contain a set of cells C = {c1 , c2 , c3 , . . . , cn }, connected by a set of signal nets N = {n1 , n2 , n3 , . . . , nm }. Each net connects a subset of the cells. A placement P of a netlist consists of precise x and y positionings for each cell ci . We focus on transformations of P to another similar placement P such that the placement has lower overlap, becomes legal, has better wirelength, and so on. For simplicity, in most cases we assume that the nets connect to the centers of the cells, and we treat macroblocks and standard cells in the same way. Extending the methods described to use exact pin positions is trivial. 20.1.2 ROUTING MODELS Over the years, the increasing numbers of interconnect layers has driven changes in routing models. This has impacted detail placement in interesting ways. In earlier variable-die designs, routing success could almost always be assured (given sufficient space). In standard cell designs, the spacing between rows can be adjusted as needed. In earlier fabrication technologies, there were relatively few layers of interconnect metal, and thus most routing occurred in channels between rows. For a congested design, where more routing resources are needed, one would simply need to expand the space between a pair of rows, thereby gaining the necessary resources. If more routing resource was required between rows, a “feedthrough” could be inserted into the row. With the growth in the number of available metal interconnect layers came the ability to route some connections over cells. For a period, there were T-shaped channels. Some connections were made between rows, but a portion was made directly above. As with variable-die designs, there was flexibility to adjust row spacing. As the number of interconnect layers further increased, progressively more routing was done above the cells, with cell row spacing decreasing until the traditional channel was eliminated entirely. This is known as the fixed-die routing model. The fixed-die model brings cells closer together, lowering overall wirelengths. Fixed-die routing now dominates the industry, but faces the problem of routing failure. If a portion of the design has excessive routing demand, there is no simple way to provide additional routing space; space management methods described later in this chapter are gaining popularity as a means to combat routing failure. Although adding space to a fixed-die design is not trivial, one can in fact leverage a number of techniques found in analytic placement. Figure20.3 shows an intermediate step in analytic placement, and the congestion map as might be produced by Liu’s model [1]. Spreading methods that are applied to an analytic placement can also be applied to gain routing space. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 Finals Page 403 23-9-2008 #6 403 Legalization and Detailed Placement Spreading needed to remove congestion Spreading needed to remove overlap In an analytic placement method, it is common for intermediate steps to have a high degree of cell overlap. Methods to spread apart dense regions are well studied. Routing congestion is a significant problem; spreading logic elements apart is a common solution. There are many parallels between analytic spreading methods, and modern congestion reduction methods. FIGURE 20.3 Analytic placement tools face a problem similar to those faced in routing; some portions of a design are excessively dense, and the logic elements need to be separated to obtain a feasible or routable solution. 20.2 SPACE MANAGEMENT Global placement tools perform a rough positioning of logic elements across the core region. Frequently, this rough placement contains a considerable amount of overlap between logic elements, and there are regions that are excessively dense. This is particularly common with analytic placement methods, but some forms of recursive bisection also exhibit this behavior. To evaluate a global placement, one common measure is the utilization of different regions of the placement. If the total area of the logic elements assigned to a region is greater than the area of the region itself, it is overutilized or dense. To obtain a legal solution, logic elements must be moved away from dense areas; ideally, this movement should be done with minimum change to the overall structure of the placement. Figure 20.4 illustrates the issue; initially, a portion of the placement has more circuit elements assigned than there is physical space.Through a variety of space management or placement migration (a) Initial placement (b) Direction of cell movement (c) Placement after movement FIGURE 20.4 Global placement techniques may produce solutions with a significant amount of overlap, or regions with high utilization. Spreading logic elements apart is neccessary for placement legalization. Algorithms that remove overlap can also be effective for adding in additional space for routing. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 404 Finals Page 404 23-9-2008 #7 Handbook of Algorithms for Physical Design Automation techniques, new locations for some of the logic elements can be found, resulting in a new placement with better utilization. Although the most obvious application of space management is in the removal of overlaps and density reduction (making legalization possible), there are other motivations as well. Circuit routing is difficult; additional space makes routing easier, and it is common for large designs to have maximum density constraints on portions of the layout. Space management is also an effective way of adjusting to changes resulting from gate sizing or buffer insertion. Rather than running the placement tool again (possibly obtaining a completely different result), an existing placement can be stretched to accomodate resynthesis. Yet annother use for space management algorithms is in spreading apart devices with high activity—the power dissipated during switching can create thermal hot spots, and spreading can reduce peak on-chip temperatures. In this section, we discuss a variety of space management techniques; which is best depends a great deal on the initial placement, and on the objectives of designer. We begin with a classic approach based on minimum cost flow, and then consider more recent geometric methods. We also include in the discussion some methods in use in analytic placement tools; in particular, the methods used by FastPlace [3] and Warp [4] can achieve deisred results. 20.2.1 FLOW-BASED OVERLAP REMOVAL An early approach to placement spreading, which can be used directly on standard cell designs, is a technique that iteratively improves the quality of a placement using network flow optimization. This technique is used by the placement tool Domino [5] and can not only optimize but also remove overlaps in the placement. Vygen [6] also discusses related ideas. The approach is relatively intuitive. The placement region can be divided into a set of regions; these are normally arranged in a rectangular grid. Each region corresponds to a vertex vi in a graph. Figure 20.5 illustrates a simple example. If a region contains more circuit area than physical space, the corresponding vertex forms a supply for a maximum flow problem. Similarly, if a region is not excessively dense, it is a sink. Edges between each pair of vertices can be created, with the cost of the edge being related to the distance between the regions. (a) Placement with some overly dense regions (b) Maximum flow graph, with supply vertices in black (c) Subset of the edges considered in maximum flow; the cost of any edge is related to the physical distance traveled FIGURE 20.5 If a placement contains regions that are excessively dense, one can determine an appropriate way to move some logic elements using a minimum cost flow algorithm. Regions of the placement can be modeled as vertices in a graph; edges are placed between regions, with a cost relative to the distance between regions. A flow in the graph corresponds to a ripple of logic elements from one region to another. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 Finals Page 405 23-9-2008 #8 Legalization and Detailed Placement 405 Solving the minimum cost flow problem results in a strategy to move circuit elements from dense regions to sparse regions. By keeping the cost of the flow small, the physical displacement of logic elements is also kept small. A flow of x along an edge from vertex vi to vj indicates that x units of logic should be moved from the region i to region j; normally, the logic elements closest to region j are selected. Flow-based improvements are sometimes referred to as ripple moves; one can see chains of regions, where logic elements move in sequence through, from dense regions to sparse. It is common for flow-based methods to be used on only a portion of the placement region at a time; the algorithms used can be computationally expensive. Algorithm 1 gives a high-level description of this approach. The input to the algorithm is any placement with or without overlaps. The output is an optimized placement with reduced overlaps, and sometimes reduced wirelength. The first step is to partition the placement area into several small rectangular subregions. This is done to create small subproblems that can be solved efficiently. In Ref. [5], the subregions are allowed to overlap with each other, allowing movement between regions. The subproblems are solved iteratively until the solution quality converges. Once partitioned, the rest of the algorithm works on individual subregions. Cells are assigned to the subregions that contain their geometric coordinates. For each subregion, a transportation problem is set up and solved to get the improved subplacement. Algorithm 1 Outline of flow-based space management tool such as Domino [5] Partition the placement region into smaller rectangular subregions; while there is an improvement do for each subregion ρ do Assign cells to ρ depending on their initial locations; Set up and solve a transportation problem; Move cells to new locations; end for end while 20.2.1.1 Setting Up and Solving the Transportation Problem A transportation problem can be represented in the form of a bipartite graph, as shown in Figure 20.6. The graph has two disjoint sets of nodes known as the supply nodes and the demand nodes. There are edges between the nodes in the two sets with a certain cost w and capacity u associated with each such edge. In our case, the supply nodes are the cells in a subregion, say Cρ , and demand nodes are placement locations within a subregion, say Lρ . We need to transport cells in Cρ to unique locations in Lρ minimizing the transportation cost. One problem here is that standard cells have fixed heights but variable widths. To account for this, each cell c is broken down into multiple subcells (say sc ). In Ref. [5], width of a subcell is chosen as the the greatest common divisor of the width of all cells within a subregion. The transportation problem can be converted into a minimum cost maximum flow problem, which can be solved using a variety of techniques. This conversion is done as follows: • We add two extra nodes in the graph; a super-source node S and a super-sink node T . • We add |Cρ | edges between S and each cell node ci (∈ Cρ ). The cost of each such edge is 0 and the capacity is sci , that is, the total number of subcells of ci . Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 406 Finals Page 406 23-9-2008 #9 Handbook of Algorithms for Physical Design Automation Supply nodes (regions with available space) Demand nodes (regions with excess area demand) Super source Super sink Edges used to find a flow of logic elements from dense to sparse regions. The cost of each edge is dependent on physical displacement and approximations of net cost. FIGURE 20.6 Flow-based methods normally construct a bipartite graph for a portion of the placement region. Edges indicate the possibility of shifting logic elements (or portions thereof) from one area to another; the cost of each edge is based on an estimate of wirelength. • We add |Lρ | edges between each location node lj (∈ Lρ ) and T . The cost of each such node is 0 and the capacity is 1 (which indicates that each location can hold at most one subcell). • Capacity of each edge (ci , lj ) (where ci ∈ Cρ and lj ∈ Lρ ) is ∞. The cost calculation of each such edge is complex and is described in detail in the next subsection. Once the transportation problem is solved, one must then deduce each cell location from the subcell locations. One method is to assign each cell to the row that contains the majority of its subcells. Ties are broken in favor of the row that contains a subcell with the least transportation cost. X-coordinates of cells are determined by calculating the center of gravity of the X-coordinates of corresponding subcells. 20.2.1.2 Calculation of Transportation Cost In this section, we discuss how to calculate the cost of transporting a cell c (∈ Cρ ) to a location l (∈ Lρ ) with coordinates (xl , yl ). This is given by clv wcl = v∈Nc where Nc is the set of nets of cell c clv is the contribution of net v to wcl when cell c is placed at location l Our objective is the following: minimize c∈Cρ wcl Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 Finals Page 407 23-9-2008 #10 407 Legalization and Detailed Placement For a net v, let CvI and CvE denote the cells in v that are, respectively, internal and external to the region under consideration. The locations of cells in CvE are fixed, where as, the locations of cells in CvI are unpredictable during solving the problem. Note that the half-perimeter netlength model can not be directly used in this case to calculate the transportation costs, as cells within the region might be connected to each other; the cost for each cell is dependent on the locations of other cells. Three cases could occur for net v: 1. Case 1: |CvI | = 1 2. Case 2: 1 < |CvI | < |Cv | 3. Case 3: |CvI | = |Cv | For Case 1, the half-perimeter model can be applied directly. For Cases 2 and 3, Ref. [5] suggests two net models to approximate the half-perimeter net model. Let, E xmax = max |c∈CvE (xc ) E = min |c∈CvE (xc ) xmin E E ymax , ymin are defined similarly. 1. Net model I: For Case 2, the connectivity between the cells in CvI is ignored. We have E E E E , xl ) − min(xmin , xl ) + max(ymax , yl ) − min(ymin , yl ) clv = max(xmax For Case 3, a dummy cell δ is created at the center of mass of the locations of cells in CvI . We have clv = max(xδ , xl ) − min(xδ , xl ) + max(yδ , yl ) − min(yδ , yl ) 2. Net model II: Let, xImax = max |c∈I (xc ) I = min |c∈I (xc ) xmin where, I = CνI \ {c} and (xc , yc ) is the initial location of cell c. yImax and yImin are defined similarly. For Case 2, E E E E clv = max xmax , xl , xImax − min xmin , xl , xImin + max ymax , yl , yImax − min ymin , yl , yImin For Case 3, dummy cell δ is again created as above. clv = max xδ , xl , x Imax − min xδ , xl , xImin + max yδ , yl , yImax − min yδ , yl , yImin Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 408 Finals Page 408 23-9-2008 #11 Handbook of Algorithms for Physical Design Automation A detailed analyis of the two net models described above can be found in Ref. [5]. The flowbased approach of Domino is a classic technique, and one can find the basic elements showing up in a variety of other places (e.g., a legalization method described later in this chapter). 20.2.2 DIFFUSION-BASED PLACEMENT MIGRATION Conceptually related to flow-based methods is a recent diffusion-based placement migration technique by Ren [7]. The placement area is again decomposed into a set of bins; areas with high utilization are considered high pressure. A physics-based model can be used to compute particle velocities between bins with different pressures. The approach in Ref. [7] is to calculate pressure differentials between adjacent bins, and then to use this to compute velocities for elements in each region. Through a series of time steps, logic elements move from dense bins to sparse, and achieve a more uniform distribution across the placement area. The approach is illustrated at a high level in Algorithm 2. The approach requires less computation than flow-based methods, and is perhaps more intuitive. In terms of solution quality, however, it is difficult to draw conclusions: to our knowledge, there has been no direct comparison of the two methods. An apparent shortcoming of the diffusion approach is that it may spread logic elements apart, even when this is not required. For designs that have a great deal more available space than logic area, this spreading can result in higher wirelength. Methods to insert dummy space are being investigated, but these are heuristic in nature. Algorithm 2 Diffusion-based placement migration Map cells onto bins and compute initial bin densities repeat Compute horizontal and vertical velocities for each bin Compute new positions and velocities for each cell Update bin densities using the new cell positions until Density constraints are met 20.2.3 WHITE SPACE ALLOCATION Although flow-based methods are effective, they can be computationally expensive and may also be difficult to implement. Recently, a number of methods that are geometric in nature have been developed. One such technique, white space allocation (WSA), has been used to improve routability [8,9], and also to remove overlap created by gate sizing [10]. WSA in some sense reverses the approach of recursive bisection-based placement tools. Given an input placement, cutlines in alternating directions can be inserted; this is shown in Figure 20.7a. The method used to generate the initial placement is irrelevant. The placement is then modified by shifting the cutlines to meet area constraints of each side. In the first column of Figure 20.7a, the vertical cutline is positioned halfway across the placement region. If the distribution of area demand is unbalanced, with a greater portion of the area being on the right, the WSA approach shifts the vertical cutline to the left. Positions of all logic elements are scaled linearly, and the process recurses. The method is extremely fast, and is also easy to implement. Excessively dense regions are spread out easily, and the relative positions of most logic elements are preserved. Pseudocode for the approach is shown in Algorithm 3. In Ref. [8], the WSA method was used in conjunction with an estimate of routing congestion. Areas where routing was expected to be difficult were spread out, while areas with low routing demand were contracted. By applying WSA as a postprocessing step to the placements of a variety of tools, experiments showed vast improvement in almost all cases. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 Finals Page 409 23-9-2008 #12 409 Legalization and Detailed Placement (a) White space allocation (b) Cell shifting (c) Placement warping FIGURE 20.7 Geometric methods for space management; each involves the physical stretching or shifting of a portion of the design. By utilizing simple geometry (rather than a flow computation), overlap can be removed quickly and easily, although with a potential increase in wirelength. Algorithm 3 Outline of the WSA algorithm; recursion continues until each region contains only a single element Given a rectangular region r that is h high and w wide if h ≥w then Insert a horizontal cut line Compute the area of cells above and below the cut line Shift the cut line vertically to match the relative areas Scale the y positions of all cells to match the cut line position else Insert a vertical cut line Compute the area of cells to the left and right of the cut line Shift the cut line horizontally to match the relative areas Scale the x positions of all cells to match the cut line position end if Recursively process each half The method was used again in Ref. [10], as a means to adjust a placement to changes from gate sizing and buffer insertion. The simplicity of the approach has resulted in its adoption in other placement tools [9]. 20.2.4 COMPUTATIONAL GEOMETRY-BASED PLACEMENT MIGRATION A shortcoming of the WSA approach is that there may be significant sheer along the cutlines; logic elements on either side of a boundary may shift in opposite directions. The effect of this shift is Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 410 Finals Page 410 23-9-2008 #13 Handbook of Algorithms for Physical Design Automation An initial placement of logic elements; netlengths are low For routability or timing optimization, the size of logic elements may be increased Cutline-based methods for transforming a placement may introduce sheer; connections along a cutline may become significantly longer. This may introduce new timing and routability problems FIGURE 20.8 In geometric space allocation methods that use cutlines of some sort, it is possible that logic elements on opposite sides of a cutline have a significant relative displacement after shifting. The optimization objectives of global placement are not preserved, and this can negatively impact both routability and operating frequencies. that some interconnections may become significantly longer, degrading the overall performance and potentially introducing routing congestion. The sheer problem is illustrated in Figure 20.8. An alternative method by Luo [11] uses computational geometry-based methods to stretch the placement more uniformly. In one method, a rectilinear grid of bins are adjusted, such that each bin becomes an arbitrary quadrilateral; cell positions are then interpolated. A second approach performs a Delaunay triangulation of the placement area, with cell positions being interpolated within each region. 20.2.5 CELL SHIFTING Although not initially envisioned in this manner, the cell shifting technique used in the analytic placement tool FastPlace [3] can be also used for space management. In cell shifting, the placement region is divided into either horizontal or vertical stripes; this is illustrated in Figure 20.7b. Each stripe is divided into a set of bins, and these bins can expand or contract to meet area constraints. In the figure, the first column corresponds to the initial bin structure. On the basis of the area of logic elements within the bin and the total area of all bins in a row or column, the size of each bin can expand or contract. When a dense bin expands, the coordinates of any contained logic element are scaled accordingly. The total height or width of a stripe is constrained by the placement area; it is trivial to adjust the sizes of bins such that they equally share the available space. By iterating alternating horizontally and vertically, new positions can be found for the logic elements, distributing them more uniformly across the chip. The placement tool FastPlace uses these locations as added forces to the analytic formulation, it is by this means that an analytic solution with low overlap can be found. In principle, however, there is nothing that would prevent one from using the approach in the same manner as WSA. 20.2.6 GRID WARPING Yet another geometric technique found in analytic placement is grid warping [4]. This is illustrated in Figure 20.7c. Starting from an initial rectilinear quadrisection, the boundaries of the four regions can be adjusted such that they meet the area demands. Alpert/Handbook of Algorithms for Physical Design Automation AU7242_C020 Legalization and Detailed Placement Finals Page 411 23-9-2008 #14 411 As with the computational geometry approach of Luo [11], the division lines are not restricted to be rectilinear. Rather, the placement area is stretched like a rubber sheet, with positions of the logic elements being scaled. As with WSA, the grid warping method is applied recursively; each subregion is divided by a rectilinear quadrisection, and then the boundries of these subregions are adjusted. 20.2.7 SPACE MANAGEMENT SUMMARY Driven primarily by the complications of routing in a fixed-die environment, there has been an explosion of interest in space management techniques. Additionally, advanced fabrication technologies are providing more silicon real estate to work with, and designers are rarely under pressure to pack circuitry as tightly as was once done. Balancing space is a difficult task. If a design becomes spread too far apart, wires become longer than necessary; this increases power consumption, and lowers the achievable clock rates. A design that is too dense suffers from routing failures, excessive routing detours, and may not be amenable to buffer insertion or gate sizing. At this point, there is no simple method to determine how much space is “enough.” The methods used are frequently reminiscent of overlap removal methods in analytic placement, and one can expect continued cross-fertilization of ideas. When adjusting a placement to meet eliminate overlap, or to improve routability, cutlinebased methods such as WSA may encounter a sheer problem. When evaluating space management approaches, one should consider the magnitude of changes required. For small changes, simple geometric techniques may be sufficient; for larger changes, more robust flow-based methods may be a superior choice. 20.3 LEGALIZATION TECHNIQUES Placement legalization is a key step in a successful analytic placement flow, and has become important in recursive bisection as well. In a legal placement, all cells must be aligned with row boundaries (and potentially column grids), and no cells may overlap. Our discussion begins with a flow-based method, leveraging off of the space management methods described in Section 20.2.1. If a design has been spread out effectively, so that no area has demand higher than the available space, legalization is trivial. The discussion continues, with a relatively recent method by Hill [12], and extensions to it by Agnihotri [13]. This method, known as Tetris, is remarkably simple and easy to implement; it should provide good intuition for the problem. While the method works well for some problems, it can fail dramatically in others; this gives intuition to why more robust methods such as those based on minimum cost flow are necessary. 20.3.1 FLOW AND DIFFUSION-BASED LEGALIZATION Using either a minimum cost flow formulation [14], or diffusion [7], a placement that has small localized areas of overlap can be obtained. For designs that contain only standard cells (or in which macroblocks are fixed), placement legalization can be easily accomplished. By creating zones, bins, or regions that are a single cell row tall, and then ensuring that each zone meets a density constraint, a legal placement can be found by packing cells into each zone. A legal solution is guaranteed to exist; distributing cells within a zone can further improve results. Although we have discussed flow and diffusion-based methods within the context of space management, it is also appropriate to think of them as legalization techniques. Good space management makes placement legalization trivial.

Handbook of algorithms for physical design automation part 43

Nội dung