A Fast Optimal Double Row Legalization Algorithm
Stefan Hougardy [email protected] Institute for Discrete Mathematics, University of Bonn, Bonn, Germany
Meike Neuwohner [email protected] Institute for Discrete Mathematics, University of Bonn, Bonn, Germany
Ulrike Schorr [email protected] Design Systems Inc., Munich, Germany
ABSTRACT
In Placement Legalization, it is often assumed that (almost) all standard cells possess the same height and can therefore be aligned in cell rows, which can then be treated independently. However, this is no longer true for recent technologies, where a substantial number of cells of double- or even arbitrary multiple-row height is to be expected. Due to interdependencies between the cell placements within several rows, the legalization task becomes considerably harder. In this paper, we show how to optimize quadratic cell movement for pairs of adjacent rows comprising cells of single- as well as double-row height with a fixed left-to-right ordering in time O(n · log(n)), whereby n denotes the number of cells involved. As opposed to prior works, we thereby do not artificially bound the maximum cell movement and can guarantee to find an optimum solution. Experimental results show an average percentage decrease of over 26% in the total quadratic movement when compared to a legalization approach that fixes cells of more than single-row height after Global Placement.

CCS CONCEPTS
• Hardware → Placement.

KEYWORDS
Placement; Legalization; double-row-height cells
1 INTRODUCTION

The Standard Placement Problem captures the task of locating hundreds of thousands or even millions of standard cells, which are usually assumed to exhibit uniform heights, within the rectangular chip area. Thereby, multiple objectives such as minimizing the total length of inter-cell electrical connections (nets) or achieving desirable timing properties have to be respected. Given the fact that even the underlying packing problem is strongly
NP-hard [9], the placement task is most commonly split into the three sub-problems of Global Placement, Legalization and Detailed Placement. Global Placement aims at finding cell locations that approximately minimize the total netlength for a certain net model and obey bounds on local packing density, but does not have to ensure internal disjointness of shapes. The Legalization step deals with resolving the remaining overlaps by shifting cells locally, trying to minimize either netlength or the total (squared) cell displacement. The latter is desirable because it honors the quality of the Global Placement result (e.g. w.r.t. timing) and balances cell movement. Detailed Placement usually incorporates several post-optimization routines. When only cells of single-row height are present, the Standard Cell Legalizers "Tetris" [13] and "Abacus" [20] produce good results. They process the cells one by one, ordered by the x-coordinates of their Global Placement positions, and place each cell at the closest free position [13] or at the end of a nearby row, choosing the one that allows for the minimum possible total cell movement [20]. Another strategy, which is employed within the BonnTools project [16], [3], uses a min-cost-flow approach to first assign the cells to zones, unblocked parts of a row [1]. Fixing the left-to-right ordering of the cells contained within each zone to the one imposed by the Global Placement locations, legal cell positions are then obtained by minimizing the total squared cell displacement (or (weighted) bounding box netlength) within each zone. The latter task is captured by the Single Row Problem, which also occurs as a sub-problem of the Abacus Legalizer. It was first studied by Kahng, Tucker and Zelikovsky [15], who suggested the
Clumping Algorithm to tackle it. While their implementation runs in O(n · log(n)) for unit net weights (where n denotes the number of nets), the fastest implementation, which is due to Suhl [21], achieves a running time of O(n · log(n)) even for general net weights. A similar result has been obtained in the context of scheduling [10]. When the goal is to optimize quadratic cell movement, the Clumping Algorithm can easily be implemented to run in time linear in the number of cells. While the mentioned approaches work well in the presence of uniform cell heights, it is not obvious how to generalize them to a setting where cells of double- or even arbitrary multiple-row height may occur. Wang et al. [23] try to adapt the Clumping Algorithm to the double-row case, but manage to guarantee optimality only in a very restricted setting. In contrast to this, Wu and Chu [24] suggest to handle cells of double-row height by, depending on the placement density, either inflating or matching cells of single-row height to ensure uniform cell heights again. However, as was already pointed out in [19], this strategy can neither handle distinct power alignment constraints nor cells covering more than two rows. Besides, both merging and inflating cells may drastically reduce the placement flexibility as well as lead to a significant area overhead. Many other authors, therefore, settle for a dynamic programming solution instead of generalizing the Clumping Algorithm, guaranteeing a reasonable runtime by artificially bounding the maximum displacement allowed for each cell by a small number of placement sites.
In exchange, they show how to make their dynamic program aware of several other desirable objective traits or incorporate a larger degree of freedom by allowing for a local reordering of cells, even between multiple rows [6], [11], [12], [19]. Other approaches comprise solving a linear complementarity problem to approximately minimize the squared cell movement and then resolving the remaining overlaps [5], [18], [25], applying integer linear programming to legalize sufficiently small regions of the chip separately [14], or making use of a cell insertion scheme [7], combined with bipartite matching and min-cost-flow algorithms [17]. In this paper, we present a fast O(n log n)-time algorithm (where n denotes the number of cells) minimizing the total quadratic displacement for cells of single- and double-row height that need to be accommodated in two adjacent rows obeying a fixed ordering of the cells covering each row. In contrast to previous dynamic programming approaches, we do not need to artificially restrict the number of available positions for each cell, which may be beneficial for regions of low density and when dealing with coarser grid sizes for double-row cells, which our algorithm can take into account. Moreover, our approach can be extended to support rectangular movebounds for the cells. The rest of this paper is organized as follows: In Section 2, we discuss the Single Row Problem, the Clumping Algorithm and its implementation for piecewise quadratic cost functions. In Section 3, we then introduce the Double Row Problem and show how to reduce it to the Single Row Problem in Section 4. Finally, Section 5 presents our experimental results.
2 THE SINGLE ROW PROBLEM

This section comprises the base results our reduction from the Double Row to the Single Row Problem builds upon.
• Section 2.1 reviews the Clumping Algorithm and its analysis.
• Theorem 2.3 points out how an optimum solution to the Single Row Problem changes when the domain is restricted.
• Section 2.2 discusses an efficient implementation of the Clumping Algorithm for piecewise quadratic cost functions.
2.1 The Clumping Algorithm

Definition 2.1 (Single Row Problem).
Instance: A tuple (C, w, x_min, x_max, (f_i)_{i=1}^{n}) consisting of
• a set C := {C_1, ..., C_n} of cells,
• cell widths w : C → R_{>0},
• a minimum and maximum coordinate x_min, x_max ∈ R satisfying Σ_{i=1}^{n} w(C_i) ≤ x_max − x_min and
• convex, continuous functions f_i : R → R for i = 1, ..., n.
Task: Find coordinates (x_i)_{i=1}^{n} minimizing Σ_{i=1}^{n} f_i(x_i) subject to
• x_min ≤ x_1,
• x_i + w(C_i) ≤ x_{i+1} for i = 1, ..., n − 1 and
• x_n + w(C_n) ≤ x_max.
For i = 1, ..., n, we write [q_i^−, q_i^+] := argmin{ f_i(x) : x ∈ [x_min + Σ_{j=1}^{i−1} w(C_j), x_max − Σ_{j=i}^{n} w(C_j)] }.

The Single Row Problem can be solved by the aforementioned
Clumping Algorithm [15]. The given formulation of the Clumping Algorithm (Algorithm 1) is based on [2].

Theorem 2.2 ([15]). The Clumping Algorithm finds an optimum placement.
We prove a slightly stronger statement which we will need at a later point. In order to formulate it, we have to introduce the notion of a block, which we define as follows: For a cell C_i ∈ L, the block B(i) represented by C_i is defined to be the consecutive set of cells B(i) := {C_j : i ≤ j ≤ n ∧ ∄ C_k ∈ L : i < k ≤ j}. The blocks present at a given point during the run of the Clumping Algorithm indicate sets of cells that the algorithm forces to be placed contiguously (or has clumped together) at that time. Note that the partition into blocks can only get coarser throughout the run of the algorithm.

Algorithm 1: Clumping Algorithm
Input: An instance of the Single Row Problem given by an ordered list L = (C_1, ..., C_n) of cells, cell widths w : {C_1, ..., C_n} → R_{>0}, a row interval [x_min, x_max] and convex cost functions (f_i)_{i=1}^{n}.
Output: Optimum positions (x_i)_{i=1}^{n}.
  Add an auxiliary element C_0 to the front of L and set x_0 ← x_min and w_0 ← 0
  for i ← 1 to n do
    Compute q_i^− and q_i^+
    w_i ← w(C_i)
  for i ← 1 to n do
    PLACE(C_i, L)
  for i ← 1 to n with C_i ∉ L do
    x_i ← x_{i−1} + w(C_{i−1})
  return (x_i)_{i=1}^{n}

Algorithm 2: PLACE(C_i, L)
  C_ℓ ← predecessor of C_i in L
  if x_ℓ + w_ℓ ≤ q_i^+ then
    x_i ← max{x_ℓ + w_ℓ, q_i^−}
  else
    COLLAPSE(C_ℓ, C_i, L)
    PLACE(C_ℓ, L)

Algorithm 3: COLLAPSE(C_ℓ, C_i, L)
  Redefine f_ℓ as x ↦ f_ℓ(x) + f_i(x + w_ℓ) and update q_ℓ^− and q_ℓ^+ (w.r.t. [x_min + Σ_{j=1}^{ℓ−1} w(C_j), x_max − Σ_{j=ℓ}^{n} w(C_j)])
  w_ℓ ← w_ℓ + w_i
  Remove C_i from L

Theorem 2.3. Let I′ := (C, w, x′_min, x′_max, (f_i)_{i=1}^{n}) be an instance of the Single Row Problem, let x_min ≤ x′_min < x′_max ≤ x_max and let I denote the instance of the Single Row Problem that arises from replacing x′_min and x′_max by x_min and x_max, respectively. Then there exists an optimum solution (x*_i)_{i=1}^{n} for I′ such that for any block B(i) formed during the run of Algorithm 1 on I, the cells in B(i) are placed contiguously.

Proof. By induction on the number of calls to
COLLAPSE. Initially, the statement is clearly true because every cell constitutes a block on its own. Consider a call to COLLAPSE where two blocks B(ℓ) and B(i) are united by deleting C_i from L, and pick an optimum solution (x*_j)_{j=1}^{n} for I′ respecting all previously formed blocks. If additionally x*_{i−1} + w(C_{i−1}) = x*_i, we are done, so assume x*_ℓ + Σ_{j=ℓ}^{i−1} w(C_j) = x*_{i−1} + w(C_{i−1}) < x*_i. By construction of the algorithm, we have q_ℓ^− ≤ x_ℓ ≤ q_ℓ^+, w_ℓ = Σ_{j=ℓ}^{i−1} w(C_j) and x_ℓ + w_ℓ > q_i^+. If x*_i > q_i^+, then we can shift B(i) to the left until it hits max{x*_{i−1} + w(C_{i−1}), q_i^+} and thereby decrease the total cost since the cost function f_i of B(i) is strictly monotonically increasing on [q_i^+, x_max − Σ_{j=i}^{n} w(C_j)] ⊇ [q_i^+, x′_max − Σ_{j=i}^{n} w(C_j)], a contradiction to the assumed optimality of (x*_j)_{j=1}^{n}. Hence x*_i ≤ q_i^+. Then x*_i − w_ℓ < x_ℓ ≤ q_ℓ^+, so we can shift B(ℓ) to the right until it hits the left boundary of B(i) without increasing the total cost since the cost function f_ℓ of B(ℓ) is monotonically decreasing on [x_min + Σ_{j=1}^{ℓ−1} w(C_j), q_ℓ^+] ⊇ [x′_min + Σ_{j=1}^{ℓ−1} w(C_j), q_ℓ^+]. □

Remark.
Together with the fact that the Clumping Algorithm places each block B(i) within its optimum range [q_i^−, q_i^+] and hence also within [x_min, x_max − w_i] (whereby q_i and w_i refer to the respective values after B(i) has been formed), Theorem 2.3 implies optimality and therefore in particular the correctness of Theorem 2.2.

Theorem 2.4. Let I and I′ be as in Theorem 2.3 and let (x*_i)_{i=1}^{n} be the solution computed by a run of the Clumping Algorithm on I. Then an optimum solution (x′*_i)_{i=1}^{n} for I′ is given by
x′*_i = min{ x′_max − Σ_{j=i}^{n} w(C_j), max{ x′_min + Σ_{j=1}^{i−1} w(C_j), x*_i } } for i = 1, ..., n.

Proof. Feasibility follows easily from the fact that we have x′_max − x′_min ≥ Σ_{i=1}^{n} w(C_i) by definition of the Single Row Problem. By Theorem 2.3, it therefore suffices to show that (x′*_i)_{i=1}^{n} places each block B(i) arising from the run of the Clumping Algorithm on I optimally. Pick such a block B(i) and call the cumulated cost function to which f_i is set during the course of the algorithm f̄_i. Then, by definition of the Clumping Algorithm, we have x*_i ∈ [q̄_i^−, q̄_i^+]. We distinguish the three cases
• x*_i < x′_min + Σ_{j=1}^{i−1} w(C_j),
• x*_i ∈ [x′_min + Σ_{j=1}^{i−1} w(C_j), x′_max − Σ_{j=i}^{n} w(C_j)] and
• x′_max − Σ_{j=i}^{n} w(C_j) < x*_i.
In the first case, x′*_i = x′_min + Σ_{j=1}^{i−1} w(C_j) is set to the leftmost feasible position and furthermore, f̄_i is monotonically increasing to the right of x*_i < x′*_i, showing that B(i) is placed optimally. In the second case, x′*_i = x*_i is placed within the optimum range of f̄_i ↾ [x_min + Σ_{j=1}^{i−1} w(C_j), x_max − Σ_{j=i}^{n} w(C_j)] and therefore in particular occupies an optimum position for this function. Finally, in the third case, we get x′*_i = x′_max − Σ_{j=i}^{n} w(C_j), which is the rightmost feasible position C_i may attain. Given that f̄_i is monotonically decreasing on [x′_min + Σ_{j=1}^{i−1} w(C_j), x′*_i] ⊆ [x_min + Σ_{j=1}^{i−1} w(C_j), q̄_i^+], optimality follows again. □

Note that if all of the f_i are quadratic functions stored as triples (a, b, c) of coefficients such that f_i : x ↦ a·x² + b·x + c, the Clumping Algorithm can be implemented to run in linear time, as pointed out, for example, in [21], since the computation of minima as well as shifting a quadratic function in x-direction or adding it to another one only requires a constant number of arithmetic operations on the respective coefficients.

Our strategy to solve the problem of minimizing squared movement within two adjacent rows containing cells of both single- and double-row height with a prescribed left-to-right ordering is based on a reduction of an instance of the latter problem to an instance of the Single Row Problem with piecewise quadratic objective functions. In the following subsection, we therefore discuss how to implement the Clumping Algorithm in this case.
2.2 Piecewise Quadratic Cost Functions

Definition 2.5 (piecewise quadratic function). For [a, b] ⊆ R, we call a continuous function g : [a, b] → R piecewise quadratic if there exist a nonnegative integer k and
• real numbers a =: x_0 < x_1 < · · · < x_k < x_{k+1} := b and
• quadratic functions (g_j : R → R)_{j=0}^{k}
such that g ↾ [x_j, x_{j+1}] = g_j ↾ [x_j, x_{j+1}] for all j = 0, ..., k. The positions (x_j)_{j=1}^{k} are called kinks of g. Note that there exists a unique representation of g with g_j ≠ g_{j+1} for all j = 0, ..., k − 1, to which we refer when talking about the set of kinks of a piecewise quadratic function.

Our goal is to achieve a running time of
O((n + K) · log(min{n, K})) for the Clumping Algorithm, where n denotes the number of cells and K specifies the total number of kinks occurring among all cost functions. Therefore, we suggest an implementation of the algorithm that is based on the one proposed in [21] for the case of piecewise linear objective functions. Due to the page limit, we do not present a detailed description, but rather give a short overview of the data structures used as well as a brief outline of the analysis.

Representation of cost functions of cells.
We associate the quadratic function x ↦ a·x² + b·x + c with the triple (a, b, c) and store the restriction f_i ↾ [x_min, x_max] of the piecewise quadratic cost function f_i as follows: Let x_min =: p^i_{k_i+1} < p^i_{k_i} < · · · < p^i_1 < p^i_0 := x_max be such that {p^i_1, ..., p^i_{k_i}} is the set of kinks of f_i ↾ [x_min, x_max] and let f_i ↾ [p^i_{j+1}, p^i_j] be given by the quadratic function f_{ij}, j = 0, ..., k_i. Then we represent f_i by the ordered list F_i := ((p^i_{j+1}, f_{ij}))_{j=0}^{k_i} consisting of pairs of quadratic functions defining f_i ↾ [x_min, x_max] on a certain interval and the left boundary of their domain. Throughout the algorithm, for each cell C_i that has already been processed and is currently placed at the position x_i, we maintain the index j(i) ∈ {0, ..., k_i} for which p^i_{j(i)+1} < x_i ≤ p^i_{j(i)}, respectively j(i) = k_i if x_i = x_min. Observe that if we implicitly assume all cells to be located at x_max initially and further consider a cell C_j ∈ B(i) as being placed at x_i + Σ_{m=i}^{j−1} w(C_m), cells never move to the right during a run of the Clumping Algorithm. To see this, note that by definition of q_i^− and q_i^+, each cell is located within [x_min, x_max] by construction. Moreover, whenever x_ℓ is reassigned after a call to COLLAPSE(C_ℓ, C_i, L), then, denoting by C_k the predecessor of C_ℓ in L, we get max{x_k + w_k, q_ℓ^−} = x_ℓ ≤ q_ℓ^+ and x_ℓ + w_ℓ > q_i^+ ≥ q_i^− before COLLAPSE is performed. Hence, after the update of q_ℓ^−, we have q_ℓ^− ≤ x_ℓ, implying that x_ℓ is decreased, remains unchanged, or another call to COLLAPSE is launched. In the first case, all already processed cells C_j with j > ℓ belong to B(ℓ) and therefore move to the left as well. As a consequence, the total time needed to maintain the indices j(i) can be bounded by O(Σ_{i=1}^{n} k_i) = O(K) since none of these indices is ever decreased.

Representation of cost functions of blocks.
In order to realize calls to PLACE and COLLAPSE efficiently, we need some additional data which we store for the blocks consisting of cells we have already processed. Thereby, the key observation is the fact that in order to implement the function PLACE, only local information on the given convex cost function is required since for a convex real function, the question whether the interval where it attains its minimum lies to the left or right of or contains a certain coordinate can be answered by considering local monotonicity properties. In this spirit, for each block B(i), we store the following data:
• a heap H(i) that contains for each C_j ∈ B(i) the position p^j_{j(j)+1} − Σ_{m=i}^{j−1} w(C_m) unless j(j) = k_j and
• the quadratic function g_i defining f_i on the non-empty interval (max H(i), x_i] (whereby max ∅ := −∞).
We outline how to use them in order to implement PLACE and COLLAPSE. Consider a call to PLACE(C_i, L) and remember that we implicitly assume that x_j = x_max for 1 ≤ j ≤ n initially. Further observe that this convention ensures that throughout the algorithm, for C_ℓ, C_i ∈ L with ℓ < i, we have x_ℓ + w_ℓ ≤ x_i. In order to execute PLACE, the first thing we have to decide is whether x_ℓ + w_ℓ ≤ q_i^+. While we can compute the value of the left hand side in constant time, q_i^+ is not necessarily known to us. However, what we do know is that by convexity of f_i, q_i^+ is the unique position in [x_min + Σ_{j=1}^{i−1} w(C_j), x_max − Σ_{j=i}^{n} w(C_j)] such that f_i, restricted to this interval, is monotonically decreasing to its left and strictly monotonically increasing to its right. As a consequence, if f_i ↾ (max H(i), x_i] (which is given by the quadratic function g_i) is monotonically decreasing, we can be sure that q_i^+ ≥ x_i ≥ x_ℓ + w_ℓ. On the other hand, as long as f_i ↾ (max H(i), x_i] is strictly monotonically increasing, we can decrease x_i to max{x_ℓ + w_ℓ, max H(i)}, and, whenever this maximum is attained by max H(i), pop all corresponding entries from the heap, increment the corresponding indices j(·) by one and insert a new heap entry unless they reach k_j, and update g_i. Note that if one precomputes all of the values Σ_{j=1}^{i−1} w(C_j), i = 1, ..., n, recursively in linear time, which allows to determine Σ_{m=i}^{j−1} w(C_m) in constant time throughout the algorithm, each of these update steps takes constant time per heap entry. In each case where the maximum is not attained by max H(i), we can infer that q_i^+ < x_ℓ + w_ℓ and therefore launch a call of COLLAPSE.
Finally, if there is some z ∈ (max H(i), x_i) where g_i changes from being monotonically decreasing to being strictly monotonically increasing, then z = q_i^+ and we are able to decide whether or not x_ℓ + w_ℓ ≤ q_i^+ holds. In case the latter is true, we also have to determine max{x_ℓ + w_ℓ, q_i^−}. To this end, observe that by convexity of f_i, q_i^− is the unique coordinate in [x_min + Σ_{j=1}^{i−1} w(C_j), x_max − Σ_{j=i}^{n} w(C_j)] such that f_i, restricted to the latter interval, is strictly monotonically decreasing to the left, and monotonically increasing to the right of q_i^−. By applying a similar strategy as before, we can therefore either compute q_i^− ∈ (max H(i), x_i] or set x_i to max{x_ℓ + w_ℓ, max H(i)} ≥ q_i^−. As a consequence, we are left with discussing the implementation of COLLAPSE(C_ℓ, C_i, L). Since we do not explicitly recompute q_ℓ^− and q_ℓ^+ and the updates of w_ℓ and L can be easily performed in constant time when implementing L as a doubly linked list, we only have to take care of the redefinition of f_ℓ. To this end, note that the local quadratic function g_ℓ can be updated by setting g_ℓ(x) ← g_ℓ(x) + g_i(x + w_ℓ) by a constant number of arithmetic operations on the respective coefficients. As far as the heap H(ℓ) is concerned, we have to shift all entries in H(i) by w_ℓ to the left and then merge H(i) into H(ℓ). By employing Leftist Heaps and storing key differences instead of the actual keys (see [22] for further details), the shifting can be performed in constant and the merging in logarithmic (w.r.t. the total number of heap elements) time. A logarithmic or even constant time bound also applies for all other heap operations we perform, which comprise the creation of empty heaps, the extraction and deletion of maximum heap entries as well as the insertion of new elements.
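The "key differences instead of actual keys" trick referenced above can be sketched as follows: the root stores an absolute key, every other node stores its key minus its parent's key, so shifting all keys is a single update at the root while merging stays logarithmic. This is an illustrative leftist max-heap sketch under these assumptions, not the authors' code.

```cpp
#include <cassert>
#include <utility>

// Leftist max-heap node storing a relative key: absolute at the root,
// key-minus-parent-key everywhere else (hence <= 0 below the root).
struct Node {
    double d;               // relative key (absolute at the root)
    int rank = 1;           // leftist rank (null-path length + 1)
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(double key) : d(key) {}
};

// Merge two heaps whose roots carry keys in the same frame of reference.
Node* merge(Node* a, Node* b) {
    if (!a) return b;
    if (!b) return a;
    if (a->d < b->d) std::swap(a, b);   // larger key on top
    b->d -= a->d;                        // b's key becomes relative to a
    a->right = merge(a->right, b);
    if (!a->left || (a->right && a->right->rank > a->left->rank))
        std::swap(a->left, a->right);    // restore the leftist property
    a->rank = (a->right ? a->right->rank : 0) + 1;
    return a;
}

Node* insert(Node* root, double key) { return merge(root, new Node(key)); }

double maxKey(Node* root) { return root->d; }

Node* popMax(Node* root) {
    Node* l = root->left;
    Node* r = root->right;
    if (l) l->d += root->d;  // promote subtrees back to absolute keys
    if (r) r->d += root->d;
    Node* rest = merge(l, r);
    delete root;
    return rest;
}

void shiftAll(Node* root, double c) { if (root) root->d += c; } // O(1) shift
```

Only the root changes under a shift, so the heap H(i) can be moved by w_ℓ in constant time before being merged into H(ℓ).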
By observing that the maximum heap size is bounded by min{n, K} since each heap contains at most one entry per cell, but also at most one entry per kink, and that the total number of heap operations is O(n + K) since for every (pair of) shifting and merging, we remove an entry from L, and every kink position is added to and removed from a heap at most once, we obtain the claimed runtime bound.

3 THE DOUBLE ROW PROBLEM

In this section, we
• formally introduce the Double Row Problem and
• reformulate the feasibility constraints as those of an instance of the Single Row Problem defined on the set of cells of double-row height.
As the name of the problem indicates, the task is to place a set of cells of single- and double-row height within a given rectangular window covering two rows, minimizing a sum of continuous, convex objective functions on the positions of the individual cells. Thereby, the left-to-right ordering of those cells occupying a certain row is fixed and the cells are not allowed to overlap.

Definition 3.1 (Double Row Problem).
Instance:
• a non-empty set C := {C_1, ..., C_n} of double-row cells,
• sets of cells B := {b_{ij} : i = 0, ..., n, j = 1, ..., s_i} and T := {t_{ij} : i = 0, ..., n, j = 1, ..., u_i} to be placed in the bottom respectively top row, where s_i, u_i ∈ N for i = 0, ..., n,
• cell widths w : C ∪ B ∪ T → R_{>0},
• a minimum and maximum coordinate x_min, x_max ∈ R such that x_min + Σ_{i=1}^{n} w(C_i) + Σ_{i=0}^{n} max{ Σ_{j=1}^{s_i} w(b_{ij}), Σ_{j=1}^{u_i} w(t_{ij}) } ≤ x_max and
• convex, continuous cost functions f_i : R → R for i = 1, ..., n, g_{ij} : R → R for i = 0, ..., n, j = 1, ..., s_i and h_{ij} : R → R for i = 0, ..., n, j = 1, ..., u_i.

Figure 1: The Double Row Problem.

Task:
Find coordinates (x_i)_{i=1}^{n}, (y_{ij})_{i=0,...,n, j=1,...,s_i} and (z_{ij})_{i=0,...,n, j=1,...,u_i} minimizing Σ_{i=1}^{n} f_i(x_i) + Σ_{i=0}^{n} ( Σ_{j=1}^{s_i} g_{ij}(y_{ij}) + Σ_{j=1}^{u_i} h_{ij}(z_{ij}) ) subject to
• x_i + w(C_i) ≤ x_{i+1} for i = 0, ..., n,
• x_i + w(C_i) ≤ y_{i1} for i = 0, ..., n,
• y_{ij} + w(b_{ij}) ≤ y_{i,j+1} for i = 0, ..., n, j = 1, ..., s_i − 1,
• y_{i,s_i} + w(b_{i,s_i}) ≤ x_{i+1} for i = 0, ..., n,
• x_i + w(C_i) ≤ z_{i1} for i = 0, ..., n,
• z_{ij} + w(t_{ij}) ≤ z_{i,j+1} for i = 0, ..., n, j = 1, ..., u_i − 1,
• z_{i,u_i} + w(t_{i,u_i}) ≤ x_{i+1} for i = 0, ..., n,
where x_0 := x_min, w(C_0) := 0, x_{n+1} := x_max and each constraint only applies if all of its variables exist. For i = 0, ..., n, we define B_i := {b_{ij} : j = 1, ..., s_i} and T_i := {t_{ij} : j = 1, ..., u_i}.

Proposition 3.2. Given a tuple (x*_i)_{i=1}^{n} and an instance of the Double Row Problem as defined above, there exists a feasible solution to the Double Row Problem with x_i = x*_i for i = 1, ..., n if and only if
x*_i + w(C_i) + max{ Σ_{j=1}^{s_i} w(b_{ij}), Σ_{j=1}^{u_i} w(t_{ij}) } ≤ x*_{i+1} for i = 0, ..., n,
where x*_0 := x_0 := x_min, w(C_0) := 0 and x*_{n+1} := x_{n+1} := x_max. We call such a tuple (x*_i)_{i=1}^{n} feasible.

Remark.
Note that a tuple (x*_i)_{i=1}^{n} is feasible if and only if it defines a feasible solution to the instance of the Single Row Problem with cell set C, cell widths
w′(C_i) := w(C_i) + max{ Σ_{j=1}^{s_i} w(b_{ij}), Σ_{j=1}^{u_i} w(t_{ij}) }
and enclosing x-interval [x′_min, x′_max] given by
x′_min := x_min + max{ Σ_{j=1}^{s_0} w(b_{0j}), Σ_{j=1}^{u_0} w(t_{0j}) } and x′_max := x_max.

4 REDUCTION TO THE SINGLE ROW PROBLEM

For the remainder of this paper, we restrict ourselves to the case of piecewise quadratic cost functions and show how to reduce the respective variant of the Double Row Problem to the Single Row one. As we have already seen how to deal with the subject of feasibility, it remains to transfer costs from the single-row cells to the double-row ones, i.e. to determine the minimum cost of a feasible extension of a feasible tuple (x*_i)_{i=1}^{n} and to express it as Σ_{i=1}^{n} f′_i(x*_i) for some piecewise quadratic objective functions f′_i.
• We examine the structure of an optimum extension of a feasible tuple to coordinates for the single-row height cells.
• Lemma 4.1 expresses the total cost of such an extension, up to a constant, as a sum Σ_{i=1}^{n} F_i(x*_i).
• We show that each of the functions F_i is convex and piecewise quadratic and linearly bound the total number of kinks.
• We then derive our main result stated in Theorem 4.2.
Consider the coordinates (ȳ_{ij})_{i=0,...,n, j=1,...,s_i} and (z̄_{ij})_{i=0,...,n, j=1,...,u_i} arising from runs of the Clumping Algorithm on the instances of the Single Row Problem given by (B_i, w ↾ B_i, x_min, x_max, (g_{ij})_{j=1}^{s_i}) and (T_i, w ↾ T_i, x_min, x_max, (h_{ij})_{j=1}^{u_i}) for i = 0, ..., n. Note that once a feasible tuple (x*_i)_{i=1}^{n} of coordinates for the double-row cells has been fixed, coordinates (y_{ij}) and (z_{ij}) extend them to a feasible solution of the Double Row Problem if and only if for each i ∈ {0, ..., n}, (y_{ij})_{j=1}^{s_i} and (z_{ij})_{j=1}^{u_i} constitute feasible solutions of the instances of the Single Row Problem given by (B_i, w ↾ B_i, x*_i + w(C_i), x*_{i+1}, (g_{ij})_{j=1}^{s_i}) and (T_i, w ↾ T_i, x*_i + w(C_i), x*_{i+1}, (h_{ij})_{j=1}^{u_i}), respectively, whereby again x*_0 := x_min, w(C_0) := 0 and x*_{n+1} := x_max. Note that these instances are feasible by feasibility of (x*_i)_{i=1}^{n}. But now, since for each i = 0, ..., n, we have x_min ≤ x*_i + w(C_i) ≤ x*_{i+1} ≤ x_max, Theorem 2.4 tells us that an optimum extension (y*_{ij}) and (z*_{ij}) of (x*_i)_{i=1}^{n} is given by

y*_{ij} = min{ x*_{i+1} − Σ_{m=j}^{s_i} w(b_{im}), max{ x*_i + w(C_i) + Σ_{m=1}^{j−1} w(b_{im}), ȳ_{ij} } }   (1)

and

z*_{ij} = min{ x*_{i+1} − Σ_{m=j}^{u_i} w(t_{im}), max{ x*_i + w(C_i) + Σ_{m=1}^{j−1} w(t_{im}), z̄_{ij} } }.   (2)

This allows us to express the total cost of the solution in terms of the coordinates (x*_i)_{i=1}^{n}:

Lemma 4.1. Let (ȳ_{ij}) and (z̄_{ij}) be as before and define

F_i : x ↦ f_i(x)   (3)
+ Σ_{j=1}^{s_{i−1}} g_{i−1,j}( min{ x − Σ_{m=j}^{s_{i−1}} w(b_{i−1,m}), ȳ_{i−1,j} } )   (4)
+ Σ_{j=1}^{s_i} g_{ij}( max{ x + w(C_i) + Σ_{m=1}^{j−1} w(b_{im}), ȳ_{ij} } )   (5)
+ Σ_{j=1}^{u_{i−1}} h_{i−1,j}( min{ x − Σ_{m=j}^{u_{i−1}} w(t_{i−1,m}), z̄_{i−1,j} } )   (6)
+ Σ_{j=1}^{u_i} h_{ij}( max{ x + w(C_i) + Σ_{m=1}^{j−1} w(t_{im}), z̄_{ij} } )   (7)

and c := Σ_{i=1}^{n−1} Σ_{j=1}^{s_i} g_{ij}(ȳ_{ij}) + Σ_{i=1}^{n−1} Σ_{j=1}^{u_i} h_{ij}(z̄_{ij}). Then for a feasible tuple (x*_i)_{i=1}^{n}, the total cost of an optimum solution to the Double Row Problem with x_i = x*_i for i = 1, ..., n amounts to Σ_{i=1}^{n} F_i(x*_i) − c.

Proof. Recall that an optimum extension (y*_{ij}) and (z*_{ij}) of (x*_i)_{i=1}^{n} is given by (1) and (2).
We are done if we can show that for any cell, the part of the cost term involving its objective function matches the cost of its position in the given solution. For the cells (C_i)_{i=1}^{n}, this is clear. For a cell b_{0j} with j ∈ {1, ..., s_0}, the desired statement follows from x*_0 + w(C_0) + Σ_{m=1}^{j−1} w(b_{0m}) = x_min + Σ_{m=1}^{j−1} w(b_{0m}) ≤ ȳ_{0j}, and a similar argument applies for i = n. For a cell b_{ij} with i ∈ {1, ..., n − 1} and j ∈ {1, ..., s_i}, we exemplarily consider the case where ȳ_{ij} ≤ x*_i + w(C_i) + Σ_{m=1}^{j−1} w(b_{im}) since the cases x*_i + w(C_i) + Σ_{m=1}^{j−1} w(b_{im}) < ȳ_{ij} < x*_{i+1} − Σ_{m=j}^{s_i} w(b_{im}) and x*_{i+1} − Σ_{m=j}^{s_i} w(b_{im}) ≤ ȳ_{ij} can be treated similarly. In the mentioned case, we get
y*_{ij} = min{ x*_{i+1} − Σ_{m=j}^{s_i} w(b_{im}), max{ x*_i + w(C_i) + Σ_{m=1}^{j−1} w(b_{im}), ȳ_{ij} } } = max{ x*_i + w(C_i) + Σ_{m=1}^{j−1} w(b_{im}), ȳ_{ij} }
and min{ x*_{i+1} − Σ_{m=j}^{s_i} w(b_{im}), ȳ_{ij} } = ȳ_{ij}, so
g_{ij}( max{ x*_i + w(C_i) + Σ_{m=1}^{j−1} w(b_{im}), ȳ_{ij} } ) + g_{ij}( min{ x*_{i+1} − Σ_{m=j}^{s_i} w(b_{im}), ȳ_{ij} } ) − g_{ij}(ȳ_{ij}) = g_{ij}(y*_{ij}) + g_{ij}(ȳ_{ij}) − g_{ij}(ȳ_{ij}) = g_{ij}(y*_{ij}).
The cells in T can be treated analogously. □

Up to the constant c, which only depends on the given instance of the Double Row Problem, but not on the tuple (x*_i)_{i=1}^{n}, we can hence express the costs of an optimum solution extending a feasible tuple (x*_i)_{i=1}^{n} as a sum of the cost functions (F_i)_{i=1}^{n} applied to the individual coordinates. Note that each of the summands contributing to F_i and hence F_i itself is piecewise quadratic since linear shifting as well as replacement by a constant function to the left or right of a certain coordinate (ensuring continuity) preserves this property.
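For concreteness, the feasibility side of this reduction (Proposition 3.2 and the Remark following it) amounts to a one-pass width transformation on the double-row cells. The sketch below uses illustrative names, with group i holding the widths of the single-row cells between C_i and C_{i+1}; it is not the authors' code.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Reduced Single Row instance on the double-row cells: each width is inflated
// by the wider of the two groups of single-row cells that must fit after it,
// and the left boundary absorbs group 0.
struct ReducedInstance {
    std::vector<double> w;  // w'(C_i) = w(C_i) + max{sum of B_i, sum of T_i}
    double xmin;            // x'_min = x_min + max{sum of B_0, sum of T_0}
    double xmax;            // x'_max = x_max
};

ReducedInstance reduceToSingleRow(
    const std::vector<double>& wC,               // widths of C_1, ..., C_n
    const std::vector<std::vector<double>>& wB,  // wB[i]: widths of B_i, i = 0..n
    const std::vector<std::vector<double>>& wT,  // wT[i]: widths of T_i, i = 0..n
    double xmin, double xmax) {
    auto total = [](const std::vector<double>& v) {
        return std::accumulate(v.begin(), v.end(), 0.0);
    };
    ReducedInstance out;
    const int n = static_cast<int>(wC.size());
    for (int i = 1; i <= n; ++i)
        out.w.push_back(wC[i - 1] + std::max(total(wB[i]), total(wT[i])));
    out.xmin = xmin + std::max(total(wB[0]), total(wT[0]));
    out.xmax = xmax;
    return out;
}
```

A tuple of double-row coordinates is then feasible for the Double Row Problem exactly when it is feasible for this inflated Single Row instance.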
In addition to that, it is not hard to see that the total number of kinks of the cost functions $(F_i)_{i=1}^{n}$ can be bounded by $2 \cdot (|\mathcal{B}| + |\mathcal{T}|) + K$, where $K$ denotes the total number of kinks present in the cost functions of the single- and double-row cells. To show that all $F_i$ are actually convex, it suffices to show that each of the summands (3)-(7) induces a convex function. This is clear for (3), and we exemplarily show it for (5). Let $\mathcal{L}_i$ denote the list of cells arising from the run of the Clumping Algorithm on the aforementioned instance of the Single Row Problem with cell set $\mathcal{B}_i$. Given that for $b_{ij} \in \mathcal{L}_i$, the cells in the block $B(b_{ij})$ starting at $b_{ij}$ are placed contiguously, we can rewrite (5) as $\sum_{b_{ij} \in \mathcal{L}_i} G_{ij}(\max\{x + w(C_i) + \sum_{l=1}^{j-1} w(b_{il}),\ \bar{y}_{ij}\})$, where $G_{ij}$ denotes the cumulated cost function of the block represented by $b_{ij}$. Recall that by definition of the Clumping Algorithm, $\bar{y}_{ij}$ occupies a minimum position of $G_{ij}$ for $b_{ij} \in \mathcal{L}_i$. Given that for a continuous, convex function $g : [a, b] \to \mathbb{R}$ and $x_0 \in \operatorname{argmin}\{g(x),\ x \in [a, b]\}$, the function mapping $x \in [a, b]$ to $g(\max\{x, x_0\})$ is convex, it follows that (5) defines a convex function in $x$. By applying analogous arguments to the remaining summands, we can infer that each $F_i$ is convex as a sum of convex functions. This completes our reduction from the Double Row Problem to the Single Row Problem, and it remains to discuss the runtime it requires.
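The convexity step used for (5) — if $g$ is convex and $x_0$ minimizes it, then $x \mapsto g(\max\{x, x_0\})$ is again convex — is easy to sanity-check numerically. The following throwaway helper (ours, not part of the algorithm) tests midpoint convexity on a uniform grid:

```cpp
#include <functional>

// Check f(mid) <= (f(left) + f(right)) / 2 for every grid triple over
// [a, b]; a numerical stand-in for convexity, up to a small tolerance.
bool midpointConvexOnGrid(const std::function<double(double)>& f,
                          double a, double b, int samples) {
    const double step = (b - a) / samples;
    for (int i = 0; i + 2 <= samples; ++i) {
        double x1 = a + i * step;
        double x2 = x1 + step;
        double x3 = x1 + 2 * step;
        if (f(x2) > 0.5 * (f(x1) + f(x3)) + 1e-9) return false;
    }
    return true;
}
```

With $g(x) = (x-2)^2$ and its minimizer $x_0 = 2$, the flattened function $g(\max\{x, 2\})$ passes the check, whereas flattening at a non-minimizer such as $x_0 = 0$ produces a constant-then-decreasing shape that fails it, matching the hypothesis of the argument above.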
Note that the positions $(\bar{y}_{ij})_{0 \le i \le n,\, 1 \le j \le m_i}$ and $(\bar{z}_{ij})_{0 \le i \le n,\, 1 \le j \le k_i}$ can be computed in total time $O((|\mathcal{B}| + |\mathcal{T}| + K) \cdot \log(|\mathcal{B}| + |\mathcal{T}|))$, where again $K$ denotes the total number of kinks of all cost functions appearing in the given instance of the Double Row Problem. A time of $O((|\mathcal{C}| + |\mathcal{B}| + |\mathcal{T}| + K) \cdot \log(|\mathcal{C}| + |\mathcal{B}| + |\mathcal{T}| + K))$ then suffices to build up and solve the instance of the Single Row Problem on the set of double-row cells to which we reduce, and optimum coordinates for the single-row cells can be deduced from the computed positions of the cells in $\mathcal{C}$ in linear time. Putting everything together, we can formulate the following theorem:

Theorem 4.2. The Double Row Problem with piecewise quadratic cost functions with a total number of $K$ kinks can be solved in time $O((|\mathcal{C}| + |\mathcal{B}| + |\mathcal{T}| + K) \cdot \log(|\mathcal{C}| + |\mathcal{B}| + |\mathcal{T}| + K))$.

EXPERIMENTAL RESULTS

We implemented the proposed algorithm in the C++ programming language and embedded it into the legalization framework described in [1]. More precisely, we first run the legalization algorithm from [1], which legalizes all cells of more than single-row height via a greedy projection approach and then proceeds by assigning all cells of single-row height to so-called zones, unblocked segments of cell rows, through a min-cost-flow algorithm. Within each zone, the left-to-right ordering is inferred from the Global Placement positions. While the algorithm from [1] proceeds by optimizing squared cell movement only within each zone, making use of the Clumping Algorithm, we instead apply the Double Row Algorithm to the instances of the Double Row Problem arising from the given left-to-right ordering in every second pair of rows, treating all cells of more than double-row height as blockages.

All experiments were performed single-threaded on Intel Xeon 3.3 GHz CPUs with 384 GB RAM. We conduct two experiments on two different sets of benchmarks.
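For reference, the Clumping Algorithm mentioned above, on which both the per-zone optimization of [1] and our computation of the unconstrained positions rely, can be sketched as follows. This is a simplified, unit-weight variant with illustrative names of our own; the actual implementations additionally handle cell weights, row boundaries, and general piecewise quadratic costs:

```cpp
#include <vector>

struct Cell {
    double width;
    double target;  // desired left-edge coordinate
};

// Clumping/Abacus-style single-row placement: minimize
// sum_i (x_i - target_i)^2 subject to x_{i+1} >= x_i + width_i for a
// fixed left-to-right order. Cells whose unconstrained optima would
// overlap are merged into clusters; a cluster is placed at the mean of
// its offset-corrected targets.
std::vector<double> placeRow(const std::vector<Cell>& cells) {
    struct Cluster {
        double q;   // sum over cluster cells of (target - offset in cluster)
        double w;   // total width of the cluster
        int n;      // number of cells in the cluster
        int first;  // index of the leftmost cell
        double pos() const { return q / n; }  // optimal left edge
    };
    std::vector<Cluster> stack;
    for (int i = 0; i < static_cast<int>(cells.size()); ++i) {
        Cluster c{cells[i].target, cells[i].width, 1, i};
        // Merge with the predecessor while the two clusters would overlap.
        while (!stack.empty() &&
               stack.back().pos() + stack.back().w > c.pos()) {
            Cluster p = stack.back();
            stack.pop_back();
            p.q += c.q - c.n * p.w;  // c's cells sit p.w right of p's left edge
            p.w += c.w;
            p.n += c.n;
            c = p;
        }
        stack.push_back(c);
    }
    std::vector<double> x(cells.size());
    for (const Cluster& c : stack) {
        double left = c.pos();
        for (int i = 0; i < c.n; ++i) {
            x[c.first + i] = left;
            left += cells[c.first + i].width;
        }
    }
    return x;
}
```

A stack suffices because a newly appended cell can only force merges with clusters to its left, which is what makes the overall run linear after sorting.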
The first one aims at establishing the competitiveness of our legalization approach when compared to recent works on the matter of mixed-cell-height legalization. The second experiment demonstrates the effectiveness of the Double Row Algorithm in improving squared cell movement.

For the first experiment, we run our algorithm on benchmark instances from the ICCAD-2017 CAD Contest on Multi-Deck Standard-Cell Legalization [8]. In doing so, we omit fence region constraints as well as soft constraints, but stick to the required power-rail alignment. As most prior works optimize linear instead of squared cell movement, we employ our proposed legalization method to minimize linear movement during the Double Row Algorithm. Observe that this is possible since for each cell, once its row assignment is fixed, the distance to its Global Placement location constitutes a piecewise linear and hence in particular piecewise quadratic function. However, we point out that minimizing l1 movement is not the main purpose of our algorithm and that, in particular, the assignment to zones is designed to optimize squared instead of linear movement. Hence, the subsequent comparison should be regarded as proof that our algorithm, even though not explicitly devised to do so, can compete with state-of-the-art legalizers concerning linear cell movement. We compare the average l1 cell movement

Table 1: Comparison between the average cell movement in terms of horizontal placement sites.
Instance | GP HPWL (m) | ΔHPWL (DAC'17 / ISPD'19 / TCAD'13 / Ours) | Av. L1 Movement in Sites (DAC'17 / ISPD'19 / TCAD'13 / Ours / Ours÷ISPD'19) | Max. L1 Movement in Sites (DAC'17 / ISPD'19 / TCAD'13 / Ours) | CPU sec (DAC'17 / ISPD'19 / Ours)
des_perf_1 | 1.217 | 16.21% / 6.66% / 4.52% / 4.52% | 10.86 / 6.97 / 6.66 / 6.66 / 95.55% | 200.82 / 48.95 / 57.22 / 57.22 | 11.23 / 11.75 / 9.97
des_perf_a_md1 | 2.160 | 3.27% / 2.48% / 2.20% / 2.19% | 6.71 / 5.94 / 5.85 / 5.79 / 97.47% | 607.30 / 607.30 / 607.30 / 607.30 | 2.30 / 2.79 / 8.05
des_perf_a_md2 | 2.177 | 3.35% / 2.51% / 2.23% / 2.23% | 6.77 / 5.93 / 6.08 / 6.07 / 102.36% | 403.86 / 403.86 / 403.86 / 403.86 | 2.19 / 6.82 / 8.53
des_perf_b_md1 | 2.106 | 1.75% / 1.52% / 1.61% / 1.59% | 5.17 / 4.77 / 4.78 / 4.72 / 98.95% | 79.34 / 38.45 / 48.19 / 45.19 | 2.01 / 3.64 / 6.79
des_perf_b_md2 | 2.137 | 2.05% / 1.72% / 1.50% / 1.49% | 5.74 / 5.25 / 5.38 / 5.31 / 101.14% | 198.74 / 39.76 / 50.68 / 50.68 | 2.31 / 3.12 / 8.06
edit_dist_1_md1 | 4.004 | 1.47% / 1.39% / 1.27% / 1.26% | 6.22 / 5.79 / 5.75 / 5.69 / 98.27% | 109.34 / 95.45 / 67.55 / 67.55 | 3.49 / 5.19 / 9.67
edit_dist_a_md2 | 5.103 | 1.17% / 1.01% / 0.92% / 0.91% | 6.02 / 5.51 / 5.57 / 5.51 / 100.00% | 164.00 / 164.00 / 164.00 / 164.00 | 2.59 / 2.24 / 10.78
edit_dist_a_md3 | 5.328 | 2.69% / 1.48% / 1.02% / 1.02% | 9.11 / 7.08 / 6.96 / 6.93 / 97.88% | 233.00 / 233.00 / 233.00 / 233.00 | 5.91 / 15.68 / 15.87
fft_2_md2 | 0.444 | 11.21% / 8.78% / 7.14% / 7.02% | 8.84 / 7.54 / 7.89 / 7.76 / 102.92% | 102.94 / 73.60 / 59.55 / 60.55 | 0.70 / 2.89 / 2.81
fft_a_md2 | 1.092 | 0.98% / 0.95% / 1.13% / 1.13% | 5.03 / 4.86 / 4.74 / 4.70 / 96.71% | 345.50 / 345.50 / 343.48 / 346.50 | 0.69 / 0.60 / 2.15
fft_a_md3 | 0.949 | 1.08% / 1.08% / 1.22% / 1.22% | 4.73 / 4.55 / 4.43 / 4.42 / 97.14% | 109.62 / 109.62 / 102.59 / 102.59 | 0.63 / 0.40 / 1.91
pci_bridge32_a_md1 | 0.454 | 3.61% / 3.38% / 3.00% / 2.95% | 6.01 / 5.64 / 5.83 / 5.76 / 102.13% | 72.48 / 63.76 / 63.76 / 63.76 | 0.61 / 2.29 / 2.01
pci_bridge32_a_md2 | 0.565 | 8.33% / 4.38% / 3.68% / 3.62% | 9.43 / 7.14 / 7.55 / 7.45 / 104.34% | 186.08 / 121.35 / 121.35 / 121.35 | 0.53 / 3.34 / 3.76
pci_bridge32_b_md1 | 0.660 | 2.55% / 2.26% / 2.13% / 2.11% | 6.35 / 6.01 / 5.79 / 5.72 / 95.17% | 322.71 / 332.71 / 313.99 / 313.99 | 0.52 / 0.70 / 2.41
pci_bridge32_b_md2 | 0.574 | 2.80% / 2.53% / 2.57% / 2.57% | 5.92 / 5.53 / 5.43 / 5.42 / 98.01% | 640.12 / 430.04 / 430.04 / 430.04 | 0.50 / 0.66 / 1.89
pci_bridge32_b_md3 | 0.583 | 3.63% / 3.17% / 3.14% / 3.13% | 6.74 / 6.10 / 6.13 / 6.12 / 100.33% | 398.57 / 398.57 / 398.58 / 398.58 | 0.51 / 1.58 / 2.21
average | — | 4.13% / 2.83% / 2.46% / 2.44% | 6.85 / 5.91 / 5.93 / 5.88 / 99.27% | 260.90 / 219.12 / 216.57 / 216.64 | 2.30 / 3.98 / 5.06
Table 2: Comparison between the squared cell movement resulting from the legalization algorithm described in TCAD'13 and our algorithm.
Instance | GP HPWL (m) | Cells (Single / Double) | Squared Cell Movement

achieved by our algorithm to the results obtained by [5] and the state-of-the-art paper [18], as reported in [18], as well as to the legalization approach from [1]. Table 1 displays the relative increase (ΔHPWL) of the half-perimeter wire length after Global Placement (GP HPWL), the average l1 cell movement (measured in horizontal placement sites), the maximum l1 cell movement (again measured in placement sites) and the runtime in CPU seconds for the algorithms in [5] (DAC'17), [18] (ISPD'19) and [1] (TCAD'13) and the algorithm suggested in this paper (Ours). Concerning the average cell movement, which we are mainly interested in for this comparison, the column labeled "Ours/ISPD'19" contains the percentage the average cell movement obtained by "Ours" constitutes of the average cell movement reported by ISPD'19 [18]. The final row labeled "average" displays the average of all prior values in the respective column. In particular, the respective entry in the column "Ours/ISPD'19" refers to the average of the above percentages. One can see that on average, our proposed algorithm achieves results comparable to the algorithm in [18], which in turn produces considerably better results than [5] when it comes to average cell movement. However, the deviation between the different instances is relatively high: while there are some on which our algorithm significantly outperforms the method from [18] (including those where no cells of triple- and quadruple-row height are present), the converse is true for several other test cases. One possible explanation for this might be the fact that the greedy legalization of cells of more than double-row height only works well if they are sufficiently spaced out in the Global Placement solution, which is true for only some of the given benchmarks.
When it comes to running time, maximum movement, and increase in HPWL, our algorithm can be seen to yield comparable or even better results.

In our second experiment, we compare the total quadratic cell movement achieved by the algorithm described in [1], run so as to minimize squared cell movement, to that of our new method. As the number of double-row cells on the ICCAD-2017 CAD Contest benchmarks [8] is rather small, we employ a set of benchmarks generated by the authors of [7] by modifying instances from the ISPD 2015 Detailed Routing-Driven Placement Contest [4]. While these are more suitable for the primary application of our algorithm, we decided against using them for a comparison to other legalizers since they are not publicly available and the parsing process appears to be more error-prone due to a non-standard format. For completeness, we nevertheless state that our experiments revealed an average cell movement better than the one obtained by [7], [5] and [23], but worse than what is claimed in [14] (at the cost of a considerably higher runtime) and [17].

The results of our second experiment can be read from Table 2, which displays the squared cell movement achieved by the algorithm described in TCAD'13 [1] and the algorithm proposed in this paper. The first column contains the instance name, while the columns labeled "Single" and "Double" display the number of cells of single- and double-row height, respectively, present on the given test case; the fraction the double-row cells constitute of the total number of cells can be found in the following column.

Figure 2: superblue12.
Figure 3: matrix_mult_1 after our algorithm. Blue lines indicate movement w.r.t. the output of TCAD'13.
On test cases where single-row cells are packed densely around cells of double-row height, the improvements achieved by the application of the Double Row Algorithm are quite significant, which can be explained by the fact that even a single double-row cell being fixed in position may lead to the displacement of huge blocks of consecutive cells of single-row height in densely packed regions (see Figure 3). On the other hand, if many of the cells of double-row height do not interfere with those of single-row height at all, in that there is sufficient horizontal whitespace around them, comparably small improvements are obtained despite a considerable number of cells of double-row height being present (see Figure 2). However, as the legalization task becomes more difficult precisely in those cases where the Global Placement packs the cells relatively densely in some regions, the Double Row Algorithm can be considered a worthwhile extension of the considered legalization framework.
CONCLUSION

In this paper, we have presented a fast algorithm that minimizes quadratic (or linear) cell displacement for pairs of cell rows comprising cells of both single- and double-row height with predefined target locations and a fixed left-to-right ordering. Even though the surrounding legalization framework is designed to optimize squared instead of linear cell displacement, our results are competitive with state-of-the-art works on mixed-cell-height legalization. Moreover, experimental results comparing the squared cell displacement when fixing all cells of double-row height and when employing the Double Row Algorithm, respectively, clearly speak in favor of its effectiveness.
REFERENCES
[1] U. Brenner. 2013. BonnPlace Legalization: Minimizing Movement by Iterative Augmentation. TCAD 32, 8 (2013), 1215–1227.
[2] U. Brenner and J. Vygen. 2000. Faster Optimal Single-Row Placement with Fixed Ordering. In Proceedings Design, Automation and Test in Europe. 117–121.
[3] U. Brenner and J. Vygen. 2004. Legalizing a Placement with Minimum Total Movement. TCAD 23, 12 (2004), 1597–1613.
[4] I. Bustany, D. Chinnery, J. Shinnerl, and V. Yutsis. 2015. ISPD 2015 Benchmarks with Fence Regions and Routing Blockages for Detailed-Routing-Driven Placement. In Proceedings of the ISPD. 157–164.
[5] J. Chen, Z. Zhu, W. Zhu, and Y. Chang. 2017. Toward Optimal Legalization for Mixed-Cell-Height Circuit Designs. In Proceedings of the DAC. 6 pages.
[6] Y. Cheng, D. Huang, W. Mak, and T. Wang. 2018. A Practical Detailed Placement Algorithm under Multi-Cell Spacing Constraints. In Proceedings of the ICCAD. 8 pages.
[7] W. Chow, C. Pui, and E. Young. 2016. Legalization Algorithm for Multiple-Row Height Standard Cell Design. In Proceedings of the DAC. 1–6.
[8] N. Darav, I. Bustany, A. Kennings, and R. Mamidi. 2017. ICCAD-2017 CAD Contest in Multi-Deck Standard Cell Legalization and Benchmarks. In ICCAD. 867–871.
[9] M. Garey and D. Johnson. 1978. "Strong" NP-Completeness Results: Motivation, Examples, and Implications. J. ACM 25, 3 (1978), 499–508.
[10] M. Garey, R. Tarjan, and G. Wilfong. 1988. One-Processor Scheduling with Symmetric Earliness and Tardiness Penalties. Mathematics of Operations Research 13, 2 (1988), 330–348.
[11] C. Han, K. Han, A. Kahng, H. Lee, L. Wang, and B. Xu. 2017. Optimal Multi-Row Detailed Placement for Yield and Model-Hardware Correlation Improvements in Sub-10nm VLSI. In ICCAD. 667–674.
[12] C. Han, A. Kahng, L. Wang, and B. Xu. 2019. Enhanced Optimal Multi-Row Detailed Placement for Neighbor Diffusion Effect Mitigation in Sub-10 nm VLSI. TCAD 38, 9 (2019), 1703–1716.
[13] D. Hill. 2002. Method and system for high speed detailed placement of cells within an integrated circuit design. U.S. Patent 6370673.
[14] C. Hung, P. Chou, and W. Mak. 2017. Mixed-Cell-Height Standard Cell Placement Legalization. In Proceedings of the Great Lakes Symposium on VLSI. 149–154.
[15] A. Kahng, P. Tucker, and A. Zelikovsky. 1999. Optimization of Linear Placements for Wirelength Minimization with Free Sites. In Proceedings of the Asia and South Pacific Design Automation Conference. 241–244.
[16] B. Korte, D. Rautenbach, and J. Vygen. 2007. BonnTools: Mathematical Innovation for Layout and Timing Closure of Systems on a Chip. Proc. IEEE 95 (2007), 555–572.
[17] H. Li, W. Chow, G. Chen, E. Young, and B. Yu. 2018. Routability-Driven and Fence-Aware Legalization for Mixed-Cell-Height Circuits. In Proceedings of the DAC. 1–6.
[18] X. Li, J. Chen, W. Zhu, and Y. Chang. 2019. Analytical Mixed-Cell-Height Legalization Considering Average and Maximum Movement Minimization. In Proceedings of the ISPD. 27–34.
[19] Y. Lin, B. Yu, X. Xu, J. Gao, N. Viswanathan, W. Liu, Z. Li, C. Alpert, and D. Pan. 2016. MrDP: Multiple-row Detailed Placement of Heterogeneous-sized Cells for Advanced Nodes. In ICCAD. 1–8.
[20] P. Spindler, U. Schlichtmann, and F. Johannes. 2008. Abacus: Fast Legalization of Standard Cell Circuits with Minimal Movement. In Proceedings of the ISPD. 47–53.
[21] U. Suhl. 2010. Row-Placement in VLSI Design: The Clumping Algorithm and a Generalization. Diploma thesis. University of Bonn, Research Institute for Discrete Mathematics.
[22] R. Tarjan. 1983. Data Structures and Network Algorithms. SIAM.
[23] C. Wang, Y. Wu, J. Chen, Y. Chang, S. Kuo, W. Zhu, and G. Fan. 2017. An Effective Legalization Algorithm for Mixed-Cell-Height Standard Cells. In Proceedings of the ASP-DAC. 450–455.
[24] G. Wu and C. Chu. 2015. Detailed Placement Algorithm for VLSI Design with Double-Row Height Standard Cells. TCAD 35 (2015), 1569–1573.
[25] Z. Zhu, X. Li, Y. Chen, J. Chen, W. Zhu, and Y. Chang. 2018. Mixed-Cell-Height Legalization Considering Technology and Region Constraints. In Proceedings of the ICCAD.