Guilherme Flach
Universidade Federal do Rio Grande do Sul
Publication
Featured research published by Guilherme Flach.
Symposium on Integrated Circuits and Systems Design | 2006
Renato Fernandes Hentschke; Guilherme Flach; Felipe Pinto; Ricardo Reis
This paper presents a quadratic placement algorithm for 3D circuits. We formulate the 3D problem to control the area balance and the number of 3D-Vias between tiers. We introduce the z-Cell Shifting operation to control the area balance. We also define a new refinement operation, called 3D Iterative Refinement, which includes a control statement that avoids an excessive number of 3D-Vias and thus keeps the placement solution feasible. After quadratic placement, legalization is performed using min-cost max-flow and Simulated Annealing. For detailed placement refinement, we apply Simulated Annealing without cell migration between tiers. Experimental results show that, when targeting a single tier, our placement flow is comparable to academic tools such as FastPlace, Capo and Dragon in wire length and running time. On multiple tiers, we reduce the average wire length by 7% (2 tiers) to 32% (5 tiers) and the worst wire length by 26% (2 tiers) to 52% (5 tiers). The number of 3D-Vias obtained is feasible, since the area overhead introduced is always below 10%.
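As a rough illustration of the quadratic objective that such placers build on, the sketch below minimizes the sum of weighted squared distances between connected cells along one axis. It is not the algorithm from the paper: the function name `quadratic_place_1d`, the pad names, and the Gauss-Seidel solver are assumptions chosen for brevity, and z-Cell Shifting, 3D-Via control and legalization are all omitted. A 3D formulation would solve the same kind of system independently for x, y and z.

```python
def quadratic_place_1d(num_cells, fixed, edges, iterations=2000):
    """Minimize sum of w * (x_a - x_b)^2 over two-pin connections.

    num_cells: movable cells are identified by the integers 0..num_cells-1.
    fixed:     dict mapping a pad name (string) to its fixed coordinate.
    edges:     list of (node_a, node_b, weight) connections.
    Hypothetical helper for illustration only.
    """
    x = [0.0] * num_cells
    neighbors = {i: [] for i in range(num_cells)}
    for a, b, w in edges:
        if a in neighbors:
            neighbors[a].append((b, w))
        if b in neighbors:
            neighbors[b].append((a, w))

    def position(node):
        return fixed[node] if node in fixed else x[node]

    # Gauss-Seidel: at the optimum, each movable cell sits at the
    # weighted average of the nodes it is connected to.
    for _ in range(iterations):
        for i in range(num_cells):
            total_weight = sum(w for _, w in neighbors[i])
            if total_weight > 0:
                x[i] = sum(w * position(n) for n, w in neighbors[i]) / total_weight
    return x


# Three cells chained between two pads at coordinates 0 and 8.
pads = {"left": 0.0, "right": 8.0}
chain = [("left", 0, 1.0), (0, 1, 1.0), (1, 2, 1.0), (2, "right", 1.0)]
positions = quadratic_place_1d(3, pads, chain)
```

With equal weights the cells spread evenly between the pads. Real netlists cluster cells heavily, which is why quadratic placers need a spreading step such as cell shifting afterwards.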
IEEE Computer Society Annual Symposium on VLSI | 2007
Renato Fernandes Hentschke; Guilherme Flach; Felipe Pinto; Ricardo Reis
This paper presents a cell placement algorithm for 3D circuits. Compared to existing approaches, our placer has a number of new features that deliver more realism and improved wire length. First, the algorithm balances the tier utilization considering the effect of 3D-vias under two possible integration strategies: face-to-face and face-to-back. The 3D-via count is limited to an upper bound that is sensitive to the area of the 3D-via. Within the upper bound, the placer is free to add more 3D-vias, which delivers an improved wire length, as demonstrated experimentally in the paper. Our algorithm is based on a true 3D quadratic placement engine with a 3D cell shifting method to spread the cells out, and on an iterative refinement step that improves wire length. Experimental results show that our algorithm improves the wire length by 15% to 27% on average compared to a 2D solution provided by the FastPlace algorithm.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2014
Guilherme Flach; Tiago Reimann; Gracieli Posser; Marcelo de Oliveira Johann; Ricardo Reis
This paper presents a fast and effective approach to gate-version selection and threshold voltage (Vth) assignment. In the proposed flow, a solution without slew and load violations is generated first. Then, a Lagrangian Relaxation (LR) method is used to reduce leakage power and achieve timing closure while keeping the circuit with few or no violations. If the set of gate-versions given by LR produces a circuit with negative slack, a timing recovery method is applied to find near-zero positive slack. The solution without negative slack is finally passed to a power reduction step. For the ISPD 2012 Contest benchmarks, the leakage power of our solutions is, on average, 9.53% and 12.45% smaller than that of two prior works. The sizing produced using our approach achieved first place in the ISPD 2013 Discrete Gate Sizing Contest with, on average, 8.78% better power results than the second-place tool. With a new timing calculation applied, this flow can provide, on average, an extra 9.62% power reduction compared to the best Contest results. This flow is also the first gate sizing method to report violation-free solutions for all benchmarks of the ISPD 2013 Contest.
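The general idea of Lagrangian Relaxation for discrete sizing can be sketched on a toy problem: a single path of gates, each with a few (delay, leakage) versions, and one delay budget. This is only an illustration under strong assumptions and not the flow from the paper: the function `lr_select_versions`, the fixed per-version delays, the single multiplier and the plain subgradient update are all hypothetical simplifications. A real sizer uses one multiplier per timing arc and delays that depend on load and slew.

```python
def lr_select_versions(options, delay_budget, steps=200, step_size=0.05):
    """Pick one version per gate to minimize leakage under a delay budget.

    options: one list per gate of (delay, leakage) tuples.
    Returns (chosen_indices, total_leakage) for the best feasible
    solution seen, or None if no feasible solution was found.
    Hypothetical helper for illustration only.
    """
    lam = 0.0  # Lagrange multiplier on the delay constraint
    best = None
    for _ in range(steps):
        # With the constraint relaxed, the problem decomposes per gate:
        # each gate minimizes leakage + lam * delay on its own.
        choice = [
            min(range(len(opts)), key=lambda k: opts[k][1] + lam * opts[k][0])
            for opts in options
        ]
        total_delay = sum(options[g][k][0] for g, k in enumerate(choice))
        total_leakage = sum(options[g][k][1] for g, k in enumerate(choice))
        if total_delay <= delay_budget:
            if best is None or total_leakage < best[1]:
                best = (choice, total_leakage)
        # Subgradient step: raise lam while the budget is violated,
        # lower it (never below zero) when there is slack.
        lam = max(0.0, lam + step_size * (total_delay - delay_budget))
    return best


# Three identical gates: fast/leaky, medium, slow/low-leakage versions.
versions = [(1.0, 10.0), (2.0, 4.0), (3.0, 1.0)]
solution = lr_select_versions([versions, versions, versions], delay_budget=6.0)
```

The multiplier acts as a price on delay: it grows until the cheapest per-gate choices fit the budget. LR alone may end on a slightly infeasible or suboptimal point, which is consistent with the abstract's mention of separate timing recovery and power reduction steps.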
ACM Transactions on Design Automation of Electronic Systems | 2012
Matthew R. Guthaus; Xuchu Hu; Gustavo Wilke; Guilherme Flach; Ricardo Reis
Clock meshes are extremely effective at producing low-skew regional clock networks that are tolerant of environmental and process variations. For this reason, clock meshes are used in most high-performance designs, but this robustness consumes significant power. In this work, we present two techniques to optimize high-performance clock meshes. The first technique is a mesh perturbation methodology for nonuniform mesh routing. The second technique is a skew-aware buffer placement through iterative buffer deletion. We demonstrate how these optimizations can achieve significant power reductions and a near elimination of short-circuit power. In addition, the total wire length is decreased, the number of required buffers is decreased, and both skew and robustness are improved on average when variation is considered.
Symposium on Integrated Circuits and Systems Design | 2015
Julia Casarin Puget; Guilherme Flach; Marcelo de Oliveira Johann; Ricardo Reis
In this paper, we present a new method for circuit legalization called Jezz. The legalization step aligns cells to sites within the circuit rows and removes any overlap among them while trying to minimize the total displacement of cells. In terms of overall cell displacement, Jezz is, on average, 42.7% better than the classic legalization algorithm Tetris and 2.4% better than Abacus, a legalization algorithm that uses a quadratic function to compute the minimum cost of moving a cell, whereas Jezz uses a linear function. Jezz can perform both full and incremental legalization, indicating the impact caused by inserting a cell in a row. It intrinsically handles cell-to-site alignment and supports blockages. A cache system allows fast lookup during incremental legalization, enabling Jezz to support detailed placement algorithms. Although Jezz can be 20× slower than Tetris, the full legalization of a circuit with 200k cells takes less than a second, which makes Jezz suitable even for large-scale designs, as full legalization is run just a few times during the design flow.
International Symposium on Circuits and Systems | 2013
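For context on the Tetris baseline mentioned above, the following is a minimal sketch of a Tetris-style greedy legalizer. It is not Jezz and not a faithful reproduction of any published Tetris implementation. The function name `tetris_legalize`, the tuple layout of `cells`, and the assumption that rows start at x = 0 and are unbounded to the right are all hypothetical. Cells are processed left to right, and each one is placed in the row where its site-aligned position has the smallest Manhattan displacement.

```python
def tetris_legalize(cells, num_rows, row_height=1.0, site_width=1.0):
    """Greedy left-to-right legalization onto site-aligned rows.

    cells: list of (name, x, y, width_in_sites) with desired positions.
    Returns a dict mapping each name to its legal (x, y).
    Hypothetical helper for illustration only.
    """
    frontier = [0] * num_rows  # next free site index in each row
    placed = {}
    for name, x, y, width in sorted(cells, key=lambda c: c[1]):
        best = None
        for row in range(num_rows):
            # Snap to the nearest site, but never left of the row frontier.
            site = max(frontier[row], round(x / site_width))
            cost = abs(site * site_width - x) + abs(row * row_height - y)
            if best is None or cost < best[0]:
                best = (cost, row, site)
        _, row, site = best
        frontier[row] = site + width  # cells to the right can no longer overlap
        placed[name] = (site * site_width, row * row_height)
    return placed


legal = tetris_legalize(
    [("a", 0.2, 0.0, 2), ("b", 1.1, 0.0, 2), ("c", 1.4, 0.9, 1)],
    num_rows=2,
)
# legal == {"a": (0.0, 0.0), "b": (2.0, 0.0), "c": (1.0, 1.0)}
```

Because each cell is fixed as soon as it is placed, this greedy scheme never moves earlier cells to make room. That is one reason legalizers that account for the cost of shifting neighbouring cells can achieve lower total displacement.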
Tiago Reimann; Gracieli Posser; Guilherme Flach; Marcelo de Oliveira Johann; Ricardo Reis
This paper presents a flow composed of a set of heuristic algorithms to address the discrete gate sizing and Vt assignment problem for leakage power minimization while satisfying delay, load and slew constraints. The proposed flow combines the Fanout-of-4 empirical rule, the Logical Effort concept, Simulated Annealing (SA) as the main engine, and a new set of specific optimization strategies to solve this difficult problem as formulated in the 2012 ISPD Gate Sizing Contest. The main contribution of this work is to show how a sequence of Simulated Annealing runs, starting from a solution given by Logical Effort and the Fanout-of-4 rule and employing a set of new techniques, can be used to solve gate sizing problems of up to a million gates. New methods are presented to solve violations during the annealing, along with a dynamic cost function that helps SA to achieve different conflicting tasks during the optimization. The entire flow was able to achieve the second and first ranks in the ISPD 2012 Contest. A set of different experiments is presented to support design decisions and highlight the quality of the achieved results.
International Symposium on Physical Design | 2016
Guilherme Flach; Mateus Fogaça; Jucemar Monteiro; Marcelo de Oliveira Johann; Ricardo Reis
As interconnections dominate the circuit delay in nanometer technologies, placement plays a major role in achieving timing closure, since it is the main step that defines the interconnection lengths. In the initial stages of the physical design flow, the placement goal is to reduce the total wirelength; however, total wirelength minimization only roughly addresses timing. A timing-driven placement incorporates timing information to remove or alleviate timing violations. In this work, we present an incremental timing-driven placement flow to further reduce timing violations via single-cell movements. For late violations, we developed techniques to reduce the load capacitance on critical nets and to obtain load capacitance balancing using drive strength. For early violations, we present techniques that rely on clock skew optimization, register swap and interconnection increase. Our flow is experimentally evaluated using the ICCAD 2015 Incremental Timing-Driven Contest infrastructure. Experimental results show that our flow can significantly reduce timing violations. On average, for long maximum displacement, the quality of results is improved by 67.8% compared to the 1st place in the contest, with late WNS and TNS improved by 2.31% and 10.84%, respectively, early WNS and TNS improved by 68.92% and 76.42%, respectively, and the congestion metric ABU improved by 74.9%. The impact on Steiner tree wirelength is less than 2.5%.
IEEE Computer Society Annual Symposium on VLSI | 2010
Guilherme Flach; Gustavo Wilke; Marcelo de Oliveira Johann; Ricardo Reis
Clock meshes are an important resource for high-performance circuit designers due to their robustness to variability. Until recently, there were no tools able to support the use of clock meshes in automated synthesis flows. In recent years, commercial tools were adapted to support clock meshes [1] and academia has addressed the problems of clock mesh design automation and optimization. However, current optimization techniques are still very preliminary. Many other aspects of clock mesh design can be explored besides the edge removal and buffer placement explored by [2] and [3]. This paper proposes an algorithm to move the mesh buffers along the mesh wires to a position that minimizes the clock skew at the clock sinks. Experimental data show significant skew reduction using the algorithm presented in this paper. The clock mesh area and capacitance are unaffected by this strategy; therefore, no overhead is introduced.
IEEE Computer Society Annual Symposium on VLSI | 2013
Guilherme Flach; Tiago Reimann; Gracieli Posser; Marcelo de Oliveira Johann; Ricardo Reis
This paper presents a fast and effective approach to cell-type selection and Vth assignment. In our approach, a solution without slew and load violations is generated first. Then, Lagrangian Relaxation considering lambda-delay sensitivities is used to reduce leakage power while trying to keep the circuit free of timing and load violations. If the set of cell-types given by Lagrangian Relaxation produces a circuit with negative slack, a timing recovery method is applied to find near-zero positive slack. The solution without negative slack is then passed to a power reduction step. The sizing produced using our approach achieves up to 28% power reduction compared to state-of-the-art works. The leakage power of our solutions is, on average, 9.53% smaller than [1] and 12.45% smaller than [2]. Furthermore, the method is 19× faster than [1] and 1.18× faster than [2].
Instrumentation and Measurement Technology Conference | 2014
Tania Mara Ferla; Guilherme Flach; Ricardo Reis
This paper presents the research and development of an optical simulation tool based on wavelets. The tool helps to decide on the use of Resolution Enhancement Techniques (RET) such as Optical Proximity Correction (OPC) and double patterning. Optical lithography simulation is an essential step in a Design for Manufacturability (DFM) flow. Simulation is used in mask printability enhancement methods. Mask printability is improved by creating a modified mask for which the printed features closely resemble the features on the original mask. However, lithography simulation is a compute-intensive task, and fast simulation is required to make mask correction algorithms feasible.