Rupesh S. Shelar
Intel
Publication
Featured research published by Rupesh S. Shelar.
international symposium on physical design | 2010
Rupesh S. Shelar; Marek J. Patyra
In nanometer technologies, local interconnects are believed to have a major impact on timing and power in VLSI circuits. To assess this impact quantitatively in a real high-performance microprocessor design, this article presents results from an extensive study carried out on RTL-to-layout synthesized blocks in a 45-nm technology core. The study shows that the interconnects in these blocks account for 30% of the cycle time, on average, on the worst internal timing paths and contribute nearly one-third of the power dissipation. This points to the severity of the impact of interconnects in today's high-performance designs.
international symposium on physical design | 2009
Rupesh S. Shelar
In modern microprocessors, clocks are usually distributed using a hybrid network, a grid followed by buffered trees, to restrict the skew. Typically, (gated) buffered trees are used inside the blocks, while the global grid overlays the entire die area. The block-level buffered trees are connected to the grid at specific locations by routing wires along predetermined tracks. The routing of these clock wires, which consume noticeable power, has distance and capacitance constraints to avoid poor slopes at the inputs of the block-level buffers. Moreover, these wires contribute a significant load on the clock grid. This leads to a problem of capacitance or wirelength minimization during multi-terminal routing such that the wires use pre-specified tracks and the routes obey distance and capacitance constraints, i.e., the length of the route from any receiver to a connection on the grid wire is less than the specified distance, and the overall capacitance due to all receivers on the route is less than the given limit. Since the problem is intractable, we present an efficient algorithm that completes the routing connecting thousands of terminals over a few mm^2 of area in seconds, improving the wirelength by 17% over the commonly used nearest source heuristic. The algorithm is employed to perform post-grid clock distribution in a 45 nm technology microprocessor.
international conference on computer aided design | 2001
Rupesh S. Shelar; Sachin S. Sapatnekar
international symposium on physical design | 2007
Rupesh S. Shelar
asia and south pacific design automation conference | 2002
Rupesh S. Shelar; Sachin S. Sapatnekar
In this paper, we address the problem of performance-oriented synthesis of pass transistor logic (PTL) circuits using a binary decision diagram (BDD) decomposition technique. We transform the BDD decomposition problem into a recursive bipartitioning problem and solve the latter using a max-flow min-cut technique. We use the area and delay cost of the PTL implementation of the logic function to guide the bipartitioning scheme. Using recursive bipartitioning and a one-hot multiplexer circuit, we show that our PTL implementation has delay logarithmic in the number of inputs, under certain assumptions. The experimental results on benchmark circuits are promising: they show significant delay reductions with small or no area overhead compared to previous approaches.
international symposium on physical design | 2005
Rupesh S. Shelar; Prashant Saxena; Xinning Wang; Sachin S. Sapatnekar
Clocks are known to be a major source of power consumption in digital circuits, especially in high-performance microprocessors. With technology scaling, the increasingly capacitive interconnects contribute more than 40% of the local clock power. In this paper, we propose a clustering algorithm for the minimization of the power in the local clock tree, which is shown to be equivalent to the minimization of the interconnect capacitance in the tree. Given a set of sequentials and their locations, clustering is performed to determine the clock buffers that are required to synchronize the sequentials, where a cluster implies that a clock buffer drives all the sequentials in the cluster. The clustering algorithm uses a minimum spanning tree (MST) metric to estimate the interconnect capacitance and ensures the optimality of the solution when no capacity constraints are applied. The buffers are then sized and the clock nets are routed to satisfy the delay, slope, and skew constraints. We compare the clock trees obtained by our clustering and by competing approaches on several blocks from a microprocessor design in 65 nm technology. The comparison shows that our algorithm consistently improves the clock tree capacitance, by up to 21%.
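As a rough illustration of the MST metric mentioned in this abstract, the sketch below estimates the interconnect capacitance of one cluster as a per-unit-length capacitance times the Manhattan length of a minimum spanning tree over the buffer and its sequentials. It is not the paper's algorithm: the function names, the inclusion of the buffer location in the tree, and the `cap_per_unit` value are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def manhattan(a: Point, b: Point) -> float:
    """Rectilinear distance, the usual wirelength measure on a routing grid."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def mst_length(points: Sequence[Point]) -> float:
    """Total Manhattan length of a minimum spanning tree (Prim, O(n^2))."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [float("inf")] * n  # cheapest known edge from the tree to each point
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Pick the point outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = manhattan(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def cluster_capacitance(buffer_loc: Point,
                        sequentials: List[Point],
                        cap_per_unit: float = 0.2) -> float:
    """Hypothetical estimate: capacitance per unit length times MST length.

    cap_per_unit is a made-up technology constant, not a value from the paper.
    """
    return cap_per_unit * mst_length([buffer_loc] + list(sequentials))
```

For example, a buffer at (0, 0) driving sequentials at (0, 3) and (4, 0) has an MST length of 7, so with `cap_per_unit=0.5` the estimate is 3.5. A clustering pass could compare such estimates across candidate groupings, though how the paper's algorithm does so is not described in the abstract.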
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2005
Rupesh S. Shelar; Sachin S. Sapatnekar; Prashant Saxena; Xinning Wang
In this paper, we address the problem of power dissipation minimization in combinational circuits implemented using pass transistor logic (PTL). We transform the problem of power reduction in PTL circuits into that of BDD decomposition and solve the latter using the max-flow min-cut technique. We use transistor-level power estimates to guide the BDD decomposition algorithm. We present the results obtained by running our algorithm on a set of MCNC benchmark circuits and show an average power reduction of 47% over these circuits; a comparison with previously proposed low-power pass transistor logic synthesis algorithms shows an average improvement of over 23% over the best previously published approach.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2013
Rupesh S. Shelar; Marek J. Patyra
Routing congestion has become a serious concern in today's very-large-scale-integration designs. To address this, this paper proposes a technology mapping algorithm that minimizes routing congestion under delay constraints. The algorithm employs a dynamic-programming framework in the matching phase to generate probabilistic congestion maps for all the matches. These congestion maps are then used to minimize routing congestion during covering, which preserves the delay optimality of the solution using the notion of slack. Experimental results on benchmark circuits in a 100-nm technology show that the algorithm can improve track overflows significantly compared to conventional technology mapping while satisfying delay constraints.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2010
Rupesh S. Shelar
Due to increasing design complexity, routing congestion has become a critical problem in very-large-scale-integration designs. This paper introduces a distributed metric to predict routing congestion and applies it to technology mapping that targets area and delay optimization. Our technology mapping algorithms are guided by a probabilistic congestion map for the subject graph to identify the congested regions, where congestion-optimal matches are favored. Experimental results on a set of benchmark circuits in a 90-nm technology show that congestion-aware mapping reduces track overflows by 37%, on average, with a marginal gate-area penalty compared to conventional area-oriented technology mapping. For delay-oriented mapping, our algorithm improves track overflows by 20%, on average, while preserving or improving the delay compared to the conventional method.
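To make the idea of a probabilistic congestion map and track overflow concrete, here is a minimal sketch under a simplifying assumption that is not taken from the paper: each two-pin net spreads its expected routing demand uniformly over the bins of its bounding box. The bin grid, the uniform-spreading model, and the names `congestion_map` and `track_overflow` are all hypothetical.

```python
from typing import List, Tuple

Bin = Tuple[int, int]          # (column, row) index of a routing bin
Net = Tuple[Bin, Bin]          # a two-pin net given by its two pin bins


def congestion_map(nets: List[Net], cols: int, rows: int) -> List[List[float]]:
    """Expected routing demand per bin under a uniform bounding-box model.

    Assumption: a shortest route through a w-by-h bounding box crosses
    w + h - 1 bins, and that usage is spread evenly over all w * h bins.
    """
    demand = [[0.0] * cols for _ in range(rows)]
    for (x1, y1), (x2, y2) in nets:
        xlo, xhi = sorted((x1, x2))
        ylo, yhi = sorted((y1, y2))
        w = xhi - xlo + 1
        h = yhi - ylo + 1
        share = (w + h - 1) / (w * h)
        for y in range(ylo, yhi + 1):
            for x in range(xlo, xhi + 1):
                demand[y][x] += share
    return demand


def track_overflow(demand: List[List[float]], capacity: float) -> float:
    """Sum of demand exceeding the per-bin track capacity."""
    return sum(max(0.0, d - capacity) for row in demand for d in row)
```

With nets `[((0, 0), (1, 1)), ((0, 0), (0, 0))]` on a 2-by-2 grid, bin (0, 0) accumulates a demand of 1.75 and the other bins 0.75 each, so a per-bin capacity of 1.0 gives an overflow of 0.75. A mapper could use such a map to prefer matches whose wires fall in lightly loaded bins; the metric actually used in the paper may differ.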
international conference on computer aided design | 2008
Yifang Liu; Rupesh S. Shelar; Jiang Hu