Anand Rajaram
University of Texas at Austin
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Anand Rajaram.
asia and south pacific design automation conference | 2008
Anand Rajaram; David Z. Pan
A leaf-level clock mesh is known to be very tolerant to variations (Restle et al., 2001). However, its use is limited to a few high-end designs because of the high power/resource requirements and lack of automatic mesh synthesis tools. Most existing works on clock mesh (Restle et al., 2001) either deal with semi-custom design or perform optimizations on a given clock mesh. However, the problem of obtaining a good initial clock mesh has not been addressed. Similarly, the problem of achieving a smooth tradeoff between skew and power/resources has not been addressed adequately. In this work, we present MeshWorks, the first comprehensive automated framework for planning, synthesis and optimization of clock mesh networks with the objective of addressing the above issues. Experimental results suggest that our algorithms can achieve an additional reduction of 26% in buffer area, 19% in wirelength and 18% in power, compared to the recent work of Venkataraman et al., (2006) with similar worst case maximum frequency under variation.
international symposium on physical design | 2006
Anand Rajaram; David Z. Pan
Clock network synthesis is a key step in the ultra deep sub-micron (UDSM) VLSI Designs. Most existing clock network synthesis algorithms are designed for nominal operating condition, which are insufficient to address the growing problem of process, voltage and temperature (PVT) fluctuations. Link based clock networks have been suggested as a possible way of reducing skew variability [1-3]. However, [1,2] deal with only unbuffered clock networks, making them impractical. In [3], the problem of constructing a link based buffered clock network has been addressed . But [3] requires special kind of tunable buffers, which might consume more area/power and might not be available for all designs. Also, [3] uses SPICE for tuning the locations of internal nodes and buffer delays, thereby making it slow even for clock networks with a few hundred sinks. In this paper, we propose a unified algorithm for synthesizing a variation tolerant, balanced buffered clock network with cross links. Our approach can make use of ordinary buffers and does not require SPICE for clock network synthesis. SPICE based Monte Carlo simulations show that our methodology results in a buffered clock network with 50% reduction in skew variability with minimal increase in wire-length, buffer area and CPU time.
international symposium on physical design | 2005
Anand Rajaram; David Z. Pan; Jiang Hu
In the nanometer VLSI technology, the variation effects like manufacturing variation, power supply noise, temperature etc. become very significant. As one of the most vital nets in any synchronous VLSI chip, the Clock Distribution Network (CDN) is especially sensitive to these variations. Recently proposed link-based non-tree [1] addresses this problem by constructing a non-tree that is significantly more tolerant to variations when compared to a clock tree. Although the two algorithms proposed in [1] are effective in reducing the skew variability, they have a few drawbacks including high complexity, lengthy links and uneven link distribution across the clock network. In this paper, we propose two new algorithms that can overcome these disadvantages. The effectiveness of the proposed algorithms has been validated using HSPICE based Monte Carlo simulations. Experimental results show that the new algorithms are able to achieve the same or better skew reduction with an average of 5% wire length increase when compared to the 15% wire length increase of the existing algorithms in [1]. Moreover, the new algorithms scale extremely well to big clock networks, i.e., the bigger the clock network, the less overall link cost (less than 2% for the biggest benchmark we have).
design, automation, and test in europe | 2009
Ashutosh Chakraborty; Gokul Ganesan; Anand Rajaram; David Z. Pan
NBTI (Negative Bias Temperature Instability) has emerged as the dominant PMOS device failure mechanism for sub-100 nm VLSI designs. There is little research to quantify its impact on skew of clock trees. This paper demonstrates a mathematical framework to compute the impact of NBTI on gating-enabled clock tree considering their workload dependent temperature variation. Circuit design techniques are proposed to deal with NBTI induced clock skew by achieving balance in NBTI degradation of clock devices. Our technique achieves up-to 70% reduction in clock skew degradation with miniscule (<0.1%) power and area penalty.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2010
Anand Rajaram; David Z. Pan
Clock mesh networks are well known for their variation tolerance. But their usage is limited to high-end designs due to the significantly high resource requirements compared to clock trees and the lack of automatic mesh synthesis tools. Most existing works on clock mesh networks either deal with semi-custom design or perform optimizations on a given clock mesh. However, the problem of obtaining a good initial clock mesh has not been addressed. Also, the problem of achieving a smooth tradeoff between variation tolerance and resource requirements has not been addressed adequately. In this paper, we present our MeshWorks framework, the first comprehensive automated framework for planning, synthesis, and optimization of clock mesh networks that addresses the above issues. Experimental results suggest that our algorithms can achieve an additional reduction of 31% in buffer area, 21% in wirelength, and 23% in power, compared to the best previous work, with similar worst case maximum frequency. We also demonstrate the effectiveness of our framework under several practical issues such as blockages, multiple clocks, uneven load distribution, and electromigration violations.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2011
Anand Rajaram; David Z. Pan
Chip-level clock tree synthesis (CCTS) is a key problem that arises in complex system-on-a-chip designs. A key requirement of CCTS is to balance the clock-trees belonging to different IPs such that the entire tree has a small skew across all process corners. Achieving this is difficult because the clock trees in different IPs might be vastly different in terms of their clock structures and cell/interconnect delays. The chip-level clock tree is expected to compensate for these differences and achieve good skews across all corners. Also, CCTS is expected to reduce clock divergence between IPs that have critical timing paths between them. Reducing clock divergence reduces the maximum possible clock skew in the critical paths between the IPs and thus improves yield. This paper proposes effective CCTS algorithms to simultaneously reduce multicorner skew and clock divergence. Experimental results on several test-cases indicate that our methods achieve 30% reduction in the clock divergence with significantly improved multicorner skew variance, at the cost of 2% increase in buffer area and 1% increase in wirelength.
design automation conference | 2008
Anand Rajaram; David Z. Pan
A key problem that arises in System-on-a-Chip (SOC) designs of today is the Chip-level Clock Tree Synthesis (CCTS). CCTS is done by merging all the clock trees belonging to different IPs per chip specifications. A primary requirement of CCTS is to balance the sub-clock-trees belonging to different IPs such that the entire tree has a small skew across all process corners. This helps in timing closure across all the design corners. Another important requirement of CCTS is to reduce clock divergence between IPs that have critical timing paths between them, thereby reducing maximum possible clock skew in the critical paths and thus improves yield. In this work, we propose effective CCTS algorithms to simultaneously reduce multi-corner skew and clock divergence. To the best of our knowledge, this is the first work that attempts to solve this practically important problem. Experimental results on several testcases indicate that our methods achieve 10%-31%(20% on average) clock divergence reduction and between 16-64ps skew reduction (1.6%-6.4% of cycle time for a 1 GHz clock) with less than 0.5% increase in buffer area/wirelength compared to existing CTS algorithms.
international symposium on quality electronic design | 2006
Anand Rajaram; David Z. Pan
With the advent of sub-100nm VLSI technologies, variation effects greatly increase the unwanted skew in clock distribution networks (CDNs), thereby reducing the performance of the chip. Recent works on link based non-tree CDN propose cross-link insertion in a given clock tree to reduce skew variation. However, the current methods suffer from the drawback that they are empirical in nature, requiring the user to experiment with different parameter values. Also, the methods of ignore the interaction between the different links while selecting the links for insertion. The method of (W.D. Lam et al., 2005) attempts to overcome this drawback using a statistical link insertion methodology. However, (W.D. Lam et al., 2005) is very slow even for relatively small circuits. In this paper, we propose a fast link insertion methodology which does not require selecting empirical parameters for link insertion. Our method also incrementally considers the effect of previously inserted links before choosing the next link. SPICE based Monte Carlo simulations show that our approach obtains comparable skew reduction to that of the existing approaches while drastically reducing the time taken to obtain a good link-based non-tree CDN
international symposium on quality electronic design | 2007
Joon-Sung Yang; Anand Rajaram; Ninghy Shi; Jian Chen; David Z. Pan
Clock distribution is one of the key limiting factors in any high speed, sub-100nm VLSI design. Unwanted clock skews, caused by variation effects like manufacturing variations, power-ground noise etc., consume increasing proportion of the clock cycle. Thus, reducing the clock skew variations is one of the most important objectives of any high-speed clock distribution methodology. Inserting cross-links in a given clock tree is one way to reduce unwanted clock skew variations. However, most of the existing methods use empirical methods and do not use delay/skew variation information to select the links to be inserted. This can result in ineffective links being inserted. The work of (Lam, et al., 2005) considers the delay variation directly, but it is very slow even for small clock trees. In this paper, the authors propose a fast link insertion algorithm that considers the delay variation information directly during link selection process. Our algorithm inserts links only in the parts of the clock tree that are most susceptible to variation effects by evaluating the skew sensitivity to variations. Another key feature of the algorithm is that it is compatible with any higher order delay model/variation model, unlike the existing algorithms. The authors verify the effectiveness of our algorithm using HSPICE based Monte Carlo simulations on a set of standard benchmarks
international symposium on quality electronic design | 2008
Anand Rajaram; Raguram Damodaran; Arjun Rajagopal
Clock tree analysis and signoff is a key step in the design of any high performance chip. Though simple and intutive metrics like skew have been used to track clock tree quality, they are not sufficient for most practical purposes. Ideally, skew distribution obtained using a SSTA (Statististical Static Timing Analysis) on the clock trees can be used. But in most practical cases, the process information assumed by SSTA is not available. As a result, the signoff of clock skew robustness to variation effects is an often difficult problem to solve. In this work, we propose two metrics that can address this issue. These metrics can be used in three important ways. First, they can be used to determine during CTS whether the clock tree is good enough to go through rest of the backend flow or whether more tuning needs to be done to the clock tree. Second, they can be used to isolate any parts of the clock tree that behaves like a hot spot for clock skew across corners. Third, it can be used as a final signoff metric for clock tree to ensure that the tracking of the delays and skews can be expected to be good across all process points. We provide several experimental results from industry testcases demonstrating the utility of our metrics.