Wei-Ting Jonas Chan
University of California, San Diego
Publication
Featured research published by Wei-Ting Jonas Chan.
international conference on computer design | 2014
Juan Antonio Carballo; Wei-Ting Jonas Chan; Paolo A. Gargini; Andrew B. Kahng; Siddhartha Nath
The International Technology Roadmap for Semiconductors (ITRS) has roadmapped technology requirements of the semiconductor industry over the past two decades. The roadmap identifies major challenges in advanced technology and leads the investment of research in a cost-effective way. Traditionally, the ITRS identifies major semiconductor IC products as drivers; these set requirements for the state-of-the-art semiconductor technologies. High-performance microprocessor unit (MPU-HP) for servers and consumer portable system-on-chip (SOC-CP) for smartphones are two examples. Throughout the history of the ITRS, Moore's Law has been the main impetus for these drivers, continuously pushing the transistor density to scale at a rate of 2× per technology generation (aka “node”). However, as new requirements from applications such as data center, mobility, and context-aware computing emerge, the existing roadmapping methodology is unable to capture the entire evolution of the current semiconductor industry. Today, comprehending how key markets and applications drive the process, design and integration technology roadmap requires new system-level studies along with chip-level studies. In this paper, we extend the current ITRS roadmapping process with studies of key requirements from a system-level perspective, based on multiple generations of smartphones and microservers. We describe potential new system drivers and new metrics, and we refer to the new system-level framing of the roadmap as ITRS 2.0.
international conference on computer design | 2013
Wei-Ting Jonas Chan; Andrew B. Kahng; Seokhyeong Kang; Rakesh Kumar; John Sartori
Aggressive requirements for low power and high performance in VLSI designs have led to increased interest in approximate computation. Approximate hardware modules can achieve improved energy efficiency compared to accurate hardware modules. While a number of previous works have proposed hardware modules for approximate arithmetic, these works focus on solitary approximate arithmetic operations. To utilize the benefit of approximate hardware modules, CAD tools should be able to quickly and accurately estimate the output quality of composed approximate designs. A previous work [10] proposes an interval-based approach for evaluating the output quality of certain approximate arithmetic designs. However, that approach uses sampled error distributions to store the characterization data of hardware, and its accuracy is limited by the number of intervals used during characterization. In this work, we propose an approach for output quality estimation of approximate designs that is based on a lookup table technique that characterizes the statistical properties of approximate hardware modules and a regression-based technique for composing statistics to formulate output quality. These two techniques improve the speed and accuracy for several error metrics over a set of multiply-accumulator testcases. Compared to the interval-based modeling approach of [10], our approach for estimating output quality of approximate designs is 3.75× more accurate for comparable runtime on the testcases and achieves 8.4× runtime reduction for the error composition flow. We also demonstrate that our approach is applicable to general testcases.
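To make "characterizing the statistical properties" of an approximate module more concrete, here is a minimal sketch that exhaustively tabulates the error statistics of a toy approximate adder. The truncation-based adder, the function names, and the choice of statistics are all assumptions made for illustration; this is not the paper's lookup-table or regression technique.

```python
from itertools import product
from statistics import mean, pvariance

def truncated_add(a: int, b: int, k: int) -> int:
    """Hypothetical approximate adder: ignores the k low-order bits of both operands."""
    mask = ~((1 << k) - 1)
    return (a & mask) + (b & mask)

def characterize(width: int, k: int) -> dict:
    """Tabulate error statistics over every pair of width-bit unsigned operands."""
    errors = [truncated_add(a, b, k) - (a + b)
              for a, b in product(range(1 << width), repeat=2)]
    return {
        "mean": mean(errors),              # average signed error
        "variance": pvariance(errors),     # population variance of the error
        "max_abs": max(abs(e) for e in errors),
    }

stats = characterize(width=6, k=2)
print(stats)
```

For this toy adder the error is always non-positive, since truncation only discards value. A table of such per-module statistics is the kind of data a composition step could then combine; for example, when errors from independent stages simply add, their means and variances add as well.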
design automation conference | 2015
Wei-Ting Jonas Chan; Siddhartha Nath; Andrew B. Kahng; Yang Du; Kambiz Samadi
Quantification of three-dimensional integrated circuit (3DIC) benefits over corresponding 2DIC implementation for arbitrary designs remains a critical open problem, largely due to nonexistence of any “golden” 3DIC flow. Actual design and implementation parameters and constraints affect 2DIC and 3DIC final metrics (power, slack, etc.) in highly non-monotonic ways that are difficult for engineers to comprehend and predict. We propose a novel machine learning-based methodology to estimate 3DIC power benefit (i.e., percentage power reduction) based on corresponding golden 2DIC implementation parameters. The resulting 3D Power Estimation (3DPE) models achieve small prediction errors that are bounded by construction. We are the first to perform a novel stress test of our predictive models across a wide range of implementation and design-space parameters. Further, we explore model-guided implementation of designs in 3D to achieve minimum power: that is, our models recommend a most-promising set of implementation parameters and constraints, and also provide a priori estimates of 3D power benefits, based on a given design's post-synthesis and 2D implementation parameters. We achieve ≤10% error in power benefit prediction across various 3DIC designs.
international symposium on physical design | 2017
Wei-Ting Jonas Chan; Pei-Hsin Ho; Andrew B. Kahng; Prashant Saxena
Design rule check (DRC) violations after detailed routing prevent a design from being taped out. To solve this problem, state-of-the-art commercial EDA tools global-route the design to produce a global-route congestion map; this map is used by the placer to optimize the placement of the design to reduce detailed-route DRC violations. However, in sub-14nm processes and beyond, DRCs arising from multiple patterning and pin-access constraints drastically weaken the correlation between global-route congestion and detailed-route DRC violations. Hence, the placer, based on the global-route congestion map, may leave too many detailed-route DRC violations to be fixed manually by designers. In this paper, we present a method that employs (1) machine-learning techniques to effectively predict detailed-route DRC violations after global routing and (2) detailed placement techniques to effectively reduce detailed-route DRC violations. We demonstrate on several layouts of a sub-14nm industrial design that this method predicts the locations of 74% of the detailed-route DRCs (with false positive prediction rate below 0.2%) and automatically reduces the number of detailed-route DRC violations by up to 5×. Whereas previous works on machine learning for routability [30] [4] have focused on routability prediction at the floorplanning and placement stages, ours is the first paper that not only predicts the actual locations of detailed-route DRC violations but furthermore optimizes the design to significantly reduce such violations.
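The two prediction figures quoted above (fraction of DRC locations found, and false positive rate) are standard detection metrics. The sketch below shows how they are typically computed over a grid of layout cells; the cell indexing, the function name, and the toy data are hypothetical and are not taken from the paper.

```python
def detection_rates(predicted: set, actual: set, all_cells: set) -> tuple[float, float]:
    """Return (true positive rate, false positive rate) for predicted violation cells."""
    true_pos = len(predicted & actual)     # violations that were predicted
    false_pos = len(predicted - actual)    # clean cells flagged by mistake
    negatives = len(all_cells - actual)    # all clean cells
    tpr = true_pos / len(actual) if actual else 0.0
    fpr = false_pos / negatives if negatives else 0.0
    return tpr, fpr

# Toy example (made-up numbers): 1000 cells, 50 of which contain violations.
all_cells = set(range(1000))
actual = set(range(50))
predicted = set(range(37)) | {500, 501}    # 37 hits plus 2 false alarms

tpr, fpr = detection_rates(predicted, actual, all_cells)
print(f"TPR = {tpr:.2%}, FPR = {fpr:.2%}")
```

Because clean cells usually far outnumber violating cells, even a very small false positive rate can correspond to a noticeable number of false alarms, which is why both rates are reported together.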
international conference on computer design | 2014
Wei-Ting Jonas Chan; Andrew B. Kahng; Siddhartha Nath; Ichiro Yamamoto
The system driver models for microprocessor (MPU) and system-on-chip (SOC) in the International Technology Roadmap for Semiconductors [21] (ITRS) determine the roadmap of underlying technology requirements across devices, patterning, interconnect, test, design and other semiconductor supplier industries. In this paper, we describe several fundamental changes in the ITRS MPU and SOC system driver models as of the recently-released 2013 edition of the roadmap. We first present new A-factor (i.e., layout density) models for the logic and memory components of the MPU and SOC drivers; these updated density models comprehend the industry's shift to FinFET devices below the foundry 20nm node. We also describe updated architectural, total chip area, and total chip power models for the MPU and SOC drivers. Notably, we model the growing uncore portion of MPU products, and the growing presence of graphic processing units (GPUs) and other peripheral cores (PEs) in SOC architectures. The updated SOC architectural model enables more realistic scenario-based power modeling for the SOC driver. The 2013 ITRS update of system driver models embodies extensive calibration with foundry data as well as product structural analysis reports from a leading analysis firm (Chipworks). The model calibration reveals that the industry has contended with a “scaling gap” since 2008, whereby traditional Moore's-Law density scaling of 2× per node has failed due to patterning limitations on layout design, as well as manufacturability and performability challenges of Metal-1 half-pitch (M1HP) scaling. Growing design margins due to reliability, yield, variability, etc. have also contributed to the slowdown of density scaling. We describe how this scaling gap can potentially be compensated if the semiconductor industry urgently pursues design-based equivalent scaling (DES), which substantially changes the area and power model trajectories of MPUs and SOCs in the ITRS System Drivers Chapter.
Finally, we note that as a consequence of the updated A-factor, area and power models in the 2013 ITRS, the industry now faces a 20% more daunting power management challenge than had been predicted in the 2011 roadmap.
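The "scaling gap" compounds across nodes, which a few lines of arithmetic can illustrate. In the sketch below, the 2× per node figure is the traditional rate named in the abstract, while the slower 1.6× rate is a purely hypothetical value chosen for illustration and is not a number from the ITRS or the paper.

```python
def density_after(nodes: int, rate_per_node: float) -> float:
    """Relative transistor density after `nodes` generations of compounding."""
    return rate_per_node ** nodes

def scaling_gap(nodes: int, actual_rate: float, ideal_rate: float = 2.0) -> float:
    """Ratio of ideal to actual density after `nodes` generations."""
    return density_after(nodes, ideal_rate) / density_after(nodes, actual_rate)

# Hypothetical slowdown: 1.6x per node instead of the traditional 2x.
for n in range(1, 4):
    print(n, density_after(n, 2.0), round(density_after(n, 1.6), 3),
          round(scaling_gap(n, 1.6), 3))
```

Under these assumed rates, the shortfall after three nodes is close to a factor of two, i.e. roughly one full node of density lost.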
IEEE Transactions on Circuits and Systems | 2014
Tuck-Boon Chan; Wei-Ting Jonas Chan; Andrew B. Kahng
Transistor aging due to bias temperature instability (BTI) is a major reliability concern in sub-32 nm technology. To compensate for aging, designs now typically apply adaptive voltage scaling (AVS) to mitigate performance degradation by elevating supply voltage. Since varying the supply voltage also causes the BTI degradation to vary over lifetime, this presents a new challenge for margin reduction in the context of conventional signoff methodology, which characterizes timing libraries based on transistor models with pre-calculated BTI degradations for a given IC lifetime. In this paper, we study the conditions under which a circuit with AVS requires additional timing margin during signoff. Then, we propose two heuristics for chip designers to characterize an aging-derated standard-cell timing library that accounts for the impact of AVS during signoff. According to our experimental results, this aging-aware signoff approach avoids both overestimation and underestimation of aging (either of which results in a power or area penalty) in AVS-enabled systems. Further, we compare circuits implemented with the aging-aware signoff method based on aging-derated libraries versus those based on a flat timing margin. We demonstrate that the flat timing margin method is more pessimistic, and that the pessimism can be mitigated by AVS.
asia and south pacific design automation conference | 2016
Wei-Ting Jonas Chan; Kun Young Chung; Andrew B. Kahng; Nancy D. MacDonald; Siddhartha Nath
Embedded memories are critical to success or failure of complex system-on-chip (SoC) products. They can be significant yield detractors as a consequence of occupying substantial die area, creating placement and routing blockages, and having stringent Vccmin and power integrity requirements. Achieving timing-correctness for embedded memories in advanced nodes is costly (e.g., closing the design at multiple (logic-memory) cross-corners). Further, multiphysics (e.g., crosstalk, IR, etc.) signoff analyses make early understanding and prediction of timing (-correctness) even more difficult. With long tool and design closure subflow runtimes, design teams need improved prediction of embedded memory timing failures, as early as possible in the implementation flow. In this work, we propose a learning-based methodology to perform early prediction of timing failure risk given only the netlist, timing constraints, and floorplan context (wherein the memories have been placed). Our contributions include (i) identification of relevant netlist and floorplan parameters, (ii) the avoidance of long P&R tool runtimes (up to a week or even more) with early prediction, and (iii) a new implementation of Boosting with Support Vector Machine regression with focus on negative-slack outcomes through weighting in the model construction. We validate accuracy of our prediction models across a range of “multiphysics” analysis regimes, and with multiple designs and floorplans in 28FDSOI foundry technology. Our work can be used to identify which memories are “at risk”, guide floorplan changes to reduce predicted “risk”, and help refine underlying SoC implementation methodologies. Experimental results in 28nm FDSOI technology show that we can predict P&R slack with multiphysics analysis to within 253ps (average error less than 10ps) using only post-synthesis netlist, constraints and floorplan information. 
Our predictions are 40% more accurate than the predictions (worst-case error of 358ps and average error of 42ps) of a nonlinear Support Vector Machine model that uses only post-synthesis netlist information.
design, automation, and test in europe | 2013
Tuck-Boon Chan; Wei-Ting Jonas Chan; Andrew B. Kahng
Transistor aging due to bias temperature instability (BTI) is a major reliability concern in sub-32nm technology. Aging decreases performance of digital circuits over the entire IC lifetime. To compensate for aging, designs now typically apply adaptive voltage scaling (AVS) to mitigate performance degradation by elevating supply voltage. Varying the supply voltage of a circuit using AVS also causes the BTI degradation to vary over lifetime. This presents a new challenge for margin reduction in conventional signoff methodology, which characterizes timing libraries based on transistor models with pre-calculated BTI degradations for a given IC lifetime. Many works have separately addressed predictive models of BTI and the analysis of AVS, but there is no published work that considers BTI-aware signoff that accounts for the use of AVS during IC lifetime. This motivates us to study how the presence of AVS should affect aging-aware signoff. In this paper, we first simulate and analyze circuit performance degradation due to BTI in the presence of AVS. Based on our observations, we propose a rule of thumb for chip designers to characterize an aging-derated standard-cell timing library that accounts for the impact of AVS. According to our experimental results, this aging-aware signoff approach avoids both overestimation and underestimation of aging (either of which results in a power or area penalty) in AVS-enabled systems.
ACM Journal on Emerging Technologies in Computing Systems | 2017
Armin Alaghi; Wei-Ting Jonas Chan; John P. Hayes; Andrew B. Kahng; Jiajia Li
As we approach the limits of traditional Moore’s-Law scaling, alternative computing techniques that consume energy more efficiently become attractive. Stochastic computing (SC), as a re-emerging computing technique, is a low-cost and error-tolerant alternative to conventional binary circuits in several important applications such as image processing and communications. SC allows a natural accuracy-energy tradeoff that has been exploited in the past. This article presents an accuracy-energy tradeoff technique for SC circuits that reduces their energy consumption with virtually no accuracy loss. To this end, we employ voltage or frequency scaling, which normally reduce energy consumption at the cost of timing errors. Then we show that due to their inherent error tolerance, SC circuits operate satisfactorily without significant accuracy loss even with aggressive scaling. This significantly improves their energy efficiency. In contrast, conventional binary circuits quickly fail as the supply voltage decreases. To find the most energy-efficient operating point of an SC circuit, we propose an error estimation method that allows us to quickly explore the circuit’s design space. The error estimation method is based on Markov chain and least-squares regression. Furthermore, we investigate opportunities to optimize SC circuits under such aggressive scaling. We find that logical and physical design techniques can be combined to significantly expand the already-powerful accuracy-energy tradeoff possibilities of SC. In particular, we demonstrate that careful adjustment of path delays can lead to significant error reduction under voltage and frequency scaling. We perform buffer insertion and route detouring to achieve more balanced path delays. These techniques differ from conventional path-balancing techniques whose goal is to minimize power consumption by resizing the non-critical paths. 
The goal of our path-balancing approach is to increase error cancellation chances in voltage-/frequency-scaled SC circuits. Our circuit optimization comprehends the tradeoff between power overheads due to inserted buffers and wires versus the energy reduction from supply voltage downscaling enabled by more balanced path delays. Simulation results show that our optimized SC circuits can tolerate aggressive voltage scaling with no significant signal-to-noise ratio (SNR) degradation. In one example, a 40% supply voltage reduction (1V to 0.6V) on the SC circuit leads to 66% energy saving (20.7pJ to 6.9pJ) and makes it more efficient than its conventional binary counterpart. In the same example, a 100% frequency boosting (400ps to 200ps) of the optimized circuits leads to no significant SNR degradation. We also show that process variation and temperature variation have limited impact on optimized SC circuits. The error change is less than 5% when temperature changes by 100°C or process condition changes from worst case to best case.
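For readers unfamiliar with stochastic computing, the sketch below shows the textbook unipolar encoding, in which a value in [0, 1] is the fraction of 1s in a random bitstream and a single AND gate multiplies two independent streams. The random bit flips are only a crude, assumed stand-in for timing errors; they do not reproduce the article's Markov-chain error model or its circuits, and all function names are illustrative.

```python
import random

def to_stream(p, length, rng):
    """Encode probability p in [0, 1] as a random unipolar bitstream."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def from_stream(bits):
    """Decode a unipolar bitstream: the value is the fraction of 1s."""
    return sum(bits) / len(bits)

def flip_bits(bits, rate, rng):
    """Assumed error model: flip each bit independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]

def sc_multiply(x, y, length=1 << 14, error_rate=0.0, seed=0):
    """Multiply x and y by ANDing two independent bitstreams, then inject errors."""
    rng = random.Random(seed)
    xs = to_stream(x, length, rng)
    ys = to_stream(y, length, rng)
    product = [a & b for a, b in zip(xs, ys)]   # AND gate acts as a multiplier
    return from_stream(flip_bits(product, error_rate, rng))

clean = sc_multiply(0.5, 0.6)
noisy = sc_multiply(0.5, 0.6, error_rate=0.01)
print(clean, noisy)   # both should land close to 0.5 * 0.6 = 0.3
```

Under this simple model, a 1% flip rate shifts the expected decoded value by only about 0.004, because every bit carries equal weight. A flipped high-order bit in a conventional binary word can change the value far more, which is the intuition behind the error tolerance described above.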
international conference on computer design | 2016
Wei-Ting Jonas Chan; Yang Du; Andrew B. Kahng; Siddhartha Nath; Kambiz Samadi
In advanced technology nodes, physical design engineers must estimate whether a standard-cell placement is routable (before invoking the router) in order to maintain acceptable design turnaround time. Modern SoC designs consume multiple compute servers, memory, tool licenses and other resources for several days to complete routing. When the design is unroutable, resources are wasted, which increases the design cost. In this work, we develop machine learning-based models that predict whether a placement solution is routable without conducting trial or early global routing. We also use our models to accurately predict iso-performance Pareto frontiers of utilization, aspect ratio and number of layers in the back-end-of-line (BEOL) stack. Furthermore, using data mining and machine learning techniques, we develop new methodologies to generate training examples given very few placements. We conduct validation experiments in three foundry technologies (28nm FDSOI, 28nm LP and 45nm GS), and demonstrate accuracy ≥ 85.9% in predicting routability of a placement. Our predictions of Pareto frontiers in the three technologies are pessimistic by at most 2% with respect to the maximum achievable utilization for a given design in a given BEOL stack.
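A Pareto frontier over placement configurations can be extracted with a simple dominance filter once each configuration has a routability prediction. The sketch below assumes two objectives (higher utilization is better, fewer BEOL layers is better) and uses made-up data with a precomputed `routable` flag standing in for a model's prediction; the paper's frontiers also involve aspect ratio, which is omitted here for brevity.

```python
from typing import NamedTuple

class Config(NamedTuple):
    utilization: float   # placement utilization; higher is better (assumed objective)
    layers: int          # BEOL metal layers; fewer is better (assumed objective)
    routable: bool       # stand-in for a model's routability prediction

def dominates(a: Config, b: Config) -> bool:
    """True if a is at least as good as b in both objectives and strictly better in one."""
    no_worse = a.utilization >= b.utilization and a.layers <= b.layers
    better = a.utilization > b.utilization or a.layers < b.layers
    return no_worse and better

def pareto_frontier(configs: list) -> list:
    """Keep the routable configurations that no other routable configuration dominates."""
    feasible = [c for c in configs if c.routable]
    return sorted(c for c in feasible
                  if not any(dominates(other, c) for other in feasible))

# Hypothetical design points, not data from the paper.
configs = [
    Config(0.70, 6, True), Config(0.75, 6, True), Config(0.80, 6, False),
    Config(0.78, 8, True), Config(0.80, 8, True), Config(0.85, 8, False),
]
frontier = pareto_frontier(configs)
print(frontier)
```

The quadratic filter is adequate for a handful of design points; a sort-based sweep would be a natural replacement if many configurations had to be compared.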