Weiping Liao
University of California, Los Angeles
Publication
Featured research published by Weiping Liao.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems | 2005
Weiping Liao; Lei He; Kevin M. Lepak
Performance and power are two primary design issues for systems ranging from server computers to handhelds. Performance is affected by both temperature and supply voltage because of the temperature and voltage dependence of circuit delay. Furthermore, as semiconductor technology scales down, leakage power's exponential dependence on temperature and supply voltage becomes significant. Therefore, future design studies call for temperature- and voltage-aware performance and power modeling. In this paper, we study microarchitecture-level temperature- and voltage-aware performance and power modeling. We present a leakage power model with temperature and voltage scaling, and show that leakage and total energy vary by 38% and 24%, respectively, between 65°C and 110°C. We study thermal runaway induced by the interdependence between temperature and leakage power, and demonstrate that without temperature-aware modeling, underestimation of leakage power may lead to the failure of thermal controls, and overestimation of leakage power may result in excessive performance penalties of up to 5.24%. All of these studies underscore the necessity of temperature-aware power modeling. Furthermore, we study optimal voltage scaling for best performance with dynamic power and thermal management under different packaging options. We show that dynamic power and thermal management allows designs to target the common-case thermal scenario among benchmarks and improves performance by 6.59% compared to designs targeted at the worst-case thermal scenario without dynamic power and thermal management. Additionally, we show that the optimal Vdd for the best performance may not be the largest Vdd allowed by the given packaging platform, and that advanced cooling techniques can improve throughput significantly.
design automation conference | 2004
Lei He; Weiping Liao; Mircea R. Stan
The high leakage devices in nanometer technologies as well as the low activity rates in system-on-a-chip (SOC) contribute to the growing significance of leakage power at the system level. We first present system-level leakage-power modeling and characteristics and discuss ways to reduce leakage for caches. Considering the interdependence between leakage power and temperature, we then discuss thermal runaway and dynamic power and thermal management (DPTM) to reduce power and prevent thermal violations. We show that a thermal-independent leakage model may hide actual failures of DPTM. Finally, we present voltage scaling considering DPTM for different packaging options. We show that the optimal Vdd for the best throughput may be smaller than the largest Vdd allowed by the given packaging platform, and that advanced cooling techniques can improve throughput significantly.
international conference on computer aided design | 2002
Weiping Liao; Joseph M. Basile; Lei He
In this paper, we study leakage power reduction using power gating in the forms of the Virtual power/ground Rails Clamp (VRC) and Multi-threshold CMOS (MTCMOS) techniques. We apply power gating to two circuit types: memory-based units and datapath components. Using a microarchitecture-level power simulator, as well as power and timing models derived from detailed circuit designs, we further study leakage power modeling and reduction at the system level for modern high-performance VLIW processors. We show that the leakage power can be over 40% of the total power for such processors. Moreover, we propose time-out scheduling of VRC to reduce power up to 85.65% for L2 cache. This power savings results in close to 1/3 total power dissipation for the VLIW processors we study.
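As a rough illustration of the time-out idea mentioned in this abstract, the sketch below assumes a unit is power-gated once it has been idle for a fixed number of cycles and stays gated until the idle period ends. The function name, the idle-period trace, and the time-out value are hypothetical and are not taken from the paper.

```python
def timeout_gating(idle_periods, timeout):
    """Estimate the effect of a simple time-out gating policy.

    idle_periods: lengths (in cycles) of consecutive idle intervals of a unit
                  (hypothetical profiling data).
    timeout:      idle cycles to wait before gating the unit (assumed policy).

    Returns (gated_cycles, wakeups): cycles spent in the gated state and the
    number of times the unit had to be woken up again.
    """
    gated_cycles = 0
    wakeups = 0
    for idle in idle_periods:
        if idle > timeout:
            # The unit is gated only for the part of the idle period
            # that remains after the time-out expires.
            gated_cycles += idle - timeout
            wakeups += 1
    return gated_cycles, wakeups


# Hypothetical trace: four idle periods, time-out of 50 cycles.
gated, wakeups = timeout_gating([5, 100, 20, 300], timeout=50)
print(gated, wakeups)
```

A shorter time-out gates more cycles but causes more wake-ups, so a study of this kind would sweep the time-out against measured idle-period distributions.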
international symposium on low power electronics and design | 2003
Weiping Liao; Fei Li; Lei He
In this paper, we present power models with clock and temperature scaling, and develop the first coupled thermal and power simulation of its type, with a temperature-dependent leakage power model at the microarchitecture level. We show that leakage energy and total energy can differ by up to 2.5X and 2X, respectively, for temperatures between 90°C and 130°C. Given such large energy variations, no power model at the microarchitecture level is accurate without considering temperature-dependent leakage models.
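The coupling between temperature and leakage described in this abstract can be pictured as a fixed-point iteration. This is a minimal sketch and not the paper's simulator: it assumes an exponential leakage model and a single lumped thermal resistance, and every constant and function name is hypothetical.

```python
import math


def leakage_power(temp_c, p_leak_ref=10.0, t_ref=65.0, k=0.02):
    """Assumed exponential leakage model: watts at temp_c (degrees C)."""
    return p_leak_ref * math.exp(k * (temp_c - t_ref))


def coupled_steady_state(p_dynamic, t_ambient=45.0, r_thermal=0.5,
                         t_limit=150.0, tol=1e-6, max_iter=1000):
    """Iterate temperature and power until they agree.

    Returns (temperature, total_power) at the steady state, or None if the
    temperature exceeds t_limit or the iteration does not converge.
    """
    temp = t_ambient
    for _ in range(max_iter):
        total = p_dynamic + leakage_power(temp)
        # Lumped thermal model (assumption): T = T_ambient + R_thermal * P.
        new_temp = t_ambient + r_thermal * total
        if new_temp > t_limit:
            return None  # no stable operating point below the limit
        if abs(new_temp - temp) < tol:
            return new_temp, total
        temp = new_temp
    return None


print(coupled_steady_state(30.0))   # settles at a stable temperature
print(coupled_steady_state(200.0))  # exceeds the limit, returns None
```

A temperature-independent model would evaluate `leakage_power` once at a fixed temperature and skip the loop, which is the simplification the abstract argues against.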
design automation conference | 2004
Changbo Long; Lucanus J. Simonson; Weiping Liao; Lei He
Interconnect pipelining has a great impact on system performance, but has not been considered by automatic floorplanning. Considering interconnect pipelining, we study the floorplanning optimization problem to minimize system CPI (cycles per instruction) and in turn maximize system performance. We develop an efficient table-based model called trajectory piece-wise linear (TPWL) model to estimate CPI with interconnect pipelining. Experiments show that the TPWL model differs from cycle-accurate simulations by less than 3.0%. We integrate this model with a simulated-annealing based floorplan optimization to obtain CPI-aware floorplanning. Compared to the conventional floorplanning to minimize area and wire length, our CPI-aware floorplanning can reduce CPI by up to 28.6% with a small area overhead of 5.69% under 100nm technology and obtain better results under 70nm technology. To the best of our knowledge, this paper is the first in-depth study on floorplanning optimization with consideration of interconnect pipelining.
international conference on computer aided design | 2003
Weiping Liao; Lei He
In this paper, we study the full-chip interconnect power modeling. We show that repeater insertion is no longer sufficient to achieve the target frequencies specified by ITRS, and develop concurrent repeater and FF insertion schemes. Considering structural interconnects, layer assignment and concurrent repeater and FF insertion for delay specification, we develop a cycle-accurate microarchitecture-level interconnect power simulation. The simulation reduces the over-estimation by up to 2.46X compared to power estimation based on purely stochastic interconnects and fixed switching factor. Furthermore, we show that interconnect pipelining has a lower IPC but can improve throughput by up to 2.03X. This indicates that the traditional design flow optimizing IPC and clock frequency separately may no longer be valid.
international conference on parallel architectures and compilation techniques | 2003
Weiping Liao; Lei He
In this chapter, we first present a cycle-accurate power simulator based on the IMPACT toolset. This simulator allows a designer to evaluate both VLIW compiler and micro-architecture innovations for power reduction. Using this simulator, we then develop and compare the following techniques with a bounded performance loss of 1% compared to the case without any dynamic throttling: (i) clock ramping with hardware-based prescan (CRHP), and (ii) clock ramping with compiler-based prediction (CRCP). Experiments using SPEC2000 floating point benchmarks show that the power consumed by floating point units can be reduced by up to 31% and 37%, in CRHP and CRCP respectively.
IEEE Transactions on Very Large Scale Integration Systems | 2005
Weiping Liao; Joseph M. Basile; Lei He
In this paper, we study microarchitecture-level leakage energy reduction by power gating. We consider the virtual power/ground rails clamp (VRC) and multithreshold CMOS (MTCMOS) techniques and apply VRC to memory-based units for data retention and MTCMOS to the other units. We propose a systematic methodology for leakage reduction at the microarchitecture level, in which profiling of idle period distribution and ideal power gating analysis are used to select a target component for realistic power gating. We show that the ideal leakage energy reduction can be up to 30% of the total energy for the modern high-performance very long instruction word processors we study and that the secondary level (L2) cache contributes most to the reduction. We further improve the existing adaptive cache decay method for leakage reduction by using VRC for data retention and name it VRC decay. Applied to L2 cache, the VRC decay, on average, increases performance by 5.6% and reduces system energy by 24.1%, compared to the adaptive cache decay without data retention.
IEEE Transactions on Very Large Scale Integration Systems | 2007
Changbo Long; Lucanus J. Simonson; Weiping Liao; Lei He
Microarchitecture configurations and floorplanning are keys to boost throughput, and they are strongly related. In this paper, we propose a new method to optimize them simultaneously. We first concentrate on floorplanning under given microarchitecture configurations. In addition to the objectives of conventional floorplanning methods, we minimize the throughput degradation caused by pipelined global interconnects based on efficient yet accurate models for microarchitecture throughput over pipeline stages of global interconnects. Our results show that an accurate trajectory piecewise-linear (TPWL) model incurs more offline setup time to obtain 13% better throughput than a rough access ratio-based model, and both models lead to much better throughput (up to 64% higher) compared with conventional floorplanning methods. We then build a unified throughput model parameterized for pipelined global interconnects and microarchitecture configurations based on the TPWL method and apply this model to efficiently explore over one million microarchitecture configurations and corresponding floorplan variations. We obtain microarchitecture configurations and floorplans with throughput 26.9% better than manually chosen microarchitecture followed by automatic floorplanning in a very recent paper.
design, automation, and test in europe | 2005
Jennifer L. Wong; Weiping Liao; Fei Li; Lei He; Miodrag Potkonjak
Context-aware applications pose new challenges, including a need for new computational models, uncertainty management, and efficient optimization under uncertainty. Uncertainty can arise at two levels: multiple and single tasks. When a mobile user changes environments, the context changes, resulting in the possibility of the user requesting tasks which are specific to the new environment. However, as the user moves, these requested tasks may no longer be context relevant. Additionally, the runtime of each task is often highly dependent on the input data. We introduce a hierarchical multi-resolution statistical task model that captures relevant aspects at the task and intertask levels, and captures not only uncertainty, but also introduces the notion of utility for the user. We have developed a system of nonparametric statistical techniques for modeling the runtime of a specific task. This model is a framework where we define problems of design and optimization of statistical soft real-time systems (SSRTS). The main algorithmic novelty is a cumulative potential-based task scheduling heuristic for maximizing utility. The heuristic conducts global optimization and induces low runtime overhead. We demonstrate the effectiveness of the scheduling heuristic using a Trimaran-based evaluation platform.