Jie S. Hu

Pennsylvania State University

Publication

Featured research published by Jie S. Hu.

IEEE Computer | 2003

Leakage current: Moore's law meets static power

Nam Sung Kim; Todd M. Austin; D. Blaauw; Trevor N. Mudge; Krisztian Flautner; Jie S. Hu; Mary Jane Irwin; Mahmut T. Kandemir; Vijay Narayanan

Off-state leakage is static power, current that leaks through transistors even when they are turned off. The other source of power dissipation in today's microprocessors, dynamic power, arises from the repeated capacitance charge and discharge on the output of the hundreds of millions of gates in today's chips. Until recently, only dynamic power has been a significant source of power consumption, and Moore's law helped control it. However, power consumption has now become a primary microprocessor design constraint, one that researchers in both industry and academia will struggle to overcome in the next few years. Microprocessor design has traditionally focused on dynamic power consumption as a limiting factor in system integration. As feature sizes shrink below 0.1 micron, static power is posing new low-power design challenges.
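The two power sources described in this abstract are commonly approximated with first-order textbook formulas. The sketch below is not taken from the paper; the function names and the operating point are assumptions chosen only for illustration.

```python
def dynamic_power(activity, capacitance_f, vdd, freq_hz):
    """Switching power in watts: activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd ** 2 * freq_hz


def static_power(vdd, leakage_current_a):
    """Leakage power in watts: Vdd * I_leak, drawn even when nothing switches."""
    return vdd * leakage_current_a


# Hypothetical operating point, chosen only for illustration.
dyn = dynamic_power(activity=0.1, capacitance_f=1e-9, vdd=1.2, freq_hz=2e9)
stat = static_power(vdd=1.2, leakage_current_a=0.05)
print(f"dynamic = {dyn:.3f} W, static = {stat:.3f} W")
```

In this model, lowering the clock frequency reduces only the dynamic term; the static term remains as long as the supply voltage is applied.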

Languages, Compilers, and Tools for Embedded Systems | 2002

Energy-conscious compilation based on voltage scaling

Hendra Saputra; Mahmut T. Kandemir; Narayanan Vijaykrishnan; Mary Jane Irwin; Jie S. Hu; Chung-Hsing Hsu; Ulrich Kremer

As energy consumption has become a major constraint in current system design, it is essential to look beyond the traditional low-power circuit and architectural optimizations. Further, software is becoming an increasing portion of embedded/portable systems. Consequently, optimizing the software in conjunction with the underlying low-power hardware features such as voltage scaling is vital. In this paper, we present two compiler-directed energy optimization strategies based on voltage scaling: static voltage scaling and dynamic voltage scaling. In static voltage scaling, the compiler determines a single supply voltage level for the entire input program. We primarily aim at improving the energy consumption of a given code without increasing its execution time. To accomplish this, we employ classical loop-level compiler optimizations. However, we use these optimizations to create opportunities for voltage scaling to save energy, rather than increase program performance. In dynamic voltage scaling, the compiler can select different supply voltage levels for different parts of the code. Our compilation strategy is based on integer linear programming and can accommodate energy/performance constraints. For a benchmark suite of array-based scientific codes and embedded video/image applications, our experiments show average energy savings of 31.8% when static voltage scaling is used. Our dynamic voltage scaling strategy saves 15.3% more energy than static voltage scaling when invoked under the same performance constraints.
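The static voltage scaling idea can be sketched as follows: loop optimizations shorten the program, and the resulting slack is spent on a lower supply voltage instead of a faster run. This is a simplified illustration, not the paper's algorithm; the voltage/frequency levels, the cycle counts, and the assumption that dynamic energy scales with cycles times V² are all hypothetical.

```python
def pick_static_voltage(levels, cycles, deadline_s):
    """Return the lowest (voltage, frequency) level that meets the deadline.

    levels: list of (voltage_v, freq_hz) pairs the hardware is assumed to support.
    Returns None if no level is fast enough.
    """
    feasible = [(v, f) for v, f in levels if cycles / f <= deadline_s]
    return min(feasible, default=None)


def relative_energy(voltage_v, cycles):
    """Dynamic energy up to a constant factor: cycles * V^2."""
    return cycles * voltage_v ** 2


# Hypothetical levels and cycle counts.
LEVELS = [(1.0, 200e6), (1.2, 300e6), (1.5, 400e6)]
original_cycles = 4_000_000    # unoptimized code, run at the top level
optimized_cycles = 2_900_000   # after loop-level optimizations
deadline = original_cycles / 400e6  # original execution time must not grow

v, f = pick_static_voltage(LEVELS, optimized_cycles, deadline)
saving = 1 - relative_energy(v, optimized_cycles) / relative_energy(1.5, original_cycles)
print(f"run at {v} V / {f / 1e6:.0f} MHz, energy saving {saving:.0%}")
```

A dynamic scheme would make the same kind of choice separately for each code region, which is where the abstract's integer linear programming formulation comes in.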

International Symposium on Microarchitecture | 2002

Compiler-directed instruction cache leakage optimization

Wei Zhang; Jie S. Hu; Vijay Degalahal; Mahmut T. Kandemir; Narayanan Vijaykrishnan; Mary Jane Irwin

Excessive power consumption is widely considered a major impediment to designing future microprocessors. With the continued scaling down of threshold voltages, the power consumed due to leaky memory cells in on-chip caches will constitute a significant portion of the processor's power budget. This work focuses on reducing the leakage energy consumed in the instruction cache using a compiler-directed approach. We present and analyze two compiler-based strategies termed conservative and optimistic. The conservative approach does not put a cache line into a low leakage mode until it is certain that the current instruction in it is dead. On the other hand, the optimistic approach places a cache line in low leakage mode if it detects that the next access to the instruction will occur only after a long gap. We evaluate different optimization alternatives by combining the compiler strategies with state-preserving and state-destroying leakage control mechanisms.
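The difference between the two strategies can be illustrated with a toy model of a single cache line. The sketch assumes perfect knowledge of the access trace, whereas a compiler can only approximate this through static analysis; the function name, the trace, and the gap threshold are hypothetical.

```python
def low_leakage_cycles(access_times, end_time, policy, gap_threshold=0):
    """Count cycles one cache line could spend in low-leakage mode.

    access_times: sorted cycle numbers at which the line is accessed.
    policy: "conservative" turns the line down only after its last access;
            "optimistic" also turns it down across any gap longer than
            gap_threshold cycles.
    Time before the first access is ignored for simplicity.
    """
    if not access_times:
        return end_time
    saved = end_time - access_times[-1]  # the line is dead after its last use
    if policy == "optimistic":
        for prev, nxt in zip(access_times, access_times[1:]):
            gap = nxt - prev
            if gap > gap_threshold:
                saved += gap
    return saved


trace = [10, 12, 500, 505]  # hypothetical access times for one line
for policy in ("conservative", "optimistic"):
    print(policy, low_leakage_cycles(trace, 1000, policy, gap_threshold=100))
```

The optimistic policy saves more leakage on this trace, but in a real design each early turn-off costs either a wake-up delay (state-preserving mode) or a refetch (state-destroying mode).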

Design, Automation, and Test in Europe | 2005

Compiler-Directed Instruction Duplication for Soft Error Detection

Jie S. Hu; Feihui Li; Vijay Degalahal; Mahmut T. Kandemir; Narayanan Vijaykrishnan; Mary Jane Irwin

We experiment with compiler-directed instruction duplication to detect soft errors in VLIW datapaths. In the proposed approach, the compiler determines the instruction schedule by balancing the permissible performance degradation with the required degree of duplication. Our experimental results show that our algorithms allow the designer to perform tradeoff analysis between performance and reliability.
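The trade-off between duplication degree and error coverage can be shown with a small software model. This is only a sketch of the general duplicate-and-compare idea, not the paper's VLIW scheduling algorithm; the function name, the fault-injection parameter, and the example operations are hypothetical.

```python
def run_with_duplication(ops, x, duplicate, fault_at=None):
    """Apply ops to x in order, re-executing the ops whose index is in
    `duplicate` and comparing the two results.

    fault_at optionally flips one bit in the primary result of that op
    index, to mimic a soft error.
    """
    for i, op in enumerate(ops):
        result = op(x)
        if i == fault_at:
            result ^= 1 << 3  # injected single-bit error
        if i in duplicate and op(x) != result:
            raise RuntimeError(f"soft error detected at op {i}")
        x = result
    return x


ops = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]

print(run_with_duplication(ops, 5, duplicate={0, 1, 2}))        # fault-free run
print(run_with_duplication(ops, 5, duplicate={0}, fault_at=1))  # error slips through
```

Duplicating every operation catches the injected fault at the cost of doing all the work twice; duplicating only a subset is cheaper but lets some errors pass silently, which is the balance the compiler is described as striking.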

International Symposium on Low Power Electronics and Design | 2003

Exploiting program hotspots and code sequentiality for instruction cache leakage management

Jie S. Hu; A. Nadgir; Narayanan Vijaykrishnan; Mary Jane Irwin; Mahmut T. Kandemir

Leakage energy optimization for caches has been the target of much recent effort. In this work, we focus on instruction caches and tailor two techniques that exploit the two major factors that shape the instruction access behavior, namely, hotspot execution and sequentiality. First, we adopt a hotspot detection mechanism by profiling the branch behavior at runtime and utilize this to implement a HotSpot based Leakage Management (HSLM) mechanism. Second, we exploit code sequentiality in implementing a Just-In-Time Activation (JITA) that transitions cache lines to active mode just before they are accessed. We utilize the recently proposed drowsy cache that dynamically scales voltages for leakage reduction and implement various schemes that use different combinations of HSLM and JITA. Our experimental evaluation using the SPEC2000 benchmark suite shows that instruction cache leakage energy consumption can be reduced by 63%, 49% and 29%, on average, as compared to an unoptimized cache, a recently proposed hardware optimized cache, and a cache optimized using a compiler, respectively. Further, we observe that these energy savings can be obtained without a significant impact on performance.
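The benefit of Just-In-Time Activation on sequential code can be seen in a toy simulation. The model below is an assumption made for illustration: only the most recently accessed line (and, with JITA, its successor) is kept active, and every access to a drowsy line counts as one wake-up stall. The real schemes combine JITA with HSLM and other turn-off policies.

```python
def count_wakeup_stalls(line_trace, jita):
    """Count accesses that hit a drowsy line for a trace of cache line indices.

    Without JITA only the last accessed line stays active. With JITA the
    next sequential line is also woken up ahead of time.
    """
    active = set()
    stalls = 0
    for line in line_trace:
        if line not in active:
            stalls += 1
        active = {line, line + 1} if jita else {line}
    return stalls


trace = [0, 0, 1, 2, 3, 10, 11, 12]  # hypothetical: two sequential runs
print("without JITA:", count_wakeup_stalls(trace, jita=False))
print("with JITA:   ", count_wakeup_stalls(trace, jita=True))
```

In this model, stalls with JITA occur only at the start of each sequential run, that is, at taken branches to a line that was not pre-activated.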

ACM Transactions on Embedded Computing Systems | 2009

Compiler-assisted soft error detection under performance and energy constraints in embedded systems

Jie S. Hu; Feihui Li; Vijay Degalahal; Mahmut T. Kandemir; Narayanan Vijaykrishnan; Mary Jane Irwin

Soft errors induced by terrestrial radiation are becoming a significant concern in architectures designed in newer technologies. If left undetected, these errors can result in catastrophic consequences or costly maintenance problems in different embedded applications. In this article, we focus on utilizing the compiler's help in duplicating instructions for error detection in VLIW datapaths. The instruction duplication mechanism is further supported by a hardware enhancement for efficient result verification, which avoids the need for additional comparison instructions. In the proposed approach, the compiler determines the instruction schedule by balancing the permissible performance degradation and the energy constraint with the required degree of duplication. Our experimental results show that our algorithms allow the designer to perform trade-off analysis between performance, reliability, and energy consumption.

High-Performance Computer Architecture | 2004

Exploring Wakeup-Free Instruction Scheduling

Jie S. Hu; Narayanan Vijaykrishnan; Mary Jane Irwin

Design of wakeup-free issue queues is becoming desirable due to the increasing complexity associated with broadcast-based instruction wakeup. The effectiveness of most wakeup-free issue queue designs is critically based on their success in predicting the issue latency of an instruction accurately. Consequently, the goal of this paper is to explore the predictability of instruction issue latency under different design constraints and to identify the impediments to performance in such wakeup-free architectures. Our results indicate that structural problems in promoting instructions to the head of the instruction queue from where they are issued in wakeup-free architectures, the limited number of candidate instructions that can be considered for instruction issue, and the resource conflicts due to non-availability of issue ports all have a significant impact in degrading the performance of broadcast-free architectures. Based on these observations, we explore an architecture that attempts to overcome the structural limitations by employing traditional selection logic and by using pre-check logic to reduce the impact of resource conflicts while still employing a wakeup-free strategy based on predicted instruction issue latencies. Finally, we improve this technique by limiting the selection logic to a small segment of the issue queue.

IEEE Computer Society Annual Symposium on VLSI | 2003

Using dynamic branch behavior for power-efficient instruction fetch

Jie S. Hu; Narayanan Vijaykrishnan; Mary Jane Irwin; Mahmut T. Kandemir

Power consumption has become an increasing concern in high performance microprocessor design in terms of packaging and cooling cost. The fetch unit, including the instruction cache, contributes a large portion of the total power consumption in the microprocessor. The instruction cache itself suffers some hidden power consumption due to dynamic control flows. Although they capture the dynamic control flows to boost performance, conventional trace caches (CTC) may increase power consumption in the fetch unit due to their simultaneous access to both the trace cache and the instruction cache. By avoiding these simultaneous accesses, sequential trace caches (STC) achieve lower power consumption, but suffer a significant performance loss at the same time. In this paper we propose a dynamic direction prediction based trace cache (DPTC), which avoids simultaneous accesses to the trace cache and the instruction cache under the guidance of fetch direction prediction. Experimental results show that the dynamic prediction based trace cache can achieve 38.5% power reduction over conventional trace caches and an additional 7.2% reduction over STC, on average, while only trading a 1.8% performance loss compared to CTC.

ACM Transactions on Architecture and Code Optimization | 2004

Reducing instruction cache energy consumption using a compiler-based strategy

Wei Zhang; Jie S. Hu; Vijay Degalahal; Mahmut T. Kandemir; Narayanan Vijaykrishnan; Mary Jane Irwin

Excessive power consumption is widely considered a major impediment to designing future microprocessors. With the continued scaling down of threshold voltages, the power consumed due to leaky memory cells in on-chip caches will constitute a significant portion of the processor's power budget. This work focuses on reducing the leakage energy consumed in the instruction cache using a compiler-directed approach. We present and analyze two compiler-based strategies termed conservative and optimistic. The conservative approach does not put a cache line into a low leakage mode until it is certain that the current instruction in it is dead. On the other hand, the optimistic approach places a cache line in low leakage mode if it detects that the next access to the instruction will occur only after a long gap. We evaluate different optimization alternatives by combining the compiler strategies with state-preserving and state-destroying leakage control mechanisms. We also evaluate the sensitivity of these optimizations to different high-level compiler transformations, energy parameters, and soft errors.

ACM Transactions on Embedded Computing Systems | 2005

Analyzing data reuse for cache reconfiguration

Jie S. Hu; Mahmut T. Kandemir; Narayanan Vijaykrishnan; Mary Jane Irwin

Classical compiler optimizations assume a fixed cache architecture and modify the program to take best advantage of it. In some cases, this may not be the best strategy because each nest might work best with a different cache configuration and transforming a nest for a given fixed cache configuration may not be possible due to data and control dependences. Working with a fixed cache configuration can also increase energy consumption in loops where the best required configuration is smaller than the default (fixed) one. In this paper, we take an alternate approach and modify the cache configuration for each nest, depending on the access pattern exhibited by the nest. We call this technique compiler-directed cache polymorphism (CDCP). More specifically, in this paper, we make the following contributions. First, we present an approach for analyzing data reuse properties of loop nests. Second, we give algorithms to simulate the footprints of array references in their reuse space. Third, based on our reuse analysis, we present an optimization algorithm to compute the cache configurations for each loop nest. Our experimental results show that CDCP is very effective in finding the near-optimal data cache configurations for different nests in array-intensive applications.