Victor Zyuban | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Victor Zyuban is active.

Explore More

Publication

Featured researches published by Victor Zyuban.

international symposium on microarchitecture | 2000

Power-aware microarchitecture: design and modeling challenges for next-generation microprocessors

David M. Brooks; Pradip Bose; Stanley E. Schuster; Hans M. Jacobson; Prabhakar Kudva; Alper Buyuktosunoglu; John-David Wellman; Victor Zyuban; Manish Gupta; Peter W. Cook

The ability to estimate power consumption during early-stage definition and trade-off studies is a key new methodology enhancement. Opportunities for saving power can be exposed via microarchitecture-level modeling, particularly through clock-gating and dynamic adaptation. In this paper we describe the approach of using energy-enabled performance simulators in early design. We examine some of the emerging paradigms in processor design and comment on their inherent power-performance characteristics.

international symposium on low power electronics and design | 1998

The energy complexity of register files

Victor Zyuban; Peter M. Kogge

Register files (RF) represent a substantial portion of the energy budget in modern processors, and are growing rapidly with the trend towards wider instruction issue. The actual access energy costs depend greatly on the register file circuitry used. This paper compares various RF circuitry techniques for their energy efficiencies, as a function of architectural parameters such as the number of registers and the number of ports. The port priority selection technique was found to be the most energy efficient. The dependence of register file access energy upon technology scaling is also studied. However, as this paper shows, it appears that none of these will be enough to prevent centralized register files from becoming the dominant power component of next-generation superscalar computers, and alternative methods for inter-instruction communication need to be developed. Split register file architecture is analyzed as a possible alternative.

IEEE Transactions on Computers | 2001

Inherently lower-power high-performance superscalar architectures

Victor Zyuban; Peter M. Kogge

In recent years, reducing power has become an important design goal for high-performance microprocessors. This work attempts to bring the power issue to the earliest phases of microprocessor development, in particular, the stage of defining a chip microarchitecture. We investigate power-optimization techniques of superscalar microprocessors at the microarchitecture level that do not compromise performance. First, major targets for power reduction are identified within microarchitecture, where power is heavily consumed or will be heavily consumed in next-generation superscalar processors. Then, a new, energy-efficient version of a multicluster microarchitecture is developed that reduces energy the identified critical design points with minimal performance impact. A methodology is developed for energy-performance optimization at the microarchitecture level that generates, for a microarchitecture, a set of energy-efficient configurations, forming a convex hull in the power-performance space. Detailed simulation of the baseline and proposed multicluster architectures has been performed using the developed optimization methodology. A comparison of the two microarchitectures, both optimized for energy efficiency, shows that the multicluster architecture is potentially up to twice as energy efficient for wide issue processors, with an advantage that Grows with the issue width. Conversely, at the same power dissipation level, the multicluster architecture supports configurations with measurably higher performance than equivalent conventional designs.

international symposium on microarchitecture | 2002

Optimizing pipelines for power and performance

Viji Srinivasan; David M. Brooks; Michael Karl Gschwind; Pradip Bose; Victor Zyuban; Philip N. Strenski; Philip G. Emma

During the concept phase and definition of next generation high-end processors, power and performance will need to be weighted appropriately to deliver competitive cost/performance. It is not enough to adopt a CPI-centric view alone in early-stage definition studies. One of the fundamental issues confronting the architect at this stage is the choice of pipeline depth and target frequency. In this paper we present an optimization methodology that starts with an analytical power-performance model to derive optimal pipeline depth for a superscalar processor. The results are validated and further refined using detailed simulation based analysis. As part of the power-modeling methodology, we have developed equations that model the variation of energy as a function of pipeline depth. Our results using a set of SPEC2000 applications show that when both power and performance are considered for optimization, the optimal clock period is around 18 FO4. We also provide a detailed sensitivity analysis of the optimal pipeline depth against key assumptions of these energy models.

international symposium on low power electronics and design | 2002

Unified methodology for resolving power-performance tradeoffs at the microarchitectural and circuit levels

Victor Zyuban; Philip N. Strenski

Evaluation of architectural tradeoffs is complicated by implications in the circuit domain which are typically not captured in the analysis but substantially affect the results. We propose a metric of hardware intensity (&eegr;), which is useful for evaluating issues that affect both circuits and architecture. Analyzing data for actual designs we show how to measure the introduced parameters and discuss variations between observed results and common theoretical assumptions. For a power-efficient design we derive relations for &eegr; and supply voltage V under progressively more general situations, and incorporate &eegr; into a prior art architectural energy-efficiency criterion. Then, a more general relation is derived for the optimal balance between the architectural complexity, hardware intensity and power supply. Modified forms for these relations are obtained in special cases where the supply voltage is constrained or when clock gating is disallowed.

IEEE Transactions on Computers | 2004

Integrated analysis of power and performance for pipelined microprocessors

Victor Zyuban; David M. Brooks; Viji Srinivasan; Michael Karl Gschwind; Pradip Bose; Philip N. Strenski; Philip G. Emma

Choosing the pipeline depth of a microprocessor is one of the most critical design decisions that an architect must make in the concept phase of a microprocessor design. To be successful in todays cost/performance marketplace, modern CPU designs must effectively balance both performance and power dissipation. The choice of pipeline depth and target clock frequency has a critical impact on both of these metrics. We describe an optimization methodology based on both analytical models and detailed simulations for power and performance as a function of pipeline depth. Our results for a set of SPEC2000 applications show that, when both power and performance are considered for optimization, the optimal clock period is around 18 FO4. We also provide a detailed sensitivity analysis of the optimal pipeline depth against key assumptions of our energy models. Finally, we discuss the potential risks in design quality for overly aggressive or conservative choices of pipeline depth.

international symposium on low power electronics and design | 2002

Low power integrated scan-retention mechanism

Victor Zyuban; Stephen V. Kosonocky

This paper presents a methodology for unifying the scan mechanism and data retention in latches which leads to scannable latches with the data retention capability achieved at a very low power overhead during the active mode. A detailed analysis of power and area overhead is presented, with layout examples for various common latch styles. Implications of using different power gating techniques for reducing leakage during sleep mode on the design of retention latches are considered, including well biasing for leakage control and sharing wells between gated logic and retention latch devices.

international solid-state circuits conference | 2010

The implementation of POWER7 TM : A highly parallel and scalable multi-core high-end server processor

Dieter Wendel; Ronald Nick Kalla; Robert Cargoni; Joachim Clables; Joshua Friedrich; Roland Frech; James Allan Kahle; Balaram Sinharoy; William J. Starke; Scott A. Taylor; Steve Weitzel; Sam Gat-Shang Chu; Saiful Islam; Victor Zyuban

The next processor of the POWER ™ family, called POWER7™ is introduced. Eight quad-threaded cores are integrated together with two memory controllers and high-speed system links on a 567mm2 die, employing 1.2B transistors in 45nm CMOS SOI technology [4]. High on-chip performance and therefore bandwidth is achieved using 11 layers of low-к copper wiring and devices with enhanced dual-stress liners. The technology features deep trench [DT] capacitors that are used to build the 32MB embedded DRAM L3 based on a 0.067µm2 DRAM cell. DT capacitors are used also to reduce on-chip voltage-island supply noise. Focusing on speed, the dual-supply ripple-domino SRAM concepts follows the schemes described elsewhere.

dependable systems and networks | 2007

A Framework for Architecture-Level Lifetime Reliability Modeling

Jeonghee Shin; Victor Zyuban; Zhigang Hu; Jude A. Rivers; Pradip Bose

This paper tackles the issue of modeling chip lifetime reliability at the architecture level. We propose a new and robust structure-aware lifetime reliability model at the architecture-level, where devices only vulnerable to failure mechanisms and the effective stress condition of these devices are taken into account for the failure rate of microarchitecture structures. In addition, we present this reliability analysis framework based on a new concept, called the FIT of reference circuit or FORC, which allows architects to quantify failure rates without having to delve into low-level circuit- and technology-specific details of the implemented architecture. This is done through a onetime characterization of a reference circuit needed to quantify the reference FITs for each class of modeled failure mechanisms for a given technology and implementation style. With this new reliability modeling framework, architects are empowered to proceed with architecture-level reliability analysis independent of technological and environmental parameters.

IEEE Journal of Solid-state Circuits | 2011

POWER7™, a Highly Parallel, Scalable Multi-Core High End Server Processor

Dieter Wendel; R Kalla; James D. Warnock; R. Cargnoni; S G Chu; J G Clabes; Daniel M. Dreps; D. Hrusecky; Joshua Friedrich; Saiful Islam; J Kahle; Jens Leenstra; Gaurav Mittal; Jose Angel Paredes; Jürgen Pille; Phillip J. Restle; Balaram Sinharoy; G Smith; W J Starke; S Taylor; J. A. Van Norstrand; Stephen Douglas Weitzel; P G Williams; Victor Zyuban

This paper gives an overview of the latest member of the POWER™ processor family, POWER7™. Eight quad-threaded cores, operating at frequencies up to 4.14 GHz, are integrated together with two memory controllers and high speed system links on a 567 mm die, employing 1.2B transistors in a 45 nm CMOS SOI technology with 11 layers of low-k copper wiring. The technology features deep trench capacitors which are used to build a 32 MB embedded DRAM L3 based on a 0.067 m DRAM cell. The functionally equivalent chip transistor count would have been over 2.7B if the L3 had been implemented with a conventional 6 transistor SRAM cell. (A detailed paper about the eDRAM implementation will be given in a separate paper of this Journal). Deep trench capacitors are also used to reduce on-chip voltage island supply noise. This paper describes the organization of the design and the features of the processor core, before moving on to discuss the circuits used for analog elements, clock generation and distribution, and I/O designs. The final section describes the details of the clocked storage elements, including special features for test, debug, and chip frequency tuning.

Explore More