Ryan N. Rakvic | Researchain

Publication

Featured research published by Ryan N. Rakvic.

International Symposium on Computer Architecture | 2006

Multiple Instruction Stream Processor

Richard A. Hankins; Gautham N. Chinya; Jamison D. Collins; Perry H. Wang; Ryan N. Rakvic; Hong Wang; John Paul Shen

Microprocessor design is undergoing a major paradigm shift towards multi-core designs, in anticipation that future performance gains will come from exploiting thread-level parallelism in the software. To support this trend, we present a novel processor architecture called the Multiple Instruction Stream Processing (MISP) architecture. MISP introduces the sequencer as a new category of architectural resource, and defines a canonical set of instructions to support user-level inter-sequencer signaling and asynchronous control transfer. MISP allows an application program to directly manage user-level threads without OS intervention. By supporting the classic cache-coherent shared-memory programming model, MISP does not require a radical shift in the multithreaded programming paradigm. This paper describes the design and evaluation of the MISP architecture for the IA-32 family of microprocessors. Using a research prototype MISP processor built on an IA-32-based multiprocessor system equipped with special firmware, we demonstrate the feasibility of implementing the MISP architecture. We then examine the utility of MISP by (1) assessing the key architectural tradeoffs of the MISP architecture design and (2) showing how legacy multithreaded applications can be migrated to MISP with relative ease.

International Symposium on Microarchitecture | 2004

The Fuzzy Correlation between Code and Performance Predictability

Murali Annavaram; Ryan N. Rakvic; Marzia Polito; Jean-Yves Bouguet; Richard A. Hankins; Bob Davies

Recent studies have shown that most SPEC CPU2K benchmarks exhibit strong phase behavior, and that the Cycles per Instruction (CPI) performance metric can be accurately predicted from a program's control-flow behavior, simply by observing the sequencing of the program counters, or extended instruction pointers (EIPs). One motivation of this paper is to see whether server workloads also exhibit such phase behavior. In particular, can EIPs effectively predict CPI in server workloads? We propose using regression trees to measure the theoretical upper bound on the accuracy of predicting the CPI using EIPs, where accuracy is measured by the explained variance of CPI with EIPs. Our results show that for most server workloads and, surprisingly, even for CPU2K benchmarks, the accuracy of predicting CPI from EIPs varies widely. We classify the benchmarks into four quadrants based on their CPI variance and predictability of CPI using EIPs. Our results indicate that no single sampling technique can be broadly applied to a large class of applications. We propose a new methodology that selects the best-suited sampling technique to accurately capture the program behavior.
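
The explained-variance metric mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the paper's regression-tree method: as a simplifying assumption, each sample carries a single EIP value, and CPI is predicted as the mean CPI of all samples sharing that EIP. The function name `explained_variance` and the sample data are hypothetical, and the score is computed in-sample, without the cross-validation a real study would need.

```python
from collections import defaultdict


def explained_variance(eips, cpis):
    """Fraction of CPI variance explained by grouping samples by EIP.

    Assumption (not from the paper): one EIP per sample, and the
    predictor for a sample is the mean CPI of its EIP group.
    """
    if len(eips) != len(cpis) or not cpis:
        raise ValueError("eips and cpis must be non-empty and of equal length")

    overall_mean = sum(cpis) / len(cpis)
    total_ss = sum((c - overall_mean) ** 2 for c in cpis)
    if total_ss == 0:
        return 1.0  # CPI is constant, so nothing is left unexplained

    # Collect the CPI samples observed at each EIP.
    groups = defaultdict(list)
    for eip, cpi in zip(eips, cpis):
        groups[eip].append(cpi)

    # Residual sum of squares around each group's mean CPI.
    residual_ss = 0.0
    for values in groups.values():
        group_mean = sum(values) / len(values)
        residual_ss += sum((v - group_mean) ** 2 for v in values)

    return 1.0 - residual_ss / total_ss


if __name__ == "__main__":
    # Hypothetical samples: two code regions with distinct, stable CPI.
    print(explained_variance([0x10, 0x10, 0x20, 0x20], [1.0, 1.0, 3.0, 3.0]))
    # Hypothetical samples: one code region whose CPI varies.
    print(explained_variance([0x10, 0x10], [1.0, 3.0]))
```

A score near 1 means the EIP identifies the CPI almost exactly, and a score near 0 means CPI varies independently of the EIP; together with the overall CPI variance, these are the two axes of the quadrant classification the abstract describes.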

International Symposium on Microarchitecture | 2002

Compiler managed micro-cache bypassing for high performance EPIC processors

Youfeng Wu; Ryan N. Rakvic; Li-Ling Chen; Chyi-chang Miao; George Z. Chrysos; Jesse Fang

Advanced microprocessors have been increasing clock rates well beyond the gigahertz boundary. For such high performance microprocessors, a small and fast data micro-cache (ucache) is important to overall performance, and proper management of it via load bypassing has a significant performance impact. In this paper, we propose and evaluate a hardware-software collaborative technique to manage ucache bypassing for EPIC processors. The hardware supports the ucache bypassing with a flag in the load instruction format, and the compiler employs static analysis and profiling to identify loads that should bypass the ucache. The collaborative method achieves a significant improvement in performance for the SpecInt2000 benchmarks. On average, about 40%, 30%, 24%, and 22% of load references are identified to bypass 256 B, 1 K, 4 K, and 8 K sized ucaches, respectively. This reduces the ucache miss rates by 39%, 32%, 28%, and 26%. The number of pipeline stalls from loads to their uses is reduced by 13%, 9%, 6%, and 5%. Meanwhile, the L1 and L2 cache misses remain largely unchanged. For the 256 B ucache, bypassing improves overall performance on average by 5%.

Archive | 2011

Mechanism for monitoring instruction set based thread execution on a plurality of instruction sequencers

Richard A. Hankins; Gautham N. Chinya; Hong Wang; Shivnandan D. Kaushik; Bryant Bigbee; John Paul Shen; Trung A. Diep; Xiang Zou; Baiju V. Patel; Paul M. Petersen; Sanjiv Shah; Ryan N. Rakvic; Prashant Sethi

Archive | 2005

Load balancing for multi-threaded applications via asymmetric power throttling

Ryan N. Rakvic; Richard A. Hankins; Ed Grochowski; Hong Wang; Murali Annavaram; David K. Poulsen; Sanjiv Shah; John Paul Shen; Gautham N. Chinya

Archive | 2005

Scheduling optimizations for user-level threads

Ryan N. Rakvic; Richard A. Hankins; Hong Wang; Trung A. Diep; Xinmin Tian; Paul M. Petersen; Sanjiv Shah; John Paul Shen; Gautham N. Chinya; Shivnandan D. Kaushik; Bryant Bigbee; Baiju V. Patel; Douglas R. Armstrong

Archive | 2005

Compiler-based scheduling optimization hints for user-level threads

Shih-Wei Liao; Ryan N. Rakvic; Richard A. Hankins; Hong Wang; Gansha Wu; Guei-Yuan Lueh; Xinmin Tian; Paul M. Petersen; Sanjiv Shah; Trung A. Diep; John Paul Shen; Gautham N. Chinya

Archive | 2005

Sequencer address management

Hong Wang; Gautham N. Chinya; Richard A. Hankins; Shivnandan D. Kaushik; Bryant Bigbee; John Paul Shen; Per Hammarlund; Xiang Zou; Jason W. Brandt; Prashant Sethi; Douglas M. Carmean; Baiju V. Patel; Scott Dion Rodgers; Ryan N. Rakvic; John L. Reid; David K. Poulsen; Sanjiv Shah; James P. Held; James C. Abel

Archive | 2007