Adrian C. Moga | Researchain

Publication

Featured research published by Adrian C. Moga.

high-performance computer architecture | 2015

High performing cache hierarchies for server workloads: Relaxing inclusion to capture the latency benefits of exclusive caches

Aamer Jaleel; Joseph Nuzman; Adrian C. Moga; Simon C. Steely; Joel S. Emer

Increasing transistor density enables adding more on-die cache real estate. However, devoting more space to the shared last-level cache (LLC) causes the memory latency bottleneck to move from memory access latency to shared cache access latency. As such, applications whose working set is larger than the smaller caches spend a large fraction of their execution time on shared cache access latency. To address this problem, this paper investigates increasing the size of smaller private caches in the hierarchy as opposed to increasing the shared LLC. Doing so improves average cache access latency for workloads whose working set fits into the larger private cache while retaining the benefits of a shared LLC. The consequence of increasing the size of private caches is to relax inclusion and build exclusive hierarchies. Thus, for the same total caching capacity, an exclusive cache hierarchy provides better cache access latency. We observe that server workloads benefit tremendously from an exclusive hierarchy with large private caches. This is primarily because large private caches accommodate the large code working sets of server workloads. For a 16-core CMP, an exclusive cache hierarchy improves server workload performance by 5-12% as compared to an equal-capacity inclusive cache hierarchy. The paper also presents directions for further research to maximize performance of exclusive cache hierarchies.

european conference on parallel processing | 2008

To Snoop or Not to Snoop: Evaluation of Fine-Grain and Coarse-Grain Snoop Filtering Techniques

Jessica Young; Srihari Makineni; Ravishankar R. Iyer; Donald Newell; Adrian C. Moga

Cache coherency protocols implemented in today's shared memory multiprocessor systems use a snooping mechanism to keep the data correct and consistent between the caches and the system memory. This requires a large number of snoops sent out on the system interconnection links. However, published research has shown that a large percentage of these snoops are not necessary or can be eliminated. To detect and eliminate these unnecessary snoops, several techniques have been proposed. But these techniques have not been evaluated using commercial server benchmarks and large caches that are common on today's server platforms. In this paper, we evaluate three popular snoop filtering techniques, namely Region Scout (RS), Region Coherence Array (RCA) and Directory Cache (DC), using four different commercial server workloads. We compare and contrast these three techniques and show how effective they are in eliminating unnecessary snoops. These techniques differ in implementation approach, and the implementation differences yield accuracy and area tradeoffs. We show that 38% to 98% of the last-level cache snoops are unnecessary in major commercial server benchmarks. With the snoop filtering techniques, we are able to eliminate 35% to 97% of the unnecessary snoops with 1-3% additional die area.

Archive | 2011

APPARATUS, METHOD, AND SYSTEM FOR IMPLEMENTING MICRO PAGE TABLES

Glenn J. Hinton; Madhavan Parthasarathy; Rajesh S. Parthasarathy; Muthukumar P. Swaminathan; Raj K. Ramanujan; David Zimmerman; Larry O. Smith; Adrian C. Moga; Scott J. Cape; Wayne A. Downer; Robert S. Chappell

Archive | 2012

PROCESSORS HAVING VIRTUALLY CLUSTERED CORES AND CACHE SLICES

Herbert H. J. Hum; Brinda Ganesh; James R. Vash; Ganesh Kumar; Leena K. Puthiyedath; Scott J. Erlanger; Eric J. DeHaemer; Adrian C. Moga; Michelle M. Sebot; Richard L. Carlson; David Bubien; Eric Delano

Archive | 2008

Method, system and apparatus for reducing memory traffic in a distributed memory system

Adrian C. Moga; Rajat Agarwal; Malcolm Mandviwalla

Archive | 2013

Allocation and write policy for a glueless area-efficient directory cache for hotly contested cache lines

Adrian C. Moga; Malcolm Mandviwalla; Vedaraman Geetha; Herbert H. J. Hum

Archive | 2010

Directory cache allocation based on snoop response information

Adrian C. Moga; Malcolm Mandviwalla; Stephen R. Van Doren

Archive | 2013

Inclusive/Non Inclusive Tracking of Local Cache Lines To Avoid Near Memory Reads On Cache Line Memory Writes Into A Two Level System Memory

Adrian C. Moga; Vedaraman Geetha; Bahaa Fahim; Robert G. Blankenship; Yen-Cheng Liu; Jeffrey D. Chamberlain; Stephen R. Van Doren

Archive | 2015

HARDWARE/SOFTWARE CO-OPTIMIZATION TO IMPROVE PERFORMANCE AND ENERGY FOR INTER-VM COMMUNICATION FOR NFVS AND OTHER PRODUCER-CONSUMER WORKLOADS

Ren Wang; Andrew J. Herdrich; Yen-Cheng Liu; Herbert H. J. Hum; Jongsoo Park; Christopher J. Hughes; Namakkal N. Venkatesan; Adrian C. Moga; Aamer Jaleel; Zeshan Chishti; Mesut A. Ergin; Jr-Shian Tsai; Alexander W. Min; Tsung-Yuan C. Tai; Christian Maciocco; Rajesh Sankaran

Archive | 2008

Obtaining data for redundant multithreading (RMT) execution

Glenn J. Hinton; Steven E. Raasch; Sebastien Hily; John G. Holm; Ronak Singhal; Avinash Sodani; Deborah T. Marr; Shubhendu S. Mukherjee; Arijit Biswas; Adrian C. Moga
