
Publication


Featured research published by Jaideep Moses.


international conference on parallel architectures and compilation techniques | 2007

CacheScouts: Fine-Grain Monitoring of Shared Caches in CMP Platforms

Li Zhao; Ravi R. Iyer; Ramesh Illikkal; Jaideep Moses; Srihari Makineni; Donald Newell

As multi-core architectures flourish in the marketplace, multi-application workload scenarios (such as server consolidation) are growing rapidly. When running multiple applications simultaneously on a platform, it has been shown that contention for shared platform resources such as the last-level cache can severely degrade performance and quality of service (QoS). But today's platforms do not have the capability to monitor shared cache usage accurately and disambiguate its effects on the performance behavior of each individual application. In this paper, we investigate low-overhead mechanisms for fine-grain monitoring of the use of shared cache resources along three vectors: (a) occupancy - how much space is being used and by whom, (b) interference - how much contention is present and who is being affected, and (c) sharing - how threads are cooperating. We propose the CacheScouts monitoring architecture, consisting of novel tagging (software-guided monitoring IDs) and sampling mechanisms (set sampling), to achieve shared cache monitoring on a per-application basis at low overhead (<0.1%) and with very little loss of accuracy (<5%). We also present case studies to show how CacheScouts can be used by operating systems (OS) and virtual machine monitors (VMMs) for (a) characterizing execution profiles, (b) optimizing scheduling for performance management, (c) providing QoS, and (d) metering for chargeback.
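
To make the set-sampling idea concrete, the following minimal Python sketch estimates per-application cache occupancy from monitoring-ID tags on a sampled subset of sets. The cache geometry, the tag-on-access policy, and the toy workloads are illustrative assumptions, not the CacheScouts hardware design.

# Minimal sketch (not the CacheScouts implementation): estimate per-application
# last-level-cache occupancy by tagging lines with a monitoring ID (RMID) and
# sampling only a subset of cache sets.  All parameters are illustrative.
import random

NUM_SETS, WAYS, LINE_BYTES = 4096, 16, 64
SAMPLED_SETS = set(range(0, NUM_SETS, 32))        # sample 1 of every 32 sets

# cache[set_index] is a list of (tag, rmid) pairs in LRU order, most recent last
cache = [[] for _ in range(NUM_SETS)]

def access(addr, rmid):
    """Simulate one cache hit/fill, recording the RMID of the owning app."""
    set_idx = (addr // LINE_BYTES) % NUM_SETS
    tag = addr // (LINE_BYTES * NUM_SETS)
    ways = cache[set_idx]
    for i, (t, _) in enumerate(ways):
        if t == tag:                               # hit: update owner and LRU order
            ways.append((tag, rmid)); ways.pop(i)
            return
    if len(ways) == WAYS:                          # miss: evict the LRU line
        ways.pop(0)
    ways.append((tag, rmid))

def estimated_occupancy_bytes(rmid):
    """Count lines owned by rmid in the sampled sets, then scale up."""
    sampled = sum(1 for s in SAMPLED_SETS for (_, r) in cache[s] if r == rmid)
    return sampled * (NUM_SETS / len(SAMPLED_SETS)) * LINE_BYTES

# Toy usage: two "applications" with different footprints sharing the cache.
for _ in range(200_000):
    access(random.randrange(8 << 20), rmid=0)      # app 0: ~8 MB footprint
    access(random.randrange(1 << 20), rmid=1)      # app 1: ~1 MB footprint
for rmid in (0, 1):
    print(f"RMID {rmid}: ~{estimated_occupancy_bytes(rmid) / (1 << 20):.2f} MB")

In the actual proposal the tags and counters live in hardware; the point of the sketch is only how sparse set sampling yields an occupancy estimate at low overhead.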


international conference on supercomputing | 2009

Rate-based QoS techniques for cache/memory in CMP platforms

Andrew J. Herdrich; Ramesh Illikkal; Ravi R. Iyer; Donald Newell; Vineet Chadha; Jaideep Moses

As we embrace the era of chip multi-processors (CMP), we are faced with two major architectural challenges: (i) QoS or performance management of disparate applications running on CPU cores contending for shared cache/memory resources, and (ii) global/local power management techniques to stay within the overall platform constraints. The problem is exacerbated as the number of cores sharing the resources in a chip increases. In the past, researchers have proposed independent solutions for these two problems. In this paper, we show that rate-based techniques that are employed to address power management can be adapted to address cache/memory QoS issues. The basic approach is to throttle down the processing rate of a core if it is running a low-priority task and its execution is interfering with the performance of a high-priority task due to platform resource contention (i.e. cache or memory contention). We evaluate two rate-throttling mechanisms (clock modulation and frequency scaling) for effectively managing the interference between applications running in a CMP platform and delivering QoS/performance management. We show that clock modulation is much more applicable to cache/memory QoS than frequency scaling, and that resource monitoring along with rate control provides effective power-performance management in CMP platforms.
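
A rough Python sketch of the rate-throttling feedback loop described above follows. The performance-counter inputs, thresholds, and duty-cycle steps are assumptions chosen only to illustrate the control structure, not the paper's actual mechanism.

# Minimal sketch of rate-based QoS (not the paper's implementation): throttle a
# low-priority core's duty cycle when a high-priority task misses its target.
# The counter inputs and all thresholds below are illustrative assumptions.

DUTY_LEVELS = [1.0, 0.875, 0.75, 0.625, 0.5, 0.375, 0.25]   # clock-modulation steps

def adjust_throttle(hp_ipc, hp_ipc_target, lp_llc_misses, lp_level):
    """Return a new duty-cycle index for the low-priority core."""
    interfering = lp_llc_misses > 1_000_000        # LP core pressures the shared cache
    if hp_ipc < 0.9 * hp_ipc_target and interfering:
        return min(lp_level + 1, len(DUTY_LEVELS) - 1)   # throttle down further
    if hp_ipc >= hp_ipc_target:
        return max(lp_level - 1, 0)                      # restore LP performance
    return lp_level

# Toy control loop over made-up samples (in a real system these would come
# from performance counters read every few milliseconds).
lp_level = 0
samples = [(0.6, 2_500_000), (0.7, 2_200_000), (1.1, 400_000), (1.2, 300_000)]
for hp_ipc, lp_misses in samples:
    lp_level = adjust_throttle(hp_ipc, hp_ipc_target=1.0,
                               lp_llc_misses=lp_misses, lp_level=lp_level)
    print(f"HP IPC {hp_ipc:.2f} -> LP duty cycle {DUTY_LEVELS[lp_level]:.3f}")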


virtual execution environments | 2007

I/O processing in a virtualized platform: a simulation-driven approach

Vineet Chadha; Ramesh Illikkal; Ravi R. Iyer; Jaideep Moses; Donald Newell; Renato J. O. Figueiredo

Virtualization provides levels of execution isolation and service partitioning that are desirable in many usage scenarios, but its associated overheads are a major impediment to wide deployment of virtualized environments. While the virtualization cost depends heavily on the workload, it has been demonstrated that the overhead is much higher for I/O-intensive workloads than for compute-intensive ones. Unfortunately, the architectural reasons behind the I/O performance overheads are not well understood. Early research in characterizing these penalties has shown that cache misses and TLB-related overheads account for most of the I/O virtualization cost. While most of these evaluations were done using measurements, in this paper we present an execution-driven, simulation-based analysis methodology with symbol annotation as a means of evaluating the performance of virtualized workloads. This methodology provides detailed information at the architectural level (with a focus on cache and TLB) and allows designers to evaluate potential hardware enhancements to reduce virtualization overhead. We apply this methodology to study the network I/O performance of Xen (as a case study) in a full-system simulation environment, using detailed cache and TLB models to profile and characterize software and hardware hotspots. By applying symbol annotation to the instruction flow reported by the execution-driven simulator, we derive function-level call-flow information. We follow the anatomy of I/O processing in a virtualized platform for network transmit and receive scenarios and demonstrate the impact of cache scaling and TLB size scaling on performance.
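
The symbol-annotation step can be illustrated with a short Python sketch that maps simulator instruction addresses to their enclosing functions via a sorted symbol table. The symbol names, addresses, and trace below are hypothetical, not taken from the paper or from Xen.

# Minimal sketch of symbol annotation (an assumption about the general technique,
# not the authors' tool): map raw instruction addresses emitted by an execution-
# driven simulator to function names so that function-level call flow and
# per-function cache/TLB statistics can be derived.
from bisect import bisect_right

# Symbol table: (start_address, name), e.g. parsed from a sorted symbol dump.
symbols = sorted([
    (0xc0100000, "startup_32"),
    (0xc0234a00, "netif_rx"),
    (0xc0235100, "net_rx_action"),
    (0xc02f8800, "hypercall_entry"),      # hypothetical symbol names
])
starts = [addr for addr, _ in symbols]

def annotate(pc):
    """Return the name of the function containing program counter pc."""
    i = bisect_right(starts, pc) - 1
    return symbols[i][1] if i >= 0 else "<unknown>"

# Toy instruction stream: collapse consecutive addresses into a function-level flow.
trace = [0xc0235104, 0xc0235130, 0xc0234a10, 0xc02f8810, 0xc0234a40]
flow, last = [], None
for pc in trace:
    fn = annotate(pc)
    if fn != last:
        flow.append(fn)
        last = fn
print(" -> ".join(flow))   # net_rx_action -> netif_rx -> hypercall_entry -> netif_rx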


measurement and modeling of computer systems | 2009

Virtual platform architectures for resource metering in datacenters

Ravi R. Iyer; Ramesh Illikkal; Li Zhao; Don Newell; Jaideep Moses

With cloud and utility computing models gaining significant momentum, data centers are increasingly employing virtualization and consolidation as a means to support a large number of disparate applications running simultaneously on a CMP server. In such environments, it is important to meter the usage of resources by each datacenter application so that customers can be charged accordingly. In this paper, we describe a simple metering and chargeback model (pay-as-you-go) and describe a solution based on virtual platform architectures (VPA) to accurately meter visible as well as transparent resources.
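
As a rough illustration of a pay-as-you-go metering model (the resource set and per-unit rates below are assumptions for illustration, not the VPA proposal itself), a small Python sketch might accumulate per-VM usage samples for visible and transparent resources and bill them at fixed rates.

# Minimal chargeback sketch: fold monitoring samples into per-VM totals, then
# price them.  Rates, resources, and sample values are illustrative only.
from collections import defaultdict

RATES = {                        # hypothetical prices per accumulated unit
    "cpu_core_hours": 0.05,
    "llc_mb_hours": 0.01,        # "transparent" resource: shared-cache occupancy
    "mem_bw_gb": 0.002,
}

usage = defaultdict(lambda: defaultdict(float))

def record_sample(vm, cpu_cores, llc_mb, mem_bw_gbps, interval_hours):
    """Fold one monitoring sample into the VM's running totals."""
    usage[vm]["cpu_core_hours"] += cpu_cores * interval_hours
    usage[vm]["llc_mb_hours"] += llc_mb * interval_hours
    usage[vm]["mem_bw_gb"] += mem_bw_gbps * 3600 * interval_hours / 8  # Gb/s -> GB

def invoice(vm):
    return sum(usage[vm][k] * RATES[k] for k in RATES)

# Toy usage: two VMs sampled once per hour for three hours.
for _ in range(3):
    record_sample("vm-A", cpu_cores=2.0, llc_mb=6.0, mem_bw_gbps=1.6, interval_hours=1)
    record_sample("vm-B", cpu_cores=0.5, llc_mb=1.0, mem_bw_gbps=0.2, interval_hours=1)
for vm in ("vm-A", "vm-B"):
    print(f"{vm}: ${invoice(vm):.2f}")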


international symposium on performance analysis of systems and software | 2009

CMPSched$im: Evaluating OS/CMP interaction on shared cache management

Jaideep Moses; Konstantinos Aisopos; Aamer Jaleel; Ravi R. Iyer; Ramesh Illikkal; Donald Newell; Srihari Makineni

CMPs have now become mainstream and are growing in complexity with more cores, several shared resources (cache, memory, etc.) and the potential for additional heterogeneous elements. In order to manage these resources, it is becoming critical to optimize the interaction between the execution environment (operating systems, virtual machine monitors, etc.) and the CMP platform. Performance analysis of such OS and CMP interactions is challenging because it requires long-running full-system execution-driven simulations. In this paper, we explore an alternative approach (CMPSched$im) to evaluate the interaction of OS and CMP architectures. In particular, CMPSched$im is focused on evaluating techniques to address the shared cache management problem through better interaction between CMP hardware and operating system scheduling. CMPSched$im enables fast and flexible exploration of this interaction by combining the benefits of (a) binary instrumentation tools (Pin), (b) user-level scheduling tools (Linsched) and (c) simple core/cache simulators. In this paper, we describe CMPSched$im in detail and present case studies showing how CMPSched$im can be used to optimize OS scheduling by taking advantage of novel shared cache monitoring capabilities in the hardware. We also describe OS scheduling heuristics to improve overall system performance through resource monitoring and application classification to achieve near-optimal scheduling that minimizes the effects of contention in the shared cache of a CMP platform.
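
To illustrate the kind of contention-aware heuristic such shared-cache monitoring enables (the classification and greedy pairing below are an assumption for illustration, not CMPSched$im itself), a minimal Python sketch could pair cache-heavy threads with cache-light ones on cores that share a cache.

# Minimal sketch of contention-aware co-scheduling: rank threads by monitored
# shared-cache occupancy and pair heavy consumers with light ones.  The thread
# names and occupancy numbers are toy values, not measured data.
def schedule_pairs(occupancy_mb):
    """Greedy pairing: sort by occupancy, pair heaviest with lightest."""
    ranked = sorted(occupancy_mb, key=occupancy_mb.get, reverse=True)
    pairs = []
    while len(ranked) >= 2:
        pairs.append((ranked.pop(0), ranked.pop(-1)))   # heavy + light share a cache
    if ranked:
        pairs.append((ranked.pop(0), None))             # odd thread runs alone
    return pairs

# Toy monitored occupancies (MB of shared cache) for six threads.
occ = {"mcf": 7.5, "omnetpp": 6.0, "gcc": 2.5, "povray": 0.4, "sjeng": 0.8, "milc": 5.5}
for a, b in schedule_pairs(occ):
    print(f"co-schedule {a} with {b}")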


modeling, analysis, and simulation on computer and telecommunication systems | 2004


Jaideep Moses; Ramesh Illikkal; Ravi R. Iyer; Ram Huggahalli; Donald Newell


design, automation, and test in europe | 2012

ASPEN: towards effective simulation of threads & engines in evolving platforms

Konstantinos Aisopos; Jaideep Moses; Ramesh Illikkal; Ravishankar R. Iyer; Donald Newell



ieee international conference on high performance computing data and analytics | 2007

PCASA: probabilistic control-adjusted selective allocation for shared caches

Li Zhao; Ravi R. Iyer; Srihari Makineni; Ramesh Illikkal; Jaideep Moses; Donald Newell



Archive | 2007

Constraint-aware large-scale CMP cache design

Ramesh Kumar Illikkal; Ravishankar Iyer; Jaideep Moses; Don Newell; Tryggve Fossum



Archive | 2011

Priority based throttling for power/performance quality of service

Jaideep Moses; Rameshkumar G. Illikkal; Ravishankar Iyer; Jared E. Bendt; Sadagopan Srinivasan; Andrew J. Herdrich; Ashish V. Choubal; Avinash N. Ananthakrishnan; Vijay S.R. Degalahal

