Publication


Featured research published by Brian Kocoloski.


High Performance Distributed Computing | 2015

Achieving Performance Isolation with Lightweight Co-Kernels

Jiannan Ouyang; Brian Kocoloski; John R. Lange; Kevin Pedretti

Performance isolation is emerging as a requirement for High Performance Computing (HPC) applications, particularly as HPC architectures turn to in situ data processing and application composition techniques to increase system throughput. These approaches require the co-location of disparate workloads on the same compute node, each with different resource and runtime requirements. In this paper we claim that these workloads cannot be effectively managed by a single Operating System/Runtime (OS/R). Therefore, we present Pisces, a system software architecture that enables the co-existence of multiple independent and fully isolated OS/Rs, or enclaves, that can be customized to address the disparate requirements of next-generation HPC workloads. Each enclave consists of a specialized lightweight OS co-kernel and runtime, which is capable of independently managing partitions of dynamically assigned hardware resources. In contrast to other co-kernel approaches, in this work we consider performance isolation to be a primary requirement and present a novel co-kernel architecture to achieve this goal. We further present a set of design requirements necessary to ensure performance isolation, including: (1) elimination of cross OS dependencies, (2) internalized management of I/O, (3) limiting cross enclave communication to explicit shared memory channels, and (4) using virtualization techniques to provide missing OS features. The implementation of the Pisces co-kernel architecture is based on the Kitten Lightweight Kernel and Palacios Virtual Machine Monitor, two system software architectures designed specifically for HPC systems. Finally, we show that lightweight isolated co-kernels can provide better performance for HPC applications, and that isolated virtual machines are even capable of outperforming native environments in the presence of competing workloads.
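Requirement (3) above, restricting cross-enclave communication to explicit shared memory channels, can be pictured at user level. The following is a minimal sketch, not Pisces code: it stands in for a cross-enclave channel using POSIX shared memory and a single-producer/single-consumer ring, and the segment name /enclave_chan is invented for illustration.

```c
/* Sketch of an explicit shared-memory channel in the spirit of
 * Pisces' cross-enclave channels. Not Pisces code: a user-level
 * analogy built on POSIX shared memory and a single-producer/
 * single-consumer ring. "/enclave_chan" is an invented name.
 * Build: cc chan.c -lrt */
#include <fcntl.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define SLOTS 64

struct chan {
    _Atomic unsigned head;             /* next slot producer fills  */
    _Atomic unsigned tail;             /* next slot consumer drains */
    char slot[SLOTS][128];
};

static struct chan *chan_map(void)
{
    int fd = shm_open("/enclave_chan", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, sizeof(struct chan));
    return mmap(NULL, sizeof(struct chan), PROT_READ | PROT_WRITE,
                MAP_SHARED, fd, 0);
}

int main(int argc, char **argv)
{
    struct chan *c = chan_map();

    if (argc > 1 && !strcmp(argv[1], "producer")) {
        unsigned h = atomic_load(&c->head);
        if (h - atomic_load(&c->tail) < SLOTS) {   /* room in the ring? */
            snprintf(c->slot[h % SLOTS], sizeof(c->slot[0]), "msg %u", h);
            atomic_store(&c->head, h + 1);         /* publish the slot  */
        }
    } else {
        unsigned t = atomic_load(&c->tail);
        if (atomic_load(&c->head) != t) {          /* anything queued?  */
            printf("got: %s\n", c->slot[t % SLOTS]);
            atomic_store(&c->tail, t + 1);         /* release the slot  */
        }
    }
    return 0;
}
```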


Symposium on Cloud Computing | 2012

A case for dual stack virtualization: consolidating HPC and commodity applications in the cloud

Brian Kocoloski; Jiannan Ouyang; John R. Lange

With the growth of Infrastructure as a Service (IaaS) cloud providers, many have begun to seriously consider cloud services as a substrate for HPC applications. While the cloud promises many benefits for the HPC community, it currently comes with drawbacks for application performance. These performance issues are generally the result of resource contention as multiple VMs compete for the same hardware. This contention culminates in cross-VM interference whereby one VM is able to impact the performance of another. For HPC applications this interference can have a dramatic impact on scalability and performance. In order to fully support HPC applications in the cloud, services need to be available that prevent cross-VM interference and isolate HPC workloads from other users. As a means to achieve this goal, we propose a dual-stack approach to IaaS cloud services that utilizes multiple concurrent VMMs on each node capable of partitioning local resources in order to provide performance isolation. Each partition can then be managed by a specialized VMM that is designed specifically for either an HPC or commodity environment. In this paper we demonstrate the use of the Palacios VMM, a virtual machine monitor specifically designed for HPC, in concert with KVM to provide a partitioned cloud platform that is capable of hosting both commodity and HPC applications on a single node without interference. Furthermore, our results demonstrate that running KVM and Palacios in parallel allows an HPC application to achieve isolated and scalable performance while sharing hardware resources with commodity VMs.
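The partitioning step that hands node resources to a second VMM can be pictured with Linux's standard CPU-hotplug interface: taking a core offline removes it from the host scheduler, leaving it free for another manager to claim. A minimal sketch, assuming root privileges and an illustrative choice of cores 4-7; the Palacios-side claim of the cores is not shown.

```c
/* Sketch: remove cores from the Linux host so a second VMM could
 * manage them. Uses the standard CPU-hotplug sysfs interface and
 * requires root. The HPC-side VMM claiming the cores is not shown. */
#include <stdio.h>

static int cpu_offline(int cpu)
{
    char path[64];
    snprintf(path, sizeof(path),
             "/sys/devices/system/cpu/cpu%d/online", cpu);

    FILE *f = fopen(path, "w");
    if (!f)
        return -1;                 /* no such CPU or no permission */
    fputs("0", f);                 /* "0" = offline, "1" = online  */
    return fclose(f);
}

int main(void)
{
    /* Illustrative partition: give cores 4-7 to the HPC-side VMM. */
    for (int cpu = 4; cpu <= 7; cpu++)
        if (cpu_offline(cpu) == 0)
            printf("cpu%d offlined for the HPC partition\n", cpu);
    return 0;
}
```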


International Workshop on Runtime and Operating Systems for Supercomputers | 2012

Better than native: using virtualization to improve compute node performance

Brian Kocoloski; John R. Lange

Modified variants of Linux are likely to be the underlying operating systems for future exascale platforms. Despite the many advantages of this approach, a subset of applications exist in which a lightweight kernel (LWK) based OS is needed and/or preferred. We contend that virtualization is capable of supporting LWKs as virtual machines (VMs) running at scale on top of a Linux environment. Furthermore, we claim that a properly designed virtual machine monitor (VMM) can provide an isolated and independent environment that avoids the overheads of the Linux host OS. To validate the feasibility of this approach we demonstrate that given a Linux host OS, benchmarks running in a virtualized LWK environment are capable of outperforming the same benchmarks executed directly on the Linux host.


International Workshop on Runtime and Operating Systems for Supercomputers | 2015

System-Level Support for Composition of Applications

Brian Kocoloski; John R. Lange; Hasan Abbasi; David E. Bernholdt; Terry Jones; Jai Dayal; Noah Evans; Michael Lang; Jay F. Lofstead; Kevin Pedretti; Patrick G. Bridges

Current HPC system software lacks support for emerging application deployment scenarios that combine one or more simulations with in situ analytics, sometimes called multi-component or multi-enclave applications. This paper presents an initial design study, implementation, and evaluation of mechanisms supporting composite multi-enclave applications in the Hobbes exascale operating system. These mechanisms include virtualization techniques that isolate custom application enclaves while retaining the vendor-supplied host operating system, along with high-performance inter-VM communication mechanisms. Our initial single-node performance evaluation of these mechanisms on multi-enclave science applications, both real and proxy, demonstrates the ability to support multi-enclave HPC job composition with minimal performance overhead.


International Parallel and Distributed Processing Symposium | 2014

HPMMAP: Lightweight Memory Management for Commodity Operating Systems

Brian Kocoloski; John R. Lange

Linux-based operating systems and runtimes (OS/Rs) have emerged as the environments of choice for the majority of modern HPC systems. While Linux-based OS/Rs have advantages such as extensive feature sets as well as developer familiarity, these features come at the cost of additional overhead throughout the system. In contrast to Linux, there is a substantial history of work in the HPC community focused on lightweight OS/R architectures that provide scalable and consistent performance for tightly coupled HPC applications, but lack many of the features offered by commodity OS/Rs. In this paper, we propose to bridge the gap between LWKs and commodity OS/Rs by selectively providing a lightweight memory subsystem for HPC applications in a commodity OS/R environment. Our system, HPMMAP, provides isolated and low-overhead memory performance transparently to HPC applications by bypassing Linux's memory management layer. Our approach is dynamically configurable at runtime, adds no overhead, and requires no resources when not in use. We show that HPMMAP can decrease variance and reduce application runtime by up to 50%.
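HPMMAP itself operates inside the kernel, so no user-level snippet reproduces it; as a rough analogy only, a "lightweight memory path" can be approximated by mapping and pre-faulting one large region up front and bump-allocating from it, keeping page faults and syscalls out of the application's steady state. The pool size and alignment below are arbitrary.

```c
/* Rough user-level analogy of a lightweight memory path: one large,
 * pre-faulted mapping plus a bump allocator, so steady-state
 * allocations take no syscalls and no page faults. HPMMAP itself
 * works inside the kernel, below the Linux memory manager. */
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>

#define POOL_BYTES (256UL << 20)   /* 256 MiB, arbitrary */

static uint8_t *pool;
static size_t   used;

static void *pool_alloc(size_t n)
{
    n = (n + 63) & ~63UL;          /* 64-byte alignment */
    if (used + n > POOL_BYTES)
        return NULL;
    void *p = pool + used;
    used += n;                     /* no kernel involvement */
    return p;
}

int main(void)
{
    /* MAP_POPULATE pre-faults every page now, moving all paging cost
     * out of the application's timestep loop. */
    pool = mmap(NULL, POOL_BYTES, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE, -1, 0);
    if (pool == MAP_FAILED)
        return 1;

    double *field = pool_alloc(1024 * sizeof(double));
    field[0] = 3.14;
    printf("field allocated at %p, fault-free\n", (void *)field);
    return 0;
}
```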


IEEE International Conference on High Performance Computing, Data, and Analytics | 2013

Improving compute node performance using virtualization

Brian Kocoloski; John R. Lange

Modified variants of Linux are likely to be the underlying operating systems (OSs) for future exascale platforms. Despite the many advantages of this approach, a subset of applications exist in which a lightweight kernel (LWK)-based OS is needed and/or preferred. We contend that virtualization is capable of supporting LWKs as virtual machines (VMs) running at scale on top of a Linux environment. Furthermore, we claim that a properly designed virtual machine monitor (VMM) can provide an isolated and independent environment that avoids the overheads of the Linux host OS. To validate the feasibility of this approach we demonstrate that given a Linux host OS, benchmarks running in a virtualized LWK environment are capable of outperforming the same benchmarks executed directly on the Linux host.


International Conference on Cluster Computing | 2016

A Case for Criticality Models in Exascale Systems

Brian Kocoloski; Leonardo Piga; Wei Huang; Indrani Paul; John R. Lange

Performance variation is a significant problem for large scale HPC systems and will increase on future exascale systems. In this work, we show that performance variation impacts the performance and energy efficiency of contemporary large-scale computing systems in highly temporally inconsistent ways. We thus present a case for criticality models, a learning-based mechanism that allows a system to generate holistic models of performance variation as it occurs during application runtime. Criticality models are designed to provide a mechanism by which applications can detect performance variation at runtime and take action to mitigate its effects. We present a promising preliminary analysis of criticality models on a small-scale cluster. Our results demonstrate that models based on logistic regression can accurately model criticality at this scale.
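A criticality model of the kind described can be sketched in miniature with logistic regression, the technique the paper evaluates. The two features (normalized iteration slowdown, low-frequency residency) and the training data below are invented for illustration and do not reflect the paper's feature set.

```c
/* Miniature logistic-regression criticality model. Features and
 * training data are invented for illustration; the paper's feature
 * set and training procedure are more elaborate.
 * Build: cc model.c -lm */
#include <math.h>
#include <stdio.h>

#define N 6   /* observations */
#define D 2   /* features: {iteration slowdown, low-freq residency} */

static double sigmoid(double z) { return 1.0 / (1.0 + exp(-z)); }

int main(void)
{
    /* x: per-rank features; y: 1 if the rank was on the critical path */
    double x[N][D] = { {0.05, 0.1}, {0.10, 0.2}, {0.15, 0.1},
                       {0.80, 0.7}, {0.90, 0.9}, {0.70, 0.8} };
    double y[N]    = { 0, 0, 0, 1, 1, 1 };
    double w[D] = {0}, b = 0, lr = 0.5;

    for (int epoch = 0; epoch < 2000; epoch++) {
        double gw[D] = {0}, gb = 0;
        for (int i = 0; i < N; i++) {
            double p = sigmoid(w[0]*x[i][0] + w[1]*x[i][1] + b);
            double e = p - y[i];                 /* gradient of the loss */
            for (int j = 0; j < D; j++) gw[j] += e * x[i][j];
            gb += e;
        }
        for (int j = 0; j < D; j++) w[j] -= lr * gw[j] / N;
        b -= lr * gb / N;
    }

    /* Query: is a rank with 60% slowdown, 50% residency critical? */
    printf("P(critical) = %.3f\n", sigmoid(w[0]*0.6 + w[1]*0.5 + b));
    return 0;
}
```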


IEEE Transactions on Parallel and Distributed Systems | 2016

Lightweight Memory Management for High Performance Applications in Consolidated Environments

Brian Kocoloski; John R. Lange

Linux-based operating systems and runtimes (OS/Rs) have emerged as the environments of choice for the majority of HPC systems. While Linux-based OS/Rs have advantages such as extensive feature sets and developer familiarity, these features come at the cost of additional system overhead. In contrast to Linux, there is a substantial history of work in the HPC community focused on lightweight OS/Rs that provide scalable and consistent performance for HPC applications, but lack many of the features offered by commodity OS/Rs. In this paper, we propose to bridge the gap between LWKs and commodity OS/Rs by selectively providing a lightweight memory subsystem for HPC applications in a commodity OS/R where concurrently executing a diverse range of workloads is commonplace. Our system, HPMMAP, provides lightweight memory performance transparently to HPC applications by bypassing Linux's memory management layer. Using HPMMAP, HPC applications achieve consistent performance while the same local compute nodes execute competing workloads likely to be found in HPC clusters and "in-situ" workload deployments. Our approach is dynamically configurable at runtime, and requires no resources when not in use. We show that HPMMAP can decrease variance and reduce application runtime by up to 50 percent when executing a co-located competing commodity workload.


Proceedings of the 2015 International Symposium on Memory Systems | 2015

Implications of Memory Interference for Composed HPC Applications

Brian Kocoloski; Yuyu Zhou; Bruce R. Childers; John R. Lange

The cost of inter-node I/O and data movement is becoming increasingly prohibitive for large scale High Performance Computing (HPC) applications. This trend is leading to the emergence of composed in situ applications that co-locate multiple components on the same node. However, these components may contend for underlying memory system resources. In this extended research abstract, we present a preliminary evaluation of the impacts of contention for shared resources in the memory hierarchy, including the last level cache (LLC) and DRAM bandwidth. We show that even modest levels of memory contention can have substantial performance implications for some benchmarks, and argue for a cross layer approach to resource partitioning and scheduling on future HPC systems.
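This kind of memory contention is straightforward to reproduce: time a streaming pass over a buffer much larger than the LLC, alone and then alongside a second copy of the loop pinned to another core. A minimal probe, with arbitrary buffer size and pass count:

```c
/* Probe for DRAM-bandwidth contention: stream over a buffer larger
 * than the LLC and report time per pass. Compare a solo run against
 * two copies pinned to different cores, e.g.
 *   taskset -c 0 ./probe &  taskset -c 1 ./probe
 * Buffer size and pass count are arbitrary placeholders. */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BYTES  (512UL << 20)           /* 512 MiB >> typical LLC */
#define PASSES 5

static double now_sec(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec + ts.tv_nsec * 1e-9;
}

int main(void)
{
    long n = BYTES / sizeof(long);
    long *buf = malloc(BYTES);
    if (!buf)
        return 1;
    for (long i = 0; i < n; i++)
        buf[i] = i;                    /* touch every page up front */

    volatile long sum = 0;
    double t0 = now_sec();
    for (int p = 0; p < PASSES; p++)
        for (long i = 0; i < n; i++)
            sum += buf[i];             /* memory-bound streaming read */
    double dt = (now_sec() - t0) / PASSES;

    printf("%.3f s/pass (~%.1f GB/s read)\n", dt, BYTES / dt / 1e9);
    free(buf);
    return 0;
}
```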


International Workshop on Runtime and Operating Systems for Supercomputers | 2016

A Cross-Enclave Composition Mechanism for Exascale System Software

Noah Evans; Kevin Pedretti; Brian Kocoloski; John R. Lange; Michael Lang; Patrick G. Bridges

As supercomputers move to exascale, the number of cores per node continues to increase, but the I/O bandwidth between nodes is increasing more slowly, so computational power outstrips I/O bandwidth. This growth, in turn, encourages moving as much of an HPC workflow as possible onto the node in order to minimize data movement. One particular method of application composition, enclaves, co-locates different operating systems and runtimes on the same node, where they communicate through in situ communication mechanisms. In this work, we describe a mechanism for communicating between composed applications. We implement this mechanism using copy-on-write in cooperation with XEMEM shared memory to provide consistent, implicitly unsynchronized communication across enclaves. We then evaluate this mechanism using a composed application and analytics between the Kitten Lightweight Kernel and Linux on top of the Hobbes Operating System and Runtime. These results show a 3% overhead compared to an application running in isolation, demonstrating the viability of this approach.
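The copy-on-write half of this mechanism has a convenient user-level analogy in fork(): the child inherits a COW view of the producer's buffer, so it reads a consistent snapshot while the producer keeps writing. The sketch below stays inside one Linux process tree and makes no claim to match the paper's kernel-level COW-plus-XEMEM implementation.

```c
/* User-level analogy for COW-based snapshot communication: fork()
 * gives the "analytics" side a copy-on-write view of the producer's
 * data, so it reads a consistent snapshot while the producer keeps
 * writing. The paper does this across enclaves with kernel COW and
 * XEMEM; this sketch stays within one process tree. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define N 1000000

int main(void)
{
    long *data = malloc(N * sizeof(long));
    for (long i = 0; i < N; i++) data[i] = 1;     /* timestep 1 */

    if (fork() == 0) {                /* "analytics enclave" */
        long sum = 0;                 /* COW view: sees timestep 1 only */
        for (long i = 0; i < N; i++) sum += data[i];
        printf("analytics saw sum = %ld (consistent snapshot)\n", sum);
        _exit(0);
    }

    /* "Simulation enclave" keeps advancing; its writes trigger COW
     * copies and never disturb the snapshot the child is reading. */
    for (long i = 0; i < N; i++) data[i] = 2;     /* timestep 2 */

    wait(NULL);
    printf("simulation moved on to timestep 2\n");
    free(data);
    return 0;
}
```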

Collaboration


Dive into Brian Kocoloski's collaborations.

Top Co-Authors

John R. Lange, University of Pittsburgh
Kevin Pedretti, Sandia National Laboratories
Jiannan Ouyang, University of Pittsburgh
Michael Lang, Los Alamos National Laboratory
Noah Evans, Sandia National Laboratories
David E. Bernholdt, Oak Ridge National Laboratory
Hasan Abbasi, Oak Ridge National Laboratory