

Publications


Featured research published by Gregory A. Koenig.


Cluster Computing and the Grid | 2006

Maestro-VC: a paravirtualized execution environment for secure on-demand cluster computing

Nadir Kiyanclar; Gregory A. Koenig; William Yurcik

Virtualization, a technology first developed for partitioning the resources of mainframe computers, has seen a resurgence in popularity in the realm of commodity workstation computers. This paper introduces Maestro-VC, a system which explores a novel use of VMs as the building blocks of entire virtual clusters (VCs). Virtualization of entire clusters is beneficial because existing parallel code can run without modification in the virtual environment. At the same time, inserting a layer of software between a virtual cluster and native hardware allows for security enforcement and flexible resource management in a manner transparent to running parallel code. In this paper we describe the design and implementation of Maestro-VC, and give the results of some preliminary performance experiments.


Cluster Computing and the Grid | 2005

Clusters and security: distributed security for distributed systems

Makan Pourzandi; David Gordon; William Yurcik; Gregory A. Koenig

Large-scale commodity clusters are used in an increasing number of domains: academic, research, and industrial environments. At the same time, these clusters are exposed to an increasing number of attacks coming from public networks. Therefore, mechanisms for efficiently and flexibly managing security have now become an essential requirement for clusters. However, despite the growing importance of cluster security, this field has been only minimally addressed by contemporary cluster administration techniques. This paper presents a high-level view of existing security challenges related to clusters and proposes a structured approach for handling security in clustered servers. The goal of this paper is to identify various necessarily-distributed security services and their related characteristics as a means of enhancing cluster security.


International Parallel and Distributed Processing Symposium | 2007

Optimizing Distributed Application Performance Using Dynamic Grid Topology-Aware Load Balancing

Gregory A. Koenig; Laxmikant V. Kalé

Grid computing offers a model for solving large-scale scientific problems by uniting computational resources owned by multiple organizations to form a single cohesive resource for the duration of individual jobs. Despite the appeal of using grid computing to solve large problems, its use has been hindered by the challenges involved in developing applications that can run efficiently in grid environments. One substantial obstacle to deploying grid applications across geographically distributed resources is cross-site latency. While certain classes of applications, such as master-slave style or functional decomposition type applications, lend themselves well to running in grid environments due to inherent latency tolerance, other classes of applications, such as tightly-coupled applications in which each processor regularly communicates with its neighboring processors, represent a significant challenge to deployment on grids. In this paper, we present a dynamic load balancing technique for grid applications based on graph partitioning. This technique exploits knowledge of the topology of the grid environment to partition the computation's communication graph in such a way as to reduce the volume of cross-site communication, thus improving the performance of tightly-coupled applications that are co-allocated across distributed resources. Our technique is particularly well suited to codes from disciplines like molecular dynamics or cosmology due to the non-uniform structure of communication in these types of applications. We evaluate the effectiveness of our technique when used to optimize the execution of a tightly-coupled classical molecular dynamics code called LeanMD deployed in a grid environment.
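The topology-aware partitioning idea can be sketched as follows. The graph, edge weights, and brute-force search below are illustrative only; a real system would use a multilevel graph partitioner rather than enumeration.

```python
from itertools import combinations

# Hypothetical communication graph: vertices are work objects, weighted
# edges are message volume exchanged between them.
edges = {
    (0, 1): 10, (1, 2): 10, (2, 3): 10,  # heavy nearest-neighbor traffic
    (0, 2): 1, (1, 3): 1,                # light long-range traffic
}
nodes = {0, 1, 2, 3}

def cross_site_volume(site_a):
    """Message volume crossing the site boundary when site_a holds these vertices."""
    site_a = set(site_a)
    return sum(w for (u, v), w in edges.items() if (u in site_a) != (v in site_a))

# Brute-force the balanced 2-way split with minimum cross-site traffic.
best = min(combinations(sorted(nodes), len(nodes) // 2), key=cross_site_volume)
```

Placing the tightly-communicating chain endpoints {0, 1} on one site keeps the heavy edges local and sends only the lighter traffic across the wide-area link.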


IEEE Transactions on Computers | 2015

Utility Functions and Resource Management in an Oversubscribed Heterogeneous Computing Environment

Bhavesh Khemka; Ryan Friese; Luis Diego Briceno; Howard Jay Siegel; Anthony A. Maciejewski; Gregory A. Koenig; Chris Groër; Gene Okonski; Marcia Hilton; Rajendra Rambharos; Steve Poole

We model an oversubscribed heterogeneous computing system where tasks arrive dynamically and a scheduler maps the tasks to machines for execution. The environment and workloads are based on those being investigated by the Extreme Scale Systems Center at Oak Ridge National Laboratory. Utility functions that are designed based on specifications from the system owner and users are used to create a metric for the performance of resource allocation heuristics. Each task has a time-varying utility (importance) that the enterprise will earn based on when the task successfully completes execution. We design multiple heuristics, which include a technique to drop low utility-earning tasks, to maximize the total utility that can be earned by completing tasks. The heuristics are evaluated using simulation experiments with two levels of oversubscription. The results show the benefit of having fast heuristics that account for the importance of a task and the heterogeneity of the environment when making allocation decisions in an oversubscribed environment. The ability to drop low utility-earning tasks allows the heuristics to tolerate high oversubscription while still earning significant utility.
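A minimal sketch of the dropping idea, with made-up tasks, machines, and a hypothetical linearly decaying utility curve; the paper's actual heuristics and owner-specified utility functions are more elaborate.

```python
def utility(task, finish_time):
    """Non-increasing utility of finishing `task` at `finish_time` (illustrative curve)."""
    late = max(0.0, finish_time - task["soft_deadline"])
    return max(0.0, task["max_utility"] - task["decay_rate"] * late)

def schedule(tasks, machine_free, exec_time, drop_threshold=1.0):
    """Greedy mapper: pick the best utility-earning machine; drop hopeless tasks."""
    assignments, dropped = [], []
    for task in tasks:  # tasks in arrival order
        machine, finish = max(
            ((m, machine_free[m] + exec_time[(task["id"], m)]) for m in machine_free),
            key=lambda mf: utility(task, mf[1]),
        )
        if utility(task, finish) < drop_threshold:
            dropped.append(task["id"])      # not worth occupying a machine
        else:
            machine_free[machine] = finish
            assignments.append((task["id"], machine, finish))
    return assignments, dropped

tasks = [
    {"id": "a", "soft_deadline": 5.0, "max_utility": 10.0, "decay_rate": 2.0},
    {"id": "b", "soft_deadline": 1.0, "max_utility": 3.0, "decay_rate": 3.0},
]
machine_free = {"m1": 0.0, "m2": 0.0}
exec_time = {("a", "m1"): 4.0, ("a", "m2"): 6.0, ("b", "m1"): 2.0, ("b", "m2"): 5.0}
assignments, dropped = schedule(tasks, machine_free, exec_time)
```

Under oversubscription, task "b" can no longer earn meaningful utility on any machine, so dropping it frees capacity for tasks that still can.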


IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2013

An Analysis Framework for Investigating the Trade-Offs between System Performance and Energy Consumption in a Heterogeneous Computing Environment

Ryan Friese; Bhavesh Khemka; Anthony A. Maciejewski; Howard Jay Siegel; Gregory A. Koenig; Sarah S Powers; Marcia Hilton; Rajendra Rambharos; Gene Okonski; Stephen W. Poole

Rising costs of energy consumption and an ongoing effort for increases in computing performance are leading to a significant need for energy-efficient computing. Before systems such as supercomputers, servers, and datacenters can begin operating in an energy-efficient manner, the energy consumption and performance characteristics of the system must be analyzed. In this paper, we provide an analysis framework that will allow a system administrator to investigate the trade-offs between system energy consumption and utility earned by a system (as a measure of system performance). We model these trade-offs as a bi-objective resource allocation problem. We use a popular multi-objective genetic algorithm to construct Pareto fronts to illustrate how different resource allocations can cause a system to consume significantly different amounts of energy and earn different amounts of utility. We demonstrate our analysis framework using real data collected from online benchmarks, and further provide a method to create larger data sets that exhibit similar heterogeneity characteristics to real data sets. This analysis framework can provide system administrators with insight to make intelligent scheduling decisions based on the energy and utility needs of their systems.
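The bi-objective trade-off can be illustrated with a tiny Pareto-front filter. The (energy, utility) points below are invented; the paper searches the allocation space with a multi-objective genetic algorithm rather than filtering a fixed set.

```python
def pareto_front(points):
    """Non-dominated (energy, utility) allocations: minimize energy, maximize utility."""
    def dominates(b, a):
        no_worse = b[0] <= a[0] and b[1] >= a[1]
        strictly_better = b[0] < a[0] or b[1] > a[1]
        return no_worse and strictly_better
    return [a for a in points if not any(dominates(b, a) for b in points)]

# Invented (energy, utility) outcomes of four candidate resource allocations.
candidates = [(100, 50), (120, 80), (150, 80), (90, 20)]
front = pareto_front(candidates)
```

The allocation (150, 80) is dominated: (120, 80) earns the same utility for less energy, so an administrator would never prefer it. The surviving front exposes the genuine energy-versus-utility choices.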


IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2011

Time Utility Functions for Modeling and Evaluating Resource Allocations in a Heterogeneous Computing System

Luis Diego Briceno; Bhavesh Khemka; Howard Jay Siegel; Anthony A. Maciejewski; Christopher S Groer; Gregory A. Koenig; Gene Okonski; Stephen W. Poole

This study considers a heterogeneous computing system and corresponding workload being investigated by the Extreme Scale Systems Center (ESSC) at Oak Ridge National Laboratory (ORNL). The ESSC is part of a collaborative effort between the Department of Energy (DOE) and the Department of Defense (DoD) to deliver research, tools, software, and technologies that can be integrated, deployed, and used in both DOE and DoD environments. The heterogeneous system and workload described here are representative of a prototypical computing environment being studied as part of this collaboration. Each task can exhibit a time-varying importance or utility to the overall enterprise. In this system, an arriving task has an associated priority and precedence. The priority is used to describe the importance of a task, and precedence is used to describe how soon the task must be executed. These two metrics are combined to create a utility function curve that indicates how valuable it is for the system to complete a task at any given moment. This research focuses on using time-utility functions to generate a metric that can be used to compare the performance of different resource schedulers in a heterogeneous computing system. The contributions of this paper are: (a) a mathematical model of a heterogeneous computing system where tasks arrive dynamically and need to be assigned based on their priority, precedence, utility characteristic class, and task execution type, (b) the use of priority and precedence to generate time-utility functions that describe the value a task has at any given time, (c) the derivation of a metric based on the total utility gained from completing tasks to measure the performance of the computing environment, and (d) a comparison of the performance of resource allocation heuristics in this environment.
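The priority/precedence-to-utility mapping might look like the following sketch, where an exponential decay stands in for the paper's utility curves; the functional form and parameters here are assumptions, not the paper's.

```python
import math

def make_tuf(priority, precedence):
    """Hypothetical time-utility function: priority sets the initial value,
    precedence sets how fast utility decays as completion is delayed."""
    def tuf(delay):
        return priority * math.exp(-precedence * delay)
    return tuf

urgent = make_tuf(priority=8.0, precedence=0.5)    # must run soon
relaxed = make_tuf(priority=8.0, precedence=0.05)  # same importance, less urgent
```

A scheduler's score under this model would then be the sum of tuf(completion delay) over all completed tasks, which is the kind of total-utility metric the paper derives.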


Cluster Computing and the Grid | 2005

Cluster security with NVisionCC: process monitoring by leveraging emergent properties

Gregory A. Koenig; Xin Meng; Adam J. Lee; Michael Treaster; Nadir Kiyanclar; William Yurcik

We have observed that supercomputing clusters made up of commodity off-the-shelf computers possess emergent properties that are apparent when these systems are considered as an indivisible entity rather than as a collection of independent nodes. By exploiting predictable characteristics inherent to supercomputing clusters coupled with these emergent properties, we propose several mechanisms by which cluster security may be enhanced. In this paper, we describe NVisionCC, a cluster security tool that monitors processes across cluster nodes and raises alerts when deviations from a predefined profile of expected processes are noted. Additionally, we demonstrate that the monitoring infrastructure used by NVisionCC incurs a negligible performance penalty on the computational and network resources of the cluster.
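In the spirit of this profile-based monitoring, a minimal sketch; the expected-process profile, node name, and process names are made up, and NVisionCC's real checks are richer than a set comparison.

```python
# Hypothetical expected-process profile for compute nodes; a real profile
# would be derived from the cluster's configuration.
EXPECTED = {"sshd", "slurmd", "gmond"}

def check_node(node, running):
    """Compare a node's reported process set against the profile and build alerts."""
    running = set(running)
    alerts = [f"{node}: unexpected process '{p}'" for p in sorted(running - EXPECTED)]
    alerts += [f"{node}: expected process '{p}' not running" for p in sorted(EXPECTED - running)]
    return alerts

alerts = check_node("node17", ["sshd", "slurmd", "gmond", "xmrig"])
```

Because compute nodes in a cluster run nearly identical workloads, the expected set is small and stable, which is exactly the emergent regularity the paper exploits.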


Informatik Spektrum | 2015

Electrical Grid and Supercomputing Centers: An Investigative Analysis of Emerging Opportunities and Challenges

Natalie J. Bates; Girish Ghatikar; Ghaleb Abdulla; Gregory A. Koenig; Sridutt Bhalachandra; Mehdi Sheikhalishahi; Tapasya Patki; Barry Rountree; Stephen W. Poole

Some of the largest supercomputing centers (SCs) in the United States are developing new relationships with their electricity service providers (ESPs). These relationships, similar to other commercial and industrial partnerships, are driven by a mutual interest to reduce energy costs and improve electrical grid reliability. While SCs are concerned about the quality, cost, environmental impact, and availability of electricity, ESPs are concerned about electrical grid reliability, particularly in terms of energy consumption, peak power demands, and power fluctuations. The power demand for SCs can be 20 MW or more – the theoretical peak power requirements are greater than 45 MW – and recurring intra-hour variability can exceed 8 MW. As a result of this, ESPs may request large SCs to engage in demand response and grid integration. This paper evaluates today’s relationships, potential partnerships, and possible integration between SCs and their ESPs. The paper uses feedback from a questionnaire submitted to supercomputing centers on the Top100 List in the United States to describe opportunities for overcoming the challenges of HPC-grid integration.


International Parallel and Distributed Processing Symposium | 2005

Using message-driven objects to mask latency in grid computing applications

Gregory A. Koenig; Laxmikant V. Kalé

One of the attractive features of grid computing is that resources in geographically distant places can be mobilized to meet computational needs as they arise. A particularly challenging issue is that of executing a single application across multiple machines that are separated by large distances. While certain classes of applications such as pipeline style or master-slave style applications may run well in grid computing environments with little or no modification, tightly-coupled applications require significant work to achieve good performance. In this paper, we demonstrate that message-driven objects, implemented in the Charm++ and Adaptive MPI systems, can be used to mask the effects of latency in grid computing environments without requiring modification of application software. We examine a simple five-point stencil decomposition application as well as a more complex molecular dynamics application running in an environment in which arbitrary artificial latencies can be induced between pairs of nodes. Performance of the applications running under artificial latencies is compared to the performance of the applications running across TeraGrid nodes located at the National Center for Supercomputing Applications and Argonne National Laboratory.
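The latency-masking effect of message-driven execution can be illustrated with a toy event-loop simulation; this is plain asyncio, not Charm++, and the object names and latencies are invented.

```python
import asyncio

# Each "chare" computes when its message arrives; while one object's remote
# message is in flight, the single-threaded scheduler runs the others, so
# the processor never sits idle waiting on the slow link.
async def chare(name, latency, log):
    for step in range(2):
        await asyncio.sleep(latency)       # simulated message delay
        log.append(f"{name} step {step}")  # work triggered by the message

async def main():
    log = []
    await asyncio.gather(
        chare("far", 0.05, log),    # high cross-site latency
        chare("near", 0.01, log),   # low local latency
    )
    return log

order = asyncio.run(main())
```

The local object makes progress during the remote object's long waits, which is the overlap of computation and communication that lets tightly-coupled codes tolerate cross-site latency.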


Symposium on Computer Architecture and High Performance Computing | 2010

A Clock Synchronization Strategy for Minimizing Clock Variance at Runtime in High-End Computing Environments

Terry Jones; Gregory A. Koenig

We present a new software-based clock synchronization scheme that provides high precision time agreement among distributed memory nodes. The technique is designed to minimize variance from a reference chimer during runtime and with minimal time-request latency. Our scheme permits initial unbounded variations in time and corrects both slow and fast chimers (clock skew). An implementation developed within the context of the MPI message passing interface is described and time coordination measurements are presented. Among our results, the mean time variance among a set of nodes improved from 20.0 milliseconds under standard Network Time Protocol (NTP) to 2.29 microseconds under our scheme.
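A simplified, Cristian-style offset estimate with latency compensation gives the flavor; the paper's scheme additionally corrects clock skew and runs inside MPI, so this sketch is not their protocol, and the simulated clocks below are invented.

```python
def estimate_offset(local_clock, reference_clock, samples=8):
    """Estimate (reference - local), discounting half the round-trip latency."""
    best = None
    for _ in range(samples):
        t0 = local_clock()
        t_ref = reference_clock()          # time request to the reference chimer
        t1 = local_clock()
        rtt = t1 - t0
        offset = t_ref - (t0 + rtt / 2.0)  # assume symmetric network delay
        if best is None or rtt < best[0]:  # trust the lowest-latency sample
            best = (rtt, offset)
    return best[1]

# Simulated environment (illustrative): the reference runs 5 s ahead of the
# local clock, each local read costs 1 ms, and a reference request costs 2 ms.
sim = {"t": 0.0}

def local_clock():
    sim["t"] += 0.001
    return sim["t"]

def reference_clock():
    sim["t"] += 0.002
    return sim["t"] + 5.0

offset = estimate_offset(local_clock, reference_clock)
```

Keeping only the lowest-latency sample bounds the error contributed by the time request itself, which is why minimizing time-request latency matters for the achievable variance.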

Collaboration


Dive into Gregory A. Koenig's collaborations.

Top Co-Authors

Sudeep Pasricha, Colorado State University
Bhavesh Khemka, Colorado State University
Stephen W. Poole, Oak Ridge National Laboratory
Ryan Friese, Colorado State University
Terry Jones, Oak Ridge National Laboratory
Adam J. Lee, University of Pittsburgh
Sarah S Powers, Oak Ridge National Laboratory
Dylan Machovec, Colorado State University