Publication


Featured research published by David K. Lowenthal.


Frontiers in Plant Science | 2011

The iPlant Collaborative: Cyberinfrastructure for Plant Biology

Stephen A. Goff; Matthew W. Vaughn; Sheldon J. McKay; Eric Lyons; Ann E. Stapleton; Damian Gessler; Naim Matasci; Liya Wang; Matthew R. Hanlon; Andrew Lenards; Andy Muir; Nirav Merchant; Sonya Lowry; Stephen A. Mock; Matthew Helmke; Adam Kubach; Martha L. Narro; Nicole Hopkins; David Micklos; Uwe Hilgert; Michael Gonzales; Chris Jordan; Edwin Skidmore; Rion Dooley; John Cazes; Robert T. McLay; Zhenyuan Lu; Shiran Pasternak; Lars Koesterke; William H. Piel

The iPlant Collaborative (iPlant) is a United States National Science Foundation (NSF)-funded project that aims to create an innovative, comprehensive, and foundational cyberinfrastructure in support of plant biology research (PSCIC, 2006). iPlant is developing cyberinfrastructure that uniquely enables scientists throughout the diverse fields that comprise plant biology to address Grand Challenges in new ways, to stimulate and facilitate cross-disciplinary research, to promote biology and computer science research interactions, and to train the next generation of scientists on the use of cyberinfrastructure in research and education. Meeting humanity's projected demands for agricultural and forest products and the expectation that natural ecosystems be managed sustainably will require synergies from the application of information technologies. The iPlant cyberinfrastructure design is based on an unprecedented period of research community input, and leverages developments in high-performance computing, data storage, and cyberinfrastructure for the physical sciences. iPlant is an open-source project with application programming interfaces that allow the community to extend the infrastructure to meet its needs. iPlant is sponsoring community-driven workshops addressing specific scientific questions via analysis tool integration and hypothesis testing. These workshops teach researchers how to add bioinformatics tools and/or datasets into the iPlant cyberinfrastructure, enabling plant scientists to perform complex analyses on large datasets without the need to master the command line or high-performance computational services.


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2005

Using multiple energy gears in MPI programs on a power-scalable cluster

Vincent W. Freeh; David K. Lowenthal

Recently, system architects have built low-power, high-performance clusters, such as Green Destiny. The idea behind these clusters is to improve the energy efficiency of nodes. However, these clusters save power at the expense of performance. Our approach is instead to use high-performance cluster nodes that are frequency- and voltage-scalable; energy can then be saved by scaling down the CPU. Our prior work has examined the costs and benefits of executing an entire application at a single reduced frequency. This paper presents a framework for executing a single application in several frequency-voltage settings. The basic idea is to first divide programs into phases and then execute a series of experiments, with each phase assigned a prescribed frequency. During each experiment, we measure energy consumption and time and then use a heuristic to choose the assignment of frequency to phase for the next experiment. Our results show that significant energy can be saved without an undue performance penalty; in particular, our heuristic finds assignments of frequency to phase that are superior to any fixed-frequency solution. Specifically, this paper shows that more than half of the NAS benchmarks exhibit a better energy-time tradeoff using multiple gears than using a single gear. For example, IS using multiple gears uses 9% less energy and executes in 1% less time than the closest single-gear solution. Compared to no frequency scaling, multiple-gear IS uses 16% less energy while executing only 1% longer.
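
As a rough illustration of the gear-assignment idea described in this abstract, the sketch below greedily lowers one phase's frequency at a time and keeps the change only when a measured run improves energy without exceeding a time budget. The gear list, the toy power model, and the run_with_assignment() measurement stand-in are all assumptions introduced for the example; this is not the heuristic published in the paper.

```python
# Hypothetical sketch of a gear-assignment search in the spirit of the paper:
# lower one phase's frequency at a time and keep the change only if measured
# energy improves without exceeding a time budget. run_with_assignment() is a
# stand-in for executing the instrumented program and reading the power meter;
# here it just simulates plausible numbers so the example runs.

GEARS = [2000, 1800, 1600, 1400, 1200]   # candidate CPU frequencies (MHz), assumed

def run_with_assignment(assignment, phase_work=(30.0, 10.0, 20.0)):
    """Simulated measurement: returns (energy_joules, time_seconds)."""
    time = sum(w * GEARS[0] / assignment[p] for p, w in enumerate(phase_work))
    energy = sum(w * (assignment[p] / GEARS[0]) ** 2 * 100.0   # toy power model
                 for p, w in enumerate(phase_work))
    return energy, time

def search(num_phases=3, max_slowdown=1.05):
    assignment = {p: GEARS[0] for p in range(num_phases)}      # start at top gear
    best_energy, base_time = run_with_assignment(assignment)
    for phase in range(num_phases):
        for gear in GEARS[1:]:                                 # try lower gears
            trial = dict(assignment)
            trial[phase] = gear
            energy, time = run_with_assignment(trial)
            if energy < best_energy and time <= max_slowdown * base_time:
                assignment, best_energy = trial, energy
            else:
                break          # still lower gears for this phase won't help
    return assignment, best_energy

print(search())
```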


IEEE Transactions on Parallel and Distributed Systems | 2007

Analyzing the Energy-Time Trade-Off in High-Performance Computing Applications

Vincent W. Freeh; David K. Lowenthal; Feng Pan; Nandini Kappiah; Robert Springer; Barry Rountree; Mark Edward Femal

Although users of high-performance computing are most interested in raw performance, both energy and power consumption have become critical concerns. One approach to lowering energy and power is to use high-performance cluster nodes that have several power-performance states, so that the energy-time trade-off can be dynamically adjusted. This paper analyzes the energy-time trade-off of a wide range of applications, serial and parallel, on a power-scalable cluster. We use a cluster of frequency- and voltage-scalable AMD-64 nodes, each equipped with a power meter. We study the effects of memory and communication bottlenecks via direct measurement of time and energy. We also investigate metrics that can, at runtime, predict when each type of bottleneck occurs. Our results show that, for programs that have a memory or communication bottleneck, a power-scalable cluster can save significant energy with only a small time penalty. Furthermore, we find that, for some programs, it is possible to both consume less energy and execute in less time by increasing the number of nodes while reducing the frequency-voltage setting of each node.
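
The toy classifier below hints at how runtime metrics of the kind this abstract mentions might flag a memory or communication bottleneck and, with it, an opportunity to scale the CPU down cheaply. The specific counters, thresholds, and numbers are invented for illustration and are not the paper's metrics.

```python
# Illustrative only: a rough bottleneck classifier, assuming per-node
# measurements of last-level cache misses, retired operations, and time spent
# in MPI. Thresholds are invented for the example, not taken from the paper.

def bottleneck(llc_misses, ops_retired, mpi_time, total_time,
               mem_threshold=0.01, comm_threshold=0.25):
    miss_rate = llc_misses / ops_retired        # memory pressure per operation
    comm_frac = mpi_time / total_time           # share of time in communication
    if miss_rate > mem_threshold:
        return "memory-bound: CPU scaling should cost little time"
    if comm_frac > comm_threshold:
        return "communication-bound: CPU scaling should cost little time"
    return "CPU-bound: expect a larger time penalty from scaling down"

print(bottleneck(llc_misses=4.0e9, ops_retired=2.0e11,
                 mpi_time=18.0, total_time=60.0))
```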


Conference on High Performance Computing (Supercomputing) | 2006

Adaptive, transparent frequency and voltage scaling of communication phases in MPI programs

Min Yeol Lim; Vincent W. Freeh; David K. Lowenthal

Although users of high-performance computing are most interested in raw performance, both energy and power consumption have become critical concerns. Some microprocessors allow frequency and voltage scaling, which enables a system to reduce CPU performance and power when the CPU is not on the critical path. When properly directed, such dynamic frequency and voltage scaling can produce significant energy savings with little performance penalty. This paper presents an MPI runtime system that dynamically reduces CPU performance during communication phases in MPI programs. It dynamically identifies such phases and, without profiling or training, selects the CPU frequency in order to minimize the energy-delay product. All analysis and subsequent frequency and voltage scaling is performed within MPI and so is entirely transparent to the application. This means that the large number of existing MPI programs, as well as new ones being developed, can use our system without modification. Results show that the average reduction in energy-delay product over the NAS benchmark suite is 10%; the average energy reduction is 12%, while the average execution time increase is only 2.1%.
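
To make the mechanism concrete, here is a hypothetical Python/mpi4py fragment that drops the CPU frequency around a blocking collective and restores it afterwards, assuming Linux cpufreq with the userspace governor and root permission. The published runtime does this inside the MPI library itself, transparently to the application; this sketch only illustrates the effect at one call site.

```python
# Minimal, hypothetical illustration (not the authors' runtime): scale the CPU
# down just before a blocking MPI communication call and back up afterwards.
# Assumes Linux cpufreq with the userspace governor and root permission.

from mpi4py import MPI

SETSPEED = "/sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed"

def set_khz(khz):
    with open(SETSPEED, "w") as f:          # requires userspace governor + root
        f.write(str(khz))

def slow_allreduce(comm, sendbuf, low_khz=1200000, high_khz=2000000):
    set_khz(low_khz)                        # CPU off the critical path: scale down
    result = comm.allreduce(sendbuf, op=MPI.SUM)
    set_khz(high_khz)                       # back to full speed for computation
    return result

if __name__ == "__main__":
    comm = MPI.COMM_WORLD
    print(comm.rank, slow_allreduce(comm, comm.rank))
```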


Conference on High Performance Computing (Supercomputing) | 2007

Bounding energy consumption in large-scale MPI programs

Barry Rountree; David K. Lowenthal; Shelby Funk; Vincent W. Freeh; Bronis R. de Supinski; Martin Schulz

Power is now a first-order design constraint in large-scale parallel computing. Used carefully, dynamic voltage scaling can execute parts of a program at a slower CPU speed to achieve energy savings with a relatively small (possibly zero) time delay. However, the problem of when to change frequencies in order to optimize energy savings is NP-complete, which has led to many heuristic energy-saving algorithms. To determine how closely these algorithms approach optimal savings, we developed a system that determines a bound on the energy savings for an application. Our system uses a linear programming solver that takes as inputs the application communication trace and the cluster power characteristics and then outputs a schedule that realizes this bound. We apply our system to three scientific programs, two of which, particle simulation and UMT2K, exhibit load imbalance. Results from our bounding technique show that particle simulation is more amenable to energy savings than UMT2K.
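
A toy version of such a bounding computation is sketched below with scipy's linprog: it splits each computation block's time across a few frequency-voltage settings to minimize energy subject to work and deadline constraints. The formulation, power numbers, and workloads are invented for the example and are far simpler than the paper's trace-driven linear program.

```python
# Toy linear program in the spirit of the paper's bounding approach (not its
# actual formulation): for each computation block, split its time across
# available frequency-voltage settings so that all work finishes by a deadline
# and total energy is minimized. Power and speed numbers are made up.

import numpy as np
from scipy.optimize import linprog

freqs  = [2.0, 1.8, 1.4]          # GHz settings
power  = [95.0, 80.0, 55.0]       # watts at each setting (assumed)
work   = [120.0, 60.0]            # per-block work, in "seconds at 2.0 GHz"
deadline = 150.0                  # allowed wall-clock time per block (seconds)

n_blocks, n_freqs = len(work), len(freqs)
c = np.tile(power, n_blocks)                       # objective: total energy

A_ub, b_ub = [], []
for i in range(n_blocks):
    # work constraint: -sum_f (f/2.0) * x[i,f] <= -work[i]
    row = np.zeros(n_blocks * n_freqs)
    row[i*n_freqs:(i+1)*n_freqs] = [-f / 2.0 for f in freqs]
    A_ub.append(row); b_ub.append(-work[i])
    # time constraint: sum_f x[i,f] <= deadline
    row = np.zeros(n_blocks * n_freqs)
    row[i*n_freqs:(i+1)*n_freqs] = 1.0
    A_ub.append(row); b_ub.append(deadline)

res = linprog(c, A_ub=np.array(A_ub), b_ub=np.array(b_ub), bounds=(0, None))
print("minimum energy bound (J):", res.fun)
```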


International Parallel and Distributed Processing Symposium | 2005

Exploring the energy-time tradeoff in MPI programs on a power-scalable cluster

Vincent W. Freeh; Feng Pan; Nandini Kappiah; David K. Lowenthal; Robert Springer

Recently, energy has become an important issue in high-performance computing. For example, supercomputers designed with energy efficiency in mind, such as BlueGene/L, have been built; the idea is to improve the energy efficiency of nodes. Our approach, which uses off-the-shelf, high-performance cluster nodes that are frequency-scalable, allows energy to be saved by scaling down the CPU. This paper investigates the energy consumption and execution time of applications from a standard benchmark suite (NAS) on a power-scalable cluster. We study, via direct measurement and simulation, both intra-node and inter-node effects of memory and communication bottlenecks, respectively. Additionally, we compare energy consumption and execution time across different numbers of nodes. Our results show that a power-scalable cluster has the potential to save energy by scaling the processor down to lower energy levels. Furthermore, we found that, for some programs, it is possible to both consume less energy and execute in less time when using a larger number of nodes, each at reduced energy. Finally, we developed and validated a model that enables us to predict the energy-time tradeoff of larger clusters.
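
A back-of-the-envelope model along these general lines is sketched below: predicted time and energy for n nodes at frequency f from a frequency-sensitive compute term, a frequency-insensitive memory term, and a logarithmic communication term. Its functional form and every coefficient are invented for illustration and should not be read as the validated model from the paper.

```python
# Invented, illustrative energy-time model (not the paper's validated model).

import math

def predict(n, f, f_max=2.0, cpu_work=400.0, mem_work=100.0,
            comm_per_level=2.0, p_base=60.0, p_cpu=40.0):
    # CPU-bound work scales with frequency; memory-bound work does not.
    time = cpu_work / (n * f / f_max) + mem_work / n \
           + comm_per_level * math.log2(max(n, 2))
    power_per_node = p_base + p_cpu * (f / f_max) ** 3   # cubic in frequency
    return time, n * power_per_node * time               # (seconds, joules)

for n in (4, 8, 16):
    for f in (2.0, 1.4):
        t, e = predict(n, f)
        print(f"n={n:2d} f={f} GHz: time={t:6.1f}s energy={e/1000:6.1f} kJ")
```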


International Conference on Supercomputing | 2008

A regression-based approach to scalability prediction

Bradley J. Barnes; Barry Rountree; David K. Lowenthal; Jaxk Reeves; Bronis R. de Supinski; Martin Schulz

Many applied scientific domains are increasingly relying on large-scale parallel computation. Consequently, many large clusters now have thousands of processors. However, the ideal number of processors to use for these scientific applications varies with both the input variables and the machine under consideration, and predicting this processor count is rarely straightforward. Accurate prediction mechanisms would provide many benefits, including improving cluster efficiency and identifying system configuration or hardware issues that impede performance. We explore novel regression-based approaches to predict parallel program scalability. We use several program executions on a small subset of the processors to predict execution time on larger numbers of processors. We compare three different regression-based techniques: one based on execution time only; another that uses per-processor information only; and a third one based on the global critical path. These techniques provide accurate scaling predictions, with median prediction errors between 6.2% and 17.3% for seven applications.
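
The sketch below shows the simplest flavor of this idea, a log-log regression on execution times from small processor counts extrapolated to a larger count. The timings are invented, and the paper's stronger variants additionally use per-processor data and the global critical path.

```python
# Illustrative sketch of the execution-time-only variant: fit a power-law
# regression to timings from small training runs and extrapolate. The timings
# below are invented for the example.

import numpy as np

procs = np.array([4, 8, 16, 32])            # small training runs
times = np.array([410.0, 220.0, 125.0, 76.0])

# model: log(T) = a + b * log(p)  (power-law scaling)
b, a = np.polyfit(np.log(procs), np.log(times), 1)

def predict(p):
    return np.exp(a + b * np.log(p))

print(f"fitted exponent b = {b:.2f}")
print(f"predicted time on 256 processors: {predict(256):.1f} s")
```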


International Parallel and Distributed Processing Symposium | 2012

Beyond DVFS: A First Look at Performance under a Hardware-Enforced Power Bound

Barry Rountree; Dong H. Ahn; Bronis R. de Supinski; David K. Lowenthal; Martin Schulz

Dynamic Voltage Frequency Scaling (DVFS) has been the tool of choice for balancing power and performance in high-performance computing (HPC). With the introduction of Intel's Sandy Bridge family of processors, researchers now have a far more attractive option: user-specified, dynamic, hardware-enforced processor power bounds. In this paper, we provide a first look at this technology in the HPC environment and detail both the opportunities and potential pitfalls of using this technique to control processor power. As part of this evaluation, we measure power and performance for single-processor instances of several of the NAS Parallel Benchmarks. Additionally, we focus on the behavior of a single benchmark, MG, under several different power bounds. We quantify the well-known manufacturing variation in processor power efficiency and show that, in the absence of a power bound, this variation has no correlation to performance. We then show that execution under a power bound translates this variation in efficiency into variation in performance.
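
On a current Linux system, one common way to impose such a bound is the powercap/intel-rapl sysfs interface, which exposes the same RAPL power-limiting feature the paper studies. The sketch below assumes that interface is available (paths vary by kernel and require root) and is illustrative only, not the measurement setup used in the paper.

```python
# Minimal sketch of imposing a package power bound via the Linux
# powercap/intel-rapl sysfs interface. Requires root; the path below is an
# assumption to verify on the target machine.

RAPL = "/sys/class/powercap/intel-rapl/intel-rapl:0"

def set_power_limit(watts):
    with open(f"{RAPL}/constraint_0_power_limit_uw", "w") as f:
        f.write(str(int(watts * 1e6)))      # limit is specified in microwatts

def read_energy_joules():
    with open(f"{RAPL}/energy_uj") as f:
        return int(f.read()) / 1e6          # cumulative energy in joules

if __name__ == "__main__":
    set_power_limit(50)                     # cap the package at 50 W
    start = read_energy_joules()
    # ... run the benchmark of interest here ...
    print("energy used:", read_energy_joules() - start, "J")
```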


Passive and Active Network Measurement | 2005

New methods for passive estimation of TCP round-trip times

Bryan Veal; David K. Lowenthal

We propose two methods to passively measure and monitor changes in round-trip times (RTTs) throughout the lifetime of a TCP connection. Our first method associates data segments with the acknowledgments (ACKs) that trigger them by leveraging the TCP timestamp option. Our second method infers TCP RTT by observing the repeating patterns of segment clusters where the pattern is caused by TCP self-clocking. We evaluate the two methods using both emulated and real Internet tests.
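
The fragment below sketches the flavor of the timestamp-based method on pre-parsed packet records: it matches an outgoing segment's TSval against the TSecr echoed by a returning ACK. The record format and trace are invented for the example, and a real passive monitor, as in the paper, must pair both directions of the connection to recover the full RTT.

```python
# Simplified sketch of the TCP-timestamp idea, assuming packets are already
# parsed (e.g., from a pcap) into dicts with capture time, direction, and the
# timestamp option. This toy version measures only the monitor-to-receiver
# side of the path; it is not the paper's complete method.

def rtt_samples(packets):
    """packets: iterable of dicts with keys 'time', 'dir' ('out'/'in'),
    'tsval', 'tsecr'. Returns a list of delay samples in seconds."""
    sent = {}        # TSval -> earliest capture time of a segment carrying it
    samples = []
    for pkt in packets:
        if pkt["dir"] == "out":
            sent.setdefault(pkt["tsval"], pkt["time"])
        elif pkt["tsecr"] in sent:
            samples.append(pkt["time"] - sent.pop(pkt["tsecr"]))
    return samples

trace = [
    {"time": 0.000, "dir": "out", "tsval": 101, "tsecr": 0},
    {"time": 0.084, "dir": "in",  "tsval": 900, "tsecr": 101},
    {"time": 0.100, "dir": "out", "tsval": 102, "tsecr": 900},
    {"time": 0.187, "dir": "in",  "tsval": 901, "tsecr": 102},
]
print(rtt_samples(trace))    # -> [0.084, 0.087] (floating-point rounding aside)
```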


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2006

Minimizing execution time in MPI programs on an energy-constrained, power-scalable cluster

Robert Springer; David K. Lowenthal; Barry Rountree; Vincent W. Freeh

Recently, the high-performance computing community has realized that power is a performance-limiting factor. One reason for this is that supercomputing centers have limited power capacity, and machines are starting to hit that limit. In addition, the cost of energy has become increasingly significant, and the heat produced by higher-energy components tends to reduce their reliability. One way to reduce power (and therefore energy) requirements is to use high-performance cluster nodes that are frequency- and voltage-scalable (e.g., AMD-64 processors). The problem we address in this paper is: given a target program, a power-scalable cluster, and an upper limit for energy consumption, choose a schedule (number of nodes and CPU frequency) that simultaneously (1) satisfies the external upper limit for energy consumption and (2) minimizes execution time. There are too many schedules for an exhaustive search. Therefore, we find a schedule through a novel combination of performance modeling, performance prediction, and program execution. Using our technique, we are able to find a near-optimal schedule for all of our benchmarks in just a handful of partial program executions.
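
The schematic below frames the schedule-selection problem as a small search over (node count, frequency) pairs against an invented performance and power model, keeping the fastest schedule whose predicted energy stays under the cap. The paper's actual technique avoids exhaustive evaluation by combining modeling with a few partial program executions; all coefficients here are made up for illustration.

```python
# Schematic schedule selection under an energy cap (not the authors'
# algorithm): enumerate candidate schedules, predict time and energy with a
# toy model, and return the fastest feasible one.

import math

def predict(nodes, ghz, f_max=2.0, cpu_work=500.0, comm_per_level=3.0,
            p_base=55.0, p_cpu=45.0):
    time = cpu_work / (nodes * ghz / f_max) + comm_per_level * math.log2(max(nodes, 2))
    energy = nodes * (p_base + p_cpu * (ghz / f_max) ** 3) * time
    return time, energy

def best_schedule(energy_cap, node_options=(4, 8, 16, 32),
                  freq_options=(2.0, 1.8, 1.4)):
    feasible = []
    for n in node_options:
        for f in freq_options:
            t, e = predict(n, f)
            if e <= energy_cap:
                feasible.append((t, n, f, e))
    return min(feasible) if feasible else None   # fastest schedule under the cap

print(best_schedule(energy_cap=60_000))   # -> (time, nodes, GHz, energy)
```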

Collaboration


Top co-authors of David K. Lowenthal.

Barry Rountree | Lawrence Livermore National Laboratory

Bronis R. de Supinski | Lawrence Livermore National Laboratory

Martin Schulz | Lawrence Livermore National Laboratory

Vincent W. Freeh | North Carolina State University

Aniruddha Marathe | Lawrence Livermore National Laboratory

Xin Yuan | Florida State University