Richard Kaufmann
Hewlett-Packard
Publication
Featured research published by Richard Kaufmann.
ieee international conference on high performance computing data and analytics | 2009
Xiangyu Dong; Naveen Muralimanohar; Norman P. Jouppi; Richard Kaufmann; Yuan Xie
The scalability of future massively parallel processing (MPP) systems is challenged by high failure rates. Current hard disk drive (HDD) checkpointing results in overhead of 25% or more at the petascale. With a direct correlation between checkpoint frequency and node count, novel techniques that can take more frequent checkpoints with minimal overhead are critical to implementing a reliable exascale system. In this work, we leverage the upcoming Phase-Change Random Access Memory (PCRAM) technology and propose a hybrid local/global checkpointing mechanism after a thorough analysis of MPP system failure rates and failure sources. We propose three variants of PCRAM-based hybrid checkpointing schemes, DIMM+HDD, DIMM+DIMM, and 3D+3D, to reduce the checkpoint overhead and offer a smooth transition from conventional pure-HDD checkpointing to the ideal 3D PCRAM mechanism. The proposed pure 3D PCRAM-based mechanism can ultimately take checkpoints with less than 4% overhead on a projected exascale system.
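The paper derives its own overhead model for PCRAM hybrid checkpointing; the sketch below instead uses the classic first-order Young approximation for the optimal checkpoint interval to illustrate the underlying trade-off, i.e. why a much faster checkpoint medium (local PCRAM vs. HDD) shrinks the runtime fraction lost to checkpointing. All numbers are hypothetical, not taken from the paper.

```python
import math

def optimal_interval(checkpoint_time_s, mtbf_s):
    """Young's approximation: best interval between checkpoints."""
    return math.sqrt(2.0 * checkpoint_time_s * mtbf_s)

def overhead_fraction(checkpoint_time_s, mtbf_s):
    """Approximate fraction of runtime spent writing checkpoints."""
    return checkpoint_time_s / optimal_interval(checkpoint_time_s, mtbf_s)

# Hypothetical numbers: a 5-hour system MTBF, a 600 s HDD checkpoint
# vs. a 5 s local PCRAM checkpoint.
mtbf = 5 * 3600
print(f"HDD   overhead ~ {overhead_fraction(600, mtbf):.1%}")   # roughly 13%
print(f"PCRAM overhead ~ {overhead_fraction(5, mtbf):.1%}")     # roughly 1%
```

The square root in Young's formula is why overhead falls only with the square root of checkpoint cost: cutting checkpoint time 100x buys about a 10x overhead reduction, which is consistent in spirit with the 25%-to-under-4% improvement the abstract reports.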
high performance distributed computing | 2012
Abhishek Gupta; Laxmikant V. Kalé; Dejan S. Milojicic; Paolo Faraboschi; Richard Kaufmann; Verdi March; Filippo Gioachin; Chun Hui Suen; Bu-Sung Lee
This paper presents a scheme to optimize the mapping of HPC applications to a set of hybrid dedicated and cloud resources. First, we characterize application performance on dedicated clusters and in the cloud to obtain application signatures. Then, we propose an algorithm to match these signatures to resources such that performance is maximized and cost is minimized. Finally, we show simulation results revealing that, in a concrete scenario, our proposed scheme reduces cost by 60% at only a 10-15% performance penalty versus a non-optimized configuration. We also find that the execution overhead in the cloud can be reduced to a negligible level using thin hypervisors or OS-level containers.
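The paper's matching algorithm is not reproduced in this abstract; as a minimal sketch of the idea, the snippet below picks, from hypothetical per-platform application signatures (here reduced to measured runtimes) and hourly prices, the platform that minimizes cost subject to a performance constraint. Platform names, runtimes, and prices are all illustrative assumptions.

```python
def best_platform(signatures, prices, max_runtime_h):
    """Pick the cheapest platform that still meets the runtime cap.

    signatures: {platform: runtime in hours}  (the 'signature', simplified)
    prices:     {platform: price per hour}
    """
    feasible = {p: signatures[p] * prices[p]
                for p in signatures if signatures[p] <= max_runtime_h}
    return min(feasible, key=feasible.get) if feasible else None

# Hypothetical data: the dedicated cluster is fastest but pricier per hour.
runtimes = {"cluster": 2.0, "cloud_thin_vm": 5.0, "cloud_heavy_vm": 9.0}
prices   = {"cluster": 4.0, "cloud_thin_vm": 0.5, "cloud_heavy_vm": 0.5}

print(best_platform(runtimes, prices, max_runtime_h=6.0))  # cloud_thin_vm
```

With a 6-hour deadline the thin-VM cloud wins on cost ($2.50 vs. $8.00) even though the cluster is faster, mirroring the abstract's finding that accepting a modest performance penalty can cut cost sharply.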
ieee international conference on cloud computing technology and science | 2013
Abhishek Gupta; Laxmikant V. Kalé; Filippo Gioachin; Verdi March; Chun Hui Suen; Bu-Sung Lee; Paolo Faraboschi; Richard Kaufmann; Dejan S. Milojicic
Cloud computing is emerging as an alternative to supercomputers for some of the high-performance computing (HPC) applications that do not require a fully dedicated machine. With cloud as an additional deployment option, HPC users are faced with the challenges of dealing with highly heterogeneous resources, where the variability spans a wide range of processor configurations, interconnects, virtualization environments, and pricing rates and models. In this paper, we take a holistic viewpoint to answer the question: why and who should choose cloud for HPC, for what applications, and how should cloud be used for HPC? To this end, we perform a comprehensive performance evaluation and analysis of a set of benchmarks and complex HPC applications on a range of platforms, varying from supercomputers to clouds. Further, we demonstrate HPC performance improvements in the cloud using alternative lightweight virtualization mechanisms - thin VMs and OS-level containers - and hypervisor- and application-level CPU affinity. Next, we analyze the economic aspects and business models for HPC in clouds, an important area that we believe has not been sufficiently addressed by past research. Overall, results indicate that current public clouds are cost-effective only at small scale for the chosen HPC applications, when considered in isolation, but can complement supercomputers using business models such as cloud burst and application-aware mapping.
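To make the "cost-effective only at small scale" conclusion concrete, here is a toy cost model (all parameters hypothetical, not from the paper): if a slower cloud interconnect makes parallel efficiency degrade faster with node count than on a supercomputer, the cheaper per-node cloud price wins at small scale but loses at large scale.

```python
def cost_to_solution(work_h, nodes, price_per_node_h, efficiency):
    """Total cost = nodes * wall-clock hours * price per node-hour."""
    wall_clock_h = work_h / (nodes * efficiency)
    return nodes * wall_clock_h * price_per_node_h

def cloud_eff(nodes):   # slow interconnect: efficiency drops quickly
    return 1.0 / (1.0 + 0.05 * nodes)

def super_eff(nodes):   # fast interconnect: near-linear scaling
    return 1.0 / (1.0 + 0.005 * nodes)

work = 1000.0  # node-hours of compute on one ideal node
for n in (4, 256):
    cloud = cost_to_solution(work, n, 0.5, cloud_eff(n))
    sc    = cost_to_solution(work, n, 2.0, super_eff(n))
    print(f"{n:3d} nodes: cloud ${cloud:7.0f}  supercomputer ${sc:7.0f}")
```

At 4 nodes the cloud run is far cheaper despite its lower efficiency; at 256 nodes the efficiency loss dominates and the supercomputer becomes the cheaper option, which is the crossover behavior the abstract describes.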
7th Open Cirrus Summit | 2012
Nigel Cook; Dejan S. Milojicic; Richard Kaufmann; Joel Sevinsky
Because of inexpensive, on-demand resources, cloud computing is a promising platform for scientific HPC applications such as gene sequencing. However, it also poses challenges to users and developers: running and maintaining HPC applications in the cloud is low-level, complex work for scientists. This hurts the reusability and reproducibility of their work and increases the cost of development and maintenance. N3phele is a cloud-based workbench that allows researchers to perform complex analyses using only a browser and resources in infrastructure clouds, which are orchestrated by n3phele. Individual scientists may publish tools and workflow pipelines, registering them in n3phele for their own private use or for use by public collaborators. To illustrate, the QIIME microbial community analysis toolkit has been registered in n3phele, and n3phele has been used to perform microbial analysis, including computationally intensive Roche 454 denoising, using Amazon EC2 and n3phele's point-and-click interface. N3phele substantially improves the usability and manageability of complex scientific analysis pipelines in the cloud.
Archive | 2008
Norman P. Jouppi; Alan Lynn Davis; Nidhi Aggarwal; Richard Kaufmann
ieee international conference on cloud computing technology and science | 2016
Abhishek Gupta; Paolo Faraboschi; Filippo Gioachin; Laxmikant V. Kalé; Richard Kaufmann; Bu-Sung Lee; Verdi March; Dejan S. Milojicic; Chun Hui Suen
HP Laboratories Technical Report | 2013
Abhishek Gupta; Laxmikant V. Kalé; Filippo Gioachin; Verdi March; Chun Hui Suen; Bu-Sung Lee; Paolo Faraboschi; Richard Kaufmann; Dejan S. Milojicic
Archive | 2011
Dejan S. Milojicic; Soumendu Bardhan; Richard Kaufmann; Vanish Talwar
Archive | 2011
Chris D. Hyser; Martin F. Arlitt; Tahir Cader; Cullen E. Bash; Richard Kaufmann
Archive | 2012
Richard Kaufmann; Dejan S. Milojicic