Eric Martin Heien | Researchain

Publication

Featured research published by Eric Martin Heien.

IEEE International Conference on High Performance Computing, Data, and Analytics | 2011

Modeling and tolerating heterogeneous failures in large parallel systems

Eric Martin Heien; Derrick Kondo; Ana Gainaru; Dan Lapine; Bill Kramer; Franck Cappello

As supercomputers and clusters increase in size and complexity, system failures are inevitable. Different hardware components (such as memory, disk, or network) of such systems can have different failure rates. Prior works assume failures equally affect an application, whereas our goal is to provide failure models for applications that reflect their specific component usage. This is challenging because component failure dynamics are heterogeneous in space and time. To this end, we study 5 years of system logs from a production high-performance computing system and model hardware failures involving processors, memory, storage and network components. We model each component and construct integrated failure models given the component usage of common supercomputing applications. We show that these application-centric models provide more accurate reliability estimates compared to general models, which improves the efficacy of fault-tolerant algorithms. In particular, we demonstrate how applications can tune their checkpointing strategies to the tailored model.
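To make the idea of tuning checkpoints to an application-specific failure model concrete, here is a minimal sketch. It is not the paper's model: it assumes independent exponential failures per component, uses Young's classic first-order approximation for the checkpoint interval, and all rates, usage weights, and function names are hypothetical.

```python
import math

def app_failure_rate(component_rates, usage):
    """Failure rate (per hour) seen by one application.

    Assumption: components fail independently with exponential rates, and
    usage[c] in [0, 1] is a hypothetical weight for how exposed the
    application is to component c.
    """
    return sum(rate * usage.get(name, 0.0) for name, rate in component_rates.items())

def young_interval(checkpoint_cost_h, mtbf_h):
    """Young's approximation: interval ~ sqrt(2 * checkpoint cost * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_h * mtbf_h)

# Made-up per-component failure rates (failures per hour), for illustration only.
rates = {"cpu": 0.001, "memory": 0.002, "storage": 0.004, "network": 0.003}

# Two hypothetical application profiles.
cpu_bound = {"cpu": 1.0, "memory": 1.0}
io_bound = {"cpu": 1.0, "memory": 1.0, "storage": 1.0, "network": 1.0}

for label, usage in (("cpu_bound", cpu_bound), ("io_bound", io_bound)):
    mtbf = 1.0 / app_failure_rate(rates, usage)
    print(label, round(mtbf, 1), round(young_interval(0.1, mtbf), 2))
```

Under these assumptions, the application that touches more failure-prone components sees a shorter mean time between failures and should checkpoint more often than a single system-wide rate would suggest.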

Journal of Grid Computing | 2009

Computing Low Latency Batches with Unreliable Workers in Volunteer Computing Environments

Eric Martin Heien; David P. Anderson; Kenichi Hagihara

Internet-based volunteer computing projects such as SETI@home are currently restricted to performing coarse-grained, embarrassingly parallel master-worker style tasks. This is partly due to the “pull” nature of task distribution in volunteer computing environments, where workers request tasks from the master rather than the master assigning tasks to arbitrary workers. In this paper we propose algorithms for computing batches of medium-grained tasks with deadlines in pull-style volunteer computing environments. We develop models of unreliable workers based on analysis of trace data from an actual volunteer computing project. These models are used to develop algorithms for task distribution in volunteer computing systems with a high probability of meeting batch deadlines. We develop algorithms for perfectly reliable workers, computation-reliable workers and unreliable workers. Finally, we demonstrate the effectiveness of the algorithms through simulations using traces from actual volunteer computing environments.
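One simple way to see how worker unreliability interacts with a batch deadline is to ask how many replicas of each task are needed. The sketch below is not one of the paper's algorithms: it assumes every worker independently returns a valid result before the deadline with the same probability, and the function names are hypothetical.

```python
import math

def replicas_needed(p_worker, batch_size, p_batch):
    """Smallest replica count per task so the whole batch meets its deadline
    with probability >= p_batch.

    Assumption: each replica independently succeeds with probability p_worker.
    """
    if not 0.0 < p_worker <= 1.0:
        raise ValueError("p_worker must be in (0, 1]")
    if not 0.0 < p_batch < 1.0:
        raise ValueError("p_batch must be in (0, 1)")
    if p_worker == 1.0:
        return 1
    # Every task must succeed, so each one needs probability p_batch**(1/k).
    p_task = p_batch ** (1.0 / batch_size)
    return max(1, math.ceil(math.log(1.0 - p_task) / math.log(1.0 - p_worker)))

def batch_success(p_worker, batch_size, replicas):
    """Probability that every task has at least one successful replica."""
    return (1.0 - (1.0 - p_worker) ** replicas) ** batch_size

n = replicas_needed(0.8, 100, 0.95)
print(n, round(batch_success(0.8, 100, n), 4))  # 5 0.9685
```

With 80% reliable workers and a 100-task batch, four replicas per task would leave the batch success probability near 0.85, so a fifth replica is needed to clear the 0.95 target.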

Journal of Physiological Sciences | 2008

Specifications of insilicoML 1.0: a multilevel biophysical model description language.

Yoshiyuki Asai; Yasuyuki Suzuki; Yoshiyuki Kido; Hideki Oka; Eric Martin Heien; Masao Nakanishi; Takahito Urai; Kenichi Hagihara; Yoshihisa Kurachi; Taishin Nomura

An extensible markup language format, insilicoML (ISML), version 0.1, describing multi-level biophysical models has been developed and made available in the public domain. ISML is fully compatible with CellML 1.0, a model description standard developed by the IUPS Physiome Project, for enhancing knowledge integration and model sharing. This article illustrates the new specifications of ISML 1.0 that largely extend the capability of ISML 0.1. ISML 1.0 can describe various types of mathematical models, including ordinary/partial differential/difference equations representing the dynamics of physiological functions and the geometry of living organisms underlying the functions. ISML 1.0 describes a model using a set of functional elements (modules), each of which can specify mathematical expressions of the functions. Structural and logical relationships between any two modules are specified by edges, which allow modular, hierarchical, and/or network representations of the model. The role of edge-relationships is enriched by key words for use in constructing a physiological ontology. The ontology is further improved by the traceability of the models' development history and by linking between different ISML models stored in the model database using meta-information. ISML 1.0 is designed to operate with a model database and integrated environments for model development and simulations for knowledge integration and discovery.

International Conference on Distributed Computing Systems | 2011

Correlated Resource Models of Internet End Hosts

Eric Martin Heien; Derrick Kondo; David P. Anderson

Understanding and modelling resources of Internet end hosts is essential for the design of desktop software and Internet-distributed applications. In this paper we develop a correlated resource model of Internet end hosts based on real trace data taken from the SETI@home project. This data covers a 5-year period with statistics for 2.7 million hosts. The resource model is based on statistical analysis of host computational power, memory, and storage as well as how these resources change over time and the correlations between them. We find that resources with few discrete values (core count, memory) are well modeled by exponential laws governing the change of relative resource quantities over time. Resources with a continuous range of values are well modeled with either correlated normal distributions (processor speed for integer operations and floating point operations) or log-normal distributions (available disk space). We validate and show the utility of the models by applying them to a resource allocation problem for Internet-distributed applications, and demonstrate their value over other models. We also make our trace data and tool for automatically generating realistic Internet end hosts publicly available.
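The distribution families named in this abstract can be illustrated with a small host generator. This is a sketch only, not the authors' published tool: the means, standard deviations, correlation coefficient, and log-normal parameters below are invented for illustration, and the function names are hypothetical.

```python
import math
import random

def sample_host(rng, rho=0.8):
    """Draw one synthetic host: (integer speed, floating-point speed, disk GB).

    Assumptions (all parameter values are made up): the two processor speeds
    are jointly normal with correlation rho, and free disk space is log-normal.
    """
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    int_speed = 2000.0 + 300.0 * z1
    # Mixing z1 into the second variable produces the desired correlation.
    fp_speed = 1500.0 + 250.0 * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
    disk_gb = rng.lognormvariate(3.5, 1.0)
    return int_speed, fp_speed, disk_gb

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(1)
hosts = [sample_host(rng) for _ in range(20000)]
ints, fps, disks = zip(*hosts)
print(round(pearson(ints, fps), 2))
```

Sampling the correlated pair from two independent standard normals is the two-variable case of a Cholesky factorization; a model with more correlated resources would need the full covariance matrix.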

IEEE Transactions on Parallel and Distributed Systems | 2012

A Correlated Resource Model of Internet End Hosts

Eric Martin Heien; Derrick Kondo; David P. Anderson

Understanding and modeling resources of Internet end hosts are essential for the design of desktop software and Internet-distributed applications. In this paper, we develop a correlated resource model of Internet end hosts based on real-trace data taken from several volunteer computing projects, including SETI@home. This data cover a five-year period with statistics for 6.7 million hosts. Our resource model is based on statistical analysis of host computational power, memory, and storage as well as how these resources change over time and the correlations among them. We find that resources with few discrete values (core count, memory) are well modeled by approximations governing the change of relative resource quantities over time. Resources with a continuous range of values are well modeled by correlated log-normal distributions (cache, processor speed, and available disk space). We validate and show the utility of the model by applying it to a resource allocation problem for Internet-distributed applications, and compare it to other models. We also make our trace data and tool for automatically generating realistic Internet end hosts publicly available.

International Conference of the IEEE Engineering in Medicine and Biology Society | 2008

A platform for in silico modeling of physiological systems II. CellML compatibility and other extended capabilities

Yasuyuki Suzuki; Yoshiyuki Asai; Toshihiro Kawazu; Masao Nakanishi; Yoshiki Taniguchi; Eric Martin Heien; Kenichi Hagihara; Yoshihisa Kurachi; Taishin Nomura

The number of biological models published in peer-reviewed journals and the complexity of each of those models are rapidly increasing, making it difficult for third parties to reproduce simulation results of the published models and to reuse the models. This paper is a continuation of our previous report on the development of a software platform as a solution to such difficulties. We describe the progress of our development, which includes improved capabilities to import and simulate published models in the CellML model repository, to browse and edit CellML models and then export them as new models either in the CellML format or in an XML format defined for our platform (ISML), and to construct new large-scale models by connecting CellML/ISML models. Using ISML in parallel with CellML has several advantages: 1) ISML can deal with the geometry (morphology) of a model, enabling the user to perform geometry-dependent modeling and simulations. 2) ISML can deal with time series data, both simulated and experimentally acquired, for visualization of dynamics.

International Symposium on Parallel and Distributed Processing and Applications | 2010

A Multi-GPU Spectrometer System for Real-Time Wide Bandwidth Radio Signal Analysis

Hirofumi Kondo; Eric Martin Heien; Masao Okita; Dan Werthimer; Kenichi Hagihara

This paper describes the implementation of a large bandwidth multi-GPU signal processing system for radio astronomy observation. This system performs very large Fast Fourier Transform (FFT) and spectrum analysis to achieve real-time analysis of a large bandwidth spectrum. This is accomplished by implementing a four-step FFT algorithm in Compute Unified Device Architecture (CUDA). The key feature of this implementation is that the data size transferred between CPU and GPU is reduced using redundant calculation. We also apply pipeline execution to our system to minimize idle processor time, even with multiple GPUs on a shared bus. Using a single GPU, this system can analyze 1 GB of signal data (128 MHz bandwidth at 1 Hz resolution in single precision floating-point complex format) in 0.44 seconds. With the multi-GPU setup, using four GPUs enables 4 GB of signal data to be processed in 0.82 seconds. This is equivalent to a processing speed of around 60 GFLOPS. In particular, we focus on using this system in the Search for Extraterrestrial Radio Emissions from Nearby Developed Intelligent Populations (SERENDIP) project. By using multiple GPUs we can get enough practical performance for high bandwidth radio astronomy projects such as SERENDIP.
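The four-step FFT mentioned in this abstract splits one length-N transform (N = n1 * n2) into column transforms, a twiddle-factor multiplication, row transforms, and a transpose. The sketch below shows only that decomposition in plain Python. It does not reproduce the paper's CUDA implementation or its redundant-calculation technique, the sub-transforms use a naive DFT for clarity, and the function names are hypothetical.

```python
import cmath

def dft(x):
    """Naive O(n^2) discrete Fourier transform, used for the sub-transforms."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def four_step_fft(x, n1, n2):
    """Length n1*n2 DFT of x, viewing x as an n1-by-n2 row-major matrix."""
    n = n1 * n2
    if len(x) != n:
        raise ValueError("len(x) must equal n1 * n2")
    # Step 1: transform each of the n2 columns (length n1); cols[c][k1].
    cols = [dft([x[n2 * r + c] for r in range(n1)]) for c in range(n2)]
    # Step 2: multiply by the twiddle factors exp(-2*pi*i * c * k1 / n).
    for c in range(n2):
        for k1 in range(n1):
            cols[c][k1] *= cmath.exp(-2j * cmath.pi * c * k1 / n)
    # Step 3: transform each of the n1 rows (length n2); rows[k1][k2].
    rows = [dft([cols[c][k1] for c in range(n2)]) for k1 in range(n1)]
    # Step 4: transpose so that output index k = k1 + n1 * k2.
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[k1 + n1 * k2] = rows[k1][k2]
    return out

signal = [complex(i % 5, -(i % 3)) for i in range(12)]
error = max(abs(a - b) for a, b in zip(four_step_fft(signal, 3, 4), dft(signal)))
print(error < 1e-9)  # True
```

The appeal of this layout for very large transforms is that steps 1 and 3 consist of many small independent transforms, which map naturally onto parallel hardware.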

Simulation Tools and Techniques for Communications, Networks and System | 2010

insilicoSim: an extendable engine for parallel heterogeneous biophysical simulations

Eric Martin Heien; Masao Okita; Yoshiyuki Asai; Taishin Nomura; Kenichi Hagihara

Recently, several multidisciplinary projects have begun to model and simulate human physiological systems. However, the simulators for these models are often limited in terms of simulation type and lack of parallel computing support. In this paper we describe insilicoSim, an extendable simulation engine for performing parallel large scale biophysical simulations. We present three key components of the simulator for improving extensibility and performance. First, we demonstrate how a standardized plugin interface allows for easy extension of the simulator to new types of input, output and simulation methods. We detail a technique for improving simulation performance by simplifying and compiling simulation related mathematical expressions into an internal byte code representation for fast evaluation. Finally, we describe the simulation object manager which allows for shared object access between simulation interfaces while transparently performing parallel synchronization. We demonstrate the effectiveness of these methods by simulating several models on both serial and parallel computing platforms.
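The technique of simplifying an expression and compiling it to byte code can be sketched with a tiny stack machine. This is an illustration of the general idea only, not insilicoSim's internal representation: the instruction names, the constant-folding rule, and the function names are all hypothetical, and only `+ - * / **` on numbers and variable names are supported.

```python
import ast
import operator

_BINOPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
           ast.Div: operator.truediv, ast.Pow: operator.pow}

def _emit(node):
    """Return a list of stack-machine instructions for one expression node."""
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return [("PUSH", float(node.value))]
    if isinstance(node, ast.Name):
        return [("LOAD", node.id)]
    if isinstance(node, ast.BinOp) and type(node.op) in _BINOPS:
        left, right = _emit(node.left), _emit(node.right)
        fn = _BINOPS[type(node.op)]
        # Simplification: fold an operation on two constants at compile time.
        if (len(left) == 1 and len(right) == 1
                and left[0][0] == "PUSH" and right[0][0] == "PUSH"):
            return [("PUSH", fn(left[0][1], right[0][1]))]
        return left + right + [("APPLY", fn)]
    raise ValueError("unsupported expression element: " + type(node).__name__)

def compile_expr(source):
    """Parse an expression once and return its instruction list."""
    return _emit(ast.parse(source, mode="eval").body)

def run(code, env):
    """Evaluate compiled instructions against a dict of variable values."""
    stack = []
    for op, arg in code:
        if op == "PUSH":
            stack.append(arg)
        elif op == "LOAD":
            stack.append(env[arg])
        else:  # APPLY: pop two operands, push the result
            b = stack.pop()
            a = stack.pop()
            stack.append(arg(a, b))
    return stack.pop()

code = compile_expr("2 * 3 + x * (y - 1)")
print(run(code, {"x": 2.0, "y": 4.0}))  # 12.0
```

The expression is parsed and simplified once, so a simulation loop that evaluates it at every time step pays only for the flat instruction list.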

International Parallel and Distributed Processing Symposium | 2009

PyMW - A Python module for desktop grid and volunteer computing

Eric Martin Heien; Yusuke Takata; Kenichi Hagihara; Adam Kornafeld

We describe a general purpose master-worker parallel computation Python module called PyMW. PyMW is intended to support rapid development, testing and deployment of large scale master-worker style computations on a desktop grid or volunteer computing environment. This module targets non-expert computer users by hiding complicated task submission and result retrieval procedures behind a simple interface. PyMW also provides a unified interface to multiple computing environments with easy extension to support additional environments. In this paper, we describe the internal structure and external interface to the PyMW module and its support for the Condor computing environment and the Berkeley Open Infrastructure for Network Computing (BOINC) platform. We demonstrate the effectiveness and scalability of PyMW by performing master-worker style computations on a desktop grid using Condor and a BOINC volunteer computing project.
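The master-worker pattern that PyMW targets can be illustrated with the Python standard library alone. The sketch below deliberately does not use or imitate PyMW's own interface, which is not described here; it uses `concurrent.futures` with local threads as a stand-in for a desktop grid, and `run_master` and `partial_sum` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_master(work_fn, inputs, max_workers=4):
    """Submit one task per input, gather results, and return them in input order.

    Local threads stand in for remote workers; a real desktop grid would
    replace the executor with its own submission and retrieval mechanism.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(work_fn, item): idx for idx, item in enumerate(inputs)}
        # Results arrive in completion order, so key them by task index.
        for future in as_completed(futures):
            results[futures[future]] = future.result()
    return [results[i] for i in range(len(inputs))]

def partial_sum(bounds):
    """Worker task: sum of squares over a half-open integer range."""
    lo, hi = bounds
    return sum(i * i for i in range(lo, hi))

chunks = [(0, 250), (250, 500), (500, 750), (750, 1000)]
total = sum(run_master(partial_sum, chunks))
print(total == sum(i * i for i in range(1000)))  # True
```

Hiding submission and retrieval behind a single call like this is the kind of simplification the abstract describes for non-expert users.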

Parallel, Distributed and Network-Based Processing | 2008

Static Load Distribution for Communication Intensive Parallel Computing in Multiclusters

Eric Martin Heien; Noriyuki Fujimoto; Kenichi Hagihara

In this paper, we examine load distributions to minimize total run time in multi-cluster parallel computing algorithms by applying divisible load theory techniques. Even with homogeneous processor speeds, parallel computations in multi-clusters that evenly assign load can run at less than maximum efficiency due to communication heterogeneity. Using a modified version of the LogP parallel computing model, we propose a general technique of assigning load among multiple clusters to minimize the time each processor spends waiting. This technique is used to determine optimal load distribution for spin glass simulation and parallel bucket sort in multi-cluster systems. It also allows fast analysis of the effects of adding processors or clusters to the computation. We experimentally demonstrate the accuracy of our model, and show how it eliminates wait time in multi-cluster parallel computations. Using load distributions derived from our technique results in an execution time decrease of up to 50%, depending on the degree of heterogeneity among clusters and communication characteristics of the computation.
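The core idea from divisible load theory is to size each cluster's share so that all clusters finish at the same moment. The sketch below uses a deliberately simplified linear cost model, not the paper's modified LogP model: cluster i is assumed to finish at `overhead[i] + unit_time[i] * share[i]`, and the function name and parameters are hypothetical.

```python
def balance_load(total, overhead, unit_time):
    """Split `total` units of load so all participating clusters finish together.

    Assumed cost model: finish_i = overhead[i] + unit_time[i] * share[i],
    with unit_time[i] > 0. Clusters whose overhead is too large to be
    worth using receive a share of 0.
    Returns (common finish time, list of shares).
    """
    active = list(range(len(overhead)))
    while active:
        inv_sum = sum(1.0 / unit_time[i] for i in active)
        # Solve sum_i (finish - overhead_i) / unit_time_i = total for finish.
        finish = (total + sum(overhead[i] / unit_time[i] for i in active)) / inv_sum
        shares = {i: (finish - overhead[i]) / unit_time[i] for i in active}
        worst = min(active, key=lambda i: shares[i])
        if shares[worst] >= 0.0:
            return finish, [shares.get(i, 0.0) for i in range(len(overhead))]
        active.remove(worst)  # drop the cluster that would need negative load
    raise ValueError("at least one cluster is required")

# Two equally fast clusters; the second pays 2 time units of communication overhead.
print(balance_load(10.0, [0.0, 2.0], [1.0, 1.0]))  # (6.0, [6.0, 4.0])
```

An even 5/5 split would finish at time 7 while the first cluster idles for 2 units; the balanced 6/4 split finishes at time 6 with no waiting, which is the effect the abstract describes for communication-heterogeneous clusters.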
