Bryan N. Mills
University of Pittsburgh
Publication
Featured research published by Bryan N. Mills.
international workshop on energy efficient supercomputing | 2013
Bryan N. Mills; Ryan E. Grant; Kurt Brian Ferreira; Rolf Riesen
The U.S. Department of Energy has identified resilience and energy consumption as key challenges for future extreme-scale systems. All checkpoint/restart methods require I/O to local or remote storage. Efforts are under way to minimize the amount of data movement and increase scalability. Nevertheless, the energy consumed by fault resilience methods will increase with system size. It is therefore important to understand the performance overhead in conjunction with the energy consumption of each fault resilience method. In this paper we explore throttling CPU power consumption during I/O-intensive checkpoint operations of real applications. We find that 10% total energy savings are possible with little impact on application time to solution.
2014 International Conference on Computing, Networking and Communications (ICNC) | 2014
Bryan N. Mills; Taieb Znati; Rami G. Melhem
The current response to fault tolerance relies upon either time or hardware redundancy in order to mask faults. Time redundancy implies a re-execution of the failed computation after the failure has been detected; although this can be further optimized by the use of checkpoints, these solutions still impose a significant delay. In many mission-critical systems, hardware redundancy has traditionally been deployed in the form of process replication to provide fault tolerance, avoiding delay and maintaining tight deadlines. Both approaches have drawbacks: re-execution requires additional time, and replication requires additional resources, especially energy. This forces the systems engineer to choose between time and hardware redundancy; cloud computing environments have largely chosen replication because response time is often critical. In this paper we propose a new computational model called shadow computing, which provides goal-based adaptive resilience through the use of dynamic execution. Using this general model we develop shadow replication, which enables a parameterized tradeoff between time and hardware redundancy to provide fault tolerance. Then we build an analytical model to predict the expected energy savings and provide an analysis using that model.
parallel, distributed and network-based processing | 2014
Bryan N. Mills; Taieb Znati; Rami G. Melhem; Kurt Brian Ferreira; Ryan E. Grant
As HPC systems continue to grow to meet the requirements of tomorrow's exascale-class systems, two of the biggest challenges are power consumption and system resilience. On current systems, the dominant resilience technique is checkpoint/restart. It is believed, however, that this technique alone will not scale to the level necessary to support future systems. Therefore, alternative methods have been suggested to augment checkpoint/restart, for example process replication. In this paper we address both resilience and power together, in contrast to much of the completed work, which addresses them independently. Using an analytical model that accounts for both power consumption and failures, we study the performance of checkpoint and replication-based techniques on current and future systems and use power measurements from current systems to validate our findings. Lastly, in an attempt to optimize power consumption for replication, we introduce a new protocol termed shadow replication, which not only reduces energy consumption but also produces faster response times than checkpoint/restart and traditional replication when operating under system power constraints.
network and distributed system security symposium | 2007
Subrata Acharya; Bryan N. Mills; Mehmud Abliz; Taieb Znati; Jia Wang; Zihui Ge; Albert G. Greenberg
Energies | 2014
Xiaolong Cui; Bryan N. Mills; Taieb Znati; Rami G. Melhem
annual simulation symposium | 2008
Bryan N. Mills; Taieb Znati
CLOSER | 2014
Xiaolong Cui; Bryan N. Mills; Taieb Znati; Rami G. Melhem
Archive | 2008
Bryan N. Mills; Taieb Znati
international conference on computer communications and networks | 2008
Bryan N. Mills; Taieb Znati
Archive | 2013
Kurt Brian Ferreira; Bryan N. Mills; Taieb Znati; Rami G. Melhem