David R. O'Hallaron
Carnegie Mellon University
Publication
Featured research published by David R. O'Hallaron.
Cluster Computing | 2000
Peter A. Dinda; David R. O'Hallaron
This paper evaluates linear models for predicting the Digital Unix five-second host load average from 1 to 30 seconds into the future. A detailed statistical study of a large number of long, fine-grain load traces from a variety of real machines leads to consideration of the Box–Jenkins models (AR, MA, ARMA, ARIMA) and the ARFIMA models (due to self-similarity). We also consider a simple windowed-mean model. The computational requirements of these models span a wide range, making some more practical than others for incorporation into an online prediction system. We rigorously evaluate the predictive power of the models by running a large number of randomized test cases on the load traces and then data-mining their results. The main conclusions are that load is consistently predictable to a very useful degree, and that simple, practical models such as AR are sufficient for host load prediction. We recommend AR(16) models or better for host load prediction. We implement an online host load prediction system around the AR(16) model and evaluate its overhead, finding that it uses minuscule amounts of CPU time and network bandwidth.
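Below is a minimal Python sketch of the AR-based prediction idea, not the authors' online system: it fits AR(p) coefficients to a load trace by ordinary least squares (with no mean removal or intercept) and iterates one-step-ahead predictions out to a horizon. The synthetic trace, the order p=16, and the horizon are illustrative assumptions.

```python
import numpy as np

def fit_ar(trace, p=16):
    """Fit AR(p) coefficients to a 1-D load trace by ordinary least squares."""
    X = np.column_stack([trace[i:len(trace) - p + i] for i in range(p)])
    y = trace[p:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs  # ordered oldest-to-newest within the length-p window

def predict(trace, coeffs, horizon=30):
    """Iterate one-step-ahead AR predictions out to `horizon` steps."""
    window = list(trace[-len(coeffs):])
    preds = []
    for _ in range(horizon):
        nxt = float(np.dot(coeffs, window))
        preds.append(nxt)
        window = window[1:] + [nxt]  # slide the window forward by one step
    return preds

# Synthetic stand-in for a five-second load-average trace.
rng = np.random.default_rng(0)
trace = 1.0 + np.cumsum(rng.normal(0.0, 0.01, 2000))
coeffs = fit_ar(trace, p=16)
print(predict(trace, coeffs, horizon=6))
```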
Computer Methods in Applied Mechanics and Engineering | 1998
Hesheng Bao; Jacobo Bielak; Omar Ghattas; Loukas F. Kallivokas; David R. O'Hallaron; Jonathan Richard Shewchuk; Jifeng Xu
This paper reports on the development of a parallel numerical methodology for simulating large-scale earthquake-induced ground motion in highly heterogeneous basins. We target large sedimentary basins with contrasts in wavelengths of over an order of magnitude. Regular grid methods prove intractable for such problems. We overcome the problem of multiple physical scales by using unstructured finite elements on locally resolved Delaunay triangulations derived from octree-based grids. The extremely large mesh sizes require special mesh generation techniques. Despite the method's multiresolution capability, large problem sizes necessitate the use of distributed memory parallel supercomputers to solve the elastic wave propagation problem. We have developed a system that helps automate the task of writing efficient, portable unstructured mesh solvers for distributed memory parallel supercomputers. The numerical methodology and software system have been used to simulate the seismic response of the San Fernando Valley in Southern California to an aftershock of the 1994 Northridge Earthquake. We report on parallel performance on the Cray T3D for several models of the basin ranging in size from 35,000 to 77 million tetrahedra. The results indicate that, despite the highly irregular structure of the problem, excellent performance and scalability are achieved.
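As a rough illustration of the octree-based local resolution idea (element size tied to the local wavelength), the toy sketch below recursively splits octants until each edge length resolves a hypothetical wavelength field. It is not the authors' mesher; the domain size, wavelength model, and points-per-wavelength value are made-up placeholders.

```python
def wavelength(x, y, z):
    """Hypothetical local wavelength field in metres: shorter near the surface (small z)."""
    return 250.0 + z

def refine(x, y, z, size, points_per_wavelength=4, leaves=None):
    """Split an octant into 8 children until its edge length resolves the local wavelength."""
    if leaves is None:
        leaves = []
    # Wavelength is sampled at the octant's minimum corner, which is conservative here.
    if size <= wavelength(x, y, z) / points_per_wavelength:
        leaves.append((x, y, z, size))  # fine enough: keep as a leaf element
        return leaves
    half = size / 2.0
    for dx in (0.0, half):
        for dy in (0.0, half):
            for dz in (0.0, half):
                refine(x + dx, y + dy, z + dz, half, points_per_wavelength, leaves)
    return leaves

leaves = refine(0.0, 0.0, 0.0, 1000.0)  # hypothetical 1 km cube of basin
print(len(leaves), "leaf octants, edge sizes from",
      min(leaf[3] for leaf in leaves), "to", max(leaf[3] for leaf in leaves), "m")
```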
ACM Special Interest Group on Data Communication | 2001
Bruce Lowekamp; David R. O'Hallaron; Thomas R. Gross
Accurate network topology information is important for both network management and application performance prediction. Most topology discovery research has focused on wide-area networks and examined topology only at the IP router level, ignoring the need for LAN topology information. Recent work has demonstrated that bridged Ethernet topology can be determined using standard SNMP MIBs; however, these algorithms require each bridge to learn about all other bridges in the network. Our approach to Ethernet topology discovery can determine the connection between a pair of bridges that share forwarding entries for only three hosts. This minimal knowledge requirement significantly expands the size of the network that can be discovered. We have implemented the new algorithm, and it has accurately determined the topology of several different networks using a variety of hardware and network configurations. Our implementation requires access to only one endpoint to perform the queries needed for topology discovery.
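The sketch below is a deliberately simplified illustration of the kind of reasoning SNMP bridge forwarding tables enable, not the paper's connection-finding algorithm: if a host known to attach to bridge B appears in bridge A's forwarding table, the entry's egress port lies on A's path toward B. The bridges, MAC addresses, and port numbers are hypothetical.

```python
# Each bridge's forwarding table maps a host MAC to the egress port for that host.
forwarding = {
    "bridgeA": {"00:aa": 1, "00:bb": 1, "00:cc": 2},
    "bridgeB": {"00:aa": 3, "00:bb": 4, "00:cc": 4},
}
# Hosts known to hang directly off a particular bridge (a hint an implementation
# could gather from its own vantage point).
attached_to = {"00:aa": "bridgeB"}

def port_toward(src, dst):
    """Return src's port that faces in the direction of dst, via a shared forwarding entry."""
    for mac, bridge in attached_to.items():
        if bridge == dst and mac in forwarding[src]:
            return forwarding[src][mac]
    return None  # not enough shared forwarding information

print(port_toward("bridgeA", "bridgeB"))  # -> 1
```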
IEEE Computer | 2010
Arutyun Avetisyan; Roy H. Campbell; Indranil Gupta; Michael T. Heath; Steven Y. Ko; Gregory R. Ganger; Michael Kozuch; David R. O'Hallaron; M. Kunze; Thomas T. Kwan; Kevin Lai; Martha Lyons; Dejan S. Milojicic; Hing Yan Lee; Yeng Chai Soh; Ng Kwang Ming; Jing-Yuan Luke; Han Namgoong
Open Cirrus is a cloud computing testbed that, unlike existing alternatives, federates distributed data centers. It aims to spur innovation in systems and applications research and catalyze development of an open source service stack for the cloud.
International Conference on Computer Communications | 2002
Yinglian Xie; David R. O'Hallaron
Caching is a popular technique for reducing both server load and user response time in distributed systems. We consider the question of whether caching might be effective for search engines as well. We study two real search engine traces by examining query locality and its implications for caching. Our trace analysis produced three results. First, queries have significant locality, with query frequency following a Zipf distribution. Very popular queries are shared among different users and can be cached at servers or proxies, while 16% to 22% of the queries are from the same users and should be cached at the user side. Multiple-word queries are shared less and should be cached mainly at the user side. Second, if caching is to be done at the user side, short-term caching for hours is enough to cover query temporal locality, while server/proxy caching should use longer periods, such as days. Third, most users have small lexicons when submitting queries. Frequent users who submit many search requests tend to reuse a small subset of words to form queries. Thus, with proxy or user-side caching, prefetching based on the user lexicon looks promising.
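The short sketch below shows the flavor of this locality analysis: count query frequencies in a trace, rank them, and estimate the Zipf exponent from a log-log fit. The trace here is a tiny synthetic stand-in, not one of the real search engine traces studied in the paper.

```python
from collections import Counter
import numpy as np

# Stand-in query trace.
queries = ["weather", "news", "weather", "python", "news", "weather",
           "maps", "weather", "news", "python"]

counts = np.array(sorted(Counter(queries).values(), reverse=True), dtype=float)
ranks = np.arange(1, len(counts) + 1, dtype=float)

# Zipf's law predicts log(frequency) ~ -alpha * log(rank) + c.
slope, intercept = np.polyfit(np.log(ranks), np.log(counts), 1)
print(f"estimated Zipf exponent: {-slope:.2f}")
```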
Conference on High Performance Computing (Supercomputing) | 2003
Volkan Akcelik; Jacobo Bielak; George Biros; Ioannis Epanomeritakis; Antonio Fernandez; Omar Ghattas; Eui Joong Kim; Julio Lopez; David R. O'Hallaron; Tiankai Tu; John Urbanic
For earthquake simulations to play an important role in the reduction of seismic risk, they must be capable of high resolution and high fidelity. We have developed algorithms and tools for earthquake simulation based on multiresolution hexahedral meshes. We have used this capability to carry out 1 Hz simulations of the 1994 Northridge earthquake in the LA Basin using 100 million grid points. Our wave propagation solver sustains 1.21 teraflop/s for 4 hours on 3000 AlphaServer processors at 80% parallel efficiency. Because of uncertainties in characterizing earthquake source and basin material properties, a critical remaining challenge is to invert for source and material parameter fields for complex 3D basins from records of past earthquakes. Towards this end, we present results for material and source inversion of high-resolution models of basins undergoing antiplane motion using parallel scalable inversion algorithms that overcome many of the difficulties particular to inverse heterogeneous wave propagation problems.
Conference on High Performance Computing (Supercomputing) | 2006
Tiankai Tu; Hongfeng Yu; Leonardo Ramírez-Guzmán; Jacobo Bielak; Omar Ghattas; Kwan-Liu Ma; David R. O'Hallaron
Parallel supercomputing has traditionally focused on the inner kernel of scientific simulations: the solver. The front and back ends of the simulation pipeline - problem description and interpretation of the output - have taken a back seat to the solver when it comes to attention paid to scalability and performance, and are often relegated to offline, sequential computation. As the largest simulations move beyond the realm of the terascale and into the petascale, this decomposition in tasks and platforms becomes increasingly untenable. We propose an end-to-end approach in which all simulation components - meshing, partitioning, solver, and visualization - are tightly coupled and execute in parallel with shared data structures and no intermediate I/O. We present our implementation of this new approach in the context of octree-based finite element simulation of earthquake ground motion. Performance evaluation on up to 2048 processors demonstrates the ability of the end-to-end approach to overcome the scalability bottlenecks of the traditional approach.
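A miniature sketch of the coupling idea follows, with hypothetical stand-in stages rather than the octree-based earthquake code: each stage hands its successor an in-memory data structure, so no intermediate files are written or read between meshing, partitioning, solving, and visualization.

```python
def mesh(n_elements):
    """Stand-in mesher: returns an in-memory element list."""
    return list(range(n_elements))

def partition(elements, n_parts):
    """Stand-in partitioner: round-robin split, kept in memory."""
    return [elements[i::n_parts] for i in range(n_parts)]

def solve(part):
    """Stand-in solver kernel operating directly on its partition."""
    return [0.5 * e for e in part]

def visualize(fields):
    """Stand-in renderer consuming solver output without touching disk."""
    print("rendered", sum(len(f) for f in fields), "values")

# Meshing -> partitioning -> solving -> visualization pass shared in-memory
# structures along the pipeline; no intermediate I/O occurs.
visualize([solve(p) for p in partition(mesh(1000), n_parts=4)])
```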
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 1993
Jaspal Subhlok; James M. Stichnoth; David R. O'Hallaron; Thomas R. Gross
For many applications, achieving good performance on a private memory parallel computer requires exploiting data parallelism as well as task parallelism. Depending on the size of the input data set and the number of nodes (i.e., processors), different tradeoffs between task and data parallelism are appropriate for a parallel system. Most existing compilers focus on either data parallelism or task parallelism. Therefore, to achieve the desired results, the programmer must program the data and task parallelism separately. We have taken a unified approach to exploiting both kinds of parallelism in a single framework with an existing language. This approach eases the task of programming and exposes the tradeoffs between data and task parallelism to the compiler. We have implemented a parallelizing Fortran compiler for the iWarp system based on this approach. We discuss the design of our compiler and present performance results to validate our approach.
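The toy Python sketch below, unrelated to the authors' Fortran/iWarp compiler, illustrates the task/data-parallelism mix described above: two independent tasks run concurrently, and each splits its array across data-parallel workers. The functions, array sizes, and worker counts are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def data_parallel(fn, data, workers):
    """Run one task data-parallel by mapping `fn` over equal-sized chunks."""
    chunks = np.array_split(data, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return np.concatenate(list(pool.map(fn, chunks)))

a = np.random.rand(1_000_000)
b = np.random.rand(1_000_000)

# Task parallelism: the two independent stages run concurrently; how many
# data-parallel workers each stage gets is the tradeoff that must be chosen
# based on problem size and node count.
with ThreadPoolExecutor(max_workers=2) as tasks:
    f1 = tasks.submit(data_parallel, np.sqrt, a, 4)
    f2 = tasks.submit(data_parallel, np.log1p, b, 2)
    sqrt_a, log_b = f1.result(), f2.result()
print(sqrt_a.shape, log_b.shape)
```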
High Performance Distributed Computing | 1999
Peter A. Dinda; David R. O'Hallaron
Evaluates linear models for predicting the Digital Unix five-second host load average from 1 to 30 seconds into the future. A detailed statistical study of a large number of long, fine-grain load traces from a variety of real machines leads to consideration of the Box-Jenkins (1994) models (AR, MA, ARMA, ARIMA), and the ARFIMA (autoregressive fractional integrated moving average) models (due to self-similarity). These models, as well as a simple windowed-mean scheme, are then rigorously evaluated by running a large number of randomized test cases on the load traces and by data-mining their results. The main conclusions are that the load is consistently predictable to a very useful degree, and that the simpler models, such as AR, are sufficient for performing this prediction.
IEEE Internet Computing | 2007
Mahadev Satyanarayanan; Benjamin Gilbert; M. Toups; Niraj Tolia; David R. O'Hallaron; Ajay Surie; Adam Wolbach; Jan Harkes; Adrian Perrig; David J. Farber; Michael Kozuch; Casey Helfrich; Partho Nath; Horacio Andrés Lagar-Cavilla
The Internet Suspend/Resume (ISR) model of mobile computing cuts the tight binding between PC state and PC hardware. By layering a virtual machine on distributed storage, ISR lets the VM encapsulate execution and user customization state; distributed storage then transports that state across space and time. This article explores the implications of ISR for an infrastructure-based approach to mobile computing. It reports on experiences with three versions of ISR and describes work in progress toward the OpenISR version.