
Publications


Featured research published by Peter A. Dinda.


Cluster Computing | 2000

Host load prediction using linear models

Peter A. Dinda; David R. O'Hallaron

This paper evaluates linear models for predicting the Digital Unix five-second host load average from 1 to 30 seconds into the future. A detailed statistical study of a large number of long, fine-grained load traces from a variety of real machines leads to consideration of the Box–Jenkins models (AR, MA, ARMA, ARIMA) and the ARFIMA models (due to self-similarity). We also consider a simple windowed-mean model. The computational requirements of these models span a wide range, making some more practical than others for incorporation into an online prediction system. We rigorously evaluate the predictive power of the models by running a large number of randomized test cases on the load traces and then data-mining their results. The main conclusions are that load is consistently predictable to a very useful degree, and that simple, practical models such as AR are sufficient for host load prediction. We recommend AR(16) models or better for host load prediction. We implement an online host load prediction system around the AR(16) model and evaluate its overhead, finding that it uses minuscule amounts of CPU time and network bandwidth.
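
The AR approach the abstract recommends can be sketched in a few lines. This is a minimal illustration assuming a least-squares fit on a synthetic load trace, not the paper's exact estimation procedure:

```python
import numpy as np

def fit_ar(trace, order):
    """Fit AR(order) coefficients to a load trace by least squares.

    Rows of X are lagged windows; solving X a ~= y gives coefficients
    that predict the next sample from the previous `order` samples
    (a simplification of the fitting the paper's models would use).
    """
    X = np.array([trace[i:i + order] for i in range(len(trace) - order)])
    y = trace[order:]
    coeffs, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs

def predict_next(history, coeffs):
    """One-step-ahead prediction from the most recent samples."""
    return float(np.dot(history[-len(coeffs):], coeffs))

# Synthetic load trace: slowly varying signal plus noise (illustrative only).
rng = np.random.default_rng(0)
t = np.arange(2000)
trace = 0.5 + 0.2 * np.sin(t / 50) + 0.05 * rng.standard_normal(len(t))

coeffs = fit_ar(trace, order=16)   # AR(16), as the paper recommends
pred = predict_next(trace, coeffs)
```

In an online system the coefficients would be refit periodically as new load samples arrive, which is where the model's low computational cost matters.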


International Parallel and Distributed Processing Symposium | 2010

Palacios and Kitten: New high performance operating systems for scalable virtualized and native supercomputing

John R. Lange; Kevin Pedretti; Trammell Hudson; Peter A. Dinda; Zheng Cui; Lei Xia; Patrick G. Bridges; Andy Gocke; Steven Jaconette; Michael J. Levenhagen; Ron Brightwell

Palacios is a new open-source VMM under development at Northwestern University and the University of New Mexico that enables applications executing in a virtualized environment to achieve scalable high performance on large machines. Palacios functions as a modularized extension to Kitten, a high performance operating system being developed at Sandia National Laboratories to support large-scale supercomputing applications. Together, Palacios and Kitten provide a thin layer over the hardware to support full-featured virtualized environments alongside Kitten's lightweight native environment. Palacios supports existing, unmodified applications and operating systems by using the hardware virtualization technologies in recent AMD and Intel processors. Additionally, Palacios leverages Kitten's simple memory management scheme to enable low-overhead pass-through of native devices to a virtualized environment. We describe the design, implementation, and integration of Palacios and Kitten. Our benchmarks show that Palacios provides near native (within 5%), scalable performance for virtualized environments running important parallel applications. This new architecture provides an incremental path for applications to use supercomputers, running specialized lightweight host operating systems, that is not significantly performance-compromised.


Conference on High Performance Computing (Supercomputing) | 2005

VSched: Mixing Batch and Interactive Virtual Machines Using Periodic Real-time Scheduling

Bin Lin; Peter A. Dinda

We are developing Virtuoso, a system for distributed computing using virtual machines (VMs). Virtuoso must be able to mix batch and interactive VMs on the same physical hardware, while satisfying constraints on responsiveness and compute rates for each workload. VSched is the component of Virtuoso that provides this capability. VSched is an entirely user-level tool that interacts with the stock Linux kernel running below any type-II virtual machine monitor to schedule VMs (indeed, any process) using a periodic real-time scheduling model. This abstraction allows compute rate and responsiveness constraints to be straightforwardly described using a period and a slice within the period, and it allows for fast and simple admission control. This paper makes the case for periodic real-time scheduling for VM-based computing environments, and then describes and evaluates VSched. It also applies VSched to scheduling parallel workloads, showing that it can help a BSP application maintain a fixed stable performance despite externally caused load imbalance.
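
The period-and-slice abstraction lends itself to a very simple admission test. As a hedged sketch (the utilization bound of 1.0 below is an assumption in the spirit of EDF schedulability, not necessarily VSched's exact test):

```python
def admit(reservations, new_period, new_slice, bound=1.0):
    """Utilization-based admission control for periodic (period, slice)
    reservations: admit the new VM only if total CPU utilization,
    sum(slice / period), stays within `bound` (1.0 corresponds to the
    EDF schedulability bound; a lower bound could model scheduler
    overhead)."""
    total = sum(s / p for p, s in reservations) + new_slice / new_period
    return total <= bound

# An interactive VM (50 ms slice every 100 ms) plus a batch VM
# (400 ms every 1000 ms) leaves room for only a small third reservation.
current = [(100, 50), (1000, 400)]
print(admit(current, 100, 5))    # utilization 0.95 -> admitted
print(admit(current, 100, 20))   # utilization 1.10 -> rejected
```

The appeal of the model is exactly this: admission control reduces to one sum, so it can run at user level on every reservation request.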


Scientific Programming | 1999

The statistical properties of host load

Peter A. Dinda

Understanding how host load changes over time is instrumental in predicting the execution time of tasks or jobs, such as in dynamic load balancing and distributed soft real-time systems. To improve this understanding, we collected week-long, 1 Hz resolution traces of the Digital Unix 5 second exponential load average on over 35 different machines including production and research cluster machines, compute servers, and desktop workstations. Separate sets of traces were collected at two different times of the year. The traces capture all of the dynamic load information available to user-level programs on these machines. We present a detailed statistical analysis of these traces here, including summary statistics, distributions, and time series analysis results. Two significant new results are that load is self-similar and that it displays epochal behavior. All of the traces exhibit a high degree of self-similarity with Hurst parameters ranging from 0.73 to 0.99, strongly biased toward the top of that range. The traces also display epochal behavior in that the local frequency content of the load signal remains quite stable for long periods of time (150-450 s mean) and changes abruptly at epoch boundaries. Despite these complex behaviors, we have found that relatively simple linear models are sufficient for short-range host load prediction.
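
Self-similarity of this kind is typically quantified by the Hurst parameter. A minimal rescaled-range (R/S) sketch of such an estimate (illustrative only, not necessarily the estimator used in the paper) looks like:

```python
import numpy as np

def hurst_rs(x, min_chunk=8):
    """Estimate the Hurst parameter of a series by rescaled-range
    (R/S) analysis: for several window sizes n, average R/S over
    non-overlapping windows, then fit log(R/S) ~ H * log(n)."""
    x = np.asarray(x, dtype=float)
    sizes, rs_vals = [], []
    n = min_chunk
    while n <= len(x) // 2:
        rs = []
        for i in range(0, len(x) - n + 1, n):
            c = x[i:i + n]
            dev = np.cumsum(c - c.mean())   # cumulative deviation from mean
            r = dev.max() - dev.min()       # range of the deviation
            s = c.std()                     # standard deviation
            if s > 0:
                rs.append(r / s)
        if rs:
            sizes.append(n)
            rs_vals.append(np.mean(rs))
        n *= 2
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_vals), 1)
    return slope

# White noise should come out near H = 0.5; self-similar load traces
# like those in the paper land much higher, between 0.73 and 0.99.
rng = np.random.default_rng(1)
h = hurst_rs(rng.standard_normal(4096))
```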


High Performance Distributed Computing | 1999

An evaluation of linear models for host load prediction

Peter A. Dinda; David R. O'Hallaron

Evaluates linear models for predicting the Digital Unix five-second host load average from 1 to 30 seconds into the future. A detailed statistical study of a large number of long, fine-grain load traces from a variety of real machines leads to consideration of the Box-Jenkins (1994) models (AR, MA, ARMA, ARIMA), and the ARFIMA (autoregressive fractional integrated moving average) models (due to self-similarity). These models, as well as a simple windowed-mean scheme, are then rigorously evaluated by running a large number of randomized test cases on the load traces and by data-mining their results. The main conclusions are that the load is consistently predictable to a very useful degree, and that the simpler models, such as AR, are sufficient for performing this prediction.


Cluster Computing | 2002

Online Prediction of the Running Time of Tasks

Peter A. Dinda

We describe and evaluate the Running Time Advisor (RTA), a system that can predict the running time of a compute-bound task on a typical shared, unreserved commodity host. The prediction is computed from linear time series predictions of host load and takes the form of a confidence interval that neatly expresses the error associated with the measurement and prediction processes – error that must be captured to make statistically valid decisions based on the predictions. Adaptive applications make such decisions in pursuit of consistent high performance, choosing, for example, the host where a task is most likely to meet its deadline. We begin by describing the system and summarizing the results of our previously published work on host load prediction. We then describe our algorithm for computing predictions of running time from host load predictions. We next evaluate the system using over 100,000 randomized test cases run on 39 different hosts, finding that it is indeed capable of computing correct and useful confidence intervals. Finally, we report on our experience with using the RTA in application-oriented real-time scheduling in distributed systems.
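
How a load-prediction confidence interval becomes a running-time confidence interval can be sketched under a simple time-sharing assumption. The (1 + load) model below is an illustrative simplification borrowed from Unix load-average semantics, not the RTA's actual algorithm:

```python
def running_time_interval(cpu_seconds, load_pred, load_ci_halfwidth):
    """Map a host-load prediction with a confidence interval onto a
    running-time confidence interval. Under a simple time-sharing
    model, a task needing `cpu_seconds` of CPU on a host whose run
    queue already holds `load` tasks takes roughly
    cpu_seconds * (1 + load) wall-clock seconds, so the interval on
    the load maps directly onto an interval on the running time."""
    lo = cpu_seconds * (1 + max(load_pred - load_ci_halfwidth, 0.0))
    hi = cpu_seconds * (1 + load_pred + load_ci_halfwidth)
    return lo, hi

# A 10-second task on a host with predicted load 0.5 +/- 0.2.
lo, hi = running_time_interval(10.0, 0.5, 0.2)
print(round(lo, 2), round(hi, 2))  # 13.0 17.0
```

An adaptive scheduler can then compare such intervals across candidate hosts and pick the one whose interval falls entirely below the task's deadline.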


Lecture Notes in Computer Science | 1998

The Statistical Properties of Host Load

Peter A. Dinda

Understanding how host load changes over time is instrumental in predicting the execution time of tasks or jobs, such as in dynamic load balancing and distributed soft real-time systems. To improve this understanding, we collected week-long, 1 Hz resolution Unix load average traces on 38 different machines including production and research cluster machines, compute servers, and desktop workstations. Separate sets of traces were collected at two different times of the year. The traces capture all of the dynamic load information available to user-level programs on these machines. We present a detailed statistical analysis of these traces here, including summary statistics, distributions, and time series analysis results. Two significant new results are that load is self-similar and that it displays epochal behavior. All of the traces exhibit a high degree of self-similarity with Hurst parameters ranging from 0.63 to 0.97, strongly biased toward the top of that range. The traces also display epochal behavior in that the local frequency content of the load signal remains quite stable for long periods of time (150-450 seconds mean) and changes abruptly at epoch boundaries.


IEEE Transactions on Parallel and Distributed Systems | 2006

Design, implementation, and performance of an extensible toolkit for resource prediction in distributed systems

Peter A. Dinda

RPS is a publicly available toolkit that allows a practitioner to straightforwardly create flexible online and offline resource prediction systems in which resources are represented by independent, periodically sampled, scalar-valued measurement streams. The systems predict the future values of such streams from past values and are composed at runtime out of a large and extensible set of communicating components that are in turn constructed using RPS's extensible sensor, prediction, wavelet, and communication libraries. This paper describes the design, implementation, and performance of RPS. We have used RPS extensively to evaluate predictive models and build online prediction systems for host load, Windows performance data, and network bandwidth. The computation and communication overheads involved in such systems are quite low.
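
The stream-in, stream-out component model can be illustrated with a toy predictor component. The windowed mean below is a hypothetical stand-in for the richer models in RPS's prediction library, not RPS code:

```python
from collections import deque

def windowed_mean_predictor(stream, window=8):
    """A minimal stand-in for one RPS-style component: consume a
    periodically sampled scalar measurement stream and emit, for each
    sample, a prediction of the next value (here simply the mean of
    the last `window` samples). Composing such generators end to end
    mimics how RPS components are wired into a pipeline at runtime."""
    buf = deque(maxlen=window)
    for sample in stream:
        buf.append(sample)
        yield sum(buf) / len(buf)

preds = list(windowed_mean_predictor([1.0, 2.0, 3.0, 4.0], window=2))
print(preds)  # [1.0, 1.5, 2.5, 3.5]
```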


High Performance Distributed Computing | 2001

The architecture of the Remos system

Peter A. Dinda; Thomas R. Gross; Roger P. Karrer; Bruce Lowekamp; Nancy Miller; Peter Steenkiste; Dean Sutherland

Remos provides resource information to distributed applications. Its design goals of scalability, flexibility, and portability are achieved through an architecture that allows components to be positioned across the network, each collecting information about its local network. To collect information from different types of networks and from hosts on those networks, Remos provides several collectors that use different technologies, such as SNMP or benchmarking. By matching the appropriate collector to each particular network environment and by providing an architecture for distributing the output of these collectors across all querying environments, Remos collects appropriately detailed information at each site and distributes this information where needed in a scalable manner. Prediction services are integrated at the user-level, allowing history-based data collected across the network to be used to generate the predictions needed by a particular user. Remos has been implemented and tested in a variety of networks and is in use in a number of different environments.


International Conference on Computer Communications | 1999

Performance characteristics of mirror servers on the Internet

Andy Myers; Peter A. Dinda; Hui Zhang

As a growing number of Web sites introduce mirrors to increase throughput, the challenge for clients is determining which mirror will offer the best performance when a document is to be retrieved. We present findings from measuring 9 clients scattered throughout the United States retrieving over 490,000 documents from 47 production Web servers which mirror three different Web sites. We have several interesting findings that may aid in the design of protocols for choosing among mirror servers. Though server performance varies widely, we have observed that a server's performance relative to other servers is more stable and is independent of time scale. In addition, a change in an individual server's transfer time is not a strong indicator that its performance relative to other servers has changed. Finally, we have found that clients wishing to achieve near-optimal performance may only need to consider a small number of servers rather than all mirrors of a particular site.
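
The last finding suggests a simple client strategy: rank mirrors once and probe only the best few thereafter. As a hedged sketch (the server names and timings below are invented for illustration):

```python
def top_k_servers(history, k=3):
    """Rank mirror servers by mean observed transfer time and keep the
    k best. The paper's finding that relative rank is stable suggests
    a client can restrict future requests to this small set instead
    of measuring every mirror."""
    means = {srv: sum(times) / len(times) for srv, times in history.items()}
    return sorted(means, key=means.get)[:k]

# Hypothetical transfer times (seconds) observed for five mirrors.
history = {
    "mirror-a": [1.2, 1.4, 1.3],
    "mirror-b": [0.6, 0.7, 0.8],
    "mirror-c": [2.5, 2.2, 2.8],
    "mirror-d": [0.9, 1.0, 0.8],
    "mirror-e": [1.8, 1.7, 1.9],
}
print(top_k_servers(history, k=2))  # ['mirror-b', 'mirror-d']
```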

Collaboration


Top co-authors of Peter A. Dinda:

Gokhan Memik, Northwestern University
Bin Lin, Northwestern University
Dong Lu, Northwestern University
Kyle C. Hale, Northwestern University
Lei Xia, Northwestern University
Ashish Gupta, Northwestern University