
Publications


Featured research published by John Brevik.


IEEE International Conference on High Performance Computing, Data, and Analytics | 2001

Analyzing Market-Based Resource Allocation Strategies for the Computational Grid

Richard Wolski; James S. Plank; John Brevik; Todd Bryan

In this paper, the authors investigate G-commerce—computational economies for controlling resource allocation in computational Grid settings. They define hypothetical resource consumers (representing users and Grid-aware applications) and resource producers (representing resource owners who “sell” their resources to the Grid). The authors then measure the efficiency of resource allocation under two different market conditions—commodities markets and auctions—and compare both market strategies in terms of price stability, market equilibrium, consumer efficiency, and producer efficiency. The results indicate that commodities markets are a better choice for controlling Grid resources than previously defined auction strategies.
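
The commodities-market result rests on a simple price-adjustment dynamic. As a rough illustration (not the paper's G-commerce simulator; the demand and supply curves and all parameters below are hypothetical), a Walrasian-style loop raises the resource price under excess demand and lowers it under excess supply until the market approximately clears:

```python
# Minimal sketch of commodity-market price adjustment ("tatonnement").
# All curves and constants are hypothetical illustrations.

def tatonnement(demand, supply, price=1.0, rate=0.01, steps=10_000, tol=1e-6):
    """Move the commodity price toward the point where the market clears."""
    for _ in range(steps):
        gap = demand(price) - supply(price)    # excess demand at this price
        if abs(gap) < tol:
            break                              # approximate equilibrium
        price = max(1e-9, price + rate * gap)  # raise price if demand exceeds supply
    return price

# Hypothetical linear curves for CPU-hours on a Grid.
demand = lambda p: max(0.0, 100.0 - 20.0 * p)  # consumers buy less as price rises
supply = lambda p: 30.0 * p                    # producers sell more as price rises

print(tatonnement(demand, supply))             # ~2.0, where 100 - 20p == 30p
```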


European Conference on Parallel Processing | 2005

Modeling machine availability in enterprise and wide-area distributed computing environments

Daniel Nurmi; John Brevik; Richard Wolski

In this paper, we consider the problem of modeling machine availability in enterprise-area and wide-area distributed computing settings. Using availability data gathered from three different environments, we detail the suitability of four potential statistical distributions for each data set: exponential, Pareto, Weibull, and hyperexponential. In each case, we use software we have developed to determine the necessary parameters automatically from each data collection. To gauge suitability, we present both graphical and statistical evaluations of the accuracy with which each distribution fits each data set. For all three data sets, we find that a hyperexponential model fits slightly more accurately than a Weibull, but that both are substantially better choices than either an exponential or a Pareto. These results indicate that either a hyperexponential or a Weibull model effectively represents machine availability in enterprise and Internet computing environments.
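
As a sketch of the model-comparison step, the snippet below fits two of the four candidate distributions with SciPy and compares log-likelihoods; it uses synthetic durations rather than the paper's traces, and omits the Pareto and hyperexponential fits for brevity:

```python
# Compare maximum-likelihood fits of candidate availability models.
# Synthetic data; the paper fits traces from three real environments.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical machine-availability durations, in hours.
uptimes = rng.weibull(0.6, size=2000) * 40.0

for name, dist in [("exponential", stats.expon), ("weibull", stats.weibull_min)]:
    params = dist.fit(uptimes, floc=0)             # MLE with location pinned at 0
    loglik = np.sum(dist.logpdf(uptimes, *params))
    k = len(params) - 1                            # free parameters (loc is fixed)
    print(f"{name:12s} log-likelihood={loglik:10.1f}  AIC={2*k - 2*loglik:10.1f}")
```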


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2006

Predicting bounds on queuing delay for batch-scheduled parallel machines

John Brevik; Daniel Nurmi; Richard Wolski

Most space-sharing parallel computers presently operated by high-performance computing centers use batch-queuing systems to manage processor allocation. In many cases, users wishing to use these batch-queued resources have accounts at multiple sites and have the option of choosing at which site or sites to submit a parallel job. In such a situation, the amount of time a user's job will wait in any one batch queue can significantly impact the overall time a user waits from job submission to job completion. In this work, we explore a new method for providing end-users with predictions of the bounds on the queuing delay individual jobs will experience. We evaluate this method using batch-scheduler logs for distributed-memory parallel machines that cover a 9-year period at 7 large HPC centers. Our results show that it is possible to predict delay bounds reliably for jobs in different queues, and for jobs requesting different ranges of processor counts. Using this information, scientific application developers can intelligently decide where to submit their parallel codes in order to minimize overall turnaround time.
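
One standard nonparametric way to obtain such a bound, in the spirit of what the abstract describes, uses order statistics and the binomial distribution: from n historical waits, pick the smallest order statistic that upper-bounds the desired delay quantile with the desired confidence. The sketch below is an illustration on synthetic data, not the authors' code:

```python
# Order-statistic upper confidence bound on a queue-wait quantile.
import numpy as np
from scipy.stats import binom

def quantile_upper_bound(waits, q=0.95, confidence=0.95):
    """Upper confidence bound on the q-quantile of queue wait time.

    The count of samples below the true q-quantile is Binomial(n, q),
    so the k-th order statistic exceeds that quantile with probability
    P(Binomial(n, q) < k) = binom.cdf(k - 1, n, q).
    """
    x = np.sort(np.asarray(waits))
    n = len(x)
    for k in range(1, n + 1):
        if binom.cdf(k - 1, n, q) >= confidence:
            return x[k - 1]             # k-th smallest wait (1-indexed)
    raise ValueError("not enough history for this quantile/confidence")

# Hypothetical history: 500 observed waits (seconds) from one queue's log.
rng = np.random.default_rng(1)
history = rng.lognormal(mean=7.0, sigma=1.5, size=500)
print(quantile_upper_bound(history, q=0.95, confidence=0.95))
```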


Conference on High Performance Computing (Supercomputing) | 2006

Evaluation of a workflow scheduler using integrated performance modelling and batch queue wait time prediction

Daniel Nurmi; Anirban Mandal; John Brevik; Chuck Koelbel; Richard Wolski; Ken Kennedy

Large-scale distributed systems offer computational power at unprecedented levels. In the past, HPC users typically had access to relatively few individual supercomputers and, in general, would assign a one-to-one mapping of applications to machines. Modern HPC users have simultaneous access to a large number of individual machines and are beginning to make use of all of them for single-application execution cycles. One method that application developers have devised in order to take advantage of such systems is to organize an entire application execution cycle as a workflow. The scheduling of such workflows has been the topic of a great deal of research in the past few years, and although very sophisticated algorithms have been devised, one very specific aspect of these distributed systems, namely that most supercomputing resources employ batch-queue scheduling software, has generally been omitted from consideration, presumably because it is difficult to model accurately. In this work, we augment an existing workflow scheduler through the introduction of methods which make accurate predictions of both the performance of the application on specific hardware and the amount of time individual workflow tasks would spend waiting in batch queues. Our results show that although a workflow scheduler alone may choose correct task placement based on data locality or network connectivity, this benefit is often compromised by the fact that most jobs submitted to current systems must wait in overcommitted batch queues for a significant portion of time. However, incorporating the enhancements we describe improves workflow execution time in settings where batch queues impose significant delays on constituent workflow tasks.
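
The scheduling decision this motivates can be stated compactly: rank candidate sites by predicted queue wait plus predicted runtime rather than by runtime alone. A toy sketch, with hypothetical site names and numbers:

```python
# Rank sites by predicted turnaround = queue-wait bound + runtime.
# (site, predicted runtime in s, predicted queue-wait bound in s)
sites = [
    ("siteA", 3600.0,  300.0),   # fast machine, short queue
    ("siteB", 2400.0, 9000.0),   # faster machine, but long queue
    ("siteC", 5400.0,  600.0),
]

def best_site(sites):
    """Pick the site minimizing predicted turnaround (wait + run)."""
    return min(sites, key=lambda s: s[1] + s[2])

print(best_site(sites))  # siteA: 3900 s total beats siteB's 11400 s
```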


IEEE/ACM International Conference on Grid Computing | 2006

Fault-aware scheduling for Bag-of-Tasks applications on Desktop Grids

Cosimo Anglano; John Brevik; Massimo Canonico; Daniel Nurmi; Richard Wolski

Desktop grids have proved to be a suitable platform for the execution of bag-of-tasks applications, but, being characterized by high resource volatility, they require scheduling techniques able to deal effectively with resource failures and unplanned periods of unavailability. In this paper we present a set of fault-aware scheduling policies that, rather than just tolerating faults as traditional fault-tolerant schedulers do, exploit information concerning resource availability to improve application performance. The performance of these strategies has been compared via simulation with that attained by traditional fault-tolerant schedulers. Our results, obtained by considering a set of realistic scenarios modeled after real desktop grids, show that our approach yields better application performance and resource utilization.
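
One way to exploit availability information, as the abstract describes, is to rank hosts by the probability that they stay up long enough to finish the task. A minimal sketch, assuming a fitted Weibull availability model, freshly restarted hosts, and hypothetical host parameters:

```python
# Fault-aware host selection: prefer the host most likely to stay up.
import math

def weibull_survival(t, shape, scale):
    """P(resource stays available for at least t more seconds),
    assuming the host's uptime clock starts now (a simplification)."""
    return math.exp(-((t / scale) ** shape))

def pick_resource(resources, task_seconds):
    """Rank candidate hosts by probability of finishing the task unharmed."""
    def p_done(r):
        name, shape, scale, speed = r
        return weibull_survival(task_seconds / speed, shape, scale)
    return max(resources, key=p_done)

# (host, Weibull shape, Weibull scale in s, relative speed) - hypothetical
hosts = [("desk1", 0.6, 30_000.0, 1.0), ("desk2", 0.6, 8_000.0, 2.0)]
print(pick_resource(hosts, task_seconds=7_200))   # desk1 wins despite being slower
```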


IEEE International Symposium on Workload Characterization | 2006

Predicting Bounds on Queuing Delay in Space-shared Computing Environments

John Brevik; Daniel Nurmi; Richard Wolski

Most space-sharing resources presently operated by high-performance computing centers employ some sort of batch-queuing system to manage resource allocation to multiple users. In this work, we explore a new method for providing end-users with predictions of the bounds on the queuing delay individual jobs will experience when waiting to be scheduled to a machine partition. We evaluate this method using scheduler logs that cover a 10-year period from 10 large HPC systems. Our results show that it is possible to predict delay bounds with specified confidence levels for jobs in different queues, and for jobs requesting different ranges of processor counts.
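
A prediction of this kind can be validated against a scheduler log by checking its empirical coverage: train a bound on a sliding window of past waits, then count how often later jobs actually fall under it. A sketch on synthetic data (real evaluations replay HPC traces), using the same order-statistic bound as in the earlier sketch:

```python
# Empirical coverage check for a 95%-confidence bound on the 0.95 quantile.
import numpy as np
from scipy.stats import binom

def quantile_upper_bound(waits, q=0.95, confidence=0.95):
    """Order-statistic upper confidence bound on the q-quantile
    (same binomial argument as in the earlier sketch)."""
    x = np.sort(np.asarray(waits))
    n = len(x)
    # first index j with P(Binomial(n, q) <= j) >= confidence
    j = int(np.searchsorted(binom.cdf(np.arange(n), n, q), confidence))
    return x[j]                       # the (j+1)-th smallest wait

rng = np.random.default_rng(2)
log = rng.lognormal(7.0, 1.5, size=3000)   # hypothetical waits, in arrival order

window, hits, total = 500, 0, 0
for i in range(window, len(log)):
    bound = quantile_upper_bound(log[i - window:i])
    hits += log[i] <= bound
    total += 1
print(f"empirical coverage: {hits / total:.3f}")  # expect roughly >= 0.95
```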


International Conference on Cluster Computing | 2005

Minimizing the Network Overhead of Checkpointing in Cycle-harvesting Cluster Environments

Daniel Nurmi; John Brevik; Richard Wolski

Cycle-harvesting systems such as Condor have been developed to make desktop machines in a local area (which are often similar to clusters in hardware configuration) available as a compute platform. To provide a dual-use capability, opportunistic jobs harvesting cycles from the desktop must be checkpointed before the desktop resources are reclaimed by their owners and the job is evacuated. In this paper, we investigate a new system for computing efficient checkpoint schedules in cycle-harvesting environments. Our system records the historical availability of each resource and fits a statistical model to the observations. Because checkpointing must often traverse the network (i.e., the desktop hosts do not provide sufficient persistent storage for checkpoints), we combine this model with predictions of network performance to the storage site to compute a checkpoint schedule. When an application is initiated on a particular resource, the system uses the computed distribution to parameterize a Markov state-transition model for the application's execution, evaluates the expected time and network overhead as a function of the checkpoint interval, and numerically optimizes with respect to time. We report on the implementation and performance of this system using the Condor cycle-harvesting environment at the University of Wisconsin. We also evaluate the efficiencies we achieve for a variety of network overheads using trace-based simulation. Finally, we validate our simulations against the observed performance with Condor. Our results indicate that while the choice of model distribution has a relatively small but positive effect on time efficiency, it has a substantial impact on network utilization.
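
The optimization step the abstract describes can be illustrated in the simplest case of exponentially distributed failures, where the expected overhead as a function of the checkpoint interval has a well-known closed form (the classic Young/Daly model) and can be minimized numerically. The paper's system instead fits richer models, such as Weibull and hyperexponential, to observed availability; the parameters below are hypothetical:

```python
# Numerically optimize the checkpoint interval under exponential failures.
import numpy as np
from scipy.optimize import minimize_scalar

lam = 1.0 / 3600.0   # failure rate: one failure per hour, on average
C   = 60.0           # seconds to write a checkpoint over the network
R   = 30.0           # seconds to restart from the last checkpoint

def expected_overhead(tau):
    """Expected wall-clock seconds per second of useful work when
    checkpointing every tau seconds (classic exponential-failure model)."""
    segment = (np.exp(lam * R) / lam) * (np.exp(lam * (tau + C)) - 1.0)
    return segment / tau

best = minimize_scalar(expected_overhead, bounds=(1.0, 24 * 3600.0),
                       method="bounded")
print(f"optimal interval ~ {best.x:.0f} s; "
      f"Young's sqrt(2*C/lam) ~ {np.sqrt(2 * C / lam):.0f} s")
```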


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2008

Probabilistic advanced reservations for batch-scheduled parallel machines

Daniel Nurmi; Richard Wolski; John Brevik

In high-performance computing (HPC) settings, in which multiprocessor machines are shared among users with potentially competing resource demands, processors are allocated to user workload using space sharing. Typically, users interact with a given machine by submitting their jobs to a centralized batch scheduler that implements a site-specific, and often partially hidden, policy designed to maximize machine utilization while providing tolerable turn-around times. In practice, while most HPC systems experience good utilization levels, the amount of time experienced by individual jobs waiting to begin execution has been shown to be highly variable and difficult to predict, leading to user confusion and/or frustration. One method for dealing with this uncertainty that has been proposed is to allow users who are willing to plan ahead to make “advanced reservations” for processor resources. To date, however, few if any HPC centers provide an advanced reservation capability to their general user populations for fear (supported by previous research) that diminished machine utilization will occur if and when advanced reservations are introduced. In this work, we describe VARQ, a new method for job scheduling that provides users with probabilistic “virtual” advanced reservations using only existing best effort batch schedulers and policies. VARQ functions as an overlay, submitting jobs that are indistinguishable from the normal workload serviced by a scheduler. We describe the statistical methods we use to implement VARQ, detail an empirical evaluation of its effectiveness in a number of HPC settings, and explore the potential future impact of VARQ should it become widely used. Without requiring HPC sites to support advanced reservations, we find that VARQ can implement a reservation capability probabilistically and that the effects of this probabilistic approach are unlikely to negatively affect resource utilization.
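
At its core, a probabilistic virtual reservation is a scheduling-time calculation: if a predictor says a new job starts within W_p seconds with probability at least p, then submitting a held job W_p seconds before the desired start time yields a reservation that holds with at least that probability. A minimal sketch with hypothetical numbers (the real VARQ system also refreshes its predictions as the start time approaches):

```python
# Latest submission time for a probabilistic "virtual" reservation.

def latest_submission_time(reservation_start, wait_bound_p):
    """Latest clock time at which to submit so the job has started,
    with the predictor's confidence, by reservation_start (seconds)."""
    return reservation_start - wait_bound_p

now = 0.0
start = 6 * 3600.0          # want processors in six hours
w95 = 14_400.0              # predicted 95th-percentile wait: 4 h
# (w95 could come from a quantile predictor like the earlier sketch)
submit_at = latest_submission_time(start, w95)
print(f"submit no later than t={submit_at:.0f} s (now={now:.0f})")
```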


IEEE Transactions on Services Computing | 2014

Using Parametric Models to Represent Private Cloud Workloads

Richard Wolski; John Brevik

Cloud computing has become a popular metaphor for dynamic and secure self-service access to computational and storage capabilities. In this study, we analyze and model workloads gathered from enterprise-operated commercial private clouds that implement “Infrastructure as a Service.” Our results show that 3-phase hyperexponential distributions, fit using the Expectation-Maximization (EM) algorithm, capture workload attributes accurately. In addition, these models of individual attributes compose to produce estimates of overall cloud performance that our results verify to be accurate. As an early study of commercial enterprise private clouds, this work provides guidance to those researching, designing, or maintaining such installations. In particular, the cloud workloads under study do not exhibit “heavy-tailed” distributional properties in the same way that “bare metal” operating systems do, potentially leading to different design and engineering tradeoffs.
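
For concreteness, here is a minimal EM fit of a 2-phase hyperexponential mixture on synthetic data; the paper fits 3-phase models to real cloud traces, but the per-phase update rules are the same:

```python
# EM for a 2-phase hyperexponential: p*Exp(l1) + (1-p)*Exp(l2).
# Synthetic data and hypothetical parameters.
import numpy as np

rng = np.random.default_rng(3)
x = np.concatenate([rng.exponential(1.0, 7000),     # short-lived instances
                    rng.exponential(50.0, 3000)])   # long-lived instances

p, l1, l2 = 0.5, 2.0 / np.mean(x), 0.5 / np.mean(x)  # crude initialization
for _ in range(200):
    # E-step: responsibility of phase 1 for each observation
    f1 = p * l1 * np.exp(-l1 * x)
    f2 = (1 - p) * l2 * np.exp(-l2 * x)
    r = f1 / (f1 + f2)
    # M-step: reweighted mixture weight and rate updates
    p  = r.mean()
    l1 = r.sum() / (r * x).sum()
    l2 = (1 - r).sum() / ((1 - r) * x).sum()

print(p, 1 / l1, 1 / l2)   # roughly 0.7, and phase means near 1 and 50
```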


International Mathematics Research Notices | 2010

Noether–Lefschetz Theorem with Base Locus

John Brevik; Scott Nollet

For an arbitrary curve Z (possibly reducible, non-reduced, or of mixed dimension) lying on a normal surface, the general surface S of high degree containing Z is also normal, but often singular. We compute the class group of the very general such surface, thereby extending the Noether–Lefschetz theorem (the special case in which Z is empty). Our method is an adaptation of Griffiths and Harris's degeneration proof, simplified by a cohomology and base change argument. We give applications to the computation of Picard groups. Dedicated to Robin Hartshorne on his 70th birthday.
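
For orientation, here is the classical Noether–Lefschetz statement (the case of empty Z) together with a rough paraphrase of the extension; the exact hypotheses and the precise form of the generators are in the paper:

```latex
% Classical Noether--Lefschetz theorem (the special case Z = \emptyset):
\text{For } d \ge 4:\quad
\operatorname{Pic}(S) \cong \mathbb{Z}\cdot\mathcal{O}_S(1)
\quad\text{for the very general surface } S \subset \mathbb{P}^3
\text{ of degree } d.

% Rough paraphrase of the extension with base locus Z (hedged; see the
% paper for the exact hypotheses): for d \gg 0, the class group of the
% very general degree-d surface S containing Z is generated by the
% hyperplane class together with the classes of the curve components of Z:
\operatorname{Cl}(S) \cong \mathbb{Z}\cdot\mathcal{O}_S(1) \oplus
\bigoplus_i \mathbb{Z}\cdot[Z_i],
\qquad Z_i \subset Z \text{ the irreducible curve components.}
```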

Collaboration


Dive into John Brevik's collaborations.

Top Co-Authors


Richard Wolski (University of California)
Daniel Nurmi (University of California)
Scott Nollet (Texas Christian University)
Todd Bryan (University of California)
Alan Su (University of California)
Chandra Krintz (University of California)