Jennifer M. Schopf
Argonne National Laboratory
Publications
Featured research published by Jennifer M. Schopf.
IEEE Transactions on Parallel and Distributed Systems | 2003
Francine Berman; Richard Wolski; Henri Casanova; Walfredo Cirne; Holly Dail; Marcio Faerman; Silvia Figueira; Jim Hayes; Graziano Obertelli; Jennifer M. Schopf; Gary Shao; Shava Smallen; Neil Spring; Alan Su; Dmitrii Zagorodnov
Ensembles of distributed, heterogeneous resources, also known as computational grids, have emerged as critical platforms for high-performance and resource-intensive applications. Such platforms provide the potential for applications to aggregate enormous bandwidth, computational power, memory, secondary storage, and other resources during a single execution. However, achieving this performance potential in dynamic, heterogeneous environments is challenging. Recent experience with distributed applications indicates that adaptivity is fundamental to achieving application performance in dynamic grid environments. The AppLeS (Application Level Scheduling) project provides a methodology, application software, and software environments for adaptively scheduling and deploying applications in heterogeneous, multiuser grid environments. We discuss the AppLeS project and outline our findings.
conference on high performance computing (supercomputing) | 1996
Fran Berman; Richard Wolski; Silvia Figueira; Jennifer M. Schopf; Gary Shao
Heterogeneous networks are increasingly being used as platforms for resource-intensive distributed parallel applications. A critical contributor to the performance of such applications is the scheduling of constituent application tasks on the network. Since the distributed resources often cannot be brought under the control of a single global scheduler, the application must be scheduled by the user. To obtain the best performance, the user must take into account both application-specific and dynamic system information in developing a schedule that meets his or her performance criteria. In this paper, we define a set of principles underlying application-level scheduling and describe our work in progress building AppLeS (application-level scheduling) agents. We illustrate the application-level scheduling approach with a detailed description and results for a distributed 2D Jacobi application on two production heterogeneous platforms.
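A minimal sketch of the time-balancing idea behind such application-level scheduling, applied to a row-decomposed 2D Jacobi grid: rows are assigned in proportion to each machine's predicted speed so that per-iteration times roughly equalize. The host names and speed estimates below are hypothetical stand-ins for the dynamic system information an AppLeS agent would actually gather.

```python
# Sketch: proportional row decomposition for a heterogeneous 2D Jacobi run.
# Speeds are assumed, illustrative estimates, not measurements.

def partition_rows(total_rows, machine_speeds):
    """Assign contiguous row blocks in proportion to predicted machine speed."""
    total_speed = sum(machine_speeds.values())
    machines = list(machine_speeds)
    allocation, assigned = {}, 0
    for i, m in enumerate(machines):
        if i == len(machines) - 1:
            rows = total_rows - assigned     # last machine absorbs rounding error
        else:
            rows = round(total_rows * machine_speeds[m] / total_speed)
        allocation[m] = rows
        assigned += rows
    return allocation

# Example: three hosts with unequal predicted capacities (arbitrary units).
speeds = {"hostA": 4.0, "hostB": 2.5, "hostC": 1.5}
print(partition_rows(1024, speeds))  # {'hostA': 512, 'hostB': 320, 'hostC': 192}
```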
high performance distributed computing | 2003
Xuehai Zhang; Jeffrey L. Freschl; Jennifer M. Schopf
Monitoring and information services form a key component of a distributed system, or Grid. A quantitative study of such services can aid in understanding the performance limitations, advise in the deployment of the monitoring system, and help evaluate future development work. To this end, we study the performance of three monitoring and information services for distributed systems: the Globus Toolkit® Monitoring and Discovery Service (MDS2), the European Data Grid Relational Grid Monitoring Architecture (R-GMA) and Hawkeye, part of the Condor project. We perform experiments to test their scalability with respect to number of users, number of resources and amount of data collected. Our study shows that each approach has different behaviors, often due to their different design goals. In the four sets of experiments we conducted to evaluate the performance of the service components under different circumstances, we found a strong advantage to caching or pre-fetching the data, as well as the need to have primary components at well-connected sites because of the high load seen by all systems.
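One way to picture the caching result: a simple time-to-live cache in front of an information service answers repeated queries locally and touches the backend only on a miss or expiry. This is an illustrative sketch, not the mechanism of MDS2, R-GMA, or Hawkeye; the fetch function and key names are hypothetical.

```python
# Sketch: TTL cache for information-service queries, assuming a backend
# query function supplied by the caller.

import time

class TTLCache:
    def __init__(self, fetch, ttl=30.0):
        self.fetch = fetch    # function that performs the real backend query
        self.ttl = ttl        # seconds a cached answer stays fresh
        self.store = {}       # key -> (timestamp, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry and time.time() - entry[0] < self.ttl:
            return entry[1]                      # cache hit: no backend load
        value = self.fetch(key)                  # cache miss: query the service
        self.store[key] = (time.time(), value)
        return value

# Usage with a stand-in backend:
cache = TTLCache(fetch=lambda key: f"resource-record-for-{key}", ttl=30.0)
print(cache.get("cpu.load.hostA"))   # first call hits the backend
print(cache.get("cpu.load.hostA"))   # second call is served from the cache
```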
conference on high performance computing (supercomputing) | 2003
Lingyun Yang; Jennifer M. Schopf; Ian T. Foster
In heterogeneous and dynamic environments, efficient execution of parallel computations can require mappings of tasks to processors whose performance is both irregular (because of heterogeneity) and time-varying (because of dynamicity). While adaptive domain decomposition techniques have been used to address heterogeneous resource capabilities, temporal variations in those capabilities have seldom been considered. We propose a conservative scheduling policy that uses information about expected future variance in resource capabilities to produce more efficient data mapping decisions. We first present techniques, based on time series predictors that we developed in previous work, for predicting CPU load at some future time point, average CPU load for some future time interval, and variation of CPU load over some future time interval. We then present a family of stochastic scheduling algorithms that exploit such predictions of future availability and variability when making data mapping decisions. Finally, we describe experiments in which we apply our techniques to an astrophysics application. The results of these experiments demonstrate that conservative scheduling can produce execution times that are both significantly faster and less variable than other techniques.
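The conservative idea can be sketched as discounting each CPU's predicted availability by its predicted variability before computing data shares, so that highly variable machines receive proportionally less work. The penalty factor k and the prediction inputs below are hypothetical, not the paper's exact model.

```python
# Sketch: variance-penalized data mapping. Inputs are assumed predictions
# (mean available fraction, stddev of load) per host.

def conservative_shares(predictions, k=1.0):
    """predictions: {host: (mean_available_fraction, stddev_of_load)}"""
    capacity = {h: max(mean - k * std, 0.0)
                for h, (mean, std) in predictions.items()}
    total = sum(capacity.values())
    return {h: c / total for h, c in capacity.items()}

# A steady machine beats a nominally faster but erratic one:
preds = {"steady": (0.6, 0.05), "erratic": (0.8, 0.40)}
print(conservative_shares(preds, k=1.0))  # 'steady' ~0.58 vs 'erratic' ~0.42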
international parallel and distributed processing symposium | 2002
Sudharshan Vazhkudai; Jennifer M. Schopf; Ian T. Foster
As Data Grids become more commonplace, large data sets are being replicated and distributed to multiple sites, leading to the problem of determining which replica can be accessed most efficiently. The answer to this question can depend on many factors, including physical characteristics of the resources and the load behavior on the CPUs, networks, and storage devices that are part of the end-to-end path linking possible sources and sinks. We develop a predictive framework that combines (1) integrated instrumentation that collects information about the end-to-end performance of past transfers, (2) predictors to estimate future transfer times, and (3) a data delivery infrastructure that provides users with access to both the raw data and our predictions. We evaluate the performance of our predictors by applying them to log data collected from a wide area testbed. These preliminary results provide insights into the effectiveness of using predictors in this situation.
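As a toy illustration of how such predictions can drive replica selection, the sketch below estimates each replica's transfer time with a moving average over logged past transfers and picks the cheapest source. The log format and the moving-average predictor are illustrative assumptions, not the paper's framework.

```python
# Sketch: log-based transfer-time prediction for replica selection,
# assuming a simple {site: [past transfer times]} log layout.

from statistics import mean

def predict_transfer_time(history, window=5):
    """Moving average over the most recent observed transfer times (seconds)."""
    return mean(history[-window:])

def pick_replica(logs):
    """logs: {replica_site: [past transfer times in seconds, ...]}"""
    return min(logs, key=lambda site: predict_transfer_time(logs[site]))

logs = {
    "site-a": [42.0, 47.5, 44.1, 51.3, 43.8],
    "site-b": [38.2, 90.4, 41.0, 39.9, 85.6],   # erratic, slower on average
}
print(pick_replica(logs))   # 'site-a' under this moving-average predictor
```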
conference on high performance computing (supercomputing) | 1999
Jennifer M. Schopf; Francine Berman
There is a current need for scheduling policies that can leverage the performance variability of resources on multi-user clusters. We develop one solution to this problem called stochastic scheduling that utilizes a distribution of application execution performance on the target resources to determine a performance-efficient schedule. In this paper, we define a stochastic scheduling policy based on time-balancing for data parallel applications whose execution behavior can be represented as a normal distribution. Using three distributed applications on two contended platforms, we demonstrate that a stochastic scheduling policy can achieve good and predictable performance for the application as evaluated by several performance measures.
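A minimal sketch of stochastic time-balancing under the normal-distribution model: with per-unit execution time on machine i modeled as Normal(mu_i, sigma_i), work is allocated inversely to a conservative per-unit cost mu_i + k*sigma_i so that the padded finish times line up. The tuning factor k and the cost figures below are hypothetical.

```python
# Sketch: stochastic time-balancing across machines whose per-unit cost is
# modeled as a normal distribution. All numbers are assumed examples.

def stochastic_time_balance(total_work, per_unit, k=1.0):
    """per_unit: {machine: (mu_seconds_per_unit, sigma_seconds_per_unit)}"""
    rate = {m: 1.0 / (mu + k * sigma) for m, (mu, sigma) in per_unit.items()}
    total_rate = sum(rate.values())
    return {m: total_work * r / total_rate for m, r in rate.items()}

# Two machines with equal mean cost; the less variable one gets more work.
costs = {"quiet": (0.10, 0.01), "contended": (0.10, 0.05)}
print(stochastic_time_balance(1000, costs, k=1.0))  # ~577 vs ~423 units
```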
international parallel and distributed processing symposium | 2003
Lingyun Yang; Ian T. Foster; Jennifer M. Schopf
The dynamic nature of a resource-sharing environment means that applications must be able to adapt their behavior in response to changes in system status. Predictions of future system performance can be used to guide such adaptations. In this paper, we present and evaluate several new one-step-ahead and low-overhead time series prediction strategies that track recent trends by giving more weight to recent data. We present results that show that a dynamic tendency prediction model with different ascending and descending behavior performs best among all strategies studied. A comparative study conducted on a set of 38 machine load traces shows that this new predictor achieves average prediction errors that are between 2% and 55% less (36% less on average) than those incurred by the predictors used within the popular Network Weather Service system.
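The flavor of a tendency-based predictor can be sketched in a few lines: if the load series just rose, predict a further rise; if it just fell, predict a further decay, with separate ascending and descending factors. The factor values below are hypothetical; the paper's model fits its parameters to the observed traces.

```python
# Sketch: one-step-ahead tendency predictor with asymmetric up/down factors.
# up_factor and down_factor are assumed, illustrative values.

def tendency_predict(series, up_factor=0.5, down_factor=0.7):
    """Predict the next value of a CPU-load series from its last step."""
    prev, last = series[-2], series[-1]
    delta = last - prev
    if delta > 0:                      # ascending: expect the rise to continue
        return last + up_factor * delta
    return last + down_factor * delta  # descending or flat: expect decay

trace = [0.42, 0.45, 0.51]             # recent load averages
print(tendency_predict(trace))         # 0.51 + 0.5 * 0.06 ≈ 0.54
```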
ieee international conference on high performance computing data and analytics | 2003
Sudharshan Vazhkudai; Jennifer M. Schopf
The recent proliferation of Data Grids and the increasingly common practice of using resources as distributed data stores provide a convenient environment for communities of researchers to share, replicate, and manage access to copies of large datasets. This has led to the question of which replica can be accessed most efficiently. In such environments, fetching data from one of the several replica locations requires accurate predictions of end-to-end transfer times. The answer to this question can depend on many factors, including physical characteristics of the resources and the load behavior on the CPUs, networks, and storage devices that are part of the end-to-end data path linking possible sources and sinks. Our approach combines end-to-end application throughput observations with network and disk load variations and captures whole-system performance and variations in load patterns. Our predictions characterize the effect of load variations of several shared devices (network and disk) on file transfer times. We develop a suite of univariate and multivariate predictors that can use multiple data sources to improve the accuracy of the predictions as well as address Data Grid variations (availability of data and sporadic nature of transfers). We ran a large set of data transfer experiments using GridFTP and observed performance predictions within 15% error for our testbed sites, which is quite promising for a pragmatic system.
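A toy version of a multivariate predictor in this spirit: fit observed throughput against concurrent network and disk load by ordinary least squares, then convert predicted throughput into a transfer-time estimate. The feature choice and the tiny synthetic data set are illustrative assumptions, not the paper's predictor suite.

```python
# Sketch: multivariate (least-squares) throughput model from synthetic
# observations; all data values are assumed for illustration.

import numpy as np

# Columns: [1 (intercept), network load, disk load]; target: throughput (MB/s).
X = np.array([[1, 0.2, 0.1], [1, 0.5, 0.3], [1, 0.8, 0.6], [1, 0.3, 0.7]])
y = np.array([9.1, 6.4, 3.2, 6.0])

coef, *_ = np.linalg.lstsq(X, y, rcond=None)   # ordinary least squares fit

def predict_throughput(net_load, disk_load):
    return coef @ np.array([1.0, net_load, disk_load])

def predict_transfer_time(file_size_mb, net_load, disk_load):
    return file_size_mb / predict_throughput(net_load, disk_load)

print(predict_transfer_time(500, net_load=0.4, disk_load=0.2))  # seconds
```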
merged international parallel processing symposium and symposium on parallel and distributed processing | 1998
Jennifer M. Schopf; Francine Berman
Accurate performance predictions are difficult to achieve for parallel applications executing on production distributed systems. Conventional point-valued performance parameters and prediction models are often inaccurate since they can only represent one point in a range of possible behaviors. The authors address this problem by allowing characteristic application and system data to be represented by a set of possible values and their probabilities, which they call stochastic values. They give a practical methodology for using stochastic values as parameters to adaptable performance prediction models. They demonstrate their usefulness for a distributed SOR application, showing stochastic values to be more effective than single (point) values in predicting the range of application behavior that can occur during execution in production environments.
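The stochastic-value idea is easy to sketch: an uncertain parameter becomes a set of (value, probability) pairs, and arithmetic over two such values yields a distribution of predictions rather than a point estimate. The independence assumption and the combination rule below are illustrative simplifications.

```python
# Sketch: combining two stochastic values, assuming independence.
# All parameter values are hypothetical examples.

def combine(a, b, op):
    """Combine two stochastic values [(value, prob), ...] under independence."""
    out = {}
    for va, pa in a:
        for vb, pb in b:
            v = round(op(va, vb), 6)   # round to merge equal outcomes cleanly
            out[v] = out.get(v, 0.0) + pa * pb
    return sorted(out.items())

# Per-iteration compute time and iteration count, each with two likely values:
compute = [(0.8, 0.7), (1.2, 0.3)]    # seconds per iteration, with probabilities
iters   = [(100, 0.5), (150, 0.5)]    # iterations to convergence

prediction = combine(compute, iters, op=lambda t, n: t * n)
print(prediction)   # [(80.0, 0.35), (120.0, ~0.5), (180.0, 0.15)]
```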
international performance, computing, and communications conference | 2004
Xuehai Zhang; Jennifer M. Schopf
Monitoring and information services form a key component of a distributed system, or grid. A quantitative study of such services can aid in understanding the performance limitations, advise in the deployment of the monitoring system, and help evaluate future development work. To this end, we examined the performance of the Globus Toolkit® Monitoring and Discovery Service (MDS2) by instrumenting its main services using NetLogger. Our study shows a strong advantage to caching or prefetching the data, as well as the need to have primary components at well-connected sites.