Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Dror G. Feitelson is active.

Publications


Featured research published by Dror G. Feitelson.


IEEE Transactions on Parallel and Distributed Systems | 2001

Utilization, predictability, workloads, and user runtime estimates in scheduling the IBM SP2 with backfilling

Ahuva Mu'alem; Dror G. Feitelson

Scheduling jobs on the IBM SP2 system and many other distributed-memory MPPs is usually done by giving each job a partition of the machine for its exclusive use. Allocating such partitions in the order in which the jobs arrive (FCFS scheduling) is fair and predictable, but suffers from severe fragmentation, leading to low utilization. This situation led to the development of the EASY scheduler, which uses aggressive backfilling: small jobs are moved ahead to fill in holes in the schedule, provided they do not delay the first job in the queue. We compare this approach with a more conservative approach in which small jobs move ahead only if they do not delay any job in the queue, and show that the relative performance of the two schemes depends on the workload. For workloads typical on SP2 systems, the aggressive approach is indeed better, but, for other workloads, both algorithms are similar. In addition, we study the sensitivity of backfilling to the accuracy of the runtime estimates provided by the users and find a very surprising result: backfilling actually works better when users overestimate the runtime by a substantial factor.
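
The difference between the two rules can be made concrete with a small sketch. The Python fragment below is an illustrative simplification, not the schedulers studied in the paper; the job fields and the `head_reservation` helper are assumptions. Under the aggressive (EASY) rule a candidate may jump ahead if it does not push back the reservation computed for the first queued job; the conservative variant would apply the same test against a reservation for every queued job.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Job:
    size: int                # processors requested
    runtime_estimate: float  # user-supplied runtime estimate (seconds)

def head_reservation(head: Job, running: List[Tuple[float, int]],
                     free_now: int) -> Tuple[float, int]:
    """Return (shadow_time, extra_procs): when the queue head could start if
    running jobs end at their estimated times, and how many processors would
    then be free beyond what the head needs. Hypothetical helper."""
    if free_now >= head.size:
        return 0.0, free_now - head.size
    free = free_now
    for end_time, procs in sorted(running):   # (estimated end time, procs held)
        free += procs
        if free >= head.size:
            return end_time, free - head.size
    return float("inf"), 0

def easy_can_backfill(candidate: Job, head: Job,
                      running: List[Tuple[float, int]],
                      free_now: int, now: float) -> bool:
    """Aggressive (EASY) rule: a waiting job may run ahead of its turn if it
    fits on the currently free processors and does not delay the reservation
    of the *first* queued job. The conservative rule would repeat this check
    against reservations for all queued jobs."""
    if candidate.size > free_now:
        return False
    shadow, extra = head_reservation(head, running, free_now)
    finishes_before_shadow = now + candidate.runtime_estimate <= shadow
    fits_beside_head = candidate.size <= extra
    return finishes_before_shadow or fits_beside_head
```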


Job Scheduling Strategies for Parallel Processing | 1997

Theory and Practice in Parallel Job Scheduling

Dror G. Feitelson; Larry Rudolph; Uwe Schwiegelshohn; Kenneth C. Sevcik; Parkson Wong

The scheduling of jobs on parallel supercomputers is becoming the subject of much research. However, there is concern about the divergence of theory and practice. We review theoretical research in this area, along with recommendations based on recent results. This is contrasted with a proposal for standard interfaces among the components of a scheduling system that has grown from requirements in the field.


Job Scheduling Strategies for Parallel Processing | 2004

Parallel job scheduling — a status report

Dror G. Feitelson; Larry Rudolph; Uwe Schwiegelshohn

The popularity of research on the scheduling of parallel jobs demands a periodic review of the status of the field. Indeed, several surveys have been written on this topic in the context of parallel supercomputers [17, 20]. The purpose of the present paper is to update that material, and to extend it to include work concerning clusters and the grid.


IEEE Transactions on Parallel and Distributed Systems | 2007

Backfilling Using System-Generated Predictions Rather than User Runtime Estimates

Dan Tsafrir; Yoav Etsion; Dror G. Feitelson

The most commonly used scheduling algorithm for parallel supercomputers is FCFS with backfilling, as originally introduced in the EASY scheduler. Backfilling means that short jobs are allowed to run ahead of their time provided they do not delay previously queued jobs (or at least the first queued job). Making such determinations requires job runtime information, usually in the form of user-supplied estimates, which have repeatedly been shown to be inaccurate; system-generated predictions based on historical data can be significantly better. However, predictions have not been incorporated into production schedulers, partially due to a misconception (that we resolve) claiming inaccuracy actually improves performance, but mainly because underprediction is technically unacceptable: users will not tolerate jobs being killed just because system predictions were too short. We solve this problem by divorcing kill-time from the runtime prediction and correcting predictions adaptively as needed if they are proved wrong. The end result is a surprisingly simple scheduler, which requires minimal deviations from current practices (e.g., using FCFS as the basis) and behaves exactly like EASY as far as users are concerned; nevertheless, it achieves significant improvements in performance, predictability, and accuracy. Notably, this is based on a very simple runtime predictor that just averages the runtimes of the last two jobs by the same user; counterintuitively, our results indicate that using recent data is more important than mining the history for similar jobs. All the techniques suggested in this paper can be used to enhance any backfilling algorithm and are not limited to EASY.
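
A rough sketch of the predictor described above, averaging the same user's last two runtimes and correcting the prediction instead of killing a job that outlives it, is shown below. The class and its interface are assumptions made for illustration, not code from the paper.

```python
from collections import defaultdict, deque

class RecentUserPredictor:
    """Predict a job's runtime as the average of the last two runtimes by the
    same user, falling back to the user's own estimate when there is no
    history. The kill-time remains tied to the user estimate, so a job that
    outlives its (too short) prediction is extended rather than killed."""

    def __init__(self):
        # Per-user history of the two most recent completed runtimes.
        self.history = defaultdict(lambda: deque(maxlen=2))

    def predict(self, user, user_estimate):
        runs = self.history[user]
        return sum(runs) / len(runs) if runs else user_estimate

    def correct(self, prediction, elapsed, user_estimate):
        # Adaptive correction: if the job has already run past its prediction,
        # push the prediction forward instead of killing the job.
        if elapsed >= prediction:
            return max(user_estimate, elapsed * 1.1)
        return prediction

    def job_completed(self, user, runtime):
        self.history[user].append(runtime)
```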


Job Scheduling Strategies for Parallel Processing | 1996

Packing Schemes for Gang Scheduling

Dror G. Feitelson

Jobs that do not require all processors in the system can be packed together for gang scheduling. We examine accounting traces from several parallel computers to show that indeed many jobs have small sizes and can be packed together. We then formulate a number of such packing algorithms, and evaluate their effectiveness using simulations based on our workload study. Two algorithms turn out to be the best: one performs the mapping based on a buddy system of processors, and the other uses migration to re-map the jobs more tightly whenever a job arrives or terminates. Other approaches, such as mapping to the least loaded PEs, proved to be counterproductive. The buddy system approach depends on the capability to gang-schedule jobs in multiple slots, if there is space. The migration algorithm is more robust, but is expected to suffer greatly due to the overhead of the migration itself. In either case fragmentation is not an issue, and utilization may top 90% with sufficiently high loads.
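
The buddy-system mapping mentioned above can be sketched as a standard buddy allocator over the processors: a job is rounded up to a power-of-two block, the smallest sufficiently large free block is split down to that size, and freed blocks are merged with their buddies. This is a generic sketch under assumed data structures, not the algorithm exactly as evaluated in the paper.

```python
def buddy_allocate(free_blocks, size):
    """Allocate a power-of-two processor block of at least `size`.
    `free_blocks` maps block size -> list of free block start indices
    (initially {total_procs: [0]}). Returns (start, block_size) or None."""
    need = 1
    while need < size:
        need *= 2
    fitting = [b for b, starts in free_blocks.items() if b >= need and starts]
    if not fitting:
        return None
    block = min(fitting)
    start = free_blocks[block].pop(0)
    while block > need:                      # split, keeping the upper buddy free
        block //= 2
        free_blocks.setdefault(block, []).append(start + block)
    return start, block

def buddy_free(free_blocks, start, block, total_procs):
    """Return a block, merging with its buddy whenever the buddy is also free."""
    while block < total_procs:
        buddy = start ^ block                # buddy address differs in one bit
        peers = free_blocks.get(block, [])
        if buddy in peers:
            peers.remove(buddy)
            start = min(start, buddy)
            block *= 2
        else:
            break
    free_blocks.setdefault(block, []).append(start)

# Example: a 6-processor job on a 32-processor machine gets an 8-processor
# buddy block; jobs sharing a block can be gang-scheduled in alternate slots.
free = {32: [0]}
print(buddy_allocate(free, 6))   # (0, 8); blocks 8-15 and 16-31 remain free
```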


Journal of Parallel and Distributed Computing | 1992

Gang scheduling performance benefits for fine-grain synchronization

Dror G. Feitelson; Larry Rudolph

Multiprogrammed multiprocessors executing fine-grain parallel programs appear to require new scheduling policies. A promising new idea is gang scheduling, where a set of threads is scheduled to execute simultaneously on a set of processors. This has the intuitive appeal of supplying the threads with an environment that is very similar to a dedicated machine. It allows the threads to interact efficiently by using busy waiting, without the risk of waiting for a thread that is not currently running. Without gang scheduling, threads have to block in order to synchronize, thus suffering the overhead of a context switch. While this is tolerable in coarse-grain computations, and might even lead to performance benefits if the threads are highly unbalanced, it causes severe performance degradation in the fine-grain case. We have developed a model to evaluate the performance of different combinations of synchronization mechanisms and scheduling policies, and validated it by an implementation on the Makbilan multiprocessor. The model leads to the conclusion that gang scheduling is required for efficient fine-grain synchronization on multiprogrammed multiprocessors.
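
The core of the argument is a cost comparison: blocking costs roughly a context switch per synchronization, while busy-waiting under gang scheduling costs only a short spin, so the ratio of synchronization cost to the work done between synchronizations decides which wins. The tiny model below uses invented, order-of-magnitude costs purely for illustration; it is not the model developed in the paper.

```python
def relative_overhead(grain, sync_cost):
    """Synchronization overhead as a fraction of the useful work (grain)
    performed between consecutive synchronization points."""
    return sync_cost / grain

CONTEXT_SWITCH = 100   # cost of blocking and being rescheduled (assumed units)
BUSY_WAIT = 2          # typical spin when partners run together (assumed units)

for grain in (10, 100, 10_000):   # fine, medium, coarse grain
    blocking = relative_overhead(grain, CONTEXT_SWITCH)
    spinning = relative_overhead(grain, BUSY_WAIT)
    print(f"grain={grain:>6}: blocking {blocking:>6.2f}x work, "
          f"busy-wait under gang scheduling {spinning:.3f}x work")
```

For fine-grain work the blocking overhead dwarfs the useful computation, while busy waiting stays cheap, which is the degradation the abstract describes.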


ACM Transactions on Computer Systems | 1996

The Vesta parallel file system

Peter F. Corbett; Dror G. Feitelson

The Vesta parallel file system is designed to provide parallel file access to application programs running on multicomputers with parallel I/O subsystems. Vesta uses a new abstraction of files: a file is not a sequence of bytes, but rather it can be partitioned into multiple disjoint sequences that are accessed in parallel. The partitioning, which can also be changed dynamically, reduces the need for synchronization and coordination during the access. Some control over the layout of data is also provided, so the layout can be matched with the anticipated access patterns. The system is fully implemented and forms the basis for the AIX Parallel I/O File System on the IBM SP2. The implementation does not compromise scalability or parallelism. In fact, all data accesses are done directly to the I/O node that contains the requested data, without any indirection or access to shared metadata. Disk mapping and caching functions are confined to each I/O node, so there is no need to keep data coherent across nodes. Performance measurements show good scalability with increased resources. Moreover, different access patterns are shown to achieve similar performance.
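
Vesta's key abstraction, a file split into disjoint sub-files that different processes access independently, can be illustrated with a simple round-robin record-to-sub-file mapping. The function below is only a sketch of the idea; Vesta's actual two-dimensional layout parameters are richer than this.

```python
def subfile_of(record_index, num_subfiles):
    """Map a logical record to (sub-file id, offset within that sub-file)
    under a simple round-robin partitioning."""
    return record_index % num_subfiles, record_index // num_subfiles

# Each process accesses only its own sub-file, so writes by different
# processes never overlap and need no cross-process locking or shared
# metadata on the I/O nodes.
layout = {}
for record in range(12):
    sub, offset = subfile_of(record, num_subfiles=4)
    layout.setdefault(sub, []).append((offset, record))
print(layout[0])   # [(0, 0), (1, 4), (2, 8)] -> records 0, 4, 8 land in sub-file 0
```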


Merged International Parallel Processing Symposium and Symposium on Parallel and Distributed Processing | 1998

Utilization and predictability in scheduling the IBM SP2 with backfilling

Dror G. Feitelson; Ahuva Mu’alem Weil

Scheduling jobs on the IBM SP2 system is usually done by giving each job a partition of the machine for its exclusive use. Allocating such partitions in the order that the jobs arrive (FCFS scheduling) is fair and predictable, but suffers from severe fragmentation, leading to low utilization. This motivated Argonne National Lab, where the first large SP1 was installed, to develop the EASY scheduler. This scheduler, which has since been adopted by many other SP2 sites, uses aggressive backfilling: small jobs are moved ahead to fill in holes in the schedule, provided they do not delay the first job in the queue. We show that a more conservative approach, in which small jobs move ahead only if they do not delay any job in the queue, produces essentially the same benefits in terms of utilization. Our conservative scheme has the added advantage that queueing times can be predicted in advance, whereas in EASY the queueing time is …
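
The predictability claim follows from the fact that conservative backfilling gives every job a reservation when it is submitted, so its worst-case start time is known immediately. The sketch below computes such a reservation against existing ones; the data structures are assumptions made for illustration, not the simulator used in the paper.

```python
def earliest_start(reservations, size, runtime, total_procs, now=0.0):
    """Earliest start time for a job needing `size` processors for `runtime`,
    given existing reservations as (start, end, procs) triples. Conservative
    backfilling records the result as a firm reservation for every queued job,
    so the queueing delay is bounded and known at submission time."""
    def used_at(t):
        return sum(p for s, e, p in reservations if s <= t < e)

    candidates = sorted({now} | {e for _, e, _ in reservations if e >= now})
    for t in candidates:
        checkpoints = {t} | {s for s, _, _ in reservations if t < s < t + runtime}
        if all(used_at(c) + size <= total_procs for c in checkpoints):
            return t
    return candidates[-1]   # unreachable when size <= total_procs

# Example: a 4-processor machine with one job holding 3 processors until t=100.
reservations = [(0.0, 100.0, 3)]
print(earliest_start(reservations, size=1, runtime=50.0, total_procs=4))  # 0.0
print(earliest_start(reservations, size=2, runtime=50.0, total_procs=4))  # 100.0
```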


Job Scheduling Strategies for Parallel Processing | 1997

Improved Utilization and Responsiveness with Gang Scheduling

Dror G. Feitelson; Morris A. Jette

Most commercial multicomputers use space-slicing schemes in which each scheduling decision has an unknown impact on the future: should a job be scheduled, risking that it will block other larger jobs later, or should the processors be left idle for now in anticipation of future arrivals? This dilemma is solved by using gang scheduling, because then the impact of each decision is limited to its time slice, and future arrivals can be accommodated in other time slices. This added flexibility is shown to improve overall system utilization and responsiveness. Empirical evidence from using gang scheduling on a Cray T3D installed at Lawrence Livermore National Lab corroborates these results, and shows conclusively that gang scheduling can be very effective with current technology.
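
The flexibility argument can be visualized with an Ousterhout-style scheduling matrix: rows are time slots, columns are processors, and an arriving job only needs enough free columns in some row, not in the single current space allocation. The snippet below is a generic sketch under assumed structures, not the Cray T3D gang scheduler itself.

```python
def place_job(matrix, job_id, size, num_procs):
    """Place a job needing `size` processors into the first time slot (row)
    with enough free (None) columns; open a new time slot if none fits.
    The rows are then cycled round-robin, one per scheduling quantum."""
    for row in matrix:
        free = [i for i, owner in enumerate(row) if owner is None]
        if len(free) >= size:
            for i in free[:size]:
                row[i] = job_id
            return
    new_row = [None] * num_procs
    for i in range(size):
        new_row[i] = job_id
    matrix.append(new_row)

# 4-processor machine: a 3-processor and a 2-processor job cannot share one
# space allocation, but under gang scheduling they simply alternate in two
# time slots instead of one blocking the other.
m = []
place_job(m, "A", 3, num_procs=4)
place_job(m, "B", 2, num_procs=4)
print(m)   # [['A', 'A', 'A', None], ['B', 'B', None, None]]
```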


Job Scheduling Strategies for Parallel Processing | 1995

Job Characteristics of a Production Parallel Scientific Workload on the NASA Ames iPSC/860

Dror G. Feitelson; Bill Nitzberg

Statistics of a parallel workload on a 128-node iPSC/860 located at NASA Ames are presented. It is shown that while the number of sequential jobs dominates the number of parallel jobs, most of the resources (measured in node-seconds) were consumed by parallel jobs. Moreover, most of the sequential jobs were for system administration. The average runtime of jobs grew with the number of nodes used, so the total resource requirements of large parallel jobs grew more than proportionally to the number of nodes they used. The job submission rate during peak day activity was somewhat lower than one every two minutes, and the average job size was small. At night, the submission rate was low but job sizes and system utilization were high, mainly due to NQS. Submission rate and utilization over the weekend were lower than on weekdays. The overall utilization was 50%, after accounting for downtime. About 2/3 of the applications were executed repeatedly, some a significant number of times.

Collaboration


Dive into Dror G. Feitelson's collaborations.

Top Co-Authors

Larry Rudolph, Massachusetts Institute of Technology
Yoav Etsion, Technion – Israel Institute of Technology
Dan Tsafrir, Technion – Israel Institute of Technology
Uwe Schwiegelshohn, Technical University of Dortmund
Ahmad Jbara, Hebrew University of Jerusalem
David Talby, Hebrew University of Jerusalem
Netanel Zakay, Hebrew University of Jerusalem