Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Umit Rencuzogullari is active.

Publication


Featured research published by Umit Rencuzogullari.


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2001

Dynamic adaptation to available resources for parallel computing in an autonomous network of workstations

Umit Rencuzogullari; Sandhya Dwarkadas

Networks of workstations (NOWs), which are generally composed of autonomous compute elements networked together, are an attractive parallel computing platform since they offer high performance at low cost. The autonomous nature of the environment, however, often results in inefficient utilization due to load imbalances caused by three primary factors: 1) unequal load (compute or communication) assignment to equally-powerful compute nodes, 2) unequal resources at compute nodes, and 3) multiprogramming. These load imbalances result in idle waiting time for cooperating processes that need to synchronize or communicate data. Additional waiting time may result from local scheduling decisions in a multiprogrammed environment. In this paper, we present a combined approach of compile-time analysis, run-time load distribution, and operating system scheduler cooperation for improved utilization of available resources in an autonomous NOW. The techniques we propose allow efficient resource utilization by taking into consideration all three causes of load imbalance, in addition to locality of access, in the process of load distribution. The resulting adaptive load distribution and cooperative scheduling system allows applications to take advantage of parallel resources when available, providing better performance than if the loaded resources were not used at all.
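
The run-time half of this approach can be pictured as proportional loop partitioning. The sketch below is a minimal illustration, not code from the paper (which integrates this with compile-time analysis and scheduler cooperation); the function name and capacity model are assumptions:

    def partition_iterations(total_iters, capacities):
        """Split a parallel loop's iterations among nodes in proportion
        to each node's currently available compute capacity."""
        total_cap = sum(capacities)
        shares = [int(total_iters * c / total_cap) for c in capacities]
        # Give any rounding leftovers to the node with the largest share.
        shares[shares.index(max(shares))] += total_iters - sum(shares)
        return shares

    # Example: 1000 iterations over three nodes, one of which has only half
    # its capacity free because a competing local job is running.
    print(partition_iterations(1000, [1.0, 1.0, 0.5]))  # -> [400, 400, 200]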


High Performance Computer Architecture | 2000

The effect of network total order, broadcast, and remote-write capability on network-based shared memory computing

Robert J. Stets; Sandhya Dwarkadas; Leonidas I. Kontothanassis; Umit Rencuzogullari; Michael L. Scott

Emerging system-area networks provide a variety of features that can dramatically reduce network communication overhead. In this paper, we evaluate the impact of such features on the implementation of Software Distributed Shared Memory (SDSM), and on the Cashmere system in particular. Cashmere has been implemented on the Compaq Memory Channel network, which supports low-latency messages, protected remote memory writes, inexpensive broadcast, and total ordering of network packets. Our evaluation is based on several Cashmere protocol variants, ranging from a protocol that fully leverages the Memory Channel's special features to one that uses the network only for fast messaging. We find that the special features improve performance by 18-44% for three of our applications, but by less than 12% for our other seven applications. We also find that home node migration, an optimization available only in the message-based protocol, can improve performance by as much as 67%. These results suggest that for systems of modest size, low latency is much more important for SDSM performance than are remote writes, broadcast, or total ordering. At the same time, results on an emulated 32-node system indicate that broadcast based on remote writes of widely-shared data may improve performance by up to 51% for some applications. If hardware broadcast or multicast facilities can be made to scale, they can be beneficial in future system-area networks.
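
As a rough intuition for the broadcast result, compare the messages needed to propagate an update of widely-shared data to R readers. The toy model below is an illustration only, not the Cashmere protocol:

    def msgs_point_to_point(readers):
        # Each reader requests and then receives the data: 2 messages apiece.
        return 2 * readers

    def msgs_broadcast(readers):
        # A single remote-write broadcast reaches every reader at once.
        return 1

    for r in (4, 8, 32):
        print(f"{r} readers: {msgs_point_to_point(r)} point-to-point messages"
              f" vs {msgs_broadcast(r)} broadcast")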


Scientific Programming | 1999

CRAUL: Compiler and run-time integration for adaptation under load (this work was supported in part by NSF grants CDA-9401142, CCR-9702466, and CCR-9705594, and an external research grant from Compaq)

Sotiris Ioannidis; Umit Rencuzogullari; Robert J. Stets; Sandhya Dwarkadas

Clusters of workstations provide a cost-effective, high performance parallel computing environment. These environments, however, are often shared by multiple users, or may consist of heterogeneous machines. As a result, parallel applications executing in these environments must operate despite unequal computational resources. For maximum performance, applications should automatically adapt execution to maximize use of the available resources. Ideally, this adaptation should be transparent to the application programmer. In this paper, we present CRAUL (Compiler and Run-Time Integration for Adaptation Under Load), a system that dynamically balances computational load in a parallel application. Our target run-time is software-based distributed shared memory (SDSM). SDSM is a good target for parallelizing compilers since it reduces compile-time complexity by providing data caching and other support for dynamic load balancing. CRAUL combines compile-time support to identify data access patterns with a run-time system that uses the access information to intelligently distribute the parallel workload in loop-based programs. The distribution is chosen according to the relative power of the processors, and so as to minimize SDSM overhead and maximize locality. We have evaluated the resulting load distribution in the presence of different types of load: computational, computational and memory intensive, and network load. CRAUL performs within 5-23% of ideal in the presence of load, and is able to improve on naive compiler-based work distribution that does not take locality into account even in the absence of load.
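
The distribution policy can be pictured as a power-weighted split of a loop's index range whose boundaries are snapped to page boundaries, so that each node's share matches its relative speed while SDSM false sharing is kept low. The sketch below is hypothetical; the page size, element size, and function name are assumptions, not details from the paper:

    PAGE_BYTES = 4096
    ELEM_BYTES = 8  # e.g., one double per array element

    def power_weighted_split(n_elems, powers):
        """Return (start, end) index ranges, one per node, proportional to
        each node's relative power, rounded to page boundaries."""
        per_page = PAGE_BYTES // ELEM_BYTES
        total = sum(powers)
        bounds, start = [], 0
        for i, p in enumerate(powers):
            if i == len(powers) - 1:
                end = n_elems  # last node absorbs the remainder
            else:
                raw = start + n_elems * p / total
                end = max(start, min(n_elems, round(raw / per_page) * per_page))
            bounds.append((start, end))
            start = end
        return bounds

    # Two full-speed nodes and one half-loaded node sharing a 1M-element loop.
    print(power_weighted_split(1_000_000, [1.0, 1.0, 0.5]))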


International Conference on Parallel Processing | 2002

A technique for adaptation to available resources on clusters independent of synchronization methods used

Umit Rencuzogullari; Sandhya Dwarkadas

Clusters of workstations (COWs) offer high performance relative to their cost. Generally these clusters operate as autonomous systems running independent copies of the operating system, where access to machines is not controlled and all users enjoy the same access privileges. While these features are desirable and reduce operating costs, they create adverse effects on parallel applications running on these clusters. Load imbalances are common for parallel applications on COWs due to: 1) variable amounts of load on nodes caused by an inherent lack of parallelism, 2) variable resource availability on nodes, and 3) independent scheduling decisions made by the scheduler on each node. Our earlier study has shown that an approach combining static program analysis, dynamic load balancing, and scheduler cooperation is effective in countering these adverse effects. In our current study, we investigate the scalability of our approach as the number of processors is increased. We further relax the requirement of global synchronization, avoiding the need to use barriers and allowing the use of any other synchronization primitives while still achieving dynamic load balancing. The use of alternative synchronization primitives avoids the inherent vulnerability of barriers to load imbalance. It also allows load balancing to take place at any point in the course of execution, rather than only at a synchronization point, potentially reducing the time the application runs imbalanced. Moreover, load readjustment decisions are made in a distributed fashion, preventing any need for processes to globally synchronize in order to redistribute load.
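
One way to picture barrier-free load readjustment is as pairwise rebalancing: a node compares its remaining work against a single peer and shifts iterations locally, so no global synchronization point is ever required. The sketch below is a simplified illustration of that idea, not the paper's mechanism:

    def rebalance_pairwise(remaining, i, j, threshold=0.1):
        """If node i has far less remaining work than node j, move half of
        the difference from j to i. Only the pair (i, j) participates."""
        if remaining[j] - remaining[i] > threshold * remaining[j]:
            moved = (remaining[j] - remaining[i]) // 2
            remaining[j] -= moved
            remaining[i] += moved
        return remaining

    # Node 0 is nearly done while node 1 lags; they even out without
    # involving any other node or waiting at a barrier.
    print(rebalance_pairwise([300, 700], 0, 1))  # -> [500, 500]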


Archive | 2009

Reducing Power Consumption in a Server Cluster

Alok Kumar Gupta; Minwen Ji; Timothy Mann; Tahir Mobashir; Umit Rencuzogullari; Ganesha Shanmuganathan; Limin Wang; Anne Holler


Archive | 2010

System and Method for Automatically Optimizing Capacity Between Server Clusters

Xianan Zhang; Eddie Ma; Umit Rencuzogullari; Irfan Ahmad; Orran Krieger; Mukil Kesavan


ACM Transactions on Computer Systems | 2005

Shared memory computing on clusters with symmetric multiprocessors and system area networks

Leonidas I. Kontothanassis; Robert J. Stets; Galen C. Hunt; Umit Rencuzogullari; Gautam Altekar; Sandhya Dwarkadas; Michael L. Scott


Operating Systems Review | 2000

Interweave: object caching meets software distributed shared memory

Michael L. Scott; Sandhya Dwarkadas; Srinivasan Parthasarathy; Rajeev Balasubramonian; DeQing Chen; Grigorios Magklis; Athanasios E. Papathanasiou; Eduardo Pinheiro; Umit Rencuzogullari; Chunqiang Tang


Archive | 1999

The Implementation of Cashmere

Robert J. Stets; DeQing Chen; Sandhya Dwarkadas; Nikolaos Hardavellas; Galen C. Hunt; Leonidas I. Kontothanassis; Grigorios Magklis; Srinivasan Parthasarathy; Umit Rencuzogullari; Michael L. Scott


2004

Dynamic resource management for parallel applications in an autonomous cluster of workstations

Umit Rencuzogullari; Sandhya Dwarkadas

Collaboration


Dive into Umit Rencuzogullari's collaboration.

Top Co-Authors

DeQing Chen

University of Rochester
