Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Liviu Iftode is active.

Publication


Featured research published by Liviu Iftode.


IEEE Journal on Selected Areas in Communications | 1995

Improving the performance of reliable transport protocols in mobile computing environments

Ramón Cáceres; Liviu Iftode

We explore the performance of reliable data communication in mobile computing environments. Motion across wireless cell boundaries causes increased delays and packet losses while the network learns how to route data to a host's new location. Reliable transport protocols like TCP interpret these delays and losses as signs of network congestion. They consequently throttle their transmissions, further degrading performance. We quantify this degradation through measurements of protocol behavior in a wireless networking testbed. We show how current TCP implementations introduce unacceptably long pauses in communication during cellular handoffs (800 ms and longer), and propose an end-to-end fast retransmission scheme that can reduce these pauses to levels more suitable for human interaction (200 ms). Our work makes clear the need for reliable transport protocols to differentiate between motion-related and congestion-related packet losses and suggests how to adapt these protocols to perform better in mobile computing environments.
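The key idea is distinguishing motion-related losses from congestion. The sketch below is a minimal illustration, not the authors' implementation; the class, method names, and the use of the 800 ms and 200 ms figures from the abstract as constants are my own framing. It contrasts a timeout-driven sender, which collapses its congestion window, with one that retransmits immediately on an explicit handoff-completion signal.

```python
# Illustrative sketch: timeout-driven recovery vs. handoff-triggered fast retransmit.

RTO_MS = 800          # coarse retransmission timeout pause cited in the abstract
FAST_RETX_MS = 200    # pause achievable with handoff-triggered fast retransmission

class Sender:
    def __init__(self):
        self.cwnd = 8            # congestion window, in segments
        self.pause_ms = 0        # time spent stalled

    def on_timeout(self):
        # Standard TCP: any loss is treated as congestion, so back off hard.
        self.cwnd = 1
        self.pause_ms += RTO_MS

    def on_handoff_signal(self):
        # Fast retransmission scheme: the loss was motion-related, so
        # retransmit at once and leave the congestion window alone.
        self.pause_ms += FAST_RETX_MS

plain, assisted = Sender(), Sender()
plain.on_timeout()
assisted.on_handoff_signal()
print(f"timeout-driven pause: {plain.pause_ms} ms, cwnd={plain.cwnd}")
print(f"handoff-triggered pause: {assisted.pause_ms} ms, cwnd={assisted.cwnd}")
```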


ACM Symposium on Parallel Algorithms and Architectures | 1996

Scope consistency: a bridge between release consistency and entry consistency

Liviu Iftode; Jaswinder Pal Singh; Kai Li

Systems that maintain coherence at large granularity, such as shared virtual memory systems, suffer from false sharing and extra communication. Relaxed memory consistency models have been used to alleviate these problems, but at a cost in programming complexity. Release Consistency (RC) and Lazy Release Consistency (LRC) are accepted to offer a reasonable tradeoff between performance and programming complexity. Entry Consistency (EC) offers a more relaxed consistency model, but it requires explicit association of shared data objects with synchronization variables. The programming burden of providing such associations can be substantial.
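To illustrate the programming-model point, here is a toy sketch, my own construction rather than the paper's protocol, of how a Scope Consistency-style system could associate writes with the lock whose critical section encloses them, so the explicit data-to-lock bindings that Entry Consistency requires are unnecessary.

```python
# Toy model of scope-based association: writes made while a lock's scope is
# open become visible to the next acquirer of that same lock.

class ScopeConsistentMemory:
    def __init__(self):
        self.per_scope_updates = {}   # updates remembered per lock (scope)

    def acquire(self, lock, local_view):
        # Opening a scope pulls in updates from earlier sessions of this scope.
        local_view.update(self.per_scope_updates.get(lock, {}))
        return {"lock": lock, "writes": {}}

    def write(self, session, key, value, local_view):
        local_view[key] = value
        session["writes"][key] = value   # implicitly tied to the open scope

    def release(self, session):
        # Closing the scope publishes its writes; no explicit bind() calls needed.
        self.per_scope_updates.setdefault(session["lock"], {}).update(session["writes"])

mem = ScopeConsistentMemory()
p0_view, p1_view = {}, {}
s = mem.acquire("L", p0_view)
mem.write(s, "x", 42, p0_view)
mem.release(s)
mem.acquire("L", p1_view)        # the next acquirer of L sees x = 42
print(p1_view["x"])
```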


Operating Systems Design and Implementation | 1996

Performance evaluation of two home-based lazy release consistency protocols for shared virtual memory systems

Yuanyuan Zhou; Liviu Iftode; Kai Li

This paper investigates the performance of shared virtual memory protocols on large-scale multicomputers. Using experiments on a 64-node Paragon, we show that the traditional Lazy Release Consistency (LRC) protocol does not scale well, because of the large number of messages it requires, the large amount of memory it consumes for protocol overhead data, and because of the difficulty of garbage collecting that data. To achieve more scalable performance, we introduce and evaluate two new protocols. The first, Home-based LRC (HLRC), is based on the Automatic Update Release Consistency (AURC) protocol. Like AURC, HLRC maintains a home for each page to which all updates are propagated and from which all copies are derived. Unlike AURC, HLRC requires no specialized hardware support. We find that the use of homes provides substantial improvements in performance and scalability over LRC. Our second protocol, called Overlapped Home-based LRC (OHLRC), takes advantage of the communication processor found on each node of the Paragon to offload some of the protocol overhead of HLRC from the critical path followed by the compute processor. We find that OHLRC provides modest improvements over HLRC. We also apply overlapping to the base LRC protocol, with similar results. Our experiments were done using five of the Splash-2 benchmarks. We report overall execution times, as well as detailed breakdowns of elapsed time, message traffic, and memory use for each of the protocols.
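A rough sketch of the home-based idea follows; the classes and diffing helper are illustrative simplifications, not the OSDI '96 code. At release time a writer sends its diffs to the page's home node, and a node that later faults on the page fetches one full, up-to-date copy from that home rather than collecting diffs from many writers.

```python
# Minimal home-based LRC sketch: diffs flow to the home, copies come from the home.

class Home:
    def __init__(self, page_size=8):
        self.page = bytearray(page_size)

    def apply_diff(self, diff):
        for offset, byte in diff.items():   # all updates propagate to the home
            self.page[offset] = byte

    def fetch(self):
        return bytes(self.page)             # all copies are derived from the home

def make_diff(twin, working):
    # Compare the clean twin against the dirty working copy (standard LRC diffing).
    return {i: working[i] for i in range(len(twin)) if twin[i] != working[i]}

home = Home()
twin = home.fetch()                          # clean copy taken at the write fault
working = bytearray(twin)
working[3] = 0xAB                            # local write on some node
home.apply_diff(make_diff(twin, working))    # release: push the diff to the home
print(home.fetch().hex())                    # a faulting node would fetch this page
```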


High-Performance Computer Architecture | 1996

Improving release-consistent shared virtual memory using automatic update

Liviu Iftode; Cezary Dubnicki; Edward W. Felten; Kai Li

Shared virtual memory is a software technique to provide shared memory on a network of computers without special hardware support. Although several relaxed consistency models and implementations are quite effective, there is still a considerable performance gap between the software-only approach and the hardware approach that uses directory-based caches. Automatic update is a simple communication mechanism, implemented in the SHRIMP multicomputer, that forwards local writes to remote memory transparently. In this paper we propose a new lazy release consistency based protocol, called Automatic Update Release Consistency (AURC), that uses automatic update to propagate and merge shared memory modifications. We compare the performance of this protocol against a software-only LRC implementation on several Splash-2 applications and show that the AURC approach can substantially improve the performance of LRC. For 16 processors, the average speedup has increased from 5.9 under LRC to 8.3 under AURC.
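The sketch below models the assumed effect of automatic update; the class is hypothetical, since the real mechanism is hardware write forwarding in the SHRIMP network interface. Once a local page is bound to a remote home, every local store is also mirrored into the home copy, so the home stays current without software diffing.

```python
# Rough sketch of an automatic-update binding: local stores are mirrored to
# the home node's memory, keeping the home copy current for AURC.

class AutoUpdatePage:
    def __init__(self, size, home_page):
        self.local = bytearray(size)
        self.home = home_page            # remote memory the writes are forwarded to

    def store(self, offset, byte):
        self.local[offset] = byte
        # In SHRIMP this forwarding is done transparently by the network
        # interface; here it is modeled as an immediate mirrored write.
        self.home[offset] = byte

home_copy = bytearray(8)
page = AutoUpdatePage(8, home_copy)
page.store(2, 0x7F)
print(home_copy.hex())   # the home already reflects the local write
```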


International Conference on Parallel Processing | 1996

Software support for virtual memory-mapped communication

Cezary Dubnicki; Liviu Iftode; Edward W. Felten; Kai Li

Virtual memory-mapped communication (VMMC) is a communication model providing direct data transfer between the sender's and receiver's virtual address spaces. This model eliminates operating system involvement in communication, provides full protection, supports user-level buffer management and zero-copy protocols, and minimizes software communication overhead. This paper describes system software support for the model including its API, operating system support, and software architecture, for two network interfaces designed in the SHRIMP project. Our implementations and experiments show that the VMMC model can indeed expose the available hardware performance to user programs. On two Pentium PCs with our prototype network interface hardware over a network, we have achieved user-to-user latency of 4.8 μs and sustained bandwidth of 23 MB/s, which is close to the peak hardware bandwidth. Software communication overhead is only a few user-level instructions.
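A hypothetical API sketch of the export/import/send pattern the model implies; the function names are mine, not the actual VMMC interface. The receiver exports a buffer in its virtual address space, the sender imports it, and a send deposits data directly into that memory with no kernel involvement or receive-side copy.

```python
# Sketch of a VMMC-style export/import/send flow between two user processes.

class VMMC:
    def __init__(self):
        self._exports = {}

    def export_recv_buffer(self, owner, name, size):
        buf = bytearray(size)
        self._exports[(owner, name)] = buf       # region made importable by peers
        return buf

    def import_recv_buffer(self, owner, name):
        return self._exports[(owner, name)]      # handle to the remote region

    @staticmethod
    def send(dest_buf, offset, data):
        dest_buf[offset:offset + len(data)] = data   # zero-copy style deposit

net = VMMC()
recv = net.export_recv_buffer("nodeB", "ring0", 64)    # receiver side
remote = net.import_recv_buffer("nodeB", "ring0")      # sender side
VMMC.send(remote, 0, b"hello")                         # data lands in the receiver's memory
print(recv[:5])
```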


Digest of Papers. Compcon Spring | 1993

Memory servers for multicomputers

Liviu Iftode; Kai Li; Karin Petersen

A virtual memory management technique for multicomputers called memory servers is investigated. The memory server model extends the memory hierarchy of multicomputers by introducing a remote memory server layer. Memory servers are multicomputer nodes whose memory is used for fast backing storage and logically lie between the local physical memory and disks. The authors present the model, describe how the model supports sequential programs, message-passing programs, and shared virtual memory systems, discuss several design issues, and show preliminary results of a prototype implementation on an Intel iPSC/860.
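The sketch below captures that layering under assumed policies of my own: evicted pages go to the remote memory server first, spill to disk only when the server is full, and page-ins check the server before disk.

```python
# Simplified memory-server layer: remote DRAM sits between local memory and disk.

class MemoryServer:
    def __init__(self, capacity_pages):
        self.capacity = capacity_pages
        self.pages = {}

    def store(self, page_id, data):
        if len(self.pages) < self.capacity:
            self.pages[page_id] = data
            return True
        return False                      # server full; caller falls back to disk

class BackingStore:
    def __init__(self, server):
        self.server = server
        self.disk = {}

    def page_out(self, page_id, data):
        if not self.server.store(page_id, data):
            self.disk[page_id] = data

    def page_in(self, page_id):
        if page_id in self.server.pages:  # fast path: served from remote memory
            return self.server.pages[page_id]
        return self.disk[page_id]         # slow path: served from disk

store = BackingStore(MemoryServer(capacity_pages=1))
store.page_out(1, b"hot page")     # lands in remote memory
store.page_out(2, b"cold page")    # spills to disk
print(store.page_in(1), store.page_in(2))
```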


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 1997

Relaxed consistency and coherence granularity in DSM systems: a performance evaluation

Yuanyuan Zhou; Liviu Iftode; Jaswinder Pal Singh; Kai Li; Brian R. Toonen; Ioannis Schoinas; Mark D. Hill; David A. Wood

During the past few years, two main approaches have been taken to improve the performance of software shared memory implementations: relaxing consistency models and providing fine-grained access control. Their performance tradeoffs, however, were not well understood. This paper studies these tradeoffs on a platform that provides access control in hardware but runs coherence protocols in software. We compare the performance of three protocols across four coherence granularities, using 12 applications on a 16-node cluster of workstations. Our results show that no single combination of protocol and granularity performs best for all the applications. The combination of a sequentially consistent (SC) protocol and fine granularity works well with 7 of the 12 applications. The combination of a multiple-writer, home-based lazy release consistency (HLRC) protocol and page granularity works well with 8 out of the 12 applications. For applications that suffer performance losses in moving to coarser granularity under sequential consistency, the performance can usually be regained quite effectively using relaxed protocols, particularly HLRC. We also find that the HLRC protocol performs substantially better than a single-writer lazy release consistent (SW-LRC) protocol at coarse granularity for many irregular applications. For our applications and platform, when we use the original versions of the applications ported directly from hardware-coherent shared memory, we find that the SC protocol with 256-byte granularity performs best on average. However, when the best versions of the applications are compared, the balance shifts in favor of HLRC at page granularity.


International Symposium on Computer Architecture | 1996

Understanding Application Performance on Shared Virtual Memory Systems

Jaswinder Pal Singh; Kai Li; Liviu Iftode

Many researchers have proposed interesting protocols for shared virtual memory (SVM) systems, and demonstrated performance improvements on parallel programs. However, there is still no clear understanding of the performance potential of SVM systems for different classes of applications. This paper begins to fill this gap, by studying the performance of a range of applications in detail and understanding it in light of application characteristics. We first develop a brief classification of the inherent data sharing patterns in the applications, and how they interact with system granularities to yield the communication patterns relevant to SVM systems. We then use detailed simulation to compare the performance of two SVM approaches---Lazy Release Consistency (LRC) and Automatic Update Release Consistency (AURC)---with each other and with an all-hardware CC-NUMA approach. We examine how performance is affected by problem size, machine size, key system parameters, and the use of less optimized program implementations. We find that SVM can indeed perform quite well for systems of at least up to 32 processors for several nontrivial applications. However, performance is much more variable across applications than on CC-NUMA systems, and the problem sizes needed to obtain good parallel performance are substantially larger. The hardware-assisted AURC system tends to perform significantly better than the all-software LRC under our system assumptions, particularly when realistic cache hierarchies are used.


International Symposium on Computer Architecture | 1996

Early Experience with Message-Passing on the SHRIMP Multicomputer

Edward W. Felten; Richard D. Alpert; Angelos Bilas; Matthias A. Blumrich; Douglas W. Clark; Stefanos N. Damianakis; Cezary Dubnicki; Liviu Iftode; Kai Li

The SHRIMP multicomputer provides virtual memory-mapped communication (VMMC), which supports protected, user-level message passing, allows user programs to perform their own buffer management, and separates data transfers from control transfers so that a data transfer can be done without the intervention of the receiving node's CPU. An important question is whether such a mechanism can indeed deliver all of the available hardware performance to applications which use conventional message-passing libraries. This paper reports our early experience with message-passing on a small, working SHRIMP multicomputer. We have implemented several user-level communication libraries on top of the VMMC mechanism, including the NX message-passing interface, Sun RPC, stream sockets, and specialized RPC. The first three are fully compatible with existing systems. Our experience shows that the VMMC mechanism supports these message-passing interfaces well. When zero-copy protocols are allowed by the semantics of the interface, VMMC can effectively deliver to applications almost all of the raw hardware's communication performance.


Archive | 1999

Supporting a Coherent Shared Address Space Across SMP Nodes: An Application-Driven Investigation

Angelos Bilas; Liviu Iftode; Rudrajit Samanta; Jaswinder Pal Singh

As the workstation market moves from single processors to small-scale shared memory multiprocessors, it is very attractive to construct larger-scale multiprocessors by connecting symmetric multiprocessors (SMPs) with efficient commodity network interfaces such as Myrinet. With hardware-supported cache-coherent shared memory within the SMPs, the question is what programming model to support across SMPs. A coherent shared address space has been found to be attractive for a wide range of applications, and shared virtual memory (SVM) protocols have been developed to provide this model in software at page granularity across uniprocessor nodes. It is therefore attractive to extend SVM protocols to efficiently incorporate SMP nodes, instead of using a hybrid programming model with a shared address space within SMP nodes and explicit message passing across them. The protocols should be optimized to exploit the efficient hardware sharing within an SMP as much as possible, and invoke the less efficient software protocol across nodes as infrequently as possible.
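A conceptual sketch of that two-level structure follows; the classes and the remote_fetch hook are hypothetical, not the protocols evaluated in the report. Accesses to a page already resident in the SMP node are served by hardware sharing, and only the first cross-node miss invokes the software protocol.

```python
# Two-level sharing sketch: hardware coherence inside an SMP node,
# software SVM protocol only across nodes.

class Node:
    def __init__(self, node_id, pages_present):
        self.node_id = node_id
        self.pages = dict(pages_present)   # pages already valid in this SMP node

class ClusterSVM:
    def __init__(self, nodes, remote_fetch):
        self.nodes = nodes
        self.remote_fetch = remote_fetch
        self.software_faults = 0

    def read(self, node_id, page_id):
        node = self.nodes[node_id]
        if page_id in node.pages:
            return node.pages[page_id]     # served by hardware sharing within the node
        self.software_faults += 1          # cross-node miss: software protocol invoked
        data = self.remote_fetch(page_id)
        node.pages[page_id] = data         # now shared in hardware by the whole node
        return data

cluster = ClusterSVM({0: Node(0, {7: b"page7"}), 1: Node(1, {})},
                     remote_fetch=lambda pid: f"page{pid}".encode())
cluster.read(1, 7)      # first access on node 1 goes through software
cluster.read(1, 7)      # later accesses from any CPU on node 1 hit in hardware
print(cluster.software_faults)
```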

Collaboration


Dive into Liviu Iftode's collaborations.

Top Co-Authors


Kai Li

Princeton University


Yuanyuan Zhou

University of California


Brian R. Toonen

Argonne National Laboratory


David A. Wood

University of Wisconsin-Madison
