
Publication


Featured research published by Cezary Dubnicki.


International Symposium on Computer Architecture | 1994

Virtual memory mapped network interface for the SHRIMP multicomputer

Matthias A. Blumrich; Kai Li; Richard D. Alpert; Cezary Dubnicki; Edward W. Felten; Jonathan S. Sandberg

The network interfaces of existing multicomputers require a significant amount of software overhead to provide protection and to implement message passing protocols. This paper describes the design of a low-latency, high-bandwidth, virtual memory-mapped network interface for the SHRIMP multicomputer project at Princeton University. Without sacrificing protection, the network interface achieves low latency by using virtual memory mapping and write-latency hiding techniques, and obtains high bandwidth by providing a user-level block data transfer mechanism. We have implemented several message passing primitives in an experimental environment, demonstrating that our approach can reduce the message passing overhead to a few user-level instructions.
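
The core idea is that once pages are mapped, a data transfer is just ordinary stores into memory rather than a system call per message. As a rough, software-only sketch of that idea (not the SHRIMP hardware path), the snippet below uses a POSIX shared-memory segment so the "send" is a plain memory write and the receiver reads the mapped region directly; the function and variable names are illustrative.

```python
# Software-only sketch of memory-mapped communication: once a region is
# mapped into both processes, a "send" is just an ordinary memory write
# and the receiver reads the mapped pages directly, with no per-message
# system call. The real SHRIMP interface performs the mapping between
# nodes in hardware; names below are illustrative.
from multiprocessing import Process
from multiprocessing.shared_memory import SharedMemory

def receiver(name: str, size: int) -> None:
    shm = SharedMemory(name=name)              # attach to the mapped region
    print("receiver saw:", bytes(shm.buf[:size]).decode())
    shm.close()

if __name__ == "__main__":
    msg = b"hello over a mapped buffer"
    shm = SharedMemory(create=True, size=len(msg))
    shm.buf[:len(msg)] = msg                   # the "send": plain stores into memory
    p = Process(target=receiver, args=(shm.name, len(msg)))
    p.start()
    p.join()
    shm.close()
    shm.unlink()
```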


Measurement and Modeling of Computer Systems | 2006

Exploiting redundancy to conserve energy in storage systems

Eduardo Pinheiro; Ricardo Bianchini; Cezary Dubnicki

This paper makes two main contributions. First, it introduces Diverted Accesses, a technique that leverages the redundancy in storage systems to conserve disk energy. Second, it evaluates the previous (redundancy-oblivious) energy conservation techniques, along with Diverted Accesses, as a function of the amount and type of redundancy in the system. The evaluation is based on novel analytic models of the energy consumed by the techniques. Using these energy models and previous models of reliability, availability, and performance, we can determine the best redundancy configuration for new energy-aware storage systems. To study Diverted Accesses for realistic systems and workloads, we simulate a wide-area storage system under two file-access traces. Our modeling results show that Diverted Accesses is more effective and robust than the redundancy-oblivious techniques. Our simulation results show that our technique can conserve 20-61% of the disk energy consumed by the wide-area storage system.
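
To make the intuition concrete, the back-of-the-envelope sketch below estimates the savings from serving reads out of one replica group while the remaining disks sit in a low-power state. The power figures and the energy_joules helper are illustrative assumptions, not the paper's analytic model or measured parameters.

```python
# Back-of-the-envelope sketch of the Diverted Accesses idea: with R-way
# replication, reads can be served by 1/R of the disks, so the remaining
# replicas can stay in a low-power state most of the time. The power
# figures below are assumptions for illustration only.
ACTIVE_W, STANDBY_W = 13.0, 2.5      # assumed per-disk power draw (watts)

def energy_joules(n_disks: int, replication: int, seconds: float,
                  diverted: bool) -> float:
    """Energy for n_disks over `seconds`, with or without diverting reads."""
    if not diverted or replication < 2:
        return n_disks * ACTIVE_W * seconds
    active = n_disks // replication          # one replica group stays spinning
    idle = n_disks - active                  # the rest can stand by
    return (active * ACTIVE_W + idle * STANDBY_W) * seconds

hour = 3600.0
base = energy_joules(16, 2, hour, diverted=False)
div = energy_joules(16, 2, hour, diverted=True)
print(f"savings: {100 * (1 - div / base):.0f}%")   # ~40% under these assumptions
```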


ACM Transactions on Storage | 2006

Improving duplicate elimination in storage systems

Deepak R. Bobbarjung; Suresh Jagannathan; Cezary Dubnicki

Minimizing the amount of data that must be stored and managed is a key goal for any storage architecture that purports to be scalable. One way to achieve this goal is to avoid maintaining duplicate copies of the same data. Eliminating redundant data at the source by not writing data which has already been stored not only reduces storage overheads, but can also improve bandwidth utilization. For these reasons, in the face of today's exponentially growing data volumes, redundant data elimination techniques have assumed critical significance in the design of modern storage systems. Intelligent object partitioning techniques identify data that is new when objects are updated, and transfer only these chunks to a storage server. In this article, we propose a new object partitioning technique, called fingerdiff, that improves upon existing schemes in several important respects. Most notably, fingerdiff dynamically chooses a partitioning strategy for a data object based on its similarities with previously stored objects in order to improve storage and bandwidth utilization. We present a detailed evaluation of fingerdiff, and other existing object partitioning schemes, using a set of real-world workloads. We show that for these workloads, the duplicate elimination strategies employed by fingerdiff improve storage utilization on average by 25%, and bandwidth utilization on average by 40% over comparable techniques.
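
The baseline that fingerdiff improves on is hash-based duplicate elimination over content-defined chunks. The sketch below shows only that baseline, with a simplified rolling-sum boundary test instead of Rabin fingerprinting and without fingerdiff's adaptive choice of chunk granularity; the chunks helper and DedupStore class are illustrative names, not the paper's implementation.

```python
# Baseline duplicate elimination with content-defined chunking: each
# object is split at content-dependent boundaries and each unique chunk
# is stored once, keyed by its SHA-256 digest.
import hashlib

def chunks(data: bytes, window: int = 16, mask: int = 0x1FFF):
    """Yield variable-size chunks; declare a boundary when the rolling
    value over at least `window` bytes matches a fixed bit pattern."""
    start, rolling = 0, 0
    for i, b in enumerate(data):
        rolling = (rolling * 31 + b) & 0xFFFFFFFF
        if i - start >= window and (rolling & mask) == 0:
            yield data[start:i + 1]
            start, rolling = i + 1, 0
    if start < len(data):
        yield data[start:]

class DedupStore:
    def __init__(self):
        self.blocks = {}                       # digest -> chunk bytes

    def put(self, data: bytes) -> list:
        """Store an object; return its recipe (the list of chunk digests)."""
        recipe = []
        for chunk in chunks(data):
            digest = hashlib.sha256(chunk).hexdigest()
            self.blocks.setdefault(digest, chunk)   # duplicates stored once
            recipe.append(digest)
        return recipe

    def get(self, recipe: list) -> bytes:
        return b"".join(self.blocks[d] for d in recipe)

store = DedupStore()
v1 = store.put(b"the quick brown fox jumps over the lazy dog " * 500)
v2 = store.put(b"the quick brown fox jumps over the lazy dog " * 500 + b"new tail")
print(len(v1) + len(v2), "chunk references,", len(store.blocks), "unique chunks stored")
```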


International Symposium on Microarchitecture | 1995

Virtual-memory-mapped network interfaces

Matthias A. Blumrich; Cezary Dubnicki; Edward W. Felten; Kai Li

In today's multicomputers, software overhead dominates the message-passing latency cost. We designed two multicomputer network interfaces that significantly reduce this overhead. Both support virtual-memory-mapped communication, allowing user processes to communicate without expensive buffer management and without making system calls across the protection boundary separating user processes from the operating system kernel. Here we compare the two interfaces and discuss the performance trade-offs between them.


High-Performance Computer Architecture | 1996

Protected, user-level DMA for the SHRIMP network interface

Matthias A. Blumrich; Cezary Dubnicki; Edward W. Felten; Kai Li

Traditional DMA requires the operating system to perform many tasks to initiate a transfer, with overhead on the order of hundreds or thousands of CPU instructions. This paper describes a mechanism, called User-level Direct Memory Access (UDMA), for initiating DMA transfers of input/output data, with full protection, at a cost of only two user-level memory references. The UDMA mechanism uses existing virtual memory translation hardware to perform permission checking and address translation without kernel involvement. The implementation of the UDMA mechanism is simple, requiring a small extension to the traditional DMA controller and minimal operating system kernel support. The mechanism can be used with a wide variety of I/O devices including network interfaces, data storage devices such as disks and tape drives, and memory-mapped devices such as graphics frame-buffers. As an illustration, we describe how we used UDMA in building network interface hardware for the SHRIMP multicomputer.
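
Conceptually, the user process initiates a transfer with two stores to a mapped control area, and protection falls out of the fact that only pages the kernel has already mapped for that process can be named. The Python sketch below mimics that flow in software; the UserLevelDMA class and its page-table dictionary are illustrative stand-ins, not the SHRIMP hardware interface.

```python
# Conceptual sketch of the UDMA idea: two ordinary "memory references"
# arm a transfer, and permission checking reduces to asking whether the
# named pages are mapped for this process. Class and field names are
# invented for illustration.
class UserLevelDMA:
    def __init__(self, page_table: dict):
        # Pages the kernel has already mapped (and thus authorized) for
        # this process; keys stand in for virtual page names.
        self.page_table = page_table
        self.src_reg = None

    def write_src(self, page: str) -> None:
        # "First memory reference": name the source page.
        if page not in self.page_table:
            raise PermissionError(f"{page} is not mapped for this process")
        self.src_reg = page

    def write_dst(self, page: str, length: int) -> None:
        # "Second memory reference": name the destination and arm the
        # transfer; the DMA engine copies without any kernel call.
        if page not in self.page_table:
            raise PermissionError(f"{page} is not mapped for this process")
        src, dst = self.page_table[self.src_reg], self.page_table[page]
        dst[:length] = src[:length]

pages = {"src": bytearray(b"payload bytes"), "dst": bytearray(16)}
dma = UserLevelDMA(pages)
dma.write_src("src")
dma.write_dst("dst", 13)
print(pages["dst"][:13])
```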


International Conference on Parallel Processing | 1996

Software support for virtual memory-mapped communication

Cezary Dubnicki; Liviu Iftode; Edward W. Felten; Kai Li

Virtual memory-mapped communication (VMMC) is a communication model providing direct data transfer between the sender's and receiver's virtual address spaces. This model eliminates operating system involvement in communication, provides full protection, supports user-level buffer management and zero-copy protocols, and minimizes software communication overhead. This paper describes system software support for the model, including its API, operating system support, and software architecture, for two network interfaces designed in the SHRIMP project. Our implementations and experiments show that the VMMC model can indeed expose the available hardware performance to user programs. On two Pentium PCs with our prototype network interface hardware over a network, we have achieved user-to-user latency of 4.8 μs and sustained bandwidth of 23 MB/s, which is close to the peak hardware bandwidth. Software communication overhead is only a few user-level instructions.
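
The API follows an export/import pattern: a receiver exports a buffer once, a sender imports it, and subsequent transfers deposit data directly into the exported memory. The sketch below imitates that pattern in plain Python; the VMMCFabric and ImportedBuffer names and method signatures are invented for illustration and do not reproduce the actual VMMC API.

```python
# Illustrative export/import/send pattern in the style of VMMC: the
# sender writes straight into memory the receiver exported, with no
# receiver-side CPU involvement in the common case.
class VMMCFabric:
    """Stands in for the network plus node memory; maps export IDs to buffers."""
    def __init__(self):
        self.exports = {}                      # export_id -> destination bytearray
        self.next_id = 0

    def export_buffer(self, buf: bytearray) -> int:
        self.exports[self.next_id] = buf       # receiver grants write access
        self.next_id += 1
        return self.next_id - 1

    def import_buffer(self, export_id: int) -> "ImportedBuffer":
        return ImportedBuffer(self, export_id)

class ImportedBuffer:
    def __init__(self, fabric: VMMCFabric, export_id: int):
        self.fabric, self.export_id = fabric, export_id

    def send(self, data: bytes, offset: int = 0) -> None:
        dst = self.fabric.exports[self.export_id]
        dst[offset:offset + len(data)] = data  # lands directly in exported memory

fabric = VMMCFabric()
recv_buf = bytearray(32)                       # receiver side
handle = fabric.export_buffer(recv_buf)
remote = fabric.import_buffer(handle)          # sender side
remote.send(b"zero-copy transfer")
print(recv_buf[:18])
```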


International Parallel Processing Symposium | 1997

Design and implementation of virtual memory-mapped communication on Myrinet

Cezary Dubnicki; Angelos Bilas; Kai Li; James Philbin

This paper describes the design and implementation of the Virtual Memory-Mapped Communication (VMMC) model on a Myrinet network of PCI-based PCs. VMMC has been designed and implemented for the SHRIMP multicomputer, where it delivers user-to-user latency and bandwidth close to the limits imposed by the underlying hardware. The goals of this work are: to provide an implementation of VMMC on a commercially available hardware platform; to determine whether the benefits of VMMC can be realized on the new hardware; and to investigate network interface design tradeoffs by comparing SHRIMP with Myrinet and its respective VMMC implementation. Our Myrinet implementation of VMMC achieves 9.8 μs one-way latency and provides 108.4 MByte/s user-to-user bandwidth. Compared to SHRIMP, the Myrinet implementation of VMMC incurs relatively higher overhead and demands more network interface resources (LANai processor, on-board SRAM) but requires less operating system support.


International Symposium on Computer Architecture | 2002

Experiences with VI communication for database storage

Yuanyuan Zhou; Angelos Bilas; Suresh Jagannathan; Cezary Dubnicki; James Philbin; Kai Li

This paper examines how VI-based interconnects can be used to improve I/O path performance between a database server and the storage subsystem. We design and implement a software layer, DSA, that is layered between the application and VI. DSA takes advantage of specific VI features and deals with many of its shortcomings. We provide and evaluate one kernel-level and two user-level implementations of DSA. These implementations trade transparency and generality for performance to different degrees, and unlike research prototypes are designed to be suitable for real-world deployment. We present detailed measurements using a commercial database management system with both micro-benchmarks and industrial database workloads on a mid-size (4-CPU) and a large (32-CPU) database server. Our results show that VI-based interconnects and user-level communication can improve all aspects of the I/O path between the database system and the storage back-end. We also find that to make effective use of VI in I/O-intensive environments, we need to provide substantial functionality beyond what VI currently offers. Finally, new storage APIs that help minimize kernel involvement in the I/O path are needed to fully exploit the benefits of VI-based communication.


International Symposium on Computer Architecture | 1996

Early Experience with Message-Passing on the SHRIMP Multicomputer

Edward W. Felten; Richard D. Alpert; Angelos Bilas; Matthias A. Blumrich; Douglas W. Clark; Stefanos N. Damianakis; Cezary Dubnicki; Liviu Iftode; Kai Li

The SHRIMP multicomputer provides virtual memory-mapped communication (VMMC), which supports protected, user-level message passing, allows user programs to perform their own buffer management, and separates data transfers from control transfers so that a data transfer can be done without the intervention of the receiving node's CPU. An important question is whether such a mechanism can indeed deliver all of the available hardware performance to applications which use conventional message-passing libraries. This paper reports our early experience with message-passing on a small, working SHRIMP multicomputer. We have implemented several user-level communication libraries on top of the VMMC mechanism, including the NX message-passing interface, Sun RPC, stream sockets, and specialized RPC. The first three are fully compatible with existing systems. Our experience shows that the VMMC mechanism supports these message-passing interfaces well. When zero-copy protocols are allowed by the semantics of the interface, VMMC can effectively deliver to applications almost all of the raw hardware's communication performance.


Conference on High Performance Computing (Supercomputing) | 1998

User-Space Communication: A Quantitative Study

Soichiro Araki; Angelos Bilas; Cezary Dubnicki; Jan Edler; Koichi Konishi; James Philbin

Powerful commodity systems and networks offer a promising direction for high-performance computing because they are inexpensive and they closely track technology progress. However, raw hardware performance is rarely delivered to the end user. Previous work has shown that the bottleneck in these architectures is the overhead imposed by the software communication layer. To reduce these overheads, researchers have proposed a number of user-space communication models. The common feature of these models is that applications have direct access to the network, bypassing the operating system in the common case and thus avoiding the cost of send/receive system calls. In this paper we examine five user-space communication layers that represent different points in the configuration space: Generic AM, BIP-0.92, FM-2.02, PM-1.2, and VMMC-2. Although these systems support different communication paradigms and employ a variety of different implementation tradeoffs, we are able to quantitatively compare them on a single testbed consisting of a cluster of high-end PCs connected by a Myrinet network. We find that all five communication systems have very low latency for small messages, in the range of 5 to 17 μs. Not surprisingly, this range is strongly influenced by the functionality offered by each system. We are encouraged, however, to find that features such as protected and reliable communication at user level and multiprogramming can be provided at very low cost. Bandwidth, however, depends primarily on how data is transferred between host memory and the network. Most of the investigated libraries support zero-copy protocols for certain types of data transfers, but differ significantly in the bandwidth delivered to end users. The highest bandwidth, between 95 and 125 MBytes/s for long message transfers, is delivered by libraries that use DMA on both send and receive sides and avoid all data copies. Libraries that perform additional data copies or use programmed I/O to send data to the network achieve lower maximum bandwidth, in the range of 60-70 MBytes/s.
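
A simple way to read such results is through the usual transfer-time model, time(n) ≈ latency + n/bandwidth: latency dominates for small messages, so all five layers look similar there, while the data-transfer method (DMA versus programmed I/O, number of copies) sets the large-message behavior. The parameter values in the sketch below are illustrative points within the ranges reported in the paper, not measurements of any particular library.

```python
# Transfer-time model: time(n) ~ one-way latency + n / bandwidth.
# Parameter values are illustrative points inside the reported ranges
# (5-17 us latency, roughly 60-125 MB/s bandwidth).
def transfer_us(nbytes: int, latency_us: float, bw_mb_s: float) -> float:
    # bw_mb_s MB/s is bw_mb_s bytes per microsecond, so n / bw gives us.
    return latency_us + nbytes / bw_mb_s

for nbytes in (64, 4096, 1 << 20):
    low_lat = transfer_us(nbytes, latency_us=5.0, bw_mb_s=65.0)
    high_bw = transfer_us(nbytes, latency_us=15.0, bw_mb_s=120.0)
    print(f"{nbytes:>8} B: low-latency layer {low_lat:9.1f} us, "
          f"high-bandwidth layer {high_bw:9.1f} us")
```

With these assumed numbers, the low-latency layer wins for small messages while the higher-bandwidth layer wins decisively at a megabyte, mirroring the paper's observation that small-message latency and large-message bandwidth are governed by different parts of the design.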

Collaboration


Dive into Cezary Dubnicki's collaborations.

Top Co-Authors


Kai Li

Princeton University
