Ronald C. Unrau | Researchain

Publication

Featured research published by Ronald C. Unrau.

The Journal of Supercomputing | 1995

Hierarchical clustering: a structure for scalable multiprocessor operating system design

Ronald C. Unrau; Orran Krieger; Benjamin Gamsa; Michael Stumm

We introduce the concept of hierarchical clustering as a way to structure shared-memory multiprocessor operating systems for scalability. The concept is based on clustering and hierarchical system design. Hierarchical clustering leads to a modular system, composed of easy-to-design and efficient building blocks. The resulting structure is scalable because it 1) maximizes locality, which is key to good performance in NUMA (non-uniform memory access) systems and 2) provides for concurrency that increases linearly with the number of processors. At the same time, there is tight coupling within a cluster, so the system performs well for local interactions that are expected to constitute the common case. A clustered system can easily be adapted to different hardware configurations and architectures by changing the size of the clusters. We show how this structuring technique is applied to the design of a microkernel-based operating system called Hurricane. This prototype system is the first complete and running implementation of its kind and demonstrates the feasibility of a hierarchically clustered system. We present performance results based on the prototype, demonstrating the characteristics and behavior of a clustered system. In particular, we show how clustering trades off the efficiencies of tight coupling for the advantages of replication, increased locality, and decreased lock contention.
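The trade-off named at the end of the abstract (replication and reduced lock contention versus tight coupling) can be illustrated with a toy replicated counter. This is only a sketch under stated assumptions, not Hurricane code: the class `ClusteredCounter`, its methods, and the rule that processor `p` belongs to cluster `p // cluster_size` are all hypothetical names and choices made for illustration.

```python
import threading


class ClusteredCounter:
    """Toy counter replicated per cluster (hypothetical, not from Hurricane)."""

    def __init__(self, num_processors, cluster_size):
        self.cluster_size = cluster_size
        # Ceiling division: the last cluster may be only partly filled.
        num_clusters = -(-num_processors // cluster_size)
        self._counts = [0] * num_clusters
        self._locks = [threading.Lock() for _ in range(num_clusters)]

    def cluster_of(self, processor_id):
        # Assumed mapping: consecutive processor ids share a cluster.
        return processor_id // self.cluster_size

    def increment(self, processor_id):
        # Common case: touch only the local replica and its local lock.
        c = self.cluster_of(processor_id)
        with self._locks[c]:
            self._counts[c] += 1

    def total(self):
        # Rare case: a global view has to visit every cluster.
        result = 0
        for c, lock in enumerate(self._locks):
            with lock:
                result += self._counts[c]
        return result
```

With `cluster_size` equal to the number of processors the sketch degenerates to one tightly coupled counter behind a single lock; with `cluster_size = 1` every processor has its own replica. Changing that one parameter is the kind of adaptation the abstract describes.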

International Conference on Parallel Processing | 1993

A Fair Fast Scalable Reader-Writer Lock

Orran Krieger; Michael Stumm; Ronald C. Unrau; Jonathan Hanna

A reader-writer (RW) lock allows either multiple readers to inspect shared data or a single writer exclusive access for modifying that data. On shared memory multiprocessors, the cost of acquiring and releasing these locks can have a large impact on the performance of parallel applications. A major problem with naive implementations of these locks, where processors spin on a global lock variable waiting for the lock to become available, is that the memory containing the lock and the interconnection network to that memory will also become contended when the lock is contended.
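For concreteness, the naive scheme the abstract criticizes might look roughly like the sketch below. It is not the algorithm from the paper and makes no attempt at fairness. Python has no hardware compare-and-swap, so a `threading.Lock` stands in for the atomic instruction; the class `NaiveSpinRWLock`, its methods, and the helper `demo` are hypothetical names.

```python
import threading
import time


class NaiveSpinRWLock:
    """Naive RW lock: every waiter polls one shared state word."""

    def __init__(self):
        # 0 = free, n > 0 = n active readers, -1 = one active writer.
        self._state = 0
        # Assumption: this lock emulates an atomic read-modify-write.
        self._atomic = threading.Lock()

    def _try_update(self, allowed, update):
        with self._atomic:
            if allowed(self._state):
                self._state = update(self._state)
                return True
            return False

    def acquire_read(self):
        # Every failed attempt touches the same global variable again.
        while not self._try_update(lambda s: s >= 0, lambda s: s + 1):
            time.sleep(0)  # yield, then poll again

    def release_read(self):
        self._try_update(lambda s: s > 0, lambda s: s - 1)

    def acquire_write(self):
        while not self._try_update(lambda s: s == 0, lambda s: -1):
            time.sleep(0)

    def release_write(self):
        self._try_update(lambda s: s == -1, lambda s: 0)


def demo(num_writers=4, increments=200):
    """Run several writers against one counter and return its final value."""
    lock = NaiveSpinRWLock()
    shared = {"value": 0}

    def writer():
        for _ in range(increments):
            lock.acquire_write()
            shared["value"] += 1
            lock.release_write()

    threads = [threading.Thread(target=writer) for _ in range(num_writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["value"]
```

Because all waiters retry against `_state`, contention for the lock turns directly into contention for that one memory location, which is the problem the abstract points out.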

IEEE Computer | 1994

The Alloc Stream Facility: a redesign of application-level stream I/O

Orran Krieger; Michael Stumm; Ronald C. Unrau

The authors introduce an application-level I/O facility, the Alloc Stream Facility, that addresses three primary goals. First, ASF addresses recent computing substrate changes to improve performance, allowing applications to benefit from specific features such as mapped files. Second, it is designed for parallel systems, maximizing concurrency and reporting errors properly. Finally, its modular and object-oriented structure allows it to support a variety of popular I/O interfaces (including stdio and C++ stream I/O) and to be tuned to system behavior, exploiting a system's strengths while avoiding its weaknesses. On a number of standard Unix systems, I/O-intensive applications perform substantially better when linked to the Alloc facility. Also, modifying applications to use a new interface provided by the facility can improve performance by another factor of two. These performance improvements are achieved primarily by reducing data copying and the number of system calls. Not visible in these improvements is the extra degree of concurrency the facility brings to multithreaded and parallel applications.

International Conference on Parallel Processing | 1998

Efficient sleep/wake-up protocols for user-level IPC

Ronald C. Unrau; Orran Krieger

We present a new facility for cross-address space IPC that exploits queues in memory shared between the client and server address space. The facility employs only widely available operating system mechanisms, and is hence easily portable to different commercial operating systems. It incorporates blocking semantics to avoid wasting processor cycles, and still achieves almost twice the throughput of the native kernel-mediated IPC facilities on SGI and IBM uniprocessors. In addition, we demonstrate significantly higher performance gains on an SGI multiprocessor. We argue that co-operating tasks will be better served if the operating system is aware of the co-operation, and propose an interface for a hand-off-scheduling mechanism. Finally, we report initial performance results from a Linux implementation of our proposal.
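A generic sleep/wake-up handshake over a shared queue can be sketched as follows. This is an illustration of the general idea only, not the protocol from the paper: threads stand in for the client and server address spaces, a `deque` stands in for the queue in shared memory, and a semaphore stands in for whatever blocking primitive the operating system provides. The names `SleepWakeQueue` and `demo` are hypothetical, and the sketch assumes a single consumer per queue.

```python
import threading
from collections import deque


class SleepWakeQueue:
    """Single-consumer queue that blocks instead of spinning when empty."""

    def __init__(self):
        self._items = deque()                  # stand-in for shared memory
        self._lock = threading.Lock()          # guards queue and flag together
        self._consumer_asleep = False
        self._wakeup = threading.Semaphore(0)  # stand-in for a kernel wait
        self.wakeups_sent = 0

    def put(self, item):
        with self._lock:
            self._items.append(item)
            must_wake = self._consumer_asleep
            self._consumer_asleep = False
            if must_wake:
                self.wakeups_sent += 1
        if must_wake:
            # The expensive wake-up happens only if the consumer went to sleep.
            self._wakeup.release()

    def get(self):
        while True:
            with self._lock:
                if self._items:
                    return self._items.popleft()
                # Announce the sleep while holding the lock, so a concurrent
                # put() cannot miss it (no lost wake-up).
                self._consumer_asleep = True
            self._wakeup.acquire()


def demo(num_requests):
    """Echo server: the client sends i, the server replies with 2 * i."""
    requests, replies = SleepWakeQueue(), SleepWakeQueue()

    def server():
        while True:
            msg = requests.get()
            if msg is None:  # shutdown marker chosen for this sketch
                return
            replies.put(2 * msg)

    t = threading.Thread(target=server)
    t.start()
    results = []
    for i in range(num_requests):
        requests.put(i)
        results.append(replies.get())
    requests.put(None)
    t.join()
    return results
```

The point of the flag is that a producer pays for a wake-up only when the consumer has actually blocked; when the consumer is already running, `put` is just a queue append.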

European Conference on Parallel Processing | 1995

Implementing Flexible Computation Rules with Subexpression-level Loop Transformation

Dattatraya Kulkarni; Michael Stumm; Ronald C. Unrau

Computation Decomposition and Alignment (CDA) is a new loop transformation framework that extends the linear loop transformation framework and the more recently proposed Computation Alignment frameworks by linearly transforming computations at the granularity of subexpressions. It can be applied to achieve a number of optimization objectives, including the removal of data alignment constraints, the elimination of ownership tests, the reduction of cache conflicts, and improvements in data access locality.
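To give a feel for what transforming at subexpression granularity can mean, the sketch below decomposes one statement into two pieces and moves one piece to a different iteration. It is an illustrative example in the spirit of the abstract, not an example taken from the paper; the function names, the temporary array `t`, and the particular loop are assumptions.

```python
def original(b, c, d):
    """a[i] = b[i] + c[i-1] * d[i-1]: iteration i reads indices i and i-1."""
    n = len(b)
    a = [0] * n
    for i in range(1, n):
        a[i] = b[i] + c[i - 1] * d[i - 1]
    return a


def transformed(b, c, d):
    """Same result, with the product computed one iteration earlier."""
    n = len(b)
    a = [0] * n
    t = [0] * (n + 1)  # temporary holding the decomposed subexpression
    for i in range(0, n):
        # Decomposed subexpression, shifted so it reads c[i] and d[i].
        t[i + 1] = c[i] * d[i]
        if i >= 1:
            # Remainder of the statement; t[i] was produced at iteration i-1.
            a[i] = b[i] + t[i]
    return a
```

After the shift, iteration `i` reads only `b[i]`, `c[i]`, and `d[i]` from the input arrays, so the mismatch between index `i` and index `i-1` is absorbed by the temporary. A whole-statement transformation could not separate the two references, since both belong to the same statement.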

European Conference on Parallel Processing | 1995

On the Scalability of Demand-Driven Parallel Systems

Ronald C. Unrau; Michael Stumm; Orran Krieger

Demand-driven systems follow the model where customers enter the system, request some service, and then depart. Examples are databases, transaction processing systems and operating systems, which form the system software layer between the applications and the hardware. Achieving scalability at the system software layer is critical for the scalability of the system as a whole, and yet this layer has largely been ignored.

Operating Systems Design and Implementation | 1994