Bruce Delagi
Stanford University
Publication
Featured research published by Bruce Delagi.
IEEE Computer | 1990
Manu Thapar; Bruce Delagi
The Stanford distributed-directory (SDD) cache-coherence protocol, based on a singly linked list of distributed directories, is examined. Sharing-list additions and removals are explained diagrammatically. Reads, writes, pending signals, replacement, and synchronization are discussed. Replacing lines linked in a list is done by invalidating the lower part of the list. A doubly linked list may be used to patch the list in case of replacements. However, in practice, performance improvement depends on the list lengths and access patterns. A distributed-directory cache-coherence protocol allows efficient implementation of locks at minimal extra cost. The SDD protocol allows a lock implementation that minimizes network traffic.
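The replacement rule in this abstract (a replaced line invalidates the lower part of its sharing list) can be illustrated with a toy model. The following is a minimal sketch, not the authors' implementation: the class and method names are hypothetical, new sharers are assumed to be added at the head of the list, and the predecessor lookup in `replace` is a simulation shortcut that a real singly linked protocol would not have available.

```python
class SharingList:
    """Toy model of the sharing list for a single memory line."""

    def __init__(self):
        self.head = None   # directory entry at memory: first cache in the list
        self.next = {}     # per-cache forward pointer: cache id -> next cache id

    def sharers(self):
        """Return the caches on the list, from head to tail."""
        out, node = [], self.head
        while node is not None:
            out.append(node)
            node = self.next[node]
        return out

    def read(self, cache):
        """Add a sharer at the head (assumed insertion point)."""
        if cache in self.next:
            return  # already on the list
        self.next[cache] = self.head
        self.head = cache

    def replace(self, cache):
        """Evict `cache`: invalidate it and every cache below it.

        Returns the invalidated caches in list order.
        """
        if cache not in self.next:
            return []
        # Simulation shortcut: walk from the head to find the predecessor.
        prev, node = None, self.head
        while node != cache:
            prev, node = node, self.next[node]
        # Invalidate the replaced cache and the lower part of the list.
        dropped = []
        while node is not None:
            dropped.append(node)
            node = self.next.pop(node)
        # Terminate the surviving upper part of the list.
        if prev is None:
            self.head = None
        else:
            self.next[prev] = None
        return dropped


line = SharingList()
for cache_id in ["A", "B", "C", "D"]:
    line.read(cache_id)
print(line.sharers())     # head-to-tail order after four reads
print(line.replace("B"))  # caches invalidated by replacing B
print(line.sharers())     # surviving upper part of the list
```

In this model, longer lists mean more sharers lose their copies on a replacement near the head, which is consistent with the abstract's remark that the benefit of a doubly linked list depends on list lengths and access patterns.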
international parallel processing symposium | 1993
Manu Thapar; Bruce Delagi; Michael J. Flynn
This paper presents a singly-linked distributed directory (SDD) cache coherence protocol and compares the performance of the SDD protocol with the fully mapped centralized directory protocol and the IEEE SCI Standard protocol. To maintain coherence, the SDD protocol uses a linked list of cache lines that contain shared copies of the data. The protocol has scalable cost. Coherency-related messages are not required to be delivered in order, which allows adaptive routing and makes performance more robust in the presence of congested networks. The authors' analysis shows that the SDD protocol has generally better performance in the presence of memory and interconnect contention. They discuss the various factors, such as memory reference behavior and interconnect traffic, that affect the performance of these protocols.
hawaii international conference on system sciences | 1994
David Glasco; Bruce Delagi; Michael J. Flynn
Presents two hardware-controlled update-based cache coherence protocols. The authors discuss the two major disadvantages of the update protocols: inefficiency of updates and the mismatch between the granularity of synchronization and the data transfer. They present two enhancements to the update-based protocols, a write combining scheme and a finer grain synchronization, to overcome these disadvantages. The results demonstrate the effectiveness of these enhancements that, when used together, allow the update-based protocols to significantly improve the execution time of a set of scientific applications when compared to three invalidate-based protocols.
joint international conference on vector and parallel processing parallel processing | 1990
Manu Thapar; Bruce Delagi
This paper describes a new hardware solution for the cache coherence problem in large scale shared memory multiprocessors. The protocol is based on a linked list of caches that forms a distributed directory and, to ensure a scalable design, does not require a global broadcast mechanism. Fully-mapped directory-based solutions proposed earlier also do not require a global broadcast mechanism. However, our solution has a lower cost and potentially better performance than the fully-mapped directory-based protocol. We provide simulation results to show that the performance of the distributed directory protocol is more robust when there is contention for the data and for variations in memory technology. Further, we do not assume that the network preserves the order of messages. Thus we do not preclude adaptive routing. Our solution also allows an efficient implementation of locks.
international parallel and distributed processing symposium | 1994
David Glasco; Bruce Delagi; Michael J. Flynn
In our previous work, we demonstrated the possible performance gains from update-based cache coherence protocols for a set of fine-grain scientific applications running on a scalable shared-memory multiprocessor. In this paper, we examine in detail the hardware-based write grouping scheme presented in our earlier work. First we describe both software-based and hardware-based write grouping schemes. The software-based scheme, with its perfect knowledge of the application's write pattern, is able to achieve optimal write grouping efficiency, but not without added complexity to the application's code. Nevertheless, we use the software-based scheme to determine the optimal grouping efficiency for each application studied and then demonstrate that the hardware-based write grouping scheme is almost as efficient as the software-based scheme while requiring few, if any, software modifications.
Archive | 1992
Manu Thapar; Bruce Delagi
This paper analyzes a new hardware solution for the cache coherence problem in large scale shared memory multiprocessors. The protocol is based on a linked list of caches that forms a distributed directory and does not require a global broadcast mechanism. Fully-mapped directory-based solutions proposed earlier also do not require a global broadcast mechanism. However, our solution is more scalable and provides potentially better performance than the fully-mapped directory-based protocol. We provide simulation results to show that the performance of the distributed directory protocol is more robust when there is contention for the data and for variations in memory technology. Further, we do not assume that the network preserves the order of messages. Thus we do not preclude adaptive routing.
Proceedings of the First International ACPC Conference on Parallel Computation | 1991
Manu Thapar; Bruce Delagi; Michael J. Flynn
This paper presents a performance analysis of a new directory-based cache coherence protocol. We compare the fully mapped centralized directory protocol with a distributed directory protocol that we developed. The distributed directory protocol is based on a linked list of caches and is more scalable in terms of cost and performance. It does not require the network to preserve the order of messages and allows adaptive routing so that network performance may be more robust. Simulation results show that the distributed directory protocol has better performance than the centralized directory protocol for the benchmarks we have analyzed.
international conference on parallel processing | 1991
Gregory T. Byrd; Bruce Delagi
ACM Sigarch Computer Architecture News | 1991
Manu Thapar; Bruce Delagi
Archive | 1998
Gregory T. Byrd; Bruce Delagi; Michael J. Flynn