Michael F. Spear
Lehigh University
Publication
Featured research published by Michael F. Spear.
principles of distributed computing | 2007
Michael F. Spear; Virendra J. Marathe; Luke Dalessandro; Michael L. Scott
Early implementations of software transactional memory (STM) assumed that sharable data would be accessed only within transactions. Memory may appear inconsistent in programs that violate this assumption, even when program logic would seem to make extra-transactional accesses safe. Designing STM systems that avoid such inconsistency has been dubbed the privatization problem. We argue that privatization comprises a pair of symmetric subproblems: private operations may fail to see updates made by transactions that have committed but not yet completed; conversely, transactions that are doomed but have not yet aborted may see updates made by private code, causing them to perform erroneous, externally visible operations. We explain how these problems arise in different styles of STM, present strategies to address them, and discuss their implementation tradeoffs. We also propose a taxonomy of contracts between the system and the user, analogous to programmer-centric memory consistency models, which allow us to classify programs based on their privatization requirements. Finally, we present empirical comparisons of several privatization strategies. Our results suggest that the best strategy may depend on application characteristics.
acm symposium on parallel algorithms and architectures | 2008
Michael F. Spear; Maged M. Michael; Christoph von Praun
Existing Software Transactional Memory (STM) designs attach metadata to ranges of shared memory; subsequent runtime instructions read and update this metadata in order to ensure that an in-flight transaction's reads and writes remain correct. The overhead of metadata manipulation and inspection is linear in the number of reads and writes performed by a transaction, and involves expensive read-modify-write instructions, resulting in substantial overheads. We consider a novel approach to STM, in which transactions represent their read and write sets as Bloom filters, and transactions commit by enqueuing a Bloom filter onto a global list. Using this approach, our RingSTM system requires at most one read-modify-write operation for any transaction, and incurs validation overhead linear not in transaction size, but in the number of concurrent writers who commit. Furthermore, RingSTM is the first STM that is inherently livelock-free and privatization-safe while at the same time permitting parallel writeback by concurrent disjoint transactions. We evaluate three variants of the RingSTM algorithm, and find that it offers superior performance and/or stronger semantics than the state-of-the-art TL2 algorithm under a number of workloads.
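The filter-based commit described in this abstract can be illustrated with a small single-threaded sketch. It is not the RingSTM implementation: the names `Ring`, `Transaction`, `bloom_bit`, and `FILTER_BITS` are hypothetical, the hash is a deliberately trivial modulo so that collisions are easy to see, and the list append merely stands in for the single read-modify-write operation that a real concurrent system would perform atomically.

```python
FILTER_BITS = 64  # assumed filter width, chosen only for illustration


def bloom_bit(addr: int) -> int:
    """Map an address to one bit of the filter (a single trivial hash)."""
    return 1 << (addr % FILTER_BITS)


class Ring:
    """Global ordered list of the write filters of committed writers."""

    def __init__(self):
        self.entries = []


class Transaction:
    def __init__(self, ring: Ring):
        self.ring = ring
        self.start = len(ring.entries)  # commits already visible at begin
        self.read_filter = 0
        self.write_filter = 0
        self.write_log = {}             # buffered writes: addr -> value

    def read(self, memory: dict, addr: int):
        if addr in self.write_log:      # read own buffered write
            return self.write_log[addr]
        self.read_filter |= bloom_bit(addr)
        return memory.get(addr, 0)

    def write(self, addr: int, value) -> None:
        self.write_filter |= bloom_bit(addr)
        self.write_log[addr] = value

    def validate(self) -> bool:
        """Cost is linear in the number of writers committed since start."""
        for committed_writes in self.ring.entries[self.start:]:
            if committed_writes & self.read_filter:
                return False            # possible conflict (or false positive)
        self.start = len(self.ring.entries)
        return True

    def commit(self, memory: dict) -> bool:
        if not self.validate():
            return False
        if self.write_log:
            # Stands in for the one atomic enqueue onto the global list.
            self.ring.entries.append(self.write_filter)
            memory.update(self.write_log)
        return True
```

Because a filter intersection can report a conflict that did not happen, this sketch may abort a transaction spuriously (addresses 1 and 65 collide here), but it never misses a real overlap between a committed write set and the validating read set.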
international symposium on distributed computing | 2006
Michael F. Spear; Virendra J. Marathe; William N. Scherer; Michael L. Scott
In a software transactional memory (STM) system, conflict detection is the problem of determining when two transactions cannot both safely commit. Validation is the related problem of ensuring that a transaction never views inconsistent data, which might potentially cause a doomed transaction to exhibit irreversible, externally visible side effects. Existing mechanisms for conflict detection vary greatly in their degree of speculation and their relative treatment of read-write and write-write conflicts. Validation, for its part, appears to be a dominant factor—perhaps the dominant factor—in the cost of complex transactions. We present the most comprehensive study to date of conflict detection strategies, characterizing the tradeoffs among them and identifying the ones that perform the best for various types of workload. In the process we introduce a lightweight heuristic mechanism—the global commit counter—that can greatly reduce the cost of validation and of single-threaded execution. The heuristic also allows us to experiment with mixed invalidation, a more opportunistic interleaving of reading and writing transactions. Experimental results on a 16-processor SunFire machine running our RSTM system indicate that the choice of conflict detection strategy can have a dramatic impact on performance, and that the best choice is workload dependent. In workloads whose transactions rarely conflict, the commit counter does little to help (and can even hurt) performance. For less scalable applications, however—those in which STM performance has traditionally been most problematic—it can improve transaction throughput many fold.
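One plausible reading of the global commit counter heuristic is sketched below: every writer bumps a shared counter when it commits, and a transaction skips re-checking its read set whenever the counter has not moved since its last successful validation. This is a single-threaded illustration under that assumption, not the RSTM code; `CommitCounter`, `Txn`, and the per-object version map are hypothetical names.

```python
class CommitCounter:
    """Shared counter incremented once per committing writer."""

    def __init__(self):
        self.value = 0


class Txn:
    def __init__(self, counter: CommitCounter, versions: dict):
        self.counter = counter
        self.versions = versions        # shared map: object id -> version
        self.read_set = {}              # object id -> version observed
        self.last_seen = counter.value  # counter at last successful validation
        self.full_validations = 0       # how often the read set was re-checked

    def validate(self) -> bool:
        if self.counter.value == self.last_seen:
            return True                 # no writer committed: skip the scan
        snapshot = self.counter.value
        self.full_validations += 1
        for obj, seen in self.read_set.items():
            if self.versions.get(obj, 0) != seen:
                return False            # an object we read has changed
        self.last_seen = snapshot
        return True

    def open_for_read(self, obj) -> bool:
        self.read_set[obj] = self.versions.get(obj, 0)
        return self.validate()          # incremental validation on each open

    def commit_write(self, obj) -> bool:
        if not self.validate():
            return False
        self.versions[obj] = self.versions.get(obj, 0) + 1
        self.counter.value += 1         # announce that a writer committed
        return True
```

In this sketch a reader that opens many objects while no writer commits performs no read-set scans at all, which matches the abstract's claim about single-threaded and low-scalability workloads; when writers commit constantly, the counter check is pure overhead on top of the scan.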
acm sigplan symposium on principles and practice of parallel programming | 2009
Michael F. Spear; Luke Dalessandro; Virendra J. Marathe; Michael L. Scott
In Software Transactional Memory (STM), contention management refers to the mechanisms used to ensure forward progress--to avoid livelock and starvation, and to promote throughput and fairness. Unfortunately, most past approaches to contention management were designed for obstruction-free STM frameworks, and impose significant constant-time overheads. Priority-based approaches in particular typically require that reads be visible to all transactions, an expensive property that is not easy to support in most STM systems. In this paper we present a comprehensive strategy for contention management via fair resolution of conflicts in an STM with invisible reads. Our strategy depends on (1) lazy acquisition of ownership, (2) extendable timestamps, and (3) an efficient way to capture both priority and conflicts. We introduce two mechanisms--one using Bloom filters, the other using visible read bits--that implement point (3). These mechanisms unify the notions of conflict resolution, inevitability, and transaction retry. They are orthogonal to the rest of the contention management strategy, and could be used in a wide variety of hardware and software TM systems. Experimental evaluation demonstrates that the overhead of the mechanisms is low, particularly when conflicts are rare, and that our strategy as a whole provides good throughput and fairness, including livelock and starvation freedom, even for challenging workloads.
architectural support for programming languages and operating systems | 2011
Luke Dalessandro; François Carouge; Sean White; Yossi Lev; Mark S. Moir; Michael L. Scott; Michael F. Spear
Transactional memory (TM) is a promising synchronization mechanism for the next generation of multicore processors. Best-effort Hardware Transactional Memory (HTM) designs, such as Sun's prototype Rock processor and AMD's proposed Advanced Synchronization Facility (ASF), can efficiently execute many transactions, but abort in some cases due to various limitations. Hybrid TM systems can use a compatible software TM (STM) in such cases. We introduce a family of hybrid TMs built using the recent NOrec STM algorithm that, unlike existing hybrid approaches, provide both low overhead on hardware transactions and concurrent execution of hardware and software transactions. We evaluate implementations for Rock and ASF, exploring how the differing HTM designs affect optimization choices. Our investigation yields valuable input for designers of future best-effort HTMs.
international conference on parallel processing | 2008
Michael F. Spear; Michael Silverman; Luke Dalessandro; Maged M. Michael; Michael L. Scott
Transactional Memory (TM) takes responsibility for concurrent, atomic execution of labeled regions of code, freeing the programmer from the need to manage locks. Typical implementations rely on speculation and rollback, but this creates problems for irreversible operations like interactive I/O. A widely assumed solution allows a transaction to operate in an inevitable mode that excludes all other transactions and is guaranteed to complete, but this approach does not scale. This paper explores a richer set of alternatives for software TM, and demonstrates that it is possible for an inevitable transaction to run in parallel with (non-conflicting) non-inevitable transactions, without introducing significant overhead in the non-inevitable case. We report experience with these alternatives in a graphical game application. We also consider the use of inevitability to accelerate certain common-case transactions.
european conference on computer systems | 2006
Michael F. Spear; Tom Roeder; Orion Hodson; Galen C. Hunt; Steven P. Levi
Run-time conflicts can affect even the most rigorously tested software systems. A reliance on execution-based testing makes it prohibitively costly to test every possible interaction among potentially thousands of programs with complex configurations. In order to reduce configuration problems, detect developer errors, and reduce developer effort, we have created a new first class operating system abstraction, the application abstraction, which enables both online and offline reasoning about programs and their configuration requirements. We have implemented a subset of the application abstraction for device drivers in the Singularity operating system. Programmers use the application abstraction by placing declarative statements about hardware and communication requirements within their code. Our design enables Singularity to learn the input/output and interprocess communication requirements of drivers without executing driver code. By reasoning about this information within the domain of Singularity's strong software isolation architecture, the installer can execute a subset of the system's resource management algorithm at install time to verify that a new driver will not conflict with existing software. This abstract representation also allows the system to run the full algorithm at driver start time to ensure that there are never resource conflicts between executing drivers, and that drivers never use undeclared resources.
european conference on parallel processing | 2010
Luke Dalessandro; David Dice; Michael L. Scott; Nir Shavit; Michael F. Spear
Mutual exclusion (mutex) locks limit concurrency but offer low single-thread latency. Software transactional memory (STM) typically has much higher latency, but scales well. We present transactional mutex locks (TML), which attempt to achieve the best of both worlds for read-dominated workloads. We also propose compiler optimizations that reduce the latency of TML to within a small fraction of mutex overheads. Our evaluation of TML, using microbenchmarks on the x86 and SPARC architectures, is promising. Using optimized spinlocks and the TL2 STM algorithm as baselines, we find that TML provides the low latency of locks at low thread levels, and the scalability of STM for read-dominated workloads. These results suggest that TML is a good reference implementation to use when evaluating STM algorithms, and that TML is a viable alternative to mutex locks for a variety of workloads.
international conference on principles of distributed systems | 2008
Michael F. Spear; Luke Dalessandro; Virendra J. Marathe; Michael L. Scott
It has been widely suggested that memory transactions should behave as if they acquired and released a single global lock. Unfortunately, this behavior can be expensive to achieve, particularly when--as in the natural publication/privatization idiom--the same data are accessed both transactionally and nontransactionally. To avoid overhead, we propose selective strict serializability (SSS) semantics, in which transactions have a global total order, but nontransactional accesses are globally ordered only with respect to explicitly marked transactions. Our definition of SSS formally characterizes the permissible behaviors of an STM system without recourse to locks. If all transactions are marked, then SSS, single-lock semantics, and database-style strict serializability are equivalent. We evaluate several SSS implementations in the context of a TL2-like STM system. We also evaluate a weaker model, selective flow serializability (SFS), which is similar in motivation to the asymmetric lock atomicity (ALA) of Menon et al. We argue that ordering-based semantics are conceptually preferable to lock-based semantics, and just as efficient.
acm symposium on parallel algorithms and architectures | 2007
Michael F. Spear; Arrvindh Shriraman; Luke Dalessandro; Sandhya Dwarkadas; Michael L. Scott
Nonblocking implementations of software transactional memory (STM) typically impose an extra level of indirection when accessing an object; some researchers have claimed that the cost of this indirection outweighs the semantic advantages of nonblocking progress guarantees. We consider this claim in the context of a simple hardware assist, alert-on-update (AOU), which allows a thread to request immediate notification if specified line(s) are replaced or invalidated in its cache. We show that even a single AOU line allows us to construct a simple, nonblocking STM system without extra indirection. At the same time, we observe that per-load validation operations, required for intra-object consistency in both the new system and in lock-based (blocking) STM, at least partially negate the resulting performance gain. Moreover, inter-object consistency checks, also required in both kinds of systems, remain the dominant cost for transactions that access many objects. We therefore present a second nonblocking STM system that uses multiple AOU lines (one per accessed object) to eliminate validation overhead entirely, resulting in a nonblocking, zero-indirection STM system that outperforms competing systems by as much as a factor of 2.