Antony L. Hosking
Purdue University
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Antony L. Hosking.
conference on object-oriented programming systems, languages, and applications | 2006
Stephen M. Blackburn; Robin Garner; Chris Hoffmann; Asjad M. Khang; Kathryn S. McKinley; Rotem Bentzur; Amer Diwan; Daniel Feinberg; Daniel Frampton; Samuel Z. Guyer; Martin Hirzel; Antony L. Hosking; Maria Jump; Han Lee; J. Eliot B. Moss; Aashish Phansalkar; Darko Stefanovic; Thomas VanDrunen; Daniel von Dincklage; Ben Wiedermann
Since benchmarks drive computer science research and industry product development, which ones we use and how we evaluate them are key questions for the community. Despite complex runtime tradeoffs due to dynamic compilation and garbage collection required for Java programs, many evaluations still use methodologies developed for C, C++, and Fortran. SPEC, the dominant purveyor of benchmarks, compounded this problem by institutionalizing these methodologies for their Java benchmark suite. This paper recommends benchmarking selection and evaluation methodologies, and introduces the DaCapo benchmarks, a set of open source, client-side Java benchmarks. We demonstrate that the complex interactions of (1) architecture, (2) compiler, (3) virtual machine, (4) memory management, and (5) application require more extensive evaluation than C, C++, and Fortran which stress (4) much less, and do not require (3). We use and introduce new value, time-series, and statistical metrics for static and dynamic properties such as code complexity, code size, heap composition, and pointer mutations. No benchmark suite is definitive, but these metrics show that DaCapo improves over SPEC Java in a variety of ways, including more complex code, richer object behaviors, and more demanding memory system requirements. This paper takes a step towards improving methodologies for choosing and evaluating benchmarks to foster innovation in system design and implementation for Java and other managed languages.
acm sigplan symposium on principles and practice of parallel programming | 2007
Yang Ni; Vijay Menon; Ali-Reza Adl-Tabatabai; Antony L. Hosking; Richard L. Hudson; J. Eliot B. Moss; Bratin Saha; Tatiana Shpeisman
Transactional memory (TM) promises to simplify concurrent programming while providing scalability competitive to fine-grained locking. Language-based constructs allow programmers to denote atomic regions declaratively and to rely on the underlying system to provide transactional guarantees along with concurrency. In contrast with fine-grained locking, TM allows programmers to write simpler programs that are composable and deadlock-free. TM implementations operate by tracking loads and stores to memory and by detecting concurrent conflicting accesses by different transactions. By automating this process, they greatly reduce the programmers burden, but they also are forced to be conservative. Incertain cases, conflicting memory accesses may not actually violate the higher-level semantics of a program, and a programmer may wish to allow seemingly conflicting transactions to execute concurrently. Open nested transactions enable expert programmers to differentiate between physical conflicts, at the level of memory, and logical conflicts that actually violate application semantics. A TMsystem with open nesting can permit physical conflicts that are not logical conflicts, and thus increase concurrency among application threads. Here we present an implementation of open nested transactions in a Java-based software transactional memory (STM)system. We describe new language constructs to support open nesting in Java, and we discuss new abstract locking mechanisms that a programmer can use to prevent logical conflicts. We demonstrate how these constructs can be mapped efficiently to existing STM data structures. Finally, we evaluate our system on a set of Java applications and data structures, demonstrating how open nesting can enhance application scalability.
conference on object-oriented programming systems, languages, and applications | 2005
Adam Welc; Suresh Jagannathan; Antony L. Hosking
A future is a simple and elegant abstraction that allows concurrency to be expressed often through a relatively small rewrite of a sequential program. In the absence of side-effects, futures serve as benign annotations that mark potentially concurrent regions of code. Unfortunately, when computation relies heavily on mutation as is the case in Java, its meaning is less clear, and much of its intended simplicity lost.This paper explores the definition and implementation of safe futures for Java. One can think of safe futures as truly transparent annotations on method calls, which designate opportunities for concurrency. Serial programs can be made concurrent simply by replacing standard method calls with future invocations. Most significantly, even though some parts of the program are executed concurrently and may indeed operate on shared data, the semblance of serial execution is nonetheless preserved. Thus, program reasoning is simplified since data dependencies present in a sequential program are not violated in a version augmented with safe futures.Besides presenting a programming model and API for safe futures, we formalize the safety conditions that must be satisfied to ensure equivalence between a sequential Java program and its future-annotated counterpart. A detailed implementation study is also provided. Our implementation exploits techniques such as object versioning and task revocation to guarantee necessary safety conditions. We also present an extensive experimental evaluation of our implementation to quantify overheads and limitations. Our experiments indicate that for programs with modest mutation rates on shared data, applications can use futures to profitably exploit parallelism, without sacrificing safety.
conference on object oriented programming systems languages and applications | 1992
Antony L. Hosking; J. Eliot B. Moss; Darko Stefanovic
Generational garbage collectors are able to achieve very small pause times by concentrating on the youngest (most recently allocated) objects when collecting, since objects have been observed to die young in many systems. Generational collectors must keep track of all pointers from older to younger generations, by “monitoring” all stores into the heap. This write barrier has been implemented in a number of ways, varying essentially in the granularity of the information observed and stored. Here we examine a range of write barrier implementations and evaluate their relative performance within a generation scavenging garbage collector for Smalltalk.
Science of Computer Programming | 2006
J. Eliot B. Moss; Antony L. Hosking
We offer a reference model for nested transactions at the level of memory accesses, and sketch possible hardware architecture designs that implement that model. We describe both closed and open nesting. The model is abstract in that it does not relate to hardware, such as caches, but describes memory as seen by each transaction, memory access conflicts, and the effects of commits and aborts. The hardware sketches describe approaches to implementing the model using bounded size caches in a processor with overflows to memory. In addition to a model that will support concurrency within a transaction, we describe a simpler model that we call linear nesting. Linear nesting supports only a single thread of execution in a transaction nest, but may be easier to implement. While we hope that the model is a good target to which to compile transactions from source languages, the mapping from source constructs to nested transactional memory is beyond the scope of the paper.
european conference on object-oriented programming | 2004
Adam Welc; Suresh Jagannathan; Antony L. Hosking
Transactional monitors are proposed as an alternative to monitors based on mutual-exclusion synchronization for object-oriented programming languages. Transactional monitors have execution semantics similar to mutual- exclusion monitors but implement monitors as lightweight transactions that can be executed concurrently (or in parallel on multiprocessors). They alleviate many of the constraints that inhibit construction of transparently scalable and robust applications. We undertake a detailed study of alternative implementation schemes for transac- tional monitors. These different schemes are tailored to different concurrent access patterns, and permit a given application to use potentially different implementa- tions in different contexts. We also examine the performance and scalability of these alternative approaches in the context of the Jikes Research Virtual Machine, a state-of-the-art Java im- plementation. We show that transactional monitors are competitive with mutual- exclusion synchronization and can outperform lock-based approaches up to five times on a wide range of workloads.
symposium on operating systems principles | 1994
Antony L. Hosking; J. Eliot B. Moss
Many operating systems allow user programs to specify the protection level (inaccessible, read-only, read-write) of pages in their virtual memory address space, and to handle any protection violations that may occur. Such page-protection techniques have been exploited by several user-level algorithms for applications including generational garbage collection and persistent stores. Unfortunately, modern hardware has made efficient handling of page protection faults more difficult. Moreover, page-sized granularity may not match the natural granularity of a given application. In light of these problems, we reevaluate the usefulness of page-protection primitives in such applications, by comparing the performance of implementations that make use of the primitives with others that do not. Our results show that for certain applications software solutions outperform solutions that rely on page-protection or other related virtual memory primitives.
Communications of The ACM | 2008
Stephen M. Blackburn; Kathryn S. McKinley; Robin Garner; Chris Hoffmann; Asjad M. Khan; Rotem Bentzur; Amer Diwan; Daniel Feinberg; Daniel Frampton; Samuel Z. Guyer; Martin Hirzel; Antony L. Hosking; Maria Jump; Han Lee; J. Eliot B. Moss; Aashish Phansalkar; Darko Stefanovik; Thomas VanDrunen; Daniel von Dincklage; Ben Wiedermann
Evaluation methodology underpins all innovation in experimental computer science. It requires relevant workloads, appropriate experimental design, and rigorous analysis. Unfortunately, methodology is not keeping pace with the changes in our field. The rise of managed languages such as Java, C#, and Ruby in the past decade and the imminent rise of commodity multicore architectures for the next decade pose new methodological challenges that are not yet widely understood. This paper explores the consequences of our collective inattention to methodology on innovation, makes recommendations for addressing this problem in one domain, and provides guidelines for other domains. We describe benchmark suite design, experimental design, and analysis for evaluating Java applications. For example, we introduce new criteria for measuring and selecting diverse applications for a benchmark suite. We show that the complexity and nondeterminism of the Java runtime system make experimental design a first-order consideration, and we recommend mechanisms for addressing complexity and nondeterminism. Drawing on these results, we suggest how to adapt methodology more broadly. To continue to deliver innovations, our field needs to significantly increase participation in and funding for developing sound methodological foundations.
programming language design and implementation | 2010
Filip Pizlo; Lukasz Ziarek; Petr Maj; Antony L. Hosking; Ethan Blanton; Jan Vitek
Managed languages such as Java and C# are being considered for use in hard real-time systems. A hurdle to their widespread adoption is the lack of garbage collection algorithms that offer predictable space-and-time performance in the face of fragmentation. We introduce SCHISM/CMR, a new concurrent and real-time garbage collector that is fragmentation tolerant and guarantees time-and-space worst-case bounds while providing good throughput. SCHISM/CMR combines mark-region collection of fragmented objects and arrays (arraylets) with separate replication-copying collection of immutable arraylet spines, so as to cope with external fragmentation when running in small heaps. We present an implementation of SCHISM/CMR in the Fiji VM, a high-performance Java virtual machine for mission-critical systems, along with a thorough experimental evaluation on a wide variety of architectures, including server-class and embedded systems. The results show that SCHISM/CMR tolerates fragmentation better than previous schemes, with a much more acceptable throughput penalty.
international symposium on memory management | 2004
Stephen M. Blackburn; Antony L. Hosking
Modern garbage collectors rely on read and write barriers imposed on heap accesses by the mutator, to keep track of references between different regions of the garbage collected heap, and to synchronize actions of the mutator with those of the collector. It has been a long-standing untested assumption that barriers impose significant overhead to garbage-collected applications. As a result, researchers have devoted effort to development of optimization approaches for elimination of unnecessary barriers, or proposed new algorithms for garbage collection that avoid the need for barriers while retaining the capability for independent collection of heap partitions. On the basis of the results presented here, we dispel the assumption that barrier overhead should be a primary motivator for such efforts. We present a methodology for precise measurement of mutator overheads for barriers associated with mutator heap accesses. We provide a taxonomy of different styles of barrier and measure the cost of a range of popular barriers used for different garbage collectors within Jikes RVM. Our results demonstrate that barriers impose surprisingly low cost on the mutator, though results vary by architecture. We found that the average overhead for a reasonable generational write barrier was less than 2% on average, and less than 6% in the worst case. Furthermore, we found that the average overhead of a read barrier consisting of just an unconditional mask of the low order bits read on the PowerPC was only 0.85%, while on the AMD it was 8.05%. With both read and write barriers, we found that second order locality effects were sometimes more important than the overhead of the barriers themselves, leading to counter-intuitive speedups in a number of situations.