Publication


Featured research published by Cathy May.


Programming Language Design and Implementation | 1987

Mimic: a fast System/370 simulator

Cathy May

Software simulation of one computer on another tends to be slow. Traditional simulators typically execute about 100 instructions on the host machine per instruction simulated. Newer simulators reduce the expansion factor to about 10, by saving and reusing translations of individual instructions. This paper describes an experimental simulator which takes the progression one step further, translating groups of instructions as a unit. This approach, combined with flow analysis, reduces the expansion factor to about 4. The new simulator simulates System/370 on a RISC, namely the IBM RT PC.
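The idea of saving translations and translating a group of instructions as a unit can be illustrated with a small sketch. This is not Mimic's implementation: the three-opcode guest instruction set (`addi`, `jnz`, `halt`), the register dictionary, and the function names `translate_block` and `run` are all hypothetical, and no flow analysis is attempted. Each straight-line run of guest instructions is translated once into a single Python function, cached by its starting address, and reused on later visits.

```python
# Toy sketch of block translation with a translation cache.
# The guest ISA and all names here are hypothetical, not from the paper.

def translate_block(program, pc):
    """Translate the straight-line run starting at pc into one function.

    The returned function takes the register dict and returns the next
    guest pc, or None when the guest halts.
    """
    lines = ["def block(r):"]
    while True:
        op, *args = program[pc]
        if op == "addi":            # r[reg] += imm, fall through
            reg, imm = args
            lines.append(f"    r[{reg!r}] += {imm}")
            pc += 1
        elif op == "jnz":           # branch ends the block
            reg, target = args
            lines.append(f"    return {target} if r[{reg!r}] != 0 else {pc + 1}")
            break
        elif op == "halt":          # halt ends the block
            lines.append("    return None")
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    namespace = {}
    exec("\n".join(lines), namespace)   # compile the generated source
    return namespace["block"]


def run(program, regs):
    """Run the guest program; return how many blocks were translated."""
    cache = {}          # guest pc -> translated block
    translations = 0
    pc = 0
    while pc is not None:
        block = cache.get(pc)
        if block is None:
            block = cache[pc] = translate_block(program, pc)
            translations += 1
        pc = block(regs)
    return translations


if __name__ == "__main__":
    program = [
        ("addi", "acc", 5),   # 0
        ("addi", "n", -1),    # 1
        ("jnz", "n", 0),      # 2: loop back while n != 0
        ("halt",),            # 3
    ]
    regs = {"acc": 0, "n": 3}
    print(run(program, regs), regs)
```

In this example the loop body executes three times but is translated only once, which is the kind of reuse the abstract credits for lowering the expansion factor.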


International Symposium on Computer Architecture | 2013

Robust architectural support for transactional memory in the Power architecture

Harold W. Cain; Maged M. Michael; Brad Frey; Cathy May; Derek Edward Williams; Hung Q. Le

On the twentieth anniversary of the original publication [10], following ten years of intense activity in the research literature, hardware support for transactional memory (TM) has finally become a commercial reality, with HTM-enabled chips currently or soon-to-be available from many hardware vendors. In this paper we describe architectural support for TM added to a future version of the Power ISA™. Two imperatives drove the development: the desire to complement our weakly-consistent memory model with a more friendly interface to simplify the development and porting of multithreaded applications, and the need for robustness beyond that of some early implementations. In the process of commercializing the feature, we had to resolve some previously unexplored interactions between TM and existing features of the ISA, for example translation shootdown, interrupt handling, atomic read-modify-write primitives, and our weakly consistent memory model. We describe these interactions, the overall architecture, and discuss the motivation and rationale for our choices of architectural semantics, beyond what is typically found in reference manuals.


IBM Journal of Research and Development | 2015

Transactional memory support in the IBM POWER8 processor

Hung Q. Le; Guy Lynn Guthrie; Derek Edward Williams; Maged M. Michael; Brad Frey; William J. Starke; Cathy May; Rei Odaira; Takuya Nakaike

With multi-core processors, parallel programming has taken on greater importance. Traditional parallel programming techniques based on critical sections controlled by locking have several well-known drawbacks. To allow for more efficient parallel programming with higher performance, the IBM POWER8™ processor implements a hardware transactional memory facility. Transactional memory allows groups of load and store operations to execute and commit as a single atomic unit without the use of traditional locks, thereby improving performance and simplifying the parallel programming model. The POWER8 transactional memory facility provides a robust capability to execute transactions that can survive interrupts. It also allows non-speculative accesses within transactions, which facilitates debugging and thread-level speculation. Unique challenges caused by implementing transactional memory on top of the Power ISA (Instruction Set Architecture) weakly consistent memory model are addressed. We detail the Power ISA transactional memory architecture, the POWER8 implementation of this architecture, and two practical uses of this architecture—Transactional Lock Elision (TLE) and Thread-Level Speculation (TLS)—and provide performance results for these uses.


IEEE Transactions on Software Engineering | 1989

The parallel assignment problem redefined

Cathy May

The parallel assignment problem is slightly redefined using a subtler cost function that tends to reduce the number of extra assignments required. It is shown that the new problem, like the classical, is NP-hard. The new problem is then solved for the restricted case of assignment from invertible functions of single variables. For this restricted case an optimum solution can be found in linear time for both the classical problem and the new problem. However, the number of extra assignments required for the classical problem is equal to the number of cycles in the dependency graph, while in the new problem it is equal to the number of isolated cycles in the dependency graph, which may be less.
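The classical count mentioned above (one extra assignment per cycle) can be illustrated with a minimal sketch. It covers only plain copies between variables, not general invertible functions, and it is a common textbook-style sequentialization rather than the paper's algorithm. The names `sequentialize`, `execute`, and the generated `tmpN` temporaries are hypothetical, and the sketch assumes no user variable is already named `tmpN`.

```python
# Sketch: turn a parallel assignment of plain copies into sequential moves,
# spending one temporary per dependency cycle (the classical cost).
# All names are hypothetical; this is not the algorithm from the paper.

def sequentialize(assign):
    """assign maps each target to its source; all reads happen 'at once'.

    Returns (moves, temps): an ordered list of (dst, src) copies and the
    number of temporaries introduced.
    """
    pending = {dst: src for dst, src in assign.items() if dst != src}
    moves, temps = [], 0
    while pending:
        still_read = set(pending.values())
        # A target is safe to overwrite once no pending copy reads it.
        ready = [dst for dst in pending if dst not in still_read]
        if ready:
            for dst in ready:
                moves.append((dst, pending.pop(dst)))
            continue
        # Only cycles remain: save one variable and redirect its readers.
        victim = next(iter(pending))
        tmp = f"tmp{temps}"
        temps += 1
        moves.append((tmp, victim))
        for dst, src in list(pending.items()):
            if src == victim:
                pending[dst] = tmp
    return moves, temps


def execute(moves, env):
    """Apply the sequential moves to a copy of env and return the result."""
    env = dict(env)
    for dst, src in moves:
        env[dst] = env[src]
    return env


if __name__ == "__main__":
    moves, temps = sequentialize({"x": "y", "y": "x"})
    print(moves, temps)
```

For the parallel assignment `a, b, c := b, a, a`, this sketch still spends one temporary on the `a`/`b` cycle, although `c` already receives a copy of the old `a`. That looks like the kind of non-isolated cycle for which the redefined cost function can do better, though this reading of "isolated" is an assumption based on the abstract alone.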


Archive | 1994

The PowerPC architecture: a specification for a new family of RISC processors

Cathy May; Ed Silha; Rick Simpson; Hank Warren


Archive | 2007

Data stream prefetching in a microprocessor

Eric Fluhr; Bradly G. Frey; John Barry Griswell; Hung Qui Le; Cathy May; Francis Patrick O'Connell; Edward John Silha; Albert Thomas Williams


Archive | 2008

Hypervisor-Enforced Isolation of Entities Within a Single Logical Partition's Virtual Address Space

William Joseph Armstrong; Orran Krieger; Cathy May; Michal Ostrowski; Randal C. Swanberg


Archive | 2010

Transactional Memory System Supporting Unbroken Suspended Execution

Harold W. Cain; Bradly G. Frey; Benjamin Herrenschmidt; Hung Q. Le; Cathy May; Maged M. Michael; José E. Moreira; Priya Nagpurkar; Naresh Nayar; Randal C. Swanberg


Archive | 2012

Transactional Memory Preemption Mechanism

Richard Louis Arndt; Harold W. Cain; Bradly G. Frey; Cathy May


Archive | 2009

Specifying an access hint for prefetching partial cache block data in a cache hierarchy

Bradly G. Frey; Guy Lynn Guthrie; Cathy May; Ramakrishnan Rajamony; Balaram Sinharoy; William J. Starke; Peter K. Szwed
