Milo M. K. Martin | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Milo M. K. Martin is active.

Explore More

Publication

Featured researches published by Milo M. K. Martin.

ACM Sigarch Computer Architecture News | 2005

Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset

Milo M. K. Martin; Daniel J. Sorin; Bradford M. Beckmann; Michael R. Marty; Min Xu; Alaa R. Alameldeen; Kevin E. Moore; Mark D. Hill; David A. Wood

The Wisconsin Multifacet Project has created a simulation toolset to characterize and evaluate the performance of multiprocessor hardware systems commonly used as database and web servers. We leverage an existing full-system functional simulation infrastructure (Simics [14]) as the basis around which to build a set of timing simulator modules for modeling the timing of the memory system and microprocessors. This simulator infrastructure enables us to run architectural experiments using a suite of scaled-down commercial workloads [3]. To enable other researchers to more easily perform such research, we have released these timing simulator modules as the Multifacet General Execution-driven Multiprocessor Simulator (GEMS) Toolset, release 1.0, under GNU GPL [9].

programming language design and implementation | 2009

SoftBound: highly compatible and complete spatial memory safety for c

Santosh Nagarakatte; Jianzhou Zhao; Milo M. K. Martin; Steve Zdancewic

The serious bugs and security vulnerabilities facilitated by C/C++s lack of bounds checking are well known, yet C and C++ remain in widespread use. Unfortunately, Cs arbitrary pointer arithmetic, conflation of pointers and arrays, and programmer-visible memory layout make retrofitting C/C++ with spatial safety guarantees extremely challenging. Existing approaches suffer from incompleteness, have high runtime overhead, or require non-trivial changes to the C source code. Thus far, these deficiencies have prevented widespread adoption of such techniques. This paper proposes SoftBound, a compile-time transformation for enforcing spatial safety of C. Inspired by HardBound, a previously proposed hardware-assisted approach, SoftBound similarly records base and bound information for every pointer as disjoint metadata. This decoupling enables SoftBound to provide spatial safety without requiring changes to C source code. Unlike HardBound, SoftBound is a software-only approach and performs metadata manipulation only when loading or storing pointer values. A formal proof shows that this is sufficient to provide spatial safety even in the presence of arbitrary casts. SoftBounds full checking mode provides complete spatial violation detection with 67% runtime overhead on average. To further reduce overheads, SoftBound has a store-only checking mode that successfully detects all the security vulnerabilities in a test suite at the cost of only 22% runtime overhead on average.

IEEE Computer Architecture Letters | 2006

Subtleties of transactional memory atomicity semantics

Colin Blundell; E.C. Lewis; Milo M. K. Martin

Transactional memory has great potential for simplifying multithreaded programming by allowing programmers to specify regions of the program that must appear to execute atomically. Transactional memory implementations then optimistically execute these transactions concurrently to obtain high performance. This work shows that the same atomic guarantees that give transactions their power also have unexpected and potentially serious negative effects on programs that were written assuming narrower scopes of atomicity. We make four contributions: (1) we show that a direct translation of lock-based critical sections into transactions can introduce deadlock into otherwise correct programs, (2) we introduce the terms strong atomicity and weak atomicity to describe the interaction of transactional and non-transactional code, (3) we show that code that is correct under weak atomicity can deadlock under strong atomicity, and (4) we demonstrate that sequentially composing transactional code can also introduce deadlocks. These observations invalidate the intuition that transactions are strictly safer than lock-based critical sections, that strong atomicity is strictly safer than weak atomicity, and that transactions are always composable

programming language design and implementation | 2007

CheckFence: checking consistency of concurrent data types on relaxed memory models

Sebastian Burckhardt; Rajeev Alur; Milo M. K. Martin

Concurrency libraries can facilitate the development of multi-threaded programs by providing concurrent implementations of familiar data types such as queues or sets. There exist many optimized algorithms that can achieve superior performance on multiprocessors by allowing concurrent data accesses without using locks. Unfortunately, such algorithms can harbor subtle concurrency bugs. Moreover, they requirememory ordering fences to function correctly on relaxed memory models. To address these difficulties, we propose a verification approach that can exhaustively check all concurrent executions of a given test program on a relaxed memory model and can verify that they are observationally equivalent to a sequential execution. Our CheckFence prototype automatically translates the C implementation code and the test program into a SAT formula, hands the latter to a standard SAT solver, and constructs counter example traces if there exist incorrect executions. Applying CheckFence to five previously published algorithms, we were able to (1) find several bugs (some not previously known), and (2) determine how to place memory ordering fences for relaxed memory models.

Communications of The ACM | 2012

Why on-chip cache coherence is here to stay

Milo M. K. Martin; Mark D. Hill; Daniel J. Sorin

On-chip hardware coherence can scale gracefully as the number of cores increases.

ieee symposium on security and privacy | 2010

Overcoming an Untrusted Computing Base: Detecting and Removing Malicious Hardware Automatically

Matthew Hicks; Murph Finnicum; Samuel T. King; Milo M. K. Martin; Jonathan M. Smith

The computer systems security arms race between attackers and defenders has largely taken place in the domain of software systems, but as hardware complexity and design processes have evolved, novel and potent hardware-based security threats are now possible. This paper presents a hybrid hardware/software approach to defending against malicious hardware. We propose BlueChip, a defensive strategy that has both a design-time component and a runtime component. During the design verification phase, BlueChip invokes a new technique, unused circuit identification (UCI), to identify suspicious circuitry—those circuits not used or otherwise activated by any of the design verification tests. BlueChip removes the suspicious circuitry and replaces it with exception generation hardware. The exception handler software is responsible for providing forward progress by emulating the effect of the exception generating instruction in software, effectively providing a detour around suspicious hardware. In our experiments, BlueChip is able to prevent all hardware attacks we evaluate while incurring a small runtime overhead.

international symposium on memory management | 2010

CETS: compiler enforced temporal safety for C

Santosh Nagarakatte; Jianzhou Zhao; Milo M. K. Martin; Steve Zdancewic

Temporal memory safety errors, such as dangling pointer dereferences and double frees, are a prevalent source of software bugs in unmanaged languages such as C. Existing schemes that attempt to retrofit temporal safety for such languages have high runtime overheads and/or are incomplete, thereby limiting their effectiveness as debugging aids. This paper presents CETS, a compile-time transformation for detecting all violations of temporal safety in C programs. Inspired by existing approaches, CETS maintains a unique identifier with each object, associates this metadata with the pointers in a disjoint metadata space to retain memory layout compatibility, and checks that the object is still allocated on pointer dereferences. A formal proof shows that this is sufficient to provide temporal safety even in the presence of arbitrary casts if the program contains no spatial safety violations. Our CETS prototype employs both temporal check removal optimizations and traditional compiler optimizations to achieve a runtime overhead of just 48% on average. When combined with a spatial-checking system, the average overall overhead is 116% for complete memory safety

international symposium on computer architecture | 2003

Using destination-set prediction to improve the latency/bandwidth tradeoff in shared-memory multiprocessors

Milo M. K. Martin; Pacia J. Harper; Daniel J. Sorin; Mark D. Hill; David A. Wood

Destination-set prediction can improve the latency/bandwidth tradeoff in shared-memory multiprocessors. The destination set is the collection of processors that receive a particular coherence request. Snooping protocols send requests to the maximal destination set (i.e., all processors), reducing latency for cache-to-cache misses at the expense of increased traffic. Directory protocols send requests to the minimal destination set, reducing bandwidth at the expense of an indirection through the directory for cache-to-cache misses. Recently proposed hybrid protocols trade-off latency and bandwidth by directly sending requests to a predicted destination set.This paper explores the destination-set predictor design space, focusing on a collection of important commercial workloads. First, we analyze the sharing behavior of these workloads. Second, we propose predictors that exploit the observed sharing behavior to target different points in the latency/bandwidth tradeoff. Third, we illustrate the effectiveness of destination-set predictors in the context of a multicast snooping protocol. For example, one of our predictors obtains almost 90% of the performance of snooping while using only 15% more bandwidth than a directory protocol (and less than half the bandwidth of snooping).

IEEE Computer | 2003

Simulating a

Alaa R. Alameldeen; Milo M. K. Martin; Carl J. Mauer; Kevin E. Moore; Min Xu; Mark D. Hill; David A. Wood; Daniel J. Sorin

As dependence on database management systems and Web servers increases, so does the need for them to run reliably and efficiently-goals that rigorous simulations can help achieve. Execution-driven simulation models system hardware. These simulations capture actual program behavior and detailed system interactions. The authors have developed a simulation methodology that uses multiple simulations, pays careful attention to the effects of scaling on workload behavior, and extends Virtutech ABs Simics full system functional simulator with detailed timing models. The Wisconsin Commercial Workload Suite contains scaled and tuned benchmarks for multiprocessor servers, enabling full-system simulations to run on the PCs that are routinely available to researchers.

international symposium on computer architecture | 2007

2M commercial server on a

Colin Blundell; Joseph Devietti; E. Christopher Lewis; Milo M. K. Martin

Hardware transactional memory has great potential to simplify the creation ofcorrect and efficient multithreaded programs, allowing programmers to exploitmore effectively the soon-to-be-ubiquitous multi-core designs. Several recentproposals have extended the original bounded transactional memory to unboundedtransactional memory, a crucial step toward transactions becoming ageneral-purpose primitive. Unfortunately, supporting the concurrent executionof an unbounded number of unbounded transactions is challenging, and as aresult, many proposed implementations are complex. This paper explores a different approach. First, we introduce thepermissions-only cache to extend the bound at which transactions overflow toallow the fast, bounded case to be used as frequently as possible. Second, wepropose OneTM to simplify the implementation of unbounded transactional memoryby bounding the concurrency of transactions that overflow the cache. Thesemechanisms work synergistically to provide a simple and fast unboundedtransactional memory system. The permissions-only cache efficiently maintains the coherencepermissions-but not data-for blocks read or written transactionally thathave been evicted from the processors caches. By holding coherencepermissions for these blocks, the regular cache coherence protocol can be usedto detect transactional conflicts using only a few bits of on-chip storage peroverflowed cache block.OneTM allows only one overflowed transaction at a time, relying on thepermissions-only cache to ensure that overflow is infrequent. We present twoimplementations. In OneTM-Serialized, an overflowed transaction simply stallsall other threads in the application. In OneTM-Concurrent, non-overflowedtransactions and non-transactional code can execute concurrently with theoverflowed transaction, providing more concurrency while retaining OneTMs coresimplifying assumption.

Explore More