Publications


Featured research published by Dan Grossman.


Programming Language Design and Implementation | 2011

EnerJ: approximate data types for safe and general low-power computation

Adrian Sampson; Werner Dietl; Emily Fortuna; Danushen Gnanapragasam; Luis Ceze; Dan Grossman

Energy is increasingly a first-order concern in computer systems. Exploiting energy-accuracy trade-offs is an attractive choice in applications that can tolerate inaccuracies. Recent work has explored exposing this trade-off in programming models. A key challenge, though, is how to isolate parts of the program that must be precise from those that can be approximated so that a program functions correctly even as quality of service degrades. We propose using type qualifiers to declare data that may be subject to approximate computation. Using these types, the system automatically maps approximate variables to low-power storage, uses low-power operations, and even applies more energy-efficient algorithms provided by the programmer. In addition, the system can statically guarantee isolation of the precise program component from the approximate component. This allows a programmer to control explicitly how information flows from approximate data to precise data. Importantly, employing static analysis eliminates the need for dynamic checks, further improving energy savings. As a proof of concept, we develop EnerJ, an extension to Java that adds approximate data types. We also propose a hardware architecture that offers explicit approximate storage and computation. We port several applications to EnerJ and show that our extensions are expressive and effective; a small number of annotations lead to significant potential energy savings (10%-50%) at very little accuracy cost.
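
To make the type-qualifier idea above concrete, here is a minimal sketch in plain Java, assuming hypothetical Approx and Precise annotations and an endorse helper defined inside the snippet itself; it illustrates the programming model only and is not EnerJ's actual checker or runtime.

import java.lang.annotation.*;

// A minimal sketch of EnerJ-style type qualifiers. The annotations and the
// endorse helper are defined here purely for illustration; the real EnerJ
// checker, runtime, and hardware mapping are not reproduced.
public class ApproxSketch {
    @Target(ElementType.TYPE_USE) @Retention(RetentionPolicy.RUNTIME)
    @interface Approx {}   // data that may be stored or computed approximately

    @Target(ElementType.TYPE_USE) @Retention(RetentionPolicy.RUNTIME)
    @interface Precise {}  // data that must remain exact (the default)

    // Explicit endorsement: the only way approximate data may flow to precise data.
    static int endorse(@Approx int x) { return x; }

    // Pixel luminance can tolerate error; the loop counter and array length cannot.
    static @Approx int averageLuminance(@Approx int[] pixels) {
        @Approx int sum = 0;
        for (@Precise int i = 0; i < pixels.length; i++) {
            sum += pixels[i];          // approximate storage and arithmetic
        }
        return sum / pixels.length;
    }

    public static void main(String[] args) {
        @Approx int avg = averageLuminance(new int[] {12, 200, 37, 90});
        // A static checker would reject `int exact = avg;` without an endorsement.
        int exact = endorse(avg);
        System.out.println(exact);
    }
}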


Programming Language Design and Implementation | 2002

Region-based memory management in Cyclone

Dan Grossman; J. Gregory Morrisett; Trevor Jim; Michael Hicks; Yanling Wang; James Cheney

Cyclone is a type-safe programming language derived from C. The primary design goal of Cyclone is to let programmers control data representation and memory management without sacrificing type-safety. In this paper, we focus on the region-based memory management of Cyclone and its static typing discipline. The design incorporates several advancements, including support for region subtyping and a coherent integration with stack allocation and a garbage collector. To support separate compilation, Cyclone requires programmers to write some explicit region annotations, but a combination of default annotations, local type inference, and a novel treatment of region effects reduces this burden. As a result, we integrate C idioms in a region-based framework. In our experience, porting legacy C to Cyclone has required altering about 8% of the code; of the changes, only 6% (of the 8%) were region annotations.
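
Cyclone is C-level, so its regions cannot be shown directly here; as a loose analogy only, the Java sketch below uses a hypothetical Arena class to mimic the lifetime discipline of a lexically scoped region, where everything allocated in the region is reclaimed when its scope ends. It does not capture Cyclone's static region typing.

import java.util.ArrayList;
import java.util.List;

// A rough analogy (not Cyclone itself): all allocations made through an arena
// share one lifetime that ends when the arena's lexical scope is exited.
// The Arena class and its methods are hypothetical illustrations.
public class RegionSketch {
    static final class Arena implements AutoCloseable {
        private final List<Object> objects = new ArrayList<>();
        private boolean closed = false;

        <T> T alloc(T obj) {                  // "allocate" obj into this region
            if (closed) throw new IllegalStateException("region already deallocated");
            objects.add(obj);
            return obj;
        }

        @Override public void close() {       // end of region: everything dies at once
            objects.clear();
            closed = true;
        }
    }

    public static void main(String[] args) {
        try (Arena r = new Arena()) {          // akin to `region r { ... }` in Cyclone
            int[] buffer = r.alloc(new int[1024]);
            buffer[0] = 42;
            System.out.println(buffer[0]);
        }                                      // buffer's region is gone here; Cyclone's
                                               // type system rejects any later use of it.
    }
}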


International Conference on Machine Learning | 2004

Learning Bayesian network classifiers by maximizing conditional likelihood

Dan Grossman; Pedro M. Domingos

Bayesian networks are a powerful probabilistic representation, and their use for classification has received considerable attention. However, they tend to perform poorly when learned in the standard way. This is attributable to a mismatch between the objective function used (likelihood or a function thereof) and the goal of classification (maximizing accuracy or conditional likelihood). Unfortunately, the computational cost of optimizing structure and parameters for conditional likelihood is prohibitive. In this paper we show that a simple approximation---choosing structures by maximizing conditional likelihood while setting parameters by maximum likelihood---yields good results. On a large suite of benchmark datasets, this approach produces better class probability estimates than naive Bayes, TAN, and generatively-trained Bayesian networks.
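
The mismatch described above is between the generative objective (joint likelihood) and the discriminative one (conditional likelihood). The sketch below computes both for a tiny naive Bayes model; the model, parameters, and data are made up purely to show the difference between the two quantities.

// Illustrates the objective-function mismatch described above: for a tiny
// naive Bayes model we compute both the joint log-likelihood (what standard
// parameter learning maximizes) and the conditional log-likelihood of the
// class given the features (what discriminative structure scoring would use).
public class ConditionalLikelihoodSketch {
    // P(y) and P(x_j = 1 | y) for a two-class, two-feature naive Bayes model
    static final double[] prior = {0.6, 0.4};
    static final double[][] pFeature = {{0.8, 0.3}, {0.2, 0.7}}; // [class][feature]

    static double jointProb(int y, int[] x) {
        double p = prior[y];
        for (int j = 0; j < x.length; j++) {
            double pj = pFeature[y][j];
            p *= (x[j] == 1) ? pj : (1.0 - pj);
        }
        return p;
    }

    public static void main(String[] args) {
        int[][] xs = {{1, 0}, {0, 1}, {1, 1}};
        int[]   ys = {0, 1, 0};

        double logLik = 0.0, condLogLik = 0.0;
        for (int i = 0; i < xs.length; i++) {
            double joint = jointProb(ys[i], xs[i]);
            double evidence = jointProb(0, xs[i]) + jointProb(1, xs[i]); // P(x)
            logLik     += Math.log(joint);            // generative objective
            condLogLik += Math.log(joint / evidence); // discriminative objective
        }
        System.out.printf("joint log-likelihood:       %.4f%n", logLik);
        System.out.printf("conditional log-likelihood: %.4f%n", condLogLik);
    }
}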


Architectural Support for Programming Languages and Operating Systems | 2010

CoreDet: a compiler and runtime system for deterministic multithreaded execution

Tom Bergan; Owen Anderson; Joseph Devietti; Luis Ceze; Dan Grossman

The behavior of a multithreaded program does not depend only on its inputs. Scheduling, memory reordering, timing, and low-level hardware effects all introduce nondeterminism in the execution of multithreaded programs. This severely complicates many tasks, including debugging, testing, and automatic replication. In this work, we avoid these complications by eliminating their root cause: we develop a compiler and runtime system that runs arbitrary multithreaded C/C++ POSIX Threads programs deterministically. A trivial non-performant approach to providing determinism is simply deterministically serializing execution. Instead, we present a compiler and runtime infrastructure that ensures determinism but resorts to serialization rarely, for handling interthread communication and synchronization. We develop two basic approaches, both of which are largely dynamic with performance improved by some static compiler optimizations. First, an ownership-based approach detects interthread communication via an evolving table that tracks ownership of memory regions by threads. Second, a buffering approach uses versioned memory and employs a deterministic commit protocol to make changes visible to other threads. While buffering has larger single-threaded overhead than ownership, it tends to scale better (serializing less often). A hybrid system sometimes performs and scales better than either approach individually. Our implementation is based on the LLVM compiler infrastructure. It needs neither programmer annotations nor special hardware. Our empirical evaluation uses the PARSEC and SPLASH2 benchmarks and shows that our approach scales comparably to nondeterministic execution.
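
The buffering approach can be pictured with a much-simplified round-based sketch: threads write into private buffers, and at a barrier the buffers are committed to shared memory in ascending thread-ID order, so the final state does not depend on scheduling. The class names, round structure, and buffer layout below are illustrative assumptions, not CoreDet's runtime.

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CyclicBarrier;

// A toy version of round-based deterministic commit: each thread buffers its
// writes privately, and buffers are applied to shared memory in ascending
// thread-ID order at the end of the round. Names and structure are illustrative.
public class DeterministicCommitSketch {
    static final int THREADS = 4;
    static final Map<String, Integer> shared = new ConcurrentHashMap<>();
    @SuppressWarnings("unchecked")
    static final Map<String, Integer>[] buffers = new HashMap[THREADS];

    public static void main(String[] args) throws Exception {
        // After all threads reach the barrier, commit buffers in a fixed order.
        CyclicBarrier endOfRound = new CyclicBarrier(THREADS, () -> {
            for (int id = 0; id < THREADS; id++) {
                shared.putAll(buffers[id]);    // deterministic commit order
            }
        });

        Thread[] workers = new Thread[THREADS];
        for (int id = 0; id < THREADS; id++) {
            final int tid = id;
            buffers[tid] = new HashMap<>();
            workers[tid] = new Thread(() -> {
                buffers[tid].put("winner", tid);   // racy write, but buffered privately
                try {
                    endOfRound.await();            // round boundary
                } catch (Exception e) {
                    throw new RuntimeException(e);
                }
            });
            workers[tid].start();
        }
        for (Thread w : workers) w.join();
        System.out.println(shared.get("winner"));  // always 3, regardless of scheduling
    }
}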


Programming Language Design and Implementation | 2007

Enforcing isolation and ordering in STM

Tatiana Shpeisman; Vijay Menon; Ali-Reza Adl-Tabatabai; Steven Balensiefer; Dan Grossman; Richard L. Hudson; Katherine F. Moore; Bratin Saha

Transactional memory provides a new concurrency control mechanism that avoids many of the pitfalls of lock-based synchronization. High-performance software transactional memory (STM) implementations thus far provide weak atomicity: Accessing shared data both inside and outside a transaction can result in unexpected, implementation-dependent behavior. To guarantee isolation and consistent ordering in such a system, programmers are expected to enclose all shared-memory accesses inside transactions. A system that provides strong atomicity guarantees isolation even in the presence of threads that access shared data outside transactions. A strongly-atomic system also orders transactions with conflicting non-transactional memory operations in a consistent manner. In this paper, we discuss some surprising pitfalls of weak atomicity, and we present an STM system that avoids these problems via strong atomicity. We demonstrate how to implement non-transactional data accesses via efficient read and write barriers, and we present compiler optimizations that further reduce the overheads of these barriers. We introduce a dynamic escape analysis that differentiates private and public data at runtime to make barriers cheaper and a static not-accessed-in-transaction analysis that removes many barriers completely. Our results on a set of Java programs show that strong atomicity can be implemented efficiently in a high-performance STM system.
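
One way to picture the read and write barriers mentioned above: every access to shared data, even outside a transaction, goes through a barrier that coordinates with the same ownership metadata transactions use. The sketch below shows only that shape, using a per-object lock; the TxObject and barrier names are invented for illustration and this is not the paper's STM.

import java.util.concurrent.locks.ReentrantLock;

// A toy illustration of strong atomicity via barriers: non-transactional reads
// and writes go through the same per-object lock that transactions use, so a
// plain write can never interleave with a transaction's accesses to the field.
public class StrongAtomicitySketch {
    static final class TxObject {
        private final ReentrantLock owner = new ReentrantLock();
        private int value;

        int readBarrier() {                  // used by non-transactional reads
            owner.lock();
            try { return value; } finally { owner.unlock(); }
        }

        void writeBarrier(int v) {           // used by non-transactional writes
            owner.lock();
            try { value = v; } finally { owner.unlock(); }
        }

        void withOwnership(Runnable body) {  // crude stand-in for a transaction
            owner.lock();
            try { body.run(); } finally { owner.unlock(); }
        }

        int unsafeRead() { return value; }   // only valid while holding ownership
        void unsafeWrite(int v) { value = v; }
    }

    public static void main(String[] args) {
        TxObject counter = new TxObject();
        // "Transactional" increment: both accesses happen under one ownership span.
        counter.withOwnership(() -> counter.unsafeWrite(counter.unsafeRead() + 1));
        // A non-transactional write still goes through a barrier, so it is ordered
        // consistently with the transaction rather than racing with it.
        counter.writeBarrier(10);
        System.out.println(counter.readBarrier());
    }
}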


Types in Languages Design and Implementation | 2003

Type-safe multithreading in Cyclone

Dan Grossman

We extend Cyclone, a type-safe polymorphic language at the C level of abstraction, with threads and locks. Data races can violate type safety in Cyclone. An extended type system statically guarantees their absence by enforcing that thread-shared data is protected via locking and that thread-local data does not escape the thread that creates it. The extensions interact smoothly with parametric polymorphism and region-based memory management. We present a formal abstract machine that models the need to prevent races, a polymorphic type system for the machine that supports thread-local data, and a corresponding type-safety result.
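
Cyclone enforces this discipline statically in its type system; the Java sketch below can only follow the same discipline by convention, but it illustrates the two invariants the type system guarantees: shared data is touched only under its lock, and thread-local data never escapes its thread.

// Cyclone checks the lock/thread-locality discipline statically; this Java
// sketch follows the same discipline by convention, as a way to picture the
// invariants the paper's type system enforces.
public class SharingDisciplineSketch {
    private static final Object lock = new Object();
    private static int sharedCounter = 0;                  // thread-shared: only touched under `lock`

    private static final ThreadLocal<StringBuilder> scratch =
            ThreadLocal.withInitial(StringBuilder::new);   // thread-local: never escapes its thread

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            scratch.get().append("local work");            // no lock needed: data is per-thread
            synchronized (lock) {                          // lock required: data is shared
                sharedCounter++;
            }
        };
        Thread a = new Thread(work), b = new Thread(work);
        a.start(); b.start();
        a.join();  b.join();
        synchronized (lock) {
            System.out.println(sharedCounter);             // always 2: no data race on sharedCounter
        }
    }
}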


International Conference on Management of Data | 2010

ParaTimer: a progress indicator for MapReduce DAGs

Kristi Morton; Magdalena Balazinska; Dan Grossman

Time-oriented progress estimation for parallel queries is a challenging problem that has received only limited attention. In this paper, we present ParaTimer, a new type of time-remaining indicator for parallel queries. Several parallel data processing systems exist. ParaTimer targets environments where declarative queries are translated into ensembles of MapReduce jobs. ParaTimer builds on previous techniques and makes two key contributions. First, it estimates the progress of queries that translate into directed acyclic graphs of MapReduce jobs, where jobs on different paths can execute concurrently (unlike prior work that looked at sequences only). For such queries, we use a new type of critical-path-based progress-estimation approach. Second, ParaTimer handles a variety of real systems challenges such as failures and data skew. To handle unexpected changes in query execution times due to runtime condition changes, ParaTimer provides users not with a single estimate but with a set of time-remaining estimates, each corresponding to a different, carefully selected scenario. We implement our estimator in the Pig system and demonstrate its performance on experiments running on a real, small-scale cluster.
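
The critical-path idea can be sketched directly: when jobs of a MapReduce DAG run on parallel paths, the time remaining is governed by the most expensive remaining path. The DAG, per-job estimates, and names below are invented to show the computation only.

import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A toy critical-path estimator for a DAG of jobs: the time remaining for the
// whole query is the longest (most expensive) remaining path through the DAG.
// The example DAG and per-job remaining-time estimates are made up.
public class CriticalPathSketch {
    // job -> jobs that depend on it (edges point "downstream")
    static final Map<String, List<String>> successors = Map.of(
            "load", List.of("mapA", "mapB"),
            "mapA", List.of("join"),
            "mapB", List.of("join"),
            "join", List.of());

    // estimated remaining seconds per job (e.g. derived from tuples left to process)
    static final Map<String, Double> remaining = Map.of(
            "load", 0.0, "mapA", 30.0, "mapB", 80.0, "join", 45.0);

    static final Map<String, Double> memo = new HashMap<>();

    // longest remaining path starting at `job`
    static double criticalPath(String job) {
        Double cached = memo.get(job);
        if (cached != null) return cached;
        double worstSuccessor = 0.0;
        for (String next : successors.get(job)) {
            worstSuccessor = Math.max(worstSuccessor, criticalPath(next));
        }
        double result = remaining.get(job) + worstSuccessor;
        memo.put(job, result);
        return result;
    }

    public static void main(String[] args) {
        // mapA and mapB run in parallel, so the estimate follows the slower branch:
        // load (0) + mapB (80) + join (45) = 125 seconds remaining.
        System.out.println(criticalPath("load"));
    }
}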


Symposium on Principles of Programming Languages | 2008

High-level small-step operational semantics for transactions

Katherine F. Moore; Dan Grossman

Software transactions have received significant attention as a way to simplify shared-memory concurrent programming, but insufficient focus has been given to the precise meaning of software transactions or their interaction with other language features. This work begins to rectify that situation by presenting a family of formal languages that model a wide variety of behaviors for software transactions. These languages abstract away implementation details of transactional memory, providing high-level definitions suitable for programming languages. We use small-step semantics in order to represent explicitly the interleaved execution of threads that is necessary to investigate pertinent issues. We demonstrate the value of our core approach to modeling transactions by investigating two issues in depth. First, we consider parallel nesting, in which parallelism and transactions can nest arbitrarily. Second, we present multiple models for weak isolation, in which nontransactional code can violate the isolation of a transaction. For both, type-and-effect systems let us soundly and statically restrict what computation can occur inside or outside a transaction. We prove some key language-equivalence theorems to confirm that under sufficient static restrictions, in particular that each mutable memory location is used outside transactions or inside transactions (but not both), no program can determine whether the language implementation uses weak isolation or strong isolation.
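
As a flavor of what such a formalization looks like (an illustrative rule, not the paper's actual semantics), strong isolation can be expressed by letting an atomic block take effect in a single step of the interleaved small-step relation, assuming a configuration of parallel threads T1, T2 and a heap H:

% One illustrative rule in the style of a small-step semantics for transactions
% (an assumption for exposition, not the rules from the paper): under strong
% isolation, the entire evaluation of an atomic block appears as one indivisible
% step of the thread interleaving.
\[
\frac{\langle e,\; H \rangle \;\longrightarrow^{*}\; \langle v,\; H' \rangle}
     {\langle T_1 \parallel \mathtt{atomic}\ e \parallel T_2,\; H \rangle
      \;\longrightarrow\;
      \langle T_1 \parallel v \parallel T_2,\; H' \rangle}
\]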


International Conference on Data Engineering | 2010

Estimating the progress of MapReduce pipelines

Kristi Morton; Abram L. Friesen; Magdalena Balazinska; Dan Grossman

In parallel query-processing environments, accurate, time-oriented progress indicators could provide much utility given that inter- and intra-query execution times can have high variance. However, none of the techniques used by existing tools or available in the literature provide non-trivial progress estimation for parallel queries. In this paper, we introduce Parallax, the first such indicator. While several parallel data processing systems exist, the work in this paper targets environments where queries consist of a series of MapReduce jobs. Parallax builds on recently-developed techniques for estimating the progress of single-site SQL queries, but focuses on the challenges related to parallelism and variable execution speeds. We have implemented our estimator in the Pig system and demonstrate its performance through experiments with the PigMix benchmark and other queries running in a real, small-scale cluster.
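
A stripped-down version of the kind of estimate described above: weight each MapReduce job in the pipeline by its expected cost and report overall progress as the cost-weighted fraction of work completed. The stages, weights, and tuple counts below are invented to show the arithmetic only.

// A minimal sketch of cost-weighted pipeline progress, in the spirit of the
// estimator described above (numbers and field names are invented):
// overall progress = sum(weight_i * done_i/total_i) / sum(weight_i).
public class PipelineProgressSketch {
    static final class Stage {
        final String name;
        final double weight;     // expected relative cost of this MapReduce job
        final long processed;    // tuples processed so far
        final long total;        // tuples this job must process in total

        Stage(String name, double weight, long processed, long total) {
            this.name = name; this.weight = weight;
            this.processed = processed; this.total = total;
        }
    }

    public static void main(String[] args) {
        Stage[] pipeline = {
            new Stage("map-1",    1.0, 900_000, 1_000_000),  // almost done
            new Stage("reduce-1", 2.0, 250_000, 1_000_000),  // the expensive job
            new Stage("map-2",    0.5,       0,   400_000),  // not started
        };

        double weightedDone = 0.0, totalWeight = 0.0;
        for (Stage s : pipeline) {
            weightedDone += s.weight * ((double) s.processed / s.total);
            totalWeight  += s.weight;
        }
        // (1.0*0.9 + 2.0*0.25 + 0.5*0.0) / 3.5 = 0.4, i.e. 40% complete
        System.out.printf("estimated progress: %.0f%%%n", 100.0 * weightedDone / totalWeight);
    }
}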


International Symposium on Memory Management | 2004

Experience with safe manual memory-management in Cyclone

Michael Hicks; J. Gregory Morrisett; Dan Grossman; Trevor Jim

The goal of the Cyclone project is to investigate type safety for low-level languages such as C. Our most difficult challenge has been providing programmers control over memory management while retaining type safety. This paper reports on our experience trying to integrate and effectively use two previously proposed, type-safe memory management mechanisms: statically-scoped regions and unique pointers. We found that these typing mechanisms can be combined to build alternative memory-management abstractions, such as reference counted objects and arenas with dynamic lifetimes, and thus provide a flexible basis. Our experience---porting C programs and building new applications for resource-constrained systems---confirms that experts can use these features to improve memory footprint and sometimes to improve throughput when used instead of, or in combination with, conservative garbage collection.

Collaboration


Dive into Dan Grossman's collaborations.

Top Co-Authors

Luis Ceze, University of Washington
Adrian Sampson, University of Washington
Kristi Morton, University of Washington
Steve Zdancewic, University of Pennsylvania