
Publication


Featured research published by Michael Carbin.


Symposium on Operating Systems Principles | 2009

Automatically patching errors in deployed software

Jeff H. Perkins; Sunghun Kim; Samuel Larsen; Saman P. Amarasinghe; Jonathan Bachrach; Michael Carbin; Carlos Pacheco; Frank Sherwood; Stelios Sidiroglou; Greg Sullivan; Weng-Fai Wong; Yoav Zibin; Michael D. Ernst; Martin C. Rinard

We present ClearView, a system for automatically patching errors in deployed software. ClearView works on stripped Windows x86 binaries without any need for source code, debugging information, or other external information, and without human intervention. ClearView (1) observes normal executions to learn invariants that characterize the application's normal behavior, (2) uses error detectors to distinguish normal executions from erroneous executions, (3) identifies violations of learned invariants that occur during erroneous executions, (4) generates candidate repair patches that enforce selected invariants by changing the state or flow of control to make the invariant true, and (5) observes the continued execution of patched applications to select the most successful patch. ClearView is designed to correct errors in software with high availability requirements. Aspects of ClearView that make it particularly appropriate for this context include its ability to generate patches without human intervention, apply and remove patches to and from running applications without requiring restarts or otherwise perturbing the execution, and identify and discard ineffective or damaging patches by evaluating the continued behavior of patched applications. ClearView was evaluated in a Red Team exercise designed to test its ability to successfully survive attacks that exploit security vulnerabilities. A hostile external Red Team developed ten code injection exploits and used these exploits to repeatedly attack an application protected by ClearView. ClearView detected and blocked all of the attacks. For seven of the ten exploits, ClearView automatically generated patches that corrected the error, enabling the application to survive the attacks and continue on to successfully process subsequent inputs. Finally, the Red Team attempted to make ClearView apply an undesirable patch, but ClearView's patch evaluation mechanism enabled ClearView to identify and discard both ineffective patches and damaging patches.
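
The five-step loop is easiest to see in miniature. The sketch below, with hypothetical names (learn_invariants, repair) and Python values standing in for binary-level state, illustrates learning range invariants from normal runs and enforcing them to repair an erroneous one; it is an illustration of the idea, not ClearView's implementation.

```python
# A minimal, hypothetical sketch of ClearView's invariant-based repair loop,
# at the level of Python values rather than stripped x86 binaries. All names
# (learn_invariants, violated, repair) are illustrative, not ClearView's API.

def learn_invariants(normal_runs):
    """Learn one range invariant per observed variable (Daikon-style)."""
    bounds = {}
    for state in normal_runs:               # state: dict of variable -> value
        for var, val in state.items():
            lo, hi = bounds.get(var, (val, val))
            bounds[var] = (min(lo, val), max(hi, val))
    return bounds

def violated(bounds, state):
    """Return variables whose values fall outside the learned ranges."""
    return [v for v, val in state.items()
            if v in bounds and not bounds[v][0] <= val <= bounds[v][1]]

def repair(bounds, state):
    """Candidate patch: force each violated invariant true by clamping state."""
    for var in violated(bounds, state):
        lo, hi = bounds[var]
        state[var] = min(max(state[var], lo), hi)
    return state

# Training phase: observe normal executions.
normal = [{"len": 10, "idx": 3}, {"len": 64, "idx": 60}]
inv = learn_invariants(normal)

# Deployment: an error detector flags this run; enforce invariants to repair it.
bad = {"len": 9999999, "idx": -5}
print(repair(inv, bad))   # {'len': 64, 'idx': 3}
```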


Symposium on Principles of Database Systems | 2005

Context-sensitive program analysis as database queries

Monica S. Lam; John Whaley; V. Benjamin Livshits; Michael C. Martin; Dzintars Avots; Michael Carbin; Christopher Unkel

Program analysis has been increasingly used in software engineering tasks such as auditing programs for security vulnerabilities and finding errors in general. Such tools often require analyses much more sophisticated than those traditionally used in compiler optimizations. In particular, context-sensitive pointer alias information is a prerequisite for any sound and precise analysis that reasons about uses of heap objects in a program. Context-sensitive analysis is challenging because there are over 10^14 contexts in a typical large program, even after recursive cycles are collapsed. Moreover, pointers cannot be resolved in general without analyzing the entire program. This paper presents a new framework, based on the concept of deductive databases, for context-sensitive program analysis. In this framework, all program information is stored as relations; data access and analyses are written as Datalog queries. To handle the large number of contexts in a program, the database represents relations with binary decision diagrams (BDDs). The system we have developed, called bddbddb, automatically translates database queries into highly optimized BDD programs. Our preliminary experiences suggest that a large class of analyses involving heap objects can be described succinctly in Datalog and implemented efficiently with BDDs. To make developing application-specific analyses easy for programmers, we have also created a language called PQL that makes a subset of Datalog queries more intuitive to define. We have used the language to find many security holes in Web applications.
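
To make the framework concrete, here is a minimal sketch of a points-to analysis written as Datalog-style rules and solved by naive fixpoint iteration over Python sets; bddbddb itself stores relations as BDDs and optimizes query evaluation, which this toy omits.

```python
# A hypothetical sketch of the paper's idea: express points-to analysis as
# Datalog-style rules over relations, solved to a fixpoint. Relations are
# plain Python sets of tuples here, not BDDs.

# Input relations extracted from a tiny program:
#   x = new A(); y = x; z = y
new_stmt = {("x", "h1")}                  # new(var, heap-object)
assign = {("y", "x"), ("z", "y")}         # assign(dst, src)

# Rule 1: pointsTo(v, h) :- new(v, h).
# Rule 2: pointsTo(d, h) :- assign(d, s), pointsTo(s, h).
points_to = set(new_stmt)
changed = True
while changed:                            # naive fixpoint iteration
    changed = False
    for d, s in assign:
        for v, h in list(points_to):
            if v == s and (d, h) not in points_to:
                points_to.add((d, h))
                changed = True

print(sorted(points_to))  # [('x', 'h1'), ('y', 'h1'), ('z', 'h1')]
```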


Asian Symposium on Programming Languages and Systems | 2005

Using datalog with binary decision diagrams for program analysis

John Whaley; Dzintars Avots; Michael Carbin; Monica S. Lam

Many problems in program analysis can be expressed naturally and concisely in a declarative language like Datalog. This makes it easy to specify new analyses or extend or compose existing analyses. However, previous implementations of declarative languages perform poorly compared with traditional implementations. This paper describes bddbddb, a BDD-Based Deductive DataBase, which implements the declarative language Datalog with stratified negation, totally-ordered finite domains and comparison operators. bddbddb uses binary decision diagrams (BDDs) to efficiently represent large relations. BDD operations take time proportional to the size of the data structure, not the number of tuples in a relation, which leads to fast execution times. bddbddb is an effective tool for implementing a large class of program analyses. We show that a context-insensitive points-to analysis implemented with bddbddb is about twice as fast as a carefully hand-tuned version. The use of BDDs also allows us to solve heretofore unsolved problems, like context-sensitive pointer analysis for large programs.
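
The key property the abstract relies on, that BDD operation cost tracks structure size rather than tuple count, can be seen with a toy hash-consed BDD; the node representation below is an assumption for illustration, not bddbddb's.

```python
# A minimal sketch of why BDDs suit program-analysis relations: the BDD for
# the equality relation over two n-bit variables has O(n) nodes even though
# it encodes 2^n tuples. Nodes are hash-consed (var, low, high) tuples.

_unique = {}

def mk(var, low, high):
    """Create a hash-consed node; elide tests whose branches are equal."""
    if low == high:
        return low
    key = (var, low, high)
    return _unique.setdefault(key, key)

def eq_relation(n):
    """BDD for the relation {(x, y) : x == y} over n-bit x and y."""
    node = True                          # terminal: tuple is in the relation
    for i in reversed(range(n)):         # interleaved variable order x_i, y_i
        node = mk(("x", i), mk(("y", i), node, False),   # x_i = 0 => y_i = 0
                            mk(("y", i), False, node))   # x_i = 1 => y_i = 1
    return node

def size(node, seen=None):
    """Count distinct internal nodes in a BDD."""
    seen = set() if seen is None else seen
    if isinstance(node, bool) or node in seen:
        return 0
    seen.add(node)
    return 1 + size(node[1], seen) + size(node[2], seen)

n = 20
print(size(eq_relation(n)), "nodes encode", 2 ** n, "tuples")  # 60 vs 1048576
```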


Conference on Object-Oriented Programming Systems, Languages, and Applications | 2014

Chisel: reliability- and accuracy-aware optimization of approximate computational kernels

Sasa Misailovic; Michael Carbin; Sara Achour; Zichao Qi; Martin C. Rinard

The accuracy of an approximate computation is the distance between the result that the computation produces and the corresponding fully accurate result. The reliability of the computation is the probability that it will produce an acceptably accurate result. Emerging approximate hardware platforms provide approximate operations that, in return for reduced energy consumption and/or increased performance, exhibit reduced reliability and/or accuracy. We present Chisel, a system for reliability- and accuracy-aware optimization of approximate computational kernels that run on approximate hardware platforms. Given a combined reliability and/or accuracy specification, Chisel automatically selects approximate kernel operations to synthesize an approximate computation that minimizes energy consumption while satisfying its reliability and accuracy specification. We evaluate Chisel on five applications from the image processing, scientific computing, and financial analysis domains. The experimental results show that our implemented optimization algorithm enables Chisel to optimize our set of benchmark kernels to obtain energy savings from 8.7% to 19.8% compared to the fully reliable kernel implementations while preserving important reliability guarantees.
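
Chisel's selection problem can be sketched in miniature: pick the subset of kernel operations to approximate that maximizes energy savings while keeping the kernel's reliability above its specification. Chisel solves this with integer linear programming; the brute-force search and the numbers below are purely illustrative.

```python
# A hypothetical sketch of Chisel's core optimization question: which kernel
# operations may run on approximate hardware? Each approximated op saves
# energy but multiplies the kernel's reliability by some r < 1.

from itertools import combinations

# (name, energy saving, reliability if approximated) -- illustrative numbers.
ops = [("mul1", 5.0, 0.999), ("add1", 1.0, 0.9999),
       ("mul2", 5.0, 0.999), ("add2", 1.0, 0.9999)]
r_spec = 0.9985   # reliability spec: P(acceptably accurate result) >= 0.9985

best_saving, best_choice = 0.0, ()
for k in range(len(ops) + 1):
    for subset in combinations(ops, k):
        reliability = 1.0
        for _, _, r in subset:
            reliability *= r
        saving = sum(s for _, s, _ in subset)
        if reliability >= r_spec and saving > best_saving:
            best_saving, best_choice = saving, subset

# The spec rules out approximating both multiplies at once.
print([name for name, _, _ in best_choice], "saves", best_saving)
```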


Programming Language Design and Implementation | 2012

Proving acceptability properties of relaxed nondeterministic approximate programs

Michael Carbin; Deokhwan Kim; Sasa Misailovic; Martin C. Rinard

Approximate program transformations such as skipping tasks [29, 30], loop perforation [21, 22, 35], reduction sampling [38], multiple selectable implementations [3, 4, 16, 38], dynamic knobs [16], synchronization elimination [20, 32], approximate function memoization [11], and approximate data types [34] produce programs that can execute at a variety of points in an underlying performance versus accuracy tradeoff space. These transformed programs have the ability to trade accuracy of their results for increased performance by dynamically and nondeterministically modifying variables that control their execution. We call such transformed programs relaxed programs because they have been extended with additional nondeterminism to relax their semantics and enable greater flexibility in their execution. We present language constructs for developing and specifying relaxed programs. We also present proof rules for reasoning about the properties [28] that a program must satisfy to be acceptable. Our proof rules work with two kinds of acceptability properties: acceptability properties [28], which characterize desired relationships between the values of variables in the original and relaxed programs, and unary acceptability properties, which involve values only from a single (original or relaxed) program. The proof rules support a staged reasoning approach in which the majority of the reasoning effort works with the original program. Exploiting the common structure that the original and relaxed programs share, relational reasoning transfers reasoning effort from the original program to prove properties of the relaxed program. We have formalized the dynamic semantics of our target programming language and the proof rules in Coq and verified that the proof rules are sound with respect to the dynamic semantics. Our Coq implementation enables developers to obtain fully machine-checked verifications of their relaxed programs.
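
A minimal sketch of the setup, assuming loop perforation as the relaxing transformation: the relaxed program is nondeterministic, and the acceptability property bounds how far any of its executions may stray from the original. The paper proves such properties statically with machine-checked proof rules; this toy merely states one and samples executions.

```python
# A hypothetical sketch of an original program, a relaxed (perforated)
# version, and the acceptability property relating them.

import random

def original(xs):
    return sum(xs) / len(xs)                  # exact mean

def relaxed(xs, skip_rate=0.25):
    """Loop perforation: nondeterministically skip some iterations."""
    kept = [x for x in xs if random.random() > skip_rate] or xs[:1]
    return sum(kept) / len(kept)

# Acceptability property: every nondeterministic execution of the relaxed
# program produces a result within 10% of the original program's result.
xs = [random.uniform(0.0, 1.0) for _ in range(10_000)]
exact = original(xs)
for _ in range(100):                          # sample executions
    assert abs(relaxed(xs) - exact) <= 0.1 * abs(exact)
print("sampled executions satisfy the acceptability property")
```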


ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2007

Transactional collection classes

Brian D. Carlstrom; Austen McDonald; Michael Carbin; Christos Kozyrakis; Kunle Olukotun

While parallel programmers find it easier to reason about large atomic regions, the conventional mutual exclusion-based primitives for synchronization force them to interleave many small operations to achieve performance. Transactional memory promises that programmers can use large atomic regions while achieving similar performance. However, these large transactions can conflict when operating on shared data structures, even for logically independent operations. Transactional collection classes address this problem by allowing long-running transactions to operate on shared data while eliminating unnecessary conflicts. Transactional collection classes wrap existing data structures, without the need for custom implementations or knowledge of data structure internals. Without transactional collection classes, access to shared data from within long-running transactions can suffer from data dependency conflicts that are logically unnecessary, but are artifacts of the data structure implementation such as hash table collisions or tree-balancing rotations. Our transactional collection classes use the concept of semantic concurrency control to eliminate these unnecessary data dependencies, replacing them with conflict detection based on the operations of the abstract data type. The design and behavior of these transactional collection classes is discussed with reference to the related work from the database community such as multi-level transactions and semantic concurrency control, as well as other concurrent data structures such as java.util.concurrent. The required transactional semantics needed for implementing transactional collection classes are enumerated, including open-nested transactions and commit and abort handlers. We also discuss how isolation can be reduced for greater concurrency. Finally, we provide guidelines on the construction of classes that preserve isolation and serializability. The performance of these classes is evaluated with a number of benchmarks including targeted micro-benchmarks and a version of SPECjbb2000 with increased contention. The results show that easier-to-use long transactions can still allow programs to deliver scalable performance by simply wrapping existing data structures with transactional collection classes.
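
The idea of semantic concurrency control can be sketched with a toy transactional map whose conflict detection is defined on keys, the operations of the abstract data type, rather than on the underlying memory; all class and method names here are hypothetical.

```python
# A hypothetical sketch of semantic concurrency control for a transactional
# map: conflicts are detected on abstract operations (same key read and
# written) rather than on memory, so two transactions touching different
# keys never conflict even if they would collide in one hash bucket.

class TxMap:
    def __init__(self):
        self.data = {}

    def begin(self):
        return Tx(self)

class Tx:
    def __init__(self, m):
        self.m, self.reads, self.writes = m, set(), {}

    def get(self, key):
        self.reads.add(key)                    # semantic read set: keys
        return self.writes.get(key, self.m.data.get(key))

    def put(self, key, val):
        self.writes[key] = val                 # buffered until commit

    def conflicts_with(self, other):
        """Semantic conflict: some key written here is used there."""
        used = other.reads | other.writes.keys()
        return bool(self.writes.keys() & used)

    def commit(self):
        self.m.data.update(self.writes)

m = TxMap()
t1, t2 = m.begin(), m.begin()
t1.put("alice", 1)
t2.put("bob", 2)                               # may share a hash bucket...
print(t1.conflicts_with(t2))                   # ...but no semantic conflict
t1.commit(); t2.commit()
print(m.data)                                  # {'alice': 1, 'bob': 2}
```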


European Conference on Object-Oriented Programming | 2011

Detecting and escaping infinite loops with jolt

Michael Carbin; Sasa Misailovic; Michael W. Kling; Martin C. Rinard

Infinite loops can make applications unresponsive. Potential problems include lost work or output, denied access to application functionality, and a lack of responses to urgent events. We present Jolt, a novel system for dynamically detecting and escaping infinite loops. At the user's request, Jolt attaches to an application to monitor its progress. Specifically, Jolt records the program state at the start of each loop iteration. If two consecutive loop iterations produce the same state, Jolt reports to the user that the application is in an infinite loop. At the user's option, Jolt can then transfer control to a statement following the loop, thereby allowing the application to escape the infinite loop and ideally continue its productive execution. The immediate goal is to enable the application to execute long enough to save any pending work, finish any in-progress computations, or respond to any urgent events. We evaluated Jolt by applying it to detect and escape eight infinite loops in five benchmark applications. Jolt was able to detect seven of the eight infinite loops (the eighth changes the state on every iteration). We also evaluated the effect of escaping an infinite loop as an alternative to terminating the application. In all of our benchmark applications, escaping an infinite loop produced a more useful output than terminating the application. Finally, we evaluated how well escaping from an infinite loop approximated the correction that the developers later made to the application. For two out of our eight loops, escaping the infinite loop produced the same output as the corrected version of the application.
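
Jolt's detection rule is simple enough to sketch: snapshot the monitored state at the top of each iteration and flag the loop when two consecutive snapshots are identical. The sketch below monitors a Python loop's variables; Jolt itself attaches to compiled applications.

```python
# A hypothetical sketch of Jolt's detection rule: if two consecutive loop
# iterations start from an identical state, the loop can make no further
# progress, so escape past it.

class InfiniteLoop(Exception):
    pass

def monitored(loop_body, state, max_iters=1_000_000):
    prev = None
    for _ in range(max_iters):
        snapshot = tuple(sorted(state.items()))   # state at iteration start
        if snapshot == prev:
            raise InfiniteLoop(state)             # same state twice
        prev = snapshot
        if not loop_body(state):                  # body returns loop condition
            return state
    return state

# Buggy loop: `while i != n: i += step` with step == 0 never terminates.
def body(s):
    s["i"] += s["step"]
    return s["i"] != s["n"]

try:
    monitored(body, {"i": 0, "n": 10, "step": 0})
except InfiniteLoop:
    print("infinite loop detected; escaping to the code after the loop")
```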


International Symposium on Software Testing and Analysis | 2010

Automatically identifying critical input regions and code in applications

Michael Carbin; Martin C. Rinard

Applications that process complex inputs often react in different ways to changes in different regions of the input. Small changes to forgiving regions induce correspondingly small changes in the behavior and output. Small changes to critical regions, on the other hand, can induce disproportionally large changes in the behavior or output. Identifying the critical and forgiving regions in the input and the corresponding critical and forgiving regions of code is directly relevant to many software engineering tasks. We present a system, Snap, for automatically grouping related input bytes into fields and classifying each field and corresponding regions of code as critical or forgiving. Given an application and one or more inputs, Snap uses targeted input fuzzing in combination with dynamic execution and influence tracing to classify regions of input fields and code as critical or forgiving. Our experimental evaluation shows that Snap makes classifications with close to perfect precision (99%) and very good recall (between 99% and 73%, depending on the application).
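
A minimal sketch of the classification step, with a toy application and hand-chosen fields and threshold: fuzz one field at a time, re-run, and compare outputs. Snap additionally groups bytes into fields automatically and traces influence through the execution, which this omits.

```python
# A hypothetical sketch of Snap's classification step: fuzz each input
# field, re-run the application, and measure how much the output changes.
# Fields whose perturbation barely moves the output are forgiving; fields
# that cause large output changes are critical.

import random

def app(inp):
    """Toy image-like app: header scales everything, payload is averaged."""
    scale = inp[0]                       # byte 0: critical header field
    payload = inp[1:]                    # bytes 1..n: forgiving pixel data
    return scale * sum(payload) / len(payload)

def classify(inp, fields, trials=200, threshold=0.2):
    baseline = app(inp)
    labels = {}
    for name, (lo, hi) in fields.items():
        deltas = []
        for _ in range(trials):
            fuzzed = list(inp)
            for i in range(lo, hi):      # fuzz only this field's bytes
                fuzzed[i] = random.randrange(256)
            deltas.append(abs(app(fuzzed) - baseline) / (abs(baseline) + 1e-9))
        mean = sum(deltas) / trials
        labels[name] = "critical" if mean > threshold else "forgiving"
    return labels

inp = [5] + [128] * 64
print(classify(inp, {"header": (0, 1), "pixels": (1, 65)}))
# {'header': 'critical', 'pixels': 'forgiving'}
```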


Generative Programming and Component Engineering | 2006

Reflective program generation with patterns

Manuel Fähndrich; Michael Carbin; James R. Larus

Runtime reflection facilities, as present in Java and .NET, are powerful mechanisms for inspecting existing code and metadata, as well as generating new code and metadata on the fly. Such power does come at a high price though. The runtime reflection support in Java and .NET imposes a cost on all programs, whether they use reflection or not, simply by the necessity of keeping all metadata around and the inability to optimize code because of future possible code changes. A second---often overlooked---cost is the difficulty of writing correct reflection code to inspect or emit new metadata and code and the risk that the emitted code is not well-formed. In this paper we examine a subclass of problems that can be addressed using a simpler mechanism than runtime reflection, which we call compile-time reflection. We argue for a high-level construct called a transform that allows programmers to write inspection and generation code in a pattern matching and template style, avoiding at the same time the complexities of reflection APIs and providing the benefits of staged compilation in that the generated code and metadata is known to be well-formed and type safe ahead of time.
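
The transform style can be approximated in Python, with the caveat that Python generates and checks the code dynamically, whereas the paper's transforms guarantee well-formedness ahead of time. The decorator name and template below are hypothetical.

```python
# A hypothetical sketch of the transform idea in Python terms: inspect a
# class's declared fields once, up front, and generate specialized code
# from a template, instead of reflecting over fields on every call.

def derive_to_dict(cls):
    """Pattern: match every annotated field; template: emit one dict entry."""
    fields = list(getattr(cls, "__annotations__", {}))
    body = ", ".join(f"'{f}': self.{f}" for f in fields)
    src = f"def to_dict(self):\n    return {{{body}}}\n"
    namespace = {}
    exec(src, namespace)                  # generate the method once, up front
    cls.to_dict = namespace["to_dict"]
    return cls

@derive_to_dict
class Point:
    x: int
    y: int
    def __init__(self, x, y):
        self.x, self.y = x, y

print(Point(1, 2).to_dict())              # {'x': 1, 'y': 2}
```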


International Conference on Software Engineering | 2012

Automatic input rectification

Fan Long; Vijay Ganesh; Michael Carbin; Stelios Sidiroglou; Martin C. Rinard

We present a novel technique, automatic input rectification, and a prototype implementation, SOAP. SOAP learns a set of constraints characterizing typical inputs that an application is highly likely to process correctly. When given an atypical input that does not satisfy these constraints, SOAP automatically rectifies the input (i.e., changes the input so that it satisfies the learned constraints). The goal is to automatically convert potentially dangerous inputs into typical inputs that the program is highly likely to process correctly. Our experimental results show that, for a set of benchmark applications (Google Picasa, ImageMagick, VLC, Swfdec, and Dillo), this approach effectively converts malicious inputs (which successfully exploit vulnerabilities in the application) into benign inputs that the application processes correctly. Moreover, a manual code analysis shows that, if an input does satisfy the learned constraints, it is incapable of exploiting these vulnerabilities. We also present the results of a user study designed to evaluate the subjective perceptual quality of outputs from benign but atypical inputs that have been automatically rectified by SOAP to conform to the learned constraints. Specifically, we obtained benign inputs that violate learned constraints, used our input rectifier to obtain rectified inputs, then paid Amazon Mechanical Turk users to provide their subjective qualitative perception of the difference between the outputs from the original and rectified inputs. The results indicate that rectification can often preserve much, and in many cases all, of the desirable data in the original input.
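
A sketch of the rectification pipeline under strong simplifying assumptions: constraints are per-field ranges learned from typical inputs, and rectification clamps violating fields. SOAP learns richer constraints over real input formats such as images and videos.

```python
# A hypothetical sketch of input rectification in the spirit of SOAP: learn
# simple range constraints per field from inputs the application handles
# correctly, then force atypical inputs back inside the learned ranges.

def learn(training_inputs):
    """One (min, max) constraint per field, from typical inputs."""
    fields = training_inputs[0].keys()
    return {f: (min(i[f] for i in training_inputs),
                max(i[f] for i in training_inputs)) for f in fields}

def rectify(constraints, inp):
    """Clamp each field that violates its learned constraint."""
    return {f: min(max(v, constraints[f][0]), constraints[f][1])
            for f, v in inp.items()}

typical = [{"width": 640, "height": 480, "depth": 24},
           {"width": 1920, "height": 1080, "depth": 32}]
c = learn(typical)

# A malicious input with an implausible width (e.g., to overflow a buffer)
# is rectified into a benign one before the application ever sees it.
evil = {"width": 2**31 - 1, "height": 600, "depth": 24}
print(rectify(c, evil))   # {'width': 1920, 'height': 600, 'depth': 24}
```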

Collaboration


Dive into Michael Carbin's collaborations.

Top Co-Authors

Martin C. Rinard, Massachusetts Institute of Technology
Sasa Misailovic, Massachusetts Institute of Technology
Saman P. Amarasinghe, Massachusetts Institute of Technology
Deokhwan Kim, Massachusetts Institute of Technology
James R. Larus, École Polytechnique Fédérale de Lausanne
Armando Solar-Lezama, Massachusetts Institute of Technology
Brett Boston, Massachusetts Institute of Technology
Carlos Pacheco, Massachusetts Institute of Technology