Michael Y. Levin | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Michael Y. Levin is active.

Explore More

Publication

Featured researches published by Michael Y. Levin.

programming language design and implementation | 2008

Grammar-based whitebox fuzzing

Patrice Godefroid; Adam Kiezun; Michael Y. Levin

Whitebox fuzzing is a form of automatic dynamic test generation, based on symbolic execution and constraint solving, designed for security testing of large applications. Unfortunately, the current effectiveness of whitebox fuzzing is limited when testing applications with highly-structured inputs, such as compilers and interpreters. These applications process their inputs in stages, such as lexing, parsing and evaluation. Due to the enormous number of control paths in early processing stages, whitebox fuzzing rarely reaches parts of the application beyond those first stages. In this paper, we study how to enhance whitebox fuzzing of complex structured-input applications with a grammar-based specification of their valid inputs. We present a novel dynamic test generation algorithm where symbolic execution directly generates grammar-based constraints whose satisfiability is checked using a custom grammar-based constraint solver. We have implemented this algorithm and evaluated it on a large security-critical application, the JavaScript interpreter of Internet Explorer 7 (IE7). Results of our experiments show that grammar-based whitebox fuzzing explores deeper program paths and avoids dead-ends due to non-parsable inputs. Compared to regular whitebox fuzzing, grammar-based whitebox fuzzing increased coverage of the code generation module of the IE7 JavaScript interpreter from 53% to 81% while using three times fewer tests.

embedded software | 2008

Active property checking

Patrice Godefroid; Michael Y. Levin; David Molnar

Runtime property checking (as implemented in tools like Purify or Valgrind) checks whether a program execution satisfies a property. Active property checking extends runtime checking by checking whether the property is satisfied by all program executions that follow the same program path. This check is performed on a symbolic execution of the given program path using a constraint solver. If the check fails, the constraint solver generates an alternative program input triggering a new program execution that follows the same program path but exhibits a property violation. Combined with systematic dynamic test generation, which attempts to exercise all feasible paths in a program, active property checking defines a new form of dynamic software model checking (program verification). In this paper, we formalize and study active property checking. We show how static and dynamic type checking can be extended with active type checking. Then, we discuss how to implement active property checking efficiently. Finally, we discuss results of experiments with media playing applications on Windows, where active property checking was able to detect several new security-related bugs.

Journal of Functional Programming | 2002

Recursive subtyping revealed

Vladimir Gapeyev; Michael Y. Levin; Benjamin C. Pierce

Algorithms for checking subtyping between recursive types lie at the core of many programming language implementations. But the fundamental theory of these algorithms and how they relate to simpler declarative specifications is not widely understood, due in part to the difficulty of the available introductions to the area. This tutorial paper offers an ‘end-to-end’ introduction to recursive types and subtyping algorithms, from basic theory to efficient implementation, set in the unifying mathematical framework of coinduction.

international symposium on software testing and analysis | 2009

Precise pointer reasoning for dynamic test generation

Bassem Elkarablieh; Patrice Godefroid; Michael Y. Levin

Dynamic test generation consists of executing a program while gathering symbolic constraints on inputs from predicates encountered in branch statements, and of using a constraint solver to infer new program inputs from previous constraints in order to steer next executions towards new program paths. Variants of this technique have recently been adopted in several bug detection tools, including our whitebox fuzzer SAGE, which has found dozens of new expensive security-related bugs in many Windows applications and is now routinely used in various Microsoft groups. In this paper, we discuss how to perform precise symbolic pointer reasoning in the context of dynamic test generation. We present a new memory model for representing arbitrary symbolic pointer dereferences to memory regions accessible by a program during its execution, and show that this memory model is the most precise one can hope for in our context, under some realistic assumptions. We also describe how the symbolic constraints generated by our model can be solved using modern SMT solvers, which provide powerful constructs for reasoning about bit-vectors and arrays. This new memory model has been implemented in SAGE, and we present results of experiments with several large Windows applications showing that an increase in precision can often be obtained at a reasonable cost. Better precision in symbolic pointer reasoning means more relevant constraints and fewer imprecise ones, hence better test coverage, more bugs found and fewer redundant test cases.

Journal of Functional Programming | 2003

TinkerType: a language for playing with formal systems

Michael Y. Levin; Benjamin C. Pierce

TinkerType is a pragmatic framework for compact and modular description of formal systems (type systems, operational semantics, logics, etc.). A family of related systems is broken down into a set of clauses – individual inference rules – and a set of features controlling the inclusion of clauses in particular systems. Simple static checks are used to help maintain consistency of the generated systems. We present TinkerType and its implementation and describe its application to two substantial repositories of typed lambda-calculi. The first repository covers a broad range of typing features, including subtyping, polymorphism, type operators and kinding, computational effects, and dependent types. It describes both declarative and algorithmic aspects of the systems, and can be used with our tool, the TinkerType Assembler, to generate calculi either in the form of typeset collections of inference rules or as executable ML typecheckers. The second repository addresses a smaller collection of systems, and provides modularized proofs of basic safety properties.

database programming languages | 2005

Type-based optimization for regular patterns

Michael Y. Levin; Benjamin C. Pierce

Pattern matching mechanisms based on regular expressions feature in a number of recent languages for processing XML. The flexibility of these mechanisms demands novel approaches to the familiar problems of pattern-match compilation—how to minimize the number of tests performed during pattern matching while keeping the size of the output code small. We describe semantic compilation methods in which we use the schema of the value flowing into a pattern matching expression to generate efficient target code. We start by discussing a pragmatic algorithm used currently in the compiler of Xtatic and report some preliminary performance results. For a more fundamental analysis, we define an optimality criterion of “no useless tests” and show that it is not satisfied by Xtatic’s algorithm. We constructively demonstrate that the problem of generating optimal pattern matching code is decidable for finite (non-recursive) patterns.

compiler construction | 2005

XML goes native: run-time representations for XTATIC

Vladimir Gapeyev; Michael Y. Levin; Benjamin C. Pierce; Alan Schmitt

Xtatic is a lightweight extension of C# offering native support for statically typed XML processing. XML trees are built-in values in Xtatic, and static analysis of the trees manipulated by programs is part of the ordinary job of the typechecker. “Tree grep” pattern matching is used to investigate and transform XML trees. Xtatics surface syntax and type system are tightly integrated with those of C#. Beneath the hood, however, an implementation of Xtatic must address a number of issues common to any language supporting a declarative style of XML processing (e.g., XQuery, XSLT, XDuce, CDuce, Xact, Xen, etc.). In particular, it must provide representations for XML tags, trees, and textual data that use memory efficiently, support efficient pattern matching, allow maximal sharing of common substructures, and permit separate compilation. We analyze these representation choices in detail and describe the solutions used by the Xtatic compiler.

international conference on functional programming | 2003

Compiling regular patterns

Michael Y. Levin

Pattern matching mechanisms based on regular expressions feature in a number of recent languages for processing tree-structured data such as XML. A compiler for such a language must address not only the familiar problems of pattern optimization for ML-style algebraic datatypes and pattern matching, but also some new ones, arising principally from the use of recursion in patterns. We identify several factors playing a significant role in the quality of the generated code, propose two pattern compilers---one generating backtracking target programs, the other non-backtracking---and sketch proofs of their correctness.

network and distributed system security symposium | 2008