Stephen McCamant
University of Minnesota
Network
Latest external collaboration on country level. Dive into details by clicking on the dots.
Publication
Featured researches published by Stephen McCamant.
Science of Computer Programming | 2007
Michael D. Ernst; Jeff H. Perkins; Philip J. Guo; Stephen McCamant; Carlos Pacheco; Matthew S. Tschantz; Chen Xiao
Daikon is an implementation of dynamic detection of likely invariants; that is, the Daikon invariant detector reports likely program invariants. An invariant is a property that holds at a certain point or points in a program; these are often used in assert statements, documentation, and formal specifications. Examples include being constant (x=a), non-zero (x 0), being in a range (a@?x@?b), linear relationships (y=ax+b), ordering (x@?y), functions from a library (x=fn(y)), containment (x@?y), sortedness (xissorted), and many more. Users can extend Daikon to check for additional invariants. Dynamic invariant detection runs a program, observes the values that the program computes, and then reports properties that were true over the observed executions. Dynamic invariant detection is a machine learning technique that can be applied to arbitrary data. Daikon can detect invariants in C, C++, Java, and Perl programs, and in record-structured data sources; it is easy to extend Daikon to other applications. Invariants can be useful in program understanding and a host of other applications. Daikons output has been used for generating test cases, predicting incompatibilities in component integration, automating theorem proving, repairing inconsistent data structures, and checking the validity of data streams, among other tasks. Daikon is freely available in source and binary form, along with extensive documentation, at http://pag.csail.mit.edu/daikon/.
ieee symposium on security and privacy | 2010
Prateek Saxena; Devdatta Akhawe; Steve Hanna; Feng Mao; Stephen McCamant; Dawn Song
As AJAX applications gain popularity, client-side JavaScript code is becoming increasingly complex. However, few automated vulnerability analysis tools for JavaScript exist. In this paper, we describe the first system for exploring the execution space of JavaScript code using symbolic execution. To handle JavaScript code’s complex use of string operations, we design a new language of string constraints and implement a solver for it. We build an automatic end-to-end tool, Kudzu, and apply it to the problem of finding client-side code injection vulnerabilities. In experiments on 18 live web applications, Kudzu automatically discovers 2 previously unknown vulnerabilities and 9 more that were previously found only with a manually-constructed test suite.
international symposium on software testing and analysis | 2009
Prateek Saxena; Pongsin Poosankam; Stephen McCamant; Dawn Song
Mixed concrete and symbolic execution is an important technique for finding and understanding software bugs, including security-relevant ones. However, existing symbolic execution techniques are limited to examining one execution path at a time, in which symbolic variables reflect only direct data dependencies. We introduce loop-extended symbolic execution, a generalization that broadens the coverage of symbolic results in programs with loops. It introduces symbolic variables for the number of times each loop executes, and links these with features of a known input grammar such as variable-length or repeating fields. This allows the symbolic constraints to cover a class of paths that includes different numbers of loop iterations, expressing loop-dependent program values in terms of properties of the input. By performing more reasoning symbolically, instead of by undirected exploration, applications of loop-extended symbolic execution can achieve better results and/or require fewer program executions. To demonstrate our technique, we apply it to the problem of discovering and diagnosing buffer-overflow vulnerabilities in software given only in binary form. Our tool finds vulnerabilities in both a standard benchmark suite and 3 real-world applications, after generating only a handful of candidate inputs, and also diagnoses general vulnerability conditions.
computer and communications security | 2009
Min Gyung Kang; Heng Yin; Steve Hanna; Stephen McCamant; Dawn Song
The authors of malware attempt to frustrate reverse engineering and analysis by creating programs that crash or otherwise behave differently when executed on an emulated platform than when executed on real hardware. In order to defeat such techniques and facilitate automatic and semi-automatic dynamic analysis of malware, we propose an automated technique to dynamically modify the execution of a whole-system emulator to fool a malware samples anti-emulation checks. Our approach uses a scalable trace matching algorithm to locate the point where emulated execution diverges, and then compares the states of the reference system and the emulator to create a dynamic state modification that repairs the difference. We evaluate our technique by building an implementation into an emulator used for in-depth malware analysis. On case studies that include real samples of malware collected in the wild and an attack that has not yet been exploited, our tool automatically ameliorates the malware samples anti-emulation checks to enable analysis, and its modifications are robust to system changes.
acm workshop on programming languages and analysis for security | 2009
James Newsome; Stephen McCamant; Dawn Song
The channel capacity of a program is a quantitative measure of the amount of control that the inputs to a program have over its outputs. Because it corresponds to worst-case assumptions about the probability distribution over those inputs, it is particularly appropriate for security applications where the inputs are under the control of an adversary. We introduce a family of complementary techniques for measuring channel capacity automatically using a decision procedure (SAT or #SAT solver), which give either exact or narrow probabilistic bounds. We then apply these techniques to the problem of analyzing false positives produced by dynamic taint analysis used to detect control-flow hijacking in commodity software. Dynamic taint analysis is based on the principle that an attacker should not be able to control values such as function pointers and return addresses, but it uses a simple binary approximation of control that commonly leads to both false positive and false negative errors. Based on channel capacity, we propose a more refined quantitative measure of influence, which can effectively distinguish between true attacks and false positives. We use a practical implementation of our influence measuring techniques, integrated with a dynamic taint analysis operating on x86 binaries, to classify tainting warnings produced by vulnerable network servers, such as those attacked by the Blaster and SQL Slammer worms. Influence measurement correctly distinguishes real attacks from tainting false positives, a task that would otherwise need to be done manually.
foundations of software engineering | 2003
Stephen McCamant; Michael D. Ernst
We present a new, automatic technique to assess whether replacing a component of a software system by a purportedly compatible component may change the behavior of the system. The technique operates before integrating the new component into the system or running system tests, permitting quicker and cheaper identification of problems. It takes into account the systems use of the component, because a particular component upgrade may be desirable in one context but undesirable in another. No formal specifications are required, permitting detection of problems due either to errors in the component or to errors in the system. Both external and internal behaviors can be compared, enabling detection of problems that are not immediately reflected in the output.The technique generates an operational abstraction for the old component in the context of the system and generates an operational abstraction for the new component in the context of its test suite; an operational abstraction is a set of program properties that generalizes over observed run-time behavior. If automated logical comparison indicates that the new component does not make all the guarantees that the old one did, then the upgrade may affect system behavior and should not be performed without further scrutiny. In case studies, the technique identified several incompatibilities among software components.
international symposium on software testing and analysis | 2006
Philip J. Guo; Jeff H. Perkins; Stephen McCamant; Michael D. Ernst
An abstract type groups variables that are used for related purposes in a program. We describe a dynamic unification-based analysis for inferring abstract types. Initially, each run-time value gets a unique abstract type. A run-time interaction among values indicates that they have the same abstract type, so their abstract types are unified. Also at run time, abstract types for variables are accumulated from abstract types for values. The notion of interaction may be customized, permitting the analysis to compute finer or coarser abstract types; these different notions of abstract type are useful for different tasks. We have implemented the analysis for compiled x86 binaries and for Java bytecodes. Our experiments indicate that the inferred abstract types are useful for program comprehension, improve both the results and the run time of a follow-on program analysis, and are more precise than the output of a comparable static analysis, without suffering from overfitting.
international symposium on software testing and analysis | 2011
Domagoj Babić; Lorenzo Martignoni; Stephen McCamant; Dawn Song
We present a new technique for exploiting static analysis to guide dynamic automated test generation for binary programs, prioritizing the paths to be explored. Our technique is a three-stage process, which alternates dynamic and static analysis. In the first stage, we run dynamic analysis with a small number of seed tests to resolve indirect jumps in the binary code and build a visibly pushdown automaton (VPA) reflecting the global control-flow of the program. Further, we augment the computed VPA with statically computable jumps not executed by the seed tests. In the second stage, we apply static analysis to the inferred automaton to find potential vulnerabilities, i.e., targets for the dynamic analysis. In the third stage, we use the results of the prior phases to assign weights to VPA edges. Our symbolic-execution based automated test generation tool then uses the weighted shortest-path lengths in the VPA to direct its exploration to the target potential vulnerabilities. Preliminary experiments on a suite of benchmarks extracted from real applications show that static analysis allows exploration to reach vulnerabilities it otherwise would not, and the generated test inputs prove that the static warnings indicate true positives.
ieee symposium on security and privacy | 2011
Noah M. Johnson; Juan Caballero; Kevin Zhijie Chen; Stephen McCamant; Pongsin Poosankam; Daniel Reynaud; Dawn Song
A security analyst often needs to understand two runs of the same program that exhibit a difference in program state or output. This is important, for example, for vulnerability analysis, as well as for analyzing a malware program that features different behaviors when run in different environments. In this paper we propose a differential slicing approach that automates the analysis of such execution differences. Differential slicing outputs a causal difference graph that captures the input differences that triggered the observed difference and the causal path of differences that led from those input differences to the observed difference. The analyst uses the graph to quickly understand the observed difference. We implement differential slicing and evaluate it on the analysis of 11 real-world vulnerabilities and 2 malware samples with environment-dependent behaviors. We also evaluate it in an informal user study with two vulnerability analysts. Our results show that differential slicing successfully identifies the input differences that caused the observed difference and that the causal difference graph significantly reduces the amount of time and effort required for an analyst to understand the observed difference.
architectural support for programming languages and operating systems | 2012
Lorenzo Martignoni; Stephen McCamant; Pongsin Poosankam; Dawn Song; Petros Maniatis
Processor emulators are widely used to provide isolation and instrumentation of binary software. However they have proved difficult to implement correctly: processor specifications have many corner cases that are not exercised by common workloads. It is untenable to base other system security properties on the correctness of emulators that have received only ad-hoc testing. To obtain emulators that are worthy of the required trust, we propose a technique to explore a high-fidelity emulator with symbolic execution, and then lift those test cases to test a lower-fidelity emulator. The high-fidelity emulator serves as a proxy for the hardware specification, but we can also further validate by running the tests on real hardware. We implement our approach and apply it to generate about 610,000 test cases; for about 95% of the instructions we achieve complete path coverage. The tests reveal thousands of individual differences; we analyze those differences to shed light on a number of root causes, such as atomicity violations and missing security features.