Gogul Balakrishnan
Princeton University
Publication
Featured research published by Gogul Balakrishnan.
compiler construction | 2004
Gogul Balakrishnan; Thomas W. Reps
This paper concerns static-analysis algorithms for analyzing x86 executables. The aim of the work is to recover intermediate representations that are similar to those that can be created for a program written in a high-level language. Our goal is to perform this task for programs such as plugins, mobile code, worms, and virus-infected code. For such programs, symbol-table and debugging information is either entirely absent, or cannot be relied upon if present; hence, the technique described in the paper makes no use of symbol-table/debugging information. Instead, an analysis is carried out to recover information about the contents of memory locations and how they are manipulated by the executable.
verified software: theories, tools, experiments | 2005
Gogul Balakrishnan; Thomas W. Reps; David Melski; Tim Teitelbaum
What You See Is Not What You eXecute: computers do not execute source-code programs; they execute machine-code programs that are generated from source code. Not only can the WYSINWYX phenomenon create a mismatch between what a programmer intends and what is actually executed by the processor, it can cause analyses that are performed on source code to fail to detect certain bugs and vulnerabilities. This issue arises regardless of whether one's favorite approach to assuring that programs behave as desired is based on theorem proving, model checking, or abstract interpretation.
ACM Transactions on Programming Languages and Systems | 2010
Gogul Balakrishnan; Thomas W. Reps
Over the last seven years, we have developed static-analysis methods to recover a good approximation to the variables and dynamically allocated memory objects of a stripped executable, and to track the flow of values through them. The article presents the algorithms that we developed, explains how they are used to recover Intermediate Representations (IRs) from executables that are similar to the IRs that would be available if one started from source code, and describes their application in the context of program understanding and automated bug hunting. Unlike algorithms for analyzing executables that existed prior to our work, the ones presented in this article provide useful information about memory accesses, even in the absence of debugging information. The ideas described in the article are incorporated in a tool for analyzing Intel x86 executables, called CodeSurfer/x86. CodeSurfer/x86 builds a system dependence graph for the program, and provides a GUI for exploring the graph by (i) navigating its edges, and (ii) invoking operations, such as forward slicing, backward slicing, and chopping, to discover how parts of the program can impact other parts. To assess the usefulness of the IRs recovered by CodeSurfer/x86 in the context of automated bug hunting, we built a tool on top of CodeSurfer/x86, called Device-Driver Analyzer for x86 (DDA/x86), which analyzes device-driver executables for bugs. Without the benefit of either source code or symbol-table/debugging information, DDA/x86 was able to find known bugs (that had been discovered previously by source-code analysis tools), along with useful error traces, while having a low false-positive rate. DDA/x86 is the first known application of program analysis/verification techniques to industrial executables.
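The slicing and chopping operations mentioned above can be illustrated on a toy dependence graph. The following is a minimal sketch, not CodeSurfer/x86's implementation: it treats slicing as plain graph reachability and ignores the calling-context matching that slicing over a real system dependence graph requires. The graph `deps` and all function names are hypothetical.

```python
from collections import defaultdict


def reachable(edges, start):
    """Return every node reachable from `start` by following `edges`."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for succ in edges.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def reverse(edges):
    """Build the graph with every edge direction flipped."""
    rev = defaultdict(set)
    for src, dsts in edges.items():
        for dst in dsts:
            rev[dst].add(src)
    return rev


def forward_slice(edges, node):
    """Nodes that `node` can affect."""
    return reachable(edges, node)


def backward_slice(edges, node):
    """Nodes that can affect `node`."""
    return reachable(reverse(edges), node)


def chop(edges, source, target):
    """Nodes through which `source` can affect `target`."""
    return forward_slice(edges, source) & backward_slice(edges, target)


# Hypothetical dependence graph: an edge a -> b means "b depends on a".
deps = {
    "a": {"b", "c"},
    "b": {"d"},
    "c": {"e"},
    "d": {"f"},
    "e": set(),
    "f": set(),
}

print(sorted(chop(deps, "a", "d")))
```

Here the chop from `a` to `d` keeps only the nodes on dependence paths between the two, which is the kind of question "how can this part of the program impact that part" asks.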
compiler construction | 2005
Gogul Balakrishnan; Radu Gruian; Thomas W. Reps; Tim Teitelbaum
CodeSurfer/x86 is a prototype system for analyzing x86 executables. It uses a static-analysis algorithm called value-set analysis (VSA) to recover intermediate representations that are similar to those that a compiler creates for a program written in a high-level language. A major challenge in building an analysis tool for executables is in providing useful information about operations involving memory. This is difficult when symbol-table and debugging information is absent or untrusted. CodeSurfer/x86 overcomes these challenges to provide an analyst with a powerful and flexible platform for investigating the properties and behaviors of potentially malicious code (such as COTS components, plugins, mobile code, worms, Trojans, and virus-infected code) using (i) CodeSurfer/x86's GUI, (ii) CodeSurfer/x86's scripting language, which provides access to all of the intermediate representations that CodeSurfer/x86 builds for the executable, and (iii) GrammaTech's Path Inspector, which is a tool that uses a sophisticated pattern-matching engine to answer questions about the flow of execution in a program.
verification model checking and abstract interpretation | 2007
Gogul Balakrishnan; Thomas W. Reps
This paper addresses the problem of recovering variable-like entities when analyzing executables in the absence of debugging information. We show that variable-like entities can be recovered by iterating Value-Set Analysis (VSA), a combined numeric-analysis and pointer-analysis algorithm, and Aggregate Structure Identification, an algorithm to identify the structure of aggregates. Our initial experiments show that the technique is successful in correctly identifying 88% of the local variables and 89% of the fields of heap-allocated objects. Previous techniques recovered 83% of the local variables, but 0% of the fields of heap-allocated objects. Moreover, the values computed by VSA using the variables recovered by our algorithm would allow any subsequent analysis to do a better job of interpreting instructions that use indirect addressing to access arrays and heap-allocated data objects: indirect operands can be resolved better at 4% to 39% of the sites of writes and up to 8% of the sites of reads. (These are the memory-access operations for which it is the most difficult for an analyzer to obtain useful results.)
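The numeric side of such an analysis can be illustrated with a strided interval, a numeric abstraction commonly associated with VSA that records a stride together with lower and upper bounds. The sketch below is an assumption-laden illustration rather than the paper's algorithm: the class name, field names, and the choice to ignore machine-word wraparound are all hypothetical simplifications.

```python
from dataclasses import dataclass
from math import gcd


@dataclass(frozen=True)
class StridedInterval:
    """Represents {lo, lo + stride, ..., hi}; stride 0 means the single value lo."""
    stride: int
    lo: int
    hi: int

    def values(self):
        """Concrete values represented (ignores machine-word wraparound)."""
        if self.stride == 0:
            return {self.lo}
        return set(range(self.lo, self.hi + 1, self.stride))

    def join(self, other):
        """Smallest strided interval in this sketch that covers both operands."""
        lo = min(self.lo, other.lo)
        hi = max(self.hi, other.hi)
        # The new stride must divide both old strides and the offset
        # between the two lower bounds, so every old value stays representable.
        stride = gcd(gcd(self.stride, other.stride), abs(self.lo - other.lo))
        return StridedInterval(stride, lo, hi)


# Hypothetical offsets of two accesses to an array of 8-byte elements.
first = StridedInterval(0, 4, 4)
second = StridedInterval(0, 12, 12)
print(first.join(second))
```

Keeping the stride is what lets an analysis conclude that an indirect operand touches only every eighth byte of a region, rather than every byte between the bounds.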
static analysis symposium | 2006
Gogul Balakrishnan; Thomas W. Reps
In this paper, we present an abstraction for heap-allocated storage, called the recency-abstraction, that allows abstract-interpretation algorithms to recover some non-trivial information for heap-allocated data objects. As an application of the recency-abstraction, we show how it can resolve virtual-function calls in stripped executables (i.e., executables from which debugging information has been removed). This approach succeeded in resolving 55% of virtual-function call-sites, whereas previous tools for analyzing executables fail to resolve any of the virtual-function call-sites.
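The idea behind the recency-abstraction, as it is usually described, is to keep two abstract blocks per allocation site: one for the most recently allocated block, which stands for a single concrete object and can therefore be updated strongly, and one summary for all older blocks. The following is a minimal sketch under that reading; the class, method, and site names are hypothetical, and a real analysis would track far more than sets of field values.

```python
class RecencyHeap:
    """Toy abstract heap with two abstract blocks per allocation site."""

    def __init__(self):
        self.mrab = {}   # site -> {field: set of values}, most recent block
        self.nmrab = {}  # site -> {field: set of values}, summary of older blocks

    def allocate(self, site):
        """Fold the previous most-recent block into the summary, then start fresh."""
        old = self.mrab.get(site)
        if old is not None:
            summary = self.nmrab.setdefault(site, {})
            for field, vals in old.items():
                summary.setdefault(field, set()).update(vals)
        self.mrab[site] = {}

    def write_recent(self, site, field, value):
        """Strong update: the most recent block is one concrete object."""
        self.mrab[site][field] = {value}


# Hypothetical loop that allocates objects at one site and
# initializes the virtual-function-table pointer of each.
heap = RecencyHeap()
for _ in range(3):
    heap.allocate("site_new_A")
    heap.write_recent("site_new_A", "vptr", "vtable_A")

print(heap.mrab["site_new_A"], heap.nmrab["site_new_A"])
```

Because each block is initialized by a strong update before it is folded into the summary, the summary's `vptr` field never picks up an "uninitialized" value, which is what makes it possible to resolve a virtual call through either block.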
partial evaluation and semantic-based program manipulation | 2006
Thomas W. Reps; Gogul Balakrishnan; Junghee Lim
The goal of our work is to create tools that an analyst can use to understand the workings of COTS components, plugins, mobile code, and DLLs, as well as memory snapshots of worms and virus-infected code. This paper describes how static analysis provides techniques that can be used to recover intermediate representations that are similar to those that can be created for a program written in a high-level language.
computer aided verification | 2005
Akash Lal; Thomas W. Reps; Gogul Balakrishnan
Recent work on weighted-pushdown systems shows how to generalize interprocedural-dataflow analysis to answer “stack-qualified queries”, which answer the question “what dataflow values hold at a program node for a particular set of calling contexts?” The generalization, however, does not account for precise handling of local variables. Extended-weighted-pushdown systems address this issue, and provide answers to stack-qualified queries in the presence of local variables as well.
computer aided verification | 2005
Gogul Balakrishnan; Thomas W. Reps; Nicholas Kidd; Akash Lal; Junghee Lim; David Melski; Radu Gruian; Suan Hsi Yong; Chi-Hua Chen; Tim Teitelbaum
This paper presents a toolset for model checking x86 executables. The members of the toolset are CodeSurfer/x86, WPDS++, and the Path Inspector. CodeSurfer/x86 is used to extract a model from an executable in the form of a weighted pushdown system. WPDS++ is a library for answering generalized reachability queries on weighted pushdown systems. The Path Inspector is a software model checker built on top of CodeSurfer and WPDS++ that supports safety queries about the program's possible control configurations.
static analysis symposium | 2008
Gogul Balakrishnan; Sriram Sankaranarayanan; Franjo Ivancic; Ou Wei; Aarti Gupta
We present a technique for detecting semantically infeasible paths in programs using abstract interpretation. Our technique uses a sequence of path-insensitive forward and backward runs of an abstract interpreter to infer paths in the control flow graph that cannot be exercised in concrete executions of the program. We then present a syntactic language refinement (SLR) technique that automatically excludes semantically infeasible paths from a program during static analysis. SLR allows us to iteratively prove more properties. Specifically, our technique simulates the effect of a path-sensitive analysis by performing syntactic language refinement over an underlying path-insensitive static analyzer. Finally, we present experimental results to quantify the impact of our technique on an abstract interpreter for C programs.
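What it means for a path to be semantically infeasible can be shown with a deliberately simplified example. This is not the forward/backward technique or the SLR procedure from the paper: it only intersects interval constraints along one path, and it assumes a single integer variable `x` that is not modified between the branch guards. All names are hypothetical.

```python
INF = float("inf")


def meet(a, b):
    """Intersect two closed intervals; None stands for the empty interval."""
    if a is None or b is None:
        return None
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None


def path_is_feasible(initial, guards):
    """Check whether some value of x satisfies every guard along the path."""
    state = initial
    for guard in guards:
        state = meet(state, guard)
        if state is None:
            return False
    return True


x0 = (0, 10)      # x is known to lie in [0, 10] on entry
gt5 = (6, INF)    # guard x > 5 over the integers
le5 = (-INF, 5)   # guard x <= 5
lt3 = (-INF, 2)   # guard x < 3

print(path_is_feasible(x0, [gt5, lt3]))  # take "x > 5", then "x < 3"
print(path_is_feasible(x0, [le5, lt3]))  # take "x <= 5", then "x < 3"
```

A path-insensitive analysis that merges both outcomes of the first branch sees `x` in `[0, 10]` at the second branch and cannot tell that the first combination is impossible; excluding such paths is what lets the analysis recover precision.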