Gary Wassermann | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Gary Wassermann is active.

Explore More

Publication

Featured researches published by Gary Wassermann.

symposium on principles of programming languages | 2006

The essence of command injection attacks in web applications

Zhendong Su; Gary Wassermann

Web applications typically interact with a back-end database to retrieve persistent data and then present the data to the user as dynamically generated output, such as HTML web pages. However, this interaction is commonly done through a low-level API by dynamically constructing query strings within a general-purpose programming language, such as Java. This low-level interaction is ad hoc because it does not take into account the structure of the output language. Accordingly, user inputs are treated as isolated lexical entities which, if not properly sanitized, can cause the web application to generate unintended output. This is called a command injection attack, which poses a serious threat to web application security. This paper presents the first formal definition of command injection attacks in the context of web applications, and gives a sound and complete algorithm for preventing them based on context-free grammars and compiler parsing techniques. Our key observation is that, for an attack to succeed, the input that gets propagated into the database query or the output document must change the intended syntactic structure of the query or document. Our definition and algorithm are general and apply to many forms of command injection attacks. We validate our approach with SqlCheckS, an implementation for the setting of SQL command injection attacks. We evaluated SqlCheckS on real-world web applications with systematically compiled real-world attack data as input. SqlCheckS produced no false positives or false negatives, incurred low runtime overhead, and applied straightforwardly to web applications written in different languages.

programming language design and implementation | 2007

Sound and precise analysis of web applications for injection vulnerabilities

Gary Wassermann; Zhendong Su

Web applications are popular targets of security attacks. One common type of such attacks is SQL injection, where an attacker exploits faulty application code to execute maliciously crafted database queries. Bothstatic and dynamic approaches have been proposed to detect or prevent SQL injections; while dynamic approaches provide protection for deployed software, static approaches can detect potential vulnerabilities before software deployment. Previous static approaches are mostly based on tainted information flow tracking and have at least some of the following limitations: (1) they do not model the precise semantics of input sanitization routines; (2) they require manually written specifications, either for each query or for bug patterns; or (3) they are not fully automated and may require user intervention at various points in the analysis. In this paper, we address these limitations by proposing a precise, sound, and fully automated analysis technique for SQL injection. Our technique avoids the need for specifications by consideringas attacks those queries for which user input changes the intended syntactic structure of the generated query. It checks conformance to this policy byconservatively characterizing the values a string variable may assume with a context free grammar, tracking the nonterminals that represent user-modifiable data, and modeling string operations precisely as language transducers. We have implemented the proposed technique for PHP, the most widely-used web scripting language. Our tool successfully discovered previously unknown and sometimes subtle vulnerabilities in real-world programs, has a low false positive rate, and scales to large programs (with approx. 100K loc).

international conference on software engineering | 2008

Static detection of cross-site scripting vulnerabilities

Gary Wassermann; Zhendong Su

Web applications support many of our daily activities, but they often have security problems, and their accessibility makes them easy to exploit. In cross-site scripting (XSS), an attacker exploits the trust a Web client (browser) has for a trusted server and executes injected script on the browser with the servers privileges. In 2006, XSS constituted the largest class of newly reported vulnerabilities making it the most prevalent class of attacks today. Web applications have XSS vulnerabilities because the validation they perform on untrusted input does not suffice to prevent that input from invoking a browsers JavaScript interpreter, and this validation is particularly difficult to get right if it must admit some HTML mark-up. Most existing approaches to finding XSS vulnerabilities are taint-based and assume input validation functions to be adequate, so they either miss real vulnerabilities or report many false positives. This paper presents a static analysis for finding XSS vulnerabilities that directly addresses weak or absent input validation. Our approach combines work on tainted information flow with string analysis. Proper input validation is difficult largely because of the many ways to invoke the JavaScript interpreter; we face the same obstacle checking for vulnerabilities statically, and we address it by formalizing a policy based on the W3C recommendation, the Firefox source code, and online tutorials about closed-source browsers. We provide effective checking algorithms based on our policy. We implement our approach and provide an extensive evaluation that finds both known and unknown vulnerabilities in real-world web applications.

ACM Transactions on Software Engineering and Methodology | 2007

Static checking of dynamically generated queries in database applications

Gary Wassermann; Carl Gould; Zhendong Su; Premkumar T. Devanbu

Many data-intensive applications dynamically construct queries in response to client requests and execute them. Java servlets, e.g., can create string representations of SQL queries and then send the queries, using JDBC, to a database server for execution. The servlet programmer enjoys static checking via Javas strong type system. However, the Java type system does little to check for possible errors in the dynamically generated SQL query strings. Thus, a type error in a generated selection query (e.g., comparing a string attribute with an integer) can result in an SQL runtime exception. Currently, such defects must be rooted out through careful testing, or (worse) might be found by customers at runtime. In this paper, we present a sound, static, program analysis technique to verify the correctness of dynamically generated query strings. We describe our analysis technique and provide soundness results for our static analysis algorithm. We also describe the details of a prototype tool based on the algorithm and present several illustrative defects found in senior software-engineering student-team projects, online tutorial examples, and a real-world purchase order system written by one of the authors.

international symposium on software testing and analysis | 2008

Dynamic test input generation for web applications

Gary Wassermann; Dachuan Yu; Ajay Chander; Dinakar Dhurjati; Hiroshi Inamura; Zhendong Su

Web applications routinely handle sensitive data, and many people rely on them to support various daily activities, so errors can have severe and broad-reaching consequences. Unlike most desktop applications, many web applications are written in scripting languages, such as PHP. The dynamic features commonly supported by these languages significantly inhibit static analysis and existing static analysis of these languages can fail to produce meaningful results on realworld web applications. Automated test input generation using the concolic testing framework has proven useful for finding bugs and improving test coverage on C and Java programs, which generally emphasize numeric values and pointer-based data structures. However, scripting languages, such as PHP, promote a style of programming for developing web applications that emphasizes string values, objects, and arrays. In this paper, we propose an automated input test generation algorithm that uses runtime values to analyze dynamic code, models the semantics of string operations, and handles operations whose argument and return values may not share a common type. As in the standard concolic testing framework, our algorithm gathers constraints during symbolic execution. Our algorithm resolves constraints over multiple types by considering each variable instance individually, so that it only needs to invert each operation. By recording constraints selectively, our implementation successfully finds bugs in real-world web applications which state-of-the-art static analysis tools fail to analyze.

architectural support for programming languages and operating systems | 2006

ExecRecorder: VM-based full-system replay for attack analysis and system recovery

Daniela A. S. de Oliveira; Jedidiah R. Crandall; Gary Wassermann; S. Felix Wu; Zhendong Su; Frederic T. Chong

Log-based recovery and replay systems are important for system reliability, debugging and postmortem analysis/recovery of malware attacks. These systems must incur low space and performance overhead, provide full-system replay capabilities, and be resilient against attacks. Previous approaches fail to meet these requirements: they replay only a single process, or require changes in the host and guest OS, or do not have a fully-implemented replay component. This paper studies full-system replay for uniprocessors by logging and replaying architectural events. To limit the amount of logged information, we identify architectural nondeterministic events, and encode them compactly. Here we present ExecRecorder, a full-system, VM-based, log and replay framework for post-attack analysis and recovery. ExecRecorder can replay the execution of an entire system by checkpointing the system state and logging architectural nondeterministic events, and imposes low performance overhead (less than 4% on average). In our evaluation its log files grow at about 5.4 GB/hour (arithmetic mean). Thus it is practical to log on the order of hours or days between checkpoints. It can also be integrated naturally with an IDS and a post-attack analysis tool for intrusion analysis and recovery.

network operations and management symposium | 2008

Bezoar: Automated virtual machine-based full-system recovery from control-flow hijacking attacks

Daniela A. S. de Oliveira; Jedidiah R. Crandall; Gary Wassermann; Shaozhi Ye; Shyhtsun Felix Wu; Zhendong Su; Frederic T. Chong

System availability is difficult for systems to maintain in the face of Internet worms. Large systems have vulnerabilities, and if a system attempts to continue operation after an attack, it may not behave properly. Traditional mechanisms for detecting attacks disrupt service and current recovery approaches are application-based and cannot guarantee recovery in the face of exploits that corrupt the kernel, involve multiple processes or target multithreaded network services. This paper presents Bezoar, an automated full-system virtual machine-based approach to recover from zero-day control-flow hijacking attacks. Bezoar tracks down the source of network bytes in the system and after an attack, replays the checkpointed run while ignoring inputs from the malicious source. We evaluated our proof-of-concept prototype on six notorious exploits for Linux and Windows. In all cases, it recovered the full system state and resumed execution. Bezoar incurs low overhead to the virtual machine: less than 1% for the recovery and log components and approximately 1.4X for the memory monitor component that tracks down network bytes, for five SPEC INT 2000 benchmarks.

trans. computational science | 2009

Putting Trojans on the Horns of a Dilemma: Redundancy for Information Theft Detection

Jedidiah R. Crandall; John Brevik; Shaozhi Ye; Gary Wassermann; Daniela A. S. de Oliveira; Zhendong Su; S. Felix Wu; Frederic T. Chong

Conventional approaches to either information flow security or intrusion detection are not suited to detecting Trojans that steal information such as credit card numbers using advanced cryptovirological and inference channel techniques. We propose a technique based on repeated deterministic replays in a virtual machine to detect the theft of private information. We prove upper bounds on the average amount of information an attacker can steal without being detected, even if they are allowed an arbitrary distribution of visible output states. Our intrusion detection approach is more practical than traditional approaches to information flow security. We show that it is possible to, for example, bound the average amount of information an attacker can steal from a 53-bit credit card number to less than a bit by sampling only 11 of the 253 possible outputs visible to the attacker, using a two-pronged approach of hypothesis testing and information theory.

foundations of software technology and theoretical computer science | 2006

Validity checking for finite automata over linear arithmetic constraints

Gary Wassermann; Zhendong Su

Decision procedures underlie many program analysis problems. Traditional program analysis algorithms attempt to prove some property about a single, statically-defined program by generating a single constraint. Accordingly, traditional decision procedures take single constraints as input. Extending these traditional program analysis algorithms to reason about potentially infinite languages of programs (as generated by a given metaprogram) requires a new class of decision procedures that reason about languages of constraints. This paper introduces the parameterized class of validity checking problems that take as input a language generator A. The parameters are: (1) the language formalism for A, (2) the theory under which each string in the language of A is interpretted, and (3) the quantification (existential/universal) of the constraints in the language to which the validity property applies. We introduce such decision problems by presenting an algorithm that decides whether a given finite state automaton A generates any valid linear arithmetic constraints.

Archive | 2004