Saumya K. Debray | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Saumya K. Debray is active.

Explore More

Publication

Featured researches published by Saumya K. Debray.

computer and communications security | 2003

Obfuscation of executable code to improve resistance to static disassembly

Cullen Linn; Saumya K. Debray

A great deal of software is distributed in the form of executable code. The ability to reverse engineer such executables can create opportunities for theft of intellectual property via software piracy, as well as security breaches by allowing attackers to discover vulnerabilities in an application. The process of reverse engineering an executable program typically begins with disassembly, which translates machine code to assembly code. This is then followed by various decompilation steps that aim to recover higher-level abstractions from the assembly code. Most of the work to date on code obfuscation has focused on disrupting or confusing the decompilation phase. This paper, by contrast, focuses on the initial disassembly phase. Our goal is to disrupt the static disassembly process so as to make programs harder to disassemble correctly. We describe two widely used static disassembly algorithms, and discuss techniques to thwart each of them. Experimental results indicate that significant portions of executables that have been obfuscated using our techniques are disassembled incorrectly, thereby showing the efficacy of our methods.

ACM Transactions on Programming Languages and Systems | 1993

Reasoning about naming systems

Mic Bowman; Saumya K. Debray; Larry L. Peterson

This paper reasons about naming systems as specialized inference mechanisms, It describes a preference )-zierarch.v that can be used to specify the structure of a naming system’s inference mechanism and defines criteria by which different naming systems can be evaluated, For example, the preference hierarchy allows one to compare naming systems based on how dkcrzmznating they are and to identify the class of names for which a given naming system is sound and complete. A study of several example naming systems demonstrates how the preference hierarchy can be used as a formal tool for designing naming systems.

ACM Transactions on Programming Languages and Systems | 2000

Compiler techniques for code compaction

Saumya K. Debray; William S. Evans; Robert Muth; Bjorn De Sutter

In recent years there has been an increasing trend toward the incorpor ation of computers into a variety of devices where the amount of memory available is limited. This makes it desirable to try to reduce the size of applications where possible. This article explores the use of compiler techniques to accomplish code compaction to yield smaller executables. The main contribution of this article is to show that careful, aggressive, interprocedural optimization, together with procedural abstraction of repeated code fragments, can yield significantly better reductions in code size than previous approaches, which have generally focused on abstraction of repeated instruction sequences. We also show how “equivalent” code fragments can be detected and factored out using conventional compiler techniques, and without having to resort to purely linear treatments of code sequences as in suffix-tree-based approaches, thereby setting up a framework for code compaction that can be more flexible in its treatment of what code fragments are considered equivalent. Our ideas have been implemented in the form of a binary-rewriting tool that reduces the size of executables by about 30% on the average.

working conference on reverse engineering | 2002

Disassembly of executable code revisited

Benjamin Schwarz; Saumya K. Debray; Gregory R. Andrews

Machine code disassembly routines form a fundamental component of software systems that statically analyze or modify executable programs, e.g., reverse engineering systems, static binary translators, and link-time optimizers. The task of disassembly is complicated by indirect jumps and the presence of non-executable data - jump tables, alignment bytes, etc. - in the instruction stream. Existing disassembly algorithms are not always able to cope successfully with executable files containing such features, and they fail silently - i.e., produce incorrect disassemblies without any indication that the results they are producing are incorrect. In this paper we examine two commonly-used disassembly algorithms and illustrate their shortcomings. We propose a hybrid approach that performs better than these algorithms in the sense that it is able to detect situations where the disassembly may be incorrect and limit the extent of such disassembly errors. Experimental results indicate that the algorithm is quite effective: the amount of code flagged as incurring disassembly errors is usually quite small.

programming language design and implementation | 2004

Dynamic path-based software watermarking

Christian S. Collberg; Edward Carter; Saumya K. Debray; Andrew S. Huntwork; John D. Kececioglu; Cullen Linn; Michael Stepp

Software watermarking is a tool used to combat software piracy by embedding identifying information into a program. Most existing proposals for software watermarking have the shortcoming that the mark can be destroyed via fairly straightforward semantics-preserving code transformations. This paper introduces path-based watermarking, a new approach to software watermarking based on the dynamic branching behavior of programs. The advantage of this technique is that error-correcting and tamper-proofing techniques can be used to make path-based watermarks resilient against a wide variety of attacks. Experimental results, using both Java bytecode and IA-32 native code, indicate that even relatively large watermarks can be embedded into programs at modest cost.

working conference on reverse engineering | 2005

Deobfuscation: reverse engineering obfuscated code

Sharath K. Udupa; Saumya K. Debray; Matias Madou

In recent years, code obfuscation has attracted attention as a low cost approach to improving software security by making it difficult for attackers to understand the inner workings of proprietary software systems. This paper examines techniques for automatic deobfuscation of obfuscated programs, as a step towards reverse engineering such programs. Our results indicate that much of the effects of code obfuscation, designed to increase the difficulty of static analyses, can be defeated using simple combinations of straightforward static and dynamic analyses. Our results have applications to both software engineering and software security. In the context of software engineering, we show how dynamic analyses can be used to enhance reverse engineering, even for code that has been designed to be difficult to reverse engineer. For software security, our results serve as an attack model for code obfuscators, and can help with the development of obfuscation techniques that are more resilient to straightforward reverse engineering.

ACM Transactions on Programming Languages and Systems | 1993

Cost analysis of logic programs

Saumya K. Debray; Nai-Wei Lin

Cost analysis of programs has been studied in the context of imperative and functional programming languages. For logic programs, the problem is complicated by the fact that programs may be nondeterministic and produce multiple solutions. A related problem is that because failure of execution is not an abnormal situation, it is possible to write programs where implicit failures have to be dealt with explicitly in order to get meaningful results. This paper addresses these problems and develops a method for (semi-)automatic analysis of the worst-case cost of a large class of logic programs. The primary contribution of this paper is the development of techniques to deal with nondeterminism and the generation of multiple solutions via backtracking

ACM Transactions on Programming Languages and Systems | 1989

Static inference of modes and data dependencies in logic programs

Saumya K. Debray

Mode and data dependency analyses find many applications in the generation of efficient executable code for logic programs. For example, mode information can be used to generate specialized unification instructions where permissible, to detect determinacy and functionality of programs, generate index structures more intelligently, reduce the amount of runtime tests in systems that support goal suspension, and in the integration of logic and functional languages. Data dependency information can be used for various source-level optimizing transformations, to improve backtracking behavior and to parallelize logic programs. This paper describes and proves correct an algorithm for the static inference of modes and data dependencies in a program. The algorithm is shown to be quite efficient for programs commonly encountered in practice.

programming language design and implementation | 1990

Task granularity analysis in logic programs

Saumya K. Debray; Nai-Wei Lin; Manuel Hermnegildo

While logic programming languages offer a great deal of scope for parallelism, there is usually some overhead associated with the execution of goals in parallel because of the work involved in task creation and scheduling. In practice, therefore, the “granularity” of a goal, i.e. an estimate of the work available under it, should be taken into account when deciding whether or not to execute a goal concurrently as a separate task. This paper describes a method for estimating the granularity of a goal at compile time. The runtime overhead associated with our approach is usually quite small, and the performance improvements resulting from the incorporation of grainsize control can be quite good. This is shown by means of experimental results.

Journal of Logic Programming | 1988

Automatic mode inference for logic programs

Saumya K. Debray; David Scott Warren

In general, logic programs are undirected, i.e., there is no concept of “input” and “output” arguments to a procedure. An argument may be used either as an input or as an output argument, and programs may be executed either in a “forward” direction or in a “backward” direction. However, it is often the case that in a given program, a predicate is used with some of its arguments used consistently as input arguments and others as output arguments. Such mode information can be used by a compiler to effect various optimizations. This paper considers the problem of automatically inferring the models of the predicates in a program. The dataflow analysis we use is more powerful than approaches relying on syntactic characteristics of programs. Our work differs from that of Mellish in that (1) we give a sound and efficient treatment of variable aliasing in mode inference; (2) by propagating instantiation information using state transformations rather than through dependencies between variables, we achieve greater precision in the treatment of unification, e.g. through =/2; and (3) we describe an efficient implementation based on the dynamic generation of customized mode interpreters. Several optimizations to improve the performance of the mode inference algorithm are described, as are various program optimizations based on mode information.

Explore More