Publications


Featured research published by Mark N. Wegman.


Journal of Computer and System Sciences | 1979

Universal classes of hash functions

J. Lawrence Carter; Mark N. Wegman

This paper gives an input independent average linear time algorithm for storage and retrieval on keys. The algorithm makes a random choice of hash function from a suitable class of hash functions. Given any sequence of inputs the expected time (averaging over all functions in the class) to store and retrieve elements is linear in the length of the sequence. The number of references to the data base required by the algorithm for any input is extremely close to the theoretical minimum for any possible hash function with randomly distributed inputs. We present three suitable classes of hash functions which also can be evaluated rapidly. The ability to analyze the cost of storage and retrieval without worrying about the distribution of the input allows as corollaries improvements on the bounds of several algorithms.
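To make the construction concrete: one of the classes the paper proposes hashes a key x as ((a·x + b) mod p) mod m, where p is a prime larger than any key and a, b are drawn at random. A minimal Python sketch of that idea (class and parameter names are ours, not the paper's):

```python
import random

class UniversalHash:
    """One hash function drawn at random from the Carter-Wegman class
    h_{a,b}(x) = ((a*x + b) mod p) mod m, with p prime and p > max key."""

    P = (1 << 61) - 1  # a Mersenne prime comfortably larger than 32-bit keys

    def __init__(self, num_buckets: int):
        self.m = num_buckets
        self.a = random.randrange(1, self.P)  # a != 0
        self.b = random.randrange(0, self.P)

    def __call__(self, key: int) -> int:
        return ((self.a * key + self.b) % self.P) % self.m

# Usage: each run picks a fresh function from the class, so the expected
# cost of a chained hash table is linear in the input, for ANY fixed input.
h = UniversalHash(1024)
table = [[] for _ in range(1024)]
for key in [17, 42, 99, 1 << 40]:
    table[h(key)].append(key)
```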


Journal of Computer and System Sciences | 1981

New hash functions and their use in authentication and set equality

Mark N. Wegman; J. Lawrence Carter

In this paper we exhibit several new classes of hash functions with certain desirable properties, and introduce two novel applications for hashing which make use of these functions. One class contains a small number of functions, yet is almost universal₂. If the functions hash n-bit long names into m-bit indices, then specifying a member of the class requires only O((m + log₂ log₂ n) · log₂ n) bits, as compared to O(n) bits for earlier techniques. For long names, this is about a factor of m larger than the lower bound of m + log₂ n − log₂ m bits. An application of this class is a provably secure authentication technique for sending messages over insecure lines. A second class of functions satisfies a much stronger property than universal₂; we present an application to testing sets for equality. The authentication technique allows the receiver to be certain that a message is genuine: an “enemy” (even one with infinite computer resources) cannot forge or modify a message without detection. The set equality technique allows operations including “add member to set,” “delete member from set,” and “test two sets for equality” to be performed in expected constant time and with less than a specified probability of error.
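The authentication idea can be sketched in a few lines: sender and receiver share one randomly chosen function from a strongly universal₂ class and tag each message with its hash. The sketch below uses the simple (a·x + b) mod p family and one key per message for clarity; the paper's classes are designed to need far fewer key bits, and all names here are illustrative:

```python
import random

P = (1 << 61) - 1  # prime modulus; messages are reduced mod P

def keygen():
    # Shared secret: one function from the strongly universal_2 class
    # h_{a,b}(x) = (a*x + b) mod p.
    return random.randrange(1, P), random.randrange(0, P)

def tag(key, message: int) -> int:
    a, b = key
    return (a * message + b) % P

def verify(key, message: int, t: int) -> bool:
    return tag(key, message) == t

# One message per key: even an enemy with unlimited computing power who
# sees (message, tag) can forge another message with probability <= 1/P.
k = keygen()
m = 123456789
t = tag(k, m)
assert verify(k, m, t)
```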


Programming Language Design and Implementation | 1990

Analysis of pointers and structures

David R. Chase; Mark N. Wegman; F. Kenneth Zadeck

Compilers can make good use of knowledge about the shape of data structures and the values that pointers assume during execution. We present results which show how a compiler can automatically determine some of this information. We believe that practical analyses based on this work can be used in compilers for languages that provide linked data structures. The analysis we present obtains useful information about linked data structures. We summarize unbounded data structures by taking advantage of structure present in the original program. The worst-case time bounds for a naive algorithm are high-degree polynomial, but for the expected (sparse) case we have an efficient algorithm. Previous work has addressed time bounds rarely, and efficient algorithms not at all. The quality of information obtained by this analysis appears to be (generally) an improvement on what is obtained by existing techniques. A simple extension obtains aliasing information for entire data structures that previously was obtained only through declarations. Previous work has shown that this information, however obtained, allows worthwhile optimization.
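The paper's storage shape graphs do not fit in a short example, but the core summarization step, collapsing the unbounded heap onto finitely many abstract nodes named by allocation site, can be sketched as follows (a simplified illustration with hypothetical names, not the paper's algorithm):

```python
from collections import defaultdict

class AbstractHeap:
    """Abstract heap: one summary node per allocation site, so an
    unbounded linked structure built in a loop collapses to finitely
    many nodes. Edges record which fields may point where."""

    def __init__(self):
        self.points_to = defaultdict(set)    # variable -> {site}
        self.field_edges = defaultdict(set)  # (site, field) -> {site}

    def alloc(self, var, site):              # var = new Node()  [at `site`]
        self.points_to[var] = {site}

    def store(self, var, field, src):        # var.field = src
        for node in self.points_to[var]:
            self.field_edges[(node, field)] |= self.points_to[src]

    def load(self, dst, var, field):         # dst = var.field
        self.points_to[dst] = set().union(
            *(self.field_edges[(n, field)] for n in self.points_to[var]))

# A list built in a loop: every iteration allocates at the same site,
# so "x.next = x" yields a cycle on a single abstract node.
h = AbstractHeap()
h.alloc("x", "site1")
h.store("x", "next", "x")
assert h.field_edges[("site1", "next")] == {"site1"}
```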


Symposium on Principles of Programming Languages | 1988

Detecting equality of variables in programs

Bowen Alpern; Mark N. Wegman; F. K. Zadeck

This paper presents an algorithm for detecting when two computations produce equivalent values. The equivalence of programs, and hence the equivalence of values, is in general undecidable. Thus, the best one can hope to do is to give an efficient algorithm that detects a large subclass of all the possible equivalences in a program. Two variables are said to be equivalent at a point p if those variables contain the same values whenever control reaches p during any possible execution of the program. We will not examine all possible executions of the program. Instead, we will develop a static property called congruence. Congruence implies, but is not implied by, equivalence. Our approach is conservative in that any variables detected to be equivalent will in fact be equivalent, but not all equivalences are detected. Previous work has shown how to apply a technique called value numbering in basic blocks [CS70]. Value numbering is essentially symbolic execution on straight-line programs (basic blocks). Symbolic execution implies that two expressions are assumed to be equal only when they consist of the same functions and the corresponding arguments of these functions are equal. An expression DAG is associated with each assignment statement. A hashing algorithm assigns a unique integer, the value number, to each different expression tree. Two variables that are assigned the same integer are guaranteed to be equivalent. After the code…
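The value numbering technique the abstract describes is easy to state in code. Here is a minimal sketch of hash-based value numbering over a single basic block (illustrative only; the paper's contribution is the global congruence computation that extends beyond straight-line code):

```python
def value_number(block):
    """Local value numbering over a straight-line block of
    (dest, op, arg1, arg2) tuples. Two variables that receive the
    same number are guaranteed to hold equal values."""
    next_vn = 0
    var_vn = {}    # variable -> value number
    expr_vn = {}   # (op, vn1, vn2) -> value number

    def vn_of(arg):
        nonlocal next_vn
        if arg not in var_vn:          # first use: fresh number
            var_vn[arg] = next_vn
            next_vn += 1
        return var_vn[arg]

    for dest, op, a1, a2 in block:
        key = (op, vn_of(a1), vn_of(a2))
        if key not in expr_vn:         # new expression: fresh number
            expr_vn[key] = next_vn
            next_vn += 1
        var_vn[dest] = expr_vn[key]    # redundant if key was seen before
    return var_vn

# x and y get the same value number: the two additions are congruent.
numbers = value_number([("x", "+", "a", "b"), ("y", "+", "a", "b")])
assert numbers["x"] == numbers["y"]
```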


Symposium on Principles of Programming Languages | 1989

An efficient method of computing static single assignment form

Ron Cytron; Jeanne Ferrante; Barry K. Rosen; Mark N. Wegman; F. K. Zadeck

In optimizing compilers, data structure choices directly influence the power and efficiency of practical program optimization. A poor choice of data structure can inhibit optimization or slow compilation to the point where advanced optimization features become undesirable. Recently, static single assignment form and the control dependence graph have been proposed to represent data flow and control flow properties of programs. Each of these previously unrelated techniques lends efficiency and power to a useful class of program optimizations. Although both of these structures are attractive, the difficulty of their construction and their potential size have discouraged their use. We present a new algorithm that efficiently computes these data structures for arbitrary control flow graphs. We also give analytical and experimental evidence that they are usually linear in the size of the original program. This paper thus presents strong evidence that these structures can be of practical use in optimization.
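The φ-placement step at the heart of the construction is compact enough to sketch: a variable needs a φ-function in the iterated dominance frontier of the blocks that assign it. Assuming dominance frontiers have already been computed (function and variable names are ours, and the renaming pass is omitted):

```python
def place_phis(defsites, dominance_frontier):
    """For each variable, compute the blocks needing a phi-function:
    the iterated dominance frontier of the blocks that assign it.
    `defsites` maps variable -> set of defining blocks;
    `dominance_frontier` maps block -> set of frontier blocks."""
    phis = {}
    for var, sites in defsites.items():
        needs_phi = set()
        worklist = list(sites)
        while worklist:
            block = worklist.pop()
            for frontier_block in dominance_frontier[block]:
                if frontier_block not in needs_phi:
                    needs_phi.add(frontier_block)
                    # A phi is itself a definition, so its frontier
                    # may demand further phis: iterate.
                    worklist.append(frontier_block)
        phis[var] = needs_phi
    return phis

# Diamond CFG: entry -> {then, else} -> join. A variable assigned in
# both arms needs exactly one phi, at the join.
df = {"entry": set(), "then": {"join"}, "else": {"join"}, "join": set()}
assert place_phis({"x": {"then", "else"}}, df) == {"x": {"join"}}
```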


Symposium on Principles of Programming Languages | 1988

Global value numbers and redundant computations

Barry K. Rosen; Mark N. Wegman; F. K. Zadeck

Most previous redundancy elimination algorithms have been of two kinds. The lexical algorithms deal with the entire program, but they can only detect redundancy among computations of lexically identical expressions, where expressions are lexically identical if they apply exactly the same operator to exactly the same operands. The value numbering algorithms, on the other hand, can recognize redundancy among expressions that are lexically different but that are certain to compute the same value. This is accomplished by assigning special symbolic names called value numbers to expressions. If the value numbers of the operands of two expressions are identical, and if the operators applied by the expressions are identical, then the expressions receive the same value number and are certain to have the same values. Sameness of value numbers permits more extensive optimization than lexical identity, but value numbering algorithms have usually been restricted in the past to basic blocks (sequences of computations with no branching) or extended basic blocks (sequences of computations with no joins). We propose a redundancy elimination algorithm that is global (in that it deals with the entire program), yet able to recognize redundancy among expressions that are lexically different. The algorithm also takes advantage of second order effects: transformations based on the discovery that two computations compute the same value may create opportunities to discover that other computations are equivalent. The algorithm applies to programs expressed as reducible [1] [9] control flow graphs. As the examples in section 7 illustrate, our algorithm optimizes reducible programs much more extensively than previous algorithms. In the special case of a program without loops, the code generated by our algorithm is provably “optimal” in the technical sense explained in section 8. This degree of optimization is…


Symposium on the Theory of Computing | 1977

Universal classes of hash functions (Extended Abstract)

J. Lawrence Carter; Mark N. Wegman

This paper gives an input independent average linear time algorithm for storage and retrieval on keys. The algorithm makes a random choice of hash function from a suitable class of hash functions. Given any sequence of inputs the expected time (averaging over all functions in the class) to store and retrieve elements is linear in the length of the sequence. The number of references to the data base required by the algorithm for any input is extremely close to the theoretical minimum for any possible hash function with randomly distributed inputs. We present three suitable classes of hash functions which also may be evaluated rapidly. The ability to analyze the cost of storage and retrieval without worrying about the distribution of the input allows as corollaries improvements on the bounds of several algorithms.


Symposium on the Theory of Computing | 1978

Exact and approximate membership testers

Larry Carter; Robert W. Floyd; John Gill; George Markowsky; Mark N. Wegman

In this paper we consider the question of how much space is needed to represent a set. Given a finite universe U and some subset V (called the vocabulary), an exact membership tester is a procedure that for each element s in U determines if s is in V. An approximate membership tester is allowed to make mistakes: we require that the membership tester correctly accepts every element of V, but we allow it to also accept a small fraction of the elements of U - V.
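The best-known concrete instance of an approximate membership tester is a Bloom filter: every element of V is accepted, and elements of U − V are accepted with a small, tunable probability. A minimal sketch:

```python
import hashlib

class BloomFilter:
    """Approximate membership tester: no false negatives, and a false
    positive rate that shrinks as bits-per-element grows."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.m = num_bits
        self.k = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # k independent bit positions derived from salted hashes.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

bf = BloomFilter(num_bits=8192, num_hashes=4)
bf.add("carter")
assert "carter" in bf        # members of V are always accepted
# An element outside V is very likely rejected, but may be a false positive.
```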


Archive | 1985

Variations on a theme by Ziv and Lempel

Victor S. Miller; Mark N. Wegman

The data compression methods of Ziv and Lempel are modified and augmented in three ways, in order to improve the compression ratio and to hold the encoding tables to a fixed size. The improvements are: dispensing with any uncompressed output, the ability to use fixed-size encoding tables through a replacement strategy, and more rapid adaptation through widening the class of strings that may be added to the dictionary. Following Langdon, we show how these improvements also provide an adaptive probabilistic model for the input data. The issue of data structures for efficient implementation is also addressed.
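For reference, the unmodified Ziv–Lempel–Welch scheme being varied here can be sketched as follows. This version simply freezes the dictionary when it reaches a fixed size, whereas the paper instead replaces stale entries and widens what may be added:

```python
def lzw_encode(data: bytes, max_table: int = 4096):
    """LZW with a fixed-size dictionary: initialize with all single
    bytes, extend greedily, and freeze the table when it fills.
    (The Miller-Wegman variants replace entries rather than freeze;
    that strategy is omitted here.)"""
    table = {bytes([i]): i for i in range(256)}
    out, current = [], b""
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate            # keep extending the match
        else:
            out.append(table[current])     # emit code for longest match
            if len(table) < max_table:     # grow until the cap
                table[candidate] = len(table)
            current = bytes([byte])
    if current:
        out.append(table[current])
    return out

codes = lzw_encode(b"abababab")
# Repeats compress: 8 input bytes become 5 output codes here.
assert len(codes) < 8
```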


Symposium on Principles of Programming Languages | 1985

Constant propagation with conditional branches

Mark N. Wegman; Frank Kenneth Zadeck

Constant propagation is a well-known global flow analysis problem. The goal of constant propagation is to discover values that are constant on all possible executions of a program and to propagate these constant values as far forward through the program as possible. Expressions whose operands are all constants can be evaluated at compile time and the results propagated further. Using the algorithms presented in this paper can produce smaller and faster compiled programs. The same algorithms can be used for other kinds of analyses (e.g., type determination). We present four algorithms in this paper, all conservative in the sense that not all constants may be found, but each constant found is constant over all possible executions of the program. These algorithms are among the simplest, fastest, and most powerful global constant propagation algorithms known. We also present a new algorithm that performs a form of interprocedural data flow analysis in which aliasing information is gathered in conjunction with constant propagation. Several variants of this algorithm are considered.
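The lattice these algorithms share is small enough to sketch: each value is undetermined (top), a known constant, or not constant (bottom), and joins in the control flow combine values with a meet operator. A minimal illustration (the conditional-branch and interprocedural machinery of the paper is omitted):

```python
# Three-level lattice used in constant propagation:
# TOP ("undetermined") over all constants over BOTTOM ("not constant").
TOP, BOTTOM = object(), object()

def meet(a, b):
    """Combine values flowing into a join point; information only
    moves down the lattice."""
    if a is TOP:
        return b
    if b is TOP:
        return a
    return a if a == b else BOTTOM

def evaluate(op, a, b):
    """Fold an expression at compile time only when both operands
    are known constants."""
    if a is BOTTOM or b is BOTTOM:
        return BOTTOM
    if a is TOP or b is TOP:
        return TOP
    return {"+": a + b, "*": a * b}[op]

# A value constant along both incoming paths stays constant...
assert meet(3, 3) == 3
# ...but disagreement at the join lowers it to BOTTOM.
assert meet(3, 4) is BOTTOM
assert evaluate("+", 2, 3) == 5
```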
