Peng Tu
University of Illinois at Urbana–Champaign
Publications
Featured research published by Peng Tu.
Languages and Compilers for Parallel Computing | 1993
Peng Tu; David A. Padua
Array privatization is one of the most effective transformations for the exploitation of parallelism. In this paper, we present a technique for automatic array privatization. Our algorithm uses data flow analysis of array references to identify privatizable arrays both intraprocedurally and interprocedurally. It employs static and dynamic resolution to determine the last value of a live private array. We compare the results of automatic array privatization with those of manual array privatization and identify directions for future improvement. To enhance the effectiveness of our algorithm, we develop a goal-directed technique to analyze symbolic variables in the presence of conditional statements, loops, and index arrays.
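To make the transformation concrete, here is a minimal, hypothetical C example (the paper itself works on Fortran codes): the temporary t[] is written before it is read in every iteration of the outer loop, so giving each iteration a private copy removes the memory-based dependences that would otherwise serialize it. The OpenMP pragma in the comment is only modern shorthand for the privatization such a compiler derives automatically.

```c
/* Hypothetical loop (not from the paper) showing the pattern array
 * privatization targets: t[] is always defined before it is used within
 * an iteration of i, so each iteration can get its own private copy.
 * If t were read after the loop, the compiler would also have to resolve
 * its last value (the paper's static/dynamic last-value resolution). */
#include <stdio.h>

#define N 4
#define M 8

int main(void) {
    double a[N][M];
    double t[M];

    /* A compiler that privatizes t could run this loop in parallel,
     * e.g. as: #pragma omp parallel for private(t)                   */
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < M; j++)
            t[j] = i + j;             /* every element written ...          */
        for (int j = 0; j < M; j++)
            a[i][j] = 2.0 * t[j];     /* ... before any use in the iteration */
    }

    printf("%g\n", a[N - 1][M - 1]);  /* prints 20 */
    return 0;
}
```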
International Conference on Supercomputing | 1995
Peng Tu; David A. Padua
In this paper, we present a GSA-based technique that performs more efficient and more precise symbolic analysis of predicated assignments, recurrences, and index arrays. Efficiency is improved by a backward substitution scheme that resolves assertions on demand and uses heuristics to limit the number of substitutions. Precision is increased by utilizing the gating predicate information embedded in the GSA and the control dependence information in the program flow graph. Examples from array privatization illustrate how the technique aids loop parallelization.
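As a rough sketch of the setting (simplified, with hypothetical variable names, not the paper's own example), the fragment below shows a predicated assignment and, in comments, its gated single assignment form. The gamma gating function records under which predicate each definition reaches the merge; that is what lets backward substitution resolve an assertion on demand, substituting through the gating function only when a query requires it.

```c
/* The gamma function at the join records which definition reaches it:
 *   x3 = gamma(p, x2, x1)
 * To answer a query such as "is y nonnegative?", backward substitution
 * replaces x3 by n under predicate p and by 0 under !p, reducing the
 * query to "p implies n >= 0" without analyzing the whole program.   */
#include <stdio.h>

int f(int p, int n) {
    int x = 0;        /* GSA: x1 = 0                          */
    if (p)
        x = n;        /* GSA: x2 = n                          */
    int y = x;        /* GSA: x3 = gamma(p, x2, x1); y1 = x3  */
    return y;
}

int main(void) {
    printf("%d %d\n", f(1, 5), f(0, 5));  /* prints 5 0 */
    return 0;
}
```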
IEEE Parallel & Distributed Technology: Systems & Applications | 1994
William Blume; Rudolf Eigenmann; Jay Hoeflinger; David A. Padua; Paul M. Petersen; Lawrence Rauchwerger; Peng Tu
The limited ability of compilers to find the parallelism in programs is a significant barrier to the use of high-performance computers. However, a combination of static and runtime techniques can improve compilers to the extent that a significant group of scientific programs can be parallelized automatically.
Languages and Compilers for Parallel Computing | 1994
William Blume; Rudolf Eigenmann; Keith A. Faigin; John R. Grout; Jay Hoeflinger; David A. Padua; Paul M. Petersen; William M. Pottenger; Lawrence Rauchwerger; Peng Tu; Stephen A. Weatherford
It is the goal of the Polaris project to develop a new parallelizing compiler that will overcome the limitations of current compilers. While current parallelizing compilers may succeed on small kernels, they often fail to extract any meaningful parallelism from large applications. After a study of application codes, it was concluded that by adding a few new techniques to current compilers, automatic parallelization becomes possible. The techniques needed are interprocedural analysis, scalar and array privatization, symbolic dependence analysis, and advanced induction and reduction recognition and elimination, along with run-time techniques to handle data-dependent behavior.
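For illustration, a small hypothetical C kernel (the Polaris papers target Fortran applications) showing two of the listed techniques, induction variable substitution and reduction recognition:

```c
/* Hypothetical kernel, not taken from the Polaris papers. */
#include <stdio.h>

#define N 100

int main(void) {
    double a[N], sum = 0.0;
    int k = 0;

    for (int i = 0; i < N; i++) {
        k = k + 2;            /* induction variable: closed form k = 2*(i+1) */
        a[i] = k;
        sum = sum + a[i];     /* reduction: partial sums can be combined     */
    }
    /* After induction substitution (a[i] = 2*(i+1)) and reduction
     * recognition, the loop carries no true dependence and can run in
     * parallel with per-thread partial sums merged at the end.        */
    printf("%g\n", sum);      /* prints 10100 */
    return 0;
}
```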
Programming Language Design and Implementation | 1995
Peng Tu; David A. Padua
In this paper, we present an almost-linear time algorithm for constructing Gated Single Assignment (GSA), which is SSA augmented with gating functions at φ-nodes. The gating functions specify the control dependences for each reaching definition at a φ-node. We introduce the new concept of a gating path, which is a path in the control flow graph from the immediate dominator u of a node v to v, such that every node on the path is dominated by u. Previous algorithms start with φ-function placement and then traverse the control flow graph to compute the gating functions. By formulating the problem as gating path construction, we are able to identify not only a φ-node but also a gating path expression that defines a gating function for the φ-node.
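A minimal sketch, with hypothetical code, of the structure the algorithm exploits: in an if/else diamond, the branch node is the immediate dominator of the merge, and every node on the two arms is dominated by it, so the paths between them form a gating path whose path expression directly yields the gating function.

```c
/* Illustrative example of a gating path (not the paper's notation).
 * The merge point v after the if/else is immediately dominated by the
 * branch node u; every node on both arms is dominated by u, so the
 * u-to-v paths form a gating path and its path expression gives the
 * gating function for the phi-node at v:  x3 = gamma(p, x1, x2).     */
#include <stdio.h>

int g(int p) {
    int x;           /* u: branch on p, immediate dominator of the merge */
    if (p)
        x = 1;       /* x1 */
    else
        x = 2;       /* x2 */
    return x;        /* v: merge; phi(x1, x2) gated as gamma(p, x1, x2) */
}

int main(void) {
    printf("%d %d\n", g(1), g(0));  /* prints 1 2 */
    return 0;
}
```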
International Conference on Parallel Processing | 1996
William Blume; Rudolf Eigenmann; Keith A. Faigin; John R. Grout; Jaejin Lee; Lawrence; Jay Hoeflinger; David A. Padua; Yunheung Paek; Paul M. Petersen; William M. Pottenger; Lawrence Rauchwerger; Peng Tu
The ability to automatically parallelize standard programming languages results in program portability across a wide range of machine architectures. It is the goal of the Polaris project to develop a new parallelizing compiler that overcomes limitations of current compilers. While current parallelizing compilers may succeed on small kernels, they often fail to extract any meaningful parallelism from whole applications. After a study of application codes, it was concluded that by adding a few new techniques to current compilers, automatic parallelization becomes feasible for a range of whole applications. The techniques needed are interprocedural analysis, scalar and array privatization, symbolic dependence analysis, and advanced induction and reduction recognition and elimination, along with run-time techniques to permit the parallelization of loops with unknown dependence relations.
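As a very rough sketch of the last item, the hypothetical C function below implements an inspector-style run-time check in the spirit of (and far simpler than) the LRPD test of Rauchwerger and Padua: shadow flags record which elements a loop accessing an array through an index array would touch, and the loop is executed in parallel only if no element is touched twice.

```c
/* Heavily simplified run-time dependence test; all names are hypothetical. */
#include <stdio.h>

#define N 8

/* Returns 1 if the accesses a[idx[i]] are free of cross-iteration
 * dependences (no element touched by two iterations), 0 otherwise. */
int independent(const int idx[], int n, int a_len) {
    int written[64] = {0};               /* shadow array, one flag per element */
    if (a_len > 64) return 0;            /* keep the sketch simple             */
    for (int i = 0; i < n; i++) {
        if (written[idx[i]]) return 0;   /* same element touched twice         */
        written[idx[i]] = 1;
    }
    return 1;
}

int main(void) {
    int perm[N] = {3, 1, 4, 0, 5, 2, 7, 6};   /* a permutation: no conflicts   */
    int dup[N]  = {3, 1, 4, 0, 5, 2, 7, 3};   /* element 3 written twice       */
    printf("%d %d\n", independent(perm, N, N), independent(dup, N, N)); /* 1 0 */
    return 0;
}
```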
SIGPLAN Notices | 1993
Peng Tu; David A. Padua
Memory-related anti- and output-dependences can limit the potential parallelism in ordinary programs. In a distributed memory system, improper partitioning and distribution of the data involved in memory-related dependences may incur unnecessary communication and load imbalance. In this extended abstract, we present an overview of our work on using array privatization to enhance inherent parallelism and reduce communication.
Archive | 1993
David A. Padua; Jay Hoeflinger; Keith A. Faigin; Paul M. Petersen; Peng Tu; Rudolf Eigenmann; Stephen A. Weatherford
International Journal of Parallel Programming | 1995
William Blume; Rudolf Eigenmann; Keith A. Faigin; John R. Grout; Jay Hoeflinger; David A. Padua; Paul M. Petersen; William M. Pottenger; Lawrence Rauchwerger; Peng Tu; Stephen A. Weatherford
Compiler Optimizations for Scalable Parallel Systems | 2001
Peng Tu; David A. Padua