Peiyi Tang
University of Arkansas at Little Rock
Publications
Featured research published by Peiyi Tang.
IEEE Transactions on Computers | 1990
Zhixi Fang; Peiyi Tang; Pen Chung Yew; Chuan-Qi Zhu
A processor self-scheduling scheme is proposed for general parallel nested loops in multiprocessor systems. In this scheme, programs are instrumented to allow processors to schedule loop iterations among themselves dynamically at run time without involving the operating system. The scheme has two levels. At the low level, it uses simple fetch-and-op operations to take advantage of the regular structure in the innermost parallel loop nests; at the high level, the irregular structure of the outer loops (parallel or serial) and the IF-THEN-ELSE constructs are handled by using dynamic parallel linked lists. The larger granularity of the processes at the high level easily justifies the added overhead incurred from maintaining such dynamic data structures. The use of guided self-scheduling (GSS) and shortest-delay self-scheduling (SDSS) in this scheme is analyzed.
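For illustration, the guided self-scheduling (GSS) policy analyzed here hands each idle processor a chunk of ⌈R/P⌉ remaining iterations. A minimal C sketch of that chunk computation follows; the names and the compare-and-swap emulation of a fetch-and-op primitive are assumptions for illustration, not the paper's instrumentation.

```c
#include <stdatomic.h>

/* Minimal sketch of guided self-scheduling (GSS): each processor
 * atomically claims ceil(remaining / P) iterations of a parallel loop.
 * `next_iter` is reset to 0 before the loop starts; the hardware
 * fetch-and-op used in the paper is emulated here with compare-and-swap. */
static atomic_long next_iter;                 /* next unclaimed iteration */

/* Claim a chunk [*start, *start + size); returns 0 when the loop is done. */
static long gss_claim(long n_iters, long n_procs, long *start)
{
    for (;;) {
        long cur = atomic_load(&next_iter);
        long remaining = n_iters - cur;
        if (remaining <= 0)
            return 0;
        long size = (remaining + n_procs - 1) / n_procs;   /* ceil(R/P) */
        if (atomic_compare_exchange_weak(&next_iter, &cur, cur + size)) {
            *start = cur;
            return size;
        }
    }
}
```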
International Conference on Supercomputing | 1994
Peiyi Tang; John N. Zigman
If the iterations of a loop nest cannot be partitioned into independent tasks, data communication for data dependence is inevitable in order to execute them on parallel machines. This kind of loop nest is referred to as a DOACROSS loop nest. This paper is concerned with compiler algorithms for parallelizing DOACROSS loop nests for distributed-memory multicomputers. We present a method that combines loop tiling, chain-based scheduling and indirect message passing to generate efficient message-passing parallel code. We present our experimental results on the Fujitsu AP1000 to show that low communication overhead and high speedup for DOACROSS loop nests on multicomputers can be achieved by tuning these techniques.
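As a rough illustration of the tiling idea only (not the paper's AP1000 code, which also uses chain-based scheduling and indirect message passing), a DOACROSS loop nest whose iterations depend on (i-1, j) and (i, j-1) can be tiled so that tiles on the same anti-diagonal are independent; the sketch below shows the tile-level wavefront order sequentially, without message passing.

```c
/* Illustrative sketch: a 2-D DOACROSS loop nest
 *     a[i][j] = f(a[i-1][j], a[i][j-1])
 * partitioned into T x T tiles.  Tiles on the same anti-diagonal are
 * independent and could run on different processors, with the flow
 * dependences between neighbouring tiles carried by messages; here the
 * wavefront is simply executed sequentially. */
#define N 1024
#define T 64

void doacross_tiled(double a[N][N])
{
    int ntiles = N / T;
    for (int d = 0; d < 2 * ntiles - 1; d++)        /* wavefront of tiles */
        for (int ti = 0; ti < ntiles; ti++) {
            int tj = d - ti;
            if (tj < 0 || tj >= ntiles)
                continue;
            for (int i = ti * T; i < (ti + 1) * T; i++)
                for (int j = tj * T; j < (tj + 1) * T; j++)
                    if (i > 0 && j > 0)
                        a[i][j] = 0.5 * (a[i - 1][j] + a[i][j - 1]);
        }
}
```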
ACM Southeast Regional Conference | 2011
Peiyi Tang; Erich Allen Peterson
This paper defines probabilistic support and probabilistic frequent closed itemsets in uncertain databases for the first time. It also proposes a probabilistic frequent closed itemset mining (PFCIM) algorithm to mine probabilistic frequent closed itemsets from uncertain databases.
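For context, the probability that an itemset's support reaches a threshold in an uncertain database is commonly computed with a dynamic program over the per-transaction occurrence probabilities (the support count follows a Poisson binomial distribution). The sketch below shows that computation; the function name and signature are illustrative, not the PFCIM implementation.

```c
#include <stdlib.h>

/* Sketch of the dynamic program behind probabilistic support: p[i] is the
 * probability that itemset X occurs in the i-th uncertain transaction.
 * Returns P(sup(X) >= minsup). */
double prob_support_at_least(const double *p, int n, int minsup)
{
    if (minsup <= 0)
        return 1.0;                       /* the support count is always >= 0 */

    /* f[k] = P(exactly k of the transactions seen so far contain X), k < minsup */
    double *f = calloc((size_t)minsup, sizeof *f);
    double tail = 0.0;                    /* P(count >= minsup) so far          */
    f[0] = 1.0;
    for (int i = 0; i < n; i++) {
        tail += f[minsup - 1] * p[i];     /* count crosses the threshold        */
        for (int k = minsup - 1; k >= 1; k--)
            f[k] = f[k] * (1.0 - p[i]) + f[k - 1] * p[i];
        f[0] *= 1.0 - p[i];
    }
    free(f);
    return tail;
}
```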
Journal of Parallel and Distributed Computing | 1990
Peiyi Tang; Pen Chung Yew
In a large shared-memory multiprocessor system, a large number of simultaneous accesses to a single shared variable (called a hot spot in [10]) can degrade the performance of its shared memory system. Software combining [14] is an inexpensive alternative to the hardware combining networks [3, 9] for tackling this problem. This paper gives software combining algorithms for three different types of hot-spot accesses: (1) barrier synchronizations in parallel loops, (2) fetch-and-add type operations, and (3) P and V operations on semaphores. These cover most of the general hot-spot access patterns. By using software combining trees to distribute hot-spot accesses, the number of processors that can access the same location at once is greatly reduced. In these algorithms, the completion time of a hot-spot access is on the order of O(log₂ N) in a multiprocessor system with N processors, assuming that the delay of a switch element in an interconnection network is a constant, O(1).
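A minimal sketch of the combining-tree idea for the barrier case follows, assuming a fixed fan-in and sense reversal; the structure and names are illustrative, not the paper's algorithms.

```c
#include <stdatomic.h>

/* Software combining tree for barriers: processors are grouped under leaf
 * nodes and only the last arrival at each node propagates to the parent,
 * so at most FANIN processors ever contend for the same location.
 * Nodes start with count = 0 and sense = 0; each processor starts with
 * my_sense = 1 and flips it between barrier episodes (sense reversal). */
enum { FANIN = 4 };

struct combining_node {
    atomic_int             count;    /* arrivals at this node so far       */
    atomic_int             sense;    /* flips when the barrier completes   */
    struct combining_node *parent;   /* NULL at the root                   */
};

static void barrier_wait(struct combining_node *node, int my_sense)
{
    if (atomic_fetch_add(&node->count, 1) == FANIN - 1) {
        /* last arrival here: combine upward, then release this subtree */
        if (node->parent)
            barrier_wait(node->parent, my_sense);
        atomic_store(&node->count, 0);
        atomic_store(&node->sense, my_sense);
    } else {
        while (atomic_load(&node->sense) != my_sense)
            ;                                     /* spin locally */
    }
}
```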
International Conference on Supercomputing | 1990
Peiyi Tang; Pen Chung Yew; Chuan-Qi Zhu
The major source of parallelism in ordinary programs is DO loops. When the iterations of parallelized loops are executed on multiprocessors, the cross-iteration data dependencies need to be enforced by synchronization between processors. Existing data synchronization schemes are either too simple to handle general nested loop structures with non-trivial array subscript functions, or inefficient due to large run-time overhead. In this paper, we propose a new synchronization scheme based on two data-oriented synchronization instructions: synch_read(x,s) and synch_write(x,s). We present the algorithm to compute the ordering number, s, for each data access. Using our scheme, a parallelizing compiler can parallelize a general nested loop structure with complicated cross-iteration data dependencies. If the ordering numbers cannot be computed at compile time, the run-time overhead is smaller than that of other existing run-time schemes.
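The following is only a rough software emulation of the idea, under the assumption that each synchronized variable carries a counter that must reach the ordering number s before an access proceeds and is advanced afterwards; this is an illustration, not necessarily the paper's exact instruction semantics.

```c
#include <stdatomic.h>

/* Assumed emulation of data-oriented synchronization: accesses to x are
 * serialized in the order of their ordering numbers. */
struct synch_var {
    double     value;
    atomic_int order;        /* ordering number of the next permitted access */
};

static double synch_read(struct synch_var *x, int s)
{
    while (atomic_load(&x->order) != s)
        ;                                   /* wait for our turn */
    double v = x->value;
    atomic_store(&x->order, s + 1);         /* pass the turn on  */
    return v;
}

static void synch_write(struct synch_var *x, int s, double v)
{
    while (atomic_load(&x->order) != s)
        ;
    x->value = v;
    atomic_store(&x->order, s + 1);
}
```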
Parallel Computing | 2000
Peiyi Tang; Jingling Xue
Tiling can improve the performance of nested loops on distributed memory machines by exploiting coarse-grain parallelism and reducing communication overhead and frequency. Tiling calls for a compilation approach that performs computation distribution first and then data distribution, both possibly on a skewed iteration space. This paper presents a suite of compiler techniques for generating efficient SPMD programs to execute rectangularly tiled iteration spaces on distributed memory machines. The following issues are addressed: computation and data distribution, message-passing code generation, memory management and optimisations, and global-to-local address translation. Methods are developed for partitioning arbitrary iteration spaces and skewed data spaces. Techniques for generating efficient message-passing code for both arbitrary and rectangular iteration spaces are presented. A storage scheme for managing both local and non-local references is developed, which leads to SPMD code with high locality of reference. Two memory optimisations are given to reduce the amount of memory used for skewed iteration spaces and expanded arrays, respectively. The proposed compiler techniques are illustrated using a simple running example, and finally analysed and evaluated based on experimental results on a Fujitsu AP1000 with 128 processors.
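As a small illustration of the global-to-local address translation such SPMD code performs, assuming a plain block distribution with no skewing (the paper also handles skewed spaces and buffers for non-local references):

```c
/* Sketch of index translation for a block-distributed 1-D array with
 * `block` elements per processor.  Names are illustrative. */
static inline int owner(int global_i, int block)           { return global_i / block; }
static inline int global_to_local(int global_i, int block) { return global_i % block; }
static inline int local_to_global(int local_i, int block, int my_proc)
{
    return my_proc * block + local_i;
}
```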
International Symposium on Multimedia | 2003
Chia-Chu Chiang; Peiyi Tang
Existing middleware technologies do not support the development of distributed applications that need processes to complete a task collaboratively. We extend current middleware technologies with multiparty interaction rather than design a new distributed programming language for application developers. An object adapter for coordination, running as a component, supports coordinated applications by isolating, encapsulating, and managing a component's interactions outside the component. Dynamic interface binding was designed to allow an adapter to examine the signature of the requested services at runtime, such as operation names, parameter orders, parameter types, and parameter sizes. The interfaces of interconnecting components are bound at runtime. In addition, the interface language mapping allows an interface in a specific programming language to be automatically generated from an IP (interacting processes)-like IDL (IP-IDL) interface. The use of IP-IDL for coordination simplifies the development of distributed applications for coordination.
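A generic sketch of the dynamic interface binding idea follows: an adapter keeps a runtime registry of operation signatures and dispatches a request by name rather than through a compile-time stub. This is not IP-IDL or the paper's adapter, only an illustration of the concept.

```c
#include <string.h>

/* Hypothetical runtime registry used by an adapter to bind a requested
 * operation to an implementation after checking its signature. */
typedef void (*operation_fn)(void *args, size_t nargs);

struct op_entry {
    const char  *name;      /* operation name in the requested signature */
    size_t       nparams;   /* expected number of parameters             */
    operation_fn fn;        /* bound implementation                      */
};

static operation_fn bind_operation(const struct op_entry *table, size_t n,
                                   const char *name, size_t nparams)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(table[i].name, name) == 0 && table[i].nparams == nparams)
            return table[i].fn;      /* signature matches: bind at runtime */
    return NULL;                     /* no matching interface              */
}
```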
ACM Southeast Regional Conference | 2008
Erich Allen Peterson; Peiyi Tang
In this paper, a new pattern-growth algorithm is presented to mine frequent sequential patterns using First-Occurrence Forests (FOF). This algorithm uses a simple list of pointers to the first-occurrences of a symbol in the aggregate tree [1], as the basic data structure for database representation, and does not rebuild aggregate trees for projection databases. The experimental evaluation shows that our new FOF mining algorithm outperforms the PLWAP-tree mining algorithm [2] and the FLWAP-tree mining algorithm [3], both in the mining time and the amount of memory used.
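An illustrative data-structure sketch of the flavour of structure involved, an aggregate-tree node plus a per-symbol list of pointers to its first occurrences in the tree; the field names and layout are assumptions, not taken from the paper.

```c
/* Hypothetical layout of an aggregate-tree node and a first-occurrence list. */
struct agg_node {
    int              symbol;    /* event/item label                        */
    int              count;     /* number of sequences through this node   */
    struct agg_node *child;     /* first child                             */
    struct agg_node *sibling;   /* next sibling                            */
};

struct first_occurrence {        /* one list per symbol                     */
    struct agg_node          *node;   /* a first occurrence of the symbol   */
    struct first_occurrence  *next;
};
```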
ACM Southeast Regional Conference | 2012
Erich Allen Peterson; Peiyi Tang
In recent years, the concept of and an algorithm for mining probabilistic frequent itemsets (PFIs) in uncertain databases have been proposed, based on possible-worlds semantics and a dynamic programming approach for frequency calculations. The frequentness of a given itemset in this scheme is characterized by the Poisson binomial distribution. More recently, others have extended those concepts to mine probabilistic frequent closed itemsets (PFCIs), in an attempt to reduce the number and redundancy of output patterns. In addition, work has been done to accelerate the computation of PFIs through approximation, mining approximate probabilistic frequent itemsets (A-PFIs), based on the fact that the Poisson distribution closely approximates the Poisson binomial distribution, especially when the database is large. In this paper, we introduce the concept of approximate probabilistic frequent closed itemsets (A-PFCIs) and a new algorithm, A-PFCIM, to mine them. An experimental evaluation shows that mining A-PFCIs can be orders of magnitude faster than mining traditional PFCIs.
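A small sketch of the Poisson approximation underlying A-PFI/A-PFCI mining: the Poisson binomial support distribution with per-transaction occurrence probabilities p[i] is approximated by a Poisson distribution with lambda equal to the sum of the p[i]. The function name and signature below are illustrative.

```c
#include <math.h>

/* Approximate P(sup(X) >= minsup) as 1 - F_Poisson(minsup - 1; lambda),
 * where lambda = sum of per-transaction occurrence probabilities.
 * (exp(-lambda) can underflow for very large lambda; this is a sketch.) */
double approx_prob_support(const double *p, int n, int minsup)
{
    double lambda = 0.0;
    for (int i = 0; i < n; i++)
        lambda += p[i];

    double term = exp(-lambda);       /* Poisson pmf at k = 0              */
    double cdf = 0.0;                 /* accumulates F_Poisson(minsup - 1) */
    for (int k = 0; k < minsup; k++) {
        cdf += term;
        term *= lambda / (k + 1);     /* pmf at k + 1                      */
    }
    return 1.0 - cdf;
}
```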
International Conference on Supercomputing | 1993
Peiyi Tang
Exact side effects of subroutine calls are essential for exact interprocedural dependence analysis. To summarize the side effect of multiple array references, a collective representation of all the array elements accessed is needed. So far all existing forms of collective summary of side effects of multiple array references are approximate. In this paper, we propose an approach for exact interprocedural dependence analysis based on the Omega test. In particular, we provide a method of representing the exact image of multiple array references in the form of integer programming projection and a method of back-propagation to form the exact side effect on the actual array. The representation of the exact side effect proposed in this paper can be used by the Omega test to support the exact interprocedural dependence analysis in parallelizing compilers or semi-automatic parallelization tools.
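As a hedged sketch of the kind of formulation involved (notation assumed, not taken verbatim from the paper), the exact set of elements of an array A touched by a reference A(f(i)) over an iteration set I can be written as an integer projection, and the side effect of a call is the union of such images over all references in the callee, mapped back to the actual arrays.

```latex
\[
  \mathrm{Image}(A, f, I) \;=\;
  \bigl\{\, x \in \mathbb{Z}^{d} \;\bigm|\; \exists\, \vec{\imath} \in I :\; x = f(\vec{\imath}) \,\bigr\}
\]
```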