Yuan-Shin Hwang
National Taiwan University of Science and Technology
Publications
Featured research published by Yuan-Shin Hwang.
Journal of Parallel and Distributed Computing | 1994
Raja Das; Mustafa Uysal; Joel H. Saltz; Yuan-Shin Hwang
This paper describes a number of optimizations that can be used to support the efficient execution of irregular problems on distributed memory parallel machines. It presents software primitives that (1) coordinate interprocessor data movement, (2) manage the storage of, and access to, copies of off-processor data, (3) minimize interprocessor communication requirements, and (4) support a shared name space. We present a detailed performance and scalability analysis of the communication primitives, carried out using a workload generator, kernels from real applications, and a large unstructured adaptive application (the molecular dynamics code CHARMM).
conference on high performance computing (supercomputing) | 1994
Shamik D. Sharma; Ravi Ponnusamy; Bongki Moon; Yuan-Shin Hwang; Raja Das; Joel H. Saltz
In adaptive irregular problems, data arrays are accessed via indirection arrays, and data access patterns change during computation. Parallelizing such problems on distributed memory machines requires support for dynamic data partitioning, efficient preprocessing, and fast data migration. This paper describes CHAOS, a library of efficient runtime primitives that provides such support. To demonstrate the effectiveness of the runtime support, two adaptive irregular applications have been parallelized using CHAOS primitives: a molecular dynamics code (CHARMM) and a code for simulating gas flows (DSMC). We have also proposed minor extensions to Fortran D that would enable compilers to parallelize irregular forall loops in such adaptive applications by embedding calls to primitives provided by a runtime library. We have implemented the proposed extensions in the Syracuse Fortran 90D/HPF prototype compiler and have used the compiler to parallelize kernels from two adaptive applications.
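The runtime preprocessing that CHAOS-style libraries perform is commonly described as an inspector/executor scheme. The sketch below is an illustrative single-process model of that idea, assuming a block ownership range per processor; the function names and data layout here are assumptions for illustration, not the CHAOS API.

```python
# Illustrative inspector/executor sketch (not the CHAOS API): the inspector
# scans the indirection array once to build a "communication schedule", and
# the executor then runs the irregular loop against owned data plus a ghost
# buffer holding prefetched off-processor copies.

def inspector(indirection, owned_lo, owned_hi):
    """Inspector: record which referenced global indices live off-processor
    (outside [owned_lo, owned_hi)) and assign each a slot in a ghost buffer."""
    off_proc = sorted({i for i in indirection if not (owned_lo <= i < owned_hi)})
    ghost_slot = {g: s for s, g in enumerate(off_proc)}
    return off_proc, ghost_slot   # the schedule: what to fetch, and where to put it

def executor(x_local, ghost, indirection, ghost_slot, owned_lo):
    """Executor: run the irregular loop, resolving each reference either to a
    locally owned element or to its prefetched ghost copy."""
    total = 0.0
    for i in indirection:
        if i in ghost_slot:
            total += ghost[ghost_slot[i]]    # off-processor copy
        else:
            total += x_local[i - owned_lo]   # locally owned element
    return total

# This processor owns global indices [0, 4); references to 5 and 7 are remote.
refs = [1, 5, 3, 7, 5]
off_proc, slots = inspector(refs, 0, 4)      # off_proc == [5, 7]
ghost = [50.0, 70.0]                         # values of globals 5 and 7, "fetched"
print(executor([10.0, 20.0, 30.0, 40.0], ghost, refs, slots, 0))  # 230.0
```

Because the schedule is built once, it can be reused across iterations as long as the indirection array does not change, which is exactly the reuse opportunity the later inspector-reuse work exploits.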
Software - Practice and Experience | 1995
Yuan-Shin Hwang; Bongki Moon; Shamik D. Sharma; Ravi Ponnusamy; Raja Das; Joel H. Saltz
In many scientific applications, arrays containing data are indirectly indexed through indirection arrays. Such scientific applications are called irregular programs and are a distinct class of applications that require special techniques for parallelization.
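The indirect-indexing pattern described above can be shown with a minimal sketch (not taken from the paper): the loop's memory footprint depends on the run-time contents of the indirection array, which is why such loops defeat static compile-time analysis.

```python
# Minimal illustration of an irregular access: the data array x is indexed
# through an indirection array rather than by the loop counter itself.
def irregular_sum(x, ia):
    """Sum the elements of x named by the indirection array ia."""
    total = 0.0
    for i in ia:             # access pattern is x[ia[k]], unknown until run time
        total += x[i]
    return total

x = [1.0, 2.0, 3.0, 4.0]
ia = [3, 0, 3, 1]            # indirection array: which elements get touched
print(irregular_sum(x, ia))  # 4 + 1 + 4 + 2 = 11.0
```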
IEEE Transactions on Parallel and Distributed Systems | 1995
Ravi Ponnusamy; Joel H. Saltz; Alok N. Choudhary; Yuan-Shin Hwang; Geoffrey C. Fox
This paper describes two new ideas by which a High Performance Fortran compiler can deal with irregular computations effectively. The first mechanism invokes a user specified mapping procedure via a set of proposed compiler directives. The directives allow use of program arrays to describe graph connectivity, spatial location of array elements, and computational load. The second mechanism is a conservative method for compiling irregular loops in which dependence arises only due to reduction operations. This mechanism in many cases enables a compiler to recognize that it is possible to reuse previously computed information from inspectors (e.g., communication schedules, loop iteration partitions, and information that associates off-processor data copies with on-processor buffer locations). This paper also presents performance results for these mechanisms from a Fortran 90D compiler implementation.
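The first mechanism above, a user-specified mapping, amounts to distributing array elements by an explicit map array (typically produced by a graph partitioner) instead of a regular BLOCK or CYCLIC rule. The sketch below is a hypothetical rendering of that idea in Python; the names and the two-processor example are illustrative, not HPF syntax.

```python
# Hypothetical illustration of an irregular (user-mapped) distribution:
# map_array[i] names the processor that owns global element i, so ownership
# can follow graph connectivity or load rather than a regular pattern.
def distribute(n, map_array):
    """Return, for each processor, the sorted list of global indices it owns."""
    owned = {}
    for i in range(n):
        owned.setdefault(map_array[i], []).append(i)
    return owned

# Elements placed by (say) a graph partitioner rather than BLOCK/CYCLIC:
print(distribute(6, [0, 0, 1, 0, 1, 1]))  # {0: [0, 1, 3], 1: [2, 4, 5]}
```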
acm sigplan symposium on principles and practice of parallel programming | 2003
Peng-Sheng Chen; Ming-Yu Hung; Yuan-Shin Hwang; Roy Dz-Ching Ju; Jenq Kuen Lee
Speculative multithreading (SpMT) architectures can exploit thread-level parallelism that cannot be identified statically. Speedup can be obtained by speculatively executing, in parallel, threads extracted from a sequential program. However, performance may degrade if the threads are highly dependent, since a recovery mechanism is activated whenever a speculative thread executes incorrectly, and such recovery usually incurs a very high penalty. It is therefore essential for SpMT to quantify the degree of dependence and to turn off speculation when it exceeds certain thresholds. This paper presents a technique that quantitatively computes dependences between loop iterations; this information can then be used to determine whether loop iterations can be executed in parallel by speculative threads. The technique proceeds in two steps. First, probabilistic points-to analysis is performed to estimate the probabilities of points-to relationships when programs contain pointer references; then the degree of dependence between loop iterations is computed quantitatively. Preliminary experimental results show that compiler-directed thread-level speculation based on the information gathered by this technique can achieve significant performance improvements on SpMT.
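The threshold decision described above can be sketched as a toy model (an assumption for illustration, not the paper's actual formulation): given estimated probabilities that individual cross-iteration reference pairs alias, speculate only if the combined chance of any conflict stays below a threshold.

```python
# Toy sketch of a speculation on/off decision. conflict_probs holds, for each
# pair of cross-iteration memory references, the estimated probability that
# the pair aliases (e.g. from probabilistic points-to analysis). Treating the
# pairs as independent, the chance that at least one conflict occurs is
# 1 - prod(1 - p); speculation is profitable only when that stays small.
def should_speculate(conflict_probs, threshold=0.2):
    p_no_conflict = 1.0
    for p in conflict_probs:
        p_no_conflict *= (1.0 - p)
    return (1.0 - p_no_conflict) <= threshold

print(should_speculate([0.05, 0.02]))  # weakly dependent iterations -> True
print(should_speculate([0.5, 0.3]))    # highly dependent -> False
```

The independence assumption and the single scalar threshold are simplifications; the point is only that a quantitative dependence estimate makes the speculate/serialize decision computable at compile time.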
IEEE Transactions on Parallel and Distributed Systems | 2004
Peng-Sheng Chen; Yuan-Shin Hwang; Roy Dz-Ching Ju; Jenq Kuen Lee
When performing aggressive optimizations and parallelization to exploit features of advanced architectures, optimizing and parallelizing compilers need to quantitatively assess the profitability of any transformation in order to achieve high performance. Useful optimizations and parallelization can be performed if it is known that certain points-to relationships hold with high or low probability. For instance, if the probabilities are low, a compiler could transform programs to perform data speculation or partition iterations into threads for speculative multithreading, or it could avoid conducting code specialization. Consequently, it is essential for compilers to incorporate pointer analysis techniques that can estimate the probability that each points-to relationship holds during execution. However, conventional pointer analysis techniques do not provide such quantitative descriptions and thus hinder compilers from more aggressive optimizations, such as thread partitioning in speculative multithreading, data speculation, and code specialization. We address this issue by proposing a probabilistic points-to analysis technique that computes the probability of every points-to relationship at each program point. A context-sensitive interprocedural algorithm has been implemented on the iterative data-flow analysis framework and incorporated into SUIF and MachSUIF. Experimental results show that this technique can estimate the probabilities of points-to relationships in benchmark programs with reasonably small errors, about 4.6 percent on average. The current implementation cannot yet disambiguate heap and array elements; the errors should be further reduced once techniques for disambiguating them are incorporated.
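One core ingredient of a probabilistic points-to analysis of this kind is merging facts at control-flow joins weighted by edge probabilities. The sketch below illustrates that single step under simplified assumptions (one pointer, known branch probability); it is an illustration of the general idea, not the paper's algorithm.

```python
# Illustrative merge of probabilistic points-to facts at an if/else join.
# Each branch carries a map {pointee: probability that the pointer points
# to it on that branch}; the join weights the maps by the probability of
# taking each branch.
def join(facts_then, facts_else, p_then):
    """Merge two points-to maps for one pointer at a two-way join."""
    p_else = 1.0 - p_then
    merged = {}
    for pointee in set(facts_then) | set(facts_else):
        merged[pointee] = (p_then * facts_then.get(pointee, 0.0)
                           + p_else * facts_else.get(pointee, 0.0))
    return merged

# p = &a on the then-branch, p = &b on the else-branch; if the then-branch
# is taken 90% of the time, (p -> a) holds with probability 0.9 after the join.
after = join({"a": 1.0}, {"b": 1.0}, 0.9)
```

A conventional may-alias analysis would only report {a, b} here; the probabilities 0.9/0.1 are the extra information that lets a compiler judge whether, say, data speculation on `*p` is worthwhile.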
IEEE Parallel & Distributed Technology: Systems & Applications | 1995
Ravi Ponnusamy; Yuan-Shin Hwang; Raja Das; Joel H. Saltz; Alok N. Choudhary; Geoffrey C. Fox
Languages such as Fortran D provide irregular distribution schemes that can efficiently support irregular problems. Irregular distributions can also be emulated in HPF. Compilers can incorporate runtime procedures to automatically support these distributions.
software product lines | 1993
Raja Das; Yuan-Shin Hwang; Mustafa Uysal; Joel H. Saltz; Alan Sussman
This paper describes a number of optimizations that can be used to support the efficient execution of irregular problems on distributed memory parallel machines. We describe software primitives that (1) coordinate interprocessor data movement, (2) manage the storage of, and access to, copies of off-processor data, (3) minimize interprocessor communication requirements and (4) support a shared name space. The performance of the primitives is characterized by examination of kernels from real applications and from a full implementation of a large unstructured adaptive application (the molecular dynamics code CHARMM).
languages and compilers for parallel computing | 2001
Yuan-Shin Hwang; Peng-Sheng Chen; Jenq Kuen Lee; Roy Dz-Ching Ju
Information gathered by existing pointer analysis techniques can be classified as must aliases or definitely-points-to relationships, which hold for all executions, and may aliases or possibly-points-to relationships, which might hold for some executions. Such information provides no quantitative measure of how likely each relationship is to hold at run time, which modern compiler optimizations need, and it has thus hindered compilers from more aggressive optimizations. This paper addresses this issue by proposing a probabilistic points-to analysis technique that computes the probability of each points-to relationship. Initial experiments were performed by incorporating the probabilistic data-flow analysis algorithm into SUIF and MachSUIF, and preliminary results show the probability distributions of points-to relationships in several benchmark programs. This work presents a major enhancement that allows pointer analysis to keep up with modern compiler optimizations.
Advances in Computers | 1997
Joel H. Saltz; Chialin Chang; Guy Edjlali; Yuan-Shin Hwang; Bongki Moon; Ravi Ponnusamy; Shamik D. Sharma; Alan Sussman; Mustafa Uysal; Gagan Agrawal; Raja Das; Paul Havlak
In this chapter, we present a summary of the runtime support, compiler, and tools development efforts in the CHAOS group at the University of Maryland. The principal focus of the CHAOS group's research has been to develop tools, compiler runtime support, and compilation techniques to help scientists and engineers develop high-speed parallel implementations of codes for irregular scientific problems (i.e., problems that are unstructured, sparse, adaptive, or block structured). We have developed a series of runtime support libraries (CHAOS, CHAOS++) that carry out the preprocessing and data movement needed to efficiently implement irregular and block-structured scientific algorithms on distributed memory machines and networks of workstations. Our compilation research has played a major role in demonstrating that it is possible to develop data parallel compilers able to make effective use of a wide variety of runtime optimizations. We have also been exploring ways to support interoperability between sequential and parallel programs written in different languages and programming paradigms.