Sheng-De Wang
National Taiwan University
Publication
Featured research published by Sheng-De Wang.
International Conference on Parallel and Distributed Systems | 2005
Sheng-De Wang; I-Tar Hsu; Zheng Yi Huang
In this paper, we propose an adaptive and dynamic scheduling method, called most fit task first (MFTF), for a class of computational grids, which are characterized by heterogeneous computing nodes and dynamic task arrivals. Some existing static scheduling methods assume that tasks arrive statically and may not perform well in the case of dynamic task arrivals. Our method can get stable task execution times whether tasks arrive statically or dynamically. We compare the task execution time with other methods to show the performance of the scheduling method.
IEEE Transactions on Parallel and Distributed Systems | 2000
Ming-Jer Tsai; Sheng-De Wang
Message routing provides internode communication in parallel computers. A reliable routing algorithm should be deadlock-free and fault-tolerant. While many routing algorithms are able to tolerate a large number of faults enclosed by rectangular faulty blocks, there is no existing algorithm that is capable of handling irregular faulty patterns for wormhole networks. In this paper, a two-staged adaptive and deadlock-free routing algorithm called Routing for Irregular Faulty Patterns (RIFP) is proposed. It can tolerate irregular faulty patterns by transmitting messages from sources or to destinations within faulty blocks via multiple intermediate nodes. A method employed by RIFP is first introduced to generate intermediate nodes using the local failure information. With its aid, two communicating nodes can always exchange their data or intermediate results if there is at least one path between them. RIFP needs two virtual channels per physical link in meshes.
IEEE Transactions on Parallel and Distributed Systems | 1992
Chien-Min Wang; Sheng-De Wang
An important issue for the efficient use of multiprocessor systems is the assignment of parallel processors to nested parallel loops. It is desirable for a processor assignment algorithm to be fast and always generate an optimal processor assignment. The paper proposes two efficient algorithms to decide the optimal number of processors assigned to each individual loop. Efficient parallel counterparts of these two algorithms are also presented. These algorithms not only always generate an optimal processor assignment, but also are much faster than the existing optimal algorithm in the literature. The paper discusses improving the performance of parallel execution by transforming a nested parallel loop into a semantically equivalent one. Three loop transformations are investigated. It is observed that, in most cases, the parallel execution time is improved after applying these transformations.
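To make the processor assignment problem concrete, here is a minimal Python sketch for a two-level nest. It is an exhaustive search, not either of the paper's algorithms, and it rests on an assumed cost model that the abstract does not state: every iteration takes one time unit, there is no scheduling overhead, and a loop of `n` iterations run on `p` processors takes `ceil(n / p)` steps. The function name `best_assignment` is hypothetical.

```python
def ceil_div(a, b):
    """Integer ceiling of a / b for positive integers."""
    return -(-a // b)


def best_assignment(n_outer, n_inner, processors):
    """Return (time, p_outer, p_inner) minimizing the assumed cost model.

    Assumption (not from the paper): the nest takes
    ceil(n_outer / p_outer) * ceil(n_inner / p_inner) time units,
    subject to p_outer * p_inner <= processors.
    """
    best = None
    for p_outer in range(1, processors + 1):
        # For a fixed p_outer, using as many inner processors as allowed
        # can never increase the time under this cost model.
        p_inner = processors // p_outer
        time = ceil_div(n_outer, p_outer) * ceil_div(n_inner, p_inner)
        candidate = (time, p_outer, p_inner)
        if best is None or candidate < best:
            best = candidate
    return best


# 4 outer iterations, 6 inner iterations, 8 processors.
print(best_assignment(4, 6, 8))
```

Under this model, giving all 8 processors to the inner loop takes 4 time units, while splitting them 4 by 2 takes 3, which is why the per-loop split matters.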
Journal of Parallel and Distributed Computing | 1996
Yeong-Sheng Chen; Sheng-De Wang; Chien-Min Wang
In this paper, an approach to tiling nested loops for maximizing parallelism is proposed. The proposed method aims at aggregating independent computations of a loop nest into rectangular blocks and maximizing the block sizes for maximizing parallelism. At first, all the independent computations that can be executed in the first time unit are identified. These computations are called the initially independent computations. Then it is shown that all of them can be collected as a union of rectangular blocks. So, based on these, the entire iteration space of the loops is partitioned into rectangular blocks for maximizing parallelism. The proposed method is formulated as systematic procedures which can easily be implemented in a parallelizing compiler. It is shown that when the wavefront transformation is combined with the proposed method, the loops can always be tiled so that the tile size is greater than one. In comparison with previous work on tiling, the proposed method is shown to have several advantages as summarized in the conclusions of this paper.
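The wavefront transformation mentioned above can be illustrated with a small Python sketch. It assumes a hypothetical two-level loop nest in which iteration `(i, j)` depends only on `(i - 1, j)` and `(i, j - 1)`, so all iterations with the same value of `i + j` are mutually independent. This shows only the wavefront idea, not the paper's block-partitioning procedure.

```python
def wavefronts(n_rows, n_cols):
    """Group the iterations of an n_rows x n_cols loop nest by wavefront.

    Assumption: dependence vectors are (1, 0) and (0, 1), so iterations
    on the same anti-diagonal i + j = t can run in the same time step.
    """
    fronts = [[] for _ in range(n_rows + n_cols - 1)]
    for i in range(n_rows):
        for j in range(n_cols):
            fronts[i + j].append((i, j))
    return fronts


# Each inner list is one time step of independent iterations.
for step, front in enumerate(wavefronts(2, 3)):
    print(step, front)
```

In this example the first time step contains a single iteration, while the later steps contain up to two iterations that could run in parallel.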
IEEE Transactions on Parallel and Distributed Systems | 1995
Isaac Yi-Yuan Lee; Sheng-De Wang
This paper reviews a 1-fault-tolerant (1-ft) hypercube model with degree 2r: the ring-connected network (RCN), which has the lowest degree among all 1-ft, one-spare node, r-dimensional hypercube architectures yet discovered. Then, we propose a constant-time reconfiguration algorithm via an add-and-modulo automorphism. Furthermore, by introducing the equivalence from hypercubes to cube-connected cycles (CCCs) and to butterflies (BFs), we find that there is also a corresponding equivalence from RCNs to cubical ring-connected cycles (CRCCs) and to dynamic redundancy networks (DRNs). From this fact, we find that once a symmetric fault-tolerant structure has been discovered for one of the three models, then it can be applied directly to the other hypercubic networks. Applying the technique, we find a degree-6, 1-ft Benes network. We think that more attention should be paid to the strong relationship between hypercubes, CCCs and BFs. Finally, from this equivalence relationship we propose three new bounded-degree k-ft models: k-ft CCCs, k-ft BFs and k-ft Benes networks.
International Conference on Parallel and Distributed Systems | 2001
Sheng-De Wang; Yuhder Lin
Java has been a very important programming language, especially with its cross-platform characteristics, but the CLASS file format defined in the Java Virtual Machine (JVM) specification contains many redundancies and replications of information. These redundancies mostly come from the constant pool of a CLASS file. We propose a compact binary file format, called Jato, and its associated archive format, called Jatar, for the Java system. Using these two formats, many of the redundancies can be removed. We did not use any text compression technique in the proposed formats, so they do not sacrifice loading speed and are thus very suitable for use in embedded environments. We have also implemented a class loader that is capable of loading the Jato files into a regular JVM. Using this approach, we show that the Jato file format is effective and promising, while still keeping the cross-platform features of Java.
Parallel Computing | 2000
Pao Hwa Sui; Sheng-De Wang
We investigate fault-tolerant routing schemes which aim at using a low number of virtual channels in wormhole-routed mesh networks. The faults under consideration are rectangular block faults, which are suitable for modeling faults on board level in networks with grid structures. There is no restriction on the number of faults. The concepts of f-ring and f-chain are used in our scheme. Messages are routed minimally when not blocked by faults and are routed along the boundaries of the faults encountered. Only three virtual channels and local knowledge of faults are required for our routing scheme to be correct, deadlock-free, and livelock-free. By allocating virtual channels to messages carefully, all virtual channels have the potential to be used by messages; hence, none of the virtual channels and their associated hardware is wasted.
Parallel Computing | 1990
Chien-Min Wang; Sheng-De Wang
In this paper, the problem of partitioning parallel programs for execution on multiprocessors is investigated. By assuming run-time scheduling approaches, the static partitioning problems are formulated and solved in the context of a structured program representation model, in which a program is assumed to be composed of parallel loops. The structured partitioning problem is then defined as the problem of partitioning each parallel loop into an appropriate number of tasks such that the program execution time is minimized in some sense. The worst case of the program execution cost is adopted as the minimization criterion, so the results obtained in this paper are guaranteed to be within some performance bound. Two algorithms are developed to solve this problem. The first algorithm, using a prune-and-search technique, can find the optimal partition, while the second algorithm can obtain a near-optimal partition for the simplified partition problem in linear time. It is proved that the cost of the partition generated by the linear-time algorithm is at most twice the cost of the optimal partition.
IEEE Transactions on Parallel and Distributed Systems | 1998
Ming-Jer Tsai; Sheng-De Wang
Unicast V is a progressive, misrouting algorithm for packet or virtual cut-through networks. A progressive protocol forwards a message at an intermediate node if a nonfaulty profitable link is available and waits, deroutes, or aborts otherwise. A misrouting protocol uses both profitable and nonprofitable links at each node; thus, a message can move farther away from its destination at some steps. Unicast V is simple for hardware implementation, requires a very small message overhead, and makes routing decisions by local failure information only. However, it is claimed to be partially adaptive and to be able to tolerate static faults in hypercubes only. In this paper, we uncover some new features of Unicast V: (1) it is fully adaptive, (2) it also applies to meshes and tori, and (3) it can tolerate dynamic faults by careful implementation. We also provide bounds on the performance of the algorithm.
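The definitions of "progressive" and "misrouting" above can be sketched as a single routing decision in a hypercube. The Python below is an illustration of those two definitions only, not Unicast V itself: the function name `next_hop`, the representation of faults as `(node, dimension)` pairs, and the lowest-dimension-first tie-breaking are all assumptions made for the example.

```python
def next_hop(current, dest, dims, faulty):
    """Pick the next node for a message at `current` heading to `dest`.

    Nodes are integers whose bits are hypercube coordinates. `faulty` is
    a set of (node, dimension) pairs marking outgoing links known to be
    faulty at that node (local failure information only).
    Returns the neighbor to forward to, or None if every link is faulty.
    """
    diff = current ^ dest
    profitable = [d for d in range(dims) if diff >> d & 1]
    nonprofitable = [d for d in range(dims) if not diff >> d & 1]

    # Progressive step: prefer a nonfaulty link that reduces the distance.
    for d in profitable:
        if (current, d) not in faulty:
            return current ^ (1 << d)
    # Misrouting step: otherwise take a nonfaulty link that moves away.
    for d in nonprofitable:
        if (current, d) not in faulty:
            return current ^ (1 << d)
    return None  # nothing usable: the message must wait or abort


# Route from node 000 to node 011 in a 3-cube with link (000, dim 0) faulty.
print(next_hop(0b000, 0b011, 3, {(0b000, 0)}))
```

A real protocol also needs a rule that prevents a misrouted message from bouncing back and forth indefinitely; that bookkeeping is omitted here.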
International Symposium on Microarchitecture | 1994
Isaac Yi-Yuan Lee; Sheng-De Wang
Our work in deriving and comparing the reliability formulas for three leading hypercubic models (the original hypercube, Shih's cube, and the ring-connected hypercube) demonstrates the superiority of the ring-connected hypercube (RCH) approach. Though it needs twice the links of the original hypercube network, which is the best result to date, the RCH can recover from node failures. The higher resultant reliability of this fault-tolerant architecture makes the RCH an attractive candidate for many critical parallel-computation applications.
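As a rough illustration of why a spare node raises reliability, the Python sketch below compares two textbook expressions. These are not the formulas derived in the paper. They assume independent node failures, perfect links, a node reliability `p`, a plain hypercube that works only if all `2**n` nodes work, and a one-spare system that works if at most one of its `2**n + 1` nodes has failed.

```python
def plain_hypercube_reliability(n, p):
    """All 2**n nodes must work (assumed model, links never fail)."""
    return p ** (2 ** n)


def one_spare_reliability(n, p):
    """At most one of the 2**n + 1 nodes may fail (assumed model)."""
    total = 2 ** n + 1
    all_work = p ** total
    exactly_one_fails = total * p ** (total - 1) * (1 - p)
    return all_work + exactly_one_fails


# Compare both models for a 4-dimensional cube with node reliability 0.99.
print(plain_hypercube_reliability(4, 0.99))
print(one_spare_reliability(4, 0.99))
```

Under these assumptions the one-spare value is always at least as large as the plain value for the same `n` and `p`, because it adds the probability of the single-failure cases that the spare can absorb.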