Yeh-Ching Chung
National Tsing Hua University
Publication
Featured research published by Yeh-Ching Chung.
conference on high performance computing (supercomputing) | 1992
Yeh-Ching Chung; Sanjay Ranka
The authors discuss applications of BTDH (bottom-up top-down duplication heuristic) to list scheduling algorithms (LSAs). There are two ways to use BTDH for LSAs. BTDH can be used with an LSA to form a new scheduling algorithm (LSA/BTDH), and it can be used as a pure optimization algorithm for an LSA (LSA-BTDH). BTDH has been applied with two well-known LSAs: the highest level first with estimated time (HLFET) and the earliest task first (ETF) heuristics. Simulation results show that, given a directed acyclic graph (DAG), the graph parallelism of the DAG can accurately predict the number of processors to be used such that a good schedule length and a good resource utilization (or efficiency) can be achieved simultaneously. In terms of speedups, LSA/BTDH ≥ LSA-BTDH ≥ ETF ≥ HLFET. Experimental results of scheduling FFT programs, which are written in a single program multiple data (SPMD) programming approach, on NCUBE-2 are also presented. The results confirm the simulation results and show that the speedups of LSA/BTDH and LSA-BTDH are better than the speedups of LSAs.
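As a rough illustration of the baseline that BTDH is applied to, the sketch below shows an HLFET-style list scheduler. It does not implement BTDH or task duplication. It assumes zero communication cost between tasks, and the names `hlfet_schedule`, `cost`, and `succ` are hypothetical, chosen only for this example.

```python
from functools import lru_cache

def hlfet_schedule(cost, succ, num_procs):
    """HLFET-style list scheduling sketch (communication costs assumed zero).

    cost: dict mapping task -> execution time
    succ: dict mapping task -> list of successor tasks in the DAG
    Returns (schedule, makespan), where schedule maps task -> (processor, start).
    """
    # Build predecessor lists from the successor lists.
    pred = {t: [] for t in cost}
    for t, children in succ.items():
        for c in children:
            pred[c].append(t)

    @lru_cache(maxsize=None)
    def level(t):
        # Static level: longest execution-time path from t to an exit task.
        return cost[t] + max((level(c) for c in succ.get(t, [])), default=0)

    proc_free = [0] * num_procs   # time at which each processor becomes idle
    finish = {}                   # task -> finish time
    schedule = {}
    remaining = set(cost)
    while remaining:
        # A task is ready once all of its predecessors have been scheduled.
        ready = [t for t in remaining if all(p in finish for p in pred[t])]
        # Highest level first; ties broken by task name for determinism.
        task = min(ready, key=lambda t: (-level(t), t))
        data_ready = max((finish[p] for p in pred[task]), default=0)
        # Pick the processor that allows the earliest start time.
        proc = min(range(num_procs), key=lambda p: max(proc_free[p], data_ready))
        start = max(proc_free[proc], data_ready)
        finish[task] = start + cost[task]
        proc_free[proc] = finish[task]
        schedule[task] = (proc, start)
        remaining.remove(task)
    return schedule, max(finish.values())

# Hypothetical four-task diamond DAG: A -> {B, C} -> D.
cost = {"A": 2, "B": 3, "C": 1, "D": 2}
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"]}
schedule, makespan = hlfet_schedule(cost, succ, num_procs=2)
print(schedule, makespan)
```

On this small DAG the critical path A, B, D has length 7, so two processors already reach the shortest possible schedule, while a single processor needs 8 time units.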
Computer Communications | 2007
Chun-Hsien Wu; Kuo-Chuan Lee; Yeh-Ching Chung
To obtain satisfactory performance from a wireless sensor network, a sensor deployment method that adapts to various applications is essential. In this paper, we propose a centralized and deterministic sensor deployment method, DT-Score (Delaunay Triangulation-Score), that aims to maximize the coverage of a given sensing area with obstacles. DT-Score consists of two phases. In the first phase, we use a contour-based deployment to eliminate the coverage holes near the boundary of the sensing area and obstacles. In the second phase, a deployment method based on the Delaunay triangulation is applied to the uncovered regions. Before deploying a sensor, each candidate position generated from the current sensor configuration is scored by a probabilistic sensor detection model. A new sensor is placed at the position with the greatest coverage gain. According to the simulation results, DT-Score reaches higher coverage than grid-based and random deployment methods as the number of deployable sensors increases.
international parallel and distributed processing symposium | 2004
Xuan-Yi Lin; Yeh-Ching Chung; Tai-Yi Huang
Summary form only given. In a cluster system, the performance of the interconnection network greatly affects the aggregate computation power of the interconnected processing nodes. The network architecture, the interconnection topology, and the routing scheme are the three key elements that dominate the performance of an interconnection network. InfiniBand architecture (IBA) is a new industry standard architecture. It defines a high-bandwidth, high-speed, and low-latency message switching network that is well suited to constructing high-speed interconnection networks for cluster systems. Fat-trees are widely adopted as interconnection network topologies because of their many desirable properties. We proposed an m-port n-tree approach to construct fat-tree-based InfiniBand networks. Based on the constructed fat-tree-based InfiniBand networks, we proposed an efficient multiple LID (MLID) routing scheme. The proposed routing scheme is composed of a processing node addressing scheme, a path selection scheme, and a forwarding table assignment scheme. To evaluate the performance of the proposed routing scheme, we have developed a software simulator for InfiniBand networks. The simulation results show that the proposed routing scheme runs well on the constructed fat-tree-based InfiniBand networks and efficiently utilizes the bandwidth and the multiple paths that the fat-tree topology offers under the InfiniBand architecture.
Information Sciences | 2014
Ching-Hsien Hsu; Kenn Slagter; Shih-Chang Chen; Yeh-Ching Chung
Task consolidation is a way to maximize utilization of cloud computing resources. Maximizing resource utilization provides various benefits such as the rationalization of maintenance, IT service customization, QoS and reliable services, etc. However, maximizing resource utilization does not mean efficient energy use. Much of the literature shows that energy consumption and resource utilization in clouds are highly coupled. Consequently, some of the literature aims to decrease resource utilization in order to save energy, while others try to reach a balance between resource utilization and energy consumption. In this paper, we present an energy-aware task consolidation (ETC) technique that minimizes energy consumption. ETC achieves this by restricting CPU use below a specified peak threshold and by consolidating tasks amongst virtual clusters. In addition, the energy cost model considers network latency when a task migrates to another virtual cluster. To evaluate the performance of ETC, we compare it against MaxUtil, a recently developed greedy algorithm that aims to maximize the utilization of cloud computing resources. The simulation results show that ETC can significantly reduce power consumption in a cloud system, with a 17% improvement over MaxUtil.
grid and pervasive computing | 2007
Chun-Hsien Wu; Yeh-Ching Chung
A heterogeneous wireless sensor network (heterogeneous WSN) consists of sensor nodes with different abilities, such as different computing power and sensing range. Compared with a homogeneous WSN, deployment and topology control are more complex in a heterogeneous WSN. In this paper, a deployment and topology control method is presented for heterogeneous sensor nodes with different communication and sensing ranges. It is based on the irregular sensor model used to approximate the behavior of sensor nodes. In addition, a cost model is proposed to evaluate the deployment cost of a heterogeneous WSN. According to the experimental results, the proposed method achieves a higher coverage rate and a lower deployment cost for the same number of deployable sensor nodes.
international conference on parallel and distributed systems | 2006
Chun-Hsien Wu; Kuo-Chuan Lee; Yeh-Ching Chung
To obtain satisfactory performance from a wireless sensor network, a sensor deployment method that adapts to various applications is essential. In this paper, we propose a centralized sensor deployment method, DT-Score, that aims to maximize the coverage of a given sensing area with obstacles. DT-Score consists of two phases. In the first phase, we use a contour-based deployment to eliminate the coverage holes near the boundary of the sensing area and obstacles. In the second phase, a deployment method based on the Delaunay triangulation is applied to the uncovered regions. Before deploying a sensor, each candidate position generated from the current sensor configuration is scored by a probabilistic sensor detection model. A new sensor is placed at the position with the greatest coverage gain. According to the simulation results, DT-Score reaches higher coverage than grid-based and random deployment methods as the number of deployable sensors increases.
IEEE Transactions on Computers | 2002
Chun-Yuan Lin; Jen-Shiuh Liu; Yeh-Ching Chung
Array operations are used in a large number of important scientific codes. To implement these array operations efficiently, many methods have been proposed in the literature, most of which are focused on two-dimensional arrays. When extended to higher dimensional arrays, these methods usually do not perform well. Hence, designing efficient algorithms for multidimensional array operations becomes an important issue. We propose a new scheme, extended Karnaugh map representation (EKMR), for the multidimensional array representation. The main idea of the EKMR scheme is to represent a multidimensional array by a set of two-dimensional arrays. Hence, efficient algorithm design for multidimensional array operations becomes less complicated. To evaluate the proposed scheme, we design efficient algorithms for multidimensional array operations, matrix-matrix addition/subtraction and matrix-matrix multiplications, based on the EKMR and the traditional matrix representation (TMR) schemes. Theoretical and experimental tests for these array operations were conducted. In the experimental test, we compare the performance of intrinsic functions provided by the Fortran 90 compiler with those based on the EKMR scheme. The experimental results show that the algorithms based on the EKMR scheme outperform those based on the TMR scheme and those provided by the Fortran 90 compiler.
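To make the idea of representing a multidimensional array by two-dimensional arrays concrete, the sketch below flattens a three-dimensional array into a single two-dimensional one. The index mapping used here, a[k][i][j] to b[i][j*K + k], is an assumption made for illustration, since the abstract does not state the mapping. The function names `to_ekmr3` and `add_2d` are hypothetical.

```python
def to_ekmr3(a):
    """Flatten a 3-D nested list a[k][i][j] into a 2-D list b[i][j*K + k].

    The mapping is an illustrative assumption, not taken from the abstract.
    """
    K, I, J = len(a), len(a[0]), len(a[0][0])
    b = [[0] * (J * K) for _ in range(I)]
    for k in range(K):
        for i in range(I):
            for j in range(J):
                b[i][j * K + k] = a[k][i][j]
    return b

def add_2d(x, y):
    """Element-wise addition of two 2-D lists of the same shape."""
    return [[p + q for p, q in zip(row_x, row_y)] for row_x, row_y in zip(x, y)]

# A 2 x 2 x 3 example whose values encode their own indices (k, i, j).
a = [[[100 * k + 10 * i + j for j in range(3)] for i in range(2)] for k in range(2)]
b = to_ekmr3(a)          # shape 2 x 6
total = add_2d(b, b)     # addition now needs only a two-level loop
print(b)
```

Once both operands are in the flattened form, an operation such as addition runs as an ordinary two-dimensional loop, which is the simplification the abstract describes.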
Computer Networks | 2007
Jen-Shiuh Liu; Zhi-Jian Lee; Yeh-Ching Chung
Recently, denial-of-service (DoS) attacks have become a pressing problem due to the lack of an efficient method to locate the real attackers and the ease of launching an attack with source code readily available on the Internet. Traceback is a subtle scheme to tackle DoS attacks. Probabilistic packet marking (PPM) is a new way for practical IP traceback. Although PPM enables a victim to pinpoint the attacker's origin to within 2-5 equally possible sites, it has been shown that PPM suffers from uncertainty under spoofed marking attacks. Furthermore, the uncertainty factor can be amplified significantly under distributed DoS attacks, which may diminish the effectiveness of PPM. In this work, we present a new approach, called dynamic probabilistic packet marking (DPPM), to further improve the effectiveness of PPM. Instead of using a fixed marking probability, we propose to deduce the traveling distance of a packet and then choose a proper marking probability. DPPM may completely remove uncertainty and enable victims to precisely pinpoint the attacking origin even under spoofed marking DoS attacks. DPPM supports incremental deployment. Formal analysis indicates that DPPM outperforms PPM in most aspects.
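The effect of a distance-dependent marking probability can be checked with a small calculation. The sketch below assumes that a router at distance d from the packet's source marks with probability 1/d. The abstract does not state the formula, so this choice is an assumption for illustration. The function name `survival_probabilities` and the fixed probability 1/25 are likewise hypothetical.

```python
from fractions import Fraction

def survival_probabilities(path_len, mark_prob):
    """Probability that the victim receives the mark written by each router.

    Router i (1-based distance from the packet's source) marks with probability
    mark_prob(i); its mark survives only if no later router overwrites it.
    """
    probs = []
    for i in range(1, path_len + 1):
        p = mark_prob(i)
        for j in range(i + 1, path_len + 1):
            p *= 1 - mark_prob(j)
        probs.append(p)
    return probs

# Fixed marking probability (hypothetical value) versus the assumed 1/d rule.
fixed = survival_probabilities(5, lambda d: Fraction(1, 25))
dynamic = survival_probabilities(5, lambda d: Fraction(1, d))
print(fixed)
print(dynamic)
```

Under the 1/d assumption the product telescopes, so every router on a path of length D is seen with the same probability 1/D. The first router marks with probability 1, so a mark forged by the sender is always overwritten. With a fixed probability, routers far from the victim are seen less often and a forged mark can survive the whole path.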
symposium on code generation and optimization | 2012
Ding-Yong Hong; Chun-Chen Hsu; Pen-Chung Yew; Jan-Jan Wu; Wei-Chung Hsu; Pangfeng Liu; Chien-Min Wang; Yeh-Ching Chung
Dynamic binary translation (DBT) is a core technology for many important applications such as system virtualization, dynamic binary instrumentation and security. However, there are several factors that often impede its performance: (1) emulation overhead before translation; (2) translation and optimization overhead; and (3) translated code quality. For the dynamic binary translator itself, the issues also include its retargetability to support guest applications from different instruction-set architectures (ISAs) on host machines that also have different ISAs, an important feature for system virtualization. In this work, we take advantage of the ubiquitous multicore platforms, using a multithreaded approach to implement DBT. By running the translators and the dynamic binary optimizers on different threads on different cores, we can off-load the overhead that DBT imposes on the target applications, and thus afford more sophisticated optimization techniques in DBT as well as support for its retargetability. Using QEMU (a popular retargetable DBT for system virtualization) and LLVM (Low Level Virtual Machine) as our building blocks, we demonstrated with a multithreaded DBT prototype, called HQEMU, that it could improve QEMU performance by a factor of 2.4X and 4X on the SPEC 2006 integer and floating point benchmarks, respectively, for x86 to x86-64 emulation. That is, it is only 2.5X and 2.1X slower than native execution of the same benchmarks on x86-64, as opposed to the 6X and 8.4X slowdown on QEMU. For ARM to x86-64 emulation, HQEMU could gain a factor of 2.4X speedup over QEMU for the SPEC 2006 integer benchmarks.
IEEE Transactions on Computers | 2003
Chun-Yuan Lin; Yeh-Ching Chung; Jen-Shiuh Liu
We have proposed the extended Karnaugh map representation (EKMR) scheme for multidimensional array representation. We propose two data compression schemes, EKMR compressed row/column storage (ECRS/ECCS), for multidimensional sparse arrays based on the EKMR scheme. To evaluate the proposed schemes, we compare them to the CRS/CCS schemes. Both theoretical analysis and experimental tests were conducted. In the theoretical analysis, we analyze the CRS/CCS and the ECRS/ECCS schemes in terms of the time complexity, the space complexity, and the range of their usability for practical applications. In experimental tests, we compare the compressing time of sparse arrays and the execution time of matrix-matrix addition and matrix-matrix multiplication based on the CRS/CCS and the ECRS/ECCS schemes. The theoretical analysis and experimental results show that the ECRS/ECCS schemes are superior to the CRS/CCS schemes for all the evaluated criteria, except the space complexity in some cases.
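For readers unfamiliar with the baseline, the sketch below shows standard compressed row storage (CRS) for a two-dimensional sparse matrix. It does not implement ECRS/ECCS. Since the EKMR scheme represents a multidimensional array as two-dimensional arrays, the same compression could plausibly be applied to each of them, but that step is an assumption and is not shown. The function names `to_crs` and `crs_get` are hypothetical.

```python
def to_crs(matrix):
    """Compress a dense 2-D list into CRS form: (values, col_idx, row_ptr)."""
    values, col_idx, row_ptr = [], [], [0]
    for row in matrix:
        for j, v in enumerate(row):
            if v != 0:
                values.append(v)    # nonzero value
                col_idx.append(j)   # its column index
        row_ptr.append(len(values)) # where the next row starts in values
    return values, col_idx, row_ptr

def crs_get(crs, i, j):
    """Look up element (i, j) in a CRS-compressed matrix."""
    values, col_idx, row_ptr = crs
    for pos in range(row_ptr[i], row_ptr[i + 1]):
        if col_idx[pos] == j:
            return values[pos]
    return 0

dense = [[0, 5, 0],
         [0, 0, 0],
         [7, 0, 9]]
crs = to_crs(dense)
print(crs)
```

Only the nonzero values and their column indices are stored, plus one row pointer per row, which is why the space cost depends on the number of nonzero elements and the number of rows.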