Che-Rung Lee | Researchain

Publication

Featured research published by Che-Rung Lee.

Physical Review B | 2009

Quantum Monte Carlo study of the two-dimensional fermion Hubbard model

C. N. Varney; Che-Rung Lee; Zhaojun Bai; Simone Chiesa; Mark Jarrell; R. T. Scalettar

We report large scale determinant quantum Monte Carlo calculations of the effective bandwidth, momentum distribution, and magnetic correlations of the square lattice fermion Hubbard Hamiltonian at half-filling. The sharp Fermi surface of the noninteracting limit is significantly broadened by the electronic correlations but retains signatures of the approach to the edges of the first Brillouin zone as the density increases. Finite-size scaling of simulations on large lattices allows us to extract the interaction dependence of the antiferromagnetic order parameter, exhibiting its evolution from weak-coupling to the strong-coupling Heisenberg limit. Our lattices provide improved resolution of the Green’s function in momentum space, allowing a more quantitative comparison with time-of-flight optical lattice experiments.

International Conference on Computer-Aided Design | 2011

On the preconditioner of conjugate gradient method: a power grid simulation perspective

Chung-Han Chou; Nien-Yu Tsai; Hao Yu; Che-Rung Lee; Yiyu Shi; Shih-Chieh Chang

The Preconditioned Conjugate Gradient (PCG) method has been demonstrated to be effective in solving large-scale linear systems with sparse, symmetric positive definite matrices. One critical problem in PCG is to design a good preconditioner, which can significantly reduce the runtime while keeping memory usage efficient. Universal preconditioners are simple and easy to construct, but their effectiveness is highly problem-dependent. On the other hand, domain-specific preconditioners that exploit the underlying physical meaning of the matrices usually work better, but are difficult to design. In this paper, we study the problem in the context of power grid simulation, and develop a novel preconditioner based on the power grid structure through simple circuit simulations. Experimental results show a 43% reduction in the number of iterations and a 23% speedup over existing universal preconditioners.
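
For illustration only, the PCG iteration named in the abstract above can be sketched with a diagonal (Jacobi) preconditioner, which is one example of a simple universal preconditioner. This is not the power-grid-specific preconditioner developed in the paper; the function name `pcg_jacobi`, the dense list-of-lists matrix format, and the tolerance defaults are assumptions made for this sketch.

```python
def pcg_jacobi(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive definite A (dense list of lists).

    Uses the Jacobi preconditioner M = diag(A), an assumption of this sketch.
    Returns the solution estimate and the number of iterations performed.
    """
    n = len(b)

    def matvec(v):
        return [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]

    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]  # applying M^{-1} is a scaling

    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # first search direction
    rz = dot(r, z)

    for iteration in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # stop on small residual norm
            return x, iteration
        Ap = matvec(p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # keeps directions A-conjugate
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


if __name__ == "__main__":
    solution, iterations = pcg_jacobi([[4.0, 1.0], [1.0, 3.0]], [1.0, 2.0])
    print(solution, iterations)
```

A better preconditioner changes only the line that computes `z`; the abstract's point is that this choice largely determines how many iterations the loop needs.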

Parallel Processing Letters | 2011

Hierarchical Mapping for HPC Applications

I-Hsin Chung; Che-Rung Lee; Jiazheng Zhou; Yeh-Ching Chung

As high performance computing systems scale up, mapping the tasks of a parallel application onto physical processors to allow efficient communication becomes one of the critical performance issues. Existing algorithms were usually designed to map applications with regular communication patterns. Their mapping criteria usually overlook the size of communicated messages, which is the primary factor in communication time. In addition, most of their time complexities are too high to process large-scale problems. In this paper, we present a hierarchical mapping algorithm (HMA), which is capable of mapping applications with irregular communication patterns. It first partitions tasks according to their run-time communication information. Tasks that communicate with each other more frequently are regarded as strongly connected. According to their connectivity strength, the tasks are partitioned into super nodes using algorithms from spectral graph theory. The hierarchical partitioning reduces the complexity of the mapping algorithm to achieve scalability. Finally, the run-time communication information is used again in fine tuning to explore better mappings. In our experiments, the mapping algorithm reduces the point-to-point communication time by up to 20% for PDGEMM, a ScaLAPACK matrix multiplication computation kernel, and by up to 7% for AMG2006, a tier 1 application of the Sequoia benchmark.

ACM Transactions on Modeling and Computer Simulation | 2013

Optimizing Pairwise Box Intersection Checking on GPUs for Large-Scale Simulations

Shih-Hsiang Lo; Che-Rung Lee; I-Hsin Chung; Yeh-Ching Chung

Box intersection checking is a common task used in many large-scale simulations. Traditional methods cannot provide fast box intersection checking with large-scale datasets. This article presents a parallel algorithm to perform Pairwise Box Intersection checking on Graphics processing units (PBIG). The PBIG algorithm consists of three phases: planning, mapping and checking. The planning phase partitions the space into small cells, the sizes of which are determined to optimize performance. The mapping phase maps the boxes into the cells. The checking phase examines the box intersections in the same cell. Several performance optimizations, including load-balancing, output data compression/encoding, and pipelined execution, are presented for the PBIG algorithm. The experimental results show that the PBIG algorithm can process large-scale datasets and outperforms three well-performing algorithms.
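
As a rough illustration of the mapping and checking phases described above, the following sequential CPU sketch bins axis-aligned boxes into uniform grid cells and tests only pairs that share a cell. It is not the PBIG GPU implementation: the planning phase is reduced to a caller-supplied `cell_size`, and the names `boxes_overlap` and `grid_intersections` as well as the `(lo, hi)` corner representation are assumptions of this sketch.

```python
from collections import defaultdict
from itertools import combinations, product


def boxes_overlap(a, b):
    """Return True if two axis-aligned boxes, each given as (lo, hi), intersect."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    return all(a_lo[k] <= b_hi[k] and b_lo[k] <= a_hi[k] for k in range(3))


def grid_intersections(boxes, cell_size):
    """Report all intersecting index pairs (i, j) with i < j."""
    # Mapping phase: register each box in every grid cell it touches.
    cells = defaultdict(list)
    for index, (lo, hi) in enumerate(boxes):
        spans = [
            range(int(lo[k] // cell_size), int(hi[k] // cell_size) + 1)
            for k in range(3)
        ]
        for cell in product(*spans):
            cells[cell].append(index)

    # Checking phase: test only pairs that share a cell. A set removes
    # duplicates for pairs of boxes that share more than one cell.
    pairs = set()
    for members in cells.values():
        for i, j in combinations(members, 2):
            if boxes_overlap(boxes[i], boxes[j]):
                pairs.add((i, j))
    return sorted(pairs)


if __name__ == "__main__":
    demo = [
        ((0, 0, 0), (1, 1, 1)),
        ((0.5, 0.5, 0.5), (1.5, 1.5, 1.5)),
        ((5, 5, 5), (6, 6, 6)),
    ]
    print(grid_intersections(demo, cell_size=2))
```

The cell size is the main tuning knob: small cells cut the number of pairs tested per cell but make large boxes register in many cells, which is the trade-off the planning phase is said to optimize.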

IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2011

Hierarchical Mapping for HPC Applications

I-Hsin Chung; Che-Rung Lee; Jiazheng Zhou; Yeh-Ching Chung

Real-Time Systems Symposium | 2000

A fast algorithm for scheduling imprecise computations with timing constraints to minimize weighted error

Wei-Kuan Shih; Che-Rung Lee; Ching-Hui Tang

Scheduling tasks with different weights in the imprecise computation model is rather difficult. Each task in the imprecise computation model is logically decomposed into a mandatory subtask and an optional subtask. The mandatory subtask must be completely executed before a deadline to produce an acceptable result; the optional subtask begins after the mandatory subtask to refine the result. The error in the results of a task is measured by the processing time of the unexecuted portion of the optional subtask. This paper proposes a fast algorithm for scheduling imprecise computation with timing constraints on uniprocessor systems. The proposed algorithm can obtain the optimal schedule for different weighted tasks with time complexity O(n log^2 n).
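
To make the error measure in the abstract above concrete, the sketch below computes the total weighted error of a given schedule: each task contributes its weight times the unexecuted processing time of its optional subtask. It only evaluates the objective and does not implement the paper's scheduling algorithm; the `Task` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Task:
    weight: float                    # relative importance of the task's error
    mandatory: float                 # processing time of the mandatory subtask
    optional: float                  # processing time of the optional subtask
    executed_optional: float = 0.0   # optional time the schedule actually ran


def total_weighted_error(tasks):
    """Sum of weight * (unexecuted optional processing time) over all tasks."""
    total = 0.0
    for task in tasks:
        if not 0.0 <= task.executed_optional <= task.optional:
            raise ValueError("executed_optional must lie in [0, optional]")
        total += task.weight * (task.optional - task.executed_optional)
    return total


if __name__ == "__main__":
    schedule = [
        Task(weight=2.0, mandatory=1.0, optional=3.0, executed_optional=1.0),
        Task(weight=1.0, mandatory=2.0, optional=4.0, executed_optional=4.0),
    ]
    print(total_weighted_error(schedule))
```

An optimal schedule in this model is one that minimizes this quantity while still completing every mandatory subtask before its deadline.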

IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2013

Enhancing Accuracy and Performance of Collaborative Filtering Algorithm by Stochastic SVD and Its MapReduce Implementation

Che-Rung Lee; Ya-Fang Chang

Collaborative filtering algorithms that extract desired information from records have been widely used in data mining and information retrieval, for example in recommendation systems. However, rapidly increasing data sizes demand more efficient and scalable algorithms and implementations. In this paper, we present a novel algorithm that utilizes stochastic singular value decomposition (SSVD) in the calculation of item-based collaborative filtering. The use of SSVD not only provides more accurate results in terms of precision and recall, but also reduces the computational cost. The proposed algorithm was implemented using Hadoop MapReduce, which allows distributed processing of massive data stored in a distributed file system. The implementation was evaluated and compared with the recommendation systems provided in the Apache Mahout project, and a 2.53x speedup can be obtained for processing millions of records. The accuracy of our algorithm is also 3 times better than the non-SVD algorithm in terms of the F1 metric, a combined measure of precision and recall.
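
For readers unfamiliar with the accuracy measure cited above, here is a minimal sketch of precision, recall, and F1 (their harmonic mean) for a single recommendation list. The function name `precision_recall_f1` and the set-of-item-IDs input format are assumptions; the paper's exact evaluation protocol is not given in the abstract.

```python
def precision_recall_f1(recommended, relevant):
    """Compute precision, recall, and F1 for one user's recommendations.

    recommended: item IDs returned by the recommender
    relevant:    item IDs the user actually found relevant
    """
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)

    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    if hits == 0:
        return precision, recall, 0.0   # avoid dividing by zero below

    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    print(precision_recall_f1([1, 2, 3, 4], [2, 4, 5]))
```

Because F1 is a harmonic mean, it stays low unless precision and recall are both reasonably high, which is why it is commonly used to compare recommenders with a single number.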

IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing | 2011

A Parallel Rectangle Intersection Algorithm on GPU+CPU

Shih-Hsiang Lo; Che-Rung Lee; Yeh-Ching Chung; I-Hsin Chung

In this paper, we investigate efficient algorithms and implementations using GPU plus CPU to solve the rectangle intersection problem on a plane. The problem is to report all intersecting pairs of iso-oriented rectangles, whose parallelization on GPUs poses two major computational challenges: data partitioning and the massive output. The algorithm we present is called PRI-GC (Parallel Rectangle Intersection algorithm on GPU+CPU), which consists of two phases: mapping and intersection-checking. In the mapping phase, rectangles are hashed into different subspaces (called cells) to reduce unnecessary intersection checking for far-apart rectangles. In the intersection-checking phase, pairs of rectangles within the same cell are examined in parallel, and the intersecting pairs of rectangles are reported. Several optimization techniques, including rectangle re-ordering, output data compression/encoding, and overlapped execution of GPU and CPU, are applied to enhance the performance. We evaluated the performance of PRI-GC, and the results show over 30x speedup against two well-implemented sequential algorithms on a single CPU. The effectiveness of each optimization technique for this problem was evaluated as well. Several parameters, including different degrees of rectangle coverage, block sizes, and cell sizes, were also varied to explore their influence on the performance of PRI-GC.

IEEE International Conference on Cloud Engineering | 2014

Taiwan UniCloud: A Cloud Testbed with Collaborative Cloud Services

Wu-Chun Chung; Po Chi Shih; Kuan-Chou Lai; Kuan-Ching Li; Che-Rung Lee; Jerry Chou; Ching-Hsien Hsu; Yeh-Ching Chung

This paper introduces a prototype of Taiwan UniCloud, a community-driven hybrid cloud platform for academics in Taiwan. The goal is to leverage resources in multiple clouds among different organizations. Each self-managing cloud can join the UniCloud platform to share its resources and simultaneously benefit from other clouds with scale-out capabilities. Accordingly, resources are elastic and shareable, so that each cloud can accommodate unexpected resource demands. The proposed platform provides a web portal to operate each cloud via a uniform user interface. Virtual clusters with multi-core VMs can be constructed for parallel and distributed processing models. An object-based storage system is also provided to federate different storage providers. This paper not only presents the architectural design of Taiwan UniCloud, but also evaluates its performance to demonstrate the viability of the current implementation. Experimental results show the feasibility of the proposed platform as well as the benefit of the cloud federation.

International Symposium on Pervasive Systems, Algorithms, and Networks | 2012

EEGRA: Energy Efficient Geographic Routing Algorithms for Wireless Sensor Network

Tseng-Yi Chen; Hsin-Wen Wei; Che-Rung Lee; Fu-Nan Huang; Tsan-sheng Hsu; Wei-Kuan Shih

Energy efficiency is critical in wireless sensor networks (WSN) for system reliability and deployment cost. The power consumption of communication in a multi-hop WSN is primarily determined by three factors: routing distance, signal interference, and the computation cost of routing. Several routing algorithms designed for energy efficiency or interference avoidance have been proposed. However, they are either too complex to be useful in practice or specialized for certain WSN architectures. In this paper, we propose two energy efficient geographic routing algorithms (EEGRA) for wireless sensor networks, which are based on existing geographic routing algorithms and take all three factors into account. The first algorithm incorporates the interference into the routing cost function and uses it in the routing decision. The second algorithm transforms the problem into a constrained optimization problem and solves it by searching for the optimal discretized interference level. We integrate four geographic routing algorithms (GOAFR+, Face Routing, GPSR, and RandHT) into both EEGRA algorithms and compare them with three other routing methods in terms of power consumption and computation cost for grid and irregular sensor topologies. The results of our experiments show that both algorithms reduce sensor routing energy by 30% to 50% compared to general geographic routing algorithms. In addition, the time complexity of the EEGRA algorithms is similar to that of the geographic greedy routing methods, which are much faster than the optimal SINR-based algorithm.
