Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Ce Yu is active.

Publication


Featured research published by Ce Yu.


IEEE Transactions on Parallel and Distributed Systems | 2012

EasyPDP: An Efficient Parallel Dynamic Programming Runtime System for Computational Biology

Shanjiang Tang; Ce Yu; Jizhou Sun; Bu-Sung Lee; Tao Zhang; Zhen Xu; Huabei Wu

Dynamic programming (DP) is a popular and efficient technique in many scientific applications such as computational biology. Nevertheless, its performance is limited by the burgeoning volume of scientific data, and parallelism is crucial to keep computation time at acceptable levels. The intrinsically strong data dependency of dynamic programming makes it difficult and error-prone for programmers to write correct and efficient parallel programs. This paper therefore presents a runtime system named EasyPDP, which parallelizes dynamic programming algorithms on multicore and multiprocessor platforms. To promote software reusability and reduce the complexity of parallel programming, a DAG Data Driven Model is proposed that supports applications with strong data interdependence. Based on this model, the EasyPDP runtime system is designed and implemented. It automatically handles thread creation, dynamic task allocation and scheduling, data partitioning, and fault tolerance. Five frequently used DAG patterns from biological dynamic programming algorithms are included in EasyPDP's DAG pattern library, so the programmer can choose whichever suits a specific application. In addition, an ideal computing distribution model is proposed to derive optimal values for EasyPDP's performance-tuning arguments. We evaluate the performance potential and fault-tolerance features of EasyPDP on multicore systems, and compare EasyPDP with other methods such as Block-Cycle Wavefront (BCW). The experimental results show that EasyPDP performs well and provides an efficient infrastructure for dynamic programming algorithms.
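
The DAG-driven wavefront execution that EasyPDP encodes can be sketched in a few lines. The following is an illustrative Python sketch, not EasyPDP itself: it fills a longest-common-subsequence DP table block by block, dispatching each anti-diagonal "wave" of mutually independent blocks to a thread pool (the block size `bs` and all function names are assumptions for illustration).

```python
from concurrent.futures import ThreadPoolExecutor

def lcs_blocked(a, b, bs=4, workers=4):
    """Fill an LCS table block by block; blocks on the same
    anti-diagonal have no mutual dependencies, so each wave
    can run in parallel."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]

    def fill(block):
        bi, bj = block
        for i in range(bi * bs + 1, min((bi + 1) * bs, n) + 1):
            for j in range(bj * bs + 1, min((bj + 1) * bs, m) + 1):
                dp[i][j] = (dp[i - 1][j - 1] + 1 if a[i - 1] == b[j - 1]
                            else max(dp[i - 1][j], dp[i][j - 1]))

    nbi, nbj = -(-n // bs), -(-m // bs)   # ceil-divide into blocks
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for d in range(nbi + nbj - 1):    # one wave per anti-diagonal
            wave = [(i, d - i) for i in range(nbi) if 0 <= d - i < nbj]
            list(pool.map(fill, wave))
    return dp[n][m]
```

A real runtime would layer dynamic scheduling and fault tolerance on top of this pattern, as the paper describes.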


International Conference on Algorithms and Architectures for Parallel Processing | 2009

A Paralleled Large-Scale Astronomical Cross-Matching Function

Qing Zhao; Jizhou Sun; Ce Yu; Chenzhou Cui; Liqiang Lv; Jian Xiao

Cross-matching multi-wavelength data among multiple catalogs is a basic and unavoidable step in making distributed digital archives accessible and interoperable. Since current catalogs often contain millions or billions of objects, this is a typical data-intensive computation problem. In this paper, a highly efficient parallel approach to astronomical cross-matching is introduced. We present our partitioning and parallelization approach, then address the problems introduced by task partitioning and give corresponding solutions, including the sky-splitting function HEALPix, which plays a key role in both task partitioning and database indexing, and a quick bit-operation algorithm we developed to resolve the block-edge problem. Our experiments show that the function has a marked performance advantage over previous functions and is fully applicable to large-scale cross-matching.
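
The partition-and-match idea behind this approach can be sketched as follows. The paper splits the sky with HEALPix; as a hedged stand-in, this sketch uses a flat RA/Dec grid with a small-angle distance, searching an object's cell plus its eight neighbours so that matches across block edges are not missed (all names and the grid scheme are illustrative, not the paper's code).

```python
import math
from collections import defaultdict

def cell(ra, dec, size):
    """Map a position (degrees) to a flat grid cell of `size` degrees."""
    return (int(ra // size), int(dec // size))

def build_index(catalog, size):
    """Bucket catalog objects (id, ra, dec) by grid cell."""
    idx = defaultdict(list)
    for obj_id, ra, dec in catalog:
        idx[cell(ra, dec, size)].append((obj_id, ra, dec))
    return idx

def crossmatch(cat_a, idx_b, size, radius):
    """For each object in cat_a, find indexed objects within `radius`
    degrees, searching only its cell and the 8 neighbouring cells
    (this neighbour scan is what handles the block-edge problem)."""
    matches = []
    for id_a, ra, dec in cat_a:
        cx, cy = cell(ra, dec, size)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for id_b, rb, db in idx_b.get((cx + dx, cy + dy), ()):
                    if math.hypot(ra - rb, dec - db) <= radius:
                        matches.append((id_a, id_b))
    return matches
```

Each cell's matching is independent, so the outer loop parallelizes trivially across workers.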


BMC Bioinformatics | 2017

CMSA: a heterogeneous CPU/GPU computing system for multiple similar RNA/DNA sequence alignment

Xi Chen; Chen Wang; Shanjiang Tang; Ce Yu; Quan Zou

Background: The multiple sequence alignment (MSA) is a classic and powerful technique for sequence analysis in bioinformatics. With the rapid growth of biological datasets, MSA parallelization becomes necessary to keep its running time at an acceptable level. Although there is much work on MSA problems, existing approaches are either insufficient or contain implicit assumptions that limit their generality. First, the properties of users' sequences, including dataset size and sequence length, can take arbitrary values and are generally unknown before submission, which previous work unfortunately ignores. Second, the center star strategy is suited to aligning similar sequences, but its first stage, center sequence selection, is highly time-consuming and requires further optimization. Moreover, on heterogeneous CPU/GPU platforms, prior studies consider MSA parallelization on GPU devices only, leaving the CPUs idle during the computation. Co-run computation, however, can maximize the utilization of computing resources by running the workload on both CPU and GPU simultaneously.

Results: This paper presents CMSA, a robust and efficient MSA system for large-scale datasets on the heterogeneous CPU/GPU platform. It performs and optimizes multiple sequence alignment automatically for users' submitted sequences without any assumptions. CMSA adopts the co-run computation model so that both CPU and GPU devices are fully utilized. Moreover, CMSA proposes an improved center star strategy that reduces the time complexity of its center sequence selection process from O(mn²) to O(mn). The experimental results show that CMSA achieves up to an 11× speedup and outperforms state-of-the-art software.

Conclusion: CMSA focuses on multiple similar RNA/DNA sequence alignment and proposes a novel bitmap-based algorithm to improve the center star strategy. We conclude that harvesting the high performance of modern GPUs is a promising approach to accelerating multiple sequence alignment. Moreover, adopting the co-run computation model can significantly improve overall system utilization. The source code is available at https://github.com/wangvsa/CMSA.
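
CMSA's bitmap-based center selection is not reproduced here, but a heuristic in the same spirit, choosing the center in a single linear pass over k-mer counts rather than by all-pairs alignment, might look like this (the function names and the k-mer scoring are assumptions, not CMSA's actual algorithm):

```python
from collections import Counter

def kmers(seq, k=3):
    """All overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def select_center(seqs, k=3):
    """Pick the sequence whose k-mers are most widely shared across
    the dataset: one pass to count, one pass to score, so the cost is
    linear in the total sequence length rather than quadratic."""
    pool = Counter()
    for s in seqs:
        pool.update(set(kmers(s, k)))
    def score(s):
        return sum(pool[m] for m in set(kmers(s, k)))
    return max(seqs, key=score)
```

For similar sequences, the sequence sharing the most k-mers with the rest is a reasonable proxy for the sequence minimizing total alignment distance, which is what the exact O(mn²) selection computes.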


IEEE International Symposium on Parallel & Distributed Processing, Workshops and PhD Forum | 2013

EasyHPS: A Multilevel Hybrid Parallel System for Dynamic Programming

Jun Du; Ce Yu; Jizhou Sun; Chao Sun; Shanjiang Tang; Yanlong Yin

The dynamic programming approach solves complex problems efficiently by breaking them down into simpler sub-problems, and is widely used in scientific computing. With the increasing data volume of scientific applications and the development of multi-core/multi-processor hardware, efficient techniques for parallelizing dynamic programming algorithms are needed, particularly in multilevel computing environments. The intrinsically strong data dependency of dynamic programming also makes it difficult and error-prone for the programmer to write a correct and efficient parallel program. To make parallel programming easier and more efficient, we have developed EasyHPS, a multilevel hybrid parallel runtime system for dynamic programming based on the Directed Acyclic Graph (DAG) Data Driven Model. The EasyHPS system encapsulates the details of parallelization, such as task scheduling and message passing, and provides an easy API that reduces the complexity of parallel programming. In the DAG Data Driven Model, the entire application is first partitioned into sub-tasks, each processing one data block. All sub-tasks are then modeled as a DAG, in which each vertex represents a sub-task and each edge indicates a communication dependency between two sub-tasks. For task scheduling, a dynamic approach based on the DAG Data Driven Model is proposed to achieve load balancing. Data partitioning and task scheduling are both performed at the processor level and the thread level in the multilevel computing environment. Experimental results demonstrate that the proposed dynamic scheduling approach in EasyHPS is more efficient than static ones such as block-cyclic based wavefront.
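
The data-driven dispatch at the heart of the DAG Data Driven Model amounts to releasing a sub-task once all of its predecessor blocks are complete. A minimal sketch using Kahn's algorithm (illustrative only; the real system hands ready vertices to processor- and thread-level workers rather than running them in sequence):

```python
from collections import deque, defaultdict

def dag_order(deps):
    """Execution order for a task DAG given `deps`: vertex -> list of
    predecessor vertices.  A vertex enters the ready queue as soon as
    its last predecessor completes."""
    indeg = {v: len(ps) for v, ps in deps.items()}
    succ = defaultdict(list)
    for v, ps in deps.items():
        for p in ps:
            succ[p].append(v)
    ready = deque(v for v, d in indeg.items() if d == 0)
    order = []
    while ready:
        v = ready.popleft()          # a worker would execute v here
        order.append(v)
        for w in succ[v]:
            indeg[w] -= 1
            if indeg[w] == 0:
                ready.append(w)
    return order
```

In a multilevel setting the same bookkeeping runs twice: once over processor-level blocks and once over the thread-level sub-blocks inside each.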


IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing | 2015

Joint Scheduling of Data and Computation in Geo-Distributed Cloud Systems

Lingyan Yin; Jizhou Sun; Laiping Zhao; Chenzhou Cui; Jian Xiao; Ce Yu

Recent trends show that cloud computing is growing to span more and more globally distributed data centers. For geo-distributed data centers, there is an increasing need for scheduling algorithms that place tasks across data centers by jointly considering data and computation. Such scheduling must deal with wide-area distributed data, data sharing, WAN bandwidth costs, and data center capacity limits, while also minimizing completion time. However, this kind of scheduling problem is known to be NP-hard. In this paper, inspired by real applications in astronomy, we propose a two-phase scheduling algorithm that addresses these challenges. The mapping phase groups tasks according to their data-sharing relations and dispatches groups to data centers in one-to-one correspondence. The reassigning phase balances completion time across data centers according to the relations between tasks and groups. We use the real China-Astronomy-Cloud model and typical applications to evaluate our proposal. Simulations show that our algorithm achieves up to 22% better completion time and effectively reduces the amount of data transferred compared with similar scheduling algorithms.
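
A toy version of the two-phase idea, with each task reading a single dataset and load measured by task count (both simplifications; the paper's model also weighs WAN costs, data sharing, and capacities), might look like:

```python
def schedule(tasks, dc_of_data, dcs):
    """tasks: list of (task, dataset); dc_of_data: dataset -> data
    centre holding it; dcs: all data centres."""
    # Phase 1 (mapping): send each task to the centre that already
    # holds its input data, avoiding wide-area transfers.
    load = {dc: [] for dc in dcs}
    for task, data in tasks:
        load[dc_of_data[data]].append(task)
    # Phase 2 (reassigning): greedily move tasks from the most-loaded
    # to the least-loaded centre until completion times balance.
    while True:
        src = max(dcs, key=lambda d: len(load[d]))
        dst = min(dcs, key=lambda d: len(load[d]))
        if len(load[src]) - len(load[dst]) <= 1:
            break
        load[dst].append(load[src].pop())
    return load
```

Phase 2 trades a data transfer for a shorter makespan; the paper's version weighs that trade explicitly instead of moving tasks unconditionally.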


IEEE International Conference on Cloud Computing Technology and Science | 2011

Fast n-point Correlation Function Approximation with Recursive Convolution for Scalar Fields

Xiang Zhang; Ce Yu

In astrophysics, the n-point Correlation Function (n-PCF) is an important tool for computation and analysis, but its algorithmic complexity has long been a notorious problem. In this paper we propose two algorithms, both easy to parallelize, that compute the n-PCF efficiently. The algorithms are based on the definition of recursive convolution for scalar fields (RCSF), which can be computed using various fast Fourier Transform (FFT) algorithms from the literature. Compared with traditional ways of dealing with this problem, our method is highly efficient: it can produce results for point sets as large as 1 billion in less than 1 minute. Moreover, the algorithms are intrinsically suited to parallel computing environments such as computer clusters, multi-CPU/GPU supercomputers, and MapReduce. Better computing environments can meet tighter accuracy and time requirements.
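
For the two-point case, the convolution trick the paper builds on reduces pair counting over a gridded density field to a pair of FFTs. A minimal sketch (assuming NumPy is available; periodic boundaries, so separations wrap around the grid):

```python
import numpy as np

def pair_counts(grid):
    """Ordered pair counts at every (circular) lattice separation via
    the convolution theorem: the autocorrelation of the density field
    equals IFFT(|FFT(rho)|^2), turning the O(N^2) pair loop into
    O(N log N).  `grid` holds the number of points in each cell."""
    f = np.fft.fftn(grid)
    corr = np.fft.ifftn(f * np.conj(f)).real
    return np.round(corr).astype(int)
```

The separation-zero bin counts each point paired with itself; binning the output by |separation| gives the usual 2-PCF histogram, and higher-order functions stack further convolutions in the same way.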


IEEE Transactions on Services Computing | 2017

Online Virtual Machine Placement for Increasing Cloud Provider’s Revenue

Laiping Zhao; Liangfu Lu; Zhou Jin; Ce Yu

Cost savings have become a significant challenge in the management of data centers. In this paper, we show that, besides energy consumption, service level agreement (SLA) violations also severely degrade the cost-efficiency of data centers. We present online VM placement algorithms for increasing the cloud provider's revenue. First, the First-Fit and Harmonic algorithms are devised for VM placement without considering migrations. Both algorithms achieve the same worst-case performance, matching the lower bound on the competitive ratio. However, the Harmonic algorithm can generate more than 10 percent more revenue than First-Fit when the job arrival rate is greater than 1.0. Second, we formulate an optimization problem of maximizing revenue from VM migration and prove it NP-hard by a reduction from the 3-Partition problem. We therefore propose two heuristics: Least-Reliable-First (LRF) and Decreased-Density-Greedy (DDG). Experiments demonstrate that DDG yields more revenue than LRF when migration cost is low, yet leads to losses when the SLA penalty is low or the job arrival rate is high, due to the large number of migrations. Finally, we compare the four algorithms above with those adopted in OpenStack using a real trace, and find the results consistent with those obtained on synthetic data.
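
First-Fit and Harmonic are classic online bin-packing rules; applied to VM placement, servers are bins and VM resource demands are item sizes. A simplified sketch of both (the smallest Harmonic class is handled more carefully in the literature; this is illustration, not the paper's revenue-aware variants):

```python
def first_fit(items, cap=1.0):
    """Place each item in the first open bin with room; else open one."""
    bins = []
    for s in items:
        for b in bins:
            if sum(b) + s <= cap + 1e-9:
                b.append(s)
                break
        else:
            bins.append([s])
    return bins

def harmonic(items, k_max=4, cap=1.0):
    """Harmonic packing: an item of size in (cap/(k+1), cap/k] joins a
    class-k bin holding at most k such items; classes never mix."""
    bins, open_bin = [], {}
    for s in items:
        k = min(max(int(cap // s), 1), k_max)
        b = open_bin.get(k)
        if b is None or len(b) >= k:
            b = []
            bins.append(b)
            open_bin[k] = b
        b.append(s)
    return bins
```

Harmonic may open more bins on easy inputs (as the test below shows) but bounds how badly any bin can be underfilled per class, which is what drives its competitive-ratio guarantee.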


International Conference on Algorithms and Architectures for Parallel Processing | 2015

AQUAdex: A Highly Efficient Indexing and Retrieving Method for Astronomical Big Data of Time Series Images

Zhi Hong; Ce Yu; Ruolei Xia; Jian Xiao; Jie Wang; Jizhou Sun; Chenzhou Cui

In the era of Big Data, scientific research is challenged with handling massive data sets. To actually take advantage of Big Data, the key problem is retrieving the desired cup of data from the ocean, as most applications need only a fraction of the entire data set. Because an indexing and retrieving method is intrinsically tied to the specific features of the data set and the goal of the research, a universal solution is hardly possible. Designed for efficiently querying Big Data in astronomical time-domain research, AQUAdex, a new spatial indexing and retrieving method, is proposed to extract Time Series Images from Astronomical Big Data. By mapping images to tiles (pixels) on the celestial sphere, AQUAdex can complete queries 9 times faster, as shown by theoretical analysis and experimental results. AQUAdex is especially suitable for Big Data applications because of its excellent scalability: query time increases by only 59% while the data size grows 14 times larger.
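
The tile-based lookup can be sketched as an inverted index from sky tiles to the images covering them; the paper uses HEALPix pixels, while this illustrative version uses a flat RA/Dec grid (the class and method names are assumptions, not AQUAdex's API):

```python
from collections import defaultdict

class TileIndex:
    """Inverted index from sky tiles to images: a point query touches
    a single tile instead of scanning every image footprint."""

    def __init__(self, size=1.0):
        self.size = size                  # tile width in degrees
        self.idx = defaultdict(set)

    def _tile(self, ra, dec):
        return (int(ra // self.size), int(dec // self.size))

    def add(self, image_id, ra0, ra1, dec0, dec1):
        """Register an image by its rectangular footprint: every tile
        the footprint overlaps points back to the image."""
        for x in range(int(ra0 // self.size), int(ra1 // self.size) + 1):
            for y in range(int(dec0 // self.size), int(dec1 // self.size) + 1):
                self.idx[(x, y)].add(image_id)

    def query(self, ra, dec):
        """All images whose footprint covers this position."""
        return self.idx.get(self._tile(ra, dec), set())
```

Because tiles partition the sphere, the index shards naturally across machines, which is the scalability property the abstract highlights.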


International Parallel and Distributed Processing Symposium | 2012

Adaptive Data Refinement for Parallel Dynamic Programming Applications

Shanjiang Tang; Ce Yu; Bu-Sung Lee; Chao Sun; Jizhou Sun

Load balancing is challenging for parallel dynamic programming due to its intrinsically strong data dependency. Two issues are involved and equally important: the partitioning method, and the scheduling and distribution policy for subtasks. However, researchers have designed their load-balancing strategies primarily around the scheduling and allocation policy, while the partitioning approach has been treated only roughly. In this paper, an adaptive data refinement scheme is proposed, based on our previous work on the DAG Data Driven Model. It can spawn new computing subtasks during execution by repartitioning the current block into smaller ones when workload imbalance is detected. Experiments show that it dramatically improves system performance. Moreover, to evaluate the quality of our method more thoroughly, a theoretical upper bound on the improvable space for parallel dynamic programming is given. The experimental results, compared against the theoretical analysis, clearly show the good performance of our approach.
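
The refinement rule, splitting a block into smaller sub-tasks when its measured load stands out, can be sketched as follows (the threshold `factor` and the half-open interval representation of blocks are assumptions for illustration):

```python
def refine(blocks, times, factor=1.5):
    """Repartition: any block whose measured time exceeds `factor`
    times the mean is split in half, spawning two new sub-tasks that
    re-enter the task DAG.  blocks are half-open index ranges."""
    mean = sum(times) / len(times)
    out = []
    for (lo, hi), t in zip(blocks, times):
        if t > factor * mean and hi - lo > 1:
            mid = (lo + hi) // 2
            out += [(lo, mid), (mid, hi)]
        else:
            out.append((lo, hi))
    return out
```

Applied between waves, this keeps slow blocks from stalling the wavefront while leaving well-balanced blocks untouched.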


Proceedings of SPIE | 2012

Operation, control, and data system for Antarctic Survey Telescope (AST3)

Zhaohui Shang; Keliang Hu; Yi Hu; Jiliang Li; Jin Li; Qiang Liu; Bin Ma; Jason Lee Quinn; Jizhou Sun; Lifan Wang; Jian Xiao; Jia Yu; Ce Yu; Mujin Yang; Xiangyan Yuan; Zhen Zeng

The first of the trio of Antarctic Survey Telescopes (AST3) was deployed to Dome A, Antarctica in January 2012. This largest optical survey telescope in Antarctica is equipped with a 10k × 10k CCD. The huge amount of data, limited satellite communication bandwidth, low temperature, low pressure, and limited energy supply all pose challenges to the control and operation of the telescope. We have developed both the hardware and software systems to operate the unattended telescope and carry out the survey automatically. Our systems include the main survey control, data storage, real-time pipeline, and database, for each of which we have dealt with various technical difficulties. These include developing customized computer systems and data storage arrays that work in the harsh environment, temperature control for the disk arrays, automatic and fast data reduction in real time, and building a robust database system.

Collaboration


Dive into Ce Yu's collaboration.

Top Co-Authors

Chenzhou Cui

Chinese Academy of Sciences


Boliang He

Chinese Academy of Sciences


Dongwei Fan

Chinese Academy of Sciences


Zhaohui Shang

Tianjin Normal University
