
Publication


Featured research published by Yulai Yuan.


IEEE Transactions on Parallel and Distributed Systems | 2010

Adaptive Workload Prediction of Grid Performance in Confidence Windows

Yongwei Wu; Kai Hwang; Yulai Yuan; Weimin Zheng

Predicting grid performance is a complex task because heterogeneous resource nodes are involved in a distributed environment. Long-running workloads on a grid are even harder to predict because of heavy load fluctuations. In this paper, we use a Kalman filter to minimize the prediction errors. We apply a Savitzky-Golay filter to train a sequence of confidence windows. The purpose is to keep the prediction process from being disturbed by load fluctuations. We present a new adaptive hybrid method (AHModel) for load prediction guided by trained confidence windows. We test the effectiveness of this new prediction scheme with real-life workload traces on the AuverGrid and Grid5000 in France. Both theoretical and experimental results are reported in this paper. As the lookahead span increases from 10 to 50 steps (5 minutes per step), the AHModel predicts the grid workload with a mean-square error (MSE) of 0.04-0.73 percent, compared with 2.54-30.2 percent when using the static point value autoregression (AR) prediction method. The significant gain in prediction accuracy makes the new model very attractive for predicting Grid performance. The model proved especially effective at predicting large workloads that demand very long execution times, such as those exceeding 4 hours on the Grid5000 over 5,000 processors. With minor changes to some system parameters, the AHModel can be applied to other computational grids as well. Finally, we discuss extended research issues and tool development for Grid performance prediction.
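As a rough illustration of the Kalman filtering step mentioned in this abstract, the sketch below runs a one-dimensional Kalman filter over a load series and compares mean-square errors. It is not the AHModel: the random-walk load model, the function names, and the `process_var` and `measurement_var` values are all assumptions made for this example, and the Savitzky-Golay confidence windows are not modeled.

```python
def kalman_filter(observations, process_var=1e-3, measurement_var=1e-1):
    """Scalar Kalman filter, assuming the true load follows a random walk.

    process_var and measurement_var are illustrative values, not taken
    from the paper.
    """
    estimate = observations[0]
    error_var = 1.0  # initial uncertainty of the estimate
    estimates = []
    for z in observations:
        # Predict: a random walk keeps the estimate, uncertainty grows.
        error_var += process_var
        # Update: blend the prediction with the new measurement.
        gain = error_var / (error_var + measurement_var)
        estimate += gain * (z - estimate)
        error_var *= 1.0 - gain
        estimates.append(estimate)
    return estimates


def mean_square_error(predicted, actual):
    """Average squared difference between two equally long sequences."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)


# A load of 0.5 disturbed by an alternating fluctuation of +/- 0.1.
true_load = [0.5] * 50
noisy_load = [0.5 + 0.1 * (-1) ** i for i in range(50)]
filtered = kalman_filter(noisy_load)
print(mean_square_error(noisy_load, true_load))
print(mean_square_error(filtered, true_load))
```

On this synthetic trace the filtered series tracks the underlying load more closely than the raw measurements, which is the effect the abstract relies on to reduce prediction error.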


Grid Computing | 2007

Load prediction using hybrid model for computational grid

Yongwei Wu; Yulai Yuan; Guangwen Yang; Weimin Zheng

Due to the dynamic nature of grid environments, scheduling algorithms need the assistance of long-time-ahead load prediction to decide how to use grid resources efficiently. In this paper, we present and evaluate a new hybrid model, which predicts the n-step-ahead load status by using interval values. This model integrates an autoregressive (AR) model with confidence interval estimations to forecast the future load of a system. In addition, two filtering techniques from the signal processing field are introduced into this model to eliminate data noise and enhance prediction accuracy. The results of experiments conducted on a real grid environment demonstrate that this new model is more capable of predicting n-step-ahead load in a computational grid than previous works. The proposed hybrid model performs well for prediction lead times of up to 50 minutes, with significantly fewer prediction errors than the conventional AR model. It also achieves an interval length acceptable for a task scheduler.
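The idea of pairing an AR forecast with a confidence interval can be sketched as follows. This is a minimal example under stated assumptions, not the hybrid model from the paper: it uses a first-order AR model fitted by least squares, assumes Gaussian noise for the interval (the `z=1.96` default), and omits the two filtering steps. The function names are hypothetical.

```python
import math


def fit_ar1(series):
    """Fit x[t] - mean = phi * (x[t-1] - mean) + noise by least squares."""
    mean = sum(series) / len(series)
    centered = [x - mean for x in series]
    num = sum(a * b for a, b in zip(centered[1:], centered[:-1]))
    den = sum(a * a for a in centered[:-1])
    phi = num / den if den else 0.0
    residuals = [b - phi * a for a, b in zip(centered[:-1], centered[1:])]
    noise_var = sum(r * r for r in residuals) / len(residuals)
    return mean, phi, noise_var


def forecast_interval(series, steps, z=1.96):
    """Return (low, point, high) for the load `steps` samples ahead."""
    mean, phi, noise_var = fit_ar1(series)
    # Point forecast decays toward the mean as the horizon grows.
    point = mean + (phi ** steps) * (series[-1] - mean)
    # Forecast variance accumulates over the horizon.
    variance = noise_var * sum(phi ** (2 * i) for i in range(steps))
    half_width = z * math.sqrt(variance)
    return point - half_width, point, point + half_width


load = [0.2, 0.3, 0.25, 0.4, 0.35, 0.5, 0.45, 0.6, 0.5, 0.55]
for n in (1, 5, 10):
    print(n, forecast_interval(load, n))
```

The interval can only stay the same or widen as the horizon grows, which is why the abstract treats interval length as something the scheduler has to find acceptable.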


Grid and Cooperative Computing | 2007

Dynamic Data Replication based on Local Optimization Principle in Data Grid

Yulai Yuan; Yongwei Wu; Guangwen Yang; Feng Yu

Efficient data access is one way of improving the performance of a data grid. To speed up data access and reduce bandwidth consumption, a data grid replicates essential data in multiple locations. This paper studies data replication strategies in data grids, taking into account two important issues that bound replication: the storage capacity of different nodes and the bandwidth between them. We propose a new dynamic replication strategy based on the principle of local optimization. The data grid can achieve global data access optimization through the interaction of local optimizations within the local optimization areas.


High Performance Distributed Computing | 2010

PV-EASY: a strict fairness guaranteed and prediction enabled scheduler in parallel job scheduling

Yulai Yuan; Guangwen Yang; Yongwei Wu; Weimin Zheng

As the most widely used parallel job scheduling strategy in production schedulers, EASY has achieved great success, not only because it can balance fairness and performance, but also because it is universally applicable to most HPC systems. However, unfairness still exists in EASY. For real workloads used in this work, our simulation shows that a blocked job can be delayed by later jobs for more than 90 hours. In addition, EASY cannot directly employ parallel job runtime prediction techniques, because this would lead to a serious situation called reservation violation. In this paper, we aim at guaranteeing strict fairness (no job is delayed by any jobs of lower priority) while achieving attractive performance, and employing prediction without causing reservation violation in parallel job scheduling. We propose two novel strategies, shadow load preemption (SLP) and venture backfilling (VB), which are together integrated into EASY to construct a preemptive venture EASY backfilling (PV-EASY) strategy. Experimental results on three workloads of real HPC systems demonstrate that: First, PV-EASY guarantees strict fairness, in addition to avoiding reservation violation when employing job runtime prediction techniques in scheduling; Second, PV-EASY achieves the same performance as EASY, and outperforms prediction employed EASY; Third, the preemption in PV-EASY is not resource costly and simple enough to be implemented in all HPC systems where EASY works. These advantages make PV-EASY more attractive than EASY in parallel job scheduling, both from academic and industry perspectives.
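For readers unfamiliar with the baseline, the sketch below shows the backfilling test of plain EASY, the strategy this abstract builds on. It does not implement PV-EASY, shadow load preemption, or venture backfilling, whose details are not given here. The `Job` class, the function names, and the `(end_time, procs)` representation of running jobs are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    procs: int      # processors requested
    runtime: float  # runtime estimate used by the scheduler


def shadow_time_and_extra(now, free, running, head):
    """Find when the head job can start and how many processors are spare.

    `running` is a list of (end_time, procs) pairs for executing jobs.
    """
    if free >= head.procs:
        return now, free - head.procs
    available = free
    for end_time, procs in sorted(running):
        available += procs
        if available >= head.procs:
            return end_time, available - head.procs
    raise ValueError("head job needs more processors than the machine has")


def can_backfill(now, free, running, head, candidate):
    """A later job may start now only if it cannot delay the head job."""
    if candidate.procs > free:
        return False
    shadow, extra = shadow_time_and_extra(now, free, running, head)
    finishes_in_time = now + candidate.runtime <= shadow
    fits_in_spare = candidate.procs <= extra
    return finishes_in_time or fits_in_spare


# 10-processor machine: 6 busy until t=100, 4 free, head job needs 8.
running = [(100.0, 6)]
head = Job("head", procs=8, runtime=300.0)
print(can_backfill(0.0, 4, running, head, Job("short", 4, 50.0)))
print(can_backfill(0.0, 4, running, head, Job("long", 4, 200.0)))
```

Only the head of the queue holds a reservation in this scheme, so jobs further back can still be delayed by backfilled jobs; that is the source of the unfairness the abstract describes.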


IEEE Transactions on Parallel and Distributed Systems | 2014

Guarantee Strict Fairness and Utilize Prediction Better in Parallel Job Scheduling

Yulai Yuan; Yongwei Wu; Weimin Zheng; Keqin Li

As the most widely used parallel job scheduling strategy, EASY backfilling achieved great success, not only because it can balance fairness and performance, but also because it is universally applicable to most HPC systems. However, unfairness still exists in EASY. Our simulation shows that a blocked job can be delayed by later jobs for more than 90 hours on real workloads. Additionally, directly employing runtime prediction techniques in EASY would lead to a serious situation called reservation violation. In this paper, we aim at guaranteeing strict fairness (no job is delayed by any jobs of lower priority) while achieving attractive performance, and employing prediction without causing reservation violation in parallel job scheduling. We propose two novel strategies, namely, shadow load preemption (SLP) and venture backfilling (VB), which are integrated into EASY to construct preemptive venture EASY backfilling (PV-EASY). Experimental results on three real HPC workloads demonstrate that PV-EASY is more attractive than EASY in parallel job scheduling, from both academic and industry perspectives.


Future Generation Computer Systems | 2010

VDB-MR: MapReduce-based distributed data integration using virtual database

Yulai Yuan; Yongwei Wu; Xiao Feng; Jing Li; Guangwen Yang; Weimin Zheng

Data integration is becoming very important in many commercial applications and in scientific research. Many algorithms and systems have been proposed and developed to address related issues from different aspects. Virtual database systems are widely recognized as one of the effective solutions for data integration. However, the existing execution modules in virtual database systems are very ineffective. MapReduce (MR) is a new computing model for parallel processing that performs well on large-scale data. In this paper, we propose a new distributed data integration system, called VDB-MR, which is based on the MapReduce technology, to efficiently integrate data from heterogeneous data sources. With VDB-MR, a unified view (i.e., a single virtual database) of multiple databases can be provided to users. We also conducted a series of experiments to evaluate VDB-MR by comparing it with an open source data integration system, OGSA-DAI, and two DBMSs in parallel. Experimental results show that VDB-MR significantly outperforms OGSA-DAI and the DBMSs in parallel.
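The general shape of a MapReduce-style integration can be sketched in a few lines. This is a single-process toy under stated assumptions, not the VDB-MR system: the function names, the in-memory record format, and the rule that later sources overwrite earlier fields are all invented for illustration.

```python
from collections import defaultdict


def map_phase(records, key_field):
    """Emit (key, record) pairs so matching rows meet in one reducer."""
    for record in records:
        yield record[key_field], record


def shuffle(pairs):
    """Group mapped values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(records):
    """Merge all records sharing a key into one unified row."""
    merged = {}
    for record in records:
        merged.update(record)  # later sources overwrite earlier fields
    return merged


def integrate(sources, key_field):
    """Build a unified view over several record lists keyed by key_field."""
    pairs = []
    for records in sources:
        pairs.extend(map_phase(records, key_field))
    return {key: reduce_phase(values) for key, values in shuffle(pairs).items()}


names = [{"id": 1, "name": "Ann"}]
emails = [{"id": 1, "email": "ann@example.org"},
          {"id": 2, "email": "bob@example.org"}]
print(integrate([names, emails], "id"))
```

In a real deployment the map and reduce calls would run in parallel across nodes; the sketch only shows how keyed grouping yields the single unified view the abstract mentions.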


The Journal of Supercomputing | 2010

An adaptive task-level fault-tolerant approach to Grid

Yongwei Wu; Yulai Yuan; Guangwen Yang; Weimin Zheng

A strong failure recovery mechanism that handles diverse failures in a heterogeneous and dynamic Grid is important for ensuring the complete execution of long-running applications. Although various efforts have been made to address this issue, existing solutions either employ a single fault-tolerant technique without considering the diversity of failures, or propose frameworks that cannot deal adaptively with the various kinds of failures in a Grid. In this paper, an adaptive task-level fault-tolerant approach to Grid is proposed. This approach aims at handling quite a complete set of failures arising in a Grid environment by integrating basic fault-tolerant approaches. Moreover, this paper argues that resource consumption, which has not received enough attention, is also an important evaluation metric for any fault-tolerant approach. Corresponding evaluation models based on mean execution time and resource consumption are constructed to evaluate any fault-tolerant approach. Based on the models, we demonstrate the effectiveness of our approach and illustrate the performance gains achieved via simulations. Experiments on a real Grid show that our approach can achieve better performance and consume fewer resources.


International Journal of Web and Grid Services | 2007

Parallel programming over ChinaGrid

Weiyuan Huang; Yongwei Wu; Yulai Yuan; Jia Liu; Guangwen Yang; Weimin Zheng

Grid computing is becoming more and more attractive for providing a convenient uniform platform for coordinating highly distributed and heterogeneous resources and services. In this paper, a practical task-level distributed Parallel Programming Interface (PPI) for grid computing, GridPPI, is introduced. It is an MPI-like interface with high-level parallel tasking over the grid. GridPPI is a prototype of Grid-API, which supports all operations that are necessary for such task-level distributed parallel computing over grids, including service discovery and selection, task submission and reporting, etc. In this paper, the Web Service Resource Framework (WSRF) service-oriented implementation and evaluation of GridPPI on the ChinaGrid Support Platform (CGSP) are discussed in detail. A performance analysis is also presented, showing that our efforts could provide a flexible and effective programming library in the grid environment.


Cluster Computing and the Grid | 2008

Adaptive Hybrid Model for Long Term Load Prediction in Computational Grid

Yulai Yuan; Yongwei Wu; Guangwen Yang; Weimin Zheng


Computers & Mathematics With Applications | 2012

Job failures in high performance computing systems: A large-scale empirical study

Yulai Yuan; Yongwei Wu; Qiuping Wang; Guangwen Yang; Weimin Zheng

Collaboration


Dive into Yulai Yuan's collaboration.

Top Co-Authors


Kai Hwang

University of Southern California
