Zhikun Chen
National University of Defense Technology
Publication
Featured research published by Zhikun Chen.
International Conference on Computer Science and Network Technology | 2012
Hui Zhao; Shuqiang Yang; Zhikun Chen; Hong Yin; Songchang Jin
Scheduling algorithms play a crucial role in MapReduce systems. However, recent scheduling algorithms all follow the Job-Task scheduling model, which confines task scheduling and makes it hard to honor scheduling preferences such as data locality and scan sharing. These preferences are important heuristics in data-intensive computing and help improve system throughput. In this paper, we first design a novel scheduling model, termed Tasks-Job scheduling, to overcome these issues. We then propose a locality-aware algorithm to improve system throughput. Comprehensive experiments compare the proposed scheduling model and algorithm with state-of-the-art Job-Task based algorithms. The experimental results validate the efficiency and effectiveness of the proposed algorithm.
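The abstract does not describe the Tasks-Job model or the locality-aware algorithm in detail, so the following is only a minimal sketch of the generic data-locality heuristic it refers to: when a node has a free slot, prefer a pending task whose input block is already stored on that node. The names `pick_task`, `pending`, and `block_locations` are hypothetical and do not come from the paper.

```python
def pick_task(node, pending, block_locations):
    """Pick a task for a free slot on `node`.

    pending: list of task ids in queue order (hypothetical representation).
    block_locations: dict mapping task id -> set of nodes holding its input block.
    Returns a node-local task if one exists, otherwise the first pending task,
    otherwise None when the queue is empty.
    """
    for task in pending:
        # A node-local task avoids moving its input block over the network.
        if node in block_locations.get(task, ()):
            return task
    # No local task: fall back to plain queue order.
    return pending[0] if pending else None


if __name__ == "__main__":
    locations = {"t1": {"n2"}, "t2": {"n1", "n3"}, "t3": {"n1"}}
    print(pick_task("n1", ["t1", "t2", "t3"], locations))
    print(pick_task("n4", ["t1", "t2", "t3"], locations))
```

A real scheduler would also weigh rack locality, waiting time, and fairness; this sketch only illustrates the locality preference itself.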
Frontiers of Computer Science in China | 2015
Zhikun Chen; Shuqiang Yang; Shuang Tan; Li He; Hong Yin; Ge Zhang
NoSQL databases are known for high scalability, high availability, and high fault tolerance, and are therefore used in many applications. The data partitioning strategy and the fragment allocation strategy directly affect the performance of NoSQL database systems. Large, global databases are partitioned horizontally, vertically, or by a combination of both. In general, the system scatters related fragments as widely as possible to increase the degree of parallelism of operations. In some applications, however, operations are not very complicated, yet a single operation may access more than one fragment, and the fragments accessed by an operation may interact with each other. In such cases, the general allocation strategies increase the communication cost of the system when operations execute across sites. To improve the performance of these applications and let NoSQL database systems work efficiently, their fragments have to be allocated in a way that reduces the communication cost, i.e., minimizes the total volume of data transmitted when operations execute across sites. A hypergraph-based strategy for clustering fragments is proposed, which places fragments that are accessed together in most operations into the same cluster. The method uses a weighted hypergraph to represent the fragment access patterns of operations, and a hypergraph partitioning algorithm to cluster the fragments. This reduces the number of sites that an operation has to span and thus the communication cost across sites. Experimental results confirm that the proposed technique contributes effectively to solving the fragment re-allocation problem in a specific application environment of NoSQL database systems.
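The idea of representing access patterns as a weighted hypergraph can be sketched as follows. Each operation contributes a hyperedge over the fragments it touches, and the weight of a hyperedge is how often that fragment set is accessed together. The abstract does not name the partitioning algorithm used, so `greedy_cluster` below is a simple greedy stand-in (merge the heaviest hyperedges first, subject to a size cap), not the method of the paper; all function and parameter names are assumptions made for illustration.

```python
from collections import Counter


def build_hypergraph(operations):
    """Map each distinct fragment set (hyperedge) to its access count (weight)."""
    return Counter(frozenset(op) for op in operations)


def greedy_cluster(hyperedges, max_size):
    """Greedy stand-in for a hypergraph partitioner (an assumption, not the
    paper's algorithm): merge the fragments of the heaviest hyperedges first,
    never letting a cluster grow beyond `max_size` fragments."""
    parent, size = {}, {}

    def find(x):
        # Union-find lookup with path halving; registers unseen fragments.
        parent.setdefault(x, x)
        size.setdefault(x, 1)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Heaviest hyperedges first; ties broken by fragment names for determinism.
    ordered = sorted(hyperedges.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for edge, _weight in ordered:
        roots = sorted({find(f) for f in edge})
        if len(roots) > 1 and sum(size[r] for r in roots) <= max_size:
            head = roots[0]
            for r in roots[1:]:
                parent[r] = head
                size[head] += size[r]

    clusters = {}
    for fragment in list(parent):
        clusters.setdefault(find(fragment), set()).add(fragment)
    return sorted(sorted(c) for c in clusters.values())


if __name__ == "__main__":
    ops = [("A", "B")] * 3 + [("C", "D")] * 2 + [("B", "C")]
    print(greedy_cluster(build_hypergraph(ops), max_size=2))
```

With the sample log, fragments A and B end up in one cluster and C and D in another, so the frequent operations each touch a single cluster. A production system would more likely hand the weighted hypergraph to a dedicated partitioner.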
China Communications | 2015
Hong Yin; Shuqiang Yang; Shaodong Ma; Fei Liu; Zhikun Chen
Similarity search is one of the fundamental components of time series data mining tasks such as clustering, classification, and association rule mining. Many methods have been proposed to measure the similarity between time series, including Euclidean distance, Manhattan distance, and dynamic time warping (DTW). Among these, DTW has been suggested as a more robust similarity measure that can find the optimal alignment between time series. However, due to its quadratic time and space complexity, DTW is not suitable for large time series datasets. Many improved algorithms have been proposed for DTW search in large databases, such as approximate search and exact indexed search. Unlike these modified algorithms, this paper presents a novel parallel scheme for fast DTW-based similarity search, called MRDTW (MapReduce-based DTW). The experimental results show that our approach not only retains the accuracy of the original DTW but also greatly improves the efficiency of similarity measurement on large time series.
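The quadratic cost mentioned above comes from the dynamic-programming table that classic DTW fills in. The sketch below shows that standard recurrence with an absolute-difference local cost, plus a brute-force nearest-series search. It is sequential and does not reproduce MRDTW; the per-candidate distance computations in `most_similar` are independent, which is the kind of work a map phase could distribute, but that mapping is an assumption and not taken from the paper.

```python
def dtw_distance(a, b):
    """Classic DTW distance between two numeric sequences.

    Uses O(len(a) * len(b)) time and space, which is the quadratic
    complexity the abstract refers to.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also aligned with an earlier b
                cost[i][j - 1],      # b[j-1] also aligned with an earlier a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def most_similar(query, candidates):
    """Return the index of the candidate series with the smallest DTW distance."""
    distances = [dtw_distance(query, c) for c in candidates]
    return min(range(len(candidates)), key=distances.__getitem__)


if __name__ == "__main__":
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(most_similar([1, 2, 3], [[5, 5, 5], [1, 2, 2, 3], [0, 0]]))
```

Note that DTW assigns zero distance to `[1, 2, 3]` and `[1, 2, 2, 3]` because the repeated value is absorbed by the warping, whereas Euclidean distance is not even defined for sequences of different lengths.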
International Conference on Computer Science and Service System | 2012
Zhikun Chen; Shuqiang Yang; Hui Zhao; Hong Yin
Column families are used in NoSQL databases, and how to divide data into column families has to be considered when designing a NoSQL database. Dividing column families is similar to vertical partitioning in distributed databases. Several objective functions for vertical partitioning exist in database systems, but they do not fit NoSQL databases. In this paper we propose an objective function for dividing column families in NoSQL databases. The derived objective function generalizes and subsumes earlier work on vertical partitioning in distributed databases and covers the scenarios of NoSQL databases. Furthermore, the proposed approach is shown to be useful for evaluating the cost of dividing column families.
International Conference on Computer Science and Network Technology | 2012
Hui Zhao; Shuqiang Yang; Zhikun Chen; Hua Fan; Jinghu Xu
MapReduce is an important programming paradigm for data-intensive computing on shared-nothing clusters containing tens of thousands of nodes, in which computing nodes also act as storage nodes. Since the tasks belonging to different jobs are the physical execution entities scattered across the whole cluster, task scheduling plays a crucial role in MapReduce systems. For data consolidation and utilization, a MapReduce cluster is usually used as a shared computing environment rather than as several private clusters. Typical workloads consist of concurrent jobs, including interactive and batch jobs, so fairness is an important goal in this scenario. Efficiency is also a vital concern for the cluster owner, and data locality is used as a heuristic to achieve it. Achieving both goals is a major challenge that requires extensive research, and state-of-the-art schedulers do not solve this problem well. In this paper, we propose K%-Fair scheduling, a flexible task scheduling strategy based on multiple node-level task queues that takes both fairness and data locality into account. Finally, we evaluate our scheduler on data locality and fairness among jobs: it improves data locality considerably while keeping fairness nearly the same.
Fuzzy Systems and Knowledge Discovery | 2012
Hong Yin; Shuqiang Yang; Hui Zhao; Zhikun Chen
With the emergence of large databases and the explosion of mass data, people need to extract useful information and knowledge from huge databases and to further improve the utilization of that information. Partitioning technologies allow users to divide a big table into smaller ones and to manage the partitions more easily, so they can solve some of the problems of mass data. This paper discusses Partial-MAX/MIN query optimization for partitioned tables. We introduce the Rank Bisection Partition Tree (RBP-T) structure to improve the efficiency of this class of query. The experimental results show that our method is effective for solving Partial-MAX/MIN queries on mass data.
Archive | 2011
Shuqiang Yang; Rongling Luo; Huaimin Wang; Quanyuan Wu; Yan Jia; Bin Zhou; Weihong Han; Meng Teng; Zhikun Chen; Hui Zhao; Songchang Jin; Qi Shu; Kai Wang
Archive | 2011
Shuqiang Yang; Meng Teng; Huaimin Wang; Quanyuan Wu; Yan Jia; Bin Zhou; Weihong Han; Zhikun Chen; Hui Zhao; Qi Shu; Songchang Jin; Rongling Luo; Kai Wang
Archive | 2011
Shuqiang Yang; Hui Zhao; Huaimin Wang; Quanyuan Wu; Yan Jia; Bin Zhou; Weihong Han; Meng Teng; Zhikun Chen; Songchang Jin; Rongling Luo; Kai Wang; Qi Shu
Archive | 2011
Shuqiang Yang; Kai Wang; Huaimin Wang; Quanyuan Wu; Yan Jia; Bin Zhou; Weihong Han; Meng Teng; Zhikun Chen; Hui Zhao; Songchang Jin; Rongling Luo; Qi Shu