Jie Ma

Chinese Academy of Sciences

Publication

Featured research published by Jie Ma.

international conference on cluster computing | 2010

Exploiting Data Deduplication to Accelerate Live Virtual Machine Migration

Xiang Zhang; Zhigang Huo; Jie Ma; Dan Meng

As one of the key characteristics of virtualization, live virtual machine (VM) migration provides great benefits for load balancing, power management, fault tolerance, and other system maintenance issues in modern clusters and data centers. Although Pre-Copy is a widely used migration algorithm, it transfers a lot of duplicated memory image data from source to destination, which results in longer migration time and downtime. This paper proposes a novel VM migration approach, named Migration with Data Deduplication (MDD), which introduces data deduplication into migration. MDD utilizes the self-similarity of the run-time memory image, uses hash-based fingerprints to find identical and similar memory pages, and employs Run-Length Encoding (RLE) to eliminate redundant memory data during migration. Experiments demonstrate that, compared with Xen's default Pre-Copy migration algorithm, MDD reduces total data transferred during migration by 56.60%, total migration time by 34.93%, and downtime by 26.16% on average.
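The combination of fingerprinting and run-length encoding described in this abstract can be illustrated with a minimal sketch. It is not the MDD implementation: the page size, the choice of SHA-1, the message format, and the names `plan_transfer` and `rebuild` are all assumptions made for illustration, and the handling of merely similar pages is left out.

```python
import hashlib

PAGE_SIZE = 4096  # assumed page size in bytes


def rle_encode(data: bytes) -> list[tuple[int, int]]:
    """Run-length encode bytes as (byte_value, run_length) pairs."""
    runs: list[tuple[int, int]] = []
    for b in data:
        if runs and runs[-1][0] == b:
            runs[-1] = (b, runs[-1][1] + 1)
        else:
            runs.append((b, 1))
    return runs


def rle_decode(runs: list[tuple[int, int]]) -> bytes:
    """Invert rle_encode."""
    return b"".join(bytes([value]) * length for value, length in runs)


def plan_transfer(pages: list[bytes]) -> list[tuple[str, object]]:
    """Build one message per page (hypothetical format).

    ('ref', i)    -> page is identical to the page already sent at index i
    ('rle', runs) -> page is sent as its run-length encoding
    """
    seen: dict[bytes, int] = {}  # fingerprint -> index of first page with it
    messages: list[tuple[str, object]] = []
    for i, page in enumerate(pages):
        fp = hashlib.sha1(page).digest()
        # Compare contents as well, so a hash collision cannot corrupt data.
        if fp in seen and pages[seen[fp]] == page:
            messages.append(("ref", seen[fp]))
        else:
            seen.setdefault(fp, i)
            messages.append(("rle", rle_encode(page)))
    return messages


def rebuild(messages: list[tuple[str, object]]) -> list[bytes]:
    """Destination side: reconstruct the pages from the messages."""
    pages: list[bytes] = []
    for kind, payload in messages:
        if kind == "ref":
            pages.append(pages[payload])
        else:
            pages.append(rle_decode(payload))
    return pages
```

A zero-filled page collapses to a single run, and a repeated page costs only a reference, which is the kind of redundancy the abstract refers to.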

ieee international conference on high performance computing data and analytics | 2009

Adaptive and scalable metadata management to support a trillion files

Jing Xing; Jin Xiong; Ninghui Sun; Jie Ma

Nowadays more and more applications require file systems to efficiently maintain a million or more files. Providing high access performance with such a huge number of files and such large directories is a big challenge for cluster file systems. Limited by static directory structures, existing file systems are prohibitively inefficient for this use. To address this problem, we present a scalable and adaptive metadata management system that aims to maintain a trillion files efficiently. First, our system exploits adaptive two-level directory partitioning based on extendible hashing to manage very large directories. Second, it uses fine-grained parallel processing within a directory, which greatly improves the performance of file creation and deletion. Third, it uses multi-layered metadata cache management, which improves memory utilization on the servers. Finally, it uses a dynamic load-balancing mechanism based on consistent hashing, which enables the system to scale up and down easily. Our performance results on 32 metadata servers show that our user-level prototype implementation can create more than 74 thousand files per second and retrieve the attributes of more than 270 thousand files per second in a single directory with 100 million files. Moreover, it delivers a peak throughput of more than 60 thousand file creates per second in a single directory with 1 billion files.
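The abstract credits consistent hashing for the ability to add and remove metadata servers easily. The following is a generic sketch of a consistent-hash ring, not the paper's design: the class name, the use of MD5, the virtual-node count, and the server names are assumptions for illustration only.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")


class ConsistentHashRing:
    """Generic consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes
        self._points: list[int] = []      # sorted ring positions
        self._owner: dict[int, str] = {}  # ring position -> server name

    def add_server(self, name: str) -> None:
        for i in range(self.vnodes):
            point = _hash(f"{name}#{i}")
            self._owner[point] = name
            bisect.insort(self._points, point)

    def remove_server(self, name: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != name]
        self._owner = {p: s for p, s in self._owner.items() if s != name}

    def lookup(self, key: str) -> str:
        """Return the server owning the first ring position after the key."""
        if not self._points:
            raise LookupError("ring is empty")
        i = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[i]]
```

The property that matters for scaling is that adding a server only moves keys onto the new server; keys never shuffle between the servers that were already present.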

ieee international conference on high performance computing data and analytics | 2004

A failure recovery mechanism for distributed metadata servers in DCFS2

Zhihua Fan; Jin Xiong; Jie Ma

Distributed metadata servers are required for cluster file system scalability. However, how to distribute the file system metadata among multiple metadata servers and how to keep the file system reliable in case of server failures are two difficult problems. We present a journal-based failure recovery mechanism for distributed metadata servers in the Dawning cluster file system, DCFS2. The DCFS2 metadata protocol uses a modified two-phase commit protocol that ensures consistent metadata updates on multiple metadata servers even if one server fails. We focus on the logging policy and concurrency control policy for metadata updates, and on the failure recovery policy. The DCFS2 metadata protocol is compared with the two-phase commit protocol and some of its advantages are shown. Some results of performance experiments on our system are also presented.
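For readers unfamiliar with the baseline that DCFS2 modifies, here is a minimal in-memory sketch of textbook two-phase commit with a per-server journal. It does not reproduce the modified DCFS2 protocol, whose details the abstract does not give; the class, method names, and journal record format are hypothetical.

```python
class MetadataServer:
    """Toy participant that journals every protocol step (illustrative only)."""

    def __init__(self, name: str, fail_on_prepare: bool = False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare  # simulate a failed server
        self.journal: list[tuple[str, int]] = []  # (record_type, txid)
        self.metadata: dict[str, str] = {}        # committed state
        self._pending: dict[int, dict[str, str]] = {}

    def prepare(self, txid: int, update: dict[str, str]) -> bool:
        if self.fail_on_prepare:
            return False
        self.journal.append(("prepare", txid))
        self._pending[txid] = update
        return True

    def commit(self, txid: int) -> None:
        self.journal.append(("commit", txid))
        self.metadata.update(self._pending.pop(txid))

    def abort(self, txid: int) -> None:
        self.journal.append(("abort", txid))
        self._pending.pop(txid, None)


def two_phase_commit(txid, updates):
    """Coordinator. `updates` is a list of (server, update_dict) pairs.

    Phase 1 asks every server to prepare; phase 2 commits only if all
    servers voted yes, otherwise the prepared servers are told to abort.
    """
    prepared = []
    for server, update in updates:
        if server.prepare(txid, update):
            prepared.append(server)
        else:
            for s in prepared:
                s.abort(txid)
            return False
    for server in prepared:
        server.commit(txid)
    return True
```

A cross-server operation such as a rename would pass one update per server; either both servers apply their part or neither does.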

grid and cooperative computing | 2003

Grid Gateway: Message-Passing between Separated Cluster Interconnects

Wei Cui; Jie Ma; Zhigang Huo

Geographically distributed computing requires high-performance clusters to be integrated to solve problems in a computational Grid. Because a cluster interconnect is isolated, its low-level communication protocol does not exchange messages with others directly. This paper presents a plug-in, Grid Gateway, which enables separate low-level communication protocols to communicate with each other. Grid Gateway can be used in many inter-cluster network topologies. It has some dynamic features, such as support for a multi-gateway mechanism to enhance communication performance. Grid Gateway allows low-level communication protocols to take part in high-performance Grid computing. It is therefore expected to support the implementation of Grid-enabled tools on top of it, such as Grid-enabled MPI. This paper describes its architecture and implementation, and presents some design issues.

grid and cooperative computing | 2008

Memory Based Metadata Server for Cluster File Systems

Jing Xing; Jin Xiong; Jie Ma; Ninghui Sun

In high performance computing environments, the metadata servers of a distributed file system are critical to overall system performance. A memory-based metadata server approach is proposed in place of the disk-based approach. We present a metadata management system with matrix organization, a non-overhead reliability mechanism, and a static scalability method, which is designed to utilize large memory efficiently and provide high performance. We examine and demonstrate the performance, the overhead of reliability, and the scalability in a test bed environment of 28 machines. The results show that the performance of our system is higher than that of other traditional distributed file systems, that reliability can be achieved with little overhead, and that the metadata servers scale linearly.

networking architecture and storages | 2006

IncFS: an integrated high-performance distributed file system based on NFS

Yi Zhao; Rongfeng Tang; Jin Xiong; Jie Ma

Scientific computing applications running in a cluster environment require a high performance distributed file system to store and share data. A new approach, IncFS, which builds a high performance distributed file system by integrating many NFS servers, is presented in this paper. IncFS aims to provide a simple and convenient way to achieve high aggregate I/O bandwidth for scientific computing applications that require intensive concurrent file access. IncFS uses a hyper structure to integrate multiple NFS file systems, and it provides multiple data layouts to distribute file data effectively among those NFS servers. Performance evaluations demonstrate that IncFS has very good data access bandwidth with near-perfect scalability, while still maintaining acceptable metadata throughput.
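The abstract does not say which data layouts IncFS provides. As one plausible example of distributing file data across several servers, the sketch below shows plain round-robin striping; the function name, the stripe size, and the server count are assumptions and are not taken from the paper.

```python
def locate(offset: int, stripe_size: int, num_servers: int) -> tuple[int, int]:
    """Map a file offset to (server_index, offset_within_that_server's_file).

    Assumes round-robin striping: stripe 0 goes to server 0, stripe 1 to
    server 1, and so on, wrapping around after the last server.
    """
    if offset < 0 or stripe_size <= 0 or num_servers <= 0:
        raise ValueError("offset must be >= 0; sizes and counts must be > 0")
    stripe = offset // stripe_size            # global stripe number
    server = stripe % num_servers             # which server holds it
    local_stripe = stripe // num_servers      # stripe number on that server
    local_offset = local_stripe * stripe_size + offset % stripe_size
    return server, local_offset
```

With a 64 KiB stripe over four servers, a large sequential read touches all four servers in turn, which is how striping turns per-server bandwidth into aggregate bandwidth.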

international parallel and distributed processing symposium | 2007

United-FS: A Logical File System Providing a Single Image of Multiple Physical File Systems on NFS Server

Huan Chen; Yi Zhao; Jin Xiong; Jie Ma; Ninghui Sun

NFS is considered to be the bottleneck in cluster computing environments because of its limited resources and centralized data management. With the development of hardware, an NFS server now has more than one I/O channel, more storage space, and a more powerful CPU. In this paper, we describe the design and implementation of a new logical file system called United-FS. It makes storage devices connected to multiple I/O channels work concurrently and cooperatively. It can be exported by an NFS server to provide a single file system image to clients, hiding the variety of native file systems built on different types of storage devices. This paper also compares United-FS with a software RAID system through both theoretical analysis and experiments. The results show that United-FS is much more flexible and that its performance is better than software RAID in most cases.

international parallel and distributed processing symposium | 2002

Semi-user-level communication architecture

Dan Meng; Jie Ma; Jin He; Limin Xiao; Zhiwei Xu

This paper introduces semi-user-level communication architecture, a new high-performance, lightweight communication architecture for inter-node communication in clusters. Unlike traditional kernel-level networking architecture and user-level communication architecture, semi-user-level communication architecture removes the OS kernel from its message-receiving path while retaining an OS trap on its message-sending path. No interrupt handling is needed. This new communication architecture does not support user-level access to the network interface. It provides good portability, security, and support for heterogeneous networking environments and the use of large memory. Semi-user-level communication architecture has been implemented on an SMP workstation cluster system called DAWNING-3000, which is interconnected through Myrinet. Communication performance results are given and the overhead distribution is analyzed.

international parallel and distributed processing symposium | 2008

HPPNET: A novel network for HPC and its implication for communication software

Panyong Zhang; Can Ma; Jie Ma; Qiang Li; Dan Meng

With the widespread adoption of multicore processors in high performance computing (HPC) environments, the balance between computation and communication is shifting towards computation. It is becoming more important to design a highly efficient network for HPC systems, which commonly involves two challenges: 1) providing a communication environment with low latency, high bandwidth, and a high small-message processing rate; and 2) efficiently supporting the partitioned global address space (PGAS) programming model. With respect to these needs, this paper proposes a novel network named HPPNET. By adopting a HyperTransport interface, a separate channel design, on-loading of part of the processing work to the host, and transparent direct load/store in a global physical address space, HPPNET can support both needs for HPC. Meanwhile, we have adopted several key technologies to minimize the implications of the new network for communication software. Evaluation shows that the HPPNET hardware design achieves high performance and poses no barrier to highly efficient software design. Our results also show that an 8-byte remote store costs 0.4 μs in the HPPNET prototype.

grid and cooperative computing | 2006

Research on Key Technologies of Load Balancing for NFS Server with Multiple Network Paths

Huan Chen; Rongfeng Tang; Yi Zhao; Jin Xiong; Jie Ma; Ninghui Sun

An NFS server is designed to run on a single node. Even if the NFS server is configured with multiple network interfaces, each client can only access it through one of those interfaces. In this paper, we design and implement a multi-path load distribution mechanism in the SunRPC layer of NFS. This mechanism allows NFS clients to use the aggregate network bandwidth of a server that has multiple network channels, and it balances the load among those channels. The results show that the extended version of NFS makes good use of the server's multiple paths and can tolerate network failure.
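The abstract does not state which policy is used to spread requests over the server's network channels. A minimal sketch of one possible policy, round-robin selection that skips failed paths, is given below; the class name, method names, and interface names are hypothetical and do not describe the SunRPC-layer implementation.

```python
class PathSelector:
    """Round-robin choice among network paths, skipping failed ones.

    Illustrative only; the actual distribution policy is not specified
    in the abstract.
    """

    def __init__(self, paths):
        self.paths = list(paths)
        self.alive = set(self.paths)
        self._next = 0  # index of the next path to try

    def mark_failed(self, path) -> None:
        self.alive.discard(path)

    def mark_recovered(self, path) -> None:
        if path in self.paths:
            self.alive.add(path)

    def pick(self):
        """Return the next live path, or raise if none is left."""
        for _ in range(len(self.paths)):
            path = self.paths[self._next]
            self._next = (self._next + 1) % len(self.paths)
            if path in self.alive:
                return path
        raise ConnectionError("no live network path to server")
```

Each outgoing request would call `pick()` to choose its path, so load spreads evenly while all paths are up and shifts to the remaining paths when one fails.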
