Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Xunfei Jiang is active.

Publication


Featured research published by Xunfei Jiang.


International Performance Computing and Communications Conference | 2012

Thermal modeling and analysis of storage systems

Xunfei Jiang; Mohammed I. Alghamdi; Ji Zhang; Maen M. Al Assaf; Xiaojun Ruan; Tausif Muzaffar; Xiao Qin

Recognizing that power and cooling costs for data centers are increasing, this study addresses the thermal impact of storage systems. In the first phase of this work, we generate the thermal profile of a storage server containing three hard disks. The profiling results show that disks have a thermal impact on the overall storage node temperature comparable to that of processing and networking elements. We develop a thermal model to estimate the outlet temperature of a storage server based on processor and disk utilization. The thermal model is validated against data acquired by an infrared thermometer as well as built-in temperature sensors on the disks. Next, we apply the thermal model to investigate the thermal impact of workload management on storage systems. Our study suggests that disk-aware thermal management techniques have a significant impact on reducing the cooling cost of storage systems. We further show that this work can be extended to analyze the cooling cost of data centers with massive storage capacity.
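As a rough illustration of the kind of model described above, the sketch below estimates a storage server's outlet temperature as a linear function of inlet temperature, CPU utilization, and per-disk utilization; the coefficients and example numbers are hypothetical placeholders, not values from the paper.

```python
# Minimal sketch of an outlet-temperature model in the spirit described above.
# Coefficients are hypothetical placeholders, not values from the paper.

def estimate_outlet_temp(inlet_temp_c, cpu_util, disk_utils,
                         cpu_coeff=8.0, disk_coeff=3.5):
    """Estimate storage-server outlet temperature (degrees C).

    cpu_util and each entry of disk_utils are in [0, 1].
    """
    disk_term = disk_coeff * sum(disk_utils)      # each busy disk adds heat
    return inlet_temp_c + cpu_coeff * cpu_util + disk_term

# Example: 24 C inlet air, CPU at 60% load, three disks at 70% utilization each.
print(estimate_outlet_temp(24.0, 0.6, [0.7, 0.7, 0.7]))
```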


IEEE Transactions on Parallel and Distributed Systems | 2016

Efficient Parallel Skyline Evaluation Using MapReduce

Ji Zhang; Xunfei Jiang; Wei-Shinn Ku; Xiao Qin

This research develops an advanced two-phase MapReduce solution that efficiently addresses skyline queries on large datasets. Unlike existing parallel skyline approaches, our scheme considers data partitioning, filtering, and parallel skyline evaluation as a holistic query process. In particular, we apply filtering techniques and angle-based partitioning in the first phase, in which unqualified objects are discarded and the remaining objects are partitioned by their angles to the origin. In the second phase, local skyline objects in each partition are calculated in parallel, and global skyline objects are output after a merging skyline process. To improve the parallel local skyline calculation, we propose two partition-aware filtering methods that keep skyline candidates in a balanced manner. Aggressive partition-aware filtering aggressively eliminates objects in the partition with the greatest population of candidate objects, whereas proportional partition-aware filtering slows down the growth of partition populations proportionally. Recognizing the lack of studies that incorporate the MapReduce framework into parallel skyline processing, we propose a partial-presort grid-based partition skyline algorithm that significantly improves the merging skyline computation on large datasets. The presort process can be completed in the shuffle phase with little overhead. Our experimental results show the efficiency and effectiveness of the proposed parallel skyline solution utilizing MapReduce on large-scale datasets.
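A minimal sketch of the angle-based partition-then-merge idea, shown here as plain Python on 2-D points rather than an actual MapReduce job; the data and number of partitions are invented, and the paper's filtering and presort optimizations are omitted.

```python
# Angle-based partitioning followed by per-partition ("local") skyline
# computation and a final merge, for 2-D points minimized in both dimensions.
import math

def dominates(a, b):
    """True if a dominates b (no worse in every dimension, better in at least one)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline(points):
    return [p for p in points if not any(dominates(q, p) for q in points if q != p)]

def angle_partition(points, num_partitions):
    """Bucket points by their angle to the origin (the map-side step)."""
    buckets = [[] for _ in range(num_partitions)]
    for p in points:
        angle = math.atan2(p[1], p[0])          # in [0, pi/2] for positive coords
        idx = min(int(angle / (math.pi / 2) * num_partitions), num_partitions - 1)
        buckets[idx].append(p)
    return buckets

points = [(1, 9), (2, 6), (3, 3), (6, 2), (9, 1), (5, 5), (7, 7)]
local = [skyline(b) for b in angle_partition(points, 3)]        # parallel local skylines
global_skyline = skyline([p for part in local for p in part])   # merging skyline step
print(global_skyline)
```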


Signal Processing Systems | 2013

Eco-Storage: A Hybrid Storage System with Energy-Efficient Informed Prefetching

Maen M. Al Assaf; Xunfei Jiang; Mohamed Riduan Abid; Xiao Qin

In this paper, we present an energy-aware informed prefetching technique called Eco-Storage that makes use of application-disclosed access patterns to group the informed prefetching process in a hybrid storage system (e.g., hard disk drives and solid state disks). Since SSDs are more energy efficient than HDDs, aggressively prefetching data from the HDD level allows the HDD to stay in standby as long as possible in order to save power. In the Eco-Storage system, the application can still serve its on-demand I/O read requests from the hybrid storage system while data blocks are prefetched in groups from the HDD to the SSD. We show that these two steps can be handled in parallel to decrease the system's power consumption. Our Eco-Storage technique differs from existing energy-aware prefetching schemes in two ways. First, Eco-Storage is implemented in a hybrid storage system in which the SSD level is more energy efficient. Second, it groups the informed prefetching process and quickly prefetches data from the HDD to the SSD to increase the frequency of HDD standby periods, which enables the application to find most of its on-demand I/O read requests in the SSD level. Finally, we develop a simulator to evaluate the performance of our Eco-Storage system. Our results show that Eco-Storage reduces power consumption by at least 75% compared with the worst case of the non-Eco-Storage approach using a real-world I/O trace.
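The sketch below illustrates the grouped-prefetching idea in a toy hybrid store, assuming hypothetical block IDs, a made-up group size, and in-memory dictionaries standing in for the HDD and SSD tiers; it is a simplification, not the Eco-Storage implementation.

```python
# Grouped informed prefetching from an HDD tier into an SSD tier (toy model).

class HybridStore:
    def __init__(self, hdd_blocks, group_size=4):
        self.hdd = dict(hdd_blocks)   # block_id -> data, the slow tier
        self.ssd = {}                 # prefetched blocks, the fast tier
        self.group_size = group_size
        self.hdd_accesses = 0

    def prefetch_hints(self, hinted_ids):
        """Prefetch hinted blocks in groups so the HDD can stay in standby between groups."""
        for i in range(0, len(hinted_ids), self.group_size):
            group = hinted_ids[i:i + self.group_size]
            for block_id in group:
                if block_id in self.hdd:
                    self.ssd[block_id] = self.hdd[block_id]
                    self.hdd_accesses += 1
            # ... in a real system the HDD could drop to standby here until the next group

    def read(self, block_id):
        """On-demand read: served from the SSD tier when possible."""
        if block_id in self.ssd:
            return self.ssd[block_id]
        self.hdd_accesses += 1
        return self.hdd[block_id]

store = HybridStore({i: f"data-{i}" for i in range(16)})
store.prefetch_hints(list(range(8)))      # application-disclosed access pattern
print(store.read(3), store.hdd_accesses)  # read hits the SSD tier
```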


Journal of Network and Computer Applications | 2017

Scalable influence maximization under independent cascade model

Feng Lu; Weikang Zhang; Liwen Shao; Xunfei Jiang; Peng Xu; Hai Jin

Influence maximization is a fundamental problem that aims at finding a small subset of seed nodes that maximizes the spread of influence in social networks. The influence maximization problem plays an important role in viral marketing, which is widely adopted in social network advertising. However, finding an optimal solution is NP-hard; a more realistic approach is to find a balance between effectiveness and efficiency. A greedy algorithm and its improvements (including the Cost-Effective Lazy Forward (CELF) algorithm) were developed to provide an approximate solution with a guarantee of (1 - 1/e), but the method still suffers from high computational overhead. In this paper, we analyse the bottleneck of the greedy algorithm and propose a more efficient method to replace its time-consuming part. We then design a CascadeDiscount algorithm to solve the influence maximization problem. The experimental results on real-world datasets demonstrate that (1) our CascadeDiscount algorithm maintains an influence spread close to CELF and performs better than two heuristic methods, DegreeDiscountIC and TwoStage; and (2) our CascadeDiscount method runs two orders of magnitude faster than CELF over real-world datasets.
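For context, the sketch below shows the independent cascade model and the plain greedy seed-selection baseline that CELF accelerates; the CascadeDiscount heuristic itself is not reproduced, and the graph, propagation probability, and simulation count are illustrative assumptions.

```python
# Independent cascade (IC) simulation plus a plain greedy seed-selection baseline.
import random

def simulate_ic(graph, seeds, p=0.1):
    """One Monte Carlo run of the IC model; returns the number of activated nodes."""
    active, frontier = set(seeds), list(seeds)
    while frontier:
        nxt = []
        for u in frontier:
            for v in graph.get(u, []):
                if v not in active and random.random() < p:
                    active.add(v)
                    nxt.append(v)
        frontier = nxt
    return len(active)

def expected_spread(graph, seeds, p=0.1, runs=200):
    return sum(simulate_ic(graph, seeds, p) for _ in range(runs)) / runs

def greedy_seeds(graph, k, p=0.1):
    """Greedy baseline: repeatedly add the node with the largest marginal gain."""
    seeds = []
    for _ in range(k):
        best = max((n for n in graph if n not in seeds),
                   key=lambda n: expected_spread(graph, seeds + [n], p))
        seeds.append(best)
    return seeds

graph = {0: [1, 2], 1: [2, 3], 2: [3], 3: [4], 4: []}
print(greedy_seeds(graph, 2))
```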


International Performance Computing and Communications Conference | 2012

Improving write performance by enhancing internal parallelism of Solid State Drives

Xiaojun Ruan; Ziliang Zong; Mohammed I. Alghamdi; Yun Tian; Xunfei Jiang; Xiao Qin

Most research on Solid State Drive (SSD) architectures focuses on Flash Translation Layer (FTL) algorithms and wear-leveling; however, internal parallelism in SSDs has not been well explored. In this research, we propose a new strategy to improve SSD write performance by enhancing the internal parallelism inside SSDs. An SDRAM buffer is added to the design for buffering and scheduling write requests. Because the same logical block numbers may be translated to different physical numbers at different times by the FTL, the on-board SDRAM buffer is used to buffer requests at the level below the FTL. When the buffer is full, the same amount of data is assigned to each storage package in the SSD to enhance internal parallelism. To accurately evaluate performance, we use both synthetic workloads and real-world applications in our experiments. We compare the enhanced internal parallelism scheme with the traditional LRU strategy, since it would be unfair to compare an SSD with a buffer against an SSD without one. The simulation results demonstrate that the write performance of our design is significantly improved compared with the LRU-cache strategy using the same buffer size.
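A minimal sketch of the buffering-and-striping idea, assuming a hypothetical package count and buffer capacity; it illustrates even write distribution across flash packages, not the paper's SSD design.

```python
# Buffer write requests in DRAM; when the buffer fills, distribute them evenly
# across flash packages so they can be serviced in parallel.

class ParallelSSD:
    def __init__(self, num_packages=4, buffer_capacity=8):
        self.num_packages = num_packages
        self.buffer_capacity = buffer_capacity
        self.buffer = []                              # on-board SDRAM write buffer
        self.packages = [[] for _ in range(num_packages)]

    def write(self, logical_block, data):
        self.buffer.append((logical_block, data))
        if len(self.buffer) >= self.buffer_capacity:
            self.flush()

    def flush(self):
        """Assign an equal share of buffered writes to each package."""
        for i, (lbn, data) in enumerate(self.buffer):
            self.packages[i % self.num_packages].append((lbn, data))
        self.buffer.clear()

ssd = ParallelSSD()
for lbn in range(8):
    ssd.write(lbn, b"x" * 4096)
print([len(pkg) for pkg in ssd.packages])   # -> [2, 2, 2, 2]
```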


Signal Processing Systems | 2018

Informed Prefetching for Distributed Multi-Level Storage Systems

Maen M. Al Assaf; Xunfei Jiang; Xiao Qin; Mohamed Riduan Abid; Meikang Qiu; Jifu Zhang

In this paper, we present an informed prefetching technique called IPODS that makes use of application-disclosed access patterns to prefetch hinted blocks in distributed multi-level storage systems. We develop a prefetching pipeline in IPODS, where an informed prefetching process is divided into a set of independent prefetching steps and separated among multiple storage levels in a distributed system. In the IPODS system, while data blocks are prefetched from hard disks to memory buffers in remote storage servers, data blocks buffered in the servers are prefetched through networks to the clients’ local cache. We show that these two prefetching steps can be handled in a pipelining manner to improve I/O performance of distributed storage systems. Our IPODS technique differs from existing prefetching schemes in two ways. First, it reduces applications’ I/O stalls by keeping hinted data in clients’ local caches and storage servers’ fast buffers (e.g., solid state disks). Second, in a prefetching pipeline, multiple informed prefetching mechanisms coordinate semi-dependently to fetch blocks (1) from low-level (slow) to high-level (fast) storage devices in servers and (2) from high-level devices in servers to the clients’ local cache. The prefetching pipeline in IPODS judiciously hides network latency in distributed storage systems, thereby reducing the overall I/O access time in distributed systems. Using a wide range of real-world I/O traces, our experiments show that IPODS can noticeably improve I/O performance of distributed storage systems by 6%.
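The sketch below illustrates the two-step pipeline in the simplest possible form: in each "cycle," one block moves from the server buffer to the client cache while the next block is fetched from disk into the server buffer. The queue-based structure and block names are my own simplification, not the IPODS implementation.

```python
# Two-stage prefetching pipeline: disk -> server buffer and server buffer -> client cache
# proceed concurrently (modeled here as one step of each per loop iteration).
from collections import deque

def pipeline_prefetch(hinted_blocks):
    server_buffer = deque()
    client_cache = []
    pending = deque(hinted_blocks)
    while pending or server_buffer:
        # Stage 2: ship one buffered block over the network to the client cache.
        if server_buffer:
            client_cache.append(server_buffer.popleft())
        # Stage 1: in the same "cycle", the server fetches the next hinted block from disk.
        if pending:
            server_buffer.append(pending.popleft())
    return client_cache

print(pipeline_prefetch(["b0", "b1", "b2", "b3"]))
```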


Journal of Communications | 2014

Thermal Modeling and Analysis of Cloud Data Storage Systems

Xunfei Jiang; Mohammed I. Alghamdi; Maen M. Al Assaf; Xiaojun Ruan; Ji Zhang; Meikang Qiu; Xiao Qin

An explosive increase in data volumes and the variety of data analysis workloads make it indispensable to lower the power and cooling costs of cloud data centers. To address this issue, we investigate the thermal impact of I/O access patterns on data storage systems. First, we conduct preliminary experiments to study the thermal behavior of a data storage node. The experimental results show that disks have a thermal impact on the outlet temperatures of storage nodes comparable to that of processors. We propose an approach to model the outlet temperature of a storage node; the thermal models generated by our approach achieve a prediction error of less than 6%. Next, we investigate the thermal impact of data placement strategies on storage systems and compare the cooling cost of storage systems governed by different data placement schemes. Our study shows that evenly distributing the data leads to the highest outlet temperature, in exchange for the shortest execution time and best energy efficiency. Based on the energy consumption of the various data placement schemes, we propose a thermal-aware, energy-efficient data placement strategy. We further show that this work can be extended to analyze the cooling cost of data centers with massive storage capacity.

Big data, composed of a collection of huge and complex data sets, has been positioned as a must-have commodity and resource in industry, government, and academia. Processing big data requires a large-scale storage system, which increases both power and cooling costs. In this study, we investigate the thermal behavior of real storage systems and their I/O access patterns, which offers a guideline for building energy-efficient cloud storage systems. The cooling consumption of data centers can be considerably reduced by efficient thermal management of storage systems; however, disks are not considered in traditional thermal models for data centers. In this paper, we investigate the thermal impact of hard disks, propose a thermal modeling approach for storage systems, and estimate the outlet temperature of a storage server by applying the proposed model.
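As a hedged worked example of why lower outlet temperatures cut cooling cost, the sketch below uses a commonly cited data-center cooling model, CoP(T) = 0.0068 T^2 + 0.0008 T + 0.458, which is not necessarily the model used in this paper; the supply temperatures and server power are made up.

```python
# Cooling power needed to remove a server's heat at a given cold-air supply
# temperature, using a commonly cited CoP curve (an assumption, not this paper's model).

def cooling_power(server_power_w, supply_temp_c):
    """Watts of cooling power required to remove server_power_w of heat."""
    cop = 0.0068 * supply_temp_c ** 2 + 0.0008 * supply_temp_c + 0.458
    return server_power_w / cop

# Lower outlet temperatures let the cooling system run at a higher supply
# temperature, which raises CoP and reduces cooling power for the same IT load.
for supply_c in (18.0, 25.0):
    print(supply_c, round(cooling_power(300.0, supply_c), 1), "W of cooling")
```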


Networking Architecture and Storages | 2013

PEAM: Predictive Energy-Aware Management for Storage Systems

Xunfei Jiang; Ji Zhang; Mohammed I. Alghamdi; Xiao Qin; Minghua Jiang; Jifu Zhang

This paper presents a novel Predictive Energy-Aware Management (PEAM) system that reduces the energy costs of storage systems by appropriately selecting data transmission methods. In particular, we evaluate the energy costs of three methods in preliminary experiments: (1) transferring data without archiving or compression, (2) archiving and then transferring the data, and (3) compressing and then transferring the data. From the results, we observe that the energy consumption of data transmission varies greatly from case to case, so no single method can simply be applied in all cases. We therefore design an energy prediction model that estimates the total energy cost of transmitting data with a particular transmission method. Based on the model, our predictive energy-aware management system automatically selects the most energy-efficient method for data transmission. Our experimental results show that our system is more energy efficient than always using any single one of the three transmission methods.
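A minimal sketch of the selection idea: predict the energy of each candidate transmission method and pick the cheapest. The per-byte energy constants and the compression-ratio estimate below are illustrative assumptions, not measured values or the paper's model.

```python
# Predict per-method transmission energy and select the cheapest method.

def predict_energy(size_bytes, method, compress_ratio=0.6,
                   e_net=5e-8, e_archive=1e-8, e_compress=4e-8):
    """Rough per-method energy estimate in joules (constants are invented)."""
    if method == "raw":
        return size_bytes * e_net
    if method == "archive":
        return size_bytes * e_archive + size_bytes * e_net
    if method == "compress":
        return size_bytes * e_compress + size_bytes * compress_ratio * e_net
    raise ValueError(method)

def choose_method(size_bytes):
    return min(("raw", "archive", "compress"),
               key=lambda m: predict_energy(size_bytes, m))

print(choose_method(50 * 1024 * 1024))
```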


Mobile Data Management | 2013

Evaluation of Spatial Keyword Queries with Partial Result Support on Spatial Networks

Ji Zhang; Wei-Shinn Ku; Xunfei Jiang; Xiao Qin; Yu-Ling Hsueh

Numerous geographic information system applications need to retrieve spatial objects that bear user-specified keywords close to a given location. In this research, we present efficient approaches to answering spatial keyword queries on spatial networks. In particular, we formally introduce definitions of Spatial Keyword k Nearest Neighbor (SKkNN) and Spatial Keyword Range (SKR) queries. We then present the framework of a spatial keyword query evaluation system comprising a Keyword Constraint Filter (KCF), Keyword and Spatial Refinement (KSR), and a spatial keyword ranker. KCF employs an inverted index to calculate the keyword relevancy of spatial objects, and KSR refines intermediate results by considering both spatial and keyword constraints together with the spatial keyword ranker. In addition, we design novel algorithms for evaluating SKkNN and SKR queries. These algorithms employ the inverted index technique, shortest path search algorithms, and network Voronoi diagrams. Our extensive simulations show that the proposed SKkNN and SKR algorithms answer spatial keyword queries effectively and efficiently.
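A minimal sketch of the keyword-filter-then-rank flow, with Euclidean distance standing in for network distance and a tiny invented dataset; the paper's network Voronoi diagrams and shortest-path search are not reproduced here.

```python
# Spatial keyword kNN flow: an inverted index filters candidates by keyword
# (the KCF idea), then the survivors are ranked by distance to the query point.
from collections import defaultdict
from math import dist

objects = {
    "o1": ((2.0, 3.0), {"coffee", "wifi"}),
    "o2": ((5.0, 1.0), {"coffee"}),
    "o3": ((1.0, 1.0), {"pizza"}),
}

inverted = defaultdict(set)
for oid, (_, keywords) in objects.items():
    for kw in keywords:
        inverted[kw].add(oid)

def sk_knn(query_loc, query_keywords, k):
    # Keyword constraint filter: candidates must contain every query keyword.
    candidates = set.intersection(*(inverted[kw] for kw in query_keywords))
    # Spatial refinement / ranking: order surviving objects by distance.
    return sorted(candidates, key=lambda oid: dist(query_loc, objects[oid][0]))[:k]

print(sk_knn((0.0, 0.0), {"coffee"}, k=1))   # -> ['o1']
```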


Journal of Network and Computer Applications | 2017

Towards two-phase scheduling of real-time applications in distributed systems

Mohammed I. Alghamdi; Xunfei Jiang; Ji Zhang; Jifu Zhang; Minghua Jiang; Xiao Qin

In this work, we propose a two-phase scheduling technique (TOPS) for distributed real-time systems. Our TOPS approach has two distinct phases: the first phase is in charge of producing a scheduling sequence, whereas the second phase dispatches tasks to the computing nodes of a distributed system and judiciously determines the starting time of each task. One salient feature of our approach is its high flexibility, which allows system developers to apply multiple policies in each phase. The two phases are independent of one another; therefore, one can change a policy in one phase without reconfiguring the other. With TOPS in place, we are able to observe the impact of sorting policies on the performance of scheduling policies. We implement a prototype of TOPS in which the first phase comprises three sorting policies and the second phase consists of two scheduling policies. TOPS enables us to discover that combining the EDF (earliest deadline first) and AEAP (as early as possible) policies yields the best performance among all six candidate algorithms.
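A minimal sketch of a two-phase flow in the spirit of TOPS: phase one sorts the task set by deadline (EDF) and phase two dispatches each task as early as possible (AEAP) onto the least-loaded node; the task fields and node count are invented, and this is not the TOPS prototype.

```python
# Phase 1: sort tasks by deadline (EDF). Phase 2: dispatch each task as early
# as possible (AEAP) onto whichever node frees up first.

def edf_sort(tasks):
    """Phase 1: produce a scheduling sequence ordered by deadline."""
    return sorted(tasks, key=lambda t: t["deadline"])

def aeap_dispatch(sequence, num_nodes):
    """Phase 2: assign each task the earliest possible start time on some node."""
    node_free_at = [0.0] * num_nodes
    schedule = []
    for task in sequence:
        node = min(range(num_nodes), key=lambda n: node_free_at[n])
        start = node_free_at[node]
        node_free_at[node] = start + task["exec_time"]
        schedule.append((task["id"], node, start))
    return schedule

tasks = [
    {"id": "t1", "exec_time": 4.0, "deadline": 10.0},
    {"id": "t2", "exec_time": 2.0, "deadline": 5.0},
    {"id": "t3", "exec_time": 3.0, "deadline": 8.0},
]
print(aeap_dispatch(edf_sort(tasks), num_nodes=2))
```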

Collaboration


Dive into Xunfei Jiang's collaborations.

Top Co-Authors

Jifu Zhang
Taiyuan University of Science and Technology

Xiaojun Ruan
West Chester University of Pennsylvania