
Yinlong Xu

University of Science and Technology of China


Publication

Featured research published by Yinlong Xu.

measurement and modeling of computer systems | 2010

Optimal recovery of single disk failure in RDP code storage systems

Liping Xiang; Yinlong Xu; John C. S. Lui; Qian Chang

Modern storage systems use thousands of inexpensive disks to meet the storage requirements of applications. To enhance data availability, some form of redundancy is used. For example, conventional RAID-5 systems provide data availability for a single disk failure only, while recent advanced coding techniques such as row-diagonal parity (RDP) can provide data availability with up to two disk failures. To reduce the probability of data unavailability, whenever a single disk fails, disk recovery (or rebuild) will be carried out. We show that the conventional recovery scheme of RDP code for a single disk failure is inefficient and suboptimal. In this paper, we propose an optimal and efficient disk recovery scheme, Row-Diagonal Optimal Recovery (RDOR), for single disk failure of RDP code that has the following properties: (1) it is read optimal in the sense that it issues the smallest number of disk reads to recover the failed disk; (2) it has the load balancing property that all surviving disks will be subjected to the same amount of additional workload in rebuilding the failed disk. We carefully explore the design state space and theoretically show the optimality of RDOR. We carry out performance evaluation to quantify the merits of RDOR on some widely used disks.

international parallel and distributed processing symposium | 2008

SIFT implementation and optimization for multi-core systems

Qi Zhang; Yurong Chen; Yimin Zhang; Yinlong Xu

Scale invariant feature transform (SIFT) is an approach for extracting distinctive invariant features from images, and it has been successfully applied to many computer vision problems (e.g. face recognition and object detection). However, SIFT feature extraction is compute-intensive, and a real-time or even super-real-time processing capability is required in many emerging scenarios. Nowadays, with the multi-core processor becoming mainstream, SIFT can be accelerated by fully utilizing the computing power of available multi-core processors. In this paper, we propose two parallel SIFT algorithms and present some optimization techniques to improve the implementation's performance on multi-core systems. The results show that our improved parallel SIFT implementation can process general video images in super-real-time on a dual-socket, quad-core system, and the speed is much faster than the implementation on GPUs. We also conduct a detailed scalability and memory performance analysis on the 8-core system and on a 32-core chip multiprocessor (CMP) simulator. The analysis helps us identify possible causes of bottlenecks, and we suggest avenues for scalability improvement to make this application more powerful on future large-scale multi-core systems.

2011 IEEE/IFIP 19th International Conference on VLSI and System-on-Chip | 2011

STT-RAM based energy-efficiency hybrid cache for CMPs

Jianhua Li; Chun Jason Xue; Yinlong Xu

Modern high performance Chip Multiprocessor (CMP) systems rely on a large on-chip cache hierarchy. As technology scales down, the leakage power of present SRAM based caches gradually dominates the on-chip power consumption, which can severely jeopardize system performance. The emerging nonvolatile Spin Transfer Torque RAM (STT-RAM) is a promising candidate for large on-chip caches because of its ultra low leakage power. However, write operations on STT-RAM consume considerably more energy and have longer latency than on SRAM, which makes STT-RAM a poor fit for write-intensive workloads. In this paper, we propose to integrate SRAM with STT-RAM to construct a novel hybrid cache architecture for CMPs. We also propose dedicated microarchitectural mechanisms to make the hybrid cache robust to workloads with different write patterns. Extensive simulation results demonstrate that the proposed hybrid scheme is adaptive to variations of workloads. Overall power consumption is reduced by 37.1% and performance is improved by 23.6% on average compared with SRAM based static NUCA under the same area configuration.

Journal of Lightwave Technology | 2010

Integrated Fiber-Wireless (FiWi) Access Networks Supporting Inter-ONU Communications

Yan Li; Jianping Wang; Chunming Qiao; Ashwin Gumaste; Yun Xu; Yinlong Xu

Integrated fiber-wireless (FiWi) access networks provide a powerful platform to improve the throughput of peer-to-peer communication by enabling traffic to be sent from the source wireless client to an ingress optical network unit (ONU), then to the egress ONU close to the destination wireless client, and finally delivered to the destination wireless client. The wireless-optical-wireless communication mode introduced by FiWi access networks can reduce the interference in the wireless subnetwork, thus improving network throughput. With support for direct inter-ONU communication in the optical subnetwork, the throughput of peer-to-peer communication in a FiWi access network can be further improved. In this paper, we propose a novel hybrid wavelength division multiplexed/time division multiplexed passive optical network (WDM/TDM PON) architecture supporting direct inter-ONU communication, a corresponding decentralized dynamic bandwidth allocation (DBA) protocol for inter-ONU communication, and an algorithm to dynamically select the egress ONU. The complexity of the proposed architecture is analyzed and compared with other alternatives, and the efficiency of the proposed system is validated by simulations.

Information Processing Letters | 2002

A note on the minimum label spanning tree

Yingyu Wan; Guoliang Chen; Yinlong Xu

We give a tight analysis of the greedy algorithm introduced by Krumke and Wirth for the minimum label spanning tree problem. The algorithm is shown to be a (ln(n - 1) + 1)-approximation for any graph with n nodes (n > 1), which improves the known performance guarantee 2 ln n + 1.
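To make the greedy rule concrete, here is a minimal sketch. It assumes that the rule repeatedly adds the label whose edges reduce the number of connected components the most; the abstract does not spell the rule out, so this is an illustration of that family of greedy algorithms, not necessarily the exact procedure analyzed in the paper. The function names and the edge-list input format are hypothetical.

```python
def count_components(n, edges):
    """Count connected components of an n-node graph using union-find."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    comps = n
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            comps -= 1
    return comps


def greedy_min_labels(n, labeled_edges):
    """labeled_edges is a list of (u, v, label); returns the chosen label set.

    Assumed greedy rule: at each step add the label that leaves the fewest
    connected components (ties broken by sorted label order).
    """
    all_labels = {lab for _, _, lab in labeled_edges}
    chosen = set()
    comps = n
    while comps > 1:
        best_label, best_comps = None, comps
        for lab in sorted(all_labels - chosen):
            edges = [(u, v) for u, v, l in labeled_edges
                     if l in chosen or l == lab]
            c = count_components(n, edges)
            if c < best_comps:
                best_label, best_comps = lab, c
        if best_label is None:
            raise ValueError("graph is not connected")
        chosen.add(best_label)
        comps = best_comps
    return chosen


example_edges = [(0, 1, "a"), (1, 2, "a"), (2, 3, "b"), (0, 3, "c"), (1, 3, "c")]
example_labels = greedy_min_labels(4, example_edges)
```

Any spanning tree of the subgraph formed by the chosen labels then uses at most that many labels.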

IEEE Transactions on Wireless Communications | 2009

Minimum-energy all-to-all multicasting in wireless ad hoc networks

Weifa Liang; Richard P. Brent; Yinlong Xu; Qingshan Wang

A wireless ad hoc network consists of mobile nodes that are powered by batteries. The limited battery lifetime imposes a severe constraint on network performance, so energy conservation in such a network is of paramount importance, and energy efficient operations are critical to prolonging the lifetime of the network. All-to-all multicasting is one fundamental operation in wireless ad hoc networks. In this paper we focus on the design of energy efficient routing algorithms for this operation. Specifically, we consider the following minimum-energy all-to-all multicasting problem. Given an all-to-all multicast session consisting of a set of terminal nodes in a wireless ad hoc network, where the transmission power of each node is either fixed or adjustable, and assuming that each terminal node has a message to share with every other terminal node, the problem is to build a shared multicast tree spanning all terminal nodes such that the total energy consumption of realizing the all-to-all multicast session by the tree is minimized. We first show that this problem is NP-complete. We then devise approximation algorithms with guaranteed approximation ratios. We also provide a distributed implementation of the proposed algorithm. We finally conduct experiments by simulations to evaluate the performance of the proposed algorithm. The experimental results demonstrate that the proposed algorithm significantly outperforms all the other known algorithms.

IEEE Transactions on Wireless Communications | 2011

Coding-Based Data Broadcast Scheduling in On-Demand Broadcast

Cheng Zhan; Victor C. S. Lee; Jianping Wang; Yinlong Xu

With data broadcast, multiple requests for the same data item can be satisfied in a single broadcast tick. However, there was no significant breakthrough in performance improvement until recent studies proposed using network coding in data broadcast. After an encoded packet that combines a number of data items is broadcast, multiple clients can retrieve different requested data items in a single broadcast tick. This not only utilizes bandwidth more efficiently, but also improves system performance. In this work, we propose a generalized encoding framework to incorporate network coding into data scheduling algorithms for on-demand broadcast. In the framework, data scheduling can be formulated as a weighted maximum clique problem in a graph where the weight of the clique is defined according to the performance objectives of the applications. Under the proposed framework, existing data scheduling algorithms for on-demand broadcast can be migrated into their corresponding coding versions while preserving their original criteria in scheduling data items. Our simulation results using a number of representative scheduling algorithms show that significant performance improvement can be achieved with coding.
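The clique formulation can be illustrated with a small sketch. It rests on several assumptions that are not stated in the abstract: encoding is a bytewise XOR of equal-length items, each client holds a cache of previously received items, two requests are joined by an edge when they ask for the same item or when each client caches the item the other wants, and every request has weight 1 (so the weighted clique reduces to a maximum clique, found here by brute force). All names below are hypothetical.

```python
from itertools import combinations


def compatible(a, b, cache):
    """Two requests (client, item) can share one XOR-encoded packet."""
    (c1, d1), (c2, d2) = a, b
    return d1 == d2 or (d1 in cache[c2] and d2 in cache[c1])


def max_clique(requests, cache):
    """Brute-force maximum clique; fine for a handful of requests only."""
    for size in range(len(requests), 0, -1):
        for group in combinations(requests, size):
            if all(compatible(a, b, cache) for a, b in combinations(group, 2)):
                return list(group)
    return []


def xor_bytes(x, y):
    return bytes(a ^ b for a, b in zip(x, y))


def encode(item_names, items):
    """XOR together the distinct items served by the clique."""
    names = sorted(set(item_names))
    packet = items[names[0]]
    for name in names[1:]:
        packet = xor_bytes(packet, items[name])
    return packet


def decode(packet, wanted, item_names, cached):
    """A client strips every item except the wanted one using its cache."""
    out = packet
    for name in sorted(set(item_names)):
        if name != wanted:
            out = xor_bytes(out, cached[name])
    return out


items = {"A": b"\x01\x02", "B": b"\x10\x20", "C": b"\x0f\x0f"}
cache = {1: {"B"}, 2: {"A"}, 3: {"A", "B"}}
requests = [(1, "A"), (2, "B"), (3, "C")]

clique = max_clique(requests, cache)
served = [d for _, d in clique]
packet = encode(served, items)
# Client 1 wants "A" and holds "B" in its cache.
got = decode(packet, "A", served, {"B": items["B"]})
```

In this toy instance one broadcast tick serves clients 1 and 2 at once, while client 3 cannot join because neither of the other clients caches item "C".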

ieee conference on mass storage systems and technologies | 2012

On the speedup of single-disk failure recovery in XOR-coded storage systems: Theory and practice

Yunfeng Zhu; Patrick P. C. Lee; Yuchong Hu; Liping Xiang; Yinlong Xu

Modern storage systems stripe redundant data across multiple disks to provide availability guarantees against disk failures. One form of data redundancy is based on XOR-based erasure codes, which use only XOR operations for encoding and decoding. In addition to providing failure tolerance, a storage system must also provide fast failure recovery to avoid data unavailability. We consider the problem of speeding up the recovery of a single-disk failure for arbitrary XOR-based erasure codes. We address this problem from both theoretical and practical perspectives. We propose a replace recovery algorithm, which uses a hill-climbing technique to search for a fast recovery solution, such that the solution search can be completed within a short time period. We further implement our replace recovery algorithm atop a parallelized architecture to demonstrate its practicality. We evaluate our replace recovery algorithm and its parallelized implementation on a networked storage system testbed, and demonstrate that our replace recovery algorithm uses less recovery time than the conventional approach.

ACM Transactions on Storage | 2011

A Hybrid Approach to Failed Disk Recovery Using RAID-6 Codes: Algorithms and Performance Evaluation

Liping Xiang; Yinlong Xu; John C. S. Lui; Qian Chang; Yubiao Pan; Runhui Li

Current parallel storage systems use thousands of inexpensive disks to meet the storage requirements of applications. Data redundancy and/or coding are used to enhance data availability. For instance, row-diagonal parity (RDP) and EVENODD codes, which are widely used in RAID-6 storage systems, provide data availability with up to two disk failures. To reduce the probability of data unavailability, whenever a single disk fails, disk recovery will be carried out. We find that the conventional recovery schemes of RDP and EVENODD codes for a single failed disk only use one parity disk. However, there are two parity disks in the system, and both can be used for single disk failure recovery. In this article, we propose a hybrid recovery approach that uses both parities for single disk failure recovery, and we design efficient recovery schemes for RDP code (RDOR-RDP) and EVENODD code (RDOR-EVENODD). Our recovery scheme has the following attractive properties: (1) “read optimality” in the sense that our scheme issues the smallest number of disk reads to recover a single failed disk and it reduces disk reads by approximately 1/4 compared with conventional schemes; (2) “load balancing property” in that all surviving disks will be subjected to the same (or almost the same) amount of additional workload in rebuilding the failed disk. We carry out performance evaluation to quantify the merits of RDOR-RDP and RDOR-EVENODD on some widely used disks with DiskSim. The offline experimental results show that RDOR-RDP and RDOR-EVENODD outperform the conventional recovery schemes of RDP and EVENODD codes in terms of total recovery time and recovery workload on each surviving disk. However, the improvements are less than the theoretical value (approximately 25%), as RDOR-RDP and RDOR-EVENODD change the disk access pattern from purely sequential to a more random one compared with their conventional schemes.
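The roughly 1/4 saving in reads can be checked on a small instance. The sketch below assumes the standard RDP layout for a prime p: disks 0..p-2 hold data, disk p-1 holds row parity, disk p holds diagonal parity, each stripe has p-1 rows, block (r, j) lies on diagonal (r + j) mod p, and diagonal p-1 has no parity block. It brute-forces, for one failed data disk, every way of rebuilding each lost block from either its row or its diagonal, and counts distinct blocks read. This is an exhaustive illustration, not the RDOR construction from the paper, and the function names are hypothetical.

```python
from itertools import product


def reads_for_choice(p, failed, choice):
    """Distinct blocks read when row r is rebuilt via choice[r] ('row'/'diag').

    Returns None if some block is asked to use the missing diagonal p-1.
    """
    reads = set()
    for r, how in enumerate(choice):
        if how == "row":
            # Row parity set: every other block of row r on disks 0..p-1.
            reads.update((r, j) for j in range(p) if j != failed)
        else:
            d = (r + failed) % p
            if d == p - 1:
                return None  # this diagonal is not stored
            for j in range(p):
                if j == failed:
                    continue
                row = (d - j) % p
                if row <= p - 2:  # row p-1 does not exist in the stripe
                    reads.add((row, j))
            reads.add((d, p))  # diagonal parity block for diagonal d
    return reads


def min_reads(p, failed):
    """Smallest number of distinct reads over all row/diagonal assignments."""
    best = None
    for choice in product(("row", "diag"), repeat=p - 1):
        reads = reads_for_choice(p, failed, choice)
        if reads is not None and (best is None or len(reads) < best):
            best = len(reads)
    return best


p = 5
conventional = len(reads_for_choice(p, 0, ("row",) * (p - 1)))  # 16 blocks
hybrid = min_reads(p, 0)                                        # 12 blocks
```

For p = 5 and failed disk 0, rebuilding every row from row parity reads 16 blocks, while the best mix of row and diagonal parity reads 12, which matches the 1/4 reduction quoted in the abstract.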

distributed computing in sensor systems | 2012

Network Lifetime Maximization in Delay-Tolerant Sensor Networks with a Mobile Sink

Zichuan Xu; Weifa Liang; Yinlong Xu

In this paper we investigate the network lifetime maximization problem in a delay-tolerant wireless sensor network with a mobile sink by exploiting a nontrivial tradeoff between the network lifetime and the data delivery delay. We formulate the problem as a joint optimization problem that consists of finding a trajectory for the mobile sink and designing an energy-efficient routing protocol to route sensing data to the sink, subject to the bounded delay on data delivery and the given potential sink location space. Due to the NP-hardness of the problem, we then propose a novel optimization framework, which not only prolongs the network lifetime but also improves other performance metrics including network scalability, robustness, and the average delivery delay. We finally conduct extensive experiments by simulations to evaluate the performance of the proposed algorithm against other heuristics. The experimental results demonstrate that the proposed algorithm outperforms the others significantly in terms of network lifetime prolongation.
