Mingzhu Deng
National University of Defense Technology
Publication
Featured research published by Mingzhu Deng.
International Performance Computing and Communications Conference | 2015
Songping Yu; Nong Xiao; Mingzhu Deng; Yuxuan Xing; Fang Liu; Zhiping Cai; Wei Chen
Non-volatile memory (NVM) offers byte-addressability, high speed, persistence, and low power consumption, which make it attractive as main memory. User processes commonly acquire memory dynamically through memory allocators. However, traditional memory allocators designed around in-place data writes are ill-suited to non-volatile main memory (NVRAM) because of its limited endurance. For instance, a PCM cell endures merely 10^8 write operations. In this paper, we quantitatively analyze the wear-obliviousness of glibc malloc, an allocator designed for DRAM, and the inefficiency of NVMalloc, a wear-conscious allocator. For example, the average imbalance factor (the maximum divided by the average) of memory allocation is about 7.5 and 3, respectively. Based on our observations, we propose WAlloc, an efficient wear-aware manual memory allocator designed for NVRAM that decouples metadata and data, uses a Less Allocated First Out allocation policy, and redirects data writes. Experimental results show that WAlloc outperforms NVMalloc in wear-leveling by about 30% and 60% under random and well-distributed workloads, respectively. In addition, considering the trade-off between space and wear-leveling, WAlloc reduces average data memory writes per 64-byte block by an average of 1.5X compared with malloc, at almost 8% extra space overhead.
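The imbalance factor and the allocation policy named in this abstract can be illustrated with a small sketch. The abstract defines the imbalance factor as the maximum divided by the average; everything else here is an assumption made for illustration. In particular, the class `LeastAllocatedFirst`, its methods, and the reading of "Less Allocated First Out" as "hand out the free block with the fewest past allocations" are hypothetical and are not taken from WAlloc itself.

```python
import heapq


def imbalance_factor(counts):
    """Maximum count divided by average count (1.0 means perfectly even)."""
    total = sum(counts)
    if total == 0:
        return 1.0
    return max(counts) / (total / len(counts))


class LeastAllocatedFirst:
    """Hypothetical free-block pool: the least-allocated free block goes out first."""

    def __init__(self, num_blocks):
        self.alloc_counts = [0] * num_blocks
        # Min-heap of (allocation count, block id); every block starts free.
        self._free = [(0, block) for block in range(num_blocks)]
        heapq.heapify(self._free)

    def allocate(self):
        if not self._free:
            raise MemoryError("no free blocks")
        _, block = heapq.heappop(self._free)
        self.alloc_counts[block] += 1
        return block

    def free(self, block):
        # Re-insert the block keyed by how often it has been allocated so far.
        heapq.heappush(self._free, (self.alloc_counts[block], block))


if __name__ == "__main__":
    # An allocator that always reuses the same block concentrates all wear.
    print(imbalance_factor([8, 0, 0, 0]))  # 4.0

    pool = LeastAllocatedFirst(4)
    for _ in range(8):
        pool.free(pool.allocate())
    print(pool.alloc_counts, imbalance_factor(pool.alloc_counts))
```

Under this toy allocate-then-free workload the pool rotates through all blocks, so allocations spread evenly instead of landing on one block repeatedly.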
ACM Journal on Emerging Technologies in Computing Systems | 2017
Songping Yu; Nong Xiao; Mingzhu Deng; Fang Liu; Wei Chen
Non-volatile memory (NVM) offers byte-addressability, high speed, persistence, and low power consumption, which make it attractive as main memory. User processes commonly acquire memory dynamically through memory allocators. However, traditional memory allocators designed around in-place data writes are ill-suited to non-volatile main memory (NVRAM) because of its limited endurance. In this article, we first quantitatively analyze the wear-obliviousness of glibc malloc, an allocator designed for DRAM, and the inefficiency of NVMalloc, a wear-conscious allocator. We then propose WAlloc, an efficient wear-aware manual memory allocator designed for NVRAM, which (1) decouples metadata and data management; (2) distinguishes metadata by volatility; (3) redirects data writes to achieve wear-leveling; and (4) redesigns an efficient and effective NVM copy mechanism that partially bypasses the CPU cache and prefetches data explicitly. Finally, experimental results show that WAlloc outperforms NVMalloc in wear-leveling by about 30% and 60% under random and well-distributed workloads, respectively. In addition, WAlloc reduces the average data memory writes per 64-byte block by 1.5 times compared with glibc malloc. While still guaranteeing data persistence, the cache-bypassing NVM copy outperforms the cache-line-flushing NVM copy by about 14%.
Networking Architecture and Storages | 2017
Songping Yu; Nong Xiao; Mingzhu Deng; Yuxuan Xing; Fang Liu; Wei Chen
With emerging Non-Volatile Memory (NVM) technologies such as 3D XPoint now in production, the big data processing community has recently been shifting from storage-centric to memory-centric designs. In large-scale systems, distributed memory management over a traditional TCP/IP network generally exposes a performance bottleneck. Briefly, a CPU-centric network involves context switching, memory copies, and so on. Remote Direct Memory Access (RDMA) offers a tremendous performance advantage over TCP/IP by allowing direct access to remote memory while bypassing the OS kernel. In this paper, we propose Megalloc, a distributed NVM allocator based on RDMA that exposes NVMs as a shared address space across a cluster of machines. First, it makes memory allocation metadata directly accessible to each machine and allocates NVM in a coarse-grained way; second, it adopts fine-grained memory chunks for applications to read or store data; finally, it guarantees high distributed memory allocation performance.
Mobile Information Systems | 2017
Mingzhu Deng; Nong Xiao; Songping Yu; Fang Liu; Lingyu Zhu; Zhiguang Chen
Existing RAID-6 code extensions assume that failures are independent and instantaneous, overlooking the underlying mechanism of multi-failure occurrences. The effect of the reconstruction window is also ignored. Additionally, these coding extensions have not been adapted to the occurrence patterns of failures in real-world applications. As a result, the third parity drive is set to handle the triple-failure scenario, while lower-level failure situations are left unattended. Therefore, this paper studies RAID-6Plus, a new methodology for extending RAID-6 codes that offers a better compromise. RAID-6Plus (Deng et al., 2015) employs short combinations, which can greatly reuse overlapped elements during reconstruction, to remake the third parity drive. A sample extension code called RDP
International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage | 2017
Songping Yu; Mingzhu Deng; Yuxuan Xing; Nong Xiao; Fang Liu; Wei Chen
Remote Direct Memory Access (RDMA) provides the ability to directly access remote user-space memory without involving the remote CPU, which shortens network latency tremendously. In addition, a new generation of fast Non-Volatile Memory (NVM) technologies, such as 3D XPoint, is in production and promises memory-like access speed and storage-like durability. Remote access to non-volatile main memory is therefore reasonable. Traditional local memory extension is bounded by slow storage media (HDD/SSD). In this paper, we first revisit local memory extension and propose Pyramid, a new memory extension model that extends memory with remote NVM. We then discuss the mechanism of remote data consistency, which Pyramid can deliver with the RDMA write-with-immediate operation. We also evaluate the performance of random access to remote NVM and demonstrate the performance opportunity offered by remotely accessible NVM by comparing it with newer storage technologies, namely NVMe SSDs and PCM-based SSDs. Finally, we argue that Pyramid promises memory scalability with a good performance guarantee.
International Conference on Computer Science and Network Technology | 2015
Mingzhu Deng; Lingyu Zhu; Nong Xiao; Zhiguang Chen; Fang Liu
Modern storage systems call for reliability beyond RAID-6 in the face of ever-increasing data failures. However, no code extension exists based on X-code, which is noted for its vertical alignment and optimal update complexity. To construct flexible and practically reliable codes, we argue that failures happen successively and that higher-level failures can be degraded into separate lower-level failures given a shorter rebuilding time. We therefore present X-code+, a fast recoverable coding scheme obtained by explicitly modifying and extending X-code. We modify the original X-code by redistributing elements to keep a base reliability for double failures, and we further extend it with two-element tuples in one extra parity drive to shorten rebuild windows and degrade complex failure scenarios. Analysis shows that X-code+ has better recovery performance and load balance, and a lower update penalty, than its peer codes.
Asia Pacific Services Computing Conference | 2015
Mingzhu Deng; Yang Ou; Nong Xiao; Songping Yu; Wei Chen; Zhiguang Chen; Fang Liu
Existing triple-failure-tolerant codes assume that failures are independent and instantaneous. Such assumptions overlook the underlying mechanism of multi-failure occurrences and ignore the effect of the reconstruction window. These codes are not adapted to the occurrence patterns of failures in real-world applications. As a result, the third parity drive is almost idle, as it is set to handle only the triple-failure scenario, leaving lower-level failure situations unattended. Furthermore, the problem of single-failure rebuild worsens with increasing disk capacity, so system reliability decreases and user experience is impaired. To address these problems, this study develops a fast reconstructable coding scheme extended from RAID-6. RAID-6Plus maintains a smaller reconstruction window by recoding the third parity drive. Existing codes provide absolute reliability for triple failures via full combinations. In contrast, RAID-6Plus employs short combinations, which can greatly reuse overlapped elements during reconstruction, to remake the third parity drive. The short combinations shorten the reconstruction window of a single failure, which avoids multiple failures overlapping within the reconstruction window. This capability of multi-failure degradation gives RAID-6Plus (1) better system performance than RTP and STAR and (2) enhanced reliability compared with RAID-6.
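The intuition that a shorter parity combination needs fewer reads to rebuild a single lost element can be sketched with plain XOR parity. This is only an illustration under stated assumptions: the group layouts, the element values, and the helper names `xor_blocks` and `rebuild` are hypothetical, and the sketch does not reproduce the actual RAID-6Plus construction.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(lost_index, group, data, parity):
    """Rebuild data[lost_index] from the parity of `group` and its survivors.

    Returns the rebuilt element and the number of data elements read.
    """
    survivors = [data[i] for i in group if i != lost_index]
    return xor_blocks(survivors + [parity]), len(survivors)


# Six toy data elements of four bytes each (values are arbitrary).
data = [bytes([value] * 4) for value in (3, 5, 7, 11, 13, 17)]

full_group = [0, 1, 2, 3, 4, 5]  # parity over every element in the stripe
short_group = [0, 1]             # hypothetical short (two-element) combination

full_parity = xor_blocks([data[i] for i in full_group])
short_parity = xor_blocks([data[i] for i in short_group])

rebuilt_full, reads_full = rebuild(0, full_group, data, full_parity)
rebuilt_short, reads_short = rebuild(0, short_group, data, short_parity)

assert rebuilt_full == data[0] and rebuilt_short == data[0]
print(reads_full, reads_short)  # 5 1
```

Both parities recover the lost element, but the short combination reads one surviving data element instead of five, which is the kind of saving that shortens a reconstruction window.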
International Conference on Control and Automation | 2014
Mingzhu Deng; Zhiguang Chen; Yimo Du; Nong Xiao; Fang Liu
Data continues to grow exponentially in the big data era, making the 3-replication scheme unaffordably expensive and space-consuming for a storage system. To this end, erasure coding schemes prevail by offering the same or higher reliability at a relatively lower spatial cost. In this paper, we present an overview of up-to-date research on erasure codes in the big data era. Additionally, we identify some problems in recent work and share our insights into possible future work on erasure codes.
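The space argument can be made concrete with a little arithmetic. The sketch below assumes a maximum distance separable (MDS) erasure code, such as Reed-Solomon, with k data fragments and m parity fragments. The (6, 3) parameters and the function names are illustrative assumptions and are not taken from the paper.

```python
def replication_cost(copies):
    """Return (raw-to-usable storage factor, tolerated failures) for n-way replication."""
    return float(copies), copies - 1


def erasure_cost(k, m):
    """Return (raw-to-usable storage factor, tolerated failures) for a (k, m) MDS code."""
    return (k + m) / k, m


print(replication_cost(3))  # (3.0, 2)
print(erasure_cost(6, 3))   # (1.5, 3)
```

With these example parameters, the erasure code stores half as many raw bytes per usable byte as 3-replication while tolerating one more failure.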
International Journal of Parallel Programming | 2017
Mingzhu Deng; Wei Chen; Nong Xiao; Songping Yu; Yupeng Hu
Ubiquitous Computing | 2017
Songping Yu; Nong Xiao; Mingzhu Deng; Yuxuan Xing; Fang Liu; Wei Chen