Wenzhe Zhang

National University of Defense Technology

Publication

Featured research published by Wenzhe Zhang.

ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming | 2013

RaceFree: an efficient multi-threading model for determinism

Kai Lu; Xu Zhou; Xiaoping Wang; Wenzhe Zhang; Gen Li

Current deterministic systems generally incur large overhead due to the difficulty of detecting and eliminating data races. This paper presents RaceFree, a novel multi-threading runtime that adopts a relaxed deterministic model to provide a data-race-free environment for parallel programs. This model cuts off unnecessary shared-memory communication by isolating threads in separate memory spaces, which eliminates direct data races. Meanwhile, we leverage the happens-before relations defined by the applications themselves as one-way communication pipes to perform the necessary thread communication. Shared-memory communication is transparently converted to message-passing-style communication by our Memory Modification Propagation (MMP) mechanism, which propagates local memory modifications to other threads through the happens-before pipes. The overhead of RaceFree is 67.2% according to our tests on parallel benchmarks.
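The propagation idea in this abstract can be illustrated with a small sketch. It is not RaceFree's implementation: the `IsolatedThread` class and its `release`/`acquire` methods are hypothetical names, a dictionary stands in for a private memory space, and a diff against the last synchronization point stands in for MMP.

```python
class IsolatedThread:
    """Hypothetical stand-in for a thread that owns a private copy of memory."""

    def __init__(self, name, initial):
        self.name = name
        self.memory = dict(initial)      # private copy, invisible to other threads
        self._snapshot = dict(initial)   # memory contents at the last sync point

    def write(self, addr, value):
        # Local write only; no other thread can observe it yet.
        self.memory[addr] = value

    def release(self):
        """Collect the modifications made since the last sync point."""
        diff = {a: v for a, v in self.memory.items()
                if self._snapshot.get(a) != v}
        self._snapshot = dict(self.memory)
        return diff

    def acquire(self, diff):
        """Apply modifications received over a happens-before edge."""
        self.memory.update(diff)
        self._snapshot.update(diff)


shared = {"x": 0, "y": 0}
t1 = IsolatedThread("t1", shared)
t2 = IsolatedThread("t2", shared)

t1.write("x", 1)
before_sync = t2.memory["x"]   # t2 still sees the old value
t2.acquire(t1.release())       # e.g. t1 unlocks, then t2 locks the same mutex
after_sync = t2.memory["x"]    # the write has now been propagated
```

In this sketch a write becomes visible to another thread only when a release is paired with an acquire, so two threads never touch the same location concurrently.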

Journal of Zhejiang University Science C | 2018

Versionized process based on non-volatile random-access memory for fine-grained fault tolerance

Wenzhe Zhang; Kai Lu; Xiaoping Wang

Non-volatile random-access memory (NVRAM) technology is maturing rapidly and its byte-persistence feature allows the design of new and efficient fault tolerance mechanisms. In this paper we propose the versionized process (VerP), a new process model based on NVRAM that is natively non-volatile and fault tolerant. We introduce an intermediate software layer that allows us to run a process directly on NVRAM and to put all the process states into NVRAM, and then propose a mechanism to versionize all the process data. Each piece of the process data is given a special version number, which increases with the modification of that piece of data. The version number can effectively help us trace the modification of any data and recover it to a consistent state after a system crash. Compared with traditional checkpoint methods, our work can achieve fine-grained fault tolerance at very little cost.
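A minimal sketch of per-datum version numbers follows. The abstract does not say how VerP stores or uses versions, so everything here is an assumption: the `VersionedStore` class, the choice to keep old versions, and the `commit`/`recover` pair that rolls each datum back to its last committed version.

```python
class VersionedStore:
    """Hypothetical versioned data store; not the VerP implementation."""

    def __init__(self):
        self._history = {}     # key -> list of (version, value), oldest first
        self._committed = {}   # key -> version at the last consistent point

    def write(self, key, value):
        history = self._history.setdefault(key, [])
        version = history[-1][0] + 1 if history else 1   # version grows per write
        history.append((version, value))
        return version

    def read(self, key):
        return self._history[key][-1][1]

    def version(self, key):
        return self._history[key][-1][0]

    def commit(self):
        # Assumption: a commit marks the current versions as consistent.
        self._committed = {k: h[-1][0] for k, h in self._history.items()}

    def recover(self):
        # After a crash, discard every version newer than the committed one.
        for key in list(self._history):
            keep = self._committed.get(key, 0)
            kept = [(v, val) for v, val in self._history[key] if v <= keep]
            if kept:
                self._history[key] = kept
            else:
                del self._history[key]


store = VersionedStore()
store.write("a", 1)
store.commit()
store.write("a", 2)   # uncommitted modification
store.write("b", 5)   # uncommitted new datum
store.recover()       # simulate recovery after a crash
```

Because every datum carries its own version, recovery in this sketch touches only the data modified after the commit, rather than restoring a whole checkpoint image.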

Virtual Execution Environments | 2017

Flexible Page-level Memory Access Monitoring Based on Virtualization Hardware

Kai Lu; Wenzhe Zhang; Xiaoping Wang; Mikel Luján; Andy Nisbet

Page protection is often used to achieve memory access monitoring in many applications, including program analysis, checkpoint-based failure recovery, and garbage collection in managed runtime systems. Typically, low-overhead access monitoring is limited by the relatively large page-level granularity of memory management unit hardware support for virtual memory protection. In this paper, we improve upon traditional page-level mechanisms by additionally using hardware support for virtualization in order to achieve fine and flexible granularities that can be smaller than a page. We first introduce a memory allocator based on page protection that can achieve fine-grained monitoring. Second, we explain how virtualization hardware support can be used to achieve dynamic adjustment of the monitoring granularity. In all, we propose a process-level virtual machine to achieve dynamic and fine-grained monitoring. Any application can run on our process-level virtual machine without modification. Experimental results for an incremental checkpoint tool provide a use case to demonstrate our work. Compared with traditional page-based checkpointing, our work can effectively reduce the amount of checkpoint data and improve performance.

Journal of Zhejiang University Science C | 2017

Fine-grained checkpoint based on non-volatile memory

Wenzhe Zhang; Kai Lu; Mikel Luján; Xiaoping Wang; Xu Zhou

New non-volatile memory (e.g., phase-change memory) provides fast access, large capacity, byte-addressability, and non-volatility. Together, these features (fast byte-persistency) will bring new opportunities to fault tolerance. We propose a fine-grained checkpoint based on non-volatile memory. We extend the current virtual memory manager to manage non-volatile memory, and design a persistent heap with support for fast allocation and checkpointing of persistent objects. To achieve a fine-grained checkpoint, we scatter objects across virtual pages and rely on hardware page protection to monitor the modifications. In our system, two objects in different virtual pages may reside on the same physical page, so modifying one object does not interfere with the other. This allows us to monitor and checkpoint objects smaller than 4096 bytes in a fine-grained way. Compared with previous page-grained checkpoint mechanisms, our new checkpoint method can greatly reduce the data copied at checkpoint time and better leverage the limited bandwidth of non-volatile memory.
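The object-scattering layout can be sketched as a simulation. This is an illustration under stated assumptions, not the paper's allocator: `ScatteredHeap` is a hypothetical class, a `bytearray` stands in for one physical page, and adding a virtual page number to a dirty set stands in for the hardware write-protection fault.

```python
PAGE = 4096  # page size in bytes


class ScatteredHeap:
    """One physical page whose objects each get their own virtual page."""

    def __init__(self):
        self.physical = bytearray(PAGE)   # backing physical page
        self.objects = []                 # virtual page number -> (offset, size)
        self.next_offset = 0
        self.dirty = set()                # virtual pages written since last checkpoint

    def alloc(self, size):
        if self.next_offset + size > PAGE:
            raise MemoryError("physical page is full")
        vpage, offset = len(self.objects), self.next_offset
        self.objects.append((offset, size))
        self.next_offset += size
        # Same in-page offset, different virtual page: this object's alias.
        return vpage * PAGE + offset

    def write(self, vaddr, data):
        vpage, offset = divmod(vaddr, PAGE)
        start, size = self.objects[vpage]
        if not (start <= offset and offset + len(data) <= start + size):
            raise ValueError("write falls outside the object")
        self.dirty.add(vpage)             # stands in for a write-protection fault
        self.physical[offset:offset + len(data)] = data

    def checkpoint(self):
        """Copy only the objects whose virtual page was written."""
        copied = {}
        for vpage in sorted(self.dirty):
            start, size = self.objects[vpage]
            copied[vpage] = bytes(self.physical[start:start + size])
        self.dirty.clear()
        return copied


heap = ScatteredHeap()
first = heap.alloc(64)
second = heap.alloc(64)
heap.write(second, b"hi")
snapshot = heap.checkpoint()   # contains the 64-byte second object only
```

Both objects share one physical page, yet the checkpoint copies 64 bytes instead of 4096 because the write is attributed to a virtual page that holds a single object.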

Network and Parallel Computing | 2016

Application-Based Coarse-Grained Incremental Checkpointing Based on Non-volatile Memory

Zhan Shi; Kai Lu; Xiaoping Wang; Wenzhe Zhang; Yiqi Wang

The Mean Time to Failure continues to decrease as computing systems scale. To maintain the reliability of computing systems, checkpoints have to be taken more frequently. Incremental checkpointing is a well-researched technique that makes frequent checkpointing possible. Fine-grained incremental checkpointing minimizes checkpoint size but suffers from significant monitoring overhead. We observe memory accesses at page granularity and find that the size of the contiguous memory regions visited by applications tends to be proportional to the size of the corresponding memory allocation. In this paper, we propose Application-Based Coarse-Grained Incremental Checkpointing (ACCK), which leverages a priori information about memory allocation to relax the memory monitoring granularity in an incremental and appropriate way. This provides better opportunities for balancing the tradeoff between monitoring and copying overhead. ACCK is also assisted by huge pages to alleviate the TLB overhead. Our experiments show that ACCK achieves a 2.56x performance improvement over the baseline mechanism.
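The tradeoff between monitoring and copying overhead can be made concrete with a toy model. The policy below is an assumption made for illustration, not ACCK's actual rule: `monitor_granularity`, its `ratio` and `max_pages` parameters, and the `Region` class are all hypothetical.

```python
PAGE = 4096  # page size in bytes


def monitor_granularity(alloc_bytes, ratio=8, max_pages=512):
    """Hypothetical policy: pages per monitored block grow with allocation size."""
    pages = -(-alloc_bytes // PAGE)   # ceiling division
    return max(1, min(max_pages, pages // ratio))


class Region:
    """One allocation, write-monitored in blocks of `block_pages` pages."""

    def __init__(self, alloc_bytes):
        self.pages = -(-alloc_bytes // PAGE)
        self.block_pages = monitor_granularity(alloc_bytes)
        self.dirty_blocks = set()
        self.faults = 0               # monitoring cost: one fault per dirtied block

    def touch(self, page):
        block = page // self.block_pages
        if block not in self.dirty_blocks:
            self.faults += 1          # first write to a protected block traps once
            self.dirty_blocks.add(block)

    def checkpoint_pages(self):
        """Copying cost: number of pages saved at this checkpoint."""
        total = 0
        for block in self.dirty_blocks:
            start = block * self.block_pages
            total += min(self.block_pages, self.pages - start)
        self.dirty_blocks.clear()
        return total


big = Region(64 * PAGE)               # 64 pages, monitored in 8-page blocks
for page in range(16):                # a contiguous sweep over 16 pages
    big.touch(page)
big_faults = big.faults
big_copied = big.checkpoint_pages()
```

In this model the contiguous sweep costs 2 faults instead of 16 while still copying exactly the 16 pages that were written; a sparse access pattern would copy more pages than it touched, which is the other side of the tradeoff.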

Scientific Programming | 2015

Write-combined logging: an optimized logging for consistency in NVRAM

Wenzhe Zhang; Kai Lu; Mikel Luján; Xiaoping Wang; Xu Zhou

Nonvolatile memory (e.g., phase-change memory) blurs the boundary between memory and storage, and it could greatly facilitate the construction of in-memory durable data structures. Data structures can be processed and stored directly in NVRAM. To maintain the consistency of persistent data, logging is a widely adopted mechanism. However, logging introduces write-twice overhead. This paper introduces an optimized write-combined logging to reduce the writes to the NVRAM log. By leveraging the fast-read and byte-addressable features of NVRAM, we can perform a read-and-compare operation before writes and thus issue writes in a finer-grained way. We tested our system on the STAMP benchmark suite, which contains real-world applications. Experimental results show that our system can reduce the writes to NVRAM by 33%-34%, which can help extend the lifetime of NVRAM and improve performance. On average, our system can improve performance by 7%-11%.
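The read-and-compare step can be sketched as follows. The abstract does not specify the log format, so this sketch assumes an undo log and an 8-byte comparison unit; `log_update`, `rollback`, and `WORD` are hypothetical names, and a `bytearray` stands in for NVRAM.

```python
WORD = 8  # assumed comparison granularity in bytes


def log_update(memory, offset, new_data, log):
    """Undo-log and write only the words that actually change.

    Returns the number of bytes appended to the log.
    """
    logged = 0
    for i in range(0, len(new_data), WORD):
        new = new_data[i:i + WORD]
        old = bytes(memory[offset + i:offset + i + len(new)])
        if old != new:                       # read-and-compare before writing
            log.append((offset + i, old))    # save the old value for undo
            memory[offset + i:offset + i + len(new)] = new
            logged += len(new)
    return logged


def rollback(memory, log):
    """Undo logged writes in reverse order."""
    while log:
        offset, old = log.pop()
        memory[offset:offset + len(old)] = old


nvram = bytearray(32)                        # stand-in for an NVRAM region
update = bytes(8) + b"\x01" * 8 + bytes(16)  # only the second word differs
undo_log = []
bytes_logged = log_update(nvram, 0, update, undo_log)
```

A 32-byte update in which a single word differs produces 8 bytes of log traffic here, whereas logging the whole range would produce 32.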

The International Conference on Computer Science and Technology (CST2016) | 2017