Zhirong Shen
Tsinghua University
Publication
Featured research published by Zhirong Shen.
International Workshop on Quality of Service | 2013
Zhirong Shen; Jiwu Shu; Wei Xue
Cloud computing cuts down large capital outlays for facility purchases and eliminates complex system management for users. To protect data confidentiality in the cloud, sensitive data are usually stored in encrypted form, which makes traditional plaintext search services inapplicable. Enabling keyword search over encrypted data has therefore become an urgent need. Given a massive number of data users with various search preferences, it becomes necessary to support preferred keyword search and to output the data files in the order of each user's preference. In this paper, for the first time, we investigate the challenging problem of preferred keyword search over encrypted data (PSED). We first establish a set of privacy requirements and use the appearance frequency of each keyword as its “weight”. A preference preprocessing mechanism is then explored to ensure that the search result faithfully respects the user's preference, and the Lagrange polynomial is introduced to express the user's preference formula. We further represent the keyword weights of each file as vectors, convert the preference polynomial into vector form, and securely calculate their inner products to quantitatively characterize the relevance between data files and a query. Finally, an extensive performance evaluation demonstrates that the proposed scheme can achieve acceptable efficiency.
Dependable Systems and Networks | 2014
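The ranking step described in this abstract (keyword weights as vectors, preference in vector form, relevance as an inner product) can be illustrated with a plaintext sketch. It omits all encryption and the Lagrange-polynomial construction, so it is not the PSED scheme itself; the function names and the preference coefficients are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def weight_vector(file_keywords: List[str], vocabulary: List[str]) -> List[int]:
    """Appearance frequency of each vocabulary keyword in one file."""
    return [file_keywords.count(k) for k in vocabulary]


def relevance(weights: List[int], preference: List[int]) -> int:
    """Inner product of a file's weight vector and the preference vector."""
    return sum(w * p for w, p in zip(weights, preference))


def rank_files(files: Dict[str, List[str]],
               vocabulary: List[str],
               preference: List[int]) -> List[str]:
    """Return file names ordered from most to least relevant."""
    scores = {
        name: relevance(weight_vector(words, vocabulary), preference)
        for name, words in files.items()
    }
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Hypothetical data: "search" is preferred over "storage", then "cloud".
vocabulary = ["cloud", "storage", "search"]
preference = [1, 10, 100]
files = {
    "a.txt": ["cloud", "cloud", "storage"],
    "b.txt": ["search", "cloud"],
    "c.txt": ["storage", "storage"],
}
ranking = rank_files(files, vocabulary, preference)
```

In the paper's setting the two vectors would be encrypted before the inner product is evaluated; the sketch only shows why an inner product is enough to order the files.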
Zhirong Shen; Jiwu Shu
The continuing growth of data scale leads to the widespread deployment of larger-capacity storage systems and, in turn, a rising probability of data loss or damage. The Maximum Distance Separable (MDS) code in RAID-6, which tolerates the concurrent failures of any two disks with minimal storage requirement, is one of the best candidates to enhance data reliability. However, most existing codes are specialized and cannot provide satisfactory performance under an all-round evaluation. To address this problem, we propose an all-round MDS code named Horizontal-Vertical Code (HV Code) that takes advantage of horizontal parity and vertical parity. HV Code achieves perfect I/O balancing and optimizes partial stripe writes to continuous data elements, while preserving optimal encode/decode/update efficiency. Moreover, it has a shorter parity chain, which grants it more efficient recovery from a single disk failure. HV Code also behaves well for degraded read operations and accelerates the reconstruction of two failed disks by executing four recovery chains in parallel. The performance evaluation demonstrates that HV Code balances the I/O distribution well and eliminates up to 27.6% and 32.4% of I/O requests for partial stripe writes when compared with RDP Code and HDP Code, respectively. Moreover, compared to RDP Code, HDP Code, X-Code and H-Code, HV Code reduces I/O requests per element for single disk reconstruction by up to 5.4% to 39.8%, decreases I/O requests for degraded read operations by 6.6% to 28.3%, and matches the efficiency of X-Code for double disk recovery, shortening recovery time by 47.4% to 59.7% compared with the other three codes.
Journal of Parallel and Distributed Computing | 2014
Jiwu Shu; Zhirong Shen; Wei Xue
With the increasing amount of personal data stored in public storage, users are losing control of their physical data, putting that data at risk of theft or compromise. Traditional secure storage systems either require users to completely trust the storage provider or impose the considerable burden of managing files on file owners; such systems are inapplicable in a practical cloud environment. This paper addresses these challenging problems by proposing a new secure system architecture and implementing a stackable secure storage system named Shield, in which a proxy server is introduced to take charge of authentication and access control. We propose a new variant of the Merkle Hash Tree to support efficient integrity checking and file content update; further, we have designed a hierarchical key organization to achieve convenient key management and efficient permission revocation. Shield supports concurrent write access by employing a virtual linked list; it also provides secure file sharing without any modification to the underlying file systems. A series of evaluations over various real benchmarks shows that Shield causes about 7%~13% performance degradation when compared with eCryptfs but provides enhanced security for users' data.
Dependable Systems and Networks | 2016
Zhirong Shen; Jiwu Shu; Patrick P. C. Lee
How to improve the performance of single failure recovery has been an active research topic because single failures are prevalent in large-scale storage systems. We argue that when erasure coding is deployed in a cluster file system (CFS), existing single failure recovery designs are limited in several respects: they neglect the bandwidth diversity of a CFS architecture, target specific erasure code constructions, and give no special treatment to load balancing during recovery. In this paper, we reconsider the single failure recovery problem in a CFS setting, and propose CAR, a cross-rack-aware recovery algorithm. For each stripe, CAR finds a recovery solution that retrieves data from the minimum number of racks. It also reduces the amount of cross-rack repair traffic by performing intra-rack data aggregation prior to cross-rack transmission. Furthermore, by considering multi-stripe recovery, CAR balances the amount of cross-rack repair traffic across multiple racks. Evaluation results show that CAR can effectively reduce the amount of cross-rack repair traffic and the resulting recovery time.
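The effect of intra-rack aggregation on cross-rack traffic can be sketched as follows. This is an illustration only, not the CAR algorithm: it uses plain XOR parity as a stand-in for the linear combinations of an erasure code, and it assumes the `helpers` dictionary lists only racks other than the one hosting the replacement node. All names are hypothetical.

```python
from functools import reduce
from typing import Dict, List, Tuple


def xor_chunks(chunks: List[bytes]) -> bytes:
    """Byte-wise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


def repair_direct(helpers: Dict[str, List[bytes]]) -> Tuple[bytes, int]:
    """Every helper chunk is sent across racks individually."""
    all_chunks = [c for chunks in helpers.values() for c in chunks]
    return xor_chunks(all_chunks), len(all_chunks)


def repair_aggregated(helpers: Dict[str, List[bytes]]) -> Tuple[bytes, int]:
    """Each rack first combines its chunks, then sends one partial result."""
    partials = [xor_chunks(chunks) for chunks in helpers.values()]
    return xor_chunks(partials), len(partials)


# A stripe of four data chunks and one XOR parity chunk; d1 is lost.
d1, d2, d3, d4 = b"\x01\x02", b"\x10\x20", b"\x0a\x0b", b"\xf0\x0f"
parity = xor_chunks([d1, d2, d3, d4])
helpers = {"rack-B": [d2, d3], "rack-C": [d4, parity]}

direct_result, direct_traffic = repair_direct(helpers)
aggregated_result, aggregated_traffic = repair_aggregated(helpers)
```

Both strategies rebuild the same chunk, but the aggregated one sends one chunk per helper rack instead of one per helper chunk.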
IEEE Transactions on Parallel and Distributed Systems | 2016
Zhirong Shen; Jiwu Shu; Yingxun Fu
Erasure codes tolerate disk failures by pre-storing a low degree of data redundancy, and have been commonly adopted in current storage systems. However, the accompanying data consistency requirement amplifies partial stripe write operations and thus seriously degrades system performance. Previous work on optimizing partial stripe writes is relatively limited, and a general mechanism is still absent. In this paper, we propose Parity-Switched Data Placement (PDP) to optimize partial stripe writes for any XOR-coded storage system. PDP first reduces write operations by arranging continuous data elements to join the generation of a common parity element. To achieve a deeper optimization, PDP further explores the generation order of parity elements and makes any two continuous data elements associate with a common parity element. Intensive evaluations show that, for the tested erasure codes, PDP reduces write operations by up to 31.9 percent and further increases the write speed by up to 59.8 percent when compared with two state-of-the-art data placement methods.
Asia and South Pacific Design Automation Conference | 2013
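Why the placement of continuous data elements matters for partial stripe writes can be shown with a small sketch. It is not the PDP algorithm: the two layouts below are hypothetical, and the code only counts how many parity elements a write must update and applies the usual XOR delta update to one of them.

```python
from typing import Dict, Iterable, Set


def parities_touched(layout: Dict[int, Set[str]],
                     written: Iterable[int]) -> Set[str]:
    """Parity elements that must be rewritten when `written` elements change."""
    touched: Set[str] = set()
    for element in written:
        touched |= layout[element]
    return touched


def updated_parity(old_parity: bytes, old_data: bytes, new_data: bytes) -> bytes:
    """XOR delta update: remove the old data's contribution, add the new one."""
    return bytes(p ^ o ^ n for p, o, n in zip(old_parity, old_data, new_data))


# Hypothetical layouts mapping data element index -> parity elements it joins.
interleaved = {0: {"P0"}, 1: {"P1"}, 2: {"P0"}, 3: {"P1"}}
grouped = {0: {"P0"}, 1: {"P0"}, 2: {"P1"}, 3: {"P1"}}

# Writing the continuous elements 0 and 1 under each layout.
writes_interleaved = len(parities_touched(interleaved, [0, 1]))
writes_grouped = len(parities_touched(grouped, [0, 1]))

# Delta update of a parity computed over two data elements.
e0, e1 = b"\x0f\xf0", b"\x33\x55"
p0 = bytes(a ^ b for a, b in zip(e0, e1))
new_e0 = b"\xaa\xbb"
new_p0 = updated_parity(p0, e0, new_e0)
```

When continuous elements share a parity element, one partial stripe write triggers fewer parity updates, which is the kind of saving the abstract attributes to placing continuous data elements under a common parity element.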
Jiwu Shu; Zhirong Shen; Wei Xue; Yingxun Fu
With the rapid development of cloud storage, data security in storage has received great attention and has become the top concern hindering the spread of cloud services. In this paper, we systematically study security research in storage systems. We first present the design criteria used to evaluate a secure storage system and summarize the widely adopted key technologies. We then investigate security research in cloud storage and summarize the new challenges in the cloud environment. Finally, we give a detailed comparison of selected secure storage systems and draw the relationship between the key technologies and the design criteria.
IEEE Transactions on Computers | 2017
Yingxun Fu; Jiwu Shu; Xianghong Luo; Zhirong Shen; Qingda Hu
As reliability requirements are increasingly important in both clusters and data centers, RAID-6, which can tolerate any two concurrent disk failures, has been widely used in modern storage systems. However, most existing RAID-6 codes cannot provide satisfactory performance on both degraded reads and partial stripe writes, which are important performance metrics in storage systems. To address these problems, we propose a new RAID-6 MDS erasure code called Short Code to optimize degraded reads and partial stripe writes. In Short Code, we propose a novel short horizontal parity chain, which ensures that all disks contribute to degraded reads while continuous data elements are more likely to share the same horizontal chain, thereby optimizing degraded reads. In addition, Short Code distributes all diagonal parities among disks to optimize partial stripe writes. Short Code not only has optimal storage efficiency, but also keeps optimal complexity for both encoding/decoding computations and update operations. The experiments show that Short Code achieves much higher speed on degraded reads and partial stripe writes than other popular RAID-6 codes, and provides acceptable performance on single disk failure recoveries and normal reads. Specifically, compared to RDP code, Short Code provides 6.1 to 26.3 percent higher speed on degraded reads and 36.2 to 80.3 percent higher speed on partial stripe writes with the same number of disks.
IEEE Sensors Journal | 2017
Zhirong Shen; Jiwu Shu; Wei Xue
In this paper, we study the problem of keyword search with access control (KSAC) over encrypted data in cloud computing. We first propose a scalable framework in which a user can use their attribute values and a search query to locally derive a search capability, and a file can be retrieved only when its keywords match the query and the user’s attribute values pass the policy check. Using this framework, we propose a novel scheme called KSAC, which enables keyword search with access control over encrypted data. KSAC utilizes a recent cryptographic primitive called hierarchical predicate encryption to enforce fine-grained access control and perform multi-field query search. Meanwhile, it also supports the search capability deviation, and achieves efficient access policy update as well as keyword update without compromising data privacy. To enhance privacy, KSAC also plants noise in the query to hide users’ access privileges. Intensive evaluations on a real-world dataset are conducted to validate the applicability of the proposed scheme and demonstrate its protection of users’ access privileges.
Symposium on Reliable Distributed Systems | 2017
Zhirong Shen; Patrick P. C. Lee; Jiwu Shu; Wenzhong Guo
Erasure coding has been extensively employed for data availability protection in production storage systems by maintaining a low degree of data redundancy. However, how to mitigate the parity update overhead of partial stripe writes in erasure-coded storage systems is still a critical concern. In this paper, we reconsider this problem from two new perspectives: data correlation and stripe organization, and propose CASO, a correlation-aware stripe organization algorithm. CASO captures data correlation of a data access stream. It packs correlated data into a small number of stripes to reduce the incurred I/Os in partial stripe writes, and further organizes uncorrelated data into stripes to leverage the spatial locality in later accesses. By differentiating correlated and uncorrelated data in stripe organization, we show via extensive trace-driven evaluation that CASO reduces up to 25.1% of parity updates and accelerates the write speed by up to 28.4%.
IEEE Transactions on Dependable and Secure Computing | 2017
Zhirong Shen; Patrick P. C. Lee; Jiwu Shu; Wenzhong Guo
How to improve the performance of single failure recovery has been an active research topic because single failures are prevalent in large-scale storage systems. We argue that when erasure coding is deployed in a clustered file system (CFS), existing single failure recovery designs are limited in several respects: they neglect the bandwidth diversity of a CFS architecture, target specific erasure code constructions, and give no special treatment to load balancing during recovery. In this paper, we propose CAR, a cross-rack-aware recovery algorithm that is designed to improve the performance of single failure recovery of a CFS that employs Reed-Solomon codes for general fault tolerance. For each stripe, CAR finds a recovery solution that retrieves data from the minimum number of racks. It also reduces the amount of cross-rack repair traffic by performing intra-rack data aggregation prior to cross-rack transmission. Furthermore, by considering multi-stripe recovery, CAR balances the amount of cross-rack repair traffic across multiple racks. Evaluation results show that CAR can effectively reduce the amount of cross-rack repair traffic and the resulting recovery time.