Yongzhi Wang | Researchain

Publication

Featured research published by Yongzhi Wang.

International Conference on Cloud Computing | 2011

VIAF: Verification-Based Integrity Assurance Framework for MapReduce

Yongzhi Wang; Jinpeng Wei

MapReduce, a cloud computing paradigm, is gaining popularity. However, like all open distributed computing frameworks, MapReduce suffers from the integrity assurance vulnerability: it takes merely one malicious worker to render the overall computation result useless. Existing solutions are effective in defeating the malicious behavior of non-collusive workers, but are futile in detecting collusive workers. In this paper, we focus on the mappers, which typically constitute the majority of workers, and propose the Verification-based Integrity Assurance Framework (VIAF) to detect both non-collusive and collusive mappers. The basic idea of VIAF is to combine task replication with non-deterministic verification, in which consistent but malicious results from collusive mappers can be detected by a trusted verifier. We have implemented VIAF in Hadoop, an open-source MapReduce implementation. Our theoretical analysis and experimental results show that VIAF can achieve high task accuracy while imposing acceptable overhead.
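
The combination of task replication and non-deterministic verification described above can be illustrated with a minimal sketch. It is not the VIAF implementation: the function `check_task`, the status strings, and the toy mappers are all hypothetical, and the abstract does not say how VIAF chooses replicas or the verification probability. The sketch only shows why replication alone misses colluders while a randomly invoked trusted verifier can catch them.

```python
import random


def check_task(task, mapper_a, mapper_b, trusted_compute, verify_prob, rng):
    """Run one task on two mappers and optionally verify the agreed result.

    Returns (result, status). The result is None when the task is rejected.
    All names here are illustrative assumptions, not VIAF's API.
    """
    result_a = mapper_a(task)
    result_b = mapper_b(task)

    # Replication: inconsistent replicas expose non-collusive cheaters.
    if result_a != result_b:
        return None, "mismatch"

    # Non-deterministic verification: with probability verify_prob the
    # trusted verifier recomputes the task, catching consistent but wrong
    # results returned by colluding mappers.
    if rng.random() < verify_prob:
        if trusted_compute(task) != result_a:
            return None, "caught_by_verifier"
        return result_a, "verified"

    return result_a, "accepted_unverified"


def honest_mapper(task):
    return task * task


def colluding_mapper(task):
    # Colluders agree on the same wrong answer.
    return task * task + 1
```

With `verify_prob=0.0` two colluding mappers pass unnoticed, which is the gap replication alone leaves open; raising the probability trades verifier load for detection rate.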

International Conference on Cloud Computing | 2013

Result Integrity Check for MapReduce Computation on Hybrid Clouds

Yongzhi Wang; Jinpeng Wei; Mudhakar Srivatsa

Large-scale adoption of MapReduce computations on public clouds is hindered by the lack of trust in the participating virtual machines, because misbehaving worker nodes can compromise the integrity of the computation result. In this paper, we propose a novel MapReduce framework, Cross Cloud MapReduce (CCMR), which overlays the MapReduce computation on top of a hybrid cloud: the master that is in control of the entire computation and guarantees result integrity runs on a private and trusted cloud, while normal workers run on a public cloud. In order to achieve high accuracy, CCMR proposes a result integrity check scheme on both the map phase and the reduce phase, which combines random task replication, random task verification, and credit accumulation, and CCMR strives to reduce the overhead by reducing cross-cloud communication. We implement our approach based on Apache Hadoop MapReduce and evaluate our implementation on Amazon EC2. Both theoretical and experimental analyses show that our approach can guarantee high result integrity in a normal cloud environment while incurring non-negligible performance overhead (e.g., when 16.7% of workers are malicious, CCMR can guarantee at least 99.52% accuracy with 33.6% overhead when the replication probability is 0.3 and the credit threshold is 50).
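
The credit accumulation idea mentioned above can be sketched as follows. This is one plausible reading, stated as an assumption: a worker's results are held back until the worker has passed a threshold number of checks, and a single failed check discards everything the worker has pending. The class `CreditTracker` and its methods are hypothetical and are not taken from CCMR.

```python
from collections import defaultdict


class CreditTracker:
    """Buffer worker results until the worker has earned enough credit.

    Hypothetical sketch: the abstract names credit accumulation but does
    not specify these rules.
    """

    def __init__(self, threshold):
        self.threshold = threshold
        self.credit = defaultdict(int)
        self.pending = defaultdict(list)
        self.blacklisted = set()

    def report(self, worker, result, check):
        """Record one result and return the results released by it.

        check is True (a replication or verification check passed),
        False (a check failed), or None (the task was not checked).
        """
        if worker in self.blacklisted:
            return []
        if check is False:
            # One failed check discards everything the worker produced.
            self.blacklisted.add(worker)
            self.pending.pop(worker, None)
            self.credit.pop(worker, None)
            return []
        self.pending[worker].append(result)
        if check is True:
            self.credit[worker] += 1
        if self.credit[worker] >= self.threshold:
            released = self.pending[worker]
            self.pending[worker] = []
            return released
        return []
```

In this sketch credit is never reset after release, which is a simplification; a higher threshold delays acceptance but makes it harder for a worker that cheats occasionally to get a bad result through.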

Computers & Security | 2015

Toward protecting control flow confidentiality in cloud-based computation

Yongzhi Wang; Jinpeng Wei

Cloud-based computation services have grown in popularity in recent years. Cloud users can deploy an arbitrary computation cluster to public clouds and execute their programs on that remote cluster to reduce infrastructure investment and maintenance costs. However, how to leverage cloud resources while keeping the computation confidential is a new challenge to be explored. In this paper, we propose runtime control flow obfuscation (RCFO) to protect the control flow confidentiality of outsourced programs. RCFO transforms an outsourced program into two parts: the public program running on the untrusted public cloud and the private program running on the trusted private cloud. By hiding parts of the control flow information in the private program and inserting fake branch statements into the public program, RCFO raises the bar for static and dynamic analysis-based reverse engineering attacks. Based on RCFO, we implement a system called MRDisguiser to protect cloud-based MapReduce services. We perform experiments on a real MapReduce service, Amazon Elastic MapReduce. The experimental results indicate that MRDisguiser is compatible with current cloud-based MapReduce services, and incurs moderate performance overhead. Specifically, when the obfuscation degree increases from 0 to 1.0, the average performance overhead is between 14.9% and 33.2%. Highlights: We propose a novel control flow obfuscation technology. We propose the continuous cache to limit the performance overhead to a moderate range. Our method makes it difficult for attackers to perform reverse engineering attacks. We implement a system to protect the program confidentiality of MapReduce jobs.

International Conference on Big Data | 2013

IntegrityMR: Integrity assurance framework for big data analytics and management applications

Yongzhi Wang; Jinpeng Wei; Mudhakar Srivatsa; Yucong Duan; Wencai Du

Big data analytics and knowledge management are becoming a hot topic with the emerging techniques of cloud computing and big data computing models such as MapReduce. However, large-scale adoption of MapReduce applications on public clouds is hindered by the lack of trust in the participating virtual machines deployed on the public cloud. In this paper, we extend the existing hybrid cloud MapReduce architecture to multiple public clouds. Based on this architecture, we propose IntegrityMR, an integrity assurance framework for big data analytics and management applications. We explore result integrity check techniques at two alternative software layers: the MapReduce task layer and the application layer. We design and implement the system at both layers based on Apache Hadoop MapReduce and Pig Latin, and perform a series of experiments with popular big data analytics and management applications such as Apache Mahout and Pig on commercial public clouds (Amazon EC2 and Microsoft Azure) and a local cluster environment. The experimental results of the task layer approach show high integrity (98% with a credit threshold of 5) with non-negligible performance overhead (18% to 82% extra running time compared to the original MapReduce). The experimental results of the application layer approach show better performance than the task layer approach (less than 35% extra running time compared with the original MapReduce).

IEEE Transactions on Big Data | 2017

Secure k-NN Query on Encrypted Cloud Data with Multiple Keys

Ke Cheng; Liangmin Wang; Yulong Shen; Hua Wang; Yongzhi Wang; Xiaohong Jiang; Hong Zhong

The k-nearest neighbors (k-NN) query is a fundamental primitive in spatial and multimedia databases. It has extensive applications in location-based services, classification, clustering, and so on. With the promise of confidentiality and privacy, massive data are increasingly outsourced to the cloud in encrypted form to enjoy the advantages of cloud computing (e.g., reduced storage and query processing costs). Recently, many schemes have been proposed to support k-NN queries on encrypted cloud data. However, prior works have all assumed that the query users (QUs) are fully trusted and know the key of the data owner (DO), which is used to encrypt and decrypt outsourced data. These assumptions are unrealistic in many situations, since many users are neither trusted nor in possession of the key. In this paper, we propose a novel scheme for secure k-NN query on encrypted cloud data with multiple keys, in which the DO and each QU hold their own different keys and do not share them with each other; meanwhile, the DO encrypts and decrypts outsourced data using the DO's own key. Our scheme is constructed from a distributed two trapdoors public-key cryptosystem (DT-PKC) and a set of secure two-party computation protocols, which not only preserves data confidentiality and query privacy but also supports an offline data owner. Our extensive theoretical and experimental evaluations demonstrate the effectiveness of our scheme in terms of security and performance.

Future Generation Computer Systems | 2016

Toward integrity assurance of outsourced computing - a game theoretic perspective

Yongzhi Wang; Jinpeng Wei; Shaolei Ren; Yulong Shen

Outsourced computing has been gaining popularity in recent years. However, due to the existence of malicious workers in the open outsourced environment, offering high-accuracy computing services is critical and challenging. A practical solution for this class of problems is to replicate outsourced tasks and compare the replicated task results, or to have the outsourcer verify task results herself. However, since most outsourced computing services are not free, the portion of tasks to be replicated or verified is restricted by the outsourcer's budget. In this paper, we propose the Integrity Assurance Outsourced Computing (IAOC) system, which employs probabilistic task replication, probabilistic task verification, and credit management techniques to offer a high accuracy guarantee for generalized outsourced computing jobs. Based on the IAOC system, we perform theoretical analysis and model the behaviors of the IAOC system and the attacker as a two-player zero-sum game. We propose two algorithms, the Interactive Gradient Descent (IGD) algorithm and the Tiered Interactive Gradient Descent (TIGD) algorithm, that can find the optimal parameter settings under the user's accuracy requirement, without or with considering the user's budget requirement. We prove that the parameter settings generated by the IGD/TIGD algorithms form a Nash equilibrium and also suggest an accuracy lower bound. Our experiments show that even in the most severe situation, where malicious workers dominate the outsourced computing environment, our algorithm is able to find parameter settings satisfying the user's budget and accuracy requirements. Highlights: Our method can ensure high result integrity in outsourced computing systems. Our algorithm can guarantee the highest result integrity under system restrictions. We proved the correctness of the proposed algorithms. We performed experiments to show the effectiveness of the proposed algorithms.

International Journal of Networked and Distributed Computing | 2014

Service Value Broker Patterns: An Empirical Collection and Analysis

Yucong Duan; Keman Huang; Dan Chen; Yongzhi Wang; Ajay Kattepur; Wencai Du

The service value broker (SVB) pattern integrates business modeling, knowledge management, and economic analysis with relieved complexity, enhanced reusability, efficiency, etc. The study of SVB is an emerging interdisciplinary subject which will help to promote the reuse of knowledge, strategy, and experience in service-based designs and solutions. In this paper, we focus on enumerating collected SVBs empirically, with an initial analysis of how they are composed. The results from this paper will play a dominating role in fueling a coming E-service Economics era.

IEEE Transactions on Big Data | 2018

MtMR: Ensuring MapReduce Computation Integrity with Merkle Tree-Based Verifications

Yongzhi Wang; Yulong Shen; Hua Wang; Jinli Cao; Xiaohong Jiang

Big data applications have made significant impacts in recent years thanks to the fast growth of cloud computing and big data infrastructures. However, the public cloud is still not widely accepted for big data computing, due to concerns about its security. Result integrity is one of the most significant security problems that exist in the cloud-based big data computing scenario. In this paper, we propose MtMR, a Merkle tree-based verification method that assures high result integrity of MapReduce jobs. MtMR overlays MapReduce on a hybrid cloud environment and applies two rounds of Merkle tree-based verifications on the pre-reduce phase (i.e., the map phase and the shuffle phase) and the reduce phase, respectively. In each round, MtMR samples a small portion of reduce task input/output records on the private cloud and performs Merkle tree-based verification on all the task input/output records. Based on the design of MtMR, we perform a series of theoretical studies to analyze its security and performance overhead. Our results indicate that MtMR is a promising method in terms of high result integrity and low performance overhead. For example, by setting the sampled record ratio to an optimal value, MtMR can guarantee no more than 10 incorrect records in each reduce task by sampling only 4 percent of the records in that task.
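
To make the Merkle tree-based verification above more concrete, here is a minimal generic sketch: a worker commits to all records of a task through a single root hash, and a verifier later checks that a sampled record belongs to that commitment using a short proof. The helper names (`merkle_levels`, `merkle_proof`, `verify_record`), the SHA-256 hash, and the padding rule are assumptions of this sketch; the abstract does not describe how MtMR builds its trees or selects samples.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_levels(records):
    """Return all tree levels, leaves first, root level last."""
    if not records:
        raise ValueError("at least one record is required")
    # Distinct prefixes keep leaf hashes and inner-node hashes apart.
    level = [_h(b"\x00" + r) for r in records]
    levels = [level]
    while len(level) > 1:
        if len(level) % 2:
            # Duplicate the last node so every node has a sibling.
            level = level + [level[-1]]
            levels[-1] = level
        level = [_h(b"\x01" + level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_proof(levels, index):
    """Collect the sibling hashes on the path from leaf index to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify_record(record, proof, root):
    """Recompute the root from one record and its proof."""
    node = _h(b"\x00" + record)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = _h(b"\x01" + pair)
    return node == root
```

A verifier that holds only the root can check any sampled record with a proof whose length grows logarithmically in the number of records, which is why sampling a small fraction of records can remain cheap.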

Computer Communications | 2017

Exploiting Content Delivery Networks for covert channel communications

Yongzhi Wang; Yulong Shen; Xiaopeng Jiao; Tao Zhang; Xu Si; Ahmed Salem; Jia Liu

Highlights: We proposed a CDN-based covert channel communication attack. We performed experiments on a commercial CDN to show that such an attack is possible. We discussed possible countermeasures against such an attack. Content Delivery Networks (CDNs) have become an important infrastructure in today's Internet architecture. More and more content providers use CDNs to improve their service quality and reliability. However, the better quality of service (QoS) that CDNs provide could also be abused by attackers to commit network crimes. In this paper, we show that CDNs can be used as a covert communication channel to circumvent network censorship. Specifically, we propose the CDN covert channel attack, where accessing contents through different CDN nodes can form a unique pattern, which can be used to encode secret messages. We implemented a proof-of-concept covert channel based on our proposed attack on CloudFront, a commercial CDN service provided by Amazon Web Services. We showed that our constructed covert channel can transmit messages of various lengths with an average transmission efficiency of 2.29 bits per request (i.e., each penetration request transmits 2.29 bits of secret message on average). After presenting the CDN covert channel attack, we also discuss possible countermeasures.
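
The idea that the choice of CDN node can carry information can be shown with a toy encoding, given here only as an illustration. It assumes both parties share an ordered list of node names (the names below are made up) and maps fixed-size groups of bits to a node index. It does not reproduce the paper's scheme or its measured 2.29 bits per request, and it issues no network requests.

```python
def bits_per_request(nodes):
    """Whole bits one node choice can carry, capped at 8 for simplicity."""
    k = min(len(nodes).bit_length() - 1, 8)  # floor(log2(n)), at most 8
    if k < 1:
        raise ValueError("at least two nodes are required")
    return k


def encode(message: bytes, nodes):
    """Turn a message into the sequence of nodes to access."""
    k = bits_per_request(nodes)
    bits = "".join(f"{byte:08b}" for byte in message)
    bits += "0" * (-len(bits) % k)  # pad to a whole number of symbols
    return [nodes[int(bits[i:i + k], 2)] for i in range(0, len(bits), k)]


def decode(sequence, nodes):
    """Recover the message from the observed sequence of nodes."""
    k = bits_per_request(nodes)
    index = {name: i for i, name in enumerate(nodes)}
    bits = "".join(f"{index[name]:0{k}b}" for name in sequence)
    bits = bits[:len(bits) - len(bits) % 8]  # drop the padding bits
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


# Hypothetical node names shared by sender and receiver.
NODES = ["edge-a", "edge-b", "edge-c", "edge-d"]
```

With four nodes each access carries two bits, so a two-byte message needs eight accesses in this toy scheme.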

International Conference on Web Services | 2014

Exploring Cloud Service Brokering from an Interface Perspective

Yucong Duan; Nanjangud C. Narendra; Wencai Du; Yongzhi Wang; Nianjun Zhou

Service brokering has an increasingly prominent role in bridging the gap between business requirements and technology enablement. We propose the concept of service value brokering (SVB) to supply the possibly missing links between the business and technology layers. In this paper, we model an SVB Web service as an integration of two layers, with the business interface (BIF) and the technical interface (TIF). With this distinction, Web service compositions can map to two layers of compositions at both the BIF and TIF levels. We notice that any partial consideration of the consistency of either the BIF or the TIF layer would likely leave mismatches on the other layer. We employ SVB to resolve these mismatches. With the help of SVB, we address the needs of coherent business planning and IT implementation in a model-driven manner. Finally, we illustrate the feasibility of our approach in the development of a modern cloud-based tourism e-commerce platform.
