Weining Yang | Researchain

Publication

Featured research published by Weining Yang.

international conference on data engineering | 2013

Differentially private grids for geospatial data

Wahbeh H. Qardaji; Weining Yang; Ninghui Li

In this paper, we tackle the problem of constructing a differentially private synopsis for two-dimensional datasets such as geospatial datasets. The current state-of-the-art methods work by performing recursive binary partitioning of the data domains, and constructing a hierarchy of partitions. We show that the key challenge in partition-based synopsis methods lies in choosing the right partition granularity to balance the noise error and the non-uniformity error. We study the uniform-grid approach, which applies an equi-width grid of a certain size over the data domain and then issues independent count queries on the grid cells. This method has received no attention in the literature, probably due to the fact that no good method for choosing a grid size was known. Based on an analysis of the two kinds of errors, we propose a method for choosing the grid size. Experimental results validate our method, and show that this approach performs as well as, and often times better than, the state-of-the-art methods. We further introduce a novel adaptive-grid method. The adaptive grid method lays a coarse-grained grid over the dataset, and then further partitions each cell according to its noisy count. Both levels of partitions are then used in answering queries over the dataset. This method exploits the need to have finer granularity partitioning over dense regions and, at the same time, coarse partitioning over sparse regions. Through extensive experiments on real-world datasets, we show that this approach consistently and significantly outperforms the uniform-grid method and other state-of-the-art methods.
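As a rough illustration of the uniform-grid idea described in this abstract, the following Python sketch lays an equi-width m × m grid over a bounded domain, counts the points in each cell, and perturbs each count independently with Laplace noise. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the `bounds` parameter, and the use of add/remove-one-record neighboring datasets (sensitivity 1) are assumptions made here, and the grid size `m` is passed in by the caller rather than chosen by the paper's method, which the abstract does not spell out.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def uniform_grid_synopsis(points, m, epsilon, bounds=(0.0, 1.0, 0.0, 1.0), rng=None):
    """Return an m x m grid of noisy cell counts (hypothetical helper).

    Assumes every point lies inside `bounds` = (x_min, x_max, y_min, y_max).
    """
    rng = rng or random.Random()
    x_min, x_max, y_min, y_max = bounds
    counts = [[0] * m for _ in range(m)]
    for x, y in points:
        # Map each coordinate to a cell index; clamp so the upper edge
        # of the domain falls into the last cell.
        col = min(int((x - x_min) / (x_max - x_min) * m), m - 1)
        row = min(int((y - y_min) / (y_max - y_min) * m), m - 1)
        counts[row][col] += 1
    # Cells are disjoint, so adding or removing one point changes exactly
    # one count by 1; Laplace noise of scale 1/epsilon is added per cell.
    return [[c + laplace_noise(1.0 / epsilon, rng) for c in row] for row in counts]
```

A larger `m` reduces the non-uniformity error inside each cell but spreads the same data over more noisy counts, which is the trade-off the abstract refers to.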

computer and communications security | 2012

Minimizing private data disclosures in the smart grid

Weining Yang; Ninghui Li; Yuan Qi; Wahbeh H. Qardaji; Stephen E. McLaughlin; Patrick D. McDaniel

Smart electric meters pose a substantial threat to the privacy of individuals in their own homes. Combined with non-intrusive load monitors, smart meter data can reveal precise home appliance usage information. An emerging solution to behavior leakage in smart meter measurement data is the use of battery-based load hiding. In this approach, a battery is used to store and supply power to home devices at strategic times to hide appliance loads from smart meters. A few such battery control algorithms have already been studied in the literature, but none have been evaluated from an adversarial point of view. In this paper, we first consider two well known battery privacy algorithms, Best Effort (BE) and Non-Intrusive Load Leveling (NILL), and demonstrate attacks that recover precise load change information, which can be used to recover appliance behavior information, under both algorithms. We then introduce a stepping approach to battery privacy algorithms that fundamentally differs from previous approaches by maximizing the error between the load demanded by a home and the external load seen by a smart meter. By design, precise load change recovery attacks are impossible. We also propose mutual-information based measurements to evaluate the privacy of different algorithms. We implement and evaluate four novel algorithms using the stepping approach, and show that under the mutual-information metrics they outperform BE and NILL.

ieee symposium on security and privacy | 2014

A Study of Probabilistic Password Models

Jerry Ma; Weining Yang; Min Luo; Ninghui Li

A probabilistic password model assigns a probability value to each string. Such models are useful for research into understanding what makes users choose more (or less) secure passwords, and for constructing password strength meters and password cracking utilities. Guess number graphs generated from password models are a widely used method in password research. In this paper, we show that probability-threshold graphs have important advantages over guess-number graphs. They are much faster to compute, and at the same time provide information beyond what is feasible in guess-number graphs. We also observe that research in password modeling can benefit from the extensive literature in statistical language modeling. We conduct a systematic evaluation of a large number of probabilistic password models, including Markov models using different normalization and smoothing methods, and found that, among other things, Markov models, when done correctly, perform significantly better than the Probabilistic Context-Free Grammar model proposed in Weir et al., which has been used as the state-of-the-art password model in recent research.
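To make "assigns a probability value to each string" concrete, here is a minimal Python sketch of a character-level order-1 Markov model with additive smoothing and an explicit end symbol. It is an illustrative assumption, not the configuration evaluated in the paper: the model order, the smoothing constant `delta`, the start and end markers, and the function names are all chosen here for brevity.

```python
from collections import defaultdict

# Hypothetical start and end markers, assumed not to occur in passwords.
START, END = "\x02", "\x03"


def train_bigram_model(passwords):
    """Count character transitions, including the transition to END."""
    counts = defaultdict(lambda: defaultdict(int))
    alphabet = {END}
    for pw in passwords:
        prev = START
        for ch in pw + END:
            counts[prev][ch] += 1
            alphabet.add(ch)
            prev = ch
    return counts, alphabet


def password_probability(pw, counts, alphabet, delta=0.01):
    """Probability of `pw` as a product of smoothed transition probabilities.

    Additive smoothing gives unseen transitions a small non-zero mass.
    Characters outside the training alphabet are scored the same way,
    which is a simplification of this sketch.
    """
    prob = 1.0
    prev = START
    for ch in pw + END:
        context = counts.get(prev, {})
        total = sum(context.values())
        prob *= (context.get(ch, 0) + delta) / (total + delta * len(alphabet))
        prev = ch
    return prob
```

Ending every string with an explicit end symbol is one way to normalize the model so that probabilities over strings of all lengths built from the training alphabet sum to one.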

very large data bases | 2013

Understanding hierarchical methods for differentially private histograms

Wahbeh H. Qardaji; Weining Yang; Ninghui Li

In recent years, many approaches to differentially privately publish histograms have been proposed. Several approaches rely on constructing tree structures in order to decrease the error when answering large range queries. In this paper, we examine the factors affecting the accuracy of hierarchical approaches by studying the mean squared error (MSE) when answering range queries. We start with one-dimensional histograms, and analyze how the MSE changes with different branching factors, after employing constrained inference, and with different methods to allocate the privacy budget among hierarchy levels. Our analysis and experimental results show that combining the choice of a good branching factor with constrained inference outperforms the current state of the art. Finally, we extend our analysis to multi-dimensional histograms. We show that the benefits from employing hierarchical methods beyond a single dimension are significantly diminished, and when there are 3 or more dimensions, it is almost always better to use the Flat method instead of a hierarchy.
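The contrast between a flat histogram and a hierarchy can be sketched in Python as follows. This is a simplified illustration under assumptions made here, not the paper's method: the privacy budget is split evenly across all levels (including the root), no constrained inference is applied, and the function names are hypothetical. The flat variant spends the whole budget on the unit bins, while the hierarchy adds more noise per node but answers a range query from fewer nodes.

```python
import random


def laplace_noise(scale, rng):
    # Difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def flat_histogram(hist, epsilon, rng=None):
    """Flat baseline: noise each unit bin with the full budget."""
    rng = rng or random.Random()
    return [c + laplace_noise(1.0 / epsilon, rng) for c in hist]


def build_noisy_hierarchy(hist, branching, epsilon, rng=None):
    """Return noisy levels; level 0 holds unit bins, the last level the root."""
    rng = rng or random.Random()
    levels = [list(hist)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([sum(prev[i:i + branching])
                       for i in range(0, len(prev), branching)])
    # Each record falls in one node per level, so with the budget split
    # evenly over len(levels) levels, each node gets scale len(levels)/epsilon.
    scale = len(levels) / epsilon
    return [[c + laplace_noise(scale, rng) for c in level] for level in levels]


def range_query(levels, branching, lo, hi):
    """Noisy count of unit bins in [lo, hi), using high-level nodes where possible."""
    total, level, top = 0.0, 0, len(levels) - 1
    while lo < hi:
        # Consume nodes on the left until aligned (or everything at the root level).
        while lo < hi and (lo % branching != 0 or level == top):
            total += levels[level][lo]
            lo += 1
        # Consume nodes on the right until aligned.
        while lo < hi and hi % branching != 0:
            hi -= 1
            total += levels[level][hi]
        if lo < hi:
            # Both ends are aligned: continue one level up.
            lo //= branching
            hi //= branching
            level += 1
    return total
```

For example, with 16 bins and a branching factor of 4, the range [3, 14) is answered from three unit bins and two level-1 nodes instead of eleven unit bins.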

international conference on management of data | 2014

PriView: practical differentially private release of marginal contingency tables

Wahbeh H. Qardaji; Weining Yang; Ninghui Li

We consider the problem of publishing a differentially private synopsis of a d-dimensional dataset so that one can reconstruct any k-way marginal contingency tables from the synopsis. Marginal tables are the workhorses of categorical data analysis. Thus, the private release of such tables has attracted a lot of attention from the research community. However, for situations where d is moderate to large and k is beyond 3, no accurate and practical method exists. We introduce PriView, which computes marginal tables for a number of strategically chosen sets of attributes that we call views, and then uses these view marginal tables to reconstruct any desired k-way marginal. In PriView, we apply maximum entropy optimization to reconstruct k-way marginals from views. We also develop a novel method to efficiently make all view marginals consistent while correcting negative entries to improve accuracy. For view selection, we borrow the concept of covering design from combinatorics theory. We conduct extensive experiments on real and synthetic datasets, and show that PriView reduces the error over existing approaches by 2 to 3 orders of magnitude.

computer and communications security | 2013

Membership privacy: a unifying framework for privacy definitions

Ninghui Li; Wahbeh H. Qardaji; Dong Su; Yi Wu; Weining Yang

We introduce a novel privacy framework that we call Membership Privacy. The framework includes positive membership privacy, which prevents the adversary from significantly increasing its ability to conclude that an entity is in the input dataset, and negative membership privacy, which prevents leaking of non-membership. These notions are parameterized by a family of distributions that captures the adversary's prior knowledge. The power and flexibility of the proposed framework lies in the ability to choose different distribution families to instantiate membership privacy. Many privacy notions in the literature are equivalent to membership privacy with interesting distribution families, including differential privacy, differential identifiability, and differential privacy under sampling. Casting these notions into the framework leads to a deeper understanding of the strengths and weaknesses of these notions, as well as their relationships to each other. The framework also provides a principled approach to developing new privacy notions under which better utility can be achieved than what is possible under differential privacy.

Synthesis Lectures on Information Security, Privacy, and Trust | 2016

Differential Privacy: From Theory to Practice

Ninghui Li; Min Lyu; Dong Su; Weining Yang

Over the last decade, differential privacy (DP) has emerged as the de facto standard privacy notion for research in privacy-preserving data analysis and publishing. The DP notion offers a strong privacy guarantee and has been applied to many data analysis tasks. This Synthesis Lecture is the first of two volumes on differential privacy. This lecture differs from the existing books and surveys on differential privacy in that we take an approach balancing theory and practice. We focus on the empirical accuracy of algorithms rather than asymptotic accuracy guarantees. At the same time, we try to explain why these algorithms achieve that empirical accuracy. We also take a balanced approach regarding the semantic meanings of differential privacy, explaining both its strong guarantees and its limitations. We start by inspecting the definition and basic properties of DP, and the main primitives for achieving DP. Then, we give a detailed discussion on the semantic privacy guarantee provided by DP and the caveats when applying DP. Next, we review the state-of-the-art mechanisms for publishing histograms for low-dimensional datasets, mechanisms for conducting machine learning tasks such as classification, regression, and clustering, and mechanisms for publishing information to answer marginal queries for high-dimensional datasets. Finally, we explain the sparse vector technique, including the many errors that have been made in the literature using it. The planned Volume 2 will cover usage of DP in other settings, including high-dimensional datasets, graph datasets, local setting, location privacy, and so on. We will also discuss various relaxations of DP.

Proceedings of the Hot Topics in Science of Security: Symposium and Bootcamp on | 2017

Use of Phishing Training to Improve Security Warning Compliance: Evidence from a Field Experiment

Weining Yang; Aiping Xiong; Jing Chen; Robert W. Proctor; Ninghui Li

The current approach to protect users from phishing attacks is to display a warning when the webpage is considered suspicious. We hypothesize that users are capable of making correct informed decisions when the warning also conveys the reasons why it is displayed. We chose to use traffic rankings of domains, which can be easily described to users, as a warning trigger and evaluated the effect of the phishing warning message and phishing training. The evaluation was conducted in a field experiment. We found that knowledge gained from the training enhances the effectiveness of phishing warnings, as the number of participants being phished was reduced. However, the knowledge by itself was not sufficient to provide phishing protection. We suggest that integrating training in the warning interface, involving traffic ranking in phishing detection, and explaining why warnings are generated will improve current phishing defense.

Human Factors | 2017

Is Domain Highlighting Actually Helpful in Identifying Phishing Web Pages

Aiping Xiong; Robert W. Proctor; Weining Yang; Ninghui Li

computer and communications security | 2016

An Empirical Study of Mnemonic Sentence-based Password Generation Strategies

Weining Yang; Ninghui Li; Omar Chowdhury; Aiping Xiong; Robert W. Proctor
