Shubha U. Nabar
Stanford University
Publication
Featured research published by Shubha U. Nabar.
conference on information and knowledge management | 2008
Aleksandra Korolova; Rajeev Motwani; Shubha U. Nabar; Ying Xu
We consider a privacy threat to a social network in which the goal of an attacker is to obtain knowledge of a significant fraction of the links in the network. We formalize the typical social network interface and the information about links that it provides to its users in terms of lookahead. We consider a particular threat in which an attacker subverts user accounts to gain information about local neighborhoods in the network and pieces them together in order to build a global picture. We analyze, both experimentally and theoretically, the number of user accounts an attacker would need to subvert for a successful attack, as a function of his strategy for choosing users whose accounts to subvert and a function of the lookahead provided by the network. We conclude that such an attack is feasible in practice, and thus any social network that wishes to protect the link privacy of its users should take great care in choosing the lookahead of its interface, limiting it to 1 or 2, whenever possible.
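The attack described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes that a lookahead of ℓ means a subverted account reveals every link incident to a node within ℓ hops of it, and the function names (`revealed_edges`, `coverage`, `highest_degree_accounts`) and the toy graph are hypothetical. It measures the fraction of links an attacker learns for a given set of subverted accounts.

```python
from collections import deque

def revealed_edges(graph, account, lookahead):
    """Links incident to every node within `lookahead` hops of `account`.

    Assumed formalization: lookahead 0 shows only the account's own links.
    """
    dist = {account: 0}
    queue = deque([account])
    while queue:
        node = queue.popleft()
        if dist[node] == lookahead:
            continue  # do not expand beyond the lookahead radius
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    # Every link touching a visible node is revealed.
    return {frozenset((u, v)) for u in dist for v in graph[u]}

def coverage(graph, subverted, lookahead):
    """Fraction of all links learned by subverting the given accounts."""
    total = {frozenset((u, v)) for u in graph for v in graph[u]}
    seen = set()
    for account in subverted:
        seen |= revealed_edges(graph, account, lookahead)
    return len(seen) / len(total)

def highest_degree_accounts(graph, k):
    """One possible attacker strategy: target the k best-connected users."""
    return sorted(graph, key=lambda n: (-len(graph[n]), n))[:k]

# Hypothetical toy network with 5 links (undirected adjacency lists).
GRAPH = {
    0: [1],
    1: [0, 2, 5],
    2: [1, 3],
    3: [2, 4],
    4: [3],
    5: [1],
}
```

On this toy graph, subverting the single highest-degree account already reveals most links at lookahead 1, which is the kind of effect the abstract warns about; real networks would of course need the experimental analysis the paper reports.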
conference on information and knowledge management | 2006
Andrei Z. Broder; Marcus Fontoura; Vanja Josifovski; Ravi Kumar; Rajeev Motwani; Shubha U. Nabar; Rina Panigrahy; Andrew Tomkins; Ying Xu
We consider the problem of estimating the size of a collection of documents using only a standard query interface. Our main idea is to construct an unbiased and low-variance estimator that can closely approximate the size of any set of documents defined by certain conditions, including that each document in the set must match at least one query from a uniformly sampleable query pool of known size, fixed in advance. Using this basic estimator, we propose two approaches to estimating corpus size. The first approach requires a uniform random sample of documents from the corpus. The second approach avoids this notoriously difficult sample generation problem, and instead uses two fairly uncorrelated sets of terms as query pools; the accuracy of the second approach depends on the degree of correlation between the two sets of terms. Experiments on a large TREC collection and on three major search engines demonstrate the effectiveness of our algorithms.
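One way to obtain an unbiased estimator under the conditions stated above is to sample queries uniformly from the pool and weight each returned document by the inverse of the number of pool queries it matches. The sketch below illustrates that idea only; it is an assumption about the construction, not the authors' implementation, and the toy corpus, the pool, and the helper names (`search`, `terms_of`, `weighted_matches`, `estimate_size`) are all hypothetical.

```python
import random

def weighted_matches(query, pool, search, terms_of):
    """Sum, over documents returned for `query`, of 1 / (pool queries matched)."""
    pool_set = set(pool)
    return sum(1.0 / len(pool_set & terms_of(doc)) for doc in search(query))

def estimate_size(pool, search, terms_of, num_samples, rng):
    """Estimate how many documents match at least one query in `pool`.

    Each document contributes a total weight of 1 across the whole pool,
    so |pool| times the mean weight of a uniformly sampled query is unbiased.
    """
    sampled = [rng.choice(pool) for _ in range(num_samples)]
    mean = sum(weighted_matches(q, pool, search, terms_of) for q in sampled) / num_samples
    return len(pool) * mean

# Hypothetical toy corpus: document id -> set of terms it contains.
CORPUS = {
    "d1": {"apple", "banana"},
    "d2": {"banana"},
    "d3": {"apple", "banana", "cherry"},
    "d4": {"durian"},  # matches no pool query, so it is outside the target set
}
POOL = ["apple", "banana", "cherry"]

def search(term):
    """Stand-in for a query interface: ids of documents containing `term`."""
    return [doc for doc, terms in CORPUS.items() if term in terms]

def terms_of(doc):
    """Stand-in for fetching a returned document's content."""
    return CORPUS[doc]
```

Summing `weighted_matches` over every query in the pool recovers the exact size of the matching set (3 documents here), which is why sampling queries gives an unbiased estimate. The variance depends on how unevenly documents are spread across queries.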
very large data bases | 2009
Anish Das Sarma; Omar Benjelloun; Alon Y. Halevy; Shubha U. Nabar; Jennifer Widom
In general terms, an uncertain relation encodes a set of possible certain relations. There are many ways to represent uncertainty, ranging from alternative values for attributes to rich constraint languages. Among the possible models for uncertain data, there is a tension between simple and intuitive models, which tend to be incomplete, and complete models, which tend to be nonintuitive and more complex than necessary for many applications. We present a space of models for representing uncertain data based on a variety of uncertainty constructs and tuple-existence constraints. We explore a number of properties and results for these models. We study completeness of the models, as well as closure under relational operations, and we give results relating closure and completeness. We then examine whether different models guarantee unique representations of uncertain data, and for those models that do not, we provide complexity results and algorithms for testing equivalence of representations. The next problem we consider is that of minimizing the size of representation of models, showing that minimizing the number of tuples also minimizes the size of constraints. We show that minimization is intractable in general and study the more restricted problem of maintaining minimality incrementally when performing operations. Finally, we present several results on the problem of approximating uncertain data in an insufficiently expressive model.
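The statement that an uncertain relation encodes a set of possible certain relations can be made concrete with a small enumeration. The sketch below assumes one simple point in the space of models, namely tuples with alternative values plus an optional "may be absent" flag; this representation and the name `possible_worlds` are illustrative assumptions, not the paper's notation.

```python
from itertools import product

def possible_worlds(uncertain_relation):
    """Enumerate the certain relations encoded by an uncertain relation.

    `uncertain_relation` is a list of (alternatives, optional) pairs:
    exactly one alternative is chosen per tuple, and an optional tuple
    may also be missing altogether.
    """
    choices = []
    for alternatives, optional in uncertain_relation:
        options = list(alternatives)
        if optional:
            options.append(None)  # None stands for "tuple absent"
        choices.append(options)
    worlds = set()
    for combo in product(*choices):
        # Each world is a set of tuples (set semantics, duplicates collapse).
        worlds.add(frozenset(t for t in combo if t is not None))
    return worlds

# Hypothetical example: Amy saw a crow or a raven; Bob may have seen a jay.
SIGHTINGS = [
    ([("Amy", "crow"), ("Amy", "raven")], False),
    ([("Bob", "jay")], True),
]
WORLDS = possible_worlds(SIGHTINGS)
```

This simple model yields four possible worlds for the example. It cannot express a constraint such as "Bob's tuple exists only if Amy saw a crow", which hints at why such intuitive models tend to be incomplete.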
international conference on data engineering | 2008
Aleksandra Korolova; Rajeev Motwani; Shubha U. Nabar; Ying Xu
international conference on data engineering | 2008
Rajeev Motwani; Shubha U. Nabar; Dilys Thomas
We study the problem of auditing a batch of SQL queries: given a forbidden view of a database that should have been kept confidential, a batch of queries that were posed over this database and answered, and a definition of suspiciousness, determine if the query batch is suspicious with respect to the forbidden view. We consider several notions of suspiciousness that span a spectrum both in terms of their disclosure detection guarantees and the tractability of auditing under them for different classes of queries. We identify a particular notion of suspiciousness, weak syntactic suspiciousness, that allows for an efficient auditor for a large class of conjunctive queries. The auditor can be used together with a specific set of forbidden views to detect disclosures of the association between individuals and their private attributes. Further it can also be used to prevent disclosures by auditing queries on the fly in an online setting. Finally, we tie in our work with recent research on query auditing and access control and relate the above definitions of suspiciousness to the notion of unconditional validity of a query introduced in database access control literature.
Privacy-Preserving Data Mining | 2008
Shubha U. Nabar; Krishnaram Kenthapadi; Nina Mishra; Rajeev Motwani
This chapter is a survey of query auditing techniques for detecting and preventing disclosures in a database containing private data. Informally, auditing is the process of examining past actions to check whether they were in conformance with official policies. In the context of database systems with specific data disclosure policies, auditing is the process of examining queries that were answered in the past to determine whether answers to these queries could have been used by an individual to ascertain confidential information forbidden by the disclosure policies. Techniques used for detecting disclosures could potentially also be used or extended to prevent disclosures, and so in addition to the retroactive auditing mentioned above, researchers have also studied an online
international conference on data engineering | 2006
Arnd Christian König; Shubha U. Nabar
Physical database design is critical to the performance of a large-scale DBMS. The corresponding automated design tuning tools need to select the best physical design from a large set of candidate designs quickly. However, for large workloads, evaluating the cost of each query in the workload for every candidate does not scale. To overcome this, we present a novel comparison primitive that only evaluates a fraction of the workload and provides an accurate estimate of the likelihood of selecting correctly. We show how to use this primitive to construct accurate and scalable selection procedures. Furthermore, we address the issue of ensuring that the estimates are conservative, even for highly skewed cost distributions. The proposed techniques are evaluated through a prototype implementation inside a commercial physical design tool.
international conference on data engineering | 2007
Rajeev Motwani; Shubha U. Nabar; Dilys Thomas
In this paper, we study the problem of auditing a batch of SQL queries: given a set of SQL queries that have been posed over a database, determine whether some subset of these queries has revealed private information about an individual or group of individuals. In (Agrawal et al., 2004), the authors studied the problem of determining whether any single SQL query in isolation revealed information forbidden by the database system's data disclosure policies. In this paper, we extend this work to the problem of auditing a batch of SQL queries. We define two different notions of auditing, semantic auditing and syntactic auditing, and show that while syntactic auditing seems more desirable, it is in fact NP-hard to achieve. The problem of semantic auditing of a batch of SQL queries is, however, tractable and we give a polynomial time algorithm for this purpose.
global communications conference | 2005
Shubha U. Nabar; Neha Kumar; Mohsen Bayati; Abtin Keshavarzian
In recent years, several high-throughput low-delay scheduling algorithms have been designed for input-queued (IQ) switches. It has been shown, however, that scheduling policies such as maximum weight matching, which perform optimally for an isolated switch, fail to provide stability in a network of IQ switches (M. Andrews and L. Zhang, 2001). Although there exist algorithms that ensure stability in networks of switches (M. Andrews and L. Zhang, 2001) (M. Ajmone Marsan et al., 2003), they are either not fully local or require knowledge/estimation of rates, and are thus not desirable. Here we propose a local and online switch-scheduling algorithm and prove that it achieves stability in a network of single-server switches when arriving traffic is admissible and obeys the strong law of large numbers. We then propose its counterpart for networks of crossbar switches and conjecture that this too is stable. Additionally, we prove that our algorithms provide a max-min fair rate allocation for isolated switches even when arriving traffic is inadmissible. We believe that fairness is key to ensuring stability in networks.
very large data bases | 2006
Parag Agrawal; Omar Benjelloun; Anish Das Sarma; Chris Hayworth; Shubha U. Nabar; Tomoe Sugihara; Jennifer Widom