Prasang Upadhyaya | Researchain

Archive Network Publication Hotspot Collaboration

Network

Latest external collaboration on country level. Dive into details by clicking on the dots.

Explore More

Hotspot

Dive into the research topics where Prasang Upadhyaya is active.

Explore More

Publication

Featured researches published by Prasang Upadhyaya.

international conference on management of data | 2011

A latency and fault-tolerance optimizer for online parallel query plans

Prasang Upadhyaya; YongChul Kwon; Magdalena Balazinska

We address the problem of making online, parallel query plans fault-tolerant: i.e., provide intra-query fault-tolerance without blocking. We develop an approach that not only achieves this goal but does so through the use of different fault-tolerance techniques at different operators within a query plan. Enabling each operator to use a different fault-tolerance strategy leads to a space of fault-tolerance plans amenable to cost-based optimization. We develop FTOpt, a cost-based fault-tolerance optimizer that automatically selects the best strategy for each operator in a query plan in a manner that minimizes the expected processing time with failures for the entire query. We implement our approach in a prototype parallel query-processing engine. Our experiments demonstrate that (1) there is no single best fault-tolerance strategy for all query plans, (2) often hybrid strategies that mix-and-match recovery techniques outperform any uniform strategy, and (3) our optimizer correctly identifies winning fault-tolerance configurations.

international conference on management of data | 2013

Toward practical query pricing with QueryMarket

Paraschos Koutris; Prasang Upadhyaya; Magdalena Balazinska; Bill Howe; Dan Suciu

We develop a new pricing system, QueryMarket, for flexible query pricing in a data market based on an earlier theoretical framework (Koutris et al., PODS 2012). To build such a system, we show how to use an Integer Linear Programming formulation of the pricing problem for a large class of queries, even when pricing is computationally hard. Further, we leverage query history to avoid double charging when queries purchased over time have overlapping information, or when the database is updated. We then present a technique that fairly shares revenue when multiple sellers are involved. Finally, we implement our approach in a prototype and evaluate its performance on several query workloads.

very large data bases | 2012

How to price shared optimizations in the cloud

Prasang Upadhyaya; Magdalena Balazinska; Dan Suciu

Data-management-as-a-service systems are increasingly being used in collaborative settings, where multiple users access common datasets. Cloud providers have the choice to implement various optimizations, such as indexing or materialized views, to accelerate queries over these datasets. Each optimization carries a cost and may benefit multiple users. This creates a major challenge: how to select which optimizations to perform and how to share their cost among users. The problem is especially challenging when users are selfish and will only report their true values for different optimizations if doing so maximizes their utility. In this paper, we present a new approach for selecting and pricing shared optimizations by using Mechanism Design. We first show how to apply the Shapley Value Mechanism to the simple case of selecting and pricing additive optimizations, assuming an offline game where all users access the service for the same time-period. Second, we extend the approach to online scenarios where users come and go. Finally, we consider the case of substitutive optimizations. We show analytically that our mechanisms induce truthfulness and recover the optimization costs. We also show experimentally that our mechanisms yield higher utility than the state-of-the-art approach based on regret accumulation.

international conference on management of data | 2013

Secure database-as-a-service with Cipherbase

Arvind Arasu; Spyros Blanas; Ken Eguro; Manas Joglekar; Raghav Kaushik; Donald Kossmann; Ravishankar Ramamurthy; Prasang Upadhyaya; Ramarathnam Venkatesan

Data confidentiality is one of the main concerns for users of public cloud services. The key problem is protecting sensitive data from being accessed by cloud administrators who have root privileges and can remotely inspect the memory and disk contents of the cloud servers. While encryption is the basic mechanism that can leveraged to provide data confidentiality, providing an efficient database-as-a-service that can run on encrypted data raises several interesting challenges. In this demonstration we outline the functionality of Cipherbase --- a full fledged SQL database system that supports the full generality of a database system while providing high data confidentiality. Cipherbase has a novel architecture that tightly integrates custom-designed trusted hardware for performing operations on encrypted data securely such that an administrator cannot get access to any plaintext corresponding to sensitive data.

Journal of the ACM | 2015

Query-Based Data Pricing

Paraschos Koutris; Prasang Upadhyaya; Magdalena Balazinska; Bill Howe; Dan Suciu

Data is increasingly being bought and sold online, and Web-based marketplace services have emerged to facilitate these activities. However, current mechanisms for pricing data are very simple: buyers can choose only from a set of explicit views, each with a specific price. In this article, we propose a framework for pricing data on the Internet that, given the price of a few views, allows the price of any query to be derived automatically. We call this capability query-based pricing. We first identify two important properties that the pricing function must satisfy, the arbitrage-free and discount-free properties. Then, we prove that there exists a unique function that satisfies these properties and extends the sellers explicit prices to all queries. Central to our framework is the notion of query determinacy, and in particular instance-based determinacy: we present several results regarding the complexity and properties of it. When both the views and the query are unions of conjunctive queries or conjunctive queries, we show that the complexity of computing the price is high. To ensure tractability, we restrict the explicit prices to be defined only on selection views (which is the common practice today). We give algorithms with polynomial time data complexity for computing the price of two classes of queries: chain queries (by reducing the problem to network flow), and cyclic queries. Furthermore, we completely characterize the class of conjunctive queries without self-joins that have PTIME data complexity, and prove that pricing all other queries is NP-complete, thus establishing a dichotomy on the complexity of the pricing problem when all views are selection queries.

In Search of Elegance in the Theory and Practice of Computation | 2013

A Discussion on Pricing Relational Data

Magdalena Balazinska; Bill Howe; Paraschos Koutris; Dan Suciu; Prasang Upadhyaya

There exists a growing market for structured data on the Internet today, and this motivates a theoretical study of how relational data should be priced. We advocate for a framework where the seller defines a pricing scheme, by essentially stipulating the price of some queries, and the buyer is allowed to purchase data expressed by any query they wish: the system will derive the price automatically from the pricing scheme. We show that, in order to understand pricing, one needs to understand determinacy first. We also discuss some other open problems in pricing relational data.

international conference on management of data | 2015

Automatic Enforcement of Data Use Policies with DataLawyer

Prasang Upadhyaya; Magdalena Balazinska; Dan Suciu

Data has value and is increasingly being exchanged for commercial and research purposes. Data, however, is typically accompanied by terms of use, which limit how it can be used. To date, there are only a few, ad-hoc methods to enforce these terms. We propose DataLawyer, a new system to formally specify usage policies and check them automatically at query runtime in a relational database management system (DBMS). We develop a new model to specify policies compactly and precisely. We introduce novel algorithms to efficiently evaluate policies that can cut policy-checking overheads to only a few percent of the total query runtime. We implement DataLawyer and evaluate it on a real database from the health-care domain.

very large data bases | 2016

Price-optimal querying with data APIs

Prasang Upadhyaya; Magdalena Balazinska; Dan Suciu

Data is increasingly being purchased online in data markets and REST APIs have emerged as a favored method to acquire such data. Typically, sellers charge buyers based on how much data they purchase. In many scenarios, buyers need to make repeated calls to the sellers API. The challenge is then for buyers to keep track of the data they purchase and avoid purchasing the same data twice. In this paper, we propose lightweight modifications to data APIs to achieve optimal history-aware pricing so that buyers are only charged once for data that they have purchased and that has not been updated. The key idea behind our approach is the notion of refunds: buyers buy data as needed but have the ability to ask for refunds of data that they had already purchased before. We show that our techniques can provide significant data cost savings while reducing overheads by two orders of magnitude as compared to the state-of-the-art competing approaches.

international conference on management of data | 2013

The power of data use management in action

Prasang Upadhyaya; Nicholas R. Anderson; Magdalena Balazinska; Bill Howe; Raghav Kaushik; Ravishankar Ramamurthy; Dan Suciu

In this demonstration, we show-case a database management system extended with a new type of component that we call a Data Use Manager (DUM). The DUM enables DBAs to attach policies to data loaded into the DBMS. It then monitors how users query the data, flags potential policy violations, recommends possible fixes, and supports offline analysis of user activities related to data policies. The demonstration uses real healthcare data.

symposium on principles of database systems | 2012