Manasi Vartak
Massachusetts Institute of Technology
Publication
Featured research published by Manasi Vartak.
very large data bases | 2015
Manasi Vartak; Sajjadur Rahman; Samuel Madden; Aditya G. Parameswaran; Neoklis Polyzotis
Data analysts often build visualizations as the first step in their analytical workflow. However, when working with high-dimensional datasets, identifying visualizations that show relevant or desired trends in data can be laborious. We propose SeeDB, a visualization recommendation engine to facilitate fast visual analysis: given a subset of data to be studied, SeeDB intelligently explores the space of visualizations, evaluates promising visualizations for trends, and recommends those it deems most “useful” or “interesting”. The two major obstacles in recommending interesting visualizations are (a) scale: evaluating a large number of candidate visualizations while responding within interactive time scales, and (b) utility: identifying an appropriate metric for assessing interestingness of visualizations. For the former, SeeDB introduces pruning optimizations to quickly identify high-utility visualizations and sharing optimizations to maximize sharing of computation across visualizations. For the latter, as a first step, we adopt a deviation-based metric for visualization utility, while indicating how we may be able to generalize it to other factors influencing utility. We implement SeeDB as a middleware layer that can run on top of any DBMS. Our experiments show that our framework can identify interesting visualizations with high accuracy. Our optimizations lead to multiple orders of magnitude speedup on relational row and column stores and provide recommendations at interactive time scales. Finally, we demonstrate via a user study the effectiveness of our deviation-based utility metric and the value of recommendations in supporting visual analytics.
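As a rough illustration of the deviation-based utility idea in the abstract above, the following minimal sketch scores each candidate view (a group-by dimension paired with a measure) by how far the normalized aggregate over the studied subset deviates from the same aggregate over a reference dataset. The abstract does not name a distance function, so the Euclidean distance here is an assumption, and all function names, column names, and sample rows are hypothetical rather than part of SeeDB itself. The sketch also omits the pruning and sharing optimizations the abstract mentions.

```python
from collections import defaultdict
from math import sqrt


def normalized_distribution(rows, dimension, measure):
    """Sum `measure` per `dimension` value, then scale the sums to total 1."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {key: 0.0 for key in totals}
    return {key: value / grand_total for key, value in totals.items()}


def deviation_utility(target_rows, reference_rows, dimension, measure):
    """Euclidean distance between the two normalized distributions (assumed metric)."""
    target = normalized_distribution(target_rows, dimension, measure)
    reference = normalized_distribution(reference_rows, dimension, measure)
    groups = set(target) | set(reference)
    return sqrt(
        sum((target.get(g, 0.0) - reference.get(g, 0.0)) ** 2 for g in groups)
    )


def rank_views(target_rows, reference_rows, dimensions, measures, k=1):
    """Return the k (dimension, measure) views with the largest deviation."""
    scored = [
        (deviation_utility(target_rows, reference_rows, d, m), d, m)
        for d in dimensions
        for m in measures
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(d, m) for _, d, m in scored[:k]]


# Hypothetical sample data: the reference is balanced across both dimensions,
# while the studied subset is concentrated in one region.
reference_rows = [
    {"region": "east", "segment": "a", "sales": 10},
    {"region": "west", "segment": "a", "sales": 10},
    {"region": "east", "segment": "b", "sales": 10},
    {"region": "west", "segment": "b", "sales": 10},
]
target_rows = [
    {"region": "east", "segment": "a", "sales": 30},
    {"region": "east", "segment": "b", "sales": 30},
]

top_views = rank_views(
    target_rows, reference_rows, ["segment", "region"], ["sales"], k=1
)
```

In this toy data the subset matches the reference when grouped by segment but not when grouped by region, so the region view receives the higher score and is the one recommended.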
international conference on management of data | 2014
Rebecca Taft; Manasi Vartak; Nadathur Satish; Narayanan Sundaram; Samuel Madden; Michael Stonebraker
This paper introduces a new benchmark designed to test database management system (DBMS) performance on a mix of data management tasks (joins, filters, etc.) and complex analytics (regression, singular value decomposition, etc.). Such mixed workloads are prevalent in a number of application areas including most science workloads and web analytics. As a specific use case, we have chosen genomics data for our benchmark and have constructed a collection of typical tasks in this domain. In addition to being representative of a mixed data management and analytics workload, this benchmark is also meant to scale to large dataset sizes and multiple nodes across a cluster. Besides presenting this benchmark, we have run it on a variety of storage systems including traditional row stores, newer column stores, Hadoop, and an array DBMS. We present performance numbers on all systems on single and multiple nodes, and show that performance differs by orders of magnitude between the various solutions. In addition, we demonstrate that most platforms have scalability issues. We also test offloading the analytics onto a coprocessor. The intent of this benchmark is to focus research interest in this area; to this end, all of our data, data generators, and scripts are available on our web site.
international conference on management of data | 2017
Manasi Vartak; Silu Huang; Tarique Siddiqui; Samuel Madden; Aditya G. Parameswaran
Data visualization is often used as the first step while performing a variety of analytical tasks. With the advent of large, high-dimensional datasets and significant interest in data science, there is a need for tools that can support rapid visual analysis. In this paper we describe our vision for a new class of visualization systems, namely visualization recommendation systems, that can automatically identify and interactively recommend visualizations relevant to an analytical task. We detail the key requirements and design considerations for a visualization recommendation system. We also identify a number of challenges in realizing this vision and describe some approaches to address them.
international conference on management of data | 2013
Manasi Vartak; Samuel Madden
Current recommender systems are focused largely on recommending items based on similarity. For instance, Netflix can recommend movies similar to previously viewed movies, and Amazon can recommend items based on ratings of similar users. Although similarity-based recommendation works well for books and movies, it provides an incomplete solution for items such as clothing or furniture which are inherently used in combination with other items of the same type, e.g., shirt with pants, and desk with a chair. As a result, the decision to buy a clothing or furniture item depends not only on the item itself, but also on how well it works with other items of that type. Recommending such items therefore requires a combination-based recommendation system that, given an item, can suggest interesting and diverse combinations containing that item. This problem is challenging because features affecting combination quality are often difficult to identify; quality, being a function of all items in the combination, cannot be computed independently; and there are an exponential number of combinations to explore. In this demonstration, we present CHIC, a first-of-its-kind, combination-based recommendation system for clothing. The audience will interact with our system through the CHIC mobile app, which allows the user to take a picture of a clothing item and search for interesting combinations containing the item instantly. The audience can also compete with CHIC to create alternate ensembles and compare quality. Finally, we highlight via visualizations the core modules of CHIC including model building and our novel search and classification algorithm, C-Search.
international conference on management of data | 2010
Manasi Vartak; Venkatesh Raghavan; Elke A. Rundensteiner
In many business and consumer applications, queries have cardinality constraints. However, current database systems provide minimal support for cardinality assurance. Consequently, users must adopt a cumbersome trial-and-error approach to find queries that are close to the original query but also attain the desired cardinality. In this demonstration, we present QRelX, a novel framework to automatically generate alternate queries that meet the cardinality and closeness criteria. QRelX employs an innovative query space transformation strategy, proximity-based search, and incremental cardinality estimation to efficiently find alternate queries. Our demonstration is an interactive game that allows the audience to compete with QRelX via manual query refinement. We illustrate the importance of cardinality assurance through real-time comparisons between manual refinement and QRelX. We also highlight the novelty of our solution by visualizing the core algorithms of QRelX.
very large data bases | 2015
Aaron J. Elmore; Jennie Duggan; Michael Stonebraker; Magdalena Balazinska; Ugur Çetintemel; Vijay Gadepally; Jeffrey Heer; Bill Howe; Jeremy Kepner; Tim Kraska; Samuel Madden; David Maier; Timothy G. Mattson; Stavros Papadopoulos; Jeff Parkhurst; Nesime Tatbul; Manasi Vartak; Stan Zdonik
very large data bases | 2014
Manasi Vartak; Samuel Madden; Aditya G. Parameswaran; Neoklis Polyzotis
international conference on management of data | 2016
Manasi Vartak; Harihar Subramanyam; Wei-En Lee; Srinidhi Viswanathan; Saadiyah Husnoo; Samuel Madden; Matei Zaharia
extending database technology | 2016
Manasi Vartak; Venkatesh Raghavan; Elke A. Rundensteiner; Samuel Madden
international conference on management of data | 2018
Manasi Vartak; Joana M. F. da Trindade; Samuel Madden; Matei Zaharia