Mahashweta Das

University of Texas at Arlington

Publication

Featured research published by Mahashweta Das.

very large data bases | 2012

Who tags what?: an analysis framework

Mahashweta Das; Saravanan Thirumuruganathan; Sihem Amer-Yahia; Gautam Das; Cong Yu

The rise of Web 2.0 is signaled by sites such as Flickr, del.icio.us, and YouTube, and social tagging is essential to their success. A typical tagging action involves three components: user, item (e.g., photos in Flickr), and tags (i.e., words or phrases). Analyzing how tags are assigned by certain users to certain items has important implications in helping users search for desired information. In this paper, we explore common analysis tasks and propose a dual mining framework for social tagging behavior mining. This framework is centered around two opposing measures, similarity and diversity, applied to one or more tagging components, and therefore enables a wide range of analysis scenarios, such as characterizing similar users tagging diverse items with similar tags, or diverse users tagging similar items with diverse tags. By adopting different concrete measures for similarity and diversity in the framework, we show that a wide range of concrete analysis problems can be defined and that they are NP-Complete in general. We design efficient algorithms for solving many of those problems and demonstrate, through comprehensive experiments over real data, that our algorithms significantly outperform the exact brute-force approach without compromising analysis result quality.
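The abstract leaves the concrete similarity and diversity measures open. As a rough illustration only, the sketch below assumes Jaccard overlap between tag sets as the similarity measure and takes diversity as its complement; the function names and the sample tag sets are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from typing import Sequence, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def group_similarity(tag_sets: Sequence[Set[str]]) -> float:
    """Mean pairwise Jaccard similarity over a group of tag sets."""
    pairs = list(combinations(tag_sets, 2))
    if not pairs:
        return 1.0  # a group of zero or one set is trivially similar
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def group_diversity(tag_sets: Sequence[Set[str]]) -> float:
    """Diversity defined here (by assumption) as 1 - similarity."""
    return 1.0 - group_similarity(tag_sets)


# Hypothetical tag sets assigned by two users to the same photo.
tags_by_user = [{"sunset", "beach"}, {"sunset", "beach", "sea"}]
similarity = group_similarity(tags_by_user)
diversity = group_diversity(tags_by_user)
```

The same two functions could be applied to user attributes or item attributes instead of tags, which is how a scenario such as "similar users tagging diverse items with similar tags" might be scored under these assumptions.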

knowledge discovery and data mining | 2013

Learning to question: leveraging user preferences for shopping advice

Mahashweta Das; Gianmarco De Francisci Morales; Aristides Gionis; Ingmar Weber

We present ShoppingAdvisor, a novel recommender system that helps users in shopping for technical products. ShoppingAdvisor leverages both user preferences and technical product attributes in order to generate its suggestions. The system elicits user preferences via a tree-shaped flowchart, where each node is a question to the user. At each node, ShoppingAdvisor suggests a ranking of products matching the preferences of the user, and that ranking is progressively refined along the path from the tree's root to one of its leaves. In this paper we show (i) how to learn the structure of the tree, i.e., which questions to ask at each node, and (ii) how to produce a suitable ranking at each node. First, we adapt the classical top-down strategy for building decision trees in order to find the best user attribute to ask at each node. Unlike decision trees, ShoppingAdvisor partitions the user space rather than the product space. Second, we show how to employ a learning-to-rank approach in order to learn, for each node of the tree, a ranking of products appropriate to the users who reach that node. We experiment with two real-world datasets for cars and cameras, and a synthetic one. We use mean reciprocal rank to evaluate ShoppingAdvisor, and show how the performance increases by more than 50% along the path from root to leaf. We also show how collaborative recommendation algorithms such as k-nearest neighbor benefit from feature selection done by the ShoppingAdvisor tree. Our experiments show that ShoppingAdvisor produces good quality interpretable recommendations, while requiring less input from users and being able to handle the cold-start problem.
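Mean reciprocal rank, the evaluation metric named above, averages 1/rank of the correct item over a set of users. The sketch below is a generic implementation of that metric, not code from the paper; the product names and the "root" versus "leaf" rankings are made-up data meant only to show how a more refined ranking raises the score.

```python
from typing import Hashable, Sequence


def reciprocal_rank(ranking: Sequence[Hashable], relevant: Hashable) -> float:
    """Return 1/rank of the relevant item (ranks start at 1), or 0.0 if absent."""
    for position, item in enumerate(ranking, start=1):
        if item == relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    relevant_items: Sequence[Hashable],
) -> float:
    """Average the reciprocal rank over users (one ranking per user)."""
    if len(rankings) != len(relevant_items):
        raise ValueError("need exactly one relevant item per ranking")
    if not rankings:
        return 0.0
    total = sum(reciprocal_rank(r, t) for r, t in zip(rankings, relevant_items))
    return total / len(rankings)


# Hypothetical data: the product each of two users actually chose.
chosen = ["camera_c", "camera_a"]

# One generic ranking shown to everyone at the root of the tree.
root_rankings = [["camera_a", "camera_b", "camera_c"],
                 ["camera_b", "camera_c", "camera_a"]]

# Rankings tailored to the users who reach a given leaf.
leaf_rankings = [["camera_c", "camera_a", "camera_b"],
                 ["camera_a", "camera_b", "camera_c"]]

mrr_root = mean_reciprocal_rank(root_rankings, chosen)
mrr_leaf = mean_reciprocal_rank(leaf_rankings, chosen)
```

In this toy data both chosen products sit at rank 3 in the root rankings and at rank 1 in the leaf rankings, so the score rises from the root to the leaf; the size of that jump is an artifact of the invented data and says nothing about the reported results.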

very large data bases | 2012

MapRat: meaningful explanation, interactive exploration and geo-visualization of collaborative ratings

Saravanan Thirumuruganathan; Mahashweta Das; Shrikant Desai; Sihem Amer-Yahia; Gautam Das; Cong Yu

Collaborative rating sites such as IMDB and Yelp have become rich resources that users consult to form judgments about and choose from among competing items. Most of these sites either provide a plethora of information for users to interpret all by themselves or a simple overall aggregate. Such aggregates (e.g., average rating over all users who have rated an item, aggregates along pre-defined dimensions, etc.) cannot help a user quickly decide the desirability of an item. In this paper, we build a system, MapRat, that allows a user to explore multiple carefully chosen aggregate analytic details over a set of user demographics that meaningfully explain the ratings associated with item(s) of interest. MapRat allows a user to systematically explore, visualize, and understand user rating patterns of input item(s) so as to make an informed decision quickly. In the demo, participants are invited to explore collaborative movie ratings for popular movies.

very large data bases | 2014

An expressive framework and efficient algorithms for the analysis of collaborative tagging

Mahashweta Das; Saravanan Thirumuruganathan; Sihem Amer-Yahia; Gautam Das; Cong Yu

The rise of Web 2.0 is signaled by sites such as Flickr, del.icio.us, and YouTube, and social tagging is essential to their success. A typical tagging action involves three components: user, item (e.g., photos in Flickr), and tags (i.e., words or phrases). Analyzing how tags are assigned by certain users to certain items has important implications in helping users search for desired information. In this paper, we develop a dual mining framework to explore tagging behavior. This framework is centered around two opposing measures, similarity and diversity, applied to one or more tagging components, and therefore enables a wide range of analysis scenarios, such as characterizing similar users tagging diverse items with similar tags or diverse users tagging similar items with diverse tags. By adopting different concrete measures for similarity and diversity in the framework, we show that a wide range of concrete analysis problems can be defined and that they are NP-Complete in general. We design four sets of efficient algorithms for solving many of those problems and demonstrate, through comprehensive experiments over real data, that our algorithms significantly outperform the exact brute-force approach without compromising analysis result quality.

Proceedings of the Fourth International Workshop on Graph Data Management Experiences and Systems | 2016

A hybrid solution for mixed workloads on dynamic graphs

Mahashweta Das; Alkis Simitsis; Kevin Wilkinson

The scale and significance of graph-structured data today have led to the development of graph management systems that are optimized either for graph navigation requests or graph analytic requests. We present a general-purpose graph system that provides high performance concurrently for both navigation and analytic requests. In addition, it supports highly dynamic graphs wherein vertices and edges are added or deleted and properties are modified. Our solution employs a hybrid architecture comprising two graph engines, one for each workload, with a synchronization unit to manage updates and a federation layer to present the hybrid system as a single API to graph applications. We develop a proof-of-concept, describe its implementation in detail, and present experimental results that demonstrate its potential.

international conference on management of data | 2015

The TagAdvisor: Luring the Lurkers to Review Web Items

Azade Nazi; Mahashweta Das; Gautam Das

The increasing popularity and widespread use of online review sites over the past decade has motivated businesses of all types to possess an expansive arsenal of user feedback (preferably positive) in order to mark their reputation and presence on the Web. Though a significant proportion of purchasing decisions today are driven by average numeric scores (e.g., movie rating in IMDB), detailed reviews are critical for activities such as buying an expensive digital SLR camera, reserving a vacation package, etc. Since writing a detailed review for a product (or a service) is usually time-consuming and may not offer any incentive, the number of useful reviews available on the Web remains small. The corpus of reviews available at our disposal for making informed decisions also suffers from spam and misleading content, typographical and grammatical errors, etc. In this paper, we address the problem of how to engage the lurkers (i.e., people who read reviews but never take the time and effort to write one) to participate and write online reviews by systematically simplifying the reviewing task. Given a user and an item that she wants to review, the task is to identify the top-k meaningful phrases (i.e., tags) from the set of all tags (i.e., available user feedback for items) that, when advised, would help her review an item easily. We refer to it as the TagAdvisor problem, and formulate it as a general constrained optimization goal. Our framework is centered around three measures: relevance (i.e., how well the result set of tags describes an item to a user), coverage (i.e., how well the result set of tags covers the different aspects of an item), and polarity (i.e., how well sentiment is attached to the result set of tags), in order to help a user review an item satisfactorily. By adopting different definitions of coverage, we identify two concrete problem instances that enable a wide range of real-world scenarios. We show that these problems are NP-hard and develop practical algorithms with theoretical bounds to solve them efficiently. We conduct detailed experiments on synthetic and real data crawled from the web to validate the utility of our problem and effectiveness of our solutions.

very large data bases | 2016

AD-WIRE: add-on for web item reviewing system

Rajeshkumar Kannapalli; Azade Nazi; Mahashweta Das; Gautam Das

Over the past few decades, as purchasing options moved online, the use and popularity of online review sites have increased as well. Although a large share of purchasing decisions today are driven by numeric scores (e.g., product ratings), detailed reviews play an important role for activities like purchasing an expensive DSLR camera. Since writing a detailed review for an item is usually time-consuming, the number of reviews available on the Web remains small. In this paper, we build a system, AD-WIRE, that, given a user and an item, identifies the top-k meaningful tags to help her review the item easily. AD-WIRE allows a user to compose her review by quickly selecting from among the set of returned tags or by writing her own review. AD-WIRE also visualizes the dependency of the tags on different aspects of an item so a user can make an informed decision quickly. The system can be used for different types of products. The current demonstration is built to explore the review-writing process for mobile phones.

very large data bases | 2015

Structured analytics in social media

Mahashweta Das; Gautam Das

The rise of social media has turned the Web into an online community where people connect, communicate, and collaborate with each other. Structured analytics in social media is the process of discovering the structure of the relationships emerging from this social media use. It focuses on identifying the users involved, the activities they undertake, the actions they perform, and the items (e.g., movies, restaurants, blogs, etc.) they create and interact with. There are two key challenges facing these tasks: how to organize and model social media content, which is often unstructured in its raw form, in order to employ structured analytics on it; and how to employ analytics algorithms to capture both explicit link-based relationships and implicit behavior-based relationships. In this tutorial, we systematize and summarize the research so far in analyzing social interactions between users and items on the Web from data mining and database perspectives. We start with a general overview of the topic, including a discussion of various exciting and practical applications. Then, we discuss the state of the art for modeling the data, formalizing the mining task, developing the algorithmic solutions, and evaluating on real datasets. We also emphasize open problems and challenges for future research in the area of structured analytics and social media.

international conference on management of data | 2013

Exploratory mining of collaborative social content

Mahashweta Das

Proceedings of the VLDB Endowment | 2011

MRI: Meaningful Interpretations of Collaborative Ratings

Mahashweta Das; Sihem Amer-Yahia; Gautam Das; Cong Yu
