Publication


Featured research published by Prateek Jain.


European Semantic Web Conference | 2014

User Interests Identification on Twitter Using a Hierarchical Knowledge Base

Pavan Kapanipathi; Prateek Jain; Chitra Venkataramani; Amit P. Sheth

Twitter, due to its massive growth as a social networking platform, has been in focus for the analysis of its user-generated content for personalization and recommendation tasks. A common challenge across these tasks is identifying user interests from tweets. Semantic enrichment of Twitter posts, to determine user interests, has been an active area of research in the recent past. These approaches typically use available public knowledge-bases (such as Wikipedia) to spot entities and create entity-based user profiles. However, exploitation of such knowledge-bases to create richer user profiles is yet to be explored. In this work, we leverage hierarchical relationships present in knowledge-bases to infer user interests expressed as a Hierarchical Interest Graph. We argue that the hierarchical semantics of concepts can enhance existing systems to personalize or recommend items at varying levels of conceptual abstraction. We demonstrate the effectiveness of our approach through a user study, which shows that, on average, approximately eight of the top ten weighted hierarchical interests in the graph are relevant to a user’s interests.


Geospatial Semantics and the Semantic Web | 2011

SPARQL-ST: Extending SPARQL to Support Spatiotemporal Queries

Matthew Perry; Prateek Jain; Amit P. Sheth

Spatial and temporal data is plentiful on the Web, and Semantic Web technologies have the potential to make this data more accessible and more useful. Semantic Web researchers have consequently made progress towards better handling of spatial and temporal data. SPARQL, the W3C-recommended query language for RDF, does not adequately support complex spatial and temporal queries. In this work, we present the SPARQL-ST query language. SPARQL-ST is an extension of SPARQL for complex spatiotemporal queries. We present a formal syntax and semantics for SPARQL-ST. In addition, we describe a prototype implementation of SPARQL-ST and demonstrate the scalability of this implementation with a performance study using large real-world and synthetic RDF datasets.


International Conference on Semantic Systems | 2013

A statistical and schema independent approach to identify equivalent properties on linked data

Kalpa Gunaratna; Krishnaprasad Thirunarayan; Prateek Jain; Amit P. Sheth; Sanjaya Wijeratne

The Linked Open Data (LOD) cloud has gained significant attention in the Semantic Web community recently. Currently it consists of approximately 295 interlinked datasets with over 50 billion triples, including 500 million links, and continues to expand in size. This vast source of structured information has the potential to have a significant impact on knowledge-based applications. However, a key impediment to the use of the LOD cloud is limited support for data integration tasks over concepts, instances, and properties. Efforts to address this limitation over properties have focused on matching data-type properties across datasets; however, matching of object-type properties has not received similar attention. We present an approach that can automatically match object-type properties across linked datasets, primarily exploiting and bootstrapping from entity co-reference links such as owl:sameAs. Our evaluation, using sample instance sets taken from the Freebase, DBpedia, LinkedMDB, and DBLP datasets covering multiple domains, shows that our approach matches properties with high precision and recall (on average, an F-measure gain of 57% to 78%).
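
The idea of bootstrapping property matches from co-reference links can be illustrated with a small sketch. This is not the authors' algorithm: it only assumes that two object-type properties are candidate equivalents when they connect subject and object pairs that are linked by owl:sameAs, and it counts how often that happens. The function name `match_properties`, the `min_support` threshold, and the toy triples are all hypothetical.

```python
from collections import Counter


def match_properties(triples_a, triples_b, same_as, min_support=2):
    """Count property pairs that link co-referent subject/object pairs.

    triples_a, triples_b: iterables of (subject, property, object) tuples.
    same_as: dict mapping an entity in dataset A to its equivalent in B
             (a stand-in for owl:sameAs links).
    min_support: hypothetical cut-off on the number of supporting pairs.
    """
    # Index dataset B by (subject, object) so lookups are constant time.
    index_b = {}
    for s, p, o in triples_b:
        index_b.setdefault((s, o), set()).add(p)

    support = Counter()
    for s, p, o in triples_a:
        s_b, o_b = same_as.get(s), same_as.get(o)
        if s_b is None or o_b is None:
            continue  # no co-reference link, so no evidence either way
        for p_b in index_b.get((s_b, o_b), ()):
            support[(p, p_b)] += 1

    return {pair: n for pair, n in support.items() if n >= min_support}


# Toy data with made-up prefixes and identifiers.
triples_a = [
    ("a:Inception", "a:director", "a:Nolan"),
    ("a:Dunkirk", "a:director", "a:Nolan"),
    ("a:Inception", "a:starring", "a:DiCaprio"),
]
triples_b = [
    ("b:inception", "b:directedBy", "b:nolan"),
    ("b:dunkirk", "b:directedBy", "b:nolan"),
    ("b:inception", "b:actor", "b:dicaprio"),
]
same_as = {
    "a:Inception": "b:inception",
    "a:Dunkirk": "b:dunkirk",
    "a:Nolan": "b:nolan",
    "a:DiCaprio": "b:dicaprio",
}

matches = match_properties(triples_a, triples_b, same_as)
```

With these toy inputs, only the pair (`a:director`, `b:directedBy`) reaches the support threshold; a real system would need a more careful statistical measure than a raw count.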


Proceedings of the 2013 IEEE/WIC/ACM International Joint Conferences on Web Intelligence (WI) and Intelligent Agent Technologies (IAT) | 2013

Automatic Domain Identification for Linked Open Data

Sarasi Lalithsena; Pascal Hitzler; Amit P. Sheth; Prateek Jain

Linked Open Data (LOD) has emerged as one of the largest collections of interlinked structured datasets on the Web. Although the adoption of such datasets for applications is increasing, identifying relevant datasets for a specific task or topic is still challenging. As an initial step to make such identification easier, we provide an approach to automatically identify the topic domains of given datasets. Our method utilizes existing knowledge sources, more specifically Freebase, and we present an evaluation which validates the topic domains we can identify with our system. Furthermore, we evaluate the effectiveness of identified topic domains for the purpose of finding relevant datasets, thus showing that our approach improves reusability of LOD datasets.


International Conference on Big Data | 2013

Constructing consumer profiles from social media data

Mauricio A. Hernández; Kirsten Hildrum; Prateek Jain; Rohit Wagle; Bogdan Alexe; Rajasekar Krishnamurthy; Ioana Stanoi; Chitra Venkatramani

Social media is playing a growing role in providing consumer feedback to companies about their products and services. To maximize the benefit of this feedback, companies want to know how the different consumer segments they are interested in, such as parents, frequent travelers, and comic book fans, react to their products and campaigns. In this paper, we describe how constructing consumer profiles is valuable to obtain such insights. We present the challenges in analyzing noisy social media data and the techniques we employ for building the profiles. We also present detailed experimental results from the analysis of over seven billion messages to construct profiles of over 100 million consumers. We demonstrate how consumer profiles can help in understanding consumer feedback by different key segments using a TV show analysis scenario.


IBM Journal of Research and Development | 2014

Social media and customer behavior analytics for personalized customer engagements

Stephen J. Buckley; Markus Ettl; Prateek Jain; Ronny Luss; Marek Petrik; Rajesh Kumar Ravi; Chitra Venkatramani

Companies in various industries, including travel, hospitality, and retail, increasingly focus on improving customer relationships and customer loyalty. In this paper, we propose a new systems architecture that combines the textual content in social media messages with product information, such as the descriptions summarized in catalogs, in order to provide marketing campaign recommendations. Companies commonly build user profiles based on purchase histories and other customer-specific information; however, when dealing with social media, we often cannot match the social media users with the customers. In this regard, we address the problem of targeting individual social media messages for which no personalized profile information can be retrieved. Our solution combines two disparate computational toolboxes for text analytics, natural language processing and machine learning, in order to select social media users to target with topic-specific advertisements. Natural language processing is used to analyze the context of social media messages, and machine learning is used to analyze product information, with the goal of matching social media messages to products and ranking potential advertisements. To demonstrate the framework, we detail a real-world application in the travel and tourism industry using Twitter® as the social media platform.


International World Wide Web Conferences | 2014

Hierarchical interest graph from tweets

Pavan Kapanipathi; Prateek Jain; Chitra Venkataramani; Amit P. Sheth

Industry and researchers have identified numerous ways to monetize microblogs for personalization and recommendation. A common challenge across these different works is the identification of user interests. Although techniques have been developed to address this challenge, a flexible approach that spans multiple levels of granularity in user interests has not been forthcoming. In this work, we focus on exploiting hierarchical semantics of concepts to infer richer user interests expressed as a Hierarchical Interest Graph. To create such graphs, we utilize users' tweets to first ground potential user interests to structured background knowledge such as the Wikipedia Category Graph. We then adapt spreading activation theory to assign a user interest score to each category in the hierarchy. The Hierarchical Interest Graph comprises not only users' explicitly mentioned interests determined from Twitter, but also their implicit interest categories inferred from the background knowledge source.
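
A minimal sketch of spreading activation over a category hierarchy may help make the scoring step concrete. It is an illustration under stated assumptions, not the activation function used in the paper: each entity mentioned in tweets activates its categories with its mention count, and the activation is passed to parent categories with a fixed decay per hop. The names `spread_activation`, `decay`, and `max_hops`, and the toy hierarchy, are hypothetical.

```python
from collections import defaultdict


def spread_activation(entity_counts, entity_to_categories, parent_of,
                      decay=0.5, max_hops=3):
    """Propagate entity mention counts up a category hierarchy.

    entity_counts: dict of entity -> number of mentions in a user's tweets.
    entity_to_categories: dict of entity -> list of directly linked categories.
    parent_of: dict of category -> list of parent categories.
    decay: assumed fraction of activation passed on at each hop.
    max_hops: assumed limit on how many levels are activated.
    """
    scores = defaultdict(float)
    for entity, count in entity_counts.items():
        # The frontier holds the activation arriving at each category this hop.
        frontier = {c: float(count) for c in entity_to_categories.get(entity, [])}
        for _ in range(max_hops):
            next_frontier = defaultdict(float)
            for category, activation in frontier.items():
                scores[category] += activation
                for parent in parent_of.get(category, []):
                    next_frontier[parent] += activation * decay
            frontier = next_frontier
    return dict(scores)


# Toy inputs; the entities and categories are made up for illustration.
entity_counts = {"Lionel Messi": 2, "NBA": 1}
entity_to_categories = {"Lionel Messi": ["Football"], "NBA": ["Basketball"]}
parent_of = {"Football": ["Sports"], "Basketball": ["Sports"]}

interest_scores = spread_activation(entity_counts, entity_to_categories, parent_of)
```

Here the user never mentions "Sports" explicitly, yet the category receives a score because both mentioned entities sit beneath it in the hierarchy, which is the kind of implicit interest the abstract describes.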


ACM SIGWEB Newsletter | 2013

Linked open data alignment & querying

Prateek Jain

Prateek Jain is a Research Staff Member at IBM T.J. Watson Research Center at Yorktown, NY. Prateek earned his Ph.D. in Computer Science in July 2012 under the supervision of Prof. Amit Sheth and Prof. Pascal Hitzler at the Ohio Center of Excellence in Knowledge-enabled Computing of Wright State University in Dayton, Ohio. He has been involved in the SIGWEB and Semantic Web community by contributing publications and participating in several conferences, including ISWC, Hypertext, and ESWC. Prateek’s research interests are in the area of data management, integration, and querying. He is currently exploring the use of background knowledge for social network data analytics.


Pattern Recognition and Machine Intelligence | 2005

Multi-objective optimization for adaptive web site generation

Prateek Jain; Pabitra Mitra

Designing web sites is a complex problem. Adaptive sites are those which improve themselves by learning from user access patterns. In this paper, we consider the problem of index page synthesis for an adaptive website and frame it as a new type of multi-objective optimization problem. We give a solution to index page synthesis which uses the popular clustering algorithm DBSCAN along with NSGA-II, an evolutionary algorithm, to find the best index pages for a website. Our experiments show that very good candidate index pages can be generated automatically, and that our technique outperforms various existing methods such as PageGather, K-Means, and Hierarchical Agglomerative Clustering.
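
Multi-objective methods such as NSGA-II rank candidates by Pareto dominance rather than by a single score. The sketch below shows only that dominance test and the resulting non-dominated set; it is not NSGA-II itself and does not reproduce the paper's objectives. The two objective values per candidate page and the page names are hypothetical, and both objectives are assumed to be maximized.

```python
def dominates(a, b):
    """Return True if a is no worse than b on every objective and strictly
    better on at least one (all objectives are maximized)."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(candidates):
    """Return the names of candidates that no other candidate dominates.

    candidates: dict of name -> tuple of objective values.
    """
    return [
        name
        for name, score in candidates.items()
        if not any(
            dominates(other, score)
            for other_name, other in candidates.items()
            if other_name != name
        )
    ]


# Hypothetical candidate index pages scored on two made-up objectives.
candidates = {
    "page_a": (0.9, 0.2),
    "page_b": (0.5, 0.6),
    "page_c": (0.4, 0.5),
    "page_d": (0.2, 0.9),
}

front = pareto_front(candidates)
```

In this toy example `page_c` is dropped because `page_b` is better on both objectives, while the remaining three pages represent different trade-offs and all stay on the front.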


Archive | 2014

Analysis of social media messages

Stephen J. Buckley; Markus Ettl; Matthias O. Frey; Prateek Jain; Ronny Luss; Marek Petrik; Rajesh Kumar Ravi; Chitra Venkatramani
