Kenneth Wai-Ting Leung

Hong Kong University of Science and Technology

Publication

Featured research published by Kenneth Wai-Ting Leung.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2011

CLR: a collaborative location recommendation framework based on co-clustering

Kenneth Wai-Ting Leung; Dik Lun Lee; Wang-Chien Lee

GPS data tracked on mobile devices contains rich information about human activities and preferences. In this paper, GPS data is used in location-based services (LBSs) to provide collaborative location recommendations. We observe that most existing LBSs provide location recommendations by clustering the User-Location matrix. Since the User-Location matrix created from GPS data is huge, there are two major problems with these methods. First, the number of similar locations that need to be considered in computing the recommendations can be very large, so identifying the truly relevant locations among the many candidates is challenging. Second, clustering a large matrix is time-consuming, so completely re-clustering the whole matrix whenever new GPS data arrives is infeasible. To tackle these two problems, we propose the Collaborative Location Recommendation (CLR) framework. By considering activities (i.e., temporal preferences) and different user classes (i.e., Pattern Users, Normal Users, and Travelers) in the recommendation process, CLR is capable of generating more precise and refined recommendations than the existing methods. Moreover, CLR employs a dynamic clustering algorithm, CADC, to cluster the trajectory data efficiently into groups of similar users, similar activities, and similar locations, supporting incremental updates of the groups when new GPS trajectory data arrives. We evaluate CLR with a real-world GPS dataset and confirm that the CLR framework provides more accurate location recommendations than the existing methods.

IEEE Transactions on Knowledge and Data Engineering | 2010

Deriving Concept-Based User Profiles from Search Engine Logs

Kenneth Wai-Ting Leung; Dik Lun Lee

User profiling is a fundamental component of any personalization application. Most existing user profiling strategies are based on objects that users are interested in (i.e., positive preferences), but not on objects that users dislike (i.e., negative preferences). In this paper, we focus on search engine personalization and develop several concept-based user profiling methods that are based on both positive and negative preferences. We evaluate the proposed methods against our previously proposed personalized query clustering method. Experimental results show that profiles which capture and utilize both the users' positive and negative preferences perform the best. An important result from the experiments is that profiles with negative preferences can increase the separation between similar and dissimilar queries. The separation provides a clear threshold at which an agglomerative clustering algorithm can terminate, which improves the overall quality of the resulting query clusters.
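The role of the termination threshold can be illustrated with a small sketch. This is not the method from the paper: the dictionary representation of a query's concept weights, the cosine similarity, the centroid-based merging, and all function names are assumptions chosen for illustration. Negative weights stand for concepts the user dislikes.

```python
from itertools import combinations
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse concept-weight dictionaries."""
    dot = sum(w * b.get(c, 0.0) for c, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(cluster, profiles):
    """Average the concept weights of all queries in a cluster."""
    total = {}
    for query in cluster:
        for concept, weight in profiles[query].items():
            total[concept] = total.get(concept, 0.0) + weight
    return {c: w / len(cluster) for c, w in total.items()}


def cluster_queries(profiles, threshold):
    """Merge the most similar pair of clusters until no pair reaches the threshold."""
    clusters = [[q] for q in sorted(profiles)]
    while len(clusters) > 1:
        best_sim, (i, j) = max(
            (cosine(centroid(clusters[i], profiles),
                    centroid(clusters[j], profiles)), (i, j))
            for i, j in combinations(range(len(clusters)), 2)
        )
        if best_sim < threshold:
            break  # no remaining pair is similar enough: terminate
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical concept weights; negative values are disliked concepts.
profiles = {
    "apple iphone": {"phone": 1.0, "fruit": -0.5},
    "apple store": {"phone": 0.8, "fruit": -0.4},
    "apple pie": {"fruit": 1.0, "phone": -0.6},
}
print(cluster_queries(profiles, threshold=0.5))
```

With positive weights only, the similarity between "apple iphone" and "apple pie" in this toy data would be zero. With the negative weights it becomes clearly negative, which widens the gap between similar and dissimilar pairs and makes the choice of threshold less delicate.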

Conference on Information and Knowledge Management | 2011

Context-aware search personalization with concept preference

Di Jiang; Kenneth Wai-Ting Leung; Wilfred Ng

As the size of the web is growing rapidly, a well-recognized challenge in developing web search engines is to optimize the search results towards each user's preference. In this paper, we propose and develop a new personalization framework that captures the user's preference in the form of concepts obtained by mining web search contexts. The search context consists of both the user's clickthroughs and the query reformulations that satisfy some specific information need, and it provides more information than each individual query in a search session. We also propose a method that discovers search contexts in one pass over the raw search query log. Using the information in the search context, we develop eight strategies that derive conceptual preference judgments. A learning-to-rank approach is employed to combine the derived preference judgments, and a Context-Aware User Profile (CAUP) is then created. We further employ CAUP to adapt a personalized ranking function. Experimental results demonstrate that our approach captures accurate and comprehensive user preferences and, in terms of Top-N result quality, outperforms existing concept-based personalization approaches that do not use search contexts.

International ACM SIGIR Conference on Research and Development in Information Retrieval | 2014

Collaborative personalized Twitter search with topic-language models

Jan Vosecky; Kenneth Wai-Ting Leung; Wilfred Ng

The vast amount of real-time and social content in microblogs results in an information overload for users when searching microblog data. Given the user's search query, delivering content that is relevant to her interests is a challenging problem. Traditional methods for personalized Web search are insufficient in the microblog domain because of the diversity of topics, the sparseness of user data, and the highly social nature of the medium. In particular, social interactions between users need to be considered in order to accurately model users' interests, alleviate data sparseness, and tackle the cold-start problem. In this paper, we therefore propose a novel framework for Collaborative Personalized Twitter Search. At its core, we develop a collaborative user model, which exploits the user's social connections in order to obtain a comprehensive account of her preferences. We then propose a novel user model structure to manage the topical diversity in Twitter and to enable semantic-aware query disambiguation. Our framework integrates a variety of information about the user's preferences in a principled manner. A thorough evaluation is conducted using two personalized Twitter search query logs, demonstrating a superior ranking performance of our framework compared with state-of-the-art baselines.

Conference on Information and Knowledge Management | 2013

Dynamic multi-faceted topic discovery in Twitter

Jan Vosecky; Di Jiang; Kenneth Wai-Ting Leung; Wilfred Ng

Microblogging platforms, such as Twitter, already play an important role in cultural, social and political events around the world. Discovering high-level topics from social streams is therefore important for many downstream applications. However, traditional text mining methods that rely on the bag-of-words model are insufficient to uncover the rich semantics and temporal aspects of topics in Twitter. In particular, topics in Twitter are inherently dynamic and often focus on specific entities, such as people or organizations. In this paper, we therefore propose a method for mining multifaceted topics from Twitter streams. The Multi-Faceted Topic Model (MfTM) is proposed to jointly model latent semantics among terms and entities and to capture the temporal characteristics of each topic. We develop an efficient online inference method for MfTM, which enables our model to be applied to large-scale and streaming data. Our experimental evaluation shows the effectiveness and efficiency of our model compared with state-of-the-art baselines. We further demonstrate the effectiveness of our framework in the context of tweet clustering.

International Conference on Data Engineering | 2014

Personalized Query Suggestion With Diversity Awareness

Di Jiang; Kenneth Wai-Ting Leung; Jan Vosecky; Wilfred Ng

Query suggestion is an important functionality provided by search engines to facilitate users' information seeking. Existing query suggestion methods usually focus on recommending the queries that are most relevant to the input query. However, such a relevance-oriented strategy cannot effectively handle query uncertainty, a common scenario in which the input query can be interpreted in multiple different ways. To alleviate this problem, the concepts of diversification and personalization have been individually introduced to query suggestion systems. These two concepts are often seen as incompatible alternatives, because diversification considers multiple aspects of the input query to maximize the probability that some query aspect is relevant to the user, while personalization aims to adapt the suggestions to a specific aspect that aligns with the preference of a specific user. In this paper, we refute this antagonistic view and propose a new query suggestion paradigm, Personalized Query Suggestion With Diversity Awareness (PQS-DA), to effectively combine diversification and personalization into one unified framework. In PQS-DA, the suggested queries are effectively diversified to cover different potential facets of the input query, while the ranking of the suggested queries is personalized to ensure that the top ones are those that align with a user's personal preference. We evaluate PQS-DA on a real-life search engine query log against several state-of-the-art methods with respect to a variety of metrics. The experimental results verify our hypothesis that diversification and personalization can be effectively integrated and are able to enhance each other within the PQS-DA framework, which significantly outperforms several strong baselines with respect to a series of metrics.
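The two-stage idea of diversifying first and personalizing the order afterwards can be sketched in a few lines. This is only an illustration of the general principle, not PQS-DA itself: the candidate records, the `facet` and `relevance` fields, the greedy facet-coverage rule, and the per-facet preference weights are all hypothetical.

```python
def diversify(candidates, k):
    """Greedily pick k suggestions, preferring facets that are not yet covered."""
    remaining = sorted(candidates, key=lambda c: -c["relevance"])
    chosen, covered = [], set()
    while remaining and len(chosen) < k:
        # Most relevant candidate from an uncovered facet, else the most relevant left.
        pick = next((c for c in remaining if c["facet"] not in covered), remaining[0])
        chosen.append(pick)
        covered.add(pick["facet"])
        remaining.remove(pick)
    return chosen


def personalize(suggestions, facet_preference):
    """Order suggestions by the user's facet preference, then by relevance."""
    return sorted(
        suggestions,
        key=lambda c: (-facet_preference.get(c["facet"], 0.0), -c["relevance"]),
    )


# Hypothetical candidate suggestions for the ambiguous query "jaguar".
candidates = [
    {"query": "jaguar car price", "facet": "car", "relevance": 0.9},
    {"query": "jaguar xf review", "facet": "car", "relevance": 0.8},
    {"query": "jaguar animal facts", "facet": "animal", "relevance": 0.6},
    {"query": "jaguar os x", "facet": "software", "relevance": 0.5},
]
user_preference = {"animal": 0.7, "car": 0.2}  # assumed per-facet weights

ranked = personalize(diversify(candidates, k=3), user_preference)
print([c["query"] for c in ranked])
```

In this toy run the diversification step keeps one suggestion per facet rather than two car-related ones, and the personalization step then moves the facet this user prefers to the top.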

ACM Transactions on Internet Technology | 2014

Integrating Social and Auxiliary Semantics for Multifaceted Topic Modeling in Twitter

Jan Vosecky; Di Jiang; Kenneth Wai-Ting Leung; Kai Xing; Wilfred Ng

Microblogging platforms, such as Twitter, have already played an important role in recent cultural, social and political events. Discovering latent topics from social streams is therefore important for many downstream applications, such as clustering, classification or recommendation. However, traditional topic models that rely on the bag-of-words assumption are insufficient to uncover the rich semantics and temporal aspects of topics in Twitter. In particular, microblog content is often influenced by external information sources, such as Web documents linked from Twitter posts, and often focuses on specific entities, such as people or organizations. These external sources provide useful semantics to understand microblogs and we generally refer to these semantics as auxiliary semantics. In this article, we address the mentioned issues and propose a unified framework for Multifaceted Topic Modeling from Twitter streams. We first extract social semantics from Twitter by modeling the social chatter associated with hashtags. We further extract terms and named entities from linked Web documents to serve as auxiliary semantics during topic modeling. The Multifaceted Topic Model (MfTM) is then proposed to jointly model latent semantics among the social terms from Twitter, auxiliary terms from the linked Web documents and named entities. Moreover, we capture the temporal characteristics of each topic. An efficient online inference method for MfTM is developed, which enables our model to be applied to large-scale and streaming data. Our experimental evaluation shows the effectiveness and efficiency of our model compared with state-of-the-art baselines. We evaluate each aspect of our framework and show its utility in the context of tweet clustering.

World Wide Web | 2016

Query intent mining with multiple dimensions of web search data

Di Jiang; Kenneth Wai-Ting Leung; Wilfred Ng

Understanding the users’ latent intents behind search queries is critical for search engines. Hence, there has been increasing attention on how to effectively mine the intents of search queries by analyzing search engine query logs. However, we observe that the information richness of query logs has not been fully utilized so far, and this underuse heavily limits the performance of the existing methods. In this paper, we tackle the problem of query intent mining by taking full advantage of the information richness of query logs from a multi-dimensional perspective. Specifically, we capture the latent relations between search queries via three different dimensions: the URL dimension, the session dimension and the term dimension. We first propose the Result-Oriented Framework (ROF), which is easy to implement and significantly improves both the precision and the recall of query intent mining. We further propose the Topic-Oriented Framework (TOF) in order to significantly reduce the online time and memory consumption of query intent mining. TOF employs the Query Log Topic Model (QLTM), which derives latent topics from the query log to integrate the information of the three dimensions in a principled way. The latent topics are considered low-dimensional descriptions of the query relations and serve as the basis of efficient online query intent mining. We conduct extensive experiments on a major commercial search engine query log. Experimental results show that the two frameworks significantly outperform the state-of-the-art methods with respect to a variety of metrics.

Extending Database Technology | 2013

Panorama: a semantic-aware application search framework

Di Jiang; Jan Vosecky; Kenneth Wai-Ting Leung; Wilfred Ng

Third-party applications (commonly referred to as apps) have proliferated on the web and mobile platforms in recent years. The tremendous number of apps available in app marketplaces suggests the necessity of designing effective app search engines. However, existing app search engines typically ignore the latent semantics in the app corpus and thus usually fail to provide high-quality app snippets and effective app rankings. In this paper, we present a novel framework named Panorama to provide independent search results for Android apps with semantic awareness. We first propose the App Topic Model (ATM) to discover the latent semantics from the app corpus. Based on the discovered semantics, we tackle two central challenges that are faced by current app search engines: (1) how to generate concise and informative snippets for apps and (2) how to rank apps effectively with respect to search queries. To handle the first challenge, we propose several new metrics for measuring the quality of the sentences in app descriptions and develop a greedy algorithm with a fixed probability guarantee of near-optimal performance for app snippet generation. To handle the second challenge, we propose a variety of new features for app ranking and also design a new type of inverted index to support efficient Top-k app retrieval. We conduct extensive experiments on a large-scale data collection of Android apps and build an app search engine prototype for human-based performance evaluation. The proposed framework demonstrates superior performance against several strong baselines with respect to different metrics.

Extending Database Technology | 2011

Constructing concept relation network and its application to personalized web search

Kenneth Wai-Ting Leung; Hing Yuet Fung; Dik Lun Lee

Search engines are very effective in finding relevant pages for a query. When a query is ambiguous, the search engine returns a mix of results for different semantic interpretations of the query. This paper proposes a method to extract concepts from the search results of a query, and, treating each retrieved concept as a query, it recursively constructs a network of concepts related to different semantic interpretations of the query. By connecting networks of concepts obtained from different queries, a large integrated network, called the Concept Relation Network (CRN), is formed. CRN is a semantic network that can be automatically constructed and maintained using existing search engines (e.g., Google) on the web. Taking advantage of large-scale commercial search engines, CRN is able to derive a large number of highly coherent, highly related concepts. We study several ways to weight the connections between the concepts in CRN. By distinguishing between location concepts and content concepts, we analyze the ambiguity of each type of concept individually. We also propose to extract concept clusters from CRN based on different graph topologies. We observe that complete subgraphs in CRN can be used to effectively determine semantically related concepts. Finally, we apply CRN to search engine personalization. Experimental results show that applying CRN to a concept-based personalization algorithm significantly improves precision compared to the baseline.
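The observation about complete subgraphs can be made concrete with a small sketch that enumerates maximal cliques in an undirected concept graph. The abstract does not say how the paper extracts them; the sketch below uses the standard Bron-Kerbosch procedure as a stand-in, and the adjacency-dictionary representation and the toy concepts are assumptions for illustration.

```python
def maximal_cliques(graph):
    """Yield every maximal complete subgraph of an undirected graph (Bron-Kerbosch).

    `graph` maps each node to the set of its neighbours.
    """
    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            yield sorted(clique)  # nothing can extend this clique: it is maximal
            return
        for node in list(candidates):
            yield from expand(
                clique | {node},
                candidates & graph[node],
                excluded & graph[node],
            )
            candidates = candidates - {node}
            excluded = excluded | {node}

    yield from expand(set(), set(graph), set())


# Hypothetical concept graph around the ambiguous concept "apple".
concept_graph = {
    "apple": {"iphone", "mac", "fruit"},
    "iphone": {"apple", "mac"},
    "mac": {"apple", "iphone"},
    "fruit": {"apple", "pie"},
    "pie": {"fruit"},
}
for clique in sorted(maximal_cliques(concept_graph)):
    print(clique)
```

In the toy graph, "apple", "iphone" and "mac" are mutually connected and come out as one group, while "apple" and "fruit" form a separate pair, so the two interpretations of the ambiguous concept end up in different cliques.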
