Qiankun Zhao

Nanyang Technological University

Publication

Featured research published by Qiankun Zhao.

international world wide web conferences | 2006

Time-dependent semantic similarity measure of queries using historical click-through data

Qiankun Zhao; Steven C. H. Hoi; Tie-Yan Liu; Sourav S. Bhowmick; Michael R. Lyu; Wei-Ying Ma

It has become a promising direction to measure similarity of Web search queries by mining the increasing amount of click-through data logged by Web search engines, which record the interactions between users and the search engines. Most existing approaches employ the click-through data for similarity measure of queries with little consideration of the temporal factor, while the click-through data is often dynamic and contains rich temporal information. In this paper we present a new framework for a time-dependent query semantic similarity model that exploits the temporal characteristics of historical click-through data. The intuition is that more accurate semantic similarity values between queries can be obtained by taking into account the timestamps of the log data. With a set of user-defined calendar schemas and calendar patterns, our time-dependent query similarity model is constructed using the marginalized kernel technique, which can exploit both explicit similarity and implicit semantics from the click-through data effectively. Experimental results on a large set of click-through data acquired from a commercial search engine show that our time-dependent query similarity model is more accurate than the existing approaches. Moreover, we observe that our time-dependent query similarity model can, to some extent, reflect real-world semantics such as real-world events that are happening over time.
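To make the intuition concrete, here is a minimal sketch that computes a separate similarity value for each time slot instead of one value over the whole log. It is not the marginalized kernel model from the paper: it swaps in plain cosine similarity over clicked URLs, and the record layout `(timestamp, query, url)`, the `slot_of` callback, and all function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from datetime import datetime
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse click-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def similarity_by_slot(log, q1, q2, slot_of):
    """Return {slot: similarity of q1 and q2 within that slot}.

    log     -- iterable of (timestamp, query, clicked_url) records (assumed layout)
    slot_of -- maps a timestamp to a calendar slot, e.g. its month (hypothetical hook)
    """
    clicks = defaultdict(lambda: defaultdict(Counter))
    for ts, query, url in log:
        clicks[slot_of(ts)][query][url] += 1
    return {slot: cosine(by_query[q1], by_query[q2])
            for slot, by_query in sorted(clicks.items())}


log = [
    (datetime(2006, 1, 5), "firework", "a.com"),
    (datetime(2006, 1, 6), "new year", "a.com"),
    (datetime(2006, 6, 5), "firework", "b.com"),
    (datetime(2006, 6, 6), "new year", "c.com"),
]
sims = similarity_by_slot(log, "firework", "new year", lambda ts: ts.month)
```

In this toy log the two queries share a clicked URL in January but not in June, so the per-slot values differ, which a single time-independent score would hide.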

knowledge discovery and data mining | 2006

Event detection from evolution of click-through data

Qiankun Zhao; Tie-Yan Liu; Sourav S. Bhowmick; Wei-Ying Ma

Previous efforts on event detection from the web have focused primarily on web content and structure data ignoring the rich collection of web log data. In this paper, we propose the first approach to detect events from the click-through data, which is the log data of web search engines. The intuition behind event detection from click-through data is that such data is often event-driven and each event can be represented as a set of query-page pairs that are not only semantically similar but also have similar evolution pattern over time. Given the click-through data, in our proposed approach, we first segment it into a sequence of bipartite graphs based on the user-defined time granularity. Next, the sequence of bipartite graphs is represented as a vector-based graph, which records the semantic and evolutionary relationships between queries and pages. After that, the vector-based graph is transformed into its dual graph, where each node is a query-page pair that will be used to represent real world events. Then, the problem of event detection is equivalent to the problem of clustering the dual graph of the vector-based graph. The clustering process is based on a two-phase graph cut algorithm. In the first phase, query-page pairs are clustered based on the semantic-based similarity such that each cluster in the result corresponds to a specific topic. In the second phase, query-page pairs related to the same topic are further clustered based on the evolution pattern-based similarity such that each cluster is expected to represent a specific event under the specific topic. Experiments with real click-through data collected from a commercial web search engine show that the proposed approach produces high quality results.
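The first two steps described above (segmenting the log by time granularity, then recording how each query-page pair evolves) can be sketched as follows. This is an illustrative simplification, not the paper's vector-based graph or its graph cut clustering: each bipartite graph is stored as a counter of query-page edges, and the record layout, `slot_of` callback, and function names are assumptions.

```python
from collections import Counter


def segment_into_bipartite_graphs(log, num_slots, slot_of):
    """Split the log into one bipartite graph per time slot.

    Each graph is a Counter mapping a (query, page) edge to its click count.
    log     -- iterable of (timestamp, query, page) records (assumed layout)
    slot_of -- maps a timestamp to a slot index in range(num_slots) (hypothetical hook)
    """
    graphs = [Counter() for _ in range(num_slots)]
    for ts, query, page in log:
        graphs[slot_of(ts)][(query, page)] += 1
    return graphs


def evolution_vectors(graphs):
    """Map every query-page pair to its click counts across all slots."""
    pairs = set().union(*graphs)
    return {pair: [g[pair] for g in graphs] for pair in pairs}


# Timestamps are day numbers here; the granularity is one week.
log = [(0, "q1", "p1"), (1, "q1", "p1"), (8, "q1", "p1"), (9, "q2", "p2")]
graphs = segment_into_bipartite_graphs(log, 2, lambda day: day // 7)
vectors = evolution_vectors(graphs)
```

Pairs whose vectors rise and fall together are the candidates that the second clustering phase would group into one event.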

conference on information and knowledge management | 2004

Discovering frequently changing structures from historical structural deltas of unordered XML

Qiankun Zhao; Sourav S. Bhowmick; Mukesh K. Mohania; Yahiko Kambayashi

Recently, a large amount of work has been done in XML data mining. However, we observed that most of the existing works focus on the snapshot XML data, while XML data is dynamic in real applications. To the best of our knowledge, none of the existing works has addressed the issue of mining the history of changes to XML documents. Such mining results can be useful in many applications such as XML change detection, XML indexing, association rule mining, and classification etc. In this paper, we propose a novel approach to discover the frequently changing structures from the sequence of historical structural deltas of unordered XML. To make the structure discovering process efficient, an expressive and compact data model, Historical-Document Object Model (H-DOM), is proposed. Using this model, two basic algorithms, which can discover all the frequently changing structures with only two scans of the XML sequence, are presented. Experimental results show that our algorithms, together with the optimization techniques, are efficient and scalable.
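As a rough illustration of what "frequently changing" can mean, the sketch below counts, for each element path, the fraction of consecutive version pairs in which the path was inserted or deleted. It does not implement H-DOM or the two-scan algorithms from the paper: modelling each version as a set of paths, the `min_ratio` threshold, and the function name are all assumptions.

```python
from collections import Counter


def frequently_changing_paths(versions, min_ratio):
    """Return paths that change in at least min_ratio of the deltas.

    versions  -- list of sets of element paths, one set per document version
                 (a simplified stand-in for unordered XML trees)
    min_ratio -- hypothetical threshold on the fraction of deltas with a change
    """
    changes = Counter()
    for old, new in zip(versions, versions[1:]):
        # The symmetric difference holds paths inserted or deleted in this delta.
        changes.update(old ^ new)
    num_deltas = len(versions) - 1
    return {path for path, n in changes.items() if n / num_deltas >= min_ratio}


versions = [
    {"/a", "/a/b"},
    {"/a"},
    {"/a", "/a/b"},
    {"/a", "/a/b", "/a/c"},
]
frequent = frequently_changing_paths(versions, 0.5)
```

Here `/a/b` changes in two of the three deltas and `/a/c` in only one, so only `/a/b` passes the 0.5 threshold.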

conference on information and knowledge management | 2005

WAM-Miner: in the search of web access motifs from historical web log data

Qiankun Zhao; Sourav S. Bhowmick; Le Gruenwald

Existing web usage mining techniques focus only on discovering knowledge based on the statistical measures obtained from the static characteristics of web usage data. They do not consider the dynamic nature of web usage data. In this paper, we focus on discovering novel knowledge by analyzing the change patterns of historical web access sequence data. We present an algorithm called WAM-MINER to discover Web Access Motifs (WAMs). WAMs are web access patterns that never change or do not change significantly most of the time (if not always) in terms of their support values during a specific time period. WAMs are useful for many applications, such as intelligent web advertisement, web site restructuring, business intelligence, and intelligent web caching.
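The informal definition of a WAM (support that does not change significantly most of the time) can be turned into a simple check. The abstract does not give the paper's actual metrics, so the two thresholds and the function name below are hypothetical stand-ins.

```python
def is_stable(supports, max_change=0.05, min_ratio=0.8):
    """Check whether a pattern's support stays nearly constant over time.

    supports   -- support value of one access pattern in each time period
    max_change -- largest period-to-period change still counted as "no change"
                  (hypothetical threshold)
    min_ratio  -- fraction of transitions that must be stable, modelling
                  "most of the time" (hypothetical threshold)
    """
    deltas = [abs(b - a) for a, b in zip(supports, supports[1:])]
    if not deltas:
        return False  # a single period carries no change history
    stable = sum(d <= max_change for d in deltas)
    return stable / len(deltas) >= min_ratio


steady = is_stable([0.30, 0.31, 0.30, 0.29, 0.30])
volatile = is_stable([0.10, 0.50, 0.10, 0.60])
```

A pattern like the first one would be a candidate motif, while the second changes too much between periods to qualify.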

conference on information and knowledge management | 2008

Characterizing and predicting community members from evolutionary and heterogeneous networks

Qiankun Zhao; Sourav S. Bhowmick; Xin Zheng; Kai Yi

Mining different types of communities from web data has attracted a lot of research effort in recent years. However, none of the existing community mining techniques has taken into account both the dynamic and the heterogeneous nature of web data. In this paper, we propose to characterize and predict community members from the evolution of heterogeneous web data. We first propose a general framework for analyzing the evolution of heterogeneous networks. Then, the academic network, which is extracted from 1 million computer science papers, is used as an example to illustrate the framework. Finally, two example applications of the academic network are presented. Experimental results with a real and very large heterogeneous academic network show that our proposed framework can produce good results in terms of community member recommendation. Also, novel knowledge and insights can be gained by analyzing the community evolution pattern.

european conference on principles of data mining and knowledge discovery | 2004

Mining history of changes to web access patterns

Qiankun Zhao; Sourav S. Bhowmick

Recently, a lot of work has been done in web usage mining [2]. Among them, mining of frequent Web Access Pattern (WAP) is the most well researched issue [1]. The idea is to transform web logs into sequences of events with user identifications and timestamps, and then extract association and sequential patterns from the events data with certain metrics. The frequent WAPs have been applied to a wide range of applications such as personalization, system improvement, site modification, business intelligence, and usage characterization [2]. However, most of the existing techniques focus only on mining frequent WAP from snapshot web usage data, while web usage data is dynamic in real life. While the frequent WAPs are useful in many applications, knowledge hidden behind the historical changes of web usage data, which reflects how WAPs change, is also critical to many applications such as adaptive web, web site maintenance, business intelligence, etc. In this paper, we propose a novel approach to discover hidden knowledge from historical changes to WAPs. Rather than focusing on the occurrence of the WAPs, we focus on the frequently changing web access patterns. We define a novel type of knowledge, Frequent Mutating WAP (FM-WAP), based on the historical changes of WAPs. The FM-WAP mining process consists of three phases. Firstly, web usage data is represented as a set of WAP trees and partitioned into a sequence of WAP groups (subsets of the WAP trees) according to a user-defined calendar pattern, where each WAP group is represented as a WAP forest. Consequently, the log data is represented by a sequence of WAP forests called WAP history. Then, changes among the WAP history are detected and stored in the global forest. Finally, the FM-WAP is extracted by a traversal of the global forest. Extensive experiments show that our proposed approach can produce novel knowledge of web access patterns efficiently with good scalability.
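The preprocessing described above (turning raw log records into per-user access sequences and partitioning them by a calendar pattern) might look like the following sketch. It stops before WAP trees, forests, and change detection; the record layout `(timestamp, user, page)`, the `period_of` callback, and the function name are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def partition_sessions(log, period_of):
    """Group log records into per-user page sequences for each calendar period.

    log       -- iterable of (timestamp, user, page) records (assumed layout)
    period_of -- maps a timestamp to a calendar period, standing in for the
                 user-defined calendar pattern (hypothetical hook)
    """
    groups = defaultdict(lambda: defaultdict(list))
    # Sorting by timestamp keeps each user's pages in access order.
    for ts, user, page in sorted(log):
        groups[period_of(ts)][user].append(page)
    return {period: dict(users) for period, users in groups.items()}


log = [
    (datetime(2004, 1, 2, 10), "u1", "/a"),
    (datetime(2004, 1, 2, 9), "u1", "/home"),
    (datetime(2004, 2, 1, 9), "u1", "/b"),
]
history = partition_sessions(log, lambda ts: (ts.year, ts.month))
```

Each period's group of sequences plays the role of one WAP group; comparing consecutive groups is where the change detection phase would start.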

data warehousing and knowledge discovery | 2004

Discovering Pattern-Based Dynamic Structures from Versions of Unordered XML Documents

Qiankun Zhao; Sourav S. Bhowmick; Sanjay Kumar Madria

Existing works on XML data mining deal with snapshot XML data only, while XML data is dynamic in real applications. In this paper, we discover knowledge from XML data by taking into account its dynamic nature. We present a novel approach to extract pattern-based dynamic structures from versions of unordered XML documents. With the proposed dynamic metrics, the pattern-based dynamic structures are expected to summarize and predict interesting change trends of certain structures based on their past behaviors. Two types of pattern-based dynamic structures, increasing dynamic structure and decreasing dynamic structure, are considered. With our proposed data model, SMH-Tree, an algorithm for mining such pattern-based dynamic structures with only two scans of the XML sequence is presented. Experimental results show that the proposed algorithm can extract the pattern-based dynamic structures efficiently with good scalability.

knowledge discovery and data mining | 2006

Cleopatra: evolutionary pattern-based clustering of web usage data

Qiankun Zhao; Sourav S. Bhowmick; Le Gruenwald

Existing web usage mining techniques focus only on discovering knowledge based on the statistical measures obtained from the static characteristics of web usage data. They do not consider the dynamic nature of web usage data. In this paper, we present an algorithm called CLEOPATRA (CLustering of Evolutionary PAtTeRn-based web Access sequences) to cluster web access sequences (WASs) based on their evolutionary patterns. In this approach, Web access sequences that have similar change patterns in their support counts in the history are grouped into the same cluster. The intuition is that often WASs are event/task-driven. As a result, WASs related to the same event/task are expected to be accessed in similar ways over time. Such clusters are useful for several applications such as intelligent web site maintenance and personalized web services.
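The grouping idea (sequences whose support counts change in similar ways end up in the same cluster) can be sketched with a deliberately simple stand-in. This is not the CLEOPATRA algorithm: it uses a greedy single-pass grouping on Euclidean distance between change vectors, and the threshold and function names are assumptions.

```python
from math import dist


def change_pattern(counts):
    """Period-to-period differences of a sequence's support counts."""
    return [b - a for a, b in zip(counts, counts[1:])]


def cluster_by_change(histories, threshold):
    """Greedily group sequences whose change patterns are close.

    histories -- {sequence id: support counts per period, all the same length}
    threshold -- maximum Euclidean distance to a cluster's first member
                 (hypothetical parameter)
    """
    clusters = []  # list of (representative pattern, member ids)
    for seq_id, counts in histories.items():
        pattern = change_pattern(counts)
        for representative, members in clusters:
            if dist(representative, pattern) <= threshold:
                members.append(seq_id)
                break
        else:
            # No existing cluster is close enough, so start a new one.
            clusters.append((pattern, [seq_id]))
    return [members for _, members in clusters]


histories = {
    "A>B": [10, 20, 10, 20],
    "A>C": [5, 15, 5, 15],
    "B>D": [10, 10, 10, 10],
}
clusters = cluster_by_change(histories, threshold=1.0)
```

The first two sequences have different absolute counts but identical rises and falls, so they are grouped together, while the flat third sequence forms its own cluster.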

database systems for advanced applications | 2005

FASST mining: discovering frequently changing semantic structure from versions of unordered XML documents

Qiankun Zhao; Sourav S. Bhowmick

In this paper, we present a FASST mining approach to extract the frequently changing semantic structures (FASSTs), which are a subset of semantic substructures that change frequently, from versions of unordered XML documents. We propose a data structure, H-DOM+, and a FASST mining algorithm, which incorporates the semantic issue and takes advantage of the related domain knowledge. The distinct feature of this approach is that the FASST mining process is guided by the user-defined concept hierarchy. Rather than mining all the frequently changing structures, only those frequently changing structures that are semantically meaningful are extracted. Our experimental results show that the H-DOM+ structure is compact and the FASST algorithm is efficient with good scalability. We also design a declarative FASST query language, FASSTQUEL, to make the FASST mining process interactive and flexible.

conference on information and knowledge management | 2005

Mining conserved XML query paths for dynamic-conscious caching

Qiankun Zhao; Sourav S. Bhowmick; Le Gruenwald

Existing XML query pattern-based caching strategies focus on extracting the set of frequently issued query pattern trees based on the number of occurrences of the query pattern trees in the history. Each occurrence of the same query pattern tree is considered equally important for the caching strategy. However, the same query pattern tree may occur at different timepoints in the history of XML queries. This temporal feature can be used to improve the caching strategy. In this paper, we propose a novel type of query pattern called conserved query paths for efficient caching by integrating the support and temporal features together. Conserved query paths are paths in query pattern trees that never change or do not change significantly most of the time (if not always) in terms of their support values during a specific time period. We propose an algorithm to extract those conserved query paths. By ranking those conserved query paths, a dynamic-conscious caching (DCC) strategy is proposed for efficient XML query processing. Experiments show that the DCC caching strategy outperforms the existing XML query pattern tree-based caching strategies.
