Han Su | Researchain

Publication

Featured research published by Han Su.

international conference on data engineering | 2015

Interactive Top-k Spatial Keyword queries

Kai Zheng; Han Su; Bolong Zheng; Shuo Shang; Jiajie Xu; Jiajun Liu; Xiaofang Zhou

Conventional top-k spatial keyword queries require users to explicitly specify their preferences between spatial proximity and keyword relevance. In this work we investigate how to eliminate this requirement by enhancing the conventional queries with interaction, resulting in the Interactive Top-k Spatial Keyword (ITkSK) query. Having confirmed the feasibility by theoretical analysis, we propose a three-phase solution focusing on both effectiveness and efficiency. The first phase substantially narrows down the search space for subsequent phases by efficiently retrieving a set of geo-textual k-skyband objects as the initial candidates. In the second phase, three practical strategies for selecting a subset of candidates are developed with the aim of maximizing the expected benefit for learning user preferences at each round of interaction. Finally, we discuss how to determine the termination condition automatically and estimate the preference based on the users' feedback. An empirical study based on real PoI datasets verifies our theoretical observation that the quality of top-k results in spatial keyword queries can be greatly improved through only a few rounds of interaction.
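
To make the first phase concrete, here is a minimal sketch of a k-skyband filter over geo-textual objects. It is an illustration under stated assumptions, not the paper's retrieval algorithm: the tuple layout, the `dominates` and `k_skyband` helpers, and the sample data are all hypothetical, and the scan is a naive quadratic one rather than an index-based search. An object stays in the k-skyband when fewer than k other objects are at least as close and at least as relevant (and strictly better on one of the two).

```python
from typing import List, Tuple

# Hypothetical layout: (name, distance_to_query, text_relevance).
# Smaller distance is better, larger relevance is better.
Obj = Tuple[str, float, float]

def dominates(a: Obj, b: Obj) -> bool:
    """a dominates b if it is no farther and no less relevant, and strictly better on one."""
    return a[1] <= b[1] and a[2] >= b[2] and (a[1] < b[1] or a[2] > b[2])

def k_skyband(objects: List[Obj], k: int) -> List[Obj]:
    """Keep the objects dominated by fewer than k others (naive quadratic scan)."""
    return [o for o in objects if sum(dominates(p, o) for p in objects) < k]

# Made-up points of interest, for illustration only.
pois = [
    ("cafe", 1.0, 0.9),
    ("bar", 2.0, 0.8),
    ("diner", 3.0, 0.7),
    ("pub", 0.5, 0.2),
]
candidates = k_skyband(pois, k=2)
print([name for name, _, _ in candidates])
```

The filter is useful here because an object that is dominated by k or more others is generally outranked by them however the user trades distance against relevance, so it can be dropped before any interaction starts.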

international conference on management of data | 2013

Calibrating trajectory data for similarity-based analysis

Han Su; Kai Zheng; Haozhou Wang; Jiamin Huang; Xiaofang Zhou

Due to the prevalence of GPS-enabled devices and wireless communications technologies, spatial trajectories that describe the movement history of moving objects are being generated and accumulated at an unprecedented pace. Trajectory data in a database are intrinsically heterogeneous, as they represent discrete approximations of original continuous paths derived using different sampling strategies and different sampling rates. Such heterogeneity can have a negative impact on the effectiveness of trajectory similarity measures, which are the basis of many crucial trajectory processing tasks. In this paper, we pioneer a systematic approach to trajectory calibration, a process that transforms a heterogeneous trajectory dataset into one with (almost) unified sampling strategies. Specifically, we propose an anchor-based calibration system that aligns trajectories to a set of anchor points, which are fixed locations independent of trajectory data. After examining four different types of anchor points for the purpose of building a stable reference system, we propose a geometry-based calibration approach that considers the spatial relationship between anchor points and trajectories. Then a more advanced model-based calibration method is presented, which exploits the power of machine learning techniques to train inference models from historical trajectory data to improve calibration effectiveness. Finally, we conduct extensive experiments using real trajectory datasets to demonstrate the effectiveness and efficiency of the proposed calibration system.
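
The idea of aligning trajectories to fixed anchor points can be sketched as follows. This is a deliberately simplified stand-in (nearest-anchor snapping), not the geometry-based or model-based method of the paper; the `calibrate` function, the `max_dist` threshold, and the coordinates are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]

def calibrate(trajectory: List[Point], anchors: List[Point], max_dist: float) -> List[Point]:
    """Rewrite a trajectory as a sequence of anchor points (nearest-anchor snapping)."""
    calibrated: List[Point] = []
    for p in trajectory:
        nearest = min(anchors, key=lambda a: math.dist(a, p))
        if math.dist(nearest, p) > max_dist:
            continue  # sample is too far from every anchor: dropped in this sketch
        if not calibrated or calibrated[-1] != nearest:
            calibrated.append(nearest)  # collapse consecutive duplicates
    return calibrated

# Hypothetical anchors and two samplings of the same underlying path.
anchors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
dense = [(0.1, 0.0), (0.4, 0.1), (0.9, 0.0), (1.6, 0.1), (2.1, 0.0)]
sparse = [(0.0, 0.1), (1.1, 0.0), (1.9, 0.0)]

print(calibrate(dense, anchors, max_dist=0.5))
print(calibrate(sparse, anchors, max_dist=0.5))
```

In this toy case the densely and sparsely sampled trajectories map to the same anchor sequence, which is the property a similarity measure benefits from.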

very large data bases | 2015

Calibrating trajectory data for spatio-temporal similarity analysis

Han Su; Kai Zheng; Jiamin Huang; Haozhou Wang; Xiaofang Zhou

Due to the prevalence of GPS-enabled devices and wireless communications technologies, spatial trajectories that describe the movement history of moving objects are being generated and accumulated at an unprecedented pace. Trajectory data in a database are intrinsically heterogeneous, as they represent discrete approximations of original continuous paths derived using different sampling strategies and different sampling rates. Such heterogeneity can have a negative impact on the effectiveness of trajectory similarity measures, which are the basis of many crucial trajectory processing tasks. In this paper, we pioneer a systematic approach to trajectory calibration, a process that transforms a heterogeneous trajectory dataset into one with (almost) unified sampling strategies. Specifically, we propose an anchor-based calibration system that aligns trajectories to a set of anchor points, which are fixed locations independent of trajectory data. After examining four different types of anchor points for the purpose of building a stable reference system, we propose a spatial-only geometry-based calibration approach that considers the spatial relationship between anchor points and trajectories. Then a more advanced spatial-only model-based calibration method is presented, which exploits the power of machine learning techniques to train inference models from historical trajectory data to improve calibration effectiveness. Afterward, since trajectories carry temporal information, we extend these two spatial-only calibration algorithms to incorporate it, so that a proper timestamp can be inferred for each anchor point of a calibrated trajectory. We also provide a solution that reduces the cost of updating the reference system, i.e., the number of trajectories that need to be re-calibrated. Finally, we conduct extensive experiments using real trajectory datasets to demonstrate the effectiveness and efficiency of the proposed calibration system.
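
One simple way to attach a timestamp to an anchor point is linear interpolation between the two surrounding samples. The sketch below assumes constant speed along the straight segment between samples; that assumption, the `interpolate_time` helper, and the sample values are illustrative and are not taken from the paper.

```python
from typing import Tuple

Sample = Tuple[float, float, float]  # (x, y, t)

def interpolate_time(a: Sample, b: Sample, anchor: Tuple[float, float]) -> float:
    """Estimate when the object passed `anchor`, assuming constant speed from a to b."""
    ax, ay, at = a
    bx, by, bt = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return at  # degenerate segment: both samples share one location
    # Fraction along the segment of the anchor's orthogonal projection, clamped to [0, 1].
    frac = ((anchor[0] - ax) * dx + (anchor[1] - ay) * dy) / seg_len_sq
    frac = max(0.0, min(1.0, frac))
    return at + frac * (bt - at)

# Hypothetical samples at t=100 and t=140; the anchor projects a quarter of the way along.
estimated = interpolate_time((0.0, 0.0, 100.0), (4.0, 0.0, 140.0), (1.0, 0.5))
print(estimated)
```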

international conference on data engineering | 2015

Making sense of trajectory data: A partition-and-summarization approach

Han Su; Kai Zheng; Kai Zeng; Jiamin Huang; Shazia Wasim Sadiq; Nicholas Jing Yuan; Xiaofang Zhou

Due to the prevalence of GPS-enabled devices and wireless communication technology, spatial trajectories that describe the movement history of moving objects are being generated and accumulated at an unprecedented pace. However, a raw trajectory in the form of a sequence of timestamped locations does not make much sense to humans without a semantic representation. In this work we aim to facilitate human understanding of a raw trajectory by automatically generating a short text to describe it. By formulating this task as the problem of adaptive trajectory segmentation and feature selection, we propose a partition-and-summarization framework. In the partition phase, we first define a set of features for each trajectory segment and then derive an optimal partition with the aim of making the segments within each partition as homogeneous as possible in terms of their features. In the summarization phase, for each partition we select the most interesting features by comparing against the common behaviours of historical trajectories on the same route and generate a short text description for these features. In an empirical study, we apply our solution to a real trajectory dataset and find that the generated text can effectively reflect the important parts of a trajectory.
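
The partition phase can be illustrated with a small dynamic program. This sketch assumes a single numeric feature per segment (speed), a fixed number of partitions `k` with 1 <= k <= number of segments, and the sum of squared deviations as the homogeneity measure; all three are assumptions for illustration, and the paper's feature set and objective may differ.

```python
from typing import List

def sse(values: List[float]) -> float:
    """Sum of squared deviations from the mean: lower means more homogeneous."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def partition(features: List[float], k: int) -> List[List[float]]:
    """Split the sequence into k contiguous groups with minimal total SSE."""
    n = len(features)
    inf = float("inf")
    # cost[j][i]: minimal total SSE when the first i values form j contiguous groups.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for j in range(1, k + 1):
        for i in range(j, n + 1):
            for s in range(j - 1, i):
                c = cost[j - 1][s] + sse(features[s:i])
                if c < cost[j][i]:
                    cost[j][i], cut[j][i] = c, s
    # Walk the recorded cut points backwards to recover the groups.
    groups: List[List[float]] = []
    i = n
    for j in range(k, 0, -1):
        s = cut[j][i]
        groups.append(features[s:i])
        i = s
    return groups[::-1]

# Hypothetical per-segment speeds (km/h): highway, congestion, highway again.
speeds = [60.0, 62.0, 61.0, 10.0, 12.0, 11.0, 58.0, 60.0]
print(partition(speeds, k=3))
```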

international conference on data engineering | 2016

Keyword-aware continuous kNN query on road networks

Bolong Zheng; Kai Zheng; Xiaokui Xiao; Han Su; Hongzhi Yin; Xiaofang Zhou; Guohui Li

It is nowadays quite common for road networks to have textual contents on the vertices, which describe auxiliary information (e.g., business, traffic, etc.) associated with the vertex. In such road networks, which are modelled as weighted undirected graphs, each vertex is associated with one or more keywords, and each edge is assigned a weight, which can be its physical length or travelling time. In this paper, we study the problem of keyword-aware continuous k nearest neighbour (KCkNN) search on road networks, which computes the k nearest vertices that contain the query keywords issued by a moving object and maintains the results continuously as the object is moving on the road network. Reducing the query processing costs in terms of computation and communication has attracted considerable attention in the database community with interesting techniques proposed. This paper proposes a framework, called a Labelling AppRoach for Continuous kNN query (LARC), on road networks to cope with KCkNN queries efficiently. First we build a pivot-based reverse label index and a keyword-based pivot tree index to improve the efficiency of keyword-aware k nearest neighbour (KkNN) search by avoiding massive network traversals and sequential probing of keywords. To reduce the frequency of unnecessary result updates, we develop the concepts of dominance interval and dominance region on road networks, which share a similar intuition with the safe region for processing continuous queries in Euclidean space but are more complicated and thus require more dedicated design. For high-frequency keywords, we resolve the dominance interval when the query results change. In addition, a path-based dominance updating approach is proposed to compute the dominance region efficiently when the query keywords are of low frequency. We conduct extensive experiments by comparing our algorithms with the state-of-the-art methods on real data sets. The empirical observations have verified the superiority of our proposed solution in all aspects of index size, communication cost and computation time.
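
For reference, the snapshot KkNN problem defined above can be answered without any index by plain network expansion. The sketch below is that baseline (Dijkstra's algorithm that stops after k matching vertices), not LARC; it performs exactly the kind of network traversal the label and pivot-tree indexes are meant to avoid. The graph, keyword sets, and function name are hypothetical.

```python
import heapq
from typing import Dict, List, Set, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # adjacency list: vertex -> [(neighbour, weight)]

def keyword_knn(graph: Graph, keywords: Dict[str, Set[str]], source: str,
                query_keywords: Set[str], k: int) -> List[Tuple[str, float]]:
    """Return up to k nearest vertices (by network distance) containing all query keywords."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    settled: Set[str] = set()
    results: List[Tuple[str, float]] = []
    while heap and len(results) < k:
        d, v = heapq.heappop(heap)
        if v in settled:
            continue  # stale heap entry
        settled.add(v)
        if query_keywords <= keywords.get(v, set()):
            results.append((v, d))  # vertices are settled in ascending distance order
        for u, w in graph.get(v, []):
            nd = d + w
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return results

# Hypothetical undirected road network, stored with both edge directions.
graph: Graph = {
    "a": [("b", 1.0), ("d", 4.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("b", 2.0), ("d", 1.0)],
    "d": [("a", 4.0), ("c", 1.0)],
}
keywords = {"b": {"cafe"}, "c": {"cafe", "wifi"}, "d": {"cafe"}}
print(keyword_knn(graph, keywords, "a", {"cafe"}, k=2))
```

A continuous query would have to rerun this search as the object moves, which is the cost that motivates maintaining results only when they can actually change.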

very large data bases | 2014

STMaker: a system to make sense of trajectory data

Han Su; Kai Zheng; Kai Zeng; Jiamin Huang; Xiaofang Zhou

Wide adoption of GPS-enabled devices generates large amounts of trajectories every day. Raw trajectory data describe the movement history of moving objects as a sequence of triples, from which it is hard for humans to perceive the prominent features of a trajectory, such as where and how the moving object travels. In this demo, we present the STMaker system to help users make sense of individual trajectories. Given a trajectory, STMaker can automatically extract the significant semantic behavior of the trajectory and summarize the behavior in a short human-readable text. In this paper, we first introduce the phases of generating trajectory summarizations, and then show several real trajectory summarization cases.

World Wide Web | 2018

SharkDB: an in-memory column-oriented storage for trajectory analysis

Bolong Zheng; Haozhou Wang; Kai Zheng; Han Su; Kuien Liu; Shuo Shang

The last decade has witnessed the prevalence of sensor and GPS technologies that produce a high volume of trajectory data representing the motion history of moving objects. However, some characteristics of trajectories, such as variable lengths and asynchronous sampling rates, make them difficult to fit into traditional database systems that are disk-based and tuple-oriented. Motivated by the success of column stores and the recent development of in-memory databases, we explore the potential opportunities for boosting the performance of trajectory data processing by designing a novel trajectory storage within main memory. In contrast to most existing trajectory indexing methods that keep consecutive samples of the same trajectory in the same disk page, we partition the database into frames in which the positions of all moving objects at the same time instant are stored together and aligned in main memory. We found this column-wise storage to be surprisingly well suited for in-memory computing since most frames can be stored in highly compressed form, which is pivotal for increasing the memory throughput and reducing CPU cache misses. The independence between frames also makes them natural working units when parallelizing data processing in a multi-core environment. Lastly, we run a variety of common trajectory queries on both real and synthetic datasets in order to demonstrate the advantages and study the limitations of our proposed storage.
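
The frame layout described above amounts to transposing per-object trajectories into per-instant columns. The following sketch shows only that transposition, assuming samples are already aligned to integer time instants; the `to_frames` helper and the data are hypothetical, and the compression and parallel processing discussed in the abstract are not modelled.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]

def to_frames(trajectories: Dict[str, Dict[int, Point]]
              ) -> Tuple[List[str], Dict[int, List[Optional[Point]]]]:
    """Turn per-object trajectories into frames: one aligned position list per time instant."""
    ids = sorted(trajectories)  # fixed object order shared by every frame
    instants = sorted({t for samples in trajectories.values() for t in samples})
    frames = {t: [trajectories[oid].get(t) for oid in ids] for t in instants}
    return ids, frames

# Hypothetical trajectories keyed by object id, then by time instant.
trajectories = {
    "taxi1": {0: (0.0, 0.0), 1: (1.0, 0.0)},
    "taxi2": {1: (5.0, 5.0), 2: (5.0, 6.0)},
}
ids, frames = to_frames(trajectories)
print(ids)
print(frames[1])  # positions of all objects at instant 1, in `ids` order
```

A query such as "where was every object at instant 1" then reads a single frame instead of touching every trajectory, and an object with no sample at an instant is stored as `None`.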

Data Science and Engineering | 2016

Landmark-Based Route Recommendation with Crowd Intelligence

Bolong Zheng; Han Su; Kai Zheng; Xiaofang Zhou

Route recommendation is one of the most widely used location-based services nowadays, as it is vital for a good driving experience and smooth public traffic. Given a user-specified origin and destination, a route recommendation service aims to provide users with the routes offering the best travelling experience according to given criteria. However, even the routes recommended by major service providers can deviate significantly from the ones travelled by experienced drivers, which motivated previous research that leverages crowds’ knowledge to improve the recommendation quality. Since route recommendation is normally an online task, low-latency response to drivers’ queries is required in this kind of system. Unfortunately, the latency of crowdsourced systems is usually high, because they need to generate tasks and wait for workers’ feedback before answering queries. To address this issue, we extend our previous system, CrowdPlanner, by proposing strategies to reuse existing answers (truths) to deal with incoming queries more efficiently. A prototype system has been deployed to many voluntary mobile clients, and extensive tests on real-scenario queries have shown the superiority of our system in comparison with the results given by map services and popular route-mining algorithms.

australasian database conference | 2014

Efficient Aggregate Farthest Neighbour Query Processing on Road Networks

Haozhou Wang; Kai Zheng; Han Su; Jiping Wang; Shazia Wasim Sadiq; Xiaofang Zhou

This paper addresses the problem of searching for the k aggregate farthest neighbours (AkFN query for short) on road networks. Given a query point set, AkFN aims to find the top-k points from a dataset with the largest aggregate network distance. The challenge of the AkFN query on a road network is how to reduce the number of network distance evaluations, which are expensive operations. In our work, we propose a three-phase solution, consisting of clustering the points in the dataset, pre-computing network distance bounds, and searching. By organizing the objects into compact clusters and pre-calculating the network distance bound from clusters to a set of reference points, we can effectively prune a large fraction of clusters without probing each individual point inside. Finally, we demonstrate the efficiency of our proposed approaches by extensive experiments on a real Point-of-Interest (POI) dataset.
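
To pin down what an AkFN query returns, here is a brute-force sketch that evaluates every network distance, which is precisely the expensive step the clustering and bound-based pruning are designed to cut down. It assumes the aggregate function is the sum of distances and that the network is connected; both are assumptions for illustration, as are the function names and the toy graph.

```python
import heapq
from typing import Dict, List, Tuple

Graph = Dict[str, List[Tuple[str, float]]]  # adjacency list: vertex -> [(neighbour, weight)]

def network_distances(graph: Graph, source: str) -> Dict[str, float]:
    """Single-source shortest network distances (Dijkstra's algorithm)."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale heap entry
        for u, w in graph.get(v, []):
            nd = d + w
            if nd < dist.get(u, float("inf")):
                dist[u] = nd
                heapq.heappush(heap, (nd, u))
    return dist

def akfn(graph: Graph, query_points: List[str], candidates: List[str], k: int) -> List[str]:
    """Top-k candidates by summed network distance to all query points (assumes connectivity)."""
    per_query = [network_distances(graph, q) for q in query_points]
    scored = [(sum(d[c] for d in per_query), c) for c in candidates]
    return [c for _, c in heapq.nlargest(k, scored)]

# Hypothetical path-shaped road network a - b - c - d with unit edge weights.
graph: Graph = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 1.0)],
    "c": [("b", 1.0), ("d", 1.0)],
    "d": [("c", 1.0)],
}
print(akfn(graph, query_points=["a", "b"], candidates=["b", "c", "d"], k=2))
```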

web age information management | 2018

Location Prediction in Social Networks

Rong Liu; Guanglin Cong; Bolong Zheng; Kai Zheng; Han Su

User locations in social networks are needed in many applications that utilize location information to recommend local news and places of interest to users, as well as to detect emergencies around users and alert them. However, out of concern for individual privacy, only a small portion of users share their location on social networks. Thus, to predict the fine-grained locations of user tweets, we present a joint model containing three sub-models: a content-based model, a social-relationship-based model, and a behavior-habit-based model. In the content-based model, we filter out location-independent tweets and use a deep learning algorithm to mine the relationship between semantics and locations. A user trajectory similarity measure is used to build a social graph for users, and historical check-ins are used to capture users’ daily activity habits. We conduct experiments using tweets collected from Shanghai over one year. The results show that our joint model performs well, especially the content-based model. We find that our approach improves accuracy compared to the state-of-the-art location prediction algorithm.
