Haozhou Wang | Researchain


Publication

Featured research published by Haozhou Wang.

international conference on management of data | 2013

Calibrating trajectory data for similarity-based analysis

Han Su; Kai Zheng; Haozhou Wang; Jiamin Huang; Xiaofang Zhou

Due to the prevalence of GPS-enabled devices and wireless communications technologies, spatial trajectories that describe the movement history of moving objects are being generated and accumulated at an unprecedented pace. Trajectory data in a database are intrinsically heterogeneous, as they represent discrete approximations of original continuous paths derived using different sampling strategies and different sampling rates. Such heterogeneity can have a negative impact on the effectiveness of trajectory similarity measures, which are the basis of many crucial trajectory processing tasks. In this paper, we pioneer a systematic approach to trajectory calibration, a process that transforms a heterogeneous trajectory dataset into one with (almost) unified sampling strategies. Specifically, we propose an anchor-based calibration system that aligns trajectories to a set of anchor points, which are fixed locations independent of trajectory data. After examining four different types of anchor points for the purpose of building a stable reference system, we propose a geometry-based calibration approach that considers the spatial relationship between anchor points and trajectories. Then a more advanced model-based calibration method is presented, which exploits the power of machine learning techniques to train inference models from historical trajectory data to improve calibration effectiveness. Finally, we conduct extensive experiments using real trajectory datasets to demonstrate the effectiveness and efficiency of the proposed calibration system.
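The idea of aligning trajectories to fixed anchor points can be illustrated with a minimal sketch. It is not the paper's geometry-based or model-based algorithm: it simply snaps each sample to its nearest anchor, and the function name `calibrate`, the `radius` parameter, and the use of Euclidean distance are all assumptions made for illustration.

```python
import math

def calibrate(trajectory, anchors, radius):
    """Rewrite a trajectory as a sequence of anchor indices.

    Hypothetical simplification: each sample is mapped to its nearest
    anchor (Euclidean distance); samples farther than `radius` from every
    anchor are dropped, and consecutive repeats of an anchor are collapsed.
    """
    calibrated = []
    for point in trajectory:
        # Index of the anchor closest to this sample.
        best = min(range(len(anchors)), key=lambda i: math.dist(point, anchors[i]))
        if math.dist(point, anchors[best]) > radius:
            continue  # no anchor close enough; skip this sample
        if not calibrated or calibrated[-1] != best:
            calibrated.append(best)
    return calibrated

# Two samplings of the same path at different rates.
anchors = [(0, 0), (1, 0), (2, 0), (3, 0)]
dense = [(0.0, 0.1), (0.4, 0.1), (0.9, -0.1), (1.6, 0.0), (2.1, 0.1), (2.9, 0.0)]
sparse = [(0.1, 0.0), (1.1, 0.1), (2.0, -0.1), (3.1, 0.0)]

print(calibrate(dense, anchors, 0.5))   # [0, 1, 2, 3]
print(calibrate(sparse, anchors, 0.5))  # [0, 1, 2, 3]
```

Both samplings reduce to the same anchor sequence, which is the property that makes similarity measures comparable across heterogeneous data.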

conference on information and knowledge management | 2014

SharkDB: An In-Memory Column-Oriented Trajectory Storage

Haozhou Wang; Kai Zheng; Jiajie Xu; Bolong Zheng; Xiaofang Zhou; Shazia Wasim Sadiq

The last decade has witnessed the prevalence of sensor and GPS technologies that produce a high volume of trajectory data representing the motion history of moving objects. However, some characteristics of trajectories, such as variable lengths and asynchronous sampling rates, make them difficult to fit into traditional database systems that are disk-based and tuple-oriented. Motivated by the success of column stores and the recent development of in-memory databases, we explore the potential for boosting the performance of trajectory data processing by designing a novel trajectory storage within main memory. In contrast to most existing trajectory indexing methods that keep consecutive samples of the same trajectory in the same disk page, we partition the database into frames in which the positions of all moving objects at the same time instant are stored together and aligned in main memory. We found this column-wise storage to be surprisingly well suited for in-memory computing since most frames can be stored in highly compressed form, which is pivotal for increasing the memory throughput and reducing CPU cache misses. The independence between frames also makes them natural working units when parallelizing data processing in a multi-core environment. Lastly, we run a variety of common trajectory queries on both real and synthetic datasets in order to demonstrate the advantages and study the limitations of our proposed storage.
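The frame layout described in this abstract can be sketched as follows. This is only an illustration of grouping positions by time instant rather than by trajectory; the names `build_frames` and `snapshot`, the fixed `interval`, and the dictionary representation are assumptions, and the compression and parallel processing mentioned in the abstract are not modelled.

```python
from collections import defaultdict

def build_frames(trajectories, interval):
    """Group samples by time slot: frames[slot][object_id] = (x, y).

    `trajectories` maps an object id to a list of (t, x, y) samples.
    If an object has several samples in one slot, the last one is kept
    (an assumption of this sketch).
    """
    frames = defaultdict(dict)
    for obj_id, samples in trajectories.items():
        for t, x, y in samples:
            slot = int(t // interval)
            frames[slot][obj_id] = (x, y)
    return dict(frames)

def snapshot(frames, t, interval):
    """Positions of all objects recorded in the frame covering time t."""
    return frames.get(int(t // interval), {})

trajectories = {
    "a": [(0, 0.0, 0.0), (10, 1.0, 0.0), (20, 2.0, 0.0)],
    "b": [(10, 5.0, 5.0), (25, 6.0, 5.0)],
}
frames = build_frames(trajectories, interval=10)
print(snapshot(frames, 12, interval=10))  # {'a': (1.0, 0.0), 'b': (5.0, 5.0)}
```

A query about one time instant touches a single frame, whereas a trajectory-per-record layout would have to visit every trajectory.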

very large data bases | 2015

Calibrating trajectory data for spatio-temporal similarity analysis

Han Su; Kai Zheng; Jiamin Huang; Haozhou Wang; Xiaofang Zhou

Due to the prevalence of GPS-enabled devices and wireless communications technologies, spatial trajectories that describe the movement history of moving objects are being generated and accumulated at an unprecedented pace. Trajectory data in a database are intrinsically heterogeneous, as they represent discrete approximations of original continuous paths derived using different sampling strategies and different sampling rates. Such heterogeneity can have a negative impact on the effectiveness of trajectory similarity measures, which are the basis of many crucial trajectory processing tasks. In this paper, we pioneer a systematic approach to trajectory calibration, a process that transforms a heterogeneous trajectory dataset into one with (almost) unified sampling strategies. Specifically, we propose an anchor-based calibration system that aligns trajectories to a set of anchor points, which are fixed locations independent of trajectory data. After examining four different types of anchor points for the purpose of building a stable reference system, we propose a spatial-only geometry-based calibration approach that considers the spatial relationship between anchor points and trajectories. Then a more advanced spatial-only model-based calibration method is presented, which exploits the power of machine learning techniques to train inference models from historical trajectory data to improve calibration effectiveness. Next, since trajectories carry temporal information, we extend these two spatial-only calibration algorithms to incorporate it, so that a proper time stamp can be inferred for each anchor point of a calibrated trajectory. We also provide a solution that reduces the cost of updating the reference system, i.e., the number of trajectories that must be re-calibrated. Finally, we conduct extensive experiments using real trajectory datasets to demonstrate the effectiveness and efficiency of the proposed calibration system.

international conference on management of data | 2015

SharkDB: An In-Memory Storage System for Massive Trajectory Data

Haozhou Wang; Kai Zheng; Xiaofang Zhou; Shazia Wasim Sadiq

An increasing amount of motion history data, called trajectories, is being collected from different sources such as GPS-enabled mobile devices, surveillance cameras and social networks. However, it is hard to store and manage trajectory data in traditional database systems, since its variable lengths and asynchronous sampling rates do not fit the disk-based and tuple-oriented structures that are fundamental to those systems. We implement a novel trajectory storage system motivated by the success of column stores and the recent development of in-memory databases. In this storage design, we explore opportunities to boost the performance of query processing for trajectory data. To achieve this, we partition the trajectories into frames as column-oriented storage in order to store the sample points of a moving object, aligned by time interval, within main memory. Furthermore, the frames can be highly compressed and well structured to increase the memory utilization ratio and reduce CPU cache misses. Data processing is also easier to parallelize on a multi-core server since the frames are mutually independent.

World Wide Web | 2018

SharkDB: an in-memory column-oriented storage for trajectory analysis

Bolong Zheng; Haozhou Wang; Kai Zheng; Han Su; Kuien Liu; Shuo Shang

mobile data management | 2014

Cost-Efficient Spatial Network Partitioning for Distance-Based Query Processing

Jiping Wang; Kai Zheng; Hoyoung Jeung; Haozhou Wang; Bolong Zheng; Xiaofang Zhou

The efficiency of spatial query processing is crucial for many applications such as location-based services. In spatial networks, queries such as k-NN queries are all based on network distance evaluation. Classic solutions for these queries rely on network expansion and are not efficient enough for large networks. Some approaches have improved query efficiency but incur considerable space cost for the index. To address these problems, we propose a hierarchical graph-partitioning-based index named Partition Tree. It organizes the vertices of a spatial network into a hierarchy through a series of graph partitioning processes. Meanwhile, precomputed distances are associated with this hierarchy to facilitate efficient query processing. Inspired by the observation that queries are usually invoked around objects of interest, we propose a query-oriented optimization on top of the Partition Tree. It uses a cost model to evaluate the influence of the object distribution and partitioning topology on query efficiency. A cost-efficient graph partitioning method is then developed based on this cost model. Experimental results on real datasets demonstrate that our proposed index and algorithms have superior performance over the state-of-the-art approaches and are scalable to large spatial networks.

australasian database conference | 2015

Storing and Processing Massive Trajectory Data on SAP HANA

Haozhou Wang; Kai Zheng; Hoyoung Jeung; Shane Bracher; Asadul K. Islam; Wasim Sadiq; Shazia Wasim Sadiq; Xiaofang Zhou

Owing to the development of cheap RAM-based storage technology, modern computing hardware can afford much larger main memory. Consequently, traditional database systems can be re-designed to store and manage all the data in main memory permanently. Such in-memory database systems (IMDBs) have attracted increasing attention from both academia and industry due to their outstanding performance in processing large amounts of data. In this work, we exploit the computational power of SAP HANA, the in-memory column-oriented data analytics platform designed by SAP, to support efficient query processing for moving object trajectories. We have tailored the frame-based data structure designed in our previous SharkDB project and made trajectory data with variable lengths and sampling rates suitable for the relational database model in SAP HANA. Extensive experiments on a large-scale real dataset demonstrate the superior performance of our frame-based design in processing a variety of queries.

australasian database conference | 2014

Efficient Aggregate Farthest Neighbour Query Processing on Road Networks

Haozhou Wang; Kai Zheng; Han Su; Jiping Wang; Shazia Wasim Sadiq; Xiaofang Zhou

This paper addresses the problem of searching for the k aggregate farthest neighbours (AkFN query for short) on road networks. Given a query point set, AkFN aims to find the top-k points from a dataset with the largest aggregate network distance. The challenge of the AkFN query on a road network is how to reduce the number of network distance evaluations, which are expensive operations. We propose a three-phase solution: clustering the points in the dataset, pre-computing network distance bounds, and searching. By organizing the objects into compact clusters and pre-calculating the network distance bound from clusters to a set of reference points, we can effectively prune a large fraction of clusters without probing each individual point inside. Finally, we demonstrate the efficiency of our proposed approaches by extensive experiments on a real Point-of-Interest (POI) dataset.
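To make the query definition concrete, here is a brute-force baseline for AkFN on a small graph. It is not the three-phase solution from the abstract: it performs exactly the exhaustive network distance evaluation that the clustering and bound pre-computation are meant to avoid. Using the sum as the aggregate function, the adjacency-list format, and the names `dijkstra` and `akfn_brute_force` are assumptions made for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Shortest network distance from `source` to every reachable vertex."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def akfn_brute_force(graph, queries, candidates, k):
    """Top-k candidates by summed network distance to all query points.

    Sum is an assumed aggregate; candidates unreachable from any query
    point are skipped.
    """
    per_query = [dijkstra(graph, q) for q in queries]
    scored = []
    for c in candidates:
        if all(c in d for d in per_query):
            scored.append((sum(d[c] for d in per_query), c))
    scored.sort(key=lambda s: s[0], reverse=True)
    return [c for _, c in scored[:k]]

# Path network a - b - c - d with edge weights 1, 2, 3.
graph = {
    "a": [("b", 1.0)],
    "b": [("a", 1.0), ("c", 2.0)],
    "c": [("b", 2.0), ("d", 3.0)],
    "d": [("c", 3.0)],
}
print(akfn_brute_force(graph, ["a", "b"], ["a", "c", "d"], k=2))  # ['d', 'c']
```

Here the summed distances are 11 for d, 5 for c and 1 for a, so d and c are returned. Every candidate is scored individually, which is the cost that cluster-level pruning aims to cut.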

australasian database conference | 2013