Wenjian Xu
Hong Kong Polytechnic University
Publication
Featured research published by Wenjian Xu.
symposium on large spatial databases | 2015
Yu Li; Man Lung Yiu; Wenjian Xu
Emerging spatial crowdsourcing platforms enable workers (i.e., the crowd) to complete spatial crowdsourcing tasks (such as taking photos or conducting citizen journalism) that are associated with rewards and tagged with both time and location features. In this paper, we study the problem of recommending, in an online setting, an optimal route for a crowdsourcing worker, such that he can (i) reach his destination on time and (ii) receive the maximum reward from tasks along the route. We show that no optimal online algorithm exists for this problem. Therefore, we propose several heuristics, together with powerful pruning rules to speed up our methods. Experimental results on real datasets show that our proposed heuristics are very efficient and return routes that contain 82–91% of the optimal reward.
international conference on management of data | 2015
Ziqiang Feng; Eric Lo; Ben Kao; Wenjian Xu
Scan and lookup are two core operations in main memory column stores. A scan operation scans a column and returns a result bit vector that indicates which records satisfy a filter. Once a column scan is completed, the result bit vector is converted into a list of record numbers, which is then used to look up values from other columns of interest for a query. Several in-memory data layouts have recently been proposed to improve the performance of in-memory data processing. However, these solutions all stand at either end of a trade-off: each is good in either lookup performance or scan performance, but not both. In this paper we present ByteSlice, a new main memory storage layout that supports both highly efficient scans and lookups. ByteSlice is a byte-level columnar layout that fully leverages SIMD data-parallelism. Micro-benchmark experiments show that ByteSlice achieves a data scan speed of less than 0.5 processor cycles per column value, a new limit of main memory data scan, without sacrificing lookup performance. Our experiments on TPC-H data and real data show that ByteSlice offers significant performance improvement over all state-of-the-art approaches.
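The scan-then-lookup pipeline described in this abstract can be sketched in a few lines of Python. This is only an illustration of the two operations on plain lists; it does not model ByteSlice's byte-level layout or any SIMD processing, and the function names (`scan`, `to_record_numbers`, `lookup`) and sample columns are hypothetical.

```python
from typing import Callable, List, Sequence


def scan(column: Sequence[int], predicate: Callable[[int], bool]) -> List[bool]:
    """Scan one column and return a result bit vector (one flag per record)."""
    return [predicate(value) for value in column]


def to_record_numbers(bits: Sequence[bool]) -> List[int]:
    """Convert the result bit vector into a list of matching record numbers."""
    return [i for i, bit in enumerate(bits) if bit]


def lookup(column: Sequence, record_numbers: Sequence[int]) -> list:
    """Fetch values from another column for the given record numbers."""
    return [column[i] for i in record_numbers]


# Two hypothetical columns of the same table, stored column by column.
price = [5, 12, 7, 30]
name = ["a", "b", "c", "d"]

bits = scan(price, lambda v: v > 6)   # filter: price > 6
rows = to_record_numbers(bits)        # record numbers that passed the filter
result = lookup(name, rows)           # values of `name` for those records
```

Here `bits` is `[False, True, True, True]`, `rows` is `[1, 2, 3]`, and `result` is `["b", "c", "d"]`. The sketch makes the trade-off in the abstract visible: `scan` touches every value of one column, while `lookup` performs random accesses into another.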
advances in geographic information systems | 2012
Wenjian Xu; Chi-Yin Chow; Man Lung Yiu; Qing Li; Chung Keung Poon
A location-aware news feed system enables mobile users to share geo-tagged user-generated messages, e.g., a user can receive nearby messages that are the most relevant to her. In this paper, we present MobiFeed, a framework designed for scheduling news feeds for mobile users. MobiFeed consists of three key functions: location prediction, relevance measure, and news feed scheduler. The location prediction function is designed to predict a mobile user's locations based on an existing path prediction algorithm. The relevance measure function is implemented by combining the vector space model with non-spatial and spatial factors to determine the relevance of a message to a user. The news feed scheduler works with the other two functions to generate news feeds for a mobile user at her current and predicted locations with the best overall quality. To ensure that MobiFeed can scale up to a larger number of messages, we design a heuristic news feed scheduler.
data management on new hardware | 2015
Petrie Wong; Ziqiang Feng; Wenjian Xu; Eric Lo; Ben Kao
Efficient main-memory index structures are crucial to main-memory database systems. Adaptive Radix Tree (ART) is the most recent in-memory index structure. ART is designed to avoid cache misses, leverage SIMD data parallelism, minimize branch mispredictions, and have a small memory footprint. When an in-memory index structure like ART incurs very few cache misses and branch mispredictions, it is natural to ask whether misses in the Translation Lookaside Buffer (TLB) matter. In this paper, we try to confirm whether this is the case and, if the answer is positive, what measures can be taken to alleviate the problem and how effective they are.
Geoinformatica | 2015
Wenjian Xu; Chi-Yin Chow; Man Lung Yiu; Qing Li; Chung Keung Poon
A location-aware news feed system enables mobile users to share geo-tagged user-generated messages, e.g., a user can receive nearby messages that are the most relevant to her. In this paper, we present MobiFeed, a framework designed for scheduling news feeds for mobile users. MobiFeed consists of three key functions: location prediction, relevance measure, and news feed scheduler. The location prediction function is designed to estimate a mobile user’s locations based on a path prediction algorithm. The relevance measure function is implemented by combining the vector space model with non-spatial and spatial factors to determine the relevance of a message to a user. The news feed scheduler works with the other two functions to generate news feeds for a mobile user at her current and predicted locations with the best overall quality. We propose a heuristic algorithm as well as an optimal algorithm for the location-aware news feed scheduler. The performance of MobiFeed is evaluated through extensive experiments using a real road map and a real social network data set. The scalability of MobiFeed is also investigated using a synthetic data set. Experimental results show that MobiFeed obtains a relevance score two times higher than the state-of-the-art approach, and it can scale up to a large number of geo-tagged messages.
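As a rough illustration of a relevance measure that combines a vector space model with a spatial factor, the sketch below multiplies the cosine similarity of term-frequency vectors by a linear distance decay. The abstract does not give MobiFeed's actual formula, so the multiplicative combination, the linear decay, the `max_dist` parameter, and all function names are assumptions made for this example only.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def relevance(user_terms: Sequence[str], msg_terms: Sequence[str],
              user_xy: Tuple[float, float], msg_xy: Tuple[float, float],
              max_dist: float) -> float:
    """Assumed combination: text similarity scaled by a linear distance decay."""
    text_score = cosine(Counter(user_terms), Counter(msg_terms))
    distance = math.dist(user_xy, msg_xy)              # Euclidean distance
    spatial_score = max(0.0, 1.0 - distance / max_dist)  # 1 at the user, 0 at max_dist
    return text_score * spatial_score


# Hypothetical user profile and two messages with identical text.
near = relevance(["pizza", "cafe"], ["pizza", "cafe"], (0.0, 0.0), (1.0, 0.0), 10.0)
far = relevance(["pizza", "cafe"], ["pizza", "cafe"], (0.0, 0.0), (8.0, 0.0), 10.0)
```

With identical text, the nearer message scores higher (`near > far`), and any message at or beyond `max_dist` scores zero. A scheduler could rank candidate messages by such a score at the user's current and predicted locations.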
IEEE Transactions on Services Computing | 2016
Wenjian Xu; Chi-Yin Chow
A location-aware news feed (LANF) system generates news feeds for a mobile user based on her spatial preference (i.e., her current location and future locations) and non-spatial preference (i.e., her interest). Existing LANF systems simply send the most relevant geo-tagged messages to their users. Unfortunately, the major limitation of such an approach is that a news feed may contain messages related to the same location (i.e., point-of-interest) or the same category of locations (e.g., food, entertainment or sport). We argue that diversity is a very important feature for location-aware news feeds because it helps users discover new places and activities. In this paper, we propose D-MobiFeed, a new LANF system that enables a user to specify the minimum number of message categories (h) for the messages in a news feed. In D-MobiFeed, our objective is to efficiently schedule news feeds for a mobile user at her current and predicted locations, such that (i) each news feed contains messages belonging to at least h different categories, and (ii) their total relevance to the user is maximized. To achieve this objective, we divide the problem into two parts, namely, a decision problem and an optimization problem. For the decision problem, we provide an exact solution by modeling it as a maximum flow problem and proving its correctness. The optimization problem is solved by our proposed three-stage heuristic algorithm. We conduct a user study and experiments to evaluate the performance of D-MobiFeed using a real data set crawled from Foursquare. Experimental results show that our proposed three-stage heuristic scheduling algorithm outperforms the brute-force optimal algorithm by at least an order of magnitude in terms of running time, and the relative error incurred by the heuristic algorithm is below 1 percent. D-MobiFeed with the location prediction method effectively improves the relevance, diversity, and efficiency of news feeds.
Archive | 2014
Chi-Yin Chow; Wenjian Xu; Tian He
Since wireless sensor networks (WSNs) are vulnerable to malicious attacks due to their characteristics, privacy is a critical issue in many WSN applications. In this chapter, we discuss existing privacy-enhancing technologies designed for protecting system privacy, data privacy and context privacy in WSNs. The privacy-preserving techniques for system privacy hide information about the locations of source nodes and receiver nodes. The data privacy techniques mainly protect the privacy of data content and in-network data aggregation. Context privacy refers to the location privacy of users and the temporal privacy of events. For each of these three kinds of privacy in WSNs, we describe its threats and illustrate its existing privacy-preserving techniques. More importantly, we compare the different techniques and indicate their strengths and weaknesses. We also discuss possible improvements, thus highlighting some research trends in this area.
IEEE Transactions on Knowledge and Data Engineering | 2016
Wenjian Xu; Zhian He; Eric Lo; Chi-Yin Chow
Because existing database systems are increasingly difficult to use, improving the quality and the usability of database systems has gained tremendous momentum over the last few years. In particular, the feature of explaining why some expected tuples are missing from the result of a query has received more attention. In this paper, we study the problem of explaining missing answers to top-k queries in the context of SQL (i.e., with selection, projection, join, and aggregation). To approach this problem, we use the query-refinement method. That is, given as inputs the original top-k SQL query and a set of missing tuples, our algorithms return to the user a refined query whose result includes both the missing tuples and the original query results. Case studies and experimental results show that our algorithms are able to return high-quality explanations efficiently.
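The query-refinement idea can be illustrated with a deliberately simplified case in which only k is refined. The sketch below finds the smallest k' at which every missing tuple enters the top-k' result while the original top-k results are retained. It is not the paper's algorithm, which targets full SQL queries; the function name `refine_k`, the tuple format, and the sample data are hypothetical.

```python
from typing import Callable, Hashable, List, Sequence


def refine_k(rows: Sequence[Hashable], score: Callable[[Hashable], float],
             k: int, missing: Sequence[Hashable]) -> int:
    """Return the smallest k' >= k whose top-k' result contains all missing rows."""
    ranked: List[Hashable] = sorted(rows, key=score, reverse=True)
    position = {row: i for i, row in enumerate(ranked)}  # 0-based rank of each row
    for row in missing:
        if row not in position:
            raise ValueError(f"{row!r} is not in the input; enlarging k cannot help")
    worst_rank = max((position[row] for row in missing), default=-1)
    # Keeping k' >= k guarantees the original results stay in the answer.
    return max(k, worst_rank + 1)


# Hypothetical table of (name, score) tuples ranked by score.
rows = [("a", 9), ("b", 7), ("c", 5), ("d", 3)]
by_score = lambda r: r[1]

new_k = refine_k(rows, by_score, k=2, missing=[("c", 5)])
```

In this example the original top-2 result is `a, b`, the user asks why `c` is missing, and the refined query uses `new_k == 3`, which returns `a, b, c`. Refining only k is the simplest possible refinement; a refined query could also change other parts of the original query.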
advances in geographic information systems | 2013
Wenjian Xu; Chi-Yin Chow; Jia-Dong Zhang
extending database technology | 2015
Yu Li; Eric Lo; Man Lung Yiu; Wenjian Xu