Hanxiong Chen | Researchain

Publication

Featured researches published by Hanxiong Chen.

Database and Expert Systems Applications | 2008

Efficient Bounds in Finding Aggregate Nearest Neighbors

Sansarkhuu Namnandorj; Hanxiong Chen; Kazutaka Furuse; Nobuo Ohbo

Developed from Nearest Neighbor (NN) queries, Aggregate Nearest Neighbor (ANN) queries return the object that minimizes an aggregate distance function with respect to a set of query points. Because of the multiple query points, ANN queries are much more complex than NN queries. To optimize query processing and improve query efficiency, many ANN query algorithms use pruning strategies, with or without an index structure. The pruning effect depends heavily on the tightness of the bound estimation. In this paper, we identify a property of vector spaces and develop efficient bound estimations for the two most popular types of ANN queries. Based on these bounds, we design indexed and non-indexed ANN algorithms and conduct experimental studies. Our algorithms show good performance, especially for high-dimensional queries, on both real and synthetic datasets.
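To make the role of bounds in ANN pruning concrete, here is a minimal sketch in Python. It does not reproduce the bounds from the paper; it assumes Euclidean distance and uses a generic centroid bound that follows from the triangle inequality: for n query points with centroid c, the sum of distances from p is at least n·d(p, c), and the maximum distance is at least d(p, c). The function names are illustrative only.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def aggregate_distance(p, queries, agg="sum"):
    """Aggregate (sum or max) of the distances from p to every query point."""
    ds = [dist(p, q) for q in queries]
    return sum(ds) if agg == "sum" else max(ds)

def centroid(points):
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def ann_query(data, queries, agg="sum"):
    """Return (best point, its aggregate distance, number of exact evaluations).

    Assumed generic lower bound (not the paper's):
      sum aggregate: n * d(p, c) <= sum_i d(p, q_i)
      max aggregate:     d(p, c) <= max_i d(p, q_i)
    """
    c = centroid(queries)
    scale = len(queries) if agg == "sum" else 1
    best, best_d, evaluated = None, math.inf, 0
    # Visit candidates in increasing order of their lower bound.
    for p in sorted(data, key=lambda p: dist(p, c)):
        lower = scale * dist(p, c)
        if lower >= best_d:
            break  # every remaining candidate has an even larger lower bound
        evaluated += 1
        d = aggregate_distance(p, queries, agg)
        if d < best_d:
            best, best_d = p, d
    return best, best_d, evaluated
```

The tighter the lower bound, the earlier the loop stops, which is the effect the abstract attributes to bound tightness.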

Frontiers of Computer Science in China | 2013

Co-occurrence prediction in a large location-based social network

Rong-Hua Li; Jianquan Liu; Jeffrey Xu Yu; Hanxiong Chen; Hiroyuki Kitagawa

Location-based social networks (LBSNs) are at the forefront of emerging trends in social network services (SNS), since LBSN users can “check in” at the places (locations) they visit. The accurate geographical and temporal information of these check-in actions is provided by the end users' GPS-enabled mobile devices and recorded by the LBSN system. In this paper, we analyze and mine a large LBSN dataset, Gowalla, that we collected. First, we investigate the relationship between spatio-temporal co-occurrences and social ties, and the results show that co-occurrences are strongly correlated with social ties. Second, we study the problem of predicting whether two users will meet (co-occur) at a place at a given future time by exploring their check-in habits. In particular, we first introduce two new concepts, bag-of-location and bag-of-time-lag, to characterize a user’s check-in habits. Based on these bag representations, we define a similarity metric called habits similarity to measure the similarity between two users’ check-in habits. We then propose a machine learning formula for predicting co-occurrence based on social ties and habits similarities. Finally, we conduct extensive experiments on our dataset, and the results demonstrate the effectiveness of the proposed method.
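The abstract does not define the bag representations or the habits similarity metric, so the following Python sketch rests entirely on assumptions: a bag-of-location is taken to be a multiset of visited location IDs, a bag-of-time-lag a multiset of bucketed gaps between consecutive check-ins, and cosine similarity stands in for the paper's metric. All names and the one-hour bucket are hypothetical.

```python
import math
from collections import Counter

def bag_of_locations(checkins):
    """checkins: iterable of (location_id, unix_timestamp) pairs."""
    return Counter(loc for loc, _ in checkins)

def bag_of_time_lags(checkins, bucket_seconds=3600):
    """Assumed definition: gaps between consecutive check-ins, bucketed by hour."""
    times = sorted(t for _, t in checkins)
    return Counter((b - a) // bucket_seconds for a, b in zip(times, times[1:]))

def cosine(bag_a, bag_b):
    """Cosine similarity between two bags (a stand-in for 'habits similarity')."""
    dot = sum(bag_a[k] * bag_b[k] for k in bag_a.keys() & bag_b.keys())
    norm_a = math.sqrt(sum(v * v for v in bag_a.values()))
    norm_b = math.sqrt(sum(v * v for v in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical check-in histories for two users.
alice = [("cafe", 0), ("gym", 3600), ("cafe", 7200)]
bob = [("cafe", 100), ("cafe", 4000), ("park", 9000)]

location_similarity = cosine(bag_of_locations(alice), bag_of_locations(bob))
time_lag_similarity = cosine(bag_of_time_lags(alice), bag_of_time_lags(bob))
```

Scores like these, together with a social-tie indicator, could serve as input features for a co-occurrence classifier; how the paper actually combines them is not stated in the abstract.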

Knowledge and Information Systems | 2005

CVA file: an index structure for high-dimensional datasets

Jiyuan An; Hanxiong Chen; Kazutaka Furuse; Nobuo Ohbo

Similarity search is important in information-retrieval applications where objects are usually represented as vectors of high dimensionality. This paper proposes a new dimensionality-reduction technique and an indexing mechanism for high-dimensional datasets. For each data vector, the proposed technique drops the dimensions whose coordinates are less than a critical value. This flexible datawise dimensionality reduction improves indexing mechanisms for high-dimensional datasets that are skewed in all coordinates. To apply the proposed technique to information retrieval, a CVA file (compact VA file), a revised version of the VA file, is developed. With a CVA file, the size of the index files is reduced further, while the tightness of the index bounds is preserved as far as possible. The effectiveness is confirmed on synthetic and real data.

Intelligent Data Engineering and Automated Learning | 2003

Grid-Based Indexing for Large Time Series Databases

Jiyuan An; Hanxiong Chen; Kazutaka Furuse; Nobuo Ohbo; Eamonn J. Keogh

Similarity search in large time series databases is an interesting and challenging problem. Because of the high-dimensional nature of the data, the difficulties associated with the curse of dimensionality arise. The most promising solution is to use dimensionality reduction and construct a multi-dimensional index structure for the reduced data. In this work we introduce a new approach called grid-based Datawise Dimensionality Reduction (DDR), which attempts to preserve the characteristics of time series. We then apply quantization to construct an index structure. An experimental comparison with existing techniques demonstrates the utility of our approach.

Web Age Information Management | 2002

C2VA: Trim High Dimensional Indexes

Hanxiong Chen; Jiyuan An; Kazutaka Furuse; Nobuo Ohbo

Classical multi-dimensional indexes are based on data space partitioning. Their effectiveness declines because the number of indexing units grows exponentially as the number of dimensions increases, to the point where using such index structures is less effective than a linear scan of all the data. The VA-file proposed a method of coordinate approximation, observing that nearest neighbor search becomes of linear complexity in high-dimensional spaces. In this paper we propose C2VA (Clustered Compact VA) for dimensionality reduction. We find that real datasets are rarely uniformly distributed, which is the main assumption of the VA-file. Instead of approximating all dimensions, we identify the condition for skipping less important dimensions. This avoids generating a huge index file for a large, high-dimensional dataset and hence saves many I/O accesses when scanning. Moreover, we guarantee that C2VA preserves the precision of the bounds as in the VA-file, which maximizes the efficiency gain. Our experimental results confirm this.
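For readers unfamiliar with the coordinate approximation that the VA-file relies on, the following Python sketch shows the general idea only; it is not the C2VA method. It assumes coordinates in [0, 1), a uniform grid with 2^bits cells per dimension, and Euclidean distance. Each vector is replaced by its cell indices, and the cell boundaries give a lower and an upper bound on the distance to a query.

```python
import math

def approximate(vec, bits=2):
    """Quantize each coordinate in [0, 1) to the index of its grid cell."""
    cells = 1 << bits
    return tuple(min(int(x * cells), cells - 1) for x in vec)

def bounds(query, approx, bits=2):
    """Lower and upper bounds on the distance from query to any vector in the cell."""
    cells = 1 << bits
    lo_sq = hi_sq = 0.0
    for q, c in zip(query, approx):
        low, high = c / cells, (c + 1) / cells
        if q < low:
            near = low - q
        elif q > high:
            near = q - high
        else:
            near = 0.0  # the query coordinate falls inside the cell
        far = max(abs(q - low), abs(q - high))
        lo_sq += near ** 2
        hi_sq += far ** 2
    return math.sqrt(lo_sq), math.sqrt(hi_sq)

def nearest_neighbor(data, query, bits=2):
    """Two-phase search: filter on approximations, then refine on exact vectors."""
    all_bounds = [bounds(query, approximate(v, bits), bits) for v in data]
    # Phase 1: a vector whose lower bound exceeds the smallest upper bound
    # cannot be the nearest neighbor.
    threshold = min(hi for _, hi in all_bounds)
    candidates = [i for i, (lo, _) in enumerate(all_bounds) if lo <= threshold]
    # Phase 2: compute exact distances only for the surviving candidates.
    best = min(candidates, key=lambda i: math.dist(data[i], query))
    return best, len(candidates)
```

In a real VA-file the approximations are stored compactly on disk and scanned sequentially; the sketch keeps everything in memory to show only the filter-and-refine logic.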

Advanced Data Mining and Applications | 2011

Indexing expensive functions for efficient multi-dimensional similarity search

Hanxiong Chen; Jianquan Liu; Kazutaka Furuse; Jeffrey Xu Yu; Nobuo Ohbo

Similarity search is important in information retrieval applications where objects are usually represented as vectors of high dimensionality. This leads to an increasing need for indexing high-dimensional data. On the other hand, indexing structures based on space partitioning are ineffective because of the well-known “curse of dimensionality”. A linear scan of the data with approximation is more efficient for high-dimensional similarity search. However, approaches so far have concentrated on reducing I/O and have ignored the computation cost. For an expensive distance function such as the Lp norm with fractional p, the computation cost becomes the bottleneck. We propose a new technique that addresses expensive distance functions by “indexing the function”: some key values of the function are pre-computed once and then used to derive upper and lower bounds on the distance between a data vector and the query vector. The technique is extremely efficient since it avoids most of the distance function computations; moreover, it requires no extra secondary storage because no index is constructed and stored. The efficiency is confirmed by cost analysis, as well as experiments on synthetic and real data.
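One way to picture "indexing the function" is a lookup table for t^p. The Python sketch below is an illustration under assumptions, not the paper's construction: coordinates lie in [0, 1], the table holds t^p at evenly spaced points, and because t^p is increasing for p > 0, the two table entries surrounding |x_i - q_i| bound its p-th power without calling the power function at query time. The table size of 64 is a hypothetical choice; a power of two keeps the grid arithmetic exact in floating point.

```python
def build_table(p, steps=64):
    """Pre-compute t**p once at the grid points t = k / steps, k = 0..steps."""
    return [(k / steps) ** p for k in range(steps + 1)]

def bound_sums(x, q, table):
    """Bounds on sum_i |x_i - q_i|**p using only table lookups.

    Assumes every coordinate difference lies in [0, 1].
    """
    steps = len(table) - 1
    lo = hi = 0.0
    for a, b in zip(x, q):
        diff = abs(a - b)
        k = min(int(diff * steps), steps - 1)  # grid cell containing diff
        lo += table[k]        # value at the left edge of the cell
        hi += table[k + 1]    # value at the right edge of the cell
    return lo, hi

def exact_sum(x, q, p):
    """The expensive computation the bounds are meant to avoid."""
    return sum(abs(a - b) ** p for a, b in zip(x, q))
```

During a scan, a vector whose lower bound already exceeds the best upper bound seen so far can be skipped, so the exact sum is needed only for the remaining candidates. Comparing the sums directly is enough for ranking, since taking the 1/p-th root preserves order.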

Database and Expert Systems Applications | 2010

An efficient algorithm for reverse furthest neighbors query with metric index

Jianquan Liu; Hanxiong Chen; Kazutaka Furuse; Hiroyuki Kitagawa

Variants of similarity queries, such as k-nearest neighbors (k-NN), range queries, and reverse nearest neighbors (RNN), have been widely studied in the recent decade. The reverse furthest neighbor (RFN) query is now attracting more attention because of its applicability. Given an object set O and a query object q, the RFN query retrieves the objects of O that take q as their furthest neighbor. Yao et al. proposed R-tree based algorithms to handle the RFN query using Voronoi diagrams and the convex hull property of the dataset. However, computing the convex hull and executing range queries on an R-tree on the fly is very expensive. In this paper, we propose an efficient algorithm for the RFN query with a metric index. We also adapt the convex hull property to enhance efficiency, but it is not computed on the fly. We select external pivots to construct metric indexes and employ the triangle inequality to prune efficiently with them. Experimental evaluations on both synthetic and real datasets confirm the efficiency and scalability.
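The RFN definition and the use of pivots with the triangle inequality can be sketched as follows. This Python example is a generic single-pivot filter written for illustration, not the algorithm from the paper, and it omits the convex hull step. It relies on the fact that for any pivot, d(o, o') >= |d(o, pivot) - d(o', pivot)|, so precomputed pivot distances give a lower bound on how far the furthest object in O is from o. Euclidean distance, the function names, and the treatment of ties (q counts as a furthest neighbor when tied) are assumptions.

```python
import math

def rfn_brute_force(objects, q):
    """Objects of O for which no other object in O is further away than q."""
    result = []
    for o in objects:
        dq = math.dist(o, q)
        if all(math.dist(o, other) <= dq for other in objects):
            result.append(o)
    return result

def rfn_with_pivot(objects, q, pivot):
    """Same answer as the brute-force version, with a pivot-based filter."""
    # Pre-computed once: the distance from every object to the pivot.
    dp = [math.dist(o, pivot) for o in objects]
    dp_min, dp_max = min(dp), max(dp)
    result, verified = [], 0
    for o, d in zip(objects, dp):
        dq = math.dist(o, q)
        # Triangle inequality: some object in O is at least this far from o.
        lower = max(d - dp_min, dp_max - d)
        if lower > dq:
            continue  # q cannot be the furthest neighbor of o
        verified += 1
        if all(math.dist(o, other) <= dq for other in objects):
            result.append(o)
    return result, verified
```

For example, with objects (0, 0), (1, 0), (0, 1), (5, 5), query q = (-3, -3), and an external pivot at (10, 10), only (5, 5) survives the filter and needs the full verification scan.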

Australasian Database Conference | 2002

The convex polyhedra technique: an index structure for high-dimensional space

Jiyuan An; Hanxiong Chen; Kazutaka Furuse; Masahiro Ishikawa; Nobuo Ohbo

This paper proposes a new dimensionality reduction technique and an indexing mechanism for high-dimensional data sets in which data points are not uniformly distributed. The proposed technique decomposes a data space into convex polyhedra, and the dimensionality of each data point is reduced according to which polyhedron includes the data point. One of the advantages of the proposed technique is that it reduces the dimensionality locally. This local dimensionality reduction contributes to improving indexing mechanisms for non-uniformly distributed data sets. To show the applicability and the effectiveness of the proposed technique, this paper describes a new indexing mechanism called CVA-file (Compact VA-File), which is a revised version of the VA-file. With the proposed dimensionality reduction technique, the size of data points stored in index files can be reduced. Furthermore, it can estimate upper and lower bounds of each entry in index files by using geographic properties of convex polyhedra. Results from experimental simulations show that the CVA-file is better than the VA-file for non-uniformly distributed real data sets.

Information Processing Letters | 1992

Decomposition—an approach for optimizing queries including ADT functions

Hanxiong Chen; Xu Yu; Kazunori Yamaguchi; Hiroyuki Kitagawa; Nobuo Ohbo; Yuzuru Fujiwara

In order to extend the database application to CAD/CAM and other engineering areas, several systems have employed ADTs [&lo]. This extension introduces a new dimension to query optimization. Conventional query processing focuses on the cost of input and output between main and secondary memory. In contrast, to process queries involving ADT functions, the computing cost of the ADT functions must be taken into consideration, since it is often the case that they dominate the I/O cost. For example, in a chemical DBMS, to represent the structure of chemical compounds, the support of ADT “GRAPH” is necessary. Accordingly, ADT functions such as displaying a graph, calculating the number of nodes of a graph, and comparing two graphs, etc., are required. Suppose that an ADT function isomorphic is defined to determine isomorphism between two graphs. The function isomorphic is

IAENG Transactions on Engineering Technologies Volume I: Special Edition of the International MultiConference of Engineers and Computer Scientists 2008 | 2009

Figure and Ground: A Complete Approach to Outlier Detection

Ching‐an Hsiao; Hanxiong Chen; Kazutaka Furuse; Nobuo Ohbo

Outlier detection is an important problem in various fields. An unsatisfying point is that the definitions seem vague, which makes the problem an ad hoc one. We present a supplementary definition to clarify the meaning. We then develop an efficient D algorithm, which converts the outlier problem into a pattern and relative deviation degree (RDD) problem. Finally, we present a new mechanism to distinguish outliers from the remainder in a univariate dataset. Experimental results on synthetic and real datasets show the efficiency of the complete solution.