Xiaoguang Gu
Chinese Academy of Sciences
Publication
Featured research published by Xiaoguang Gu.
acm multimedia | 2013
Lei Zhang; Yongdong Zhang; Jinhui Tang; Xiaoguang Gu; Jintao Li; Qi Tian
Binary hashing has been widely used for efficient similarity search. Learning efficient codes has become a research focus, and it remains a challenge. In many cases, real-world data lie on a low-dimensional manifold, which should be taken into account to capture meaningful neighbors with hashing. The importance of a manifold is its topology, which represents the neighborhood relationships between its subregions and the relative proximities between the neighbors of each subregion, e.g., the relative ranking of neighbors of each subregion. Most existing hashing methods try to preserve the neighborhood relationships by mapping similar points to close codes, while ignoring the neighborhood rankings. Moreover, most hashing methods fail to provide a good ranking for query results because they use Hamming distance as the similarity metric, and in practice many results often share the same distance to a query. In this paper, we propose a novel hashing method to solve these two issues jointly. The proposed method is referred to as Topology Preserving Hashing (TPH). TPH is distinct from prior works in that it preserves the neighborhood rankings of data points in Hamming space. The learning stage of TPH is formulated as a generalized eigendecomposition problem with closed-form solutions. Experimental comparisons with other state-of-the-art methods on three well-known image benchmarks demonstrate the efficacy of the proposed method.
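The ranking problem mentioned in the abstract (many results sharing the same Hamming distance to a query) can be illustrated with a minimal sketch. The 4-bit codes and item names below are hypothetical values chosen for illustration; this is not the TPH method itself, only a demonstration that a b-bit code allows at most b + 1 distinct distance values.

```python
from collections import Counter

def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary codes stored as ints."""
    return bin(a ^ b).count("1")

# Hypothetical 4-bit codes for a tiny database (illustrative values only).
database = {"a": 0b0001, "b": 0b0010, "c": 0b0100, "d": 0b0111, "e": 0b1111}
query = 0b0000

# Distance of every database item to the query.
distances = {name: hamming(query, code) for name, code in database.items()}

# How many items share each distance value; counts above 1 are ties
# that Hamming distance alone cannot rank.
ties = Counter(distances.values())
```

Here items a, b, and c all sit at distance 1 from the query, so their relative order is undefined under Hamming distance alone.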
IEEE Transactions on Image Processing | 2014
Lei Zhang; Yongdong Zhang; Xiaoguang Gu; Jinhui Tang; Qi Tian
Hashing-based similarity search techniques are becoming increasingly popular for large data sets. To capture meaningful neighbors, the topology of a data set, which represents the neighborhood relationships between its subregions and the relative proximities between the neighbors of each subregion, e.g., the relative neighborhood ranking of each subregion, should be exploited. However, most existing hashing methods are developed to preserve neighborhood relationships while ignoring the relative neighborhood proximities. Moreover, most hashing methods fail to provide a good result ranking, since many results often share the same Hamming distance to a query. In this paper, we propose a novel hashing method to solve these two issues jointly. The proposed method is referred to as topology preserving hashing (TPH). TPH is distinct from prior works in that it also preserves the neighborhood ranking. Based on this framework, we present three different TPH methods: linear unsupervised TPH, semisupervised TPH, and kernelized TPH. In particular, our unsupervised TPH is capable of mining semantic relationships between unlabeled data without supervised information. Extensive experiments on four large data sets demonstrate the superior performance of the proposed methods over several state-of-the-art unsupervised and semisupervised hashing techniques.
IEEE Transactions on Multimedia | 2011
Wenying Wang; Dongming Zhang; Yongdong Zhang; Jintao Li; Xiaoguang Gu
Spatial matching for object retrieval is often time-consuming and susceptible to viewpoint changes. To address this problem, we propose a novel spatial matching method that is robust to viewpoint changes and implement it in parallel on a modern graphics processing unit (GPU) for real-time applications. Unlike previous spatial matching methods used in object retrieval, in which the affine transformation estimation is based on the gravity vector assumption, our method abandons this strong assumption by matching the affine covariant neighbors (ACNs) of corresponding local regions and estimating the affine transformation from each single pair of corresponding local regions. With real-time applications in mind, we implement the method in parallel on a modern GPU to speed up the process. Computations are distributed evenly to threads with load balancing, and device memory accesses are optimized with a bitmap-based parallel scan. Experimental results demonstrate that our method is more robust and more efficient than previous methods, especially when the viewpoint changes, and that the parallel GPU implementation achieves a tenfold speedup.
visual communications and image processing | 2014
Renhao Zhou; Qingsheng Yuan; Xiaoguang Gu; Dongming Zhang
In recent years, VLAD has become a popular method for encoding powerful local descriptors into compact representations. With this approach, an image can be represented by just a few dozen bytes while preserving excellent retrieval results after dimensionality reduction and compression. However, discarding spatial information is one of the biggest weaknesses of VLAD. This paper adopts the spatial pyramid pooling method to incorporate spatial information into the VLAD vectors. Furthermore, a new normalization method is proposed to retain this advantage. With the proposed method, the performance of VLAD can be boosted by incorporating spatial information. The experimental results show that our approach outperforms VLAD in almost all configurations.
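As a rough illustration of the idea in this abstract, the sketch below computes a standard VLAD vector (per-centroid sums of residuals, concatenated and L2-normalized) and then concatenates one VLAD per cell of a grid split of the image. The function names, the 2 x 2 grid default, and the per-cell L2 normalization are assumptions made for this example; the paper's own normalization method is not described in the abstract and is not reproduced here.

```python
import math

def vlad(descriptors, centroids):
    """Aggregate local descriptors into one L2-normalized VLAD vector."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    for d in descriptors:
        # Index of the nearest centroid by squared Euclidean distance.
        k = min(range(len(centroids)),
                key=lambda i: sum((x - c) ** 2 for x, c in zip(d, centroids[i])))
        # Accumulate the residual (descriptor minus its centroid).
        for j in range(dim):
            sums[k][j] += d[j] - centroids[k][j]
    flat = [v for row in sums for v in row]
    norm = math.sqrt(sum(v * v for v in flat))
    # An all-zero vector (e.g. an empty cell) is returned unnormalized.
    return [v / norm for v in flat] if norm > 0 else flat

def pyramid_vlad(points, descriptors, centroids, width, height, grid=2):
    """Whole-image VLAD followed by one VLAD per cell of a grid x grid split.

    Assumption for illustration: each cell is normalized independently.
    """
    cells = [[] for _ in range(grid * grid)]
    for (x, y), d in zip(points, descriptors):
        col = min(int(x * grid / width), grid - 1)
        row = min(int(y * grid / height), grid - 1)
        cells[row * grid + col].append(d)
    out = vlad(descriptors, centroids)
    for cell in cells:
        out.extend(vlad(cell, centroids))
    return out
```

With K centroids of dimension D, the pyramid vector has (1 + grid * grid) * K * D entries, which shows why dimensionality reduction usually follows.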
international conference on acoustics, speech, and signal processing | 2013
Xiaoguang Gu; Dongming Zhang; Yongdong Zhang; Jintao Li; Lei Zhang
This paper presents a novel algorithm for fast and robust video copy detection. The idea is to use local features to estimate the copy transformation parameters first and then use the estimated parameters to guide global-feature-based matching at a later stage. It is based on the fact that the copy transformations generally remain unchanged throughout a continuous video clip, or even throughout the whole video. Local-feature-based matching can find candidates that are difficult to detect using global features alone. Furthermore, the matched local feature points provide enough information to estimate the copy transformations. After the copy transformations are estimated, the subsequent detection can be accelerated by using global-feature-based matching. The experimental results show that the proposed algorithm achieves the same robustness as the local-feature-based method but with faster detection.
international conference on multimedia and expo | 2010
Wenying Wang; Dongming Zhang; Yongdong Zhang; Jintao Li; Xiaoguang Gu
Spatial matching for object retrieval is often time-consuming and susceptible to viewpoint changes. To address this problem, we propose a novel spatial matching method and implement it in parallel on a modern GPU. Unlike previous spatial matching methods, in which the affine transformation estimation is based on the gravity vector assumption, our method abandons this strong assumption by matching the ACNs (affine covariant neighbors) of corresponding local regions and estimating the affine transformation from a single pair of corresponding local regions. To speed up the process, we implement the method in parallel on a modern GPU. Computations are distributed evenly to threads with load balancing, memory accesses are optimized, and a bitmap-based parallel scan is exploited. Experimental results demonstrate that our method is more robust and more efficient than previous methods, especially when the viewpoint changes, and that the parallel GPU implementation achieves a tenfold speedup.
international conference on multimedia retrieval | 2014
Tiancai Ye; Dongming Zhang; Guoqing Jin; Ke Gao; Xiaoguang Gu; Yongdong Zhang
In this paper, a simple and effective method is proposed for salient region detection. Based on the observation that salient regions tend to be compact, connected, and surrounded, our original idea is to exploit these three kinds of prior knowledge. However, concepts of spatial structure (such as connectivity and surroundedness) only have definite meanings in binary images. Thus, a Monte Carlo sampling based saliency model is proposed. Our model has two main advantages over other methods. First, the result of each sampling process is a binary map, which greatly simplifies the combination with prior knowledge of spatial structure. Second, our method is naturally parallel because every sampling process is independent of the others, which makes it very efficient. Experimental results on two datasets show that, compared with eleven state-of-the-art methods, our approach achieves competitive performance and also runs very fast.
international conference on multimedia and expo | 2014
Xiaoguang Gu; Yongdong Zhang; Dongming Zhang; Jintao Li
Local features have been widely used in many areas of computer vision research, such as near-duplicate image and video retrieval. However, the storage and query costs of local features become prohibitive on large-scale databases. In this paper, we propose a representative local feature mining method to generate a compact but more effective feature subset. First, we perform an unsupervised annotation of all similar images (or frames in video) in the database. Second, we compute a comprehensive score for every local feature. The score function combines robustness and discrimination. Finally, we sort all the local features in an image by their scores, and the low-scoring local features can be removed. The selected local features are robust and discriminative, which yields better retrieval quality than using the full original feature set. With our method, the number of local features can be significantly reduced, saving a large amount of storage and computational cost. The experimental results show that we can use 30% of the features to get better query performance than with the full feature set.
international conference on systems | 2012
Dongming Zhang; Gang Cao; Xiaoguang Gu
Motion estimation is generally regarded as the most time-consuming module in video coding, and many fast motion estimation algorithms have been proposed to speed it up. However, the fact that motion regions need a more complex search whereas still regions do not is often ignored. In addition, analysis of the distribution of motion vector differences shows that the predicted motion vector is very close to the real motion vector. In the improved motion estimation algorithm proposed here, motion regions are first identified using improved visual rhythm analysis, and then an efficient search scheme, named search center adaptive motion estimation, is carried out according to the macroblock (MB) motion. In the simulations, the algorithm is verified on the JM7.3 platform. The results show that the search scheme greatly speeds up the motion estimation of still MBs and also eliminates about 25% of the integer-pixel search process for motion MBs. The encoding performance loss is negligible for low-motion video and minor for video sequences with complex motion.
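The observation that the predicted motion vector lies close to the real one suggests centering a small search window on the prediction. The sketch below shows plain block matching with a sum-of-absolute-differences (SAD) cost and a configurable search center. It is a generic illustration under assumed names (`sad`, `search`) and frame layout (lists of pixel rows), not the paper's search center adaptive scheme or its visual rhythm analysis.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block at (bx, by) in cur and the block
    displaced by (dx, dy) in ref."""
    return sum(abs(cur[by + r][bx + c] - ref[by + dy + r][bx + dx + c])
               for r in range(n) for c in range(n))

def search(cur, ref, bx, by, n, center, radius):
    """Full search in a (2*radius+1)^2 window centered on `center`.

    `center` is a predicted motion vector (dx, dy). Returns the pair
    (best_cost, (dx, dy)), or None if no candidate lies inside the frame.
    """
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(center[1] - radius, center[1] + radius + 1):
        for dx in range(center[0] - radius, center[0] + radius + 1):
            # Skip candidates whose reference block falls outside the frame.
            if not (0 <= bx + dx and bx + dx + n <= w
                    and 0 <= by + dy and by + dy + n <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best
```

When the prediction is accurate, a radius of 1 around it finds the true displacement with 9 candidates, whereas a window centered on (0, 0) would need a larger radius to reach the same vector.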
visual communications and image processing | 2011
Lei Zhang; Xiaoguang Gu; Yongdong Zhang; Dongming Zhang; Jintao Li
In recent years, Locality Sensitive Hashing (LSH), along with its variant Euclidean LSH, has become a popular index structure for large-scale, high-dimensional similarity search problems. In this paper, we analyze a phenomenon we call “Non-Uniform” that degrades the query performance of LSH and propose a pivot-based algorithm to improve the query performance. We also provide a method to obtain an optimal pivot for even larger improvement. Experiments show that our algorithm significantly improves the query performance of LSH.
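For context, a minimal sketch of the standard Euclidean LSH hash family, h(v) = floor((a . v + b) / w), is given below, with a drawn from a Gaussian distribution and b drawn uniformly from [0, w]. The helper names and parameter values are assumptions for illustration, and the pivot-based algorithm proposed in the paper is not reproduced here.

```python
import math
import random

def make_hash(dim, w, rng):
    """Return one Euclidean LSH function h(v) = floor((a . v + b) / w)."""
    a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # random projection direction
    b = rng.uniform(0.0, w)                        # random offset in [0, w]
    def h(v):
        return math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
    return h

def build_table(points, hashes):
    """Group point indices into buckets keyed by their tuple of hash values."""
    table = {}
    for idx, p in enumerate(points):
        key = tuple(h(p) for h in hashes)
        table.setdefault(key, []).append(idx)
    return table

def bucket_sizes(table):
    """Bucket occupancy, largest first, to inspect how evenly points spread."""
    return sorted((len(v) for v in table.values()), reverse=True)
```

A usage sketch would create a seeded `random.Random`, build a few hash functions with `make_hash`, index the points with `build_table`, and look at `bucket_sizes` to see how unevenly the buckets are filled for a given data set.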