Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Hongtao Xie is active.

Publication


Featured research published by Hongtao Xie.


IEEE Transactions on Multimedia | 2011

Efficient Feature Detection and Effective Post-Verification for Large Scale Near-Duplicate Image Search

Hongtao Xie; Ke Gao; Yongdong Zhang; Sheng Tang; Jintao Li; Yizhi Liu

State-of-the-art near-duplicate image search systems mostly build on the bag-of-local-features (BOF) representation. While favorable for simplicity and scalability, these systems have three shortcomings: 1) high time complexity of the local feature detection; 2) reduced discriminability of local descriptors due to BOF quantization; and 3) neglect of the geometric relationships among local features after BOF representation. To overcome these shortcomings, we propose a novel framework using graphics processing units (GPUs). The main contributions of our method are: 1) a new fast local feature detector, coined Harris-Hessian (H-H), designed according to the characteristics of GPUs to accelerate local feature detection; 2) the spatial information around each local feature is incorporated to improve its discriminability, yielding a semi-local spatial coherence verification (LSC); and 3) a new pairwise weak geometric consistency constraint (P-WGC) algorithm to refine the search result. Additionally, part of the system is implemented on the GPU to improve efficiency. Experiments conducted on reference datasets and a dataset of one million images demonstrate the effectiveness and efficiency of H-H, LSC, and P-WGC.
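The BOF representation this abstract starts from can be sketched in a few lines: each local descriptor is hard-assigned to its nearest visual word in a codebook, and the image becomes a normalized word histogram. This quantization step is where shortcoming 2, the loss of descriptor discriminability, arises. A minimal illustration with toy data, not the paper's implementation:

```python
import numpy as np

def bof_histogram(descriptors, codebook):
    """Quantize each local descriptor to its nearest visual word
    and return the normalized bag-of-features histogram."""
    # Squared Euclidean distance from every descriptor to every word
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)                       # hard assignment
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / max(hist.sum(), 1.0)              # L1-normalize

# Toy data: 6 random "descriptors" and a 4-word "codebook"
rng = np.random.default_rng(0)
desc = rng.normal(size=(6, 8))
book = rng.normal(size=(4, 8))
h = bof_histogram(desc, book)
```

In a real system the codebook comes from k-means over millions of descriptors, which is exactly why nearby descriptors can land in different words and lose discriminability.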


IEEE Transactions on Multimedia | 2014

Contextual Query Expansion for Image Retrieval

Hongtao Xie; Yongdong Zhang; Jianlong Tan; Li Guo; Jintao Li

In this paper, we study the problem of image retrieval by introducing contextual query expansion to address two shortcomings of bag-of-words based frameworks: the semantic gap of visual word quantization, and the efficiency and storage overhead of query expansion. Our method is built on common visual patterns (CVPs), which are the distinctive visual structures shared between two images and carry rich contextual information. With CVPs, two contextual query expansions are explored, at the visual-word level and the image level respectively. For visual-word-level expansion, we find contextual synonymous visual words (CSVWs) and expand each word in the query image with its CSVWs to boost retrieval accuracy. CSVWs are words that appear in the same CVPs and have the same contextual meaning, i.e., similar spatial layout and geometric transformations. For image-level expansion, database images that share the same CVPs are organized in linked lists, and images that share CVPs with the query image but are not included in the results are automatically expanded. The main computation of both expansions is carried out offline, and they can be integrated into the inverted file and efficiently applied to all images in the dataset. Experiments conducted on three reference datasets and a dataset of one million images demonstrate the effectiveness and efficiency of our method.
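The word-level expansion described above amounts to an inverted-file lookup plus an offline synonym map. A toy sketch of the idea, with hypothetical word ids, image ids, and a plain overlap score (not the paper's code or data structures):

```python
from collections import defaultdict

def build_inverted_file(db_images):
    """db_images: {image_id: set of visual-word ids} -> word -> image ids."""
    inv = defaultdict(set)
    for img, words in db_images.items():
        for w in words:
            inv[w].add(img)
    return inv

def query(words, inv, synonyms):
    """Expand each query word with its contextual synonyms (built offline),
    then score database images by overlap with the expanded word set."""
    expanded = set(words)
    for w in words:
        expanded |= synonyms.get(w, set())
    scores = defaultdict(int)
    for w in expanded:
        for img in inv.get(w, ()):
            scores[img] += 1
    return sorted(scores, key=scores.get, reverse=True)

db = {"A": {1, 2, 3}, "B": {3, 4}, "C": {5}}
inv = build_inverted_file(db)
syn = {2: {4}}                  # word 4 treated as a synonym of word 2
ranked = query({1, 2}, inv, syn)
```

Because the synonym map is precomputed, expansion at query time costs only extra inverted-list lookups, which is the efficiency point the abstract makes.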


Journal of Visual Communication and Image Representation | 2013

Robust common visual pattern discovery using graph matching

Hongtao Xie; Yongdong Zhang; Ke Gao; Sheng Tang; Kefu Xu; Li Guo; Jintao Li

Discovering common visual patterns (CVPs) between two images is a difficult and time-consuming task, due to photometric and geometric transformations. The state-of-the-art methods for CVP discovery are either computationally expensive or impose complicated constraints. In this paper, we formulate CVP discovery as a graph matching problem based on pairwise geometric compatibility between feature correspondences. To efficiently find all CVPs, we propose a novel framework consisting of three components: Preliminary Initialization Optimization (PIO), Guided Expansion (GE), and Post Agglomerative Combination (PAC). PIO obtains the initial CVPs and reduces the search space of CVP discovery, based on the internal homogeneity of CVPs. Then, GE anchors on these initializations and gradually expands them to find more correct correspondences. Finally, to reduce false and missed detections, PAC refines the discovery result in an agglomerative way. Experiments and applications conducted on benchmark datasets demonstrate the effectiveness and efficiency of our method.
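The pairwise geometric compatibility that drives this graph matching formulation can be illustrated simply: two candidate correspondences are compatible when the distance between their points is preserved across the two images. A minimal sketch with toy coordinates and an assumed Gaussian kernel (the bandwidth `sigma` is an illustrative choice, not from the paper):

```python
import numpy as np

def compatibility_matrix(p1, p2, sigma=0.5):
    """p1[i] <-> p2[i] are candidate correspondences between two images.
    W[i, j] is high when the distance between points i and j is preserved
    across the images (pairwise geometric compatibility)."""
    n = len(p1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d1 = np.linalg.norm(p1[i] - p1[j])
            d2 = np.linalg.norm(p2[i] - p2[j])
            W[i, j] = np.exp(-((d1 - d2) ** 2) / (2 * sigma ** 2))
    return W

# Three correspondences related by a pure translation (mutually consistent)
# plus one outlier match
src = np.array([[0., 0.], [1., 0.], [0., 1.], [5., 5.]])
dst = np.array([[2., 3.], [3., 3.], [2., 4.], [0., 0.]])
W = compatibility_matrix(src, dst)
```

In a graph matching view, correspondences are nodes and W weights the edges; a CVP then appears as a strongly connected cluster, while the outlier row stays near zero.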


International Conference on Multimedia Retrieval | 2011

Pairwise weak geometric consistency for large scale image search

Hongtao Xie; Ke Gao; Yongdong Zhang; Jintao Li; Yizhi Liu

State-of-the-art image search systems mostly build on the bag-of-features (BOF) representation. As BOF ignores geometric relationships among local features, geometric consistency constraints have been proposed to improve search precision. However, exploiting full geometric constraints is too computationally expensive, while weak geometric constraints rest on strong assumptions and can only deal with uniform transformations. To handle viewpoint changes and nonrigid deformations, in this paper we present a novel pairwise weak geometric consistency constraint (P-WGC) method. It exploits the local similarity characteristic of deformations and measures the pairwise geometric similarity of matches between two sets of local features. Experiments performed on four well-known datasets and a dataset of one million images show a significant improvement due to P-WGC, as well as its efficiency. Further improvement in search accuracy is obtained when it is combined with full geometric verification.
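For contrast, the weak geometric consistency (WGC) baseline that P-WGC extends can be sketched as a vote over per-match scale and orientation differences: matches agreeing on one global transformation pile up in the same bin, which is exactly why plain WGC handles only uniform transformations. A toy illustration (bin counts and layout are assumptions, not the paper's):

```python
import numpy as np

def wgc_score(scale_ratios, angle_diffs, n_bins=8):
    """Weak-geometric-consistency-style vote: each tentative match votes
    for its (log scale ratio, orientation difference) bin; the score is
    the size of the largest mutually consistent group."""
    log_s = np.log2(np.asarray(scale_ratios))
    ang = np.asarray(angle_diffs) % (2 * np.pi)
    s_bins = np.clip(np.round(log_s).astype(int) + n_bins // 2, 0, n_bins - 1)
    a_bins = (ang / (2 * np.pi) * n_bins).astype(int) % n_bins
    hist = np.zeros((n_bins, n_bins), dtype=int)
    for sb, ab in zip(s_bins, a_bins):
        hist[sb, ab] += 1
    return int(hist.max())

# Five matches: four agree on roughly the same scale and rotation,
# one is an outlier
score = wgc_score([1.0, 1.1, 0.95, 1.05, 4.0],
                  [0.1, 0.12, 0.08, 0.11, 3.0])
```

Under a nonrigid deformation the per-match parameters no longer agree globally, so the vote collapses; P-WGC's pairwise, locally measured similarity is designed to survive exactly that case.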


ACM Multimedia | 2011

Common visual pattern discovery via graph matching

Hongtao Xie; Ke Gao; Yongdong Zhang; Jintao Li; Huamin Ren

Discovering common visual patterns (CVPs) between two images is a challenging problem, due to significant photometric and geometric transformations and the high computational cost. In this paper, we formulate CVP discovery as a graph matching problem based on pairwise geometric compatibility between feature correspondences. To efficiently find all CVPs, we propose two algorithms: Preliminary Initialization Optimization (PIO) and Post Agglomerative Combining (PAC). PIO reduces the search space of CVP discovery based on the internal homogeneity of CVPs, while PAC refines the discovery result in an agglomerative way. Experiments on object recognition and near-duplicate image retrieval validate the effectiveness and efficiency of our method.


Multimedia Tools and Applications | 2017

Detecting Uyghur text in complex background images with convolutional neural network

Shancheng Fang; Hongtao Xie; Zhineng Chen; Shiai Zhu; Xiaoyan Gu; Xingyu Gao

Uyghur text detection is crucial to a variety of real-world applications, yet little research has addressed it. In this paper, we develop an effective and efficient region-based convolutional neural network for Uyghur text detection in complex background images. The characteristics of the network include: (1) three region proposal networks, which simultaneously utilize feature maps from different convolutional layers, are used to improve recall; (2) the overall architecture is a fully convolutional network, with global average pooling replacing the fully connected layers in the classification and bounding-box regression heads; and (3) to fully utilize baseline information, Uyghur text lines are detected directly by the network in an end-to-end fashion. Experimental results on a benchmark dataset show that our method achieves an F-measure of 0.83 and a detection time of 0.6 s per image on a single K20c GPU, which is much faster than the state-of-the-art methods while keeping competitive accuracy.
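Characteristic (2), replacing fully connected layers with global average pooling, is easy to illustrate: each channel's spatial activations are averaged into a single number, so no per-position weights are needed and the network stays fully convolutional. A minimal NumPy sketch (shapes are illustrative):

```python
import numpy as np

def global_average_pool(feature_maps):
    """Collapse each C x H x W feature map to a length-C vector by
    averaging over all spatial positions -- no learned weights needed,
    and the input spatial size can vary."""
    return feature_maps.mean(axis=(-2, -1))

# One "image": 3 channels of 5 x 5 activations
fmap = np.arange(75, dtype=float).reshape(3, 5, 5)
pooled = global_average_pool(fmap)
```

Compared with a fully connected layer, this removes the bulk of the parameters and any dependence on a fixed input resolution, which is what makes the speed/size claims above plausible.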


International Conference on Acoustics, Speech, and Signal Processing | 2010

GPU-based fast scale invariant interest point detector

Hongtao Xie; Ke Gao; Yongdong Zhang; Jintao Li; Yizhi Liu

To take full advantage of the powerful computing capability of graphics processing units (GPUs) to speed up local feature detection, we present a novel GPU-based scale-invariant interest point detector, coined Harris-Hessian (H-H). H-H detects Harris points at a low scale and refines their location and scale in higher scale-space with the determinant of the Hessian matrix. Compared to existing methods, H-H significantly reduces pixel-level computational complexity and has better parallelism. Experimental results show that, with the assistance of the GPU, H-H achieves up to a 10-20x speedup over the CPU-based method. It takes only 6.3 ms to process a 640 × 480 image with high detection accuracy, meeting the need for real-time detection.
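The Harris part of the H-H detector rests on the standard corner measure det(M) - k·trace(M)², computed from a windowed structure tensor of the image gradients. A plain CPU NumPy sketch of that measure (the GPU parallelization and the Hessian-based scale refinement are not shown):

```python
import numpy as np

def box_sum(a, r=1):
    """Sum each pixel's (2r+1) x (2r+1) neighborhood via padding and shifts."""
    p = np.pad(a, r)
    out = np.zeros_like(a)
    h, w = a.shape
    for dy in range(2 * r + 1):
        for dx in range(2 * r + 1):
            out += p[dy:dy + h, dx:dx + w]
    return out

def harris_response(img, k=0.04):
    """Harris corner measure det(M) - k * trace(M)^2, where M is the
    windowed structure tensor built from the image gradients."""
    Iy, Ix = np.gradient(img.astype(float))
    Sxx = box_sum(Ix * Ix)
    Syy = box_sum(Iy * Iy)
    Sxy = box_sum(Ix * Iy)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2

# Bright square on a dark background: corner pixels respond strongly,
# the flat interior responds with ~0
img = np.zeros((16, 16))
img[4:12, 4:12] = 1.0
R = harris_response(img)
```

Every pixel's response depends only on its own small neighborhood, which is the property that makes the computation map well onto GPU threads.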


European Conference on Computer Vision | 2010

Effective and efficient image copy detection based on GPU

Hongtao Xie; Ke Gao; Yongdong Zhang; Jintao Li; Yizhi Liu; Huamin Ren

To improve the accuracy and efficiency of image copy detection, a novel system is proposed based on graphics processing units (GPUs). We combine two complementary local features, Harris-Laplace and SURF, to provide a compact representation of an image. By using complementary features, the image is better covered and the detection accuracy becomes less dependent on the actual image content. Moreover, the ordinal measure (OM) is applied as a semi-local spatial coherence verification. To improve time performance, local feature generation and OM calculation are implemented on the GPU through NVIDIA CUDA. Experiments show that our system achieves a 15% precision improvement over the baseline Hamming embedding approach. Compared to the CPU-based method, the GPU implementation reaches up to a 30-40x speedup, achieving real-time performance.
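The ordinal measure (OM) used for verification can be sketched as follows: partition a region into blocks, average each block, and compare the rank order of the block intensities, which is invariant to monotonic intensity changes such as brightness or contrast shifts. A toy version (grid size and the L1 rank distance are illustrative choices, not necessarily the paper's):

```python
import numpy as np

def ordinal_measure(img, grid=3):
    """Partition the image into grid x grid blocks, average each block,
    and return the rank order of the block intensities (0 = darkest)."""
    h, w = img.shape
    bh, bw = h // grid, w // grid
    means = [img[r * bh:(r + 1) * bh, c * bw:(c + 1) * bw].mean()
             for r in range(grid) for c in range(grid)]
    return np.argsort(np.argsort(means))   # double argsort yields ranks

def om_distance(a, b, grid=3):
    """L1 distance between the two rank vectors; small = likely a copy."""
    return int(np.abs(ordinal_measure(a, grid) - ordinal_measure(b, grid)).sum())

rng = np.random.default_rng(1)
img = rng.random((9, 9))
brighter = img * 0.5 + 0.2          # monotone change preserves block ranks
d = om_distance(img, brighter)
```

Because only ranks are compared, a copy that has been re-encoded, brightened, or contrast-stretched still matches, while an unrelated image almost surely produces a different block ordering.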


Journal of Visual Communication and Image Representation | 2014

Extracting salient region for pornographic image detection

Chenggang Clarence Yan; Yizhi Liu; Hongtao Xie; Zhuhua Liao; Jian Yin

Content-based pornographic image detection, in which region-of-interest (ROI) extraction plays an important role, is effective for filtering pornography. Traditionally, skin-color regions are extracted as the ROI. However, skin-color regions are usually larger than the subareas containing pornographic parts, and this approach struggles to differentiate human skin from other skin-colored objects. In this paper, a novel approach to extracting salient regions is presented for pornographic image detection. First, a novel saliency map model is constructed. It is then integrated with a skin-color model and a face detection model to capture the ROI in pornographic images. Next, an ROI-based codebook algorithm is proposed to enhance the representative power of visual words. Taking into account both speed and accuracy, we fuse speeded-up robust features (SURF) with color moments (CM). Experimental results show that our ROI extraction method achieves an average precision of 91.33%, higher than that of the skin-color model alone. Moreover, comparison with state-of-the-art methods of pornographic image detection shows that our approach remarkably improves performance.
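The color moments (CM) descriptor fused with SURF here is commonly taken to be the first three moments of each color channel, giving a compact 9-D vector for an RGB region. A small sketch under that assumption (the paper may normalize or weight the moments differently):

```python
import numpy as np

def color_moments(img):
    """First three moments per channel (mean, standard deviation, and the
    cube root of the third central moment), a common 9-D color descriptor
    for an RGB region."""
    feats = []
    for c in range(img.shape[2]):
        ch = img[..., c].ravel().astype(float)
        mean = ch.mean()
        std = ch.std()
        skew = np.cbrt(((ch - mean) ** 3).mean())
        feats.extend([mean, std, skew])
    return np.array(feats)

# A uniform gray region: zero spread and zero skew in every channel
region = np.full((4, 4, 3), 0.5)
cm = color_moments(region)
```

Moments are cheap to compute per ROI, which is how the fusion keeps the speed advantage the abstract claims while adding color information that SURF lacks.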


International Conference on Image Processing | 2011

Local geometric consistency constraint for image retrieval

Hongtao Xie; Ke Gao; Yongdong Zhang; Jintao Li

In state-of-the-art image retrieval systems, an image is represented by a bag of features (BOF). As the BOF representation discards geometric relationships among local features, exploiting geometric constraints as a post-processing procedure has been shown to greatly improve retrieval precision. However, full geometric constraints are computationally expensive, and weak geometric constraints have a limited range of applications. To efficiently handle common transformations and deformations, we present a novel local geometric consistency constraint (LGC) method. It utilizes the local similarity characteristic of deformations and measures the pairwise geometric similarity of matches between two sets of local features. In addition, we propose a new method to accurately calculate the transformation matrix between two matched features, using the information provided by their local neighbors. Experiments performed on well-known datasets show the excellent performance of our method.
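The transformation matrix between two matched features can be estimated from their local neighbor correspondences by least squares. A sketch assuming an affine model (the paper's exact estimation procedure may differ):

```python
import numpy as np

def estimate_affine(src, dst):
    """Least-squares affine transform (A, t) mapping src -> dst,
    e.g. fitted from a matched feature's local neighbor correspondences."""
    n = len(src)
    X = np.hstack([src, np.ones((n, 1))])              # n x 3 design matrix
    params, *_ = np.linalg.lstsq(X, dst, rcond=None)   # solves X @ P = dst
    A, t = params[:2].T, params[2]                     # 2x2 linear part, offset
    return A, t

# Neighbors of a matched feature, related by a known scale + translation
src = np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]])
A_true = np.array([[2., 0.], [0., 3.]])
t_true = np.array([1., -1.])
dst = src @ A_true.T + t_true
A, t = estimate_affine(src, dst)
```

Using several neighbors instead of a single match averages out localization noise, which is the point of drawing on the local neighborhood.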

Collaboration


Dive into Hongtao Xie's collaborations.

Top Co-Authors

Yizhi Liu | Hunan University of Science and Technology
Jintao Li | Chinese Academy of Sciences
Yongdong Zhang | Chinese Academy of Sciences
Ke Gao | Chinese Academy of Sciences
Zhineng Chen | Chinese Academy of Sciences
Li Guo | Chinese Academy of Sciences
Qiong Dai | Chinese Academy of Sciences
Jianjun Chen | Changsha University of Science and Technology
Chuan Zhou | Chinese Academy of Sciences
Han Deng | Chinese Academy of Sciences