
Publications


Featured research published by Xiaomeng Wu.


IEEE Transactions on Circuits and Systems for Video Technology | 2015

Second-Order Configuration of Local Features for Geometrically Stable Image Matching and Retrieval

Xiaomeng Wu; Kunio Kashino

Local features offer high repeatability, which supports efficient matching between images, but they do not provide sufficient discriminative power. Imposing a geometric coherence constraint on local features improves the discriminative power but makes the matching sensitive to anisotropic transformations. We propose a novel feature representation approach to solve the latter problem. Each image is abstracted by a set of tuples of local features. We revisit affine shape adaptation and extend its conclusion to characterize the geometrically stable feature of each tuple. The representation thus provides higher repeatability with anisotropic scaling and shearing than found in previous research. We develop a simple matching model by voting in the geometrically stable feature space, where votes arise from tuple correspondences. To make the required index space linear as regards the number of features, we propose a second approach called a centrality-sensitive pyramid to select potentially meaningful tuples of local features on the basis of their spatial neighborhood information. It achieves faster neighborhood association and has a greater robustness to errors in interest point detection and description. We comprehensively evaluated our approach using Flickr Logos 32, Holiday, Oxford Buildings, and Flickr 100 K benchmarks. Extensive experiments and comparisons with advanced approaches demonstrate the superiority of our approach in image retrieval tasks.


British Machine Vision Conference | 2015

Robust Spatial Matching as Ensemble of Weak Geometric Relations

Xiaomeng Wu; Kunio Kashino

Existing spatial matching methods permit geometrically-stable image matching, but still involve a difficult trade-off between flexibility and discriminative power. To address this issue, we regard spatial matching as an ensemble of geometric relations on a set of feature correspondences. A geometric relation is defined as a set of pairs of correspondences, in which every correspondence is associated with every other correspondence if and only if the pair satisfy a given geometric constraint. We design a novel, unified collection of weak geometric relations that fall into four fundamental classes of geometric coherences in terms of both spatial contexts and between-image transformations. The spatial similarity reduces to the cardinality of the conjunction of all geometric relations. The flexibility of weak geometric relations makes our method robust as regards incorrect rejections of true correspondences, and the conjunctive ensemble provides a high discriminative power in terms of mismatches. Extensive experiments are conducted on five datasets. Besides significant performance gain, our method yields much better scalability than existing methods, and so can be easily integrated into any image retrieval process.
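The core idea — spatial similarity as the cardinality of the conjunction of weak pairwise geometric relations — can be illustrated with a toy sketch. The two constraints below (similar implied translation, preserved left-to-right order) are simplified stand-ins for illustration, not the paper's four classes of geometric coherence:

```python
from itertools import combinations

def ensemble_similarity(corrs, tol=5.0):
    """Spatial similarity as the cardinality of the conjunction of
    weak geometric relations, with two made-up toy constraints in
    place of the paper's four classes. Each correspondence is
    ((x1, y1), (x2, y2)); a pair of correspondences counts only if it
    satisfies every weak relation."""
    def similar_translation(a, b):
        # weak relation 1: the two matches imply nearly the same shift
        (ax1, ay1), (ax2, ay2) = a
        (bx1, by1), (bx2, by2) = b
        return (abs((ax2 - ax1) - (bx2 - bx1)) < tol and
                abs((ay2 - ay1) - (by2 - by1)) < tol)

    def order_preserved(a, b):
        # weak relation 2: left-to-right ordering survives the mapping
        (ax1, _), (ax2, _) = a
        (bx1, _), (bx2, _) = b
        return (ax1 < bx1) == (ax2 < bx2)

    return sum(1 for a, b in combinations(corrs, 2)
               if similar_translation(a, b) and order_preserved(a, b))

# Three matches agree on a ~(50, 0) shift; the fourth is a mismatch.
corrs = [((0, 0), (50, 1)), ((10, 5), (61, 5)),
         ((20, 20), (70, 19)), ((5, 5), (200, 300))]
print(ensemble_similarity(corrs))  # 3
```

Because each relation alone is weak (cheap to test, rarely rejects a true match), their conjunction is what supplies the discriminative power.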


International Conference on Computer Vision | 2015

Adaptive Dither Voting for Robust Spatial Verification

Xiaomeng Wu; Kunio Kashino

Hough voting in a geometric transformation space allows us to realize spatial verification, but remains sensitive to feature detection errors because of the inflexible quantization of single feature correspondences. To handle this problem, we propose a new method, called adaptive dither voting, for robust spatial verification. For each correspondence, instead of hard-mapping it to a single transformation, the method augments its description by using multiple dithered transformations that are deterministically generated by the other correspondences. The method reduces the probability of losing correspondences during transformation quantization, and provides high robustness as regards mismatches by imposing three geometric constraints on the dithering process. We also propose exploiting the non-uniformity of a Hough histogram as the spatial similarity to handle multiple matching surfaces. Extensive experiments conducted on four datasets show the superiority of our method. The method outperforms its state-of-the-art counterparts in both accuracy and scalability, especially when it comes to the retrieval of small, rotated objects.
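The rigid Hough-voting baseline that the paper improves on can be sketched as follows. For simplicity the sketch quantizes only the 2-D translation implied by each correspondence (the actual transformation space and the adaptive dithering are not reproduced here); the hard quantization of single correspondences is exactly the fragility the paper addresses:

```python
def hough_verify(corrs, cell=10.0):
    """Toy Hough-voting spatial verification (the rigid baseline, not
    adaptive dither voting itself): each correspondence hard-maps to
    one quantized translation bin, and the score is the largest bin,
    i.e. the biggest geometrically consistent group of matches."""
    votes = {}
    for (x1, y1), (x2, y2) in corrs:
        # translation implied by this correspondence, hard-quantized
        key = (round((x2 - x1) / cell), round((y2 - y1) / cell))
        votes[key] = votes.get(key, 0) + 1
    return max(votes.values())

# Three matches agree on a ~(50, 0) shift; the fourth is a mismatch.
corrs = [((0, 0), (50, 1)), ((10, 5), (61, 5)),
         ((20, 20), (70, 19)), ((5, 5), (200, 300))]
print(hough_verify(corrs))  # 3
```

A correspondence whose shift falls near a bin boundary can land in the wrong bin and be lost; dither voting counters this by letting each correspondence vote with multiple dithered transformations.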


International Conference on Pattern Recognition | 2014

Image Retrieval Based on Anisotropic Scaling and Shearing Invariant Geometric Coherence

Xiaomeng Wu; Kunio Kashino

Imposing a spatial coherence constraint on image matching is becoming a necessity for local feature based object retrieval. We tackle the affine invariance problem of the prior spatial coherence model and propose a novel approach for geometrically stable image retrieval. Compared with related studies focusing simply on translation, rotation, and isotropic scaling, our approach can deal with more significant transformations including anisotropic scaling and shearing. Our contribution consists of revisiting the first-order affine adaptation approach and extending its application to represent the geometric coherence of a second-order local feature structure. We comprehensively evaluated our approach using Flickr Logos 32, Holiday, and Oxford Buildings benchmarks. Extensive experimentation and comparisons with state-of-the-art spatial coherence models demonstrate the superiority of our approach in image retrieval tasks.


International Conference on Acoustics, Speech, and Signal Processing | 2014

Image retrieval based on spatial context with Relaxed Gabriel Graph pyramid

Xiaomeng Wu; Kunio Kashino

Imposing the coherence of the spatial context on local features is becoming a necessity for object retrieval and recognition. Motivated by the success of proximity graphs in topological decomposition, clustering, and gradient estimation, we introduce a variation on and a generalization of Delaunay Triangulation, called a Relaxed Gabriel Graph (RGG), as the apex of spatial neighborhood association and design a Centrality-Sensitive Pyramid (CSP) model for hierarchical spatial context modeling. RGG is parameterized, and so allows the tuning of various applications and datasets. CSP achieves better neighborhood association and is more robust as regards feature description error than other related work. Our method is evaluated on Flickr Logos 32, Holiday, and Oxford Buildings benchmarks. Experimental results and comparisons demonstrate the superiority of our method in an image retrieval scenario.


International Conference on Acoustics, Speech, and Signal Processing | 2016

Scene text recognition with high performance CNN classifier and efficient word inference

Xinhao Liu; Takahito Kawanishi; Xiaomeng Wu; Kunio Kashino

The recognition of text in natural scene images is a practical yet challenging task due to the large variations in backgrounds, textures, fonts, and illumination conditions. In this paper, we propose a highly accurate character recognition model by utilizing the representational power of a specially designed Convolutional Neural Network (CNN). Based on the recognition model, we also develop an efficient post processing approach for error correction and hypothesis re-verification. Character and word image recognition experiments on two public datasets, namely the ICDAR 2003 Robust Reading dataset and the Street View Text (SVT) dataset both show that the proposed approach provides superior or comparable results to the state-of-the-art techniques.
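The error-correction stage can be illustrated with a generic lexicon-snapping sketch: the raw per-character output is mapped to the closest dictionary word by edit distance. This is a hypothetical stand-in for illustration, not the paper's actual word-inference procedure:

```python
def correct_word(raw, lexicon):
    """Toy post-processing step: snap a raw per-character recognition
    string to the closest lexicon entry by edit distance. A generic
    stand-in; the paper's inference method is not reproduced here."""
    def edit_distance(a, b):
        # one-row dynamic-programming Levenshtein distance
        dp = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            prev, dp[0] = dp[0], i
            for j, cb in enumerate(b, 1):
                prev, dp[j] = dp[j], min(dp[j] + 1,      # deletion
                                         dp[j - 1] + 1,  # insertion
                                         prev + (ca != cb))
        return dp[-1]
    return min(lexicon, key=lambda w: edit_distance(raw, w))

# A CNN confusing 'o' with '0' is recovered by the lexicon constraint.
print(correct_word("c0ffee", ["coffee", "toffee", "office"]))  # coffee
```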


British Machine Vision Conference | 2014

Tri-Map Self-Validation Based on Least Gibbs Energy for Foreground Segmentation

Xiaomeng Wu; Kunio Kashino

The Bayesian framework forms a solid foundation for image segmentation. With this as a basis, an image is modeled as a Markov random field (MRF) with observations incorporated with a given tri-map. Although MRF-based methods have proved successful in interactive or supervised foreground segmentation, high-quality segmentation can be obtained only when the tri-map is sufficiently discriminative. We argue that the least Gibbs energy can be formulated as a goal function of a tri-map and can be a powerful means of validating the separability of predefined feature distributions. Further, we propose a split-and-validate strategy for decomposing the complex problem into a series of tractable subproblems, and suboptimal tri-map optimization is gradually achieved by making decisions between cluster-level operations. The splitting is determined by a novel combination of Bregman hierarchical clustering and an information theoretic method for realizing non-parametric clustering. We have evaluated our method against the Oxford Flower 17 and Caltech-UCSD Bird 200 benchmarks and show the superiority of tri-map self-validation in unsupervised foreground segmentation tasks.
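The Gibbs energy being minimized has the usual data-plus-smoothness structure. A minimal sketch on a 1-D chain MRF (far simpler than the paper's model, with made-up unary costs) shows how the energy trades off per-pixel evidence against label smoothness:

```python
def gibbs_energy(labels, unary, beta=1.0):
    """Gibbs energy of a toy 1-D chain MRF: a per-pixel data cost
    plus an Ising penalty for neighboring label disagreements.
    labels[i] indexes into unary[i] (0 = background, 1 = foreground)."""
    data = sum(unary[i][l] for i, l in enumerate(labels))
    smooth = beta * sum(labels[i] != labels[i + 1]
                        for i in range(len(labels) - 1))
    return data + smooth

# Costs favor background for the first two pixels, foreground for the last.
unary = [[0, 2], [0, 2], [2, 0]]
print(gibbs_energy([0, 0, 1], unary))  # 1: fits the data, one label change
print(gibbs_energy([0, 0, 0], unary))  # 2: smoother but pays the data term
```

The least Gibbs energy achievable under a given tri-map is what the paper uses as the goal function for validating that tri-map's separability.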


Multimedia Tools and Applications | 2015

Interest point selection by topology coherence for multi-query image retrieval

Xiaomeng Wu; Kunio Kashino

Although the bag-of-visual-words (BOVW) model in computer vision has been demonstrated successfully for the retrieval of particular objects, it suffers from limited accuracy when images of the same object are very different in terms of viewpoint or scale. Naively leveraging multiple views of the same object to query the database naturally alleviates this problem to some extent. However, the bottleneck appears to be the presence of background clutter, which causes significant confusion with images of different objects. To address this issue, we explore the structural organization of interest points within multiple query images and select those that derive from the tentative region of interest (ROI) to significantly reduce the negative contributions of confusing images. Specifically, we propose the use of a multi-layered undirected graph model built on sets of Hessian affine interest points to model the images’ elastic spatial topology. We detect repeating patterns that preserve a coherent local topology, show how these redundancies are leveraged to estimate tentative ROIs, and demonstrate how this novel interest point selection approach improves the quality of visual matching. The approach is discriminative in distinguishing clutter from interest points, and at the same time, is highly robust as regards variation in viewpoint and scale as well as errors in interest point detection and description. Large-scale datasets are used for extensive experimentation and discussion.


International Journal of Computer Vision | 2018

Label Propagation with Ensemble of Pairwise Geometric Relations: Towards Robust Large-Scale Retrieval of Object Instances

Xiaomeng Wu; Kaoru Hiramatsu; Kunio Kashino

Spatial verification methods permit geometrically stable image matching, but still involve a difficult trade-off between robustness as regards incorrect rejection of true correspondences and discriminative power in terms of mismatches. To address this issue, we ask whether an ensemble of weak geometric constraints that correlates with visual similarity only slightly better than a bag-of-visual-words model performs better than a single strong constraint. We consider a family of spatial verification methods and decompose them into fundamental constraints imposed on pairs of feature correspondences. Encompassing such constraints leads us to propose a new method, which takes the best of existing techniques and functions as a unified Ensemble of pAirwise GEometric Relations (EAGER), in terms of both spatial contexts and between-image transformations. We also introduce a novel and robust reranking method, in which the object instances localized by EAGER in high-ranked database images are reissued as new queries. EAGER is extended to develop a smoothness constraint where the similarity between the optimized ranking scores of two instances should be maximally consistent with their geometrically constrained similarity. Reranking is newly formulated as two label propagation problems: one is to assess the confidence of new queries and the other to aggregate new independently executed retrievals. Extensive experiments conducted on four datasets show that EAGER and our reranking method outperform most of their state-of-the-art counterparts, especially when large-scale visual vocabularies are used.
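The reranking step rests on label propagation over a similarity graph. A generic propagation iteration (not EAGER's exact formulation; the graph and scores below are invented for illustration) looks like this:

```python
import numpy as np

def propagate(W, y, alpha=0.5, n_iter=50):
    """Generic label-propagation sketch: iteratively smooth initial
    retrieval scores y over a similarity graph W, so that strongly
    connected database images end up with consistent ranking scores."""
    S = W / W.sum(axis=1, keepdims=True)  # row-normalized affinities
    s = y.astype(float).copy()
    for _ in range(n_iter):
        # blend propagated neighbor scores with the original evidence
        s = alpha * S @ s + (1 - alpha) * y
    return s

# Node 0 is the query's verified match; node 1 neighbors it, and node 2
# is reachable only through node 1, so scores decay along the chain.
W = np.array([[1., 1., 0.], [1., 1., 1.], [0., 1., 1.]])
y = np.array([1., 0., 0.])
s = propagate(W, y)
print(s[0] > s[1] > s[2])  # True
```

With alpha < 1 the iteration is a contraction, so the scores converge regardless of initialization; alpha controls how far the query's evidence diffuses through the graph.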


International Conference on Acoustics, Speech, and Signal Processing | 2017

Deep salience map guided arbitrary direction scene text recognition

Xinhao Liu; Takahito Kawanishi; Xiaomeng Wu; Kaoru Hiramatsu; Kunio Kashino

Irregular scene text, such as curved, rotated, or perspective text, commonly appears in natural scene images due to different camera viewpoints, special design purposes, etc. In this work, we propose a text salience map guided model to recognize these arbitrary direction scene texts. We train a deep Fully Convolutional Network (FCN) to calculate the precise salience map for texts. Then we estimate the positions and rotations of the text and utilize this information to guide the generation of CNN sequence features. Finally the sequence is recognized with a Recurrent Neural Network (RNN) model. Experiments on various public datasets show that the proposed approach is robust to different distortions and performs superior or comparable to the state-of-the-art techniques.

Collaboration


Dive into Xiaomeng Wu's collaborations.

Top Co-Authors

Kunio Kashino (Nippon Telegraph and Telephone)
Kaoru Hiramatsu (Nippon Telegraph and Telephone)
Takahito Kawanishi (Nippon Telegraph and Telephone)
Xinhao Liu (Nippon Telegraph and Telephone)
Minoru Mori (Nippon Telegraph and Telephone)
Go Irie (Nippon Telegraph and Telephone)
Hidehisa Nagano (Nippon Telegraph and Telephone)
Taiga Yoshida (Nippon Telegraph and Telephone)
Takayuki Kurozumi (Nippon Telegraph and Telephone)