Publication


Featured research published by Yanhui Xiao.


ACM Transactions on Intelligent Systems and Technology | 2016

Modality-Dependent Cross-Media Retrieval

Yunchao Wei; Yao Zhao; Zhenfeng Zhu; Shikui Wei; Yanhui Xiao; Jiashi Feng; Shuicheng Yan

In this article, we investigate cross-media retrieval between images and text, that is, using an image to search for text (I2T) and using text to search for images (T2I). Existing cross-media retrieval methods usually learn one pair of projections, by which the original features of images and text can be projected into a common latent space to measure content similarity. However, using the same projections for the two different retrieval tasks (I2T and T2I) may lead to a tradeoff between their respective performances rather than the best performance on each. Unlike previous work, we propose a modality-dependent cross-media retrieval (MDCR) model, in which two pairs of projections are learned, one for each retrieval task, instead of a single pair. Specifically, by jointly optimizing the correlation between images and text and the linear regression from one modal space (image or text) to the semantic space, two pairs of mappings are learned to project images and text from their original feature spaces into two common latent subspaces (one for I2T and the other for T2I). Extensive experiments show the superiority of the proposed MDCR over other methods. In particular, based on the 4,096-dimensional convolutional neural network (CNN) visual feature and the 100-dimensional Latent Dirichlet Allocation (LDA) textual feature, the proposed method achieves a mean average precision (mAP) of 41.5%, a new state-of-the-art result on the Wikipedia dataset.
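The abstract reports retrieval quality as mAP but does not spell out the evaluation protocol. As a minimal sketch, assuming the standard definition of mean average precision over ranked result lists (the function names and the toy relevance lists are illustrative, not taken from the paper), the metric can be computed as follows:

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance[i] is True when the i-th ranked result is relevant.
    """
    hits = 0
    precision_sum = 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            # Precision measured at each rank where a relevant item appears.
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(all_ranked_relevance):
    """Mean of the per-query average precision values."""
    aps = [average_precision(r) for r in all_ranked_relevance]
    return sum(aps) / len(aps)


# Toy data (hypothetical): two queries with four ranked results each.
queries = [
    [True, False, True, False],
    [False, True, False, False],
]
map_score = mean_average_precision(queries)
```

In an I2T or T2I setting, each query would be an image or a text item, and a result would count as relevant under whatever criterion the benchmark defines, for example sharing the query's semantic category.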


IEEE Transactions on Systems, Man, and Cybernetics | 2014

Topographic NMF for Data Representation

Yanhui Xiao; Zhenfeng Zhu; Yao Zhao; Yunchao Wei; Shikui Wei; Xuelong Li

Nonnegative matrix factorization (NMF) is a useful technique for exploring a parts-based representation by decomposing the original data matrix into a few parts-based basis vectors and encodings under nonnegativity constraints. It has been widely used in image processing and pattern recognition tasks because of its psychological and physiological interpretation of natural data, whose representation may be parts-based in the human brain. However, the nonnegativity constraint on matrix factorization is generally not sufficient to produce representations that are robust to local transformations. To overcome this problem, we propose a topographic NMF (TNMF), which imposes a topographic constraint on the encoding factor as a regularizer during matrix factorization. In essence, the topographic constraint is a two-layered network that contains a square nonlinearity in the first layer and a square-root nonlinearity in the second layer. By pooling together the structure-correlated features belonging to the same hidden topic, TNMF forces the encodings to be organized in a topographical map, which promotes feature invariance. Experiments on three standard datasets validate the effectiveness of our method in comparison with state-of-the-art approaches.
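The two-layer constraint described above (square, pool, square root) can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's exact regularizer: the ring-shaped pooling matrix `P`, the helper names, and the `eps` smoothing term are all hypothetical choices, since the abstract does not give the pooling structure or weighting.

```python
import numpy as np


def ring_pooling_matrix(k, width=3):
    """Hypothetical pooling: unit i pools `width` neighbouring features on a 1-D ring."""
    P = np.zeros((k, k))
    for i in range(k):
        for offset in range(width):
            P[i, (i + offset) % k] = 1.0
    return P


def topographic_penalty(H, P, eps=1e-8):
    """Sum over pooling units and samples of sqrt(P @ H**2 + eps).

    H: (k, n) encodings, one column per sample.
    P: (g, k) nonnegative pooling matrix.
    """
    first_layer = H ** 2                   # square nonlinearity
    pooled = P @ first_layer               # pool neighbouring features
    second_layer = np.sqrt(pooled + eps)   # square-root nonlinearity
    return second_layer.sum()


# Toy data: one sample with four features, pooled in overlapping pairs.
H = np.array([[3.0], [4.0], [0.0], [0.0]])
P = ring_pooling_matrix(4, width=2)
penalty = topographic_penalty(H, P, eps=0.0)
```

Because the square root is applied after pooling, the penalty is smaller when active features fall in the same pooling group than when they are spread across groups, which is what encourages neighbouring encodings to respond together.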


IEEE Transactions on Neural Networks | 2015

Kernel Reconstruction ICA for Sparse Representation

Yanhui Xiao; Zhenfeng Zhu; Yao Zhao; Yunchao Wei; Shikui Wei

Independent component analysis with soft reconstruction cost (RICA) has recently been proposed to learn sparse representations linearly with an overcomplete basis, and this technique exhibits promising performance even on unwhitened data. However, linear RICA may not be effective for the majority of real-world data because nonlinearly separable structure is pervasive in the original data space. Moreover, RICA is essentially an unsupervised method and does not employ class information. Motivated by the success of the kernel trick, which maps a nonlinearly separable data structure into a linearly separable one in a high-dimensional feature space, we propose a kernel RICA (kRICA) model to capture sparse representations nonlinearly in feature space. Furthermore, we extend the unsupervised kRICA to a supervised one by introducing a class-driven discrimination constraint, such that the data samples from the same class are well represented by the corresponding subset of basis vectors. This discrimination constraint minimizes inhomogeneous representation energy and maximizes homogeneous representation energy simultaneously, which is essentially equivalent to implicitly maximizing between-class scatter while minimizing within-class scatter. Experimental results demonstrate that the proposed algorithm is more effective than other state-of-the-art methods on several datasets.
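For orientation, the linear RICA objective that kRICA builds on is commonly written as a reconstruction term plus a smooth sparsity term. The sketch below assumes that common formulation; the function name, the weight `lam`, and the smoothing constant `eps` are illustrative, and neither the kernelized version nor the class-driven constraint from the paper is implemented here.

```python
import numpy as np


def rica_objective(W, X, lam=1.0, eps=1e-8):
    """Linear RICA cost: soft reconstruction error plus a smooth L1 sparsity penalty.

    W: (k, d) basis matrix; k > d gives an overcomplete basis.
    X: (d, m) data matrix, one sample per column.
    """
    m = X.shape[1]
    Z = W @ X                                  # encodings, shape (k, m)
    reconstruction = W.T @ Z                   # map encodings back to input space
    recon_cost = lam / m * np.sum((reconstruction - X) ** 2)
    sparsity = np.sum(np.sqrt(Z ** 2 + eps))   # smooth approximation of |z|
    return recon_cost + sparsity


# Toy check: with an identity basis the reconstruction is exact,
# so only the sparsity term contributes.
W = np.eye(2)
X = np.array([[1.0, 0.0],
              [0.0, 2.0]])
value = rica_objective(W, X, lam=1.0, eps=0.0)
```

The reconstruction term replaces the hard orthonormality constraint of classical ICA, which is what allows an overcomplete `W` and reduces the dependence on whitening.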


International Conference on Multimedia and Expo | 2014

Learning a mid-level feature space for cross-media regularization

Yunchao Wei; Yao Zhao; Zhenfeng Zhu; Yanhui Xiao; Shikui Wei

In this paper, we propose a cross-media regularization framework to enhance image understanding, which can benefit tasks such as image retrieval and classification. The goal of cross-media regularization is to find regularization projections by exploiting the correlations between visual features and textual features. Thus, the original noisy distribution of visual features can be refined by leveraging the discriminative distribution of the corresponding textual features. Within the proposed framework, a mid-level representation is built by jointly projecting both visual and textual features into a shared feature subspace, which transfers the discriminative semantic characteristics embedded in the textual modality to the corresponding visual modality. At the same time, the discriminative characteristics of the textual features are also boosted. The experimental results demonstrate that the proposed mid-level space learning process can remarkably improve search quality and outperform existing semantic regularization methods.


Journal of Computer Science and Technology | 2013

Class-Driven Non-Negative Matrix Factorization for Image Representation

Yanhui Xiao; Zhenfeng Zhu; Yao Zhao; Yunchao Wei

Non-negative matrix factorization (NMF) is a useful technique for learning a parts-based representation by decomposing the original data matrix into a basis set and coefficients under non-negativity constraints. However, as an unsupervised method, the original NMF cannot utilize discriminative class information. In this paper, we propose a semi-supervised class-driven NMF method that associates a class label with each basis vector by introducing an inhomogeneous representation cost constraint. This constraint forces the learned basis vectors to represent samples of their own classes well and samples of other classes poorly. Therefore, data samples in the same class will have similar representations, and consequently the discriminability of the new representations can be boosted. Experiments on several standard databases validate the effectiveness of our method in comparison with state-of-the-art approaches.
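As a point of reference for the unsupervised baseline mentioned above, plain NMF is often fitted with the multiplicative update rules of Lee and Seung for the Frobenius-norm cost. The sketch below shows only that baseline; the function name, iteration count, and `eps` guard are illustrative assumptions, and the class-driven constraint proposed in the paper is not included.

```python
import numpy as np


def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Factorize a nonnegative matrix V (n, m) as W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps   # basis vectors as columns
    H = rng.random((rank, m)) + eps   # coefficients (encodings)
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy data: a rank-1 nonnegative matrix, which a rank-1 NMF can fit closely.
V = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [3.0, 6.0, 9.0]])
W, H = nmf(V, rank=1)
error = np.linalg.norm(V - W @ H)
```

A supervised or semi-supervised variant would add a label-dependent term to the cost and derive matching update rules; how the paper does this is not specified in the abstract.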


Intelligent Information Hiding and Multimedia Signal Processing | 2010

Kernel Canonical Correlation with Similarity Refinement for Automatic Image Tagging

Yanhui Xiao; Yao Zhao; Zhenfeng Zhu

Automatic image tagging (AIT) is an effective technology for facilitating image retrieval without requiring the user to provide a retrieval instance beforehand. In this paper, we propose an AIT method based on kernel canonical correlation analysis (KCCA) with similarity refinement (KCCSR). As a statistical correlation technique, KCCA aims to extract hidden information shared by two random variables. Unlike previous KCCA-based tagging methods, graph-based similarity refinements are first applied interactively to obtain enhanced visual and textual representations. Subsequently, KCCA is applied to these representations to mine the intrinsic semantic representation space, in which the AIT can be completed. The experimental results validate the effectiveness of the proposed KCCSR.


ACM Multimedia | 2012

Discriminative ICA model with reconstruction constraint for image classification

Yanhui Xiao; Zhenfeng Zhu; Shikui Wei; Yao Zhao

Independent Component Analysis (ICA) is an effective unsupervised tool for learning statistically independent representations. However, ICA is sensitive to whitening and has difficulty learning an over-complete basis set. Consequently, ICA with soft Reconstruction cost (RICA) was presented to learn sparse representations with an over-complete basis even on unwhitened data. Nevertheless, this model may not be an optimal discriminative model for classification tasks, because it fails to consider the association between a training sample and its class. In this paper, we propose a supervised Discriminative ICA model with Reconstruction constraint for image classification, named DRICA. DRICA brings in class information to learn the over-complete basis by incorporating an inhomogeneous representation cost constraint into the RICA framework. This constraint partitions the set of basis vectors into several subsets corresponding to the sample classes, where each subset can sparsely model data samples from its own class but not from the others. Therefore, the proposed ICA model can learn an over-complete basis and an optimal multi-class classifier jointly. Experiments on several standard image databases validate the effectiveness of DRICA for image classification.


International Conference on Internet Multimedia Computing and Service | 2012

Correlation preserved dictionary learning for sparse representation

Yanhui Xiao; Zhenfeng Zhu; Yao Zhao

Sparse representation (SR) based classification has recently led to promising results in image classification, but classification performance relies on the quality of the SR. However, most existing SR approaches fail to consider the geometrical structure of the dictionary, which has been shown to be essential for classification. In this paper, we propose a reinforced SR algorithm that jointly preserves data structure consistency for sparse coding and dictionary correlation for dictionary learning. Specifically, we utilize an inconsistency regularization term to enforce structure consistency between the data and the SR. In addition, a new non-correlation regularization term is introduced to preserve the correlations between dictionary atoms. Therefore, the learned sparse representations simultaneously respect the data structure and the dictionary correlation. Experiments on two standard image databases validate the effectiveness of the proposed method for image classification.


Intelligent Information Hiding and Multimedia Signal Processing | 2012

Knowledge Transferring for Image Classification

Yunchao Wei; Yao Zhao; Zhenfeng Zhu; Yanhui Xiao

Traditional image classification approaches focus on using a large amount of target data to learn an efficient classification model. However, these methods are generally based on the target data alone, without considering auxiliary data. If the knowledge from auxiliary data could be successfully transferred to the target data, the performance of the model would improve. In recent years, transfer learning has emerged to address this problem. Based on transfer learning, we present a knowledge transferring method to enhance image classification performance. Since the target data are limited to images, we employ an auxiliary dataset to construct pseudo text for each target image. By exploiting the semantic structure of the pseudo text data, the visual features are mapped to a semantic space that respects the text structure. Experiments show that the proposed approach is feasible.


Advances in Multimedia | 2012

An interactive semi-supervised approach for automatic image annotation

Yanhui Xiao; Zhenfeng Zhu; Nan Liu; Yao Zhao

Automatic image annotation (AIA) is an effective technique for bridging the semantic gap between low-level image features and high-level semantics. However, most existing AIA approaches fail to make use of unlabeled data. In this paper, we present an interactive semi-supervised approach for AIA that integrates a graph propagation model with kernel canonical correlation analysis (KCCA). We aim to jointly utilize the keywords associated with labeled and selected unlabeled images to annotate the remaining unlabeled images. Toward this goal, we first estimate the annotations of unlabeled images using the consistency-driven graph propagation model. Then, KCCA is applied to seek the semantic consistency between the two concurrent visual and textual features. In addition, the unlabeled image with the highest semantic consistency is added to the training set. Thus, with the enlarged training set, the semantic consistency between visual and textual representations can be boosted. Experiments on two standard databases validate the effectiveness of the proposed method.
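The abstract does not give the form of its consistency-driven graph propagation model. As a rough illustration of the general idea, the sketch below assumes one widely used formulation, the closed-form label spreading solution F = (I - alpha * S)^(-1) Y with a symmetrically normalized affinity matrix S. The function name, the value of `alpha`, and the toy chain graph are hypothetical, and the KCCA step and the interactive sample selection are not shown.

```python
import numpy as np


def propagate_labels(A, Y, alpha=0.9):
    """Spread label scores over a graph.

    A: (n, n) symmetric affinity matrix with zero diagonal and no isolated nodes.
    Y: (n, c) initial label scores; all-zero rows mark unlabeled nodes.
    """
    degree = A.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalization: S = D^(-1/2) A D^(-1/2).
    S = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    n = A.shape[0]
    # Closed-form solution of the propagation fixed point.
    return np.linalg.solve(np.eye(n) - alpha * S, Y)


# Toy graph: a chain 0-1-2-3 with node 0 labeled class 0 and node 3 labeled class 1.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y = np.array([[1, 0],
              [0, 0],
              [0, 0],
              [0, 1]], dtype=float)
F = propagate_labels(A, Y)
predicted = F.argmax(axis=1)
```

In an annotation setting, the nodes would be images, the affinities would come from visual similarity, and each column of `Y` would correspond to a keyword rather than a single exclusive class.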

Collaboration


Top Co-Authors

Yao Zhao

Beijing Jiaotong University

Zhenfeng Zhu

Beijing Jiaotong University

Yunchao Wei

Beijing Jiaotong University

Shikui Wei

Beijing Jiaotong University

Nan Liu

Beijing Jiaotong University

Xuelong Li

Chinese Academy of Sciences

Jiashi Feng

National University of Singapore

Shuicheng Yan

National University of Singapore
