
Publication


Featured research published by Xinxing Xu.


IEEE Transactions on Image Processing | 2012

Human Gait Recognition Using Patch Distribution Feature and Locality-Constrained Group Sparse Representation

Dong Xu; Yi Huang; Zinan Zeng; Xinxing Xu

In this paper, we propose a new patch distribution feature (PDF), referred to as Gabor-PDF, for human gait recognition. We represent each gait energy image (GEI) as a set of local augmented Gabor features, which concatenate the Gabor features extracted from different scales and different orientations together with the X-Y coordinates. We learn a global Gaussian mixture model (GMM), referred to as the universal background model, with the local augmented Gabor features from all the gallery GEIs; each gallery or probe GEI is then further expressed as the normalized parameters of an image-specific GMM adapted from the global GMM. Observing that one video is naturally represented as a group of GEIs, we also propose a new classification method called locality-constrained group sparse representation (LGSR) to classify each probe video by minimizing the weighted ℓ1,2 mixed-norm-regularized reconstruction error with respect to the gallery videos. In contrast to the standard group sparse representation method, which is a special case of LGSR, both the group sparsity and local smooth sparsity constraints are enforced in LGSR. Our comprehensive experiments on the benchmark USF HumanID database demonstrate the effectiveness of the newly proposed feature Gabor-PDF and the new classification method LGSR for human gait recognition. Moreover, LGSR using the new feature Gabor-PDF achieves the best average Rank-1 and Rank-5 recognition rates on this database among all gait recognition algorithms proposed to date.
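
As a rough illustration of the pipeline above, the following Python sketch fits a global GMM (the universal background model) on pooled local features and derives an image descriptor from MAP-adapted component means. This is a minimal sketch of the general GMM-adaptation idea, not the paper's implementation; the function names and the relevance factor are assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture

def fit_ubm(pooled_features, n_components=32, seed=0):
    # pooled_features: (N, d) local augmented Gabor features collected
    # from all gallery GEIs.
    ubm = GaussianMixture(n_components=n_components,
                          covariance_type='diag', random_state=seed)
    ubm.fit(pooled_features)
    return ubm

def gmm_descriptor(ubm, image_features, relevance=16.0):
    # One MAP-style mean-adaptation step toward the statistics of a
    # single GEI; the normalized adapted means serve as the descriptor.
    resp = ubm.predict_proba(image_features)          # (n, K) posteriors
    n_k = resp.sum(axis=0) + 1e-10                    # soft counts per component
    f_k = (resp.T @ image_features) / n_k[:, None]    # per-component means
    alpha = (n_k / (n_k + relevance))[:, None]        # adaptation weights
    adapted = alpha * f_k + (1.0 - alpha) * ubm.means_
    vec = adapted.ravel()
    return vec / np.linalg.norm(vec)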


IEEE Transactions on Neural Networks and Learning Systems | 2013

Soft Margin Multiple Kernel Learning

Xinxing Xu; Ivor W. Tsang; Dong Xu

Multiple kernel learning (MKL) has been proposed for kernel methods to learn the optimal kernel from a set of predefined base kernels. However, the traditional L1MKL method often achieves worse results than the simplest method using the average of the base kernels (i.e., the average kernel) in some practical applications. In order to improve the effectiveness of MKL, this paper presents a novel soft margin perspective for MKL. Specifically, we introduce an additional slack variable, called the kernel slack variable, for each quadratic constraint of MKL, each of which corresponds to one support vector machine model using a single base kernel. We first show that L1MKL can be deemed hard margin MKL, and we then propose a novel soft margin framework for MKL. Three commonly used loss functions, including the hinge loss, the square hinge loss, and the square loss, can be readily incorporated into this framework, leading to new soft margin MKL objective functions. Many existing MKL methods can be shown to be special cases under our soft margin framework. For example, the hinge loss soft margin MKL leads to a new box constraint for the kernel combination coefficients. By varying the hyper-parameter of this formulation, we can bridge the average-kernel method, L1MKL, and the hinge loss soft margin MKL. The square hinge loss soft margin MKL unifies the family of elastic-net constraint/regularizer based approaches, and the square loss soft margin MKL incorporates L2MKL naturally. Moreover, we also develop efficient algorithms for solving both the hinge loss and square hinge loss soft margin MKL. Comprehensive experimental studies of various MKL algorithms on several benchmark data sets and two real-world applications, video action recognition and event recognition, demonstrate that our proposed algorithms can efficiently achieve an effective yet sparse solution for MKL.
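
The role of the box constraint can be illustrated with a toy alternating scheme: train an SVM on the combined kernel, score each base kernel by its margin term, and clip the kernel coefficients into a box. This is a heuristic sketch under my own naming, not the paper's solver; a tight box pushes the weights toward the uniform (average-kernel) solution, while a loose box allows sparse, L1MKL-like weights. Binary labels are assumed.

import numpy as np
from sklearn.svm import SVC

def combined_kernel(kernels, d):
    return sum(w * K for w, K in zip(d, kernels))

def soft_margin_mkl_like(kernels, y, box=0.5, n_iter=10, svm_C=1.0):
    # kernels: list of (n, n) precomputed base kernel matrices.
    d = np.ones(len(kernels)) / len(kernels)   # start from the average kernel
    for _ in range(n_iter):
        clf = SVC(kernel='precomputed', C=svm_C)
        clf.fit(combined_kernel(kernels, d), y)
        a = np.zeros(len(y))
        a[clf.support_] = np.abs(clf.dual_coef_).ravel()
        scores = np.array([a @ K @ a for K in kernels])  # margin term per kernel
        d = scores / scores.sum()
        d = np.minimum(d, box)                 # box constraint on coefficients
        d = d / d.sum()
    return clf, d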


European Conference on Computer Vision | 2012

Beyond spatial pyramids: a new feature extraction framework with dense spatial sampling for image classification

Shengye Yan; Xinxing Xu; Dong Xu; Stephen Lin; Xuelong Li

We introduce a new framework for image classification that extends beyond the window sampling of fixed spatial pyramids to include a comprehensive set of windows densely sampled over location, size, and aspect ratio. To effectively deal with this large set of windows, we derive a concise high-level image feature using a two-level extraction method. At the first level, window-based features are computed from local descriptors (e.g., SIFT, spatial HOG, LBP) in a process similar to standard feature extractors. Then, at the second level, the new image feature is determined from the window-based features in a manner analogous to the first level. This higher level of abstraction offers both efficient handling of dense samples and reduced sensitivity to misalignment. More importantly, our simple yet effective framework can readily accommodate a large number of existing pooling/coding methods, allowing them to extract features beyond the spatial pyramid representation. To effectively fuse the second-level feature with a standard first-level image feature for classification, we additionally propose a new learning algorithm, called Generalized Adaptive ℓp-norm Multiple Kernel Learning (GA-MKL), to learn an adapted robust classifier based on multiple base kernels constructed from image features and multiple sets of pre-learned classifiers of all the classes. Extensive evaluation on the object recognition (Caltech256) and scene recognition (15Scenes) benchmark datasets demonstrates that the proposed method outperforms state-of-the-art image classification algorithms under a broad range of settings.
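
A dense window sampler of the kind described above might look like the following minimal sketch; the names and default parameters are mine, not the paper's.

def dense_windows(img_w, img_h, scales=(0.25, 0.5, 0.75, 1.0),
                  aspects=(0.5, 1.0, 2.0), stride_frac=0.5):
    # Yield (x, y, w, h) boxes densely sampled over location, size, and
    # aspect ratio, instead of the fixed grid of a spatial pyramid.
    for s in scales:
        for a in aspects:
            w = max(1, int(img_w * s))
            h = max(1, min(img_h, int(w / a)))
            sx = max(1, int(w * stride_frac))
            sy = max(1, int(h * stride_frac))
            for y in range(0, img_h - h + 1, sy):
                for x in range(0, img_w - w + 1, sx):
                    yield (x, y, w, h)

Even for a 256x256 image, these defaults already yield far more windows than the 21 cells of a three-level spatial pyramid, which is what motivates the concise two-level feature.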


IEEE Transactions on Neural Networks and Learning Systems | 2015

Distance Metric Learning Using Privileged Information for Face Verification and Person Re-Identification

Xinxing Xu; Wen Li; Dong Xu

In this paper, we propose a new approach to improve face verification and person re-identification on RGB images by leveraging a set of RGB-D data, in which we have additional depth images in the training data captured using depth cameras such as Kinect. In particular, we extract visual features and depth features from the RGB images and depth images, respectively. As the depth features are available only in the training data, we treat them as privileged information and formulate this task as a distance metric learning with privileged information problem. Unlike the traditional face verification and person re-identification tasks that only use visual features, we further employ the extra depth features in the training data to improve the learning of the distance metric. Based on the information-theoretic metric learning (ITML) method, we propose a new formulation called ITML with privileged information (ITML+) for this task. We also present an efficient algorithm based on the cyclic projection method for solving the proposed ITML+ formulation. Extensive experiments on the challenging EURECOM and CurtinFaces face data sets for face verification, as well as the BIWI RGBD-ID data set for person re-identification, demonstrate the effectiveness of our proposed approach.
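
ITML+ builds on ITML's cyclic Bregman projections. The following sketches one standard ITML projection step, without the privileged-information term that ITML+ adds on top; gamma is ITML's trade-off parameter, lam the constraint's dual variable, and target its current slack-adjusted distance threshold. Variable names are mine.

import numpy as np

def itml_projection_step(M, xi, xj, is_similar, target, lam, gamma=1.0):
    # One Bregman projection of the metric M onto a single pairwise
    # distance constraint (the update of Davis et al.'s ITML).
    v = (xi - xj).reshape(-1, 1)
    p = float(v.T @ M @ v)                    # current Mahalanobis distance
    delta = 1.0 if is_similar else -1.0
    alpha = min(lam, delta / 2.0 * (1.0 / p - gamma / target))
    beta = delta * alpha / (1.0 - delta * alpha * p)
    M = M + beta * (M @ v) @ (v.T @ M)        # rank-one update of the metric
    target = gamma * target / (gamma + delta * alpha * target)
    return M, target, lam - alpha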


IEEE Transactions on Cybernetics | 2015

Image Classification With Densely Sampled Image Windows and Generalized Adaptive Multiple Kernel Learning

Shengye Yan; Xinxing Xu; Dong Xu; Stephen Lin; Xuelong Li

We present a framework for image classification that extends beyond the window sampling of fixed spatial pyramids and is supported by a new learning algorithm. Based on the observation that fixed spatial pyramids sample a rather limited subset of the possible image windows, we propose a method that accounts for a comprehensive set of windows densely sampled over location, size, and aspect ratio. A concise high-level image feature is derived to effectively deal with this large set of windows, and this higher level of abstraction offers both efficient handling of the dense samples and reduced sensitivity to misalignment. In addition to dense window sampling, we introduce generalized adaptive ℓp-norm multiple kernel learning (GA-MKL) to learn a robust classifier based on multiple base kernels constructed from the new image features and multiple sets of pre-learned classifiers from other classes. With GA-MKL, multiple levels of image features are effectively fused, and information is shared among different classifiers. Extensive evaluation on benchmark datasets for object recognition (Caltech256 and Caltech101) and scene recognition (15Scenes) demonstrates that the proposed method outperforms the state-of-the-art under a broad range of settings.
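
Inside wrapper-style ℓp-norm MKL solvers, the kernel weights admit a standard closed-form update given the per-kernel SVM weight norms; GA-MKL's adaptive terms sit on top of machinery like this. A minimal sketch, assuming the standard update rather than GA-MKL's exact one:

import numpy as np

def lp_norm_mkl_weights(w_norms, p=2.0):
    # w_norms[m] = ||w_m|| of the SVM solution restricted to base kernel m.
    w = np.asarray(w_norms, dtype=float)
    num = w ** (2.0 / (p + 1.0))
    den = np.sum(w ** (2.0 * p / (p + 1.0))) ** (1.0 / p)
    return num / den   # resulting weights satisfy ||d||_p = 1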


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Co-Labeling for Multi-View Weakly Labeled Learning

Xinxing Xu; Wen Li; Dong Xu; Ivor W. Tsang

It is often expensive and time-consuming to collect labeled training samples in many real-world applications. To reduce the human effort of annotating training samples, many machine learning techniques (e.g., semi-supervised learning (SSL), multi-instance learning (MIL), etc.) have been studied to exploit weakly labeled training samples. Meanwhile, when the training data is represented with multiple types of features, many multi-view learning methods have shown that classifiers trained on different views can help each other to better utilize the unlabeled training samples for the SSL task. In this paper, we study a new learning problem called multi-view weakly labeled learning, in which we aim to develop a unified approach to learn robust classifiers by effectively utilizing different types of weakly labeled multi-view data from a broad range of tasks including SSL, MIL, and relative outlier detection (ROD). We propose an effective approach called co-labeling to solve the multi-view weakly labeled learning problem. Specifically, we model the learning problem on each view as a weakly labeled learning problem, which aims to learn an optimal classifier from a set of pseudo-label vectors generated by using the classifiers trained from other views. Unlike traditional co-training approaches using a single pseudo-label vector for training each classifier, our co-labeling approach explores different strategies to utilize the predictions from different views, biases, and iterations for generating the pseudo-label vectors, making our approach more robust for real-world applications. Moreover, to further improve the weakly labeled learning on each view, we also exploit the inherent group structure in the pseudo-label vectors generated from different strategies, which leads to a new multi-layer multiple kernel learning problem. Promising results for text-based image retrieval on the NUS-WIDE dataset as well as news classification and text categorization on several real-world multi-view datasets clearly demonstrate that our proposed co-labeling approach achieves state-of-the-art performance for various multi-view weakly labeled learning problems including multi-view SSL, multi-view MIL, and multi-view ROD.
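
A minimal sketch of the co-labeling loop for a binary task with labels in {0, 1} (illustrative only: logistic regression stands in for the view classifiers, and a majority vote over the pseudo-label pool stands in for the paper's multi-layer multiple kernel learning):

import numpy as np
from sklearn.linear_model import LogisticRegression

def co_label(views_lab, y_lab, views_unlab, n_rounds=3):
    # views_lab / views_unlab: one feature matrix per view; y_lab in {0, 1}.
    clfs = [LogisticRegression(max_iter=1000).fit(X, y_lab) for X in views_lab]
    pools = [[] for _ in views_lab]          # pseudo-label vectors per view
    for _ in range(n_rounds):
        preds = [c.predict(Xu) for c, Xu in zip(clfs, views_unlab)]
        for v in range(len(clfs)):
            # Collect pseudo-label vectors produced by the *other* views.
            pools[v].extend(p for u, p in enumerate(preds) if u != v)
            # Majority vote over the accumulated pool, then retrain view v
            # on the labeled data plus the consensus pseudo-labels.
            consensus = (np.mean(pools[v], axis=0) >= 0.5).astype(int)
            X_all = np.vstack([views_lab[v], views_unlab[v]])
            y_all = np.concatenate([y_lab, consensus])
            clfs[v] = LogisticRegression(max_iter=1000).fit(X_all, y_all)
    return clfs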


International Conference on Data Mining | 2012

Handling Ambiguity via Input-Output Kernel Learning

Xinxing Xu; Ivor W. Tsang; Dong Xu

Data ambiguities exist in many data mining and machine learning applications such as text categorization and image retrieval. For instance, it is generally beneficial to utilize the ambiguous unlabeled documents to learn a more robust classifier for text categorization under the semi-supervised learning setting. To handle general data ambiguities, we present a unified kernel learning framework named Input-Output Kernel Learning (IOKL). Based on our framework, we further propose a novel soft margin group sparse Multiple Kernel Learning (MKL) formulation by introducing a group kernel slack variable to each group of base input-output kernels. Moreover, an efficient block-wise coordinate descent algorithm with an analytical solution for the kernel combination coefficients is developed to solve the proposed formulation. We conduct comprehensive experiments on benchmark datasets for both semi-supervised learning and multiple instance learning tasks, and also apply our IOKL framework to a computer vision application called text-based image retrieval on the NUS-WIDE dataset. Promising results demonstrate the effectiveness of our proposed IOKL framework.
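
The base kernel construction at the heart of IOKL can be sketched as coupling a fixed input kernel with output kernels built from candidate labelings of the ambiguous samples. This is a simplification that omits the grouping and soft margin machinery; the function name is mine.

import numpy as np

def input_output_kernels(K_x, candidate_labels):
    # K_x: (n, n) input kernel on the features; candidate_labels: list of
    # (n,) vectors in {-1, +1}, each encoding one possible labeling of
    # the ambiguous data. Each base kernel couples the inputs with one
    # candidate labeling via an elementwise (Hadamard) product.
    return [K_x * np.outer(y, y) for y in candidate_labels]

Running MKL over such a base set then amounts to selecting the candidate labelings whose output kernels best explain the data.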


IEEE Transactions on Neural Networks and Learning Systems | 2017

Action and Event Recognition in Videos by Learning From Heterogeneous Web Sources

Li Niu; Xinxing Xu; Lin Chen; Lixin Duan; Dong Xu

In this paper, we propose new approaches for action and event recognition by leveraging a large number of freely available Web videos (e.g., from the Flickr video search engine) and Web images (e.g., from the Bing and Google image search engines). We address this problem by formulating it as a new multi-domain adaptation problem, in which heterogeneous Web sources are provided. Specifically, we are given different types of visual features (e.g., the DeCAF features from Bing/Google images and the trajectory-based features from Flickr videos) from heterogeneous source domains and all types of visual features from the target domain. Considering that the target domain is more relevant to some source domains than to others, we propose a new approach named multi-domain adaptation with heterogeneous sources (MDA-HS) to effectively make use of the heterogeneous sources. In MDA-HS, we simultaneously seek the optimal weights of multiple source domains, infer the labels of the target domain samples, and learn an optimal target classifier. Moreover, as textual descriptions are often available for both Web videos and images, we propose a novel approach called MDA-HS using privileged information (MDA-HS+) to effectively incorporate this valuable textual information into our MDA-HS method, based on the recent paradigm of learning using privileged information. MDA-HS+ can be further extended by using a new elastic-net-like regularization. We solve our MDA-HS and MDA-HS+ methods by using the cutting-plane algorithm, in which a multiple kernel learning problem is derived and solved. Extensive experiments on three benchmark data sets demonstrate that our proposed approaches are effective for action and event recognition without requiring any labeled samples from the target domain.
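
The domain-weighting intuition can be caricatured with a simple agreement loop; this is a heuristic stand-in for the paper's cutting-plane solver, and all names are mine.

import numpy as np

def weight_sources(source_scores, n_iter=10):
    # source_scores: one (n_target,) vector of decision values per source
    # domain, evaluated on unlabeled target samples.
    S = np.vstack(source_scores)                 # (n_sources, n_target)
    w = np.ones(len(S)) / len(S)                 # uniform domain weights
    for _ in range(n_iter):
        y = np.sign(w @ S)                       # inferred target pseudo-labels
        agree = (np.sign(S) == y).mean(axis=1)   # per-source agreement
        w = agree / agree.sum()                  # re-weight source domains
    return w, y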


Pacific-Rim Symposium on Image and Video Technology | 2010

Video Concept Detection Using Support Vector Machine with Augmented Features

Xinxing Xu; Dong Xu; Ivor W. Tsang

In this paper, we present a direct application of the Support Vector Machine with Augmented Features (AFSVM) to video concept detection. For each visual concept, we learn an adapted classifier by leveraging the pre-learnt SVM classifiers of other concepts. AFSVM re-trains the SVM classifier using an augmented feature, which concatenates the original feature vector with the decision value vector obtained from the pre-learnt SVM classifiers in the Reproducing Kernel Hilbert Space (RKHS). Experiments on the challenging TRECVID 2005 dataset demonstrate the effectiveness of AFSVM for video concept detection.
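
The augmented feature itself is simple to sketch. Here linear SVMs stand in for the kernelized classifiers of the paper, and all names are mine:

import numpy as np
from sklearn.svm import LinearSVC

def augmented_features(X, prelearned_clfs):
    # Concatenate the original features with the decision values of the
    # SVM classifiers pre-learnt for the other concepts.
    scores = np.column_stack([c.decision_function(X) for c in prelearned_clfs])
    return np.hstack([X, scores])

# Re-training for one concept on the augmented feature:
# clf = LinearSVC().fit(augmented_features(X_train, other_concept_clfs), y_train)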


International Joint Conference on Artificial Intelligence | 2016

Transfer Hashing with Privileged Information

Joey Tianyi Zhou; Xinxing Xu; Sinno Jialin Pan; Ivor W. Tsang; Zheng Qin; Rick Siow Mong Goh

Collaboration


Dive into Xinxing Xu's collaborations.

Top Co-Authors

Dong Xu (University of Sydney)
Joey Tianyi Zhou (Nanyang Technological University)
Shengye Yan (Nanyang Technological University)
Xuelong Li (Chinese Academy of Sciences)
Li Niu (Nanyang Technological University)
Lin Chen (Nanyang Technological University)
Lixin Duan (Nanyang Technological University)