Xiaotong Yuan
Nanjing University of Information Science and Technology
Publications
Featured research published by Xiaotong Yuan.
Computer Vision and Pattern Recognition | 2010
Xiaotong Yuan; Shuicheng Yan
We address the problem of computing a joint sparse representation of visual signals across multiple kernel-based representations. Such a problem arises naturally in supervised visual recognition applications where one aims to reconstruct a test sample with multiple features from as few training subjects as possible. We cast the linear version of this problem into a multi-task joint covariate selection model [15], which can be optimized very efficiently via a kernelizable accelerated proximal gradient method. Furthermore, two kernel-view extensions of this method are provided to handle situations where descriptors and similarity functions are given in the form of kernel matrices. We then investigate two applications of our algorithm to feature combination: 1) fusing gray-level and LBP features for face recognition, and 2) combining multiple kernels for object categorization. Experimental results on challenging real-world datasets show that the feature combination capability of the proposed algorithm is competitive with state-of-the-art multiple kernel learning methods.
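The paper's kernelizable solver is not reproduced here, but the linear core of the model is a multi-task least-squares problem with an L2,1 row-sparsity penalty, which a standard accelerated proximal gradient (FISTA-style) loop can optimize. The sketch below assumes one design matrix per feature type and a shared row-sparse coefficient matrix; the variable names and the fixed step size are illustrative, not taken from the paper.

```python
import numpy as np

def prox_l21(W, tau):
    """Row-wise soft-thresholding: proximal operator of tau * ||W||_{2,1}."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return W * scale

def multitask_apg(Xs, ys, lam=0.1, n_iter=200):
    """FISTA-style accelerated proximal gradient for the linear
    joint covariate selection model:
        min_W  0.5 * sum_k ||y_k - X_k w_k||^2 + lam * ||W||_{2,1}
    Xs: list of (m_k, n) design matrices, ys: list of (m_k,) targets."""
    n, K = Xs[0].shape[1], len(Xs)
    W = np.zeros((n, K))
    V, t = W.copy(), 1.0
    # Lipschitz constant of the smooth part (largest over tasks)
    L = max(np.linalg.norm(X, 2) ** 2 for X in Xs)
    for _ in range(n_iter):
        grad = np.column_stack([X.T @ (X @ V[:, k] - y)
                                for k, (X, y) in enumerate(zip(Xs, ys))])
        W_new = prox_l21(V - grad / L, lam / L)
        t_new = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        V = W_new + ((t - 1.0) / t_new) * (W_new - W)   # momentum step
        W, t = W_new, t_new
    return W
```

Row-wise soft-thresholding is what drives entire rows of W to zero, so the same small set of training subjects is selected jointly across feature types.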
International Conference on Machine Learning | 2009
Xiaotong Yuan; Bao-Gang Hu
In this paper, we present a robust feature extraction framework based on information-theoretic learning. Its objective aims at simultaneously maximizing Renyi's quadratic information potential of the features and the cross information potential between features and class labels. This objective function inherits robustness from both the redescending M-estimator and manifold regularization, and it can be efficiently optimized via half-quadratic optimization in an iterative manner. In addition, the popular feature extraction algorithms LPP, SRDA and LapRLS are all shown to be special cases of this framework. Extensive comparative experiments on several real-world data sets, with contaminated features or labels, validate the gain in algorithmic robustness offered by the proposed framework.
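For reference, Renyi's quadratic entropy is typically estimated with a Parzen window as H_2(X) = -log V(X), where V is the information potential. The functions below are a minimal numpy sketch of that estimator and of the cross information potential; the kernel width sigma is an illustrative parameter, not a value from the paper.

```python
import numpy as np

def quadratic_information_potential(X, sigma=1.0):
    """Parzen estimate of Renyi's quadratic information potential,
    V(X) = (1/N^2) * sum_ij G_{sigma*sqrt(2)}(x_i - x_j),
    so that Renyi's quadratic entropy is H_2(X) = -log V(X)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    s2 = 2.0 * sigma ** 2   # variance of the convolved Gaussian kernel
    G = np.exp(-d2 / (2.0 * s2)) / (2.0 * np.pi * s2) ** (X.shape[1] / 2.0)
    return G.mean()

def cross_information_potential(X, Y, sigma=1.0):
    """Parzen estimate of the cross information potential between X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    s2 = 2.0 * sigma ** 2
    G = np.exp(-d2 / (2.0 * s2)) / (2.0 * np.pi * s2) ** (X.shape[1] / 2.0)
    return G.mean()
```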
Computer Vision and Pattern Recognition | 2011
Yadong Mu; Jian Dong; Xiaotong Yuan; Shuicheng Yan
Exact recovery from contaminated visual data plays an important role in various tasks. By modeling the observed data matrix as the sum of a low-rank matrix and a sparse matrix, theoretical guarantees for exact recovery hold under mild conditions. In practice, the matrix nuclear norm is adopted as a convex surrogate of the non-convex matrix rank function to encourage the low-rank property, and it serves as the major component of the recently proposed Robust Principal Component Analysis (R-PCA). Recent endeavors have focused on enhancing the scalability of R-PCA to large-scale datasets, especially on mitigating the computational burden of the frequent large-scale Singular Value Decompositions (SVD) inherent in nuclear norm optimization. In our proposed scheme, the nuclear norm of an auxiliary matrix, related to the original low-rank matrix by random projection, is minimized instead. By design, the modified optimization entails SVD on matrices of much smaller scale than the original problem. Theoretical analysis justifies the proposed scheme and its greatly reduced optimization complexity. Both qualitative and quantitative studies on various computer vision benchmarks, including facial shadow removal, surveillance background modeling, and large-scale image tag transduction, validate its effectiveness. We also highlight that the proposed solution can serve as a general principle to accelerate many other nuclear-norm-oriented problems.
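The paper's full algorithm is not reproduced here; the sketch below only illustrates the central cost argument, that singular value thresholding (the workhorse of nuclear norm minimization) becomes much cheaper when applied to a randomly projected p x n matrix with p much smaller than m. The matrix sizes, the placeholder data, and the Gaussian projection are illustrative assumptions.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: proximal operator of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s = np.maximum(s - tau, 0.0)
    return (U * s) @ Vt

# Hypothetical illustration of the saving: threshold a p x n sketch
# instead of the full m x n matrix, with p << m.
m, n, p = 10000, 500, 100
D = np.random.randn(m, n)                 # stand-in for observed data
P = np.random.randn(p, m) / np.sqrt(p)    # random projection matrix
L_sketch = svt(P @ D, tau=1.0)            # SVD on a p x n matrix only
```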
Computer Vision and Pattern Recognition | 2007
Lun Zhang; Stan Z. Li; Xiaotong Yuan; Shiming Xiang
Classifying moving objects into semantically meaningful categories is important for automatic visual surveillance. However, this is a challenging problem due to limited object size, large intra-class variations caused by different viewing angles and lighting, and the real-time performance requirements of real-world applications. This paper describes an appearance-based method that achieves real-time, robust object classification across diverse camera viewing angles. A new descriptor, the multi-block local binary pattern (MB-LBP), is proposed to capture large-scale structures in object appearance. Based on MB-LBP features, an AdaBoost algorithm is introduced to select a subset of discriminative features and to construct a strong two-class classifier. To deal with the non-metric feature values of MB-LBP, a multi-branch regression tree is developed as the weak classifier for boosting. Finally, the error-correcting output code (ECOC) is introduced to achieve robust multi-class classification. Experimental results show that our approach achieves real-time, robust object classification in diverse scenes.
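A minimal sketch of computing an MB-LBP code follows: average the pixel intensities over a 3x3 grid of blocks and threshold the eight surrounding block means against the central one. The block layout and bit ordering here are illustrative choices, not necessarily those of the paper.

```python
import numpy as np

def mb_lbp(image, x, y, block_w, block_h):
    """MB-LBP code at (x, y): mean intensity of each of the 3x3 blocks,
    then the 8 neighbouring block means are thresholded against the
    central block mean to form an 8-bit code."""
    means = np.empty((3, 3))
    for i in range(3):
        for j in range(3):
            patch = image[y + i * block_h: y + (i + 1) * block_h,
                          x + j * block_w: x + (j + 1) * block_w]
            means[i, j] = patch.mean()
    # Neighbouring blocks in clockwise order, starting at the top-left.
    neighbours = [means[0, 0], means[0, 1], means[0, 2], means[1, 2],
                  means[2, 2], means[2, 1], means[2, 0], means[1, 0]]
    center = means[1, 1]
    code = 0
    for bit, v in enumerate(neighbours):
        code |= int(v >= center) << bit
    return code
```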
Computer Vision and Pattern Recognition | 2010
Bineng Zhong; Hongxun Yao; Sheng Chen; Rongrong Ji; Xiaotong Yuan; Shaohui Liu; Wen Gao
Long-term persistent tracking in ever-changing environments is a challenging task that often requires solving difficult object appearance update problems. Most top-performing methods rely on online learning-based algorithms, but an inherent problem of such trackers is drift, the gradual adaptation of the tracker to non-targets. To alleviate this problem, we consider visual tracking in a novel weakly supervised learning scenario where (possibly noisy) labels, but no ground truth, are provided by multiple imperfect oracles (i.e., trackers), some of which may be mediocre. A probabilistic approach is proposed to simultaneously infer the most likely object position and the accuracy of each tracker. Moreover, an online tracker evaluation strategy and a heuristic training data selection scheme are adopted to make the inference more effective and faster. Consequently, the proposed method avoids the pitfalls of purely single-tracker approaches and obtains reliably labeled samples to incrementally update each tracker (if it is an appearance-adaptive tracker) to capture appearance changes. Extensive comparative experiments on challenging video sequences demonstrate the robustness and effectiveness of the proposed method.
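The paper's probabilistic model is not reproduced here; the toy sketch below only conveys the flavour of the inference, alternating between a precision-weighted consensus position and per-tracker reliability estimates, in the spirit of learning from multiple noisy oracles. All names and the weighting rule are illustrative.

```python
import numpy as np

def fuse_trackers(positions, n_iter=10):
    """Toy EM-style consensus over K tracker outputs.
    positions: (K, 2) array, one (x, y) estimate per tracker.
    Returns the consensus position and normalized tracker reliabilities."""
    K = positions.shape[0]
    weights = np.ones(K) / K
    for _ in range(n_iter):
        consensus = (weights[:, None] * positions).sum(0) / weights.sum()
        errors = np.linalg.norm(positions - consensus, axis=1)
        weights = 1.0 / (errors ** 2 + 1e-6)   # closer to consensus -> more reliable
    return consensus, weights / weights.sum()
```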
ACM Multimedia | 2011
Meng Wang; Xiaotong Yuan; Mengdi Xu; Jianguo Jiang; Shuicheng Yan; Tat-Seng Chua
More than 66 million people suffer from hearing impairment, and this disability makes video content difficult to understand due to the loss of audio information. If scripts are available, captioning technology can help to a certain degree by synchronously displaying the scripts while videos play. However, we show that existing captioning techniques are far from satisfactory in helping the hearing-impaired audience enjoy videos. In this article, we introduce a scheme to enhance video accessibility using a Dynamic Captioning approach, which draws on a rich set of technologies including face detection and recognition, visual saliency analysis, and text-speech alignment. Different from existing methods, which are categorized as static captioning, dynamic captioning places scripts at suitable positions to help the hearing-impaired audience better recognize the speaking characters. In addition, it progressively highlights the scripts word by word by aligning them with the speech signal, and it illustrates the variation of voice volume. In this way, the special audience can better track the scripts and perceive the moods conveyed by the variation of volume. We implemented the technology on 20 video clips and conducted an in-depth study with 60 real hearing-impaired users. The results demonstrate the effectiveness and usefulness of the video accessibility enhancement scheme.
International Conference on Computer Vision | 2011
Xiangyu Chen; Xiaotong Yuan; Qiang Chen; Shuicheng Yan; Tat-Seng Chua
We introduce in this paper a novel approach to multi-label image classification which incorporates a new type of context, label exclusive context, with linear representation and classification. Given a set of exclusive label groups that describe the negative relationships among class labels, our method, namely Label Exclusive Linear Representation (LELR), enforces repulsive assignment of the labels from each group to a query image. The problem can be formulated as an exclusive Lasso (eLasso) model with group overlaps and affine transformation. Since existing eLasso solvers are not directly applicable to this variant of eLasso, we propose a Nesterov's smoothing approximation algorithm for efficient optimization. Extensive comparative experiments on challenging real-world visual classification benchmarks demonstrate the effectiveness of incorporating label exclusive context into visual classification.
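For reference, the exclusive Lasso penalty sums, over each label group, the squared L1 norm of the coefficients in that group, which penalizes assigning several labels from the same exclusive group to one image. Below is a minimal sketch of evaluating this penalty with (possibly overlapping) groups; the smoothing-based optimization itself is not reproduced.

```python
import numpy as np

def exclusive_lasso_penalty(w, groups):
    """Exclusive Lasso penalty: sum_g ( sum_{j in g} |w_j| )^2.
    Groups may overlap; the square encourages competition within a group."""
    return sum(np.abs(w[list(g)]).sum() ** 2 for g in groups)

# Toy usage: labels {0, 1} and {1, 2} form two exclusive groups.
w = np.array([0.8, 0.1, -0.5])
print(exclusive_lasso_penalty(w, groups=[[0, 1], [1, 2]]))
```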
IEEE Transactions on Knowledge and Data Engineering | 2012
Xiaotong Yuan; Bao-Gang Hu; Ran He
Mean-Shift (MS) is a powerful nonparametric clustering method. Although it can achieve good accuracy, its computational cost is high even on moderately sized data sets. In this paper, for the purpose of algorithmic speedup, we develop an agglomerative MS clustering method along with its performance analysis. Our method, namely Agglo-MS, is built upon an iterative query set compression mechanism motivated by the quadratic bounding optimization nature of the MS algorithm. The whole framework can be efficiently implemented with linear running-time complexity. We then extend Agglo-MS into an incremental version which performs comparably to its batch counterpart. The efficiency and accuracy of Agglo-MS are demonstrated by extensive comparative experiments on synthetic and real data sets.
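For context, the sketch below shows the standard Gaussian-kernel mean-shift update that Agglo-MS accelerates: each iteration moves the query point to the kernel-weighted mean of the data, which is the quadratic-bound optimization step the paper's query-set compression builds on. The bandwidth and stopping tolerance are illustrative values.

```python
import numpy as np

def mean_shift(X, x, bandwidth=1.0, n_iter=50, tol=1e-5):
    """Gaussian-kernel mean-shift mode seeking for a single query point x.
    X: (N, d) data array; returns the mode that x converges to."""
    for _ in range(n_iter):
        w = np.exp(-((X - x) ** 2).sum(1) / (2.0 * bandwidth ** 2))
        x_new = (w[:, None] * X).sum(0) / w.sum()   # weighted mean of the data
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x
```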
Pattern Recognition | 2011
Hui Yan; Xiaotong Yuan; Shuicheng Yan; Jingyu Yang
Most feature selection algorithms based on information-theoretic learning (ITL) adopt a ranking process or greedy search as their search strategy. The former selects features individually and thus ignores feature interactions and dependencies. The latter heavily relies on the search path, since only one path is explored with no possibility of backtracking. In addition, both strategies typically lead to heuristic algorithms. To cope with these problems, this article proposes a novel feature selection framework based on correntropy in ITL, namely correntropy-based feature selection using binary projection (BPFS). Our framework selects features by projecting the original high-dimensional data to a low-dimensional space through a special binary projection matrix. The formulated objective aims at maximizing the correntropy between the selected features and the class labels, and it can be optimized efficiently with standard mathematical tools: we apply the half-quadratic method in an iterative manner, where each iteration reduces to an assignment subproblem that can be solved very efficiently with off-the-shelf toolboxes. Comparative experiments on six real-world datasets indicate that our framework is effective and efficient.
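As background, correntropy between two variables is commonly estimated as the average of a Gaussian kernel applied to their differences; BPFS maximizes this quantity between the projected features and the class labels. A minimal estimator sketch follows; the kernel width is an illustrative parameter.

```python
import numpy as np

def correntropy(a, b, sigma=1.0):
    """Sample estimate of correntropy:
    V(a, b) = (1/N) * sum_i G_sigma(a_i - b_i), with a Gaussian kernel.
    a and b are arrays of paired samples (1-D or (N, d))."""
    d2 = (a - b) ** 2
    if d2.ndim > 1:                  # multi-dimensional samples
        d2 = d2.sum(axis=1)
    return np.mean(np.exp(-d2 / (2.0 * sigma ** 2)))
```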
Asian Conference on Machine Learning | 2009
Ran He; Bao-Gang Hu; Xiaotong Yuan
In this paper, we propose a Robust Discriminant Analysis based on the maximum entropy (MaxEnt) criterion (MaxEnt-RDA), which is derived from a nonparametric estimate of Renyi's quadratic entropy. MaxEnt-RDA uses entropy as both objective and constraint; thus the structural information of classes is preserved while information loss is minimized. It is a natural extension of LDA from the Gaussian assumption to arbitrary distributions. Like LDA, the optimal solution of MaxEnt-RDA can be obtained by an eigen-decomposition method, where feature extraction is achieved by designing two Parzen probability matrices that characterize the within-class and between-class variation, respectively. Furthermore, MaxEnt-RDA makes use of high-order statistics (entropy) to estimate the probability matrices, making it robust to outliers. Experiments on toy problems, UCI datasets, and face datasets demonstrate the effectiveness of the proposed method in comparison with other state-of-the-art methods.
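Like LDA, the final step reduces to a generalized eigendecomposition of two matrices, here Parzen-window estimates of within-class and between-class variation rather than Gaussian scatter matrices. A generic sketch of that projection step is given below; it does not reproduce the paper's construction of the two matrices.

```python
import numpy as np
from scipy.linalg import eigh

def discriminant_projection(S_between, S_within, n_components):
    """LDA-style projection: solve the generalized eigenproblem
    S_between v = lambda * S_within v and keep the leading eigenvectors.
    Both inputs are symmetric (d, d) matrices; S_within must be positive definite."""
    vals, vecs = eigh(S_between, S_within)       # eigenvalues in ascending order
    return vecs[:, ::-1][:, :n_components]        # top n_components directions
```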