Bing-Kun Bao
Chinese Academy of Sciences
Publication
Featured research published by Bing-Kun Bao.
IEEE Transactions on Image Processing | 2012
Bing-Kun Bao; Guangcan Liu; Changsheng Xu; Shuicheng Yan
In this paper, we address the error correction problem, that is, to uncover the low-dimensional subspace structure from high-dimensional observations that are possibly corrupted by errors. When the errors follow a Gaussian distribution, principal component analysis (PCA) can find the optimal (in terms of least-square error) low-rank approximation to high-dimensional data. However, the canonical PCA method is known to be extremely fragile to the presence of gross corruptions. Recently, Wright et al. established the so-called robust principal component analysis (RPCA) method, which handles grossly corrupted data well. However, RPCA is a transductive method and does not generalize well to new samples that were not involved in the training procedure. Given a new datum, RPCA essentially needs to recalculate over all the data, resulting in high computational cost, so it is inappropriate for applications that require fast online computation. To overcome this limitation, we propose an inductive robust principal component analysis (IRPCA) method. Given a set of training data, unlike RPCA, which targets recovery of the original data matrix, IRPCA aims to learn the underlying projection matrix, which can be used to efficiently remove possible corruptions in any datum. The learning is done by solving a nuclear-norm regularized minimization problem, which is convex and can be solved in polynomial time. Extensive experiments on a benchmark human face dataset and two video surveillance datasets show that IRPCA is not only robust to gross corruptions but also handles new data well and efficiently.
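Solvers for nuclear-norm regularized problems such as the IRPCA objective are built around singular value thresholding (SVT), the proximal operator of the nuclear norm. Below is a minimal NumPy sketch; the toy data and threshold are illustrative and not the paper's actual algorithm:

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: shrink every singular value of M
    by tau and reconstruct -- the proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

# Toy: a rank-1 matrix plus small noise. Thresholding suppresses the
# noise singular values and leaves an (approximately) rank-1 matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 1)) @ rng.standard_normal((1, 40))
X += 0.01 * rng.standard_normal((50, 40))
L = svt(X, tau=1.0)
print(np.linalg.matrix_rank(L, tol=0.5))  # prints 1
```

Iterating this operator inside a proximal scheme is the standard route to solving nuclear-norm minimization in polynomial time, as the abstract notes.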
IEEE Transactions on Systems, Man, and Cybernetics | 2013
Chao Sun; Tianzhu Zhang; Bing-Kun Bao; Changsheng Xu; Tao Mei
Sign language recognition is a growing research area in the field of computer vision. One challenge is to model the various signs, which vary in time resolution, visual manual appearance, and so on. In this paper, we propose a discriminative exemplar coding (DEC) approach, utilizing the Kinect sensor, to model various signs. The proposed DEC method can be summarized in three steps. First, a set of class-specific candidate exemplars is learned from the sign language videos of each sign category by considering their discrimination. Then, each sign video is described as a set of similarities between its frames and the candidate exemplars. Instead of simply using a heuristic distance measure, the similarities are decided by a set of exemplar-based classifiers through multiple instance learning, in which a positive (or negative) video is treated as a positive (or negative) bag, and frames similar to the given exemplar in Euclidean space are treated as instances. Finally, we formulate the selection of the most discriminative exemplars in a unified framework that simultaneously produces a sign video classifier for recognition. To evaluate our method, we collect an American sign language dataset, which includes approximately 2000 phrases, each captured by the Kinect sensor with color, depth, and skeleton information. Experimental results on our dataset demonstrate the feasibility and effectiveness of the proposed approach for sign language recognition.
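As a rough illustration of the encoding step, a video can be summarized by one similarity score per candidate exemplar. The Gaussian-of-minimum-distance similarity below is a hypothetical stand-in for the exemplar-based classifiers the paper learns via multiple instance learning:

```python
import numpy as np

def exemplar_coding(frames, exemplars):
    """Describe a video (frames: n x d feature matrix) as a vector of
    similarities to the candidate exemplars (list of d-dim vectors)."""
    codes = []
    for e in exemplars:
        d = np.linalg.norm(frames - e, axis=1)  # distance of every frame to e
        codes.append(np.exp(-d.min()))          # best-matching frame decides
    return np.array(codes)

frames = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 0.0]])
exemplars = [np.array([1.0, 1.0]), np.array([10.0, 10.0])]
codes = exemplar_coding(frames, exemplars)  # first exemplar matches a frame exactly
```

A video containing a frame identical to an exemplar gets the maximum code of 1.0 for it, while distant exemplars contribute near-zero codes.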
IEEE Transactions on Image Processing | 2013
Bing-Kun Bao; Guangyu Zhu; Jialie Shen; Shuicheng Yan
Recent techniques based on sparse representation (SR) have demonstrated promising performance in high-level visual recognition, exemplified by highly accurate face recognition under occlusion and other sparse corruptions. Most research in this area has focused on classification algorithms using raw image pixels, and very few methods utilize quantized visual features, such as the popular bag-of-words feature abstraction. In such cases, besides the inherent quantization errors, ambiguity in visual word assignment and misdetection of feature points, due to factors such as visual occlusions and noise, constitute the major causes of dense corruption of the quantized representation. Dense corruptions can jeopardize the decision process by distorting the patterns of the sparse reconstruction coefficients. In this paper, we aim to eliminate the corruptions and achieve robust image analysis with SR. Toward this goal, we introduce two transfer processes (ambiguity transfer and misdetection transfer) to account for the two major sources of corruption discussed above. By reasonably assuming the rarity of the two kinds of distortion processes, we augment the original SR-based reconstruction objective with ℓ0-norm regularization on the transfer terms to encourage sparsity and, hence, discourage dense distortion/transfer. Computationally, we relax the nonconvex ℓ0-norm optimization into a convex ℓ1-norm optimization problem and employ the accelerated proximal gradient method, yielding an updating procedure with provable convergence. Extensive experiments on four benchmark datasets, Caltech-101, Caltech-256, Corel-5k, and CMU PIE (pose, illumination, and expression), demonstrate the necessity of removing the quantization corruptions and the various advantages of the proposed framework.
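Such ℓ1 relaxations are typically minimized with proximal-gradient iterations. Below is a plain (non-accelerated) ISTA sketch for min_x ½‖Ax − y‖² + λ‖x‖₁; the step size, λ, and iteration count are illustrative assumptions:

```python
import numpy as np

def ista(A, y, lam, iters=500):
    """Iterative soft-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1.
    The accelerated variant adds a momentum (extrapolation) step
    on top of this basic loop."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        z = x - step * (A.T @ (A @ x - y))   # gradient step on the smooth part
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # prox of lam*||.||_1
    return x
```

When A has orthonormal columns, the minimizer is soft-thresholding of A^T y in closed form, which makes the sketch easy to sanity-check.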
Neurocomputing | 2014
Jing Wang; Ke Lu; Daru Pan; Ning He; Bing-Kun Bao
Object removal can be accomplished by image inpainting, which obtains a visually plausible interpolation of an occluded or damaged region. There are two key components in an exemplar-based image inpainting approach: computing the filling priority of patches in the missing region and searching for the best matching patch. In this paper, we present a robust exemplar-based method. In the improved model, a regularization factor is introduced to adjust the patch priority function. A modified sum of squared differences (SSD) and normalized cross correlation (NCC) are combined to search for the best matching patch. We evaluate the proposed method by applying it to real-life photos and testing the removal of large objects. The results demonstrate the effectiveness of the approach.
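The patch-matching step can be sketched by combining the two measures; the equal weighting and the normalization below are assumptions for illustration, not the paper's exact formula:

```python
import numpy as np

def ssd(p, q):
    # Sum of squared differences between two equal-shaped patches.
    return float(np.sum((p - q) ** 2))

def ncc(p, q):
    # Normalized cross correlation, in [-1, 1].
    p0, q0 = p - p.mean(), q - q.mean()
    denom = np.linalg.norm(p0) * np.linalg.norm(q0)
    return float((p0 * q0).sum() / denom) if denom > 0 else 0.0

def best_match(target, candidates, alpha=0.5):
    # Low SSD and high NCC are both good, so mix per-pixel SSD with
    # (1 - NCC); returns the index of the best candidate patch.
    def score(c):
        return alpha * ssd(target, c) / target.size + (1 - alpha) * (1.0 - ncc(target, c))
    return min(range(len(candidates)), key=lambda i: score(candidates[i]))

target = np.arange(9.0).reshape(3, 3)
candidates = [target.copy(), target[::-1].copy(), np.full((3, 3), 4.0)]
idx = best_match(target, candidates)  # the identical patch wins
```

An exact copy of the target scores zero (SSD = 0, NCC = 1), so it is always preferred over flipped or flat patches.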
international conference on multimedia retrieval | 2013
Bing-Kun Bao; Weiqing Min; Ke Lu; Changsheng Xu
This paper is devoted to detecting social, real-world events from the images/videos shared on social media sites like Flickr and YouTube. The fast-growing content makes such sites gold mines for social event detection, but the associated heterogeneous metadata, such as time-stamp, location, visual content, and textual content, remain challenging to process. Different from traditional early or late fusion of the different types of metadata, we represent them as a star-structured K-partite graph: the social media items form the central vertex set, and each type of metadata forms an auxiliary vertex set; the auxiliary sets are pairwise independent of each other but correlated with the central one. Based on this graph, we propose the Social Event Detection with Robust High-Order Co-Clustering (SED-RHOCC) algorithm, which comprises two steps: 1) coarse event detection and 2) refinement of clusters and samples. In the first step, by exploiting the inter-relationships in the constructed star-structured K-partite graph and the intra-relationships within some metadata sets, such as time-stamps, we co-cluster the social media items and the associated metadata separately and iteratively, avoiding the information loss of early/late fusion. In the second step, a post-processing step refines the clusters and social media samples. The MediaEval Social Event Detection Dataset [1] and its subset are selected to demonstrate the effectiveness of our approach on datasets with and without non-event samples.
IEEE Transactions on Multimedia | 2012
Bing-Kun Bao; Teng Li; Shuicheng Yan
IEEE Transactions on Image Processing | 2013
Bing-Kun Bao; Guangcan Liu; Shuicheng Yan; Changsheng Xu
Pattern Recognition | 2011
Bing-Kun Bao; Bingbing Ni; Yadong Mu; Shuicheng Yan
IEEE Transactions on Multimedia | 2014
Weiqing Min; Changsheng Xu; Min Xu; Xian Xiao; Bing-Kun Bao
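The star-structured K-partite graph underlying SED-RHOCC can be sketched as a simple data structure: media items are the central vertex set, and each metadata type contributes an auxiliary vertex set linked only to the center. The class and method names below are illustrative, not from the paper:

```python
from collections import defaultdict

class StarKPartiteGraph:
    """Central vertices are media items; auxiliary vertices are
    (metadata_type, value) pairs. Auxiliary sets never link to each
    other, only to the central set -- hence the star structure."""
    def __init__(self):
        self._edges = defaultdict(set)

    def add_media(self, media_id, metadata):
        # metadata: dict mapping a type ("time", "tag", ...) to its values
        for mtype, values in metadata.items():
            for value in values:
                self._edges[(mtype, value)].add(media_id)

    def media_for(self, mtype, value):
        # All central vertices attached to one auxiliary vertex.
        return self._edges[(mtype, value)]

g = StarKPartiteGraph()
g.add_media("photo1", {"tag": ["concert"], "time": ["2012-06"]})
g.add_media("photo2", {"tag": ["concert"]})
shared = g.media_for("tag", "concert")  # both photos share the tag vertex
```

Co-clustering then alternates between grouping the central items and grouping each auxiliary set through these center-auxiliary links.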
acm multimedia | 2012
Bing-Kun Bao; Weiqing Min; Jitao Sang; Changsheng Xu
Conventional semisupervised image annotation algorithms usually propagate labels predominantly via holistic similarities over image representations and do not fully consider the label locality, inter-label similarity, and intra-label diversity among multilabel images. Taking these problems into consideration, we present the hidden-concept driven image annotation and label ranking algorithm (HDIALR), which conducts label propagation based on similarity over a visually and semantically consistent hidden-concept space. The proposed method has the following characteristics: 1) each holistic image representation is implicitly decomposed into label representations to reveal label locality; the decomposition is guided by the so-called hidden concepts, characterizing image regions and reconstructing both visual and nonvisual labels of the entire image; 2) each label is represented by a linear combination of hidden concepts, and similar linear coefficients reveal inter-label similarity; 3) each hidden concept is expressed as a respective subspace, and different expressions of the same label over the subspace then induce intra-label diversity; and 4) a sparse coding-based graph is proposed to enforce collective consistency between image labels and image representations, naturally avoiding the dilemma of possible inconsistency between pairwise label similarity and image representation similarity in the multilabel scenario. These properties are finally embedded in a regularized nonnegative data factorization formulation, which decomposes image representations into label representations over both labeled and unlabeled data for label propagation and ranking. The objective function is iteratively optimized by an updating procedure with provable convergence. Extensive experiments on three benchmark image datasets validate the effectiveness of our proposed solution to the semisupervised multilabel image annotation and label ranking problem.
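At the core of such a formulation is nonnegative data factorization with multiplicative updates. The plain Lee-Seung-style sketch below omits HDIALR's label and graph regularization terms, so it illustrates only the unregularized building block:

```python
import numpy as np

def nmf(X, r, iters=500, seed=0):
    """Plain multiplicative-update NMF: X (m x n, nonnegative) is
    factorized as W @ H with W (m x r) and H (r x n) nonnegative.
    Each update is guaranteed not to increase the Frobenius error."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r)) + 0.1
    H = rng.random((r, n)) + 0.1
    eps = 1e-9  # guards against division by zero
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Exactly rank-2 nonnegative data should be reconstructed closely.
rng = np.random.default_rng(1)
X = rng.random((20, 2)) @ rng.random((2, 15))
W, H = nmf(X, r=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

In HDIALR, the factors would additionally be tied to labels and to the sparse coding-based graph, but the alternating multiplicative structure is the same.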