Shu Kong | Researchain

Publication

Featured research published by Shu Kong.

European Conference on Computer Vision | 2016

Photo Aesthetics Ranking Network with Attributes and Content Adaptation

Shu Kong; Xiaohui Shen; Zhe L. Lin; Radomir Mech; Charless C. Fowlkes

Real-world applications could benefit from the ability to automatically generate a fine-grained ranking of photo aesthetics. However, previous methods for image aesthetics analysis have primarily focused on the coarse, binary categorization of images into high- or low-aesthetic categories. In this work, we propose to learn a deep convolutional neural network to rank photo aesthetics, in which the relative ranking of photo aesthetics is directly modeled in the loss function. Our model incorporates joint learning of meaningful photographic attributes and image content information, which can help regularize the complicated photo aesthetics rating problem.
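As a rough illustration of what modeling relative ranking in the loss function can look like, here is a minimal sketch of a pairwise hinge loss. It is an assumption for illustration only: the abstract does not give the paper's actual loss, and the function name `pairwise_ranking_loss` and the `margin` parameter are hypothetical.

```python
import numpy as np

def pairwise_ranking_loss(scores, ratings, margin=0.1):
    """Mean hinge loss over all pairs (i, j) where photo i is rated above photo j.

    scores  : predicted aesthetic scores, shape (n,)
    ratings : ground-truth ratings, shape (n,)
    margin  : hypothetical minimum gap required between the two predictions
    """
    scores = np.asarray(scores, dtype=float)
    ratings = np.asarray(ratings, dtype=float)
    # diff[i, j] = scores[i] - scores[j]
    diff = scores[:, None] - scores[None, :]
    # mask[i, j] is True when photo i is rated strictly higher than photo j
    mask = ratings[:, None] > ratings[None, :]
    if not mask.any():
        return 0.0
    # Penalize pairs whose predicted gap falls short of the margin
    return float(np.maximum(0.0, margin - diff[mask]).mean())

# Predictions that agree with the rating order incur no loss;
# predictions in the reverse order are penalized.
loss_consistent = pairwise_ranking_loss([0.9, 0.5, 0.1], [3, 2, 1])
loss_reversed = pairwise_ranking_loss([0.1, 0.5, 0.9], [3, 2, 1])
```

A loss of this shape depends only on the order and spacing of the scores, which is what distinguishes ranking from binary high/low classification.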

Pattern Recognition | 2014

A classification-oriented dictionary learning model

Donghui Wang; Shu Kong

Empirically, we find that although each object category has its own exclusively discriminative features, the various classes of objects usually share some common patterns that do not contribute to discriminating between them. Building on this observation and motivated by the success of the dictionary learning (DL) framework, we propose to explicitly learn a class-specific dictionary (called the particularity) for each category that captures the most discriminative features of that category, and simultaneously learn a common pattern pool (called the commonality), whose atoms are shared by all the categories and contribute only to the representation of the data rather than to discrimination. In this way, the particularity differentiates the categories while the commonality provides the essential reconstruction for the objects. Thus, we can simply adopt a reconstruction-based scheme for classification. Compared with existing DL-based classification methods, our approach simultaneously learns a classification-oriented dictionary and drives the sparse coefficients to be as discriminative as possible. In this way, the proposed method is expected to achieve better classification performance. To evaluate our method, we conduct extensive experiments on both synthetic data and real-world benchmarks in comparison with existing DL-based classification algorithms, and the experimental results demonstrate the effectiveness of our method. Highlights: We propose a discriminative dictionary learning method for image classification. Our method learns class-specific feature sub-dictionaries and a common pattern pool. The class-specific dictionary captures the most discriminative features of the class. The common pattern pool complements the representation of images over the dictionary. We provide an explanation of our model and comparisons with other methods.
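The reconstruction-based classification idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: the dictionaries are taken as given rather than learned, ordinary least squares stands in for sparse coding, and the name `classify_by_reconstruction` is hypothetical.

```python
import numpy as np

def classify_by_reconstruction(y, common, particular):
    """Return the index of the class whose dictionaries reconstruct y best.

    y          : sample to classify, shape (d,)
    common     : shared pattern pool (commonality), shape (d, k0)
    particular : list of class-specific dictionaries, each of shape (d, k_c)
    """
    residuals = []
    for D_c in particular:
        # Each class may use the shared atoms plus its own atoms
        D = np.hstack([common, D_c])
        # Least squares is a stand-in for the sparse coding step (assumption)
        coef, *_ = np.linalg.lstsq(D, y, rcond=None)
        residuals.append(np.linalg.norm(y - D @ coef))
    return int(np.argmin(residuals))

# Toy data: one shared atom along axis 0, one private atom per class.
common = np.array([[1.0], [0.0], [0.0]])
particular = [np.array([[0.0], [1.0], [0.0]]),   # class 0
              np.array([[0.0], [0.0], [1.0]])]   # class 1
y = np.array([0.5, 1.0, 0.0])  # shared component plus the class-0 pattern
predicted = classify_by_reconstruction(y, common, particular)
```

In the toy data the shared component is reconstructed equally well for every class, so only the class-specific atoms decide the label.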

Computer Vision and Pattern Recognition | 2017

Low-Rank Bilinear Pooling for Fine-Grained Classification

Shu Kong; Charless C. Fowlkes

Pooling second-order local feature statistics to form a high-dimensional bilinear feature has been shown to achieve state-of-the-art performance on a variety of fine-grained classification tasks. To address the computational demands of high feature dimensionality, we propose to represent the covariance features as a matrix and apply a low-rank bilinear classifier. The resulting classifier can be evaluated without explicitly computing the bilinear feature map, which allows for a large reduction in compute time and decreases the effective number of parameters to be learned. To further compress the model, we propose a classifier co-decomposition that factorizes the collection of bilinear classifiers into a common factor and compact per-class terms. The co-decomposition idea can be deployed through two convolutional layers and trained in an end-to-end architecture. We suggest a simple yet effective initialization that avoids explicitly first training and factorizing the larger bilinear classifiers. Through extensive experiments, we show that our model achieves state-of-the-art performance on several public datasets for fine-grained classification trained with only category labels. Importantly, our final model is an order of magnitude smaller than the recently proposed compact bilinear model [8], and three orders of magnitude smaller than the standard bilinear CNN model [19].
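To make concrete why a low-rank bilinear classifier can be evaluated without forming the bilinear feature, the following sketch compares the two computations numerically. It assumes the classifier matrix has the symmetric low-rank form W = U_pos U_pos^T - U_neg U_neg^T; the names `U_pos` and `U_neg`, the sizes, and the random data are illustrative choices, not values from the paper.

```python
import numpy as np

def score_explicit(X, U_pos, U_neg):
    """Score via the explicit c x c bilinear feature B = X X^T."""
    B = X @ X.T
    W = U_pos @ U_pos.T - U_neg @ U_neg.T
    # Inner product <W, B> between the classifier matrix and the feature
    return float(np.sum(W * B))

def score_low_rank(X, U_pos, U_neg):
    """Same score, computed without ever forming X X^T or W."""
    # tr(U^T X X^T U) equals the squared Frobenius norm of U^T X
    return float(np.sum((U_pos.T @ X) ** 2) - np.sum((U_neg.T @ X) ** 2))

rng = np.random.default_rng(0)
c, n, r = 64, 49, 4            # channels, spatial locations, rank (all assumed)
X = rng.standard_normal((c, n))
U_pos = rng.standard_normal((c, r))
U_neg = rng.standard_normal((c, r))

s_explicit = score_explicit(X, U_pos, U_neg)
s_low_rank = score_low_rank(X, U_pos, U_neg)
```

The low-rank path only ever handles r x n matrices, so its cost grows with the rank r rather than with the c x c size of the bilinear feature.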

Image and Vision Computing | 2013

Integration of multi-feature fusion and dictionary learning for face recognition

Donghui Wang; Xikui Wang; Shu Kong

Recent research places increasing emphasis on analyzing multiple features to improve face recognition (FR) performance. One popular scheme is to extend the sparse representation based classification framework with various sparse constraints. Although these methods jointly study multiple features through the constraints, they process each feature individually and therefore overlook the possible high-level relationships among different features. It is reasonable to assume that the low-level features of facial images, such as edge information and smoothed/low-frequency images, can be fused into a more compact and more discriminative representation based on the latent high-level relationship. FR on the fused features is anticipated to perform better than FR on the original features, since the fused features have more favorable properties. Focusing on this, we propose two different strategies that start from fusing multiple features and then exploit the dictionary learning (DL) framework for better FR performance. The first strategy is a simple and efficient two-step model, which learns a fusion matrix from training face images to fuse multiple features and then learns class-specific dictionaries based on the fused features. The second is a more effective model, requiring more computational time, that learns the fusion matrix and the class-specific dictionaries simultaneously within an iterative optimization procedure. In addition, the second model separates the shared common components from the class-specific dictionaries to enhance the discriminative power of the dictionaries.
The proposed strategies, which integrate the multi-feature fusion process and the dictionary learning framework for FR, realize the following goals: (1) exploiting multiple features of face images for better FR performance; (2) learning a fusion matrix to merge the features into a more compact and more discriminative representation; (3) learning class-specific dictionaries with consideration of the common patterns for better classification performance. We perform a series of experiments on publicly available databases to evaluate our methods, and the experimental results demonstrate the effectiveness of the proposed models.

IEEE International Conference on Automatic Face and Gesture Recognition | 2013

Multiple feature fusion for face recognition

Shu Kong; Xikui Wang; Donghui Wang; Fei Wu

Recent studies show that face recognition (FR) with additional features achieves better performance than FR with a single feature. Different features can represent different characteristics of human faces, and using them effectively has a positive effect on FR. Meanwhile, advances in sparse coding have enabled researchers to develop various recognition methods that work with multiple features. However, even though these methods achieve very encouraging performance, some intrinsic problems remain. First, these methods directly encode the multiple features over the original training set, so redundant, noisy, and trivial information is incorporated and the recognition performance can be compromised. Second, as the training data grow in number, the joint encoding process can become very time-consuming. Third, these methods ignore semantic relationships among the features that could boost FR performance. Thus, coarsely using all the features not only adds computational burden but also prevents further improvement. To address these issues, we propose to fuse the multiple features into a more compact and more discriminative representation for better FR performance. We also take advantage of the dictionary learning framework to derive an effective recognition scheme. We evaluate our model by comparing it with other state-of-the-art approaches, and the experimental results demonstrate the effectiveness of our approach.

Pattern Recognition Letters | 2012

Feature selection from high-order tensorial data via sparse decomposition

Donghui Wang; Shu Kong

Principal component analysis (PCA) suffers from the fact that each principal component (PC) is a linear combination of all the original variables, which makes the results difficult to interpret. For this reason, sparse PCA (sPCA), which produces modified PCs with sparse loadings, was introduced to ease interpretation. However, because sPCA is limited to vector-represented data, using it to reduce dimensionality and select significant features from real-world data, which are often naturally represented by high-order tensors, requires reshaping the data into vectors beforehand; this destroys the intrinsic data structure and induces the curse of dimensionality. To address this issue, we consider the problem of finding a set of critical features with multi-directional sparse loadings directly from the tensorial data, and propose a novel method called sparse high-order PCA (sHOPCA) to derive a set of sparse loadings in multiple directions. A computational complexity analysis is also presented to illustrate the efficiency of sHOPCA. To evaluate sHOPCA, we perform several experiments on both synthetic and real-world datasets, and the experimental results demonstrate the merit of sHOPCA for sparse representation of high-order tensorial data.

Computer Vision and Pattern Recognition | 2016

Spatially Aware Dictionary Learning and Coding for Fossil Pollen Identification

Shu Kong; Surangi W. Punyasena; Charless C. Fowlkes

We propose a robust approach for performing automatic species-level recognition of fossil pollen grains in microscopy images that exploits both global shape and local texture characteristics in a patch-based matching methodology. We introduce a novel criterion for selecting meaningful and discriminative exemplar patches. We optimize this criterion during training using a greedy submodular function optimization framework that gives a near-optimal solution with bounded approximation error. We use these selected exemplars as a dictionary basis and propose a spatially-aware sparse coding method to match testing images for identification while maintaining global shape correspondence. To accelerate the coding process for fast matching, we introduce a relaxed form that uses spatially-aware soft-thresholding during coding. Finally, we carry out an experimental study that demonstrates the effectiveness and efficiency of our exemplar selection and classification mechanisms, achieving 86.13% accuracy on a difficult fine-grained species classification task distinguishing three types of fossil spruce pollen.
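For readers unfamiliar with soft-thresholding as a fast substitute for iterative sparse coding, here is a minimal sketch of the plain operator. How the paper makes the thresholding spatially aware is not described in the abstract, so this shows only the generic version; the `encode` helper and the per-atom threshold array are assumptions for illustration.

```python
import numpy as np

def soft_threshold(z, lam):
    """Shrink each entry of z toward zero by lam, zeroing small entries."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def encode(D, x, lam):
    """One-shot approximate sparse code of x over dictionary D (columns = atoms).

    lam may be a scalar or a per-atom array; a per-atom array is one
    hypothetical way to weight atoms differently, e.g. by spatial position.
    """
    # Correlate x with every atom, then keep only the strong responses
    return soft_threshold(D.T @ x, lam)

shrunk = soft_threshold(np.array([-2.0, -0.5, 0.5, 3.0]), 1.0)

D = np.eye(3)                      # trivial dictionary, for illustration only
x = np.array([2.0, 0.3, -1.5])
code = encode(D, x, np.array([0.5, 0.5, 2.0]))  # larger threshold on the third atom
```

A single matrix product followed by a shrinkage step avoids an iterative solver, which is the source of the speed-up the abstract refers to.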

IEEE International Conference on Automatic Face and Gesture Recognition | 2013

Learning individual-specific dictionaries with fused multiple features for face recognition

Shu Kong; Donghui Wang

Recent research places increasing emphasis on exploring multiple features to improve classification performance. One popular scheme is to extend the sparse representation-based classification framework with various regularizations. These methods sparsely encode the query image over the training set under different constraints, and achieve very encouraging performance in various applications, especially in face recognition (FR). However, they focus only on how to collaboratively encode the query, and ignore the latent relationships among the multiple features that could further improve classification accuracy. It is reasonable to anticipate that the low-level features of facial images, such as edges and smoothed/low-frequency images, can be fused through these relationships into a more compact and more discriminative representation for better FR performance. Focusing on this, we propose a unified framework for FR that takes advantage of this latent relationship and makes full use of the fused features. Our method can realize the following tasks: (1) learning a specific dictionary for each individual that captures the most distinctive features; (2) learning a common pattern pool that provides the less-discriminative patterns shared by all individuals, such as illumination and pose; (3) simultaneously learning a fusion matrix to merge the features into a more discriminative and more compact representation. We perform a series of experiments on publicly available databases to evaluate our method, and the experimental results demonstrate the effectiveness of our proposed approach.

WRI Global Congress on Intelligent Systems | 2010

Traffic Sign Recognition Using Dictionary Learning Method

Xiao Deng; Donghui Wang; Lili Cheng; Shu Kong

Recent research has paid increasing attention to traffic sign recognition because of its important role in intelligent transportation systems. In traditional methods for this task, the traffic signs are first located using their color or shape information, and a classifier is then applied for classification. In this paper, we propose a novel framework that uses a sparse model for traffic information representation and a probabilistic classifier for classification. Results of experiments using examples from the Caltech 101 Object Categories show that the proposed method is efficient for traffic sign recognition.

WRI Global Congress on Intelligent Systems | 2010

Sparse Representation for Three-Dimensional Number Ball Recognition

Lili Cheng; Donghui Wang; Xiao Deng; Shu Kong

We consider the classification problem as a linear regression problem, and find that sparse signal representation offers the key to addressing this problem. Therefore, a new method based on sparse representation is proposed for classification. This new method provides insights into two critical issues in classification: sparse representation and classification. For sparse representation, we use the lasso [1],[8], the elastic net [2] and the nonnegative garrote [3] as the initial estimate of a new test sample. In the classification stage, we assign the test sample to a class via a simple l2-distance measurement. Finally, we propose an efficient algorithm for computing the whole solution path of this method, and conduct extensive experiments on number ball recognition. From the experimental results, we conclude that this method achieves a high recognition rate.
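The two stages described above, a lasso estimate followed by an l2-distance decision, can be sketched as follows. This is a minimal illustration under assumptions: the lasso is solved with a basic ISTA loop rather than the solution-path algorithm the abstract mentions, the decision rule is the per-class reconstruction residual commonly used in sparse-representation classification, and the function names and toy data are hypothetical.

```python
import numpy as np

def lasso_ista(A, y, lam=0.01, n_iter=500):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 with iterative soft-thresholding."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - y) / L      # gradient step on the squared error
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # shrinkage step
    return x

def classify(A, labels, y, lam=0.01):
    """Assign y to the class whose training columns reconstruct it best."""
    labels = np.asarray(labels)
    x = lasso_ista(A, y, lam)
    best_label, best_residual = None, np.inf
    for c in np.unique(labels):
        x_c = np.where(labels == c, x, 0.0)        # keep only class-c coefficients
        residual = np.linalg.norm(y - A @ x_c)     # l2 distance to the class-c fit
        if residual < best_residual:
            best_label, best_residual = c, residual
    return best_label

# Toy training set: columns are samples, two per class.
A = np.array([[1.0, 0.9, 0.0, 0.0],
              [0.0, 0.1, 1.0, 0.9],
              [0.0, 0.0, 0.0, 0.1]])
A = A / np.linalg.norm(A, axis=0)          # unit-norm columns
labels = [0, 0, 1, 1]
pred_a = classify(A, labels, np.array([1.0, 0.05, 0.0]))
pred_b = classify(A, labels, np.array([0.05, 1.0, 0.0]))
```

The same `classify` routine would accept an elastic net or nonnegative garrote estimate in place of `lasso_ista`, since only the coefficient vector is used in the decision step.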
