Publication


Featured research published by Song Bai.


IEEE Signal Processing Letters | 2015

DeepPano: Deep Panoramic Representation for 3-D Shape Recognition

Baoguang Shi; Song Bai; Zhichao Zhou; Xiang Bai

This letter introduces DeepPano, a robust representation of 3-D shapes learned with deep convolutional neural networks (CNNs). First, each 3-D shape is converted into a panoramic view, namely a cylinder projection around its principal axis. Then, a variant of CNN is specifically designed to learn deep representations directly from such views. Unlike a typical CNN, a row-wise max-pooling layer is inserted between the convolution and fully-connected layers, making the learned representations invariant to rotation around the principal axis. Our approach achieves state-of-the-art retrieval/classification results on two large-scale 3-D model datasets (ModelNet-10 and ModelNet-40), outperforming typical methods by a large margin.
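The rotation invariance of the row-wise max-pooling layer can be seen directly: rotating the shape around its principal axis circularly shifts the panorama (and hence the feature map) along the column axis, and a per-row maximum is unchanged by any circular shift. A minimal numpy sketch of the pooling step:

```python
import numpy as np

def row_wise_max_pool(feature_map):
    """Max-pool each row across all columns.

    A rotation of the 3-D shape around its principal axis circularly
    shifts the panoramic feature map along the column axis; the per-row
    maximum is unchanged by any such shift, so the pooled vector is
    rotation invariant.
    """
    return feature_map.max(axis=1)

# The pooled descriptor is identical for a feature map and any circular
# column shift of it (here a hypothetical 8-row, 32-column map).
fm = np.random.rand(8, 32)
shifted = np.roll(fm, 5, axis=1)  # simulate a rotation of the shape
assert np.allclose(row_wise_max_pool(fm), row_wise_max_pool(shifted))
```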


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

3D Shape Matching via Two Layer Coding

Xiang Bai; Song Bai; Zhuotun Zhu; Longin Jan Latecki

View-based 3D shape retrieval is a popular branch of 3D shape analysis owing to the high discriminative power of 2D views. However, many previous works do not scale up to large 3D shape databases. We propose a two layer coding (TLC) framework to conduct shape matching much more efficiently. The first layer of coding is applied to pairs of views represented as depth images. The spatial relationship of each view pair is captured with the so-called eigen-angle, the planar angle between the two views measured at the center of the 3D shape. Prior to the second layer of coding, the view pairs are divided into subsets according to their eigen-angles. Consequently, view pairs that differ significantly in their eigen-angles are encoded with different codewords, which means the spatial arrangement of views is preserved in the second layer of coding. The final feature vector of a 3D shape is the concatenation of all the encoded features from the different subsets, and is used directly for efficient indexing. TLC is not limited to encoding local features from 2D views; it can also be applied to encoding 3D features. Exhaustive experimental results confirm that TLC achieves state-of-the-art performance in both retrieval accuracy and efficiency.
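Since views are taken from camera positions around the shape, the eigen-angle of a view pair reduces to the angle between the two camera direction vectors measured at the shape center. A short sketch, assuming views are given as 3-D camera positions (the function and argument names are illustrative, not from the paper):

```python
import numpy as np

def eigen_angle(view_a, view_b, center=(0.0, 0.0, 0.0)):
    """Planar angle (radians) between two camera positions, measured at
    the shape center, as described for TLC's view-pair grouping."""
    a = np.asarray(view_a, dtype=float) - center
    b = np.asarray(view_b, dtype=float) - center
    cos = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

# Two orthogonal viewpoints yield an eigen-angle of pi/2; view pairs
# would then be bucketed by this angle before the second layer of coding.
assert abs(eigen_angle((1, 0, 0), (0, 1, 0)) - np.pi / 2) < 1e-9
```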


IEEE Transactions on Image Processing | 2016

Sparse Contextual Activation for Efficient Visual Re-Ranking

Song Bai; Xiang Bai

In this paper, we propose an extremely efficient algorithm for visual re-ranking. By considering the original pairwise distances in the contextual space, we develop a feature vector called sparse contextual activation (SCA) that encodes the local distribution of an image. Hence, the re-ranking task can be accomplished simply by vector comparison under the generalized Jaccard metric, which has a theoretical grounding in fuzzy set theory. To improve the time efficiency of the re-ranking procedure, an inverted index is introduced to speed up the computation of the generalized Jaccard metric. As a result, the average re-ranking time for a query can be kept within 1 ms. Furthermore, inspired by query expansion, we also develop an additional method called local consistency enhancement on top of SCA to improve retrieval performance in an unsupervised manner. On the other hand, retrieval performance using a single feature may not be satisfactory, which motivates us to fuse multiple complementary features for accurate retrieval. Based on SCA, a robust feature fusion algorithm is developed that preserves the high time efficiency. We assess the proposed method on various visual re-ranking tasks. Experimental results on the Princeton shape benchmark (3D object), WM-SRHEC07 (3D competition), YAEL dataset B (face), MPEG-7 dataset (shape), and Ukbench dataset (image) demonstrate the effectiveness and efficiency of SCA.
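The generalized Jaccard metric used to compare SCA vectors is the standard extension of Jaccard similarity to nonnegative real vectors: the sum of element-wise minima over the sum of element-wise maxima. A minimal sketch:

```python
import numpy as np

def generalized_jaccard(x, y):
    """Generalized Jaccard similarity between two nonnegative vectors:
    sum of element-wise minima over sum of element-wise maxima.
    Sparse activation vectors make this cheap, since only shared
    nonzero entries contribute to the numerator."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    return np.minimum(x, y).sum() / np.maximum(x, y).sum()

# Identical activations give similarity 1; disjoint supports give 0.
a = np.array([0.5, 0.0, 0.2])
assert generalized_jaccard(a, a) == 1.0
assert generalized_jaccard(a, np.array([0.0, 0.3, 0.0])) == 0.0
```

Because the numerator is nonzero only where both vectors are nonzero, an inverted index over nonzero entries lets the comparison skip non-overlapping images entirely, which is what keeps the per-query re-ranking cost so low.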


Neurocomputing | 2016

Deep Learning Representation using Autoencoder for 3D Shape Retrieval

Zhuotun Zhu; Xinggang Wang; Song Bai; Cong Yao; Xiang Bai

We study the problem of building a deep learning representation for 3D shapes. Deep learning has been shown to be very effective in a variety of visual applications, such as image classification and object detection. However, it had not been successfully applied to 3D shape recognition, because a 3D shape has a complex structure in 3D space and only a limited number of 3D shapes are available for feature learning. To address these problems, we project 3D shapes into 2D space and use an autoencoder for feature learning on the 2D images. High-accuracy 3D shape retrieval is obtained by aggregating the features learned on the 2D images. In addition, we show that the proposed deep learning feature is complementary to conventional local image descriptors. By combining the global deep learning representation and the local descriptor representation, our method obtains state-of-the-art performance on 3D shape retrieval benchmarks.


Computer Vision and Pattern Recognition | 2016

GIFT: A Real-Time and Scalable 3D Shape Search Engine

Song Bai; Xiang Bai; Zhichao Zhou; Zhaoxiang Zhang; Longin Jan Latecki

Projective analysis is an important approach to 3D shape retrieval, since human visual perception of 3D shapes relies on various 2D observations from different viewpoints. Although multiple informative and discriminative views are utilized, most projection-based retrieval systems suffer from heavy computational cost and thus cannot satisfy the basic scalability requirement of a search engine. In this paper, we present a real-time 3D shape search engine based on the projective images of 3D shapes. Its real-time property results from the following aspects: (1) efficient projection and view feature extraction using GPU acceleration; (2) the first inverted file, referred to as F-IF, which speeds up the procedure of multi-view matching; (3) the second inverted file (S-IF), which captures the local distribution of 3D shapes in the feature manifold and is adopted for efficient context-based reranking. As a result, each query can be answered within one second despite the necessary I/O overhead. We name the proposed 3D shape search engine, which combines GPU acceleration and the Inverted File (Twice), GIFT. Besides its high efficiency, GIFT also significantly outperforms state-of-the-art methods in retrieval accuracy on various shape benchmarks and competitions.
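The core idea behind an inverted file for multi-view matching can be sketched in a few lines: quantize each shape's view features into codewords, map each codeword to the shapes that contain it, and answer a query by touching only the posting lists of its codewords rather than scanning the whole database. The names and the set-of-codewords input below are illustrative assumptions, not GIFT's actual data layout:

```python
from collections import defaultdict

def build_inverted_file(shape_views):
    """Map each codeword to the set of shapes containing it.
    `shape_views` maps a shape id to its set of quantized view
    codewords (a hypothetical input; a real system would quantize
    CNN view features against a trained codebook)."""
    inv = defaultdict(set)
    for shape_id, codewords in shape_views.items():
        for cw in codewords:
            inv[cw].add(shape_id)
    return inv

def candidates(inv, query_codewords):
    """Union of shapes sharing at least one codeword with the query,
    so matching cost scales with posting-list sizes, not database size."""
    hits = set()
    for cw in query_codewords:
        hits |= inv.get(cw, set())
    return hits

db = {"chair_1": {3, 7, 9}, "table_2": {1, 7}, "lamp_3": {4}}
inv = build_inverted_file(db)
assert candidates(inv, {7}) == {"chair_1", "table_2"}
```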


Computer Vision and Pattern Recognition | 2017

Scalable Person Re-identification on Supervised Smoothed Manifold

Song Bai; Xiang Bai; Qi Tian

Most existing person re-identification algorithms either extract robust visual features or learn discriminative metrics for person images. However, the underlying manifold on which those images reside is rarely investigated. This gives rise to the problem that the learned metric is not smooth with respect to the local geometry of the data manifold. In this paper, we study person re-identification with manifold-based affinity learning, which has not received enough attention in this area. An unconventional manifold-preserving algorithm is proposed, which can 1) make the best use of supervision from the training data, whose label information is given as pairwise constraints; 2) scale up to large repositories with low online time complexity; and 3) be plugged into most existing algorithms, serving as a generic postprocessing procedure to further boost identification accuracy. Extensive experimental results on five popular person re-identification benchmarks consistently demonstrate the effectiveness of our method. In particular, on the largest datasets, CUHK03 and Market-1501, our method outperforms state-of-the-art alternatives by a large margin with high efficiency, making it well suited to practical applications.


Information Sciences | 2015

Beyond diffusion process

Xiang Bai; Song Bai; Xinggang Wang

Highlights: (1) NSS is proposed to replace the diffusion process in capturing the geometry of the underlying manifold in shape and image retrieval; (2) NSS is more precise than the diffusion process and more robust to noise; (3) NSS is computed more efficiently than the diffusion process and no longer needs an iterative process to guarantee retrieval precision; (4) we obtain state-of-the-art retrieval performance on several benchmark datasets.

Reliably measuring the similarity between two instances, shape or image, is a challenging problem in shape and image retrieval. In this paper, a simple yet effective method called Neighbor Set Similarity (NSS) is proposed, which is superior to both traditional pairwise similarity and the diffusion process. NSS makes full use of contextual information to capture the geometry of the underlying manifold, and obtains a more precise measure than the original pairwise similarity. Moreover, based on NSS, we propose a powerful fusion process that exploits the complementarity of different descriptors to further enhance retrieval performance. Experimental results on the MPEG-7 shape dataset, the N-S image dataset, and the ORL face dataset demonstrate the effectiveness of the proposed method. In addition, the time complexity of NSS is much lower than that of the diffusion process, which suggests that NSS is better suited to large-scale image retrieval.
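The abstract does not give NSS's exact formula, but the flavor of a neighbor-set-based contextual similarity can be illustrated with a deliberately simplified stand-in: score two instances by the overlap of their k-nearest-neighbor sets, with no iterative diffusion step. This is a sketch of the general idea only, not the paper's formulation:

```python
def neighbor_set_similarity(neighbors_a, neighbors_b):
    """Contextual similarity from neighbor-set overlap (Jaccard of the
    k-nearest-neighbor sets). An illustrative stand-in for NSS:
    instances whose neighborhoods on the manifold overlap heavily are
    scored as similar, in one pass and with no iteration."""
    inter = len(neighbors_a & neighbors_b)
    union = len(neighbors_a | neighbors_b)
    return inter / union if union else 0.0

# Two shapes sharing 2 of 4 distinct neighbors score 0.5.
assert neighbor_set_similarity({1, 2, 3}, {2, 3, 4}) == 0.5
```

Note how the cost is linear in the neighbor-set sizes, which is the structural reason a neighbor-set measure can undercut the iterative matrix updates of a diffusion process.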


Pattern Recognition Letters | 2015

Neural shape codes for 3D model retrieval

Song Bai; Xiang Bai; Wenyu Liu; Fabio Roli

Highlights: (1) a practical way of applying deep learning to the 3D model retrieval task; (2) the activations of different layers of a CNN are shown to be complementary; (3) state-of-the-art performance on many 3D shape benchmark datasets.

The paradigm of the Convolutional Neural Network (CNN) has already shown its potential for many challenging applications of computer vision, such as image classification, object detection, and action recognition. In this paper, the task of 3D model retrieval is addressed by exploiting this promising paradigm. However, 3D models are usually represented as a collection of orderless points, lines, and surfaces in three-dimensional space, which makes it difficult to apply operations such as convolution and pooling. We propose a practical and effective way of applying a CNN to 3D model retrieval, by training the network on the depth projections of the 3D models. This CNN is regarded as a generic feature extractor for depth images. With large amounts of training data, the learned feature, called Neural Shape Codes, can handle the various deformation changes that exist in shape analysis. The reported experimental results on several 3D shape benchmark datasets show the superior performance of the proposed method.


European Conference on Computer Vision | 2016

Smooth Neighborhood Structure Mining on Multiple Affinity Graphs with Applications to Context-Sensitive Similarity

Song Bai; Shaoyan Sun; Xiang Bai; Zhaoxiang Zhang; Qi Tian

Owing to its ability to capture the geometric structure of the data manifold, the diffusion process has demonstrated impressive performance in retrieval tasks by spreading similarities over the affinity graph. For robustness to noisy edges, the diffusion process is usually localized, i.e., similarities are propagated only via neighbors. However, selecting neighbors smoothly on graph-based manifolds has been more or less ignored by previous works. In this paper, we propose a new algorithm called Smooth Neighborhood (SN) that mines the neighborhood structure so as to satisfy the manifold assumption. By doing so, nearby points on the underlying manifold are guaranteed to yield neighbors that are as similar as possible. Moreover, SN is adapted to handle multiple affinity graphs by imposing a weight learning paradigm, which is the primary difference from related works that are applicable to only one affinity graph. Exhaustive experimental results and comparisons against other algorithms demonstrate the effectiveness of the proposed algorithm.


International Conference on Computer Vision | 2017

Ensemble Diffusion for Retrieval

Song Bai; Zhichao Zhou; Jingdong Wang; Xiang Bai; Longin Jan Latecki; Qi Tian

As a postprocessing procedure, the diffusion process has demonstrated its ability to substantially improve the performance of various visual retrieval systems. Meanwhile, great effort has also been devoted to similarity (or metric) fusion, since no single type of similarity can fully reveal the intrinsic relationships between objects. This stimulates great research interest in considering similarity fusion within the framework of the diffusion process (i.e., fusion with diffusion) for robust retrieval. In this paper, we first revisit representative methods of fusion with diffusion and provide new insights that were overlooked by previous researchers. Then, observing that existing algorithms are susceptible to noisy similarities, we bundle the proposed Regularized Ensemble Diffusion (RED) with an automatic weight learning paradigm so that the negative impact of noisy similarities is suppressed. Finally, we integrate several recently proposed similarities into the proposed framework. The experimental results suggest that we achieve new state-of-the-art performance on various retrieval tasks, including 3D shape retrieval on the ModelNet dataset and image retrieval on the Holidays and Ukbench datasets.
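To make the "diffusion process" concrete, here is a sketch of one common variant (not RED's regularized, weight-learning formulation): iteratively update the similarity matrix as a blend of graph-propagated similarities and the original affinities, so that similarity spreads along the manifold structure of the data. The update rule and parameter names are assumptions for illustration:

```python
import numpy as np

def diffuse(affinity, alpha=0.8, iters=20):
    """A basic diffusion-process sketch: iterate
        W <- alpha * S @ W @ S.T + (1 - alpha) * W0,
    where S is the row-normalized affinity (transition) matrix.
    Similarities are thereby propagated along graph paths while the
    (1 - alpha) term anchors the result to the original affinities."""
    w0 = np.asarray(affinity, float)
    s = w0 / w0.sum(axis=1, keepdims=True)  # row-stochastic transitions
    w = w0.copy()
    for _ in range(iters):
        w = alpha * s @ w @ s.T + (1 - alpha) * w0
    return w

# Diffusion raises the similarity of points linked only through a
# common neighbor (items 0 and 2 start with zero direct affinity).
w = np.array([[1.0, 0.9, 0.0],
              [0.9, 1.0, 0.9],
              [0.0, 0.9, 1.0]])
assert diffuse(w)[0, 2] > w[0, 2]
```

Fusion with diffusion, as discussed above, then amounts to combining several such affinity matrices (one per similarity type) inside the update, which is where RED's automatic weights come in to downweight the noisy ones.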

Collaboration


Dive into Song Bai's collaborations.

Top Co-Authors

Xiang Bai, Huazhong University of Science and Technology
Zhichao Zhou, Huazhong University of Science and Technology
Qi Tian, University of Texas at San Antonio
Xinggang Wang, Huazhong University of Science and Technology
Zhaoxiang Zhang, Chinese Academy of Sciences
Cong Yao, Huazhong University of Science and Technology
Wenyu Liu, Huazhong University of Science and Technology
Xinwei He, Huazhong University of Science and Technology