Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Yinan Yu is active.

Publication


Featured research published by Yinan Yu.


Computer Vision and Pattern Recognition | 2011

Boosted local structured HOG-LBP for object localization

Junge Zhang; Kaiqi Huang; Yinan Yu; Tieniu Tan

Object localization is a challenging problem due to variations in object structure and illumination. Although existing part-based models have achieved impressive progress in the past several years, their improvement is still limited by low-level feature representation. This paper therefore studies the description of object structure at both the feature level and the topology level. Following the bottom-up paradigm, we propose a boosted Local Structured HOG-LBP based object detector. First, at the feature level, we propose the Local Structured Descriptor to capture an object's local structure, and develop descriptors from shape and texture information, respectively. Second, at the topology level, we present a boosted feature selection and fusion scheme for part-based object detectors. All experiments are conducted on the challenging PASCAL VOC2007 dataset. Experimental results show that our method achieves state-of-the-art performance.
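To make the two feature levels concrete, here is a minimal sketch (not the authors' code) of a window descriptor that fuses shape (HOG) and texture (LBP) cues with scikit-image; the window size and histogram parameters are illustrative assumptions.

```python
# Sketch of a HOG + LBP window descriptor in the spirit of the paper;
# parameters below are illustrative, not the authors' settings.
import numpy as np
from skimage.feature import hog, local_binary_pattern

def hog_lbp_descriptor(window, P=8, R=1.0):
    """Concatenate a HOG block descriptor with a uniform-LBP histogram
    for one grayscale detection window (2D uint8 array)."""
    # Shape cue: gradient-orientation histograms over cells/blocks.
    hog_vec = hog(window, orientations=9, pixels_per_cell=(8, 8),
                  cells_per_block=(2, 2), feature_vector=True)
    # Texture cue: histogram of uniform LBP codes over the window.
    lbp = local_binary_pattern(window, P, R, method="uniform")
    lbp_hist, _ = np.histogram(lbp, bins=np.arange(P + 3), density=True)
    return np.concatenate([hog_vec, lbp_hist])

window = (np.random.rand(64, 128) * 255).astype(np.uint8)  # toy window
desc = hog_lbp_descriptor(window)
print(desc.shape)
```

In a full detector, such window descriptors would feed the boosted feature selection and fusion stage described above.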


Computer Vision and Pattern Recognition | 2015

Deep multiple instance learning for image classification and auto-annotation

Jiajun Wu; Yinan Yu; Chang Huang; Kai Yu

The recent development of learned deep representations has demonstrated wide applicability to traditional vision tasks like classification and detection. However, there has been little investigation into how to build a deep learning framework in a weakly supervised setting. In this paper, we model deep learning in a weakly supervised (multiple instance learning) framework. In our setting, each image follows a dual multi-instance assumption, where its object proposals and possible text annotations can be regarded as two instance sets. We design effective systems to exploit the MIL property with deep learning strategies from both ends, and also jointly learn the relationship between object and annotation proposals. Extensive experiments show that our weakly supervised deep learning framework not only achieves convincing performance on vision tasks including classification and image annotation, but also extracts reasonable region-keyword pairs with little supervision, both on widely used benchmarks like PASCAL VOC and MIT Indoor Scene 67 and on a dataset with image- and patch-level annotations.
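The dual multi-instance assumption can be illustrated with a small numpy sketch: an image (bag) is positive for a class if at least one of its object proposals (instances) is. The proposal scores below are hypothetical CNN outputs, and noisy-OR pooling is one common MIL aggregator, not necessarily the paper's.

```python
# Toy MIL pooling: aggregate per-proposal class probabilities into
# one image-level probability per class.
import numpy as np

def mil_bag_probability(instance_probs, mode="noisy_or"):
    """instance_probs: (n_instances, n_classes) array of probabilities."""
    if mode == "max":          # bag score = best single instance
        return instance_probs.max(axis=0)
    # noisy-OR: the bag is positive unless every instance is negative.
    return 1.0 - np.prod(1.0 - instance_probs, axis=0)

proposal_probs = np.array([[0.05, 0.10],   # proposal 1: P(dog), P(cat)
                           [0.90, 0.20],   # proposal 2
                           [0.10, 0.15]])  # proposal 3
print(mil_bag_probability(proposal_probs))         # noisy-OR pooling
print(mil_bag_probability(proposal_probs, "max"))  # max pooling
```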


IEEE Transactions on Image Processing | 2012

A Novel Algorithm for View and Illumination Invariant Image Matching

Yinan Yu; Kaiqi Huang; Wei Chen; Tieniu Tan

The main challenges in local-feature-based image matching are variations in view and illumination. Many methods have recently been proposed to address these problems using invariant feature detectors and distinctive descriptors. However, matching performance is still unstable and inaccurate, particularly under large variations in view or illumination. In this paper, we propose a view and illumination invariant image-matching method. We iteratively estimate the relative view and illumination relationship between the images, transform the view of one image to the other, and normalize their illumination for accurate matching. Our method does not aim to increase the invariance of the detector but to improve the accuracy, stability, and reliability of the matching results. Matching performance is significantly improved and is unaffected by changes in view and illumination within a valid range. The proposed method fails only when the initial view and illumination estimation fails, which offers a new way to evaluate traditional detectors. We propose two novel indicators for detector evaluation, namely valid angle and valid illumination, which reflect the maximum allowable change in view and illumination, respectively. Extensive experimental results show that our method improves traditional detectors significantly, even under large variations, and that the two indicators are much more distinctive.
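A simplified OpenCV sketch of the iterative idea, assuming 8-bit grayscale inputs: alternately estimate the geometric relation between the views, warp one image toward the other, roughly normalize illumination, and re-match. SIFT and histogram equalization stand in here for the paper's actual detector and illumination model.

```python
# Iterative view/illumination normalization for matching (a sketch,
# not the authors' implementation).
import cv2
import numpy as np

def iterative_match(img1, img2, iters=3):
    """img1, img2: 8-bit grayscale; returns a homography mapping img2
    coordinates into img1 coordinates, refined over a few rounds."""
    sift = cv2.SIFT_create()
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    ref = cv2.equalizeHist(img1)          # crude illumination normalization
    warped = img2.copy()
    H_total = np.eye(3)
    for _ in range(iters):
        k1, d1 = sift.detectAndCompute(ref, None)
        k2, d2 = sift.detectAndCompute(cv2.equalizeHist(warped), None)
        # Lowe ratio test to keep distinctive matches only.
        good = [m for m, n in matcher.knnMatch(d1, d2, k=2)
                if m.distance < 0.75 * n.distance]
        if len(good) < 4:                 # homography needs >= 4 matches
            break
        src = np.float32([k2[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
        dst = np.float32([k1[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
        H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
        if H is None:
            break
        H_total = H @ H_total             # compose with previous estimate
        # Transform the view of img2 toward img1 and iterate.
        warped = cv2.warpPerspective(img2, H_total, img1.shape[::-1])
    return H_total
```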


International Conference on Computer Vision | 2015

A Deep Visual Correspondence Embedding Model for Stereo Matching Costs

Zhuoyuan Chen; Xun Sun; Liang Wang; Yinan Yu; Chang Huang

This paper presents a data-driven matching cost for stereo matching. A novel deep visual correspondence embedding model is trained via a convolutional neural network on a large set of stereo images with ground-truth disparities. This deep embedding model leverages appearance data to learn visual similarity relationships between corresponding image patches, and explicitly maps intensity values into an embedding feature space to measure pixel dissimilarities. Experimental results on the KITTI and Middlebury datasets demonstrate the effectiveness of our model. First, we show that the new measure of pixel dissimilarity outperforms traditional matching costs. Furthermore, when integrated with a global stereo framework, our method ranks in the top 3 among all two-frame algorithms on the KITTI benchmark. Finally, cross-validation results show that our model makes correct predictions for unseen data outside its labeled training set.
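A hedged PyTorch sketch of a siamese patch-embedding matching cost in this spirit; the patch size, layer sizes, and cosine dissimilarity are illustrative assumptions, not the paper's architecture.

```python
# Both patches are mapped into a feature space by a shared network;
# the stereo matching cost is a dissimilarity between the embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PatchEmbed(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(               # toy ConvNet on 1x13x13 patches
            nn.Conv2d(1, 32, 3), nn.ReLU(),
            nn.Conv2d(32, 64, 3), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 9 * 9, dim),
        )
    def forward(self, x):
        return F.normalize(self.net(x), dim=1)  # unit-length embedding

def matching_cost(embed, left_patch, right_patch):
    """Cost = 1 - cosine similarity; low when patches correspond."""
    fl, fr = embed(left_patch), embed(right_patch)
    return 1.0 - (fl * fr).sum(dim=1)

embed = PatchEmbed()
left = torch.randn(8, 1, 13, 13)    # batch of left-image patches
right = torch.randn(8, 1, 13, 13)   # candidate right-image patches
print(matching_cost(embed, left, right).shape)  # torch.Size([8])
```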


Computer Vision and Pattern Recognition | 2015

Object detection by labeling superpixels

Junjie Yan; Yinan Yu; Xiangyu Zhu; Zhen Lei; Stan Z. Li

Object detection is often conducted by object proposal generation followed by classification. This paper instead handles object detection in a superpixel-oriented manner rather than a proposal-oriented one. Specifically, it casts object detection as a multi-label superpixel labeling problem solved by minimizing an energy function, with a data cost term to capture appearance, a smoothness cost term to encode spatial context, and a label cost term to favor compact detections. The data cost is learned through a convolutional neural network, and the parameters of the labeling model are learned through a structural SVM. Compared with methods based on proposal generation and classification, the proposed superpixel labeling method can naturally detect objects missed by the proposal generation step and capture global image context to infer overlapping objects. The proposed method shows its advantage on Pascal VOC and ImageNet. Notably, it performs better than the ImageNet ILSVRC2014 winner GoogLeNet (45.0% vs. 43.9% mAP) with much shallower and fewer CNNs.
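A toy numpy sketch of the energy being scored, with illustrative names and weights: a data term from per-superpixel class costs, a Potts smoothness term over adjacent superpixels, and a label cost per distinct label used. A real system would minimize this with graph cuts or a similar solver; here we only evaluate one labeling.

```python
# E(labels) = sum_i data(i, l_i) + smoothness over edges + label cost.
import numpy as np

def labeling_energy(labels, unary, edges, lam=1.0, label_cost=0.5):
    """labels: (n_superpixels,) ints; unary: (n_superpixels, n_labels)
    CNN-derived data costs; edges: adjacent superpixel index pairs."""
    data = unary[np.arange(len(labels)), labels].sum()
    # Potts smoothness: penalize adjacent superpixels with different labels.
    smooth = lam * sum(labels[i] != labels[j] for i, j in edges)
    # Label cost: fixed price per distinct label used (favors compactness).
    cost = label_cost * len(np.unique(labels))
    return data + smooth + cost

unary = np.array([[0.1, 0.9], [0.2, 0.8], [0.7, 0.3]])  # 3 superpixels, 2 labels
edges = [(0, 1), (1, 2)]
print(labeling_energy(np.array([0, 0, 1]), unary, edges))
```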


International Conference on Pattern Recognition | 2014

Early Hierarchical Contexts Learned by Convolutional Networks for Image Segmentation

Zifeng Wu; Yongzhen Huang; Yinan Yu; Liang Wang; Tieniu Tan

We propose a foreground segmentation method based on convolutional networks. To predict the label of a pixel in an image, the model takes a hierarchical context as input, obtained by combining multiple context patches at different scales. Short-range contexts depict local details, while long-range contexts capture the object-scene relationships in an image. "Early" means that we combine the context patches of a pixel into a hierarchical one before any trainable layers are learned, i.e., early-combining. In contrast, late-combining means the combination occurs later, e.g., after the convolutional feature extractor in a network has already been learned. We find that it is vital for the whole model to jointly learn the patterns of contexts at different scales in our task, and experiments show that early-combining performs better than late-combining. On a dataset built by Baidu IDL for a recent person segmentation contest, our method beats all competitors by a considerable margin. Qualitative results also suggest that the proposed method is nearly ready for practical application.
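Early-combining can be sketched as building the network input by cropping context patches at several scales around a pixel and resizing them to a common resolution; the scales and output size below are illustrative assumptions.

```python
# Build a hierarchical-context input for one pixel before any
# trainable layers see it (early-combining).
import numpy as np
import cv2

def hierarchical_context(image, y, x, scales=(32, 64, 128), out=32):
    """Return a (len(scales), out, out) stack centered at pixel (y, x)."""
    pad = max(scales) // 2
    padded = cv2.copyMakeBorder(image, pad, pad, pad, pad, cv2.BORDER_REFLECT)
    cy, cx = y + pad, x + pad
    patches = []
    for s in scales:
        patch = padded[cy - s // 2: cy + s // 2, cx - s // 2: cx + s // 2]
        patches.append(cv2.resize(patch, (out, out)))  # common resolution
    return np.stack(patches)   # channels = contexts, short to long range

img = np.random.rand(240, 320).astype(np.float32)
print(hierarchical_context(img, 120, 160).shape)       # (3, 32, 32)
```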


Asian Conference on Computer Vision | 2009

A Harris-like scale invariant feature detector

Yinan Yu; Kaiqi Huang; Tieniu Tan

Image feature detection is a fundamental issue in computer vision. SIFT [1] and SURF [2] are very effective at scale-space feature detection, but their stability is limited because unstable features such as edges are often detected even though they apply edge suppression as a post-processing step. Inspired by the Harris function [3], we extend Harris to scale space and propose a novel method, the Harris-like Scale Invariant Feature Detector (HLSIFD). Unlike Harris-Laplace, which is a hybrid of Harris and Laplace, HLSIFD uses the Hessian matrix, which proves more stable in scale space than the Harris matrix. Unlike methods that suppress edges abruptly (SIFT) or ignore them (SURF), HLSIFD suppresses edges smoothly and uniformly, so it detects fewer spurious points. The approach is evaluated on public databases and in real scenes. Compared to the state-of-the-art feature detectors SIFT and SURF, HLSIFD shows high performance.
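A minimal numpy/scipy sketch in the spirit of HLSIFD, assuming a Harris-style smooth penalty applied to the Hessian; the constants and the exact suppression weight are illustrative, not the paper's formulas.

```python
# Scale-normalized Hessian response with a smooth edge penalty.
import numpy as np
from scipy.ndimage import gaussian_filter

def hessian_response(image, sigma, k=0.05):
    # Scale-space second derivatives (x = axis 1, y = axis 0).
    Lxx = gaussian_filter(image, sigma, order=(0, 2))
    Lyy = gaussian_filter(image, sigma, order=(2, 0))
    Lxy = gaussian_filter(image, sigma, order=(1, 1))
    det = Lxx * Lyy - Lxy ** 2   # blob strength
    tr = Lxx + Lyy               # dominates det on edge-like structures
    # Harris-like smooth edge suppression instead of a hard ratio threshold.
    return sigma ** 4 * (det - k * tr ** 2)

image = np.random.rand(128, 128)
scales = [1.6 * 2 ** (i / 3) for i in range(6)]   # toy scale pyramid
stack = np.stack([hessian_response(image, s) for s in scales])
# Keypoints would be local maxima of `stack` over space and scale.
```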


International Conference on Pattern Recognition | 2014

Learning Convolutional Nonlinear Features for K Nearest Neighbor Image Classification

Weiqiang Ren; Yinan Yu; Junge Zhang; Kaiqi Huang

Learning low-dimensional feature representations is a crucial task in machine learning and computer vision. The recent impressive breakthrough in general object recognition made by large-scale convolutional networks shows that convolutional networks can extract discriminative hierarchical features in large-scale object classification tasks. However, for vision tasks other than end-to-end classification, such as K Nearest Neighbor (kNN) classification, the learned intermediate features are not necessarily optimal for the specific problem. In this paper, we exploit the power of deep convolutional networks by optimizing the output feature layer with respect to kNN classification. By directly optimizing the kNN classification error on training data, we learn convolutional nonlinear features in a data-driven and task-driven way. Experimental results on standard image classification benchmarks show that the proposed method learns better feature representations for the kNN classification task than other general end-to-end classification methods.
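The paper optimizes kNN error directly; a standard differentiable surrogate for that objective is the Neighbourhood Components Analysis (NCA) loss, sketched here in PyTorch on the outputs of an arbitrary feature network (batch size and dimensions are toy values).

```python
# NCA: maximize the expected leave-one-out soft-kNN accuracy of the
# feature space, as a stand-in for the paper's kNN-error objective.
import torch
import torch.nn.functional as F

def nca_loss(features, labels):
    """features: (n, d) network outputs; labels: (n,) class ints."""
    d2 = torch.cdist(features, features).pow(2)        # pairwise sq. distances
    d2.fill_diagonal_(float("inf"))                    # can't pick yourself
    p = F.softmax(-d2, dim=1)                          # soft neighbor probs
    same = labels.unsqueeze(0) == labels.unsqueeze(1)  # same-class indicator
    p_correct = (p * same).sum(dim=1)                  # P(neighbor shares label)
    return -torch.log(p_correct.clamp_min(1e-12)).mean()

feats = torch.randn(16, 8, requires_grad=True)  # stand-in for CNN features
labels = torch.randint(0, 3, (16,))
loss = nca_loss(feats, labels)
loss.backward()   # gradients pull same-class features together
```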


International Conference on Image Processing | 2012

Feature coding via vector difference for image classification

Xin Zhao; Yinan Yu; Yongzhen Huang; Kaiqi Huang; Tieniu Tan

An effective image representation is important for image classification. The most popular image representation framework uses a feature coding algorithm to encode extracted low-level feature descriptors into a vector representation. In this paper, we analyze recently developed feature coding methods in a general way. Based on their common characteristics, we propose a new coding scheme that performs feature coding via the vector difference in a high-dimensional space obtained by explicit feature maps. As we illustrate, our method achieves promising results with small codebook sizes and generalizes most existing coding methods in a unified form.
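Coding by vector differences to codebook centers can be sketched in numpy in a VLAD-like form; the explicit kernel feature map that lifts descriptors to a high-dimensional space is omitted for brevity, so this shows the general family of schemes rather than the exact proposed one.

```python
# VLAD-style vector-difference coding over a small codebook.
import numpy as np

def vector_difference_encode(descriptors, codebook):
    """descriptors: (n, d) local features; codebook: (k, d) centers.
    Returns a (k*d,) image representation of accumulated differences."""
    # Hard-assign each descriptor to its nearest codeword.
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = dists.argmin(axis=1)
    code = np.zeros_like(codebook)
    for i, c in enumerate(assign):
        code[c] += descriptors[i] - codebook[c]   # accumulate the difference
    code = code.ravel()
    return code / (np.linalg.norm(code) + 1e-12)  # L2-normalize

descs = np.random.rand(100, 16)    # e.g., dense SIFT-like features
centers = np.random.rand(8, 16)    # small codebook
print(vector_difference_encode(descs, centers).shape)   # (128,)
```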


International Conference on Neural Information Processing | 2013

Exploring the Power of Kernel in Feature Representation for Object Categorization

Weiqiang Ren; Yinan Yu; Junge Zhang; Kaiqi Huang

Learning robust and invariant feature representations is a crucial task in visual recognition and analysis. Mean square error (MSE) has been used in many feature encoding methods as a feature reconstruction criterion. However, due to the non-Gaussian noise and nonlinear structures in natural images, second-order statistics like MSE are usually not sufficient to capture this information from image data. In this paper, motivated by the information-theoretic learning framework and kernel machine learning, we adopt a similarity measure called correntropy in the auto-encoder model to tackle this problem. The proposed maximum correntropy auto-encoder (MCAE) learns more robust and discriminative representations than MSE-based models by performing the computation in an infinite-dimensional kernel space. Moreover, we further exploit the power of the kernel by learning a kernel embedding neural network that explicitly maps data from Euclidean space to an approximated kernel space. Experimental results on standard object categorization datasets show the effectiveness of kernel learning in feature representation for visual recognition tasks.
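A PyTorch sketch of the correntropy reconstruction criterion: replace squared error with a Gaussian-kernel similarity between input and reconstruction, which bounds the influence of large, non-Gaussian errors. The network sizes and kernel width are illustrative assumptions, not the paper's settings.

```python
# Correntropy-based reconstruction loss for an auto-encoder.
import torch
import torch.nn as nn

def correntropy_loss(x, x_hat, sigma=1.0):
    """Negative mean correntropy between x and its reconstruction."""
    k = torch.exp(-((x - x_hat) ** 2) / (2 * sigma ** 2))  # Gaussian kernel
    return -k.mean()          # minimizing this maximizes correntropy

autoencoder = nn.Sequential(          # toy one-hidden-layer auto-encoder
    nn.Linear(64, 16), nn.Sigmoid(),  # encoder
    nn.Linear(16, 64),                # decoder
)
x = torch.rand(32, 64)                # a batch of flattened image patches
loss = correntropy_loss(x, autoencoder(x))
loss.backward()
# Each outlier's contribution is bounded by the kernel, unlike squared
# error, which grows without bound; this is the robustness argument above.
```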

Collaboration


Dive into Yinan Yu's collaborations.

Top Co-Authors

Kaiqi Huang
Chinese Academy of Sciences

Tieniu Tan
Chinese Academy of Sciences

Yongzhen Huang
Chinese Academy of Sciences

Junge Zhang
Chinese Academy of Sciences

Liang Wang
Chinese Academy of Sciences

Weiqiang Ren
Chinese Academy of Sciences

Yi Yang
University of California

Chunshui Cao
University of Science and Technology of China

Junjie Yan
Chinese Academy of Sciences