Network


Latest external collaborations at the country level.

Hotspot


Dive into the research topics where Yuanqing Lin is active.

Publication


Featured research published by Yuanqing Lin.


computer vision and pattern recognition | 2011

Large-scale image classification: Fast feature extraction and SVM training

Yuanqing Lin; Fengjun Lv; Shenghuo Zhu; Ming Yang; Timothee Cour; Kai Yu; Liangliang Cao; Thomas S. Huang

Most research efforts on image classification so far have been focused on medium-scale datasets, which are often defined as datasets that can fit into the memory of a desktop (typically 4G∼48G). There are two main reasons for the limited effort on large-scale image classification. First, until the emergence of the ImageNet dataset, there was almost no publicly available large-scale benchmark data for image classification, mostly because class labels are expensive to obtain. Second, large-scale classification is hard because it poses more challenges than its medium-scale counterpart. A key challenge is how to achieve efficiency in both feature extraction and classifier training without compromising performance. This paper shows how we address this challenge, using the ImageNet dataset as an example. For feature extraction, we develop a Hadoop scheme that performs feature extraction in parallel using hundreds of mappers. This allows us to extract fairly sophisticated features (with dimensions in the hundreds of thousands) on 1.2 million images within one day. For SVM training, we develop a parallel averaging stochastic gradient descent (ASGD) algorithm for training one-against-all 1000-class SVM classifiers. The ASGD algorithm is capable of dealing with terabytes of training data and converges very fast: typically 5 epochs are sufficient. As a result, we achieve state-of-the-art performance on the ImageNet 1000-class classification task, i.e., 52.9% classification accuracy and a 71.8% top-5 hit rate.
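
To make the averaging step of ASGD concrete, below is a minimal single-machine NumPy sketch of averaged SGD for a hinge-loss linear SVM trained one-vs-all; the function names and hyperparameters are illustrative, and the paper's parallel, Hadoop-scale implementation is not reproduced.

```python
import numpy as np

def asgd_linear_svm(X, y, epochs=5, lr=0.01, reg=1e-4):
    """Averaged SGD for a binary linear SVM with hinge loss.

    X: (n_samples, n_features) array; y: labels in {-1, +1}.
    Returns the averaged weight vector, which typically converges in
    far fewer passes than the plain SGD iterate.
    """
    n, d = X.shape
    w = np.zeros(d)        # current SGD iterate
    w_avg = np.zeros(d)    # running average of all iterates
    t = 0
    for _ in range(epochs):
        for i in np.random.permutation(n):
            t += 1
            margin = y[i] * X[i].dot(w)
            grad = reg * w - (y[i] * X[i] if margin < 1.0 else 0.0)
            w -= lr * grad
            w_avg += (w - w_avg) / t          # incremental averaging
    return w_avg

def one_vs_all_svms(X, labels, n_classes, **kw):
    """Train one binary classifier per class against all others."""
    return np.stack([asgd_linear_svm(X, np.where(labels == c, 1.0, -1.0), **kw)
                     for c in range(n_classes)])
```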


computer vision and pattern recognition | 2011

Learning image representations from the pixel level via hierarchical sparse coding

Kai Yu; Yuanqing Lin; John D. Lafferty

We present a method for learning image representations using a two-layer sparse coding scheme at the pixel level. The first layer encodes local patches of an image. After pooling within local regions, the first-layer codes are then passed to the second layer, which jointly encodes signals from the region. Unlike traditional sparse coding methods that encode local patches independently, this approach accounts for high-order dependency among patterns in a local image neighborhood. We develop algorithms for data encoding and codebook learning, and show in experiments that the method leads to more invariant and discriminative image representations. The algorithm gives excellent results for handwritten digit recognition on MNIST and object recognition on the Caltech101 benchmark. This marks the first time that such accuracies have been achieved using automatically learned features from the pixel level, rather than hand-designed descriptors.
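
As a rough illustration of the two-layer scheme, the sketch below sparse-codes patches with ISTA, pools the first-layer codes within a region, and encodes the pooled vector with a second dictionary; the dictionaries and sizes here are random placeholders, not the learned codebooks from the paper.

```python
import numpy as np

def ista(D, x, lam=0.1, n_iter=100):
    """Sparse-code signal x over dictionary D (atoms as columns) with ISTA,
    minimizing 0.5 * ||x - D a||^2 + lam * ||a||_1."""
    L = np.linalg.norm(D, 2) ** 2                 # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        a = a - D.T @ (D @ a - x) / L             # gradient step on the smooth term
        a = np.sign(a) * np.maximum(np.abs(a) - lam / L, 0.0)  # soft-thresholding
    return a

# Toy two-layer pass on random data; real dictionaries would be learned.
rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 25))   # 16 neighbouring 5x5 patches from one region
D1 = rng.normal(size=(25, 64))        # first layer: patch-level dictionary
codes1 = np.stack([ista(D1, p) for p in patches])
pooled = codes1.max(axis=0)           # pool first-layer codes within the region
D2 = rng.normal(size=(64, 128))       # second layer: region-level dictionary
code2 = ista(D2, pooled)              # jointly encodes the whole neighbourhood
```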


european conference on computer vision | 2012

Object-Centric spatial pooling for image classification

Olga Russakovsky; Yuanqing Lin; Kai Yu; Li Fei-Fei

Spatial pyramid matching (SPM) based pooling has been the dominant choice for state-of-the-art image classification systems. In contrast, we propose a novel object-centric spatial pooling (OCP) approach, following the intuition that knowing the location of the object of interest can be useful for image classification. OCP consists of two steps: (1) inferring the location of the objects, and (2) using the location information to pool foreground and background features separately to form the image-level representation. Step (1) is particularly challenging in a typical classification setting where precise object location annotations are not available during training. To address this challenge, we propose a framework that learns object detectors using only image-level class labels, or so-called weak labels. We validate our approach on the challenging PASCAL07 dataset. Our learned detectors are comparable in accuracy with state-of-the-art weakly supervised detection methods. More importantly, the resulting OCP approach significantly outperforms SPM-based pooling in image classification.
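
A minimal sketch of step (2), assuming an object box has already been inferred: features inside and outside the box are pooled separately (max-pooling here, as one possible choice) and concatenated into the image-level representation.

```python
import numpy as np

def object_centric_pool(codes, locations, box):
    """Pool coded local features separately inside and outside an inferred
    object box, then concatenate into one image-level representation.

    codes: (n, d) codes for n local descriptors
    locations: (n, 2) (x, y) coordinates of the descriptors
    box: (x_min, y_min, x_max, y_max) inferred object location
    """
    x, y = locations[:, 0], locations[:, 1]
    inside = (x >= box[0]) & (y >= box[1]) & (x <= box[2]) & (y <= box[3])
    fg = codes[inside].max(axis=0) if inside.any() else np.zeros(codes.shape[1])
    bg = codes[~inside].max(axis=0) if (~inside).any() else np.zeros(codes.shape[1])
    return np.concatenate([fg, bg])
```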


IEEE Transactions on Multimedia | 2014

Towards Codebook-Free: Scalable Cascaded Hashing for Mobile Image Search

Wengang Zhou; Ming Yang; Houqiang Li; Xiaoyu Wang; Yuanqing Lin; Qi Tian

State-of-the-art image retrieval algorithms using local invariant features mostly rely on a large visual codebook to accelerate feature quantization and matching. This codebook typically contains millions of visual words, which not only demands considerable resources to train offline but also consumes a large amount of memory at the online retrieval stage. This is hardly affordable in resource-limited scenarios such as mobile image search applications. To address this issue, we propose a codebook-free algorithm for large-scale mobile image search. In our method, we first employ a novel scalable cascaded hashing scheme to ensure the recall rate of local feature matching. Afterwards, we enhance the matching precision by an efficient verification with the binary signatures of these local features. Consequently, our method achieves fast and accurate feature matching without a huge visual codebook. Moreover, the quantization and binarizing functions in the proposed scheme are independent of small collections of training images and generalize well to diverse image datasets. Evaluated on two public datasets with a million distractor images, the proposed algorithm demonstrates competitive retrieval accuracy and scalability against four recent retrieval methods in the literature.
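
The verification stage can be illustrated with a generic sketch: each local feature is binarized by thresholded projections (a stand-in for the paper's binarizing functions, which the abstract does not specify), and candidate matches are kept only if their signatures fall within a Hamming radius.

```python
import numpy as np

def binary_signature(desc, projections, thresholds):
    """Binarize a local descriptor: one bit per projection, by thresholding."""
    return (projections @ desc > thresholds).astype(np.uint8)

def hamming_verify(sig_q, candidate_sigs, max_dist=12):
    """Keep candidate matches whose signatures lie within a Hamming radius."""
    dists = np.count_nonzero(candidate_sigs != sig_q, axis=1)
    return np.flatnonzero(dists <= max_dist)

# Toy usage: random projections stand in for the learned binarizing functions.
rng = np.random.default_rng(0)
P = rng.normal(size=(64, 128))            # 64 bits from 128-D descriptors
t = np.zeros(64)
q = binary_signature(rng.normal(size=128), P, t)
cands = np.stack([binary_signature(rng.normal(size=128), P, t) for _ in range(100)])
kept = hamming_verify(q, cands)
```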


international conference on computer vision | 2013

Semantic-Aware Co-indexing for Image Retrieval

Shiliang Zhang; Ming Yang; Xiaoyu Wang; Yuanqing Lin; Qi Tian

Inverted indexes in image retrieval not only allow fast access to database images but also summarize all knowledge about the database, so their discriminative capacity largely determines the retrieval performance. In this paper, for vocabulary-tree-based image retrieval, we propose a semantic-aware co-indexing algorithm to jointly embed two strong cues into the inverted indexes: 1) local invariant features that are robust for delineating low-level image contents, and 2) semantic attributes from large-scale object recognition that may reveal image semantic meanings. For an initial set of inverted indexes of local features, we utilize 1000 semantic attributes to filter out isolated images and insert semantically similar images into the initial set. Encoding these two distinct cues together effectively enhances the discriminative capability of the inverted indexes. Such co-indexing operations are entirely offline and introduce little computational overhead to online queries, because only local features, not semantic attributes, are used for the query. Experiments and comparisons with recent retrieval methods on three datasets (UKbench, Holidays, and Oxford5K), with 1.3 million Flickr images as distractors, demonstrate the competitive performance of our method.
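
A toy sketch of the offline co-indexing step, under assumed data structures (a dict mapping visual words to posting lists, and a matrix of per-image attribute scores): isolated images are dropped from each posting list by mean attribute similarity, and each kept image's nearest semantic neighbors are inserted. The threshold and neighbor count are illustrative.

```python
import numpy as np

def coindex(inverted, img_ids, attrs, k_insert=2, isolation_thresh=0.2):
    """Offline semantic co-indexing of an inverted index of local features.

    inverted: {visual_word: list of image ids}
    img_ids:  list of all image ids, aligned with the rows of attrs
    attrs:    (n_images, n_attributes) semantic attribute scores
    """
    attrs = np.asarray(attrs, dtype=float)
    pos = {i: r for r, i in enumerate(img_ids)}
    A = attrs / np.linalg.norm(attrs, axis=1, keepdims=True)   # cosine similarity
    new_index = {}
    for word, imgs in inverted.items():
        rows = [pos[i] for i in imgs]
        sims = A[rows] @ A[rows].T
        mean_sim = (sims.sum(axis=1) - 1.0) / max(len(rows) - 1, 1)
        kept = [i for i, s in zip(imgs, mean_sim)
                if s >= isolation_thresh or len(imgs) < 3]      # drop isolated images
        extra = []
        for i in kept:
            nn = np.argsort(-(A @ A[pos[i]]))[1:k_insert + 1]   # nearest by attributes
            extra.extend(img_ids[r] for r in nn)                # insert similar images
        new_index[word] = sorted(set(kept) | set(extra))
    return new_index
```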


european conference on computer vision | 2012

Multi-component models for object detection

Chunhui Gu; Pablo Andrés Arbeláez; Yuanqing Lin; Kai Yu; Jitendra Malik

In this paper, we propose a multi-component approach for object detection. Rather than attempting to represent an object category with a monolithic model, or pre-defining a reduced set of aspects, we form visual clusters from the data that are tight in appearance and configuration spaces. We train individual classifiers for each component, and then learn a second classifier that operates at the category level by aggregating responses from multiple components. In order to reduce computational cost during detection, we adopt the idea of object window selection, and our segmentation-based selection mechanism produces fewer than 500 windows per image while preserving high object recall. When compared to the leading methods on the challenging PASCAL VOC 2010 dataset, our multi-component approach obtains highly competitive results. Furthermore, unlike monolithic detection methods, our approach allows the transfer of finer-grained semantic information from the components, such as keypoint locations and segmentation masks.
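
A compact sketch of the two-level classifier structure, using scikit-learn as a stand-in: positives are clustered into components, one linear SVM is trained per component, and a second classifier aggregates the component scores. The paper's appearance/configuration clustering and segmentation-based window selection are not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import LinearSVC
from sklearn.linear_model import LogisticRegression

def train_multicomponent(X_pos, X_neg, n_components=3):
    """Cluster positives into components, train one SVM per component, then
    train a category-level classifier on the vector of component scores."""
    comps = KMeans(n_clusters=n_components, n_init=10).fit_predict(X_pos)
    svms = []
    for c in range(n_components):
        Xc = np.vstack([X_pos[comps == c], X_neg])
        yc = np.r_[np.ones((comps == c).sum()), np.zeros(len(X_neg))]
        svms.append(LinearSVC(C=1.0, max_iter=5000).fit(Xc, yc))
    X_all = np.vstack([X_pos, X_neg])
    y_all = np.r_[np.ones(len(X_pos)), np.zeros(len(X_neg))]
    scores = np.column_stack([s.decision_function(X_all) for s in svms])
    combiner = LogisticRegression().fit(scores, y_all)
    return svms, combiner

def score_windows(svms, combiner, X_windows):
    """Category-level score for candidate windows: aggregate component responses."""
    s = np.column_stack([m.decision_function(X_windows) for m in svms])
    return combiner.predict_proba(s)[:, 1]
```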


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

Regionlets for Generic Object Detection

Xiaoyu Wang; Ming Yang; Shenghuo Zhu; Yuanqing Lin

Generic object detection is confronted by dealing with different degrees of variation in distinct object classes with tractable computation, which demands descriptive and flexible object representations that are also efficient to evaluate at many locations. In view of this, we propose to model an object class by a cascaded boosting classifier that integrates various types of features from competing local regions, named regionlets. A regionlet is a base feature extraction region defined proportionally to a detection window at an arbitrary resolution (i.e., size and aspect ratio). These regionlets are organized in small groups with stable relative positions to delineate fine-grained spatial layouts inside objects. Their features are aggregated into a one-dimensional feature within each group so as to tolerate deformations. We then evaluate object bounding box proposals obtained by selective search from segmentation cues, limiting the evaluation to thousands of locations. Our approach significantly outperforms the state-of-the-art on popular multi-class detection benchmark datasets with a single method, without any context. It achieves a detection mean average precision of 41.7% on the PASCAL VOC 2007 dataset and 39.7% on VOC 2010 for 20 object categories. It achieves 14.7% mean average precision on the ImageNet dataset for 200 object categories, outperforming the latest deformable part-based model (DPM) by 4.7%.
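
The group-level aggregation can be sketched as follows: regionlets are defined relative to the detection window (so they rescale with it), a 1-D feature is extracted from each, and the group takes the maximum response to tolerate small deformations. The feature extractor is left abstract.

```python
import numpy as np

def regionlet_group_feature(feature_map, window, regionlets, extract):
    """Aggregate a 1-D feature over the regionlets of one group by max-pooling.

    window: (x, y, w, h) candidate detection window
    regionlets: list of (rx, ry, rw, rh) defined as fractions of the window,
                so they rescale with window size and aspect ratio
    extract: function mapping (feature_map, absolute region) to a scalar
    """
    x, y, w, h = window
    responses = []
    for rx, ry, rw, rh in regionlets:
        region = (x + rx * w, y + ry * h, rw * w, rh * h)
        responses.append(extract(feature_map, region))
    return max(responses)   # strongest response within the group
```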


asian conference on computer vision | 2014

Accurate Object Detection with Location Relaxation and Regionlets Re-localization

Chengjiang Long; Xiaoyu Wang; Gang Hua; Ming Yang; Yuanqing Lin

Standard sliding-window based object detection requires dense classifier evaluation at densely sampled locations in scale space in order to achieve accurate localization. To avoid such dense evaluation, selective search based algorithms evaluate the classifier on only a small subset of object proposals. Notwithstanding the demonstrated success, object proposals do not guarantee perfect overlap with the object, leading to suboptimal detection accuracy. To address this issue, we propose to first relax the dense sampling of the scale space with coarse object proposals generated from bottom-up segmentations. Based on detection results on these proposals, we then conduct a top-down search to more precisely localize the object using supervised descent. This two-stage detection strategy, dubbed location relaxation, is able to localize the object in the continuous parameter space. Furthermore, there is a conflict between accurate object detection and robust object detection, because achieving the latter requires accommodating inaccurate and perturbed object locations in the training phase. To address this conflict, we leverage the rich spatial information learned from the Regionlets detection framework to determine where the object is precisely localized. Our proposed approaches are extensively validated on the PASCAL VOC 2007 dataset and a self-collected large-scale car dataset. Our method boosts the mean average precision of the current state-of-the-art (41.7%) to 44.1% on the PASCAL VOC 2007 dataset. To the best of our knowledge, this is the best performance reported without using outside data (convolutional neural network based approaches are commonly pre-trained on a large-scale outside dataset and fine-tuned on the VOC dataset).
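
The top-down refinement can be sketched with a generic supervised-descent cascade; the regressors R, b and the feature function are placeholders, and the paper's Regionlets-based features are not reproduced. Each step regresses a box correction from features extracted at the current estimate.

```python
import numpy as np

def descent_step(box, feat_fn, R, b):
    """One supervised-descent update: regress a correction to the current box
    from features extracted at that box.  R and b would be learned offline
    from (perturbed box, ground-truth correction) training pairs."""
    phi = feat_fn(box)                    # features at the current estimate
    return box + R @ phi + b              # predicted (dx, dy, dw, dh) update

def relocalize(box, feat_fn, cascade):
    """Apply a cascade of descent steps, refining the box in continuous space."""
    box = np.asarray(box, dtype=float)
    for R, b in cascade:
        box = descent_step(box, feat_fn, R, b)
    return box
```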


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2016

Scalable Feature Matching by Dual Cascaded Scalar Quantization for Image Retrieval

Wengang Zhou; Ming Yang; Xiaoyu Wang; Houqiang Li; Yuanqing Lin; Qi Tian

In this paper, we investigate the problem of scalable visual feature matching in large-scale image search and propose a novel cascaded scalar quantization scheme in dual resolution. We formulate visual feature matching as a range-based neighbor search problem and approach it by identifying hyper-cubes with a dual-resolution scalar quantization strategy. Specifically, for each dimension of the PCA-transformed feature, scalar quantization is performed at both coarse and fine resolutions. The scalar quantization results at the coarse resolution are cascaded over multiple dimensions to index an image database. The scalar quantization results over multiple dimensions at the fine resolution are concatenated into a binary super-vector and stored in the index list for efficient verification. The proposed cascaded scalar quantization (CSQ) method is free of costly visual codebook training and is thus independent of any image descriptor training set. The index structure of CSQ is flexible enough to accommodate new image features and scalable enough to index large-scale image databases. We evaluate our approach on public benchmark datasets for large-scale image retrieval. Experimental results demonstrate the competitive retrieval performance of the proposed method compared with several recent retrieval algorithms based on feature quantization.
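
A minimal sketch of the dual-resolution quantization for one descriptor, with illustrative bin edges and bit budgets: the leading PCA dimensions are coarsely quantized and cascaded into an index key, while sign bits over all dimensions form the binary super-vector used for Hamming verification.

```python
import numpy as np

def csq_encode(feat, pca, coarse_dims=8, coarse_bins=4):
    """Dual-resolution scalar quantization of one descriptor.

    Coarse: the leading PCA dimensions are each quantized into a few bins and
    cascaded into a single integer key that addresses the inverted index.
    Fine: every dimension is thresholded into one bit; the bits form a
    binary super-vector stored in the index list for verification.
    """
    z = pca @ feat                                  # PCA-transformed feature
    edges = np.linspace(-1.0, 1.0, coarse_bins - 1) # illustrative fixed bin edges
    coarse = np.digitize(z[:coarse_dims], edges)    # per-dimension coarse bin
    key = 0
    for q in coarse:                                # cascade bins into one key
        key = key * coarse_bins + int(q)
    fine = (z > 0).astype(np.uint8)                 # one sign bit per dimension
    return key, fine

def verify(fine_q, fine_db, max_hamming=16):
    """Check candidates from the same coarse cell by Hamming distance."""
    return np.flatnonzero(np.count_nonzero(fine_db != fine_q, axis=1) <= max_hamming)
```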


IEEE Transactions on Pattern Analysis and Machine Intelligence | 2015

Semantic-Aware Co-Indexing for Image Retrieval

Shiliang Zhang; Ming Yang; Xiaoyu Wang; Yuanqing Lin; Qi Tian

In content-based image retrieval, inverted indexes allow fast access to database images and summarize all knowledge about the database. Indexing multiple cues of image content allows retrieval algorithms to search for relevant images from different perspectives, which is appealing for delivering a satisfactory user experience. However, when incorporating diverse image features during online retrieval, it is challenging to ensure retrieval efficiency and scalability. In this paper, for large-scale image retrieval, we propose a semantic-aware co-indexing algorithm to jointly embed two strong cues into the inverted indexes: 1) local invariant features that are robust for delineating low-level image contents, and 2) semantic attributes from large-scale object recognition that may reveal image semantic meanings. Specifically, for an initial set of inverted indexes of local features, we utilize semantic attributes to filter out isolated images and insert semantically similar images into this initial set. Encoding these two distinct and complementary cues together effectively enhances the discriminative capability of the inverted indexes. Such co-indexing operations are entirely offline and introduce little computational overhead to online retrieval, because only local features, not semantic attributes, are employed for the query. Hence, this co-indexing differs from existing image retrieval methods that fuse multiple features or retrieval results. Extensive experiments and comparisons with recent retrieval methods demonstrate the competitive performance of our method.

Collaboration


Dive into Yuanqing Lin's collaboration.

Top Co-Authors

Xiaoyu Wang (University of Missouri)
Qi Tian (University of Texas at San Antonio)
Fengjun Lv (University of Southern California)
Houqiang Li (University of Science and Technology of China)
Wengang Zhou (University of Science and Technology of China)