
Publication


Featured research published by Congyan Lang.


European Conference on Computer Vision | 2012

Depth matters: influence of depth cues on visual saliency

Congyan Lang; Tam V. Nguyen; Harish Katti; Karthik Yadati; Mohan S. Kankanhalli; Shuicheng Yan

Most previous studies on visual saliency have only focused on static or dynamic 2D scenes. Since the human visual system has evolved predominantly in natural three-dimensional environments, it is important to study whether and how depth information influences visual saliency. In this work, we first collect a large human eye fixation database compiled from a pool of 600 2D-vs-3D image pairs viewed by 80 subjects, where the depth information is directly provided by the Kinect camera and the eye tracking data are captured in both 2D and 3D free-viewing experiments. We then analyze the major discrepancies between 2D and 3D human fixation data of the same scenes, which are further abstracted and modeled as novel depth priors. Finally, we evaluate the performance of state-of-the-art saliency detection models on 3D images, and propose solutions that enhance their performance by integrating the depth priors.


IEEE Transactions on Image Processing | 2012

Saliency Detection by Multitask Sparsity Pursuit

Congyan Lang; Guangcan Liu; Jian Yu; Shuicheng Yan

This paper addresses the problem of detecting salient areas within natural images. We mainly study the problem in an unsupervised setting, i.e., saliency detection without learning from labeled images. A multitask sparsity pursuit solution is proposed to integrate multiple types of features for detecting saliency collaboratively. Given an image described by multiple features, its saliency map is inferred by seeking the consistently sparse elements from the joint decompositions of multiple feature matrices into pairs of low-rank and sparse matrices. The inference process is formulated as a constrained nuclear-norm and ℓ2,1-norm minimization problem, which is convex and can be solved efficiently with an augmented Lagrange multiplier method. Compared with previous methods, which usually exploit multiple features by combining the saliency maps obtained from individual features, the proposed method seamlessly integrates multiple features to jointly produce the saliency map in a single inference step, and thus yields more accurate and reliable results. Beyond the unsupervised setting, the proposed method can also be generalized to incorporate top-down priors obtained from a supervised environment. Extensive experiments validate its superiority over other state-of-the-art methods.
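The core decomposition can be sketched in a simplified, single-feature form: an RPCA-style model X = L + E with a nuclear-norm term on L and an ℓ2,1 term on E, solved by inexact ALM. The function names and parameter values below are illustrative, not from the paper, and the full method couples several feature matrices rather than decomposing one.

```python
import numpy as np

def svt(M, tau):
    # Singular value thresholding: proximal operator of tau * nuclear norm.
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt

def col_shrink(M, tau):
    # Column-wise shrinkage: proximal operator of tau * l2,1 norm.
    norms = np.linalg.norm(M, axis=0)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return M * scale

def low_rank_l21_decompose(X, lam=0.3, mu=1.0, rho=1.5, n_iter=200, tol=1e-7):
    # Inexact ALM for:  min ||L||_* + lam * ||E||_{2,1}  s.t.  X = L + E.
    L = np.zeros_like(X)
    E = np.zeros_like(X)
    Y = np.zeros_like(X)  # Lagrange multiplier
    for _ in range(n_iter):
        L = svt(X - E + Y / mu, 1.0 / mu)
        E = col_shrink(X - L + Y / mu, lam / mu)
        R = X - L - E
        Y = Y + mu * R
        mu = min(mu * rho, 1e10)
        if np.linalg.norm(R) <= tol * max(np.linalg.norm(X), 1.0):
            break
    return L, E
```

The ℓ2,1 shrinkage zeroes out entire columns of E, so the columns (image regions) that survive shrinkage are the candidates for salient areas; the ℓ2-norm of each column of E can serve as a saliency score.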


IEEE Transactions on Circuits and Systems for Video Technology | 2013

Improving Bottom-up Saliency Detection by Looking into Neighbors

Congyan Lang; Jiashi Feng; Guangcan Liu; Jinhui Tang; Shuicheng Yan; Jiebo Luo

Bottom-up saliency detection aims to detect salient areas within natural images usually without learning from labeled images. Typically, the saliency map of an image is inferred by only using the information within this image (referred to as the “current image”). While efficient, such single-image-based methods may fail to obtain reliable results, because the information within a single image may be insufficient for defining saliency. In this paper, we investigate how saliency detection can benefit from the nearest neighbor structure in the image space. First, we show that existing methods can be improved by extending them to include the visual neighborhood information. This verifies the significance of the neighbors. Next, a solution of multitask sparsity pursuit is proposed to integrate the current image and its neighbors to collaboratively detect saliency. The integration is done by first representing each image as a feature matrix, and then seeking the consistently sparse elements from the joint decompositions of multiple matrices into pairs of low-rank and sparse matrices. The computational procedure is formulated as a constrained nuclear norm and ℓ2,1-norm minimization problem, which is convex and can be solved efficiently with the augmented Lagrange multiplier method. Besides the nearest neighbor structure in the visual feature space, the proposed model can also be generalized to handle multiple visual features. Extensive experiments have clearly validated its superiority over other state-of-the-art methods.
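The neighbor-retrieval step described above can be sketched as a plain k-nearest-neighbor lookup over global image descriptors; the descriptor choice and function name here are illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

def nearest_neighbors(query_feat, gallery_feats, k=5):
    # Return indices of the k gallery images closest to the query in
    # Euclidean distance over global descriptors (e.g., color histograms).
    d = np.linalg.norm(gallery_feats - query_feat, axis=1)
    return np.argsort(d)[:k]
```

The retrieved neighbors are then stacked with the current image's feature matrix and decomposed jointly, so that the low-rank part captures what is common across the neighborhood.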


Neurocomputing | 2016

From sample selection to model update

Zhu Teng; Tao Wang; Feng Liu; Dong-Joong Kang; Congyan Lang; Songhe Feng

This paper proposes an online tracking algorithm that employs a confidence combinatorial map model. Drifting is a problem that easily occurs in object tracking, and most recent tracking algorithms attempt to address it. In this paper, we propose a confidence combinatorial map that describes the structure of the object, based on which the confidence combinatorial map model is developed. The model associates the object in the current frame with that in the previous frame. On the strength of this relationship, more precisely classified samples can be selected and employed in the model update stage, which directly reduces tracking drift. The proposed algorithm was evaluated on several public video sequences and compared with several state-of-the-art algorithms. The experiments demonstrate that the proposed algorithm outperforms the comparison algorithms and delivers very good performance.


International Conference on Internet Multimedia Computing and Service | 2015

Vehicle detection and classification based on convolutional neural network

Dongmei He; Congyan Lang; Songhe Feng; Xuetao Du; Chen Zhang

Deep learning has emerged as a hot topic due to its wide applicability and high accuracy. In this paper, this method is applied to vehicle detection and classification. We extract visual features from the activations of a deep convolutional network, from large-scale sparse learning, and from other distinguishing descriptors, in order to compare their accuracy. When compared to the leading methods on the challenging ImageNet dataset, our deep learning approach obtains highly competitive results. In experiments with limited training data, the features extracted by the deep learning method outperform those generated by traditional approaches.


International Conference on Internet Multimedia Computing and Service | 2014

Depth Information Fused Salient Object Detection

Fangfang Chen; Congyan Lang; Songhe Feng; Zehai Song

Saliency detection has emerged as a hot topic due to its potential applications in image and video understanding. Most existing saliency detection algorithms focus on two-dimensional information, while depth information is often ignored. In this paper, we first create the salient-object ground truth for a specific image dataset containing 600 RGB-D (color and depth) images taken in different surroundings under varying viewing angles and illumination intensities. The depth image describes the depth of each object in the scene from the viewer's perspective, with the intensity value of every pixel denoting depth. With the help of depth information, a more precise object description can be acquired. Furthermore, several state-of-the-art saliency detection models can be used to generate 2D saliency maps, which are then fused with the depth map to detect the salient object in a given image. Experimental results demonstrate the effectiveness of the proposed method.
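One simple form of such a fusion is a convex combination of a normalized 2D saliency map and a depth-derived prior. This sketch assumes larger depth values mean farther away and that nearer pixels are more likely salient; the weighting scheme and names are illustrative, not the paper's exact rule.

```python
import numpy as np

def fuse_saliency_depth(sal2d, depth, alpha=0.5):
    # Late fusion: min-max normalize both maps, convert depth to a
    # "nearness" prior, and blend with weight alpha.
    s = (sal2d - sal2d.min()) / (np.ptp(sal2d) + 1e-12)
    d = (depth - depth.min()) / (np.ptp(depth) + 1e-12)
    near = 1.0 - d  # smaller depth value = closer = higher prior
    return alpha * s + (1.0 - alpha) * near
```

Regions that are both visually salient in 2D and close to the viewer receive the highest fused scores.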


Neurocomputing | 2012

A unified supervised codebook learning framework for classification

Congyan Lang; Songhe Feng; Bing Cheng; Bingbing Ni; Shuicheng Yan

In this paper, we investigate a discriminative visual dictionary learning method for boosting classification performance. Because they are tied to the K-means clustering philosophy, popular algorithms for visual dictionary learning cannot guarantee good separation of the normalized visual-word frequency vectors of samples from distinct classes or with large label distances. The rationale of this work is to harness sample label information to learn the visual dictionary in a supervised manner. This target is formulated as an objective function in which each sample element, e.g., a SIFT descriptor, is expected to be close to its assigned visual word, while the normalized aggregate visual-word frequency vectors are expected to possess the property that kindred samples are close to each other and inhomogeneous samples are far apart. By relaxing the hard binary constraints to soft nonnegative ones, a multiplicative nonnegative update procedure is proposed to optimize the objective function, along with a theoretical convergence proof. Extensive experiments on classification tasks (i.e., natural scene and sports event classification) demonstrate the superiority of the proposed framework over conventional clustering-based visual dictionary learning.
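For contrast with the supervised objective, the conventional K-means pipeline the paper improves upon can be sketched as follows: cluster local descriptors into visual words, then represent each image by its normalized word-frequency histogram. This is a plain unsupervised baseline; names and parameters are illustrative.

```python
import numpy as np

def learn_codebook(descriptors, k=8, n_iter=20, seed=0):
    # Plain (unsupervised) k-means codebook over local descriptors;
    # the paper's supervised objective adds label-aware terms on top.
    rng = np.random.default_rng(seed)
    words = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(n_iter):
        d = np.linalg.norm(descriptors[:, None] - words[None], axis=2)
        assign = d.argmin(axis=1)
        for j in range(k):
            pts = descriptors[assign == j]
            if len(pts):
                words[j] = pts.mean(axis=0)
    return words

def word_histogram(descriptors, words):
    # Normalized visual-word frequency vector for one image.
    d = np.linalg.norm(descriptors[:, None] - words[None], axis=2)
    assign = d.argmin(axis=1)
    hist = np.bincount(assign, minlength=len(words)).astype(float)
    return hist / hist.sum()
```

The supervised method replaces the pure reconstruction criterion with one that also pulls histograms of same-class images together and pushes those of different classes apart.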


Neurocomputing | 2015

An error-tolerant approximate matching algorithm for labeled combinatorial maps

Tao Wang; Hua Yang; Congyan Lang; Songhe Feng

Combinatorial maps are widely used in image representation and processing, so measuring the distance or similarity between combinatorial maps is an important issue in this field. Existing distance measures between combinatorial maps, based on the largest common submap and the edit distance, have high computational complexity and are hard to apply in real applications. This paper addresses the problem of inexact matching between labeled combinatorial maps and aims to find a fast algorithm for measuring the distance between maps. We first define the joint-tree of a combinatorial map and prove that it can be used to decide isomorphism between combinatorial maps. Subsequently, a distance measure based on joint-trees and an approximate approach are proposed to compute the distance between combinatorial maps. Experimental results show that the proposed approach performs better in practice than the previous approach based on approximate map edit distance.
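As background, a 2D combinatorial map can be encoded as two permutations on darts: an edge involution alpha and a vertex rotation sigma. A naive exact isomorphism check (not the paper's joint-tree algorithm) propagates a candidate dart bijection from a single seed; this sketch assumes connected maps.

```python
def maps_isomorphic(a1, s1, a2, s2):
    # a*: alpha (dart opposite d across its edge), s*: sigma (next dart
    # around d's vertex). For connected maps, an isomorphism is fully
    # determined by the image of one dart, so try every seed.
    n = len(a1)
    if len(a2) != n:
        return False
    for start in range(n):
        f = {0: start}          # candidate bijection, seeded by one dart
        stack, ok = [0], True
        while stack and ok:
            d = stack.pop()
            for perm1, perm2 in ((a1, a2), (s1, s2)):
                img = perm2[f[d]]
                nd = perm1[d]
                if nd in f:
                    ok = f[nd] == img
                    if not ok:
                        break
                else:
                    f[nd] = img
                    stack.append(nd)
        if ok and len(f) == n and len(set(f.values())) == n:
            return True
    return False
```

This brute-force check is O(n^2); the point of the joint-tree construction in the paper is to support fast, error-tolerant distance computation rather than only exact yes/no isomorphism.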


IEEE Transactions on Neural Networks | 2016

Dual Low-Rank Pursuit: Learning Salient Features for Saliency Detection

Congyan Lang; Jiashi Feng; Songhe Feng; Jingdong Wang; Shuicheng Yan

Saliency detection is an important procedure for machines to understand the visual world as humans do. In this paper, we consider the specific saliency detection problem of predicting human eye fixations during free viewing of natural images, and propose a novel dual low-rank pursuit (DLRP) method. DLRP learns saliency-aware feature transformations by utilizing available supervision information and constructs discriminative bases for effectively detecting human fixation points under the popular low-rank and sparsity-pursuit framework. Benefiting from the high-level information embedded in the supervised learning process, DLRP is able to predict fixations accurately without performing the expensive object segmentation used in previous works. Comprehensive experiments clearly show the superiority of the proposed DLRP method over established state-of-the-art methods. We also empirically demonstrate that DLRP generalizes better across different datasets and inherits the advantages of both bottom-up and top-down saliency detection methods.


International Conference on Internet Multimedia Computing and Service | 2018

Stabilizing video facial landmark detection and tracking via global and local filtering

Xingyan Guo; Yi Jin; Yidong Li; Junliang Xing; Congyan Lang

Video facial landmark detection and tracking are important computer vision tasks with many applications, such as face anti-spoofing, animation, and recognition. Most existing facial landmark detection and tracking methods, however, suffer from instability, which limits their effectiveness in real-world applications. In this work, we present a novel solution for stabilized facial landmark detection and tracking in video frames. The proposed solution addresses various challenging situations and provides effective remedies for them. The main contribution of our solution is a novel global and local filtering strategy, which guarantees the robustness of global whole-face shape tracking and the adaptivity of local facial-part tracking. The proposed solution does not depend on specific face detection and alignment algorithms and can thus be easily deployed into existing systems. Extensive experimental evaluations and analyses on different benchmark datasets verify the effectiveness of the proposed approach.
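As a toy illustration of temporal filtering on landmark trajectories, the following applies a generic exponential moving average per landmark coordinate. This is not the paper's global-and-local strategy; all names and the filter choice are assumptions.

```python
import numpy as np

def smooth_landmarks(frames, alpha=0.6):
    # frames: T x N x 2 array of per-frame landmark coordinates.
    # Exponential moving average damps frame-to-frame jitter while
    # still following genuine motion.
    out = np.empty_like(frames, dtype=float)
    out[0] = frames[0]
    for t in range(1, len(frames)):
        out[t] = alpha * frames[t] + (1.0 - alpha) * out[t - 1]
    return out
```

A global filter of this kind stabilizes the whole face shape, while the paper's local component additionally adapts the filtering per facial part.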

Collaboration


Dive into Congyan Lang's collaboration.

Top Co-Authors

Songhe Feng (Beijing Jiaotong University)
Tao Wang (Beijing Jiaotong University)
Shuicheng Yan (National University of Singapore)
Yi Jin (Beijing Jiaotong University)
Jiashi Feng (National University of Singapore)
Guangcan Liu (Shanghai Jiao Tong University)
Junliang Xing (Chinese Academy of Sciences)
Yidong Li (Beijing Jiaotong University)